* [PATCH i-g-t v6 01/17] drm-uapi/xe: Sync with oa uapi fix
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
@ 2024-09-05 9:27 ` Christoph Manszewski
2024-09-06 14:41 ` Kamil Konieczny
2024-09-05 9:27 ` [PATCH i-g-t v6 02/17] lib/xe_ioctl: Add wrapper with vm_bind_op extension parameter Christoph Manszewski
` (18 subsequent siblings)
19 siblings, 1 reply; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:27 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski, Jonathan Cavitt, Lucas De Marchi
Align with kernel commit f2881dfdaaa9 ("drm/xe/oa/uapi: Make bit masks
unsigned"). Use built header instead of raw uapi header.
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Kamil Konieczny <kamil.konieczny@linux.intel.com>
---
include/drm-uapi/xe_drm.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/drm-uapi/xe_drm.h b/include/drm-uapi/xe_drm.h
index 29425d7fd..f0a450db9 100644
--- a/include/drm-uapi/xe_drm.h
+++ b/include/drm-uapi/xe_drm.h
@@ -3,8 +3,8 @@
* Copyright © 2023 Intel Corporation
*/
-#ifndef _UAPI_XE_DRM_H_
-#define _UAPI_XE_DRM_H_
+#ifndef _XE_DRM_H_
+#define _XE_DRM_H_
#include "drm.h"
@@ -134,7 +134,7 @@ extern "C" {
* redefine the interface more easily than an ever growing struct of
* increasing complexity, and for large parts of that interface to be
* entirely optional. The downside is more pointer chasing; chasing across
- * the __user boundary with pointers encapsulated inside u64.
+ * the boundary with pointers encapsulated inside u64.
*
* Example chaining:
*
@@ -1598,10 +1598,10 @@ enum drm_xe_oa_property_id {
* b. Counter select c. Counter size and d. BC report. Also refer to the
* oa_formats array in drivers/gpu/drm/xe/xe_oa.c.
*/
-#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xff << 0)
-#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL (0xff << 8)
-#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xff << 16)
-#define DRM_XE_OA_FORMAT_MASK_BC_REPORT (0xff << 24)
+#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xffu << 0)
+#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL (0xffu << 8)
+#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xffu << 16)
+#define DRM_XE_OA_FORMAT_MASK_BC_REPORT (0xffu << 24)
/**
* @DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT: Requests periodic OA unit
@@ -1698,4 +1698,4 @@ struct drm_xe_oa_stream_info {
}
#endif
-#endif /* _UAPI_XE_DRM_H_ */
+#endif /* _XE_DRM_H_ */
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH i-g-t v6 01/17] drm-uapi/xe: Sync with oa uapi fix
2024-09-05 9:27 ` [PATCH i-g-t v6 01/17] drm-uapi/xe: Sync with oa uapi fix Christoph Manszewski
@ 2024-09-06 14:41 ` Kamil Konieczny
0 siblings, 0 replies; 50+ messages in thread
From: Kamil Konieczny @ 2024-09-06 14:41 UTC (permalink / raw)
To: igt-dev
Cc: Christoph Manszewski, Zbigniew Kempczyński,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Jonathan Cavitt, Lucas De Marchi
Hi Christoph,
On 2024-09-05 at 11:27:56 +0200, Christoph Manszewski wrote:
> Align with kernel commit f2881dfdaaa9 ("drm/xe/oa/uapi: Make bit masks
> unsigned"). Use built header instead of raw uapi header.
Thank you Christoph for this fix,
I merged it.
Regards,
Kamil
* [PATCH i-g-t v6 02/17] lib/xe_ioctl: Add wrapper with vm_bind_op extension parameter
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
2024-09-05 9:27 ` [PATCH i-g-t v6 01/17] drm-uapi/xe: Sync with oa uapi fix Christoph Manszewski
@ 2024-09-05 9:27 ` Christoph Manszewski
2024-09-05 9:27 ` [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library Christoph Manszewski
` (17 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:27 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski, Mika Kuoppala
Currently there is no way to set the drm_xe_vm_bind_op extensions field. Add
a vm_bind wrapper that allows passing this field as a parameter.
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
lib/xe/xe_ioctl.c | 20 ++++++++++++++++----
lib/xe/xe_ioctl.h | 5 +++++
2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c
index ae43ffd15..6d8388918 100644
--- a/lib/xe/xe_ioctl.c
+++ b/lib/xe/xe_ioctl.c
@@ -96,15 +96,17 @@ void xe_vm_bind_array(int fd, uint32_t vm, uint32_t exec_queue,
igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_VM_BIND, &bind), 0);
}
-int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
- uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
- uint32_t flags, struct drm_xe_sync *sync, uint32_t num_syncs,
- uint32_t prefetch_region, uint8_t pat_index, uint64_t ext)
+int ___xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
+ uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
+ uint32_t flags, struct drm_xe_sync *sync, uint32_t num_syncs,
+ uint32_t prefetch_region, uint8_t pat_index, uint64_t ext,
+ uint64_t op_ext)
{
struct drm_xe_vm_bind bind = {
.extensions = ext,
.vm_id = vm,
.num_binds = 1,
+ .bind.extensions = op_ext,
.bind.obj = bo,
.bind.obj_offset = offset,
.bind.range = size,
@@ -125,6 +127,16 @@ int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
return 0;
}
+int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
+ uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
+ uint32_t flags, struct drm_xe_sync *sync, uint32_t num_syncs,
+ uint32_t prefetch_region, uint8_t pat_index, uint64_t ext)
+{
+ return ___xe_vm_bind(fd, vm, exec_queue, bo, offset, addr, size, op,
+ flags, sync, num_syncs, prefetch_region,
+ pat_index, ext, 0);
+}
+
void __xe_vm_bind_assert(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size,
uint32_t op, uint32_t flags, struct drm_xe_sync *sync,
diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
index b27c0053f..18cc2b72b 100644
--- a/lib/xe/xe_ioctl.h
+++ b/lib/xe/xe_ioctl.h
@@ -20,6 +20,11 @@
uint32_t xe_cs_prefetch_size(int fd);
uint64_t xe_bb_size(int fd, uint64_t reqsize);
uint32_t xe_vm_create(int fd, uint32_t flags, uint64_t ext);
+int ___xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
+ uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
+ uint32_t flags, struct drm_xe_sync *sync, uint32_t num_syncs,
+ uint32_t prefetch_region, uint8_t pat_index, uint64_t ext,
+ uint64_t op_ext);
int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
uint32_t flags, struct drm_xe_sync *sync, uint32_t num_syncs,
--
2.34.1
* [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
2024-09-05 9:27 ` [PATCH i-g-t v6 01/17] drm-uapi/xe: Sync with oa uapi fix Christoph Manszewski
2024-09-05 9:27 ` [PATCH i-g-t v6 02/17] lib/xe_ioctl: Add wrapper with vm_bind_op extension parameter Christoph Manszewski
@ 2024-09-05 9:27 ` Christoph Manszewski
2024-09-05 11:56 ` Zbigniew Kempczyński
2024-09-09 6:54 ` Zbigniew Kempczyński
2024-09-05 9:27 ` [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template Christoph Manszewski
` (16 subsequent siblings)
19 siblings, 2 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:27 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
Add shader building functions and iga64 code used by eudebug subtests.
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
---
lib/gpgpu_shader.c | 392 ++++++++++++++++++++++++++++++++++-
lib/gpgpu_shader.h | 29 ++-
lib/iga64_generated_codes.c | 401 +++++++++++++++++++++++++++++++++++-
3 files changed, 818 insertions(+), 4 deletions(-)
diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
index 80bad342a..dacab51dd 100644
--- a/lib/gpgpu_shader.c
+++ b/lib/gpgpu_shader.c
@@ -7,10 +7,16 @@
#include <i915_drm.h>
+#include "igt_map.h"
#include "ioctl_wrappers.h"
#include "gpgpu_shader.h"
#include "gpu_cmds.h"
+struct label_entry {
+ uint32_t id;
+ uint32_t offset;
+};
+
#define IGA64_ARG0 0xc0ded000
#define IGA64_ARG_MASK 0xffffff00
@@ -32,7 +38,7 @@ static void gpgpu_shader_extend(struct gpgpu_shader *shdr)
igt_assert(shdr->code);
}
-void
+uint32_t
__emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
int argc, uint32_t *argv)
{
@@ -60,6 +66,8 @@ __emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
}
shdr->size += tpls->size;
+
+ return tpls->size;
}
static uint32_t fill_sip(struct intel_bb *ibb,
@@ -235,10 +243,16 @@ struct gpgpu_shader *gpgpu_shader_create(int fd)
shdr->gen_ver = 100 * info->graphics_ver + info->graphics_rel;
shdr->max_size = 16 * 4;
shdr->code = malloc(4 * shdr->max_size);
+ shdr->labels = igt_map_create(igt_map_hash_32, igt_map_equal_32);
igt_assert(shdr->code);
return shdr;
}
+static void free_func(struct igt_map_entry *entry)
+{
+ free(entry->data);
+}
+
/**
* gpgpu_shader_destroy:
* @shdr: pointer to shader struct created with 'gpgpu_shader_create'
@@ -247,10 +261,76 @@ struct gpgpu_shader *gpgpu_shader_create(int fd)
*/
void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
{
+ igt_map_destroy(shdr->labels, free_func);
free(shdr->code);
free(shdr);
}
+/**
+ * gpgpu_shader_dump:
+ * @shdr: shader to be printed
+ *
+ * Print shader instructions from @shdr in hex.
+ */
+void gpgpu_shader_dump(struct gpgpu_shader *shdr)
+{
+ for (int i = 0; i < shdr->size / 4; i++)
+ igt_info("0x%08x 0x%08x 0x%08x 0x%08x\n",
+ shdr->instr[i][0], shdr->instr[i][1],
+ shdr->instr[i][2], shdr->instr[i][3]);
+}
+
+/**
+ * gpgpu_shader__breakpoint_on:
+ * @shdr: shader to create breakpoint in
+ * @cmd_no: index of the instruction to break on
+ *
+ * Insert a breakpoint on the @cmd_no'th instruction within @shdr.
+ */
+void gpgpu_shader__breakpoint_on(struct gpgpu_shader *shdr, uint32_t cmd_no)
+{
+ igt_assert(cmd_no < shdr->size / 4);
+ shdr->instr[cmd_no][0] |= 1<<30;
+}
+
+/**
+ * gpgpu_shader__breakpoint:
+ * @shdr: shader to create breakpoint in
+ *
+ * Insert a breakpoint on the last instruction in @shdr.
+ */
+void gpgpu_shader__breakpoint(struct gpgpu_shader *shdr)
+{
+ gpgpu_shader__breakpoint_on(shdr, gpgpu_shader_last_instr(shdr));
+}
+
+/**
+ * gpgpu_shader__wait:
+ * @shdr: shader to be modified
+ *
+ * Append wait instruction to @shdr. This instruction raises attention
+ * and stops execution.
+ */
+void gpgpu_shader__wait(struct gpgpu_shader *shdr)
+{
+ emit_iga64_code(shdr, sync_host, " \n\
+(W) sync.host null \n\
+ ");
+}
+
+/**
+ * gpgpu_shader__nop:
+ * @shdr: shader to be modified
+ *
+ * Append a no-op instruction to @shdr.
+ */
+void gpgpu_shader__nop(struct gpgpu_shader *shdr)
+{
+ emit_iga64_code(shdr, nop, " \n\
+(W) nop \n\
+ ");
+}
+
/**
* gpgpu_shader__eot:
* @shdr: shader to be modified
@@ -269,6 +349,246 @@ void gpgpu_shader__eot(struct gpgpu_shader *shdr)
");
}
+/**
+ * gpgpu_shader__label:
+ * @shdr: shader to be modified
+ * @label_id: id of the label to be created
+ *
+ * Create a label for the last instruction within @shdr.
+ */
+void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id)
+{
+ struct label_entry *l = malloc(sizeof(*l));
+
+ l->id = label_id;
+ l->offset = shdr->size;
+ igt_map_insert(shdr->labels, &l->id, l);
+}
+
+#define OPCODE(x) (x & 0x7f)
+#define OPCODE_JUMP_INDEXED 0x20
+static void __patch_indexed_jump(struct gpgpu_shader *shdr, int label_id,
+ uint32_t jump_iga64_size)
+{
+ struct label_entry *l;
+ uint32_t *start, *end, *label;
+ int32_t relative;
+
+ l = igt_map_search(shdr->labels, &label_id);
+ igt_assert(l);
+
+ igt_assert(jump_iga64_size % 4 == 0);
+
+ label = shdr->code + l->offset;
+ end = shdr->code + shdr->size;
+ start = end - jump_iga64_size;
+
+ for (; start < end; start += 4)
+ if (OPCODE(*start) == OPCODE_JUMP_INDEXED) {
+ relative = (label - start) * 4;
+ *(start + 3) = relative;
+ break;
+ }
+}
+
+/**
+ * gpgpu_shader__jump:
+ * @shdr: shader to be modified
+ * @label_id: label to jump to
+ *
+ * Append jump instruction to @shdr. Jump to instruction with label @label_id.
+ */
+void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id)
+{
+ size_t shader_size;
+
+ shader_size = emit_iga64_code(shdr, jump, " \n\
+L0: \n\
+(W) jmpi L0 \n\
+ ");
+
+ __patch_indexed_jump(shdr, label_id, shader_size);
+}
+
+/**
+ * gpgpu_shader__jump_neq:
+ * @shdr: shader to be modified
+ * @label_id: label to jump to
+ * @y_offset: offset within target buffer in rows
+ * @value: expected value
+ *
+ * Append jump instruction to @shdr. Jump to instruction with label @label_id
+ * when @value is not equal to dword stored at @y_offset within the surface.
+ */
+void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
+ uint32_t y_offset, uint32_t value)
+{
+ uint32_t size;
+
+ size = emit_iga64_code(shdr, jump_dw_neq, " \n\
+L0: \n\
+(W) mov (16|M0) r30.0<1>:ud 0x0:ud \n\
+#if GEN_VER < 2000 // Media Block Write \n\
+ // Y offset of the block in rows := thread group id Y \n\
+(W) mov (1|M0) r30.1<1>:ud ARG(0):ud \n\
+ // block width [0,63] representing 1 to 64 bytes, we want dword \n\
+(W) mov (1|M0) r30.2<1>:ud 0x3:ud \n\
+ // FFTID := FFTID from R0 header \n\
+(W) mov (1|M0) r30.4<1>:ud r0.5<0;1,0>:ud \n\
+(W) send.dc1 (16|M0) r31 r30 null 0x0 0x2190000 \n\
+#else // Typed 2D Block Store \n\
+ // Store X and Y block start (160:191 and 192:223) \n\
+(W) mov (2|M0) r30.6<1>:ud ARG(0):ud \n\
+ // Store X and Y block size (224:231 and 232:239) \n\
+(W) mov (1|M0) r30.7<1>:ud 0x3:ud \n\
+(W) send.tgm (16|M0) r31 r30 null:0 0x0 0x62100003 \n\
+#endif \n\
+ // clear the flag register \n\
+(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
+(W) cmp (1|M0) (ne)f0.0 null<1>:ud r31.0<0;1,0>:ud ARG(1):ud \n\
+(W&f0.0) jmpi L0 \n\
+ ", y_offset, value);
+
+ __patch_indexed_jump(shdr, label_id, size);
+}
+
+/**
+ * gpgpu_shader__loop_begin:
+ * @shdr: shader to be modified
+ * @label_id: id of the label to be created
+ *
+ * Begin a counting loop in @shdr. All subsequent instructions will constitute
+ * the loop body up until 'gpgpu_shader__loop_end' gets called. The first
+ * instruction of the loop will be at label @label_id. The r40 register will be
+ * overwritten as it is used as the loop counter.
+ */
+void gpgpu_shader__loop_begin(struct gpgpu_shader *shdr, int label_id)
+{
+ emit_iga64_code(shdr, clear_r40, " \n\
+L0: \n\
+(W) mov (1|M0) r40:ud 0x0:ud \n\
+ ");
+
+ gpgpu_shader__label(shdr, label_id);
+}
+
+/**
+ * gpgpu_shader__loop_end:
+ * @shdr: shader to be modified
+ * @label_id: label id passed to 'gpgpu_shader__loop_begin'
+ * @iter: iteration count
+ *
+ * End loop body in @shdr.
+ */
+void gpgpu_shader__loop_end(struct gpgpu_shader *shdr, int label_id, uint32_t iter)
+{
+ uint32_t size;
+
+ size = emit_iga64_code(shdr, inc_r40_jump_neq, " \n\
+L0: \n\
+(W) add (1|M0) r40:ud r40.0<0;1,0>:ud 0x1:ud \n\
+(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
+(W) cmp (1|M0) (ne)f0.0 null<1>:ud r40.0<0;1,0>:ud ARG(0):ud \n\
+(W&f0.0) jmpi L0 \n\
+ ", iter);
+
+ __patch_indexed_jump(shdr, label_id, size);
+}
+
+/**
+ * gpgpu_shader__common_target_write:
+ * @shdr: shader to be modified
+ * @y_offset: write target offset within target buffer in rows
+ * @value: oword to be written
+ *
+ * Write the oword stored in @value to the target buffer at @y_offset.
+ */
+void gpgpu_shader__common_target_write(struct gpgpu_shader *shdr,
+ uint32_t y_offset, const uint32_t value[4])
+{
+ emit_iga64_code(shdr, common_target_write, " \n\
+(W) mov (16|M0) r30.0<1>:ud 0x0:ud \n\
+(W) mov (16|M0) r31.0<1>:ud 0x0:ud \n\
+(W) mov (1|M0) r31.0<1>:ud ARG(1):ud \n\
+(W) mov (1|M0) r31.1<1>:ud ARG(2):ud \n\
+(W) mov (1|M0) r31.2<1>:ud ARG(3):ud \n\
+(W) mov (1|M0) r31.3<1>:ud ARG(4):ud \n\
+#if GEN_VER < 2000 // Media Block Write \n\
+ // Y offset of the block in rows \n\
+(W) mov (1|M0) r30.1<1>:ud ARG(0):ud \n\
+ // block width [0,63] representing 1 to 64 bytes \n\
+(W) mov (1|M0) r30.2<1>:ud 0xf:ud \n\
+ // FFTID := FFTID from R0 header \n\
+(W) mov (1|M0) r30.4<1>:ud r0.5<0;1,0>:ud \n\
+ // written value \n\
+(W) send.dc1 (16|M0) null r30 src1_null 0x0 0x40A8000 \n\
+#else // Typed 2D Block Store \n\
+ // Store X and Y block start (160:191 and 192:223) \n\
+(W) mov (2|M0) r30.6<1>:ud ARG(0):ud \n\
+ // Store X and Y block size (224:231 and 232:239) \n\
+(W) mov (1|M0) r30.7<1>:ud 0xf:ud \n\
+(W) send.tgm (16|M0) null r30 null:0 0x0 0x64000007 \n\
+#endif \n\
+ ", y_offset, value[0], value[1], value[2], value[3]);
+}
+
+/**
+ * gpgpu_shader__common_target_write_u32:
+ * @shdr: shader to be modified
+ * @y_offset: write target offset within target buffer in rows
+ * @value: dword to be written
+ *
+ * Fill oword at @y_offset with dword stored in @value.
+ */
+void gpgpu_shader__common_target_write_u32(struct gpgpu_shader *shdr,
+ uint32_t y_offset, uint32_t value)
+{
+ const uint32_t owblock[4] = {
+ value, value, value, value
+ };
+ gpgpu_shader__common_target_write(shdr, y_offset, owblock);
+}
+
+/**
+ * gpgpu_shader__write_aip:
+ * @shdr: shader to be modified
+ * @y_offset: write target offset within the surface in rows
+ *
+ * Write address instruction pointer to row tg_id_y + @y_offset.
+ */
+void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset)
+{
+ emit_iga64_code(shdr, media_block_write_aip, " \n\
+ // Payload \n\
+(W) mov (1|M0) r5.0<1>:ud cr0.2:ud \n\
+#if GEN_VER < 2000 // Media Block Write \n\
+ // X offset of the block in bytes := (thread group id X << ARG(0)) \n\
+(W) shl (1|M0) r4.0<1>:ud r0.1<0;1,0>:ud 0x2:ud \n\
+ // Y offset of the block in rows := thread group id Y \n\
+(W) mov (1|M0) r4.1<1>:ud r0.6<0;1,0>:ud \n\
+(W) add (1|M0) r4.1<1>:ud r4.1<0;1,0>:ud ARG(0):ud \n\
+ // block width [0,63] representing 1 to 64 bytes \n\
+(W) mov (1|M0) r4.2<1>:ud 0x3:ud \n\
+ // FFTID := FFTID from R0 header \n\
+(W) mov (1|M0) r4.4<1>:ud r0.5<0;1,0>:ud \n\
+(W) send.dc1 (16|M0) null r4 src1_null 0 0x40A8000 \n\
+#else // Typed 2D Block Store \n\
+ // Load r2.0-3 with tg id X << ARG(0) \n\
+(W) shl (1|M0) r2.0<1>:ud r0.1<0;1,0>:ud 0x2:ud \n\
+ // Load r2.4-7 with tg id Y + ARG(1):ud \n\
+(W) mov (1|M0) r2.1<1>:ud r0.6<0;1,0>:ud \n\
+(W) add (1|M0) r2.1<1>:ud r2.1<0;1,0>:ud ARG(0):ud \n\
+ // payload setup \n\
+(W) mov (16|M0) r4.0<1>:ud 0x0:ud \n\
+ // Store X and Y block start (160:191 and 192:223) \n\
+(W) mov (2|M0) r4.5<1>:ud r2.0<2;2,1>:ud \n\
+ // Store X and Y block max_size (224:231 and 232:239) \n\
+(W) mov (1|M0) r4.7<1>:ud 0x3:ud \n\
+(W) send.tgm (16|M0) null r4 null:0 0 0x64000007 \n\
+#endif \n\
+ ", y_offset);
+}
+
/**
* gpgpu_shader__write_dword:
* @shdr: shader to be modified
@@ -313,3 +633,73 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
#endif \n\
", 2, y_offset, 3, value, value, value, value);
}
+
+/**
+ * gpgpu_shader__end_system_routine:
+ * @shdr: shader to be modified
+ * @breakpoint_suppress: breakpoint suppress flag
+ *
+ * Return from system routine. To prevent infinite jumping to the system
+ * routine on a breakpoint, @breakpoint_suppress flag has to be set.
+ */
+void gpgpu_shader__end_system_routine(struct gpgpu_shader *shdr,
+ bool breakpoint_suppress)
+{
+ /*
+ * set breakpoint suppress bit to avoid an endless loop
+ * when sip was invoked by a breakpoint
+ */
+ if (breakpoint_suppress)
+ emit_iga64_code(shdr, breakpoint_suppress, " \n\
+(W) or (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x8000:ud \n\
+ ");
+
+ emit_iga64_code(shdr, end_system_routine, " \n\
+(W) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
+ // return to an application \n\
+(W) and (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x7FFFFFFD:ud \n\
+ ", 0x7fffff | (1 << 26)); /* clear all exceptions, except read only bit */
+}
+
+/**
+ * gpgpu_shader__end_system_routine_step_if_eq:
+ * @shdr: shader to be modified
+ * @y_offset: offset within target buffer in rows
+ * @value: expected value for single stepping execution
+ *
+ * Return from system routine. Don't clear breakpoint exception when @value
+ * is equal to value stored at @y_offset. This triggers the system routine
+ * after the subsequent instruction, resulting in single stepping execution.
+ */
+void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
+ uint32_t y_offset,
+ uint32_t value)
+{
+ emit_iga64_code(shdr, end_system_routine_step_if_eq, " \n\
+(W) or (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x8000:ud \n\
+(W) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
+(W) mov (16|M0) r30.0<1>:ud 0x0:ud \n\
+#if GEN_VER < 2000 // Media Block Write \n\
+ // Y offset of the block in rows := thread group id Y \n\
+(W) mov (1|M0) r30.1<1>:ud ARG(1):ud \n\
+ // block width [0,63] representing 1 to 64 bytes, we want dword \n\
+(W) mov (1|M0) r30.2<1>:ud 0x3:ud \n\
+ // FFTID := FFTID from R0 header \n\
+(W) mov (1|M0) r30.4<1>:ud r0.5<0;1,0>:ud \n\
+(W) send.dc1 (16|M0) r31 r30 null 0x0 0x2190000 \n\
+#else // Typed 2D Block Store \n\
+ // Store X and Y block start (160:191 and 192:223) \n\
+(W) mov (2|M0) r30.6<1>:ud ARG(1):ud \n\
+ // Store X and Y block size (224:231 and 232:239) \n\
+(W) mov (1|M0) r30.7<1>:ud 0x3:ud \n\
+(W) send.tgm (16|M0) r31 r30 null:0 0x0 0x62100003 \n\
+#endif \n\
+ // clear the flag register \n\
+(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
+(W) cmp (1|M0) (ne)f0.0 null<1>:ud r31.0<0;1,0>:ud ARG(2):ud \n\
+(W&f0.0) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(3):ud \n\
+ // return to an application \n\
+(W) and (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x7FFFFFFD:ud \n\
+ ", 0x807fffff, /* leave breakpoint exception */
+ y_offset, value, 0x7fffff /* clear all exceptions */ );
+}
diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
index 255f93b4d..da4ece983 100644
--- a/lib/gpgpu_shader.h
+++ b/lib/gpgpu_shader.h
@@ -21,6 +21,7 @@ struct gpgpu_shader {
uint32_t *code;
uint32_t (*instr)[4];
};
+ struct igt_map *labels;
};
struct iga64_template {
@@ -31,7 +32,7 @@ struct iga64_template {
#pragma GCC diagnostic ignored "-Wnested-externs"
-void
+uint32_t
__emit_iga64_code(struct gpgpu_shader *shdr, const struct iga64_template *tpls,
int argc, uint32_t *argv);
@@ -56,8 +57,32 @@ void gpgpu_shader_exec(struct intel_bb *ibb,
struct gpgpu_shader *sip,
uint64_t ring, bool explicit_engine);
+static inline uint32_t gpgpu_shader_last_instr(struct gpgpu_shader *shdr)
+{
+ return shdr->size / 4 - 1;
+}
+
+void gpgpu_shader__wait(struct gpgpu_shader *shdr);
+void gpgpu_shader__breakpoint_on(struct gpgpu_shader *shdr, uint32_t cmd_no);
+void gpgpu_shader__breakpoint(struct gpgpu_shader *shdr);
+void gpgpu_shader__nop(struct gpgpu_shader *shdr);
void gpgpu_shader__eot(struct gpgpu_shader *shdr);
+void gpgpu_shader__common_target_write(struct gpgpu_shader *shdr,
+ uint32_t y_offset, const uint32_t value[4]);
+void gpgpu_shader__common_target_write_u32(struct gpgpu_shader *shdr,
+ uint32_t y_offset, uint32_t value);
+void gpgpu_shader__end_system_routine(struct gpgpu_shader *shdr,
+ bool breakpoint_suppress);
+void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
+ uint32_t dw_offset,
+ uint32_t value);
+void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset);
void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
uint32_t y_offset);
-
+void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id);
+void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id);
+void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
+ uint32_t dw_offset, uint32_t value);
+void gpgpu_shader__loop_begin(struct gpgpu_shader *shdr, int label_id);
+void gpgpu_shader__loop_end(struct gpgpu_shader *shdr, int label_id, uint32_t iter);
#endif /* GPGPU_SHADER_H */
diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
index ea8d0f097..dd849eebc 100644
--- a/lib/iga64_generated_codes.c
+++ b/lib/iga64_generated_codes.c
@@ -3,7 +3,7 @@
#include "gpgpu_shader.h"
-#define MD5_SUM_IGA64_ASMS 9977ade854d57c5af5c5ca9e93c0f37e
+#define MD5_SUM_IGA64_ASMS 33b7cd843e3b009c123a85a6c520d7d0
struct iga64_template const iga64_code_gpgpu_fill[] = {
{ .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
@@ -79,6 +79,119 @@ struct iga64_template const iga64_code_gpgpu_fill[] = {
}}
};
+struct iga64_template const iga64_code_end_system_routine_step_if_eq[] = {
+ { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
+ 0x80000966, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80100961, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80040061, 0x1e654220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1e754220, 0x00000000, 0x00000003,
+ 0x80132031, 0x1f0c0000, 0xd0061e8c, 0x04000000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80008070, 0x00018220, 0x22001f04, 0xc0ded002,
+ 0x84000965, 0x80118220, 0x02008010, 0xc0ded003,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 52, .code = (const uint32_t []) {
+ 0x80000966, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80040961, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80002070, 0x00018220, 0x22001f04, 0xc0ded002,
+ 0x81000965, 0x80218220, 0x02008020, 0xc0ded003,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 48, .code = (const uint32_t []) {
+ 0x80000966, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80100961, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e154220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1e254220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e450220, 0x00000054, 0x00000000,
+ 0x80132031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80008070, 0x00018220, 0x22001f04, 0xc0ded002,
+ 0x84000965, 0x80118220, 0x02008010, 0xc0ded003,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 52, .code = (const uint32_t []) {
+ 0x80000966, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80040961, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80002070, 0x00018220, 0x22001f04, 0xc0ded002,
+ 0x81000965, 0x80218220, 0x02008020, 0xc0ded003,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 48, .code = (const uint32_t []) {
+ 0x80000166, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000165, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80040161, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80049031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80002070, 0x00018220, 0x22001f04, 0xc0ded002,
+ 0x81000165, 0x80218220, 0x02008020, 0xc0ded003,
+ 0x80000165, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_end_system_routine[] = {
+ { .gen_ver = 2000, .size = 12, .code = (const uint32_t []) {
+ 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 12, .code = (const uint32_t []) {
+ 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 12, .code = (const uint32_t []) {
+ 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
+ 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 12, .code = (const uint32_t []) {
+ 0x80000165, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000165, 0x80018220, 0x02008000, 0x7ffffffd,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_breakpoint_suppress[] = {
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x80000966, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x80000166, 0x80018220, 0x02008000, 0x00008000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
struct iga64_template const iga64_code_media_block_write[] = {
{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
0x80100061, 0x04054220, 0x00000000, 0x00000000,
@@ -164,6 +277,270 @@ struct iga64_template const iga64_code_media_block_write[] = {
}}
};
+struct iga64_template const iga64_code_media_block_write_aip[] = {
+ { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
+ 0x80000961, 0x05050220, 0x00008020, 0x00000000,
+ 0x80000969, 0x02058220, 0x02000014, 0x00000002,
+ 0x80000061, 0x02150220, 0x00000064, 0x00000000,
+ 0x80001940, 0x02158220, 0x02000214, 0xc0ded000,
+ 0x80100061, 0x04054220, 0x00000000, 0x00000000,
+ 0x80041a61, 0x04550220, 0x00220205, 0x00000000,
+ 0x80000061, 0x04754220, 0x00000000, 0x00000003,
+ 0x80132031, 0x00000000, 0xd00e0494, 0x04000000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 44, .code = (const uint32_t []) {
+ 0x80000961, 0x05050220, 0x00008040, 0x00000000,
+ 0x80000969, 0x04058220, 0x02000024, 0x00000002,
+ 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+ 0x80001940, 0x04258220, 0x02000424, 0xc0ded000,
+ 0x80000061, 0x04454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 40, .code = (const uint32_t []) {
+ 0x80000961, 0x05050220, 0x00008020, 0x00000000,
+ 0x80000969, 0x04058220, 0x02000014, 0x00000002,
+ 0x80000061, 0x04150220, 0x00000064, 0x00000000,
+ 0x80001940, 0x04158220, 0x02000414, 0xc0ded000,
+ 0x80000061, 0x04254220, 0x00000000, 0x00000003,
+ 0x80000061, 0x04450220, 0x00000054, 0x00000000,
+ 0x80132031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 44, .code = (const uint32_t []) {
+ 0x80000961, 0x05050220, 0x00008040, 0x00000000,
+ 0x80000969, 0x04058220, 0x02000024, 0x00000002,
+ 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+ 0x80001940, 0x04258220, 0x02000424, 0xc0ded000,
+ 0x80000061, 0x04454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 40, .code = (const uint32_t []) {
+ 0x80000161, 0x05050220, 0x00008040, 0x00000000,
+ 0x80000169, 0x04058220, 0x02000024, 0x00000002,
+ 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+ 0x80000140, 0x04258220, 0x02000424, 0xc0ded000,
+ 0x80000061, 0x04454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+ 0x80049031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_common_target_write[] = {
+ { .gen_ver = 2000, .size = 48, .code = (const uint32_t []) {
+ 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80100061, 0x1f054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1f154220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x1f254220, 0x00000000, 0xc0ded003,
+ 0x80000061, 0x1f354220, 0x00000000, 0xc0ded004,
+ 0x80040061, 0x1e654220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e754220, 0x00000000, 0x0000000f,
+ 0x80132031, 0x00000000, 0xd00e1e94, 0x04000000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 56, .code = (const uint32_t []) {
+ 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80040061, 0x1f054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1f254220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x1f454220, 0x00000000, 0xc0ded003,
+ 0x80000061, 0x1f654220, 0x00000000, 0xc0ded004,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e454220, 0x00000000, 0x0000000f,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x00000000, 0xc0001e14, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 52, .code = (const uint32_t []) {
+ 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80100061, 0x1f054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1f154220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x1f254220, 0x00000000, 0xc0ded003,
+ 0x80000061, 0x1f354220, 0x00000000, 0xc0ded004,
+ 0x80000061, 0x1e154220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e254220, 0x00000000, 0x0000000f,
+ 0x80000061, 0x1e450220, 0x00000054, 0x00000000,
+ 0x80132031, 0x00000000, 0xc0001e14, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 56, .code = (const uint32_t []) {
+ 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80040061, 0x1f054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1f254220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x1f454220, 0x00000000, 0xc0ded003,
+ 0x80000061, 0x1f654220, 0x00000000, 0xc0ded004,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e454220, 0x00000000, 0x0000000f,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x00000000, 0xc0001e14, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 52, .code = (const uint32_t []) {
+ 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80040061, 0x1f054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
+ 0x80000061, 0x1f254220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x1f454220, 0x00000000, 0xc0ded003,
+ 0x80000061, 0x1f654220, 0x00000000, 0xc0ded004,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e454220, 0x00000000, 0x0000000f,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80049031, 0x00000000, 0xc0001e14, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_inc_r40_jump_neq[] = {
+ { .gen_ver = 2000, .size = 20, .code = (const uint32_t []) {
+ 0x80000040, 0x28058220, 0x02002804, 0x00000001,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
+ 0x84000020, 0x00004000, 0x00000000, 0xffffffd0,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 20, .code = (const uint32_t []) {
+ 0x80000040, 0x28058220, 0x02002804, 0x00000001,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
+ 0x81000020, 0x00004000, 0x00000000, 0xffffffd0,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 20, .code = (const uint32_t []) {
+ 0x80000040, 0x28058220, 0x02002804, 0x00000001,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
+ 0x84000020, 0x00004000, 0x00000000, 0xffffffd0,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 20, .code = (const uint32_t []) {
+ 0x80000040, 0x28058220, 0x02002804, 0x00000001,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
+ 0x81000020, 0x00004000, 0x00000000, 0xffffffd0,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 20, .code = (const uint32_t []) {
+ 0x80000040, 0x28058220, 0x02002804, 0x00000001,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80000270, 0x00018220, 0x22002804, 0xc0ded000,
+ 0x81000020, 0x00004000, 0x00000000, 0xffffffd0,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_clear_r40[] = {
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x80000061, 0x28054220, 0x00000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x80000061, 0x28054220, 0x00000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_jump_dw_neq[] = {
+ { .gen_ver = 2000, .size = 32, .code = (const uint32_t []) {
+ 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80040061, 0x1e654220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e754220, 0x00000000, 0x00000003,
+ 0x80132031, 0x1f0c0000, 0xd0061e8c, 0x04000000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80008070, 0x00018220, 0x22001f04, 0xc0ded001,
+ 0x84000020, 0x00004000, 0x00000000, 0xffffffa0,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 40, .code = (const uint32_t []) {
+ 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80002070, 0x00018220, 0x22001f04, 0xc0ded001,
+ 0x81000020, 0x00004000, 0x00000000, 0xffffff80,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 36, .code = (const uint32_t []) {
+ 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e154220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e254220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e450220, 0x00000054, 0x00000000,
+ 0x80132031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80008070, 0x00018220, 0x22001f04, 0xc0ded001,
+ 0x84000020, 0x00004000, 0x00000000, 0xffffff90,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 40, .code = (const uint32_t []) {
+ 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80001901, 0x00010000, 0x00000000, 0x00000000,
+ 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80002070, 0x00018220, 0x22001f04, 0xc0ded001,
+ 0x81000020, 0x00004000, 0x00000000, 0xffffff80,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 36, .code = (const uint32_t []) {
+ 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
+ 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
+ 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
+ 0x80049031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
+ 0x80000061, 0x30014220, 0x00000000, 0x00000000,
+ 0x80002070, 0x00018220, 0x22001f04, 0xc0ded001,
+ 0x81000020, 0x00004000, 0x00000000, 0xffffff90,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_jump[] = {
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x80000020, 0x00004000, 0x00000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x80000020, 0x00004000, 0x00000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
struct iga64_template const iga64_code_eot[] = {
{ .gen_ver = 2000, .size = 8, .code = (const uint32_t []) {
0x800c0061, 0x70050220, 0x00460005, 0x00000000,
@@ -188,3 +565,25 @@ struct iga64_template const iga64_code_eot[] = {
0x80049031, 0x00000004, 0x7020700c, 0x10000000,
}}
};
+
+struct iga64_template const iga64_code_nop[] = {
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x00000060, 0x00000000, 0x00000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x00000060, 0x00000000, 0x00000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_sync_host[] = {
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x80000001, 0x00010000, 0xf0000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x80000001, 0x00010000, 0xf0000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
--
2.34.1
* Re: [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library
2024-09-05 9:27 ` [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library Christoph Manszewski
@ 2024-09-05 11:56 ` Zbigniew Kempczyński
2024-09-09 6:54 ` Zbigniew Kempczyński
1 sibling, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-05 11:56 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:27:58AM +0200, Christoph Manszewski wrote:
> Add shader building functions and iga64 code used by eudebug subtests.
>
> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> ---
> lib/gpgpu_shader.c | 392 ++++++++++++++++++++++++++++++++++-
> lib/gpgpu_shader.h | 29 ++-
> lib/iga64_generated_codes.c | 401 +++++++++++++++++++++++++++++++++++-
> 3 files changed, 818 insertions(+), 4 deletions(-)
>
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> index 80bad342a..dacab51dd 100644
> --- a/lib/gpgpu_shader.c
> +++ b/lib/gpgpu_shader.c
> @@ -7,10 +7,16 @@
>
> #include <i915_drm.h>
>
> +#include "igt_map.h"
> #include "ioctl_wrappers.h"
> #include "gpgpu_shader.h"
> #include "gpu_cmds.h"
>
> +struct label_entry {
> + uint32_t id;
> + uint32_t offset;
> +};
> +
> #define IGA64_ARG0 0xc0ded000
> #define IGA64_ARG_MASK 0xffffff00
>
> @@ -32,7 +38,7 @@ static void gpgpu_shader_extend(struct gpgpu_shader *shdr)
> igt_assert(shdr->code);
> }
>
> -void
> +uint32_t
> __emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
> int argc, uint32_t *argv)
> {
> @@ -60,6 +66,8 @@ __emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
> }
>
> shdr->size += tpls->size;
> +
> + return tpls->size;
> }
>
> static uint32_t fill_sip(struct intel_bb *ibb,
> @@ -235,10 +243,16 @@ struct gpgpu_shader *gpgpu_shader_create(int fd)
> shdr->gen_ver = 100 * info->graphics_ver + info->graphics_rel;
> shdr->max_size = 16 * 4;
> shdr->code = malloc(4 * shdr->max_size);
> + shdr->labels = igt_map_create(igt_map_hash_32, igt_map_equal_32);
> igt_assert(shdr->code);
> return shdr;
> }
>
> +static void free_func(struct igt_map_entry *entry)
> +{
> + free(entry->data);
> +}
> +
> /**
> * gpgpu_shader_destroy:
> * @shdr: pointer to shader struct created with 'gpgpu_shader_create'
> @@ -247,10 +261,76 @@ struct gpgpu_shader *gpgpu_shader_create(int fd)
> */
> void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
> {
> + igt_map_destroy(shdr->labels, free_func);
> free(shdr->code);
> free(shdr);
> }
>
> +/**
> + * gpgpu_shader_dump:
> + * @shdr: shader to be printed
> + *
> + * Print shader instructions from @shdr in hex.
> + */
> +void gpgpu_shader_dump(struct gpgpu_shader *shdr)
> +{
> + for (int i = 0; i < shdr->size / 4; i++)
> + igt_info("0x%08x 0x%08x 0x%08x 0x%08x\n",
> + shdr->instr[i][0], shdr->instr[i][1],
> + shdr->instr[i][2], shdr->instr[i][3]);
> +}
> +
> +/**
> + * gpgpu_shader__breakpoint_on:
> + * @shdr: shader to create breakpoint in
> + * @cmd_no: index of the instruction to break on
> + *
> + * Insert a breakpoint on the @cmd_no'th instruction within @shdr.
> + */
> +void gpgpu_shader__breakpoint_on(struct gpgpu_shader *shdr, uint32_t cmd_no)
> +{
> + igt_assert(cmd_no < shdr->size / 4);
> + shdr->instr[cmd_no][0] |= 1<<30;
> +}
> +
> +/**
> + * gpgpu_shader__breakpoint:
> + * @shdr: shader to create breakpoint in
> + *
> + * Insert a breakpoint on the last instruction in @shdr.
> + */
> +void gpgpu_shader__breakpoint(struct gpgpu_shader *shdr)
> +{
> + gpgpu_shader__breakpoint_on(shdr, gpgpu_shader_last_instr(shdr));
> +}
> +
> +/**
> + * gpgpu_shader__wait:
> + * @shdr: shader to be modified
> + *
> + * Append a wait instruction to @shdr. This instruction raises attention
> + * and stops execution.
> + */
> +void gpgpu_shader__wait(struct gpgpu_shader *shdr)
> +{
> + emit_iga64_code(shdr, sync_host, " \n\
> +(W) sync.host null \n\
> + ");
> +}
> +
> +/**
> + * gpgpu_shader__nop:
> + * @shdr: shader to be modified
> + *
> + * Append a no-op instruction to @shdr.
> + */
> +void gpgpu_shader__nop(struct gpgpu_shader *shdr)
> +{
> + emit_iga64_code(shdr, nop, " \n\
> +(W) nop \n\
> + ");
> +}
> +
> /**
> * gpgpu_shader__eot:
> * @shdr: shader to be modified
> @@ -269,6 +349,246 @@ void gpgpu_shader__eot(struct gpgpu_shader *shdr)
> ");
> }
>
> +/**
> + * gpgpu_shader__label:
> + * @shdr: shader to be modified
> + * @label_id: id of the label to be created
> + *
> + * Create a label for the last instruction within @shdr.
> + */
> +void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id)
> +{
> + struct label_entry *l = malloc(sizeof(*l));
> +
> + l->id = label_id;
> + l->offset = shdr->size;
> + igt_map_insert(shdr->labels, &l->id, l);
> +}
> +
> +#define OPCODE(x) (x & 0x7f)
> +#define OPCODE_JUMP_INDEXED 0x20
> +static void __patch_indexed_jump(struct gpgpu_shader *shdr, int label_id,
> + uint32_t jump_iga64_size)
> +{
> + struct label_entry *l;
> + uint32_t *start, *end, *label;
> + int32_t relative;
> +
> + l = igt_map_search(shdr->labels, &label_id);
> + igt_assert(l);
> +
> + igt_assert(jump_iga64_size % 4 == 0);
> +
> + label = shdr->code + l->offset;
> + end = shdr->code + shdr->size;
> + start = end - jump_iga64_size;
> +
> + for (; start < end; start += 4)
> + if (OPCODE(*start) == OPCODE_JUMP_INDEXED) {
> + relative = (label - start) * 4;
> + *(start + 3) = relative;
> + break;
> + }
> +}
> +
> +/**
> + * gpgpu_shader__jump:
> + * @shdr: shader to be modified
> + * @label_id: label to jump to
> + *
> + * Append jump instruction to @shdr. Jump to instruction with label @label_id.
> + */
> +void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id)
> +{
> + size_t shader_size;
> +
> + shader_size = emit_iga64_code(shdr, jump, " \n\
> +L0: \n\
> +(W) jmpi L0 \n\
> + ");
> +
> + __patch_indexed_jump(shdr, label_id, shader_size);
> +}
> +
> +/**
> + * gpgpu_shader__jump_neq:
> + * @shdr: shader to be modified
> + * @label_id: label to jump to
> + * @y_offset: offset within target buffer in rows
> + * @value: expected value
> + *
> + * Append jump instruction to @shdr. Jump to instruction with label @label_id
> + * when @value is not equal to dword stored at @y_offset within the surface.
> + */
> +void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
> + uint32_t y_offset, uint32_t value)
> +{
> + uint32_t size;
> +
> + size = emit_iga64_code(shdr, jump_dw_neq, " \n\
> +L0: \n\
> +(W) mov (16|M0) r30.0<1>:ud 0x0:ud \n\
> +#if GEN_VER < 2000 // Media Block Write \n\
> + // Y offset of the block in rows := thread group id Y \n\
> +(W) mov (1|M0) r30.1<1>:ud ARG(0):ud \n\
> + // block width [0,63] representing 1 to 64 bytes, we want dword \n\
> +(W) mov (1|M0) r30.2<1>:ud 0x3:ud \n\
> + // FFTID := FFTID from R0 header \n\
> +(W) mov (1|M0) r30.4<1>:ud r0.5<0;1,0>:ud \n\
> +(W) send.dc1 (16|M0) r31 r30 null 0x0 0x2190000 \n\
> +#else // Typed 2D Block Store \n\
> + // Store X and Y block start (160:191 and 192:223) \n\
> +(W) mov (2|M0) r30.6<1>:ud ARG(0):ud \n\
I haven't spotted this before; it's a really minor nit:
mov (1|M0) is enough, as r30.7 is written below.
> + // Store X and Y block size (224:231 and 232:239) \n\
> +(W) mov (1|M0) r30.7<1>:ud 0x3:ud \n\
> +(W) send.tgm (16|M0) r31 r30 null:0 0x0 0x62100003 \n\
> +#endif \n\
> + // clear the flag register \n\
> +(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
> +(W) cmp (1|M0) (ne)f0.0 null<1>:ud r31.0<0;1,0>:ud ARG(1):ud \n\
> +(W&f0.0) jmpi L0 \n\
> + ", y_offset, value);
> +
> + __patch_indexed_jump(shdr, label_id, size);
> +}
> +
> +/**
> + * gpgpu_shader__loop_begin:
> + * @shdr: shader to be modified
> + * @label_id: id of the label to be created
> + *
> + * Begin a counting loop in @shdr. All subsequent instructions will constitute
> + * the loop body up until 'gpgpu_shader__loop_end' gets called. The first
> + * instruction of the loop will be at label @label_id. The r40 register will be
> + * overwritten as it is used as the loop counter.
> + */
> +void gpgpu_shader__loop_begin(struct gpgpu_shader *shdr, int label_id)
> +{
> + emit_iga64_code(shdr, clear_r40, " \n\
> +L0: \n\
> +(W) mov (1|M0) r40:ud 0x0:ud \n\
> + ");
> +
> + gpgpu_shader__label(shdr, label_id);
> +}
> +
> +/**
> + * gpgpu_shader__loop_end:
> + * @shdr: shader to be modified
> + * @label_id: label id passed to 'gpgpu_shader__loop_begin'
> + * @iter: iteration count
> + *
> + * End loop body in @shdr.
> + */
> +void gpgpu_shader__loop_end(struct gpgpu_shader *shdr, int label_id, uint32_t iter)
> +{
> + uint32_t size;
> +
> + size = emit_iga64_code(shdr, inc_r40_jump_neq, " \n\
> +L0: \n\
> +(W) add (1|M0) r40:ud r40.0<0;1,0>:ud 0x1:ud \n\
> +(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
> +(W) cmp (1|M0) (ne)f0.0 null<1>:ud r40.0<0;1,0>:ud ARG(0):ud \n\
> +(W&f0.0) jmpi L0 \n\
> + ", iter);
> +
> + __patch_indexed_jump(shdr, label_id, size);
> +}
> +
> +/**
> + * gpgpu_shader__common_target_write:
> + * @shdr: shader to be modified
> + * @y_offset: write target offset within target buffer in rows
> + * @value: oword to be written
> + *
> + * Write the oword stored in @value to the target buffer at @y_offset.
> + */
> +void gpgpu_shader__common_target_write(struct gpgpu_shader *shdr,
> + uint32_t y_offset, const uint32_t value[4])
> +{
> + emit_iga64_code(shdr, common_target_write, " \n\
> +(W) mov (16|M0) r30.0<1>:ud 0x0:ud \n\
> +(W) mov (16|M0) r31.0<1>:ud 0x0:ud \n\
> +(W) mov (1|M0) r31.0<1>:ud ARG(1):ud \n\
> +(W) mov (1|M0) r31.1<1>:ud ARG(2):ud \n\
> +(W) mov (1|M0) r31.2<1>:ud ARG(3):ud \n\
> +(W) mov (1|M0) r31.3<1>:ud ARG(4):ud \n\
> +#if GEN_VER < 2000 // Media Block Write \n\
> + // Y offset of the block in rows \n\
> +(W) mov (1|M0) r30.1<1>:ud ARG(0):ud \n\
> + // block width [0,63] representing 1 to 64 bytes \n\
> +(W) mov (1|M0) r30.2<1>:ud 0xf:ud \n\
> + // FFTID := FFTID from R0 header \n\
> +(W) mov (1|M0) r30.4<1>:ud r0.5<0;1,0>:ud \n\
> + // written value \n\
> +(W) send.dc1 (16|M0) null r30 src1_null 0x0 0x40A8000 \n\
> +#else // Typed 2D Block Store \n\
> + // Store X and Y block start (160:191 and 192:223) \n\
> +(W) mov (2|M0) r30.6<1>:ud ARG(0):ud \n\
Same here.
> + // Store X and Y block size (224:231 and 232:239) \n\
> +(W) mov (1|M0) r30.7<1>:ud 0xf:ud \n\
> +(W) send.tgm (16|M0) null r30 null:0 0x0 0x64000007 \n\
> +#endif \n\
> + ", y_offset, value[0], value[1], value[2], value[3]);
> +}
> +
> +/**
> + * gpgpu_shader__common_target_write_u32:
> + * @shdr: shader to be modified
> + * @y_offset: write target offset within target buffer in rows
> + * @value: dword to be written
> + *
> + * Fill oword at @y_offset with dword stored in @value.
> + */
> +void gpgpu_shader__common_target_write_u32(struct gpgpu_shader *shdr,
> + uint32_t y_offset, uint32_t value)
> +{
> + const uint32_t owblock[4] = {
> + value, value, value, value
> + };
> + gpgpu_shader__common_target_write(shdr, y_offset, owblock);
> +}
> +
> +/**
> + * gpgpu_shader__write_aip:
> + * @shdr: shader to be modified
> + * @y_offset: write target offset within the surface in rows
> + *
> + * Write address instruction pointer to row tg_id_y + @y_offset.
> + */
> +void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset)
> +{
> + emit_iga64_code(shdr, media_block_write_aip, " \n\
> + // Payload \n\
> +(W) mov (1|M0) r5.0<1>:ud cr0.2:ud \n\
> +#if GEN_VER < 2000 // Media Block Write \n\
> + // X offset of the block in bytes := (thread group id X << ARG(0)) \n\
> +(W) shl (1|M0) r4.0<1>:ud r0.1<0;1,0>:ud 0x2:ud \n\
> + // Y offset of the block in rows := thread group id Y \n\
> +(W) mov (1|M0) r4.1<1>:ud r0.6<0;1,0>:ud \n\
> +(W) add (1|M0) r4.1<1>:ud r4.1<0;1,0>:ud ARG(0):ud \n\
> + // block width [0,63] representing 1 to 64 bytes \n\
> +(W) mov (1|M0) r4.2<1>:ud 0x3:ud \n\
> + // FFTID := FFTID from R0 header \n\
> +(W) mov (1|M0) r4.4<1>:ud r0.5<0;1,0>:ud \n\
> +(W) send.dc1 (16|M0) null r4 src1_null 0 0x40A8000 \n\
> +#else // Typed 2D Block Store \n\
> + // Load r2.0-3 with tg id X << ARG(0) \n\
> +(W) shl (1|M0) r2.0<1>:ud r0.1<0;1,0>:ud 0x2:ud \n\
> + // Load r2.4-7 with tg id Y + ARG(1):ud \n\
> +(W) mov (1|M0) r2.1<1>:ud r0.6<0;1,0>:ud \n\
> +(W) add (1|M0) r2.1<1>:ud r2.1<0;1,0>:ud ARG(0):ud \n\
> + // payload setup \n\
> +(W) mov (16|M0) r4.0<1>:ud 0x0:ud \n\
> + // Store X and Y block start (160:191 and 192:223) \n\
> +(W) mov (2|M0) r4.5<1>:ud r2.0<2;2,1>:ud \n\
> + // Store X and Y block max_size (224:231 and 232:239) \n\
> +(W) mov (1|M0) r4.7<1>:ud 0x3:ud \n\
> +(W) send.tgm (16|M0) null r4 null:0 0 0x64000007 \n\
> +#endif \n\
> + ", y_offset);
> +}
> +
> /**
> * gpgpu_shader__write_dword:
> * @shdr: shader to be modified
> @@ -313,3 +633,73 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> #endif \n\
> ", 2, y_offset, 3, value, value, value, value);
> }
> +
> +/**
> + * gpgpu_shader__end_system_routine:
> + * @shdr: shader to be modified
> + * @breakpoint_suppress: breakpoint suppress flag
> + *
> + * Return from system routine. To prevent infinite jumping to the system
> + * routine on a breakpoint, the @breakpoint_suppress flag has to be set.
> + */
> +void gpgpu_shader__end_system_routine(struct gpgpu_shader *shdr,
> + bool breakpoint_suppress)
> +{
> + /*
> + * set breakpoint suppress bit to avoid an endless loop
> + * when sip was invoked by a breakpoint
> + */
> + if (breakpoint_suppress)
> + emit_iga64_code(shdr, breakpoint_suppress, " \n\
> +(W) or (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x8000:ud \n\
> + ");
> +
> + emit_iga64_code(shdr, end_system_routine, " \n\
> +(W) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
> + // return to an application \n\
> +(W) and (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x7FFFFFFD:ud \n\
> + ", 0x7fffff | (1 << 26)); /* clear all exceptions, except read only bit */
> +}
> +
> +/**
> + * gpgpu_shader__end_system_routine_step_if_eq:
> + * @shdr: shader to be modified
> + * @y_offset: offset within target buffer in rows
> + * @value: expected value for single stepping execution
> + *
> + * Return from system routine. Don't clear the breakpoint exception when @value
> + * is equal to the value stored at @y_offset. This triggers the system routine
> + * after the subsequent instruction, resulting in single stepping execution.
> + */
> +void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
> + uint32_t y_offset,
> + uint32_t value)
> +{
> + emit_iga64_code(shdr, end_system_routine_step_if_eq, " \n\
> +(W) or (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x8000:ud \n\
> +(W) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
> +(W) mov (16|M0) r30.0<1>:ud 0x0:ud \n\
> +#if GEN_VER < 2000 // Media Block Write \n\
> + // Y offset of the block in rows := thread group id Y \n\
> +(W) mov (1|M0) r30.1<1>:ud ARG(1):ud \n\
> + // block width [0,63] representing 1 to 64 bytes, we want dword \n\
> +(W) mov (1|M0) r30.2<1>:ud 0x3:ud \n\
> + // FFTID := FFTID from R0 header \n\
> +(W) mov (1|M0) r30.4<1>:ud r0.5<0;1,0>:ud \n\
> +(W) send.dc1 (16|M0) r31 r30 null 0x0 0x2190000 \n\
> +#else // Typed 2D Block Store \n\
> + // Store X and Y block start (160:191 and 192:223) \n\
> +(W) mov (2|M0) r30.6<1>:ud ARG(1):ud \n\
And here.
Code imo looks correct, with or without the minor fix:
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> + // Store X and Y block size (224:231 and 232:239) \n\
> +(W) mov (1|M0) r30.7<1>:ud 0x3:ud \n\
> +(W) send.tgm (16|M0) r31 r30 null:0 0x0 0x62100003 \n\
> +#endif \n\
> + // clear the flag register \n\
> +(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
> +(W) cmp (1|M0) (ne)f0.0 null<1>:ud r31.0<0;1,0>:ud ARG(2):ud \n\
> +(W&f0.0) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(3):ud \n\
> + // return to an application \n\
> +(W) and (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x7FFFFFFD:ud \n\
> + ", 0x807fffff, /* leave breakpoint exception */
> + y_offset, value, 0x7fffff /* clear all exceptions */ );
> +}
> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
> index 255f93b4d..da4ece983 100644
> --- a/lib/gpgpu_shader.h
> +++ b/lib/gpgpu_shader.h
> @@ -21,6 +21,7 @@ struct gpgpu_shader {
> uint32_t *code;
> uint32_t (*instr)[4];
> };
> + struct igt_map *labels;
> };
>
> struct iga64_template {
> @@ -31,7 +32,7 @@ struct iga64_template {
>
> #pragma GCC diagnostic ignored "-Wnested-externs"
>
> -void
> +uint32_t
> __emit_iga64_code(struct gpgpu_shader *shdr, const struct iga64_template *tpls,
> int argc, uint32_t *argv);
>
> @@ -56,8 +57,32 @@ void gpgpu_shader_exec(struct intel_bb *ibb,
> struct gpgpu_shader *sip,
> uint64_t ring, bool explicit_engine);
>
> +static inline uint32_t gpgpu_shader_last_instr(struct gpgpu_shader *shdr)
> +{
> + return shdr->size / 4 - 1;
> +}
> +
> +void gpgpu_shader__wait(struct gpgpu_shader *shdr);
> +void gpgpu_shader__breakpoint_on(struct gpgpu_shader *shdr, uint32_t cmd_no);
> +void gpgpu_shader__breakpoint(struct gpgpu_shader *shdr);
> +void gpgpu_shader__nop(struct gpgpu_shader *shdr);
> void gpgpu_shader__eot(struct gpgpu_shader *shdr);
> +void gpgpu_shader__common_target_write(struct gpgpu_shader *shdr,
> + uint32_t y_offset, const uint32_t value[4]);
> +void gpgpu_shader__common_target_write_u32(struct gpgpu_shader *shdr,
> + uint32_t y_offset, uint32_t value);
> +void gpgpu_shader__end_system_routine(struct gpgpu_shader *shdr,
> + bool breakpoint_suppress);
> +void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
> + uint32_t dw_offset,
> + uint32_t value);
> +void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset);
> void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> uint32_t y_offset);
> -
> +void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id);
> +void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id);
> +void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
> + uint32_t dw_offset, uint32_t value);
> +void gpgpu_shader__loop_begin(struct gpgpu_shader *shdr, int label_id);
> +void gpgpu_shader__loop_end(struct gpgpu_shader *shdr, int label_id, uint32_t iter);
> #endif /* GPGPU_SHADER_H */
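[Editor's note: the label/jump declarations added above resolve symbolic label ids into relative jump offsets when the shader is built. As a rough host-side illustration of that idea only — the toy_* names and the one-slot-per-instruction model below are invented for this sketch and are not the IGT implementation — a label map plus a final fixup pass might look like:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LABELS 8

/* Toy instruction stream: each "instruction" is a single int32 slot that,
 * for a jump, eventually holds a relative offset to its target label. */
struct toy_shader {
	int32_t code[64];
	uint32_t size;			/* instructions emitted so far */
	int32_t labels[MAX_LABELS];	/* label_id -> instruction index */
	int32_t fixups[64];		/* instr index -> label_id, -1 if none */
};

static void toy_label(struct toy_shader *s, int label_id)
{
	s->labels[label_id] = s->size;	/* label points at next instruction */
}

static void toy_jump(struct toy_shader *s, int label_id)
{
	s->fixups[s->size] = label_id;	/* record the patch site */
	s->code[s->size++] = 0;		/* offset filled in by the fixup pass */
}

static void toy_emit(struct toy_shader *s)
{
	s->fixups[s->size] = -1;	/* ordinary instruction, no fixup */
	s->code[s->size++] = 0;
}

/* Final pass: resolve every recorded jump to a label-relative offset. */
static void toy_resolve(struct toy_shader *s)
{
	for (uint32_t i = 0; i < s->size; i++)
		if (s->fixups[i] >= 0)
			s->code[i] = s->labels[s->fixups[i]] - (int32_t)i;
}
```

A loop built as label / body / jump-back then resolves to a negative offset, which mirrors how gpgpu_shader__loop_begin()/gpgpu_shader__loop_end() pair up via a label id.]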
> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> index ea8d0f097..dd849eebc 100644
> --- a/lib/iga64_generated_codes.c
> +++ b/lib/iga64_generated_codes.c
> @@ -3,7 +3,7 @@
>
> #include "gpgpu_shader.h"
>
> -#define MD5_SUM_IGA64_ASMS 9977ade854d57c5af5c5ca9e93c0f37e
> +#define MD5_SUM_IGA64_ASMS 33b7cd843e3b009c123a85a6c520d7d0
>
> struct iga64_template const iga64_code_gpgpu_fill[] = {
> { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
> @@ -79,6 +79,119 @@ struct iga64_template const iga64_code_gpgpu_fill[] = {
> }}
> };
>
> +struct iga64_template const iga64_code_end_system_routine_step_if_eq[] = {
> + { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
> + 0x80000966, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
> + 0x80100961, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80040061, 0x1e654220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1e754220, 0x00000000, 0x00000003,
> + 0x80132031, 0x1f0c0000, 0xd0061e8c, 0x04000000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80008070, 0x00018220, 0x22001f04, 0xc0ded002,
> + 0x84000965, 0x80118220, 0x02008010, 0xc0ded003,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 52, .code = (const uint32_t []) {
> + 0x80000966, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
> + 0x80040961, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80002070, 0x00018220, 0x22001f04, 0xc0ded002,
> + 0x81000965, 0x80218220, 0x02008020, 0xc0ded003,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 48, .code = (const uint32_t []) {
> + 0x80000966, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
> + 0x80100961, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e154220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1e254220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e450220, 0x00000054, 0x00000000,
> + 0x80132031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80008070, 0x00018220, 0x22001f04, 0xc0ded002,
> + 0x84000965, 0x80118220, 0x02008010, 0xc0ded003,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 52, .code = (const uint32_t []) {
> + 0x80000966, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
> + 0x80040961, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80002070, 0x00018220, 0x22001f04, 0xc0ded002,
> + 0x81000965, 0x80218220, 0x02008020, 0xc0ded003,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 48, .code = (const uint32_t []) {
> + 0x80000166, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000165, 0x80218220, 0x02008020, 0xc0ded000,
> + 0x80040161, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80049031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80002070, 0x00018220, 0x22001f04, 0xc0ded002,
> + 0x81000165, 0x80218220, 0x02008020, 0xc0ded003,
> + 0x80000165, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_end_system_routine[] = {
> + { .gen_ver = 2000, .size = 12, .code = (const uint32_t []) {
> + 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 12, .code = (const uint32_t []) {
> + 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 12, .code = (const uint32_t []) {
> + 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
> + 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
> + 0x80000965, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 12, .code = (const uint32_t []) {
> + 0x80000165, 0x80218220, 0x02008020, 0xc0ded000,
> + 0x80000165, 0x80018220, 0x02008000, 0x7ffffffd,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_breakpoint_suppress[] = {
> + { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
> + 0x80000966, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> + 0x80000166, 0x80018220, 0x02008000, 0x00008000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> struct iga64_template const iga64_code_media_block_write[] = {
> { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> @@ -164,6 +277,270 @@ struct iga64_template const iga64_code_media_block_write[] = {
> }}
> };
>
> +struct iga64_template const iga64_code_media_block_write_aip[] = {
> + { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
> + 0x80000961, 0x05050220, 0x00008020, 0x00000000,
> + 0x80000969, 0x02058220, 0x02000014, 0x00000002,
> + 0x80000061, 0x02150220, 0x00000064, 0x00000000,
> + 0x80001940, 0x02158220, 0x02000214, 0xc0ded000,
> + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> + 0x80041a61, 0x04550220, 0x00220205, 0x00000000,
> + 0x80000061, 0x04754220, 0x00000000, 0x00000003,
> + 0x80132031, 0x00000000, 0xd00e0494, 0x04000000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 44, .code = (const uint32_t []) {
> + 0x80000961, 0x05050220, 0x00008040, 0x00000000,
> + 0x80000969, 0x04058220, 0x02000024, 0x00000002,
> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> + 0x80001940, 0x04258220, 0x02000424, 0xc0ded000,
> + 0x80000061, 0x04454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 40, .code = (const uint32_t []) {
> + 0x80000961, 0x05050220, 0x00008020, 0x00000000,
> + 0x80000969, 0x04058220, 0x02000014, 0x00000002,
> + 0x80000061, 0x04150220, 0x00000064, 0x00000000,
> + 0x80001940, 0x04158220, 0x02000414, 0xc0ded000,
> + 0x80000061, 0x04254220, 0x00000000, 0x00000003,
> + 0x80000061, 0x04450220, 0x00000054, 0x00000000,
> + 0x80132031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 44, .code = (const uint32_t []) {
> + 0x80000961, 0x05050220, 0x00008040, 0x00000000,
> + 0x80000969, 0x04058220, 0x02000024, 0x00000002,
> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> + 0x80001940, 0x04258220, 0x02000424, 0xc0ded000,
> + 0x80000061, 0x04454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 40, .code = (const uint32_t []) {
> + 0x80000161, 0x05050220, 0x00008040, 0x00000000,
> + 0x80000169, 0x04058220, 0x02000024, 0x00000002,
> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> + 0x80000140, 0x04258220, 0x02000424, 0xc0ded000,
> + 0x80000061, 0x04454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> + 0x80049031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_common_target_write[] = {
> + { .gen_ver = 2000, .size = 48, .code = (const uint32_t []) {
> + 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80100061, 0x1f054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1f154220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x1f254220, 0x00000000, 0xc0ded003,
> + 0x80000061, 0x1f354220, 0x00000000, 0xc0ded004,
> + 0x80040061, 0x1e654220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e754220, 0x00000000, 0x0000000f,
> + 0x80132031, 0x00000000, 0xd00e1e94, 0x04000000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 56, .code = (const uint32_t []) {
> + 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80040061, 0x1f054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1f254220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x1f454220, 0x00000000, 0xc0ded003,
> + 0x80000061, 0x1f654220, 0x00000000, 0xc0ded004,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e454220, 0x00000000, 0x0000000f,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x00000000, 0xc0001e14, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 52, .code = (const uint32_t []) {
> + 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80100061, 0x1f054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1f154220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x1f254220, 0x00000000, 0xc0ded003,
> + 0x80000061, 0x1f354220, 0x00000000, 0xc0ded004,
> + 0x80000061, 0x1e154220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e254220, 0x00000000, 0x0000000f,
> + 0x80000061, 0x1e450220, 0x00000054, 0x00000000,
> + 0x80132031, 0x00000000, 0xc0001e14, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 56, .code = (const uint32_t []) {
> + 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80040061, 0x1f054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1f254220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x1f454220, 0x00000000, 0xc0ded003,
> + 0x80000061, 0x1f654220, 0x00000000, 0xc0ded004,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e454220, 0x00000000, 0x0000000f,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x00000000, 0xc0001e14, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 52, .code = (const uint32_t []) {
> + 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80040061, 0x1f054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1f054220, 0x00000000, 0xc0ded001,
> + 0x80000061, 0x1f254220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x1f454220, 0x00000000, 0xc0ded003,
> + 0x80000061, 0x1f654220, 0x00000000, 0xc0ded004,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e454220, 0x00000000, 0x0000000f,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80049031, 0x00000000, 0xc0001e14, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_inc_r40_jump_neq[] = {
> + { .gen_ver = 2000, .size = 20, .code = (const uint32_t []) {
> + 0x80000040, 0x28058220, 0x02002804, 0x00000001,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
> + 0x84000020, 0x00004000, 0x00000000, 0xffffffd0,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 20, .code = (const uint32_t []) {
> + 0x80000040, 0x28058220, 0x02002804, 0x00000001,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
> + 0x81000020, 0x00004000, 0x00000000, 0xffffffd0,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 20, .code = (const uint32_t []) {
> + 0x80000040, 0x28058220, 0x02002804, 0x00000001,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
> + 0x84000020, 0x00004000, 0x00000000, 0xffffffd0,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 20, .code = (const uint32_t []) {
> + 0x80000040, 0x28058220, 0x02002804, 0x00000001,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x22002804, 0xc0ded000,
> + 0x81000020, 0x00004000, 0x00000000, 0xffffffd0,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 20, .code = (const uint32_t []) {
> + 0x80000040, 0x28058220, 0x02002804, 0x00000001,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80000270, 0x00018220, 0x22002804, 0xc0ded000,
> + 0x81000020, 0x00004000, 0x00000000, 0xffffffd0,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_clear_r40[] = {
> + { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
> + 0x80000061, 0x28054220, 0x00000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> + 0x80000061, 0x28054220, 0x00000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_jump_dw_neq[] = {
> + { .gen_ver = 2000, .size = 32, .code = (const uint32_t []) {
> + 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80040061, 0x1e654220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e754220, 0x00000000, 0x00000003,
> + 0x80132031, 0x1f0c0000, 0xd0061e8c, 0x04000000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80008070, 0x00018220, 0x22001f04, 0xc0ded001,
> + 0x84000020, 0x00004000, 0x00000000, 0xffffffa0,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 40, .code = (const uint32_t []) {
> + 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80002070, 0x00018220, 0x22001f04, 0xc0ded001,
> + 0x81000020, 0x00004000, 0x00000000, 0xffffff80,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 36, .code = (const uint32_t []) {
> + 0x80100061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e154220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e254220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e450220, 0x00000054, 0x00000000,
> + 0x80132031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80008070, 0x00018220, 0x22001f04, 0xc0ded001,
> + 0x84000020, 0x00004000, 0x00000000, 0xffffff90,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 40, .code = (const uint32_t []) {
> + 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80001901, 0x00010000, 0x00000000, 0x00000000,
> + 0x80044031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80002070, 0x00018220, 0x22001f04, 0xc0ded001,
> + 0x81000020, 0x00004000, 0x00000000, 0xffffff80,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 36, .code = (const uint32_t []) {
> + 0x80040061, 0x1e054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x1e254220, 0x00000000, 0xc0ded000,
> + 0x80000061, 0x1e454220, 0x00000000, 0x00000003,
> + 0x80000061, 0x1e850220, 0x000000a4, 0x00000000,
> + 0x80049031, 0x1f0c0000, 0xc0001e0c, 0x02400000,
> + 0x80000061, 0x30014220, 0x00000000, 0x00000000,
> + 0x80002070, 0x00018220, 0x22001f04, 0xc0ded001,
> + 0x81000020, 0x00004000, 0x00000000, 0xffffff90,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_jump[] = {
> + { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
> + 0x80000020, 0x00004000, 0x00000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> + 0x80000020, 0x00004000, 0x00000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> struct iga64_template const iga64_code_eot[] = {
> { .gen_ver = 2000, .size = 8, .code = (const uint32_t []) {
> 0x800c0061, 0x70050220, 0x00460005, 0x00000000,
> @@ -188,3 +565,25 @@ struct iga64_template const iga64_code_eot[] = {
> 0x80049031, 0x00000004, 0x7020700c, 0x10000000,
> }}
> };
> +
> +struct iga64_template const iga64_code_nop[] = {
> + { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
> + 0x00000060, 0x00000000, 0x00000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> + 0x00000060, 0x00000000, 0x00000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> +struct iga64_template const iga64_code_sync_host[] = {
> + { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
> + 0x80000001, 0x00010000, 0xf0000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> + 0x80000001, 0x00010000, 0xf0000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> --
> 2.34.1
>
* Re: [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library
2024-09-05 9:27 ` [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library Christoph Manszewski
2024-09-05 11:56 ` Zbigniew Kempczyński
@ 2024-09-09 6:54 ` Zbigniew Kempczyński
1 sibling, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-09 6:54 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:27:58AM +0200, Christoph Manszewski wrote:
<cut>
> +
> +/**
> + * gpgpu_shader__end_system_routine:
> + * @shdr: shader to be modified
> + * @breakpoint_suppress: breakpoint suppress flag
> + *
> + * Return from system routine. To prevent infinite jumping to the system
> + * routine on a breakpoint, @breakpoint_suppress flag has to be set.
> + */
> +void gpgpu_shader__end_system_routine(struct gpgpu_shader *shdr,
> + bool breakpoint_suppress)
> +{
> + /*
> + * set breakpoint suppress bit to avoid an endless loop
> + * when sip was invoked by a breakpoint
> + */
> + if (breakpoint_suppress)
> + emit_iga64_code(shdr, breakpoint_suppress, " \n\
> +(W) or (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x8000:ud \n\
> + ");
> +
> + emit_iga64_code(shdr, end_system_routine, " \n\
> +(W) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
> + // return to an application \n\
> +(W) and (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x7FFFFFFD:ud \n\
I'm reading the documentation and I have doubts regarding the 0x7ffffffd value.
If I understand correctly, bits 0 and 2 are MBZ, so the mask here
should likely be 0x7ffffffa.
> + ", 0x7fffff | (1 << 26)); /* clear all exceptions, except read only bit */
> +}
> +
--
Zbigniew
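[Editor's note: the two cr0.0 operations quoted above can be checked with plain integer arithmetic. In the sketch below the bit positions are read off the masks in the quoted code itself, not from a hardware spec — whether bits 0 and 2 are additionally MBZ, as questioned in this review, is deliberately left open here:

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions inferred from the masks in the quoted shader code:
 * OR with 0x8000 sets bit 15 (breakpoint suppress), AND with 0x7ffffffd
 * clears bits 1 and 31 on the return path. */
#define BREAKPOINT_SUPPRESS	(1u << 15)
#define EXIT_MASK		0x7ffffffdu

static uint32_t suppress_breakpoint(uint32_t cr0_0)
{
	/* Models: or (1|M0) cr0.0 cr0.0 0x8000:ud */
	return cr0_0 | BREAKPOINT_SUPPRESS;
}

static uint32_t return_to_app(uint32_t cr0_0)
{
	/* Models: and (1|M0) cr0.0 cr0.0 0x7FFFFFFD:ud */
	return cr0_0 & EXIT_MASK;
}
```

Note that 0x7ffffffa (the mask proposed in the review) would clear bits 0 and 2 instead of bit 1, so the two masks are not interchangeable.]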
* [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (2 preceding siblings ...)
2024-09-05 9:27 ` [PATCH i-g-t v6 03/17] lib/gpgpu_shader: Extend shader building library Christoph Manszewski
@ 2024-09-05 9:27 ` Christoph Manszewski
2024-09-05 10:51 ` Zbigniew Kempczyński
2024-09-05 9:28 ` [PATCH i-g-t v6 05/17] lib/gpgpu_shader: Add set/clear exception register (cr0.1) helpers Christoph Manszewski
` (15 subsequent siblings)
19 siblings, 1 reply; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:27 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
From: Andrzej Hajda <andrzej.hajda@intel.com>
Writing a specific value to a memory location when an unexpected value appears in
the exception register allows reporting errors from inside a shader or siplet.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
---
lib/gpgpu_shader.c | 53 ++++++++++++++++++++++
lib/gpgpu_shader.h | 2 +
lib/iga64_generated_codes.c | 87 ++++++++++++++++++++++++++++++++++++-
3 files changed, 141 insertions(+), 1 deletion(-)
diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
index dacab51dd..926eccaa0 100644
--- a/lib/gpgpu_shader.c
+++ b/lib/gpgpu_shader.c
@@ -634,6 +634,59 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
", 2, y_offset, 3, value, value, value, value);
}
+/**
+ * gpgpu_shader__write_on_exception:
+ * @shdr: shader to be modified
+ * @value: dword to be written
+ * @y_offset: write target offset within the surface in rows
+ * @mask: mask to be applied on exception register
+ * @expected: expected value of exception register with @mask applied
+ *
+ * Check if bits specified by @mask in exception register(cr0.1) are equal
+ * to provided ones: cr0.1 & @mask == @expected,
+ * if yes fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
+ */
+void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t value,
+ uint32_t y_offset, uint32_t mask, uint32_t expected)
+{
+ emit_iga64_code(shdr, write_on_exception, " \n\
+ // Clear message header \n\
+(W) mov (16|M0) r4.0<1>:ud 0x0:ud \n\
+ // Payload \n\
+(W) mov (1|M0) r5.0<1>:ud ARG(3):ud \n\
+#if GEN_VER < 2000 // prepare Media Block Write \n\
+ // X offset of the block in bytes := (thread group id X << ARG(0)) \n\
+(W) shl (1|M0) r4.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
+ // Y offset of the block in rows := thread group id Y \n\
+(W) mov (1|M0) r4.1<1>:ud r0.6<0;1,0>:ud \n\
+(W) add (1|M0) r4.1<1>:ud r4.1<0;1,0>:ud ARG(1):ud \n\
+ // block width [0,63] representing 1 to 64 bytes \n\
+(W) mov (1|M0) r4.2<1>:ud ARG(2):ud \n\
+ // FFTID := FFTID from R0 header \n\
+(W) mov (1|M0) r4.4<1>:ud r0.5<0;1,0>:ud \n\
+#else // prepare Typed 2D Block Store \n\
+ // Load r2.0-3 with tg id X << ARG(0) \n\
+(W) shl (1|M0) r2.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
+ // Load r2.4-7 with tg id Y + ARG(1):ud \n\
+(W) mov (1|M0) r2.1<1>:ud r0.6<0;1,0>:ud \n\
+(W) add (1|M0) r2.1<1>:ud r2.1<0;1,0>:ud ARG(1):ud \n\
+ // Store X and Y block start (160:191 and 192:223) \n\
+(W) mov (2|M0) r4.5<1>:ud r2.0<2;2,1>:ud \n\
+ // Store X and Y block max_size (224:231 and 232:239) \n\
+(W) mov (1|M0) r4.7<1>:ud ARG(2):ud \n\
+#endif \n\
+ // Check if masked exception is equal to provided value and write conditionally \n\
+(W) and (1|M0) r3.0<1>:ud cr0.1<0;1,0>:ud ARG(4):ud \n\
+(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
+(W) cmp (1|M0) (eq)f0.0 null:ud r3.0<0;1,0>:ud ARG(5):ud \n\
+#if GEN_VER < 2000 // Media Block Write \n\
+(W&f0.0) send.dc1 (16|M0) null r4 src1_null 0 0x40A8000 \n\
+#else // Typed 2D Block Store \n\
+(W&f0.0) send.tgm (16|M0) null r4 null:0 0 0x64000007 \n\
+#endif \n\
+ ", 2, y_offset, 3, value, mask, expected);
+}
+
/**
* gpgpu_shader__end_system_routine:
* @shdr: shader to be modified
diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
index da4ece983..6c6953a1a 100644
--- a/lib/gpgpu_shader.h
+++ b/lib/gpgpu_shader.h
@@ -79,6 +79,8 @@ void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset);
void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
uint32_t y_offset);
+void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t dw,
+ uint32_t y_offset, uint32_t mask, uint32_t value);
void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id);
void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id);
void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
index dd849eebc..0800496c5 100644
--- a/lib/iga64_generated_codes.c
+++ b/lib/iga64_generated_codes.c
@@ -3,7 +3,7 @@
#include "gpgpu_shader.h"
-#define MD5_SUM_IGA64_ASMS 33b7cd843e3b009c123a85a6c520d7d0
+#define MD5_SUM_IGA64_ASMS 716c5b437e2abd2a1768e79182993ff6
struct iga64_template const iga64_code_gpgpu_fill[] = {
{ .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
@@ -192,6 +192,91 @@ struct iga64_template const iga64_code_breakpoint_suppress[] = {
}}
};
+struct iga64_template const iga64_code_write_on_exception[] = {
+ { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
+ 0x80100061, 0x04054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+ 0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
+ 0x80000061, 0x02150220, 0x00000064, 0x00000000,
+ 0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
+ 0x80041961, 0x04550220, 0x00220205, 0x00000000,
+ 0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
+ 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
+ 0x80000961, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
+ 0x84134031, 0x00000000, 0xd00e0494, 0x04000000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 60, .code = (const uint32_t []) {
+ 0x80040061, 0x04054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+ 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
+ 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+ 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
+ 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+ 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
+ 0x80000961, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
+ 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
+ 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 56, .code = (const uint32_t []) {
+ 0x80100061, 0x04054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+ 0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
+ 0x80000061, 0x04150220, 0x00000064, 0x00000000,
+ 0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
+ 0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x04450220, 0x00000054, 0x00000000,
+ 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
+ 0x80000961, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
+ 0x84134031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 60, .code = (const uint32_t []) {
+ 0x80040061, 0x04054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+ 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
+ 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+ 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
+ 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+ 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
+ 0x80000961, 0x30014220, 0x00000000, 0x00000000,
+ 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
+ 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
+ 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 56, .code = (const uint32_t []) {
+ 0x80040061, 0x04054220, 0x00000000, 0x00000000,
+ 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+ 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
+ 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+ 0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
+ 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
+ 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+ 0x80000165, 0x03058220, 0x02008020, 0xc0ded004,
+ 0x80000161, 0x30014220, 0x00000000, 0x00000000,
+ 0x80000270, 0x00018220, 0x12000304, 0xc0ded005,
+ 0x8104a031, 0x00000000, 0xc0000414, 0x02a00000,
+ 0x80000001, 0x00010000, 0x20000000, 0x00000000,
+ 0x80000001, 0x00010000, 0x30000000, 0x00000000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
struct iga64_template const iga64_code_media_block_write[] = {
{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
0x80100061, 0x04054220, 0x00000000, 0x00000000,
--
2.34.1
* Re: [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template
2024-09-05 9:27 ` [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template Christoph Manszewski
@ 2024-09-05 10:51 ` Zbigniew Kempczyński
2024-09-06 5:58 ` Andrzej Hajda
0 siblings, 1 reply; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-05 10:51 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:27:59AM +0200, Christoph Manszewski wrote:
> From: Andrzej Hajda <andrzej.hajda@intel.com>
>
> Writing specific value to memory location on unexpected value in exception
> register allows to report errors from inside shader or siplet.
>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> ---
> lib/gpgpu_shader.c | 53 ++++++++++++++++++++++
> lib/gpgpu_shader.h | 2 +
> lib/iga64_generated_codes.c | 87 ++++++++++++++++++++++++++++++++++++-
> 3 files changed, 141 insertions(+), 1 deletion(-)
>
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> index dacab51dd..926eccaa0 100644
> --- a/lib/gpgpu_shader.c
> +++ b/lib/gpgpu_shader.c
> @@ -634,6 +634,59 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> ", 2, y_offset, 3, value, value, value, value);
> }
>
> +/**
> + * gpgpu_shader__write_on_exception:
> + * @shdr: shader to be modified
> + * @value: dword to be written
> + * @y_offset: write target offset within the surface in rows
> + * @mask: mask to be applied on exception register
> + * @expected: expected value of exception register with @mask applied
> + *
> + * Check if bits specified by @mask in exception register(cr0.1) are equal
> + * to provided ones: cr0.1 & @mask == @expected,
> + * if yes fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
> + */
> +void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t value,
> + uint32_t y_offset, uint32_t mask, uint32_t expected)
> +{
> + emit_iga64_code(shdr, write_on_exception, " \n\
> + // Clear message header \n\
> +(W) mov (16|M0) r4.0<1>:ud 0x0:ud \n\
I got rid of the rest of the instructions and the appropriate xe_exec_sip
tests are still passing. Could you check why this happens?
--
Zbigniew
> + // Payload \n\
> +(W) mov (1|M0) r5.0<1>:ud ARG(3):ud \n\
> +#if GEN_VER < 2000 // prepare Media Block Write \n\
> + // X offset of the block in bytes := (thread group id X << ARG(0)) \n\
> +(W) shl (1|M0) r4.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
> + // Y offset of the block in rows := thread group id Y \n\
> +(W) mov (1|M0) r4.1<1>:ud r0.6<0;1,0>:ud \n\
> +(W) add (1|M0) r4.1<1>:ud r4.1<0;1,0>:ud ARG(1):ud \n\
> + // block width [0,63] representing 1 to 64 bytes \n\
> +(W) mov (1|M0) r4.2<1>:ud ARG(2):ud \n\
> + // FFTID := FFTID from R0 header \n\
> +(W) mov (1|M0) r4.4<1>:ud r0.5<0;1,0>:ud \n\
> +#else // prepare Typed 2D Block Store \n\
> + // Load r2.0-3 with tg id X << ARG(0) \n\
> +(W) shl (1|M0) r2.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
> + // Load r2.4-7 with tg id Y + ARG(1):ud \n\
> +(W) mov (1|M0) r2.1<1>:ud r0.6<0;1,0>:ud \n\
> +(W) add (1|M0) r2.1<1>:ud r2.1<0;1,0>:ud ARG(1):ud \n\
> + // Store X and Y block start (160:191 and 192:223) \n\
> +(W) mov (2|M0) r4.5<1>:ud r2.0<2;2,1>:ud \n\
> + // Store X and Y block max_size (224:231 and 232:239) \n\
> +(W) mov (1|M0) r4.7<1>:ud ARG(2):ud \n\
> +#endif \n\
> + // Check if masked exception is equal to provided value and write conditionally \n\
> +(W) and (1|M0) r3.0<1>:ud cr0.1<0;1,0>:ud ARG(4):ud \n\
> +(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
> +(W) cmp (1|M0) (eq)f0.0 null:ud r3.0<0;1,0>:ud ARG(5):ud \n\
> +#if GEN_VER < 2000 // Media Block Write \n\
> +(W&f0.0) send.dc1 (16|M0) null r4 src1_null 0 0x40A8000 \n\
> +#else // Typed 2D Block Store \n\
> +(W&f0.0) send.tgm (16|M0) null r4 null:0 0 0x64000007 \n\
> +#endif \n\
> + ", 2, y_offset, 3, value, mask, expected);
> +}
> +
> /**
> * gpgpu_shader__end_system_routine:
> * @shdr: shader to be modified
> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
> index da4ece983..6c6953a1a 100644
> --- a/lib/gpgpu_shader.h
> +++ b/lib/gpgpu_shader.h
> @@ -79,6 +79,8 @@ void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
> void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset);
> void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> uint32_t y_offset);
> +void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t dw,
> + uint32_t y_offset, uint32_t mask, uint32_t value);
> void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id);
> void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id);
> void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> index dd849eebc..0800496c5 100644
> --- a/lib/iga64_generated_codes.c
> +++ b/lib/iga64_generated_codes.c
> @@ -3,7 +3,7 @@
>
> #include "gpgpu_shader.h"
>
> -#define MD5_SUM_IGA64_ASMS 33b7cd843e3b009c123a85a6c520d7d0
> +#define MD5_SUM_IGA64_ASMS 716c5b437e2abd2a1768e79182993ff6
>
> struct iga64_template const iga64_code_gpgpu_fill[] = {
> { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
> @@ -192,6 +192,91 @@ struct iga64_template const iga64_code_breakpoint_suppress[] = {
> }}
> };
>
> +struct iga64_template const iga64_code_write_on_exception[] = {
> + { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> + 0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
> + 0x80000061, 0x02150220, 0x00000064, 0x00000000,
> + 0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
> + 0x80041961, 0x04550220, 0x00220205, 0x00000000,
> + 0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
> + 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> + 0x84134031, 0x00000000, 0xd00e0494, 0x04000000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1270, .size = 60, .code = (const uint32_t []) {
> + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> + 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
> + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> + 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> + 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
> + 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1260, .size = 56, .code = (const uint32_t []) {
> + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> + 0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
> + 0x80000061, 0x04150220, 0x00000064, 0x00000000,
> + 0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
> + 0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x04450220, 0x00000054, 0x00000000,
> + 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> + 0x84134031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 1250, .size = 60, .code = (const uint32_t []) {
> + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> + 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
> + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> + 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> + 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
> + 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> + }},
> + { .gen_ver = 0, .size = 56, .code = (const uint32_t []) {
> + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> + 0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
> + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> + 0x80000165, 0x03058220, 0x02008020, 0xc0ded004,
> + 0x80000161, 0x30014220, 0x00000000, 0x00000000,
> + 0x80000270, 0x00018220, 0x12000304, 0xc0ded005,
> + 0x8104a031, 0x00000000, 0xc0000414, 0x02a00000,
> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> + }}
> +};
> +
> struct iga64_template const iga64_code_media_block_write[] = {
> { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> --
> 2.34.1
>
* Re: [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template
2024-09-05 10:51 ` Zbigniew Kempczyński
@ 2024-09-06 5:58 ` Andrzej Hajda
2024-09-06 6:54 ` Zbigniew Kempczyński
0 siblings, 1 reply; 50+ messages in thread
From: Andrzej Hajda @ 2024-09-06 5:58 UTC (permalink / raw)
To: Zbigniew Kempczyński, Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Kolanupaka Naveena,
Mika Kuoppala, Gwan-gyeong Mun
On 05.09.2024 12:51, Zbigniew Kempczyński wrote:
> On Thu, Sep 05, 2024 at 11:27:59AM +0200, Christoph Manszewski wrote:
>> From: Andrzej Hajda <andrzej.hajda@intel.com>
>>
>> Writing specific value to memory location on unexpected value in exception
>> register allows to report errors from inside shader or siplet.
>>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
>> ---
>> lib/gpgpu_shader.c | 53 ++++++++++++++++++++++
>> lib/gpgpu_shader.h | 2 +
>> lib/iga64_generated_codes.c | 87 ++++++++++++++++++++++++++++++++++++-
>> 3 files changed, 141 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
>> index dacab51dd..926eccaa0 100644
>> --- a/lib/gpgpu_shader.c
>> +++ b/lib/gpgpu_shader.c
>> @@ -634,6 +634,59 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
>> ", 2, y_offset, 3, value, value, value, value);
>> }
>>
>> +/**
>> + * gpgpu_shader__write_on_exception:
>> + * @shdr: shader to be modified
>> + * @value: dword to be written
>> + * @y_offset: write target offset within the surface in rows
>> + * @mask: mask to be applied on exception register
>> + * @expected: expected value of exception register with @mask applied
>> + *
>> + * Check if bits specified by @mask in exception register(cr0.1) are equal
>> + * to provided ones: cr0.1 & @mask == @expected,
>> + * if yes fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
>> + */
>> +void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t value,
>> + uint32_t y_offset, uint32_t mask, uint32_t expected)
>> +{
>> + emit_iga64_code(shdr, write_on_exception, " \n\
>> + // Clear message header \n\
>> +(W) mov (16|M0) r4.0<1>:ud 0x0:ud \n\
> I got rid of the rest of the instructions and the appropriate xe_exec_sip
> tests are still passing. Could you check why this happens?
The "rest of instructions" are used to set the error value. If you get rid
of them, the error will not be set, and the test will almost always report
success.
So this is correct behavior.
On the other hand, I agree the block write is quite big and complicated
and varies between gens. So if some day the block stops working as
expected, the test will always report success.
There are multiple ways of hardening it; for example, we could perform
the write on success (as opposed to the current situation, where we
perform the write only on error).
Regards
Andrzej
>
> --
> Zbigniew
>
>> + // Payload \n\
>> +(W) mov (1|M0) r5.0<1>:ud ARG(3):ud \n\
>> +#if GEN_VER < 2000 // prepare Media Block Write \n\
>> + // X offset of the block in bytes := (thread group id X << ARG(0)) \n\
>> +(W) shl (1|M0) r4.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
>> + // Y offset of the block in rows := thread group id Y \n\
>> +(W) mov (1|M0) r4.1<1>:ud r0.6<0;1,0>:ud \n\
>> +(W) add (1|M0) r4.1<1>:ud r4.1<0;1,0>:ud ARG(1):ud \n\
>> + // block width [0,63] representing 1 to 64 bytes \n\
>> +(W) mov (1|M0) r4.2<1>:ud ARG(2):ud \n\
>> + // FFTID := FFTID from R0 header \n\
>> +(W) mov (1|M0) r4.4<1>:ud r0.5<0;1,0>:ud \n\
>> +#else // prepare Typed 2D Block Store \n\
>> + // Load r2.0-3 with tg id X << ARG(0) \n\
>> +(W) shl (1|M0) r2.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
>> + // Load r2.4-7 with tg id Y + ARG(1):ud \n\
>> +(W) mov (1|M0) r2.1<1>:ud r0.6<0;1,0>:ud \n\
>> +(W) add (1|M0) r2.1<1>:ud r2.1<0;1,0>:ud ARG(1):ud \n\
>> + // Store X and Y block start (160:191 and 192:223) \n\
>> +(W) mov (2|M0) r4.5<1>:ud r2.0<2;2,1>:ud \n\
>> + // Store X and Y block max_size (224:231 and 232:239) \n\
>> +(W) mov (1|M0) r4.7<1>:ud ARG(2):ud \n\
>> +#endif \n\
>> + // Check if masked exception is equal to provided value and write conditionally \n\
>> +(W) and (1|M0) r3.0<1>:ud cr0.1<0;1,0>:ud ARG(4):ud \n\
>> +(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
>> +(W) cmp (1|M0) (eq)f0.0 null:ud r3.0<0;1,0>:ud ARG(5):ud \n\
>> +#if GEN_VER < 2000 // Media Block Write \n\
>> +(W&f0.0) send.dc1 (16|M0) null r4 src1_null 0 0x40A8000 \n\
>> +#else // Typed 2D Block Store \n\
>> +(W&f0.0) send.tgm (16|M0) null r4 null:0 0 0x64000007 \n\
>> +#endif \n\
>> + ", 2, y_offset, 3, value, mask, expected);
>> +}
>> +
>> /**
>> * gpgpu_shader__end_system_routine:
>> * @shdr: shader to be modified
>> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
>> index da4ece983..6c6953a1a 100644
>> --- a/lib/gpgpu_shader.h
>> +++ b/lib/gpgpu_shader.h
>> @@ -79,6 +79,8 @@ void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
>> void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset);
>> void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
>> uint32_t y_offset);
>> +void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t dw,
>> + uint32_t y_offset, uint32_t mask, uint32_t value);
>> void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id);
>> void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id);
>> void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
>> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
>> index dd849eebc..0800496c5 100644
>> --- a/lib/iga64_generated_codes.c
>> +++ b/lib/iga64_generated_codes.c
>> @@ -3,7 +3,7 @@
>>
>> #include "gpgpu_shader.h"
>>
>> -#define MD5_SUM_IGA64_ASMS 33b7cd843e3b009c123a85a6c520d7d0
>> +#define MD5_SUM_IGA64_ASMS 716c5b437e2abd2a1768e79182993ff6
>>
>> struct iga64_template const iga64_code_gpgpu_fill[] = {
>> { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
>> @@ -192,6 +192,91 @@ struct iga64_template const iga64_code_breakpoint_suppress[] = {
>> }}
>> };
>>
>> +struct iga64_template const iga64_code_write_on_exception[] = {
>> + { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
>> + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
>> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> + 0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
>> + 0x80000061, 0x02150220, 0x00000064, 0x00000000,
>> + 0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
>> + 0x80041961, 0x04550220, 0x00220205, 0x00000000,
>> + 0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
>> + 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
>> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
>> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
>> + 0x84134031, 0x00000000, 0xd00e0494, 0x04000000,
>> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> + }},
>> + { .gen_ver = 1270, .size = 60, .code = (const uint32_t []) {
>> + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
>> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
>> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
>> + 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
>> + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
>> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
>> + 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
>> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
>> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
>> + 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
>> + 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
>> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> + }},
>> + { .gen_ver = 1260, .size = 56, .code = (const uint32_t []) {
>> + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
>> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> + 0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
>> + 0x80000061, 0x04150220, 0x00000064, 0x00000000,
>> + 0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
>> + 0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
>> + 0x80000061, 0x04450220, 0x00000054, 0x00000000,
>> + 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
>> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
>> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
>> + 0x84134031, 0x00000000, 0xc0000414, 0x02a00000,
>> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> + }},
>> + { .gen_ver = 1250, .size = 60, .code = (const uint32_t []) {
>> + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
>> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
>> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
>> + 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
>> + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
>> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
>> + 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
>> + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
>> + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
>> + 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
>> + 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
>> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> + }},
>> + { .gen_ver = 0, .size = 56, .code = (const uint32_t []) {
>> + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
>> + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
>> + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
>> + 0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
>> + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
>> + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
>> + 0x80000165, 0x03058220, 0x02008020, 0xc0ded004,
>> + 0x80000161, 0x30014220, 0x00000000, 0x00000000,
>> + 0x80000270, 0x00018220, 0x12000304, 0xc0ded005,
>> + 0x8104a031, 0x00000000, 0xc0000414, 0x02a00000,
>> + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
>> + }}
>> +};
>> +
>> struct iga64_template const iga64_code_media_block_write[] = {
>> { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
>> 0x80100061, 0x04054220, 0x00000000, 0x00000000,
>> --
>> 2.34.1
>>
* Re: [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template
2024-09-06 5:58 ` Andrzej Hajda
@ 2024-09-06 6:54 ` Zbigniew Kempczyński
0 siblings, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-06 6:54 UTC (permalink / raw)
To: Andrzej Hajda
Cc: Christoph Manszewski, igt-dev, Kamil Konieczny,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Kolanupaka Naveena,
Mika Kuoppala, Gwan-gyeong Mun
On Fri, Sep 06, 2024 at 07:58:16AM +0200, Andrzej Hajda wrote:
>
>
> On 05.09.2024 12:51, Zbigniew Kempczyński wrote:
> > On Thu, Sep 05, 2024 at 11:27:59AM +0200, Christoph Manszewski wrote:
> > > From: Andrzej Hajda <andrzej.hajda@intel.com>
> > >
> > > Writing specific value to memory location on unexpected value in exception
> > > register allows to report errors from inside shader or siplet.
> > >
> > > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> > > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > > ---
> > > lib/gpgpu_shader.c | 53 ++++++++++++++++++++++
> > > lib/gpgpu_shader.h | 2 +
> > > lib/iga64_generated_codes.c | 87 ++++++++++++++++++++++++++++++++++++-
> > > 3 files changed, 141 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> > > index dacab51dd..926eccaa0 100644
> > > --- a/lib/gpgpu_shader.c
> > > +++ b/lib/gpgpu_shader.c
> > > @@ -634,6 +634,59 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> > > ", 2, y_offset, 3, value, value, value, value);
> > > }
> > > +/**
> > > + * gpgpu_shader__write_on_exception:
> > > + * @shdr: shader to be modified
> > > + * @value: dword to be written
> > > + * @y_offset: write target offset within the surface in rows
> > > + * @mask: mask to be applied on exception register
> > > + * @expected: expected value of exception register with @mask applied
> > > + *
> > > + * Check if bits specified by @mask in exception register(cr0.1) are equal
> > > + * to provided ones: cr0.1 & @mask == @expected,
> > > + * if yes fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
> > > + */
> > > +void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t value,
> > > + uint32_t y_offset, uint32_t mask, uint32_t expected)
> > > +{
> > > + emit_iga64_code(shdr, write_on_exception, " \n\
> > > + // Clear message header \n\
> > > +(W) mov (16|M0) r4.0<1>:ud 0x0:ud \n\
> > I got rid of the rest of the instructions and the appropriate xe_exec_sip
> > tests are still passing. Could you check why this happens?
>
> The "rest of instructions" are used to set the error value. If you get rid of
> them, the error will not be set, and the test will almost always report
> success.
According to xe_exec_sip, which is currently the only user of this shader,
the error which is set should be checked for the expected value (error
code). Unfortunately this memory location is overwritten immediately when
an invalid instruction triggers entering sip. I understand entering sip is
explicit proof that execution of an invalid instruction happened, but it
doesn't prove that error code setting is working properly. No one reads it
after it is set, which means that if there's something wrong in the hardware
and the compare inside the write_on_exception() shader doesn't trigger the
block write, we're not aware of that.
> So this is correct behavior.
The code is correct, but I have no proof of how it behaves. As the test
logic overwrites the error code, I can't say it previously contained the
error code we expect.
> On the other hand, I agree the block write is quite big and complicated and
> varies between gens. So if some day the block stops working as expected,
> the test will always report success.
Agreed, we'll need to change the verification logic a bit to catch the
situation when error setting doesn't work. Not a blocker for the
moment, but this deserves to be addressed.
> There are multiple ways of hardening it, for example, we can perform write
> on success (as opposed to the current situation, when we perform write only
> on error).
Either way, we have to be sure the shader is writing the error code to the
place we want.
Thanks for the offline discussion yesterday.
So at the moment I can give my r-b here (code is correct), but we just
don't catch it in the test.
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
>
> Regards
> Andrzej
>
> >
> > --
> > Zbigniew
> >
> > > + // Payload \n\
> > > +(W) mov (1|M0) r5.0<1>:ud ARG(3):ud \n\
> > > +#if GEN_VER < 2000 // prepare Media Block Write \n\
> > > + // X offset of the block in bytes := (thread group id X << ARG(0)) \n\
> > > +(W) shl (1|M0) r4.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
> > > + // Y offset of the block in rows := thread group id Y \n\
> > > +(W) mov (1|M0) r4.1<1>:ud r0.6<0;1,0>:ud \n\
> > > +(W) add (1|M0) r4.1<1>:ud r4.1<0;1,0>:ud ARG(1):ud \n\
> > > + // block width [0,63] representing 1 to 64 bytes \n\
> > > +(W) mov (1|M0) r4.2<1>:ud ARG(2):ud \n\
> > > + // FFTID := FFTID from R0 header \n\
> > > +(W) mov (1|M0) r4.4<1>:ud r0.5<0;1,0>:ud \n\
> > > +#else // prepare Typed 2D Block Store \n\
> > > + // Load r2.0-3 with tg id X << ARG(0) \n\
> > > +(W) shl (1|M0) r2.0<1>:ud r0.1<0;1,0>:ud ARG(0):ud \n\
> > > + // Load r2.4-7 with tg id Y + ARG(1):ud \n\
> > > +(W) mov (1|M0) r2.1<1>:ud r0.6<0;1,0>:ud \n\
> > > +(W) add (1|M0) r2.1<1>:ud r2.1<0;1,0>:ud ARG(1):ud \n\
> > > + // Store X and Y block start (160:191 and 192:223) \n\
> > > +(W) mov (2|M0) r4.5<1>:ud r2.0<2;2,1>:ud \n\
> > > + // Store X and Y block max_size (224:231 and 232:239) \n\
> > > +(W) mov (1|M0) r4.7<1>:ud ARG(2):ud \n\
> > > +#endif \n\
> > > + // Check if masked exception is equal to provided value and write conditionally \n\
> > > +(W) and (1|M0) r3.0<1>:ud cr0.1<0;1,0>:ud ARG(4):ud \n\
> > > +(W) mov (1|M0) f0.0<1>:ud 0x0:ud \n\
> > > +(W) cmp (1|M0) (eq)f0.0 null:ud r3.0<0;1,0>:ud ARG(5):ud \n\
> > > +#if GEN_VER < 2000 // Media Block Write \n\
> > > +(W&f0.0) send.dc1 (16|M0) null r4 src1_null 0 0x40A8000 \n\
> > > +#else // Typed 2D Block Store \n\
> > > +(W&f0.0) send.tgm (16|M0) null r4 null:0 0 0x64000007 \n\
> > > +#endif \n\
> > > + ", 2, y_offset, 3, value, mask, expected);
> > > +}
> > > +
> > > /**
> > > * gpgpu_shader__end_system_routine:
> > > * @shdr: shader to be modified
> > > diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
> > > index da4ece983..6c6953a1a 100644
> > > --- a/lib/gpgpu_shader.h
> > > +++ b/lib/gpgpu_shader.h
> > > @@ -79,6 +79,8 @@ void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
> > > void gpgpu_shader__write_aip(struct gpgpu_shader *shdr, uint32_t y_offset);
> > > void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> > > uint32_t y_offset);
> > > +void gpgpu_shader__write_on_exception(struct gpgpu_shader *shdr, uint32_t dw,
> > > + uint32_t y_offset, uint32_t mask, uint32_t value);
> > > void gpgpu_shader__label(struct gpgpu_shader *shdr, int label_id);
> > > void gpgpu_shader__jump(struct gpgpu_shader *shdr, int label_id);
> > > void gpgpu_shader__jump_neq(struct gpgpu_shader *shdr, int label_id,
> > > diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> > > index dd849eebc..0800496c5 100644
> > > --- a/lib/iga64_generated_codes.c
> > > +++ b/lib/iga64_generated_codes.c
> > > @@ -3,7 +3,7 @@
> > > #include "gpgpu_shader.h"
> > > -#define MD5_SUM_IGA64_ASMS 33b7cd843e3b009c123a85a6c520d7d0
> > > +#define MD5_SUM_IGA64_ASMS 716c5b437e2abd2a1768e79182993ff6
> > > struct iga64_template const iga64_code_gpgpu_fill[] = {
> > > { .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
> > > @@ -192,6 +192,91 @@ struct iga64_template const iga64_code_breakpoint_suppress[] = {
> > > }}
> > > };
> > > +struct iga64_template const iga64_code_write_on_exception[] = {
> > > + { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> > > + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> > > + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> > > + 0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
> > > + 0x80000061, 0x02150220, 0x00000064, 0x00000000,
> > > + 0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
> > > + 0x80041961, 0x04550220, 0x00220205, 0x00000000,
> > > + 0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
> > > + 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
> > > + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> > > + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> > > + 0x84134031, 0x00000000, 0xd00e0494, 0x04000000,
> > > + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> > > + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> > > + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> > > + }},
> > > + { .gen_ver = 1270, .size = 60, .code = (const uint32_t []) {
> > > + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
> > > + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> > > + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> > > + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> > > + 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
> > > + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> > > + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> > > + 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
> > > + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> > > + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> > > + 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
> > > + 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
> > > + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> > > + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> > > + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> > > + }},
> > > + { .gen_ver = 1260, .size = 56, .code = (const uint32_t []) {
> > > + 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> > > + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> > > + 0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
> > > + 0x80000061, 0x04150220, 0x00000064, 0x00000000,
> > > + 0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
> > > + 0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
> > > + 0x80000061, 0x04450220, 0x00000054, 0x00000000,
> > > + 0x80000965, 0x03058220, 0x02008010, 0xc0ded004,
> > > + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> > > + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> > > + 0x84134031, 0x00000000, 0xc0000414, 0x02a00000,
> > > + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> > > + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> > > + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> > > + }},
> > > + { .gen_ver = 1250, .size = 60, .code = (const uint32_t []) {
> > > + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
> > > + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> > > + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> > > + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> > > + 0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
> > > + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> > > + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> > > + 0x80000965, 0x03058220, 0x02008020, 0xc0ded004,
> > > + 0x80000961, 0x30014220, 0x00000000, 0x00000000,
> > > + 0x80001a70, 0x00018220, 0x12000304, 0xc0ded005,
> > > + 0x80001a01, 0x00010000, 0x00000000, 0x00000000,
> > > + 0x81044031, 0x00000000, 0xc0000414, 0x02a00000,
> > > + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> > > + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> > > + 0x80000901, 0x00010000, 0x00000000, 0x00000000,
> > > + }},
> > > + { .gen_ver = 0, .size = 56, .code = (const uint32_t []) {
> > > + 0x80040061, 0x04054220, 0x00000000, 0x00000000,
> > > + 0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> > > + 0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> > > + 0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> > > + 0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
> > > + 0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> > > + 0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> > > + 0x80000165, 0x03058220, 0x02008020, 0xc0ded004,
> > > + 0x80000161, 0x30014220, 0x00000000, 0x00000000,
> > > + 0x80000270, 0x00018220, 0x12000304, 0xc0ded005,
> > > + 0x8104a031, 0x00000000, 0xc0000414, 0x02a00000,
> > > + 0x80000001, 0x00010000, 0x20000000, 0x00000000,
> > > + 0x80000001, 0x00010000, 0x30000000, 0x00000000,
> > > + 0x80000101, 0x00010000, 0x00000000, 0x00000000,
> > > + }}
> > > +};
> > > +
> > > struct iga64_template const iga64_code_media_block_write[] = {
> > > { .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> > > 0x80100061, 0x04054220, 0x00000000, 0x00000000,
> > > --
> > > 2.34.1
> > >
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 05/17] lib/gpgpu_shader: Add set/clear exception register (cr0.1) helpers
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (3 preceding siblings ...)
2024-09-05 9:27 ` [PATCH i-g-t v6 04/17] lib/gpgpu_shader: Add write_on_exception template Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 9:28 ` [PATCH i-g-t v6 06/17] lib/intel_batchbuffer: Add helper to get pointer at specified offset Christoph Manszewski
` (14 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
From: Andrzej Hajda <andrzej.hajda@intel.com>
To allow enabling and handling exceptions from the shader and the SIP,
proper helpers have to be provided.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
lib/gpgpu_shader.c | 28 ++++++++++++++++++++++
lib/gpgpu_shader.h | 2 ++
lib/iga64_generated_codes.c | 48 ++++++++++++++++++++++++++++++++++++-
3 files changed, 77 insertions(+), 1 deletion(-)
diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
index 926eccaa0..9284ad5ea 100644
--- a/lib/gpgpu_shader.c
+++ b/lib/gpgpu_shader.c
@@ -634,6 +634,34 @@ void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
", 2, y_offset, 3, value, value, value, value);
}
+/**
+ * gpgpu_shader__clear_exception:
+ * @shdr: shader to be modified
+ * @value: exception bits to be cleared
+ *
+ * Clear provided bits in exception register: cr0.1 &= ~value.
+ */
+void gpgpu_shader__clear_exception(struct gpgpu_shader *shdr, uint32_t value)
+{
+ emit_iga64_code(shdr, clear_exception, " \n\
+(W) and (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
+ ", ~value);
+}
+
+/**
+ * gpgpu_shader__set_exception:
+ * @shdr: shader to be modified
+ * @value: exception bits to be set
+ *
+ * Set provided bits in exception register: cr0.1 |= value.
+ */
+void gpgpu_shader__set_exception(struct gpgpu_shader *shdr, uint32_t value)
+{
+ emit_iga64_code(shdr, set_exception, " \n\
+(W) or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud \n\
+ ", value);
+}
+
/**
* gpgpu_shader__write_on_exception:
* @shdr: shader to be modified
diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
index 6c6953a1a..b722b9e50 100644
--- a/lib/gpgpu_shader.h
+++ b/lib/gpgpu_shader.h
@@ -71,6 +71,8 @@ void gpgpu_shader__common_target_write(struct gpgpu_shader *shdr,
uint32_t y_offset, const uint32_t value[4]);
void gpgpu_shader__common_target_write_u32(struct gpgpu_shader *shdr,
uint32_t y_offset, uint32_t value);
+void gpgpu_shader__clear_exception(struct gpgpu_shader *shdr, uint32_t value);
+void gpgpu_shader__set_exception(struct gpgpu_shader *shdr, uint32_t value);
void gpgpu_shader__end_system_routine(struct gpgpu_shader *shdr,
bool breakpoint_suppress);
void gpgpu_shader__end_system_routine_step_if_eq(struct gpgpu_shader *shdr,
diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
index 0800496c5..e1c3adf80 100644
--- a/lib/iga64_generated_codes.c
+++ b/lib/iga64_generated_codes.c
@@ -3,7 +3,7 @@
#include "gpgpu_shader.h"
-#define MD5_SUM_IGA64_ASMS 716c5b437e2abd2a1768e79182993ff6
+#define MD5_SUM_IGA64_ASMS 75f01a0931a6c846c506d943aab8f727
struct iga64_template const iga64_code_gpgpu_fill[] = {
{ .gen_ver = 2000, .size = 44, .code = (const uint32_t []) {
@@ -277,6 +277,52 @@ struct iga64_template const iga64_code_write_on_exception[] = {
}}
};
+struct iga64_template const iga64_code_set_exception[] = {
+ { .gen_ver = 2000, .size = 8, .code = (const uint32_t []) {
+ 0x80000966, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 8, .code = (const uint32_t []) {
+ 0x80000966, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 8, .code = (const uint32_t []) {
+ 0x80000966, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x80000966, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x80000166, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
+struct iga64_template const iga64_code_clear_exception[] = {
+ { .gen_ver = 2000, .size = 8, .code = (const uint32_t []) {
+ 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1270, .size = 8, .code = (const uint32_t []) {
+ 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1260, .size = 8, .code = (const uint32_t []) {
+ 0x80000965, 0x80118220, 0x02008010, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 1250, .size = 8, .code = (const uint32_t []) {
+ 0x80000965, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000901, 0x00010000, 0x00000000, 0x00000000,
+ }},
+ { .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+ 0x80000165, 0x80218220, 0x02008020, 0xc0ded000,
+ 0x80000101, 0x00010000, 0x00000000, 0x00000000,
+ }}
+};
+
struct iga64_template const iga64_code_media_block_write[] = {
{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
0x80100061, 0x04054220, 0x00000000, 0x00000000,
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 06/17] lib/intel_batchbuffer: Add helper to get pointer at specified offset
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (4 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 05/17] lib/gpgpu_shader: Add set/clear exception register (cr0.1) helpers Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-06 7:46 ` Zbigniew Kempczyński
2024-09-05 9:28 ` [PATCH i-g-t v6 07/17] lib/gpgpu_shader: Allow enabling illegal opcode exceptions in shader Christoph Manszewski
` (13 subsequent siblings)
19 siblings, 1 reply; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
From: Andrzej Hajda <andrzej.hajda@intel.com>
The helper will be used to access data placed in the batchbuffer.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Acked-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
---
lib/intel_batchbuffer.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
index cb32206e5..9e3430e2a 100644
--- a/lib/intel_batchbuffer.h
+++ b/lib/intel_batchbuffer.h
@@ -353,6 +353,13 @@ static inline uint32_t intel_bb_offset(struct intel_bb *ibb)
return (uint32_t) ((uint8_t *) ibb->ptr - (uint8_t *) ibb->batch);
}
+static inline void *intel_bb_ptr_get(struct intel_bb *ibb, uint32_t offset)
+{
+ igt_assert(offset < ibb->size);
+
+ return ((uint8_t *) ibb->batch + offset);
+}
+
static inline void intel_bb_ptr_set(struct intel_bb *ibb, uint32_t offset)
{
ibb->ptr = (void *) ((uint8_t *) ibb->batch + offset);
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH i-g-t v6 06/17] lib/intel_batchbuffer: Add helper to get pointer at specified offset
2024-09-05 9:28 ` [PATCH i-g-t v6 06/17] lib/intel_batchbuffer: Add helper to get pointer at specified offset Christoph Manszewski
@ 2024-09-06 7:46 ` Zbigniew Kempczyński
0 siblings, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-06 7:46 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:28:01AM +0200, Christoph Manszewski wrote:
> From: Andrzej Hajda <andrzej.hajda@intel.com>
>
> The helper will be used to access data placed in batchbuffer.
>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Acked-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> ---
> lib/intel_batchbuffer.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
> index cb32206e5..9e3430e2a 100644
> --- a/lib/intel_batchbuffer.h
> +++ b/lib/intel_batchbuffer.h
> @@ -353,6 +353,13 @@ static inline uint32_t intel_bb_offset(struct intel_bb *ibb)
> return (uint32_t) ((uint8_t *) ibb->ptr - (uint8_t *) ibb->batch);
> }
>
> +static inline void *intel_bb_ptr_get(struct intel_bb *ibb, uint32_t offset)
> +{
> + igt_assert(offset < ibb->size);
> +
> + return ((uint8_t *) ibb->batch + offset);
> +}
> +
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> static inline void intel_bb_ptr_set(struct intel_bb *ibb, uint32_t offset)
> {
> ibb->ptr = (void *) ((uint8_t *) ibb->batch + offset);
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 07/17] lib/gpgpu_shader: Allow enabling illegal opcode exceptions in shader
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (5 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 06/17] lib/intel_batchbuffer: Add helper to get pointer at specified offset Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 9:28 ` [PATCH i-g-t v6 08/17] tests/xe_exec_sip: Add sanity-after-timeout test Christoph Manszewski
` (12 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
From: Andrzej Hajda <andrzej.hajda@intel.com>
Illegal opcode exceptions can be enabled in the interface descriptor data
passed to the COMPUTE_WALKER instruction.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
lib/gpgpu_shader.c | 4 ++++
lib/gpgpu_shader.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
index 9284ad5ea..d4dd118d6 100644
--- a/lib/gpgpu_shader.c
+++ b/lib/gpgpu_shader.c
@@ -103,6 +103,7 @@ __xelp_gpgpu_execfunc(struct intel_bb *ibb,
struct gpgpu_shader *sip,
uint64_t ring, bool explicit_engine)
{
+ struct gen8_interface_descriptor_data *idd;
uint32_t interface_descriptor, sip_offset;
uint64_t engine;
@@ -113,6 +114,8 @@ __xelp_gpgpu_execfunc(struct intel_bb *ibb,
interface_descriptor = gen8_fill_interface_descriptor(ibb, target,
shdr->instr,
4 * shdr->size);
+ idd = intel_bb_ptr_get(ibb, interface_descriptor);
+ idd->desc2.illegal_opcode_exception_enable = shdr->illegal_opcode_exception_enable;
if (sip && sip->size)
sip_offset = fill_sip(ibb, sip->instr, 4 * sip->size);
@@ -163,6 +166,7 @@ __xehp_gpgpu_execfunc(struct intel_bb *ibb,
xehp_fill_interface_descriptor(ibb, target, shdr->instr,
4 * shdr->size, &idd);
+ idd.desc2.illegal_opcode_exception_enable = shdr->illegal_opcode_exception_enable;
if (sip && sip->size)
sip_offset = fill_sip(ibb, sip->instr, 4 * sip->size);
diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
index b722b9e50..53fe2869e 100644
--- a/lib/gpgpu_shader.h
+++ b/lib/gpgpu_shader.h
@@ -22,6 +22,7 @@ struct gpgpu_shader {
uint32_t (*instr)[4];
};
struct igt_map *labels;
+ bool illegal_opcode_exception_enable;
};
struct iga64_template {
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 08/17] tests/xe_exec_sip: Add sanity-after-timeout test
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (6 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 07/17] lib/gpgpu_shader: Allow enabling illegal opcode exceptions in shader Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 9:28 ` [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests Christoph Manszewski
` (11 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
Add a subtest that checks whether we are able to submit workloads after the
GPU was reset due to a hung job.
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
tests/intel/xe_exec_sip.c | 42 ++++++++++++++++++++++++++++++++-------
1 file changed, 35 insertions(+), 7 deletions(-)
diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
index ea1770cd6..564c899f8 100644
--- a/tests/intel/xe_exec_sip.c
+++ b/tests/intel/xe_exec_sip.c
@@ -31,6 +31,11 @@
#define SHADER_CANARY 0x01010101
+enum shader_type {
+ SHADER_HANG,
+ SHADER_WRITE,
+};
+
static struct intel_buf *
create_fill_buf(int fd, int width, int height, uint8_t color)
{
@@ -50,21 +55,32 @@ create_fill_buf(int fd, int width, int height, uint8_t color)
return buf;
}
-static struct gpgpu_shader *get_shader(int fd)
+static struct gpgpu_shader *get_shader(int fd, enum shader_type shader_type)
{
static struct gpgpu_shader *shader;
shader = gpgpu_shader_create(fd);
gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
+
+ switch (shader_type) {
+ case SHADER_HANG:
+ gpgpu_shader__label(shader, 0);
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__jump(shader, 0);
+ break;
+ case SHADER_WRITE:
+ break;
+ }
+
gpgpu_shader__eot(shader);
return shader;
}
-static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, unsigned int threads,
- unsigned int width, unsigned int height)
+static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, enum shader_type shader_type,
+ unsigned int threads, unsigned int width, unsigned int height)
{
struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
- struct gpgpu_shader *shader = get_shader(fd);
+ struct gpgpu_shader *shader = get_shader(fd, shader_type);
gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
gpgpu_shader_destroy(shader);
@@ -125,8 +141,11 @@ xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
* SUBTEST: sanity
* Description: check basic shader with write operation
*
+ * SUBTEST: sanity-after-timeout
+ * Description: check basic shader execution after job timeout
*/
-static void test_sip(struct drm_xe_engine_class_instance *eci, uint32_t flags)
+static void test_sip(enum shader_type shader_type, struct drm_xe_engine_class_instance *eci,
+ uint32_t flags)
{
unsigned int threads = 512;
unsigned int height = max_t(threads, HEIGHT, threads * 2);
@@ -153,7 +172,7 @@ static void test_sip(struct drm_xe_engine_class_instance *eci, uint32_t flags)
ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
igt_nsec_elapsed(&ts);
- handle = gpgpu_shader(fd, ibb, threads, width, height);
+ handle = gpgpu_shader(fd, ibb, shader_type, threads, width, height);
intel_bb_sync(ibb);
igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
@@ -186,7 +205,16 @@ igt_main
fd = drm_open_driver(DRIVER_XE);
test_render_and_compute("sanity", fd, eci)
- test_sip(eci, 0);
+ test_sip(SHADER_WRITE, eci, 0);
+
+ test_render_and_compute("sanity-after-timeout", fd, eci) {
+ test_sip(SHADER_HANG, eci, 0);
+
+ xe_for_each_engine(fd, eci)
+ if (eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER ||
+ eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE)
+ test_sip(SHADER_WRITE, eci, 0);
+ }
igt_fixture
drm_close_driver(fd);
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (7 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 08/17] tests/xe_exec_sip: Add sanity-after-timeout test Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 18:39 ` Zbigniew Kempczyński
2024-09-09 7:21 ` Zbigniew Kempczyński
2024-09-05 9:28 ` [PATCH i-g-t v6 10/17] drm-uapi/xe: Sync with eudebug uapi Christoph Manszewski
` (10 subsequent siblings)
19 siblings, 2 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
From: Andrzej Hajda <andrzej.hajda@intel.com>
Xe2 and earlier gens are able to handle only a very limited set of invalid
instructions - illegal and undefined opcodes; other errors in an
instruction can cause undefined behavior.
An illegal/undefined opcode results in:
- setting the illegal opcode status bit - cr0.1[28],
- calling SIP if the illegal opcode enable bit is set - cr0.1[12].
cr0.1[12] can be enabled directly from the thread or by the thread dispatcher
from the Interface Descriptor Data provided to the COMPUTE_WALKER instruction.
Implemented cases:
- check that SIP is not called when the exception is not enabled,
- check that SIP is called when the exception is enabled from the EU thread,
- check that SIP is called when the exception is enabled from COMPUTE_WALKER.
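A standalone model of the exception flow described above, for illustration
only - the bit positions match the test's defines, but the function is a
made-up sketch, not driver or hardware code:

```c
#include <stdint.h>

/* cr0.1 bits, as in the test below */
#define ILLEGAL_OPCODE_ENABLE (1u << 12)
#define ILLEGAL_OPCODE_STATUS (1u << 28)

/* On an illegal/undefined opcode the EU sets the status bit
 * unconditionally; SIP is invoked only when the enable bit is also set.
 * Returns nonzero if SIP would be called. */
static int illegal_opcode_calls_sip(uint32_t *cr0_1)
{
	*cr0_1 |= ILLEGAL_OPCODE_STATUS;

	return !!(*cr0_1 & ILLEGAL_OPCODE_ENABLE);
}
```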
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
---
tests/intel/xe_exec_sip.c | 124 ++++++++++++++++++++++++++++++++++----
1 file changed, 113 insertions(+), 11 deletions(-)
diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
index 564c899f8..ee4787d61 100644
--- a/tests/intel/xe_exec_sip.c
+++ b/tests/intel/xe_exec_sip.c
@@ -30,12 +30,25 @@
#define COLOR_C4 0xc4
#define SHADER_CANARY 0x01010101
+#define SIP_CANARY 0x02020202
enum shader_type {
SHADER_HANG,
+ SHADER_INV_INSTR_DISABLED,
+ SHADER_INV_INSTR_THREAD_ENABLED,
+ SHADER_INV_INSTR_WALKER_ENABLED,
SHADER_WRITE,
};
+enum sip_type {
+ SIP_INV_INSTR,
+ SIP_NULL,
+};
+
+/* Control Register cr0.1 bits for exception handling */
+#define ILLEGAL_OPCODE_ENABLE BIT(12)
+#define ILLEGAL_OPCODE_STATUS BIT(28)
+
static struct intel_buf *
create_fill_buf(int fd, int width, int height, uint8_t color)
{
@@ -58,8 +71,12 @@ create_fill_buf(int fd, int width, int height, uint8_t color)
static struct gpgpu_shader *get_shader(int fd, enum shader_type shader_type)
{
static struct gpgpu_shader *shader;
+ uint32_t bad;
shader = gpgpu_shader_create(fd);
+ if (shader_type == SHADER_INV_INSTR_WALKER_ENABLED)
+ shader->illegal_opcode_exception_enable = true;
+
gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
switch (shader_type) {
@@ -70,19 +87,63 @@ static struct gpgpu_shader *get_shader(int fd, enum shader_type shader_type)
break;
case SHADER_WRITE:
break;
+ case SHADER_INV_INSTR_THREAD_ENABLED:
+ gpgpu_shader__set_exception(shader, ILLEGAL_OPCODE_ENABLE);
+ __attribute__ ((fallthrough));
+ case SHADER_INV_INSTR_DISABLED:
+ case SHADER_INV_INSTR_WALKER_ENABLED:
+ bad = (shader_type == SHADER_INV_INSTR_DISABLED) ? ILLEGAL_OPCODE_ENABLE : 0;
+ gpgpu_shader__write_on_exception(shader, 1, 0, ILLEGAL_OPCODE_ENABLE, bad);
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__nop(shader);
+ /* modify second nop, set only opcode bits[6:0] */
+ shader->instr[gpgpu_shader_last_instr(shader)][0] = 0x7f;
+ /* SIP should clear exception bit */
+ bad = ILLEGAL_OPCODE_STATUS;
+ gpgpu_shader__write_on_exception(shader, 2, 0, ILLEGAL_OPCODE_STATUS, bad);
+ break;
}
gpgpu_shader__eot(shader);
return shader;
}
+static struct gpgpu_shader *get_sip(int fd, enum sip_type sip_type, unsigned int y_offset)
+{
+ static struct gpgpu_shader *sip;
+
+ if (sip_type == SIP_NULL)
+ return NULL;
+
+ sip = gpgpu_shader_create(fd);
+ gpgpu_shader__write_dword(sip, SIP_CANARY, y_offset);
+
+ switch (sip_type) {
+ case SIP_INV_INSTR:
+ gpgpu_shader__write_on_exception(sip, 1, y_offset, ILLEGAL_OPCODE_STATUS, 0);
+ break;
+ default:
+ break;
+ }
+
+ gpgpu_shader__end_system_routine(sip, false);
+
+ return sip;
+}
+
static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, enum shader_type shader_type,
- unsigned int threads, unsigned int width, unsigned int height)
+ enum sip_type sip_type, unsigned int threads, unsigned int width,
+ unsigned int height)
{
struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
+ struct gpgpu_shader *sip = get_sip(fd, sip_type, height / 2);
struct gpgpu_shader *shader = get_shader(fd, shader_type);
- gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
+ gpgpu_shader_exec(ibb, buf, 1, threads, shader, sip, 0, 0);
+
+ if (sip)
+ gpgpu_shader_destroy(sip);
+
gpgpu_shader_destroy(shader);
return buf->handle;
}
@@ -98,10 +159,10 @@ static void check_fill_buf(uint8_t *ptr, const int width, const int x,
}
static void check_buf(int fd, uint32_t handle, int width, int height,
- uint8_t poison_c)
+ enum shader_type shader_type, enum sip_type sip_type, uint8_t poison_c)
{
unsigned int sz = ALIGN(width * height, 4096);
- int thread_count = 0;
+ int thread_count = 0, sip_count = 0;
uint32_t *ptr;
int i, j;
@@ -119,7 +180,27 @@ static void check_buf(int fd, uint32_t handle, int width, int height,
i = 0;
}
+ for (i = 0, j = height / 2; j < height; ++j) {
+ if (ptr[j * width / 4] == SIP_CANARY) {
+ ++sip_count;
+ i = 4;
+ }
+
+ for (; i < width; i++)
+ check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
+
+ i = 0;
+ }
+
igt_assert(thread_count);
+ if (shader_type == SHADER_INV_INSTR_DISABLED)
+ igt_assert(!sip_count);
+ else if (sip_type == SIP_INV_INSTR && shader_type != SHADER_INV_INSTR_DISABLED)
+ igt_assert_f(thread_count == sip_count,
+ "Thread and SIP count mismatch, %d != %d\n",
+ thread_count, sip_count);
+ else
+ igt_assert(sip_count == 0);
munmap(ptr, sz);
}
@@ -143,9 +224,21 @@ xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
*
* SUBTEST: sanity-after-timeout
* Description: check basic shader execution after job timeout
+ *
+ * SUBTEST: invalidinstr-disabled
+ * Description: Verify that we don't enter SIP after running into an invalid
+ * instruction when exception is not enabled.
+ *
+ * SUBTEST: invalidinstr-thread-enabled
+ * Description: Verify that we enter SIP after running into an invalid instruction
+ * when exception is enabled from thread.
+ *
+ * SUBTEST: invalidinstr-walker-enabled
+ * Description: Verify that we enter SIP after running into an invalid instruction
+ * when exception is enabled from COMPUTE_WALKER.
*/
-static void test_sip(enum shader_type shader_type, struct drm_xe_engine_class_instance *eci,
- uint32_t flags)
+static void test_sip(enum shader_type shader_type, enum sip_type sip_type,
+ struct drm_xe_engine_class_instance *eci, uint32_t flags)
{
unsigned int threads = 512;
unsigned int height = max_t(threads, HEIGHT, threads * 2);
@@ -172,12 +265,12 @@ static void test_sip(enum shader_type shader_type, struct drm_xe_engine_class_in
ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
igt_nsec_elapsed(&ts);
- handle = gpgpu_shader(fd, ibb, shader_type, threads, width, height);
+ handle = gpgpu_shader(fd, ibb, shader_type, sip_type, threads, width, height);
intel_bb_sync(ibb);
igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
- check_buf(fd, handle, width, height, COLOR_C4);
+ check_buf(fd, handle, width, height, shader_type, sip_type, COLOR_C4);
gem_close(fd, handle);
intel_bb_destroy(ibb);
@@ -205,17 +298,26 @@ igt_main
fd = drm_open_driver(DRIVER_XE);
test_render_and_compute("sanity", fd, eci)
- test_sip(SHADER_WRITE, eci, 0);
+ test_sip(SHADER_WRITE, SIP_NULL, eci, 0);
test_render_and_compute("sanity-after-timeout", fd, eci) {
- test_sip(SHADER_HANG, eci, 0);
+ test_sip(SHADER_HANG, SIP_NULL, eci, 0);
xe_for_each_engine(fd, eci)
if (eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER ||
eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE)
- test_sip(SHADER_WRITE, eci, 0);
+ test_sip(SHADER_WRITE, SIP_NULL, eci, 0);
}
+ test_render_and_compute("invalidinstr-disabled", fd, eci)
+ test_sip(SHADER_INV_INSTR_DISABLED, SIP_INV_INSTR, eci, 0);
+
+ test_render_and_compute("invalidinstr-thread-enabled", fd, eci)
+ test_sip(SHADER_INV_INSTR_THREAD_ENABLED, SIP_INV_INSTR, eci, 0);
+
+ test_render_and_compute("invalidinstr-walker-enabled", fd, eci)
+ test_sip(SHADER_INV_INSTR_WALKER_ENABLED, SIP_INV_INSTR, eci, 0);
+
igt_fixture
drm_close_driver(fd);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests
2024-09-05 9:28 ` [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests Christoph Manszewski
@ 2024-09-05 18:39 ` Zbigniew Kempczyński
2024-09-09 7:21 ` Zbigniew Kempczyński
1 sibling, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-05 18:39 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:28:04AM +0200, Christoph Manszewski wrote:
> From: Andrzej Hajda <andrzej.hajda@intel.com>
>
> Xe2 and earlier gens are able to handle very limited set of invalid
> instructions - only illegal and undefined opcodes, other errors in
> instruction can cause undefined behavior.
> Illegal/undefined opcode results in:
> - setting illegal opcode status bit - cr0.1[28],
> - calling SIP if illegal opcode bit is enabled - cr0.1[12].
> cr0.1[12] can be enabled directly from the thread or by thread dispatcher
> from Interface Descriptor Data provided to COMPUTE_WALKER instruction.
>
> Implemented cases:
> - check if SIP is not called when exception is not enabled,
> - check if SIP is called when exception is enabled from EU thread,
> - check if SIP is called when exception is enabled from COMPUTE_WALKER
>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> ---
> tests/intel/xe_exec_sip.c | 124 ++++++++++++++++++++++++++++++++++----
> 1 file changed, 113 insertions(+), 11 deletions(-)
>
> diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
> index 564c899f8..ee4787d61 100644
> --- a/tests/intel/xe_exec_sip.c
> +++ b/tests/intel/xe_exec_sip.c
> @@ -30,12 +30,25 @@
> #define COLOR_C4 0xc4
>
> #define SHADER_CANARY 0x01010101
> +#define SIP_CANARY 0x02020202
>
> enum shader_type {
> SHADER_HANG,
> + SHADER_INV_INSTR_DISABLED,
> + SHADER_INV_INSTR_THREAD_ENABLED,
> + SHADER_INV_INSTR_WALKER_ENABLED,
> SHADER_WRITE,
> };
>
> +enum sip_type {
> + SIP_INV_INSTR,
> + SIP_NULL,
> +};
> +
> +/* Control Register cr0.1 bits for exception handling */
> +#define ILLEGAL_OPCODE_ENABLE BIT(12)
> +#define ILLEGAL_OPCODE_STATUS BIT(28)
> +
> static struct intel_buf *
> create_fill_buf(int fd, int width, int height, uint8_t color)
> {
> @@ -58,8 +71,12 @@ create_fill_buf(int fd, int width, int height, uint8_t color)
> static struct gpgpu_shader *get_shader(int fd, enum shader_type shader_type)
> {
> static struct gpgpu_shader *shader;
> + uint32_t bad;
>
> shader = gpgpu_shader_create(fd);
> + if (shader_type == SHADER_INV_INSTR_WALKER_ENABLED)
> + shader->illegal_opcode_exception_enable = true;
> +
> gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
>
> switch (shader_type) {
> @@ -70,19 +87,63 @@ static struct gpgpu_shader *get_shader(int fd, enum shader_type shader_type)
> break;
> case SHADER_WRITE:
> break;
> + case SHADER_INV_INSTR_THREAD_ENABLED:
> + gpgpu_shader__set_exception(shader, ILLEGAL_OPCODE_ENABLE);
> + __attribute__ ((fallthrough));
> + case SHADER_INV_INSTR_DISABLED:
> + case SHADER_INV_INSTR_WALKER_ENABLED:
> + bad = (shader_type == SHADER_INV_INSTR_DISABLED) ? ILLEGAL_OPCODE_ENABLE : 0;
> + gpgpu_shader__write_on_exception(shader, 1, 0, ILLEGAL_OPCODE_ENABLE, bad);
> + gpgpu_shader__nop(shader);
> + gpgpu_shader__nop(shader);
> + /* modify second nop, set only opcode bits[6:0] */
> + shader->instr[gpgpu_shader_last_instr(shader)][0] = 0x7f;
> + /* SIP should clear exception bit */
> + bad = ILLEGAL_OPCODE_STATUS;
> + gpgpu_shader__write_on_exception(shader, 2, 0, ILLEGAL_OPCODE_STATUS, bad);
> + break;
According to my review of gpgpu_shader__write_on_exception(), the above
code (plus the SIP one below) has a flaw: there's no check whether the
block write happened successfully. I mean the test just overwrites this
block write value anyway. Commenting out the block write within the
write_on_exception() shader should be detected by the test logic (no
write happens).
I've been discussing this case with Andrzej and we agreed that
we may add such a scenario later, but this definitely has to be
addressed.
So I can conditionally give my r-b to this patch (I understand
that handcrafting a block write with manual calculation of the
destination address isn't very convenient).
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> }
>
> gpgpu_shader__eot(shader);
> return shader;
> }
>
> +static struct gpgpu_shader *get_sip(int fd, enum sip_type sip_type, unsigned int y_offset)
> +{
> + static struct gpgpu_shader *sip;
> +
> + if (sip_type == SIP_NULL)
> + return NULL;
> +
> + sip = gpgpu_shader_create(fd);
> + gpgpu_shader__write_dword(sip, SIP_CANARY, y_offset);
> +
> + switch (sip_type) {
> + case SIP_INV_INSTR:
> + gpgpu_shader__write_on_exception(sip, 1, y_offset, ILLEGAL_OPCODE_STATUS, 0);
> + break;
> + default:
> + break;
> + }
> +
> + gpgpu_shader__end_system_routine(sip, false);
> +
> + return sip;
> +}
> +
> static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, enum shader_type shader_type,
> - unsigned int threads, unsigned int width, unsigned int height)
> + enum sip_type sip_type, unsigned int threads, unsigned int width,
> + unsigned int height)
> {
> struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
> + struct gpgpu_shader *sip = get_sip(fd, sip_type, height / 2);
> struct gpgpu_shader *shader = get_shader(fd, shader_type);
>
> - gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
> + gpgpu_shader_exec(ibb, buf, 1, threads, shader, sip, 0, 0);
> +
> + if (sip)
> + gpgpu_shader_destroy(sip);
> +
> gpgpu_shader_destroy(shader);
> return buf->handle;
> }
> @@ -98,10 +159,10 @@ static void check_fill_buf(uint8_t *ptr, const int width, const int x,
> }
>
> static void check_buf(int fd, uint32_t handle, int width, int height,
> - uint8_t poison_c)
> + enum shader_type shader_type, enum sip_type sip_type, uint8_t poison_c)
> {
> unsigned int sz = ALIGN(width * height, 4096);
> - int thread_count = 0;
> + int thread_count = 0, sip_count = 0;
> uint32_t *ptr;
> int i, j;
>
> @@ -119,7 +180,27 @@ static void check_buf(int fd, uint32_t handle, int width, int height,
> i = 0;
> }
>
> + for (i = 0, j = height / 2; j < height; ++j) {
> + if (ptr[j * width / 4] == SIP_CANARY) {
> + ++sip_count;
> + i = 4;
> + }
> +
> + for (; i < width; i++)
> + check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
> +
> + i = 0;
> + }
> +
> igt_assert(thread_count);
> + if (shader_type == SHADER_INV_INSTR_DISABLED)
> + igt_assert(!sip_count);
> + else if (sip_type == SIP_INV_INSTR && shader_type != SHADER_INV_INSTR_DISABLED)
> + igt_assert_f(thread_count == sip_count,
> + "Thread and SIP count mismatch, %d != %d\n",
> + thread_count, sip_count);
> + else
> + igt_assert(sip_count == 0);
>
> munmap(ptr, sz);
> }
> @@ -143,9 +224,21 @@ xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
> *
> * SUBTEST: sanity-after-timeout
> * Description: check basic shader execution after job timeout
> + *
> + * SUBTEST: invalidinstr-disabled
> + * Description: Verify that we don't enter SIP after running into an invalid
> + * instruction when exception is not enabled.
> + *
> + * SUBTEST: invalidinstr-thread-enabled
> + * Description: Verify that we enter SIP after running into an invalid instruction
> + * when exception is enabled from thread.
> + *
> + * SUBTEST: invalidinstr-walker-enabled
> + * Description: Verify that we enter SIP after running into an invalid instruction
> + * when exception is enabled from COMPUTE_WALKER.
> */
> -static void test_sip(enum shader_type shader_type, struct drm_xe_engine_class_instance *eci,
> - uint32_t flags)
> +static void test_sip(enum shader_type shader_type, enum sip_type sip_type,
> + struct drm_xe_engine_class_instance *eci, uint32_t flags)
> {
> unsigned int threads = 512;
> unsigned int height = max_t(threads, HEIGHT, threads * 2);
> @@ -172,12 +265,12 @@ static void test_sip(enum shader_type shader_type, struct drm_xe_engine_class_in
> ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
>
> igt_nsec_elapsed(&ts);
> - handle = gpgpu_shader(fd, ibb, shader_type, threads, width, height);
> + handle = gpgpu_shader(fd, ibb, shader_type, sip_type, threads, width, height);
>
> intel_bb_sync(ibb);
> igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
>
> - check_buf(fd, handle, width, height, COLOR_C4);
> + check_buf(fd, handle, width, height, shader_type, sip_type, COLOR_C4);
>
> gem_close(fd, handle);
> intel_bb_destroy(ibb);
> @@ -205,17 +298,26 @@ igt_main
> fd = drm_open_driver(DRIVER_XE);
>
> test_render_and_compute("sanity", fd, eci)
> - test_sip(SHADER_WRITE, eci, 0);
> + test_sip(SHADER_WRITE, SIP_NULL, eci, 0);
>
> test_render_and_compute("sanity-after-timeout", fd, eci) {
> - test_sip(SHADER_HANG, eci, 0);
> + test_sip(SHADER_HANG, SIP_NULL, eci, 0);
>
> xe_for_each_engine(fd, eci)
> if (eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER ||
> eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE)
> - test_sip(SHADER_WRITE, eci, 0);
> + test_sip(SHADER_WRITE, SIP_NULL, eci, 0);
> }
>
> + test_render_and_compute("invalidinstr-disabled", fd, eci)
> + test_sip(SHADER_INV_INSTR_DISABLED, SIP_INV_INSTR, eci, 0);
> +
> + test_render_and_compute("invalidinstr-thread-enabled", fd, eci)
> + test_sip(SHADER_INV_INSTR_THREAD_ENABLED, SIP_INV_INSTR, eci, 0);
> +
> + test_render_and_compute("invalidinstr-walker-enabled", fd, eci)
> + test_sip(SHADER_INV_INSTR_WALKER_ENABLED, SIP_INV_INSTR, eci, 0);
> +
> igt_fixture
> drm_close_driver(fd);
> }
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests
2024-09-05 9:28 ` [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests Christoph Manszewski
2024-09-05 18:39 ` Zbigniew Kempczyński
@ 2024-09-09 7:21 ` Zbigniew Kempczyński
2024-09-13 11:50 ` Manszewski, Christoph
1 sibling, 1 reply; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-09 7:21 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:28:04AM +0200, Christoph Manszewski wrote:
> From: Andrzej Hajda <andrzej.hajda@intel.com>
>
> Xe2 and earlier gens are able to handle a very limited set of invalid
> instructions - only illegal and undefined opcodes; other errors in an
> instruction can cause undefined behavior.
> Illegal/undefined opcode results in:
> - setting illegal opcode status bit - cr0.1[28],
> - calling SIP if illegal opcode bit is enabled - cr0.1[12].
> cr0.1[12] can be enabled directly from the thread or by thread dispatcher
> from Interface Descriptor Data provided to COMPUTE_WALKER instruction.
>
> Implemented cases:
> - check if SIP is not called when exception is not enabled,
> - check if SIP is called when exception is enabled from EU thread,
> - check if SIP is called when exception is enabled from COMPUTE_WALKER
>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
I have an observation - the tests are passing, but I have doubts
regarding their completion. I mean I observe a hang, and the tests
report success after the job timeout.
--
Zbigniew
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests
2024-09-09 7:21 ` Zbigniew Kempczyński
@ 2024-09-13 11:50 ` Manszewski, Christoph
0 siblings, 0 replies; 50+ messages in thread
From: Manszewski, Christoph @ 2024-09-13 11:50 UTC (permalink / raw)
To: Zbigniew Kempczyński
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
Hi Zbigniew,
On 9.09.2024 09:21, Zbigniew Kempczyński wrote:
> On Thu, Sep 05, 2024 at 11:28:04AM +0200, Christoph Manszewski wrote:
>> From: Andrzej Hajda <andrzej.hajda@intel.com>
>>
>> Xe2 and earlier gens are able to handle a very limited set of invalid
>> instructions - only illegal and undefined opcodes; other errors in an
>> instruction can cause undefined behavior.
>> Illegal/undefined opcode results in:
>> - setting illegal opcode status bit - cr0.1[28],
>> - calling SIP if illegal opcode bit is enabled - cr0.1[12].
>> cr0.1[12] can be enabled directly from the thread or by thread dispatcher
>> from Interface Descriptor Data provided to COMPUTE_WALKER instruction.
>>
>> Implemented cases:
>> - check if SIP is not called when exception is not enabled,
>> - check if SIP is called when exception is enabled from EU thread,
>> - check if SIP is called when exception is enabled from COMPUTE_WALKER
>>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>> Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
>
> I have an observation - the tests are passing, but I have doubts
> regarding their completion. I mean I observe a hang, and the tests
> report success after the job timeout.
Well, for the 'sanity-after-timeout' subtest this is what it actually
intends to do - cause a timeout and see if we are able to run a basic
workload after it.
As for 'invalidinstr-*-enabled' I also assume this is expected, since
executing an invalid instruction causes an exception and no one
handles/clears it. But maybe we should? I will pass this one down to
@Andrzej.
Thanks,
Christoph
>
> --
> Zbigniew
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 10/17] drm-uapi/xe: Sync with eudebug uapi
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (8 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 09/17] tests/xe_exec_sip: Introduce invalid instruction tests Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 9:28 ` [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework Christoph Manszewski
` (9 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Mika Kuoppala, Christoph Manszewski
From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Align with kernel commit 09411c6ecbef ("drm/xe/eudebug: Add debug
metadata support for xe_eudebug") from:
https://gitlab.freedesktop.org/miku/kernel.git
which introduces most recent changes to the eudebug uapi.
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
include/drm-uapi/xe_drm_eudebug.h | 341 ++++++++++++++++++++++++++++++
1 file changed, 341 insertions(+)
create mode 100644 include/drm-uapi/xe_drm_eudebug.h
diff --git a/include/drm-uapi/xe_drm_eudebug.h b/include/drm-uapi/xe_drm_eudebug.h
new file mode 100644
index 000000000..13c069e00
--- /dev/null
+++ b/include/drm-uapi/xe_drm_eudebug.h
@@ -0,0 +1,341 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_DRM_EUDEBUG_H_
+#define _XE_DRM_EUDEBUG_H_
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/*
> + * The Xe EU debugger extends the uapi by both extending the api for the Xe device and
+ * adding an api to use with a separate debugger handle. Currently the KMD part adds the former to
+ * 'xe_drm.h' and the latter to 'xe_drm_eudebug.h'. Since the KMD part is not yet merged upstream,
+ * let's put all eudebug specific uapi here to keep the 'xe_drm.h' file synced with the upstream
+ * kernel.
+ */
+
+/* XXX: BEGIN section moved from xe_drm.h as temporary solution */
+#define DRM_XE_EUDEBUG_CONNECT 0x0c
+#define DRM_XE_DEBUG_METADATA_CREATE 0x0d
+#define DRM_XE_DEBUG_METADATA_DESTROY 0x0e
+
+/* ... */
+
+#define DRM_IOCTL_XE_EUDEBUG_CONNECT DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_EUDEBUG_CONNECT, struct drm_xe_eudebug_connect)
+#define DRM_IOCTL_XE_DEBUG_METADATA_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEBUG_METADATA_CREATE, struct drm_xe_debug_metadata_create)
+#define DRM_IOCTL_XE_DEBUG_METADATA_DESTROY DRM_IOW(DRM_COMMAND_BASE + DRM_XE_DEBUG_METADATA_DESTROY, struct drm_xe_debug_metadata_destroy)
+
+/* ... */
+
+struct drm_xe_vm_bind_op_ext_attach_debug {
+ /** @base: base user extension */
+ struct drm_xe_user_extension base;
+
+ /** @id: Debug object id from create metadata */
+ __u64 metadata_id;
+
+ /** @flags: Flags */
+ __u64 flags;
+
+ /** @cookie: Cookie */
+ __u64 cookie;
+
+ /** @reserved: Reserved */
+ __u64 reserved;
+};
+
+/* ... */
+
+#define XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG 0
+
+/* ... */
+
+#define DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG 2
+#define DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE (1 << 0)
+
+/* ... */
+
+/*
+ * Debugger ABI (ioctl and events) Version History:
+ * 0 - No debugger available
+ * 1 - Initial version
+ */
+#define DRM_XE_EUDEBUG_VERSION 1
+
+struct drm_xe_eudebug_connect {
+ /** @extensions: Pointer to the first extension struct, if any */
+ __u64 extensions;
+
+ __u64 pid; /* input: Target process ID */
+ __u32 flags; /* MBZ */
+
+ __u32 version; /* output: current ABI (ioctl / events) version */
+};
+
+/*
+ * struct drm_xe_debug_metadata_create - Create debug metadata
+ *
+ * Add a region of user memory to be marked as debug metadata.
> + * When the debugger attaches, the metadata regions will be delivered
> + * to the debugger. The debugger can then map these regions to help decode
+ * the program state.
+ *
> + * Returns a handle to the created metadata entry.
+ */
+struct drm_xe_debug_metadata_create {
+ /** @extensions: Pointer to the first extension struct, if any */
+ __u64 extensions;
+
+#define DRM_XE_DEBUG_METADATA_ELF_BINARY 0
+#define DRM_XE_DEBUG_METADATA_PROGRAM_MODULE 1
+#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_MODULE_AREA 2
+#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_SBA_AREA 3
+#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_SIP_AREA 4
+#define WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM (1 + \
+ WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_SIP_AREA)
+
+ /** @type: Type of metadata */
+ __u64 type;
+
+ /** @user_addr: pointer to start of the metadata */
+ __u64 user_addr;
+
> + /** @len: length, in bytes, of the metadata */
+ __u64 len;
+
+ /** @metadata_id: created metadata handle (out) */
+ __u32 metadata_id;
+};
+
+/**
+ * struct drm_xe_debug_metadata_destroy - Destroy debug metadata
+ *
+ * Destroy debug metadata.
+ */
+struct drm_xe_debug_metadata_destroy {
+ /** @extensions: Pointer to the first extension struct, if any */
+ __u64 extensions;
+
+ /** @metadata_id: metadata handle to destroy */
+ __u32 metadata_id;
+};
+
+/* XXX: END section moved from xe_drm.h as temporary solution */
+
+/**
+ * Do a eudebug event read for a debugger connection.
+ *
+ * This ioctl is available in debug version 1.
+ */
+#define DRM_XE_EUDEBUG_IOCTL_READ_EVENT _IO('j', 0x0)
+#define DRM_XE_EUDEBUG_IOCTL_EU_CONTROL _IOWR('j', 0x2, struct drm_xe_eudebug_eu_control)
+#define DRM_XE_EUDEBUG_IOCTL_ACK_EVENT _IOW('j', 0x4, struct drm_xe_eudebug_ack_event)
+#define DRM_XE_EUDEBUG_IOCTL_VM_OPEN _IOW('j', 0x1, struct drm_xe_eudebug_vm_open)
+#define DRM_XE_EUDEBUG_IOCTL_READ_METADATA _IOWR('j', 0x3, struct drm_xe_eudebug_read_metadata)
+
+/* XXX: Document events to match their internal counterparts when moved to xe_drm.h */
+struct drm_xe_eudebug_event {
+ __u32 len;
+
+ __u16 type;
+#define DRM_XE_EUDEBUG_EVENT_NONE 0
+#define DRM_XE_EUDEBUG_EVENT_READ 1
+#define DRM_XE_EUDEBUG_EVENT_OPEN 2
+#define DRM_XE_EUDEBUG_EVENT_VM 3
+#define DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE 4
+#define DRM_XE_EUDEBUG_EVENT_EU_ATTENTION 5
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND 6
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP 7
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE 8
+#define DRM_XE_EUDEBUG_EVENT_METADATA 9
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA 10
+
+ __u16 flags;
+#define DRM_XE_EUDEBUG_EVENT_CREATE (1 << 0)
+#define DRM_XE_EUDEBUG_EVENT_DESTROY (1 << 1)
+#define DRM_XE_EUDEBUG_EVENT_STATE_CHANGE (1 << 2)
+#define DRM_XE_EUDEBUG_EVENT_NEED_ACK (1 << 3)
+
+ __u64 seqno;
+ __u64 reserved;
+};
+
+struct drm_xe_eudebug_event_client {
+ struct drm_xe_eudebug_event base;
+
+ __u64 client_handle; /* This is unique per debug connection */
+};
+
+struct drm_xe_eudebug_event_vm {
+ struct drm_xe_eudebug_event base;
+
+ __u64 client_handle;
+ __u64 vm_handle;
+};
+
+struct drm_xe_eudebug_event_exec_queue {
+ struct drm_xe_eudebug_event base;
+
+ __u64 client_handle;
+ __u64 vm_handle;
+ __u64 exec_queue_handle;
+ __u32 engine_class;
+ __u32 width;
+ __u64 lrc_handle[];
+};
+
+struct drm_xe_eudebug_event_eu_attention {
+ struct drm_xe_eudebug_event base;
+
+ __u64 client_handle;
+ __u64 exec_queue_handle;
+ __u64 lrc_handle;
+ __u32 flags;
+ __u32 bitmask_size;
+ __u8 bitmask[];
+};
+
+struct drm_xe_eudebug_eu_control {
+ __u64 client_handle;
+
+#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL 0
+#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED 1
+#define DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME 2
+ __u32 cmd;
+ __u32 flags;
+
+ __u64 seqno;
+
+ __u64 exec_queue_handle;
+ __u64 lrc_handle;
+ __u32 reserved;
+ __u32 bitmask_size;
+ __u64 bitmask_ptr;
+};
+
+/*
+ * When client (debuggee) does vm_bind_ioctl() following event
+ * sequence will be created (for the debugger):
+ *
+ * ┌───────────────────────┐
+ * │ EVENT_VM_BIND ├───────┬─┬─┐
+ * └───────────────────────┘ │ │ │
+ * ┌───────────────────────┐ │ │ │
+ * │ EVENT_VM_BIND_OP #1 ├───┘ │ │
+ * └───────────────────────┘ │ │
+ * ... │ │
+ * ┌───────────────────────┐ │ │
+ * │ EVENT_VM_BIND_OP #n ├─────┘ │
+ * └───────────────────────┘ │
+ * │
+ * ┌───────────────────────┐ │
+ * │ EVENT_UFENCE ├───────┘
+ * └───────────────────────┘
+ *
+ * All the events below VM_BIND will reference the VM_BIND
+ * they associate with, by field .vm_bind_ref_seqno.
+ * event_ufence will only be included if the client did
+ * attach sync of type UFENCE into its vm_bind_ioctl().
+ *
+ * When EVENT_UFENCE is sent by the driver, all the OPs of
+ * the original VM_BIND are completed and the [addr,range]
+ * contained in them are present and modifiable through the
> + * vm accessors. Accessing [addr, range] before the related ufence
> + * event will lead to undefined results, as the actual bind
> + * operations are async and the backing storage might not
> + * be there at the moment the event is received.
+ *
+ * Client's UFENCE sync will be held by the driver: client's
+ * drm_xe_wait_ufence will not complete and the value of the ufence
+ * won't appear until ufence is acked by the debugger process calling
+ * DRM_XE_EUDEBUG_IOCTL_ACK_EVENT with the event_ufence.base.seqno.
+ * This will signal the fence, .value will update and the wait will
+ * complete allowing the client to continue.
+ *
+ */
+
+struct drm_xe_eudebug_event_vm_bind {
+ struct drm_xe_eudebug_event base;
+
+ __u64 client_handle;
+ __u64 vm_handle;
+
+ __u32 flags;
+#define DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE (1 << 0)
+
+ __u32 num_binds;
+};
+
+struct drm_xe_eudebug_event_vm_bind_op {
+ struct drm_xe_eudebug_event base;
+ __u64 vm_bind_ref_seqno; /* *_event_vm_bind.base.seqno */
+ __u64 num_extensions;
+
+ __u64 addr; /* XXX: Zero for unmap all? */
+ __u64 range; /* XXX: Zero for unmap all? */
+};
+
+struct drm_xe_eudebug_event_vm_bind_ufence {
+ struct drm_xe_eudebug_event base;
+ __u64 vm_bind_ref_seqno; /* *_event_vm_bind.base.seqno */
+};
+
+struct drm_xe_eudebug_ack_event {
+ __u32 type;
+ __u32 flags; /* MBZ */
+ __u64 seqno;
+};
+
+struct drm_xe_eudebug_vm_open {
+ /** @extensions: Pointer to the first extension struct, if any */
+ __u64 extensions;
+
+ /** @client_handle: id of client */
+ __u64 client_handle;
+
+ /** @vm_handle: id of vm */
+ __u64 vm_handle;
+
+ /** @flags: flags */
+ __u64 flags;
+
> + /** @timeout_ns: Timeout value in nanoseconds for operations (fsync) */
+ __u64 timeout_ns;
+};
+
+struct drm_xe_eudebug_read_metadata {
+ __u64 client_handle;
+ __u64 metadata_handle;
+ __u32 flags;
+ __u32 reserved;
+ __u64 ptr;
+ __u64 size;
+};
+
+struct drm_xe_eudebug_event_metadata {
+ struct drm_xe_eudebug_event base;
+
+ __u64 client_handle;
+ __u64 metadata_handle;
+ /* XXX: Refer to xe_drm.h for fields */
+ __u64 type;
+ __u64 len;
+};
+
+struct drm_xe_eudebug_event_vm_bind_op_metadata {
+ struct drm_xe_eudebug_event base;
+ __u64 vm_bind_op_ref_seqno; /* *_event_vm_bind_op.base.seqno */
+
+ __u64 metadata_handle;
+ __u64 metadata_cookie;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (9 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 10/17] drm-uapi/xe: Sync with eudebug uapi Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-09 8:46 ` Zbigniew Kempczyński
2024-09-10 5:32 ` Zbigniew Kempczyński
2024-09-05 9:28 ` [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter Christoph Manszewski
` (8 subsequent siblings)
19 siblings, 2 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Mika Kuoppala, Christoph Manszewski, Karolina Stolarek
From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Introduce a library which simplifies testing of the eu debug capability.
The library provides event log helpers together with an asynchronous
abstraction for the client process and the debugger itself.
xe_eudebug_client creates its own process with the user's work function,
and provides mechanisms to synchronize the beginning of execution and
event logging.
xe_eudebug_debugger allows attaching to the given process, provides an
asynchronous thread for event reading and introduces triggers - a
callback mechanism invoked every time a subscribed event is read.
To build the eudebug testing framework, the 'xe_eudebug' meson build
option has to be enabled; it is disabled by default.
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
---
lib/meson.build | 5 +
lib/xe/xe_eudebug.c | 2249 +++++++++++++++++++++++++++++++++++++++++++
lib/xe/xe_eudebug.h | 218 +++++
meson.build | 2 +
meson_options.txt | 5 +
5 files changed, 2479 insertions(+)
create mode 100644 lib/xe/xe_eudebug.c
create mode 100644 lib/xe/xe_eudebug.h
diff --git a/lib/meson.build b/lib/meson.build
index 4af2bc743..96dec9678 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -195,6 +195,11 @@ if chamelium.found()
lib_sources += 'monitor_edids/monitor_edids_helper.c'
endif
+if build_xe_eudebug
+ build_info += 'Xe EU debugger test framework enabled.'
+ lib_sources += 'xe/xe_eudebug.c'
+endif
+
if libprocps.found()
lib_deps += libprocps
else
diff --git a/lib/xe/xe_eudebug.c b/lib/xe/xe_eudebug.c
new file mode 100644
index 000000000..55cb3e99e
--- /dev/null
+++ b/lib/xe/xe_eudebug.c
@@ -0,0 +1,2249 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <fcntl.h>
+#include <poll.h>
+#include <signal.h>
+#include <sys/select.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+
+#include "igt.h"
+#include "igt_sysfs.h"
+#include "intel_pat.h"
+#include "xe_eudebug.h"
+#include "xe_ioctl.h"
+
+struct event_trigger {
+ xe_eudebug_trigger_fn fn;
+ int type;
+ struct igt_list_head link;
+};
+
+struct seqno_list_entry {
+ struct igt_list_head link;
+ uint64_t seqno;
+};
+
+struct match_dto {
+ struct drm_xe_eudebug_event *target;
+ struct igt_list_head *seqno_list;
+ uint64_t client_handle;
+ uint32_t filter;
+
+ /* store latest 'EVENT_VM_BIND' seqno */
+ uint64_t *bind_seqno;
+ /* latest vm_bind_op seqno matching bind_seqno */
+ uint64_t *bind_op_seqno;
+};
+
+#define CLIENT_PID 1
+#define CLIENT_RUN 2
+#define CLIENT_FINI 3
+#define CLIENT_STOP 4
+#define CLIENT_STAGE 5
+#define DEBUGGER_STAGE 6
+
+static const char *type_to_str(unsigned int type)
+{
+ switch (type) {
+ case DRM_XE_EUDEBUG_EVENT_NONE:
+ return "none";
+ case DRM_XE_EUDEBUG_EVENT_READ:
+ return "read";
+ case DRM_XE_EUDEBUG_EVENT_OPEN:
+ return "client";
+ case DRM_XE_EUDEBUG_EVENT_VM:
+ return "vm";
+ case DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE:
+ return "exec_queue";
+ case DRM_XE_EUDEBUG_EVENT_EU_ATTENTION:
+ return "attention";
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND:
+ return "vm_bind";
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP:
+ return "vm_bind_op";
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE:
+ return "vm_bind_ufence";
+ case DRM_XE_EUDEBUG_EVENT_METADATA:
+ return "metadata";
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA:
+ return "vm_bind_op_metadata";
+ }
+
+ return "UNKNOWN";
+}
+
+static const char *event_type_to_str(struct drm_xe_eudebug_event *e, char *buf)
+{
+ sprintf(buf, "%s(%d)", type_to_str(e->type), e->type);
+
+ return buf;
+}
+
+static const char *flags_to_str(unsigned int flags)
+{
+ if (flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ if (flags & DRM_XE_EUDEBUG_EVENT_NEED_ACK)
+ return "create|ack";
+ else
+ return "create";
+ }
+ if (flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
+ return "destroy";
+
+ if (flags & DRM_XE_EUDEBUG_EVENT_STATE_CHANGE)
+ return "state-change";
+
+ igt_assert(!(flags & DRM_XE_EUDEBUG_EVENT_NEED_ACK));
+
+ return "flags unknown";
+}
+
+static const char *event_members_to_str(struct drm_xe_eudebug_event *e, char *buf)
+{
+ switch (e->type) {
+ case DRM_XE_EUDEBUG_EVENT_OPEN: {
+ struct drm_xe_eudebug_event_client *ec = (void *)e;
+
+ sprintf(buf, "handle=%llu", ec->client_handle);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM: {
+ struct drm_xe_eudebug_event_vm *evm = (void *)e;
+
+ sprintf(buf, "client_handle=%llu, handle=%llu",
+ evm->client_handle, evm->vm_handle);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE: {
+ struct drm_xe_eudebug_event_exec_queue *ee = (void *)e;
+
+ sprintf(buf, "client_handle=%llu, vm_handle=%llu, "
+ "exec_queue_handle=%llu, engine_class=%d, exec_queue_width=%d",
+ ee->client_handle, ee->vm_handle,
+ ee->exec_queue_handle, ee->engine_class, ee->width);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_EU_ATTENTION: {
+ struct drm_xe_eudebug_event_eu_attention *ea = (void *)e;
+
+ sprintf(buf, "client_handle=%llu, exec_queue_handle=%llu, "
+ "lrc_handle=%llu, bitmask_size=%d",
+ ea->client_handle, ea->exec_queue_handle,
+ ea->lrc_handle, ea->bitmask_size);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND: {
+ struct drm_xe_eudebug_event_vm_bind *evmb = (void *)e;
+
+ sprintf(buf, "client_handle=%llu, vm_handle=%llu, flags=0x%x, num_binds=%u",
+ evmb->client_handle, evmb->vm_handle, evmb->flags, evmb->num_binds);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP: {
+ struct drm_xe_eudebug_event_vm_bind_op *op = (void *)e;
+
+ sprintf(buf, "vm_bind_ref_seqno=%lld, addr=%016llx, range=%llu num_extensions=%llu",
+ op->vm_bind_ref_seqno, op->addr, op->range, op->num_extensions);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE: {
+ struct drm_xe_eudebug_event_vm_bind_ufence *f = (void *)e;
+
+ sprintf(buf, "vm_bind_ref_seqno=%lld", f->vm_bind_ref_seqno);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_METADATA: {
+ struct drm_xe_eudebug_event_metadata *em = (void *)e;
+
+ sprintf(buf, "client_handle=%llu, metadata_handle=%llu, type=%llu, len=%llu",
+ em->client_handle, em->metadata_handle, em->type, em->len);
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA: {
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *op = (void *)e;
+
+ sprintf(buf, "vm_bind_op_ref_seqno=%lld, metadata_handle=%llu, metadata_cookie=%llu",
+ op->vm_bind_op_ref_seqno, op->metadata_handle, op->metadata_cookie);
+ break;
+ }
+ default:
+ strcpy(buf, "<...>");
+ }
+
+ return buf;
+}
+
+/**
+ * xe_eudebug_event_to_str:
+ * @e: pointer to event
+ * @buf: target to write string representation of @e
+ * @len: size of target buffer @buf
+ *
+ * Creates string representation for given event.
+ *
+ * Returns: the written input buffer pointed by @buf.
+ */
+const char *xe_eudebug_event_to_str(struct drm_xe_eudebug_event *e, char *buf, size_t len)
+{
+ char a[256];
+ char b[256];
+
+ igt_assert(e);
+ igt_assert(buf);
+
+ snprintf(buf, len, "(%llu) %15s:%s: %s",
+ e->seqno,
+ event_type_to_str(e, a),
+ flags_to_str(e->flags),
+ event_members_to_str(e, b));
+
+ return buf;
+}
+
+static void catch_child_failure(void)
+{
+ pid_t pid;
+ int status;
+
+ pid = waitpid(-1, &status, WNOHANG);
+
+ if (pid == 0 || pid == -1)
+ return;
+
+ if (!WIFEXITED(status))
+ return;
+
+ igt_assert_f(WEXITSTATUS(status) == 0, "Client failed!\n");
+}
+
+static int safe_pipe_read(int pipe[2], void *buf, int nbytes, int timeout_ms)
+{
+ int ret;
+ int t = 0;
+ struct pollfd fd = {
+ .fd = pipe[0],
+ .events = POLLIN,
+ .revents = 0
+ };
+
+ /* When child fails we may get stuck forever. Check whether
+ * the child process ended with an error.
+ */
+ do {
+ const int interval_ms = 1000;
+
+ ret = poll(&fd, 1, interval_ms);
+
+ if (!ret) {
+ catch_child_failure();
+ t += interval_ms;
+ }
+ } while (!ret && t < timeout_ms);
+
+ if (ret > 0)
+ return read(pipe[0], buf, nbytes);
+
+ return 0;
+}
+
+static uint64_t pipe_read(int pipe[2], int timeout_ms)
+{
+ uint64_t in;
+ uint64_t ret;
+
+ ret = safe_pipe_read(pipe, &in, sizeof(in), timeout_ms);
+ igt_assert(ret == sizeof(in));
+
+ return in;
+}
+
+static void pipe_signal(int pipe[2], uint64_t token)
+{
+ igt_assert(write(pipe[1], &token, sizeof(token)) == sizeof(token));
+}
+
+static void pipe_close(int pipe[2])
+{
+ if (pipe[0] != -1)
+ close(pipe[0]);
+
+ if (pipe[1] != -1)
+ close(pipe[1]);
+}
+
+static uint64_t __wait_token(int p[2], const uint64_t token, int timeout_ms)
+{
+ uint64_t in;
+
+ in = pipe_read(p, timeout_ms);
+
+ igt_assert_eq(in, token);
+
+ return pipe_read(p, timeout_ms);
+}
+
+static uint64_t client_wait_token(struct xe_eudebug_client *c, const uint64_t token)
+{
+ return __wait_token(c->p_in, token, c->timeout_ms);
+}
+
+static uint64_t wait_from_client(struct xe_eudebug_client *c, const uint64_t token)
+{
+ return __wait_token(c->p_out, token, c->timeout_ms);
+}
+
+static void token_signal(int p[2], const uint64_t token, const uint64_t value)
+{
+ pipe_signal(p, token);
+ pipe_signal(p, value);
+}
+
+static void client_signal(struct xe_eudebug_client *c,
+ const uint64_t token,
+ const uint64_t value)
+{
+ token_signal(c->p_out, token, value);
+}
+
+static int __xe_eudebug_connect(int fd, pid_t pid, uint32_t flags, uint64_t events)
+{
+ struct drm_xe_eudebug_connect param = {
+ .pid = pid,
+ .flags = flags,
+ };
+ int debugfd;
+
+ debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, ¶m);
+
+ if (debugfd < 0)
+ return -errno;
+
+ return debugfd;
+}
+
+static void event_log_write_to_fd(struct xe_eudebug_event_log *l, int fd)
+{
+ igt_assert_eq(write(fd, &l->head, sizeof(l->head)),
+ sizeof(l->head));
+
+ igt_assert_eq(write(fd, l->log, l->head), l->head);
+}
+
+static void read_all(int fd, void *buf, size_t nbytes)
+{
+ ssize_t remaining_size = nbytes;
+ ssize_t current_size = 0;
+ ssize_t read_size = 0;
+
+ do {
+ read_size = read(fd, buf + current_size, remaining_size);
+ igt_assert_f(read_size >= 0, "read failed: %s\n", strerror(errno));
+
+ current_size += read_size;
+ remaining_size -= read_size;
+ } while (remaining_size > 0 && read_size > 0);
+
+ igt_assert_eq(current_size, nbytes);
+}
+
+static void event_log_read_from_fd(struct xe_eudebug_event_log *l, int fd)
+{
+ read_all(fd, &l->head, sizeof(l->head));
+ igt_assert_lt(l->head, l->max_size);
+
+ read_all(fd, l->log, l->head);
+}
+
+typedef int (*cmp_fn_t)(struct drm_xe_eudebug_event *, void *);
+
+static struct drm_xe_eudebug_event *
+event_cmp(struct xe_eudebug_event_log *l,
+ struct drm_xe_eudebug_event *current,
+ cmp_fn_t match,
+ void *data)
+{
+ struct drm_xe_eudebug_event *e = current;
+
+ xe_eudebug_for_each_event(e, l) {
+ if (match(e, data))
+ return e;
+ }
+
+ return NULL;
+}
+
+static int match_type_and_flags(struct drm_xe_eudebug_event *a, void *data)
+{
+ struct drm_xe_eudebug_event *b = data;
+
+ if (a->type == b->type &&
+ a->flags == b->flags)
+ return 1;
+
+ return 0;
+}
+
+static int match_fields(struct drm_xe_eudebug_event *a, void *data)
+{
+ struct drm_xe_eudebug_event *b = data;
+ int ret = 0;
+
+ ret = match_type_and_flags(a, data);
+ if (!ret)
+ return ret;
+
+ ret = 0;
+
+ switch (a->type) {
+ case DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE: {
+ struct drm_xe_eudebug_event_exec_queue *ae = (void *)a;
+ struct drm_xe_eudebug_event_exec_queue *be = (void *)b;
+
+ if (ae->engine_class == be->engine_class && ae->width == be->width)
+ ret = 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND: {
+ struct drm_xe_eudebug_event_vm_bind *ea = (void *)a;
+ struct drm_xe_eudebug_event_vm_bind *eb = (void *)b;
+
+ if (ea->num_binds == eb->num_binds)
+ ret = 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP: {
+ struct drm_xe_eudebug_event_vm_bind_op *ea = (void *)a;
+ struct drm_xe_eudebug_event_vm_bind_op *eb = (void *)b;
+
+ if (ea->addr == eb->addr && ea->range == eb->range &&
+ ea->num_extensions == eb->num_extensions)
+ ret = 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA: {
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *ea = (void *)a;
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *eb = (void *)b;
+
+ if (ea->metadata_handle == eb->metadata_handle &&
+ ea->metadata_cookie == eb->metadata_cookie)
+ ret = 1;
+ break;
+ }
+
+ default:
+ ret = 1;
+ break;
+ }
+
+ return ret;
+}
+
+static int match_client_handle(struct drm_xe_eudebug_event *e, void *data)
+{
+ struct match_dto *md = data;
+ uint64_t *bind_seqno = md->bind_seqno;
+ uint64_t *bind_op_seqno = md->bind_op_seqno;
+ uint64_t h = md->client_handle;
+
+ if (XE_EUDEBUG_EVENT_IS_FILTERED(e->type, md->filter))
+ return 0;
+
+ switch (e->type) {
+ case DRM_XE_EUDEBUG_EVENT_OPEN: {
+ struct drm_xe_eudebug_event_client *client = (void *)e;
+
+ if (client->client_handle == h)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM: {
+ struct drm_xe_eudebug_event_vm *vm = (void *)e;
+
+ if (vm->client_handle == h)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE: {
+ struct drm_xe_eudebug_event_exec_queue *ee = (void *)e;
+
+ if (ee->client_handle == h)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND: {
+ struct drm_xe_eudebug_event_vm_bind *evmb = (void *)e;
+
+ if (evmb->client_handle == h) {
+ *bind_seqno = evmb->base.seqno;
+ return 1;
+ }
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP: {
+ struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
+
+ if (eo->vm_bind_ref_seqno == *bind_seqno) {
+ *bind_op_seqno = eo->base.seqno;
+ return 1;
+ }
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE: {
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+
+ if (ef->vm_bind_ref_seqno == *bind_seqno)
+ return 1;
+
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_METADATA: {
+ struct drm_xe_eudebug_event_metadata *em = (void *)e;
+
+ if (em->client_handle == h)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA: {
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *eo = (void *)e;
+
+ if (eo->vm_bind_op_ref_seqno == *bind_op_seqno)
+ return 1;
+ break;
+ }
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+static int match_opposite_resource(struct drm_xe_eudebug_event *e, void *data)
+{
+ struct drm_xe_eudebug_event *d = data;
+ int ret;
+
+ d->flags ^= DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY;
+ d->flags &= ~(DRM_XE_EUDEBUG_EVENT_NEED_ACK);
+ ret = match_type_and_flags(e, data);
+ d->flags ^= DRM_XE_EUDEBUG_EVENT_CREATE | DRM_XE_EUDEBUG_EVENT_DESTROY;
+
+ if (!ret)
+ return 0;
+
+ switch (e->type) {
+ case DRM_XE_EUDEBUG_EVENT_OPEN: {
+ struct drm_xe_eudebug_event_client *client = (void *)e;
+ struct drm_xe_eudebug_event_client *filter = data;
+
+ if (client->client_handle == filter->client_handle)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM: {
+ struct drm_xe_eudebug_event_vm *vm = (void *)e;
+ struct drm_xe_eudebug_event_vm *filter = data;
+
+ if (vm->vm_handle == filter->vm_handle)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE: {
+ struct drm_xe_eudebug_event_exec_queue *ee = (void *)e;
+ struct drm_xe_eudebug_event_exec_queue *filter = data;
+
+ if (ee->exec_queue_handle == filter->exec_queue_handle)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND: {
+ struct drm_xe_eudebug_event_vm_bind *evmb = (void *)e;
+ struct drm_xe_eudebug_event_vm_bind *filter = data;
+
+ if (evmb->vm_handle == filter->vm_handle &&
+ evmb->num_binds == filter->num_binds)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP: {
+ struct drm_xe_eudebug_event_vm_bind_op *avmb = (void *)e;
+ struct drm_xe_eudebug_event_vm_bind_op *filter = data;
+
+ if (avmb->addr == filter->addr &&
+ avmb->range == filter->range)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_METADATA: {
+ struct drm_xe_eudebug_event_metadata *em = (void *)e;
+ struct drm_xe_eudebug_event_metadata *filter = data;
+
+ if (em->metadata_handle == filter->metadata_handle)
+ return 1;
+ break;
+ }
+ case DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA: {
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *avmb = (void *)e;
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *filter = data;
+
+ if (avmb->metadata_handle == filter->metadata_handle &&
+ avmb->metadata_cookie == filter->metadata_cookie)
+ return 1;
+ break;
+ }
+
+ default:
+ break;
+ }
+ return 0;
+}
+
+static int match_full(struct drm_xe_eudebug_event *e, void *data)
+{
+ struct seqno_list_entry *sl;
+
+ struct match_dto *md = (void *)data;
+ int ret = 0;
+
+ ret = match_client_handle(e, md);
+ if (!ret)
+ return 0;
+
+ ret = match_fields(e, md->target);
+ if (!ret)
+ return 0;
+
+ igt_list_for_each_entry(sl, md->seqno_list, link) {
+ if (sl->seqno == e->seqno)
+ return 0;
+ }
+
+ return 1;
+}
+
+static struct drm_xe_eudebug_event *
+event_type_match(struct xe_eudebug_event_log *l,
+ struct drm_xe_eudebug_event *target,
+ struct drm_xe_eudebug_event *current)
+{
+ return event_cmp(l, current, match_type_and_flags, target);
+}
+
+static struct drm_xe_eudebug_event *
+client_match(struct xe_eudebug_event_log *l,
+ uint64_t client_handle,
+ struct drm_xe_eudebug_event *current,
+ uint32_t filter,
+ uint64_t *bind_seqno,
+ uint64_t *bind_op_seqno)
+{
+ struct match_dto md = {
+ .client_handle = client_handle,
+ .filter = filter,
+ .bind_seqno = bind_seqno,
+ .bind_op_seqno = bind_op_seqno,
+ };
+
+ return event_cmp(l, current, match_client_handle, &md);
+}
+
+static struct drm_xe_eudebug_event *
+opposite_event_match(struct xe_eudebug_event_log *l,
+ struct drm_xe_eudebug_event *target,
+ struct drm_xe_eudebug_event *current)
+{
+ return event_cmp(l, current, match_opposite_resource, target);
+}
+
+static struct drm_xe_eudebug_event *
+event_match(struct xe_eudebug_event_log *l,
+ struct drm_xe_eudebug_event *target,
+ uint64_t client_handle,
+ struct igt_list_head *seqno_list,
+ uint64_t *bind_seqno,
+ uint64_t *bind_op_seqno)
+{
+ struct match_dto md = {
+ .target = target,
+ .client_handle = client_handle,
+ .seqno_list = seqno_list,
+ .bind_seqno = bind_seqno,
+ .bind_op_seqno = bind_op_seqno,
+ };
+
+ return event_cmp(l, NULL, match_full, &md);
+}
+
+static void compare_client(struct xe_eudebug_event_log *log1, struct drm_xe_eudebug_event *ev1,
+ struct xe_eudebug_event_log *log2, struct drm_xe_eudebug_event *ev2,
+ uint32_t filter)
+{
+ struct drm_xe_eudebug_event_client *ev1_client = (void *)ev1;
+ struct drm_xe_eudebug_event_client *ev2_client = (void *)ev2;
+ uint64_t cbs = 0, dbs = 0, cbso = 0, dbso = 0;
+
+ struct igt_list_head matched_seqno_list;
+ struct drm_xe_eudebug_event *evptr1, *evptr2;
+ struct seqno_list_entry *entry, *tmp;
+
+ igt_assert(ev1_client);
+ igt_assert(ev2_client);
+
+ igt_debug("client: %llu -> %llu\n", ev1_client->client_handle, ev2_client->client_handle);
+
+ evptr1 = NULL;
+ evptr2 = NULL;
+ IGT_INIT_LIST_HEAD(&matched_seqno_list);
+
+ do {
+ evptr1 = client_match(log1, ev1_client->client_handle, evptr1, filter, &cbs, &cbso);
+ if (!evptr1)
+ break;
+
+ evptr2 = event_match(log2, evptr1, ev2_client->client_handle, &matched_seqno_list,
+ &dbs, &dbso);
+
+ igt_assert_f(evptr2, "%s (%llu): no matching event type %u found for client %llu\n",
+ log1->name,
+ evptr1->seqno,
+ evptr1->type,
+ ev1_client->client_handle);
+
+ igt_debug("comparing %s %llu vs %s %llu\n",
+ log1->name, evptr1->seqno, log2->name, evptr2->seqno);
+
+ /*
+ * Store the seqno of the event that was matched above,
+ * inside 'matched_seqno_list', to avoid it getting matched
+ * by subsequent 'event_match' calls.
+ */
+ entry = malloc(sizeof(*entry));
+ igt_assert(entry);
+ entry->seqno = evptr2->seqno;
+ igt_list_add(&entry->link, &matched_seqno_list);
+ } while (evptr1);
+
+ igt_list_for_each_entry_safe(entry, tmp, &matched_seqno_list, link)
+ free(entry);
+}
+
+/**
+ * xe_eudebug_event_log_find_seqno:
+ * @l: event log pointer
+ * @seqno: seqno of event to be found
+ *
+ * Finds the event with given seqno in the event log.
+ *
+ * Returns: pointer to the event with given seqno within @l or NULL if
+ * @seqno is not present.
+ */
+struct drm_xe_eudebug_event *
+xe_eudebug_event_log_find_seqno(struct xe_eudebug_event_log *l, uint64_t seqno)
+{
+ struct drm_xe_eudebug_event *e = NULL, *found = NULL;
+
+ igt_assert(l);
+ igt_assert_neq(seqno, 0);
+ /*
+ * Try to catch if seqno is corrupted and prevent too long tests,
+ * as our post processing of events is not optimized.
+ */
+ igt_assert_lt(seqno, 10 * 1000 * 1000);
+
+ xe_eudebug_for_each_event(e, l) {
+ if (e->seqno == seqno) {
+ if (found) {
+ igt_warn("Found multiple events with the same seqno %lu\n", seqno);
+ xe_eudebug_event_log_print(l, false);
+ igt_assert(!found);
+ }
+ found = e;
+ }
+ }
+
+ return found;
+}
+
+static void event_log_sort(struct xe_eudebug_event_log *l)
+{
+ struct xe_eudebug_event_log *tmp;
+ struct drm_xe_eudebug_event *e = NULL;
+ uint64_t first_seqno = UINT64_MAX;
+ uint64_t last_seqno = 0;
+ uint64_t events = 0, added = 0;
+ uint64_t i;
+
+ xe_eudebug_for_each_event(e, l) {
+ if (e->seqno > last_seqno)
+ last_seqno = e->seqno;
+
+ if (e->seqno < first_seqno)
+ first_seqno = e->seqno;
+
+ events++;
+ }
+
+ tmp = xe_eudebug_event_log_create("tmp", l->max_size);
+
+ for (i = first_seqno; i <= last_seqno; i++) {
+ e = xe_eudebug_event_log_find_seqno(l, i);
+ if (e) {
+ xe_eudebug_event_log_write(tmp, e);
+ added++;
+ }
+ }
+
+ igt_assert_eq(events, added);
+ igt_assert_eq(tmp->head, l->head);
+
+ memcpy(l->log, tmp->log, tmp->head);
+
+ xe_eudebug_event_log_destroy(tmp);
+}
+
+/**
+ * xe_eudebug_connect:
+ * @fd: Xe file descriptor
+ * @pid: client PID
+ * @flags: connection flags
+ *
+ * Opens the xe eu debugger connection to the process described by @pid
+ *
+ * Returns: the debugger connection file descriptor on success, -errno otherwise.
+ */
+int xe_eudebug_connect(int fd, pid_t pid, uint32_t flags)
+{
+ int ret;
+ uint64_t events = 0; /* events filtering not supported yet! */
+
+ ret = __xe_eudebug_connect(fd, pid, flags, events);
+
+ return ret;
+}
+
+/**
+ * xe_eudebug_event_log_create:
+ * @name: event log identifier
+ * @max_size: maximum size of created log
+ *
+ * Creates an EU debugger event log with size equal to @max_size.
+ *
+ * Returns: pointer to just created log
+ */
+#define MAX_EVENT_LOG_SIZE (32 * 1024 * 1024)
+struct xe_eudebug_event_log *xe_eudebug_event_log_create(const char *name, unsigned int max_size)
+{
+ struct xe_eudebug_event_log *l;
+
+ igt_assert(name);
+
+ l = calloc(1, sizeof(*l));
+ igt_assert(l);
+ l->log = calloc(1, max_size);
+ igt_assert(l->log);
+ l->max_size = max_size;
+ strncpy(l->name, name, sizeof(l->name) - 1);
+ pthread_mutex_init(&l->lock, NULL);
+
+ return l;
+}
+
+/**
+ * xe_eudebug_event_log_destroy:
+ * @l: event log pointer
+ *
+ * Frees given event log @l.
+ */
+void xe_eudebug_event_log_destroy(struct xe_eudebug_event_log *l)
+{
+ igt_assert(l);
+ pthread_mutex_destroy(&l->lock);
+ free(l->log);
+ free(l);
+}
+
+/**
+ * xe_eudebug_event_log_write:
+ * @l: event log pointer
+ * @e: event to be written to event log
+ *
+ * Writes event @e to the event log, thread-safe.
+ */
+void xe_eudebug_event_log_write(struct xe_eudebug_event_log *l, struct drm_xe_eudebug_event *e)
+{
+ igt_assert(l);
+ igt_assert(e);
+ igt_assert(e->seqno);
+ /*
+ * Try to catch if seqno is corrupted and prevent too long tests,
+ * as our post processing of events is not optimized.
+ */
+ igt_assert_lt(e->seqno, 10 * 1000 * 1000);
+
+ pthread_mutex_lock(&l->lock);
+ igt_assert_lt(l->head + e->len, l->max_size);
+ memcpy(l->log + l->head, e, e->len);
+ l->head += e->len;
+ pthread_mutex_unlock(&l->lock);
+}
+
+/**
+ * xe_eudebug_event_log_print:
+ * @l: event log pointer
+ * @debug: when true function uses igt_debug instead of igt_info.
+ *
+ * Prints given event log.
+ */
+void
+xe_eudebug_event_log_print(struct xe_eudebug_event_log *l, bool debug)
+{
+ struct drm_xe_eudebug_event *e = NULL;
+ int level = debug ? IGT_LOG_DEBUG : IGT_LOG_INFO;
+ char str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
+
+ igt_assert(l);
+
+ igt_log(IGT_LOG_DOMAIN, level,
+ "event log '%s' (%u bytes):\n", l->name, l->head);
+
+ xe_eudebug_for_each_event(e, l) {
+ xe_eudebug_event_to_str(e, str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
+ igt_log(IGT_LOG_DOMAIN, level, "%s\n", str);
+ }
+}
+
+/**
+ * xe_eudebug_event_log_compare:
+ * @a: event log pointer
+ * @b: event log pointer
+ * @filter: mask that represents events to be skipped during comparison, useful
+ * for events like 'VM_BIND' since they can be asymmetric. Note that
+ * 'DRM_XE_EUDEBUG_EVENT_OPEN' will always be matched.
+ *
+ * Compares and asserts event logs @a, @b if the event
+ * sequence matches.
+ */
+void xe_eudebug_event_log_compare(struct xe_eudebug_event_log *log1,
+ struct xe_eudebug_event_log *log2,
+ uint32_t filter)
+{
+ struct drm_xe_eudebug_event *ev1 = NULL;
+ struct drm_xe_eudebug_event *ev2 = NULL;
+
+ igt_assert(log1);
+ igt_assert(log2);
+
+ xe_eudebug_for_each_event(ev1, log1) {
+ if (ev1->type == DRM_XE_EUDEBUG_EVENT_OPEN &&
+ ev1->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ ev2 = event_type_match(log2, ev1, ev2);
+
+ compare_client(log1, ev1, log2, ev2, filter);
+ compare_client(log2, ev2, log1, ev1, filter);
+ }
+ }
+}
+
+/**
+ * xe_eudebug_event_log_match_opposite:
+ * @l: event log pointer
+ * @filter: mask that represents events to be skipped during comparison, useful
+ * for events like 'VM_BIND' since they can be asymmetric
+ *
+ * Matches and asserts content of all opposite events (create vs destroy).
+ */
+void
+xe_eudebug_event_log_match_opposite(struct xe_eudebug_event_log *l, uint32_t filter)
+{
+ struct drm_xe_eudebug_event *ev1 = NULL;
+ struct drm_xe_eudebug_event *ev2 = NULL;
+
+ igt_assert(l);
+
+ xe_eudebug_for_each_event(ev1, l) {
+ if (ev1->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ uint8_t offset = sizeof(struct drm_xe_eudebug_event);
+ int opposite_matching;
+
+ if (XE_EUDEBUG_EVENT_IS_FILTERED(ev1->type, filter))
+ continue;
+
+ /* No opposite matching for binds */
+ if ((ev1->type >= DRM_XE_EUDEBUG_EVENT_VM_BIND &&
+ ev1->type <= DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE) ||
+ ev1->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA)
+ continue;
+
+ ev2 = opposite_event_match(l, ev1, ev1);
+
+ igt_assert_f(ev2, "no opposite event of type %u found\n", ev1->type);
+
+ igt_assert_eq(ev1->len, ev2->len);
+ opposite_matching = memcmp((uint8_t *)ev2 + offset,
+ (uint8_t *)ev1 + offset,
+ ev2->len - offset) == 0;
+
+ igt_assert_f(opposite_matching,
+ "%s: create|destroy event not matching (%llu) vs (%llu)\n",
+ l->name, ev2->seqno, ev1->seqno);
+ }
+ }
+}
+
+static void debugger_run_triggers(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct event_trigger *t;
+
+ igt_list_for_each_entry(t, &d->triggers, link) {
+ if (e->type == t->type)
+ t->fn(d, e);
+ }
+}
+
+#define MAX_EVENT_SIZE (32 * 1024)
+static int
+xe_eudebug_read_event(int fd, struct drm_xe_eudebug_event *event)
+{
+ int ret;
+
+ event->type = DRM_XE_EUDEBUG_EVENT_READ;
+ event->flags = 0;
+ event->len = MAX_EVENT_SIZE;
+
+ ret = igt_ioctl(fd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
+ if (ret < 0)
+ return -errno;
+
+ return ret;
+}
+
+static void *debugger_worker_loop(void *data)
+{
+ uint8_t buf[MAX_EVENT_SIZE];
+ struct drm_xe_eudebug_event *e = (void *)buf;
+ struct xe_eudebug_debugger *d = data;
+ struct pollfd p = {
+ .events = POLLIN,
+ .revents = 0,
+ };
+ int timeout_ms = 100, ret;
+
+ igt_assert(d->master_fd >= 0);
+
+ do {
+ p.fd = d->fd;
+ ret = poll(&p, 1, timeout_ms);
+
+ if (ret == -1) {
+ igt_info("poll failed with errno %d\n", errno);
+ break;
+ }
+
+ if (ret == 1 && (p.revents & POLLIN)) {
+ int err = xe_eudebug_read_event(d->fd, e);
+
+ if (!err) {
+ ++d->event_count;
+
+ xe_eudebug_event_log_write(d->log, e);
+ debugger_run_triggers(d, e);
+ } else {
+ igt_info("xe_eudebug_read_event returned %d\n", err);
+ }
+ }
+ } while ((ret && READ_ONCE(d->worker_state) == DEBUGGER_WORKER_QUITTING) ||
+ READ_ONCE(d->worker_state) == DEBUGGER_WORKER_ACTIVE);
+
+ d->worker_state = DEBUGGER_WORKER_INACTIVE;
+
+ return NULL;
+}
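The worker loop above is a standard poll-with-timeout reader. A self-contained sketch of the same shape against a plain pipe (the `demo_poll_read` name is hypothetical) is:

```c
#include <poll.h>
#include <unistd.h>

/* Poll @fd for up to @timeout_ms, and read up to @n bytes when POLLIN is
 * reported -- the same shape as debugger_worker_loop() above.
 * Returns bytes read, 0 on timeout, -1 on error. */
static ssize_t demo_poll_read(int fd, void *buf, size_t n, int timeout_ms)
{
	struct pollfd p = { .fd = fd, .events = POLLIN };
	int ret = poll(&p, 1, timeout_ms);

	if (ret < 0)
		return -1;
	if (ret == 0 || !(p.revents & POLLIN))
		return 0; /* timeout path -- the worker uses this to re-check its state flag */

	return read(fd, buf, n);
}
```

The bounded timeout matters: it is what lets the loop periodically re-check `worker_state` instead of blocking forever in `read()`.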
+
+/**
+ * xe_eudebug_debugger_available:
+ * @fd: Xe file descriptor
+ *
+ * Returns: true if the debugger connection is available, false otherwise.
+ */
+bool xe_eudebug_debugger_available(int fd)
+{
+ struct drm_xe_eudebug_connect param = { .pid = getpid() };
+ int debugfd;
+
+ debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, &param);
+ if (debugfd >= 0)
+ close(debugfd);
+
+ return debugfd >= 0;
+}
+
+/**
+ * xe_eudebug_debugger_create:
+ * @master_fd: xe client used to open the debugger connection
+ * @flags: flags stored in the debugger structure, usable at the caller's
+ * discretion, e.g. inside triggers.
+ * @data: test's private data, allocated with MAP_SHARED | MAP_ANONYMOUS,
+ * can be shared between client and debugger. Can be NULL.
+ *
+ * Returns: newly created xe_eudebug_debugger structure with its
+ * event log initialized. Note that to open the connection
+ * you need to call xe_eudebug_debugger_attach().
+ */
+struct xe_eudebug_debugger *
+xe_eudebug_debugger_create(int master_fd, uint64_t flags, void *data)
+{
+ struct xe_eudebug_debugger *d;
+
+ d = calloc(1, sizeof(*d));
+ igt_assert(d);
+
+ d->flags = flags;
+ IGT_INIT_LIST_HEAD(&d->triggers);
+ d->log = xe_eudebug_event_log_create("debugger", MAX_EVENT_LOG_SIZE);
+ d->fd = -1;
+ d->master_fd = master_fd;
+ d->ptr = data;
+
+ return d;
+}
+
+static void debugger_destroy_triggers(struct xe_eudebug_debugger *d)
+{
+ struct event_trigger *t, *tmp;
+
+ igt_list_for_each_entry_safe(t, tmp, &d->triggers, link)
+ free(t);
+}
+
+/**
+ * xe_eudebug_debugger_destroy:
+ * @d: pointer to the debugger
+ *
+ * Frees xe_eudebug_debugger structure pointed by @d. If the debugger
+ * connection was still opened it terminates it.
+ */
+void xe_eudebug_debugger_destroy(struct xe_eudebug_debugger *d)
+{
+ if (d->worker_state != DEBUGGER_WORKER_INACTIVE)
+ xe_eudebug_debugger_stop_worker(d, 1);
+
+ if (d->target_pid)
+ xe_eudebug_debugger_detach(d);
+
+ xe_eudebug_event_log_destroy(d->log);
+ debugger_destroy_triggers(d);
+ free(d);
+}
+
+/**
+ * xe_eudebug_debugger_attach:
+ * @d: pointer to the debugger
+ * @c: pointer to the client
+ *
+ * Opens the xe eu debugger connection to the process described by @c (c->pid)
+ *
+ * Returns: 0 if the debugger was successfully attached, -errno otherwise.
+ */
+int xe_eudebug_debugger_attach(struct xe_eudebug_debugger *d,
+ struct xe_eudebug_client *c)
+{
+ int ret;
+
+ igt_assert_eq(d->fd, -1);
+ igt_assert_neq(c->pid, 0);
+ ret = xe_eudebug_connect(d->master_fd, c->pid, 0);
+
+ if (ret < 0)
+ return ret;
+
+ d->fd = ret;
+ d->target_pid = c->pid;
+ d->p_client[0] = c->p_in[0];
+ d->p_client[1] = c->p_in[1];
+
+ igt_debug("debugger connected to %lu\n", d->target_pid);
+
+ return 0;
+}
+
+/**
+ * xe_eudebug_debugger_detach:
+ * @d: pointer to the debugger
+ *
+ * Closes a previously opened xe eu debugger connection. Asserts that
+ * the debugger has an active session.
+ */
+void xe_eudebug_debugger_detach(struct xe_eudebug_debugger *d)
+{
+ igt_assert(d->target_pid);
+ close(d->fd);
+ d->target_pid = 0;
+ d->fd = -1;
+}
+
+/**
+ * xe_eudebug_debugger_add_trigger:
+ * @d: pointer to the debugger
+ * @type: the type of the event which activates the trigger
+ * @fn: function to be called when event of @type was read by the debugger.
+ *
+ * Adds function @fn to the list of triggers activated when event of @type
+ * has been read by worker.
+ * Note: Triggers are activated by the worker.
+ */
+void xe_eudebug_debugger_add_trigger(struct xe_eudebug_debugger *d,
+ int type, xe_eudebug_trigger_fn fn)
+{
+ struct event_trigger *t;
+
+ t = calloc(1, sizeof(*t));
+ igt_assert(t);
+
+ IGT_INIT_LIST_HEAD(&t->link);
+ t->type = type;
+ t->fn = fn;
+
+ igt_list_add_tail(&t->link, &d->triggers);
+ igt_debug("added trigger %p\n", t);
+}
+
+/**
+ * xe_eudebug_debugger_start_worker:
+ * @d: pointer to the debugger
+ *
+ * Starts the debugger worker. The worker is responsible for reading all
+ * incoming events from the debugger, putting them into the debugger log
+ * and executing the appropriate event triggers. Note that using the
+ * debugger's event log while the worker is running is not safe.
+ */
+void xe_eudebug_debugger_start_worker(struct xe_eudebug_debugger *d)
+{
+ int ret;
+
+ d->worker_state = DEBUGGER_WORKER_ACTIVE;
+ ret = pthread_create(&d->worker_thread, NULL, &debugger_worker_loop, d);
+
+ igt_assert_f(ret == 0, "Debugger worker thread creation failed!\n");
+}
+
+/**
+ * xe_eudebug_debugger_stop_worker:
+ * @d: pointer to the debugger
+ *
+ * Stops the debugger worker. Event log is sorted by seqno after closure.
+ */
+void xe_eudebug_debugger_stop_worker(struct xe_eudebug_debugger *d,
+ int timeout_s)
+{
+ struct timespec t = {};
+ int ret;
+
+ igt_assert_neq(d->worker_state, DEBUGGER_WORKER_INACTIVE);
+
+ d->worker_state = DEBUGGER_WORKER_QUITTING; /* First time be polite. */
+ igt_assert_eq(clock_gettime(CLOCK_REALTIME, &t), 0);
+ t.tv_sec += timeout_s;
+
+ ret = pthread_timedjoin_np(d->worker_thread, NULL, &t);
+
+ if (ret == ETIMEDOUT) {
+ d->worker_state = DEBUGGER_WORKER_INACTIVE;
+ ret = pthread_join(d->worker_thread, NULL);
+ }
+
+ igt_assert_f(ret == 0 || ret == ESRCH,
+ "pthread join failed with error %d!\n", ret);
+
+ event_log_sort(d->log);
+}
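The stop path above, polite timed join first and a second join as fallback, can be sketched standalone as below (the `demo_timed_join`/`demo_worker` names are hypothetical; `pthread_timedjoin_np` is a GNU extension, hence `_GNU_SOURCE`):

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

static void *demo_worker(void *arg)
{
	usleep(10000); /* pretend to drain events for 10 ms */
	return arg;
}

/* Join @t with an absolute CLOCK_REALTIME deadline @timeout_s seconds
 * from now, mirroring xe_eudebug_debugger_stop_worker().
 * Returns 0 on success, an errno-style code otherwise. */
static int demo_timed_join(pthread_t t, int timeout_s)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_REALTIME, &ts))
		return errno;
	ts.tv_sec += timeout_s;

	return pthread_timedjoin_np(t, NULL, &ts);
}
```

Note the deadline is absolute, not relative, which is why `clock_gettime(CLOCK_REALTIME, ...)` is sampled first and the timeout added on top.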
+
+/**
+ * xe_eudebug_debugger_signal_stage:
+ * @d: pointer to the debugger
+ * @stage: stage to signal
+ *
+ * Signals to client, waiting in xe_eudebug_client_wait_stage(),
+ * releasing it to proceed.
+ */
+void xe_eudebug_debugger_signal_stage(struct xe_eudebug_debugger *d, uint64_t stage)
+{
+ token_signal(d->p_client, CLIENT_STAGE, stage);
+}
+
+/**
+ * xe_eudebug_debugger_wait_stage:
+ * @s: pointer to xe_eudebug_session structure
+ * @stage: stage to wait on
+ *
+ * Pauses debugger until the client has signalled the corresponding stage with
+ * xe_eudebug_client_signal_stage. This is only for situations where the actual
+ * event flow is not enough to coordinate between client/debugger and extra sync
+ * mechanism is needed.
+ */
+void xe_eudebug_debugger_wait_stage(struct xe_eudebug_session *s, uint64_t stage)
+{
+ u64 stage_in;
+
+ igt_debug("debugger xe client fd: %d pausing for stage %lu\n", s->debugger->master_fd,
+ stage);
+
+ stage_in = wait_from_client(s->client, DEBUGGER_STAGE);
+ igt_debug("debugger xe client fd: %d received stage %lu, expected %lu\n",
+ s->debugger->master_fd, stage_in, stage);
+
+ igt_assert_eq(stage_in, stage);
+}
+
+/**
+ * xe_eudebug_client_create:
+ * @master_fd: xe client used to open the debugger connection
+ * @work: function that opens xe device and executes arbitrary workload
+ * @flags: flags stored in a client structure, usable at the caller's
+ * discretion, e.g. to provide the @work function with an additional switch.
+ * @data: test's private data, allocated with MAP_SHARED | MAP_ANONYMOUS,
+ * can be shared between client and debugger. Accessible via client->ptr.
+ * Can be NULL.
+ *
+ * Forks and creates the client process. @work won't be called until
+ * xe_eudebug_client_start is called.
+ *
+ * Returns: newly created xe_eudebug_client structure with its
+ * event log initialized.
+ */
+struct xe_eudebug_client *xe_eudebug_client_create(int master_fd, xe_eudebug_client_work_fn work,
+ uint64_t flags, void *data)
+{
+ struct xe_eudebug_client *c;
+
+ c = calloc(1, sizeof(*c));
+ igt_assert(c);
+
+ c->flags = flags;
+ igt_assert(!pipe(c->p_in));
+ igt_assert(!pipe(c->p_out));
+ c->seqno = 1;
+ c->log = xe_eudebug_event_log_create("client", MAX_EVENT_LOG_SIZE);
+ c->done = 0;
+ c->ptr = data;
+ c->master_fd = master_fd;
+ c->timeout_ms = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * MSEC_PER_SEC;
+
+ igt_fork(child, 1) {
+ int mypid;
+
+ igt_assert_eq(c->pid, 0);
+
+ close(c->p_out[0]);
+ c->p_out[0] = -1;
+ close(c->p_in[1]);
+ c->p_in[1] = -1;
+
+ mypid = getpid();
+ client_signal(c, CLIENT_PID, mypid);
+
+ c->pid = client_wait_token(c, CLIENT_RUN);
+ igt_assert_eq(c->pid, mypid);
+ if (work)
+ work(c);
+
+ client_signal(c, CLIENT_FINI, c->seqno);
+
+ event_log_write_to_fd(c->log, c->p_out[1]);
+
+ c->pid = client_wait_token(c, CLIENT_STOP);
+ igt_assert_eq(c->pid, mypid);
+ }
+
+ close(c->p_out[1]);
+ c->p_out[1] = -1;
+ close(c->p_in[0]);
+ c->p_in[0] = -1;
+
+ c->pid = wait_from_client(c, CLIENT_PID);
+
+ igt_info("client running with pid %d\n", c->pid);
+
+ return c;
+}
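The fork above relies on a blocking pipe read to synchronize parent and child (the CLIENT_PID token). A self-contained sketch of that handshake (the `demo_fork_handshake` name is hypothetical) is:

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that announces its pid as a 64-bit token over a pipe,
 * like the CLIENT_PID handshake above; the parent blocks until the
 * token arrives. Returns the received pid, or -1 on error. */
static pid_t demo_fork_handshake(void)
{
	int p_out[2]; /* child -> parent, like c->p_out */
	uint64_t token = 0;
	pid_t pid;

	if (pipe(p_out))
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		uint64_t mypid = getpid();

		/* child: signal own pid, then exit without flushing stdio */
		write(p_out[1], &mypid, sizeof(mypid));
		_exit(0);
	}

	/* parent: blocking read doubles as "wait until child is up" */
	if (read(p_out[0], &token, sizeof(token)) != sizeof(token))
		return -1;
	waitpid(pid, NULL, 0);
	close(p_out[0]);
	close(p_out[1]);

	return (pid_t)token;
}
```

The blocking read gives ordering for free: the parent cannot proceed until the child has actually started, the same property `wait_from_client(c, CLIENT_PID)` provides above.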
+
+/**
+ * xe_eudebug_client_stop:
+ * @c: pointer to xe_eudebug_client structure
+ *
+ * Waits for the end of the client's work and terminates the client process.
+ */
+void xe_eudebug_client_stop(struct xe_eudebug_client *c)
+{
+ if (c->pid) {
+ int waitstatus;
+
+ xe_eudebug_client_wait_done(c);
+
+ token_signal(c->p_in, CLIENT_STOP, c->pid);
+ igt_assert_eq(waitpid(c->pid, &waitstatus, 0),
+ c->pid);
+ c->pid = 0;
+ }
+}
+
+/**
+ * xe_eudebug_client_destroy:
+ * @c: pointer to xe_eudebug_client structure to be freed
+ *
+ * Frees the @c client structure. Note that it calls xe_eudebug_client_stop
+ * if the client process has not terminated yet.
+ */
+void xe_eudebug_client_destroy(struct xe_eudebug_client *c)
+{
+ xe_eudebug_client_stop(c);
+ pipe_close(c->p_in);
+ pipe_close(c->p_out);
+ xe_eudebug_event_log_destroy(c->log);
+ free(c);
+}
+
+/**
+ * xe_eudebug_client_get_seqno:
+ * @c: pointer to xe_eudebug_client structure
+ *
+ * Returns the current seqno value of the given client @c and increments it.
+ *
+ * Returns: seqno value before the increment
+ */
+uint64_t xe_eudebug_client_get_seqno(struct xe_eudebug_client *c)
+{
+ return c->seqno++;
+}
+
+/**
+ * xe_eudebug_client_start:
+ * @c: pointer to xe_eudebug_client structure
+ *
+ * Starts execution of the client's work function within the client's process.
+ */
+void xe_eudebug_client_start(struct xe_eudebug_client *c)
+{
+ token_signal(c->p_in, CLIENT_RUN, c->pid);
+}
+
+/**
+ * xe_eudebug_client_wait_done:
+ * @c: pointer to xe_eudebug_client structure
+ *
+ * Waits for the client's work to end and updates the event log.
+ * Doesn't terminate the client's process yet.
+ */
+void xe_eudebug_client_wait_done(struct xe_eudebug_client *c)
+{
+ if (!c->done) {
+ c->done = 1;
+ c->seqno = wait_from_client(c, CLIENT_FINI);
+ event_log_read_from_fd(c->log, c->p_out[0]);
+ }
+}
+
+/**
+ * xe_eudebug_client_signal_stage:
+ * @c: pointer to the client
+ * @stage: stage to signal
+ *
+ * Signals to debugger, waiting in xe_eudebug_debugger_wait_stage(),
+ * releasing it to proceed.
+ */
+void xe_eudebug_client_signal_stage(struct xe_eudebug_client *c, uint64_t stage)
+{
+ token_signal(c->p_out, DEBUGGER_STAGE, stage);
+}
+
+/**
+ * xe_eudebug_client_wait_stage:
+ * @c: pointer to xe_eudebug_client structure
+ * @stage: stage to wait on
+ *
+ * Pauses client until the debugger has signalled the corresponding stage with
+ * xe_eudebug_debugger_signal_stage. This is only for situations where the
+ * actual event flow is not enough to coordinate between client/debugger and extra
+ * sync mechanism is needed.
+ *
+ */
+void xe_eudebug_client_wait_stage(struct xe_eudebug_client *c, uint64_t stage)
+{
+ u64 stage_in;
+
+ if (c->done) {
+ igt_warn("client: %d already done before %lu\n", c->pid, stage);
+ return;
+ }
+
+ igt_debug("client: %d pausing for stage %lu\n", c->pid, stage);
+
+ stage_in = client_wait_token(c, CLIENT_STAGE);
+ igt_debug("client: %d received stage %lu, expected %lu\n", c->pid, stage_in, stage);
+
+ igt_assert_eq(stage_in, stage);
+}
+
+/**
+ * xe_eudebug_session_create:
+ * @fd: Xe file descriptor
+ * @work: function passed to the xe_eudebug_client_create
+ * @flags: flags passed to client and debugger
+ * @test_private: test's data, allocated with MAP_SHARED | MAP_ANONYMOUS,
+ * passed to client and debugger. Can be NULL.
+ *
+ * Creates session together with client and debugger structures.
+ */
+struct xe_eudebug_session *xe_eudebug_session_create(int fd,
+ xe_eudebug_client_work_fn work,
+ unsigned int flags,
+ void *test_private)
+{
+ struct xe_eudebug_session *s;
+
+ s = calloc(1, sizeof(*s));
+ igt_assert(s);
+
+ s->client = xe_eudebug_client_create(fd, work, flags, test_private);
+ s->debugger = xe_eudebug_debugger_create(fd, flags, test_private);
+ s->flags = flags;
+
+ return s;
+}
+
+/**
+ * xe_eudebug_session_run:
+ * @s: pointer to xe_eudebug_session structure
+ *
+ * Attaches the debugger to the client's process, starts the debugger's
+ * async event reader, starts the client and, once the client finishes,
+ * stops the debugger worker.
+ */
+void xe_eudebug_session_run(struct xe_eudebug_session *s)
+{
+ struct xe_eudebug_debugger *debugger = s->debugger;
+ struct xe_eudebug_client *client = s->client;
+
+ igt_assert_eq(xe_eudebug_debugger_attach(debugger, client), 0);
+
+ xe_eudebug_debugger_start_worker(debugger);
+
+ xe_eudebug_client_start(client);
+ xe_eudebug_client_wait_done(client);
+
+ xe_eudebug_debugger_stop_worker(debugger, 1);
+
+ xe_eudebug_event_log_print(debugger->log, true);
+ xe_eudebug_event_log_print(client->log, true);
+}
+
+/**
+ * xe_eudebug_session_check:
+ * @s: pointer to xe_eudebug_session structure
+ * @match_opposite: indicates whether check should match all
+ * create and destroy events.
+ * @filter: mask that represents events to be skipped during comparison, useful
+ * for events like 'VM_BIND' since they can be asymmetric
+ *
+ * Validates the debugger's log against the log created by the client.
+ */
+void xe_eudebug_session_check(struct xe_eudebug_session *s, bool match_opposite, uint32_t filter)
+{
+ xe_eudebug_event_log_compare(s->client->log, s->debugger->log, filter);
+
+ if (match_opposite)
+ xe_eudebug_event_log_match_opposite(s->debugger->log, filter);
+}
+
+/**
+ * xe_eudebug_session_destroy:
+ * @s: pointer to xe_eudebug_session structure
+ *
+ * Destroys the session together with its debugger and client.
+ */
+void xe_eudebug_session_destroy(struct xe_eudebug_session *s)
+{
+ xe_eudebug_debugger_destroy(s->debugger);
+ xe_eudebug_client_destroy(s->client);
+
+ free(s);
+}
+
+#define to_base(x) ((struct drm_xe_eudebug_event *)&(x))
+
+static void base_event(struct xe_eudebug_client *c,
+ struct drm_xe_eudebug_event *e,
+ uint32_t type,
+ uint32_t flags,
+ uint64_t size)
+{
+ e->type = type;
+ e->flags = flags;
+ e->seqno = xe_eudebug_client_get_seqno(c);
+ e->len = size;
+}
+
+static void client_event(struct xe_eudebug_client *c, uint32_t flags, int client_fd)
+{
+ struct drm_xe_eudebug_event_client ec;
+
+ base_event(c, to_base(ec), DRM_XE_EUDEBUG_EVENT_OPEN, flags, sizeof(ec));
+
+ ec.client_handle = client_fd;
+
+ xe_eudebug_event_log_write(c->log, (void *)&ec);
+}
+
+static void vm_event(struct xe_eudebug_client *c, uint32_t flags, int client_fd, uint32_t vm_id)
+{
+ struct drm_xe_eudebug_event_vm evm;
+
+ base_event(c, to_base(evm), DRM_XE_EUDEBUG_EVENT_VM, flags, sizeof(evm));
+
+ evm.client_handle = client_fd;
+ evm.vm_handle = vm_id;
+
+ xe_eudebug_event_log_write(c->log, (void *)&evm);
+}
+
+static void exec_queue_event(struct xe_eudebug_client *c, uint32_t flags,
+ int client_fd, uint32_t vm_id,
+ uint32_t exec_queue_handle, uint16_t class,
+ uint16_t width)
+{
+ struct drm_xe_eudebug_event_exec_queue ee;
+
+ base_event(c, to_base(ee), DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+ flags, sizeof(ee));
+
+ ee.client_handle = client_fd;
+ ee.vm_handle = vm_id;
+ ee.exec_queue_handle = exec_queue_handle;
+ ee.engine_class = class;
+ ee.width = width;
+
+ xe_eudebug_event_log_write(c->log, (void *)&ee);
+}
+
+static void metadata_event(struct xe_eudebug_client *c, uint32_t flags,
+ int client_fd, uint32_t id, uint64_t type, uint64_t len)
+{
+ struct drm_xe_eudebug_event_metadata em;
+
+ base_event(c, to_base(em), DRM_XE_EUDEBUG_EVENT_METADATA,
+ flags, sizeof(em));
+
+ em.client_handle = client_fd;
+ em.metadata_handle = id;
+ em.type = type;
+ em.len = len;
+
+ xe_eudebug_event_log_write(c->log, (void *)&em);
+}
+
+static int enable_getset(int fd, bool *old, bool *new)
+{
+ static const char * const fname = "enable_eudebug";
+ int ret = 0;
+ int sysfs, device_fd;
+ bool val_before;
+ struct stat st;
+
+ igt_assert(new || old);
+ igt_assert_eq(fstat(fd, &st), 0);
+
+ sysfs = igt_sysfs_open(fd);
+ if (sysfs < 0)
+ return -1;
+
+ device_fd = openat(sysfs, "device", O_DIRECTORY | O_RDONLY);
+ close(sysfs);
+ if (device_fd < 0)
+ return -1;
+
+ if (!__igt_sysfs_get_boolean(device_fd, fname, &val_before)) {
+ ret = -1;
+ goto out;
+ }
+
+ igt_debug("enable_eudebug before: %d\n", val_before);
+
+ if (old)
+ *old = val_before;
+
+ ret = 0;
+ if (new) {
+ if (__igt_sysfs_set_boolean(device_fd, fname, *new))
+ igt_assert_eq(igt_sysfs_get_boolean(device_fd, fname), *new);
+ else
+ ret = -1;
+ }
+
+out:
+ close(device_fd);
+
+ return ret;
+}
+
+/**
+ * xe_eudebug_enable:
+ * @fd: xe client
+ * @enable: state toggle - true to enable, false to disable
+ *
+ * Enables/disables eudebug capability by writing to
+ * '/sys/class/drm/card<N>/device/enable_eudebug' sysfs entry.
+ *
+ * Returns: previous toggle value, i.e. true when eudebugging was enabled,
+ * false when eudebugging was disabled.
+ */
+bool xe_eudebug_enable(int fd, bool enable)
+{
+ bool old = false;
+ int ret = enable_getset(fd, &old, &enable);
+
+ if (ret) {
+ igt_skip_on(enable);
+ old = false;
+ }
+
+ return old;
+}
+
+/* EU debugger wrappers around resource-creating xe ioctls. */
+
+/**
+ * xe_eudebug_client_open_driver:
+ * @c: pointer to xe_eudebug_client structure
+ *
+ * Calls drm_reopen_driver() and logs the corresponding
+ * event in client's event log.
+ *
+ * Returns: valid DRM file descriptor
+ */
+int xe_eudebug_client_open_driver(struct xe_eudebug_client *c)
+{
+ int fd;
+
+ igt_assert(c);
+ fd = drm_reopen_driver(c->master_fd);
+ client_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, fd);
+
+ return fd;
+}
+
+/**
+ * xe_eudebug_client_close_driver:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ *
+ * Closes the client fd and logs the corresponding event in
+ * client's event log.
+ */
+void xe_eudebug_client_close_driver(struct xe_eudebug_client *c, int fd)
+{
+ igt_assert(c);
+ client_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, fd);
+ close(fd);
+}
+
+/**
+ * xe_eudebug_client_vm_create:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @flags: vm create flags
+ * @ext: pointer to the first user extension
+ *
+ * Calls xe_vm_create() and logs the corresponding event
+ * in client's event log.
+ *
+ * Returns: valid vm handle
+ */
+uint32_t xe_eudebug_client_vm_create(struct xe_eudebug_client *c, int fd,
+ uint32_t flags, uint64_t ext)
+{
+ uint32_t vm;
+
+ igt_assert(c);
+ vm = xe_vm_create(fd, flags, ext);
+ vm_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, fd, vm);
+
+ return vm;
+}
+
+/**
+ * xe_eudebug_client_vm_destroy:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @vm: vm handle
+ *
+ * Calls xe_vm_destroy() and logs the corresponding event in
+ * client's event log.
+ */
+void xe_eudebug_client_vm_destroy(struct xe_eudebug_client *c, int fd, uint32_t vm)
+{
+ igt_assert(c);
+ xe_vm_destroy(fd, vm);
+ vm_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, fd, vm);
+}
+
+/**
+ * xe_eudebug_client_exec_queue_create:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @create: exec_queue create drm struct
+ *
+ * Calls xe exec queue create ioctl and logs the corresponding event in
+ * client's event log.
+ *
+ * Returns: valid exec queue handle
+ */
+uint32_t xe_eudebug_client_exec_queue_create(struct xe_eudebug_client *c, int fd,
+ struct drm_xe_exec_queue_create *create)
+{
+ struct drm_xe_ext_set_property *ext;
+ bool send = false;
+ uint16_t class;
+
+ igt_assert(c);
+ igt_assert(create);
+
+ class = ((struct drm_xe_engine_class_instance *)(create->instances))[0].engine_class;
+
+ igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_CREATE, create), 0);
+
+ for (ext = from_user_pointer(create->extensions); ext;
+ ext = from_user_pointer(ext->base.next_extension))
+ if (ext->base.name == DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY &&
+ ext->property == DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG &&
+ ext->value == DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)
+ send = true;
+
+ if (send)
+ exec_queue_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, fd, create->vm_id,
+ create->exec_queue_id, class, create->width);
+
+ return create->exec_queue_id;
+}
+
+/**
+ * xe_eudebug_client_exec_queue_destroy:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @create: exec_queue create drm struct which was used for creation
+ *
+ * Calls xe exec_queue destroy ioctl and logs the corresponding event in
+ * client's event log.
+ */
+void xe_eudebug_client_exec_queue_destroy(struct xe_eudebug_client *c, int fd,
+ struct drm_xe_exec_queue_create *create)
+{
+ struct drm_xe_exec_queue_destroy destroy = {};
+ struct drm_xe_ext_set_property *ext;
+ bool send = false;
+ uint16_t class;
+
+ igt_assert(c);
+ igt_assert(create);
+
+ destroy.exec_queue_id = create->exec_queue_id;
+ class = ((struct drm_xe_engine_class_instance *)(create->instances))[0].engine_class;
+
+ for (ext = from_user_pointer(create->extensions); ext;
+ ext = from_user_pointer(ext->base.next_extension))
+ if (ext->base.name == DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY &&
+ ext->property == DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG &&
+ ext->value == DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE)
+ send = true;
+
+ if (send)
+ exec_queue_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, fd, create->vm_id,
+ create->exec_queue_id, class, create->width);
+
+ igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_EXEC_QUEUE_DESTROY, &destroy), 0);
+}
+
+/**
+ * xe_eudebug_client_vm_bind_event:
+ * @c: pointer to xe_eudebug_client structure
+ * @event_flags: base event flags
+ * @fd: xe client
+ * @vm: vm handle
+ * @bind_flags: bind flags of vm_bind_event
+ * @num_binds: number of bind operations for the event
+ * @ref_seqno: base vm bind reference seqno
+ *
+ * Logs vm bind event in client's event log.
+ */
+void xe_eudebug_client_vm_bind_event(struct xe_eudebug_client *c,
+ uint32_t event_flags, int fd,
+ uint32_t vm, uint32_t bind_flags,
+ uint32_t num_binds, u64 *ref_seqno)
+{
+ struct drm_xe_eudebug_event_vm_bind evmb;
+
+ igt_assert(c);
+ igt_assert(ref_seqno);
+
+ base_event(c, to_base(evmb), DRM_XE_EUDEBUG_EVENT_VM_BIND,
+ event_flags, sizeof(evmb));
+ evmb.client_handle = fd;
+ evmb.vm_handle = vm;
+ evmb.flags = bind_flags;
+ evmb.num_binds = num_binds;
+
+ *ref_seqno = evmb.base.seqno;
+
+ xe_eudebug_event_log_write(c->log, (void *)&evmb);
+}
+
+/**
+ * xe_eudebug_client_vm_bind_op_event:
+ * @c: pointer to xe_eudebug_client structure
+ * @event_flags: base event flags
+ * @bind_ref_seqno: base vm bind reference seqno
+ * @op_ref_seqno: output, the vm_bind_op event seqno
+ * @addr: ppgtt address
+ * @range: size of the binding
+ * @num_extensions: number of vm bind op extensions
+ *
+ * Logs vm bind op event in client's event log.
+ */
+void xe_eudebug_client_vm_bind_op_event(struct xe_eudebug_client *c, uint32_t event_flags,
+ uint64_t bind_ref_seqno, uint64_t *op_ref_seqno,
+ uint64_t addr, uint64_t range,
+ uint64_t num_extensions)
+{
+ struct drm_xe_eudebug_event_vm_bind_op op;
+
+ igt_assert(c);
+ igt_assert(op_ref_seqno);
+
+ base_event(c, to_base(op), DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
+ event_flags, sizeof(op));
+ op.vm_bind_ref_seqno = bind_ref_seqno;
+ op.addr = addr;
+ op.range = range;
+ op.num_extensions = num_extensions;
+
+ *op_ref_seqno = op.base.seqno;
+
+ xe_eudebug_event_log_write(c->log, (void *)&op);
+}
+
+/**
+ * xe_eudebug_client_vm_bind_op_metadata_event:
+ * @c: pointer to xe_eudebug_client structure
+ * @event_flags: base event flags
+ * @op_ref_seqno: base vm bind op reference seqno
+ * @metadata_handle: metadata handle
+ * @metadata_cookie: metadata cookie
+ *
+ * Logs vm bind op metadata event in client's event log.
+ */
+void xe_eudebug_client_vm_bind_op_metadata_event(struct xe_eudebug_client *c,
+ uint32_t event_flags, uint64_t op_ref_seqno,
+ uint64_t metadata_handle, uint64_t metadata_cookie)
+{
+ struct drm_xe_eudebug_event_vm_bind_op_metadata op;
+
+ igt_assert(c);
+
+ base_event(c, to_base(op), DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA,
+ event_flags, sizeof(op));
+ op.vm_bind_op_ref_seqno = op_ref_seqno;
+ op.metadata_handle = metadata_handle;
+ op.metadata_cookie = metadata_cookie;
+
+ xe_eudebug_event_log_write(c->log, (void *)&op);
+}
+
+/**
+ * xe_eudebug_client_vm_bind_ufence_event:
+ * @c: pointer to xe_eudebug_client structure
+ * @event_flags: base event flags
+ * @ref_seqno: base vm bind event seqno
+ *
+ * Logs vm bind ufence event in client's event log.
+ */
+void xe_eudebug_client_vm_bind_ufence_event(struct xe_eudebug_client *c, uint32_t event_flags,
+ uint64_t ref_seqno)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence f;
+
+ igt_assert(c);
+
+ base_event(c, to_base(f), DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ event_flags, sizeof(f));
+ f.vm_bind_ref_seqno = ref_seqno;
+
+ xe_eudebug_event_log_write(c->log, (void *)&f);
+}
+
+static bool has_user_fence(const struct drm_xe_sync *sync, uint32_t num_syncs)
+{
+ while (num_syncs--)
+ if (sync[num_syncs].type == DRM_XE_SYNC_TYPE_USER_FENCE)
+ return true;
+
+ return false;
+}
+
+#define for_each_metadata(__m, __ext) \
+ for ((__m) = from_user_pointer(__ext); \
+ (__m); \
+ (__m) = from_user_pointer((__m)->base.next_extension)) \
+ if ((__m)->base.name == XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG)
+
+static int __xe_eudebug_client_vm_bind(struct xe_eudebug_client *c,
+ int fd, uint32_t vm, uint32_t exec_queue,
+ uint32_t bo, uint64_t offset,
+ uint64_t addr, uint64_t size,
+ uint32_t op, uint32_t flags,
+ struct drm_xe_sync *sync,
+ uint32_t num_syncs,
+ uint32_t prefetch_region,
+ uint8_t pat_index, uint64_t op_ext)
+{
+ struct drm_xe_vm_bind_op_ext_attach_debug *metadata;
+ const bool ufence = has_user_fence(sync, num_syncs);
+ const uint32_t bind_flags = ufence ?
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE : 0;
+ uint64_t seqno = 0, op_seqno = 0, num_metadata = 0;
+ uint32_t bind_base_flags = 0;
+ int ret;
+
+ for_each_metadata(metadata, op_ext)
+ num_metadata++;
+
+ switch (op) {
+ case DRM_XE_VM_BIND_OP_MAP:
+ bind_base_flags = DRM_XE_EUDEBUG_EVENT_CREATE;
+ break;
+ case DRM_XE_VM_BIND_OP_UNMAP:
+ bind_base_flags = DRM_XE_EUDEBUG_EVENT_DESTROY;
+ igt_assert_eq(num_metadata, 0);
+ igt_assert_eq(ufence, false);
+ break;
+ default:
+ /* XXX unmap all? */
+ igt_assert(op);
+ break;
+ }
+
+ ret = ___xe_vm_bind(fd, vm, exec_queue, bo, offset, addr, size,
+ op, flags, sync, num_syncs, prefetch_region,
+ pat_index, 0, op_ext);
+
+ if (ret)
+ return ret;
+
+ if (!bind_base_flags)
+ return -EINVAL;
+
+ xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
+ fd, vm, bind_flags, 1, &seqno);
+ xe_eudebug_client_vm_bind_op_event(c, bind_base_flags,
+ seqno, &op_seqno, addr, size,
+ num_metadata);
+
+ for_each_metadata(metadata, op_ext)
+ xe_eudebug_client_vm_bind_op_metadata_event(c,
+ DRM_XE_EUDEBUG_EVENT_CREATE,
+ op_seqno,
+ metadata->metadata_id,
+ metadata->cookie);
+ if (ufence)
+ xe_eudebug_client_vm_bind_ufence_event(c, DRM_XE_EUDEBUG_EVENT_CREATE |
+ DRM_XE_EUDEBUG_EVENT_NEED_ACK,
+ seqno);
+ return ret;
+}
+
+static void _xe_eudebug_client_vm_bind(struct xe_eudebug_client *c, int fd,
+ uint32_t vm, uint32_t bo,
+ uint64_t offset, uint64_t addr, uint64_t size,
+ uint32_t op,
+ uint32_t flags,
+ struct drm_xe_sync *sync,
+ uint32_t num_syncs,
+ uint64_t op_ext)
+{
+ const uint32_t exec_queue_id = 0;
+ const uint32_t prefetch_region = 0;
+
+ igt_assert_eq(__xe_eudebug_client_vm_bind(c, fd, vm, exec_queue_id, bo, offset,
+ addr, size, op, flags,
+ sync, num_syncs, prefetch_region,
+ DEFAULT_PAT_INDEX, op_ext),
+ 0);
+}
+
+/**
+ * xe_eudebug_client_vm_bind_flags:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @vm: vm handle
+ * @bo: buffer object handle
+ * @offset: offset within buffer object
+ * @addr: ppgtt address
+ * @size: size of the binding
+ * @flags: vm_bind flags
+ * @sync: sync objects
+ * @num_syncs: number of sync objects
+ * @op_ext: BIND_OP extensions
+ *
+ * Calls xe vm_bind ioctl and logs the corresponding event in client's event log.
+ */
+void xe_eudebug_client_vm_bind_flags(struct xe_eudebug_client *c, int fd, uint32_t vm,
+ uint32_t bo, uint64_t offset,
+ uint64_t addr, uint64_t size, uint32_t flags,
+ struct drm_xe_sync *sync, uint32_t num_syncs,
+ uint64_t op_ext)
+{
+ igt_assert(c);
+ _xe_eudebug_client_vm_bind(c, fd, vm, bo, offset, addr, size,
+ DRM_XE_VM_BIND_OP_MAP, flags,
+ sync, num_syncs, op_ext);
+}
+
+/**
+ * xe_eudebug_client_vm_bind:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @vm: vm handle
+ * @bo: buffer object handle
+ * @offset: offset within buffer object
+ * @addr: ppgtt address
+ * @size: size of the binding
+ *
+ * Calls xe vm_bind ioctl and logs the corresponding event in client's event log.
+ */
+void xe_eudebug_client_vm_bind(struct xe_eudebug_client *c, int fd, uint32_t vm,
+ uint32_t bo, uint64_t offset,
+ uint64_t addr, uint64_t size)
+{
+ const uint32_t flags = 0;
+ struct drm_xe_sync *sync = NULL;
+ const uint32_t num_syncs = 0;
+ const uint64_t op_ext = 0;
+
+ xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, offset, addr, size, flags, sync, num_syncs,
+ op_ext);
+}
+
+/**
+ * xe_eudebug_client_vm_unbind_flags:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @vm: vm handle
+ * @offset: offset
+ * @addr: ppgtt address
+ * @size: size of the binding
+ * @flags: vm_bind flags
+ * @sync: sync objects
+ * @num_syncs: number of sync objects
+ *
+ * Calls xe vm_unbind ioctl and logs the corresponding event in client's event log.
+ */
+void xe_eudebug_client_vm_unbind_flags(struct xe_eudebug_client *c, int fd,
+ uint32_t vm, uint64_t offset,
+ uint64_t addr, uint64_t size, uint32_t flags,
+ struct drm_xe_sync *sync, uint32_t num_syncs)
+{
+ igt_assert(c);
+ _xe_eudebug_client_vm_bind(c, fd, vm, 0, offset, addr, size,
+ DRM_XE_VM_BIND_OP_UNMAP, flags,
+ sync, num_syncs, 0);
+}
+
+/**
+ * xe_eudebug_client_vm_unbind:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @vm: vm handle
+ * @offset: offset
+ * @addr: ppgtt address
+ * @size: size of the binding
+ *
+ * Calls xe vm_unbind ioctl and logs the corresponding event in client's event log.
+ */
+void xe_eudebug_client_vm_unbind(struct xe_eudebug_client *c, int fd, uint32_t vm,
+ uint64_t offset, uint64_t addr, uint64_t size)
+{
+ const uint32_t flags = 0;
+ struct drm_xe_sync *sync = NULL;
+ const uint32_t num_syncs = 0;
+
+ xe_eudebug_client_vm_unbind_flags(c, fd, vm, offset, addr, size,
+ flags, sync, num_syncs);
+}
+
+/**
+ * xe_eudebug_client_metadata_create:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @type: debug metadata type
+ * @len: size of @data
+ * @data: debug metadata payload
+ *
+ * Calls xe metadata create ioctl and logs the corresponding event in
+ * client's event log.
+ *
+ * Returns: valid debug metadata id.
+ */
+uint32_t xe_eudebug_client_metadata_create(struct xe_eudebug_client *c, int fd,
+ int type, size_t len, void *data)
+{
+ struct drm_xe_debug_metadata_create create = {
+ .type = type,
+ .user_addr = to_user_pointer(data),
+ .len = len
+ };
+
+ igt_assert(c);
+ igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEBUG_METADATA_CREATE, &create), 0);
+
+ metadata_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, fd, create.metadata_id, type, len);
+
+ return create.metadata_id;
+}
+
+/**
+ * xe_eudebug_client_metadata_destroy:
+ * @c: pointer to xe_eudebug_client structure
+ * @fd: xe client
+ * @id: xe debug metadata handle
+ * @type: debug metadata type
+ * @len: size of debug metadata payload
+ *
+ * Calls xe metadata destroy ioctl and logs the corresponding event in
+ * client's event log.
+ */
+void xe_eudebug_client_metadata_destroy(struct xe_eudebug_client *c, int fd,
+ uint32_t id, int type, size_t len)
+{
+ struct drm_xe_debug_metadata_destroy destroy = { .metadata_id = id };
+
+ igt_assert(c);
+ igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEBUG_METADATA_DESTROY, &destroy), 0);
+
+ metadata_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, fd, id, type, len);
+}
+
+void xe_eudebug_ack_ufence(int debugfd,
+ const struct drm_xe_eudebug_event_vm_bind_ufence *f)
+{
+ struct drm_xe_eudebug_ack_event ack = { 0, };
+ char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
+
+ igt_assert(f);
+
+ ack.type = f->base.type;
+ ack.seqno = f->base.seqno;
+
+ xe_eudebug_event_to_str((void *)f, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
+ igt_debug("delivering ack for event: %s\n", event_str);
+ igt_assert_eq(igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_ACK_EVENT, &ack), 0);
+}
diff --git a/lib/xe/xe_eudebug.h b/lib/xe/xe_eudebug.h
new file mode 100644
index 000000000..e32bc5a3c
--- /dev/null
+++ b/lib/xe/xe_eudebug.h
@@ -0,0 +1,218 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+#include <fcntl.h>
+#include <pthread.h>
+#include <stdint.h>
+#include <xe_drm.h>
+#include <xe_drm_eudebug.h>
+
+#include "igt_list.h"
+
+struct xe_eudebug_event_log {
+ uint8_t *log;
+ unsigned int head;
+ unsigned int max_size;
+ char name[80];
+ pthread_mutex_t lock;
+};
+
+enum xe_eudebug_debugger_worker_state {
+ DEBUGGER_WORKER_INACTIVE = 0,
+ DEBUGGER_WORKER_ACTIVE,
+ DEBUGGER_WORKER_QUITTING,
+};
+
+struct xe_eudebug_debugger {
+ int fd;
+ uint64_t flags;
+
+ /* Used to smuggle private data */
+ void *ptr;
+
+ struct xe_eudebug_event_log *log;
+
+ uint64_t event_count;
+
+ uint64_t target_pid;
+
+ struct igt_list_head triggers;
+
+ int master_fd;
+
+ pthread_t worker_thread;
+ enum xe_eudebug_debugger_worker_state worker_state;
+
+ int p_client[2];
+};
+
+struct xe_eudebug_client {
+ int pid;
+ uint64_t seqno;
+ uint64_t flags;
+
+ /* Used to smuggle private data */
+ void *ptr;
+
+ struct xe_eudebug_event_log *log;
+
+ int done;
+ int p_in[2];
+ int p_out[2];
+
+ /* Used to pick the right device (the one used in the debugger) */
+ int master_fd;
+
+ int timeout_ms;
+};
+
+struct xe_eudebug_session {
+ uint64_t flags;
+ struct xe_eudebug_client *client;
+ struct xe_eudebug_debugger *debugger;
+};
+
+typedef void (*xe_eudebug_client_work_fn)(struct xe_eudebug_client *);
+typedef void (*xe_eudebug_trigger_fn)(struct xe_eudebug_debugger *,
+ struct drm_xe_eudebug_event *);
+
+#define xe_eudebug_for_each_engine(fd__, hwe__) \
+ xe_for_each_engine(fd__, hwe__) \
+ if (hwe__->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
+ hwe__->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE)
+
+#define xe_eudebug_for_each_event(_e, _log) \
+ for ((_e) = (_e) ? (void *)(uint8_t *)(_e) + (_e)->len : \
+ (void *)(_log)->log; \
+ (uint8_t *)(_e) < (_log)->log + (_log)->head; \
+ (_e) = (void *)(uint8_t *)(_e) + (_e)->len)
+
+#define xe_eudebug_assert(d, c) \
+ do { \
+ if (!(c)) { \
+ xe_eudebug_event_log_print((d)->log, true); \
+ igt_assert(c); \
+ } \
+ } while (0)
+
+#define xe_eudebug_assert_f(d, c, f...) \
+ do { \
+ if (!(c)) { \
+ xe_eudebug_event_log_print((d)->log, true); \
+ igt_assert_f(c, f); \
+ } \
+ } while (0)
+
+#define XE_EUDEBUG_EVENT_STRING_MAX_LEN 4096
+
+/*
+ * Default abort timeout to use across xe_eudebug lib and tests if no specific
+ * timeout value is required.
+ */
+#define XE_EUDEBUG_DEFAULT_TIMEOUT_SEC 25ULL
+
+#define XE_EUDEBUG_FILTER_EVENT_NONE BIT(DRM_XE_EUDEBUG_EVENT_NONE)
+#define XE_EUDEBUG_FILTER_EVENT_READ BIT(DRM_XE_EUDEBUG_EVENT_READ)
+#define XE_EUDEBUG_FILTER_EVENT_OPEN BIT(DRM_XE_EUDEBUG_EVENT_OPEN)
+#define XE_EUDEBUG_FILTER_EVENT_VM BIT(DRM_XE_EUDEBUG_EVENT_VM)
+#define XE_EUDEBUG_FILTER_EVENT_EXEC_QUEUE BIT(DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
+#define XE_EUDEBUG_FILTER_EVENT_EU_ATTENTION BIT(DRM_XE_EUDEBUG_EVENT_EU_ATTENTION)
+#define XE_EUDEBUG_FILTER_EVENT_VM_BIND BIT(DRM_XE_EUDEBUG_EVENT_VM_BIND)
+#define XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP BIT(DRM_XE_EUDEBUG_EVENT_VM_BIND_OP)
+#define XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE BIT(DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE)
+#define XE_EUDEBUG_FILTER_ALL GENMASK(DRM_XE_EUDEBUG_EVENT_MAX_EVENT, 0)
+#define XE_EUDEBUG_EVENT_IS_FILTERED(_e, _f) ((1UL << (_e)) & (_f))
+
+int xe_eudebug_connect(int fd, pid_t pid, uint32_t flags);
+const char *xe_eudebug_event_to_str(struct drm_xe_eudebug_event *e, char *buf, size_t len);
+struct drm_xe_eudebug_event *
+xe_eudebug_event_log_find_seqno(struct xe_eudebug_event_log *l, uint64_t seqno);
+struct xe_eudebug_event_log *
+xe_eudebug_event_log_create(const char *name, unsigned int max_size);
+void xe_eudebug_event_log_destroy(struct xe_eudebug_event_log *l);
+void xe_eudebug_event_log_print(struct xe_eudebug_event_log *l, bool debug);
+void xe_eudebug_event_log_compare(struct xe_eudebug_event_log *c, struct xe_eudebug_event_log *d,
+ uint32_t filter);
+void xe_eudebug_event_log_write(struct xe_eudebug_event_log *l, struct drm_xe_eudebug_event *e);
+void xe_eudebug_event_log_match_opposite(struct xe_eudebug_event_log *l, uint32_t filter);
+
+bool xe_eudebug_debugger_available(int fd);
+struct xe_eudebug_debugger *
+xe_eudebug_debugger_create(int xe, uint64_t flags, void *data);
+void xe_eudebug_debugger_destroy(struct xe_eudebug_debugger *d);
+int xe_eudebug_debugger_attach(struct xe_eudebug_debugger *d, struct xe_eudebug_client *c);
+void xe_eudebug_debugger_start_worker(struct xe_eudebug_debugger *d);
+void xe_eudebug_debugger_stop_worker(struct xe_eudebug_debugger *d, int timeout_s);
+void xe_eudebug_debugger_detach(struct xe_eudebug_debugger *d);
+void xe_eudebug_debugger_set_data(struct xe_eudebug_debugger *c, void *ptr);
+void xe_eudebug_debugger_add_trigger(struct xe_eudebug_debugger *d, int type,
+ xe_eudebug_trigger_fn fn);
+void xe_eudebug_debugger_signal_stage(struct xe_eudebug_debugger *d, uint64_t stage);
+void xe_eudebug_debugger_wait_stage(struct xe_eudebug_session *s, uint64_t stage);
+
+struct xe_eudebug_client *
+xe_eudebug_client_create(int xe, xe_eudebug_client_work_fn work, uint64_t flags, void *data);
+void xe_eudebug_client_destroy(struct xe_eudebug_client *c);
+void xe_eudebug_client_start(struct xe_eudebug_client *c);
+void xe_eudebug_client_stop(struct xe_eudebug_client *c);
+void xe_eudebug_client_wait_done(struct xe_eudebug_client *c);
+void xe_eudebug_client_signal_stage(struct xe_eudebug_client *c, uint64_t stage);
+void xe_eudebug_client_wait_stage(struct xe_eudebug_client *c, uint64_t stage);
+
+uint64_t xe_eudebug_client_get_seqno(struct xe_eudebug_client *c);
+void xe_eudebug_client_set_data(struct xe_eudebug_client *c, void *ptr);
+
+bool xe_eudebug_enable(int fd, bool enable);
+
+int xe_eudebug_client_open_driver(struct xe_eudebug_client *c);
+void xe_eudebug_client_close_driver(struct xe_eudebug_client *c, int fd);
+uint32_t xe_eudebug_client_vm_create(struct xe_eudebug_client *c, int fd,
+ uint32_t flags, uint64_t ext);
+void xe_eudebug_client_vm_destroy(struct xe_eudebug_client *c, int fd, uint32_t vm);
+uint32_t xe_eudebug_client_exec_queue_create(struct xe_eudebug_client *c, int fd,
+ struct drm_xe_exec_queue_create *create);
+void xe_eudebug_client_exec_queue_destroy(struct xe_eudebug_client *c, int fd,
+ struct drm_xe_exec_queue_create *create);
+void xe_eudebug_client_vm_bind_event(struct xe_eudebug_client *c, uint32_t event_flags, int fd,
+ uint32_t vm, uint32_t bind_flags,
+ uint32_t num_binds, uint64_t *ref_seqno);
+void xe_eudebug_client_vm_bind_op_event(struct xe_eudebug_client *c, uint32_t event_flags,
+ uint64_t ref_seqno, uint64_t *op_ref_seqno,
+ uint64_t addr, uint64_t range,
+ uint64_t num_extensions);
+void xe_eudebug_client_vm_bind_op_metadata_event(struct xe_eudebug_client *c,
+ uint32_t event_flags, uint64_t op_ref_seqno,
+ uint64_t metadata_handle, uint64_t metadata_cookie);
+void xe_eudebug_client_vm_bind_ufence_event(struct xe_eudebug_client *c, uint32_t event_flags,
+ uint64_t ref_seqno);
+void xe_eudebug_ack_ufence(int debugfd,
+ const struct drm_xe_eudebug_event_vm_bind_ufence *f);
+
+void xe_eudebug_client_vm_bind_flags(struct xe_eudebug_client *c, int fd, uint32_t vm,
+ uint32_t bo, uint64_t offset,
+ uint64_t addr, uint64_t size, uint32_t flags,
+ struct drm_xe_sync *sync, uint32_t num_syncs,
+ uint64_t op_ext);
+void xe_eudebug_client_vm_bind(struct xe_eudebug_client *c, int fd, uint32_t vm,
+ uint32_t bo, uint64_t offset,
+ uint64_t addr, uint64_t size);
+void xe_eudebug_client_vm_unbind_flags(struct xe_eudebug_client *c, int fd,
+ uint32_t vm, uint64_t offset,
+ uint64_t addr, uint64_t size, uint32_t flags,
+ struct drm_xe_sync *sync, uint32_t num_syncs);
+void xe_eudebug_client_vm_unbind(struct xe_eudebug_client *c, int fd, uint32_t vm,
+ uint64_t offset, uint64_t addr, uint64_t size);
+
+uint32_t xe_eudebug_client_metadata_create(struct xe_eudebug_client *c, int fd,
+ int type, size_t len, void *data);
+void xe_eudebug_client_metadata_destroy(struct xe_eudebug_client *c, int fd,
+ uint32_t id, int type, size_t len);
+
+struct xe_eudebug_session *xe_eudebug_session_create(int fd,
+ xe_eudebug_client_work_fn work,
+ unsigned int flags,
+ void *test_private);
+void xe_eudebug_session_destroy(struct xe_eudebug_session *s);
+void xe_eudebug_session_run(struct xe_eudebug_session *s);
+void xe_eudebug_session_check(struct xe_eudebug_session *s, bool match_opposite, uint32_t filter);
diff --git a/meson.build b/meson.build
index f67655367..0d06721b4 100644
--- a/meson.build
+++ b/meson.build
@@ -90,9 +90,11 @@ build_chamelium = get_option('chamelium')
build_docs = get_option('docs')
build_tests = not get_option('tests').disabled()
build_xe = not get_option('xe_driver').disabled()
+build_xe_eudebug = get_option('xe_eudebug').enabled()
with_libdrm = get_option('libdrm_drivers')
build_info = ['Build type: ' + get_option('buildtype')]
+build_info += 'Build Xe EU debugger test framework: @0@'.format(build_xe_eudebug)
inc = include_directories('include', 'include/drm-uapi', 'include/linux-uapi', 'lib', 'lib/stubs/syscalls', '.')
diff --git a/meson_options.txt b/meson_options.txt
index 6a9493ea6..11922523b 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -42,6 +42,11 @@ option('xe_driver',
value : 'enabled',
description : 'Build tests for Xe driver')
+option('xe_eudebug',
+ type : 'feature',
+ value : 'disabled',
+ description : 'Build library for Xe EU debugger')
+
option('libdrm_drivers',
type : 'array',
value : ['auto'],
--
2.34.1
* Re: [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework
2024-09-05 9:28 ` [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework Christoph Manszewski
@ 2024-09-09 8:46 ` Zbigniew Kempczyński
2024-09-13 15:14 ` Manszewski, Christoph
2024-09-10 5:32 ` Zbigniew Kempczyński
1 sibling, 1 reply; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-09 8:46 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala
On Thu, Sep 05, 2024 at 11:28:06AM +0200, Christoph Manszewski wrote:
> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>
> Introduce a library which simplifies testing of the eu debug capability.
> The library provides event log helpers together with an asynchronous
> abstraction for the client process and the debugger itself.
>
> xe_eudebug_client creates its own process with the user's work function,
> and provides mechanisms to synchronize the beginning of execution and
> event logging.
>
> xe_eudebug_debugger allows attaching to the given process, provides an
> asynchronous thread for event reading, and introduces triggers - a
> callback mechanism invoked every time a subscribed event is read.
>
> To build the eudebug testing framework, the 'xe_eudebug' meson build
> option has to be enabled, as it is disabled by default.
>
> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuaoppala@linux.intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> ---
> lib/meson.build | 5 +
> lib/xe/xe_eudebug.c | 2249 +++++++++++++++++++++++++++++++++++++++++++
> lib/xe/xe_eudebug.h | 218 +++++
> meson.build | 2 +
> meson_options.txt | 5 +
> 5 files changed, 2479 insertions(+)
> create mode 100644 lib/xe/xe_eudebug.c
> create mode 100644 lib/xe/xe_eudebug.h
That's my final review. I left only remarks on things which may/should be
addressed in the future.
> +static int safe_pipe_read(int pipe[2], void *buf, int nbytes, int timeout_ms)
I wonder what the point is of passing the pipe array instead of just one fd.
fd[0] is always for read, so passing fd[1] here doesn't make sense (apart
from int pipe[2] signalling that there is a pipe).
> +{
> + int ret;
> + int t = 0;
> + struct pollfd fd = {
> + .fd = pipe[0],
> + .events = POLLIN,
> + .revents = 0
> + };
> +
> + /* When child fails we may get stuck forever. Check whether
> + * the child process ended with an error.
> + */
> + do {
> + const int interval_ms = 1000;
> +
> + ret = poll(&fd, 1, interval_ms);
> +
> + if (!ret) {
> + catch_child_failure();
> + t += interval_ms;
> + }
> + } while (!ret && t < timeout_ms);
> +
> + if (ret > 0)
> + return read(pipe[0], buf, nbytes);
> +
> + return 0;
> +}
> +
> +static uint64_t pipe_read(int pipe[2], int timeout_ms)
> +{
> + uint64_t in;
> + uint64_t ret;
> +
> + ret = safe_pipe_read(pipe, &in, sizeof(in), timeout_ms);
> + igt_assert(ret == sizeof(in));
> +
> + return in;
> +}
> +
> +static void pipe_signal(int pipe[2], uint64_t token)
> +{
> + igt_assert(write(pipe[1], &token, sizeof(token)) == sizeof(token));
> +}
> +
> +static void pipe_close(int pipe[2])
> +{
> + if (pipe[0] != -1)
> + close(pipe[0]);
> +
> + if (pipe[1] != -1)
> + close(pipe[1]);
Just close(pipe[0]); close(pipe[1]); is enough.
> +}
> +
> +static uint64_t __wait_token(int p[2], const uint64_t token, int timeout_ms)
s/p[2]/pipe[2]/ for consistency.
> +{
> + uint64_t in;
> +
> + in = pipe_read(p, timeout_ms);
> +
> + igt_assert_eq(in, token);
> +
> + return pipe_read(p, timeout_ms);
> +}
> +
> +static uint64_t client_wait_token(struct xe_eudebug_client *c, const uint64_t token)
> +{
> + return __wait_token(c->p_in, token, c->timeout_ms);
> +}
> +
> +static uint64_t wait_from_client(struct xe_eudebug_client *c, const uint64_t token)
> +{
> + return __wait_token(c->p_out, token, c->timeout_ms);
p_in[2] and p_out[2] are weird names, especially since p_in[1] is for writing
to the pipe and p_out[0] is for reading from the pipe. The naming is
confusing for the reader; waiting on p_out is misleading.
> +}
> +
> +static void token_signal(int p[2], const uint64_t token, const uint64_t value)
> +{
Same here, p[2] -> pipe[2].
> + pipe_signal(p, token);
> + pipe_signal(p, value);
> +}
> +
> +static void client_signal(struct xe_eudebug_client *c,
> + const uint64_t token,
> + const uint64_t value)
> +{
> + token_signal(c->p_out, token, value);
> +}
> +
> +static int __xe_eudebug_connect(int fd, pid_t pid, uint32_t flags, uint64_t events)
> +{
> + struct drm_xe_eudebug_connect param = {
> + .pid = pid,
> + .flags = flags,
> + };
> + int debugfd;
> +
> + debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, ¶m);
> +
> + if (debugfd < 0)
> + return -errno;
> +
> + return debugfd;
> +}
> +
> +static void event_log_write_to_fd(struct xe_eudebug_event_log *l, int fd)
> +{
> + igt_assert_eq(write(fd, &l->head, sizeof(l->head)),
> + sizeof(l->head));
> +
> + igt_assert_eq(write(fd, l->log, l->head), l->head);
> +}
> +
> +static void read_all(int fd, void *buf, size_t nbytes)
> +{
> + ssize_t remaining_size = nbytes;
> + ssize_t current_size = 0;
> + ssize_t read_size = 0;
> +
> + do {
> + read_size = read(fd, buf + current_size, remaining_size);
> + igt_assert_f(read_size >= 0, "read failed: %s\n", strerror(errno));
> +
> + current_size += read_size;
> + remaining_size -= read_size;
> + } while (remaining_size > 0 && read_size > 0);
> +
> + igt_assert_eq(current_size, nbytes);
> +}
> +
> +static void event_log_read_from_fd(struct xe_eudebug_event_log *l, int fd)
> +{
> + read_all(fd, &l->head, sizeof(l->head));
> + igt_assert_lt(l->head, l->max_size);
Instead of asserting, the log may be reallocated here. I mean something
like this should be enough:
if (l->head > l->max_size) {
	l->max_size += MAX_SIZE;
	l->log = realloc(l->log, l->max_size);
	igt_assert(l->log);
}
Anyway, I'm not happy with how log keeping has been implemented here.
This is however not a blocker for merging the series. I understand
how convenient it is to write the whole log from the client to the main
process to compare against what was noticed from the debugger side.
> +
> + read_all(fd, l->log, l->head);
> +}
> +
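A minimal sketch of that suggestion as a complete read helper, using a
simplified stand-in for the log struct (the realloc-on-demand behavior is
the proposal here, not what the patch currently does):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Simplified stand-in for struct xe_eudebug_event_log. */
struct sketch_log {
	uint64_t head;
	uint64_t max_size;
	uint8_t *log;
};

/* Read a length-prefixed log from fd, growing the buffer on demand
 * instead of asserting head < max_size. Returns 0 on success. */
static int sketch_log_read_from_fd(struct sketch_log *l, int fd)
{
	uint64_t off = 0;
	ssize_t got;

	if (read(fd, &l->head, sizeof(l->head)) != sizeof(l->head))
		return -1;

	if (l->head > l->max_size) {
		uint8_t *p = realloc(l->log, l->head);

		if (!p)
			return -1;
		l->log = p;
		l->max_size = l->head;
	}

	while (off < l->head) {
		got = read(fd, l->log + off, l->head - off);
		if (got <= 0)
			return -1;
		off += got;
	}

	return 0;
}
```

With this the receiver no longer has to guess the sender's log size up
front; the assert becomes unnecessary.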
<cut>
> +/**
> + * xe_eudebug_event_log_find_seqno:
> + * @l: event log pointer
> + * @seqno: seqno of event to be found
> + *
> + * Finds the event with given seqno in the event log.
> + *
> + * Returns: pointer to the event with given seqno within @l or NULL if
> + * @seqno is not present.
> + */
> +struct drm_xe_eudebug_event *
> +xe_eudebug_event_log_find_seqno(struct xe_eudebug_event_log *l, uint64_t seqno)
> +{
> + struct drm_xe_eudebug_event *e = NULL, *found = NULL;
> +
> + igt_assert(l);
> + igt_assert_neq(seqno, 0);
> + /*
> + * Try to catch if seqno is corrupted and prevent too long tests,
> + * as our post processing of events is not optimized.
> + */
> + igt_assert_lt(seqno, 10 * 1000 * 1000);
> +
> + xe_eudebug_for_each_event(e, l) {
> + if (e->seqno == seqno) {
> + if (found) {
> + igt_warn("Found multiple events with the same seqno %lu\n", seqno);
> + xe_eudebug_event_log_print(l, false);
> + igt_assert(!found);
> + }
> + found = e;
> + }
> + }
> +
> + return found;
> +}
> +
> +static void event_log_sort(struct xe_eudebug_event_log *l)
> +{
> + struct xe_eudebug_event_log *tmp;
> + struct drm_xe_eudebug_event *e = NULL;
> + uint64_t first_seqno = UINT64_MAX;
> + uint64_t last_seqno = 0;
> + uint64_t events = 0, added = 0;
> + uint64_t i;
> +
> + xe_eudebug_for_each_event(e, l) {
> + if (e->seqno > last_seqno)
> + last_seqno = e->seqno;
> +
> + if (e->seqno < first_seqno)
> + first_seqno = e->seqno;
> +
> + events++;
> + }
The above code suggests this function is called many times during
test execution (scanning first/last seqno), but it runs only once, on
test completion. Confusing for a first-time reader.
> +
> + tmp = xe_eudebug_event_log_create("tmp", l->max_size);
> +
> + for (i = first_seqno; i <= last_seqno; i++) {
> + e = xe_eudebug_event_log_find_seqno(l, i);
> + if (e) {
> + xe_eudebug_event_log_write(tmp, e);
> + added++;
> + }
> + }
> +
> + igt_assert_eq(events, added);
> + igt_assert_eq(tmp->head, l->head);
> +
> + memcpy(l->log, tmp->log, tmp->head);
> +
> + xe_eudebug_event_log_destroy(tmp);
<cut>
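As a possible future optimization of the seqno scan above, an
index-plus-qsort pass sorts in O(n log n) instead of walking the whole
seqno range. A hedged sketch with a simplified, hypothetical event layout
(the real drm_xe_eudebug_event header differs):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified event: seqno + total length, then payload. */
struct sketch_event {
	uint64_t seqno;
	uint32_t len; /* total size of this event, header included */
	uint8_t payload[];
};

static int cmp_seqno(const void *a, const void *b)
{
	const struct sketch_event *ea = *(const struct sketch_event **)a;
	const struct sketch_event *eb = *(const struct sketch_event **)b;

	return ea->seqno < eb->seqno ? -1 : ea->seqno > eb->seqno;
}

/* Sort a packed log of variable-length events by seqno: build an index
 * of event pointers, qsort it, then rewrite the log in order. */
static void sketch_log_sort(uint8_t *log, size_t head, size_t nevents)
{
	const struct sketch_event **idx = malloc(nevents * sizeof(*idx));
	uint8_t *tmp = malloc(head);
	size_t i = 0, off = 0;

	assert(idx && tmp);

	/* Index pass over the packed buffer. */
	while (off < head) {
		idx[i++] = (const struct sketch_event *)(log + off);
		off += ((const struct sketch_event *)(log + off))->len;
	}
	assert(i == nevents);

	qsort(idx, nevents, sizeof(*idx), cmp_seqno);

	/* Rewrite in seqno order. */
	for (i = 0, off = 0; i < nevents; i++) {
		memcpy(tmp + off, idx[i], idx[i]->len);
		off += idx[i]->len;
	}
	memcpy(log, tmp, head);
	free(tmp);
	free(idx);
}
```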
> +
> +/**
> + * xe_eudebug_event_log_create:
> + * @name: event log identifier
> + * @max_size: maximum size of created log
> + *
> + * Function creates an Eu Debugger event log with size equal to @max_size.
> + *
> + * Returns: pointer to just created log
> + */
> +#define MAX_EVENT_LOG_SIZE (32 * 1024 * 1024)
> +struct xe_eudebug_event_log *xe_eudebug_event_log_create(const char *name, unsigned int max_size)
> +{
> + struct xe_eudebug_event_log *l;
> +
> + igt_assert(name);
> +
> + l = calloc(1, sizeof(*l));
> + igt_assert(l);
> + l->log = calloc(1, max_size);
> + igt_assert(l->log);
> + l->max_size = max_size;
> + strncpy(l->name, name, sizeof(l->name) - 1);
> + pthread_mutex_init(&l->lock, NULL);
> +
> + return l;
> +}
> +
> +/**
> + * xe_eudebug_event_log_destroy:
> + * @l: event log pointer
> + *
> + * Frees given event log @l.
> + */
> +void xe_eudebug_event_log_destroy(struct xe_eudebug_event_log *l)
> +{
> + igt_assert(l);
> + pthread_mutex_destroy(&l->lock);
> + free(l->log);
> + free(l);
> +}
> +
> +/**
> + * xe_eudebug_event_log_write:
> + * @l: event log pointer
> + * @e: event to be written to event log
> + *
> + * Writes event @e to the event log, thread-safe.
> + */
> +void xe_eudebug_event_log_write(struct xe_eudebug_event_log *l, struct drm_xe_eudebug_event *e)
> +{
> + igt_assert(l);
> + igt_assert(e);
> + igt_assert(e->seqno);
> + /*
> + * Try to catch if seqno is corrupted and prevent too long tests,
> + * as our post processing of events is not optimized.
> + */
> + igt_assert_lt(e->seqno, 10 * 1000 * 1000);
> +
> + pthread_mutex_lock(&l->lock);
> + igt_assert_lt(l->head + e->len, l->max_size);
Similar to the above code, reallocating may be added here.
> + memcpy(l->log + l->head, e, e->len);
> + l->head += e->len;
> + pthread_mutex_unlock(&l->lock);
> +}
<cut>
> +
> +/**
> + * xe_eudebug_debugger_stop_worker:
> + * @d: pointer to the debugger
> + *
> + * Stops the debugger worker. Event log is sorted by seqno after closure.
> + */
> +void xe_eudebug_debugger_stop_worker(struct xe_eudebug_debugger *d,
> + int timeout_s)
> +{
> + struct timespec t = {};
> + int ret;
> +
> + igt_assert_neq(d->worker_state, DEBUGGER_WORKER_INACTIVE);
> +
> + d->worker_state = DEBUGGER_WORKER_QUITTING; /* First time be polite. */
> + igt_assert_eq(clock_gettime(CLOCK_REALTIME, &t), 0);
> + t.tv_sec += timeout_s;
> +
> + ret = pthread_timedjoin_np(d->worker_thread, NULL, &t);
> +
> + if (ret == ETIMEDOUT) {
> + d->worker_state = DEBUGGER_WORKER_INACTIVE;
> + ret = pthread_join(d->worker_thread, NULL);
It's possible we get stuck here forever (until the runner kills the test).
And I don't like that the caller is setting INACTIVE instead of the thread,
which should do that after noticing the QUITTING state.
> + }
> +
> + igt_assert_f(ret == 0 || ret != ESRCH,
> + "pthread join failed with error %d!\n", ret);
> +
> + event_log_sort(d->log);
> +}
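One way to address both concerns above (the unbounded second join, and
the caller forcing INACTIVE) is to keep every join bounded and let the
worker publish INACTIVE itself. A rough sketch with hypothetical names;
the real worker also reads eudebug events, which is omitted here, and
pthread_timedjoin_np() is a GNU extension:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>
#include <unistd.h>

enum { WORKER_ACTIVE, WORKER_QUITTING, WORKER_INACTIVE };

struct sketch_worker {
	pthread_t thread;
	_Atomic int state;
};

static void *sketch_worker_fn(void *arg)
{
	struct sketch_worker *w = arg;

	/* Poll for the QUITTING request; the thread itself moves to
	 * INACTIVE, so the caller never has to force the state. */
	while (atomic_load(&w->state) == WORKER_ACTIVE)
		usleep(1000);

	atomic_store(&w->state, WORKER_INACTIVE);
	return NULL;
}

static int sketch_worker_start(struct sketch_worker *w)
{
	atomic_store(&w->state, WORKER_ACTIVE);
	return pthread_create(&w->thread, NULL, sketch_worker_fn, w);
}

static int sketch_worker_stop(struct sketch_worker *w, int timeout_s)
{
	struct timespec t;

	atomic_store(&w->state, WORKER_QUITTING); /* be polite first */
	clock_gettime(CLOCK_REALTIME, &t);
	t.tv_sec += timeout_s;

	/* If the worker ignores the request, the join times out and the
	 * caller can escalate (e.g. pthread_cancel) instead of blocking
	 * forever in a plain pthread_join. */
	return pthread_timedjoin_np(w->thread, NULL, &t);
}
```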
<cut>
> +/**
> + * xe_eudebug_client_create:
> + * @master_fd: xe client used to open the debugger connection
> + * @work: function that opens xe device and executes arbitrary workload
> + * @flags: flags stored in a client structure, can be used at will
> + * of the caller, i.e. to provide the @work function an additional switch.
> + * @data: test's private data, allocated with MAP_SHARED | MAP_ANONYMOUS,
> + * can be shared between client and debugger. Accessible via client->ptr.
> + * Can be NULL.
> + *
> + * Forks and creates the debugger process. @work won't be called until
> + * xe_eudebug_client_start is called.
> + *
> + * Returns: newly created xe_eudebug_debugger structure with its
> + * event log initialized.
> + */
> +struct xe_eudebug_client *xe_eudebug_client_create(int master_fd, xe_eudebug_client_work_fn work,
> + uint64_t flags, void *data)
> +{
> + struct xe_eudebug_client *c;
> +
> + c = calloc(1, sizeof(*c));
> + igt_assert(c);
> +
> + c->flags = flags;
> + igt_assert(!pipe(c->p_in));
> + igt_assert(!pipe(c->p_out));
Imo these p_in/p_out names are not well chosen.
<cut>
All of my nits may be addressed in the future. For me the code may be
accepted under the flag, especially since eudebug kernel changes will be
reflected in the igt changes as well.
So for this patch:
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
^ permalink raw reply [flat|nested] 50+ messages in thread

* Re: [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework
2024-09-09 8:46 ` Zbigniew Kempczyński
@ 2024-09-13 15:14 ` Manszewski, Christoph
2024-09-16 6:48 ` Zbigniew Kempczyński
0 siblings, 1 reply; 50+ messages in thread
From: Manszewski, Christoph @ 2024-09-13 15:14 UTC (permalink / raw)
To: Zbigniew Kempczyński
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala
Hi Zbigniew,
On 9.09.2024 10:46, Zbigniew Kempczyński wrote:
> On Thu, Sep 05, 2024 at 11:28:06AM +0200, Christoph Manszewski wrote:
>> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>>
>> Introduce a library which simplifies testing of the eu debug capability.
>> The library provides event log helpers together with an asynchronous
>> abstraction for the client process and the debugger itself.
>>
>> xe_eudebug_client creates its own process with the user's work function,
>> and gives mechanisms to synchronize the beginning of execution and event
>> logging.
>>
>> xe_eudebug_debugger allows attaching to the given process, provides an
>> asynchronous thread for event reading and introduces triggers - a
>> callback mechanism invoked every time a subscribed event is read.
>>
>> To build the eudebug testing framework 'xe_eudebug' meson build option
>> has to be enabled, as it is disabled by default.
>>
>> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuaoppala@linux.intel.com>
>> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
>> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
>> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
>> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
>> ---
>> lib/meson.build | 5 +
>> lib/xe/xe_eudebug.c | 2249 +++++++++++++++++++++++++++++++++++++++++++
>> lib/xe/xe_eudebug.h | 218 +++++
>> meson.build | 2 +
>> meson_options.txt | 5 +
>> 5 files changed, 2479 insertions(+)
>> create mode 100644 lib/xe/xe_eudebug.c
>> create mode 100644 lib/xe/xe_eudebug.h
>
> That's my final review. I left only things which may/should be addressed
> in the future.
>
>> +static int safe_pipe_read(int pipe[2], void *buf, int nbytes, int timeout_ms)
>
> I wonder what passing the pipe array is for, instead of just one fd.
> fd[0] is always for read, so passing fd[1] here doesn't make sense
> (apart from the fact that int pipe[2] just signals there is a pipe).
I guess it was done this way to provide some abstraction, so the caller
just passes a pipe, created with 'pipe(2)', and leaves it to the last
function to pick the appropriate fd for read/write. I would be happy to
change it, however just changing the parameter to 'int fd' would then
also require adapting the naming - after all, the name 'pipe_read' doesn't
make sense if we no longer pass in a pipe but an fd.
Perhaps the whole __wait_token->pipe_read->safe_pipe_read and
token_signal->pipe_signal hierarchy of functions could use some
refactoring. But then I would also need some guideline to make it
coherent and aligned with the expectations.
>> +{
>> + int ret;
>> + int t = 0;
>> + struct pollfd fd = {
>> + .fd = pipe[0],
>> + .events = POLLIN,
>> + .revents = 0
>> + };
>> +
>> + /* When child fails we may get stuck forever. Check whether
>> + * the child process ended with an error.
>> + */
>> + do {
>> + const int interval_ms = 1000;
>> +
>> + ret = poll(&fd, 1, interval_ms);
>> +
>> + if (!ret) {
>> + catch_child_failure();
>> + t += interval_ms;
>> + }
>> + } while (!ret && t < timeout_ms);
>> +
>> + if (ret > 0)
>> + return read(pipe[0], buf, nbytes);
>> +
>> + return 0;
>> +}
>> +
>> +static uint64_t pipe_read(int pipe[2], int timeout_ms)
>> +{
>> + uint64_t in;
>> + uint64_t ret;
>> +
>> + ret = safe_pipe_read(pipe, &in, sizeof(in), timeout_ms);
>> + igt_assert(ret == sizeof(in));
>> +
>> + return in;
>> +}
>> +
>> +static void pipe_signal(int pipe[2], uint64_t token)
>> +{
>> + igt_assert(write(pipe[1], &token, sizeof(token)) == sizeof(token));
>> +}
>> +
>> +static void pipe_close(int pipe[2])
>> +{
>> + if (pipe[0] != -1)
>> + close(pipe[0]);
>> +
>> + if (pipe[1] != -1)
>> + close(pipe[1]);
>
> Just close(pipe[0]); close(pipe[1]); is enough.
Ok
>
>> +}
>> +
>> +static uint64_t __wait_token(int p[2], const uint64_t token, int timeout_ms)
>
> s/p[2]/pipe[2]/ for being consistent.
Sure
>
>> +{
>> + uint64_t in;
>> +
>> + in = pipe_read(p, timeout_ms);
>> +
>> + igt_assert_eq(in, token);
>> +
>> + return pipe_read(p, timeout_ms);
>> +}
>> +
>> +static uint64_t client_wait_token(struct xe_eudebug_client *c, const uint64_t token)
>> +{
>> + return __wait_token(c->p_in, token, c->timeout_ms);
>> +}
>> +
>> +static uint64_t wait_from_client(struct xe_eudebug_client *c, const uint64_t token)
>> +{
>> + return __wait_token(c->p_out, token, c->timeout_ms);
>
> p_in[2] and p_out[2] are weird names, especially since p_in[1] is for
> writing to the pipe and p_out[0] is for reading from the pipe. So the
> naming is confusing for the reader. Waiting on p_out is misleading.
I understand. Since this is somewhat personal, or at least I don't see
any obvious 'correct' way of naming those fields, I would find it helpful
if you provided some alternative.
I also agree that considering these field names out of context is
misleading. But since this is a field in 'xe_eudebug_client' it reads
'client pipe out' and 'client pipe in'. One pipe flows to the client, the
other from the client. And to me it makes sense that we read and write
the 'client pipe in' depending on whether we read client-side code or
non-client-side code.
>
>> +}
>> +
>> +static void token_signal(int p[2], const uint64_t token, const uint64_t value)
>> +{
>
> Same here, p[2] -> pipe[2].
Ok
>
>> + pipe_signal(p, token);
>> + pipe_signal(p, value);
>> +}
>> +
>> +static void client_signal(struct xe_eudebug_client *c,
>> + const uint64_t token,
>> + const uint64_t value)
>> +{
>> + token_signal(c->p_out, token, value);
>> +}
>> +
>> +static int __xe_eudebug_connect(int fd, pid_t pid, uint32_t flags, uint64_t events)
>> +{
>> + struct drm_xe_eudebug_connect param = {
>> + .pid = pid,
>> + .flags = flags,
>> + };
>> + int debugfd;
>> +
>> + debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, &param);
>> +
>> + if (debugfd < 0)
>> + return -errno;
>> +
>> + return debugfd;
>> +}
>> +
>> +static void event_log_write_to_fd(struct xe_eudebug_event_log *l, int fd)
>> +{
>> + igt_assert_eq(write(fd, &l->head, sizeof(l->head)),
>> + sizeof(l->head));
>> +
>> + igt_assert_eq(write(fd, l->log, l->head), l->head);
>> +}
>> +
>> +static void read_all(int fd, void *buf, size_t nbytes)
>> +{
>> + ssize_t remaining_size = nbytes;
>> + ssize_t current_size = 0;
>> + ssize_t read_size = 0;
>> +
>> + do {
>> + read_size = read(fd, buf + current_size, remaining_size);
>> + igt_assert_f(read_size >= 0, "read failed: %s\n", strerror(errno));
>> +
>> + current_size += read_size;
>> + remaining_size -= read_size;
>> + } while (remaining_size > 0 && read_size > 0);
>> +
>> + igt_assert_eq(current_size, nbytes);
>> +}
>> +
>> +static void event_log_read_from_fd(struct xe_eudebug_event_log *l, int fd)
>> +{
>> + read_all(fd, &l->head, sizeof(l->head));
>> + igt_assert_lt(l->head, l->max_size);
>
> Instead of asserting, the log may be reallocated here. I mean something
> like this should be enough:
>
> if (l->head > l->max_size) {
> l->max_size += MAX_SIZE;
Not sure I follow this line.
> l->log = realloc(l->log, l->max_size);
> igt_assert(l->log);
> }
>
> Anyway, I'm not happy with how log keeping has been implemented here.
> This is however not a blocker for merging the series. I understand
> how convenient it is to write the whole log from the client to the main
> process to compare against what was noticed from the debugger side.
>
>
>> +
>> + read_all(fd, l->log, l->head);
>> +}
>> +
>
> <cut>
>
>> +/**
>> + * xe_eudebug_event_log_find_seqno:
>> + * @l: event log pointer
>> + * @seqno: seqno of event to be found
>> + *
>> + * Finds the event with given seqno in the event log.
>> + *
>> + * Returns: pointer to the event with given seqno within @l or NULL if
>> + * @seqno is not present.
>> + */
>> +struct drm_xe_eudebug_event *
>> +xe_eudebug_event_log_find_seqno(struct xe_eudebug_event_log *l, uint64_t seqno)
>> +{
>> + struct drm_xe_eudebug_event *e = NULL, *found = NULL;
>> +
>> + igt_assert(l);
>> + igt_assert_neq(seqno, 0);
>> + /*
>> + * Try to catch if seqno is corrupted and prevent too long tests,
>> + * as our post processing of events is not optimized.
>> + */
>> + igt_assert_lt(seqno, 10 * 1000 * 1000);
>> +
>> + xe_eudebug_for_each_event(e, l) {
>> + if (e->seqno == seqno) {
>> + if (found) {
>> + igt_warn("Found multiple events with the same seqno %lu\n", seqno);
>> + xe_eudebug_event_log_print(l, false);
>> + igt_assert(!found);
>> + }
>> + found = e;
>> + }
>> + }
>> +
>> + return found;
>> +}
>> +
>> +static void event_log_sort(struct xe_eudebug_event_log *l)
>> +{
>> + struct xe_eudebug_event_log *tmp;
>> + struct drm_xe_eudebug_event *e = NULL;
>> + uint64_t first_seqno = UINT64_MAX;
>> + uint64_t last_seqno = 0;
>> + uint64_t events = 0, added = 0;
>> + uint64_t i;
>> +
>> + xe_eudebug_for_each_event(e, l) {
>> + if (e->seqno > last_seqno)
>> + last_seqno = e->seqno;
>> +
>> + if (e->seqno < first_seqno)
>> + first_seqno = e->seqno;
>> +
>> + events++;
>> + }
>
> The above code suggests this function is called many times during
> test execution (scanning first/last seqno), but it runs only once, on
> test completion. Confusing for a first-time reader.
Noted
>
>> +
>> + tmp = xe_eudebug_event_log_create("tmp", l->max_size);
>> +
>> + for (i = first_seqno; i <= last_seqno; i++) {
>> + e = xe_eudebug_event_log_find_seqno(l, i);
>> + if (e) {
>> + xe_eudebug_event_log_write(tmp, e);
>> + added++;
>> + }
>> + }
>> +
>> + igt_assert_eq(events, added);
>> + igt_assert_eq(tmp->head, l->head);
>> +
>> + memcpy(l->log, tmp->log, tmp->head);
>> +
>> + xe_eudebug_event_log_destroy(tmp);
>
> <cut>
>
>> +
>> +/**
>> + * xe_eudebug_event_log_create:
>> + * @name: event log identifier
>> + * @max_size: maximum size of created log
>> + *
>> + * Function creates an Eu Debugger event log with size equal to @max_size.
>> + *
>> + * Returns: pointer to just created log
>> + */
>> +#define MAX_EVENT_LOG_SIZE (32 * 1024 * 1024)
>> +struct xe_eudebug_event_log *xe_eudebug_event_log_create(const char *name, unsigned int max_size)
>> +{
>> + struct xe_eudebug_event_log *l;
>> +
>> + igt_assert(name);
>> +
>> + l = calloc(1, sizeof(*l));
>> + igt_assert(l);
>> + l->log = calloc(1, max_size);
>> + igt_assert(l->log);
>> + l->max_size = max_size;
>> + strncpy(l->name, name, sizeof(l->name) - 1);
>> + pthread_mutex_init(&l->lock, NULL);
>> +
>> + return l;
>> +}
>> +
>> +/**
>> + * xe_eudebug_event_log_destroy:
>> + * @l: event log pointer
>> + *
>> + * Frees given event log @l.
>> + */
>> +void xe_eudebug_event_log_destroy(struct xe_eudebug_event_log *l)
>> +{
>> + igt_assert(l);
>> + pthread_mutex_destroy(&l->lock);
>> + free(l->log);
>> + free(l);
>> +}
>> +
>> +/**
>> + * xe_eudebug_event_log_write:
>> + * @l: event log pointer
>> + * @e: event to be written to event log
>> + *
>> + * Writes event @e to the event log, thread-safe.
>> + */
>> +void xe_eudebug_event_log_write(struct xe_eudebug_event_log *l, struct drm_xe_eudebug_event *e)
>> +{
>> + igt_assert(l);
>> + igt_assert(e);
>> + igt_assert(e->seqno);
>> + /*
>> + * Try to catch if seqno is corrupted and prevent too long tests,
>> + * as our post processing of events is not optimized.
>> + */
>> + igt_assert_lt(e->seqno, 10 * 1000 * 1000);
>> +
>> + pthread_mutex_lock(&l->lock);
>> + igt_assert_lt(l->head + e->len, l->max_size);
>
> Similar to the above code, reallocating may be added here.
Ok
>
>> + memcpy(l->log + l->head, e, e->len);
>> + l->head += e->len;
>> + pthread_mutex_unlock(&l->lock);
>> +}
>
> <cut>
>
>> +
>> +/**
>> + * xe_eudebug_debugger_stop_worker:
>> + * @d: pointer to the debugger
>> + *
>> + * Stops the debugger worker. Event log is sorted by seqno after closure.
>> + */
>> +void xe_eudebug_debugger_stop_worker(struct xe_eudebug_debugger *d,
>> + int timeout_s)
>> +{
>> + struct timespec t = {};
>> + int ret;
>> +
>> + igt_assert_neq(d->worker_state, DEBUGGER_WORKER_INACTIVE);
>> +
>> + d->worker_state = DEBUGGER_WORKER_QUITTING; /* First time be polite. */
>> + igt_assert_eq(clock_gettime(CLOCK_REALTIME, &t), 0);
>> + t.tv_sec += timeout_s;
>> +
>> + ret = pthread_timedjoin_np(d->worker_thread, NULL, &t);
>> +
>> + if (ret == ETIMEDOUT) {
>> + d->worker_state = DEBUGGER_WORKER_INACTIVE;
>> + ret = pthread_join(d->worker_thread, NULL);
>
> It's possible we get stuck here forever (until the runner kills the test).
Ok, but then it's a problem with the test in one way or another, right?
What do you suggest? Doing a timedjoin and sending a signal?
> And I don't like that the caller is setting INACTIVE instead of the
> thread, which should do that after noticing the QUITTING state.
>
>> + }
>> +
>> + igt_assert_f(ret == 0 || ret != ESRCH,
>> + "pthread join failed with error %d!\n", ret);
>> +
>> + event_log_sort(d->log);
>> +}
>
> <cut>
>
>> +/**
>> + * xe_eudebug_client_create:
>> + * @master_fd: xe client used to open the debugger connection
>> + * @work: function that opens xe device and executes arbitrary workload
>> + * @flags: flags stored in a client structure, can be used at will
>> + * of the caller, i.e. to provide the @work function an additional switch.
>> + * @data: test's private data, allocated with MAP_SHARED | MAP_ANONYMOUS,
>> + * can be shared between client and debugger. Accessible via client->ptr.
>> + * Can be NULL.
>> + *
>> + * Forks and creates the debugger process. @work won't be called until
>> + * xe_eudebug_client_start is called.
>> + *
>> + * Returns: newly created xe_eudebug_debugger structure with its
>> + * event log initialized.
>> + */
>> +struct xe_eudebug_client *xe_eudebug_client_create(int master_fd, xe_eudebug_client_work_fn work,
>> + uint64_t flags, void *data)
>> +{
>> + struct xe_eudebug_client *c;
>> +
>> + c = calloc(1, sizeof(*c));
>> + igt_assert(c);
>> +
>> + c->flags = flags;
>> + igt_assert(!pipe(c->p_in));
>> + igt_assert(!pipe(c->p_out));
>
> Imo these p_in/p_out names are not well chosen.
See the discussion above. Please try to include a suggestion when
expressing a concern like this one.
>
> <cut>
>
> All of my nits may be addressed in the future. For me the code may be
> accepted under the flag, especially since eudebug kernel changes will be
> reflected in the igt changes as well.
>
> So for this patch:
>
> Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Thanks,
Christoph
>
> --
> Zbigniew
>
^ permalink raw reply [flat|nested] 50+ messages in thread

* Re: [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework
2024-09-13 15:14 ` Manszewski, Christoph
@ 2024-09-16 6:48 ` Zbigniew Kempczyński
0 siblings, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-16 6:48 UTC (permalink / raw)
To: Manszewski, Christoph
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala
On Fri, Sep 13, 2024 at 05:14:19PM +0200, Manszewski, Christoph wrote:
> Hi Zbigniew,
>
> On 9.09.2024 10:46, Zbigniew Kempczyński wrote:
> > On Thu, Sep 05, 2024 at 11:28:06AM +0200, Christoph Manszewski wrote:
> > > From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> > >
> > > Introduce a library which simplifies testing of the eu debug capability.
> > > The library provides event log helpers together with an asynchronous
> > > abstraction for the client process and the debugger itself.
> > >
> > > xe_eudebug_client creates its own process with the user's work function,
> > > and gives mechanisms to synchronize the beginning of execution and event
> > > logging.
> > >
> > > xe_eudebug_debugger allows attaching to the given process, provides an
> > > asynchronous thread for event reading and introduces triggers - a
> > > callback mechanism invoked every time a subscribed event is read.
> > >
> > > To build the eudebug testing framework 'xe_eudebug' meson build option
> > > has to be enabled, as it is disabled by default.
> > >
> > > Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> > > Signed-off-by: Mika Kuoppala <mika.kuaoppala@linux.intel.com>
> > > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > > Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> > > Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> > > Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> > > ---
> > > lib/meson.build | 5 +
> > > lib/xe/xe_eudebug.c | 2249 +++++++++++++++++++++++++++++++++++++++++++
> > > lib/xe/xe_eudebug.h | 218 +++++
> > > meson.build | 2 +
> > > meson_options.txt | 5 +
> > > 5 files changed, 2479 insertions(+)
> > > create mode 100644 lib/xe/xe_eudebug.c
> > > create mode 100644 lib/xe/xe_eudebug.h
> >
> > That's my final review. I left only things which may/should be addressed
> > in the future.
> >
> > > +static int safe_pipe_read(int pipe[2], void *buf, int nbytes, int timeout_ms)
> >
> > I wonder what passing the pipe array is for, instead of just one fd.
> > fd[0] is always for read, so passing fd[1] here doesn't make sense
> > (apart from the fact that int pipe[2] just signals there is a pipe).
>
> I guess it was done this way to provide some abstraction, so the caller
> just passes a pipe, created with 'pipe(2)', and leaves it to the last
> function to pick the appropriate fd for read/write. I would be happy to
> change it, however just changing the parameter to 'int fd' would then
> also require adapting the naming - after all, the name 'pipe_read' doesn't
> make sense if we no longer pass in a pipe but an fd.
>
> Perhaps the whole __wait_token->pipe_read->safe_pipe_read and
> token_signal->pipe_signal hierarchy of functions could use some refactoring.
> But then I would also need some guideline to make it coherent and aligned
> with the expectations.
You may keep the code intact. It is working and may be the subject of
future refactoring. I think without passing pipe[2] it is still possible
to hint the caller about the meaning of the arguments, like:
int my_pipe_read(int pipe_read_fd, ...);
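Expanded into a full helper, such a signature could look like this - a
sketch only, keeping the poll-then-read shape of safe_pipe_read() but
omitting the child-failure check:

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical single-fd variant: the parameter name itself documents
 * that the read end of a pipe is expected. Returns the number of bytes
 * read, or 0 on timeout. */
static int my_pipe_read(int pipe_read_fd, void *buf, int nbytes,
			int timeout_ms)
{
	struct pollfd pfd = {
		.fd = pipe_read_fd,
		.events = POLLIN,
	};
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret > 0)
		return read(pipe_read_fd, buf, nbytes);

	return 0;
}
```

Callers would then pass c->p_in[0] explicitly, making the read end visible
at every call site.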
>
> > > +{
> > > + int ret;
> > > + int t = 0;
> > > + struct pollfd fd = {
> > > + .fd = pipe[0],
> > > + .events = POLLIN,
> > > + .revents = 0
> > > + };
> > > +
> > > + /* When child fails we may get stuck forever. Check whether
> > > + * the child process ended with an error.
> > > + */
> > > + do {
> > > + const int interval_ms = 1000;
> > > +
> > > + ret = poll(&fd, 1, interval_ms);
> > > +
> > > + if (!ret) {
> > > + catch_child_failure();
> > > + t += interval_ms;
> > > + }
> > > + } while (!ret && t < timeout_ms);
> > > +
> > > + if (ret > 0)
> > > + return read(pipe[0], buf, nbytes);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static uint64_t pipe_read(int pipe[2], int timeout_ms)
> > > +{
> > > + uint64_t in;
> > > + uint64_t ret;
> > > +
> > > + ret = safe_pipe_read(pipe, &in, sizeof(in), timeout_ms);
> > > + igt_assert(ret == sizeof(in));
> > > +
> > > + return in;
> > > +}
> > > +
> > > +static void pipe_signal(int pipe[2], uint64_t token)
> > > +{
> > > + igt_assert(write(pipe[1], &token, sizeof(token)) == sizeof(token));
> > > +}
> > > +
> > > +static void pipe_close(int pipe[2])
> > > +{
> > > + if (pipe[0] != -1)
> > > + close(pipe[0]);
> > > +
> > > + if (pipe[1] != -1)
> > > + close(pipe[1]);
> >
> > Just close(pipe[0]); close(pipe[1]); is enough.
>
> Ok
>
> >
> > > +}
> > > +
> > > +static uint64_t __wait_token(int p[2], const uint64_t token, int timeout_ms)
> >
> > s/p[2]/pipe[2]/ for being consistent.
>
> Sure
>
>
> >
> > > +{
> > > + uint64_t in;
> > > +
> > > + in = pipe_read(p, timeout_ms);
> > > +
> > > + igt_assert_eq(in, token);
> > > +
> > > + return pipe_read(p, timeout_ms);
> > > +}
> > > +
> > > +static uint64_t client_wait_token(struct xe_eudebug_client *c, const uint64_t token)
> > > +{
> > > + return __wait_token(c->p_in, token, c->timeout_ms);
> > > +}
> > > +
> > > +static uint64_t wait_from_client(struct xe_eudebug_client *c, const uint64_t token)
> > > +{
> > > + return __wait_token(c->p_out, token, c->timeout_ms);
> >
> > p_in[2] and p_out[2] are weird names, especially since p_in[1] is for
> > writing to the pipe and p_out[0] is for reading from the pipe. So the
> > naming is confusing for the reader. Waiting on p_out is misleading.
>
> I understand. Since this is somewhat personal, or at least I don't see any
> obvious 'correct' way of naming those fields, I would find it helpful if
> you provided some alternative.
>
> I also agree that considering these field names out of context is
> misleading. But since this is a field in 'xe_eudebug_client' it reads
> 'client pipe out' and 'client pipe in'. One pipe flows to the client, the
> other from the client. And to me it makes sense that we read and write the
> 'client pipe in' depending on whether we read client-side code or
> non-client-side code.
Naming like to_client_pipe and from_client_pipe would be more readable
imo. I have no resistance to using longer names when the code itself
becomes more readable. As you've said, naming perception is somewhat
personal, so what's readable for you may not be too readable for me.
However, it is not a blocker for merging the series.
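A sketch of how the struct and helpers could look with such naming (field
and helper names are hypothetical - the point is only that the direction
lives in the name, so each helper picks the obviously matching end):

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

struct sketch_client {
	int to_client[2];   /* parent writes [1], client reads [0] */
	int from_client[2]; /* client writes [1], parent reads [0] */
};

/* Parent -> client direction: always to_client. */
static void parent_send(struct sketch_client *c, uint64_t v)
{
	assert(write(c->to_client[1], &v, sizeof(v)) == sizeof(v));
}

static uint64_t client_recv(struct sketch_client *c)
{
	uint64_t v;

	assert(read(c->to_client[0], &v, sizeof(v)) == sizeof(v));
	return v;
}

/* Client -> parent direction: always from_client. */
static void client_send(struct sketch_client *c, uint64_t v)
{
	assert(write(c->from_client[1], &v, sizeof(v)) == sizeof(v));
}

static uint64_t parent_recv(struct sketch_client *c)
{
	uint64_t v;

	assert(read(c->from_client[0], &v, sizeof(v)) == sizeof(v));
	return v;
}
```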
>
> >
> > > +}
> > > +
> > > +static void token_signal(int p[2], const uint64_t token, const uint64_t value)
> > > +{
> >
> > Same here, p[2] -> pipe[2].
>
> Ok
>
>
> >
> > > + pipe_signal(p, token);
> > > + pipe_signal(p, value);
> > > +}
> > > +
> > > +static void client_signal(struct xe_eudebug_client *c,
> > > + const uint64_t token,
> > > + const uint64_t value)
> > > +{
> > > + token_signal(c->p_out, token, value);
> > > +}
> > > +
> > > +static int __xe_eudebug_connect(int fd, pid_t pid, uint32_t flags, uint64_t events)
> > > +{
> > > + struct drm_xe_eudebug_connect param = {
> > > + .pid = pid,
> > > + .flags = flags,
> > > + };
> > > + int debugfd;
> > > +
> > > + debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, &param);
> > > +
> > > + if (debugfd < 0)
> > > + return -errno;
> > > +
> > > + return debugfd;
> > > +}
> > > +
> > > +static void event_log_write_to_fd(struct xe_eudebug_event_log *l, int fd)
> > > +{
> > > + igt_assert_eq(write(fd, &l->head, sizeof(l->head)),
> > > + sizeof(l->head));
> > > +
> > > + igt_assert_eq(write(fd, l->log, l->head), l->head);
> > > +}
> > > +
> > > +static void read_all(int fd, void *buf, size_t nbytes)
> > > +{
> > > + ssize_t remaining_size = nbytes;
> > > + ssize_t current_size = 0;
> > > + ssize_t read_size = 0;
> > > +
> > > + do {
> > > + read_size = read(fd, buf + current_size, remaining_size);
> > > + igt_assert_f(read_size >= 0, "read failed: %s\n", strerror(errno));
> > > +
> > > + current_size += read_size;
> > > + remaining_size -= read_size;
> > > + } while (remaining_size > 0 && read_size > 0);
> > > +
> > > + igt_assert_eq(current_size, nbytes);
> > > +}
> > > +
> > > +static void event_log_read_from_fd(struct xe_eudebug_event_log *l, int fd)
> > > +{
> > > + read_all(fd, &l->head, sizeof(l->head));
> > > + igt_assert_lt(l->head, l->max_size);
> >
> > Instead of asserting, the log may be reallocated here. I mean something
> > like this should be enough:
> >
> > if (l->head > l->max_size) {
> > l->max_size += MAX_SIZE;
>
> Not sure I follow this line.
Right, best is to reallocate to l->head, as you're reading the log only
once. So:
l->max_size = l->head;
l->log = realloc(l->log, l->max_size);
That should solve your doubts.
>
> > l->log = realloc(l->log, l->max_size);
> > igt_assert(l->log);
> > }
> >
> > Anyway, I'm not happy with how log keeping has been implemented here.
> > This is however not a blocker for merging the series. I understand
> > how convenient it is to write the whole log from the client to the main
> > process to compare against what was noticed from the debugger side.
> >
> >
> > > +
> > > + read_all(fd, l->log, l->head);
> > > +}
> > > +
> >
> > <cut>
> >
> > > +/**
> > > + * xe_eudebug_event_log_find_seqno:
> > > + * @l: event log pointer
> > > + * @seqno: seqno of event to be found
> > > + *
> > > + * Finds the event with given seqno in the event log.
> > > + *
> > > + * Returns: pointer to the event with given seqno within @l or NULL if
> > > + * @seqno is not present.
> > > + */
> > > +struct drm_xe_eudebug_event *
> > > +xe_eudebug_event_log_find_seqno(struct xe_eudebug_event_log *l, uint64_t seqno)
> > > +{
> > > + struct drm_xe_eudebug_event *e = NULL, *found = NULL;
> > > +
> > > + igt_assert(l);
> > > + igt_assert_neq(seqno, 0);
> > > + /*
> > > + * Try to catch if seqno is corrupted and prevent too long tests,
> > > + * as our post processing of events is not optimized.
> > > + */
> > > + igt_assert_lt(seqno, 10 * 1000 * 1000);
> > > +
> > > + xe_eudebug_for_each_event(e, l) {
> > > + if (e->seqno == seqno) {
> > > + if (found) {
> > > + igt_warn("Found multiple events with the same seqno %lu\n", seqno);
> > > + xe_eudebug_event_log_print(l, false);
> > > + igt_assert(!found);
> > > + }
> > > + found = e;
> > > + }
> > > + }
> > > +
> > > + return found;
> > > +}
> > > +
> > > +static void event_log_sort(struct xe_eudebug_event_log *l)
> > > +{
> > > + struct xe_eudebug_event_log *tmp;
> > > + struct drm_xe_eudebug_event *e = NULL;
> > > + uint64_t first_seqno = UINT64_MAX;
> > > + uint64_t last_seqno = 0;
> > > + uint64_t events = 0, added = 0;
> > > + uint64_t i;
> > > +
> > > + xe_eudebug_for_each_event(e, l) {
> > > + if (e->seqno > last_seqno)
> > > + last_seqno = e->seqno;
> > > +
> > > + if (e->seqno < first_seqno)
> > > + first_seqno = e->seqno;
> > > +
> > > + events++;
> > > + }
> >
> > The above code suggests this function is called many times during
> > test execution (scanning for first/last seqno), but it runs once on
> > test completion. Confusing for a first-time reader.
>
> Noted
>
> >
> > > +
> > > + tmp = xe_eudebug_event_log_create("tmp", l->max_size);
> > > +
> > > + for (i = first_seqno; i <= last_seqno; i++) {
> > > + e = xe_eudebug_event_log_find_seqno(l, i);
> > > + if (e) {
> > > + xe_eudebug_event_log_write(tmp, e);
> > > + added++;
> > > + }
> > > + }
> > > +
> > > + igt_assert_eq(events, added);
> > > + igt_assert_eq(tmp->head, l->head);
> > > +
> > > + memcpy(l->log, tmp->log, tmp->head);
> > > +
> > > + xe_eudebug_event_log_destroy(tmp);
> >
> > <cut>
> >
> > > +
> > > +/**
> > > + * xe_eudebug_event_log_create:
> > > + * @name: event log identifier
> > > + * @max_size: maximum size of created log
> > > + *
> > > + * Function creates an Eu Debugger event log with size equal to @max_size.
> > > + *
> > > + * Returns: pointer to just created log
> > > + */
> > > +#define MAX_EVENT_LOG_SIZE (32 * 1024 * 1024)
> > > +struct xe_eudebug_event_log *xe_eudebug_event_log_create(const char *name, unsigned int max_size)
> > > +{
> > > + struct xe_eudebug_event_log *l;
> > > +
> > > + igt_assert(name);
> > > +
> > > + l = calloc(1, sizeof(*l));
> > > + igt_assert(l);
> > > + l->log = calloc(1, max_size);
> > > + igt_assert(l->log);
> > > + l->max_size = max_size;
> > > + strncpy(l->name, name, sizeof(l->name) - 1);
> > > + pthread_mutex_init(&l->lock, NULL);
> > > +
> > > + return l;
> > > +}
> > > +
> > > +/**
> > > + * xe_eudebug_event_log_destroy:
> > > + * @l: event log pointer
> > > + *
> > > + * Frees given event log @l.
> > > + */
> > > +void xe_eudebug_event_log_destroy(struct xe_eudebug_event_log *l)
> > > +{
> > > + igt_assert(l);
> > > + pthread_mutex_destroy(&l->lock);
> > > + free(l->log);
> > > + free(l);
> > > +}
> > > +
> > > +/**
> > > + * xe_eudebug_event_log_write:
> > > + * @l: event log pointer
> > > + * @e: event to be written to event log
> > > + *
> > > + * Writes event @e to the event log, thread-safe.
> > > + */
> > > +void xe_eudebug_event_log_write(struct xe_eudebug_event_log *l, struct drm_xe_eudebug_event *e)
> > > +{
> > > + igt_assert(l);
> > > + igt_assert(e);
> > > + igt_assert(e->seqno);
> > > + /*
> > > + * Try to catch if seqno is corrupted and prevent too long tests,
> > > + * as our post processing of events is not optimized.
> > > + */
> > > + igt_assert_lt(e->seqno, 10 * 1000 * 1000);
> > > +
> > > + pthread_mutex_lock(&l->lock);
> > > + igt_assert_lt(l->head + e->len, l->max_size);
> >
> > Similar to the above code, reallocating may be added here.
>
> Ok
>
> >
> > > + memcpy(l->log + l->head, e, e->len);
> > > + l->head += e->len;
> > > + pthread_mutex_unlock(&l->lock);
> > > +}
> >
> > <cut>
> >
> > > +
> > > +/**
> > > + * xe_eudebug_debugger_stop_worker:
> > > + * @d: pointer to the debugger
> > > + *
> > > + * Stops the debugger worker. Event log is sorted by seqno after closure.
> > > + */
> > > +void xe_eudebug_debugger_stop_worker(struct xe_eudebug_debugger *d,
> > > + int timeout_s)
> > > +{
> > > + struct timespec t = {};
> > > + int ret;
> > > +
> > > + igt_assert_neq(d->worker_state, DEBUGGER_WORKER_INACTIVE);
> > > +
> > > + d->worker_state = DEBUGGER_WORKER_QUITTING; /* First time be polite. */
> > > + igt_assert_eq(clock_gettime(CLOCK_REALTIME, &t), 0);
> > > + t.tv_sec += timeout_s;
> > > +
> > > + ret = pthread_timedjoin_np(d->worker_thread, NULL, &t);
> > > +
> > > + if (ret == ETIMEDOUT) {
> > > + d->worker_state = DEBUGGER_WORKER_INACTIVE;
> > > + ret = pthread_join(d->worker_thread, NULL);
> >
> > It's possible we get stuck here forever (until the runner kills the test).
>
> Ok, but then it's a problem with the test one way or another, right? What
> do you suggest? Doing a timedjoin and sending a signal?
If the worker thread has no bugs and always ends its execution, we'll
never get stuck here. If it does have bugs, igt_runner will do its job and
kill the test. I proposed some responsibility changes regarding state
handling which may mitigate the worker getting stuck due to some bug. My
personal experience with multithreading is that if we rely on state
handling, each thread should report its own state changes. Just imagine I
gave you some task and someone else reported its completion. The current
code is implemented this way, and that's what I don't like here.
>
> > And I don't like that the caller sets INACTIVE instead of the thread,
> > which should do that after noticing the QUITTING state.
> > > + }
> > > +
> > > + igt_assert_f(ret == 0 || ret != ESRCH,
> > > + "pthread join failed with error %d!\n", ret);
> > > +
> > > + event_log_sort(d->log);
> > > +}
> >
> > <cut>
> >
> > > +/**
> > > + * xe_eudebug_client_create:
> > > + * @master_fd: xe client used to open the debugger connection
> > > + * @work: function that opens xe device and executes arbitrary workload
> > > + * @flags: flags stored in a client structure, can be used at will
> > > + * of the caller, i.e. to provide the @work function an additional switch.
> > > + * @data: test's private data, allocated with MAP_SHARED | MAP_ANONYMOUS,
> > > + * can be shared between client and debugger. Accessible via client->ptr.
> > > + * Can be NULL.
> > > + *
> > > + * Forks and creates the client process. @work won't be called until
> > > + * xe_eudebug_client_start is called.
> > > + *
> > > + * Returns: newly created xe_eudebug_client structure with its
> > > + * event log initialized.
> > > + */
> > > +struct xe_eudebug_client *xe_eudebug_client_create(int master_fd, xe_eudebug_client_work_fn work,
> > > + uint64_t flags, void *data)
> > > +{
> > > + struct xe_eudebug_client *c;
> > > +
> > > + c = calloc(1, sizeof(*c));
> > > + igt_assert(c);
> > > +
> > > + c->flags = flags;
> > > + igt_assert(!pipe(c->p_in));
> > > + igt_assert(!pipe(c->p_out));
> >
> > Imo these p_in/p_out are not well-chosen names.
>
> See the discussion above. Please try to offer a suggestion when expressing
> a concern like this one.
to_client_pipe/from_client_pipe?
--
Zbigniew
>
> >
> > <cut>
> >
> > All of my nits may be addressed in the future. For me, the code may be
> > accepted under the flag, especially as eudebug kernel changes will be
> > reflected in the igt changes as well.
> >
> > So for this patch:
> >
> > Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>
> Thanks,
> Christoph
> >
> > --
> > Zbigniew
> >
* Re: [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework
2024-09-05 9:28 ` [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework Christoph Manszewski
2024-09-09 8:46 ` Zbigniew Kempczyński
@ 2024-09-10 5:32 ` Zbigniew Kempczyński
1 sibling, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-10 5:32 UTC (permalink / raw)
To: Christoph Manszewski; +Cc: igt-dev
On Thu, Sep 05, 2024 at 11:28:06AM +0200, Christoph Manszewski wrote:
> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
<cut>
> +static int enable_getset(int fd, bool *old, bool *new)
> +{
> + static const char * const fname = "enable_eudebug";
> + int ret = 0;
> + int sysfs, device_fd;
> + bool val_before;
> + struct stat st;
> +
> + igt_assert(new || old);
> + igt_assert_eq(fstat(fd, &st), 0);
> +
> + sysfs = igt_sysfs_open(fd);
> + if (sysfs < 0)
> + return -1;
> +
> + device_fd = openat(sysfs, "device", O_DIRECTORY | O_RDONLY);
> + close(sysfs);
> + if (device_fd < 0)
> + return -1;
> +
> + if (!__igt_sysfs_get_boolean(device_fd, fname, &val_before)) {
> + ret = -1;
> + goto out;
> + }
> +
> + igt_debug("enable_eudebug before: %d\n", val_before);
> +
> + if (old)
> + *old = val_before;
> +
> + ret = 0;
You may drop ret = 0 here, as it is already initialized to 0
and no one overwrites it.
--
Zbigniew
> + if (new) {
> + if (__igt_sysfs_set_boolean(device_fd, fname, *new))
> + igt_assert_eq(igt_sysfs_get_boolean(device_fd, fname), *new);
> + else
> + ret = -1;
> + }
> +
> +out:
> + close(device_fd);
> +
> + return ret;
> +}
> +
* [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (10 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 11/17] lib/xe_eudebug: Introduce eu debug testing framework Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-09 11:31 ` Kamil Konieczny
2024-09-05 9:28 ` [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation Christoph Manszewski
` (7 subsequent siblings)
19 siblings, 1 reply; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
Currently the doc script relies on a static list of files passed
through the respective '<config>.json' file for generating
documentation. This results in a compilation error when we exclude some
tests from building since the doc script notices documentation generated
for missing binaries.
Make it possible to dynamically exclude files from doc generation using
'--exclude-files' parameter. Merge excluded files passed with that parameter
with files excluded from the test config '.json' files. Align the behavior
of the '--files' parameter to also merge with files included from the config.
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
scripts/igt_doc.py | 3 +++
scripts/test_list.py | 47 +++++++++++++++++++++++++-------------------
2 files changed, 30 insertions(+), 20 deletions(-)
diff --git a/scripts/igt_doc.py b/scripts/igt_doc.py
index fa2c2c7ca..9b7317feb 100755
--- a/scripts/igt_doc.py
+++ b/scripts/igt_doc.py
@@ -313,6 +313,8 @@ def main():
help="Generate testlists for Intel CI integration at the INTELCI_TESTLIST directory.")
parser.add_argument('--files', nargs='+',
help="File name(s) to be processed")
+ parser.add_argument('--exclude-files', nargs='+',
+ help="File name(s) to ignore")
parse_args = parser.parse_args()
@@ -325,6 +327,7 @@ def main():
tests = IgtTestList(config_fname = config,
include_plan = parse_args.include_plan,
file_list = parse_args.files,
+ exclude_file_list = parse_args.exclude_files,
igt_build_path = parse_args.igt_build_path)
if parse_args.filter_field:
diff --git a/scripts/test_list.py b/scripts/test_list.py
index 69c830ca1..4fe209e5a 100644
--- a/scripts/test_list.py
+++ b/scripts/test_list.py
@@ -250,7 +250,7 @@ class TestList:
config_dict = None, sources_path = None,
test_tag = "TEST", subtest_tag = "SUBTESTS?",
main_name = "igt", planned_name = "planned",
- subtest_separator = "@"):
+ subtest_separator = "@", exclude_file_list = None):
self.doc = {}
self.test_number = 0
self.config = None
@@ -368,28 +368,35 @@ class TestList:
if "_properties_" in self.props:
del self.props["_properties_"]
- has_implemented = False
+ if not exclude_file_list:
+ exclude_file_list = []
+ else:
+ exclude_file_list = [os.path.normpath(f) for f in exclude_file_list]
+
+ exclude_file_glob = self.config.get("exclude_files", [])
+ for cfg_file in exclude_file_glob:
+ cfg_file = cfg_path + cfg_file
+ for fname in glob.glob(cfg_file):
+ exclude_file_list.append(fname)
+
if not self.filenames:
self.filenames = []
- exclude_files = []
- files = self.config["files"]
- exclude_file_glob = self.config.get("exclude_files", [])
- for cfg_file in exclude_file_glob:
- cfg_file = cfg_path + cfg_file
- for fname in glob.glob(cfg_file):
- exclude_files.append(fname)
-
- for cfg_file in files:
- cfg_file = cfg_path + cfg_file
- for fname in glob.glob(cfg_file):
- if fname in exclude_files:
- continue
- self.filenames.append(fname)
- has_implemented = True
else:
- for cfg_file in self.filenames:
- if cfg_file:
- has_implemented = True
+ self.filenames = [os.path.normpath(f) for f in self.filenames]
+
+ files = self.config["files"]
+ for cfg_file in files:
+ cfg_file = cfg_path + cfg_file
+ for fname in glob.glob(cfg_file):
+ if fname in exclude_file_list:
+ continue
+ self.filenames.append(fname)
+
+ has_implemented = False
+ for cfg_file in self.filenames:
+ if cfg_file:
+ has_implemented = True
+ break
has_planned = False
if include_plan and "planning_files" in self.config:
--
2.34.1
* Re: [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter
2024-09-05 9:28 ` [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter Christoph Manszewski
@ 2024-09-09 11:31 ` Kamil Konieczny
2024-09-09 13:57 ` Zbigniew Kempczyński
0 siblings, 1 reply; 50+ messages in thread
From: Kamil Konieczny @ 2024-09-09 11:31 UTC (permalink / raw)
To: igt-dev
Cc: Christoph Manszewski, Zbigniew Kempczyński,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Jari Tahvanainen, Katarzyna Piecielska
Hi Christoph,
On 2024-09-05 at 11:28:07 +0200, Christoph Manszewski wrote:
> Currently the doc script relies on a static list of files passed
> through the respective '<config>.json' file for generating
> documentation. This results in a compilation error when we exclude some
> tests from building since the doc script notices documentation generated
> for missing binaries.
>
> Make it possible to dynamically exclude files from doc generation using
> '--exclude-files' parameter. Merge excluded files passed with that parameter
> with files excluded from the test config '.json' files. Align the behavior
> of the '--files' parameter to also merge with files included from the config.
Could we get rid of this option and instead check whether a binary
file was generated for a given C source file?
As I understand it, we could skip the C source checks if no binary
was generated.
Or are there any other constraints, like planned tests?
Adding Jari and Katarzyna on Cc.
Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
Cc: Katarzyna Piecielska <katarzyna.piecielska@intel.com>
Regards,
Kamil
>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
> scripts/igt_doc.py | 3 +++
> scripts/test_list.py | 47 +++++++++++++++++++++++++-------------------
> 2 files changed, 30 insertions(+), 20 deletions(-)
>
> diff --git a/scripts/igt_doc.py b/scripts/igt_doc.py
> index fa2c2c7ca..9b7317feb 100755
> --- a/scripts/igt_doc.py
> +++ b/scripts/igt_doc.py
> @@ -313,6 +313,8 @@ def main():
> help="Generate testlists for Intel CI integration at the INTELCI_TESTLIST directory.")
> parser.add_argument('--files', nargs='+',
> help="File name(s) to be processed")
> + parser.add_argument('--exclude-files', nargs='+',
> + help="File name(s) to ignore")
>
> parse_args = parser.parse_args()
>
> @@ -325,6 +327,7 @@ def main():
> tests = IgtTestList(config_fname = config,
> include_plan = parse_args.include_plan,
> file_list = parse_args.files,
> + exclude_file_list = parse_args.exclude_files,
> igt_build_path = parse_args.igt_build_path)
>
> if parse_args.filter_field:
> diff --git a/scripts/test_list.py b/scripts/test_list.py
> index 69c830ca1..4fe209e5a 100644
> --- a/scripts/test_list.py
> +++ b/scripts/test_list.py
> @@ -250,7 +250,7 @@ class TestList:
> config_dict = None, sources_path = None,
> test_tag = "TEST", subtest_tag = "SUBTESTS?",
> main_name = "igt", planned_name = "planned",
> - subtest_separator = "@"):
> + subtest_separator = "@", exclude_file_list = None):
> self.doc = {}
> self.test_number = 0
> self.config = None
> @@ -368,28 +368,35 @@ class TestList:
> if "_properties_" in self.props:
> del self.props["_properties_"]
>
> - has_implemented = False
> + if not exclude_file_list:
> + exclude_file_list = []
> + else:
> + exclude_file_list = [os.path.normpath(f) for f in exclude_file_list]
> +
> + exclude_file_glob = self.config.get("exclude_files", [])
> + for cfg_file in exclude_file_glob:
> + cfg_file = cfg_path + cfg_file
> + for fname in glob.glob(cfg_file):
> + exclude_file_list.append(fname)
> +
> if not self.filenames:
> self.filenames = []
> - exclude_files = []
> - files = self.config["files"]
> - exclude_file_glob = self.config.get("exclude_files", [])
> - for cfg_file in exclude_file_glob:
> - cfg_file = cfg_path + cfg_file
> - for fname in glob.glob(cfg_file):
> - exclude_files.append(fname)
> -
> - for cfg_file in files:
> - cfg_file = cfg_path + cfg_file
> - for fname in glob.glob(cfg_file):
> - if fname in exclude_files:
> - continue
> - self.filenames.append(fname)
> - has_implemented = True
> else:
> - for cfg_file in self.filenames:
> - if cfg_file:
> - has_implemented = True
> + self.filenames = [os.path.normpath(f) for f in self.filenames]
> +
> + files = self.config["files"]
> + for cfg_file in files:
> + cfg_file = cfg_path + cfg_file
> + for fname in glob.glob(cfg_file):
> + if fname in exclude_file_list:
> + continue
> + self.filenames.append(fname)
> +
> + has_implemented = False
> + for cfg_file in self.filenames:
> + if cfg_file:
> + has_implemented = True
> + break
>
> has_planned = False
> if include_plan and "planning_files" in self.config:
> --
> 2.34.1
>
2024-09-09 11:31 ` Kamil Konieczny
@ 2024-09-09 13:57 ` Zbigniew Kempczyński
2024-09-13 13:24 ` Manszewski, Christoph
0 siblings, 1 reply; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-09 13:57 UTC (permalink / raw)
To: Kamil Konieczny, igt-dev, Christoph Manszewski,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Jari Tahvanainen, Katarzyna Piecielska
On Mon, Sep 09, 2024 at 01:31:25PM +0200, Kamil Konieczny wrote:
> Hi Christoph,
> On 2024-09-05 at 11:28:07 +0200, Christoph Manszewski wrote:
> > Currently the doc script relies on a static list of files passed
> > through the respective '<config>.json' file for generating
> > documentation. This results in a compilation error when we exclude some
> > tests from building since the doc script notices documentation generated
> > for missing binaries.
> >
> > Make it possible to dynamically exclude files from doc generation using
> > '--exclude-files' parameter. Merge excluded files passed with that parameter
> > with files excluded from the test config '.json' files. Align the behavior
> > of the '--files' parameter to also merge with files included from the config.
>
> Could we get rid of this option and instead check whether a binary
> file was generated for a given C source file?
> As I understand it, we could skip the C source checks if no binary
> was generated.
I like this idea. I'll ask Christoph to drop the exclude-files option and
check the corresponding .c files only if the binary exists.
--
Zbigniew
>
> Or are there any other constraints, like planned tests?
>
> Adding Jari and Katarzyna on Cc.
>
> Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
> Cc: Katarzyna Piecielska <katarzyna.piecielska@intel.com>
>
> Regards,
> Kamil
>
> >
> > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> > scripts/igt_doc.py | 3 +++
> > scripts/test_list.py | 47 +++++++++++++++++++++++++-------------------
> > 2 files changed, 30 insertions(+), 20 deletions(-)
> >
> > diff --git a/scripts/igt_doc.py b/scripts/igt_doc.py
> > index fa2c2c7ca..9b7317feb 100755
> > --- a/scripts/igt_doc.py
> > +++ b/scripts/igt_doc.py
> > @@ -313,6 +313,8 @@ def main():
> > help="Generate testlists for Intel CI integration at the INTELCI_TESTLIST directory.")
> > parser.add_argument('--files', nargs='+',
> > help="File name(s) to be processed")
> > + parser.add_argument('--exclude-files', nargs='+',
> > + help="File name(s) to ignore")
> >
> > parse_args = parser.parse_args()
> >
> > @@ -325,6 +327,7 @@ def main():
> > tests = IgtTestList(config_fname = config,
> > include_plan = parse_args.include_plan,
> > file_list = parse_args.files,
> > + exclude_file_list = parse_args.exclude_files,
> > igt_build_path = parse_args.igt_build_path)
> >
> > if parse_args.filter_field:
> > diff --git a/scripts/test_list.py b/scripts/test_list.py
> > index 69c830ca1..4fe209e5a 100644
> > --- a/scripts/test_list.py
> > +++ b/scripts/test_list.py
> > @@ -250,7 +250,7 @@ class TestList:
> > config_dict = None, sources_path = None,
> > test_tag = "TEST", subtest_tag = "SUBTESTS?",
> > main_name = "igt", planned_name = "planned",
> > - subtest_separator = "@"):
> > + subtest_separator = "@", exclude_file_list = None):
> > self.doc = {}
> > self.test_number = 0
> > self.config = None
> > @@ -368,28 +368,35 @@ class TestList:
> > if "_properties_" in self.props:
> > del self.props["_properties_"]
> >
> > - has_implemented = False
> > + if not exclude_file_list:
> > + exclude_file_list = []
> > + else:
> > + exclude_file_list = [os.path.normpath(f) for f in exclude_file_list]
> > +
> > + exclude_file_glob = self.config.get("exclude_files", [])
> > + for cfg_file in exclude_file_glob:
> > + cfg_file = cfg_path + cfg_file
> > + for fname in glob.glob(cfg_file):
> > + exclude_file_list.append(fname)
> > +
> > if not self.filenames:
> > self.filenames = []
> > - exclude_files = []
> > - files = self.config["files"]
> > - exclude_file_glob = self.config.get("exclude_files", [])
> > - for cfg_file in exclude_file_glob:
> > - cfg_file = cfg_path + cfg_file
> > - for fname in glob.glob(cfg_file):
> > - exclude_files.append(fname)
> > -
> > - for cfg_file in files:
> > - cfg_file = cfg_path + cfg_file
> > - for fname in glob.glob(cfg_file):
> > - if fname in exclude_files:
> > - continue
> > - self.filenames.append(fname)
> > - has_implemented = True
> > else:
> > - for cfg_file in self.filenames:
> > - if cfg_file:
> > - has_implemented = True
> > + self.filenames = [os.path.normpath(f) for f in self.filenames]
> > +
> > + files = self.config["files"]
> > + for cfg_file in files:
> > + cfg_file = cfg_path + cfg_file
> > + for fname in glob.glob(cfg_file):
> > + if fname in exclude_file_list:
> > + continue
> > + self.filenames.append(fname)
> > +
> > + has_implemented = False
> > + for cfg_file in self.filenames:
> > + if cfg_file:
> > + has_implemented = True
> > + break
> >
> > has_planned = False
> > if include_plan and "planning_files" in self.config:
> > --
> > 2.34.1
> >
* Re: [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter
2024-09-09 13:57 ` Zbigniew Kempczyński
@ 2024-09-13 13:24 ` Manszewski, Christoph
2024-09-13 16:40 ` Kamil Konieczny
0 siblings, 1 reply; 50+ messages in thread
From: Manszewski, Christoph @ 2024-09-13 13:24 UTC (permalink / raw)
To: Zbigniew Kempczyński, Kamil Konieczny, igt-dev,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Jari Tahvanainen, Katarzyna Piecielska
Hi all,
On 9.09.2024 15:57, Zbigniew Kempczyński wrote:
> On Mon, Sep 09, 2024 at 01:31:25PM +0200, Kamil Konieczny wrote:
>> Hi Christoph,
>> On 2024-09-05 at 11:28:07 +0200, Christoph Manszewski wrote:
>>> Currently the doc script relies on a static list of files passed
>>> through the respective '<config>.json' file for generating
>>> documentation. This results in a compilation error when we exclude some
>>> tests from building since the doc script notices documentation generated
>>> for missing binaries.
>>>
>>> Make it possible to dynamically exclude files from doc generation using
>>> '--exclude-files' parameter. Merge excluded files passed with that parameter
>>> with files excluded from the test config '.json' files. Align the behavior
>>> of the '--files' parameter to also merge with files included from the config.
>>
>> Could we get rid of this option and instead check whether a binary
>> file was generated for a given C source file?
>> As I understand it, we could skip the C source checks if no binary
>> was generated.
>
> I like this idea. I'll ask Christoph to drop the exclude-files option and
> check the corresponding .c files only if the binary exists.
I already replied to Kamil's patch and will include it in this series
instead of this one. Note, however, that this results in generating
documentation for tests that weren't included in the build. I was
specifically trying to avoid that, but since Kamil suggests a change like
this, I guess it's fine.
Thanks,
Christoph
>
> --
> Zbigniew
>
>>
>> Or are there any other constraints, like planned tests?
>>
>> Adding Jari and Katarzyna on Cc.
>>
>> Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
>> Cc: Katarzyna Piecielska <katarzyna.piecielska@intel.com>
>>
>> Regards,
>> Kamil
>>
>>>
>>> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
>>> Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>> scripts/igt_doc.py | 3 +++
>>> scripts/test_list.py | 47 +++++++++++++++++++++++++-------------------
>>> 2 files changed, 30 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/scripts/igt_doc.py b/scripts/igt_doc.py
>>> index fa2c2c7ca..9b7317feb 100755
>>> --- a/scripts/igt_doc.py
>>> +++ b/scripts/igt_doc.py
>>> @@ -313,6 +313,8 @@ def main():
>>> help="Generate testlists for Intel CI integration at the INTELCI_TESTLIST directory.")
>>> parser.add_argument('--files', nargs='+',
>>> help="File name(s) to be processed")
>>> + parser.add_argument('--exclude-files', nargs='+',
>>> + help="File name(s) to ignore")
>>>
>>> parse_args = parser.parse_args()
>>>
>>> @@ -325,6 +327,7 @@ def main():
>>> tests = IgtTestList(config_fname = config,
>>> include_plan = parse_args.include_plan,
>>> file_list = parse_args.files,
>>> + exclude_file_list = parse_args.exclude_files,
>>> igt_build_path = parse_args.igt_build_path)
>>>
>>> if parse_args.filter_field:
>>> diff --git a/scripts/test_list.py b/scripts/test_list.py
>>> index 69c830ca1..4fe209e5a 100644
>>> --- a/scripts/test_list.py
>>> +++ b/scripts/test_list.py
>>> @@ -250,7 +250,7 @@ class TestList:
>>> config_dict = None, sources_path = None,
>>> test_tag = "TEST", subtest_tag = "SUBTESTS?",
>>> main_name = "igt", planned_name = "planned",
>>> - subtest_separator = "@"):
>>> + subtest_separator = "@", exclude_file_list = None):
>>> self.doc = {}
>>> self.test_number = 0
>>> self.config = None
>>> @@ -368,28 +368,35 @@ class TestList:
>>> if "_properties_" in self.props:
>>> del self.props["_properties_"]
>>>
>>> - has_implemented = False
>>> + if not exclude_file_list:
>>> + exclude_file_list = []
>>> + else:
>>> + exclude_file_list = [os.path.normpath(f) for f in exclude_file_list]
>>> +
>>> + exclude_file_glob = self.config.get("exclude_files", [])
>>> + for cfg_file in exclude_file_glob:
>>> + cfg_file = cfg_path + cfg_file
>>> + for fname in glob.glob(cfg_file):
>>> + exclude_file_list.append(fname)
>>> +
>>> if not self.filenames:
>>> self.filenames = []
>>> - exclude_files = []
>>> - files = self.config["files"]
>>> - exclude_file_glob = self.config.get("exclude_files", [])
>>> - for cfg_file in exclude_file_glob:
>>> - cfg_file = cfg_path + cfg_file
>>> - for fname in glob.glob(cfg_file):
>>> - exclude_files.append(fname)
>>> -
>>> - for cfg_file in files:
>>> - cfg_file = cfg_path + cfg_file
>>> - for fname in glob.glob(cfg_file):
>>> - if fname in exclude_files:
>>> - continue
>>> - self.filenames.append(fname)
>>> - has_implemented = True
>>> else:
>>> - for cfg_file in self.filenames:
>>> - if cfg_file:
>>> - has_implemented = True
>>> + self.filenames = [os.path.normpath(f) for f in self.filenames]
>>> +
>>> + files = self.config["files"]
>>> + for cfg_file in files:
>>> + cfg_file = cfg_path + cfg_file
>>> + for fname in glob.glob(cfg_file):
>>> + if fname in exclude_file_list:
>>> + continue
>>> + self.filenames.append(fname)
>>> +
>>> + has_implemented = False
>>> + for cfg_file in self.filenames:
>>> + if cfg_file:
>>> + has_implemented = True
>>> + break
>>>
>>> has_planned = False
>>> if include_plan and "planning_files" in self.config:
>>> --
>>> 2.34.1
>>>
* Re: [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter
2024-09-13 13:24 ` Manszewski, Christoph
@ 2024-09-13 16:40 ` Kamil Konieczny
0 siblings, 0 replies; 50+ messages in thread
From: Kamil Konieczny @ 2024-09-13 16:40 UTC (permalink / raw)
To: igt-dev
Cc: Manszewski, Christoph, Zbigniew Kempczyński,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Jari Tahvanainen, Katarzyna Piecielska
Hi Christoph,
On 2024-09-13 at 15:24:13 +0200, Manszewski, Christoph wrote:
> Hi all,
>
> On 9.09.2024 15:57, Zbigniew Kempczyński wrote:
> > On Mon, Sep 09, 2024 at 01:31:25PM +0200, Kamil Konieczny wrote:
> > > Hi Christoph,
> > > On 2024-09-05 at 11:28:07 +0200, Christoph Manszewski wrote:
> > > > Currently the doc script relies on a static list of files passed
> > > > through the respective '<config>.json' file for generating
> > > > documentation. This results in a compilation error when we exclude some
> > > > tests from building since the doc script notices documentation generated
> > > > for missing binaries.
> > > >
> > > > Make it possible to dynamically exclude files from doc generation using
> > > > '--exclude-files' parameter. Merge excluded files passed with that parameter
> > > > with files excluded from the test config '.json' files. Align the behavior
> > > > of the '--files' parameter to also merge with files included from the config.
> > >
> > > Could we get rid of this option and instead check whether a binary
> > > file was generated for a given C source file?
> > > As I understand it, we could skip the C source checks if no binary
> > > was generated.
Now I see I suggested the opposite, as Christoph wrote below.
> >
> > I like this idea. I'll ask Christoph to drop the exclude-files option and
> > check the corresponding .c files only if the binary exists.
>
> I already replied to Kamil's patch and will include it in this series
> instead of this one. Note, however, that this results in generating
> documentation for tests that weren't included in the build. I was
> specifically trying to avoid that, but since Kamil suggests a change like
> this, I guess it's fine.
>
> Thanks,
> Christoph
We could ask Jari or Katarzyna if that is ok; imho it is. We started to
develop those tests, so it should be ok to have them in the documentation.
Regards,
Kamil
> >
> > --
> > Zbigniew
> >
> > >
> > > Or are there any other constraints, like planned tests?
> > >
> > > Adding Jari and Katarzyna on Cc.
> > >
> > > Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
> > > Cc: Katarzyna Piecielska <katarzyna.piecielska@intel.com>
> > >
> > > Regards,
> > > Kamil
> > >
> > > >
> > > > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > > > Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > > > ---
> > > > scripts/igt_doc.py | 3 +++
> > > > scripts/test_list.py | 47 +++++++++++++++++++++++++-------------------
> > > > 2 files changed, 30 insertions(+), 20 deletions(-)
> > > >
[...cut...]
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (11 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 12/17] scripts/igt_doc: Add '--exclude-files' parameter Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-06 14:46 ` Kamil Konieczny
2024-09-12 8:04 ` Zbigniew Kempczyński
2024-09-05 9:28 ` [PATCH i-g-t v6 14/17] lib/intel_batchbuffer: Add support for long-running mode execution Christoph Manszewski
` (6 subsequent siblings)
19 siblings, 2 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Mika Kuoppala, Christoph Manszewski, Karolina Stolarek,
Jonathan Cavitt
From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
For typical debugging under gdb one can identify two main use cases:
accessing and manipulating resources created by the application, and
manipulating thread execution (interrupting and setting breakpoints).
This test adds coverage for the former by checking that:
- the debugger reports the expected events for Xe resources created
by the debugged client,
- the debugger is able to read and write the vm of the debugged client.
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
---
docs/testplan/meson.build | 13 +-
meson_options.txt | 2 +-
tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
tests/meson.build | 8 +
4 files changed, 2737 insertions(+), 2 deletions(-)
create mode 100644 tests/intel/xe_eudebug.c
diff --git a/docs/testplan/meson.build b/docs/testplan/meson.build
index 5560347f1..e86af028e 100644
--- a/docs/testplan/meson.build
+++ b/docs/testplan/meson.build
@@ -33,11 +33,22 @@ else
doc_dependencies = []
endif
+xe_excluded_tests = []
+if not build_xe_eudebug
+ foreach test : intel_xe_eudebug_progs
+ xe_excluded_tests += meson.current_source_dir() + '/../../tests/intel/' + test + '.c'
+ endforeach
+endif
+
+if xe_excluded_tests.length() > 0
+ xe_excluded_tests = ['--exclude-files'] + xe_excluded_tests
+endif
+
if build_xe
test_dict = {
'i915_tests': { 'input': i915_test_config, 'extra_args': check_testlist },
'kms_tests': { 'input': kms_test_config, 'extra_args': kms_check_testlist },
- 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist }
+ 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist + xe_excluded_tests }
}
else
test_dict = {
diff --git a/meson_options.txt b/meson_options.txt
index 11922523b..c410f9b77 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -45,7 +45,7 @@ option('xe_driver',
option('xe_eudebug',
type : 'feature',
value : 'disabled',
- description : 'Build library for Xe EU debugger')
+ description : 'Build library and tests for Xe EU debugger')
option('libdrm_drivers',
type : 'array',
diff --git a/tests/intel/xe_eudebug.c b/tests/intel/xe_eudebug.c
new file mode 100644
index 000000000..fd2894a5e
--- /dev/null
+++ b/tests/intel/xe_eudebug.c
@@ -0,0 +1,2716 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+/**
+ * TEST: Test EU Debugger functionality
+ * Category: Core
+ * Mega feature: EUdebug
+ * Sub-category: EUdebug tests
+ * Functionality: eu debugger framework
+ * Test category: functionality test
+ */
+
+#include <grp.h>
+#include <poll.h>
+#include <pthread.h>
+#include <pwd.h>
+#include <sys/ioctl.h>
+#include <sys/prctl.h>
+
+#include "igt.h"
+#include "intel_pat.h"
+#include "lib/igt_syncobj.h"
+#include "xe/xe_eudebug.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+
+/**
+ * SUBTEST: sysfs-toggle
+ * Description:
+ * Exercise the debugger enable/disable sysfs toggle logic
+ */
+static void test_sysfs_toggle(int fd)
+{
+ xe_eudebug_enable(fd, false);
+ igt_assert(!xe_eudebug_debugger_available(fd));
+
+ xe_eudebug_enable(fd, true);
+ igt_assert(xe_eudebug_debugger_available(fd));
+ xe_eudebug_enable(fd, true);
+ igt_assert(xe_eudebug_debugger_available(fd));
+
+ xe_eudebug_enable(fd, false);
+ igt_assert(!xe_eudebug_debugger_available(fd));
+ xe_eudebug_enable(fd, false);
+ igt_assert(!xe_eudebug_debugger_available(fd));
+
+ xe_eudebug_enable(fd, true);
+ igt_assert(xe_eudebug_debugger_available(fd));
+}
+
+#define STAGE_PRE_DEBUG_RESOURCES_DONE 1
+#define STAGE_DISCOVERY_DONE 2
+
+#define CREATE_VMS (1 << 0)
+#define CREATE_EXEC_QUEUES (1 << 1)
+#define VM_BIND (1 << 2)
+#define VM_BIND_VM_DESTROY (1 << 3)
+#define VM_BIND_EXTENDED (1 << 4)
+#define VM_METADATA (1 << 5)
+#define VM_BIND_METADATA (1 << 6)
+#define VM_BIND_OP_MAP_USERPTR (1 << 7)
+#define TEST_DISCOVERY (1 << 31)
+
+#define PAGE_SIZE 4096
+static struct drm_xe_vm_bind_op_ext_attach_debug *
+basic_vm_bind_metadata_ext_prepare(int fd, struct xe_eudebug_client *c,
+ uint8_t **data, uint32_t data_size)
+{
+ struct drm_xe_vm_bind_op_ext_attach_debug *ext;
+ int i;
+
+ *data = calloc(data_size, sizeof(**data));
+ igt_assert(*data);
+
+ for (i = 0; i < data_size; i++)
+ (*data)[i] = 0xff & (i + (i > PAGE_SIZE));
+
+ ext = calloc(WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM, sizeof(*ext));
+ igt_assert(ext);
+
+ for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++) {
+ ext[i].base.name = XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG;
+ ext[i].metadata_id = xe_eudebug_client_metadata_create(c, fd, i,
+ (i + 1) * PAGE_SIZE, *data);
+ ext[i].cookie = i;
+
+ if (i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM - 1)
+ ext[i].base.next_extension = to_user_pointer(&ext[i + 1]);
+ }
+ return ext;
+}
+
+static void basic_vm_bind_metadata_ext_del(int fd, struct xe_eudebug_client *c,
+ struct drm_xe_vm_bind_op_ext_attach_debug *ext,
+ uint8_t *data)
+{
+ for (int i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
+ xe_eudebug_client_metadata_destroy(c, fd, ext[i].metadata_id, i,
+ (i + 1) * PAGE_SIZE);
+ free(ext);
+ free(data);
+}
+
+static void basic_vm_bind_client(int fd, struct xe_eudebug_client *c)
+{
+ struct drm_xe_vm_bind_op_ext_attach_debug *ext = NULL;
+ uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ size_t bo_size = xe_get_default_alignment(fd);
+ bool test_discovery = c->flags & TEST_DISCOVERY;
+ bool test_metadata = c->flags & VM_BIND_METADATA;
+ uint32_t bo = xe_bo_create(fd, 0, bo_size,
+ system_memory(fd), 0);
+ uint64_t addr = 0x1a0000;
+ uint8_t *data = NULL;
+
+ if (test_metadata)
+ ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, PAGE_SIZE);
+
+ xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
+ bo_size, 0, NULL, 0, to_user_pointer(ext));
+
+ if (test_discovery) {
+ xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
+ xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
+ }
+
+ xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
+
+ if (test_metadata)
+ basic_vm_bind_metadata_ext_del(fd, c, ext, data);
+
+ gem_close(fd, bo);
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+}
+
+static void basic_vm_bind_vm_destroy_client(int fd, struct xe_eudebug_client *c)
+{
+ uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ size_t bo_size = xe_get_default_alignment(fd);
+ bool test_discovery = c->flags & TEST_DISCOVERY;
+ uint32_t bo = xe_bo_create(fd, 0, bo_size,
+ system_memory(fd), 0);
+ uint64_t addr = 0x1a0000;
+
+ if (test_discovery) {
+ vm = xe_vm_create(fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+
+ xe_vm_bind_async(fd, vm, 0, bo, 0, addr, bo_size, NULL, 0);
+
+ xe_vm_destroy(fd, vm);
+
+ xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
+ xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
+ } else {
+ vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ xe_eudebug_client_vm_bind(c, fd, vm, bo, 0, addr, bo_size);
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+ }
+
+ gem_close(fd, bo);
+}
+
+#define BO_ADDR 0x1a0000
+#define BO_ITEMS 4096
+#define MIN_BO_SIZE (BO_ITEMS * sizeof(uint64_t))
+
+union buf_id {
+ uint32_t fd;
+ void *userptr;
+};
+
+struct bind_list {
+ int fd;
+ uint32_t vm;
+ union buf_id *bo;
+ struct drm_xe_vm_bind_op *bind_ops;
+ unsigned int n;
+};
+
+static void *bo_get_ptr(int fd, struct drm_xe_vm_bind_op *o)
+{
+ void *ptr;
+
+ if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
+ ptr = xe_bo_map(fd, o->obj, o->range);
+ else
+ ptr = (void *)(uintptr_t)o->userptr;
+
+ igt_assert(ptr);
+
+ return ptr;
+}
+
+static void bo_put_ptr(int fd, struct drm_xe_vm_bind_op *o, void *ptr)
+{
+ if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
+ munmap(ptr, o->range);
+}
+
+static void bo_prime(int fd, struct drm_xe_vm_bind_op *o)
+{
+ uint64_t *d;
+ uint64_t i;
+
+ d = bo_get_ptr(fd, o);
+
+ for (i = 0; i < o->range / sizeof(*d); i++)
+ d[i] = o->addr + i;
+
+ bo_put_ptr(fd, o, d);
+}
+
+static void bo_check(int fd, struct drm_xe_vm_bind_op *o)
+{
+ uint64_t *d;
+ uint64_t i;
+
+ d = bo_get_ptr(fd, o);
+
+ for (i = 0; i < o->range / sizeof(*d); i++)
+ igt_assert_eq(d[i], o->addr + i + 1);
+
+ bo_put_ptr(fd, o, d);
+}
+
+static union buf_id *vm_create_objects(int fd, uint32_t bo_placement, uint32_t vm,
+ unsigned int size, unsigned int n)
+{
+ union buf_id *bo;
+ unsigned int i;
+
+ bo = calloc(n, sizeof(*bo));
+ igt_assert(bo);
+
+ for (i = 0; i < n; i++) {
+ if (bo_placement) {
+ bo[i].fd = xe_bo_create(fd, vm, size, bo_placement, 0);
+ igt_assert(bo[i].fd);
+ } else {
+ bo[i].userptr = aligned_alloc(PAGE_SIZE, size);
+ igt_assert(bo[i].userptr);
+ }
+ }
+
+ return bo;
+}
+
+static struct bind_list *create_bind_list(int fd, uint32_t bo_placement,
+ uint32_t vm, unsigned int n,
+ unsigned int target_size)
+{
+ unsigned int i = target_size ?: MIN_BO_SIZE;
+ const unsigned int bo_size = max_t(bo_size, xe_get_default_alignment(fd), i);
+ bool is_userptr = !bo_placement;
+ struct bind_list *bl;
+
+ bl = malloc(sizeof(*bl));
+ bl->fd = fd;
+ bl->vm = vm;
+ bl->bo = vm_create_objects(fd, bo_placement, vm, bo_size, n);
+ bl->n = n;
+ bl->bind_ops = calloc(n, sizeof(*bl->bind_ops));
+ igt_assert(bl->bind_ops);
+
+ for (i = 0; i < n; i++) {
+ struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
+
+ if (is_userptr) {
+ o->obj = 0;
+ o->userptr = (uintptr_t)bl->bo[i].userptr;
+ o->op = DRM_XE_VM_BIND_OP_MAP_USERPTR;
+ } else {
+ o->obj = bl->bo[i].fd;
+ o->obj_offset = 0;
+ o->op = DRM_XE_VM_BIND_OP_MAP;
+ }
+
+ o->range = bo_size;
+ o->addr = BO_ADDR + 2 * i * bo_size;
+ o->flags = 0;
+ o->pat_index = intel_get_pat_idx_wb(fd);
+ o->prefetch_mem_region_instance = 0;
+ o->reserved[0] = 0;
+ o->reserved[1] = 0;
+ }
+
+ for (i = 0; i < bl->n; i++) {
+ struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
+
+ igt_debug("bo %d: addr 0x%llx, range 0x%llx\n", i, o->addr, o->range);
+ bo_prime(fd, o);
+ }
+
+ return bl;
+}
+
+static void do_bind_list(struct xe_eudebug_client *c,
+ struct bind_list *bl, bool sync)
+{
+ struct drm_xe_sync uf_sync = {
+ .type = DRM_XE_SYNC_TYPE_USER_FENCE,
+ .flags = DRM_XE_SYNC_FLAG_SIGNAL,
+ .timeline_value = 1337,
+ };
+ uint64_t ref_seqno = 0, op_ref_seqno = 0;
+ uint64_t *fence_data;
+ int i;
+
+ if (sync) {
+ fence_data = aligned_alloc(xe_get_default_alignment(bl->fd),
+ sizeof(*fence_data));
+ igt_assert(fence_data);
+ uf_sync.addr = to_user_pointer(fence_data);
+ memset(fence_data, 0, sizeof(*fence_data));
+ }
+
+ xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, &uf_sync, sync ? 1 : 0);
+ xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
+ bl->fd, bl->vm, 0, bl->n, &ref_seqno);
+ for (i = 0; i < bl->n; i++)
+ xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
+ ref_seqno,
+ &op_ref_seqno,
+ bl->bind_ops[i].addr,
+ bl->bind_ops[i].range,
+ 0);
+
+ if (sync) {
+ xe_wait_ufence(bl->fd, fence_data, uf_sync.timeline_value, 0,
+ XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
+ free(fence_data);
+ }
+}
+
+static void free_bind_list(struct xe_eudebug_client *c, struct bind_list *bl)
+{
+ unsigned int i;
+
+ for (i = 0; i < bl->n; i++) {
+ igt_debug("%d: checking 0x%llx (%lld)\n",
+ i, bl->bind_ops[i].addr, bl->bind_ops[i].addr);
+ bo_check(bl->fd, &bl->bind_ops[i]);
+ if (bl->bind_ops[i].op == DRM_XE_VM_BIND_OP_MAP_USERPTR)
+ free(bl->bo[i].userptr);
+ xe_eudebug_client_vm_unbind(c, bl->fd, bl->vm, 0,
+ bl->bind_ops[i].addr,
+ bl->bind_ops[i].range);
+ }
+
+ free(bl->bind_ops);
+ free(bl->bo);
+ free(bl);
+}
+
+static void vm_bind_client(int fd, struct xe_eudebug_client *c)
+{
+ uint64_t op_ref_seqno, ref_seqno;
+ struct bind_list *bl;
+ bool test_discovery = c->flags & TEST_DISCOVERY;
+ size_t bo_size = 3 * xe_get_default_alignment(fd);
+ uint32_t bo[2] = {
+ xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
+ xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
+ };
+ uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ uint64_t addr[] = {0x2a0000, 0x3a0000};
+ uint64_t rebind_bo_offset = 2 * bo_size / 3;
+ uint64_t size = bo_size / 3;
+ int i = 0;
+
+ if (test_discovery) {
+ xe_vm_bind_async(fd, vm, 0, bo[0], 0, addr[0], bo_size, NULL, 0);
+
+ xe_vm_unbind_async(fd, vm, 0, 0, addr[0] + size, size, NULL, 0);
+
+ xe_vm_bind_async(fd, vm, 0, bo[1], 0, addr[1], bo_size, NULL, 0);
+
+ xe_vm_bind_async(fd, vm, 0, bo[1], rebind_bo_offset, addr[1], size, NULL, 0);
+
+ bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
+ xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, NULL, 0);
+
+ xe_vm_unbind_all_async(fd, vm, 0, bo[0], NULL, 0);
+
+ xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
+ bl->fd, bl->vm, 0, bl->n + 2, &ref_seqno);
+
+ xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
+ &op_ref_seqno, addr[1], size, 0);
+ xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
+ &op_ref_seqno, addr[1] + size, size * 2, 0);
+
+ for (i = 0; i < bl->n; i++)
+ xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
+ ref_seqno, &op_ref_seqno,
+ bl->bind_ops[i].addr,
+ bl->bind_ops[i].range, 0);
+
+ xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
+ xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
+ } else {
+ xe_eudebug_client_vm_bind(c, fd, vm, bo[0], 0, addr[0], bo_size);
+ xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr[0] + size, size);
+
+ xe_eudebug_client_vm_bind(c, fd, vm, bo[1], 0, addr[1], bo_size);
+ xe_eudebug_client_vm_bind(c, fd, vm, bo[1], rebind_bo_offset, addr[1], size);
+
+ bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
+ do_bind_list(c, bl, false);
+ }
+
+ xe_vm_unbind_all_async(fd, vm, 0, bo[1], NULL, 0);
+
+ xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, fd, vm, 0,
+ 1, &ref_seqno);
+ xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, ref_seqno,
+ &op_ref_seqno, 0, 0, 0);
+
+ gem_close(fd, bo[0]);
+ gem_close(fd, bo[1]);
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+}
+
+static void run_basic_client(struct xe_eudebug_client *c)
+{
+ int fd, i;
+
+ fd = xe_eudebug_client_open_driver(c);
+ xe_device_get(fd);
+
+ if (c->flags & CREATE_VMS) {
+ const uint32_t flags[] = {
+ DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | DRM_XE_VM_CREATE_FLAG_LR_MODE,
+ DRM_XE_VM_CREATE_FLAG_LR_MODE,
+ };
+ uint32_t vms[ARRAY_SIZE(flags)];
+
+ for (i = 0; i < ARRAY_SIZE(flags); i++)
+ vms[i] = xe_eudebug_client_vm_create(c, fd, flags[i], 0);
+
+ for (i--; i >= 0; i--)
+ xe_eudebug_client_vm_destroy(c, fd, vms[i]);
+ }
+
+ if (c->flags & CREATE_EXEC_QUEUES) {
+ struct drm_xe_exec_queue_create *create;
+ struct drm_xe_engine_class_instance *hwe;
+ struct drm_xe_ext_set_property eq_ext = {
+ .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
+ .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
+ .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
+ };
+ uint32_t vm;
+
+ create = calloc(xe_number_engines(fd), sizeof(*create));
+
+ vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+
+ i = 0;
+ xe_eudebug_for_each_engine(fd, hwe) {
+ create[i].instances = to_user_pointer(hwe);
+ create[i].vm_id = vm;
+ create[i].width = 1;
+ create[i].num_placements = 1;
+ create[i].extensions = to_user_pointer(&eq_ext);
+ xe_eudebug_client_exec_queue_create(c, fd, &create[i++]);
+ }
+
+ while (--i >= 0)
+ xe_eudebug_client_exec_queue_destroy(c, fd, &create[i]);
+
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+ }
+
+ if (c->flags & VM_BIND || c->flags & VM_BIND_METADATA)
+ basic_vm_bind_client(fd, c);
+
+ if (c->flags & VM_BIND_EXTENDED)
+ vm_bind_client(fd, c);
+
+ if (c->flags & VM_BIND_VM_DESTROY)
+ basic_vm_bind_vm_destroy_client(fd, c);
+
+ xe_device_put(fd);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+static int read_event(int debugfd, struct drm_xe_eudebug_event *event)
+{
+ int ret;
+
+ ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
+ if (ret < 0)
+ return -errno;
+
+ return ret;
+}
+
+static int __read_event(int debugfd, struct drm_xe_eudebug_event *event)
+{
+ int ret;
+
+ ret = ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
+ if (ret < 0)
+ return -errno;
+
+ return ret;
+}
+
+static int poll_event(int fd, int timeout_ms)
+{
+ int ret;
+
+ struct pollfd p = {
+ .fd = fd,
+ .events = POLLIN,
+ .revents = 0,
+ };
+
+ ret = poll(&p, 1, timeout_ms);
+ if (ret == -1)
+ return -errno;
+
+ return ret == 1 && (p.revents & POLLIN);
+}
+
+static int __debug_connect(int fd, int *debugfd, struct drm_xe_eudebug_connect *param)
+{
+ int ret = 0;
+
+ *debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, param);
+
+ if (*debugfd < 0) {
+ ret = -errno;
+ igt_assume(ret != 0);
+ }
+
+ errno = 0;
+ return ret;
+}
+
+/**
+ * SUBTEST: basic-connect
+ * Description:
+ * Exercise XE_EUDEBUG_CONNECT ioctl with passing
+ * valid and invalid params.
+ */
+static void test_connect(int fd)
+{
+ struct drm_xe_eudebug_connect param = {};
+ int debugfd, ret;
+ pid_t *pid;
+
+ pid = mmap(NULL, sizeof(pid_t), PROT_WRITE,
+ MAP_SHARED | MAP_ANON, -1, 0);
+
+ /* get fresh unrelated pid */
+ igt_fork(child, 1)
+ *pid = getpid();
+
+ igt_waitchildren();
+ param.pid = *pid;
+ munmap(pid, sizeof(pid_t));
+
+ ret = __debug_connect(fd, &debugfd, &param);
+ igt_assert(debugfd == -1);
+ igt_assert_eq(ret, param.pid ? -ENOENT : -EINVAL);
+
+ param.pid = 0;
+ ret = __debug_connect(fd, &debugfd, &param);
+ igt_assert(debugfd == -1);
+ igt_assert_eq(ret, -EINVAL);
+
+ param.pid = getpid();
+ param.version = -1;
+ ret = __debug_connect(fd, &debugfd, &param);
+ igt_assert(debugfd == -1);
+ igt_assert_eq(ret, -EINVAL);
+
+ param.version = 0;
+ param.flags = ~0;
+ ret = __debug_connect(fd, &debugfd, &param);
+ igt_assert(debugfd == -1);
+ igt_assert_eq(ret, -EINVAL);
+
+ param.flags = 0;
+ param.extensions = ~0;
+ ret = __debug_connect(fd, &debugfd, &param);
+ igt_assert(debugfd == -1);
+ igt_assert_eq(ret, -EINVAL);
+
+ param.extensions = 0;
+ ret = __debug_connect(fd, &debugfd, &param);
+ igt_assert_neq(debugfd, -1);
+ igt_assert_eq(ret, 0);
+
+ close(debugfd);
+}
+
+static void switch_user(__uid_t uid, __gid_t gid)
+{
+ struct group *gr;
+ __gid_t gr_v;
+
+ /* Users other than root need to belong to the video group */
+ gr = getgrnam("video");
+ igt_assert(gr);
+
+ /* Drop all */
+ igt_assert_eq(setgroups(1, &gr->gr_gid), 0);
+ igt_assert_eq(setgid(gid), 0);
+ igt_assert_eq(setuid(uid), 0);
+
+ igt_assert_eq(getgroups(1, &gr_v), 1);
+ igt_assert_eq(gr_v, gr->gr_gid);
+ igt_assert_eq(getgid(), gid);
+ igt_assert_eq(getuid(), uid);
+
+ igt_assert_eq(prctl(PR_SET_DUMPABLE, 1L), 0);
+}
+
+/**
+ * SUBTEST: connect-user
+ * Description:
+ * Verify unprivileged XE_EUDEBUG_CONNECT ioctl.
+ * Check:
+ * - user debugger to user workload connection
+ * - user debugger to other user workload connection
+ * - user debugger to privileged workload connection
+ */
+static void test_connect_user(int fd)
+{
+ struct drm_xe_eudebug_connect param = {};
+ struct passwd *pwd, *pwd2;
+ const char *user1 = "lp";
+ const char *user2 = "mail";
+ int debugfd, ret, i;
+ int p1[2], p2[2];
+ __uid_t u1, u2;
+ __gid_t g1, g2;
+ int newfd;
+ pid_t pid;
+
+#define NUM_USER_TESTS 4
+#define P_APP 0
+#define P_GDB 1
+ struct conn_user {
+ /* u[0] - process uid, u[1] - gdb uid */
+ __uid_t u[P_GDB + 1];
+ /* g[0] - process gid, g[1] - gdb gid */
+ __gid_t g[P_GDB + 1];
+ /* Expected fd from open */
+ int ret;
+ /* Skip this test case */
+ int skip;
+ const char *desc;
+ } test[NUM_USER_TESTS] = {};
+
+ igt_assert(!pipe(p1));
+ igt_assert(!pipe(p2));
+
+ pwd = getpwnam(user1);
+ igt_require(pwd);
+ u1 = pwd->pw_uid;
+ g1 = pwd->pw_gid;
+
+ /*
+ * Keep a copy of the needed contents, as getpwnam() returns a
+ * pointer to a static area that subsequent calls will overwrite.
+ * Note that getpwnam() returns NULL if it cannot find the
+ * user in passwd.
+ */
+ setpwent();
+ pwd2 = getpwnam(user2);
+ if (pwd2) {
+ u2 = pwd2->pw_uid;
+ g2 = pwd2->pw_gid;
+ }
+
+ test[0].skip = !pwd;
+ test[0].u[P_GDB] = u1;
+ test[0].g[P_GDB] = g1;
+ test[0].ret = -EACCES;
+ test[0].desc = "User GDB to Root App";
+
+ test[1].skip = !pwd;
+ test[1].u[P_APP] = u1;
+ test[1].g[P_APP] = g1;
+ test[1].u[P_GDB] = u1;
+ test[1].g[P_GDB] = g1;
+ test[1].ret = 0;
+ test[1].desc = "User GDB to User App";
+
+ test[2].skip = !pwd;
+ test[2].u[P_APP] = u1;
+ test[2].g[P_APP] = g1;
+ test[2].ret = 0;
+ test[2].desc = "Root GDB to User App";
+
+ test[3].skip = !pwd2;
+ test[3].u[P_APP] = u1;
+ test[3].g[P_APP] = g1;
+ test[3].u[P_GDB] = u2;
+ test[3].g[P_GDB] = g2;
+ test[3].ret = -EACCES;
+ test[3].desc = "User GDB to Other User App";
+
+ if (!pwd2)
+ igt_warn("User %s not available in the system. Skipping subtests: %s.\n",
+ user2, test[3].desc);
+
+ for (i = 0; i < NUM_USER_TESTS; i++) {
+ if (test[i].skip) {
+ igt_debug("Subtest %s skipped\n", test[i].desc);
+ continue;
+ }
+ igt_debug("Executing connection: %s\n", test[i].desc);
+ igt_fork(child, 2) {
+ if (!child) {
+ if (test[i].u[P_APP])
+ switch_user(test[i].u[P_APP], test[i].g[P_APP]);
+
+ pid = getpid();
+ /* Signal the PID */
+ igt_assert(write(p1[1], &pid, sizeof(pid)) == sizeof(pid));
+ /* wait with exit */
+ igt_assert(read(p2[0], &pid, sizeof(pid)) == sizeof(pid));
+ } else {
+ if (test[i].u[P_GDB])
+ switch_user(test[i].u[P_GDB], test[i].g[P_GDB]);
+
+ igt_assert(read(p1[0], &pid, sizeof(pid)) == sizeof(pid));
+ param.pid = pid;
+
+ newfd = drm_open_driver(DRIVER_XE);
+ ret = __debug_connect(newfd, &debugfd, &param);
+
+ /* Release the app first */
+ igt_assert(write(p2[1], &pid, sizeof(pid)) == sizeof(pid));
+
+ igt_assert_eq(ret, test[i].ret);
+ if (!ret)
+ close(debugfd);
+ }
+ }
+ igt_waitchildren();
+ }
+ close(p1[0]);
+ close(p1[1]);
+ close(p2[0]);
+ close(p2[1]);
+#undef NUM_USER_TESTS
+#undef P_APP
+#undef P_GDB
+}
+
+/**
+ * SUBTEST: basic-close
+ * Description:
+ * Test whether eudebug can be reattached after closure.
+ */
+static void test_close(int fd)
+{
+ struct drm_xe_eudebug_connect param = { 0, };
+ int debug_fd1, debug_fd2;
+ int fd2;
+
+ param.pid = getpid();
+
+ igt_assert_eq(__debug_connect(fd, &debug_fd1, &param), 0);
+ igt_assert(debug_fd1 >= 0);
+ igt_assert_eq(__debug_connect(fd, &debug_fd2, &param), -EBUSY);
+ igt_assert_eq(debug_fd2, -1);
+
+ close(debug_fd1);
+ fd2 = drm_open_driver(DRIVER_XE);
+
+ igt_assert_eq(__debug_connect(fd2, &debug_fd2, &param), 0);
+ igt_assert(debug_fd2 >= 0);
+ close(fd2);
+ close(debug_fd2);
+ close(debug_fd1);
+}
+
+/**
+ * SUBTEST: basic-read-event
+ * Description:
+ * Synchronously exercise eu debugger event polling and reading.
+ */
+#define MAX_EVENT_SIZE (32 * 1024)
+static void test_read_event(int fd)
+{
+ struct drm_xe_eudebug_event *event;
+ struct xe_eudebug_debugger *d;
+ struct xe_eudebug_client *c;
+
+ event = malloc(MAX_EVENT_SIZE);
+ igt_assert(event);
+ memset(event, 0, sizeof(*event));
+
+ c = xe_eudebug_client_create(fd, run_basic_client, 0, NULL);
+ d = xe_eudebug_debugger_create(fd, 0, NULL);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
+ igt_assert_eq(poll_event(d->fd, 500), 0);
+
+ event->len = 1;
+ event->type = DRM_XE_EUDEBUG_EVENT_NONE;
+ igt_assert_eq(read_event(d->fd, event), -EINVAL);
+
+ event->len = MAX_EVENT_SIZE;
+ event->type = DRM_XE_EUDEBUG_EVENT_NONE;
+ igt_assert_eq(read_event(d->fd, event), -EINVAL);
+
+ xe_eudebug_client_start(c);
+
+ igt_assert_eq(poll_event(d->fd, 500), 1);
+ event->type = DRM_XE_EUDEBUG_EVENT_READ;
+ igt_assert_eq(read_event(d->fd, event), 0);
+
+ igt_assert_eq(poll_event(d->fd, 500), 1);
+
+ event->flags = 0;
+ event->type = DRM_XE_EUDEBUG_EVENT_READ;
+
+ event->len = 0;
+ igt_assert_eq(read_event(d->fd, event), -EINVAL);
+ igt_assert_eq(0, event->len);
+
+ event->len = sizeof(*event) - 1;
+ igt_assert_eq(read_event(d->fd, event), -EINVAL);
+
+ event->len = sizeof(*event);
+ igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
+ igt_assert_lt(sizeof(*event), event->len);
+
+ event->len = event->len - 1;
+ igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
+ /* event->len should now contain the exact len */
+ igt_assert_eq(read_event(d->fd, event), 0);
+
+ fcntl(d->fd, F_SETFL, fcntl(d->fd, F_GETFL) | O_NONBLOCK);
+ igt_assert(fcntl(d->fd, F_GETFL) & O_NONBLOCK);
+
+ igt_assert_eq(poll_event(d->fd, 500), 0);
+ event->len = MAX_EVENT_SIZE;
+ event->flags = 0;
+ event->type = DRM_XE_EUDEBUG_EVENT_READ;
+ igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
+
+ xe_eudebug_client_wait_done(c);
+ xe_eudebug_client_stop(c);
+
+ igt_assert_eq(poll_event(d->fd, 500), 0);
+ igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
+
+ xe_eudebug_debugger_destroy(d);
+ xe_eudebug_client_destroy(c);
+
+ free(event);
+}
+
+/**
+ * SUBTEST: basic-client
+ * Description:
+ * Attach the debugger to process which opens and closes xe drm client.
+ *
+ * SUBTEST: basic-client-th
+ * Description:
+ * Create client basic resources (vms) in multiple threads
+ *
+ * SUBTEST: multiple-sessions
+ * Description:
+ * Simultaneously attach many debuggers to many processes.
+ * Each process opens and closes xe drm client and creates few resources.
+ *
+ * SUBTEST: basic-%s
+ * Description:
+ * Attach the debugger to process which creates and destroys a few %arg[1].
+ *
+ * SUBTEST: basic-vm-bind
+ * Description:
+ * Attach the debugger to a process that performs synchronous vm bind
+ * and vm unbind.
+ *
+ * SUBTEST: basic-vm-bind-vm-destroy
+ * Description:
+ * Attach the debugger to a process that performs vm bind, and destroys
+ * the vm without unbinding. Make sure that we don't get unbind events.
+ *
+ * SUBTEST: basic-vm-bind-extended
+ * Description:
+ * Attach the debugger to a process that performs bind, bind array, rebind,
+ * partial unbind, unbind and unbind all operations.
+ *
+ * SUBTEST: multigpu-basic-client
+ * Description:
+ * Attach the debugger to process which opens and closes xe drm client on all Xe devices.
+ *
+ * SUBTEST: multigpu-basic-client-many
+ * Description:
+ * Simultaneously attach many debuggers to many processes on all Xe devices.
+ * Each process opens and closes xe drm client and creates few resources.
+ *
+ * arg[1]:
+ *
+ * @vms: vms
+ * @exec-queues: exec queues
+ */
+
+static void test_basic_sessions(int fd, unsigned int flags, int count, bool match_opposite)
+{
+ struct xe_eudebug_session **s;
+ int i;
+
+ s = calloc(count, sizeof(*s));
+
+ igt_assert(s);
+
+ for (i = 0; i < count; i++)
+ s[i] = xe_eudebug_session_create(fd, run_basic_client, flags, NULL);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_session_run(s[i]);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_session_check(s[i], match_opposite, 0);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_session_destroy(s[i]);
+}
+
+/**
+ * SUBTEST: basic-vm-bind-discovery
+ * Description:
+ * Attach the debugger to a process that performs vm-bind before attaching
+ * and check if the discovery process reports it.
+ *
+ * SUBTEST: basic-vm-bind-metadata-discovery
+ * Description:
+ * Attach the debugger to a process that performs vm-bind with metadata attached
+ * before attaching and check if the discovery process reports it.
+ *
+ * SUBTEST: basic-vm-bind-vm-destroy-discovery
+ * Description:
+ * Attach the debugger to a process that performs vm bind, and destroys
+ * the vm without unbinding before attaching. Make sure that we don't get
+ * any bind/unbind and vm create/destroy events.
+ *
+ * SUBTEST: basic-vm-bind-extended-discovery
+ * Description:
+ * Attach the debugger to a process that performs bind, bind array, rebind,
+ * partial unbind, and unbind all operations before attaching. Ensure that
+ * we get only a single 'VM_BIND' event from the discovery worker.
+ */
+static void test_basic_discovery(int fd, unsigned int flags, bool match_opposite)
+{
+ struct xe_eudebug_debugger *d;
+ struct xe_eudebug_session *s;
+ struct xe_eudebug_client *c;
+
+ s = xe_eudebug_session_create(fd, run_basic_client, flags | TEST_DISCOVERY, NULL);
+
+ c = s->client;
+ d = s->debugger;
+
+ xe_eudebug_client_start(c);
+ xe_eudebug_debugger_wait_stage(s, STAGE_PRE_DEBUG_RESOURCES_DONE);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
+ xe_eudebug_debugger_start_worker(d);
+
+ /* give the worker time to do its job */
+ sleep(2);
+ xe_eudebug_debugger_signal_stage(d, STAGE_DISCOVERY_DONE);
+
+ xe_eudebug_client_wait_done(c);
+
+ xe_eudebug_debugger_stop_worker(d, 1);
+
+ xe_eudebug_event_log_print(d->log, true);
+ xe_eudebug_event_log_print(c->log, true);
+
+ xe_eudebug_session_check(s, match_opposite, 0);
+ xe_eudebug_session_destroy(s);
+}
+
+#define RESOURCE_COUNT 16
+#define PRIMARY_THREAD (1 << 0)
+#define DISCOVERY_CLOSE_CLIENT (1 << 1)
+#define DISCOVERY_DESTROY_RESOURCES (1 << 2)
+#define DISCOVERY_VM_BIND (1 << 3)
+static void run_discovery_client(struct xe_eudebug_client *c)
+{
+ struct drm_xe_engine_class_instance *hwe = NULL;
+ int fd[RESOURCE_COUNT], i;
+ bool skip_sleep = c->flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT);
+ uint64_t addr = 0x1a0000;
+
+ srand(getpid());
+
+ for (i = 0; i < RESOURCE_COUNT; i++) {
+ fd[i] = xe_eudebug_client_open_driver(c);
+
+ if (!i) {
+ bool found = false;
+
+ xe_device_get(fd[0]);
+ xe_for_each_engine(fd[0], hwe) {
+ if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE ||
+ hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
+ found = true;
+ break;
+ }
+ }
+ igt_assert(found);
+ }
+
+ /*
+ * Give the debugger a break in the event stream after every other
+ * client, allowing it to read discovery events and detach quietly.
+ */
+ if (random() % 2 == 0 && !skip_sleep)
+ sleep(1);
+
+ for (int j = 0; j < RESOURCE_COUNT; j++) {
+ uint32_t vm = xe_eudebug_client_vm_create(c, fd[i],
+ DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ struct drm_xe_ext_set_property eq_ext = {
+ .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
+ .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
+ .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
+ };
+ struct drm_xe_exec_queue_create create = {
+ .width = 1,
+ .num_placements = 1,
+ .vm_id = vm,
+ .instances = to_user_pointer(hwe),
+ .extensions = to_user_pointer(&eq_ext),
+ };
+ const unsigned int bo_size = max_t(unsigned int,
+ xe_get_default_alignment(fd[i]),
+ MIN_BO_SIZE);
+ uint32_t bo = xe_bo_create(fd[i], 0, bo_size, system_memory(fd[i]), 0);
+
+ xe_eudebug_client_exec_queue_create(c, fd[i], &create);
+
+ if (c->flags & DISCOVERY_VM_BIND) {
+ xe_eudebug_client_vm_bind(c, fd[i], vm, bo, 0, addr, bo_size);
+ addr += 0x100000;
+ }
+
+ if (c->flags & DISCOVERY_DESTROY_RESOURCES) {
+ xe_eudebug_client_exec_queue_destroy(c, fd[i], &create);
+ xe_eudebug_client_vm_destroy(c, fd[i], create.vm_id);
+ gem_close(fd[i], bo);
+ }
+ }
+
+ if (c->flags & DISCOVERY_CLOSE_CLIENT)
+ xe_eudebug_client_close_driver(c, fd[i]);
+ }
+ xe_device_put(fd[0]);
+}
+
+/**
+ * SUBTEST: discovery-%s
+ * Description: Race discovery against %arg[1] and the debugger detach.
+ *
+ * arg[1]:
+ *
+ * @race: resource creation
+ * @race-vmbind: vm-bind operations
+ * @empty: resource destruction
+ * @empty-clients: client closure
+ */
+static void *discovery_race_thread(void *data)
+{
+ struct {
+ uint64_t client_handle;
+ int vm_count;
+ int exec_queue_count;
+ int vm_bind_op_count;
+ } clients[RESOURCE_COUNT];
+ struct xe_eudebug_session *s = data;
+ int expected = RESOURCE_COUNT * (1 + 2 * RESOURCE_COUNT);
+ const int tries = 100;
+ bool done = false;
+ int ret = 0;
+
+ for (int try = 0; try < tries && !done; try++) {
+ ret = xe_eudebug_debugger_attach(s->debugger, s->client);
+
+ if (ret == -EBUSY) {
+ usleep(100000);
+ continue;
+ }
+
+ igt_assert_eq(ret, 0);
+
+ if (random() % 2) {
+ struct drm_xe_eudebug_event *e = NULL;
+ int i = -1;
+
+ xe_eudebug_debugger_start_worker(s->debugger);
+ sleep(1);
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+ igt_debug("Resources discovered: %lu\n", s->debugger->event_count);
+
+ xe_eudebug_for_each_event(e, s->debugger->log) {
+ if (e->type == DRM_XE_EUDEBUG_EVENT_OPEN) {
+ struct drm_xe_eudebug_event_client *eo = (void *)e;
+
+ if (i >= 0) {
+ igt_assert_eq(clients[i].vm_count,
+ RESOURCE_COUNT);
+
+ igt_assert_eq(clients[i].exec_queue_count,
+ RESOURCE_COUNT);
+
+ if (s->client->flags & DISCOVERY_VM_BIND)
+ igt_assert_eq(clients[i].vm_bind_op_count,
+ RESOURCE_COUNT);
+ }
+
+ igt_assert(++i < RESOURCE_COUNT);
+ clients[i].client_handle = eo->client_handle;
+ clients[i].vm_count = 0;
+ clients[i].exec_queue_count = 0;
+ clients[i].vm_bind_op_count = 0;
+ }
+
+ if (e->type == DRM_XE_EUDEBUG_EVENT_VM)
+ clients[i].vm_count++;
+
+ if (e->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
+ clients[i].exec_queue_count++;
+
+ if (e->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP)
+ clients[i].vm_bind_op_count++;
+ }
+
+ igt_assert_lte(0, i);
+
+ for (int j = 0; j < i; j++)
+ for (int k = 0; k < i; k++) {
+ if (k == j)
+ continue;
+
+ igt_assert_neq(clients[j].client_handle,
+ clients[k].client_handle);
+ }
+
+ if (s->debugger->event_count >= expected)
+ done = true;
+ }
+
+ xe_eudebug_debugger_detach(s->debugger);
+ s->debugger->log->head = 0;
+ s->debugger->event_count = 0;
+ }
+
+ /* Primary thread must read everything */
+ if (s->flags & PRIMARY_THREAD) {
+ while ((ret = xe_eudebug_debugger_attach(s->debugger, s->client)) == -EBUSY)
+ usleep(100000);
+
+ igt_assert_eq(ret, 0);
+
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_wait_done(s->client);
+
+ if (READ_ONCE(s->debugger->event_count) != expected)
+ sleep(5);
+
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+ xe_eudebug_debugger_detach(s->debugger);
+ }
+
+ return NULL;
+}
+
+static void test_race_discovery(int fd, unsigned int flags, int clients)
+{
+ const int debuggers_per_client = 3;
+ int count = clients * debuggers_per_client;
+ struct xe_eudebug_session *sessions, *s;
+ struct xe_eudebug_client *c;
+ pthread_t *threads;
+ int i, j;
+
+ sessions = calloc(count, sizeof(*sessions));
+ threads = calloc(count, sizeof(*threads));
+
+ for (i = 0; i < clients; i++) {
+ c = xe_eudebug_client_create(fd, run_discovery_client, flags, NULL);
+ for (j = 0; j < debuggers_per_client; j++) {
+ s = &sessions[i * debuggers_per_client + j];
+ s->client = c;
+ s->debugger = xe_eudebug_debugger_create(fd, flags, NULL);
+ s->flags = flags | (!j ? PRIMARY_THREAD : 0);
+ }
+ }
+
+ for (i = 0; i < count; i++) {
+ if (sessions[i].flags & PRIMARY_THREAD)
+ xe_eudebug_client_start(sessions[i].client);
+
+ pthread_create(&threads[i], NULL, discovery_race_thread, &sessions[i]);
+ }
+
+ for (i = 0; i < count; i++)
+ pthread_join(threads[i], NULL);
+
+ for (i = count - 1; i > 0; i--) {
+ if (sessions[i].flags & PRIMARY_THREAD) {
+ igt_assert_eq(sessions[i].client->seqno - 1,
+ sessions[i].debugger->event_count);
+
+ xe_eudebug_event_log_compare(sessions[0].debugger->log,
+ sessions[i].debugger->log,
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND);
+
+ xe_eudebug_client_destroy(sessions[i].client);
+ }
+ xe_eudebug_debugger_destroy(sessions[i].debugger);
+ }
+
+ xe_eudebug_debugger_destroy(sessions[0].debugger);
+ xe_eudebug_client_destroy(sessions[0].client);
+ free(sessions);
+ free(threads);
+}
+
+static void *attach_dettach_thread(void *data)
+{
+ struct xe_eudebug_session *s = data;
+ const int tries = 100;
+ int ret = 0;
+
+ for (int try = 0; try < tries; try++) {
+ ret = xe_eudebug_debugger_attach(s->debugger, s->client);
+
+ if (ret == -EBUSY) {
+ usleep(100000);
+ continue;
+ }
+
+ igt_assert_eq(ret, 0);
+
+ if (random() % 2 == 0) {
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+ }
+
+ xe_eudebug_debugger_detach(s->debugger);
+ s->debugger->log->head = 0;
+ s->debugger->event_count = 0;
+ }
+
+ return NULL;
+}
+
+static void test_empty_discovery(int fd, unsigned int flags, int clients)
+{
+ struct xe_eudebug_session **s;
+ pthread_t *threads;
+ int i, expected = flags & DISCOVERY_CLOSE_CLIENT ? 0 : RESOURCE_COUNT;
+
+ igt_assert(flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT));
+
+ s = calloc(clients, sizeof(struct xe_eudebug_session *));
+ threads = calloc(clients, sizeof(*threads));
+
+ for (i = 0; i < clients; i++)
+ s[i] = xe_eudebug_session_create(fd, run_discovery_client, flags, NULL);
+
+ for (i = 0; i < clients; i++) {
+ xe_eudebug_client_start(s[i]->client);
+
+ pthread_create(&threads[i], NULL, attach_dettach_thread, s[i]);
+ }
+
+ for (i = 0; i < clients; i++)
+ pthread_join(threads[i], NULL);
+
+ for (i = 0; i < clients; i++) {
+ xe_eudebug_client_wait_done(s[i]->client);
+ igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
+
+ xe_eudebug_debugger_start_worker(s[i]->debugger);
+ xe_eudebug_debugger_stop_worker(s[i]->debugger, 5);
+ xe_eudebug_debugger_detach(s[i]->debugger);
+
+ igt_assert_eq(s[i]->debugger->event_count, expected);
+
+ xe_eudebug_session_destroy(s[i]);
+ }
+}
+
+static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
+ xe_eudebug_ack_ufence(d->fd, ef);
+}
+
+typedef void (*client_run_t)(struct xe_eudebug_client *);
+
+static void test_client_with_trigger(int fd, unsigned int flags, int count,
+ client_run_t client_fn, int type,
+ xe_eudebug_trigger_fn trigger_fn,
+ struct drm_xe_engine_class_instance *hwe,
+ bool match_opposite, uint32_t event_filter)
+{
+ struct xe_eudebug_session **s;
+ int i;
+
+ s = calloc(count, sizeof(*s));
+
+ igt_assert(s);
+
+ for (i = 0; i < count; i++)
+ s[i] = xe_eudebug_session_create(fd, client_fn, flags, hwe);
+
+ if (trigger_fn)
+ for (i = 0; i < count; i++)
+ xe_eudebug_debugger_add_trigger(s[i]->debugger, type, trigger_fn);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_session_run(s[i]);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_session_check(s[i], match_opposite, event_filter);
+
+ for (i = 0; i < count; i++)
+ xe_eudebug_session_destroy(s[i]);
+}
+
+struct thread_fn_args {
+ struct xe_eudebug_client *client;
+ int fd;
+};
+
+static void *basic_client_th(void *data)
+{
+ struct thread_fn_args *f = data;
+ struct xe_eudebug_client *c = f->client;
+ uint32_t *vms;
+ int fd, i, num_vms;
+
+ fd = f->fd;
+ igt_assert(fd);
+
+ xe_device_get(fd);
+
+ num_vms = 2 + rand() % 16;
+ vms = calloc(num_vms, sizeof(*vms));
+ igt_assert(vms);
+ igt_debug("Create %d client vms\n", num_vms);
+
+ for (i = 0; i < num_vms; i++)
+ vms[i] = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+
+ for (i = 0; i < num_vms; i++)
+ xe_eudebug_client_vm_destroy(c, fd, vms[i]);
+
+ xe_device_put(fd);
+ free(vms);
+
+ return NULL;
+}
+
+static void run_basic_client_th(struct xe_eudebug_client *c)
+{
+ struct thread_fn_args *args;
+ int i, num_threads, fd;
+ pthread_t *threads;
+
+ args = calloc(1, sizeof(*args));
+ igt_assert(args);
+
+ num_threads = 2 + random() % 16;
+ igt_debug("Run on %d threads\n", num_threads);
+ threads = calloc(num_threads, sizeof(*threads));
+ igt_assert(threads);
+
+ fd = xe_eudebug_client_open_driver(c);
+ args->client = c;
+ args->fd = fd;
+
+ for (i = 0; i < num_threads; i++)
+ pthread_create(&threads[i], NULL, basic_client_th, args);
+
+ for (i = 0; i < num_threads; i++)
+ pthread_join(threads[i], NULL);
+
+ xe_eudebug_client_close_driver(c, fd);
+ free(args);
+ free(threads);
+}
+
+static void test_basic_sessions_th(int fd, unsigned int flags, int num_clients, bool match_opposite)
+{
+ test_client_with_trigger(fd, flags, num_clients, run_basic_client_th, 0, NULL, NULL,
+ match_opposite, 0);
+}
+
+static void vm_access_client(struct xe_eudebug_client *c)
+{
+ struct drm_xe_engine_class_instance *hwe = c->ptr;
+ uint32_t bo_placement;
+ struct bind_list *bl;
+ uint32_t vm;
+ int fd, i, j;
+
+ igt_debug("Using %s\n", xe_engine_class_string(hwe->engine_class));
+
+ fd = xe_eudebug_client_open_driver(c);
+ xe_device_get(fd);
+
+ vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+
+ if (c->flags & VM_BIND_OP_MAP_USERPTR)
+ bo_placement = 0;
+ else
+ bo_placement = vram_if_possible(fd, hwe->gt_id);
+
+ for (j = 0; j < 5; j++) {
+ unsigned int target_size = MIN_BO_SIZE * (1 << j);
+
+ bl = create_bind_list(fd, bo_placement, vm, 4, target_size);
+ do_bind_list(c, bl, true);
+
+ for (i = 0; i < bl->n; i++)
+ xe_eudebug_client_wait_stage(c, bl->bind_ops[i].addr);
+
+ free_bind_list(c, bl);
+ }
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+
+ xe_device_put(fd);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+static void debugger_test_vma(struct xe_eudebug_debugger *d,
+ uint64_t client_handle,
+ uint64_t vm_handle,
+ uint64_t va_start,
+ uint64_t va_length)
+{
+ struct drm_xe_eudebug_vm_open vo = { 0, };
+ uint64_t *v1, *v2;
+ uint64_t items = va_length / sizeof(uint64_t);
+ int fd;
+ int r, i;
+
+ v1 = malloc(va_length);
+ igt_assert(v1);
+ v2 = malloc(va_length);
+ igt_assert(v2);
+
+ vo.client_handle = client_handle;
+ vo.vm_handle = vm_handle;
+
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert_lte(0, fd);
+
+ r = pread(fd, v1, va_length, va_start);
+ igt_assert_eq(r, va_length);
+
+ for (i = 0; i < items; i++)
+ igt_assert_eq(v1[i], va_start + i);
+
+ for (i = 0; i < items; i++)
+ v1[i] = va_start + i + 1;
+
+ r = pwrite(fd, v1, va_length, va_start);
+ igt_assert_eq(r, va_length);
+
+ lseek(fd, va_start, SEEK_SET);
+ r = read(fd, v2, va_length);
+ igt_assert_eq(r, va_length);
+
+ for (i = 0; i < items; i++)
+ igt_assert_eq(v1[i], v2[i]);
+
+ fsync(fd);
+
+ close(fd);
+ free(v1);
+ free(v2);
+}
+
+static void vm_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ struct drm_xe_eudebug_event_vm_bind *eb;
+
+ igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
+ eo->vm_bind_ref_seqno,
+ eo->addr,
+ eo->range);
+
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
+ igt_assert(eb);
+
+ debugger_test_vma(d, eb->client_handle, eb->vm_handle,
+ eo->addr, eo->range);
+ xe_eudebug_debugger_signal_stage(d, eo->addr);
+ }
+}
+
+/**
+ * SUBTEST: basic-vm-access
+ * Description:
+ * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
+ * vm fd, concerning many different offsets inside the vm,
+ * and many virtual addresses of the vm_bound object.
+ *
+ * SUBTEST: basic-vm-access-userptr
+ * Description:
+ * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
+ * vm fd, concerning many different offsets inside the vm,
+ * and many virtual addresses of the vm_bound object, but backed
+ * by userptr.
+ */
+static void test_vm_access(int fd, unsigned int flags, int num_clients)
+{
+ struct drm_xe_engine_class_instance *hwe;
+
+ xe_eudebug_for_each_engine(fd, hwe)
+ test_client_with_trigger(fd, flags, num_clients,
+ vm_access_client,
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
+ vm_trigger, hwe,
+ false,
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+}
+
+static void debugger_test_vma_parameters(struct xe_eudebug_debugger *d,
+ uint64_t client_handle,
+ uint64_t vm_handle,
+ uint64_t va_start,
+ uint64_t va_length)
+{
+ struct drm_xe_eudebug_vm_open vo = { 0, };
+ uint64_t *v;
+ uint64_t items = va_length / sizeof(uint64_t);
+ int fd;
+ int r, i;
+
+ v = malloc(va_length);
+ igt_assert(v);
+
+ /* Negative VM open - bad client handle */
+ vo.client_handle = client_handle + 123;
+ vo.vm_handle = vm_handle;
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert(fd < 0);
+
+ /* Negative VM open - bad vm handle */
+ vo.client_handle = client_handle;
+ vo.vm_handle = vm_handle + 123;
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert(fd < 0);
+
+ /* Positive VM open */
+ vo.client_handle = client_handle;
+ vo.vm_handle = vm_handle;
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert_lte(0, fd);
+
+ /* Negative pread - bad fd */
+ r = pread(fd + 123, v, va_length, va_start);
+ igt_assert(r < 0);
+
+ /* Negative pread - bad va_start */
+ r = pread(fd, v, va_length, 0);
+ igt_assert(r < 0);
+
+ /* Negative pread - bad va_start */
+ r = pread(fd, v, va_length, va_start - 1);
+ igt_assert(r < 0);
+
+ /* Positive pread - zero va_length */
+ r = pread(fd, v, 0, va_start);
+ igt_assert_eq(r, 0);
+
+ /* Truncated pread - request extends past end of range */
+ r = pread(fd, v, va_length + 1, va_start);
+ igt_assert_eq(r, va_length);
+
+ /* Negative pread - offset past end of range */
+ r = pread(fd, v, 1, va_start + va_length);
+ igt_assert(r < 0);
+
+ /* Positive pread - whole range */
+ r = pread(fd, v, va_length, va_start);
+ igt_assert_eq(r, va_length);
+
+ /* Positive pread */
+ r = pread(fd, v, 1, va_start + va_length - 1);
+ igt_assert_eq(r, 1);
+
+ for (i = 0; i < items; i++)
+ igt_assert_eq(v[i], va_start + i);
+
+ for (i = 0; i < items; i++)
+ v[i] = va_start + i + 1;
+
+ /* Negative pwrite - bad fd */
+ r = pwrite(fd + 123, v, va_length, va_start);
+ igt_assert(r < 0);
+
+ /* Negative pwrite - bad va_start */
+ r = pwrite(fd, v, va_length, -1);
+ igt_assert(r < 0);
+
+ /* Negative pwrite - zero va_start */
+ r = pwrite(fd, v, va_length, 0);
+ igt_assert(r < 0);
+
+ /* Truncated pwrite - request extends past end of range */
+ r = pwrite(fd, v, va_length + 1, va_start);
+ igt_assert_eq(r, va_length);
+
+ /* Positive pwrite - zero va_length */
+ r = pwrite(fd, v, 0, va_start);
+ igt_assert_eq(r, 0);
+
+ /* Positive pwrite */
+ r = pwrite(fd, v, va_length, va_start);
+ igt_assert_eq(r, va_length);
+ fsync(fd);
+
+ close(fd);
+ free(v);
+}
+
+static void vm_trigger_access_parameters(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ struct drm_xe_eudebug_event_vm_bind *eb;
+
+ igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
+ eo->vm_bind_ref_seqno,
+ eo->addr,
+ eo->range);
+
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
+ igt_assert(eb);
+
+ debugger_test_vma_parameters(d, eb->client_handle, eb->vm_handle, eo->addr,
+ eo->range);
+ xe_eudebug_debugger_signal_stage(d, eo->addr);
+ }
+}
+
+/**
+ * SUBTEST: basic-vm-access-parameters
+ * Description:
+ * Check negative scenarios of VM_OPEN ioctl and pread/pwrite usage.
+ */
+static void test_vm_access_parameters(int fd, unsigned int flags, int num_clients)
+{
+ struct drm_xe_engine_class_instance *hwe;
+
+ xe_eudebug_for_each_engine(fd, hwe)
+ test_client_with_trigger(fd, flags, num_clients,
+ vm_access_client,
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
+ vm_trigger_access_parameters, hwe,
+ false,
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+}
+
+#define PAGE_SIZE 4096
+#define MDATA_SIZE (WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM * PAGE_SIZE)
+static void metadata_access_client(struct xe_eudebug_client *c)
+{
+ const uint64_t addr = 0x1a0000;
+ struct drm_xe_vm_bind_op_ext_attach_debug *ext;
+ uint8_t *data;
+ size_t bo_size;
+ uint32_t bo, vm;
+ int fd, i;
+
+ fd = xe_eudebug_client_open_driver(c);
+ xe_device_get(fd);
+
+ bo_size = xe_get_default_alignment(fd);
+ vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ bo = xe_bo_create(fd, vm, bo_size, system_memory(fd), 0);
+
+ ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, MDATA_SIZE);
+
+ xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
+ bo_size, 0, NULL, 0, to_user_pointer(ext));
+
+ for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
+ xe_eudebug_client_wait_stage(c, i);
+
+ xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
+
+ basic_vm_bind_metadata_ext_del(fd, c, ext, data);
+
+ gem_close(fd, bo);
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+
+ xe_device_put(fd);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+static void debugger_test_metadata(struct xe_eudebug_debugger *d,
+ uint64_t client_handle,
+ uint64_t metadata_handle,
+ uint64_t type,
+ uint64_t len)
+{
+ struct drm_xe_eudebug_read_metadata rm = {
+ .client_handle = client_handle,
+ .metadata_handle = metadata_handle,
+ .size = len,
+ };
+ uint8_t *data;
+ int i;
+
+ data = malloc(len);
+ igt_assert(data);
+
+ rm.ptr = to_user_pointer(data);
+
+ igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
+
+ /* synthetic check - the test sets a different size per metadata type */
+ igt_assert_eq((type + 1) * PAGE_SIZE, rm.size);
+
+ for (i = 0; i < rm.size; i++)
+ igt_assert_eq(data[i], 0xff & (i + (i > PAGE_SIZE)));
+
+ free(data);
+}
+
+static void metadata_read_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_metadata *em = (void *)e;
+
+ /* synthetic check - the test sets a different size per metadata type */
+ igt_assert_eq((em->type + 1) * PAGE_SIZE, em->len);
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ debugger_test_metadata(d, em->client_handle, em->metadata_handle,
+ em->type, em->len);
+ xe_eudebug_debugger_signal_stage(d, em->type);
+ }
+}
+
+static void metadata_read_on_vm_bind_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_op_metadata *em = (void *)e;
+ struct drm_xe_eudebug_event_vm_bind_op *eo;
+ struct drm_xe_eudebug_event_vm_bind *eb;
+
+ /* For testing purposes the client sets metadata_cookie = type */
+
+ /*
+ * Metadata event has a reference to vm-bind-op event which has a reference
+ * to vm-bind event which contains proper client-handle.
+ */
+ eo = (struct drm_xe_eudebug_event_vm_bind_op *)
+ xe_eudebug_event_log_find_seqno(d->log, em->vm_bind_op_ref_seqno);
+ igt_assert(eo);
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
+ igt_assert(eb);
+
+ debugger_test_metadata(d,
+ eb->client_handle,
+ em->metadata_handle,
+ em->metadata_cookie,
+ MDATA_SIZE); /* max size */
+
+ xe_eudebug_debugger_signal_stage(d, em->metadata_cookie);
+}
+
+/**
+ * SUBTEST: read-metadata
+ * Description:
+ * Exercise DRM_XE_EUDEBUG_IOCTL_READ_METADATA and debug metadata create|destroy events.
+ */
+static void test_metadata_read(int fd, unsigned int flags, int num_clients)
+{
+ test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
+ DRM_XE_EUDEBUG_EVENT_METADATA, metadata_read_trigger,
+ NULL, true, 0);
+}
+
+/**
+ * SUBTEST: attach-debug-metadata
+ * Description:
+ * Read debug metadata when vm_bind has it attached.
+ */
+static void test_metadata_attach(int fd, unsigned int flags, int num_clients)
+{
+ test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA,
+ metadata_read_on_vm_bind_trigger,
+ NULL, true, 0);
+}
+
+#define STAGE_CLIENT_WAIT_ON_UFENCE_DONE 1337
+
+#define UFENCE_EVENT_COUNT_EXPECTED 4
+#define UFENCE_EVENT_COUNT_MAX 100
+
+struct ufence_bind {
+ struct drm_xe_sync f;
+ uint64_t addr;
+ uint64_t range;
+ uint64_t value;
+ struct {
+ uint64_t vm_sync;
+ } *fence_data;
+};
+
+static void client_wait_ufences(struct xe_eudebug_client *c,
+ int fd, struct ufence_bind *binds, int count)
+{
+ const int64_t default_fence_timeout_ns = 500 * NSEC_PER_MSEC;
+ int64_t timeout_ns;
+ int err;
+
+ /* Ensure that wait on unacked ufence times out */
+ for (int i = 0; i < count; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ timeout_ns = default_fence_timeout_ns;
+ err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
+ 0, &timeout_ns);
+ igt_assert_eq(err, -ETIME);
+ igt_assert_neq(b->fence_data->vm_sync, b->f.timeline_value);
+ igt_debug("wait #%d blocked on ack\n", i);
+ }
+
+ /* Wait on fence timed out, now tell the debugger to ack */
+ xe_eudebug_client_signal_stage(c, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
+
+ /* Check that ack unblocks ufence */
+ for (int i = 0; i < count; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
+ err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
+ 0, &timeout_ns);
+ igt_assert_eq(err, 0);
+ igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
+ igt_debug("wait #%d completed\n", i);
+ }
+}
+
+static struct ufence_bind *create_binds_with_ufence(int fd, int count)
+{
+ struct ufence_bind *binds;
+
+ binds = calloc(count, sizeof(*binds));
+ igt_assert(binds);
+
+ for (int i = 0; i < count; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ b->range = 0x1000;
+ b->addr = 0x100000 + b->range * i;
+ b->fence_data = aligned_alloc(xe_get_default_alignment(fd),
+ sizeof(*b->fence_data));
+ igt_assert(b->fence_data);
+ memset(b->fence_data, 0, sizeof(*b->fence_data));
+
+ b->f.type = DRM_XE_SYNC_TYPE_USER_FENCE;
+ b->f.flags = DRM_XE_SYNC_FLAG_SIGNAL;
+ b->f.addr = to_user_pointer(&b->fence_data->vm_sync);
+ b->f.timeline_value = UFENCE_EVENT_COUNT_EXPECTED + i;
+ }
+
+ return binds;
+}
+
+static void basic_ufence_client(struct xe_eudebug_client *c)
+{
+ const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
+ int fd = xe_eudebug_client_open_driver(c);
+ uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ size_t bo_size = n * xe_get_default_alignment(fd);
+ uint32_t bo = xe_bo_create(fd, 0, bo_size,
+ system_memory(fd), 0);
+ struct ufence_bind *binds = create_binds_with_ufence(fd, n);
+
+ for (int i = 0; i < n; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, b->addr, b->range, 0,
+ &b->f, 1, 0);
+ }
+
+ client_wait_ufences(c, fd, binds, n);
+
+ for (int i = 0; i < n; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
+ }
+
+ free(binds);
+ gem_close(fd, bo);
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+struct ufence_priv {
+ struct drm_xe_eudebug_event_vm_bind_ufence ufence_events[UFENCE_EVENT_COUNT_MAX];
+ uint64_t ufence_event_seqno[UFENCE_EVENT_COUNT_MAX];
+ uint64_t ufence_event_vm_addr_start[UFENCE_EVENT_COUNT_MAX];
+ uint64_t ufence_event_vm_addr_range[UFENCE_EVENT_COUNT_MAX];
+ unsigned int ufence_event_count;
+ unsigned int vm_bind_op_count;
+ pthread_mutex_t mutex;
+};
+
+static struct ufence_priv *ufence_priv_create(void)
+{
+ struct ufence_priv *priv;
+
+ priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+ igt_assert(priv != MAP_FAILED);
+ memset(priv, 0, sizeof(*priv));
+ pthread_mutex_init(&priv->mutex, NULL);
+
+ return priv;
+}
+
+static void ufence_priv_destroy(struct ufence_priv *priv)
+{
+ munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
+}
+
+static void ack_fences(struct xe_eudebug_debugger *d)
+{
+ struct ufence_priv *priv = d->ptr;
+
+ for (int i = 0; i < priv->ufence_event_count; i++)
+ xe_eudebug_ack_ufence(d->fd, &priv->ufence_events[i]);
+}
+
+static void basic_ufence_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+ struct ufence_priv *priv = d->ptr;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
+ struct drm_xe_eudebug_event_vm_bind *eb;
+
+ xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
+ igt_debug("ufence event received: %s\n", event_str);
+
+ xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
+ "surplus ufence event received: %s\n", event_str);
+ xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
+
+ pthread_mutex_lock(&priv->mutex);
+ memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
+ pthread_mutex_unlock(&priv->mutex);
+
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
+ xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
+ ef->vm_bind_ref_seqno);
+ xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
+ "vm bind event does not have ufence: %s\n", event_str);
+ }
+}
+
+static int wait_for_ufence_events(struct ufence_priv *priv, int timeout_ms)
+{
+ int ret = -ETIMEDOUT;
+
+ igt_for_milliseconds(timeout_ms) {
+ pthread_mutex_lock(&priv->mutex);
+ if (priv->ufence_event_count == UFENCE_EVENT_COUNT_EXPECTED)
+ ret = 0;
+ pthread_mutex_unlock(&priv->mutex);
+
+ if (!ret)
+ break;
+ usleep(1000);
+ }
+
+ return ret;
+}
+
+/**
+ * SUBTEST: basic-vm-bind-ufence
+ * Description:
+ * Give user fence in application and check if ufence ack works
+ */
+static void test_basic_ufence(int fd, unsigned int flags)
+{
+ struct xe_eudebug_debugger *d;
+ struct xe_eudebug_session *s;
+ struct xe_eudebug_client *c;
+ struct ufence_priv *priv;
+
+ priv = ufence_priv_create();
+ s = xe_eudebug_session_create(fd, basic_ufence_client, flags, priv);
+ c = s->client;
+ d = s->debugger;
+
+ xe_eudebug_debugger_add_trigger(d,
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ basic_ufence_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
+ xe_eudebug_debugger_start_worker(d);
+ xe_eudebug_client_start(c);
+
+ xe_eudebug_debugger_wait_stage(s, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
+ xe_eudebug_assert_f(d, wait_for_ufence_events(priv, XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * MSEC_PER_SEC) == 0,
+ "missing ufence events\n");
+ ack_fences(d);
+
+ xe_eudebug_client_wait_done(c);
+ xe_eudebug_debugger_stop_worker(d, 1);
+
+ xe_eudebug_event_log_print(d->log, true);
+ xe_eudebug_event_log_print(c->log, true);
+
+ xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+
+ xe_eudebug_session_destroy(s);
+ ufence_priv_destroy(priv);
+}
+
+struct vm_bind_clear_thread_priv {
+ struct drm_xe_engine_class_instance *hwe;
+ struct xe_eudebug_client *c;
+ pthread_t thread;
+ uint64_t region;
+ unsigned long sum;
+};
+
+struct vm_bind_clear_priv {
+ unsigned long unbind_count;
+ unsigned long bind_count;
+ unsigned long sum;
+};
+
+static struct vm_bind_clear_priv *vm_bind_clear_priv_create(void)
+{
+ struct vm_bind_clear_priv *priv;
+
+ priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+ igt_assert(priv != MAP_FAILED);
+ memset(priv, 0, sizeof(*priv));
+
+ return priv;
+}
+
+static void vm_bind_clear_priv_destroy(struct vm_bind_clear_priv *priv)
+{
+ munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
+}
+
+static void *vm_bind_clear_thread(void *data)
+{
+ const uint32_t CS_GPR0 = 0x600;
+ const size_t batch_size = 16;
+ struct drm_xe_sync uf_sync = {
+ .type = DRM_XE_SYNC_TYPE_USER_FENCE, .flags = DRM_XE_SYNC_FLAG_SIGNAL,
+ };
+ struct vm_bind_clear_thread_priv *priv = data;
+ int fd = xe_eudebug_client_open_driver(priv->c);
+ uint64_t gtt_size = 1ull << min_t(uint32_t, xe_va_bits(fd), 48);
+ uint32_t vm = xe_eudebug_client_vm_create(priv->c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ size_t bo_size = xe_bb_size(fd, batch_size);
+ unsigned long count = 0;
+ uint64_t *fence_data;
+
+ /* init uf_sync */
+ fence_data = aligned_alloc(xe_get_default_alignment(fd), sizeof(*fence_data));
+ igt_assert(fence_data);
+ uf_sync.timeline_value = 1337;
+ uf_sync.addr = to_user_pointer(fence_data);
+
+ igt_debug("Run on: %s%u\n", xe_engine_class_string(priv->hwe->engine_class),
+ priv->hwe->engine_instance);
+
+ igt_until_timeout(5) {
+ struct drm_xe_ext_set_property eq_ext = {
+ .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
+ .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
+ .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
+ };
+ struct drm_xe_exec_queue_create eq_create = { 0 };
+ uint32_t clean_bo = 0;
+ uint32_t batch_bo = 0;
+ uint64_t clean_offset, batch_offset;
+ uint32_t exec_queue;
+ uint32_t *map, *cs;
+ uint64_t delta;
+
+ /* calculate offsets (vma addresses) */
+ batch_offset = (random() * SZ_2M) & (gtt_size - 1);
+ /* XXX: for some platforms/memory regions batch offset '0' can be problematic */
+ if (batch_offset == 0)
+ batch_offset = SZ_2M;
+
+ do {
+ clean_offset = (random() * SZ_2M) & (gtt_size - 1);
+ if (clean_offset == 0)
+ clean_offset = SZ_2M;
+ } while (clean_offset == batch_offset);
+
+ batch_offset += random() % SZ_2M & -bo_size;
+ clean_offset += random() % SZ_2M & -bo_size;
+
+ delta = (random() % bo_size) & -4;
+
+ /* prepare clean bo */
+ clean_bo = xe_bo_create(fd, vm, bo_size, priv->region,
+ DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
+ memset(fence_data, 0, sizeof(*fence_data));
+ xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, clean_bo, 0, clean_offset, bo_size,
+ 0, &uf_sync, 1, 0);
+ xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
+ XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
+
+ /* prepare batch bo */
+ batch_bo = xe_bo_create(fd, vm, bo_size, priv->region,
+ DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
+ memset(fence_data, 0, sizeof(*fence_data));
+ xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, batch_bo, 0, batch_offset, bo_size,
+ 0, &uf_sync, 1, 0);
+ xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
+ XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
+
+ map = xe_bo_map(fd, batch_bo, bo_size);
+
+ cs = map;
+ *cs++ = MI_NOOP | 0xc5a3;
+ *cs++ = MI_LOAD_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
+ *cs++ = CS_GPR0;
+ *cs++ = clean_offset + delta;
+ *cs++ = (clean_offset + delta) >> 32;
+ *cs++ = MI_STORE_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
+ *cs++ = CS_GPR0;
+ *cs++ = batch_offset;
+ *cs++ = batch_offset >> 32;
+ *cs++ = MI_BATCH_BUFFER_END;
+
+ /* execute batch */
+ eq_create.width = 1;
+ eq_create.num_placements = 1;
+ eq_create.vm_id = vm;
+ eq_create.instances = to_user_pointer(priv->hwe);
+ eq_create.extensions = to_user_pointer(&eq_ext);
+ exec_queue = xe_eudebug_client_exec_queue_create(priv->c, fd, &eq_create);
+
+ memset(fence_data, 0, sizeof(*fence_data));
+ xe_exec_sync(fd, exec_queue, batch_offset, &uf_sync, 1);
+ xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
+ XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
+
+ igt_assert_eq(*map, 0);
+
+ /* cleanup */
+ xe_eudebug_client_exec_queue_destroy(priv->c, fd, &eq_create);
+ munmap(map, bo_size);
+
+ xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, batch_offset, bo_size);
+ gem_close(fd, batch_bo);
+
+ xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, clean_offset, bo_size);
+ gem_close(fd, clean_bo);
+
+ count++;
+ }
+
+ priv->sum = count;
+
+ free(fence_data);
+ xe_eudebug_client_close_driver(priv->c, fd);
+ return NULL;
+}
+
+static void vm_bind_clear_client(struct xe_eudebug_client *c)
+{
+ int fd = xe_eudebug_client_open_driver(c);
+ struct xe_device *xe_dev = xe_device_get(fd);
+ int count = xe_number_engines(fd) * xe_dev->mem_regions->num_mem_regions;
+ uint64_t memreg = all_memory_regions(fd);
+ struct vm_bind_clear_priv *priv = c->ptr;
+ int current = 0;
+ struct drm_xe_engine_class_instance *engine;
+ struct vm_bind_clear_thread_priv *threads;
+ uint64_t region;
+
+ threads = calloc(count, sizeof(*threads));
+ igt_assert(threads);
+ priv->sum = 0;
+
+ xe_for_each_mem_region(fd, memreg, region) {
+ xe_eudebug_for_each_engine(fd, engine) {
+ threads[current].c = c;
+ threads[current].hwe = engine;
+ threads[current].region = region;
+
+ pthread_create(&threads[current].thread, NULL,
+ vm_bind_clear_thread, &threads[current]);
+ current++;
+ }
+ }
+
+ for (current = 0; current < count; current++)
+ pthread_join(threads[current].thread, NULL);
+
+ xe_for_each_mem_region(fd, memreg, region) {
+ unsigned long sum = 0;
+
+ for (current = 0; current < count; current++)
+ if (threads[current].region == region)
+ sum += threads[current].sum;
+
+ igt_info("%s sampled %lu objects\n", xe_region_name(region), sum);
+ priv->sum += sum;
+ }
+
+ free(threads);
+ xe_device_put(fd);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
+ struct vm_bind_clear_priv *priv = d->ptr;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ if (random() & 1) {
+ struct drm_xe_eudebug_vm_open vo = { 0, };
+ uint32_t v = 0xc1c1c1c1;
+
+ struct drm_xe_eudebug_event_vm_bind *eb;
+ int fd, delta, r;
+
+ igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
+ eo->vm_bind_ref_seqno, eo->addr, eo->range);
+
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
+ igt_assert(eb);
+
+ vo.client_handle = eb->client_handle;
+ vo.vm_handle = eb->vm_handle;
+
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert_lte(0, fd);
+
+ delta = (random() % eo->range) & -4;
+ r = pread(fd, &v, sizeof(v), eo->addr + delta);
+ igt_assert_eq(r, sizeof(v));
+ igt_assert_eq_u32(v, 0);
+
+ close(fd);
+ }
+ priv->bind_count++;
+ }
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
+ priv->unbind_count++;
+}
+
+static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+
+ xe_eudebug_ack_ufence(d->fd, ef);
+}
+
+/**
+ * SUBTEST: vm-bind-clear
+ * Description:
+ * Check that fresh buffers we vm_bind into the ppGTT are always clear.
+ */
+static void test_vm_bind_clear(int fd)
+{
+ struct vm_bind_clear_priv *priv;
+ struct xe_eudebug_session *s;
+
+ priv = vm_bind_clear_priv_create();
+ s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
+ vm_bind_clear_test_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ vm_bind_clear_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_start(s->client);
+
+ xe_eudebug_client_wait_done(s->client);
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ igt_assert_eq(priv->bind_count, priv->unbind_count);
+ igt_assert_eq(priv->sum * 2, priv->bind_count);
+
+ xe_eudebug_session_destroy(s);
+ vm_bind_clear_priv_destroy(priv);
+}
+
+#define UFENCE_CLIENT_VM_TEST_VAL_START 0xaaaaaaaa
+#define UFENCE_CLIENT_VM_TEST_VAL_END 0xbbbbbbbb
+
+static void vma_ufence_client(struct xe_eudebug_client *c)
+{
+ const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
+ int fd = xe_eudebug_client_open_driver(c);
+ struct ufence_bind *binds = create_binds_with_ufence(fd, n);
+ uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ size_t bo_size = xe_get_default_alignment(fd);
+ uint64_t items = bo_size / sizeof(uint32_t);
+ uint32_t bo[UFENCE_EVENT_COUNT_EXPECTED];
+ uint32_t *ptr[UFENCE_EVENT_COUNT_EXPECTED];
+
+ for (int i = 0; i < n; i++) {
+ bo[i] = xe_bo_create(fd, 0, bo_size,
+ system_memory(fd), 0);
+ ptr[i] = xe_bo_map(fd, bo[i], bo_size);
+ igt_assert(ptr[i]);
+ memset(ptr[i], UFENCE_CLIENT_VM_TEST_VAL_START, bo_size);
+ }
+
+ for (int i = 0; i < n; i++)
+ for (int j = 0; j < items; j++)
+ igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_START);
+
+ for (int i = 0; i < n; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ xe_eudebug_client_vm_bind_flags(c, fd, vm, bo[i], 0, b->addr, b->range, 0,
+ &b->f, 1, 0);
+ }
+
+ /* Wait for acks on ufences */
+ for (int i = 0; i < n; i++) {
+ int err;
+ int64_t timeout_ns;
+ struct ufence_bind *b = &binds[i];
+
+ timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
+ err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
+ 0, &timeout_ns);
+ igt_assert_eq(err, 0);
+ igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
+ igt_debug("wait #%d completed\n", i);
+
+ for (int j = 0; j < items; j++)
+ igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_END);
+ }
+
+ for (int i = 0; i < n; i++) {
+ struct ufence_bind *b = &binds[i];
+
+ xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
+ }
+
+ free(binds);
+
+ for (int i = 0; i < n; i++) {
+ munmap(ptr[i], bo_size);
+ gem_close(fd, bo[i]);
+ }
+
+ xe_eudebug_client_vm_destroy(c, fd, vm);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+static void debugger_test_vma_ufence(struct xe_eudebug_debugger *d,
+ uint64_t client_handle,
+ uint64_t vm_handle,
+ uint64_t va_start,
+ uint64_t va_length)
+{
+ struct drm_xe_eudebug_vm_open vo = { 0, };
+ uint32_t *v1, *v2;
+ uint32_t items = va_length / sizeof(uint32_t);
+ int fd;
+ int r, i;
+
+ v1 = malloc(va_length);
+ igt_assert(v1);
+ v2 = malloc(va_length);
+ igt_assert(v2);
+
+ vo.client_handle = client_handle;
+ vo.vm_handle = vm_handle;
+
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert_lte(0, fd);
+
+ r = pread(fd, v1, va_length, va_start);
+ igt_assert_eq(r, va_length);
+
+ for (i = 0; i < items; i++)
+ igt_assert_eq(v1[i], UFENCE_CLIENT_VM_TEST_VAL_START);
+
+ memset(v1, UFENCE_CLIENT_VM_TEST_VAL_END, va_length);
+
+ r = pwrite(fd, v1, va_length, va_start);
+ igt_assert_eq(r, va_length);
+
+ lseek(fd, va_start, SEEK_SET);
+ r = read(fd, v2, va_length);
+ igt_assert_eq(r, va_length);
+
+ for (i = 0; i < items; i++)
+ igt_assert_eq_u64(v1[i], v2[i]);
+
+ fsync(fd);
+
+ close(fd);
+ free(v1);
+ free(v2);
+}
+
+static void vma_ufence_op_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
+ struct ufence_priv *priv = d->ptr;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
+ struct drm_xe_eudebug_event_vm_bind *eb;
+ unsigned int op_count = priv->vm_bind_op_count++;
+
+ xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
+ igt_debug("vm bind op event: ref %lld, addr 0x%llx, range 0x%llx, op_count %u\n",
+ eo->vm_bind_ref_seqno,
+ eo->addr,
+ eo->range,
+ op_count);
+ igt_debug("vm bind op event received: %s\n", event_str);
+ xe_eudebug_assert(d, eo->vm_bind_ref_seqno);
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
+
+ xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
+ eo->vm_bind_ref_seqno);
+ xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
+ "vm bind event does not have ufence: %s\n", event_str);
+
+ priv->ufence_event_seqno[op_count] = eo->vm_bind_ref_seqno;
+ priv->ufence_event_vm_addr_start[op_count] = eo->addr;
+ priv->ufence_event_vm_addr_range[op_count] = eo->range;
+ }
+}
+
+static void vma_ufence_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+ struct ufence_priv *priv = d->ptr;
+ unsigned int ufence_count = priv->ufence_event_count;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
+ struct drm_xe_eudebug_event_vm_bind *eb;
+ uint64_t addr = priv->ufence_event_vm_addr_start[ufence_count];
+ uint64_t range = priv->ufence_event_vm_addr_range[ufence_count];
+
+ xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
+ igt_debug("ufence event received: %s\n", event_str);
+
+ xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
+ "surplus ufence event received: %s\n", event_str);
+ xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
+
+ memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
+
+ eb = (struct drm_xe_eudebug_event_vm_bind *)
+ xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
+ xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
+ ef->vm_bind_ref_seqno);
+ xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
+ "vm bind event does not have ufence: %s\n", event_str);
+ igt_debug("vm bind ufence event received with ref %lld, addr 0x%lx, range 0x%lx\n",
+ ef->vm_bind_ref_seqno,
+ addr,
+ range);
+ debugger_test_vma_ufence(d, eb->client_handle, eb->vm_handle,
+ addr, range);
+
+ xe_eudebug_ack_ufence(d->fd, ef);
+ }
+}
+
+/**
+ * SUBTEST: vma-ufence
+ * Description:
+ * Intercept a vm bind after receiving the ufence event, then access the target vm and write to it.
+ * Then check on the client side that the write was successful.
+ */
+static void test_vma_ufence(int fd, unsigned int flags)
+{
+ struct xe_eudebug_session *s;
+ struct ufence_priv *priv;
+
+ priv = ufence_priv_create();
+ s = xe_eudebug_session_create(fd, vma_ufence_client, flags, priv);
+
+ xe_eudebug_debugger_add_trigger(s->debugger,
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
+ vma_ufence_op_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger,
+ DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ vma_ufence_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_start(s->client);
+
+ xe_eudebug_client_wait_done(s->client);
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ xe_eudebug_event_log_print(s->debugger->log, true);
+ xe_eudebug_event_log_print(s->client->log, true);
+
+ xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+
+ xe_eudebug_session_destroy(s);
+ ufence_priv_destroy(priv);
+}
+
+igt_main
+{
+ bool was_enabled;
+ bool *multigpu_was_enabled;
+ int fd, gpu_count;
+
+ igt_fixture {
+ fd = drm_open_driver(DRIVER_XE);
+ was_enabled = xe_eudebug_enable(fd, true);
+ }
+
+ igt_subtest("sysfs-toggle")
+ test_sysfs_toggle(fd);
+
+ igt_subtest("basic-connect")
+ test_connect(fd);
+
+ igt_subtest("connect-user")
+ test_connect_user(fd);
+
+ igt_subtest("basic-close")
+ test_close(fd);
+
+ igt_subtest("basic-read-event")
+ test_read_event(fd);
+
+ igt_subtest("basic-client")
+ test_basic_sessions(fd, 0, 1, true);
+
+ igt_subtest("basic-client-th")
+ test_basic_sessions_th(fd, 0, 1, true);
+
+ igt_subtest("basic-vm-access")
+ test_vm_access(fd, 0, 1);
+
+ igt_subtest("basic-vm-access-userptr")
+ test_vm_access(fd, VM_BIND_OP_MAP_USERPTR, 1);
+
+ igt_subtest("basic-vm-access-parameters")
+ test_vm_access_parameters(fd, 0, 1);
+
+ igt_subtest("multiple-sessions")
+ test_basic_sessions(fd, CREATE_VMS | CREATE_EXEC_QUEUES, 4, true);
+
+ igt_subtest("basic-vms")
+ test_basic_sessions(fd, CREATE_VMS, 1, true);
+
+ igt_subtest("basic-exec-queues")
+ test_basic_sessions(fd, CREATE_EXEC_QUEUES, 1, true);
+
+ igt_subtest("basic-vm-bind")
+ test_basic_sessions(fd, VM_BIND, 1, true);
+
+ igt_subtest("basic-vm-bind-ufence")
+ test_basic_ufence(fd, 0);
+
+ igt_subtest("vma-ufence")
+ test_vma_ufence(fd, 0);
+
+ igt_subtest("vm-bind-clear")
+ test_vm_bind_clear(fd);
+
+ igt_subtest("basic-vm-bind-discovery")
+ test_basic_discovery(fd, VM_BIND, true);
+
+ igt_subtest("basic-vm-bind-metadata-discovery")
+ test_basic_discovery(fd, VM_BIND_METADATA, true);
+
+ igt_subtest("basic-vm-bind-vm-destroy")
+ test_basic_sessions(fd, VM_BIND_VM_DESTROY, 1, false);
+
+ igt_subtest("basic-vm-bind-vm-destroy-discovery")
+ test_basic_discovery(fd, VM_BIND_VM_DESTROY, false);
+
+ igt_subtest("basic-vm-bind-extended")
+ test_basic_sessions(fd, VM_BIND_EXTENDED, 1, true);
+
+ igt_subtest("basic-vm-bind-extended-discovery")
+ test_basic_discovery(fd, VM_BIND_EXTENDED, true);
+
+ igt_subtest("read-metadata")
+ test_metadata_read(fd, 0, 1);
+
+ igt_subtest("attach-debug-metadata")
+ test_metadata_attach(fd, 0, 1);
+
+ igt_subtest("discovery-race")
+ test_race_discovery(fd, 0, 4);
+
+ igt_subtest("discovery-race-vmbind")
+ test_race_discovery(fd, DISCOVERY_VM_BIND, 4);
+
+ igt_subtest("discovery-empty")
+ test_empty_discovery(fd, DISCOVERY_CLOSE_CLIENT, 16);
+
+ igt_subtest("discovery-empty-clients")
+ test_empty_discovery(fd, DISCOVERY_DESTROY_RESOURCES, 16);
+
+ igt_fixture {
+ xe_eudebug_enable(fd, was_enabled);
+ drm_close_driver(fd);
+ }
+
+ igt_subtest_group {
+ igt_fixture {
+ gpu_count = drm_prepare_filtered_multigpu(DRIVER_XE);
+ igt_require(gpu_count >= 2);
+
+ multigpu_was_enabled = malloc(gpu_count * sizeof(bool));
+ igt_assert(multigpu_was_enabled);
+ for (int i = 0; i < gpu_count; i++) {
+ fd = drm_open_filtered_card(i);
+ multigpu_was_enabled[i] = xe_eudebug_enable(fd, true);
+ close(fd);
+ }
+ }
+
+ igt_subtest("multigpu-basic-client") {
+ igt_multi_fork(child, gpu_count) {
+ fd = drm_open_filtered_card(child);
+ igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
+ child, errno);
+ igt_assert(is_xe_device(fd));
+
+ test_basic_sessions(fd, 0, 1, true);
+ close(fd);
+ }
+ igt_waitchildren();
+ }
+
+ igt_subtest("multigpu-basic-client-many") {
+ igt_multi_fork(child, gpu_count) {
+ fd = drm_open_filtered_card(child);
+ igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
+ child, errno);
+ igt_assert(is_xe_device(fd));
+
+ test_basic_sessions(fd, 0, 4, true);
+ close(fd);
+ }
+ igt_waitchildren();
+ }
+
+ igt_fixture {
+ for (int i = 0; i < gpu_count; i++) {
+ fd = drm_open_filtered_card(i);
+ xe_eudebug_enable(fd, multigpu_was_enabled[i]);
+ close(fd);
+ }
+ free(multigpu_was_enabled);
+ }
+ }
+}
diff --git a/tests/meson.build b/tests/meson.build
index 00556c9d6..0f996fdc8 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -318,6 +318,14 @@ intel_xe_progs = [
'xe_sysfs_scheduler',
]
+intel_xe_eudebug_progs = [
+ 'xe_eudebug',
+]
+
+if build_xe_eudebug
+ intel_xe_progs += intel_xe_eudebug_progs
+endif
+
chamelium_progs = [
'kms_chamelium_audio',
'kms_chamelium_color',
--
2.34.1
* Re: [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-05 9:28 ` [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation Christoph Manszewski
@ 2024-09-06 14:46 ` Kamil Konieczny
2024-09-09 10:34 ` Zbigniew Kempczyński
2024-09-12 8:04 ` Zbigniew Kempczyński
1 sibling, 1 reply; 50+ messages in thread
From: Kamil Konieczny @ 2024-09-06 14:46 UTC (permalink / raw)
To: igt-dev
Cc: Christoph Manszewski, Zbigniew Kempczyński,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Karolina Stolarek, Jonathan Cavitt
Hi Christoph,
On 2024-09-05 at 11:28:08 +0200, Christoph Manszewski wrote:
> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>
> For typical debugging under gdb one can identify two main use cases:
> accessing and manipulating resources created by the application and
> manipulating thread execution (interrupting and setting breakpoints).
>
> This test adds coverage for the former by checking that:
> - the debugger reports the expected events for Xe resources created
> by the debugged client,
> - the debugger is able to read and write the vm of the debugged client.
>
> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> ---
> docs/testplan/meson.build | 13 +-
> meson_options.txt | 2 +-
> tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
> tests/meson.build | 8 +
> 4 files changed, 2737 insertions(+), 2 deletions(-)
> create mode 100644 tests/intel/xe_eudebug.c
>
> diff --git a/docs/testplan/meson.build b/docs/testplan/meson.build
> index 5560347f1..e86af028e 100644
> --- a/docs/testplan/meson.build
> +++ b/docs/testplan/meson.build
> @@ -33,11 +33,22 @@ else
> doc_dependencies = []
> endif
>
> +xe_excluded_tests = []
> +if not build_xe_eudebug
> + foreach test : intel_xe_eudebug_progs
> + xe_excluded_tests += meson.current_source_dir() + '/../../tests/intel/' + test + '.c'
> + endforeach
> +endif
> +
> +if xe_excluded_tests.length() > 0
> + xe_excluded_tests = ['--exclude-files'] + xe_excluded_tests
> +endif
> +
> if build_xe
> test_dict = {
> 'i915_tests': { 'input': i915_test_config, 'extra_args': check_testlist },
> 'kms_tests': { 'input': kms_test_config, 'extra_args': kms_check_testlist },
> - 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist }
> + 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist + xe_excluded_tests }
> }
Do we need this change? Can we just add igt@.*eudebug.* to the blocklist
and have it compile-tested?
Regards,
Kamil
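
For reference, the runtime-blocklist alternative suggested above would look roughly like this. This is only a sketch: the filename and directory paths are illustrative, and it assumes igt_runner's -b/--blacklist option, which takes a file with one test-name regex per line (the runner invocation is left as a comment since the build and results paths depend on the local setup):

```shell
# Create a blocklist file with one regex per line; igt_runner skips any
# subtest whose full name (igt@<binary>@<subtest>) matches an entry.
cat > eudebug.blocklist <<'EOF'
igt@xe_eudebug(@.*)?
EOF

# Hypothetical invocation, paths are placeholders:
# igt_runner -b eudebug.blocklist <build-dir>/tests <results-dir>

# Show the pattern that would be excluded at runtime.
cat eudebug.blocklist
```

This keeps the eudebug tests building (and thus compile-tested in CI) while excluding them from execution, instead of excluding them at meson configure time.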
> else
> test_dict = {
> diff --git a/meson_options.txt b/meson_options.txt
> index 11922523b..c410f9b77 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -45,7 +45,7 @@ option('xe_driver',
> option('xe_eudebug',
> type : 'feature',
> value : 'disabled',
> - description : 'Build library for Xe EU debugger')
> + description : 'Build library and tests for Xe EU debugger')
>
> option('libdrm_drivers',
> type : 'array',
> diff --git a/tests/intel/xe_eudebug.c b/tests/intel/xe_eudebug.c
> new file mode 100644
> index 000000000..fd2894a5e
> --- /dev/null
> +++ b/tests/intel/xe_eudebug.c
> @@ -0,0 +1,2716 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +/**
> + * TEST: Test EU Debugger functionality
> + * Category: Core
> + * Mega feature: EUdebug
> + * Sub-category: EUdebug tests
> + * Functionality: eu debugger framework
> + * Test category: functionality test
> + */
> +
> +#include <grp.h>
> +#include <poll.h>
> +#include <pthread.h>
> +#include <pwd.h>
> +#include <sys/ioctl.h>
> +#include <sys/prctl.h>
> +
> +#include "igt.h"
> +#include "intel_pat.h"
> +#include "lib/igt_syncobj.h"
> +#include "xe/xe_eudebug.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +
> +/**
> + * SUBTEST: sysfs-toggle
> + * Description:
> + * Exercise the debugger enable/disable sysfs toggle logic
> + */
> +static void test_sysfs_toggle(int fd)
> +{
> + xe_eudebug_enable(fd, false);
> + igt_assert(!xe_eudebug_debugger_available(fd));
> +
> + xe_eudebug_enable(fd, true);
> + igt_assert(xe_eudebug_debugger_available(fd));
> + xe_eudebug_enable(fd, true);
> + igt_assert(xe_eudebug_debugger_available(fd));
> +
> + xe_eudebug_enable(fd, false);
> + igt_assert(!xe_eudebug_debugger_available(fd));
> + xe_eudebug_enable(fd, false);
> + igt_assert(!xe_eudebug_debugger_available(fd));
> +
> + xe_eudebug_enable(fd, true);
> + igt_assert(xe_eudebug_debugger_available(fd));
> +}
> +
> +#define STAGE_PRE_DEBUG_RESOURCES_DONE 1
> +#define STAGE_DISCOVERY_DONE 2
> +
> +#define CREATE_VMS (1 << 0)
> +#define CREATE_EXEC_QUEUES (1 << 1)
> +#define VM_BIND (1 << 2)
> +#define VM_BIND_VM_DESTROY (1 << 3)
> +#define VM_BIND_EXTENDED (1 << 4)
> +#define VM_METADATA (1 << 5)
> +#define VM_BIND_METADATA (1 << 6)
> +#define VM_BIND_OP_MAP_USERPTR (1 << 7)
> +#define TEST_DISCOVERY (1 << 31)
> +
> +#define PAGE_SIZE 4096
> +static struct drm_xe_vm_bind_op_ext_attach_debug *
> +basic_vm_bind_metadata_ext_prepare(int fd, struct xe_eudebug_client *c,
> + uint8_t **data, uint32_t data_size)
> +{
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
> + int i;
> +
> + *data = calloc(data_size, sizeof(*data));
> + igt_assert(*data);
> +
> + for (i = 0; i < data_size; i++)
> + (*data)[i] = 0xff & (i + (i > PAGE_SIZE));
> +
> + ext = calloc(WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM, sizeof(*ext));
> + igt_assert(ext);
> +
> + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++) {
> + ext[i].base.name = XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG;
> + ext[i].metadata_id = xe_eudebug_client_metadata_create(c, fd, i,
> + (i + 1) * PAGE_SIZE, *data);
> + ext[i].cookie = i;
> +
> + if (i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM - 1)
> + ext[i].base.next_extension = to_user_pointer(&ext[i + 1]);
> + }
> + return ext;
> +}
> +
> +static void basic_vm_bind_metadata_ext_del(int fd, struct xe_eudebug_client *c,
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext,
> + uint8_t *data)
> +{
> + for (int i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
> + xe_eudebug_client_metadata_destroy(c, fd, ext[i].metadata_id, i,
> + (i + 1) * PAGE_SIZE);
> + free(ext);
> + free(data);
> +}
> +
> +static void basic_vm_bind_client(int fd, struct xe_eudebug_client *c)
> +{
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext = NULL;
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_get_default_alignment(fd);
> + bool test_discovery = c->flags & TEST_DISCOVERY;
> + bool test_metadata = c->flags & VM_BIND_METADATA;
> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + uint64_t addr = 0x1a0000;
> + uint8_t *data = NULL;
> +
> + if (test_metadata)
> + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, PAGE_SIZE);
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
> + bo_size, 0, NULL, 0, to_user_pointer(ext));
> +
> + if (test_discovery) {
> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> + }
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
> +
> + if (test_metadata)
> + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
> +
> + gem_close(fd, bo);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +}
> +
> +static void basic_vm_bind_vm_destroy_client(int fd, struct xe_eudebug_client *c)
> +{
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_get_default_alignment(fd);
> + bool test_discovery = c->flags & TEST_DISCOVERY;
> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + uint64_t addr = 0x1a0000;
> +
> + if (test_discovery) {
> + vm = xe_vm_create(fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + xe_vm_bind_async(fd, vm, 0, bo, 0, addr, bo_size, NULL, 0);
> +
> + xe_vm_destroy(fd, vm);
> +
> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> + } else {
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + xe_eudebug_client_vm_bind(c, fd, vm, bo, 0, addr, bo_size);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + }
> +
> + gem_close(fd, bo);
> +}
> +
> +#define BO_ADDR 0x1a0000
> +#define BO_ITEMS 4096
> +#define MIN_BO_SIZE (BO_ITEMS * sizeof(uint64_t))
> +
> +union buf_id {
> + uint32_t fd;
> + void *userptr;
> +};
> +
> +struct bind_list {
> + int fd;
> + uint32_t vm;
> + union buf_id *bo;
> + struct drm_xe_vm_bind_op *bind_ops;
> + unsigned int n;
> +};
> +
> +static void *bo_get_ptr(int fd, struct drm_xe_vm_bind_op *o)
> +{
> + void *ptr;
> +
> + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
> + ptr = xe_bo_map(fd, o->obj, o->range);
> + else
> + ptr = (void *)(uintptr_t)o->userptr;
> +
> + igt_assert(ptr);
> +
> + return ptr;
> +}
> +
> +static void bo_put_ptr(int fd, struct drm_xe_vm_bind_op *o, void *ptr)
> +{
> + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
> + munmap(ptr, o->range);
> +}
> +
> +static void bo_prime(int fd, struct drm_xe_vm_bind_op *o)
> +{
> + uint64_t *d;
> + uint64_t i;
> +
> + d = bo_get_ptr(fd, o);
> +
> + for (i = 0; i < o->range / sizeof(*d); i++)
> + d[i] = o->addr + i;
> +
> + bo_put_ptr(fd, o, d);
> +}
> +
> +static void bo_check(int fd, struct drm_xe_vm_bind_op *o)
> +{
> + uint64_t *d;
> + uint64_t i;
> +
> + d = bo_get_ptr(fd, o);
> +
> + for (i = 0; i < o->range / sizeof(*d); i++)
> + igt_assert_eq(d[i], o->addr + i + 1);
> +
> + bo_put_ptr(fd, o, d);
> +}
> +
> +static union buf_id *vm_create_objects(int fd, uint32_t bo_placement, uint32_t vm,
> + unsigned int size, unsigned int n)
> +{
> + union buf_id *bo;
> + unsigned int i;
> +
> + bo = calloc(n, sizeof(*bo));
> + igt_assert(bo);
> +
> + for (i = 0; i < n; i++) {
> + if (bo_placement) {
> + bo[i].fd = xe_bo_create(fd, vm, size, bo_placement, 0);
> + igt_assert(bo[i].fd);
> + } else {
> + bo[i].userptr = aligned_alloc(PAGE_SIZE, size);
> + igt_assert(bo[i].userptr);
> + }
> + }
> +
> + return bo;
> +}
> +
> +static struct bind_list *create_bind_list(int fd, uint32_t bo_placement,
> + uint32_t vm, unsigned int n,
> + unsigned int target_size)
> +{
> + unsigned int i = target_size ?: MIN_BO_SIZE;
> + const unsigned int bo_size = max_t(bo_size, xe_get_default_alignment(fd), i);
> + bool is_userptr = !bo_placement;
> + struct bind_list *bl;
> +
> + bl = malloc(sizeof(*bl));
> + bl->fd = fd;
> + bl->vm = vm;
> + bl->bo = vm_create_objects(fd, bo_placement, vm, bo_size, n);
> + bl->n = n;
> + bl->bind_ops = calloc(n, sizeof(*bl->bind_ops));
> + igt_assert(bl->bind_ops);
> +
> + for (i = 0; i < n; i++) {
> + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
> +
> + if (is_userptr) {
> + o->obj = 0;
> + o->userptr = (uintptr_t)bl->bo[i].userptr;
> + o->op = DRM_XE_VM_BIND_OP_MAP_USERPTR;
> + } else {
> + o->obj = bl->bo[i].fd;
> + o->obj_offset = 0;
> + o->op = DRM_XE_VM_BIND_OP_MAP;
> + }
> +
> + o->range = bo_size;
> + o->addr = BO_ADDR + 2 * i * bo_size;
> + o->flags = 0;
> + o->pat_index = intel_get_pat_idx_wb(fd);
> + o->prefetch_mem_region_instance = 0;
> + o->reserved[0] = 0;
> + o->reserved[1] = 0;
> + }
> +
> + for (i = 0; i < bl->n; i++) {
> + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
> +
> + igt_debug("bo %d: addr 0x%llx, range 0x%llx\n", i, o->addr, o->range);
> + bo_prime(fd, o);
> + }
> +
> + return bl;
> +}
> +
> +static void do_bind_list(struct xe_eudebug_client *c,
> + struct bind_list *bl, bool sync)
> +{
> + struct drm_xe_sync uf_sync = {
> + .type = DRM_XE_SYNC_TYPE_USER_FENCE,
> + .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> + .timeline_value = 1337,
> + };
> + uint64_t ref_seqno = 0, op_ref_seqno = 0;
> + uint64_t *fence_data;
> + int i;
> +
> + if (sync) {
> + fence_data = aligned_alloc(xe_get_default_alignment(bl->fd),
> + sizeof(*fence_data));
> + igt_assert(fence_data);
> + uf_sync.addr = to_user_pointer(fence_data);
> + memset(fence_data, 0, sizeof(*fence_data));
> + }
> +
> + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, &uf_sync, sync ? 1 : 0);
> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
> + bl->fd, bl->vm, 0, bl->n, &ref_seqno);
> + for (i = 0; i < bl->n; i++)
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
> + ref_seqno,
> + &op_ref_seqno,
> + bl->bind_ops[i].addr,
> + bl->bind_ops[i].range,
> + 0);
> +
> + if (sync) {
> + xe_wait_ufence(bl->fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> + free(fence_data);
> + }
> +}
> +
> +static void free_bind_list(struct xe_eudebug_client *c, struct bind_list *bl)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < bl->n; i++) {
> + igt_debug("%d: checking 0x%llx (%lld)\n",
> + i, bl->bind_ops[i].addr, bl->bind_ops[i].addr);
> + bo_check(bl->fd, &bl->bind_ops[i]);
> + if (bl->bind_ops[i].op == DRM_XE_VM_BIND_OP_MAP_USERPTR)
> + free(bl->bo[i].userptr);
> + xe_eudebug_client_vm_unbind(c, bl->fd, bl->vm, 0,
> + bl->bind_ops[i].addr,
> + bl->bind_ops[i].range);
> + }
> +
> + free(bl->bind_ops);
> + free(bl->bo);
> + free(bl);
> +}
> +
> +static void vm_bind_client(int fd, struct xe_eudebug_client *c)
> +{
> + uint64_t op_ref_seqno, ref_seqno;
> + struct bind_list *bl;
> + bool test_discovery = c->flags & TEST_DISCOVERY;
> + size_t bo_size = 3 * xe_get_default_alignment(fd);
> + uint32_t bo[2] = {
> + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
> + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
> + };
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + uint64_t addr[] = {0x2a0000, 0x3a0000};
> + uint64_t rebind_bo_offset = 2 * bo_size / 3;
> + uint64_t size = bo_size / 3;
> + int i = 0;
> +
> + if (test_discovery) {
> + xe_vm_bind_async(fd, vm, 0, bo[0], 0, addr[0], bo_size, NULL, 0);
> +
> + xe_vm_unbind_async(fd, vm, 0, 0, addr[0] + size, size, NULL, 0);
> +
> + xe_vm_bind_async(fd, vm, 0, bo[1], 0, addr[1], bo_size, NULL, 0);
> +
> + xe_vm_bind_async(fd, vm, 0, bo[1], rebind_bo_offset, addr[1], size, NULL, 0);
> +
> + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
> + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, NULL, 0);
> +
> + xe_vm_unbind_all_async(fd, vm, 0, bo[0], NULL, 0);
> +
> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
> + bl->fd, bl->vm, 0, bl->n + 2, &ref_seqno);
> +
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> + &op_ref_seqno, addr[1], size, 0);
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> + &op_ref_seqno, addr[1] + size, size * 2, 0);
> +
> + for (i = 0; i < bl->n; i++)
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
> + ref_seqno, &op_ref_seqno,
> + bl->bind_ops[i].addr,
> + bl->bind_ops[i].range, 0);
> +
> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> + } else {
> + xe_eudebug_client_vm_bind(c, fd, vm, bo[0], 0, addr[0], bo_size);
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr[0] + size, size);
> +
> + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], 0, addr[1], bo_size);
> + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], rebind_bo_offset, addr[1], size);
> +
> + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
> + do_bind_list(c, bl, false);
> + }
> +
> + xe_vm_unbind_all_async(fd, vm, 0, bo[1], NULL, 0);
> +
> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, fd, vm, 0,
> + 1, &ref_seqno);
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, ref_seqno,
> + &op_ref_seqno, 0, 0, 0);
> +
> + gem_close(fd, bo[0]);
> + gem_close(fd, bo[1]);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +}
> +
> +static void run_basic_client(struct xe_eudebug_client *c)
> +{
> + int fd, i;
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
> +
> + if (c->flags & CREATE_VMS) {
> + const uint32_t flags[] = {
> + DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | DRM_XE_VM_CREATE_FLAG_LR_MODE,
> + DRM_XE_VM_CREATE_FLAG_LR_MODE,
> + };
> + uint32_t vms[ARRAY_SIZE(flags)];
> +
> + for (i = 0; i < ARRAY_SIZE(flags); i++)
> + vms[i] = xe_eudebug_client_vm_create(c, fd, flags[i], 0);
> +
> + for (i--; i >= 0; i--)
> + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
> + }
> +
> + if (c->flags & CREATE_EXEC_QUEUES) {
> + struct drm_xe_exec_queue_create *create;
> + struct drm_xe_engine_class_instance *hwe;
> + struct drm_xe_ext_set_property eq_ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + uint32_t vm;
> +
> + create = calloc(xe_number_engines(fd), sizeof(*create));
> +
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + i = 0;
> + xe_eudebug_for_each_engine(fd, hwe) {
> + create[i].instances = to_user_pointer(hwe);
> + create[i].vm_id = vm;
> + create[i].width = 1;
> + create[i].num_placements = 1;
> + create[i].extensions = to_user_pointer(&eq_ext);
> + xe_eudebug_client_exec_queue_create(c, fd, &create[i++]);
> + }
> +
> + while (--i >= 0)
> + xe_eudebug_client_exec_queue_destroy(c, fd, &create[i]);
> +
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + }
> +
> + if (c->flags & VM_BIND || c->flags & VM_BIND_METADATA)
> + basic_vm_bind_client(fd, c);
> +
> + if (c->flags & VM_BIND_EXTENDED)
> + vm_bind_client(fd, c);
> +
> + if (c->flags & VM_BIND_VM_DESTROY)
> + basic_vm_bind_vm_destroy_client(fd, c);
> +
> + xe_device_put(fd);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static int read_event(int debugfd, struct drm_xe_eudebug_event *event)
> +{
> + int ret;
> +
> + ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
> + if (ret < 0)
> + return -errno;
> +
> + return ret;
> +}
> +
> +static int __read_event(int debugfd, struct drm_xe_eudebug_event *event)
> +{
> + int ret;
> +
> + ret = ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
> + if (ret < 0)
> + return -errno;
> +
> + return ret;
> +}
> +
> +static int poll_event(int fd, int timeout_ms)
> +{
> + int ret;
> +
> + struct pollfd p = {
> + .fd = fd,
> + .events = POLLIN,
> + .revents = 0,
> + };
> +
> + ret = poll(&p, 1, timeout_ms);
> + if (ret == -1)
> + return -errno;
> +
> + return ret == 1 && (p.revents & POLLIN);
> +}
> +
> +static int __debug_connect(int fd, int *debugfd, struct drm_xe_eudebug_connect *param)
> +{
> + int ret = 0;
> +
> + *debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, param);
> +
> + if (*debugfd < 0) {
> + ret = -errno;
> + igt_assume(ret != 0);
> + }
> +
> + errno = 0;
> + return ret;
> +}
> +
> +/**
> + * SUBTEST: basic-connect
> + * Description:
> + *	Exercise the XE_EUDEBUG_CONNECT ioctl with both
> + *	valid and invalid params.
> + */
> +static void test_connect(int fd)
> +{
> + struct drm_xe_eudebug_connect param = {};
> + int debugfd, ret;
> + pid_t *pid;
> +
> +	pid = mmap(NULL, sizeof(pid_t), PROT_READ | PROT_WRITE,
> + MAP_SHARED | MAP_ANON, -1, 0);
> +
> + /* get fresh unrelated pid */
> + igt_fork(child, 1)
> + *pid = getpid();
> +
> + igt_waitchildren();
> + param.pid = *pid;
> + munmap(pid, sizeof(pid_t));
> +
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, param.pid ? -ENOENT : -EINVAL);
> +
> + param.pid = 0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.pid = getpid();
> + param.version = -1;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.version = 0;
> + param.flags = ~0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.flags = 0;
> + param.extensions = ~0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.extensions = 0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert_neq(debugfd, -1);
> + igt_assert_eq(ret, 0);
> +
> + close(debugfd);
> +}
> +
> +static void switch_user(__uid_t uid, __gid_t gid)
> +{
> + struct group *gr;
> + __gid_t gr_v;
> +
> +	/* Users other than root need to belong to the video group */
> + gr = getgrnam("video");
> + igt_assert(gr);
> +
> + /* Drop all */
> + igt_assert_eq(setgroups(1, &gr->gr_gid), 0);
> + igt_assert_eq(setgid(gid), 0);
> + igt_assert_eq(setuid(uid), 0);
> +
> + igt_assert_eq(getgroups(1, &gr_v), 1);
> + igt_assert_eq(gr_v, gr->gr_gid);
> + igt_assert_eq(getgid(), gid);
> + igt_assert_eq(getuid(), uid);
> +
> + igt_assert_eq(prctl(PR_SET_DUMPABLE, 1L), 0);
> +}
> +
> +/**
> + * SUBTEST: connect-user
> + * Description:
> + *	Verify unprivileged XE_EUDEBUG_CONNECT ioctl.
> + * Check:
> + * - user debugger to user workload connection
> + * - user debugger to other user workload connection
> + * - user debugger to privileged workload connection
> + */
> +static void test_connect_user(int fd)
> +{
> + struct drm_xe_eudebug_connect param = {};
> + struct passwd *pwd, *pwd2;
> + const char *user1 = "lp";
> + const char *user2 = "mail";
> + int debugfd, ret, i;
> + int p1[2], p2[2];
> + __uid_t u1, u2;
> + __gid_t g1, g2;
> + int newfd;
> + pid_t pid;
> +
> +#define NUM_USER_TESTS 4
> +#define P_APP 0
> +#define P_GDB 1
> + struct conn_user {
> + /* u[0] - process uid, u[1] - gdb uid */
> + __uid_t u[P_GDB + 1];
> + /* g[0] - process gid, g[1] - gdb gid */
> + __gid_t g[P_GDB + 1];
> + /* Expected fd from open */
> + int ret;
> + /* Skip this test case */
> + int skip;
> + const char *desc;
> + } test[NUM_USER_TESTS] = {};
> +
> + igt_assert(!pipe(p1));
> + igt_assert(!pipe(p2));
> +
> + pwd = getpwnam(user1);
> + igt_require(pwd);
> + u1 = pwd->pw_uid;
> + g1 = pwd->pw_gid;
> +
> +	/*
> +	 * getpwnam() returns a pointer to static storage that
> +	 * subsequent calls overwrite, so keep a copy of the
> +	 * fields we need. Note that getpwnam() returns NULL
> +	 * if the user cannot be found in passwd.
> +	 */
> + setpwent();
> + pwd2 = getpwnam(user2);
> + if (pwd2) {
> + u2 = pwd2->pw_uid;
> + g2 = pwd2->pw_gid;
> + }
> +
> + test[0].skip = !pwd;
> + test[0].u[P_GDB] = u1;
> + test[0].g[P_GDB] = g1;
> + test[0].ret = -EACCES;
> + test[0].desc = "User GDB to Root App";
> +
> + test[1].skip = !pwd;
> + test[1].u[P_APP] = u1;
> + test[1].g[P_APP] = g1;
> + test[1].u[P_GDB] = u1;
> + test[1].g[P_GDB] = g1;
> + test[1].ret = 0;
> + test[1].desc = "User GDB to User App";
> +
> + test[2].skip = !pwd;
> + test[2].u[P_APP] = u1;
> + test[2].g[P_APP] = g1;
> + test[2].ret = 0;
> + test[2].desc = "Root GDB to User App";
> +
> + test[3].skip = !pwd2;
> + test[3].u[P_APP] = u1;
> + test[3].g[P_APP] = g1;
> + test[3].u[P_GDB] = u2;
> + test[3].g[P_GDB] = g2;
> + test[3].ret = -EACCES;
> + test[3].desc = "User GDB to Other User App";
> +
> + if (!pwd2)
> +		igt_warn("User %s not available in the system. Skipping subtest: %s.\n",
> +			 user2, test[3].desc);
> +
> + for (i = 0; i < NUM_USER_TESTS; i++) {
> + if (test[i].skip) {
> + igt_debug("Subtest %s skipped\n", test[i].desc);
> + continue;
> + }
> + igt_debug("Executing connection: %s\n", test[i].desc);
> + igt_fork(child, 2) {
> + if (!child) {
> + if (test[i].u[P_APP])
> + switch_user(test[i].u[P_APP], test[i].g[P_APP]);
> +
> + pid = getpid();
> + /* Signal the PID */
> + igt_assert(write(p1[1], &pid, sizeof(pid)) == sizeof(pid));
> + /* wait with exit */
> + igt_assert(read(p2[0], &pid, sizeof(pid)) == sizeof(pid));
> + } else {
> + if (test[i].u[P_GDB])
> + switch_user(test[i].u[P_GDB], test[i].g[P_GDB]);
> +
> + igt_assert(read(p1[0], &pid, sizeof(pid)) == sizeof(pid));
> + param.pid = pid;
> +
> + newfd = drm_open_driver(DRIVER_XE);
> + ret = __debug_connect(newfd, &debugfd, ¶m);
> +
> + /* Release the app first */
> + igt_assert(write(p2[1], &pid, sizeof(pid)) == sizeof(pid));
> +
> + igt_assert_eq(ret, test[i].ret);
> + if (!ret)
> + close(debugfd);
> + }
> + }
> + igt_waitchildren();
> + }
> + close(p1[0]);
> + close(p1[1]);
> + close(p2[0]);
> + close(p2[1]);
> +#undef NUM_USER_TESTS
> +#undef P_APP
> +#undef P_GDB
> +}
> +
> +/**
> + * SUBTEST: basic-close
> + * Description:
> + * Test whether eudebug can be reattached after closure.
> + */
> +static void test_close(int fd)
> +{
> + struct drm_xe_eudebug_connect param = { 0, };
> + int debug_fd1, debug_fd2;
> + int fd2;
> +
> + param.pid = getpid();
> +
> + igt_assert_eq(__debug_connect(fd, &debug_fd1, ¶m), 0);
> + igt_assert(debug_fd1 >= 0);
> + igt_assert_eq(__debug_connect(fd, &debug_fd2, ¶m), -EBUSY);
> + igt_assert_eq(debug_fd2, -1);
> +
> + close(debug_fd1);
> + fd2 = drm_open_driver(DRIVER_XE);
> +
> + igt_assert_eq(__debug_connect(fd2, &debug_fd2, ¶m), 0);
> + igt_assert(debug_fd2 >= 0);
> + close(fd2);
> + close(debug_fd2);
> +}
> +
> +/**
> + * SUBTEST: basic-read-event
> + * Description:
> + * Synchronously exercise eu debugger event polling and reading.
> + */
> +#define MAX_EVENT_SIZE (32 * 1024)
> +static void test_read_event(int fd)
> +{
> + struct drm_xe_eudebug_event *event;
> + struct xe_eudebug_debugger *d;
> + struct xe_eudebug_client *c;
> +
> + event = malloc(MAX_EVENT_SIZE);
> + igt_assert(event);
> + memset(event, 0, sizeof(*event));
> +
> + c = xe_eudebug_client_create(fd, run_basic_client, 0, NULL);
> + d = xe_eudebug_debugger_create(fd, 0, NULL);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> + igt_assert_eq(poll_event(d->fd, 500), 0);
> +
> + event->len = 1;
> + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> +
> + event->len = MAX_EVENT_SIZE;
> + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> +
> + xe_eudebug_client_start(c);
> +
> + igt_assert_eq(poll_event(d->fd, 500), 1);
> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> + igt_assert_eq(read_event(d->fd, event), 0);
> +
> + igt_assert_eq(poll_event(d->fd, 500), 1);
> +
> + event->flags = 0;
> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> +
> + event->len = 0;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> + igt_assert_eq(0, event->len);
> +
> + event->len = sizeof(*event) - 1;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> +
> + event->len = sizeof(*event);
> + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
> + igt_assert_lt(sizeof(*event), event->len);
> +
> + event->len = event->len - 1;
> + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
> + /* event->len should now contain the exact len */
> + igt_assert_eq(read_event(d->fd, event), 0);
> +
> + fcntl(d->fd, F_SETFL, fcntl(d->fd, F_GETFL) | O_NONBLOCK);
> + igt_assert(fcntl(d->fd, F_GETFL) & O_NONBLOCK);
> +
> + igt_assert_eq(poll_event(d->fd, 500), 0);
> + event->len = MAX_EVENT_SIZE;
> + event->flags = 0;
> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
> +
> + xe_eudebug_client_wait_done(c);
> + xe_eudebug_client_stop(c);
> +
> + igt_assert_eq(poll_event(d->fd, 500), 0);
> + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
> +
> + xe_eudebug_debugger_destroy(d);
> + xe_eudebug_client_destroy(c);
> +
> + free(event);
> +}
> +
> +/**
> + * SUBTEST: basic-client
> + * Description:
> + *	Attach the debugger to a process which opens and closes an xe drm client.
> + *
> + * SUBTEST: basic-client-th
> + * Description:
> + *	Create basic client resources (vms) in multiple threads.
> + *
> + * SUBTEST: multiple-sessions
> + * Description:
> + * Simultaneously attach many debuggers to many processes.
> + *	Each process opens and closes an xe drm client and creates a few resources.
> + *
> + * SUBTEST: basic-%s
> + * Description:
> + *	Attach the debugger to a process which creates and destroys a few %arg[1].
> + *
> + * SUBTEST: basic-vm-bind
> + * Description:
> + * Attach the debugger to a process that performs synchronous vm bind
> + * and vm unbind.
> + *
> + * SUBTEST: basic-vm-bind-vm-destroy
> + * Description:
> + * Attach the debugger to a process that performs vm bind, and destroys
> + * the vm without unbinding. Make sure that we don't get unbind events.
> + *
> + * SUBTEST: basic-vm-bind-extended
> + * Description:
> + * Attach the debugger to a process that performs bind, bind array, rebind,
> + * partial unbind, unbind and unbind all operations.
> + *
> + * SUBTEST: multigpu-basic-client
> + * Description:
> + *	Attach the debugger to a process which opens and closes an xe drm client on all Xe devices.
> + *
> + * SUBTEST: multigpu-basic-client-many
> + * Description:
> + * Simultaneously attach many debuggers to many processes on all Xe devices.
> + *	Each process opens and closes an xe drm client and creates a few resources.
> + *
> + * arg[1]:
> + *
> + * @vms: vms
> + * @exec-queues: exec queues
> + */
> +
> +static void test_basic_sessions(int fd, unsigned int flags, int count, bool match_opposite)
> +{
> + struct xe_eudebug_session **s;
> + int i;
> +
> + s = calloc(count, sizeof(*s));
> +
> + igt_assert(s);
> +
> + for (i = 0; i < count; i++)
> + s[i] = xe_eudebug_session_create(fd, run_basic_client, flags, NULL);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_run(s[i]);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_check(s[i], match_opposite, 0);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_destroy(s[i]);
> +}
> +
> +/**
> + * SUBTEST: basic-vm-bind-discovery
> + * Description:
> + * Attach the debugger to a process that performs vm-bind before attaching
> + * and check if the discovery process reports it.
> + *
> + * SUBTEST: basic-vm-bind-metadata-discovery
> + * Description:
> + * Attach the debugger to a process that performs vm-bind with metadata attached
> + * before attaching and check if the discovery process reports it.
> + *
> + * SUBTEST: basic-vm-bind-vm-destroy-discovery
> + * Description:
> + * Attach the debugger to a process that performs vm bind, and destroys
> + * the vm without unbinding before attaching. Make sure that we don't get
> + * any bind/unbind and vm create/destroy events.
> + *
> + * SUBTEST: basic-vm-bind-extended-discovery
> + * Description:
> + * Attach the debugger to a process that performs bind, bind array, rebind,
> + * partial unbind, and unbind all operations before attaching. Ensure that
> + *	we get only a single 'VM_BIND' event from the discovery worker.
> + */
> +static void test_basic_discovery(int fd, unsigned int flags, bool match_opposite)
> +{
> + struct xe_eudebug_debugger *d;
> + struct xe_eudebug_session *s;
> + struct xe_eudebug_client *c;
> +
> + s = xe_eudebug_session_create(fd, run_basic_client, flags | TEST_DISCOVERY, NULL);
> +
> + c = s->client;
> + d = s->debugger;
> +
> + xe_eudebug_client_start(c);
> + xe_eudebug_debugger_wait_stage(s, STAGE_PRE_DEBUG_RESOURCES_DONE);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> + xe_eudebug_debugger_start_worker(d);
> +
> +	/* give the worker time to do its job */
> + sleep(2);
> + xe_eudebug_debugger_signal_stage(d, STAGE_DISCOVERY_DONE);
> +
> + xe_eudebug_client_wait_done(c);
> +
> + xe_eudebug_debugger_stop_worker(d, 1);
> +
> + xe_eudebug_event_log_print(d->log, true);
> + xe_eudebug_event_log_print(c->log, true);
> +
> + xe_eudebug_session_check(s, match_opposite, 0);
> + xe_eudebug_session_destroy(s);
> +}
> +
> +#define RESOURCE_COUNT 16
> +#define PRIMARY_THREAD (1 << 0)
> +#define DISCOVERY_CLOSE_CLIENT (1 << 1)
> +#define DISCOVERY_DESTROY_RESOURCES (1 << 2)
> +#define DISCOVERY_VM_BIND (1 << 3)
> +static void run_discovery_client(struct xe_eudebug_client *c)
> +{
> + struct drm_xe_engine_class_instance *hwe = NULL;
> + int fd[RESOURCE_COUNT], i;
> + bool skip_sleep = c->flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT);
> + uint64_t addr = 0x1a0000;
> +
> + srand(getpid());
> +
> + for (i = 0; i < RESOURCE_COUNT; i++) {
> + fd[i] = xe_eudebug_client_open_driver(c);
> +
> + if (!i) {
> + bool found = false;
> +
> + xe_device_get(fd[0]);
> + xe_for_each_engine(fd[0], hwe) {
> + if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE ||
> + hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
> + found = true;
> + break;
> + }
> + }
> + igt_assert(found);
> + }
> +
> +		/*
> +		 * Give the debugger a break in the event stream after every
> +		 * other client, so it can read discovery events and detach
> +		 * undisturbed.
> +		 */
> + if (random() % 2 == 0 && !skip_sleep)
> + sleep(1);
> +
> + for (int j = 0; j < RESOURCE_COUNT; j++) {
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd[i],
> + DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + struct drm_xe_ext_set_property eq_ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + struct drm_xe_exec_queue_create create = {
> + .width = 1,
> + .num_placements = 1,
> + .vm_id = vm,
> + .instances = to_user_pointer(hwe),
> + .extensions = to_user_pointer(&eq_ext),
> + };
> +			const unsigned int bo_size = max_t(unsigned int,
> + xe_get_default_alignment(fd[i]),
> + MIN_BO_SIZE);
> + uint32_t bo = xe_bo_create(fd[i], 0, bo_size, system_memory(fd[i]), 0);
> +
> + xe_eudebug_client_exec_queue_create(c, fd[i], &create);
> +
> + if (c->flags & DISCOVERY_VM_BIND) {
> + xe_eudebug_client_vm_bind(c, fd[i], vm, bo, 0, addr, bo_size);
> + addr += 0x100000;
> + }
> +
> + if (c->flags & DISCOVERY_DESTROY_RESOURCES) {
> + xe_eudebug_client_exec_queue_destroy(c, fd[i], &create);
> + xe_eudebug_client_vm_destroy(c, fd[i], create.vm_id);
> + gem_close(fd[i], bo);
> + }
> + }
> +
> + if (c->flags & DISCOVERY_CLOSE_CLIENT)
> + xe_eudebug_client_close_driver(c, fd[i]);
> + }
> + xe_device_put(fd[0]);
> +}
> +
> +/**
> + * SUBTEST: discovery-%s
> + * Description: Race discovery against %arg[1] and the debugger detach.
> + *
> + * arg[1]:
> + *
> + * @race: resources creation
> + * @race-vmbind: vm-bind operations
> + * @empty: resources destruction
> + * @empty-clients: client closure
> + */
> +static void *discovery_race_thread(void *data)
> +{
> + struct {
> + uint64_t client_handle;
> + int vm_count;
> + int exec_queue_count;
> + int vm_bind_op_count;
> + } clients[RESOURCE_COUNT];
> + struct xe_eudebug_session *s = data;
> + int expected = RESOURCE_COUNT * (1 + 2 * RESOURCE_COUNT);
> + const int tries = 100;
> + bool done = false;
> + int ret = 0;
> +
> + for (int try = 0; try < tries && !done; try++) {
> + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
> +
> + if (ret == -EBUSY) {
> + usleep(100000);
> + continue;
> + }
> +
> + igt_assert_eq(ret, 0);
> +
> + if (random() % 2) {
> + struct drm_xe_eudebug_event *e = NULL;
> + int i = -1;
> +
> + xe_eudebug_debugger_start_worker(s->debugger);
> + sleep(1);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> + igt_debug("Resources discovered: %lu\n", s->debugger->event_count);
> +
> + xe_eudebug_for_each_event(e, s->debugger->log) {
> + if (e->type == DRM_XE_EUDEBUG_EVENT_OPEN) {
> + struct drm_xe_eudebug_event_client *eo = (void *)e;
> +
> + if (i >= 0) {
> + igt_assert_eq(clients[i].vm_count,
> + RESOURCE_COUNT);
> +
> + igt_assert_eq(clients[i].exec_queue_count,
> + RESOURCE_COUNT);
> +
> + if (s->client->flags & DISCOVERY_VM_BIND)
> + igt_assert_eq(clients[i].vm_bind_op_count,
> + RESOURCE_COUNT);
> + }
> +
> + igt_assert(++i < RESOURCE_COUNT);
> + clients[i].client_handle = eo->client_handle;
> + clients[i].vm_count = 0;
> + clients[i].exec_queue_count = 0;
> + clients[i].vm_bind_op_count = 0;
> + }
> +
> + if (e->type == DRM_XE_EUDEBUG_EVENT_VM)
> + clients[i].vm_count++;
> +
> + if (e->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
> + clients[i].exec_queue_count++;
> +
> + if (e->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP)
> + clients[i].vm_bind_op_count++;
> +			}
> +
> + igt_assert_lte(0, i);
> +
> + for (int j = 0; j < i; j++)
> + for (int k = 0; k < i; k++) {
> + if (k == j)
> + continue;
> +
> + igt_assert_neq(clients[j].client_handle,
> + clients[k].client_handle);
> + }
> +
> + if (s->debugger->event_count >= expected)
> + done = true;
> + }
> +
> + xe_eudebug_debugger_detach(s->debugger);
> + s->debugger->log->head = 0;
> + s->debugger->event_count = 0;
> + }
> +
> + /* Primary thread must read everything */
> + if (s->flags & PRIMARY_THREAD) {
> + while ((ret = xe_eudebug_debugger_attach(s->debugger, s->client)) == -EBUSY)
> + usleep(100000);
> +
> + igt_assert_eq(ret, 0);
> +
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_wait_done(s->client);
> +
> + if (READ_ONCE(s->debugger->event_count) != expected)
> + sleep(5);
> +
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> + xe_eudebug_debugger_detach(s->debugger);
> + }
> +
> + return NULL;
> +}
> +
> +static void test_race_discovery(int fd, unsigned int flags, int clients)
> +{
> + const int debuggers_per_client = 3;
> + int count = clients * debuggers_per_client;
> + struct xe_eudebug_session *sessions, *s;
> + struct xe_eudebug_client *c;
> + pthread_t *threads;
> + int i, j;
> +
> + sessions = calloc(count, sizeof(*sessions));
> + threads = calloc(count, sizeof(*threads));
> +
> + for (i = 0; i < clients; i++) {
> + c = xe_eudebug_client_create(fd, run_discovery_client, flags, NULL);
> + for (j = 0; j < debuggers_per_client; j++) {
> + s = &sessions[i * debuggers_per_client + j];
> + s->client = c;
> + s->debugger = xe_eudebug_debugger_create(fd, flags, NULL);
> + s->flags = flags | (!j ? PRIMARY_THREAD : 0);
> + }
> + }
> +
> + for (i = 0; i < count; i++) {
> + if (sessions[i].flags & PRIMARY_THREAD)
> + xe_eudebug_client_start(sessions[i].client);
> +
> + pthread_create(&threads[i], NULL, discovery_race_thread, &sessions[i]);
> + }
> +
> + for (i = 0; i < count; i++)
> + pthread_join(threads[i], NULL);
> +
> + for (i = count - 1; i > 0; i--) {
> + if (sessions[i].flags & PRIMARY_THREAD) {
> + igt_assert_eq(sessions[i].client->seqno - 1,
> + sessions[i].debugger->event_count);
> +
> + xe_eudebug_event_log_compare(sessions[0].debugger->log,
> + sessions[i].debugger->log,
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND);
> +
> + xe_eudebug_client_destroy(sessions[i].client);
> + }
> + xe_eudebug_debugger_destroy(sessions[i].debugger);
> + }
> +}
> +
> +static void *attach_dettach_thread(void *data)
> +{
> + struct xe_eudebug_session *s = data;
> + const int tries = 100;
> + int ret = 0;
> +
> + for (int try = 0; try < tries; try++) {
> + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
> +
> + if (ret == -EBUSY) {
> + usleep(100000);
> + continue;
> + }
> +
> + igt_assert_eq(ret, 0);
> +
> + if (random() % 2 == 0) {
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> + }
> +
> + xe_eudebug_debugger_detach(s->debugger);
> + s->debugger->log->head = 0;
> + s->debugger->event_count = 0;
> + }
> +
> + return NULL;
> +}
> +
> +static void test_empty_discovery(int fd, unsigned int flags, int clients)
> +{
> + struct xe_eudebug_session **s;
> + pthread_t *threads;
> + int i, expected = flags & DISCOVERY_CLOSE_CLIENT ? 0 : RESOURCE_COUNT;
> +
> + igt_assert(flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT));
> +
> + s = calloc(clients, sizeof(struct xe_eudebug_session *));
> + threads = calloc(clients, sizeof(*threads));
> +
> + for (i = 0; i < clients; i++)
> + s[i] = xe_eudebug_session_create(fd, run_discovery_client, flags, NULL);
> +
> + for (i = 0; i < clients; i++) {
> + xe_eudebug_client_start(s[i]->client);
> +
> + pthread_create(&threads[i], NULL, attach_dettach_thread, s[i]);
> + }
> +
> + for (i = 0; i < clients; i++)
> + pthread_join(threads[i], NULL);
> +
> + for (i = 0; i < clients; i++) {
> + xe_eudebug_client_wait_done(s[i]->client);
> + igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
> +
> + xe_eudebug_debugger_start_worker(s[i]->debugger);
> + xe_eudebug_debugger_stop_worker(s[i]->debugger, 5);
> + xe_eudebug_debugger_detach(s[i]->debugger);
> +
> + igt_assert_eq(s[i]->debugger->event_count, expected);
> +
> + xe_eudebug_session_destroy(s[i]);
> + }
> +}
> +
> +static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> + xe_eudebug_ack_ufence(d->fd, ef);
> +}
> +
> +typedef void (*client_run_t)(struct xe_eudebug_client *);
> +
> +static void test_client_with_trigger(int fd, unsigned int flags, int count,
> + client_run_t client_fn, int type,
> + xe_eudebug_trigger_fn trigger_fn,
> + struct drm_xe_engine_class_instance *hwe,
> + bool match_opposite, uint32_t event_filter)
> +{
> + struct xe_eudebug_session **s;
> + int i;
> +
> + s = calloc(count, sizeof(*s));
> +
> + igt_assert(s);
> +
> + for (i = 0; i < count; i++)
> + s[i] = xe_eudebug_session_create(fd, client_fn, flags, hwe);
> +
> + if (trigger_fn)
> + for (i = 0; i < count; i++)
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, type, trigger_fn);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_run(s[i]);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_check(s[i], match_opposite, event_filter);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_destroy(s[i]);
> +}
> +
> +struct thread_fn_args {
> + struct xe_eudebug_client *client;
> + int fd;
> +};
> +
> +static void *basic_client_th(void *data)
> +{
> + struct thread_fn_args *f = data;
> + struct xe_eudebug_client *c = f->client;
> + uint32_t *vms;
> + int fd, i, num_vms;
> +
> + fd = f->fd;
> + igt_assert(fd);
> +
> + xe_device_get(fd);
> +
> + num_vms = 2 + rand() % 16;
> + vms = calloc(num_vms, sizeof(*vms));
> + igt_assert(vms);
> + igt_debug("Create %d client vms\n", num_vms);
> +
> + for (i = 0; i < num_vms; i++)
> + vms[i] = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + for (i = 0; i < num_vms; i++)
> + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
> +
> + xe_device_put(fd);
> + free(vms);
> +
> + return NULL;
> +}
> +
> +static void run_basic_client_th(struct xe_eudebug_client *c)
> +{
> + struct thread_fn_args *args;
> + int i, num_threads, fd;
> + pthread_t *threads;
> +
> + args = calloc(1, sizeof(*args));
> + igt_assert(args);
> +
> + num_threads = 2 + random() % 16;
> + igt_debug("Run on %d threads\n", num_threads);
> + threads = calloc(num_threads, sizeof(*threads));
> + igt_assert(threads);
> +
> + fd = xe_eudebug_client_open_driver(c);
> + args->client = c;
> + args->fd = fd;
> +
> + for (i = 0; i < num_threads; i++)
> + pthread_create(&threads[i], NULL, basic_client_th, args);
> +
> + for (i = 0; i < num_threads; i++)
> + pthread_join(threads[i], NULL);
> +
> + xe_eudebug_client_close_driver(c, fd);
> + free(args);
> + free(threads);
> +}
> +
> +static void test_basic_sessions_th(int fd, unsigned int flags, int num_clients, bool match_opposite)
> +{
> + test_client_with_trigger(fd, flags, num_clients, run_basic_client_th, 0, NULL, NULL,
> + match_opposite, 0);
> +}
> +
> +static void vm_access_client(struct xe_eudebug_client *c)
> +{
> + struct drm_xe_engine_class_instance *hwe = c->ptr;
> + uint32_t bo_placement;
> + struct bind_list *bl;
> + uint32_t vm;
> + int fd, i, j;
> +
> + igt_debug("Using %s\n", xe_engine_class_string(hwe->engine_class));
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
> +
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + if (c->flags & VM_BIND_OP_MAP_USERPTR)
> + bo_placement = 0;
> + else
> + bo_placement = vram_if_possible(fd, hwe->gt_id);
> +
> + for (j = 0; j < 5; j++) {
> + unsigned int target_size = MIN_BO_SIZE * (1 << j);
> +
> + bl = create_bind_list(fd, bo_placement, vm, 4, target_size);
> + do_bind_list(c, bl, true);
> +
> + for (i = 0; i < bl->n; i++)
> + xe_eudebug_client_wait_stage(c, bl->bind_ops[i].addr);
> +
> + free_bind_list(c, bl);
> + }
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +
> + xe_device_put(fd);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void debugger_test_vma(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t vm_handle,
> + uint64_t va_start,
> + uint64_t va_length)
> +{
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint64_t *v1, *v2;
> + uint64_t items = va_length / sizeof(uint64_t);
> + int fd;
> + int r, i;
> +
> + v1 = malloc(va_length);
> + igt_assert(v1);
> + v2 = malloc(va_length);
> + igt_assert(v2);
> +
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle;
> +
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + r = pread(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v1[i], va_start + i);
> +
> + for (i = 0; i < items; i++)
> + v1[i] = va_start + i + 1;
> +
> + r = pwrite(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + lseek(fd, va_start, SEEK_SET);
> + r = read(fd, v2, va_length);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v1[i], v2[i]);
> +
> + fsync(fd);
> +
> + close(fd);
> + free(v1);
> + free(v2);
> +}
> +
> +static void vm_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> + eo->vm_bind_ref_seqno,
> + eo->addr,
> + eo->range);
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + debugger_test_vma(d, eb->client_handle, eb->vm_handle,
> + eo->addr, eo->range);
> + xe_eudebug_debugger_signal_stage(d, eo->addr);
> + }
> +}
> +
> +/**
> + * SUBTEST: basic-vm-access
> + * Description:
> + *	Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
> + *	vm fd, covering many different offsets inside the vm and
> + *	many virtual addresses of the vm-bound object.
> + *
> + * SUBTEST: basic-vm-access-userptr
> + * Description:
> + *	Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
> + *	vm fd, covering many different offsets inside the vm and
> + *	many virtual addresses of the vm-bound object, but backed
> + *	by userptr.
> + */
> +static void test_vm_access(int fd, unsigned int flags, int num_clients)
> +{
> + struct drm_xe_engine_class_instance *hwe;
> +
> + xe_eudebug_for_each_engine(fd, hwe)
> + test_client_with_trigger(fd, flags, num_clients,
> + vm_access_client,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vm_trigger, hwe,
> + false,
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +}
> +
> +static void debugger_test_vma_parameters(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t vm_handle,
> + uint64_t va_start,
> + uint64_t va_length)
> +{
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint64_t *v;
> + uint64_t items = va_length / sizeof(uint64_t);
> + int fd;
> + int r, i;
> +
> + v = malloc(va_length);
> + igt_assert(v);
> +
> + /* Negative VM open - bad client handle */
> + vo.client_handle = client_handle + 123;
> + vo.vm_handle = vm_handle;
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert(fd < 0);
> +
> + /* Negative VM open - bad vm handle */
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle + 123;
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert(fd < 0);
> +
> + /* Positive VM open */
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle;
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + /* Negative pread - bad fd */
> + r = pread(fd + 123, v, va_length, va_start);
> + igt_assert(r < 0);
> +
> + /* Negative pread - bad va_start */
> + r = pread(fd, v, va_length, 0);
> + igt_assert(r < 0);
> +
> + /* Negative pread - bad va_start */
> + r = pread(fd, v, va_length, va_start - 1);
> + igt_assert(r < 0);
> +
> + /* Positive pread - zero va_length */
> + r = pread(fd, v, 0, va_start);
> + igt_assert_eq(r, 0);
> +
> +	/* pread past the end of the range - clamped to va_length */
> +	r = pread(fd, v, va_length + 1, va_start);
> +	igt_assert_eq(r, va_length);
> +
> + /* Negative pread - va_start past the end of the range */
> + r = pread(fd, v, 1, va_start + va_length);
> + igt_assert(r < 0);
> +
> + /* Positive pread - whole range */
> + r = pread(fd, v, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + /* Positive pread */
> + r = pread(fd, v, 1, va_start + va_length - 1);
> + igt_assert_eq(r, 1);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v[i], va_start + i);
> +
> + for (i = 0; i < items; i++)
> + v[i] = va_start + i + 1;
> +
> + /* Negative pwrite - bad fd */
> + r = pwrite(fd + 123, v, va_length, va_start);
> + igt_assert(r < 0);
> +
> + /* Negative pwrite - bad va_start */
> + r = pwrite(fd, v, va_length, -1);
> + igt_assert(r < 0);
> +
> + /* Negative pwrite - zero va_start */
> + r = pwrite(fd, v, va_length, 0);
> + igt_assert(r < 0);
> +
> + /* Oversized pwrite - length clamped to the bound range */
> + r = pwrite(fd, v, va_length + 1, va_start);
> + igt_assert_eq(r, va_length);
> +
> + /* Positive pwrite - zero va_length */
> + r = pwrite(fd, v, 0, va_start);
> + igt_assert_eq(r, 0);
> +
> + /* Positive pwrite */
> + r = pwrite(fd, v, va_length, va_start);
> + igt_assert_eq(r, va_length);
> + fsync(fd);
> +
> + close(fd);
> + free(v);
> +}
> +
> +static void vm_trigger_access_parameters(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> + eo->vm_bind_ref_seqno,
> + eo->addr,
> + eo->range);
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + debugger_test_vma_parameters(d, eb->client_handle, eb->vm_handle, eo->addr,
> + eo->range);
> + xe_eudebug_debugger_signal_stage(d, eo->addr);
> + }
> +}
> +
> +/**
> + * SUBTEST: basic-vm-access-parameters
> + * Description:
> + * Check negative scenarios of VM_OPEN ioctl and pread/pwrite usage.
> + */
> +static void test_vm_access_parameters(int fd, unsigned int flags, int num_clients)
> +{
> + struct drm_xe_engine_class_instance *hwe;
> +
> + xe_eudebug_for_each_engine(fd, hwe)
> + test_client_with_trigger(fd, flags, num_clients,
> + vm_access_client,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vm_trigger_access_parameters, hwe,
> + false,
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +}
> +
> +#define PAGE_SIZE 4096
> +#define MDATA_SIZE (WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM * PAGE_SIZE)
> +static void metadata_access_client(struct xe_eudebug_client *c)
> +{
> + const uint64_t addr = 0x1a0000;
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
> + uint8_t *data;
> + size_t bo_size;
> + uint32_t bo, vm;
> + int fd, i;
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
> +
> + bo_size = xe_get_default_alignment(fd);
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + bo = xe_bo_create(fd, vm, bo_size, system_memory(fd), 0);
> +
> + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, MDATA_SIZE);
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
> + bo_size, 0, NULL, 0, to_user_pointer(ext));
> +
> + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
> + xe_eudebug_client_wait_stage(c, i);
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
> +
> + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
> +
> + gem_close(fd, bo);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +
> + xe_device_put(fd);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void debugger_test_metadata(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t metadata_handle,
> + uint64_t type,
> + uint64_t len)
> +{
> + struct drm_xe_eudebug_read_metadata rm = {
> + .client_handle = client_handle,
> + .metadata_handle = metadata_handle,
> + .size = len,
> + };
> + uint8_t *data;
> + int i;
> +
> + data = malloc(len);
> + igt_assert(data);
> +
> + rm.ptr = to_user_pointer(data);
> +
> + igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
> +
> + /* synthetic check: the test sets a different size per metadata type */
> + igt_assert_eq((type + 1) * PAGE_SIZE, rm.size);
> +
> + for (i = 0; i < rm.size; i++)
> + igt_assert_eq(data[i], 0xff & (i + (i > PAGE_SIZE)));
> +
> + free(data);
> +}
> +
> +static void metadata_read_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_metadata *em = (void *)e;
> +
> + /* synthetic check: the test sets a different size per metadata type */
> + igt_assert_eq((em->type + 1) * PAGE_SIZE, em->len);
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + debugger_test_metadata(d, em->client_handle, em->metadata_handle,
> + em->type, em->len);
> + xe_eudebug_debugger_signal_stage(d, em->type);
> + }
> +}
> +
> +static void metadata_read_on_vm_bind_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op_metadata *em = (void *)e;
> + struct drm_xe_eudebug_event_vm_bind_op *eo;
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + /* For testing purposes the client sets metadata_cookie = type */
> +
> + /*
> + * Metadata event has a reference to vm-bind-op event which has a reference
> + * to vm-bind event which contains proper client-handle.
> + */
> + eo = (struct drm_xe_eudebug_event_vm_bind_op *)
> + xe_eudebug_event_log_find_seqno(d->log, em->vm_bind_op_ref_seqno);
> + igt_assert(eo);
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + debugger_test_metadata(d,
> + eb->client_handle,
> + em->metadata_handle,
> + em->metadata_cookie,
> + MDATA_SIZE); /* max size */
> +
> + xe_eudebug_debugger_signal_stage(d, em->metadata_cookie);
> +}
> +
> +/**
> + * SUBTEST: read-metadata
> + * Description:
> + * Exercise DRM_XE_EUDEBUG_IOCTL_READ_METADATA and debug metadata create|destroy events.
> + */
> +static void test_metadata_read(int fd, unsigned int flags, int num_clients)
> +{
> + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
> + DRM_XE_EUDEBUG_EVENT_METADATA, metadata_read_trigger,
> + NULL, true, 0);
> +}
> +
> +/**
> + * SUBTEST: attach-debug-metadata
> + * Description:
> + * Read debug metadata when vm_bind has it attached.
> + */
> +static void test_metadata_attach(int fd, unsigned int flags, int num_clients)
> +{
> + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA,
> + metadata_read_on_vm_bind_trigger,
> + NULL, true, 0);
> +}
> +
> +#define STAGE_CLIENT_WAIT_ON_UFENCE_DONE 1337
> +
> +#define UFENCE_EVENT_COUNT_EXPECTED 4
> +#define UFENCE_EVENT_COUNT_MAX 100
> +
> +struct ufence_bind {
> + struct drm_xe_sync f;
> + uint64_t addr;
> + uint64_t range;
> + uint64_t value;
> + struct {
> + uint64_t vm_sync;
> + } *fence_data;
> +};
> +
> +static void client_wait_ufences(struct xe_eudebug_client *c,
> + int fd, struct ufence_bind *binds, int count)
> +{
> + const int64_t default_fence_timeout_ns = 500 * NSEC_PER_MSEC;
> + int64_t timeout_ns;
> + int err;
> +
> + /* Ensure that wait on unacked ufence times out */
> + for (int i = 0; i < count; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + timeout_ns = default_fence_timeout_ns;
> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> + 0, &timeout_ns);
> + igt_assert_eq(err, -ETIME);
> + igt_assert_neq(b->fence_data->vm_sync, b->f.timeline_value);
> + igt_debug("wait #%d blocked on ack\n", i);
> + }
> +
> + /* Wait on fence timed out, now tell the debugger to ack */
> + xe_eudebug_client_signal_stage(c, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
> +
> + /* Check that ack unblocks ufence */
> + for (int i = 0; i < count; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> + 0, &timeout_ns);
> + igt_assert_eq(err, 0);
> + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
> + igt_debug("wait #%d completed\n", i);
> + }
> +}
> +
> +static struct ufence_bind *create_binds_with_ufence(int fd, int count)
> +{
> + struct ufence_bind *binds;
> +
> + binds = calloc(count, sizeof(*binds));
> + igt_assert(binds);
> +
> + for (int i = 0; i < count; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + b->range = 0x1000;
> + b->addr = 0x100000 + b->range * i;
> + b->fence_data = aligned_alloc(xe_get_default_alignment(fd),
> + sizeof(*b->fence_data));
> + igt_assert(b->fence_data);
> + memset(b->fence_data, 0, sizeof(*b->fence_data));
> +
> + b->f.type = DRM_XE_SYNC_TYPE_USER_FENCE;
> + b->f.flags = DRM_XE_SYNC_FLAG_SIGNAL;
> + b->f.addr = to_user_pointer(&b->fence_data->vm_sync);
> + b->f.timeline_value = UFENCE_EVENT_COUNT_EXPECTED + i;
> + }
> +
> + return binds;
> +}
> +
> +static void basic_ufence_client(struct xe_eudebug_client *c)
> +{
> + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
> + int fd = xe_eudebug_client_open_driver(c);
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = n * xe_get_default_alignment(fd);
> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, b->addr, b->range, 0,
> + &b->f, 1, 0);
> + }
> +
> + client_wait_ufences(c, fd, binds, n);
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
> + }
> +
> + free(binds);
> + gem_close(fd, bo);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +struct ufence_priv {
> + struct drm_xe_eudebug_event_vm_bind_ufence ufence_events[UFENCE_EVENT_COUNT_MAX];
> + uint64_t ufence_event_seqno[UFENCE_EVENT_COUNT_MAX];
> + uint64_t ufence_event_vm_addr_start[UFENCE_EVENT_COUNT_MAX];
> + uint64_t ufence_event_vm_addr_range[UFENCE_EVENT_COUNT_MAX];
> + unsigned int ufence_event_count;
> + unsigned int vm_bind_op_count;
> + pthread_mutex_t mutex;
> +};
> +
> +static struct ufence_priv *ufence_priv_create(void)
> +{
> + struct ufence_priv *priv;
> +
> + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
> + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> + igt_assert(priv != MAP_FAILED);
> + memset(priv, 0, sizeof(*priv));
> + pthread_mutex_init(&priv->mutex, NULL);
> +
> + return priv;
> +}
> +
> +static void ufence_priv_destroy(struct ufence_priv *priv)
> +{
> + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
> +}
> +
> +static void ack_fences(struct xe_eudebug_debugger *d)
> +{
> + struct ufence_priv *priv = d->ptr;
> +
> + for (int i = 0; i < priv->ufence_event_count; i++)
> + xe_eudebug_ack_ufence(d->fd, &priv->ufence_events[i]);
> +}
> +
> +static void basic_ufence_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> + struct ufence_priv *priv = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> + igt_debug("ufence event received: %s\n", event_str);
> +
> + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
> + "surplus ufence event received: %s\n", event_str);
> + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
> +
> + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> + ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> + "vm bind event does not have ufence: %s\n", event_str);
> + }
> +}
> +
> +static int wait_for_ufence_events(struct ufence_priv *priv, int timeout_ms)
> +{
> + int ret = -ETIMEDOUT;
> +
> + igt_for_milliseconds(timeout_ms) {
> + pthread_mutex_lock(&priv->mutex);
> + if (priv->ufence_event_count == UFENCE_EVENT_COUNT_EXPECTED)
> + ret = 0;
> + pthread_mutex_unlock(&priv->mutex);
> +
> + if (!ret)
> + break;
> + usleep(1000);
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * SUBTEST: basic-vm-bind-ufence
> + * Description:
> + * Attach user fences on vm_bind in the client and check that the debugger's
> + * ufence ack unblocks the waits.
> + */
> +static void test_basic_ufence(int fd, unsigned int flags)
> +{
> + struct xe_eudebug_debugger *d;
> + struct xe_eudebug_session *s;
> + struct xe_eudebug_client *c;
> + struct ufence_priv *priv;
> +
> + priv = ufence_priv_create();
> + s = xe_eudebug_session_create(fd, basic_ufence_client, flags, priv);
> + c = s->client;
> + d = s->debugger;
> +
> + xe_eudebug_debugger_add_trigger(d,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + basic_ufence_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> + xe_eudebug_debugger_start_worker(d);
> + xe_eudebug_client_start(c);
> +
> + xe_eudebug_debugger_wait_stage(s, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
> + xe_eudebug_assert_f(d, wait_for_ufence_events(priv, XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * MSEC_PER_SEC) == 0,
> + "missing ufence events\n");
> + ack_fences(d);
> +
> + xe_eudebug_client_wait_done(c);
> + xe_eudebug_debugger_stop_worker(d, 1);
> +
> + xe_eudebug_event_log_print(d->log, true);
> + xe_eudebug_event_log_print(c->log, true);
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +
> + xe_eudebug_session_destroy(s);
> + ufence_priv_destroy(priv);
> +}
> +
> +struct vm_bind_clear_thread_priv {
> + struct drm_xe_engine_class_instance *hwe;
> + struct xe_eudebug_client *c;
> + pthread_t thread;
> + uint64_t region;
> + unsigned long sum;
> +};
> +
> +struct vm_bind_clear_priv {
> + unsigned long unbind_count;
> + unsigned long bind_count;
> + unsigned long sum;
> +};
> +
> +static struct vm_bind_clear_priv *vm_bind_clear_priv_create(void)
> +{
> + struct vm_bind_clear_priv *priv;
> +
> + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
> + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> + igt_assert(priv != MAP_FAILED);
> + memset(priv, 0, sizeof(*priv));
> +
> + return priv;
> +}
> +
> +static void vm_bind_clear_priv_destroy(struct vm_bind_clear_priv *priv)
> +{
> + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
> +}
> +
> +static void *vm_bind_clear_thread(void *data)
> +{
> + const uint32_t CS_GPR0 = 0x600;
> + const size_t batch_size = 16;
> + struct drm_xe_sync uf_sync = {
> + .type = DRM_XE_SYNC_TYPE_USER_FENCE, .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> + };
> + struct vm_bind_clear_thread_priv *priv = data;
> + int fd = xe_eudebug_client_open_driver(priv->c);
> + uint64_t gtt_size = 1ull << min_t(uint32_t, xe_va_bits(fd), 48);
> + uint32_t vm = xe_eudebug_client_vm_create(priv->c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_bb_size(fd, batch_size);
> + unsigned long count = 0;
> + uint64_t *fence_data;
> +
> + /* init uf_sync */
> + fence_data = aligned_alloc(xe_get_default_alignment(fd), sizeof(*fence_data));
> + igt_assert(fence_data);
> + uf_sync.timeline_value = 1337;
> + uf_sync.addr = to_user_pointer(fence_data);
> +
> + igt_debug("Run on: %s%u\n", xe_engine_class_string(priv->hwe->engine_class),
> + priv->hwe->engine_instance);
> +
> + igt_until_timeout(5) {
> + struct drm_xe_ext_set_property eq_ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + struct drm_xe_exec_queue_create eq_create = { 0 };
> + uint32_t clean_bo = 0;
> + uint32_t batch_bo = 0;
> + uint64_t clean_offset, batch_offset;
> + uint32_t exec_queue;
> + uint32_t *map, *cs;
> + uint64_t delta;
> +
> + /* calculate offsets (vma addresses) */
> + batch_offset = (random() * SZ_2M) & (gtt_size - 1);
> + /* XXX: for some platforms/memory regions batch offset '0' can be problematic */
> + if (batch_offset == 0)
> + batch_offset = SZ_2M;
> +
> + do {
> + clean_offset = (random() * SZ_2M) & (gtt_size - 1);
> + if (clean_offset == 0)
> + clean_offset = SZ_2M;
> + } while (clean_offset == batch_offset);
> +
> + batch_offset += (random() % SZ_2M) & -bo_size;
> + clean_offset += (random() % SZ_2M) & -bo_size;
> +
> + delta = (random() % bo_size) & -4;
> +
> + /* prepare clean bo */
> + clean_bo = xe_bo_create(fd, vm, bo_size, priv->region,
> + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> + memset(fence_data, 0, sizeof(*fence_data));
> + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, clean_bo, 0, clean_offset, bo_size,
> + 0, &uf_sync, 1, 0);
> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + /* prepare batch bo */
> + batch_bo = xe_bo_create(fd, vm, bo_size, priv->region,
> + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> + memset(fence_data, 0, sizeof(*fence_data));
> + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, batch_bo, 0, batch_offset, bo_size,
> + 0, &uf_sync, 1, 0);
> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + map = xe_bo_map(fd, batch_bo, bo_size);
> +
> + cs = map;
> + *cs++ = MI_NOOP | 0xc5a3;
> + *cs++ = MI_LOAD_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
> + *cs++ = CS_GPR0;
> + *cs++ = clean_offset + delta;
> + *cs++ = (clean_offset + delta) >> 32;
> + *cs++ = MI_STORE_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
> + *cs++ = CS_GPR0;
> + *cs++ = batch_offset;
> + *cs++ = batch_offset >> 32;
> + *cs++ = MI_BATCH_BUFFER_END;
> +
> + /* execute batch */
> + eq_create.width = 1;
> + eq_create.num_placements = 1;
> + eq_create.vm_id = vm;
> + eq_create.instances = to_user_pointer(priv->hwe);
> + eq_create.extensions = to_user_pointer(&eq_ext);
> + exec_queue = xe_eudebug_client_exec_queue_create(priv->c, fd, &eq_create);
> +
> + memset(fence_data, 0, sizeof(*fence_data));
> + xe_exec_sync(fd, exec_queue, batch_offset, &uf_sync, 1);
> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + igt_assert_eq(*map, 0);
> +
> + /* cleanup */
> + xe_eudebug_client_exec_queue_destroy(priv->c, fd, &eq_create);
> + munmap(map, bo_size);
> +
> + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, batch_offset, bo_size);
> + gem_close(fd, batch_bo);
> +
> + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, clean_offset, bo_size);
> + gem_close(fd, clean_bo);
> +
> + count++;
> + }
> +
> + priv->sum = count;
> +
> + free(fence_data);
> + xe_eudebug_client_close_driver(priv->c, fd);
> + return NULL;
> +}
> +
> +static void vm_bind_clear_client(struct xe_eudebug_client *c)
> +{
> + int fd = xe_eudebug_client_open_driver(c);
> + struct xe_device *xe_dev = xe_device_get(fd);
> + int count = xe_number_engines(fd) * xe_dev->mem_regions->num_mem_regions;
> + uint64_t memreg = all_memory_regions(fd);
> + struct vm_bind_clear_priv *priv = c->ptr;
> + int current = 0;
> + struct drm_xe_engine_class_instance *engine;
> + struct vm_bind_clear_thread_priv *threads;
> + uint64_t region;
> +
> + threads = calloc(count, sizeof(*threads));
> + igt_assert(threads);
> + priv->sum = 0;
> +
> + xe_for_each_mem_region(fd, memreg, region) {
> + xe_eudebug_for_each_engine(fd, engine) {
> + threads[current].c = c;
> + threads[current].hwe = engine;
> + threads[current].region = region;
> +
> + pthread_create(&threads[current].thread, NULL,
> + vm_bind_clear_thread, &threads[current]);
> + current++;
> + }
> + }
> +
> + for (current = 0; current < count; current++)
> + pthread_join(threads[current].thread, NULL);
> +
> + xe_for_each_mem_region(fd, memreg, region) {
> + unsigned long sum = 0;
> +
> + for (current = 0; current < count; current++)
> + if (threads[current].region == region)
> + sum += threads[current].sum;
> +
> + igt_info("%s sampled %lu objects\n", xe_region_name(region), sum);
> + priv->sum += sum;
> + }
> +
> + free(threads);
> + xe_device_put(fd);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> + struct vm_bind_clear_priv *priv = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + if (random() & 1) {
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint32_t v = 0xc1c1c1c1;
> +
> + struct drm_xe_eudebug_event_vm_bind *eb;
> + int fd, delta, r;
> +
> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> + eo->vm_bind_ref_seqno, eo->addr, eo->range);
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + vo.client_handle = eb->client_handle;
> + vo.vm_handle = eb->vm_handle;
> +
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + delta = (random() % eo->range) & -4;
> + r = pread(fd, &v, sizeof(v), eo->addr + delta);
> + igt_assert_eq(r, sizeof(v));
> + igt_assert_eq_u32(v, 0);
> +
> + close(fd);
> + }
> + priv->bind_count++;
> + }
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> + priv->unbind_count++;
> +}
> +
> +static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> +
> + xe_eudebug_ack_ufence(d->fd, ef);
> +}
> +
> +/**
> + * SUBTEST: vm-bind-clear
> + * Description:
> + * Check that fresh buffers we vm_bind into the ppGTT are always clear.
> + */
> +static void test_vm_bind_clear(int fd)
> +{
> + struct vm_bind_clear_priv *priv;
> + struct xe_eudebug_session *s;
> +
> + priv = vm_bind_clear_priv_create();
> + s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vm_bind_clear_test_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + vm_bind_clear_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + xe_eudebug_client_wait_done(s->client);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + igt_assert_eq(priv->bind_count, priv->unbind_count);
> + igt_assert_eq(priv->sum * 2, priv->bind_count);
> +
> + xe_eudebug_session_destroy(s);
> + vm_bind_clear_priv_destroy(priv);
> +}
> +
> +/* Repeated single-byte patterns, so memset() fills match full 32-bit reads */
> +#define UFENCE_CLIENT_VM_TEST_VAL_START 0xaaaaaaaa
> +#define UFENCE_CLIENT_VM_TEST_VAL_END 0xbbbbbbbb
> +
> +static void vma_ufence_client(struct xe_eudebug_client *c)
> +{
> + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
> + int fd = xe_eudebug_client_open_driver(c);
> + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_get_default_alignment(fd);
> + uint64_t items = bo_size / sizeof(uint32_t);
> + uint32_t bo[UFENCE_EVENT_COUNT_EXPECTED];
> + uint32_t *ptr[UFENCE_EVENT_COUNT_EXPECTED];
> +
> + for (int i = 0; i < n; i++) {
> + bo[i] = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + ptr[i] = xe_bo_map(fd, bo[i], bo_size);
> + igt_assert(ptr[i]);
> + memset(ptr[i], UFENCE_CLIENT_VM_TEST_VAL_START, bo_size);
> + }
> +
> + for (int i = 0; i < n; i++)
> + for (int j = 0; j < items; j++)
> + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_START);
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo[i], 0, b->addr, b->range, 0,
> + &b->f, 1, 0);
> + }
> +
> + /* Wait for acks on ufences */
> + for (int i = 0; i < n; i++) {
> + int err;
> + int64_t timeout_ns;
> + struct ufence_bind *b = &binds[i];
> +
> + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> + 0, &timeout_ns);
> + igt_assert_eq(err, 0);
> + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
> + igt_debug("wait #%d completed\n", i);
> +
> + for (int j = 0; j < items; j++)
> + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_END);
> + }
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
> + }
> +
> + free(binds);
> +
> + for (int i = 0; i < n; i++) {
> + munmap(ptr[i], bo_size);
> + gem_close(fd, bo[i]);
> + }
> +
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void debugger_test_vma_ufence(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t vm_handle,
> + uint64_t va_start,
> + uint64_t va_length)
> +{
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint32_t *v1, *v2;
> + uint32_t items = va_length / sizeof(uint32_t);
> + int fd;
> + int r, i;
> +
> + v1 = malloc(va_length);
> + igt_assert(v1);
> + v2 = malloc(va_length);
> + igt_assert(v2);
> +
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle;
> +
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + r = pread(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v1[i], UFENCE_CLIENT_VM_TEST_VAL_START);
> +
> + memset(v1, UFENCE_CLIENT_VM_TEST_VAL_END, va_length);
> +
> + r = pwrite(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + lseek(fd, va_start, SEEK_SET);
> + r = read(fd, v2, va_length);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq_u64(v1[i], v2[i]);
> +
> + fsync(fd);
> +
> + close(fd);
> + free(v1);
> + free(v2);
> +}
> +
> +static void vma_ufence_op_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> + struct ufence_priv *priv = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> + struct drm_xe_eudebug_event_vm_bind *eb;
> + unsigned int op_count = priv->vm_bind_op_count++;
> +
> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> + igt_debug("vm bind op event: ref %lld, addr 0x%llx, range 0x%llx, op_count %u\n",
> + eo->vm_bind_ref_seqno,
> + eo->addr,
> + eo->range,
> + op_count);
> + igt_debug("vm bind op event received: %s\n", event_str);
> + xe_eudebug_assert(d, eo->vm_bind_ref_seqno);
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> +
> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> + eo->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> + "vm bind event does not have ufence: %s\n", event_str);
> +
> + priv->ufence_event_seqno[op_count] = eo->vm_bind_ref_seqno;
> + priv->ufence_event_vm_addr_start[op_count] = eo->addr;
> + priv->ufence_event_vm_addr_range[op_count] = eo->range;
> + }
> +}
> +
> +static void vma_ufence_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> + struct ufence_priv *priv = d->ptr;
> + unsigned int ufence_count = priv->ufence_event_count;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> + struct drm_xe_eudebug_event_vm_bind *eb;
> + uint64_t addr = priv->ufence_event_vm_addr_start[ufence_count];
> + uint64_t range = priv->ufence_event_vm_addr_range[ufence_count];
> +
> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> + igt_debug("ufence event received: %s\n", event_str);
> +
> + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
> + "surplus ufence event received: %s\n", event_str);
> + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
> +
> + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> + ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> + "vm bind event does not have ufence: %s\n", event_str);
> + igt_debug("vm bind ufence event received with ref %lld, addr 0x%lx, range 0x%lx\n",
> + ef->vm_bind_ref_seqno,
> + addr,
> + range);
> + debugger_test_vma_ufence(d, eb->client_handle, eb->vm_handle,
> + addr, range);
> +
> + xe_eudebug_ack_ufence(d->fd, ef);
> + }
> +}
> +
> +/**
> + * SUBTEST: vma-ufence
> + * Description:
> + * Intercept the vm bind after receiving the ufence event, then access the
> + * target vm and write to it. Finally, check on the client side that the
> + * write was successful.
> + */
> +static void test_vma_ufence(int fd, unsigned int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct ufence_priv *priv;
> +
> + priv = ufence_priv_create();
> + s = xe_eudebug_session_create(fd, vma_ufence_client, flags, priv);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vma_ufence_op_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + vma_ufence_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + xe_eudebug_client_wait_done(s->client);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + xe_eudebug_event_log_print(s->debugger->log, true);
> + xe_eudebug_event_log_print(s->client->log, true);
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +
> + xe_eudebug_session_destroy(s);
> + ufence_priv_destroy(priv);
> +}
> +
> +igt_main
> +{
> + bool was_enabled;
> + bool *multigpu_was_enabled;
> + int fd, gpu_count;
> +
> + igt_fixture {
> + fd = drm_open_driver(DRIVER_XE);
> + was_enabled = xe_eudebug_enable(fd, true);
> + }
> +
> + igt_subtest("sysfs-toggle")
> + test_sysfs_toggle(fd);
> +
> + igt_subtest("basic-connect")
> + test_connect(fd);
> +
> + igt_subtest("connect-user")
> + test_connect_user(fd);
> +
> + igt_subtest("basic-close")
> + test_close(fd);
> +
> + igt_subtest("basic-read-event")
> + test_read_event(fd);
> +
> + igt_subtest("basic-client")
> + test_basic_sessions(fd, 0, 1, true);
> +
> + igt_subtest("basic-client-th")
> + test_basic_sessions_th(fd, 0, 1, true);
> +
> + igt_subtest("basic-vm-access")
> + test_vm_access(fd, 0, 1);
> +
> + igt_subtest("basic-vm-access-userptr")
> + test_vm_access(fd, VM_BIND_OP_MAP_USERPTR, 1);
> +
> + igt_subtest("basic-vm-access-parameters")
> + test_vm_access_parameters(fd, 0, 1);
> +
> + igt_subtest("multiple-sessions")
> + test_basic_sessions(fd, CREATE_VMS | CREATE_EXEC_QUEUES, 4, true);
> +
> + igt_subtest("basic-vms")
> + test_basic_sessions(fd, CREATE_VMS, 1, true);
> +
> + igt_subtest("basic-exec-queues")
> + test_basic_sessions(fd, CREATE_EXEC_QUEUES, 1, true);
> +
> + igt_subtest("basic-vm-bind")
> + test_basic_sessions(fd, VM_BIND, 1, true);
> +
> + igt_subtest("basic-vm-bind-ufence")
> + test_basic_ufence(fd, 0);
> +
> + igt_subtest("vma-ufence")
> + test_vma_ufence(fd, 0);
> +
> + igt_subtest("vm-bind-clear")
> + test_vm_bind_clear(fd);
> +
> + igt_subtest("basic-vm-bind-discovery")
> + test_basic_discovery(fd, VM_BIND, true);
> +
> + igt_subtest("basic-vm-bind-metadata-discovery")
> + test_basic_discovery(fd, VM_BIND_METADATA, true);
> +
> + igt_subtest("basic-vm-bind-vm-destroy")
> + test_basic_sessions(fd, VM_BIND_VM_DESTROY, 1, false);
> +
> + igt_subtest("basic-vm-bind-vm-destroy-discovery")
> + test_basic_discovery(fd, VM_BIND_VM_DESTROY, false);
> +
> + igt_subtest("basic-vm-bind-extended")
> + test_basic_sessions(fd, VM_BIND_EXTENDED, 1, true);
> +
> + igt_subtest("basic-vm-bind-extended-discovery")
> + test_basic_discovery(fd, VM_BIND_EXTENDED, true);
> +
> + igt_subtest("read-metadata")
> + test_metadata_read(fd, 0, 1);
> +
> + igt_subtest("attach-debug-metadata")
> + test_metadata_attach(fd, 0, 1);
> +
> + igt_subtest("discovery-race")
> + test_race_discovery(fd, 0, 4);
> +
> + igt_subtest("discovery-race-vmbind")
> + test_race_discovery(fd, DISCOVERY_VM_BIND, 4);
> +
> + igt_subtest("discovery-empty")
> + test_empty_discovery(fd, DISCOVERY_CLOSE_CLIENT, 16);
> +
> + igt_subtest("discovery-empty-clients")
> + test_empty_discovery(fd, DISCOVERY_DESTROY_RESOURCES, 16);
> +
> + igt_fixture {
> + xe_eudebug_enable(fd, was_enabled);
> + drm_close_driver(fd);
> + }
> +
> + igt_subtest_group {
> + igt_fixture {
> + gpu_count = drm_prepare_filtered_multigpu(DRIVER_XE);
> + igt_require(gpu_count >= 2);
> +
> + multigpu_was_enabled = malloc(gpu_count * sizeof(bool));
> + igt_assert(multigpu_was_enabled);
> + for (int i = 0; i < gpu_count; i++) {
> + fd = drm_open_filtered_card(i);
> + multigpu_was_enabled[i] = xe_eudebug_enable(fd, true);
> + close(fd);
> + }
> + }
> +
> + igt_subtest("multigpu-basic-client") {
> + igt_multi_fork(child, gpu_count) {
> + fd = drm_open_filtered_card(child);
> + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
> + child, errno);
> + igt_assert(is_xe_device(fd));
> +
> + test_basic_sessions(fd, 0, 1, true);
> + close(fd);
> + }
> + igt_waitchildren();
> + }
> +
> + igt_subtest("multigpu-basic-client-many") {
> + igt_multi_fork(child, gpu_count) {
> + fd = drm_open_filtered_card(child);
> + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
> + child, errno);
> + igt_assert(is_xe_device(fd));
> +
> + test_basic_sessions(fd, 0, 4, true);
> + close(fd);
> + }
> + igt_waitchildren();
> + }
> +
> + igt_fixture {
> + for (int i = 0; i < gpu_count; i++) {
> + fd = drm_open_filtered_card(i);
> + xe_eudebug_enable(fd, multigpu_was_enabled[i]);
> + close(fd);
> + }
> + free(multigpu_was_enabled);
> + }
> + }
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 00556c9d6..0f996fdc8 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -318,6 +318,14 @@ intel_xe_progs = [
> 'xe_sysfs_scheduler',
> ]
>
> +intel_xe_eudebug_progs = [
> + 'xe_eudebug',
> +]
> +
> +if build_xe_eudebug
> + intel_xe_progs += intel_xe_eudebug_progs
> +endif
> +
> chamelium_progs = [
> 'kms_chamelium_audio',
> 'kms_chamelium_color',
> --
> 2.34.1
>
^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-06 14:46 ` Kamil Konieczny
@ 2024-09-09 10:34 ` Zbigniew Kempczyński
0 siblings, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-09 10:34 UTC (permalink / raw)
To: Kamil Konieczny, igt-dev, Christoph Manszewski,
Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala,
Karolina Stolarek, Jonathan Cavitt
On Fri, Sep 06, 2024 at 04:46:36PM +0200, Kamil Konieczny wrote:
> Hi Christoph,
> On 2024-09-05 at 11:28:08 +0200, Christoph Manszewski wrote:
> > From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> >
> > For typical debugging under gdb one can specify two main use cases:
> > accessing and manipulating resources created by the application and
> > manipulating thread execution (interrupting and setting breakpoints).
> >
> > This test adds coverage for the former by checking that:
> > - the debugger reports the expected events for Xe resources created
> > by the debugged client,
> > - the debugger is able to read and write the vm of the debugged client.
> >
> > Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> > Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> > Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> > Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > ---
> > docs/testplan/meson.build | 13 +-
> > meson_options.txt | 2 +-
> > tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
> > tests/meson.build | 8 +
> > 4 files changed, 2737 insertions(+), 2 deletions(-)
> > create mode 100644 tests/intel/xe_eudebug.c
> >
> > diff --git a/docs/testplan/meson.build b/docs/testplan/meson.build
> > index 5560347f1..e86af028e 100644
> > --- a/docs/testplan/meson.build
> > +++ b/docs/testplan/meson.build
> > @@ -33,11 +33,22 @@ else
> > doc_dependencies = []
> > endif
> >
> > +xe_excluded_tests = []
> > +if not build_xe_eudebug
> > + foreach test : intel_xe_eudebug_progs
> > + xe_excluded_tests += meson.current_source_dir() + '/../../tests/intel/' + test + '.c'
> > + endforeach
> > +endif
> > +
> > +if xe_excluded_tests.length() > 0
> > + xe_excluded_tests = ['--exclude-files'] + xe_excluded_tests
> > +endif
> > +
> > if build_xe
> > test_dict = {
> > 'i915_tests': { 'input': i915_test_config, 'extra_args': check_testlist },
> > 'kms_tests': { 'input': kms_test_config, 'extra_args': kms_check_testlist },
> > - 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist }
> > + 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist + xe_excluded_tests }
> > }
>
> Do we need this change? Can we just add igt@.*eudebug.* to the blocklist
> and have it compile-tested?
Yes, we need this change. We want to keep the eudebug tests behind the
compilation flag; at least I prefer that solution over compiling the
eudebug tests unconditionally.

The problem with igt_doc is that, with the eudebug tests disabled, it
warns about a mismatch between the list of subtests detected from the
source and from the binary (the eudebug tests don't exist if they are
disabled in the meson configuration).
--
Zbigniew
>
> Regards,
> Kamil
>
> > else
> > test_dict = {
> > diff --git a/meson_options.txt b/meson_options.txt
> > index 11922523b..c410f9b77 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -45,7 +45,7 @@ option('xe_driver',
> > option('xe_eudebug',
> > type : 'feature',
> > value : 'disabled',
> > - description : 'Build library for Xe EU debugger')
> > + description : 'Build library and tests for Xe EU debugger')
> >
> > option('libdrm_drivers',
> > type : 'array',
> > diff --git a/tests/intel/xe_eudebug.c b/tests/intel/xe_eudebug.c
> > new file mode 100644
> > index 000000000..fd2894a5e
> > --- /dev/null
> > +++ b/tests/intel/xe_eudebug.c
> > @@ -0,0 +1,2716 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2023 Intel Corporation
> > + */
> > +
> > +/**
> > + * TEST: Test EU Debugger functionality
> > + * Category: Core
> > + * Mega feature: EUdebug
> > + * Sub-category: EUdebug tests
> > + * Functionality: eu debugger framework
> > + * Test category: functionality test
> > + */
> > +
> > +#include <grp.h>
> > +#include <poll.h>
> > +#include <pthread.h>
> > +#include <pwd.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/prctl.h>
> > +
> > +#include "igt.h"
> > +#include "intel_pat.h"
> > +#include "lib/igt_syncobj.h"
> > +#include "xe/xe_eudebug.h"
> > +#include "xe/xe_ioctl.h"
> > +#include "xe/xe_query.h"
> > +
> > +/**
> > + * SUBTEST: sysfs-toggle
> > + * Description:
> > + * Exercise the debugger enable/disable sysfs toggle logic
> > + */
> > +static void test_sysfs_toggle(int fd)
> > +{
> > + xe_eudebug_enable(fd, false);
> > + igt_assert(!xe_eudebug_debugger_available(fd));
> > +
> > + xe_eudebug_enable(fd, true);
> > + igt_assert(xe_eudebug_debugger_available(fd));
> > + xe_eudebug_enable(fd, true);
> > + igt_assert(xe_eudebug_debugger_available(fd));
> > +
> > + xe_eudebug_enable(fd, false);
> > + igt_assert(!xe_eudebug_debugger_available(fd));
> > + xe_eudebug_enable(fd, false);
> > + igt_assert(!xe_eudebug_debugger_available(fd));
> > +
> > + xe_eudebug_enable(fd, true);
> > + igt_assert(xe_eudebug_debugger_available(fd));
> > +}
> > +
> > +#define STAGE_PRE_DEBUG_RESOURCES_DONE 1
> > +#define STAGE_DISCOVERY_DONE 2
> > +
> > +#define CREATE_VMS (1 << 0)
> > +#define CREATE_EXEC_QUEUES (1 << 1)
> > +#define VM_BIND (1 << 2)
> > +#define VM_BIND_VM_DESTROY (1 << 3)
> > +#define VM_BIND_EXTENDED (1 << 4)
> > +#define VM_METADATA (1 << 5)
> > +#define VM_BIND_METADATA (1 << 6)
> > +#define VM_BIND_OP_MAP_USERPTR (1 << 7)
> > +#define TEST_DISCOVERY (1 << 31)
> > +
> > +#define PAGE_SIZE 4096
> > +static struct drm_xe_vm_bind_op_ext_attach_debug *
> > +basic_vm_bind_metadata_ext_prepare(int fd, struct xe_eudebug_client *c,
> > + uint8_t **data, uint32_t data_size)
> > +{
> > + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
> > + int i;
> > +
> > + *data = calloc(data_size, sizeof(*data));
> > + igt_assert(*data);
> > +
> > + for (i = 0; i < data_size; i++)
> > + (*data)[i] = 0xff & (i + (i > PAGE_SIZE));
> > +
> > + ext = calloc(WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM, sizeof(*ext));
> > + igt_assert(ext);
> > +
> > + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++) {
> > + ext[i].base.name = XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG;
> > + ext[i].metadata_id = xe_eudebug_client_metadata_create(c, fd, i,
> > + (i + 1) * PAGE_SIZE, *data);
> > + ext[i].cookie = i;
> > +
> > + if (i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM - 1)
> > + ext[i].base.next_extension = to_user_pointer(&ext[i + 1]);
> > + }
> > + return ext;
> > +}
> > +
> > +static void basic_vm_bind_metadata_ext_del(int fd, struct xe_eudebug_client *c,
> > + struct drm_xe_vm_bind_op_ext_attach_debug *ext,
> > + uint8_t *data)
> > +{
> > + for (int i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
> > + xe_eudebug_client_metadata_destroy(c, fd, ext[i].metadata_id, i,
> > + (i + 1) * PAGE_SIZE);
> > + free(ext);
> > + free(data);
> > +}
> > +
> > +static void basic_vm_bind_client(int fd, struct xe_eudebug_client *c)
> > +{
> > + struct drm_xe_vm_bind_op_ext_attach_debug *ext = NULL;
> > + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + size_t bo_size = xe_get_default_alignment(fd);
> > + bool test_discovery = c->flags & TEST_DISCOVERY;
> > + bool test_metadata = c->flags & VM_BIND_METADATA;
> > + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> > + system_memory(fd), 0);
> > + uint64_t addr = 0x1a0000;
> > + uint8_t *data = NULL;
> > +
> > + if (test_metadata)
> > + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, PAGE_SIZE);
> > +
> > + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
> > + bo_size, 0, NULL, 0, to_user_pointer(ext));
> > +
> > + if (test_discovery) {
> > + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> > + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> > + }
> > +
> > + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
> > +
> > + if (test_metadata)
> > + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
> > +
> > + gem_close(fd, bo);
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > +}
> > +
> > +static void basic_vm_bind_vm_destroy_client(int fd, struct xe_eudebug_client *c)
> > +{
> > + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + size_t bo_size = xe_get_default_alignment(fd);
> > + bool test_discovery = c->flags & TEST_DISCOVERY;
> > + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> > + system_memory(fd), 0);
> > + uint64_t addr = 0x1a0000;
> > +
> > + if (test_discovery) {
> > + vm = xe_vm_create(fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > +
> > + xe_vm_bind_async(fd, vm, 0, bo, 0, addr, bo_size, NULL, 0);
> > +
> > + xe_vm_destroy(fd, vm);
> > +
> > + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> > + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> > + } else {
> > + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + xe_eudebug_client_vm_bind(c, fd, vm, bo, 0, addr, bo_size);
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > + }
> > +
> > + gem_close(fd, bo);
> > +}
> > +
> > +#define BO_ADDR 0x1a0000
> > +#define BO_ITEMS 4096
> > +#define MIN_BO_SIZE (BO_ITEMS * sizeof(uint64_t))
> > +
> > +union buf_id {
> > + uint32_t fd;
> > + void *userptr;
> > +};
> > +
> > +struct bind_list {
> > + int fd;
> > + uint32_t vm;
> > + union buf_id *bo;
> > + struct drm_xe_vm_bind_op *bind_ops;
> > + unsigned int n;
> > +};
> > +
> > +static void *bo_get_ptr(int fd, struct drm_xe_vm_bind_op *o)
> > +{
> > + void *ptr;
> > +
> > + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
> > + ptr = xe_bo_map(fd, o->obj, o->range);
> > + else
> > + ptr = (void *)(uintptr_t)o->userptr;
> > +
> > + igt_assert(ptr);
> > +
> > + return ptr;
> > +}
> > +
> > +static void bo_put_ptr(int fd, struct drm_xe_vm_bind_op *o, void *ptr)
> > +{
> > + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
> > + munmap(ptr, o->range);
> > +}
> > +
> > +static void bo_prime(int fd, struct drm_xe_vm_bind_op *o)
> > +{
> > + uint64_t *d;
> > + uint64_t i;
> > +
> > + d = bo_get_ptr(fd, o);
> > +
> > + for (i = 0; i < o->range / sizeof(*d); i++)
> > + d[i] = o->addr + i;
> > +
> > + bo_put_ptr(fd, o, d);
> > +}
> > +
> > +static void bo_check(int fd, struct drm_xe_vm_bind_op *o)
> > +{
> > + uint64_t *d;
> > + uint64_t i;
> > +
> > + d = bo_get_ptr(fd, o);
> > +
> > + for (i = 0; i < o->range / sizeof(*d); i++)
> > + igt_assert_eq(d[i], o->addr + i + 1);
> > +
> > + bo_put_ptr(fd, o, d);
> > +}
> > +
> > +static union buf_id *vm_create_objects(int fd, uint32_t bo_placement, uint32_t vm,
> > + unsigned int size, unsigned int n)
> > +{
> > + union buf_id *bo;
> > + unsigned int i;
> > +
> > + bo = calloc(n, sizeof(*bo));
> > + igt_assert(bo);
> > +
> > + for (i = 0; i < n; i++) {
> > + if (bo_placement) {
> > + bo[i].fd = xe_bo_create(fd, vm, size, bo_placement, 0);
> > + igt_assert(bo[i].fd);
> > + } else {
> > + bo[i].userptr = aligned_alloc(PAGE_SIZE, size);
> > + igt_assert(bo[i].userptr);
> > + }
> > + }
> > +
> > + return bo;
> > +}
> > +
> > +static struct bind_list *create_bind_list(int fd, uint32_t bo_placement,
> > + uint32_t vm, unsigned int n,
> > + unsigned int target_size)
> > +{
> > + unsigned int i = target_size ?: MIN_BO_SIZE;
> > + const unsigned int bo_size = max_t(bo_size, xe_get_default_alignment(fd), i);
> > + bool is_userptr = !bo_placement;
> > + struct bind_list *bl;
> > +
> > + bl = malloc(sizeof(*bl));
> > + bl->fd = fd;
> > + bl->vm = vm;
> > + bl->bo = vm_create_objects(fd, bo_placement, vm, bo_size, n);
> > + bl->n = n;
> > + bl->bind_ops = calloc(n, sizeof(*bl->bind_ops));
> > + igt_assert(bl->bind_ops);
> > +
> > + for (i = 0; i < n; i++) {
> > + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
> > +
> > + if (is_userptr) {
> > + o->obj = 0;
> > + o->userptr = (uintptr_t)bl->bo[i].userptr;
> > + o->op = DRM_XE_VM_BIND_OP_MAP_USERPTR;
> > + } else {
> > + o->obj = bl->bo[i].fd;
> > + o->obj_offset = 0;
> > + o->op = DRM_XE_VM_BIND_OP_MAP;
> > + }
> > +
> > + o->range = bo_size;
> > + o->addr = BO_ADDR + 2 * i * bo_size;
> > + o->flags = 0;
> > + o->pat_index = intel_get_pat_idx_wb(fd);
> > + o->prefetch_mem_region_instance = 0;
> > + o->reserved[0] = 0;
> > + o->reserved[1] = 0;
> > + }
> > +
> > + for (i = 0; i < bl->n; i++) {
> > + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
> > +
> > + igt_debug("bo %d: addr 0x%llx, range 0x%llx\n", i, o->addr, o->range);
> > + bo_prime(fd, o);
> > + }
> > +
> > + return bl;
> > +}
> > +
> > +static void do_bind_list(struct xe_eudebug_client *c,
> > + struct bind_list *bl, bool sync)
> > +{
> > + struct drm_xe_sync uf_sync = {
> > + .type = DRM_XE_SYNC_TYPE_USER_FENCE,
> > + .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> > + .timeline_value = 1337,
> > + };
> > + uint64_t ref_seqno = 0, op_ref_seqno = 0;
> > + uint64_t *fence_data;
> > + int i;
> > +
> > + if (sync) {
> > + fence_data = aligned_alloc(xe_get_default_alignment(bl->fd),
> > + sizeof(*fence_data));
> > + igt_assert(fence_data);
> > + uf_sync.addr = to_user_pointer(fence_data);
> > + memset(fence_data, 0, sizeof(*fence_data));
> > + }
> > +
> > + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, &uf_sync, sync ? 1 : 0);
> > + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
> > + bl->fd, bl->vm, 0, bl->n, &ref_seqno);
> > + for (i = 0; i < bl->n; i++)
> > + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
> > + ref_seqno,
> > + &op_ref_seqno,
> > + bl->bind_ops[i].addr,
> > + bl->bind_ops[i].range,
> > + 0);
> > +
> > + if (sync) {
> > + xe_wait_ufence(bl->fd, fence_data, uf_sync.timeline_value, 0,
> > + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> > + free(fence_data);
> > + }
> > +}
> > +
> > +static void free_bind_list(struct xe_eudebug_client *c, struct bind_list *bl)
> > +{
> > + unsigned int i;
> > +
> > + for (i = 0; i < bl->n; i++) {
> > + igt_debug("%d: checking 0x%llx (%lld)\n",
> > + i, bl->bind_ops[i].addr, bl->bind_ops[i].addr);
> > + bo_check(bl->fd, &bl->bind_ops[i]);
> > + if (bl->bind_ops[i].op == DRM_XE_VM_BIND_OP_MAP_USERPTR)
> > + free(bl->bo[i].userptr);
> > + xe_eudebug_client_vm_unbind(c, bl->fd, bl->vm, 0,
> > + bl->bind_ops[i].addr,
> > + bl->bind_ops[i].range);
> > + }
> > +
> > + free(bl->bind_ops);
> > + free(bl->bo);
> > + free(bl);
> > +}
> > +
> > +static void vm_bind_client(int fd, struct xe_eudebug_client *c)
> > +{
> > + uint64_t op_ref_seqno, ref_seqno;
> > + struct bind_list *bl;
> > + bool test_discovery = c->flags & TEST_DISCOVERY;
> > + size_t bo_size = 3 * xe_get_default_alignment(fd);
> > + uint32_t bo[2] = {
> > + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
> > + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
> > + };
> > + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + uint64_t addr[] = {0x2a0000, 0x3a0000};
> > + uint64_t rebind_bo_offset = 2 * bo_size / 3;
> > + uint64_t size = bo_size / 3;
> > + int i = 0;
> > +
> > + if (test_discovery) {
> > + xe_vm_bind_async(fd, vm, 0, bo[0], 0, addr[0], bo_size, NULL, 0);
> > +
> > + xe_vm_unbind_async(fd, vm, 0, 0, addr[0] + size, size, NULL, 0);
> > +
> > + xe_vm_bind_async(fd, vm, 0, bo[1], 0, addr[1], bo_size, NULL, 0);
> > +
> > + xe_vm_bind_async(fd, vm, 0, bo[1], rebind_bo_offset, addr[1], size, NULL, 0);
> > +
> > + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
> > + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, NULL, 0);
> > +
> > + xe_vm_unbind_all_async(fd, vm, 0, bo[0], NULL, 0);
> > +
> > + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
> > + bl->fd, bl->vm, 0, bl->n + 2, &ref_seqno);
> > +
> > + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> > + &op_ref_seqno, addr[1], size, 0);
> > + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> > + &op_ref_seqno, addr[1] + size, size * 2, 0);
> > +
> > + for (i = 0; i < bl->n; i++)
> > + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
> > + ref_seqno, &op_ref_seqno,
> > + bl->bind_ops[i].addr,
> > + bl->bind_ops[i].range, 0);
> > +
> > + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> > + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> > + } else {
> > + xe_eudebug_client_vm_bind(c, fd, vm, bo[0], 0, addr[0], bo_size);
> > + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr[0] + size, size);
> > +
> > + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], 0, addr[1], bo_size);
> > + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], rebind_bo_offset, addr[1], size);
> > +
> > + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
> > + do_bind_list(c, bl, false);
> > + }
> > +
> > + xe_vm_unbind_all_async(fd, vm, 0, bo[1], NULL, 0);
> > +
> > + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, fd, vm, 0,
> > + 1, &ref_seqno);
> > + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, ref_seqno,
> > + &op_ref_seqno, 0, 0, 0);
> > +
> > + gem_close(fd, bo[0]);
> > + gem_close(fd, bo[1]);
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > +}
> > +
> > +static void run_basic_client(struct xe_eudebug_client *c)
> > +{
> > + int fd, i;
> > +
> > + fd = xe_eudebug_client_open_driver(c);
> > + xe_device_get(fd);
> > +
> > + if (c->flags & CREATE_VMS) {
> > + const uint32_t flags[] = {
> > + DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | DRM_XE_VM_CREATE_FLAG_LR_MODE,
> > + DRM_XE_VM_CREATE_FLAG_LR_MODE,
> > + };
> > + uint32_t vms[ARRAY_SIZE(flags)];
> > +
> > + for (i = 0; i < ARRAY_SIZE(flags); i++)
> > + vms[i] = xe_eudebug_client_vm_create(c, fd, flags[i], 0);
> > +
> > + for (i--; i >= 0; i--)
> > + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
> > + }
> > +
> > + if (c->flags & CREATE_EXEC_QUEUES) {
> > + struct drm_xe_exec_queue_create *create;
> > + struct drm_xe_engine_class_instance *hwe;
> > + struct drm_xe_ext_set_property eq_ext = {
> > + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> > + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> > + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> > + };
> > + uint32_t vm;
> > +
> > + create = calloc(xe_number_engines(fd), sizeof(*create));
> > +
> > + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > +
> > + i = 0;
> > + xe_eudebug_for_each_engine(fd, hwe) {
> > + create[i].instances = to_user_pointer(hwe);
> > + create[i].vm_id = vm;
> > + create[i].width = 1;
> > + create[i].num_placements = 1;
> > + create[i].extensions = to_user_pointer(&eq_ext);
> > + xe_eudebug_client_exec_queue_create(c, fd, &create[i++]);
> > + }
> > +
> > + while (--i >= 0)
> > + xe_eudebug_client_exec_queue_destroy(c, fd, &create[i]);
> > +
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > + }
> > +
> > + if (c->flags & VM_BIND || c->flags & VM_BIND_METADATA)
> > + basic_vm_bind_client(fd, c);
> > +
> > + if (c->flags & VM_BIND_EXTENDED)
> > + vm_bind_client(fd, c);
> > +
> > + if (c->flags & VM_BIND_VM_DESTROY)
> > + basic_vm_bind_vm_destroy_client(fd, c);
> > +
> > + xe_device_put(fd);
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +static int read_event(int debugfd, struct drm_xe_eudebug_event *event)
> > +{
> > + int ret;
> > +
> > + ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
> > + if (ret < 0)
> > + return -errno;
> > +
> > + return ret;
> > +}
> > +
> > +static int __read_event(int debugfd, struct drm_xe_eudebug_event *event)
> > +{
> > + int ret;
> > +
> > + ret = ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
> > + if (ret < 0)
> > + return -errno;
> > +
> > + return ret;
> > +}
> > +
> > +static int poll_event(int fd, int timeout_ms)
> > +{
> > + int ret;
> > +
> > + struct pollfd p = {
> > + .fd = fd,
> > + .events = POLLIN,
> > + .revents = 0,
> > + };
> > +
> > + ret = poll(&p, 1, timeout_ms);
> > + if (ret == -1)
> > + return -errno;
> > +
> > + return ret == 1 && (p.revents & POLLIN);
> > +}
> > +
> > +static int __debug_connect(int fd, int *debugfd, struct drm_xe_eudebug_connect *param)
> > +{
> > + int ret = 0;
> > +
> > + *debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, param);
> > +
> > + if (*debugfd < 0) {
> > + ret = -errno;
> > + igt_assume(ret != 0);
> > + }
> > +
> > + errno = 0;
> > + return ret;
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-connect
> > + * Description:
> > + * Exercise the XE_EUDEBUG_CONNECT ioctl, passing
> > + * valid and invalid params.
> > + */
> > +static void test_connect(int fd)
> > +{
> > + struct drm_xe_eudebug_connect param = {};
> > + int debugfd, ret;
> > + pid_t *pid;
> > +
> > + pid = mmap(NULL, sizeof(pid_t), PROT_WRITE,
> > + MAP_SHARED | MAP_ANON, -1, 0);
> > +
> > + /* get fresh unrelated pid */
> > + igt_fork(child, 1)
> > + *pid = getpid();
> > +
> > + igt_waitchildren();
> > + param.pid = *pid;
> > + munmap(pid, sizeof(pid_t));
> > +
> > + ret = __debug_connect(fd, &debugfd, ¶m);
> > + igt_assert(debugfd == -1);
> > + igt_assert_eq(ret, param.pid ? -ENOENT : -EINVAL);
> > +
> > + param.pid = 0;
> > + ret = __debug_connect(fd, &debugfd, ¶m);
> > + igt_assert(debugfd == -1);
> > + igt_assert_eq(ret, -EINVAL);
> > +
> > + param.pid = getpid();
> > + param.version = -1;
> > + ret = __debug_connect(fd, &debugfd, ¶m);
> > + igt_assert(debugfd == -1);
> > + igt_assert_eq(ret, -EINVAL);
> > +
> > + param.version = 0;
> > + param.flags = ~0;
> > + ret = __debug_connect(fd, &debugfd, ¶m);
> > + igt_assert(debugfd == -1);
> > + igt_assert_eq(ret, -EINVAL);
> > +
> > + param.flags = 0;
> > + param.extensions = ~0;
> > + ret = __debug_connect(fd, &debugfd, ¶m);
> > + igt_assert(debugfd == -1);
> > + igt_assert_eq(ret, -EINVAL);
> > +
> > + param.extensions = 0;
> > + ret = __debug_connect(fd, &debugfd, ¶m);
> > + igt_assert_neq(debugfd, -1);
> > + igt_assert_eq(ret, 0);
> > +
> > + close(debugfd);
> > +}
> > +
> > +static void switch_user(__uid_t uid, __gid_t gid)
> > +{
> > + struct group *gr;
> > + __gid_t gr_v;
> > +
> > + /* Users other than root need to belong to the video group */
> > + gr = getgrnam("video");
> > + igt_assert(gr);
> > +
> > + /* Drop all */
> > + igt_assert_eq(setgroups(1, &gr->gr_gid), 0);
> > + igt_assert_eq(setgid(gid), 0);
> > + igt_assert_eq(setuid(uid), 0);
> > +
> > + igt_assert_eq(getgroups(1, &gr_v), 1);
> > + igt_assert_eq(gr_v, gr->gr_gid);
> > + igt_assert_eq(getgid(), gid);
> > + igt_assert_eq(getuid(), uid);
> > +
> > + igt_assert_eq(prctl(PR_SET_DUMPABLE, 1L), 0);
> > +}
> > +
> > +/**
> > + * SUBTEST: connect-user
> > + * Description:
> > + * Verify the unprivileged XE_EUDEBUG_CONNECT ioctl.
> > + * Check:
> > + * - user debugger to user workload connection
> > + * - user debugger to other user workload connection
> > + * - user debugger to privileged workload connection
> > + */
> > +static void test_connect_user(int fd)
> > +{
> > + struct drm_xe_eudebug_connect param = {};
> > + struct passwd *pwd, *pwd2;
> > + const char *user1 = "lp";
> > + const char *user2 = "mail";
> > + int debugfd, ret, i;
> > + int p1[2], p2[2];
> > + __uid_t u1, u2;
> > + __gid_t g1, g2;
> > + int newfd;
> > + pid_t pid;
> > +
> > +#define NUM_USER_TESTS 4
> > +#define P_APP 0
> > +#define P_GDB 1
> > + struct conn_user {
> > + /* u[0] - process uid, u[1] - gdb uid */
> > + __uid_t u[P_GDB + 1];
> > + /* g[0] - process gid, g[1] - gdb gid */
> > + __gid_t g[P_GDB + 1];
> > + /* Expected fd from open */
> > + int ret;
> > + /* Skip this test case */
> > + int skip;
> > + const char *desc;
> > + } test[NUM_USER_TESTS] = {};
> > +
> > + igt_assert(!pipe(p1));
> > + igt_assert(!pipe(p2));
> > +
> > + pwd = getpwnam(user1);
> > + igt_require(pwd);
> > + u1 = pwd->pw_uid;
> > + g1 = pwd->pw_gid;
> > +
> > + /*
> > + * Keep a copy of the needed contents, as getpwnam()
> > + * returns a pointer to a static memory area that
> > + * subsequent calls will overwrite.
> > + * Note that getpwnam() returns NULL if it cannot
> > + * find the user in passwd.
> > + */
> > + setpwent();
> > + pwd2 = getpwnam(user2);
> > + if (pwd2) {
> > + u2 = pwd2->pw_uid;
> > + g2 = pwd2->pw_gid;
> > + }
> > +
> > + test[0].skip = !pwd;
> > + test[0].u[P_GDB] = u1;
> > + test[0].g[P_GDB] = g1;
> > + test[0].ret = -EACCES;
> > + test[0].desc = "User GDB to Root App";
> > +
> > + test[1].skip = !pwd;
> > + test[1].u[P_APP] = u1;
> > + test[1].g[P_APP] = g1;
> > + test[1].u[P_GDB] = u1;
> > + test[1].g[P_GDB] = g1;
> > + test[1].ret = 0;
> > + test[1].desc = "User GDB to User App";
> > +
> > + test[2].skip = !pwd;
> > + test[2].u[P_APP] = u1;
> > + test[2].g[P_APP] = g1;
> > + test[2].ret = 0;
> > + test[2].desc = "Root GDB to User App";
> > +
> > + test[3].skip = !pwd2;
> > + test[3].u[P_APP] = u1;
> > + test[3].g[P_APP] = g1;
> > + test[3].u[P_GDB] = u2;
> > + test[3].g[P_GDB] = g2;
> > + test[3].ret = -EACCES;
> > + test[3].desc = "User GDB to Other User App";
> > +
> > + if (!pwd2)
> > + igt_warn("User %s not available in the system. Skipping subtests: %s.\n",
> > + user2, test[3].desc);
> > +
> > + for (i = 0; i < NUM_USER_TESTS; i++) {
> > + if (test[i].skip) {
> > + igt_debug("Subtest %s skipped\n", test[i].desc);
> > + continue;
> > + }
> > + igt_debug("Executing connection: %s\n", test[i].desc);
> > + igt_fork(child, 2) {
> > + if (!child) {
> > + if (test[i].u[P_APP])
> > + switch_user(test[i].u[P_APP], test[i].g[P_APP]);
> > +
> > + pid = getpid();
> > + /* Signal the PID */
> > + igt_assert(write(p1[1], &pid, sizeof(pid)) == sizeof(pid));
> > + /* wait with exit */
> > + igt_assert(read(p2[0], &pid, sizeof(pid)) == sizeof(pid));
> > + } else {
> > + if (test[i].u[P_GDB])
> > + switch_user(test[i].u[P_GDB], test[i].g[P_GDB]);
> > +
> > + igt_assert(read(p1[0], &pid, sizeof(pid)) == sizeof(pid));
> > + param.pid = pid;
> > +
> > + newfd = drm_open_driver(DRIVER_XE);
> > + ret = __debug_connect(newfd, &debugfd, ¶m);
> > +
> > + /* Release the app first */
> > + igt_assert(write(p2[1], &pid, sizeof(pid)) == sizeof(pid));
> > +
> > + igt_assert_eq(ret, test[i].ret);
> > + if (!ret)
> > + close(debugfd);
> > + }
> > + }
> > + igt_waitchildren();
> > + }
> > + close(p1[0]);
> > + close(p1[1]);
> > + close(p2[0]);
> > + close(p2[1]);
> > +#undef NUM_USER_TESTS
> > +#undef P_APP
> > +#undef P_GDB
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-close
> > + * Description:
> > + * Test whether eudebug can be reattached after closure.
> > + */
> > +static void test_close(int fd)
> > +{
> > + struct drm_xe_eudebug_connect param = { 0, };
> > + int debug_fd1, debug_fd2;
> > + int fd2;
> > +
> > + param.pid = getpid();
> > +
> > + igt_assert_eq(__debug_connect(fd, &debug_fd1, ¶m), 0);
> > + igt_assert(debug_fd1 >= 0);
> > + igt_assert_eq(__debug_connect(fd, &debug_fd2, ¶m), -EBUSY);
> > + igt_assert_eq(debug_fd2, -1);
> > +
> > + close(debug_fd1);
> > + fd2 = drm_open_driver(DRIVER_XE);
> > +
> > + igt_assert_eq(__debug_connect(fd2, &debug_fd2, ¶m), 0);
> > + igt_assert(debug_fd2 >= 0);
> > + close(fd2);
> > + close(debug_fd2);
> > + close(debug_fd1);
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-read-event
> > + * Description:
> > + * Synchronously exercise eu debugger event polling and reading.
> > + */
> > +#define MAX_EVENT_SIZE (32 * 1024)
> > +static void test_read_event(int fd)
> > +{
> > + struct drm_xe_eudebug_event *event;
> > + struct xe_eudebug_debugger *d;
> > + struct xe_eudebug_client *c;
> > +
> > + event = malloc(MAX_EVENT_SIZE);
> > + igt_assert(event);
> > + memset(event, 0, sizeof(*event));
> > +
> > + c = xe_eudebug_client_create(fd, run_basic_client, 0, NULL);
> > + d = xe_eudebug_debugger_create(fd, 0, NULL);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> > + igt_assert_eq(poll_event(d->fd, 500), 0);
> > +
> > + event->len = 1;
> > + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
> > + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> > +
> > + event->len = MAX_EVENT_SIZE;
> > + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
> > + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> > +
> > + xe_eudebug_client_start(c);
> > +
> > + igt_assert_eq(poll_event(d->fd, 500), 1);
> > + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> > + igt_assert_eq(read_event(d->fd, event), 0);
> > +
> > + igt_assert_eq(poll_event(d->fd, 500), 1);
> > +
> > + event->flags = 0;
> > + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> > +
> > + event->len = 0;
> > + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> > + igt_assert_eq(0, event->len);
> > +
> > + event->len = sizeof(*event) - 1;
> > + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> > +
> > + event->len = sizeof(*event);
> > + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
> > + igt_assert_lt(sizeof(*event), event->len);
> > +
> > + event->len = event->len - 1;
> > + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
> > + /* event->len should now contain the exact len */
> > + igt_assert_eq(read_event(d->fd, event), 0);
> > +
> > + fcntl(d->fd, F_SETFL, fcntl(d->fd, F_GETFL) | O_NONBLOCK);
> > + igt_assert(fcntl(d->fd, F_GETFL) & O_NONBLOCK);
> > +
> > + igt_assert_eq(poll_event(d->fd, 500), 0);
> > + event->len = MAX_EVENT_SIZE;
> > + event->flags = 0;
> > + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> > + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
> > +
> > + xe_eudebug_client_wait_done(c);
> > + xe_eudebug_client_stop(c);
> > +
> > + igt_assert_eq(poll_event(d->fd, 500), 0);
> > + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
> > +
> > + xe_eudebug_debugger_destroy(d);
> > + xe_eudebug_client_destroy(c);
> > +
> > + free(event);
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-client
> > + * Description:
> > + * Attach the debugger to a process which opens and closes an xe drm client.
> > + *
> > + * SUBTEST: basic-client-th
> > + * Description:
> > + * Create basic client resources (vms) in multiple threads.
> > + *
> > + * SUBTEST: multiple-sessions
> > + * Description:
> > + * Simultaneously attach many debuggers to many processes.
> > + * Each process opens and closes an xe drm client and creates a few resources.
> > + *
> > + * SUBTEST: basic-%s
> > + * Description:
> > + * Attach the debugger to process which creates and destroys a few %arg[1].
> > + *
> > + * SUBTEST: basic-vm-bind
> > + * Description:
> > + * Attach the debugger to a process that performs synchronous vm bind
> > + * and vm unbind.
> > + *
> > + * SUBTEST: basic-vm-bind-vm-destroy
> > + * Description:
> > + * Attach the debugger to a process that performs vm bind, and destroys
> > + * the vm without unbinding. Make sure that we don't get unbind events.
> > + *
> > + * SUBTEST: basic-vm-bind-extended
> > + * Description:
> > + * Attach the debugger to a process that performs bind, bind array, rebind,
> > + * partial unbind, unbind and unbind all operations.
> > + *
> > + * SUBTEST: multigpu-basic-client
> > + * Description:
> > + * Attach the debugger to a process which opens and closes an xe drm client on all Xe devices.
> > + *
> > + * SUBTEST: multigpu-basic-client-many
> > + * Description:
> > + * Simultaneously attach many debuggers to many processes on all Xe devices.
> > + * Each process opens and closes an xe drm client and creates a few resources.
> > + *
> > + * arg[1]:
> > + *
> > + * @vms: vms
> > + * @exec-queues: exec queues
> > + */
> > +
> > +static void test_basic_sessions(int fd, unsigned int flags, int count, bool match_opposite)
> > +{
> > + struct xe_eudebug_session **s;
> > + int i;
> > +
> > + s = calloc(count, sizeof(*s));
> > +
> > + igt_assert(s);
> > +
> > + for (i = 0; i < count; i++)
> > + s[i] = xe_eudebug_session_create(fd, run_basic_client, flags, NULL);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_session_run(s[i]);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_session_check(s[i], match_opposite, 0);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_session_destroy(s[i]);
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-vm-bind-discovery
> > + * Description:
> > + * Attach the debugger to a process that performs vm-bind before attaching
> > + * and check if the discovery process reports it.
> > + *
> > + * SUBTEST: basic-vm-bind-metadata-discovery
> > + * Description:
> > + * Attach the debugger to a process that performs vm-bind with metadata attached
> > + * before attaching and check if the discovery process reports it.
> > + *
> > + * SUBTEST: basic-vm-bind-vm-destroy-discovery
> > + * Description:
> > + * Attach the debugger to a process that performs vm bind, and destroys
> > + * the vm without unbinding before attaching. Make sure that we don't get
> > + * any bind/unbind and vm create/destroy events.
> > + *
> > + * SUBTEST: basic-vm-bind-extended-discovery
> > + * Description:
> > + * Attach the debugger to a process that performs bind, bind array, rebind,
> > + * partial unbind, and unbind all operations before attaching. Ensure that
> > + * we get only a single 'VM_BIND' event from the discovery worker.
> > + */
> > +static void test_basic_discovery(int fd, unsigned int flags, bool match_opposite)
> > +{
> > + struct xe_eudebug_debugger *d;
> > + struct xe_eudebug_session *s;
> > + struct xe_eudebug_client *c;
> > +
> > + s = xe_eudebug_session_create(fd, run_basic_client, flags | TEST_DISCOVERY, NULL);
> > +
> > + c = s->client;
> > + d = s->debugger;
> > +
> > + xe_eudebug_client_start(c);
> > + xe_eudebug_debugger_wait_stage(s, STAGE_PRE_DEBUG_RESOURCES_DONE);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> > + xe_eudebug_debugger_start_worker(d);
> > +
> > + /* give the worker time to do its job */
> > + sleep(2);
> > + xe_eudebug_debugger_signal_stage(d, STAGE_DISCOVERY_DONE);
> > +
> > + xe_eudebug_client_wait_done(c);
> > +
> > + xe_eudebug_debugger_stop_worker(d, 1);
> > +
> > + xe_eudebug_event_log_print(d->log, true);
> > + xe_eudebug_event_log_print(c->log, true);
> > +
> > + xe_eudebug_session_check(s, match_opposite, 0);
> > + xe_eudebug_session_destroy(s);
> > +}
> > +
> > +#define RESOURCE_COUNT 16
> > +#define PRIMARY_THREAD (1 << 0)
> > +#define DISCOVERY_CLOSE_CLIENT (1 << 1)
> > +#define DISCOVERY_DESTROY_RESOURCES (1 << 2)
> > +#define DISCOVERY_VM_BIND (1 << 3)
> > +static void run_discovery_client(struct xe_eudebug_client *c)
> > +{
> > + struct drm_xe_engine_class_instance *hwe = NULL;
> > + int fd[RESOURCE_COUNT], i;
> > + bool skip_sleep = c->flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT);
> > + uint64_t addr = 0x1a0000;
> > +
> > + srand(getpid());
> > +
> > + for (i = 0; i < RESOURCE_COUNT; i++) {
> > + fd[i] = xe_eudebug_client_open_driver(c);
> > +
> > + if (!i) {
> > + bool found = false;
> > +
> > + xe_device_get(fd[0]);
> > + xe_for_each_engine(fd[0], hwe) {
> > + if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE ||
> > + hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
> > + found = true;
> > + break;
> > + }
> > + }
> > + igt_assert(found);
> > + }
> > +
> > + /*
> > + * Give the debugger a break in the event stream after every
> > + * other client, allowing it to read discovery events and detach quietly.
> > + */
> > + if (random() % 2 == 0 && !skip_sleep)
> > + sleep(1);
> > +
> > + for (int j = 0; j < RESOURCE_COUNT; j++) {
> > + uint32_t vm = xe_eudebug_client_vm_create(c, fd[i],
> > + DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + struct drm_xe_ext_set_property eq_ext = {
> > + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> > + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> > + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> > + };
> > + struct drm_xe_exec_queue_create create = {
> > + .width = 1,
> > + .num_placements = 1,
> > + .vm_id = vm,
> > + .instances = to_user_pointer(hwe),
> > + .extensions = to_user_pointer(&eq_ext),
> > + };
> > + const unsigned int bo_size = max_t(unsigned int,
> > + xe_get_default_alignment(fd[i]),
> > + MIN_BO_SIZE);
> > + uint32_t bo = xe_bo_create(fd[i], 0, bo_size, system_memory(fd[i]), 0);
> > +
> > + xe_eudebug_client_exec_queue_create(c, fd[i], &create);
> > +
> > + if (c->flags & DISCOVERY_VM_BIND) {
> > + xe_eudebug_client_vm_bind(c, fd[i], vm, bo, 0, addr, bo_size);
> > + addr += 0x100000;
> > + }
> > +
> > + if (c->flags & DISCOVERY_DESTROY_RESOURCES) {
> > + xe_eudebug_client_exec_queue_destroy(c, fd[i], &create);
> > + xe_eudebug_client_vm_destroy(c, fd[i], create.vm_id);
> > + gem_close(fd[i], bo);
> > + }
> > + }
> > +
> > + if (c->flags & DISCOVERY_CLOSE_CLIENT)
> > + xe_eudebug_client_close_driver(c, fd[i]);
> > + }
> > + xe_device_put(fd[0]);
> > +}
> > +
> > +/**
> > + * SUBTEST: discovery-%s
> > + * Description: Race discovery against %arg[1] and the debugger detach.
> > + *
> > + * arg[1]:
> > + *
> > + * @race: resources creation
> > + * @race-vmbind: vm-bind operations
> > + * @empty: resources destruction
> > + * @empty-clients: client closure
> > + */
> > +static void *discovery_race_thread(void *data)
> > +{
> > + struct {
> > + uint64_t client_handle;
> > + int vm_count;
> > + int exec_queue_count;
> > + int vm_bind_op_count;
> > + } clients[RESOURCE_COUNT];
> > + struct xe_eudebug_session *s = data;
> > + int expected = RESOURCE_COUNT * (1 + 2 * RESOURCE_COUNT);
> > + const int tries = 100;
> > + bool done = false;
> > + int ret = 0;
> > +
> > + for (int try = 0; try < tries && !done; try++) {
> > + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
> > +
> > + if (ret == -EBUSY) {
> > + usleep(100000);
> > + continue;
> > + }
> > +
> > + igt_assert_eq(ret, 0);
> > +
> > + if (random() % 2) {
> > + struct drm_xe_eudebug_event *e = NULL;
> > + int i = -1;
> > +
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + sleep(1);
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > + igt_debug("Resources discovered: %lu\n", s->debugger->event_count);
> > +
> > + xe_eudebug_for_each_event(e, s->debugger->log) {
> > + if (e->type == DRM_XE_EUDEBUG_EVENT_OPEN) {
> > + struct drm_xe_eudebug_event_client *eo = (void *)e;
> > +
> > + if (i >= 0) {
> > + igt_assert_eq(clients[i].vm_count,
> > + RESOURCE_COUNT);
> > +
> > + igt_assert_eq(clients[i].exec_queue_count,
> > + RESOURCE_COUNT);
> > +
> > + if (s->client->flags & DISCOVERY_VM_BIND)
> > + igt_assert_eq(clients[i].vm_bind_op_count,
> > + RESOURCE_COUNT);
> > + }
> > +
> > + igt_assert(++i < RESOURCE_COUNT);
> > + clients[i].client_handle = eo->client_handle;
> > + clients[i].vm_count = 0;
> > + clients[i].exec_queue_count = 0;
> > + clients[i].vm_bind_op_count = 0;
> > + }
> > +
> > + if (e->type == DRM_XE_EUDEBUG_EVENT_VM)
> > + clients[i].vm_count++;
> > +
> > + if (e->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
> > + clients[i].exec_queue_count++;
> > +
> > + if (e->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP)
> > + clients[i].vm_bind_op_count++;
> > + }
> > +
> > + igt_assert_lte(0, i);
> > +
> > + for (int j = 0; j < i; j++)
> > + for (int k = 0; k < i; k++) {
> > + if (k == j)
> > + continue;
> > +
> > + igt_assert_neq(clients[j].client_handle,
> > + clients[k].client_handle);
> > + }
> > +
> > + if (s->debugger->event_count >= expected)
> > + done = true;
> > + }
> > +
> > + xe_eudebug_debugger_detach(s->debugger);
> > + s->debugger->log->head = 0;
> > + s->debugger->event_count = 0;
> > + }
> > +
> > + /* Primary thread must read everything */
> > + if (s->flags & PRIMARY_THREAD) {
> > + while ((ret = xe_eudebug_debugger_attach(s->debugger, s->client)) == -EBUSY)
> > + usleep(100000);
> > +
> > + igt_assert_eq(ret, 0);
> > +
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_wait_done(s->client);
> > +
> > + if (READ_ONCE(s->debugger->event_count) != expected)
> > + sleep(5);
> > +
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > + xe_eudebug_debugger_detach(s->debugger);
> > + }
> > +
> > + return NULL;
> > +}
> > +
> > +static void test_race_discovery(int fd, unsigned int flags, int clients)
> > +{
> > + const int debuggers_per_client = 3;
> > + int count = clients * debuggers_per_client;
> > + struct xe_eudebug_session *sessions, *s;
> > + struct xe_eudebug_client *c;
> > + pthread_t *threads;
> > + int i, j;
> > +
> > + sessions = calloc(count, sizeof(*sessions));
> > + threads = calloc(count, sizeof(*threads));
> > +
> > + for (i = 0; i < clients; i++) {
> > + c = xe_eudebug_client_create(fd, run_discovery_client, flags, NULL);
> > + for (j = 0; j < debuggers_per_client; j++) {
> > + s = &sessions[i * debuggers_per_client + j];
> > + s->client = c;
> > + s->debugger = xe_eudebug_debugger_create(fd, flags, NULL);
> > + s->flags = flags | (!j ? PRIMARY_THREAD : 0);
> > + }
> > + }
> > +
> > + for (i = 0; i < count; i++) {
> > + if (sessions[i].flags & PRIMARY_THREAD)
> > + xe_eudebug_client_start(sessions[i].client);
> > +
> > + pthread_create(&threads[i], NULL, discovery_race_thread, &sessions[i]);
> > + }
> > +
> > + for (i = 0; i < count; i++)
> > + pthread_join(threads[i], NULL);
> > +
> > + for (i = count - 1; i > 0; i--) {
> > + if (sessions[i].flags & PRIMARY_THREAD) {
> > + igt_assert_eq(sessions[i].client->seqno - 1,
> > + sessions[i].debugger->event_count);
> > +
> > + xe_eudebug_event_log_compare(sessions[0].debugger->log,
> > + sessions[i].debugger->log,
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND);
> > +
> > + xe_eudebug_client_destroy(sessions[i].client);
> > + }
> > + xe_eudebug_debugger_destroy(sessions[i].debugger);
> > + }
> > +}
> > +
> > +static void *attach_dettach_thread(void *data)
> > +{
> > + struct xe_eudebug_session *s = data;
> > + const int tries = 100;
> > + int ret = 0;
> > +
> > + for (int try = 0; try < tries; try++) {
> > + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
> > +
> > + if (ret == -EBUSY) {
> > + usleep(100000);
> > + continue;
> > + }
> > +
> > + igt_assert_eq(ret, 0);
> > +
> > + if (random() % 2 == 0) {
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > + }
> > +
> > + xe_eudebug_debugger_detach(s->debugger);
> > + s->debugger->log->head = 0;
> > + s->debugger->event_count = 0;
> > + }
> > +
> > + return NULL;
> > +}
> > +
> > +static void test_empty_discovery(int fd, unsigned int flags, int clients)
> > +{
> > + struct xe_eudebug_session **s;
> > + pthread_t *threads;
> > + int i, expected = flags & DISCOVERY_CLOSE_CLIENT ? 0 : RESOURCE_COUNT;
> > +
> > + igt_assert(flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT));
> > +
> > + s = calloc(clients, sizeof(struct xe_eudebug_session *));
> > + threads = calloc(clients, sizeof(*threads));
> > +
> > + for (i = 0; i < clients; i++)
> > + s[i] = xe_eudebug_session_create(fd, run_discovery_client, flags, NULL);
> > +
> > + for (i = 0; i < clients; i++) {
> > + xe_eudebug_client_start(s[i]->client);
> > +
> > + pthread_create(&threads[i], NULL, attach_dettach_thread, s[i]);
> > + }
> > +
> > + for (i = 0; i < clients; i++)
> > + pthread_join(threads[i], NULL);
> > +
> > + for (i = 0; i < clients; i++) {
> > + xe_eudebug_client_wait_done(s[i]->client);
> > + igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
> > +
> > + xe_eudebug_debugger_start_worker(s[i]->debugger);
> > + xe_eudebug_debugger_stop_worker(s[i]->debugger, 5);
> > + xe_eudebug_debugger_detach(s[i]->debugger);
> > +
> > + igt_assert_eq(s[i]->debugger->event_count, expected);
> > +
> > + xe_eudebug_session_destroy(s[i]);
> > + }
> > +}
> > +
> > +static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> > + xe_eudebug_ack_ufence(d->fd, ef);
> > +}
> > +
> > +typedef void (*client_run_t)(struct xe_eudebug_client *);
> > +
> > +static void test_client_with_trigger(int fd, unsigned int flags, int count,
> > + client_run_t client_fn, int type,
> > + xe_eudebug_trigger_fn trigger_fn,
> > + struct drm_xe_engine_class_instance *hwe,
> > + bool match_opposite, uint32_t event_filter)
> > +{
> > + struct xe_eudebug_session **s;
> > + int i;
> > +
> > + s = calloc(count, sizeof(*s));
> > +
> > + igt_assert(s);
> > +
> > + for (i = 0; i < count; i++)
> > + s[i] = xe_eudebug_session_create(fd, client_fn, flags, hwe);
> > +
> > + if (trigger_fn)
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_debugger_add_trigger(s[i]->debugger, type, trigger_fn);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_session_run(s[i]);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_session_check(s[i], match_opposite, event_filter);
> > +
> > + for (i = 0; i < count; i++)
> > + xe_eudebug_session_destroy(s[i]);
> > +}
> > +
> > +struct thread_fn_args {
> > + struct xe_eudebug_client *client;
> > + int fd;
> > +};
> > +
> > +static void *basic_client_th(void *data)
> > +{
> > + struct thread_fn_args *f = data;
> > + struct xe_eudebug_client *c = f->client;
> > + uint32_t *vms;
> > + int fd, i, num_vms;
> > +
> > + fd = f->fd;
> > + igt_assert(fd);
> > +
> > + xe_device_get(fd);
> > +
> > + num_vms = 2 + rand() % 16;
> > + vms = calloc(num_vms, sizeof(*vms));
> > + igt_assert(vms);
> > + igt_debug("Create %d client vms\n", num_vms);
> > +
> > + for (i = 0; i < num_vms; i++)
> > + vms[i] = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > +
> > + for (i = 0; i < num_vms; i++)
> > + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
> > +
> > + xe_device_put(fd);
> > + free(vms);
> > +
> > + return NULL;
> > +}
> > +
> > +static void run_basic_client_th(struct xe_eudebug_client *c)
> > +{
> > + struct thread_fn_args *args;
> > + int i, num_threads, fd;
> > + pthread_t *threads;
> > +
> > + args = calloc(1, sizeof(*args));
> > + igt_assert(args);
> > +
> > + num_threads = 2 + random() % 16;
> > + igt_debug("Run on %d threads\n", num_threads);
> > + threads = calloc(num_threads, sizeof(*threads));
> > + igt_assert(threads);
> > +
> > + fd = xe_eudebug_client_open_driver(c);
> > + args->client = c;
> > + args->fd = fd;
> > +
> > + for (i = 0; i < num_threads; i++)
> > + pthread_create(&threads[i], NULL, basic_client_th, args);
> > +
> > + for (i = 0; i < num_threads; i++)
> > + pthread_join(threads[i], NULL);
> > +
> > + xe_eudebug_client_close_driver(c, fd);
> > + free(args);
> > + free(threads);
> > +}
> > +
> > +static void test_basic_sessions_th(int fd, unsigned int flags, int num_clients, bool match_opposite)
> > +{
> > + test_client_with_trigger(fd, flags, num_clients, run_basic_client_th, 0, NULL, NULL,
> > + match_opposite, 0);
> > +}
> > +
> > +static void vm_access_client(struct xe_eudebug_client *c)
> > +{
> > + struct drm_xe_engine_class_instance *hwe = c->ptr;
> > + uint32_t bo_placement;
> > + struct bind_list *bl;
> > + uint32_t vm;
> > + int fd, i, j;
> > +
> > + igt_debug("Using %s\n", xe_engine_class_string(hwe->engine_class));
> > +
> > + fd = xe_eudebug_client_open_driver(c);
> > + xe_device_get(fd);
> > +
> > + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > +
> > + if (c->flags & VM_BIND_OP_MAP_USERPTR)
> > + bo_placement = 0;
> > + else
> > + bo_placement = vram_if_possible(fd, hwe->gt_id);
> > +
> > + for (j = 0; j < 5; j++) {
> > + unsigned int target_size = MIN_BO_SIZE * (1 << j);
> > +
> > + bl = create_bind_list(fd, bo_placement, vm, 4, target_size);
> > + do_bind_list(c, bl, true);
> > +
> > + for (i = 0; i < bl->n; i++)
> > + xe_eudebug_client_wait_stage(c, bl->bind_ops[i].addr);
> > +
> > + free_bind_list(c, bl);
> > + }
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > +
> > + xe_device_put(fd);
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +static void debugger_test_vma(struct xe_eudebug_debugger *d,
> > + uint64_t client_handle,
> > + uint64_t vm_handle,
> > + uint64_t va_start,
> > + uint64_t va_length)
> > +{
> > + struct drm_xe_eudebug_vm_open vo = { 0, };
> > + uint64_t *v1, *v2;
> > + uint64_t items = va_length / sizeof(uint64_t);
> > + int fd;
> > + int r, i;
> > +
> > + v1 = malloc(va_length);
> > + igt_assert(v1);
> > + v2 = malloc(va_length);
> > + igt_assert(v2);
> > +
> > + vo.client_handle = client_handle;
> > + vo.vm_handle = vm_handle;
> > +
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert_lte(0, fd);
> > +
> > + r = pread(fd, v1, va_length, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + for (i = 0; i < items; i++)
> > + igt_assert_eq(v1[i], va_start + i);
> > +
> > + for (i = 0; i < items; i++)
> > + v1[i] = va_start + i + 1;
> > +
> > + r = pwrite(fd, v1, va_length, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + lseek(fd, va_start, SEEK_SET);
> > + r = read(fd, v2, va_length);
> > + igt_assert_eq(r, va_length);
> > +
> > + for (i = 0; i < items; i++)
> > + igt_assert_eq(v1[i], v2[i]);
> > +
> > + fsync(fd);
> > +
> > + close(fd);
> > + free(v1);
> > + free(v2);
> > +}
> > +
> > +static void vm_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > +
> > + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> > + eo->vm_bind_ref_seqno,
> > + eo->addr,
> > + eo->range);
> > +
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> > + igt_assert(eb);
> > +
> > + debugger_test_vma(d, eb->client_handle, eb->vm_handle,
> > + eo->addr, eo->range);
> > + xe_eudebug_debugger_signal_stage(d, eo->addr);
> > + }
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-vm-access
> > + * Description:
> > + * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite on the
> > + * vm fd, covering many different offsets inside the vm
> > + * and many virtual addresses of the vm-bound object.
> > + *
> > + * SUBTEST: basic-vm-access-userptr
> > + * Description:
> > + * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite on the
> > + * vm fd, covering many different offsets inside the vm
> > + * and many virtual addresses of the vm-bound object, but backed
> > + * by userptr.
> > + */
> > +static void test_vm_access(int fd, unsigned int flags, int num_clients)
> > +{
> > + struct drm_xe_engine_class_instance *hwe;
> > +
> > + xe_eudebug_for_each_engine(fd, hwe)
> > + test_client_with_trigger(fd, flags, num_clients,
> > + vm_access_client,
> > + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> > + vm_trigger, hwe,
> > + false,
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> > +}
> > +
> > +static void debugger_test_vma_parameters(struct xe_eudebug_debugger *d,
> > + uint64_t client_handle,
> > + uint64_t vm_handle,
> > + uint64_t va_start,
> > + uint64_t va_length)
> > +{
> > + struct drm_xe_eudebug_vm_open vo = { 0, };
> > + uint64_t *v;
> > + uint64_t items = va_length / sizeof(uint64_t);
> > + int fd;
> > + int r, i;
> > +
> > + v = malloc(va_length);
> > + igt_assert(v);
> > +
> > + /* Negative VM open - bad client handle */
> > + vo.client_handle = client_handle + 123;
> > + vo.vm_handle = vm_handle;
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert(fd < 0);
> > +
> > + /* Negative VM open - bad vm handle */
> > + vo.client_handle = client_handle;
> > + vo.vm_handle = vm_handle + 123;
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert(fd < 0);
> > +
> > + /* Positive VM open */
> > + vo.client_handle = client_handle;
> > + vo.vm_handle = vm_handle;
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert_lte(0, fd);
> > +
> > + /* Negative pread - bad fd */
> > + r = pread(fd + 123, v, va_length, va_start);
> > + igt_assert(r < 0);
> > +
> > + /* Negative pread - bad va_start */
> > + r = pread(fd, v, va_length, 0);
> > + igt_assert(r < 0);
> > +
> > + /* Negative pread - bad va_start */
> > + r = pread(fd, v, va_length, va_start - 1);
> > + igt_assert(r < 0);
> > +
> > + /* Positive pread - zero va_length */
> > + r = pread(fd, v, 0, va_start);
> > + igt_assert_eq(r, 0);
> > +
> > + /* Short pread - length extending past the vma is clamped */
> > + r = pread(fd, v, va_length + 1, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + /* Negative pread - bad va_start */
> > + r = pread(fd, v, 1, va_start + va_length);
> > + igt_assert(r < 0);
> > +
> > + /* Positive pread - whole range */
> > + r = pread(fd, v, va_length, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + /* Positive pread */
> > + r = pread(fd, v, 1, va_start + va_length - 1);
> > + igt_assert_eq(r, 1);
> > +
> > + for (i = 0; i < items; i++)
> > + igt_assert_eq(v[i], va_start + i);
> > +
> > + for (i = 0; i < items; i++)
> > + v[i] = va_start + i + 1;
> > +
> > + /* Negative pwrite - bad fd */
> > + r = pwrite(fd + 123, v, va_length, va_start);
> > + igt_assert(r < 0);
> > +
> > + /* Negative pwrite - bad va_start */
> > + r = pwrite(fd, v, va_length, -1);
> > + igt_assert(r < 0);
> > +
> > + /* Negative pwrite - zero va_start */
> > + r = pwrite(fd, v, va_length, 0);
> > + igt_assert(r < 0);
> > +
> > + /* Short pwrite - length extending past the vma is clamped */
> > + r = pwrite(fd, v, va_length + 1, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + /* Positive pwrite - zero va_length */
> > + r = pwrite(fd, v, 0, va_start);
> > + igt_assert_eq(r, 0);
> > +
> > + /* Positive pwrite */
> > + r = pwrite(fd, v, va_length, va_start);
> > + igt_assert_eq(r, va_length);
> > + fsync(fd);
> > +
> > + close(fd);
> > + free(v);
> > +}
> > +
> > +static void vm_trigger_access_parameters(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > +
> > + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> > + eo->vm_bind_ref_seqno,
> > + eo->addr,
> > + eo->range);
> > +
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> > + igt_assert(eb);
> > +
> > + debugger_test_vma_parameters(d, eb->client_handle, eb->vm_handle, eo->addr,
> > + eo->range);
> > + xe_eudebug_debugger_signal_stage(d, eo->addr);
> > + }
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-vm-access-parameters
> > + * Description:
> > + * Check negative scenarios of VM_OPEN ioctl and pread/pwrite usage.
> > + */
> > +static void test_vm_access_parameters(int fd, unsigned int flags, int num_clients)
> > +{
> > + struct drm_xe_engine_class_instance *hwe;
> > +
> > + xe_eudebug_for_each_engine(fd, hwe)
> > + test_client_with_trigger(fd, flags, num_clients,
> > + vm_access_client,
> > + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> > + vm_trigger_access_parameters, hwe,
> > + false,
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> > +}
> > +
> > +#define PAGE_SIZE 4096
> > +#define MDATA_SIZE (WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM * PAGE_SIZE)
> > +static void metadata_access_client(struct xe_eudebug_client *c)
> > +{
> > + const uint64_t addr = 0x1a0000;
> > + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
> > + uint8_t *data;
> > + size_t bo_size;
> > + uint32_t bo, vm;
> > + int fd, i;
> > +
> > + fd = xe_eudebug_client_open_driver(c);
> > + xe_device_get(fd);
> > +
> > + bo_size = xe_get_default_alignment(fd);
> > + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + bo = xe_bo_create(fd, vm, bo_size, system_memory(fd), 0);
> > +
> > + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, MDATA_SIZE);
> > +
> > + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
> > + bo_size, 0, NULL, 0, to_user_pointer(ext));
> > +
> > + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
> > + xe_eudebug_client_wait_stage(c, i);
> > +
> > + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
> > +
> > + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
> > +
> > + gem_close(fd, bo);
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > +
> > + xe_device_put(fd);
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +static void debugger_test_metadata(struct xe_eudebug_debugger *d,
> > + uint64_t client_handle,
> > + uint64_t metadata_handle,
> > + uint64_t type,
> > + uint64_t len)
> > +{
> > + struct drm_xe_eudebug_read_metadata rm = {
> > + .client_handle = client_handle,
> > + .metadata_handle = metadata_handle,
> > + .size = len,
> > + };
> > + uint8_t *data;
> > + int i;
> > +
> > + data = malloc(len);
> > + igt_assert(data);
> > +
> > + rm.ptr = to_user_pointer(data);
> > +
> > + igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
> > +
> > + /* synthetic check: the test sets a different size per metadata type */
> > + igt_assert_eq((type + 1) * PAGE_SIZE, rm.size);
> > +
> > + for (i = 0; i < rm.size; i++)
> > + igt_assert_eq(data[i], 0xff & (i + (i > PAGE_SIZE)));
> > +
> > + free(data);
> > +}
> > +
> > +static void metadata_read_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_metadata *em = (void *)e;
> > +
> > + /* synthetic check: the test sets a different size per metadata type */
> > + igt_assert_eq((em->type + 1) * PAGE_SIZE, em->len);
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + debugger_test_metadata(d, em->client_handle, em->metadata_handle,
> > + em->type, em->len);
> > + xe_eudebug_debugger_signal_stage(d, em->type);
> > + }
> > +}
> > +
> > +static void metadata_read_on_vm_bind_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_op_metadata *em = (void *)e;
> > + struct drm_xe_eudebug_event_vm_bind_op *eo;
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > +
> > + /* For testing purposes the client sets metadata_cookie = type */
> > +
> > + /*
> > + * Metadata event has a reference to vm-bind-op event which has a reference
> > + * to vm-bind event which contains proper client-handle.
> > + */
> > + eo = (struct drm_xe_eudebug_event_vm_bind_op *)
> > + xe_eudebug_event_log_find_seqno(d->log, em->vm_bind_op_ref_seqno);
> > + igt_assert(eo);
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> > + igt_assert(eb);
> > +
> > + debugger_test_metadata(d,
> > + eb->client_handle,
> > + em->metadata_handle,
> > + em->metadata_cookie,
> > + MDATA_SIZE); /* max size */
> > +
> > + xe_eudebug_debugger_signal_stage(d, em->metadata_cookie);
> > +}
> > +
> > +/**
> > + * SUBTEST: read-metadata
> > + * Description:
> > + * Exercise DRM_XE_EUDEBUG_IOCTL_READ_METADATA and debug metadata create|destroy events.
> > + */
> > +static void test_metadata_read(int fd, unsigned int flags, int num_clients)
> > +{
> > + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
> > + DRM_XE_EUDEBUG_EVENT_METADATA, metadata_read_trigger,
> > + NULL, true, 0);
> > +}
> > +
> > +/**
> > + * SUBTEST: attach-debug-metadata
> > + * Description:
> > + * Read debug metadata when vm_bind has it attached.
> > + */
> > +static void test_metadata_attach(int fd, unsigned int flags, int num_clients)
> > +{
> > + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
> > + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA,
> > + metadata_read_on_vm_bind_trigger,
> > + NULL, true, 0);
> > +}
> > +
> > +#define STAGE_CLIENT_WAIT_ON_UFENCE_DONE 1337
> > +
> > +#define UFENCE_EVENT_COUNT_EXPECTED 4
> > +#define UFENCE_EVENT_COUNT_MAX 100
> > +
> > +struct ufence_bind {
> > + struct drm_xe_sync f;
> > + uint64_t addr;
> > + uint64_t range;
> > + uint64_t value;
> > + struct {
> > + uint64_t vm_sync;
> > + } *fence_data;
> > +};
> > +
> > +static void client_wait_ufences(struct xe_eudebug_client *c,
> > + int fd, struct ufence_bind *binds, int count)
> > +{
> > + const int64_t default_fence_timeout_ns = 500 * NSEC_PER_MSEC;
> > + int64_t timeout_ns;
> > + int err;
> > +
> > + /* Ensure that wait on unacked ufence times out */
> > + for (int i = 0; i < count; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + timeout_ns = default_fence_timeout_ns;
> > + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> > + 0, &timeout_ns);
> > + igt_assert_eq(err, -ETIME);
> > + igt_assert_neq(b->fence_data->vm_sync, b->f.timeline_value);
> > + igt_debug("wait #%d blocked on ack\n", i);
> > + }
> > +
> > + /* Wait on fence timed out, now tell the debugger to ack */
> > + xe_eudebug_client_signal_stage(c, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
> > +
> > + /* Check that ack unblocks ufence */
> > + for (int i = 0; i < count; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
> > + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> > + 0, &timeout_ns);
> > + igt_assert_eq(err, 0);
> > + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
> > + igt_debug("wait #%d completed\n", i);
> > + }
> > +}
> > +
> > +static struct ufence_bind *create_binds_with_ufence(int fd, int count)
> > +{
> > + struct ufence_bind *binds;
> > +
> > + binds = calloc(count, sizeof(*binds));
> > + igt_assert(binds);
> > +
> > + for (int i = 0; i < count; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + b->range = 0x1000;
> > + b->addr = 0x100000 + b->range * i;
> > + b->fence_data = aligned_alloc(xe_get_default_alignment(fd),
> > + sizeof(*b->fence_data));
> > + igt_assert(b->fence_data);
> > + memset(b->fence_data, 0, sizeof(*b->fence_data));
> > +
> > + b->f.type = DRM_XE_SYNC_TYPE_USER_FENCE;
> > + b->f.flags = DRM_XE_SYNC_FLAG_SIGNAL;
> > + b->f.addr = to_user_pointer(&b->fence_data->vm_sync);
> > + b->f.timeline_value = UFENCE_EVENT_COUNT_EXPECTED + i;
> > + }
> > +
> > + return binds;
> > +}
> > +
> > +static void basic_ufence_client(struct xe_eudebug_client *c)
> > +{
> > + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
> > + int fd = xe_eudebug_client_open_driver(c);
> > + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + size_t bo_size = n * xe_get_default_alignment(fd);
> > + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> > + system_memory(fd), 0);
> > + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
> > +
> > + for (int i = 0; i < n; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, b->addr, b->range, 0,
> > + &b->f, 1, 0);
> > + }
> > +
> > + client_wait_ufences(c, fd, binds, n);
> > +
> > + for (int i = 0; i < n; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
> > + }
> > +
> > + free(binds);
> > + gem_close(fd, bo);
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +struct ufence_priv {
> > + struct drm_xe_eudebug_event_vm_bind_ufence ufence_events[UFENCE_EVENT_COUNT_MAX];
> > + uint64_t ufence_event_seqno[UFENCE_EVENT_COUNT_MAX];
> > + uint64_t ufence_event_vm_addr_start[UFENCE_EVENT_COUNT_MAX];
> > + uint64_t ufence_event_vm_addr_range[UFENCE_EVENT_COUNT_MAX];
> > + unsigned int ufence_event_count;
> > + unsigned int vm_bind_op_count;
> > + pthread_mutex_t mutex;
> > +};
> > +
> > +static struct ufence_priv *ufence_priv_create(void)
> > +{
> > + struct ufence_priv *priv;
> > +
> > + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
> > + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> > + igt_assert(priv != MAP_FAILED);
> > + memset(priv, 0, sizeof(*priv));
> > + pthread_mutex_init(&priv->mutex, NULL);
> > +
> > + return priv;
> > +}
> > +
> > +static void ufence_priv_destroy(struct ufence_priv *priv)
> > +{
> > + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
> > +}
> > +
> > +static void ack_fences(struct xe_eudebug_debugger *d)
> > +{
> > + struct ufence_priv *priv = d->ptr;
> > +
> > + for (int i = 0; i < priv->ufence_event_count; i++)
> > + xe_eudebug_ack_ufence(d->fd, &priv->ufence_events[i]);
> > +}
> > +
> > +static void basic_ufence_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > + struct ufence_priv *priv = d->ptr;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > +
> > + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> > + igt_debug("ufence event received: %s\n", event_str);
> > +
> > + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
> > + "surplus ufence event received: %s\n", event_str);
> > + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
> > +
> > + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
> > +
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
> > + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> > + ef->vm_bind_ref_seqno);
> > + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> > + "vm bind event does not have ufence: %s\n", event_str);
> > + }
> > +}
> > +
> > +static int wait_for_ufence_events(struct ufence_priv *priv, int timeout_ms)
> > +{
> > + int ret = -ETIMEDOUT;
> > +
> > + igt_for_milliseconds(timeout_ms) {
> > + pthread_mutex_lock(&priv->mutex);
> > + if (priv->ufence_event_count == UFENCE_EVENT_COUNT_EXPECTED)
> > + ret = 0;
> > + pthread_mutex_unlock(&priv->mutex);
> > +
> > + if (!ret)
> > + break;
> > + usleep(1000);
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-vm-bind-ufence
> > + * Description:
> > + * Attach a user fence to a vm_bind in the client and check that the debugger's ufence ack unblocks it
> > + */
> > +static void test_basic_ufence(int fd, unsigned int flags)
> > +{
> > + struct xe_eudebug_debugger *d;
> > + struct xe_eudebug_session *s;
> > + struct xe_eudebug_client *c;
> > + struct ufence_priv *priv;
> > +
> > + priv = ufence_priv_create();
> > + s = xe_eudebug_session_create(fd, basic_ufence_client, flags, priv);
> > + c = s->client;
> > + d = s->debugger;
> > +
> > + xe_eudebug_debugger_add_trigger(d,
> > + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + basic_ufence_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> > + xe_eudebug_debugger_start_worker(d);
> > + xe_eudebug_client_start(c);
> > +
> > + xe_eudebug_debugger_wait_stage(s, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
> > + xe_eudebug_assert_f(d, wait_for_ufence_events(priv, XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * MSEC_PER_SEC) == 0,
> > + "missing ufence events\n");
> > + ack_fences(d);
> > +
> > + xe_eudebug_client_wait_done(c);
> > + xe_eudebug_debugger_stop_worker(d, 1);
> > +
> > + xe_eudebug_event_log_print(d->log, true);
> > + xe_eudebug_event_log_print(c->log, true);
> > +
> > + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> > +
> > + xe_eudebug_session_destroy(s);
> > + ufence_priv_destroy(priv);
> > +}
> > +
> > +struct vm_bind_clear_thread_priv {
> > + struct drm_xe_engine_class_instance *hwe;
> > + struct xe_eudebug_client *c;
> > + pthread_t thread;
> > + uint64_t region;
> > + unsigned long sum;
> > +};
> > +
> > +struct vm_bind_clear_priv {
> > + unsigned long unbind_count;
> > + unsigned long bind_count;
> > + unsigned long sum;
> > +};
> > +
> > +static struct vm_bind_clear_priv *vm_bind_clear_priv_create(void)
> > +{
> > + struct vm_bind_clear_priv *priv;
> > +
> > + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
> > + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> > + igt_assert(priv != MAP_FAILED);
> > + memset(priv, 0, sizeof(*priv));
> > +
> > + return priv;
> > +}
> > +
> > +static void vm_bind_clear_priv_destroy(struct vm_bind_clear_priv *priv)
> > +{
> > + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
> > +}
> > +
> > +static void *vm_bind_clear_thread(void *data)
> > +{
> > + const uint32_t CS_GPR0 = 0x600;
> > + const size_t batch_size = 16;
> > + struct drm_xe_sync uf_sync = {
> > + .type = DRM_XE_SYNC_TYPE_USER_FENCE, .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> > + };
> > + struct vm_bind_clear_thread_priv *priv = data;
> > + int fd = xe_eudebug_client_open_driver(priv->c);
> > + uint64_t gtt_size = 1ull << min_t(uint32_t, xe_va_bits(fd), 48);
> > + uint32_t vm = xe_eudebug_client_vm_create(priv->c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + size_t bo_size = xe_bb_size(fd, batch_size);
> > + unsigned long count = 0;
> > + uint64_t *fence_data;
> > +
> > + /* init uf_sync */
> > + fence_data = aligned_alloc(xe_get_default_alignment(fd), sizeof(*fence_data));
> > + igt_assert(fence_data);
> > + uf_sync.timeline_value = 1337;
> > + uf_sync.addr = to_user_pointer(fence_data);
> > +
> > + igt_debug("Run on: %s%u\n", xe_engine_class_string(priv->hwe->engine_class),
> > + priv->hwe->engine_instance);
> > +
> > + igt_until_timeout(5) {
> > + struct drm_xe_ext_set_property eq_ext = {
> > + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> > + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> > + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> > + };
> > + struct drm_xe_exec_queue_create eq_create = { 0 };
> > + uint32_t clean_bo = 0;
> > + uint32_t batch_bo = 0;
> > + uint64_t clean_offset, batch_offset;
> > + uint32_t exec_queue;
> > + uint32_t *map, *cs;
> > + uint64_t delta;
> > +
> > + /* calculate offsets (vma addresses) */
> > + batch_offset = (random() * SZ_2M) & (gtt_size - 1);
> > + /* XXX: for some platforms/memory regions batch offset '0' can be problematic */
> > + if (batch_offset == 0)
> > + batch_offset = SZ_2M;
> > +
> > + do {
> > + clean_offset = (random() * SZ_2M) & (gtt_size - 1);
> > + if (clean_offset == 0)
> > + clean_offset = SZ_2M;
> > + } while (clean_offset == batch_offset);
> > +
> > + batch_offset += random() % SZ_2M & -bo_size;
> > + clean_offset += random() % SZ_2M & -bo_size;
> > +
> > + delta = (random() % bo_size) & -4;
> > +
> > + /* prepare clean bo */
> > + clean_bo = xe_bo_create(fd, vm, bo_size, priv->region,
> > + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> > + memset(fence_data, 0, sizeof(*fence_data));
> > + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, clean_bo, 0, clean_offset, bo_size,
> > + 0, &uf_sync, 1, 0);
> > + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> > + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> > +
> > + /* prepare batch bo */
> > + batch_bo = xe_bo_create(fd, vm, bo_size, priv->region,
> > + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> > + memset(fence_data, 0, sizeof(*fence_data));
> > + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, batch_bo, 0, batch_offset, bo_size,
> > + 0, &uf_sync, 1, 0);
> > + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> > + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> > +
> > + map = xe_bo_map(fd, batch_bo, bo_size);
> > +
> > + cs = map;
> > + *cs++ = MI_NOOP | 0xc5a3;
> > + *cs++ = MI_LOAD_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
> > + *cs++ = CS_GPR0;
> > + *cs++ = clean_offset + delta;
> > + *cs++ = (clean_offset + delta) >> 32;
> > + *cs++ = MI_STORE_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
> > + *cs++ = CS_GPR0;
> > + *cs++ = batch_offset;
> > + *cs++ = batch_offset >> 32;
> > + *cs++ = MI_BATCH_BUFFER_END;
> > +
> > + /* execute batch */
> > + eq_create.width = 1;
> > + eq_create.num_placements = 1;
> > + eq_create.vm_id = vm;
> > + eq_create.instances = to_user_pointer(priv->hwe);
> > + eq_create.extensions = to_user_pointer(&eq_ext);
> > + exec_queue = xe_eudebug_client_exec_queue_create(priv->c, fd, &eq_create);
> > +
> > + memset(fence_data, 0, sizeof(*fence_data));
> > + xe_exec_sync(fd, exec_queue, batch_offset, &uf_sync, 1);
> > + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> > + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> > +
> > + igt_assert_eq(*map, 0);
> > +
> > + /* cleanup */
> > + xe_eudebug_client_exec_queue_destroy(priv->c, fd, &eq_create);
> > + munmap(map, bo_size);
> > +
> > + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, batch_offset, bo_size);
> > + gem_close(fd, batch_bo);
> > +
> > + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, clean_offset, bo_size);
> > + gem_close(fd, clean_bo);
> > +
> > + count++;
> > + }
> > +
> > + priv->sum = count;
> > +
> > + free(fence_data);
> > + xe_eudebug_client_close_driver(priv->c, fd);
> > + return NULL;
> > +}
> > +
> > +static void vm_bind_clear_client(struct xe_eudebug_client *c)
> > +{
> > + int fd = xe_eudebug_client_open_driver(c);
> > + struct xe_device *xe_dev = xe_device_get(fd);
> > + int count = xe_number_engines(fd) * xe_dev->mem_regions->num_mem_regions;
> > + uint64_t memreg = all_memory_regions(fd);
> > + struct vm_bind_clear_priv *priv = c->ptr;
> > + int current = 0;
> > + struct drm_xe_engine_class_instance *engine;
> > + struct vm_bind_clear_thread_priv *threads;
> > + uint64_t region;
> > +
> > + threads = calloc(count, sizeof(*threads));
> > + igt_assert(threads);
> > + priv->sum = 0;
> > +
> > + xe_for_each_mem_region(fd, memreg, region) {
> > + xe_eudebug_for_each_engine(fd, engine) {
> > + threads[current].c = c;
> > + threads[current].hwe = engine;
> > + threads[current].region = region;
> > +
> > + pthread_create(&threads[current].thread, NULL,
> > + vm_bind_clear_thread, &threads[current]);
> > + current++;
> > + }
> > + }
> > +
> > + for (current = 0; current < count; current++)
> > + pthread_join(threads[current].thread, NULL);
> > +
> > + xe_for_each_mem_region(fd, memreg, region) {
> > + unsigned long sum = 0;
> > +
> > + for (current = 0; current < count; current++)
> > + if (threads[current].region == region)
> > + sum += threads[current].sum;
> > +
> > + igt_info("%s sampled %lu objects\n", xe_region_name(region), sum);
> > + priv->sum += sum;
> > + }
> > +
> > + free(threads);
> > + xe_device_put(fd);
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> > + struct vm_bind_clear_priv *priv = d->ptr;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + if (random() & 1) {
> > + struct drm_xe_eudebug_vm_open vo = { 0, };
> > + uint32_t v = 0xc1c1c1c1;
> > +
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > + int fd, delta, r;
> > +
> > + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> > + eo->vm_bind_ref_seqno, eo->addr, eo->range);
> > +
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> > + igt_assert(eb);
> > +
> > + vo.client_handle = eb->client_handle;
> > + vo.vm_handle = eb->vm_handle;
> > +
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert_lte(0, fd);
> > +
> > + delta = (random() % eo->range) & -4;
> > + r = pread(fd, &v, sizeof(v), eo->addr + delta);
> > + igt_assert_eq(r, sizeof(v));
> > + igt_assert_eq_u32(v, 0);
> > +
> > + close(fd);
> > + }
> > + priv->bind_count++;
> > + }
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> > + priv->unbind_count++;
> > +}
> > +
> > +static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > +
> > + xe_eudebug_ack_ufence(d->fd, ef);
> > +}
> > +
> > +/**
> > + * SUBTEST: vm-bind-clear
> > + * Description:
> > + * Check that fresh buffers we vm_bind into the ppGTT are always clear.
> > + */
> > +static void test_vm_bind_clear(int fd)
> > +{
> > + struct vm_bind_clear_priv *priv;
> > + struct xe_eudebug_session *s;
> > +
> > + priv = vm_bind_clear_priv_create();
> > + s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> > + vm_bind_clear_test_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + vm_bind_clear_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_start(s->client);
> > +
> > + xe_eudebug_client_wait_done(s->client);
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > +
> > + igt_assert_eq(priv->bind_count, priv->unbind_count);
> > + igt_assert_eq(priv->sum * 2, priv->bind_count);
> > +
> > + xe_eudebug_session_destroy(s);
> > + vm_bind_clear_priv_destroy(priv);
> > +}
> > +
> > +#define UFENCE_CLIENT_VM_TEST_VAL_START 0xaaaaaaaa
> > +#define UFENCE_CLIENT_VM_TEST_VAL_END 0xbbbbbbbb
> > +
> > +static void vma_ufence_client(struct xe_eudebug_client *c)
> > +{
> > + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
> > + int fd = xe_eudebug_client_open_driver(c);
> > + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
> > + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + size_t bo_size = xe_get_default_alignment(fd);
> > + uint64_t items = bo_size / sizeof(uint32_t);
> > + uint32_t bo[UFENCE_EVENT_COUNT_EXPECTED];
> > + uint32_t *ptr[UFENCE_EVENT_COUNT_EXPECTED];
> > +
> > + for (int i = 0; i < n; i++) {
> > + bo[i] = xe_bo_create(fd, 0, bo_size,
> > + system_memory(fd), 0);
> > + ptr[i] = xe_bo_map(fd, bo[i], bo_size);
> > + igt_assert(ptr[i]);
> > + memset(ptr[i], UFENCE_CLIENT_VM_TEST_VAL_START, bo_size);
> > + }
> > +
> > + for (int i = 0; i < n; i++)
> > + for (int j = 0; j < items; j++)
> > + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_START);
> > +
> > + for (int i = 0; i < n; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo[i], 0, b->addr, b->range, 0,
> > + &b->f, 1, 0);
> > + }
> > +
> > + /* Wait for acks on ufences */
> > + for (int i = 0; i < n; i++) {
> > + int err;
> > + int64_t timeout_ns;
> > + struct ufence_bind *b = &binds[i];
> > +
> > + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
> > + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> > + 0, &timeout_ns);
> > + igt_assert_eq(err, 0);
> > + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
> > + igt_debug("wait #%d completed\n", i);
> > +
> > + for (int j = 0; j < items; j++)
> > + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_END);
> > + }
> > +
> > + for (int i = 0; i < n; i++) {
> > + struct ufence_bind *b = &binds[i];
> > +
> > + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
> > + }
> > +
> > + free(binds);
> > +
> > + for (int i = 0; i < n; i++) {
> > + munmap(ptr[i], bo_size);
> > + gem_close(fd, bo[i]);
> > + }
> > +
> > + xe_eudebug_client_vm_destroy(c, fd, vm);
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +static void debugger_test_vma_ufence(struct xe_eudebug_debugger *d,
> > + uint64_t client_handle,
> > + uint64_t vm_handle,
> > + uint64_t va_start,
> > + uint64_t va_length)
> > +{
> > + struct drm_xe_eudebug_vm_open vo = { 0, };
> > + uint32_t *v1, *v2;
> > + uint32_t items = va_length / sizeof(uint32_t);
> > + int fd;
> > + int r, i;
> > +
> > + v1 = malloc(va_length);
> > + igt_assert(v1);
> > + v2 = malloc(va_length);
> > + igt_assert(v2);
> > +
> > + vo.client_handle = client_handle;
> > + vo.vm_handle = vm_handle;
> > +
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert_lte(0, fd);
> > +
> > + r = pread(fd, v1, va_length, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + for (i = 0; i < items; i++)
> > + igt_assert_eq(v1[i], UFENCE_CLIENT_VM_TEST_VAL_START);
> > +
> > + memset(v1, UFENCE_CLIENT_VM_TEST_VAL_END, va_length);
> > +
> > + r = pwrite(fd, v1, va_length, va_start);
> > + igt_assert_eq(r, va_length);
> > +
> > + lseek(fd, va_start, SEEK_SET);
> > + r = read(fd, v2, va_length);
> > + igt_assert_eq(r, va_length);
> > +
> > + for (i = 0; i < items; i++)
> > + igt_assert_eq_u64(v1[i], v2[i]);
> > +
> > + fsync(fd);
> > +
> > + close(fd);
> > + free(v1);
> > + free(v2);
> > +}
> > +
> > +static void vma_ufence_op_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> > + struct ufence_priv *priv = d->ptr;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > + unsigned int op_count = priv->vm_bind_op_count++;
> > +
> > + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> > + igt_debug("vm bind op event: ref %lld, addr 0x%llx, range 0x%llx, op_count %u\n",
> > + eo->vm_bind_ref_seqno,
> > + eo->addr,
> > + eo->range,
> > + op_count);
> > + igt_debug("vm bind op event received: %s\n", event_str);
> > + xe_eudebug_assert(d, eo->vm_bind_ref_seqno);
> > + xe_eudebug_assert(d, op_count < UFENCE_EVENT_COUNT_MAX);
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> > +
> > + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> > + eo->vm_bind_ref_seqno);
> > + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> > + "vm bind event does not have ufence: %s\n", event_str);
> > +
> > + priv->ufence_event_seqno[op_count] = eo->vm_bind_ref_seqno;
> > + priv->ufence_event_vm_addr_start[op_count] = eo->addr;
> > + priv->ufence_event_vm_addr_range[op_count] = eo->range;
> > + }
> > +}
> > +
> > +static void vma_ufence_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > + struct ufence_priv *priv = d->ptr;
> > + unsigned int ufence_count = priv->ufence_event_count;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> > + struct drm_xe_eudebug_event_vm_bind *eb;
> > + uint64_t addr = priv->ufence_event_vm_addr_start[ufence_count];
> > + uint64_t range = priv->ufence_event_vm_addr_range[ufence_count];
> > +
> > + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> > + igt_debug("ufence event received: %s\n", event_str);
> > +
> > + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
> > + "surplus ufence event received: %s\n", event_str);
> > + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
> > +
> > + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
> > +
> > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
> > + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> > + ef->vm_bind_ref_seqno);
> > + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> > + "vm bind event does not have ufence: %s\n", event_str);
> > + igt_debug("vm bind ufence event received with ref %lld, addr 0x%lx, range 0x%lx\n",
> > + ef->vm_bind_ref_seqno,
> > + addr,
> > + range);
> > + debugger_test_vma_ufence(d, eb->client_handle, eb->vm_handle,
> > + addr, range);
> > +
> > + xe_eudebug_ack_ufence(d->fd, ef);
> > + }
> > +}
> > +
> > +/**
> > + * SUBTEST: vma-ufence
> > + * Description:
> > + * Intercept a vm_bind after receiving the ufence event, then access the target vm and write to it.
> > + * Then check on the client side that the write was successful.
> > + */
> > +static void test_vma_ufence(int fd, unsigned int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct ufence_priv *priv;
> > +
> > + priv = ufence_priv_create();
> > + s = xe_eudebug_session_create(fd, vma_ufence_client, flags, priv);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger,
> > + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> > + vma_ufence_op_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger,
> > + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + vma_ufence_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_start(s->client);
> > +
> > + xe_eudebug_client_wait_done(s->client);
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > +
> > + xe_eudebug_event_log_print(s->debugger->log, true);
> > + xe_eudebug_event_log_print(s->client->log, true);
> > +
> > + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> > +
> > + xe_eudebug_session_destroy(s);
> > + ufence_priv_destroy(priv);
> > +}
> > +
> > +igt_main
> > +{
> > + bool was_enabled;
> > + bool *multigpu_was_enabled;
> > + int fd, gpu_count;
> > +
> > + igt_fixture {
> > + fd = drm_open_driver(DRIVER_XE);
> > + was_enabled = xe_eudebug_enable(fd, true);
> > + }
> > +
> > + igt_subtest("sysfs-toggle")
> > + test_sysfs_toggle(fd);
> > +
> > + igt_subtest("basic-connect")
> > + test_connect(fd);
> > +
> > + igt_subtest("connect-user")
> > + test_connect_user(fd);
> > +
> > + igt_subtest("basic-close")
> > + test_close(fd);
> > +
> > + igt_subtest("basic-read-event")
> > + test_read_event(fd);
> > +
> > + igt_subtest("basic-client")
> > + test_basic_sessions(fd, 0, 1, true);
> > +
> > + igt_subtest("basic-client-th")
> > + test_basic_sessions_th(fd, 0, 1, true);
> > +
> > + igt_subtest("basic-vm-access")
> > + test_vm_access(fd, 0, 1);
> > +
> > + igt_subtest("basic-vm-access-userptr")
> > + test_vm_access(fd, VM_BIND_OP_MAP_USERPTR, 1);
> > +
> > + igt_subtest("basic-vm-access-parameters")
> > + test_vm_access_parameters(fd, 0, 1);
> > +
> > + igt_subtest("multiple-sessions")
> > + test_basic_sessions(fd, CREATE_VMS | CREATE_EXEC_QUEUES, 4, true);
> > +
> > + igt_subtest("basic-vms")
> > + test_basic_sessions(fd, CREATE_VMS, 1, true);
> > +
> > + igt_subtest("basic-exec-queues")
> > + test_basic_sessions(fd, CREATE_EXEC_QUEUES, 1, true);
> > +
> > + igt_subtest("basic-vm-bind")
> > + test_basic_sessions(fd, VM_BIND, 1, true);
> > +
> > + igt_subtest("basic-vm-bind-ufence")
> > + test_basic_ufence(fd, 0);
> > +
> > + igt_subtest("vma-ufence")
> > + test_vma_ufence(fd, 0);
> > +
> > + igt_subtest("vm-bind-clear")
> > + test_vm_bind_clear(fd);
> > +
> > + igt_subtest("basic-vm-bind-discovery")
> > + test_basic_discovery(fd, VM_BIND, true);
> > +
> > + igt_subtest("basic-vm-bind-metadata-discovery")
> > + test_basic_discovery(fd, VM_BIND_METADATA, true);
> > +
> > + igt_subtest("basic-vm-bind-vm-destroy")
> > + test_basic_sessions(fd, VM_BIND_VM_DESTROY, 1, false);
> > +
> > + igt_subtest("basic-vm-bind-vm-destroy-discovery")
> > + test_basic_discovery(fd, VM_BIND_VM_DESTROY, false);
> > +
> > + igt_subtest("basic-vm-bind-extended")
> > + test_basic_sessions(fd, VM_BIND_EXTENDED, 1, true);
> > +
> > + igt_subtest("basic-vm-bind-extended-discovery")
> > + test_basic_discovery(fd, VM_BIND_EXTENDED, true);
> > +
> > + igt_subtest("read-metadata")
> > + test_metadata_read(fd, 0, 1);
> > +
> > + igt_subtest("attach-debug-metadata")
> > + test_metadata_attach(fd, 0, 1);
> > +
> > + igt_subtest("discovery-race")
> > + test_race_discovery(fd, 0, 4);
> > +
> > + igt_subtest("discovery-race-vmbind")
> > + test_race_discovery(fd, DISCOVERY_VM_BIND, 4);
> > +
> > + igt_subtest("discovery-empty")
> > + test_empty_discovery(fd, DISCOVERY_CLOSE_CLIENT, 16);
> > +
> > + igt_subtest("discovery-empty-clients")
> > + test_empty_discovery(fd, DISCOVERY_DESTROY_RESOURCES, 16);
> > +
> > + igt_fixture {
> > + xe_eudebug_enable(fd, was_enabled);
> > + drm_close_driver(fd);
> > + }
> > +
> > + igt_subtest_group {
> > + igt_fixture {
> > + gpu_count = drm_prepare_filtered_multigpu(DRIVER_XE);
> > + igt_require(gpu_count >= 2);
> > +
> > + multigpu_was_enabled = malloc(gpu_count * sizeof(bool));
> > + igt_assert(multigpu_was_enabled);
> > + for (int i = 0; i < gpu_count; i++) {
> > + fd = drm_open_filtered_card(i);
> > + multigpu_was_enabled[i] = xe_eudebug_enable(fd, true);
> > + close(fd);
> > + }
> > + }
> > +
> > + igt_subtest("multigpu-basic-client") {
> > + igt_multi_fork(child, gpu_count) {
> > + fd = drm_open_filtered_card(child);
> > + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
> > + child, errno);
> > + igt_assert(is_xe_device(fd));
> > +
> > + test_basic_sessions(fd, 0, 1, true);
> > + close(fd);
> > + }
> > + igt_waitchildren();
> > + }
> > +
> > + igt_subtest("multigpu-basic-client-many") {
> > + igt_multi_fork(child, gpu_count) {
> > + fd = drm_open_filtered_card(child);
> > + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
> > + child, errno);
> > + igt_assert(is_xe_device(fd));
> > +
> > + test_basic_sessions(fd, 0, 4, true);
> > + close(fd);
> > + }
> > + igt_waitchildren();
> > + }
> > +
> > + igt_fixture {
> > + for (int i = 0; i < gpu_count; i++) {
> > + fd = drm_open_filtered_card(i);
> > + xe_eudebug_enable(fd, multigpu_was_enabled[i]);
> > + close(fd);
> > + }
> > + free(multigpu_was_enabled);
> > + }
> > + }
> > +}
> > diff --git a/tests/meson.build b/tests/meson.build
> > index 00556c9d6..0f996fdc8 100644
> > --- a/tests/meson.build
> > +++ b/tests/meson.build
> > @@ -318,6 +318,14 @@ intel_xe_progs = [
> > 'xe_sysfs_scheduler',
> > ]
> >
> > +intel_xe_eudebug_progs = [
> > + 'xe_eudebug',
> > +]
> > +
> > +if build_xe_eudebug
> > + intel_xe_progs += intel_xe_eudebug_progs
> > +endif
> > +
> > chamelium_progs = [
> > 'kms_chamelium_audio',
> > 'kms_chamelium_color',
> > --
> > 2.34.1
> >
* Re: [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-05 9:28 ` [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation Christoph Manszewski
2024-09-06 14:46 ` Kamil Konieczny
@ 2024-09-12 8:04 ` Zbigniew Kempczyński
2024-09-17 14:44 ` Manszewski, Christoph
2024-09-17 16:00 ` Manszewski, Christoph
1 sibling, 2 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-12 8:04 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala,
Jonathan Cavitt
On Thu, Sep 05, 2024 at 11:28:08AM +0200, Christoph Manszewski wrote:
> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>
> For typical debugging under gdb one can specify two main use cases:
> accessing and manipulating resources created by the application, and
> manipulating thread execution (interrupting and setting breakpoints).
>
> This test adds coverage for the former by checking that:
> - the debugger reports the expected events for Xe resources created
> by the debugged client,
> - the debugger is able to read and write the vm of the debugged client.
Hi all.
First of all, on Mika's series (v2) sent upstream on the xe ML I've noticed
that some tests are crashing the kernel. From this test's perspective that is
good; it seems the test is doing what it should do. I observe reboots on the
vm-access-related subtests: basic-vm-access(-userptr).
>
> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> ---
> docs/testplan/meson.build | 13 +-
> meson_options.txt | 2 +-
> tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
> tests/meson.build | 8 +
> 4 files changed, 2737 insertions(+), 2 deletions(-)
> create mode 100644 tests/intel/xe_eudebug.c
>
> diff --git a/docs/testplan/meson.build b/docs/testplan/meson.build
> index 5560347f1..e86af028e 100644
> --- a/docs/testplan/meson.build
> +++ b/docs/testplan/meson.build
> @@ -33,11 +33,22 @@ else
> doc_dependencies = []
> endif
>
> +xe_excluded_tests = []
> +if not build_xe_eudebug
> + foreach test : intel_xe_eudebug_progs
> + xe_excluded_tests += meson.current_source_dir() + '/../../tests/intel/' + test + '.c'
> + endforeach
> +endif
> +
> +if xe_excluded_tests.length() > 0
> + xe_excluded_tests = ['--exclude-files'] + xe_excluded_tests
> +endif
> +
> if build_xe
> test_dict = {
> 'i915_tests': { 'input': i915_test_config, 'extra_args': check_testlist },
> 'kms_tests': { 'input': kms_test_config, 'extra_args': kms_check_testlist },
> - 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist }
> + 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist + xe_excluded_tests }
> }
> else
> test_dict = {
> diff --git a/meson_options.txt b/meson_options.txt
> index 11922523b..c410f9b77 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -45,7 +45,7 @@ option('xe_driver',
> option('xe_eudebug',
> type : 'feature',
> value : 'disabled',
> - description : 'Build library for Xe EU debugger')
> + description : 'Build library and tests for Xe EU debugger')
>
> option('libdrm_drivers',
> type : 'array',
> diff --git a/tests/intel/xe_eudebug.c b/tests/intel/xe_eudebug.c
> new file mode 100644
> index 000000000..fd2894a5e
> --- /dev/null
> +++ b/tests/intel/xe_eudebug.c
> @@ -0,0 +1,2716 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +/**
> + * TEST: Test EU Debugger functionality
> + * Category: Core
> + * Mega feature: EUdebug
> + * Sub-category: EUdebug tests
> + * Functionality: eu debugger framework
> + * Test category: functionality test
> + */
> +
> +#include <grp.h>
> +#include <poll.h>
> +#include <pthread.h>
> +#include <pwd.h>
> +#include <sys/ioctl.h>
> +#include <sys/prctl.h>
> +
> +#include "igt.h"
> +#include "intel_pat.h"
> +#include "lib/igt_syncobj.h"
> +#include "xe/xe_eudebug.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +
> +/**
> + * SUBTEST: sysfs-toggle
> + * Description:
> + * Exercise the debugger enable/disable sysfs toggle logic
> + */
> +static void test_sysfs_toggle(int fd)
> +{
> + xe_eudebug_enable(fd, false);
> + igt_assert(!xe_eudebug_debugger_available(fd));
> +
> + xe_eudebug_enable(fd, true);
> + igt_assert(xe_eudebug_debugger_available(fd));
> + xe_eudebug_enable(fd, true);
> + igt_assert(xe_eudebug_debugger_available(fd));
> +
> + xe_eudebug_enable(fd, false);
> + igt_assert(!xe_eudebug_debugger_available(fd));
> + xe_eudebug_enable(fd, false);
> + igt_assert(!xe_eudebug_debugger_available(fd));
> +
> + xe_eudebug_enable(fd, true);
> + igt_assert(xe_eudebug_debugger_available(fd));
> +}
> +
> +#define STAGE_PRE_DEBUG_RESOURCES_DONE 1
> +#define STAGE_DISCOVERY_DONE 2
> +
> +#define CREATE_VMS (1 << 0)
> +#define CREATE_EXEC_QUEUES (1 << 1)
> +#define VM_BIND (1 << 2)
> +#define VM_BIND_VM_DESTROY (1 << 3)
> +#define VM_BIND_EXTENDED (1 << 4)
> +#define VM_METADATA (1 << 5)
> +#define VM_BIND_METADATA (1 << 6)
> +#define VM_BIND_OP_MAP_USERPTR (1 << 7)
> +#define TEST_DISCOVERY (1 << 31)
Please align the value with the same column as the defines above.
> +
> +#define PAGE_SIZE 4096
Use SZ_4K
> +static struct drm_xe_vm_bind_op_ext_attach_debug *
> +basic_vm_bind_metadata_ext_prepare(int fd, struct xe_eudebug_client *c,
> + uint8_t **data, uint32_t data_size)
> +{
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
> + int i;
> +
According to the code below, data_size should be >=
PAGE_SIZE * WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM.
Shouldn't there be an assert for that here?
> + *data = calloc(data_size, sizeof(*data));
> + igt_assert(*data);
> +
> + for (i = 0; i < data_size; i++)
> + (*data)[i] = 0xff & (i + (i > PAGE_SIZE));
Just a question: why do you add 1 to every index past the first page?
> +
> + ext = calloc(WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM, sizeof(*ext));
> + igt_assert(ext);
> +
> + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++) {
> + ext[i].base.name = XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG;
> + ext[i].metadata_id = xe_eudebug_client_metadata_create(c, fd, i,
> + (i + 1) * PAGE_SIZE, *data);
Is it intentional to use the same *data buffer for every metadata
entry, just with increasing size?
> + ext[i].cookie = i;
> +
> + if (i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM - 1)
> + ext[i].base.next_extension = to_user_pointer(&ext[i + 1]);
> + }
> + return ext;
> +}
> +
> +static void basic_vm_bind_metadata_ext_del(int fd, struct xe_eudebug_client *c,
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext,
> + uint8_t *data)
> +{
> + for (int i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
> + xe_eudebug_client_metadata_destroy(c, fd, ext[i].metadata_id, i,
> + (i + 1) * PAGE_SIZE);
> + free(ext);
> + free(data);
> +}
> +
> +static void basic_vm_bind_client(int fd, struct xe_eudebug_client *c)
> +{
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext = NULL;
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_get_default_alignment(fd);
> + bool test_discovery = c->flags & TEST_DISCOVERY;
> + bool test_metadata = c->flags & VM_BIND_METADATA;
> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + uint64_t addr = 0x1a0000;
Move the BO_ADDR define up from the bottom of the file and use it here.
> + uint8_t *data = NULL;
> +
> + if (test_metadata)
> + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, PAGE_SIZE);
As noted above, PAGE_SIZE looks too small here; see my earlier comment
about *data. Looking at the code, MDATA_SIZE should likely be used
instead.
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
> + bo_size, 0, NULL, 0, to_user_pointer(ext));
> +
> + if (test_discovery) {
> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> + }
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
> +
> + if (test_metadata)
> + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
> +
> + gem_close(fd, bo);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +}
> +
> +static void basic_vm_bind_vm_destroy_client(int fd, struct xe_eudebug_client *c)
> +{
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_get_default_alignment(fd);
> + bool test_discovery = c->flags & TEST_DISCOVERY;
> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + uint64_t addr = 0x1a0000;
> +
> + if (test_discovery) {
> + vm = xe_vm_create(fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + xe_vm_bind_async(fd, vm, 0, bo, 0, addr, bo_size, NULL, 0);
> +
> + xe_vm_destroy(fd, vm);
> +
> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> + } else {
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + xe_eudebug_client_vm_bind(c, fd, vm, bo, 0, addr, bo_size);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + }
> +
> + gem_close(fd, bo);
> +}
> +
> +#define BO_ADDR 0x1a0000
> +#define BO_ITEMS 4096
> +#define MIN_BO_SIZE (BO_ITEMS * sizeof(uint64_t))
> +
> +union buf_id {
> + uint32_t fd;
> + void *userptr;
> +};
> +
> +struct bind_list {
> + int fd;
> + uint32_t vm;
> + union buf_id *bo;
> + struct drm_xe_vm_bind_op *bind_ops;
> + unsigned int n;
> +};
> +
> +static void *bo_get_ptr(int fd, struct drm_xe_vm_bind_op *o)
> +{
> + void *ptr;
> +
> + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
> + ptr = xe_bo_map(fd, o->obj, o->range);
> + else
> + ptr = (void *)(uintptr_t)o->userptr;
> +
> + igt_assert(ptr);
> +
> + return ptr;
> +}
> +
> +static void bo_put_ptr(int fd, struct drm_xe_vm_bind_op *o, void *ptr)
> +{
> + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
> + munmap(ptr, o->range);
> +}
> +
> +static void bo_prime(int fd, struct drm_xe_vm_bind_op *o)
Why prime? Shouldn't this be bo_fill()?
> +{
> + uint64_t *d;
> + uint64_t i;
> +
> + d = bo_get_ptr(fd, o);
> +
> + for (i = 0; i < o->range / sizeof(*d); i++)
> + d[i] = o->addr + i;
> +
> + bo_put_ptr(fd, o, d);
> +}
> +
> +static void bo_check(int fd, struct drm_xe_vm_bind_op *o)
> +{
> + uint64_t *d;
> + uint64_t i;
> +
> + d = bo_get_ptr(fd, o);
> +
> + for (i = 0; i < o->range / sizeof(*d); i++)
> + igt_assert_eq(d[i], o->addr + i + 1);
> +
> + bo_put_ptr(fd, o, d);
> +}
> +
> +static union buf_id *vm_create_objects(int fd, uint32_t bo_placement, uint32_t vm,
> + unsigned int size, unsigned int n)
> +{
> + union buf_id *bo;
> + unsigned int i;
> +
> + bo = calloc(n, sizeof(*bo));
> + igt_assert(bo);
> +
> + for (i = 0; i < n; i++) {
> + if (bo_placement) {
> + bo[i].fd = xe_bo_create(fd, vm, size, bo_placement, 0);
> + igt_assert(bo[i].fd);
> + } else {
> + bo[i].userptr = aligned_alloc(PAGE_SIZE, size);
> + igt_assert(bo[i].userptr);
> + }
> + }
> +
> + return bo;
> +}
> +
> +static struct bind_list *create_bind_list(int fd, uint32_t bo_placement,
> + uint32_t vm, unsigned int n,
> + unsigned int target_size)
> +{
> + unsigned int i = target_size ?: MIN_BO_SIZE;
> + const unsigned int bo_size = max_t(bo_size, xe_get_default_alignment(fd), i);
> + bool is_userptr = !bo_placement;
> + struct bind_list *bl;
> +
> + bl = malloc(sizeof(*bl));
> + bl->fd = fd;
> + bl->vm = vm;
> + bl->bo = vm_create_objects(fd, bo_placement, vm, bo_size, n);
> + bl->n = n;
> + bl->bind_ops = calloc(n, sizeof(*bl->bind_ops));
> + igt_assert(bl->bind_ops);
> +
> + for (i = 0; i < n; i++) {
> + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
> +
Most of the '= 0' initializations can be dropped, since bind_ops is calloc()ed.
> + if (is_userptr) {
> + o->obj = 0;
> + o->userptr = (uintptr_t)bl->bo[i].userptr;
> + o->op = DRM_XE_VM_BIND_OP_MAP_USERPTR;
> + } else {
> + o->obj = bl->bo[i].fd;
> + o->obj_offset = 0;
> + o->op = DRM_XE_VM_BIND_OP_MAP;
> + }
> +
> + o->range = bo_size;
> + o->addr = BO_ADDR + 2 * i * bo_size;
> + o->flags = 0;
> + o->pat_index = intel_get_pat_idx_wb(fd);
> + o->prefetch_mem_region_instance = 0;
> + o->reserved[0] = 0;
> + o->reserved[1] = 0;
> + }
> +
> + for (i = 0; i < bl->n; i++) {
> + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
> +
> + igt_debug("bo %d: addr 0x%llx, range 0x%llx\n", i, o->addr, o->range);
> + bo_prime(fd, o);
> + }
> +
> + return bl;
> +}
> +
> +static void do_bind_list(struct xe_eudebug_client *c,
> + struct bind_list *bl, bool sync)
> +{
> + struct drm_xe_sync uf_sync = {
> + .type = DRM_XE_SYNC_TYPE_USER_FENCE,
> + .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> + .timeline_value = 1337,
> + };
> + uint64_t ref_seqno = 0, op_ref_seqno = 0;
> + uint64_t *fence_data;
> + int i;
> +
> + if (sync) {
> + fence_data = aligned_alloc(xe_get_default_alignment(bl->fd),
> + sizeof(*fence_data));
> + igt_assert(fence_data);
> + uf_sync.addr = to_user_pointer(fence_data);
> + memset(fence_data, 0, sizeof(*fence_data));
> + }
> +
> + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, &uf_sync, sync ? 1 : 0);
> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
> + bl->fd, bl->vm, 0, bl->n, &ref_seqno);
> + for (i = 0; i < bl->n; i++)
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
> + ref_seqno,
> + &op_ref_seqno,
> + bl->bind_ops[i].addr,
> + bl->bind_ops[i].range,
> + 0);
> +
> + if (sync) {
> + xe_wait_ufence(bl->fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> + free(fence_data);
> + }
> +}
> +
> +static void free_bind_list(struct xe_eudebug_client *c, struct bind_list *bl)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < bl->n; i++) {
> + igt_debug("%d: checking 0x%llx (%lld)\n",
> + i, bl->bind_ops[i].addr, bl->bind_ops[i].addr);
> + bo_check(bl->fd, &bl->bind_ops[i]);
> + if (bl->bind_ops[i].op == DRM_XE_VM_BIND_OP_MAP_USERPTR)
> + free(bl->bo[i].userptr);
> + xe_eudebug_client_vm_unbind(c, bl->fd, bl->vm, 0,
> + bl->bind_ops[i].addr,
> + bl->bind_ops[i].range);
> + }
> +
> + free(bl->bind_ops);
> + free(bl->bo);
> + free(bl);
> +}
> +
> +static void vm_bind_client(int fd, struct xe_eudebug_client *c)
> +{
> + uint64_t op_ref_seqno, ref_seqno;
> + struct bind_list *bl;
> + bool test_discovery = c->flags & TEST_DISCOVERY;
> + size_t bo_size = 3 * xe_get_default_alignment(fd);
> + uint32_t bo[2] = {
> + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
> + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
> + };
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + uint64_t addr[] = {0x2a0000, 0x3a0000};
> + uint64_t rebind_bo_offset = 2 * bo_size / 3;
> + uint64_t size = bo_size / 3;
> + int i = 0;
> +
> + if (test_discovery) {
> + xe_vm_bind_async(fd, vm, 0, bo[0], 0, addr[0], bo_size, NULL, 0);
> +
> + xe_vm_unbind_async(fd, vm, 0, 0, addr[0] + size, size, NULL, 0);
> +
> + xe_vm_bind_async(fd, vm, 0, bo[1], 0, addr[1], bo_size, NULL, 0);
> +
> + xe_vm_bind_async(fd, vm, 0, bo[1], rebind_bo_offset, addr[1], size, NULL, 0);
> +
> + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
> + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, NULL, 0);
> +
> + xe_vm_unbind_all_async(fd, vm, 0, bo[0], NULL, 0);
> +
> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
> + bl->fd, bl->vm, 0, bl->n + 2, &ref_seqno);
> +
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> + &op_ref_seqno, addr[1], size, 0);
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> + &op_ref_seqno, addr[1] + size, size * 2, 0);
> +
> + for (i = 0; i < bl->n; i++)
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
> + ref_seqno, &op_ref_seqno,
> + bl->bind_ops[i].addr,
> + bl->bind_ops[i].range, 0);
> +
> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
> + } else {
> + xe_eudebug_client_vm_bind(c, fd, vm, bo[0], 0, addr[0], bo_size);
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr[0] + size, size);
> +
> + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], 0, addr[1], bo_size);
> + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], rebind_bo_offset, addr[1], size);
> +
> + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
> + do_bind_list(c, bl, false);
> + }
> +
> + xe_vm_unbind_all_async(fd, vm, 0, bo[1], NULL, 0);
> +
> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, fd, vm, 0,
> + 1, &ref_seqno);
> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, ref_seqno,
> + &op_ref_seqno, 0, 0, 0);
> +
> + gem_close(fd, bo[0]);
> + gem_close(fd, bo[1]);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +}
> +
> +static void run_basic_client(struct xe_eudebug_client *c)
For the 'multiple-sessions' subtest: is run_basic_client() prepared to
be executed in parallel? On my setup these 4 children execute their
vm/exec-queue creation one after another, not in parallel.
> +{
> + int fd, i;
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
xe_device_get() is not necessary here, as xe_eudebug_client_open_driver()
calls drm_reopen_driver(), which for xe already does this for the new fd.
> +
> + if (c->flags & CREATE_VMS) {
> + const uint32_t flags[] = {
> + DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | DRM_XE_VM_CREATE_FLAG_LR_MODE,
> + DRM_XE_VM_CREATE_FLAG_LR_MODE,
> + };
> + uint32_t vms[ARRAY_SIZE(flags)];
> +
> + for (i = 0; i < ARRAY_SIZE(flags); i++)
> + vms[i] = xe_eudebug_client_vm_create(c, fd, flags[i], 0);
> +
> + for (i--; i >= 0; i--)
> + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
> + }
> +
> + if (c->flags & CREATE_EXEC_QUEUES) {
> + struct drm_xe_exec_queue_create *create;
> + struct drm_xe_engine_class_instance *hwe;
> + struct drm_xe_ext_set_property eq_ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + uint32_t vm;
> +
> + create = calloc(xe_number_engines(fd), sizeof(*create));
> +
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + i = 0;
> + xe_eudebug_for_each_engine(fd, hwe) {
> + create[i].instances = to_user_pointer(hwe);
> + create[i].vm_id = vm;
> + create[i].width = 1;
> + create[i].num_placements = 1;
> + create[i].extensions = to_user_pointer(&eq_ext);
> + xe_eudebug_client_exec_queue_create(c, fd, &create[i++]);
> + }
> +
> + while (--i >= 0)
> + xe_eudebug_client_exec_queue_destroy(c, fd, &create[i]);
> +
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + }
> +
> + if (c->flags & VM_BIND || c->flags & VM_BIND_METADATA)
> + basic_vm_bind_client(fd, c);
> +
> + if (c->flags & VM_BIND_EXTENDED)
> + vm_bind_client(fd, c);
> +
> + if (c->flags & VM_BIND_VM_DESTROY)
> + basic_vm_bind_vm_destroy_client(fd, c);
> +
> + xe_device_put(fd);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static int read_event(int debugfd, struct drm_xe_eudebug_event *event)
> +{
> + int ret;
> +
> + ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
> + if (ret < 0)
> + return -errno;
> +
> + return ret;
> +}
> +
> +static int __read_event(int debugfd, struct drm_xe_eudebug_event *event)
> +{
> + int ret;
> +
> + ret = ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
> + if (ret < 0)
> + return -errno;
> +
> + return ret;
> +}
> +
> +static int poll_event(int fd, int timeout_ms)
> +{
> + int ret;
> +
> + struct pollfd p = {
> + .fd = fd,
> + .events = POLLIN,
> + .revents = 0,
> + };
> +
> + ret = poll(&p, 1, timeout_ms);
> + if (ret == -1)
> + return -errno;
> +
> + return ret == 1 && (p.revents & POLLIN);
> +}
> +
> +static int __debug_connect(int fd, int *debugfd, struct drm_xe_eudebug_connect *param)
> +{
> + int ret = 0;
> +
> + *debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, param);
> +
> + if (*debugfd < 0) {
> + ret = -errno;
> + igt_assume(ret != 0);
> + }
> +
> + errno = 0;
> + return ret;
> +}
> +
> +/**
> + * SUBTEST: basic-connect
> + * Description:
> + * Exercise XE_EUDEBUG_CONNECT ioctl with passing
> + * valid and invalid params.
> + */
> +static void test_connect(int fd)
> +{
> + struct drm_xe_eudebug_connect param = {};
> + int debugfd, ret;
> + pid_t *pid;
> +
> + pid = mmap(NULL, sizeof(pid_t), PROT_WRITE,
> + MAP_SHARED | MAP_ANON, -1, 0);
> +
> + /* get fresh unrelated pid */
> + igt_fork(child, 1)
> + *pid = getpid();
> +
> + igt_waitchildren();
> + param.pid = *pid;
> + munmap(pid, sizeof(pid_t));
> +
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, param.pid ? -ENOENT : -EINVAL);
I've pointed out in the review of the kernel series that ENOENT should
be reserved for file operations, so I think this errno should be changed.
> +
> + param.pid = 0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.pid = getpid();
> + param.version = -1;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.version = 0;
> + param.flags = ~0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.flags = 0;
> + param.extensions = ~0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert(debugfd == -1);
> + igt_assert_eq(ret, -EINVAL);
> +
> + param.extensions = 0;
> + ret = __debug_connect(fd, &debugfd, ¶m);
> + igt_assert_neq(debugfd, -1);
> + igt_assert_eq(ret, 0);
> +
> + close(debugfd);
> +}
> +
> +static void switch_user(__uid_t uid, __gid_t gid)
> +{
> + struct group *gr;
> + __gid_t gr_v;
> +
> +	/* Users other than root need to belong to video group */
> + gr = getgrnam("video");
> + igt_assert(gr);
> +
> + /* Drop all */
> + igt_assert_eq(setgroups(1, &gr->gr_gid), 0);
> + igt_assert_eq(setgid(gid), 0);
> + igt_assert_eq(setuid(uid), 0);
> +
> + igt_assert_eq(getgroups(1, &gr_v), 1);
> + igt_assert_eq(gr_v, gr->gr_gid);
> + igt_assert_eq(getgid(), gid);
> + igt_assert_eq(getuid(), uid);
> +
> + igt_assert_eq(prctl(PR_SET_DUMPABLE, 1L), 0);
> +}
> +
> +/**
> + * SUBTEST: connect-user
> + * Description:
> + * Verify unprivileged XE_EUDEBG_CONNECT ioctl.
Typo: XE_EUDEBG_CONNECT should be XE_EUDEBUG_CONNECT.
> + * Check:
> + * - user debugger to user workload connection
> + * - user debugger to other user workload connection
> + * - user debugger to privileged workload connection
> + */
> +static void test_connect_user(int fd)
> +{
> + struct drm_xe_eudebug_connect param = {};
> + struct passwd *pwd, *pwd2;
> + const char *user1 = "lp";
> + const char *user2 = "mail";
> + int debugfd, ret, i;
> + int p1[2], p2[2];
> + __uid_t u1, u2;
> + __gid_t g1, g2;
> + int newfd;
> + pid_t pid;
> +
> +#define NUM_USER_TESTS 4
> +#define P_APP 0
> +#define P_GDB 1
> + struct conn_user {
> + /* u[0] - process uid, u[1] - gdb uid */
> + __uid_t u[P_GDB + 1];
> + /* g[0] - process gid, g[1] - gdb gid */
> + __gid_t g[P_GDB + 1];
> + /* Expected fd from open */
> + int ret;
> + /* Skip this test case */
> + int skip;
> + const char *desc;
> + } test[NUM_USER_TESTS] = {};
> +
> + igt_assert(!pipe(p1));
> + igt_assert(!pipe(p2));
> +
> + pwd = getpwnam(user1);
> + igt_require(pwd);
> + u1 = pwd->pw_uid;
> + g1 = pwd->pw_gid;
> +
> + /*
> + * Keep a copy of needed contents as it is a static
> + * memory area and subsequent calls will overwrite
> + * what's in.
> + * However getpwnam() returns NULL if cannot find
> + * user in passwd.
> + */
> + setpwent();
> + pwd2 = getpwnam(user2);
> + if (pwd2) {
> + u2 = pwd2->pw_uid;
> + g2 = pwd2->pw_gid;
> + }
> +
> + test[0].skip = !pwd;
> + test[0].u[P_GDB] = u1;
> + test[0].g[P_GDB] = g1;
> + test[0].ret = -EACCES;
> + test[0].desc = "User GDB to Root App";
> +
> + test[1].skip = !pwd;
> + test[1].u[P_APP] = u1;
> + test[1].g[P_APP] = g1;
> + test[1].u[P_GDB] = u1;
> + test[1].g[P_GDB] = g1;
> + test[1].ret = 0;
> + test[1].desc = "User GDB to User App";
> +
> + test[2].skip = !pwd;
> + test[2].u[P_APP] = u1;
> + test[2].g[P_APP] = g1;
> + test[2].ret = 0;
> + test[2].desc = "Root GDB to User App";
> +
> + test[3].skip = !pwd2;
> + test[3].u[P_APP] = u1;
> + test[3].g[P_APP] = g1;
> + test[3].u[P_GDB] = u2;
> + test[3].g[P_GDB] = g2;
> + test[3].ret = -EACCES;
> + test[3].desc = "User GDB to Other User App";
> +
> + if (!pwd2)
> + igt_warn("User %s not available in the system. Skipping subtests: %s.\n",
> + user2, test[3].desc);
> +
> + for (i = 0; i < NUM_USER_TESTS; i++) {
> + if (test[i].skip) {
> + igt_debug("Subtest %s skipped\n", test[i].desc);
> + continue;
> + }
> + igt_debug("Executing connection: %s\n", test[i].desc);
> + igt_fork(child, 2) {
> + if (!child) {
> + if (test[i].u[P_APP])
> + switch_user(test[i].u[P_APP], test[i].g[P_APP]);
> +
> + pid = getpid();
> + /* Signal the PID */
> + igt_assert(write(p1[1], &pid, sizeof(pid)) == sizeof(pid));
> + /* wait with exit */
> + igt_assert(read(p2[0], &pid, sizeof(pid)) == sizeof(pid));
> + } else {
> + if (test[i].u[P_GDB])
> + switch_user(test[i].u[P_GDB], test[i].g[P_GDB]);
> +
> + igt_assert(read(p1[0], &pid, sizeof(pid)) == sizeof(pid));
> + param.pid = pid;
> +
> + newfd = drm_open_driver(DRIVER_XE);
> + ret = __debug_connect(newfd, &debugfd, ¶m);
> +
> + /* Release the app first */
> + igt_assert(write(p2[1], &pid, sizeof(pid)) == sizeof(pid));
> +
> + igt_assert_eq(ret, test[i].ret);
> + if (!ret)
> + close(debugfd);
> + }
> + }
> + igt_waitchildren();
> + }
> + close(p1[0]);
> + close(p1[1]);
> + close(p2[0]);
> + close(p2[1]);
> +#undef NUM_USER_TESTS
> +#undef P_APP
> +#undef P_GDB
> +}
> +
> +/**
> + * SUBTEST: basic-close
> + * Description:
> + * Test whether eudebug can be reattached after closure.
> + */
> +static void test_close(int fd)
> +{
> + struct drm_xe_eudebug_connect param = { 0, };
> + int debug_fd1, debug_fd2;
> + int fd2;
> +
> + param.pid = getpid();
> +
> + igt_assert_eq(__debug_connect(fd, &debug_fd1, ¶m), 0);
> + igt_assert(debug_fd1 >= 0);
> + igt_assert_eq(__debug_connect(fd, &debug_fd2, ¶m), -EBUSY);
> + igt_assert_eq(debug_fd2, -1);
> +
> + close(debug_fd1);
> + fd2 = drm_open_driver(DRIVER_XE);
> +
> + igt_assert_eq(__debug_connect(fd2, &debug_fd2, ¶m), 0);
> + igt_assert(debug_fd2 >= 0);
> + close(fd2);
> + close(debug_fd2);
> + close(debug_fd1);
> +}
> +
> +/**
> + * SUBTEST: basic-read-event
> + * Description:
> + * Synchronously exercise eu debugger event polling and reading.
> + */
May I ask for comments similar to those in debugger_test_vma_parameters()?
> +#define MAX_EVENT_SIZE (32 * 1024)
> +static void test_read_event(int fd)
> +{
> + struct drm_xe_eudebug_event *event;
> + struct xe_eudebug_debugger *d;
> + struct xe_eudebug_client *c;
> +
> + event = malloc(MAX_EVENT_SIZE);
> + igt_assert(event);
> + memset(event, 0, sizeof(*event));
calloc?
> +
> + c = xe_eudebug_client_create(fd, run_basic_client, 0, NULL);
> + d = xe_eudebug_debugger_create(fd, 0, NULL);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> + igt_assert_eq(poll_event(d->fd, 500), 0);
> +
> + event->len = 1;
> + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> +
> + event->len = MAX_EVENT_SIZE;
> + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> +
> + xe_eudebug_client_start(c);
run_basic_client() produces client create/destroy events, so:
> +
> + igt_assert_eq(poll_event(d->fd, 500), 1);
> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> + igt_assert_eq(read_event(d->fd, event), 0);
I would check that flags == CREATE at this point, then
> +
> + igt_assert_eq(poll_event(d->fd, 500), 1);
> +
> + event->flags = 0;
> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> +
> + event->len = 0;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> + igt_assert_eq(0, event->len);
> +
> + event->len = sizeof(*event) - 1;
> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
> +
> + event->len = sizeof(*event);
> + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
> + igt_assert_lt(sizeof(*event), event->len);
> +
> + event->len = event->len - 1;
> + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
> + /* event->len should now contain the exact len */
> + igt_assert_eq(read_event(d->fd, event), 0);
flags == DESTROY here.
> +
> + fcntl(d->fd, F_SETFL, fcntl(d->fd, F_GETFL) | O_NONBLOCK);
> + igt_assert(fcntl(d->fd, F_GETFL) & O_NONBLOCK);
> +
> + igt_assert_eq(poll_event(d->fd, 500), 0);
> + event->len = MAX_EVENT_SIZE;
> + event->flags = 0;
> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
> + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
> +
> + xe_eudebug_client_wait_done(c);
> + xe_eudebug_client_stop(c);
> +
> + igt_assert_eq(poll_event(d->fd, 500), 0);
> + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
> +
> + xe_eudebug_debugger_destroy(d);
> + xe_eudebug_client_destroy(c);
> +
> + free(event);
> +}
> +
> +/**
> + * SUBTEST: basic-client
> + * Description:
> + * Attach the debugger to process which opens and closes xe drm client.
> + *
> + * SUBTEST: basic-client-th
> + * Description:
> + * Create client basic resources (vms) in multiple threads
> + *
> + * SUBTEST: multiple-sessions
> + * Description:
> + * Simultaneously attach many debuggers to many processes.
> + * Each process opens and closes xe drm client and creates few resources.
> + *
> + * SUBTEST: basic-%s
> + * Description:
> + * Attach the debugger to process which creates and destroys a few %arg[1].
> + *
> + * SUBTEST: basic-vm-bind
> + * Description:
> + * Attach the debugger to a process that performs synchronous vm bind
> + * and vm unbind.
> + *
> + * SUBTEST: basic-vm-bind-vm-destroy
> + * Description:
> + * Attach the debugger to a process that performs vm bind, and destroys
> + * the vm without unbinding. Make sure that we don't get unbind events.
> + *
> + * SUBTEST: basic-vm-bind-extended
> + * Description:
> + * Attach the debugger to a process that performs bind, bind array, rebind,
> + * partial unbind, unbind and unbind all operations.
> + *
> + * SUBTEST: multigpu-basic-client
> + * Description:
> + * Attach the debugger to process which opens and closes xe drm client on all Xe devices.
> + *
> + * SUBTEST: multigpu-basic-client-many
> + * Description:
> + * Simultaneously attach many debuggers to many processes on all Xe devices.
> + * Each process opens and closes xe drm client and creates few resources.
> + *
> + * arg[1]:
> + *
> + * @vms: vms
> + * @exec-queues: exec queues
> + */
> +
> +static void test_basic_sessions(int fd, unsigned int flags, int count, bool match_opposite)
> +{
> + struct xe_eudebug_session **s;
> + int i;
> +
> + s = calloc(count, sizeof(*s));
> +
> + igt_assert(s);
> +
> + for (i = 0; i < count; i++)
> + s[i] = xe_eudebug_session_create(fd, run_basic_client, flags, NULL);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_run(s[i]);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_check(s[i], match_opposite, 0);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_destroy(s[i]);
> +}
> +
> +/**
> + * SUBTEST: basic-vm-bind-discovery
> + * Description:
> + * Attach the debugger to a process that performs vm-bind before attaching
> + * and check if the discovery process reports it.
> + *
> + * SUBTEST: basic-vm-bind-metadata-discovery
> + * Description:
> + * Attach the debugger to a process that performs vm-bind with metadata attached
> + * before attaching and check if the discovery process reports it.
> + *
> + * SUBTEST: basic-vm-bind-vm-destroy-discovery
> + * Description:
> + * Attach the debugger to a process that performs vm bind, and destroys
> + * the vm without unbinding before attaching. Make sure that we don't get
> + * any bind/unbind and vm create/destroy events.
> + *
> + * SUBTEST: basic-vm-bind-extended-discovery
> + * Description:
> + * Attach the debugger to a process that performs bind, bind array, rebind,
> + * partial unbind, and unbind all operations before attaching. Ensure that
> + *      we get only a single 'VM_BIND' event from the discovery worker.
> + */
> +static void test_basic_discovery(int fd, unsigned int flags, bool match_opposite)
> +{
> + struct xe_eudebug_debugger *d;
> + struct xe_eudebug_session *s;
> + struct xe_eudebug_client *c;
> +
> + s = xe_eudebug_session_create(fd, run_basic_client, flags | TEST_DISCOVERY, NULL);
> +
> + c = s->client;
> + d = s->debugger;
> +
> + xe_eudebug_client_start(c);
> + xe_eudebug_debugger_wait_stage(s, STAGE_PRE_DEBUG_RESOURCES_DONE);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> + xe_eudebug_debugger_start_worker(d);
> +
> + /* give the worker time to do it's job */
> + sleep(2);
Shouldn't the debugger be informed via a discovery-completion event
instead of relying on an arbitrary timeout?
> + xe_eudebug_debugger_signal_stage(d, STAGE_DISCOVERY_DONE);
> +
> + xe_eudebug_client_wait_done(c);
> +
> + xe_eudebug_debugger_stop_worker(d, 1);
> +
> + xe_eudebug_event_log_print(d->log, true);
> + xe_eudebug_event_log_print(c->log, true);
> +
> + xe_eudebug_session_check(s, match_opposite, 0);
> + xe_eudebug_session_destroy(s);
> +}
> +
> +#define RESOURCE_COUNT 16
> +#define PRIMARY_THREAD (1 << 0)
> +#define DISCOVERY_CLOSE_CLIENT (1 << 1)
> +#define DISCOVERY_DESTROY_RESOURCES (1 << 2)
> +#define DISCOVERY_VM_BIND (1 << 3)
> +static void run_discovery_client(struct xe_eudebug_client *c)
> +{
> + struct drm_xe_engine_class_instance *hwe = NULL;
> + int fd[RESOURCE_COUNT], i;
> + bool skip_sleep = c->flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT);
> + uint64_t addr = 0x1a0000;
> +
> + srand(getpid());
> +
> + for (i = 0; i < RESOURCE_COUNT; i++) {
> + fd[i] = xe_eudebug_client_open_driver(c);
> +
> + if (!i) {
> + bool found = false;
> +
> + xe_device_get(fd[0]);
Unnecessary; drm_reopen_driver() already calls this.
> + xe_for_each_engine(fd[0], hwe) {
> + if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE ||
> + hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
> + found = true;
> + break;
> + }
> + }
> + igt_assert(found);
> + }
> +
> + /*
> + * Give the debugger a break in event stream after every
> +		 * other client, which allows reading discovery and detaching quietly.
> + */
> + if (random() % 2 == 0 && !skip_sleep)
> + sleep(1);
> +
> + for (int j = 0; j < RESOURCE_COUNT; j++) {
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd[i],
> + DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + struct drm_xe_ext_set_property eq_ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + struct drm_xe_exec_queue_create create = {
> + .width = 1,
> + .num_placements = 1,
> + .vm_id = vm,
> + .instances = to_user_pointer(hwe),
> + .extensions = to_user_pointer(&eq_ext),
> + };
> + const unsigned int bo_size = max_t(bo_size,
> + xe_get_default_alignment(fd[i]),
> + MIN_BO_SIZE);
> + uint32_t bo = xe_bo_create(fd[i], 0, bo_size, system_memory(fd[i]), 0);
> +
> + xe_eudebug_client_exec_queue_create(c, fd[i], &create);
> +
> + if (c->flags & DISCOVERY_VM_BIND) {
> + xe_eudebug_client_vm_bind(c, fd[i], vm, bo, 0, addr, bo_size);
> + addr += 0x100000;
Shouldn't this be addr += bo_size? 0x100000 is technically correct, as
the default alignment may be 4K or 64K and MIN_BO_SIZE seems to be 32K,
but I would use bo_size here, unless your intention is to leave gaps
in the vm address space.
> + }
> +
> + if (c->flags & DISCOVERY_DESTROY_RESOURCES) {
> + xe_eudebug_client_exec_queue_destroy(c, fd[i], &create);
> + xe_eudebug_client_vm_destroy(c, fd[i], create.vm_id);
> + gem_close(fd[i], bo);
> + }
> + }
> +
> + if (c->flags & DISCOVERY_CLOSE_CLIENT)
> + xe_eudebug_client_close_driver(c, fd[i]);
> + }
> + xe_device_put(fd[0]);
run_discovery_client() is executed after fork, so freeing the single fd[0]
device cached data is not necessary here. The get/put pair would be
needed if fds were opened and closed repeatedly, but that doesn't happen
here. It took me a while to figure out why I saw no fd leakage, until I
realized all of them are closed on process exit.
> +}
> +
> +/**
> + * SUBTEST: discovery-%s
> + * Description: Race discovery against %arg[1] and the debugger dettach.
> + *
> + * arg[1]:
> + *
> + * @race: resources creation
> + * @race-vmbind: vm-bind operations
> + * @empty: resources destruction
> + * @empty-clients: client closure
> + */
> +static void *discovery_race_thread(void *data)
> +{
> + struct {
> + uint64_t client_handle;
> + int vm_count;
> + int exec_queue_count;
> + int vm_bind_op_count;
> + } clients[RESOURCE_COUNT];
> + struct xe_eudebug_session *s = data;
> + int expected = RESOURCE_COUNT * (1 + 2 * RESOURCE_COUNT);
> + const int tries = 100;
> + bool done = false;
> + int ret = 0;
> +
> + for (int try = 0; try < tries && !done; try++) {
> + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
> +
> + if (ret == -EBUSY) {
> + usleep(100000);
> + continue;
> + }
> +
> + igt_assert_eq(ret, 0);
> +
> + if (random() % 2) {
> + struct drm_xe_eudebug_event *e = NULL;
> + int i = -1;
> +
> + xe_eudebug_debugger_start_worker(s->debugger);
> + sleep(1);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> + igt_debug("Resources discovered: %lu\n", s->debugger->event_count);
> +
> + xe_eudebug_for_each_event(e, s->debugger->log) {
> + if (e->type == DRM_XE_EUDEBUG_EVENT_OPEN) {
> + struct drm_xe_eudebug_event_client *eo = (void *)e;
> +
> + if (i >= 0) {
> + igt_assert_eq(clients[i].vm_count,
> + RESOURCE_COUNT);
> +
> + igt_assert_eq(clients[i].exec_queue_count,
> + RESOURCE_COUNT);
> +
> + if (s->client->flags & DISCOVERY_VM_BIND)
> + igt_assert_eq(clients[i].vm_bind_op_count,
> + RESOURCE_COUNT);
> + }
> +
> + igt_assert(++i < RESOURCE_COUNT);
> + clients[i].client_handle = eo->client_handle;
> + clients[i].vm_count = 0;
> + clients[i].exec_queue_count = 0;
> + clients[i].vm_bind_op_count = 0;
> + }
> +
> + if (e->type == DRM_XE_EUDEBUG_EVENT_VM)
> + clients[i].vm_count++;
> +
> + if (e->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
> + clients[i].exec_queue_count++;
> +
> + if (e->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP)
> + clients[i].vm_bind_op_count++;
> + };
> +
> + igt_assert_lte(0, i);
> +
> + for (int j = 0; j < i; j++)
> + for (int k = 0; k < i; k++) {
> + if (k == j)
> + continue;
> +
> + igt_assert_neq(clients[j].client_handle,
> + clients[k].client_handle);
> + }
> +
> + if (s->debugger->event_count >= expected)
> + done = true;
> + }
> +
> + xe_eudebug_debugger_detach(s->debugger);
> + s->debugger->log->head = 0;
> + s->debugger->event_count = 0;
> + }
> +
> + /* Primary thread must read everything */
> + if (s->flags & PRIMARY_THREAD) {
> + while ((ret = xe_eudebug_debugger_attach(s->debugger, s->client)) == -EBUSY)
> + usleep(100000);
> +
> + igt_assert_eq(ret, 0);
> +
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_wait_done(s->client);
> +
> + if (READ_ONCE(s->debugger->event_count) != expected)
> + sleep(5);
> +
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> + xe_eudebug_debugger_detach(s->debugger);
> + }
> +
> + return NULL;
> +}
> +
> +static void test_race_discovery(int fd, unsigned int flags, int clients)
> +{
> + const int debuggers_per_client = 3;
> + int count = clients * debuggers_per_client;
> + struct xe_eudebug_session *sessions, *s;
> + struct xe_eudebug_client *c;
> + pthread_t *threads;
> + int i, j;
> +
> + sessions = calloc(count, sizeof(*sessions));
> + threads = calloc(count, sizeof(*threads));
> +
> + for (i = 0; i < clients; i++) {
> + c = xe_eudebug_client_create(fd, run_discovery_client, flags, NULL);
> + for (j = 0; j < debuggers_per_client; j++) {
> + s = &sessions[i * debuggers_per_client + j];
> + s->client = c;
> + s->debugger = xe_eudebug_debugger_create(fd, flags, NULL);
> + s->flags = flags | (!j ? PRIMARY_THREAD : 0);
> + }
> + }
> +
> + for (i = 0; i < count; i++) {
> + if (sessions[i].flags & PRIMARY_THREAD)
> + xe_eudebug_client_start(sessions[i].client);
> +
> + pthread_create(&threads[i], NULL, discovery_race_thread, &sessions[i]);
> + }
> +
> + for (i = 0; i < count; i++)
> + pthread_join(threads[i], NULL);
> +
> + for (i = count - 1; i > 0; i--) {
> + if (sessions[i].flags & PRIMARY_THREAD) {
> + igt_assert_eq(sessions[i].client->seqno - 1,
> + sessions[i].debugger->event_count);
> +
> + xe_eudebug_event_log_compare(sessions[0].debugger->log,
> + sessions[i].debugger->log,
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND);
> +
> + xe_eudebug_client_destroy(sessions[i].client);
> + }
> + xe_eudebug_debugger_destroy(sessions[i].debugger);
> + }
> +}
> +
> +static void *attach_dettach_thread(void *data)
> +{
> + struct xe_eudebug_session *s = data;
> + const int tries = 100;
> + int ret = 0;
> +
> + for (int try = 0; try < tries; try++) {
> + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
> +
> + if (ret == -EBUSY) {
> + usleep(100000);
> + continue;
> + }
> +
> + igt_assert_eq(ret, 0);
> +
> + if (random() % 2 == 0) {
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> + }
> +
> + xe_eudebug_debugger_detach(s->debugger);
> + s->debugger->log->head = 0;
> + s->debugger->event_count = 0;
> + }
> +
> + return NULL;
> +}
> +
> +static void test_empty_discovery(int fd, unsigned int flags, int clients)
> +{
> + struct xe_eudebug_session **s;
> + pthread_t *threads;
> + int i, expected = flags & DISCOVERY_CLOSE_CLIENT ? 0 : RESOURCE_COUNT;
> +
> + igt_assert(flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT));
> +
> + s = calloc(clients, sizeof(struct xe_eudebug_session *));
> + threads = calloc(clients, sizeof(*threads));
> +
> + for (i = 0; i < clients; i++)
> + s[i] = xe_eudebug_session_create(fd, run_discovery_client, flags, NULL);
> +
> + for (i = 0; i < clients; i++) {
> + xe_eudebug_client_start(s[i]->client);
> +
> + pthread_create(&threads[i], NULL, attach_dettach_thread, s[i]);
> + }
> +
> + for (i = 0; i < clients; i++)
> + pthread_join(threads[i], NULL);
> +
> + for (i = 0; i < clients; i++) {
> + xe_eudebug_client_wait_done(s[i]->client);
> + igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
> +
> + xe_eudebug_debugger_start_worker(s[i]->debugger);
> + xe_eudebug_debugger_stop_worker(s[i]->debugger, 5);
> + xe_eudebug_debugger_detach(s[i]->debugger);
> +
> + igt_assert_eq(s[i]->debugger->event_count, expected);
> +
> + xe_eudebug_session_destroy(s[i]);
> + }
> +}
> +
> +static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> + xe_eudebug_ack_ufence(d->fd, ef);
> +}
> +
> +typedef void (*client_run_t)(struct xe_eudebug_client *);
> +
> +static void test_client_with_trigger(int fd, unsigned int flags, int count,
> + client_run_t client_fn, int type,
> + xe_eudebug_trigger_fn trigger_fn,
> + struct drm_xe_engine_class_instance *hwe,
> + bool match_opposite, uint32_t event_filter)
> +{
> + struct xe_eudebug_session **s;
> + int i;
> +
> + s = calloc(count, sizeof(*s));
> +
> + igt_assert(s);
> +
> + for (i = 0; i < count; i++)
> + s[i] = xe_eudebug_session_create(fd, client_fn, flags, hwe);
> +
> + if (trigger_fn)
> + for (i = 0; i < count; i++)
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, type, trigger_fn);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_run(s[i]);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_check(s[i], match_opposite, event_filter);
> +
> + for (i = 0; i < count; i++)
> + xe_eudebug_session_destroy(s[i]);
> +}
> +
> +struct thread_fn_args {
> + struct xe_eudebug_client *client;
> + int fd;
> +};
> +
> +static void *basic_client_th(void *data)
> +{
> + struct thread_fn_args *f = data;
> + struct xe_eudebug_client *c = f->client;
> + uint32_t *vms;
> + int fd, i, num_vms;
> +
> + fd = f->fd;
> + igt_assert(fd);
> +
> + xe_device_get(fd);
> +
> + num_vms = 2 + rand() % 16;
> + vms = calloc(num_vms, sizeof(*vms));
> + igt_assert(vms);
> + igt_debug("Create %d client vms\n", num_vms);
> +
> + for (i = 0; i < num_vms; i++)
> + vms[i] = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
I think this client code is prone to getting duplicate seqnos:
xe_eudebug_client_vm_create()
-> vm_event()
...
-> base_event()
...
e->seqno = xe_eudebug_client_get_seqno(c);
There's no mutex protecting 'return c->seqno++', and this increment
is not atomic, so two threads can observe the same value.
> +
> + for (i = 0; i < num_vms; i++)
> + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
> +
> + xe_device_put(fd);
> + free(vms);
> +
> + return NULL;
> +}
> +
> +static void run_basic_client_th(struct xe_eudebug_client *c)
> +{
> + struct thread_fn_args *args;
> + int i, num_threads, fd;
> + pthread_t *threads;
> +
> + args = calloc(1, sizeof(*args));
> + igt_assert(args);
> +
> + num_threads = 2 + random() % 16;
> + igt_debug("Run on %d threads\n", num_threads);
> + threads = calloc(num_threads, sizeof(*threads));
> + igt_assert(threads);
> +
> + fd = xe_eudebug_client_open_driver(c);
> + args->client = c;
> + args->fd = fd;
> +
> + for (i = 0; i < num_threads; i++)
> + pthread_create(&threads[i], NULL, basic_client_th, args);
> +
> + for (i = 0; i < num_threads; i++)
> + pthread_join(threads[i], NULL);
> +
> + xe_eudebug_client_close_driver(c, fd);
> + free(args);
> + free(threads);
> +}
> +
> +static void test_basic_sessions_th(int fd, unsigned int flags, int num_clients, bool match_opposite)
> +{
> + test_client_with_trigger(fd, flags, num_clients, run_basic_client_th, 0, NULL, NULL,
> + match_opposite, 0);
> +}
> +
> +static void vm_access_client(struct xe_eudebug_client *c)
> +{
> + struct drm_xe_engine_class_instance *hwe = c->ptr;
> + uint32_t bo_placement;
> + struct bind_list *bl;
> + uint32_t vm;
> + int fd, i, j;
> +
> + igt_debug("Using %s\n", xe_engine_class_string(hwe->engine_class));
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
Not necessary.
> +
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> +
> + if (c->flags & VM_BIND_OP_MAP_USERPTR)
> + bo_placement = 0;
> + else
> + bo_placement = vram_if_possible(fd, hwe->gt_id);
> +
> + for (j = 0; j < 5; j++) {
> + unsigned int target_size = MIN_BO_SIZE * (1 << j);
> +
> + bl = create_bind_list(fd, bo_placement, vm, 4, target_size);
> + do_bind_list(c, bl, true);
> +
> + for (i = 0; i < bl->n; i++)
> + xe_eudebug_client_wait_stage(c, bl->bind_ops[i].addr);
> +
> + free_bind_list(c, bl);
> + }
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +
> + xe_device_put(fd);
Same.
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void debugger_test_vma(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t vm_handle,
> + uint64_t va_start,
> + uint64_t va_length)
> +{
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint64_t *v1, *v2;
> + uint64_t items = va_length / sizeof(uint64_t);
> + int fd;
> + int r, i;
> +
> + v1 = malloc(va_length);
> + igt_assert(v1);
> + v2 = malloc(va_length);
> + igt_assert(v2);
> +
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle;
> +
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + r = pread(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v1[i], va_start + i);
> +
> + for (i = 0; i < items; i++)
> + v1[i] = va_start + i + 1;
> +
> + r = pwrite(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + lseek(fd, va_start, SEEK_SET);
> + r = read(fd, v2, va_length);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v1[i], v2[i]);
> +
> + fsync(fd);
> +
> + close(fd);
> + free(v1);
> + free(v2);
> +}
> +
> +static void vm_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> +
I wanted to ask for a check that the event type is the one we expect,
but then I realized debugger_run_triggers() already does that check.
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> + eo->vm_bind_ref_seqno,
> + eo->addr,
> + eo->range);
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + debugger_test_vma(d, eb->client_handle, eb->vm_handle,
> + eo->addr, eo->range);
> + xe_eudebug_debugger_signal_stage(d, eo->addr);
> + }
> +}
> +
> +/**
> + * SUBTEST: basic-vm-access
> + * Description:
> + * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
> + * vm fd, concerning many different offsets inside the vm,
> + * and many virtual addresses of the vm_bound object.
> + *
> + * SUBTEST: basic-vm-access-userptr
> + * Description:
> + * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
> + * vm fd, concerning many different offsets inside the vm,
> + * and many virtual addresses of the vm_bound object, but backed
> + * by userptr.
> + */
> +static void test_vm_access(int fd, unsigned int flags, int num_clients)
> +{
> + struct drm_xe_engine_class_instance *hwe;
> +
> + xe_eudebug_for_each_engine(fd, hwe)
> + test_client_with_trigger(fd, flags, num_clients,
> + vm_access_client,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vm_trigger, hwe,
> + false,
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +}
> +
> +static void debugger_test_vma_parameters(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t vm_handle,
> + uint64_t va_start,
> + uint64_t va_length)
> +{
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint64_t *v;
> + uint64_t items = va_length / sizeof(uint64_t);
> + int fd;
> + int r, i;
> +
> + v = malloc(va_length);
> + igt_assert(v);
> +
> + /* Negative VM open - bad client handle */
> + vo.client_handle = client_handle + 123;
> + vo.vm_handle = vm_handle;
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert(fd < 0);
igt_assert_lt()/igt_assert_eq() would be more appropriate here
and below. BTW, nice comments; I've scrolled up to test_read_event()
to ask for the same there.
> +
> + /* Negative VM open - bad vm handle */
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle + 123;
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert(fd < 0);
> +
> + /* Positive VM open */
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle;
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + /* Negative pread - bad fd */
> + r = pread(fd + 123, v, va_length, va_start);
> + igt_assert(r < 0);
> +
> + /* Negative pread - bad va_start */
> + r = pread(fd, v, va_length, 0);
> + igt_assert(r < 0);
> +
> + /* Negative pread - bad va_start */
> + r = pread(fd, v, va_length, va_start - 1);
> + igt_assert(r < 0);
> +
> + /* Positive pread - zero va_length */
> + r = pread(fd, v, 0, va_start);
> + igt_assert_eq(r, 0);
> +
> + /* Negative pread - out of range */
> + r = pread(fd, v, va_length + 1, va_start);
> + igt_assert_eq(r, va_length);
> +
> + /* Negative pread - bad va_start */
> + r = pread(fd, v, 1, va_start + va_length);
> + igt_assert(r < 0);
> +
> + /* Positive pread - whole range */
> + r = pread(fd, v, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + /* Positive pread */
> + r = pread(fd, v, 1, va_start + va_length - 1);
> + igt_assert_eq(r, 1);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v[i], va_start + i);
> +
> + for (i = 0; i < items; i++)
> + v[i] = va_start + i + 1;
> +
> + /* Negative pwrite - bad fd */
> + r = pwrite(fd + 123, v, va_length, va_start);
> + igt_assert(r < 0);
> +
> + /* Negative pwrite - bad va_start */
> + r = pwrite(fd, v, va_length, -1);
> + igt_assert(r < 0);
> +
> + /* Negative pwrite - zero va_start */
> + r = pwrite(fd, v, va_length, 0);
> + igt_assert(r < 0);
> +
> + /* Negative pwrite - bad va_length */
> + r = pwrite(fd, v, va_length + 1, va_start);
> + igt_assert_eq(r, va_length);
> +
> + /* Positive pwrite - zero va_length */
> + r = pwrite(fd, v, 0, va_start);
> + igt_assert_eq(r, 0);
> +
> + /* Positive pwrite */
> + r = pwrite(fd, v, va_length, va_start);
> + igt_assert_eq(r, va_length);
> + fsync(fd);
> +
> + close(fd);
> + free(v);
> +}
> +
> +static void vm_trigger_access_parameters(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> + eo->vm_bind_ref_seqno,
> + eo->addr,
> + eo->range);
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + debugger_test_vma_parameters(d, eb->client_handle, eb->vm_handle, eo->addr,
> + eo->range);
> + xe_eudebug_debugger_signal_stage(d, eo->addr);
> + }
> +}
> +
> +/**
> + * SUBTEST: basic-vm-access-parameters
> + * Description:
> + * Check negative scenarios of VM_OPEN ioctl and pread/pwrite usage.
> + */
> +static void test_vm_access_parameters(int fd, unsigned int flags, int num_clients)
> +{
> + struct drm_xe_engine_class_instance *hwe;
> +
> + xe_eudebug_for_each_engine(fd, hwe)
> + test_client_with_trigger(fd, flags, num_clients,
> + vm_access_client,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vm_trigger_access_parameters, hwe,
> + false,
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +}
> +
> +#define PAGE_SIZE 4096
Repeated definition, use SZ_4K.
> +#define MDATA_SIZE (WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM * PAGE_SIZE)
> +static void metadata_access_client(struct xe_eudebug_client *c)
> +{
> + const uint64_t addr = 0x1a0000;
> + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
> + uint8_t *data;
> + size_t bo_size;
> + uint32_t bo, vm;
> + int fd, i;
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
Unnecessary xe_device_get() here.
> +
> + bo_size = xe_get_default_alignment(fd);
> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + bo = xe_bo_create(fd, vm, bo_size, system_memory(fd), 0);
> +
> + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, MDATA_SIZE);
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
> + bo_size, 0, NULL, 0, to_user_pointer(ext));
> +
> + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
> + xe_eudebug_client_wait_stage(c, i);
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
> +
> + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
> +
> + close(bo);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> +
> + xe_device_put(fd);
Unnecessary xe_device_put() here; cached data will be cleaned up when the process exits.
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void debugger_test_metadata(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t metadata_handle,
> + uint64_t type,
> + uint64_t len)
> +{
> + struct drm_xe_eudebug_read_metadata rm = {
> + .client_handle = client_handle,
> + .metadata_handle = metadata_handle,
> + .size = len,
> + };
> + uint8_t *data;
> + int i;
> +
> + data = malloc(len);
> + igt_assert(data);
> +
> + rm.ptr = to_user_pointer(data);
> +
> + igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
Do you plan to cover this ioctl more thoroughly? Only the optimistic scenario is tested here.
> +
> + /* syntetic check, test sets different size per metadata type */
> + igt_assert_eq((type + 1) * PAGE_SIZE, rm.size);
> +
> + for (i = 0; i < rm.size; i++)
> + igt_assert_eq(data[i], 0xff & (i + (i > PAGE_SIZE)));
I commented above, but seeing this I guess your intention is to have
each metadata type start at a different value. Currently the first page
starts with value 0x0, while the subsequent pages effectively all start
with 0x1. Instead of i > PAGE_SIZE I would use i / PAGE_SIZE (or
i >> 12); then each page would start at a unique value.
> +
> + free(data);
> +}
> +
> +static void metadata_read_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_metadata *em = (void *)e;
> +
> + /* syntetic check, test sets different size per metadata type */
> + igt_assert_eq((em->type + 1) * PAGE_SIZE, em->len);
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + debugger_test_metadata(d, em->client_handle, em->metadata_handle,
> + em->type, em->len);
> + xe_eudebug_debugger_signal_stage(d, em->type);
> + }
> +}
> +
> +static void metadata_read_on_vm_bind_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op_metadata *em = (void *)e;
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + /* For testing purpose client sets metadata_cookie = type */
> +
> + /*
> + * Metadata event has a reference to vm-bind-op event which has a reference
> + * to vm-bind event which contains proper client-handle.
> + */
> + eo = (struct drm_xe_eudebug_event_vm_bind_op *)
> + xe_eudebug_event_log_find_seqno(d->log, em->vm_bind_op_ref_seqno);
> + igt_assert(eo);
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + debugger_test_metadata(d,
> + eb->client_handle,
> + em->metadata_handle,
> + em->metadata_cookie,
> + MDATA_SIZE); /* max size */
> +
> + xe_eudebug_debugger_signal_stage(d, em->metadata_cookie);
> +}
> +
> +/**
> + * SUBTEST: read-metadata
> + * Description:
> + * Exercise DRM_XE_EUDEBUG_IOCTL_READ_METADATA and debug metadata create|destroy events.
> + */
> +static void test_metadata_read(int fd, unsigned int flags, int num_clients)
> +{
> + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
> + DRM_XE_EUDEBUG_EVENT_METADATA, metadata_read_trigger,
> + NULL, true, 0);
> +}
> +
> +/**
> + * SUBTEST: attach-debug-metadata
> + * Description:
> + * Read debug metadata when vm_bind has it attached.
> + */
> +static void test_metadata_attach(int fd, unsigned int flags, int num_clients)
> +{
> + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA,
> + metadata_read_on_vm_bind_trigger,
> + NULL, true, 0);
> +}
> +
> +#define STAGE_CLIENT_WAIT_ON_UFENCE_DONE 1337
> +
> +#define UFENCE_EVENT_COUNT_EXPECTED 4
> +#define UFENCE_EVENT_COUNT_MAX 100
> +
> +struct ufence_bind {
> + struct drm_xe_sync f;
> + uint64_t addr;
> + uint64_t range;
> + uint64_t value;
> + struct {
> + uint64_t vm_sync;
> + } *fence_data;
> +};
> +
> +static void client_wait_ufences(struct xe_eudebug_client *c,
> + int fd, struct ufence_bind *binds, int count)
> +{
> + const int64_t default_fence_timeout_ns = 500 * NSEC_PER_MSEC;
> + int64_t timeout_ns;
> + int err;
> +
> + /* Ensure that wait on unacked ufence times out */
> + for (int i = 0; i < count; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + timeout_ns = default_fence_timeout_ns;
> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> + 0, &timeout_ns);
> + igt_assert_eq(err, -ETIME);
> + igt_assert_neq(b->fence_data->vm_sync, b->f.timeline_value);
> + igt_debug("wait #%d blocked on ack\n", i);
> + }
> +
> + /* Wait on fence timed out, now tell the debugger to ack */
> + xe_eudebug_client_signal_stage(c, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
> +
> + /* Check that ack unblocks ufence */
> + for (int i = 0; i < count; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> + 0, &timeout_ns);
> + igt_assert_eq(err, 0);
> + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
> + igt_debug("wait #%d completed\n", i);
> + }
> +}
> +
> +static struct ufence_bind *create_binds_with_ufence(int fd, int count)
> +{
> + struct ufence_bind *binds;
> +
> + binds = calloc(count, sizeof(*binds));
> + igt_assert(binds);
> +
> + for (int i = 0; i < count; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + b->range = 0x1000;
> + b->addr = 0x100000 + b->range * i;
> + b->fence_data = aligned_alloc(xe_get_default_alignment(fd),
> + sizeof(*b->fence_data));
Where's fence_data freed?
> + igt_assert(b->fence_data);
> + memset(b->fence_data, 0, sizeof(*b->fence_data));
> +
> + b->f.type = DRM_XE_SYNC_TYPE_USER_FENCE;
> + b->f.flags = DRM_XE_SYNC_FLAG_SIGNAL;
> + b->f.addr = to_user_pointer(&b->fence_data->vm_sync);
> + b->f.timeline_value = UFENCE_EVENT_COUNT_EXPECTED + i;
> + }
> +
> + return binds;
> +}
> +
> +static void basic_ufence_client(struct xe_eudebug_client *c)
> +{
> + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
> + int fd = xe_eudebug_client_open_driver(c);
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = n * xe_get_default_alignment(fd);
> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, b->addr, b->range, 0,
> + &b->f, 1, 0);
> + }
> +
> + client_wait_ufences(c, fd, binds, n);
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
> + }
> +
> + free(binds);
> + gem_close(fd, bo);
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +struct ufence_priv {
> + struct drm_xe_eudebug_event_vm_bind_ufence ufence_events[UFENCE_EVENT_COUNT_MAX];
> + uint64_t ufence_event_seqno[UFENCE_EVENT_COUNT_MAX];
> + uint64_t ufence_event_vm_addr_start[UFENCE_EVENT_COUNT_MAX];
> + uint64_t ufence_event_vm_addr_range[UFENCE_EVENT_COUNT_MAX];
> + unsigned int ufence_event_count;
> + unsigned int vm_bind_op_count;
> + pthread_mutex_t mutex;
> +};
> +
> +static struct ufence_priv *ufence_priv_create(void)
> +{
> + struct ufence_priv *priv;
> +
> + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
> + PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> + igt_assert(priv);
> + memset(priv, 0, sizeof(*priv));
> + pthread_mutex_init(&priv->mutex, NULL);
I think you should ensure that the PTHREAD_PROCESS_SHARED attribute is
set for multiprocess usage (see pthread_mutexattr_setpshared()).
> +
> + return priv;
> +}
> +
> +static void ufence_priv_destroy(struct ufence_priv *priv)
> +{
> + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
> +}
> +
> +static void ack_fences(struct xe_eudebug_debugger *d)
> +{
> + struct ufence_priv *priv = d->ptr;
> +
> + for (int i = 0; i < priv->ufence_event_count; i++)
> + xe_eudebug_ack_ufence(d->fd, &priv->ufence_events[i]);
> +}
> +
> +static void basic_ufence_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> + struct ufence_priv *priv = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> + struct drm_xe_eudebug_event_vm_bind *eb;
> +
> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> + igt_debug("ufence event received: %s\n", event_str);
> +
> + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
> + "surplus ufence event received: %s\n", event_str);
> + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
> +
> + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> + ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> + "vm bind event does not have ufence: %s\n", event_str);
> + }
> +}
> +
> +static int wait_for_ufence_events(struct ufence_priv *priv, int timeout_ms)
> +{
> + int ret = -ETIMEDOUT;
> +
> + igt_for_milliseconds(timeout_ms) {
> + pthread_mutex_lock(&priv->mutex);
> + if (priv->ufence_event_count == UFENCE_EVENT_COUNT_EXPECTED)
> + ret = 0;
> + pthread_mutex_unlock(&priv->mutex);
> +
> + if (!ret)
> + break;
> + usleep(1000);
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * SUBTEST: basic-vm-bind-ufence
> + * Description:
> + * Give user fence in application and check if ufence ack works
> + */
> +static void test_basic_ufence(int fd, unsigned int flags)
> +{
> + struct xe_eudebug_debugger *d;
> + struct xe_eudebug_session *s;
> + struct xe_eudebug_client *c;
> + struct ufence_priv *priv;
> +
> + priv = ufence_priv_create();
> + s = xe_eudebug_session_create(fd, basic_ufence_client, flags, priv);
> + c = s->client;
> + d = s->debugger;
> +
> + xe_eudebug_debugger_add_trigger(d,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + basic_ufence_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
> + xe_eudebug_debugger_start_worker(d);
> + xe_eudebug_client_start(c);
> +
> + xe_eudebug_debugger_wait_stage(s, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
> + xe_eudebug_assert_f(d, wait_for_ufence_events(priv, XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * MSEC_PER_SEC) == 0,
> + "missing ufence events\n");
> + ack_fences(d);
> +
> + xe_eudebug_client_wait_done(c);
> + xe_eudebug_debugger_stop_worker(d, 1);
> +
> + xe_eudebug_event_log_print(d->log, true);
> + xe_eudebug_event_log_print(c->log, true);
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +
> + xe_eudebug_session_destroy(s);
> + ufence_priv_destroy(priv);
> +}
> +
> +struct vm_bind_clear_thread_priv {
> + struct drm_xe_engine_class_instance *hwe;
> + struct xe_eudebug_client *c;
> + pthread_t thread;
> + uint64_t region;
> + unsigned long sum;
> +};
> +
> +struct vm_bind_clear_priv {
> + unsigned long unbind_count;
> + unsigned long bind_count;
> + unsigned long sum;
> +};
> +
> +static struct vm_bind_clear_priv *vm_bind_clear_priv_create(void)
> +{
> + struct vm_bind_clear_priv *priv;
> +
> + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
> + PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> + igt_assert(priv);
> + memset(priv, 0, sizeof(*priv));
> +
> + return priv;
> +}
> +
> +static void vm_bind_clear_priv_destroy(struct vm_bind_clear_priv *priv)
> +{
> + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
> +}
> +
> +static void *vm_bind_clear_thread(void *data)
> +{
> + const uint32_t CS_GPR0 = 0x600;
> + const size_t batch_size = 16;
> + struct drm_xe_sync uf_sync = {
> + .type = DRM_XE_SYNC_TYPE_USER_FENCE, .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> + };
> + struct vm_bind_clear_thread_priv *priv = data;
> + int fd = xe_eudebug_client_open_driver(priv->c);
> + uint64_t gtt_size = 1ull << min_t(uint32_t, xe_va_bits(fd), 48);
> + uint32_t vm = xe_eudebug_client_vm_create(priv->c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_bb_size(fd, batch_size);
> + unsigned long count = 0;
> + uint64_t *fence_data;
> +
> + /* init uf_sync */
> + fence_data = aligned_alloc(xe_get_default_alignment(fd), sizeof(*fence_data));
> + igt_assert(fence_data);
> + uf_sync.timeline_value = 1337;
> + uf_sync.addr = to_user_pointer(fence_data);
> +
> + igt_debug("Run on: %s%u\n", xe_engine_class_string(priv->hwe->engine_class),
> + priv->hwe->engine_instance);
> +
> + igt_until_timeout(5) {
> + struct drm_xe_ext_set_property eq_ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + struct drm_xe_exec_queue_create eq_create = { 0 };
> + uint32_t clean_bo = 0;
> + uint32_t batch_bo = 0;
> + uint64_t clean_offset, batch_offset;
> + uint32_t exec_queue;
> + uint32_t *map, *cs;
> + uint64_t delta;
> +
> + /* calculate offsets (vma addresses) */
> + batch_offset = (random() * SZ_2M) & (gtt_size - 1);
> + /* XXX: for some platforms/memory regions batch offset '0' can be problematic */
> + if (batch_offset == 0)
> + batch_offset = SZ_2M;
> +
> + do {
> + clean_offset = (random() * SZ_2M) & (gtt_size - 1);
> + if (clean_offset == 0)
> + clean_offset = SZ_2M;
> + } while (clean_offset == batch_offset);
> +
> + batch_offset += random() % SZ_2M & -bo_size;
> + clean_offset += random() % SZ_2M & -bo_size;
> +
> + delta = (random() % bo_size) & -4;
> +
> + /* prepare clean bo */
> + clean_bo = xe_bo_create(fd, vm, bo_size, priv->region,
> + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> + memset(fence_data, 0, sizeof(*fence_data));
> + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, clean_bo, 0, clean_offset, bo_size,
> + 0, &uf_sync, 1, 0);
> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + /* prepare batch bo */
> + batch_bo = xe_bo_create(fd, vm, bo_size, priv->region,
> + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> + memset(fence_data, 0, sizeof(*fence_data));
> + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, batch_bo, 0, batch_offset, bo_size,
> + 0, &uf_sync, 1, 0);
> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + map = xe_bo_map(fd, batch_bo, bo_size);
> +
> + cs = map;
> + *cs++ = MI_NOOP | 0xc5a3;
> + *cs++ = MI_LOAD_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
> + *cs++ = CS_GPR0;
> + *cs++ = clean_offset + delta;
> + *cs++ = (clean_offset + delta) >> 32;
> + *cs++ = MI_STORE_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
> + *cs++ = CS_GPR0;
> + *cs++ = batch_offset;
> + *cs++ = batch_offset >> 32;
> + *cs++ = MI_BATCH_BUFFER_END;
> +
> + /* execute batch */
> + eq_create.width = 1;
> + eq_create.num_placements = 1;
> + eq_create.vm_id = vm;
> + eq_create.instances = to_user_pointer(priv->hwe);
> + eq_create.extensions = to_user_pointer(&eq_ext);
> + exec_queue = xe_eudebug_client_exec_queue_create(priv->c, fd, &eq_create);
> +
> + memset(fence_data, 0, sizeof(*fence_data));
> + xe_exec_sync(fd, exec_queue, batch_offset, &uf_sync, 1);
> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + igt_assert_eq(*map, 0);
> +
> + /* cleanup */
> + xe_eudebug_client_exec_queue_destroy(priv->c, fd, &eq_create);
> + munmap(map, bo_size);
> +
> + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, batch_offset, bo_size);
> + gem_close(fd, batch_bo);
> +
> + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, clean_offset, bo_size);
> + gem_close(fd, clean_bo);
> +
> + count++;
> + }
> +
> + priv->sum = count;
> +
> + free(fence_data);
> + xe_eudebug_client_close_driver(priv->c, fd);
> + return NULL;
> +}
> +
> +static void vm_bind_clear_client(struct xe_eudebug_client *c)
> +{
> + int fd = xe_eudebug_client_open_driver(c);
> + struct xe_device *xe_dev = xe_device_get(fd);
> + int count = xe_number_engines(fd) * xe_dev->mem_regions->num_mem_regions;
> + uint64_t memreg = all_memory_regions(fd);
> + struct vm_bind_clear_priv *priv = c->ptr;
> + int current = 0;
> + struct drm_xe_engine_class_instance *engine;
> + struct vm_bind_clear_thread_priv *threads;
> + uint64_t region;
> +
> + threads = calloc(count, sizeof(*threads));
> + igt_assert(threads);
> + priv->sum = 0;
> +
> + xe_for_each_mem_region(fd, memreg, region) {
> + xe_eudebug_for_each_engine(fd, engine) {
> + threads[current].c = c;
> + threads[current].hwe = engine;
> + threads[current].region = region;
> +
> + pthread_create(&threads[current].thread, NULL,
> + vm_bind_clear_thread, &threads[current]);
> + current++;
> + }
> + }
> +
> + for (current = 0; current < count; current++)
> + pthread_join(threads[current].thread, NULL);
> +
> + xe_for_each_mem_region(fd, memreg, region) {
> + unsigned long sum = 0;
> +
> + for (current = 0; current < count; current++)
> + if (threads[current].region == region)
> + sum += threads[current].sum;
> +
> + igt_info("%s sampled %lu objects\n", xe_region_name(region), sum);
> + priv->sum += sum;
> + }
> +
> + free(threads);
> + xe_device_put(fd);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> + struct vm_bind_clear_priv *priv = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + if (random() & 1) {
What's that random() doing here?
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint32_t v = 0xc1c1c1c1;
> +
> + struct drm_xe_eudebug_event_vm_bind *eb;
> + int fd, delta, r;
> +
> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> + eo->vm_bind_ref_seqno, eo->addr, eo->range);
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> + igt_assert(eb);
> +
> + vo.client_handle = eb->client_handle;
> + vo.vm_handle = eb->vm_handle;
> +
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + delta = (random() % eo->range) & -4;
> + r = pread(fd, &v, sizeof(v), eo->addr + delta);
> + igt_assert_eq(r, sizeof(v));
> + igt_assert_eq_u32(v, 0);
> +
> + close(fd);
> + }
> + priv->bind_count++;
> + }
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> + priv->unbind_count++;
> +}
> +
> +static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> +
> + xe_eudebug_ack_ufence(d->fd, ef);
> +}
> +
> +/**
> + * SUBTEST: vm-bind-clear
> + * Description:
> + * Check that fresh buffers we vm_bind into the ppGTT are always clear.
> + */
> +static void test_vm_bind_clear(int fd)
> +{
> + struct vm_bind_clear_priv *priv;
> + struct xe_eudebug_session *s;
> +
> + priv = vm_bind_clear_priv_create();
> + s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vm_bind_clear_test_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + vm_bind_clear_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + xe_eudebug_client_wait_done(s->client);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + igt_assert_eq(priv->bind_count, priv->unbind_count);
> + igt_assert_eq(priv->sum * 2, priv->bind_count);
> +
> + xe_eudebug_session_destroy(s);
> + vm_bind_clear_priv_destroy(priv);
> +}
> +
> +#define UFENCE_CLIENT_VM_TEST_VAL_START 0xaaaaaaaa
> +#define UFENCE_CLIENT_VM_TEST_VAL_END 0xbbbbbbbb
> +
> +static void vma_ufence_client(struct xe_eudebug_client *c)
> +{
> + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
> + int fd = xe_eudebug_client_open_driver(c);
> + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + size_t bo_size = xe_get_default_alignment(fd);
> + uint64_t items = bo_size / sizeof(uint32_t);
> + uint32_t bo[UFENCE_EVENT_COUNT_EXPECTED];
> + uint32_t *ptr[UFENCE_EVENT_COUNT_EXPECTED];
> +
> + for (int i = 0; i < n; i++) {
> + bo[i] = xe_bo_create(fd, 0, bo_size,
> + system_memory(fd), 0);
> + ptr[i] = xe_bo_map(fd, bo[i], bo_size);
> + igt_assert(ptr[i]);
> + memset(ptr[i], UFENCE_CLIENT_VM_TEST_VAL_START, bo_size);
> + }
> +
> + for (int i = 0; i < n; i++)
> + for (int j = 0; j < items; j++)
> + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_START);
What is this loop for? ptr was filled in the previous loop. Is it
possible this data does not persist?
I didn't spot other issues during this reading.
--
Zbigniew
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo[i], 0, b->addr, b->range, 0,
> + &b->f, 1, 0);
> + }
> +
> + /* Wait for acks on ufences */
> + for (int i = 0; i < n; i++) {
> + int err;
> + int64_t timeout_ns;
> + struct ufence_bind *b = &binds[i];
> +
> + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
> + 0, &timeout_ns);
> + igt_assert_eq(err, 0);
> + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
> + igt_debug("wait #%d completed\n", i);
> +
> + for (int j = 0; j < items; j++)
> + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_END);
> + }
> +
> + for (int i = 0; i < n; i++) {
> + struct ufence_bind *b = &binds[i];
> +
> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
> + }
> +
> + free(binds);
> +
> + for (int i = 0; i < n; i++) {
> + munmap(ptr[i], bo_size);
> + gem_close(fd, bo[i]);
> + }
> +
> + xe_eudebug_client_vm_destroy(c, fd, vm);
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static void debugger_test_vma_ufence(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t vm_handle,
> + uint64_t va_start,
> + uint64_t va_length)
> +{
> + struct drm_xe_eudebug_vm_open vo = { 0, };
> + uint32_t *v1, *v2;
> + uint32_t items = va_length / sizeof(uint32_t);
> + int fd;
> + int r, i;
> +
> + v1 = malloc(va_length);
> + igt_assert(v1);
> + v2 = malloc(va_length);
> + igt_assert(v2);
> +
> + vo.client_handle = client_handle;
> + vo.vm_handle = vm_handle;
> +
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + r = pread(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq(v1[i], UFENCE_CLIENT_VM_TEST_VAL_START);
> +
> + memset(v1, UFENCE_CLIENT_VM_TEST_VAL_END, va_length);
> +
> + r = pwrite(fd, v1, va_length, va_start);
> + igt_assert_eq(r, va_length);
> +
> + lseek(fd, va_start, SEEK_SET);
> + r = read(fd, v2, va_length);
> + igt_assert_eq(r, va_length);
> +
> + for (i = 0; i < items; i++)
> + igt_assert_eq_u64(v1[i], v2[i]);
> +
> + fsync(fd);
> +
> + close(fd);
> + free(v1);
> + free(v2);
> +}
> +
> +static void vma_ufence_op_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> + struct ufence_priv *priv = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> + struct drm_xe_eudebug_event_vm_bind *eb;
> + unsigned int op_count = priv->vm_bind_op_count++;
> +
> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> + igt_debug("vm bind op event: ref %lld, addr 0x%llx, range 0x%llx, op_count %u\n",
> + eo->vm_bind_ref_seqno,
> + eo->addr,
> + eo->range,
> + op_count);
> + igt_debug("vm bind op event received: %s\n", event_str);
> + xe_eudebug_assert(d, eo->vm_bind_ref_seqno);
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> +
> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> + eo->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> + "vm bind event does not have ufence: %s\n", event_str);
> +
> + priv->ufence_event_seqno[op_count] = eo->vm_bind_ref_seqno;
> + priv->ufence_event_vm_addr_start[op_count] = eo->addr;
> + priv->ufence_event_vm_addr_range[op_count] = eo->range;
> + }
> +}
> +
> +static void vma_ufence_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> + struct ufence_priv *priv = d->ptr;
> + unsigned int ufence_count = priv->ufence_event_count;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
> + struct drm_xe_eudebug_event_vm_bind *eb;
> + uint64_t addr = priv->ufence_event_vm_addr_start[ufence_count];
> + uint64_t range = priv->ufence_event_vm_addr_range[ufence_count];
> +
> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
> + igt_debug("ufence event received: %s\n", event_str);
> +
> + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
> + "surplus ufence event received: %s\n", event_str);
> + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
> +
> + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
> +
> + eb = (struct drm_xe_eudebug_event_vm_bind *)
> + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
> + ef->vm_bind_ref_seqno);
> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
> + "vm bind event does not have ufence: %s\n", event_str);
> + igt_debug("vm bind ufence event received with ref %lld, addr 0x%lx, range 0x%lx\n",
> + ef->vm_bind_ref_seqno,
> + addr,
> + range);
> + debugger_test_vma_ufence(d, eb->client_handle, eb->vm_handle,
> + addr, range);
> +
> + xe_eudebug_ack_ufence(d->fd, ef);
> + }
> +}
> +
> +/**
> + * SUBTEST: vma-ufence
> + * Description:
> + * Intercept vm bind after receiving ufence event, then access target vm and write to it.
> + * Then check on client side if the write was successful.
> + */
> +static void test_vma_ufence(int fd, unsigned int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct ufence_priv *priv;
> +
> + priv = ufence_priv_create();
> + s = xe_eudebug_session_create(fd, vma_ufence_client, flags, priv);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> + vma_ufence_op_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger,
> + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + vma_ufence_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + xe_eudebug_client_wait_done(s->client);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + xe_eudebug_event_log_print(s->debugger->log, true);
> + xe_eudebug_event_log_print(s->client->log, true);
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +
> + xe_eudebug_session_destroy(s);
> + ufence_priv_destroy(priv);
> +}
> +
> +igt_main
> +{
> + bool was_enabled;
> + bool *multigpu_was_enabled;
> + int fd, gpu_count;
> +
> + igt_fixture {
> + fd = drm_open_driver(DRIVER_XE);
> + was_enabled = xe_eudebug_enable(fd, true);
> + }
> +
> + igt_subtest("sysfs-toggle")
> + test_sysfs_toggle(fd);
> +
> + igt_subtest("basic-connect")
> + test_connect(fd);
> +
> + igt_subtest("connect-user")
> + test_connect_user(fd);
> +
> + igt_subtest("basic-close")
> + test_close(fd);
> +
> + igt_subtest("basic-read-event")
> + test_read_event(fd);
> +
> + igt_subtest("basic-client")
> + test_basic_sessions(fd, 0, 1, true);
> +
> + igt_subtest("basic-client-th")
> + test_basic_sessions_th(fd, 0, 1, true);
> +
> + igt_subtest("basic-vm-access")
> + test_vm_access(fd, 0, 1);
> +
> + igt_subtest("basic-vm-access-userptr")
> + test_vm_access(fd, VM_BIND_OP_MAP_USERPTR, 1);
> +
> + igt_subtest("basic-vm-access-parameters")
> + test_vm_access_parameters(fd, 0, 1);
> +
> + igt_subtest("multiple-sessions")
> + test_basic_sessions(fd, CREATE_VMS | CREATE_EXEC_QUEUES, 4, true);
> +
> + igt_subtest("basic-vms")
> + test_basic_sessions(fd, CREATE_VMS, 1, true);
> +
> + igt_subtest("basic-exec-queues")
> + test_basic_sessions(fd, CREATE_EXEC_QUEUES, 1, true);
> +
> + igt_subtest("basic-vm-bind")
> + test_basic_sessions(fd, VM_BIND, 1, true);
> +
> + igt_subtest("basic-vm-bind-ufence")
> + test_basic_ufence(fd, 0);
> +
> + igt_subtest("vma-ufence")
> + test_vma_ufence(fd, 0);
> +
> + igt_subtest("vm-bind-clear")
> + test_vm_bind_clear(fd);
> +
> + igt_subtest("basic-vm-bind-discovery")
> + test_basic_discovery(fd, VM_BIND, true);
> +
> + igt_subtest("basic-vm-bind-metadata-discovery")
> + test_basic_discovery(fd, VM_BIND_METADATA, true);
> +
> + igt_subtest("basic-vm-bind-vm-destroy")
> + test_basic_sessions(fd, VM_BIND_VM_DESTROY, 1, false);
> +
> + igt_subtest("basic-vm-bind-vm-destroy-discovery")
> + test_basic_discovery(fd, VM_BIND_VM_DESTROY, false);
> +
> + igt_subtest("basic-vm-bind-extended")
> + test_basic_sessions(fd, VM_BIND_EXTENDED, 1, true);
> +
> + igt_subtest("basic-vm-bind-extended-discovery")
> + test_basic_discovery(fd, VM_BIND_EXTENDED, true);
> +
> + igt_subtest("read-metadata")
> + test_metadata_read(fd, 0, 1);
> +
> + igt_subtest("attach-debug-metadata")
> + test_metadata_attach(fd, 0, 1);
> +
> + igt_subtest("discovery-race")
> + test_race_discovery(fd, 0, 4);
> +
> + igt_subtest("discovery-race-vmbind")
> + test_race_discovery(fd, DISCOVERY_VM_BIND, 4);
> +
> + igt_subtest("discovery-empty")
> + test_empty_discovery(fd, DISCOVERY_CLOSE_CLIENT, 16);
> +
> + igt_subtest("discovery-empty-clients")
> + test_empty_discovery(fd, DISCOVERY_DESTROY_RESOURCES, 16);
> +
> + igt_fixture {
> + xe_eudebug_enable(fd, was_enabled);
> + drm_close_driver(fd);
> + }
> +
> + igt_subtest_group {
> + igt_fixture {
> + gpu_count = drm_prepare_filtered_multigpu(DRIVER_XE);
> + igt_require(gpu_count >= 2);
> +
> + multigpu_was_enabled = malloc(gpu_count * sizeof(bool));
> + igt_assert(multigpu_was_enabled);
> + for (int i = 0; i < gpu_count; i++) {
> + fd = drm_open_filtered_card(i);
> + multigpu_was_enabled[i] = xe_eudebug_enable(fd, true);
> + close(fd);
> + }
> + }
> +
> + igt_subtest("multigpu-basic-client") {
> + igt_multi_fork(child, gpu_count) {
> + fd = drm_open_filtered_card(child);
> + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
> + child, errno);
> + igt_assert(is_xe_device(fd));
> +
> + test_basic_sessions(fd, 0, 1, true);
> + close(fd);
> + }
> + igt_waitchildren();
> + }
> +
> + igt_subtest("multigpu-basic-client-many") {
> + igt_multi_fork(child, gpu_count) {
> + fd = drm_open_filtered_card(child);
> + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
> + child, errno);
> + igt_assert(is_xe_device(fd));
> +
> + test_basic_sessions(fd, 0, 4, true);
> + close(fd);
> + }
> + igt_waitchildren();
> + }
> +
> + igt_fixture {
> + for (int i = 0; i < gpu_count; i++) {
> + fd = drm_open_filtered_card(i);
> + xe_eudebug_enable(fd, multigpu_was_enabled[i]);
> + close(fd);
> + }
> + free(multigpu_was_enabled);
> + }
> + }
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 00556c9d6..0f996fdc8 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -318,6 +318,14 @@ intel_xe_progs = [
> 'xe_sysfs_scheduler',
> ]
>
> +intel_xe_eudebug_progs = [
> + 'xe_eudebug',
> +]
> +
> +if build_xe_eudebug
> + intel_xe_progs += intel_xe_eudebug_progs
> +endif
> +
> chamelium_progs = [
> 'kms_chamelium_audio',
> 'kms_chamelium_color',
> --
> 2.34.1
>
* Re: [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-12 8:04 ` Zbigniew Kempczyński
@ 2024-09-17 14:44 ` Manszewski, Christoph
2024-09-17 16:00 ` Manszewski, Christoph
1 sibling, 0 replies; 50+ messages in thread
From: Manszewski, Christoph @ 2024-09-17 14:44 UTC (permalink / raw)
To: Zbigniew Kempczyński
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala,
Jonathan Cavitt
Hi Zbigniew,
On 12.09.2024 10:04, Zbigniew Kempczyński wrote:
> On Thu, Sep 05, 2024 at 11:28:08AM +0200, Christoph Manszewski wrote:
>> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>>
>> For typical debugging under gdb one can specify two main usecases:
>> accessing and manupulating resources created by the application and
>> manipulating thread execution (interrupting and setting breakpoints).
>>
>> This test adds coverage for the former by checking that:
>> - the debugger reports the expected events for Xe resources created
>> by the debugged client,
>> - the debugger is able to read and write the vm of the debugged client.
>
> Hi all.
>
> First of all, on Mika's series (v2) sent upstream on the xe ml I've noticed
> some tests are crashing the kernel. From this test's perspective this is
> good; it seems the test is doing what it should do. I observe a reboot on
> the vm access related subtests: basic-vm-access(-userptr).
>
>>
>> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
>> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
>> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
>> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>> ---
>> docs/testplan/meson.build | 13 +-
>> meson_options.txt | 2 +-
>> tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
>> tests/meson.build | 8 +
>> 4 files changed, 2737 insertions(+), 2 deletions(-)
>> create mode 100644 tests/intel/xe_eudebug.c
>>
>> diff --git a/docs/testplan/meson.build b/docs/testplan/meson.build
>> index 5560347f1..e86af028e 100644
>> --- a/docs/testplan/meson.build
>> +++ b/docs/testplan/meson.build
>> @@ -33,11 +33,22 @@ else
>> doc_dependencies = []
>> endif
>>
>> +xe_excluded_tests = []
>> +if not build_xe_eudebug
>> + foreach test : intel_xe_eudebug_progs
>> + xe_excluded_tests += meson.current_source_dir() + '/../../tests/intel/' + test + '.c'
>> + endforeach
>> +endif
>> +
>> +if xe_excluded_tests.length() > 0
>> + xe_excluded_tests = ['--exclude-files'] + xe_excluded_tests
>> +endif
>> +
>> if build_xe
>> test_dict = {
>> 'i915_tests': { 'input': i915_test_config, 'extra_args': check_testlist },
>> 'kms_tests': { 'input': kms_test_config, 'extra_args': kms_check_testlist },
>> - 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist }
>> + 'xe_tests': { 'input': xe_test_config, 'extra_args': check_testlist + xe_excluded_tests }
>> }
>> else
>> test_dict = {
>> diff --git a/meson_options.txt b/meson_options.txt
>> index 11922523b..c410f9b77 100644
>> --- a/meson_options.txt
>> +++ b/meson_options.txt
>> @@ -45,7 +45,7 @@ option('xe_driver',
>> option('xe_eudebug',
>> type : 'feature',
>> value : 'disabled',
>> - description : 'Build library for Xe EU debugger')
>> + description : 'Build library and tests for Xe EU debugger')
>>
>> option('libdrm_drivers',
>> type : 'array',
>> diff --git a/tests/intel/xe_eudebug.c b/tests/intel/xe_eudebug.c
>> new file mode 100644
>> index 000000000..fd2894a5e
>> --- /dev/null
>> +++ b/tests/intel/xe_eudebug.c
>> @@ -0,0 +1,2716 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2023 Intel Corporation
>> + */
>> +
>> +/**
>> + * TEST: Test EU Debugger functionality
>> + * Category: Core
>> + * Mega feature: EUdebug
>> + * Sub-category: EUdebug tests
>> + * Functionality: eu debugger framework
>> + * Test category: functionality test
>> + */
>> +
>> +#include <grp.h>
>> +#include <poll.h>
>> +#include <pthread.h>
>> +#include <pwd.h>
>> +#include <sys/ioctl.h>
>> +#include <sys/prctl.h>
>> +
>> +#include "igt.h"
>> +#include "intel_pat.h"
>> +#include "lib/igt_syncobj.h"
>> +#include "xe/xe_eudebug.h"
>> +#include "xe/xe_ioctl.h"
>> +#include "xe/xe_query.h"
>> +
>> +/**
>> + * SUBTEST: sysfs-toggle
>> + * Description:
>> + * Exercise the debugger enable/disable sysfs toggle logic
>> + */
>> +static void test_sysfs_toggle(int fd)
>> +{
>> + xe_eudebug_enable(fd, false);
>> + igt_assert(!xe_eudebug_debugger_available(fd));
>> +
>> + xe_eudebug_enable(fd, true);
>> + igt_assert(xe_eudebug_debugger_available(fd));
>> + xe_eudebug_enable(fd, true);
>> + igt_assert(xe_eudebug_debugger_available(fd));
>> +
>> + xe_eudebug_enable(fd, false);
>> + igt_assert(!xe_eudebug_debugger_available(fd));
>> + xe_eudebug_enable(fd, false);
>> + igt_assert(!xe_eudebug_debugger_available(fd));
>> +
>> + xe_eudebug_enable(fd, true);
>> + igt_assert(xe_eudebug_debugger_available(fd));
>> +}
>> +
>> +#define STAGE_PRE_DEBUG_RESOURCES_DONE 1
>> +#define STAGE_DISCOVERY_DONE 2
>> +
>> +#define CREATE_VMS (1 << 0)
>> +#define CREATE_EXEC_QUEUES (1 << 1)
>> +#define VM_BIND (1 << 2)
>> +#define VM_BIND_VM_DESTROY (1 << 3)
>> +#define VM_BIND_EXTENDED (1 << 4)
>> +#define VM_METADATA (1 << 5)
>> +#define VM_BIND_METADATA (1 << 6)
>> +#define VM_BIND_OP_MAP_USERPTR (1 << 7)
>> +#define TEST_DISCOVERY (1 << 31)
>
> Please align value to same column.
Sure, will do. But just out of curiosity: is there an official
guideline for this? Personally I agree that aligning the values looks
cleaner, but at the same time it is harder for me (without visual line
highlighting) to see which value corresponds to a given name,
especially in cases with larger spacing.
>
>> +
>> +#define PAGE_SIZE 4096
>
> Use SZ_4K
Ok
>
>> +static struct drm_xe_vm_bind_op_ext_attach_debug *
>> +basic_vm_bind_metadata_ext_prepare(int fd, struct xe_eudebug_client *c,
>> + uint8_t **data, uint32_t data_size)
>> +{
>> + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
>> + int i;
>> +
>
> According to the code below, data_size should be >= PAGE_SIZE * XE_DEBUG_METADATA_NUM.
> Shouldn't there be an assert here?
Actually I think we could skip passing 'data_size' altogether. As you
noticed, the code below expects the size to be at least PAGE_SIZE *
XE_DEBUG_METADATA_NUM; at the same time, there is no point in using more
than that.
>
>> + *data = calloc(data_size, sizeof(*data));
This one is also wrong, as it should be 'sizeof(**data)'. That also
explains why the code below, which passes an insufficient data_size,
doesn't crash: we just allocate 8 times more memory than intended.
>> + igt_assert(*data);
>> +
>> + for (i = 0; i < data_size; i++)
>> + (*data)[i] = 0xff & (i + (i > PAGE_SIZE));
>
> Just a question: why are you adding 1 starting from the second page?
I guess the intention was to make each page different, but it seems this
didn't work out. I'll try to fix that.
>
>> +
>> + ext = calloc(WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM, sizeof(*ext));
>> + igt_assert(ext);
>> +
>> + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++) {
>> + ext[i].base.name = XE_VM_BIND_OP_EXTENSIONS_ATTACH_DEBUG;
>> + ext[i].metadata_id = xe_eudebug_client_metadata_create(c, fd, i,
>> + (i + 1) * PAGE_SIZE, *data);
>
> Is it intentional to use the same *data for all metadata with increasing
> size?
Yes, metadata entries can point to the same memory, and in
'debuger_test_metadata' below you can see that we expect a certain size
to ensure the size also propagates correctly to the debugger.
>
>> + ext[i].cookie = i;
>> +
>> + if (i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM - 1)
>> + ext[i].base.next_extension = to_user_pointer(&ext[i + 1]);
>> + }
>> + return ext;
>> +}
>> +
>> +static void basic_vm_bind_metadata_ext_del(int fd, struct xe_eudebug_client *c,
>> + struct drm_xe_vm_bind_op_ext_attach_debug *ext,
>> + uint8_t *data)
>> +{
>> + for (int i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
>> + xe_eudebug_client_metadata_destroy(c, fd, ext[i].metadata_id, i,
>> + (i + 1) * PAGE_SIZE);
>> + free(ext);
>> + free(data);
>> +}
>> +
>> +static void basic_vm_bind_client(int fd, struct xe_eudebug_client *c)
>> +{
>> + struct drm_xe_vm_bind_op_ext_attach_debug *ext = NULL;
>> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + size_t bo_size = xe_get_default_alignment(fd);
>> + bool test_discovery = c->flags & TEST_DISCOVERY;
>> + bool test_metadata = c->flags & VM_BIND_METADATA;
>> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
>> + system_memory(fd), 0);
>> + uint64_t addr = 0x1a0000;
>
> Move BO_ADDR from the bottom and use this define.
Ok
>
>> + uint8_t *data = NULL;
>> +
>> + if (test_metadata)
>> + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, PAGE_SIZE);
>
> According to the above, PAGE_SIZE looks too small in this case; see my
> comment above about *data. Looking at the code, MDATA_SIZE should
> likely be used here.
Yes, nice catch. It works, however, because we accidentally use the
pointer size instead of the element size in the target function. But I
will drop this param in the next revision.
>
>> +
>> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
>> + bo_size, 0, NULL, 0, to_user_pointer(ext));
>> +
>> + if (test_discovery) {
>> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
>> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
>> + }
>> +
>> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
>> +
>> + if (test_metadata)
>> + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
>> +
>> + gem_close(fd, bo);
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> +}
>> +
>> +static void basic_vm_bind_vm_destroy_client(int fd, struct xe_eudebug_client *c)
>> +{
>> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + size_t bo_size = xe_get_default_alignment(fd);
>> + bool test_discovery = c->flags & TEST_DISCOVERY;
>> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
>> + system_memory(fd), 0);
>> + uint64_t addr = 0x1a0000;
>> +
>> + if (test_discovery) {
>> + vm = xe_vm_create(fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> +
>> + xe_vm_bind_async(fd, vm, 0, bo, 0, addr, bo_size, NULL, 0);
>> +
>> + xe_vm_destroy(fd, vm);
>> +
>> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
>> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
>> + } else {
>> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + xe_eudebug_client_vm_bind(c, fd, vm, bo, 0, addr, bo_size);
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> + }
>> +
>> + gem_close(fd, bo);
>> +}
>> +
>> +#define BO_ADDR 0x1a0000
>> +#define BO_ITEMS 4096
>> +#define MIN_BO_SIZE (BO_ITEMS * sizeof(uint64_t))
>> +
>> +union buf_id {
>> + uint32_t fd;
>> + void *userptr;
>> +};
>> +
>> +struct bind_list {
>> + int fd;
>> + uint32_t vm;
>> + union buf_id *bo;
>> + struct drm_xe_vm_bind_op *bind_ops;
>> + unsigned int n;
>> +};
>> +
>> +static void *bo_get_ptr(int fd, struct drm_xe_vm_bind_op *o)
>> +{
>> + void *ptr;
>> +
>> + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
>> + ptr = xe_bo_map(fd, o->obj, o->range);
>> + else
>> + ptr = (void *)(uintptr_t)o->userptr;
>> +
>> + igt_assert(ptr);
>> +
>> + return ptr;
>> +}
>> +
>> +static void bo_put_ptr(int fd, struct drm_xe_vm_bind_op *o, void *ptr)
>> +{
>> + if (o->op != DRM_XE_VM_BIND_OP_MAP_USERPTR)
>> + munmap(ptr, o->range);
>> +}
>> +
>> +static void bo_prime(int fd, struct drm_xe_vm_bind_op *o)
>
> Why prime? Shouldn't this be bo_fill()?
It's not my choice, however it makes sense: one would call it fill if the
function performed some tested 'target' operation. But here we actually
want to test the ability of the *debugger* to read and write the bo, and
this function is called by the *client* to "prime" the bo before the
debugger tries to read and subsequently write it. See:
```
prime (verb)
to prepare someone or something for the next stage in a process
source: dictionary.cambridge.org
```
>
>> +{
>> + uint64_t *d;
>> + uint64_t i;
>> +
>> + d = bo_get_ptr(fd, o);
>> +
>> + for (i = 0; i < o->range / sizeof(*d); i++)
>> + d[i] = o->addr + i;
>> +
>> + bo_put_ptr(fd, o, d);
>> +}
>> +
>> +static void bo_check(int fd, struct drm_xe_vm_bind_op *o)
>> +{
>> + uint64_t *d;
>> + uint64_t i;
>> +
>> + d = bo_get_ptr(fd, o);
>> +
>> + for (i = 0; i < o->range / sizeof(*d); i++)
>> + igt_assert_eq(d[i], o->addr + i + 1);
>> +
>> + bo_put_ptr(fd, o, d);
>> +}
>> +
>> +static union buf_id *vm_create_objects(int fd, uint32_t bo_placement, uint32_t vm,
>> + unsigned int size, unsigned int n)
>> +{
>> + union buf_id *bo;
>> + unsigned int i;
>> +
>> + bo = calloc(n, sizeof(*bo));
>> + igt_assert(bo);
>> +
>> + for (i = 0; i < n; i++) {
>> + if (bo_placement) {
>> + bo[i].fd = xe_bo_create(fd, vm, size, bo_placement, 0);
>> + igt_assert(bo[i].fd);
>> + } else {
>> + bo[i].userptr = aligned_alloc(PAGE_SIZE, size);
>> + igt_assert(bo[i].userptr);
>> + }
>> + }
>> +
>> + return bo;
>> +}
>> +
>> +static struct bind_list *create_bind_list(int fd, uint32_t bo_placement,
>> + uint32_t vm, unsigned int n,
>> + unsigned int target_size)
>> +{
>> + unsigned int i = target_size ?: MIN_BO_SIZE;
>> + const unsigned int bo_size = max_t(bo_size, xe_get_default_alignment(fd), i);
>> + bool is_userptr = !bo_placement;
>> + struct bind_list *bl;
>> +
>> + bl = malloc(sizeof(*bl));
>> + bl->fd = fd;
>> + bl->vm = vm;
>> + bl->bo = vm_create_objects(fd, bo_placement, vm, bo_size, n);
>> + bl->n = n;
>> + bl->bind_ops = calloc(n, sizeof(*bl->bind_ops));
>> + igt_assert(bl->bind_ops);
>> +
>> + for (i = 0; i < n; i++) {
>> + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
>> +
>
> Most of the = 0 initializations may be skipped as you're callocing bind_ops.
Ok
>
>> + if (is_userptr) {
>> + o->obj = 0;
>> + o->userptr = (uintptr_t)bl->bo[i].userptr;
>> + o->op = DRM_XE_VM_BIND_OP_MAP_USERPTR;
>> + } else {
>> + o->obj = bl->bo[i].fd;
>> + o->obj_offset = 0;
>> + o->op = DRM_XE_VM_BIND_OP_MAP;
>> + }
>> +
>> + o->range = bo_size;
>> + o->addr = BO_ADDR + 2 * i * bo_size;
>> + o->flags = 0;
>> + o->pat_index = intel_get_pat_idx_wb(fd);
>> + o->prefetch_mem_region_instance = 0;
>> + o->reserved[0] = 0;
>> + o->reserved[1] = 0;
>> + }
>> +
>> + for (i = 0; i < bl->n; i++) {
>> + struct drm_xe_vm_bind_op *o = &bl->bind_ops[i];
>> +
>> + igt_debug("bo %d: addr 0x%llx, range 0x%llx\n", i, o->addr, o->range);
>> + bo_prime(fd, o);
>> + }
>> +
>> + return bl;
>> +}
>> +
>> +static void do_bind_list(struct xe_eudebug_client *c,
>> + struct bind_list *bl, bool sync)
>> +{
>> + struct drm_xe_sync uf_sync = {
>> + .type = DRM_XE_SYNC_TYPE_USER_FENCE,
>> + .flags = DRM_XE_SYNC_FLAG_SIGNAL,
>> + .timeline_value = 1337,
>> + };
>> + uint64_t ref_seqno = 0, op_ref_seqno = 0;
>> + uint64_t *fence_data;
>> + int i;
>> +
>> + if (sync) {
>> + fence_data = aligned_alloc(xe_get_default_alignment(bl->fd),
>> + sizeof(*fence_data));
>> + igt_assert(fence_data);
>> + uf_sync.addr = to_user_pointer(fence_data);
>> + memset(fence_data, 0, sizeof(*fence_data));
>> + }
>> +
>> + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, &uf_sync, sync ? 1 : 0);
>> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
>> + bl->fd, bl->vm, 0, bl->n, &ref_seqno);
>> + for (i = 0; i < bl->n; i++)
>> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
>> + ref_seqno,
>> + &op_ref_seqno,
>> + bl->bind_ops[i].addr,
>> + bl->bind_ops[i].range,
>> + 0);
>> +
>> + if (sync) {
>> + xe_wait_ufence(bl->fd, fence_data, uf_sync.timeline_value, 0,
>> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
>> + free(fence_data);
>> + }
>> +}
>> +
>> +static void free_bind_list(struct xe_eudebug_client *c, struct bind_list *bl)
>> +{
>> + unsigned int i;
>> +
>> + for (i = 0; i < bl->n; i++) {
>> + igt_debug("%d: checking 0x%llx (%lld)\n",
>> + i, bl->bind_ops[i].addr, bl->bind_ops[i].addr);
>> + bo_check(bl->fd, &bl->bind_ops[i]);
>> + if (bl->bind_ops[i].op == DRM_XE_VM_BIND_OP_MAP_USERPTR)
>> + free(bl->bo[i].userptr);
>> + xe_eudebug_client_vm_unbind(c, bl->fd, bl->vm, 0,
>> + bl->bind_ops[i].addr,
>> + bl->bind_ops[i].range);
>> + }
>> +
>> + free(bl->bind_ops);
>> + free(bl->bo);
>> + free(bl);
>> +}
>> +
>> +static void vm_bind_client(int fd, struct xe_eudebug_client *c)
>> +{
>> + uint64_t op_ref_seqno, ref_seqno;
>> + struct bind_list *bl;
>> + bool test_discovery = c->flags & TEST_DISCOVERY;
>> + size_t bo_size = 3 * xe_get_default_alignment(fd);
>> + uint32_t bo[2] = {
>> + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
>> + xe_bo_create(fd, 0, bo_size, system_memory(fd), 0),
>> + };
>> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + uint64_t addr[] = {0x2a0000, 0x3a0000};
>> + uint64_t rebind_bo_offset = 2 * bo_size / 3;
>> + uint64_t size = bo_size / 3;
>> + int i = 0;
>> +
>> + if (test_discovery) {
>> + xe_vm_bind_async(fd, vm, 0, bo[0], 0, addr[0], bo_size, NULL, 0);
>> +
>> + xe_vm_unbind_async(fd, vm, 0, 0, addr[0] + size, size, NULL, 0);
>> +
>> + xe_vm_bind_async(fd, vm, 0, bo[1], 0, addr[1], bo_size, NULL, 0);
>> +
>> + xe_vm_bind_async(fd, vm, 0, bo[1], rebind_bo_offset, addr[1], size, NULL, 0);
>> +
>> + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
>> + xe_vm_bind_array(bl->fd, bl->vm, 0, bl->bind_ops, bl->n, NULL, 0);
>> +
>> + xe_vm_unbind_all_async(fd, vm, 0, bo[0], NULL, 0);
>> +
>> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE,
>> + bl->fd, bl->vm, 0, bl->n + 2, &ref_seqno);
>> +
>> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
>> + &op_ref_seqno, addr[1], size, 0);
>> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
>> + &op_ref_seqno, addr[1] + size, size * 2, 0);
>> +
>> + for (i = 0; i < bl->n; i++)
>> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_CREATE,
>> + ref_seqno, &op_ref_seqno,
>> + bl->bind_ops[i].addr,
>> + bl->bind_ops[i].range, 0);
>> +
>> + xe_eudebug_client_signal_stage(c, STAGE_PRE_DEBUG_RESOURCES_DONE);
>> + xe_eudebug_client_wait_stage(c, STAGE_DISCOVERY_DONE);
>> + } else {
>> + xe_eudebug_client_vm_bind(c, fd, vm, bo[0], 0, addr[0], bo_size);
>> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr[0] + size, size);
>> +
>> + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], 0, addr[1], bo_size);
>> + xe_eudebug_client_vm_bind(c, fd, vm, bo[1], rebind_bo_offset, addr[1], size);
>> +
>> + bl = create_bind_list(fd, system_memory(fd), vm, 4, 0);
>> + do_bind_list(c, bl, false);
>> + }
>> +
>> + xe_vm_unbind_all_async(fd, vm, 0, bo[1], NULL, 0);
>> +
>> + xe_eudebug_client_vm_bind_event(c, DRM_XE_EUDEBUG_EVENT_STATE_CHANGE, fd, vm, 0,
>> + 1, &ref_seqno);
>> + xe_eudebug_client_vm_bind_op_event(c, DRM_XE_EUDEBUG_EVENT_DESTROY, ref_seqno,
>> + &op_ref_seqno, 0, 0, 0);
>> +
>> + gem_close(fd, bo[0]);
>> + gem_close(fd, bo[1]);
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> +}
>> +
>> +static void run_basic_client(struct xe_eudebug_client *c)
>
> For 'multiple-sessions' subtest - is run_basic_client() prepared for
> being executed in parallel? I mean on my setup these 4 children
> execute vm/exec queue creation one after another, not in parallel.
Well, that doesn't really depend on your setup - test_basic_sessions
works by running each client work function only after the previous one
has completed. However, it keeps the sessions alive across subsequent
work function runs, and as this is a basic test, the intention was to
ensure that events are correctly relayed when there are multiple
client<->debugger pairs.
>
>> +{
>> + int fd, i;
>> +
>> + fd = xe_eudebug_client_open_driver(c);
>> + xe_device_get(fd);
>
> xe_device_get() is not necessary here, as xe_eudebug_client_open_driver()
> calls drm_reopen_driver(), which for xe already does this for the new fd.
Sure, then I guess we should also use drm_close_driver() in
'xe_eudebug_client_close_driver' to perform the put. Will address that.
>
>> +
>> + if (c->flags & CREATE_VMS) {
>> + const uint32_t flags[] = {
>> + DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE | DRM_XE_VM_CREATE_FLAG_LR_MODE,
>> + DRM_XE_VM_CREATE_FLAG_LR_MODE,
>> + };
>> + uint32_t vms[ARRAY_SIZE(flags)];
>> +
>> + for (i = 0; i < ARRAY_SIZE(flags); i++)
>> + vms[i] = xe_eudebug_client_vm_create(c, fd, flags[i], 0);
>> +
>> + for (i--; i >= 0; i--)
>> + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
>> + }
>> +
>> + if (c->flags & CREATE_EXEC_QUEUES) {
>> + struct drm_xe_exec_queue_create *create;
>> + struct drm_xe_engine_class_instance *hwe;
>> + struct drm_xe_ext_set_property eq_ext = {
>> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
>> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
>> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
>> + };
>> + uint32_t vm;
>> +
>> + create = calloc(xe_number_engines(fd), sizeof(*create));
>> +
>> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> +
>> + i = 0;
>> + xe_eudebug_for_each_engine(fd, hwe) {
>> + create[i].instances = to_user_pointer(hwe);
>> + create[i].vm_id = vm;
>> + create[i].width = 1;
>> + create[i].num_placements = 1;
>> + create[i].extensions = to_user_pointer(&eq_ext);
>> + xe_eudebug_client_exec_queue_create(c, fd, &create[i++]);
>> + }
>> +
>> + while (--i >= 0)
>> + xe_eudebug_client_exec_queue_destroy(c, fd, &create[i]);
>> +
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> + }
>> +
>> + if (c->flags & VM_BIND || c->flags & VM_BIND_METADATA)
>> + basic_vm_bind_client(fd, c);
>> +
>> + if (c->flags & VM_BIND_EXTENDED)
>> + vm_bind_client(fd, c);
>> +
>> + if (c->flags & VM_BIND_VM_DESTROY)
>> + basic_vm_bind_vm_destroy_client(fd, c);
>> +
>> + xe_device_put(fd);
>> + xe_eudebug_client_close_driver(c, fd);
>> +}
>> +
>> +static int read_event(int debugfd, struct drm_xe_eudebug_event *event)
>> +{
>> + int ret;
>> +
>> + ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
>> + if (ret < 0)
>> + return -errno;
>> +
>> + return ret;
>> +}
>> +
>> +static int __read_event(int debugfd, struct drm_xe_eudebug_event *event)
>> +{
>> + int ret;
>> +
>> + ret = ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_READ_EVENT, event);
>> + if (ret < 0)
>> + return -errno;
>> +
>> + return ret;
>> +}
>> +
>> +static int poll_event(int fd, int timeout_ms)
>> +{
>> + int ret;
>> +
>> + struct pollfd p = {
>> + .fd = fd,
>> + .events = POLLIN,
>> + .revents = 0,
>> + };
>> +
>> + ret = poll(&p, 1, timeout_ms);
>> + if (ret == -1)
>> + return -errno;
>> +
>> + return ret == 1 && (p.revents & POLLIN);
>> +}
>> +
>> +static int __debug_connect(int fd, int *debugfd, struct drm_xe_eudebug_connect *param)
>> +{
>> + int ret = 0;
>> +
>> + *debugfd = igt_ioctl(fd, DRM_IOCTL_XE_EUDEBUG_CONNECT, param);
>> +
>> + if (*debugfd < 0) {
>> + ret = -errno;
>> + igt_assume(ret != 0);
>> + }
>> +
>> + errno = 0;
>> + return ret;
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-connect
>> + * Description:
>> + * Exercise XE_EUDEBUG_CONNECT ioctl with passing
>> + * valid and invalid params.
>> + */
>> +static void test_connect(int fd)
>> +{
>> + struct drm_xe_eudebug_connect param = {};
>> + int debugfd, ret;
>> + pid_t *pid;
>> +
>> + pid = mmap(NULL, sizeof(pid_t), PROT_WRITE,
>> + MAP_SHARED | MAP_ANON, -1, 0);
>> +
>> + /* get fresh unrelated pid */
>> + igt_fork(child, 1)
>> + *pid = getpid();
>> +
>> + igt_waitchildren();
>> + param.pid = *pid;
>> + munmap(pid, sizeof(pid_t));
>> +
>> + ret = __debug_connect(fd, &debugfd, ¶m);
>> + igt_assert(debugfd == -1);
>> + igt_assert_eq(ret, param.pid ? -ENOENT : -EINVAL);
>
> I've pointed out in the review of the kernel series that ENOENT should
> be used for file operations, so I think this should be changed.
Sure, that will change as soon as the kmd adopts this change.
>
>> +
>> + param.pid = 0;
>> + ret = __debug_connect(fd, &debugfd, ¶m);
>> + igt_assert(debugfd == -1);
>> + igt_assert_eq(ret, -EINVAL);
>> +
>> + param.pid = getpid();
>> + param.version = -1;
>> + ret = __debug_connect(fd, &debugfd, ¶m);
>> + igt_assert(debugfd == -1);
>> + igt_assert_eq(ret, -EINVAL);
>> +
>> + param.version = 0;
>> + param.flags = ~0;
>> + ret = __debug_connect(fd, &debugfd, ¶m);
>> + igt_assert(debugfd == -1);
>> + igt_assert_eq(ret, -EINVAL);
>> +
>> + param.flags = 0;
>> + param.extensions = ~0;
>> + ret = __debug_connect(fd, &debugfd, ¶m);
>> + igt_assert(debugfd == -1);
>> + igt_assert_eq(ret, -EINVAL);
>> +
>> + param.extensions = 0;
>> + ret = __debug_connect(fd, &debugfd, ¶m);
>> + igt_assert_neq(debugfd, -1);
>> + igt_assert_eq(ret, 0);
>> +
>> + close(debugfd);
>> +}
>> +
>> +static void switch_user(__uid_t uid, __gid_t gid)
>> +{
>> + struct group *gr;
>> + __gid_t gr_v;
>> +
>> + /* Users other than root need to belong to the video group */
>> + gr = getgrnam("video");
>> + igt_assert(gr);
>> +
>> + /* Drop all */
>> + igt_assert_eq(setgroups(1, &gr->gr_gid), 0);
>> + igt_assert_eq(setgid(gid), 0);
>> + igt_assert_eq(setuid(uid), 0);
>> +
>> + igt_assert_eq(getgroups(1, &gr_v), 1);
>> + igt_assert_eq(gr_v, gr->gr_gid);
>> + igt_assert_eq(getgid(), gid);
>> + igt_assert_eq(getuid(), uid);
>> +
>> + igt_assert_eq(prctl(PR_SET_DUMPABLE, 1L), 0);
>> +}
>> +
>> +/**
>> + * SUBTEST: connect-user
>> + * Description:
>> + * Verify unprivileged XE_EUDEBG_CONNECT ioctl.
>
> Typo.
Thanks!
>
>> + * Check:
>> + * - user debugger to user workload connection
>> + * - user debugger to other user workload connection
>> + * - user debugger to privileged workload connection
>> + */
>> +static void test_connect_user(int fd)
>> +{
>> + struct drm_xe_eudebug_connect param = {};
>> + struct passwd *pwd, *pwd2;
>> + const char *user1 = "lp";
>> + const char *user2 = "mail";
>> + int debugfd, ret, i;
>> + int p1[2], p2[2];
>> + __uid_t u1, u2;
>> + __gid_t g1, g2;
>> + int newfd;
>> + pid_t pid;
>> +
>> +#define NUM_USER_TESTS 4
>> +#define P_APP 0
>> +#define P_GDB 1
>> + struct conn_user {
>> + /* u[0] - process uid, u[1] - gdb uid */
>> + __uid_t u[P_GDB + 1];
>> + /* g[0] - process gid, g[1] - gdb gid */
>> + __gid_t g[P_GDB + 1];
>> + /* Expected fd from open */
>> + int ret;
>> + /* Skip this test case */
>> + int skip;
>> + const char *desc;
>> + } test[NUM_USER_TESTS] = {};
>> +
>> + igt_assert(!pipe(p1));
>> + igt_assert(!pipe(p2));
>> +
>> + pwd = getpwnam(user1);
>> + igt_require(pwd);
>> + u1 = pwd->pw_uid;
>> + g1 = pwd->pw_gid;
>> +
>> + /*
>> + * Keep a copy of the needed contents, as getpwnam()
>> + * returns a pointer to static memory which subsequent
>> + * calls will overwrite.
>> + * Note that getpwnam() returns NULL if it cannot find
>> + * the user in passwd.
>> + */
>> + setpwent();
>> + pwd2 = getpwnam(user2);
>> + if (pwd2) {
>> + u2 = pwd2->pw_uid;
>> + g2 = pwd2->pw_gid;
>> + }
>> +
>> + test[0].skip = !pwd;
>> + test[0].u[P_GDB] = u1;
>> + test[0].g[P_GDB] = g1;
>> + test[0].ret = -EACCES;
>> + test[0].desc = "User GDB to Root App";
>> +
>> + test[1].skip = !pwd;
>> + test[1].u[P_APP] = u1;
>> + test[1].g[P_APP] = g1;
>> + test[1].u[P_GDB] = u1;
>> + test[1].g[P_GDB] = g1;
>> + test[1].ret = 0;
>> + test[1].desc = "User GDB to User App";
>> +
>> + test[2].skip = !pwd;
>> + test[2].u[P_APP] = u1;
>> + test[2].g[P_APP] = g1;
>> + test[2].ret = 0;
>> + test[2].desc = "Root GDB to User App";
>> +
>> + test[3].skip = !pwd2;
>> + test[3].u[P_APP] = u1;
>> + test[3].g[P_APP] = g1;
>> + test[3].u[P_GDB] = u2;
>> + test[3].g[P_GDB] = g2;
>> + test[3].ret = -EACCES;
>> + test[3].desc = "User GDB to Other User App";
>> +
>> + if (!pwd2)
>> + igt_warn("User %s not available in the system. Skipping subtests: %s.\n",
>> + user2, test[3].desc);
>> +
>> + for (i = 0; i < NUM_USER_TESTS; i++) {
>> + if (test[i].skip) {
>> + igt_debug("Subtest %s skipped\n", test[i].desc);
>> + continue;
>> + }
>> + igt_debug("Executing connection: %s\n", test[i].desc);
>> + igt_fork(child, 2) {
>> + if (!child) {
>> + if (test[i].u[P_APP])
>> + switch_user(test[i].u[P_APP], test[i].g[P_APP]);
>> +
>> + pid = getpid();
>> + /* Signal the PID */
>> + igt_assert(write(p1[1], &pid, sizeof(pid)) == sizeof(pid));
>> + /* wait with exit */
>> + igt_assert(read(p2[0], &pid, sizeof(pid)) == sizeof(pid));
>> + } else {
>> + if (test[i].u[P_GDB])
>> + switch_user(test[i].u[P_GDB], test[i].g[P_GDB]);
>> +
>> + igt_assert(read(p1[0], &pid, sizeof(pid)) == sizeof(pid));
>> + param.pid = pid;
>> +
>> + newfd = drm_open_driver(DRIVER_XE);
>> + ret = __debug_connect(newfd, &debugfd, ¶m);
>> +
>> + /* Release the app first */
>> + igt_assert(write(p2[1], &pid, sizeof(pid)) == sizeof(pid));
>> +
>> + igt_assert_eq(ret, test[i].ret);
>> + if (!ret)
>> + close(debugfd);
>> + }
>> + }
>> + igt_waitchildren();
>> + }
>> + close(p1[0]);
>> + close(p1[1]);
>> + close(p2[0]);
>> + close(p2[1]);
>> +#undef NUM_USER_TESTS
>> +#undef P_APP
>> +#undef P_GDB
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-close
>> + * Description:
>> + * Test whether eudebug can be reattached after closure.
>> + */
>> +static void test_close(int fd)
>> +{
>> + struct drm_xe_eudebug_connect param = { 0, };
>> + int debug_fd1, debug_fd2;
>> + int fd2;
>> +
>> + param.pid = getpid();
>> +
>> + igt_assert_eq(__debug_connect(fd, &debug_fd1, ¶m), 0);
>> + igt_assert(debug_fd1 >= 0);
>> + igt_assert_eq(__debug_connect(fd, &debug_fd2, ¶m), -EBUSY);
>> + igt_assert_eq(debug_fd2, -1);
>> +
>> + close(debug_fd1);
>> + fd2 = drm_open_driver(DRIVER_XE);
>> +
>> + igt_assert_eq(__debug_connect(fd2, &debug_fd2, ¶m), 0);
>> + igt_assert(debug_fd2 >= 0);
>> + close(fd2);
>> + close(debug_fd2);
>> + close(debug_fd1);
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-read-event
>> + * Description:
>> + * Synchronously exercise eu debugger event polling and reading.
>> + */
>
> May I ask for commenting out similar to debugger_test_vma_parameters()?
Ok.
>
>> +#define MAX_EVENT_SIZE (32 * 1024)
>> +static void test_read_event(int fd)
>> +{
>> + struct drm_xe_eudebug_event *event;
>> + struct xe_eudebug_debugger *d;
>> + struct xe_eudebug_client *c;
>> +
>> + event = malloc(MAX_EVENT_SIZE);
>> + igt_assert(event);
>> + memset(event, 0, sizeof(*event));
>
> calloc?
Ok.
>
>> +
>> + c = xe_eudebug_client_create(fd, run_basic_client, 0, NULL);
>> + d = xe_eudebug_debugger_create(fd, 0, NULL);
>> +
>> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
>> + igt_assert_eq(poll_event(d->fd, 500), 0);
>> +
>> + event->len = 1;
>> + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
>> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
>> +
>> + event->len = MAX_EVENT_SIZE;
>> + event->type = DRM_XE_EUDEBUG_EVENT_NONE;
>> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
>> +
>> + xe_eudebug_client_start(c);
>
> run_basic_client() produces create/destroy client events, so:
>> +
>> + igt_assert_eq(poll_event(d->fd, 500), 1);
>> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
>> + igt_assert_eq(read_event(d->fd, event), 0);
>
> I would check is flags == CREATE at this point, then
Sure, a check for the event type could also be added here then. Will do.
>
>> +
>> + igt_assert_eq(poll_event(d->fd, 500), 1);
>> +
>> + event->flags = 0;
>> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
>> +
>> + event->len = 0;
>> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
>> + igt_assert_eq(0, event->len);
>> +
>> + event->len = sizeof(*event) - 1;
>> + igt_assert_eq(read_event(d->fd, event), -EINVAL);
>> +
>> + event->len = sizeof(*event);
>> + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
>> + igt_assert_lt(sizeof(*event), event->len);
>> +
>> + event->len = event->len - 1;
>> + igt_assert_eq(read_event(d->fd, event), -EMSGSIZE);
>> + /* event->len should now contain the exact len */
>> + igt_assert_eq(read_event(d->fd, event), 0);
>
> flags == DESTROY here.
Same here.
>
>> +
>> + fcntl(d->fd, F_SETFL, fcntl(d->fd, F_GETFL) | O_NONBLOCK);
>> + igt_assert(fcntl(d->fd, F_GETFL) & O_NONBLOCK);
>> +
>> + igt_assert_eq(poll_event(d->fd, 500), 0);
>> + event->len = MAX_EVENT_SIZE;
>> + event->flags = 0;
>> + event->type = DRM_XE_EUDEBUG_EVENT_READ;
>> + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
>> +
>> + xe_eudebug_client_wait_done(c);
>> + xe_eudebug_client_stop(c);
>> +
>> + igt_assert_eq(poll_event(d->fd, 500), 0);
>> + igt_assert_eq(__read_event(d->fd, event), -EAGAIN);
>> +
>> + xe_eudebug_debugger_destroy(d);
>> + xe_eudebug_client_destroy(c);
>> +
>> + free(event);
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-client
>> + * Description:
>> + * Attach the debugger to process which opens and closes xe drm client.
>> + *
>> + * SUBTEST: basic-client-th
>> + * Description:
>> + * Create client basic resources (vms) in multiple threads
>> + *
>> + * SUBTEST: multiple-sessions
>> + * Description:
>> + * Simultaneously attach many debuggers to many processes.
>> + * Each process opens and closes xe drm client and creates a few resources.
>> + *
>> + * SUBTEST: basic-%s
>> + * Description:
>> + * Attach the debugger to process which creates and destroys a few %arg[1].
>> + *
>> + * SUBTEST: basic-vm-bind
>> + * Description:
>> + * Attach the debugger to a process that performs synchronous vm bind
>> + * and vm unbind.
>> + *
>> + * SUBTEST: basic-vm-bind-vm-destroy
>> + * Description:
>> + * Attach the debugger to a process that performs vm bind, and destroys
>> + * the vm without unbinding. Make sure that we don't get unbind events.
>> + *
>> + * SUBTEST: basic-vm-bind-extended
>> + * Description:
>> + * Attach the debugger to a process that performs bind, bind array, rebind,
>> + * partial unbind, unbind and unbind all operations.
>> + *
>> + * SUBTEST: multigpu-basic-client
>> + * Description:
>> + * Attach the debugger to process which opens and closes xe drm client on all Xe devices.
>> + *
>> + * SUBTEST: multigpu-basic-client-many
>> + * Description:
>> + * Simultaneously attach many debuggers to many processes on all Xe devices.
>> + * Each process opens and closes xe drm client and creates a few resources.
>> + *
>> + * arg[1]:
>> + *
>> + * @vms: vms
>> + * @exec-queues: exec queues
>> + */
>> +
>> +static void test_basic_sessions(int fd, unsigned int flags, int count, bool match_opposite)
>> +{
>> + struct xe_eudebug_session **s;
>> + int i;
>> +
>> + s = calloc(count, sizeof(*s));
>> +
>> + igt_assert(s);
>> +
>> + for (i = 0; i < count; i++)
>> + s[i] = xe_eudebug_session_create(fd, run_basic_client, flags, NULL);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_session_run(s[i]);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_session_check(s[i], match_opposite, 0);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_session_destroy(s[i]);
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-vm-bind-discovery
>> + * Description:
>> + * Attach the debugger to a process that performs vm-bind before attaching
>> + * and check if the discovery process reports it.
>> + *
>> + * SUBTEST: basic-vm-bind-metadata-discovery
>> + * Description:
>> + * Attach the debugger to a process that performs vm-bind with metadata attached
>> + * before attaching and check if the discovery process reports it.
>> + *
>> + * SUBTEST: basic-vm-bind-vm-destroy-discovery
>> + * Description:
>> + * Attach the debugger to a process that performs vm bind, and destroys
>> + * the vm without unbinding before attaching. Make sure that we don't get
>> + * any bind/unbind and vm create/destroy events.
>> + *
>> + * SUBTEST: basic-vm-bind-extended-discovery
>> + * Description:
>> + * Attach the debugger to a process that performs bind, bind array, rebind,
>> + * partial unbind, and unbind all operations before attaching. Ensure that
>> + * we get only a single 'VM_BIND' event from the discovery worker.
>> + */
>> +static void test_basic_discovery(int fd, unsigned int flags, bool match_opposite)
>> +{
>> + struct xe_eudebug_debugger *d;
>> + struct xe_eudebug_session *s;
>> + struct xe_eudebug_client *c;
>> +
>> + s = xe_eudebug_session_create(fd, run_basic_client, flags | TEST_DISCOVERY, NULL);
>> +
>> + c = s->client;
>> + d = s->debugger;
>> +
>> + xe_eudebug_client_start(c);
>> + xe_eudebug_debugger_wait_stage(s, STAGE_PRE_DEBUG_RESOURCES_DONE);
>> +
>> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
>> + xe_eudebug_debugger_start_worker(d);
>> +
>> + /* give the worker time to do its job */
>> + sleep(2);
>
> Shouldn't debugger be informed via discovery completion event instead
> of using arbitrary timeout?
Well, there is no such event, and I guess it comes down to the fact that
there is no real use case for it besides testing. There is no reason for
the user to distinguish between discovery and non-discovery events, as
the resulting state will be the same. The actual event split and
sequence may differ, however, and here we are trying to ensure that they
look as expected for the discovery path.
>
>> + xe_eudebug_debugger_signal_stage(d, STAGE_DISCOVERY_DONE);
>> +
>> + xe_eudebug_client_wait_done(c);
>> +
>> + xe_eudebug_debugger_stop_worker(d, 1);
>> +
>> + xe_eudebug_event_log_print(d->log, true);
>> + xe_eudebug_event_log_print(c->log, true);
>> +
>> + xe_eudebug_session_check(s, match_opposite, 0);
>> + xe_eudebug_session_destroy(s);
>> +}
>> +
>> +#define RESOURCE_COUNT 16
>> +#define PRIMARY_THREAD (1 << 0)
>> +#define DISCOVERY_CLOSE_CLIENT (1 << 1)
>> +#define DISCOVERY_DESTROY_RESOURCES (1 << 2)
>> +#define DISCOVERY_VM_BIND (1 << 3)
>> +static void run_discovery_client(struct xe_eudebug_client *c)
>> +{
>> + struct drm_xe_engine_class_instance *hwe = NULL;
>> + int fd[RESOURCE_COUNT], i;
>> + bool skip_sleep = c->flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT);
>> + uint64_t addr = 0x1a0000;
>> +
>> + srand(getpid());
>> +
>> + for (i = 0; i < RESOURCE_COUNT; i++) {
>> + fd[i] = xe_eudebug_client_open_driver(c);
>> +
>> + if (!i) {
>> + bool found = false;
>> +
>> + xe_device_get(fd[0]);
>
> Unnecessary, drm_reopen_driver() calls this.
Yup, will clean all these cases up.
>
>> + xe_for_each_engine(fd[0], hwe) {
>> + if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE ||
>> + hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
>> + found = true;
>> + break;
>> + }
>> + }
>> + igt_assert(found);
>> + }
>> +
>> + /*
>> + * Give the debugger a break in the event stream after every
>> + * other client, which allows it to read discovery events and detach in quiet.
>> + */
>> + if (random() % 2 == 0 && !skip_sleep)
>> + sleep(1);
>> +
>> + for (int j = 0; j < RESOURCE_COUNT; j++) {
>> + uint32_t vm = xe_eudebug_client_vm_create(c, fd[i],
>> + DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + struct drm_xe_ext_set_property eq_ext = {
>> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
>> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
>> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
>> + };
>> + struct drm_xe_exec_queue_create create = {
>> + .width = 1,
>> + .num_placements = 1,
>> + .vm_id = vm,
>> + .instances = to_user_pointer(hwe),
>> + .extensions = to_user_pointer(&eq_ext),
>> + };
>> + const unsigned int bo_size = max_t(bo_size,
>> + xe_get_default_alignment(fd[i]),
>> + MIN_BO_SIZE);
>> + uint32_t bo = xe_bo_create(fd[i], 0, bo_size, system_memory(fd[i]), 0);
>> +
>> + xe_eudebug_client_exec_queue_create(c, fd[i], &create);
>> +
>> + if (c->flags & DISCOVERY_VM_BIND) {
>> + xe_eudebug_client_vm_bind(c, fd[i], vm, bo, 0, addr, bo_size);
>> + addr += 0x100000;
>
> Shouldn't this be addr += bo_size? 0x100000 is technically correct as
> default alignment may be 4K or 64K and MIN_BO_SIZE seems to be 32K
> but I would use bo_size here. Unless your intention is to have gaps
> in vm space.
Sure, I think we can change it to bo_size.
>
>> + }
>> +
>> + if (c->flags & DISCOVERY_DESTROY_RESOURCES) {
>> + xe_eudebug_client_exec_queue_destroy(c, fd[i], &create);
>> + xe_eudebug_client_vm_destroy(c, fd[i], create.vm_id);
>> + gem_close(fd[i], bo);
>> + }
>> + }
>> +
>> + if (c->flags & DISCOVERY_CLOSE_CLIENT)
>> + xe_eudebug_client_close_driver(c, fd[i]);
>> + }
>> + xe_device_put(fd[0]);
>
> run_discovery_client() is executed after fork, so freeing the single
> fd[0] device cached data is not necessary here. It would be necessary
> to get/put these data if fds were opened and closed, but that doesn't
> happen here. It took me a while to figure out why I see no fd leakage,
> until I realized all of them are closed on process exit.
As above, will be cleaned up.
>
>> +}
>> +
>> +/**
>> + * SUBTEST: discovery-%s
>> + * Description: Race discovery against %arg[1] and the debugger detach.
>> + *
>> + * arg[1]:
>> + *
>> + * @race: resource creation
>> + * @race-vmbind: vm-bind operations
>> + * @empty: resource destruction
>> + * @empty-clients: client closure
>> + */
>> +static void *discovery_race_thread(void *data)
>> +{
>> + struct {
>> + uint64_t client_handle;
>> + int vm_count;
>> + int exec_queue_count;
>> + int vm_bind_op_count;
>> + } clients[RESOURCE_COUNT];
>> + struct xe_eudebug_session *s = data;
>> + int expected = RESOURCE_COUNT * (1 + 2 * RESOURCE_COUNT);
>> + const int tries = 100;
>> + bool done = false;
>> + int ret = 0;
>> +
>> + for (int try = 0; try < tries && !done; try++) {
>> + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
>> +
>> + if (ret == -EBUSY) {
>> + usleep(100000);
>> + continue;
>> + }
>> +
>> + igt_assert_eq(ret, 0);
>> +
>> + if (random() % 2) {
>> + struct drm_xe_eudebug_event *e = NULL;
>> + int i = -1;
>> +
>> + xe_eudebug_debugger_start_worker(s->debugger);
>> + sleep(1);
>> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>> + igt_debug("Resources discovered: %lu\n", s->debugger->event_count);
>> +
>> + xe_eudebug_for_each_event(e, s->debugger->log) {
>> + if (e->type == DRM_XE_EUDEBUG_EVENT_OPEN) {
>> + struct drm_xe_eudebug_event_client *eo = (void *)e;
>> +
>> + if (i >= 0) {
>> + igt_assert_eq(clients[i].vm_count,
>> + RESOURCE_COUNT);
>> +
>> + igt_assert_eq(clients[i].exec_queue_count,
>> + RESOURCE_COUNT);
>> +
>> + if (s->client->flags & DISCOVERY_VM_BIND)
>> + igt_assert_eq(clients[i].vm_bind_op_count,
>> + RESOURCE_COUNT);
>> + }
>> +
>> + igt_assert(++i < RESOURCE_COUNT);
>> + clients[i].client_handle = eo->client_handle;
>> + clients[i].vm_count = 0;
>> + clients[i].exec_queue_count = 0;
>> + clients[i].vm_bind_op_count = 0;
>> + }
>> +
>> + if (e->type == DRM_XE_EUDEBUG_EVENT_VM)
>> + clients[i].vm_count++;
>> +
>> + if (e->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE)
>> + clients[i].exec_queue_count++;
>> +
>> + if (e->type == DRM_XE_EUDEBUG_EVENT_VM_BIND_OP)
>> + clients[i].vm_bind_op_count++;
>> + };
>> +
>> + igt_assert_lte(0, i);
>> +
>> + for (int j = 0; j < i; j++)
>> + for (int k = 0; k < i; k++) {
>> + if (k == j)
>> + continue;
>> +
>> + igt_assert_neq(clients[j].client_handle,
>> + clients[k].client_handle);
>> + }
>> +
>> + if (s->debugger->event_count >= expected)
>> + done = true;
>> + }
>> +
>> + xe_eudebug_debugger_detach(s->debugger);
>> + s->debugger->log->head = 0;
>> + s->debugger->event_count = 0;
>> + }
>> +
>> + /* Primary thread must read everything */
>> + if (s->flags & PRIMARY_THREAD) {
>> + while ((ret = xe_eudebug_debugger_attach(s->debugger, s->client)) == -EBUSY)
>> + usleep(100000);
>> +
>> + igt_assert_eq(ret, 0);
>> +
>> + xe_eudebug_debugger_start_worker(s->debugger);
>> + xe_eudebug_client_wait_done(s->client);
>> +
>> + if (READ_ONCE(s->debugger->event_count) != expected)
>> + sleep(5);
>> +
>> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>> + xe_eudebug_debugger_detach(s->debugger);
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static void test_race_discovery(int fd, unsigned int flags, int clients)
>> +{
>> + const int debuggers_per_client = 3;
>> + int count = clients * debuggers_per_client;
>> + struct xe_eudebug_session *sessions, *s;
>> + struct xe_eudebug_client *c;
>> + pthread_t *threads;
>> + int i, j;
>> +
>> + sessions = calloc(count, sizeof(*sessions));
>> + threads = calloc(count, sizeof(*threads));
>> +
>> + for (i = 0; i < clients; i++) {
>> + c = xe_eudebug_client_create(fd, run_discovery_client, flags, NULL);
>> + for (j = 0; j < debuggers_per_client; j++) {
>> + s = &sessions[i * debuggers_per_client + j];
>> + s->client = c;
>> + s->debugger = xe_eudebug_debugger_create(fd, flags, NULL);
>> + s->flags = flags | (!j ? PRIMARY_THREAD : 0);
>> + }
>> + }
>> +
>> + for (i = 0; i < count; i++) {
>> + if (sessions[i].flags & PRIMARY_THREAD)
>> + xe_eudebug_client_start(sessions[i].client);
>> +
>> + pthread_create(&threads[i], NULL, discovery_race_thread, &sessions[i]);
>> + }
>> +
>> + for (i = 0; i < count; i++)
>> + pthread_join(threads[i], NULL);
>> +
>> + for (i = count - 1; i > 0; i--) {
>> + if (sessions[i].flags & PRIMARY_THREAD) {
>> + igt_assert_eq(sessions[i].client->seqno - 1,
>> + sessions[i].debugger->event_count);
>> +
>> + xe_eudebug_event_log_compare(sessions[0].debugger->log,
>> + sessions[i].debugger->log,
>> + XE_EUDEBUG_FILTER_EVENT_VM_BIND);
>> +
>> + xe_eudebug_client_destroy(sessions[i].client);
>> + }
>> + xe_eudebug_debugger_destroy(sessions[i].debugger);
>> + }
>> +}
>> +
>> +static void *attach_dettach_thread(void *data)
>> +{
>> + struct xe_eudebug_session *s = data;
>> + const int tries = 100;
>> + int ret = 0;
>> +
>> + for (int try = 0; try < tries; try++) {
>> + ret = xe_eudebug_debugger_attach(s->debugger, s->client);
>> +
>> + if (ret == -EBUSY) {
>> + usleep(100000);
>> + continue;
>> + }
>> +
>> + igt_assert_eq(ret, 0);
>> +
>> + if (random() % 2 == 0) {
>> + xe_eudebug_debugger_start_worker(s->debugger);
>> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>> + }
>> +
>> + xe_eudebug_debugger_detach(s->debugger);
>> + s->debugger->log->head = 0;
>> + s->debugger->event_count = 0;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static void test_empty_discovery(int fd, unsigned int flags, int clients)
>> +{
>> + struct xe_eudebug_session **s;
>> + pthread_t *threads;
>> + int i, expected = flags & DISCOVERY_CLOSE_CLIENT ? 0 : RESOURCE_COUNT;
>> +
>> + igt_assert(flags & (DISCOVERY_DESTROY_RESOURCES | DISCOVERY_CLOSE_CLIENT));
>> +
>> + s = calloc(clients, sizeof(struct xe_eudebug_session *));
>> + threads = calloc(clients, sizeof(*threads));
>> +
>> + for (i = 0; i < clients; i++)
>> + s[i] = xe_eudebug_session_create(fd, run_discovery_client, flags, NULL);
>> +
>> + for (i = 0; i < clients; i++) {
>> + xe_eudebug_client_start(s[i]->client);
>> +
>> + pthread_create(&threads[i], NULL, attach_dettach_thread, s[i]);
>> + }
>> +
>> + for (i = 0; i < clients; i++)
>> + pthread_join(threads[i], NULL);
>> +
>> + for (i = 0; i < clients; i++) {
>> + xe_eudebug_client_wait_done(s[i]->client);
>> + igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
>> +
>> + xe_eudebug_debugger_start_worker(s[i]->debugger);
>> + xe_eudebug_debugger_stop_worker(s[i]->debugger, 5);
>> + xe_eudebug_debugger_detach(s[i]->debugger);
>> +
>> + igt_assert_eq(s[i]->debugger->event_count, expected);
>> +
>> + xe_eudebug_session_destroy(s[i]);
>> + }
>> +}
>> +
>> +static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
>> + xe_eudebug_ack_ufence(d->fd, ef);
>> +}
>> +
>> +typedef void (*client_run_t)(struct xe_eudebug_client *);
>> +
>> +static void test_client_with_trigger(int fd, unsigned int flags, int count,
>> + client_run_t client_fn, int type,
>> + xe_eudebug_trigger_fn trigger_fn,
>> + struct drm_xe_engine_class_instance *hwe,
>> + bool match_opposite, uint32_t event_filter)
>> +{
>> + struct xe_eudebug_session **s;
>> + int i;
>> +
>> + s = calloc(count, sizeof(*s));
>> +
>> + igt_assert(s);
>> +
>> + for (i = 0; i < count; i++)
>> + s[i] = xe_eudebug_session_create(fd, client_fn, flags, hwe);
>> +
>> + if (trigger_fn)
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_debugger_add_trigger(s[i]->debugger, type, trigger_fn);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
>> + ufence_ack_trigger);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_session_run(s[i]);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_session_check(s[i], match_opposite, event_filter);
>> +
>> + for (i = 0; i < count; i++)
>> + xe_eudebug_session_destroy(s[i]);
>> +}
>> +
>> +struct thread_fn_args {
>> + struct xe_eudebug_client *client;
>> + int fd;
>> +};
>> +
>> +static void *basic_client_th(void *data)
>> +{
>> + struct thread_fn_args *f = data;
>> + struct xe_eudebug_client *c = f->client;
>> + uint32_t *vms;
>> + int fd, i, num_vms;
>> +
>> + fd = f->fd;
>> + igt_assert(fd);
>> +
>> + xe_device_get(fd);
>> +
>> + num_vms = 2 + rand() % 16;
>> + vms = calloc(num_vms, sizeof(*vms));
>> + igt_assert(vms);
>> + igt_debug("Create %d client vms\n", num_vms);
>> +
>> + for (i = 0; i < num_vms; i++)
>> + vms[i] = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>
> I think this client code is prone to get same seqno.
Yes, nice catch!
>
> xe_eudebug_client_vm_create()
> -> vm_event()
> ...
> -> base_event()
> ...
> e->seqno = xe_eudebug_client_get_seqno(c);
>
> There's no mutex protecting 'return c->seqno++' and this increment
> is not atomic.
Exactly, will be fixed in the next revision.
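A minimal sketch of one possible fix, making the seqno increment an atomic read-modify-write; the struct and helper names are illustrative stand-ins, not the actual xe_eudebug library code:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sketch: concurrent threads must never observe the same
 * seqno, so replace the racy 'return c->seqno++' with an atomic op. */
struct client_sketch {
	_Atomic uint64_t seqno;
};

static uint64_t client_get_seqno(struct client_sketch *c)
{
	/* atomic_fetch_add makes the increment indivisible */
	return atomic_fetch_add(&c->seqno, 1);
}
```

An equivalent fix would be to take a mutex around the increment; the atomic is simply cheaper for a single counter.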
>
>> +
>> + for (i = 0; i < num_vms; i++)
>> + xe_eudebug_client_vm_destroy(c, fd, vms[i]);
>> +
>> + xe_device_put(fd);
>> + free(vms);
>> +
>> + return NULL;
>> +}
>> +
>> +static void run_basic_client_th(struct xe_eudebug_client *c)
>> +{
>> + struct thread_fn_args *args;
>> + int i, num_threads, fd;
>> + pthread_t *threads;
>> +
>> + args = calloc(1, sizeof(*args));
>> + igt_assert(args);
>> +
>> + num_threads = 2 + random() % 16;
>> + igt_debug("Run on %d threads\n", num_threads);
>> + threads = calloc(num_threads, sizeof(*threads));
>> + igt_assert(threads);
>> +
>> + fd = xe_eudebug_client_open_driver(c);
>> + args->client = c;
>> + args->fd = fd;
>> +
>> + for (i = 0; i < num_threads; i++)
>> + pthread_create(&threads[i], NULL, basic_client_th, args);
>> +
>> + for (i = 0; i < num_threads; i++)
>> + pthread_join(threads[i], NULL);
>> +
>> + xe_eudebug_client_close_driver(c, fd);
>> + free(args);
>> + free(threads);
>> +}
>> +
>> +static void test_basic_sessions_th(int fd, unsigned int flags, int num_clients, bool match_opposite)
>> +{
>> + test_client_with_trigger(fd, flags, num_clients, run_basic_client_th, 0, NULL, NULL,
>> + match_opposite, 0);
>> +}
>> +
>> +static void vm_access_client(struct xe_eudebug_client *c)
>> +{
>> + struct drm_xe_engine_class_instance *hwe = c->ptr;
>> + uint32_t bo_placement;
>> + struct bind_list *bl;
>> + uint32_t vm;
>> + int fd, i, j;
>> +
>> + igt_debug("Using %s\n", xe_engine_class_string(hwe->engine_class));
>> +
>> + fd = xe_eudebug_client_open_driver(c);
>> + xe_device_get(fd);
>
> Not necessary.
Ok.
>
>> +
>> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> +
>> + if (c->flags & VM_BIND_OP_MAP_USERPTR)
>> + bo_placement = 0;
>> + else
>> + bo_placement = vram_if_possible(fd, hwe->gt_id);
>> +
>> + for (j = 0; j < 5; j++) {
>> + unsigned int target_size = MIN_BO_SIZE * (1 << j);
>> +
>> + bl = create_bind_list(fd, bo_placement, vm, 4, target_size);
>> + do_bind_list(c, bl, true);
>> +
>> + for (i = 0; i < bl->n; i++)
>> + xe_eudebug_client_wait_stage(c, bl->bind_ops[i].addr);
>> +
>> + free_bind_list(c, bl);
>> + }
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> +
>> + xe_device_put(fd);
>
> Same.
Ok.
>
>> + xe_eudebug_client_close_driver(c, fd);
>> +}
>> +
>> +static void debugger_test_vma(struct xe_eudebug_debugger *d,
>> + uint64_t client_handle,
>> + uint64_t vm_handle,
>> + uint64_t va_start,
>> + uint64_t va_length)
>> +{
>> + struct drm_xe_eudebug_vm_open vo = { 0, };
>> + uint64_t *v1, *v2;
>> + uint64_t items = va_length / sizeof(uint64_t);
>> + int fd;
>> + int r, i;
>> +
>> + v1 = malloc(va_length);
>> + igt_assert(v1);
>> + v2 = malloc(va_length);
>> + igt_assert(v2);
>> +
>> + vo.client_handle = client_handle;
>> + vo.vm_handle = vm_handle;
>> +
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert_lte(0, fd);
>> +
>> + r = pread(fd, v1, va_length, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + for (i = 0; i < items; i++)
>> + igt_assert_eq(v1[i], va_start + i);
>> +
>> + for (i = 0; i < items; i++)
>> + v1[i] = va_start + i + 1;
>> +
>> + r = pwrite(fd, v1, va_length, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + lseek(fd, va_start, SEEK_SET);
>> + r = read(fd, v2, va_length);
>> + igt_assert_eq(r, va_length);
>> +
>> + for (i = 0; i < items; i++)
>> + igt_assert_eq(v1[i], v2[i]);
>> +
>> + fsync(fd);
>> +
>> + close(fd);
>> + free(v1);
>> + free(v2);
>> +}
>> +
>> +static void vm_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
>> +
>
> I wanted to ask for checking of event type is this one we expect
> but then I realized debugger_run_triggers() does this check.
Yes, when we register a trigger we specify which event type it should
fire on, and based on the received event type debugger_run_triggers()
calls the appropriate triggers.
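That dispatch can be sketched as follows; the struct names and the run_triggers() helper are assumptions for illustration, not the actual debugger_run_triggers() implementation:

```c
#include <stddef.h>

/* Illustrative sketch: triggers registered per event type, invoked
 * only when the received event's type matches. */
struct event_sketch { int type; };
typedef void (*trigger_fn)(struct event_sketch *e);

struct trigger_sketch {
	int type;
	trigger_fn fn;
};

static int hits; /* counts how many triggers fired, for demonstration */

static void count_hit(struct event_sketch *e)
{
	(void)e;
	hits++;
}

static void run_triggers(struct trigger_sketch *t, size_t n,
			 struct event_sketch *e)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].type == e->type)
			t[i].fn(e); /* only matching types fire */
}
```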
>
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> +
>> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
>> + eo->vm_bind_ref_seqno,
>> + eo->addr,
>> + eo->range);
>> +
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
>> + igt_assert(eb);
>> +
>> + debugger_test_vma(d, eb->client_handle, eb->vm_handle,
>> + eo->addr, eo->range);
>> + xe_eudebug_debugger_signal_stage(d, eo->addr);
>> + }
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-vm-access
>> + * Description:
>> + * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
>> + * vm fd, concerning many different offsets inside the vm,
>> + * and many virtual addresses of the vm_bound object.
>> + *
>> + * SUBTEST: basic-vm-access-userptr
>> + * Description:
>> + * Exercise XE_EUDEBUG_VM_OPEN with pread and pwrite into the
>> + * vm fd, concerning many different offsets inside the vm,
>> + * and many virtual addresses of the vm_bound object, but backed
>> + * by userptr.
>> + */
>> +static void test_vm_access(int fd, unsigned int flags, int num_clients)
>> +{
>> + struct drm_xe_engine_class_instance *hwe;
>> +
>> + xe_eudebug_for_each_engine(fd, hwe)
>> + test_client_with_trigger(fd, flags, num_clients,
>> + vm_access_client,
>> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
>> + vm_trigger, hwe,
>> + false,
>> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
>> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
>> +}
>> +
>> +static void debugger_test_vma_parameters(struct xe_eudebug_debugger *d,
>> + uint64_t client_handle,
>> + uint64_t vm_handle,
>> + uint64_t va_start,
>> + uint64_t va_length)
>> +{
>> + struct drm_xe_eudebug_vm_open vo = { 0, };
>> + uint64_t *v;
>> + uint64_t items = va_length / sizeof(uint64_t);
>> + int fd;
>> + int r, i;
>> +
>> + v = malloc(va_length);
>> + igt_assert(v);
>> +
>> + /* Negative VM open - bad client handle */
>> + vo.client_handle = client_handle + 123;
>> + vo.vm_handle = vm_handle;
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert(fd < 0);
>
> igt_assert_lt()/igt_assert_eq() would be more appropriate here
> and below. BTW nice comments, I've scrolled up to test_read_event()
> to ask for the same there.
Ok.
>
>> +
>> + /* Negative VM open - bad vm handle */
>> + vo.client_handle = client_handle;
>> + vo.vm_handle = vm_handle + 123;
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert(fd < 0);
>> +
>> + /* Positive VM open */
>> + vo.client_handle = client_handle;
>> + vo.vm_handle = vm_handle;
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert_lte(0, fd);
>> +
>> + /* Negative pread - bad fd */
>> + r = pread(fd + 123, v, va_length, va_start);
>> + igt_assert(r < 0);
>> +
>> + /* Negative pread - bad va_start */
>> + r = pread(fd, v, va_length, 0);
>> + igt_assert(r < 0);
>> +
>> + /* Negative pread - bad va_start */
>> + r = pread(fd, v, va_length, va_start - 1);
>> + igt_assert(r < 0);
>> +
>> + /* Positive pread - zero va_length */
>> + r = pread(fd, v, 0, va_start);
>> + igt_assert_eq(r, 0);
>> +
>> +	/* Pread past end of range - truncated to va_length */
>> + r = pread(fd, v, va_length + 1, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + /* Negative pread - bad va_start */
>> + r = pread(fd, v, 1, va_start + va_length);
>> + igt_assert(r < 0);
>> +
>> + /* Positive pread - whole range */
>> + r = pread(fd, v, va_length, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + /* Positive pread */
>> + r = pread(fd, v, 1, va_start + va_length - 1);
>> + igt_assert_eq(r, 1);
>> +
>> + for (i = 0; i < items; i++)
>> + igt_assert_eq(v[i], va_start + i);
>> +
>> + for (i = 0; i < items; i++)
>> + v[i] = va_start + i + 1;
>> +
>> + /* Negative pwrite - bad fd */
>> + r = pwrite(fd + 123, v, va_length, va_start);
>> + igt_assert(r < 0);
>> +
>> + /* Negative pwrite - bad va_start */
>> + r = pwrite(fd, v, va_length, -1);
>> + igt_assert(r < 0);
>> +
>> + /* Negative pwrite - zero va_start */
>> + r = pwrite(fd, v, va_length, 0);
>> + igt_assert(r < 0);
>> +
>> +	/* Pwrite past end of range - truncated to va_length */
>> + r = pwrite(fd, v, va_length + 1, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + /* Positive pwrite - zero va_length */
>> + r = pwrite(fd, v, 0, va_start);
>> + igt_assert_eq(r, 0);
>> +
>> + /* Positive pwrite */
>> + r = pwrite(fd, v, va_length, va_start);
>> + igt_assert_eq(r, va_length);
>> + fsync(fd);
>> +
>> + close(fd);
>> + free(v);
>> +}
>> +
>> +static void vm_trigger_access_parameters(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> +
>> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
>> + eo->vm_bind_ref_seqno,
>> + eo->addr,
>> + eo->range);
>> +
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
>> + igt_assert(eb);
>> +
>> + debugger_test_vma_parameters(d, eb->client_handle, eb->vm_handle, eo->addr,
>> + eo->range);
>> + xe_eudebug_debugger_signal_stage(d, eo->addr);
>> + }
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-vm-access-parameters
>> + * Description:
>> + * Check negative scenarios of VM_OPEN ioctl and pread/pwrite usage.
>> + */
>> +static void test_vm_access_parameters(int fd, unsigned int flags, int num_clients)
>> +{
>> + struct drm_xe_engine_class_instance *hwe;
>> +
>> + xe_eudebug_for_each_engine(fd, hwe)
>> + test_client_with_trigger(fd, flags, num_clients,
>> + vm_access_client,
>> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
>> + vm_trigger_access_parameters, hwe,
>> + false,
>> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
>> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
>> +}
>> +
>> +#define PAGE_SIZE 4096
>
> Repeated definition, use SZ_4K.
Sure.
>
>> +#define MDATA_SIZE (WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM * PAGE_SIZE)
>> +static void metadata_access_client(struct xe_eudebug_client *c)
>> +{
>> + const uint64_t addr = 0x1a0000;
>> + struct drm_xe_vm_bind_op_ext_attach_debug *ext;
>> + uint8_t *data;
>> + size_t bo_size;
>> + uint32_t bo, vm;
>> + int fd, i;
>> +
>> + fd = xe_eudebug_client_open_driver(c);
>> + xe_device_get(fd);
>
> Unnecessary xe_device_get() here.
Ok.
>
>> +
>> + bo_size = xe_get_default_alignment(fd);
>> + vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + bo = xe_bo_create(fd, vm, bo_size, system_memory(fd), 0);
>> +
>> + ext = basic_vm_bind_metadata_ext_prepare(fd, c, &data, MDATA_SIZE);
>> +
>> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, addr,
>> + bo_size, 0, NULL, 0, to_user_pointer(ext));
>> +
>> + for (i = 0; i < WORK_IN_PROGRESS_DRM_XE_DEBUG_METADATA_NUM; i++)
>> + xe_eudebug_client_wait_stage(c, i);
>> +
>> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, addr, bo_size);
>> +
>> + basic_vm_bind_metadata_ext_del(fd, c, ext, data);
>> +
>> + close(bo);
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> +
>> + xe_device_put(fd);
>
> Unnecessary xe_device_put() here, it will be cleaned up after the process exits.
Ok.
>
>> + xe_eudebug_client_close_driver(c, fd);
>> +}
>> +
>> +static void debugger_test_metadata(struct xe_eudebug_debugger *d,
>> + uint64_t client_handle,
>> + uint64_t metadata_handle,
>> + uint64_t type,
>> + uint64_t len)
>> +{
>> + struct drm_xe_eudebug_read_metadata rm = {
>> + .client_handle = client_handle,
>> + .metadata_handle = metadata_handle,
>> + .size = len,
>> + };
>> + uint8_t *data;
>> + int i;
>> +
>> + data = malloc(len);
>> + igt_assert(data);
>> +
>> + rm.ptr = to_user_pointer(data);
>> +
>> + igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
>
> Do you plan to cover this ioctl? Here's only optimistic scenario.
At some point, sure. This test isn't meant to be complete and final yet.
>
>> +
>> +	/* synthetic check, test sets different size per metadata type */
>> + igt_assert_eq((type + 1) * PAGE_SIZE, rm.size);
>> +
>> + for (i = 0; i < rm.size; i++)
>> + igt_assert_eq(data[i], 0xff & (i + (i > PAGE_SIZE)));
>
> I've commented out above, but when I see this I likely guess you would
> like to have different metadata type starting at different value.
> Currently first page starts with 0x0 value, second/third/... pages are
> always starts with 0x1. I think assigning instead i > PAGE_SIZE would
> be use i / PAGE_SIZE, or i >> 12. Then each metadata type would have
> unique start value at each page.
Well, like I said above, each metadata type doesn't necessarily have to
start at a different value. What's more, they all start at the same point.
But I assume the goal was for each to extend the pointed memory area by a
page, and that page was probably intended to be different. Agreed that
adding 'i / PAGE_SIZE' would give the desired result.
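The difference is easy to demonstrate with a small sketch; PAGE_SZ and the two fill helpers are illustrative, standing in for the test's PAGE_SIZE and fill loop:

```c
#include <stdint.h>

#define PAGE_SZ 4096

/* current pattern: every page after the first starts with the same byte */
static uint8_t fill_old(unsigned int i)
{
	return 0xff & (i + (i > PAGE_SZ));
}

/* suggested pattern: each page starts with a distinct byte */
static uint8_t fill_new(unsigned int i)
{
	return 0xff & (i + i / PAGE_SZ);
}
```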
>
>> +
>> + free(data);
>> +}
>> +
>> +static void metadata_read_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_metadata *em = (void *)e;
>> +
>> +	/* synthetic check, test sets different size per metadata type */
>> + igt_assert_eq((em->type + 1) * PAGE_SIZE, em->len);
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + debugger_test_metadata(d, em->client_handle, em->metadata_handle,
>> + em->type, em->len);
>> + xe_eudebug_debugger_signal_stage(d, em->type);
>> + }
>> +}
>> +
>> +static void metadata_read_on_vm_bind_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_op_metadata *em = (void *)e;
>> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> +
>> + /* For testing purpose client sets metadata_cookie = type */
>> +
>> + /*
>> + * Metadata event has a reference to vm-bind-op event which has a reference
>> + * to vm-bind event which contains proper client-handle.
>> + */
>> + eo = (struct drm_xe_eudebug_event_vm_bind_op *)
>> + xe_eudebug_event_log_find_seqno(d->log, em->vm_bind_op_ref_seqno);
>> + igt_assert(eo);
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
>> + igt_assert(eb);
>> +
>> + debugger_test_metadata(d,
>> + eb->client_handle,
>> + em->metadata_handle,
>> + em->metadata_cookie,
>> + MDATA_SIZE); /* max size */
>> +
>> + xe_eudebug_debugger_signal_stage(d, em->metadata_cookie);
>> +}
>> +
>> +/**
>> + * SUBTEST: read-metadata
>> + * Description:
>> + * Exercise DRM_XE_EUDEBUG_IOCTL_READ_METADATA and debug metadata create|destroy events.
>> + */
>> +static void test_metadata_read(int fd, unsigned int flags, int num_clients)
>> +{
>> + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
>> + DRM_XE_EUDEBUG_EVENT_METADATA, metadata_read_trigger,
>> + NULL, true, 0);
>> +}
>> +
>> +/**
>> + * SUBTEST: attach-debug-metadata
>> + * Description:
>> + * Read debug metadata when vm_bind has it attached.
>> + */
>> +static void test_metadata_attach(int fd, unsigned int flags, int num_clients)
>> +{
>> + test_client_with_trigger(fd, flags, num_clients, metadata_access_client,
>> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP_METADATA,
>> + metadata_read_on_vm_bind_trigger,
>> + NULL, true, 0);
>> +}
>> +
>> +#define STAGE_CLIENT_WAIT_ON_UFENCE_DONE 1337
>> +
>> +#define UFENCE_EVENT_COUNT_EXPECTED 4
>> +#define UFENCE_EVENT_COUNT_MAX 100
>> +
>> +struct ufence_bind {
>> + struct drm_xe_sync f;
>> + uint64_t addr;
>> + uint64_t range;
>> + uint64_t value;
>> + struct {
>> + uint64_t vm_sync;
>> + } *fence_data;
>> +};
>> +
>> +static void client_wait_ufences(struct xe_eudebug_client *c,
>> + int fd, struct ufence_bind *binds, int count)
>> +{
>> + const int64_t default_fence_timeout_ns = 500 * NSEC_PER_MSEC;
>> + int64_t timeout_ns;
>> + int err;
>> +
>> + /* Ensure that wait on unacked ufence times out */
>> + for (int i = 0; i < count; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + timeout_ns = default_fence_timeout_ns;
>> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
>> + 0, &timeout_ns);
>> + igt_assert_eq(err, -ETIME);
>> + igt_assert_neq(b->fence_data->vm_sync, b->f.timeline_value);
>> + igt_debug("wait #%d blocked on ack\n", i);
>> + }
>> +
>> + /* Wait on fence timed out, now tell the debugger to ack */
>> + xe_eudebug_client_signal_stage(c, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
>> +
>> + /* Check that ack unblocks ufence */
>> + for (int i = 0; i < count; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
>> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
>> + 0, &timeout_ns);
>> + igt_assert_eq(err, 0);
>> + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
>> + igt_debug("wait #%d completed\n", i);
>> + }
>> +}
>> +
>> +static struct ufence_bind *create_binds_with_ufence(int fd, int count)
>> +{
>> + struct ufence_bind *binds;
>> +
>> + binds = calloc(count, sizeof(*binds));
>> + igt_assert(binds);
>> +
>> + for (int i = 0; i < count; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + b->range = 0x1000;
>> + b->addr = 0x100000 + b->range * i;
>> + b->fence_data = aligned_alloc(xe_get_default_alignment(fd),
>> + sizeof(*b->fence_data));
>
> Where's fence_data freed?
Yup, will add a complementary destroy function.
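A sketch of such a destroy helper, mirroring create_binds_with_ufence() from the patch; the trimmed-down struct layout is an assumption for illustration:

```c
#include <stdint.h>
#include <stdlib.h>

struct ufence_bind_sketch {
	struct {
		uint64_t vm_sync;
	} *fence_data;
};

static void destroy_binds_with_ufence(struct ufence_bind_sketch *binds,
				      int count)
{
	for (int i = 0; i < count; i++)
		free(binds[i].fence_data); /* pairs with aligned_alloc() */
	free(binds); /* pairs with calloc() in the create helper */
}
```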
>
>> + igt_assert(b->fence_data);
>> + memset(b->fence_data, 0, sizeof(*b->fence_data));
>> +
>> + b->f.type = DRM_XE_SYNC_TYPE_USER_FENCE;
>> + b->f.flags = DRM_XE_SYNC_FLAG_SIGNAL;
>> + b->f.addr = to_user_pointer(&b->fence_data->vm_sync);
>> + b->f.timeline_value = UFENCE_EVENT_COUNT_EXPECTED + i;
>> + }
>> +
>> + return binds;
>> +}
>> +
>> +static void basic_ufence_client(struct xe_eudebug_client *c)
>> +{
>> + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
>> + int fd = xe_eudebug_client_open_driver(c);
>> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + size_t bo_size = n * xe_get_default_alignment(fd);
>> + uint32_t bo = xe_bo_create(fd, 0, bo_size,
>> + system_memory(fd), 0);
>> + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
>> +
>> + for (int i = 0; i < n; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo, 0, b->addr, b->range, 0,
>> + &b->f, 1, 0);
>> + }
>> +
>> + client_wait_ufences(c, fd, binds, n);
>> +
>> + for (int i = 0; i < n; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
>> + }
>> +
>> + free(binds);
>> + gem_close(fd, bo);
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> + xe_eudebug_client_close_driver(c, fd);
>> +}
>> +
>> +struct ufence_priv {
>> + struct drm_xe_eudebug_event_vm_bind_ufence ufence_events[UFENCE_EVENT_COUNT_MAX];
>> + uint64_t ufence_event_seqno[UFENCE_EVENT_COUNT_MAX];
>> + uint64_t ufence_event_vm_addr_start[UFENCE_EVENT_COUNT_MAX];
>> + uint64_t ufence_event_vm_addr_range[UFENCE_EVENT_COUNT_MAX];
>> + unsigned int ufence_event_count;
>> + unsigned int vm_bind_op_count;
>> + pthread_mutex_t mutex;
>> +};
>> +
>> +static struct ufence_priv *ufence_priv_create(void)
>> +{
>> + struct ufence_priv *priv;
>> +
>> + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
>> + PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
>> + igt_assert(priv);
>> + memset(priv, 0, sizeof(*priv));
>> + pthread_mutex_init(&priv->mutex, NULL);
>
> I think you should ensure that attribute PTHREAD_PROCESS_SHARED is set
> for multiprocess usage (see pthread_mutexattr_get/setpshared()).
Thanks for pointing this out, will fix it in the next revision!
>
>> +
>> + return priv;
>> +}
>> +
>> +static void ufence_priv_destroy(struct ufence_priv *priv)
>> +{
>> + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
>> +}
>> +
>> +static void ack_fences(struct xe_eudebug_debugger *d)
>> +{
>> + struct ufence_priv *priv = d->ptr;
>> +
>> + for (int i = 0; i < priv->ufence_event_count; i++)
>> + xe_eudebug_ack_ufence(d->fd, &priv->ufence_events[i]);
>> +}
>> +
>> +static void basic_ufence_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
>> + struct ufence_priv *priv = d->ptr;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> +
>> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
>> + igt_debug("ufence event received: %s\n", event_str);
>> +
>> + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
>> + "surplus ufence event received: %s\n", event_str);
>> + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
>> +
>> + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
>> +
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
>> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
>> + ef->vm_bind_ref_seqno);
>> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
>> + "vm bind event does not have ufence: %s\n", event_str);
>> + }
>> +}
>> +
>> +static int wait_for_ufence_events(struct ufence_priv *priv, int timeout_ms)
>> +{
>> + int ret = -ETIMEDOUT;
>> +
>> + igt_for_milliseconds(timeout_ms) {
>> + pthread_mutex_lock(&priv->mutex);
>> + if (priv->ufence_event_count == UFENCE_EVENT_COUNT_EXPECTED)
>> + ret = 0;
>> + pthread_mutex_unlock(&priv->mutex);
>> +
>> + if (!ret)
>> + break;
>> + usleep(1000);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +/**
>> + * SUBTEST: basic-vm-bind-ufence
>> + * Description:
>> + * Give user fence in application and check if ufence ack works
>> + */
>> +static void test_basic_ufence(int fd, unsigned int flags)
>> +{
>> + struct xe_eudebug_debugger *d;
>> + struct xe_eudebug_session *s;
>> + struct xe_eudebug_client *c;
>> + struct ufence_priv *priv;
>> +
>> + priv = ufence_priv_create();
>> + s = xe_eudebug_session_create(fd, basic_ufence_client, flags, priv);
>> + c = s->client;
>> + d = s->debugger;
>> +
>> + xe_eudebug_debugger_add_trigger(d,
>> + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
>> + basic_ufence_trigger);
>> +
>> + igt_assert_eq(xe_eudebug_debugger_attach(d, c), 0);
>> + xe_eudebug_debugger_start_worker(d);
>> + xe_eudebug_client_start(c);
>> +
>> + xe_eudebug_debugger_wait_stage(s, STAGE_CLIENT_WAIT_ON_UFENCE_DONE);
>> + xe_eudebug_assert_f(d, wait_for_ufence_events(priv, XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * MSEC_PER_SEC) == 0,
>> + "missing ufence events\n");
>> + ack_fences(d);
>> +
>> + xe_eudebug_client_wait_done(c);
>> + xe_eudebug_debugger_stop_worker(d, 1);
>> +
>> + xe_eudebug_event_log_print(d->log, true);
>> + xe_eudebug_event_log_print(c->log, true);
>> +
>> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
>> +
>> + xe_eudebug_session_destroy(s);
>> + ufence_priv_destroy(priv);
>> +}
>> +
>> +struct vm_bind_clear_thread_priv {
>> + struct drm_xe_engine_class_instance *hwe;
>> + struct xe_eudebug_client *c;
>> + pthread_t thread;
>> + uint64_t region;
>> + unsigned long sum;
>> +};
>> +
>> +struct vm_bind_clear_priv {
>> + unsigned long unbind_count;
>> + unsigned long bind_count;
>> + unsigned long sum;
>> +};
>> +
>> +static struct vm_bind_clear_priv *vm_bind_clear_priv_create(void)
>> +{
>> + struct vm_bind_clear_priv *priv;
>> +
>> + priv = mmap(0, ALIGN(sizeof(*priv), PAGE_SIZE),
>> + PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
>> + igt_assert(priv);
>> + memset(priv, 0, sizeof(*priv));
>> +
>> + return priv;
>> +}
>> +
>> +static void vm_bind_clear_priv_destroy(struct vm_bind_clear_priv *priv)
>> +{
>> + munmap(priv, ALIGN(sizeof(*priv), PAGE_SIZE));
>> +}
>> +
>> +static void *vm_bind_clear_thread(void *data)
>> +{
>> + const uint32_t CS_GPR0 = 0x600;
>> + const size_t batch_size = 16;
>> + struct drm_xe_sync uf_sync = {
>> + .type = DRM_XE_SYNC_TYPE_USER_FENCE, .flags = DRM_XE_SYNC_FLAG_SIGNAL,
>> + };
>> + struct vm_bind_clear_thread_priv *priv = data;
>> + int fd = xe_eudebug_client_open_driver(priv->c);
>> + uint32_t gtt_size = 1ull << min_t(uint32_t, xe_va_bits(fd), 48);
>> + uint32_t vm = xe_eudebug_client_vm_create(priv->c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + size_t bo_size = xe_bb_size(fd, batch_size);
>> + unsigned long count = 0;
>> + uint64_t *fence_data;
>> +
>> + /* init uf_sync */
>> + fence_data = aligned_alloc(xe_get_default_alignment(fd), sizeof(*fence_data));
>> + igt_assert(fence_data);
>> + uf_sync.timeline_value = 1337;
>> + uf_sync.addr = to_user_pointer(fence_data);
>> +
>> + igt_debug("Run on: %s%u\n", xe_engine_class_string(priv->hwe->engine_class),
>> + priv->hwe->engine_instance);
>> +
>> + igt_until_timeout(5) {
>> + struct drm_xe_ext_set_property eq_ext = {
>> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
>> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
>> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
>> + };
>> + struct drm_xe_exec_queue_create eq_create = { 0 };
>> + uint32_t clean_bo = 0;
>> + uint32_t batch_bo = 0;
>> + uint64_t clean_offset, batch_offset;
>> + uint32_t exec_queue;
>> + uint32_t *map, *cs;
>> + uint64_t delta;
>> +
>> + /* calculate offsets (vma addresses) */
>> + batch_offset = (random() * SZ_2M) & (gtt_size - 1);
>> + /* XXX: for some platforms/memory regions batch offset '0' can be problematic */
>> + if (batch_offset == 0)
>> + batch_offset = SZ_2M;
>> +
>> + do {
>> + clean_offset = (random() * SZ_2M) & (gtt_size - 1);
>> + if (clean_offset == 0)
>> + clean_offset = SZ_2M;
>> + } while (clean_offset == batch_offset);
>> +
>> + batch_offset += random() % SZ_2M & -bo_size;
>> + clean_offset += random() % SZ_2M & -bo_size;
>> +
>> + delta = (random() % bo_size) & -4;
>> +
>> + /* prepare clean bo */
>> + clean_bo = xe_bo_create(fd, vm, bo_size, priv->region,
>> + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
>> + memset(fence_data, 0, sizeof(*fence_data));
>> + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, clean_bo, 0, clean_offset, bo_size,
>> + 0, &uf_sync, 1, 0);
>> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
>> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
>> +
>> + /* prepare batch bo */
>> + batch_bo = xe_bo_create(fd, vm, bo_size, priv->region,
>> + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
>> + memset(fence_data, 0, sizeof(*fence_data));
>> + xe_eudebug_client_vm_bind_flags(priv->c, fd, vm, batch_bo, 0, batch_offset, bo_size,
>> + 0, &uf_sync, 1, 0);
>> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
>> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
>> +
>> + map = xe_bo_map(fd, batch_bo, bo_size);
>> +
>> + cs = map;
>> + *cs++ = MI_NOOP | 0xc5a3;
>> + *cs++ = MI_LOAD_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
>> + *cs++ = CS_GPR0;
>> + *cs++ = clean_offset + delta;
>> + *cs++ = (clean_offset + delta) >> 32;
>> + *cs++ = MI_STORE_REGISTER_MEM_CMD | MI_LRI_LRM_CS_MMIO | 2;
>> + *cs++ = CS_GPR0;
>> + *cs++ = batch_offset;
>> + *cs++ = batch_offset >> 32;
>> + *cs++ = MI_BATCH_BUFFER_END;
>> +
>> + /* execute batch */
>> + eq_create.width = 1;
>> + eq_create.num_placements = 1;
>> + eq_create.vm_id = vm;
>> + eq_create.instances = to_user_pointer(priv->hwe);
>> + eq_create.extensions = to_user_pointer(&eq_ext);
>> + exec_queue = xe_eudebug_client_exec_queue_create(priv->c, fd, &eq_create);
>> +
>> + memset(fence_data, 0, sizeof(*fence_data));
>> + xe_exec_sync(fd, exec_queue, batch_offset, &uf_sync, 1);
>> + xe_wait_ufence(fd, fence_data, uf_sync.timeline_value, 0,
>> + XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
>> +
>> + igt_assert_eq(*map, 0);
>> +
>> + /* cleanup */
>> + xe_eudebug_client_exec_queue_destroy(priv->c, fd, &eq_create);
>> + munmap(map, bo_size);
>> +
>> + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, batch_offset, bo_size);
>> + gem_close(fd, batch_bo);
>> +
>> + xe_eudebug_client_vm_unbind(priv->c, fd, vm, 0, clean_offset, bo_size);
>> + gem_close(fd, clean_bo);
>> +
>> + count++;
>> + }
>> +
>> + priv->sum = count;
>> +
>> + free(fence_data);
>> + xe_eudebug_client_close_driver(priv->c, fd);
>> + return NULL;
>> +}
>> +
>> +static void vm_bind_clear_client(struct xe_eudebug_client *c)
>> +{
>> + int fd = xe_eudebug_client_open_driver(c);
>> + struct xe_device *xe_dev = xe_device_get(fd);
>> + int count = xe_number_engines(fd) * xe_dev->mem_regions->num_mem_regions;
>> + uint64_t memreg = all_memory_regions(fd);
>> + struct vm_bind_clear_priv *priv = c->ptr;
>> + int current = 0;
>> + struct drm_xe_engine_class_instance *engine;
>> + struct vm_bind_clear_thread_priv *threads;
>> + uint64_t region;
>> +
>> + threads = calloc(count, sizeof(*threads));
>> + igt_assert(threads);
>> + priv->sum = 0;
>> +
>> + xe_for_each_mem_region(fd, memreg, region) {
>> + xe_eudebug_for_each_engine(fd, engine) {
>> + threads[current].c = c;
>> + threads[current].hwe = engine;
>> + threads[current].region = region;
>> +
>> + pthread_create(&threads[current].thread, NULL,
>> + vm_bind_clear_thread, &threads[current]);
>> + current++;
>> + }
>> + }
>> +
>> + for (current = 0; current < count; current++)
>> + pthread_join(threads[current].thread, NULL);
>> +
>> + xe_for_each_mem_region(fd, memreg, region) {
>> + unsigned long sum = 0;
>> +
>> + for (current = 0; current < count; current++)
>> + if (threads[current].region == region)
>> + sum += threads[current].sum;
>> +
>> + igt_info("%s sampled %lu objects\n", xe_region_name(region), sum);
>> + priv->sum += sum;
>> + }
>> +
>> + free(threads);
>> + xe_device_put(fd);
>> + xe_eudebug_client_close_driver(c, fd);
>> +}
>> +
>> +static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
>> + struct vm_bind_clear_priv *priv = d->ptr;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + if (random() & 1) {
>
> What's that random() doing here?
Well, it looks like we want to perform the vm_open and debugger-side check
in approx. 50% of the cases, randomly. But why that is, I can't tell yet.
I will either drop it in the next revision or come back later with an
explanation of why it's there.
>
>> + struct drm_xe_eudebug_vm_open vo = { 0, };
>> + uint32_t v = 0xc1c1c1c1;
>> +
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> + int fd, delta, r;
>> +
>> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
>> + eo->vm_bind_ref_seqno, eo->addr, eo->range);
>> +
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
>> + igt_assert(eb);
>> +
>> + vo.client_handle = eb->client_handle;
>> + vo.vm_handle = eb->vm_handle;
>> +
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert_lte(0, fd);
>> +
>> + delta = (random() % eo->range) & -4;
>> + r = pread(fd, &v, sizeof(v), eo->addr + delta);
>> + igt_assert_eq(r, sizeof(v));
>> + igt_assert_eq_u32(v, 0);
>> +
>> + close(fd);
>> + }
>> + priv->bind_count++;
>> + }
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
>> + priv->unbind_count++;
>> +}
>> +
>> +static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
>> +
>> + xe_eudebug_ack_ufence(d->fd, ef);
>> +}
>> +
>> +/**
>> + * SUBTEST: vm-bind-clear
>> + * Description:
>> + * Check that fresh buffers we vm_bind into the ppGTT are always clear.
>> + */
>> +static void test_vm_bind_clear(int fd)
>> +{
>> + struct vm_bind_clear_priv *priv;
>> + struct xe_eudebug_session *s;
>> +
>> + priv = vm_bind_clear_priv_create();
>> + s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
>> +
>> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
>> + vm_bind_clear_test_trigger);
>> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
>> + vm_bind_clear_ack_trigger);
>> +
>> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
>> + xe_eudebug_debugger_start_worker(s->debugger);
>> + xe_eudebug_client_start(s->client);
>> +
>> + xe_eudebug_client_wait_done(s->client);
>> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>> +
>> + igt_assert_eq(priv->bind_count, priv->unbind_count);
>> + igt_assert_eq(priv->sum * 2, priv->bind_count);
>> +
>> + xe_eudebug_session_destroy(s);
>> + vm_bind_clear_priv_destroy(priv);
>> +}
>> +
>> +#define UFENCE_CLIENT_VM_TEST_VAL_START 0xaaaaaaaa
>> +#define UFENCE_CLIENT_VM_TEST_VAL_END 0xbbbbbbbb
>> +
>> +static void vma_ufence_client(struct xe_eudebug_client *c)
>> +{
>> + const unsigned int n = UFENCE_EVENT_COUNT_EXPECTED;
>> + int fd = xe_eudebug_client_open_driver(c);
>> + struct ufence_bind *binds = create_binds_with_ufence(fd, n);
>> + uint32_t vm = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
>> + size_t bo_size = xe_get_default_alignment(fd);
>> + uint64_t items = bo_size / sizeof(uint32_t);
>> + uint32_t bo[UFENCE_EVENT_COUNT_EXPECTED];
>> + uint32_t *ptr[UFENCE_EVENT_COUNT_EXPECTED];
>> +
>> + for (int i = 0; i < n; i++) {
>> + bo[i] = xe_bo_create(fd, 0, bo_size,
>> + system_memory(fd), 0);
>> + ptr[i] = xe_bo_map(fd, bo[i], bo_size);
>> + igt_assert(ptr[i]);
>> + memset(ptr[i], UFENCE_CLIENT_VM_TEST_VAL_START, bo_size);
>> + }
>> +
>> + for (int i = 0; i < n; i++)
>> + for (int j = 0; j < items; j++)
>> + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_START);
>
> What is this loop for? ptr was filled in the previous loop. Is it possible
> this data does not persist?
I guess this one is a little bit paranoid; I will get rid of it.
Thanks,
Christoph
>
> I didn't spot other issues during this reading.
>
> --
> Zbigniew
>
>> +
>> + for (int i = 0; i < n; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + xe_eudebug_client_vm_bind_flags(c, fd, vm, bo[i], 0, b->addr, b->range, 0,
>> + &b->f, 1, 0);
>> + }
>> +
>> + /* Wait for acks on ufences */
>> + for (int i = 0; i < n; i++) {
>> + int err;
>> + int64_t timeout_ns;
>> + struct ufence_bind *b = &binds[i];
>> +
>> + timeout_ns = XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC;
>> + err = __xe_wait_ufence(fd, &b->fence_data->vm_sync, b->f.timeline_value,
>> + 0, &timeout_ns);
>> + igt_assert_eq(err, 0);
>> + igt_assert_eq(b->fence_data->vm_sync, b->f.timeline_value);
>> + igt_debug("wait #%d completed\n", i);
>> +
>> + for (int j = 0; j < items; j++)
>> + igt_assert_eq(ptr[i][j], UFENCE_CLIENT_VM_TEST_VAL_END);
>> + }
>> +
>> + for (int i = 0; i < n; i++) {
>> + struct ufence_bind *b = &binds[i];
>> +
>> + xe_eudebug_client_vm_unbind(c, fd, vm, 0, b->addr, b->range);
>> + }
>> +
>> + free(binds);
>> +
>> + for (int i = 0; i < n; i++) {
>> + munmap(ptr[i], bo_size);
>> + gem_close(fd, bo[i]);
>> + }
>> +
>> + xe_eudebug_client_vm_destroy(c, fd, vm);
>> + xe_eudebug_client_close_driver(c, fd);
>> +}
>> +
>> +static void debugger_test_vma_ufence(struct xe_eudebug_debugger *d,
>> + uint64_t client_handle,
>> + uint64_t vm_handle,
>> + uint64_t va_start,
>> + uint64_t va_length)
>> +{
>> + struct drm_xe_eudebug_vm_open vo = { 0, };
>> + uint32_t *v1, *v2;
>> + uint32_t items = va_length / sizeof(uint32_t);
>> + int fd;
>> + int r, i;
>> +
>> + v1 = malloc(va_length);
>> + igt_assert(v1);
>> + v2 = malloc(va_length);
>> + igt_assert(v2);
>> +
>> + vo.client_handle = client_handle;
>> + vo.vm_handle = vm_handle;
>> +
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert_lte(0, fd);
>> +
>> + r = pread(fd, v1, va_length, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + for (i = 0; i < items; i++)
>> + igt_assert_eq(v1[i], UFENCE_CLIENT_VM_TEST_VAL_START);
>> +
>> + memset(v1, UFENCE_CLIENT_VM_TEST_VAL_END, va_length);
>> +
>> + r = pwrite(fd, v1, va_length, va_start);
>> + igt_assert_eq(r, va_length);
>> +
>> + lseek(fd, va_start, SEEK_SET);
>> + r = read(fd, v2, va_length);
>> + igt_assert_eq(r, va_length);
>> +
>> + for (i = 0; i < items; i++)
>> + igt_assert_eq_u64(v1[i], v2[i]);
>> +
>> + fsync(fd);
>> +
>> + close(fd);
>> + free(v1);
>> + free(v2);
>> +}
>> +
>> +static void vma_ufence_op_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
>> + struct ufence_priv *priv = d->ptr;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> + unsigned int op_count = priv->vm_bind_op_count++;
>> +
>> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
>> + igt_debug("vm bind op event: ref %lld, addr 0x%llx, range 0x%llx, op_count %u\n",
>> + eo->vm_bind_ref_seqno,
>> + eo->addr,
>> + eo->range,
>> + op_count);
>> + igt_debug("vm bind op event received: %s\n", event_str);
>> + xe_eudebug_assert(d, eo->vm_bind_ref_seqno);
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
>> +
>> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
>> + eo->vm_bind_ref_seqno);
>> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
>> + "vm bind event does not have ufence: %s\n", event_str);
>> +
>> + priv->ufence_event_seqno[op_count] = eo->vm_bind_ref_seqno;
>> + priv->ufence_event_vm_addr_start[op_count] = eo->addr;
>> + priv->ufence_event_vm_addr_range[op_count] = eo->range;
>> + }
>> +}
>> +
>> +static void vma_ufence_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
>> + struct ufence_priv *priv = d->ptr;
>> + unsigned int ufence_count = priv->ufence_event_count;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + char event_str[XE_EUDEBUG_EVENT_STRING_MAX_LEN];
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> + uint64_t addr = priv->ufence_event_vm_addr_start[ufence_count];
>> + uint64_t range = priv->ufence_event_vm_addr_range[ufence_count];
>> +
>> + xe_eudebug_event_to_str(e, event_str, XE_EUDEBUG_EVENT_STRING_MAX_LEN);
>> + igt_debug("ufence event received: %s\n", event_str);
>> +
>> + xe_eudebug_assert_f(d, priv->ufence_event_count < UFENCE_EVENT_COUNT_EXPECTED,
>> + "surplus ufence event received: %s\n", event_str);
>> + xe_eudebug_assert(d, ef->vm_bind_ref_seqno);
>> +
>> + memcpy(&priv->ufence_events[priv->ufence_event_count++], ef, sizeof(*ef));
>> +
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, ef->vm_bind_ref_seqno);
>> + xe_eudebug_assert_f(d, eb, "vm bind event with seqno (%lld) not found\n",
>> + ef->vm_bind_ref_seqno);
>> + xe_eudebug_assert_f(d, eb->flags & DRM_XE_EUDEBUG_EVENT_VM_BIND_FLAG_UFENCE,
>> + "vm bind event does not have ufence: %s\n", event_str);
>> + igt_debug("vm bind ufence event received with ref %lld, addr 0x%lx, range 0x%lx\n",
>> + ef->vm_bind_ref_seqno,
>> + addr,
>> + range);
>> + debugger_test_vma_ufence(d, eb->client_handle, eb->vm_handle,
>> + addr, range);
>> +
>> + xe_eudebug_ack_ufence(d->fd, ef);
>> + }
>> +}
>> +
>> +/**
>> + * SUBTEST: vma-ufence
>> + * Description:
>> + * Intercept vm bind after receiving ufence event, then access target vm and write to it.
>> + * Then check on client side if the write was successful.
>> + */
>> +static void test_vma_ufence(int fd, unsigned int flags)
>> +{
>> + struct xe_eudebug_session *s;
>> + struct ufence_priv *priv;
>> +
>> + priv = ufence_priv_create();
>> + s = xe_eudebug_session_create(fd, vma_ufence_client, flags, priv);
>> +
>> + xe_eudebug_debugger_add_trigger(s->debugger,
>> + DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
>> + vma_ufence_op_trigger);
>> + xe_eudebug_debugger_add_trigger(s->debugger,
>> + DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
>> + vma_ufence_trigger);
>> +
>> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
>> + xe_eudebug_debugger_start_worker(s->debugger);
>> + xe_eudebug_client_start(s->client);
>> +
>> + xe_eudebug_client_wait_done(s->client);
>> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>> +
>> + xe_eudebug_event_log_print(s->debugger->log, true);
>> + xe_eudebug_event_log_print(s->client->log, true);
>> +
>> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
>> +
>> + xe_eudebug_session_destroy(s);
>> + ufence_priv_destroy(priv);
>> +}
>> +
>> +igt_main
>> +{
>> + bool was_enabled;
>> + bool *multigpu_was_enabled;
>> + int fd, gpu_count;
>> +
>> + igt_fixture {
>> + fd = drm_open_driver(DRIVER_XE);
>> + was_enabled = xe_eudebug_enable(fd, true);
>> + }
>> +
>> + igt_subtest("sysfs-toggle")
>> + test_sysfs_toggle(fd);
>> +
>> + igt_subtest("basic-connect")
>> + test_connect(fd);
>> +
>> + igt_subtest("connect-user")
>> + test_connect_user(fd);
>> +
>> + igt_subtest("basic-close")
>> + test_close(fd);
>> +
>> + igt_subtest("basic-read-event")
>> + test_read_event(fd);
>> +
>> + igt_subtest("basic-client")
>> + test_basic_sessions(fd, 0, 1, true);
>> +
>> + igt_subtest("basic-client-th")
>> + test_basic_sessions_th(fd, 0, 1, true);
>> +
>> + igt_subtest("basic-vm-access")
>> + test_vm_access(fd, 0, 1);
>> +
>> + igt_subtest("basic-vm-access-userptr")
>> + test_vm_access(fd, VM_BIND_OP_MAP_USERPTR, 1);
>> +
>> + igt_subtest("basic-vm-access-parameters")
>> + test_vm_access_parameters(fd, 0, 1);
>> +
>> + igt_subtest("multiple-sessions")
>> + test_basic_sessions(fd, CREATE_VMS | CREATE_EXEC_QUEUES, 4, true);
>> +
>> + igt_subtest("basic-vms")
>> + test_basic_sessions(fd, CREATE_VMS, 1, true);
>> +
>> + igt_subtest("basic-exec-queues")
>> + test_basic_sessions(fd, CREATE_EXEC_QUEUES, 1, true);
>> +
>> + igt_subtest("basic-vm-bind")
>> + test_basic_sessions(fd, VM_BIND, 1, true);
>> +
>> + igt_subtest("basic-vm-bind-ufence")
>> + test_basic_ufence(fd, 0);
>> +
>> + igt_subtest("vma-ufence")
>> + test_vma_ufence(fd, 0);
>> +
>> + igt_subtest("vm-bind-clear")
>> + test_vm_bind_clear(fd);
>> +
>> + igt_subtest("basic-vm-bind-discovery")
>> + test_basic_discovery(fd, VM_BIND, true);
>> +
>> + igt_subtest("basic-vm-bind-metadata-discovery")
>> + test_basic_discovery(fd, VM_BIND_METADATA, true);
>> +
>> + igt_subtest("basic-vm-bind-vm-destroy")
>> + test_basic_sessions(fd, VM_BIND_VM_DESTROY, 1, false);
>> +
>> + igt_subtest("basic-vm-bind-vm-destroy-discovery")
>> + test_basic_discovery(fd, VM_BIND_VM_DESTROY, false);
>> +
>> + igt_subtest("basic-vm-bind-extended")
>> + test_basic_sessions(fd, VM_BIND_EXTENDED, 1, true);
>> +
>> + igt_subtest("basic-vm-bind-extended-discovery")
>> + test_basic_discovery(fd, VM_BIND_EXTENDED, true);
>> +
>> + igt_subtest("read-metadata")
>> + test_metadata_read(fd, 0, 1);
>> +
>> + igt_subtest("attach-debug-metadata")
>> + test_metadata_attach(fd, 0, 1);
>> +
>> + igt_subtest("discovery-race")
>> + test_race_discovery(fd, 0, 4);
>> +
>> + igt_subtest("discovery-race-vmbind")
>> + test_race_discovery(fd, DISCOVERY_VM_BIND, 4);
>> +
>> + igt_subtest("discovery-empty")
>> + test_empty_discovery(fd, DISCOVERY_CLOSE_CLIENT, 16);
>> +
>> + igt_subtest("discovery-empty-clients")
>> + test_empty_discovery(fd, DISCOVERY_DESTROY_RESOURCES, 16);
>> +
>> + igt_fixture {
>> + xe_eudebug_enable(fd, was_enabled);
>> + drm_close_driver(fd);
>> + }
>> +
>> + igt_subtest_group {
>> + igt_fixture {
>> + gpu_count = drm_prepare_filtered_multigpu(DRIVER_XE);
>> + igt_require(gpu_count >= 2);
>> +
>> + multigpu_was_enabled = malloc(gpu_count * sizeof(bool));
>> + igt_assert(multigpu_was_enabled);
>> + for (int i = 0; i < gpu_count; i++) {
>> + fd = drm_open_filtered_card(i);
>> + multigpu_was_enabled[i] = xe_eudebug_enable(fd, true);
>> + close(fd);
>> + }
>> + }
>> +
>> + igt_subtest("multigpu-basic-client") {
>> + igt_multi_fork(child, gpu_count) {
>> + fd = drm_open_filtered_card(child);
>> + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
>> + child, errno);
>> + igt_assert(is_xe_device(fd));
>> +
>> + test_basic_sessions(fd, 0, 1, true);
>> + close(fd);
>> + }
>> + igt_waitchildren();
>> + }
>> +
>> + igt_subtest("multigpu-basic-client-many") {
>> + igt_multi_fork(child, gpu_count) {
>> + fd = drm_open_filtered_card(child);
>> + igt_assert_f(fd > 0, "cannot open gpu-%d, errno=%d\n",
>> + child, errno);
>> + igt_assert(is_xe_device(fd));
>> +
>> + test_basic_sessions(fd, 0, 4, true);
>> + close(fd);
>> + }
>> + igt_waitchildren();
>> + }
>> +
>> + igt_fixture {
>> + for (int i = 0; i < gpu_count; i++) {
>> + fd = drm_open_filtered_card(i);
>> + xe_eudebug_enable(fd, multigpu_was_enabled[i]);
>> + close(fd);
>> + }
>> + free(multigpu_was_enabled);
>> + }
>> + }
>> +}
>> diff --git a/tests/meson.build b/tests/meson.build
>> index 00556c9d6..0f996fdc8 100644
>> --- a/tests/meson.build
>> +++ b/tests/meson.build
>> @@ -318,6 +318,14 @@ intel_xe_progs = [
>> 'xe_sysfs_scheduler',
>> ]
>>
>> +intel_xe_eudebug_progs = [
>> + 'xe_eudebug',
>> +]
>> +
>> +if build_xe_eudebug
>> + intel_xe_progs += intel_xe_eudebug_progs
>> +endif
>> +
>> chamelium_progs = [
>> 'kms_chamelium_audio',
>> 'kms_chamelium_color',
>> --
>> 2.34.1
>>
* Re: [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-12 8:04 ` Zbigniew Kempczyński
2024-09-17 14:44 ` Manszewski, Christoph
@ 2024-09-17 16:00 ` Manszewski, Christoph
2024-09-18 4:47 ` Zbigniew Kempczyński
1 sibling, 1 reply; 50+ messages in thread
From: Manszewski, Christoph @ 2024-09-17 16:00 UTC (permalink / raw)
To: Zbigniew Kempczyński
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala,
Jonathan Cavitt
Hi Zbigniew,
On 12.09.2024 10:04, Zbigniew Kempczyński wrote:
> On Thu, Sep 05, 2024 at 11:28:08AM +0200, Christoph Manszewski wrote:
>> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>>
>> For typical debugging under gdb one can identify two main use cases:
>> accessing and manipulating resources created by the application and
>> manipulating thread execution (interrupting and setting breakpoints).
>>
>> This test adds coverage for the former by checking that:
>> - the debugger reports the expected events for Xe resources created
>> by the debugged client,
>> - the debugger is able to read and write the vm of the debugged client.
>
> Hi all.
>
> First of all, on Mika's series (v2) sent upstream on the xe ML I've noticed
> some tests are crashing the kernel. From this test's perspective this is
> good, it seems the test is doing what it should do. I observe reboots on
> vm-access-related subtests: basic-vm-access(-userptr).
>
>>
>> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
>> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
>> Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
>> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>> ---
>> docs/testplan/meson.build | 13 +-
>> meson_options.txt | 2 +-
>> tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
>> tests/meson.build | 8 +
>> 4 files changed, 2737 insertions(+), 2 deletions(-)
>> create mode 100644 tests/intel/xe_eudebug.c
>>
<cut>
>> +
>> +static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
>> + struct vm_bind_clear_priv *priv = d->ptr;
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
>> + if (random() & 1) {
>
> What's that random() doing here?
Apparently this test is supposed to catch a race, where either the
client or the debugger sees some leftover in a new bind. To increase the
chance of this happening, we need to skip the debugger's vm_open on a
preceding iteration, since otherwise we slow down/defer the subsequent
bind.
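For illustration, the effect of that `random() & 1` gate can be sketched
standalone (count_checked is a hypothetical helper, not part of the test;
it only shows that roughly half of the bind events take the debugger-side
check):

```c
#include <assert.h>
#include <stdlib.h>

/* The low bit of random() is ~uniform over {0, 1}, so the expensive
 * debugger-side vm_open check runs on roughly half of the bind events;
 * the skipped half keeps the client's next bind fast, which widens the
 * race window the test tries to hit. */
static unsigned int count_checked(unsigned int n, unsigned int seed)
{
	unsigned int checked = 0;

	srandom(seed);
	for (unsigned int i = 0; i < n; i++)
		if (random() & 1)
			checked++;

	return checked;
}
```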
Thanks,
Christoph
>
>> + struct drm_xe_eudebug_vm_open vo = { 0, };
>> + uint32_t v = 0xc1c1c1c1;
>> +
>> + struct drm_xe_eudebug_event_vm_bind *eb;
>> + int fd, delta, r;
>> +
>> + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
>> + eo->vm_bind_ref_seqno, eo->addr, eo->range);
>> +
>> + eb = (struct drm_xe_eudebug_event_vm_bind *)
>> + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
>> + igt_assert(eb);
>> +
>> + vo.client_handle = eb->client_handle;
>> + vo.vm_handle = eb->vm_handle;
>> +
>> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
>> + igt_assert_lte(0, fd);
>> +
>> + delta = (random() % eo->range) & -4;
>> + r = pread(fd, &v, sizeof(v), eo->addr + delta);
>> + igt_assert_eq(r, sizeof(v));
>> + igt_assert_eq_u32(v, 0);
>> +
>> + close(fd);
>> + }
>> + priv->bind_count++;
>> + }
>> +
>> + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
>> + priv->unbind_count++;
>> +}
>> +
>> +static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
>> + struct drm_xe_eudebug_event *e)
>> +{
>> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
>> +
>> + xe_eudebug_ack_ufence(d->fd, ef);
>> +}
>> +
>> +/**
>> + * SUBTEST: vm-bind-clear
>> + * Description:
>> + * Check that fresh buffers we vm_bind into the ppGTT are always clear.
>> + */
>> +static void test_vm_bind_clear(int fd)
>> +{
>> + struct vm_bind_clear_priv *priv;
>> + struct xe_eudebug_session *s;
>> +
>> + priv = vm_bind_clear_priv_create();
>> + s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
>> +
>> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
>> + vm_bind_clear_test_trigger);
>> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
>> + vm_bind_clear_ack_trigger);
>> +
>> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
>> + xe_eudebug_debugger_start_worker(s->debugger);
>> + xe_eudebug_client_start(s->client);
>> +
>> + xe_eudebug_client_wait_done(s->client);
>> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>> +
>> + igt_assert_eq(priv->bind_count, priv->unbind_count);
>> + igt_assert_eq(priv->sum * 2, priv->bind_count);
>> +
>> + xe_eudebug_session_destroy(s);
>> + vm_bind_clear_priv_destroy(priv);
>> +}
>> +
<cut>
* Re: [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation
2024-09-17 16:00 ` Manszewski, Christoph
@ 2024-09-18 4:47 ` Zbigniew Kempczyński
0 siblings, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-18 4:47 UTC (permalink / raw)
To: Manszewski, Christoph
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun, Mika Kuoppala,
Jonathan Cavitt
On Tue, Sep 17, 2024 at 06:00:51PM +0200, Manszewski, Christoph wrote:
> Hi Zbigniew,
>
> On 12.09.2024 10:04, Zbigniew Kempczyński wrote:
> > On Thu, Sep 05, 2024 at 11:28:08AM +0200, Christoph Manszewski wrote:
> > > From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> > >
> > > For typical debugging under gdb one can identify two main use cases:
> > > accessing and manipulating resources created by the application and
> > > manipulating thread execution (interrupting and setting breakpoints).
> > >
> > > This test adds coverage for the former by checking that:
> > > - the debugger reports the expected events for Xe resources created
> > > by the debugged client,
> > > - the debugger is able to read and write the vm of the debugged client.
> >
> > Hi all.
> >
> > First of all, on Mika's series (v2) sent upstream on the xe ML I've noticed
> > some tests are crashing the kernel. From this test's perspective this is
> > good, it seems the test is doing what it should do. I observe reboots on
> > vm-access-related subtests: basic-vm-access(-userptr).
> >
> > >
> > > Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > > Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> > > Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> > > Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> > > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> > > Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > ---
> > > docs/testplan/meson.build | 13 +-
> > > meson_options.txt | 2 +-
> > > tests/intel/xe_eudebug.c | 2716 +++++++++++++++++++++++++++++++++++++
> > > tests/meson.build | 8 +
> > > 4 files changed, 2737 insertions(+), 2 deletions(-)
> > > create mode 100644 tests/intel/xe_eudebug.c
> > >
>
> <cut>
>
> > > +
> > > +static void vm_bind_clear_test_trigger(struct xe_eudebug_debugger *d,
> > > + struct drm_xe_eudebug_event *e)
> > > +{
> > > + struct drm_xe_eudebug_event_vm_bind_op *eo = (void *)e;
> > > + struct vm_bind_clear_priv *priv = d->ptr;
> > > +
> > > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > > + if (random() & 1) {
> >
> > What's that random() doing here?
>
> Apparently this test is supposed to catch a race, where either the client or
> the debugger sees some leftover in a new bind. To increase the chance of this
> happening, we need to skip the debugger's vm_open on a preceding iteration,
> since otherwise we slow down/defer the subsequent bind.
I would add some igt_debug() to record the context in which the "random"
part of the test was executed. This will make reproducing issues easier.
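A minimal sketch of the kind of context line that could be emitted (the
helper name and message format here are assumptions; the test itself would
route this through igt_debug() rather than snprintf()):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Record whether the randomized debugger-side check ran for a vm bind
 * op, so a later failure can be tied to the iterations that actually
 * exercised vm_open. */
static int format_bind_check_msg(char *buf, size_t len, int checked,
				 unsigned long long addr,
				 unsigned long long range)
{
	return snprintf(buf, len,
			"vm bind op addr 0x%llx range 0x%llx: debugger check %s",
			addr, range, checked ? "performed" : "skipped");
}
```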
--
Zbigniew
>
> Thanks,
> Christoph
> >
> > > + struct drm_xe_eudebug_vm_open vo = { 0, };
> > > + uint32_t v = 0xc1c1c1c1;
> > > +
> > > + struct drm_xe_eudebug_event_vm_bind *eb;
> > > + int fd, delta, r;
> > > +
> > > + igt_debug("vm bind op event received with ref %lld, addr 0x%llx, range 0x%llx\n",
> > > + eo->vm_bind_ref_seqno, eo->addr, eo->range);
> > > +
> > > + eb = (struct drm_xe_eudebug_event_vm_bind *)
> > > + xe_eudebug_event_log_find_seqno(d->log, eo->vm_bind_ref_seqno);
> > > + igt_assert(eb);
> > > +
> > > + vo.client_handle = eb->client_handle;
> > > + vo.vm_handle = eb->vm_handle;
> > > +
> > > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > > + igt_assert_lte(0, fd);
> > > +
> > > + delta = (random() % eo->range) & -4;
> > > + r = pread(fd, &v, sizeof(v), eo->addr + delta);
> > > + igt_assert_eq(r, sizeof(v));
> > > + igt_assert_eq_u32(v, 0);
> > > +
> > > + close(fd);
> > > + }
> > > + priv->bind_count++;
> > > + }
> > > +
> > > + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> > > + priv->unbind_count++;
> > > +}
> > > +
> > > +static void vm_bind_clear_ack_trigger(struct xe_eudebug_debugger *d,
> > > + struct drm_xe_eudebug_event *e)
> > > +{
> > > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > > +
> > > + xe_eudebug_ack_ufence(d->fd, ef);
> > > +}
> > > +
> > > +/**
> > > + * SUBTEST: vm-bind-clear
> > > + * Description:
> > > + * Check that fresh buffers we vm_bind into the ppGTT are always clear.
> > > + */
> > > +static void test_vm_bind_clear(int fd)
> > > +{
> > > + struct vm_bind_clear_priv *priv;
> > > + struct xe_eudebug_session *s;
> > > +
> > > + priv = vm_bind_clear_priv_create();
> > > + s = xe_eudebug_session_create(fd, vm_bind_clear_client, 0, priv);
> > > +
> > > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_OP,
> > > + vm_bind_clear_test_trigger);
> > > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > > + vm_bind_clear_ack_trigger);
> > > +
> > > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > > + xe_eudebug_debugger_start_worker(s->debugger);
> > > + xe_eudebug_client_start(s->client);
> > > +
> > > + xe_eudebug_client_wait_done(s->client);
> > > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > > +
> > > + igt_assert_eq(priv->bind_count, priv->unbind_count);
> > > + igt_assert_eq(priv->sum * 2, priv->bind_count);
> > > +
> > > + xe_eudebug_session_destroy(s);
> > > + vm_bind_clear_priv_destroy(priv);
> > > +}
> > > +
>
> <cut>
* [PATCH i-g-t v6 14/17] lib/intel_batchbuffer: Add support for long-running mode execution
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (12 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 13/17] tests/xe_eudebug: Test eudebug resource tracking and manipulation Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 9:28 ` [PATCH i-g-t v6 15/17] tests/xe_exec_sip_eudebug: Port tests for shaders and sip Christoph Manszewski
` (5 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
From: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
To execute in LR (long-running) mode, apart from setting the
'DRM_XE_VM_CREATE_FLAG_LR_MODE' flag during vm creation, it is required
to use 'DRM_XE_SYNC_TYPE_USER_FENCE' syncs with the vm_bind and xe_exec
ioctls.
Make it possible to execute batch buffers via intel_bb_exec() in LR mode
by setting the 'lr_mode' field with the supplied setter.
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
lib/intel_batchbuffer.c | 149 ++++++++++++++++++++++++++++++++++++++--
lib/intel_batchbuffer.h | 17 +++++
2 files changed, 162 insertions(+), 4 deletions(-)
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index f91091bc4..f3df9f965 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -986,6 +986,7 @@ __intel_bb_create(int fd, uint32_t ctx, uint32_t vm, const intel_ctx_cfg_t *cfg,
igt_assert(ibb->batch);
ibb->ptr = ibb->batch;
ibb->fence = -1;
+ ibb->user_fence_offset = -1;
/* Cache context configuration */
if (cfg) {
@@ -1449,7 +1450,7 @@ int intel_bb_sync(struct intel_bb *ibb)
{
int ret;
- if (ibb->fence < 0 && !ibb->engine_syncobj)
+ if (ibb->fence < 0 && !ibb->engine_syncobj && ibb->user_fence_offset < 0)
return 0;
if (ibb->fence >= 0) {
@@ -1458,10 +1459,28 @@ int intel_bb_sync(struct intel_bb *ibb)
close(ibb->fence);
ibb->fence = -1;
}
- } else {
- igt_assert_neq(ibb->engine_syncobj, 0);
+ } else if (ibb->engine_syncobj) {
ret = syncobj_wait_err(ibb->fd, &ibb->engine_syncobj,
1, INT64_MAX, 0);
+ } else {
+ int64_t timeout = -1;
+ uint64_t *sync_data;
+ void *map;
+
+ igt_assert(ibb->user_fence_offset >= 0);
+
+ map = xe_bo_map(ibb->fd, ibb->handle, ibb->size);
+ sync_data = (void *)((uint8_t *)map + ibb->user_fence_offset);
+
+ ret = __xe_wait_ufence(ibb->fd, sync_data, ibb->user_fence_value,
+ ibb->ctx ?: ibb->engine_id, &timeout);
+
+ gem_munmap(map, ibb->size);
+ ibb->user_fence_offset = -1;
+
+ /* Workload finished forcibly, but finished nonetheless */
+ if (ret == -EIO)
+ ret = 0;
}
return ret;
@@ -2435,6 +2454,125 @@ __xe_bb_exec(struct intel_bb *ibb, uint64_t flags, bool sync)
return 0;
}
+static int
+__xe_lr_bb_exec(struct intel_bb *ibb, uint64_t flags, bool sync)
+{
+ uint32_t engine = flags & (I915_EXEC_BSD_MASK | I915_EXEC_RING_MASK);
+ uint32_t engine_id;
+#define USER_FENCE_VALUE 0xdeadbeefdeadbeefull
+ /*
+ * In LR mode both the vm_bind and xe_exec ioctls require
+ * DRM_XE_SYNC_TYPE_USER_FENCE type syncs.
+ */
+ struct drm_xe_sync syncs[2] = {
+ { .type = DRM_XE_SYNC_TYPE_USER_FENCE,
+ .flags = DRM_XE_SYNC_FLAG_SIGNAL,
+ .timeline_value = USER_FENCE_VALUE
+ },
+ { .type = DRM_XE_SYNC_TYPE_USER_FENCE,
+ .flags = DRM_XE_SYNC_FLAG_SIGNAL,
+ .timeline_value = USER_FENCE_VALUE
+ },
+ };
+ struct drm_xe_vm_bind_op *bind_ops;
+ struct {
+ uint64_t vm_sync;
+ uint64_t exec_sync;
+ } *sync_data;
+ uint32_t sync_offset;
+ uint64_t ibb_addr, vm_sync_addr, exec_sync_addr;
+ void *map;
+
+ igt_assert_eq(ibb->num_relocs, 0);
+ igt_assert_eq(ibb->xe_bound, false);
+
+ if (ibb->ctx) {
+ engine_id = ibb->ctx;
+ } else if (ibb->last_engine != engine) {
+ struct drm_xe_engine_class_instance inst = { };
+
+ inst.engine_instance =
+ (flags & I915_EXEC_BSD_MASK) >> I915_EXEC_BSD_SHIFT;
+
+ switch (flags & I915_EXEC_RING_MASK) {
+ case I915_EXEC_DEFAULT:
+ case I915_EXEC_BLT:
+ inst.engine_class = DRM_XE_ENGINE_CLASS_COPY;
+ break;
+ case I915_EXEC_BSD:
+ inst.engine_class = DRM_XE_ENGINE_CLASS_VIDEO_DECODE;
+ break;
+ case I915_EXEC_RENDER:
+ if (xe_has_engine_class(ibb->fd, DRM_XE_ENGINE_CLASS_RENDER))
+ inst.engine_class = DRM_XE_ENGINE_CLASS_RENDER;
+ else
+ inst.engine_class = DRM_XE_ENGINE_CLASS_COMPUTE;
+ break;
+ case I915_EXEC_VEBOX:
+ inst.engine_class = DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE;
+ break;
+ default:
+ igt_assert_f(false, "Unknown engine: %x", (uint32_t)flags);
+ }
+ igt_debug("Run on %s\n", xe_engine_class_string(inst.engine_class));
+
+ if (ibb->engine_id)
+ xe_exec_queue_destroy(ibb->fd, ibb->engine_id);
+
+ ibb->engine_id = engine_id =
+ xe_exec_queue_create(ibb->fd, ibb->vm_id, &inst, 0);
+ } else {
+ engine_id = ibb->engine_id;
+ }
+ ibb->last_engine = engine;
+
+ /* Add user fence for sync: sync.addr must be quadword-aligned */
+ intel_bb_ptr_align(ibb, 8);
+ sync_offset = intel_bb_offset(ibb);
+ intel_bb_ptr_add(ibb, sizeof(*sync_data));
+
+ map = xe_bo_map(ibb->fd, ibb->handle, ibb->size);
+ memcpy(map, ibb->batch, ibb->size);
+
+ sync_data = (void *)((uint8_t *)map + sync_offset);
+ /* vm_sync userfence userspace address. */
+ vm_sync_addr = to_user_pointer(&sync_data->vm_sync);
+ ibb_addr = ibb->batch_offset;
+ /* exec_sync userfence ppgtt address. */
+ exec_sync_addr = ibb_addr + sync_offset + sizeof(uint64_t);
+ syncs[0].addr = vm_sync_addr;
+ syncs[1].addr = exec_sync_addr;
+
+ if (ibb->num_objects > 1) {
+ bind_ops = xe_alloc_bind_ops(ibb, DRM_XE_VM_BIND_OP_MAP, 0, 0);
+ xe_vm_bind_array(ibb->fd, ibb->vm_id, 0, bind_ops,
+ ibb->num_objects, syncs, 1);
+ free(bind_ops);
+ } else {
+ igt_debug("bind: MAP\n");
+ igt_debug(" handle: %u, offset: %llx, size: %llx\n",
+ ibb->handle, (long long)ibb->batch_offset,
+ (long long)ibb->size);
+ xe_vm_bind_async(ibb->fd, ibb->vm_id, 0, ibb->handle, 0,
+ ibb->batch_offset, ibb->size, syncs, 1);
+ }
+
+ /* use default vm_bind_exec_queue */
+ xe_wait_ufence(ibb->fd, &sync_data->vm_sync, USER_FENCE_VALUE, 0, -1);
+ gem_munmap(map, ibb->size);
+
+ ibb->xe_bound = true;
+ ibb->user_fence_value = USER_FENCE_VALUE;
+ ibb->user_fence_offset = sync_offset + sizeof(uint64_t);
+
+ xe_exec_sync(ibb->fd, engine_id, ibb->batch_offset, &syncs[1], 1);
+
+ if (sync)
+ intel_bb_sync(ibb);
+
+ return 0;
+}
+
/*
* __intel_bb_exec:
* @ibb: pointer to intel_bb
@@ -2536,7 +2674,10 @@ void intel_bb_exec(struct intel_bb *ibb, uint32_t end_offset,
if (ibb->driver == INTEL_DRIVER_I915)
igt_assert_eq(__intel_bb_exec(ibb, end_offset, flags, sync), 0);
else
- igt_assert_eq(__xe_bb_exec(ibb, flags, sync), 0);
+ if (intel_bb_get_lr_mode(ibb))
+ igt_assert_eq(__xe_lr_bb_exec(ibb, flags, sync), 0);
+ else
+ igt_assert_eq(__xe_bb_exec(ibb, flags, sync), 0);
}
/**
diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
index 9e3430e2a..a63b31a33 100644
--- a/lib/intel_batchbuffer.h
+++ b/lib/intel_batchbuffer.h
@@ -303,6 +303,11 @@ struct intel_bb {
* is not thread-safe.
*/
int32_t refcount;
+
+ /* long running mode */
+ bool lr_mode;
+ int64_t user_fence_offset;
+ uint64_t user_fence_value;
};
struct intel_bb *
@@ -423,6 +428,18 @@ static inline uint32_t intel_bb_pxp_appid(struct intel_bb *ibb)
return ibb->pxp.appid;
}
+static inline void intel_bb_set_lr_mode(struct intel_bb *ibb, bool lr_mode)
+{
+ igt_assert(ibb);
+ ibb->lr_mode = lr_mode;
+}
+
+static inline bool intel_bb_get_lr_mode(struct intel_bb *ibb)
+{
+ igt_assert(ibb);
+ return ibb->lr_mode;
+}
+
struct drm_i915_gem_exec_object2 *
intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
uint64_t offset, uint64_t alignment, bool write);
--
2.34.1
* [PATCH i-g-t v6 15/17] tests/xe_exec_sip_eudebug: Port tests for shaders and sip
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (13 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 14/17] lib/intel_batchbuffer: Add support for long-running mode execution Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 9:28 ` [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU Christoph Manszewski
` (4 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski, Mika Kuoppala, Karolina Stolarek
SIP is the System Instruction Pointer, a routine which the hardware will
jump into when a defined exception event occurs and the pipeline setup
has included a SIP program.
Add the xe_exec_sip_eudebug test, which checks SIP interaction with
hardware debugging capabilities such as breakpoints, and with software
debugging such as attention handling by the KMD.
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
tests/intel/xe_exec_sip_eudebug.c | 355 ++++++++++++++++++++++++++++++
tests/meson.build | 1 +
2 files changed, 356 insertions(+)
create mode 100644 tests/intel/xe_exec_sip_eudebug.c
diff --git a/tests/intel/xe_exec_sip_eudebug.c b/tests/intel/xe_exec_sip_eudebug.c
new file mode 100644
index 000000000..d056a14a2
--- /dev/null
+++ b/tests/intel/xe_exec_sip_eudebug.c
@@ -0,0 +1,355 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+/**
+ * TEST: Tests for GPGPU shader and system routine (SIP) execution related to EU debug
+ * Category: Core
+ * Mega feature: EUdebug
+ * Sub-category: EUdebug tests
+ * Functionality: EU debugger SIP interaction
+ * Test category: functionality test
+ */
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <stdio.h>
+
+#include "gpgpu_shader.h"
+#include "igt.h"
+#include "igt_sysfs.h"
+#include "xe/xe_eudebug.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+
+#define WIDTH 64
+#define HEIGHT 64
+
+#define COLOR_C4 0xc4
+
+#define SHADER_CANARY 0x01010101
+#define SIP_CANARY 0x02020202
+
+enum shader_type {
+ SHADER_BREAKPOINT,
+ SHADER_WAIT,
+ SHADER_WRITE,
+};
+
+enum sip_type {
+ SIP_HEAVY,
+ SIP_NULL,
+ SIP_WAIT,
+ SIP_WRITE,
+};
+
+#define F_SUBMIT_TWICE (1 << 0)
+
+static struct intel_buf *
+create_fill_buf(int fd, int width, int height, uint8_t color)
+{
+ struct intel_buf *buf;
+ uint8_t *ptr;
+
+ buf = calloc(1, sizeof(*buf));
+ igt_assert(buf);
+
+ intel_buf_init(buf_ops_create(fd), buf, width / 4, height, 32, 0,
+ I915_TILING_NONE, 0);
+
+ ptr = xe_bo_map(fd, buf->handle, buf->surface[0].size);
+ memset(ptr, color, buf->surface[0].size);
+ munmap(ptr, buf->surface[0].size);
+
+ return buf;
+}
+
+static struct gpgpu_shader *get_shader(int fd, enum shader_type shader_type)
+{
+ static struct gpgpu_shader *shader;
+
+ shader = gpgpu_shader_create(fd);
+ gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
+
+ switch (shader_type) {
+ case SHADER_WAIT:
+ gpgpu_shader__wait(shader);
+ break;
+ case SHADER_WRITE:
+ break;
+ case SHADER_BREAKPOINT:
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__breakpoint(shader);
+ break;
+ }
+
+ gpgpu_shader__eot(shader);
+
+ return shader;
+}
+
+static struct gpgpu_shader *get_sip(int fd, enum sip_type sip_type, enum shader_type shader_type,
+ unsigned int y_offset)
+{
+ static struct gpgpu_shader *sip;
+
+ if (sip_type == SIP_NULL)
+ return NULL;
+
+ sip = gpgpu_shader_create(fd);
+ gpgpu_shader__write_dword(sip, SIP_CANARY, y_offset);
+
+ switch (sip_type) {
+ case SIP_WAIT:
+ gpgpu_shader__wait(sip);
+ break;
+ case SIP_HEAVY:
+ /* Depending on the generation, the production sip
+ * executes between 145 and 157 instructions.
+ * It performs at most 45 data port writes and 5 data port reads.
+ * Make sure our heavy sip is at least twice as heavy as the production one.
+ */
+ gpgpu_shader__loop_begin(sip, 0);
+ gpgpu_shader__write_dword(sip, 0xdeadbeef, y_offset);
+ gpgpu_shader__write_dword(sip, SIP_CANARY, y_offset);
+ gpgpu_shader__loop_end(sip, 0, 45);
+
+ gpgpu_shader__loop_begin(sip, 1);
+ gpgpu_shader__jump_neq(sip, 1, y_offset, SIP_CANARY);
+ gpgpu_shader__loop_end(sip, 1, 10);
+
+ gpgpu_shader__wait(sip);
+ break;
+ default:
+ break;
+ }
+
+ gpgpu_shader__end_system_routine(sip, shader_type == SHADER_BREAKPOINT);
+
+ return sip;
+}
+
+static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, enum shader_type shader_type,
+ enum sip_type sip_type, unsigned int threads, unsigned int width,
+ unsigned int height)
+{
+ struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
+ struct gpgpu_shader *sip = get_sip(fd, sip_type, shader_type, height / 2);
+ struct gpgpu_shader *shader = get_shader(fd, shader_type);
+
+ gpgpu_shader_exec(ibb, buf, 1, threads, shader, sip, 0, 0);
+
+ if (sip)
+ gpgpu_shader_destroy(sip);
+ gpgpu_shader_destroy(shader);
+
+ return buf->handle;
+}
+
+static void check_fill_buf(uint8_t *ptr, const int width, const int x,
+ const int y, const uint8_t color)
+{
+ const uint8_t val = ptr[y * width + x];
+
+ igt_assert_f(val == color,
+ "Expected 0x%02x, found 0x%02x at (%d,%d)\n",
+ color, val, x, y);
+}
+
+static void check_buf(int fd, uint32_t handle, int width, int height,
+ enum sip_type sip_type, uint8_t poison_c)
+{
+ unsigned int sz = ALIGN(width * height, 4096);
+ int thread_count = 0, sip_count = 0;
+ uint32_t *ptr;
+ int i, j;
+
+ ptr = xe_bo_mmap_ext(fd, handle, sz, PROT_READ);
+
+ for (i = 0, j = 0; j < height / 2; ++j) {
+ if (ptr[j * width / 4] == SHADER_CANARY) {
+ ++thread_count;
+ i = 4;
+ }
+
+ for (; i < width; i++)
+ check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
+
+ i = 0;
+ }
+
+ for (i = 0, j = height / 2; j < height; ++j) {
+ if (ptr[j * width / 4] == SIP_CANARY) {
+ ++sip_count;
+ i = 4;
+ }
+
+ for (; i < width; i++)
+ check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
+
+ i = 0;
+ }
+
+ igt_assert(thread_count);
+ if (sip_type != SIP_NULL && xe_eudebug_debugger_available(fd))
+ igt_assert_f(thread_count == sip_count,
+ "Thread and SIP count mismatch, %d != %d\n",
+ thread_count, sip_count);
+ else
+ igt_assert(sip_count == 0);
+
+ munmap(ptr, sz);
+}
+
+static uint64_t
+xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
+{
+ int engine_fd = -1;
+ uint64_t ret;
+
+ engine_fd = xe_sysfs_engine_open(fd, eci->gt_id, eci->engine_class);
+ ret = igt_sysfs_get_u64(engine_fd, "job_timeout_ms");
+ close(engine_fd);
+
+ return ret;
+}
+
+/**
+ * SUBTEST: wait-writesip-nodebug
+ * Description: Verify that we don't enter SIP after wait with debugging disabled.
+ *
+ * SUBTEST: breakpoint-writesip-nodebug
+ * Description: Verify that we don't enter SIP after hitting a breakpoint in the shader
+ * when debugging is disabled.
+ *
+ * SUBTEST: breakpoint-writesip
+ * Description: Test that we enter SIP after hitting a breakpoint in the shader.
+ *
+ * SUBTEST: breakpoint-writesip-twice
+ * Description: Test twice that we enter SIP after hitting a breakpoint in the shader.
+ *
+ * SUBTEST: breakpoint-waitsip
+ * Description: Test that we reset after seeing the attention without the debugger.
+ *
+ * SUBTEST: breakpoint-waitsip-heavy
+ * Description:
+ * Test that we reset after seeing the attention from a heavy SIP, which resembles
+ * the production one, without the debugger.
+ */
+static void test_sip(enum shader_type shader_type, enum sip_type sip_type,
+ struct drm_xe_engine_class_instance *eci, uint32_t flags)
+{
+ unsigned int threads = 512;
+ unsigned int height = max_t(threads, HEIGHT, threads * 2);
+ unsigned int width = WIDTH;
+ struct drm_xe_ext_set_property ext = {
+ .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
+ .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
+ .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
+ };
+ struct timespec ts = { };
+ uint32_t exec_queue_id, handle, vm_id;
+ bool debugger_enabled;
+ struct intel_bb *ibb;
+ uint64_t timeout;
+ int done, fd;
+
+ igt_debug("Using %s\n", xe_engine_class_string(eci->engine_class));
+
+ fd = drm_open_driver(DRIVER_XE);
+ xe_device_get(fd);
+
+ debugger_enabled = xe_eudebug_debugger_available(fd);
+ vm_id = xe_vm_create(fd, debugger_enabled ? DRM_XE_VM_CREATE_FLAG_LR_MODE : 0, 0);
+
+ /* Get the job timeout and add 4s to ensure the timeout is processed in the subtest. */
+ timeout = xe_sysfs_get_job_timeout_ms(fd, eci) + 4ull * MSEC_PER_SEC;
+ timeout *= NSEC_PER_MSEC;
+ timeout *= igt_run_in_simulation() ? 10 : 1;
+
+ exec_queue_id = xe_exec_queue_create(fd, vm_id, eci,
+ debugger_enabled ? to_user_pointer(&ext) : 0);
+
+ done = flags & F_SUBMIT_TWICE ? 2 : 1;
+ do {
+ ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
+ intel_bb_set_lr_mode(ibb, debugger_enabled);
+
+ igt_nsec_elapsed(&ts);
+ handle = gpgpu_shader(fd, ibb, shader_type, sip_type, threads, width, height);
+
+ intel_bb_sync(ibb);
+ igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
+
+ check_buf(fd, handle, width, height, sip_type, COLOR_C4);
+
+ gem_close(fd, handle);
+ intel_bb_destroy(ibb);
+ } while (--done);
+
+ xe_exec_queue_destroy(fd, exec_queue_id);
+ xe_vm_destroy(fd, vm_id);
+ xe_device_put(fd);
+ close(fd);
+}
+
+#define test_render_and_compute(t, __fd, __eci) \
+ igt_subtest_with_dynamic(t) \
+ xe_for_each_engine(__fd, __eci) \
+ if (__eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
+ __eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) \
+ igt_dynamic_f("%s%d", xe_engine_class_string(__eci->engine_class), \
+ __eci->engine_instance)
+
+igt_main
+{
+ struct drm_xe_engine_class_instance *eci;
+ bool was_enabled;
+ int fd;
+
+ igt_fixture
+ fd = drm_open_driver(DRIVER_XE);
+
+ /* Debugger disabled (TD_CTL not set) */
+ igt_subtest_group {
+ igt_fixture {
+ was_enabled = xe_eudebug_enable(fd, false);
+ igt_require(!xe_eudebug_debugger_available(fd));
+ }
+
+ test_render_and_compute("wait-writesip-nodebug", fd, eci)
+ test_sip(SHADER_WAIT, SIP_WRITE, eci, 0);
+
+ test_render_and_compute("breakpoint-writesip-nodebug", fd, eci)
+ test_sip(SHADER_BREAKPOINT, SIP_WRITE, eci, 0);
+
+ igt_fixture
+ xe_eudebug_enable(fd, was_enabled);
+ }
+
+ /* Debugger enabled (TD_CTL set) */
+ igt_subtest_group {
+ igt_fixture {
+ was_enabled = xe_eudebug_enable(fd, true);
+ }
+
+ test_render_and_compute("breakpoint-writesip", fd, eci)
+ test_sip(SHADER_BREAKPOINT, SIP_WRITE, eci, 0);
+
+ test_render_and_compute("breakpoint-writesip-twice", fd, eci)
+ test_sip(SHADER_BREAKPOINT, SIP_WRITE, eci, F_SUBMIT_TWICE);
+
+ test_render_and_compute("breakpoint-waitsip", fd, eci)
+ test_sip(SHADER_BREAKPOINT, SIP_WAIT, eci, 0);
+
+ test_render_and_compute("breakpoint-waitsip-heavy", fd, eci)
+ test_sip(SHADER_BREAKPOINT, SIP_HEAVY, eci, 0);
+
+ igt_fixture
+ xe_eudebug_enable(fd, was_enabled);
+ }
+
+ igt_fixture
+ drm_close_driver(fd);
+}
diff --git a/tests/meson.build b/tests/meson.build
index 0f996fdc8..43e8516f4 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -320,6 +320,7 @@ intel_xe_progs = [
intel_xe_eudebug_progs = [
'xe_eudebug',
+ 'xe_exec_sip_eudebug',
]
if build_xe_eudebug
--
2.34.1
* [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (14 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 15/17] tests/xe_exec_sip_eudebug: Port tests for shaders and sip Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-13 11:39 ` Zbigniew Kempczyński
2024-09-05 9:28 ` [PATCH i-g-t v6 17/17] tests/xe_live_ktest: Add xe_eudebug live test Christoph Manszewski
` (3 subsequent siblings)
19 siblings, 1 reply; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski, Karolina Stolarek
From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
For typical debugging under gdb one can specify two main use cases:
accessing and manipulating resources created by the application, and
manipulating thread execution (interrupting and setting breakpoints).
This test adds coverage for the latter by checking that:
- EU workloads that hit an instruction with the breakpoint bit set will
halt execution and the debugger will report this via attention events,
- the debugger is able to interrupt workload execution by issuing an
'interrupt_all' ioctl call,
- the debugger is able to resume selected workloads that are stopped.
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
Signed-off-by: Kolanupaka Naveena <kolanupaka.naveena@intel.com>
---
tests/intel/xe_eudebug_online.c | 2254 +++++++++++++++++++++++++++++++
tests/meson.build | 1 +
2 files changed, 2255 insertions(+)
create mode 100644 tests/intel/xe_eudebug_online.c
diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/xe_eudebug_online.c
new file mode 100644
index 000000000..20f8e3601
--- /dev/null
+++ b/tests/intel/xe_eudebug_online.c
@@ -0,0 +1,2254 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+/**
+ * TEST: Tests for eudebug online functionality
+ * Category: Core
+ * Mega feature: EUdebug
+ * Sub-category: EUdebug tests
+ * Functionality: eu kernel debug
+ * Test category: functionality test
+ */
+
+#include "xe/xe_eudebug.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "igt.h"
+#include "intel_pat.h"
+#include "intel_mocs.h"
+#include "gpgpu_shader.h"
+
+#define SHADER_NOP (0 << 0)
+#define SHADER_BREAKPOINT (1 << 0)
+#define SHADER_LOOP (1 << 1)
+#define SHADER_SINGLE_STEP (1 << 2)
+#define SIP_SINGLE_STEP (1 << 3)
+#define DISABLE_DEBUG_MODE (1 << 4)
+#define SHADER_N_NOOP_BREAKPOINT (1 << 5)
+#define SHADER_CACHING_SRAM (1 << 6)
+#define SHADER_CACHING_VRAM (1 << 7)
+#define SHADER_MIN_THREADS (1 << 8)
+#define DO_NOT_EXPECT_CANARIES (1 << 9)
+#define TRIGGER_UFENCE_SET_BREAKPOINT (1 << 24)
+#define TRIGGER_RESUME_SINGLE_WALK (1 << 25)
+#define TRIGGER_RESUME_PARALLEL_WALK (1 << 26)
+#define TRIGGER_RECONNECT (1 << 27)
+#define TRIGGER_RESUME_SET_BP (1 << 28)
+#define TRIGGER_RESUME_DELAYED (1 << 29)
+#define TRIGGER_RESUME_DSS (1 << 30)
+#define TRIGGER_RESUME_ONE (1 << 31)
+
+#define DEBUGGER_REATTACHED 1
+
+#define SHADER_LOOP_N 3
+#define SINGLE_STEP_COUNT 16
+#define STEERING_SINGLE_STEP 0
+#define STEERING_CONTINUE 0x00c0ffee
+#define STEERING_END_LOOP 0xdeadca11
+
+#define CACHING_INIT_VALUE 0xcafe0000
+#define CACHING_POISON_VALUE 0xcafedead
+#define CACHING_VALUE(n) (CACHING_INIT_VALUE + (n))
+
+#define SHADER_CANARY 0x01010101
+
+#define WALKER_X_DIM 4
+#define WALKER_ALIGNMENT 16
+#define SIMD_SIZE 16
+
+#define STARTUP_TIMEOUT_MS 3000
+#define WORKLOAD_DELAY_US (5000 * 1000)
+
+#define PAGE_SIZE 4096
+
+struct dim_t {
+ uint32_t x;
+ uint32_t y;
+ uint32_t alignment;
+};
+
+static struct dim_t walker_dimensions(int threads)
+{
+ uint32_t x_dim = min_t(x_dim, threads, WALKER_X_DIM);
+ struct dim_t ret = {
+ .x = x_dim,
+ .y = threads / x_dim,
+ .alignment = WALKER_ALIGNMENT
+ };
+
+ return ret;
+}
+
+static struct dim_t surface_dimensions(int threads)
+{
+ struct dim_t ret = walker_dimensions(threads);
+
+ ret.y = max_t(ret.y, threads / ret.x, 4);
+ ret.x *= SIMD_SIZE;
+ ret.alignment *= SIMD_SIZE;
+
+ return ret;
+}
+
+static uint32_t steering_offset(int threads)
+{
+ struct dim_t w = walker_dimensions(threads);
+
+ return ALIGN(w.x, w.alignment) * w.y * 4;
+}
+
+static struct intel_buf *create_uc_buf(int fd, int width, int height)
+{
+ struct intel_buf *buf;
+
+ buf = intel_buf_create_full(buf_ops_create(fd), 0, width / 4, height,
+ 32, 0, I915_TILING_NONE, 0, 0, 0,
+ vram_if_possible(fd, 0),
+ DEFAULT_PAT_INDEX, DEFAULT_MOCS_INDEX);
+
+ return buf;
+}
+
+static int get_number_of_threads(uint64_t flags)
+{
+ if (flags & SHADER_MIN_THREADS)
+ return 16;
+
+ if (flags & (TRIGGER_RESUME_ONE | TRIGGER_RESUME_SINGLE_WALK |
+ TRIGGER_RESUME_PARALLEL_WALK | SHADER_CACHING_SRAM | SHADER_CACHING_VRAM))
+ return 32;
+
+ return 512;
+}
+
+static int caching_get_instruction_count(int fd, uint32_t s_dim__x, int flags)
+{
+ uint64_t memory;
+
+ igt_assert((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM));
+
+ if (flags & SHADER_CACHING_SRAM)
+ memory = system_memory(fd);
+ else
+ memory = vram_memory(fd, 0);
+
+ /* each instruction writes to given y offset */
+ return (2 * xe_min_page_size(fd, memory)) / s_dim__x;
+}
+
+static struct gpgpu_shader *get_shader(int fd, const unsigned int flags)
+{
+ struct dim_t w_dim = walker_dimensions(get_number_of_threads(flags));
+ struct dim_t s_dim = surface_dimensions(get_number_of_threads(flags));
+ static struct gpgpu_shader *shader;
+
+ shader = gpgpu_shader_create(fd);
+
+ gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
+ if (flags & SHADER_BREAKPOINT) {
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__breakpoint(shader);
+ } else if (flags & SHADER_LOOP) {
+ gpgpu_shader__label(shader, 0);
+ gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
+ gpgpu_shader__jump_neq(shader, 0, w_dim.y, STEERING_END_LOOP);
+ gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
+ } else if (flags & SHADER_SINGLE_STEP) {
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__breakpoint(shader);
+ for (int i = 0; i < SINGLE_STEP_COUNT; i++)
+ gpgpu_shader__nop(shader);
+ } else if (flags & SHADER_N_NOOP_BREAKPOINT) {
+ for (int i = 0; i < SHADER_LOOP_N; i++) {
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__breakpoint(shader);
+ }
+ } else if ((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM)) {
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__breakpoint(shader);
+ for (int i = 0; i < caching_get_instruction_count(fd, s_dim.x, flags); i++)
+ gpgpu_shader__common_target_write_u32(shader, s_dim.y + i, CACHING_VALUE(i));
+ gpgpu_shader__nop(shader);
+ gpgpu_shader__breakpoint(shader);
+ }
+
+ gpgpu_shader__eot(shader);
+ return shader;
+}
+
+static struct gpgpu_shader *get_sip(int fd, const unsigned int flags)
+{
+ struct dim_t w_dim = walker_dimensions(get_number_of_threads(flags));
+ static struct gpgpu_shader *sip;
+
+ sip = gpgpu_shader_create(fd);
+ gpgpu_shader__write_aip(sip, 0);
+
+ gpgpu_shader__wait(sip);
+ if (flags & SIP_SINGLE_STEP)
+ gpgpu_shader__end_system_routine_step_if_eq(sip, w_dim.y, 0);
+ else
+ gpgpu_shader__end_system_routine(sip, true);
+ return sip;
+}
+
+static int count_set_bits(void *ptr, size_t size)
+{
+ uint8_t *p = ptr;
+ int count = 0;
+ int i, j;
+
+ for (i = 0; i < size; i++)
+ for (j = 0; j < 8; j++)
+ count += !!(p[i] & (1 << j));
+
+ return count;
+}
+
+static int count_canaries_eq(uint32_t *ptr, struct dim_t w_dim, uint32_t value)
+{
+ int count = 0;
+ int x, y;
+
+ for (x = 0; x < w_dim.x; x++)
+ for (y = 0; y < w_dim.y; y++)
+ if (READ_ONCE(ptr[x + ALIGN(w_dim.x, w_dim.alignment) * y]) == value)
+ count++;
+
+ return count;
+}
+
+static int count_canaries_neq(uint32_t *ptr, struct dim_t w_dim, uint32_t value)
+{
+ return w_dim.x * w_dim.y - count_canaries_eq(ptr, w_dim, value);
+}
+
+static const char *td_ctl_cmd_to_str(uint32_t cmd)
+{
+ switch (cmd) {
+ case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL:
+ return "interrupt all";
+ case DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED:
+ return "stopped";
+ case DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME:
+ return "resume";
+ default:
+ return "unknown command";
+ }
+}
+
+static int __eu_ctl(int debugfd, uint64_t client,
+ uint64_t exec_queue, uint64_t lrc,
+ uint8_t *bitmask, uint32_t *bitmask_size,
+ uint32_t cmd, uint64_t *seqno)
+{
+ struct drm_xe_eudebug_eu_control control = {
+ .client_handle = lower_32_bits(client),
+ .exec_queue_handle = exec_queue,
+ .lrc_handle = lrc,
+ .cmd = cmd,
+ .bitmask_ptr = to_user_pointer(bitmask),
+ };
+ int ret;
+
+ if (bitmask_size)
+ control.bitmask_size = *bitmask_size;
+
+ ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &control);
+
+ if (ret < 0)
+ return -errno;
+
+ igt_debug("EU CONTROL[%llu]: %s\n", control.seqno, td_ctl_cmd_to_str(cmd));
+
+ if (bitmask_size)
+ *bitmask_size = control.bitmask_size;
+
+ if (seqno)
+ *seqno = control.seqno;
+
+ return 0;
+}
+
+static uint64_t eu_ctl(int debugfd, uint64_t client,
+ uint64_t exec_queue, uint64_t lrc,
+ uint8_t *bitmask, uint32_t *bitmask_size, uint32_t cmd)
+{
+ uint64_t seqno;
+
+ igt_assert_eq(__eu_ctl(debugfd, client, exec_queue, lrc, bitmask,
+ bitmask_size, cmd, &seqno), 0);
+
+ return seqno;
+}
+
+static bool intel_gen_needs_resume_wa(int fd)
+{
+ const uint32_t id = intel_get_drm_devid(fd);
+
+ return intel_gen(id) == 12 && intel_graphics_ver(id) < IP_VER(12, 55);
+}
+
+static uint64_t eu_ctl_resume(int fd, int debugfd, uint64_t client,
+ uint64_t exec_queue, uint64_t lrc,
+ uint8_t *bitmask, uint32_t bitmask_size)
+{
+ int i;
+
+ /* Wa_14011332042 */
+ if (intel_gen_needs_resume_wa(fd)) {
+ uint32_t *att_reg_half = (uint32_t *)bitmask;
+
+ for (i = 0; i < bitmask_size / sizeof(uint32_t); i += 2) {
+ att_reg_half[i] |= att_reg_half[i + 1];
+ att_reg_half[i + 1] |= att_reg_half[i];
+ }
+ }
+
+ return eu_ctl(debugfd, client, exec_queue, lrc, bitmask, &bitmask_size,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME);
+}
+
+static inline uint64_t eu_ctl_stopped(int debugfd, uint64_t client,
+ uint64_t exec_queue, uint64_t lrc,
+ uint8_t *bitmask, uint32_t *bitmask_size)
+{
+ return eu_ctl(debugfd, client, exec_queue, lrc, bitmask, bitmask_size,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED);
+}
+
+static inline uint64_t eu_ctl_interrupt_all(int debugfd, uint64_t client,
+ uint64_t exec_queue, uint64_t lrc)
+{
+ return eu_ctl(debugfd, client, exec_queue, lrc, NULL, 0,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL);
+}
+
+struct online_debug_data {
+ pthread_mutex_t mutex;
+ /* client in */
+ struct drm_xe_engine_class_instance hwe;
+ /* client out */
+ int threads_count;
+ /* debugger internals */
+ uint64_t client_handle;
+ uint64_t exec_queue_handle;
+ uint64_t lrc_handle;
+ uint64_t target_offset;
+ size_t target_size;
+ uint64_t bb_offset;
+ size_t bb_size;
+ int vm_fd;
+ uint32_t first_aip;
+ uint64_t *aips_offset_table;
+ uint32_t steps_done;
+ uint8_t *single_step_bitmask;
+ int stepped_threads_count;
+ struct timespec exception_arrived;
+ int last_eu_control_seqno;
+ struct drm_xe_eudebug_event *exception_event;
+};
+
+static struct online_debug_data *
+online_debug_data_create(struct drm_xe_engine_class_instance *hwe)
+{
+ struct online_debug_data *data;
+
+ data = mmap(0, ALIGN(sizeof(*data), PAGE_SIZE),
+ PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
+ memcpy(&data->hwe, hwe, sizeof(*hwe));
+ pthread_mutex_init(&data->mutex, NULL);
+ data->client_handle = -1ULL;
+ data->exec_queue_handle = -1ULL;
+ data->lrc_handle = -1ULL;
+ data->vm_fd = -1;
+ data->stepped_threads_count = -1;
+
+ return data;
+}
+
+static void online_debug_data_destroy(struct online_debug_data *data)
+{
+ free(data->aips_offset_table);
+ munmap(data, ALIGN(sizeof(*data), PAGE_SIZE));
+}
+
+static void eu_attention_debug_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
+ uint32_t *ptr = (uint32_t *)att->bitmask;
+
+ igt_debug("EVENT[%llu] eu-attention; threads=%d "
+ "client[%llu], exec_queue[%llu], lrc[%llu], bitmask_size[%d]\n",
+ att->base.seqno, count_set_bits(att->bitmask, att->bitmask_size),
+ att->client_handle, att->exec_queue_handle,
+ att->lrc_handle, att->bitmask_size);
+
+ for (uint32_t i = 0; i < att->bitmask_size / 4; i += 2)
+ igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, ptr[i], ptr[i + 1]);
+}
+
+static void eu_attention_reset_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
+ uint32_t *ptr = (uint32_t *)att->bitmask;
+ struct online_debug_data *data = d->ptr;
+
+ igt_debug("EVENT[%llu] eu-attention with reset; threads=%d "
+ "client[%llu], exec_queue[%llu], lrc[%llu], bitmask_size[%d]\n",
+ att->base.seqno, count_set_bits(att->bitmask, att->bitmask_size),
+ att->client_handle, att->exec_queue_handle,
+ att->lrc_handle, att->bitmask_size);
+
+ for (uint32_t i = 0; i < att->bitmask_size / 4; i += 2)
+ igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, ptr[i], ptr[i + 1]);
+
+ xe_force_gt_reset_async(d->master_fd, data->hwe.gt_id);
+}
+
+static void copy_first_bit(uint8_t *dst, uint8_t *src, int size)
+{
+ bool found = false;
+ int i, j;
+
+ for (i = 0; i < size; i++) {
+ if (found) {
+ dst[i] = 0;
+ } else {
+ uint32_t tmp = src[i]; /* in case dst == src */
+
+ for (j = 0; j < 8; j++) {
+ dst[i] = tmp & (1 << j);
+ if (dst[i]) {
+ found = true;
+ break;
+ }
+ }
+ }
+ }
+}
+
+static void copy_nth_bit(uint8_t *dst, uint8_t *src, int size, int n)
+{
+ int count = 0;
+
+ for (int i = 0; i < size; i++) {
+ uint32_t tmp = src[i];
+
+ for (int j = 7; j >= 0; j--) {
+ if (tmp & (1 << j)) {
+ count++;
+ if (count == n)
+ dst[i] |= (1 << j);
+ else
+ dst[i] &= ~(1 << j);
+ } else {
+ dst[i] &= ~(1 << j);
+ }
+ }
+ }
+}
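The two bitmask helpers above can be distilled into a standalone sketch. The names below are hypothetical (the IGT versions operate in place on EU attention bitmasks), but the bit-selection logic mirrors the patch: keep only the lowest set bit of the whole mask, or only the n-th set bit counting from the high bit of each byte:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep only the lowest set bit across the whole byte array. */
static void keep_first_bit(uint8_t *dst, const uint8_t *src, int size)
{
	bool found = false;

	for (int i = 0; i < size; i++) {
		uint8_t tmp = src[i]; /* in case dst == src */

		dst[i] = 0;
		if (!found && tmp) {
			dst[i] = tmp & -tmp; /* isolate lowest set bit */
			found = true;
		}
	}
}

/* Keep only the n-th set bit, counting from bit 7 of each byte,
 * as copy_nth_bit() in the patch does. */
static void keep_nth_bit(uint8_t *dst, const uint8_t *src, int size, int n)
{
	int count = 0;

	for (int i = 0; i < size; i++) {
		uint8_t tmp = src[i];

		dst[i] = 0;
		for (int j = 7; j >= 0; j--)
			if ((tmp & (1 << j)) && ++count == n)
				dst[i] |= 1 << j;
	}
}
```

This is only an illustration of the selection rules; the in-tree helpers additionally tolerate `dst == src`, which the resume triggers rely on.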
+
+/*
+ * Searches for the first kernel instruction. It relies on the assumption
+ * that the shader kernel is placed before the SIP within the bb.
+ */
+static uint32_t find_kernel_in_bb(struct gpgpu_shader *kernel,
+ struct online_debug_data *data)
+{
+ uint32_t *p = kernel->code;
+ size_t sz = 4 * sizeof(uint32_t);
+ uint32_t buf[4];
+ int i;
+
+ for (i = 0; i < data->bb_size; i += sz) {
+ igt_assert_eq(pread(data->vm_fd, &buf, sz, data->bb_offset + i), sz);
+
+ if (memcmp(p, buf, sz) == 0)
+ break;
+ }
+
+ igt_assert(i < data->bb_size);
+
+ return i;
+}
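The batch-buffer scan above can be illustrated with an in-memory sketch (hypothetical buffers and return convention; the patch reads through the debugger's VM fd with pread() and asserts on failure instead of returning -1):

```c
#include <stdint.h>
#include <string.h>

/* Sketch of find_kernel_in_bb()'s scan: step through the batch in
 * 16-byte strides and return the byte offset where the kernel's
 * first four dwords appear, or -1 when not found. */
static int find_pattern(const uint32_t *bb, size_t bb_size,
			const uint32_t *kernel_head)
{
	const size_t sz = 4 * sizeof(uint32_t);

	for (size_t i = 0; i + sz <= bb_size; i += sz)
		if (!memcmp((const uint8_t *)bb + i, kernel_head, sz))
			return (int)i;

	return -1;
}
```

The 16-byte stride works because instructions in the batch are 16-byte aligned, so the kernel prologue can only start on such a boundary.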
+
+static void set_breakpoint_once(struct xe_eudebug_debugger *d,
+ struct online_debug_data *data)
+{
+ const uint32_t breakpoint_bit = 1 << 30;
+ size_t sz = sizeof(uint32_t);
+ struct gpgpu_shader *kernel;
+ uint32_t aip;
+
+ kernel = get_shader(d->master_fd, d->flags);
+
+ if (data->first_aip) {
+ uint32_t expected = find_kernel_in_bb(kernel, data) + kernel->size * 4 - 0x10;
+
+ igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset), sz);
+ igt_assert_eq_u32(aip, expected);
+ } else {
+ uint32_t instr_usdw;
+
+ igt_assert(data->vm_fd != -1);
+ igt_assert(data->target_size != 0);
+ igt_assert(data->bb_size != 0);
+
+ igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset), sz);
+ data->first_aip = aip;
+
+ aip = find_kernel_in_bb(kernel, data);
+
+ /* set breakpoint on last instruction */
+ aip += kernel->size * 4 - 0x10;
+ igt_assert_eq(pread(data->vm_fd, &instr_usdw, sz,
+ data->bb_offset + aip), sz);
+ instr_usdw |= breakpoint_bit;
+ igt_assert_eq(pwrite(data->vm_fd, &instr_usdw, sz,
+ data->bb_offset + aip), sz);
+
+ }
+
+ gpgpu_shader_destroy(kernel);
+}
+
+static void get_aips_offset_table(struct online_debug_data *data, int threads)
+{
+ size_t sz = sizeof(uint32_t);
+ uint32_t aip;
+ uint32_t first_aip;
+ int table_index = 0;
+
+ if (data->aips_offset_table)
+ return;
+
+ data->aips_offset_table = malloc(threads * sizeof(uint64_t));
+ igt_assert(data->aips_offset_table);
+
+ igt_assert_eq(pread(data->vm_fd, &first_aip, sz, data->target_offset), sz);
+ data->first_aip = first_aip;
+ data->aips_offset_table[table_index++] = 0;
+
+ fsync(data->vm_fd);
+ for (int i = sz; i < data->target_size; i += sz) {
+ igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset + i), sz);
+ if (aip == first_aip)
+ data->aips_offset_table[table_index++] = i;
+ }
+
+ igt_assert_eq(threads, table_index);
+
+ igt_debug("AIPs offset table:\n");
+ for (int i = 0; i < threads; i++)
+ igt_debug("%lx\n", data->aips_offset_table[i]);
+}
+
+static int get_stepped_threads_count(struct online_debug_data *data, int threads)
+{
+ int count = 0;
+ size_t sz = sizeof(uint32_t);
+ uint32_t aip;
+
+ fsync(data->vm_fd);
+ for (int i = 0; i < threads; i++) {
+ igt_assert_eq(pread(data->vm_fd, &aip, sz,
+ data->target_offset + data->aips_offset_table[i]), sz);
+ if (aip != data->first_aip) {
+ igt_assert(aip == data->first_aip + 0x10);
+ count++;
+ }
+ }
+
+ return count;
+}
+
+static void save_first_exception_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct online_debug_data *data = d->ptr;
+
+ pthread_mutex_lock(&data->mutex);
+ if (!data->exception_event) {
+ igt_gettime(&data->exception_arrived);
+ data->exception_event = igt_memdup(e, e->len);
+ }
+ pthread_mutex_unlock(&data->mutex);
+}
+
+#define MAX_PREEMPT_TIMEOUT 10ull
+static void eu_attention_resume_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_eu_attention *att = (void *) e;
+ struct online_debug_data *data = d->ptr;
+ uint32_t bitmask_size = att->bitmask_size;
+ uint8_t *bitmask;
+ int i;
+
+ if (data->last_eu_control_seqno > att->base.seqno)
+ return;
+
+ bitmask = calloc(1, att->bitmask_size);
+ igt_assert(bitmask);
+
+ eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle,
+ att->lrc_handle, bitmask, &bitmask_size);
+ igt_assert(bitmask_size == att->bitmask_size);
+ igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0);
+
+ pthread_mutex_lock(&data->mutex);
+ if (igt_nsec_elapsed(&data->exception_arrived) < (MAX_PREEMPT_TIMEOUT + 1) * NSEC_PER_SEC &&
+ d->flags & TRIGGER_RESUME_DELAYED) {
+ pthread_mutex_unlock(&data->mutex);
+ free(bitmask);
+ return;
+ } else if (d->flags & TRIGGER_RESUME_ONE) {
+ copy_first_bit(bitmask, bitmask, bitmask_size);
+ } else if (d->flags & TRIGGER_RESUME_DSS) {
+ uint64_t *event = (uint64_t *)att->bitmask;
+ uint64_t *resume = (uint64_t *)bitmask;
+
+ memset(bitmask, 0, bitmask_size);
+ for (i = 0; i < att->bitmask_size / sizeof(uint64_t); i++) {
+ if (!event[i])
+ continue;
+
+ resume[i] = event[i];
+ break;
+ }
+ } else if (d->flags & TRIGGER_RESUME_SET_BP) {
+ set_breakpoint_once(d, data);
+ }
+
+ if (d->flags & SHADER_LOOP) {
+ uint32_t threads = get_number_of_threads(d->flags);
+ uint32_t val = STEERING_END_LOOP;
+
+ igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t),
+ data->target_offset + steering_offset(threads)),
+ sizeof(uint32_t));
+ fsync(data->vm_fd);
+ }
+ pthread_mutex_unlock(&data->mutex);
+
+ data->last_eu_control_seqno = eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
+ att->exec_queue_handle, att->lrc_handle,
+ bitmask, att->bitmask_size);
+
+ free(bitmask);
+}
+
+static void eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_eu_attention *att = (void *) e;
+ struct online_debug_data *data = d->ptr;
+ const int threads = get_number_of_threads(d->flags);
+ uint32_t val;
+ size_t sz = sizeof(uint32_t);
+
+ get_aips_offset_table(data, threads);
+
+ if (d->flags & TRIGGER_RESUME_PARALLEL_WALK) {
+ if (data->stepped_threads_count != -1)
+ if (data->steps_done < SINGLE_STEP_COUNT) {
+ int stepped_threads_count_after_resume =
+ get_stepped_threads_count(data, threads);
+ igt_debug("Stepped threads after: %d\n",
+ stepped_threads_count_after_resume);
+
+ if (stepped_threads_count_after_resume == threads) {
+ data->first_aip += 0x10;
+ data->steps_done++;
+ }
+
+ igt_debug("Shader steps: %d\n", data->steps_done);
+ igt_assert(data->stepped_threads_count == 0);
+ igt_assert(stepped_threads_count_after_resume == threads);
+ }
+
+ if (data->steps_done < SINGLE_STEP_COUNT) {
+ data->stepped_threads_count = get_stepped_threads_count(data, threads);
+ igt_debug("Stepped threads before: %d\n", data->stepped_threads_count);
+ }
+
+ val = data->steps_done < SINGLE_STEP_COUNT ? STEERING_SINGLE_STEP :
+ STEERING_CONTINUE;
+ } else if (d->flags & TRIGGER_RESUME_SINGLE_WALK) {
+ if (data->stepped_threads_count != -1)
+ if (data->steps_done < 2) {
+ int stepped_threads_count_after_resume =
+ get_stepped_threads_count(data, threads);
+ igt_debug("Stepped threads after: %d\n",
+ stepped_threads_count_after_resume);
+
+ if (stepped_threads_count_after_resume == threads) {
+ data->first_aip += 0x10;
+ data->steps_done++;
+ free(data->single_step_bitmask);
+ data->single_step_bitmask = 0;
+ }
+
+ igt_debug("Shader steps: %d\n", data->steps_done);
+ igt_assert(data->stepped_threads_count +
+ (intel_gen_needs_resume_wa(d->master_fd) ? 2 : 1) ==
+ stepped_threads_count_after_resume);
+ }
+
+ if (data->steps_done < 2) {
+ data->stepped_threads_count = get_stepped_threads_count(data, threads);
+ igt_debug("Stepped threads before: %d\n", data->stepped_threads_count);
+ if (intel_gen_needs_resume_wa(d->master_fd)) {
+ if (!data->single_step_bitmask) {
+ data->single_step_bitmask = malloc(att->bitmask_size *
+ sizeof(uint8_t));
+ igt_assert(data->single_step_bitmask);
+ memcpy(data->single_step_bitmask, att->bitmask,
+ att->bitmask_size);
+ }
+
+ copy_first_bit(att->bitmask, data->single_step_bitmask,
+ att->bitmask_size);
+ } else
+ copy_nth_bit(att->bitmask, att->bitmask, att->bitmask_size,
+ data->stepped_threads_count + 1);
+ }
+
+ val = data->steps_done < 2 ? STEERING_SINGLE_STEP : STEERING_CONTINUE;
+ }
+
+ igt_assert_eq(pwrite(data->vm_fd, &val, sz,
+ data->target_offset + steering_offset(threads)), sz);
+ fsync(data->vm_fd);
+
+ eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
+ att->exec_queue_handle, att->lrc_handle,
+ att->bitmask, att->bitmask_size);
+
+ if (data->single_step_bitmask)
+ for (int i = 0; i < att->bitmask_size; i++)
+ data->single_step_bitmask[i] &= ~att->bitmask[i];
+}
+
+static void open_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_client *client = (void *)e;
+ struct online_debug_data *data = d->ptr;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
+ return;
+
+ pthread_mutex_lock(&data->mutex);
+ data->client_handle = client->client_handle;
+ pthread_mutex_unlock(&data->mutex);
+}
+
+static void exec_queue_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_exec_queue *eq = (void *)e;
+ struct online_debug_data *data = d->ptr;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
+ return;
+
+ pthread_mutex_lock(&data->mutex);
+ data->exec_queue_handle = eq->exec_queue_handle;
+ data->lrc_handle = eq->lrc_handle[0];
+ pthread_mutex_unlock(&data->mutex);
+}
+
+static void vm_open_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm *vm = (void *)e;
+ struct online_debug_data *data = d->ptr;
+ struct drm_xe_eudebug_vm_open vo = {
+ .client_handle = vm->client_handle,
+ .vm_handle = vm->vm_handle,
+ };
+ int fd;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
+ fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
+ igt_assert_lte(0, fd);
+
+ pthread_mutex_lock(&data->mutex);
+ igt_assert(data->vm_fd == -1);
+ data->vm_fd = fd;
+ pthread_mutex_unlock(&data->mutex);
+ return;
+ }
+
+ pthread_mutex_lock(&data->mutex);
+ close(data->vm_fd);
+ data->vm_fd = -1;
+ pthread_mutex_unlock(&data->mutex);
+}
+
+static void read_metadata(struct xe_eudebug_debugger *d,
+ uint64_t client_handle,
+ uint64_t metadata_handle,
+ uint64_t type,
+ uint64_t len)
+{
+ struct drm_xe_eudebug_read_metadata rm = {
+ .client_handle = client_handle,
+ .metadata_handle = metadata_handle,
+ .size = len,
+ };
+ struct online_debug_data *data = d->ptr;
+ uint64_t *metadata;
+
+ metadata = malloc(len);
+ igt_assert(metadata);
+
+ rm.ptr = to_user_pointer(metadata);
+ igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
+
+ pthread_mutex_lock(&data->mutex);
+ switch (type) {
+ case DRM_XE_DEBUG_METADATA_ELF_BINARY:
+ data->bb_offset = metadata[0];
+ data->bb_size = metadata[1];
+ break;
+ case DRM_XE_DEBUG_METADATA_PROGRAM_MODULE:
+ data->target_offset = metadata[0];
+ data->target_size = metadata[1];
+ break;
+ default:
+ break;
+ }
+ pthread_mutex_unlock(&data->mutex);
+
+ free(metadata);
+}
+
+static void create_metadata_trigger(struct xe_eudebug_debugger *d, struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_metadata *em = (void *)e;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
+ read_metadata(d, em->client_handle, em->metadata_handle, em->type, em->len);
+}
+
+static void overwrite_immediate_value_in_common_target_write(int vm_fd, uint64_t offset,
+ uint32_t old_val, uint32_t new_val)
+{
+ uint64_t addr = offset;
+ int vals_changed = 0;
+ uint32_t val;
+
+ while (vals_changed < 4) {
+ igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr), sizeof(uint32_t));
+ if (val == old_val) {
+ igt_debug("val_before_write[%d]: %08x\n", vals_changed, val);
+ igt_assert_eq(pwrite(vm_fd, &new_val, sizeof(uint32_t), addr),
+ sizeof(uint32_t));
+ igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr),
+ sizeof(uint32_t));
+ igt_debug("val_before_fsync[%d]: %08x\n", vals_changed, val);
+ fsync(vm_fd);
+ igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr),
+ sizeof(uint32_t));
+ igt_debug("val_after_fsync[%d]: %08x\n", vals_changed, val);
+ igt_assert_eq_u32(val, new_val);
+ vals_changed++;
+ }
+ addr += sizeof(uint32_t);
+ }
+}
+
+static void eu_attention_resume_caching_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
+ struct online_debug_data *data = d->ptr;
+ static int counter;
+ static int kernel_in_bb;
+ struct dim_t s_dim = surface_dimensions(get_number_of_threads(d->flags));
+ int val;
+ uint32_t instr_usdw;
+ struct gpgpu_shader *kernel;
+ const uint32_t breakpoint_bit = 1 << 30;
+ struct gpgpu_shader *shader_preamble;
+ struct gpgpu_shader *shader_write_instr;
+
+ shader_preamble = gpgpu_shader_create(d->master_fd);
+ gpgpu_shader__write_dword(shader_preamble, SHADER_CANARY, 0);
+ gpgpu_shader__nop(shader_preamble);
+ gpgpu_shader__breakpoint(shader_preamble);
+
+ shader_write_instr = gpgpu_shader_create(d->master_fd);
+ gpgpu_shader__common_target_write_u32(shader_write_instr, 0, 0);
+
+ if (!kernel_in_bb) {
+ kernel = get_shader(d->master_fd, d->flags);
+ kernel_in_bb = find_kernel_in_bb(kernel, data);
+ gpgpu_shader_destroy(kernel);
+ }
+
+ /* set breakpoint on next write instruction */
+ if (counter < caching_get_instruction_count(d->master_fd, s_dim.x, d->flags)) {
+ igt_assert_eq(pread(data->vm_fd, &instr_usdw, sizeof(instr_usdw),
+ data->bb_offset + kernel_in_bb + shader_preamble->size * 4 +
+ shader_write_instr->size * 4 * counter), sizeof(instr_usdw));
+ instr_usdw |= breakpoint_bit;
+ igt_assert_eq(pwrite(data->vm_fd, &instr_usdw, sizeof(instr_usdw),
+ data->bb_offset + kernel_in_bb + shader_preamble->size * 4 +
+ shader_write_instr->size * 4 * counter), sizeof(instr_usdw));
+ fsync(data->vm_fd);
+ }
+
+ /* restore current instruction */
+ if (counter && counter <= caching_get_instruction_count(d->master_fd, s_dim.x, d->flags))
+ overwrite_immediate_value_in_common_target_write(data->vm_fd,
+ data->bb_offset + kernel_in_bb +
+ shader_preamble->size * 4 +
+ shader_write_instr->size * 4 * (counter - 1),
+ CACHING_POISON_VALUE,
+ CACHING_VALUE(counter - 1));
+
+ /* poison next instruction */
+ if (counter < caching_get_instruction_count(d->master_fd, s_dim.x, d->flags))
+ overwrite_immediate_value_in_common_target_write(data->vm_fd,
+ data->bb_offset + kernel_in_bb +
+ shader_preamble->size * 4 +
+ shader_write_instr->size * 4 * counter,
+ CACHING_VALUE(counter),
+ CACHING_POISON_VALUE);
+
+ gpgpu_shader_destroy(shader_write_instr);
+ gpgpu_shader_destroy(shader_preamble);
+
+ for (int i = 0; i < data->target_size; i += sizeof(uint32_t)) {
+ igt_assert_eq(pread(data->vm_fd, &val, sizeof(val), data->target_offset + i),
+ sizeof(val));
+ igt_assert_f(val != CACHING_POISON_VALUE, "Poison value found at %04d!\n", i);
+ }
+
+ eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
+ att->exec_queue_handle, att->lrc_handle,
+ att->bitmask, att->bitmask_size);
+
+ counter++;
+}
+
+static struct intel_bb *xe_bb_create_on_offset(int fd, uint32_t exec_queue, uint32_t vm,
+ uint64_t offset, uint32_t size)
+{
+ struct intel_bb *ibb;
+
+ ibb = intel_bb_create_with_context(fd, exec_queue, vm, NULL, size);
+
+ /* update intel bb offset */
+ intel_bb_remove_object(ibb, ibb->handle, ibb->batch_offset, ibb->size);
+ intel_bb_add_object(ibb, ibb->handle, ibb->size, offset, ibb->alignment, false);
+ ibb->batch_offset = offset;
+
+ return ibb;
+}
+
+static size_t get_bb_size(int flags)
+{
+ if ((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM))
+ return 32768;
+
+ return 4096;
+}
+
+static void run_online_client(struct xe_eudebug_client *c)
+{
+ int threads = get_number_of_threads(c->flags);
+ const uint64_t target_offset = 0x1a000000;
+ const uint64_t bb_offset = 0x1b000000;
+ const size_t bb_size = get_bb_size(c->flags);
+ struct online_debug_data *data = c->ptr;
+ struct drm_xe_engine_class_instance hwe = data->hwe;
+ struct drm_xe_ext_set_property ext = {
+ .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
+ .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
+ .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
+ };
+ struct drm_xe_exec_queue_create create = {
+ .instances = to_user_pointer(&hwe),
+ .width = 1,
+ .num_placements = 1,
+ .extensions = c->flags & DISABLE_DEBUG_MODE ? 0 : to_user_pointer(&ext)
+ };
+ struct dim_t w_dim = walker_dimensions(threads);
+ struct dim_t s_dim = surface_dimensions(threads);
+ struct timespec ts = { };
+ struct gpgpu_shader *sip, *shader;
+ uint32_t metadata_id[2];
+ uint64_t *metadata[2];
+ struct intel_bb *ibb;
+ struct intel_buf *buf;
+ uint32_t *ptr;
+ int fd;
+
+ metadata[0] = calloc(2, sizeof(*metadata));
+ metadata[1] = calloc(2, sizeof(*metadata));
+ igt_assert(metadata[0]);
+ igt_assert(metadata[1]);
+
+ fd = xe_eudebug_client_open_driver(c);
+ xe_device_get(fd);
+
+ /* Additional memory for steering control */
+ if (c->flags & SHADER_LOOP || c->flags & SHADER_SINGLE_STEP)
+ s_dim.y++;
+ /* Additional memory for caching check */
+ if ((c->flags & SHADER_CACHING_SRAM) || (c->flags & SHADER_CACHING_VRAM))
+ s_dim.y += caching_get_instruction_count(fd, s_dim.x, c->flags);
+ buf = create_uc_buf(fd, s_dim.x, s_dim.y);
+
+ buf->addr.offset = target_offset;
+
+ metadata[0][0] = bb_offset;
+ metadata[0][1] = bb_size;
+ metadata[1][0] = target_offset;
+ metadata[1][1] = buf->size;
+ metadata_id[0] = xe_eudebug_client_metadata_create(c, fd, DRM_XE_DEBUG_METADATA_ELF_BINARY,
+ 2 * sizeof(*metadata), metadata[0]);
+ metadata_id[1] = xe_eudebug_client_metadata_create(c, fd,
+ DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
+ 2 * sizeof(*metadata), metadata[1]);
+
+ create.vm_id = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
+ xe_eudebug_client_exec_queue_create(c, fd, &create);
+
+ ibb = xe_bb_create_on_offset(fd, create.exec_queue_id, create.vm_id,
+ bb_offset, bb_size);
+ intel_bb_set_lr_mode(ibb, true);
+
+ sip = get_sip(fd, c->flags);
+ shader = get_shader(fd, c->flags);
+
+ igt_nsec_elapsed(&ts);
+ gpgpu_shader_exec(ibb, buf, w_dim.x, w_dim.y, shader, sip, 0, 0);
+
+ gpgpu_shader_destroy(sip);
+ gpgpu_shader_destroy(shader);
+
+ intel_bb_sync(ibb);
+
+ if (c->flags & TRIGGER_RECONNECT)
+ xe_eudebug_client_wait_stage(c, DEBUGGER_REATTACHED);
+ else
+ /* Make sure it wasn't the timeout. */
+ igt_assert(igt_nsec_elapsed(&ts) < XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
+
+ if (!(c->flags & DO_NOT_EXPECT_CANARIES)) {
+ ptr = xe_bo_mmap_ext(fd, buf->handle, buf->size, PROT_READ);
+ data->threads_count = count_canaries_neq(ptr, w_dim, 0);
+ igt_assert_f(data->threads_count, "No canaries found, nothing executed?\n");
+
+ if ((c->flags & SHADER_BREAKPOINT || c->flags & TRIGGER_RESUME_SET_BP ||
+ c->flags & SHADER_N_NOOP_BREAKPOINT) && !(c->flags & DISABLE_DEBUG_MODE)) {
+ uint32_t aip = ptr[0];
+
+ igt_assert_f(aip != SHADER_CANARY,
+ "Workload executed but breakpoint not hit!\n");
+ igt_assert_eq(count_canaries_eq(ptr, w_dim, aip), data->threads_count);
+ igt_debug("Breakpoint hit in %d threads, AIP=0x%08x\n", data->threads_count,
+ aip);
+ }
+
+ munmap(ptr, buf->size);
+ }
+
+ intel_bb_destroy(ibb);
+
+ xe_eudebug_client_exec_queue_destroy(c, fd, &create);
+ xe_eudebug_client_vm_destroy(c, fd, create.vm_id);
+
+ xe_eudebug_client_metadata_destroy(c, fd, metadata_id[0], DRM_XE_DEBUG_METADATA_ELF_BINARY,
+ 2 * sizeof(*metadata));
+ xe_eudebug_client_metadata_destroy(c, fd, metadata_id[1],
+ DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
+ 2 * sizeof(*metadata));
+
+ xe_device_put(fd);
+ xe_eudebug_client_close_driver(c, fd);
+}
+
+static bool intel_gen_has_lockstep_eus(int fd)
+{
+ const uint32_t id = intel_get_drm_devid(fd);
+
+ /*
+ * Lockstep (sometimes called fused) EUs are pairs of EUs that run in
+ * sync, sharing the same clock and control flow. Thus for attentions,
+ * if the control flow hits a breakpoint, both are excepted into SIP.
+ * At this level the hardware exposes only one attention thread bit per
+ * pair. PVC is the first platform without lockstepping.
+ */
+ return !(intel_graphics_ver(id) == IP_VER(12, 60) || intel_gen(id) >= 20);
+}
+
+static int query_attention_bitmask_size(int fd, int gt)
+{
+ const unsigned int threads = 8;
+ struct drm_xe_query_topology_mask *c_dss = NULL, *g_dss = NULL, *eu_per_dss = NULL;
+ struct drm_xe_query_topology_mask *topology;
+ struct drm_xe_device_query query = {
+ .extensions = 0,
+ .query = DRM_XE_DEVICE_QUERY_GT_TOPOLOGY,
+ .size = 0,
+ .data = 0,
+ };
+ int pos = 0, eus;
+ uint8_t *any_dss;
+
+ igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
+ igt_assert_neq(query.size, 0);
+
+ topology = malloc(query.size);
+ igt_assert(topology);
+
+ query.data = to_user_pointer(topology);
+ igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
+
+ while (query.size >= sizeof(struct drm_xe_query_topology_mask)) {
+ struct drm_xe_query_topology_mask *topo;
+ int sz;
+
+ topo = (struct drm_xe_query_topology_mask *)((unsigned char *)topology + pos);
+ sz = sizeof(struct drm_xe_query_topology_mask) + topo->num_bytes;
+
+ query.size -= sz;
+ pos += sz;
+
+ if (topo->gt_id != gt)
+ continue;
+
+ if (topo->type == DRM_XE_TOPO_DSS_GEOMETRY)
+ g_dss = topo;
+ else if (topo->type == DRM_XE_TOPO_DSS_COMPUTE)
+ c_dss = topo;
+ else if (topo->type == DRM_XE_TOPO_EU_PER_DSS ||
+ topo->type == DRM_XE_TOPO_SIMD16_EU_PER_DSS)
+ eu_per_dss = topo;
+ }
+
+ igt_assert(g_dss && c_dss && eu_per_dss);
+ igt_assert_eq_u32(c_dss->num_bytes, g_dss->num_bytes);
+
+ any_dss = malloc(c_dss->num_bytes);
+ igt_assert(any_dss);
+
+ for (int i = 0; i < c_dss->num_bytes; i++)
+ any_dss[i] = c_dss->mask[i] | g_dss->mask[i];
+
+ eus = count_set_bits(any_dss, c_dss->num_bytes);
+ eus *= count_set_bits(eu_per_dss->mask, eu_per_dss->num_bytes);
+
+ if (intel_gen_has_lockstep_eus(fd))
+ eus /= 2;
+
+ free(any_dss);
+ free(topology);
+
+ return eus * threads / 8;
+}
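Stripped of the topology-query plumbing, the size computation boils down to simple arithmetic. A hypothetical distillation, under the same assumptions as the patch (one attention bit per hardware thread, 8 threads per EU, and lockstepped platforms exposing one bit per EU pair):

```c
#include <stdbool.h>

/* Sketch of query_attention_bitmask_size()'s arithmetic:
 * total EUs come from active DSS count times EUs per DSS,
 * halved on lockstepped hardware, then one bit per thread
 * packed into bytes. */
static int attention_bitmask_bytes(int dss, int eu_per_dss, bool lockstep)
{
	const int threads_per_eu = 8;
	int eus = dss * eu_per_dss;

	if (lockstep)
		eus /= 2;

	return eus * threads_per_eu / 8; /* 8 bits per byte */
}
```

With 8 threads per EU this conveniently yields one byte per EU (or per EU pair on lockstepped parts), which is why the real function's `eus * threads / 8` looks like a no-op.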
+
+static struct drm_xe_eudebug_event_exec_queue *
+match_attention_with_exec_queue(struct xe_eudebug_event_log *log,
+ struct drm_xe_eudebug_event_eu_attention *ea)
+{
+ struct drm_xe_eudebug_event_exec_queue *ee;
+ struct drm_xe_eudebug_event *event = NULL, *current = NULL, *matching_destroy = NULL;
+ int lrc_idx;
+
+ xe_eudebug_for_each_event(event, log) {
+ if (event->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE &&
+ event->flags == DRM_XE_EUDEBUG_EVENT_CREATE) {
+ ee = (struct drm_xe_eudebug_event_exec_queue *)event;
+
+ if (ee->exec_queue_handle != ea->exec_queue_handle)
+ continue;
+
+ if (ee->client_handle != ea->client_handle)
+ continue;
+
+ for (lrc_idx = 0; lrc_idx < ee->width; lrc_idx++) {
+ if (ee->lrc_handle[lrc_idx] == ea->lrc_handle)
+ break;
+ }
+
+ if (lrc_idx >= ee->width) {
+ igt_debug("No matching lrc handle within matching exec_queue!");
+ continue;
+ }
+
+ /* Event logs are sorted, so no later create event can match either. */
+ if (ea->base.seqno < ee->base.seqno)
+ break;
+
+ /* Sanity check that the attention did not arrive
+ * on an already destroyed exec_queue.
+ */
+ current = event;
+ xe_eudebug_for_each_event(current, log) {
+ if (current->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE &&
+ current->flags == DRM_XE_EUDEBUG_EVENT_DESTROY) {
+ uint8_t offset = sizeof(struct drm_xe_eudebug_event);
+
+ if (memcmp((uint8_t *)current + offset,
+ (uint8_t *)event + offset,
+ current->len - offset) == 0) {
+ matching_destroy = current;
+ }
+ }
+ }
+
+ if (!matching_destroy || ea->base.seqno > matching_destroy->seqno)
+ continue;
+
+ return ee;
+ }
+ }
+
+ return NULL;
+}
+
+static void online_session_check(struct xe_eudebug_session *s, int flags)
+{
+ struct drm_xe_eudebug_event_eu_attention *ea = NULL;
+ struct drm_xe_eudebug_event *event = NULL;
+ struct online_debug_data *data = s->client->ptr;
+ bool expect_exception = !(flags & DISABLE_DEBUG_MODE);
+ int sum = 0;
+ int bitmask_size;
+
+ xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+
+ bitmask_size = query_attention_bitmask_size(s->debugger->master_fd, data->hwe.gt_id);
+
+ xe_eudebug_for_each_event(event, s->debugger->log) {
+ if (event->type == DRM_XE_EUDEBUG_EVENT_EU_ATTENTION) {
+ ea = (struct drm_xe_eudebug_event_eu_attention *)event;
+
+ igt_assert(event->flags == DRM_XE_EUDEBUG_EVENT_STATE_CHANGE);
+ igt_assert_eq(ea->bitmask_size, bitmask_size);
+ sum += count_set_bits(ea->bitmask, bitmask_size);
+ igt_assert(match_attention_with_exec_queue(s->debugger->log, ea));
+ }
+ }
+
+ /*
+ * We can expect attention to sum up only
+ * if we have a breakpoint set and we resume all threads always.
+ */
+ if (flags == SHADER_BREAKPOINT || flags == TRIGGER_UFENCE_SET_BREAKPOINT)
+ igt_assert_eq(sum, data->threads_count);
+
+ if (expect_exception)
+ igt_assert(sum > 0);
+ else
+ igt_assert(sum == 0);
+}
+
+static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
+ xe_eudebug_ack_ufence(d->fd, ef);
+}
+
+static void ufence_ack_set_bp_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *e)
+{
+ struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
+ struct online_debug_data *data = d->ptr;
+
+ set_breakpoint_once(d, data);
+
+ if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
+ xe_eudebug_ack_ufence(d->fd, ef);
+}
+
+/**
+ * SUBTEST: basic-breakpoint
+ * Description:
+ * Check whether the KMD sends attention events
+ * for a workload in debug mode stopped on a breakpoint.
+ *
+ * SUBTEST: breakpoint-not-in-debug-mode
+ * Description:
+ * Check whether the KMD resets the GPU when it spots attention
+ * coming from a workload not in debug mode.
+ *
+ * SUBTEST: stopped-thread
+ * Description:
+ * Hits a breakpoint on a runalone workload and
+ * reads attention for a fixed time.
+ *
+ * SUBTEST: resume-%s
+ * Description:
+ * Resumes a workload stopped on a breakpoint
+ * with a granularity of %arg[1].
+ *
+ *
+ * arg[1]:
+ *
+ * @one: one thread
+ * @dss: threads running on one subslice
+ */
+static void test_basic_online(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ xe_eudebug_session_run(s);
+ online_session_check(s, s->flags);
+
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+/**
+ * SUBTEST: set-breakpoint
+ * Description:
+ * Checks for attention after setting a dynamic breakpoint in the ufence event.
+ */
+
+static void test_set_breakpoint_online(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
+ open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+ exec_queue_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_set_bp_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+
+ xe_eudebug_session_run(s);
+ online_session_check(s, s->flags);
+
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+/**
+ * SUBTEST: preempt-breakpoint
+ * Description:
+ * Verify that the EU debugger disables the preemption timeout to
+ * prevent a reset of a workload stopped on a breakpoint.
+ */
+static void test_preemption(int fd, struct drm_xe_engine_class_instance *hwe)
+{
+ int flags = SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED;
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+ struct xe_eudebug_client *other;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+ other = xe_eudebug_client_create(fd, run_online_client, SHADER_NOP, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+
+ xe_eudebug_client_start(s->client);
+ sleep(1); /* make sure s->client starts first */
+ xe_eudebug_client_start(other);
+
+ xe_eudebug_client_wait_done(s->client);
+ xe_eudebug_client_wait_done(other);
+
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ xe_eudebug_session_destroy(s);
+ xe_eudebug_client_destroy(other);
+
+ igt_assert_f(data->last_eu_control_seqno != 0,
+ "Workload with breakpoint has ended without resume!\n");
+
+ online_debug_data_destroy(data);
+}
+
+/**
+ * SUBTEST: reset-with-attention
+ * Description:
+ * Check whether GPU is usable after resetting with attention raised
+ * (stopped on breakpoint) by running the same workload again.
+ */
+static void test_reset_with_attention_online(int fd, struct drm_xe_engine_class_instance *hwe,
+ int flags)
+{
+ struct xe_eudebug_session *s1, *s2;
+ struct online_debug_data *data;
+
+ data = online_debug_data_create(hwe);
+ s1 = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s1->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_reset_trigger);
+ xe_eudebug_debugger_add_trigger(s1->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ xe_eudebug_session_run(s1);
+ xe_eudebug_session_destroy(s1);
+
+ s2 = xe_eudebug_session_create(fd, run_online_client, flags, data);
+ xe_eudebug_debugger_add_trigger(s2->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+ xe_eudebug_debugger_add_trigger(s2->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ xe_eudebug_session_run(s2);
+
+ online_session_check(s2, s2->flags);
+
+ xe_eudebug_session_destroy(s2);
+ online_debug_data_destroy(data);
+}
+
+/**
+ * SUBTEST: interrupt-all
+ * Description:
+ * Schedules an EU workload which should last a few seconds, then
+ * interrupts all threads, checks whether the attention event arrived,
+ * and resumes the stopped threads.
+ *
+ * SUBTEST: interrupt-all-set-breakpoint
+ * Description:
+ * Schedules an EU workload which should last a few seconds, then
+ * interrupts all threads. Once the attention event arrives, it sets a
+ * breakpoint on the very next instruction and resumes the stopped
+ * threads. It expects every thread to hit the breakpoint.
+ */
+static void test_interrupt_all(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+ uint32_t val;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
+ open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+ exec_queue_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_start(s->client);
+
+ /* wait for workload to start */
+ igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
+ /* collect needed data from triggers */
+ if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
+ continue;
+
+ if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
+ if (val != 0)
+ break;
+ }
+
+ pthread_mutex_lock(&data->mutex);
+ igt_assert(data->client_handle != -1);
+ igt_assert(data->exec_queue_handle != -1);
+ eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle, data->lrc_handle);
+ pthread_mutex_unlock(&data->mutex);
+
+ xe_eudebug_client_wait_done(s->client);
+
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ xe_eudebug_event_log_print(s->debugger->log, true);
+ xe_eudebug_event_log_print(s->client->log, true);
+
+ online_session_check(s, s->flags);
+
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+static void reset_debugger_log(struct xe_eudebug_debugger *d)
+{
+ unsigned int max_size;
+ char log_name[80];
+
+ /* Don't pull the rug out from under an active debugger */
+ igt_assert(d->target_pid == 0);
+
+ max_size = d->log->max_size;
+ strncpy(log_name, d->log->name, sizeof(log_name) - 1);
+ log_name[sizeof(log_name) - 1] = '\0';
+ xe_eudebug_event_log_destroy(d->log);
+ d->log = xe_eudebug_event_log_create(log_name, max_size);
+}
+
+/**
+ * SUBTEST: interrupt-other-debuggable
+ * Description:
+ * Schedules an EU workload with a never-ending loop in runalone mode
+ * and, while it is not under debug, tries to interrupt all threads
+ * using a different client attached to the debugger.
+ *
+ * SUBTEST: interrupt-other
+ * Description:
+ * Schedules an EU workload with a never-ending loop and, while it is
+ * not configured for debugging, tries to interrupt all threads using
+ * the client attached to the debugger.
+ */
+static void test_interrupt_other(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct online_debug_data *data;
+ struct online_debug_data *debugee_data;
+ struct xe_eudebug_session *s;
+ struct xe_eudebug_client *debugee;
+ int debugee_flags = SHADER_LOOP | DO_NOT_EXPECT_CANARIES;
+ int val = 0;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN, open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+ exec_queue_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_start(s->client);
+
+ /* wait for workload to start */
+ igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
+ if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
+ continue;
+
+ if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
+ if (val != 0)
+ break;
+ }
+ igt_assert_f(val != 0, "Workload execution has not started\n");
+
+ xe_eudebug_debugger_detach(s->debugger);
+ reset_debugger_log(s->debugger);
+
+ debugee_data = online_debug_data_create(hwe);
+ s->debugger->ptr = debugee_data;
+ debugee = xe_eudebug_client_create(fd, run_online_client, debugee_flags, debugee_data);
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, debugee), 0);
+ xe_eudebug_client_start(debugee);
+
+ igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
+ if (READ_ONCE(debugee_data->vm_fd) == -1 || READ_ONCE(debugee_data->target_size) == 0)
+ continue;
+
+ break;
+ }
+
+ pthread_mutex_lock(&debugee_data->mutex);
+ igt_assert(debugee_data->client_handle != -1);
+ igt_assert(debugee_data->exec_queue_handle != -1);
+
+ /*
+ * Interrupting the other client should return invalid state
+ * as it is running in runalone mode
+ */
+ igt_assert_eq(__eu_ctl(s->debugger->fd, debugee_data->client_handle,
+ debugee_data->exec_queue_handle, debugee_data->lrc_handle, NULL, 0,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL), -EINVAL);
+ pthread_mutex_unlock(&debugee_data->mutex);
+
+ xe_force_gt_reset_async(s->debugger->master_fd, debugee_data->hwe.gt_id);
+
+ xe_eudebug_client_wait_done(debugee);
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ xe_eudebug_event_log_print(s->debugger->log, true);
+ xe_eudebug_event_log_print(debugee->log, true);
+
+ xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+
+ xe_eudebug_client_destroy(debugee);
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+ online_debug_data_destroy(debugee_data);
+}
+
+/**
+ * SUBTEST: tdctl-parameters
+ * Description:
+ * Schedules an EU workload that should last a few seconds, then
+ * exercises negative scenarios of the EU control ioctl, interrupts all
+ * threads, checks that an attention event arrived, and resumes the
+ * stopped threads.
+ */
+static void test_tdctl_parameters(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+ uint32_t val;
+ uint32_t random_command;
+ uint32_t bitmask_size = query_attention_bitmask_size(fd, hwe->gt_id);
+ uint8_t *attention_bitmask = malloc(bitmask_size * sizeof(uint8_t));
+
+ igt_assert(attention_bitmask);
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
+ open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+ exec_queue_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_start(s->client);
+
+ /* wait for workload to start */
+ igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
+ /* collect needed data from triggers */
+ if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
+ continue;
+
+ if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
+ if (val != 0)
+ break;
+ }
+
+ pthread_mutex_lock(&data->mutex);
+ igt_assert(data->client_handle != -1);
+ igt_assert(data->exec_queue_handle != -1);
+ igt_assert(data->lrc_handle != -1);
+
+ /* fail on invalid lrc_handle */
+ igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle, data->lrc_handle + 1,
+ attention_bitmask, &bitmask_size,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
+
+ /* fail on invalid exec_queue_handle */
+ igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle + 1, data->lrc_handle,
+ attention_bitmask, &bitmask_size,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
+
+ /* fail on invalid client */
+ igt_assert(__eu_ctl(s->debugger->fd, data->client_handle + 1,
+ data->exec_queue_handle, data->lrc_handle,
+ attention_bitmask, &bitmask_size,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
+
+ /*
+ * bitmask size must be aligned to sizeof(u32) for all commands
+ * and be zero for interrupt all
+ */
+ bitmask_size = sizeof(uint32_t) - 1;
+ igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle, data->lrc_handle,
+ attention_bitmask, &bitmask_size,
+ DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED, NULL) == -EINVAL);
+ bitmask_size = 0;
+
+ /* fail on invalid command */
+ random_command = random() | (DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME + 1);
+ igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle, data->lrc_handle,
+ attention_bitmask, &bitmask_size, random_command, NULL) == -EINVAL);
+
+ free(attention_bitmask);
+
+ eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle, data->lrc_handle);
+ pthread_mutex_unlock(&data->mutex);
+
+ xe_eudebug_client_wait_done(s->client);
+
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ xe_eudebug_event_log_print(s->debugger->log, true);
+ xe_eudebug_event_log_print(s->client->log, true);
+
+ online_session_check(s, s->flags);
+
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+static void eu_attention_debugger_detach_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *event)
+{
+ struct online_debug_data *data = d->ptr;
+ uint64_t c_pid;
+ int ret;
+
+ c_pid = d->target_pid;
+
+ /* Reset VM data so the re-triggered VM open handler works properly */
+ data->vm_fd = -1;
+
+ xe_eudebug_debugger_detach(d);
+
+ /* Let the KMD scan function notice unhandled EU attention */
+ if (!(d->flags & SHADER_N_NOOP_BREAKPOINT))
+ sleep(1);
+
+ /*
+ * New session that is created by EU debugger on reconnect restarts
+ * seqno, causing issues with log sorting. To avoid that, create
+ * a new event log.
+ */
+ reset_debugger_log(d);
+
+ ret = xe_eudebug_connect(d->master_fd, c_pid, 0);
+ igt_assert(ret >= 0);
+ d->fd = ret;
+ d->target_pid = c_pid;
+
+ /* Let the discovery worker discover resources */
+ sleep(2);
+
+ if (!(d->flags & SHADER_N_NOOP_BREAKPOINT))
+ xe_eudebug_debugger_signal_stage(d, DEBUGGER_REATTACHED);
+}
+
+/**
+ * SUBTEST: interrupt-reconnect
+ * Description:
+ * Schedules an EU workload that should last a few seconds, interrupts
+ * all threads, and detaches the debugger when attention is raised. The
+ * test checks that the KMD resets the workload when no debugger is
+ * attached and replays the events on discovery.
+ */
+static void test_interrupt_reconnect(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct drm_xe_eudebug_event *e = NULL;
+ struct online_debug_data *data;
+ struct xe_eudebug_session *s;
+ uint32_t val;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
+ open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
+ exec_queue_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debugger_detach_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
+ xe_eudebug_debugger_start_worker(s->debugger);
+ xe_eudebug_client_start(s->client);
+
+ /* wait for workload to start */
+ igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
+ /* collect needed data from triggers */
+ if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
+ continue;
+
+ if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
+ if (val != 0)
+ break;
+ }
+
+ pthread_mutex_lock(&data->mutex);
+ igt_assert(data->client_handle != -1);
+ igt_assert(data->exec_queue_handle != -1);
+ eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
+ data->exec_queue_handle, data->lrc_handle);
+ pthread_mutex_unlock(&data->mutex);
+
+ xe_eudebug_client_wait_done(s->client);
+
+ xe_eudebug_debugger_stop_worker(s->debugger, 1);
+
+ xe_eudebug_event_log_print(s->debugger->log, true);
+ xe_eudebug_event_log_print(s->client->log, true);
+
+ xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
+ XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
+
+ /* We expect workload reset, so no attention should be raised */
+ xe_eudebug_for_each_event(e, s->debugger->log)
+ igt_assert(e->type != DRM_XE_EUDEBUG_EVENT_EU_ATTENTION);
+
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+/**
+ * SUBTEST: single-step
+ * Description:
+ * Schedules an EU workload with 16 nops after a breakpoint, then
+ * single-steps through the shader, advancing all threads on each step
+ * and checking that all threads advanced.
+ *
+ * SUBTEST: single-step-one
+ * Description:
+ * Schedules an EU workload with 16 nops after a breakpoint, then
+ * single-steps through the shader, advancing one thread per step and
+ * checking that exactly one thread advanced. Due to time constraints,
+ * only the first two shader instructions after the breakpoint are
+ * validated.
+ */
+static void test_single_step(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
+ open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_single_step_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ xe_eudebug_session_run(s);
+ online_session_check(s, s->flags);
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+static void eu_attention_debugger_ndetach_trigger(struct xe_eudebug_debugger *d,
+ struct drm_xe_eudebug_event *event)
+{
+ struct online_debug_data *data = d->ptr;
+ static int debugger_detach_count;
+
+ if (debugger_detach_count < (SHADER_LOOP_N - 1)) {
+ /* Make sure the resume command was issued before detaching the debugger */
+ if (data->last_eu_control_seqno > event->seqno)
+ return;
+ eu_attention_debugger_detach_trigger(d, event);
+ debugger_detach_count++;
+ } else {
+ igt_debug("Reached the Nth breakpoint, not detaching the debugger\n");
+ }
+}
+
+/**
+ * SUBTEST: debugger-reopen
+ * Description:
+ * Check whether the debugger is able to reopen the connection and
+ * capture the events of already running client.
+ */
+static void test_debugger_reopen(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+
+ data = online_debug_data_create(hwe);
+
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debugger_ndetach_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ xe_eudebug_session_run(s);
+
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+/**
+ * SUBTEST: writes-caching-%s
+ * Description:
+ * Write incrementing values to a 2-page target surface, poisoning the
+ * data one breakpoint before each write instruction and restoring it
+ * when the poisoned instruction's breakpoint is hit. Expect to never
+ * see poison values in the target surface.
+ *
+ * arg[1]:
+ *
+ * @sram: Use page size of SRAM
+ * @vram: Use page size of VRAM
+ */
+static void test_caching(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
+{
+ struct xe_eudebug_session *s;
+ struct online_debug_data *data;
+
+ if (flags & SHADER_CACHING_VRAM)
+ igt_skip_on_f(!xe_has_vram(fd), "Device does not have VRAM.\n");
+
+ data = online_debug_data_create(hwe);
+ s = xe_eudebug_session_create(fd, run_online_client, flags, data);
+
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
+ open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_resume_caching_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
+ create_metadata_trigger);
+ xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ xe_eudebug_session_run(s);
+ online_session_check(s, s->flags);
+ xe_eudebug_session_destroy(s);
+ online_debug_data_destroy(data);
+}
+
+static int wait_for_exception(struct online_debug_data *data, int timeout)
+{
+ int ret = -ETIMEDOUT;
+
+ igt_for_milliseconds(timeout) {
+ pthread_mutex_lock(&data->mutex);
+ if ((data->exception_arrived.tv_sec |
+ data->exception_arrived.tv_nsec) != 0)
+ ret = 0;
+ pthread_mutex_unlock(&data->mutex);
+
+ if (!ret)
+ break;
+ usleep(1000);
+ }
+
+ return ret;
+}
+
+#define is_compute_on_gt(__e, __gt) (((__e)->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
+ (__e)->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) && \
+ (__e)->gt_id == (__gt))
+
+struct xe_engine_list_entry {
+ struct igt_list_head link;
+ struct drm_xe_engine_class_instance *hwe;
+};
+
+#define MAX_TILES 2
+static int find_suitable_engines(struct drm_xe_engine_class_instance *hwes[GEM_MAX_ENGINES],
+ int fd, bool many_tiles)
+{
+ struct xe_device *xe_dev;
+ struct drm_xe_engine_class_instance *e;
+ struct xe_engine_list_entry *en, *tmp;
+ struct igt_list_head compute_engines[MAX_TILES];
+ int gt_id;
+ int tile_id, i, engine_count = 0, tile_count = 0;
+
+ xe_dev = xe_device_get(fd);
+
+ for (i = 0; i < MAX_TILES; i++)
+ IGT_INIT_LIST_HEAD(&compute_engines[i]);
+
+ xe_for_each_gt(fd, gt_id) {
+ xe_for_each_engine(fd, e) {
+ if (is_compute_on_gt(e, gt_id)) {
+ tile_id = xe_dev->gt_list->gt_list[gt_id].tile_id;
+
+ en = malloc(sizeof(struct xe_engine_list_entry));
+ en->hwe = e;
+
+ igt_list_add_tail(&en->link, &compute_engines[tile_id]);
+ }
+ }
+ }
+
+ for (i = 0; i < MAX_TILES; i++) {
+ if (igt_list_empty(&compute_engines[i]))
+ continue;
+
+ if (many_tiles) {
+ en = igt_list_first_entry(&compute_engines[i], en, link);
+ hwes[engine_count++] = en->hwe;
+ tile_count++;
+ } else {
+ if (igt_list_length(&compute_engines[i]) > 1) {
+ igt_list_for_each_entry(en, &compute_engines[i], link)
+ hwes[engine_count++] = en->hwe;
+ break;
+ }
+ }
+ }
+
+ for (i = 0; i < MAX_TILES; i++) {
+ igt_list_for_each_entry_safe(en, tmp, &compute_engines[i], link) {
+ igt_list_del(&en->link);
+ free(en);
+ }
+ }
+
+ if (many_tiles)
+ igt_require_f(tile_count > 1, "Multi-tile scenario requires more than one tile\n");
+
+ return engine_count;
+}
+
+/**
+ * SUBTEST: breakpoint-many-sessions-single-tile
+ * Description:
+ * Schedules EU workload with preinstalled breakpoint on every compute engine
+ * available on the tile. Checks if the contexts hit breakpoint in sequence
+ * and resumes them.
+ *
+ * SUBTEST: breakpoint-many-sessions-tiles
+ * Description:
+ * Schedules EU workload with preinstalled breakpoint on selected compute
+ * engines, with one per tile. Checks if each context hit breakpoint and
+ * resumes them.
+ */
+static void test_many_sessions_on_tiles(int fd, bool multi_tile)
+{
+ int n = 0, flags = SHADER_BREAKPOINT | SHADER_MIN_THREADS;
+ struct xe_eudebug_session *s[GEM_MAX_ENGINES] = {};
+ struct online_debug_data *data[GEM_MAX_ENGINES] = {};
+ struct drm_xe_engine_class_instance *hwe[GEM_MAX_ENGINES] = {};
+ struct drm_xe_eudebug_event_eu_attention *eus;
+ uint64_t current_t, next_t, diff;
+ int i;
+
+ n = find_suitable_engines(hwe, fd, multi_tile);
+
+ igt_require_f(n > 1, "Test requires at least two parallel compute engines!\n");
+
+ for (i = 0; i < n; i++) {
+ data[i] = online_debug_data_create(hwe[i]);
+ s[i] = xe_eudebug_session_create(fd, run_online_client, flags, data[i]);
+
+ xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ eu_attention_debug_trigger);
+ xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
+ save_first_exception_trigger);
+ xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
+ ufence_ack_trigger);
+
+ igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
+
+ xe_eudebug_debugger_start_worker(s[i]->debugger);
+ xe_eudebug_client_start(s[i]->client);
+ }
+
+ for (i = 0; i < n; i++) {
+ /* XXX: Sometimes racy, expects clients to execute in sequence */
+ igt_assert(!wait_for_exception(data[i], STARTUP_TIMEOUT_MS));
+
+ eus = (struct drm_xe_eudebug_event_eu_attention *)data[i]->exception_event;
+
+ /* Delay all but the last workload to check serialization */
+ if (i < n - 1)
+ usleep(WORKLOAD_DELAY_US);
+
+ eu_ctl_resume(s[i]->debugger->master_fd, s[i]->debugger->fd,
+ eus->client_handle, eus->exec_queue_handle,
+ eus->lrc_handle, eus->bitmask, eus->bitmask_size);
+ free(eus);
+ }
+
+ for (i = 0; i < n - 1; i++) {
+ /* Convert timestamps to microseconds, including the seconds part */
+ current_t = data[i]->exception_arrived.tv_sec * 1000000ULL +
+     data[i]->exception_arrived.tv_nsec / 1000;
+ next_t = data[i + 1]->exception_arrived.tv_sec * 1000000ULL +
+     data[i + 1]->exception_arrived.tv_nsec / 1000;
+ diff = current_t < next_t ? next_t - current_t : current_t - next_t;
+
+ if (multi_tile)
+ igt_assert_f(diff < WORKLOAD_DELAY_US,
+ "Expected to execute workloads concurrently. Actual delay: %lu us\n",
+ diff);
+ else
+ igt_assert_f(diff >= WORKLOAD_DELAY_US,
+ "Expected a serialization of workloads. Actual delay: %lu us\n",
+ diff);
+ }
+
+ for (i = 0; i < n; i++) {
+ xe_eudebug_client_wait_done(s[i]->client);
+ xe_eudebug_debugger_stop_worker(s[i]->debugger, 1);
+
+ xe_eudebug_event_log_print(s[i]->debugger->log, true);
+ online_session_check(s[i], flags);
+
+ xe_eudebug_session_destroy(s[i]);
+ online_debug_data_destroy(data[i]);
+ }
+}
+
+static struct drm_xe_engine_class_instance *pick_compute(int fd, int gt)
+{
+ struct drm_xe_engine_class_instance *hwe;
+ int count = 0;
+
+ xe_for_each_engine(fd, hwe)
+ if (is_compute_on_gt(hwe, gt))
+ count++;
+
+ xe_for_each_engine(fd, hwe)
+ if (is_compute_on_gt(hwe, gt) && rand() % count-- == 0)
+ return hwe;
+
+ return NULL;
+}
+
+#define test_gt_render_or_compute(t, i915, __hwe) \
+ igt_subtest_with_dynamic(t) \
+ for (int gt = 0; (__hwe = pick_compute(i915, gt)); gt++) \
+ igt_dynamic_f("%s%d", xe_engine_class_string(__hwe->engine_class), \
__hwe->engine_instance)
+
+igt_main
+{
+ struct drm_xe_engine_class_instance *hwe;
+ bool was_enabled;
+ int fd;
+
+ igt_fixture {
+ fd = drm_open_driver(DRIVER_XE);
+ intel_allocator_multiprocess_start();
+ igt_srandom();
+ was_enabled = xe_eudebug_enable(fd, true);
+ }
+
+ test_gt_render_or_compute("basic-breakpoint", fd, hwe)
+ test_basic_online(fd, hwe, SHADER_BREAKPOINT);
+
+ test_gt_render_or_compute("preempt-breakpoint", fd, hwe)
+ test_preemption(fd, hwe);
+
+ test_gt_render_or_compute("set-breakpoint", fd, hwe)
+ test_set_breakpoint_online(fd, hwe, SHADER_NOP | TRIGGER_UFENCE_SET_BREAKPOINT);
+
+ test_gt_render_or_compute("breakpoint-not-in-debug-mode", fd, hwe)
+ test_basic_online(fd, hwe, SHADER_BREAKPOINT | DISABLE_DEBUG_MODE);
+
+ test_gt_render_or_compute("stopped-thread", fd, hwe)
+ test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED);
+
+ test_gt_render_or_compute("resume-one", fd, hwe)
+ test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_ONE);
+
+ test_gt_render_or_compute("resume-dss", fd, hwe)
+ test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_DSS);
+
+ test_gt_render_or_compute("interrupt-all", fd, hwe)
+ test_interrupt_all(fd, hwe, SHADER_LOOP);
+
+ test_gt_render_or_compute("interrupt-other-debuggable", fd, hwe)
+ test_interrupt_other(fd, hwe, SHADER_LOOP);
+
+ test_gt_render_or_compute("interrupt-other", fd, hwe)
+ test_interrupt_other(fd, hwe, SHADER_LOOP | DISABLE_DEBUG_MODE);
+
+ test_gt_render_or_compute("interrupt-all-set-breakpoint", fd, hwe)
+ test_interrupt_all(fd, hwe, SHADER_LOOP | TRIGGER_RESUME_SET_BP);
+
+ test_gt_render_or_compute("tdctl-parameters", fd, hwe)
+ test_tdctl_parameters(fd, hwe, SHADER_LOOP);
+
+ test_gt_render_or_compute("reset-with-attention", fd, hwe)
+ test_reset_with_attention_online(fd, hwe, SHADER_BREAKPOINT);
+
+ test_gt_render_or_compute("interrupt-reconnect", fd, hwe)
+ test_interrupt_reconnect(fd, hwe, SHADER_LOOP | TRIGGER_RECONNECT);
+
+ test_gt_render_or_compute("single-step", fd, hwe)
+ test_single_step(fd, hwe, SHADER_SINGLE_STEP | SIP_SINGLE_STEP |
+ TRIGGER_RESUME_PARALLEL_WALK);
+
+ test_gt_render_or_compute("single-step-one", fd, hwe)
+ test_single_step(fd, hwe, SHADER_SINGLE_STEP | SIP_SINGLE_STEP |
+ TRIGGER_RESUME_SINGLE_WALK);
+
+ test_gt_render_or_compute("debugger-reopen", fd, hwe)
+ test_debugger_reopen(fd, hwe, SHADER_N_NOOP_BREAKPOINT);
+
+ test_gt_render_or_compute("writes-caching-sram", fd, hwe)
+ test_caching(fd, hwe, SHADER_CACHING_SRAM);
+
+ test_gt_render_or_compute("writes-caching-vram", fd, hwe)
+ test_caching(fd, hwe, SHADER_CACHING_VRAM);
+
+ igt_subtest("breakpoint-many-sessions-single-tile")
+ test_many_sessions_on_tiles(fd, false);
+
+ igt_subtest("breakpoint-many-sessions-tiles")
+ test_many_sessions_on_tiles(fd, true);
+
+ igt_fixture {
+ xe_eudebug_enable(fd, was_enabled);
+
+ intel_allocator_multiprocess_stop();
+ drm_close_driver(fd);
+ }
+}
diff --git a/tests/meson.build b/tests/meson.build
index 43e8516f4..e5d8852f3 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -321,6 +321,7 @@ intel_xe_progs = [
intel_xe_eudebug_progs = [
'xe_eudebug',
'xe_exec_sip_eudebug',
+ 'xe_eudebug_online',
]
if build_xe_eudebug
--
2.34.1
^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU
2024-09-05 9:28 ` [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU Christoph Manszewski
@ 2024-09-13 11:39 ` Zbigniew Kempczyński
2024-09-17 19:34 ` Grzegorzek, Dominik
0 siblings, 1 reply; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-13 11:39 UTC (permalink / raw)
To: Christoph Manszewski
Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek, Maciej Patelczyk,
Dominik Karol Piątkowski, Pawel Sikora, Andrzej Hajda,
Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun
On Thu, Sep 05, 2024 at 11:28:11AM +0200, Christoph Manszewski wrote:
> From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
>
> For typical debugging under gdb one can specify two main usecases:
> accessing and manipulating resources created by the application and
> manipulating thread execution (interrupting and setting breakpoints).
>
> This test adds coverage for the latter by checking that:
> - EU workloads that hit an instruction with the breakpoint bit set will
> halt execution and the debugger will report this via attention events,
> - the debugger is able to interrupt workload execution by issuing an
> 'interrupt_all' ioctl call,
> - the debugger is able to resume selected workloads that are stopped.
>
> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> Signed-off-by: Kolanupaka Naveena <kolanupaka.naveena@intel.com>
> ---
> tests/intel/xe_eudebug_online.c | 2254 +++++++++++++++++++++++++++++++
> tests/meson.build | 1 +
> 2 files changed, 2255 insertions(+)
> create mode 100644 tests/intel/xe_eudebug_online.c
>
> diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/xe_eudebug_online.c
> new file mode 100644
> index 000000000..20f8e3601
> --- /dev/null
> +++ b/tests/intel/xe_eudebug_online.c
> @@ -0,0 +1,2254 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +/**
> + * TEST: Tests for eudebug online functionality
> + * Category: Core
> + * Mega feature: EUdebug
> + * Sub-category: EUdebug tests
> + * Functionality: eu kernel debug
> + * Test category: functionality test
> + */
> +
> +#include "xe/xe_eudebug.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +#include "igt.h"
> +#include "intel_pat.h"
> +#include "intel_mocs.h"
> +#include "gpgpu_shader.h"
> +
> +#define SHADER_NOP (0 << 0)
> +#define SHADER_BREAKPOINT (1 << 0)
> +#define SHADER_LOOP (1 << 1)
> +#define SHADER_SINGLE_STEP (1 << 2)
> +#define SIP_SINGLE_STEP (1 << 3)
> +#define DISABLE_DEBUG_MODE (1 << 4)
> +#define SHADER_N_NOOP_BREAKPOINT (1 << 5)
> +#define SHADER_CACHING_SRAM (1 << 6)
> +#define SHADER_CACHING_VRAM (1 << 7)
> +#define SHADER_MIN_THREADS (1 << 8)
> +#define DO_NOT_EXPECT_CANARIES (1 << 9)
> +#define TRIGGER_UFENCE_SET_BREAKPOINT (1 << 24)
> +#define TRIGGER_RESUME_SINGLE_WALK (1 << 25)
> +#define TRIGGER_RESUME_PARALLEL_WALK (1 << 26)
> +#define TRIGGER_RECONNECT (1 << 27)
> +#define TRIGGER_RESUME_SET_BP (1 << 28)
> +#define TRIGGER_RESUME_DELAYED (1 << 29)
> +#define TRIGGER_RESUME_DSS (1 << 30)
> +#define TRIGGER_RESUME_ONE (1 << 31)
> +
> +#define DEBUGGER_REATTACHED 1
> +
> +#define SHADER_LOOP_N 3
> +#define SINGLE_STEP_COUNT 16
> +#define STEERING_SINGLE_STEP 0
> +#define STEERING_CONTINUE 0x00c0ffee
> +#define STEERING_END_LOOP 0xdeadca11
> +
> +#define CACHING_INIT_VALUE 0xcafe0000
> +#define CACHING_POISON_VALUE 0xcafedead
> +#define CACHING_VALUE(n) (CACHING_INIT_VALUE + (n))
> +
> +#define SHADER_CANARY 0x01010101
> +
> +#define WALKER_X_DIM 4
> +#define WALKER_ALIGNMENT 16
> +#define SIMD_SIZE 16
> +
> +#define STARTUP_TIMEOUT_MS 3000
> +#define WORKLOAD_DELAY_US (5000 * 1000)
> +
> +#define PAGE_SIZE 4096
> +
> +struct dim_t {
> + uint32_t x;
> + uint32_t y;
> + uint32_t alignment;
> +};
> +
> +static struct dim_t walker_dimensions(int threads)
> +{
> + uint32_t x_dim = min_t(x_dim, threads, WALKER_X_DIM);
> + struct dim_t ret = {
> + .x = x_dim,
> + .y = threads / x_dim,
> + .alignment = WALKER_ALIGNMENT
> + };
> +
> + return ret;
> +}
> +
> +static struct dim_t surface_dimensions(int threads)
> +{
> + struct dim_t ret = walker_dimensions(threads);
> +
> + ret.y = max_t(ret.y, threads / ret.x, 4);
> + ret.x *= SIMD_SIZE;
> + ret.alignment *= SIMD_SIZE;
> +
> + return ret;
> +}
> +
> +static uint32_t steering_offset(int threads)
> +{
> + struct dim_t w = walker_dimensions(threads);
> +
> + return ALIGN(w.x, w.alignment) * w.y * 4;
> +}
> +
> +static struct intel_buf *create_uc_buf(int fd, int width, int height)
> +{
> + struct intel_buf *buf;
> +
> + buf = intel_buf_create_full(buf_ops_create(fd), 0, width / 4, height,
> + 32, 0, I915_TILING_NONE, 0, 0, 0,
> + vram_if_possible(fd, 0),
> + DEFAULT_PAT_INDEX, DEFAULT_MOCS_INDEX);
> +
> + return buf;
> +}
> +
> +static int get_number_of_threads(uint64_t flags)
> +{
> + if (flags & SHADER_MIN_THREADS)
> + return 16;
> +
> + if (flags & (TRIGGER_RESUME_ONE | TRIGGER_RESUME_SINGLE_WALK |
> + TRIGGER_RESUME_PARALLEL_WALK | SHADER_CACHING_SRAM | SHADER_CACHING_VRAM))
> + return 32;
> +
> + return 512;
> +}
> +
> +static int caching_get_instruction_count(int fd, uint32_t s_dim__x, int flags)
> +{
> + uint64_t memory;
> +
> + igt_assert((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM));
> +
> + if (flags & SHADER_CACHING_SRAM)
> + memory = system_memory(fd);
> + else
> + memory = vram_memory(fd, 0);
> +
> + /* each instruction writes to given y offset */
> + return (2 * xe_min_page_size(fd, memory)) / s_dim__x;
> +}
> +
> +static struct gpgpu_shader *get_shader(int fd, const unsigned int flags)
> +{
> + struct dim_t w_dim = walker_dimensions(get_number_of_threads(flags));
> + struct dim_t s_dim = surface_dimensions(get_number_of_threads(flags));
> + static struct gpgpu_shader *shader;
> +
> + shader = gpgpu_shader_create(fd);
> +
> + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> + if (flags & SHADER_BREAKPOINT) {
> + gpgpu_shader__nop(shader);
> + gpgpu_shader__breakpoint(shader);
> + } else if (flags & SHADER_LOOP) {
> + gpgpu_shader__label(shader, 0);
> + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> + gpgpu_shader__jump_neq(shader, 0, w_dim.y, STEERING_END_LOOP);
> + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> + } else if (flags & SHADER_SINGLE_STEP) {
> + gpgpu_shader__nop(shader);
> + gpgpu_shader__breakpoint(shader);
> + for (int i = 0; i < SINGLE_STEP_COUNT; i++)
> + gpgpu_shader__nop(shader);
> + } else if (flags & SHADER_N_NOOP_BREAKPOINT) {
> + for (int i = 0; i < SHADER_LOOP_N; i++) {
> + gpgpu_shader__nop(shader);
> + gpgpu_shader__breakpoint(shader);
> + }
> + } else if ((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM)) {
> + gpgpu_shader__nop(shader);
> + gpgpu_shader__breakpoint(shader);
> + for (int i = 0; i < caching_get_instruction_count(fd, s_dim.x, flags); i++)
> + gpgpu_shader__common_target_write_u32(shader, s_dim.y + i, CACHING_VALUE(i));
> + gpgpu_shader__nop(shader);
> + gpgpu_shader__breakpoint(shader);
> + }
> +
> + gpgpu_shader__eot(shader);
Add a blank line before the return.
> + return shader;
> +}
> +
> +static struct gpgpu_shader *get_sip(int fd, const unsigned int flags)
> +{
> + struct dim_t w_dim = walker_dimensions(get_number_of_threads(flags));
> + static struct gpgpu_shader *sip;
> +
> + sip = gpgpu_shader_create(fd);
> + gpgpu_shader__write_aip(sip, 0);
> +
> + gpgpu_shader__wait(sip);
> + if (flags & SIP_SINGLE_STEP)
> + gpgpu_shader__end_system_routine_step_if_eq(sip, w_dim.y, 0);
> + else
> + gpgpu_shader__end_system_routine(sip, true);
Same here: add a blank line before the return.
> + return sip;
> +}
> +
> +static int count_set_bits(void *ptr, size_t size)
> +{
> + uint8_t *p = ptr;
> + int count = 0;
> + int i, j;
> +
Could this use hweight() (or a popcount helper) instead of the
open-coded double loop?
> + for (i = 0; i < size; i++)
> + for (j = 0; j < 8; j++)
> + count += !!(p[i] & (1 << j));
> +
> + return count;
> +}
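To illustrate the hweight() suggestion above, here is a minimal sketch using the GCC popcount builtin; this is my reimplementation of the patch's helper, not code from the patch, and assumes a hweight-style helper would be a one-to-one replacement:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch: count set bits with a popcount builtin instead of the
 * open-coded double loop. hweight8() or an igt equivalent, if
 * available, would replace __builtin_popcount() directly.
 */
static int count_set_bits(const void *ptr, size_t size)
{
	const uint8_t *p = ptr;
	int count = 0;

	for (size_t i = 0; i < size; i++)
		count += __builtin_popcount(p[i]);

	return count;
}
```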
> +
> +static int count_canaries_eq(uint32_t *ptr, struct dim_t w_dim, uint32_t value)
> +{
> + int count = 0;
> + int x, y;
> +
> + for (x = 0; x < w_dim.x; x++)
> + for (y = 0; y < w_dim.y; y++)
> + if (READ_ONCE(ptr[x + ALIGN(w_dim.x, w_dim.alignment) * y]) == value)
> + count++;
> +
> + return count;
> +}
> +
> +static int count_canaries_neq(uint32_t *ptr, struct dim_t w_dim, uint32_t value)
> +{
> + return w_dim.x * w_dim.y - count_canaries_eq(ptr, w_dim, value);
> +}
> +
> +static const char *td_ctl_cmd_to_str(uint32_t cmd)
> +{
> + switch (cmd) {
> + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL:
> + return "interrupt all";
> + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED:
> + return "stopped";
> + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME:
> + return "resume";
> + default:
> + return "unknown command";
> + }
> +}
> +
> +static int __eu_ctl(int debugfd, uint64_t client,
> + uint64_t exec_queue, uint64_t lrc,
> + uint8_t *bitmask, uint32_t *bitmask_size,
> + uint32_t cmd, uint64_t *seqno)
> +{
> + struct drm_xe_eudebug_eu_control control = {
> + .client_handle = lower_32_bits(client),
> + .exec_queue_handle = exec_queue,
> + .lrc_handle = lrc,
> + .cmd = cmd,
> + .bitmask_ptr = to_user_pointer(bitmask),
> + };
> + int ret;
> +
> + if (bitmask_size)
> + control.bitmask_size = *bitmask_size;
> +
> + ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &control);
> +
> + if (ret < 0)
> + return -errno;
> +
> + igt_debug("EU CONTROL[%llu]: %s\n", control.seqno, td_ctl_cmd_to_str(cmd));
> +
> + if (bitmask_size)
> + *bitmask_size = control.bitmask_size;
> +
> + if (seqno)
> + *seqno = control.seqno;
> +
> + return 0;
> +}
> +
> +static uint64_t eu_ctl(int debugfd, uint64_t client,
> + uint64_t exec_queue, uint64_t lrc,
> + uint8_t *bitmask, uint32_t *bitmask_size, uint32_t cmd)
> +{
> + uint64_t seqno;
> +
> + igt_assert_eq(__eu_ctl(debugfd, client, exec_queue, lrc, bitmask,
> + bitmask_size, cmd, &seqno), 0);
> +
> + return seqno;
> +}
> +
> +static bool intel_gen_needs_resume_wa(int fd)
> +{
> + const uint32_t id = intel_get_drm_devid(fd);
> +
> + return intel_gen(id) == 12 && intel_graphics_ver(id) < IP_VER(12, 55);
> +}
> +
> +static uint64_t eu_ctl_resume(int fd, int debugfd, uint64_t client,
> + uint64_t exec_queue, uint64_t lrc,
> + uint8_t *bitmask, uint32_t bitmask_size)
> +{
> + int i;
> +
> + /* Wa_14011332042 */
> + if (intel_gen_needs_resume_wa(fd)) {
> + uint32_t *att_reg_half = (uint32_t *)bitmask;
> +
> + for (i = 0; i < bitmask_size / sizeof(uint32_t); i += 2) {
> + att_reg_half[i] |= att_reg_half[i + 1];
> + att_reg_half[i + 1] |= att_reg_half[i];
> + }
> + }
> +
> + return eu_ctl(debugfd, client, exec_queue, lrc, bitmask, &bitmask_size,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME);
> +}
> +
> +static inline uint64_t eu_ctl_stopped(int debugfd, uint64_t client,
> + uint64_t exec_queue, uint64_t lrc,
> + uint8_t *bitmask, uint32_t *bitmask_size)
> +{
> + return eu_ctl(debugfd, client, exec_queue, lrc, bitmask, bitmask_size,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED);
> +}
> +
> +static inline uint64_t eu_ctl_interrupt_all(int debugfd, uint64_t client,
> + uint64_t exec_queue, uint64_t lrc)
> +{
> + return eu_ctl(debugfd, client, exec_queue, lrc, NULL, 0,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL);
> +}
> +
> +struct online_debug_data {
> + pthread_mutex_t mutex;
> + /* client in */
> + struct drm_xe_engine_class_instance hwe;
> + /* client out */
> + int threads_count;
> + /* debugger internals */
> + uint64_t client_handle;
> + uint64_t exec_queue_handle;
> + uint64_t lrc_handle;
> + uint64_t target_offset;
> + size_t target_size;
> + uint64_t bb_offset;
> + size_t bb_size;
> + int vm_fd;
> + uint32_t first_aip;
> + uint64_t *aips_offset_table;
> + uint32_t steps_done;
> + uint8_t *single_step_bitmask;
> + int stepped_threads_count;
> + struct timespec exception_arrived;
> + int last_eu_control_seqno;
> + struct drm_xe_eudebug_event *exception_event;
> +};
> +
> +static struct online_debug_data *
> +online_debug_data_create(struct drm_xe_engine_class_instance *hwe)
> +{
> + struct online_debug_data *data;
> +
> + data = mmap(0, ALIGN(sizeof(*data), PAGE_SIZE),
> + PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
Check that mmap() succeeded; note it returns MAP_FAILED on error,
not NULL.
> + memcpy(&data->hwe, hwe, sizeof(*hwe));
> + pthread_mutex_init(&data->mutex, NULL);
> + data->client_handle = -1ULL;
> + data->exec_queue_handle = -1ULL;
> + data->lrc_handle = -1ULL;
> + data->vm_fd = -1;
> + data->stepped_threads_count = -1;
> +
> + return data;
> +}
> +
> +static void online_debug_data_destroy(struct online_debug_data *data)
> +{
> + free(data->aips_offset_table);
> + munmap(data, ALIGN(sizeof(*data), PAGE_SIZE));
> +}
> +
> +static void eu_attention_debug_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
> + uint32_t *ptr = (uint32_t *)att->bitmask;
> +
> + igt_debug("EVENT[%llu] eu-attenttion; threads=%d "
> + "client[%llu], exec_queue[%llu], lrc[%llu], bitmask_size[%d]\n",
> + att->base.seqno, count_set_bits(att->bitmask, att->bitmask_size),
> + att->client_handle, att->exec_queue_handle,
> + att->lrc_handle, att->bitmask_size);
> +
> + for (uint32_t i = 0; i < att->bitmask_size / 4; i += 2)
> + igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, ptr[i], ptr[i + 1]);
> +}
> +
> +static void eu_attention_reset_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
> + uint32_t *ptr = (uint32_t *)att->bitmask;
> + struct online_debug_data *data = d->ptr;
> +
> + igt_debug("EVENT[%llu] eu-attention with reset; threads=%d "
> + "client[%llu], exec_queue[%llu], lrc[%llu], bitmask_size[%d]\n",
> + att->base.seqno, count_set_bits(att->bitmask, att->bitmask_size),
> + att->client_handle, att->exec_queue_handle,
> + att->lrc_handle, att->bitmask_size);
> +
> + for (uint32_t i = 0; i < att->bitmask_size / 4; i += 2)
> + igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, ptr[i], ptr[i + 1]);
> +
> + xe_force_gt_reset_async(d->master_fd, data->hwe.gt_id);
> +}
> +
> +static void copy_first_bit(uint8_t *dst, uint8_t *src, int size)
> +{
> + bool found = false;
> + int i, j;
> +
> + for (i = 0; i < size; i++) {
> + if (found) {
> + dst[i] = 0;
The function is static, but given the line above I would add a
comment that it also clears the rest of the dst buffer.
copy_first_bit() is misleading, as you mean the first *set* bit;
the literal first bit is src[0] & 1.
And what does 'first' mean here? Given, say,
src = { 0x0, 0xff, 0xcc, 0xaa }, I would expect 'first' to be the
most significant bit of 0xff.
> + } else {
> + uint32_t tmp = src[i]; /* in case dst == src */
> +
> + for (j = 0; j < 8; j++) {
ffs()? Although, judging by copy_nth_bit() below, I have doubts:
shouldn't this be fls()?
> + dst[i] = tmp & (1 << j);
> + if (dst[i]) {
> + found = true;
> + break;
> + }
> + }
> + }
> + }
> +}
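As a sketch of the ffs() idea (and of the *_set_bit naming), assuming the intended semantics are "keep only the lowest set bit, byte order first, then bit order within the byte, and clear everything else" — my illustration, not code from the patch:

```c
#include <stdint.h>
#include <strings.h> /* ffs() */

/* Sketch: keep only the first set bit of src in dst and clear all
 * other bits. Safe for dst == src.
 */
static void copy_first_set_bit(uint8_t *dst, const uint8_t *src, int size)
{
	int i, found = 0;

	for (i = 0; i < size; i++) {
		uint8_t tmp = src[i]; /* in case dst == src */

		if (!found && tmp) {
			dst[i] = 1 << (ffs(tmp) - 1);
			found = 1;
		} else {
			dst[i] = 0;
		}
	}
}
```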
> +
> +static void copy_nth_bit(uint8_t *dst, uint8_t *src, int size, int n)
> +{
> + int count = 0;
> +
> + for (int i = 0; i < size; i++) {
> + uint32_t tmp = src[i];
> +
> + for (int j = 7; j >= 0; j--) {
I'm confused: in the function above you iterate starting from the
least significant bit, but here you start from the most significant
bit. Same concern about the function name - shouldn't this be
copy_nth_bit_set()?
> + if (tmp & (1 << j)) {
> + count++;
> + if (count == n)
> + dst[i] |= (1 << j);
> + else
> + dst[i] &= ~(1 << j);
Do I understand correctly that you are clearing the other bits in
dst? It's quite odd to call a function copy_nth_bit() when it scans
for the n-th *set* bit and zeroes all other bits in dst. Or perhaps
I just don't understand the logic behind this decision.
> + } else {
> + dst[i] &= ~(1 << j);
> + }
> + }
> + }
> +}
> +
> +/*
> + * Searches for the first instruction. It stands on assumption,
> + * that shader kernel is placed before sip within the bb.
> + */
> +static uint32_t find_kernel_in_bb(struct gpgpu_shader *kernel,
> + struct online_debug_data *data)
> +{
> + uint32_t *p = kernel->code;
> + size_t sz = 4 * sizeof(uint32_t);
> + uint32_t buf[4];
> + int i;
> +
> + for (i = 0; i < data->bb_size; i += sz) {
> + igt_assert_eq(pread(data->vm_fd, &buf, sz, data->bb_offset + i), sz);
> +
> +
Unnecessary blank line.
> + if (memcmp(p, buf, sz) == 0)
> + break;
> + }
Wouldn't it be simpler to pread() the whole bb and then use memmem()?
Unless you want to exercise pread() with different offsets as well.
> +
> + igt_assert(i < data->bb_size);
> +
> + return i;
> +}
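Sketch of the memmem() alternative: read the whole bb once, then do a single search for the first four dwords of the kernel. The helper name and signature are illustrative, not from the patch (and unlike the pread() loop, memmem() would also match at unaligned offsets):

```c
#define _GNU_SOURCE /* memmem() */
#include <stdint.h>
#include <string.h>

/* Sketch: locate the first 4 dwords of the kernel inside an
 * already-read copy of the bb with one memmem() call, instead of
 * pread()-ing 16 bytes at a time. Returns -1 when not found.
 */
static long find_kernel_offset(const void *bb, size_t bb_size,
			       const uint32_t *code)
{
	const size_t needle_sz = 4 * sizeof(uint32_t);
	const void *hit = memmem(bb, bb_size, code, needle_sz);

	return hit ? (const char *)hit - (const char *)bb : -1;
}
```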
> +
> +static void set_breakpoint_once(struct xe_eudebug_debugger *d,
> + struct online_debug_data *data)
> +{
> + const uint32_t breakpoint_bit = 1 << 30;
> + size_t sz = sizeof(uint32_t);
> + struct gpgpu_shader *kernel;
> + uint32_t aip;
> +
> + kernel = get_shader(d->master_fd, d->flags);
> +
> + if (data->first_aip) {
> + uint32_t expected = find_kernel_in_bb(kernel, data) + kernel->size * 4 - 0x10;
> +
> + igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset), sz);
> + igt_assert_eq_u32(aip, expected);
I've checked how this is used: since this branch just compares the
aip, it seems the function is called a second time to validate that
the target offset contains the stored aip. Shouldn't that live in a
separate function, e.g. check_aip() or similar, rather than inside
set_breakpoint_once()?
> + } else {
> + uint32_t instr_usdw;
> +
> + igt_assert(data->vm_fd != -1);
> + igt_assert(data->target_size != 0);
> + igt_assert(data->bb_size != 0);
> +
> + igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset), sz);
> + data->first_aip = aip;
> +
> + aip = find_kernel_in_bb(kernel, data);
> +
> + /* set breakpoint on last instruction */
> + aip += kernel->size * 4 - 0x10;
> + igt_assert_eq(pread(data->vm_fd, &instr_usdw, sz,
> + data->bb_offset + aip), sz);
> + instr_usdw |= breakpoint_bit;
> + igt_assert_eq(pwrite(data->vm_fd, &instr_usdw, sz,
> + data->bb_offset + aip), sz);
> +
> + }
> +
> + gpgpu_shader_destroy(kernel);
> +}
> +
> +static void get_aips_offset_table(struct online_debug_data *data, int threads)
> +{
> + size_t sz = sizeof(uint32_t);
> + uint32_t aip;
> + uint32_t first_aip;
> + int table_index = 0;
> +
> + if (data->aips_offset_table)
> + return;
> +
> + data->aips_offset_table = malloc(threads * sizeof(uint64_t));
> + igt_assert(data->aips_offset_table);
> +
> + igt_assert_eq(pread(data->vm_fd, &first_aip, sz, data->target_offset), sz);
> + data->first_aip = first_aip;
> + data->aips_offset_table[table_index++] = 0;
> +
> + fsync(data->vm_fd);
> + for (int i = sz; i < data->target_size; i += sz) {
> + igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset + i), sz);
> + if (aip == first_aip)
> + data->aips_offset_table[table_index++] = i;
> + }
> +
> + igt_assert_eq(threads, table_index);
> +
> + igt_debug("AIPs offset table:\n");
> + for (int i = 0; i < threads; i++)
> + igt_debug("%lx\n", data->aips_offset_table[i]);
> +}
> +
> +static int get_stepped_threads_count(struct online_debug_data *data, int threads)
> +{
> + int count = 0;
> + size_t sz = sizeof(uint32_t);
> + uint32_t aip;
> +
> + fsync(data->vm_fd);
> + for (int i = 0; i < threads; i++) {
> + igt_assert_eq(pread(data->vm_fd, &aip, sz,
> + data->target_offset + data->aips_offset_table[i]), sz);
> + if (aip != data->first_aip) {
> + igt_assert(aip == data->first_aip + 0x10);
> + count++;
> + }
> + }
> +
> + return count;
> +}
> +
> +static void save_first_exception_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct online_debug_data *data = d->ptr;
> +
> + pthread_mutex_lock(&data->mutex);
> + if (!data->exception_event) {
> + igt_gettime(&data->exception_arrived);
> + data->exception_event = igt_memdup(e, e->len);
> + }
> + pthread_mutex_unlock(&data->mutex);
> +}
> +
> +#define MAX_PREEMPT_TIMEOUT 10ull
> +static void eu_attention_resume_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_eu_attention *att = (void *) e;
> + struct online_debug_data *data = d->ptr;
> + uint32_t bitmask_size = att->bitmask_size;
> + uint8_t *bitmask;
> + int i;
> +
> + if (data->last_eu_control_seqno > att->base.seqno)
> + return;
> +
> + bitmask = calloc(1, att->bitmask_size);
> +
> + eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle,
> + att->lrc_handle, bitmask, &bitmask_size);
> + igt_assert(bitmask_size == att->bitmask_size);
> + igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0);
> +
> + pthread_mutex_lock(&data->mutex);
> + if (igt_nsec_elapsed(&data->exception_arrived) < (MAX_PREEMPT_TIMEOUT + 1) * NSEC_PER_SEC &&
> + d->flags & TRIGGER_RESUME_DELAYED) {
> + pthread_mutex_unlock(&data->mutex);
> + free(bitmask);
> + return;
> + } else if (d->flags & TRIGGER_RESUME_ONE) {
> + copy_first_bit(bitmask, bitmask, bitmask_size);
> + } else if (d->flags & TRIGGER_RESUME_DSS) {
> + uint64_t *event = (uint64_t *)att->bitmask;
> + uint64_t *resume = (uint64_t *)bitmask;
> +
> + memset(bitmask, 0, bitmask_size);
> + for (i = 0; i < att->bitmask_size / sizeof(uint64_t); i++) {
> + if (!event[i])
> + continue;
> +
> + resume[i] = event[i];
> + break;
> + }
> + } else if (d->flags & TRIGGER_RESUME_SET_BP) {
> + set_breakpoint_once(d, data);
> + }
> +
> + if (d->flags & SHADER_LOOP) {
> + uint32_t threads = get_number_of_threads(d->flags);
> + uint32_t val = STEERING_END_LOOP;
> +
> + igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t),
> + data->target_offset + steering_offset(threads)),
> + sizeof(uint32_t));
> + fsync(data->vm_fd);
> + }
> + pthread_mutex_unlock(&data->mutex);
> +
> + data->last_eu_control_seqno = eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
> + att->exec_queue_handle, att->lrc_handle,
> + bitmask, att->bitmask_size);
> +
> + free(bitmask);
> +}
> +
> +static void eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_eu_attention *att = (void *) e;
> + struct online_debug_data *data = d->ptr;
> + const int threads = get_number_of_threads(d->flags);
> + uint32_t val;
> + size_t sz = sizeof(uint32_t);
> +
> + get_aips_offset_table(data, threads);
> +
> + if (d->flags & TRIGGER_RESUME_PARALLEL_WALK) {
> + if (data->stepped_threads_count != -1)
> + if (data->steps_done < SINGLE_STEP_COUNT) {
> + int stepped_threads_count_after_resume =
> + get_stepped_threads_count(data, threads);
> + igt_debug("Stepped threads after: %d\n",
> + stepped_threads_count_after_resume);
> +
> + if (stepped_threads_count_after_resume == threads) {
> + data->first_aip += 0x10;
> + data->steps_done++;
> + }
> +
> + igt_debug("Shader steps: %d\n", data->steps_done);
> + igt_assert(data->stepped_threads_count == 0);
> + igt_assert(stepped_threads_count_after_resume == threads);
> + }
> +
> + if (data->steps_done < SINGLE_STEP_COUNT) {
> + data->stepped_threads_count = get_stepped_threads_count(data, threads);
> + igt_debug("Stepped threads before: %d\n", data->stepped_threads_count);
> + }
> +
> + val = data->steps_done < SINGLE_STEP_COUNT ? STEERING_SINGLE_STEP :
> + STEERING_CONTINUE;
> + } else if (d->flags & TRIGGER_RESUME_SINGLE_WALK) {
> + if (data->stepped_threads_count != -1)
> + if (data->steps_done < 2) {
> + int stepped_threads_count_after_resume =
> + get_stepped_threads_count(data, threads);
> + igt_debug("Stepped threads after: %d\n",
> + stepped_threads_count_after_resume);
> +
> + if (stepped_threads_count_after_resume == threads) {
> + data->first_aip += 0x10;
> + data->steps_done++;
> + free(data->single_step_bitmask);
> + data->single_step_bitmask = 0;
> + }
> +
> + igt_debug("Shader steps: %d\n", data->steps_done);
> + igt_assert(data->stepped_threads_count +
> + (intel_gen_needs_resume_wa(d->master_fd) ? 2 : 1) ==
> + stepped_threads_count_after_resume);
> + }
> +
> + if (data->steps_done < 2) {
> + data->stepped_threads_count = get_stepped_threads_count(data, threads);
> + igt_debug("Stepped threads before: %d\n", data->stepped_threads_count);
> + if (intel_gen_needs_resume_wa(d->master_fd)) {
> + if (!data->single_step_bitmask) {
> + data->single_step_bitmask = malloc(att->bitmask_size *
> + sizeof(uint8_t));
> + igt_assert(data->single_step_bitmask);
> + memcpy(data->single_step_bitmask, att->bitmask,
> + att->bitmask_size);
> + }
> +
> + copy_first_bit(att->bitmask, data->single_step_bitmask,
> + att->bitmask_size);
> + } else
> + copy_nth_bit(att->bitmask, att->bitmask, att->bitmask_size,
> + data->stepped_threads_count + 1);
> + }
> +
> + val = data->steps_done < 2 ? STEERING_SINGLE_STEP : STEERING_CONTINUE;
> + }
> +
> + igt_assert_eq(pwrite(data->vm_fd, &val, sz,
> + data->target_offset + steering_offset(threads)), sz);
> + fsync(data->vm_fd);
> +
> + eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
> + att->exec_queue_handle, att->lrc_handle,
> + att->bitmask, att->bitmask_size);
> +
> + if (data->single_step_bitmask)
> + for (int i = 0; i < att->bitmask_size; i++)
> + data->single_step_bitmask[i] &= ~att->bitmask[i];
> +}
> +
> +static void open_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_client *client = (void *)e;
> + struct online_debug_data *data = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> + return;
> +
> + pthread_mutex_lock(&data->mutex);
> + data->client_handle = client->client_handle;
> + pthread_mutex_unlock(&data->mutex);
> +}
> +
> +static void exec_queue_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_exec_queue *eq = (void *)e;
> + struct online_debug_data *data = d->ptr;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> + return;
> +
> + pthread_mutex_lock(&data->mutex);
> + data->exec_queue_handle = eq->exec_queue_handle;
> + data->lrc_handle = eq->lrc_handle[0];
> + pthread_mutex_unlock(&data->mutex);
> +}
> +
> +static void vm_open_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm *vm = (void *)e;
> + struct online_debug_data *data = d->ptr;
> + struct drm_xe_eudebug_vm_open vo = {
> + .client_handle = vm->client_handle,
> + .vm_handle = vm->vm_handle,
> + };
> + int fd;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> + igt_assert_lte(0, fd);
> +
> + pthread_mutex_lock(&data->mutex);
> + igt_assert(data->vm_fd == -1);
> + data->vm_fd = fd;
> + pthread_mutex_unlock(&data->mutex);
> + return;
> + }
> +
> + pthread_mutex_lock(&data->mutex);
> + close(data->vm_fd);
> + data->vm_fd = -1;
> + pthread_mutex_unlock(&data->mutex);
> +}
> +
> +static void read_metadata(struct xe_eudebug_debugger *d,
> + uint64_t client_handle,
> + uint64_t metadata_handle,
> + uint64_t type,
> + uint64_t len)
> +{
> + struct drm_xe_eudebug_read_metadata rm = {
> + .client_handle = client_handle,
> + .metadata_handle = metadata_handle,
> + .size = len,
> + };
> + struct online_debug_data *data = d->ptr;
> + uint64_t *metadata;
> +
> + metadata = malloc(len);
> + igt_assert(metadata);
> +
> + rm.ptr = to_user_pointer(metadata);
> + igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
> +
> + pthread_mutex_lock(&data->mutex);
> + switch (type) {
> + case DRM_XE_DEBUG_METADATA_ELF_BINARY:
> + data->bb_offset = metadata[0];
> + data->bb_size = metadata[1];
> + break;
> + case DRM_XE_DEBUG_METADATA_PROGRAM_MODULE:
> + data->target_offset = metadata[0];
> + data->target_size = metadata[1];
> + break;
> + default:
> + break;
> + }
> + pthread_mutex_unlock(&data->mutex);
> +
> + free(metadata);
> +}
> +
> +static void create_metadata_trigger(struct xe_eudebug_debugger *d, struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_metadata *em = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> + read_metadata(d, em->client_handle, em->metadata_handle, em->type, em->len);
> +}
> +
> +static void overwrite_immediate_value_in_common_target_write(int vm_fd, uint64_t offset,
> + uint32_t old_val, uint32_t new_val)
> +{
> + uint64_t addr = offset;
> + int vals_changed = 0;
> + uint32_t val;
> +
> + while (vals_changed < 4) {
> + igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr), sizeof(uint32_t));
> + if (val == old_val) {
> + igt_debug("val_before_write[%d]: %08x\n", vals_changed, val);
> + igt_assert_eq(pwrite(vm_fd, &new_val, sizeof(uint32_t), addr),
> + sizeof(uint32_t));
> + igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr),
> + sizeof(uint32_t));
> + igt_debug("val_before_fsync[%d]: %08x\n", vals_changed, val);
> + fsync(vm_fd);
> + igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr),
> + sizeof(uint32_t));
> + igt_debug("val_after_fsync[%d]: %08x\n", vals_changed, val);
> + igt_assert_eq_u32(val, new_val);
> + vals_changed++;
> + }
> + addr += sizeof(uint32_t);
> + }
> +}
> +
> +static void eu_attention_resume_caching_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
> + struct online_debug_data *data = d->ptr;
> + static int counter;
> + static int kernel_in_bb;
Reusing this function (currently it is used only once) may be
error-prone because of these statics. Shouldn't they be put in the
debugger's private data?
> + struct dim_t s_dim = surface_dimensions(get_number_of_threads(d->flags));
> + int val;
> + uint32_t instr_usdw;
> + struct gpgpu_shader *kernel;
> + const uint32_t breakpoint_bit = 1 << 30;
> + struct gpgpu_shader *shader_preamble;
> + struct gpgpu_shader *shader_write_instr;
> +
> + shader_preamble = gpgpu_shader_create(d->master_fd);
> + gpgpu_shader__write_dword(shader_preamble, SHADER_CANARY, 0);
> + gpgpu_shader__nop(shader_preamble);
> + gpgpu_shader__breakpoint(shader_preamble);
> +
> + shader_write_instr = gpgpu_shader_create(d->master_fd);
> + gpgpu_shader__common_target_write_u32(shader_write_instr, 0, 0);
> +
> + if (!kernel_in_bb) {
> + kernel = get_shader(d->master_fd, d->flags);
> + kernel_in_bb = find_kernel_in_bb(kernel, data);
> + gpgpu_shader_destroy(kernel);
> + }
> +
> + /* set breakpoint on next write instruction */
> + if (counter < caching_get_instruction_count(d->master_fd, s_dim.x, d->flags)) {
> + igt_assert_eq(pread(data->vm_fd, &instr_usdw, sizeof(instr_usdw),
> + data->bb_offset + kernel_in_bb + shader_preamble->size * 4 +
> + shader_write_instr->size * 4 * counter), sizeof(instr_usdw));
> + instr_usdw |= breakpoint_bit;
> + igt_assert_eq(pwrite(data->vm_fd, &instr_usdw, sizeof(instr_usdw),
> + data->bb_offset + kernel_in_bb + shader_preamble->size * 4 +
> + shader_write_instr->size * 4 * counter), sizeof(instr_usdw));
> + fsync(data->vm_fd);
> + }
> +
> + /* restore current instruction */
> + if (counter && counter <= caching_get_instruction_count(d->master_fd, s_dim.x, d->flags))
> + overwrite_immediate_value_in_common_target_write(data->vm_fd,
> + data->bb_offset + kernel_in_bb +
> + shader_preamble->size * 4 +
> + shader_write_instr->size * 4 * (counter - 1),
> + CACHING_POISON_VALUE,
> + CACHING_VALUE(counter - 1));
> +
> + /* poison next instruction */
> + if (counter < caching_get_instruction_count(d->master_fd, s_dim.x, d->flags))
> + overwrite_immediate_value_in_common_target_write(data->vm_fd,
> + data->bb_offset + kernel_in_bb +
> + shader_preamble->size * 4 +
> + shader_write_instr->size * 4 * counter,
> + CACHING_VALUE(counter),
> + CACHING_POISON_VALUE);
> +
> + gpgpu_shader_destroy(shader_write_instr);
> + gpgpu_shader_destroy(shader_preamble);
> +
> + for (int i = 0; i < data->target_size; i += sizeof(uint32_t)) {
> + igt_assert_eq(pread(data->vm_fd, &val, sizeof(val), data->target_offset + i),
> + sizeof(val));
> + igt_assert_f(val != CACHING_POISON_VALUE, "Poison value found at %04d!\n", i);
> + }
> +
> + eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
> + att->exec_queue_handle, att->lrc_handle,
> + att->bitmask, att->bitmask_size);
> +
> + counter++;
> +}
> +
> +static struct intel_bb *xe_bb_create_on_offset(int fd, uint32_t exec_queue, uint32_t vm,
> + uint64_t offset, uint32_t size)
> +{
> + struct intel_bb *ibb;
> +
> + ibb = intel_bb_create_with_context(fd, exec_queue, vm, NULL, size);
> +
> + /* update intel bb offset */
> + intel_bb_remove_object(ibb, ibb->handle, ibb->batch_offset, ibb->size);
> + intel_bb_add_object(ibb, ibb->handle, ibb->size, offset, ibb->alignment, false);
> + ibb->batch_offset = offset;
> +
> + return ibb;
> +}
> +
> +static size_t get_bb_size(int flags)
> +{
> + if ((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM))
> + return 32768;
> +
> + return 4096;
> +}
> +
> +static void run_online_client(struct xe_eudebug_client *c)
> +{
> + int threads = get_number_of_threads(c->flags);
> + const uint64_t target_offset = 0x1a000000;
> + const uint64_t bb_offset = 0x1b000000;
> + const size_t bb_size = get_bb_size(c->flags);
> + struct online_debug_data *data = c->ptr;
> + struct drm_xe_engine_class_instance hwe = data->hwe;
> + struct drm_xe_ext_set_property ext = {
> + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> + };
> + struct drm_xe_exec_queue_create create = {
> + .instances = to_user_pointer(&hwe),
> + .width = 1,
> + .num_placements = 1,
> + .extensions = c->flags & DISABLE_DEBUG_MODE ? 0 : to_user_pointer(&ext)
> + };
> + struct dim_t w_dim = walker_dimensions(threads);
> + struct dim_t s_dim = surface_dimensions(threads);
> + struct timespec ts = { };
> + struct gpgpu_shader *sip, *shader;
> + uint32_t metadata_id[2];
> + uint64_t *metadata[2];
> + struct intel_bb *ibb;
> + struct intel_buf *buf;
> + uint32_t *ptr;
> + int fd;
> +
> + metadata[0] = calloc(2, sizeof(*metadata));
> + metadata[1] = calloc(2, sizeof(*metadata));
> + igt_assert(metadata[0]);
> + igt_assert(metadata[1]);
> +
> + fd = xe_eudebug_client_open_driver(c);
> + xe_device_get(fd);
Not necessary.
> +
> + /* Additional memory for steering control */
> + if (c->flags & SHADER_LOOP || c->flags & SHADER_SINGLE_STEP)
> + s_dim.y++;
> + /* Additional memory for caching check */
> + if ((c->flags & SHADER_CACHING_SRAM) || (c->flags & SHADER_CACHING_VRAM))
> + s_dim.y += caching_get_instruction_count(fd, s_dim.x, c->flags);
> + buf = create_uc_buf(fd, s_dim.x, s_dim.y);
> +
> + buf->addr.offset = target_offset;
> +
> + metadata[0][0] = bb_offset;
> + metadata[0][1] = bb_size;
> + metadata[1][0] = target_offset;
> + metadata[1][1] = buf->size;
> + metadata_id[0] = xe_eudebug_client_metadata_create(c, fd, DRM_XE_DEBUG_METADATA_ELF_BINARY,
> + 2 * sizeof(*metadata), metadata[0]);
> + metadata_id[1] = xe_eudebug_client_metadata_create(c, fd,
> + DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
> + 2 * sizeof(*metadata), metadata[1]);
> +
> + create.vm_id = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> + xe_eudebug_client_exec_queue_create(c, fd, &create);
> +
> + ibb = xe_bb_create_on_offset(fd, create.exec_queue_id, create.vm_id,
> + bb_offset, bb_size);
> + intel_bb_set_lr_mode(ibb, true);
> +
> + sip = get_sip(fd, c->flags);
> + shader = get_shader(fd, c->flags);
> +
> + igt_nsec_elapsed(&ts);
> + gpgpu_shader_exec(ibb, buf, w_dim.x, w_dim.y, shader, sip, 0, 0);
> +
> + gpgpu_shader_destroy(sip);
> + gpgpu_shader_destroy(shader);
> +
> + intel_bb_sync(ibb);
> +
> + if (c->flags & TRIGGER_RECONNECT)
> + xe_eudebug_client_wait_stage(c, DEBUGGER_REATTACHED);
> + else
> + /* Make sure it wasn't the timeout. */
> + igt_assert(igt_nsec_elapsed(&ts) < XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> +
> + if (!(c->flags & DO_NOT_EXPECT_CANARIES)) {
> + ptr = xe_bo_mmap_ext(fd, buf->handle, buf->size, PROT_READ);
> + data->threads_count = count_canaries_neq(ptr, w_dim, 0);
> + igt_assert_f(data->threads_count, "No canaries found, nothing executed?\n");
> +
> + if ((c->flags & SHADER_BREAKPOINT || c->flags & TRIGGER_RESUME_SET_BP ||
> + c->flags & SHADER_N_NOOP_BREAKPOINT) && !(c->flags & DISABLE_DEBUG_MODE)) {
> + uint32_t aip = ptr[0];
> +
> + igt_assert_f(aip != SHADER_CANARY,
> + "Workload executed but breakpoint not hit!\n");
> + igt_assert_eq(count_canaries_eq(ptr, w_dim, aip), data->threads_count);
> + igt_debug("Breakpoint hit in %d threads, AIP=0x%08x\n", data->threads_count,
> + aip);
> + }
> +
> + munmap(ptr, buf->size);
> + }
> +
> + intel_bb_destroy(ibb);
> +
> + xe_eudebug_client_exec_queue_destroy(c, fd, &create);
> + xe_eudebug_client_vm_destroy(c, fd, create.vm_id);
> +
> + xe_eudebug_client_metadata_destroy(c, fd, metadata_id[0], DRM_XE_DEBUG_METADATA_ELF_BINARY,
> + 2 * sizeof(*metadata));
> + xe_eudebug_client_metadata_destroy(c, fd, metadata_id[1],
> + DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
> + 2 * sizeof(*metadata));
> +
> + xe_device_put(fd);
Same as above: not necessary.
> + xe_eudebug_client_close_driver(c, fd);
> +}
> +
> +static bool intel_gen_has_lockstep_eus(int fd)
> +{
> + const uint32_t id = intel_get_drm_devid(fd);
> +
> + /*
> +	 * Lockstep (or, in some parlance, fused) EUs are pairs of EUs
> +	 * that work in sync, supposedly with the same clock and the same
> +	 * control flow. Thus for attentions, if the control flow hits a
> +	 * breakpoint, both are excepted into SIP. At this level the
> +	 * hardware has only one attention thread bit for the pair. PVC is
> +	 * the first platform without lockstepping.
> + */
> + return !(intel_graphics_ver(id) == IP_VER(12, 60) || intel_gen(id) >= 20);
> +}
> +
> +static int query_attention_bitmask_size(int fd, int gt)
> +{
> + const unsigned int threads = 8;
> + struct drm_xe_query_topology_mask *c_dss = NULL, *g_dss = NULL, *eu_per_dss = NULL;
> + struct drm_xe_query_topology_mask *topology;
> + struct drm_xe_device_query query = {
> + .extensions = 0,
> + .query = DRM_XE_DEVICE_QUERY_GT_TOPOLOGY,
> + .size = 0,
> + .data = 0,
> + };
> + int pos = 0, eus;
> + uint8_t *any_dss;
> +
> + igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> + igt_assert_neq(query.size, 0);
> +
> + topology = malloc(query.size);
> + igt_assert(topology);
> +
> + query.data = to_user_pointer(topology);
> + igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> +
> + while (query.size >= sizeof(struct drm_xe_query_topology_mask)) {
> + struct drm_xe_query_topology_mask *topo;
> + int sz;
> +
> + topo = (struct drm_xe_query_topology_mask *)((unsigned char *)topology + pos);
> + sz = sizeof(struct drm_xe_query_topology_mask) + topo->num_bytes;
> +
> + query.size -= sz;
> + pos += sz;
> +
> + if (topo->gt_id != gt)
> + continue;
> +
> + if (topo->type == DRM_XE_TOPO_DSS_GEOMETRY)
> + g_dss = topo;
> + else if (topo->type == DRM_XE_TOPO_DSS_COMPUTE)
> + c_dss = topo;
> + else if (topo->type == DRM_XE_TOPO_EU_PER_DSS ||
> + topo->type == DRM_XE_TOPO_SIMD16_EU_PER_DSS)
> + eu_per_dss = topo;
> + }
> +
> + igt_assert(g_dss && c_dss && eu_per_dss);
> + igt_assert_eq_u32(c_dss->num_bytes, g_dss->num_bytes);
> +
> + any_dss = malloc(c_dss->num_bytes);
Assert if NULL.
> +
> + for (int i = 0; i < c_dss->num_bytes; i++)
> + any_dss[i] = c_dss->mask[i] | g_dss->mask[i];
> +
> + eus = count_set_bits(any_dss, c_dss->num_bytes);
> + eus *= count_set_bits(eu_per_dss->mask, eu_per_dss->num_bytes);
> +
> + if (intel_gen_has_lockstep_eus(fd))
> + eus /= 2;
> +
> + free(any_dss);
> + free(topology);
> +
> + return eus * threads / 8;
> +}
> +
> +static struct drm_xe_eudebug_event_exec_queue *
> +match_attention_with_exec_queue(struct xe_eudebug_event_log *log,
> + struct drm_xe_eudebug_event_eu_attention *ea)
> +{
> + struct drm_xe_eudebug_event_exec_queue *ee;
> + struct drm_xe_eudebug_event *event = NULL, *current = NULL, *matching_destroy = NULL;
> + int lrc_idx;
> +
> + xe_eudebug_for_each_event(event, log) {
> + if (event->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE &&
> + event->flags == DRM_XE_EUDEBUG_EVENT_CREATE) {
> + ee = (struct drm_xe_eudebug_event_exec_queue *)event;
> +
> + if (ee->exec_queue_handle != ea->exec_queue_handle)
> + continue;
> +
> + if (ee->client_handle != ea->client_handle)
> + continue;
> +
> + for (lrc_idx = 0; lrc_idx < ee->width; lrc_idx++) {
> + if (ee->lrc_handle[lrc_idx] == ea->lrc_handle)
> + break;
> + }
> +
> + if (lrc_idx >= ee->width) {
> + igt_debug("No matching lrc handle within matching exec_queue!");
> + continue;
> + }
> +
> +			/* Event logs are sorted by seqno, so no later create can match either. */
> + if (ea->base.seqno < ee->base.seqno)
> + break;
> +
> +			/*
> +			 * Sanity check that the attention did not arrive
> +			 * for an already destroyed exec_queue.
> +			 */
> + current = event;
> + xe_eudebug_for_each_event(current, log) {
> + if (current->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE &&
> + current->flags == DRM_XE_EUDEBUG_EVENT_DESTROY) {
> + uint8_t offset = sizeof(struct drm_xe_eudebug_event);
> +
> + if (memcmp((uint8_t *)current + offset,
> + (uint8_t *)event + offset,
> + current->len - offset) == 0) {
> + matching_destroy = current;
> + }
> + }
> + }
> +
> + if (!matching_destroy || ea->base.seqno > matching_destroy->seqno)
> + continue;
> +
> + return ee;
> + }
> + }
> +
> + return NULL;
> +}
> +
> +static void online_session_check(struct xe_eudebug_session *s, int flags)
> +{
> + struct drm_xe_eudebug_event_eu_attention *ea = NULL;
> + struct drm_xe_eudebug_event *event = NULL;
> + struct online_debug_data *data = s->client->ptr;
> +	bool expect_exception = !(flags & DISABLE_DEBUG_MODE);
> + int sum = 0;
> + int bitmask_size;
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +
> + bitmask_size = query_attention_bitmask_size(s->debugger->master_fd, data->hwe.gt_id);
> +
> + xe_eudebug_for_each_event(event, s->debugger->log) {
> + if (event->type == DRM_XE_EUDEBUG_EVENT_EU_ATTENTION) {
> + ea = (struct drm_xe_eudebug_event_eu_attention *)event;
> +
> + igt_assert(event->flags == DRM_XE_EUDEBUG_EVENT_STATE_CHANGE);
> + igt_assert_eq(ea->bitmask_size, bitmask_size);
> + sum += count_set_bits(ea->bitmask, bitmask_size);
> + igt_assert(match_attention_with_exec_queue(s->debugger->log, ea));
> + }
> + }
> +
> + /*
> +	 * We can expect the attention bits to sum up to the thread count
> +	 * only if a breakpoint is set and we always resume all threads.
> + */
> + if (flags == SHADER_BREAKPOINT || flags == TRIGGER_UFENCE_SET_BREAKPOINT)
> + igt_assert_eq(sum, data->threads_count);
> +
> + if (expect_exception)
> + igt_assert(sum > 0);
> + else
> + igt_assert(sum == 0);
> +}
> +
> +static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> + xe_eudebug_ack_ufence(d->fd, ef);
> +}
> +
> +static void ufence_ack_set_bp_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *e)
> +{
> + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> + struct online_debug_data *data = d->ptr;
> +
> + set_breakpoint_once(d, data);
> +
> + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> + xe_eudebug_ack_ufence(d->fd, ef);
> +}
> +
> +/**
> + * SUBTEST: basic-breakpoint
> + * Description:
> + *	Check whether KMD sends attention events for a workload in debug
> + *	mode stopped on a breakpoint.
> + *
> + * SUBTEST: breakpoint-not-in-debug-mode
> + * Description:
> + *	Check whether KMD resets the GPU when it spots attention coming
> + *	from a workload not in debug mode.
> + *
> + * SUBTEST: stopped-thread
> + * Description:
> + *	Hits a breakpoint on a runalone workload and reads attention for a
> + *	fixed time.
> + *
> + * SUBTEST: resume-%s
> + * Description:
> + *	Resumes a workload stopped on a breakpoint with the
> + *	granularity of %arg[1].
> + *
> + *
> + * arg[1]:
> + *
> + * @one: one thread
> + * @dss: threads running on one subslice
> + */
> +static void test_basic_online(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + xe_eudebug_session_run(s);
> + online_session_check(s, s->flags);
> +
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +/**
> + * SUBTEST: set-breakpoint
> + * Description:
> + * Checks for attention after setting a dynamic breakpoint in the ufence event.
> + */
> +
> +static void test_set_breakpoint_online(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> + open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> + exec_queue_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_set_bp_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> +
> + xe_eudebug_session_run(s);
> + online_session_check(s, s->flags);
> +
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +/**
> + * SUBTEST: preempt-breakpoint
> + * Description:
> + *	Verify that the EU debugger disables the preemption timeout to
> + *	prevent a reset of a workload stopped on a breakpoint.
> + */
> +static void test_preemption(int fd, struct drm_xe_engine_class_instance *hwe)
> +{
> + int flags = SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED;
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> + struct xe_eudebug_client *other;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> + other = xe_eudebug_client_create(fd, run_online_client, SHADER_NOP, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> +
> + xe_eudebug_client_start(s->client);
> + sleep(1); /* make sure s->client starts first */
If the client signalled a token once it had started, this sleep wouldn't
be necessary. I mean: do token_signal/wait_for_client inside
xe_eudebug_client_start().
> + xe_eudebug_client_start(other);
> +
> + xe_eudebug_client_wait_done(s->client);
> + xe_eudebug_client_wait_done(other);
> +
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + xe_eudebug_session_destroy(s);
> + xe_eudebug_client_destroy(other);
> +
> + igt_assert_f(data->last_eu_control_seqno != 0,
> + "Workload with breakpoint has ended without resume!\n");
> +
> + online_debug_data_destroy(data);
> +}
> +
> +/**
> + * SUBTEST: reset-with-attention
> + * Description:
> + * Check whether GPU is usable after resetting with attention raised
> + * (stopped on breakpoint) by running the same workload again.
> + */
> +static void test_reset_with_attention_online(int fd, struct drm_xe_engine_class_instance *hwe,
> + int flags)
> +{
> + struct xe_eudebug_session *s1, *s2;
> + struct online_debug_data *data;
> +
> + data = online_debug_data_create(hwe);
> + s1 = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s1->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_reset_trigger);
> + xe_eudebug_debugger_add_trigger(s1->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + xe_eudebug_session_run(s1);
> + xe_eudebug_session_destroy(s1);
> +
> + s2 = xe_eudebug_session_create(fd, run_online_client, flags, data);
> + xe_eudebug_debugger_add_trigger(s2->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> + xe_eudebug_debugger_add_trigger(s2->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + xe_eudebug_session_run(s2);
> +
> + online_session_check(s2, s2->flags);
> +
> + xe_eudebug_session_destroy(s2);
> + online_debug_data_destroy(data);
> +}
> +
> +/**
> + * SUBTEST: interrupt-all
> + * Description:
> + *	Schedules an EU workload which should last a few seconds, then
> + *	interrupts all threads, checks whether the attention event came,
> + *	and resumes the stopped threads.
> + *
> + * SUBTEST: interrupt-all-set-breakpoint
> + * Description:
> + *	Schedules an EU workload which should last a few seconds, then
> + *	interrupts all threads; once the attention event comes, it sets a
> + *	breakpoint on the very next instruction and resumes the stopped
> + *	threads. It expects every thread to hit the breakpoint.
> + */
> +static void test_interrupt_all(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> + uint32_t val;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> + open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> + exec_queue_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + /* wait for workload to start */
> + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> + /* collect needed data from triggers */
> + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> + continue;
> +
> + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> + if (val != 0)
> + break;
> + }
> +
> + pthread_mutex_lock(&data->mutex);
> + igt_assert(data->client_handle != -1);
> + igt_assert(data->exec_queue_handle != -1);
> + eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle, data->lrc_handle);
> + pthread_mutex_unlock(&data->mutex);
> +
> + xe_eudebug_client_wait_done(s->client);
> +
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + xe_eudebug_event_log_print(s->debugger->log, true);
> + xe_eudebug_event_log_print(s->client->log, true);
> +
> + online_session_check(s, s->flags);
> +
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +static void reset_debugger_log(struct xe_eudebug_debugger *d)
> +{
> + unsigned int max_size;
> + char log_name[80];
> +
> + /* Don't pull the rug out from under an active debugger */
> + igt_assert(d->target_pid == 0);
> +
> + max_size = d->log->max_size;
> +	strncpy(log_name, d->log->name, sizeof(log_name) - 1);
> +	log_name[sizeof(log_name) - 1] = '\0';
> + xe_eudebug_event_log_destroy(d->log);
> + d->log = xe_eudebug_event_log_create(log_name, max_size);
> +}
> +
> +/**
> + * SUBTEST: interrupt-other-debuggable
> + * Description:
> + *	Schedules an EU workload with a never-ending loop in runalone mode
> + *	and, while it is not under debug, tries to interrupt all threads
> + *	using a different client attached to the debugger.
> + *
> + * SUBTEST: interrupt-other
> + * Description:
> + *	Schedules an EU workload with a never-ending loop and, while it is
> + *	not configured for debugging, tries to interrupt all threads using
> + *	the client attached to the debugger.
> + */
> +static void test_interrupt_other(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct online_debug_data *data;
> + struct online_debug_data *debugee_data;
> + struct xe_eudebug_session *s;
> + struct xe_eudebug_client *debugee;
> + int debugee_flags = SHADER_LOOP | DO_NOT_EXPECT_CANARIES;
> +	int val = 0;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN, open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> + exec_queue_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + /* wait for workload to start */
> + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> + continue;
> +
> + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> + if (val != 0)
> + break;
> + }
> + igt_assert_f(val != 0, "Workload execution is not yet started\n");
> +
> + xe_eudebug_debugger_detach(s->debugger);
> + reset_debugger_log(s->debugger);
> +
> + debugee_data = online_debug_data_create(hwe);
> + s->debugger->ptr = debugee_data;
> + debugee = xe_eudebug_client_create(fd, run_online_client, debugee_flags, debugee_data);
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, debugee), 0);
> + xe_eudebug_client_start(debugee);
> +
> +	igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> +		if (READ_ONCE(debugee_data->vm_fd) != -1 &&
> +		    READ_ONCE(debugee_data->target_size) != 0)
> +			break;
> +	}
> +
> + pthread_mutex_lock(&debugee_data->mutex);
> + igt_assert(debugee_data->client_handle != -1);
> + igt_assert(debugee_data->exec_queue_handle != -1);
> +
> + /*
> + * Interrupting the other client should return invalid state
> + * as it is running in runalone mode
> + */
> + igt_assert_eq(__eu_ctl(s->debugger->fd, debugee_data->client_handle,
> + debugee_data->exec_queue_handle, debugee_data->lrc_handle, NULL, 0,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL), -EINVAL);
> + pthread_mutex_unlock(&debugee_data->mutex);
> +
> + xe_force_gt_reset_async(s->debugger->master_fd, debugee_data->hwe.gt_id);
> +
> + xe_eudebug_client_wait_done(debugee);
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + xe_eudebug_event_log_print(s->debugger->log, true);
> + xe_eudebug_event_log_print(debugee->log, true);
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> +
> + xe_eudebug_client_destroy(debugee);
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> + online_debug_data_destroy(debugee_data);
> +}
> +
> +/**
> + * SUBTEST: tdctl-parameters
> + * Description:
> + *	Schedules an EU workload which should last a few seconds, then
> + *	checks negative scenarios of EU_THREADS ioctl usage, interrupts all
> + *	threads, checks whether the attention event came, and resumes the
> + *	stopped threads.
> + */
> +static void test_tdctl_parameters(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> + uint32_t val;
> + uint32_t random_command;
> + uint32_t bitmask_size = query_attention_bitmask_size(fd, hwe->gt_id);
> + uint8_t *attention_bitmask = malloc(bitmask_size * sizeof(uint8_t));
> +
> + igt_assert(attention_bitmask);
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> + open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> + exec_queue_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + /* wait for workload to start */
> + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> + /* collect needed data from triggers */
> + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> + continue;
> +
> + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> + if (val != 0)
> + break;
> + }
> +
> + pthread_mutex_lock(&data->mutex);
> + igt_assert(data->client_handle != -1);
> + igt_assert(data->exec_queue_handle != -1);
> + igt_assert(data->lrc_handle != -1);
> +
> + /* fail on invalid lrc_handle */
> + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle, data->lrc_handle + 1,
> + attention_bitmask, &bitmask_size,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
> +
> + /* fail on invalid exec_queue_handle */
> + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle + 1, data->lrc_handle,
> + attention_bitmask, &bitmask_size,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
> +
> + /* fail on invalid client */
> + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle + 1,
> + data->exec_queue_handle, data->lrc_handle,
> + attention_bitmask, &bitmask_size,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
> +
> + /*
> + * bitmask size must be aligned to sizeof(u32) for all commands
> + * and be zero for interrupt all
> + */
> + bitmask_size = sizeof(uint32_t) - 1;
> + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle, data->lrc_handle,
> + attention_bitmask, &bitmask_size,
> + DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED, NULL) == -EINVAL);
> + bitmask_size = 0;
> +
> + /* fail on invalid command */
> + random_command = random() | (DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME + 1);
> + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle, data->lrc_handle,
> + attention_bitmask, &bitmask_size, random_command, NULL) == -EINVAL);
> +
> + free(attention_bitmask);
> +
> + eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle, data->lrc_handle);
> + pthread_mutex_unlock(&data->mutex);
> +
> + xe_eudebug_client_wait_done(s->client);
> +
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> +
> + xe_eudebug_event_log_print(s->debugger->log, true);
> + xe_eudebug_event_log_print(s->client->log, true);
> +
> + online_session_check(s, s->flags);
> +
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +static void eu_attention_debugger_detach_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *event)
> +{
> + struct online_debug_data *data = d->ptr;
> + uint64_t c_pid;
> + int ret;
> +
> + c_pid = d->target_pid;
> +
> + /* Reset VM data so the re-triggered VM open handler works properly */
> + data->vm_fd = -1;
> +
> + xe_eudebug_debugger_detach(d);
> +
> + /* Let the KMD scan function notice unhandled EU attention */
> + if (!(d->flags & SHADER_N_NOOP_BREAKPOINT))
> + sleep(1);
> +
> + /*
> +	 * A new session created by the EU debugger on reconnect restarts
> +	 * the seqno, causing issues with log sorting. To avoid that,
> +	 * create a new event log.
> + */
> + reset_debugger_log(d);
> +
> + ret = xe_eudebug_connect(d->master_fd, c_pid, 0);
> + igt_assert(ret >= 0);
> + d->fd = ret;
> + d->target_pid = c_pid;
> +
> + /* Let the discovery worker discover resources */
> + sleep(2);
> +
> + if (!(d->flags & SHADER_N_NOOP_BREAKPOINT))
> + xe_eudebug_debugger_signal_stage(d, DEBUGGER_REATTACHED);
> +}
> +
> +/**
> + * SUBTEST: interrupt-reconnect
> + * Description:
> + *	Schedules an EU workload which should last a few seconds,
> + *	interrupts all threads and detaches the debugger when attention is
> + *	raised. The test checks whether KMD resets the workload when no
> + *	debugger is attached and replays the events on discovery.
> + */
> +static void test_interrupt_reconnect(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct drm_xe_eudebug_event *e = NULL;
> + struct online_debug_data *data;
> + struct xe_eudebug_session *s;
> + uint32_t val;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> + open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> + exec_queue_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debugger_detach_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> + xe_eudebug_debugger_start_worker(s->debugger);
> + xe_eudebug_client_start(s->client);
> +
> + /* wait for workload to start */
> + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> + /* collect needed data from triggers */
> + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> + continue;
> +
> + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> + if (val != 0)
> + break;
> + }
> +
> + pthread_mutex_lock(&data->mutex);
> + igt_assert(data->client_handle != -1);
> + igt_assert(data->exec_queue_handle != -1);
> + eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
> + data->exec_queue_handle, data->lrc_handle);
> + pthread_mutex_unlock(&data->mutex);
> +
> + xe_eudebug_client_wait_done(s->client);
> +
> + xe_eudebug_debugger_stop_worker(s->debugger, 1);
I wondered where the log gets cleared, and I've noticed that
eu_attention_debugger_detach_trigger is responsible for it.
> +
> + xe_eudebug_event_log_print(s->debugger->log, true);
> + xe_eudebug_event_log_print(s->client->log, true);
> +
> + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
That's my question here: if the debugger log is cleared and then filled
again on reconnect, will the vm-bind-ufence events match?
> +
> + /* We expect workload reset, so no attention should be raised */
> + xe_eudebug_for_each_event(e, s->debugger->log)
> + igt_assert(e->type != DRM_XE_EUDEBUG_EVENT_EU_ATTENTION);
> +
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +/**
> + * SUBTEST: single-step
> + * Description:
> + * Schedules EU workload with 16 nops after breakpoint, then single-steps
> + * through the shader, advances all threads each step, checking if all
> + * threads advanced every step.
> + *
> + * SUBTEST: single-step-one
> + * Description:
> + * Schedules EU workload with 16 nops after breakpoint, then single-steps
> + * through the shader, advances one thread each step, checking if one
> + * thread advanced every step. Due to the time constraint, only first two
> + * shader instructions after breakpoint are validated.
> + */
> +static void test_single_step(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> + open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_single_step_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + xe_eudebug_session_run(s);
> + online_session_check(s, s->flags);
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +static void eu_attention_debugger_ndetach_trigger(struct xe_eudebug_debugger *d,
> + struct drm_xe_eudebug_event *event)
> +{
> + struct online_debug_data *data = d->ptr;
> + static int debugger_detach_count;
> +
> + if (debugger_detach_count < (SHADER_LOOP_N - 1)) {
> + /* Make sure the resume command was issued before detaching the debugger */
> + if (data->last_eu_control_seqno > event->seqno)
> + return;
> + eu_attention_debugger_detach_trigger(d, event);
> + debugger_detach_count++;
> + } else {
> + igt_debug("Reached Nth breakpoint hence preventing the debugger detach\n");
> + }
> +}
> +
> +/**
> + * SUBTEST: debugger-reopen
> + * Description:
> + *	Check whether the debugger is able to reopen the connection and
> + *	capture the events of an already running client.
> + */
> +static void test_debugger_reopen(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> +
> + data = online_debug_data_create(hwe);
> +
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debugger_ndetach_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + xe_eudebug_session_run(s);
> +
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +/**
> + * SUBTEST: writes-caching-%s
> + * Description:
> + *	Write incrementing values to a 2-page-long target surface, poisoning
> + *	the data one breakpoint before each write instruction and restoring
> + *	it when the poisoned instruction's breakpoint is hit. Expect never
> + *	to see poison values in the target surface.
> + *
> + *
> + * arg[1]:
> + *
> + * @sram: Use page size of SRAM
> + * @vram: Use page size of VRAM
> + */
> +static void test_caching(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> +{
> + struct xe_eudebug_session *s;
> + struct online_debug_data *data;
> +
> + if (flags & SHADER_CACHING_VRAM)
> + igt_skip_on_f(!xe_has_vram(fd), "Device does not have VRAM.\n");
> +
> + data = online_debug_data_create(hwe);
> + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> +
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> + open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_resume_caching_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> + create_metadata_trigger);
> + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + xe_eudebug_session_run(s);
> + online_session_check(s, s->flags);
> + xe_eudebug_session_destroy(s);
> + online_debug_data_destroy(data);
> +}
> +
> +static int wait_for_exception(struct online_debug_data *data, int timeout)
> +{
> + int ret = -ETIMEDOUT;
> +
> + igt_for_milliseconds(timeout) {
> + pthread_mutex_lock(&data->mutex);
> + if ((data->exception_arrived.tv_sec |
> + data->exception_arrived.tv_nsec) != 0)
> + ret = 0;
> + pthread_mutex_unlock(&data->mutex);
> +
> + if (!ret)
> + break;
> + usleep(1000);
> + }
> +
> + return ret;
> +}
> +
> +#define is_compute_on_gt(__e, __gt) (((__e)->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
> + (__e)->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) && \
> + (__e)->gt_id == (__gt))
> +
> +struct xe_engine_list_entry {
> + struct igt_list_head link;
> + struct drm_xe_engine_class_instance *hwe;
> +};
> +
> +#define MAX_TILES 2
> +static int find_suitable_engines(struct drm_xe_engine_class_instance *hwes[GEM_MAX_ENGINES],
> + int fd, bool many_tiles)
> +{
> + struct xe_device *xe_dev;
> + struct drm_xe_engine_class_instance *e;
> + struct xe_engine_list_entry *en, *tmp;
> + struct igt_list_head compute_engines[MAX_TILES];
> + int gt_id;
> + int tile_id, i, engine_count = 0, tile_count = 0;
> +
> + xe_dev = xe_device_get(fd);
> +
> + for (i = 0; i < MAX_TILES; i++)
> + IGT_INIT_LIST_HEAD(&compute_engines[i]);
> +
> + xe_for_each_gt(fd, gt_id) {
> + xe_for_each_engine(fd, e) {
> + if (is_compute_on_gt(e, gt_id)) {
> + tile_id = xe_dev->gt_list->gt_list[gt_id].tile_id;
> +
> + en = malloc(sizeof(struct xe_engine_list_entry));
> + en->hwe = e;
> +
> + igt_list_add_tail(&en->link, &compute_engines[tile_id]);
> + }
> + }
> + }
> +
> + for (i = 0; i < MAX_TILES; i++) {
> + if (igt_list_empty(&compute_engines[i]))
> + continue;
> +
> + if (many_tiles) {
> + en = igt_list_first_entry(&compute_engines[i], en, link);
> + hwes[engine_count++] = en->hwe;
> + tile_count++;
> + } else {
> + if (igt_list_length(&compute_engines[i]) > 1) {
> + igt_list_for_each_entry(en, &compute_engines[i], link)
> + hwes[engine_count++] = en->hwe;
> + break;
> + }
> + }
> + }
> +
> + for (i = 0; i < MAX_TILES; i++) {
> + igt_list_for_each_entry_safe(en, tmp, &compute_engines[i], link) {
> + igt_list_del(&en->link);
> + free(en);
> + }
> + }
> +
> + if (many_tiles)
> + igt_require_f(tile_count > 1, "Multi-tile scenario requires more tiles\n");
> +
> + return engine_count;
> +}
> +
> +/**
> + * SUBTEST: breakpoint-many-sessions-single-tile
> + * Description:
> + * Schedules EU workload with preinstalled breakpoint on every compute engine
> + * available on the tile. Checks if the contexts hit breakpoint in sequence
> + * and resumes them.
> + *
> + * SUBTEST: breakpoint-many-sessions-tiles
> + * Description:
> + * Schedules EU workload with preinstalled breakpoint on selected compute
> + * engines, with one per tile. Checks if each context hit breakpoint and
> + * resumes them.
> + */
> +static void test_many_sessions_on_tiles(int fd, bool multi_tile)
> +{
> + int n = 0, flags = SHADER_BREAKPOINT | SHADER_MIN_THREADS;
> + struct xe_eudebug_session *s[GEM_MAX_ENGINES] = {};
> + struct online_debug_data *data[GEM_MAX_ENGINES] = {};
> + struct drm_xe_engine_class_instance *hwe[GEM_MAX_ENGINES] = {};
GEM_MAX_ENGINES?
> + struct drm_xe_eudebug_event_eu_attention *eus;
> + uint64_t current_t, next_t, diff;
> + int i;
> +
> + n = find_suitable_engines(hwe, fd, multi_tile);
> +
> + igt_require_f(n > 1, "Test requires at least two parallel compute engines!\n");
> +
> + for (i = 0; i < n; i++) {
> + data[i] = online_debug_data_create(hwe[i]);
> + s[i] = xe_eudebug_session_create(fd, run_online_client, flags, data[i]);
> +
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + eu_attention_debug_trigger);
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> + save_first_exception_trigger);
> + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> + ufence_ack_trigger);
> +
> + igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
> +
> + xe_eudebug_debugger_start_worker(s[i]->debugger);
> + xe_eudebug_client_start(s[i]->client);
> + }
> +
> + for (i = 0; i < n; i++) {
> + /* XXX: Sometimes racy, expects clients to execute in sequence */
> + igt_assert(!wait_for_exception(data[i], STARTUP_TIMEOUT_MS));
> +
> + eus = (struct drm_xe_eudebug_event_eu_attention *)data[i]->exception_event;
> +
> + /* Delay all but the last workload to check serialization */
> + if (i < n - 1)
> + usleep(WORKLOAD_DELAY_US);
> +
> + eu_ctl_resume(s[i]->debugger->master_fd, s[i]->debugger->fd,
> + eus->client_handle, eus->exec_queue_handle,
> + eus->lrc_handle, eus->bitmask, eus->bitmask_size);
> + free(eus);
> + }
> +
> + for (i = 0; i < n - 1; i++) {
> + /* Convert timestamps to microseconds */
> + current_t = data[i]->exception_arrived.tv_nsec / 1000;
> + next_t = data[i + 1]->exception_arrived.tv_nsec / 1000;
> + diff = current_t < next_t ? next_t - current_t : current_t - next_t;
> +
> + if (multi_tile)
> + igt_assert_f(diff < WORKLOAD_DELAY_US,
> + "Expected to execute workloads concurrently. Actual delay: %lu us\n",
> + diff);
> + else
> + igt_assert_f(diff >= WORKLOAD_DELAY_US,
> + "Expected a serialization of workloads. Actual delay: %lu us\n",
> + diff);
> + }
> +
> + for (i = 0; i < n; i++) {
> + xe_eudebug_client_wait_done(s[i]->client);
> + xe_eudebug_debugger_stop_worker(s[i]->debugger, 1);
> +
> + xe_eudebug_event_log_print(s[i]->debugger->log, true);
> + online_session_check(s[i], flags);
> +
> + xe_eudebug_session_destroy(s[i]);
> + online_debug_data_destroy(data[i]);
> + }
> +}
> +
> +static struct drm_xe_engine_class_instance *pick_compute(int fd, int gt)
> +{
> + struct drm_xe_engine_class_instance *hwe;
> + int count = 0;
> +
> + xe_for_each_engine(fd, hwe)
> + if (is_compute_on_gt(hwe, gt))
> + count++;
> +
> + xe_for_each_engine(fd, hwe)
> + if (is_compute_on_gt(hwe, gt) && rand() % count-- == 0)
> + return hwe;
> +
> + return NULL;
> +}
> +
> +#define test_gt_render_or_compute(t, i915, __hwe) \
> + igt_subtest_with_dynamic(t) \
> + for (int gt = 0; (__hwe = pick_compute(i915, gt)); gt++) \
i915?
I haven't spotted any other issues; apart from the bit operations,
it generally looks correct.
--
Zbigniew
> + igt_dynamic_f("%s%d", xe_engine_class_string(__hwe->engine_class), \
> + __hwe->engine_instance)
> +
> +igt_main
> +{
> + struct drm_xe_engine_class_instance *hwe;
> + bool was_enabled;
> + int fd;
> +
> + igt_fixture {
> + fd = drm_open_driver(DRIVER_XE);
> + intel_allocator_multiprocess_start();
> + igt_srandom();
> + was_enabled = xe_eudebug_enable(fd, true);
> + }
> +
> + test_gt_render_or_compute("basic-breakpoint", fd, hwe)
> + test_basic_online(fd, hwe, SHADER_BREAKPOINT);
> +
> + test_gt_render_or_compute("preempt-breakpoint", fd, hwe)
> + test_preemption(fd, hwe);
> +
> + test_gt_render_or_compute("set-breakpoint", fd, hwe)
> + test_set_breakpoint_online(fd, hwe, SHADER_NOP | TRIGGER_UFENCE_SET_BREAKPOINT);
> +
> + test_gt_render_or_compute("breakpoint-not-in-debug-mode", fd, hwe)
> + test_basic_online(fd, hwe, SHADER_BREAKPOINT | DISABLE_DEBUG_MODE);
> +
> + test_gt_render_or_compute("stopped-thread", fd, hwe)
> + test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED);
> +
> + test_gt_render_or_compute("resume-one", fd, hwe)
> + test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_ONE);
> +
> + test_gt_render_or_compute("resume-dss", fd, hwe)
> + test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_DSS);
> +
> + test_gt_render_or_compute("interrupt-all", fd, hwe)
> + test_interrupt_all(fd, hwe, SHADER_LOOP);
> +
> + test_gt_render_or_compute("interrupt-other-debuggable", fd, hwe)
> + test_interrupt_other(fd, hwe, SHADER_LOOP);
> +
> + test_gt_render_or_compute("interrupt-other", fd, hwe)
> + test_interrupt_other(fd, hwe, SHADER_LOOP | DISABLE_DEBUG_MODE);
> +
> + test_gt_render_or_compute("interrupt-all-set-breakpoint", fd, hwe)
> + test_interrupt_all(fd, hwe, SHADER_LOOP | TRIGGER_RESUME_SET_BP);
> +
> + test_gt_render_or_compute("tdctl-parameters", fd, hwe)
> + test_tdctl_parameters(fd, hwe, SHADER_LOOP);
> +
> + test_gt_render_or_compute("reset-with-attention", fd, hwe)
> + test_reset_with_attention_online(fd, hwe, SHADER_BREAKPOINT);
> +
> + test_gt_render_or_compute("interrupt-reconnect", fd, hwe)
> + test_interrupt_reconnect(fd, hwe, SHADER_LOOP | TRIGGER_RECONNECT);
> +
> + test_gt_render_or_compute("single-step", fd, hwe)
> + test_single_step(fd, hwe, SHADER_SINGLE_STEP | SIP_SINGLE_STEP |
> + TRIGGER_RESUME_PARALLEL_WALK);
> +
> + test_gt_render_or_compute("single-step-one", fd, hwe)
> + test_single_step(fd, hwe, SHADER_SINGLE_STEP | SIP_SINGLE_STEP |
> + TRIGGER_RESUME_SINGLE_WALK);
> +
> + test_gt_render_or_compute("debugger-reopen", fd, hwe)
> + test_debugger_reopen(fd, hwe, SHADER_N_NOOP_BREAKPOINT);
> +
> + test_gt_render_or_compute("writes-caching-sram", fd, hwe)
> + test_caching(fd, hwe, SHADER_CACHING_SRAM);
> +
> + test_gt_render_or_compute("writes-caching-vram", fd, hwe)
> + test_caching(fd, hwe, SHADER_CACHING_VRAM);
> +
> + igt_subtest("breakpoint-many-sessions-single-tile")
> + test_many_sessions_on_tiles(fd, false);
> +
> + igt_subtest("breakpoint-many-sessions-tiles")
> + test_many_sessions_on_tiles(fd, true);
> +
> + igt_fixture {
> + xe_eudebug_enable(fd, was_enabled);
> +
> + intel_allocator_multiprocess_stop();
> + drm_close_driver(fd);
> + }
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 43e8516f4..e5d8852f3 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -321,6 +321,7 @@ intel_xe_progs = [
> intel_xe_eudebug_progs = [
> 'xe_eudebug',
> 'xe_exec_sip_eudebug',
> + 'xe_eudebug_online',
> ]
>
> if build_xe_eudebug
> --
> 2.34.1
>
* Re: [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU
2024-09-13 11:39 ` Zbigniew Kempczyński
@ 2024-09-17 19:34 ` Grzegorzek, Dominik
2024-09-18 5:08 ` Zbigniew Kempczyński
2024-09-18 5:21 ` Zbigniew Kempczyński
0 siblings, 2 replies; 50+ messages in thread
From: Grzegorzek, Dominik @ 2024-09-17 19:34 UTC (permalink / raw)
To: Kempczynski, Zbigniew, Manszewski, Christoph
Cc: Patelczyk, Maciej, Hajda, Andrzej, Kuoppala, Mika, Sikora, Pawel,
Piatkowski, Dominik Karol, Mun, Gwan-gyeong,
igt-dev@lists.freedesktop.org, kamil.konieczny@linux.intel.com,
Kolanupaka Naveena
On Fri, 2024-09-13 at 13:39 +0200, Zbigniew Kempczyński wrote:
> On Thu, Sep 05, 2024 at 11:28:11AM +0200, Christoph Manszewski wrote:
> > From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> >
> > For typical debugging under gdb one can specify two main use cases:
> > accessing and manipulating resources created by the application, and
> > manipulating thread execution (interrupting and setting breakpoints).
> >
> > This test adds coverage for the latter by checking that:
> > - EU workloads that hit an instruction with the breakpoint bit set will
> > halt execution and the debugger will report this via attention events,
> > - the debugger is able to interrupt workload execution by issuing a
> > 'interrupt_all' ioctl call,
> > - the debugger is able to resume selected workloads that are stopped.
> >
> > Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> > Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
> > Signed-off-by: Pawel Sikora <pawel.sikora@intel.com>
> > Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
> > Signed-off-by: Kolanupaka Naveena <kolanupaka.naveena@intel.com>
> > ---
> > tests/intel/xe_eudebug_online.c | 2254 +++++++++++++++++++++++++++++++
> > tests/meson.build | 1 +
> > 2 files changed, 2255 insertions(+)
> > create mode 100644 tests/intel/xe_eudebug_online.c
> >
> > diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/xe_eudebug_online.c
> > new file mode 100644
> > index 000000000..20f8e3601
> > --- /dev/null
> > +++ b/tests/intel/xe_eudebug_online.c
> > @@ -0,0 +1,2254 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2023 Intel Corporation
> > + */
> > +
> > +/**
> > + * TEST: Tests for eudebug online functionality
> > + * Category: Core
> > + * Mega feature: EUdebug
> > + * Sub-category: EUdebug tests
> > + * Functionality: eu kernel debug
> > + * Test category: functionality test
> > + */
> > +
> > +#include "xe/xe_eudebug.h"
> > +#include "xe/xe_ioctl.h"
> > +#include "xe/xe_query.h"
> > +#include "igt.h"
> > +#include "intel_pat.h"
> > +#include "intel_mocs.h"
> > +#include "gpgpu_shader.h"
> > +
> > +#define SHADER_NOP (0 << 0)
> > +#define SHADER_BREAKPOINT (1 << 0)
> > +#define SHADER_LOOP (1 << 1)
> > +#define SHADER_SINGLE_STEP (1 << 2)
> > +#define SIP_SINGLE_STEP (1 << 3)
> > +#define DISABLE_DEBUG_MODE (1 << 4)
> > +#define SHADER_N_NOOP_BREAKPOINT (1 << 5)
> > +#define SHADER_CACHING_SRAM (1 << 6)
> > +#define SHADER_CACHING_VRAM (1 << 7)
> > +#define SHADER_MIN_THREADS (1 << 8)
> > +#define DO_NOT_EXPECT_CANARIES (1 << 9)
> > +#define TRIGGER_UFENCE_SET_BREAKPOINT (1 << 24)
> > +#define TRIGGER_RESUME_SINGLE_WALK (1 << 25)
> > +#define TRIGGER_RESUME_PARALLEL_WALK (1 << 26)
> > +#define TRIGGER_RECONNECT (1 << 27)
> > +#define TRIGGER_RESUME_SET_BP (1 << 28)
> > +#define TRIGGER_RESUME_DELAYED (1 << 29)
> > +#define TRIGGER_RESUME_DSS (1 << 30)
> > +#define TRIGGER_RESUME_ONE (1 << 31)
> > +
> > +#define DEBUGGER_REATTACHED 1
> > +
> > +#define SHADER_LOOP_N 3
> > +#define SINGLE_STEP_COUNT 16
> > +#define STEERING_SINGLE_STEP 0
> > +#define STEERING_CONTINUE 0x00c0ffee
> > +#define STEERING_END_LOOP 0xdeadca11
> > +
> > +#define CACHING_INIT_VALUE 0xcafe0000
> > +#define CACHING_POISON_VALUE 0xcafedead
> > +#define CACHING_VALUE(n) (CACHING_INIT_VALUE + (n))
> > +
> > +#define SHADER_CANARY 0x01010101
> > +
> > +#define WALKER_X_DIM 4
> > +#define WALKER_ALIGNMENT 16
> > +#define SIMD_SIZE 16
> > +
> > +#define STARTUP_TIMEOUT_MS 3000
> > +#define WORKLOAD_DELAY_US (5000 * 1000)
> > +
> > +#define PAGE_SIZE 4096
> > +
> > +struct dim_t {
> > + uint32_t x;
> > + uint32_t y;
> > + uint32_t alignment;
> > +};
> > +
> > +static struct dim_t walker_dimensions(int threads)
> > +{
> > + uint32_t x_dim = min_t(x_dim, threads, WALKER_X_DIM);
> > + struct dim_t ret = {
> > + .x = x_dim,
> > + .y = threads / x_dim,
> > + .alignment = WALKER_ALIGNMENT
> > + };
> > +
> > + return ret;
> > +}
> > +
> > +static struct dim_t surface_dimensions(int threads)
> > +{
> > + struct dim_t ret = walker_dimensions(threads);
> > +
> > + ret.y = max_t(ret.y, threads / ret.x, 4);
> > + ret.x *= SIMD_SIZE;
> > + ret.alignment *= SIMD_SIZE;
> > +
> > + return ret;
> > +}
> > +
> > +static uint32_t steering_offset(int threads)
> > +{
> > + struct dim_t w = walker_dimensions(threads);
> > +
> > + return ALIGN(w.x, w.alignment) * w.y * 4;
> > +}
> > +
> > +static struct intel_buf *create_uc_buf(int fd, int width, int height)
> > +{
> > + struct intel_buf *buf;
> > +
> > + buf = intel_buf_create_full(buf_ops_create(fd), 0, width / 4, height,
> > + 32, 0, I915_TILING_NONE, 0, 0, 0,
> > + vram_if_possible(fd, 0),
> > + DEFAULT_PAT_INDEX, DEFAULT_MOCS_INDEX);
> > +
> > + return buf;
> > +}
> > +
> > +static int get_number_of_threads(uint64_t flags)
> > +{
> > + if (flags & SHADER_MIN_THREADS)
> > + return 16;
> > +
> > + if (flags & (TRIGGER_RESUME_ONE | TRIGGER_RESUME_SINGLE_WALK |
> > + TRIGGER_RESUME_PARALLEL_WALK | SHADER_CACHING_SRAM | SHADER_CACHING_VRAM))
> > + return 32;
> > +
> > + return 512;
> > +}
> > +
> > +static int caching_get_instruction_count(int fd, uint32_t s_dim__x, int flags)
> > +{
> > + uint64_t memory;
> > +
> > + igt_assert((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM));
> > +
> > + if (flags & SHADER_CACHING_SRAM)
> > + memory = system_memory(fd);
> > + else
> > + memory = vram_memory(fd, 0);
> > +
> > + /* each instruction writes to given y offset */
> > + return (2 * xe_min_page_size(fd, memory)) / s_dim__x;
> > +}
> > +
> > +static struct gpgpu_shader *get_shader(int fd, const unsigned int flags)
> > +{
> > + struct dim_t w_dim = walker_dimensions(get_number_of_threads(flags));
> > + struct dim_t s_dim = surface_dimensions(get_number_of_threads(flags));
> > + static struct gpgpu_shader *shader;
> > +
> > + shader = gpgpu_shader_create(fd);
> > +
> > + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> > + if (flags & SHADER_BREAKPOINT) {
> > + gpgpu_shader__nop(shader);
> > + gpgpu_shader__breakpoint(shader);
> > + } else if (flags & SHADER_LOOP) {
> > + gpgpu_shader__label(shader, 0);
> > + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> > + gpgpu_shader__jump_neq(shader, 0, w_dim.y, STEERING_END_LOOP);
> > + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> > + } else if (flags & SHADER_SINGLE_STEP) {
> > + gpgpu_shader__nop(shader);
> > + gpgpu_shader__breakpoint(shader);
> > + for (int i = 0; i < SINGLE_STEP_COUNT; i++)
> > + gpgpu_shader__nop(shader);
> > + } else if (flags & SHADER_N_NOOP_BREAKPOINT) {
> > + for (int i = 0; i < SHADER_LOOP_N; i++) {
> > + gpgpu_shader__nop(shader);
> > + gpgpu_shader__breakpoint(shader);
> > + }
> > + } else if ((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM)) {
> > + gpgpu_shader__nop(shader);
> > + gpgpu_shader__breakpoint(shader);
> > + for (int i = 0; i < caching_get_instruction_count(fd, s_dim.x, flags); i++)
> > + gpgpu_shader__common_target_write_u32(shader, s_dim.y + i, CACHING_VALUE(i));
> > + gpgpu_shader__nop(shader);
> > + gpgpu_shader__breakpoint(shader);
> > + }
> > +
> > + gpgpu_shader__eot(shader);
>
> Add blank line.
>
> > + return shader;
> > +}
> > +
> > +static struct gpgpu_shader *get_sip(int fd, const unsigned int flags)
> > +{
> > + struct dim_t w_dim = walker_dimensions(get_number_of_threads(flags));
> > + static struct gpgpu_shader *sip;
> > +
> > + sip = gpgpu_shader_create(fd);
> > + gpgpu_shader__write_aip(sip, 0);
> > +
> > + gpgpu_shader__wait(sip);
> > + if (flags & SIP_SINGLE_STEP)
> > + gpgpu_shader__end_system_routine_step_if_eq(sip, w_dim.y, 0);
> > + else
> > + gpgpu_shader__end_system_routine(sip, true);
>
> Same.
>
> > + return sip;
> > +}
> > +
> > +static int count_set_bits(void *ptr, size_t size)
> > +{
> > + uint8_t *p = ptr;
> > + int count = 0;
> > + int i, j;
> > +
>
> hweight()?
>
Are you proposing to change the name, or to implement it without the second loop, like below?
static int count_set_bits(void *ptr, size_t size)
{
	uint32_t *p = ptr;
	int count = 0;
	int i;

	igt_assert(size % 4 == 0);

	for (i = 0; i < size / 4; i++)
		count += igt_hweight(p[i]);

	return count;
}
> > + for (i = 0; i < size; i++)
> > + for (j = 0; j < 8; j++)
> > + count += !!(p[i] & (1 << j));
> > +
> > + return count;
> > +}
> > +
> > +static int count_canaries_eq(uint32_t *ptr, struct dim_t w_dim, uint32_t value)
> > +{
> > + int count = 0;
> > + int x, y;
> > +
> > + for (x = 0; x < w_dim.x; x++)
> > + for (y = 0; y < w_dim.y; y++)
> > + if (READ_ONCE(ptr[x + ALIGN(w_dim.x, w_dim.alignment) * y]) == value)
> > + count++;
> > +
> > + return count;
> > +}
> > +
> > +static int count_canaries_neq(uint32_t *ptr, struct dim_t w_dim, uint32_t value)
> > +{
> > + return w_dim.x * w_dim.y - count_canaries_eq(ptr, w_dim, value);
> > +}
> > +
> > +static const char *td_ctl_cmd_to_str(uint32_t cmd)
> > +{
> > + switch (cmd) {
> > + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL:
> > + return "interrupt all";
> > + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED:
> > + return "stopped";
> > + case DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME:
> > + return "resume";
> > + default:
> > + return "unknown command";
> > + }
> > +}
> > +
> > +static int __eu_ctl(int debugfd, uint64_t client,
> > + uint64_t exec_queue, uint64_t lrc,
> > + uint8_t *bitmask, uint32_t *bitmask_size,
> > + uint32_t cmd, uint64_t *seqno)
> > +{
> > + struct drm_xe_eudebug_eu_control control = {
> > + .client_handle = lower_32_bits(client),
> > + .exec_queue_handle = exec_queue,
> > + .lrc_handle = lrc,
> > + .cmd = cmd,
> > + .bitmask_ptr = to_user_pointer(bitmask),
> > + };
> > + int ret;
> > +
> > + if (bitmask_size)
> > + control.bitmask_size = *bitmask_size;
> > +
> > + ret = igt_ioctl(debugfd, DRM_XE_EUDEBUG_IOCTL_EU_CONTROL, &control);
> > +
> > + if (ret < 0)
> > + return -errno;
> > +
> > + igt_debug("EU CONTROL[%llu]: %s\n", control.seqno, td_ctl_cmd_to_str(cmd));
> > +
> > + if (bitmask_size)
> > + *bitmask_size = control.bitmask_size;
> > +
> > + if (seqno)
> > + *seqno = control.seqno;
> > +
> > + return 0;
> > +}
> > +
> > +static uint64_t eu_ctl(int debugfd, uint64_t client,
> > + uint64_t exec_queue, uint64_t lrc,
> > + uint8_t *bitmask, uint32_t *bitmask_size, uint32_t cmd)
> > +{
> > + uint64_t seqno;
> > +
> > + igt_assert_eq(__eu_ctl(debugfd, client, exec_queue, lrc, bitmask,
> > + bitmask_size, cmd, &seqno), 0);
> > +
> > + return seqno;
> > +}
> > +
> > +static bool intel_gen_needs_resume_wa(int fd)
> > +{
> > + const uint32_t id = intel_get_drm_devid(fd);
> > +
> > + return intel_gen(id) == 12 && intel_graphics_ver(id) < IP_VER(12, 55);
> > +}
> > +
> > +static uint64_t eu_ctl_resume(int fd, int debugfd, uint64_t client,
> > + uint64_t exec_queue, uint64_t lrc,
> > + uint8_t *bitmask, uint32_t bitmask_size)
> > +{
> > + int i;
> > +
> > + /* Wa_14011332042 */
> > + if (intel_gen_needs_resume_wa(fd)) {
> > + uint32_t *att_reg_half = (uint32_t *)bitmask;
> > +
> > + for (i = 0; i < bitmask_size / sizeof(uint32_t); i += 2) {
> > + att_reg_half[i] |= att_reg_half[i + 1];
> > + att_reg_half[i + 1] |= att_reg_half[i];
> > + }
> > + }
> > +
> > + return eu_ctl(debugfd, client, exec_queue, lrc, bitmask, &bitmask_size,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME);
> > +}
> > +
> > +static inline uint64_t eu_ctl_stopped(int debugfd, uint64_t client,
> > + uint64_t exec_queue, uint64_t lrc,
> > + uint8_t *bitmask, uint32_t *bitmask_size)
> > +{
> > + return eu_ctl(debugfd, client, exec_queue, lrc, bitmask, bitmask_size,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED);
> > +}
> > +
> > +static inline uint64_t eu_ctl_interrupt_all(int debugfd, uint64_t client,
> > + uint64_t exec_queue, uint64_t lrc)
> > +{
> > + return eu_ctl(debugfd, client, exec_queue, lrc, NULL, 0,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL);
> > +}
> > +
> > +struct online_debug_data {
> > + pthread_mutex_t mutex;
> > + /* client in */
> > + struct drm_xe_engine_class_instance hwe;
> > + /* client out */
> > + int threads_count;
> > + /* debugger internals */
> > + uint64_t client_handle;
> > + uint64_t exec_queue_handle;
> > + uint64_t lrc_handle;
> > + uint64_t target_offset;
> > + size_t target_size;
> > + uint64_t bb_offset;
> > + size_t bb_size;
> > + int vm_fd;
> > + uint32_t first_aip;
> > + uint64_t *aips_offset_table;
> > + uint32_t steps_done;
> > + uint8_t *single_step_bitmask;
> > + int stepped_threads_count;
> > + struct timespec exception_arrived;
> > + int last_eu_control_seqno;
> > + struct drm_xe_eudebug_event *exception_event;
> > +};
> > +
> > +static struct online_debug_data *
> > +online_debug_data_create(struct drm_xe_engine_class_instance *hwe)
> > +{
> > + struct online_debug_data *data;
> > +
> > + data = mmap(0, ALIGN(sizeof(*data), PAGE_SIZE),
> > + PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
>
> Check is data valid pointer.
>
> > + memcpy(&data->hwe, hwe, sizeof(*hwe));
> > + pthread_mutex_init(&data->mutex, NULL);
> > + data->client_handle = -1ULL;
> > + data->exec_queue_handle = -1ULL;
> > + data->lrc_handle = -1ULL;
> > + data->vm_fd = -1;
> > + data->stepped_threads_count = -1;
> > +
> > + return data;
> > +}
> > +
> > +static void online_debug_data_destroy(struct online_debug_data *data)
> > +{
> > + free(data->aips_offset_table);
> > + munmap(data, ALIGN(sizeof(*data), PAGE_SIZE));
> > +}
> > +
> > +static void eu_attention_debug_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
> > + uint32_t *ptr = (uint32_t *)att->bitmask;
> > +
> > + igt_debug("EVENT[%llu] eu-attention; threads=%d "
> > + "client[%llu], exec_queue[%llu], lrc[%llu], bitmask_size[%d]\n",
> > + att->base.seqno, count_set_bits(att->bitmask, att->bitmask_size),
> > + att->client_handle, att->exec_queue_handle,
> > + att->lrc_handle, att->bitmask_size);
> > +
> > + for (uint32_t i = 0; i < att->bitmask_size / 4; i += 2)
> > + igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, ptr[i], ptr[i + 1]);
> > +}
> > +
> > +static void eu_attention_reset_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
> > + uint32_t *ptr = (uint32_t *)att->bitmask;
> > + struct online_debug_data *data = d->ptr;
> > +
> > + igt_debug("EVENT[%llu] eu-attention with reset; threads=%d "
> > + "client[%llu], exec_queue[%llu], lrc[%llu], bitmask_size[%d]\n",
> > + att->base.seqno, count_set_bits(att->bitmask, att->bitmask_size),
> > + att->client_handle, att->exec_queue_handle,
> > + att->lrc_handle, att->bitmask_size);
> > +
> > + for (uint32_t i = 0; i < att->bitmask_size / 4; i += 2)
> > + igt_debug("bitmask[%d] = 0x%08x%08x\n", i / 2, ptr[i], ptr[i + 1]);
> > +
> > + xe_force_gt_reset_async(d->master_fd, data->hwe.gt_id);
> > +}
> > +
> > +static void copy_first_bit(uint8_t *dst, uint8_t *src, int size)
> > +{
> > + bool found = false;
> > + int i, j;
> > +
> > + for (i = 0; i < size; i++) {
> > + if (found) {
> > + dst[i] = 0;
>
> Function is static, but according to line above I would add some
> comment that it is cleaning dst buffer. copy_first_bit() is misleading
> as you mean first bit set. First bit is src[0] & 1.
>
> And what 'first' means? Having lets say src = { 0x0, 0xff, 0xcc, 0xaa }
> I would expect first should be most significant bit of 0xff.
>
>
> > + } else {
> > + uint32_t tmp = src[i]; /* in case dst == src */
> > +
> > + for (j = 0; j < 8; j++) {
>
> ffs()? But according to copy copy_nth_bit() I've doubts shouldn't this
> be fls()?
>
> > + dst[i] = tmp & (1 << j);
> > + if (dst[i]) {
> > + found = true;
> > + break;
> > + }
> > + }
> > + }
> > + }
> > +}
> > +
> > +static void copy_nth_bit(uint8_t *dst, uint8_t *src, int size, int n)
> > +{
> > + int count = 0;
> > +
> > + for (int i = 0; i < size; i++) {
> > + uint32_t tmp = src[i];
> > +
> > + for (int j = 7; j >= 0; j--) {
>
> I'm confused. In above function you iterate starting from least
> significant bit, here you start from most significant bit.
> Same concern about function name - shouldn't this be copy_nth_bit_set()?
>
> > + if (tmp & (1 << j)) {
> > + count++;
> > + if (count == n)
> > + dst[i] |= (1 << j);
> > + else
> > + dst[i] &= ~(1 << j);
>
> Do I understand correctly that you are clearing other bits in dst?
> It's extremely weird calling function copy_nth_bit() where it scans
> for n-th bit set, zeroing other bits in dst. Or I just don't understand
> logic behind this decision.
You've raised a bunch of valid inaccuracies. How about:

static void only_nth_set_bit(uint8_t *dst, uint8_t *src, int size, int n)
{
	int count = 0;

	for (int i = 0; i < size; i++) {
		if (count < n) {
			uint8_t tmp = src[i];

			for (int j = 0; j < 8; j++) {
				if (tmp & (1 << j)) {
					count++;
					if (count == n)
						dst[i] |= (1 << j);
					else
						dst[i] &= ~(1 << j);
				} else {
					dst[i] &= ~(1 << j);
				}
			}
		} else {
			dst[i] = 0;
		}
	}
}

static void only_first_set_bit(uint8_t *dst, uint8_t *src, int size)
{
	only_nth_set_bit(dst, src, size, 1);
}

I think this way both functions are coherent and their names are more descriptive.
>
> > + } else {
> > + dst[i] &= ~(1 << j);
> > + }
> > + }
> > + }
> > +}
> > +
> > +/*
> > + * Searches for the first instruction. It stands on assumption,
> > + * that shader kernel is placed before sip within the bb.
> > + */
> > +static uint32_t find_kernel_in_bb(struct gpgpu_shader *kernel,
> > + struct online_debug_data *data)
> > +{
> > + uint32_t *p = kernel->code;
> > + size_t sz = 4 * sizeof(uint32_t);
> > + uint32_t buf[4];
> > + int i;
> > +
> > + for (i = 0; i < data->bb_size; i += sz) {
> > + igt_assert_eq(pread(data->vm_fd, &buf, sz, data->bb_offset + i), sz);
> > +
> > +
>
> Unnecessary blank line.
>
> > + if (memcmp(p, buf, sz) == 0)
> > + break;
> > + }
>
> Isn't simpler to pread whole bb then use memmem()? Unless you want
> to exercise pread() with different offsets as well.
>
> > +
> > + igt_assert(i < data->bb_size);
> > +
> > + return i;
> > +}
> > +
> > +static void set_breakpoint_once(struct xe_eudebug_debugger *d,
> > + struct online_debug_data *data)
> > +{
> > + const uint32_t breakpoint_bit = 1 << 30;
> > + size_t sz = sizeof(uint32_t);
> > + struct gpgpu_shader *kernel;
> > + uint32_t aip;
> > +
> > + kernel = get_shader(d->master_fd, d->flags);
> > +
> > + if (data->first_aip) {
> > + uint32_t expected = find_kernel_in_bb(kernel, data) + kernel->size * 4 - 0x10;
> > +
> > + igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset), sz);
> > + igt_assert_eq_u32(aip, expected);
>
> I've checked how this is used, because it just compares aip, and it
> seems it is called second time for validating is target offset contains
> stored aip. Shouldn't this be in separate function like check_aip() or
> whatever but not in set_breakpoint_once()?
Good catch! The check does not even make sense when set_breakpoint_once() is called from the ufence
trigger (set-breakpoint) test case. I will fix that.
>
> > + } else {
> > + uint32_t instr_usdw;
> > +
> > + igt_assert(data->vm_fd != -1);
> > + igt_assert(data->target_size != 0);
> > + igt_assert(data->bb_size != 0);
> > +
> > + igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset), sz);
> > + data->first_aip = aip;
> > +
> > + aip = find_kernel_in_bb(kernel, data);
> > +
> > + /* set breakpoint on last instruction */
> > + aip += kernel->size * 4 - 0x10;
> > + igt_assert_eq(pread(data->vm_fd, &instr_usdw, sz,
> > + data->bb_offset + aip), sz);
> > + instr_usdw |= breakpoint_bit;
> > + igt_assert_eq(pwrite(data->vm_fd, &instr_usdw, sz,
> > + data->bb_offset + aip), sz);
> > +
> > + }
> > +
> > + gpgpu_shader_destroy(kernel);
> > +}
> > +
> > +static void get_aips_offset_table(struct online_debug_data *data, int threads)
> > +{
> > + size_t sz = sizeof(uint32_t);
> > + uint32_t aip;
> > + uint32_t first_aip;
> > + int table_index = 0;
> > +
> > + if (data->aips_offset_table)
> > + return;
> > +
> > + data->aips_offset_table = malloc(threads * sizeof(uint64_t));
> > + igt_assert(data->aips_offset_table);
> > +
> > + igt_assert_eq(pread(data->vm_fd, &first_aip, sz, data->target_offset), sz);
> > + data->first_aip = first_aip;
> > + data->aips_offset_table[table_index++] = 0;
> > +
> > + fsync(data->vm_fd);
> > + for (int i = sz; i < data->target_size; i += sz) {
> > + igt_assert_eq(pread(data->vm_fd, &aip, sz, data->target_offset + i), sz);
> > + if (aip == first_aip)
> > + data->aips_offset_table[table_index++] = i;
> > + }
> > +
> > + igt_assert_eq(threads, table_index);
> > +
> > + igt_debug("AIPs offset table:\n");
> > + for (int i = 0; i < threads; i++)
> > + igt_debug("%lx\n", data->aips_offset_table[i]);
> > +}
> > +
> > +static int get_stepped_threads_count(struct online_debug_data *data, int threads)
> > +{
> > + int count = 0;
> > + size_t sz = sizeof(uint32_t);
> > + uint32_t aip;
> > +
> > + fsync(data->vm_fd);
> > + for (int i = 0; i < threads; i++) {
> > + igt_assert_eq(pread(data->vm_fd, &aip, sz,
> > + data->target_offset + data->aips_offset_table[i]), sz);
> > + if (aip != data->first_aip) {
> > + igt_assert(aip == data->first_aip + 0x10);
> > + count++;
> > + }
> > + }
> > +
> > + return count;
> > +}
> > +
> > +static void save_first_exception_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct online_debug_data *data = d->ptr;
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + if (!data->exception_event) {
> > + igt_gettime(&data->exception_arrived);
> > + data->exception_event = igt_memdup(e, e->len);
> > + }
> > + pthread_mutex_unlock(&data->mutex);
> > +}
> > +
> > +#define MAX_PREEMPT_TIMEOUT 10ull
> > +static void eu_attention_resume_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_eu_attention *att = (void *) e;
> > + struct online_debug_data *data = d->ptr;
> > + uint32_t bitmask_size = att->bitmask_size;
> > + uint8_t *bitmask;
> > + int i;
> > +
> > + if (data->last_eu_control_seqno > att->base.seqno)
> > + return;
> > +
> > + bitmask = calloc(1, att->bitmask_size);
> > + igt_assert(bitmask);
> > +
> > + eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle,
> > + att->lrc_handle, bitmask, &bitmask_size);
> > + igt_assert(bitmask_size == att->bitmask_size);
> > + igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0);
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + if (igt_nsec_elapsed(&data->exception_arrived) < (MAX_PREEMPT_TIMEOUT + 1) * NSEC_PER_SEC &&
> > + d->flags & TRIGGER_RESUME_DELAYED) {
> > + pthread_mutex_unlock(&data->mutex);
> > + free(bitmask);
> > + return;
> > + } else if (d->flags & TRIGGER_RESUME_ONE) {
> > + copy_first_bit(bitmask, bitmask, bitmask_size);
> > + } else if (d->flags & TRIGGER_RESUME_DSS) {
> > + uint64_t *event = (uint64_t *)att->bitmask;
> > + uint64_t *resume = (uint64_t *)bitmask;
> > +
> > + memset(bitmask, 0, bitmask_size);
> > + for (i = 0; i < att->bitmask_size / sizeof(uint64_t); i++) {
> > + if (!event[i])
> > + continue;
> > +
> > + resume[i] = event[i];
> > + break;
> > + }
> > + } else if (d->flags & TRIGGER_RESUME_SET_BP) {
> > + set_breakpoint_once(d, data);
> > + }
> > +
> > + if (d->flags & SHADER_LOOP) {
> > + uint32_t threads = get_number_of_threads(d->flags);
> > + uint32_t val = STEERING_END_LOOP;
> > +
> > + igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t),
> > + data->target_offset + steering_offset(threads)),
> > + sizeof(uint32_t));
> > + fsync(data->vm_fd);
> > + }
> > + pthread_mutex_unlock(&data->mutex);
> > +
> > + data->last_eu_control_seqno = eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
> > + att->exec_queue_handle, att->lrc_handle,
> > + bitmask, att->bitmask_size);
> > +
> > + free(bitmask);
> > +}
> > +
> > +static void eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_eu_attention *att = (void *) e;
> > + struct online_debug_data *data = d->ptr;
> > + const int threads = get_number_of_threads(d->flags);
> > + uint32_t val;
> > + size_t sz = sizeof(uint32_t);
> > +
> > + get_aips_offset_table(data, threads);
> > +
> > + if (d->flags & TRIGGER_RESUME_PARALLEL_WALK) {
> > + if (data->stepped_threads_count != -1)
> > + if (data->steps_done < SINGLE_STEP_COUNT) {
> > + int stepped_threads_count_after_resume =
> > + get_stepped_threads_count(data, threads);
> > + igt_debug("Stepped threads after: %d\n",
> > + stepped_threads_count_after_resume);
> > +
> > + if (stepped_threads_count_after_resume == threads) {
> > + data->first_aip += 0x10;
> > + data->steps_done++;
> > + }
> > +
> > + igt_debug("Shader steps: %d\n", data->steps_done);
> > + igt_assert(data->stepped_threads_count == 0);
> > + igt_assert(stepped_threads_count_after_resume == threads);
> > + }
> > +
> > + if (data->steps_done < SINGLE_STEP_COUNT) {
> > + data->stepped_threads_count = get_stepped_threads_count(data, threads);
> > + igt_debug("Stepped threads before: %d\n", data->stepped_threads_count);
> > + }
> > +
> > + val = data->steps_done < SINGLE_STEP_COUNT ? STEERING_SINGLE_STEP :
> > + STEERING_CONTINUE;
> > + } else if (d->flags & TRIGGER_RESUME_SINGLE_WALK) {
> > + if (data->stepped_threads_count != -1)
> > + if (data->steps_done < 2) {
> > + int stepped_threads_count_after_resume =
> > + get_stepped_threads_count(data, threads);
> > + igt_debug("Stepped threads after: %d\n",
> > + stepped_threads_count_after_resume);
> > +
> > + if (stepped_threads_count_after_resume == threads) {
> > + data->first_aip += 0x10;
> > + data->steps_done++;
> > + free(data->single_step_bitmask);
> > + data->single_step_bitmask = NULL;
> > + }
> > +
> > + igt_debug("Shader steps: %d\n", data->steps_done);
> > + igt_assert(data->stepped_threads_count +
> > + (intel_gen_needs_resume_wa(d->master_fd) ? 2 : 1) ==
> > + stepped_threads_count_after_resume);
> > + }
> > +
> > + if (data->steps_done < 2) {
> > + data->stepped_threads_count = get_stepped_threads_count(data, threads);
> > + igt_debug("Stepped threads before: %d\n", data->stepped_threads_count);
> > + if (intel_gen_needs_resume_wa(d->master_fd)) {
> > + if (!data->single_step_bitmask) {
> > + data->single_step_bitmask = malloc(att->bitmask_size *
> > + sizeof(uint8_t));
> > + igt_assert(data->single_step_bitmask);
> > + memcpy(data->single_step_bitmask, att->bitmask,
> > + att->bitmask_size);
> > + }
> > +
> > + copy_first_bit(att->bitmask, data->single_step_bitmask,
> > + att->bitmask_size);
> > + } else
> > + copy_nth_bit(att->bitmask, att->bitmask, att->bitmask_size,
> > + data->stepped_threads_count + 1);
> > + }
> > +
> > + val = data->steps_done < 2 ? STEERING_SINGLE_STEP : STEERING_CONTINUE;
> > + }
> > +
> > + igt_assert_eq(pwrite(data->vm_fd, &val, sz,
> > + data->target_offset + steering_offset(threads)), sz);
> > + fsync(data->vm_fd);
> > +
> > + eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
> > + att->exec_queue_handle, att->lrc_handle,
> > + att->bitmask, att->bitmask_size);
> > +
> > + if (data->single_step_bitmask)
> > + for (int i = 0; i < att->bitmask_size; i++)
> > + data->single_step_bitmask[i] &= ~att->bitmask[i];
> > +}
> > +
> > +static void open_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_client *client = (void *)e;
> > + struct online_debug_data *data = d->ptr;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> > + return;
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + data->client_handle = client->client_handle;
> > + pthread_mutex_unlock(&data->mutex);
> > +}
> > +
> > +static void exec_queue_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_exec_queue *eq = (void *)e;
> > + struct online_debug_data *data = d->ptr;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_DESTROY)
> > + return;
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + data->exec_queue_handle = eq->exec_queue_handle;
> > + data->lrc_handle = eq->lrc_handle[0];
> > + pthread_mutex_unlock(&data->mutex);
> > +}
> > +
> > +static void vm_open_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm *vm = (void *)e;
> > + struct online_debug_data *data = d->ptr;
> > + struct drm_xe_eudebug_vm_open vo = {
> > + .client_handle = vm->client_handle,
> > + .vm_handle = vm->vm_handle,
> > + };
> > + int fd;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + fd = igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_VM_OPEN, &vo);
> > + igt_assert_lte(0, fd);
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + igt_assert(data->vm_fd == -1);
> > + data->vm_fd = fd;
> > + pthread_mutex_unlock(&data->mutex);
> > + return;
> > + }
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + close(data->vm_fd);
> > + data->vm_fd = -1;
> > + pthread_mutex_unlock(&data->mutex);
> > +}
> > +
> > +static void read_metadata(struct xe_eudebug_debugger *d,
> > + uint64_t client_handle,
> > + uint64_t metadata_handle,
> > + uint64_t type,
> > + uint64_t len)
> > +{
> > + struct drm_xe_eudebug_read_metadata rm = {
> > + .client_handle = client_handle,
> > + .metadata_handle = metadata_handle,
> > + .size = len,
> > + };
> > + struct online_debug_data *data = d->ptr;
> > + uint64_t *metadata;
> > +
> > + metadata = malloc(len);
> > + igt_assert(metadata);
> > +
> > + rm.ptr = to_user_pointer(metadata);
> > + igt_assert_eq(igt_ioctl(d->fd, DRM_XE_EUDEBUG_IOCTL_READ_METADATA, &rm), 0);
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + switch (type) {
> > + case DRM_XE_DEBUG_METADATA_ELF_BINARY:
> > + data->bb_offset = metadata[0];
> > + data->bb_size = metadata[1];
> > + break;
> > + case DRM_XE_DEBUG_METADATA_PROGRAM_MODULE:
> > + data->target_offset = metadata[0];
> > + data->target_size = metadata[1];
> > + break;
> > + default:
> > + break;
> > + }
> > + pthread_mutex_unlock(&data->mutex);
> > +
> > + free(metadata);
> > +}
> > +
> > +static void create_metadata_trigger(struct xe_eudebug_debugger *d, struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_metadata *em = (void *)e;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> > + read_metadata(d, em->client_handle, em->metadata_handle, em->type, em->len);
> > +}
> > +
> > +static void overwrite_immediate_value_in_common_target_write(int vm_fd, uint64_t offset,
> > + uint32_t old_val, uint32_t new_val)
> > +{
> > + uint64_t addr = offset;
> > + int vals_changed = 0;
> > + uint32_t val;
> > +
> > + while (vals_changed < 4) {
> > + igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr), sizeof(uint32_t));
> > + if (val == old_val) {
> > + igt_debug("val_before_write[%d]: %08x\n", vals_changed, val);
> > + igt_assert_eq(pwrite(vm_fd, &new_val, sizeof(uint32_t), addr),
> > + sizeof(uint32_t));
> > + igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr),
> > + sizeof(uint32_t));
> > + igt_debug("val_before_fsync[%d]: %08x\n", vals_changed, val);
> > + fsync(vm_fd);
> > + igt_assert_eq(pread(vm_fd, &val, sizeof(uint32_t), addr),
> > + sizeof(uint32_t));
> > + igt_debug("val_after_fsync[%d]: %08x\n", vals_changed, val);
> > + igt_assert_eq_u32(val, new_val);
> > + vals_changed++;
> > + }
> > + addr += sizeof(uint32_t);
> > + }
> > +}
> > +
> > +static void eu_attention_resume_caching_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_eu_attention *att = (void *)e;
> > + struct online_debug_data *data = d->ptr;
> > + static int counter;
> > + static int kernel_in_bb;
>
> Reusing this function (currently it is used once) may be error prone.
> Shouldn't this be put in debugger private data?
Makes sense.
>
> > + struct dim_t s_dim = surface_dimensions(get_number_of_threads(d->flags));
> > + int val;
> > + uint32_t instr_usdw;
> > + struct gpgpu_shader *kernel;
> > + const uint32_t breakpoint_bit = 1 << 30;
> > + struct gpgpu_shader *shader_preamble;
> > + struct gpgpu_shader *shader_write_instr;
> > +
> > + shader_preamble = gpgpu_shader_create(d->master_fd);
> > + gpgpu_shader__write_dword(shader_preamble, SHADER_CANARY, 0);
> > + gpgpu_shader__nop(shader_preamble);
> > + gpgpu_shader__breakpoint(shader_preamble);
> > +
> > + shader_write_instr = gpgpu_shader_create(d->master_fd);
> > + gpgpu_shader__common_target_write_u32(shader_write_instr, 0, 0);
> > +
> > + if (!kernel_in_bb) {
> > + kernel = get_shader(d->master_fd, d->flags);
> > + kernel_in_bb = find_kernel_in_bb(kernel, data);
> > + gpgpu_shader_destroy(kernel);
> > + }
> > +
> > + /* set breakpoint on next write instruction */
> > + if (counter < caching_get_instruction_count(d->master_fd, s_dim.x, d->flags)) {
> > + igt_assert_eq(pread(data->vm_fd, &instr_usdw, sizeof(instr_usdw),
> > + data->bb_offset + kernel_in_bb + shader_preamble->size * 4 +
> > + shader_write_instr->size * 4 * counter), sizeof(instr_usdw));
> > + instr_usdw |= breakpoint_bit;
> > + igt_assert_eq(pwrite(data->vm_fd, &instr_usdw, sizeof(instr_usdw),
> > + data->bb_offset + kernel_in_bb + shader_preamble->size * 4 +
> > + shader_write_instr->size * 4 * counter), sizeof(instr_usdw));
> > + fsync(data->vm_fd);
> > + }
> > +
> > + /* restore current instruction */
> > + if (counter && counter <= caching_get_instruction_count(d->master_fd, s_dim.x, d->flags))
> > + overwrite_immediate_value_in_common_target_write(data->vm_fd,
> > + data->bb_offset + kernel_in_bb +
> > + shader_preamble->size * 4 +
> > + shader_write_instr->size * 4 * (counter - 1),
> > + CACHING_POISON_VALUE,
> > + CACHING_VALUE(counter - 1));
> > +
> > + /* poison next instruction */
> > + if (counter < caching_get_instruction_count(d->master_fd, s_dim.x, d->flags))
> > + overwrite_immediate_value_in_common_target_write(data->vm_fd,
> > + data->bb_offset + kernel_in_bb +
> > + shader_preamble->size * 4 +
> > + shader_write_instr->size * 4 * counter,
> > + CACHING_VALUE(counter),
> > + CACHING_POISON_VALUE);
> > +
> > + gpgpu_shader_destroy(shader_write_instr);
> > + gpgpu_shader_destroy(shader_preamble);
> > +
> > + for (int i = 0; i < data->target_size; i += sizeof(uint32_t)) {
> > + igt_assert_eq(pread(data->vm_fd, &val, sizeof(val), data->target_offset + i),
> > + sizeof(val));
> > + igt_assert_f(val != CACHING_POISON_VALUE, "Poison value found at %04d!\n", i);
> > + }
> > +
> > + eu_ctl_resume(d->master_fd, d->fd, att->client_handle,
> > + att->exec_queue_handle, att->lrc_handle,
> > + att->bitmask, att->bitmask_size);
> > +
> > + counter++;
> > +}
> > +
> > +static struct intel_bb *xe_bb_create_on_offset(int fd, uint32_t exec_queue, uint32_t vm,
> > + uint64_t offset, uint32_t size)
> > +{
> > + struct intel_bb *ibb;
> > +
> > + ibb = intel_bb_create_with_context(fd, exec_queue, vm, NULL, size);
> > +
> > + /* update intel bb offset */
> > + intel_bb_remove_object(ibb, ibb->handle, ibb->batch_offset, ibb->size);
> > + intel_bb_add_object(ibb, ibb->handle, ibb->size, offset, ibb->alignment, false);
> > + ibb->batch_offset = offset;
> > +
> > + return ibb;
> > +}
> > +
> > +static size_t get_bb_size(int flags)
> > +{
> > + if ((flags & SHADER_CACHING_SRAM) || (flags & SHADER_CACHING_VRAM))
> > + return 32768;
> > +
> > + return 4096;
> > +}
> > +
> > +static void run_online_client(struct xe_eudebug_client *c)
> > +{
> > + int threads = get_number_of_threads(c->flags);
> > + const uint64_t target_offset = 0x1a000000;
> > + const uint64_t bb_offset = 0x1b000000;
> > + const size_t bb_size = get_bb_size(c->flags);
> > + struct online_debug_data *data = c->ptr;
> > + struct drm_xe_engine_class_instance hwe = data->hwe;
> > + struct drm_xe_ext_set_property ext = {
> > + .base.name = DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY,
> > + .property = DRM_XE_EXEC_QUEUE_SET_PROPERTY_EUDEBUG,
> > + .value = DRM_XE_EXEC_QUEUE_EUDEBUG_FLAG_ENABLE,
> > + };
> > + struct drm_xe_exec_queue_create create = {
> > + .instances = to_user_pointer(&hwe),
> > + .width = 1,
> > + .num_placements = 1,
> > + .extensions = c->flags & DISABLE_DEBUG_MODE ? 0 : to_user_pointer(&ext)
> > + };
> > + struct dim_t w_dim = walker_dimensions(threads);
> > + struct dim_t s_dim = surface_dimensions(threads);
> > + struct timespec ts = { };
> > + struct gpgpu_shader *sip, *shader;
> > + uint32_t metadata_id[2];
> > + uint64_t *metadata[2];
> > + struct intel_bb *ibb;
> > + struct intel_buf *buf;
> > + uint32_t *ptr;
> > + int fd;
> > +
> > + metadata[0] = calloc(2, sizeof(*metadata));
> > + metadata[1] = calloc(2, sizeof(*metadata));
> > + igt_assert(metadata[0]);
> > + igt_assert(metadata[1]);
> > +
> > + fd = xe_eudebug_client_open_driver(c);
> > + xe_device_get(fd);
>
> Not necessary.
>
> > +
> > + /* Additional memory for steering control */
> > + if (c->flags & SHADER_LOOP || c->flags & SHADER_SINGLE_STEP)
> > + s_dim.y++;
> > + /* Additional memory for caching check */
> > + if ((c->flags & SHADER_CACHING_SRAM) || (c->flags & SHADER_CACHING_VRAM))
> > + s_dim.y += caching_get_instruction_count(fd, s_dim.x, c->flags);
> > + buf = create_uc_buf(fd, s_dim.x, s_dim.y);
> > +
> > + buf->addr.offset = target_offset;
> > +
> > + metadata[0][0] = bb_offset;
> > + metadata[0][1] = bb_size;
> > + metadata[1][0] = target_offset;
> > + metadata[1][1] = buf->size;
> > + metadata_id[0] = xe_eudebug_client_metadata_create(c, fd, DRM_XE_DEBUG_METADATA_ELF_BINARY,
> > + 2 * sizeof(*metadata), metadata[0]);
> > + metadata_id[1] = xe_eudebug_client_metadata_create(c, fd,
> > + DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
> > + 2 * sizeof(*metadata), metadata[1]);
> > +
> > + create.vm_id = xe_eudebug_client_vm_create(c, fd, DRM_XE_VM_CREATE_FLAG_LR_MODE, 0);
> > + xe_eudebug_client_exec_queue_create(c, fd, &create);
> > +
> > + ibb = xe_bb_create_on_offset(fd, create.exec_queue_id, create.vm_id,
> > + bb_offset, bb_size);
> > + intel_bb_set_lr_mode(ibb, true);
> > +
> > + sip = get_sip(fd, c->flags);
> > + shader = get_shader(fd, c->flags);
> > +
> > + igt_nsec_elapsed(&ts);
> > + gpgpu_shader_exec(ibb, buf, w_dim.x, w_dim.y, shader, sip, 0, 0);
> > +
> > + gpgpu_shader_destroy(sip);
> > + gpgpu_shader_destroy(shader);
> > +
> > + intel_bb_sync(ibb);
> > +
> > + if (c->flags & TRIGGER_RECONNECT)
> > + xe_eudebug_client_wait_stage(c, DEBUGGER_REATTACHED);
> > + else
> > + /* Make sure it wasn't the timeout. */
> > + igt_assert(igt_nsec_elapsed(&ts) < XE_EUDEBUG_DEFAULT_TIMEOUT_SEC * NSEC_PER_SEC);
> > +
> > + if (!(c->flags & DO_NOT_EXPECT_CANARIES)) {
> > + ptr = xe_bo_mmap_ext(fd, buf->handle, buf->size, PROT_READ);
> > + data->threads_count = count_canaries_neq(ptr, w_dim, 0);
> > + igt_assert_f(data->threads_count, "No canaries found, nothing executed?\n");
> > +
> > + if ((c->flags & SHADER_BREAKPOINT || c->flags & TRIGGER_RESUME_SET_BP ||
> > + c->flags & SHADER_N_NOOP_BREAKPOINT) && !(c->flags & DISABLE_DEBUG_MODE)) {
> > + uint32_t aip = ptr[0];
> > +
> > + igt_assert_f(aip != SHADER_CANARY,
> > + "Workload executed but breakpoint not hit!\n");
> > + igt_assert_eq(count_canaries_eq(ptr, w_dim, aip), data->threads_count);
> > + igt_debug("Breakpoint hit in %d threads, AIP=0x%08x\n", data->threads_count,
> > + aip);
> > + }
> > +
> > + munmap(ptr, buf->size);
> > + }
> > +
> > + intel_bb_destroy(ibb);
> > +
> > + xe_eudebug_client_exec_queue_destroy(c, fd, &create);
> > + xe_eudebug_client_vm_destroy(c, fd, create.vm_id);
> > +
> > + xe_eudebug_client_metadata_destroy(c, fd, metadata_id[0], DRM_XE_DEBUG_METADATA_ELF_BINARY,
> > + 2 * sizeof(*metadata));
> > + xe_eudebug_client_metadata_destroy(c, fd, metadata_id[1],
> > + DRM_XE_DEBUG_METADATA_PROGRAM_MODULE,
> > + 2 * sizeof(*metadata));
> > +
> > + xe_device_put(fd);
>
> Same.
>
> > + xe_eudebug_client_close_driver(c, fd);
> > +}
> > +
> > +static bool intel_gen_has_lockstep_eus(int fd)
> > +{
> > + const uint32_t id = intel_get_drm_devid(fd);
> > +
> > + /*
> > + * Lockstep (or, in some parlance, fused) EUs are pairs of EUs that
> > + * work in sync: supposedly the same clock and the same control flow.
> > + * Thus for attentions, if the control flow hits a breakpoint, both
> > + * are excepted into SIP. At this level, the hardware has only one
> > + * attention thread bit per pair. PVC is the first platform without
> > + * lockstepping.
> > + */
> > + return !(intel_graphics_ver(id) == IP_VER(12, 60) || intel_gen(id) >= 20);
> > +}
> > +
> > +static int query_attention_bitmask_size(int fd, int gt)
> > +{
> > + const unsigned int threads = 8;
> > + struct drm_xe_query_topology_mask *c_dss = NULL, *g_dss = NULL, *eu_per_dss = NULL;
> > + struct drm_xe_query_topology_mask *topology;
> > + struct drm_xe_device_query query = {
> > + .extensions = 0,
> > + .query = DRM_XE_DEVICE_QUERY_GT_TOPOLOGY,
> > + .size = 0,
> > + .data = 0,
> > + };
> > + int pos = 0, eus;
> > + uint8_t *any_dss;
> > +
> > + igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> > + igt_assert_neq(query.size, 0);
> > +
> > + topology = malloc(query.size);
> > + igt_assert(topology);
> > +
> > + query.data = to_user_pointer(topology);
> > + igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> > +
> > + while (query.size >= sizeof(struct drm_xe_query_topology_mask)) {
> > + struct drm_xe_query_topology_mask *topo;
> > + int sz;
> > +
> > + topo = (struct drm_xe_query_topology_mask *)((unsigned char *)topology + pos);
> > + sz = sizeof(struct drm_xe_query_topology_mask) + topo->num_bytes;
> > +
> > + query.size -= sz;
> > + pos += sz;
> > +
> > + if (topo->gt_id != gt)
> > + continue;
> > +
> > + if (topo->type == DRM_XE_TOPO_DSS_GEOMETRY)
> > + g_dss = topo;
> > + else if (topo->type == DRM_XE_TOPO_DSS_COMPUTE)
> > + c_dss = topo;
> > + else if (topo->type == DRM_XE_TOPO_EU_PER_DSS ||
> > + topo->type == DRM_XE_TOPO_SIMD16_EU_PER_DSS)
> > + eu_per_dss = topo;
> > + }
> > +
> > + igt_assert(g_dss && c_dss && eu_per_dss);
> > + igt_assert_eq_u32(c_dss->num_bytes, g_dss->num_bytes);
> > +
> > + any_dss = malloc(c_dss->num_bytes);
>
> Assert if NULL.
>
> > +
> > + for (int i = 0; i < c_dss->num_bytes; i++)
> > + any_dss[i] = c_dss->mask[i] | g_dss->mask[i];
> > +
> > + eus = count_set_bits(any_dss, c_dss->num_bytes);
> > + eus *= count_set_bits(eu_per_dss->mask, eu_per_dss->num_bytes);
> > +
> > + if (intel_gen_has_lockstep_eus(fd))
> > + eus /= 2;
> > +
> > + free(any_dss);
> > + free(topology);
> > +
> > + return eus * threads / 8;
> > +}
> > +
> > +static struct drm_xe_eudebug_event_exec_queue *
> > +match_attention_with_exec_queue(struct xe_eudebug_event_log *log,
> > + struct drm_xe_eudebug_event_eu_attention *ea)
> > +{
> > + struct drm_xe_eudebug_event_exec_queue *ee;
> > + struct drm_xe_eudebug_event *event = NULL, *current = NULL, *matching_destroy = NULL;
> > + int lrc_idx;
> > +
> > + xe_eudebug_for_each_event(event, log) {
> > + if (event->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE &&
> > + event->flags == DRM_XE_EUDEBUG_EVENT_CREATE) {
> > + ee = (struct drm_xe_eudebug_event_exec_queue *)event;
> > +
> > + if (ee->exec_queue_handle != ea->exec_queue_handle)
> > + continue;
> > +
> > + if (ee->client_handle != ea->client_handle)
> > + continue;
> > +
> > + for (lrc_idx = 0; lrc_idx < ee->width; lrc_idx++) {
> > + if (ee->lrc_handle[lrc_idx] == ea->lrc_handle)
> > + break;
> > + }
> > +
> > + if (lrc_idx >= ee->width) {
> > + igt_debug("No matching lrc handle within matching exec_queue!");
> > + continue;
> > + }
> > +
> > + /* Event logs are sorted; any exec_queue created later would postdate the attention too. */
> > + if (ea->base.seqno < ee->base.seqno)
> > + break;
> > +
> > + /* Sanity check that the attention did not arrive on an
> > + * already destroyed exec_queue.
> > + */
> > + current = event;
> > + xe_eudebug_for_each_event(current, log) {
> > + if (current->type == DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE &&
> > + current->flags == DRM_XE_EUDEBUG_EVENT_DESTROY) {
> > + uint8_t offset = sizeof(struct drm_xe_eudebug_event);
> > +
> > + if (memcmp((uint8_t *)current + offset,
> > + (uint8_t *)event + offset,
> > + current->len - offset) == 0) {
> > + matching_destroy = current;
> > + }
> > + }
> > + }
> > +
> > + if (!matching_destroy || ea->base.seqno > matching_destroy->seqno)
> > + continue;
> > +
> > + return ee;
> > + }
> > + }
> > +
> > + return NULL;
> > +}
> > +
> > +static void online_session_check(struct xe_eudebug_session *s, int flags)
> > +{
> > + struct drm_xe_eudebug_event_eu_attention *ea = NULL;
> > + struct drm_xe_eudebug_event *event = NULL;
> > + struct online_debug_data *data = s->client->ptr;
> > + bool expect_exception = !(flags & DISABLE_DEBUG_MODE);
> > + int sum = 0;
> > + int bitmask_size;
> > +
> > + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> > +
> > + bitmask_size = query_attention_bitmask_size(s->debugger->master_fd, data->hwe.gt_id);
> > +
> > + xe_eudebug_for_each_event(event, s->debugger->log) {
> > + if (event->type == DRM_XE_EUDEBUG_EVENT_EU_ATTENTION) {
> > + ea = (struct drm_xe_eudebug_event_eu_attention *)event;
> > +
> > + igt_assert(event->flags == DRM_XE_EUDEBUG_EVENT_STATE_CHANGE);
> > + igt_assert_eq(ea->bitmask_size, bitmask_size);
> > + sum += count_set_bits(ea->bitmask, bitmask_size);
> > + igt_assert(match_attention_with_exec_queue(s->debugger->log, ea));
> > + }
> > + }
> > +
> > + /*
> > + * We can expect attention to sum up only
> > + * if we have a breakpoint set and we resume all threads always.
> > + */
> > + if (flags == SHADER_BREAKPOINT || flags == TRIGGER_UFENCE_SET_BREAKPOINT)
> > + igt_assert_eq(sum, data->threads_count);
> > +
> > + if (expect_exception)
> > + igt_assert(sum > 0);
> > + else
> > + igt_assert(sum == 0);
> > +}
> > +
> > +static void ufence_ack_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> > + xe_eudebug_ack_ufence(d->fd, ef);
> > +}
> > +
> > +static void ufence_ack_set_bp_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *e)
> > +{
> > + struct drm_xe_eudebug_event_vm_bind_ufence *ef = (void *)e;
> > + struct online_debug_data *data = d->ptr;
> > +
> > + set_breakpoint_once(d, data);
> > +
> > + if (e->flags & DRM_XE_EUDEBUG_EVENT_CREATE)
> > + xe_eudebug_ack_ufence(d->fd, ef);
> > +}
> > +
> > +/**
> > + * SUBTEST: basic-breakpoint
> > + * Description:
> > + * Check whether KMD sends attention events
> > + * for workload in debug mode stopped on breakpoint.
> > + *
> > + * SUBTEST: breakpoint-not-in-debug-mode
> > + * Description:
> > + * Check whether KMD resets the GPU when it spots an attention
> > + * coming from workload not in debug mode.
> > + *
> > + * SUBTEST: stopped-thread
> > + * Description:
> > + * Hits breakpoint on runalone workload and
> > + * reads attention for fixed time.
> > + *
> > + * SUBTEST: resume-%s
> > + * Description:
> > + * Resumes stopped on a breakpoint workload
> > + * with granularity of %arg[1].
> > + *
> > + *
> > + * arg[1]:
> > + *
> > + * @one: one thread
> > + * @dss: threads running on one subslice
> > + */
> > +static void test_basic_online(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + xe_eudebug_session_run(s);
> > + online_session_check(s, s->flags);
> > +
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +/**
> > + * SUBTEST: set-breakpoint
> > + * Description:
> > + * Checks for attention after setting a dynamic breakpoint in the ufence event.
> > + */
> > +
> > +static void test_set_breakpoint_online(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> > + open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> > + exec_queue_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_set_bp_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > +
> > + xe_eudebug_session_run(s);
> > + online_session_check(s, s->flags);
> > +
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +/**
> > + * SUBTEST: preempt-breakpoint
> > + * Description:
> > + * Verify that eu debugger disables preemption timeout to
> > + * prevent reset of workload stopped on breakpoint.
> > + */
> > +static void test_preemption(int fd, struct drm_xe_engine_class_instance *hwe)
> > +{
> > + int flags = SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED;
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > + struct xe_eudebug_client *other;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > + other = xe_eudebug_client_create(fd, run_online_client, SHADER_NOP, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > +
> > + xe_eudebug_client_start(s->client);
> > + sleep(1); /* make sure s->client starts first */
>
> If client would write token it has started this sleep wouldn't be
> necessary. I mean inside xe_eudebug_client_start() do
> token_signal/wait_for_client.
To ensure that the first client submits its workload before the other, we would need to signal only
after calling xe_exec. This means the signaling would have to live inside the client's work
function. We cannot place wait_for_client inside the generic xe_eudebug_client_start() because
the work function is defined by the caller. While I could implement a similar mechanism specifically
for this test, it would require either a brand new run_online_client()-like function or adding
wait_for_client to every test that reuses run_online_client(). I decided to keep it simple, albeit
imperfect. Let me know if you would rather have it changed.
>
> > + xe_eudebug_client_start(other);
> > +
> > + xe_eudebug_client_wait_done(s->client);
> > + xe_eudebug_client_wait_done(other);
> > +
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > +
> > + xe_eudebug_session_destroy(s);
> > + xe_eudebug_client_destroy(other);
> > +
> > + igt_assert_f(data->last_eu_control_seqno != 0,
> > + "Workload with breakpoint has ended without resume!\n");
> > +
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +/**
> > + * SUBTEST: reset-with-attention
> > + * Description:
> > + * Check whether GPU is usable after resetting with attention raised
> > + * (stopped on breakpoint) by running the same workload again.
> > + */
> > +static void test_reset_with_attention_online(int fd, struct drm_xe_engine_class_instance *hwe,
> > + int flags)
> > +{
> > + struct xe_eudebug_session *s1, *s2;
> > + struct online_debug_data *data;
> > +
> > + data = online_debug_data_create(hwe);
> > + s1 = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s1->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_reset_trigger);
> > + xe_eudebug_debugger_add_trigger(s1->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + xe_eudebug_session_run(s1);
> > + xe_eudebug_session_destroy(s1);
> > +
> > + s2 = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > + xe_eudebug_debugger_add_trigger(s2->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > + xe_eudebug_debugger_add_trigger(s2->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + xe_eudebug_session_run(s2);
> > +
> > + online_session_check(s2, s2->flags);
> > +
> > + xe_eudebug_session_destroy(s2);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +/**
> > + * SUBTEST: interrupt-all
> > + * Description:
> > + * Schedules EU workload which should last about a few seconds, then
> > + * interrupts all threads, checks whether attention event came, and
> > + * resumes stopped threads back.
> > + *
> > + * SUBTEST: interrupt-all-set-breakpoint
> > + * Description:
> > + * Schedules an EU workload which should last a few seconds, then
> > + * interrupts all threads. Once the attention event arrives, it sets a
> > + * breakpoint on the very next instruction and resumes the stopped
> > + * threads, expecting every thread to hit the breakpoint.
> > + */
> > +static void test_interrupt_all(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > + uint32_t val;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> > + open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> > + exec_queue_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_start(s->client);
> > +
> > + /* wait for workload to start */
> > + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> > + /* collect needed data from triggers */
> > + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> > + continue;
> > +
> > + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> > + if (val != 0)
> > + break;
> > + }
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + igt_assert(data->client_handle != -1);
> > + igt_assert(data->exec_queue_handle != -1);
> > + eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle, data->lrc_handle);
> > + pthread_mutex_unlock(&data->mutex);
> > +
> > + xe_eudebug_client_wait_done(s->client);
> > +
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > +
> > + xe_eudebug_event_log_print(s->debugger->log, true);
> > + xe_eudebug_event_log_print(s->client->log, true);
> > +
> > + online_session_check(s, s->flags);
> > +
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +static void reset_debugger_log(struct xe_eudebug_debugger *d)
> > +{
> > + unsigned int max_size;
> > + char log_name[80];
> > +
> > + /* Don't pull the rug out from under an active debugger */
> > + igt_assert(d->target_pid == 0);
> > +
> > + max_size = d->log->max_size;
> > + strncpy(log_name, d->log->name, sizeof(log_name) - 1);
> > + log_name[sizeof(log_name) - 1] = '\0';
> > + xe_eudebug_event_log_destroy(d->log);
> > + d->log = xe_eudebug_event_log_create(log_name, max_size);
> > +}
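A side note on the bounded copy in reset_debugger_log() above: sizing the copy from the destination and terminating at `sizeof(dst) - 1` avoids magic indices entirely; a destination-sized snprintf() does both in one call. A minimal sketch of that pattern (the helper name is illustrative, not from the patch):

```c
#include <stdio.h>
#include <string.h>

/* Bounded string copy sized from the destination: snprintf() always
 * NUL-terminates and never writes past dst_size bytes. */
static void copy_log_name(char *dst, size_t dst_size, const char *src)
{
	snprintf(dst, dst_size, "%s", src);
}
```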
> > +
> > +/**
> > + * SUBTEST: interrupt-other-debuggable
> > + * Description:
> > + * Schedules an EU workload with a never-ending loop in runalone mode
> > + * and, while it is not under debug, tries to interrupt all threads
> > + * using a different client attached to the debugger.
> > + *
> > + * SUBTEST: interrupt-other
> > + * Description:
> > + * Schedules an EU workload with a never-ending loop and, while it is
> > + * not configured for debugging, tries to interrupt all threads using
> > + * the client attached to the debugger.
> > + */
> > +static void test_interrupt_other(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct online_debug_data *data;
> > + struct online_debug_data *debugee_data;
> > + struct xe_eudebug_session *s;
> > + struct xe_eudebug_client *debugee;
> > + int debugee_flags = SHADER_LOOP | DO_NOT_EXPECT_CANARIES;
> > + int val = 0;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN, open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> > + exec_queue_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_start(s->client);
> > +
> > + /* wait for workload to start */
> > + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> > + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> > + continue;
> > +
> > + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> > + if (val != 0)
> > + break;
> > + }
> > + igt_assert_f(val != 0, "Workload execution has not started\n");
> > +
> > + xe_eudebug_debugger_detach(s->debugger);
> > + reset_debugger_log(s->debugger);
> > +
> > + debugee_data = online_debug_data_create(hwe);
> > + s->debugger->ptr = debugee_data;
> > + debugee = xe_eudebug_client_create(fd, run_online_client, debugee_flags, debugee_data);
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, debugee), 0);
> > + xe_eudebug_client_start(debugee);
> > +
> > + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> > + if (READ_ONCE(debugee_data->vm_fd) == -1 || READ_ONCE(debugee_data->target_size) == 0)
> > + continue;
> > +
> > + break;
> > + }
> > +
> > + pthread_mutex_lock(&debugee_data->mutex);
> > + igt_assert(debugee_data->client_handle != -1);
> > + igt_assert(debugee_data->exec_queue_handle != -1);
> > +
> > + /*
> > + * Interrupting the other client should return invalid state
> > + * as it is running in runalone mode
> > + */
> > + igt_assert_eq(__eu_ctl(s->debugger->fd, debugee_data->client_handle,
> > + debugee_data->exec_queue_handle, debugee_data->lrc_handle, NULL, 0,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL), -EINVAL);
> > + pthread_mutex_unlock(&debugee_data->mutex);
> > +
> > + xe_force_gt_reset_async(s->debugger->master_fd, debugee_data->hwe.gt_id);
> > +
> > + xe_eudebug_client_wait_done(debugee);
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > +
> > + xe_eudebug_event_log_print(s->debugger->log, true);
> > + xe_eudebug_event_log_print(debugee->log, true);
> > +
> > + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
> > +
> > + xe_eudebug_client_destroy(debugee);
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > + online_debug_data_destroy(debugee_data);
> > +}
> > +
> > +/**
> > + * SUBTEST: tdctl-parameters
> > + * Description:
> > + * Schedules an EU workload which should last a few seconds, then
> > + * checks negative scenarios of EU_THREADS ioctl usage, interrupts all
> > + * threads, checks whether the attention event arrived, and resumes the
> > + * stopped threads.
> > + */
> > +static void test_tdctl_parameters(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > + uint32_t val;
> > + uint32_t random_command;
> > + uint32_t bitmask_size = query_attention_bitmask_size(fd, hwe->gt_id);
> > + uint8_t *attention_bitmask = malloc(bitmask_size * sizeof(uint8_t));
> > +
> > + igt_assert(attention_bitmask);
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> > + open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> > + exec_queue_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_start(s->client);
> > +
> > + /* wait for workload to start */
> > + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> > + /* collect needed data from triggers */
> > + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> > + continue;
> > +
> > + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> > + if (val != 0)
> > + break;
> > + }
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + igt_assert(data->client_handle != -1);
> > + igt_assert(data->exec_queue_handle != -1);
> > + igt_assert(data->lrc_handle != -1);
> > +
> > + /* fail on invalid lrc_handle */
> > + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle, data->lrc_handle + 1,
> > + attention_bitmask, &bitmask_size,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
> > +
> > + /* fail on invalid exec_queue_handle */
> > + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle + 1, data->lrc_handle,
> > + attention_bitmask, &bitmask_size,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
> > +
> > + /* fail on invalid client */
> > + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle + 1,
> > + data->exec_queue_handle, data->lrc_handle,
> > + attention_bitmask, &bitmask_size,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_INTERRUPT_ALL, NULL) == -EINVAL);
> > +
> > + /*
> > + * bitmask size must be aligned to sizeof(u32) for all commands
> > + * and be zero for interrupt all
> > + */
> > + bitmask_size = sizeof(uint32_t) - 1;
> > + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle, data->lrc_handle,
> > + attention_bitmask, &bitmask_size,
> > + DRM_XE_EUDEBUG_EU_CONTROL_CMD_STOPPED, NULL) == -EINVAL);
> > + bitmask_size = 0;
> > +
> > + /* fail on invalid command */
> > + random_command = random() | (DRM_XE_EUDEBUG_EU_CONTROL_CMD_RESUME + 1);
> > + igt_assert(__eu_ctl(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle, data->lrc_handle,
> > + attention_bitmask, &bitmask_size, random_command, NULL) == -EINVAL);
> > +
> > + free(attention_bitmask);
> > +
> > + eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle, data->lrc_handle);
> > + pthread_mutex_unlock(&data->mutex);
> > +
> > + xe_eudebug_client_wait_done(s->client);
> > +
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
> > +
> > + xe_eudebug_event_log_print(s->debugger->log, true);
> > + xe_eudebug_event_log_print(s->client->log, true);
> > +
> > + online_session_check(s, s->flags);
> > +
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +static void eu_attention_debugger_detach_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *event)
> > +{
> > + struct online_debug_data *data = d->ptr;
> > + uint64_t c_pid;
> > + int ret;
> > +
> > + c_pid = d->target_pid;
> > +
> > + /* Reset VM data so the re-triggered VM open handler works properly */
> > + data->vm_fd = -1;
> > +
> > + xe_eudebug_debugger_detach(d);
> > +
> > + /* Let the KMD scan function notice unhandled EU attention */
> > + if (!(d->flags & SHADER_N_NOOP_BREAKPOINT))
> > + sleep(1);
> > +
> > + /*
> > + * A new session created by the EU debugger on reconnect restarts the
> > + * seqno, causing issues with log sorting. To avoid that, create
> > + * a new event log.
> > + */
> > + reset_debugger_log(d);
> > +
> > + ret = xe_eudebug_connect(d->master_fd, c_pid, 0);
> > + igt_assert(ret >= 0);
> > + d->fd = ret;
> > + d->target_pid = c_pid;
> > +
> > + /* Let the discovery worker discover resources */
> > + sleep(2);
> > +
> > + if (!(d->flags & SHADER_N_NOOP_BREAKPOINT))
> > + xe_eudebug_debugger_signal_stage(d, DEBUGGER_REATTACHED);
> > +}
> > +
> > +/**
> > + * SUBTEST: interrupt-reconnect
> > + * Description:
> > + * Schedules an EU workload which should last a few seconds, interrupts
> > + * all threads and detaches the debugger when attention is raised. The
> > + * test checks whether the KMD resets the workload when no debugger is
> > + * attached, and that it performs event playback on discovery.
> > + */
> > +static void test_interrupt_reconnect(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct drm_xe_eudebug_event *e = NULL;
> > + struct online_debug_data *data;
> > + struct xe_eudebug_session *s;
> > + uint32_t val;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> > + open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EXEC_QUEUE,
> > + exec_queue_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debugger_detach_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > + xe_eudebug_debugger_start_worker(s->debugger);
> > + xe_eudebug_client_start(s->client);
> > +
> > + /* wait for workload to start */
> > + igt_for_milliseconds(STARTUP_TIMEOUT_MS) {
> > + /* collect needed data from triggers */
> > + if (READ_ONCE(data->vm_fd) == -1 || READ_ONCE(data->target_size) == 0)
> > + continue;
> > +
> > + if (pread(data->vm_fd, &val, sizeof(val), data->target_offset) == sizeof(val))
> > + if (val != 0)
> > + break;
> > + }
> > +
> > + pthread_mutex_lock(&data->mutex);
> > + igt_assert(data->client_handle != -1);
> > + igt_assert(data->exec_queue_handle != -1);
> > + eu_ctl_interrupt_all(s->debugger->fd, data->client_handle,
> > + data->exec_queue_handle, data->lrc_handle);
> > + pthread_mutex_unlock(&data->mutex);
> > +
> > + xe_eudebug_client_wait_done(s->client);
> > +
> > + xe_eudebug_debugger_stop_worker(s->debugger, 1);
>
> I wondered where the log is cleared, and I noticed that
> eu_attention_debugger_detach_trigger is responsible for this.
>
> > +
> > + xe_eudebug_event_log_print(s->debugger->log, true);
> > + xe_eudebug_event_log_print(s->client->log, true);
> > +
> > + xe_eudebug_session_check(s, true, XE_EUDEBUG_FILTER_EVENT_VM_BIND |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_OP |
> > + XE_EUDEBUG_FILTER_EVENT_VM_BIND_UFENCE);
>
> That's my question here: if the debugger's log is cleared and then filled
> again on reconnect, will the vm-bind-ufence events match?
No, there would not be a single ufence event on reconnect, thus we are filtering them out.
The same applies to vm_bind events. As the debugger tracks resources, not the individual ioctl
calls, it cannot recreate each and every vm_bind with its vm_bind_ops. On discovery there will
be a single vm_bind event with all vmas reflected by subsequent vm_bind_op events.
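To make the filtering concrete: before comparing the client and debugger logs, the check masks out event types that discovery cannot replay 1:1. A hedged sketch of that idea, with made-up bit names mirroring how the XE_EUDEBUG_FILTER_EVENT_* flags are used (none of these identifiers are from the patch):

```c
#include <stdint.h>

/* Hypothetical filter bits; one bit per event type that discovery
 * playback cannot reproduce exactly. */
#define FILTER_EVENT_VM_BIND        (1u << 0)
#define FILTER_EVENT_VM_BIND_OP     (1u << 1)
#define FILTER_EVENT_VM_BIND_UFENCE (1u << 2)

/* Count the events that survive the filter, as a log comparison would. */
static int count_unfiltered(const uint32_t *events, int n, uint32_t filter)
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		if (!(events[i] & filter))
			count++;

	return count;
}
```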
>
> > +
> > + /* We expect workload reset, so no attention should be raised */
> > + xe_eudebug_for_each_event(e, s->debugger->log)
> > + igt_assert(e->type != DRM_XE_EUDEBUG_EVENT_EU_ATTENTION);
> > +
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +/**
> > + * SUBTEST: single-step
> > + * Description:
> > + * Schedules an EU workload with 16 nops after the breakpoint, then
> > + * single-steps through the shader, advancing all threads each step and
> > + * checking that all threads advanced on every step.
> > + *
> > + * SUBTEST: single-step-one
> > + * Description:
> > + * Schedules an EU workload with 16 nops after the breakpoint, then
> > + * single-steps through the shader, advancing one thread each step and
> > + * checking that one thread advanced on every step. Due to time
> > + * constraints, only the first two shader instructions after the
> > + * breakpoint are validated.
> > + */
> > +static void test_single_step(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> > + open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_single_step_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + xe_eudebug_session_run(s);
> > + online_session_check(s, s->flags);
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +static void eu_attention_debugger_ndetach_trigger(struct xe_eudebug_debugger *d,
> > + struct drm_xe_eudebug_event *event)
> > +{
> > + struct online_debug_data *data = d->ptr;
> > + static int debugger_detach_count;
> > +
> > + if (debugger_detach_count < (SHADER_LOOP_N - 1)) {
> > + /* Make sure the resume command was issued before detaching the debugger */
> > + if (data->last_eu_control_seqno > event->seqno)
> > + return;
> > + eu_attention_debugger_detach_trigger(d, event);
> > + debugger_detach_count++;
> > + } else {
> > + igt_debug("Reached the Nth breakpoint, skipping the debugger detach\n");
> > + }
> > +}
> > +
> > +/**
> > + * SUBTEST: debugger-reopen
> > + * Description:
> > + * Check whether the debugger is able to reopen the connection and
> > + * capture the events of an already running client.
> > + */
> > +static void test_debugger_reopen(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > +
> > + data = online_debug_data_create(hwe);
> > +
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debugger_ndetach_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + xe_eudebug_session_run(s);
> > +
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +/**
> > + * SUBTEST: writes-caching-%s
> > + * Description:
> > + * Write incrementing values to a 2-page-long target surface, poisoning the data one
> > + * breakpoint before each write instruction and restoring it when the breakpoint on
> > + * the poisoned instruction is hit. Expect to never see poison values in the target surface.
> > + *
> > + *
> > + * arg[1]:
> > + *
> > + * @sram: Use page size of SRAM
> > + * @vram: Use page size of VRAM
> > + */
> > +static void test_caching(int fd, struct drm_xe_engine_class_instance *hwe, int flags)
> > +{
> > + struct xe_eudebug_session *s;
> > + struct online_debug_data *data;
> > +
> > + if (flags & SHADER_CACHING_VRAM)
> > + igt_skip_on_f(!xe_has_vram(fd), "Device does not have VRAM.\n");
> > +
> > + data = online_debug_data_create(hwe);
> > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > +
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_OPEN,
> > + open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_resume_caching_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_METADATA,
> > + create_metadata_trigger);
> > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + xe_eudebug_session_run(s);
> > + online_session_check(s, s->flags);
> > + xe_eudebug_session_destroy(s);
> > + online_debug_data_destroy(data);
> > +}
> > +
> > +static int wait_for_exception(struct online_debug_data *data, int timeout)
> > +{
> > + int ret = -ETIMEDOUT;
> > +
> > + igt_for_milliseconds(timeout) {
> > + pthread_mutex_lock(&data->mutex);
> > + if ((data->exception_arrived.tv_sec |
> > + data->exception_arrived.tv_nsec) != 0)
> > + ret = 0;
> > + pthread_mutex_unlock(&data->mutex);
> > +
> > + if (!ret)
> > + break;
> > + usleep(1000);
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +#define is_compute_on_gt(__e, __gt) (((__e)->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
> > + (__e)->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) && \
> > + (__e)->gt_id == (__gt))
> > +
> > +struct xe_engine_list_entry {
> > + struct igt_list_head link;
> > + struct drm_xe_engine_class_instance *hwe;
> > +};
> > +
> > +#define MAX_TILES 2
> > +static int find_suitable_engines(struct drm_xe_engine_class_instance *hwes[GEM_MAX_ENGINES],
> > + int fd, bool many_tiles)
> > +{
> > + struct xe_device *xe_dev;
> > + struct drm_xe_engine_class_instance *e;
> > + struct xe_engine_list_entry *en, *tmp;
> > + struct igt_list_head compute_engines[MAX_TILES];
> > + int gt_id;
> > + int tile_id, i, engine_count = 0, tile_count = 0;
> > +
> > + xe_dev = xe_device_get(fd);
> > +
> > + for (i = 0; i < MAX_TILES; i++)
> > + IGT_INIT_LIST_HEAD(&compute_engines[i]);
> > +
> > + xe_for_each_gt(fd, gt_id) {
> > + xe_for_each_engine(fd, e) {
> > + if (is_compute_on_gt(e, gt_id)) {
> > + tile_id = xe_dev->gt_list->gt_list[gt_id].tile_id;
> > +
> > + en = malloc(sizeof(struct xe_engine_list_entry));
> > + en->hwe = e;
> > +
> > + igt_list_add_tail(&en->link, &compute_engines[tile_id]);
> > + }
> > + }
> > + }
> > +
> > + for (i = 0; i < MAX_TILES; i++) {
> > + if (igt_list_empty(&compute_engines[i]))
> > + continue;
> > +
> > + if (many_tiles) {
> > + en = igt_list_first_entry(&compute_engines[i], en, link);
> > + hwes[engine_count++] = en->hwe;
> > + tile_count++;
> > + } else {
> > + if (igt_list_length(&compute_engines[i]) > 1) {
> > + igt_list_for_each_entry(en, &compute_engines[i], link)
> > + hwes[engine_count++] = en->hwe;
> > + break;
> > + }
> > + }
> > + }
> > +
> > + for (i = 0; i < MAX_TILES; i++) {
> > + igt_list_for_each_entry_safe(en, tmp, &compute_engines[i], link) {
> > + igt_list_del(&en->link);
> > + free(en);
> > + }
> > + }
> > +
> > + if (many_tiles)
> > + igt_require_f(tile_count > 1, "Multi-tile scenario requires more than one tile\n");
> > +
> > + return engine_count;
> > +}
> > +
> > +/**
> > + * SUBTEST: breakpoint-many-sessions-single-tile
> > + * Description:
> > + * Schedules an EU workload with a preinstalled breakpoint on every
> > + * compute engine available on the tile. Checks that the contexts hit
> > + * the breakpoint in sequence and resumes them.
> > + *
> > + * SUBTEST: breakpoint-many-sessions-tiles
> > + * Description:
> > + * Schedules an EU workload with a preinstalled breakpoint on selected
> > + * compute engines, one per tile. Checks that each context hit the
> > + * breakpoint and resumes them.
> > + */
> > +static void test_many_sessions_on_tiles(int fd, bool multi_tile)
> > +{
> > + int n = 0, flags = SHADER_BREAKPOINT | SHADER_MIN_THREADS;
> > + struct xe_eudebug_session *s[GEM_MAX_ENGINES] = {};
> > + struct online_debug_data *data[GEM_MAX_ENGINES] = {};
> > + struct drm_xe_engine_class_instance *hwe[GEM_MAX_ENGINES] = {};
>
> GEM_MAX_ENGINES?
Ok, will fix that.
>
> > + struct drm_xe_eudebug_event_eu_attention *eus;
> > + uint64_t current_t, next_t, diff;
> > + int i;
> > +
> > + n = find_suitable_engines(hwe, fd, multi_tile);
> > +
> > + igt_require_f(n > 1, "Test requires at least two parallel compute engines!\n");
> > +
> > + for (i = 0; i < n; i++) {
> > + data[i] = online_debug_data_create(hwe[i]);
> > + s[i] = xe_eudebug_session_create(fd, run_online_client, flags, data[i]);
> > +
> > + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + eu_attention_debug_trigger);
> > + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > + save_first_exception_trigger);
> > + xe_eudebug_debugger_add_trigger(s[i]->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > + ufence_ack_trigger);
> > +
> > + igt_assert_eq(xe_eudebug_debugger_attach(s[i]->debugger, s[i]->client), 0);
> > +
> > + xe_eudebug_debugger_start_worker(s[i]->debugger);
> > + xe_eudebug_client_start(s[i]->client);
> > + }
> > +
> > + for (i = 0; i < n; i++) {
> > + /* XXX: Sometimes racy, expects clients to execute in sequence */
> > + igt_assert(!wait_for_exception(data[i], STARTUP_TIMEOUT_MS));
> > +
> > + eus = (struct drm_xe_eudebug_event_eu_attention *)data[i]->exception_event;
> > +
> > + /* Delay all but the last workload to check serialization */
> > + if (i < n - 1)
> > + usleep(WORKLOAD_DELAY_US);
> > +
> > + eu_ctl_resume(s[i]->debugger->master_fd, s[i]->debugger->fd,
> > + eus->client_handle, eus->exec_queue_handle,
> > + eus->lrc_handle, eus->bitmask, eus->bitmask_size);
> > + free(eus);
> > + }
> > +
> > + for (i = 0; i < n - 1; i++) {
> > + /* Convert timestamps to microseconds */
> > + current_t = data[i]->exception_arrived.tv_sec * 1000000 +
> > + data[i]->exception_arrived.tv_nsec / 1000;
> > + next_t = data[i + 1]->exception_arrived.tv_sec * 1000000 +
> > + data[i + 1]->exception_arrived.tv_nsec / 1000;
> > + diff = current_t < next_t ? next_t - current_t : current_t - next_t;
> > +
> > + if (multi_tile)
> > + igt_assert_f(diff < WORKLOAD_DELAY_US,
> > + "Expected to execute workloads concurrently. Actual delay: %lu us\n",
> > + diff);
> > + else
> > + igt_assert_f(diff >= WORKLOAD_DELAY_US,
> > + "Expected a serialization of workloads. Actual delay: %lu us\n",
> > + diff);
> > + }
> > +
> > + for (i = 0; i < n; i++) {
> > + xe_eudebug_client_wait_done(s[i]->client);
> > + xe_eudebug_debugger_stop_worker(s[i]->debugger, 1);
> > +
> > + xe_eudebug_event_log_print(s[i]->debugger->log, true);
> > + online_session_check(s[i], flags);
> > +
> > + xe_eudebug_session_destroy(s[i]);
> > + online_debug_data_destroy(data[i]);
> > + }
> > +}
> > +
> > +static struct drm_xe_engine_class_instance *pick_compute(int fd, int gt)
> > +{
> > + struct drm_xe_engine_class_instance *hwe;
> > + int count = 0;
> > +
> > + xe_for_each_engine(fd, hwe)
> > + if (is_compute_on_gt(hwe, gt))
> > + count++;
> > +
> > + xe_for_each_engine(fd, hwe)
> > + if (is_compute_on_gt(hwe, gt) && rand() % count-- == 0)
> > + return hwe;
> > +
> > + return NULL;
> > +}
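pick_compute() above uses a neat two-pass trick: count the matches, then on a second pass take each remaining candidate with probability 1/remaining, which yields a uniform pick with no extra storage. A generic sketch of the same technique over an int array (all names here are illustrative):

```c
#include <stdlib.h>

/* Example predicate used below. */
static int is_even(int v)
{
	return v % 2 == 0;
}

/* Two-pass uniform random selection: each match is returned with
 * probability 1/count overall (1/c for the first candidate, then
 * (1 - 1/c) * 1/(c - 1) = 1/c for the second, and so on). */
static int pick_random_match(const int *arr, int n, int (*match)(int))
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		if (match(arr[i]))
			count++;

	for (i = 0; i < n; i++)
		if (match(arr[i]) && rand() % count-- == 0)
			return arr[i];

	return -1;	/* no match found */
}
```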
> > +
> > +#define test_gt_render_or_compute(t, i915, __hwe) \
> > + igt_subtest_with_dynamic(t) \
> > + for (int gt = 0; (__hwe = pick_compute(i915, gt)); gt++) \
>
> i915?
>
> I haven't spotted any other issues; apart from the bit operations, it
> generally looks correct.
>
> --
> Zbigniew
Thanks for all comments,
Dominik
> > + igt_dynamic_f("%s%d", xe_engine_class_string(__hwe->engine_class), \
> > + hwe->engine_instance)
> > +
> > +igt_main
> > +{
> > + struct drm_xe_engine_class_instance *hwe;
> > + bool was_enabled;
> > + int fd;
> > +
> > + igt_fixture {
> > + fd = drm_open_driver(DRIVER_XE);
> > + intel_allocator_multiprocess_start();
> > + igt_srandom();
> > + was_enabled = xe_eudebug_enable(fd, true);
> > + }
> > +
> > + test_gt_render_or_compute("basic-breakpoint", fd, hwe)
> > + test_basic_online(fd, hwe, SHADER_BREAKPOINT);
> > +
> > + test_gt_render_or_compute("preempt-breakpoint", fd, hwe)
> > + test_preemption(fd, hwe);
> > +
> > + test_gt_render_or_compute("set-breakpoint", fd, hwe)
> > + test_set_breakpoint_online(fd, hwe, SHADER_NOP | TRIGGER_UFENCE_SET_BREAKPOINT);
> > +
> > + test_gt_render_or_compute("breakpoint-not-in-debug-mode", fd, hwe)
> > + test_basic_online(fd, hwe, SHADER_BREAKPOINT | DISABLE_DEBUG_MODE);
> > +
> > + test_gt_render_or_compute("stopped-thread", fd, hwe)
> > + test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED);
> > +
> > + test_gt_render_or_compute("resume-one", fd, hwe)
> > + test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_ONE);
> > +
> > + test_gt_render_or_compute("resume-dss", fd, hwe)
> > + test_basic_online(fd, hwe, SHADER_BREAKPOINT | TRIGGER_RESUME_DSS);
> > +
> > + test_gt_render_or_compute("interrupt-all", fd, hwe)
> > + test_interrupt_all(fd, hwe, SHADER_LOOP);
> > +
> > + test_gt_render_or_compute("interrupt-other-debuggable", fd, hwe)
> > + test_interrupt_other(fd, hwe, SHADER_LOOP);
> > +
> > + test_gt_render_or_compute("interrupt-other", fd, hwe)
> > + test_interrupt_other(fd, hwe, SHADER_LOOP | DISABLE_DEBUG_MODE);
> > +
> > + test_gt_render_or_compute("interrupt-all-set-breakpoint", fd, hwe)
> > + test_interrupt_all(fd, hwe, SHADER_LOOP | TRIGGER_RESUME_SET_BP);
> > +
> > + test_gt_render_or_compute("tdctl-parameters", fd, hwe)
> > + test_tdctl_parameters(fd, hwe, SHADER_LOOP);
> > +
> > + test_gt_render_or_compute("reset-with-attention", fd, hwe)
> > + test_reset_with_attention_online(fd, hwe, SHADER_BREAKPOINT);
> > +
> > + test_gt_render_or_compute("interrupt-reconnect", fd, hwe)
> > + test_interrupt_reconnect(fd, hwe, SHADER_LOOP | TRIGGER_RECONNECT);
> > +
> > + test_gt_render_or_compute("single-step", fd, hwe)
> > + test_single_step(fd, hwe, SHADER_SINGLE_STEP | SIP_SINGLE_STEP |
> > + TRIGGER_RESUME_PARALLEL_WALK);
> > +
> > + test_gt_render_or_compute("single-step-one", fd, hwe)
> > + test_single_step(fd, hwe, SHADER_SINGLE_STEP | SIP_SINGLE_STEP |
> > + TRIGGER_RESUME_SINGLE_WALK);
> > +
> > + test_gt_render_or_compute("debugger-reopen", fd, hwe)
> > + test_debugger_reopen(fd, hwe, SHADER_N_NOOP_BREAKPOINT);
> > +
> > + test_gt_render_or_compute("writes-caching-sram", fd, hwe)
> > + test_caching(fd, hwe, SHADER_CACHING_SRAM);
> > +
> > + test_gt_render_or_compute("writes-caching-vram", fd, hwe)
> > + test_caching(fd, hwe, SHADER_CACHING_VRAM);
> > +
> > + igt_subtest("breakpoint-many-sessions-single-tile")
> > + test_many_sessions_on_tiles(fd, false);
> > +
> > + igt_subtest("breakpoint-many-sessions-tiles")
> > + test_many_sessions_on_tiles(fd, true);
> > +
> > + igt_fixture {
> > + xe_eudebug_enable(fd, was_enabled);
> > +
> > + intel_allocator_multiprocess_stop();
> > + drm_close_driver(fd);
> > + }
> > +}
> > diff --git a/tests/meson.build b/tests/meson.build
> > index 43e8516f4..e5d8852f3 100644
> > --- a/tests/meson.build
> > +++ b/tests/meson.build
> > @@ -321,6 +321,7 @@ intel_xe_progs = [
> > intel_xe_eudebug_progs = [
> > 'xe_eudebug',
> > 'xe_exec_sip_eudebug',
> > + 'xe_eudebug_online',
> > ]
> >
> > if build_xe_eudebug
> > --
> > 2.34.1
> >
^ permalink raw reply [flat|nested] 50+ messages in thread* Re: [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU
2024-09-17 19:34 ` Grzegorzek, Dominik
@ 2024-09-18 5:08 ` Zbigniew Kempczyński
2024-09-18 6:44 ` Grzegorzek, Dominik
2024-09-18 5:21 ` Zbigniew Kempczyński
1 sibling, 1 reply; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-18 5:08 UTC (permalink / raw)
To: Grzegorzek, Dominik
Cc: Manszewski, Christoph, Patelczyk, Maciej, Hajda, Andrzej,
Kuoppala, Mika, Sikora, Pawel, Piatkowski, Dominik Karol,
Mun, Gwan-gyeong, igt-dev@lists.freedesktop.org,
kamil.konieczny@linux.intel.com, Kolanupaka Naveena
On Tue, Sep 17, 2024 at 09:34:20PM +0200, Grzegorzek, Dominik wrote:
<cut>
> > > +static int count_set_bits(void *ptr, size_t size)
> > > +{
> > > + uint8_t *p = ptr;
> > > + int count = 0;
> > > + int i, j;
> > > +
> >
> > hweight()?
> >
> Are you proposing here to change the name or to implement it without second loop like below?
Yes, I just want to get rid of the second loop.
>
> static int count_set_bits(void *ptr, size_t size)
> {
> uint32_t *p = ptr;
> int count = 0;
> int i;
>
> igt_assert(size % 4 == 0);
>
> for (i = 0; i < size/4; i++)
> count += igt_hweight(p[i]);
>
> return count;
> }
You may iterate over uint8_t to cover all sizes, not only multiples of 4.
But if you're sure the buffer will always be a multiple of 4, this
imo is ok.
<cut>
> > > +static void copy_first_bit(uint8_t *dst, uint8_t *src, int size)
> > > +{
> > > + bool found = false;
> > > + int i, j;
> > > +
> > > + for (i = 0; i < size; i++) {
> > > + if (found) {
> > > + dst[i] = 0;
> >
> > Function is static, but according to line above I would add some
> > comment that it is cleaning dst buffer. copy_first_bit() is misleading
> > as you mean first bit set. First bit is src[0] & 1.
> >
> > And what does 'first' mean? Having, let's say, src = { 0x0, 0xff, 0xcc, 0xaa }
> > I would expect 'first' to be the most significant bit of 0xff.
> >
> >
> > > + } else {
> > > + uint32_t tmp = src[i]; /* in case dst == src */
> > > +
> > > + for (j = 0; j < 8; j++) {
> >
> > ffs()? But according to copy_nth_bit() I have doubts - shouldn't this
> > be fls()?
> >
> > > + dst[i] = tmp & (1 << j);
> > > + if (dst[i]) {
> > > + found = true;
> > > + break;
> > > + }
> > > + }
> > > + }
> > > + }
> > > +}
> > > +
> > > +static void copy_nth_bit(uint8_t *dst, uint8_t *src, int size, int n)
> > > +{
> > > + int count = 0;
> > > +
> > > + for (int i = 0; i < size; i++) {
> > > + uint32_t tmp = src[i];
> > > +
> > > + for (int j = 7; j >= 0; j--) {
> >
> > I'm confused. In above function you iterate starting from least
> > significant bit, here you start from most significant bit.
> > Same concern about function name - shouldn't this be copy_nth_bit_set()?
> >
> > > + if (tmp & (1 << j)) {
> > > + count++;
> > > + if (count == n)
> > > + dst[i] |= (1 << j);
> > > + else
> > > + dst[i] &= ~(1 << j);
> >
> > Do I understand correctly that you are clearing other bits in dst?
> > It's extremely weird calling function copy_nth_bit() where it scans
> > for n-th bit set, zeroing other bits in dst. Or I just don't understand
> > logic behind this decision.
>
> You've raised bunch of valid inaccuracies. How about:
>
> static void only_nth_set_bit(uint8_t *dst, uint8_t *src, int size, int n)
> {
> int count = 0;
>
> for (int i = 0; i < size; i++) {
> if (count < n) {
> uint8_t tmp = src[i];
>
> for (int j = 0; j < 8; j++) {
> if (tmp & (1 << j)) {
> count++;
> if (count == n)
> dst[i] |= (1 << j);
> else
> dst[i] &= ~(1 << j);
> } else {
> dst[i] &= ~(1 << j);
> }
> }
> } else {
> dst[i] = 0;
> }
> }
> }
Likely I would copy octet by octet from src[i] -> dst[i], tracking the
previous/current hweight, and when it is bigger than n zeroing the rest
of the bits in the current octet. But this is an implementation detail.
In the above code you're copying from the least significant bit, is this
intended? The previous code was copying from the most significant bit,
so this is definitely a semantic change which, depending on hw behavior,
may be incorrect. Could you double-check this?
>
> static void only_first_set_bit(uint8_t *dst, uint8_t *src, int size)
> {
> return only_nth_set_bit(dst, src, size, 1);
> }
Nice, code reuse is always good.
I'll drop the rest of the email and reply in another one. This will narrow
our conversation to the interesting part only.
--
Zbigniew
^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU
2024-09-18 5:08 ` Zbigniew Kempczyński
@ 2024-09-18 6:44 ` Grzegorzek, Dominik
0 siblings, 0 replies; 50+ messages in thread
From: Grzegorzek, Dominik @ 2024-09-18 6:44 UTC (permalink / raw)
To: Kempczynski, Zbigniew
Cc: Patelczyk, Maciej, Hajda, Andrzej, Kuoppala, Mika,
Piatkowski, Dominik Karol, Manszewski, Christoph,
Mun, Gwan-gyeong, igt-dev@lists.freedesktop.org,
kamil.konieczny@linux.intel.com, Sikora, Pawel,
Kolanupaka Naveena
On Wed, 2024-09-18 at 07:08 +0200, Zbigniew Kempczyński wrote:
> On Tue, Sep 17, 2024 at 09:34:20PM +0200, Grzegorzek, Dominik wrote:
>
> <cut>
>
> > > > +static int count_set_bits(void *ptr, size_t size)
> > > > +{
> > > > + uint8_t *p = ptr;
> > > > + int count = 0;
> > > > + int i, j;
> > > > +
> > >
> > > hweight()?
> > >
> > Are you proposing here to change the name or to implement it without second loop like below?
>
> Yes, I just want to get rid of the second loop.
>
> >
> > static int count_set_bits(void *ptr, size_t size)
> > {
> > uint32_t *p = ptr;
> > int count = 0;
> > int i;
> >
> > igt_assert(size % 4 == 0);
> >
> > for (i = 0; i < size/4; i++)
> > count += igt_hweight(p[i]);
> >
> > return count;
> > }
>
> You may iterate over uint8_t to cover all sizes, not only multiples of 4.
> But if you're sure the buffer will always be a multiple of 4, this
> imo is ok.
>
> <cut>
>
> > > > +static void copy_first_bit(uint8_t *dst, uint8_t *src, int size)
> > > > +{
> > > > + bool found = false;
> > > > + int i, j;
> > > > +
> > > > + for (i = 0; i < size; i++) {
> > > > + if (found) {
> > > > + dst[i] = 0;
> > >
> > > Function is static, but according to line above I would add some
> > > comment that it is cleaning dst buffer. copy_first_bit() is misleading
> > > as you mean first bit set. First bit is src[0] & 1.
> > >
> > > And what does 'first' mean? Having, let's say, src = { 0x0, 0xff, 0xcc, 0xaa }
> > > I would expect 'first' to be the most significant bit of 0xff.
> > >
> > >
> > > > + } else {
> > > > + uint32_t tmp = src[i]; /* in case dst == src */
> > > > +
> > > > + for (j = 0; j < 8; j++) {
> > >
> > > ffs()? But according to copy_nth_bit() I have doubts - shouldn't this
> > > be fls()?
> > >
> > > > + dst[i] = tmp & (1 << j);
> > > > + if (dst[i]) {
> > > > + found = true;
> > > > + break;
> > > > + }
> > > > + }
> > > > + }
> > > > + }
> > > > +}
> > > > +
> > > > +static void copy_nth_bit(uint8_t *dst, uint8_t *src, int size, int n)
> > > > +{
> > > > + int count = 0;
> > > > +
> > > > + for (int i = 0; i < size; i++) {
> > > > + uint32_t tmp = src[i];
> > > > +
> > > > + for (int j = 7; j >= 0; j--) {
> > >
> > > I'm confused. In above function you iterate starting from least
> > > significant bit, here you start from most significant bit.
> > > Same concern about function name - shouldn't this be copy_nth_bit_set()?
> > >
> > > > + if (tmp & (1 << j)) {
> > > > + count++;
> > > > + if (count == n)
> > > > + dst[i] |= (1 << j);
> > > > + else
> > > > + dst[i] &= ~(1 << j);
> > >
> > > Do I understand correctly that you are clearing other bits in dst?
> > > It's extremely weird calling function copy_nth_bit() where it scans
> > > for n-th bit set, zeroing other bits in dst. Or I just don't understand
> > > logic behind this decision.
> >
> > You've raised bunch of valid inaccuracies. How about:
> >
> > static void only_nth_set_bit(uint8_t *dst, uint8_t *src, int size, int n)
> > {
> > int count = 0;
> >
> > for (int i = 0; i < size; i++) {
> > if (count < n) {
> > uint8_t tmp = src[i];
> >
> > for (int j = 0; j < 8; j++) {
> > if (tmp & (1 << j)) {
> > count++;
> > if (count == n)
> > dst[i] |= (1 << j);
> > else
> > dst[i] &= ~(1 << j);
> > } else {
> > dst[i] &= ~(1 << j);
> > }
> > }
> > } else {
> > dst[i] = 0;
> > }
> > }
> > }
>
> Likely I would copy octet by octet from src[i] -> dst[i], tracking the
> previous/current hweight, and when it is bigger than n zeroing the rest
> of the bits in the current octet. But this is an implementation detail.
>
> In the above code you're copying from the least significant bit, is this
> intended? The previous code was copying from the most significant bit,
> so this is definitely a semantic change which, depending on hw behavior,
> may be incorrect. Could you double-check this?
From the test pov this really doesn't matter. We just wanted to keep a single bit in the bitmask to
resume a different thread. As long as a different n gave a different bit it was fine. However,
by 'first' I think we meant the least significant bit, so I deliberately changed it that way.
Regards,
Dominik
>
> >
> > static void only_first_set_bit(uint8_t *dst, uint8_t *src, int size)
> > {
> > return only_nth_set_bit(dst, src, size, 1);
> > }
>
> Nice, code reuse is always good.
>
> I'll drop the rest of the email and reply in another one. This will narrow
> our conversation to the interesting part only.
>
> --
> Zbigniew
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU
2024-09-17 19:34 ` Grzegorzek, Dominik
2024-09-18 5:08 ` Zbigniew Kempczyński
@ 2024-09-18 5:21 ` Zbigniew Kempczyński
1 sibling, 0 replies; 50+ messages in thread
From: Zbigniew Kempczyński @ 2024-09-18 5:21 UTC (permalink / raw)
To: Grzegorzek, Dominik
Cc: Manszewski, Christoph, Patelczyk, Maciej, Hajda, Andrzej,
Kuoppala, Mika, Sikora, Pawel, Piatkowski, Dominik Karol,
Mun, Gwan-gyeong, igt-dev@lists.freedesktop.org,
kamil.konieczny@linux.intel.com, Kolanupaka Naveena
On Tue, Sep 17, 2024 at 09:34:20PM +0200, Grzegorzek, Dominik wrote:
<cut>
> > > +/**
> > > + * SUBTEST: preempt-breakpoint
> > > + * Description:
> > > + * Verify that eu debugger disables preemption timeout to
> > > + * prevent reset of workload stopped on breakpoint.
> > > + */
> > > +static void test_preemption(int fd, struct drm_xe_engine_class_instance *hwe)
> > > +{
> > > + int flags = SHADER_BREAKPOINT | TRIGGER_RESUME_DELAYED;
> > > + struct xe_eudebug_session *s;
> > > + struct online_debug_data *data;
> > > + struct xe_eudebug_client *other;
> > > +
> > > + data = online_debug_data_create(hwe);
> > > + s = xe_eudebug_session_create(fd, run_online_client, flags, data);
> > > + other = xe_eudebug_client_create(fd, run_online_client, SHADER_NOP, data);
> > > +
> > > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > > + eu_attention_debug_trigger);
> > > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> > > + eu_attention_resume_trigger);
> > > + xe_eudebug_debugger_add_trigger(s->debugger, DRM_XE_EUDEBUG_EVENT_VM_BIND_UFENCE,
> > > + ufence_ack_trigger);
> > > +
> > > + igt_assert_eq(xe_eudebug_debugger_attach(s->debugger, s->client), 0);
> > > + xe_eudebug_debugger_start_worker(s->debugger);
> > > +
> > > + xe_eudebug_client_start(s->client);
> > > + sleep(1); /* make sure s->client starts first */
> >
> > If the client wrote a token once it has started, this sleep wouldn't be
> > necessary. I mean inside xe_eudebug_client_start() do
> > token_signal/wait_for_client.
> To ensure that the first client executes its workload first, we would need to signal it after
> calling xe_exec. This means that the signaling would need to be incorporated within the client's
> work function. We cannot place wait_for_client inside the generic xe_eudebug_client_start() because
> the work function is defined by the caller. While I could implement a similar mechanism specifically
> for this test, it would require creating a brand new run_online_client()-like function or adding
> wait_for_client in every test that reuses run_online_client(). I decided to keep it simple, albeit
> imperfect. Let me know if you would rather have it changed.
Ok, now I understand. It's not the client process start that is interesting from
our point of view (that's why I suggested informing the caller it was created)
but xe_exec() itself. Signalling to the spawner that xe_exec() was called is not
so easy with the current code shape. sleep(1) is however a long time, and likely the
first client will start its execution before 'other' (I'm not sure whether
it should be called 'another', but that's not important).
Keep it as it is now. If we encounter the race being won by 'other' then we'll think
about how to prevent it.
--
Zbigniew
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH i-g-t v6 17/17] tests/xe_live_ktest: Add xe_eudebug live test
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (15 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 16/17] tests/xe_eudebug_online: Debug client which runs workloads on EU Christoph Manszewski
@ 2024-09-05 9:28 ` Christoph Manszewski
2024-09-05 21:04 ` ✗ GitLab.Pipeline: warning for Test coverage for GPU debug support (rev6) Patchwork
` (2 subsequent siblings)
19 siblings, 0 replies; 50+ messages in thread
From: Christoph Manszewski @ 2024-09-05 9:28 UTC (permalink / raw)
To: igt-dev
Cc: Zbigniew Kempczyński, Kamil Konieczny, Dominik Grzegorzek,
Maciej Patelczyk, Dominik Karol Piątkowski, Pawel Sikora,
Andrzej Hajda, Kolanupaka Naveena, Mika Kuoppala, Gwan-gyeong Mun,
Christoph Manszewski
xe_eudebug introduces a dedicated kunit test to the live test module.
Add it to the list of live tests to be executed.
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
tests/intel/xe_live_ktest.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tests/intel/xe_live_ktest.c b/tests/intel/xe_live_ktest.c
index 4376d5df7..50af97ecc 100644
--- a/tests/intel/xe_live_ktest.c
+++ b/tests/intel/xe_live_ktest.c
@@ -30,6 +30,11 @@
* Description:
* Kernel dynamic selftests to check mocs configuration.
* Functionality: mocs configuration
+ *
+ * SUBTEST: xe_eudebug
+ * Description:
+ * Kernel dynamic selftests to check eudebug functionality.
+ * Functionality: eudebug kunit
*/
static const char *live_tests[] = {
@@ -37,6 +42,7 @@ static const char *live_tests[] = {
"xe_dma_buf",
"xe_migrate",
"xe_mocs",
+ "xe_eudebug",
};
igt_main
--
2.34.1
^ permalink raw reply related	[flat|nested] 50+ messages in thread

* ✗ GitLab.Pipeline: warning for Test coverage for GPU debug support (rev6)
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (16 preceding siblings ...)
2024-09-05 9:28 ` [PATCH i-g-t v6 17/17] tests/xe_live_ktest: Add xe_eudebug live test Christoph Manszewski
@ 2024-09-05 21:04 ` Patchwork
2024-09-05 21:33 ` ✓ CI.xeBAT: success " Patchwork
2024-09-05 21:40 ` ✗ Fi.CI.BAT: failure " Patchwork
19 siblings, 0 replies; 50+ messages in thread
From: Patchwork @ 2024-09-05 21:04 UTC (permalink / raw)
To: Christoph Manszewski; +Cc: igt-dev
== Series Details ==
Series: Test coverage for GPU debug support (rev6)
URL : https://patchwork.freedesktop.org/series/136623/
State : warning
== Summary ==
Pipeline status: FAILED.
see https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/pipelines/1265472 for the overview.
build:tests-debian-minimal has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/63236149):
Message: WARNING: leg command not found, disabling overlay; try : apt-get install peg
Configuring defs.rst using configuration
Program rst2man-3 found: NO
Program rst2man found: NO
Program rst2man.sh found: YES (/builds/gfx-ci/igt-ci-tags/man/rst2man.sh)
Dependency gtk-doc found: NO (tried pkgconfig and cmake)
Program sphinx-build skipped: feature sphinx disabled
Program rst2html-3 found: NO
Program rst2html found: NO
Program rst2pdf found: NO
docs/testplan/meson.build:38:9: ERROR: Unknown variable "intel_xe_eudebug_progs".
A full log can be found at /builds/gfx-ci/igt-ci-tags/build/meson-logs/meson-log.txt
section_end:1725569831:step_script
section_start:1725569831:cleanup_file_variables
Cleaning up project directory and file based variables
section_end:1725569832:cleanup_file_variables
ERROR: Job failed: exit code 1
== Logs ==
For more details see: https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/pipelines/1265472
^ permalink raw reply	[flat|nested] 50+ messages in thread

* ✓ CI.xeBAT: success for Test coverage for GPU debug support (rev6)
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (17 preceding siblings ...)
2024-09-05 21:04 ` ✗ GitLab.Pipeline: warning for Test coverage for GPU debug support (rev6) Patchwork
@ 2024-09-05 21:33 ` Patchwork
2024-09-05 21:40 ` ✗ Fi.CI.BAT: failure " Patchwork
19 siblings, 0 replies; 50+ messages in thread
From: Patchwork @ 2024-09-05 21:33 UTC (permalink / raw)
To: Christoph Manszewski; +Cc: igt-dev
[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]
== Series Details ==
Series: Test coverage for GPU debug support (rev6)
URL : https://patchwork.freedesktop.org/series/136623/
State : success
== Summary ==
CI Bug Log - changes from XEIGT_8006_BAT -> XEIGTPW_11702_BAT
====================================================
Summary
-------
**SUCCESS**
No regressions found.
Participating hosts (9 -> 9)
------------------------------
No changes in participating hosts
Known issues
------------
Here are the changes found in XEIGTPW_11702_BAT that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@kms_frontbuffer_tracking@basic:
- bat-adlp-7: [PASS][1] -> [FAIL][2] ([Intel XE#1861])
[1]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_8006/bat-adlp-7/igt@kms_frontbuffer_tracking@basic.html
[2]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11702/bat-adlp-7/igt@kms_frontbuffer_tracking@basic.html
[Intel XE#1861]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1861
Build changes
-------------
* IGT: IGT_8006 -> IGTPW_11702
* Linux: xe-1897-883d811c2bfc8544bc5b00bbee84e6b83e20a581 -> xe-1901-7f3ffaf88a3a1b3e29416488fb4e58fd551cd89d
IGTPW_11702: 5ec411ec1c3953025a2e390afd83549db4ee9f43 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
IGT_8006: ae7f2bc0b99801a7ae369d4b5fda5c6b1c386eb1 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
xe-1897-883d811c2bfc8544bc5b00bbee84e6b83e20a581: 883d811c2bfc8544bc5b00bbee84e6b83e20a581
xe-1901-7f3ffaf88a3a1b3e29416488fb4e58fd551cd89d: 7f3ffaf88a3a1b3e29416488fb4e58fd551cd89d
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11702/index.html
[-- Attachment #2: Type: text/html, Size: 2255 bytes --]
^ permalink raw reply	[flat|nested] 50+ messages in thread

* ✗ Fi.CI.BAT: failure for Test coverage for GPU debug support (rev6)
2024-09-05 9:27 [PATCH i-g-t v6 00/17] Test coverage for GPU debug support Christoph Manszewski
` (18 preceding siblings ...)
2024-09-05 21:33 ` ✓ CI.xeBAT: success " Patchwork
@ 2024-09-05 21:40 ` Patchwork
19 siblings, 0 replies; 50+ messages in thread
From: Patchwork @ 2024-09-05 21:40 UTC (permalink / raw)
To: Christoph Manszewski; +Cc: igt-dev
[-- Attachment #1: Type: text/plain, Size: 5667 bytes --]
== Series Details ==
Series: Test coverage for GPU debug support (rev6)
URL : https://patchwork.freedesktop.org/series/136623/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_15369 -> IGTPW_11702
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with IGTPW_11702 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in IGTPW_11702, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/index.html
Participating hosts (40 -> 39)
------------------------------
Additional (1): fi-kbl-8809g
Missing (2): bat-dg2-11 fi-snb-2520m
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in IGTPW_11702:
### IGT changes ###
#### Possible regressions ####
* igt@i915_selftest@live:
- bat-mtlp-6: [PASS][1] -> [ABORT][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-mtlp-6/igt@i915_selftest@live.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-mtlp-6/igt@i915_selftest@live.html
Known issues
------------
Here are the changes found in IGTPW_11702 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@fbdev@nullptr:
- bat-arls-1: [PASS][3] -> [DMESG-WARN][4] ([i915#12102])
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-arls-1/igt@fbdev@nullptr.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-arls-1/igt@fbdev@nullptr.html
* igt@gem_huc_copy@huc-copy:
- fi-kbl-8809g: NOTRUN -> [SKIP][5] ([i915#2190])
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/fi-kbl-8809g/igt@gem_huc_copy@huc-copy.html
* igt@gem_lmem_swapping@parallel-random-engines:
- fi-kbl-8809g: NOTRUN -> [SKIP][6] ([i915#4613]) +3 other tests skip
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/fi-kbl-8809g/igt@gem_lmem_swapping@parallel-random-engines.html
* igt@i915_selftest@live@workarounds:
- bat-mtlp-6: [PASS][7] -> [ABORT][8] ([i915#12061])
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-mtlp-6/igt@i915_selftest@live@workarounds.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-mtlp-6/igt@i915_selftest@live@workarounds.html
* igt@kms_dsc@dsc-basic:
- fi-kbl-8809g: NOTRUN -> [SKIP][9] +30 other tests skip
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/fi-kbl-8809g/igt@kms_dsc@dsc-basic.html
#### Possible fixes ####
* igt@fbdev@info:
- bat-arls-1: [DMESG-WARN][10] ([i915#12102]) -> [PASS][11]
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-arls-1/igt@fbdev@info.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-arls-1/igt@fbdev@info.html
* igt@i915_selftest@live:
- bat-arls-1: [DMESG-WARN][12] ([i915#10341]) -> [PASS][13]
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-arls-1/igt@i915_selftest@live.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-arls-1/igt@i915_selftest@live.html
* igt@i915_selftest@live@hangcheck:
- bat-arls-1: [DMESG-WARN][14] ([i915#11349]) -> [PASS][15]
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-arls-1/igt@i915_selftest@live@hangcheck.html
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-arls-1/igt@i915_selftest@live@hangcheck.html
#### Warnings ####
* igt@i915_module_load@load:
- bat-apl-1: [DMESG-WARN][16] ([i915#180]) -> [DMESG-WARN][17] ([i915#180] / [i915#1982])
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/bat-apl-1/igt@i915_module_load@load.html
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/bat-apl-1/igt@i915_module_load@load.html
* igt@i915_module_load@reload:
- fi-kbl-7567u: [DMESG-WARN][18] ([i915#180] / [i915#9925]) -> [DMESG-WARN][19] ([i915#180] / [i915#1982] / [i915#9925])
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15369/fi-kbl-7567u/igt@i915_module_load@reload.html
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/fi-kbl-7567u/igt@i915_module_load@reload.html
[i915#10341]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10341
[i915#11349]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11349
[i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
[i915#12102]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12102
[i915#180]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/180
[i915#1982]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/1982
[i915#2190]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/2190
[i915#4613]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/4613
[i915#9925]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9925
Build changes
-------------
* CI: CI-20190529 -> None
* IGT: IGT_8006 -> IGTPW_11702
CI-20190529: 20190529
CI_DRM_15369: 7f3ffaf88a3a1b3e29416488fb4e58fd551cd89d @ git://anongit.freedesktop.org/gfx-ci/linux
IGTPW_11702: 5ec411ec1c3953025a2e390afd83549db4ee9f43 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
IGT_8006: ae7f2bc0b99801a7ae369d4b5fda5c6b1c386eb1 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_11702/index.html
[-- Attachment #2: Type: text/html, Size: 6996 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread