Igt-dev Archive on lore.kernel.org
* [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe
@ 2023-09-05 13:33 Zbigniew Kempczyński
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute Zbigniew Kempczyński
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Even though we already have gem|xe_gpgpu_fill, we'd like to have a
slightly more complex compute test. The pipeline was reverse-engineered
from an AUB dump of the hello.c square compute workload.

This series enables compute on TGL, DG2, ATS-M and PVC, selectively
for i915 and Xe.

Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Zbigniew Kempczyński (9):
  lib/intel_compute: Migrate xe_compute library to intel_compute
  lib/intel_compute: Add compatibility flags for running compute
  lib/intel_compute: Reorganize the code for i915 version preparation
  lib/intel_compute: Add name field for debugging purposes
  lib/intel_compute: Add i915 path in compute library
  intel/gem_compute: Add test which runs compute workload on i915
  lib/intel_compute: Add XeHP implementation of compute pipeline
  lib/intel_compute: Adding pvc compute pipeline implementation
  tests/gem|xe_compute: Update documentation regarding test requirements

 lib/intel_compute.c                      | 1158 ++++++++++++++++++++++
 lib/{xe/xe_compute.h => intel_compute.h} |   12 +-
 lib/intel_compute_square_kernels.c       |  166 ++++
 lib/meson.build                          |    4 +-
 lib/xe/xe_compute.c                      |  488 ---------
 lib/xe/xe_compute_square_kernels.c       |   71 --
 tests/intel/gem_compute.c                |   45 +
 tests/intel/xe_compute.c                 |    7 +-
 tests/meson.build                        |    1 +
 9 files changed, 1381 insertions(+), 571 deletions(-)
 create mode 100644 lib/intel_compute.c
 rename lib/{xe/xe_compute.h => intel_compute.h} (74%)
 create mode 100644 lib/intel_compute_square_kernels.c
 delete mode 100644 lib/xe/xe_compute.c
 delete mode 100644 lib/xe/xe_compute_square_kernels.c
 create mode 100644 tests/intel/gem_compute.c

-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-06 16:42   ` Kamil Konieczny
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute Zbigniew Kempczyński
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

While adding xe-compute support for DG2 I hit some issues in the Xe
driver, so instead of limiting the workload to Xe only I decided to
handle i915 as well. This approach may also be handy for comparing
feature status between the two drivers.

This patch is a preparation step for sharing the code between i915 and Xe.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/{xe/xe_compute.c => intel_compute.c}       | 18 +++++++++---------
 lib/{xe/xe_compute.h => intel_compute.h}       | 12 ++++++------
 ...ernels.c => intel_compute_square_kernels.c} |  4 ++--
 lib/meson.build                                |  4 ++--
 tests/intel/xe_compute.c                       |  4 ++--
 5 files changed, 21 insertions(+), 21 deletions(-)
 rename lib/{xe/xe_compute.c => intel_compute.c} (97%)
 rename lib/{xe/xe_compute.h => intel_compute.h} (74%)
 rename lib/{xe/xe_compute_square_kernels.c => intel_compute_square_kernels.c} (97%)

diff --git a/lib/xe/xe_compute.c b/lib/intel_compute.c
similarity index 97%
rename from lib/xe/xe_compute.c
rename to lib/intel_compute.c
index 3e8112a048..647bce0e43 100644
--- a/lib/xe/xe_compute.c
+++ b/lib/intel_compute.c
@@ -13,7 +13,7 @@
 #include "lib/igt_syncobj.h"
 #include "lib/intel_reg.h"
 
-#include "xe_compute.h"
+#include "intel_compute.h"
 #include "xe/xe_ioctl.h"
 #include "xe/xe_query.h"
 
@@ -453,24 +453,24 @@ static const struct {
 	unsigned int ip_ver;
 	void (*compute_exec)(int fd, const unsigned char *kernel,
 			     unsigned int size);
-} xe_compute_batches[] = {
+} compute_batches[] = {
 	{
 		.ip_ver = IP_VER(12, 0),
 		.compute_exec = tgl_compute_exec,
 	},
 };
 
-bool run_xe_compute_kernel(int fd)
+bool run_compute_kernel(int fd)
 {
 	unsigned int ip_ver = intel_graphics_ver(intel_get_drm_devid(fd));
 	unsigned int batch;
-	const struct xe_compute_kernels *kernels = xe_compute_square_kernels;
+	const struct compute_kernels *kernels = compute_square_kernels;
 
-	for (batch = 0; batch < ARRAY_SIZE(xe_compute_batches); batch++) {
-		if (ip_ver == xe_compute_batches[batch].ip_ver)
+	for (batch = 0; batch < ARRAY_SIZE(compute_batches); batch++) {
+		if (ip_ver == compute_batches[batch].ip_ver)
 			break;
 	}
-	if (batch == ARRAY_SIZE(xe_compute_batches))
+	if (batch == ARRAY_SIZE(compute_batches))
 		return false;
 
 	while (kernels->kernel) {
@@ -481,8 +481,8 @@ bool run_xe_compute_kernel(int fd)
 	if (!kernels->kernel)
 		return 1;
 
-	xe_compute_batches[batch].compute_exec(fd, kernels->kernel,
-					       kernels->size);
+	compute_batches[batch].compute_exec(fd, kernels->kernel,
+					    kernels->size);
 
 	return true;
 }
diff --git a/lib/xe/xe_compute.h b/lib/intel_compute.h
similarity index 74%
rename from lib/xe/xe_compute.h
rename to lib/intel_compute.h
index b2e7e98278..e271bb5254 100644
--- a/lib/xe/xe_compute.h
+++ b/lib/intel_compute.h
@@ -6,8 +6,8 @@
  *    Francois Dugast <francois.dugast@intel.com>
  */
 
-#ifndef XE_COMPUTE_H
-#define XE_COMPUTE_H
+#ifndef INTEL_COMPUTE_H
+#define INTEL_COMPUTE_H
 
 /*
  * OpenCL Kernels are generated using:
@@ -19,14 +19,14 @@
  * For each GPU model desired. A list of supported models can be obtained with: ocloc compile --help
  */
 
-struct xe_compute_kernels {
+struct compute_kernels {
 	int ip_ver;
 	unsigned int size;
 	const unsigned char *kernel;
 };
 
-extern const struct xe_compute_kernels xe_compute_square_kernels[];
+extern const struct compute_kernels compute_square_kernels[];
 
-bool run_xe_compute_kernel(int fd);
+bool run_compute_kernel(int fd);
 
-#endif	/* XE_COMPUTE_H */
+#endif	/* INTEL_COMPUTE_H */
diff --git a/lib/xe/xe_compute_square_kernels.c b/lib/intel_compute_square_kernels.c
similarity index 97%
rename from lib/xe/xe_compute_square_kernels.c
rename to lib/intel_compute_square_kernels.c
index f9c07dc778..b30d8a23dd 100644
--- a/lib/xe/xe_compute_square_kernels.c
+++ b/lib/intel_compute_square_kernels.c
@@ -8,7 +8,7 @@
  */
 
 #include "intel_chipset.h"
-#include "lib/xe/xe_compute.h"
+#include "lib/intel_compute.h"
 
 static const unsigned char tgllp_kernel_square_bin[] = {
 	0x61, 0x00, 0x03, 0x80, 0x20, 0x02, 0x05, 0x03, 0x04, 0x00, 0x10, 0x00,
@@ -61,7 +61,7 @@ static const unsigned char tgllp_kernel_square_bin[] = {
 	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
 };
 
-const struct xe_compute_kernels xe_compute_square_kernels[] = {
+const struct compute_kernels compute_square_kernels[] = {
 	{
 		.ip_ver = IP_VER(12, 0),
 		.size = sizeof(tgllp_kernel_square_bin),
diff --git a/lib/meson.build b/lib/meson.build
index 21ea9d5ac4..a45f7d677f 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -58,6 +58,8 @@ lib_sources = [
 	'intel_bufops.c',
 	'intel_chipset.c',
 	'intel_cmds_info.c',
+	'intel_compute.c',
+	'intel_compute_square_kernels.c',
 	'intel_ctx.c',
 	'intel_device_info.c',
 	'intel_mmio.c',
@@ -103,8 +105,6 @@ lib_sources = [
 	'veboxcopy_gen12.c',
 	'igt_msm.c',
 	'igt_dsc.c',
-	'xe/xe_compute.c',
-	'xe/xe_compute_square_kernels.c',
 	'xe/xe_gt.c',
 	'xe/xe_ioctl.c',
 	'xe/xe_query.c',
diff --git a/tests/intel/xe_compute.c b/tests/intel/xe_compute.c
index 2cf536701a..0c54fbec42 100644
--- a/tests/intel/xe_compute.c
+++ b/tests/intel/xe_compute.c
@@ -14,8 +14,8 @@
 #include <string.h>
 
 #include "igt.h"
+#include "intel_compute.h"
 #include "xe/xe_query.h"
-#include "xe/xe_compute.h"
 
 /**
  * SUBTEST: compute-square
@@ -29,7 +29,7 @@
 static void
 test_compute_square(int fd)
 {
-	igt_require_f(run_xe_compute_kernel(fd), "GPU not supported\n");
+	igt_require_f(run_compute_kernel(fd), "GPU not supported\n");
 }
 
 igt_main
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08  9:03   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 3/9] lib/intel_compute: Reorganize the code for i915 version preparation Zbigniew Kempczyński
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Allow compute tests to be selectively turned on/off for the i915 and xe
drivers.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/intel_compute.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)
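
The compat check added below boils down to a per-driver bit test. A
minimal standalone sketch of that idea follows; enum sketch_driver and
pipeline_allowed() are hypothetical names used only for illustration,
and the enum values are assumptions that need not match IGT's
enum intel_driver:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for IGT's enum intel_driver; the real values
 * live in the IGT library and may differ.
 */
enum sketch_driver {
	SKETCH_DRIVER_I915 = 1,
	SKETCH_DRIVER_XE,
};

/* One bit per driver, indexed by the enum value. */
#define COMPAT_FLAG(f)	(1u << (f))
#define COMPAT_I915	COMPAT_FLAG(SKETCH_DRIVER_I915)
#define COMPAT_XE	COMPAT_FLAG(SKETCH_DRIVER_XE)

/* True when the pipeline's compat mask contains the bit of this driver. */
static bool pipeline_allowed(enum sketch_driver driver, uint32_t compat)
{
	return (COMPAT_FLAG(driver) & compat) != 0;
}
```

With this layout, dropping a driver from a pipeline later only means
removing one flag from the table entry's mask.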

diff --git a/lib/intel_compute.c b/lib/intel_compute.c
index 647bce0e43..dd9f686d0c 100644
--- a/lib/intel_compute.c
+++ b/lib/intel_compute.c
@@ -446,17 +446,27 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
 }
 
 /*
- * Generic code
+ * Compatibility flags.
+ *
+ * There will be some time period in which both drivers (i915 and xe)
+ * will support compute runtime tests. Let's define compat flags to allow
+ * the code to be shared between two drivers allowing disabling this in
+ * the future.
  */
+#define COMPAT_FLAG(f) (1 << (f))
+#define COMPAT_I915 COMPAT_FLAG(INTEL_DRIVER_I915)
+#define COMPAT_XE   COMPAT_FLAG(INTEL_DRIVER_XE)
 
 static const struct {
 	unsigned int ip_ver;
 	void (*compute_exec)(int fd, const unsigned char *kernel,
 			     unsigned int size);
+	uint32_t compat;
 } compute_batches[] = {
 	{
 		.ip_ver = IP_VER(12, 0),
 		.compute_exec = tgl_compute_exec,
+		.compat = COMPAT_I915 | COMPAT_XE,
 	},
 };
 
@@ -465,6 +475,7 @@ bool run_compute_kernel(int fd)
 	unsigned int ip_ver = intel_graphics_ver(intel_get_drm_devid(fd));
 	unsigned int batch;
 	const struct compute_kernels *kernels = compute_square_kernels;
+	enum intel_driver driver = get_intel_driver(fd);
 
 	for (batch = 0; batch < ARRAY_SIZE(compute_batches); batch++) {
 		if (ip_ver == compute_batches[batch].ip_ver)
@@ -473,6 +484,12 @@ bool run_compute_kernel(int fd)
 	if (batch == ARRAY_SIZE(compute_batches))
 		return false;
 
+	if (!(COMPAT_FLAG(driver) & compute_batches[batch].compat)) {
+		igt_debug("driver flag: %x\n", COMPAT_FLAG(driver));
+		igt_debug("compat flag: %x\n", compute_batches[batch].compat);
+		return false;
+	}
+
 	while (kernels->kernel) {
 		if (ip_ver == kernels->ip_ver)
 			break;
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 3/9] lib/intel_compute: Reorganize the code for i915 version preparation
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute Zbigniew Kempczyński
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08  9:05   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 4/9] lib/intel_compute: Add name field for debugging purposes Zbigniew Kempczyński
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

The compute pipeline creation contains common code, so it is worth
extracting it into dedicated functions.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/intel_compute.c | 135 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 110 insertions(+), 25 deletions(-)

diff --git a/lib/intel_compute.c b/lib/intel_compute.c
index dd9f686d0c..b42f3eca0e 100644
--- a/lib/intel_compute.c
+++ b/lib/intel_compute.c
@@ -39,6 +39,95 @@ struct bo_dict_entry {
 	void *data;
 };
 
+struct bo_execenv {
+	int fd;
+	enum intel_driver driver;
+
+	/* Xe part */
+	uint32_t vm;
+	uint32_t exec_queue;
+};
+
+static void bo_execenv_create(int fd, struct bo_execenv *execenv)
+{
+	igt_assert(execenv);
+
+	memset(execenv, 0, sizeof(*execenv));
+	execenv->fd = fd;
+	execenv->driver = get_intel_driver(fd);
+
+	if (execenv->driver == INTEL_DRIVER_XE) {
+		execenv->vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
+		execenv->exec_queue = xe_exec_queue_create_class(fd, execenv->vm,
+								 DRM_XE_ENGINE_CLASS_RENDER);
+	}
+}
+
+static void bo_execenv_destroy(struct bo_execenv *execenv)
+{
+	igt_assert(execenv);
+
+	if (execenv->driver == INTEL_DRIVER_XE) {
+		xe_vm_destroy(execenv->fd, execenv->vm);
+		xe_exec_queue_destroy(execenv->fd, execenv->exec_queue);
+	}
+}
+
+static void bo_execenv_bind(struct bo_execenv *execenv,
+			    struct bo_dict_entry *bo_dict, int entries)
+{
+	int fd = execenv->fd;
+
+	if (execenv->driver == INTEL_DRIVER_XE) {
+		uint32_t vm = execenv->vm;
+		uint64_t alignment = xe_get_default_alignment(fd);
+		struct drm_xe_sync sync = { 0 };
+
+		sync.flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL;
+		sync.handle = syncobj_create(fd, 0);
+
+		for (int i = 0; i < entries; i++) {
+			bo_dict[i].data = aligned_alloc(alignment, bo_dict[i].size);
+			xe_vm_bind_userptr_async(fd, vm, 0, to_user_pointer(bo_dict[i].data),
+						 bo_dict[i].addr, bo_dict[i].size, &sync, 1);
+			syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
+			memset(bo_dict[i].data, 0, bo_dict[i].size);
+		}
+
+		syncobj_destroy(fd, sync.handle);
+	}
+}
+
+static void bo_execenv_unbind(struct bo_execenv *execenv,
+			      struct bo_dict_entry *bo_dict, int entries)
+{
+	int fd = execenv->fd;
+
+	if (execenv->driver == INTEL_DRIVER_XE) {
+		uint32_t vm = execenv->vm;
+		struct drm_xe_sync sync = { 0 };
+
+		sync.flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL;
+		sync.handle = syncobj_create(fd, 0);
+
+		for (int i = 0; i < entries; i++) {
+			xe_vm_unbind_async(fd, vm, 0, 0, bo_dict[i].addr, bo_dict[i].size, &sync, 1);
+			syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
+			free(bo_dict[i].data);
+		}
+
+		syncobj_destroy(fd, sync.handle);
+	}
+}
+
+static void bo_execenv_exec(struct bo_execenv *execenv, uint64_t start_addr)
+{
+	int fd = execenv->fd;
+
+	if (execenv->driver == INTEL_DRIVER_XE)
+		xe_exec_wait(fd, execenv->exec_queue, start_addr);
+}
+
 /*
  * TGL compatible batch
  */
@@ -389,9 +478,6 @@ static void tgllp_compute_exec_compute(uint32_t *addr_bo_buffer_batch,
 static void tgl_compute_exec(int fd, const unsigned char *kernel,
 			     unsigned int size)
 {
-	uint32_t vm, exec_queue;
-	float *dinput;
-	struct drm_xe_sync sync = { 0 };
 #define TGL_BO_DICT_ENTRIES 7
 	struct bo_dict_entry bo_dict[TGL_BO_DICT_ENTRIES] = {
 		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_KERNEL}, // kernel
@@ -402,47 +488,46 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
 		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT }, // output
 		{ .addr = ADDR_BATCH, .size = SIZE_BATCH }, // batch
 	};
+	struct bo_execenv execenv;
+	float *dinput;
+
+	bo_execenv_create(fd, &execenv);
 
 	/* Sets Kernel size */
 	bo_dict[0].size = ALIGN(size, 0x1000);
 
-	vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
-	exec_queue = xe_exec_queue_create_class(fd, vm, DRM_XE_ENGINE_CLASS_RENDER);
-	sync.flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL;
-	sync.handle = syncobj_create(fd, 0);
+	bo_execenv_bind(&execenv, bo_dict, TGL_BO_DICT_ENTRIES);
 
-	for (int i = 0; i < TGL_BO_DICT_ENTRIES; i++) {
-		bo_dict[i].data = aligned_alloc(xe_get_default_alignment(fd), bo_dict[i].size);
-		xe_vm_bind_userptr_async(fd, vm, 0, to_user_pointer(bo_dict[i].data), bo_dict[i].addr, bo_dict[i].size, &sync, 1);
-		syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
-		memset(bo_dict[i].data, 0, bo_dict[i].size);
-	}
 	memcpy(bo_dict[0].data, kernel, size);
 	tgllp_create_dynamic_state(bo_dict[1].data, OFFSET_KERNEL);
 	tgllp_create_surface_state(bo_dict[2].data, ADDR_INPUT, ADDR_OUTPUT);
 	tgllp_create_indirect_data(bo_dict[3].data, ADDR_INPUT, ADDR_OUTPUT);
+
 	dinput = (float *)bo_dict[4].data;
 	srand(time(NULL));
-
 	for (int i = 0; i < SIZE_DATA; i++)
 		((float *)dinput)[i] = rand() / (float)RAND_MAX;
 
-	tgllp_compute_exec_compute(bo_dict[6].data, ADDR_SURFACE_STATE_BASE, ADDR_DYNAMIC_STATE_BASE, ADDR_INDIRECT_OBJECT_BASE, OFFSET_INDIRECT_DATA_START);
+	tgllp_compute_exec_compute(bo_dict[6].data,
+				   ADDR_SURFACE_STATE_BASE,
+				   ADDR_DYNAMIC_STATE_BASE,
+				   ADDR_INDIRECT_OBJECT_BASE,
+				   OFFSET_INDIRECT_DATA_START);
 
-	xe_exec_wait(fd, exec_queue, ADDR_BATCH);
+	bo_execenv_exec(&execenv, ADDR_BATCH);
 
-	for (int i = 0; i < SIZE_DATA; i++)
-		igt_assert(((float *)bo_dict[5].data)[i] == ((float *)bo_dict[4].data)[i] * ((float *) bo_dict[4].data)[i]);
+	for (int i = 0; i < SIZE_DATA; i++) {
+		float f1, f2;
 
-	for (int i = 0; i < TGL_BO_DICT_ENTRIES; i++) {
-		xe_vm_unbind_async(fd, vm, 0, 0, bo_dict[i].addr, bo_dict[i].size, &sync, 1);
-		syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
-		free(bo_dict[i].data);
+		f1 = ((float *) bo_dict[5].data)[i];
+		f2 = ((float *) bo_dict[4].data)[i];
+		if (f1 != f2 * f2)
+			igt_debug("[%4d] f1: %f != %f\n", i, f1, f2 * f2);
+		igt_assert(f1 == f2 * f2);
 	}
 
-	syncobj_destroy(fd, sync.handle);
-	xe_exec_queue_destroy(fd, exec_queue);
-	xe_vm_destroy(fd, vm);
+	bo_execenv_unbind(&execenv, bo_dict, TGL_BO_DICT_ENTRIES);
+	bo_execenv_destroy(&execenv);
 }
 
 /*
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 4/9] lib/intel_compute: Add name field for debugging purposes
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (2 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 3/9] lib/intel_compute: Reorganize the code for i915 version preparation Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08  9:05   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 5/9] lib/intel_compute: Add i915 path in compute library Zbigniew Kempczyński
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Debugging without knowing the object characteristics is hard and
time-consuming. A simple name field, used when printing bound addresses
and their sizes, can speed up development. I ran into this while
extending the code to DG2, so I decided to add it permanently. To avoid
noisy output, printing is limited to igt_debug(), which prints only on
user request or on test failure.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/intel_compute.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)
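
For illustration, the kind of per-object line the new name field makes
possible can be sketched outside IGT as below. struct sketch_bo and
sketch_describe() are hypothetical names, and the sketch formats with
snprintf() and the <inttypes.h> macros rather than igt_debug(), so it
is an approximation of the output layout and not the library code:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down version of struct bo_dict_entry. */
struct sketch_bo {
	uint64_t addr;
	uint32_t size;
	const char *name;
};

/*
 * Format one object description into 'out'. Returns what snprintf()
 * returns: the number of characters that the full line needs.
 */
static int sketch_describe(char *out, size_t len, int i,
			   const struct sketch_bo *bo)
{
	return snprintf(out, len,
			"[i: %2d name: %20s] addr: %16" PRIx64 ", size: %" PRIx32,
			i, bo->name, bo->addr, bo->size);
}
```

Using the PRIx64/PRIx32 macros keeps the format specifiers in step
with the field types without casts.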

diff --git a/lib/intel_compute.c b/lib/intel_compute.c
index b42f3eca0e..a1e87ef46f 100644
--- a/lib/intel_compute.c
+++ b/lib/intel_compute.c
@@ -37,6 +37,7 @@ struct bo_dict_entry {
 	uint64_t addr;
 	uint32_t size;
 	void *data;
+	const char *name;
 };
 
 struct bo_execenv {
@@ -92,6 +93,11 @@ static void bo_execenv_bind(struct bo_execenv *execenv,
 						 bo_dict[i].addr, bo_dict[i].size, &sync, 1);
 			syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
 			memset(bo_dict[i].data, 0, bo_dict[i].size);
+
+			igt_debug("[i: %2d name: %20s] data: %p, addr: %16llx, size: %llx\n",
+				  i, bo_dict[i].name, bo_dict[i].data,
+				  (long long)bo_dict[i].addr,
+				  (long long)bo_dict[i].size);
 		}
 
 		syncobj_destroy(fd, sync.handle);
@@ -480,13 +486,26 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
 {
 #define TGL_BO_DICT_ENTRIES 7
 	struct bo_dict_entry bo_dict[TGL_BO_DICT_ENTRIES] = {
-		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_KERNEL}, // kernel
-		{ .addr = ADDR_DYNAMIC_STATE_BASE, .size =  0x1000}, // dynamic state
-		{ .addr = ADDR_SURFACE_STATE_BASE, .size =  0x1000}, // surface state
-		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_INDIRECT_DATA_START, .size =  0x10000}, // indirect data
-		{ .addr = ADDR_INPUT, .size = SIZE_BUFFER_INPUT }, // input
-		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT }, // output
-		{ .addr = ADDR_BATCH, .size = SIZE_BATCH }, // batch
+		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_KERNEL,
+		  .name = "kernel" },
+		{ .addr = ADDR_DYNAMIC_STATE_BASE,
+		  .size =  0x1000,
+		  .name = "dynamic state base" },
+		{ .addr = ADDR_SURFACE_STATE_BASE,
+		  .size =  0x1000,
+		  .name = "surface state base" },
+		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_INDIRECT_DATA_START,
+		  .size =  0x10000,
+		  .name = "indirect data start" },
+		{ .addr = ADDR_INPUT,
+		  .size = SIZE_BUFFER_INPUT,
+		  .name = "input" },
+		{ .addr = ADDR_OUTPUT,
+		  .size = SIZE_BUFFER_OUTPUT,
+		  .name = "output" },
+		{ .addr = ADDR_BATCH,
+		  .size = SIZE_BATCH,
+		  .name = "batch" },
 	};
 	struct bo_execenv execenv;
 	float *dinput;
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 5/9] lib/intel_compute: Add i915 path in compute library
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (3 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 4/9] lib/intel_compute: Add name field for debugging purposes Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08  9:13   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 6/9] intel/gem_compute: Add test which runs compute workload on i915 Zbigniew Kempczyński
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Add the code required to run the compute workload on i915.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/intel_compute.c | 50 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)
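
The i915 path below passes every address through CANONICAL() before
storing it in obj[i].offset. A minimal sketch of what such
canonicalization usually means, assuming 48-bit GPU virtual addresses
where bit 47 is sign-extended into the upper bits. sketch_canonical()
and SKETCH_HIGH_ADDRESS_BIT are hypothetical names, not the IGT macro
itself:

```c
#include <stdint.h>

/* Assumption: 48-bit GPU virtual addresses, so bit 47 is the sign bit. */
#define SKETCH_HIGH_ADDRESS_BIT	47

/* Hypothetical helper: copy bit 47 of 'addr' into bits 48..63. */
static uint64_t sketch_canonical(uint64_t addr)
{
	const uint64_t sign = 1ull << SKETCH_HIGH_ADDRESS_BIT;
	const uint64_t mask = (sign << 1) - 1;	/* bits 0..47 */
	const uint64_t low = addr & mask;

	return (low & sign) ? (low | ~mask) : low;
}
```

Low addresses such as ADDR_OUTPUT come back unchanged; only addresses
with bit 47 set gain the upper 0xffff prefix.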

diff --git a/lib/intel_compute.c b/lib/intel_compute.c
index a1e87ef46f..4344844825 100644
--- a/lib/intel_compute.c
+++ b/lib/intel_compute.c
@@ -8,6 +8,7 @@
 
 #include <stdint.h>
 
+#include "i915/gem_create.h"
 #include "igt.h"
 #include "xe_drm.h"
 #include "lib/igt_syncobj.h"
@@ -38,6 +39,7 @@ struct bo_dict_entry {
 	uint32_t size;
 	void *data;
 	const char *name;
+	uint32_t handle;
 };
 
 struct bo_execenv {
@@ -47,6 +49,10 @@ struct bo_execenv {
 	/* Xe part */
 	uint32_t vm;
 	uint32_t exec_queue;
+
+	/* i915 part */
+	struct drm_i915_gem_execbuffer2 execbuf;
+	struct drm_i915_gem_exec_object2 *obj;
 };
 
 static void bo_execenv_create(int fd, struct bo_execenv *execenv)
@@ -101,6 +107,33 @@ static void bo_execenv_bind(struct bo_execenv *execenv,
 		}
 
 		syncobj_destroy(fd, sync.handle);
+	} else {
+		struct drm_i915_gem_execbuffer2 *execbuf = &execenv->execbuf;
+		struct drm_i915_gem_exec_object2 *obj;
+
+		obj = calloc(entries, sizeof(*obj));
+		execenv->obj = obj;
+
+		for (int i = 0; i < entries; i++) {
+			bo_dict[i].handle = gem_create(fd, bo_dict[i].size);
+			bo_dict[i].data = gem_mmap__device_coherent(fd, bo_dict[i].handle,
+								    0, bo_dict[i].size,
+								    PROT_READ | PROT_WRITE);
+			igt_debug("[i: %2d name: %20s] handle: %u, data: %p, addr: %16llx, size: %llx\n",
+				  i, bo_dict[i].name,
+				  bo_dict[i].handle, bo_dict[i].data,
+				  (long long)bo_dict[i].addr,
+				  (long long)bo_dict[i].size);
+
+			obj[i].handle = bo_dict[i].handle;
+			obj[i].offset = CANONICAL(bo_dict[i].addr);
+			obj[i].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+			if (bo_dict[i].addr == ADDR_OUTPUT)
+				obj[i].flags |= EXEC_OBJECT_WRITE;
+		}
+
+		execbuf->buffers_ptr = to_user_pointer(obj);
+		execbuf->buffer_count = entries;
 	}
 }
 
@@ -123,6 +156,12 @@ static void bo_execenv_unbind(struct bo_execenv *execenv,
 		}
 
 		syncobj_destroy(fd, sync.handle);
+	} else {
+		for (int i = 0; i < entries; i++) {
+			gem_close(fd, bo_dict[i].handle);
+			munmap(bo_dict[i].data, bo_dict[i].size);
+		}
+		free(execenv->obj);
 	}
 }
 
@@ -130,8 +169,17 @@ static void bo_execenv_exec(struct bo_execenv *execenv, uint64_t start_addr)
 {
 	int fd = execenv->fd;
 
-	if (execenv->driver == INTEL_DRIVER_XE)
+	if (execenv->driver == INTEL_DRIVER_XE) {
 		xe_exec_wait(fd, execenv->exec_queue, start_addr);
+	} else {
+		struct drm_i915_gem_execbuffer2 *execbuf = &execenv->execbuf;
+		struct drm_i915_gem_exec_object2 *obj = execenv->obj;
+		int num_objects = execbuf->buffer_count;
+
+		execbuf->flags = I915_EXEC_RENDER;
+		gem_execbuf(fd, execbuf);
+		gem_sync(fd, obj[num_objects - 1].handle); /* batch handle */
+	}
 }
 
 /*
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 6/9] intel/gem_compute: Add test which runs compute workload on i915
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (4 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 5/9] lib/intel_compute: Add i915 path in compute library Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08  9:15   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 7/9] lib/intel_compute: Add XeHP implementation of compute pipeline Zbigniew Kempczyński
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

This test is a verbatim copy of xe_compute except for the driver open
call (it opens an i915 drm fd instead of an xe one). Technically it is
possible to have a single test (the open would try DRIVER_INTEL | DRIVER_XE),
but I decided against that to keep the i915 and xe versions distinct.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 tests/intel/gem_compute.c | 46 +++++++++++++++++++++++++++++++++++++++
 tests/meson.build         |  1 +
 2 files changed, 47 insertions(+)
 create mode 100644 tests/intel/gem_compute.c

diff --git a/tests/intel/gem_compute.c b/tests/intel/gem_compute.c
new file mode 100644
index 0000000000..b408efee16
--- /dev/null
+++ b/tests/intel/gem_compute.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+/**
+ * TEST: Check compute-related functionality
+ * Category: Hardware building block
+ * Sub-category: compute
+ * Test category: functionality test
+ * Run type: BAT
+ */
+
+#include <string.h>
+
+#include "igt.h"
+#include "intel_compute.h"
+
+/**
+ * SUBTEST: compute-square
+ * GPU requirement: only works on TGL
+ * Description:
+ *	Run an openCL Kernel that returns output[i] = input[i] * input[i],
+ *	for an input dataset.
+ * Functionality: compute openCL kernel
+ * TODO: extend test to cover other platforms
+ */
+static void
+test_compute_square(int fd)
+{
+	igt_require_f(run_compute_kernel(fd), "GPU not supported\n");
+}
+
+igt_main
+{
+	int i915;
+
+	igt_fixture
+		i915 = drm_open_driver(DRIVER_INTEL);
+
+	igt_subtest("compute-square")
+		test_compute_square(i915);
+
+	igt_fixture
+		drm_close_driver(i915);
+}
diff --git a/tests/meson.build b/tests/meson.build
index aa8e3434ce..03bb7785c3 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -105,6 +105,7 @@ intel_i915_progs = [
 	'gem_ccs',
 	'gem_close',
 	'gem_close_race',
+	'gem_compute',
 	'gem_concurrent_blit',
 	'gem_cs_tlb',
 	'gem_ctx_bad_destroy',
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 7/9] lib/intel_compute: Add XeHP implementation of compute pipeline
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (5 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 6/9] intel/gem_compute: Add test which runs compute workload on i915 Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08 13:55   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 8/9] lib/intel_compute: Adding pvc compute pipeline implementation Zbigniew Kempczyński
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Add pipeline which runs square compute workload on DG2.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/intel_compute.c                | 287 ++++++++++++++++++++++++++++-
 lib/intel_compute_square_kernels.c |  56 ++++++
 2 files changed, 342 insertions(+), 1 deletion(-)
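
The state builders in this patch store each 64-bit GPU address as two
consecutive dwords, low half first. A small standalone sketch of that
pattern; sketch_emit_addr() and sketch_read_addr() are hypothetical
helpers and are not part of the patch:

```c
#include <stdint.h>

/*
 * Write 'addr' at index 'b' as two dwords (low, then high) and return
 * the next free index, mirroring the buf[b++] style used in the patch.
 */
static int sketch_emit_addr(uint32_t *buf, int b, uint64_t addr)
{
	buf[b++] = (uint32_t)(addr & 0xffffffff);
	buf[b++] = (uint32_t)(addr >> 32);

	return b;
}

/* Reassemble the 64-bit address stored at index 'b'. */
static uint64_t sketch_read_addr(const uint32_t *buf, int b)
{
	return (uint64_t)buf[b] | ((uint64_t)buf[b + 1] << 32);
}
```

For the addresses used here (ADDR_INPUT, ADDR_OUTPUT) the high dword is
simply zero, but the split keeps the layout valid for addresses above
4 GiB as well.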

diff --git a/lib/intel_compute.c b/lib/intel_compute.c
index 4344844825..29a5ec168f 100644
--- a/lib/intel_compute.c
+++ b/lib/intel_compute.c
@@ -14,9 +14,12 @@
 #include "lib/igt_syncobj.h"
 #include "lib/intel_reg.h"
 
+#include "gen7_media.h"
+#include "gen8_media.h"
 #include "intel_compute.h"
 #include "xe/xe_ioctl.h"
 #include "xe/xe_query.h"
+#include "xehp_media.h"
 
 #define PIPE_CONTROL			0x7a000004
 #define MEDIA_STATE_FLUSH		0x0
@@ -25,7 +28,7 @@
 #define SIZE_BATCH			0x1000
 #define SIZE_BUFFER_INPUT		MAX(sizeof(float) * SIZE_DATA, 0x1000)
 #define SIZE_BUFFER_OUTPUT		MAX(sizeof(float) * SIZE_DATA, 0x1000)
-#define ADDR_BATCH			0x100000
+#define ADDR_BATCH			0x100000UL
 #define ADDR_INPUT			0x200000UL
 #define ADDR_OUTPUT			0x300000UL
 #define ADDR_SURFACE_STATE_BASE		0x400000UL
@@ -34,6 +37,10 @@
 #define OFFSET_INDIRECT_DATA_START	0xFFFDF000
 #define OFFSET_KERNEL			0xFFFEF000
 
+#define XEHP_ADDR_GENERAL_STATE_BASE		0x80000000UL
+#define XEHP_ADDR_INSTRUCTION_STATE_BASE	0x90000000UL
+#define XEHP_OFFSET_BINDING_TABLE		0x1000
+
 struct bo_dict_entry {
 	uint64_t addr;
 	uint32_t size;
@@ -597,6 +604,279 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
 	bo_execenv_destroy(&execenv);
 }
 
+static void xehp_create_indirect_data(uint32_t *addr_bo_buffer_batch,
+				      uint64_t addr_input,
+				      uint64_t addr_output)
+{
+	int b = 0;
+
+	addr_bo_buffer_batch[b++] = addr_input & 0xffffffff;
+	addr_bo_buffer_batch[b++] = addr_input >> 32;
+	addr_bo_buffer_batch[b++] = addr_output & 0xffffffff;
+	addr_bo_buffer_batch[b++] = addr_output >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000400;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000400;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+}
+
+static void xehp_create_surface_state(uint32_t *addr_bo_buffer_batch,
+				      uint64_t addr_input,
+				      uint64_t addr_output)
+{
+	int b = 0;
+
+	addr_bo_buffer_batch[b++] = 0x87FDC000;
+	addr_bo_buffer_batch[b++] = 0x06000000;
+	addr_bo_buffer_batch[b++] = 0x001F007F;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00002000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = addr_input & 0xffffffff;
+	addr_bo_buffer_batch[b++] = addr_input >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = 0x87FDC000;
+	addr_bo_buffer_batch[b++] = 0x06000000;
+	addr_bo_buffer_batch[b++] = 0x001F007F;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00002000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = addr_output & 0xffffffff;
+	addr_bo_buffer_batch[b++] = addr_output >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = 0x00001000;
+	addr_bo_buffer_batch[b++] = 0x00001040;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+}
+
+static void xehp_compute_exec_compute(uint32_t *addr_bo_buffer_batch,
+				      uint64_t addr_general_state_base,
+				      uint64_t addr_surface_state_base,
+				      uint64_t addr_dynamic_state_base,
+				      uint64_t addr_instruction_state_base,
+				      uint64_t offset_indirect_data_start,
+				      uint64_t kernel_start_pointer)
+{
+	int b = 0;
+
+	igt_debug("general   state base: %lx\n", addr_general_state_base);
+	igt_debug("surface   state base: %lx\n", addr_surface_state_base);
+	igt_debug("dynamic   state base: %lx\n", addr_dynamic_state_base);
+	igt_debug("instruct   base addr: %lx\n", addr_instruction_state_base);
+	igt_debug("bindless   base addr: %lx\n", addr_surface_state_base);
+	igt_debug("offset indirect addr: %lx\n", offset_indirect_data_start);
+	igt_debug("kernel start pointer: %lx\n", kernel_start_pointer);
+
+	addr_bo_buffer_batch[b++] = GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
+				    PIPELINE_SELECT_GPGPU;
+
+	addr_bo_buffer_batch[b++] = XEHP_STATE_COMPUTE_MODE;
+	addr_bo_buffer_batch[b++] = 0x80180010;
+
+	addr_bo_buffer_batch[b++] = XEHP_CFE_STATE;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x0c008800;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = MI_LOAD_REGISTER_IMM(1);
+	addr_bo_buffer_batch[b++] = 0x00002580;
+	addr_bo_buffer_batch[b++] = 0x00060002;
+
+	addr_bo_buffer_batch[b++] = STATE_BASE_ADDRESS | 0x14;
+	addr_bo_buffer_batch[b++] = (addr_general_state_base & 0xffffffff) | 0x61;
+	addr_bo_buffer_batch[b++] = addr_general_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x0106c000;
+	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x61;
+	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
+	addr_bo_buffer_batch[b++] = (addr_dynamic_state_base & 0xffffffff) | 0x61;
+	addr_bo_buffer_batch[b++] = addr_dynamic_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = (addr_instruction_state_base & 0xffffffff) | 0x61;
+	addr_bo_buffer_batch[b++] = addr_instruction_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0xfffff001;
+	addr_bo_buffer_batch[b++] = 0x00010001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0xfffff001;
+	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x61;
+	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x00007fbf;
+	addr_bo_buffer_batch[b++] = 0x00000061;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = GEN8_3DSTATE_BINDING_TABLE_POOL_ALLOC | 2;
+	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x6;
+	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x00002000;
+	addr_bo_buffer_batch[b++] = 0x001ff000;
+
+	addr_bo_buffer_batch[b++] = XEHP_COMPUTE_WALKER | 0x25;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000040;
+	addr_bo_buffer_batch[b++] = offset_indirect_data_start;
+	addr_bo_buffer_batch[b++] = 0xbe040000;
+	addr_bo_buffer_batch[b++] = 0xffffffff;
+	addr_bo_buffer_batch[b++] = 0x0000003f;
+	addr_bo_buffer_batch[b++] = 0x00000010;
+
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = kernel_start_pointer;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00180000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00001080;
+	addr_bo_buffer_batch[b++] = 0x0c000002;
+
+	addr_bo_buffer_batch[b++] = 0x00000008;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00001027;
+	addr_bo_buffer_batch[b++] = ADDR_BATCH;
+	addr_bo_buffer_batch[b++] = ADDR_BATCH >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000040;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = MI_BATCH_BUFFER_END;
+}
+
+/**
+ * xehp_compute_exec - run a pipeline compatible with XEHP
+ *
+ * @fd: file descriptor of the opened DRM device
+ * @kernel: GPU Kernel binary to be executed
+ * @size: size of @kernel.
+ */
+static void xehp_compute_exec(int fd, const unsigned char *kernel,
+			     unsigned int size)
+{
+#define XEHP_BO_DICT_ENTRIES 9
+	struct bo_dict_entry bo_dict[XEHP_BO_DICT_ENTRIES] = {
+		{ .addr = XEHP_ADDR_INSTRUCTION_STATE_BASE + OFFSET_KERNEL,
+		  .name = "instr state base"},
+		{ .addr = ADDR_DYNAMIC_STATE_BASE,
+		  .size = 0x100000,
+		  .name = "dynamic state base"},
+		{ .addr = ADDR_SURFACE_STATE_BASE,
+		  .size = 0x1000,
+		  .name = "surface state base"},
+		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE + OFFSET_INDIRECT_DATA_START,
+		  .size =  0x1000,
+		  .name = "indirect object base"},
+		{ .addr = ADDR_INPUT, .size = SIZE_BUFFER_INPUT,
+		  .name = "addr input"},
+		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT,
+		  .name = "addr output" },
+		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE, .size = 0x100000,
+		  .name = "general state base" },
+		{ .addr = ADDR_SURFACE_STATE_BASE + XEHP_OFFSET_BINDING_TABLE,
+		  .size = 0x1000,
+		  .name = "binding table" },
+		{ .addr = ADDR_BATCH, .size = SIZE_BATCH,
+		  .name = "batch" },
+	};
+	struct bo_execenv execenv;
+	float *dinput;
+
+	bo_execenv_create(fd, &execenv);
+
+	/* Sets Kernel size */
+	bo_dict[0].size = ALIGN(size, 0x1000);
+
+	bo_execenv_bind(&execenv, bo_dict, XEHP_BO_DICT_ENTRIES);
+
+	memcpy(bo_dict[0].data, kernel, size);
+	tgllp_create_dynamic_state(bo_dict[1].data, OFFSET_KERNEL);
+	xehp_create_surface_state(bo_dict[2].data, ADDR_INPUT, ADDR_OUTPUT);
+	xehp_create_indirect_data(bo_dict[3].data, ADDR_INPUT, ADDR_OUTPUT);
+	xehp_create_surface_state(bo_dict[7].data, ADDR_INPUT, ADDR_OUTPUT);
+
+	dinput = (float *)bo_dict[4].data;
+	srand(time(NULL));
+	for (int i = 0; i < SIZE_DATA; i++)
+		dinput[i] = rand() / (float)RAND_MAX;
+
+	xehp_compute_exec_compute(bo_dict[8].data,
+				  XEHP_ADDR_GENERAL_STATE_BASE,
+				  ADDR_SURFACE_STATE_BASE,
+				  ADDR_DYNAMIC_STATE_BASE,
+				  XEHP_ADDR_INSTRUCTION_STATE_BASE,
+				  OFFSET_INDIRECT_DATA_START,
+				  OFFSET_KERNEL);
+
+	bo_execenv_exec(&execenv, ADDR_BATCH);
+
+	for (int i = 0; i < SIZE_DATA; i++) {
+		float f1, f2;
+
+		f1 = ((float *) bo_dict[5].data)[i];
+		f2 = ((float *) bo_dict[4].data)[i];
+		if (f1 != f2 * f2)
+			igt_debug("[%4d] f1: %f != %f\n", i, f1, f2 * f2);
+		igt_assert(f1 == f2 * f2);
+	}
+
+	bo_execenv_unbind(&execenv, bo_dict, XEHP_BO_DICT_ENTRIES);
+	bo_execenv_destroy(&execenv);
+}
+
 /*
  * Compatibility flags.
  *
@@ -620,6 +900,11 @@ static const struct {
 		.compute_exec = tgl_compute_exec,
 		.compat = COMPAT_I915 | COMPAT_XE,
 	},
+	{
+		.ip_ver = IP_VER(12, 55),
+		.compute_exec = xehp_compute_exec,
+		.compat = COMPAT_I915,
+	},
 };
 
 bool run_compute_kernel(int fd)
diff --git a/lib/intel_compute_square_kernels.c b/lib/intel_compute_square_kernels.c
index b30d8a23dd..da73a3747c 100644
--- a/lib/intel_compute_square_kernels.c
+++ b/lib/intel_compute_square_kernels.c
@@ -61,11 +61,67 @@ static const unsigned char tgllp_kernel_square_bin[] = {
 	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
 };
 
+static const unsigned char xehp_kernel_square_bin[] = {
+	0x61, 0x31, 0x03, 0x80, 0x20, 0x42, 0x05, 0x7f, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x65, 0x00, 0x00, 0x80, 0x20, 0x82, 0x45, 0x7f,
+	0x04, 0x00, 0x00, 0x02, 0xc0, 0xff, 0xff, 0xff, 0x40, 0x19, 0x00, 0x80,
+	0x20, 0x82, 0x45, 0x7f, 0x44, 0x7f, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00,
+	0x31, 0x92, 0x03, 0x80, 0x00, 0x00, 0x14, 0x08, 0x0c, 0x7f, 0xfa, 0xa7,
+	0x00, 0x00, 0x10, 0x02, 0x61, 0x20, 0x03, 0x80, 0x20, 0x02, 0x05, 0x03,
+	0x04, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x66, 0x09, 0x00, 0x80,
+	0x20, 0x82, 0x01, 0x80, 0x00, 0x80, 0x00, 0x01, 0xc0, 0x04, 0xc0, 0x04,
+	0x01, 0x09, 0x00, 0xe8, 0x01, 0x00, 0x11, 0x00, 0x01, 0x22, 0x00, 0xe8,
+	0x01, 0x00, 0x11, 0x00, 0x41, 0x09, 0x20, 0x22, 0x16, 0x09, 0x11, 0x03,
+	0x49, 0x00, 0x04, 0xa2, 0x12, 0x09, 0x11, 0x03, 0x01, 0x21, 0x00, 0xe8,
+	0x01, 0x00, 0x11, 0x00, 0x52, 0x19, 0x04, 0x00, 0x60, 0x06, 0x04, 0x05,
+	0x04, 0x04, 0x0e, 0x01, 0x04, 0x01, 0x04, 0x07, 0x52, 0x00, 0x24, 0x00,
+	0x60, 0x06, 0x04, 0x0a, 0x04, 0x04, 0x0e, 0x01, 0x04, 0x02, 0x04, 0x07,
+	0x70, 0x1a, 0x04, 0x00, 0x60, 0x02, 0x01, 0x00, 0x04, 0x05, 0x10, 0x52,
+	0x84, 0x08, 0x00, 0x00, 0x70, 0x1a, 0x24, 0x00, 0x60, 0x02, 0x01, 0x00,
+	0x04, 0x0a, 0x10, 0x52, 0x84, 0x08, 0x00, 0x00, 0x2e, 0x00, 0x05, 0x11,
+	0x00, 0xc0, 0x00, 0x00, 0x90, 0x00, 0x00, 0x00, 0x90, 0x00, 0x00, 0x00,
+	0x69, 0x00, 0x0c, 0x60, 0x02, 0x05, 0x20, 0x00, 0x69, 0x00, 0x0e, 0x66,
+	0x02, 0x0a, 0x20, 0x00, 0x40, 0x1a, 0x10, 0xa0, 0x32, 0x0c, 0x10, 0x08,
+	0x40, 0x1a, 0x12, 0xa6, 0x32, 0x0e, 0x10, 0x08, 0x31, 0xa3, 0x04, 0x00,
+	0x00, 0x00, 0x14, 0x14, 0x94, 0x10, 0x00, 0xfa, 0x00, 0x00, 0x00, 0x06,
+	0x31, 0x94, 0x24, 0x00, 0x00, 0x00, 0x14, 0x16, 0x94, 0x12, 0x00, 0xfa,
+	0x00, 0x00, 0x00, 0x06, 0x40, 0x00, 0x0c, 0xa0, 0x4a, 0x0c, 0x10, 0x08,
+	0x40, 0x00, 0x0e, 0xa6, 0x4a, 0x0e, 0x10, 0x08, 0x41, 0x23, 0x14, 0x20,
+	0x00, 0x14, 0x00, 0x14, 0x41, 0x24, 0x16, 0x26, 0x00, 0x16, 0x00, 0x16,
+	0x31, 0xa5, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x95, 0x0c, 0x08, 0xfa,
+	0x14, 0x14, 0x80, 0x07, 0x31, 0x96, 0x24, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x95, 0x0e, 0x08, 0xfa, 0x14, 0x16, 0x80, 0x07, 0x2f, 0x00, 0x05, 0x00,
+	0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
+	0x61, 0x00, 0x7f, 0x64, 0x00, 0x03, 0x10, 0x00, 0x31, 0x09, 0x03, 0x80,
+	0x04, 0x00, 0x00, 0x00, 0x0c, 0x7f, 0x20, 0x30, 0x00, 0x00, 0x00, 0x00,
+	0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
 const struct compute_kernels compute_square_kernels[] = {
 	{
 		.ip_ver = IP_VER(12, 0),
 		.size = sizeof(tgllp_kernel_square_bin),
 		.kernel = tgllp_kernel_square_bin,
 	},
+	{
+		.ip_ver = IP_VER(12, 55),
+		.size = sizeof(xehp_kernel_square_bin),
+		.kernel = xehp_kernel_square_bin,
+	},
 	{}
 };
-- 
2.34.1
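
A side note on the ADDR_BATCH hunk above: the batch and indirect data store
each 64-bit GPU address as two 32-bit dwords, low half first. The batch also
emits ADDR_BATCH >> 32, and shifting a plain int constant such as 0x100000 by
32 bits is undefined in C, so the constant needs a type wider than 32 bits
(the UL suffix gives that on LP64 targets). Below is a minimal sketch of the
split. emit_addr64() is a hypothetical helper used only for illustration, it
is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BATCH 0x100000UL /* same value as in the patch */

/*
 * Hypothetical helper: store a 64-bit address into dwords[b] and
 * dwords[b + 1], low half first, and return the next free index.
 */
static int emit_addr64(uint32_t *dwords, int b, uint64_t addr)
{
	assert(dwords);
	dwords[b++] = (uint32_t)(addr & 0xffffffff);
	dwords[b++] = (uint32_t)(addr >> 32);
	return b;
}
```

The explicit uint64_t parameter makes the shift well defined regardless of
how the caller's constant was written.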


* [igt-dev] [PATCH i-g-t v2 8/9] lib/intel_compute: Adding pvc compute pipeline implementation
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (6 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 7/9] lib/intel_compute: Add XeHP implementation of compute pipeline Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08 13:56   ` Francois Dugast
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 9/9] tests/gem|xe_compute: Update documentation regarding test requirements Zbigniew Kempczyński
  2023-09-05 18:23 ` [igt-dev] ✗ Fi.CI.BAT: failure for Extend compute square to i915 and Xe (rev2) Patchwork
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Add square compute pipeline which works on PVC. Currently limited
to Xe driver.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 lib/intel_compute.c                | 218 ++++++++++++++++++++++++++++-
 lib/intel_compute_square_kernels.c |  39 ++++++
 2 files changed, 256 insertions(+), 1 deletion(-)

diff --git a/lib/intel_compute.c b/lib/intel_compute.c
index 29a5ec168f..4a232ce72b 100644
--- a/lib/intel_compute.c
+++ b/lib/intel_compute.c
@@ -71,9 +71,18 @@ static void bo_execenv_create(int fd, struct bo_execenv *execenv)
 	execenv->driver = get_intel_driver(fd);
 
 	if (execenv->driver == INTEL_DRIVER_XE) {
+		uint16_t engine_class;
+		uint32_t devid = intel_get_drm_devid(fd);
+		const struct intel_device_info *info = intel_get_device_info(devid);
+
+		if (info->graphics_ver >= 12 && info->graphics_rel < 60)
+			engine_class = DRM_XE_ENGINE_CLASS_RENDER;
+		else
+			engine_class = DRM_XE_ENGINE_CLASS_COMPUTE;
+
 		execenv->vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
 		execenv->exec_queue = xe_exec_queue_create_class(fd, execenv->vm,
-								 DRM_XE_ENGINE_CLASS_RENDER);
+								 engine_class);
 	}
 }
 
@@ -877,6 +886,208 @@ static void xehp_compute_exec(int fd, const unsigned char *kernel,
 	bo_execenv_destroy(&execenv);
 }
 
+static void xehpc_create_indirect_data(uint32_t *addr_bo_buffer_batch,
+				       uint64_t addr_input,
+				       uint64_t addr_output)
+{
+	int b = 0;
+
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000400;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = addr_input & 0xffffffff;
+	addr_bo_buffer_batch[b++] = addr_input >> 32;
+	addr_bo_buffer_batch[b++] = addr_output & 0xffffffff;
+	addr_bo_buffer_batch[b++] = addr_output >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000400;
+	addr_bo_buffer_batch[b++] = 0x00000400;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+}
+
+static void xehpc_compute_exec_compute(uint32_t *addr_bo_buffer_batch,
+				       uint64_t addr_general_state_base,
+				       uint64_t addr_surface_state_base,
+				       uint64_t addr_dynamic_state_base,
+				       uint64_t addr_instruction_state_base,
+				       uint64_t offset_indirect_data_start,
+				       uint64_t kernel_start_pointer)
+{
+	int b = 0;
+
+	igt_debug("general   state base: %lx\n", addr_general_state_base);
+	igt_debug("surface   state base: %lx\n", addr_surface_state_base);
+	igt_debug("dynamic   state base: %lx\n", addr_dynamic_state_base);
+	igt_debug("instruct   base addr: %lx\n", addr_instruction_state_base);
+	igt_debug("bindless   base addr: %lx\n", addr_surface_state_base);
+	igt_debug("offset indirect addr: %lx\n", offset_indirect_data_start);
+	igt_debug("kernel start pointer: %lx\n", kernel_start_pointer);
+
+	addr_bo_buffer_batch[b++] = GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
+				    PIPELINE_SELECT_GPGPU;
+
+	addr_bo_buffer_batch[b++] = XEHP_STATE_COMPUTE_MODE;
+	addr_bo_buffer_batch[b++] = 0xE0186010;
+
+	addr_bo_buffer_batch[b++] = XEHP_CFE_STATE | 0x4;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x10008800;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = MI_LOAD_REGISTER_IMM(1);
+	addr_bo_buffer_batch[b++] = 0x00002580;
+	addr_bo_buffer_batch[b++] = 0x00060002;
+
+	addr_bo_buffer_batch[b++] = STATE_BASE_ADDRESS | 0x14;
+	addr_bo_buffer_batch[b++] = (addr_general_state_base & 0xffffffff) | 0x41;
+	addr_bo_buffer_batch[b++] = addr_general_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x00044000;
+	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x41;
+	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
+	addr_bo_buffer_batch[b++] = (addr_dynamic_state_base & 0xffffffff) | 0x41;
+	addr_bo_buffer_batch[b++] = addr_dynamic_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = (addr_instruction_state_base & 0xffffffff) | 0x41;
+	addr_bo_buffer_batch[b++] = addr_instruction_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0xfffff001;
+	addr_bo_buffer_batch[b++] = 0x00010001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0xfffff001;
+	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x41;
+	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
+	addr_bo_buffer_batch[b++] = 0x00007fbf;
+	addr_bo_buffer_batch[b++] = 0x00000041;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = GEN8_3DSTATE_BINDING_TABLE_POOL_ALLOC | 2;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = XEHP_COMPUTE_WALKER | 0x25;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000040;
+	addr_bo_buffer_batch[b++] = offset_indirect_data_start;
+	addr_bo_buffer_batch[b++] = 0xbe040000;
+	addr_bo_buffer_batch[b++] = 0xffffffff;
+	addr_bo_buffer_batch[b++] = 0x0000003f;
+	addr_bo_buffer_batch[b++] = 0x00000010;
+
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = kernel_start_pointer;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00180000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x0c000020;
+
+	addr_bo_buffer_batch[b++] = 0x00000008;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00001047;
+	addr_bo_buffer_batch[b++] = ADDR_BATCH;
+	addr_bo_buffer_batch[b++] = ADDR_BATCH >> 32;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000040;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000001;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+	addr_bo_buffer_batch[b++] = 0x00000000;
+
+	addr_bo_buffer_batch[b++] = MI_BATCH_BUFFER_END;
+}
+
+/**
+ * xehpc_compute_exec - run a pipeline compatible with XEHPC
+ *
+ * @fd: file descriptor of the opened DRM device
+ * @kernel: GPU Kernel binary to be executed
+ * @size: size of @kernel.
+ */
+static void xehpc_compute_exec(int fd, const unsigned char *kernel,
+			       unsigned int size)
+{
+#define XEHPC_BO_DICT_ENTRIES 6
+	struct bo_dict_entry bo_dict[XEHPC_BO_DICT_ENTRIES] = {
+		{ .addr = XEHP_ADDR_INSTRUCTION_STATE_BASE + OFFSET_KERNEL,
+		  .name = "instr state base"},
+		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE + OFFSET_INDIRECT_DATA_START,
+		  .size =  0x10000,
+		  .name = "indirect object base"},
+		{ .addr = ADDR_INPUT, .size = SIZE_BUFFER_INPUT,
+		  .name = "addr input"},
+		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT,
+		  .name = "addr output" },
+		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE, .size = 0x10000,
+		  .name = "general state base" },
+		{ .addr = ADDR_BATCH, .size = SIZE_BATCH,
+		  .name = "batch" },
+	};
+	struct bo_execenv execenv;
+	float *dinput;
+
+	bo_execenv_create(fd, &execenv);
+
+	/* Sets Kernel size */
+	bo_dict[0].size = ALIGN(size, 0x1000);
+
+	bo_execenv_bind(&execenv, bo_dict, XEHPC_BO_DICT_ENTRIES);
+
+	memcpy(bo_dict[0].data, kernel, size);
+	xehpc_create_indirect_data(bo_dict[1].data, ADDR_INPUT, ADDR_OUTPUT);
+
+	dinput = (float *)bo_dict[2].data;
+	srand(time(NULL));
+	for (int i = 0; i < SIZE_DATA; i++)
+		dinput[i] = rand() / (float)RAND_MAX;
+
+	xehpc_compute_exec_compute(bo_dict[5].data,
+				   XEHP_ADDR_GENERAL_STATE_BASE,
+				   ADDR_SURFACE_STATE_BASE,
+				   ADDR_DYNAMIC_STATE_BASE,
+				   XEHP_ADDR_INSTRUCTION_STATE_BASE,
+				   OFFSET_INDIRECT_DATA_START,
+				   OFFSET_KERNEL);
+
+	bo_execenv_exec(&execenv, ADDR_BATCH);
+
+	for (int i = 0; i < SIZE_DATA; i++) {
+		float f1, f2;
+
+		f1 = ((float *) bo_dict[3].data)[i];
+		f2 = ((float *) bo_dict[2].data)[i];
+		if (f1 != f2 * f2)
+			igt_debug("[%4d] f1: %f != %f\n", i, f1, f2 * f2);
+		igt_assert(f1 == f2 * f2);
+	}
+
+	bo_execenv_unbind(&execenv, bo_dict, XEHPC_BO_DICT_ENTRIES);
+	bo_execenv_destroy(&execenv);
+}
+
 /*
  * Compatibility flags.
  *
@@ -905,6 +1116,11 @@ static const struct {
 		.compute_exec = xehp_compute_exec,
 		.compat = COMPAT_I915,
 	},
+	{
+		.ip_ver = IP_VER(12, 60),
+		.compute_exec = xehpc_compute_exec,
+		.compat = COMPAT_XE,
+	},
 };
 
 bool run_compute_kernel(int fd)
diff --git a/lib/intel_compute_square_kernels.c b/lib/intel_compute_square_kernels.c
index da73a3747c..de93a3bdfd 100644
--- a/lib/intel_compute_square_kernels.c
+++ b/lib/intel_compute_square_kernels.c
@@ -112,6 +112,40 @@ static const unsigned char xehp_kernel_square_bin[] = {
 	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
 };
 
+static const unsigned char xehpc_kernel_square_bin[] = {
+	0x65, 0xa1, 0x00, 0x80, 0x20, 0x82, 0x05, 0x7f, 0x04, 0x00, 0x00, 0x02,
+	0xc0, 0xff, 0xff, 0xff, 0x40, 0x19, 0x00, 0x80, 0x20, 0x82, 0x05, 0x7f,
+	0x04, 0x7f, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x31, 0x22, 0x03, 0x00,
+	0x00, 0x00, 0x0c, 0x04, 0x8f, 0x7f, 0x00, 0xfa, 0x03, 0x00, 0x34, 0xf6,
+	0x66, 0x09, 0x84, 0xb4, 0x80, 0x80, 0x00, 0x4c, 0x41, 0x22, 0x03, 0x80,
+	0x60, 0x06, 0x01, 0x20, 0xd4, 0x04, 0x00, 0x01, 0x14, 0x00, 0x00, 0x00,
+	0x53, 0x80, 0x00, 0x80, 0x60, 0x06, 0x05, 0x02, 0xd4, 0x04, 0x00, 0x06,
+	0x14, 0x00, 0x00, 0x00, 0x52, 0x19, 0x14, 0x00, 0x60, 0x06, 0x04, 0x05,
+	0x04, 0x02, 0x0e, 0x01, 0x04, 0x01, 0x04, 0x04, 0x70, 0x19, 0x14, 0x00,
+	0x20, 0x02, 0x01, 0x00, 0x04, 0x05, 0x10, 0x52, 0xc4, 0x04, 0x00, 0x00,
+	0x2e, 0x00, 0x14, 0x14, 0x00, 0xc0, 0x00, 0x00, 0x78, 0x00, 0x00, 0x00,
+	0x78, 0x00, 0x00, 0x00, 0x61, 0x00, 0x00, 0x6c, 0x13, 0x05, 0x00, 0x00,
+	0x61, 0x00, 0x08, 0x6c, 0x15, 0x06, 0x00, 0x00, 0x69, 0x1a, 0x00, 0xf9,
+	0x17, 0x13, 0x20, 0x00, 0x69, 0x1a, 0x08, 0xf9, 0x19, 0x15, 0x20, 0x00,
+	0x40, 0x1a, 0x00, 0x20, 0x07, 0x17, 0x60, 0x04, 0x40, 0x1a, 0x08, 0x20,
+	0x09, 0x19, 0x60, 0x04, 0x31, 0x23, 0x15, 0x00, 0x00, 0x00, 0x14, 0x0b,
+	0x24, 0x07, 0x00, 0xfb, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x20,
+	0x0f, 0x17, 0x30, 0x04, 0x40, 0x00, 0x08, 0x20, 0x11, 0x19, 0x30, 0x04,
+	0x41, 0x83, 0x14, 0x2c, 0x0d, 0x0b, 0x10, 0x0b, 0x31, 0x24, 0x15, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x24, 0x0f, 0x08, 0xfb, 0x14, 0x0d, 0x00, 0x00,
+	0x2f, 0x00, 0x14, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x10, 0x00, 0x00, 0x00, 0x61, 0x00, 0x1c, 0x34, 0x7f, 0x00, 0x00, 0x00,
+	0x31, 0x11, 0x0c, 0x80, 0x04, 0x00, 0x00, 0x00, 0x0c, 0x7f, 0x20, 0x30,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+};
+
 const struct compute_kernels compute_square_kernels[] = {
 	{
 		.ip_ver = IP_VER(12, 0),
@@ -123,5 +157,10 @@ const struct compute_kernels compute_square_kernels[] = {
 		.size = sizeof(xehp_kernel_square_bin),
 		.kernel = xehp_kernel_square_bin,
 	},
+	{
+		.ip_ver = IP_VER(12, 60),
+		.size = sizeof(xehpc_kernel_square_bin),
+		.kernel = xehpc_kernel_square_bin,
+	},
 	{}
 };
-- 
2.34.1
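
For readers following the compat table: after this patch there are three
entries, keyed by IP version and gated by driver flags. The body of
run_compute_kernel() is not visible in these hunks, so the lookup below is
only a sketch of how such a table could be matched. The IP_VER() encoding,
the COMPAT_* bit values and find_pipeline() are assumptions made for
illustration, the real definitions live in the IGT headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encodings, for illustration only. */
#define IP_VER(ver, rel)	((ver) << 8 | (rel))
#define COMPAT_I915		(1u << 0)
#define COMPAT_XE		(1u << 1)

struct pipeline {
	unsigned int ip_ver;
	unsigned int compat;
};

/* Mirrors the table contents after patch 8/9. */
static const struct pipeline pipelines[] = {
	{ IP_VER(12, 0),  COMPAT_I915 | COMPAT_XE },	/* tgl */
	{ IP_VER(12, 55), COMPAT_I915 },		/* xehp */
	{ IP_VER(12, 60), COMPAT_XE },			/* xehpc */
};

/*
 * Hypothetical lookup: return the index of the entry matching ip_ver
 * that is allowed on the given driver, or -1 if there is none.
 */
static int find_pipeline(unsigned int ip_ver, unsigned int driver)
{
	/* Exactly one driver flag is expected from the caller. */
	assert(driver == COMPAT_I915 || driver == COMPAT_XE);

	for (size_t i = 0; i < sizeof(pipelines) / sizeof(pipelines[0]); i++) {
		if (pipelines[i].ip_ver == ip_ver &&
		    (pipelines[i].compat & driver))
			return (int)i;
	}

	return -1;
}
```

With this table an IP_VER(12, 60) device matches only on Xe and an
IP_VER(12, 55) device only on i915, which is the split the series describes.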


* [igt-dev] [PATCH i-g-t v2 9/9] tests/gem|xe_compute: Update documentation regarding test requirements
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (7 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 8/9] lib/intel_compute: Adding pvc compute pipeline implementation Zbigniew Kempczyński
@ 2023-09-05 13:33 ` Zbigniew Kempczyński
  2023-09-08 13:56   ` Francois Dugast
  2023-09-05 18:23 ` [igt-dev] ✗ Fi.CI.BAT: failure for Extend compute square to i915 and Xe (rev2) Patchwork
  9 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-05 13:33 UTC (permalink / raw)
  To: igt-dev

Currently the test is prepared to run on DG2, ATS-M and PVC, so let's
reflect this in the documentation tags.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
---
 tests/intel/gem_compute.c | 3 +--
 tests/intel/xe_compute.c  | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/tests/intel/gem_compute.c b/tests/intel/gem_compute.c
index b408efee16..8f4722d2dc 100644
--- a/tests/intel/gem_compute.c
+++ b/tests/intel/gem_compute.c
@@ -18,12 +18,11 @@
 
 /**
  * SUBTEST: compute-square
- * GPU requirement: only works on TGL
+ * GPU requirement: TGL, DG2, ATS-M
  * Description:
  *	Run an openCL Kernel that returns output[i] = input[i] * input[i],
  *	for an input dataset..
  * Functionality: compute openCL kernel
- * TODO: extend test to cover other platforms
  */
 static void
 test_compute_square(int fd)
diff --git a/tests/intel/xe_compute.c b/tests/intel/xe_compute.c
index 0c54fbec42..07764decb5 100644
--- a/tests/intel/xe_compute.c
+++ b/tests/intel/xe_compute.c
@@ -19,12 +19,11 @@
 
 /**
  * SUBTEST: compute-square
- * GPU requirement: only works on TGL
+ * GPU requirement: TGL, PVC
  * Description:
  *	Run an openCL Kernel that returns output[i] = input[i] * input[i],
  *	for an input dataset..
  * Functionality: compute openCL kernel
- * TODO: extend test to cover other platforms
  */
 static void
 test_compute_square(int fd)
-- 
2.34.1
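
As a reference for what the subtest checks: the kernel computes
output[i] = input[i] * input[i], and the library compares each GPU result
against the CPU product with exact float equality. A minimal CPU-side sketch
of that check follows. square_ref() and first_mismatch() are hypothetical
names for illustration, they do not exist in the library.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the kernel: output[i] = input[i] * input[i]. */
static void square_ref(const float *input, float *output, size_t n)
{
	assert(input && output);

	for (size_t i = 0; i < n; i++)
		output[i] = input[i] * input[i];
}

/*
 * Return the index of the first element where output differs from the
 * squared input (exact comparison, as in the library), or -1 if all
 * n elements match.
 */
static long first_mismatch(const float *input, const float *output, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		float expected = input[i] * input[i];

		if (output[i] != expected)
			return (long)i;
	}

	return -1;
}
```

Exact comparison works here because both sides perform the same single
float multiplication. A kernel doing more arithmetic would likely need a
tolerance instead.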


* [igt-dev] ✗ Fi.CI.BAT: failure for Extend compute square to i915 and Xe (rev2)
  2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
                   ` (8 preceding siblings ...)
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 9/9] tests/gem|xe_compute: Update documentation regarding test requirements Zbigniew Kempczyński
@ 2023-09-05 18:23 ` Patchwork
  9 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2023-09-05 18:23 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

[-- Attachment #1: Type: text/plain, Size: 14795 bytes --]

== Series Details ==

Series: Extend compute square to i915 and Xe (rev2)
URL   : https://patchwork.freedesktop.org/series/122568/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_13599 -> IGTPW_9722
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_9722 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_9722, please notify your bug team (lgci.bug.filing@intel.com) to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/index.html

Participating hosts (38 -> 38)
------------------------------

  Additional (2): fi-kbl-soraka bat-dg2-8 
  Missing    (2): bat-dg2-9 fi-snb-2520m 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_9722:

### IGT changes ###

#### Possible regressions ####

  * igt@core_auth@basic-auth:
    - fi-apl-guc:         [PASS][1] -> [DMESG-WARN][2] +1 other test dmesg-warn
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@core_auth@basic-auth.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@core_auth@basic-auth.html

  
Known issues
------------

  Here are the changes found in IGTPW_9722 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@core_hotunplug@unbind-rebind:
    - fi-apl-guc:         [PASS][3] -> [DMESG-WARN][4] ([i915#180] / [i915#7634]) +1 other test dmesg-warn
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@core_hotunplug@unbind-rebind.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@core_hotunplug@unbind-rebind.html

  * igt@gem_exec_suspend@basic-s0@lmem0:
    - bat-dg2-8:          NOTRUN -> [INCOMPLETE][5] ([i915#6311])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@gem_exec_suspend@basic-s0@lmem0.html

  * igt@gem_huc_copy@huc-copy:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#2190])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-kbl-soraka/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@basic:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#4613]) +3 other tests skip
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-kbl-soraka/igt@gem_lmem_swapping@basic.html

  * igt@gem_mmap@basic:
    - bat-dg2-8:          NOTRUN -> [SKIP][8] ([i915#4083])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@gem_mmap@basic.html

  * igt@gem_mmap_gtt@basic:
    - bat-dg2-8:          NOTRUN -> [SKIP][9] ([i915#4077]) +2 other tests skip
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@gem_mmap_gtt@basic.html

  * igt@gem_tiled_pread_basic:
    - bat-dg2-8:          NOTRUN -> [SKIP][10] ([i915#4079]) +1 other test skip
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@gem_tiled_pread_basic.html

  * igt@i915_module_load@reload:
    - fi-apl-guc:         [PASS][11] -> [DMESG-WARN][12] ([i915#180] / [i915#1982] / [i915#7634])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@i915_module_load@reload.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@i915_module_load@reload.html

  * igt@i915_pm_backlight@basic-brightness:
    - bat-dg2-8:          NOTRUN -> [SKIP][13] ([i915#5354] / [i915#7561])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@i915_pm_backlight@basic-brightness.html

  * igt@i915_pm_rpm@module-reload:
    - fi-apl-guc:         [PASS][14] -> [DMESG-WARN][15] ([i915#180] / [i915#7634] / [i915#8585])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@i915_pm_rpm@module-reload.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@i915_pm_rpm@module-reload.html

  * igt@i915_pm_rps@basic-api:
    - bat-dg2-8:          NOTRUN -> [SKIP][16] ([i915#6621])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@i915_pm_rps@basic-api.html

  * igt@i915_selftest@live@gt_pm:
    - fi-kbl-soraka:      NOTRUN -> [DMESG-FAIL][17] ([i915#1886] / [i915#7913])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@reset:
    - fi-apl-guc:         [PASS][18] -> [DMESG-WARN][19] ([i915#7634]) +36 other tests dmesg-warn
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@i915_selftest@live@reset.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@i915_selftest@live@reset.html

  * igt@i915_suspend@basic-s3-without-i915:
    - bat-dg2-8:          NOTRUN -> [SKIP][20] ([i915#6645])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@i915_suspend@basic-s3-without-i915.html

  * igt@kms_addfb_basic@addfb25-bad-modifier:
    - fi-apl-guc:         [PASS][21] -> [DMESG-WARN][22] ([i915#7634] / [i915#8703])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@kms_addfb_basic@addfb25-bad-modifier.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@kms_addfb_basic@addfb25-bad-modifier.html

  * igt@kms_addfb_basic@addfb25-y-tiled-small-legacy:
    - bat-dg2-8:          NOTRUN -> [SKIP][23] ([i915#5190])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_addfb_basic@addfb25-y-tiled-small-legacy.html

  * igt@kms_addfb_basic@bad-pitch-0:
    - fi-apl-guc:         [PASS][24] -> [DMESG-WARN][25] ([i915#8703]) +35 other tests dmesg-warn
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@kms_addfb_basic@bad-pitch-0.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@kms_addfb_basic@bad-pitch-0.html

  * igt@kms_addfb_basic@basic-y-tiled-legacy:
    - bat-dg2-8:          NOTRUN -> [SKIP][26] ([i915#4215] / [i915#5190])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_addfb_basic@basic-y-tiled-legacy.html

  * igt@kms_addfb_basic@framebuffer-vs-set-tiling:
    - bat-dg2-8:          NOTRUN -> [SKIP][27] ([i915#4212]) +7 other tests skip
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_addfb_basic@framebuffer-vs-set-tiling.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][28] ([fdo#109271]) +8 other tests skip
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-kbl-soraka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
    - bat-dg2-8:          NOTRUN -> [SKIP][29] ([i915#4103] / [i915#4213]) +1 other test skip
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_flip@basic-flip-vs-dpms@c-dp1:
    - fi-apl-guc:         [PASS][30] -> [DMESG-WARN][31] ([i915#180] / [i915#8703]) +40 other tests dmesg-warn
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@kms_flip@basic-flip-vs-dpms@c-dp1.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@kms_flip@basic-flip-vs-dpms@c-dp1.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@a-dp1:
    - fi-apl-guc:         [PASS][32] -> [DMESG-WARN][33] ([i915#180] / [i915#1982] / [i915#8703])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/fi-apl-guc/igt@kms_flip@basic-flip-vs-wf_vblank@a-dp1.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/fi-apl-guc/igt@kms_flip@basic-flip-vs-wf_vblank@a-dp1.html

  * igt@kms_force_connector_basic@force-load-detect:
    - bat-dg2-8:          NOTRUN -> [SKIP][34] ([fdo#109285])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_force_connector_basic@force-load-detect.html

  * igt@kms_force_connector_basic@prune-stale-modes:
    - bat-dg2-8:          NOTRUN -> [SKIP][35] ([i915#5274])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_force_connector_basic@prune-stale-modes.html

  * igt@kms_psr@cursor_plane_move:
    - bat-dg2-8:          NOTRUN -> [SKIP][36] ([i915#1072]) +3 other tests skip
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_psr@cursor_plane_move.html

  * igt@kms_psr@primary_mmap_gtt:
    - bat-rplp-1:         NOTRUN -> [SKIP][37] ([i915#1072]) +1 other test skip
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-rplp-1/igt@kms_psr@primary_mmap_gtt.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - bat-rplp-1:         NOTRUN -> [ABORT][38] ([i915#8260] / [i915#8668])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-rplp-1/igt@kms_setmode@basic-clone-single-crtc.html
    - bat-dg2-8:          NOTRUN -> [SKIP][39] ([i915#3555])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@kms_setmode@basic-clone-single-crtc.html

  * igt@prime_vgem@basic-fence-flip:
    - bat-dg2-8:          NOTRUN -> [SKIP][40] ([i915#3708])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@prime_vgem@basic-fence-flip.html

  * igt@prime_vgem@basic-fence-mmap:
    - bat-dg2-8:          NOTRUN -> [SKIP][41] ([i915#3708] / [i915#4077]) +1 other test skip
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@prime_vgem@basic-fence-mmap.html

  * igt@prime_vgem@basic-write:
    - bat-dg2-8:          NOTRUN -> [SKIP][42] ([i915#3291] / [i915#3708]) +2 other tests skip
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-8/igt@prime_vgem@basic-write.html

  
#### Possible fixes ####

  * igt@kms_chamelium_edid@hdmi-edid-read:
    - {bat-dg2-13}:       [DMESG-WARN][43] ([i915#7952]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/bat-dg2-13/igt@kms_chamelium_edid@hdmi-edid-read.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-13/igt@kms_chamelium_edid@hdmi-edid-read.html

  * igt@kms_chamelium_frames@dp-crc-fast:
    - {bat-dg2-13}:       [DMESG-WARN][45] ([Intel XE#485]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/bat-dg2-13/igt@kms_chamelium_frames@dp-crc-fast.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-dg2-13/igt@kms_chamelium_frames@dp-crc-fast.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@a-dp6:
    - bat-adlp-11:        [FAIL][47] ([i915#6121]) -> [PASS][48] +4 other tests pass
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/bat-adlp-11/igt@kms_flip@basic-flip-vs-wf_vblank@a-dp6.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-adlp-11/igt@kms_flip@basic-flip-vs-wf_vblank@a-dp6.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-dp5:
    - bat-adlp-11:        [DMESG-WARN][49] ([i915#6868]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/bat-adlp-11/igt@kms_flip@basic-flip-vs-wf_vblank@c-dp5.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-adlp-11/igt@kms_flip@basic-flip-vs-wf_vblank@c-dp5.html

  
#### Warnings ####

  * igt@kms_psr@cursor_plane_move:
    - bat-rplp-1:         [ABORT][51] ([i915#9243]) -> [SKIP][52] ([i915#1072])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13599/bat-rplp-1/igt@kms_psr@cursor_plane_move.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/bat-rplp-1/igt@kms_psr@cursor_plane_move.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [Intel XE#485]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/485
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4215]: https://gitlab.freedesktop.org/drm/intel/issues/4215
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#5190]: https://gitlab.freedesktop.org/drm/intel/issues/5190
  [i915#5274]: https://gitlab.freedesktop.org/drm/intel/issues/5274
  [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
  [i915#6121]: https://gitlab.freedesktop.org/drm/intel/issues/6121
  [i915#6311]: https://gitlab.freedesktop.org/drm/intel/issues/6311
  [i915#6621]: https://gitlab.freedesktop.org/drm/intel/issues/6621
  [i915#6645]: https://gitlab.freedesktop.org/drm/intel/issues/6645
  [i915#6868]: https://gitlab.freedesktop.org/drm/intel/issues/6868
  [i915#7561]: https://gitlab.freedesktop.org/drm/intel/issues/7561
  [i915#7634]: https://gitlab.freedesktop.org/drm/intel/issues/7634
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7952]: https://gitlab.freedesktop.org/drm/intel/issues/7952
  [i915#8260]: https://gitlab.freedesktop.org/drm/intel/issues/8260
  [i915#8585]: https://gitlab.freedesktop.org/drm/intel/issues/8585
  [i915#8668]: https://gitlab.freedesktop.org/drm/intel/issues/8668
  [i915#8703]: https://gitlab.freedesktop.org/drm/intel/issues/8703
  [i915#9243]: https://gitlab.freedesktop.org/drm/intel/issues/9243


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_7468 -> IGTPW_9722

  CI-20190529: 20190529
  CI_DRM_13599: 58fe10f34e80d0eeb5609128faa135260623a715 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_9722: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/index.html
  IGT_7468: 7468


Testlist changes
----------------

+++ 59611 lines
--- 59610 lines

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9722/index.html

* Re: [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute Zbigniew Kempczyński
@ 2023-09-06 16:42   ` Kamil Konieczny
  0 siblings, 0 replies; 22+ messages in thread
From: Kamil Konieczny @ 2023-09-06 16:42 UTC (permalink / raw)
  To: igt-dev

Hi Zbigniew,

On 2023-09-05 at 15:33:01 +0200, Zbigniew Kempczyński wrote:
> While adding xe-compute support for DG2 I hit some issues in the Xe
> driver, so instead of limiting the workload to Xe only I decided to
> handle i915 as well. This approach may be handy for comparing feature
> status between the two drivers.
> 
> This patch is a preparation step for sharing the code between i915 and Xe.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> ---
>  lib/{xe/xe_compute.c => intel_compute.c}       | 18 +++++++++---------
>  lib/{xe/xe_compute.h => intel_compute.h}       | 12 ++++++------
>  ...ernels.c => intel_compute_square_kernels.c} |  4 ++--
>  lib/meson.build                                |  4 ++--
>  tests/intel/xe_compute.c                       |  4 ++--
>  5 files changed, 21 insertions(+), 21 deletions(-)
>  rename lib/{xe/xe_compute.c => intel_compute.c} (97%)
>  rename lib/{xe/xe_compute.h => intel_compute.h} (74%)
>  rename lib/{xe/xe_compute_square_kernels.c => intel_compute_square_kernels.c} (97%)
> 
> diff --git a/lib/xe/xe_compute.c b/lib/intel_compute.c
> similarity index 97%
> rename from lib/xe/xe_compute.c
> rename to lib/intel_compute.c
> index 3e8112a048..647bce0e43 100644
> --- a/lib/xe/xe_compute.c
> +++ b/lib/intel_compute.c
> @@ -13,7 +13,7 @@
>  #include "lib/igt_syncobj.h"
>  #include "lib/intel_reg.h"
>  
> -#include "xe_compute.h"
> +#include "intel_compute.h"

Sort alphabetically.

>  #include "xe/xe_ioctl.h"
>  #include "xe/xe_query.h"
>  
> @@ -453,24 +453,24 @@ static const struct {
>  	unsigned int ip_ver;
>  	void (*compute_exec)(int fd, const unsigned char *kernel,
>  			     unsigned int size);
> -} xe_compute_batches[] = {
> +} compute_batches[] = {
>  	{
>  		.ip_ver = IP_VER(12, 0),
>  		.compute_exec = tgl_compute_exec,
>  	},
>  };
>  
> -bool run_xe_compute_kernel(int fd)
> +bool run_compute_kernel(int fd)
>  {
>  	unsigned int ip_ver = intel_graphics_ver(intel_get_drm_devid(fd));
>  	unsigned int batch;
> -	const struct xe_compute_kernels *kernels = xe_compute_square_kernels;
> +	const struct compute_kernels *kernels = compute_square_kernels;
>  
> -	for (batch = 0; batch < ARRAY_SIZE(xe_compute_batches); batch++) {
> -		if (ip_ver == xe_compute_batches[batch].ip_ver)
> +	for (batch = 0; batch < ARRAY_SIZE(compute_batches); batch++) {
> +		if (ip_ver == compute_batches[batch].ip_ver)
>  			break;
>  	}
> -	if (batch == ARRAY_SIZE(xe_compute_batches))
> +	if (batch == ARRAY_SIZE(compute_batches))
>  		return false;
>  
>  	while (kernels->kernel) {
> @@ -481,8 +481,8 @@ bool run_xe_compute_kernel(int fd)
>  	if (!kernels->kernel)
>  		return 1;
>  
> -	xe_compute_batches[batch].compute_exec(fd, kernels->kernel,
> -					       kernels->size);
> +	compute_batches[batch].compute_exec(fd, kernels->kernel,
> +					    kernels->size);
>  
>  	return true;
>  }
> diff --git a/lib/xe/xe_compute.h b/lib/intel_compute.h
> similarity index 74%
> rename from lib/xe/xe_compute.h
> rename to lib/intel_compute.h
> index b2e7e98278..e271bb5254 100644
> --- a/lib/xe/xe_compute.h
> +++ b/lib/intel_compute.h
> @@ -6,8 +6,8 @@
>   *    Francois Dugast <francois.dugast@intel.com>
>   */
>  
> -#ifndef XE_COMPUTE_H
> -#define XE_COMPUTE_H
> +#ifndef INTEL_COMPUTE_H
> +#define INTEL_COMPUTE_H
>  
>  /*
>   * OpenCL Kernels are generated using:
> @@ -19,14 +19,14 @@
>   * For each GPU model desired. A list of supported models can be obtained with: ocloc compile --help
>   */
>  
> -struct xe_compute_kernels {
> +struct compute_kernels {

imho better:
struct intel_compute_kernels {

Regards,
Kamil

>  	int ip_ver;
>  	unsigned int size;
>  	const unsigned char *kernel;
>  };
>  
> -extern const struct xe_compute_kernels xe_compute_square_kernels[];
> +extern const struct compute_kernels compute_square_kernels[];
>  
> -bool run_xe_compute_kernel(int fd);
> +bool run_compute_kernel(int fd);
>  
> -#endif	/* XE_COMPUTE_H */
> +#endif	/* INTEL_COMPUTE_H */
> diff --git a/lib/xe/xe_compute_square_kernels.c b/lib/intel_compute_square_kernels.c
> similarity index 97%
> rename from lib/xe/xe_compute_square_kernels.c
> rename to lib/intel_compute_square_kernels.c
> index f9c07dc778..b30d8a23dd 100644
> --- a/lib/xe/xe_compute_square_kernels.c
> +++ b/lib/intel_compute_square_kernels.c
> @@ -8,7 +8,7 @@
>   */
>  
>  #include "intel_chipset.h"
> -#include "lib/xe/xe_compute.h"
> +#include "lib/intel_compute.h"
>  
>  static const unsigned char tgllp_kernel_square_bin[] = {
>  	0x61, 0x00, 0x03, 0x80, 0x20, 0x02, 0x05, 0x03, 0x04, 0x00, 0x10, 0x00,
> @@ -61,7 +61,7 @@ static const unsigned char tgllp_kernel_square_bin[] = {
>  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>  };
>  
> -const struct xe_compute_kernels xe_compute_square_kernels[] = {
> +const struct compute_kernels compute_square_kernels[] = {
>  	{
>  		.ip_ver = IP_VER(12, 0),
>  		.size = sizeof(tgllp_kernel_square_bin),
> diff --git a/lib/meson.build b/lib/meson.build
> index 21ea9d5ac4..a45f7d677f 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -58,6 +58,8 @@ lib_sources = [
>  	'intel_bufops.c',
>  	'intel_chipset.c',
>  	'intel_cmds_info.c',
> +	'intel_compute.c',
> +	'intel_compute_square_kernels.c',
>  	'intel_ctx.c',
>  	'intel_device_info.c',
>  	'intel_mmio.c',
> @@ -103,8 +105,6 @@ lib_sources = [
>  	'veboxcopy_gen12.c',
>  	'igt_msm.c',
>  	'igt_dsc.c',
> -	'xe/xe_compute.c',
> -	'xe/xe_compute_square_kernels.c',
>  	'xe/xe_gt.c',
>  	'xe/xe_ioctl.c',
>  	'xe/xe_query.c',
> diff --git a/tests/intel/xe_compute.c b/tests/intel/xe_compute.c
> index 2cf536701a..0c54fbec42 100644
> --- a/tests/intel/xe_compute.c
> +++ b/tests/intel/xe_compute.c
> @@ -14,8 +14,8 @@
>  #include <string.h>
>  
>  #include "igt.h"
> +#include "intel_compute.h"
>  #include "xe/xe_query.h"
> -#include "xe/xe_compute.h"
>  
>  /**
>   * SUBTEST: compute-square
> @@ -29,7 +29,7 @@
>  static void
>  test_compute_square(int fd)
>  {
> -	igt_require_f(run_xe_compute_kernel(fd), "GPU not supported\n");
> +	igt_require_f(run_compute_kernel(fd), "GPU not supported\n");
>  }
>  
>  igt_main
> -- 
> 2.34.1
> 

* Re: [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute Zbigniew Kempczyński
@ 2023-09-08  9:03   ` Francois Dugast
  2023-09-08 11:07     ` Zbigniew Kempczyński
  2023-09-08 11:29     ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08  9:03 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:02PM +0200, Zbigniew Kempczyński wrote:
> Allow compute tests to be selectively turned on/off on both the i915
> and xe drivers.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> ---
>  lib/intel_compute.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> index 647bce0e43..dd9f686d0c 100644
> --- a/lib/intel_compute.c
> +++ b/lib/intel_compute.c
> @@ -446,17 +446,27 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
>  }
>  
>  /*
> - * Generic code
> + * Compatibility flags.
> + *
> + * There will be some time period in which both drivers (i915 and xe)
> + * will support compute runtime tests. Let's define compat flags to allow
> + * the code to be shared between two drivers allowing disabling this in
> + * the future.
>   */
> +#define COMPAT_FLAG(f) (1 << (f))
> +#define COMPAT_I915 COMPAT_FLAG(INTEL_DRIVER_I915)
> +#define COMPAT_XE   COMPAT_FLAG(INTEL_DRIVER_XE)

This approach solves the issue of unsupported combinations of
platforms and drivers. I cannot think of anything better, but my concern
is that there could be some confusion if the test is skipped on a platform
that appears to be supported because it is listed in compute_batches.

s/COMPAT_I915/COMPAT_DRIVER_I915/ and s/COMPAT_XE/COMPAT_DRIVER_XE/
would help make this unambiguous.

It would be good to get an ack from someone else on the compat flag
approach, Mauro maybe?

Francois

>  
>  static const struct {
>  	unsigned int ip_ver;
>  	void (*compute_exec)(int fd, const unsigned char *kernel,
>  			     unsigned int size);
> +	uint32_t compat;
>  } compute_batches[] = {
>  	{
>  		.ip_ver = IP_VER(12, 0),
>  		.compute_exec = tgl_compute_exec,
> +		.compat = COMPAT_I915 | COMPAT_XE,
>  	},
>  };
>  
> @@ -465,6 +475,7 @@ bool run_compute_kernel(int fd)
>  	unsigned int ip_ver = intel_graphics_ver(intel_get_drm_devid(fd));
>  	unsigned int batch;
>  	const struct compute_kernels *kernels = compute_square_kernels;
> +	enum intel_driver driver = get_intel_driver(fd);
>  
>  	for (batch = 0; batch < ARRAY_SIZE(compute_batches); batch++) {
>  		if (ip_ver == compute_batches[batch].ip_ver)
> @@ -473,6 +484,12 @@ bool run_compute_kernel(int fd)
>  	if (batch == ARRAY_SIZE(compute_batches))
>  		return false;
>  
> +	if (!(COMPAT_FLAG(driver) & compute_batches[batch].compat)) {
> +		igt_debug("driver flag: %x\n", COMPAT_FLAG(driver));
> +		igt_debug("compat flag: %x\n", compute_batches[batch].compat);
> +		return false;
> +	}
> +
>  	while (kernels->kernel) {
>  		if (ip_ver == kernels->ip_ver)
>  			break;
> -- 
> 2.34.1
> 
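
For readers following the compat flag discussion in this reply, here is a
minimal standalone sketch of the idea, not the IGT code itself. It assumes
a hypothetical enum sketch_driver in place of IGT's enum intel_driver, uses
the COMPAT_DRIVER_* names suggested in the review, and fills the table with
made-up ip_ver values. It also returns a distinct result for "platform not
listed" and "driver not enabled for this platform", which is one possible
way to reduce the confusion raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IGT's enum intel_driver; real values may differ. */
enum sketch_driver {
	SKETCH_DRIVER_I915 = 1,
	SKETCH_DRIVER_XE,
};

/* One bit per driver, named as suggested in the review. */
#define COMPAT_DRIVER_FLAG(f)	(1u << (f))
#define COMPAT_DRIVER_I915	COMPAT_DRIVER_FLAG(SKETCH_DRIVER_I915)
#define COMPAT_DRIVER_XE	COMPAT_DRIVER_FLAG(SKETCH_DRIVER_XE)

struct sketch_batch {
	unsigned int ip_ver;
	uint32_t compat;
};

/* Illustrative table only; the ip_ver numbers are invented for this sketch. */
static const struct sketch_batch sketch_batches[] = {
	{ .ip_ver = 1200, .compat = COMPAT_DRIVER_I915 | COMPAT_DRIVER_XE },
	{ .ip_ver = 1255, .compat = COMPAT_DRIVER_I915 },
};

enum sketch_result {
	SKETCH_OK,		/* platform listed and driver enabled */
	SKETCH_NO_PLATFORM,	/* platform not in the table */
	SKETCH_NO_DRIVER,	/* platform listed, driver bit not set */
};

static enum sketch_result sketch_lookup(unsigned int ip_ver,
					enum sketch_driver driver)
{
	size_t n = sizeof(sketch_batches) / sizeof(sketch_batches[0]);
	size_t i;

	/* The flag is a single bit in a 32-bit mask. */
	assert((unsigned int)driver < 32);

	for (i = 0; i < n; i++)
		if (sketch_batches[i].ip_ver == ip_ver)
			break;
	if (i == n)
		return SKETCH_NO_PLATFORM;

	if (!(COMPAT_DRIVER_FLAG(driver) & sketch_batches[i].compat))
		return SKETCH_NO_DRIVER;

	return SKETCH_OK;
}
```

A caller could map SKETCH_NO_DRIVER to a skip message that names the driver,
so a skip on a listed platform is not mistaken for a missing pipeline.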

* Re: [igt-dev] [PATCH i-g-t v2 3/9] lib/intel_compute: Reorganize the code for i915 version preparation
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 3/9] lib/intel_compute: Reorganize the code for i915 version preparation Zbigniew Kempczyński
@ 2023-09-08  9:05   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08  9:05 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:03PM +0200, Zbigniew Kempczyński wrote:
> There's common code in compute pipeline creation, so it's worth
> extracting it into dedicated functions.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  lib/intel_compute.c | 135 ++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 110 insertions(+), 25 deletions(-)
> 
> diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> index dd9f686d0c..b42f3eca0e 100644
> --- a/lib/intel_compute.c
> +++ b/lib/intel_compute.c
> @@ -39,6 +39,95 @@ struct bo_dict_entry {
>  	void *data;
>  };
>  
> +struct bo_execenv {
> +	int fd;
> +	enum intel_driver driver;
> +
> +	/* Xe part */
> +	uint32_t vm;
> +	uint32_t exec_queue;
> +};
> +
> +static void bo_execenv_create(int fd, struct bo_execenv *execenv)
> +{
> +	igt_assert(execenv);
> +
> +	memset(execenv, 0, sizeof(*execenv));
> +	execenv->fd = fd;
> +	execenv->driver = get_intel_driver(fd);
> +
> +	if (execenv->driver == INTEL_DRIVER_XE) {
> +		execenv->vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> +		execenv->exec_queue = xe_exec_queue_create_class(fd, execenv->vm,
> +								 DRM_XE_ENGINE_CLASS_RENDER);
> +	}
> +}
> +
> +static void bo_execenv_destroy(struct bo_execenv *execenv)
> +{
> +	igt_assert(execenv);
> +
> +	if (execenv->driver == INTEL_DRIVER_XE) {
> +		xe_vm_destroy(execenv->fd, execenv->vm);
> +		xe_exec_queue_destroy(execenv->fd, execenv->exec_queue);
> +	}
> +}
> +
> +static void bo_execenv_bind(struct bo_execenv *execenv,
> +			    struct bo_dict_entry *bo_dict, int entries)
> +{
> +	int fd = execenv->fd;
> +
> +	if (execenv->driver == INTEL_DRIVER_XE) {
> +		uint32_t vm = execenv->vm;
> +		uint64_t alignment = xe_get_default_alignment(fd);
> +		struct drm_xe_sync sync = { 0 };
> +
> +		sync.flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL;
> +		sync.handle = syncobj_create(fd, 0);
> +
> +		for (int i = 0; i < entries; i++) {
> +			bo_dict[i].data = aligned_alloc(alignment, bo_dict[i].size);
> +			xe_vm_bind_userptr_async(fd, vm, 0, to_user_pointer(bo_dict[i].data),
> +						 bo_dict[i].addr, bo_dict[i].size, &sync, 1);
> +			syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
> +			memset(bo_dict[i].data, 0, bo_dict[i].size);
> +		}
> +
> +		syncobj_destroy(fd, sync.handle);
> +	}
> +}
> +
> +static void bo_execenv_unbind(struct bo_execenv *execenv,
> +			      struct bo_dict_entry *bo_dict, int entries)
> +{
> +	int fd = execenv->fd;
> +
> +	if (execenv->driver == INTEL_DRIVER_XE) {
> +		uint32_t vm = execenv->vm;
> +		struct drm_xe_sync sync = { 0 };
> +
> +		sync.flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL;
> +		sync.handle = syncobj_create(fd, 0);
> +
> +		for (int i = 0; i < entries; i++) {
> +			xe_vm_unbind_async(fd, vm, 0, 0, bo_dict[i].addr, bo_dict[i].size, &sync, 1);
> +			syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
> +			free(bo_dict[i].data);
> +		}
> +
> +		syncobj_destroy(fd, sync.handle);
> +	}
> +}
> +
> +static void bo_execenv_exec(struct bo_execenv *execenv, uint64_t start_addr)
> +{
> +	int fd = execenv->fd;
> +
> +	if (execenv->driver == INTEL_DRIVER_XE)
> +		xe_exec_wait(fd, execenv->exec_queue, start_addr);
> +}
> +
>  /*
>   * TGL compatible batch
>   */
> @@ -389,9 +478,6 @@ static void tgllp_compute_exec_compute(uint32_t *addr_bo_buffer_batch,
>  static void tgl_compute_exec(int fd, const unsigned char *kernel,
>  			     unsigned int size)
>  {
> -	uint32_t vm, exec_queue;
> -	float *dinput;
> -	struct drm_xe_sync sync = { 0 };
>  #define TGL_BO_DICT_ENTRIES 7
>  	struct bo_dict_entry bo_dict[TGL_BO_DICT_ENTRIES] = {
>  		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_KERNEL}, // kernel
> @@ -402,47 +488,46 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
>  		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT }, // output
>  		{ .addr = ADDR_BATCH, .size = SIZE_BATCH }, // batch
>  	};
> +	struct bo_execenv execenv;
> +	float *dinput;
> +
> +	bo_execenv_create(fd, &execenv);
>  
>  	/* Sets Kernel size */
>  	bo_dict[0].size = ALIGN(size, 0x1000);
>  
> -	vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> -	exec_queue = xe_exec_queue_create_class(fd, vm, DRM_XE_ENGINE_CLASS_RENDER);
> -	sync.flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL;
> -	sync.handle = syncobj_create(fd, 0);
> +	bo_execenv_bind(&execenv, bo_dict, TGL_BO_DICT_ENTRIES);
>  
> -	for (int i = 0; i < TGL_BO_DICT_ENTRIES; i++) {
> -		bo_dict[i].data = aligned_alloc(xe_get_default_alignment(fd), bo_dict[i].size);
> -		xe_vm_bind_userptr_async(fd, vm, 0, to_user_pointer(bo_dict[i].data), bo_dict[i].addr, bo_dict[i].size, &sync, 1);
> -		syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
> -		memset(bo_dict[i].data, 0, bo_dict[i].size);
> -	}
>  	memcpy(bo_dict[0].data, kernel, size);
>  	tgllp_create_dynamic_state(bo_dict[1].data, OFFSET_KERNEL);
>  	tgllp_create_surface_state(bo_dict[2].data, ADDR_INPUT, ADDR_OUTPUT);
>  	tgllp_create_indirect_data(bo_dict[3].data, ADDR_INPUT, ADDR_OUTPUT);
> +
>  	dinput = (float *)bo_dict[4].data;
>  	srand(time(NULL));
> -
>  	for (int i = 0; i < SIZE_DATA; i++)
>  		((float *)dinput)[i] = rand() / (float)RAND_MAX;
>  
> -	tgllp_compute_exec_compute(bo_dict[6].data, ADDR_SURFACE_STATE_BASE, ADDR_DYNAMIC_STATE_BASE, ADDR_INDIRECT_OBJECT_BASE, OFFSET_INDIRECT_DATA_START);
> +	tgllp_compute_exec_compute(bo_dict[6].data,
> +				   ADDR_SURFACE_STATE_BASE,
> +				   ADDR_DYNAMIC_STATE_BASE,
> +				   ADDR_INDIRECT_OBJECT_BASE,
> +				   OFFSET_INDIRECT_DATA_START);
>  
> -	xe_exec_wait(fd, exec_queue, ADDR_BATCH);
> +	bo_execenv_exec(&execenv, ADDR_BATCH);
>  
> -	for (int i = 0; i < SIZE_DATA; i++)
> -		igt_assert(((float *)bo_dict[5].data)[i] == ((float *)bo_dict[4].data)[i] * ((float *) bo_dict[4].data)[i]);
> +	for (int i = 0; i < SIZE_DATA; i++) {
> +		float f1, f2;
>  
> -	for (int i = 0; i < TGL_BO_DICT_ENTRIES; i++) {
> -		xe_vm_unbind_async(fd, vm, 0, 0, bo_dict[i].addr, bo_dict[i].size, &sync, 1);
> -		syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
> -		free(bo_dict[i].data);
> +		f1 = ((float *) bo_dict[5].data)[i];
> +		f2 = ((float *) bo_dict[4].data)[i];
> +		if (f1 != f2 * f2)
> +			igt_debug("[%4d] f1: %f != %f\n", i, f1, f2 * f2);
> +		igt_assert(f1 == f2 * f2);
>  	}
>  
> -	syncobj_destroy(fd, sync.handle);
> -	xe_exec_queue_destroy(fd, exec_queue);
> -	xe_vm_destroy(fd, vm);
> +	bo_execenv_unbind(&execenv, bo_dict, TGL_BO_DICT_ENTRIES);
> +	bo_execenv_destroy(&execenv);
>  }
>  
>  /*
> -- 
> 2.34.1
> 

* Re: [igt-dev] [PATCH i-g-t v2 4/9] lib/intel_compute: Add name field for debugging purposes
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 4/9] lib/intel_compute: Add name field for debugging purposes Zbigniew Kempczyński
@ 2023-09-08  9:05   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08  9:05 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:04PM +0200, Zbigniew Kempczyński wrote:
> Debugging without knowledge about object characteristics is hard and
> time consuming. A simple name field, used when printing bound addresses
> and their sizes, might speed up development. I ran into this while
> extending to DG2, so I decided to add it permanently. To avoid
> annoying output, this is limited to igt_debug(), which prints
> only on user request or on test failure.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  lib/intel_compute.c | 33 ++++++++++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> index b42f3eca0e..a1e87ef46f 100644
> --- a/lib/intel_compute.c
> +++ b/lib/intel_compute.c
> @@ -37,6 +37,7 @@ struct bo_dict_entry {
>  	uint64_t addr;
>  	uint32_t size;
>  	void *data;
> +	const char *name;
>  };
>  
>  struct bo_execenv {
> @@ -92,6 +93,11 @@ static void bo_execenv_bind(struct bo_execenv *execenv,
>  						 bo_dict[i].addr, bo_dict[i].size, &sync, 1);
>  			syncobj_wait(fd, &sync.handle, 1, INT64_MAX, 0, NULL);
>  			memset(bo_dict[i].data, 0, bo_dict[i].size);
> +
> +			igt_debug("[i: %2d name: %20s] data: %p, addr: %16llx, size: %llx\n",
> +				  i, bo_dict[i].name, bo_dict[i].data,
> +				  (long long)bo_dict[i].addr,
> +				  (long long)bo_dict[i].size);
>  		}
>  
>  		syncobj_destroy(fd, sync.handle);
> @@ -480,13 +486,26 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
>  {
>  #define TGL_BO_DICT_ENTRIES 7
>  	struct bo_dict_entry bo_dict[TGL_BO_DICT_ENTRIES] = {
> -		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_KERNEL}, // kernel
> -		{ .addr = ADDR_DYNAMIC_STATE_BASE, .size =  0x1000}, // dynamic state
> -		{ .addr = ADDR_SURFACE_STATE_BASE, .size =  0x1000}, // surface state
> -		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_INDIRECT_DATA_START, .size =  0x10000}, // indirect data
> -		{ .addr = ADDR_INPUT, .size = SIZE_BUFFER_INPUT }, // input
> -		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT }, // output
> -		{ .addr = ADDR_BATCH, .size = SIZE_BATCH }, // batch
> +		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_KERNEL,
> +		  .name = "kernel" },
> +		{ .addr = ADDR_DYNAMIC_STATE_BASE,
> +		  .size =  0x1000,
> +		  .name = "dynamic state base" },
> +		{ .addr = ADDR_SURFACE_STATE_BASE,
> +		  .size =  0x1000,
> +		  .name = "surface state base" },
> +		{ .addr = ADDR_INDIRECT_OBJECT_BASE + OFFSET_INDIRECT_DATA_START,
> +		  .size =  0x10000,
> +		  .name = "indirect data start" },
> +		{ .addr = ADDR_INPUT,
> +		  .size = SIZE_BUFFER_INPUT,
> +		  .name = "input" },
> +		{ .addr = ADDR_OUTPUT,
> +		  .size = SIZE_BUFFER_OUTPUT,
> +		  .name = "output" },
> +		{ .addr = ADDR_BATCH,
> +		  .size = SIZE_BATCH,
> +		  .name = "batch" },
>  	};
>  	struct bo_execenv execenv;
>  	float *dinput;
> -- 
> 2.34.1
> 
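
The idea behind the name field can be sketched outside IGT as follows. The struct mirrors the fields quoted in the patch, while `bo_dict_describe()` is a hypothetical helper that formats into a buffer instead of calling `igt_debug()`. One deliberate difference, flagged as an assumption about intent: the casts use `unsigned long long` to match the `%llx` conversions, whereas the patch casts to `long long`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the fields quoted in the patch; the helper below is hypothetical. */
struct bo_dict_entry {
	uint64_t addr;
	uint32_t size;
	void *data;
	const char *name;
};

/*
 * Format one entry in the same layout as the patch's igt_debug() call,
 * but into a caller-supplied buffer. Returns what snprintf() returns:
 * the number of characters the full string needs, excluding the NUL.
 */
static int bo_dict_describe(char *buf, size_t len, int i,
			    const struct bo_dict_entry *e)
{
	assert(buf && e);

	return snprintf(buf, len,
			"[i: %2d name: %20s] data: %p, addr: %16llx, size: %llx",
			i, e->name ? e->name : "(unnamed)", e->data,
			(unsigned long long)e->addr,
			(unsigned long long)e->size);
}
```

Falling back to "(unnamed)" is an addition of this sketch; it guards entries whose `.name` was left unset, since passing NULL to `%s` is undefined.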

* Re: [igt-dev] [PATCH i-g-t v2 5/9] lib/intel_compute: Add i915 path in compute library
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 5/9] lib/intel_compute: Add i915 path in compute library Zbigniew Kempczyński
@ 2023-09-08  9:13   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08  9:13 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:05PM +0200, Zbigniew Kempczyński wrote:
> Add the code required to run a compute workload on i915.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  lib/intel_compute.c | 50 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> index a1e87ef46f..4344844825 100644
> --- a/lib/intel_compute.c
> +++ b/lib/intel_compute.c
> @@ -8,6 +8,7 @@
>  
>  #include <stdint.h>
>  
> +#include "i915/gem_create.h"
>  #include "igt.h"
>  #include "xe_drm.h"
>  #include "lib/igt_syncobj.h"
> @@ -38,6 +39,7 @@ struct bo_dict_entry {
>  	uint32_t size;
>  	void *data;
>  	const char *name;
> +	uint32_t handle;
>  };
>  
>  struct bo_execenv {
> @@ -47,6 +49,10 @@ struct bo_execenv {
>  	/* Xe part */
>  	uint32_t vm;
>  	uint32_t exec_queue;
> +
> +	/* i915 part */
> +	struct drm_i915_gem_execbuffer2 execbuf;
> +	struct drm_i915_gem_exec_object2 *obj;
>  };
>  
>  static void bo_execenv_create(int fd, struct bo_execenv *execenv)
> @@ -101,6 +107,33 @@ static void bo_execenv_bind(struct bo_execenv *execenv,
>  		}
>  
>  		syncobj_destroy(fd, sync.handle);
> +	} else {
> +		struct drm_i915_gem_execbuffer2 *execbuf = &execenv->execbuf;
> +		struct drm_i915_gem_exec_object2 *obj;
> +
> +		obj = calloc(entries, sizeof(*obj));
> +		execenv->obj = obj;
> +
> +		for (int i = 0; i < entries; i++) {
> +			bo_dict[i].handle = gem_create(fd, bo_dict[i].size);
> +			bo_dict[i].data = gem_mmap__device_coherent(fd, bo_dict[i].handle,
> +								    0, bo_dict[i].size,
> +								    PROT_READ | PROT_WRITE);
> +			igt_debug("[i: %2d name: %20s] handle: %u, data: %p, addr: %16llx, size: %llx\n",
> +				  i, bo_dict[i].name,
> +				  bo_dict[i].handle, bo_dict[i].data,
> +				  (long long)bo_dict[i].addr,
> +				  (long long)bo_dict[i].size);
> +
> +			obj[i].handle = bo_dict[i].handle;
> +			obj[i].offset = CANONICAL(bo_dict[i].addr);
> +			obj[i].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +			if (bo_dict[i].addr == ADDR_OUTPUT)
> +				obj[i].flags |= EXEC_OBJECT_WRITE;
> +		}
> +
> +		execbuf->buffers_ptr = to_user_pointer(obj);
> +		execbuf->buffer_count = entries;
>  	}
>  }
>  
> @@ -123,6 +156,12 @@ static void bo_execenv_unbind(struct bo_execenv *execenv,
>  		}
>  
>  		syncobj_destroy(fd, sync.handle);
> +	} else {
> +		for (int i = 0; i < entries; i++) {
> +			gem_close(fd, bo_dict[i].handle);
> +			munmap(bo_dict[i].data, bo_dict[i].size);
> +		}
> +		free(execenv->obj);
>  	}
>  }
>  
> @@ -130,8 +169,17 @@ static void bo_execenv_exec(struct bo_execenv *execenv, uint64_t start_addr)
>  {
>  	int fd = execenv->fd;
>  
> -	if (execenv->driver == INTEL_DRIVER_XE)
> +	if (execenv->driver == INTEL_DRIVER_XE) {
>  		xe_exec_wait(fd, execenv->exec_queue, start_addr);
> +	} else {
> +		struct drm_i915_gem_execbuffer2 *execbuf = &execenv->execbuf;
> +		struct drm_i915_gem_exec_object2 *obj = execenv->obj;
> +		int num_objects = execbuf->buffer_count;
> +
> +		execbuf->flags = I915_EXEC_RENDER;
> +		gem_execbuf(fd, execbuf);
> +		gem_sync(fd, obj[num_objects - 1].handle); /* batch handle */
> +	}
>  }
>  
>  /*
> -- 
> 2.34.1
> 
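
In the i915 path above, each object is pinned at `CANONICAL(bo_dict[i].addr)` with `EXEC_OBJECT_SUPPORTS_48B_ADDRESS` set. The sketch below illustrates what a canonical 48-bit address looks like, assuming `CANONICAL()` sign-extends from bit 47; `canonical48()` and `HIGH_ADDRESS_BIT` are hypothetical names for illustration, not the IGT definitions.

```c
#include <assert.h>
#include <stdint.h>

#define HIGH_ADDRESS_BIT 47 /* assumption: 48-bit GPU virtual addresses */

/* Hypothetical stand-in for CANONICAL(): copy bit 47 into bits 48..63. */
static uint64_t canonical48(uint64_t addr)
{
	const uint64_t sign = UINT64_C(1) << HIGH_ADDRESS_BIT;
	const uint64_t low = addr & ((sign << 1) - 1);

	/* (low ^ sign) - sign sign-extends from bit 47 in unsigned math. */
	return (low ^ sign) - sign;
}
```

For the low addresses used in this series (ADDR_BATCH, ADDR_INPUT, ADDR_OUTPUT and so on) bit 47 is clear, so under this assumption the conversion leaves them unchanged; it only matters for offsets in the upper half of the 48-bit range.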

* Re: [igt-dev] [PATCH i-g-t v2 6/9] intel/gem_compute: Add test which runs compute workload on i915
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 6/9] intel/gem_compute: Add test which runs compute workload on i915 Zbigniew Kempczyński
@ 2023-09-08  9:15   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08  9:15 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:06PM +0200, Zbigniew Kempczyński wrote:
> This test is a verbatim copy of xe_compute except for the driver open
> call (it opens an i915 drm fd instead of an xe one). Technically it is
> possible to create a single test (open would try DRIVER_INTEL | DRIVER_XE),
> but I decided against that to keep the i915 and xe versions distinct.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  tests/intel/gem_compute.c | 46 +++++++++++++++++++++++++++++++++++++++
>  tests/meson.build         |  1 +
>  2 files changed, 47 insertions(+)
>  create mode 100644 tests/intel/gem_compute.c
> 
> diff --git a/tests/intel/gem_compute.c b/tests/intel/gem_compute.c
> new file mode 100644
> index 0000000000..b408efee16
> --- /dev/null
> +++ b/tests/intel/gem_compute.c
> @@ -0,0 +1,46 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +/**
> + * TEST: Check compute-related functionality
> + * Category: Hardware building block
> + * Sub-category: compute
> + * Test category: functionality test
> + * Run type: BAT
> + */
> +
> +#include <string.h>
> +
> +#include "igt.h"
> +#include "intel_compute.h"
> +
> +/**
> + * SUBTEST: compute-square
> + * GPU requirement: only works on TGL
> + * Description:
> + *	Run an openCL Kernel that returns output[i] = input[i] * input[i],
> + *	for an input dataset.
> + * Functionality: compute openCL kernel
> + * TODO: extend test to cover other platforms
> + */
> +static void
> +test_compute_square(int fd)
> +{
> +	igt_require_f(run_compute_kernel(fd), "GPU not supported\n");
> +}
> +
> +igt_main
> +{
> +	int i915;
> +
> +	igt_fixture
> +		i915 = drm_open_driver(DRIVER_INTEL);
> +
> +	igt_subtest("compute-square")
> +		test_compute_square(i915);
> +
> +	igt_fixture
> +		drm_close_driver(i915);
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index aa8e3434ce..03bb7785c3 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -105,6 +105,7 @@ intel_i915_progs = [
>  	'gem_ccs',
>  	'gem_close',
>  	'gem_close_race',
> +	'gem_compute',
>  	'gem_concurrent_blit',
>  	'gem_cs_tlb',
>  	'gem_ctx_bad_destroy',
> -- 
> 2.34.1
> 

* Re: [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute
  2023-09-08  9:03   ` Francois Dugast
@ 2023-09-08 11:07     ` Zbigniew Kempczyński
  2023-09-08 11:29     ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-09-08 11:07 UTC (permalink / raw)
  To: Francois Dugast; +Cc: igt-dev

On Fri, Sep 08, 2023 at 11:03:14AM +0200, Francois Dugast wrote:
> On Tue, Sep 05, 2023 at 03:33:02PM +0200, Zbigniew Kempczyński wrote:
> > Allow selectively turning compute tests on/off on both i915 and xe
> > drivers.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> > Cc: Francois Dugast <francois.dugast@intel.com>
> > Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> > ---
> >  lib/intel_compute.c | 19 ++++++++++++++++++-
> >  1 file changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> > index 647bce0e43..dd9f686d0c 100644
> > --- a/lib/intel_compute.c
> > +++ b/lib/intel_compute.c
> > @@ -446,17 +446,27 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
> >  }
> >  
> >  /*
> > - * Generic code
> > + * Compatibility flags.
> > + *
> > + * There will be some time period in which both drivers (i915 and xe)
> > + * will support compute runtime tests. Lets define compat flags to allow
> > + * the code to be shared between two drivers allowing disabling this in
> > + * the future.
> >   */
> > +#define COMPAT_FLAG(f) (1 << (f))
> > +#define COMPAT_I915 COMPAT_FLAG(INTEL_DRIVER_I915)
> > +#define COMPAT_XE   COMPAT_FLAG(INTEL_DRIVER_XE)
> 
> This approach allows solving the issue of unsupported combinations of
> platforms and drivers. I cannot think of something better but my concern
> is there could be some confusion if the test is skipped on a platform
> that appears to be supported because it is listed in the compute_batches.

I think it is obvious: if a platform supports the workload on one driver
but not on the other, it is a software (driver) issue. By "issue" I mean
that we're in a transition state between i915 and Xe, so some features
will be common and others will be exclusive to Xe. As it is an
open-source driver, we might never know whether someone will enable a
feature which wasn't officially supported by us. I tried to keep the code
as common as possible for both drivers, leaving room for such situations.

> 
> s/COMPAT_I915/COMPAT_DRIVER_I915/ and s/COMPAT_XE/COMPAT_DRIVER_XE/ can
> help make it not ambiguous.

OK, makes sense to me.

> 
> It would be good to get ack from someone else on the compat flag
> approach, Mauro maybe?
> 
> Francois
> 

Thank you for the review.

--
Zbigniew

> >  
> >  static const struct {
> >  	unsigned int ip_ver;
> >  	void (*compute_exec)(int fd, const unsigned char *kernel,
> >  			     unsigned int size);
> > +	uint32_t compat;
> >  } compute_batches[] = {
> >  	{
> >  		.ip_ver = IP_VER(12, 0),
> >  		.compute_exec = tgl_compute_exec,
> > +		.compat = COMPAT_I915 | COMPAT_XE,
> >  	},
> >  };
> >  
> > @@ -465,6 +475,7 @@ bool run_compute_kernel(int fd)
> >  	unsigned int ip_ver = intel_graphics_ver(intel_get_drm_devid(fd));
> >  	unsigned int batch;
> >  	const struct compute_kernels *kernels = compute_square_kernels;
> > +	enum intel_driver driver = get_intel_driver(fd);
> >  
> >  	for (batch = 0; batch < ARRAY_SIZE(compute_batches); batch++) {
> >  		if (ip_ver == compute_batches[batch].ip_ver)
> > @@ -473,6 +484,12 @@ bool run_compute_kernel(int fd)
> >  	if (batch == ARRAY_SIZE(compute_batches))
> >  		return false;
> >  
> > +	if (!(COMPAT_FLAG(driver) & compute_batches[batch].compat)) {
> > +		igt_debug("driver flag: %x\n", COMPAT_FLAG(driver));
> > +		igt_debug("compat flag: %x\n", compute_batches[batch].compat);
> > +		return false;
> > +	}
> > +
> >  	while (kernels->kernel) {
> >  		if (ip_ver == kernels->ip_ver)
> >  			break;
> > -- 
> > 2.34.1
> > 

* Re: [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute
  2023-09-08  9:03   ` Francois Dugast
  2023-09-08 11:07     ` Zbigniew Kempczyński
@ 2023-09-08 11:29     ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 22+ messages in thread
From: Mauro Carvalho Chehab @ 2023-09-08 11:29 UTC (permalink / raw)
  To: Francois Dugast; +Cc: igt-dev

On Fri, 8 Sep 2023 11:03:14 +0200
Francois Dugast <francois.dugast@intel.com> wrote:

> On Tue, Sep 05, 2023 at 03:33:02PM +0200, Zbigniew Kempczyński wrote:
> > Allow selectively turning compute tests on/off on both i915 and xe
> > drivers.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> > Cc: Francois Dugast <francois.dugast@intel.com>
> > Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> > ---
> >  lib/intel_compute.c | 19 ++++++++++++++++++-
> >  1 file changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> > index 647bce0e43..dd9f686d0c 100644
> > --- a/lib/intel_compute.c
> > +++ b/lib/intel_compute.c
> > @@ -446,17 +446,27 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
> >  }
> >  
> >  /*
> > - * Generic code
> > + * Compatibility flags.
> > + *
> > + * There will be some time period in which both drivers (i915 and xe)
> > + * will support compute runtime tests. Lets define compat flags to allow
> > + * the code to be shared between two drivers allowing disabling this in
> > + * the future.
> >   */
> > +#define COMPAT_FLAG(f) (1 << (f))
> > +#define COMPAT_I915 COMPAT_FLAG(INTEL_DRIVER_I915)
> > +#define COMPAT_XE   COMPAT_FLAG(INTEL_DRIVER_XE)  
> 
> This approach allows solving the issue of unsupported combinations of
> platforms and drivers. I cannot think of something better but my concern
> is there could be some confusion if the test is skipped on a platform
> that appears to be supported because it is listed in the compute_batches.
> 
> s/COMPAT_I915/COMPAT_DRIVER_I915/ and s/COMPAT_XE/COMPAT_DRIVER_XE/ can
> help make it not ambiguous.
> 
> It would be good to get ack from someone else on the compat flag
> approach, Mauro maybe?

This approach looks sane to me: if we add support for graphics
versions < 12, it will only be supported by the i915 driver. The same
will be true in the future when we add support for newer generations,
as it is unlikely that those will be added for both drivers.

Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
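
The compat-flag lookup discussed in this subthread can be sketched as a standalone table check. This sketch uses the COMPAT_DRIVER_* names Francois suggested. The enum values, the numeric `ip_ver` placeholders (standing in for `IP_VER()` results) and `compute_supported()` are assumptions made for illustration; the two table rows follow the entries quoted in the series (12.0 on both drivers, 12.55 on i915 only).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: the real enum values live in IGT and may differ. */
enum intel_driver { INTEL_DRIVER_I915 = 1, INTEL_DRIVER_XE };

#define COMPAT_FLAG(f)     (1u << (f))
#define COMPAT_DRIVER_I915 COMPAT_FLAG(INTEL_DRIVER_I915)
#define COMPAT_DRIVER_XE   COMPAT_FLAG(INTEL_DRIVER_XE)

struct compute_batch {
	unsigned int ip_ver;	/* placeholder for an IP_VER() value */
	uint32_t compat;	/* drivers on which this pipeline may run */
};

static const struct compute_batch batches[] = {
	{ .ip_ver = 1200, .compat = COMPAT_DRIVER_I915 | COMPAT_DRIVER_XE },
	{ .ip_ver = 1255, .compat = COMPAT_DRIVER_I915 },
};

/* True only if the platform is listed and enabled for this driver. */
static bool compute_supported(unsigned int ip_ver, enum intel_driver driver)
{
	for (size_t i = 0; i < sizeof(batches) / sizeof(batches[0]); i++) {
		if (batches[i].ip_ver != ip_ver)
			continue;

		return (COMPAT_FLAG(driver) & batches[i].compat) != 0;
	}

	return false;
}
```

With this shape, a platform that is listed but disabled for one driver returns false from the same function as an unlisted platform, which is the source of the possible confusion Francois raised; the debug output in the patch is what distinguishes the two cases.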

* Re: [igt-dev] [PATCH i-g-t v2 7/9] lib/intel_compute: Add XeHP implementation of compute pipeline
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 7/9] lib/intel_compute: Add XeHP implementation of compute pipeline Zbigniew Kempczyński
@ 2023-09-08 13:55   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08 13:55 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:07PM +0200, Zbigniew Kempczyński wrote:
> Add a pipeline which runs the square compute workload on DG2.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  lib/intel_compute.c                | 287 ++++++++++++++++++++++++++++-
>  lib/intel_compute_square_kernels.c |  56 ++++++
>  2 files changed, 342 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> index 4344844825..29a5ec168f 100644
> --- a/lib/intel_compute.c
> +++ b/lib/intel_compute.c
> @@ -14,9 +14,12 @@
>  #include "lib/igt_syncobj.h"
>  #include "lib/intel_reg.h"
>  
> +#include "gen7_media.h"
> +#include "gen8_media.h"
>  #include "intel_compute.h"
>  #include "xe/xe_ioctl.h"
>  #include "xe/xe_query.h"
> +#include "xehp_media.h"
>  
>  #define PIPE_CONTROL			0x7a000004
>  #define MEDIA_STATE_FLUSH		0x0
> @@ -25,7 +28,7 @@
>  #define SIZE_BATCH			0x1000
>  #define SIZE_BUFFER_INPUT		MAX(sizeof(float) * SIZE_DATA, 0x1000)
>  #define SIZE_BUFFER_OUTPUT		MAX(sizeof(float) * SIZE_DATA, 0x1000)
> -#define ADDR_BATCH			0x100000
> +#define ADDR_BATCH			0x100000UL
>  #define ADDR_INPUT			0x200000UL
>  #define ADDR_OUTPUT			0x300000UL
>  #define ADDR_SURFACE_STATE_BASE		0x400000UL
> @@ -34,6 +37,10 @@
>  #define OFFSET_INDIRECT_DATA_START	0xFFFDF000
>  #define OFFSET_KERNEL			0xFFFEF000
>  
> +#define XEHP_ADDR_GENERAL_STATE_BASE		0x80000000UL
> +#define XEHP_ADDR_INSTRUCTION_STATE_BASE	0x90000000UL
> +#define XEHP_OFFSET_BINDING_TABLE		0x1000
> +
>  struct bo_dict_entry {
>  	uint64_t addr;
>  	uint32_t size;
> @@ -597,6 +604,279 @@ static void tgl_compute_exec(int fd, const unsigned char *kernel,
>  	bo_execenv_destroy(&execenv);
>  }
>  
> +static void xehp_create_indirect_data(uint32_t *addr_bo_buffer_batch,
> +				      uint64_t addr_input,
> +				      uint64_t addr_output)
> +{
> +	int b = 0;
> +
> +	addr_bo_buffer_batch[b++] = addr_input & 0xffffffff;
> +	addr_bo_buffer_batch[b++] = addr_input >> 32;
> +	addr_bo_buffer_batch[b++] = addr_output & 0xffffffff;
> +	addr_bo_buffer_batch[b++] = addr_output >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000400;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000400;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +}
> +
> +static void xehp_create_surface_state(uint32_t *addr_bo_buffer_batch,
> +				      uint64_t addr_input,
> +				      uint64_t addr_output)
> +{
> +	int b = 0;
> +
> +	addr_bo_buffer_batch[b++] = 0x87FDC000;
> +	addr_bo_buffer_batch[b++] = 0x06000000;
> +	addr_bo_buffer_batch[b++] = 0x001F007F;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00002000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = addr_input & 0xffffffff;
> +	addr_bo_buffer_batch[b++] = addr_input >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = 0x87FDC000;
> +	addr_bo_buffer_batch[b++] = 0x06000000;
> +	addr_bo_buffer_batch[b++] = 0x001F007F;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00002000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = addr_output & 0xffffffff;
> +	addr_bo_buffer_batch[b++] = addr_output >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = 0x00001000;
> +	addr_bo_buffer_batch[b++] = 0x00001040;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +}
> +
> +static void xehp_compute_exec_compute(uint32_t *addr_bo_buffer_batch,
> +				      uint64_t addr_general_state_base,
> +				      uint64_t addr_surface_state_base,
> +				      uint64_t addr_dynamic_state_base,
> +				      uint64_t addr_instruction_state_base,
> +				      uint64_t offset_indirect_data_start,
> +				      uint64_t kernel_start_pointer)
> +{
> +	int b = 0;
> +
> +	igt_debug("general   state base: %lx\n", addr_general_state_base);
> +	igt_debug("surface   state base: %lx\n", addr_surface_state_base);
> +	igt_debug("dynamic   state base: %lx\n", addr_dynamic_state_base);
> +	igt_debug("instruct   base addr: %lx\n", addr_instruction_state_base);
> +	igt_debug("bindless   base addr: %lx\n", addr_surface_state_base);
> +	igt_debug("offset indirect addr: %lx\n", offset_indirect_data_start);
> +	igt_debug("kernel start pointer: %lx\n", kernel_start_pointer);
> +
> +	addr_bo_buffer_batch[b++] = GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
> +				    PIPELINE_SELECT_GPGPU;
> +
> +	addr_bo_buffer_batch[b++] = XEHP_STATE_COMPUTE_MODE;
> +	addr_bo_buffer_batch[b++] = 0x80180010;
> +
> +	addr_bo_buffer_batch[b++] = XEHP_CFE_STATE;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x0c008800;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = MI_LOAD_REGISTER_IMM(1);
> +	addr_bo_buffer_batch[b++] = 0x00002580;
> +	addr_bo_buffer_batch[b++] = 0x00060002;
> +
> +	addr_bo_buffer_batch[b++] = STATE_BASE_ADDRESS | 0x14;
> +	addr_bo_buffer_batch[b++] = (addr_general_state_base & 0xffffffff) | 0x61;
> +	addr_bo_buffer_batch[b++] = addr_general_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x0106c000;
> +	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x61;
> +	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = (addr_dynamic_state_base & 0xffffffff) | 0x61;
> +	addr_bo_buffer_batch[b++] = addr_dynamic_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = (addr_instruction_state_base & 0xffffffff) | 0x61;
> +	addr_bo_buffer_batch[b++] = addr_instruction_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0xfffff001;
> +	addr_bo_buffer_batch[b++] = 0x00010001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0xfffff001;
> +	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x61;
> +	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00007fbf;
> +	addr_bo_buffer_batch[b++] = 0x00000061;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = GEN8_3DSTATE_BINDING_TABLE_POOL_ALLOC | 2;
> +	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x6;
> +	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00002000;
> +	addr_bo_buffer_batch[b++] = 0x001ff000;
> +
> +	addr_bo_buffer_batch[b++] = XEHP_COMPUTE_WALKER | 0x25;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000040;
> +	addr_bo_buffer_batch[b++] = offset_indirect_data_start;
> +	addr_bo_buffer_batch[b++] = 0xbe040000;
> +	addr_bo_buffer_batch[b++] = 0xffffffff;
> +	addr_bo_buffer_batch[b++] = 0x0000003f;
> +	addr_bo_buffer_batch[b++] = 0x00000010;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = kernel_start_pointer;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00180000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00001080;
> +	addr_bo_buffer_batch[b++] = 0x0c000002;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000008;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00001027;
> +	addr_bo_buffer_batch[b++] = ADDR_BATCH;
> +	addr_bo_buffer_batch[b++] = ADDR_BATCH >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000040;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = MI_BATCH_BUFFER_END;
> +}
> +
> +/**
> + * xehp_compute_exec - run a pipeline compatible with XEHP
> + *
> + * @fd: file descriptor of the opened DRM device
> + * @kernel: GPU Kernel binary to be executed
> + * @size: size of @kernel.
> + */
> +static void xehp_compute_exec(int fd, const unsigned char *kernel,
> +			     unsigned int size)
> +{
> +#define XEHP_BO_DICT_ENTRIES 9
> +	struct bo_dict_entry bo_dict[XEHP_BO_DICT_ENTRIES] = {
> +		{ .addr = XEHP_ADDR_INSTRUCTION_STATE_BASE + OFFSET_KERNEL,
> +		  .name = "instr state base"},
> +		{ .addr = ADDR_DYNAMIC_STATE_BASE,
> +		  .size = 0x100000,
> +		  .name = "dynamic state base"},
> +		{ .addr = ADDR_SURFACE_STATE_BASE,
> +		  .size = 0x1000,
> +		  .name = "surface state base"},
> +		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE + OFFSET_INDIRECT_DATA_START,
> +		  .size =  0x1000,
> +		  .name = "indirect object base"},
> +		{ .addr = ADDR_INPUT, .size = SIZE_BUFFER_INPUT,
> +		  .name = "addr input"},
> +		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT,
> +		  .name = "addr output" },
> +		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE, .size = 0x100000,
> +		  .name = "general state base" },
> +		{ .addr = ADDR_SURFACE_STATE_BASE + XEHP_OFFSET_BINDING_TABLE,
> +		  .size = 0x1000,
> +		  .name = "binding table" },
> +		{ .addr = ADDR_BATCH, .size = SIZE_BATCH,
> +		  .name = "batch" },
> +	};
> +	struct bo_execenv execenv;
> +	float *dinput;
> +
> +	bo_execenv_create(fd, &execenv);
> +
> +	/* Sets Kernel size */
> +	bo_dict[0].size = ALIGN(size, 0x1000);
> +
> +	bo_execenv_bind(&execenv, bo_dict, XEHP_BO_DICT_ENTRIES);
> +
> +	memcpy(bo_dict[0].data, kernel, size);
> +	tgllp_create_dynamic_state(bo_dict[1].data, OFFSET_KERNEL);
> +	xehp_create_surface_state(bo_dict[2].data, ADDR_INPUT, ADDR_OUTPUT);
> +	xehp_create_indirect_data(bo_dict[3].data, ADDR_INPUT, ADDR_OUTPUT);
> +	xehp_create_surface_state(bo_dict[7].data, ADDR_INPUT, ADDR_OUTPUT);
> +
> +	dinput = (float *)bo_dict[4].data;
> +	srand(time(NULL));
> +	for (int i = 0; i < SIZE_DATA; i++)
> +		((float *)dinput)[i] = rand() / (float)RAND_MAX;
> +
> +	xehp_compute_exec_compute(bo_dict[8].data,
> +				  XEHP_ADDR_GENERAL_STATE_BASE,
> +				  ADDR_SURFACE_STATE_BASE,
> +				  ADDR_DYNAMIC_STATE_BASE,
> +				  XEHP_ADDR_INSTRUCTION_STATE_BASE,
> +				  OFFSET_INDIRECT_DATA_START,
> +				  OFFSET_KERNEL);
> +
> +	bo_execenv_exec(&execenv, ADDR_BATCH);
> +
> +	for (int i = 0; i < SIZE_DATA; i++) {
> +		float f1, f2;
> +
> +		f1 = ((float *) bo_dict[5].data)[i];
> +		f2 = ((float *) bo_dict[4].data)[i];
> +		if (f1 != f2 * f2)
> +			igt_debug("[%4d] f1: %f != %f\n", i, f1, f2 * f2);
> +		igt_assert(f1 == f2 * f2);
> +	}
> +
> +	bo_execenv_unbind(&execenv, bo_dict, XEHP_BO_DICT_ENTRIES);
> +	bo_execenv_destroy(&execenv);
> +}
> +
>  /*
>   * Compatibility flags.
>   *
> @@ -620,6 +900,11 @@ static const struct {
>  		.compute_exec = tgl_compute_exec,
>  		.compat = COMPAT_I915 | COMPAT_XE,
>  	},
> +	{
> +		.ip_ver = IP_VER(12, 55),
> +		.compute_exec = xehp_compute_exec,
> +		.compat = COMPAT_I915,
> +	},
>  };
>  
>  bool run_compute_kernel(int fd)
> diff --git a/lib/intel_compute_square_kernels.c b/lib/intel_compute_square_kernels.c
> index b30d8a23dd..da73a3747c 100644
> --- a/lib/intel_compute_square_kernels.c
> +++ b/lib/intel_compute_square_kernels.c
> @@ -61,11 +61,67 @@ static const unsigned char tgllp_kernel_square_bin[] = {
>  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>  };
>  
> +static const unsigned char xehp_kernel_square_bin[] = {
> +	0x61, 0x31, 0x03, 0x80, 0x20, 0x42, 0x05, 0x7f, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x65, 0x00, 0x00, 0x80, 0x20, 0x82, 0x45, 0x7f,
> +	0x04, 0x00, 0x00, 0x02, 0xc0, 0xff, 0xff, 0xff, 0x40, 0x19, 0x00, 0x80,
> +	0x20, 0x82, 0x45, 0x7f, 0x44, 0x7f, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00,
> +	0x31, 0x92, 0x03, 0x80, 0x00, 0x00, 0x14, 0x08, 0x0c, 0x7f, 0xfa, 0xa7,
> +	0x00, 0x00, 0x10, 0x02, 0x61, 0x20, 0x03, 0x80, 0x20, 0x02, 0x05, 0x03,
> +	0x04, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x66, 0x09, 0x00, 0x80,
> +	0x20, 0x82, 0x01, 0x80, 0x00, 0x80, 0x00, 0x01, 0xc0, 0x04, 0xc0, 0x04,
> +	0x01, 0x09, 0x00, 0xe8, 0x01, 0x00, 0x11, 0x00, 0x01, 0x22, 0x00, 0xe8,
> +	0x01, 0x00, 0x11, 0x00, 0x41, 0x09, 0x20, 0x22, 0x16, 0x09, 0x11, 0x03,
> +	0x49, 0x00, 0x04, 0xa2, 0x12, 0x09, 0x11, 0x03, 0x01, 0x21, 0x00, 0xe8,
> +	0x01, 0x00, 0x11, 0x00, 0x52, 0x19, 0x04, 0x00, 0x60, 0x06, 0x04, 0x05,
> +	0x04, 0x04, 0x0e, 0x01, 0x04, 0x01, 0x04, 0x07, 0x52, 0x00, 0x24, 0x00,
> +	0x60, 0x06, 0x04, 0x0a, 0x04, 0x04, 0x0e, 0x01, 0x04, 0x02, 0x04, 0x07,
> +	0x70, 0x1a, 0x04, 0x00, 0x60, 0x02, 0x01, 0x00, 0x04, 0x05, 0x10, 0x52,
> +	0x84, 0x08, 0x00, 0x00, 0x70, 0x1a, 0x24, 0x00, 0x60, 0x02, 0x01, 0x00,
> +	0x04, 0x0a, 0x10, 0x52, 0x84, 0x08, 0x00, 0x00, 0x2e, 0x00, 0x05, 0x11,
> +	0x00, 0xc0, 0x00, 0x00, 0x90, 0x00, 0x00, 0x00, 0x90, 0x00, 0x00, 0x00,
> +	0x69, 0x00, 0x0c, 0x60, 0x02, 0x05, 0x20, 0x00, 0x69, 0x00, 0x0e, 0x66,
> +	0x02, 0x0a, 0x20, 0x00, 0x40, 0x1a, 0x10, 0xa0, 0x32, 0x0c, 0x10, 0x08,
> +	0x40, 0x1a, 0x12, 0xa6, 0x32, 0x0e, 0x10, 0x08, 0x31, 0xa3, 0x04, 0x00,
> +	0x00, 0x00, 0x14, 0x14, 0x94, 0x10, 0x00, 0xfa, 0x00, 0x00, 0x00, 0x06,
> +	0x31, 0x94, 0x24, 0x00, 0x00, 0x00, 0x14, 0x16, 0x94, 0x12, 0x00, 0xfa,
> +	0x00, 0x00, 0x00, 0x06, 0x40, 0x00, 0x0c, 0xa0, 0x4a, 0x0c, 0x10, 0x08,
> +	0x40, 0x00, 0x0e, 0xa6, 0x4a, 0x0e, 0x10, 0x08, 0x41, 0x23, 0x14, 0x20,
> +	0x00, 0x14, 0x00, 0x14, 0x41, 0x24, 0x16, 0x26, 0x00, 0x16, 0x00, 0x16,
> +	0x31, 0xa5, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x95, 0x0c, 0x08, 0xfa,
> +	0x14, 0x14, 0x80, 0x07, 0x31, 0x96, 0x24, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x95, 0x0e, 0x08, 0xfa, 0x14, 0x16, 0x80, 0x07, 0x2f, 0x00, 0x05, 0x00,
> +	0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
> +	0x61, 0x00, 0x7f, 0x64, 0x00, 0x03, 0x10, 0x00, 0x31, 0x09, 0x03, 0x80,
> +	0x04, 0x00, 0x00, 0x00, 0x0c, 0x7f, 0x20, 0x30, 0x00, 0x00, 0x00, 0x00,
> +	0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +};
> +
>  const struct compute_kernels compute_square_kernels[] = {
>  	{
>  		.ip_ver = IP_VER(12, 0),
>  		.size = sizeof(tgllp_kernel_square_bin),
>  		.kernel = tgllp_kernel_square_bin,
>  	},
> +	{
> +		.ip_ver = IP_VER(12, 55),
> +		.size = sizeof(xehp_kernel_square_bin),
> +		.kernel = xehp_kernel_square_bin,
> +	},
>  	{}
>  };
> -- 
> 2.34.1
> 
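Note on the dispatch table extended above: the following is a minimal sketch, under stated assumptions, of how an entry might be selected by IP version and then filtered by a driver compatibility flag. The SKETCH_* macros, struct sketch_entry and sketch_find() are hypothetical stand-ins written for illustration; the IP version encoding and the flag values are placeholders and are not taken from lib/intel_compute.c.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real macros live in the IGT headers. */
#define SKETCH_IP_VER(ver, rel)	(((ver) << 8) | (rel))
#define SKETCH_COMPAT_I915	(1u << 0)
#define SKETCH_COMPAT_XE	(1u << 1)

struct sketch_entry {
	unsigned int ip_ver;	/* graphics IP version this entry handles */
	unsigned int compat;	/* bitmask of drivers the pipeline supports */
	const char *name;	/* label for debugging only */
};

/* Mirrors the two entries visible in this patch: TGL and XeHP. */
static const struct sketch_entry sketch_table[] = {
	{ SKETCH_IP_VER(12, 0),  SKETCH_COMPAT_I915 | SKETCH_COMPAT_XE, "tgl" },
	{ SKETCH_IP_VER(12, 55), SKETCH_COMPAT_I915, "xehp" },
};

/*
 * Return the entry matching ip_ver if it also supports the given driver
 * flag, or NULL when the platform is unknown or the driver is not listed.
 */
static const struct sketch_entry *
sketch_find(unsigned int ip_ver, unsigned int driver_flag)
{
	size_t n = sizeof(sketch_table) / sizeof(sketch_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (sketch_table[i].ip_ver != ip_ver)
			continue;
		if (!(sketch_table[i].compat & driver_flag))
			return NULL;
		return &sketch_table[i];
	}

	return NULL;
}
```

With these placeholder values, looking up 12.55 with the i915 flag finds the "xehp" entry, while the same version with the Xe flag yields NULL, which matches the intent of marking the XeHP pipeline as COMPAT_I915 only.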

* Re: [igt-dev] [PATCH i-g-t v2 8/9] lib/intel_compute: Adding pvc compute pipeline implementation
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 8/9] lib/intel_compute: Adding pvc compute pipeline implementation Zbigniew Kempczyński
@ 2023-09-08 13:56   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08 13:56 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:08PM +0200, Zbigniew Kempczyński wrote:
> Add a square compute pipeline which works on PVC. It is currently
> limited to the Xe driver.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  lib/intel_compute.c                | 218 ++++++++++++++++++++++++++++-
>  lib/intel_compute_square_kernels.c |  39 ++++++
>  2 files changed, 256 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_compute.c b/lib/intel_compute.c
> index 29a5ec168f..4a232ce72b 100644
> --- a/lib/intel_compute.c
> +++ b/lib/intel_compute.c
> @@ -71,9 +71,18 @@ static void bo_execenv_create(int fd, struct bo_execenv *execenv)
>  	execenv->driver = get_intel_driver(fd);
>  
>  	if (execenv->driver == INTEL_DRIVER_XE) {
> +		uint16_t engine_class;
> +		uint32_t devid = intel_get_drm_devid(fd);
> +		const struct intel_device_info *info = intel_get_device_info(devid);
> +
> +		if (info->graphics_ver >= 12 && info->graphics_rel < 60)
> +			engine_class = DRM_XE_ENGINE_CLASS_RENDER;
> +		else
> +			engine_class = DRM_XE_ENGINE_CLASS_COMPUTE;
> +
>  		execenv->vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
>  		execenv->exec_queue = xe_exec_queue_create_class(fd, execenv->vm,
> -								 DRM_XE_ENGINE_CLASS_RENDER);
> +								 engine_class);
>  	}
>  }
>  
> @@ -877,6 +886,208 @@ static void xehp_compute_exec(int fd, const unsigned char *kernel,
>  	bo_execenv_destroy(&execenv);
>  }
>  
> +static void xehpc_create_indirect_data(uint32_t *addr_bo_buffer_batch,
> +				       uint64_t addr_input,
> +				       uint64_t addr_output)
> +{
> +	int b = 0;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000400;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = addr_input & 0xffffffff;
> +	addr_bo_buffer_batch[b++] = addr_input >> 32;
> +	addr_bo_buffer_batch[b++] = addr_output & 0xffffffff;
> +	addr_bo_buffer_batch[b++] = addr_output >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000400;
> +	addr_bo_buffer_batch[b++] = 0x00000400;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +}
> +
> +static void xehpc_compute_exec_compute(uint32_t *addr_bo_buffer_batch,
> +				       uint64_t addr_general_state_base,
> +				       uint64_t addr_surface_state_base,
> +				       uint64_t addr_dynamic_state_base,
> +				       uint64_t addr_instruction_state_base,
> +				       uint64_t offset_indirect_data_start,
> +				       uint64_t kernel_start_pointer)
> +{
> +	int b = 0;
> +
> +	igt_debug("general   state base: %lx\n", addr_general_state_base);
> +	igt_debug("surface   state base: %lx\n", addr_surface_state_base);
> +	igt_debug("dynamic   state base: %lx\n", addr_dynamic_state_base);
> +	igt_debug("instruct   base addr: %lx\n", addr_instruction_state_base);
> +	igt_debug("bindless   base addr: %lx\n", addr_surface_state_base);
> +	igt_debug("offset indirect addr: %lx\n", offset_indirect_data_start);
> +	igt_debug("kernel start pointer: %lx\n", kernel_start_pointer);
> +
> +	addr_bo_buffer_batch[b++] = GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
> +				    PIPELINE_SELECT_GPGPU;
> +
> +	addr_bo_buffer_batch[b++] = XEHP_STATE_COMPUTE_MODE;
> +	addr_bo_buffer_batch[b++] = 0xE0186010;
> +
> +	addr_bo_buffer_batch[b++] = XEHP_CFE_STATE | 0x4;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x10008800;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = MI_LOAD_REGISTER_IMM(1);
> +	addr_bo_buffer_batch[b++] = 0x00002580;
> +	addr_bo_buffer_batch[b++] = 0x00060002;
> +
> +	addr_bo_buffer_batch[b++] = STATE_BASE_ADDRESS | 0x14;
> +	addr_bo_buffer_batch[b++] = (addr_general_state_base & 0xffffffff) | 0x41;
> +	addr_bo_buffer_batch[b++] = addr_general_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00044000;
> +	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x41;
> +	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = (addr_dynamic_state_base & 0xffffffff) | 0x41;
> +	addr_bo_buffer_batch[b++] = addr_dynamic_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = (addr_instruction_state_base & 0xffffffff) | 0x41;
> +	addr_bo_buffer_batch[b++] = addr_instruction_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0xfffff001;
> +	addr_bo_buffer_batch[b++] = 0x00010001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0xfffff001;
> +	addr_bo_buffer_batch[b++] = (addr_surface_state_base & 0xffffffff) | 0x41;
> +	addr_bo_buffer_batch[b++] = addr_surface_state_base >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00007fbf;
> +	addr_bo_buffer_batch[b++] = 0x00000041;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = GEN8_3DSTATE_BINDING_TABLE_POOL_ALLOC | 2;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = XEHP_COMPUTE_WALKER | 0x25;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000040;
> +	addr_bo_buffer_batch[b++] = offset_indirect_data_start;
> +	addr_bo_buffer_batch[b++] = 0xbe040000;
> +	addr_bo_buffer_batch[b++] = 0xffffffff;
> +	addr_bo_buffer_batch[b++] = 0x0000003f;
> +	addr_bo_buffer_batch[b++] = 0x00000010;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = kernel_start_pointer;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00180000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x0c000020;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000008;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00001047;
> +	addr_bo_buffer_batch[b++] = ADDR_BATCH;
> +	addr_bo_buffer_batch[b++] = ADDR_BATCH >> 32;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000040;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000001;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +	addr_bo_buffer_batch[b++] = 0x00000000;
> +
> +	addr_bo_buffer_batch[b++] = MI_BATCH_BUFFER_END;
> +}
> +
> +/**
> + * xehpc_compute_exec - run a pipeline compatible with XeHPC (PVC)
> + *
> + * @fd: file descriptor of the opened DRM device
> + * @kernel: GPU Kernel binary to be executed
> + * @size: size of @kernel.
> + */
> +static void xehpc_compute_exec(int fd, const unsigned char *kernel,
> +			       unsigned int size)
> +{
> +#define XEHPC_BO_DICT_ENTRIES 6
> +	struct bo_dict_entry bo_dict[XEHPC_BO_DICT_ENTRIES] = {
> +		{ .addr = XEHP_ADDR_INSTRUCTION_STATE_BASE + OFFSET_KERNEL,
> +		  .name = "instr state base"},
> +		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE + OFFSET_INDIRECT_DATA_START,
> +		  .size =  0x10000,
> +		  .name = "indirect object base"},
> +		{ .addr = ADDR_INPUT, .size = SIZE_BUFFER_INPUT,
> +		  .name = "addr input"},
> +		{ .addr = ADDR_OUTPUT, .size = SIZE_BUFFER_OUTPUT,
> +		  .name = "addr output" },
> +		{ .addr = XEHP_ADDR_GENERAL_STATE_BASE, .size = 0x10000,
> +		  .name = "general state base" },
> +		{ .addr = ADDR_BATCH, .size = SIZE_BATCH,
> +		  .name = "batch" },
> +	};
> +	struct bo_execenv execenv;
> +	float *dinput;
> +
> +	bo_execenv_create(fd, &execenv);
> +
> +	/* Sets Kernel size */
> +	bo_dict[0].size = ALIGN(size, 0x1000);
> +
> +	bo_execenv_bind(&execenv, bo_dict, XEHPC_BO_DICT_ENTRIES);
> +
> +	memcpy(bo_dict[0].data, kernel, size);
> +	xehpc_create_indirect_data(bo_dict[1].data, ADDR_INPUT, ADDR_OUTPUT);
> +
> +	dinput = (float *)bo_dict[2].data;
> +	srand(time(NULL));
> +	for (int i = 0; i < SIZE_DATA; i++)
> +		((float *)dinput)[i] = rand() / (float)RAND_MAX;
> +
> +	xehpc_compute_exec_compute(bo_dict[5].data,
> +				   XEHP_ADDR_GENERAL_STATE_BASE,
> +				   ADDR_SURFACE_STATE_BASE,
> +				   ADDR_DYNAMIC_STATE_BASE,
> +				   XEHP_ADDR_INSTRUCTION_STATE_BASE,
> +				   OFFSET_INDIRECT_DATA_START,
> +				   OFFSET_KERNEL);
> +
> +	bo_execenv_exec(&execenv, ADDR_BATCH);
> +
> +	for (int i = 0; i < SIZE_DATA; i++) {
> +		float f1, f2;
> +
> +		f1 = ((float *) bo_dict[3].data)[i];
> +		f2 = ((float *) bo_dict[2].data)[i];
> +		if (f1 != f2 * f2)
> +			igt_debug("[%4d] f1: %f != %f\n", i, f1, f2 * f2);
> +		igt_assert(f1 == f2 * f2);
> +	}
> +
> +	bo_execenv_unbind(&execenv, bo_dict, XEHPC_BO_DICT_ENTRIES);
> +	bo_execenv_destroy(&execenv);
> +}
> +
>  /*
>   * Compatibility flags.
>   *
> @@ -905,6 +1116,11 @@ static const struct {
>  		.compute_exec = xehp_compute_exec,
>  		.compat = COMPAT_I915,
>  	},
> +	{
> +		.ip_ver = IP_VER(12, 60),
> +		.compute_exec = xehpc_compute_exec,
> +		.compat = COMPAT_XE,
> +	},
>  };
>  
>  bool run_compute_kernel(int fd)
> diff --git a/lib/intel_compute_square_kernels.c b/lib/intel_compute_square_kernels.c
> index da73a3747c..de93a3bdfd 100644
> --- a/lib/intel_compute_square_kernels.c
> +++ b/lib/intel_compute_square_kernels.c
> @@ -112,6 +112,40 @@ static const unsigned char xehp_kernel_square_bin[] = {
>  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>  };
>  
> +static const unsigned char xehpc_kernel_square_bin[] = {
> +	0x65, 0xa1, 0x00, 0x80, 0x20, 0x82, 0x05, 0x7f, 0x04, 0x00, 0x00, 0x02,
> +	0xc0, 0xff, 0xff, 0xff, 0x40, 0x19, 0x00, 0x80, 0x20, 0x82, 0x05, 0x7f,
> +	0x04, 0x7f, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x31, 0x22, 0x03, 0x00,
> +	0x00, 0x00, 0x0c, 0x04, 0x8f, 0x7f, 0x00, 0xfa, 0x03, 0x00, 0x34, 0xf6,
> +	0x66, 0x09, 0x84, 0xb4, 0x80, 0x80, 0x00, 0x4c, 0x41, 0x22, 0x03, 0x80,
> +	0x60, 0x06, 0x01, 0x20, 0xd4, 0x04, 0x00, 0x01, 0x14, 0x00, 0x00, 0x00,
> +	0x53, 0x80, 0x00, 0x80, 0x60, 0x06, 0x05, 0x02, 0xd4, 0x04, 0x00, 0x06,
> +	0x14, 0x00, 0x00, 0x00, 0x52, 0x19, 0x14, 0x00, 0x60, 0x06, 0x04, 0x05,
> +	0x04, 0x02, 0x0e, 0x01, 0x04, 0x01, 0x04, 0x04, 0x70, 0x19, 0x14, 0x00,
> +	0x20, 0x02, 0x01, 0x00, 0x04, 0x05, 0x10, 0x52, 0xc4, 0x04, 0x00, 0x00,
> +	0x2e, 0x00, 0x14, 0x14, 0x00, 0xc0, 0x00, 0x00, 0x78, 0x00, 0x00, 0x00,
> +	0x78, 0x00, 0x00, 0x00, 0x61, 0x00, 0x00, 0x6c, 0x13, 0x05, 0x00, 0x00,
> +	0x61, 0x00, 0x08, 0x6c, 0x15, 0x06, 0x00, 0x00, 0x69, 0x1a, 0x00, 0xf9,
> +	0x17, 0x13, 0x20, 0x00, 0x69, 0x1a, 0x08, 0xf9, 0x19, 0x15, 0x20, 0x00,
> +	0x40, 0x1a, 0x00, 0x20, 0x07, 0x17, 0x60, 0x04, 0x40, 0x1a, 0x08, 0x20,
> +	0x09, 0x19, 0x60, 0x04, 0x31, 0x23, 0x15, 0x00, 0x00, 0x00, 0x14, 0x0b,
> +	0x24, 0x07, 0x00, 0xfb, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x20,
> +	0x0f, 0x17, 0x30, 0x04, 0x40, 0x00, 0x08, 0x20, 0x11, 0x19, 0x30, 0x04,
> +	0x41, 0x83, 0x14, 0x2c, 0x0d, 0x0b, 0x10, 0x0b, 0x31, 0x24, 0x15, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x24, 0x0f, 0x08, 0xfb, 0x14, 0x0d, 0x00, 0x00,
> +	0x2f, 0x00, 0x14, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x10, 0x00, 0x00, 0x00, 0x61, 0x00, 0x1c, 0x34, 0x7f, 0x00, 0x00, 0x00,
> +	0x31, 0x11, 0x0c, 0x80, 0x04, 0x00, 0x00, 0x00, 0x0c, 0x7f, 0x20, 0x30,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> +};
> +
>  const struct compute_kernels compute_square_kernels[] = {
>  	{
>  		.ip_ver = IP_VER(12, 0),
> @@ -123,5 +157,10 @@ const struct compute_kernels compute_square_kernels[] = {
>  		.size = sizeof(xehp_kernel_square_bin),
>  		.kernel = xehp_kernel_square_bin,
>  	},
> +	{
> +		.ip_ver = IP_VER(12, 60),
> +		.size = sizeof(xehpc_kernel_square_bin),
> +		.kernel = xehpc_kernel_square_bin,
> +	},
>  	{}
>  };
> -- 
> 2.34.1
> 
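Note on the batch construction above: 64-bit GPU addresses are written into the 32-bit batch as a low dword followed by a high dword (addr & 0xffffffff, then addr >> 32). A minimal sketch of that split is shown here, assuming nothing about the command layout itself; sketch_lower_32(), sketch_upper_32() and sketch_emit_addr() are hypothetical helper names and are not part of lib/intel_compute.c.

```c
#include <stdint.h>

/* Low 32 bits of a 64-bit address. */
static inline uint32_t sketch_lower_32(uint64_t addr)
{
	return (uint32_t)(addr & 0xffffffffu);
}

/* High 32 bits of a 64-bit address. */
static inline uint32_t sketch_upper_32(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/*
 * Store addr as two consecutive dwords starting at batch[b], low dword
 * first, and return the index of the next free dword.
 */
static int sketch_emit_addr(uint32_t *batch, int b, uint64_t addr)
{
	batch[b++] = sketch_lower_32(addr);
	batch[b++] = sketch_upper_32(addr);

	return b;
}
```

Making the truncation explicit in one helper keeps each store visibly 32 bits wide, rather than relying on the implicit narrowing that happens when a uint64_t is assigned straight into a uint32_t slot.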

* Re: [igt-dev] [PATCH i-g-t v2 9/9] tests/gem|xe_compute: Update documentation regarding test requirements
  2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 9/9] tests/gem|xe_compute: Update documentation regarding test requirements Zbigniew Kempczyński
@ 2023-09-08 13:56   ` Francois Dugast
  0 siblings, 0 replies; 22+ messages in thread
From: Francois Dugast @ 2023-09-08 13:56 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On Tue, Sep 05, 2023 at 03:33:09PM +0200, Zbigniew Kempczyński wrote:
> Currently the test is prepared to run on DG2, ATS-M and PVC, so let's
> reflect this in the documentation tags.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Francois Dugast <francois.dugast@intel.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> ---
>  tests/intel/gem_compute.c | 3 +--
>  tests/intel/xe_compute.c  | 3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/intel/gem_compute.c b/tests/intel/gem_compute.c
> index b408efee16..8f4722d2dc 100644
> --- a/tests/intel/gem_compute.c
> +++ b/tests/intel/gem_compute.c
> @@ -18,12 +18,11 @@
>  
>  /**
>   * SUBTEST: compute-square
> - * GPU requirement: only works on TGL
> + * GPU requirement: TGL, DG2, ATS-M
>   * Description:
>   *	Run an openCL Kernel that returns output[i] = input[i] * input[i],
>   *	for an input dataset..
>   * Functionality: compute openCL kernel
> - * TODO: extend test to cover other platforms
>   */
>  static void
>  test_compute_square(int fd)
> diff --git a/tests/intel/xe_compute.c b/tests/intel/xe_compute.c
> index 0c54fbec42..07764decb5 100644
> --- a/tests/intel/xe_compute.c
> +++ b/tests/intel/xe_compute.c
> @@ -19,12 +19,11 @@
>  
>  /**
>   * SUBTEST: compute-square
> - * GPU requirement: only works on TGL
> + * GPU requirement: TGL, PVC
>   * Description:
>   *	Run an openCL Kernel that returns output[i] = input[i] * input[i],
>   *	for an input dataset..
>   * Functionality: compute openCL kernel
> - * TODO: extend test to cover other platforms
>   */
>  static void
>  test_compute_square(int fd)
> -- 
> 2.34.1
> 
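Note on the subtest description above: the kernel is documented as returning output[i] = input[i] * input[i]. A minimal CPU-side sketch of that contract and of an exact-equality check follows. It is an illustration only; sketch_cpu_square() and sketch_check_square() are hypothetical names, and the library's own verification loop is the one quoted in patch 8/9.

```c
#include <stddef.h>

/* CPU reference: output[i] = input[i] * input[i] for n elements. */
static void sketch_cpu_square(const float *input, float *output, size_t n)
{
	for (size_t i = 0; i < n; i++)
		output[i] = input[i] * input[i];
}

/*
 * Compare output against the squared input using exact float equality.
 * Returns the index of the first mismatch, or -1 if every element matches.
 */
static long sketch_check_square(const float *input, const float *output,
				size_t n)
{
	for (size_t i = 0; i < n; i++) {
		float expected = input[i] * input[i];

		if (output[i] != expected)
			return (long)i;
	}

	return -1;
}
```

Returning the first mismatching index, rather than asserting inside the loop, lets the caller decide whether to log the offending element before failing the test.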

end of thread, other threads:[~2023-09-08 13:57 UTC | newest]

Thread overview: 22+ messages
2023-09-05 13:33 [igt-dev] [PATCH i-g-t v2 0/9] Extend compute square to i915 and Xe Zbigniew Kempczyński
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 1/9] lib/intel_compute: Migrate xe_compute library to intel_compute Zbigniew Kempczyński
2023-09-06 16:42   ` Kamil Konieczny
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 2/9] lib/intel_compute: Add compatibility flags for running compute Zbigniew Kempczyński
2023-09-08  9:03   ` Francois Dugast
2023-09-08 11:07     ` Zbigniew Kempczyński
2023-09-08 11:29     ` Mauro Carvalho Chehab
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 3/9] lib/intel_compute: Reorganize the code for i915 version preparation Zbigniew Kempczyński
2023-09-08  9:05   ` Francois Dugast
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 4/9] lib/intel_compute: Add name field for debugging purposes Zbigniew Kempczyński
2023-09-08  9:05   ` Francois Dugast
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 5/9] lib/intel_compute: Add i915 path in compute library Zbigniew Kempczyński
2023-09-08  9:13   ` Francois Dugast
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 6/9] intel/gem_compute: Add test which runs compute workload on i915 Zbigniew Kempczyński
2023-09-08  9:15   ` Francois Dugast
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 7/9] lib/intel_compute: Add XeHP implementation of compute pipeline Zbigniew Kempczyński
2023-09-08 13:55   ` Francois Dugast
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 8/9] lib/intel_compute: Adding pvc compute pipeline implementation Zbigniew Kempczyński
2023-09-08 13:56   ` Francois Dugast
2023-09-05 13:33 ` [igt-dev] [PATCH i-g-t v2 9/9] tests/gem|xe_compute: Update documentation regarding test requirements Zbigniew Kempczyński
2023-09-08 13:56   ` Francois Dugast
2023-09-05 18:23 ` [igt-dev] ✗ Fi.CI.BAT: failure for Extend compute square to i915 and Xe (rev2) Patchwork
