Igt-dev Archive on lore.kernel.org
* [PATCH 0/4] lib/gpgpu: add shader support
@ 2024-04-29 12:08 Andrzej Hajda
  2024-04-29 12:08 ` [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state Andrzej Hajda
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Andrzej Hajda @ 2024-04-29 12:08 UTC
  To: igt-dev
  Cc: Kamil Konieczny, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski, Andrzej Hajda

This patchset adds shader support to mainline IGT, together with
iga64 inline assembly and a demo test using both.

The patches were cherry-picked and trimmed from an internal branch,
which was quite a painful process. I hope I have not cut off too much :)

To: igt-dev@lists.freedesktop.org
Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Christoph Manszewski <christoph.manszewski@intel.com>
Cc: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
Andrzej Hajda (4):
      lib/gpu_cmds: add Xe_LP version of emit_vfe_state
      lib/gpgpu_shader: tooling for preparing and running gpgpu shaders
      lib/gpgpu_shader: add inline support for iga64 assembly
      intel/xe_exec_sip: port test for shader sanity check

 lib/generate_iga64_codes    | 104 +++++++++++++++
 lib/gpgpu_shader.c          | 313 ++++++++++++++++++++++++++++++++++++++++++++
 lib/gpgpu_shader.h          |  63 +++++++++
 lib/gpu_cmds.c              |  29 +++-
 lib/gpu_cmds.h              |   6 +
 lib/iga64_generated_codes.c |  87 ++++++++++++
 lib/iga64_macros.h          |  10 ++
 lib/meson.build             |  19 +++
 tests/intel/xe_exec_sip.c   | 239 +++++++++++++++++++++++++++++++++
 tests/meson.build           |   1 +
 10 files changed, 865 insertions(+), 6 deletions(-)
---
base-commit: 61121a2eac4d191ad9f3077948c8ba19686fbb16
change-id: 20240425-iga64_inline_ups-438ddfd6023f

Best regards,
-- 
Andrzej Hajda <andrzej.hajda@intel.com>



* [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state
  2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
@ 2024-04-29 12:08 ` Andrzej Hajda
  2024-04-29 12:37   ` Grzegorzek, Dominik
  2024-04-29 12:08 ` [PATCH 2/4] lib/gpgpu_shader: tooling for preparing and running gpgpu shaders Andrzej Hajda
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Andrzej Hajda @ 2024-04-29 12:08 UTC
  To: igt-dev
  Cc: Kamil Konieczny, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski, Andrzej Hajda

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
 lib/gpu_cmds.c | 29 +++++++++++++++++++++++------
 lib/gpu_cmds.h |  6 ++++++
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/lib/gpu_cmds.c b/lib/gpu_cmds.c
index da41121ce945..c73d56cc3f8c 100644
--- a/lib/gpu_cmds.c
+++ b/lib/gpu_cmds.c
@@ -651,10 +651,10 @@ gen7_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
 	intel_bb_out(ibb, 0);
 }
 
-void
-gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
-		    uint32_t urb_entries, uint32_t urb_size,
-		    uint32_t curbe_size)
+static void
+__gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
+		      uint32_t urb_entries, uint32_t urb_size,
+		      uint32_t curbe_size, bool legacy_mode)
 {
 	intel_bb_out(ibb, GEN7_MEDIA_VFE_STATE | (9 - 2));
 
@@ -662,8 +662,8 @@ gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
 	intel_bb_out(ibb, 0);
 	intel_bb_out(ibb, 0);
 
-	/* number of threads & urb entries */
-	intel_bb_out(ibb, threads << 16 | urb_entries << 8);
+	/* number of threads & urb entries & eu fusion */
+	intel_bb_out(ibb, threads << 16 | urb_entries << 8 | legacy_mode << 6);
 
 	intel_bb_out(ibb, 0);
 
@@ -676,6 +676,15 @@ gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
 	intel_bb_out(ibb, 0);
 }
 
+void
+gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
+		    uint32_t urb_entries, uint32_t urb_size,
+		    uint32_t curbe_size)
+{
+	__gen8_emit_vfe_state(ibb, threads, urb_entries, urb_size, curbe_size,
+			      false);
+}
+
 void
 gen7_emit_curbe_load(struct intel_bb *ibb, uint32_t curbe_buffer)
 {
@@ -864,6 +873,14 @@ gen7_emit_media_objects(struct intel_bb *ibb,
 			gen_emit_media_object(ibb, x + i * 16, y + j * 16);
 }
 
+void xelp_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
+			 uint32_t urb_entries, uint32_t urb_size,
+			 uint32_t curbe_size, bool legacy_mode)
+{
+	__gen8_emit_vfe_state(ibb, threads, urb_entries, urb_size,
+			      curbe_size, legacy_mode);
+}
+
 /*
  * XEHP
  */
diff --git a/lib/gpu_cmds.h b/lib/gpu_cmds.h
index 348c6c9453e9..1b9156a80c7c 100644
--- a/lib/gpu_cmds.h
+++ b/lib/gpu_cmds.h
@@ -81,6 +81,12 @@ void
 gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
 		    uint32_t urb_entries, uint32_t urb_size,
 		    uint32_t curbe_size);
+
+void
+xelp_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
+		    uint32_t urb_entries, uint32_t urb_size,
+		    uint32_t curbe_size, bool legacy_mode);
+
 void
 gen7_emit_curbe_load(struct intel_bb *ibb, uint32_t curbe_buffer);
 

-- 
2.34.1



* [PATCH 2/4] lib/gpgpu_shader: tooling for preparing and running gpgpu shaders
  2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
  2024-04-29 12:08 ` [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state Andrzej Hajda
@ 2024-04-29 12:08 ` Andrzej Hajda
  2024-04-29 12:23   ` Grzegorzek, Dominik
  2024-04-29 12:08 ` [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly Andrzej Hajda
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Andrzej Hajda @ 2024-04-29 12:08 UTC
  To: igt-dev
  Cc: Kamil Konieczny, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski, Andrzej Hajda

Implement tooling for building shaders for specific GPU generations.
The library allows you to build and run a shader from precompiled blocks
and provides an abstraction layer over the gpgpu pipeline.
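
A minimal sketch of the "build a shader from precompiled blocks" idea,
assuming the shader is kept as a growable array of dwords to which blocks
are appended. The names shader_buf, shader_buf_create, shader_buf_append
and shader_buf_destroy are hypothetical and are not part of this patch;
only the initial capacity of 16 * 4 dwords is taken from
gpgpu_shader_create.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a shader under construction; sizes in dwords. */
struct shader_buf {
	uint32_t size;		/* dwords used */
	uint32_t max_size;	/* dwords allocated */
	uint32_t *code;
};

static struct shader_buf *shader_buf_create(void)
{
	struct shader_buf *s = calloc(1, sizeof(*s));

	assert(s);
	s->max_size = 16 * 4;
	s->code = malloc(sizeof(*s->code) * s->max_size);
	assert(s->code);

	return s;
}

/* Append a precompiled block, doubling the capacity until it fits. */
static void shader_buf_append(struct shader_buf *s, const uint32_t *block,
			      uint32_t dwords)
{
	uint32_t max_size = s->max_size;

	while (max_size < s->size + dwords)
		max_size <<= 1;

	if (max_size != s->max_size) {
		uint32_t *code = realloc(s->code, sizeof(*code) * max_size);

		assert(code);
		s->code = code;
		s->max_size = max_size;
	}

	memcpy(s->code + s->size, block, sizeof(*block) * dwords);
	s->size += dwords;
}

static void shader_buf_destroy(struct shader_buf *s)
{
	free(s->code);
	free(s);
}
```

Doubling the capacity keeps the number of reallocations logarithmic in the
final shader size, which is likely good enough for the short shaders used
in tests.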

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>
---
 lib/gpgpu_shader.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/gpgpu_shader.h |  38 ++++++++++
 lib/meson.build    |   1 +
 3 files changed, 250 insertions(+)

diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
new file mode 100644
index 000000000000..d14301789421
--- /dev/null
+++ b/lib/gpgpu_shader.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ *
+ * Author: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
+ */
+
+#include <i915_drm.h>
+
+#include "ioctl_wrappers.h"
+#include "gpgpu_shader.h"
+#include "gpu_cmds.h"
+
+#define SUPPORTED_GEN_VER 1200 /* Support TGL and up */
+
+#define PAGE_SIZE 4096
+#define BATCH_STATE_SPLIT 2048
+/* VFE STATE params */
+#define THREADS (1 << 16) /* max value */
+#define GEN8_GPGPU_URB_ENTRIES 1
+#define GPGPU_URB_SIZE 0
+#define GPGPU_CURBE_SIZE 0
+#define GEN7_VFE_STATE_GPGPU_MODE 1
+
+static uint32_t fill_sip(struct intel_bb *ibb,
+			 const uint32_t sip[][4],
+			 const size_t size)
+{
+	uint32_t *sip_dst;
+	uint32_t offset;
+
+	intel_bb_ptr_align(ibb, 16);
+	sip_dst = intel_bb_ptr(ibb);
+	offset = intel_bb_offset(ibb);
+
+	memcpy(sip_dst, sip, size);
+
+	intel_bb_ptr_add(ibb, size);
+
+	return offset;
+}
+
+static void emit_sip(struct intel_bb *ibb, const uint64_t offset)
+{
+	intel_bb_out(ibb, GEN4_STATE_SIP | (3 - 2));
+	intel_bb_out(ibb, lower_32_bits(offset));
+	intel_bb_out(ibb, upper_32_bits(offset));
+}
+
+static void
+__xelp_gpgpu_execfunc(struct intel_bb *ibb,
+		      struct intel_buf *target,
+		      unsigned int x_dim, unsigned int y_dim,
+		      struct gpgpu_shader *shdr,
+		      struct gpgpu_shader *sip,
+		      uint64_t ring, bool explicit_engine)
+{
+	uint32_t interface_descriptor, sip_offset;
+	uint64_t engine;
+
+	intel_bb_add_intel_buf(ibb, target, true);
+
+	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
+
+	interface_descriptor = gen8_fill_interface_descriptor(ibb, target,
+							      shdr->instr,
+							      4 * shdr->size);
+
+	if (sip && sip->size)
+		sip_offset = fill_sip(ibb, sip->instr, 4 * sip->size);
+	else
+		sip_offset = 0;
+
+	intel_bb_ptr_set(ibb, 0);
+
+	/* GPGPU pipeline */
+	intel_bb_out(ibb, GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
+		     PIPELINE_SELECT_GPGPU);
+
+	gen9_emit_state_base_address(ibb);
+
+	xelp_emit_vfe_state(ibb, THREADS, GEN8_GPGPU_URB_ENTRIES,
+			    GPGPU_URB_SIZE, GPGPU_CURBE_SIZE, true);
+
+	gen7_emit_interface_descriptor_load(ibb, interface_descriptor);
+
+	if (sip_offset)
+		emit_sip(ibb, sip_offset);
+
+	gen8_emit_gpgpu_walk(ibb, 0, 0, x_dim * 16, y_dim);
+
+	intel_bb_out(ibb, MI_BATCH_BUFFER_END);
+	intel_bb_ptr_align(ibb, 32);
+
+	engine = explicit_engine ? ring : I915_EXEC_DEFAULT;
+	intel_bb_exec(ibb, intel_bb_offset(ibb),
+		      engine | I915_EXEC_NO_RELOC, false);
+}
+
+static void
+__xehp_gpgpu_execfunc(struct intel_bb *ibb,
+		      struct intel_buf *target,
+		      unsigned int x_dim, unsigned int y_dim,
+		      struct gpgpu_shader *shdr,
+		      struct gpgpu_shader *sip,
+		      uint64_t ring, bool explicit_engine)
+{
+	struct xehp_interface_descriptor_data idd;
+	uint32_t sip_offset;
+	uint64_t engine;
+
+	intel_bb_add_intel_buf(ibb, target, true);
+
+	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
+
+	xehp_fill_interface_descriptor(ibb, target, shdr->instr,
+				       4 * shdr->size, &idd);
+
+	if (sip && sip->size)
+		sip_offset = fill_sip(ibb, sip->instr, 4 * sip->size);
+	else
+		sip_offset = 0;
+
+	intel_bb_ptr_set(ibb, 0);
+
+	/* GPGPU pipeline */
+	intel_bb_out(ibb, GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
+		     PIPELINE_SELECT_GPGPU);
+	xehp_emit_state_base_address(ibb);
+	xehp_emit_state_compute_mode(ibb);
+	xehp_emit_state_binding_table_pool_alloc(ibb);
+	xehp_emit_cfe_state(ibb, THREADS);
+
+	if (sip_offset)
+		emit_sip(ibb, sip_offset);
+
+	xehp_emit_compute_walk(ibb, 0, 0, x_dim * 16, y_dim, &idd, 0x0);
+
+	intel_bb_out(ibb, MI_BATCH_BUFFER_END);
+	intel_bb_ptr_align(ibb, 32);
+
+	engine = explicit_engine ? ring : I915_EXEC_DEFAULT;
+	intel_bb_exec(ibb, intel_bb_offset(ibb),
+		      engine | I915_EXEC_NO_RELOC, false);
+
+}
+
+/**
+ * gpgpu_shader_exec:
+ * @ibb: pointer to initialized intel_bb
+ * @target: pointer to initialized intel_buf to be written by shader/sip
+ * @x_dim: gpgpu/compute walker thread group width
+ * @y_dim: gpgpu/compute walker thread group height
+ * @shdr: shader to be executed
+ * @sip: sip to be executed, can be NULL
+ * @ring: engine index
+ * @explicit_engine: whether to use provided engine index
+ *
+ * Execute provided shader in asynchronous fashion. To wait for completion,
+ * caller has to use the provided ibb handle.
+ */
+void gpgpu_shader_exec(struct intel_bb *ibb,
+		       struct intel_buf *target,
+		       unsigned int x_dim, unsigned int y_dim,
+		       struct gpgpu_shader *shdr,
+		       struct gpgpu_shader *sip,
+		       uint64_t ring, bool explicit_engine)
+{
+	igt_require(shdr->gen_ver >= SUPPORTED_GEN_VER);
+	igt_assert(ibb->size >= PAGE_SIZE);
+	igt_assert(ibb->ptr == ibb->batch);
+
+	if (shdr->gen_ver >= 1250)
+		__xehp_gpgpu_execfunc(ibb, target, x_dim, y_dim, shdr, sip,
+				      ring, explicit_engine);
+	else
+		__xelp_gpgpu_execfunc(ibb, target, x_dim, y_dim, shdr, sip,
+				      ring, explicit_engine);
+}
+
+/**
+ * gpgpu_shader_create:
+ * @fd: drm fd - i915 or xe
+ *
+ * Creates empty shader.
+ *
+ * Returns: pointer to empty shader struct.
+ */
+struct gpgpu_shader *gpgpu_shader_create(int fd)
+{
+	struct gpgpu_shader *shdr = calloc(1, sizeof(struct gpgpu_shader));
+	const struct intel_device_info *info;
+
+	info = intel_get_device_info(intel_get_drm_devid(fd));
+	shdr->gen_ver = 100 * info->graphics_ver + info->graphics_rel;
+	shdr->max_size = 16 * 4;
+	shdr->code = malloc(4 * shdr->max_size);
+	return shdr;
+}
+
+/**
+ * gpgpu_shader_destroy:
+ * @shdr: pointer to shader struct created with 'gpgpu_shader_create'
+ *
+ * Frees resources of gpgpu_shader struct.
+ */
+void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
+{
+	free(shdr->code);
+	free(shdr);
+}
diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
new file mode 100644
index 000000000000..02f6f1aad1e3
--- /dev/null
+++ b/lib/gpgpu_shader.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#ifndef GPGPU_SHADER_H
+#define GPGPU_SHADER_H
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+struct intel_bb;
+struct intel_buf;
+
+struct gpgpu_shader {
+	uint32_t gen_ver;
+	uint32_t size;
+	uint32_t max_size;
+	union {
+		uint32_t *code;
+		uint32_t (*instr)[4];
+	};
+};
+
+struct gpgpu_shader *gpgpu_shader_create(int fd);
+void gpgpu_shader_destroy(struct gpgpu_shader *shdr);
+
+void gpgpu_shader_dump(struct gpgpu_shader *shdr);
+
+void gpgpu_shader_exec(struct intel_bb *ibb,
+		       struct intel_buf *target,
+		       unsigned int x_dim, unsigned int y_dim,
+		       struct gpgpu_shader *shdr,
+		       struct gpgpu_shader *sip,
+		       uint64_t ring, bool explicit_engine);
+
+#endif /* GPGPU_SHADER_H */
diff --git a/lib/meson.build b/lib/meson.build
index e2f740c116f8..0a3084f8aea2 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -72,6 +72,7 @@ lib_sources = [
 	'media_spin.c',
 	'media_fill.c',
 	'gpgpu_fill.c',
+	'gpgpu_shader.c',
 	'gpu_cmds.c',
 	'rendercopy_i915.c',
 	'rendercopy_i830.c',

-- 
2.34.1



* [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly
  2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
  2024-04-29 12:08 ` [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state Andrzej Hajda
  2024-04-29 12:08 ` [PATCH 2/4] lib/gpgpu_shader: tooling for preparing and running gpgpu shaders Andrzej Hajda
@ 2024-04-29 12:08 ` Andrzej Hajda
  2024-05-10  5:52   ` Zbigniew Kempczyński
                     ` (2 more replies)
  2024-04-29 12:08 ` [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check Andrzej Hajda
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 17+ messages in thread
From: Andrzej Hajda @ 2024-04-29 12:08 UTC
  To: igt-dev
  Cc: Kamil Konieczny, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski, Andrzej Hajda

With this patch, adding iga64 assembly should be similar to
adding inline x86 assembly. A simple example:
    emit_iga64_code(shdr, set_exception, R"ASM(
        or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud
    )ASM", value);
Note the presence of 'ARG(0)': it will be replaced by the 'value' argument;
multiple arguments are possible.
More sophisticated examples follow in subsequent patches.
How does it work:
1. Raw string literals (a C++ feature available in gcc as an extension):
   R"ASM(...)ASM" allows multiline, unescaped string literals.
   If for some reason they cannot be used, we can always fall back to the
   old, ugly way of handling multiline strings with escape characters:
    emit_iga64_code(shdr, set_exception, "\n\
        or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud\n\
    ", value);
2. emit_iga64_code puts the assembly string into a special linker section
   and calls __emit_iga64_code with a pointer to an external variable,
   which will contain code templates generated from the assembly for all
   supported platforms. The remaining arguments are put into a temporary
   array, which is later used to patch the code with positional arguments.
3. During the build phase the linker section is scanned for assemblies.
   Every assembly is preprocessed with cpp to replace ARG(x) macros with
   magic numbers and to provide different code for different platforms
   if needed. The output file is then compiled with iga64, and a .c file
   is generated with global variables pointing to hexified iga64 codes.
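
The runtime half of the steps above can be sketched roughly as follows.
This is a simplified, standalone illustration and not the library code:
code_template, select_template and patch_template are hypothetical names,
and plain assert stands in for the IGT assertion helpers. The magic base
0xc0ded000 and the mask 0xffffff00 match the values used by ARG(n) and
by the patching loop in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARG0	 0xc0ded000u	/* magic dword emitted for ARG(0) */
#define ARG_MASK 0xffffff00u	/* low byte carries the argument index */

/* Hypothetical mirror of a per-platform code template. */
struct code_template {
	uint32_t gen_ver;	/* minimal generation * 100, 0 for fallback */
	uint32_t size;		/* number of dwords in code */
	const uint32_t *code;
};

/*
 * Templates are sorted by decreasing gen_ver and the last entry has
 * gen_ver == 0, so the loop always stops on a valid entry.
 */
static const struct code_template *
select_template(const struct code_template *tpls, uint32_t gen_ver)
{
	while (gen_ver < tpls->gen_ver)
		tpls++;

	return tpls;
}

/* Copy the template to dst and replace each magic dword with argv[n]. */
static void
patch_template(uint32_t *dst, const struct code_template *tpl,
	       const uint32_t *argv, uint32_t argc)
{
	memcpy(dst, tpl->code, sizeof(*dst) * tpl->size);

	for (uint32_t i = 0; i < tpl->size; i++) {
		uint32_t n;

		if ((dst[i] & ARG_MASK) != ARG0)
			continue;

		n = dst[i] - ARG0;
		assert(n < argc);
		dst[i] = argv[n];
	}
}
```

One caveat of this scheme, as far as can be told from the sketch: a
genuine instruction dword that happens to fall in the 0xc0ded0xx range
would be mistaken for an argument placeholder, so the magic base has to
be chosen to be unlikely in real encodings.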

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
 lib/generate_iga64_codes    | 104 ++++++++++++++++++++++++++++++++++++++++++++
 lib/gpgpu_shader.c          |  39 +++++++++++++++++
 lib/gpgpu_shader.h          |  25 +++++++++++
 lib/iga64_generated_codes.c |   6 +++
 lib/iga64_macros.h          |  10 +++++
 lib/meson.build             |  18 ++++++++
 6 files changed, 202 insertions(+)

diff --git a/lib/generate_iga64_codes b/lib/generate_iga64_codes
new file mode 100755
index 000000000000..efc2a29b409c
--- /dev/null
+++ b/lib/generate_iga64_codes
@@ -0,0 +1,104 @@
+#!/bin/bash
+# SPDX-License-Identifier: MIT
+# Copyright © 2024 Intel Corporation
+# Author: Andrzej Hajda <andrzej.hajda@intel.com>
+
+# List of supported platforms, in the format gen100:platform, where gen100 is
+# the minimal GPU generation supported by the platform multiplied by 100, and
+# platform is one of the platforms supported by the -p switch of iga64.
+#
+# Must be in decreasing order; the last one must have gen100 equal to 0.
+GEN_VERSIONS="2000:2 1272:12p72 1250:12p5 0:12p1"
+
+warn() {
+    echo -e "$1" >/dev/stderr
+}
+
+die() {
+    warn "DIE: $1"
+    exit 1
+}
+
+# parse args
+while getopts ':i:o:' opt; do
+    case $opt in
+    i) INPUT=$OPTARG;;
+    o) OUTPUT=$OPTARG;;
+    ?) die "Usage: $0 -i pre-generated-iga64-file -o generated-iga64-file libs-with-iga64-assembly [...]"
+    esac
+done
+LIBS=${@:OPTIND}
+
+# read all assemblies into ASMS array
+ASMS=()
+while  read -d $'\0' asm; do
+    test -z "$asm" && continue
+    ASMS+=( "$asm" )
+done < <(for f in $LIBS; do objcopy --dump-section .iga64_assembly=/dev/stdout $f.p/*.o; done)
+
+# check if we need to recompile - checksum difference and compiler present
+MD5_ASMS="$(for a in "${ASMS[@]}"; do echo "${a#*:}"; done | md5sum|cut -b1-32)"
+MD5_PRE="$(grep -Po '(?<=^#define MD5_SUM )\S{32,32}' $INPUT 2>/dev/null)"
+
+if [ "$MD5_ASMS" = "$MD5_PRE" ]; then
+    echo "iga64 assemblies not changed, reusing pre-compiled file $INPUT."
+    cp $INPUT $OUTPUT
+    exit 0
+fi
+
+type iga64 >/dev/null || {
+    warn "WARNING: iga64 assemblies changed, but iga64 compiler not present, CHANGES will have no effect. Install iga64 (libigc-tools package) to re-compile code."
+    cp $INPUT $OUTPUT
+    exit 0
+}
+
+# returns count of numbers in strings of format "0x1234, 0x23434, ..."
+dword_count() {
+    n=${1//[^x]}
+    echo ${#n}
+}
+
+# generate code file
+WD=$OUTPUT.d
+mkdir -p $WD
+
+echo "Generating new $OUTPUT"
+
+cat <<-EOF >$OUTPUT
+/* SPDX-License-Identifier: MIT */
+/* Generated using $(iga64 |& head -1) */
+
+#include "gpgpu_shader.h"
+
+#define MD5_SUM $MD5_ASMS
+EOF
+
+for asm in "${ASMS[@]}"; do
+    asm_name="${asm%%:*}"
+    asm_code="${asm_name/assembly/code}"
+    asm_body="${asm#*:}"
+    cur_code=""
+    cur_ver=""
+    echo -e "\nstruct iga64_template const $asm_code[] = {" >>$OUTPUT
+    for gen in $GEN_VERSIONS; do
+        gen_ver="${gen%%:*}"
+        gen_name="${gen#*:}"
+        warn "Generating $asm_code for platform $gen_name"
+        cmd="cpp -P - -o $WD/$asm_name.$gen_name.asm"
+        cmd+=" -DGEN_VER=$gen_ver -imacros ../lib/iga64_macros.h"
+        eval "$cmd" <<<"$asm_body" || die "cpp error for $asm_name.$gen_name\ncmd: $cmd"
+        cmd="iga64 -Xauto-deps -Wall -p=$gen_name"
+        cmd+=" $WD/$asm_name.$gen_name.asm -o $WD/$asm_name.$gen_name.bin"
+        eval "$cmd" || die "iga64 error for $asm_name.$gen_name\ncmd: $cmd"
+        code="$(hexdump -e '"\t\t" 4/4 "0x%08x, " "\n"' $WD/$asm_name.$gen_name.bin)"
+        [ -z "$cur_code" ] && cur_code="$code"
+        [ "$cur_code" != "$code" ] && {
+            echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}," >>$OUTPUT
+            cur_code="$code"
+        }
+        cur_ver=$gen_ver
+    done
+    echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}\n};" >>$OUTPUT
+done
+
+cp $OUTPUT $INPUT
diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
index d14301789421..3317e9e35c91 100644
--- a/lib/gpgpu_shader.c
+++ b/lib/gpgpu_shader.c
@@ -11,6 +11,9 @@
 #include "gpgpu_shader.h"
 #include "gpu_cmds.h"
 
+#define IGA64_ARG0 0xc0ded000
+#define IGA64_ARG_MASK 0xffffff00
+
 #define SUPPORTED_GEN_VER 1200 /* Support TGL and up */
 
 #define PAGE_SIZE 4096
@@ -22,6 +25,42 @@
 #define GPGPU_CURBE_SIZE 0
 #define GEN7_VFE_STATE_GPGPU_MODE 1
 
+static void gpgpu_shader_extend(struct gpgpu_shader *shdr)
+{
+	shdr->max_size <<= 1;
+	shdr->code = realloc(shdr->code, 4 * shdr->max_size);
+}
+
+void
+__emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
+		  int argc, uint32_t *argv)
+{
+	uint32_t *ptr;
+
+	igt_require_f(shdr->gen_ver >= SUPPORTED_GEN_VER,
+		      "No available shader templates for platforms older than XeLP\n");
+
+	while (shdr->gen_ver < tpls->gen_ver)
+		tpls++;
+
+	while (shdr->max_size < shdr->size + tpls->size)
+		gpgpu_shader_extend(shdr);
+
+	ptr = shdr->code + shdr->size;
+	memcpy(ptr, tpls->code, 4 * tpls->size);
+
+	/* patch the template */
+	for (int n, i = 0; i < tpls->size; ++i) {
+		if ((ptr[i] & IGA64_ARG_MASK) != IGA64_ARG0)
+			continue;
+		n = ptr[i] - IGA64_ARG0;
+		igt_assert(n < argc);
+		ptr[i] = argv[n];
+	}
+
+	shdr->size += tpls->size;
+}
+
 static uint32_t fill_sip(struct intel_bb *ibb,
 			 const uint32_t sip[][4],
 			 const size_t size)
diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
index 02f6f1aad1e3..0b997deba8bb 100644
--- a/lib/gpgpu_shader.h
+++ b/lib/gpgpu_shader.h
@@ -23,6 +23,27 @@ struct gpgpu_shader {
 	};
 };
 
+struct iga64_template {
+	uint32_t gen_ver;
+	uint32_t size;
+	const uint32_t *code;
+};
+
+#pragma GCC diagnostic ignored "-Wnested-externs"
+
+void
+__emit_iga64_code(struct gpgpu_shader *shdr, const struct iga64_template *tpls,
+		  int argc, uint32_t *argv);
+
+#define emit_iga64_code(__shdr, __name, __txt, __args...) \
+({ \
+	static const char t[] __attribute__ ((section(".iga64_assembly"),used)) \
+		="iga64_assembly_" #__name ":" __txt "\n"; \
+	extern struct iga64_template const iga64_code_ ## __name[]; \
+	u32 args[] = { __args }; \
+	__emit_iga64_code(__shdr, iga64_code_ ## __name, ARRAY_SIZE(args), args); \
+})
+
 struct gpgpu_shader *gpgpu_shader_create(int fd);
 void gpgpu_shader_destroy(struct gpgpu_shader *shdr);
 
@@ -35,4 +56,8 @@ void gpgpu_shader_exec(struct intel_bb *ibb,
 		       struct gpgpu_shader *sip,
 		       uint64_t ring, bool explicit_engine);
 
+void gpgpu_shader__eot(struct gpgpu_shader *shdr);
+void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
+			       uint32_t y_offset);
+
 #endif /* GPGPU_SHADER_H */
diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
new file mode 100644
index 000000000000..449c5e9bcf31
--- /dev/null
+++ b/lib/iga64_generated_codes.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: MIT */
+/* Generated using Intel Graphics Assembler 1.1.0-int */
+
+#include "gpgpu_shader.h"
+
+#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
diff --git a/lib/iga64_macros.h b/lib/iga64_macros.h
new file mode 100644
index 000000000000..33375763a1d0
--- /dev/null
+++ b/lib/iga64_macros.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: MIT */
+
+#define ARG(n) (0xc0ded000 + n)
+
+/* send instruction for DG2+ requires 0 length in case src1 is null, BSpec: 47443 */
+#if GEN_VER < 1271
+#define src1_null null
+#else
+#define src1_null null:0
+#endif
diff --git a/lib/meson.build b/lib/meson.build
index 0a3084f8aea2..843c74e5187f 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -216,7 +216,10 @@ lib_version = vcs_tag(input : 'version.h.in', output : 'version.h',
 		      fallback : 'NO-GIT',
 		      command : vcs_command )
 
+iga64_assembly_sources = [ 'gpgpu_shader.c' ]
+
 lib_intermediates = []
+iga64_assembly_libs = []
 foreach f: lib_sources
     name = f.underscorify()
     lib = static_library('igt-' + name,
@@ -230,8 +233,23 @@ foreach f: lib_sources
 	])
 
     lib_intermediates += lib
+    if f in iga64_assembly_sources
+	iga64_assembly_libs += lib
+    endif
 endforeach
 
+iga64_generated_codes = custom_target(
+    'iga64_generated_codes.c',
+    output : 'iga64_generated_codes.c',
+    input : [ 'iga64_generated_codes.c' ] + iga64_assembly_libs,
+    command : [ './generate_iga64_codes', '-o', '@OUTPUT@', '-i', '@INPUT@' ],
+    depend_files: [ 'generate_iga64_codes' ]
+)
+
+lib_intermediates += static_library('igt-iga64_generated_codes.c',
+			[ iga64_generated_codes, lib_version ]
+		     )
+
 lib_igt_build = shared_library('igt',
     ['dummy.c'],
     link_whole: lib_intermediates,

-- 
2.34.1



* [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check
  2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
                   ` (2 preceding siblings ...)
  2024-04-29 12:08 ` [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly Andrzej Hajda
@ 2024-04-29 12:08 ` Andrzej Hajda
  2024-05-10 10:44   ` Zbigniew Kempczyński
  2024-05-10 11:30   ` Kamil Konieczny
  2024-04-29 16:19 ` ✗ Fi.CI.BUILD: failure for lib/gpgpu: add shader support Patchwork
  2024-04-29 16:21 ` ✗ GitLab.Pipeline: warning " Patchwork
  5 siblings, 2 replies; 17+ messages in thread
From: Andrzej Hajda @ 2024-04-29 12:08 UTC
  To: igt-dev
  Cc: Kamil Konieczny, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski, Andrzej Hajda

xe_exec_sip will contain tests for shader and SIP interaction.
For starters, let's implement a test checking whether a shader runs correctly.
The patch also demonstrates the usage of inline iga64 assembly.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
 lib/gpgpu_shader.c          |  63 ++++++++++++
 lib/iga64_generated_codes.c |  83 ++++++++++++++-
 tests/intel/xe_exec_sip.c   | 239 ++++++++++++++++++++++++++++++++++++++++++++
 tests/meson.build           |   1 +
 4 files changed, 385 insertions(+), 1 deletion(-)

diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
index 3317e9e35c91..cd8c82ff9c8c 100644
--- a/lib/gpgpu_shader.c
+++ b/lib/gpgpu_shader.c
@@ -248,3 +248,66 @@ void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
 	free(shdr->code);
 	free(shdr);
 }
+
+/**
+ * gpgpu_shader__eot:
+ * @shdr: shader to be modified
+ *
+ * Append end of thread instruction to @shdr.
+ */
+void gpgpu_shader__eot(struct gpgpu_shader *shdr)
+{
+	emit_iga64_code(shdr, eot, R"ASM(
+(W)     mov (8|M0)               r112.0<1>:ud  r0.0<8;8,1>:ud
+#if GEN_VER < 1250
+(W)     send.ts (16|M0)          null r112 null 0x10000000 0x02000010 {EOT,@1} // wr:1+0, rd:0; end of thread
+#else
+(W)     send.gtwy (8|M0)         null r112 src1_null     0 0x02000000 {EOT}
+#endif
+	)ASM");
+}
+
+/**
+ * gpgpu_shader__write_dword:
+ * @shdr: shader to be modified
+ * @value: dword to be written
+ * @y_offset: write target offset within the surface in rows
+ *
+ * Fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
+ */
+void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
+			       uint32_t y_offset)
+{
+	emit_iga64_code(shdr, media_block_write, R"ASM(
+	// Payload
+(W)     mov (1|M0)               r5.0<1>:ud    ARG(3):ud
+(W)     mov (1|M0)               r5.1<1>:ud    ARG(4):ud
+(W)     mov (1|M0)               r5.2<1>:ud    ARG(5):ud
+(W)     mov (1|M0)               r5.3<1>:ud    ARG(6):ud
+#if GEN_VER < 2000 // Media Block Write
+        // X offset of the block in bytes := (thread group id X << ARG(0))
+(W)     shl (1|M0)               r4.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
+        // Y offset of the block in rows := thread group id Y
+(W)     mov (1|M0)               r4.1<1>:ud    r0.6<0;1,0>:ud
+(W)     add (1|M0)               r4.1<1>:ud    r4.1<0;1,0>:ud   ARG(1):ud
+        // block width [0,63] representing 1 to 64 bytes
+(W)     mov (1|M0)               r4.2<1>:ud    ARG(2):ud
+        // FFTID := FFTID from R0 header
+(W)     mov (1|M0)               r4.4<1>:ud    r0.5<0;1,0>:ud
+(W)     send.dc1 (16|M0)         null     r4   src1_null 0    0x40A8000
+#else // Typed 2D Block Store
+        // Load r2.0-3 with tg id X << ARG(0)
+(W)     shl (1|M0)               r2.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
+        // Load r2.4-7 with tg id Y + ARG(1):ud
+(W)     mov (1|M0)               r2.1<1>:ud    r0.6<0;1,0>:ud
+(W)     add (1|M0)               r2.1<1>:ud    r2.1<0;1,0>:ud    ARG(1):ud
+        // payload setup
+(W)     mov (16|M0)              r4.0<1>:ud    0x0:ud
+        // Store X and Y block start (160:191 and 192:223)
+(W)     mov (2|M0)               r4.5<1>:ud    r2.0<2;2,1>:ud
+        // Store X and Y block max_size (224:231 and 232:239)
+(W)     mov (1|M0)               r4.7<1>:ud    ARG(2):ud
+(W)     send.tgm (16|M0)         null     r4   null:0    0    0x64000007
+#endif
+	)ASM", 2, y_offset, 3, value, value, value, value);
+}
diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
index 449c5e9bcf31..f06362d806cd 100644
--- a/lib/iga64_generated_codes.c
+++ b/lib/iga64_generated_codes.c
@@ -3,4 +3,85 @@
 
 #include "gpgpu_shader.h"
 
-#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
+#define MD5_SUM 1a47442138fa63fddb0f260694ef9edb
+
+struct iga64_template const iga64_code_media_block_write[] = {
+	{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
+		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
+		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
+		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
+		0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
+		0x80000061, 0x02150220, 0x00000064, 0x00000000,
+		0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
+		0x80100061, 0x04054220, 0x00000000, 0x00000000,
+		0x80041a61, 0x04550220, 0x00220205, 0x00000000,
+		0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
+		0x80132031, 0x00000000, 0xd00e0494, 0x04000000,
+		0x80000001, 0x00010000, 0x20000000, 0x00000000,
+		0x80000001, 0x00010000, 0x30000000, 0x00000000,
+		0x80000901, 0x00010000, 0x00000000, 0x00000000,
+	}},
+	{ .gen_ver = 1272, .size = 52, .code = (const uint32_t []) {
+		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
+		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
+		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
+		0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
+		0x80000061, 0x04150220, 0x00000064, 0x00000000,
+		0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
+		0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
+		0x80000061, 0x04450220, 0x00000054, 0x00000000,
+		0x80132031, 0x00000000, 0xc0000414, 0x02a00000,
+		0x80000001, 0x00010000, 0x20000000, 0x00000000,
+		0x80000001, 0x00010000, 0x30000000, 0x00000000,
+		0x80000901, 0x00010000, 0x00000000, 0x00000000,
+	}},
+	{ .gen_ver = 1250, .size = 56, .code = (const uint32_t []) {
+		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
+		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
+		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
+		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
+		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+		0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
+		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
+		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+		0x80001901, 0x00010000, 0x00000000, 0x00000000,
+		0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
+		0x80000001, 0x00010000, 0x20000000, 0x00000000,
+		0x80000001, 0x00010000, 0x30000000, 0x00000000,
+		0x80000901, 0x00010000, 0x00000000, 0x00000000,
+	}},
+	{ .gen_ver = 0, .size = 52, .code = (const uint32_t []) {
+		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
+		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
+		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
+		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
+		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
+		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
+		0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
+		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
+		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
+		0x80049031, 0x00000000, 0xc0000414, 0x02a00000,
+		0x80000001, 0x00010000, 0x20000000, 0x00000000,
+		0x80000001, 0x00010000, 0x30000000, 0x00000000,
+		0x80000101, 0x00010000, 0x00000000, 0x00000000,
+	}}
+};
+
+struct iga64_template const iga64_code_eot[] = {
+	{ .gen_ver = 1272, .size = 8, .code = (const uint32_t []) {
+		0x800c0061, 0x70050220, 0x00460005, 0x00000000,
+		0x800f2031, 0x00000004, 0x3000700c, 0x00000000,
+	}},
+	{ .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
+		0x80030061, 0x70050220, 0x00460005, 0x00000000,
+		0x80001901, 0x00010000, 0x00000000, 0x00000000,
+		0x80034031, 0x00000004, 0x3000700c, 0x00000000,
+	}},
+	{ .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
+		0x80030061, 0x70050220, 0x00460005, 0x00000000,
+		0x80049031, 0x00000004, 0x7020700c, 0x10000000,
+	}}
+};
diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
new file mode 100644
index 000000000000..af0eaf8cbda6
--- /dev/null
+++ b/tests/intel/xe_exec_sip.c
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+/**
+ * TEST: Tests for gpgpu shader and system routine execution
+ * Category: Software building block
+ * Sub-category: gpgpu
+ * Functionality: system routine
+ * Test category: functionality test
+ */
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include "gpgpu_shader.h"
+#include "igt.h"
+#include "igt_sysfs.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+
+#define WIDTH 64
+#define HEIGHT 64
+
+#define COLOR_C4 0xc4
+
+#define SHADER_CANARY 0x01010101
+
+#define NSEC_PER_MSEC (1000 * 1000ull)
+
+static struct intel_buf *
+create_fill_buf(int fd, int width, int height, uint8_t color)
+{
+	struct intel_buf *buf;
+	uint8_t *ptr;
+
+	buf = calloc(1, sizeof(*buf));
+	igt_assert(buf);
+
+	intel_buf_init(buf_ops_create(fd), buf, width / 4, height, 32, 0,
+		       I915_TILING_NONE, 0);
+
+	ptr = xe_bo_map(fd, buf->handle, buf->surface[0].size);
+	memset(ptr, color, buf->surface[0].size);
+	munmap(ptr, buf->surface[0].size);
+
+	return buf;
+}
+
+static struct gpgpu_shader *get_shader(int fd)
+{
+	static struct gpgpu_shader *shader;
+
+	shader = gpgpu_shader_create(fd);
+	gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
+	gpgpu_shader__eot(shader);
+	return shader;
+}
+
+static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, unsigned int threads,
+			     unsigned int width, unsigned int height)
+{
+	struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
+	struct gpgpu_shader *shader = get_shader(fd);
+
+	gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
+	gpgpu_shader_destroy(shader);
+	return buf->handle;
+}
+
+static void check_fill_buf(uint8_t *ptr, const int width, const int x,
+			   const int y, const uint8_t color)
+{
+	const uint8_t val = ptr[y * width + x];
+
+	igt_assert_f(val == color,
+		     "Expected 0x%02x, found 0x%02x at (%d,%d)\n",
+		     color, val, x, y);
+}
+
+static void check_buf(int fd, uint32_t handle, int width, int height,
+		      uint8_t poison_c)
+{
+	unsigned int sz = ALIGN(width * height, 4096);
+	int thread_count = 0;
+	uint32_t *ptr;
+	int i, j;
+
+	ptr = xe_bo_mmap_ext(fd, handle, sz, PROT_READ);
+
+	for (i = 0, j = 0; j < height / 2; ++j) {
+		if (ptr[j * width / 4] == SHADER_CANARY) {
+			++thread_count;
+			i = 4;
+		}
+
+		for (; i < width; i++)
+			check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
+
+		i = 0;
+	}
+
+	igt_assert(thread_count);
+
+	munmap(ptr, sz);
+}
+
+static const char *class_to_str(int class)
+{
+	const char *str[] = {
+		[DRM_XE_ENGINE_CLASS_RENDER] = "rcs",
+		[DRM_XE_ENGINE_CLASS_COPY] = "bcs",
+		[DRM_XE_ENGINE_CLASS_VIDEO_DECODE] = "vcs",
+		[DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE] = "vecs",
+		[DRM_XE_ENGINE_CLASS_COMPUTE] = "ccs",
+	};
+
+	if (class < ARRAY_SIZE(str))
+		return str[class];
+
+	return "unk";
+}
+
+static uint64_t xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
+{
+	struct dirent *de;
+	int engines_fd = -1;
+	int gt_fd = -1;
+	DIR *dir;
+	/* Default timeout is 5s */
+	uint64_t ret = 5ULL * MSEC_PER_SEC;
+
+	gt_fd = xe_sysfs_gt_open(fd, eci->gt_id);
+	if (gt_fd == -1)
+		return ret;
+
+	engines_fd = openat(gt_fd, "engines", O_RDONLY);
+	if (engines_fd == -1) {
+		close(gt_fd);
+		return ret;
+	}
+
+	lseek(engines_fd, 0, SEEK_SET);
+	dir = fdopendir(engines_fd);
+	while (dir && (de = readdir(dir))) {
+		int engine_fd;
+		if (strcmp(de->d_name, class_to_str(eci->engine_class)))
+			continue;
+
+		engine_fd = openat(engines_fd, de->d_name, O_RDONLY);
+		if (engine_fd < 0)
+			break;
+
+		ret = igt_sysfs_get_u64(engine_fd, "job_timeout_ms");
+		close(engine_fd);
+		break;
+	}
+
+	close(engines_fd);
+	close(gt_fd);
+	return ret;
+}
+
+/**
+ * SUBTEST: sanity
+ * Description: check basic shader with write operation
+ * Run type: BAT
+ *
+ */
+static void test_sip(struct drm_xe_engine_class_instance *eci, uint32_t flags)
+{
+	unsigned int threads = 512;
+	unsigned int height = max_t(threads, HEIGHT, threads * 2);
+	uint32_t exec_queue_id, handle, vm_id;
+	unsigned int width = WIDTH;
+	struct timespec ts = { };
+	uint64_t timeout;
+	struct intel_bb *ibb;
+	int fd;
+
+	igt_debug("Using %s\n", xe_engine_class_string(eci->engine_class));
+
+	fd = drm_open_driver(DRIVER_XE);
+	xe_device_get(fd);
+
+	vm_id = xe_vm_create(fd, 0, 0);
+
+	/* Get timeout for job, and add 4s to ensure timeout processes in subtest. */
+	timeout = xe_sysfs_get_job_timeout_ms(fd, eci) + 4ull * MSEC_PER_SEC;
+	timeout *= NSEC_PER_MSEC;
+	timeout *= igt_run_in_simulation() ? 10 : 1;
+
+	exec_queue_id = xe_exec_queue_create(fd, vm_id, eci, 0);
+	ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
+
+	igt_nsec_elapsed(&ts);
+	handle = gpgpu_shader(fd, ibb, threads, width, height);
+
+	intel_bb_sync(ibb);
+	igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
+
+	check_buf(fd, handle, width, height, COLOR_C4);
+
+	gem_close(fd, handle);
+	intel_bb_destroy(ibb);
+
+	xe_exec_queue_destroy(fd, exec_queue_id);
+	xe_vm_destroy(fd, vm_id);
+	xe_device_put(fd);
+	close(fd);
+}
+
+#define test_render_and_compute(t, __fd, __eci) \
+	igt_subtest_with_dynamic(t) \
+		xe_for_each_engine(__fd, __eci) \
+			if (__eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
+			    __eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) \
+				igt_dynamic_f("%s%d", xe_engine_class_string(__eci->engine_class), \
+					      __eci->engine_instance)
+
+igt_main
+{
+	struct drm_xe_engine_class_instance *eci;
+	int fd;
+
+	igt_fixture {
+		fd = drm_open_driver(DRIVER_XE);
+		xe_device_get(fd);
+	}
+
+	test_render_and_compute("sanity", fd, eci)
+		test_sip(eci, 0);
+
+	igt_fixture {
+		xe_device_put(fd);
+		close(fd);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 65b8bf23b972..63588e473616 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -292,6 +292,7 @@ intel_xe_progs = [
 	'xe_exec_fault_mode',
 	'xe_exec_queue_property',
 	'xe_exec_reset',
+	'xe_exec_sip',
 	'xe_exec_store',
 	'xe_exec_threads',
 	'xe_exercise_blt',

-- 
2.34.1
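
The generated tables in this patch list the variants of each code block in
descending gen_ver order and end with a gen_ver = 0 entry; .size is the
length of .code in dwords (for example 13 instructions x 4 dwords = 52).
Below is a minimal sketch of how such a table could be searched, assuming
the lookup takes the first entry that is not newer than the device.
pick_template() is a hypothetical name used only for illustration; the
actual selection code is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the table entries in the patch. */
struct iga64_template {
	uint32_t gen_ver;	/* oldest generation this variant targets */
	uint32_t size;		/* length of code[] in dwords */
	const uint32_t *code;
};

/* The three iga64_code_eot variants, copied from the patch. */
static const struct iga64_template eot_table[] = {
	{ .gen_ver = 1272, .size = 8, .code = (const uint32_t []) {
		0x800c0061, 0x70050220, 0x00460005, 0x00000000,
		0x800f2031, 0x00000004, 0x3000700c, 0x00000000,
	}},
	{ .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
		0x80030061, 0x70050220, 0x00460005, 0x00000000,
		0x80001901, 0x00010000, 0x00000000, 0x00000000,
		0x80034031, 0x00000004, 0x3000700c, 0x00000000,
	}},
	{ .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
		0x80030061, 0x70050220, 0x00460005, 0x00000000,
		0x80049031, 0x00000004, 0x7020700c, 0x10000000,
	}}
};

/*
 * Hypothetical helper: walk a table sorted by descending gen_ver and
 * return the first entry whose gen_ver does not exceed the device's.
 * The gen_ver = 0 entry always matches, so the walk terminates.
 */
static const struct iga64_template *
pick_template(const struct iga64_template *tpls, uint32_t dev_gen_ver)
{
	assert(tpls != NULL);

	while (dev_gen_ver < tpls->gen_ver)
		tpls++;

	return tpls;
}
```

With this rule a device reporting 1260 would get the 1250 variant and a
device reporting 1200 would fall through to the gen_ver = 0 entry.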


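check_buf() in the test accepts a row either when every byte still holds the
fill color or when its first dword holds SHADER_CANARY and the remaining
bytes hold the fill color. The rule can be sketched on plain memory as
follows, assuming one row is width bytes. count_canary_rows() is a
hypothetical helper written for this illustration; unlike the test it
returns -1 on a mismatch instead of asserting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADER_CANARY 0x01010101u	/* value used by the test */

/*
 * Count the rows that start with the canary dword. Every byte that is
 * not part of a canary must equal 'fill'; otherwise return -1.
 */
static int count_canary_rows(const uint8_t *buf, int width, int rows,
			     uint8_t fill)
{
	int hits = 0;

	assert(buf != NULL && width >= 4);

	for (int y = 0; y < rows; y++) {
		const uint8_t *row = buf + (size_t)y * width;
		uint32_t first;
		int x = 0;

		/* memcpy avoids any alignment assumption on the row. */
		memcpy(&first, row, sizeof(first));
		if (first == SHADER_CANARY) {
			hits++;
			x = 4;	/* skip the dword written by the shader */
		}

		for (; x < width; x++)
			if (row[x] != fill)
				return -1;
	}

	return hits;
}
```

The test additionally requires at least one canary row, which corresponds
to checking that the returned count is greater than zero.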

* Re: [PATCH 2/4] lib/gpgpu_shader: tooling for preparing and running gpgpu shaders
  2024-04-29 12:08 ` [PATCH 2/4] lib/gpgpu_shader: tooling for preparing and running gpgpu shaders Andrzej Hajda
@ 2024-04-29 12:23   ` Grzegorzek, Dominik
  0 siblings, 0 replies; 17+ messages in thread
From: Grzegorzek, Dominik @ 2024-04-29 12:23 UTC (permalink / raw)
  To: igt-dev@lists.freedesktop.org, Hajda, Andrzej
  Cc: Piatkowski, Dominik Karol, Manszewski, Christoph,
	kamil.konieczny@linux.intel.com

On Mon, 2024-04-29 at 14:08 +0200, Andrzej Hajda wrote:
> Implement tooling for building shaders for specific generations.
> The library allows you to build and run shader from precompiled blocks
> and provides an abstraction layer over gpgpu pipeline.
> 
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> Signed-off-by: Christoph Manszewski <christoph.manszewski@intel.com>
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> Signed-off-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@intel.com>

Andrzej's Signed-off-by appears twice here. Dominik Karol's s-o-b was added to that commit
internally because he modified some instructions that you stripped anyway, so I would remove it.

~Dominik
> ---
>  lib/gpgpu_shader.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  lib/gpgpu_shader.h |  38 ++++++++++
>  lib/meson.build    |   1 +
>  3 files changed, 250 insertions(+)
> 
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> new file mode 100644
> index 000000000000..d14301789421
> --- /dev/null
> +++ b/lib/gpgpu_shader.c
> @@ -0,0 +1,211 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation
> + *
> + * Author: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> + */
> +
> +#include <i915_drm.h>
> +
> +#include "ioctl_wrappers.h"
> +#include "gpgpu_shader.h"
> +#include "gpu_cmds.h"
> +
> +#define SUPPORTED_GEN_VER 1200 /* Support TGL and up */
> +
> +#define PAGE_SIZE 4096
> +#define BATCH_STATE_SPLIT 2048
> +/* VFE STATE params */
> +#define THREADS (1 << 16) /* max value */
> +#define GEN8_GPGPU_URB_ENTRIES 1
> +#define GPGPU_URB_SIZE 0
> +#define GPGPU_CURBE_SIZE 0
> +#define GEN7_VFE_STATE_GPGPU_MODE 1
> +
> +static uint32_t fill_sip(struct intel_bb *ibb,
> +			 const uint32_t sip[][4],
> +			 const size_t size)
> +{
> +	uint32_t *sip_dst;
> +	uint32_t offset;
> +
> +	intel_bb_ptr_align(ibb, 16);
> +	sip_dst = intel_bb_ptr(ibb);
> +	offset = intel_bb_offset(ibb);
> +
> +	memcpy(sip_dst, sip, size);
> +
> +	intel_bb_ptr_add(ibb, size);
> +
> +	return offset;
> +}
> +
> +static void emit_sip(struct intel_bb *ibb, const uint64_t offset)
> +{
> +	intel_bb_out(ibb, GEN4_STATE_SIP | (3 - 2));
> +	intel_bb_out(ibb, lower_32_bits(offset));
> +	intel_bb_out(ibb, upper_32_bits(offset));
> +}
> +
> +static void
> +__xelp_gpgpu_execfunc(struct intel_bb *ibb,
> +		      struct intel_buf *target,
> +		      unsigned int x_dim, unsigned int y_dim,
> +		      struct gpgpu_shader *shdr,
> +		      struct gpgpu_shader *sip,
> +		      uint64_t ring, bool explicit_engine)
> +{
> +	uint32_t interface_descriptor, sip_offset;
> +	uint64_t engine;
> +
> +	intel_bb_add_intel_buf(ibb, target, true);
> +
> +	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
> +
> +	interface_descriptor = gen8_fill_interface_descriptor(ibb, target,
> +							      shdr->instr,
> +							      4 * shdr->size);
> +
> +	if (sip && sip->size)
> +		sip_offset = fill_sip(ibb, sip->instr, 4 * sip->size);
> +	else
> +		sip_offset = 0;
> +
> +	intel_bb_ptr_set(ibb, 0);
> +
> +	/* GPGPU pipeline */
> +	intel_bb_out(ibb, GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
> +		     PIPELINE_SELECT_GPGPU);
> +
> +	gen9_emit_state_base_address(ibb);
> +
> +	xelp_emit_vfe_state(ibb, THREADS, GEN8_GPGPU_URB_ENTRIES,
> +			    GPGPU_URB_SIZE, GPGPU_CURBE_SIZE, true);
> +
> +	gen7_emit_interface_descriptor_load(ibb, interface_descriptor);
> +
> +	if (sip_offset)
> +		emit_sip(ibb, sip_offset);
> +
> +	gen8_emit_gpgpu_walk(ibb, 0, 0, x_dim * 16, y_dim);
> +
> +	intel_bb_out(ibb, MI_BATCH_BUFFER_END);
> +	intel_bb_ptr_align(ibb, 32);
> +
> +	engine = explicit_engine ? ring : I915_EXEC_DEFAULT;
> +	intel_bb_exec(ibb, intel_bb_offset(ibb),
> +		      engine | I915_EXEC_NO_RELOC, false);
> +}
> +
> +static void
> +__xehp_gpgpu_execfunc(struct intel_bb *ibb,
> +		      struct intel_buf *target,
> +		      unsigned int x_dim, unsigned int y_dim,
> +		      struct gpgpu_shader *shdr,
> +		      struct gpgpu_shader *sip,
> +		      uint64_t ring, bool explicit_engine)
> +{
> +	struct xehp_interface_descriptor_data idd;
> +	uint32_t sip_offset;
> +	uint64_t engine;
> +
> +	intel_bb_add_intel_buf(ibb, target, true);
> +
> +	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
> +
> +	xehp_fill_interface_descriptor(ibb, target, shdr->instr,
> +				       4 * shdr->size, &idd);
> +
> +	if (sip && sip->size)
> +		sip_offset = fill_sip(ibb, sip->instr, 4 * sip->size);
> +	else
> +		sip_offset = 0;
> +
> +	intel_bb_ptr_set(ibb, 0);
> +
> +	/* GPGPU pipeline */
> +	intel_bb_out(ibb, GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
> +		     PIPELINE_SELECT_GPGPU);
> +	xehp_emit_state_base_address(ibb);
> +	xehp_emit_state_compute_mode(ibb);
> +	xehp_emit_state_binding_table_pool_alloc(ibb);
> +	xehp_emit_cfe_state(ibb, THREADS);
> +
> +	if (sip_offset)
> +		emit_sip(ibb, sip_offset);
> +
> +	xehp_emit_compute_walk(ibb, 0, 0, x_dim * 16, y_dim, &idd, 0x0);
> +
> +	intel_bb_out(ibb, MI_BATCH_BUFFER_END);
> +	intel_bb_ptr_align(ibb, 32);
> +
> +	engine = explicit_engine ? ring : I915_EXEC_DEFAULT;
> +	intel_bb_exec(ibb, intel_bb_offset(ibb),
> +		      engine | I915_EXEC_NO_RELOC, false);
> +
> +}
> +
> +/**
> + * gpgpu_shader_exec:
> + * @ibb: pointer to initialized intel_bb
> + * @target: pointer to initialized intel_buf to be written by shader/sip
> + * @x_dim: gpgpu/compute walker thread group width
> + * @y_dim: gpgpu/compute walker thread group height
> + * @shdr: shader to be executed
> + * @sip: sip to be executed, can be NULL
> + * @ring: engine index
> + * @explicit_engine: whether to use provided engine index
> + *
> + * Execute provided shader in asynchronous fashion. To wait for completion,
> + * caller has to use the provided ibb handle.
> + */
> +void gpgpu_shader_exec(struct intel_bb *ibb,
> +		       struct intel_buf *target,
> +		       unsigned int x_dim, unsigned int y_dim,
> +		       struct gpgpu_shader *shdr,
> +		       struct gpgpu_shader *sip,
> +		       uint64_t ring, bool explicit_engine)
> +{
> +	igt_require(shdr->gen_ver >= SUPPORTED_GEN_VER);
> +	igt_assert(ibb->size >= PAGE_SIZE);
> +	igt_assert(ibb->ptr == ibb->batch);
> +
> +	if (shdr->gen_ver >= 1250)
> +		__xehp_gpgpu_execfunc(ibb, target, x_dim, y_dim, shdr, sip,
> +				      ring, explicit_engine);
> +	else
> +		__xelp_gpgpu_execfunc(ibb, target, x_dim, y_dim, shdr, sip,
> +				      ring, explicit_engine);
> +}
> +
> +/**
> + * gpgpu_shader_create:
> + * @fd: drm fd - i915 or xe
> + *
> + * Creates empty shader.
> + *
> + * Returns: pointer to empty shader struct.
> + */
> +struct gpgpu_shader *gpgpu_shader_create(int fd)
> +{
> +	struct gpgpu_shader *shdr = calloc(1, sizeof(struct gpgpu_shader));
> +	const struct intel_device_info *info;
> +
> +	info = intel_get_device_info(intel_get_drm_devid(fd));
> +	shdr->gen_ver = 100 * info->graphics_ver + info->graphics_rel;
> +	shdr->max_size = 16 * 4;
> +	shdr->code = malloc(4 * shdr->max_size);
> +	return shdr;
> +}
> +
> +/**
> + * gpgpu_shader_destroy:
> + * @shdr: pointer to shader struct created with 'gpgpu_shader_create'
> + *
> + * Frees resources of gpgpu_shader struct.
> + */
> +void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
> +{
> +	free(shdr->code);
> +	free(shdr);
> +}
> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
> new file mode 100644
> index 000000000000..02f6f1aad1e3
> --- /dev/null
> +++ b/lib/gpgpu_shader.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +#ifndef GPGPU_SHADER_H
> +#define GPGPU_SHADER_H
> +
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <stdlib.h>
> +
> +struct intel_bb;
> +struct intel_buf;
> +
> +struct gpgpu_shader {
> +	uint32_t gen_ver;
> +	uint32_t size;
> +	uint32_t max_size;
> +	union {
> +		uint32_t *code;
> +		uint32_t (*instr)[4];
> +	};
> +};
> +
> +struct gpgpu_shader *gpgpu_shader_create(int fd);
> +void gpgpu_shader_destroy(struct gpgpu_shader *shdr);
> +
> +void gpgpu_shader_dump(struct gpgpu_shader *shdr);
> +
> +void gpgpu_shader_exec(struct intel_bb *ibb,
> +		       struct intel_buf *target,
> +		       unsigned int x_dim, unsigned int y_dim,
> +		       struct gpgpu_shader *shdr,
> +		       struct gpgpu_shader *sip,
> +		       uint64_t ring, bool explicit_engine);
> +
> +#endif /* GPGPU_SHADER_H */
> diff --git a/lib/meson.build b/lib/meson.build
> index e2f740c116f8..0a3084f8aea2 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -72,6 +72,7 @@ lib_sources = [
>  	'media_spin.c',
>  	'media_fill.c',
>  	'gpgpu_fill.c',
> +	'gpgpu_shader.c',
>  	'gpu_cmds.c',
>  	'rendercopy_i915.c',
>  	'rendercopy_i830.c',
> 
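
The quoted gpgpu_shader_create() encodes the device generation as
100 * graphics_ver + graphics_rel, and gpgpu_shader_exec() uses that number
to skip anything below 1200 and to choose the XeHP path from 1250 upwards.
A standalone sketch of that arithmetic, assuming nothing beyond the
constants visible in the patch; encode_gen_ver(), select_exec_path() and
the enum are hypothetical names, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

#define SUPPORTED_GEN_VER 1200	/* "Support TGL and up", as in the patch */

enum exec_path { PATH_UNSUPPORTED, PATH_XELP, PATH_XEHP };

/* Fold version and release into one comparable number, e.g. 12.50 -> 1250. */
static uint32_t encode_gen_ver(unsigned int graphics_ver,
			       unsigned int graphics_rel)
{
	/* A release of 100 or more would collide with the next version. */
	assert(graphics_rel < 100);

	return 100 * graphics_ver + graphics_rel;
}

/* Mirror the checks in the quoted gpgpu_shader_exec(). */
static enum exec_path select_exec_path(uint32_t gen_ver)
{
	if (gen_ver < SUPPORTED_GEN_VER)
		return PATH_UNSUPPORTED;	/* the library skips via igt_require() */

	return gen_ver >= 1250 ? PATH_XEHP : PATH_XELP;
}
```

For instance, version 12 release 50 encodes to 1250 and takes the XeHP
path, while version 12 release 0 encodes to 1200 and takes the XeLP path.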



* Re: [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state
  2024-04-29 12:08 ` [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state Andrzej Hajda
@ 2024-04-29 12:37   ` Grzegorzek, Dominik
  0 siblings, 0 replies; 17+ messages in thread
From: Grzegorzek, Dominik @ 2024-04-29 12:37 UTC (permalink / raw)
  To: igt-dev@lists.freedesktop.org, Hajda, Andrzej
  Cc: Piatkowski, Dominik Karol, Manszewski, Christoph,
	kamil.konieczny@linux.intel.com

On Mon, 2024-04-29 at 14:08 +0200, Andrzej Hajda wrote:
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>

With a commit message explaining that this is needed in order to disable EU fusion, it is:

Reviewed-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
> ---
>  lib/gpu_cmds.c | 29 +++++++++++++++++++++++------
>  lib/gpu_cmds.h |  6 ++++++
>  2 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/gpu_cmds.c b/lib/gpu_cmds.c
> index da41121ce945..c73d56cc3f8c 100644
> --- a/lib/gpu_cmds.c
> +++ b/lib/gpu_cmds.c
> @@ -651,10 +651,10 @@ gen7_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
>  	intel_bb_out(ibb, 0);
>  }
>  
> -void
> -gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
> -		    uint32_t urb_entries, uint32_t urb_size,
> -		    uint32_t curbe_size)
> +static void
> +__gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
> +		      uint32_t urb_entries, uint32_t urb_size,
> +		      uint32_t curbe_size, bool legacy_mode)
>  {
>  	intel_bb_out(ibb, GEN7_MEDIA_VFE_STATE | (9 - 2));
>  
> @@ -662,8 +662,8 @@ gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
>  	intel_bb_out(ibb, 0);
>  	intel_bb_out(ibb, 0);
>  
> -	/* number of threads & urb entries */
> -	intel_bb_out(ibb, threads << 16 | urb_entries << 8);
> +	/* number of threads & urb entries & eu fusion */
> +	intel_bb_out(ibb, threads << 16 | urb_entries << 8 | legacy_mode << 6);
>  
>  	intel_bb_out(ibb, 0);
>  
> @@ -676,6 +676,15 @@ gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
>  	intel_bb_out(ibb, 0);
>  }
>  
> +void
> +gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
> +		    uint32_t urb_entries, uint32_t urb_size,
> +		    uint32_t curbe_size)
> +{
> +	__gen8_emit_vfe_state(ibb, threads, urb_entries, urb_size, curbe_size,
> +			      false);
> +}
> +
>  void
>  gen7_emit_curbe_load(struct intel_bb *ibb, uint32_t curbe_buffer)
>  {
> @@ -864,6 +873,14 @@ gen7_emit_media_objects(struct intel_bb *ibb,
>  			gen_emit_media_object(ibb, x + i * 16, y + j * 16);
>  }
>  
> +void xelp_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
> +			 uint32_t urb_entries, uint32_t urb_size,
> +			 uint32_t curbe_size, bool legacy_mode)
> +{
> +	__gen8_emit_vfe_state(ibb, threads, urb_entries, urb_size,
> +			      curbe_size, legacy_mode);
> +}
> +
>  /*
>   * XEHP
>   */
> diff --git a/lib/gpu_cmds.h b/lib/gpu_cmds.h
> index 348c6c9453e9..1b9156a80c7c 100644
> --- a/lib/gpu_cmds.h
> +++ b/lib/gpu_cmds.h
> @@ -81,6 +81,12 @@ void
>  gen8_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
>  		    uint32_t urb_entries, uint32_t urb_size,
>  		    uint32_t curbe_size);
> +
> +void
> +xelp_emit_vfe_state(struct intel_bb *ibb, uint32_t threads,
> +		    uint32_t urb_entries, uint32_t urb_size,
> +		    uint32_t curbe_size, bool legacy_mode);
> +
>  void
>  gen7_emit_curbe_load(struct intel_bb *ibb, uint32_t curbe_buffer);
>  
> 
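
The functional change in the quoted hunk is one extra bit in the dword that
already carries the thread count and URB entry count. A small sketch of how
that dword is composed, using the same shifts as the patch; the function
name vfe_threads_dword() is hypothetical, and the reading of bit 6 as the
EU fusion control follows the patch comment and the review remark rather
than any documentation quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compose the MEDIA_VFE_STATE dword written by the patch:
 *   bits 31:16  number of threads
 *   bits 15:8   URB entries
 *   bit  6      legacy_mode (per the patch comment: EU fusion control)
 */
static uint32_t vfe_threads_dword(uint32_t threads, uint32_t urb_entries,
				  bool legacy_mode)
{
	/* Keep the URB entry count from spilling into the thread field. */
	assert(urb_entries <= 0xff);

	/* In 32-bit arithmetic, bits of 'threads' above bit 15 are shifted out. */
	return threads << 16 | urb_entries << 8 | (uint32_t)legacy_mode << 6;
}
```

With urb_entries = 1 the existing gen8_emit_vfe_state() wrapper
(legacy_mode = false) yields 0x100 in the low half, while
xelp_emit_vfe_state() called with legacy_mode = true yields 0x140.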



* ✗ Fi.CI.BUILD: failure for lib/gpgpu: add shader support
  2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
                   ` (3 preceding siblings ...)
  2024-04-29 12:08 ` [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check Andrzej Hajda
@ 2024-04-29 16:19 ` Patchwork
  2024-04-29 16:21 ` ✗ GitLab.Pipeline: warning " Patchwork
  5 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2024-04-29 16:19 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: igt-dev

== Series Details ==

Series: lib/gpgpu: add shader support
URL   : https://patchwork.freedesktop.org/series/133020/
State : failure

== Summary ==

IGT patchset build failed on latest successful build
ce6ce0f60dd1a6c0df93a01ad71a31964158a2cf intel_reg: Move decoding behind an option

Tail of build.log:
[326/1687] Compiling C object 'lib/76b5a35@@igt-igt_aux_c@sta/igt_aux.c.o'.
[327/1687] Compiling C object 'lib/76b5a35@@igt-rendercopy_gen6_c@sta/rendercopy_gen6.c.o'.
[328/1687] Compiling C object 'lib/76b5a35@@igt-rendercopy_gen7_c@sta/rendercopy_gen7.c.o'.
[329/1687] Compiling C object 'tests/59830eb@@kms_multipipe_modeset@exe/kms_multipipe_modeset.c.o'.
[330/1687] Compiling C object 'tests/59830eb@@kms_force_connector_basic@exe/kms_force_connector_basic.c.o'.
[331/1687] Compiling C object 'lib/76b5a35@@igt-gpu_cmds_c@sta/gpu_cmds.c.o'.
[332/1687] Compiling C object 'lib/76b5a35@@igt-igt_audio_c@sta/igt_audio.c.o'.
[333/1687] Compiling C object 'tests/59830eb@@kms_panel_fitting@exe/kms_panel_fitting.c.o'.
[334/1687] Compiling C object 'tests/59830eb@@kms_invalid_mode@exe/kms_invalid_mode.c.o'.
[335/1687] Compiling C object 'lib/76b5a35@@igt-igt_pm_c@sta/igt_pm.c.o'.
[336/1687] Compiling C object 'tests/59830eb@@kms_plane_lowres@exe/kms_plane_lowres.c.o'.
[337/1687] Compiling C object 'lib/76b5a35@@igt-intel_compute_c@sta/intel_compute.c.o'.
[338/1687] Compiling C object 'lib/76b5a35@@igt-intel_bufops_c@sta/intel_bufops.c.o'.
[339/1687] Compiling C object 'tests/59830eb@@kms_rmfb@exe/kms_rmfb.c.o'.
[340/1687] Compiling C object 'tests/59830eb@@kms_selftest@exe/kms_selftest.c.o'.
[341/1687] Compiling C object 'lib/76b5a35@@igt-igt_vmwgfx_c@sta/igt_vmwgfx.c.o'.
[342/1687] Compiling C object 'tests/59830eb@@kms_plane_cursor@exe/kms_plane_cursor.c.o'.
[343/1687] Compiling C object 'tests/59830eb@@kms_flip@exe/kms_flip.c.o'.
[344/1687] Compiling C object 'tests/59830eb@@kms_prop_blob@exe/kms_prop_blob.c.o'.
[345/1687] Compiling C object 'tests/59830eb@@kms_sysfs_edid_timing@exe/kms_sysfs_edid_timing.c.o'.
[346/1687] Compiling C object 'tests/59830eb@@kms_scaling_modes@exe/kms_scaling_modes.c.o'.
[347/1687] Compiling C object 'tests/59830eb@@kms_plane_multiple@exe/kms_plane_multiple.c.o'.
[348/1687] Compiling C object 'tests/59830eb@@kms_sequence@exe/kms_sequence.c.o'.
[349/1687] Compiling C object 'tests/59830eb@@kms_hdr@exe/kms_hdr.c.o'.
[350/1687] Compiling C object 'lib/76b5a35@@igt-igt_amd_c@sta/igt_amd.c.o'.
[351/1687] Compiling C object 'tests/59830eb@@kms_pipe_crc_basic@exe/kms_pipe_crc_basic.c.o'.
[352/1687] Compiling C object 'lib/76b5a35@@igt-intel_blt_c@sta/intel_blt.c.o'.
[353/1687] Compiling C object 'tests/59830eb@@kms_prime@exe/kms_prime.c.o'.
[354/1687] Compiling C object 'tests/59830eb@@kms_plane_alpha_blend@exe/kms_plane_alpha_blend.c.o'.
[355/1687] Compiling C object 'lib/76b5a35@@igt-igt_kmod_c@sta/igt_kmod.c.o'.
[356/1687] Generating i915-perf-registers-acmgt3 with a custom command.
[357/1687] Compiling C object 'lib/76b5a35@@igt-intel_batchbuffer_c@sta/intel_batchbuffer.c.o'.
[358/1687] Compiling C object 'tests/59830eb@@kms_plane@exe/kms_plane.c.o'.
[359/1687] Compiling C object 'tests/59830eb@@kms_cursor_legacy@exe/kms_cursor_legacy.c.o'.
[360/1687] Compiling C object 'tests/59830eb@@kms_setmode@exe/kms_setmode.c.o'.
[361/1687] Compiling C object 'tests/59830eb@@kms_lease@exe/kms_lease.c.o'.
[362/1687] Compiling C object 'tests/59830eb@@kms_properties@exe/kms_properties.c.o'.
[363/1687] Compiling C object 'lib/76b5a35@@igt-igt_chamelium_c@sta/igt_chamelium.c.o'.
[364/1687] Compiling C object 'tests/59830eb@@kms_atomic@exe/kms_atomic.c.o'.
[365/1687] Generating i915-perf-metrics-acmgt3 with a custom command.
[366/1687] Compiling C object 'lib/76b5a35@@igt-igt_core_c@sta/igt_core.c.o'.
[367/1687] Compiling C object 'tests/59830eb@@kms_rotation_crc@exe/kms_rotation_crc.c.o'.
[368/1687] Compiling C object 'lib/76b5a35@@igt-rendercopy_gen8_c@sta/rendercopy_gen8.c.o'.
[369/1687] Compiling C object 'lib/76b5a35@@igt-rendercopy_gen9_c@sta/rendercopy_gen9.c.o'.
[370/1687] Compiling C object 'tests/59830eb@@kms_plane_scaling@exe/kms_plane_scaling.c.o'.
[371/1687] Compiling C object 'lib/76b5a35@@igt-i915_intel_decode_c@sta/i915_intel_decode.c.o'.
[372/1687] Compiling C object 'lib/76b5a35@@igt-igt_fb_c@sta/igt_fb.c.o'.
[373/1687] Compiling C object 'lib/76b5a35@@igt-igt_kms_c@sta/igt_kms.c.o'.
[374/1687] Generating i915-perf-equations with a custom command.
ninja: build stopped: subcommand failed.




* ✗ GitLab.Pipeline: warning for lib/gpgpu: add shader support
  2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
                   ` (4 preceding siblings ...)
  2024-04-29 16:19 ` ✗ Fi.CI.BUILD: failure for lib/gpgpu: add shader support Patchwork
@ 2024-04-29 16:21 ` Patchwork
  5 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2024-04-29 16:21 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: igt-dev

== Series Details ==

Series: lib/gpgpu: add shader support
URL   : https://patchwork.freedesktop.org/series/133020/
State : warning

== Summary ==

Pipeline status: FAILED.

see https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/pipelines/1166154 for the overview.

build:tests-debian-meson has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165334):
  ninja: build stopped: subcommand failed.
  ninja: Entering directory `build'
  [1/1516] Generating version.h with a custom command.
  [2/1512] Linking static target lib/libigt-amdgpu_amd_mmd_shared_c.a.
  [3/1512] Linking static target lib/libigt-igt_frame_c.a.
  [4/1512] Linking static target lib/libigt-igt_audio_c.a.
  [5/1512] Linking static target lib/libigt-igt_alsa_c.a.
  [6/1512] Linking static target lib/libigt-igt_chamelium_c.a.
  [7/1512] Linking static target lib/libigt-igt_chamelium_stream_c.a.
  [8/1512] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: 1: ./generate_iga64_codes: not found
  ninja: build stopped: subcommand failed.
  section_end:1714407656:step_script
  section_start:1714407656:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407657:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-debian-meson-arm64 has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165337):
  ninja: Entering directory `build'
  [1/1153] Generating version.h with a custom command.
  [2/1149] Linking static target lib/libigt-igt_kms_c.a.
  [3/1149] Linking static target lib/libigt-amdgpu_amd_ip_blocks_c.a.
  [4/1149] Linking static target lib/libigt-amdgpu_amd_gfx_c.a.
  [5/1149] Linking static target lib/libigt-amdgpu_amd_pci_unplug_c.a.
  [6/1149] Linking static target lib/libigt-igt_frame_c.a.
  [7/1149] Linking static target lib/libigt-amdgpu_amd_cp_dma_c.a.
  [8/1149] Linking static target lib/libigt-amdgpu_amd_mmd_shared_c.a.
  [9/1149] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: 1: ./generate_iga64_codes: not found
  ninja: build stopped: subcommand failed.
  section_end:1714407669:step_script
  section_start:1714407669:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407669:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-debian-meson-armhf has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165336):
  ninja: build stopped: subcommand failed.
  ninja: Entering directory `build'
  [1/1152] Generating version.h with a custom command.
  [2/1148] Linking static target lib/libigt-amdgpu_amd_ip_blocks_c.a.
  [3/1148] Linking static target lib/libigt-amdgpu_amd_pci_unplug_c.a.
  [4/1148] Linking static target lib/libigt-amdgpu_amd_dispatch_helpers_c.a.
  [5/1148] Linking static target lib/libigt-igt_frame_c.a.
  [6/1148] Linking static target lib/libigt-amdgpu_amd_cp_dma_c.a.
  [7/1148] Linking static target lib/libigt-amdgpu_amd_mmd_shared_c.a.
  [8/1148] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: 1: ./generate_iga64_codes: not found
  ninja: build stopped: subcommand failed.
  section_end:1714407642:step_script
  section_start:1714407642:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407643:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-debian-meson-mips has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165338):
  ninja: Entering directory `build'
  [1/1149] Generating version.h with a custom command.
  [2/1145] Linking static target lib/libigt-amdgpu_amd_ip_blocks_c.a.
  [3/1145] Linking static target lib/libigt-amdgpu_amd_pci_unplug_c.a.
  [4/1145] Linking static target lib/libigt-igt_frame_c.a.
  [5/1145] Linking static target lib/libigt-igt_audio_c.a.
  [6/1145] Linking static target lib/libigt-amdgpu_amd_cp_dma_c.a.
  [7/1145] Linking static target lib/libigt-amdgpu_amd_mmd_shared_c.a.
  [8/1145] Linking static target lib/libigt-igt_alsa_c.a.
  [9/1145] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: 1: ./generate_iga64_codes: not found
  ninja: build stopped: subcommand failed.
  section_end:1714407651:step_script
  section_start:1714407651:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407652:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-debian-minimal has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165335):
  ninja: Entering directory `build'
  [1/331] Generating version.h with a custom command.
  [2/327] Linking static target lib/libigt-igt_kms_c.a.
  [3/327] Linking static target lib/libigt-igt_fb_c.a.
  [4/327] Linking static target lib/libigt-igt_amd_c.a.
  [5/327] Linking static target lib/libigt-igt_msm_c.a.
  [6/327] Linking static target lib/libigt-xe_xe_gt_c.a.
  [7/327] Linking static target lib/libigt-xe_xe_mmio_c.a.
  [8/327] Linking static target lib/libigt-xe_xe_util_c.a.
  [9/327] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: 1: ./generate_iga64_codes: not found
  ninja: build stopped: subcommand failed.
  section_end:1714407633:step_script
  section_start:1714407633:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407634:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-fedora has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165329):
  ninja: Entering directory `build'
  [1/1527] Generating version.h with a custom command.
  [2/1523] Linking static target lib/libigt-amdgpu_amd_pci_unplug_c.a.
  [3/1523] Linking static target lib/libigt-amdgpu_amd_mmd_shared_c.a.
  [4/1523] Linking static target lib/libigt-amdgpu_amd_dispatch_c.a.
  [5/1523] Linking static target lib/libigt-igt_frame_c.a.
  [6/1523] Linking static target lib/libigt-igt_audio_c.a.
  [7/1523] Linking static target lib/libigt-igt_chamelium_c.a.
  [8/1523] Linking static target lib/libigt-monitor_edids_monitor_edids_helper_c.a.
  [9/1523] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: ./generate_iga64_codes: No such file or directory
  ninja: build stopped: subcommand failed.
  section_end:1714407644:step_script
  section_start:1714407644:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407645:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-fedora-clang has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165333):
  #define ARRAY_SIZE(arr) (sizeof(arr)/sizeof(arr[0]))
                                 ^~~~~
  ../lib/gpgpu_shader.c:312:3: error: expected ';' after expression
          )ASM", 2, y_offset, 3, value, value, value, value);
           ^
           ;
  ../lib/gpgpu_shader.c:312:6: warning: missing terminating '"' character [-Winvalid-pp-token]
          )ASM", 2, y_offset, 3, value, value, value, value);
              ^
  ../lib/gpgpu_shader.c:312:3: error: use of undeclared identifier 'ASM'
          )ASM", 2, y_offset, 3, value, value, value, value);
           ^
  6 warnings and 12 errors generated.
  ninja: build stopped: subcommand failed.
  section_end:1714407645:step_script
  section_start:1714407645:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407646:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-fedora-no-libdrm-nouveau has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165332):
  ninja: build stopped: subcommand failed.
  ninja: Entering directory `build'
  [1/1389] Generating version.h with a custom command.
  [2/1385] Linking static target lib/libigt-igt_kms_c.a.
  [3/1385] Linking static target lib/libigt-xe_xe_spin_c.a.
  [4/1385] Linking static target lib/libigt-igt_audio_c.a.
  [5/1385] Linking static target lib/libigt-igt_alsa_c.a.
  [6/1385] Linking static target lib/libigt-igt_chamelium_c.a.
  [7/1385] Linking static target lib/libigt-igt_chamelium_stream_c.a.
  [8/1385] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: ./generate_iga64_codes: No such file or directory
  ninja: build stopped: subcommand failed.
  section_end:1714407659:step_script
  section_start:1714407659:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407660:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-fedora-no-libunwind has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165330):
  ninja: Entering directory `build'
  [1/1525] Generating version.h with a custom command.
  [2/1521] Linking static target lib/libigt-monitor_edids_monitor_edids_helper_c.a.
  [3/1521] Linking static target lib/libigt-amdgpu_amd_pci_unplug_c.a.
  [4/1521] Linking static target lib/libigt-amdgpu_amd_mmd_shared_c.a.
  [5/1521] Linking static target lib/libigt-amdgpu_amd_dispatch_c.a.
  [6/1521] Linking static target lib/libigt-igt_audio_c.a.
  [7/1521] Linking static target lib/libigt-igt_alsa_c.a.
  [8/1521] Linking static target lib/libigt-igt_chamelium_stream_c.a.
  [9/1521] Generating iga64_generated_codes.c with a custom command.
  FAILED: lib/iga64_generated_codes.c 
  ./generate_iga64_codes -o lib/iga64_generated_codes.c -i ../lib/iga64_generated_codes.c lib/libigt-gpgpu_shader_c.a
  /bin/sh: ./generate_iga64_codes: No such file or directory
  ninja: build stopped: subcommand failed.
  section_end:1714407667:step_script
  section_start:1714407667:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407667:cleanup_file_variables
  ERROR: Job failed: exit code 1
  

build:tests-fedora-oldest-meson has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/58165331):
  Checking for function "outb" : YES
  Checking if "cpuid.h" links: YES
  Header <unistd.h> has symbol "gettid": YES
  Checking whether type "struct sysinfo" has member "totalram" : YES
  Checking for function "memfd_create" : YES
  Configuring config.h using configuration
  Program python3 found: YES (/usr/bin/python3)
  lib/meson.build:236: WARNING: Identifier 'in' will become a reserved keyword in a future release. Please rename it.
  
  lib/meson.build:236:9: ERROR:  Expecting eol got id.
      if f in iga64_assembly_sources
           ^
  
  A full log can be found at /builds/gfx-ci/igt-ci-tags/build/meson-logs/meson-log.txt
  section_end:1714407632:step_script
  section_start:1714407632:cleanup_file_variables
  Cleaning up project directory and file based variables
  section_end:1714407633:cleanup_file_variables
  ERROR: Job failed: exit code 1

== Logs ==

For more details see: https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/pipelines/1166154

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly
  2024-04-29 12:08 ` [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly Andrzej Hajda
@ 2024-05-10  5:52   ` Zbigniew Kempczyński
  2024-05-10 10:42   ` Zbigniew Kempczyński
  2024-05-10 11:18   ` Kamil Konieczny
  2 siblings, 0 replies; 17+ messages in thread
From: Zbigniew Kempczyński @ 2024-05-10  5:52 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek,
	Christoph Manszewski, Dominik Karol Piątkowski

On Mon, Apr 29, 2024 at 02:08:19PM +0200, Andrzej Hajda wrote:
> With this patch adding iga64 assembly should be similar to
> adding x86 assembly inline. Simple example:
>     emit_iga64_code(shdr, set_exception, R"ASM(
>         or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud
>     )ASM", value);
> Note presence of 'ARG(0)', it will be replaced by 'value' argument,
> multiple arguments are possible.
> More sophisticated examples in following patches.
> How does it work:
> 1. Raw string literals (C++ feature available in gcc as extension):
>    R"ASM(...)ASM" allows to use multiline/unescaped string literals.
>    If for some reason they cannot be used we could always fallback to
>    old ugly way of handling multiline strings with escape characters:
>     emit_iga64_code(shdr, set_exception, "\n\
>         or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud\n\
>     ", value);
> 2. emit_iga64_code puts the assembly string into special linker section,
>    and calls __emit_iga64_code with pointer to external variable
>    which will contain code templates generated from the assembly for all
>    supported platforms; the remaining arguments are put into a temporary array
>    to eventually patch the code with positional arguments.
> 3. During build phase the linker section is scanned for assemblies.
>    Every assembly is preprocessed with cpp, to replace ARG(x) macros with
>    magic numbers, and to provide different code for different platforms
>    if needed. Then output file is compiled with iga64, and then .c file
>    is generated with global variables pointing to hexified iga64 codes.
> 
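The ARG(n) mechanism from steps 2 and 3 can be sketched in a few lines of
shell. This is only an illustration of the idea, not the code from the patch:
patch_template is a hypothetical helper, and the dwords are passed as text.
The two constants are the ones the patch defines in lib/gpgpu_shader.c.

```shell
# Sketch only: mimic how ARG(n) placeholders are patched into a template.
# Constants match IGA64_ARG0 / IGA64_ARG_MASK from lib/gpgpu_shader.c.
IGA64_ARG0=$((0xc0ded000))
IGA64_ARG_MASK=$((0xffffff00))

# patch_template "<dwords>" arg0 arg1 ...   (hypothetical helper)
# Every dword whose upper 24 bits equal the magic prefix is replaced
# by the n-th argument, where n is the low byte of the dword.
patch_template() {
    dwords=$1
    shift
    out=""
    for w in $dwords; do
        v=$(($w))
        if [ $((v & IGA64_ARG_MASK)) -eq "$IGA64_ARG0" ]; then
            n=$((v - IGA64_ARG0))
            # same role as igt_assert(n < argc) in the C code
            [ "$n" -lt "$#" ] || { echo "ARG($n) has no value" >&2; return 1; }
            # fetch the (n+1)-th positional argument
            eval "v=\$((\${$((n + 1))}))"
        fi
        out="$out$(printf '0x%08x' "$v") "
    done
    echo "${out% }"
}
```

For example, patch_template "0x80000061 0xc0ded000" 7 leaves the first dword
alone and replaces the second one with 0x00000007. As in the C code, any
instruction dword that happens to match the magic prefix would be patched too.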
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> ---
>  lib/generate_iga64_codes    | 104 ++++++++++++++++++++++++++++++++++++++++++++
>  lib/gpgpu_shader.c          |  39 +++++++++++++++++
>  lib/gpgpu_shader.h          |  25 +++++++++++
>  lib/iga64_generated_codes.c |   6 +++
>  lib/iga64_macros.h          |  10 +++++
>  lib/meson.build             |  18 ++++++++
>  6 files changed, 202 insertions(+)
> 
> diff --git a/lib/generate_iga64_codes b/lib/generate_iga64_codes
> new file mode 100755
> index 000000000000..efc2a29b409c
> --- /dev/null
> +++ b/lib/generate_iga64_codes
> @@ -0,0 +1,104 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: MIT
> +# Copyright © 2024 Intel Corporation
> +# Author: Andrzej Hajda <andrzej.hajda@intel.com>
> +
> +# List of supported platforms, in format gen100:platform, where gen100 equals
> +# to minimal GPU generation supported by platform multiplied by 100 and platform
> +# is one of platforms supported by -p switch of iga64.
> +#
> +# Must be in decreasing order, the last one must have gen100 equal 0"
> +GEN_VERSIONS="2000:2 1272:12p72 1250:12p5 0:12p1"
> +
> +warn() {
> +    echo -e "$1" >/dev/stderr
> +}
> +
> +die() {
> +    warn "DIE: $1"
> +    exit 1
> +}
> +
> +# parse args
> +while getopts ':i:o:' opt; do
> +    case $opt in
> +    i) INPUT=$OPTARG;;
> +    o) OUTPUT=$OPTARG;;
> +    ?) die "Usage: $0 -i pre-generated-iga64-file -o generated-iga64-file libs-with-iga64-assembly [...]"
> +    esac
> +done
> +LIBS=${@:OPTIND}
> +
> +# read all assemblies into ASMS array
> +ASMS=()
> +while  read -d $'\0' asm; do
> +    test -z "$asm" && continue
> +    ASMS+=( "$asm" )
> +done < <(for f in $LIBS; do objcopy --dump-section .iga64_assembly=/dev/stdout $f.p/*.o; done)
> +
> +# check if we need to recompile - checksum difference and compiler present
> +MD5_ASMS="$(for a in "${ASMS[@]}"; do echo "${a#*:}"; done | md5sum|cut -b1-32)"
> +MD5_PRE="$(grep -Po '(?<=^#define MD5_SUM )\S{32,32}' $INPUT 2>/dev/null)"
> +
> +if [ "$MD5_ASMS" = "$MD5_PRE" ]; then
> +    echo "iga64 assemblies not changed, reusing pre-compiled file $INPUT."
> +    cp $INPUT $OUTPUT
> +    exit 0
> +fi

Great, igt will compile without iga64 if the asm code md5 hasn't changed.

> +
> +type iga64 >/dev/null || {
> +    warn "WARNING: iga64 assemblies changed, but iga64 compiler not present, CHANGES will have no effect. Install iga64 (libigc-tools package) to re-compile code."
> +    cp $INPUT $OUTPUT
> +    exit 0
> +}

This might be confusing. The binary's existence is not enough: it should
support a superset of the platforms defined in GEN_VERSIONS, otherwise someone
will be confused - libigc-tools is installed, so why is it failing?
The best approach imo would be to compare the platforms supported by iga64
with GEN_VERSIONS and, if any are missing, print a warning and fall back to
compiling with the predefined file.

I'm still staring at your code, so I might have more comments later.
Your solution for keeping inline assembly for the gpu is impressive
and we definitely want to merge it.

--
Zbigniew

> +
> +# returns count of numbers in strings of format "0x1234, 0x23434, ..."
> +dword_count() {
> +    n=${1//[^x]}
> +    echo ${#n}
> +}
> +
> +# generate code file
> +WD=$OUTPUT.d
> +mkdir -p $WD
> +
> +echo "Generating new $OUTPUT"
> +
> +cat <<-EOF >$OUTPUT
> +/* SPDX-License-Identifier: MIT */
> +/* Generated using $(iga64 |& head -1) */
> +
> +#include "gpgpu_shader.h"
> +
> +#define MD5_SUM $MD5_ASMS
> +EOF
> +
> +for asm in "${ASMS[@]}"; do
> +    asm_name="${asm%%:*}"
> +    asm_code="${asm_name/assembly/code}"
> +    asm_body="${asm#*:}"
> +    cur_code=""
> +    cur_ver=""
> +    echo -e "\nstruct iga64_template const $asm_code[] = {" >>$OUTPUT
> +    for gen in $GEN_VERSIONS; do
> +        gen_ver="${gen%%:*}"
> +        gen_name="${gen#*:}"
> +        warn "Generating $asm_code for platform $gen_name"
> +        cmd="cpp -P - -o $WD/$asm_name.$gen_name.asm"
> +        cmd+=" -DGEN_VER=$gen_ver -imacros ../lib/iga64_macros.h"
> +        eval "$cmd" <<<"$asm_body" || die "cpp error for $asm_name.$gen_name\ncmd: $cmd"
> +        cmd="iga64 -Xauto-deps -Wall -p=$gen_name"
> +        cmd+=" $WD/$asm_name.$gen_name.asm -o $WD/$asm_name.$gen_name.bin"
> +        eval "$cmd" || die "iga64 error for $asm_name.$gen_name\ncmd: $cmd"
> +        code="$(hexdump -e '"\t\t" 4/4 "0x%08x, " "\n"' $WD/$asm_name.$gen_name.bin)"
> +        [ -z "$cur_code" ] && cur_code="$code"
> +        [ "$cur_code" != "$code" ] && {
> +            echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}," >>$OUTPUT
> +            cur_code="$code"
> +        }
> +        cur_ver=$gen_ver
> +    done
> +    echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}\n};" >>$OUTPUT
> +done
> +
> +cp $OUTPUT $INPUT
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> index d14301789421..3317e9e35c91 100644
> --- a/lib/gpgpu_shader.c
> +++ b/lib/gpgpu_shader.c
> @@ -11,6 +11,9 @@
>  #include "gpgpu_shader.h"
>  #include "gpu_cmds.h"
>  
> +#define IGA64_ARG0 0xc0ded000
> +#define IGA64_ARG_MASK 0xffffff00
> +
>  #define SUPPORTED_GEN_VER 1200 /* Support TGL and up */
>  
>  #define PAGE_SIZE 4096
> @@ -22,6 +25,42 @@
>  #define GPGPU_CURBE_SIZE 0
>  #define GEN7_VFE_STATE_GPGPU_MODE 1
>  
> +static void gpgpu_shader_extend(struct gpgpu_shader *shdr)
> +{
> +	shdr->max_size <<= 1;
> +	shdr->code = realloc(shdr->code, 4 * shdr->max_size);
> +}
> +
> +void
> +__emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
> +		  int argc, uint32_t *argv)
> +{
> +	uint32_t *ptr;
> +
> +	igt_require_f(shdr->gen_ver >= SUPPORTED_GEN_VER,
> +		      "No available shader templates for platforms older than XeLP\n");
> +
> +	while (shdr->gen_ver < tpls->gen_ver)
> +		tpls++;
> +
> +	while (shdr->max_size < shdr->size + tpls->size)
> +		gpgpu_shader_extend(shdr);
> +
> +	ptr = shdr->code + shdr->size;
> +	memcpy(ptr, tpls->code, 4 * tpls->size);
> +
> +	/* patch the template */
> +	for (int n, i = 0; i < tpls->size; ++i) {
> +		if ((ptr[i] & IGA64_ARG_MASK) != IGA64_ARG0)
> +			continue;
> +		n = ptr[i] - IGA64_ARG0;
> +		igt_assert(n < argc);
> +		ptr[i] = argv[n];
> +	}
> +
> +	shdr->size += tpls->size;
> +}
> +
>  static uint32_t fill_sip(struct intel_bb *ibb,
>  			 const uint32_t sip[][4],
>  			 const size_t size)
> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
> index 02f6f1aad1e3..0b997deba8bb 100644
> --- a/lib/gpgpu_shader.h
> +++ b/lib/gpgpu_shader.h
> @@ -23,6 +23,27 @@ struct gpgpu_shader {
>  	};
>  };
>  
> +struct iga64_template {
> +	uint32_t gen_ver;
> +	uint32_t size;
> +	const uint32_t *code;
> +};
> +
> +#pragma GCC diagnostic ignored "-Wnested-externs"
> +
> +void
> +__emit_iga64_code(struct gpgpu_shader *shdr, const struct iga64_template *tpls,
> +		  int argc, uint32_t *argv);
> +
> +#define emit_iga64_code(__shdr, __name, __txt, __args...) \
> +({ \
> +	static const char t[] __attribute__ ((section(".iga64_assembly"),used)) \
> +		="iga64_assembly_" #__name ":" __txt "\n"; \
> +	extern struct iga64_template const iga64_code_ ## __name[]; \
> +	u32 args[] = { __args }; \
> +	__emit_iga64_code(__shdr, iga64_code_ ## __name, ARRAY_SIZE(args), args); \
> +})
> +
>  struct gpgpu_shader *gpgpu_shader_create(int fd);
>  void gpgpu_shader_destroy(struct gpgpu_shader *shdr);
>  
> @@ -35,4 +56,8 @@ void gpgpu_shader_exec(struct intel_bb *ibb,
>  		       struct gpgpu_shader *sip,
>  		       uint64_t ring, bool explicit_engine);
>  
> +void gpgpu_shader__eot(struct gpgpu_shader *shdr);
> +void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> +			       uint32_t y_offset);
> +
>  #endif /* GPGPU_SHADER_H */
> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> new file mode 100644
> index 000000000000..449c5e9bcf31
> --- /dev/null
> +++ b/lib/iga64_generated_codes.c
> @@ -0,0 +1,6 @@
> +/* SPDX-License-Identifier: MIT */
> +/* Generated using Intel Graphics Assembler 1.1.0-int */
> +
> +#include "gpgpu_shader.h"
> +
> +#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
> diff --git a/lib/iga64_macros.h b/lib/iga64_macros.h
> new file mode 100644
> index 000000000000..33375763a1d0
> --- /dev/null
> +++ b/lib/iga64_macros.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +#define ARG(n) (0xc0ded000 + n)
> +
> +/* send instruction for DG2+ requires 0 length in case src1 is null, BSpec: 47443 */
> +#if GEN_VER < 1271
> +#define src1_null null
> +#else
> +#define src1_null null:0
> +#endif
> diff --git a/lib/meson.build b/lib/meson.build
> index 0a3084f8aea2..843c74e5187f 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -216,7 +216,10 @@ lib_version = vcs_tag(input : 'version.h.in', output : 'version.h',
>  		      fallback : 'NO-GIT',
>  		      command : vcs_command )
>  
> +iga64_assembly_sources = [ 'gpgpu_shader.c' ]
> +
>  lib_intermediates = []
> +iga64_assembly_libs = []
>  foreach f: lib_sources
>      name = f.underscorify()
>      lib = static_library('igt-' + name,
> @@ -230,8 +233,23 @@ foreach f: lib_sources
>  	])
>  
>      lib_intermediates += lib
> +    if f in iga64_assembly_sources
> +	iga64_assembly_libs += lib
> +    endif
>  endforeach
>  
> +iga64_generated_codes = custom_target(
> +    'iga64_generated_codes.c',
> +    output : 'iga64_generated_codes.c',
> +    input : [ 'iga64_generated_codes.c' ] + iga64_assembly_libs,
> +    command : [ './generate_iga64_codes', '-o', '@OUTPUT@', '-i', '@INPUT@' ],
> +    depend_files: [ 'generate_iga64_codes' ]
> +)
> +
> +lib_intermediates += static_library('igt-iga64_generated_codes.c',
> +			[ iga64_generated_codes, lib_version ]
> +		     )
> +
>  lib_igt_build = shared_library('igt',
>      ['dummy.c'],
>      link_whole: lib_intermediates,
> 
> -- 
> 2.34.1
> 

* Re: [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly
  2024-04-29 12:08 ` [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly Andrzej Hajda
  2024-05-10  5:52   ` Zbigniew Kempczyński
@ 2024-05-10 10:42   ` Zbigniew Kempczyński
  2024-05-14  9:39     ` Andrzej Hajda
  2024-05-10 11:18   ` Kamil Konieczny
  2 siblings, 1 reply; 17+ messages in thread
From: Zbigniew Kempczyński @ 2024-05-10 10:42 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek,
	Christoph Manszewski, Dominik Karol Piątkowski

On Mon, Apr 29, 2024 at 02:08:19PM +0200, Andrzej Hajda wrote:
<cut>

> +# check if we need to recompile - checksum difference and compiler present
> +MD5_ASMS="$(for a in "${ASMS[@]}"; do echo "${a#*:}"; done | md5sum|cut -b1-32)"

Why not use:
MD5_ASMS=$(echo "${ASMS[@]}" | md5sum | cut -b1-32)

I'm not sure, but your check doesn't seem to handle a function (asm) rename.
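
To illustrate the concern, a small sketch with two hypothetical helpers.
sum_bodies mirrors the quoted line (the "${a#*:}" expansion drops the
"name:" prefix before hashing); sum_entries hashes the whole entry, name
included, one entry per line rather than joined by echo as in the one-liner
above.

```shell
# Checksum over assembly bodies only, as in the quoted patch line:
# "${a#*:}" strips everything up to the first ':' (the snippet name).
sum_bodies() {
    for a in "$@"; do echo "${a#*:}"; done | md5sum | cut -b1-32
}

# Checksum over complete entries, so the snippet name is hashed as well.
sum_entries() {
    printf '%s\n' "$@" | md5sum | cut -b1-32
}

old="iga64_assembly_eot:mov r1 r0"
new="iga64_assembly_end_of_thread:mov r1 r0"

# A rename keeps the body checksum but changes the entry checksum.
[ "$(sum_bodies "$old")" = "$(sum_bodies "$new")" ] && echo "bodies: rename not detected"
[ "$(sum_entries "$old")" != "$(sum_entries "$new")" ] && echo "entries: rename detected"
```

With the body-only checksum a renamed snippet would reuse the pre-generated
file, which still carries the old symbol name.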
--
Zbigniew

* Re: [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check
  2024-04-29 12:08 ` [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check Andrzej Hajda
@ 2024-05-10 10:44   ` Zbigniew Kempczyński
  2024-05-14  9:49     ` Andrzej Hajda
  2024-05-10 11:30   ` Kamil Konieczny
  1 sibling, 1 reply; 17+ messages in thread
From: Zbigniew Kempczyński @ 2024-05-10 10:44 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek,
	Christoph Manszewski, Dominik Karol Piątkowski

On Mon, Apr 29, 2024 at 02:08:20PM +0200, Andrzej Hajda wrote:
> xe_exec_sip will contain tests for shader and SIP interaction.
> For starters let's implement test checking if shader is run correctly.
> The patch also demonstrates usage of inline iga64 assembly.
> 
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> ---
>  lib/gpgpu_shader.c          |  63 ++++++++++++
>  lib/iga64_generated_codes.c |  83 ++++++++++++++-
>  tests/intel/xe_exec_sip.c   | 239 ++++++++++++++++++++++++++++++++++++++++++++
>  tests/meson.build           |   1 +
>  4 files changed, 385 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> index 3317e9e35c91..cd8c82ff9c8c 100644
> --- a/lib/gpgpu_shader.c
> +++ b/lib/gpgpu_shader.c
> @@ -248,3 +248,66 @@ void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
>  	free(shdr->code);
>  	free(shdr);
>  }
> +
> +/**
> + * gpgpu_shader__eot:
> + * @shdr: shader to be modified
> + *
> + * Append end of thread instruction to @shdr.
> + */
> +void gpgpu_shader__eot(struct gpgpu_shader *shdr)
> +{
> +	emit_iga64_code(shdr, eot, R"ASM(
> +(W)     mov (8|M0)               r112.0<1>:ud  r0.0<8;8,1>:ud
> +#if GEN_VER < 1250
> +(W)     send.ts (16|M0)          null r112 null 0x10000000 0x02000010 {EOT,@1} // wr:1+0, rd:0; end of thread
> +#else
> +(W)     send.gtwy (8|M0)         null r112 src1_null     0 0x02000000 {EOT}
> +#endif
> +	)ASM");
> +}
> +
> +/**
> + * gpgpu_shader__write_dword:
> + * @shdr: shader to be modified
> + * @value: dword to be written
> + * @y_offset: write target offset within the surface in rows
> + *
> + * Fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
> + */
> +void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> +			       uint32_t y_offset)
> +{
> +	emit_iga64_code(shdr, media_block_write, R"ASM(
> +	// Payload
> +(W)     mov (1|M0)               r5.0<1>:ud    ARG(3):ud
> +(W)     mov (1|M0)               r5.1<1>:ud    ARG(4):ud
> +(W)     mov (1|M0)               r5.2<1>:ud    ARG(5):ud
> +(W)     mov (1|M0)               r5.3<1>:ud    ARG(6):ud
> +#if GEN_VER < 2000 // Media Block Write
> +        // X offset of the block in bytes := (thread group id X << ARG(0))
> +(W)     shl (1|M0)               r4.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
> +        // Y offset of the block in rows := thread group id Y
> +(W)     mov (1|M0)               r4.1<1>:ud    r0.6<0;1,0>:ud
> +(W)     add (1|M0)               r4.1<1>:ud    r4.1<0;1,0>:ud   ARG(1):ud
> +        // block width [0,63] representing 1 to 64 bytes
> +(W)     mov (1|M0)               r4.2<1>:ud    ARG(2):ud
> +        // FFTID := FFTID from R0 header
> +(W)     mov (1|M0)               r4.4<1>:ud    r0.5<0;1,0>:ud
> +(W)     send.dc1 (16|M0)         null     r4   src1_null 0    0x40A8000
> +#else // Typed 2D Block Store
> +        // Load r2.0-3 with tg id X << ARG(0)
> +(W)     shl (1|M0)               r2.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
> +        // Load r2.4-7 with tg id Y + ARG(1):ud
> +(W)     mov (1|M0)               r2.1<1>:ud    r0.6<0;1,0>:ud
> +(W)     add (1|M0)               r2.1<1>:ud    r2.1<0;1,0>:ud    ARG(1):ud
> +        // payload setup
> +(W)     mov (16|M0)              r4.0<1>:ud    0x0:ud
> +        // Store X and Y block start (160:191 and 192:223)
> +(W)     mov (2|M0)               r4.5<1>:ud    r2.0<2;2,1>:ud
> +        // Store X and Y block max_size (224:231 and 232:239)
> +(W)     mov (1|M0)               r4.7<1>:ud    ARG(2):ud
> +(W)     send.tgm (16|M0)         null     r4   null:0    0    0x64000007
> +#endif
> +	)ASM", 2, y_offset, 3, value, value, value, value);
> +}
> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> index 449c5e9bcf31..f06362d806cd 100644
> --- a/lib/iga64_generated_codes.c
> +++ b/lib/iga64_generated_codes.c
> @@ -3,4 +3,85 @@
>  
>  #include "gpgpu_shader.h"
>  
> -#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
> +#define MD5_SUM 1a47442138fa63fddb0f260694ef9edb
> +
> +struct iga64_template const iga64_code_media_block_write[] = {
> +	{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
> +		0x80000061, 0x02150220, 0x00000064, 0x00000000,
> +		0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
> +		0x80100061, 0x04054220, 0x00000000, 0x00000000,
> +		0x80041a61, 0x04550220, 0x00220205, 0x00000000,
> +		0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
> +		0x80132031, 0x00000000, 0xd00e0494, 0x04000000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
> +	}},
> +	{ .gen_ver = 1272, .size = 52, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
> +		0x80000061, 0x04150220, 0x00000064, 0x00000000,
> +		0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
> +		0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
> +		0x80000061, 0x04450220, 0x00000054, 0x00000000,
> +		0x80132031, 0x00000000, 0xc0000414, 0x02a00000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
> +	}},
> +	{ .gen_ver = 1250, .size = 56, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> +		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> +		0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
> +		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> +		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> +		0x80001901, 0x00010000, 0x00000000, 0x00000000,
> +		0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
> +	}},
> +	{ .gen_ver = 0, .size = 52, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> +		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> +		0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
> +		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> +		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> +		0x80049031, 0x00000000, 0xc0000414, 0x02a00000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000101, 0x00010000, 0x00000000, 0x00000000,
> +	}}
> +};
> +
> +struct iga64_template const iga64_code_eot[] = {

Where's .gen_ver = 2000?
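
One possible reading of generate_iga64_codes from patch 3/4, sketched below
with hypothetical helpers: consecutive platforms whose binaries are identical
are merged into one entry labelled with the lowest gen_ver of the run, and
the lookup in __emit_iga64_code takes the first entry with gen_ver not
greater than the shader's. The code is an opaque token here, and whether the
Xe2 and 12p72 binaries really are identical is an assumption.

```shell
# merge_templates "ver:code" ...   (entries in decreasing gen_ver order)
# Follows the cur_code/cur_ver loop of the generator script.
merge_templates() {
    cur_code=""
    cur_ver=""
    for entry in "$@"; do
        ver=${entry%%:*}
        code=${entry#*:}
        [ -z "$cur_code" ] && cur_code=$code
        if [ "$cur_code" != "$code" ]; then
            echo "$cur_ver:$cur_code"    # close the run of identical code
            cur_code=$code
        fi
        cur_ver=$ver
    done
    echo "$cur_ver:$cur_code"
}

# pick_template gen_ver "ver:code" ...
# First entry with ver <= gen_ver, like the while loop in __emit_iga64_code.
pick_template() {
    gen=$1
    shift
    for entry in "$@"; do
        if [ "$gen" -ge "${entry%%:*}" ]; then
            echo "${entry#*:}"
            return 0
        fi
    done
    return 1
}
```

With inputs 2000:A 1272:A 1250:B 0:C the merge prints 1272:A, 1250:B and 0:C,
and pick_template 2000 on that result still selects A.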

--
Zbigniew

> +	{ .gen_ver = 1272, .size = 8, .code = (const uint32_t []) {
> +		0x800c0061, 0x70050220, 0x00460005, 0x00000000,
> +		0x800f2031, 0x00000004, 0x3000700c, 0x00000000,
> +	}},
> +	{ .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
> +		0x80030061, 0x70050220, 0x00460005, 0x00000000,
> +		0x80001901, 0x00010000, 0x00000000, 0x00000000,
> +		0x80034031, 0x00000004, 0x3000700c, 0x00000000,
> +	}},
> +	{ .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> +		0x80030061, 0x70050220, 0x00460005, 0x00000000,
> +		0x80049031, 0x00000004, 0x7020700c, 0x10000000,
> +	}}
> +};
> diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
> new file mode 100644
> index 000000000000..af0eaf8cbda6
> --- /dev/null
> +++ b/tests/intel/xe_exec_sip.c
> @@ -0,0 +1,239 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +/**
> + * TEST: Tests for gpgpu shader and system routine execution
> + * Category: Software building block
> + * Sub-category: gpgpu
> + * Functionality: system routine
> + * Test category: functionality test
> + */
> +
> +#include <dirent.h>
> +#include <fcntl.h>
> +#include <stdio.h>
> +#include "gpgpu_shader.h"
> +#include "igt.h"
> +#include "igt_sysfs.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +
> +#define WIDTH 64
> +#define HEIGHT 64
> +
> +#define COLOR_C4 0xc4
> +
> +#define SHADER_CANARY 0x01010101
> +
> +#define NSEC_PER_MSEC (1000 * 1000ull)
> +
> +static struct intel_buf *
> +create_fill_buf(int fd, int width, int height, uint8_t color)
> +{
> +	struct intel_buf *buf;
> +	uint8_t *ptr;
> +
> +	buf = calloc(1, sizeof(*buf));
> +	igt_assert(buf);
> +
> +	intel_buf_init(buf_ops_create(fd), buf, width / 4, height, 32, 0,
> +		       I915_TILING_NONE, 0);
> +
> +	ptr = xe_bo_map(fd, buf->handle, buf->surface[0].size);
> +	memset(ptr, color, buf->surface[0].size);
> +	munmap(ptr, buf->surface[0].size);
> +
> +	return buf;
> +}
> +
> +static struct gpgpu_shader *get_shader(int fd)
> +{
> +	static struct gpgpu_shader *shader;
> +
> +	shader = gpgpu_shader_create(fd);
> +	gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> +	gpgpu_shader__eot(shader);
> +	return shader;
> +}
> +
> +static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, unsigned int threads,
> +			     unsigned int width, unsigned int height)
> +{
> +	struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
> +	struct gpgpu_shader *shader = get_shader(fd);
> +
> +	gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
> +	gpgpu_shader_destroy(shader);
> +	return buf->handle;
> +}
> +
> +static void check_fill_buf(uint8_t *ptr, const int width, const int x,
> +			   const int y, const uint8_t color)
> +{
> +	const uint8_t val = ptr[y * width + x];
> +
> +	igt_assert_f(val == color,
> +		     "Expected 0x%02x, found 0x%02x at (%d,%d)\n",
> +		     color, val, x, y);
> +}
> +
> +static void check_buf(int fd, uint32_t handle, int width, int height,
> +		      uint8_t poison_c)
> +{
> +	unsigned int sz = ALIGN(width * height, 4096);
> +	int thread_count = 0;
> +	uint32_t *ptr;
> +	int i, j;
> +
> +	ptr = xe_bo_mmap_ext(fd, handle, sz, PROT_READ);
> +
> +	for (i = 0, j = 0; j < height / 2; ++j) {
> +		if (ptr[j * width / 4] == SHADER_CANARY) {
> +			++thread_count;
> +			i = 4;
> +		}
> +
> +		for (; i < width; i++)
> +			check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
> +
> +		i = 0;
> +	}
> +
> +	igt_assert(thread_count);
> +
> +	munmap(ptr, sz);
> +}
> +
> +static const char *class_to_str(int class)
> +{
> +        const char *str[] = {
> +                [DRM_XE_ENGINE_CLASS_RENDER] = "rcs",
> +                [DRM_XE_ENGINE_CLASS_COPY] = "bcs",
> +                [DRM_XE_ENGINE_CLASS_VIDEO_DECODE] = "vcs",
> +                [DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE] = "vecs",
> +		[DRM_XE_ENGINE_CLASS_COMPUTE] = "ccs",
> +        };
> +
> +        if (class < ARRAY_SIZE(str))
> +                return str[class];
> +
> +        return "unk";
> +}
> +
> +static uint64_t xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
> +{
> +	struct dirent *de;
> +	int engines_fd = -1;
> +	int gt_fd = -1;
> +	DIR *dir;
> +	/* Default timeout is 5s */
> +	uint64_t ret = 5ULL * MSEC_PER_SEC;
> +
> +	gt_fd = xe_sysfs_gt_open(fd, eci->gt_id);
> +	if (gt_fd == -1)
> +		return ret;
> +
> +	engines_fd = openat(gt_fd, "engines", O_RDONLY);
> +	if (engines_fd == -1) {
> +		close(gt_fd);
> +		return ret;
> +	}
> +
> +	lseek(engines_fd, 0, SEEK_SET);
> +	dir = fdopendir(engines_fd);
> +	while (dir && (de = readdir(dir))) {
> +		int engine_fd;
> +		if (strcmp(de->d_name, class_to_str(eci->engine_class)))
> +			continue;
> +
> +		engine_fd = openat(engines_fd, de->d_name, O_RDONLY);
> +		if (engine_fd < 0)
> +			break;
> +
> +		ret = igt_sysfs_get_u64(engine_fd, "job_timeout_ms");
> +		close(engine_fd);
> +		break;
> +	}
> +
> +	close(engines_fd);
> +	close(gt_fd);
> +	return ret;
> +}
> +
> +/**
> + * SUBTEST: sanity
> + * Description: check basic shader with write operation
> + * Run type: BAT
> + *
> + */
> +static void test_sip(struct drm_xe_engine_class_instance *eci, uint32_t flags)
> +{
> +	unsigned int threads = 512;
> +	unsigned int height = max_t(threads, HEIGHT, threads * 2);
> +	uint32_t exec_queue_id, handle, vm_id;
> +	unsigned int width = WIDTH;
> +	struct timespec ts = { };
> +	uint64_t timeout;
> +	struct intel_bb *ibb;
> +	int fd;
> +
> +	igt_debug("Using %s\n", xe_engine_class_string(eci->engine_class));
> +
> +	fd = drm_open_driver(DRIVER_XE);
> +	xe_device_get(fd);
> +
> +	vm_id = xe_vm_create(fd, 0, 0);
> +
> +	/* Get timeout for job, and add 4s to ensure timeout processes in subtest. */
> +	timeout = xe_sysfs_get_job_timeout_ms(fd, eci) + 4ull * MSEC_PER_SEC;
> +	timeout *= NSEC_PER_MSEC;
> +	timeout *= igt_run_in_simulation() ? 10 : 1;
> +
> +	exec_queue_id = xe_exec_queue_create(fd, vm_id, eci, 0);
> +	ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
> +
> +	igt_nsec_elapsed(&ts);
> +	handle = gpgpu_shader(fd, ibb, threads, width, height);
> +
> +	intel_bb_sync(ibb);
> +	igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
> +
> +	check_buf(fd, handle, width, height, COLOR_C4);
> +
> +	gem_close(fd, handle);
> +	intel_bb_destroy(ibb);
> +
> +	xe_exec_queue_destroy(fd, exec_queue_id);
> +	xe_vm_destroy(fd, vm_id);
> +	xe_device_put(fd);
> +	close(fd);
> +}
> +
> +#define test_render_and_compute(t, __fd, __eci) \
> +	igt_subtest_with_dynamic(t) \
> +		xe_for_each_engine(__fd, __eci) \
> +			if (__eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
> +			    __eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) \
> +				igt_dynamic_f("%s%d", xe_engine_class_string(__eci->engine_class), \
> +					      __eci->engine_instance)
> +
> +igt_main
> +{
> +	struct drm_xe_engine_class_instance *eci;
> +	int fd;
> +
> +	igt_fixture {
> +		fd = drm_open_driver(DRIVER_XE);
> +		xe_device_get(fd);
> +	}
> +
> +	test_render_and_compute("sanity", fd, eci)
> +		test_sip(eci, 0);
> +
> +	igt_fixture {
> +		xe_device_put(fd);
> +		close(fd);
> +	}
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 65b8bf23b972..63588e473616 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -292,6 +292,7 @@ intel_xe_progs = [
>  	'xe_exec_fault_mode',
>  	'xe_exec_queue_property',
>  	'xe_exec_reset',
> +	'xe_exec_sip',
>  	'xe_exec_store',
>  	'xe_exec_threads',
>  	'xe_exercise_blt',
> 
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly
  2024-04-29 12:08 ` [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly Andrzej Hajda
  2024-05-10  5:52   ` Zbigniew Kempczyński
  2024-05-10 10:42   ` Zbigniew Kempczyński
@ 2024-05-10 11:18   ` Kamil Konieczny
  2024-05-14  9:42     ` Andrzej Hajda
  2 siblings, 1 reply; 17+ messages in thread
From: Kamil Konieczny @ 2024-05-10 11:18 UTC (permalink / raw)
  To: igt-dev
  Cc: Andrzej Hajda, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski, Zbigniew Kempczyński

Hi Andrzej,
On 2024-04-29 at 14:08:19 +0200, Andrzej Hajda wrote:
> With this patch adding iga64 assembly should be similar to
> adding x86 assembly inline. Simple example:
>     emit_iga64_code(shdr, set_exception, R"ASM(
>         or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud
>     )ASM", value);
> Note presence of 'ARG(0)', it will be replaced by 'value' argument,
> multiple arguments are possible.
> More sophisticated examples in following patches.
> How does it work:
> 1. Raw string literals (C++ feature available in gcc as extension):
>    R"ASM(...)ASM" allows to use multiline/unescaped string literals.
>    If for some reason they cannot be used we could always fallback to
>    old ugly way of handling multiline strings with escape characters:
>     emit_iga64_code(shdr, set_exception, "\n\
>         or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud\n\
>     ", value);
> 2. emit_iga64_code puts the assembly string into special linker section,
>    and calls __emit_iga64_code with pointer to external variable
>    which will contain code templates generated from the assembly for all
>    supported platforms, remaining arguments are put to temporary array
>    to eventually patch the code with positional arguments.
> 3. During build phase the linker section is scanned for assemblies.
>    Every assembly is preprocessed with cpp, to replace ARG(x) macros with
>    magic numbers, and to provide different code for different platforms
>    if needed. Then output file is compiled with iga64, and then .c file
>    is generated with global variables pointing to hexified iga64 codes.
> 

+cc Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
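
To illustrate the placeholder patching described in step 2 of the commit message, here is a minimal standalone sketch. The magic value 0xc0ded000 and the mask 0xffffff00 come from the patch; the helper name patch_template and its return convention are assumptions made for illustration and are not part of IGT.

```c
#include <stddef.h>
#include <stdint.h>

/* Values taken from the patch: ARG(n) expands to IGA64_ARG0 + n. */
#define IGA64_ARG0     0xc0ded000u
#define IGA64_ARG_MASK 0xffffff00u

/*
 * Hypothetical helper (not in the patch): replace every dword that looks
 * like an ARG(n) placeholder with argv[n].
 * Returns the number of patched dwords, or -1 if a placeholder refers to
 * an argument index that was not supplied.
 */
static int patch_template(uint32_t *code, size_t size,
			  const uint32_t *argv, uint32_t argc)
{
	int patched = 0;

	for (size_t i = 0; i < size; i++) {
		uint32_t n;

		/* Ordinary instruction dwords are left untouched. */
		if ((code[i] & IGA64_ARG_MASK) != IGA64_ARG0)
			continue;

		n = code[i] - IGA64_ARG0;
		if (n >= argc)
			return -1;

		code[i] = argv[n];
		patched++;
	}

	return patched;
}
```

One consequence of this scheme, as far as the sketch shows: any genuine instruction dword whose upper 24 bits equal 0xc0ded0 would also be treated as a placeholder, so the magic value has to be one that the assembler is unlikely to emit on its own.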

> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> ---
>  lib/generate_iga64_codes    | 104 ++++++++++++++++++++++++++++++++++++++++++++
>  lib/gpgpu_shader.c          |  39 +++++++++++++++++
>  lib/gpgpu_shader.h          |  25 +++++++++++
>  lib/iga64_generated_codes.c |   6 +++
>  lib/iga64_macros.h          |  10 +++++
>  lib/meson.build             |  18 ++++++++
>  6 files changed, 202 insertions(+)
> 
> diff --git a/lib/generate_iga64_codes b/lib/generate_iga64_codes
> new file mode 100755
> index 000000000000..efc2a29b409c
> --- /dev/null
> +++ b/lib/generate_iga64_codes
> @@ -0,0 +1,104 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: MIT
> +# Copyright © 2024 Intel Corporation
> +# Author: Andrzej Hajda <andrzej.hajda@intel.com>
> +
> +# List of supported platforms, in format gen100:platform, where gen100 equals
> +# to minimal GPU generation supported by platform multiplied by 100 and platform
> +# is one of platforms supported by -p switch of iga64.
> +#
> +# Must be in decreasing order, the last one must have gen100 equal 0"
> +GEN_VERSIONS="2000:2 1272:12p72 1250:12p5 0:12p1"
> +
> +warn() {
> +    echo -e "$1" >/dev/stderr
> +}
> +
> +die() {
> +    warn "DIE: $1"
> +    exit 1
> +}
> +
> +# parse args
> +while getopts ':i:o:' opt; do
> +    case $opt in
> +    i) INPUT=$OPTARG;;
> +    o) OUTPUT=$OPTARG;;
> +    ?) die "Usage: $0 -i pre-generated-iga64-file -o generated-iga64-file libs-with-iga64-assembly [...]"
> +    esac
> +done
> +LIBS=${@:OPTIND}
> +
> +# read all assemblies into ASMS array
> +ASMS=()
> +while  read -d $'\0' asm; do
> +    test -z "$asm" && continue
> +    ASMS+=( "$asm" )
> +done < <(for f in $LIBS; do objcopy --dump-section .iga64_assembly=/dev/stdout $f.p/*.o; done)
> +
> +# check if we need to recompile - checksum difference and compiler present
> +MD5_ASMS="$(for a in "${ASMS[@]}"; do echo "${a#*:}"; done | md5sum|cut -b1-32)"
> +MD5_PRE="$(grep -Po '(?<=^#define MD5_SUM )\S{32,32}' $INPUT 2>/dev/null)"
> +
> +if [ "$MD5_ASMS" = "$MD5_PRE" ]; then
> +    echo "iga64 assemblies not changed, reusing pre-compiled file $INPUT."
> +    cp $INPUT $OUTPUT
> +    exit 0
> +fi
> +
> +type iga64 >/dev/null || {
> +    warn "WARNING: iga64 assemblies changed, but iga64 compiler not present, CHANGES will have no effect. Install iga64 (libigc-tools package) to re-compile code."
> +    cp $INPUT $OUTPUT
> +    exit 0
> +}
> +
> +# returns count of numbers in strings of format "0x1234, 0x23434, ..."
> +dword_count() {
> +    n=${1//[^x]}
> +    echo ${#n}
> +}
> +
> +# generate code file
> +WD=$OUTPUT.d
> +mkdir -p $WD
> +
> +echo "Generating new $OUTPUT"
> +
> +cat <<-EOF >$OUTPUT
> +/* SPDX-License-Identifier: MIT */
> +/* Generated using $(iga64 |& head -1) */
> +
> +#include "gpgpu_shader.h"
> +
> +#define MD5_SUM $MD5_ASMS
> +EOF
> +
> +for asm in "${ASMS[@]}"; do
> +    asm_name="${asm%%:*}"
> +    asm_code="${asm_name/assembly/code}"
> +    asm_body="${asm#*:}"
> +    cur_code=""
> +    cur_ver=""
> +    echo -e "\nstruct iga64_template const $asm_code[] = {" >>$OUTPUT
> +    for gen in $GEN_VERSIONS; do
> +        gen_ver="${gen%%:*}"
> +        gen_name="${gen#*:}"
> +        warn "Generating $asm_code for platform $gen_name"
> +        cmd="cpp -P - -o $WD/$asm_name.$gen_name.asm"
> +        cmd+=" -DGEN_VER=$gen_ver -imacros ../lib/iga64_macros.h"
> +        eval "$cmd" <<<"$asm_body" || die "cpp error for $asm_name.$gen_name\ncmd: $cmd"
> +        cmd="iga64 -Xauto-deps -Wall -p=$gen_name"
> +        cmd+=" $WD/$asm_name.$gen_name.asm -o $WD/$asm_name.$gen_name.bin"
> +        eval "$cmd" || die "iga64 error for $asm_name.$gen_name\ncmd: $cmd"
> +        code="$(hexdump -e '"\t\t" 4/4 "0x%08x, " "\n"' $WD/$asm_name.$gen_name.bin)"
> +        [ -z "$cur_code" ] && cur_code="$code"
> +        [ "$cur_code" != "$code" ] && {
> +            echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}," >>$OUTPUT
> +            cur_code="$code"
> +        }
> +        cur_ver=$gen_ver
> +    done
> +    echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}\n};" >>$OUTPUT
> +done
> +
> +cp $OUTPUT $INPUT
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> index d14301789421..3317e9e35c91 100644
> --- a/lib/gpgpu_shader.c
> +++ b/lib/gpgpu_shader.c
> @@ -11,6 +11,9 @@
>  #include "gpgpu_shader.h"
>  #include "gpu_cmds.h"
>  
> +#define IGA64_ARG0 0xc0ded000
> +#define IGA64_ARG_MASK 0xffffff00
> +
>  #define SUPPORTED_GEN_VER 1200 /* Support TGL and up */
>  
>  #define PAGE_SIZE 4096
> @@ -22,6 +25,42 @@
>  #define GPGPU_CURBE_SIZE 0
>  #define GEN7_VFE_STATE_GPGPU_MODE 1
>  
> +static void gpgpu_shader_extend(struct gpgpu_shader *shdr)
> +{
> +	shdr->max_size <<= 1;
> +	shdr->code = realloc(shdr->code, 4 * shdr->max_size);
> +}
> +
> +void
> +__emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
> +		  int argc, uint32_t *argv)
> +{
> +	uint32_t *ptr;
> +
> +	igt_require_f(shdr->gen_ver >= SUPPORTED_GEN_VER,
> +		      "No available shader templates for platforms older than XeLP\n");
> +
> +	while (shdr->gen_ver < tpls->gen_ver)
> +		tpls++;
> +
> +	while (shdr->max_size < shdr->size + tpls->size)
> +		gpgpu_shader_extend(shdr);
> +
> +	ptr = shdr->code + shdr->size;
> +	memcpy(ptr, tpls->code, 4 * tpls->size);
> +
> +	/* patch the template */
> +	for (int n, i = 0; i < tpls->size; ++i) {
> +		if ((ptr[i] & IGA64_ARG_MASK) != IGA64_ARG0)
> +			continue;
> +		n = ptr[i] - IGA64_ARG0;
> +		igt_assert(n < argc);
> +		ptr[i] = argv[n];
> +	}
> +
> +	shdr->size += tpls->size;
> +}
> +
>  static uint32_t fill_sip(struct intel_bb *ibb,
>  			 const uint32_t sip[][4],
>  			 const size_t size)
> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
> index 02f6f1aad1e3..0b997deba8bb 100644
> --- a/lib/gpgpu_shader.h
> +++ b/lib/gpgpu_shader.h
> @@ -23,6 +23,27 @@ struct gpgpu_shader {
>  	};
>  };
>  
> +struct iga64_template {
> +	uint32_t gen_ver;
> +	uint32_t size;
> +	const uint32_t *code;
> +};
> +
> +#pragma GCC diagnostic ignored "-Wnested-externs"
> +
> +void
> +__emit_iga64_code(struct gpgpu_shader *shdr, const struct iga64_template *tpls,
> +		  int argc, uint32_t *argv);
> +
> +#define emit_iga64_code(__shdr, __name, __txt, __args...) \
> +({ \
> +	static const char t[] __attribute__ ((section(".iga64_assembly"),used)) \
> +		="iga64_assembly_" #__name ":" __txt "\n"; \
> +	extern struct iga64_template const iga64_code_ ## __name[]; \
> +	u32 args[] = { __args }; \
> +	__emit_iga64_code(__shdr, iga64_code_ ## __name, ARRAY_SIZE(args), args); \
> +})
> +
>  struct gpgpu_shader *gpgpu_shader_create(int fd);
>  void gpgpu_shader_destroy(struct gpgpu_shader *shdr);
>  
> @@ -35,4 +56,8 @@ void gpgpu_shader_exec(struct intel_bb *ibb,
>  		       struct gpgpu_shader *sip,
>  		       uint64_t ring, bool explicit_engine);
>  
> +void gpgpu_shader__eot(struct gpgpu_shader *shdr);
> +void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> +			       uint32_t y_offset);
> +
>  #endif /* GPGPU_SHADER_H */
> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> new file mode 100644
> index 000000000000..449c5e9bcf31
> --- /dev/null
> +++ b/lib/iga64_generated_codes.c
> @@ -0,0 +1,6 @@
> +/* SPDX-License-Identifier: MIT */
> +/* Generated using Intel Graphics Assembler 1.1.0-int */
> +
> +#include "gpgpu_shader.h"
> +
> +#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
---------- ^^^^^^^
This name is too generic; what about MD5_SUM_IGA64_CODE?

> diff --git a/lib/iga64_macros.h b/lib/iga64_macros.h
> new file mode 100644
> index 000000000000..33375763a1d0
> --- /dev/null
> +++ b/lib/iga64_macros.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: MIT */

Add Copyright here.

> +
> +#define ARG(n) (0xc0ded000 + n)
> +
> +/* send instruction for DG2+ requires 0 length in case src1 is null, BSpec: 47443 */
> +#if GEN_VER < 1271
> +#define src1_null null
> +#else
> +#define src1_null null:0
> +#endif
> diff --git a/lib/meson.build b/lib/meson.build
> index 0a3084f8aea2..843c74e5187f 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -216,7 +216,10 @@ lib_version = vcs_tag(input : 'version.h.in', output : 'version.h',
>  		      fallback : 'NO-GIT',
>  		      command : vcs_command )
>  
> +iga64_assembly_sources = [ 'gpgpu_shader.c' ]
> +
>  lib_intermediates = []
> +iga64_assembly_libs = []
>  foreach f: lib_sources
>      name = f.underscorify()
>      lib = static_library('igt-' + name,
> @@ -230,8 +233,23 @@ foreach f: lib_sources
>  	])
>  
>      lib_intermediates += lib
> +    if f in iga64_assembly_sources
> +	iga64_assembly_libs += lib
> +    endif
>  endforeach
>  
> +iga64_generated_codes = custom_target(
> +    'iga64_generated_codes.c',
> +    output : 'iga64_generated_codes.c',
> +    input : [ 'iga64_generated_codes.c' ] + iga64_assembly_libs,
> +    command : [ './generate_iga64_codes', '-o', '@OUTPUT@', '-i', '@INPUT@' ],
> +    depend_files: [ 'generate_iga64_codes' ]
> +)
> +
> +lib_intermediates += static_library('igt-iga64_generated_codes.c',
> +			[ iga64_generated_codes, lib_version ]
> +		     )
> +
>  lib_igt_build = shared_library('igt',
>      ['dummy.c'],
>      link_whole: lib_intermediates,
> 

Please look into the CI.Build failures.
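
For readers following the template lookup in __emit_iga64_code quoted above, a minimal standalone sketch of the selection step follows. The struct layout and the loop are copied from the patch; the helper name select_template is hypothetical. The lookup is only safe if each table is sorted by decreasing gen_ver and ends with an entry whose gen_ver is 0, which is what the GEN_VERSIONS comment in generate_iga64_codes requires.

```c
#include <stdint.h>

/* Same layout as struct iga64_template in the patch. */
struct iga64_template {
	uint32_t gen_ver;
	uint32_t size;
	const uint32_t *code;
};

/*
 * Hypothetical helper: return the first entry whose gen_ver does not
 * exceed the device's gen_ver. The table must be in decreasing gen_ver
 * order and terminated by an entry with gen_ver == 0, otherwise the loop
 * would run past the end of the array.
 */
static const struct iga64_template *
select_template(const struct iga64_template *tpls, uint32_t gen_ver)
{
	while (gen_ver < tpls->gen_ver)
		tpls++;

	return tpls;
}
```

Because the generator script merges consecutive platforms that produce identical binaries, a table may start below the newest platform; in that case a newer device simply selects the first entry.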

Regards,
Kamil

> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check
  2024-04-29 12:08 ` [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check Andrzej Hajda
  2024-05-10 10:44   ` Zbigniew Kempczyński
@ 2024-05-10 11:30   ` Kamil Konieczny
  1 sibling, 0 replies; 17+ messages in thread
From: Kamil Konieczny @ 2024-05-10 11:30 UTC (permalink / raw)
  To: igt-dev
  Cc: Andrzej Hajda, Dominik Grzegorzek, Christoph Manszewski,
	Dominik Karol Piątkowski

Hi Andrzej,
On 2024-04-29 at 14:08:20 +0200, Andrzej Hajda wrote:
> xe_exec_sip will contain tests for shader and SIP interaction.

Please expand the 'SIP' acronym here and in the test description below.

> For starters let's implement test checking if shader is run correctly.
s/is run/runs/

> The patch also demonstrates usage of inline iga64 assembly.
> 
> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> ---
>  lib/gpgpu_shader.c          |  63 ++++++++++++
>  lib/iga64_generated_codes.c |  83 ++++++++++++++-
>  tests/intel/xe_exec_sip.c   | 239 ++++++++++++++++++++++++++++++++++++++++++++
>  tests/meson.build           |   1 +
>  4 files changed, 385 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
> index 3317e9e35c91..cd8c82ff9c8c 100644
> --- a/lib/gpgpu_shader.c
> +++ b/lib/gpgpu_shader.c
> @@ -248,3 +248,66 @@ void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
>  	free(shdr->code);
>  	free(shdr);
>  }
> +
> +/**
> + * gpgpu_shader__eot:
> + * @shdr: shader to be modified
> + *
> + * Append end of thread instruction to @shdr.
> + */
> +void gpgpu_shader__eot(struct gpgpu_shader *shdr)
> +{
> +	emit_iga64_code(shdr, eot, R"ASM(
> +(W)     mov (8|M0)               r112.0<1>:ud  r0.0<8;8,1>:ud
> +#if GEN_VER < 1250
> +(W)     send.ts (16|M0)          null r112 null 0x10000000 0x02000010 {EOT,@1} // wr:1+0, rd:0; end of thread
> +#else
> +(W)     send.gtwy (8|M0)         null r112 src1_null     0 0x02000000 {EOT}
> +#endif
> +	)ASM");
> +}
> +
> +/**
> + * gpgpu_shader__write_dword:
> + * @shdr: shader to be modified
> + * @value: dword to be written
> + * @y_offset: write target offset within the surface in rows
> + *
> + * Fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
> + */
> +void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
> +			       uint32_t y_offset)
> +{
> +	emit_iga64_code(shdr, media_block_write, R"ASM(
> +	// Payload
> +(W)     mov (1|M0)               r5.0<1>:ud    ARG(3):ud
> +(W)     mov (1|M0)               r5.1<1>:ud    ARG(4):ud
> +(W)     mov (1|M0)               r5.2<1>:ud    ARG(5):ud
> +(W)     mov (1|M0)               r5.3<1>:ud    ARG(6):ud
> +#if GEN_VER < 2000 // Media Block Write
> +        // X offset of the block in bytes := (thread group id X << ARG(0))
> +(W)     shl (1|M0)               r4.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
> +        // Y offset of the block in rows := thread group id Y
> +(W)     mov (1|M0)               r4.1<1>:ud    r0.6<0;1,0>:ud
> +(W)     add (1|M0)               r4.1<1>:ud    r4.1<0;1,0>:ud   ARG(1):ud
> +        // block width [0,63] representing 1 to 64 bytes
> +(W)     mov (1|M0)               r4.2<1>:ud    ARG(2):ud
> +        // FFTID := FFTID from R0 header
> +(W)     mov (1|M0)               r4.4<1>:ud    r0.5<0;1,0>:ud
> +(W)     send.dc1 (16|M0)         null     r4   src1_null 0    0x40A8000
> +#else // Typed 2D Block Store
> +        // Load r2.0-3 with tg id X << ARG(0)
> +(W)     shl (1|M0)               r2.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
> +        // Load r2.4-7 with tg id Y + ARG(1):ud
> +(W)     mov (1|M0)               r2.1<1>:ud    r0.6<0;1,0>:ud
> +(W)     add (1|M0)               r2.1<1>:ud    r2.1<0;1,0>:ud    ARG(1):ud
> +        // payload setup
> +(W)     mov (16|M0)              r4.0<1>:ud    0x0:ud
> +        // Store X and Y block start (160:191 and 192:223)
> +(W)     mov (2|M0)               r4.5<1>:ud    r2.0<2;2,1>:ud
> +        // Store X and Y block max_size (224:231 and 232:239)
> +(W)     mov (1|M0)               r4.7<1>:ud    ARG(2):ud
> +(W)     send.tgm (16|M0)         null     r4   null:0    0    0x64000007
> +#endif
> +	)ASM", 2, y_offset, 3, value, value, value, value);
> +}
> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
> index 449c5e9bcf31..f06362d806cd 100644
> --- a/lib/iga64_generated_codes.c
> +++ b/lib/iga64_generated_codes.c
> @@ -3,4 +3,85 @@
>  
>  #include "gpgpu_shader.h"
>  
> -#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
> +#define MD5_SUM 1a47442138fa63fddb0f260694ef9edb
---------- ^^^^^^^
Name looks a little too generic.

> +
> +struct iga64_template const iga64_code_media_block_write[] = {
> +	{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
> +		0x80000061, 0x02150220, 0x00000064, 0x00000000,
> +		0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
> +		0x80100061, 0x04054220, 0x00000000, 0x00000000,
> +		0x80041a61, 0x04550220, 0x00220205, 0x00000000,
> +		0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
> +		0x80132031, 0x00000000, 0xd00e0494, 0x04000000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
> +	}},
> +	{ .gen_ver = 1272, .size = 52, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
> +		0x80000061, 0x04150220, 0x00000064, 0x00000000,
> +		0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
> +		0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
> +		0x80000061, 0x04450220, 0x00000054, 0x00000000,
> +		0x80132031, 0x00000000, 0xc0000414, 0x02a00000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
> +	}},
> +	{ .gen_ver = 1250, .size = 56, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> +		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> +		0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
> +		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> +		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> +		0x80001901, 0x00010000, 0x00000000, 0x00000000,
> +		0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
> +	}},
> +	{ .gen_ver = 0, .size = 52, .code = (const uint32_t []) {
> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
> +		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
> +		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
> +		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
> +		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
> +		0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
> +		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
> +		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
> +		0x80049031, 0x00000000, 0xc0000414, 0x02a00000,
> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
> +		0x80000101, 0x00010000, 0x00000000, 0x00000000,
> +	}}
> +};
> +
> +struct iga64_template const iga64_code_eot[] = {
> +	{ .gen_ver = 1272, .size = 8, .code = (const uint32_t []) {
> +		0x800c0061, 0x70050220, 0x00460005, 0x00000000,
> +		0x800f2031, 0x00000004, 0x3000700c, 0x00000000,
> +	}},
> +	{ .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
> +		0x80030061, 0x70050220, 0x00460005, 0x00000000,
> +		0x80001901, 0x00010000, 0x00000000, 0x00000000,
> +		0x80034031, 0x00000004, 0x3000700c, 0x00000000,
> +	}},
> +	{ .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
> +		0x80030061, 0x70050220, 0x00460005, 0x00000000,
> +		0x80049031, 0x00000004, 0x7020700c, 0x10000000,
> +	}}
> +};
> diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
> new file mode 100644
> index 000000000000..af0eaf8cbda6
> --- /dev/null
> +++ b/tests/intel/xe_exec_sip.c
> @@ -0,0 +1,239 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +/**
> + * TEST: Tests for gpgpu shader and system routine execution

Please add the meaning of SIP here (in TEST: or in Description:).

> + * Category: Software building block
> + * Sub-category: gpgpu
> + * Functionality: system routine
> + * Test category: functionality test
> + */
> +
> +#include <dirent.h>
> +#include <fcntl.h>
> +#include <stdio.h>

Add newline.

> +#include "gpgpu_shader.h"
> +#include "igt.h"
> +#include "igt_sysfs.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +
> +#define WIDTH 64
> +#define HEIGHT 64
> +
> +#define COLOR_C4 0xc4
> +
> +#define SHADER_CANARY 0x01010101
> +
> +#define NSEC_PER_MSEC (1000 * 1000ull)
> +
> +static struct intel_buf *
> +create_fill_buf(int fd, int width, int height, uint8_t color)
> +{
> +	struct intel_buf *buf;
> +	uint8_t *ptr;
> +
> +	buf = calloc(1, sizeof(*buf));
> +	igt_assert(buf);
> +
> +	intel_buf_init(buf_ops_create(fd), buf, width / 4, height, 32, 0,
> +		       I915_TILING_NONE, 0);
> +
> +	ptr = xe_bo_map(fd, buf->handle, buf->surface[0].size);
> +	memset(ptr, color, buf->surface[0].size);
> +	munmap(ptr, buf->surface[0].size);
> +
> +	return buf;
> +}
> +
> +static struct gpgpu_shader *get_shader(int fd)
> +{
> +	static struct gpgpu_shader *shader;
> +
> +	shader = gpgpu_shader_create(fd);
> +	gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
> +	gpgpu_shader__eot(shader);
> +	return shader;
> +}
> +
> +static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, unsigned int threads,
> +			     unsigned int width, unsigned int height)
> +{
> +	struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
> +	struct gpgpu_shader *shader = get_shader(fd);
> +
> +	gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
> +	gpgpu_shader_destroy(shader);
> +	return buf->handle;
> +}
> +
> +static void check_fill_buf(uint8_t *ptr, const int width, const int x,
> +			   const int y, const uint8_t color)
> +{
> +	const uint8_t val = ptr[y * width + x];
> +
> +	igt_assert_f(val == color,
> +		     "Expected 0x%02x, found 0x%02x at (%d,%d)\n",
> +		     color, val, x, y);
> +}
> +
> +static void check_buf(int fd, uint32_t handle, int width, int height,
> +		      uint8_t poison_c)
> +{
> +	unsigned int sz = ALIGN(width * height, 4096);
> +	int thread_count = 0;
> +	uint32_t *ptr;
> +	int i, j;
> +
> +	ptr = xe_bo_mmap_ext(fd, handle, sz, PROT_READ);
> +
> +	for (i = 0, j = 0; j < height / 2; ++j) {
> +		if (ptr[j * width / 4] == SHADER_CANARY) {
> +			++thread_count;
> +			i = 4;
> +		}
> +
> +		for (; i < width; i++)
> +			check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
> +
> +		i = 0;
> +	}
> +
> +	igt_assert(thread_count);
> +
> +	munmap(ptr, sz);
> +}
> +
> +static const char *class_to_str(int class)
> +{
> +        const char *str[] = {
> +                [DRM_XE_ENGINE_CLASS_RENDER] = "rcs",
> +                [DRM_XE_ENGINE_CLASS_COPY] = "bcs",
> +                [DRM_XE_ENGINE_CLASS_VIDEO_DECODE] = "vcs",
> +                [DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE] = "vecs",
> +		[DRM_XE_ENGINE_CLASS_COMPUTE] = "ccs",
> +        };
> +
> +        if (class < ARRAY_SIZE(str))
> +                return str[class];
> +
> +        return "unk";
> +}

The above looks like a candidate for a lib function, or does it already
exist somewhere?
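
If this does become a library helper, one possible shape is sketched below. It is only a sketch under stated assumptions: the enum values are stand-ins assumed to mirror DRM_XE_ENGINE_CLASS_* from the xe uAPI header, and the names engine_class_short_name and ARRAY_LEN are hypothetical. Compared with the quoted version it also rejects negative values explicitly and returns "unk" for holes in the table instead of NULL.

```c
#include <stddef.h>

/* Stand-in values, assumed to mirror DRM_XE_ENGINE_CLASS_* in xe_drm.h. */
enum engine_class {
	CLASS_RENDER = 0,
	CLASS_COPY = 1,
	CLASS_VIDEO_DECODE = 2,
	CLASS_VIDEO_ENHANCE = 3,
	CLASS_COMPUTE = 4,
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: map an engine class to its sysfs directory name. */
static const char *engine_class_short_name(int class)
{
	static const char * const names[] = {
		[CLASS_RENDER] = "rcs",
		[CLASS_COPY] = "bcs",
		[CLASS_VIDEO_DECODE] = "vcs",
		[CLASS_VIDEO_ENHANCE] = "vecs",
		[CLASS_COMPUTE] = "ccs",
	};

	/* Out-of-range values and unset slots both map to "unk". */
	if (class < 0 || (size_t)class >= ARRAY_LEN(names) || !names[class])
		return "unk";

	return names[class];
}
```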

> +
> +static uint64_t xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
> +{
> +	struct dirent *de;
> +	int engines_fd = -1;
> +	int gt_fd = -1;
> +	DIR *dir;
> +	/* Default timeout is 5s */
> +	uint64_t ret = 5ULL * MSEC_PER_SEC;
> +
> +	gt_fd = xe_sysfs_gt_open(fd, eci->gt_id);
> +	if (gt_fd == -1)
> +		return ret;

Why the same timeout here?

> +
> +	engines_fd = openat(gt_fd, "engines", O_RDONLY);
> +	if (engines_fd == -1) {
> +		close(gt_fd);
> +		return ret;

Why the same timeout here?

> +	}
> +
> +	lseek(engines_fd, 0, SEEK_SET);
> +	dir = fdopendir(engines_fd);
> +	while (dir && (de = readdir(dir))) {
> +		int engine_fd;
> +		if (strcmp(de->d_name, class_to_str(eci->engine_class)))
> +			continue;
> +
> +		engine_fd = openat(engines_fd, de->d_name, O_RDONLY);
> +		if (engine_fd < 0)
> +			break;
> +
> +		ret = igt_sysfs_get_u64(engine_fd, "job_timeout_ms");
> +		close(engine_fd);
> +		break;
> +	}
> +
> +	close(engines_fd);
> +	close(gt_fd);

Add newline here.

> +	return ret;
> +}
> +
> +/**
> + * SUBTEST: sanity
> + * Description: check basic shader with write operation
> + * Run type: BAT
> + *
> + */
> +static void test_sip(struct drm_xe_engine_class_instance *eci, uint32_t flags)
> +{
> +	unsigned int threads = 512;
> +	unsigned int height = max_t(threads, HEIGHT, threads * 2);
> +	uint32_t exec_queue_id, handle, vm_id;
> +	unsigned int width = WIDTH;
> +	struct timespec ts = { };
> +	uint64_t timeout;
> +	struct intel_bb *ibb;
> +	int fd;
> +
> +	igt_debug("Using %s\n", xe_engine_class_string(eci->engine_class));
> +
> +	fd = drm_open_driver(DRIVER_XE);
> +	xe_device_get(fd);
> +
> +	vm_id = xe_vm_create(fd, 0, 0);
> +
> +	/* Get timeout for job, and add 4s to ensure timeout processes in subtest. */
> +	timeout = xe_sysfs_get_job_timeout_ms(fd, eci) + 4ull * MSEC_PER_SEC;
> +	timeout *= NSEC_PER_MSEC;
> +	timeout *= igt_run_in_simulation() ? 10 : 1;
> +
> +	exec_queue_id = xe_exec_queue_create(fd, vm_id, eci, 0);
> +	ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
> +
> +	igt_nsec_elapsed(&ts);
> +	handle = gpgpu_shader(fd, ibb, threads, width, height);
> +
> +	intel_bb_sync(ibb);
> +	igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
> +
> +	check_buf(fd, handle, width, height, COLOR_C4);
> +
> +	gem_close(fd, handle);
> +	intel_bb_destroy(ibb);
> +
> +	xe_exec_queue_destroy(fd, exec_queue_id);
> +	xe_vm_destroy(fd, vm_id);
> +	xe_device_put(fd);
> +	close(fd);
> +}
> +
> +#define test_render_and_compute(t, __fd, __eci) \
> +	igt_subtest_with_dynamic(t) \
> +		xe_for_each_engine(__fd, __eci) \
> +			if (__eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
> +			    __eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) \
> +				igt_dynamic_f("%s%d", xe_engine_class_string(__eci->engine_class), \
> +					      __eci->engine_instance)
> +
> +igt_main
> +{
> +	struct drm_xe_engine_class_instance *eci;
> +	int fd;
> +
> +	igt_fixture {
> +		fd = drm_open_driver(DRIVER_XE);
> +		xe_device_get(fd);
--------^^
No need for this; drm_open_driver already does it.

> +	}
> +
> +	test_render_and_compute("sanity", fd, eci)
> +		test_sip(eci, 0);
> +
> +	igt_fixture {
> +		xe_device_put(fd);
--------^^
> +		close(fd);
--------^^
Use drm_close_driver instead.

Regards,
Kamil

> +	}
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 65b8bf23b972..63588e473616 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -292,6 +292,7 @@ intel_xe_progs = [
>  	'xe_exec_fault_mode',
>  	'xe_exec_queue_property',
>  	'xe_exec_reset',
> +	'xe_exec_sip',
>  	'xe_exec_store',
>  	'xe_exec_threads',
>  	'xe_exercise_blt',
> 
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly
  2024-05-10 10:42   ` Zbigniew Kempczyński
@ 2024-05-14  9:39     ` Andrzej Hajda
  0 siblings, 0 replies; 17+ messages in thread
From: Andrzej Hajda @ 2024-05-14  9:39 UTC (permalink / raw)
  To: Zbigniew Kempczyński
  Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek,
	Christoph Manszewski, Dominik Karol Piątkowski

Thanks for looking at the patches.

On 10.05.2024 12:42, Zbigniew Kempczyński wrote:
> On Mon, Apr 29, 2024 at 02:08:19PM +0200, Andrzej Hajda wrote:
> <cut>
>
>> +# check if we need to recompile - checksum difference and compiler present
>> +MD5_ASMS="$(for a in "${ASMS[@]}"; do echo "${a#*:}"; done | md5sum|cut -b1-32)"
> Why not use:
> MD5_ASMS=$(echo "${ASMS[@]}" | md5sum | cut -b1-32)
>
> I'm not sure but your check doesn't support function (asm) rename.

My version deliberately skips the asm names because they are not passed
to the compiler, so the output of iga64 does not depend on them. But you
are right: after a name change the output file should be regenerated.
This will be fixed.
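
The difference between the two checksum variants can be shown with a small standalone sketch. FNV-1a is used here purely as a stand-in for md5sum, and the struct and function names are hypothetical; only the "name:body" entry format comes from the patch. With names excluded, renaming an assembly leaves the digest unchanged, so the script would reuse a generated file that still carries the old symbol name.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the .iga64_assembly section, split at the first ':'. */
struct asm_entry {
	const char *name; /* e.g. "iga64_assembly_eot" */
	const char *body; /* assembly text handed to cpp and iga64 */
};

/* FNV-1a over a NUL-terminated string; stand-in for md5sum. */
static uint64_t fnv1a(uint64_t h, const char *s)
{
	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 0x100000001b3ull;
	}

	return h;
}

/* Digest all entries, optionally covering the names as well. */
static uint64_t digest(const struct asm_entry *e, size_t n, int with_names)
{
	uint64_t h = 0xcbf29ce484222325ull;

	for (size_t i = 0; i < n; i++) {
		if (with_names) {
			h = fnv1a(h, e[i].name);
			h = fnv1a(h, ":");
		}
		h = fnv1a(h, e[i].body);
		h = fnv1a(h, "\n");
	}

	return h;
}
```

Calling digest() with with_names set to 0 corresponds to the current script, and with_names set to 1 corresponds to hashing the full entries as suggested in the review.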

Regards
Andrzej
> --
> Zbigniew

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly
  2024-05-10 11:18   ` Kamil Konieczny
@ 2024-05-14  9:42     ` Andrzej Hajda
  0 siblings, 0 replies; 17+ messages in thread
From: Andrzej Hajda @ 2024-05-14  9:42 UTC (permalink / raw)
  To: Kamil Konieczny, igt-dev, Dominik Grzegorzek,
	Christoph Manszewski, Dominik Karol Piątkowski,
	Zbigniew Kempczyński


On 10.05.2024 13:18, Kamil Konieczny wrote:
> Hi Andrzej,
> On 2024-04-29 at 14:08:19 +0200, Andrzej Hajda wrote:
>> With this patch adding iga64 assembly should be similar to
>> adding x86 assembly inline. Simple example:
>>      emit_iga64_code(shdr, set_exception, R"ASM(
>>          or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud
>>      )ASM", value);
>> Note presence of 'ARG(0)', it will be replaced by 'value' argument,
>> multiple arguments are possible.
>> More sophisticated examples in following patches.
>> How does it work:
>> 1. Raw string literals (C++ feature available in gcc as extension):
>>     R"ASM(...)ASM" allows to use multiline/unescaped string literals.
>>     If for some reason they cannot be used we could always fallback to
>>     old ugly way of handling multiline strings with escape characters:
>>      emit_iga64_code(shdr, set_exception, "\n\
>>          or (1|M0) cr0.1<1>:ud cr0.1<0;1,0>:ud ARG(0):ud\n\
>>      ", value);
>> 2. emit_iga64_code puts the assembly string into a special linker section
>>     and calls __emit_iga64_code with a pointer to an external variable
>>     which will contain code templates generated from the assembly for all
>>     supported platforms; the remaining arguments are put into a temporary
>>     array to eventually patch the code with positional arguments.
>> 3. During the build phase the linker section is scanned for assemblies.
>>     Every assembly is preprocessed with cpp to replace ARG(x) macros with
>>     magic numbers and to provide different code for different platforms
>>     if needed. The output file is then compiled with iga64, and a .c file
>>     is generated with global variables pointing to hexified iga64 codes.
>>
> +cc Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>> ---
>>   lib/generate_iga64_codes    | 104 ++++++++++++++++++++++++++++++++++++++++++++
>>   lib/gpgpu_shader.c          |  39 +++++++++++++++++
>>   lib/gpgpu_shader.h          |  25 +++++++++++
>>   lib/iga64_generated_codes.c |   6 +++
>>   lib/iga64_macros.h          |  10 +++++
>>   lib/meson.build             |  18 ++++++++
>>   6 files changed, 202 insertions(+)
>>
>> diff --git a/lib/generate_iga64_codes b/lib/generate_iga64_codes
>> new file mode 100755
>> index 000000000000..efc2a29b409c
>> --- /dev/null
>> +++ b/lib/generate_iga64_codes
>> @@ -0,0 +1,104 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: MIT
>> +# Copyright © 2024 Intel Corporation
>> +# Author: Andrzej Hajda <andrzej.hajda@intel.com>
>> +
>> +# List of supported platforms, in format gen100:platform, where gen100 equals
>> +# to minimal GPU generation supported by platform multiplied by 100 and platform
>> +# is one of platforms supported by -p switch of iga64.
>> +#
>> +# Must be in decreasing order, the last one must have gen100 equal 0"
>> +GEN_VERSIONS="2000:2 1272:12p72 1250:12p5 0:12p1"
>> +
>> +warn() {
>> +    echo -e "$1" >/dev/stderr
>> +}
>> +
>> +die() {
>> +    warn "DIE: $1"
>> +    exit 1
>> +}
>> +
>> +# parse args
>> +while getopts ':i:o:' opt; do
>> +    case $opt in
>> +    i) INPUT=$OPTARG;;
>> +    o) OUTPUT=$OPTARG;;
>> +    ?) die "Usage: $0 -i pre-generated-iga64-file -o generated-iga64-file libs-with-iga64-assembly [...]"
>> +    esac
>> +done
>> +LIBS=${@:OPTIND}
>> +
>> +# read all assemblies into ASMS array
>> +ASMS=()
>> +while  read -d $'\0' asm; do
>> +    test -z "$asm" && continue
>> +    ASMS+=( "$asm" )
>> +done < <(for f in $LIBS; do objcopy --dump-section .iga64_assembly=/dev/stdout $f.p/*.o; done)
>> +
>> +# check if we need to recompile - checksum difference and compiler present
>> +MD5_ASMS="$(for a in "${ASMS[@]}"; do echo "${a#*:}"; done | md5sum|cut -b1-32)"
>> +MD5_PRE="$(grep -Po '(?<=^#define MD5_SUM )\S{32,32}' $INPUT 2>/dev/null)"
>> +
>> +if [ "$MD5_ASMS" = "$MD5_PRE" ]; then
>> +    echo "iga64 assemblies not changed, reusing pre-compiled file $INPUT."
>> +    cp $INPUT $OUTPUT
>> +    exit 0
>> +fi
>> +
>> +type iga64 >/dev/null || {
>> +    warn "WARNING: iga64 assemblies changed, but iga64 compiler not present, CHANGES will have no effect. Install iga64 (libigc-tools package) to re-compile code."
>> +    cp $INPUT $OUTPUT
>> +    exit 0
>> +}
>> +
>> +# returns count of numbers in strings of format "0x1234, 0x23434, ..."
>> +dword_count() {
>> +    n=${1//[^x]}
>> +    echo ${#n}
>> +}
>> +
>> +# generate code file
>> +WD=$OUTPUT.d
>> +mkdir -p $WD
>> +
>> +echo "Generating new $OUTPUT"
>> +
>> +cat <<-EOF >$OUTPUT
>> +/* SPDX-License-Identifier: MIT */
>> +/* Generated using $(iga64 |& head -1) */
>> +
>> +#include "gpgpu_shader.h"
>> +
>> +#define MD5_SUM $MD5_ASMS
>> +EOF
>> +
>> +for asm in "${ASMS[@]}"; do
>> +    asm_name="${asm%%:*}"
>> +    asm_code="${asm_name/assembly/code}"
>> +    asm_body="${asm#*:}"
>> +    cur_code=""
>> +    cur_ver=""
>> +    echo -e "\nstruct iga64_template const $asm_code[] = {" >>$OUTPUT
>> +    for gen in $GEN_VERSIONS; do
>> +        gen_ver="${gen%%:*}"
>> +        gen_name="${gen#*:}"
>> +        warn "Generating $asm_code for platform $gen_name"
>> +        cmd="cpp -P - -o $WD/$asm_name.$gen_name.asm"
>> +        cmd+=" -DGEN_VER=$gen_ver -imacros ../lib/iga64_macros.h"
>> +        eval "$cmd" <<<"$asm_body" || die "cpp error for $asm_name.$gen_name\ncmd: $cmd"
>> +        cmd="iga64 -Xauto-deps -Wall -p=$gen_name"
>> +        cmd+=" $WD/$asm_name.$gen_name.asm -o $WD/$asm_name.$gen_name.bin"
>> +        eval "$cmd" || die "iga64 error for $asm_name.$gen_name\ncmd: $cmd"
>> +        code="$(hexdump -e '"\t\t" 4/4 "0x%08x, " "\n"' $WD/$asm_name.$gen_name.bin)"
>> +        [ -z "$cur_code" ] && cur_code="$code"
>> +        [ "$cur_code" != "$code" ] && {
>> +            echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}," >>$OUTPUT
>> +            cur_code="$code"
>> +        }
>> +        cur_ver=$gen_ver
>> +    done
>> +    echo -e "\t{ .gen_ver = $cur_ver, .size = $(dword_count "$cur_code"), .code = (const uint32_t []) {\n$cur_code\n\t}}\n};" >>$OUTPUT
>> +done
>> +
>> +cp $OUTPUT $INPUT
>> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
>> index d14301789421..3317e9e35c91 100644
>> --- a/lib/gpgpu_shader.c
>> +++ b/lib/gpgpu_shader.c
>> @@ -11,6 +11,9 @@
>>   #include "gpgpu_shader.h"
>>   #include "gpu_cmds.h"
>>   
>> +#define IGA64_ARG0 0xc0ded000
>> +#define IGA64_ARG_MASK 0xffffff00
>> +
>>   #define SUPPORTED_GEN_VER 1200 /* Support TGL and up */
>>   
>>   #define PAGE_SIZE 4096
>> @@ -22,6 +25,42 @@
>>   #define GPGPU_CURBE_SIZE 0
>>   #define GEN7_VFE_STATE_GPGPU_MODE 1
>>   
>> +static void gpgpu_shader_extend(struct gpgpu_shader *shdr)
>> +{
>> +	shdr->max_size <<= 1;
>> +	shdr->code = realloc(shdr->code, 4 * shdr->max_size);
>> +}
>> +
>> +void
>> +__emit_iga64_code(struct gpgpu_shader *shdr, struct iga64_template const *tpls,
>> +		  int argc, uint32_t *argv)
>> +{
>> +	uint32_t *ptr;
>> +
>> +	igt_require_f(shdr->gen_ver >= SUPPORTED_GEN_VER,
>> +		      "No available shader templates for platforms older than XeLP\n");
>> +
>> +	while (shdr->gen_ver < tpls->gen_ver)
>> +		tpls++;
>> +
>> +	while (shdr->max_size < shdr->size + tpls->size)
>> +		gpgpu_shader_extend(shdr);
>> +
>> +	ptr = shdr->code + shdr->size;
>> +	memcpy(ptr, tpls->code, 4 * tpls->size);
>> +
>> +	/* patch the template */
>> +	for (int n, i = 0; i < tpls->size; ++i) {
>> +		if ((ptr[i] & IGA64_ARG_MASK) != IGA64_ARG0)
>> +			continue;
>> +		n = ptr[i] - IGA64_ARG0;
>> +		igt_assert(n < argc);
>> +		ptr[i] = argv[n];
>> +	}
>> +
>> +	shdr->size += tpls->size;
>> +}
>> +
>>   static uint32_t fill_sip(struct intel_bb *ibb,
>>   			 const uint32_t sip[][4],
>>   			 const size_t size)
>> diff --git a/lib/gpgpu_shader.h b/lib/gpgpu_shader.h
>> index 02f6f1aad1e3..0b997deba8bb 100644
>> --- a/lib/gpgpu_shader.h
>> +++ b/lib/gpgpu_shader.h
>> @@ -23,6 +23,27 @@ struct gpgpu_shader {
>>   	};
>>   };
>>   
>> +struct iga64_template {
>> +	uint32_t gen_ver;
>> +	uint32_t size;
>> +	const uint32_t *code;
>> +};
>> +
>> +#pragma GCC diagnostic ignored "-Wnested-externs"
>> +
>> +void
>> +__emit_iga64_code(struct gpgpu_shader *shdr, const struct iga64_template *tpls,
>> +		  int argc, uint32_t *argv);
>> +
>> +#define emit_iga64_code(__shdr, __name, __txt, __args...) \
>> +({ \
>> +	static const char t[] __attribute__ ((section(".iga64_assembly"),used)) \
>> +		="iga64_assembly_" #__name ":" __txt "\n"; \
>> +	extern struct iga64_template const iga64_code_ ## __name[]; \
>> +	u32 args[] = { __args }; \
>> +	__emit_iga64_code(__shdr, iga64_code_ ## __name, ARRAY_SIZE(args), args); \
>> +})
>> +
>>   struct gpgpu_shader *gpgpu_shader_create(int fd);
>>   void gpgpu_shader_destroy(struct gpgpu_shader *shdr);
>>   
>> @@ -35,4 +56,8 @@ void gpgpu_shader_exec(struct intel_bb *ibb,
>>   		       struct gpgpu_shader *sip,
>>   		       uint64_t ring, bool explicit_engine);
>>   
>> +void gpgpu_shader__eot(struct gpgpu_shader *shdr);
>> +void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
>> +			       uint32_t y_offset);
>> +
>>   #endif /* GPGPU_SHADER_H */
>> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
>> new file mode 100644
>> index 000000000000..449c5e9bcf31
>> --- /dev/null
>> +++ b/lib/iga64_generated_codes.c
>> @@ -0,0 +1,6 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/* Generated using Intel Graphics Assembler 1.1.0-int */
>> +
>> +#include "gpgpu_shader.h"
>> +
>> +#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
> ---------- ^^^^^^^
> This name is too generic, what about MD5_SUM_IGA64_CODE?
>
>> diff --git a/lib/iga64_macros.h b/lib/iga64_macros.h
>> new file mode 100644
>> index 000000000000..33375763a1d0
>> --- /dev/null
>> +++ b/lib/iga64_macros.h
>> @@ -0,0 +1,10 @@
>> +/* SPDX-License-Identifier: MIT */
> Add Copyright here.
>
>> +
>> +#define ARG(n) (0xc0ded000 + n)
>> +
>> +/* send instruction for DG2+ requires 0 length in case src1 is null, BSpec: 47443 */
>> +#if GEN_VER < 1271
>> +#define src1_null null
>> +#else
>> +#define src1_null null:0
>> +#endif
>> diff --git a/lib/meson.build b/lib/meson.build
>> index 0a3084f8aea2..843c74e5187f 100644
>> --- a/lib/meson.build
>> +++ b/lib/meson.build
>> @@ -216,7 +216,10 @@ lib_version = vcs_tag(input : 'version.h.in', output : 'version.h',
>>   		      fallback : 'NO-GIT',
>>   		      command : vcs_command )
>>   
>> +iga64_assembly_sources = [ 'gpgpu_shader.c' ]
>> +
>>   lib_intermediates = []
>> +iga64_assembly_libs = []
>>   foreach f: lib_sources
>>       name = f.underscorify()
>>       lib = static_library('igt-' + name,
>> @@ -230,8 +233,23 @@ foreach f: lib_sources
>>   	])
>>   
>>       lib_intermediates += lib
>> +    if f in iga64_assembly_sources
>> +	iga64_assembly_libs += lib
>> +    endif
>>   endforeach
>>   
>> +iga64_generated_codes = custom_target(
>> +    'iga64_generated_codes.c',
>> +    output : 'iga64_generated_codes.c',
>> +    input : [ 'iga64_generated_codes.c' ] + iga64_assembly_libs,
>> +    command : [ './generate_iga64_codes', '-o', '@OUTPUT@', '-i', '@INPUT@' ],
>> +    depend_files: [ 'generate_iga64_codes' ]
>> +)
>> +
>> +lib_intermediates += static_library('igt-iga64_generated_codes.c',
>> +			[ iga64_generated_codes, lib_version ]
>> +		     )
>> +
>>   lib_igt_build = shared_library('igt',
>>       ['dummy.c'],
>>       link_whole: lib_intermediates,
>>
> Please look into CI.Build fails.

Yep, another meson/ninja tutorial to grep.

Thanks for the comments, all will be addressed.

Regards
Andrzej

>
> Regards,
> Kamil
>
>> -- 
>> 2.34.1
>>



* Re: [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check
  2024-05-10 10:44   ` Zbigniew Kempczyński
@ 2024-05-14  9:49     ` Andrzej Hajda
  0 siblings, 0 replies; 17+ messages in thread
From: Andrzej Hajda @ 2024-05-14  9:49 UTC (permalink / raw)
  To: Zbigniew Kempczyński
  Cc: igt-dev, Kamil Konieczny, Dominik Grzegorzek,
	Christoph Manszewski, Dominik Karol Piątkowski



On 10.05.2024 12:44, Zbigniew Kempczyński wrote:
> On Mon, Apr 29, 2024 at 02:08:20PM +0200, Andrzej Hajda wrote:
>> xe_exec_sip will contain tests for shader and SIP interaction.
>> For starters, let's implement a test checking if the shader is run correctly.
>> The patch also demonstrates usage of inline iga64 assembly.
>>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
>> ---
>>   lib/gpgpu_shader.c          |  63 ++++++++++++
>>   lib/iga64_generated_codes.c |  83 ++++++++++++++-
>>   tests/intel/xe_exec_sip.c   | 239 ++++++++++++++++++++++++++++++++++++++++++++
>>   tests/meson.build           |   1 +
>>   4 files changed, 385 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/gpgpu_shader.c b/lib/gpgpu_shader.c
>> index 3317e9e35c91..cd8c82ff9c8c 100644
>> --- a/lib/gpgpu_shader.c
>> +++ b/lib/gpgpu_shader.c
>> @@ -248,3 +248,66 @@ void gpgpu_shader_destroy(struct gpgpu_shader *shdr)
>>   	free(shdr->code);
>>   	free(shdr);
>>   }
>> +
>> +/**
>> + * gpgpu_shader__eot:
>> + * @shdr: shader to be modified
>> + *
>> + * Append end of thread instruction to @shdr.
>> + */
>> +void gpgpu_shader__eot(struct gpgpu_shader *shdr)
>> +{
>> +	emit_iga64_code(shdr, eot, R"ASM(
>> +(W)     mov (8|M0)               r112.0<1>:ud  r0.0<8;8,1>:ud
>> +#if GEN_VER < 1250
>> +(W)     send.ts (16|M0)          null r112 null 0x10000000 0x02000010 {EOT,@1} // wr:1+0, rd:0; end of thread
>> +#else
>> +(W)     send.gtwy (8|M0)         null r112 src1_null     0 0x02000000 {EOT}
>> +#endif
>> +	)ASM");
>> +}
>> +
>> +/**
>> + * gpgpu_shader__write_dword:
>> + * @shdr: shader to be modified
>> + * @value: dword to be written
>> + * @y_offset: write target offset within the surface in rows
>> + *
>> + * Fill dword in (row, column/dword) == (tg_id_y + @y_offset, tg_id_x).
>> + */
>> +void gpgpu_shader__write_dword(struct gpgpu_shader *shdr, uint32_t value,
>> +			       uint32_t y_offset)
>> +{
>> +	emit_iga64_code(shdr, media_block_write, R"ASM(
>> +	// Payload
>> +(W)     mov (1|M0)               r5.0<1>:ud    ARG(3):ud
>> +(W)     mov (1|M0)               r5.1<1>:ud    ARG(4):ud
>> +(W)     mov (1|M0)               r5.2<1>:ud    ARG(5):ud
>> +(W)     mov (1|M0)               r5.3<1>:ud    ARG(6):ud
>> +#if GEN_VER < 2000 // Media Block Write
>> +        // X offset of the block in bytes := (thread group id X << ARG(0))
>> +(W)     shl (1|M0)               r4.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
>> +        // Y offset of the block in rows := thread group id Y
>> +(W)     mov (1|M0)               r4.1<1>:ud    r0.6<0;1,0>:ud
>> +(W)     add (1|M0)               r4.1<1>:ud    r4.1<0;1,0>:ud   ARG(1):ud
>> +        // block width [0,63] representing 1 to 64 bytes
>> +(W)     mov (1|M0)               r4.2<1>:ud    ARG(2):ud
>> +        // FFTID := FFTID from R0 header
>> +(W)     mov (1|M0)               r4.4<1>:ud    r0.5<0;1,0>:ud
>> +(W)     send.dc1 (16|M0)         null     r4   src1_null 0    0x40A8000
>> +#else // Typed 2D Block Store
>> +        // Load r2.0-3 with tg id X << ARG(0)
>> +(W)     shl (1|M0)               r2.0<1>:ud    r0.1<0;1,0>:ud    ARG(0):ud
>> +        // Load r2.4-7 with tg id Y + ARG(1):ud
>> +(W)     mov (1|M0)               r2.1<1>:ud    r0.6<0;1,0>:ud
>> +(W)     add (1|M0)               r2.1<1>:ud    r2.1<0;1,0>:ud    ARG(1):ud
>> +        // payload setup
>> +(W)     mov (16|M0)              r4.0<1>:ud    0x0:ud
>> +        // Store X and Y block start (160:191 and 192:223)
>> +(W)     mov (2|M0)               r4.5<1>:ud    r2.0<2;2,1>:ud
>> +        // Store X and Y block max_size (224:231 and 232:239)
>> +(W)     mov (1|M0)               r4.7<1>:ud    ARG(2):ud
>> +(W)     send.tgm (16|M0)         null     r4   null:0    0    0x64000007
>> +#endif
>> +	)ASM", 2, y_offset, 3, value, value, value, value);
>> +}
>> diff --git a/lib/iga64_generated_codes.c b/lib/iga64_generated_codes.c
>> index 449c5e9bcf31..f06362d806cd 100644
>> --- a/lib/iga64_generated_codes.c
>> +++ b/lib/iga64_generated_codes.c
>> @@ -3,4 +3,85 @@
>>   
>>   #include "gpgpu_shader.h"
>>   
>> -#define MD5_SUM d41d8cd98f00b204e9800998ecf8427e
>> +#define MD5_SUM 1a47442138fa63fddb0f260694ef9edb
>> +
>> +struct iga64_template const iga64_code_media_block_write[] = {
>> +	{ .gen_ver = 2000, .size = 56, .code = (const uint32_t []) {
>> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> +		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
>> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
>> +		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
>> +		0x80000069, 0x02058220, 0x02000014, 0xc0ded000,
>> +		0x80000061, 0x02150220, 0x00000064, 0x00000000,
>> +		0x80001940, 0x02158220, 0x02000214, 0xc0ded001,
>> +		0x80100061, 0x04054220, 0x00000000, 0x00000000,
>> +		0x80041a61, 0x04550220, 0x00220205, 0x00000000,
>> +		0x80000061, 0x04754220, 0x00000000, 0xc0ded002,
>> +		0x80132031, 0x00000000, 0xd00e0494, 0x04000000,
>> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> +	}},
>> +	{ .gen_ver = 1272, .size = 52, .code = (const uint32_t []) {
>> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> +		0x80000061, 0x05154220, 0x00000000, 0xc0ded004,
>> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded005,
>> +		0x80000061, 0x05354220, 0x00000000, 0xc0ded006,
>> +		0x80000069, 0x04058220, 0x02000014, 0xc0ded000,
>> +		0x80000061, 0x04150220, 0x00000064, 0x00000000,
>> +		0x80001940, 0x04158220, 0x02000414, 0xc0ded001,
>> +		0x80000061, 0x04254220, 0x00000000, 0xc0ded002,
>> +		0x80000061, 0x04450220, 0x00000054, 0x00000000,
>> +		0x80132031, 0x00000000, 0xc0000414, 0x02a00000,
>> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> +	}},
>> +	{ .gen_ver = 1250, .size = 56, .code = (const uint32_t []) {
>> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
>> +		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
>> +		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
>> +		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
>> +		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
>> +		0x80001940, 0x04258220, 0x02000424, 0xc0ded001,
>> +		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
>> +		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
>> +		0x80001901, 0x00010000, 0x00000000, 0x00000000,
>> +		0x80044031, 0x00000000, 0xc0000414, 0x02a00000,
>> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> +		0x80000901, 0x00010000, 0x00000000, 0x00000000,
>> +	}},
>> +	{ .gen_ver = 0, .size = 52, .code = (const uint32_t []) {
>> +		0x80000061, 0x05054220, 0x00000000, 0xc0ded003,
>> +		0x80000061, 0x05254220, 0x00000000, 0xc0ded004,
>> +		0x80000061, 0x05454220, 0x00000000, 0xc0ded005,
>> +		0x80000061, 0x05654220, 0x00000000, 0xc0ded006,
>> +		0x80000069, 0x04058220, 0x02000024, 0xc0ded000,
>> +		0x80000061, 0x04250220, 0x000000c4, 0x00000000,
>> +		0x80000140, 0x04258220, 0x02000424, 0xc0ded001,
>> +		0x80000061, 0x04454220, 0x00000000, 0xc0ded002,
>> +		0x80000061, 0x04850220, 0x000000a4, 0x00000000,
>> +		0x80049031, 0x00000000, 0xc0000414, 0x02a00000,
>> +		0x80000001, 0x00010000, 0x20000000, 0x00000000,
>> +		0x80000001, 0x00010000, 0x30000000, 0x00000000,
>> +		0x80000101, 0x00010000, 0x00000000, 0x00000000,
>> +	}}
>> +};
>> +
>> +struct iga64_template const iga64_code_eot[] = {
> Where's .gen_ver = 2000?

Apparently 2000 and 1272 produce the same binary code; in such a case we
keep only the lower one.
It is a nice way to check which gens introduce changes.
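
A rough sketch of why the missing entry is harmless, assuming the lookup
walks the table in decreasing gen_ver order and stops at the first entry
not newer than the device (as the "while (shdr->gen_ver < tpls->gen_ver)"
loop in __emit_iga64_code does). The function name pick_template is
hypothetical, used here only for illustration:

```shell
#!/bin/bash
# gen_ver values of the iga64_code_eot entries, in the table's decreasing order.
EOT_GEN_VERS=(1272 1250 0)

# Print the gen_ver of the first entry that is <= the device gen_ver.
# Usage: pick_template <device gen_ver> <table gen_ver>...
pick_template() {
    local dev=$1
    shift
    local v
    for v in "$@"; do
        if [ "$dev" -ge "$v" ]; then
            echo "$v"
            return 0
        fi
    done
    return 1
}

# A gen_ver 2000 device falls through to the 1272 entry.
pick_template 2000 "${EOT_GEN_VERS[@]}"   # prints 1272
# A device between two entries gets the closest lower one.
pick_template 1260 "${EOT_GEN_VERS[@]}"   # prints 1250
```

Since the last entry always has gen_ver 0, the walk terminates for any
device.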

Regards
Andrzej

>
> --
> Zbigniew
>
>> +	{ .gen_ver = 1272, .size = 8, .code = (const uint32_t []) {
>> +		0x800c0061, 0x70050220, 0x00460005, 0x00000000,
>> +		0x800f2031, 0x00000004, 0x3000700c, 0x00000000,
>> +	}},
>> +	{ .gen_ver = 1250, .size = 12, .code = (const uint32_t []) {
>> +		0x80030061, 0x70050220, 0x00460005, 0x00000000,
>> +		0x80001901, 0x00010000, 0x00000000, 0x00000000,
>> +		0x80034031, 0x00000004, 0x3000700c, 0x00000000,
>> +	}},
>> +	{ .gen_ver = 0, .size = 8, .code = (const uint32_t []) {
>> +		0x80030061, 0x70050220, 0x00460005, 0x00000000,
>> +		0x80049031, 0x00000004, 0x7020700c, 0x10000000,
>> +	}}
>> +};
>> diff --git a/tests/intel/xe_exec_sip.c b/tests/intel/xe_exec_sip.c
>> new file mode 100644
>> index 000000000000..af0eaf8cbda6
>> --- /dev/null
>> +++ b/tests/intel/xe_exec_sip.c
>> @@ -0,0 +1,239 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2024 Intel Corporation
>> + */
>> +
>> +/**
>> + * TEST: Tests for gpgpu shader and system routine execution
>> + * Category: Software building block
>> + * Sub-category: gpgpu
>> + * Functionality: system routine
>> + * Test category: functionality test
>> + */
>> +
>> +#include <dirent.h>
>> +#include <fcntl.h>
>> +#include <stdio.h>
>> +#include "gpgpu_shader.h"
>> +#include "igt.h"
>> +#include "igt_sysfs.h"
>> +#include "xe/xe_ioctl.h"
>> +#include "xe/xe_query.h"
>> +
>> +#define WIDTH 64
>> +#define HEIGHT 64
>> +
>> +#define COLOR_C4 0xc4
>> +
>> +#define SHADER_CANARY 0x01010101
>> +
>> +#define NSEC_PER_MSEC (1000 * 1000ull)
>> +
>> +static struct intel_buf *
>> +create_fill_buf(int fd, int width, int height, uint8_t color)
>> +{
>> +	struct intel_buf *buf;
>> +	uint8_t *ptr;
>> +
>> +	buf = calloc(1, sizeof(*buf));
>> +	igt_assert(buf);
>> +
>> +	intel_buf_init(buf_ops_create(fd), buf, width / 4, height, 32, 0,
>> +		       I915_TILING_NONE, 0);
>> +
>> +	ptr = xe_bo_map(fd, buf->handle, buf->surface[0].size);
>> +	memset(ptr, color, buf->surface[0].size);
>> +	munmap(ptr, buf->surface[0].size);
>> +
>> +	return buf;
>> +}
>> +
>> +static struct gpgpu_shader *get_shader(int fd)
>> +{
>> +	static struct gpgpu_shader *shader;
>> +
>> +	shader = gpgpu_shader_create(fd);
>> +	gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
>> +	gpgpu_shader__eot(shader);
>> +	return shader;
>> +}
>> +
>> +static uint32_t gpgpu_shader(int fd, struct intel_bb *ibb, unsigned int threads,
>> +			     unsigned int width, unsigned int height)
>> +{
>> +	struct intel_buf *buf = create_fill_buf(fd, width, height, COLOR_C4);
>> +	struct gpgpu_shader *shader = get_shader(fd);
>> +
>> +	gpgpu_shader_exec(ibb, buf, 1, threads, shader, NULL, 0, 0);
>> +	gpgpu_shader_destroy(shader);
>> +	return buf->handle;
>> +}
>> +
>> +static void check_fill_buf(uint8_t *ptr, const int width, const int x,
>> +			   const int y, const uint8_t color)
>> +{
>> +	const uint8_t val = ptr[y * width + x];
>> +
>> +	igt_assert_f(val == color,
>> +		     "Expected 0x%02x, found 0x%02x at (%d,%d)\n",
>> +		     color, val, x, y);
>> +}
>> +
>> +static void check_buf(int fd, uint32_t handle, int width, int height,
>> +		      uint8_t poison_c)
>> +{
>> +	unsigned int sz = ALIGN(width * height, 4096);
>> +	int thread_count = 0;
>> +	uint32_t *ptr;
>> +	int i, j;
>> +
>> +	ptr = xe_bo_mmap_ext(fd, handle, sz, PROT_READ);
>> +
>> +	for (i = 0, j = 0; j < height / 2; ++j) {
>> +		if (ptr[j * width / 4] == SHADER_CANARY) {
>> +			++thread_count;
>> +			i = 4;
>> +		}
>> +
>> +		for (; i < width; i++)
>> +			check_fill_buf((uint8_t *)ptr, width, i, j, poison_c);
>> +
>> +		i = 0;
>> +	}
>> +
>> +	igt_assert(thread_count);
>> +
>> +	munmap(ptr, sz);
>> +}
>> +
>> +static const char *class_to_str(int class)
>> +{
>> +        const char *str[] = {
>> +                [DRM_XE_ENGINE_CLASS_RENDER] = "rcs",
>> +                [DRM_XE_ENGINE_CLASS_COPY] = "bcs",
>> +                [DRM_XE_ENGINE_CLASS_VIDEO_DECODE] = "vcs",
>> +                [DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE] = "vecs",
>> +		[DRM_XE_ENGINE_CLASS_COMPUTE] = "ccs",
>> +        };
>> +
>> +        if (class < ARRAY_SIZE(str))
>> +                return str[class];
>> +
>> +        return "unk";
>> +}
>> +
>> +static uint64_t xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
>> +{
>> +	struct dirent *de;
>> +	int engines_fd = -1;
>> +	int gt_fd = -1;
>> +	DIR *dir;
>> +	/* Default timeout is 5s */
>> +	uint64_t ret = 5ULL * MSEC_PER_SEC;
>> +
>> +	gt_fd = xe_sysfs_gt_open(fd, eci->gt_id);
>> +	if (gt_fd == -1)
>> +		return ret;
>> +
>> +	engines_fd = openat(gt_fd, "engines", O_RDONLY);
>> +	if (engines_fd == -1) {
>> +		close(gt_fd);
>> +		return ret;
>> +	}
>> +
>> +	lseek(engines_fd, 0, SEEK_SET);
>> +	dir = fdopendir(engines_fd);
>> +	while (dir && (de = readdir(dir))) {
>> +		int engine_fd;
>> +		if (strcmp(de->d_name, class_to_str(eci->engine_class)))
>> +			continue;
>> +
>> +		engine_fd = openat(engines_fd, de->d_name, O_RDONLY);
>> +		if (engine_fd < 0)
>> +			break;
>> +
>> +		ret = igt_sysfs_get_u64(engine_fd, "job_timeout_ms");
>> +		close(engine_fd);
>> +		break;
>> +	}
>> +
>> +	close(engines_fd);
>> +	close(gt_fd);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * SUBTEST: sanity
>> + * Description: check basic shader with write operation
>> + * Run type: BAT
>> + *
>> + */
>> +static void test_sip(struct drm_xe_engine_class_instance *eci, uint32_t flags)
>> +{
>> +	unsigned int threads = 512;
>> +	unsigned int height = max_t(threads, HEIGHT, threads * 2);
>> +	uint32_t exec_queue_id, handle, vm_id;
>> +	unsigned int width = WIDTH;
>> +	struct timespec ts = { };
>> +	uint64_t timeout;
>> +	struct intel_bb *ibb;
>> +	int fd;
>> +
>> +	igt_debug("Using %s\n", xe_engine_class_string(eci->engine_class));
>> +
>> +	fd = drm_open_driver(DRIVER_XE);
>> +	xe_device_get(fd);
>> +
>> +	vm_id = xe_vm_create(fd, 0, 0);
>> +
>> +	/* Get timeout for job, and add 4s to ensure timeout processes in subtest. */
>> +	timeout = xe_sysfs_get_job_timeout_ms(fd, eci) + 4ull * MSEC_PER_SEC;
>> +	timeout *= NSEC_PER_MSEC;
>> +	timeout *= igt_run_in_simulation() ? 10 : 1;
>> +
>> +	exec_queue_id = xe_exec_queue_create(fd, vm_id, eci, 0);
>> +	ibb = intel_bb_create_with_context(fd, exec_queue_id, vm_id, NULL, 4096);
>> +
>> +	igt_nsec_elapsed(&ts);
>> +	handle = gpgpu_shader(fd, ibb, threads, width, height);
>> +
>> +	intel_bb_sync(ibb);
>> +	igt_assert_lt_u64(igt_nsec_elapsed(&ts), timeout);
>> +
>> +	check_buf(fd, handle, width, height, COLOR_C4);
>> +
>> +	gem_close(fd, handle);
>> +	intel_bb_destroy(ibb);
>> +
>> +	xe_exec_queue_destroy(fd, exec_queue_id);
>> +	xe_vm_destroy(fd, vm_id);
>> +	xe_device_put(fd);
>> +	close(fd);
>> +}
>> +
>> +#define test_render_and_compute(t, __fd, __eci) \
>> +	igt_subtest_with_dynamic(t) \
>> +		xe_for_each_engine(__fd, __eci) \
>> +			if (__eci->engine_class == DRM_XE_ENGINE_CLASS_RENDER || \
>> +			    __eci->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) \
>> +				igt_dynamic_f("%s%d", xe_engine_class_string(__eci->engine_class), \
>> +					      __eci->engine_instance)
>> +
>> +igt_main
>> +{
>> +	struct drm_xe_engine_class_instance *eci;
>> +	int fd;
>> +
>> +	igt_fixture {
>> +		fd = drm_open_driver(DRIVER_XE);
>> +		xe_device_get(fd);
>> +	}
>> +
>> +	test_render_and_compute("sanity", fd, eci)
>> +		test_sip(eci, 0);
>> +
>> +	igt_fixture {
>> +		xe_device_put(fd);
>> +		close(fd);
>> +	}
>> +}
>> diff --git a/tests/meson.build b/tests/meson.build
>> index 65b8bf23b972..63588e473616 100644
>> --- a/tests/meson.build
>> +++ b/tests/meson.build
>> @@ -292,6 +292,7 @@ intel_xe_progs = [
>>   	'xe_exec_fault_mode',
>>   	'xe_exec_queue_property',
>>   	'xe_exec_reset',
>> +	'xe_exec_sip',
>>   	'xe_exec_store',
>>   	'xe_exec_threads',
>>   	'xe_exercise_blt',
>>
>> -- 
>> 2.34.1
>>



end of thread (newest message: 2024-05-14  9:49 UTC)

Thread overview: 17+ messages
2024-04-29 12:08 [PATCH 0/4] lib/gpgpu: add shader support Andrzej Hajda
2024-04-29 12:08 ` [PATCH 1/4] lib/gpu_cmds: add Xe_LP version of emit_vfe_state Andrzej Hajda
2024-04-29 12:37   ` Grzegorzek, Dominik
2024-04-29 12:08 ` [PATCH 2/4] lib/gpgpu_shader: tooling for preparing and running gpgpu shaders Andrzej Hajda
2024-04-29 12:23   ` Grzegorzek, Dominik
2024-04-29 12:08 ` [PATCH 3/4] lib/gpgpu_shader: add inline support for iga64 assembly Andrzej Hajda
2024-05-10  5:52   ` Zbigniew Kempczyński
2024-05-10 10:42   ` Zbigniew Kempczyński
2024-05-14  9:39     ` Andrzej Hajda
2024-05-10 11:18   ` Kamil Konieczny
2024-05-14  9:42     ` Andrzej Hajda
2024-04-29 12:08 ` [PATCH 4/4] intel/xe_exec_sip: port test for shader sanity check Andrzej Hajda
2024-05-10 10:44   ` Zbigniew Kempczyński
2024-05-14  9:49     ` Andrzej Hajda
2024-05-10 11:30   ` Kamil Konieczny
2024-04-29 16:19 ` ✗ Fi.CI.BUILD: failure for lib/gpgpu: add shader support Patchwork
2024-04-29 16:21 ` ✗ GitLab.Pipeline: warning " Patchwork
