* [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support
@ 2023-10-05 15:31 Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
` (11 more replies)
0 siblings, 12 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
This series implements the IGT side of things needed to support the new Xe uapi here:
https://patchwork.freedesktop.org/series/123027/
Branch with the IGT changes:
https://gitlab.freedesktop.org/mwa/igt-gpu-tools/-/commits/xe-pat-index
Branch with the KMD changes:
https://gitlab.freedesktop.org/mwa/kernel/-/tree/xe-pat-index?ref_type=heads
--
2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
` (10 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Grab the PAT & coherency uapi additions.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
include/drm-uapi/xe_drm.h | 93 +++++++++++++++++++++++++++++++++++++--
1 file changed, 90 insertions(+), 3 deletions(-)
diff --git a/include/drm-uapi/xe_drm.h b/include/drm-uapi/xe_drm.h
index 804c02270..0a665f67f 100644
--- a/include/drm-uapi/xe_drm.h
+++ b/include/drm-uapi/xe_drm.h
@@ -456,8 +456,54 @@ struct drm_xe_gem_create {
*/
__u32 handle;
- /** @pad: MBZ */
- __u32 pad;
+ /**
+ * @coh_mode: The coherency mode for this object. This will limit the
+ * possible @cpu_caching values.
+ *
+ * Supported values:
+ *
+ * DRM_XE_GEM_COH_NONE: GPU access is assumed to be not coherent with
+ * CPU. CPU caches are not snooped.
+ *
+ * DRM_XE_GEM_COH_AT_LEAST_1WAY:
+ *
+ * CPU-GPU coherency must be at least 1WAY.
+ *
+ * If 1WAY then GPU access is coherent with CPU (CPU caches are snooped)
+ * until GPU acquires. The acquire by the GPU is not tracked by CPU
+ * caches.
+ *
+ * If 2WAY then should be fully coherent between GPU and CPU. Fully
+ * tracked by CPU caches. Both CPU and GPU caches are snooped.
+ *
+ * Note: On dgpu the GPU device never caches system memory. The device
+ * should be thought of as always 1WAY coherent, with the addition that
+ * the GPU never caches system memory. At least on current dgpu HW there
+ * is no way to turn off snooping so likely the different coherency
+ * modes of the pat_index make no difference for system memory.
+ */
+#define DRM_XE_GEM_COH_NONE 1
+#define DRM_XE_GEM_COH_AT_LEAST_1WAY 2
+ __u16 coh_mode;
+
+ /**
+ * @cpu_caching: The CPU caching mode to select for this object. If
+ * mmaping the object the mode selected here will also be used.
+ *
+ * Supported values:
+ *
+ * DRM_XE_GEM_CPU_CACHING_WB: Allocate the pages with write-back caching.
+ * On iGPU this can't be used for scanout surfaces. The @coh_mode must
+ * be DRM_XE_GEM_COH_AT_LEAST_1WAY. Currently not allowed for objects placed
+ * in VRAM.
+ *
+ * DRM_XE_GEM_CPU_CACHING_WC: Allocate the pages as write-combined. This is
+ * uncached. Any @coh_mode is permitted. Scanout surfaces should likely
+ * use this. All objects that can be placed in VRAM must use this.
+ */
+#define DRM_XE_GEM_CPU_CACHING_WB 1
+#define DRM_XE_GEM_CPU_CACHING_WC 2
+ __u16 cpu_caching;
/** @reserved: Reserved */
__u64 reserved[2];
@@ -552,8 +598,49 @@ struct drm_xe_vm_bind_op {
*/
__u32 obj;
+ /**
+ * @pat_index: The platform defined @pat_index to use for this mapping.
+ * The index basically maps to some predefined memory attributes,
+ * including things like caching, coherency, compression etc. The exact
+ * meaning of the pat_index is platform specific and defined in the
+ * Bspec and PRMs. When the KMD sets up the binding the index here is
+ * encoded into the ppGTT PTE.
+ *
+	 * For coherency the @pat_index needs to be at least as coherent as
+	 * drm_xe_gem_create.coh_mode, i.e. coh_mode(pat_index) >=
+ * drm_xe_gem_create.coh_mode. The KMD will extract the coherency mode
+ * from the @pat_index and reject if there is a mismatch (see note below
+ * for pre-MTL platforms).
+ *
+ * Note: On pre-MTL platforms there is only a caching mode and no
+ * explicit coherency mode, but on such hardware there is always a
+ * shared-LLC (or is dgpu) so all GT memory accesses are coherent with
+ * CPU caches even with the caching mode set as uncached. It's only the
+ * display engine that is incoherent (on dgpu it must be in VRAM which
+ * is always mapped as WC on the CPU). However to keep the uapi somewhat
+ * consistent with newer platforms the KMD groups the different cache
+ * levels into the following coherency buckets on all pre-MTL platforms:
+ *
+ * ppGTT UC -> DRM_XE_GEM_COH_NONE
+ * ppGTT WC -> DRM_XE_GEM_COH_NONE
+ * ppGTT WT -> DRM_XE_GEM_COH_NONE
+ * ppGTT WB -> DRM_XE_GEM_COH_AT_LEAST_1WAY
+ *
+	 * In practice UC/WC/WT should only ever be used for scanout surfaces on
+ * such platforms (or perhaps in general for dma-buf if shared with
+ * another device) since it is only the display engine that is actually
+ * incoherent. Everything else should typically use WB given that we
+ * have a shared-LLC. On MTL+ this completely changes and the HW
+ * defines the coherency mode as part of the @pat_index, where
+ * incoherent GT access is possible.
+ *
+ * Note: For userptr and externally imported dma-buf the kernel expects
+ * either 1WAY or 2WAY for the @pat_index.
+ */
+ __u16 pat_index;
+
/** @pad: MBZ */
- __u32 pad;
+ __u16 pad;
union {
/**
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
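A minimal userspace sketch of how the new fields might be used, assuming the uapi above lands as-is. The xe_create_wb_bo() helper and the placement_flags parameter are illustrative only (placement selection follows the existing flags field), and error handling is reduced to returning -errno.

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include "xe_drm.h"

static int xe_create_wb_bo(int fd, __u32 vm_id, __u64 size,
			   __u32 placement_flags, __u32 *handle)
{
	struct drm_xe_gem_create create;

	memset(&create, 0, sizeof(create));
	create.vm_id = vm_id;
	create.size = size;
	create.flags = placement_flags;
	/* A WB CPU mapping requires at least 1WAY coherency */
	create.cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
	create.coh_mode = DRM_XE_GEM_COH_AT_LEAST_1WAY;

	if (ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create))
		return -errno;

	*handle = create.handle;
	return 0;
}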
* [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
` (9 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Display buffers will likely want WC, instead of the default WB, on the
CPU side, given that the display engine is incoherent with CPU caches.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_fb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index 54a66eb6a..f8a0db22c 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -1206,7 +1206,8 @@ static int create_bo_for_fb(struct igt_fb *fb, bool prefer_sysmem)
igt_assert(err == 0 || err == -EOPNOTSUPP);
} else if (is_xe_device(fd)) {
fb->gem_handle = xe_bo_create_flags(fd, 0, fb->size,
- visible_vram_if_possible(fd, 0));
+ visible_vram_if_possible(fd, 0) |
+ XE_GEM_CREATE_FLAG_SCANOUT);
} else if (is_vc4_device(fd)) {
fb->gem_handle = igt_vc4_create_bo(fd, fb->size);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: mark buffers as SCANOUT
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
` (8 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Display buffers will likely want WC, instead of the default WB, on the
CPU side, given that the display engine is incoherent with CPU caches.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_draw.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/igt_draw.c b/lib/igt_draw.c
index 476778a13..2332bf94a 100644
--- a/lib/igt_draw.c
+++ b/lib/igt_draw.c
@@ -791,7 +791,8 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
else
tmp.handle = xe_bo_create_flags(fd, 0,
ALIGN(tmp.size, xe_get_default_alignment(fd)),
- visible_vram_if_possible(fd, 0));
+ visible_vram_if_possible(fd, 0) |
+ XE_GEM_CREATE_FLAG_SCANOUT);
tmp.stride = rect->w * pixel_size;
tmp.bpp = buf->bpp;
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (2 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:04 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode Matthew Auld
` (7 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Most tests shouldn't care about such things, so it's likely just a case
of picking the most sane default. However, we also add some helpers for
the tests that do care.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/xe/xe_ioctl.c | 65 ++++++++++++++++++++++++++++++++++-------
lib/xe/xe_ioctl.h | 8 +++++
tests/intel/xe_create.c | 3 ++
3 files changed, 65 insertions(+), 11 deletions(-)
diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c
index 730dcfd16..80696aa59 100644
--- a/lib/xe/xe_ioctl.c
+++ b/lib/xe/xe_ioctl.c
@@ -233,13 +233,30 @@ void xe_vm_destroy(int fd, uint32_t vm)
igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_VM_DESTROY, &destroy), 0);
}
-uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
- uint32_t *handle)
+void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
+ uint16_t *cpu_caching,
+ uint16_t *coh_mode)
+{
+ if ((flags & all_memory_regions(fd)) != system_memory(fd) ||
+ flags & XE_GEM_CREATE_FLAG_SCANOUT) {
+ /* VRAM placements or scanout should always use WC */
+ *cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+ *coh_mode = DRM_XE_GEM_COH_NONE;
+ } else {
+ *cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ *coh_mode = DRM_XE_GEM_COH_AT_LEAST_1WAY;
+ }
+}
+
+static uint32_t ___xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode, uint32_t *handle)
{
struct drm_xe_gem_create create = {
.vm_id = vm,
.size = size,
.flags = flags,
+ .cpu_caching = cpu_caching,
+ .coh_mode = coh_mode,
};
int err;
@@ -249,6 +266,18 @@ uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags
*handle = create.handle;
return 0;
+
+}
+
+uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint32_t *handle)
+{
+ uint16_t cpu_caching, coh_mode;
+
+ __xe_default_coh_caching_from_flags(fd, flags, &cpu_caching, &coh_mode);
+
+ return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
+ handle);
}
uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
@@ -260,19 +289,33 @@ uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
return handle;
}
+uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode,
+ uint32_t *handle)
+{
+ return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
+ handle);
+}
+
+uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode)
+{
+ uint32_t handle;
+
+ igt_assert_eq(__xe_bo_create_caching(fd, vm, size, flags,
+ cpu_caching, coh_mode, &handle), 0);
+
+ return handle;
+}
+
uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size)
{
- struct drm_xe_gem_create create = {
- .vm_id = vm,
- .size = size,
- .flags = vram_if_possible(fd, gt),
- };
- int err;
+ uint32_t handle;
- err = igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create);
- igt_assert_eq(err, 0);
+ igt_assert_eq(__xe_bo_create_flags(fd, vm, size, vram_if_possible(fd, gt),
+ &handle), 0);
- return create.handle;
+ return handle;
}
uint32_t xe_bind_exec_queue_create(int fd, uint32_t vm, uint64_t ext)
diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
index 6c281b3bf..c18fc878c 100644
--- a/lib/xe/xe_ioctl.h
+++ b/lib/xe/xe_ioctl.h
@@ -67,6 +67,14 @@ void xe_vm_destroy(int fd, uint32_t vm);
uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
uint32_t *handle);
uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags);
+uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode,
+ uint32_t *handle);
+uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode);
+void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
+ uint16_t *cpu_caching,
+ uint16_t *coh_mode);
uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size);
uint32_t xe_exec_queue_create(int fd, uint32_t vm,
struct drm_xe_engine_class_instance *instance,
diff --git a/tests/intel/xe_create.c b/tests/intel/xe_create.c
index 8d845e5c8..f5d2cc1b2 100644
--- a/tests/intel/xe_create.c
+++ b/tests/intel/xe_create.c
@@ -30,6 +30,9 @@ static int __create_bo(int fd, uint32_t vm, uint64_t size, uint32_t flags,
igt_assert(handlep);
+ __xe_default_coh_caching_from_flags(fd, flags, &create.cpu_caching,
+ &create.coh_mode);
+
if (igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create)) {
ret = -errno;
errno = 0;
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
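A hedged sketch of how a test might use the new helpers, assuming they land as declared in xe_ioctl.h above; caching_examples() is a hypothetical function, fd is an open xe device and size is illustrative.

#include "igt.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"

static void caching_examples(int fd, uint64_t size)
{
	uint32_t bo_default, bo_wc;

	/* Default path: a plain sysmem placement picks WB + AT_LEAST_1WAY */
	bo_default = xe_bo_create_flags(fd, 0, size, system_memory(fd));

	/* Explicit path: force WC + COH_NONE on the same placement */
	bo_wc = xe_bo_create_caching(fd, 0, size, system_memory(fd),
				     DRM_XE_GEM_CPU_CACHING_WC,
				     DRM_XE_GEM_COH_NONE);

	gem_close(fd, bo_default);
	gem_close(fd, bo_wc);
}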
* [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (3 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes Matthew Auld
` (6 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Ensure the various invalid combinations are rejected. Also ensure we can
mmap and fault anything that is valid.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
tests/intel/xe_mmap.c | 77 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/tests/intel/xe_mmap.c b/tests/intel/xe_mmap.c
index 7e7e43c00..09e9c8aae 100644
--- a/tests/intel/xe_mmap.c
+++ b/tests/intel/xe_mmap.c
@@ -199,6 +199,80 @@ static void test_small_bar(int fd)
gem_close(fd, bo);
}
+static void assert_caching(int fd, uint64_t flags, uint16_t cpu_caching,
+ uint16_t coh_mode, bool fail)
+{
+ uint64_t size = xe_get_default_alignment(fd);
+ uint64_t mmo;
+ uint32_t handle;
+ uint32_t *map;
+ bool ret;
+
+ ret = __xe_bo_create_caching(fd, 0, size, flags, cpu_caching,
+ coh_mode, &handle);
+ igt_assert(ret == fail);
+
+ if (fail)
+ return;
+
+ mmo = xe_bo_mmap_offset(fd, handle);
+ map = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fd, mmo);
+ igt_assert(map != MAP_FAILED);
+ map[0] = 0xdeadbeaf;
+ gem_close(fd, handle);
+}
+
+/**
+ * SUBTEST: cpu-caching-coh
+ * Description: Test cpu_caching and coh, including mmap behaviour.
+ * Test category: functionality test
+ */
+static void test_cpu_caching(int fd)
+{
+ if (vram_memory(fd, 0)) {
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WC, DRM_XE_GEM_COH_NONE,
+ false);
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WC, DRM_XE_GEM_COH_AT_LEAST_1WAY,
+ false);
+ assert_caching(fd, vram_memory(fd, 0) | system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WC, DRM_XE_GEM_COH_NONE,
+ false);
+
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_NONE,
+ true);
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_AT_LEAST_1WAY,
+ true);
+ assert_caching(fd, vram_memory(fd, 0) | system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_NONE,
+ true);
+ assert_caching(fd, vram_memory(fd, 0) | system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_AT_LEAST_1WAY,
+ true);
+ }
+
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WB,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY, false);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_NONE, false);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY, false);
+
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WB,
+ DRM_XE_GEM_COH_NONE, true);
+ assert_caching(fd, system_memory(fd), -1, -1, true);
+ assert_caching(fd, system_memory(fd), 0, 0, true);
+ assert_caching(fd, system_memory(fd), 0, DRM_XE_GEM_COH_AT_LEAST_1WAY, true);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC, 0, true);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC + 1,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY, true);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY + 1, true);
+}
+
igt_main
{
int fd;
@@ -230,6 +304,9 @@ igt_main
test_small_bar(fd);
}
+ igt_subtest("cpu-caching-coh")
+ test_cpu_caching(fd);
+
igt_fixture
drm_close_driver(fd);
}
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (4 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
` (5 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
For now just add uc, wt and wb for every platform. The wb mode should
always be at least 1way coherent when dealing with system memory. Also
make non-matching platforms throw an error, rather than trying to
inherit the modes from previous platforms, since they will likely be
different.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/intel_pat.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++
lib/intel_pat.h | 19 ++++++++++++
lib/meson.build | 1 +
3 files changed, 97 insertions(+)
create mode 100644 lib/intel_pat.c
create mode 100644 lib/intel_pat.h
diff --git a/lib/intel_pat.c b/lib/intel_pat.c
new file mode 100644
index 000000000..4d19d57ea
--- /dev/null
+++ b/lib/intel_pat.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include "intel_pat.h"
+
+#include "igt.h"
+
+struct intel_pat_cache {
+ uint8_t uc; /* UC + COH_NONE */
+ uint8_t wt; /* WT + COH_NONE */
+ uint8_t wb; /* WB + COH_AT_LEAST_1WAY */
+
+ uint8_t max_index;
+};
+
+static void intel_get_pat_idx(int fd, struct intel_pat_cache *pat)
+{
+ uint16_t dev_id = intel_get_drm_devid(fd);
+
+ if (intel_graphics_ver(dev_id) == IP_VER(20, 0)) {
+ pat->uc = 3;
+ pat->wt = 15;
+ pat->wb = 2;
+ pat->max_index = 31;
+ } else if (IS_METEORLAKE(dev_id)) {
+ pat->uc = 2;
+ pat->wt = 1;
+ pat->wb = 3;
+ pat->max_index = 3;
+ } else if (IS_PONTEVECCHIO(dev_id)) {
+ pat->uc = 0;
+ pat->wt = 2;
+ pat->wb = 3;
+ pat->max_index = 7;
+ } else if (intel_graphics_ver(dev_id) <= IP_VER(12, 60)) {
+ pat->uc = 3;
+ pat->wt = 2;
+ pat->wb = 0;
+ pat->max_index = 3;
+ } else {
+ igt_critical("Platform is missing PAT settings for uc/wt/wb\n");
+ }
+}
+
+uint8_t intel_get_max_pat_index(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.max_index;
+}
+
+uint8_t intel_get_pat_idx_uc(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.uc;
+}
+
+uint8_t intel_get_pat_idx_wt(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.wt;
+}
+
+uint8_t intel_get_pat_idx_wb(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.wb;
+}
diff --git a/lib/intel_pat.h b/lib/intel_pat.h
new file mode 100644
index 000000000..c24dbc275
--- /dev/null
+++ b/lib/intel_pat.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef INTEL_PAT_H
+#define INTEL_PAT_H
+
+#include <stdint.h>
+
+#define DEFAULT_PAT_INDEX ((uint8_t)-1) /* igt-core can pick 1way or better */
+
+uint8_t intel_get_max_pat_index(int fd);
+
+uint8_t intel_get_pat_idx_uc(int fd);
+uint8_t intel_get_pat_idx_wt(int fd);
+uint8_t intel_get_pat_idx_wb(int fd);
+
+#endif /* INTEL_PAT_H */
diff --git a/lib/meson.build b/lib/meson.build
index a7bccafc3..48466a2e9 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -64,6 +64,7 @@ lib_sources = [
'intel_device_info.c',
'intel_mmio.c',
'intel_mocs.c',
+ 'intel_pat.c',
'ioctl_wrappers.c',
'media_spin.c',
'media_fill.c',
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
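A small sketch of the intended usage, assuming the helpers above; pick_pat_index() is a hypothetical wrapper, not part of the series.

#include <stdbool.h>
#include "intel_pat.h"

static uint8_t pick_pat_index(int fd, bool display_surface)
{
	/*
	 * Display is incoherent with CPU caches, so wt (or uc) is the safe
	 * choice there; everything else can use the 1way-or-better wb mode.
	 */
	return display_surface ? intel_get_pat_idx_wt(fd) :
				 intel_get_pat_idx_wb(fd);
}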
* [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (5 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-06 11:38 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
` (4 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
For some cases we are going to need to pass the pat_index for the
vm_bind op. Add a helper for this, such that we can allocate an address
and give the mapping some pat_index.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/intel_allocator.c | 43 +++++++++++++++++++++++--------
lib/intel_allocator.h | 5 +++-
lib/xe/xe_util.c | 1 +
lib/xe/xe_util.h | 1 +
tests/intel/api_intel_allocator.c | 4 ++-
5 files changed, 41 insertions(+), 13 deletions(-)
diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index f0a9b7fb5..da357b833 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -16,6 +16,7 @@
#include "igt_map.h"
#include "intel_allocator.h"
#include "intel_allocator_msgchannel.h"
+#include "intel_pat.h"
#include "xe/xe_query.h"
#include "xe/xe_util.h"
@@ -92,6 +93,7 @@ struct allocator_object {
uint32_t handle;
uint64_t offset;
uint64_t size;
+ uint8_t pat_index;
enum allocator_bind_op bind_op;
};
@@ -1122,14 +1124,14 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
static bool is_same(struct allocator_object *obj,
uint32_t handle, uint64_t offset, uint64_t size,
- enum allocator_bind_op bind_op)
+ uint8_t pat_index, enum allocator_bind_op bind_op)
{
return obj->handle == handle && obj->offset == offset && obj->size == size &&
- (obj->bind_op == bind_op || obj->bind_op == BOUND);
+ obj->pat_index == pat_index && (obj->bind_op == bind_op || obj->bind_op == BOUND);
}
static void track_object(uint64_t allocator_handle, uint32_t handle,
- uint64_t offset, uint64_t size,
+ uint64_t offset, uint64_t size, uint8_t pat_index,
enum allocator_bind_op bind_op)
{
struct ahnd_info *ainfo;
@@ -1156,6 +1158,9 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
if (ainfo->driver == INTEL_DRIVER_I915)
return; /* no-op for i915, at least for now */
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(ainfo->fd);
+
pthread_mutex_lock(&ainfo->bind_map_mutex);
obj = igt_map_search(ainfo->bind_map, &handle);
if (obj) {
@@ -1165,7 +1170,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
* bind_map.
*/
if (bind_op == TO_BIND) {
- igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
+ igt_assert_eq(is_same(obj, handle, offset, size, pat_index, bind_op), true);
} else if (bind_op == TO_UNBIND) {
if (obj->bind_op == TO_BIND)
igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
@@ -1181,6 +1186,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
obj->handle = handle;
obj->offset = offset;
obj->size = size;
+ obj->pat_index = pat_index;
obj->bind_op = bind_op;
igt_map_insert(ainfo->bind_map, &obj->handle, obj);
}
@@ -1204,7 +1210,7 @@ out:
*/
uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
uint64_t size, uint64_t alignment,
- enum allocator_strategy strategy)
+ uint8_t pat_index, enum allocator_strategy strategy)
{
struct alloc_req req = { .request_type = REQ_ALLOC,
.allocator_handle = allocator_handle,
@@ -1219,7 +1225,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
igt_assert(handle_request(&req, &resp) == 0);
igt_assert(resp.response_type == RESP_ALLOC);
- track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
+ track_object(allocator_handle, handle, resp.alloc.offset, size, pat_index,
+ TO_BIND);
return resp.alloc.offset;
}
@@ -1241,7 +1248,7 @@ uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
uint64_t offset;
offset = __intel_allocator_alloc(allocator_handle, handle,
- size, alignment,
+ size, alignment, DEFAULT_PAT_INDEX,
ALLOC_STRATEGY_NONE);
igt_assert(offset != ALLOC_INVALID_ADDRESS);
@@ -1268,7 +1275,8 @@ uint64_t intel_allocator_alloc_with_strategy(uint64_t allocator_handle,
uint64_t offset;
offset = __intel_allocator_alloc(allocator_handle, handle,
- size, alignment, strategy);
+ size, alignment, DEFAULT_PAT_INDEX,
+ strategy);
igt_assert(offset != ALLOC_INVALID_ADDRESS);
return offset;
@@ -1298,7 +1306,7 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
igt_assert(handle_request(&req, &resp) == 0);
igt_assert(resp.response_type == RESP_FREE);
- track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
+ track_object(allocator_handle, handle, 0, 0, 0, TO_UNBIND);
return resp.free.freed;
}
@@ -1500,16 +1508,17 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
if (obj->bind_op == BOUND)
continue;
- bind_info("= [vm: %u] %s => %u %lx %lx\n",
+ bind_info("= [vm: %u] %s => %u %lx %lx %u\n",
ainfo->vm,
obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
obj->handle, obj->offset,
- obj->size);
+ obj->size, obj->pat_index);
entry = malloc(sizeof(*entry));
entry->handle = obj->handle;
entry->offset = obj->offset;
entry->size = obj->size;
+ entry->pat_index = obj->pat_index;
entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
XE_OBJECT_UNBIND;
igt_list_add(&entry->link, &obj_list);
@@ -1534,6 +1543,18 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
}
}
+uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
+ uint64_t alignment, uint8_t pat_index)
+{
+ uint64_t offset;
+
+ offset = __intel_allocator_alloc(ahnd, handle, size, alignment,
+ pat_index, ALLOC_STRATEGY_NONE);
+ igt_assert(offset != ALLOC_INVALID_ADDRESS);
+
+ return offset;
+}
+
/**
* intel_allocator_bind:
* @allocator_handle: handle to an allocator
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index f9ff7f1cc..5da8af7f9 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -186,7 +186,7 @@ bool intel_allocator_close(uint64_t allocator_handle);
void intel_allocator_get_address_range(uint64_t allocator_handle,
uint64_t *startp, uint64_t *endp);
uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
- uint64_t size, uint64_t alignment,
+ uint64_t size, uint64_t alignment, uint8_t pat_index,
enum allocator_strategy strategy);
uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
uint64_t size, uint64_t alignment);
@@ -266,6 +266,9 @@ static inline bool put_ahnd(uint64_t ahnd)
return !ahnd || intel_allocator_close(ahnd);
}
+uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
+ uint64_t alignment, uint8_t pat_index);
+
static inline uint64_t get_offset(uint64_t ahnd, uint32_t handle,
uint64_t size, uint64_t alignment)
{
diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
index 2f9ffe2f1..8583326a9 100644
--- a/lib/xe/xe_util.c
+++ b/lib/xe/xe_util.c
@@ -145,6 +145,7 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct igt_list_head *obj_lis
ops->addr = obj->offset;
ops->range = obj->size;
ops->region = 0;
+ ops->pat_index = obj->pat_index;
bind_info(" [%d]: [%6s] handle: %u, offset: %llx, size: %llx\n",
i, obj->bind_op == XE_OBJECT_BIND ? "BIND" : "UNBIND",
diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
index e97d236b8..e3bdf3d11 100644
--- a/lib/xe/xe_util.h
+++ b/lib/xe/xe_util.h
@@ -36,6 +36,7 @@ struct xe_object {
uint32_t handle;
uint64_t offset;
uint64_t size;
+ uint8_t pat_index;
enum xe_bind_op bind_op;
struct igt_list_head link;
};
diff --git a/tests/intel/api_intel_allocator.c b/tests/intel/api_intel_allocator.c
index f3fcf8a34..d19be3ce9 100644
--- a/tests/intel/api_intel_allocator.c
+++ b/tests/intel/api_intel_allocator.c
@@ -9,6 +9,7 @@
#include "igt.h"
#include "igt_aux.h"
#include "intel_allocator.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -131,7 +132,8 @@ static void alloc_simple(int fd)
intel_allocator_get_address_range(ahnd, &start, &end);
offset0 = intel_allocator_alloc(ahnd, 1, end - start, 0);
- offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, ALLOC_STRATEGY_NONE);
+ offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, DEFAULT_PAT_INDEX,
+ ALLOC_STRATEGY_NONE);
igt_assert(offset1 == ALLOC_INVALID_ADDRESS);
intel_allocator_free(ahnd, 1);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
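A sketch of the expected flow, assuming the helper above; bind_as_uc() is a hypothetical wrapper, ahnd is an allocator handle already opened against an xe VM, and the final flush goes through the existing intel_allocator_bind() API.

#include "intel_allocator.h"
#include "intel_pat.h"
#include "xe/xe_query.h"

static uint64_t bind_as_uc(int fd, uint64_t ahnd, uint32_t handle, uint64_t size)
{
	/* Reserve an address and tag the mapping with the uc pat_index */
	uint64_t offset = get_offset_pat_index(ahnd, handle, size,
					       xe_get_default_alignment(fd),
					       intel_get_pat_idx_uc(fd));

	/* The pat_index is applied when the tracked binds are flushed */
	intel_allocator_bind(ahnd, 0, 0);

	return offset;
}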
* [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (6 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-06 11:51 ` [Intel-xe] [igt-dev] " Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
` (3 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
For the most part we can just use the default wb; however, some users,
including display, might want to use something else.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_fb.c | 2 ++
lib/intel_blt.c | 54 +++++++++++++++++++++------------
lib/intel_blt.h | 7 +++--
tests/intel/gem_ccs.c | 16 +++++-----
tests/intel/gem_lmem_swapping.c | 4 +--
tests/intel/xe_ccs.c | 19 +++++++-----
6 files changed, 64 insertions(+), 38 deletions(-)
diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index f8a0db22c..d290fd775 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -37,6 +37,7 @@
#include "i915/gem_mman.h"
#include "intel_blt.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
#include "igt_aux.h"
#include "igt_color_encoding.h"
#include "igt_fb.h"
@@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
blt_set_object(blt, handle, fb->size, memregion,
intel_get_uc_mocs(fb->fd),
+ intel_get_pat_idx_wt(fb->fd),
blt_tile,
is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
diff --git a/lib/intel_blt.c b/lib/intel_blt.c
index b55fa9b52..b7ac2902b 100644
--- a/lib/intel_blt.c
+++ b/lib/intel_blt.c
@@ -13,6 +13,7 @@
#include "igt.h"
#include "igt_syncobj.h"
#include "intel_blt.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
#include "xe/xe_util.h"
@@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
igt_assert_f(blt, "block-copy requires data to do blit\n");
alignment = get_default_alignment(fd, blt->driver);
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
- + blt->src.plane_offset;
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
- + blt->dst.plane_offset;
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index) +
+ blt->src.plane_offset;
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
+ alignment, blt->dst.pat_index) +
+ blt->dst.plane_offset;
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
fill_data(&data, blt, src_offset, dst_offset, ext);
@@ -884,8 +887,10 @@ int blt_block_copy(int fd,
igt_assert_neq(blt->driver, 0);
alignment = get_default_alignment(fd, blt->driver);
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
+ alignment, blt->dst.pat_index);
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
@@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
data.dw00.length = 0x3;
- src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
- dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
+ alignment, surf->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
+ alignment, surf->dst.pat_index);
bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
data.dw01.src_address_lo = src_offset;
@@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
igt_assert_neq(surf->driver, 0);
alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
- src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
- dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
+ alignment, surf->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
+ alignment, surf->dst.pat_index);
bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
@@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
data.dw03.dst_x2 = blt->dst.x2;
data.dw03.dst_y2 = blt->dst.y2;
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
- + blt->src.plane_offset;
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
- + blt->dst.plane_offset;
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index) +
+ blt->src.plane_offset;
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
+ blt->dst.pat_index) +
+ blt->dst.plane_offset;
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
data.dw04.dst_address_lo = dst_offset;
@@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
igt_assert_neq(blt->driver, 0);
alignment = get_default_alignment(fd, blt->driver);
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
+ alignment, blt->dst.pat_index);
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
emit_blt_fast_copy(fd, ahnd, blt, 0, true);
@@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
&size, region) == 0);
}
- blt_set_object(obj, handle, size, region, mocs, tiling,
+ blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
compression, compression_type);
blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
@@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
void blt_set_object(struct blt_copy_object *obj,
uint32_t handle, uint64_t size, uint32_t region,
- uint8_t mocs, enum blt_tiling_type tiling,
+ uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
enum blt_compression compression,
enum blt_compression_type compression_type)
{
@@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
obj->size = size;
obj->region = region;
obj->mocs = mocs;
+ obj->pat_index = pat_index;
obj->tiling = tiling;
obj->compression = compression;
obj->compression_type = compression_type;
@@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
uint32_t handle, uint32_t region, uint64_t size,
- uint8_t mocs, enum blt_access_type access_type)
+ uint8_t mocs, uint8_t pat_index,
+ enum blt_access_type access_type)
{
obj->handle = handle;
obj->region = region;
obj->size = size;
obj->mocs = mocs;
+ obj->pat_index = pat_index;
obj->access_type = access_type;
}
diff --git a/lib/intel_blt.h b/lib/intel_blt.h
index d9c8883c7..f8423a986 100644
--- a/lib/intel_blt.h
+++ b/lib/intel_blt.h
@@ -79,6 +79,7 @@ struct blt_copy_object {
uint32_t region;
uint64_t size;
uint8_t mocs;
+ uint8_t pat_index;
enum blt_tiling_type tiling;
enum blt_compression compression; /* BC only */
enum blt_compression_type compression_type; /* BC only */
@@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
uint32_t region;
uint64_t size;
uint8_t mocs;
+ uint8_t pat_index;
enum blt_access_type access_type;
};
@@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
void blt_destroy_object(int fd, struct blt_copy_object *obj);
void blt_set_object(struct blt_copy_object *obj,
uint32_t handle, uint64_t size, uint32_t region,
- uint8_t mocs, enum blt_tiling_type tiling,
+ uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
enum blt_compression compression,
enum blt_compression_type compression_type);
void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
@@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
const struct blt_copy_object *orig);
void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
uint32_t handle, uint32_t region, uint64_t size,
- uint8_t mocs, enum blt_access_type access_type);
+ uint8_t mocs, uint8_t pat_index,
+ enum blt_access_type access_type);
void blt_surface_info(const char *info,
const struct blt_copy_object *obj);
diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
index f5d4ab359..a98557b72 100644
--- a/tests/intel/gem_ccs.c
+++ b/tests/intel/gem_ccs.c
@@ -15,6 +15,7 @@
#include "lib/intel_chipset.h"
#include "intel_blt.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
/**
* TEST: gem ccs
* Description: Exercise gen12 blitter with and without flatccs compression
@@ -111,9 +112,9 @@ static void surf_copy(int i915,
blt_ctrl_surf_copy_init(i915, &surf);
surf.print_bb = param.print_bb;
blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
- uc_mocs, BLT_INDIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
- uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
bb_size = 4096;
igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
@@ -133,7 +134,7 @@ static void surf_copy(int i915,
igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
- 0, DIRECT_ACCESS);
+ 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
gem_sync(i915, surf.dst.handle);
@@ -155,9 +156,9 @@ static void surf_copy(int i915,
for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
ccsmap[i] = i;
blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
- uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
- uc_mocs, INDIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
blt_copy_init(i915, &blt);
@@ -399,7 +400,8 @@ static void block_copy(int i915,
blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
if (config->inplace) {
blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
- T_LINEAR, COMPRESSION_DISABLED, comp_type);
+ DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
+ comp_type);
blt.dst.ptr = mid->ptr;
}
@@ -475,7 +477,7 @@ static void block_multicopy(int i915,
if (config->inplace) {
blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
- mid->mocs, mid_tiling, COMPRESSION_DISABLED,
+ mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
comp_type);
blt3.dst.ptr = mid->ptr;
}
diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
index ede545c92..7f2ab8bb6 100644
--- a/tests/intel/gem_lmem_swapping.c
+++ b/tests/intel/gem_lmem_swapping.c
@@ -486,7 +486,7 @@ static void __do_evict(int i915,
INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
blt_set_object(tmp, tmp->handle, params->size.max,
INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
- intel_get_uc_mocs(i915), T_LINEAR,
+ intel_get_uc_mocs(i915), 0, T_LINEAR,
COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
}
@@ -516,7 +516,7 @@ static void __do_evict(int i915,
obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
igt_assert(obj->blt_obj);
blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
- intel_get_uc_mocs(i915), T_LINEAR,
+ intel_get_uc_mocs(i915), 0, T_LINEAR,
COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
index 20bbc4448..27859d5ce 100644
--- a/tests/intel/xe_ccs.c
+++ b/tests/intel/xe_ccs.c
@@ -13,6 +13,7 @@
#include "igt_syncobj.h"
#include "intel_blt.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
#include "xe/xe_util.h"
@@ -108,8 +109,9 @@ static void surf_copy(int xe,
blt_ctrl_surf_copy_init(xe, &surf);
surf.print_bb = param.print_bb;
blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
- uc_mocs, BLT_INDIRECT_ACCESS);
- blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
+ blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
+ DEFAULT_PAT_INDEX, DIRECT_ACCESS);
bb_size = xe_get_default_alignment(xe);
bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
@@ -130,7 +132,7 @@ static void surf_copy(int xe,
igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
- 0, DIRECT_ACCESS);
+ 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
intel_ctx_xe_sync(ctx, true);
@@ -153,9 +155,9 @@ static void surf_copy(int xe,
for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
ccsmap[i] = i;
blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
- uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
- uc_mocs, INDIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
intel_ctx_xe_sync(ctx, true);
@@ -369,7 +371,8 @@ static void block_copy(int xe,
blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
if (config->inplace) {
blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
- T_LINEAR, COMPRESSION_DISABLED, comp_type);
+ DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
+ comp_type);
blt.dst.ptr = mid->ptr;
}
@@ -450,8 +453,8 @@ static void block_multicopy(int xe,
if (config->inplace) {
blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
- mid->mocs, mid_tiling, COMPRESSION_DISABLED,
- comp_type);
+ mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
+ COMPRESSION_DISABLED, comp_type);
blt3.dst.ptr = mid->ptr;
}
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
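A sketch of the updated blt_set_object() call with the extra pat_index argument, mirroring what blt_fb_init() does above; init_scanout_blt_obj() is a hypothetical helper and the geometry values are illustrative.

#include "intel_blt.h"
#include "intel_mocs.h"
#include "intel_pat.h"

static void init_scanout_blt_obj(int fd, struct blt_copy_object *obj,
				 uint32_t handle, uint64_t size,
				 uint32_t region, uint32_t stride,
				 uint32_t width, uint32_t height)
{
	/* Display surfaces want an incoherent mode, so use the wt pat_index */
	blt_set_object(obj, handle, size, region,
		       intel_get_uc_mocs(fd),
		       intel_get_pat_idx_wt(fd),
		       T_LINEAR,
		       COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
}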
* [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: support pat_index
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (7 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-06 12:13 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index Matthew Auld
` (2 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Some users need to be able to select their own pat_index. Some display
tests use igt_draw, which in turn uses intel_batchbuffer and intel_buf.
We also have a couple more display tests using these interfaces
directly. The idea is to select wt/uc for anything display related, but
also allow any test to select a pat_index for a given intel_buf.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_draw.c | 7 +++++-
lib/igt_fb.c | 3 ++-
lib/intel_allocator.c | 1 +
lib/intel_allocator.h | 1 +
lib/intel_batchbuffer.c | 51 ++++++++++++++++++++++++++++++---------
lib/intel_bufops.c | 29 +++++++++++++++-------
lib/intel_bufops.h | 9 +++++--
tests/intel/kms_big_fb.c | 4 ++-
tests/intel/kms_dirtyfb.c | 7 ++++--
tests/intel/kms_psr.c | 4 ++-
tests/intel/xe_intel_bb.c | 3 ++-
11 files changed, 89 insertions(+), 30 deletions(-)
diff --git a/lib/igt_draw.c b/lib/igt_draw.c
index 2332bf94a..8db71ce5e 100644
--- a/lib/igt_draw.c
+++ b/lib/igt_draw.c
@@ -31,6 +31,7 @@
#include "intel_batchbuffer.h"
#include "intel_chipset.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
#include "igt_core.h"
#include "igt_fb.h"
#include "ioctl_wrappers.h"
@@ -75,6 +76,7 @@ struct buf_data {
uint32_t size;
uint32_t stride;
int bpp;
+ uint8_t pat_index;
};
struct rect {
@@ -658,7 +660,8 @@ static struct intel_buf *create_buf(int fd, struct buf_ops *bops,
width, height, from->bpp, 0,
tiling, 0,
size, 0,
- region);
+ region,
+ from->pat_index);
/* Make sure we close handle on destroy path */
intel_buf_set_ownership(buf, true);
@@ -785,6 +788,7 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
igt_skip_on(!rendercopy);
/* We create a temporary buffer and copy from it using rendercopy. */
+ tmp.pat_index = buf->pat_index;
tmp.size = rect->w * rect->h * pixel_size;
if (is_i915_device(fd))
tmp.handle = gem_create(fd, tmp.size);
@@ -852,6 +856,7 @@ void igt_draw_rect(int fd, struct buf_ops *bops, uint32_t ctx,
.size = buf_size,
.stride = buf_stride,
.bpp = bpp,
+ .pat_index = intel_get_pat_idx_wt(fd),
};
struct rect rect = {
.x = rect_x,
diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index d290fd775..61384c553 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -2637,7 +2637,8 @@ igt_fb_create_intel_buf(int fd, struct buf_ops *bops,
igt_fb_mod_to_tiling(fb->modifier),
compression, fb->size,
fb->strides[0],
- region);
+ region,
+ intel_get_pat_idx_wt(fd));
intel_buf_set_name(buf, name);
/* Make sure we close handle on destroy path */
diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index da357b833..b3e5c0226 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -1449,6 +1449,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
uint32_t handle,
uint64_t size, uint64_t offset,
+ uint8_t pat_index,
bool *is_allocatedp)
{
struct alloc_req req = { .request_type = REQ_RESERVE_IF_NOT_ALLOCATED,
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index 5da8af7f9..d93c5828d 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -206,6 +206,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
uint32_t handle,
uint64_t size, uint64_t offset,
+ uint8_t pat_index,
bool *is_allocatedp);
void intel_allocator_print(uint64_t allocator_handle);
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index e7b1b755f..eaaf667ea 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -38,6 +38,7 @@
#include "intel_batchbuffer.h"
#include "intel_bufops.h"
#include "intel_chipset.h"
+#include "intel_pat.h"
#include "media_fill.h"
#include "media_spin.h"
#include "sw_sync.h"
@@ -825,15 +826,18 @@ static void __reallocate_objects(struct intel_bb *ibb)
static inline uint64_t __intel_bb_get_offset(struct intel_bb *ibb,
uint32_t handle,
uint64_t size,
- uint32_t alignment)
+ uint32_t alignment,
+ uint8_t pat_index)
{
uint64_t offset;
if (ibb->enforce_relocs)
return 0;
- offset = intel_allocator_alloc(ibb->allocator_handle,
- handle, size, alignment);
+ offset = __intel_allocator_alloc(ibb->allocator_handle, handle,
+ size, alignment, pat_index,
+ ALLOC_STRATEGY_NONE);
+ igt_assert(offset != ALLOC_INVALID_ADDRESS);
return offset;
}
@@ -1300,11 +1304,14 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct intel_bb *ibb,
ops->op = op;
ops->obj_offset = 0;
ops->addr = objects[i]->offset;
- ops->range = objects[i]->rsvd1;
+ ops->range = objects[i]->rsvd1 & ~(4096-1);
ops->region = region;
+ if (set_obj)
+ ops->pat_index = objects[i]->rsvd1 & (4096-1);
- igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx\n",
- i, ops->obj, (long long)ops->addr, (long long)ops->range);
+ igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx pat_index: %u\n",
+ i, ops->obj, (long long)ops->addr, (long long)ops->range,
+ ops->pat_index);
}
return bind_ops;
@@ -1409,7 +1416,8 @@ void intel_bb_reset(struct intel_bb *ibb, bool purge_objects_cache)
ibb->batch_offset = __intel_bb_get_offset(ibb,
ibb->handle,
ibb->size,
- ibb->alignment);
+ ibb->alignment,
+ DEFAULT_PAT_INDEX);
intel_bb_add_object(ibb, ibb->handle, ibb->size,
ibb->batch_offset,
@@ -1645,7 +1653,8 @@ static void __remove_from_objects(struct intel_bb *ibb,
*/
static struct drm_i915_gem_exec_object2 *
__intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
- uint64_t offset, uint64_t alignment, bool write)
+ uint64_t offset, uint64_t alignment, uint8_t pat_index,
+ bool write)
{
struct drm_i915_gem_exec_object2 *object;
@@ -1661,6 +1670,9 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
object = __add_to_cache(ibb, handle);
__add_to_objects(ibb, object);
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(ibb->fd);
+
/*
* If object->offset == INVALID_ADDRESS we added freshly object to the
* cache. In that case we have two choices:
@@ -1670,7 +1682,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
if (INVALID_ADDR(object->offset)) {
if (INVALID_ADDR(offset)) {
offset = __intel_bb_get_offset(ibb, handle, size,
- alignment);
+ alignment, pat_index);
} else {
offset = offset & (ibb->gtt_size - 1);
@@ -1683,6 +1695,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
reserved = intel_allocator_reserve_if_not_allocated(ibb->allocator_handle,
handle, size, offset,
+ pat_index,
&allocated);
igt_assert_f(allocated || reserved,
"Can't get offset, allocated: %d, reserved: %d\n",
@@ -1721,6 +1734,18 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
if (ibb->driver == INTEL_DRIVER_XE) {
object->alignment = alignment;
object->rsvd1 = size;
+ igt_assert(!(size & (4096-1)));
+
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(ibb->fd);
+
+ /*
+ * XXX: For now encode the pat_index in the first few bits of
+ * rsvd1. intel_batchbuffer should really stop using the i915
+ * drm_i915_gem_exec_object2 to encode VMA placement
+ * information on xe...
+ */
+ object->rsvd1 |= pat_index;
}
return object;
@@ -1733,7 +1758,7 @@ intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
struct drm_i915_gem_exec_object2 *obj = NULL;
obj = __intel_bb_add_object(ibb, handle, size, offset,
- alignment, write);
+ alignment, DEFAULT_PAT_INDEX, write);
igt_assert(obj);
return obj;
@@ -1795,8 +1820,10 @@ __intel_bb_add_intel_buf(struct intel_bb *ibb, struct intel_buf *buf,
}
}
- obj = intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
- buf->addr.offset, alignment, write);
+ obj = __intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
+ buf->addr.offset, alignment, buf->pat_index,
+ write);
+ igt_assert(obj);
buf->addr.offset = obj->offset;
if (igt_list_empty(&buf->link)) {
diff --git a/lib/intel_bufops.c b/lib/intel_bufops.c
index 2c91adb88..fbee4748e 100644
--- a/lib/intel_bufops.c
+++ b/lib/intel_bufops.c
@@ -29,6 +29,7 @@
#include "igt.h"
#include "igt_x86.h"
#include "intel_bufops.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -818,7 +819,7 @@ static void __intel_buf_init(struct buf_ops *bops,
int width, int height, int bpp, int alignment,
uint32_t req_tiling, uint32_t compression,
uint64_t bo_size, int bo_stride,
- uint64_t region)
+ uint64_t region, uint8_t pat_index)
{
uint32_t tiling = req_tiling;
uint64_t size;
@@ -839,6 +840,10 @@ static void __intel_buf_init(struct buf_ops *bops,
IGT_INIT_LIST_HEAD(&buf->link);
buf->mocs = INTEL_BUF_MOCS_DEFAULT;
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(bops->fd);
+ buf->pat_index = pat_index;
+
if (compression) {
igt_require(bops->intel_gen >= 9);
igt_assert(req_tiling == I915_TILING_Y ||
@@ -957,7 +962,7 @@ void intel_buf_init(struct buf_ops *bops,
region = bops->driver == INTEL_DRIVER_I915 ? I915_SYSTEM_MEMORY :
system_memory(bops->fd);
__intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
- tiling, compression, 0, 0, region);
+ tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
intel_buf_set_ownership(buf, true);
}
@@ -974,7 +979,7 @@ void intel_buf_init_in_region(struct buf_ops *bops,
uint64_t region)
{
__intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
- tiling, compression, 0, 0, region);
+ tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
intel_buf_set_ownership(buf, true);
}
@@ -1033,7 +1038,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
uint32_t req_tiling, uint32_t compression)
{
__intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
- req_tiling, compression, 0, 0, -1);
+ req_tiling, compression, 0, 0, -1, DEFAULT_PAT_INDEX);
}
/**
@@ -1050,6 +1055,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
* @size: real bo size
* @stride: bo stride
* @region: region
+ * @pat_index: pat_index to use for the binding (only used on xe)
*
* Function configures BO handle within intel_buf structure passed by the caller
* (with all its metadata - width, height, ...). Useful if BO was created
@@ -1067,10 +1073,12 @@ void intel_buf_init_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region)
+ uint64_t region,
+ uint8_t pat_index)
{
__intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
- req_tiling, compression, size, stride, region);
+ req_tiling, compression, size, stride, region,
+ pat_index);
}
/**
@@ -1149,7 +1157,8 @@ struct intel_buf *intel_buf_create_using_handle_and_size(struct buf_ops *bops,
int stride)
{
return intel_buf_create_full(bops, handle, width, height, bpp, alignment,
- req_tiling, compression, size, stride, -1);
+ req_tiling, compression, size, stride, -1,
+ DEFAULT_PAT_INDEX);
}
struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
@@ -1160,7 +1169,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region)
+ uint64_t region,
+ uint8_t pat_index)
{
struct intel_buf *buf;
@@ -1170,7 +1180,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
igt_assert(buf);
__intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
- req_tiling, compression, size, stride, region);
+ req_tiling, compression, size, stride, region,
+ pat_index);
return buf;
}
diff --git a/lib/intel_bufops.h b/lib/intel_bufops.h
index 4dfe4681c..b6048402b 100644
--- a/lib/intel_bufops.h
+++ b/lib/intel_bufops.h
@@ -63,6 +63,9 @@ struct intel_buf {
/* Content Protection*/
bool is_protected;
+ /* pat_index to use for mapping this buf. Only used in Xe. */
+ uint8_t pat_index;
+
/* For debugging purposes */
char name[INTEL_BUF_NAME_MAXSIZE + 1];
};
@@ -161,7 +164,8 @@ void intel_buf_init_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region);
+ uint64_t region,
+ uint8_t pat_index);
struct intel_buf *intel_buf_create(struct buf_ops *bops,
int width, int height,
@@ -192,7 +196,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region);
+ uint64_t region,
+ uint8_t pat_index);
void intel_buf_destroy(struct intel_buf *buf);
static inline void intel_buf_set_pxp(struct intel_buf *buf, bool new_pxp_state)
diff --git a/tests/intel/kms_big_fb.c b/tests/intel/kms_big_fb.c
index 611e60896..854a77992 100644
--- a/tests/intel/kms_big_fb.c
+++ b/tests/intel/kms_big_fb.c
@@ -34,6 +34,7 @@
#include <string.h>
#include "i915/gem_create.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -88,7 +89,8 @@ static struct intel_buf *init_buf(data_t *data,
handle = gem_open(data->drm_fd, name);
buf = intel_buf_create_full(data->bops, handle, width, height,
bpp, 0, tiling, 0, size, 0,
- region);
+ region,
+ intel_get_pat_idx_wt(data->drm_fd));
intel_buf_set_name(buf, buf_name);
intel_buf_set_ownership(buf, true);
diff --git a/tests/intel/kms_dirtyfb.c b/tests/intel/kms_dirtyfb.c
index cc9529178..ec9b2a137 100644
--- a/tests/intel/kms_dirtyfb.c
+++ b/tests/intel/kms_dirtyfb.c
@@ -10,6 +10,7 @@
#include "i915/intel_drrs.h"
#include "i915/intel_fbc.h"
+#include "intel_pat.h"
#include "xe/xe_query.h"
@@ -246,14 +247,16 @@ static void run_test(data_t *data)
0,
igt_fb_mod_to_tiling(data->fbs[1].modifier),
0, 0, 0, is_xe_device(data->drm_fd) ?
- system_memory(data->drm_fd) : 0);
+ system_memory(data->drm_fd) : 0,
+ intel_get_pat_idx_wt(data->drm_fd));
dst = intel_buf_create_full(data->bops, data->fbs[2].gem_handle,
data->fbs[2].width,
data->fbs[2].height,
igt_drm_format_to_bpp(data->fbs[2].drm_format),
0, igt_fb_mod_to_tiling(data->fbs[2].modifier),
0, 0, 0, is_xe_device(data->drm_fd) ?
- system_memory(data->drm_fd) : 0);
+ system_memory(data->drm_fd) : 0,
+ intel_get_pat_idx_wt(data->drm_fd));
ibb = intel_bb_create(data->drm_fd, PAGE_SIZE);
spin = igt_spin_new(data->drm_fd, .ahnd = ibb->allocator_handle);
diff --git a/tests/intel/kms_psr.c b/tests/intel/kms_psr.c
index ffecc5222..9c6ecd829 100644
--- a/tests/intel/kms_psr.c
+++ b/tests/intel/kms_psr.c
@@ -31,6 +31,7 @@
#include "igt.h"
#include "igt_sysfs.h"
#include "igt_psr.h"
+#include "intel_pat.h"
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
@@ -356,7 +357,8 @@ static struct intel_buf *create_buf_from_fb(data_t *data,
name = gem_flink(data->drm_fd, fb->gem_handle);
handle = gem_open(data->drm_fd, name);
buf = intel_buf_create_full(data->bops, handle, width, height,
- bpp, 0, tiling, 0, size, stride, region);
+ bpp, 0, tiling, 0, size, stride, region,
+ intel_get_pat_idx_wt(data->drm_fd));
intel_buf_set_ownership(buf, true);
return buf;
diff --git a/tests/intel/xe_intel_bb.c b/tests/intel/xe_intel_bb.c
index 0159a3164..e2480acf8 100644
--- a/tests/intel/xe_intel_bb.c
+++ b/tests/intel/xe_intel_bb.c
@@ -19,6 +19,7 @@
#include "igt.h"
#include "igt_crc.h"
#include "intel_bufops.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -400,7 +401,7 @@ static void create_in_region(struct buf_ops *bops, uint64_t region)
intel_buf_init_full(bops, handle, &buf,
width/4, height, 32, 0,
I915_TILING_NONE, 0,
- size, 0, region);
+ size, 0, region, DEFAULT_PAT_INDEX);
intel_buf_set_ownership(&buf, true);
intel_bb_add_intel_buf(ibb, &buf, false);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (8 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests Matthew Auld
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Keep things minimal and select the 1way+ coherent (wb) mode by default on
all platforms. Users who need something else can use intel_buf,
get_offset_pat_index() etc., or call __xe_vm_bind() directly. Display
tests don't use this interface directly.
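As an illustration (sketch only, not part of the diff below; the vm, bo
and address values are placeholders), picking an explicit pat_index
versus falling back to the default looks roughly like:

static void bind_with_explicit_pat_index(int fd, uint32_t vm, uint32_t bo,
					 uint64_t addr, uint64_t size)
{
	/* Explicit pat_index, e.g. write-through for display-like usage. */
	igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, addr, size,
				   XE_VM_BIND_OP_MAP, NULL, 0, 0,
				   intel_get_pat_idx_wt(fd), 0), 0);
	xe_vm_unbind_sync(fd, vm, 0, addr, size);

	/* DEFAULT_PAT_INDEX resolves to the wb (1way+) mode inside __xe_vm_bind(). */
	igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, addr, size,
				   XE_VM_BIND_OP_MAP, NULL, 0, 0,
				   DEFAULT_PAT_INDEX, 0), 0);
	xe_vm_unbind_sync(fd, vm, 0, addr, size);
}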
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/xe/xe_ioctl.c | 8 ++++++--
lib/xe/xe_ioctl.h | 2 +-
tests/intel/xe_vm.c | 4 +++-
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c
index 80696aa59..ebaed1e96 100644
--- a/lib/xe/xe_ioctl.c
+++ b/lib/xe/xe_ioctl.c
@@ -41,6 +41,7 @@
#include "config.h"
#include "drmtest.h"
#include "igt_syncobj.h"
+#include "intel_pat.h"
#include "ioctl_wrappers.h"
#include "xe_ioctl.h"
#include "xe_query.h"
@@ -92,7 +93,7 @@ void xe_vm_bind_array(int fd, uint32_t vm, uint32_t exec_queue,
int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
struct drm_xe_sync *sync, uint32_t num_syncs, uint32_t region,
- uint64_t ext)
+ uint8_t pat_index, uint64_t ext)
{
struct drm_xe_vm_bind bind = {
.extensions = ext,
@@ -107,6 +108,8 @@ int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
.num_syncs = num_syncs,
.syncs = (uintptr_t)sync,
.exec_queue_id = exec_queue,
+ .bind.pat_index = (pat_index == DEFAULT_PAT_INDEX) ?
+ intel_get_pat_idx_wb(fd) : pat_index,
};
if (igt_ioctl(fd, DRM_IOCTL_XE_VM_BIND, &bind))
@@ -121,7 +124,8 @@ void __xe_vm_bind_assert(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint32_t num_syncs, uint32_t region, uint64_t ext)
{
igt_assert_eq(__xe_vm_bind(fd, vm, exec_queue, bo, offset, addr, size,
- op, sync, num_syncs, region, ext), 0);
+ op, sync, num_syncs, region, DEFAULT_PAT_INDEX,
+ ext), 0);
}
void xe_vm_bind(int fd, uint32_t vm, uint32_t bo, uint64_t offset,
diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
index c18fc878c..cafbb011a 100644
--- a/lib/xe/xe_ioctl.h
+++ b/lib/xe/xe_ioctl.h
@@ -20,7 +20,7 @@ uint32_t xe_vm_create(int fd, uint32_t flags, uint64_t ext);
int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
struct drm_xe_sync *sync, uint32_t num_syncs, uint32_t region,
- uint64_t ext);
+ uint8_t pat_index, uint64_t ext);
void __xe_vm_bind_assert(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size,
uint32_t op, struct drm_xe_sync *sync,
diff --git a/tests/intel/xe_vm.c b/tests/intel/xe_vm.c
index 4952ea786..ffb70973b 100644
--- a/tests/intel/xe_vm.c
+++ b/tests/intel/xe_vm.c
@@ -10,6 +10,7 @@
*/
#include "igt.h"
+#include "intel_pat.h"
#include "lib/igt_syncobj.h"
#include "lib/intel_reg.h"
#include "xe_drm.h"
@@ -316,7 +317,8 @@ static void userptr_invalid(int fd)
vm = xe_vm_create(fd, 0, 0);
munmap(data, size);
ret = __xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
- size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0, 0);
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ DEFAULT_PAT_INDEX, 0);
igt_assert(ret == -EFAULT);
xe_vm_destroy(fd, vm);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (9 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests Matthew Auld
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: Nitish Kumar, intel-xe
Add some basic tests for pat_index and vm_bind.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Nitish Kumar <nitish.kumar@intel.com>
---
tests/intel/xe_pat.c | 483 +++++++++++++++++++++++++++++++++++++++++++
tests/meson.build | 1 +
2 files changed, 484 insertions(+)
create mode 100644 tests/intel/xe_pat.c
diff --git a/tests/intel/xe_pat.c b/tests/intel/xe_pat.c
new file mode 100644
index 000000000..9c5261b4a
--- /dev/null
+++ b/tests/intel/xe_pat.c
@@ -0,0 +1,483 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+/**
+ * TEST: Test for selecting per-VMA pat_index
+ * Category: Software building block
+ * Sub-category: VMA
+ * Functionality: pat_index
+ */
+
+#include "igt.h"
+#include "intel_blt.h"
+#include "intel_mocs.h"
+#include "intel_pat.h"
+
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "xe/xe_util.h"
+
+#define PAGE_SIZE 4096
+
+static bool do_slow_check;
+
+/**
+ * SUBTEST: userptr-coh-none
+ * Test category: functionality test
+ * Description: Test non-coherent pat_index on userptr
+ */
+static void userptr_coh_none(int fd)
+{
+ size_t size = xe_get_default_alignment(fd);
+ uint32_t vm;
+ void *data;
+
+ data = mmap(0, size, PROT_READ |
+ PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+ igt_assert(data != MAP_FAILED);
+
+ vm = xe_vm_create(fd, 0, 0);
+
+ /*
+ * Try some valid combinations first just to make sure we're not being
+ * swindled.
+ */
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ DEFAULT_PAT_INDEX, 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ intel_get_pat_idx_wb(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ /* And then some known COH_NONE pat_index combos which should fail. */
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ intel_get_pat_idx_uc(fd), 0),
+ -EINVAL);
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ intel_get_pat_idx_wt(fd), 0),
+ -EINVAL);
+
+ munmap(data, size);
+ xe_vm_destroy(fd, vm);
+}
+
+/**
+ * SUBTEST: pat-index-all
+ * Test category: functionality test
+ * Description: Test every pat_index
+ */
+static void pat_index_all(int fd)
+{
+ size_t size = xe_get_default_alignment(fd);
+ uint32_t vm, bo;
+ uint8_t pat_index;
+
+ vm = xe_vm_create(fd, 0, 0);
+
+ bo = xe_bo_create_caching(fd, 0, size, all_memory_regions(fd),
+ DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_NONE);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_uc(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_wt(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_wb(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ igt_assert(intel_get_max_pat_index(fd));
+
+ for (pat_index = 0; pat_index <= intel_get_max_pat_index(fd);
+ pat_index++) {
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ pat_index, 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+ }
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ pat_index, 0),
+ -EINVAL);
+
+ gem_close(fd, bo);
+
+ /* Must be at least as coherent as the gem_create coh_mode. */
+ bo = xe_bo_create_caching(fd, 0, size, system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WB,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_uc(fd), 0),
+ -EINVAL);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_wt(fd), 0),
+ -EINVAL);
+
+ gem_close(fd, bo);
+
+ xe_vm_destroy(fd, vm);
+}
+
+/**
+ * SUBTEST: pat-index-common-blt
+ * Test category: functionality test
+ * Description: Check the common pat_index modes with blitter copy.
+ */
+
+static void pat_index_blt(int fd,
+ uint32_t r1, uint8_t r1_pat_index, uint16_t r1_coh_mode,
+ uint32_t r2, uint8_t r2_pat_index, uint16_t r2_coh_mode)
+{
+ struct drm_xe_engine_class_instance inst = {
+ .engine_class = DRM_XE_ENGINE_CLASS_COPY,
+ };
+ struct blt_copy_data blt = {};
+ struct blt_copy_object src = {};
+ struct blt_copy_object dst = {};
+ uint32_t vm, exec_queue, src_bo, dst_bo, bb;
+ uint32_t *src_map, *dst_map;
+ uint16_t r1_cpu_caching, r2_cpu_caching;
+ intel_ctx_t *ctx;
+ uint64_t ahnd;
+ int width = 512, height = 512;
+ int size, stride, bb_size;
+ int bpp = 32;
+ int i;
+
+ vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
+ exec_queue = xe_exec_queue_create(fd, vm, &inst, 0);
+ ctx = intel_ctx_xe(fd, vm, exec_queue, 0, 0, 0);
+ ahnd = intel_allocator_open_full(fd, ctx->vm, 0, 0,
+ INTEL_ALLOCATOR_SIMPLE,
+ ALLOC_STRATEGY_LOW_TO_HIGH, 0);
+
+ bb_size = xe_get_default_alignment(fd);
+ bb = xe_bo_create_flags(fd, 0, bb_size, r1);
+
+ size = width * height * bpp / 8;
+ stride = width * 4;
+
+ if (r1_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY
+ && r1 == system_memory(fd))
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ if (r2_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY &&
+ r2 == system_memory(fd))
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ src_bo = xe_bo_create_caching(fd, 0, size, r1, r1_cpu_caching,
+ r1_coh_mode);
+ dst_bo = xe_bo_create_caching(fd, 0, size, r2, r2_cpu_caching,
+ r2_coh_mode);
+
+ blt_copy_init(fd, &blt);
+ blt.color_depth = CD_32bit;
+
+ blt_set_object(&src, src_bo, size, r1, intel_get_uc_mocs(fd),
+ r1_pat_index, T_LINEAR,
+ COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
+ blt_set_geom(&src, stride, 0, 0, width, height, 0, 0);
+
+ blt_set_object(&dst, dst_bo, size, r2, intel_get_uc_mocs(fd),
+ r2_pat_index, T_LINEAR,
+ COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
+ blt_set_geom(&dst, stride, 0, 0, width, height, 0, 0);
+
+ blt_set_copy_object(&blt.src, &src);
+ blt_set_copy_object(&blt.dst, &dst);
+ blt_set_batch(&blt.bb, bb, bb_size, r1);
+
+ src_map = xe_bo_map(fd, src_bo, size);
+ dst_map = xe_bo_map(fd, dst_bo, size);
+
+ /* Ensure we always see zeroes for the initial KMD zeroing */
+ blt_fast_copy(fd, ctx, NULL, ahnd, &blt);
+
+ /*
+ * Only sample random dword in every page if we are doing slow uncached
+ * reads from VRAM.
+ */
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], 0);
+
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], 0);
+ }
+
+ /* Write some values from the CPU, potentially dirtying the CPU cache */
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ src_map[i] = i;
+
+ /* And finally ensure we always see the CPU written values */
+ blt_fast_copy(fd, ctx, NULL, ahnd, &blt);
+
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], i);
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], i);
+ }
+
+ munmap(src_map, size);
+ munmap(dst_map, size);
+
+ gem_close(fd, src_bo);
+ gem_close(fd, dst_bo);
+ gem_close(fd, bb);
+
+ xe_exec_queue_destroy(fd, exec_queue);
+ xe_vm_destroy(fd, vm);
+
+ put_ahnd(ahnd);
+ intel_ctx_destroy(fd, ctx);
+}
+
+/**
+ * SUBTEST: pat-index-common-render
+ * Test category: functionality test
+ * Description: Check the common pat_index modes with render.
+ */
+
+static void pat_index_render(int fd,
+ uint32_t r1, uint8_t r1_pat_index, uint16_t r1_coh_mode,
+ uint32_t r2, uint8_t r2_pat_index, uint16_t r2_coh_mode)
+{
+ uint32_t devid = intel_get_drm_devid(fd);
+ igt_render_copyfunc_t render_copy = NULL;
+ int size, stride, width = 512, height = 512;
+ struct intel_buf src, dst;
+ struct intel_bb *ibb;
+ struct buf_ops *bops;
+ uint16_t r1_cpu_caching, r2_cpu_caching;
+ uint32_t src_bo, dst_bo;
+ uint32_t *src_map, *dst_map;
+ int bpp = 32;
+ int i;
+
+ bops = buf_ops_create(fd);
+
+ render_copy = igt_get_render_copyfunc(devid);
+ igt_assert(render_copy);
+
+ ibb = intel_bb_create(fd, xe_get_default_alignment(fd));
+
+ if (r1_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY
+ && r1 == system_memory(fd))
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ if (r2_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY &&
+ r2 == system_memory(fd))
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ size = width * height * bpp / 8;
+ stride = width * 4;
+
+ src_bo = xe_bo_create_caching(fd, 0, size, r1, r1_cpu_caching,
+ r1_coh_mode);
+ intel_buf_init_full(bops, src_bo, &src, width, height, bpp, 0,
+ I915_TILING_NONE, I915_COMPRESSION_NONE, size,
+ stride, r1, r1_pat_index);
+
+ dst_bo = xe_bo_create_caching(fd, 0, size, r2, r2_cpu_caching,
+ r2_coh_mode);
+ intel_buf_init_full(bops, dst_bo, &dst, width, height, bpp, 0,
+ I915_TILING_NONE, I915_COMPRESSION_NONE, size,
+ stride, r2, r2_pat_index);
+
+ src_map = xe_bo_map(fd, src_bo, size);
+ dst_map = xe_bo_map(fd, dst_bo, size);
+
+ /* Ensure we always see zeroes for the initial KMD zeroing */
+ render_copy(ibb,
+ &src,
+ 0, 0, width, height,
+ &dst,
+ 0, 0);
+ intel_bb_sync(ibb);
+
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], 0);
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], 0);
+ }
+
+ /* Write some values from the CPU, potentially dirtying the CPU cache */
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ src_map[i] = i;
+
+ /* And finally ensure we always see the CPU written values */
+ render_copy(ibb,
+ &src,
+ 0, 0, width, height,
+ &dst,
+ 0, 0);
+ intel_bb_sync(ibb);
+
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], i);
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], i);
+ }
+
+ munmap(src_map, size);
+ munmap(dst_map, size);
+
+ intel_bb_destroy(ibb);
+
+ gem_close(fd, src_bo);
+ gem_close(fd, dst_bo);
+}
+
+const struct pat_index_entry {
+ uint8_t (*get_pat_index)(int fd);
+ const char *name;
+ uint16_t coh_mode;
+} common_pat_index_modes[] = {
+ { intel_get_pat_idx_uc, "uc", DRM_XE_GEM_COH_NONE },
+ { intel_get_pat_idx_wt, "wt", DRM_XE_GEM_COH_NONE },
+ { intel_get_pat_idx_wb, "wb", DRM_XE_GEM_COH_AT_LEAST_1WAY },
+};
+
+typedef void (*pat_index_fn)(int fd,
+ uint32_t r1, uint8_t r1_pat_index, uint16_t r1_coh_mode,
+ uint32_t r2, uint8_t r2_pat_index, uint16_t r2_coh_mode);
+
+static void subtest_pat_index_common_with_regions(int fd, pat_index_fn fn)
+{
+ struct igt_collection *common_pat_index_set;
+ struct igt_collection *regions_set;
+ struct igt_collection *regions;
+
+ common_pat_index_set =
+ igt_collection_create(ARRAY_SIZE(common_pat_index_modes));
+
+ regions_set = xe_get_memory_region_set(fd,
+ XE_MEM_REGION_CLASS_SYSMEM,
+ XE_MEM_REGION_CLASS_VRAM);
+
+ for_each_variation_r(regions, 2, regions_set) {
+ struct igt_collection *modes;
+ uint32_t r1, r2;
+ char *reg_str;
+
+ r1 = igt_collection_get_value(regions, 0);
+ r2 = igt_collection_get_value(regions, 1);
+
+ reg_str = xe_memregion_dynamic_subtest_name(fd, regions);
+
+ for_each_variation_r(modes, 2, common_pat_index_set) {
+ struct pat_index_entry r1_entry, r2_entry;
+ uint8_t r1_pat_index, r2_pat_index;
+ int r1_idx, r2_idx;
+
+ r1_idx = igt_collection_get_value(modes, 0);
+ r2_idx = igt_collection_get_value(modes, 1);
+
+ r1_entry = common_pat_index_modes[r1_idx];
+ r2_entry = common_pat_index_modes[r2_idx];
+
+ r1_pat_index = r1_entry.get_pat_index(fd);
+ r2_pat_index = r2_entry.get_pat_index(fd);
+
+ igt_dynamic_f("%s-%s-%s", reg_str, r1_entry.name, r2_entry.name)
+ fn(fd,
+ r1, r1_pat_index, r1_entry.coh_mode,
+ r2, r2_pat_index, r2_entry.coh_mode);
+ }
+
+ free(reg_str);
+ }
+}
+
+igt_main
+{
+ int fd;
+ uint32_t seed;
+
+ igt_fixture {
+ fd = drm_open_driver(DRIVER_XE);
+
+ seed = time(NULL);
+ igt_debug("seed: %d\n", seed);
+
+ xe_device_get(fd);
+ }
+
+ igt_subtest("pat-index-all")
+ pat_index_all(fd);
+
+ igt_subtest("userptr-coh-none")
+ userptr_coh_none(fd);
+
+ igt_subtest_with_dynamic("pat-index-common-blt") {
+ igt_require(blt_has_fast_copy(fd));
+ subtest_pat_index_common_with_regions(fd, pat_index_blt);
+ }
+
+ igt_subtest_with_dynamic("pat-index-common-render") {
+ igt_require(xe_has_engine_class(fd, DRM_XE_ENGINE_CLASS_RENDER));
+ subtest_pat_index_common_with_regions(fd, pat_index_render);
+ }
+
+ igt_fixture
+ drm_close_driver(fd);
+}
diff --git a/tests/meson.build b/tests/meson.build
index 2404b2d4a..61351be04 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -296,6 +296,7 @@ intel_xe_progs = [
'xe_mmio',
'xe_module_load',
'xe_noexec_ping_pong',
+ 'xe_pat',
'xe_pm',
'xe_pm_residency',
'xe_prime_self_import',
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (10 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Add the various pat_index, coh_mode and cpu_caching related tests to
BAT.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
tests/intel-ci/xe-fast-feedback.testlist | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tests/intel-ci/xe-fast-feedback.testlist b/tests/intel-ci/xe-fast-feedback.testlist
index 610cc958c..c41be52a6 100644
--- a/tests/intel-ci/xe-fast-feedback.testlist
+++ b/tests/intel-ci/xe-fast-feedback.testlist
@@ -138,6 +138,7 @@ igt@xe_intel_bb@simple-bb-ctx
igt@xe_mmap@bad-extensions
igt@xe_mmap@bad-flags
igt@xe_mmap@bad-object
+igt@xe_mmap@cpu-caching-coh
igt@xe_mmap@system
igt@xe_mmap@vram
igt@xe_mmap@vram-system
@@ -180,6 +181,10 @@ igt@xe_vm@munmap-style-unbind-userptr-end
igt@xe_vm@munmap-style-unbind-userptr-front
igt@xe_vm@munmap-style-unbind-userptr-inval-end
igt@xe_vm@munmap-style-unbind-userptr-inval-front
+igt@xe_pat@pat-index-all
+igt@xe_pat@pat-index-common-blt
+igt@xe_pat@pat-index-common-render
+igt@xe_pat@userptr-coh-none
igt@xe_waitfence@abstime
igt@xe_waitfence@reltime
igt@kms_addfb_basic@addfb25-4-tiled
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
@ 2023-10-06 11:38 ` Zbigniew Kempczyński
0 siblings, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-06 11:38 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Thu, Oct 05, 2023 at 04:31:11PM +0100, Matthew Auld wrote:
> For some cases we are going to need to pass the pat_index for the
> vm_bind op. Add a helper for this, such that we can allocate an address
> and give the mapping some pat_index.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/intel_allocator.c | 43 +++++++++++++++++++++++--------
> lib/intel_allocator.h | 5 +++-
> lib/xe/xe_util.c | 1 +
> lib/xe/xe_util.h | 1 +
> tests/intel/api_intel_allocator.c | 4 ++-
> 5 files changed, 41 insertions(+), 13 deletions(-)
>
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index f0a9b7fb5..da357b833 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -16,6 +16,7 @@
> #include "igt_map.h"
> #include "intel_allocator.h"
> #include "intel_allocator_msgchannel.h"
> +#include "intel_pat.h"
> #include "xe/xe_query.h"
> #include "xe/xe_util.h"
>
> @@ -92,6 +93,7 @@ struct allocator_object {
> uint32_t handle;
> uint64_t offset;
> uint64_t size;
> + uint8_t pat_index;
>
> enum allocator_bind_op bind_op;
> };
> @@ -1122,14 +1124,14 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
>
> static bool is_same(struct allocator_object *obj,
> uint32_t handle, uint64_t offset, uint64_t size,
> - enum allocator_bind_op bind_op)
> + uint8_t pat_index, enum allocator_bind_op bind_op)
> {
> return obj->handle == handle && obj->offset == offset && obj->size == size &&
> - (obj->bind_op == bind_op || obj->bind_op == BOUND);
> + obj->pat_index == pat_index && (obj->bind_op == bind_op || obj->bind_op == BOUND);
> }
>
> static void track_object(uint64_t allocator_handle, uint32_t handle,
> - uint64_t offset, uint64_t size,
> + uint64_t offset, uint64_t size, uint8_t pat_index,
> enum allocator_bind_op bind_op)
> {
> struct ahnd_info *ainfo;
Code looks good to me; the only minor nitpick is to also add the pat
index to bind_debug() here. Be aware that the pat_index doesn't go down
into the allocator itself, only into the cache which tracks the
alloc()/free() data returned from the allocator that is needed to
bind/unbind. But I don't think that will be a problem.
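For reference, a possible caller of the new helper might look roughly
like this (my own sketch, not taken from the series; the uncached
pat_index is just an arbitrary choice):

static uint64_t alloc_offset_uc(int fd, uint64_t ahnd, uint32_t handle,
				uint64_t size)
{
	/* Reserve an address and record the uc pat_index for the later bind. */
	return get_offset_pat_index(ahnd, handle, size, 0,
				    intel_get_pat_idx_uc(fd));
}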
With above added:
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> @@ -1156,6 +1158,9 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
> if (ainfo->driver == INTEL_DRIVER_I915)
> return; /* no-op for i915, at least for now */
>
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(ainfo->fd);
> +
> pthread_mutex_lock(&ainfo->bind_map_mutex);
> obj = igt_map_search(ainfo->bind_map, &handle);
> if (obj) {
> @@ -1165,7 +1170,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
> * bind_map.
> */
> if (bind_op == TO_BIND) {
> - igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
> + igt_assert_eq(is_same(obj, handle, offset, size, pat_index, bind_op), true);
> } else if (bind_op == TO_UNBIND) {
> if (obj->bind_op == TO_BIND)
> igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
> @@ -1181,6 +1186,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
> obj->handle = handle;
> obj->offset = offset;
> obj->size = size;
> + obj->pat_index = pat_index;
> obj->bind_op = bind_op;
> igt_map_insert(ainfo->bind_map, &obj->handle, obj);
> }
> @@ -1204,7 +1210,7 @@ out:
> */
> uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> uint64_t size, uint64_t alignment,
> - enum allocator_strategy strategy)
> + uint8_t pat_index, enum allocator_strategy strategy)
> {
> struct alloc_req req = { .request_type = REQ_ALLOC,
> .allocator_handle = allocator_handle,
> @@ -1219,7 +1225,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> igt_assert(handle_request(&req, &resp) == 0);
> igt_assert(resp.response_type == RESP_ALLOC);
>
> - track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
> + track_object(allocator_handle, handle, resp.alloc.offset, size, pat_index,
> + TO_BIND);
>
> return resp.alloc.offset;
> }
> @@ -1241,7 +1248,7 @@ uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> uint64_t offset;
>
> offset = __intel_allocator_alloc(allocator_handle, handle,
> - size, alignment,
> + size, alignment, DEFAULT_PAT_INDEX,
> ALLOC_STRATEGY_NONE);
> igt_assert(offset != ALLOC_INVALID_ADDRESS);
>
> @@ -1268,7 +1275,8 @@ uint64_t intel_allocator_alloc_with_strategy(uint64_t allocator_handle,
> uint64_t offset;
>
> offset = __intel_allocator_alloc(allocator_handle, handle,
> - size, alignment, strategy);
> + size, alignment, DEFAULT_PAT_INDEX,
> + strategy);
> igt_assert(offset != ALLOC_INVALID_ADDRESS);
>
> return offset;
> @@ -1298,7 +1306,7 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
> igt_assert(handle_request(&req, &resp) == 0);
> igt_assert(resp.response_type == RESP_FREE);
>
> - track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
> + track_object(allocator_handle, handle, 0, 0, 0, TO_UNBIND);
>
> return resp.free.freed;
> }
> @@ -1500,16 +1508,17 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
> if (obj->bind_op == BOUND)
> continue;
>
> - bind_info("= [vm: %u] %s => %u %lx %lx\n",
> + bind_info("= [vm: %u] %s => %u %lx %lx %u\n",
> ainfo->vm,
> obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
> obj->handle, obj->offset,
> - obj->size);
> + obj->size, obj->pat_index);
>
> entry = malloc(sizeof(*entry));
> entry->handle = obj->handle;
> entry->offset = obj->offset;
> entry->size = obj->size;
> + entry->pat_index = obj->pat_index;
> entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
> XE_OBJECT_UNBIND;
> igt_list_add(&entry->link, &obj_list);
> @@ -1534,6 +1543,18 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
> }
> }
>
> +uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
> + uint64_t alignment, uint8_t pat_index)
> +{
> + uint64_t offset;
> +
> + offset = __intel_allocator_alloc(ahnd, handle, size, alignment,
> + pat_index, ALLOC_STRATEGY_NONE);
> + igt_assert(offset != ALLOC_INVALID_ADDRESS);
> +
> + return offset;
> +}
> +
> /**
> * intel_allocator_bind:
> * @allocator_handle: handle to an allocator
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index f9ff7f1cc..5da8af7f9 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -186,7 +186,7 @@ bool intel_allocator_close(uint64_t allocator_handle);
> void intel_allocator_get_address_range(uint64_t allocator_handle,
> uint64_t *startp, uint64_t *endp);
> uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> - uint64_t size, uint64_t alignment,
> + uint64_t size, uint64_t alignment, uint8_t pat_index,
> enum allocator_strategy strategy);
> uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> uint64_t size, uint64_t alignment);
> @@ -266,6 +266,9 @@ static inline bool put_ahnd(uint64_t ahnd)
> return !ahnd || intel_allocator_close(ahnd);
> }
>
> +uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
> + uint64_t alignment, uint8_t pat_index);
> +
> static inline uint64_t get_offset(uint64_t ahnd, uint32_t handle,
> uint64_t size, uint64_t alignment)
> {
> diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
> index 2f9ffe2f1..8583326a9 100644
> --- a/lib/xe/xe_util.c
> +++ b/lib/xe/xe_util.c
> @@ -145,6 +145,7 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct igt_list_head *obj_lis
> ops->addr = obj->offset;
> ops->range = obj->size;
> ops->region = 0;
> + ops->pat_index = obj->pat_index;
>
> bind_info(" [%d]: [%6s] handle: %u, offset: %llx, size: %llx\n",
> i, obj->bind_op == XE_OBJECT_BIND ? "BIND" : "UNBIND",
> diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
> index e97d236b8..e3bdf3d11 100644
> --- a/lib/xe/xe_util.h
> +++ b/lib/xe/xe_util.h
> @@ -36,6 +36,7 @@ struct xe_object {
> uint32_t handle;
> uint64_t offset;
> uint64_t size;
> + uint8_t pat_index;
> enum xe_bind_op bind_op;
> struct igt_list_head link;
> };
> diff --git a/tests/intel/api_intel_allocator.c b/tests/intel/api_intel_allocator.c
> index f3fcf8a34..d19be3ce9 100644
> --- a/tests/intel/api_intel_allocator.c
> +++ b/tests/intel/api_intel_allocator.c
> @@ -9,6 +9,7 @@
> #include "igt.h"
> #include "igt_aux.h"
> #include "intel_allocator.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -131,7 +132,8 @@ static void alloc_simple(int fd)
>
> intel_allocator_get_address_range(ahnd, &start, &end);
> offset0 = intel_allocator_alloc(ahnd, 1, end - start, 0);
> - offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, ALLOC_STRATEGY_NONE);
> + offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, DEFAULT_PAT_INDEX,
> + ALLOC_STRATEGY_NONE);
> igt_assert(offset1 == ALLOC_INVALID_ADDRESS);
> intel_allocator_free(ahnd, 1);
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [igt-dev] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
@ 2023-10-06 11:51 ` Zbigniew Kempczyński
2023-10-06 12:08 ` Matthew Auld
0 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-06 11:51 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Thu, Oct 05, 2023 at 04:31:12PM +0100, Matthew Auld wrote:
> For the most part we can just use the default wb, however some users
> including display might want to use something else.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_fb.c | 2 ++
> lib/intel_blt.c | 54 +++++++++++++++++++++------------
> lib/intel_blt.h | 7 +++--
> tests/intel/gem_ccs.c | 16 +++++-----
> tests/intel/gem_lmem_swapping.c | 4 +--
> tests/intel/xe_ccs.c | 19 +++++++-----
> 6 files changed, 64 insertions(+), 38 deletions(-)
>
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> index f8a0db22c..d290fd775 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -37,6 +37,7 @@
> #include "i915/gem_mman.h"
> #include "intel_blt.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> #include "igt_aux.h"
> #include "igt_color_encoding.h"
> #include "igt_fb.h"
> @@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
>
> blt_set_object(blt, handle, fb->size, memregion,
> intel_get_uc_mocs(fb->fd),
> + intel_get_pat_idx_wt(fb->fd),
> blt_tile,
> is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
> is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> index b55fa9b52..b7ac2902b 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -13,6 +13,7 @@
> #include "igt.h"
> #include "igt_syncobj.h"
> #include "intel_blt.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
> #include "xe/xe_util.h"
> @@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
> igt_assert_f(blt, "block-copy requires data to do blit\n");
>
> alignment = get_default_alignment(fd, blt->driver);
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> - + blt->src.plane_offset;
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> - + blt->dst.plane_offset;
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index) +
> + blt->src.plane_offset;
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> + alignment, blt->dst.pat_index) +
> + blt->dst.plane_offset;
The tab count looks off in the formatting of the src and dst plane_offset continuation lines.
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> fill_data(&data, blt, src_offset, dst_offset, ext);
> @@ -884,8 +887,10 @@ int blt_block_copy(int fd,
> igt_assert_neq(blt->driver, 0);
>
> alignment = get_default_alignment(fd, blt->driver);
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> + alignment, blt->dst.pat_index);
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
> @@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> data.dw00.length = 0x3;
>
> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> + alignment, surf->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> + alignment, surf->dst.pat_index);
> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>
> data.dw01.src_address_lo = src_offset;
> @@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
> igt_assert_neq(surf->driver, 0);
>
> alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> + alignment, surf->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> + alignment, surf->dst.pat_index);
> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>
> emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
> @@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
> data.dw03.dst_x2 = blt->dst.x2;
> data.dw03.dst_y2 = blt->dst.y2;
>
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> - + blt->src.plane_offset;
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> - + blt->dst.plane_offset;
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index) +
> + blt->src.plane_offset;
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
> + blt->dst.pat_index) +
> + blt->dst.plane_offset;
Ditto.
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> data.dw04.dst_address_lo = dst_offset;
> @@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
> igt_assert_neq(blt->driver, 0);
>
> alignment = get_default_alignment(fd, blt->driver);
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> + alignment, blt->dst.pat_index);
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> emit_blt_fast_copy(fd, ahnd, blt, 0, true);
> @@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> &size, region) == 0);
> }
I think blt_create_object() should also have the pat_index passed as an
argument.
Rest looks ok.
--
Zbigniew
>
> - blt_set_object(obj, handle, size, region, mocs, tiling,
> + blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
> compression, compression_type);
> blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>
> @@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
>
> void blt_set_object(struct blt_copy_object *obj,
> uint32_t handle, uint64_t size, uint32_t region,
> - uint8_t mocs, enum blt_tiling_type tiling,
> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> enum blt_compression compression,
> enum blt_compression_type compression_type)
> {
> @@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
> obj->size = size;
> obj->region = region;
> obj->mocs = mocs;
> + obj->pat_index = pat_index;
> obj->tiling = tiling;
> obj->compression = compression;
> obj->compression_type = compression_type;
> @@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
>
> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> uint32_t handle, uint32_t region, uint64_t size,
> - uint8_t mocs, enum blt_access_type access_type)
> + uint8_t mocs, uint8_t pat_index,
> + enum blt_access_type access_type)
> {
> obj->handle = handle;
> obj->region = region;
> obj->size = size;
> obj->mocs = mocs;
> + obj->pat_index = pat_index;
> obj->access_type = access_type;
> }
>
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> index d9c8883c7..f8423a986 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -79,6 +79,7 @@ struct blt_copy_object {
> uint32_t region;
> uint64_t size;
> uint8_t mocs;
> + uint8_t pat_index;
> enum blt_tiling_type tiling;
> enum blt_compression compression; /* BC only */
> enum blt_compression_type compression_type; /* BC only */
> @@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
> uint32_t region;
> uint64_t size;
> uint8_t mocs;
> + uint8_t pat_index;
> enum blt_access_type access_type;
> };
>
> @@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> void blt_destroy_object(int fd, struct blt_copy_object *obj);
> void blt_set_object(struct blt_copy_object *obj,
> uint32_t handle, uint64_t size, uint32_t region,
> - uint8_t mocs, enum blt_tiling_type tiling,
> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> enum blt_compression compression,
> enum blt_compression_type compression_type);
> void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
> @@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
> const struct blt_copy_object *orig);
> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> uint32_t handle, uint32_t region, uint64_t size,
> - uint8_t mocs, enum blt_access_type access_type);
> + uint8_t mocs, uint8_t pat_index,
> + enum blt_access_type access_type);
>
> void blt_surface_info(const char *info,
> const struct blt_copy_object *obj);
> diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
> index f5d4ab359..a98557b72 100644
> --- a/tests/intel/gem_ccs.c
> +++ b/tests/intel/gem_ccs.c
> @@ -15,6 +15,7 @@
> #include "lib/intel_chipset.h"
> #include "intel_blt.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> /**
> * TEST: gem ccs
> * Description: Exercise gen12 blitter with and without flatccs compression
> @@ -111,9 +112,9 @@ static void surf_copy(int i915,
> blt_ctrl_surf_copy_init(i915, &surf);
> surf.print_bb = param.print_bb;
> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> - uc_mocs, BLT_INDIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
> - uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> bb_size = 4096;
> igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
> blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
> @@ -133,7 +134,7 @@ static void surf_copy(int i915,
> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>
> blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
> - 0, DIRECT_ACCESS);
> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> gem_sync(i915, surf.dst.handle);
>
> @@ -155,9 +156,9 @@ static void surf_copy(int i915,
> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> ccsmap[i] = i;
> blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
> - uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> - uc_mocs, INDIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>
> blt_copy_init(i915, &blt);
> @@ -399,7 +400,8 @@ static void block_copy(int i915,
> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> if (config->inplace) {
> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> + comp_type);
> blt.dst.ptr = mid->ptr;
> }
>
> @@ -475,7 +477,7 @@ static void block_multicopy(int i915,
>
> if (config->inplace) {
> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
> comp_type);
> blt3.dst.ptr = mid->ptr;
> }
> diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
> index ede545c92..7f2ab8bb6 100644
> --- a/tests/intel/gem_lmem_swapping.c
> +++ b/tests/intel/gem_lmem_swapping.c
> @@ -486,7 +486,7 @@ static void __do_evict(int i915,
> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
> blt_set_object(tmp, tmp->handle, params->size.max,
> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
> - intel_get_uc_mocs(i915), T_LINEAR,
> + intel_get_uc_mocs(i915), 0, T_LINEAR,
> COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
> blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
> }
> @@ -516,7 +516,7 @@ static void __do_evict(int i915,
> obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
> igt_assert(obj->blt_obj);
> blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
> - intel_get_uc_mocs(i915), T_LINEAR,
> + intel_get_uc_mocs(i915), 0, T_LINEAR,
> COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
> blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
> init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
> diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
> index 20bbc4448..27859d5ce 100644
> --- a/tests/intel/xe_ccs.c
> +++ b/tests/intel/xe_ccs.c
> @@ -13,6 +13,7 @@
> #include "igt_syncobj.h"
> #include "intel_blt.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
> #include "xe/xe_util.h"
> @@ -108,8 +109,9 @@ static void surf_copy(int xe,
> blt_ctrl_surf_copy_init(xe, &surf);
> surf.print_bb = param.print_bb;
> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> - uc_mocs, BLT_INDIRECT_ACCESS);
> - blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> + blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
> + DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> bb_size = xe_get_default_alignment(xe);
> bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> @@ -130,7 +132,7 @@ static void surf_copy(int xe,
> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>
> blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
> - 0, DIRECT_ACCESS);
> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> intel_ctx_xe_sync(ctx, true);
>
> @@ -153,9 +155,9 @@ static void surf_copy(int xe,
> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> ccsmap[i] = i;
> blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
> - uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> - uc_mocs, INDIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> intel_ctx_xe_sync(ctx, true);
>
> @@ -369,7 +371,8 @@ static void block_copy(int xe,
> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> if (config->inplace) {
> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> + comp_type);
> blt.dst.ptr = mid->ptr;
> }
>
> @@ -450,8 +453,8 @@ static void block_multicopy(int xe,
>
> if (config->inplace) {
> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> - comp_type);
> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
> + COMPRESSION_DISABLED, comp_type);
> blt3.dst.ptr = mid->ptr;
> }
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [igt-dev] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-06 11:51 ` [Intel-xe] [igt-dev] " Zbigniew Kempczyński
@ 2023-10-06 12:08 ` Matthew Auld
2023-10-09 9:21 ` Zbigniew Kempczyński
0 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-06 12:08 UTC (permalink / raw)
To: Zbigniew Kempczyński; +Cc: igt-dev, intel-xe
On 06/10/2023 12:51, Zbigniew Kempczyński wrote:
> On Thu, Oct 05, 2023 at 04:31:12PM +0100, Matthew Auld wrote:
>> For the most part we can just use the default wb, however some users
>> including display might want to use something else.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: José Roberto de Souza <jose.souza@intel.com>
>> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
>> ---
>> lib/igt_fb.c | 2 ++
>> lib/intel_blt.c | 54 +++++++++++++++++++++------------
>> lib/intel_blt.h | 7 +++--
>> tests/intel/gem_ccs.c | 16 +++++-----
>> tests/intel/gem_lmem_swapping.c | 4 +--
>> tests/intel/xe_ccs.c | 19 +++++++-----
>> 6 files changed, 64 insertions(+), 38 deletions(-)
>>
>> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
>> index f8a0db22c..d290fd775 100644
>> --- a/lib/igt_fb.c
>> +++ b/lib/igt_fb.c
>> @@ -37,6 +37,7 @@
>> #include "i915/gem_mman.h"
>> #include "intel_blt.h"
>> #include "intel_mocs.h"
>> +#include "intel_pat.h"
>> #include "igt_aux.h"
>> #include "igt_color_encoding.h"
>> #include "igt_fb.h"
>> @@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
>>
>> blt_set_object(blt, handle, fb->size, memregion,
>> intel_get_uc_mocs(fb->fd),
>> + intel_get_pat_idx_wt(fb->fd),
>> blt_tile,
>> is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
>> is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
>> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
>> index b55fa9b52..b7ac2902b 100644
>> --- a/lib/intel_blt.c
>> +++ b/lib/intel_blt.c
>> @@ -13,6 +13,7 @@
>> #include "igt.h"
>> #include "igt_syncobj.h"
>> #include "intel_blt.h"
>> +#include "intel_pat.h"
>> #include "xe/xe_ioctl.h"
>> #include "xe/xe_query.h"
>> #include "xe/xe_util.h"
>> @@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
>> igt_assert_f(blt, "block-copy requires data to do blit\n");
>>
>> alignment = get_default_alignment(fd, blt->driver);
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
>> - + blt->src.plane_offset;
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
>> - + blt->dst.plane_offset;
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index) +
>> + blt->src.plane_offset;
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
>> + alignment, blt->dst.pat_index) +
>> + blt->dst.plane_offset;
>
> The tab count looks off in the formatting of the src and dst plane_offset continuation lines.
>
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> fill_data(&data, blt, src_offset, dst_offset, ext);
>> @@ -884,8 +887,10 @@ int blt_block_copy(int fd,
>> igt_assert_neq(blt->driver, 0);
>>
>> alignment = get_default_alignment(fd, blt->driver);
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
>> + alignment, blt->dst.pat_index);
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
>> @@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>> data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
>> data.dw00.length = 0x3;
>>
>> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
>> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
>> + alignment, surf->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
>> + alignment, surf->dst.pat_index);
>> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>>
>> data.dw01.src_address_lo = src_offset;
>> @@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
>> igt_assert_neq(surf->driver, 0);
>>
>> alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
>> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
>> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
>> + alignment, surf->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
>> + alignment, surf->dst.pat_index);
>> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>>
>> emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
>> @@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
>> data.dw03.dst_x2 = blt->dst.x2;
>> data.dw03.dst_y2 = blt->dst.y2;
>>
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
>> - + blt->src.plane_offset;
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
>> - + blt->dst.plane_offset;
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index) +
>> + blt->src.plane_offset;
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
>> + blt->dst.pat_index) +
>> + blt->dst.plane_offset;
>
> Ditto.
>
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> data.dw04.dst_address_lo = dst_offset;
>> @@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
>> igt_assert_neq(blt->driver, 0);
>>
>> alignment = get_default_alignment(fd, blt->driver);
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
>> + alignment, blt->dst.pat_index);
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> emit_blt_fast_copy(fd, ahnd, blt, 0, true);
>> @@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>> &size, region) == 0);
>> }
>
> I think blt_create_object() should have also pat_index passed as an
> argument.
I think you would also have to pass in the cpu_caching mode, and maybe
even the coh_mode, if we wanted that. Currently blt_create_object()
gives you a combination of cpu_caching, coh_mode and pat_index that is
the default and should "just work" for most cases. The idea is that if
you need something more exotic you would instead create your own object
(using, say, gem_create_caching) and then also select whatever pat_index
you need.
I can change it to expose everything, but I figured blt_create_object()
should be more "I don't care, just give me the defaults".
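For reference, the manual path would then look roughly like this (untested
sketch; it uses the xe_bo_create_caching() helper from patch 04/12, and
width/height/stride/xe here are just placeholders):

struct blt_copy_object obj = {};
uint64_t size = ALIGN(height * stride, xe_get_default_alignment(xe));
uint32_t handle;

/* pick a non-default CPU caching + coherency mode for the BO itself... */
handle = xe_bo_create_caching(xe, 0, size, system_memory(xe),
			      DRM_XE_GEM_CPU_CACHING_WC,
			      DRM_XE_GEM_COH_NONE);

/* ...and then select whatever pat_index is needed for the GPU mapping */
blt_set_object(&obj, handle, size, system_memory(xe),
	       intel_get_uc_mocs(xe), intel_get_pat_idx_wt(xe),
	       T_LINEAR, COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
blt_set_geom(&obj, stride, 0, 0, width, height, 0, 0);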
>
> Rest looks ok.
>
> --
> Zbigniew
>
>>
>> - blt_set_object(obj, handle, size, region, mocs, tiling,
>> + blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
>> compression, compression_type);
>> blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>>
>> @@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
>>
>> void blt_set_object(struct blt_copy_object *obj,
>> uint32_t handle, uint64_t size, uint32_t region,
>> - uint8_t mocs, enum blt_tiling_type tiling,
>> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
>> enum blt_compression compression,
>> enum blt_compression_type compression_type)
>> {
>> @@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
>> obj->size = size;
>> obj->region = region;
>> obj->mocs = mocs;
>> + obj->pat_index = pat_index;
>> obj->tiling = tiling;
>> obj->compression = compression;
>> obj->compression_type = compression_type;
>> @@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
>>
>> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
>> uint32_t handle, uint32_t region, uint64_t size,
>> - uint8_t mocs, enum blt_access_type access_type)
>> + uint8_t mocs, uint8_t pat_index,
>> + enum blt_access_type access_type)
>> {
>> obj->handle = handle;
>> obj->region = region;
>> obj->size = size;
>> obj->mocs = mocs;
>> + obj->pat_index = pat_index;
>> obj->access_type = access_type;
>> }
>>
>> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
>> index d9c8883c7..f8423a986 100644
>> --- a/lib/intel_blt.h
>> +++ b/lib/intel_blt.h
>> @@ -79,6 +79,7 @@ struct blt_copy_object {
>> uint32_t region;
>> uint64_t size;
>> uint8_t mocs;
>> + uint8_t pat_index;
>> enum blt_tiling_type tiling;
>> enum blt_compression compression; /* BC only */
>> enum blt_compression_type compression_type; /* BC only */
>> @@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
>> uint32_t region;
>> uint64_t size;
>> uint8_t mocs;
>> + uint8_t pat_index;
>> enum blt_access_type access_type;
>> };
>>
>> @@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>> void blt_destroy_object(int fd, struct blt_copy_object *obj);
>> void blt_set_object(struct blt_copy_object *obj,
>> uint32_t handle, uint64_t size, uint32_t region,
>> - uint8_t mocs, enum blt_tiling_type tiling,
>> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
>> enum blt_compression compression,
>> enum blt_compression_type compression_type);
>> void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
>> @@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
>> const struct blt_copy_object *orig);
>> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
>> uint32_t handle, uint32_t region, uint64_t size,
>> - uint8_t mocs, enum blt_access_type access_type);
>> + uint8_t mocs, uint8_t pat_index,
>> + enum blt_access_type access_type);
>>
>> void blt_surface_info(const char *info,
>> const struct blt_copy_object *obj);
>> diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
>> index f5d4ab359..a98557b72 100644
>> --- a/tests/intel/gem_ccs.c
>> +++ b/tests/intel/gem_ccs.c
>> @@ -15,6 +15,7 @@
>> #include "lib/intel_chipset.h"
>> #include "intel_blt.h"
>> #include "intel_mocs.h"
>> +#include "intel_pat.h"
>> /**
>> * TEST: gem ccs
>> * Description: Exercise gen12 blitter with and without flatccs compression
>> @@ -111,9 +112,9 @@ static void surf_copy(int i915,
>> blt_ctrl_surf_copy_init(i915, &surf);
>> surf.print_bb = param.print_bb;
>> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
>> - uc_mocs, BLT_INDIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
>> blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
>> - uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> bb_size = 4096;
>> igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
>> blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
>> @@ -133,7 +134,7 @@ static void surf_copy(int i915,
>> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>>
>> blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
>> - 0, DIRECT_ACCESS);
>> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>> gem_sync(i915, surf.dst.handle);
>>
>> @@ -155,9 +156,9 @@ static void surf_copy(int i915,
>> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
>> ccsmap[i] = i;
>> blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
>> - uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
>> - uc_mocs, INDIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
>> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>>
>> blt_copy_init(i915, &blt);
>> @@ -399,7 +400,8 @@ static void block_copy(int i915,
>> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
>> if (config->inplace) {
>> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
>> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
>> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
>> + comp_type);
>> blt.dst.ptr = mid->ptr;
>> }
>>
>> @@ -475,7 +477,7 @@ static void block_multicopy(int i915,
>>
>> if (config->inplace) {
>> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
>> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
>> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
>> comp_type);
>> blt3.dst.ptr = mid->ptr;
>> }
>> diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
>> index ede545c92..7f2ab8bb6 100644
>> --- a/tests/intel/gem_lmem_swapping.c
>> +++ b/tests/intel/gem_lmem_swapping.c
>> @@ -486,7 +486,7 @@ static void __do_evict(int i915,
>> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
>> blt_set_object(tmp, tmp->handle, params->size.max,
>> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
>> - intel_get_uc_mocs(i915), T_LINEAR,
>> + intel_get_uc_mocs(i915), 0, T_LINEAR,
>> COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
>> blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
>> }
>> @@ -516,7 +516,7 @@ static void __do_evict(int i915,
>> obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
>> igt_assert(obj->blt_obj);
>> blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
>> - intel_get_uc_mocs(i915), T_LINEAR,
>> + intel_get_uc_mocs(i915), 0, T_LINEAR,
>> COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
>> blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
>> init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
>> diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
>> index 20bbc4448..27859d5ce 100644
>> --- a/tests/intel/xe_ccs.c
>> +++ b/tests/intel/xe_ccs.c
>> @@ -13,6 +13,7 @@
>> #include "igt_syncobj.h"
>> #include "intel_blt.h"
>> #include "intel_mocs.h"
>> +#include "intel_pat.h"
>> #include "xe/xe_ioctl.h"
>> #include "xe/xe_query.h"
>> #include "xe/xe_util.h"
>> @@ -108,8 +109,9 @@ static void surf_copy(int xe,
>> blt_ctrl_surf_copy_init(xe, &surf);
>> surf.print_bb = param.print_bb;
>> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
>> - uc_mocs, BLT_INDIRECT_ACCESS);
>> - blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
>> + blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
>> + DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> bb_size = xe_get_default_alignment(xe);
>> bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
>> blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
>> @@ -130,7 +132,7 @@ static void surf_copy(int xe,
>> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>>
>> blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
>> - 0, DIRECT_ACCESS);
>> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>> intel_ctx_xe_sync(ctx, true);
>>
>> @@ -153,9 +155,9 @@ static void surf_copy(int xe,
>> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
>> ccsmap[i] = i;
>> blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
>> - uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
>> - uc_mocs, INDIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
>> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>> intel_ctx_xe_sync(ctx, true);
>>
>> @@ -369,7 +371,8 @@ static void block_copy(int xe,
>> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
>> if (config->inplace) {
>> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
>> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
>> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
>> + comp_type);
>> blt.dst.ptr = mid->ptr;
>> }
>>
>> @@ -450,8 +453,8 @@ static void block_multicopy(int xe,
>>
>> if (config->inplace) {
>> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
>> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
>> - comp_type);
>> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
>> + COMPRESSION_DISABLED, comp_type);
>> blt3.dst.ptr = mid->ptr;
>> }
>>
>> --
>> 2.41.0
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: support pat_index
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
@ 2023-10-06 12:13 ` Zbigniew Kempczyński
0 siblings, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-06 12:13 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Thu, Oct 05, 2023 at 04:31:13PM +0100, Matthew Auld wrote:
> Some users need to be able to select their own pat_index. Some display
> tests use igt_draw which in turn uses intel_batchbuffer and intel_buf.
> We also have a couple more display tests using these interfaces
> directly. The idea is to select wt/uc for anything display related, but
> also allow any test to select a pat_index for a given intel_buf.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_draw.c | 7 +++++-
> lib/igt_fb.c | 3 ++-
> lib/intel_allocator.c | 1 +
> lib/intel_allocator.h | 1 +
> lib/intel_batchbuffer.c | 51 ++++++++++++++++++++++++++++++---------
> lib/intel_bufops.c | 29 +++++++++++++++-------
> lib/intel_bufops.h | 9 +++++--
> tests/intel/kms_big_fb.c | 4 ++-
> tests/intel/kms_dirtyfb.c | 7 ++++--
> tests/intel/kms_psr.c | 4 ++-
> tests/intel/xe_intel_bb.c | 3 ++-
> 11 files changed, 89 insertions(+), 30 deletions(-)
>
> diff --git a/lib/igt_draw.c b/lib/igt_draw.c
> index 2332bf94a..8db71ce5e 100644
> --- a/lib/igt_draw.c
> +++ b/lib/igt_draw.c
> @@ -31,6 +31,7 @@
> #include "intel_batchbuffer.h"
> #include "intel_chipset.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> #include "igt_core.h"
> #include "igt_fb.h"
> #include "ioctl_wrappers.h"
> @@ -75,6 +76,7 @@ struct buf_data {
> uint32_t size;
> uint32_t stride;
> int bpp;
> + uint8_t pat_index;
> };
>
> struct rect {
> @@ -658,7 +660,8 @@ static struct intel_buf *create_buf(int fd, struct buf_ops *bops,
> width, height, from->bpp, 0,
> tiling, 0,
> size, 0,
> - region);
> + region,
> + from->pat_index);
>
> /* Make sure we close handle on destroy path */
> intel_buf_set_ownership(buf, true);
> @@ -785,6 +788,7 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
> igt_skip_on(!rendercopy);
>
> /* We create a temporary buffer and copy from it using rendercopy. */
> + tmp.pat_index = buf->pat_index;
> tmp.size = rect->w * rect->h * pixel_size;
> if (is_i915_device(fd))
> tmp.handle = gem_create(fd, tmp.size);
> @@ -852,6 +856,7 @@ void igt_draw_rect(int fd, struct buf_ops *bops, uint32_t ctx,
> .size = buf_size,
> .stride = buf_stride,
> .bpp = bpp,
> + .pat_index = intel_get_pat_idx_wt(fd),
> };
> struct rect rect = {
> .x = rect_x,
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> index d290fd775..61384c553 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -2637,7 +2637,8 @@ igt_fb_create_intel_buf(int fd, struct buf_ops *bops,
> igt_fb_mod_to_tiling(fb->modifier),
> compression, fb->size,
> fb->strides[0],
> - region);
> + region,
> + intel_get_pat_idx_wt(fd));
> intel_buf_set_name(buf, name);
>
> /* Make sure we close handle on destroy path */
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index da357b833..b3e5c0226 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -1449,6 +1449,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
> bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
> uint32_t handle,
> uint64_t size, uint64_t offset,
> + uint8_t pat_index,
> bool *is_allocatedp)
> {
> struct alloc_req req = { .request_type = REQ_RESERVE_IF_NOT_ALLOCATED,
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index 5da8af7f9..d93c5828d 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -206,6 +206,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
> bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
> uint32_t handle,
> uint64_t size, uint64_t offset,
> + uint8_t pat_index,
> bool *is_allocatedp);
>
> void intel_allocator_print(uint64_t allocator_handle);
> diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
> index e7b1b755f..eaaf667ea 100644
> --- a/lib/intel_batchbuffer.c
> +++ b/lib/intel_batchbuffer.c
> @@ -38,6 +38,7 @@
> #include "intel_batchbuffer.h"
> #include "intel_bufops.h"
> #include "intel_chipset.h"
> +#include "intel_pat.h"
> #include "media_fill.h"
> #include "media_spin.h"
> #include "sw_sync.h"
> @@ -825,15 +826,18 @@ static void __reallocate_objects(struct intel_bb *ibb)
> static inline uint64_t __intel_bb_get_offset(struct intel_bb *ibb,
> uint32_t handle,
> uint64_t size,
> - uint32_t alignment)
> + uint32_t alignment,
> + uint8_t pat_index)
> {
> uint64_t offset;
>
> if (ibb->enforce_relocs)
> return 0;
>
> - offset = intel_allocator_alloc(ibb->allocator_handle,
> - handle, size, alignment);
> + offset = __intel_allocator_alloc(ibb->allocator_handle, handle,
> + size, alignment, pat_index,
> + ALLOC_STRATEGY_NONE);
> + igt_assert(offset != ALLOC_INVALID_ADDRESS);
>
> return offset;
> }
> @@ -1300,11 +1304,14 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct intel_bb *ibb,
> ops->op = op;
> ops->obj_offset = 0;
> ops->addr = objects[i]->offset;
> - ops->range = objects[i]->rsvd1;
> + ops->range = objects[i]->rsvd1 & ~(4096-1);
I would introduce some macros for better readability, like
#define OBJ_SIZE(rsvd1) ((rsvd1) & ~(SZ_4K-1))
#define OBJ_PATIDX(rsvd1) ((rsvd1) & (SZ_4K-1))
or something similar. IMO
ops->range = OBJ_SIZE(objects[i]->rsvd1);
ops->pat_index = OBJ_PATIDX(objects[i]->rsvd1);
makes it clearer on first reading that more data than just the size is
packed into rsvd1.
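For completeness, the pack side in __intel_bb_add_object() would then just
be the mirror of that (sketch):

/* size is page aligned, so the low bits are free to carry the pat_index */
object->rsvd1 = size | pat_index;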
> ops->region = region;
> + if (set_obj)
> + ops->pat_index = objects[i]->rsvd1 & (4096-1);
>
> - igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx\n",
> - i, ops->obj, (long long)ops->addr, (long long)ops->range);
> + igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx pat_index: %u\n",
> + i, ops->obj, (long long)ops->addr, (long long)ops->range,
> + ops->pat_index);
> }
>
> return bind_ops;
> @@ -1409,7 +1416,8 @@ void intel_bb_reset(struct intel_bb *ibb, bool purge_objects_cache)
> ibb->batch_offset = __intel_bb_get_offset(ibb,
> ibb->handle,
> ibb->size,
> - ibb->alignment);
> + ibb->alignment,
> + DEFAULT_PAT_INDEX);
>
> intel_bb_add_object(ibb, ibb->handle, ibb->size,
> ibb->batch_offset,
> @@ -1645,7 +1653,8 @@ static void __remove_from_objects(struct intel_bb *ibb,
> */
> static struct drm_i915_gem_exec_object2 *
> __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> - uint64_t offset, uint64_t alignment, bool write)
> + uint64_t offset, uint64_t alignment, uint8_t pat_index,
> + bool write)
> {
> struct drm_i915_gem_exec_object2 *object;
>
> @@ -1661,6 +1670,9 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> object = __add_to_cache(ibb, handle);
> __add_to_objects(ibb, object);
>
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(ibb->fd);
> +
> /*
> * If object->offset == INVALID_ADDRESS we added freshly object to the
> * cache. In that case we have two choices:
> @@ -1670,7 +1682,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> if (INVALID_ADDR(object->offset)) {
> if (INVALID_ADDR(offset)) {
> offset = __intel_bb_get_offset(ibb, handle, size,
> - alignment);
> + alignment, pat_index);
> } else {
> offset = offset & (ibb->gtt_size - 1);
>
> @@ -1683,6 +1695,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
>
> reserved = intel_allocator_reserve_if_not_allocated(ibb->allocator_handle,
> handle, size, offset,
> + pat_index,
> &allocated);
> igt_assert_f(allocated || reserved,
> "Can't get offset, allocated: %d, reserved: %d\n",
> @@ -1721,6 +1734,18 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> if (ibb->driver == INTEL_DRIVER_XE) {
> object->alignment = alignment;
> object->rsvd1 = size;
> + igt_assert(!(size & (4096-1)));
igt_assert(!OBJ_PATIDX(object->rsvd1));
?
But that's just a suggestion. Anyway, for this one:
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> +
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(ibb->fd);
> +
> + /*
> + * XXX: For now encode the pat_index in the first few bits of
> + * rsvd1. intel_batchbuffer should really stop using the i915
> + * drm_i915_gem_exec_object2 to encode VMA placement
> + * information on xe...
> + */
> + object->rsvd1 |= pat_index;
> }
>
> return object;
> @@ -1733,7 +1758,7 @@ intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> struct drm_i915_gem_exec_object2 *obj = NULL;
>
> obj = __intel_bb_add_object(ibb, handle, size, offset,
> - alignment, write);
> + alignment, DEFAULT_PAT_INDEX, write);
> igt_assert(obj);
>
> return obj;
> @@ -1795,8 +1820,10 @@ __intel_bb_add_intel_buf(struct intel_bb *ibb, struct intel_buf *buf,
> }
> }
>
> - obj = intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
> - buf->addr.offset, alignment, write);
> + obj = __intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
> + buf->addr.offset, alignment, buf->pat_index,
> + write);
> + igt_assert(obj);
> buf->addr.offset = obj->offset;
>
> if (igt_list_empty(&buf->link)) {
> diff --git a/lib/intel_bufops.c b/lib/intel_bufops.c
> index 2c91adb88..fbee4748e 100644
> --- a/lib/intel_bufops.c
> +++ b/lib/intel_bufops.c
> @@ -29,6 +29,7 @@
> #include "igt.h"
> #include "igt_x86.h"
> #include "intel_bufops.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -818,7 +819,7 @@ static void __intel_buf_init(struct buf_ops *bops,
> int width, int height, int bpp, int alignment,
> uint32_t req_tiling, uint32_t compression,
> uint64_t bo_size, int bo_stride,
> - uint64_t region)
> + uint64_t region, uint8_t pat_index)
> {
> uint32_t tiling = req_tiling;
> uint64_t size;
> @@ -839,6 +840,10 @@ static void __intel_buf_init(struct buf_ops *bops,
> IGT_INIT_LIST_HEAD(&buf->link);
> buf->mocs = INTEL_BUF_MOCS_DEFAULT;
>
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(bops->fd);
> + buf->pat_index = pat_index;
> +
> if (compression) {
> igt_require(bops->intel_gen >= 9);
> igt_assert(req_tiling == I915_TILING_Y ||
> @@ -957,7 +962,7 @@ void intel_buf_init(struct buf_ops *bops,
> region = bops->driver == INTEL_DRIVER_I915 ? I915_SYSTEM_MEMORY :
> system_memory(bops->fd);
> __intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
> - tiling, compression, 0, 0, region);
> + tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
>
> intel_buf_set_ownership(buf, true);
> }
> @@ -974,7 +979,7 @@ void intel_buf_init_in_region(struct buf_ops *bops,
> uint64_t region)
> {
> __intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
> - tiling, compression, 0, 0, region);
> + tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
>
> intel_buf_set_ownership(buf, true);
> }
> @@ -1033,7 +1038,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
> uint32_t req_tiling, uint32_t compression)
> {
> __intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
> - req_tiling, compression, 0, 0, -1);
> + req_tiling, compression, 0, 0, -1, DEFAULT_PAT_INDEX);
> }
>
> /**
> @@ -1050,6 +1055,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
> * @size: real bo size
> * @stride: bo stride
> * @region: region
> + * @pat_index: pat_index to use for the binding (only used on xe)
> *
> * Function configures BO handle within intel_buf structure passed by the caller
> * (with all its metadata - width, height, ...). Useful if BO was created
> @@ -1067,10 +1073,12 @@ void intel_buf_init_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region)
> + uint64_t region,
> + uint8_t pat_index)
> {
> __intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
> - req_tiling, compression, size, stride, region);
> + req_tiling, compression, size, stride, region,
> + pat_index);
> }
>
> /**
> @@ -1149,7 +1157,8 @@ struct intel_buf *intel_buf_create_using_handle_and_size(struct buf_ops *bops,
> int stride)
> {
> return intel_buf_create_full(bops, handle, width, height, bpp, alignment,
> - req_tiling, compression, size, stride, -1);
> + req_tiling, compression, size, stride, -1,
> + DEFAULT_PAT_INDEX);
> }
>
> struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> @@ -1160,7 +1169,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region)
> + uint64_t region,
> + uint8_t pat_index)
> {
> struct intel_buf *buf;
>
> @@ -1170,7 +1180,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> igt_assert(buf);
>
> __intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
> - req_tiling, compression, size, stride, region);
> + req_tiling, compression, size, stride, region,
> + pat_index);
>
> return buf;
> }
> diff --git a/lib/intel_bufops.h b/lib/intel_bufops.h
> index 4dfe4681c..b6048402b 100644
> --- a/lib/intel_bufops.h
> +++ b/lib/intel_bufops.h
> @@ -63,6 +63,9 @@ struct intel_buf {
> /* Content Protection*/
> bool is_protected;
>
> + /* pat_index to use for mapping this buf. Only used in Xe. */
> + uint8_t pat_index;
> +
> /* For debugging purposes */
> char name[INTEL_BUF_NAME_MAXSIZE + 1];
> };
> @@ -161,7 +164,8 @@ void intel_buf_init_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region);
> + uint64_t region,
> + uint8_t pat_index);
>
> struct intel_buf *intel_buf_create(struct buf_ops *bops,
> int width, int height,
> @@ -192,7 +196,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region);
> + uint64_t region,
> + uint8_t pat_index);
> void intel_buf_destroy(struct intel_buf *buf);
>
> static inline void intel_buf_set_pxp(struct intel_buf *buf, bool new_pxp_state)
> diff --git a/tests/intel/kms_big_fb.c b/tests/intel/kms_big_fb.c
> index 611e60896..854a77992 100644
> --- a/tests/intel/kms_big_fb.c
> +++ b/tests/intel/kms_big_fb.c
> @@ -34,6 +34,7 @@
> #include <string.h>
>
> #include "i915/gem_create.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -88,7 +89,8 @@ static struct intel_buf *init_buf(data_t *data,
> handle = gem_open(data->drm_fd, name);
> buf = intel_buf_create_full(data->bops, handle, width, height,
> bpp, 0, tiling, 0, size, 0,
> - region);
> + region,
> + intel_get_pat_idx_wt(data->drm_fd));
>
> intel_buf_set_name(buf, buf_name);
> intel_buf_set_ownership(buf, true);
> diff --git a/tests/intel/kms_dirtyfb.c b/tests/intel/kms_dirtyfb.c
> index cc9529178..ec9b2a137 100644
> --- a/tests/intel/kms_dirtyfb.c
> +++ b/tests/intel/kms_dirtyfb.c
> @@ -10,6 +10,7 @@
>
> #include "i915/intel_drrs.h"
> #include "i915/intel_fbc.h"
> +#include "intel_pat.h"
>
> #include "xe/xe_query.h"
>
> @@ -246,14 +247,16 @@ static void run_test(data_t *data)
> 0,
> igt_fb_mod_to_tiling(data->fbs[1].modifier),
> 0, 0, 0, is_xe_device(data->drm_fd) ?
> - system_memory(data->drm_fd) : 0);
> + system_memory(data->drm_fd) : 0,
> + intel_get_pat_idx_wt(data->drm_fd));
> dst = intel_buf_create_full(data->bops, data->fbs[2].gem_handle,
> data->fbs[2].width,
> data->fbs[2].height,
> igt_drm_format_to_bpp(data->fbs[2].drm_format),
> 0, igt_fb_mod_to_tiling(data->fbs[2].modifier),
> 0, 0, 0, is_xe_device(data->drm_fd) ?
> - system_memory(data->drm_fd) : 0);
> + system_memory(data->drm_fd) : 0,
> + intel_get_pat_idx_wt(data->drm_fd));
> ibb = intel_bb_create(data->drm_fd, PAGE_SIZE);
>
> spin = igt_spin_new(data->drm_fd, .ahnd = ibb->allocator_handle);
> diff --git a/tests/intel/kms_psr.c b/tests/intel/kms_psr.c
> index ffecc5222..9c6ecd829 100644
> --- a/tests/intel/kms_psr.c
> +++ b/tests/intel/kms_psr.c
> @@ -31,6 +31,7 @@
> #include "igt.h"
> #include "igt_sysfs.h"
> #include "igt_psr.h"
> +#include "intel_pat.h"
> #include <errno.h>
> #include <stdbool.h>
> #include <stdio.h>
> @@ -356,7 +357,8 @@ static struct intel_buf *create_buf_from_fb(data_t *data,
> name = gem_flink(data->drm_fd, fb->gem_handle);
> handle = gem_open(data->drm_fd, name);
> buf = intel_buf_create_full(data->bops, handle, width, height,
> - bpp, 0, tiling, 0, size, stride, region);
> + bpp, 0, tiling, 0, size, stride, region,
> + intel_get_pat_idx_wt(data->drm_fd));
> intel_buf_set_ownership(buf, true);
>
> return buf;
> diff --git a/tests/intel/xe_intel_bb.c b/tests/intel/xe_intel_bb.c
> index 0159a3164..e2480acf8 100644
> --- a/tests/intel/xe_intel_bb.c
> +++ b/tests/intel/xe_intel_bb.c
> @@ -19,6 +19,7 @@
> #include "igt.h"
> #include "igt_crc.h"
> #include "intel_bufops.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -400,7 +401,7 @@ static void create_in_region(struct buf_ops *bops, uint64_t region)
> intel_buf_init_full(bops, handle, &buf,
> width/4, height, 32, 0,
> I915_TILING_NONE, 0,
> - size, 0, region);
> + size, 0, region, DEFAULT_PAT_INDEX);
> intel_buf_set_ownership(&buf, true);
>
> intel_bb_add_intel_buf(ibb, &buf, false);
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [igt-dev] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-06 12:08 ` Matthew Auld
@ 2023-10-09 9:21 ` Zbigniew Kempczyński
0 siblings, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-09 9:21 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Fri, Oct 06, 2023 at 01:08:50PM +0100, Matthew Auld wrote:
> On 06/10/2023 12:51, Zbigniew Kempczyński wrote:
> > On Thu, Oct 05, 2023 at 04:31:12PM +0100, Matthew Auld wrote:
> > > For the most part we can just use the default wb, however some users
> > > including display might want to use something else.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: José Roberto de Souza <jose.souza@intel.com>
> > > Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> > > ---
> > > lib/igt_fb.c | 2 ++
> > > lib/intel_blt.c | 54 +++++++++++++++++++++------------
> > > lib/intel_blt.h | 7 +++--
> > > tests/intel/gem_ccs.c | 16 +++++-----
> > > tests/intel/gem_lmem_swapping.c | 4 +--
> > > tests/intel/xe_ccs.c | 19 +++++++-----
> > > 6 files changed, 64 insertions(+), 38 deletions(-)
> > >
> > > diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> > > index f8a0db22c..d290fd775 100644
> > > --- a/lib/igt_fb.c
> > > +++ b/lib/igt_fb.c
> > > @@ -37,6 +37,7 @@
> > > #include "i915/gem_mman.h"
> > > #include "intel_blt.h"
> > > #include "intel_mocs.h"
> > > +#include "intel_pat.h"
> > > #include "igt_aux.h"
> > > #include "igt_color_encoding.h"
> > > #include "igt_fb.h"
> > > @@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
> > > blt_set_object(blt, handle, fb->size, memregion,
> > > intel_get_uc_mocs(fb->fd),
> > > + intel_get_pat_idx_wt(fb->fd),
> > > blt_tile,
> > > is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
> > > is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
> > > diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> > > index b55fa9b52..b7ac2902b 100644
> > > --- a/lib/intel_blt.c
> > > +++ b/lib/intel_blt.c
> > > @@ -13,6 +13,7 @@
> > > #include "igt.h"
> > > #include "igt_syncobj.h"
> > > #include "intel_blt.h"
> > > +#include "intel_pat.h"
> > > #include "xe/xe_ioctl.h"
> > > #include "xe/xe_query.h"
> > > #include "xe/xe_util.h"
> > > @@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
> > > igt_assert_f(blt, "block-copy requires data to do blit\n");
> > > alignment = get_default_alignment(fd, blt->driver);
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> > > - + blt->src.plane_offset;
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> > > - + blt->dst.plane_offset;
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index) +
> > > + blt->src.plane_offset;
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> > > + alignment, blt->dst.pat_index) +
> > > + blt->dst.plane_offset;
> >
> > To less tabs in formatting for src and dst plane_offset.
> >
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > fill_data(&data, blt, src_offset, dst_offset, ext);
> > > @@ -884,8 +887,10 @@ int blt_block_copy(int fd,
> > > igt_assert_neq(blt->driver, 0);
> > > alignment = get_default_alignment(fd, blt->driver);
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> > > + alignment, blt->dst.pat_index);
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
> > > @@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> > > data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> > > data.dw00.length = 0x3;
> > > - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> > > + alignment, surf->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> > > + alignment, surf->dst.pat_index);
> > > bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
> > > data.dw01.src_address_lo = src_offset;
> > > @@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
> > > igt_assert_neq(surf->driver, 0);
> > > alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
> > > - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> > > + alignment, surf->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> > > + alignment, surf->dst.pat_index);
> > > bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
> > > emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
> > > @@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
> > > data.dw03.dst_x2 = blt->dst.x2;
> > > data.dw03.dst_y2 = blt->dst.y2;
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> > > - + blt->src.plane_offset;
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> > > - + blt->dst.plane_offset;
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index) +
> > > + blt->src.plane_offset;
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
> > > + blt->dst.pat_index) +
> > > + blt->dst.plane_offset;
> >
> > Ditto.
> >
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > data.dw04.dst_address_lo = dst_offset;
> > > @@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
> > > igt_assert_neq(blt->driver, 0);
> > > alignment = get_default_alignment(fd, blt->driver);
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> > > + alignment, blt->dst.pat_index);
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > emit_blt_fast_copy(fd, ahnd, blt, 0, true);
> > > @@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> > > &size, region) == 0);
> > > }
> >
> > I think blt_create_object() should have also pat_index passed as an
> > argument.
>
> I think you would also have to pass in the cpu_caching mode, and maybe even
> the coh_mode, if we wanted that. Currently blt_create_object() gives you a
> combination of cpu_caching, coh_mode and pat_index that is the default and
> should "just work" for most cases. The idea is that if you need something
> more exotic you would instead create your own object (using, say,
> gem_create_caching) and then also select whatever pat_index you need.
>
> I can change it to expose everything, but I figured blt_create_object()
> should be more "I don't care, just give me the defaults".
Ok. You've convinced me. Any non-default settings can be changed before
the exec.
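E.g. something like (rough sketch, argument list from memory):

struct blt_copy_object *dst;

dst = blt_create_object(&blt, region, width, height, bpp, uc_mocs,
			T_LINEAR, COMPRESSION_DISABLED,
			COMPRESSION_TYPE_3D, true);
/* override the default pat_index before the copy is emitted/executed */
dst->pat_index = intel_get_pat_idx_wt(xe);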
--
Zbigniew
>
> >
> > Rest looks ok.
> >
> > --
> > Zbigniew
> >
> > > - blt_set_object(obj, handle, size, region, mocs, tiling,
> > > + blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
> > > compression, compression_type);
> > > blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
> > > @@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
> > > void blt_set_object(struct blt_copy_object *obj,
> > > uint32_t handle, uint64_t size, uint32_t region,
> > > - uint8_t mocs, enum blt_tiling_type tiling,
> > > + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> > > enum blt_compression compression,
> > > enum blt_compression_type compression_type)
> > > {
> > > @@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
> > > obj->size = size;
> > > obj->region = region;
> > > obj->mocs = mocs;
> > > + obj->pat_index = pat_index;
> > > obj->tiling = tiling;
> > > obj->compression = compression;
> > > obj->compression_type = compression_type;
> > > @@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
> > > void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> > > uint32_t handle, uint32_t region, uint64_t size,
> > > - uint8_t mocs, enum blt_access_type access_type)
> > > + uint8_t mocs, uint8_t pat_index,
> > > + enum blt_access_type access_type)
> > > {
> > > obj->handle = handle;
> > > obj->region = region;
> > > obj->size = size;
> > > obj->mocs = mocs;
> > > + obj->pat_index = pat_index;
> > > obj->access_type = access_type;
> > > }
> > > diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> > > index d9c8883c7..f8423a986 100644
> > > --- a/lib/intel_blt.h
> > > +++ b/lib/intel_blt.h
> > > @@ -79,6 +79,7 @@ struct blt_copy_object {
> > > uint32_t region;
> > > uint64_t size;
> > > uint8_t mocs;
> > > + uint8_t pat_index;
> > > enum blt_tiling_type tiling;
> > > enum blt_compression compression; /* BC only */
> > > enum blt_compression_type compression_type; /* BC only */
> > > @@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
> > > uint32_t region;
> > > uint64_t size;
> > > uint8_t mocs;
> > > + uint8_t pat_index;
> > > enum blt_access_type access_type;
> > > };
> > > @@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> > > void blt_destroy_object(int fd, struct blt_copy_object *obj);
> > > void blt_set_object(struct blt_copy_object *obj,
> > > uint32_t handle, uint64_t size, uint32_t region,
> > > - uint8_t mocs, enum blt_tiling_type tiling,
> > > + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> > > enum blt_compression compression,
> > > enum blt_compression_type compression_type);
> > > void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
> > > @@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
> > > const struct blt_copy_object *orig);
> > > void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> > > uint32_t handle, uint32_t region, uint64_t size,
> > > - uint8_t mocs, enum blt_access_type access_type);
> > > + uint8_t mocs, uint8_t pat_index,
> > > + enum blt_access_type access_type);
> > > void blt_surface_info(const char *info,
> > > const struct blt_copy_object *obj);
> > > diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
> > > index f5d4ab359..a98557b72 100644
> > > --- a/tests/intel/gem_ccs.c
> > > +++ b/tests/intel/gem_ccs.c
> > > @@ -15,6 +15,7 @@
> > > #include "lib/intel_chipset.h"
> > > #include "intel_blt.h"
> > > #include "intel_mocs.h"
> > > +#include "intel_pat.h"
> > > /**
> > > * TEST: gem ccs
> > > * Description: Exercise gen12 blitter with and without flatccs compression
> > > @@ -111,9 +112,9 @@ static void surf_copy(int i915,
> > > blt_ctrl_surf_copy_init(i915, &surf);
> > > surf.print_bb = param.print_bb;
> > > blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> > > - uc_mocs, BLT_INDIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> > > blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
> > > - uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > bb_size = 4096;
> > > igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
> > > blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
> > > @@ -133,7 +134,7 @@ static void surf_copy(int i915,
> > > igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> > > blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
> > > - 0, DIRECT_ACCESS);
> > > + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> > > gem_sync(i915, surf.dst.handle);
> > > @@ -155,9 +156,9 @@ static void surf_copy(int i915,
> > > for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> > > ccsmap[i] = i;
> > > blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
> > > - uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> > > - uc_mocs, INDIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> > > blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> > > blt_copy_init(i915, &blt);
> > > @@ -399,7 +400,8 @@ static void block_copy(int i915,
> > > blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> > > if (config->inplace) {
> > > blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> > > - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> > > + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> > > + comp_type);
> > > blt.dst.ptr = mid->ptr;
> > > }
> > > @@ -475,7 +477,7 @@ static void block_multicopy(int i915,
> > > if (config->inplace) {
> > > blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> > > - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> > > + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
> > > comp_type);
> > > blt3.dst.ptr = mid->ptr;
> > > }
> > > diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
> > > index ede545c92..7f2ab8bb6 100644
> > > --- a/tests/intel/gem_lmem_swapping.c
> > > +++ b/tests/intel/gem_lmem_swapping.c
> > > @@ -486,7 +486,7 @@ static void __do_evict(int i915,
> > > INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
> > > blt_set_object(tmp, tmp->handle, params->size.max,
> > > INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
> > > - intel_get_uc_mocs(i915), T_LINEAR,
> > > + intel_get_uc_mocs(i915), 0, T_LINEAR,
> > > COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
> > > blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
> > > }
> > > @@ -516,7 +516,7 @@ static void __do_evict(int i915,
> > > obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
> > > igt_assert(obj->blt_obj);
> > > blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
> > > - intel_get_uc_mocs(i915), T_LINEAR,
> > > + intel_get_uc_mocs(i915), 0, T_LINEAR,
> > > COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
> > > blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
> > > init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
> > > diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
> > > index 20bbc4448..27859d5ce 100644
> > > --- a/tests/intel/xe_ccs.c
> > > +++ b/tests/intel/xe_ccs.c
> > > @@ -13,6 +13,7 @@
> > > #include "igt_syncobj.h"
> > > #include "intel_blt.h"
> > > #include "intel_mocs.h"
> > > +#include "intel_pat.h"
> > > #include "xe/xe_ioctl.h"
> > > #include "xe/xe_query.h"
> > > #include "xe/xe_util.h"
> > > @@ -108,8 +109,9 @@ static void surf_copy(int xe,
> > > blt_ctrl_surf_copy_init(xe, &surf);
> > > surf.print_bb = param.print_bb;
> > > blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> > > - uc_mocs, BLT_INDIRECT_ACCESS);
> > > - blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> > > + blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
> > > + DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > bb_size = xe_get_default_alignment(xe);
> > > bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> > > blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> > > @@ -130,7 +132,7 @@ static void surf_copy(int xe,
> > > igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> > > blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
> > > - 0, DIRECT_ACCESS);
> > > + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > intel_ctx_xe_sync(ctx, true);
> > > @@ -153,9 +155,9 @@ static void surf_copy(int xe,
> > > for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> > > ccsmap[i] = i;
> > > blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
> > > - uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> > > - uc_mocs, INDIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> > > blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > intel_ctx_xe_sync(ctx, true);
> > > @@ -369,7 +371,8 @@ static void block_copy(int xe,
> > > blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> > > if (config->inplace) {
> > > blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> > > - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> > > + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> > > + comp_type);
> > > blt.dst.ptr = mid->ptr;
> > > }
> > > @@ -450,8 +453,8 @@ static void block_multicopy(int xe,
> > > if (config->inplace) {
> > > blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> > > - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> > > - comp_type);
> > > + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
> > > + COMPRESSION_DISABLED, comp_type);
> > > blt3.dst.ptr = mid->ptr;
> > > }
> > > --
> > > 2.41.0
> > >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
@ 2023-10-09 22:03 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:03 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency
> bits
>
> Grab the PAT & coherency uapi additions.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> include/drm-uapi/xe_drm.h | 93
> +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 90 insertions(+), 3 deletions(-)
>
> diff --git a/include/drm-uapi/xe_drm.h b/include/drm-uapi/xe_drm.h index
> 804c02270..0a665f67f 100644
> --- a/include/drm-uapi/xe_drm.h
> +++ b/include/drm-uapi/xe_drm.h
> @@ -456,8 +456,54 @@ struct drm_xe_gem_create {
> */
> __u32 handle;
>
> - /** @pad: MBZ */
> - __u32 pad;
> + /**
> + * @coh_mode: The coherency mode for this object. This will limit the
> + * possible @cpu_caching values.
> + *
> + * Supported values:
> + *
> + * DRM_XE_GEM_COH_NONE: GPU access is assumed to be not
> coherent with
> + * CPU. CPU caches are not snooped.
> + *
> + * DRM_XE_GEM_COH_AT_LEAST_1WAY:
> + *
> + * CPU-GPU coherency must be at least 1WAY.
> + *
> + * If 1WAY then GPU access is coherent with CPU (CPU caches are
> snooped)
> + * until GPU acquires. The acquire by the GPU is not tracked by CPU
> + * caches.
> + *
> + * If 2WAY then should be fully coherent between GPU and CPU. Fully
> + * tracked by CPU caches. Both CPU and GPU caches are snooped.
> + *
> + * Note: On dgpu the GPU device never caches system memory. The
> device
> + * should be thought of as always 1WAY coherent, with the addition
> that
> + * the GPU never caches system memory. At least on current dgpu HW
> there
> + * is no way to turn off snooping so likely the different coherency
> + * modes of the pat_index make no difference for system memory.
> + */
> +#define DRM_XE_GEM_COH_NONE 1
> +#define DRM_XE_GEM_COH_AT_LEAST_1WAY 2
> + __u16 coh_mode;
> +
> + /**
> + * @cpu_caching: The CPU caching mode to select for this object. If
> + * mmaping the object the mode selected here will also be used.
> + *
> + * Supported values:
> + *
> + * DRM_XE_GEM_CPU_CACHING_WB: Allocate the pages with write-
> back caching.
> + * On iGPU this can't be used for scanout surfaces. The @coh_mode
> must
> + * be DRM_XE_GEM_COH_AT_LEAST_1WAY. Currently not allowed for
> objects placed
> + * in VRAM.
> + *
> + * DRM_XE_GEM_CPU_CACHING_WC: Allocate the pages as write-
> combined. This is
> + * uncached. Any @coh_mode is permitted. Scanout surfaces should
> likely
> + * use this. All objects that can be placed in VRAM must use this.
> + */
> +#define DRM_XE_GEM_CPU_CACHING_WB 1
> +#define DRM_XE_GEM_CPU_CACHING_WC 2
> + __u16 cpu_caching;
>
> /** @reserved: Reserved */
> __u64 reserved[2];
> @@ -552,8 +598,49 @@ struct drm_xe_vm_bind_op {
> */
> __u32 obj;
>
> + /**
> + * @pat_index: The platform defined @pat_index to use for this
> mapping.
> + * The index basically maps to some predefined memory attributes,
> + * including things like caching, coherency, compression etc. The exact
> + * meaning of the pat_index is platform specific and defined in the
> + * Bspec and PRMs. When the KMD sets up the binding the index here
> is
> + * encoded into the ppGTT PTE.
> + *
> + * For coherency the @pat_index needs to be least as coherent as
> + * drm_xe_gem_create.coh_mode. i.e coh_mode(pat_index) >=
> + * drm_xe_gem_create.coh_mode. The KMD will extract the coherency
> mode
> + * from the @pat_index and reject if there is a mismatch (see note
> below
> + * for pre-MTL platforms).
> + *
> + * Note: On pre-MTL platforms there is only a caching mode and no
> + * explicit coherency mode, but on such hardware there is always a
> + * shared-LLC (or is dgpu) so all GT memory accesses are coherent with
> + * CPU caches even with the caching mode set as uncached. It's only
> the
> + * display engine that is incoherent (on dgpu it must be in VRAM which
> + * is always mapped as WC on the CPU). However to keep the uapi
> somewhat
> + * consistent with newer platforms the KMD groups the different cache
> + * levels into the following coherency buckets on all pre-MTL platforms:
> + *
> + * ppGTT UC -> DRM_XE_GEM_COH_NONE
> + * ppGTT WC -> DRM_XE_GEM_COH_NONE
> + * ppGTT WT -> DRM_XE_GEM_COH_NONE
> + * ppGTT WB -> DRM_XE_GEM_COH_AT_LEAST_1WAY
> + *
> + * In practice UC/WC/WT should only ever used for scanout surfaces
> on
> + * such platforms (or perhaps in general for dma-buf if shared with
> + * another device) since it is only the display engine that is actually
> + * incoherent. Everything else should typically use WB given that we
> + * have a shared-LLC. On MTL+ this completely changes and the HW
> + * defines the coherency mode as part of the @pat_index, where
> + * incoherent GT access is possible.
> + *
> + * Note: For userptr and externally imported dma-buf the kernel
> expects
> + * either 1WAY or 2WAY for the @pat_index.
> + */
> + __u16 pat_index;
> +
> /** @pad: MBZ */
> - __u32 pad;
> + __u16 pad;
>
> union {
> /**
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
@ 2023-10-09 22:03 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:03 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT
>
> Display buffers will likely want WC instead of the default WB on the CPU
> side, given that the display engine is incoherent with CPU caches.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_fb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c index 54a66eb6a..f8a0db22c 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -1206,7 +1206,8 @@ static int create_bo_for_fb(struct igt_fb *fb, bool
> prefer_sysmem)
> igt_assert(err == 0 || err == -EOPNOTSUPP);
> } else if (is_xe_device(fd)) {
> fb->gem_handle = xe_bo_create_flags(fd, 0, fb->size,
> -
> visible_vram_if_possible(fd, 0));
> +
> visible_vram_if_possible(fd, 0) |
> +
> XE_GEM_CREATE_FLAG_SCANOUT);
> } else if (is_vc4_device(fd)) {
> fb->gem_handle = igt_vc4_create_bo(fd, fb->size);
>
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: mark buffers as SCANOUT
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
@ 2023-10-09 22:03 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:03 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 03/12] lib/igt_draw: mark buffers as SCANOUT
>
> Display buffers will likely want WC instead of the default WB on the CPU
> side, given that the display engine is incoherent with CPU caches.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_draw.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/igt_draw.c b/lib/igt_draw.c index 476778a13..2332bf94a
> 100644
> --- a/lib/igt_draw.c
> +++ b/lib/igt_draw.c
> @@ -791,7 +791,8 @@ static void draw_rect_render(int fd, struct cmd_data
> *cmd_data,
> else
> tmp.handle = xe_bo_create_flags(fd, 0,
> ALIGN(tmp.size,
> xe_get_default_alignment(fd)),
> - visible_vram_if_possible(fd,
> 0));
> + visible_vram_if_possible(fd,
> 0) |
> +
> XE_GEM_CREATE_FLAG_SCANOUT);
>
> tmp.stride = rect->w * pixel_size;
> tmp.bpp = buf->bpp;
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
@ 2023-10-09 22:04 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:04 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for
> gem_create
>
> Most tests shouldn't care about such things, so likely it's just a case of
> picking the most sane default. However, we also add some helpers for the
> tests that do care.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/xe/xe_ioctl.c | 65 ++++++++++++++++++++++++++++++++++-------
> lib/xe/xe_ioctl.h | 8 +++++
> tests/intel/xe_create.c | 3 ++
> 3 files changed, 65 insertions(+), 11 deletions(-)
>
> diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c index 730dcfd16..80696aa59
> 100644
> --- a/lib/xe/xe_ioctl.c
> +++ b/lib/xe/xe_ioctl.c
> @@ -233,13 +233,30 @@ void xe_vm_destroy(int fd, uint32_t vm)
> igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_VM_DESTROY, &destroy),
> 0); }
>
> -uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t
> flags,
> - uint32_t *handle)
> +void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
> + uint16_t *cpu_caching,
> + uint16_t *coh_mode)
> +{
> + if ((flags & all_memory_regions(fd)) != system_memory(fd) ||
> + flags & XE_GEM_CREATE_FLAG_SCANOUT) {
> + /* VRAM placements or scanout should always use WC */
> + *cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
> + *coh_mode = DRM_XE_GEM_COH_NONE;
> + } else {
> + *cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
> + *coh_mode = DRM_XE_GEM_COH_AT_LEAST_1WAY;
> + }
> +}
> +
> +static uint32_t ___xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +				      uint16_t cpu_caching, uint16_t coh_mode, uint32_t *handle)
> {
> 	struct drm_xe_gem_create create = {
> 		.vm_id = vm,
> 		.size = size,
> 		.flags = flags,
> +		.cpu_caching = cpu_caching,
> +		.coh_mode = coh_mode,
> 	};
> 	int err;
>
> @@ -249,6 +266,18 @@ uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags
>
> *handle = create.handle;
> return 0;
> +
> +}
> +
> +uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +			      uint32_t *handle)
> +{
> +	uint16_t cpu_caching, coh_mode;
> +
> +	__xe_default_coh_caching_from_flags(fd, flags, &cpu_caching, &coh_mode);
> +
> +	return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
> +				     handle);
> }
>
> uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
> @@ -260,19 +289,33 @@ uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
> return handle;
> }
>
> +uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +				uint16_t cpu_caching, uint16_t coh_mode,
> +				uint32_t *handle)
> +{
> +	return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
> +				     handle);
> +}
> +
> +uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +			      uint16_t cpu_caching, uint16_t coh_mode)
> +{
> +	uint32_t handle;
> +
> +	igt_assert_eq(__xe_bo_create_caching(fd, vm, size, flags,
> +					     cpu_caching, coh_mode, &handle), 0);
> +
> +	return handle;
> +}
> +
> uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size)
> {
> -	struct drm_xe_gem_create create = {
> -		.vm_id = vm,
> -		.size = size,
> -		.flags = vram_if_possible(fd, gt),
> -	};
> -	int err;
> +	uint32_t handle;
>
> -	err = igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create);
> -	igt_assert_eq(err, 0);
> +	igt_assert_eq(__xe_bo_create_flags(fd, vm, size, vram_if_possible(fd, gt),
> +					   &handle), 0);
>
> -	return create.handle;
> +	return handle;
> }
>
> uint32_t xe_bind_exec_queue_create(int fd, uint32_t vm, uint64_t ext)
> diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
> index 6c281b3bf..c18fc878c 100644
> --- a/lib/xe/xe_ioctl.h
> +++ b/lib/xe/xe_ioctl.h
> @@ -67,6 +67,14 @@ void xe_vm_destroy(int fd, uint32_t vm);
> uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> 			      uint32_t *handle);
> uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags);
> +uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +				uint16_t cpu_caching, uint16_t coh_mode,
> +				uint32_t *handle);
> +uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +			      uint16_t cpu_caching, uint16_t coh_mode);
> +void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
> +					 uint16_t *cpu_caching,
> +					 uint16_t *coh_mode);
> uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size);
> uint32_t xe_exec_queue_create(int fd, uint32_t vm,
> 			      struct drm_xe_engine_class_instance *instance,
> diff --git a/tests/intel/xe_create.c b/tests/intel/xe_create.c
> index 8d845e5c8..f5d2cc1b2 100644
> --- a/tests/intel/xe_create.c
> +++ b/tests/intel/xe_create.c
> @@ -30,6 +30,9 @@ static int __create_bo(int fd, uint32_t vm, uint64_t size, uint32_t flags,
>
> igt_assert(handlep);
>
> + __xe_default_coh_caching_from_flags(fd, flags, &create.cpu_caching,
> + &create.coh_mode);
> +
> if (igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create)) {
> ret = -errno;
> errno = 0;
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
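To illustrate the split the commit message describes, here is a hedged usage sketch of the two creation paths added in this patch. It uses xe_bo_create_flags(), xe_bo_create_caching(), system_memory() and the uapi caching/coherency values from the diff above; the example_bo_creation() wrapper itself is hypothetical.

/*
 * Usage sketch only, based on the helpers added in this patch; the
 * wrapper function below is hypothetical.
 */
#include "igt.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"

static void example_bo_creation(int fd, uint32_t vm, uint64_t size)
{
	uint32_t handle_default, handle_wc;

	/*
	 * A test that does not care about caching keeps using the
	 * flags-only helper; __xe_default_coh_caching_from_flags() then
	 * picks WB + AT_LEAST_1WAY for pure system-memory placements and
	 * WC + COH_NONE for VRAM or scanout placements.
	 */
	handle_default = xe_bo_create_flags(fd, vm, size, system_memory(fd));

	/*
	 * A test that explicitly exercises a caching/coherency combination
	 * uses the new _caching variant instead.
	 */
	handle_wc = xe_bo_create_caching(fd, vm, size, system_memory(fd),
					 DRM_XE_GEM_CPU_CACHING_WC,
					 DRM_XE_GEM_COH_NONE);

	gem_close(fd, handle_default);
	gem_close(fd, handle_wc);
}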
end of thread, other threads:[~2023-10-09 22:04 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
2023-10-09 22:04 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
2023-10-06 11:38 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
2023-10-06 11:51 ` [Intel-xe] [igt-dev] " Zbigniew Kempczyński
2023-10-06 12:08 ` Matthew Auld
2023-10-09 9:21 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
2023-10-06 12:13 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests Matthew Auld