* [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support
@ 2023-10-05 15:31 Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
` (11 more replies)
0 siblings, 12 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
This series implements the IGT side of things needed to support the new Xe uapi here:
https://patchwork.freedesktop.org/series/123027/
Branch with the IGT changes:
https://gitlab.freedesktop.org/mwa/igt-gpu-tools/-/commits/xe-pat-index
Branch with the KMD changes:
https://gitlab.freedesktop.org/mwa/kernel/-/tree/xe-pat-index?ref_type=heads
--
2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
` (10 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Grab the PAT & coherency uapi additions.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
include/drm-uapi/xe_drm.h | 93 +++++++++++++++++++++++++++++++++++++--
1 file changed, 90 insertions(+), 3 deletions(-)
diff --git a/include/drm-uapi/xe_drm.h b/include/drm-uapi/xe_drm.h
index 804c02270..0a665f67f 100644
--- a/include/drm-uapi/xe_drm.h
+++ b/include/drm-uapi/xe_drm.h
@@ -456,8 +456,54 @@ struct drm_xe_gem_create {
*/
__u32 handle;
- /** @pad: MBZ */
- __u32 pad;
+ /**
+ * @coh_mode: The coherency mode for this object. This will limit the
+ * possible @cpu_caching values.
+ *
+ * Supported values:
+ *
+ * DRM_XE_GEM_COH_NONE: GPU access is assumed to be not coherent with
+ * CPU. CPU caches are not snooped.
+ *
+ * DRM_XE_GEM_COH_AT_LEAST_1WAY:
+ *
+ * CPU-GPU coherency must be at least 1WAY.
+ *
+ * If 1WAY then GPU access is coherent with CPU (CPU caches are snooped)
+ * until GPU acquires. The acquire by the GPU is not tracked by CPU
+ * caches.
+ *
+ * If 2WAY then should be fully coherent between GPU and CPU. Fully
+ * tracked by CPU caches. Both CPU and GPU caches are snooped.
+ *
+ * Note: On dgpu the GPU device never caches system memory. The device
+ * should be thought of as always 1WAY coherent, with the addition that
+ * the GPU never caches system memory. At least on current dgpu HW there
+ * is no way to turn off snooping so likely the different coherency
+ * modes of the pat_index make no difference for system memory.
+ */
+#define DRM_XE_GEM_COH_NONE 1
+#define DRM_XE_GEM_COH_AT_LEAST_1WAY 2
+ __u16 coh_mode;
+
+ /**
+ * @cpu_caching: The CPU caching mode to select for this object. If
+ * mmaping the object the mode selected here will also be used.
+ *
+ * Supported values:
+ *
+ * DRM_XE_GEM_CPU_CACHING_WB: Allocate the pages with write-back caching.
+ * On iGPU this can't be used for scanout surfaces. The @coh_mode must
+ * be DRM_XE_GEM_COH_AT_LEAST_1WAY. Currently not allowed for objects placed
+ * in VRAM.
+ *
+ * DRM_XE_GEM_CPU_CACHING_WC: Allocate the pages as write-combined. This is
+ * uncached. Any @coh_mode is permitted. Scanout surfaces should likely
+ * use this. All objects that can be placed in VRAM must use this.
+ */
+#define DRM_XE_GEM_CPU_CACHING_WB 1
+#define DRM_XE_GEM_CPU_CACHING_WC 2
+ __u16 cpu_caching;
/** @reserved: Reserved */
__u64 reserved[2];
@@ -552,8 +598,49 @@ struct drm_xe_vm_bind_op {
*/
__u32 obj;
+ /**
+ * @pat_index: The platform defined @pat_index to use for this mapping.
+ * The index basically maps to some predefined memory attributes,
+ * including things like caching, coherency, compression etc. The exact
+ * meaning of the pat_index is platform specific and defined in the
+ * Bspec and PRMs. When the KMD sets up the binding the index here is
+ * encoded into the ppGTT PTE.
+ *
+	 * For coherency the @pat_index needs to be at least as coherent as
+	 * drm_xe_gem_create.coh_mode, i.e. coh_mode(pat_index) >=
+ * drm_xe_gem_create.coh_mode. The KMD will extract the coherency mode
+ * from the @pat_index and reject if there is a mismatch (see note below
+ * for pre-MTL platforms).
+ *
+ * Note: On pre-MTL platforms there is only a caching mode and no
+ * explicit coherency mode, but on such hardware there is always a
+ * shared-LLC (or is dgpu) so all GT memory accesses are coherent with
+ * CPU caches even with the caching mode set as uncached. It's only the
+ * display engine that is incoherent (on dgpu it must be in VRAM which
+ * is always mapped as WC on the CPU). However to keep the uapi somewhat
+ * consistent with newer platforms the KMD groups the different cache
+ * levels into the following coherency buckets on all pre-MTL platforms:
+ *
+ * ppGTT UC -> DRM_XE_GEM_COH_NONE
+ * ppGTT WC -> DRM_XE_GEM_COH_NONE
+ * ppGTT WT -> DRM_XE_GEM_COH_NONE
+ * ppGTT WB -> DRM_XE_GEM_COH_AT_LEAST_1WAY
+ *
+	 * In practice UC/WC/WT should only ever be used for scanout surfaces on
+ * such platforms (or perhaps in general for dma-buf if shared with
+ * another device) since it is only the display engine that is actually
+ * incoherent. Everything else should typically use WB given that we
+ * have a shared-LLC. On MTL+ this completely changes and the HW
+ * defines the coherency mode as part of the @pat_index, where
+ * incoherent GT access is possible.
+ *
+ * Note: For userptr and externally imported dma-buf the kernel expects
+ * either 1WAY or 2WAY for the @pat_index.
+ */
+ __u16 pat_index;
+
/** @pad: MBZ */
- __u32 pad;
+ __u16 pad;
union {
/**
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
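A minimal userspace sketch of how the new fields might be used, assuming the uapi above lands as-is. The xe_create_wb_bo() helper and the placement_flags parameter are illustrative only (placement selection follows the existing flags field), and error handling is reduced to returning -errno.

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include "xe_drm.h"

static int xe_create_wb_bo(int fd, __u32 vm_id, __u64 size,
			   __u32 placement_flags, __u32 *handle)
{
	struct drm_xe_gem_create create;

	memset(&create, 0, sizeof(create));
	create.vm_id = vm_id;
	create.size = size;
	create.flags = placement_flags;
	/* A WB CPU mapping requires at least 1WAY coherency */
	create.cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
	create.coh_mode = DRM_XE_GEM_COH_AT_LEAST_1WAY;

	if (ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create))
		return -errno;

	*handle = create.handle;
	return 0;
}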
* [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
` (9 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Display buffers will likely want WC, instead of the default WB, on the
CPU side, given that the display engine is incoherent with CPU caches.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_fb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index 54a66eb6a..f8a0db22c 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -1206,7 +1206,8 @@ static int create_bo_for_fb(struct igt_fb *fb, bool prefer_sysmem)
igt_assert(err == 0 || err == -EOPNOTSUPP);
} else if (is_xe_device(fd)) {
fb->gem_handle = xe_bo_create_flags(fd, 0, fb->size,
- visible_vram_if_possible(fd, 0));
+ visible_vram_if_possible(fd, 0) |
+ XE_GEM_CREATE_FLAG_SCANOUT);
} else if (is_vc4_device(fd)) {
fb->gem_handle = igt_vc4_create_bo(fd, fb->size);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: mark buffers as SCANOUT
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
` (8 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Display buffers will likely want WC, instead of the default WB, on the
CPU side, given that the display engine is incoherent with CPU caches.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_draw.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/igt_draw.c b/lib/igt_draw.c
index 476778a13..2332bf94a 100644
--- a/lib/igt_draw.c
+++ b/lib/igt_draw.c
@@ -791,7 +791,8 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
else
tmp.handle = xe_bo_create_flags(fd, 0,
ALIGN(tmp.size, xe_get_default_alignment(fd)),
- visible_vram_if_possible(fd, 0));
+ visible_vram_if_possible(fd, 0) |
+ XE_GEM_CREATE_FLAG_SCANOUT);
tmp.stride = rect->w * pixel_size;
tmp.bpp = buf->bpp;
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (2 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-09 22:04 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode Matthew Auld
` (7 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Most tests shouldn't care about such things, so it's likely just a case
of picking the most sane default. However, we also add some helpers for
the tests that do care.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/xe/xe_ioctl.c | 65 ++++++++++++++++++++++++++++++++++-------
lib/xe/xe_ioctl.h | 8 +++++
tests/intel/xe_create.c | 3 ++
3 files changed, 65 insertions(+), 11 deletions(-)
diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c
index 730dcfd16..80696aa59 100644
--- a/lib/xe/xe_ioctl.c
+++ b/lib/xe/xe_ioctl.c
@@ -233,13 +233,30 @@ void xe_vm_destroy(int fd, uint32_t vm)
igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_VM_DESTROY, &destroy), 0);
}
-uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
- uint32_t *handle)
+void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
+ uint16_t *cpu_caching,
+ uint16_t *coh_mode)
+{
+ if ((flags & all_memory_regions(fd)) != system_memory(fd) ||
+ flags & XE_GEM_CREATE_FLAG_SCANOUT) {
+ /* VRAM placements or scanout should always use WC */
+ *cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+ *coh_mode = DRM_XE_GEM_COH_NONE;
+ } else {
+ *cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ *coh_mode = DRM_XE_GEM_COH_AT_LEAST_1WAY;
+ }
+}
+
+static uint32_t ___xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode, uint32_t *handle)
{
struct drm_xe_gem_create create = {
.vm_id = vm,
.size = size,
.flags = flags,
+ .cpu_caching = cpu_caching,
+ .coh_mode = coh_mode,
};
int err;
@@ -249,6 +266,18 @@ uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags
*handle = create.handle;
return 0;
+
+}
+
+uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint32_t *handle)
+{
+ uint16_t cpu_caching, coh_mode;
+
+ __xe_default_coh_caching_from_flags(fd, flags, &cpu_caching, &coh_mode);
+
+ return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
+ handle);
}
uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
@@ -260,19 +289,33 @@ uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
return handle;
}
+uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode,
+ uint32_t *handle)
+{
+ return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
+ handle);
+}
+
+uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode)
+{
+ uint32_t handle;
+
+ igt_assert_eq(__xe_bo_create_caching(fd, vm, size, flags,
+ cpu_caching, coh_mode, &handle), 0);
+
+ return handle;
+}
+
uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size)
{
- struct drm_xe_gem_create create = {
- .vm_id = vm,
- .size = size,
- .flags = vram_if_possible(fd, gt),
- };
- int err;
+ uint32_t handle;
- err = igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create);
- igt_assert_eq(err, 0);
+ igt_assert_eq(__xe_bo_create_flags(fd, vm, size, vram_if_possible(fd, gt),
+ &handle), 0);
- return create.handle;
+ return handle;
}
uint32_t xe_bind_exec_queue_create(int fd, uint32_t vm, uint64_t ext)
diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
index 6c281b3bf..c18fc878c 100644
--- a/lib/xe/xe_ioctl.h
+++ b/lib/xe/xe_ioctl.h
@@ -67,6 +67,14 @@ void xe_vm_destroy(int fd, uint32_t vm);
uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
uint32_t *handle);
uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags);
+uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode,
+ uint32_t *handle);
+uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
+ uint16_t cpu_caching, uint16_t coh_mode);
+void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
+ uint16_t *cpu_caching,
+ uint16_t *coh_mode);
uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size);
uint32_t xe_exec_queue_create(int fd, uint32_t vm,
struct drm_xe_engine_class_instance *instance,
diff --git a/tests/intel/xe_create.c b/tests/intel/xe_create.c
index 8d845e5c8..f5d2cc1b2 100644
--- a/tests/intel/xe_create.c
+++ b/tests/intel/xe_create.c
@@ -30,6 +30,9 @@ static int __create_bo(int fd, uint32_t vm, uint64_t size, uint32_t flags,
igt_assert(handlep);
+ __xe_default_coh_caching_from_flags(fd, flags, &create.cpu_caching,
+ &create.coh_mode);
+
if (igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create)) {
ret = -errno;
errno = 0;
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
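A hedged sketch of how a test might use the new helpers, assuming they land as declared in xe_ioctl.h above; caching_examples() is a hypothetical function, fd is an open xe device and size is illustrative.

#include "igt.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"

static void caching_examples(int fd, uint64_t size)
{
	uint32_t bo_default, bo_wc;

	/* Default path: a plain sysmem placement picks WB + AT_LEAST_1WAY */
	bo_default = xe_bo_create_flags(fd, 0, size, system_memory(fd));

	/* Explicit path: force WC + COH_NONE on the same placement */
	bo_wc = xe_bo_create_caching(fd, 0, size, system_memory(fd),
				     DRM_XE_GEM_CPU_CACHING_WC,
				     DRM_XE_GEM_COH_NONE);

	gem_close(fd, bo_default);
	gem_close(fd, bo_wc);
}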
* [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (3 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes Matthew Auld
` (6 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Ensure the various invalid combinations are rejected. Also ensure we can
mmap and fault anything that is valid.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
tests/intel/xe_mmap.c | 77 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/tests/intel/xe_mmap.c b/tests/intel/xe_mmap.c
index 7e7e43c00..09e9c8aae 100644
--- a/tests/intel/xe_mmap.c
+++ b/tests/intel/xe_mmap.c
@@ -199,6 +199,80 @@ static void test_small_bar(int fd)
gem_close(fd, bo);
}
+static void assert_caching(int fd, uint64_t flags, uint16_t cpu_caching,
+ uint16_t coh_mode, bool fail)
+{
+ uint64_t size = xe_get_default_alignment(fd);
+ uint64_t mmo;
+ uint32_t handle;
+ uint32_t *map;
+ bool ret;
+
+ ret = __xe_bo_create_caching(fd, 0, size, flags, cpu_caching,
+ coh_mode, &handle);
+ igt_assert(ret == fail);
+
+ if (fail)
+ return;
+
+ mmo = xe_bo_mmap_offset(fd, handle);
+ map = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fd, mmo);
+ igt_assert(map != MAP_FAILED);
+ map[0] = 0xdeadbeaf;
+ gem_close(fd, handle);
+}
+
+/**
+ * SUBTEST: cpu-caching-coh
+ * Description: Test cpu_caching and coh, including mmap behaviour.
+ * Test category: functionality test
+ */
+static void test_cpu_caching(int fd)
+{
+ if (vram_memory(fd, 0)) {
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WC, DRM_XE_GEM_COH_NONE,
+ false);
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WC, DRM_XE_GEM_COH_AT_LEAST_1WAY,
+ false);
+ assert_caching(fd, vram_memory(fd, 0) | system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WC, DRM_XE_GEM_COH_NONE,
+ false);
+
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_NONE,
+ true);
+ assert_caching(fd, vram_memory(fd, 0),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_AT_LEAST_1WAY,
+ true);
+ assert_caching(fd, vram_memory(fd, 0) | system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_NONE,
+ true);
+ assert_caching(fd, vram_memory(fd, 0) | system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WB, DRM_XE_GEM_COH_AT_LEAST_1WAY,
+ true);
+ }
+
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WB,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY, false);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_NONE, false);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY, false);
+
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WB,
+ DRM_XE_GEM_COH_NONE, true);
+ assert_caching(fd, system_memory(fd), -1, -1, true);
+ assert_caching(fd, system_memory(fd), 0, 0, true);
+ assert_caching(fd, system_memory(fd), 0, DRM_XE_GEM_COH_AT_LEAST_1WAY, true);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC, 0, true);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC + 1,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY, true);
+ assert_caching(fd, system_memory(fd), DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY + 1, true);
+}
+
igt_main
{
int fd;
@@ -230,6 +304,9 @@ igt_main
test_small_bar(fd);
}
+ igt_subtest("cpu-caching-coh")
+ test_cpu_caching(fd);
+
igt_fixture
drm_close_driver(fd);
}
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (4 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
` (5 subsequent siblings)
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
For now just add uc, wt and wb for every platform. The wb mode should
always be at least 1way coherent when dealing with system memory. Also
make non-matching platforms throw an error, rather than trying to
inherit the modes from previous platforms, since they will likely be
different.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/intel_pat.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++
lib/intel_pat.h | 19 ++++++++++++
lib/meson.build | 1 +
3 files changed, 97 insertions(+)
create mode 100644 lib/intel_pat.c
create mode 100644 lib/intel_pat.h
diff --git a/lib/intel_pat.c b/lib/intel_pat.c
new file mode 100644
index 000000000..4d19d57ea
--- /dev/null
+++ b/lib/intel_pat.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include "intel_pat.h"
+
+#include "igt.h"
+
+struct intel_pat_cache {
+ uint8_t uc; /* UC + COH_NONE */
+ uint8_t wt; /* WT + COH_NONE */
+ uint8_t wb; /* WB + COH_AT_LEAST_1WAY */
+
+ uint8_t max_index;
+};
+
+static void intel_get_pat_idx(int fd, struct intel_pat_cache *pat)
+{
+ uint16_t dev_id = intel_get_drm_devid(fd);
+
+ if (intel_graphics_ver(dev_id) == IP_VER(20, 0)) {
+ pat->uc = 3;
+ pat->wt = 15;
+ pat->wb = 2;
+ pat->max_index = 31;
+ } else if (IS_METEORLAKE(dev_id)) {
+ pat->uc = 2;
+ pat->wt = 1;
+ pat->wb = 3;
+ pat->max_index = 3;
+ } else if (IS_PONTEVECCHIO(dev_id)) {
+ pat->uc = 0;
+ pat->wt = 2;
+ pat->wb = 3;
+ pat->max_index = 7;
+ } else if (intel_graphics_ver(dev_id) <= IP_VER(12, 60)) {
+ pat->uc = 3;
+ pat->wt = 2;
+ pat->wb = 0;
+ pat->max_index = 3;
+ } else {
+ igt_critical("Platform is missing PAT settings for uc/wt/wb\n");
+ }
+}
+
+uint8_t intel_get_max_pat_index(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.max_index;
+}
+
+uint8_t intel_get_pat_idx_uc(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.uc;
+}
+
+uint8_t intel_get_pat_idx_wt(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.wt;
+}
+
+uint8_t intel_get_pat_idx_wb(int fd)
+{
+ struct intel_pat_cache pat = {};
+
+ intel_get_pat_idx(fd, &pat);
+ return pat.wb;
+}
diff --git a/lib/intel_pat.h b/lib/intel_pat.h
new file mode 100644
index 000000000..c24dbc275
--- /dev/null
+++ b/lib/intel_pat.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef INTEL_PAT_H
+#define INTEL_PAT_H
+
+#include <stdint.h>
+
+#define DEFAULT_PAT_INDEX ((uint8_t)-1) /* igt-core can pick 1way or better */
+
+uint8_t intel_get_max_pat_index(int fd);
+
+uint8_t intel_get_pat_idx_uc(int fd);
+uint8_t intel_get_pat_idx_wt(int fd);
+uint8_t intel_get_pat_idx_wb(int fd);
+
+#endif /* INTEL_PAT_H */
diff --git a/lib/meson.build b/lib/meson.build
index a7bccafc3..48466a2e9 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -64,6 +64,7 @@ lib_sources = [
'intel_device_info.c',
'intel_mmio.c',
'intel_mocs.c',
+ 'intel_pat.c',
'ioctl_wrappers.c',
'media_spin.c',
'media_fill.c',
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
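A small sketch of the intended usage, assuming the helpers above; pick_pat_index() is a hypothetical wrapper, not part of the series.

#include <stdbool.h>
#include "intel_pat.h"

static uint8_t pick_pat_index(int fd, bool display_surface)
{
	/*
	 * Display is incoherent with CPU caches, so wt (or uc) is the safe
	 * choice there; everything else can use the 1way-or-better wb mode.
	 */
	return display_surface ? intel_get_pat_idx_wt(fd) :
				 intel_get_pat_idx_wb(fd);
}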
* [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (5 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-06 11:38 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
` (4 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
For some cases we are going to need to pass the pat_index for the
vm_bind op. Add a helper for this, such that we can allocate an address
and give the mapping some pat_index.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/intel_allocator.c | 43 +++++++++++++++++++++++--------
lib/intel_allocator.h | 5 +++-
lib/xe/xe_util.c | 1 +
lib/xe/xe_util.h | 1 +
tests/intel/api_intel_allocator.c | 4 ++-
5 files changed, 41 insertions(+), 13 deletions(-)
diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index f0a9b7fb5..da357b833 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -16,6 +16,7 @@
#include "igt_map.h"
#include "intel_allocator.h"
#include "intel_allocator_msgchannel.h"
+#include "intel_pat.h"
#include "xe/xe_query.h"
#include "xe/xe_util.h"
@@ -92,6 +93,7 @@ struct allocator_object {
uint32_t handle;
uint64_t offset;
uint64_t size;
+ uint8_t pat_index;
enum allocator_bind_op bind_op;
};
@@ -1122,14 +1124,14 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
static bool is_same(struct allocator_object *obj,
uint32_t handle, uint64_t offset, uint64_t size,
- enum allocator_bind_op bind_op)
+ uint8_t pat_index, enum allocator_bind_op bind_op)
{
return obj->handle == handle && obj->offset == offset && obj->size == size &&
- (obj->bind_op == bind_op || obj->bind_op == BOUND);
+ obj->pat_index == pat_index && (obj->bind_op == bind_op || obj->bind_op == BOUND);
}
static void track_object(uint64_t allocator_handle, uint32_t handle,
- uint64_t offset, uint64_t size,
+ uint64_t offset, uint64_t size, uint8_t pat_index,
enum allocator_bind_op bind_op)
{
struct ahnd_info *ainfo;
@@ -1156,6 +1158,9 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
if (ainfo->driver == INTEL_DRIVER_I915)
return; /* no-op for i915, at least for now */
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(ainfo->fd);
+
pthread_mutex_lock(&ainfo->bind_map_mutex);
obj = igt_map_search(ainfo->bind_map, &handle);
if (obj) {
@@ -1165,7 +1170,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
* bind_map.
*/
if (bind_op == TO_BIND) {
- igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
+ igt_assert_eq(is_same(obj, handle, offset, size, pat_index, bind_op), true);
} else if (bind_op == TO_UNBIND) {
if (obj->bind_op == TO_BIND)
igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
@@ -1181,6 +1186,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
obj->handle = handle;
obj->offset = offset;
obj->size = size;
+ obj->pat_index = pat_index;
obj->bind_op = bind_op;
igt_map_insert(ainfo->bind_map, &obj->handle, obj);
}
@@ -1204,7 +1210,7 @@ out:
*/
uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
uint64_t size, uint64_t alignment,
- enum allocator_strategy strategy)
+ uint8_t pat_index, enum allocator_strategy strategy)
{
struct alloc_req req = { .request_type = REQ_ALLOC,
.allocator_handle = allocator_handle,
@@ -1219,7 +1225,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
igt_assert(handle_request(&req, &resp) == 0);
igt_assert(resp.response_type == RESP_ALLOC);
- track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
+ track_object(allocator_handle, handle, resp.alloc.offset, size, pat_index,
+ TO_BIND);
return resp.alloc.offset;
}
@@ -1241,7 +1248,7 @@ uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
uint64_t offset;
offset = __intel_allocator_alloc(allocator_handle, handle,
- size, alignment,
+ size, alignment, DEFAULT_PAT_INDEX,
ALLOC_STRATEGY_NONE);
igt_assert(offset != ALLOC_INVALID_ADDRESS);
@@ -1268,7 +1275,8 @@ uint64_t intel_allocator_alloc_with_strategy(uint64_t allocator_handle,
uint64_t offset;
offset = __intel_allocator_alloc(allocator_handle, handle,
- size, alignment, strategy);
+ size, alignment, DEFAULT_PAT_INDEX,
+ strategy);
igt_assert(offset != ALLOC_INVALID_ADDRESS);
return offset;
@@ -1298,7 +1306,7 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
igt_assert(handle_request(&req, &resp) == 0);
igt_assert(resp.response_type == RESP_FREE);
- track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
+ track_object(allocator_handle, handle, 0, 0, 0, TO_UNBIND);
return resp.free.freed;
}
@@ -1500,16 +1508,17 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
if (obj->bind_op == BOUND)
continue;
- bind_info("= [vm: %u] %s => %u %lx %lx\n",
+ bind_info("= [vm: %u] %s => %u %lx %lx %u\n",
ainfo->vm,
obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
obj->handle, obj->offset,
- obj->size);
+ obj->size, obj->pat_index);
entry = malloc(sizeof(*entry));
entry->handle = obj->handle;
entry->offset = obj->offset;
entry->size = obj->size;
+ entry->pat_index = obj->pat_index;
entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
XE_OBJECT_UNBIND;
igt_list_add(&entry->link, &obj_list);
@@ -1534,6 +1543,18 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
}
}
+uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
+ uint64_t alignment, uint8_t pat_index)
+{
+ uint64_t offset;
+
+ offset = __intel_allocator_alloc(ahnd, handle, size, alignment,
+ pat_index, ALLOC_STRATEGY_NONE);
+ igt_assert(offset != ALLOC_INVALID_ADDRESS);
+
+ return offset;
+}
+
/**
* intel_allocator_bind:
* @allocator_handle: handle to an allocator
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index f9ff7f1cc..5da8af7f9 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -186,7 +186,7 @@ bool intel_allocator_close(uint64_t allocator_handle);
void intel_allocator_get_address_range(uint64_t allocator_handle,
uint64_t *startp, uint64_t *endp);
uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
- uint64_t size, uint64_t alignment,
+ uint64_t size, uint64_t alignment, uint8_t pat_index,
enum allocator_strategy strategy);
uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
uint64_t size, uint64_t alignment);
@@ -266,6 +266,9 @@ static inline bool put_ahnd(uint64_t ahnd)
return !ahnd || intel_allocator_close(ahnd);
}
+uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
+ uint64_t alignment, uint8_t pat_index);
+
static inline uint64_t get_offset(uint64_t ahnd, uint32_t handle,
uint64_t size, uint64_t alignment)
{
diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
index 2f9ffe2f1..8583326a9 100644
--- a/lib/xe/xe_util.c
+++ b/lib/xe/xe_util.c
@@ -145,6 +145,7 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct igt_list_head *obj_lis
ops->addr = obj->offset;
ops->range = obj->size;
ops->region = 0;
+ ops->pat_index = obj->pat_index;
bind_info(" [%d]: [%6s] handle: %u, offset: %llx, size: %llx\n",
i, obj->bind_op == XE_OBJECT_BIND ? "BIND" : "UNBIND",
diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
index e97d236b8..e3bdf3d11 100644
--- a/lib/xe/xe_util.h
+++ b/lib/xe/xe_util.h
@@ -36,6 +36,7 @@ struct xe_object {
uint32_t handle;
uint64_t offset;
uint64_t size;
+ uint8_t pat_index;
enum xe_bind_op bind_op;
struct igt_list_head link;
};
diff --git a/tests/intel/api_intel_allocator.c b/tests/intel/api_intel_allocator.c
index f3fcf8a34..d19be3ce9 100644
--- a/tests/intel/api_intel_allocator.c
+++ b/tests/intel/api_intel_allocator.c
@@ -9,6 +9,7 @@
#include "igt.h"
#include "igt_aux.h"
#include "intel_allocator.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -131,7 +132,8 @@ static void alloc_simple(int fd)
intel_allocator_get_address_range(ahnd, &start, &end);
offset0 = intel_allocator_alloc(ahnd, 1, end - start, 0);
- offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, ALLOC_STRATEGY_NONE);
+ offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, DEFAULT_PAT_INDEX,
+ ALLOC_STRATEGY_NONE);
igt_assert(offset1 == ALLOC_INVALID_ADDRESS);
intel_allocator_free(ahnd, 1);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
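A sketch of the expected flow, assuming the helper above; bind_as_uc() is a hypothetical wrapper, ahnd is an allocator handle already opened against an xe VM, and the final flush goes through the existing intel_allocator_bind() API.

#include "intel_allocator.h"
#include "intel_pat.h"
#include "xe/xe_query.h"

static uint64_t bind_as_uc(int fd, uint64_t ahnd, uint32_t handle, uint64_t size)
{
	/* Reserve an address and tag the mapping with the uc pat_index */
	uint64_t offset = get_offset_pat_index(ahnd, handle, size,
					       xe_get_default_alignment(fd),
					       intel_get_pat_idx_uc(fd));

	/* The pat_index is applied when the tracked binds are flushed */
	intel_allocator_bind(ahnd, 0, 0);

	return offset;
}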
* [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (6 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-06 11:51 ` [Intel-xe] [igt-dev] " Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
` (3 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
For the most part we can just use the default wb; however, some users,
including display, might want to use something else.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_fb.c | 2 ++
lib/intel_blt.c | 54 +++++++++++++++++++++------------
lib/intel_blt.h | 7 +++--
tests/intel/gem_ccs.c | 16 +++++-----
tests/intel/gem_lmem_swapping.c | 4 +--
tests/intel/xe_ccs.c | 19 +++++++-----
6 files changed, 64 insertions(+), 38 deletions(-)
diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index f8a0db22c..d290fd775 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -37,6 +37,7 @@
#include "i915/gem_mman.h"
#include "intel_blt.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
#include "igt_aux.h"
#include "igt_color_encoding.h"
#include "igt_fb.h"
@@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
blt_set_object(blt, handle, fb->size, memregion,
intel_get_uc_mocs(fb->fd),
+ intel_get_pat_idx_wt(fb->fd),
blt_tile,
is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
diff --git a/lib/intel_blt.c b/lib/intel_blt.c
index b55fa9b52..b7ac2902b 100644
--- a/lib/intel_blt.c
+++ b/lib/intel_blt.c
@@ -13,6 +13,7 @@
#include "igt.h"
#include "igt_syncobj.h"
#include "intel_blt.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
#include "xe/xe_util.h"
@@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
igt_assert_f(blt, "block-copy requires data to do blit\n");
alignment = get_default_alignment(fd, blt->driver);
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
- + blt->src.plane_offset;
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
- + blt->dst.plane_offset;
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index) +
+ blt->src.plane_offset;
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
+ alignment, blt->dst.pat_index) +
+ blt->dst.plane_offset;
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
fill_data(&data, blt, src_offset, dst_offset, ext);
@@ -884,8 +887,10 @@ int blt_block_copy(int fd,
igt_assert_neq(blt->driver, 0);
alignment = get_default_alignment(fd, blt->driver);
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
+ alignment, blt->dst.pat_index);
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
@@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
data.dw00.length = 0x3;
- src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
- dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
+ alignment, surf->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
+ alignment, surf->dst.pat_index);
bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
data.dw01.src_address_lo = src_offset;
@@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
igt_assert_neq(surf->driver, 0);
alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
- src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
- dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
+ alignment, surf->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
+ alignment, surf->dst.pat_index);
bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
@@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
data.dw03.dst_x2 = blt->dst.x2;
data.dw03.dst_y2 = blt->dst.y2;
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
- + blt->src.plane_offset;
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
- + blt->dst.plane_offset;
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index) +
+ blt->src.plane_offset;
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
+ blt->dst.pat_index) +
+ blt->dst.plane_offset;
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
data.dw04.dst_address_lo = dst_offset;
@@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
igt_assert_neq(blt->driver, 0);
alignment = get_default_alignment(fd, blt->driver);
- src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
- dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+ src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
+ alignment, blt->src.pat_index);
+ dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
+ alignment, blt->dst.pat_index);
bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
emit_blt_fast_copy(fd, ahnd, blt, 0, true);
@@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
&size, region) == 0);
}
- blt_set_object(obj, handle, size, region, mocs, tiling,
+ blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
compression, compression_type);
blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
@@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
void blt_set_object(struct blt_copy_object *obj,
uint32_t handle, uint64_t size, uint32_t region,
- uint8_t mocs, enum blt_tiling_type tiling,
+ uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
enum blt_compression compression,
enum blt_compression_type compression_type)
{
@@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
obj->size = size;
obj->region = region;
obj->mocs = mocs;
+ obj->pat_index = pat_index;
obj->tiling = tiling;
obj->compression = compression;
obj->compression_type = compression_type;
@@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
uint32_t handle, uint32_t region, uint64_t size,
- uint8_t mocs, enum blt_access_type access_type)
+ uint8_t mocs, uint8_t pat_index,
+ enum blt_access_type access_type)
{
obj->handle = handle;
obj->region = region;
obj->size = size;
obj->mocs = mocs;
+ obj->pat_index = pat_index;
obj->access_type = access_type;
}
diff --git a/lib/intel_blt.h b/lib/intel_blt.h
index d9c8883c7..f8423a986 100644
--- a/lib/intel_blt.h
+++ b/lib/intel_blt.h
@@ -79,6 +79,7 @@ struct blt_copy_object {
uint32_t region;
uint64_t size;
uint8_t mocs;
+ uint8_t pat_index;
enum blt_tiling_type tiling;
enum blt_compression compression; /* BC only */
enum blt_compression_type compression_type; /* BC only */
@@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
uint32_t region;
uint64_t size;
uint8_t mocs;
+ uint8_t pat_index;
enum blt_access_type access_type;
};
@@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
void blt_destroy_object(int fd, struct blt_copy_object *obj);
void blt_set_object(struct blt_copy_object *obj,
uint32_t handle, uint64_t size, uint32_t region,
- uint8_t mocs, enum blt_tiling_type tiling,
+ uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
enum blt_compression compression,
enum blt_compression_type compression_type);
void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
@@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
const struct blt_copy_object *orig);
void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
uint32_t handle, uint32_t region, uint64_t size,
- uint8_t mocs, enum blt_access_type access_type);
+ uint8_t mocs, uint8_t pat_index,
+ enum blt_access_type access_type);
void blt_surface_info(const char *info,
const struct blt_copy_object *obj);
diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
index f5d4ab359..a98557b72 100644
--- a/tests/intel/gem_ccs.c
+++ b/tests/intel/gem_ccs.c
@@ -15,6 +15,7 @@
#include "lib/intel_chipset.h"
#include "intel_blt.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
/**
* TEST: gem ccs
* Description: Exercise gen12 blitter with and without flatccs compression
@@ -111,9 +112,9 @@ static void surf_copy(int i915,
blt_ctrl_surf_copy_init(i915, &surf);
surf.print_bb = param.print_bb;
blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
- uc_mocs, BLT_INDIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
- uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
bb_size = 4096;
igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
@@ -133,7 +134,7 @@ static void surf_copy(int i915,
igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
- 0, DIRECT_ACCESS);
+ 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
gem_sync(i915, surf.dst.handle);
@@ -155,9 +156,9 @@ static void surf_copy(int i915,
for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
ccsmap[i] = i;
blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
- uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
- uc_mocs, INDIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
blt_copy_init(i915, &blt);
@@ -399,7 +400,8 @@ static void block_copy(int i915,
blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
if (config->inplace) {
blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
- T_LINEAR, COMPRESSION_DISABLED, comp_type);
+ DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
+ comp_type);
blt.dst.ptr = mid->ptr;
}
@@ -475,7 +477,7 @@ static void block_multicopy(int i915,
if (config->inplace) {
blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
- mid->mocs, mid_tiling, COMPRESSION_DISABLED,
+ mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
comp_type);
blt3.dst.ptr = mid->ptr;
}
diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
index ede545c92..7f2ab8bb6 100644
--- a/tests/intel/gem_lmem_swapping.c
+++ b/tests/intel/gem_lmem_swapping.c
@@ -486,7 +486,7 @@ static void __do_evict(int i915,
INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
blt_set_object(tmp, tmp->handle, params->size.max,
INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
- intel_get_uc_mocs(i915), T_LINEAR,
+ intel_get_uc_mocs(i915), 0, T_LINEAR,
COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
}
@@ -516,7 +516,7 @@ static void __do_evict(int i915,
obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
igt_assert(obj->blt_obj);
blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
- intel_get_uc_mocs(i915), T_LINEAR,
+ intel_get_uc_mocs(i915), 0, T_LINEAR,
COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
index 20bbc4448..27859d5ce 100644
--- a/tests/intel/xe_ccs.c
+++ b/tests/intel/xe_ccs.c
@@ -13,6 +13,7 @@
#include "igt_syncobj.h"
#include "intel_blt.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
#include "xe/xe_util.h"
@@ -108,8 +109,9 @@ static void surf_copy(int xe,
blt_ctrl_surf_copy_init(xe, &surf);
surf.print_bb = param.print_bb;
blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
- uc_mocs, BLT_INDIRECT_ACCESS);
- blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
+ blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
+ DEFAULT_PAT_INDEX, DIRECT_ACCESS);
bb_size = xe_get_default_alignment(xe);
bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
@@ -130,7 +132,7 @@ static void surf_copy(int xe,
igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
- 0, DIRECT_ACCESS);
+ 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
intel_ctx_xe_sync(ctx, true);
@@ -153,9 +155,9 @@ static void surf_copy(int xe,
for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
ccsmap[i] = i;
blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
- uc_mocs, DIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
- uc_mocs, INDIRECT_ACCESS);
+ uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
intel_ctx_xe_sync(ctx, true);
@@ -369,7 +371,8 @@ static void block_copy(int xe,
blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
if (config->inplace) {
blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
- T_LINEAR, COMPRESSION_DISABLED, comp_type);
+ DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
+ comp_type);
blt.dst.ptr = mid->ptr;
}
@@ -450,8 +453,8 @@ static void block_multicopy(int xe,
if (config->inplace) {
blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
- mid->mocs, mid_tiling, COMPRESSION_DISABLED,
- comp_type);
+ mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
+ COMPRESSION_DISABLED, comp_type);
blt3.dst.ptr = mid->ptr;
}
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
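A sketch of the updated blt_set_object() call with the extra pat_index argument, mirroring what blt_fb_init() does above; init_scanout_blt_obj() is a hypothetical helper and the geometry values are illustrative.

#include "intel_blt.h"
#include "intel_mocs.h"
#include "intel_pat.h"

static void init_scanout_blt_obj(int fd, struct blt_copy_object *obj,
				 uint32_t handle, uint64_t size,
				 uint32_t region, uint32_t stride,
				 uint32_t width, uint32_t height)
{
	/* Display surfaces want an incoherent mode, so use the wt pat_index */
	blt_set_object(obj, handle, size, region,
		       intel_get_uc_mocs(fd),
		       intel_get_pat_idx_wt(fd),
		       T_LINEAR,
		       COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
}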
* [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: support pat_index
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (7 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-06 12:13 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index Matthew Auld
` (2 subsequent siblings)
11 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Some users need to be able to select their own pat_index. Some display
tests use igt_draw, which in turn uses intel_batchbuffer and intel_buf.
We also have a couple more display tests using these interfaces
directly. The idea is to select wt/uc for anything display related, but
also allow any test to select a pat_index for a given intel_buf.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/igt_draw.c | 7 +++++-
lib/igt_fb.c | 3 ++-
lib/intel_allocator.c | 1 +
lib/intel_allocator.h | 1 +
lib/intel_batchbuffer.c | 51 ++++++++++++++++++++++++++++++---------
lib/intel_bufops.c | 29 +++++++++++++++-------
lib/intel_bufops.h | 9 +++++--
tests/intel/kms_big_fb.c | 4 ++-
tests/intel/kms_dirtyfb.c | 7 ++++--
tests/intel/kms_psr.c | 4 ++-
tests/intel/xe_intel_bb.c | 3 ++-
11 files changed, 89 insertions(+), 30 deletions(-)
diff --git a/lib/igt_draw.c b/lib/igt_draw.c
index 2332bf94a..8db71ce5e 100644
--- a/lib/igt_draw.c
+++ b/lib/igt_draw.c
@@ -31,6 +31,7 @@
#include "intel_batchbuffer.h"
#include "intel_chipset.h"
#include "intel_mocs.h"
+#include "intel_pat.h"
#include "igt_core.h"
#include "igt_fb.h"
#include "ioctl_wrappers.h"
@@ -75,6 +76,7 @@ struct buf_data {
uint32_t size;
uint32_t stride;
int bpp;
+ uint8_t pat_index;
};
struct rect {
@@ -658,7 +660,8 @@ static struct intel_buf *create_buf(int fd, struct buf_ops *bops,
width, height, from->bpp, 0,
tiling, 0,
size, 0,
- region);
+ region,
+ from->pat_index);
/* Make sure we close handle on destroy path */
intel_buf_set_ownership(buf, true);
@@ -785,6 +788,7 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
igt_skip_on(!rendercopy);
/* We create a temporary buffer and copy from it using rendercopy. */
+ tmp.pat_index = buf->pat_index;
tmp.size = rect->w * rect->h * pixel_size;
if (is_i915_device(fd))
tmp.handle = gem_create(fd, tmp.size);
@@ -852,6 +856,7 @@ void igt_draw_rect(int fd, struct buf_ops *bops, uint32_t ctx,
.size = buf_size,
.stride = buf_stride,
.bpp = bpp,
+ .pat_index = intel_get_pat_idx_wt(fd),
};
struct rect rect = {
.x = rect_x,
diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index d290fd775..61384c553 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -2637,7 +2637,8 @@ igt_fb_create_intel_buf(int fd, struct buf_ops *bops,
igt_fb_mod_to_tiling(fb->modifier),
compression, fb->size,
fb->strides[0],
- region);
+ region,
+ intel_get_pat_idx_wt(fd));
intel_buf_set_name(buf, name);
/* Make sure we close handle on destroy path */
diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index da357b833..b3e5c0226 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -1449,6 +1449,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
uint32_t handle,
uint64_t size, uint64_t offset,
+ uint8_t pat_index,
bool *is_allocatedp)
{
struct alloc_req req = { .request_type = REQ_RESERVE_IF_NOT_ALLOCATED,
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index 5da8af7f9..d93c5828d 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -206,6 +206,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
uint32_t handle,
uint64_t size, uint64_t offset,
+ uint8_t pat_index,
bool *is_allocatedp);
void intel_allocator_print(uint64_t allocator_handle);
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index e7b1b755f..eaaf667ea 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -38,6 +38,7 @@
#include "intel_batchbuffer.h"
#include "intel_bufops.h"
#include "intel_chipset.h"
+#include "intel_pat.h"
#include "media_fill.h"
#include "media_spin.h"
#include "sw_sync.h"
@@ -825,15 +826,18 @@ static void __reallocate_objects(struct intel_bb *ibb)
static inline uint64_t __intel_bb_get_offset(struct intel_bb *ibb,
uint32_t handle,
uint64_t size,
- uint32_t alignment)
+ uint32_t alignment,
+ uint8_t pat_index)
{
uint64_t offset;
if (ibb->enforce_relocs)
return 0;
- offset = intel_allocator_alloc(ibb->allocator_handle,
- handle, size, alignment);
+ offset = __intel_allocator_alloc(ibb->allocator_handle, handle,
+ size, alignment, pat_index,
+ ALLOC_STRATEGY_NONE);
+ igt_assert(offset != ALLOC_INVALID_ADDRESS);
return offset;
}
@@ -1300,11 +1304,14 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct intel_bb *ibb,
ops->op = op;
ops->obj_offset = 0;
ops->addr = objects[i]->offset;
- ops->range = objects[i]->rsvd1;
+ ops->range = objects[i]->rsvd1 & ~(4096-1);
ops->region = region;
+ if (set_obj)
+ ops->pat_index = objects[i]->rsvd1 & (4096-1);
- igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx\n",
- i, ops->obj, (long long)ops->addr, (long long)ops->range);
+ igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx pat_index: %u\n",
+ i, ops->obj, (long long)ops->addr, (long long)ops->range,
+ ops->pat_index);
}
return bind_ops;
@@ -1409,7 +1416,8 @@ void intel_bb_reset(struct intel_bb *ibb, bool purge_objects_cache)
ibb->batch_offset = __intel_bb_get_offset(ibb,
ibb->handle,
ibb->size,
- ibb->alignment);
+ ibb->alignment,
+ DEFAULT_PAT_INDEX);
intel_bb_add_object(ibb, ibb->handle, ibb->size,
ibb->batch_offset,
@@ -1645,7 +1653,8 @@ static void __remove_from_objects(struct intel_bb *ibb,
*/
static struct drm_i915_gem_exec_object2 *
__intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
- uint64_t offset, uint64_t alignment, bool write)
+ uint64_t offset, uint64_t alignment, uint8_t pat_index,
+ bool write)
{
struct drm_i915_gem_exec_object2 *object;
@@ -1661,6 +1670,9 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
object = __add_to_cache(ibb, handle);
__add_to_objects(ibb, object);
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(ibb->fd);
+
/*
* If object->offset == INVALID_ADDRESS we added freshly object to the
* cache. In that case we have two choices:
@@ -1670,7 +1682,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
if (INVALID_ADDR(object->offset)) {
if (INVALID_ADDR(offset)) {
offset = __intel_bb_get_offset(ibb, handle, size,
- alignment);
+ alignment, pat_index);
} else {
offset = offset & (ibb->gtt_size - 1);
@@ -1683,6 +1695,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
reserved = intel_allocator_reserve_if_not_allocated(ibb->allocator_handle,
handle, size, offset,
+ pat_index,
&allocated);
igt_assert_f(allocated || reserved,
"Can't get offset, allocated: %d, reserved: %d\n",
@@ -1721,6 +1734,18 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
if (ibb->driver == INTEL_DRIVER_XE) {
object->alignment = alignment;
object->rsvd1 = size;
+ igt_assert(!(size & (4096-1)));
+
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(ibb->fd);
+
+ /*
+ * XXX: For now encode the pat_index in the first few bits of
+ * rsvd1. intel_batchbuffer should really stop using the i915
+ * drm_i915_gem_exec_object2 to encode VMA placement
+ * information on xe...
+ */
+ object->rsvd1 |= pat_index;
}
return object;
@@ -1733,7 +1758,7 @@ intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
struct drm_i915_gem_exec_object2 *obj = NULL;
obj = __intel_bb_add_object(ibb, handle, size, offset,
- alignment, write);
+ alignment, DEFAULT_PAT_INDEX, write);
igt_assert(obj);
return obj;
@@ -1795,8 +1820,10 @@ __intel_bb_add_intel_buf(struct intel_bb *ibb, struct intel_buf *buf,
}
}
- obj = intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
- buf->addr.offset, alignment, write);
+ obj = __intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
+ buf->addr.offset, alignment, buf->pat_index,
+ write);
+ igt_assert(obj);
buf->addr.offset = obj->offset;
if (igt_list_empty(&buf->link)) {
diff --git a/lib/intel_bufops.c b/lib/intel_bufops.c
index 2c91adb88..fbee4748e 100644
--- a/lib/intel_bufops.c
+++ b/lib/intel_bufops.c
@@ -29,6 +29,7 @@
#include "igt.h"
#include "igt_x86.h"
#include "intel_bufops.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -818,7 +819,7 @@ static void __intel_buf_init(struct buf_ops *bops,
int width, int height, int bpp, int alignment,
uint32_t req_tiling, uint32_t compression,
uint64_t bo_size, int bo_stride,
- uint64_t region)
+ uint64_t region, uint8_t pat_index)
{
uint32_t tiling = req_tiling;
uint64_t size;
@@ -839,6 +840,10 @@ static void __intel_buf_init(struct buf_ops *bops,
IGT_INIT_LIST_HEAD(&buf->link);
buf->mocs = INTEL_BUF_MOCS_DEFAULT;
+ if (pat_index == DEFAULT_PAT_INDEX)
+ pat_index = intel_get_pat_idx_wb(bops->fd);
+ buf->pat_index = pat_index;
+
if (compression) {
igt_require(bops->intel_gen >= 9);
igt_assert(req_tiling == I915_TILING_Y ||
@@ -957,7 +962,7 @@ void intel_buf_init(struct buf_ops *bops,
region = bops->driver == INTEL_DRIVER_I915 ? I915_SYSTEM_MEMORY :
system_memory(bops->fd);
__intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
- tiling, compression, 0, 0, region);
+ tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
intel_buf_set_ownership(buf, true);
}
@@ -974,7 +979,7 @@ void intel_buf_init_in_region(struct buf_ops *bops,
uint64_t region)
{
__intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
- tiling, compression, 0, 0, region);
+ tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
intel_buf_set_ownership(buf, true);
}
@@ -1033,7 +1038,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
uint32_t req_tiling, uint32_t compression)
{
__intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
- req_tiling, compression, 0, 0, -1);
+ req_tiling, compression, 0, 0, -1, DEFAULT_PAT_INDEX);
}
/**
@@ -1050,6 +1055,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
* @size: real bo size
* @stride: bo stride
* @region: region
+ * @pat_index: pat_index to use for the binding (only used on xe)
*
* Function configures BO handle within intel_buf structure passed by the caller
* (with all its metadata - width, height, ...). Useful if BO was created
@@ -1067,10 +1073,12 @@ void intel_buf_init_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region)
+ uint64_t region,
+ uint8_t pat_index)
{
__intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
- req_tiling, compression, size, stride, region);
+ req_tiling, compression, size, stride, region,
+ pat_index);
}
/**
@@ -1149,7 +1157,8 @@ struct intel_buf *intel_buf_create_using_handle_and_size(struct buf_ops *bops,
int stride)
{
return intel_buf_create_full(bops, handle, width, height, bpp, alignment,
- req_tiling, compression, size, stride, -1);
+ req_tiling, compression, size, stride, -1,
+ DEFAULT_PAT_INDEX);
}
struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
@@ -1160,7 +1169,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region)
+ uint64_t region,
+ uint8_t pat_index)
{
struct intel_buf *buf;
@@ -1170,7 +1180,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
igt_assert(buf);
__intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
- req_tiling, compression, size, stride, region);
+ req_tiling, compression, size, stride, region,
+ pat_index);
return buf;
}
diff --git a/lib/intel_bufops.h b/lib/intel_bufops.h
index 4dfe4681c..b6048402b 100644
--- a/lib/intel_bufops.h
+++ b/lib/intel_bufops.h
@@ -63,6 +63,9 @@ struct intel_buf {
/* Content Protection*/
bool is_protected;
+ /* pat_index to use for mapping this buf. Only used in Xe. */
+ uint8_t pat_index;
+
/* For debugging purposes */
char name[INTEL_BUF_NAME_MAXSIZE + 1];
};
@@ -161,7 +164,8 @@ void intel_buf_init_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region);
+ uint64_t region,
+ uint8_t pat_index);
struct intel_buf *intel_buf_create(struct buf_ops *bops,
int width, int height,
@@ -192,7 +196,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
uint32_t compression,
uint64_t size,
int stride,
- uint64_t region);
+ uint64_t region,
+ uint8_t pat_index);
void intel_buf_destroy(struct intel_buf *buf);
static inline void intel_buf_set_pxp(struct intel_buf *buf, bool new_pxp_state)
diff --git a/tests/intel/kms_big_fb.c b/tests/intel/kms_big_fb.c
index 611e60896..854a77992 100644
--- a/tests/intel/kms_big_fb.c
+++ b/tests/intel/kms_big_fb.c
@@ -34,6 +34,7 @@
#include <string.h>
#include "i915/gem_create.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -88,7 +89,8 @@ static struct intel_buf *init_buf(data_t *data,
handle = gem_open(data->drm_fd, name);
buf = intel_buf_create_full(data->bops, handle, width, height,
bpp, 0, tiling, 0, size, 0,
- region);
+ region,
+ intel_get_pat_idx_wt(data->drm_fd));
intel_buf_set_name(buf, buf_name);
intel_buf_set_ownership(buf, true);
diff --git a/tests/intel/kms_dirtyfb.c b/tests/intel/kms_dirtyfb.c
index cc9529178..ec9b2a137 100644
--- a/tests/intel/kms_dirtyfb.c
+++ b/tests/intel/kms_dirtyfb.c
@@ -10,6 +10,7 @@
#include "i915/intel_drrs.h"
#include "i915/intel_fbc.h"
+#include "intel_pat.h"
#include "xe/xe_query.h"
@@ -246,14 +247,16 @@ static void run_test(data_t *data)
0,
igt_fb_mod_to_tiling(data->fbs[1].modifier),
0, 0, 0, is_xe_device(data->drm_fd) ?
- system_memory(data->drm_fd) : 0);
+ system_memory(data->drm_fd) : 0,
+ intel_get_pat_idx_wt(data->drm_fd));
dst = intel_buf_create_full(data->bops, data->fbs[2].gem_handle,
data->fbs[2].width,
data->fbs[2].height,
igt_drm_format_to_bpp(data->fbs[2].drm_format),
0, igt_fb_mod_to_tiling(data->fbs[2].modifier),
0, 0, 0, is_xe_device(data->drm_fd) ?
- system_memory(data->drm_fd) : 0);
+ system_memory(data->drm_fd) : 0,
+ intel_get_pat_idx_wt(data->drm_fd));
ibb = intel_bb_create(data->drm_fd, PAGE_SIZE);
spin = igt_spin_new(data->drm_fd, .ahnd = ibb->allocator_handle);
diff --git a/tests/intel/kms_psr.c b/tests/intel/kms_psr.c
index ffecc5222..9c6ecd829 100644
--- a/tests/intel/kms_psr.c
+++ b/tests/intel/kms_psr.c
@@ -31,6 +31,7 @@
#include "igt.h"
#include "igt_sysfs.h"
#include "igt_psr.h"
+#include "intel_pat.h"
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
@@ -356,7 +357,8 @@ static struct intel_buf *create_buf_from_fb(data_t *data,
name = gem_flink(data->drm_fd, fb->gem_handle);
handle = gem_open(data->drm_fd, name);
buf = intel_buf_create_full(data->bops, handle, width, height,
- bpp, 0, tiling, 0, size, stride, region);
+ bpp, 0, tiling, 0, size, stride, region,
+ intel_get_pat_idx_wt(data->drm_fd));
intel_buf_set_ownership(buf, true);
return buf;
diff --git a/tests/intel/xe_intel_bb.c b/tests/intel/xe_intel_bb.c
index 0159a3164..e2480acf8 100644
--- a/tests/intel/xe_intel_bb.c
+++ b/tests/intel/xe_intel_bb.c
@@ -19,6 +19,7 @@
#include "igt.h"
#include "igt_crc.h"
#include "intel_bufops.h"
+#include "intel_pat.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"
@@ -400,7 +401,7 @@ static void create_in_region(struct buf_ops *bops, uint64_t region)
intel_buf_init_full(bops, handle, &buf,
width/4, height, 32, 0,
I915_TILING_NONE, 0,
- size, 0, region);
+ size, 0, region, DEFAULT_PAT_INDEX);
intel_buf_set_ownership(&buf, true);
intel_bb_add_intel_buf(ibb, &buf, false);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (8 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests Matthew Auld
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Keep things minimal and select the 1way+ coherent (wb) mode by default on
all platforms. Users who need something else can use intel_buf,
get_offset_pat_index() etc., or call __xe_vm_bind() directly. Display
tests don't use this interface directly.
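As an illustration (sketch only, not part of the diff below; the vm, bo
and address values are placeholders), picking an explicit pat_index
versus falling back to the default looks roughly like:

static void bind_with_explicit_pat_index(int fd, uint32_t vm, uint32_t bo,
					 uint64_t addr, uint64_t size)
{
	/* Explicit pat_index, e.g. write-through for display-like usage. */
	igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, addr, size,
				   XE_VM_BIND_OP_MAP, NULL, 0, 0,
				   intel_get_pat_idx_wt(fd), 0), 0);
	xe_vm_unbind_sync(fd, vm, 0, addr, size);

	/* DEFAULT_PAT_INDEX resolves to the wb (1way+) mode inside __xe_vm_bind(). */
	igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, addr, size,
				   XE_VM_BIND_OP_MAP, NULL, 0, 0,
				   DEFAULT_PAT_INDEX, 0), 0);
	xe_vm_unbind_sync(fd, vm, 0, addr, size);
}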
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
lib/xe/xe_ioctl.c | 8 ++++++--
lib/xe/xe_ioctl.h | 2 +-
tests/intel/xe_vm.c | 4 +++-
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c
index 80696aa59..ebaed1e96 100644
--- a/lib/xe/xe_ioctl.c
+++ b/lib/xe/xe_ioctl.c
@@ -41,6 +41,7 @@
#include "config.h"
#include "drmtest.h"
#include "igt_syncobj.h"
+#include "intel_pat.h"
#include "ioctl_wrappers.h"
#include "xe_ioctl.h"
#include "xe_query.h"
@@ -92,7 +93,7 @@ void xe_vm_bind_array(int fd, uint32_t vm, uint32_t exec_queue,
int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
struct drm_xe_sync *sync, uint32_t num_syncs, uint32_t region,
- uint64_t ext)
+ uint8_t pat_index, uint64_t ext)
{
struct drm_xe_vm_bind bind = {
.extensions = ext,
@@ -107,6 +108,8 @@ int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
.num_syncs = num_syncs,
.syncs = (uintptr_t)sync,
.exec_queue_id = exec_queue,
+ .bind.pat_index = (pat_index == DEFAULT_PAT_INDEX) ?
+ intel_get_pat_idx_wb(fd) : pat_index,
};
if (igt_ioctl(fd, DRM_IOCTL_XE_VM_BIND, &bind))
@@ -121,7 +124,8 @@ void __xe_vm_bind_assert(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint32_t num_syncs, uint32_t region, uint64_t ext)
{
igt_assert_eq(__xe_vm_bind(fd, vm, exec_queue, bo, offset, addr, size,
- op, sync, num_syncs, region, ext), 0);
+ op, sync, num_syncs, region, DEFAULT_PAT_INDEX,
+ ext), 0);
}
void xe_vm_bind(int fd, uint32_t vm, uint32_t bo, uint64_t offset,
diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
index c18fc878c..cafbb011a 100644
--- a/lib/xe/xe_ioctl.h
+++ b/lib/xe/xe_ioctl.h
@@ -20,7 +20,7 @@ uint32_t xe_vm_create(int fd, uint32_t flags, uint64_t ext);
int __xe_vm_bind(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size, uint32_t op,
struct drm_xe_sync *sync, uint32_t num_syncs, uint32_t region,
- uint64_t ext);
+ uint8_t pat_index, uint64_t ext);
void __xe_vm_bind_assert(int fd, uint32_t vm, uint32_t exec_queue, uint32_t bo,
uint64_t offset, uint64_t addr, uint64_t size,
uint32_t op, struct drm_xe_sync *sync,
diff --git a/tests/intel/xe_vm.c b/tests/intel/xe_vm.c
index 4952ea786..ffb70973b 100644
--- a/tests/intel/xe_vm.c
+++ b/tests/intel/xe_vm.c
@@ -10,6 +10,7 @@
*/
#include "igt.h"
+#include "intel_pat.h"
#include "lib/igt_syncobj.h"
#include "lib/intel_reg.h"
#include "xe_drm.h"
@@ -316,7 +317,8 @@ static void userptr_invalid(int fd)
vm = xe_vm_create(fd, 0, 0);
munmap(data, size);
ret = __xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
- size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0, 0);
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ DEFAULT_PAT_INDEX, 0);
igt_assert(ret == -EFAULT);
xe_vm_destroy(fd, vm);
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (9 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests Matthew Auld
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: Nitish Kumar, intel-xe
Add some basic tests for pat_index and vm_bind.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Nitish Kumar <nitish.kumar@intel.com>
---
tests/intel/xe_pat.c | 483 +++++++++++++++++++++++++++++++++++++++++++
tests/meson.build | 1 +
2 files changed, 484 insertions(+)
create mode 100644 tests/intel/xe_pat.c
diff --git a/tests/intel/xe_pat.c b/tests/intel/xe_pat.c
new file mode 100644
index 000000000..9c5261b4a
--- /dev/null
+++ b/tests/intel/xe_pat.c
@@ -0,0 +1,483 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+/**
+ * TEST: Test for selecting per-VMA pat_index
+ * Category: Software building block
+ * Sub-category: VMA
+ * Functionality: pat_index
+ */
+
+#include "igt.h"
+#include "intel_blt.h"
+#include "intel_mocs.h"
+#include "intel_pat.h"
+
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "xe/xe_util.h"
+
+#define PAGE_SIZE 4096
+
+static bool do_slow_check;
+
+/**
+ * SUBTEST: userptr-coh-none
+ * Test category: functionality test
+ * Description: Test non-coherent pat_index on userptr
+ */
+static void userptr_coh_none(int fd)
+{
+ size_t size = xe_get_default_alignment(fd);
+ uint32_t vm;
+ void *data;
+
+ data = mmap(0, size, PROT_READ |
+ PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+ igt_assert(data != MAP_FAILED);
+
+ vm = xe_vm_create(fd, 0, 0);
+
+ /*
+ * Try some valid combinations first just to make sure we're not being
+ * swindled.
+ */
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ DEFAULT_PAT_INDEX, 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ intel_get_pat_idx_wb(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ /* And then some known COH_NONE pat_index combos which should fail. */
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ intel_get_pat_idx_uc(fd), 0),
+ -EINVAL);
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, 0, to_user_pointer(data), 0x40000,
+ size, XE_VM_BIND_OP_MAP_USERPTR, NULL, 0, 0,
+ intel_get_pat_idx_wt(fd), 0),
+ -EINVAL);
+
+ munmap(data, size);
+ xe_vm_destroy(fd, vm);
+}
+
+/**
+ * SUBTEST: pat-index-all
+ * Test category: functionality test
+ * Description: Test every pat_index
+ */
+static void pat_index_all(int fd)
+{
+ size_t size = xe_get_default_alignment(fd);
+ uint32_t vm, bo;
+ uint8_t pat_index;
+
+ vm = xe_vm_create(fd, 0, 0);
+
+ bo = xe_bo_create_caching(fd, 0, size, all_memory_regions(fd),
+ DRM_XE_GEM_CPU_CACHING_WC,
+ DRM_XE_GEM_COH_NONE);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_uc(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_wt(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_wb(fd), 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+
+ igt_assert(intel_get_max_pat_index(fd));
+
+ for (pat_index = 0; pat_index <= intel_get_max_pat_index(fd);
+ pat_index++) {
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ pat_index, 0),
+ 0);
+ xe_vm_unbind_sync(fd, vm, 0, 0x40000, size);
+ }
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ pat_index, 0),
+ -EINVAL);
+
+ gem_close(fd, bo);
+
+ /* Must be at least as coherent as the gem_create coh_mode. */
+ bo = xe_bo_create_caching(fd, 0, size, system_memory(fd),
+ DRM_XE_GEM_CPU_CACHING_WB,
+ DRM_XE_GEM_COH_AT_LEAST_1WAY);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_uc(fd), 0),
+ -EINVAL);
+
+ igt_assert_eq(__xe_vm_bind(fd, vm, 0, bo, 0, 0x40000,
+ size, XE_VM_BIND_OP_MAP, NULL, 0, 0,
+ intel_get_pat_idx_wt(fd), 0),
+ -EINVAL);
+
+ gem_close(fd, bo);
+
+ xe_vm_destroy(fd, vm);
+}
+
+/**
+ * SUBTEST: pat-index-common-blt
+ * Test category: functionality test
+ * Description: Check the common pat_index modes with blitter copy.
+ */
+
+static void pat_index_blt(int fd,
+ uint32_t r1, uint8_t r1_pat_index, uint16_t r1_coh_mode,
+ uint32_t r2, uint8_t r2_pat_index, uint16_t r2_coh_mode)
+{
+ struct drm_xe_engine_class_instance inst = {
+ .engine_class = DRM_XE_ENGINE_CLASS_COPY,
+ };
+ struct blt_copy_data blt = {};
+ struct blt_copy_object src = {};
+ struct blt_copy_object dst = {};
+ uint32_t vm, exec_queue, src_bo, dst_bo, bb;
+ uint32_t *src_map, *dst_map;
+ uint16_t r1_cpu_caching, r2_cpu_caching;
+ intel_ctx_t *ctx;
+ uint64_t ahnd;
+ int width = 512, height = 512;
+ int size, stride, bb_size;
+ int bpp = 32;
+ int i;
+
+ vm = xe_vm_create(fd, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
+ exec_queue = xe_exec_queue_create(fd, vm, &inst, 0);
+ ctx = intel_ctx_xe(fd, vm, exec_queue, 0, 0, 0);
+ ahnd = intel_allocator_open_full(fd, ctx->vm, 0, 0,
+ INTEL_ALLOCATOR_SIMPLE,
+ ALLOC_STRATEGY_LOW_TO_HIGH, 0);
+
+ bb_size = xe_get_default_alignment(fd);
+ bb = xe_bo_create_flags(fd, 0, bb_size, r1);
+
+ size = width * height * bpp / 8;
+ stride = width * 4;
+
+ if (r1_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY
+ && r1 == system_memory(fd))
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ if (r2_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY &&
+ r2 == system_memory(fd))
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ src_bo = xe_bo_create_caching(fd, 0, size, r1, r1_cpu_caching,
+ r1_coh_mode);
+ dst_bo = xe_bo_create_caching(fd, 0, size, r2, r2_cpu_caching,
+ r2_coh_mode);
+
+ blt_copy_init(fd, &blt);
+ blt.color_depth = CD_32bit;
+
+ blt_set_object(&src, src_bo, size, r1, intel_get_uc_mocs(fd),
+ r1_pat_index, T_LINEAR,
+ COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
+ blt_set_geom(&src, stride, 0, 0, width, height, 0, 0);
+
+ blt_set_object(&dst, dst_bo, size, r2, intel_get_uc_mocs(fd),
+ r2_pat_index, T_LINEAR,
+ COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
+ blt_set_geom(&dst, stride, 0, 0, width, height, 0, 0);
+
+ blt_set_copy_object(&blt.src, &src);
+ blt_set_copy_object(&blt.dst, &dst);
+ blt_set_batch(&blt.bb, bb, bb_size, r1);
+
+ src_map = xe_bo_map(fd, src_bo, size);
+ dst_map = xe_bo_map(fd, dst_bo, size);
+
+ /* Ensure we always see zeroes for the initial KMD zeroing */
+ blt_fast_copy(fd, ctx, NULL, ahnd, &blt);
+
+ /*
+ * Only sample random dword in every page if we are doing slow uncached
+ * reads from VRAM.
+ */
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], 0);
+
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], 0);
+ }
+
+ /* Write some values from the CPU, potentially dirtying the CPU cache */
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ src_map[i] = i;
+
+ /* And finally ensure we always see the CPU written values */
+ blt_fast_copy(fd, ctx, NULL, ahnd, &blt);
+
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], i);
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], i);
+ }
+
+ munmap(src_map, size);
+ munmap(dst_map, size);
+
+ gem_close(fd, src_bo);
+ gem_close(fd, dst_bo);
+ gem_close(fd, bb);
+
+ xe_exec_queue_destroy(fd, exec_queue);
+ xe_vm_destroy(fd, vm);
+
+ put_ahnd(ahnd);
+ intel_ctx_destroy(fd, ctx);
+}
+
+/**
+ * SUBTEST: pat-index-common-render
+ * Test category: functionality test
+ * Description: Check the common pat_index modes with render.
+ */
+
+static void pat_index_render(int fd,
+ uint32_t r1, uint8_t r1_pat_index, uint16_t r1_coh_mode,
+ uint32_t r2, uint8_t r2_pat_index, uint16_t r2_coh_mode)
+{
+ uint32_t devid = intel_get_drm_devid(fd);
+ igt_render_copyfunc_t render_copy = NULL;
+ int size, stride, width = 512, height = 512;
+ struct intel_buf src, dst;
+ struct intel_bb *ibb;
+ struct buf_ops *bops;
+ uint16_t r1_cpu_caching, r2_cpu_caching;
+ uint32_t src_bo, dst_bo;
+ uint32_t *src_map, *dst_map;
+ int bpp = 32;
+ int i;
+
+ bops = buf_ops_create(fd);
+
+ render_copy = igt_get_render_copyfunc(devid);
+ igt_assert(render_copy);
+
+ ibb = intel_bb_create(fd, xe_get_default_alignment(fd));
+
+ if (r1_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY
+ && r1 == system_memory(fd))
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r1_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ if (r2_coh_mode == DRM_XE_GEM_COH_AT_LEAST_1WAY &&
+ r2 == system_memory(fd))
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
+ else
+ r2_cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
+
+ size = width * height * bpp / 8;
+ stride = width * 4;
+
+ src_bo = xe_bo_create_caching(fd, 0, size, r1, r1_cpu_caching,
+ r1_coh_mode);
+ intel_buf_init_full(bops, src_bo, &src, width, height, bpp, 0,
+ I915_TILING_NONE, I915_COMPRESSION_NONE, size,
+ stride, r1, r1_pat_index);
+
+ dst_bo = xe_bo_create_caching(fd, 0, size, r2, r2_cpu_caching,
+ r2_coh_mode);
+ intel_buf_init_full(bops, dst_bo, &dst, width, height, bpp, 0,
+ I915_TILING_NONE, I915_COMPRESSION_NONE, size,
+ stride, r2, r2_pat_index);
+
+ src_map = xe_bo_map(fd, src_bo, size);
+ dst_map = xe_bo_map(fd, dst_bo, size);
+
+ /* Ensure we always see zeroes for the initial KMD zeroing */
+ render_copy(ibb,
+ &src,
+ 0, 0, width, height,
+ &dst,
+ 0, 0);
+ intel_bb_sync(ibb);
+
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], 0);
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], 0);
+ }
+
+ /* Write some values from the CPU, potentially dirtying the CPU cache */
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ src_map[i] = i;
+
+ /* And finally ensure we always see the CPU written values */
+ render_copy(ibb,
+ &src,
+ 0, 0, width, height,
+ &dst,
+ 0, 0);
+ intel_bb_sync(ibb);
+
+ if (!do_slow_check && r2 != system_memory(fd)) {
+ int dwords_page = PAGE_SIZE / sizeof(uint32_t);
+ int dword = rand() % dwords_page;
+
+ igt_debug("random dword: %d\n", dword);
+
+ for (i = dword; i < size / sizeof(uint32_t); i += dwords_page)
+ igt_assert_eq(dst_map[i], i);
+ } else {
+ for (i = 0; i < size / sizeof(uint32_t); i++)
+ igt_assert_eq(dst_map[i], i);
+ }
+
+ munmap(src_map, size);
+ munmap(dst_map, size);
+
+ intel_bb_destroy(ibb);
+
+ gem_close(fd, src_bo);
+ gem_close(fd, dst_bo);
+}
+
+const struct pat_index_entry {
+ uint8_t (*get_pat_index)(int fd);
+ const char *name;
+ uint16_t coh_mode;
+} common_pat_index_modes[] = {
+ { intel_get_pat_idx_uc, "uc", DRM_XE_GEM_COH_NONE },
+ { intel_get_pat_idx_wt, "wt", DRM_XE_GEM_COH_NONE },
+ { intel_get_pat_idx_wb, "wb", DRM_XE_GEM_COH_AT_LEAST_1WAY },
+};
+
+typedef void (*pat_index_fn)(int fd,
+ uint32_t r1, uint8_t r1_pat_index, uint16_t r1_coh_mode,
+ uint32_t r2, uint8_t r2_pat_index, uint16_t r2_coh_mode);
+
+static void subtest_pat_index_common_with_regions(int fd, pat_index_fn fn)
+{
+ struct igt_collection *common_pat_index_set;
+ struct igt_collection *regions_set;
+ struct igt_collection *regions;
+
+ common_pat_index_set =
+ igt_collection_create(ARRAY_SIZE(common_pat_index_modes));
+
+ regions_set = xe_get_memory_region_set(fd,
+ XE_MEM_REGION_CLASS_SYSMEM,
+ XE_MEM_REGION_CLASS_VRAM);
+
+ for_each_variation_r(regions, 2, regions_set) {
+ struct igt_collection *modes;
+ uint32_t r1, r2;
+ char *reg_str;
+
+ r1 = igt_collection_get_value(regions, 0);
+ r2 = igt_collection_get_value(regions, 1);
+
+ reg_str = xe_memregion_dynamic_subtest_name(fd, regions);
+
+ for_each_variation_r(modes, 2, common_pat_index_set) {
+ struct pat_index_entry r1_entry, r2_entry;
+ uint8_t r1_pat_index, r2_pat_index;
+ int r1_idx, r2_idx;
+
+ r1_idx = igt_collection_get_value(modes, 0);
+ r2_idx = igt_collection_get_value(modes, 1);
+
+ r1_entry = common_pat_index_modes[r1_idx];
+ r2_entry = common_pat_index_modes[r2_idx];
+
+ r1_pat_index = r1_entry.get_pat_index(fd);
+ r2_pat_index = r2_entry.get_pat_index(fd);
+
+ igt_dynamic_f("%s-%s-%s", reg_str, r1_entry.name, r2_entry.name)
+ fn(fd,
+ r1, r1_pat_index, r1_entry.coh_mode,
+ r2, r2_pat_index, r2_entry.coh_mode);
+ }
+
+ free(reg_str);
+ }
+}
+
+igt_main
+{
+ int fd;
+ uint32_t seed;
+
+ igt_fixture {
+ fd = drm_open_driver(DRIVER_XE);
+
+ seed = time(NULL);
+ igt_debug("seed: %d\n", seed);
+
+ xe_device_get(fd);
+ }
+
+ igt_subtest("pat-index-all")
+ pat_index_all(fd);
+
+ igt_subtest("userptr-coh-none")
+ userptr_coh_none(fd);
+
+ igt_subtest_with_dynamic("pat-index-common-blt") {
+ igt_require(blt_has_fast_copy(fd));
+ subtest_pat_index_common_with_regions(fd, pat_index_blt);
+ }
+
+ igt_subtest_with_dynamic("pat-index-common-render") {
+ igt_require(xe_has_engine_class(fd, DRM_XE_ENGINE_CLASS_RENDER));
+ subtest_pat_index_common_with_regions(fd, pat_index_render);
+ }
+
+ igt_fixture
+ drm_close_driver(fd);
+}
diff --git a/tests/meson.build b/tests/meson.build
index 2404b2d4a..61351be04 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -296,6 +296,7 @@ intel_xe_progs = [
'xe_mmio',
'xe_module_load',
'xe_noexec_ping_pong',
+ 'xe_pat',
'xe_pm',
'xe_pm_residency',
'xe_prime_self_import',
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
` (10 preceding siblings ...)
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests Matthew Auld
@ 2023-10-05 15:31 ` Matthew Auld
11 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2023-10-05 15:31 UTC (permalink / raw)
To: igt-dev; +Cc: intel-xe
Add the various pat_index, coh_mode and cpu_caching related tests to
BAT.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
---
tests/intel-ci/xe-fast-feedback.testlist | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tests/intel-ci/xe-fast-feedback.testlist b/tests/intel-ci/xe-fast-feedback.testlist
index 610cc958c..c41be52a6 100644
--- a/tests/intel-ci/xe-fast-feedback.testlist
+++ b/tests/intel-ci/xe-fast-feedback.testlist
@@ -138,6 +138,7 @@ igt@xe_intel_bb@simple-bb-ctx
igt@xe_mmap@bad-extensions
igt@xe_mmap@bad-flags
igt@xe_mmap@bad-object
+igt@xe_mmap@cpu-caching-coh
igt@xe_mmap@system
igt@xe_mmap@vram
igt@xe_mmap@vram-system
@@ -180,6 +181,10 @@ igt@xe_vm@munmap-style-unbind-userptr-end
igt@xe_vm@munmap-style-unbind-userptr-front
igt@xe_vm@munmap-style-unbind-userptr-inval-end
igt@xe_vm@munmap-style-unbind-userptr-inval-front
+igt@xe_pat@pat-index-all
+igt@xe_pat@pat-index-common-blt
+igt@xe_pat@pat-index-common-render
+igt@xe_pat@userptr-coh-none
igt@xe_waitfence@abstime
igt@xe_waitfence@reltime
igt@kms_addfb_basic@addfb25-4-tiled
--
2.41.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
@ 2023-10-06 11:38 ` Zbigniew Kempczyński
0 siblings, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-06 11:38 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Thu, Oct 05, 2023 at 04:31:11PM +0100, Matthew Auld wrote:
> For some cases we are going to need to pass the pat_index for the
> vm_bind op. Add a helper for this, such that we can allocate an address
> and give the mapping some pat_index.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/intel_allocator.c | 43 +++++++++++++++++++++++--------
> lib/intel_allocator.h | 5 +++-
> lib/xe/xe_util.c | 1 +
> lib/xe/xe_util.h | 1 +
> tests/intel/api_intel_allocator.c | 4 ++-
> 5 files changed, 41 insertions(+), 13 deletions(-)
>
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index f0a9b7fb5..da357b833 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -16,6 +16,7 @@
> #include "igt_map.h"
> #include "intel_allocator.h"
> #include "intel_allocator_msgchannel.h"
> +#include "intel_pat.h"
> #include "xe/xe_query.h"
> #include "xe/xe_util.h"
>
> @@ -92,6 +93,7 @@ struct allocator_object {
> uint32_t handle;
> uint64_t offset;
> uint64_t size;
> + uint8_t pat_index;
>
> enum allocator_bind_op bind_op;
> };
> @@ -1122,14 +1124,14 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
>
> static bool is_same(struct allocator_object *obj,
> uint32_t handle, uint64_t offset, uint64_t size,
> - enum allocator_bind_op bind_op)
> + uint8_t pat_index, enum allocator_bind_op bind_op)
> {
> return obj->handle == handle && obj->offset == offset && obj->size == size &&
> - (obj->bind_op == bind_op || obj->bind_op == BOUND);
> + obj->pat_index == pat_index && (obj->bind_op == bind_op || obj->bind_op == BOUND);
> }
>
> static void track_object(uint64_t allocator_handle, uint32_t handle,
> - uint64_t offset, uint64_t size,
> + uint64_t offset, uint64_t size, uint8_t pat_index,
> enum allocator_bind_op bind_op)
> {
> struct ahnd_info *ainfo;
Code looks good to me; the only minor nitpick is to also add the pat
index to bind_debug() here. Be aware that the pat_index doesn't go down
into the allocator itself, only into the cache which tracks the
alloc()/free() data returned from the allocator that is needed to
bind/unbind. But I don't think that will be a problem.
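For reference, a possible caller of the new helper might look roughly
like this (my own sketch, not taken from the series; the uncached
pat_index is just an arbitrary choice):

static uint64_t alloc_offset_uc(int fd, uint64_t ahnd, uint32_t handle,
				uint64_t size)
{
	/* Reserve an address and record the uc pat_index for the later bind. */
	return get_offset_pat_index(ahnd, handle, size, 0,
				    intel_get_pat_idx_uc(fd));
}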
With above added:
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> @@ -1156,6 +1158,9 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
> if (ainfo->driver == INTEL_DRIVER_I915)
> return; /* no-op for i915, at least for now */
>
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(ainfo->fd);
> +
> pthread_mutex_lock(&ainfo->bind_map_mutex);
> obj = igt_map_search(ainfo->bind_map, &handle);
> if (obj) {
> @@ -1165,7 +1170,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
> * bind_map.
> */
> if (bind_op == TO_BIND) {
> - igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
> + igt_assert_eq(is_same(obj, handle, offset, size, pat_index, bind_op), true);
> } else if (bind_op == TO_UNBIND) {
> if (obj->bind_op == TO_BIND)
> igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
> @@ -1181,6 +1186,7 @@ static void track_object(uint64_t allocator_handle, uint32_t handle,
> obj->handle = handle;
> obj->offset = offset;
> obj->size = size;
> + obj->pat_index = pat_index;
> obj->bind_op = bind_op;
> igt_map_insert(ainfo->bind_map, &obj->handle, obj);
> }
> @@ -1204,7 +1210,7 @@ out:
> */
> uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> uint64_t size, uint64_t alignment,
> - enum allocator_strategy strategy)
> + uint8_t pat_index, enum allocator_strategy strategy)
> {
> struct alloc_req req = { .request_type = REQ_ALLOC,
> .allocator_handle = allocator_handle,
> @@ -1219,7 +1225,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> igt_assert(handle_request(&req, &resp) == 0);
> igt_assert(resp.response_type == RESP_ALLOC);
>
> - track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
> + track_object(allocator_handle, handle, resp.alloc.offset, size, pat_index,
> + TO_BIND);
>
> return resp.alloc.offset;
> }
> @@ -1241,7 +1248,7 @@ uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> uint64_t offset;
>
> offset = __intel_allocator_alloc(allocator_handle, handle,
> - size, alignment,
> + size, alignment, DEFAULT_PAT_INDEX,
> ALLOC_STRATEGY_NONE);
> igt_assert(offset != ALLOC_INVALID_ADDRESS);
>
> @@ -1268,7 +1275,8 @@ uint64_t intel_allocator_alloc_with_strategy(uint64_t allocator_handle,
> uint64_t offset;
>
> offset = __intel_allocator_alloc(allocator_handle, handle,
> - size, alignment, strategy);
> + size, alignment, DEFAULT_PAT_INDEX,
> + strategy);
> igt_assert(offset != ALLOC_INVALID_ADDRESS);
>
> return offset;
> @@ -1298,7 +1306,7 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
> igt_assert(handle_request(&req, &resp) == 0);
> igt_assert(resp.response_type == RESP_FREE);
>
> - track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
> + track_object(allocator_handle, handle, 0, 0, 0, TO_UNBIND);
>
> return resp.free.freed;
> }
> @@ -1500,16 +1508,17 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
> if (obj->bind_op == BOUND)
> continue;
>
> - bind_info("= [vm: %u] %s => %u %lx %lx\n",
> + bind_info("= [vm: %u] %s => %u %lx %lx %u\n",
> ainfo->vm,
> obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
> obj->handle, obj->offset,
> - obj->size);
> + obj->size, obj->pat_index);
>
> entry = malloc(sizeof(*entry));
> entry->handle = obj->handle;
> entry->offset = obj->offset;
> entry->size = obj->size;
> + entry->pat_index = obj->pat_index;
> entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
> XE_OBJECT_UNBIND;
> igt_list_add(&entry->link, &obj_list);
> @@ -1534,6 +1543,18 @@ static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t syn
> }
> }
>
> +uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
> + uint64_t alignment, uint8_t pat_index)
> +{
> + uint64_t offset;
> +
> + offset = __intel_allocator_alloc(ahnd, handle, size, alignment,
> + pat_index, ALLOC_STRATEGY_NONE);
> + igt_assert(offset != ALLOC_INVALID_ADDRESS);
> +
> + return offset;
> +}
> +
> /**
> * intel_allocator_bind:
> * @allocator_handle: handle to an allocator
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index f9ff7f1cc..5da8af7f9 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -186,7 +186,7 @@ bool intel_allocator_close(uint64_t allocator_handle);
> void intel_allocator_get_address_range(uint64_t allocator_handle,
> uint64_t *startp, uint64_t *endp);
> uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> - uint64_t size, uint64_t alignment,
> + uint64_t size, uint64_t alignment, uint8_t pat_index,
> enum allocator_strategy strategy);
> uint64_t intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> uint64_t size, uint64_t alignment);
> @@ -266,6 +266,9 @@ static inline bool put_ahnd(uint64_t ahnd)
> return !ahnd || intel_allocator_close(ahnd);
> }
>
> +uint64_t get_offset_pat_index(uint64_t ahnd, uint32_t handle, uint64_t size,
> + uint64_t alignment, uint8_t pat_index);
> +
> static inline uint64_t get_offset(uint64_t ahnd, uint32_t handle,
> uint64_t size, uint64_t alignment)
> {
> diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
> index 2f9ffe2f1..8583326a9 100644
> --- a/lib/xe/xe_util.c
> +++ b/lib/xe/xe_util.c
> @@ -145,6 +145,7 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct igt_list_head *obj_lis
> ops->addr = obj->offset;
> ops->range = obj->size;
> ops->region = 0;
> + ops->pat_index = obj->pat_index;
>
> bind_info(" [%d]: [%6s] handle: %u, offset: %llx, size: %llx\n",
> i, obj->bind_op == XE_OBJECT_BIND ? "BIND" : "UNBIND",
> diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
> index e97d236b8..e3bdf3d11 100644
> --- a/lib/xe/xe_util.h
> +++ b/lib/xe/xe_util.h
> @@ -36,6 +36,7 @@ struct xe_object {
> uint32_t handle;
> uint64_t offset;
> uint64_t size;
> + uint8_t pat_index;
> enum xe_bind_op bind_op;
> struct igt_list_head link;
> };
> diff --git a/tests/intel/api_intel_allocator.c b/tests/intel/api_intel_allocator.c
> index f3fcf8a34..d19be3ce9 100644
> --- a/tests/intel/api_intel_allocator.c
> +++ b/tests/intel/api_intel_allocator.c
> @@ -9,6 +9,7 @@
> #include "igt.h"
> #include "igt_aux.h"
> #include "intel_allocator.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -131,7 +132,8 @@ static void alloc_simple(int fd)
>
> intel_allocator_get_address_range(ahnd, &start, &end);
> offset0 = intel_allocator_alloc(ahnd, 1, end - start, 0);
> - offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, ALLOC_STRATEGY_NONE);
> + offset1 = __intel_allocator_alloc(ahnd, 2, 4096, 0, DEFAULT_PAT_INDEX,
> + ALLOC_STRATEGY_NONE);
> igt_assert(offset1 == ALLOC_INVALID_ADDRESS);
> intel_allocator_free(ahnd, 1);
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [igt-dev] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
@ 2023-10-06 11:51 ` Zbigniew Kempczyński
2023-10-06 12:08 ` Matthew Auld
0 siblings, 1 reply; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-06 11:51 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Thu, Oct 05, 2023 at 04:31:12PM +0100, Matthew Auld wrote:
> For the most part we can just use the default wb, however some users
> including display might want to use something else.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_fb.c | 2 ++
> lib/intel_blt.c | 54 +++++++++++++++++++++------------
> lib/intel_blt.h | 7 +++--
> tests/intel/gem_ccs.c | 16 +++++-----
> tests/intel/gem_lmem_swapping.c | 4 +--
> tests/intel/xe_ccs.c | 19 +++++++-----
> 6 files changed, 64 insertions(+), 38 deletions(-)
>
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> index f8a0db22c..d290fd775 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -37,6 +37,7 @@
> #include "i915/gem_mman.h"
> #include "intel_blt.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> #include "igt_aux.h"
> #include "igt_color_encoding.h"
> #include "igt_fb.h"
> @@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
>
> blt_set_object(blt, handle, fb->size, memregion,
> intel_get_uc_mocs(fb->fd),
> + intel_get_pat_idx_wt(fb->fd),
> blt_tile,
> is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
> is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> index b55fa9b52..b7ac2902b 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -13,6 +13,7 @@
> #include "igt.h"
> #include "igt_syncobj.h"
> #include "intel_blt.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
> #include "xe/xe_util.h"
> @@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
> igt_assert_f(blt, "block-copy requires data to do blit\n");
>
> alignment = get_default_alignment(fd, blt->driver);
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> - + blt->src.plane_offset;
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> - + blt->dst.plane_offset;
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index) +
> + blt->src.plane_offset;
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> + alignment, blt->dst.pat_index) +
> + blt->dst.plane_offset;
The tab count looks off in the formatting of the src and dst plane_offset continuation lines.
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> fill_data(&data, blt, src_offset, dst_offset, ext);
> @@ -884,8 +887,10 @@ int blt_block_copy(int fd,
> igt_assert_neq(blt->driver, 0);
>
> alignment = get_default_alignment(fd, blt->driver);
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> + alignment, blt->dst.pat_index);
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
> @@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> data.dw00.length = 0x3;
>
> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> + alignment, surf->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> + alignment, surf->dst.pat_index);
> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>
> data.dw01.src_address_lo = src_offset;
> @@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
> igt_assert_neq(surf->driver, 0);
>
> alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> + alignment, surf->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> + alignment, surf->dst.pat_index);
> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>
> emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
> @@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
> data.dw03.dst_x2 = blt->dst.x2;
> data.dw03.dst_y2 = blt->dst.y2;
>
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> - + blt->src.plane_offset;
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> - + blt->dst.plane_offset;
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index) +
> + blt->src.plane_offset;
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
> + blt->dst.pat_index) +
> + blt->dst.plane_offset;
Ditto.
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> data.dw04.dst_address_lo = dst_offset;
> @@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
> igt_assert_neq(blt->driver, 0);
>
> alignment = get_default_alignment(fd, blt->driver);
> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> + alignment, blt->src.pat_index);
> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> + alignment, blt->dst.pat_index);
> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>
> emit_blt_fast_copy(fd, ahnd, blt, 0, true);
> @@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> &size, region) == 0);
> }
I think blt_create_object() should also have the pat_index passed as an
argument.
Rest looks ok.
--
Zbigniew
>
> - blt_set_object(obj, handle, size, region, mocs, tiling,
> + blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
> compression, compression_type);
> blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>
> @@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
>
> void blt_set_object(struct blt_copy_object *obj,
> uint32_t handle, uint64_t size, uint32_t region,
> - uint8_t mocs, enum blt_tiling_type tiling,
> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> enum blt_compression compression,
> enum blt_compression_type compression_type)
> {
> @@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
> obj->size = size;
> obj->region = region;
> obj->mocs = mocs;
> + obj->pat_index = pat_index;
> obj->tiling = tiling;
> obj->compression = compression;
> obj->compression_type = compression_type;
> @@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
>
> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> uint32_t handle, uint32_t region, uint64_t size,
> - uint8_t mocs, enum blt_access_type access_type)
> + uint8_t mocs, uint8_t pat_index,
> + enum blt_access_type access_type)
> {
> obj->handle = handle;
> obj->region = region;
> obj->size = size;
> obj->mocs = mocs;
> + obj->pat_index = pat_index;
> obj->access_type = access_type;
> }
>
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> index d9c8883c7..f8423a986 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -79,6 +79,7 @@ struct blt_copy_object {
> uint32_t region;
> uint64_t size;
> uint8_t mocs;
> + uint8_t pat_index;
> enum blt_tiling_type tiling;
> enum blt_compression compression; /* BC only */
> enum blt_compression_type compression_type; /* BC only */
> @@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
> uint32_t region;
> uint64_t size;
> uint8_t mocs;
> + uint8_t pat_index;
> enum blt_access_type access_type;
> };
>
> @@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> void blt_destroy_object(int fd, struct blt_copy_object *obj);
> void blt_set_object(struct blt_copy_object *obj,
> uint32_t handle, uint64_t size, uint32_t region,
> - uint8_t mocs, enum blt_tiling_type tiling,
> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> enum blt_compression compression,
> enum blt_compression_type compression_type);
> void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
> @@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
> const struct blt_copy_object *orig);
> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> uint32_t handle, uint32_t region, uint64_t size,
> - uint8_t mocs, enum blt_access_type access_type);
> + uint8_t mocs, uint8_t pat_index,
> + enum blt_access_type access_type);
>
> void blt_surface_info(const char *info,
> const struct blt_copy_object *obj);
> diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
> index f5d4ab359..a98557b72 100644
> --- a/tests/intel/gem_ccs.c
> +++ b/tests/intel/gem_ccs.c
> @@ -15,6 +15,7 @@
> #include "lib/intel_chipset.h"
> #include "intel_blt.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> /**
> * TEST: gem ccs
> * Description: Exercise gen12 blitter with and without flatccs compression
> @@ -111,9 +112,9 @@ static void surf_copy(int i915,
> blt_ctrl_surf_copy_init(i915, &surf);
> surf.print_bb = param.print_bb;
> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> - uc_mocs, BLT_INDIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
> - uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> bb_size = 4096;
> igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
> blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
> @@ -133,7 +134,7 @@ static void surf_copy(int i915,
> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>
> blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
> - 0, DIRECT_ACCESS);
> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> gem_sync(i915, surf.dst.handle);
>
> @@ -155,9 +156,9 @@ static void surf_copy(int i915,
> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> ccsmap[i] = i;
> blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
> - uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> - uc_mocs, INDIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>
> blt_copy_init(i915, &blt);
> @@ -399,7 +400,8 @@ static void block_copy(int i915,
> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> if (config->inplace) {
> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> + comp_type);
> blt.dst.ptr = mid->ptr;
> }
>
> @@ -475,7 +477,7 @@ static void block_multicopy(int i915,
>
> if (config->inplace) {
> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
> comp_type);
> blt3.dst.ptr = mid->ptr;
> }
> diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
> index ede545c92..7f2ab8bb6 100644
> --- a/tests/intel/gem_lmem_swapping.c
> +++ b/tests/intel/gem_lmem_swapping.c
> @@ -486,7 +486,7 @@ static void __do_evict(int i915,
> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
> blt_set_object(tmp, tmp->handle, params->size.max,
> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
> - intel_get_uc_mocs(i915), T_LINEAR,
> + intel_get_uc_mocs(i915), 0, T_LINEAR,
> COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
> blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
> }
> @@ -516,7 +516,7 @@ static void __do_evict(int i915,
> obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
> igt_assert(obj->blt_obj);
> blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
> - intel_get_uc_mocs(i915), T_LINEAR,
> + intel_get_uc_mocs(i915), 0, T_LINEAR,
> COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
> blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
> init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
> diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
> index 20bbc4448..27859d5ce 100644
> --- a/tests/intel/xe_ccs.c
> +++ b/tests/intel/xe_ccs.c
> @@ -13,6 +13,7 @@
> #include "igt_syncobj.h"
> #include "intel_blt.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
> #include "xe/xe_util.h"
> @@ -108,8 +109,9 @@ static void surf_copy(int xe,
> blt_ctrl_surf_copy_init(xe, &surf);
> surf.print_bb = param.print_bb;
> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> - uc_mocs, BLT_INDIRECT_ACCESS);
> - blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> + blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
> + DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> bb_size = xe_get_default_alignment(xe);
> bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> @@ -130,7 +132,7 @@ static void surf_copy(int xe,
> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>
> blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
> - 0, DIRECT_ACCESS);
> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> intel_ctx_xe_sync(ctx, true);
>
> @@ -153,9 +155,9 @@ static void surf_copy(int xe,
> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> ccsmap[i] = i;
> blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
> - uc_mocs, DIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> - uc_mocs, INDIRECT_ACCESS);
> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> intel_ctx_xe_sync(ctx, true);
>
> @@ -369,7 +371,8 @@ static void block_copy(int xe,
> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> if (config->inplace) {
> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> + comp_type);
> blt.dst.ptr = mid->ptr;
> }
>
> @@ -450,8 +453,8 @@ static void block_multicopy(int xe,
>
> if (config->inplace) {
> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> - comp_type);
> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
> + COMPRESSION_DISABLED, comp_type);
> blt3.dst.ptr = mid->ptr;
> }
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [igt-dev] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-06 11:51 ` [Intel-xe] [igt-dev] " Zbigniew Kempczyński
@ 2023-10-06 12:08 ` Matthew Auld
2023-10-09 9:21 ` Zbigniew Kempczyński
0 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2023-10-06 12:08 UTC (permalink / raw)
To: Zbigniew Kempczyński; +Cc: igt-dev, intel-xe
On 06/10/2023 12:51, Zbigniew Kempczyński wrote:
> On Thu, Oct 05, 2023 at 04:31:12PM +0100, Matthew Auld wrote:
>> For the most part we can just use the default wb, however some users
>> including display might want to use something else.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: José Roberto de Souza <jose.souza@intel.com>
>> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
>> ---
>> lib/igt_fb.c | 2 ++
>> lib/intel_blt.c | 54 +++++++++++++++++++++------------
>> lib/intel_blt.h | 7 +++--
>> tests/intel/gem_ccs.c | 16 +++++-----
>> tests/intel/gem_lmem_swapping.c | 4 +--
>> tests/intel/xe_ccs.c | 19 +++++++-----
>> 6 files changed, 64 insertions(+), 38 deletions(-)
>>
>> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
>> index f8a0db22c..d290fd775 100644
>> --- a/lib/igt_fb.c
>> +++ b/lib/igt_fb.c
>> @@ -37,6 +37,7 @@
>> #include "i915/gem_mman.h"
>> #include "intel_blt.h"
>> #include "intel_mocs.h"
>> +#include "intel_pat.h"
>> #include "igt_aux.h"
>> #include "igt_color_encoding.h"
>> #include "igt_fb.h"
>> @@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
>>
>> blt_set_object(blt, handle, fb->size, memregion,
>> intel_get_uc_mocs(fb->fd),
>> + intel_get_pat_idx_wt(fb->fd),
>> blt_tile,
>> is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
>> is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
>> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
>> index b55fa9b52..b7ac2902b 100644
>> --- a/lib/intel_blt.c
>> +++ b/lib/intel_blt.c
>> @@ -13,6 +13,7 @@
>> #include "igt.h"
>> #include "igt_syncobj.h"
>> #include "intel_blt.h"
>> +#include "intel_pat.h"
>> #include "xe/xe_ioctl.h"
>> #include "xe/xe_query.h"
>> #include "xe/xe_util.h"
>> @@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
>> igt_assert_f(blt, "block-copy requires data to do blit\n");
>>
>> alignment = get_default_alignment(fd, blt->driver);
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
>> - + blt->src.plane_offset;
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
>> - + blt->dst.plane_offset;
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index) +
>> + blt->src.plane_offset;
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
>> + alignment, blt->dst.pat_index) +
>> + blt->dst.plane_offset;
>
> The tab count looks off in the formatting of the src and dst plane_offset continuation lines.
>
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> fill_data(&data, blt, src_offset, dst_offset, ext);
>> @@ -884,8 +887,10 @@ int blt_block_copy(int fd,
>> igt_assert_neq(blt->driver, 0);
>>
>> alignment = get_default_alignment(fd, blt->driver);
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
>> + alignment, blt->dst.pat_index);
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
>> @@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>> data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
>> data.dw00.length = 0x3;
>>
>> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
>> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
>> + alignment, surf->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
>> + alignment, surf->dst.pat_index);
>> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>>
>> data.dw01.src_address_lo = src_offset;
>> @@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
>> igt_assert_neq(surf->driver, 0);
>>
>> alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
>> - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
>> - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
>> + alignment, surf->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
>> + alignment, surf->dst.pat_index);
>> bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>>
>> emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
>> @@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
>> data.dw03.dst_x2 = blt->dst.x2;
>> data.dw03.dst_y2 = blt->dst.y2;
>>
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
>> - + blt->src.plane_offset;
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
>> - + blt->dst.plane_offset;
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index) +
>> + blt->src.plane_offset;
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
>> + blt->dst.pat_index) +
>> + blt->dst.plane_offset;
>
> Ditto.
>
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> data.dw04.dst_address_lo = dst_offset;
>> @@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
>> igt_assert_neq(blt->driver, 0);
>>
>> alignment = get_default_alignment(fd, blt->driver);
>> - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>> - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>> + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
>> + alignment, blt->src.pat_index);
>> + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
>> + alignment, blt->dst.pat_index);
>> bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>
>> emit_blt_fast_copy(fd, ahnd, blt, 0, true);
>> @@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>> &size, region) == 0);
>> }
>
> I think blt_create_object() should have also pat_index passed as an
> argument.
I think you would also have to pass in the cpu_caching mode, and maybe
even the coh_mode, if we wanted that. Currently blt_create_object()
gives you a combination of cpu_caching, coh_mode and pat_index that is
the default and should "just work" for most cases. The idea is that if
you need something more exotic you would instead create your own object
(using, say, gem_create_caching) and then also select whatever pat_index
you need.
I can change it to expose everything, but I figured blt_create_object()
should be more "I don't care, just give me the defaults".
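For reference, the manual path would then look roughly like this (untested
sketch; it uses the xe_bo_create_caching() helper from patch 04/12, and
width/height/stride/xe here are just placeholders):

struct blt_copy_object obj = {};
uint64_t size = ALIGN(height * stride, xe_get_default_alignment(xe));
uint32_t handle;

/* pick a non-default CPU caching + coherency mode for the BO itself... */
handle = xe_bo_create_caching(xe, 0, size, system_memory(xe),
			      DRM_XE_GEM_CPU_CACHING_WC,
			      DRM_XE_GEM_COH_NONE);

/* ...and then select whatever pat_index is needed for the GPU mapping */
blt_set_object(&obj, handle, size, system_memory(xe),
	       intel_get_uc_mocs(xe), intel_get_pat_idx_wt(xe),
	       T_LINEAR, COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
blt_set_geom(&obj, stride, 0, 0, width, height, 0, 0);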
>
> Rest looks ok.
>
> --
> Zbigniew
>
>>
>> - blt_set_object(obj, handle, size, region, mocs, tiling,
>> + blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
>> compression, compression_type);
>> blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>>
>> @@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
>>
>> void blt_set_object(struct blt_copy_object *obj,
>> uint32_t handle, uint64_t size, uint32_t region,
>> - uint8_t mocs, enum blt_tiling_type tiling,
>> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
>> enum blt_compression compression,
>> enum blt_compression_type compression_type)
>> {
>> @@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
>> obj->size = size;
>> obj->region = region;
>> obj->mocs = mocs;
>> + obj->pat_index = pat_index;
>> obj->tiling = tiling;
>> obj->compression = compression;
>> obj->compression_type = compression_type;
>> @@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
>>
>> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
>> uint32_t handle, uint32_t region, uint64_t size,
>> - uint8_t mocs, enum blt_access_type access_type)
>> + uint8_t mocs, uint8_t pat_index,
>> + enum blt_access_type access_type)
>> {
>> obj->handle = handle;
>> obj->region = region;
>> obj->size = size;
>> obj->mocs = mocs;
>> + obj->pat_index = pat_index;
>> obj->access_type = access_type;
>> }
>>
>> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
>> index d9c8883c7..f8423a986 100644
>> --- a/lib/intel_blt.h
>> +++ b/lib/intel_blt.h
>> @@ -79,6 +79,7 @@ struct blt_copy_object {
>> uint32_t region;
>> uint64_t size;
>> uint8_t mocs;
>> + uint8_t pat_index;
>> enum blt_tiling_type tiling;
>> enum blt_compression compression; /* BC only */
>> enum blt_compression_type compression_type; /* BC only */
>> @@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
>> uint32_t region;
>> uint64_t size;
>> uint8_t mocs;
>> + uint8_t pat_index;
>> enum blt_access_type access_type;
>> };
>>
>> @@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>> void blt_destroy_object(int fd, struct blt_copy_object *obj);
>> void blt_set_object(struct blt_copy_object *obj,
>> uint32_t handle, uint64_t size, uint32_t region,
>> - uint8_t mocs, enum blt_tiling_type tiling,
>> + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
>> enum blt_compression compression,
>> enum blt_compression_type compression_type);
>> void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
>> @@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
>> const struct blt_copy_object *orig);
>> void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
>> uint32_t handle, uint32_t region, uint64_t size,
>> - uint8_t mocs, enum blt_access_type access_type);
>> + uint8_t mocs, uint8_t pat_index,
>> + enum blt_access_type access_type);
>>
>> void blt_surface_info(const char *info,
>> const struct blt_copy_object *obj);
>> diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
>> index f5d4ab359..a98557b72 100644
>> --- a/tests/intel/gem_ccs.c
>> +++ b/tests/intel/gem_ccs.c
>> @@ -15,6 +15,7 @@
>> #include "lib/intel_chipset.h"
>> #include "intel_blt.h"
>> #include "intel_mocs.h"
>> +#include "intel_pat.h"
>> /**
>> * TEST: gem ccs
>> * Description: Exercise gen12 blitter with and without flatccs compression
>> @@ -111,9 +112,9 @@ static void surf_copy(int i915,
>> blt_ctrl_surf_copy_init(i915, &surf);
>> surf.print_bb = param.print_bb;
>> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
>> - uc_mocs, BLT_INDIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
>> blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
>> - uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> bb_size = 4096;
>> igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
>> blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
>> @@ -133,7 +134,7 @@ static void surf_copy(int i915,
>> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>>
>> blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
>> - 0, DIRECT_ACCESS);
>> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>> gem_sync(i915, surf.dst.handle);
>>
>> @@ -155,9 +156,9 @@ static void surf_copy(int i915,
>> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
>> ccsmap[i] = i;
>> blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
>> - uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
>> - uc_mocs, INDIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
>> blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>>
>> blt_copy_init(i915, &blt);
>> @@ -399,7 +400,8 @@ static void block_copy(int i915,
>> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
>> if (config->inplace) {
>> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
>> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
>> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
>> + comp_type);
>> blt.dst.ptr = mid->ptr;
>> }
>>
>> @@ -475,7 +477,7 @@ static void block_multicopy(int i915,
>>
>> if (config->inplace) {
>> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
>> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
>> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
>> comp_type);
>> blt3.dst.ptr = mid->ptr;
>> }
>> diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
>> index ede545c92..7f2ab8bb6 100644
>> --- a/tests/intel/gem_lmem_swapping.c
>> +++ b/tests/intel/gem_lmem_swapping.c
>> @@ -486,7 +486,7 @@ static void __do_evict(int i915,
>> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
>> blt_set_object(tmp, tmp->handle, params->size.max,
>> INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
>> - intel_get_uc_mocs(i915), T_LINEAR,
>> + intel_get_uc_mocs(i915), 0, T_LINEAR,
>> COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
>> blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
>> }
>> @@ -516,7 +516,7 @@ static void __do_evict(int i915,
>> obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
>> igt_assert(obj->blt_obj);
>> blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
>> - intel_get_uc_mocs(i915), T_LINEAR,
>> + intel_get_uc_mocs(i915), 0, T_LINEAR,
>> COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
>> blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
>> init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
>> diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
>> index 20bbc4448..27859d5ce 100644
>> --- a/tests/intel/xe_ccs.c
>> +++ b/tests/intel/xe_ccs.c
>> @@ -13,6 +13,7 @@
>> #include "igt_syncobj.h"
>> #include "intel_blt.h"
>> #include "intel_mocs.h"
>> +#include "intel_pat.h"
>> #include "xe/xe_ioctl.h"
>> #include "xe/xe_query.h"
>> #include "xe/xe_util.h"
>> @@ -108,8 +109,9 @@ static void surf_copy(int xe,
>> blt_ctrl_surf_copy_init(xe, &surf);
>> surf.print_bb = param.print_bb;
>> blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
>> - uc_mocs, BLT_INDIRECT_ACCESS);
>> - blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
>> + blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
>> + DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> bb_size = xe_get_default_alignment(xe);
>> bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
>> blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
>> @@ -130,7 +132,7 @@ static void surf_copy(int xe,
>> igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>>
>> blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
>> - 0, DIRECT_ACCESS);
>> + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>> intel_ctx_xe_sync(ctx, true);
>>
>> @@ -153,9 +155,9 @@ static void surf_copy(int xe,
>> for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
>> ccsmap[i] = i;
>> blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
>> - uc_mocs, DIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
>> blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
>> - uc_mocs, INDIRECT_ACCESS);
>> + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
>> blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>> intel_ctx_xe_sync(ctx, true);
>>
>> @@ -369,7 +371,8 @@ static void block_copy(int xe,
>> blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
>> if (config->inplace) {
>> blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
>> - T_LINEAR, COMPRESSION_DISABLED, comp_type);
>> + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
>> + comp_type);
>> blt.dst.ptr = mid->ptr;
>> }
>>
>> @@ -450,8 +453,8 @@ static void block_multicopy(int xe,
>>
>> if (config->inplace) {
>> blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
>> - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
>> - comp_type);
>> + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
>> + COMPRESSION_DISABLED, comp_type);
>> blt3.dst.ptr = mid->ptr;
>> }
>>
>> --
>> 2.41.0
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: support pat_index
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
@ 2023-10-06 12:13 ` Zbigniew Kempczyński
0 siblings, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-06 12:13 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Thu, Oct 05, 2023 at 04:31:13PM +0100, Matthew Auld wrote:
> Some users need to be able to select their own pat_index. Some display
> tests use igt_draw which in turn uses intel_batchbuffer and intel_buf.
> We also have a couple more display tests using these interfaces
> directly. The idea is to select wt/uc for anything display related, but
> also allow any test to select a pat_index for a given intel_buf.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_draw.c | 7 +++++-
> lib/igt_fb.c | 3 ++-
> lib/intel_allocator.c | 1 +
> lib/intel_allocator.h | 1 +
> lib/intel_batchbuffer.c | 51 ++++++++++++++++++++++++++++++---------
> lib/intel_bufops.c | 29 +++++++++++++++-------
> lib/intel_bufops.h | 9 +++++--
> tests/intel/kms_big_fb.c | 4 ++-
> tests/intel/kms_dirtyfb.c | 7 ++++--
> tests/intel/kms_psr.c | 4 ++-
> tests/intel/xe_intel_bb.c | 3 ++-
> 11 files changed, 89 insertions(+), 30 deletions(-)
>
> diff --git a/lib/igt_draw.c b/lib/igt_draw.c
> index 2332bf94a..8db71ce5e 100644
> --- a/lib/igt_draw.c
> +++ b/lib/igt_draw.c
> @@ -31,6 +31,7 @@
> #include "intel_batchbuffer.h"
> #include "intel_chipset.h"
> #include "intel_mocs.h"
> +#include "intel_pat.h"
> #include "igt_core.h"
> #include "igt_fb.h"
> #include "ioctl_wrappers.h"
> @@ -75,6 +76,7 @@ struct buf_data {
> uint32_t size;
> uint32_t stride;
> int bpp;
> + uint8_t pat_index;
> };
>
> struct rect {
> @@ -658,7 +660,8 @@ static struct intel_buf *create_buf(int fd, struct buf_ops *bops,
> width, height, from->bpp, 0,
> tiling, 0,
> size, 0,
> - region);
> + region,
> + from->pat_index);
>
> /* Make sure we close handle on destroy path */
> intel_buf_set_ownership(buf, true);
> @@ -785,6 +788,7 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
> igt_skip_on(!rendercopy);
>
> /* We create a temporary buffer and copy from it using rendercopy. */
> + tmp.pat_index = buf->pat_index;
> tmp.size = rect->w * rect->h * pixel_size;
> if (is_i915_device(fd))
> tmp.handle = gem_create(fd, tmp.size);
> @@ -852,6 +856,7 @@ void igt_draw_rect(int fd, struct buf_ops *bops, uint32_t ctx,
> .size = buf_size,
> .stride = buf_stride,
> .bpp = bpp,
> + .pat_index = intel_get_pat_idx_wt(fd),
> };
> struct rect rect = {
> .x = rect_x,
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> index d290fd775..61384c553 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -2637,7 +2637,8 @@ igt_fb_create_intel_buf(int fd, struct buf_ops *bops,
> igt_fb_mod_to_tiling(fb->modifier),
> compression, fb->size,
> fb->strides[0],
> - region);
> + region,
> + intel_get_pat_idx_wt(fd));
> intel_buf_set_name(buf, name);
>
> /* Make sure we close handle on destroy path */
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index da357b833..b3e5c0226 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -1449,6 +1449,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
> bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
> uint32_t handle,
> uint64_t size, uint64_t offset,
> + uint8_t pat_index,
> bool *is_allocatedp)
> {
> struct alloc_req req = { .request_type = REQ_RESERVE_IF_NOT_ALLOCATED,
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index 5da8af7f9..d93c5828d 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -206,6 +206,7 @@ bool intel_allocator_is_reserved(uint64_t allocator_handle,
> bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
> uint32_t handle,
> uint64_t size, uint64_t offset,
> + uint8_t pat_index,
> bool *is_allocatedp);
>
> void intel_allocator_print(uint64_t allocator_handle);
> diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
> index e7b1b755f..eaaf667ea 100644
> --- a/lib/intel_batchbuffer.c
> +++ b/lib/intel_batchbuffer.c
> @@ -38,6 +38,7 @@
> #include "intel_batchbuffer.h"
> #include "intel_bufops.h"
> #include "intel_chipset.h"
> +#include "intel_pat.h"
> #include "media_fill.h"
> #include "media_spin.h"
> #include "sw_sync.h"
> @@ -825,15 +826,18 @@ static void __reallocate_objects(struct intel_bb *ibb)
> static inline uint64_t __intel_bb_get_offset(struct intel_bb *ibb,
> uint32_t handle,
> uint64_t size,
> - uint32_t alignment)
> + uint32_t alignment,
> + uint8_t pat_index)
> {
> uint64_t offset;
>
> if (ibb->enforce_relocs)
> return 0;
>
> - offset = intel_allocator_alloc(ibb->allocator_handle,
> - handle, size, alignment);
> + offset = __intel_allocator_alloc(ibb->allocator_handle, handle,
> + size, alignment, pat_index,
> + ALLOC_STRATEGY_NONE);
> + igt_assert(offset != ALLOC_INVALID_ADDRESS);
>
> return offset;
> }
> @@ -1300,11 +1304,14 @@ static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct intel_bb *ibb,
> ops->op = op;
> ops->obj_offset = 0;
> ops->addr = objects[i]->offset;
> - ops->range = objects[i]->rsvd1;
> + ops->range = objects[i]->rsvd1 & ~(4096-1);
I would introduce some macros for better readability, like
#define OBJ_SIZE(rsvd1) ((rsvd1) & ~(SZ_4K-1))
#define OBJ_PATIDX(rsvd1) ((rsvd1) & (SZ_4K-1))
or something similar. IMO
ops->range = OBJ_SIZE(objects[i]->rsvd1);
ops->pat_index = OBJ_PATIDX(objects[i]->rsvd1);
makes it clearer on first reading that more data than just the size is
packed into rsvd1.
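For completeness, the pack side in __intel_bb_add_object() would then just
be the mirror of that (sketch):

/* size is page aligned, so the low bits are free to carry the pat_index */
object->rsvd1 = size | pat_index;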
> ops->region = region;
> + if (set_obj)
> + ops->pat_index = objects[i]->rsvd1 & (4096-1);
>
> - igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx\n",
> - i, ops->obj, (long long)ops->addr, (long long)ops->range);
> + igt_debug(" [%d]: handle: %u, offset: %llx, size: %llx pat_index: %u\n",
> + i, ops->obj, (long long)ops->addr, (long long)ops->range,
> + ops->pat_index);
> }
>
> return bind_ops;
> @@ -1409,7 +1416,8 @@ void intel_bb_reset(struct intel_bb *ibb, bool purge_objects_cache)
> ibb->batch_offset = __intel_bb_get_offset(ibb,
> ibb->handle,
> ibb->size,
> - ibb->alignment);
> + ibb->alignment,
> + DEFAULT_PAT_INDEX);
>
> intel_bb_add_object(ibb, ibb->handle, ibb->size,
> ibb->batch_offset,
> @@ -1645,7 +1653,8 @@ static void __remove_from_objects(struct intel_bb *ibb,
> */
> static struct drm_i915_gem_exec_object2 *
> __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> - uint64_t offset, uint64_t alignment, bool write)
> + uint64_t offset, uint64_t alignment, uint8_t pat_index,
> + bool write)
> {
> struct drm_i915_gem_exec_object2 *object;
>
> @@ -1661,6 +1670,9 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> object = __add_to_cache(ibb, handle);
> __add_to_objects(ibb, object);
>
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(ibb->fd);
> +
> /*
> * If object->offset == INVALID_ADDRESS we added freshly object to the
> * cache. In that case we have two choices:
> @@ -1670,7 +1682,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> if (INVALID_ADDR(object->offset)) {
> if (INVALID_ADDR(offset)) {
> offset = __intel_bb_get_offset(ibb, handle, size,
> - alignment);
> + alignment, pat_index);
> } else {
> offset = offset & (ibb->gtt_size - 1);
>
> @@ -1683,6 +1695,7 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
>
> reserved = intel_allocator_reserve_if_not_allocated(ibb->allocator_handle,
> handle, size, offset,
> + pat_index,
> &allocated);
> igt_assert_f(allocated || reserved,
> "Can't get offset, allocated: %d, reserved: %d\n",
> @@ -1721,6 +1734,18 @@ __intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> if (ibb->driver == INTEL_DRIVER_XE) {
> object->alignment = alignment;
> object->rsvd1 = size;
> + igt_assert(!(size & (4096-1)));
igt_assert(!OBJ_PATIDX(object->rsvd1));
?
But that's just a suggestion. Anyway, for this one:
Acked-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
--
Zbigniew
> +
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(ibb->fd);
> +
> + /*
> + * XXX: For now encode the pat_index in the first few bits of
> + * rsvd1. intel_batchbuffer should really stop using the i915
> + * drm_i915_gem_exec_object2 to encode VMA placement
> + * information on xe...
> + */
> + object->rsvd1 |= pat_index;
> }
>
> return object;
> @@ -1733,7 +1758,7 @@ intel_bb_add_object(struct intel_bb *ibb, uint32_t handle, uint64_t size,
> struct drm_i915_gem_exec_object2 *obj = NULL;
>
> obj = __intel_bb_add_object(ibb, handle, size, offset,
> - alignment, write);
> + alignment, DEFAULT_PAT_INDEX, write);
> igt_assert(obj);
>
> return obj;
> @@ -1795,8 +1820,10 @@ __intel_bb_add_intel_buf(struct intel_bb *ibb, struct intel_buf *buf,
> }
> }
>
> - obj = intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
> - buf->addr.offset, alignment, write);
> + obj = __intel_bb_add_object(ibb, buf->handle, intel_buf_bo_size(buf),
> + buf->addr.offset, alignment, buf->pat_index,
> + write);
> + igt_assert(obj);
> buf->addr.offset = obj->offset;
>
> if (igt_list_empty(&buf->link)) {
> diff --git a/lib/intel_bufops.c b/lib/intel_bufops.c
> index 2c91adb88..fbee4748e 100644
> --- a/lib/intel_bufops.c
> +++ b/lib/intel_bufops.c
> @@ -29,6 +29,7 @@
> #include "igt.h"
> #include "igt_x86.h"
> #include "intel_bufops.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -818,7 +819,7 @@ static void __intel_buf_init(struct buf_ops *bops,
> int width, int height, int bpp, int alignment,
> uint32_t req_tiling, uint32_t compression,
> uint64_t bo_size, int bo_stride,
> - uint64_t region)
> + uint64_t region, uint8_t pat_index)
> {
> uint32_t tiling = req_tiling;
> uint64_t size;
> @@ -839,6 +840,10 @@ static void __intel_buf_init(struct buf_ops *bops,
> IGT_INIT_LIST_HEAD(&buf->link);
> buf->mocs = INTEL_BUF_MOCS_DEFAULT;
>
> + if (pat_index == DEFAULT_PAT_INDEX)
> + pat_index = intel_get_pat_idx_wb(bops->fd);
> + buf->pat_index = pat_index;
> +
> if (compression) {
> igt_require(bops->intel_gen >= 9);
> igt_assert(req_tiling == I915_TILING_Y ||
> @@ -957,7 +962,7 @@ void intel_buf_init(struct buf_ops *bops,
> region = bops->driver == INTEL_DRIVER_I915 ? I915_SYSTEM_MEMORY :
> system_memory(bops->fd);
> __intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
> - tiling, compression, 0, 0, region);
> + tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
>
> intel_buf_set_ownership(buf, true);
> }
> @@ -974,7 +979,7 @@ void intel_buf_init_in_region(struct buf_ops *bops,
> uint64_t region)
> {
> __intel_buf_init(bops, 0, buf, width, height, bpp, alignment,
> - tiling, compression, 0, 0, region);
> + tiling, compression, 0, 0, region, DEFAULT_PAT_INDEX);
>
> intel_buf_set_ownership(buf, true);
> }
> @@ -1033,7 +1038,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
> uint32_t req_tiling, uint32_t compression)
> {
> __intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
> - req_tiling, compression, 0, 0, -1);
> + req_tiling, compression, 0, 0, -1, DEFAULT_PAT_INDEX);
> }
>
> /**
> @@ -1050,6 +1055,7 @@ void intel_buf_init_using_handle(struct buf_ops *bops,
> * @size: real bo size
> * @stride: bo stride
> * @region: region
> + * @pat_index: pat_index to use for the binding (only used on xe)
> *
> * Function configures BO handle within intel_buf structure passed by the caller
> * (with all its metadata - width, height, ...). Useful if BO was created
> @@ -1067,10 +1073,12 @@ void intel_buf_init_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region)
> + uint64_t region,
> + uint8_t pat_index)
> {
> __intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
> - req_tiling, compression, size, stride, region);
> + req_tiling, compression, size, stride, region,
> + pat_index);
> }
>
> /**
> @@ -1149,7 +1157,8 @@ struct intel_buf *intel_buf_create_using_handle_and_size(struct buf_ops *bops,
> int stride)
> {
> return intel_buf_create_full(bops, handle, width, height, bpp, alignment,
> - req_tiling, compression, size, stride, -1);
> + req_tiling, compression, size, stride, -1,
> + DEFAULT_PAT_INDEX);
> }
>
> struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> @@ -1160,7 +1169,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region)
> + uint64_t region,
> + uint8_t pat_index)
> {
> struct intel_buf *buf;
>
> @@ -1170,7 +1180,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> igt_assert(buf);
>
> __intel_buf_init(bops, handle, buf, width, height, bpp, alignment,
> - req_tiling, compression, size, stride, region);
> + req_tiling, compression, size, stride, region,
> + pat_index);
>
> return buf;
> }
> diff --git a/lib/intel_bufops.h b/lib/intel_bufops.h
> index 4dfe4681c..b6048402b 100644
> --- a/lib/intel_bufops.h
> +++ b/lib/intel_bufops.h
> @@ -63,6 +63,9 @@ struct intel_buf {
> /* Content Protection*/
> bool is_protected;
>
> + /* pat_index to use for mapping this buf. Only used in Xe. */
> + uint8_t pat_index;
> +
> /* For debugging purposes */
> char name[INTEL_BUF_NAME_MAXSIZE + 1];
> };
> @@ -161,7 +164,8 @@ void intel_buf_init_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region);
> + uint64_t region,
> + uint8_t pat_index);
>
> struct intel_buf *intel_buf_create(struct buf_ops *bops,
> int width, int height,
> @@ -192,7 +196,8 @@ struct intel_buf *intel_buf_create_full(struct buf_ops *bops,
> uint32_t compression,
> uint64_t size,
> int stride,
> - uint64_t region);
> + uint64_t region,
> + uint8_t pat_index);
> void intel_buf_destroy(struct intel_buf *buf);
>
> static inline void intel_buf_set_pxp(struct intel_buf *buf, bool new_pxp_state)
> diff --git a/tests/intel/kms_big_fb.c b/tests/intel/kms_big_fb.c
> index 611e60896..854a77992 100644
> --- a/tests/intel/kms_big_fb.c
> +++ b/tests/intel/kms_big_fb.c
> @@ -34,6 +34,7 @@
> #include <string.h>
>
> #include "i915/gem_create.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -88,7 +89,8 @@ static struct intel_buf *init_buf(data_t *data,
> handle = gem_open(data->drm_fd, name);
> buf = intel_buf_create_full(data->bops, handle, width, height,
> bpp, 0, tiling, 0, size, 0,
> - region);
> + region,
> + intel_get_pat_idx_wt(data->drm_fd));
>
> intel_buf_set_name(buf, buf_name);
> intel_buf_set_ownership(buf, true);
> diff --git a/tests/intel/kms_dirtyfb.c b/tests/intel/kms_dirtyfb.c
> index cc9529178..ec9b2a137 100644
> --- a/tests/intel/kms_dirtyfb.c
> +++ b/tests/intel/kms_dirtyfb.c
> @@ -10,6 +10,7 @@
>
> #include "i915/intel_drrs.h"
> #include "i915/intel_fbc.h"
> +#include "intel_pat.h"
>
> #include "xe/xe_query.h"
>
> @@ -246,14 +247,16 @@ static void run_test(data_t *data)
> 0,
> igt_fb_mod_to_tiling(data->fbs[1].modifier),
> 0, 0, 0, is_xe_device(data->drm_fd) ?
> - system_memory(data->drm_fd) : 0);
> + system_memory(data->drm_fd) : 0,
> + intel_get_pat_idx_wt(data->drm_fd));
> dst = intel_buf_create_full(data->bops, data->fbs[2].gem_handle,
> data->fbs[2].width,
> data->fbs[2].height,
> igt_drm_format_to_bpp(data->fbs[2].drm_format),
> 0, igt_fb_mod_to_tiling(data->fbs[2].modifier),
> 0, 0, 0, is_xe_device(data->drm_fd) ?
> - system_memory(data->drm_fd) : 0);
> + system_memory(data->drm_fd) : 0,
> + intel_get_pat_idx_wt(data->drm_fd));
> ibb = intel_bb_create(data->drm_fd, PAGE_SIZE);
>
> spin = igt_spin_new(data->drm_fd, .ahnd = ibb->allocator_handle);
> diff --git a/tests/intel/kms_psr.c b/tests/intel/kms_psr.c
> index ffecc5222..9c6ecd829 100644
> --- a/tests/intel/kms_psr.c
> +++ b/tests/intel/kms_psr.c
> @@ -31,6 +31,7 @@
> #include "igt.h"
> #include "igt_sysfs.h"
> #include "igt_psr.h"
> +#include "intel_pat.h"
> #include <errno.h>
> #include <stdbool.h>
> #include <stdio.h>
> @@ -356,7 +357,8 @@ static struct intel_buf *create_buf_from_fb(data_t *data,
> name = gem_flink(data->drm_fd, fb->gem_handle);
> handle = gem_open(data->drm_fd, name);
> buf = intel_buf_create_full(data->bops, handle, width, height,
> - bpp, 0, tiling, 0, size, stride, region);
> + bpp, 0, tiling, 0, size, stride, region,
> + intel_get_pat_idx_wt(data->drm_fd));
> intel_buf_set_ownership(buf, true);
>
> return buf;
> diff --git a/tests/intel/xe_intel_bb.c b/tests/intel/xe_intel_bb.c
> index 0159a3164..e2480acf8 100644
> --- a/tests/intel/xe_intel_bb.c
> +++ b/tests/intel/xe_intel_bb.c
> @@ -19,6 +19,7 @@
> #include "igt.h"
> #include "igt_crc.h"
> #include "intel_bufops.h"
> +#include "intel_pat.h"
> #include "xe/xe_ioctl.h"
> #include "xe/xe_query.h"
>
> @@ -400,7 +401,7 @@ static void create_in_region(struct buf_ops *bops, uint64_t region)
> intel_buf_init_full(bops, handle, &buf,
> width/4, height, 32, 0,
> I915_TILING_NONE, 0,
> - size, 0, region);
> + size, 0, region, DEFAULT_PAT_INDEX);
> intel_buf_set_ownership(&buf, true);
>
> intel_bb_add_intel_buf(ibb, &buf, false);
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [igt-dev] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index
2023-10-06 12:08 ` Matthew Auld
@ 2023-10-09 9:21 ` Zbigniew Kempczyński
0 siblings, 0 replies; 22+ messages in thread
From: Zbigniew Kempczyński @ 2023-10-09 9:21 UTC (permalink / raw)
To: Matthew Auld; +Cc: igt-dev, intel-xe
On Fri, Oct 06, 2023 at 01:08:50PM +0100, Matthew Auld wrote:
> On 06/10/2023 12:51, Zbigniew Kempczyński wrote:
> > On Thu, Oct 05, 2023 at 04:31:12PM +0100, Matthew Auld wrote:
> > > For the most part we can just use the default wb, however some users
> > > including display might want to use something else.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: José Roberto de Souza <jose.souza@intel.com>
> > > Cc: Pallavi Mishra <pallavi.mishra@intel.com>
> > > ---
> > > lib/igt_fb.c | 2 ++
> > > lib/intel_blt.c | 54 +++++++++++++++++++++------------
> > > lib/intel_blt.h | 7 +++--
> > > tests/intel/gem_ccs.c | 16 +++++-----
> > > tests/intel/gem_lmem_swapping.c | 4 +--
> > > tests/intel/xe_ccs.c | 19 +++++++-----
> > > 6 files changed, 64 insertions(+), 38 deletions(-)
> > >
> > > diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> > > index f8a0db22c..d290fd775 100644
> > > --- a/lib/igt_fb.c
> > > +++ b/lib/igt_fb.c
> > > @@ -37,6 +37,7 @@
> > > #include "i915/gem_mman.h"
> > > #include "intel_blt.h"
> > > #include "intel_mocs.h"
> > > +#include "intel_pat.h"
> > > #include "igt_aux.h"
> > > #include "igt_color_encoding.h"
> > > #include "igt_fb.h"
> > > @@ -2768,6 +2769,7 @@ static struct blt_copy_object *blt_fb_init(const struct igt_fb *fb,
> > > blt_set_object(blt, handle, fb->size, memregion,
> > > intel_get_uc_mocs(fb->fd),
> > > + intel_get_pat_idx_wt(fb->fd),
> > > blt_tile,
> > > is_ccs_modifier(fb->modifier) ? COMPRESSION_ENABLED : COMPRESSION_DISABLED,
> > > is_gen12_mc_ccs_modifier(fb->modifier) ? COMPRESSION_TYPE_MEDIA : COMPRESSION_TYPE_3D);
> > > diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> > > index b55fa9b52..b7ac2902b 100644
> > > --- a/lib/intel_blt.c
> > > +++ b/lib/intel_blt.c
> > > @@ -13,6 +13,7 @@
> > > #include "igt.h"
> > > #include "igt_syncobj.h"
> > > #include "intel_blt.h"
> > > +#include "intel_pat.h"
> > > #include "xe/xe_ioctl.h"
> > > #include "xe/xe_query.h"
> > > #include "xe/xe_util.h"
> > > @@ -810,10 +811,12 @@ uint64_t emit_blt_block_copy(int fd,
> > > igt_assert_f(blt, "block-copy requires data to do blit\n");
> > > alignment = get_default_alignment(fd, blt->driver);
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> > > - + blt->src.plane_offset;
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> > > - + blt->dst.plane_offset;
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index) +
> > > + blt->src.plane_offset;
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> > > + alignment, blt->dst.pat_index) +
> > > + blt->dst.plane_offset;
> >
> > To less tabs in formatting for src and dst plane_offset.
> >
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > fill_data(&data, blt, src_offset, dst_offset, ext);
> > > @@ -884,8 +887,10 @@ int blt_block_copy(int fd,
> > > igt_assert_neq(blt->driver, 0);
> > > alignment = get_default_alignment(fd, blt->driver);
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> > > + alignment, blt->dst.pat_index);
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
> > > @@ -1036,8 +1041,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> > > data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> > > data.dw00.length = 0x3;
> > > - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> > > + alignment, surf->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> > > + alignment, surf->dst.pat_index);
> > > bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
> > > data.dw01.src_address_lo = src_offset;
> > > @@ -1103,8 +1110,10 @@ int blt_ctrl_surf_copy(int fd,
> > > igt_assert_neq(surf->driver, 0);
> > > alignment = max_t(uint64_t, get_default_alignment(fd, surf->driver), 1ull << 16);
> > > - src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, surf->src.handle, surf->src.size,
> > > + alignment, surf->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, surf->dst.handle, surf->dst.size,
> > > + alignment, surf->dst.pat_index);
> > > bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
> > > emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
> > > @@ -1308,10 +1317,12 @@ uint64_t emit_blt_fast_copy(int fd,
> > > data.dw03.dst_x2 = blt->dst.x2;
> > > data.dw03.dst_y2 = blt->dst.y2;
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> > > - + blt->src.plane_offset;
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> > > - + blt->dst.plane_offset;
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index) +
> > > + blt->src.plane_offset;
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size, alignment,
> > > + blt->dst.pat_index) +
> > > + blt->dst.plane_offset;
> >
> > Ditto.
> >
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > data.dw04.dst_address_lo = dst_offset;
> > > @@ -1380,8 +1391,10 @@ int blt_fast_copy(int fd,
> > > igt_assert_neq(blt->driver, 0);
> > > alignment = get_default_alignment(fd, blt->driver);
> > > - src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > > - dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > > + src_offset = get_offset_pat_index(ahnd, blt->src.handle, blt->src.size,
> > > + alignment, blt->src.pat_index);
> > > + dst_offset = get_offset_pat_index(ahnd, blt->dst.handle, blt->dst.size,
> > > + alignment, blt->dst.pat_index);
> > > bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > > emit_blt_fast_copy(fd, ahnd, blt, 0, true);
> > > @@ -1460,7 +1473,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> > > &size, region) == 0);
> > > }
> >
> > I think blt_create_object() should have also pat_index passed as an
> > argument.
>
> I think you would also have to pass in the cpu_caching mode, and maybe even
> the coh_mode, if we wanted that. Currently blt_create_object() gives you a
> combination of cpu_caching, coh_mode and pat_index that is the default and
> should "just work" for most cases. The idea is that if you need something
> more exotic you would instead create your own object (using, say,
> gem_create_caching) and then also select whatever pat_index you need.
>
> I can change it to expose everything, but I figured blt_create_object()
> should be more "I don't care, just give me the defaults".
Ok. You've convinced me. Any non-default settings can be changed before
the exec.
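E.g. something like (rough sketch, argument list from memory):

struct blt_copy_object *dst;

dst = blt_create_object(&blt, region, width, height, bpp, uc_mocs,
			T_LINEAR, COMPRESSION_DISABLED,
			COMPRESSION_TYPE_3D, true);
/* override the default pat_index before the copy is emitted/executed */
dst->pat_index = intel_get_pat_idx_wt(xe);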
--
Zbigniew
>
> >
> > Rest looks ok.
> >
> > --
> > Zbigniew
> >
> > > - blt_set_object(obj, handle, size, region, mocs, tiling,
> > > + blt_set_object(obj, handle, size, region, mocs, DEFAULT_PAT_INDEX, tiling,
> > > compression, compression_type);
> > > blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
> > > @@ -1481,7 +1494,7 @@ void blt_destroy_object(int fd, struct blt_copy_object *obj)
> > > void blt_set_object(struct blt_copy_object *obj,
> > > uint32_t handle, uint64_t size, uint32_t region,
> > > - uint8_t mocs, enum blt_tiling_type tiling,
> > > + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> > > enum blt_compression compression,
> > > enum blt_compression_type compression_type)
> > > {
> > > @@ -1489,6 +1502,7 @@ void blt_set_object(struct blt_copy_object *obj,
> > > obj->size = size;
> > > obj->region = region;
> > > obj->mocs = mocs;
> > > + obj->pat_index = pat_index;
> > > obj->tiling = tiling;
> > > obj->compression = compression;
> > > obj->compression_type = compression_type;
> > > @@ -1516,12 +1530,14 @@ void blt_set_copy_object(struct blt_copy_object *obj,
> > > void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> > > uint32_t handle, uint32_t region, uint64_t size,
> > > - uint8_t mocs, enum blt_access_type access_type)
> > > + uint8_t mocs, uint8_t pat_index,
> > > + enum blt_access_type access_type)
> > > {
> > > obj->handle = handle;
> > > obj->region = region;
> > > obj->size = size;
> > > obj->mocs = mocs;
> > > + obj->pat_index = pat_index;
> > > obj->access_type = access_type;
> > > }
> > > diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> > > index d9c8883c7..f8423a986 100644
> > > --- a/lib/intel_blt.h
> > > +++ b/lib/intel_blt.h
> > > @@ -79,6 +79,7 @@ struct blt_copy_object {
> > > uint32_t region;
> > > uint64_t size;
> > > uint8_t mocs;
> > > + uint8_t pat_index;
> > > enum blt_tiling_type tiling;
> > > enum blt_compression compression; /* BC only */
> > > enum blt_compression_type compression_type; /* BC only */
> > > @@ -151,6 +152,7 @@ struct blt_ctrl_surf_copy_object {
> > > uint32_t region;
> > > uint64_t size;
> > > uint8_t mocs;
> > > + uint8_t pat_index;
> > > enum blt_access_type access_type;
> > > };
> > > @@ -247,7 +249,7 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> > > void blt_destroy_object(int fd, struct blt_copy_object *obj);
> > > void blt_set_object(struct blt_copy_object *obj,
> > > uint32_t handle, uint64_t size, uint32_t region,
> > > - uint8_t mocs, enum blt_tiling_type tiling,
> > > + uint8_t mocs, uint8_t pat_index, enum blt_tiling_type tiling,
> > > enum blt_compression compression,
> > > enum blt_compression_type compression_type);
> > > void blt_set_object_ext(struct blt_block_copy_object_ext *obj,
> > > @@ -258,7 +260,8 @@ void blt_set_copy_object(struct blt_copy_object *obj,
> > > const struct blt_copy_object *orig);
> > > void blt_set_ctrl_surf_object(struct blt_ctrl_surf_copy_object *obj,
> > > uint32_t handle, uint32_t region, uint64_t size,
> > > - uint8_t mocs, enum blt_access_type access_type);
> > > + uint8_t mocs, uint8_t pat_index,
> > > + enum blt_access_type access_type);
> > > void blt_surface_info(const char *info,
> > > const struct blt_copy_object *obj);
> > > diff --git a/tests/intel/gem_ccs.c b/tests/intel/gem_ccs.c
> > > index f5d4ab359..a98557b72 100644
> > > --- a/tests/intel/gem_ccs.c
> > > +++ b/tests/intel/gem_ccs.c
> > > @@ -15,6 +15,7 @@
> > > #include "lib/intel_chipset.h"
> > > #include "intel_blt.h"
> > > #include "intel_mocs.h"
> > > +#include "intel_pat.h"
> > > /**
> > > * TEST: gem ccs
> > > * Description: Exercise gen12 blitter with and without flatccs compression
> > > @@ -111,9 +112,9 @@ static void surf_copy(int i915,
> > > blt_ctrl_surf_copy_init(i915, &surf);
> > > surf.print_bb = param.print_bb;
> > > blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> > > - uc_mocs, BLT_INDIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> > > blt_set_ctrl_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
> > > - uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > bb_size = 4096;
> > > igt_assert_eq(__gem_create(i915, &bb_size, &bb1), 0);
> > > blt_set_batch(&surf.bb, bb1, bb_size, REGION_SMEM);
> > > @@ -133,7 +134,7 @@ static void surf_copy(int i915,
> > > igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> > > blt_set_ctrl_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
> > > - 0, DIRECT_ACCESS);
> > > + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> > > gem_sync(i915, surf.dst.handle);
> > > @@ -155,9 +156,9 @@ static void surf_copy(int i915,
> > > for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> > > ccsmap[i] = i;
> > > blt_set_ctrl_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
> > > - uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> > > - uc_mocs, INDIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> > > blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> > > blt_copy_init(i915, &blt);
> > > @@ -399,7 +400,8 @@ static void block_copy(int i915,
> > > blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> > > if (config->inplace) {
> > > blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> > > - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> > > + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> > > + comp_type);
> > > blt.dst.ptr = mid->ptr;
> > > }
> > > @@ -475,7 +477,7 @@ static void block_multicopy(int i915,
> > > if (config->inplace) {
> > > blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> > > - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> > > + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling, COMPRESSION_DISABLED,
> > > comp_type);
> > > blt3.dst.ptr = mid->ptr;
> > > }
> > > diff --git a/tests/intel/gem_lmem_swapping.c b/tests/intel/gem_lmem_swapping.c
> > > index ede545c92..7f2ab8bb6 100644
> > > --- a/tests/intel/gem_lmem_swapping.c
> > > +++ b/tests/intel/gem_lmem_swapping.c
> > > @@ -486,7 +486,7 @@ static void __do_evict(int i915,
> > > INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0));
> > > blt_set_object(tmp, tmp->handle, params->size.max,
> > > INTEL_MEMORY_REGION_ID(I915_SYSTEM_MEMORY, 0),
> > > - intel_get_uc_mocs(i915), T_LINEAR,
> > > + intel_get_uc_mocs(i915), 0, T_LINEAR,
> > > COMPRESSION_DISABLED, COMPRESSION_TYPE_3D);
> > > blt_set_geom(tmp, stride, 0, 0, width, height, 0, 0);
> > > }
> > > @@ -516,7 +516,7 @@ static void __do_evict(int i915,
> > > obj->blt_obj = calloc(1, sizeof(*obj->blt_obj));
> > > igt_assert(obj->blt_obj);
> > > blt_set_object(obj->blt_obj, obj->handle, obj->size, region_id,
> > > - intel_get_uc_mocs(i915), T_LINEAR,
> > > + intel_get_uc_mocs(i915), 0, T_LINEAR,
> > > COMPRESSION_ENABLED, COMPRESSION_TYPE_3D);
> > > blt_set_geom(obj->blt_obj, stride, 0, 0, width, height, 0, 0);
> > > init_object_ccs(i915, obj, tmp, rand(), blt_ctx,
> > > diff --git a/tests/intel/xe_ccs.c b/tests/intel/xe_ccs.c
> > > index 20bbc4448..27859d5ce 100644
> > > --- a/tests/intel/xe_ccs.c
> > > +++ b/tests/intel/xe_ccs.c
> > > @@ -13,6 +13,7 @@
> > > #include "igt_syncobj.h"
> > > #include "intel_blt.h"
> > > #include "intel_mocs.h"
> > > +#include "intel_pat.h"
> > > #include "xe/xe_ioctl.h"
> > > #include "xe/xe_query.h"
> > > #include "xe/xe_util.h"
> > > @@ -108,8 +109,9 @@ static void surf_copy(int xe,
> > > blt_ctrl_surf_copy_init(xe, &surf);
> > > surf.print_bb = param.print_bb;
> > > blt_set_ctrl_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> > > - uc_mocs, BLT_INDIRECT_ACCESS);
> > > - blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, BLT_INDIRECT_ACCESS);
> > > + blt_set_ctrl_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs,
> > > + DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > bb_size = xe_get_default_alignment(xe);
> > > bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> > > blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> > > @@ -130,7 +132,7 @@ static void surf_copy(int xe,
> > > igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> > > blt_set_ctrl_surf_object(&surf.dst, ccs2, system_memory(xe), ccssize,
> > > - 0, DIRECT_ACCESS);
> > > + 0, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > intel_ctx_xe_sync(ctx, true);
> > > @@ -153,9 +155,9 @@ static void surf_copy(int xe,
> > > for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> > > ccsmap[i] = i;
> > > blt_set_ctrl_surf_object(&surf.src, ccs, sysmem, ccssize,
> > > - uc_mocs, DIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, DIRECT_ACCESS);
> > > blt_set_ctrl_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> > > - uc_mocs, INDIRECT_ACCESS);
> > > + uc_mocs, DEFAULT_PAT_INDEX, INDIRECT_ACCESS);
> > > blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > intel_ctx_xe_sync(ctx, true);
> > > @@ -369,7 +371,8 @@ static void block_copy(int xe,
> > > blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> > > if (config->inplace) {
> > > blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> > > - T_LINEAR, COMPRESSION_DISABLED, comp_type);
> > > + DEFAULT_PAT_INDEX, T_LINEAR, COMPRESSION_DISABLED,
> > > + comp_type);
> > > blt.dst.ptr = mid->ptr;
> > > }
> > > @@ -450,8 +453,8 @@ static void block_multicopy(int xe,
> > > if (config->inplace) {
> > > blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> > > - mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> > > - comp_type);
> > > + mid->mocs, DEFAULT_PAT_INDEX, mid_tiling,
> > > + COMPRESSION_DISABLED, comp_type);
> > > blt3.dst.ptr = mid->ptr;
> > > }
> > > --
> > > 2.41.0
> > >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
@ 2023-10-09 22:03 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:03 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency
> bits
>
> Grab the PAT & coherency uapi additions.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> include/drm-uapi/xe_drm.h | 93
> +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 90 insertions(+), 3 deletions(-)
>
> diff --git a/include/drm-uapi/xe_drm.h b/include/drm-uapi/xe_drm.h index
> 804c02270..0a665f67f 100644
> --- a/include/drm-uapi/xe_drm.h
> +++ b/include/drm-uapi/xe_drm.h
> @@ -456,8 +456,54 @@ struct drm_xe_gem_create {
> */
> __u32 handle;
>
> - /** @pad: MBZ */
> - __u32 pad;
> + /**
> + * @coh_mode: The coherency mode for this object. This will limit the
> + * possible @cpu_caching values.
> + *
> + * Supported values:
> + *
> + * DRM_XE_GEM_COH_NONE: GPU access is assumed to be not
> coherent with
> + * CPU. CPU caches are not snooped.
> + *
> + * DRM_XE_GEM_COH_AT_LEAST_1WAY:
> + *
> + * CPU-GPU coherency must be at least 1WAY.
> + *
> + * If 1WAY then GPU access is coherent with CPU (CPU caches are
> snooped)
> + * until GPU acquires. The acquire by the GPU is not tracked by CPU
> + * caches.
> + *
> + * If 2WAY then should be fully coherent between GPU and CPU. Fully
> + * tracked by CPU caches. Both CPU and GPU caches are snooped.
> + *
> + * Note: On dgpu the GPU device never caches system memory. The
> device
> + * should be thought of as always 1WAY coherent, with the addition
> that
> + * the GPU never caches system memory. At least on current dgpu HW
> there
> + * is no way to turn off snooping so likely the different coherency
> + * modes of the pat_index make no difference for system memory.
> + */
> +#define DRM_XE_GEM_COH_NONE 1
> +#define DRM_XE_GEM_COH_AT_LEAST_1WAY 2
> + __u16 coh_mode;
> +
> + /**
> + * @cpu_caching: The CPU caching mode to select for this object. If
> + * mmaping the object the mode selected here will also be used.
> + *
> + * Supported values:
> + *
> + * DRM_XE_GEM_CPU_CACHING_WB: Allocate the pages with write-
> back caching.
> + * On iGPU this can't be used for scanout surfaces. The @coh_mode
> must
> + * be DRM_XE_GEM_COH_AT_LEAST_1WAY. Currently not allowed for
> objects placed
> + * in VRAM.
> + *
> + * DRM_XE_GEM_CPU_CACHING_WC: Allocate the pages as write-
> combined. This is
> + * uncached. Any @coh_mode is permitted. Scanout surfaces should
> likely
> + * use this. All objects that can be placed in VRAM must use this.
> + */
> +#define DRM_XE_GEM_CPU_CACHING_WB 1
> +#define DRM_XE_GEM_CPU_CACHING_WC 2
> + __u16 cpu_caching;
>
> /** @reserved: Reserved */
> __u64 reserved[2];
> @@ -552,8 +598,49 @@ struct drm_xe_vm_bind_op {
> */
> __u32 obj;
>
> + /**
> + * @pat_index: The platform defined @pat_index to use for this
> mapping.
> + * The index basically maps to some predefined memory attributes,
> + * including things like caching, coherency, compression etc. The exact
> + * meaning of the pat_index is platform specific and defined in the
> + * Bspec and PRMs. When the KMD sets up the binding the index here
> is
> + * encoded into the ppGTT PTE.
> + *
> + * For coherency the @pat_index needs to be least as coherent as
> + * drm_xe_gem_create.coh_mode. i.e coh_mode(pat_index) >=
> + * drm_xe_gem_create.coh_mode. The KMD will extract the coherency
> mode
> + * from the @pat_index and reject if there is a mismatch (see note
> below
> + * for pre-MTL platforms).
> + *
> + * Note: On pre-MTL platforms there is only a caching mode and no
> + * explicit coherency mode, but on such hardware there is always a
> + * shared-LLC (or is dgpu) so all GT memory accesses are coherent with
> + * CPU caches even with the caching mode set as uncached. It's only
> the
> + * display engine that is incoherent (on dgpu it must be in VRAM which
> + * is always mapped as WC on the CPU). However to keep the uapi
> somewhat
> + * consistent with newer platforms the KMD groups the different cache
> + * levels into the following coherency buckets on all pre-MTL platforms:
> + *
> + * ppGTT UC -> DRM_XE_GEM_COH_NONE
> + * ppGTT WC -> DRM_XE_GEM_COH_NONE
> + * ppGTT WT -> DRM_XE_GEM_COH_NONE
> + * ppGTT WB -> DRM_XE_GEM_COH_AT_LEAST_1WAY
> + *
> + * In practice UC/WC/WT should only ever used for scanout surfaces
> on
> + * such platforms (or perhaps in general for dma-buf if shared with
> + * another device) since it is only the display engine that is actually
> + * incoherent. Everything else should typically use WB given that we
> + * have a shared-LLC. On MTL+ this completely changes and the HW
> + * defines the coherency mode as part of the @pat_index, where
> + * incoherent GT access is possible.
> + *
> + * Note: For userptr and externally imported dma-buf the kernel
> expects
> + * either 1WAY or 2WAY for the @pat_index.
> + */
> + __u16 pat_index;
> +
> /** @pad: MBZ */
> - __u32 pad;
> + __u16 pad;
>
> union {
> /**
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
@ 2023-10-09 22:03 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:03 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT
>
> Display buffers will likely want WC instead of the default WB on the CPU
> side, given that the display engine is incoherent with CPU caches.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_fb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c index 54a66eb6a..f8a0db22c 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -1206,7 +1206,8 @@ static int create_bo_for_fb(struct igt_fb *fb, bool
> prefer_sysmem)
> igt_assert(err == 0 || err == -EOPNOTSUPP);
> } else if (is_xe_device(fd)) {
> fb->gem_handle = xe_bo_create_flags(fd, 0, fb->size,
> -
> visible_vram_if_possible(fd, 0));
> +
> visible_vram_if_possible(fd, 0) |
> +
> XE_GEM_CREATE_FLAG_SCANOUT);
> } else if (is_vc4_device(fd)) {
> fb->gem_handle = igt_vc4_create_bo(fd, fb->size);
>
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: mark buffers as SCANOUT
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
@ 2023-10-09 22:03 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:03 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 03/12] lib/igt_draw: mark buffers as SCANOUT
>
> Display buffers will likely want WC instead of the default WB on the CPU
> side, given that the display engine is incoherent with CPU caches.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/igt_draw.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/igt_draw.c b/lib/igt_draw.c index 476778a13..2332bf94a
> 100644
> --- a/lib/igt_draw.c
> +++ b/lib/igt_draw.c
> @@ -791,7 +791,8 @@ static void draw_rect_render(int fd, struct cmd_data
> *cmd_data,
> else
> tmp.handle = xe_bo_create_flags(fd, 0,
> ALIGN(tmp.size,
> xe_get_default_alignment(fd)),
> - visible_vram_if_possible(fd,
> 0));
> + visible_vram_if_possible(fd,
> 0) |
> +
> XE_GEM_CREATE_FLAG_SCANOUT);
>
> tmp.stride = rect->w * pixel_size;
> tmp.bpp = buf->bpp;
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
@ 2023-10-09 22:04 ` Mishra, Pallavi
0 siblings, 0 replies; 22+ messages in thread
From: Mishra, Pallavi @ 2023-10-09 22:04 UTC (permalink / raw)
To: Auld, Matthew, igt-dev@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org
> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: Thursday, October 5, 2023 8:31 AM
> To: igt-dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Souza, Jose <jose.souza@intel.com>;
> Mishra, Pallavi <pallavi.mishra@intel.com>
> Subject: [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for
> gem_create
>
> Most tests shouldn't care about such things, so likely it's just a case of
> picking the most sane default. However, we also add some helpers for the
> tests that do care.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: José Roberto de Souza <jose.souza@intel.com>
> Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
> ---
> lib/xe/xe_ioctl.c | 65 ++++++++++++++++++++++++++++++++++-------
> lib/xe/xe_ioctl.h | 8 +++++
> tests/intel/xe_create.c | 3 ++
> 3 files changed, 65 insertions(+), 11 deletions(-)
>
> diff --git a/lib/xe/xe_ioctl.c b/lib/xe/xe_ioctl.c index 730dcfd16..80696aa59
> 100644
> --- a/lib/xe/xe_ioctl.c
> +++ b/lib/xe/xe_ioctl.c
> @@ -233,13 +233,30 @@ void xe_vm_destroy(int fd, uint32_t vm)
> igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_VM_DESTROY, &destroy),
> 0); }
>
> -uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t
> flags,
> - uint32_t *handle)
> +void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
> + uint16_t *cpu_caching,
> + uint16_t *coh_mode)
> +{
> + if ((flags & all_memory_regions(fd)) != system_memory(fd) ||
> + flags & XE_GEM_CREATE_FLAG_SCANOUT) {
> + /* VRAM placements or scanout should always use WC */
> + *cpu_caching = DRM_XE_GEM_CPU_CACHING_WC;
> + *coh_mode = DRM_XE_GEM_COH_NONE;
> + } else {
> + *cpu_caching = DRM_XE_GEM_CPU_CACHING_WB;
> + *coh_mode = DRM_XE_GEM_COH_AT_LEAST_1WAY;
> + }
> +}
> +
> +static uint32_t ___xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +				      uint16_t cpu_caching, uint16_t coh_mode, uint32_t *handle)
> {
> 	struct drm_xe_gem_create create = {
> 		.vm_id = vm,
> 		.size = size,
> 		.flags = flags,
> +		.cpu_caching = cpu_caching,
> +		.coh_mode = coh_mode,
> 	};
> 	int err;
>
> @@ -249,6 +266,18 @@ uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags
>
> *handle = create.handle;
> return 0;
> +
> +}
> +
> +uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +			      uint32_t *handle)
> +{
> +	uint16_t cpu_caching, coh_mode;
> +
> +	__xe_default_coh_caching_from_flags(fd, flags, &cpu_caching, &coh_mode);
> +
> +	return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
> +				     handle);
> }
>
> uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
> @@ -260,19 +289,33 @@ uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags)
> return handle;
> }
>
> +uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +				uint16_t cpu_caching, uint16_t coh_mode,
> +				uint32_t *handle)
> +{
> +	return ___xe_bo_create_flags(fd, vm, size, flags, cpu_caching, coh_mode,
> +				     handle);
> +}
> +
> +uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +			      uint16_t cpu_caching, uint16_t coh_mode)
> +{
> +	uint32_t handle;
> +
> +	igt_assert_eq(__xe_bo_create_caching(fd, vm, size, flags,
> +					     cpu_caching, coh_mode, &handle), 0);
> +
> +	return handle;
> +}
> +
> uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size)
> {
> -	struct drm_xe_gem_create create = {
> -		.vm_id = vm,
> -		.size = size,
> -		.flags = vram_if_possible(fd, gt),
> -	};
> -	int err;
> +	uint32_t handle;
>
> -	err = igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create);
> -	igt_assert_eq(err, 0);
> +	igt_assert_eq(__xe_bo_create_flags(fd, vm, size, vram_if_possible(fd, gt),
> +					   &handle), 0);
>
> -	return create.handle;
> +	return handle;
> }
>
> uint32_t xe_bind_exec_queue_create(int fd, uint32_t vm, uint64_t ext)
> diff --git a/lib/xe/xe_ioctl.h b/lib/xe/xe_ioctl.h
> index 6c281b3bf..c18fc878c 100644
> --- a/lib/xe/xe_ioctl.h
> +++ b/lib/xe/xe_ioctl.h
> @@ -67,6 +67,14 @@ void xe_vm_destroy(int fd, uint32_t vm);
> uint32_t __xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> 			      uint32_t *handle);
> uint32_t xe_bo_create_flags(int fd, uint32_t vm, uint64_t size, uint32_t flags);
> +uint32_t __xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +				uint16_t cpu_caching, uint16_t coh_mode,
> +				uint32_t *handle);
> +uint32_t xe_bo_create_caching(int fd, uint32_t vm, uint64_t size, uint32_t flags,
> +			      uint16_t cpu_caching, uint16_t coh_mode);
> +void __xe_default_coh_caching_from_flags(int fd, uint32_t flags,
> +					 uint16_t *cpu_caching,
> +					 uint16_t *coh_mode);
> uint32_t xe_bo_create(int fd, int gt, uint32_t vm, uint64_t size);
> uint32_t xe_exec_queue_create(int fd, uint32_t vm,
> 			      struct drm_xe_engine_class_instance *instance,
> diff --git a/tests/intel/xe_create.c b/tests/intel/xe_create.c
> index 8d845e5c8..f5d2cc1b2 100644
> --- a/tests/intel/xe_create.c
> +++ b/tests/intel/xe_create.c
> @@ -30,6 +30,9 @@ static int __create_bo(int fd, uint32_t vm, uint64_t size, uint32_t flags,
>
> igt_assert(handlep);
>
> + __xe_default_coh_caching_from_flags(fd, flags, &create.cpu_caching,
> + &create.coh_mode);
> +
> if (igt_ioctl(fd, DRM_IOCTL_XE_GEM_CREATE, &create)) {
> ret = -errno;
> errno = 0;
> --
> 2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread
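To illustrate the split the commit message describes, here is a hedged usage sketch of the two creation paths added in this patch. It uses xe_bo_create_flags(), xe_bo_create_caching(), system_memory() and the uapi caching/coherency values from the diff above; the example_bo_creation() wrapper itself is hypothetical.

/*
 * Usage sketch only, based on the helpers added in this patch; the
 * wrapper function below is hypothetical.
 */
#include "igt.h"
#include "xe/xe_ioctl.h"
#include "xe/xe_query.h"

static void example_bo_creation(int fd, uint32_t vm, uint64_t size)
{
	uint32_t handle_default, handle_wc;

	/*
	 * A test that does not care about caching keeps using the
	 * flags-only helper; __xe_default_coh_caching_from_flags() then
	 * picks WB + AT_LEAST_1WAY for pure system-memory placements and
	 * WC + COH_NONE for VRAM or scanout placements.
	 */
	handle_default = xe_bo_create_flags(fd, vm, size, system_memory(fd));

	/*
	 * A test that explicitly exercises a caching/coherency combination
	 * uses the new _caching variant instead.
	 */
	handle_wc = xe_bo_create_caching(fd, vm, size, system_memory(fd),
					 DRM_XE_GEM_CPU_CACHING_WC,
					 DRM_XE_GEM_COH_NONE);

	gem_close(fd, handle_default);
	gem_close(fd, handle_wc);
}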
end of thread, other threads:[~2023-10-09 22:04 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-05 15:31 [Intel-xe] [PATCH i-g-t 00/12] PAT and cache coherency support Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 01/12] drm-uapi/xe_drm: sync to get pat and coherency bits Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 02/12] lib/igt_fb: mark buffers as SCANOUT Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 03/12] lib/igt_draw: " Matthew Auld
2023-10-09 22:03 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 04/12] lib/xe: support cpu_caching and coh_mod for gem_create Matthew Auld
2023-10-09 22:04 ` Mishra, Pallavi
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 05/12] tests/xe/mmap: add some tests for cpu_caching and coh_mode Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 06/12] lib/intel_pat: add helpers for common pat_index modes Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 07/12] lib/allocator: add get_offset_pat_index() helper Matthew Auld
2023-10-06 11:38 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 08/12] lib/intel_blt: support pat_index Matthew Auld
2023-10-06 11:51 ` [Intel-xe] [igt-dev] " Zbigniew Kempczyński
2023-10-06 12:08 ` Matthew Auld
2023-10-09 9:21 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 09/12] lib/intel_buf: " Matthew Auld
2023-10-06 12:13 ` Zbigniew Kempczyński
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 10/12] lib/xe_ioctl: update vm_bind to account for pat_index Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 11/12] tests/xe: add some vm_bind pat_index tests Matthew Auld
2023-10-05 15:31 ` [Intel-xe] [PATCH i-g-t 12/12] tests/intel-ci/xe: add pat and caching related tests Matthew Auld