public inbox for intel-gfx@lists.freedesktop.org
* [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+
@ 2015-03-10 21:17 jeff.mcgee
  2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee
  2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee
  0 siblings, 2 replies; 10+ messages in thread
From: jeff.mcgee @ 2015-03-10 21:17 UTC (permalink / raw)
  To: intel-gfx

From: Jeff McGee <jeff.mcgee@intel.com>

New IGT testing to cover the RC6/SSEU issue recently resolved on SKL.

http://lists.freedesktop.org/archives/intel-gfx/2015-February/060058.html

Jeff McGee (2):
  lib: Add media spin
  tests/pm_sseu: Create new test pm_sseu

 lib/Makefile.sources    |   2 +
 lib/intel_batchbuffer.c |  24 +++
 lib/intel_batchbuffer.h |  22 ++
 lib/media_spin.c        | 540 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/media_spin.h        |  39 ++++
 tests/.gitignore        |   1 +
 tests/Makefile.sources  |   1 +
 tests/pm_sseu.c         | 373 +++++++++++++++++++++++++++++++++
 8 files changed, 1002 insertions(+)
 create mode 100644 lib/media_spin.c
 create mode 100644 lib/media_spin.h
 create mode 100644 tests/pm_sseu.c

-- 
2.3.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH i-g-t 1/2] lib: Add media spin
  2015-03-10 21:17 [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ jeff.mcgee
@ 2015-03-10 21:17 ` jeff.mcgee
  2015-03-12 17:52   ` [PATCH i-g-t 1/2 v2] " jeff.mcgee
  2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee
  1 sibling, 1 reply; 10+ messages in thread
From: jeff.mcgee @ 2015-03-10 21:17 UTC (permalink / raw)
  To: intel-gfx

From: Jeff McGee <jeff.mcgee@intel.com>

The media spin utility is derived from media fill. It provides a
simple means of keeping the render engine (media pipeline) busy for a
controlled amount of time. It does so by emitting a batch containing a
single execution thread that spins in a tight loop for the requested
number of iterations. Each spin increments a counter whose final
32-bit value is written to the destination buffer on completion, so
the caller can verify that the workload actually ran. The
implementation supports Gen8, Gen8lp (Cherryview), and Gen9.

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
---
 lib/Makefile.sources    |   2 +
 lib/intel_batchbuffer.c |  24 +++
 lib/intel_batchbuffer.h |  22 ++
 lib/media_spin.c        | 540 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/media_spin.h        |  39 ++++
 5 files changed, 627 insertions(+)
 create mode 100644 lib/media_spin.c
 create mode 100644 lib/media_spin.h

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 76f353a..3d93629 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = 	\
 	media_fill_gen8.c       \
 	media_fill_gen8lp.c     \
 	media_fill_gen9.c       \
+	media_spin.h		\
+	media_spin.c		\
 	gen7_media.h            \
 	gen8_media.h            \
 	rendercopy_i915.c	\
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index c70f6d8..14970e4 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -39,6 +39,7 @@
 #include "intel_reg.h"
 #include "rendercopy.h"
 #include "media_fill.h"
+#include "media_spin.h"
 #include <i915_drm.h>
 
 /**
@@ -530,3 +531,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid)
 
 	return fill;
 }
+
+/**
+ * igt_get_media_spinfunc:
+ * @devid: pci device id
+ *
+ * Returns:
+ *
+ * The platform-specific media spin function pointer for the device specified
+ * with @devid. Will return NULL when no media spin function is implemented.
+ */
+igt_media_spinfunc_t igt_get_media_spinfunc(int devid)
+{
+	igt_media_spinfunc_t spin = NULL;
+
+	if (IS_GEN9(devid))
+		spin = gen9_media_spinfunc;
+	else if (IS_BROADWELL(devid))
+		spin = gen8_media_spinfunc;
+	else if (IS_CHERRYVIEW(devid))
+		spin = gen8lp_media_spinfunc;
+
+	return spin;
+}
diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
index 12f7be1..13b356a 100644
--- a/lib/intel_batchbuffer.h
+++ b/lib/intel_batchbuffer.h
@@ -265,4 +265,26 @@ typedef void (*igt_fillfunc_t)(struct intel_batchbuffer *batch,
 igt_fillfunc_t igt_get_media_fillfunc(int devid);
 igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid);
 
+/**
+ * igt_media_spinfunc_t:
+ * @batch: batchbuffer object
+ * @dst: destination i-g-t buffer object
+ * @spins: number of loops to execute
+ *
+ * This is the type of the per-platform media spin functions. The
+ * platform-specific implementation can be obtained by calling
+ * igt_get_media_spinfunc().
+ *
+ * The media spin function emits a batchbuffer for the render engine with
+ * the media pipeline selected. The workload consists of a single thread
+ * which spins in a tight loop the requested number of times. Each spin
+ * increments a counter whose final 32-bit value is written to the
+ * destination buffer on completion. This utility provides a simple way
+ * to keep the render engine busy for a set time for various tests.
+ */
+typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch,
+				     struct igt_buf *dst, uint32_t spins);
+
+igt_media_spinfunc_t igt_get_media_spinfunc(int devid);
+
 #endif
diff --git a/lib/media_spin.c b/lib/media_spin.c
new file mode 100644
index 0000000..b44c55a
--- /dev/null
+++ b/lib/media_spin.c
@@ -0,0 +1,540 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ * 	Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#include <intel_bufmgr.h>
+#include <i915_drm.h>
+#include "intel_reg.h"
+#include "drmtest.h"
+#include "intel_batchbuffer.h"
+#include "gen8_media.h"
+#include "media_spin.h"
+
+static const uint32_t spin_kernel[][4] = {
+	{ 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov (8)r4.0<1>:ud r0.0<8;8;1>:ud */
+	{ 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov (2)r4.0<1>:ud r2.0<2;2;1>:ud */
+	{ 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov (1)r4.8<1>:ud 0x3 */
+	{ 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov (1)r5.0<1>:ud 0 */
+	{ 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */
+	{ 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 (1)null<1> r1<0;1;0> r5<0;1;0> */
+	{ 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) -32 */
+	{ 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 (16)null<1> r4.0<0;1;0> 0x040a8000 */
+	{ 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov (8)r112<1>:ud r0.0<8;8;1>:ud */
+	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts (16)null<1> r112<0;1;0>:d 0x82000010 */
+};
+
+static uint32_t
+batch_used(struct intel_batchbuffer *batch)
+{
+	return batch->ptr - batch->buffer;
+}
+
+static uint32_t
+batch_align(struct intel_batchbuffer *batch, uint32_t align)
+{
+	uint32_t offset = batch_used(batch);
+	offset = ALIGN(offset, align);
+	batch->ptr = batch->buffer + offset;
+	return offset;
+}
+
+static void *
+batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
+{
+	uint32_t offset = batch_align(batch, align);
+	batch->ptr += size;
+	return memset(batch->buffer + offset, 0, size);
+}
+
+static uint32_t
+batch_offset(struct intel_batchbuffer *batch, void *ptr)
+{
+	return (uint8_t *)ptr - batch->buffer;
+}
+
+static uint32_t
+batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size,
+	   uint32_t align)
+{
+	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
+}
+
+static void
+gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
+{
+	int ret;
+
+	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
+	if (ret == 0)
+		ret = drm_intel_gem_bo_context_exec(batch->bo, NULL,
+						    batch_end, 0);
+	igt_assert(ret == 0);
+}
+
+static uint32_t
+gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch,
+			    uint32_t iters)
+{
+	uint32_t *curbe_buffer;
+	uint32_t offset;
+
+	curbe_buffer = batch_alloc(batch, 64, 64);
+	offset = batch_offset(batch, curbe_buffer);
+	*curbe_buffer = iters;
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_surface_state(struct intel_batchbuffer *batch,
+			struct igt_buf *buf,
+			uint32_t format,
+			int is_dst)
+{
+	struct gen8_surface_state *ss;
+	uint32_t write_domain, read_domain, offset;
+	int ret;
+
+	if (is_dst) {
+		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
+	} else {
+		write_domain = 0;
+		read_domain = I915_GEM_DOMAIN_SAMPLER;
+	}
+
+	ss = batch_alloc(batch, sizeof(*ss), 64);
+	offset = batch_offset(batch, ss);
+
+	ss->ss0.surface_type = GEN8_SURFACE_2D;
+	ss->ss0.surface_format = format;
+	ss->ss0.render_cache_read_write = 1;
+	ss->ss0.vertical_alignment = 1; /* align 4 */
+	ss->ss0.horizontal_alignment = 1; /* align 4 */
+
+	if (buf->tiling == I915_TILING_X)
+		ss->ss0.tiled_mode = 2;
+	else if (buf->tiling == I915_TILING_Y)
+		ss->ss0.tiled_mode = 3;
+
+	ss->ss8.base_addr = buf->bo->offset;
+
+	ret = drm_intel_bo_emit_reloc(batch->bo,
+				batch_offset(batch, ss) + 8 * 4,
+				buf->bo, 0,
+				read_domain, write_domain);
+	igt_assert(ret == 0);
+
+	ss->ss2.height = igt_buf_height(buf) - 1;
+	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss3.pitch  = buf->stride - 1;
+
+	ss->ss7.shader_chanel_select_r = 4;
+	ss->ss7.shader_chanel_select_g = 5;
+	ss->ss7.shader_chanel_select_b = 6;
+	ss->ss7.shader_chanel_select_a = 7;
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_binding_table(struct intel_batchbuffer *batch,
+			struct igt_buf *dst)
+{
+	uint32_t *binding_table, offset;
+
+	binding_table = batch_alloc(batch, 32, 64);
+	offset = batch_offset(batch, binding_table);
+
+	binding_table[0] = gen8_spin_surface_state(batch, dst,
+					GEN8_SURFACEFORMAT_R8_UNORM, 1);
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_media_kernel(struct intel_batchbuffer *batch,
+		       const uint32_t kernel[][4],
+		       size_t size)
+{
+	uint32_t offset;
+
+	offset = batch_copy(batch, kernel, size, 64);
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_interface_descriptor(struct intel_batchbuffer *batch,
+			       struct igt_buf *dst)
+{
+	struct gen8_interface_descriptor_data *idd;
+	uint32_t offset;
+	uint32_t binding_table_offset, kernel_offset;
+
+	binding_table_offset = gen8_spin_binding_table(batch, dst);
+	kernel_offset = gen8_spin_media_kernel(batch, spin_kernel,
+					       sizeof(spin_kernel));
+
+	idd = batch_alloc(batch, sizeof(*idd), 64);
+	offset = batch_offset(batch, idd);
+
+	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
+
+	idd->desc2.single_program_flow = 1;
+	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
+
+	idd->desc3.sampler_count = 0;      /* 0 samplers used */
+	idd->desc3.sampler_state_pointer = 0;
+
+	idd->desc4.binding_table_entry_count = 0;
+	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
+
+	idd->desc5.constant_urb_entry_read_offset = 0;
+	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
+
+	return offset;
+}
+
+static void
+gen8_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
+
+	/* general */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
+		0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general state buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* dynamic state buffer size */
+	OUT_BATCH(1 << 12 | 1);
+	/* indirect object buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* instruction buffer size; must set the modify enable bit, otherwise it may result in a GPU hang */
+	OUT_BATCH(1 << 12 | 1);
+}
+
+static void
+gen9_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
+
+	/* general */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
+		0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general state buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* dynamic state buffer size */
+	OUT_BATCH(1 << 12 | 1);
+	/* indirect object buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* instruction buffer size; must set the modify enable bit, otherwise it may result in a GPU hang */
+	OUT_BATCH(1 << 12 | 1);
+
+	/* Bindless surface state base address */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	OUT_BATCH(0xfffff000);
+}
+
+static void
+gen8_emit_vfe_state(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
+
+	/* scratch buffer */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* number of threads & urb entries */
+	OUT_BATCH(2 << 8);
+
+	OUT_BATCH(0);
+
+	/* urb entry size & curbe size */
+	OUT_BATCH(2 << 16 |
+		2);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void
+gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
+{
+	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* curbe total data length */
+	OUT_BATCH(64);
+	/* curbe data start address, relative to the dynamic state base address */
+	OUT_BATCH(curbe_buffer);
+}
+
+static void
+gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch,
+				    uint32_t interface_descriptor)
+{
+	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* interface descriptor data length */
+	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
+	/* interface descriptor address, relative to the dynamic state base address */
+	OUT_BATCH(interface_descriptor);
+}
+
+static void
+gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
+	OUT_BATCH(0);
+}
+
+static void
+gen8_emit_media_objects(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
+
+	/* interface descriptor offset */
+	OUT_BATCH(0);
+
+	/* without indirect data */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* inline data (xoffset, yoffset) */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	gen8_emit_media_state_flush(batch);
+}
+
+static void
+gen8lp_emit_media_objects(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
+
+	/* interface descriptor offset */
+	OUT_BATCH(0);
+
+	/* without indirect data */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* inline data (xoffset, yoffset) */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+/*
+ * This sets up the media pipeline and the batchbuffer layout:
+ *
+ * +---------------+ <---- 4096
+ * |       ^       |
+ * |       |       |
+ * |    various    |
+ * |      state    |
+ * |       |       |
+ * |_______|_______| <---- 2048 + ?
+ * |       ^       |
+ * |       |       |
+ * |   batch       |
+ * |    commands   |
+ * |       |       |
+ * |       |       |
+ * +---------------+ <---- 0 + ?
+ *
+ */
+
+#define BATCH_STATE_SPLIT 2048
+
+void
+gen8_media_spinfunc(struct intel_batchbuffer *batch,
+		    struct igt_buf *dst, uint32_t spins)
+{
+	uint32_t curbe_buffer, interface_descriptor;
+	uint32_t batch_end;
+
+	intel_batchbuffer_flush_with_context(batch, NULL);
+
+	/* setup states */
+	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+
+	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
+	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
+	igt_assert(batch->ptr < &batch->buffer[4095]);
+
+	/* media pipeline */
+	batch->ptr = batch->buffer;
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
+	gen8_emit_state_base_address(batch);
+
+	gen8_emit_vfe_state(batch);
+
+	gen8_emit_curbe_load(batch, curbe_buffer);
+
+	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+
+	gen8_emit_media_objects(batch);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	batch_end = batch_align(batch, 8);
+	igt_assert(batch_end < BATCH_STATE_SPLIT);
+
+	gen8_render_flush(batch, batch_end);
+	intel_batchbuffer_reset(batch);
+}
+
+void
+gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
+		      struct igt_buf *dst, uint32_t spins)
+{
+	uint32_t curbe_buffer, interface_descriptor;
+	uint32_t batch_end;
+
+	intel_batchbuffer_flush_with_context(batch, NULL);
+
+	/* setup states */
+	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+
+	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
+	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
+	igt_assert(batch->ptr < &batch->buffer[4095]);
+
+	/* media pipeline */
+	batch->ptr = batch->buffer;
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
+	gen8_emit_state_base_address(batch);
+
+	gen8_emit_vfe_state(batch);
+
+	gen8_emit_curbe_load(batch, curbe_buffer);
+
+	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+
+	gen8lp_emit_media_objects(batch);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	batch_end = batch_align(batch, 8);
+	igt_assert(batch_end < BATCH_STATE_SPLIT);
+
+	gen8_render_flush(batch, batch_end);
+	intel_batchbuffer_reset(batch);
+}
+
+void
+gen9_media_spinfunc(struct intel_batchbuffer *batch,
+		    struct igt_buf *dst, uint32_t spins)
+{
+	uint32_t curbe_buffer, interface_descriptor;
+	uint32_t batch_end;
+
+	intel_batchbuffer_flush_with_context(batch, NULL);
+
+	/* setup states */
+	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+
+	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
+	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
+	igt_assert(batch->ptr < &batch->buffer[4095]);
+
+	/* media pipeline */
+	batch->ptr = batch->buffer;
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
+			GEN9_FORCE_MEDIA_AWAKE_ENABLE |
+			GEN9_SAMPLER_DOP_GATE_DISABLE |
+			GEN9_PIPELINE_SELECTION_MASK |
+			GEN9_SAMPLER_DOP_GATE_MASK |
+			GEN9_FORCE_MEDIA_AWAKE_MASK);
+	gen9_emit_state_base_address(batch);
+
+	gen8_emit_vfe_state(batch);
+
+	gen8_emit_curbe_load(batch, curbe_buffer);
+
+	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+
+	gen8_emit_media_objects(batch);
+
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
+			GEN9_FORCE_MEDIA_AWAKE_DISABLE |
+			GEN9_SAMPLER_DOP_GATE_ENABLE |
+			GEN9_PIPELINE_SELECTION_MASK |
+			GEN9_SAMPLER_DOP_GATE_MASK |
+			GEN9_FORCE_MEDIA_AWAKE_MASK);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	batch_end = batch_align(batch, 8);
+	igt_assert(batch_end < BATCH_STATE_SPLIT);
+
+	gen8_render_flush(batch, batch_end);
+	intel_batchbuffer_reset(batch);
+}
diff --git a/lib/media_spin.h b/lib/media_spin.h
new file mode 100644
index 0000000..8bc4829
--- /dev/null
+++ b/lib/media_spin.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ * 	Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#ifndef MEDIA_SPIN_H
+#define MEDIA_SPIN_H
+
+void gen8_media_spinfunc(struct intel_batchbuffer *batch,
+			 struct igt_buf *dst, uint32_t spins);
+
+void gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
+			   struct igt_buf *dst, uint32_t spins);
+
+void gen9_media_spinfunc(struct intel_batchbuffer *batch,
+			 struct igt_buf *dst, uint32_t spins);
+
+#endif /* MEDIA_SPIN_H */
-- 
2.3.0



* [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu
  2015-03-10 21:17 [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ jeff.mcgee
  2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee
@ 2015-03-10 21:17 ` jeff.mcgee
  2015-03-12 12:09   ` Thomas Wood
  2015-03-12 17:54   ` [PATCH i-g-t 2/2 v2] " jeff.mcgee
  1 sibling, 2 replies; 10+ messages in thread
From: jeff.mcgee @ 2015-03-10 21:17 UTC (permalink / raw)
  To: intel-gfx

From: Jeff McGee <jeff.mcgee@intel.com>

New test pm_sseu is intended to collect any subtests related to the
slice/subslice/EU (SSEU) power gating feature. The sole initial
subtest, 'full-enable', confirms that the slice/subslice/EU state is
at full enablement while the render engine is active. Starting with
Gen9 SKL, the render power gating feature can leave the SSEU in a
partially enabled state upon resumption of render work unless
explicit action is taken.

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
---
 tests/.gitignore       |   1 +
 tests/Makefile.sources |   1 +
 tests/pm_sseu.c        | 373 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 375 insertions(+)
 create mode 100644 tests/pm_sseu.c

diff --git a/tests/.gitignore b/tests/.gitignore
index 7b4dd94..23094ce 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -144,6 +144,7 @@ pm_psr
 pm_rc6_residency
 pm_rpm
 pm_rps
+pm_sseu
 prime_nv_api
 prime_nv_pcopy
 prime_nv_test
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 51e8376..74106c0 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -82,6 +82,7 @@ TESTS_progs_M = \
 	pm_rpm \
 	pm_rps \
 	pm_rc6_residency \
+	pm_sseu \
 	prime_self_import \
 	template \
 	$(NULL)
diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c
new file mode 100644
index 0000000..45aeef3
--- /dev/null
+++ b/tests/pm_sseu.c
@@ -0,0 +1,373 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <time.h>
+#include "drmtest.h"
+#include "i915_drm.h"
+#include "intel_io.h"
+#include "intel_bufmgr.h"
+#include "intel_batchbuffer.h"
+#include "intel_chipset.h"
+#include "ioctl_wrappers.h"
+#include "igt_debugfs.h"
+#include "media_spin.h"
+
+static double
+to_dt(const struct timespec *start, const struct timespec *end)
+{
+	double dt;
+
+	dt = (end->tv_sec - start->tv_sec) * 1e3;
+	dt += (end->tv_nsec - start->tv_nsec) * 1e-6;
+
+	return dt;
+}
+
+struct status {
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+		bool has_slice_pg;
+		bool has_subslice_pg;
+		bool has_eu_pg;
+	} info;
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+	} hw;
+};
+
+#define DBG_STATUS_BUF_SIZE 4096
+
+static struct {
+	int init;
+	int status_fd;
+	char status_buf[DBG_STATUS_BUF_SIZE];
+} dbg;
+
+static void
+dbg_get_status_section(const char *title, char **first, char **last)
+{
+	char *pos;
+
+	*first = strstr(dbg.status_buf, title);
+	igt_assert(*first != NULL);
+
+	pos = *first;
+	do {
+		pos = strchr(pos, '\n');
+		igt_assert(pos != NULL);
+		pos++;
+	} while (*pos == ' '); /* lines in the section begin with a space */
+	*last = pos - 1;
+}
+
+static int
+dbg_get_int(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	return strtol(pos, &pos, 10);
+}
+
+static bool
+dbg_get_bool(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	if (*pos == 'y')
+		return true;
+	if (*pos == 'n')
+		return false;
+
+	igt_assert(false);
+	return false;
+}
+
+static void
+dbg_get_status(struct status *stat)
+{
+	char *first, *last;
+	int nread;
+
+	lseek(dbg.status_fd, 0, SEEK_SET);
+	nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE);
+	igt_assert(nread < DBG_STATUS_BUF_SIZE);
+	dbg.status_buf[nread] = '\0';
+
+	memset(stat, 0, sizeof(*stat));
+
+	dbg_get_status_section("SSEU Device Info", &first, &last);
+	stat->info.slice_total =
+		dbg_get_int(first, last, "Available Slice Total:");
+	stat->info.subslice_total =
+		dbg_get_int(first, last, "Available Subslice Total:");
+	stat->info.subslice_per =
+		dbg_get_int(first, last, "Available Subslice Per Slice:");
+	stat->info.eu_total =
+		dbg_get_int(first, last, "Available EU Total:");
+	stat->info.eu_per =
+		dbg_get_int(first, last, "Available EU Per Subslice:");
+	stat->info.has_slice_pg =
+		dbg_get_bool(first, last, "Has Slice Power Gating:");
+	stat->info.has_subslice_pg =
+		dbg_get_bool(first, last, "Has Subslice Power Gating:");
+	stat->info.has_eu_pg =
+		dbg_get_bool(first, last, "Has EU Power Gating:");
+
+	dbg_get_status_section("SSEU Device Status", &first, &last);
+	stat->hw.slice_total =
+		dbg_get_int(first, last, "Enabled Slice Total:");
+	stat->hw.subslice_total =
+		dbg_get_int(first, last, "Enabled Subslice Total:");
+	stat->hw.subslice_per =
+		dbg_get_int(first, last, "Enabled Subslice Per Slice:");
+	stat->hw.eu_total =
+		dbg_get_int(first, last, "Enabled EU Total:");
+	stat->hw.eu_per =
+		dbg_get_int(first, last, "Enabled EU Per Subslice:");
+}
+
+static void
+dbg_init(void)
+{
+	dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY);
+	igt_assert(dbg.status_fd != -1);
+	dbg.init = 1;
+}
+
+static void
+dbg_deinit(void)
+{
+	switch (dbg.init)
+	{
+	case 1:
+		close(dbg.status_fd);
+	}
+}
+
+static struct {
+	int init;
+	int drm_fd;
+	int devid;
+	int gen;
+	int has_ppgtt;
+	drm_intel_bufmgr *bufmgr;
+	struct intel_batchbuffer *batch;
+	igt_media_spinfunc_t spinfunc;
+	struct igt_buf buf;
+	uint32_t spins_per_msec;
+} gem;
+
+static void
+gem_check_spin(uint32_t spins)
+{
+	uint32_t *data;
+
+	data = (uint32_t*)gem.buf.bo->virtual;
+	igt_assert(*data == spins);
+}
+
+static uint32_t
+gem_get_target_spins(double dt)
+{
+	struct timespec tstart, tdone;
+	double prev_dt, cur_dt;
+	uint32_t spins;
+	int i, ret;
+
+	/* Double increments until we bound the target time */
+	prev_dt = 0.0;
+	for (i = 0; i < 32; i++) {
+		spins = 1 << i;
+		clock_gettime(CLOCK_MONOTONIC, &tstart);
+
+		gem.spinfunc(gem.batch, &gem.buf, spins);
+		ret = drm_intel_bo_map(gem.buf.bo, 0);
+		igt_assert(ret == 0);
+		clock_gettime(CLOCK_MONOTONIC, &tdone);
+
+		gem_check_spin(spins);
+		drm_intel_bo_unmap(gem.buf.bo);
+
+		cur_dt = to_dt(&tstart, &tdone);
+		if (cur_dt > dt)
+			break;
+		prev_dt = cur_dt;
+	}
+	igt_assert(i != 32);
+
+	/* Linearly interpolate between i and i-1 to get target increments */
+	spins = 1 << (i-1); /* lower bound spins */
+	spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */
+
+	return spins;
+}
+
+static void
+gem_init(void)
+{
+	gem.drm_fd = drm_open_any();
+	gem.init = 1;
+
+	gem.devid = intel_get_drm_devid(gem.drm_fd);
+	gem.gen = intel_gen(gem.devid);
+	gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd);
+
+	gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096);
+	igt_assert(gem.bufmgr);
+	gem.init = 2;
+
+	drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr);
+
+	gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid);
+	igt_assert(gem.batch);
+	gem.init = 3;
+
+	gem.spinfunc = igt_get_media_spinfunc(gem.devid);
+	igt_assert(gem.spinfunc);
+
+	gem.buf.stride = sizeof(uint32_t);
+	gem.buf.tiling = I915_TILING_NONE;
+	gem.buf.size = gem.buf.stride;
+	gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096);
+	igt_assert(gem.buf.bo);
+	gem.init = 4;
+
+	gem.spins_per_msec = gem_get_target_spins(100) / 100;
+}
+
+static void
+gem_deinit(void)
+{
+	switch (gem.init)
+	{
+	case 4:
+		drm_intel_bo_unmap(gem.buf.bo);
+		drm_intel_bo_unreference(gem.buf.bo);
+	case 3:
+		intel_batchbuffer_free(gem.batch);
+	case 2:
+		drm_intel_bufmgr_destroy(gem.bufmgr);
+	case 1:
+		close(gem.drm_fd);
+	}
+}
+
+static void
+check_full_enable(struct status *stat)
+{
+	igt_assert(stat->hw.slice_total == stat->info.slice_total);
+	igt_assert(stat->hw.subslice_total == stat->info.subslice_total);
+	igt_assert(stat->hw.subslice_per == stat->info.subslice_per);
+
+	/*
+	 * EUs are powered on in pairs, but one EU in a pair may be
+	 * non-functional due to fusing. The determination of enabled
+	 * EUs does not account for this and can therefore exceed the
+	 * available count. Allow for this small discrepancy in our
+	 * comparison.
+	 */
+	igt_assert(stat->hw.eu_total >= stat->info.eu_total);
+	igt_assert(stat->hw.eu_per >= stat->info.eu_per);
+}
+
+static void
+full_enable(void)
+{
+	struct status stat;
+	const int spin_msec = 10;
+	int ret, spins;
+
+	/* Simulation doesn't currently model slice/subslice/EU power gating. */
+	igt_skip_on_simulation();
+
+	/*
+	 * Gen9 SKL is the first case in which render power gating can leave
+	 * slice/subslice/EU in a partially enabled state upon resumption of
+	 * render work. So start checking that this is prevented as of Gen9.
+	 */
+	igt_require(gem.gen >= 9);
+
+	spins = spin_msec * gem.spins_per_msec;
+
+	gem.spinfunc(gem.batch, &gem.buf, spins);
+
+	usleep(2000); /* 2ms wait to make sure batch is running */
+	dbg_get_status(&stat);
+
+	ret = drm_intel_bo_map(gem.buf.bo, 0);
+	igt_assert(ret == 0);
+
+	gem_check_spin(spins);
+	drm_intel_bo_unmap(gem.buf.bo);
+
+	check_full_enable(&stat);
+}
+
+static void
+exit_handler(int sig)
+{
+	gem_deinit();
+	dbg_deinit();
+}
+
+igt_main
+{
+	igt_fixture {
+		igt_install_exit_handler(exit_handler);
+
+		dbg_init();
+		gem_init();
+	}
+
+	igt_subtest("full-enable")
+		full_enable();
+}
-- 
2.3.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu
  2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee
@ 2015-03-12 12:09   ` Thomas Wood
  2015-03-18 16:51     ` Jeff McGee
  2015-03-12 17:54   ` [PATCH i-g-t 2/2 v2] " jeff.mcgee
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Wood @ 2015-03-12 12:09 UTC (permalink / raw)
  To: jeff.mcgee; +Cc: Intel Graphics Development

On 10 March 2015 at 21:17,  <jeff.mcgee@intel.com> wrote:
> From: Jeff McGee <jeff.mcgee@intel.com>
>
> New test pm_sseu is intended for any subtest related to the
> slice/subslice/EU power gating feature. The sole initial subtest,
> 'full-enable', confirms that the slice/subslice/EU state is at
> full enablement when the render engine is active. Starting with
> Gen9 SKL, the render power gating feature can leave SSEU in a
> partially enabled state upon resumption of render work unless
> explicit action is taken.

Please add a short description to the test using the
IGT_TEST_DESCRIPTION macro, so that it is included in the
documentation and help output.

>
> Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
> ---
>  tests/.gitignore       |   1 +
>  tests/Makefile.sources |   1 +
>  tests/pm_sseu.c        | 373 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 375 insertions(+)
>  create mode 100644 tests/pm_sseu.c
>
> diff --git a/tests/.gitignore b/tests/.gitignore
> index 7b4dd94..23094ce 100644
> --- a/tests/.gitignore
> +++ b/tests/.gitignore
> @@ -144,6 +144,7 @@ pm_psr
>  pm_rc6_residency
>  pm_rpm
>  pm_rps
> +pm_sseu
>  prime_nv_api
>  prime_nv_pcopy
>  prime_nv_test
> diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> index 51e8376..74106c0 100644
> --- a/tests/Makefile.sources
> +++ b/tests/Makefile.sources
> @@ -82,6 +82,7 @@ TESTS_progs_M = \
>         pm_rpm \
>         pm_rps \
>         pm_rc6_residency \
> +       pm_sseu \
>         prime_self_import \
>         template \
>         $(NULL)
> diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c
> new file mode 100644
> index 0000000..45aeef3
> --- /dev/null
> +++ b/tests/pm_sseu.c
> @@ -0,0 +1,373 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *    Jeff McGee <jeff.mcgee@intel.com>
> + */
> +
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <time.h>
> +#include "drmtest.h"
> +#include "i915_drm.h"
> +#include "intel_io.h"
> +#include "intel_bufmgr.h"
> +#include "intel_batchbuffer.h"
> +#include "intel_chipset.h"
> +#include "ioctl_wrappers.h"
> +#include "igt_debugfs.h"
> +#include "media_spin.h"
> +
> +static double
> +to_dt(const struct timespec *start, const struct timespec *end)
> +{
> +       double dt;
> +
> +       dt = (end->tv_sec - start->tv_sec) * 1e3;
> +       dt += (end->tv_nsec - start->tv_nsec) * 1e-6;
> +
> +       return dt;
> +}
> +
> +struct status {
> +       struct {
> +               int slice_total;
> +               int subslice_total;
> +               int subslice_per;
> +               int eu_total;
> +               int eu_per;
> +               bool has_slice_pg;
> +               bool has_subslice_pg;
> +               bool has_eu_pg;
> +       } info;
> +       struct {
> +               int slice_total;
> +               int subslice_total;
> +               int subslice_per;
> +               int eu_total;
> +               int eu_per;
> +       } hw;
> +};
> +
> +#define DBG_STATUS_BUF_SIZE 4096
> +
> +struct {
> +       int init;
> +       int status_fd;
> +       char status_buf[DBG_STATUS_BUF_SIZE];
> +} dbg;
> +
> +static void
> +dbg_get_status_section(const char *title, char **first, char **last)
> +{
> +       char *pos;
> +
> +       *first = strstr(dbg.status_buf, title);
> +       igt_assert(*first != NULL);
> +
> +       pos = *first;
> +       do {
> +               pos = strchr(pos, '\n');
> +               igt_assert(pos != NULL);
> +               pos++;
> +       } while (*pos == ' '); /* lines in the section begin with a space */
> +       *last = pos - 1;
> +}
> +
> +static int
> +dbg_get_int(const char *first, const char *last, const char *name)
> +{
> +       char *pos;
> +
> +       pos = strstr(first, name);
> +       igt_assert(pos != NULL);
> +       pos = strstr(pos, ":");
> +       igt_assert(pos != NULL);
> +       pos += 2;
> +       igt_assert(pos < last);
> +
> +       return strtol(pos, &pos, 10);
> +}
> +
> +static bool
> +dbg_get_bool(const char *first, const char *last, const char *name)
> +{
> +       char *pos;
> +
> +       pos = strstr(first, name);
> +       igt_assert(pos != NULL);
> +       pos = strstr(pos, ":");
> +       igt_assert(pos != NULL);
> +       pos += 2;
> +       igt_assert(pos < last);
> +
> +       if (*pos == 'y')
> +               return true;
> +       if (*pos == 'n')
> +               return false;
> +
> +       igt_assert(false);

Perhaps use igt_assert_f() to add a more detailed error message?


> +       return false;
> +}
> +
> +static void
> +dbg_get_status(struct status *stat)
> +{
> +       char *first, *last;
> +       int nread;
> +
> +       lseek(dbg.status_fd, 0, SEEK_SET);
> +       nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE);
> +       igt_assert(nread < DBG_STATUS_BUF_SIZE);

igt_assert_lt() would produce a better error message here. Running
igt.cocci will also suggest other similar changes elsewhere.


> +       dbg.status_buf[nread] = '\0';
> +
> +       memset(stat, 0, sizeof(*stat));
> +
> +       dbg_get_status_section("SSEU Device Info", &first, &last);
> +       stat->info.slice_total =
> +               dbg_get_int(first, last, "Available Slice Total:");
> +       stat->info.subslice_total =
> +               dbg_get_int(first, last, "Available Subslice Total:");
> +       stat->info.subslice_per =
> +               dbg_get_int(first, last, "Available Subslice Per Slice:");
> +       stat->info.eu_total =
> +               dbg_get_int(first, last, "Available EU Total:");
> +       stat->info.eu_per =
> +               dbg_get_int(first, last, "Available EU Per Subslice:");
> +       stat->info.has_slice_pg =
> +               dbg_get_bool(first, last, "Has Slice Power Gating:");
> +       stat->info.has_subslice_pg =
> +               dbg_get_bool(first, last, "Has Subslice Power Gating:");
> +       stat->info.has_eu_pg =
> +               dbg_get_bool(first, last, "Has EU Power Gating:");
> +
> +       dbg_get_status_section("SSEU Device Status", &first, &last);
> +       stat->hw.slice_total =
> +               dbg_get_int(first, last, "Enabled Slice Total:");
> +       stat->hw.subslice_total =
> +               dbg_get_int(first, last, "Enabled Subslice Total:");
> +       stat->hw.subslice_per =
> +               dbg_get_int(first, last, "Enabled Subslice Per Slice:");
> +       stat->hw.eu_total =
> +               dbg_get_int(first, last, "Enabled EU Total:");
> +       stat->hw.eu_per =
> +               dbg_get_int(first, last, "Enabled EU Per Subslice:");
> +}
> +
> +static void
> +dbg_init(void)
> +{
> +       dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY);
> +       igt_assert(dbg.status_fd != -1);
> +       dbg.init = 1;
> +}
> +
> +static void
> +dbg_deinit(void)
> +{
> +       switch (dbg.init)
> +       {
> +       case 1:
> +               close(dbg.status_fd);
> +       }
> +}
> +
> +struct {
> +       int init;
> +       int drm_fd;
> +       int devid;
> +       int gen;
> +       int has_ppgtt;
> +       drm_intel_bufmgr *bufmgr;
> +       struct intel_batchbuffer *batch;
> +       igt_media_spinfunc_t spinfunc;
> +       struct igt_buf buf;
> +       uint32_t spins_per_msec;
> +} gem;
> +
> +static void
> +gem_check_spin(uint32_t spins)
> +{
> +       uint32_t *data;
> +
> +       data = (uint32_t*)gem.buf.bo->virtual;
> +       igt_assert(*data == spins);
> +}
> +
> +static uint32_t
> +gem_get_target_spins(double dt)
> +{
> +       struct timespec tstart, tdone;
> +       double prev_dt, cur_dt;
> +       uint32_t spins;
> +       int i, ret;
> +
> +       /* Double increments until we bound the target time */
> +       prev_dt = 0.0;
> +       for (i = 0; i < 32; i++) {
> +               spins = 1 << i;
> +               clock_gettime(CLOCK_MONOTONIC, &tstart);
> +
> +               gem.spinfunc(gem.batch, &gem.buf, spins);
> +               ret = drm_intel_bo_map(gem.buf.bo, 0);
> +               igt_assert (ret == 0);
> +               clock_gettime(CLOCK_MONOTONIC, &tdone);
> +
> +               gem_check_spin(spins);
> +               drm_intel_bo_unmap(gem.buf.bo);
> +
> +               cur_dt = to_dt(&tstart, &tdone);
> +               if (cur_dt > dt)
> +                       break;
> +               prev_dt = cur_dt;
> +       }
> +       igt_assert(i != 32);
> +
> +       /* Linearly interpolate between i and i-1 to get target increments */
> +       spins = 1 << (i-1); /* lower bound spins */
> +       spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */
> +
> +       return spins;
> +}
> +
> +static void
> +gem_init(void)
> +{
> +       gem.drm_fd = drm_open_any();
> +       gem.init = 1;
> +
> +       gem.devid = intel_get_drm_devid(gem.drm_fd);
> +       gem.gen = intel_gen(gem.devid);
> +       gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd);
> +
> +       gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096);
> +       igt_assert(gem.bufmgr);
> +       gem.init = 2;
> +
> +       drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr);
> +
> +       gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid);
> +       igt_assert(gem.batch);
> +       gem.init = 3;
> +
> +       gem.spinfunc = igt_get_media_spinfunc(gem.devid);
> +       igt_assert(gem.spinfunc);
> +
> +       gem.buf.stride = sizeof(uint32_t);
> +       gem.buf.tiling = I915_TILING_NONE;
> +       gem.buf.size = gem.buf.stride;
> +       gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096);
> +       igt_assert(gem.buf.bo);
> +       gem.init = 4;
> +
> +       gem.spins_per_msec = gem_get_target_spins(100) / 100;
> +}
> +
> +static void
> +gem_deinit(void)
> +{
> +       switch (gem.init)
> +       {
> +       case 4:
> +               drm_intel_bo_unmap(gem.buf.bo);
> +               drm_intel_bo_unreference(gem.buf.bo);
> +       case 3:
> +               intel_batchbuffer_free(gem.batch);
> +       case 2:
> +               drm_intel_bufmgr_destroy(gem.bufmgr);
> +       case 1:
> +               close(gem.drm_fd);
> +       }
> +}
> +
> +static void
> +check_full_enable(struct status *stat)
> +{
> +       igt_assert(stat->hw.slice_total == stat->info.slice_total);
> +       igt_assert(stat->hw.subslice_total == stat->info.subslice_total);
> +       igt_assert(stat->hw.subslice_per == stat->info.subslice_per);
> +
> +       /*
> +        * EU are powered in pairs, but it is possible for one EU in the pair
> +        * to be non-functional due to fusing. The determination of enabled
> +        * EU does not account for this and can therefore actually exceed the
> +        * available count. Allow for this small discrepancy in our
> +        * comparison.
> +       */
> +       igt_assert(stat->hw.eu_total >= stat->info.eu_total);
> +       igt_assert(stat->hw.eu_per >= stat->info.eu_per);
> +}
> +
> +static void
> +full_enable(void)
> +{
> +       struct status stat;
> +       const int spin_msec = 10;
> +       int ret, spins;
> +
> +       /* Simulation doesn't currently model slice/subslice/EU power gating. */
> +       igt_skip_on_simulation();
> +
> +       /*
> +        * Gen9 SKL is the first case in which render power gating can leave
> +        * slice/subslice/EU in a partially enabled state upon resumption of
> +        * render work. So start checking that this is prevented as of Gen9.
> +       */
> +       igt_require(gem.gen >= 9);
> +
> +       spins = spin_msec * gem.spins_per_msec;
> +
> +       gem.spinfunc(gem.batch, &gem.buf, spins);
> +
> +       usleep(2000); /* 2ms wait to make sure batch is running */
> +       dbg_get_status(&stat);
> +
> +       ret = drm_intel_bo_map(gem.buf.bo, 0);
> +       igt_assert (ret == 0);
> +
> +       gem_check_spin(spins);
> +       drm_intel_bo_unmap(gem.buf.bo);
> +
> +       check_full_enable(&stat);
> +}
> +
> +static void
> +exit_handler(int sig)
> +{
> +       gem_deinit();
> +       dbg_deinit();
> +}
> +
> +igt_main
> +{
> +       igt_fixture {
> +               igt_install_exit_handler(exit_handler);
> +
> +               dbg_init();
> +               gem_init();
> +       }
> +
> +       igt_subtest("full-enable")
> +               full_enable();
> +}
> --
> 2.3.0
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH i-g-t 1/2 v2] lib: Add media spin
  2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee
@ 2015-03-12 17:52   ` jeff.mcgee
  2015-03-25  2:50     ` He, Shuang
  0 siblings, 1 reply; 10+ messages in thread
From: jeff.mcgee @ 2015-03-12 17:52 UTC (permalink / raw)
  To: intel-gfx

From: Jeff McGee <jeff.mcgee@intel.com>

The media spin utility is derived from media fill. The purpose
is to create a simple means to keep the render engine (media
pipeline) busy for a controlled amount of time. It does so by
emitting a batch with a single execution thread that spins in
a tight loop the requested number of times. Each spin increments
a counter whose final 32-bit value is written to the destination
buffer on completion for checking. The implementation supports
Gen8, Gen8lp, and Gen9.

v2: Apply the recommendations of igt.cocci.

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
---
 lib/Makefile.sources    |   2 +
 lib/intel_batchbuffer.c |  24 +++
 lib/intel_batchbuffer.h |  22 ++
 lib/media_spin.c        | 540 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/media_spin.h        |  39 ++++
 5 files changed, 627 insertions(+)
 create mode 100644 lib/media_spin.c
 create mode 100644 lib/media_spin.h

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 76f353a..3d93629 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = 	\
 	media_fill_gen8.c       \
 	media_fill_gen8lp.c     \
 	media_fill_gen9.c       \
+	media_spin.h		\
+	media_spin.c		\
 	gen7_media.h            \
 	gen8_media.h            \
 	rendercopy_i915.c	\
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index 666c323..195ccc4 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -40,6 +40,7 @@
 #include "rendercopy.h"
 #include "media_fill.h"
 #include "ioctl_wrappers.h"
+#include "media_spin.h"
 
 #include <i915_drm.h>
 
@@ -785,3 +786,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid)
 
 	return fill;
 }
+
+/**
+ * igt_get_media_spinfunc:
+ * @devid: pci device id
+ *
+ * Returns:
+ *
+ * The platform-specific media spin function pointer for the device specified
+ * with @devid. Will return NULL when no media spin function is implemented.
+ */
+igt_media_spinfunc_t igt_get_media_spinfunc(int devid)
+{
+	igt_media_spinfunc_t spin = NULL;
+
+	if (IS_GEN9(devid))
+		spin = gen9_media_spinfunc;
+	else if (IS_BROADWELL(devid))
+		spin = gen8_media_spinfunc;
+	else if (IS_CHERRYVIEW(devid))
+		spin = gen8lp_media_spinfunc;
+
+	return spin;
+}
diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
index fa8875b..62c8396 100644
--- a/lib/intel_batchbuffer.h
+++ b/lib/intel_batchbuffer.h
@@ -300,4 +300,26 @@ typedef void (*igt_fillfunc_t)(struct intel_batchbuffer *batch,
 igt_fillfunc_t igt_get_media_fillfunc(int devid);
 igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid);
 
+/**
+ * igt_media_spinfunc_t:
+ * @batch: batchbuffer object
+ * @dst: destination i-g-t buffer object
+ * @spins: number of loops to execute
+ *
+ * This is the type of the per-platform media spin functions. The
+ * platform-specific implementation can be obtained by calling
+ * igt_get_media_spinfunc().
+ *
+ * The media spin function emits a batchbuffer for the render engine with
+ * the media pipeline selected. The workload consists of a single thread
+ * which spins in a tight loop the requested number of times. Each spin
+ * increments a counter whose final 32-bit value is written to the
+ * destination buffer on completion. This utility provides a simple way
+ * to keep the render engine busy for a set time for various tests.
+ */
+typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch,
+				     struct igt_buf *dst, uint32_t spins);
+
+igt_media_spinfunc_t igt_get_media_spinfunc(int devid);
+
 #endif
diff --git a/lib/media_spin.c b/lib/media_spin.c
new file mode 100644
index 0000000..580c109
--- /dev/null
+++ b/lib/media_spin.c
@@ -0,0 +1,540 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ * 	Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#include <intel_bufmgr.h>
+#include <i915_drm.h>
+#include "intel_reg.h"
+#include "drmtest.h"
+#include "intel_batchbuffer.h"
+#include "gen8_media.h"
+#include "media_spin.h"
+
+static const uint32_t spin_kernel[][4] = {
+	{ 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov (8)r4.0<1>:ud r0.0<8;8;1>:ud */
+	{ 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov (2)r4.0<1>.ud r2.0<2;2;1>:ud */
+	{ 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov (1)r4.8<1>:ud 0x3 */
+	{ 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov (1)r5.0<1>:ud 0 */
+	{ 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */
+	{ 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 (1)null<1> r1<0;1;0> r5<0;1;0> */
+	{ 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) -32 */
+	{ 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 (16)null<1> r4.0<0;1;0> 0x040a8000 */
+	{ 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov (8)r112<1>:ud r0.0<8;8;1>:ud */
+	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts (16)null<1> r112<0;1;0>:d 0x82000010 */
+};
+
+static uint32_t
+batch_used(struct intel_batchbuffer *batch)
+{
+	return batch->ptr - batch->buffer;
+}
+
+static uint32_t
+batch_align(struct intel_batchbuffer *batch, uint32_t align)
+{
+	uint32_t offset = batch_used(batch);
+	offset = ALIGN(offset, align);
+	batch->ptr = batch->buffer + offset;
+	return offset;
+}
+
+static void *
+batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
+{
+	uint32_t offset = batch_align(batch, align);
+	batch->ptr += size;
+	return memset(batch->buffer + offset, 0, size);
+}
+
+static uint32_t
+batch_offset(struct intel_batchbuffer *batch, void *ptr)
+{
+	return (uint8_t *)ptr - batch->buffer;
+}
+
+static uint32_t
+batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size,
+	   uint32_t align)
+{
+	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
+}
+
+static void
+gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
+{
+	int ret;
+
+	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
+	if (ret == 0)
+		ret = drm_intel_gem_bo_context_exec(batch->bo, NULL,
+						    batch_end, 0);
+	igt_assert_eq(ret, 0);
+}
+
+static uint32_t
+gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch,
+			    uint32_t iters)
+{
+	uint32_t *curbe_buffer;
+	uint32_t offset;
+
+	curbe_buffer = batch_alloc(batch, 64, 64);
+	offset = batch_offset(batch, curbe_buffer);
+	*curbe_buffer = iters;
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_surface_state(struct intel_batchbuffer *batch,
+			struct igt_buf *buf,
+			uint32_t format,
+			int is_dst)
+{
+	struct gen8_surface_state *ss;
+	uint32_t write_domain, read_domain, offset;
+	int ret;
+
+	if (is_dst) {
+		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
+	} else {
+		write_domain = 0;
+		read_domain = I915_GEM_DOMAIN_SAMPLER;
+	}
+
+	ss = batch_alloc(batch, sizeof(*ss), 64);
+	offset = batch_offset(batch, ss);
+
+	ss->ss0.surface_type = GEN8_SURFACE_2D;
+	ss->ss0.surface_format = format;
+	ss->ss0.render_cache_read_write = 1;
+	ss->ss0.vertical_alignment = 1; /* align 4 */
+	ss->ss0.horizontal_alignment = 1; /* align 4 */
+
+	if (buf->tiling == I915_TILING_X)
+		ss->ss0.tiled_mode = 2;
+	else if (buf->tiling == I915_TILING_Y)
+		ss->ss0.tiled_mode = 3;
+
+	ss->ss8.base_addr = buf->bo->offset;
+
+	ret = drm_intel_bo_emit_reloc(batch->bo,
+				batch_offset(batch, ss) + 8 * 4,
+				buf->bo, 0,
+				read_domain, write_domain);
+	igt_assert_eq(ret, 0);
+
+	ss->ss2.height = igt_buf_height(buf) - 1;
+	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss3.pitch  = buf->stride - 1;
+
+	ss->ss7.shader_chanel_select_r = 4;
+	ss->ss7.shader_chanel_select_g = 5;
+	ss->ss7.shader_chanel_select_b = 6;
+	ss->ss7.shader_chanel_select_a = 7;
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_binding_table(struct intel_batchbuffer *batch,
+			struct igt_buf *dst)
+{
+	uint32_t *binding_table, offset;
+
+	binding_table = batch_alloc(batch, 32, 64);
+	offset = batch_offset(batch, binding_table);
+
+	binding_table[0] = gen8_spin_surface_state(batch, dst,
+					GEN8_SURFACEFORMAT_R8_UNORM, 1);
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_media_kernel(struct intel_batchbuffer *batch,
+		       const uint32_t kernel[][4],
+		       size_t size)
+{
+	uint32_t offset;
+
+	offset = batch_copy(batch, kernel, size, 64);
+
+	return offset;
+}
+
+static uint32_t
+gen8_spin_interface_descriptor(struct intel_batchbuffer *batch,
+			       struct igt_buf *dst)
+{
+	struct gen8_interface_descriptor_data *idd;
+	uint32_t offset;
+	uint32_t binding_table_offset, kernel_offset;
+
+	binding_table_offset = gen8_spin_binding_table(batch, dst);
+	kernel_offset = gen8_spin_media_kernel(batch, spin_kernel,
+					       sizeof(spin_kernel));
+
+	idd = batch_alloc(batch, sizeof(*idd), 64);
+	offset = batch_offset(batch, idd);
+
+	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
+
+	idd->desc2.single_program_flow = 1;
+	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
+
+	idd->desc3.sampler_count = 0;      /* 0 samplers used */
+	idd->desc3.sampler_state_pointer = 0;
+
+	idd->desc4.binding_table_entry_count = 0;
+	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
+
+	idd->desc5.constant_urb_entry_read_offset = 0;
+	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
+
+	return offset;
+}
+
+static void
+gen8_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
+
+	/* general */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
+		0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general state buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* dynamic state buffer size */
+	OUT_BATCH(1 << 12 | 1);
+	/* indirect object buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* instruction buffer size; the modify-enable bit must be set, otherwise the GPU may hang */
+	OUT_BATCH(1 << 12 | 1);
+}
+
+static void
+gen9_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
+
+	/* general */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
+		0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general state buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* dynamic state buffer size */
+	OUT_BATCH(1 << 12 | 1);
+	/* indirect object buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* instruction buffer size; the modify-enable bit must be set, otherwise the GPU may hang */
+	OUT_BATCH(1 << 12 | 1);
+
+	/* Bindless surface state base address */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	OUT_BATCH(0xfffff000);
+}
+
+static void
+gen8_emit_vfe_state(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
+
+	/* scratch buffer */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* number of threads & urb entries */
+	OUT_BATCH(2 << 8);
+
+	OUT_BATCH(0);
+
+	/* urb entry size & curbe size */
+	OUT_BATCH(2 << 16 |
+		2);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+static void
+gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
+{
+	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* curbe total data length */
+	OUT_BATCH(64);
+	/* curbe data start address, is relative to the dynamics base address */
+	OUT_BATCH(curbe_buffer);
+}
+
+static void
+gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch,
+				    uint32_t interface_descriptor)
+{
+	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* interface descriptor data length */
+	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
+	/* interface descriptor address, is relative to the dynamics base address */
+	OUT_BATCH(interface_descriptor);
+}
+
+static void
+gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
+	OUT_BATCH(0);
+}
+
+static void
+gen8_emit_media_objects(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
+
+	/* interface descriptor offset */
+	OUT_BATCH(0);
+
+	/* without indirect data */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* inline data (xoffset, yoffset) */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	gen8_emit_media_state_flush(batch);
+}
+
+static void
+gen8lp_emit_media_objects(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
+
+	/* interface descriptor offset */
+	OUT_BATCH(0);
+
+	/* without indirect data */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* inline data (xoffset, yoffset) */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+/*
+ * This sets up the media pipeline,
+ *
+ * +---------------+ <---- 4096
+ * |       ^       |
+ * |       |       |
+ * |    various    |
+ * |      state    |
+ * |       |       |
+ * |_______|_______| <---- 2048 + ?
+ * |       ^       |
+ * |       |       |
+ * |   batch       |
+ * |    commands   |
+ * |       |       |
+ * |       |       |
+ * +---------------+ <---- 0 + ?
+ *
+ */
+
+#define BATCH_STATE_SPLIT 2048
+
+void
+gen8_media_spinfunc(struct intel_batchbuffer *batch,
+		    struct igt_buf *dst, uint32_t spins)
+{
+	uint32_t curbe_buffer, interface_descriptor;
+	uint32_t batch_end;
+
+	intel_batchbuffer_flush_with_context(batch, NULL);
+
+	/* setup states */
+	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+
+	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
+	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
+	igt_assert(batch->ptr < &batch->buffer[4095]);
+
+	/* media pipeline */
+	batch->ptr = batch->buffer;
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
+	gen8_emit_state_base_address(batch);
+
+	gen8_emit_vfe_state(batch);
+
+	gen8_emit_curbe_load(batch, curbe_buffer);
+
+	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+
+	gen8_emit_media_objects(batch);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	batch_end = batch_align(batch, 8);
+	igt_assert(batch_end < BATCH_STATE_SPLIT);
+
+	gen8_render_flush(batch, batch_end);
+	intel_batchbuffer_reset(batch);
+}
+
+void
+gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
+		      struct igt_buf *dst, uint32_t spins)
+{
+	uint32_t curbe_buffer, interface_descriptor;
+	uint32_t batch_end;
+
+	intel_batchbuffer_flush_with_context(batch, NULL);
+
+	/* setup states */
+	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+
+	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
+	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
+	igt_assert(batch->ptr < &batch->buffer[4095]);
+
+	/* media pipeline */
+	batch->ptr = batch->buffer;
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
+	gen8_emit_state_base_address(batch);
+
+	gen8_emit_vfe_state(batch);
+
+	gen8_emit_curbe_load(batch, curbe_buffer);
+
+	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+
+	gen8lp_emit_media_objects(batch);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	batch_end = batch_align(batch, 8);
+	igt_assert(batch_end < BATCH_STATE_SPLIT);
+
+	gen8_render_flush(batch, batch_end);
+	intel_batchbuffer_reset(batch);
+}
+
+void
+gen9_media_spinfunc(struct intel_batchbuffer *batch,
+		    struct igt_buf *dst, uint32_t spins)
+{
+	uint32_t curbe_buffer, interface_descriptor;
+	uint32_t batch_end;
+
+	intel_batchbuffer_flush_with_context(batch, NULL);
+
+	/* setup states */
+	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+
+	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
+	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
+	igt_assert(batch->ptr < &batch->buffer[4095]);
+
+	/* media pipeline */
+	batch->ptr = batch->buffer;
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
+			GEN9_FORCE_MEDIA_AWAKE_ENABLE |
+			GEN9_SAMPLER_DOP_GATE_DISABLE |
+			GEN9_PIPELINE_SELECTION_MASK |
+			GEN9_SAMPLER_DOP_GATE_MASK |
+			GEN9_FORCE_MEDIA_AWAKE_MASK);
+	gen9_emit_state_base_address(batch);
+
+	gen8_emit_vfe_state(batch);
+
+	gen8_emit_curbe_load(batch, curbe_buffer);
+
+	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+
+	gen8_emit_media_objects(batch);
+
+	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
+			GEN9_FORCE_MEDIA_AWAKE_DISABLE |
+			GEN9_SAMPLER_DOP_GATE_ENABLE |
+			GEN9_PIPELINE_SELECTION_MASK |
+			GEN9_SAMPLER_DOP_GATE_MASK |
+			GEN9_FORCE_MEDIA_AWAKE_MASK);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+
+	batch_end = batch_align(batch, 8);
+	igt_assert(batch_end < BATCH_STATE_SPLIT);
+
+	gen8_render_flush(batch, batch_end);
+	intel_batchbuffer_reset(batch);
+}
diff --git a/lib/media_spin.h b/lib/media_spin.h
new file mode 100644
index 0000000..8bc4829
--- /dev/null
+++ b/lib/media_spin.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ * 	Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#ifndef MEDIA_SPIN_H
+#define MEDIA_SPIN_H
+
+void gen8_media_spinfunc(struct intel_batchbuffer *batch,
+			 struct igt_buf *dst, uint32_t spins);
+
+void gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
+			   struct igt_buf *dst, uint32_t spins);
+
+void gen9_media_spinfunc(struct intel_batchbuffer *batch,
+			 struct igt_buf *dst, uint32_t spins);
+
+#endif /* MEDIA_SPIN_H */
-- 
2.3.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH i-g-t 2/2 v2] tests/pm_sseu: Create new test pm_sseu
  2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee
  2015-03-12 12:09   ` Thomas Wood
@ 2015-03-12 17:54   ` jeff.mcgee
  2015-03-24 23:20     ` [PATCH i-g-t 2/2 v3] " jeff.mcgee
  1 sibling, 1 reply; 10+ messages in thread
From: jeff.mcgee @ 2015-03-12 17:54 UTC (permalink / raw)
  To: intel-gfx

From: Jeff McGee <jeff.mcgee@intel.com>

New test pm_sseu is intended for any subtest related to the
slice/subslice/EU power gating feature. The sole initial subtest,
'full-enable', confirms that the slice/subslice/EU state is at
full enablement when the render engine is active. Starting with
Gen9 SKL, the render power gating feature can leave SSEU in a
partially enabled state upon resumption of render work unless
explicit action is taken.

v2: Add test description and apply recommendations of igt.cocci
    (Thomas Wood).

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
---
 tests/.gitignore       |   1 +
 tests/Makefile.sources |   1 +
 tests/pm_sseu.c        | 375 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 377 insertions(+)
 create mode 100644 tests/pm_sseu.c

diff --git a/tests/.gitignore b/tests/.gitignore
index 426cc67..a1ec1b5 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -143,6 +143,7 @@ pm_lpsp
 pm_rc6_residency
 pm_rpm
 pm_rps
+pm_sseu
 prime_nv_api
 prime_nv_pcopy
 prime_nv_test
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 51e8376..74106c0 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -82,6 +82,7 @@ TESTS_progs_M = \
 	pm_rpm \
 	pm_rps \
 	pm_rc6_residency \
+	pm_sseu \
 	prime_self_import \
 	template \
 	$(NULL)
diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c
new file mode 100644
index 0000000..7196dcb
--- /dev/null
+++ b/tests/pm_sseu.c
@@ -0,0 +1,375 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <time.h>
+#include "drmtest.h"
+#include "i915_drm.h"
+#include "intel_io.h"
+#include "intel_bufmgr.h"
+#include "intel_batchbuffer.h"
+#include "intel_chipset.h"
+#include "ioctl_wrappers.h"
+#include "igt_debugfs.h"
+#include "media_spin.h"
+
+IGT_TEST_DESCRIPTION("Tests slice/subslice/EU power gating functionality.\n");
+
+static double
+to_dt(const struct timespec *start, const struct timespec *end)
+{
+	double dt;
+
+	dt = (end->tv_sec - start->tv_sec) * 1e3;
+	dt += (end->tv_nsec - start->tv_nsec) * 1e-6;
+
+	return dt;
+}
+
+struct status {
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+		bool has_slice_pg;
+		bool has_subslice_pg;
+		bool has_eu_pg;
+	} info;
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+	} hw;
+};
+
+#define DBG_STATUS_BUF_SIZE 4096
+
+struct {
+	int init;
+	int status_fd;
+	char status_buf[DBG_STATUS_BUF_SIZE];
+} dbg;
+
+static void
+dbg_get_status_section(const char *title, char **first, char **last)
+{
+	char *pos;
+
+	*first = strstr(dbg.status_buf, title);
+	igt_assert(*first != NULL);
+
+	pos = *first;
+	do {
+		pos = strchr(pos, '\n');
+		igt_assert(pos != NULL);
+		pos++;
+	} while (*pos == ' '); /* lines in the section begin with a space */
+	*last = pos - 1;
+}
+
+static int
+dbg_get_int(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	return strtol(pos, &pos, 10);
+}
+
+static bool
+dbg_get_bool(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	if (*pos == 'y')
+		return true;
+	if (*pos == 'n')
+		return false;
+
+	igt_assert_f(false, "Could not read boolean value for %s.\n", name);
+	return false;
+}
+
+static void
+dbg_get_status(struct status *stat)
+{
+	char *first, *last;
+	int nread;
+
+	lseek(dbg.status_fd, 0, SEEK_SET);
+	nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE);
+	igt_assert_lt(nread, DBG_STATUS_BUF_SIZE);
+	dbg.status_buf[nread] = '\0';
+
+	memset(stat, 0, sizeof(*stat));
+
+	dbg_get_status_section("SSEU Device Info", &first, &last);
+	stat->info.slice_total =
+		dbg_get_int(first, last, "Available Slice Total:");
+	stat->info.subslice_total =
+		dbg_get_int(first, last, "Available Subslice Total:");
+	stat->info.subslice_per =
+		dbg_get_int(first, last, "Available Subslice Per Slice:");
+	stat->info.eu_total =
+		dbg_get_int(first, last, "Available EU Total:");
+	stat->info.eu_per =
+		dbg_get_int(first, last, "Available EU Per Subslice:");
+	stat->info.has_slice_pg =
+		dbg_get_bool(first, last, "Has Slice Power Gating:");
+	stat->info.has_subslice_pg =
+		dbg_get_bool(first, last, "Has Subslice Power Gating:");
+	stat->info.has_eu_pg =
+		dbg_get_bool(first, last, "Has EU Power Gating:");
+
+	dbg_get_status_section("SSEU Device Status", &first, &last);
+	stat->hw.slice_total =
+		dbg_get_int(first, last, "Enabled Slice Total:");
+	stat->hw.subslice_total =
+		dbg_get_int(first, last, "Enabled Subslice Total:");
+	stat->hw.subslice_per =
+		dbg_get_int(first, last, "Enabled Subslice Per Slice:");
+	stat->hw.eu_total =
+		dbg_get_int(first, last, "Enabled EU Total:");
+	stat->hw.eu_per =
+		dbg_get_int(first, last, "Enabled EU Per Subslice:");
+}
+
+static void
+dbg_init(void)
+{
+	dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY);
+	igt_assert_neq(dbg.status_fd, -1);
+	dbg.init = 1;
+}
+
+static void
+dbg_deinit(void)
+{
+	switch (dbg.init)
+	{
+	case 1:
+		close(dbg.status_fd);
+	}
+}
+
+struct {
+	int init;
+	int drm_fd;
+	int devid;
+	int gen;
+	int has_ppgtt;
+	drm_intel_bufmgr *bufmgr;
+	struct intel_batchbuffer *batch;
+	igt_media_spinfunc_t spinfunc;
+	struct igt_buf buf;
+	uint32_t spins_per_msec;
+} gem;
+
+static void
+gem_check_spin(uint32_t spins)
+{
+	uint32_t *data;
+
+	data = (uint32_t*)gem.buf.bo->virtual;
+	igt_assert_eq_u32(*data, spins);
+}
+
+static uint32_t
+gem_get_target_spins(double dt)
+{
+	struct timespec tstart, tdone;
+	double prev_dt, cur_dt;
+	uint32_t spins;
+	int i, ret;
+
+	/* Double increments until we bound the target time */
+	prev_dt = 0.0;
+	for (i = 0; i < 32; i++) {
+		spins = 1 << i;
+		clock_gettime(CLOCK_MONOTONIC, &tstart);
+
+		gem.spinfunc(gem.batch, &gem.buf, spins);
+		ret = drm_intel_bo_map(gem.buf.bo, 0);
+		igt_assert_eq(ret, 0);
+		clock_gettime(CLOCK_MONOTONIC, &tdone);
+
+		gem_check_spin(spins);
+		drm_intel_bo_unmap(gem.buf.bo);
+
+		cur_dt = to_dt(&tstart, &tdone);
+		if (cur_dt > dt)
+			break;
+		prev_dt = cur_dt;
+	}
+	igt_assert_neq(i, 32);
+
+	/* Linearly interpolate between i and i-1 to get target increments */
+	spins = 1 << (i-1); /* lower bound spins */
+	spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */
+
+	return spins;
+}
+
+static void
+gem_init(void)
+{
+	gem.drm_fd = drm_open_any();
+	gem.init = 1;
+
+	gem.devid = intel_get_drm_devid(gem.drm_fd);
+	gem.gen = intel_gen(gem.devid);
+	gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd);
+
+	gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096);
+	igt_assert(gem.bufmgr);
+	gem.init = 2;
+
+	drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr);
+
+	gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid);
+	igt_assert(gem.batch);
+	gem.init = 3;
+
+	gem.spinfunc = igt_get_media_spinfunc(gem.devid);
+	igt_assert(gem.spinfunc);
+
+	gem.buf.stride = sizeof(uint32_t);
+	gem.buf.tiling = I915_TILING_NONE;
+	gem.buf.size = gem.buf.stride;
+	gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096);
+	igt_assert(gem.buf.bo);
+	gem.init = 4;
+
+	gem.spins_per_msec = gem_get_target_spins(100) / 100;
+}
+
+static void
+gem_deinit(void)
+{
+	switch (gem.init)
+	{
+	case 4:
+		drm_intel_bo_unmap(gem.buf.bo);
+		drm_intel_bo_unreference(gem.buf.bo);
+	case 3:
+		intel_batchbuffer_free(gem.batch);
+	case 2:
+		drm_intel_bufmgr_destroy(gem.bufmgr);
+	case 1:
+		close(gem.drm_fd);
+	}
+}
+
+static void
+check_full_enable(struct status *stat)
+{
+	igt_assert_eq(stat->hw.slice_total, stat->info.slice_total);
+	igt_assert_eq(stat->hw.subslice_total, stat->info.subslice_total);
+	igt_assert_eq(stat->hw.subslice_per, stat->info.subslice_per);
+
+	/*
+	 * EU are powered in pairs, but it is possible for one EU in the pair
+	 * to be non-functional due to fusing. The determination of enabled
+	 * EU does not account for this and can therefore actually exceed the
+	 * available count. Allow for this small discrepancy in our
+	 * comparison.
+	*/
+	igt_assert_lte(stat->info.eu_total, stat->hw.eu_total);
+	igt_assert_lte(stat->info.eu_per, stat->hw.eu_per);
+}
+
+static void
+full_enable(void)
+{
+	struct status stat;
+	const int spin_msec = 10;
+	int ret, spins;
+
+	/* Simulation doesn't currently model slice/subslice/EU power gating. */
+	igt_skip_on_simulation();
+
+	/*
+	 * Gen9 SKL is the first case in which render power gating can leave
+	 * slice/subslice/EU in a partially enabled state upon resumption of
+	 * render work. So start checking that this is prevented as of Gen9.
+	*/
+	igt_require(gem.gen >= 9);
+
+	spins = spin_msec * gem.spins_per_msec;
+
+	gem.spinfunc(gem.batch, &gem.buf, spins);
+
+	usleep(2000); /* 2ms wait to make sure batch is running */
+	dbg_get_status(&stat);
+
+	ret = drm_intel_bo_map(gem.buf.bo, 0);
+	igt_assert_eq(ret, 0);
+
+	gem_check_spin(spins);
+	drm_intel_bo_unmap(gem.buf.bo);
+
+	check_full_enable(&stat);
+}
+
+static void
+exit_handler(int sig)
+{
+	gem_deinit();
+	dbg_deinit();
+}
+
+igt_main
+{
+	igt_fixture {
+		igt_install_exit_handler(exit_handler);
+
+		dbg_init();
+		gem_init();
+	}
+
+	igt_subtest("full-enable")
+		full_enable();
+}
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu
  2015-03-12 12:09   ` Thomas Wood
@ 2015-03-18 16:51     ` Jeff McGee
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff McGee @ 2015-03-18 16:51 UTC (permalink / raw)
  To: Thomas Wood; +Cc: Intel Graphics Development

On Thu, Mar 12, 2015 at 12:09:50PM +0000, Thomas Wood wrote:
> On 10 March 2015 at 21:17,  <jeff.mcgee@intel.com> wrote:
> > From: Jeff McGee <jeff.mcgee@intel.com>
> >
> > New test pm_sseu is intended for any subtest related to the
> > slice/subslice/EU power gating feature. The sole initial subtest,
> > 'full-enable', confirms that the slice/subslice/EU state is at
> > full enablement when the render engine is active. Starting with
> > Gen9 SKL, the render power gating feature can leave SSEU in a
> > partially enabled state upon resumption of render work unless
> > explicit action is taken.
> 
> Please add a short description to the test using the
> IGT_TEST_DESCRIPTION macro, so that it is included in the
> documentation and help output.
> 

Hi Thomas. I have posted v2 patches to address this and your other comments.
Can you please have a second look? Thanks
-Jeff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH i-g-t 2/2 v3] tests/pm_sseu: Create new test pm_sseu
  2015-03-12 17:54   ` [PATCH i-g-t 2/2 v2] " jeff.mcgee
@ 2015-03-24 23:20     ` jeff.mcgee
  0 siblings, 0 replies; 10+ messages in thread
From: jeff.mcgee @ 2015-03-24 23:20 UTC (permalink / raw)
  To: intel-gfx

From: Jeff McGee <jeff.mcgee@intel.com>

New test pm_sseu is intended for any subtest related to the
slice/subslice/EU power gating feature. The sole initial subtest,
'full-enable', confirms that the slice/subslice/EU state is at
full enablement when the render engine is active. Starting with
Gen9 SKL, the render power gating feature can leave SSEU in a
partially enabled state upon resumption of render work unless
explicit action is taken.

v2: Add test description and apply recommendations of igt.cocci
    (Thomas Wood).
v3: Skip instead of fail if debugfs entry i915_sseu_status is not
    available.

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
---
 tests/.gitignore       |   1 +
 tests/Makefile.sources |   1 +
 tests/pm_sseu.c        | 376 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 378 insertions(+)
 create mode 100644 tests/pm_sseu.c

diff --git a/tests/.gitignore b/tests/.gitignore
index 402e062..35b3289 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -146,6 +146,7 @@ pm_lpsp
 pm_rc6_residency
 pm_rpm
 pm_rps
+pm_sseu
 prime_nv_api
 prime_nv_pcopy
 prime_nv_test
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index a165978..798cb75 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -86,6 +86,7 @@ TESTS_progs_M = \
 	pm_rpm \
 	pm_rps \
 	pm_rc6_residency \
+	pm_sseu \
 	prime_self_import \
 	template \
 	$(NULL)
diff --git a/tests/pm_sseu.c b/tests/pm_sseu.c
new file mode 100644
index 0000000..34465db
--- /dev/null
+++ b/tests/pm_sseu.c
@@ -0,0 +1,376 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Jeff McGee <jeff.mcgee@intel.com>
+ */
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <time.h>
+#include "drmtest.h"
+#include "i915_drm.h"
+#include "intel_io.h"
+#include "intel_bufmgr.h"
+#include "intel_batchbuffer.h"
+#include "intel_chipset.h"
+#include "ioctl_wrappers.h"
+#include "igt_debugfs.h"
+#include "media_spin.h"
+
+IGT_TEST_DESCRIPTION("Tests slice/subslice/EU power gating functionality.\n");
+
+static double
+to_dt(const struct timespec *start, const struct timespec *end)
+{
+	double dt;
+
+	dt = (end->tv_sec - start->tv_sec) * 1e3;
+	dt += (end->tv_nsec - start->tv_nsec) * 1e-6;
+
+	return dt;
+}
+
+struct status {
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+		bool has_slice_pg;
+		bool has_subslice_pg;
+		bool has_eu_pg;
+	} info;
+	struct {
+		int slice_total;
+		int subslice_total;
+		int subslice_per;
+		int eu_total;
+		int eu_per;
+	} hw;
+};
+
+#define DBG_STATUS_BUF_SIZE 4096
+
+struct {
+	int init;
+	int status_fd;
+	char status_buf[DBG_STATUS_BUF_SIZE];
+} dbg;
+
+static void
+dbg_get_status_section(const char *title, char **first, char **last)
+{
+	char *pos;
+
+	*first = strstr(dbg.status_buf, title);
+	igt_assert(*first != NULL);
+
+	pos = *first;
+	do {
+		pos = strchr(pos, '\n');
+		igt_assert(pos != NULL);
+		pos++;
+	} while (*pos == ' '); /* lines in the section begin with a space */
+	*last = pos - 1;
+}
+
+static int
+dbg_get_int(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	return strtol(pos, &pos, 10);
+}
+
+static bool
+dbg_get_bool(const char *first, const char *last, const char *name)
+{
+	char *pos;
+
+	pos = strstr(first, name);
+	igt_assert(pos != NULL);
+	pos = strstr(pos, ":");
+	igt_assert(pos != NULL);
+	pos += 2;
+	igt_assert(pos < last);
+
+	if (*pos == 'y')
+		return true;
+	if (*pos == 'n')
+		return false;
+
+	igt_assert_f(false, "Could not read boolean value for %s.\n", name);
+	return false;
+}
+
+static void
+dbg_get_status(struct status *stat)
+{
+	char *first, *last;
+	int nread;
+
+	lseek(dbg.status_fd, 0, SEEK_SET);
+	nread = read(dbg.status_fd, dbg.status_buf, DBG_STATUS_BUF_SIZE);
+	igt_assert_lt(nread, DBG_STATUS_BUF_SIZE);
+	dbg.status_buf[nread] = '\0';
+
+	memset(stat, 0, sizeof(*stat));
+
+	dbg_get_status_section("SSEU Device Info", &first, &last);
+	stat->info.slice_total =
+		dbg_get_int(first, last, "Available Slice Total:");
+	stat->info.subslice_total =
+		dbg_get_int(first, last, "Available Subslice Total:");
+	stat->info.subslice_per =
+		dbg_get_int(first, last, "Available Subslice Per Slice:");
+	stat->info.eu_total =
+		dbg_get_int(first, last, "Available EU Total:");
+	stat->info.eu_per =
+		dbg_get_int(first, last, "Available EU Per Subslice:");
+	stat->info.has_slice_pg =
+		dbg_get_bool(first, last, "Has Slice Power Gating:");
+	stat->info.has_subslice_pg =
+		dbg_get_bool(first, last, "Has Subslice Power Gating:");
+	stat->info.has_eu_pg =
+		dbg_get_bool(first, last, "Has EU Power Gating:");
+
+	dbg_get_status_section("SSEU Device Status", &first, &last);
+	stat->hw.slice_total =
+		dbg_get_int(first, last, "Enabled Slice Total:");
+	stat->hw.subslice_total =
+		dbg_get_int(first, last, "Enabled Subslice Total:");
+	stat->hw.subslice_per =
+		dbg_get_int(first, last, "Enabled Subslice Per Slice:");
+	stat->hw.eu_total =
+		dbg_get_int(first, last, "Enabled EU Total:");
+	stat->hw.eu_per =
+		dbg_get_int(first, last, "Enabled EU Per Subslice:");
+}
+
+static void
+dbg_init(void)
+{
+	dbg.status_fd = igt_debugfs_open("i915_sseu_status", O_RDONLY);
+	igt_skip_on_f(dbg.status_fd == -1,
+		      "debugfs entry 'i915_sseu_status' not found\n");
+	dbg.init = 1;
+}
+
+static void
+dbg_deinit(void)
+{
+	switch (dbg.init)
+	{
+	case 1:
+		close(dbg.status_fd);
+	}
+}
+
+struct {
+	int init;
+	int drm_fd;
+	int devid;
+	int gen;
+	int has_ppgtt;
+	drm_intel_bufmgr *bufmgr;
+	struct intel_batchbuffer *batch;
+	igt_media_spinfunc_t spinfunc;
+	struct igt_buf buf;
+	uint32_t spins_per_msec;
+} gem;
+
+static void
+gem_check_spin(uint32_t spins)
+{
+	uint32_t *data;
+
+	data = (uint32_t*)gem.buf.bo->virtual;
+	igt_assert_eq_u32(*data, spins);
+}
+
+static uint32_t
+gem_get_target_spins(double dt)
+{
+	struct timespec tstart, tdone;
+	double prev_dt, cur_dt;
+	uint32_t spins;
+	int i, ret;
+
+	/* Double increments until we bound the target time */
+	prev_dt = 0.0;
+	for (i = 0; i < 32; i++) {
+		spins = 1 << i;
+		clock_gettime(CLOCK_MONOTONIC, &tstart);
+
+		gem.spinfunc(gem.batch, &gem.buf, spins);
+		ret = drm_intel_bo_map(gem.buf.bo, 0);
+		igt_assert_eq(ret, 0);
+		clock_gettime(CLOCK_MONOTONIC, &tdone);
+
+		gem_check_spin(spins);
+		drm_intel_bo_unmap(gem.buf.bo);
+
+		cur_dt = to_dt(&tstart, &tdone);
+		if (cur_dt > dt)
+			break;
+		prev_dt = cur_dt;
+	}
+	igt_assert_neq(i, 32);
+
+	/* Linearly interpolate between i and i-1 to get target increments */
+	spins = 1 << (i-1); /* lower bound spins */
+	spins += spins * (dt - prev_dt)/(cur_dt - prev_dt); /* target spins */
+
+	return spins;
+}
+
+static void
+gem_init(void)
+{
+	gem.drm_fd = drm_open_any();
+	gem.init = 1;
+
+	gem.devid = intel_get_drm_devid(gem.drm_fd);
+	gem.gen = intel_gen(gem.devid);
+	gem.has_ppgtt = gem_uses_aliasing_ppgtt(gem.drm_fd);
+
+	gem.bufmgr = drm_intel_bufmgr_gem_init(gem.drm_fd, 4096);
+	igt_assert(gem.bufmgr);
+	gem.init = 2;
+
+	drm_intel_bufmgr_gem_enable_reuse(gem.bufmgr);
+
+	gem.batch = intel_batchbuffer_alloc(gem.bufmgr, gem.devid);
+	igt_assert(gem.batch);
+	gem.init = 3;
+
+	gem.spinfunc = igt_get_media_spinfunc(gem.devid);
+	igt_assert(gem.spinfunc);
+
+	gem.buf.stride = sizeof(uint32_t);
+	gem.buf.tiling = I915_TILING_NONE;
+	gem.buf.size = gem.buf.stride;
+	gem.buf.bo = drm_intel_bo_alloc(gem.bufmgr, "", gem.buf.size, 4096);
+	igt_assert(gem.buf.bo);
+	gem.init = 4;
+
+	gem.spins_per_msec = gem_get_target_spins(100) / 100;
+}
+
+static void
+gem_deinit(void)
+{
+	switch (gem.init)
+	{
+	case 4:
+		drm_intel_bo_unmap(gem.buf.bo);
+		drm_intel_bo_unreference(gem.buf.bo);
+	case 3:
+		intel_batchbuffer_free(gem.batch);
+	case 2:
+		drm_intel_bufmgr_destroy(gem.bufmgr);
+	case 1:
+		close(gem.drm_fd);
+	}
+}
+
+static void
+check_full_enable(struct status *stat)
+{
+	igt_assert_eq(stat->hw.slice_total, stat->info.slice_total);
+	igt_assert_eq(stat->hw.subslice_total, stat->info.subslice_total);
+	igt_assert_eq(stat->hw.subslice_per, stat->info.subslice_per);
+
+	/*
+	 * EU are powered in pairs, but it is possible for one EU in the pair
+	 * to be non-functional due to fusing. The determination of enabled
+	 * EU does not account for this and can therefore actually exceed the
+	 * available count. Allow for this small discrepancy in our
+	 * comparison.
+	*/
+	igt_assert_lte(stat->info.eu_total, stat->hw.eu_total);
+	igt_assert_lte(stat->info.eu_per, stat->hw.eu_per);
+}
+
+static void
+full_enable(void)
+{
+	struct status stat;
+	const int spin_msec = 10;
+	int ret, spins;
+
+	/* Simulation doesn't currently model slice/subslice/EU power gating. */
+	igt_skip_on_simulation();
+
+	/*
+	 * Gen9 SKL is the first case in which render power gating can leave
+	 * slice/subslice/EU in a partially enabled state upon resumption of
+	 * render work. So start checking that this is prevented as of Gen9.
+	*/
+	igt_require(gem.gen >= 9);
+
+	spins = spin_msec * gem.spins_per_msec;
+
+	gem.spinfunc(gem.batch, &gem.buf, spins);
+
+	usleep(2000); /* 2ms wait to make sure batch is running */
+	dbg_get_status(&stat);
+
+	ret = drm_intel_bo_map(gem.buf.bo, 0);
+	igt_assert_eq(ret, 0);
+
+	gem_check_spin(spins);
+	drm_intel_bo_unmap(gem.buf.bo);
+
+	check_full_enable(&stat);
+}
+
+static void
+exit_handler(int sig)
+{
+	gem_deinit();
+	dbg_deinit();
+}
+
+igt_main
+{
+	igt_fixture {
+		igt_install_exit_handler(exit_handler);
+
+		dbg_init();
+		gem_init();
+	}
+
+	igt_subtest("full-enable")
+		full_enable();
+}
-- 
2.3.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH i-g-t 1/2 v2] lib: Add media spin
  2015-03-12 17:52   ` [PATCH i-g-t 1/2 v2] " jeff.mcgee
@ 2015-03-25  2:50     ` He, Shuang
  2015-03-25 18:07       ` Thomas Wood
  0 siblings, 1 reply; 10+ messages in thread
From: He, Shuang @ 2015-03-25  2:50 UTC (permalink / raw)
  To: Mcgee, Jeff, intel-gfx@lists.freedesktop.org; +Cc: Liu, Lei A

(He Shuang on behalf of Liu Lei)
Tested-by: Lei Liu <lei.a.liu@intel.com>

I-G-T test result:
./pm_sseu
IGT-Version: 1.9-g07be8fe (x86_64) (Linux: 4.0.0-rc3_drm-intel-nightly_c09a3b_20150310+ x86_64)
Subtest full-enable: SUCCESS (0.010s)

Manually test result:
SSEU Device Info
Available Slice Total: 1
Available Subslice Total: 3
Available Subslice Per Slice: 3
Available EU Total: 23
Available EU Per Subslice: 8
Has Slice Power Gating: no
Has Subslice Power Gating: no
Has EU Power Gating: yes
SSEU Device Status
Enabled Slice Total: 1
Enabled Subslice Total: 3
Enabled Subslice Per Slice: 3
Enabled EU Total: 24
Enabled EU Per Subslice: 8

EUs are enabled in pairs. Because one EU in a pair can be fused off, the reported enabled EU count can exceed the reported available count. The IGT test allows for this discrepancy and fails only if the enabled count is less than the available count, which can happen only when unwanted power gating is applied.

Best wishes
Liu,Lei

> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of jeff.mcgee@intel.com
> Sent: Friday, March 13, 2015 1:52 AM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH i-g-t 1/2 v2] lib: Add media spin
> 
> From: Jeff McGee <jeff.mcgee@intel.com>
> 
> The media spin utility is derived from media fill. The purpose
> is to create a simple means to keep the render engine (media
> pipeline) busy for a controlled amount of time. It does so by
> emitting a batch with a single execution thread that spins in
> a tight loop the requested number of times. Each spin increments
> a counter whose final 32-bit value is written to the destination
> buffer on completion for checking. The implementation supports
> Gen8, Gen8lp, and Gen9.
> 
> v2: Apply the recommendations of igt.cocci.
> 
> Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
> ---
>  lib/Makefile.sources    |   2 +
>  lib/intel_batchbuffer.c |  24 +++
>  lib/intel_batchbuffer.h |  22 ++
>  lib/media_spin.c        | 540
> ++++++++++++++++++++++++++++++++++++++++++++++++
>  lib/media_spin.h        |  39 ++++
>  5 files changed, 627 insertions(+)
>  create mode 100644 lib/media_spin.c
>  create mode 100644 lib/media_spin.h
> 
> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> index 76f353a..3d93629 100644
> --- a/lib/Makefile.sources
> +++ b/lib/Makefile.sources
> @@ -29,6 +29,8 @@ libintel_tools_la_SOURCES = 	\
>  	media_fill_gen8.c       \
>  	media_fill_gen8lp.c     \
>  	media_fill_gen9.c       \
> +	media_spin.h		\
> +	media_spin.c	\
>  	gen7_media.h            \
>  	gen8_media.h            \
>  	rendercopy_i915.c	\
> diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
> index 666c323..195ccc4 100644
> --- a/lib/intel_batchbuffer.c
> +++ b/lib/intel_batchbuffer.c
> @@ -40,6 +40,7 @@
>  #include "rendercopy.h"
>  #include "media_fill.h"
>  #include "ioctl_wrappers.h"
> +#include "media_spin.h"
> 
>  #include <i915_drm.h>
> 
> @@ -785,3 +786,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid)
> 
>  	return fill;
>  }
> +
> +/**
> + * igt_get_media_spinfunc:
> + * @devid: pci device id
> + *
> + * Returns:
> + *
> + * The platform-specific media spin function pointer for the device specified
> + * with @devid. Will return NULL when no media spin function is implemented.
> + */
> +igt_media_spinfunc_t igt_get_media_spinfunc(int devid)
> +{
> +	igt_media_spinfunc_t spin = NULL;
> +
> +	if (IS_GEN9(devid))
> +		spin = gen9_media_spinfunc;
> +	else if (IS_BROADWELL(devid))
> +		spin = gen8_media_spinfunc;
> +	else if (IS_CHERRYVIEW(devid))
> +		spin = gen8lp_media_spinfunc;
> +
> +	return spin;
> +}
> diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
> index fa8875b..62c8396 100644
> --- a/lib/intel_batchbuffer.h
> +++ b/lib/intel_batchbuffer.h
> @@ -300,4 +300,26 @@ typedef void (*igt_fillfunc_t)(struct intel_batchbuffer *batch,
>  igt_fillfunc_t igt_get_media_fillfunc(int devid);
>  igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid);
> 
> +/**
> + * igt_media_spinfunc_t:
> + * @batch: batchbuffer object
> + * @dst: destination i-g-t buffer object
> + * @spins: number of loops to execute
> + *
> + * This is the type of the per-platform media spin functions. The
> + * platform-specific implementation can be obtained by calling
> + * igt_get_media_spinfunc().
> + *
> + * The media spin function emits a batchbuffer for the render engine with
> + * the media pipeline selected. The workload consists of a single thread
> + * which spins in a tight loop the requested number of times. Each spin
> + * increments a counter whose final 32-bit value is written to the
> + * destination buffer on completion. This utility provides a simple way
> + * to keep the render engine busy for a set time for various tests.
> + */
> +typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch,
> +				     struct igt_buf *dst, uint32_t spins);
> +
> +igt_media_spinfunc_t igt_get_media_spinfunc(int devid);
> +
>  #endif
> diff --git a/lib/media_spin.c b/lib/media_spin.c
> new file mode 100644
> index 0000000..580c109
> --- /dev/null
> +++ b/lib/media_spin.c
> @@ -0,0 +1,540 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + * 	Jeff McGee <jeff.mcgee@intel.com>
> + */
> +
> +#include <intel_bufmgr.h>
> +#include <i915_drm.h>
> +#include "intel_reg.h"
> +#include "drmtest.h"
> +#include "intel_batchbuffer.h"
> +#include "gen8_media.h"
> +#include "media_spin.h"
> +
> +static const uint32_t spin_kernel[][4] = {
> +	{ 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov (8)r4.0<1>:ud r0.0<8;8;1>:ud */
> +	{ 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov (2)r4.0<1>.ud r2.0<2;2;1>:ud */
> +	{ 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov (1)r4.8<1>:ud 0x3 */
> +	{ 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov (1)r5.0<1>:ud 0 */
> +	{ 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */
> +	{ 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0 (1)null<1> r1<0;1;0> r5<0;1;0> */
> +	{ 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1) -32 */
> +	{ 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1 (16)null<1> r4.0<0;1;0> 0x040a8000 */
> +	{ 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov (8)r112<1>:ud r0.0<8;8;1>:ud */
> +	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts (16)null<1> r112<0;1;0>:d 0x82000010 */
> +};
> +
> +static uint32_t
> +batch_used(struct intel_batchbuffer *batch)
> +{
> +	return batch->ptr - batch->buffer;
> +}
> +
> +static uint32_t
> +batch_align(struct intel_batchbuffer *batch, uint32_t align)
> +{
> +	uint32_t offset = batch_used(batch);
> +	offset = ALIGN(offset, align);
> +	batch->ptr = batch->buffer + offset;
> +	return offset;
> +}
> +
> +static void *
> +batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
> +{
> +	uint32_t offset = batch_align(batch, align);
> +	batch->ptr += size;
> +	return memset(batch->buffer + offset, 0, size);
> +}
> +
> +static uint32_t
> +batch_offset(struct intel_batchbuffer *batch, void *ptr)
> +{
> +	return (uint8_t *)ptr - batch->buffer;
> +}
> +
> +static uint32_t
> +batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size,
> +	   uint32_t align)
> +{
> +	return batch_offset(batch, memcpy(batch_alloc(batch, size, align),
> +					  ptr, size));
> +}
> +
> +static void
> +gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
> +{
> +	int ret;
> +
> +	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
> +	if (ret == 0)
> +		ret = drm_intel_gem_bo_context_exec(batch->bo, NULL,
> +						    batch_end, 0);
> +	igt_assert_eq(ret, 0);
> +}
> +
> +static uint32_t
> +gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch,
> +			    uint32_t iters)
> +{
> +	uint32_t *curbe_buffer;
> +	uint32_t offset;
> +
> +	curbe_buffer = batch_alloc(batch, 64, 64);
> +	offset = batch_offset(batch, curbe_buffer);
> +	*curbe_buffer = iters;
> +
> +	return offset;
> +}
> +
> +static uint32_t
> +gen8_spin_surface_state(struct intel_batchbuffer *batch,
> +			struct igt_buf *buf,
> +			uint32_t format,
> +			int is_dst)
> +{
> +	struct gen8_surface_state *ss;
> +	uint32_t write_domain, read_domain, offset;
> +	int ret;
> +
> +	if (is_dst) {
> +		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
> +	} else {
> +		write_domain = 0;
> +		read_domain = I915_GEM_DOMAIN_SAMPLER;
> +	}
> +
> +	ss = batch_alloc(batch, sizeof(*ss), 64);
> +	offset = batch_offset(batch, ss);
> +
> +	ss->ss0.surface_type = GEN8_SURFACE_2D;
> +	ss->ss0.surface_format = format;
> +	ss->ss0.render_cache_read_write = 1;
> +	ss->ss0.vertical_alignment = 1; /* align 4 */
> +	ss->ss0.horizontal_alignment = 1; /* align 4 */
> +
> +	if (buf->tiling == I915_TILING_X)
> +		ss->ss0.tiled_mode = 2;
> +	else if (buf->tiling == I915_TILING_Y)
> +		ss->ss0.tiled_mode = 3;
> +
> +	ss->ss8.base_addr = buf->bo->offset;
> +
> +	ret = drm_intel_bo_emit_reloc(batch->bo,
> +				batch_offset(batch, ss) + 8 * 4,
> +				buf->bo, 0,
> +				read_domain, write_domain);
> +	igt_assert_eq(ret, 0);
> +
> +	ss->ss2.height = igt_buf_height(buf) - 1;
> +	ss->ss2.width  = igt_buf_width(buf) - 1;
> +	ss->ss3.pitch  = buf->stride - 1;
> +
> +	ss->ss7.shader_chanel_select_r = 4;
> +	ss->ss7.shader_chanel_select_g = 5;
> +	ss->ss7.shader_chanel_select_b = 6;
> +	ss->ss7.shader_chanel_select_a = 7;
> +
> +	return offset;
> +}
> +
> +static uint32_t
> +gen8_spin_binding_table(struct intel_batchbuffer *batch,
> +			struct igt_buf *dst)
> +{
> +	uint32_t *binding_table, offset;
> +
> +	binding_table = batch_alloc(batch, 32, 64);
> +	offset = batch_offset(batch, binding_table);
> +
> +	binding_table[0] = gen8_spin_surface_state(batch, dst,
> +					GEN8_SURFACEFORMAT_R8_UNORM, 1);
> +
> +	return offset;
> +}
> +
> +static uint32_t
> +gen8_spin_media_kernel(struct intel_batchbuffer *batch,
> +		       const uint32_t kernel[][4],
> +		       size_t size)
> +{
> +	uint32_t offset;
> +
> +	offset = batch_copy(batch, kernel, size, 64);
> +
> +	return offset;
> +}
> +
> +static uint32_t
> +gen8_spin_interface_descriptor(struct intel_batchbuffer *batch,
> +			       struct igt_buf *dst)
> +{
> +	struct gen8_interface_descriptor_data *idd;
> +	uint32_t offset;
> +	uint32_t binding_table_offset, kernel_offset;
> +
> +	binding_table_offset = gen8_spin_binding_table(batch, dst);
> +	kernel_offset = gen8_spin_media_kernel(batch, spin_kernel,
> +					       sizeof(spin_kernel));
> +
> +	idd = batch_alloc(batch, sizeof(*idd), 64);
> +	offset = batch_offset(batch, idd);
> +
> +	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
> +
> +	idd->desc2.single_program_flow = 1;
> +	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
> +
> +	idd->desc3.sampler_count = 0;      /* 0 samplers used */
> +	idd->desc3.sampler_state_pointer = 0;
> +
> +	idd->desc4.binding_table_entry_count = 0;
> +	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
> +
> +	idd->desc5.constant_urb_entry_read_offset = 0;
> +	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
> +
> +	return offset;
> +}
> +
> +static void
> +gen8_emit_state_base_address(struct intel_batchbuffer *batch)
> +{
> +	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
> +
> +	/* general */
> +	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
> +	OUT_BATCH(0);
> +
> +	/* stateless data port */
> +	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
> +
> +	/* surface */
> +	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
> +
> +	/* dynamic */
> +	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
> +		0, BASE_ADDRESS_MODIFY);
> +
> +	/* indirect */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* instruction */
> +	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
> +
> +	/* general state buffer size */
> +	OUT_BATCH(0xfffff000 | 1);
> +	/* dynamic state buffer size */
> +	OUT_BATCH(1 << 12 | 1);
> +	/* indirect object buffer size */
> +	OUT_BATCH(0xfffff000 | 1);
> +	/* instruction buffer size; the modify enable bit must be set,
> +	 * otherwise it may result in a GPU hang */
> +	OUT_BATCH(1 << 12 | 1);
> +}
> +
> +static void
> +gen9_emit_state_base_address(struct intel_batchbuffer *batch)
> +{
> +	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
> +
> +	/* general */
> +	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
> +	OUT_BATCH(0);
> +
> +	/* stateless data port */
> +	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
> +
> +	/* surface */
> +	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
> +
> +	/* dynamic */
> +	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
> +		0, BASE_ADDRESS_MODIFY);
> +
> +	/* indirect */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* instruction */
> +	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
> +
> +	/* general state buffer size */
> +	OUT_BATCH(0xfffff000 | 1);
> +	/* dynamic state buffer size */
> +	OUT_BATCH(1 << 12 | 1);
> +	/* indirect object buffer size */
> +	OUT_BATCH(0xfffff000 | 1);
> +	/* instruction buffer size; the modify enable bit must be set,
> +	 * otherwise it may result in a GPU hang */
> +	OUT_BATCH(1 << 12 | 1);
> +
> +	/* Bindless surface state base address */
> +	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
> +	OUT_BATCH(0);
> +	OUT_BATCH(0xfffff000);
> +}
> +
> +static void
> +gen8_emit_vfe_state(struct intel_batchbuffer *batch)
> +{
> +	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
> +
> +	/* scratch buffer */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* number of threads & urb entries */
> +	OUT_BATCH(2 << 8);
> +
> +	OUT_BATCH(0);
> +
> +	/* urb entry size & curbe size */
> +	OUT_BATCH(2 << 16 |
> +		2);
> +
> +	/* scoreboard */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +}
> +
> +static void
> +gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
> +{
> +	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
> +	OUT_BATCH(0);
> +	/* curbe total data length */
> +	OUT_BATCH(64);
> +	/* curbe data start address, relative to the dynamic state base address */
> +	OUT_BATCH(curbe_buffer);
> +}
> +
> +static void
> +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch,
> +				    uint32_t interface_descriptor)
> +{
> +	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
> +	OUT_BATCH(0);
> +	/* interface descriptor data length */
> +	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
> +	/* interface descriptor address, relative to the dynamic state base address */
> +	OUT_BATCH(interface_descriptor);
> +}
> +
> +static void
> +gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
> +{
> +	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
> +	OUT_BATCH(0);
> +}
> +
> +static void
> +gen8_emit_media_objects(struct intel_batchbuffer *batch)
> +{
> +	OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
> +
> +	/* interface descriptor offset */
> +	OUT_BATCH(0);
> +
> +	/* without indirect data */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* scoreboard */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* inline data (xoffset, yoffset) */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +	gen8_emit_media_state_flush(batch);
> +}
> +
> +static void
> +gen8lp_emit_media_objects(struct intel_batchbuffer *batch)
> +{
> +	OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
> +
> +	/* interface descriptor offset */
> +	OUT_BATCH(0);
> +
> +	/* without indirect data */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* scoreboard */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +
> +	/* inline data (xoffset, yoffset) */
> +	OUT_BATCH(0);
> +	OUT_BATCH(0);
> +}
> +
> +/*
> + * This sets up the media pipeline,
> + *
> + * +---------------+ <---- 4096
> + * |       ^       |
> + * |       |       |
> + * |    various    |
> + * |      state    |
> + * |       |       |
> + * |_______|_______| <---- 2048 + ?
> + * |       ^       |
> + * |       |       |
> + * |   batch       |
> + * |    commands   |
> + * |       |       |
> + * |       |       |
> + * +---------------+ <---- 0 + ?
> + *
> + */
> +
> +#define BATCH_STATE_SPLIT 2048
> +
> +void
> +gen8_media_spinfunc(struct intel_batchbuffer *batch,
> +		    struct igt_buf *dst, uint32_t spins)
> +{
> +	uint32_t curbe_buffer, interface_descriptor;
> +	uint32_t batch_end;
> +
> +	intel_batchbuffer_flush_with_context(batch, NULL);
> +
> +	/* setup states */
> +	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
> +
> +	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
> +	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
> +	igt_assert(batch->ptr < &batch->buffer[4095]);
> +
> +	/* media pipeline */
> +	batch->ptr = batch->buffer;
> +	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
> +	gen8_emit_state_base_address(batch);
> +
> +	gen8_emit_vfe_state(batch);
> +
> +	gen8_emit_curbe_load(batch, curbe_buffer);
> +
> +	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> +
> +	gen8_emit_media_objects(batch);
> +
> +	OUT_BATCH(MI_BATCH_BUFFER_END);
> +
> +	batch_end = batch_align(batch, 8);
> +	igt_assert(batch_end < BATCH_STATE_SPLIT);
> +
> +	gen8_render_flush(batch, batch_end);
> +	intel_batchbuffer_reset(batch);
> +}
> +
> +void
> +gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
> +		      struct igt_buf *dst, uint32_t spins)
> +{
> +	uint32_t curbe_buffer, interface_descriptor;
> +	uint32_t batch_end;
> +
> +	intel_batchbuffer_flush_with_context(batch, NULL);
> +
> +	/* setup states */
> +	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
> +
> +	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
> +	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
> +	igt_assert(batch->ptr < &batch->buffer[4095]);
> +
> +	/* media pipeline */
> +	batch->ptr = batch->buffer;
> +	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
> +	gen8_emit_state_base_address(batch);
> +
> +	gen8_emit_vfe_state(batch);
> +
> +	gen8_emit_curbe_load(batch, curbe_buffer);
> +
> +	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> +
> +	gen8lp_emit_media_objects(batch);
> +
> +	OUT_BATCH(MI_BATCH_BUFFER_END);
> +
> +	batch_end = batch_align(batch, 8);
> +	igt_assert(batch_end < BATCH_STATE_SPLIT);
> +
> +	gen8_render_flush(batch, batch_end);
> +	intel_batchbuffer_reset(batch);
> +}
> +
> +void
> +gen9_media_spinfunc(struct intel_batchbuffer *batch,
> +		    struct igt_buf *dst, uint32_t spins)
> +{
> +	uint32_t curbe_buffer, interface_descriptor;
> +	uint32_t batch_end;
> +
> +	intel_batchbuffer_flush_with_context(batch, NULL);
> +
> +	/* setup states */
> +	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
> +
> +	curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
> +	interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
> +	igt_assert(batch->ptr < &batch->buffer[4095]);
> +
> +	/* media pipeline */
> +	batch->ptr = batch->buffer;
> +	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
> +			GEN9_FORCE_MEDIA_AWAKE_ENABLE |
> +			GEN9_SAMPLER_DOP_GATE_DISABLE |
> +			GEN9_PIPELINE_SELECTION_MASK |
> +			GEN9_SAMPLER_DOP_GATE_MASK |
> +			GEN9_FORCE_MEDIA_AWAKE_MASK);
> +	gen9_emit_state_base_address(batch);
> +
> +	gen8_emit_vfe_state(batch);
> +
> +	gen8_emit_curbe_load(batch, curbe_buffer);
> +
> +	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> +
> +	gen8_emit_media_objects(batch);
> +
> +	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
> +			GEN9_FORCE_MEDIA_AWAKE_DISABLE |
> +			GEN9_SAMPLER_DOP_GATE_ENABLE |
> +			GEN9_PIPELINE_SELECTION_MASK |
> +			GEN9_SAMPLER_DOP_GATE_MASK |
> +			GEN9_FORCE_MEDIA_AWAKE_MASK);
> +
> +	OUT_BATCH(MI_BATCH_BUFFER_END);
> +
> +	batch_end = batch_align(batch, 8);
> +	igt_assert(batch_end < BATCH_STATE_SPLIT);
> +
> +	gen8_render_flush(batch, batch_end);
> +	intel_batchbuffer_reset(batch);
> +}
> diff --git a/lib/media_spin.h b/lib/media_spin.h
> new file mode 100644
> index 0000000..8bc4829
> --- /dev/null
> +++ b/lib/media_spin.h
> @@ -0,0 +1,39 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + * 	Jeff McGee <jeff.mcgee@intel.com>
> + */
> +
> +#ifndef MEDIA_SPIN_H
> +#define MEDIA_SPIN_H
> +
> +void gen8_media_spinfunc(struct intel_batchbuffer *batch,
> +			 struct igt_buf *dst, uint32_t spins);
> +
> +void gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
> +			   struct igt_buf *dst, uint32_t spins);
> +
> +void gen9_media_spinfunc(struct intel_batchbuffer *batch,
> +			 struct igt_buf *dst, uint32_t spins);
> +
> +#endif /* MEDIA_SPIN_H */
> --
> 2.3.0
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH i-g-t 1/2 v2] lib: Add media spin
  2015-03-25  2:50     ` He, Shuang
@ 2015-03-25 18:07       ` Thomas Wood
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Wood @ 2015-03-25 18:07 UTC (permalink / raw)
  To: He, Shuang; +Cc: intel-gfx@lists.freedesktop.org, Liu, Lei A

On 25 March 2015 at 02:50, He, Shuang <shuang.he@intel.com> wrote:
> (He Shuang on behalf of Liu Lei)
> Tested-by: Lei Liu <lei.a.liu@intel.com>

Thanks, both patches in this series are now merged.


>
> I-G-T test result:
> ./pm_sseu
> IGT-Version: 1.9-g07be8fe (x86_64) (Linux: 4.0.0-rc3_drm-intel-nightly_c09a3b_20150310+ x86_64)
> Subtest full-enable: SUCCESS (0.010s)
>
> Manually test result:
> SSEU Device Info
> Available Slice Total: 1
> Available Subslice Total: 3
> Available Subslice Per Slice: 3
> Available EU Total: 23
> Available EU Per Subslice: 8
> Has Slice Power Gating: no
> Has Subslice Power Gating: no
> Has EU Power Gating: yes
> SSEU Device Status
> Enabled Slice Total: 1
> Enabled Subslice Total: 3
> Enabled Subslice Per Slice: 3
> Enabled EU Total: 24
> Enabled EU Per Subslice: 8
>
> EUs are enabled in pairs. Because one EU in a pair can be fused off, it is possible for the reported number of enabled EUs to exceed the reported number of available EUs. The IGT test allows for this discrepancy and only fails if enabled is less than available, which can only happen if unwanted power gating is applied.
>
> Best wishes
> Liu,Lei
>
>> -----Original Message-----
>> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
>> Of jeff.mcgee@intel.com
>> Sent: Friday, March 13, 2015 1:52 AM
>> To: intel-gfx@lists.freedesktop.org
>> Subject: [Intel-gfx] [PATCH i-g-t 1/2 v2] lib: Add media spin
>>
>> From: Jeff McGee <jeff.mcgee@intel.com>
>>
>> The media spin utility is derived from media fill. The purpose
>> is to create a simple means to keep the render engine (media
>> pipeline) busy for a controlled amount of time. It does so by
>> emitting a batch with a single execution thread that spins in
>> a tight loop the requested number of times. Each spin increments
>> a counter whose final 32-bit value is written to the destination
>> buffer on completion for checking. The implementation supports
>> Gen8, Gen8lp, and Gen9.
>>
>> v2: Apply the recommendations of igt.cocci.
>>
>> Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
>> ---
>>  lib/Makefile.sources    |   2 +
>>  lib/intel_batchbuffer.c |  24 +++
>>  lib/intel_batchbuffer.h |  22 ++
>>  lib/media_spin.c        | 540
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>  lib/media_spin.h        |  39 ++++
>>  5 files changed, 627 insertions(+)
>>  create mode 100644 lib/media_spin.c
>>  create mode 100644 lib/media_spin.h
>>
>> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
>> index 76f353a..3d93629 100644
>> --- a/lib/Makefile.sources
>> +++ b/lib/Makefile.sources
>> @@ -29,6 +29,8 @@ libintel_tools_la_SOURCES =         \
>>       media_fill_gen8.c       \
>>       media_fill_gen8lp.c     \
>>       media_fill_gen9.c       \
>> +     media_spin.h            \
>> +     media_spin.c    \
>>       gen7_media.h            \
>>       gen8_media.h            \
>>       rendercopy_i915.c       \
>> diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
>> index 666c323..195ccc4 100644
>> --- a/lib/intel_batchbuffer.c
>> +++ b/lib/intel_batchbuffer.c
>> @@ -40,6 +40,7 @@
>>  #include "rendercopy.h"
>>  #include "media_fill.h"
>>  #include "ioctl_wrappers.h"
>> +#include "media_spin.h"
>>
>>  #include <i915_drm.h>
>>
>> @@ -785,3 +786,26 @@ igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid)
>>
>>       return fill;
>>  }
>> +
>> +/**
>> + * igt_get_media_spinfunc:
>> + * @devid: pci device id
>> + *
>> + * Returns:
>> + *
>> + * The platform-specific media spin function pointer for the device specified
>> + * with @devid. Will return NULL when no media spin function is
>> implemented.
>> + */
>> +igt_media_spinfunc_t igt_get_media_spinfunc(int devid)
>> +{
>> +     igt_media_spinfunc_t spin = NULL;
>> +
>> +     if (IS_GEN9(devid))
>> +             spin = gen9_media_spinfunc;
>> +     else if (IS_BROADWELL(devid))
>> +             spin = gen8_media_spinfunc;
>> +     else if (IS_CHERRYVIEW(devid))
>> +             spin = gen8lp_media_spinfunc;
>> +
>> +     return spin;
>> +}
>> diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
>> index fa8875b..62c8396 100644
>> --- a/lib/intel_batchbuffer.h
>> +++ b/lib/intel_batchbuffer.h
>> @@ -300,4 +300,26 @@ typedef void (*igt_fillfunc_t)(struct
>> intel_batchbuffer *batch,
>>  igt_fillfunc_t igt_get_media_fillfunc(int devid);
>>  igt_fillfunc_t igt_get_gpgpu_fillfunc(int devid);
>>
>> +/**
>> + * igt_media_spinfunc_t:
>> + * @batch: batchbuffer object
>> + * @dst: destination i-g-t buffer object
>> + * @spins: number of loops to execute
>> + *
>> + * This is the type of the per-platform media spin functions. The
>> + * platform-specific implementation can be obtained by calling
>> + * igt_get_media_spinfunc().
>> + *
>> + * The media spin function emits a batchbuffer for the render engine with
>> + * the media pipeline selected. The workload consists of a single thread
>> + * which spins in a tight loop the requested number of times. Each spin
>> + * increments a counter whose final 32-bit value is written to the
>> + * destination buffer on completion. This utility provides a simple way
>> + * to keep the render engine busy for a set time for various tests.
>> + */
>> +typedef void (*igt_media_spinfunc_t)(struct intel_batchbuffer *batch,
>> +                                  struct igt_buf *dst, uint32_t spins);
>> +
>> +igt_media_spinfunc_t igt_get_media_spinfunc(int devid);
>> +
>>  #endif
>> diff --git a/lib/media_spin.c b/lib/media_spin.c
>> new file mode 100644
>> index 0000000..580c109
>> --- /dev/null
>> +++ b/lib/media_spin.c
>> @@ -0,0 +1,540 @@
>> +/*
>> + * Copyright © 2015 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>> EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
>> DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + * Authors:
>> + *   Jeff McGee <jeff.mcgee@intel.com>
>> + */
>> +
>> +#include <intel_bufmgr.h>
>> +#include <i915_drm.h>
>> +#include "intel_reg.h"
>> +#include "drmtest.h"
>> +#include "intel_batchbuffer.h"
>> +#include "gen8_media.h"
>> +#include "media_spin.h"
>> +
>> +static const uint32_t spin_kernel[][4] = {
>> +     { 0x00600001, 0x20800208, 0x008d0000, 0x00000000 }, /* mov
>> (8)r4.0<1>:ud r0.0<8;8;1>:ud */
>> +     { 0x00200001, 0x20800208, 0x00450040, 0x00000000 }, /* mov
>> (2)r4.0<1>.ud r2.0<2;2;1>:ud */
>> +     { 0x00000001, 0x20880608, 0x00000000, 0x00000003 }, /* mov
>> (1)r4.8<1>:ud 0x3 */
>> +     { 0x00000001, 0x20a00608, 0x00000000, 0x00000000 }, /* mov
>> (1)r5.0<1>:ud 0 */
>> +     { 0x00000040, 0x20a00208, 0x060000a0, 0x00000001 }, /* add
>> (1)r5.0<1>:ud r5.0<0;1;0>:ud 1 */
>> +     { 0x01000010, 0x20000200, 0x02000020, 0x000000a0 }, /* cmp.e.f0.0
>> (1)null<1> r1<0;1;0> r5<0;1;0> */
>> +     { 0x00110027, 0x00000000, 0x00000000, 0xffffffe0 }, /* ~f0.0 while (1)
>> -32 */
>> +     { 0x0c800031, 0x20000a00, 0x0e000080, 0x040a8000 }, /* send.dcdp1
>> (16)null<1> r4.0<0;1;0> 0x040a8000 */
>> +     { 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 }, /* mov
>> (8)r112<1>:ud r0.0<8;8;1>:ud */
>> +     { 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 }, /* send.ts
>> (16)null<1> r112<0;1;0>:d 0x82000010 */
>> +};
>> +
>> +static uint32_t
>> +batch_used(struct intel_batchbuffer *batch)
>> +{
>> +     return batch->ptr - batch->buffer;
>> +}
>> +
>> +static uint32_t
>> +batch_align(struct intel_batchbuffer *batch, uint32_t align)
>> +{
>> +     uint32_t offset = batch_used(batch);
>> +     offset = ALIGN(offset, align);
>> +     batch->ptr = batch->buffer + offset;
>> +     return offset;
>> +}
>> +
>> +static void *
>> +batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
>> +{
>> +     uint32_t offset = batch_align(batch, align);
>> +     batch->ptr += size;
>> +     return memset(batch->buffer + offset, 0, size);
>> +}
>> +
>> +static uint32_t
>> +batch_offset(struct intel_batchbuffer *batch, void *ptr)
>> +{
>> +     return (uint8_t *)ptr - batch->buffer;
>> +}
>> +
>> +static uint32_t
>> +batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size,
>> +        uint32_t align)
>> +{
>> +     return batch_offset(batch, memcpy(batch_alloc(batch, size, align),
>> ptr, size));
>> +}
>> +
>> +static void
>> +gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
>> +{
>> +     int ret;
>> +
>> +     ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
>> +     if (ret == 0)
>> +             ret = drm_intel_gem_bo_context_exec(batch->bo, NULL,
>> +                                                 batch_end, 0);
>> +     igt_assert_eq(ret, 0);
>> +}
>> +
>> +static uint32_t
>> +gen8_spin_curbe_buffer_data(struct intel_batchbuffer *batch,
>> +                         uint32_t iters)
>> +{
>> +     uint32_t *curbe_buffer;
>> +     uint32_t offset;
>> +
>> +     curbe_buffer = batch_alloc(batch, 64, 64);
>> +     offset = batch_offset(batch, curbe_buffer);
>> +     *curbe_buffer = iters;
>> +
>> +     return offset;
>> +}
>> +
>> +static uint32_t
>> +gen8_spin_surface_state(struct intel_batchbuffer *batch,
>> +                     struct igt_buf *buf,
>> +                     uint32_t format,
>> +                     int is_dst)
>> +{
>> +     struct gen8_surface_state *ss;
>> +     uint32_t write_domain, read_domain, offset;
>> +     int ret;
>> +
>> +     if (is_dst) {
>> +             write_domain = read_domain =
>> I915_GEM_DOMAIN_RENDER;
>> +     } else {
>> +             write_domain = 0;
>> +             read_domain = I915_GEM_DOMAIN_SAMPLER;
>> +     }
>> +
>> +     ss = batch_alloc(batch, sizeof(*ss), 64);
>> +     offset = batch_offset(batch, ss);
>> +
>> +     ss->ss0.surface_type = GEN8_SURFACE_2D;
>> +     ss->ss0.surface_format = format;
>> +     ss->ss0.render_cache_read_write = 1;
>> +     ss->ss0.vertical_alignment = 1; /* align 4 */
>> +     ss->ss0.horizontal_alignment = 1; /* align 4 */
>> +
>> +     if (buf->tiling == I915_TILING_X)
>> +             ss->ss0.tiled_mode = 2;
>> +     else if (buf->tiling == I915_TILING_Y)
>> +             ss->ss0.tiled_mode = 3;
>> +
>> +     ss->ss8.base_addr = buf->bo->offset;
>> +
>> +     ret = drm_intel_bo_emit_reloc(batch->bo,
>> +                             batch_offset(batch, ss) + 8 * 4,
>> +                             buf->bo, 0,
>> +                             read_domain, write_domain);
>> +     igt_assert_eq(ret, 0);
>> +
>> +     ss->ss2.height = igt_buf_height(buf) - 1;
>> +     ss->ss2.width  = igt_buf_width(buf) - 1;
>> +     ss->ss3.pitch  = buf->stride - 1;
>> +
>> +     ss->ss7.shader_chanel_select_r = 4;
>> +     ss->ss7.shader_chanel_select_g = 5;
>> +     ss->ss7.shader_chanel_select_b = 6;
>> +     ss->ss7.shader_chanel_select_a = 7;
>> +
>> +     return offset;
>> +}
>> +
>> +static uint32_t
>> +gen8_spin_binding_table(struct intel_batchbuffer *batch,
>> +                     struct igt_buf *dst)
>> +{
>> +     uint32_t *binding_table, offset;
>> +
>> +     binding_table = batch_alloc(batch, 32, 64);
>> +     offset = batch_offset(batch, binding_table);
>> +
>> +     binding_table[0] = gen8_spin_surface_state(batch, dst,
>> +                     GEN8_SURFACEFORMAT_R8_UNORM, 1);
>> +
>> +     return offset;
>> +}
>> +
>> +static uint32_t
>> +gen8_spin_media_kernel(struct intel_batchbuffer *batch,
>> +                    const uint32_t kernel[][4],
>> +                    size_t size)
>> +{
>> +     uint32_t offset;
>> +
>> +     offset = batch_copy(batch, kernel, size, 64);
>> +
>> +     return offset;
>> +}
>> +
>> +static uint32_t
>> +gen8_spin_interface_descriptor(struct intel_batchbuffer *batch,
>> +                            struct igt_buf *dst)
>> +{
>> +     struct gen8_interface_descriptor_data *idd;
>> +     uint32_t offset;
>> +     uint32_t binding_table_offset, kernel_offset;
>> +
>> +     binding_table_offset = gen8_spin_binding_table(batch, dst);
>> +     kernel_offset = gen8_spin_media_kernel(batch, spin_kernel,
>> +                                            sizeof(spin_kernel));
>> +
>> +     idd = batch_alloc(batch, sizeof(*idd), 64);
>> +     offset = batch_offset(batch, idd);
>> +
>> +     idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
>> +
>> +     idd->desc2.single_program_flow = 1;
>> +     idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
>> +
>> +     idd->desc3.sampler_count = 0;      /* 0 samplers used */
>> +     idd->desc3.sampler_state_pointer = 0;
>> +
>> +     idd->desc4.binding_table_entry_count = 0;
>> +     idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
>> +
>> +     idd->desc5.constant_urb_entry_read_offset = 0;
>> +     idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
>> +
>> +     return offset;
>> +}
>> +
>> +static void
>> +gen8_emit_state_base_address(struct intel_batchbuffer *batch)
>> +{
>> +     OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
>> +
>> +     /* general */
>> +     OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
>> +     OUT_BATCH(0);
>> +
>> +     /* stateless data port */
>> +     OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
>> +
>> +     /* surface */
>> +     OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
>> +
>> +     /* dynamic */
>> +     OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
>> +             0, BASE_ADDRESS_MODIFY);
>> +
>> +     /* indirect */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* instruction */
>> +     OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
>> +
>> +     /* general state buffer size */
>> +     OUT_BATCH(0xfffff000 | 1);
>> +     /* dynamic state buffer size */
>> +     OUT_BATCH(1 << 12 | 1);
>> +     /* indirect object buffer size */
>> +     OUT_BATCH(0xfffff000 | 1);
>> +     /* instruction buffer size; the modify-enable bit must be set,
>> +      * otherwise a GPU hang may result */
>> +     OUT_BATCH(1 << 12 | 1);
>> +}
>> +
>> +static void
>> +gen9_emit_state_base_address(struct intel_batchbuffer *batch)
>> +{
>> +     OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
>> +
>> +     /* general */
>> +     OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
>> +     OUT_BATCH(0);
>> +
>> +     /* stateless data port */
>> +     OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
>> +
>> +     /* surface */
>> +     OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
>> +
>> +     /* dynamic */
>> +     OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
>> +             0, BASE_ADDRESS_MODIFY);
>> +
>> +     /* indirect */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* instruction */
>> +     OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
>> +
>> +     /* general state buffer size */
>> +     OUT_BATCH(0xfffff000 | 1);
>> +     /* dynamic state buffer size */
>> +     OUT_BATCH(1 << 12 | 1);
>> +     /* indirect object buffer size */
>> +     OUT_BATCH(0xfffff000 | 1);
>> +     /* instruction buffer size; the modify-enable bit must be set,
>> +      * otherwise a GPU hang may result */
>> +     OUT_BATCH(1 << 12 | 1);
>> +
>> +     /* Bindless surface state base address */
>> +     OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0xfffff000);
>> +}
>> +
>> +static void
>> +gen8_emit_vfe_state(struct intel_batchbuffer *batch)
>> +{
>> +     OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
>> +
>> +     /* scratch buffer */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* number of threads & urb entries */
>> +     OUT_BATCH(2 << 8);
>> +
>> +     OUT_BATCH(0);
>> +
>> +     /* urb entry size & curbe size */
>> +     OUT_BATCH(2 << 16 |
>> +             2);
>> +
>> +     /* scoreboard */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +}
>> +
>> +static void
>> +gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
>> +{
>> +     OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
>> +     OUT_BATCH(0);
>> +     /* curbe total data length */
>> +     OUT_BATCH(64);
>> +     /* curbe data start address, relative to the dynamic state base address */
>> +     OUT_BATCH(curbe_buffer);
>> +}
>> +
>> +static void
>> +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch,
>> +                                 uint32_t interface_descriptor)
>> +{
>> +     OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
>> +     OUT_BATCH(0);
>> +     /* interface descriptor data length */
>> +     OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
>> +     /* interface descriptor address, relative to the dynamic state base address */
>> +     OUT_BATCH(interface_descriptor);
>> +}
>> +
>> +static void
>> +gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
>> +{
>> +     OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
>> +     OUT_BATCH(0);
>> +}
>> +
>> +static void
>> +gen8_emit_media_objects(struct intel_batchbuffer *batch)
>> +{
>> +     OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
>> +
>> +     /* interface descriptor offset */
>> +     OUT_BATCH(0);
>> +
>> +     /* without indirect data */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* scoreboard */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* inline data (xoffset, yoffset) */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +     gen8_emit_media_state_flush(batch);
>> +}
>> +
>> +static void
>> +gen8lp_emit_media_objects(struct intel_batchbuffer *batch)
>> +{
>> +     OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
>> +
>> +     /* interface descriptor offset */
>> +     OUT_BATCH(0);
>> +
>> +     /* without indirect data */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* scoreboard */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +
>> +     /* inline data (xoffset, yoffset) */
>> +     OUT_BATCH(0);
>> +     OUT_BATCH(0);
>> +}
>> +
>> +/*
>> + * This sets up the media pipeline,
>> + *
>> + * +---------------+ <---- 4096
>> + * |       ^       |
>> + * |       |       |
>> + * |    various    |
>> + * |      state    |
>> + * |       |       |
>> + * |_______|_______| <---- 2048 + ?
>> + * |       ^       |
>> + * |       |       |
>> + * |   batch       |
>> + * |    commands   |
>> + * |       |       |
>> + * |       |       |
>> + * +---------------+ <---- 0 + ?
>> + *
>> + */
>> +
>> +#define BATCH_STATE_SPLIT 2048
>> +
>> +void
>> +gen8_media_spinfunc(struct intel_batchbuffer *batch,
>> +                 struct igt_buf *dst, uint32_t spins)
>> +{
>> +     uint32_t curbe_buffer, interface_descriptor;
>> +     uint32_t batch_end;
>> +
>> +     intel_batchbuffer_flush_with_context(batch, NULL);
>> +
>> +     /* setup states */
>> +     batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
>> +
>> +     curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
>> +     interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
>> +     igt_assert(batch->ptr < &batch->buffer[4095]);
>> +
>> +     /* media pipeline */
>> +     batch->ptr = batch->buffer;
>> +     OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
>> +     gen8_emit_state_base_address(batch);
>> +
>> +     gen8_emit_vfe_state(batch);
>> +
>> +     gen8_emit_curbe_load(batch, curbe_buffer);
>> +
>> +     gen8_emit_interface_descriptor_load(batch, interface_descriptor);
>> +
>> +     gen8_emit_media_objects(batch);
>> +
>> +     OUT_BATCH(MI_BATCH_BUFFER_END);
>> +
>> +     batch_end = batch_align(batch, 8);
>> +     igt_assert(batch_end < BATCH_STATE_SPLIT);
>> +
>> +     gen8_render_flush(batch, batch_end);
>> +     intel_batchbuffer_reset(batch);
>> +}
>> +
>> +void
>> +gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
>> +                   struct igt_buf *dst, uint32_t spins)
>> +{
>> +     uint32_t curbe_buffer, interface_descriptor;
>> +     uint32_t batch_end;
>> +
>> +     intel_batchbuffer_flush_with_context(batch, NULL);
>> +
>> +     /* setup states */
>> +     batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
>> +
>> +     curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
>> +     interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
>> +     igt_assert(batch->ptr < &batch->buffer[4095]);
>> +
>> +     /* media pipeline */
>> +     batch->ptr = batch->buffer;
>> +     OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
>> +     gen8_emit_state_base_address(batch);
>> +
>> +     gen8_emit_vfe_state(batch);
>> +
>> +     gen8_emit_curbe_load(batch, curbe_buffer);
>> +
>> +     gen8_emit_interface_descriptor_load(batch, interface_descriptor);
>> +
>> +     gen8lp_emit_media_objects(batch);
>> +
>> +     OUT_BATCH(MI_BATCH_BUFFER_END);
>> +
>> +     batch_end = batch_align(batch, 8);
>> +     igt_assert(batch_end < BATCH_STATE_SPLIT);
>> +
>> +     gen8_render_flush(batch, batch_end);
>> +     intel_batchbuffer_reset(batch);
>> +}
>> +
>> +void
>> +gen9_media_spinfunc(struct intel_batchbuffer *batch,
>> +                 struct igt_buf *dst, uint32_t spins)
>> +{
>> +     uint32_t curbe_buffer, interface_descriptor;
>> +     uint32_t batch_end;
>> +
>> +     intel_batchbuffer_flush_with_context(batch, NULL);
>> +
>> +     /* setup states */
>> +     batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
>> +
>> +     curbe_buffer = gen8_spin_curbe_buffer_data(batch, spins);
>> +     interface_descriptor = gen8_spin_interface_descriptor(batch, dst);
>> +     igt_assert(batch->ptr < &batch->buffer[4095]);
>> +
>> +     /* media pipeline */
>> +     batch->ptr = batch->buffer;
>> +     OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
>> +                     GEN9_FORCE_MEDIA_AWAKE_ENABLE |
>> +                     GEN9_SAMPLER_DOP_GATE_DISABLE |
>> +                     GEN9_PIPELINE_SELECTION_MASK |
>> +                     GEN9_SAMPLER_DOP_GATE_MASK |
>> +                     GEN9_FORCE_MEDIA_AWAKE_MASK);
>> +     gen9_emit_state_base_address(batch);
>> +
>> +     gen8_emit_vfe_state(batch);
>> +
>> +     gen8_emit_curbe_load(batch, curbe_buffer);
>> +
>> +     gen8_emit_interface_descriptor_load(batch, interface_descriptor);
>> +
>> +     gen8_emit_media_objects(batch);
>> +
>> +     OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
>> +                     GEN9_FORCE_MEDIA_AWAKE_DISABLE |
>> +                     GEN9_SAMPLER_DOP_GATE_ENABLE |
>> +                     GEN9_PIPELINE_SELECTION_MASK |
>> +                     GEN9_SAMPLER_DOP_GATE_MASK |
>> +                     GEN9_FORCE_MEDIA_AWAKE_MASK);
>> +
>> +     OUT_BATCH(MI_BATCH_BUFFER_END);
>> +
>> +     batch_end = batch_align(batch, 8);
>> +     igt_assert(batch_end < BATCH_STATE_SPLIT);
>> +
>> +     gen8_render_flush(batch, batch_end);
>> +     intel_batchbuffer_reset(batch);
>> +}
>> diff --git a/lib/media_spin.h b/lib/media_spin.h
>> new file mode 100644
>> index 0000000..8bc4829
>> --- /dev/null
>> +++ b/lib/media_spin.h
>> @@ -0,0 +1,39 @@
>> +/*
>> + * Copyright © 2015 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + * Authors:
>> + *   Jeff McGee <jeff.mcgee@intel.com>
>> + */
>> +
>> +#ifndef MEDIA_SPIN_H
>> +#define MEDIA_SPIN_H
>> +
>> +void gen8_media_spinfunc(struct intel_batchbuffer *batch,
>> +                      struct igt_buf *dst, uint32_t spins);
>> +
>> +void gen8lp_media_spinfunc(struct intel_batchbuffer *batch,
>> +                        struct igt_buf *dst, uint32_t spins);
>> +
>> +void gen9_media_spinfunc(struct intel_batchbuffer *batch,
>> +                      struct igt_buf *dst, uint32_t spins);
>> +
>> +#endif /* MEDIA_SPIN_H */
>> --
>> 2.3.0
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


end of thread, other threads:[~2015-03-25 18:07 UTC | newest]

Thread overview: 10+ messages (links below jump to the message on this page):
2015-03-10 21:17 [PATCH i-g-t 0/2] Confirm full SSEU enable on Gen9+ jeff.mcgee
2015-03-10 21:17 ` [PATCH i-g-t 1/2] lib: Add media spin jeff.mcgee
2015-03-12 17:52   ` [PATCH i-g-t 1/2 v2] " jeff.mcgee
2015-03-25  2:50     ` He, Shuang
2015-03-25 18:07       ` Thomas Wood
2015-03-10 21:17 ` [PATCH i-g-t 2/2] tests/pm_sseu: Create new test pm_sseu jeff.mcgee
2015-03-12 12:09   ` Thomas Wood
2015-03-18 16:51     ` Jeff McGee
2015-03-12 17:54   ` [PATCH i-g-t 2/2 v2] " jeff.mcgee
2015-03-24 23:20     ` [PATCH i-g-t 2/2 v3] " jeff.mcgee
