intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use
@ 2015-12-12 20:02 Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 2/9] lib: Expand igt_hang_ring() to select target context and various options Chris Wilson
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

If we move the igt_require() into the hang injector, this makes simple
test cases even more convenient. More complex test cases can always do
their own precursory check before settting up the test.

However, this does embed the assumption that the first context we are
called from is safe (i.e no i915.enable_hangcheck/i915.reset
interferrence).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/igt_gt.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/lib/igt_gt.c b/lib/igt_gt.c
index 688ea5e..3d618d6 100644
--- a/lib/igt_gt.c
+++ b/lib/igt_gt.c
@@ -50,17 +50,21 @@
 
 static bool has_gpu_reset(int fd)
 {
-	struct drm_i915_getparam gp;
-	int val = 0;
-
-	memset(&gp, 0, sizeof(gp));
-	gp.param = 35; /* HAS_GPU_RESET */
-	gp.value = &val;
-
-	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
-		return intel_gen(intel_get_drm_devid(fd)) >= 5;
-
-	return val > 0;
+	static int once = -1;
+	if (once < 0) {
+		struct drm_i915_getparam gp;
+		int val = 0;
+
+		memset(&gp, 0, sizeof(gp));
+		gp.param = 35; /* HAS_GPU_RESET */
+		gp.value = &val;
+
+		if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
+			once = intel_gen(intel_get_drm_devid(fd)) >= 5;
+		else
+			once = val > 0;
+	}
+	return once;
 }
 
 /**
@@ -74,6 +78,7 @@ static bool has_gpu_reset(int fd)
  */
 void igt_require_hang_ring(int fd, int ring)
 {
+	gem_require_ring(fd, ring);
 	gem_context_require_ban_period(fd);
 	igt_require(has_gpu_reset(fd));
 }
@@ -100,6 +105,8 @@ igt_hang_ring_t igt_hang_ring(int fd, int ring)
 	unsigned ban;
 	unsigned len;
 
+	igt_require_hang_ring(fd, ring);
+
 	param.context = 0;
 	param.size = 0;
 	param.param = LOCAL_CONTEXT_PARAM_BAN_PERIOD;
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 2/9] lib: Expand igt_hang_ring() to select target context and various options
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 3/9] igt/drv_hangman: Inject a true hang Chris Wilson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Some potential callers want to inject a hang into a particular context,
some want to trigger an actual ban and others may or may not want to
capture the associated error state. Expand the hang injection interface
to suit all.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/igt_gt.c         | 82 +++++++++++++++++++++++++++++++++++++++++++++-------
 lib/igt_gt.h         | 12 ++++++++
 lib/ioctl_wrappers.c | 14 +++++++--
 lib/ioctl_wrappers.h |  1 +
 4 files changed, 96 insertions(+), 13 deletions(-)

diff --git a/lib/igt_gt.c b/lib/igt_gt.c
index 3d618d6..242bad1 100644
--- a/lib/igt_gt.c
+++ b/lib/igt_gt.c
@@ -37,6 +37,8 @@
 #include "intel_reg.h"
 #include "intel_chipset.h"
 
+#define LOCAL_CONTEXT_PARAM_NO_ERROR_CAPTURE 0x4
+
 /**
  * SECTION:igt_gt
  * @short_description: GT support library
@@ -84,18 +86,25 @@ void igt_require_hang_ring(int fd, int ring)
 }
 
 /**
- * igt_hang_ring:
+ * igt_hang_ring_ctx:
  * @fd: open i915 drm file descriptor
+ * @ctx: the contxt specifier
  * @ring: execbuf ring flag
+ * @flags: set of flags to control execution
  *
- * This helper function injects a hanging batch into @ring. It returns a
- * #igt_hang_ring_t structure which must be passed to igt_post_hang_ring() for
- * hang post-processing (after the gpu hang interaction has been tested.
+ * This helper function injects a hanging batch associated with @ctx into @ring.
+ * It returns a #igt_hang_ring_t structure which must be passed to
+ * igt_post_hang_ring() for hang post-processing (after the gpu hang
+ * interaction has been tested.
  *
  * Returns:
  * Structure with helper internal state for igt_post_hang_ring().
  */
-igt_hang_ring_t igt_hang_ring(int fd, int ring)
+igt_hang_ring_t igt_hang_ctx(int fd,
+			     uint32_t ctx,
+			     int ring,
+			     unsigned flags,
+			     uint64_t *offset)
 {
 	struct drm_i915_gem_relocation_entry reloc;
 	struct drm_i915_gem_execbuffer2 execbuf;
@@ -107,15 +116,28 @@ igt_hang_ring_t igt_hang_ring(int fd, int ring)
 
 	igt_require_hang_ring(fd, ring);
 
-	param.context = 0;
+	/* One day the kernel ABI will be fixed! */
+	igt_require(ctx == 0 || ring == I915_EXEC_RENDER);
+
+	if ((flags & HANG_ALLOW_CAPTURE) == 0) {
+		param.context = ctx;
+		param.size = 0;
+		param.param = LOCAL_CONTEXT_PARAM_NO_ERROR_CAPTURE;
+		param.value = 1;
+		__gem_context_set_param(fd, &param);
+	}
+
+	param.context = ctx;
 	param.size = 0;
 	param.param = LOCAL_CONTEXT_PARAM_BAN_PERIOD;
 	param.value = 0;
 	gem_context_get_param(fd, &param);
 	ban = param.value;
 
-	param.value = 0;
-	gem_context_set_param(fd, &param);
+	if ((flags & HANG_ALLOW_BAN) == 0) {
+		param.value = 0;
+		gem_context_set_param(fd, &param);
+	}
 
 	memset(&reloc, 0, sizeof(reloc));
 	memset(&exec, 0, sizeof(exec));
@@ -125,6 +147,7 @@ igt_hang_ring_t igt_hang_ring(int fd, int ring)
 	exec.relocation_count = 1;
 	exec.relocs_ptr = (uintptr_t)&reloc;
 
+	memset(b, 0xc5, sizeof(b));
 	len = 2;
 	if (intel_gen(intel_get_drm_devid(fd)) >= 8)
 		len++;
@@ -141,9 +164,39 @@ igt_hang_ring_t igt_hang_ring(int fd, int ring)
 	execbuf.buffer_count = 1;
 	execbuf.batch_len = sizeof(b);
 	execbuf.flags = ring;
+	i915_execbuffer2_set_context_id(execbuf, ctx);
 	gem_execbuf(fd, &execbuf);
 
-	return (struct igt_hang_ring){ exec.handle, ban };
+	if (offset)
+		*offset = exec.offset;
+
+	return (struct igt_hang_ring){ exec.handle, ctx, ban, flags };
+}
+
+/**
+ * igt_hang_ring:
+ * @fd: open i915 drm file descriptor
+ * @ring: execbuf ring flag
+ *
+ * This helper function injects a hanging batch into @ring. It returns a
+ * #igt_hang_ring_t structure which must be passed to igt_post_hang_ring() for
+ * hang post-processing (after the gpu hang interaction has been tested.
+ *
+ * Returns:
+ * Structure with helper internal state for igt_post_hang_ring().
+ */
+igt_hang_ring_t igt_hang_ring(int fd, int ring)
+{
+	return igt_hang_ctx(fd, 0, ring, 0, NULL);
+}
+
+static void eat_error_state(void)
+{
+	int fd;
+
+	fd = igt_debugfs_open("i915_error_state", O_WRONLY);
+	igt_assert(write(fd, "", 1) == 1);
+	close(fd);
 }
 
 /**
@@ -165,11 +218,20 @@ void igt_post_hang_ring(int fd, struct igt_hang_ring arg)
 		       I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
 	gem_close(fd, arg.handle);
 
-	param.context = 0;
+	param.context = arg.ctx;
 	param.size = 0;
 	param.param = LOCAL_CONTEXT_PARAM_BAN_PERIOD;
 	param.value = arg.ban;
 	gem_context_set_param(fd, &param);
+
+	if ((arg.flags & HANG_ALLOW_CAPTURE) == 0) {
+		param.context = arg.ctx;
+		param.size = 0;
+		param.param = LOCAL_CONTEXT_PARAM_NO_ERROR_CAPTURE;
+		param.value = 0;
+		if (__gem_context_set_param(fd, &param))
+			eat_error_state();
+	}
 }
 
 /* GPU abusers */
diff --git a/lib/igt_gt.h b/lib/igt_gt.h
index b70bbd1..4acf42b 100644
--- a/lib/igt_gt.h
+++ b/lib/igt_gt.h
@@ -30,9 +30,21 @@ void igt_require_hang_ring(int fd, int ring);
 
 typedef struct igt_hang_ring {
 	unsigned handle;
+	unsigned ctx;
 	unsigned ban;
+	unsigned flags;
 } igt_hang_ring_t;
 
+#define HANG_POISON 0xc5c5c5c5
+
+struct igt_hang_ring igt_hang_ctx(int fd,
+				  uint32_t ctx,
+				  int ring,
+				  unsigned flags,
+				  uint64_t *offset);
+#define HANG_ALLOW_BAN 1
+#define HANG_ALLOW_CAPTURE 2
+
 struct igt_hang_ring igt_hang_ring(int fd, int ring);
 void igt_post_hang_ring(int fd, struct igt_hang_ring arg);
 
diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index e348f26..e1c0f28 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -816,6 +816,16 @@ void gem_context_get_param(int fd, struct local_i915_gem_context_param *p)
 	do_ioctl(fd, LOCAL_IOCTL_I915_GEM_CONTEXT_GETPARAM, p);
 }
 
+int __gem_context_set_param(int fd, struct local_i915_gem_context_param *p)
+{
+#define LOCAL_I915_GEM_CONTEXT_SETPARAM       0x35
+#define LOCAL_IOCTL_I915_GEM_CONTEXT_SETPARAM DRM_IOWR (DRM_COMMAND_BASE + LOCAL_I915_GEM_CONTEXT_SETPARAM, struct local_i915_gem_context_param)
+	if (drmIoctl(fd, LOCAL_IOCTL_I915_GEM_CONTEXT_SETPARAM, p))
+		return -errno;
+
+	errno = 0;
+	return 0;
+}
 /**
  * gem_context_set_param:
  * @fd: open i915 drm file descriptor
@@ -828,9 +838,7 @@ void gem_context_get_param(int fd, struct local_i915_gem_context_param *p)
  */
 void gem_context_set_param(int fd, struct local_i915_gem_context_param *p)
 {
-#define LOCAL_I915_GEM_CONTEXT_SETPARAM       0x35
-#define LOCAL_IOCTL_I915_GEM_CONTEXT_SETPARAM DRM_IOWR (DRM_COMMAND_BASE + LOCAL_I915_GEM_CONTEXT_SETPARAM, struct local_i915_gem_context_param)
-	do_ioctl(fd, LOCAL_IOCTL_I915_GEM_CONTEXT_SETPARAM, p);
+	igt_assert(__gem_context_set_param(fd, p) == 0);
 }
 
 /**
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index fe2f687..214ec78 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -111,6 +111,7 @@ void gem_context_require_ban_period(int fd);
 void gem_context_require_param(int fd, uint64_t param);
 void gem_context_get_param(int fd, struct local_i915_gem_context_param *p);
 void gem_context_set_param(int fd, struct local_i915_gem_context_param *p);
+int __gem_context_set_param(int fd, struct local_i915_gem_context_param *p);
 
 void gem_sw_finish(int fd, uint32_t handle);
 
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 3/9] igt/drv_hangman: Inject a true hang
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 2/9] lib: Expand igt_hang_ring() to select target context and various options Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-16  9:02   ` Daniel Vetter
  2015-12-12 20:02 ` [PATCH igt 4/9] igt/gem_ctx_exec: Convert from stop-rings to a real GPU hang/reset Chris Wilson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/drv_hangman.c | 66 ++++++++++++-----------------------------------------
 1 file changed, 14 insertions(+), 52 deletions(-)

diff --git a/tests/drv_hangman.c b/tests/drv_hangman.c
index cd63b97..866d9dc 100644
--- a/tests/drv_hangman.c
+++ b/tests/drv_hangman.c
@@ -147,59 +147,21 @@ static void assert_error_state_collected(void)
 	assert_dfs_entry_not("i915_error_state", "no error state collected");
 }
 
-#define MAGIC_NUMBER 0x10001
-const uint32_t batch[] = { MI_NOOP,
-			   MI_BATCH_BUFFER_END,
-			   MAGIC_NUMBER,
-			   MAGIC_NUMBER };
+const uint32_t *batch;
 
-static uint64_t submit_batch(int fd, unsigned ring_id, bool stop_ring)
+static uint64_t submit_hang(int fd, unsigned ring_id)
 {
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 exec;
-	uint64_t presumed_offset;
-
-	gem_require_ring(fd, ring_id);
-
-	exec.handle = gem_create(fd, 4096);
-	gem_write(fd, exec.handle, 0, batch, sizeof(batch));
-	exec.relocation_count = 0;
-	exec.relocs_ptr = 0;
-	exec.alignment = 0;
-	exec.offset = 0;
-	exec.flags = 0;
-	exec.rsvd1 = 0;
-	exec.rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)&exec;
-	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
-	execbuf.batch_len = sizeof(batch);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
-	execbuf.flags = ring_id;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
-
-	gem_execbuf(fd, &execbuf);
-	gem_sync(fd, exec.handle);
-	presumed_offset = exec.offset;
-
-	if (stop_ring) {
-		igt_set_stop_rings(igt_to_stop_ring_flag(ring_id));
-
-		gem_execbuf(fd, &execbuf);
-		gem_sync(fd, exec.handle);
-
-		igt_assert(igt_get_stop_rings() == STOP_RING_NONE);
-		igt_assert(presumed_offset == exec.offset);
-	}
+	uint64_t offset;
+	igt_hang_ring_t hang;
+
+	hang = igt_hang_ctx(fd, 0, ring_id, HANG_ALLOW_CAPTURE, &offset);
+
+	batch = gem_mmap__cpu(fd, hang.handle, 0, 4096, PROT_READ);
+	gem_set_domain(fd, hang.handle, I915_GEM_DOMAIN_CPU, 0);
 
-	gem_close(fd, exec.handle);
+	igt_post_hang_ring(fd, hang);
 
-	return exec.offset;
+	return offset;
 }
 
 static void clear_error_state(void)
@@ -222,7 +184,7 @@ static void test_error_state_basic(void)
 	clear_error_state();
 	assert_error_state_clear();
 
-	submit_batch(fd, I915_EXEC_RENDER, true);
+	submit_hang(fd, I915_EXEC_RENDER);
 	close(fd);
 
 	assert_error_state_collected();
@@ -276,7 +238,7 @@ static void check_error_state(const int gen,
 			if (!uses_cmd_parser)
 				igt_assert(gtt_offset == expected_offset);
 
-			for (i = 0; i < sizeof(batch) / 4; i++) {
+			for (i = 0; i < 1024; i++) {
 				igt_assert(getline(&line, &line_size, file) > 0);
 				snprintf(expected_line, sizeof(expected_line), "%08x :  %08x",
 					 4*i, batch[i]);
@@ -381,7 +343,7 @@ static void test_error_state_capture(unsigned ring_id,
 	gen = intel_gen(intel_get_drm_devid(fd));
 	cmd_parser = uses_cmd_parser(fd, gen);
 
-	offset = submit_batch(fd, ring_id, true);
+	offset = submit_hang(fd, ring_id);
 	close(fd);
 
 	check_error_state(gen, cmd_parser, ring_name, offset);
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 4/9] igt/gem_ctx_exec: Convert from stop-rings to a real GPU hang/reset
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 2/9] lib: Expand igt_hang_ring() to select target context and various options Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 3/9] igt/drv_hangman: Inject a true hang Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 5/9] igt/gem_reset_stats: Convert from stop-rings to real hang injection Chris Wilson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_ctx_exec.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/gem_ctx_exec.c b/tests/gem_ctx_exec.c
index 179e991..0ec898a 100644
--- a/tests/gem_ctx_exec.c
+++ b/tests/gem_ctx_exec.c
@@ -188,6 +188,8 @@ igt_main
 	igt_subtest("reset-pin-leak") {
 		int i;
 
+		igt_require_hang_ring(fd, I915_EXEC_RENDER);
+
 		/*
 		 * Use an explicit context to isolate the test from
 		 * any major code changes related to the per-file
@@ -201,10 +203,10 @@ igt_main
 		 * the last context is leaked at every reset.
 		 */
 		for (i = 0; i < 20; i++) {
-                        igt_set_stop_rings(igt_to_stop_ring_flag(I915_EXEC_RENDER));
+			igt_hang_ring_t hang = igt_hang_ring(fd, I915_EXEC_RENDER);
 			igt_assert(exec(fd, handle, I915_EXEC_RENDER, 0) == 0);
 			igt_assert(exec(fd, handle, I915_EXEC_RENDER, ctx_id) == 0);
-			gem_sync(fd, handle);
+			igt_post_hang_ring(fd, hang);
 		}
 
 		gem_context_destroy(fd, ctx_id);
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 5/9] igt/gem_reset_stats: Convert from stop-rings to real hang injection
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
                   ` (2 preceding siblings ...)
  2015-12-12 20:02 ` [PATCH igt 4/9] igt/gem_ctx_exec: Convert from stop-rings to a real GPU hang/reset Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 6/9] igt/kms_pipe_crc_basic: Replace stop_rings with igt_hang_ring Chris Wilson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_reset_stats.c | 488 +++++++++++++++---------------------------------
 1 file changed, 147 insertions(+), 341 deletions(-)

diff --git a/tests/gem_reset_stats.c b/tests/gem_reset_stats.c
index 4cbbb4e..da4aae3 100644
--- a/tests/gem_reset_stats.c
+++ b/tests/gem_reset_stats.c
@@ -47,6 +47,7 @@
 #define RS_BATCH_PENDING (1 << 1)
 #define RS_UNKNOWN       (1 << 2)
 
+
 static uint32_t devid;
 
 struct local_drm_i915_reset_stats {
@@ -71,6 +72,13 @@ static bool gem_has_render(int fd)
 	return true;
 }
 
+static void sync_gpu(void)
+{
+	int fd = drm_open_driver(DRIVER_INTEL);
+	gem_quiescent_gpu(fd);
+	close(fd);
+}
+
 static const struct target_ring {
 	uint32_t exec;
 	bool (*present)(int fd);
@@ -100,29 +108,21 @@ static const struct target_ring *current_ring;
 static int gem_reset_stats(int fd, int ctx_id,
 			   struct local_drm_i915_reset_stats *rs)
 {
-	int ret;
-
+	memset(rs, 0, sizeof(*rs));
 	rs->ctx_id = ctx_id;
-	rs->flags = 0;
-	rs->reset_count = rand();
-	rs->batch_active = rand();
-	rs->batch_pending = rand();
-	rs->pad = 0;
+	rs->reset_count = -1;
 
-	do {
-		ret = ioctl(fd, GET_RESET_STATS_IOCTL, rs);
-	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));
-
-	if (ret < 0)
+	if (drmIoctl(fd, GET_RESET_STATS_IOCTL, rs))
 		return -errno;
 
+	igt_assert(rs->reset_count != -1);
 	return 0;
 }
 
 static int gem_reset_status(int fd, int ctx_id)
 {
-	int ret;
 	struct local_drm_i915_reset_stats rs;
+	int ret;
 
 	ret = gem_reset_stats(fd, ctx_id, &rs);
 	if (ret)
@@ -136,15 +136,9 @@ static int gem_reset_status(int fd, int ctx_id)
 	return RS_NO_ERROR;
 }
 
-static int gem_exec(int fd, struct drm_i915_gem_execbuffer2 *execbuf)
+static int __gem_execbuf(int fd, struct drm_i915_gem_execbuffer2 *execbuf)
 {
-	int ret;
-
-	ret = ioctl(fd,
-		    DRM_IOCTL_I915_GEM_EXECBUFFER2,
-		    execbuf);
-
-	if (ret < 0)
+	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
 		return -errno;
 
 	return 0;
@@ -158,29 +152,19 @@ static int exec_valid_ring(int fd, int ctx, int ring)
 
 	uint32_t buf[2] = { MI_BATCH_BUFFER_END, 0 };
 
+	memset(&exec, 0, sizeof(exec));
 	exec.handle = gem_create(fd, 4096);
+	igt_assert((int)exec.handle > 0);
 	gem_write(fd, exec.handle, 0, buf, sizeof(buf));
-	exec.relocation_count = 0;
-	exec.relocs_ptr = 0;
-	exec.alignment = 0;
-	exec.offset = 0;
-	exec.flags = 0;
-	exec.rsvd1 = 0;
-	exec.rsvd2 = 0;
 
+	memset(&execbuf, 0, sizeof(execbuf));
 	execbuf.buffers_ptr = (uintptr_t)&exec;
 	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
 	execbuf.batch_len = sizeof(buf);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
 	execbuf.flags = ring;
 	i915_execbuffer2_set_context_id(execbuf, ctx);
-	execbuf.rsvd2 = 0;
 
-	ret = gem_exec(fd, &execbuf);
+	ret = __gem_execbuf(fd, &execbuf);
 	if (ret < 0)
 		return ret;
 
@@ -192,135 +176,45 @@ static int exec_valid(int fd, int ctx)
 	return exec_valid_ring(fd, ctx, current_ring->exec);
 }
 
-#define BUFSIZE (4 * 1024)
-#define ITEMS   (BUFSIZE >> 2)
-
-static int inject_hang_ring(int fd, int ctx, int ring, bool ignore_ban_error)
+#define BAN HANG_ALLOW_BAN
+#define ASYNC 2
+static void inject_hang(int fd, int ctx, unsigned flags)
 {
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 exec;
-	uint64_t gtt_off;
-	uint32_t *buf;
-	int roff, i;
-	unsigned cmd_len = 2;
-	enum stop_ring_flags flags;
-
-	srandom(time(NULL));
-
-	if (intel_gen(devid) >= 8)
-		cmd_len = 3;
-
-	buf = malloc(BUFSIZE);
-	igt_assert(buf != NULL);
-
-	buf[0] = MI_BATCH_BUFFER_END;
-	buf[1] = MI_NOOP;
-
-	exec.handle = gem_create(fd, BUFSIZE);
-	gem_write(fd, exec.handle, 0, buf, BUFSIZE);
-	exec.relocation_count = 0;
-	exec.relocs_ptr = 0;
-	exec.alignment = 0;
-	exec.offset = 0;
-	exec.flags = 0;
-	exec.rsvd1 = 0;
-	exec.rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)&exec;
-	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
-	execbuf.batch_len = BUFSIZE;
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
-	execbuf.flags = ring;
-	i915_execbuffer2_set_context_id(execbuf, ctx);
-	execbuf.rsvd2 = 0;
-
-	igt_assert_eq(gem_exec(fd, &execbuf), 0);
-
-	gtt_off = exec.offset;
-
-	for (i = 0; i < ITEMS; i++)
-		buf[i] = MI_NOOP;
-
-	roff = random() % (ITEMS - cmd_len - 1);
-	buf[roff] = MI_BATCH_BUFFER_START | (cmd_len - 2);
-	buf[roff + 1] = (gtt_off & 0xfffffffc) + (roff << 2);
-	if (cmd_len == 3)
-		buf[roff + 2] = (gtt_off & 0xffffffff00000000ull) >> 32;
-
-	buf[roff + cmd_len] = MI_BATCH_BUFFER_END;
-
-	igt_debug("loop injected at 0x%lx (off 0x%x, bo_start 0x%lx, bo_end 0x%lx)\n",
-		  (long unsigned int)((roff << 2) + gtt_off),
-		  roff << 2, (long unsigned int)gtt_off,
-		  (long unsigned int)(gtt_off + BUFSIZE - 1));
-	gem_write(fd, exec.handle, 0, buf, BUFSIZE);
-
-	exec.relocation_count = 0;
-	exec.relocs_ptr = 0;
-	exec.alignment = 0;
-	exec.offset = 0;
-	exec.flags = 0;
-	exec.rsvd1 = 0;
-	exec.rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)&exec;
-	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
-	execbuf.batch_len = BUFSIZE;
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
-	execbuf.flags = ring;
-	i915_execbuffer2_set_context_id(execbuf, ctx);
-	execbuf.rsvd2 = 0;
-
-	igt_assert_eq(gem_exec(fd, &execbuf), 0);
-
-	igt_assert(gtt_off == exec.offset);
-
-	free(buf);
-
-	flags = igt_to_stop_ring_flag(ring);
-
-	flags |= STOP_RING_ALLOW_BAN;
-
-	if (!ignore_ban_error)
-		flags |= STOP_RING_ALLOW_ERRORS;
-
-	igt_set_stop_rings(flags);
-
-	return exec.handle;
-}
+	igt_hang_ring_t hang;
 
-static int inject_hang(int fd, int ctx)
-{
-	return inject_hang_ring(fd, ctx, current_ring->exec, false);
+	hang = igt_hang_ctx(fd, ctx, current_ring->exec, flags & BAN, NULL);
+	if ((flags & ASYNC) == 0)
+		igt_post_hang_ring(fd, hang);
 }
 
-static int inject_hang_no_ban_error(int fd, int ctx)
+static const char *status_to_string(int x)
 {
-	return inject_hang_ring(fd, ctx, current_ring->exec, true);
+	const char *strings[] = {
+		"No error",
+		"Guilty",
+		"Pending",
+	};
+	if (x >= ARRAY_SIZE(strings))
+		return "Unknown";
+	return strings[x];
 }
 
-static int _assert_reset_status(int fd, int ctx, int status)
+static int _assert_reset_status(int idx, int fd, int ctx, int status)
 {
 	int rs;
 
 	rs = gem_reset_status(fd, ctx);
 	if (rs < 0) {
 		igt_info("reset status for %d ctx %d returned %d\n",
-			 fd, ctx, rs);
+			 idx, ctx, rs);
 		return rs;
 	}
 
 	if (rs != status) {
-		igt_info("%d:%d reset status %d differs from assumed %d\n",
-			 fd, ctx, rs, status);
+		igt_info("%d:%d expected '%s' [%d], found '%s' [%d]\n",
+			 idx, ctx,
+			 status_to_string(status), status,
+			 status_to_string(rs), rs);
 
 		return 1;
 	}
@@ -328,53 +222,49 @@ static int _assert_reset_status(int fd, int ctx, int status)
 	return 0;
 }
 
-#define assert_reset_status(fd, ctx, status) \
-	igt_assert(_assert_reset_status(fd, ctx, status) == 0)
+#define assert_reset_status(idx, fd, ctx, status) \
+	igt_assert(_assert_reset_status(idx, fd, ctx, status) == 0)
 
 static void test_rs(int num_fds, int hang_index, int rs_assumed_no_hang)
 {
-	int i;
 	int fd[MAX_FD];
-	int h[MAX_FD];
+	int i;
 
 	igt_assert_lte(num_fds, MAX_FD);
 	igt_assert_lt(hang_index, MAX_FD);
 
+	igt_debug("num fds=%d, hang index=%d\n", num_fds, hang_index);
+
 	for (i = 0; i < num_fds; i++) {
 		fd[i] = drm_open_driver(DRIVER_INTEL);
-		igt_assert(fd[i]);
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 	}
 
-	for (i = 0; i < num_fds; i++)
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
-
+	sync_gpu();
 	for (i = 0; i < num_fds; i++) {
 		if (i == hang_index)
-			h[i] = inject_hang(fd[i], 0);
+			inject_hang(fd[i], 0, ASYNC);
 		else
-			h[i] = exec_valid(fd[i], 0);
+			igt_assert(exec_valid(fd[i], 0) > 0);
 	}
-
-	gem_sync(fd[num_fds - 1], h[num_fds - 1]);
+	sync_gpu();
 
 	for (i = 0; i < num_fds; i++) {
 		if (hang_index < 0) {
-			assert_reset_status(fd[i], 0, rs_assumed_no_hang);
+			assert_reset_status(i, fd[i], 0, rs_assumed_no_hang);
 			continue;
 		}
 
 		if (i < hang_index)
-			assert_reset_status(fd[i], 0, RS_NO_ERROR);
+			assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 		if (i == hang_index)
-			assert_reset_status(fd[i], 0, RS_BATCH_ACTIVE);
+			assert_reset_status(i, fd[i], 0, RS_BATCH_ACTIVE);
 		if (i > hang_index)
-			assert_reset_status(fd[i], 0, RS_BATCH_PENDING);
+			assert_reset_status(i, fd[i], 0, RS_BATCH_PENDING);
 	}
 
-	for (i = 0; i < num_fds; i++) {
-		gem_close(fd[i], h[i]);
+	for (i = 0; i < num_fds; i++)
 		close(fd[i]);
-	}
 }
 
 #define MAX_CTX 100
@@ -383,7 +273,6 @@ static void test_rs_ctx(int num_fds, int num_ctx, int hang_index,
 {
 	int i, j;
 	int fd[MAX_FD];
-	int h[MAX_FD][MAX_CTX];
 	int ctx[MAX_FD][MAX_CTX];
 
 	igt_assert_lte(num_fds, MAX_FD);
@@ -397,73 +286,65 @@ static void test_rs_ctx(int num_fds, int num_ctx, int hang_index,
 	for (i = 0; i < num_fds; i++) {
 		fd[i] = drm_open_driver(DRIVER_INTEL);
 		igt_assert(fd[i]);
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 
 		for (j = 0; j < num_ctx; j++) {
 			ctx[i][j] = gem_context_create(fd[i]);
 
 		}
 
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 	}
 
 	for (i = 0; i < num_fds; i++) {
-
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 
 		for (j = 0; j < num_ctx; j++)
-			assert_reset_status(fd[i], ctx[i][j], RS_NO_ERROR);
+			assert_reset_status(i, fd[i], ctx[i][j], RS_NO_ERROR);
 
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 	}
 
 	for (i = 0; i < num_fds; i++) {
 		for (j = 0; j < num_ctx; j++) {
 			if (i == hang_index && j == hang_context)
-				h[i][j] = inject_hang(fd[i], ctx[i][j]);
+				inject_hang(fd[i], ctx[i][j], ASYNC);
 			else
-				h[i][j] = exec_valid(fd[i], ctx[i][j]);
+				igt_assert(exec_valid(fd[i], ctx[i][j]) > 0);
 		}
 	}
-
-	gem_sync(fd[num_fds - 1], ctx[num_fds - 1][num_ctx - 1]);
+	sync_gpu();
 
 	for (i = 0; i < num_fds; i++)
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 
 	for (i = 0; i < num_fds; i++) {
 		for (j = 0; j < num_ctx; j++) {
 			if (i < hang_index)
-				assert_reset_status(fd[i], ctx[i][j], RS_NO_ERROR);
+				assert_reset_status(i, fd[i], ctx[i][j], RS_NO_ERROR);
 			if (i == hang_index && j < hang_context)
-				assert_reset_status(fd[i], ctx[i][j], RS_NO_ERROR);
+				assert_reset_status(i, fd[i], ctx[i][j], RS_NO_ERROR);
 			if (i == hang_index && j == hang_context)
-				assert_reset_status(fd[i], ctx[i][j],
+				assert_reset_status(i, fd[i], ctx[i][j],
 						    RS_BATCH_ACTIVE);
 			if (i == hang_index && j > hang_context)
-				assert_reset_status(fd[i], ctx[i][j],
+				assert_reset_status(i, fd[i], ctx[i][j],
 						    RS_BATCH_PENDING);
 			if (i > hang_index)
-				assert_reset_status(fd[i], ctx[i][j],
+				assert_reset_status(i, fd[i], ctx[i][j],
 						    RS_BATCH_PENDING);
 		}
 	}
 
 	for (i = 0; i < num_fds; i++) {
-		for (j = 0; j < num_ctx; j++) {
-			gem_close(fd[i], h[i][j]);
-			gem_context_destroy(fd[i], ctx[i][j]);
-		}
-
-		assert_reset_status(fd[i], 0, RS_NO_ERROR);
-
+		assert_reset_status(i, fd[i], 0, RS_NO_ERROR);
 		close(fd[i]);
 	}
 }
 
 static void test_ban(void)
 {
-	int h1,h2,h3,h4,h5,h6,h7;
+	int h1,h4,h5,h6,h7;
 	int fd_bad, fd_good;
 	int retry = 10;
 	int active_count = 0, pending_count = 0;
@@ -473,72 +354,63 @@ static void test_ban(void)
 
 	fd_good = drm_open_driver(DRIVER_INTEL);
 
-	assert_reset_status(fd_bad, 0, RS_NO_ERROR);
-	assert_reset_status(fd_good, 0, RS_NO_ERROR);
+	assert_reset_status(fd_bad, fd_bad, 0, RS_NO_ERROR);
+	assert_reset_status(fd_good, fd_good, 0, RS_NO_ERROR);
 
 	h1 = exec_valid(fd_bad, 0);
 	igt_assert_lte(0, h1);
 	h5 = exec_valid(fd_good, 0);
 	igt_assert_lte(0, h5);
 
-	assert_reset_status(fd_bad, 0, RS_NO_ERROR);
-	assert_reset_status(fd_good, 0, RS_NO_ERROR);
+	assert_reset_status(fd_bad, fd_bad, 0, RS_NO_ERROR);
+	assert_reset_status(fd_good, fd_good, 0, RS_NO_ERROR);
 
-	h2 = inject_hang_no_ban_error(fd_bad, 0);
-	igt_assert_lte(0, h2);
+	inject_hang(fd_bad, 0, BAN | ASYNC);
 	active_count++;
-	/* Second hang will be pending for this */
+	/* The second hang will count as pending and be discarded */
 	pending_count++;
+	active_count--;
 
 	h6 = exec_valid(fd_good, 0);
 	h7 = exec_valid(fd_good, 0);
 
-        while (retry--) {
-                h3 = inject_hang_no_ban_error(fd_bad, 0);
-                igt_assert_lte(0, h3);
-                gem_sync(fd_bad, h3);
+	while (retry--) {
+		inject_hang(fd_bad, 0, BAN);
 		active_count++;
-		/* This second hand will count as pending */
-                assert_reset_status(fd_bad, 0, RS_BATCH_ACTIVE);
+		assert_reset_status(fd_bad, fd_bad, 0, RS_BATCH_ACTIVE);
 
-                h4 = exec_valid(fd_bad, 0);
-                if (h4 == -EIO) {
-                        gem_close(fd_bad, h3);
-                        break;
-                }
+		h4 = exec_valid(fd_bad, 0);
+		if (h4 == -EIO)
+			break;
 
-                /* Should not happen often but sometimes hang is declared too slow
-                 * due to our way of faking hang using loop */
+		/* Should not happen often but sometimes hang is declared too slow
+		 * due to our way of faking hang using loop */
 
-                igt_assert_lte(0, h4);
-                gem_close(fd_bad, h3);
-                gem_close(fd_bad, h4);
+		igt_assert_lte(0, h4);
+		gem_close(fd_bad, h4);
 
-                igt_info("retrying for ban (%d)\n", retry);
-        }
+		igt_info("retrying for ban (%d)\n", retry);
+	}
 
 	igt_assert_eq(h4, -EIO);
-	assert_reset_status(fd_bad, 0, RS_BATCH_ACTIVE);
+	assert_reset_status(fd_bad, fd_bad, 0, RS_BATCH_ACTIVE);
 
 	gem_sync(fd_good, h7);
-	assert_reset_status(fd_good, 0, RS_BATCH_PENDING);
+	assert_reset_status(fd_good, fd_good, 0, RS_BATCH_PENDING);
 
 	igt_assert_eq(gem_reset_stats(fd_good, 0, &rs_good), 0);
 	igt_assert_eq(gem_reset_stats(fd_bad, 0, &rs_bad), 0);
 
-	igt_assert(rs_bad.batch_active == active_count);
-	igt_assert(rs_bad.batch_pending == pending_count);
-	igt_assert(rs_good.batch_active == 0);
-	igt_assert(rs_good.batch_pending == 2);
+	igt_assert_eq(rs_bad.batch_active, active_count);
+	igt_assert_eq(rs_bad.batch_pending, pending_count);
+	igt_assert_eq(rs_good.batch_active, 0);
+	igt_assert_eq(rs_good.batch_pending, 2);
 
 	gem_close(fd_bad, h1);
-	gem_close(fd_bad, h2);
 	gem_close(fd_good, h6);
 	gem_close(fd_good, h7);
 
-	h1 = exec_valid(fd_good, 0);
-	igt_assert_lte(0, h1);
-	gem_close(fd_good, h1);
+	gem_close(fd_good, exec_valid(fd_good, 0));
 
 	close(fd_bad);
 	close(fd_good);
@@ -549,7 +421,7 @@ static void test_ban(void)
 
 static void test_ban_ctx(void)
 {
-	int h1,h2,h3,h4,h5,h6,h7;
+	int h1,h4,h5,h6,h7;
 	int ctx_good, ctx_bad;
 	int fd;
 	int retry = 10;
@@ -558,72 +430,65 @@ static void test_ban_ctx(void)
 
 	fd = drm_open_driver(DRIVER_INTEL);
 
-	assert_reset_status(fd, 0, RS_NO_ERROR);
+	assert_reset_status(fd, fd, 0, RS_NO_ERROR);
 
 	ctx_good = gem_context_create(fd);
 	ctx_bad = gem_context_create(fd);
 
-	assert_reset_status(fd, 0, RS_NO_ERROR);
-	assert_reset_status(fd, ctx_good, RS_NO_ERROR);
-	assert_reset_status(fd, ctx_bad, RS_NO_ERROR);
+	assert_reset_status(fd, fd, 0, RS_NO_ERROR);
+	assert_reset_status(fd, fd, ctx_good, RS_NO_ERROR);
+	assert_reset_status(fd, fd, ctx_bad, RS_NO_ERROR);
 
 	h1 = exec_valid(fd, ctx_bad);
 	igt_assert_lte(0, h1);
 	h5 = exec_valid(fd, ctx_good);
 	igt_assert_lte(0, h5);
 
-	assert_reset_status(fd, ctx_good, RS_NO_ERROR);
-	assert_reset_status(fd, ctx_bad, RS_NO_ERROR);
+	assert_reset_status(fd, fd, ctx_good, RS_NO_ERROR);
+	assert_reset_status(fd, fd, ctx_bad, RS_NO_ERROR);
 
-	h2 = inject_hang_no_ban_error(fd, ctx_bad);
-	igt_assert_lte(0, h2);
+	inject_hang(fd, ctx_bad, BAN | ASYNC);
 	active_count++;
-	/* Second hang will be pending for this */
+	/* This second hang will count as pending and be reset */
 	pending_count++;
+	active_count--;
 
 	h6 = exec_valid(fd, ctx_good);
 	h7 = exec_valid(fd, ctx_good);
 
         while (retry--) {
-                h3 = inject_hang_no_ban_error(fd, ctx_bad);
-                igt_assert_lte(0, h3);
-                gem_sync(fd, h3);
+                inject_hang(fd, ctx_bad, BAN);
 		active_count++;
-		/* This second hand will count as pending */
-                assert_reset_status(fd, ctx_bad, RS_BATCH_ACTIVE);
+                assert_reset_status(fd, fd, ctx_bad, RS_BATCH_ACTIVE);
 
                 h4 = exec_valid(fd, ctx_bad);
-                if (h4 == -EIO) {
-                        gem_close(fd, h3);
+                if (h4 == -EIO)
                         break;
-                }
 
                 /* Should not happen often but sometimes hang is declared too slow
                  * due to our way of faking hang using loop */
 
                 igt_assert_lte(0, h4);
-                gem_close(fd, h3);
                 gem_close(fd, h4);
 
                 igt_info("retrying for ban (%d)\n", retry);
         }
 
 	igt_assert_eq(h4, -EIO);
-	assert_reset_status(fd, ctx_bad, RS_BATCH_ACTIVE);
+	assert_reset_status(fd, fd, ctx_bad, RS_BATCH_ACTIVE);
 
 	gem_sync(fd, h7);
-	assert_reset_status(fd, ctx_good, RS_BATCH_PENDING);
+	assert_reset_status(fd, fd, ctx_good, RS_BATCH_PENDING);
 
 	igt_assert_eq(gem_reset_stats(fd, ctx_good, &rs_good), 0);
 	igt_assert_eq(gem_reset_stats(fd, ctx_bad, &rs_bad), 0);
 
-	igt_assert(rs_bad.batch_active == active_count);
-	igt_assert(rs_bad.batch_pending == pending_count);
-	igt_assert(rs_good.batch_active == 0);
-	igt_assert(rs_good.batch_pending == 2);
+	igt_assert_eq(rs_bad.batch_active, active_count);
+	igt_assert_eq(rs_bad.batch_pending, pending_count);
+	igt_assert_eq(rs_good.batch_active, 0);
+	igt_assert_eq(rs_good.batch_pending, 2);
 
 	gem_close(fd, h1);
-	gem_close(fd, h2);
 	gem_close(fd, h6);
 	gem_close(fd, h7);
 
@@ -643,36 +508,26 @@ static void test_ban_ctx(void)
 
 static void test_unrelated_ctx(void)
 {
-	int h1,h2;
 	int fd1,fd2;
 	int ctx_guilty, ctx_unrelated;
 
 	fd1 = drm_open_driver(DRIVER_INTEL);
 	fd2 = drm_open_driver(DRIVER_INTEL);
-	assert_reset_status(fd1, 0, RS_NO_ERROR);
-	assert_reset_status(fd2, 0, RS_NO_ERROR);
+	assert_reset_status(0, fd1, 0, RS_NO_ERROR);
+	assert_reset_status(1, fd2, 0, RS_NO_ERROR);
 	ctx_guilty = gem_context_create(fd1);
 	ctx_unrelated = gem_context_create(fd2);
 
-	assert_reset_status(fd1, ctx_guilty, RS_NO_ERROR);
-	assert_reset_status(fd2, ctx_unrelated, RS_NO_ERROR);
-
-	h1 = inject_hang(fd1, ctx_guilty);
-	igt_assert_lte(0, h1);
-	gem_sync(fd1, h1);
-	assert_reset_status(fd1, ctx_guilty, RS_BATCH_ACTIVE);
-	assert_reset_status(fd2, ctx_unrelated, RS_NO_ERROR);
+	assert_reset_status(0, fd1, ctx_guilty, RS_NO_ERROR);
+	assert_reset_status(1, fd2, ctx_unrelated, RS_NO_ERROR);
 
-	h2 = exec_valid(fd2, ctx_unrelated);
-	igt_assert_lte(0, h2);
-	gem_sync(fd2, h2);
-	assert_reset_status(fd1, ctx_guilty, RS_BATCH_ACTIVE);
-	assert_reset_status(fd2, ctx_unrelated, RS_NO_ERROR);
-	gem_close(fd1, h1);
-	gem_close(fd2, h2);
+	inject_hang(fd1, ctx_guilty, 0);
+	assert_reset_status(0, fd1, ctx_guilty, RS_BATCH_ACTIVE);
+	assert_reset_status(1, fd2, ctx_unrelated, RS_NO_ERROR);
 
-	gem_context_destroy(fd1, ctx_guilty);
-	gem_context_destroy(fd2, ctx_unrelated);
+	gem_sync(fd2, exec_valid(fd2, ctx_unrelated));
+	assert_reset_status(0, fd1, ctx_guilty, RS_BATCH_ACTIVE);
+	assert_reset_status(1, fd2, ctx_unrelated, RS_NO_ERROR);
 
 	close(fd1);
 	close(fd2);
@@ -692,35 +547,25 @@ static int get_reset_count(int fd, int ctx)
 
 static void test_close_pending_ctx(void)
 {
-	int fd, h;
-	uint32_t ctx;
-
-	fd = drm_open_driver(DRIVER_INTEL);
-	ctx = gem_context_create(fd);
+	int fd = drm_open_driver(DRIVER_INTEL);
+	uint32_t ctx = gem_context_create(fd);
 
-	assert_reset_status(fd, ctx, RS_NO_ERROR);
+	assert_reset_status(fd, fd, ctx, RS_NO_ERROR);
 
-	h = inject_hang(fd, ctx);
-	igt_assert_lte(0, h);
+	inject_hang(fd, ctx, 0);
 	gem_context_destroy(fd, ctx);
-	igt_assert(__gem_context_destroy(fd, ctx) == -ENOENT);
+	igt_assert_eq(__gem_context_destroy(fd, ctx), -ENOENT);
 
-	gem_close(fd, h);
 	close(fd);
 }
 
 static void test_close_pending(void)
 {
-	int fd, h;
-
-	fd = drm_open_driver(DRIVER_INTEL);
-
-	assert_reset_status(fd, 0, RS_NO_ERROR);
+	int fd = drm_open_driver(DRIVER_INTEL);
 
-	h = inject_hang(fd, 0);
-	igt_assert_lte(0, h);
+	assert_reset_status(fd, fd, 0, RS_NO_ERROR);
 
-	gem_close(fd, h);
+	inject_hang(fd, 0, 0);
 	close(fd);
 }
 
@@ -772,16 +617,12 @@ static void exec_noop_on_each_ring(int fd, const bool reverse)
 
 static void test_close_pending_fork(const bool reverse)
 {
-	int pid;
-	int fd, h;
-
-	fd = drm_open_driver(DRIVER_INTEL);
-
-	assert_reset_status(fd, 0, RS_NO_ERROR);
+	int fd = drm_open_driver(DRIVER_INTEL);
+	int pid, h;
 
-	h = inject_hang(fd, 0);
-	igt_assert_lte(0, h);
+	assert_reset_status(fd, fd, 0, RS_NO_ERROR);
 
+	inject_hang(fd, 0, 0);
 	sleep(1);
 
 	/* Avoid helpers as we need to kill the child
@@ -810,7 +651,6 @@ static void test_close_pending_fork(const bool reverse)
 		kill(pid, SIGKILL);
 	}
 
-	gem_close(fd, h);
 	close(fd);
 
 	/* Then we just wait on hang to happen */
@@ -826,25 +666,23 @@ static void test_close_pending_fork(const bool reverse)
 
 static void test_reset_count(const bool create_ctx)
 {
-	int fd, h, ctx;
+	int fd = drm_open_driver(DRIVER_INTEL);
+       	int ctx;
 	long c1, c2;
 
-	fd = drm_open_driver(DRIVER_INTEL);
 	if (create_ctx)
 		ctx = gem_context_create(fd);
 	else
 		ctx = 0;
 
-	assert_reset_status(fd, ctx, RS_NO_ERROR);
+	assert_reset_status(fd, fd, ctx, RS_NO_ERROR);
 
 	c1 = get_reset_count(fd, ctx);
 	igt_assert(c1 >= 0);
 
-	h = inject_hang(fd, ctx);
-	igt_assert_lte(0, h);
-	gem_sync(fd, h);
+	inject_hang(fd, ctx, 0);
 
-	assert_reset_status(fd, ctx, RS_BATCH_ACTIVE);
+	assert_reset_status(fd, fd, ctx, RS_BATCH_ACTIVE);
 	c2 = get_reset_count(fd, ctx);
 	igt_assert(c2 >= 0);
 	igt_assert(c2 == (c1 + 1));
@@ -862,8 +700,6 @@ static void test_reset_count(const bool create_ctx)
 
 	igt_waitchildren();
 
-	gem_close(fd, h);
-
 	if (create_ctx)
 		gem_context_destroy(fd, ctx);
 
@@ -979,9 +815,9 @@ static void defer_hangcheck(int ring_num)
 	count_start = get_reset_count(fd, 0);
 	igt_assert_lte(0, count_start);
 
-	igt_assert(inject_hang_ring(fd, 0, current_ring->exec, true));
+	inject_hang(fd, 0, 0);
 	while (--seconds) {
-		igt_assert(exec_valid_ring(fd, 0, next_ring->exec));
+		exec_valid_ring(fd, 0, next_ring->exec);
 
 		count_end = get_reset_count(fd, 0);
 		igt_assert_lte(0, count_end);
@@ -1019,37 +855,7 @@ static bool gem_has_reset_stats(int fd)
 	return false;
 }
 
-static void check_gpu_ok(void)
-{
-	int retry_count = 30;
-	enum stop_ring_flags flags;
-	int fd;
-
-	igt_debug("checking gpu state\n");
-
-	while (retry_count--) {
-		flags = igt_get_stop_rings() & STOP_RING_ALL;
-		if (flags == 0)
-			break;
-
-		igt_debug("waiting previous hang to clear\n");
-		sleep(1);
-	}
-
-	igt_assert(flags == 0);
-
-	/*
-	 * Clear the _ALLOW_ERRORS and _ALLOW_BAN flags;
-	 * these are not cleared by individual ring reset.
-	 */
-	igt_set_stop_rings(0);
-
-	fd = drm_open_driver(DRIVER_INTEL);
-	gem_quiescent_gpu(fd);
-	close(fd);
-}
-
-#define RUN_TEST(...) do { check_gpu_ok(); __VA_ARGS__; check_gpu_ok(); } while (0)
+#define RUN_TEST(...) do { sync_gpu(); __VA_ARGS__; sync_gpu(); } while (0)
 #define RUN_CTX_TEST(...) do { check_context(current_ring); RUN_TEST(__VA_ARGS__); } while (0)
 
 igt_main
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 6/9] igt/kms_pipe_crc_basic: Replace stop_rings with igt_hang_ring
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
                   ` (3 preceding siblings ...)
  2015-12-12 20:02 ` [PATCH igt 5/9] igt/gem_reset_stats: Convert from stop-rings to real hang injection Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 7/9] igt/kms_flip: Convert over to real hang injection Chris Wilson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

We can inject a real GPU hang for greater effect!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/kms_pipe_crc_basic.c | 55 ++++------------------------------------------
 1 file changed, 4 insertions(+), 51 deletions(-)

diff --git a/tests/kms_pipe_crc_basic.c b/tests/kms_pipe_crc_basic.c
index a3292c2..f33ca41 100644
--- a/tests/kms_pipe_crc_basic.c
+++ b/tests/kms_pipe_crc_basic.c
@@ -43,55 +43,6 @@ static struct {
 	{ .r = 0.0, .g = 1.0, .b = 1.0 },
 };
 
-static uint64_t submit_batch(int fd, unsigned ring_id)
-{
-	const uint32_t batch[] = { MI_NOOP,
-				   MI_BATCH_BUFFER_END };
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 exec;
-	uint64_t presumed_offset;
-
-	gem_require_ring(fd, ring_id);
-
-	exec.handle = gem_create(fd, 4096);
-	gem_write(fd, exec.handle, 0, batch, sizeof(batch));
-	exec.relocation_count = 0;
-	exec.relocs_ptr = 0;
-	exec.alignment = 0;
-	exec.offset = 0;
-	exec.flags = 0;
-	exec.rsvd1 = 0;
-	exec.rsvd2 = 0;
-
-	execbuf.buffers_ptr = (uintptr_t)&exec;
-	execbuf.buffer_count = 1;
-	execbuf.batch_start_offset = 0;
-	execbuf.batch_len = sizeof(batch);
-	execbuf.cliprects_ptr = 0;
-	execbuf.num_cliprects = 0;
-	execbuf.DR1 = 0;
-	execbuf.DR4 = 0;
-	execbuf.flags = ring_id;
-	i915_execbuffer2_set_context_id(execbuf, 0);
-	execbuf.rsvd2 = 0;
-
-	gem_execbuf(fd, &execbuf);
-	gem_sync(fd, exec.handle);
-	presumed_offset = exec.offset;
-
-	igt_set_stop_rings(igt_to_stop_ring_flag(ring_id));
-
-	gem_execbuf(fd, &execbuf);
-	gem_sync(fd, exec.handle);
-
-	igt_assert(igt_get_stop_rings() == STOP_RING_NONE);
-	igt_assert(presumed_offset == exec.offset);
-
-	gem_close(fd, exec.handle);
-
-	return exec.offset;
-}
-
 static void test_bad_command(data_t *data, const char *cmd)
 {
 	FILE *ctl;
@@ -253,8 +204,10 @@ igt_main
 		}
 
 		igt_subtest_f("hang-read-crc-pipe-%c", 'A'+i) {
-			submit_batch(data.drm_fd, I915_EXEC_RENDER);
-
+			igt_hang_ring_t hang =
+				igt_hang_ring(data.drm_fd, I915_EXEC_RENDER);
+			test_read_crc(&data, i, 0);
+			igt_post_hang_ring(data.drm_fd, hang);
 			test_read_crc(&data, i, 0);
 		}
 	}
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 7/9] igt/kms_flip: Convert over to real hang injection
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
                   ` (4 preceding siblings ...)
  2015-12-12 20:02 ` [PATCH igt 6/9] igt/kms_pipe_crc_basic: Replace stop_rings with igt_hang_ring Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 8/9] igt/gem_workarounds: Convert to real GPU " Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 9/9] igt/pm_rps: Trigger a real GPU reset Chris Wilson
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/kms_flip.c | 89 ++++++++++----------------------------------------------
 1 file changed, 15 insertions(+), 74 deletions(-)

diff --git a/tests/kms_flip.c b/tests/kms_flip.c
index a3acc3d..356462e 100644
--- a/tests/kms_flip.c
+++ b/tests/kms_flip.c
@@ -776,72 +776,17 @@ static void set_y_tiling(struct test_output *o, int fb_idx)
 	drmFree(r);
 }
 
-static void stop_rings(bool stop)
+static igt_hang_ring_t hang_gpu(int fd)
 {
-	if (stop)
-		igt_set_stop_rings(STOP_RING_DEFAULTS);
-	else
-		igt_set_stop_rings(STOP_RING_NONE);
-}
-
-static void eat_error_state(void)
-{
-	static const char dfs_entry_error[] = "i915_error_state";
-	static const char data[] = "";
-	int fd;
-
-	fd = igt_debugfs_open(dfs_entry_error, O_WRONLY);
-	igt_assert_lte(0, fd);
-
-	igt_assert(write(fd, data, sizeof(data)) == sizeof(data));
-	close(fd);
-
-	/* and check whether stop_rings is not reset, i.e. the hang has indeed
-	 * happened */
-	igt_assert_f(igt_get_stop_rings() == STOP_RING_NONE,
-		     "no gpu hang detected, stop_rings is still 0x%x\n",
-		     igt_get_stop_rings());
-
-	close(fd);
-}
-
-static void unhang_gpu(int fd, uint32_t handle)
-{
-	gem_sync(drm_fd, handle);
-	gem_close(drm_fd, handle);
-	eat_error_state();
-	stop_rings(false);
+	return igt_hang_ring(fd, I915_EXEC_DEFAULT);
 }
 
-static uint32_t hang_gpu(int fd)
+static void unhang_gpu(int fd, igt_hang_ring_t hang)
 {
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 gem_exec;
-	uint32_t b[2] = {MI_BATCH_BUFFER_END};
-
-	stop_rings(true);
-
-	memset(&gem_exec, 0, sizeof(gem_exec));
-	gem_exec.handle = gem_create(fd, 4096);
-	gem_write(fd, gem_exec.handle, 0, b, sizeof(b));
-
-	memset(&execbuf, 0, sizeof(execbuf));
-	execbuf.buffers_ptr = (uintptr_t)&gem_exec;
-	execbuf.buffer_count = 1;
-	execbuf.batch_len = sizeof(b);
-
-	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf)) {
-		igt_assert_f(errno == EIO,
-			     "failed to exercise page flip hang recovery\n");
-
-		unhang_gpu(fd, gem_exec.handle);
-		gem_exec.handle = 0;
-	}
-
-	return gem_exec.handle;
+	igt_post_hang_ring(fd, hang);
 }
 
-static bool is_hung(int fd)
+static bool is_wedged(int fd)
 {
 	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_THROTTLE, 0) == 0)
 		return false;
@@ -883,7 +828,7 @@ static unsigned int run_test_step(struct test_output *o)
 	bool do_vblank;
 	struct vblank_reply vbl_reply;
 	unsigned int target_seq;
-	uint32_t hang = 0;	/* Suppress GCC warning */
+	igt_hang_ring_t hang;
 
 	target_seq = o->vblank_state.seq_step;
 	/* Absolute waits only works once we have a frame counter. */
@@ -989,10 +934,9 @@ static unsigned int run_test_step(struct test_output *o)
 
 	igt_print_activity();
 
-	if (do_flip && (o->flags & TEST_HANG)) {
+	memset(&hang, 0, sizeof(hang));
+	if (do_flip && (o->flags & TEST_HANG))
 		hang = hang_gpu(drm_fd);
-		igt_assert_f(hang, "failed to exercise page flip hang recovery\n");
-	}
 
 	/* try to make sure we can issue two flips during the same frame */
 	if (do_flip && (o->flags & TEST_EBUSY)) {
@@ -1064,8 +1008,7 @@ static unsigned int run_test_step(struct test_output *o)
 	if (do_flip && (o->flags & TEST_EINVAL) && !(o->flags & TEST_FB_BAD_TILING))
 		igt_assert(do_page_flip(o, new_fb_id, true) == expected_einval);
 
-	if (hang)
-		unhang_gpu(drm_fd, hang);
+	unhang_gpu(drm_fd, hang);
 
 	return completed_events;
 }
@@ -1295,13 +1238,12 @@ static unsigned int wait_for_events(struct test_output *o)
 static unsigned event_loop(struct test_output *o, unsigned duration_ms)
 {
 	unsigned long start, end;
-	uint32_t hang = 0;	/* Suppress GCC warning */
+	igt_hang_ring_t hang;
 	int count = 0;
 
-	if (o->flags & TEST_HANG_ONCE) {
+	memset(&hang, 0, sizeof(hang));
+	if (o->flags & TEST_HANG_ONCE)
 		hang = hang_gpu(drm_fd);
-		igt_assert_f(hang, "failed to exercise page flip hang recovery\n");
-	}
 
 	start = gettime_us();
 
@@ -1322,8 +1264,7 @@ static unsigned event_loop(struct test_output *o, unsigned duration_ms)
 
 	end = gettime_us();
 
-	if (hang)
-		unhang_gpu(drm_fd, hang);
+	unhang_gpu(drm_fd, hang);
 
 	/* Flush any remaining events */
 	if (o->pending_events)
@@ -1488,7 +1429,7 @@ static int run_test(int duration, int flags)
 	struct test_output o;
 	int i, n, modes = 0;
 
-	igt_require((flags & TEST_HANG) == 0 || !is_hung(drm_fd));
+	igt_require((flags & TEST_HANG) == 0 || !is_wedged(drm_fd));
 
 	if (flags & TEST_RPM)
 		igt_require(igt_setup_runtime_pm());
@@ -1548,7 +1489,7 @@ static int run_pair(int duration, int flags)
 	struct test_output o;
 	int i, j, m, n, modes = 0;
 
-	igt_require((flags & TEST_HANG) == 0 || !is_hung(drm_fd));
+	igt_require((flags & TEST_HANG) == 0 || !is_wedged(drm_fd));
 
 	resources = drmModeGetResources(drm_fd);
 	igt_assert(resources);
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 8/9] igt/gem_workarounds: Convert to real GPU hang injection
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
                   ` (5 preceding siblings ...)
  2015-12-12 20:02 ` [PATCH igt 7/9] igt/kms_flip: Convert over to real hang injection Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  2015-12-12 20:02 ` [PATCH igt 9/9] igt/pm_rps: Trigger a real GPU reset Chris Wilson
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_workarounds.c | 43 ++++++-------------------------------------
 1 file changed, 6 insertions(+), 37 deletions(-)

diff --git a/tests/gem_workarounds.c b/tests/gem_workarounds.c
index 87c798b..90d4005 100644
--- a/tests/gem_workarounds.c
+++ b/tests/gem_workarounds.c
@@ -64,47 +64,16 @@ static struct intel_wa_reg *wa_regs;
 
 static void wait_gpu(void)
 {
-	struct drm_i915_gem_execbuffer2 execbuf;
-	struct drm_i915_gem_exec_object2 gem_exec;
-	uint32_t b[2] = {MI_BATCH_BUFFER_END};
-
-	memset(&gem_exec, 0, sizeof(gem_exec));
-	gem_exec.handle = gem_create(drm_fd, 4096);
-	gem_write(drm_fd, gem_exec.handle, 0, b, sizeof(b));
-
-	memset(&execbuf, 0, sizeof(execbuf));
-	execbuf.buffers_ptr = (uintptr_t)&gem_exec;
-	execbuf.buffer_count = 1;
-	execbuf.batch_len = sizeof(b);
-
-	drmIoctl(drm_fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);
-
-	gem_sync(drm_fd, gem_exec.handle);
-
-	gem_close(drm_fd, gem_exec.handle);
+	int fd = drm_open_driver(DRIVER_INTEL);
+	gem_quiescent_gpu(fd);
+	close(fd);
 }
 
 static void test_hang_gpu(void)
 {
-	int retry_count = 30;
-	enum stop_ring_flags flags;
-
-	igt_assert(retry_count);
-	igt_set_stop_rings(STOP_RING_DEFAULTS);
-
-	wait_gpu();
-
-	while(retry_count--) {
-		flags = igt_get_stop_rings();
-		if (flags == 0)
-			break;
-		igt_info("gpu hang not yet cleared, retries left %d\n", retry_count);
-		sleep(1);
-	}
-
-	flags = igt_get_stop_rings();
-	if (flags)
-		igt_set_stop_rings(STOP_RING_NONE);
+	int fd = drm_open_driver(DRIVER_INTEL);
+	igt_post_hang_ring(fd, igt_hang_ring(fd, I915_EXEC_DEFAULT));
+	close(fd);
 }
 
 static void test_suspend_resume(void)
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH igt 9/9] igt/pm_rps: Trigger a real GPU reset
  2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
                   ` (6 preceding siblings ...)
  2015-12-12 20:02 ` [PATCH igt 8/9] igt/gem_workarounds: Convert to real GPU " Chris Wilson
@ 2015-12-12 20:02 ` Chris Wilson
  7 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/pm_rps.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tests/pm_rps.c b/tests/pm_rps.c
index 9f752f8..db23492 100644
--- a/tests/pm_rps.c
+++ b/tests/pm_rps.c
@@ -525,6 +525,13 @@ static void stabilize_check(int *freqs)
 	igt_debug("Waited %d msec to stabilize cur\n", wait);
 }
 
+static void reset_gpu(void)
+{
+	int fd = drm_open_driver(DRIVER_INTEL);
+	igt_post_hang_ring(fd, igt_hang_ring(fd, I915_EXEC_DEFAULT));
+	close(fd);
+}
+
 /*
  * reset - test that turbo works across a ring stop
  *
@@ -560,11 +567,8 @@ static void reset(void)
 	load_helper_run(LOW);
 	stabilize_check(pre_freqs);
 
-	igt_debug("Stop rings...\n");
-	igt_set_stop_rings(STOP_RING_DEFAULTS);
-	while (igt_get_stop_rings())
-		usleep(1000 * 100);
-	igt_debug("Ring stop cleared\n");
+	igt_debug("Reset gpu...\n");
+	reset_gpu();
 
 	igt_debug("Apply high load...\n");
 	load_helper_set_load(HIGH);
-- 
2.6.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH igt 3/9] igt/drv_hangman: Inject a true hang
  2015-12-12 20:02 ` [PATCH igt 3/9] igt/drv_hangman: Inject a true hang Chris Wilson
@ 2015-12-16  9:02   ` Daniel Vetter
  2015-12-16 10:37     ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2015-12-16  9:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Dec 12, 2015 at 08:02:49PM +0000, Chris Wilson wrote:
> Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
> instead.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Doesn't this kill pre-gen6? Or at least anything where we don't have
proper hang recovery ... Lack of that is why I've done the original
stop_rings fun.
-Daniel

> ---
>  tests/drv_hangman.c | 66 ++++++++++++-----------------------------------------
>  1 file changed, 14 insertions(+), 52 deletions(-)
> 
> diff --git a/tests/drv_hangman.c b/tests/drv_hangman.c
> index cd63b97..866d9dc 100644
> --- a/tests/drv_hangman.c
> +++ b/tests/drv_hangman.c
> @@ -147,59 +147,21 @@ static void assert_error_state_collected(void)
>  	assert_dfs_entry_not("i915_error_state", "no error state collected");
>  }
>  
> -#define MAGIC_NUMBER 0x10001
> -const uint32_t batch[] = { MI_NOOP,
> -			   MI_BATCH_BUFFER_END,
> -			   MAGIC_NUMBER,
> -			   MAGIC_NUMBER };
> +const uint32_t *batch;
>  
> -static uint64_t submit_batch(int fd, unsigned ring_id, bool stop_ring)
> +static uint64_t submit_hang(int fd, unsigned ring_id)
>  {
> -	struct drm_i915_gem_execbuffer2 execbuf;
> -	struct drm_i915_gem_exec_object2 exec;
> -	uint64_t presumed_offset;
> -
> -	gem_require_ring(fd, ring_id);
> -
> -	exec.handle = gem_create(fd, 4096);
> -	gem_write(fd, exec.handle, 0, batch, sizeof(batch));
> -	exec.relocation_count = 0;
> -	exec.relocs_ptr = 0;
> -	exec.alignment = 0;
> -	exec.offset = 0;
> -	exec.flags = 0;
> -	exec.rsvd1 = 0;
> -	exec.rsvd2 = 0;
> -
> -	execbuf.buffers_ptr = (uintptr_t)&exec;
> -	execbuf.buffer_count = 1;
> -	execbuf.batch_start_offset = 0;
> -	execbuf.batch_len = sizeof(batch);
> -	execbuf.cliprects_ptr = 0;
> -	execbuf.num_cliprects = 0;
> -	execbuf.DR1 = 0;
> -	execbuf.DR4 = 0;
> -	execbuf.flags = ring_id;
> -	i915_execbuffer2_set_context_id(execbuf, 0);
> -	execbuf.rsvd2 = 0;
> -
> -	gem_execbuf(fd, &execbuf);
> -	gem_sync(fd, exec.handle);
> -	presumed_offset = exec.offset;
> -
> -	if (stop_ring) {
> -		igt_set_stop_rings(igt_to_stop_ring_flag(ring_id));
> -
> -		gem_execbuf(fd, &execbuf);
> -		gem_sync(fd, exec.handle);
> -
> -		igt_assert(igt_get_stop_rings() == STOP_RING_NONE);
> -		igt_assert(presumed_offset == exec.offset);
> -	}
> +	uint64_t offset;
> +	igt_hang_ring_t hang;
> +
> +	hang = igt_hang_ctx(fd, 0, ring_id, HANG_ALLOW_CAPTURE, &offset);
> +
> +	batch = gem_mmap__cpu(fd, hang.handle, 0, 4096, PROT_READ);
> +	gem_set_domain(fd, hang.handle, I915_GEM_DOMAIN_CPU, 0);
>  
> -	gem_close(fd, exec.handle);
> +	igt_post_hang_ring(fd, hang);
>  
> -	return exec.offset;
> +	return offset;
>  }
>  
>  static void clear_error_state(void)
> @@ -222,7 +184,7 @@ static void test_error_state_basic(void)
>  	clear_error_state();
>  	assert_error_state_clear();
>  
> -	submit_batch(fd, I915_EXEC_RENDER, true);
> +	submit_hang(fd, I915_EXEC_RENDER);
>  	close(fd);
>  
>  	assert_error_state_collected();
> @@ -276,7 +238,7 @@ static void check_error_state(const int gen,
>  			if (!uses_cmd_parser)
>  				igt_assert(gtt_offset == expected_offset);
>  
> -			for (i = 0; i < sizeof(batch) / 4; i++) {
> +			for (i = 0; i < 1024; i++) {
>  				igt_assert(getline(&line, &line_size, file) > 0);
>  				snprintf(expected_line, sizeof(expected_line), "%08x :  %08x",
>  					 4*i, batch[i]);
> @@ -381,7 +343,7 @@ static void test_error_state_capture(unsigned ring_id,
>  	gen = intel_gen(intel_get_drm_devid(fd));
>  	cmd_parser = uses_cmd_parser(fd, gen);
>  
> -	offset = submit_batch(fd, ring_id, true);
> +	offset = submit_hang(fd, ring_id);
>  	close(fd);
>  
>  	check_error_state(gen, cmd_parser, ring_name, offset);
> -- 
> 2.6.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH igt 3/9] igt/drv_hangman: Inject a true hang
  2015-12-16  9:02   ` Daniel Vetter
@ 2015-12-16 10:37     ` Chris Wilson
  2015-12-16 13:12       ` Ville Syrjälä
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-12-16 10:37 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Dec 16, 2015 at 10:02:27AM +0100, Daniel Vetter wrote:
> On Sat, Dec 12, 2015 at 08:02:49PM +0000, Chris Wilson wrote:
> > Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
> > instead.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Doesn't this kill pre-gen6? Or at least anything where we don't have
> proper hang recovery ... Lack of that is why I've done the original
> stop_rings fun.

Originally, igt_hang_ring required gen >= 5, but since that Ville has
been working hard on getting reset support working for gen3 and gen4,
now we query the kernel as to whether it can reset the device. So by
switching over we lose testing of simulated hangs and recovery code (or 
KMS handling during wedged) for gen2.

It is a loss in test coverage, but the benefit is that we can remove the
hang injection code from the kernel. And that is a tradeoff I am willing
to make.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH igt 3/9] igt/drv_hangman: Inject a true hang
  2015-12-16 10:37     ` Chris Wilson
@ 2015-12-16 13:12       ` Ville Syrjälä
  2015-12-16 13:51         ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Ville Syrjälä @ 2015-12-16 13:12 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, intel-gfx

On Wed, Dec 16, 2015 at 10:37:55AM +0000, Chris Wilson wrote:
> On Wed, Dec 16, 2015 at 10:02:27AM +0100, Daniel Vetter wrote:
> > On Sat, Dec 12, 2015 at 08:02:49PM +0000, Chris Wilson wrote:
> > > Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
> > > instead.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > Doesn't this kill pre-gen6? Or at least anything where we don't have
> > proper hang recovery ... Lack of that is why I've done the original
> > stop_rings fun.
> 
> Originally, igt_hang_ring required gen >= 5, but since that Ville has
> been working hard on getting reset support working for gen3 and gen4,
> now we query the kernel as to whether it can reset the device. So by
> switching over we lose testing of simulated hangs and recovery code (or 
> KMS handling during wedged) for gen2.
> 
> It is a loss in test coverage, but the benefit is that we can remove the
> hang injection code from the kernel. And that is a tradeoff I am willing
> to make.

One day I will get around to trying D3 as a reset mechanism for gen2 ;)
Until then, losing the gen2 test coverage seems fairly reasonable.
I suppose the only thing it's really testing on gen2 is making sure we
don't lock up or oops when the gpu hangs.

So what might be nice is a way to force the hang tests to run even 
when gpu reset isn't supported, just to make sure the system doesn't
die completely. Obviously it's going to be slow to run such tests due
to needing a reboot between every test, but it would be enough to do
an occasional spot check to see if there's been a regression.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH igt 3/9] igt/drv_hangman: Inject a true hang
  2015-12-16 13:12       ` Ville Syrjälä
@ 2015-12-16 13:51         ` Chris Wilson
  2015-12-16 14:03           ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-12-16 13:51 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Wed, Dec 16, 2015 at 03:12:57PM +0200, Ville Syrjälä wrote:
> On Wed, Dec 16, 2015 at 10:37:55AM +0000, Chris Wilson wrote:
> > On Wed, Dec 16, 2015 at 10:02:27AM +0100, Daniel Vetter wrote:
> > > On Sat, Dec 12, 2015 at 08:02:49PM +0000, Chris Wilson wrote:
> > > > Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
> > > > instead.
> > > > 
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > 
> > > Doesn't this kill pre-gen6? Or at least anything where we don't have
> > > proper hang recovery ... Lack of that is why I've done the original
> > > stop_rings fun.
> > 
> > Originally, igt_hang_ring required gen >= 5, but since that Ville has
> > been working hard on getting reset support working for gen3 and gen4,
> > now we query the kernel as to whether it can reset the device. So by
> > switching over we lose testing of simulated hangs and recovery code (or 
> > KMS handling during wedged) for gen2.
> > 
> > It is a loss in test coverage, but the benefit is that we can remove the
> > hang injection code from the kernel. And that is a tradeoff I am willing
> > to make.
> 
> One day I will get around to trying D3 as a reset mechanism for gen2 ;)
> Until then, losing the gen2 test coverage seems fairly reasonable.
> I suppose the only thing it's really testing on gen2 is making sure we
> don't lock up or oops when the gpu hangs.
> 
> So what might be nice is a way to force the hang tests to run even 
> when gpu reset isn't supported, just to make sure the system doesn't
> die completely. Obviously it's going to be slow to run such tests due
> to needing a reboot between every test, but it would be enough to do
> an occasional spot check to see if there's been a regression.

Would "export IGT_HANG_WITHOUT_RESET=1" be enough?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH igt 3/9] igt/drv_hangman: Inject a true hang
  2015-12-16 13:51         ` Chris Wilson
@ 2015-12-16 14:03           ` Chris Wilson
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-12-16 14:03 UTC (permalink / raw)
  To: Ville Syrjälä, Daniel Vetter, intel-gfx

On Wed, Dec 16, 2015 at 01:51:46PM +0000, Chris Wilson wrote:
> On Wed, Dec 16, 2015 at 03:12:57PM +0200, Ville Syrjälä wrote:
> > On Wed, Dec 16, 2015 at 10:37:55AM +0000, Chris Wilson wrote:
> > > On Wed, Dec 16, 2015 at 10:02:27AM +0100, Daniel Vetter wrote:
> > > > On Sat, Dec 12, 2015 at 08:02:49PM +0000, Chris Wilson wrote:
> > > > > Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
> > > > > instead.
> > > > > 
> > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > 
> > > > Doesn't this kill pre-gen6? Or at least anything where we don't have
> > > > proper hang recovery ... Lack of that is why I've done the original
> > > > stop_rings fun.
> > > 
> > > Originally, igt_hang_ring required gen >= 5, but since that Ville has
> > > been working hard on getting reset support working for gen3 and gen4,
> > > now we query the kernel as to whether it can reset the device. So by
> > > switching over we lose testing of simulated hangs and recovery code (or 
> > > KMS handling during wedged) for gen2.
> > > 
> > > It is a loss in test coverage, but the benefit is that we can remove the
> > > hang injection code from the kernel. And that is a tradeoff I am willing
> > > to make.
> > 
> > One day I will get around to trying D3 as a reset mechanism for gen2 ;)
> > Until then, losing the gen2 test coverage seems fairly reasonable.
> > I suppose the only thing it's really testing on gen2 is making sure we
> > don't lock up or oops when the gpu hangs.
> > 
> > So what might be nice is a way to force the hang tests to run even 
> > when gpu reset isn't supported, just to make sure the system doesn't
> > die completely. Obviously it's going to be slow to run such tests due
> > to needing a reboot between every test, but it would be enough to do
> > an occasional spot check to see if there's been a regression.
> 
> Would "export IGT_HANG_WITHOUT_RESET=1" be enough?


commit 8e5113502f1a0feb8f5996d7b44f5021aa24a35f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Dec 11 21:24:21 2015 +0000

    lib: Always double check igt_require_hang_ring() on use
    
    If we move the igt_require() into the hang injector, this makes simple
    test cases even more convenient. More complex test cases can always do
    their own precursory check before settting up the test.
    
    However, this does embed the assumption that the first context we are
    called from is safe (i.e no i915.enable_hangcheck/i915.reset
    interferrence).
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

diff --git a/lib/igt_gt.c b/lib/igt_gt.c
index 688ea5e..5ae1717 100644
--- a/lib/igt_gt.c
+++ b/lib/igt_gt.c
@@ -50,17 +50,21 @@
 
 static bool has_gpu_reset(int fd)
 {
-	struct drm_i915_getparam gp;
-	int val = 0;
-
-	memset(&gp, 0, sizeof(gp));
-	gp.param = 35; /* HAS_GPU_RESET */
-	gp.value = &val;
-
-	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
-		return intel_gen(intel_get_drm_devid(fd)) >= 5;
-
-	return val > 0;
+	static int once = -1;
+	if (once < 0) {
+		struct drm_i915_getparam gp;
+		int val = 0;
+
+		memset(&gp, 0, sizeof(gp));
+		gp.param = 35; /* HAS_GPU_RESET */
+		gp.value = &val;
+
+		if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
+			once = intel_gen(intel_get_drm_devid(fd)) >= 5;
+		else
+			once = val > 0;
+	}
+	return once;
 }
 
 /**
@@ -71,11 +75,29 @@ static bool has_gpu_reset(int fd)
  * Convenience helper to check whether advanced hang injection is supported by
  * the kernel. Uses igt_skip to automatically skip the test/subtest if this
  * isn't the case.
+ *
+ * Note that we can't simply just call this from igt_hang_ring since some
+ * tests want to exercise gpu wedging behavior. For which we intentionally
+ * disable gpu reset support, but still want to inject a hang, see for example
+ * tests/gem_eio.c Instead, we expect that the first invocation of
+ * igt_require_hand_ring be from a vanilla context and use the has_gpu_reset()
+ * determined then for all later instances. This allows us the convenience
+ * of double checking when injecting hangs, whilst pushing the complexity
+ * to the tests that are deliberating trying to break the box.
+ *
+ * This function is also controlled by the environment variables:
+ *
+ * IGT_HANG_WITHOUT_RESET (boolean) - if true, allow the hang even if the
+ * kernel does not support GPU recovery. The machine will be wedged afterwards
+ * (and so require a reboot between testing), but it does allow limited testing
+ * to be done under hang injection.
  */
 void igt_require_hang_ring(int fd, int ring)
 {
+	gem_require_ring(fd, ring);
 	gem_context_require_ban_period(fd);
-	igt_require(has_gpu_reset(fd));
+	if (!igt_check_boolean_env_var("IGT_HANG_WITHOUT_RESET", false))
+		igt_require(has_gpu_reset(fd));
 }
 
 /**
@@ -100,6 +122,8 @@ igt_hang_ring_t igt_hang_ring(int fd, int ring)
 	unsigned ban;
 	unsigned len;
 
+	igt_require_hang_ring(fd, ring);
+
 	param.context = 0;
 	param.size = 0;
 	param.param = LOCAL_CONTEXT_PARAM_BAN_PERIOD;
diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index e348f26..22c694b 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -1241,6 +1241,7 @@ void gem_require_caching(int fd)
 void gem_require_ring(int fd, int ring_id)
 {
 	switch (ring_id) {
+	case I915_EXEC_DEFAULT:
 	case I915_EXEC_RENDER:
 		return;
 	case I915_EXEC_BLT:
@@ -1255,6 +1256,7 @@ void gem_require_ring(int fd, int ring_id)
 		return;
 #endif
 	default:
+		igt_warn("Invalid ring: %d\n", ring_id);
 		igt_assert(0);
 		return;
 	}


-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-12-16 14:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-12 20:02 [PATCH igt 1/9] lib: Always double check igt_require_hang_ring() on use Chris Wilson
2015-12-12 20:02 ` [PATCH igt 2/9] lib: Expand igt_hang_ring() to select target context and various options Chris Wilson
2015-12-12 20:02 ` [PATCH igt 3/9] igt/drv_hangman: Inject a true hang Chris Wilson
2015-12-16  9:02   ` Daniel Vetter
2015-12-16 10:37     ` Chris Wilson
2015-12-16 13:12       ` Ville Syrjälä
2015-12-16 13:51         ` Chris Wilson
2015-12-16 14:03           ` Chris Wilson
2015-12-12 20:02 ` [PATCH igt 4/9] igt/gem_ctx_exec: Convert from stop-rings to a real GPU hang/reset Chris Wilson
2015-12-12 20:02 ` [PATCH igt 5/9] igt/gem_reset_stats: Convert from stop-rings to real hang injection Chris Wilson
2015-12-12 20:02 ` [PATCH igt 6/9] igt/kms_pipe_crc_basic: Replace stop_rings with igt_hang_ring Chris Wilson
2015-12-12 20:02 ` [PATCH igt 7/9] igt/kms_flip: Convert over to real hang injection Chris Wilson
2015-12-12 20:02 ` [PATCH igt 8/9] igt/gem_workarounds: Convert to real GPU " Chris Wilson
2015-12-12 20:02 ` [PATCH igt 9/9] igt/pm_rps: Trigger a real GPU reset Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).