* [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers
  2018-03-19 18:22 [Intel-gfx] [PATCH i-g-t 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
@ 2018-03-19 18:22 ` Tvrtko Ursulin
  0 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-03-19 18:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Temporary up-to-date uAPI headers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 include/drm-uapi/i915_drm.h | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 16e452aa12d4..14c7e790f6ed 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class {
 enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_BUSY = 0,
 	I915_SAMPLE_WAIT = 1,
-	I915_SAMPLE_SEMA = 2
+	I915_SAMPLE_SEMA = 2,
+	I915_SAMPLE_QUEUED = 3,
+	I915_SAMPLE_RUNNABLE = 4,
+	I915_SAMPLE_RUNNING = 5,
 };
 
+/* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (1024)
+#define I915_SAMPLE_RUNNABLE_DIVISOR (1024)
+#define I915_SAMPLE_RUNNING_DIVISOR (1024)
+
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
 #define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_SEMA(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
 
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_RUNNABLE(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
+
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
-- 
2.14.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats
@ 2018-04-05 12:40 Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

IGT patches for the identically named i915 series, including:

 * Engine queue depths for intel-gpu-overlay (including load average).
 * Tests for new PMU counters.
 * Tests for the query API.

v2:
 * Review feedback and tweaks.

Tvrtko Ursulin (5):
  include: i915 uAPI headers
  intel-gpu-overlay: Add engine queue stats
  intel-gpu-overlay: Show 1s, 30s and 15m GPU load
  tests/perf_pmu: Add tests for engine queued/runnable/running stats
  tests/i915_query: Engine queues tests

 include/drm-uapi/i915_drm.h |  19 +-
 overlay/gpu-top.c           |  81 +++++++-
 overlay/gpu-top.h           |  22 ++-
 overlay/overlay.c           |  35 +++-
 tests/i915_query.c          | 442 ++++++++++++++++++++++++++++++++++++++++++++
 tests/perf_pmu.c            | 258 ++++++++++++++++++++++++++
 6 files changed, 848 insertions(+), 9 deletions(-)

-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Temporary up-to-date uAPI headers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 include/drm-uapi/i915_drm.h | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)
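
Note for readers: the new I915_PMU_ENGINE_QUEUED/RUNNABLE/RUNNING macros
added by this patch all expand through __I915_PMU_ENGINE(), which packs
engine class, instance and sample type into one perf config value. The
sketch below illustrates that packing. The shift layout (sample in the
low 4 bits, instance in the next 8, class above that) is an assumption
based on the I915_PMU_SAMPLE_BITS and I915_PMU_SAMPLE_INSTANCE_BITS
values visible in the diff context; pmu_engine_config() and
pmu_config_sample() are hypothetical helpers, not part of the header.

```c
#include <assert.h>
#include <stdint.h>

/* Values as they appear in the i915_drm.h context lines of this patch. */
#define PMU_SAMPLE_BITS          4
#define PMU_SAMPLE_MASK          0xf
#define PMU_SAMPLE_INSTANCE_BITS 8
/* Assumed: class sits above the sample and instance fields. */
#define PMU_CLASS_SHIFT (PMU_SAMPLE_BITS + PMU_SAMPLE_INSTANCE_BITS)

/* Sample types added by this patch. */
enum { SAMPLE_QUEUED = 3, SAMPLE_RUNNABLE = 4, SAMPLE_RUNNING = 5 };

/* Hypothetical stand-in for __I915_PMU_ENGINE(class, instance, sample). */
static uint64_t pmu_engine_config(unsigned int engine_class,
				  unsigned int instance,
				  unsigned int sample)
{
	assert(sample <= PMU_SAMPLE_MASK);
	assert(instance < (1u << PMU_SAMPLE_INSTANCE_BITS));

	return (uint64_t)engine_class << PMU_CLASS_SHIFT |
	       (uint64_t)instance << PMU_SAMPLE_BITS |
	       sample;
}

/* Recover the sample type from a packed config value. */
static unsigned int pmu_config_sample(uint64_t config)
{
	return config & PMU_SAMPLE_MASK;
}
```

Under these assumptions, pmu_engine_config(1, 0, SAMPLE_RUNNING) packs
class 1, instance 0 and sample 5 into a single value that could be passed
as the perf event config.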

diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 16e452aa12d4..14c7e790f6ed 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class {
 enum drm_i915_pmu_engine_sample {
 	I915_SAMPLE_BUSY = 0,
 	I915_SAMPLE_WAIT = 1,
-	I915_SAMPLE_SEMA = 2
+	I915_SAMPLE_SEMA = 2,
+	I915_SAMPLE_QUEUED = 3,
+	I915_SAMPLE_RUNNABLE = 4,
+	I915_SAMPLE_RUNNING = 5,
 };
 
+/* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (1024)
+#define I915_SAMPLE_RUNNABLE_DIVISOR (1024)
+#define I915_SAMPLE_RUNNING_DIVISOR (1024)
+
 #define I915_PMU_SAMPLE_BITS (4)
 #define I915_PMU_SAMPLE_MASK (0xf)
 #define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample {
 #define I915_PMU_ENGINE_SEMA(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
 
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_RUNNABLE(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
+
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
 #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
 
 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
-- 
2.14.1


* [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Use new PMU engine queue stats (queued, runnable and running) and display
them per engine.

v2:
 * Compact per engine stats. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 overlay/gpu-top.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 overlay/gpu-top.h | 11 +++++++++++
 overlay/overlay.c |  7 +++++++
 3 files changed, 60 insertions(+)
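
Note for readers: the per-engine values shown by the overlay are average
queue depths over the sampling interval. The sketch below restates the
arithmetic used in gpu_top_update(): the counter delta is divided by the
uAPI divisor (1024) and then normalised by the elapsed time in
nanoseconds. avg_queue_depth() is a hypothetical helper name, and the
reading that the counter accumulates "depth x seconds x 1024" is an
assumption inferred from that arithmetic, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as I915_SAMPLE_{QUEUED,RUNNABLE,RUNNING}_DIVISOR. */
#define SAMPLE_DIVISOR 1024

/*
 * Hypothetical helper: average queue depth between two counter reads,
 * where d_time_ns is the time between the reads in nanoseconds.
 */
static double avg_queue_depth(uint64_t prev, uint64_t cur, uint64_t d_time_ns)
{
	assert(d_time_ns > 0);

	return (double)(cur - prev) / SAMPLE_DIVISOR * 1e9 / (double)d_time_ns;
}
```

For example, a counter that advances by 6144 over two seconds corresponds
to an average depth of three requests under this interpretation.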

diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 61b8f62fd78c..22e9badb22c1 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -72,6 +72,18 @@ static int perf_init(struct gpu_top *gt)
 				 gt->fd) >= 0)
 		gt->have_sema = 1;
 
+	if (perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class, d->inst),
+				 gt->fd) >= 0)
+		gt->have_queued = 1;
+
+	if (perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class, d->inst),
+				 gt->fd) >= 0)
+		gt->have_runnable = 1;
+
+	if (perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class, d->inst),
+				 gt->fd) >= 0)
+		gt->have_running = 1;
+
 	gt->ring[0].name = d->name;
 	gt->num_rings = 1;
 
@@ -93,6 +105,24 @@ static int perf_init(struct gpu_top *gt)
 				   gt->fd) < 0)
 			return -1;
 
+		if (gt->have_queued &&
+		    perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class,
+								d->inst),
+				   gt->fd) < 0)
+			return -1;
+
+		if (gt->have_runnable &&
+		    perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class,
+								  d->inst),
+				   gt->fd) < 0)
+			return -1;
+
+		if (gt->have_running &&
+		    perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class,
+								 d->inst),
+				   gt->fd) < 0)
+			return -1;
+
 		gt->ring[gt->num_rings++].name = d->name;
 	}
 
@@ -298,6 +328,12 @@ int gpu_top_update(struct gpu_top *gt)
 				s->wait[n] = sample[m++];
 			if (gt->have_sema)
 				s->sema[n] = sample[m++];
+			if (gt->have_queued)
+				s->queued[n] = sample[m++];
+			if (gt->have_runnable)
+				s->runnable[n] = sample[m++];
+			if (gt->have_running)
+				s->running[n] = sample[m++];
 		}
 
 		if (gt->count == 1)
@@ -310,6 +346,12 @@ int gpu_top_update(struct gpu_top *gt)
 				gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
 			if (gt->have_sema)
 				gt->ring[n].u.u.sema = (100 * (s->sema[n] - d->sema[n]) + d_time/2) / d_time;
+			if (gt->have_queued)
+				gt->ring[n].queued = (double)((s->queued[n] - d->queued[n])) / I915_SAMPLE_QUEUED_DIVISOR * 1e9 / d_time;
+			if (gt->have_runnable)
+				gt->ring[n].runnable = (double)((s->runnable[n] - d->runnable[n])) / I915_SAMPLE_RUNNABLE_DIVISOR  * 1e9 / d_time;
+			if (gt->have_running)
+				gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
 
 			/* in case of rounding + sampling errors, fudge */
 			if (gt->ring[n].u.u.busy > 100)
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index d3cdd779760f..cb4310c82a94 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -36,6 +36,9 @@ struct gpu_top {
 	int num_rings;
 	int have_wait;
 	int have_sema;
+	int have_queued;
+	int have_runnable;
+	int have_running;
 
 	struct gpu_top_ring {
 		const char *name;
@@ -47,6 +50,10 @@ struct gpu_top {
 			} u;
 			uint32_t payload;
 		} u;
+
+		double queued;
+		double runnable;
+		double running;
 	} ring[MAX_RINGS];
 
 	struct gpu_top_stat {
@@ -54,7 +61,11 @@ struct gpu_top {
 		uint64_t busy[MAX_RINGS];
 		uint64_t wait[MAX_RINGS];
 		uint64_t sema[MAX_RINGS];
+		uint64_t queued[MAX_RINGS];
+		uint64_t runnable[MAX_RINGS];
+		uint64_t running[MAX_RINGS];
 	} stat[2];
+
 	int count;
 };
 
diff --git a/overlay/overlay.c b/overlay/overlay.c
index 545af7bcb2f5..d3755397061b 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -255,6 +255,13 @@ static void show_gpu_top(struct overlay_context *ctx, struct overlay_gpu_top *gt
 		len = sprintf(txt, "%s: %3d%% busy",
 			      gt->gpu_top.ring[n].name,
 			      gt->gpu_top.ring[n].u.u.busy);
+		if (gt->gpu_top.have_queued &&
+		    gt->gpu_top.have_runnable &&
+		    gt->gpu_top.have_running)
+			len += sprintf(txt + len, " (%.2f / %.2f / %.2f)",
+				       gt->gpu_top.ring[n].queued,
+				       gt->gpu_top.ring[n].runnable,
+				       gt->gpu_top.ring[n].running);
 		if (gt->gpu_top.ring[n].u.u.wait)
 			len += sprintf(txt + len, ", %d%% wait",
 				       gt->gpu_top.ring[n].u.u.wait);
-- 
2.14.1


* [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Show total GPU loads in the window banner.

Engine load is defined as total of runnable and running requests on an
engine.

The total, non-normalized load is displayed. In other words, if N engines are
busy with exactly one request, the load will be shown as N.

v2:
 * Different flavour of load avg. (Chris Wilson)
 * Simplify code. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 overlay/gpu-top.c | 39 ++++++++++++++++++++++++++++++++++++++-
 overlay/gpu-top.h | 11 ++++++++++-
 overlay/overlay.c | 28 ++++++++++++++++++++++------
 3 files changed, 70 insertions(+), 8 deletions(-)
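
Note for readers: the load figures are exponentially decayed averages of
the summed runnable and running depths. The sketch below mirrors the
update_load() recurrence from this patch and the "sum over engines"
definition from the commit message. total_queue_depth() is a hypothetical
helper, and the decay factor is passed in directly; in the patch it is
computed once per window as exp(-period / load_period) for windows of 1,
30 and 900 seconds.

```c
#include <assert.h>

/* Same recurrence as update_load() in the patch: decay towards val. */
static double update_load(double load, double decay, double val)
{
	assert(decay >= 0.0 && decay < 1.0);

	return val + decay * (load - val);
}

/*
 * Hypothetical helper: non-normalized GPU load for one sample, defined
 * as the sum of runnable and running requests over all engines.
 */
static double total_queue_depth(const double *runnable,
				const double *running,
				unsigned int num_engines)
{
	double sum = 0.0;
	unsigned int i;

	for (i = 0; i < num_engines; i++)
		sum += runnable[i] + running[i];

	return sum;
}
```

With three engines each running exactly one request, total_queue_depth()
yields 3, matching the "load will be shown as N" statement in the commit
message; repeated update_load() calls then converge on that value at a
rate set by the decay factor.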

diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 22e9badb22c1..501429b86379 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -28,6 +28,7 @@
 #include <string.h>
 #include <unistd.h>
 #include <fcntl.h>
+#include <math.h>
 #include <errno.h>
 #include <assert.h>
 
@@ -126,6 +127,10 @@ static int perf_init(struct gpu_top *gt)
 		gt->ring[gt->num_rings++].name = d->name;
 	}
 
+	gt->have_load_avg = gt->have_queued &&
+			    gt->have_runnable &&
+			    gt->have_running;
+
 	return 0;
 }
 
@@ -290,17 +295,32 @@ static void mmio_init(struct gpu_top *gt)
 	}
 }
 
-void gpu_top_init(struct gpu_top *gt)
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us)
 {
+	const double period = (double)period_us / 1e6;
+	const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 };
+	const char *load_names[NUM_LOADS] = { "1s", "30s", "15m" };
+	unsigned int i;
+
 	memset(gt, 0, sizeof(*gt));
 	gt->fd = -1;
 
+	for (i = 0; i < NUM_LOADS; i++) {
+		gt->load_name[i] = load_names[i];
+		gt->exp[i] = exp(-period / load_period[i]);
+	}
+
 	if (perf_init(gt) == 0)
 		return;
 
 	mmio_init(gt);
 }
 
+static double update_load(double load, double exp, double val)
+{
+	return val + exp * (load - val);
+}
+
 int gpu_top_update(struct gpu_top *gt)
 {
 	uint32_t data[1024];
@@ -313,6 +333,8 @@ int gpu_top_update(struct gpu_top *gt)
 		struct gpu_top_stat *s = &gt->stat[gt->count++&1];
 		struct gpu_top_stat *d = &gt->stat[gt->count&1];
 		uint64_t *sample, d_time;
+		double gpu_qd = 0.0;
+		unsigned int i;
 		int n, m;
 
 		len = read(gt->fd, data, sizeof(data));
@@ -341,6 +363,8 @@ int gpu_top_update(struct gpu_top *gt)
 
 		d_time = s->time - d->time;
 		for (n = 0; n < gt->num_rings; n++) {
+			double qd = 0.0;
+
 			gt->ring[n].u.u.busy = (100 * (s->busy[n] - d->busy[n]) + d_time/2) / d_time;
 			if (gt->have_wait)
 				gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
@@ -353,6 +377,14 @@ int gpu_top_update(struct gpu_top *gt)
 			if (gt->have_running)
 				gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
 
+			qd = gt->ring[n].runnable + gt->ring[n].running;
+			gpu_qd += qd;
+
+			for (i = 0; i < NUM_LOADS; i++)
+				gt->ring[n].load[i] =
+					update_load(gt->ring[n].load[i],
+						    gt->exp[i], qd);
+
 			/* in case of rounding + sampling errors, fudge */
 			if (gt->ring[n].u.u.busy > 100)
 				gt->ring[n].u.u.busy = 100;
@@ -362,6 +394,11 @@ int gpu_top_update(struct gpu_top *gt)
 				gt->ring[n].u.u.sema = 100;
 		}
 
+		for (i = 0; i < NUM_LOADS; i++) {
+			gt->load[i] = update_load(gt->load[i], gt->exp[i],
+						  gpu_qd);
+			gt->norm_load[i] = gt->load[i] / gt->num_rings;
+		}
 		update = 1;
 	} else {
 		while ((len = read(gt->fd, data, sizeof(data))) > 0) {
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index cb4310c82a94..115ce8c482c1 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -26,6 +26,7 @@
 #define GPU_TOP_H
 
 #define MAX_RINGS 16
+#define NUM_LOADS 3
 
 #include <stdint.h>
 
@@ -39,6 +40,12 @@ struct gpu_top {
 	int have_queued;
 	int have_runnable;
 	int have_running;
+	int have_load_avg;
+
+	double exp[NUM_LOADS];
+	double load[NUM_LOADS];
+	double norm_load[NUM_LOADS];
+	const char *load_name[NUM_LOADS];
 
 	struct gpu_top_ring {
 		const char *name;
@@ -54,6 +61,8 @@ struct gpu_top {
 		double queued;
 		double runnable;
 		double running;
+
+		double load[NUM_LOADS];
 	} ring[MAX_RINGS];
 
 	struct gpu_top_stat {
@@ -69,7 +78,7 @@ struct gpu_top {
 	int count;
 };
 
-void gpu_top_init(struct gpu_top *gt);
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us);
 int gpu_top_update(struct gpu_top *gt);
 
 #endif /* GPU_TOP_H */
diff --git a/overlay/overlay.c b/overlay/overlay.c
index d3755397061b..63512059d8ff 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -141,7 +141,8 @@ struct overlay_context {
 };
 
 static void init_gpu_top(struct overlay_context *ctx,
-			 struct overlay_gpu_top *gt)
+			 struct overlay_gpu_top *gt,
+			 unsigned int period_us)
 {
 	const double rgba[][4] = {
 		{ 1, 0.25, 0.25, 1 },
@@ -152,7 +153,7 @@ static void init_gpu_top(struct overlay_context *ctx,
 	int n;
 
 	cpu_top_init(&gt->cpu_top);
-	gpu_top_init(&gt->gpu_top);
+	gpu_top_init(&gt->gpu_top, period_us);
 
 	chart_init(&gt->cpu, "CPU", 120);
 	chart_set_position(&gt->cpu, PAD, PAD);
@@ -927,13 +928,13 @@ int main(int argc, char **argv)
 
 	debugfs_init();
 
-	init_gpu_top(&ctx, &ctx.gpu_top);
+	sample_period = get_sample_period(&config);
+
+	init_gpu_top(&ctx, &ctx.gpu_top, sample_period);
 	init_gpu_perf(&ctx, &ctx.gpu_perf);
 	init_gpu_freq(&ctx, &ctx.gpu_freq);
 	init_gem_objects(&ctx, &ctx.gem_objects);
 
-	sample_period = get_sample_period(&config);
-
 	i = 0;
 	while (1) {
 		ctx.time = time(NULL);
@@ -949,9 +950,24 @@ int main(int argc, char **argv)
 		show_gem_objects(&ctx, &ctx.gem_objects);
 
 		{
-			char buf[80];
+			struct gpu_top *gt = &ctx.gpu_top.gpu_top;
 			cairo_text_extents_t extents;
+			char buf[256];
+
 			gethostname(buf, sizeof(buf));
+
+			if (gt->have_load_avg) {
+				int len = strlen(buf);
+
+				/* Append to the hostname already in buf. */
+				snprintf(buf + len, sizeof(buf) - len,
+					 "; %u engines; load %s %.2f, %s %.2f, %s %.2f",
+					 gt->num_rings,
+					 gt->load_name[0], gt->load[0],
+					 gt->load_name[1], gt->load[1],
+					 gt->load_name[2], gt->load[2]);
+			}
+
 			cairo_set_source_rgb(ctx.cr, .5, .5, .5);
 			cairo_set_font_size(ctx.cr, PAD-2);
 			cairo_text_extents(ctx.cr, buf, &extents);
-- 
2.14.1


* [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
  2018-04-05 12:40 ` [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests Tvrtko Ursulin
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Simple tests to check reported queue depths are correct.

v2:
 * Improvements similar to ones from i915_query.c.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/perf_pmu.c | 258 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 258 insertions(+)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 590e6526b069..7fccb437d048 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -169,6 +169,7 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
 #define TEST_RUNTIME_PM (8)
 #define FLAG_LONG (16)
 #define FLAG_HANG (32)
+#define TEST_CONTEXTS (64)
 
 static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
 {
@@ -959,6 +960,223 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
 	assert_within_epsilon(val[1], perf_slept[1], tolerance);
 }
 
+static double calc_queued(uint64_t d_val, uint64_t d_ns)
+{
+	return (double)d_val * 1e9 / I915_SAMPLE_QUEUED_DIVISOR / d_ns;
+}
+
+static void
+queued(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
+{
+	const unsigned long engine = e2ring(gem_fd, e);
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	const unsigned int max_rq = 10;
+	double queued[max_rq + 1];
+	unsigned int n, i;
+	uint64_t val[2];
+	uint64_t ts[2];
+	uint32_t bo;
+	int fd;
+
+	igt_require_sw_sync();
+	if (flags & TEST_CONTEXTS)
+		gem_require_contexts(gem_fd);
+
+	memset(queued, 0, sizeof(queued));
+
+	bo = gem_create(gem_fd, 4096);
+	gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe));
+
+	fd = open_pmu(I915_PMU_ENGINE_QUEUED(e->class, e->instance));
+
+	for (n = 0; n <= max_rq; n++) {
+		IGT_CORK_FENCE(cork);
+		int fence = -1;
+
+		gem_quiescent_gpu(gem_fd);
+
+		if (n)
+			fence = igt_cork_plug(&cork, -1);
+
+		for (i = 0; i < n; i++) {
+			struct drm_i915_gem_exec_object2 obj = { };
+			struct drm_i915_gem_execbuffer2 eb = { };
+
+			obj.handle = bo;
+
+			eb.buffer_count = 1;
+			eb.buffers_ptr = to_user_pointer(&obj);
+
+			eb.flags = engine | I915_EXEC_FENCE_IN;
+			if (flags & TEST_CONTEXTS)
+				eb.rsvd1 = gem_context_create(gem_fd);
+			eb.rsvd2 = fence;
+
+			gem_execbuf(gem_fd, &eb);
+
+			if (flags & TEST_CONTEXTS)
+				gem_context_destroy(gem_fd, eb.rsvd1);
+		}
+
+		val[0] = __pmu_read_single(fd, &ts[0]);
+		usleep(batch_duration_ns / 1000);
+		val[1] = __pmu_read_single(fd, &ts[1]);
+
+		queued[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+		igt_info("n=%u queued=%.2f\n", n, queued[n]);
+
+		if (fence >= 0)
+			igt_cork_unplug(&cork);
+
+		for (i = 0; i < n; i++)
+			gem_sync(gem_fd, bo);
+	}
+
+	close(fd);
+
+	gem_close(gem_fd, bo);
+
+	for (i = 0; i <= max_rq; i++)
+		assert_within_epsilon(queued[i], i, tolerance);
+}
+
+static unsigned long __query_wait(igt_spin_t *spin, unsigned int n)
+{
+	struct timespec ts = { };
+	unsigned long t;
+
+	igt_nsec_elapsed(&ts);
+
+	if (spin->running) {
+		igt_spin_busywait_until_running(spin);
+	} else {
+		igt_debug("__spin_wait - usleep mode\n");
+		usleep(500e3); /* Better than nothing! */
+	}
+
+	t = igt_nsec_elapsed(&ts);
+
+	return spin->running ? t : 500e6 / n;
+}
+
+static void
+runnable(int gem_fd, const struct intel_execution_engine2 *e)
+{
+	const unsigned long engine = e2ring(gem_fd, e);
+	bool contexts = gem_has_contexts(gem_fd);
+	const unsigned int max_rq = 10;
+	igt_spin_t *spin[max_rq + 1];
+	double runnable[max_rq + 1];
+	uint32_t ctx[max_rq];
+	unsigned int n, i;
+	uint64_t val[2];
+	uint64_t ts[2];
+	int fd;
+
+	memset(runnable, 0, sizeof(runnable));
+
+	if (contexts) {
+		for (i = 0; i < max_rq; i++)
+			ctx[i] = gem_context_create(gem_fd);
+	}
+
+	fd = open_pmu(I915_PMU_ENGINE_RUNNABLE(e->class, e->instance));
+
+	for (n = 0; n <= max_rq; n++) {
+		gem_quiescent_gpu(gem_fd);
+
+		for (i = 0; i < n; i++) {
+			uint32_t ctx_ = contexts ? ctx[i] : 0;
+
+			if (i == 0)
+				spin[i] = __spin_poll(gem_fd, ctx_, engine);
+			else
+				spin[i] = __igt_spin_batch_new(gem_fd, ctx_,
+							       engine, 0);
+		}
+
+		if (n)
+			usleep(__query_wait(spin[0], n) * n);
+
+		val[0] = __pmu_read_single(fd, &ts[0]);
+		usleep(batch_duration_ns / 1000);
+		val[1] = __pmu_read_single(fd, &ts[1]);
+
+		runnable[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+		igt_info("n=%u runnable=%.2f\n", n, runnable[n]);
+
+		for (i = 0; i < n; i++) {
+			end_spin(gem_fd, spin[i], FLAG_SYNC);
+			igt_spin_batch_free(gem_fd, spin[i]);
+		}
+	}
+
+	if (contexts) {
+		for (i = 0; i < max_rq; i++)
+			gem_context_destroy(gem_fd, ctx[i]);
+	}
+
+	close(fd);
+
+	assert_within_epsilon(runnable[0], 0, tolerance);
+	igt_assert(runnable[max_rq] > 0.0);
+
+	if (contexts)
+		assert_within_epsilon(runnable[max_rq] - runnable[max_rq - 1],
+				      1, tolerance);
+}
+
+static void
+running(int gem_fd, const struct intel_execution_engine2 *e)
+{
+	const unsigned long engine = e2ring(gem_fd, e);
+	const unsigned int max_rq = 10;
+	igt_spin_t *spin[max_rq + 1];
+	double running[max_rq + 1];
+	unsigned int n, i;
+	uint64_t val[2];
+	uint64_t ts[2];
+	int fd;
+
+	memset(running, 0, sizeof(running));
+	memset(spin, 0, sizeof(spin));
+
+	fd = open_pmu(I915_PMU_ENGINE_RUNNING(e->class, e->instance));
+
+	for (n = 0; n <= max_rq; n++) {
+		gem_quiescent_gpu(gem_fd);
+
+		for (i = 0; i < n; i++) {
+			if (i == 0)
+				spin[i] = __spin_poll(gem_fd, 0, engine);
+			else
+				spin[i] = __igt_spin_batch_new(gem_fd, 0,
+							       engine, 0);
+		}
+
+		if (n)
+			usleep(__query_wait(spin[0], n) * n);
+
+		val[0] = __pmu_read_single(fd, &ts[0]);
+		usleep(batch_duration_ns / 1000);
+		val[1] = __pmu_read_single(fd, &ts[1]);
+
+		running[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+		igt_info("n=%u running=%.2f\n", n, running[n]);
+
+		for (i = 0; i < n; i++) {
+			end_spin(gem_fd, spin[i], FLAG_SYNC);
+			igt_spin_batch_free(gem_fd, spin[i]);
+		}
+	}
+
+	close(fd);
+
+	assert_within_epsilon(running[0], 0, tolerance);
+	for (i = 1; i <= max_rq; i++)
+		igt_assert(running[i] > 0);
+}
+
 /**
  * Tests that i915 PMU corectly errors out in invalid initialization.
  * i915 PMU is uncore PMU, thus:
@@ -1692,6 +1910,15 @@ igt_main
 		igt_subtest_f("init-sema-%s", e->name)
 			init(fd, e, I915_SAMPLE_SEMA);
 
+		igt_subtest_f("init-queued-%s", e->name)
+			init(fd, e, I915_SAMPLE_QUEUED);
+
+		igt_subtest_f("init-runnable-%s", e->name)
+			init(fd, e, I915_SAMPLE_RUNNABLE);
+
+		igt_subtest_f("init-running-%s", e->name)
+			init(fd, e, I915_SAMPLE_RUNNING);
+
 		igt_subtest_group {
 			igt_fixture {
 				gem_require_engine(fd, e->class, e->instance);
@@ -1797,6 +2024,27 @@ igt_main
 
 			igt_subtest_f("busy-hang-%s", e->name)
 				single(fd, e, TEST_BUSY | FLAG_HANG);
+
+			/**
+			 * Test that queued metric works.
+			 */
+			igt_subtest_f("queued-%s", e->name)
+				queued(fd, e, 0);
+
+			igt_subtest_f("queued-contexts-%s", e->name)
+				queued(fd, e, TEST_CONTEXTS);
+
+			/**
+			 * Test that runnable metric works.
+			 */
+			igt_subtest_f("runnable-%s", e->name)
+				runnable(fd, e);
+
+			/**
+			 * Test that running metric works.
+			 */
+			igt_subtest_f("running-%s", e->name)
+				running(fd, e);
 		}
 
 		/**
@@ -1889,6 +2137,16 @@ igt_main
 					      e->name)
 					single(render_fd, e,
 					       TEST_BUSY | TEST_TRAILING_IDLE);
+				igt_subtest_f("render-node-queued-%s", e->name)
+					queued(render_fd, e, 0);
+				igt_subtest_f("render-node-queued-contexts-%s",
+					      e->name)
+					queued(render_fd, e, TEST_CONTEXTS);
+				igt_subtest_f("render-node-runnable-%s",
+					      e->name)
+					runnable(render_fd, e);
+				igt_subtest_f("render-node-running-%s", e->name)
+					running(render_fd, e);
 			}
 		}
 
-- 
2.14.1


* [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
  2018-04-05 14:05 ` [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2) Patchwork
  2018-04-05 16:45 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
  6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Basic tests to cover engine queued/runnable/running metric as reported
by the DRM_I915_QUERY_ENGINE_QUEUES query.

v2:
 * Update ABI for i915 changes.
 * Use igt_spin_busywait_until_running.
 * Support no hardware contexts.
 * More comments. (Lionel Landwerlin)
 Chris Wilson:
 * Check for sw sync support.
 * Multiple contexts queued test.
 * Simplify context and bb allocation.
 * Fix asserts in the running subtest.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915_query.c | 442 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 442 insertions(+)

diff --git a/tests/i915_query.c b/tests/i915_query.c
index c7de8cbd8371..83192c348e70 100644
--- a/tests/i915_query.c
+++ b/tests/i915_query.c
@@ -22,6 +22,7 @@
  */
 
 #include "igt.h"
+#include "sw_sync.h"
 
 #include <limits.h>
 
@@ -477,8 +478,414 @@ test_query_topology_known_pci_ids(int fd, int devid)
 	free(topo_info);
 }
 
+#define DRM_I915_QUERY_ENGINE_QUEUES	2
+
+struct drm_i915_query_engine_queues {
+	/** Engine class as in enum drm_i915_gem_engine_class. */
+	__u16 class;
+
+	/** Engine instance number. */
+	__u16 instance;
+
+	/** Number of requests with unresolved fences and dependencies. */
+	__u32 queued;
+
+	/** Number of ready requests waiting on a slot on GPU. */
+	__u32 runnable;
+
+	/** Number of requests executing on the GPU. */
+	__u32 running;
+
+	__u32 rsvd[5];
+};
+
+static bool query_engine_queues_supported(int fd)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_QUEUES,
+	};
+
+	return __i915_query_items(fd, &item, 1) == 0 && item.length > 0;
+}
+
+static void engine_queues_invalid(int fd)
+{
+	struct drm_i915_query_engine_queues queues;
+	struct drm_i915_query_item item;
+	unsigned int len;
+	unsigned int i;
+
+	/* Flags is MBZ. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.flags = 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	/* Length is non-zero but smaller than the required size. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.length = 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	/* Query correct length. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	i915_query_items(fd, &item, 1);
+	igt_assert(item.length >= 0);
+	len = item.length;
+
+	/* Invalid pointer. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.length = len;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+
+	/* Reserved fields are MBZ. */
+
+	for (i = 0; i < ARRAY_SIZE(queues.rsvd); i++) {
+		memset(&queues, 0, sizeof(queues));
+		queues.rsvd[i] = 1;
+		memset(&item, 0, sizeof(item));
+		item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+		item.length = len;
+		item.data_ptr = to_user_pointer(&queues);
+		i915_query_items(fd, &item, 1);
+		igt_assert_eq(item.length, -EINVAL);
+	}
+
+	memset(&queues, 0, sizeof(queues));
+	queues.class = -1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.length = len;
+	item.data_ptr = to_user_pointer(&queues);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -ENOENT);
+
+	memset(&queues, 0, sizeof(queues));
+	queues.instance = -1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.length = len;
+	item.data_ptr = to_user_pointer(&queues);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -ENOENT);
+}
+
+static void engine_queues(int fd, const struct intel_execution_engine2 *e)
+{
+	struct drm_i915_query_engine_queues queues;
+	struct drm_i915_query_item item;
+	unsigned int len;
+
+	/* Query required buffer length. */
+	memset(&queues, 0, sizeof(queues));
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.data_ptr = to_user_pointer(&queues);
+	i915_query_items(fd, &item, 1);
+	igt_assert(item.length >= 0);
+	igt_assert(item.length <= sizeof(queues));
+	len = item.length;
+
+	/* Check length larger than required works and reports same length. */
+	memset(&queues, 0, sizeof(queues));
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.data_ptr = to_user_pointer(&queues);
+	item.length = len + 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, len);
+
+	/* Actual query. */
+	memset(&queues, 0, sizeof(queues));
+	queues.class = e->class;
+	queues.instance = e->instance;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.data_ptr = to_user_pointer(&queues);
+	item.length = len;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, len);
+}
+
+static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
+{
+	return gem_class_instance_to_eb_flags(gem_fd, e->class, e->instance);
+}
+
+static void
+__query_queues(int fd, const struct intel_execution_engine2 *e,
+	       struct drm_i915_query_engine_queues *queues)
+{
+	struct drm_i915_query_item item;
+
+	memset(queues, 0, sizeof(*queues));
+	queues->class = e->class;
+	queues->instance = e->instance;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+	item.data_ptr = to_user_pointer(queues);
+	item.length = sizeof(*queues);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, sizeof(*queues));
+}
+
+#define TEST_CONTEXTS (1 << 0)
+
+/*
+ * Test that the reported number of queued (not ready for execution due to
+ * fences or dependencies) requests on an engine is correct.
+ */
+static void
+engine_queued(int gem_fd, const struct intel_execution_engine2 *e,
+	      unsigned int flags)
+{
+	const unsigned long engine = e2ring(gem_fd, e);
+	struct drm_i915_query_engine_queues queues;
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	const unsigned int max_rq = 10;
+	uint32_t queued[max_rq + 1];
+	unsigned int n, i;
+	uint32_t bo;
+
+	igt_require_sw_sync();
+	if (flags & TEST_CONTEXTS)
+		gem_require_contexts(gem_fd);
+
+	memset(queued, 0, sizeof(queued));
+
+	bo = gem_create(gem_fd, 4096);
+	gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe));
+
+	/* Create a specific queue depth of unready requests. */
+	for (n = 0; n <= max_rq; n++) {
+		int fence = -1;
+		IGT_CORK_FENCE(cork);
+
+		gem_quiescent_gpu(gem_fd);
+
+		/* Create a cork so we can create a dependency chain. */
+		if (n)
+			fence = igt_cork_plug(&cork, -1);
+
+		/* Submit n unready requests depending on the cork. */
+		for (i = 0; i < n; i++) {
+			struct drm_i915_gem_exec_object2 obj = { };
+			struct drm_i915_gem_execbuffer2 eb = { };
+
+			obj.handle = bo;
+
+			eb.buffer_count = 1;
+			eb.buffers_ptr = to_user_pointer(&obj);
+
+			eb.flags = engine | I915_EXEC_FENCE_IN;
+
+			/*
+			 * In context mode each submission is on a separate
+			 * context.
+			 */
+			if (flags & TEST_CONTEXTS)
+				eb.rsvd1 = gem_context_create(gem_fd);
+
+			eb.rsvd2 = fence;
+
+			gem_execbuf(gem_fd, &eb);
+
+			if (flags & TEST_CONTEXTS)
+				gem_context_destroy(gem_fd, eb.rsvd1);
+		}
+
+		/* Store reported queue depth to assert against later. */
+		__query_queues(gem_fd, e, &queues);
+		queued[n] = queues.queued;
+		igt_info("n=%u queued=%u\n", n, queued[n]);
+
+		/* Unplug the queue and proceed to the next queue depth. */
+		if (fence >= 0)
+			igt_cork_unplug(&cork);
+
+		gem_sync(gem_fd, bo);
+	}
+
+	gem_close(gem_fd, bo);
+
+	for (i = 0; i <= max_rq; i++)
+		igt_assert_eq(queued[i], i);
+}
+
+static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
+{
+	if (gem_can_store_dword(fd, flags))
+		return __igt_spin_batch_new_poll(fd, ctx, flags);
+	else
+		return __igt_spin_batch_new(fd, ctx, flags, 0);
+}
+
+static unsigned long __spin_wait(igt_spin_t *spin, unsigned int n)
+{
+	struct timespec ts = { };
+	unsigned long t;
+
+	igt_nsec_elapsed(&ts);
+
+	if (spin->running) {
+		igt_spin_busywait_until_running(spin);
+	} else {
+		igt_debug("__spin_wait - usleep mode\n");
+		usleep(500e3); /* Better than nothing! */
+	}
+
+	t = igt_nsec_elapsed(&ts);
+
+	return spin->running ? t : 500e6 / n;
+}
+
+/*
+ * Test that the number of requests ready for execution but waiting for space
+ * on the GPU is correctly reported.
+ */
+static void
+engine_runnable(int gem_fd, const struct intel_execution_engine2 *e)
+{
+	const unsigned long engine = e2ring(gem_fd, e);
+	struct drm_i915_query_engine_queues queues;
+	bool contexts = gem_has_contexts(gem_fd);
+	const unsigned int max_rq = 10;
+	igt_spin_t *spin[max_rq + 1];
+	uint32_t runnable[max_rq + 1];
+	uint32_t ctx[max_rq];
+	unsigned int n, i;
+
+	memset(runnable, 0, sizeof(runnable));
+
+	if (contexts) {
+		for (i = 0; i < max_rq; i++)
+			ctx[i] = gem_context_create(gem_fd);
+	}
+
+	/*
+	 * Submit different numbers of requests, potentially against different
+	 * contexts, in order to provoke the engine runnable metric into
+	 * returning different values.
+	 */
+	for (n = 0; n <= max_rq; n++) {
+		gem_quiescent_gpu(gem_fd);
+
+		for (i = 0; i < n; i++) {
+			uint32_t ctx_ = contexts ? ctx[i] : 0;
+
+			if (i == 0)
+				spin[i] = __spin_poll(gem_fd, ctx_, engine);
+			else
+				spin[i] = __igt_spin_batch_new(gem_fd, ctx_,
+							       engine, 0);
+		}
+
+		if (n)
+			usleep(__spin_wait(spin[0], n) * n);
+
+		/* Query and store for later checking. */
+		__query_queues(gem_fd, e, &queues);
+		runnable[n] = queues.runnable;
+		igt_info("n=%u runnable=%u\n", n, runnable[n]);
+
+		for (i = 0; i < n; i++) {
+			igt_spin_batch_end(spin[i]);
+			gem_sync(gem_fd, spin[i]->handle);
+			igt_spin_batch_free(gem_fd, spin[i]);
+		}
+	}
+
+	if (contexts) {
+		for (i = 0; i < max_rq; i++)
+			gem_context_destroy(gem_fd, ctx[i]);
+	}
+
+	/*
+	 * Check that the runnable metric is zero when nothing is submitted,
+	 * and that it is greater than zero at the maximum queue depth.
+	 *
+	 * We cannot assert the exact value since we do not know how many
+	 * requests the submission backend can consume.
+	 */
+	igt_assert_eq(runnable[0], 0);
+	igt_assert(runnable[max_rq] > 0);
+
+	/*
+	 * We can only test that the runnable metric is growing by one if we
+	 * have context support.
+	 */
+	if (contexts)
+		igt_assert_eq(runnable[max_rq] - runnable[max_rq - 1], 1);
+}
+
+/*
+ * Test that the number of requests currently executing on the GPU is correctly
+ * reported.
+ */
+static void
+engine_running(int gem_fd, const struct intel_execution_engine2 *e)
+{
+	const unsigned long engine = e2ring(gem_fd, e);
+	struct drm_i915_query_engine_queues queues;
+	const unsigned int max_rq = 10;
+	igt_spin_t *spin[max_rq + 1];
+	uint32_t running[max_rq + 1];
+	unsigned int n, i;
+
+	memset(running, 0, sizeof(running));
+	memset(spin, 0, sizeof(spin));
+
+	/*
+	 * Create various queue depths of requests against the same context to
+	 * try to get the submission backend to execute one or more on the GPU.
+	 */
+	for (n = 0; n <= max_rq; n++) {
+		gem_quiescent_gpu(gem_fd);
+
+		for (i = 0; i < n; i++) {
+			if (i == 0)
+				spin[i] = __spin_poll(gem_fd, 0, engine);
+			else
+				spin[i] = __igt_spin_batch_new(gem_fd, 0,
+							       engine, 0);
+		}
+
+		if (n)
+			usleep(__spin_wait(spin[0], n) * n);
+
+		/* Query and store for later checking. */
+		__query_queues(gem_fd, e, &queues);
+		running[n] = queues.running;
+		igt_info("n=%u running=%u\n", n, running[n]);
+
+		for (i = 0; i < n; i++) {
+			igt_spin_batch_end(spin[i]);
+			gem_sync(gem_fd, spin[i]->handle);
+			igt_spin_batch_free(gem_fd, spin[i]);
+		}
+	}
+
+	/*
+	 * Check that the running metric is zero when nothing is submitted,
+	 * and at least one when one or more requests are submitted, for
+	 * every queue depth.
+	 *
+	 * We cannot assert the exact value since we do not know how many
+	 * requests the submission backend can consume.
+	 */
+	igt_assert_eq(running[0], 0);
+	for (i = 1; i <= max_rq; i++)
+		igt_assert(running[i] > 0);
+}
+
 igt_main
 {
+	const struct intel_execution_engine2 *e;
 	int fd = -1;
 	int devid;
 
@@ -524,6 +931,41 @@ igt_main
 		test_query_topology_known_pci_ids(fd, devid);
 	}
 
+	igt_subtest_group {
+		igt_fixture {
+			igt_require(query_engine_queues_supported(fd));
+		}
+
+		igt_subtest("engine-queues-invalid")
+			engine_queues_invalid(fd);
+
+		__for_each_engine_class_instance(fd, e) {
+			igt_subtest_group {
+				igt_fixture {
+					gem_require_engine(fd,
+							   e->class,
+							   e->instance);
+				}
+
+				igt_subtest_f("engine-queues-%s", e->name)
+					engine_queues(fd, e);
+
+				igt_subtest_f("engine-queued-%s", e->name)
+					engine_queued(fd, e, 0);
+
+				igt_subtest_f("engine-queued-contexts-%s",
+					      e->name)
+					engine_queued(fd, e, TEST_CONTEXTS);
+
+				igt_subtest_f("engine-runnable-%s", e->name)
+					engine_runnable(fd, e);
+
+				igt_subtest_f("engine-running-%s", e->name)
+					engine_running(fd, e);
+			}
+		}
+	}
+
 	igt_fixture {
 		close(fd);
 	}
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2)
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2018-04-05 12:40 ` [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests Tvrtko Ursulin
@ 2018-04-05 14:05 ` Patchwork
  2018-04-05 16:45 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
  6 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2018-04-05 14:05 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Queued/runnable/running engine stats (rev2)
URL   : https://patchwork.freedesktop.org/series/40217/
State : success

== Summary ==

IGT patchset tested on top of latest successful build
164b4a3ab34bd7d18d34181c62bfaedb906a76e3 blacklist: Don't run tests on pipe-d, pipe-e or pipe-f

with latest DRM-Tip kernel build CI_DRM_4025
0eddede73765 drm-tip: 2018y-04m-05d-09h-51m-03s UTC integration manifest

Testlist changes:
+igt@i915_query@engine-queued-bcs0
+igt@i915_query@engine-queued-contexts-bcs0
+igt@i915_query@engine-queued-contexts-rcs0
+igt@i915_query@engine-queued-contexts-vcs0
+igt@i915_query@engine-queued-contexts-vcs1
+igt@i915_query@engine-queued-contexts-vecs0
+igt@i915_query@engine-queued-rcs0
+igt@i915_query@engine-queued-vcs0
+igt@i915_query@engine-queued-vcs1
+igt@i915_query@engine-queued-vecs0
+igt@i915_query@engine-queues-bcs0
+igt@i915_query@engine-queues-invalid
+igt@i915_query@engine-queues-rcs0
+igt@i915_query@engine-queues-vcs0
+igt@i915_query@engine-queues-vcs1
+igt@i915_query@engine-queues-vecs0
+igt@i915_query@engine-runnable-bcs0
+igt@i915_query@engine-runnable-rcs0
+igt@i915_query@engine-runnable-vcs0
+igt@i915_query@engine-runnable-vcs1
+igt@i915_query@engine-runnable-vecs0
+igt@i915_query@engine-running-bcs0
+igt@i915_query@engine-running-rcs0
+igt@i915_query@engine-running-vcs0
+igt@i915_query@engine-running-vcs1
+igt@i915_query@engine-running-vecs0
+igt@perf_pmu@init-queued-bcs0
+igt@perf_pmu@init-queued-rcs0
+igt@perf_pmu@init-queued-vcs0
+igt@perf_pmu@init-queued-vcs1
+igt@perf_pmu@init-queued-vecs0
+igt@perf_pmu@init-runnable-bcs0
+igt@perf_pmu@init-runnable-rcs0
+igt@perf_pmu@init-runnable-vcs0
+igt@perf_pmu@init-runnable-vcs1
+igt@perf_pmu@init-runnable-vecs0
+igt@perf_pmu@init-running-bcs0
+igt@perf_pmu@init-running-rcs0
+igt@perf_pmu@init-running-vcs0
+igt@perf_pmu@init-running-vcs1
+igt@perf_pmu@init-running-vecs0
+igt@perf_pmu@queued-bcs0
+igt@perf_pmu@queued-contexts-bcs0
+igt@perf_pmu@queued-contexts-rcs0
+igt@perf_pmu@queued-contexts-vcs0
+igt@perf_pmu@queued-contexts-vcs1
+igt@perf_pmu@queued-contexts-vecs0
+igt@perf_pmu@queued-rcs0
+igt@perf_pmu@queued-vcs0
+igt@perf_pmu@queued-vcs1
+igt@perf_pmu@queued-vecs0
+igt@perf_pmu@render-node-queued-bcs0
+igt@perf_pmu@render-node-queued-contexts-bcs0
+igt@perf_pmu@render-node-queued-contexts-rcs0
+igt@perf_pmu@render-node-queued-contexts-vcs0
+igt@perf_pmu@render-node-queued-contexts-vcs1
+igt@perf_pmu@render-node-queued-contexts-vecs0
+igt@perf_pmu@render-node-queued-rcs0
+igt@perf_pmu@render-node-queued-vcs0
+igt@perf_pmu@render-node-queued-vcs1
+igt@perf_pmu@render-node-queued-vecs0
+igt@perf_pmu@render-node-runnable-bcs0
+igt@perf_pmu@render-node-runnable-rcs0
+igt@perf_pmu@render-node-runnable-vcs0
+igt@perf_pmu@render-node-runnable-vcs1
+igt@perf_pmu@render-node-runnable-vecs0
+igt@perf_pmu@render-node-running-bcs0
+igt@perf_pmu@render-node-running-rcs0
+igt@perf_pmu@render-node-running-vcs0
+igt@perf_pmu@render-node-running-vcs1
+igt@perf_pmu@render-node-running-vecs0
+igt@perf_pmu@runnable-bcs0
+igt@perf_pmu@runnable-rcs0
+igt@perf_pmu@runnable-vcs0
+igt@perf_pmu@runnable-vcs1
+igt@perf_pmu@runnable-vecs0
+igt@perf_pmu@running-bcs0
+igt@perf_pmu@running-rcs0
+igt@perf_pmu@running-vcs0
+igt@perf_pmu@running-vcs1
+igt@perf_pmu@running-vecs0

---- Known issues:

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                dmesg-warn -> PASS       (fi-glk-j4005) fdo#105644
Test prime_vgem:
        Subgroup basic-fence-flip:
                fail       -> PASS       (fi-ilk-650) fdo#104008

fdo#105644 https://bugs.freedesktop.org/show_bug.cgi?id=105644
fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:432s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:443s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:383s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:542s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:299s
fi-bxt-dsi       total:285  pass:255  dwarn:0   dfail:0   fail:0   skip:30  time:522s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:520s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:525s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:509s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:414s
fi-cfl-s3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:563s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:513s
fi-cnl-y3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:581s
fi-elk-e7500     total:285  pass:226  dwarn:0   dfail:0   fail:0   skip:59  time:425s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:316s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:538s
fi-glk-j4005     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:487s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:411s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:422s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:471s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:433s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:474s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:469s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:509s
fi-pnv-d510      total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:671s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:443s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:536s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:508s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:511s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:429s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:447s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:566s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:406s
Blacklisted hosts:
fi-cnl-psr       total:285  pass:256  dwarn:3   dfail:0   fail:0   skip:26  time:514s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1228/issues.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] ✗ Fi.CI.IGT: warning for Queued/runnable/running engine stats (rev2)
  2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2018-04-05 14:05 ` [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2) Patchwork
@ 2018-04-05 16:45 ` Patchwork
  6 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2018-04-05 16:45 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Queued/runnable/running engine stats (rev2)
URL   : https://patchwork.freedesktop.org/series/40217/
State : warning

== Summary ==

---- Possible new issues:

Test gem_pwrite:
        Subgroup big-cpu-backwards:
                pass       -> SKIP       (shard-apl)
Test kms_cursor_legacy:
        Subgroup cursor-vs-flip-toggle:
                fail       -> PASS       (shard-hsw)

---- Known issues:

Test kms_rotation_crc:
        Subgroup primary-rotation-180:
                fail       -> PASS       (shard-snb) fdo#103925

fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925

shard-apl        total:2761 pass:1837 dwarn:1   dfail:0   fail:51  skip:871 time:12677s
shard-hsw        total:2761 pass:1789 dwarn:1   dfail:0   fail:45  skip:925 time:11560s
shard-snb        total:2761 pass:1382 dwarn:1   dfail:0   fail:37  skip:1341 time:6917s
Blacklisted hosts:
shard-kbl        total:2761 pass:1962 dwarn:1   dfail:0   fail:62  skip:736 time:9200s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1228/shards.html


end of thread

Thread overview: 9+ messages
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
2018-04-05 12:40 ` [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests Tvrtko Ursulin
2018-04-05 14:05 ` [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2) Patchwork
2018-04-05 16:45 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2018-03-19 18:22 [Intel-gfx] [PATCH i-g-t 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
2018-03-19 18:22 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
