* [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers
2018-03-19 18:22 [Intel-gfx] [PATCH i-g-t 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
@ 2018-03-19 18:22 ` Tvrtko Ursulin
0 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-03-19 18:22 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Temporary up to date uAPI headers.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
include/drm-uapi/i915_drm.h | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 16e452aa12d4..14c7e790f6ed 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class {
enum drm_i915_pmu_engine_sample {
I915_SAMPLE_BUSY = 0,
I915_SAMPLE_WAIT = 1,
- I915_SAMPLE_SEMA = 2
+ I915_SAMPLE_SEMA = 2,
+ I915_SAMPLE_QUEUED = 3,
+ I915_SAMPLE_RUNNABLE = 4,
+ I915_SAMPLE_RUNNING = 5,
};
+ /* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (1024)
+#define I915_SAMPLE_RUNNABLE_DIVISOR (1024)
+#define I915_SAMPLE_RUNNING_DIVISOR (1024)
+
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample {
#define I915_PMU_ENGINE_SEMA(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_RUNNABLE(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
+
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
--
2.14.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
* [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats
@ 2018-04-05 12:40 Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
IGT patches for the identically named i915 series, including:
* Engine queue depths for intel-gpu-overlay (including load average).
* Tests for new PMU counters.
* Tests for the query API.
v2:
* Review feedback and tweaks.
Tvrtko Ursulin (5):
include: i915 uAPI headers
intel-gpu-overlay: Add engine queue stats
intel-gpu-overlay: Show 1s, 30s and 15m GPU load
tests/perf_pmu: Add tests for engine queued/runnable/running stats
tests/i915_query: Engine queues tests
include/drm-uapi/i915_drm.h | 19 +-
overlay/gpu-top.c | 81 +++++++-
overlay/gpu-top.h | 22 ++-
overlay/overlay.c | 35 +++-
tests/i915_query.c | 442 ++++++++++++++++++++++++++++++++++++++++++++
tests/perf_pmu.c | 258 ++++++++++++++++++++++++++
6 files changed, 848 insertions(+), 9 deletions(-)
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Temporary up to date uAPI headers.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
include/drm-uapi/i915_drm.h | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 16e452aa12d4..14c7e790f6ed 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class {
enum drm_i915_pmu_engine_sample {
I915_SAMPLE_BUSY = 0,
I915_SAMPLE_WAIT = 1,
- I915_SAMPLE_SEMA = 2
+ I915_SAMPLE_SEMA = 2,
+ I915_SAMPLE_QUEUED = 3,
+ I915_SAMPLE_RUNNABLE = 4,
+ I915_SAMPLE_RUNNING = 5,
};
+ /* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (1024)
+#define I915_SAMPLE_RUNNABLE_DIVISOR (1024)
+#define I915_SAMPLE_RUNNING_DIVISOR (1024)
+
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample {
#define I915_PMU_ENGINE_SEMA(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_RUNNABLE(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
+
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
--
2.14.1
* [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Use new PMU engine queue stats (queued, runnable and running) and display
them per engine.
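
For illustration, the per-engine figure can be derived from two successive
counter reads roughly as follows. This is a minimal sketch rather than the
overlay code itself: the function name avg_queue_depth and the local
SAMPLE_DIVISOR constant are assumptions made for the example, with the
divisor mirroring the value of 1024 used by the I915_SAMPLE_*_DIVISOR
defines from the uAPI header patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match I915_SAMPLE_QUEUED_DIVISOR and friends (1024). */
#define SAMPLE_DIVISOR 1024

/*
 * Average queue depth over a sampling interval, following the same
 * arithmetic as the overlay: counter delta, divided by the divisor,
 * scaled from nanoseconds to seconds.
 */
double avg_queue_depth(uint64_t prev, uint64_t cur, uint64_t d_time_ns)
{
	assert(d_time_ns > 0);
	assert(cur >= prev);

	return (double)(cur - prev) / SAMPLE_DIVISOR * 1e9 / (double)d_time_ns;
}
```

Under these assumptions, a counter that advances by 1024 over one second
corresponds to an average depth of one request.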
v2:
* Compact per engine stats. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
overlay/gpu-top.c | 42 ++++++++++++++++++++++++++++++++++++++++++
overlay/gpu-top.h | 11 +++++++++++
overlay/overlay.c | 7 +++++++
3 files changed, 60 insertions(+)
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 61b8f62fd78c..22e9badb22c1 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -72,6 +72,18 @@ static int perf_init(struct gpu_top *gt)
gt->fd) >= 0)
gt->have_sema = 1;
+ if (perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_queued = 1;
+
+ if (perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_runnable = 1;
+
+ if (perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_running = 1;
+
gt->ring[0].name = d->name;
gt->num_rings = 1;
@@ -93,6 +105,24 @@ static int perf_init(struct gpu_top *gt)
gt->fd) < 0)
return -1;
+ if (gt->have_queued &&
+ perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
+ if (gt->have_runnable &&
+ perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
+ if (gt->have_running &&
+ perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
gt->ring[gt->num_rings++].name = d->name;
}
@@ -298,6 +328,12 @@ int gpu_top_update(struct gpu_top *gt)
s->wait[n] = sample[m++];
if (gt->have_sema)
s->sema[n] = sample[m++];
+ if (gt->have_queued)
+ s->queued[n] = sample[m++];
+ if (gt->have_runnable)
+ s->runnable[n] = sample[m++];
+ if (gt->have_running)
+ s->running[n] = sample[m++];
}
if (gt->count == 1)
@@ -310,6 +346,12 @@ int gpu_top_update(struct gpu_top *gt)
gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
if (gt->have_sema)
gt->ring[n].u.u.sema = (100 * (s->sema[n] - d->sema[n]) + d_time/2) / d_time;
+ if (gt->have_queued)
+ gt->ring[n].queued = (double)((s->queued[n] - d->queued[n])) / I915_SAMPLE_QUEUED_DIVISOR * 1e9 / d_time;
+ if (gt->have_runnable)
+ gt->ring[n].runnable = (double)((s->runnable[n] - d->runnable[n])) / I915_SAMPLE_RUNNABLE_DIVISOR * 1e9 / d_time;
+ if (gt->have_running)
+ gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
/* in case of rounding + sampling errors, fudge */
if (gt->ring[n].u.u.busy > 100)
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index d3cdd779760f..cb4310c82a94 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -36,6 +36,9 @@ struct gpu_top {
int num_rings;
int have_wait;
int have_sema;
+ int have_queued;
+ int have_runnable;
+ int have_running;
struct gpu_top_ring {
const char *name;
@@ -47,6 +50,10 @@ struct gpu_top {
} u;
uint32_t payload;
} u;
+
+ double queued;
+ double runnable;
+ double running;
} ring[MAX_RINGS];
struct gpu_top_stat {
@@ -54,7 +61,11 @@ struct gpu_top {
uint64_t busy[MAX_RINGS];
uint64_t wait[MAX_RINGS];
uint64_t sema[MAX_RINGS];
+ uint64_t queued[MAX_RINGS];
+ uint64_t runnable[MAX_RINGS];
+ uint64_t running[MAX_RINGS];
} stat[2];
+
int count;
};
diff --git a/overlay/overlay.c b/overlay/overlay.c
index 545af7bcb2f5..d3755397061b 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -255,6 +255,13 @@ static void show_gpu_top(struct overlay_context *ctx, struct overlay_gpu_top *gt
len = sprintf(txt, "%s: %3d%% busy",
gt->gpu_top.ring[n].name,
gt->gpu_top.ring[n].u.u.busy);
+ if (gt->gpu_top.have_queued &&
+ gt->gpu_top.have_runnable &&
+ gt->gpu_top.have_running)
+ len += sprintf(txt + len, " (%.2f / %.2f / %.2f)",
+ gt->gpu_top.ring[n].queued,
+ gt->gpu_top.ring[n].runnable,
+ gt->gpu_top.ring[n].running);
if (gt->gpu_top.ring[n].u.u.wait)
len += sprintf(txt + len, ", %d%% wait",
gt->gpu_top.ring[n].u.u.wait);
--
2.14.1
* [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Show total GPU loads in the window banner.
Engine load is defined as the total of runnable and running requests on an
engine.
The total, non-normalized load is displayed. In other words, if N engines are
busy with exactly one request each, the load will be shown as N.
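
The load calculation can be sketched in isolation as below. This is only an
illustration under stated assumptions: update_load has the same form as the
helper added by this patch, while total_queue_depth is a hypothetical name
for the per-sample summation that the patch performs inline. The decay
factor is passed in directly here; the patch derives it as
exp(-period / window) for the 1s, 30s and 15m windows.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One exponential-moving-average step. A decay close to 1 gives a slowly
 * moving average, a decay of 0 tracks the latest sample exactly.
 */
double update_load(double load, double decay, double val)
{
	assert(decay >= 0.0 && decay < 1.0);

	return val + decay * (load - val);
}

/* Non-normalized GPU queue depth: runnable + running summed over engines. */
double total_queue_depth(const double *runnable, const double *running,
			 size_t num_engines)
{
	double qd = 0.0;
	size_t i;

	for (i = 0; i < num_engines; i++)
		qd += runnable[i] + running[i];

	return qd;
}
```

With three engines each executing exactly one request, the summed depth is
3, and repeated update_load steps move the displayed load towards that
value at a rate set by the decay factor.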
v2:
* Different flavour of load avg. (Chris Wilson)
* Simplify code. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
overlay/gpu-top.c | 39 ++++++++++++++++++++++++++++++++++++++-
overlay/gpu-top.h | 11 ++++++++++-
overlay/overlay.c | 28 ++++++++++++++++++++++------
3 files changed, 70 insertions(+), 8 deletions(-)
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 22e9badb22c1..501429b86379 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -28,6 +28,7 @@
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
+#include <math.h>
#include <errno.h>
#include <assert.h>
@@ -126,6 +127,10 @@ static int perf_init(struct gpu_top *gt)
gt->ring[gt->num_rings++].name = d->name;
}
+ gt->have_load_avg = gt->have_queued &&
+ gt->have_runnable &&
+ gt->have_running;
+
return 0;
}
@@ -290,17 +295,32 @@ static void mmio_init(struct gpu_top *gt)
}
}
-void gpu_top_init(struct gpu_top *gt)
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us)
{
+ const double period = (double)period_us / 1e6;
+ const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 };
+ const char *load_names[NUM_LOADS] = { "1s", "30s", "15m" };
+ unsigned int i;
+
memset(gt, 0, sizeof(*gt));
gt->fd = -1;
+ for (i = 0; i < NUM_LOADS; i++) {
+ gt->load_name[i] = load_names[i];
+ gt->exp[i] = exp(-period / load_period[i]);
+ }
+
if (perf_init(gt) == 0)
return;
mmio_init(gt);
}
+static double update_load(double load, double exp, double val)
+{
+ return val + exp * (load - val);
+}
+
int gpu_top_update(struct gpu_top *gt)
{
uint32_t data[1024];
@@ -313,6 +333,8 @@ int gpu_top_update(struct gpu_top *gt)
struct gpu_top_stat *s = &gt->stat[gt->count++&1];
struct gpu_top_stat *d = &gt->stat[gt->count&1];
uint64_t *sample, d_time;
+ double gpu_qd = 0.0;
+ unsigned int i;
int n, m;
len = read(gt->fd, data, sizeof(data));
@@ -341,6 +363,8 @@ int gpu_top_update(struct gpu_top *gt)
d_time = s->time - d->time;
for (n = 0; n < gt->num_rings; n++) {
+ double qd = 0.0;
+
gt->ring[n].u.u.busy = (100 * (s->busy[n] - d->busy[n]) + d_time/2) / d_time;
if (gt->have_wait)
gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
@@ -353,6 +377,14 @@ int gpu_top_update(struct gpu_top *gt)
if (gt->have_running)
gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
+ qd = gt->ring[n].runnable + gt->ring[n].running;
+ gpu_qd += qd;
+
+ for (i = 0; i < NUM_LOADS; i++)
+ gt->ring[n].load[i] =
+ update_load(gt->ring[n].load[i],
+ gt->exp[i], qd);
+
/* in case of rounding + sampling errors, fudge */
if (gt->ring[n].u.u.busy > 100)
gt->ring[n].u.u.busy = 100;
@@ -362,6 +394,11 @@ int gpu_top_update(struct gpu_top *gt)
gt->ring[n].u.u.sema = 100;
}
+ for (i = 0; i < NUM_LOADS; i++) {
+ gt->load[i] = update_load(gt->load[i], gt->exp[i],
+ gpu_qd);
+ gt->norm_load[i] = gt->load[i] / gt->num_rings;
+ }
update = 1;
} else {
while ((len = read(gt->fd, data, sizeof(data))) > 0) {
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index cb4310c82a94..115ce8c482c1 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -26,6 +26,7 @@
#define GPU_TOP_H
#define MAX_RINGS 16
+#define NUM_LOADS 3
#include <stdint.h>
@@ -39,6 +40,12 @@ struct gpu_top {
int have_queued;
int have_runnable;
int have_running;
+ int have_load_avg;
+
+ double exp[NUM_LOADS];
+ double load[NUM_LOADS];
+ double norm_load[NUM_LOADS];
+ const char *load_name[NUM_LOADS];
struct gpu_top_ring {
const char *name;
@@ -54,6 +61,8 @@ struct gpu_top {
double queued;
double runnable;
double running;
+
+ double load[NUM_LOADS];
} ring[MAX_RINGS];
struct gpu_top_stat {
@@ -69,7 +78,7 @@ struct gpu_top {
int count;
};
-void gpu_top_init(struct gpu_top *gt);
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us);
int gpu_top_update(struct gpu_top *gt);
#endif /* GPU_TOP_H */
diff --git a/overlay/overlay.c b/overlay/overlay.c
index d3755397061b..63512059d8ff 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -141,7 +141,8 @@ struct overlay_context {
};
static void init_gpu_top(struct overlay_context *ctx,
- struct overlay_gpu_top *gt)
+ struct overlay_gpu_top *gt,
+ unsigned int period_us)
{
const double rgba[][4] = {
{ 1, 0.25, 0.25, 1 },
@@ -152,7 +153,7 @@ static void init_gpu_top(struct overlay_context *ctx,
int n;
cpu_top_init(&gt->cpu_top);
- gpu_top_init(&gt->gpu_top);
+ gpu_top_init(&gt->gpu_top, period_us);
chart_init(&gt->cpu, "CPU", 120);
chart_set_position(&gt->cpu, PAD, PAD);
@@ -927,13 +928,13 @@ int main(int argc, char **argv)
debugfs_init();
- init_gpu_top(&ctx, &ctx.gpu_top);
+ sample_period = get_sample_period(&config);
+
+ init_gpu_top(&ctx, &ctx.gpu_top, sample_period);
init_gpu_perf(&ctx, &ctx.gpu_perf);
init_gpu_freq(&ctx, &ctx.gpu_freq);
init_gem_objects(&ctx, &ctx.gem_objects);
- sample_period = get_sample_period(&config);
-
i = 0;
while (1) {
ctx.time = time(NULL);
@@ -949,9 +950,24 @@ int main(int argc, char **argv)
show_gem_objects(&ctx, &ctx.gem_objects);
{
- char buf[80];
+ struct gpu_top *gt = &ctx.gpu_top.gpu_top;
cairo_text_extents_t extents;
+ char buf[256];
+
gethostname(buf, sizeof(buf));
+
+ if (gt->have_load_avg) {
+ int len = strlen(buf);
+
+ snprintf(buf + len, sizeof(buf) - len,
+ "%s; %u engines; load %s %.2f, %s %.2f, %s %.2f",
+ buf,
+ gt->num_rings,
+ gt->load_name[0], gt->load[0],
+ gt->load_name[1], gt->load[1],
+ gt->load_name[2], gt->load[2]);
+ }
+
cairo_set_source_rgb(ctx.cr, .5, .5, .5);
cairo_set_font_size(ctx.cr, PAD-2);
cairo_text_extents(ctx.cr, buf, &extents);
--
2.14.1
* [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
` (2 preceding siblings ...)
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
2018-04-05 12:40 ` [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests Tvrtko Ursulin
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Simple tests to check reported queue depths are correct.
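
The checks compare a measured average depth against the number of submitted
requests using a relative tolerance. A minimal sketch of that kind of
comparison is shown below; the helper name within_epsilon is hypothetical
and only approximates what the test's assert_within_epsilon does.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Relative-tolerance comparison for non-negative reference values:
 * x must lie within [ref * (1 - tolerance), ref * (1 + tolerance)].
 */
bool within_epsilon(double x, double ref, double tolerance)
{
	assert(tolerance >= 0.0);

	return x <= (1.0 + tolerance) * ref && x >= (1.0 - tolerance) * ref;
}
```

Note that with a purely relative tolerance a reference of zero only accepts
a measured value of exactly zero, which matches the expectation that an idle
engine reports no queued requests.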
v2:
* Improvements similar to ones from i915_query.c.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 258 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 258 insertions(+)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 590e6526b069..7fccb437d048 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -169,6 +169,7 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
#define TEST_RUNTIME_PM (8)
#define FLAG_LONG (16)
#define FLAG_HANG (32)
+#define TEST_CONTEXTS (64)
static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
{
@@ -959,6 +960,223 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
assert_within_epsilon(val[1], perf_slept[1], tolerance);
}
+static double calc_queued(uint64_t d_val, uint64_t d_ns)
+{
+ return (double)d_val * 1e9 / I915_SAMPLE_QUEUED_DIVISOR / d_ns;
+}
+
+static void
+queued(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ const uint32_t bbe = MI_BATCH_BUFFER_END;
+ const unsigned int max_rq = 10;
+ double queued[max_rq + 1];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ uint32_t bo;
+ int fd;
+
+ igt_require_sw_sync();
+ if (flags & TEST_CONTEXTS)
+ gem_require_contexts(gem_fd);
+
+ memset(queued, 0, sizeof(queued));
+
+ bo = gem_create(gem_fd, 4096);
+ gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe));
+
+ fd = open_pmu(I915_PMU_ENGINE_QUEUED(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ IGT_CORK_FENCE(cork);
+ int fence = -1;
+
+ gem_quiescent_gpu(gem_fd);
+
+ if (n)
+ fence = igt_cork_plug(&cork, -1);
+
+ for (i = 0; i < n; i++) {
+ struct drm_i915_gem_exec_object2 obj = { };
+ struct drm_i915_gem_execbuffer2 eb = { };
+
+ obj.handle = bo;
+
+ eb.buffer_count = 1;
+ eb.buffers_ptr = to_user_pointer(&obj);
+
+ eb.flags = engine | I915_EXEC_FENCE_IN;
+ if (flags & TEST_CONTEXTS)
+ eb.rsvd1 = gem_context_create(gem_fd);
+ eb.rsvd2 = fence;
+
+ gem_execbuf(gem_fd, &eb);
+
+ if (flags & TEST_CONTEXTS)
+ gem_context_destroy(gem_fd, eb.rsvd1);
+ }
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ queued[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u queued=%.2f\n", n, queued[n]);
+
+ if (fence >= 0)
+ igt_cork_unplug(&cork);
+
+ for (i = 0; i < n; i++)
+ gem_sync(gem_fd, bo);
+ }
+
+ close(fd);
+
+ gem_close(gem_fd, bo);
+
+ for (i = 0; i <= max_rq; i++)
+ assert_within_epsilon(queued[i], i, tolerance);
+}
+
+static unsigned long __query_wait(igt_spin_t *spin, unsigned int n)
+{
+ struct timespec ts = { };
+ unsigned long t;
+
+ igt_nsec_elapsed(&ts);
+
+ if (spin->running) {
+ igt_spin_busywait_until_running(spin);
+ } else {
+ igt_debug("__spin_wait - usleep mode\n");
+ usleep(500e3); /* Better than nothing! */
+ }
+
+ t = igt_nsec_elapsed(&ts);
+
+ return spin->running ? t : 500e6 / n;
+}
+
+static void
+runnable(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ bool contexts = gem_has_contexts(gem_fd);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ double runnable[max_rq + 1];
+ uint32_t ctx[max_rq];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ int fd;
+
+ memset(runnable, 0, sizeof(runnable));
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ ctx[i] = gem_context_create(gem_fd);
+ }
+
+ fd = open_pmu(I915_PMU_ENGINE_RUNNABLE(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ uint32_t ctx_ = contexts ? ctx[i] : 0;
+
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, ctx_, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd, ctx_,
+ engine, 0);
+ }
+
+ if (n)
+ usleep(__query_wait(spin[0], n) * n);
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ runnable[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u runnable=%.2f\n", n, runnable[n]);
+
+ for (i = 0; i < n; i++) {
+ end_spin(gem_fd, spin[i], FLAG_SYNC);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ gem_context_destroy(gem_fd, ctx[i]);
+ }
+
+ close(fd);
+
+ assert_within_epsilon(runnable[0], 0, tolerance);
+ igt_assert(runnable[max_rq] > 0.0);
+
+ if (contexts)
+ assert_within_epsilon(runnable[max_rq] - runnable[max_rq - 1],
+ 1, tolerance);
+}
+
+static void
+running(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ double running[max_rq + 1];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ int fd;
+
+ memset(running, 0, sizeof(running));
+ memset(spin, 0, sizeof(spin));
+
+ fd = open_pmu(I915_PMU_ENGINE_RUNNING(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, 0, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd, 0,
+ engine, 0);
+ }
+
+ if (n)
+ usleep(__query_wait(spin[0], n) * n);
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ running[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u running=%.2f\n", n, running[n]);
+
+ for (i = 0; i < n; i++) {
+ end_spin(gem_fd, spin[i], FLAG_SYNC);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ close(fd);
+
+ assert_within_epsilon(running[0], 0, tolerance);
+ for (i = 1; i <= max_rq; i++)
+ igt_assert(running[i] > 0);
+}
+
/**
* Tests that i915 PMU corectly errors out in invalid initialization.
* i915 PMU is uncore PMU, thus:
@@ -1692,6 +1910,15 @@ igt_main
igt_subtest_f("init-sema-%s", e->name)
init(fd, e, I915_SAMPLE_SEMA);
+ igt_subtest_f("init-queued-%s", e->name)
+ init(fd, e, I915_SAMPLE_QUEUED);
+
+ igt_subtest_f("init-runnable-%s", e->name)
+ init(fd, e, I915_SAMPLE_RUNNABLE);
+
+ igt_subtest_f("init-running-%s", e->name)
+ init(fd, e, I915_SAMPLE_RUNNING);
+
igt_subtest_group {
igt_fixture {
gem_require_engine(fd, e->class, e->instance);
@@ -1797,6 +2024,27 @@ igt_main
igt_subtest_f("busy-hang-%s", e->name)
single(fd, e, TEST_BUSY | FLAG_HANG);
+
+ /**
+ * Test that queued metric works.
+ */
+ igt_subtest_f("queued-%s", e->name)
+ queued(fd, e, 0);
+
+ igt_subtest_f("queued-contexts-%s", e->name)
+ queued(fd, e, TEST_CONTEXTS);
+
+ /**
+ * Test that runnable metric works.
+ */
+ igt_subtest_f("runnable-%s", e->name)
+ runnable(fd, e);
+
+ /**
+ * Test that running metric works.
+ */
+ igt_subtest_f("running-%s", e->name)
+ running(fd, e);
}
/**
@@ -1889,6 +2137,16 @@ igt_main
e->name)
single(render_fd, e,
TEST_BUSY | TEST_TRAILING_IDLE);
+ igt_subtest_f("render-node-queued-%s", e->name)
+ queued(render_fd, e, 0);
+ igt_subtest_f("render-node-queued-contexts-%s",
+ e->name)
+ queued(render_fd, e, TEST_CONTEXTS);
+ igt_subtest_f("render-node-runnable-%s",
+ e->name)
+ runnable(render_fd, e);
+ igt_subtest_f("render-node-running-%s", e->name)
+ running(render_fd, e);
}
}
--
2.14.1
* [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
` (3 preceding siblings ...)
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
@ 2018-04-05 12:40 ` Tvrtko Ursulin
2018-04-05 14:05 ` [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2) Patchwork
2018-04-05 16:45 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
6 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-04-05 12:40 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Basic tests to cover engine queued/runnable/running metric as reported
by the DRM_I915_QUERY_ENGINE_QUEUES query.
v2:
* Update ABI for i915 changes.
* Use igt_spin_busywait_until_running.
* Support no hardware contexts.
* More comments. (Lionel Landwerlin)
Chris Wilson:
* Check for sw sync support.
* Multiple contexts queued test.
* Simplify context and bb allocation.
* Fix asserts in the running subtest.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/i915_query.c | 442 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 442 insertions(+)
diff --git a/tests/i915_query.c b/tests/i915_query.c
index c7de8cbd8371..83192c348e70 100644
--- a/tests/i915_query.c
+++ b/tests/i915_query.c
@@ -22,6 +22,7 @@
*/
#include "igt.h"
+#include "sw_sync.h"
#include <limits.h>
@@ -477,8 +478,414 @@ test_query_topology_known_pci_ids(int fd, int devid)
free(topo_info);
}
+#define DRM_I915_QUERY_ENGINE_QUEUES 2
+
+struct drm_i915_query_engine_queues {
+ /** Engine class as in enum drm_i915_gem_engine_class. */
+ __u16 class;
+
+ /** Engine instance number. */
+ __u16 instance;
+
+ /** Number of requests with unresolved fences and dependencies. */
+ __u32 queued;
+
+ /** Number of ready requests waiting on a slot on GPU. */
+ __u32 runnable;
+
+ /** Number of requests executing on the GPU. */
+ __u32 running;
+
+ __u32 rsvd[5];
+};
+
+static bool query_engine_queues_supported(int fd)
+{
+ struct drm_i915_query_item item = {
+ .query_id = DRM_I915_QUERY_ENGINE_QUEUES,
+ };
+
+ return __i915_query_items(fd, &item, 1) == 0 && item.length > 0;
+}
+
+static void engine_queues_invalid(int fd)
+{
+ struct drm_i915_query_engine_queues queues;
+ struct drm_i915_query_item item;
+ unsigned int len;
+ unsigned int i;
+
+ /* Flags is MBZ. */
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.flags = 1;
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, -EINVAL);
+
+ /* Length not zero and not greater or equal required size. */
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.length = 1;
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, -EINVAL);
+
+ /* Query correct length. */
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ i915_query_items(fd, &item, 1);
+ igt_assert(item.length >= 0);
+ len = item.length;
+
+ /* Invalid pointer. */
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.length = len;
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, -EFAULT);
+
+ /* Reserved fields are MBZ. */
+
+ for (i = 0; i < ARRAY_SIZE(queues.rsvd); i++) {
+ memset(&queues, 0, sizeof(queues));
+ queues.rsvd[i] = 1;
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.length = len;
+ item.data_ptr = to_user_pointer(&queues);
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, -EINVAL);
+ }
+
+ memset(&queues, 0, sizeof(queues));
+ queues.class = -1;
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.length = len;
+ item.data_ptr = to_user_pointer(&queues);
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, -ENOENT);
+
+ memset(&queues, 0, sizeof(queues));
+ queues.instance = -1;
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.length = len;
+ item.data_ptr = to_user_pointer(&queues);
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, -ENOENT);
+}
+
+static void engine_queues(int fd, const struct intel_execution_engine2 *e)
+{
+ struct drm_i915_query_engine_queues queues;
+ struct drm_i915_query_item item;
+ unsigned int len;
+
+ /* Query required buffer length. */
+ memset(&queues, 0, sizeof(queues));
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.data_ptr = to_user_pointer(&queues);
+ i915_query_items(fd, &item, 1);
+ igt_assert(item.length >= 0);
+ igt_assert(item.length <= sizeof(queues));
+ len = item.length;
+
+ /* Check length larger than required works and reports same length. */
+ memset(&queues, 0, sizeof(queues));
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.data_ptr = to_user_pointer(&queues);
+ item.length = len + 1;
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, len);
+
+ /* Actual query. */
+ memset(&queues, 0, sizeof(queues));
+ queues.class = e->class;
+ queues.instance = e->instance;
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.data_ptr = to_user_pointer(&queues);
+ item.length = len;
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, len);
+}
+
+static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ return gem_class_instance_to_eb_flags(gem_fd, e->class, e->instance);
+}
+
+static void
+__query_queues(int fd, const struct intel_execution_engine2 *e,
+ struct drm_i915_query_engine_queues *queues)
+{
+ struct drm_i915_query_item item;
+
+ memset(queues, 0, sizeof(*queues));
+ queues->class = e->class;
+ queues->instance = e->instance;
+ memset(&item, 0, sizeof(item));
+ item.query_id = DRM_I915_QUERY_ENGINE_QUEUES;
+ item.data_ptr = to_user_pointer(queues);
+ item.length = sizeof(*queues);
+ i915_query_items(fd, &item, 1);
+ igt_assert_eq(item.length, sizeof(*queues));
+}
+
+#define TEST_CONTEXTS (1 << 0)
+
+/*
+ * Test that the reported number of queued (not ready for execution due to fences
+ * or dependencies) requests on an engine is correct.
+ */
+static void
+engine_queued(int gem_fd, const struct intel_execution_engine2 *e,
+ unsigned int flags)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ struct drm_i915_query_engine_queues queues;
+ const uint32_t bbe = MI_BATCH_BUFFER_END;
+ const unsigned int max_rq = 10;
+ uint32_t queued[max_rq + 1];
+ unsigned int n, i;
+ uint32_t bo;
+
+ igt_require_sw_sync();
+ if (flags & TEST_CONTEXTS)
+ gem_require_contexts(gem_fd);
+
+ memset(queued, 0, sizeof(queued));
+
+ bo = gem_create(gem_fd, 4096);
+ gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe));
+
+ /* Create a specific queue depth of unready requests. */
+ for (n = 0; n <= max_rq; n++) {
+ int fence = -1;
+ IGT_CORK_FENCE(cork);
+
+ gem_quiescent_gpu(gem_fd);
+
+ /* Create a cork so we can create a dependency chain. */
+ if (n)
+ fence = igt_cork_plug(&cork, -1);
+
+ /* Submit n unready requests depending on the cork. */
+ for (i = 0; i < n; i++) {
+ struct drm_i915_gem_exec_object2 obj = { };
+ struct drm_i915_gem_execbuffer2 eb = { };
+
+ obj.handle = bo;
+
+ eb.buffer_count = 1;
+ eb.buffers_ptr = to_user_pointer(&obj);
+
+ eb.flags = engine | I915_EXEC_FENCE_IN;
+
+ /*
+ * In context mode each submission is on a separate
+ * context.
+ */
+ if (flags & TEST_CONTEXTS)
+ eb.rsvd1 = gem_context_create(gem_fd);
+
+ eb.rsvd2 = fence;
+
+ gem_execbuf(gem_fd, &eb);
+
+ if (flags & TEST_CONTEXTS)
+ gem_context_destroy(gem_fd, eb.rsvd1);
+ }
+
+ /* Store reported queue depth to assert against later. */
+ __query_queues(gem_fd, e, &queues);
+ queued[n] = queues.queued;
+ igt_info("n=%u queued=%u\n", n, queued[n]);
+
+ /* Unplug the queue and proceed to the next queue depth. */
+ if (fence >= 0)
+ igt_cork_unplug(&cork);
+
+ gem_sync(gem_fd, bo);
+ }
+
+ gem_close(gem_fd, bo);
+
+ for (i = 0; i <= max_rq; i++)
+ igt_assert_eq(queued[i], i);
+}
+
+static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
+{
+ if (gem_can_store_dword(fd, flags))
+ return __igt_spin_batch_new_poll(fd, ctx, flags);
+ else
+ return __igt_spin_batch_new(fd, ctx, flags, 0);
+}
+
+static unsigned long __spin_wait(igt_spin_t *spin, unsigned int n)
+{
+ struct timespec ts = { };
+ unsigned long t;
+
+ igt_nsec_elapsed(&ts);
+
+ if (spin->running) {
+ igt_spin_busywait_until_running(spin);
+ } else {
+ igt_debug("__spin_wait - usleep mode\n");
+ usleep(500e3); /* Better than nothing! */
+ }
+
+ t = igt_nsec_elapsed(&ts);
+
+ return spin->running ? t : 500e6 / n;
+}
+
+/*
+ * Test that the number of requests ready for execution but waiting for space
+ * on the GPU is correctly reported.
+ */
+static void
+engine_runnable(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ struct drm_i915_query_engine_queues queues;
+ bool contexts = gem_has_contexts(gem_fd);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ uint32_t runnable[max_rq + 1];
+ uint32_t ctx[max_rq];
+ unsigned int n, i;
+
+ memset(runnable, 0, sizeof(runnable));
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ ctx[i] = gem_context_create(gem_fd);
+ }
+
+ /*
+ * Submit different numbers of requests, potentially against different
+ * contexts, in order to provoke the engine runnable metric into returning
+ * different numbers.
+ */
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ uint32_t ctx_ = contexts ? ctx[i] : 0;
+
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, ctx_, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd, ctx_,
+ engine, 0);
+ }
+
+ if (n)
+ usleep(__spin_wait(spin[0], n) * n);
+
+ /* Query and store for later checking. */
+ __query_queues(gem_fd, e, &queues);
+ runnable[n] = queues.runnable;
+ igt_info("n=%u runnable=%u\n", n, runnable[n]);
+
+ for (i = 0; i < n; i++) {
+ igt_spin_batch_end(spin[i]);
+ gem_sync(gem_fd, spin[i]->handle);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ gem_context_destroy(gem_fd, ctx[i]);
+ }
+
+ /*
+ * Check that the runnable metric is zero when nothing is submitted,
+ * and that it is greater than zero on the maximum queue depth.
+ *
+ * We cannot assert the exact value since we do not know how many
+ * requests the submission backend can consume.
+ */
+ igt_assert_eq(runnable[0], 0);
+ igt_assert(runnable[max_rq] > 0);
+
+ /*
+ * We can only test that the runnable metric is growing by one if we
+ * have context support.
+ */
+ if (contexts)
+ igt_assert_eq(runnable[max_rq] - runnable[max_rq - 1], 1);
+}
+
+/*
+ * Test that the number of requests currently executing on the GPU is correctly
+ * reported.
+ */
+static void
+engine_running(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ struct drm_i915_query_engine_queues queues;
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ uint32_t running[max_rq + 1];
+ unsigned int n, i;
+
+ memset(running, 0, sizeof(running));
+ memset(spin, 0, sizeof(spin));
+
+ /*
+ * Create various queue depths of requests against the same context to
+ * try to get the submission backend to execute one or more on the GPU.
+ */
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, 0, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd, 0,
+ engine, 0);
+ }
+
+ if (n)
+ usleep(__spin_wait(spin[0], n) * n);
+
+ /* Query and store for later checking. */
+ __query_queues(gem_fd, e, &queues);
+ running[n] = queues.running;
+ igt_info("n=%u running=%u\n", n, running[n]);
+
+ for (i = 0; i < n; i++) {
+ igt_spin_batch_end(spin[i]);
+ gem_sync(gem_fd, spin[i]->handle);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ /*
+ * Check that the running metric is zero when nothing is submitted,
+ * one when one request is submitted, and at least one for any greater
+ * queue depth.
+ *
+ * We cannot assert the exact value since we do not know how many
+ * requests the submission backend can consume.
+ */
+ igt_assert_eq(running[0], 0);
+ for (i = 1; i <= max_rq; i++)
+ igt_assert(running[i] > 0);
+}
+
igt_main
{
+ const struct intel_execution_engine2 *e;
int fd = -1;
int devid;
@@ -524,6 +931,41 @@ igt_main
test_query_topology_known_pci_ids(fd, devid);
}
+ igt_subtest_group {
+ igt_fixture {
+ igt_require(query_engine_queues_supported(fd));
+ }
+
+ igt_subtest("engine-queues-invalid")
+ engine_queues_invalid(fd);
+
+ __for_each_engine_class_instance(fd, e) {
+ igt_subtest_group {
+ igt_fixture {
+ gem_require_engine(fd,
+ e->class,
+ e->instance);
+ }
+
+ igt_subtest_f("engine-queues-%s", e->name)
+ engine_queues(fd, e);
+
+ igt_subtest_f("engine-queued-%s", e->name)
+ engine_queued(fd, e, 0);
+
+ igt_subtest_f("engine-queued-contexts-%s",
+ e->name)
+ engine_queued(fd, e, TEST_CONTEXTS);
+
+ igt_subtest_f("engine-runnable-%s", e->name)
+ engine_runnable(fd, e);
+
+ igt_subtest_f("engine-running-%s", e->name)
+ engine_running(fd, e);
+ }
+ }
+ }
+
igt_fixture {
close(fd);
}
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
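Note on the query pattern exercised by engine_queues() in the patch above: the test first submits the item with length zero to learn the required buffer size, then repeats the query with a larger and with an exact length and expects the same size back each time. The sketch below mimics that flow against a mock, so it runs without a GPU. Everything prefixed mock_ is hypothetical and is not the i915 uAPI; the struct layout, the returned counts, and the -EINVAL result for a too-small buffer are assumptions made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct drm_i915_query_engine_queues. */
struct mock_engine_queues {
	uint16_t engine_class;
	uint16_t instance;
	uint32_t queued;
	uint32_t runnable;
	uint32_t running;
};

/* Hypothetical stand-in for struct drm_i915_query_item. */
struct mock_query_item {
	int32_t length;
	void *data_ptr;
};

/*
 * Mock of the kernel side (assumed behaviour, not taken from the driver):
 *  - length == 0: report the required size and touch nothing else;
 *  - length too small: report -EINVAL;
 *  - otherwise: fill the buffer and report the size actually used.
 */
static void mock_query(struct mock_query_item *item)
{
	const int32_t required = (int32_t)sizeof(struct mock_engine_queues);
	struct mock_engine_queues *q = item->data_ptr;

	if (item->length == 0) {
		item->length = required;
		return;
	}

	if (item->length < required) {
		item->length = -EINVAL;
		return;
	}

	/* Made-up counts, purely so the caller has something to read. */
	q->queued = 3;
	q->runnable = 2;
	q->running = 1;
	item->length = required;
}

/* Step 1: ask how large the buffer has to be. */
static int32_t probe_length(void)
{
	struct mock_query_item item;

	memset(&item, 0, sizeof(item));
	mock_query(&item);

	return item.length;
}

/* Step 2: run the query with a caller-chosen length, return reported length. */
static int32_t query_queues(struct mock_engine_queues *q, int32_t length)
{
	struct mock_query_item item;

	memset(&item, 0, sizeof(item));
	item.data_ptr = q;
	item.length = length;
	mock_query(&item);

	return item.length;
}
```

A caller would run probe_length() once, allocate or check its buffer against the result, and then call query_queues() with that length. Passing a larger length is expected to report the same size, which is what the second step of engine_queues() asserts.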
* [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2)
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
` (4 preceding siblings ...)
2018-04-05 12:40 ` [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests Tvrtko Ursulin
@ 2018-04-05 14:05 ` Patchwork
2018-04-05 16:45 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
6 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2018-04-05 14:05 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: Queued/runnable/running engine stats (rev2)
URL : https://patchwork.freedesktop.org/series/40217/
State : success
== Summary ==
IGT patchset tested on top of latest successful build
164b4a3ab34bd7d18d34181c62bfaedb906a76e3 blacklist: Don't run tests on pipe-d, pipe-e or pipe-f
with latest DRM-Tip kernel build CI_DRM_4025
0eddede73765 drm-tip: 2018y-04m-05d-09h-51m-03s UTC integration manifest
Testlist changes:
+igt@i915_query@engine-queued-bcs0
+igt@i915_query@engine-queued-contexts-bcs0
+igt@i915_query@engine-queued-contexts-rcs0
+igt@i915_query@engine-queued-contexts-vcs0
+igt@i915_query@engine-queued-contexts-vcs1
+igt@i915_query@engine-queued-contexts-vecs0
+igt@i915_query@engine-queued-rcs0
+igt@i915_query@engine-queued-vcs0
+igt@i915_query@engine-queued-vcs1
+igt@i915_query@engine-queued-vecs0
+igt@i915_query@engine-queues-bcs0
+igt@i915_query@engine-queues-invalid
+igt@i915_query@engine-queues-rcs0
+igt@i915_query@engine-queues-vcs0
+igt@i915_query@engine-queues-vcs1
+igt@i915_query@engine-queues-vecs0
+igt@i915_query@engine-runnable-bcs0
+igt@i915_query@engine-runnable-rcs0
+igt@i915_query@engine-runnable-vcs0
+igt@i915_query@engine-runnable-vcs1
+igt@i915_query@engine-runnable-vecs0
+igt@i915_query@engine-running-bcs0
+igt@i915_query@engine-running-rcs0
+igt@i915_query@engine-running-vcs0
+igt@i915_query@engine-running-vcs1
+igt@i915_query@engine-running-vecs0
+igt@perf_pmu@init-queued-bcs0
+igt@perf_pmu@init-queued-rcs0
+igt@perf_pmu@init-queued-vcs0
+igt@perf_pmu@init-queued-vcs1
+igt@perf_pmu@init-queued-vecs0
+igt@perf_pmu@init-runnable-bcs0
+igt@perf_pmu@init-runnable-rcs0
+igt@perf_pmu@init-runnable-vcs0
+igt@perf_pmu@init-runnable-vcs1
+igt@perf_pmu@init-runnable-vecs0
+igt@perf_pmu@init-running-bcs0
+igt@perf_pmu@init-running-rcs0
+igt@perf_pmu@init-running-vcs0
+igt@perf_pmu@init-running-vcs1
+igt@perf_pmu@init-running-vecs0
+igt@perf_pmu@queued-bcs0
+igt@perf_pmu@queued-contexts-bcs0
+igt@perf_pmu@queued-contexts-rcs0
+igt@perf_pmu@queued-contexts-vcs0
+igt@perf_pmu@queued-contexts-vcs1
+igt@perf_pmu@queued-contexts-vecs0
+igt@perf_pmu@queued-rcs0
+igt@perf_pmu@queued-vcs0
+igt@perf_pmu@queued-vcs1
+igt@perf_pmu@queued-vecs0
+igt@perf_pmu@render-node-queued-bcs0
+igt@perf_pmu@render-node-queued-contexts-bcs0
+igt@perf_pmu@render-node-queued-contexts-rcs0
+igt@perf_pmu@render-node-queued-contexts-vcs0
+igt@perf_pmu@render-node-queued-contexts-vcs1
+igt@perf_pmu@render-node-queued-contexts-vecs0
+igt@perf_pmu@render-node-queued-rcs0
+igt@perf_pmu@render-node-queued-vcs0
+igt@perf_pmu@render-node-queued-vcs1
+igt@perf_pmu@render-node-queued-vecs0
+igt@perf_pmu@render-node-runnable-bcs0
+igt@perf_pmu@render-node-runnable-rcs0
+igt@perf_pmu@render-node-runnable-vcs0
+igt@perf_pmu@render-node-runnable-vcs1
+igt@perf_pmu@render-node-runnable-vecs0
+igt@perf_pmu@render-node-running-bcs0
+igt@perf_pmu@render-node-running-rcs0
+igt@perf_pmu@render-node-running-vcs0
+igt@perf_pmu@render-node-running-vcs1
+igt@perf_pmu@render-node-running-vecs0
+igt@perf_pmu@runnable-bcs0
+igt@perf_pmu@runnable-rcs0
+igt@perf_pmu@runnable-vcs0
+igt@perf_pmu@runnable-vcs1
+igt@perf_pmu@runnable-vecs0
+igt@perf_pmu@running-bcs0
+igt@perf_pmu@running-rcs0
+igt@perf_pmu@running-vcs0
+igt@perf_pmu@running-vcs1
+igt@perf_pmu@running-vecs0
---- Known issues:
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-c:
dmesg-warn -> PASS (fi-glk-j4005) fdo#105644
Test prime_vgem:
Subgroup basic-fence-flip:
fail -> PASS (fi-ilk-650) fdo#104008
fdo#105644 https://bugs.freedesktop.org/show_bug.cgi?id=105644
fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008
fi-bdw-5557u total:285 pass:264 dwarn:0 dfail:0 fail:0 skip:21 time:432s
fi-bdw-gvtdvm total:285 pass:261 dwarn:0 dfail:0 fail:0 skip:24 time:443s
fi-blb-e6850 total:285 pass:220 dwarn:1 dfail:0 fail:0 skip:64 time:383s
fi-bsw-n3050 total:285 pass:239 dwarn:0 dfail:0 fail:0 skip:46 time:542s
fi-bwr-2160 total:285 pass:180 dwarn:0 dfail:0 fail:0 skip:105 time:299s
fi-bxt-dsi total:285 pass:255 dwarn:0 dfail:0 fail:0 skip:30 time:522s
fi-bxt-j4205 total:285 pass:256 dwarn:0 dfail:0 fail:0 skip:29 time:520s
fi-byt-j1900 total:285 pass:250 dwarn:0 dfail:0 fail:0 skip:35 time:525s
fi-byt-n2820 total:285 pass:246 dwarn:0 dfail:0 fail:0 skip:39 time:509s
fi-cfl-8700k total:285 pass:257 dwarn:0 dfail:0 fail:0 skip:28 time:414s
fi-cfl-s3 total:285 pass:259 dwarn:0 dfail:0 fail:0 skip:26 time:563s
fi-cfl-u total:285 pass:259 dwarn:0 dfail:0 fail:0 skip:26 time:513s
fi-cnl-y3 total:285 pass:259 dwarn:0 dfail:0 fail:0 skip:26 time:581s
fi-elk-e7500 total:285 pass:226 dwarn:0 dfail:0 fail:0 skip:59 time:425s
fi-gdg-551 total:285 pass:176 dwarn:0 dfail:0 fail:1 skip:108 time:316s
fi-glk-1 total:285 pass:257 dwarn:0 dfail:0 fail:0 skip:28 time:538s
fi-glk-j4005 total:285 pass:256 dwarn:0 dfail:0 fail:0 skip:29 time:487s
fi-hsw-4770 total:285 pass:258 dwarn:0 dfail:0 fail:0 skip:27 time:411s
fi-ilk-650 total:285 pass:225 dwarn:0 dfail:0 fail:0 skip:60 time:422s
fi-ivb-3520m total:285 pass:256 dwarn:0 dfail:0 fail:0 skip:29 time:471s
fi-ivb-3770 total:285 pass:252 dwarn:0 dfail:0 fail:0 skip:33 time:433s
fi-kbl-7500u total:285 pass:260 dwarn:1 dfail:0 fail:0 skip:24 time:474s
fi-kbl-7567u total:285 pass:265 dwarn:0 dfail:0 fail:0 skip:20 time:469s
fi-kbl-r total:285 pass:258 dwarn:0 dfail:0 fail:0 skip:27 time:509s
fi-pnv-d510 total:285 pass:220 dwarn:1 dfail:0 fail:0 skip:64 time:671s
fi-skl-6260u total:285 pass:265 dwarn:0 dfail:0 fail:0 skip:20 time:443s
fi-skl-6600u total:285 pass:258 dwarn:0 dfail:0 fail:0 skip:27 time:536s
fi-skl-6700k2 total:285 pass:261 dwarn:0 dfail:0 fail:0 skip:24 time:508s
fi-skl-6770hq total:285 pass:265 dwarn:0 dfail:0 fail:0 skip:20 time:511s
fi-skl-guc total:285 pass:257 dwarn:0 dfail:0 fail:0 skip:28 time:429s
fi-skl-gvtdvm total:285 pass:262 dwarn:0 dfail:0 fail:0 skip:23 time:447s
fi-snb-2520m total:285 pass:245 dwarn:0 dfail:0 fail:0 skip:40 time:566s
fi-snb-2600 total:285 pass:245 dwarn:0 dfail:0 fail:0 skip:40 time:406s
Blacklisted hosts:
fi-cnl-psr total:285 pass:256 dwarn:3 dfail:0 fail:0 skip:26 time:514s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1228/issues.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
* [igt-dev] ✗ Fi.CI.IGT: warning for Queued/runnable/running engine stats (rev2)
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
` (5 preceding siblings ...)
2018-04-05 14:05 ` [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2) Patchwork
@ 2018-04-05 16:45 ` Patchwork
6 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2018-04-05 16:45 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: Queued/runnable/running engine stats (rev2)
URL : https://patchwork.freedesktop.org/series/40217/
State : warning
== Summary ==
---- Possible new issues:
Test gem_pwrite:
Subgroup big-cpu-backwards:
pass -> SKIP (shard-apl)
Test kms_cursor_legacy:
Subgroup cursor-vs-flip-toggle:
fail -> PASS (shard-hsw)
---- Known issues:
Test kms_rotation_crc:
Subgroup primary-rotation-180:
fail -> PASS (shard-snb) fdo#103925
fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
shard-apl total:2761 pass:1837 dwarn:1 dfail:0 fail:51 skip:871 time:12677s
shard-hsw total:2761 pass:1789 dwarn:1 dfail:0 fail:45 skip:925 time:11560s
shard-snb total:2761 pass:1382 dwarn:1 dfail:0 fail:37 skip:1341 time:6917s
Blacklisted hosts:
shard-kbl total:2761 pass:1962 dwarn:1 dfail:0 fail:62 skip:736 time:9200s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1228/shards.html
end of thread, other threads:[~2018-04-05 16:45 UTC | newest]
Thread overview: 9+ messages
2018-04-05 12:40 [Intel-gfx] [PATCH i-g-t v2 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 2/5] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 3/5] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
2018-04-05 12:40 ` [igt-dev] [PATCH i-g-t 4/5] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
2018-04-05 12:40 ` [Intel-gfx] [PATCH i-g-t 5/5] tests/i915_query: Engine queues tests Tvrtko Ursulin
2018-04-05 14:05 ` [igt-dev] ✓ Fi.CI.BAT: success for Queued/runnable/running engine stats (rev2) Patchwork
2018-04-05 16:45 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2018-03-19 18:22 [Intel-gfx] [PATCH i-g-t 0/5] Queued/runnable/running engine stats Tvrtko Ursulin
2018-03-19 18:22 ` [igt-dev] [PATCH i-g-t 1/5] include: i915 uAPI headers Tvrtko Ursulin