* [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements
@ 2018-02-01 12:47 Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement Tvrtko Ursulin
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 12:47 UTC (permalink / raw)
To: igt-dev; +Cc: Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Some reworks to hopefully improve the reliability of tests on small core
machines.
Mostly ensuring measured sleep approach is always used in tests which look at
time, and that time spent setting up the test is ignored as much as possible.
Tvrtko Ursulin (3):
tests/perf_pmu: Tighten busy measurement
tests/perf_pmu: More busy measurement tightening
tests/perf_pmu: Use measured sleep in all time based tests
tests/perf_pmu.c | 102 ++++++++++++++++++++++++++++++++-----------------------
1 file changed, 60 insertions(+), 42 deletions(-)
--
2.14.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 12:47 [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements Tvrtko Ursulin
@ 2018-02-01 12:47 ` Tvrtko Ursulin
2018-02-01 12:57 ` Chris Wilson
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening Tvrtko Ursulin
` (3 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 12:47 UTC (permalink / raw)
To: igt-dev; +Cc: Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
In cases where we manually terminate the busy batch, we always want to
sample busyness while the batch is running, just before we will
terminate it, and not the other way around. This way we make the window
for unwated idleness getting sampled smaller.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 2f7d33414a53..bf16e5e8b1f9 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
spin = NULL;
slept = measured_usleep(batch_duration_ns / 1000);
- igt_spin_batch_end(spin);
-
val = pmu_read_single(fd);
+ igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd);
@@ -256,7 +255,7 @@ busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
gem_quiescent_gpu(gem_fd);
}
-static void log_busy(int fd, unsigned int num_engines, uint64_t *val)
+static void log_busy(unsigned int num_engines, uint64_t *val)
{
char buf[1024];
int rem = sizeof(buf);
@@ -303,14 +302,14 @@ busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
slept = measured_usleep(batch_duration_ns / 1000);
- igt_spin_batch_end(spin);
-
pmu_read_multi(fd[0], num_engines, val);
- log_busy(fd[0], num_engines, val);
+ igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
+ log_busy(num_engines, val);
+
assert_within_epsilon(val[busy_idx], slept, tolerance);
for (i = 0; i < num_engines; i++) {
if (i == busy_idx)
@@ -364,14 +363,14 @@ most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
fd[i] = open_group(val[i], fd[0]);
slept = measured_usleep(batch_duration_ns / 1000);
- igt_spin_batch_end(spin);
-
pmu_read_multi(fd[0], num_engines, val);
- log_busy(fd[0], num_engines, val);
+ igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
+ log_busy(num_engines, val);
+
for (i = 0; i < num_engines; i++) {
if (i == idle_idx)
assert_within_epsilon(val[i], 0.0f, tolerance);
@@ -420,14 +419,14 @@ all_busy_check_all(int gem_fd, const unsigned int num_engines)
fd[i] = open_group(val[i], fd[0]);
slept = measured_usleep(batch_duration_ns / 1000);
- igt_spin_batch_end(spin);
-
pmu_read_multi(fd[0], num_engines, val);
- log_busy(fd[0], num_engines, val);
+ igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
+ log_busy(num_engines, val);
+
for (i = 0; i < num_engines; i++)
assert_within_epsilon(val[i], slept, tolerance);
gem_quiescent_gpu(gem_fd);
@@ -903,12 +902,11 @@ static void cpu_hotplug(int gem_fd)
igt_waitchildren();
- igt_spin_batch_end(spin);
- gem_sync(gem_fd, spin->handle);
-
ref = igt_nsec_elapsed(&start);
val = pmu_read_single(fd);
+ igt_spin_batch_end(spin);
+ gem_sync(gem_fd, spin->handle);
igt_spin_batch_free(gem_fd, spin);
close(fd);
--
2.14.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening
2018-02-01 12:47 [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement Tvrtko Ursulin
@ 2018-02-01 12:47 ` Tvrtko Ursulin
2018-02-01 12:59 ` Chris Wilson
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 3/3] tests/perf_pmu: Use measured sleep in all time based tests Tvrtko Ursulin
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 12:47 UTC (permalink / raw)
To: igt-dev; +Cc: Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Where we use measured sleeps, take PMU samples immediately before and
after and look at their delta in order to minimize the effect of any
test setup delays.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 36 ++++++++++++++++++++++++++++--------
1 file changed, 28 insertions(+), 8 deletions(-)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index bf16e5e8b1f9..bdf452c8347c 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -145,8 +145,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
else
spin = NULL;
- slept = measured_usleep(batch_duration_ns / 1000);
val = pmu_read_single(fd);
+ slept = measured_usleep(batch_duration_ns / 1000);
+ val = pmu_read_single(fd) - val;
igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
@@ -180,8 +181,9 @@ busy_start(int gem_fd, const struct intel_execution_engine2 *e)
fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));
- slept = measured_usleep(batch_duration_ns / 1000);
val = pmu_read_single(fd);
+ slept = measured_usleep(batch_duration_ns / 1000);
+ val = pmu_read_single(fd) - val;
igt_spin_batch_free(gem_fd, spin);
close(fd);
@@ -227,8 +229,9 @@ busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
*/
fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));
- slept = measured_usleep(batch_duration_ns / 1000);
val = pmu_read_single(fd);
+ slept = measured_usleep(batch_duration_ns / 1000);
+ val = pmu_read_single(fd) - val;
igt_spin_batch_end(spin[0]);
igt_spin_batch_end(spin[1]);
@@ -279,6 +282,7 @@ busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
const unsigned int num_engines)
{
const struct intel_execution_engine2 *e_;
+ uint64_t tval[2][num_engines];
uint64_t val[num_engines];
int fd[num_engines];
unsigned long slept;
@@ -301,13 +305,17 @@ busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
igt_assert_eq(i, num_engines);
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
+ pmu_read_multi(fd[0], num_engines, tval[0]);
slept = measured_usleep(batch_duration_ns / 1000);
- pmu_read_multi(fd[0], num_engines, val);
+ pmu_read_multi(fd[0], num_engines, tval[1]);
igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
+ for (i = 0; i < num_engines; i++)
+ val[i] = tval[1][i] - tval[0][i];
+
log_busy(num_engines, val);
assert_within_epsilon(val[busy_idx], slept, tolerance);
@@ -324,6 +332,7 @@ most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
const unsigned int num_engines)
{
const struct intel_execution_engine2 *e_;
+ uint64_t tval[2][num_engines];
uint64_t val[num_engines];
int fd[num_engines];
unsigned long slept;
@@ -362,13 +371,17 @@ most_busy_check_all(int gem_fd, const struct intel_execution_engine2 *e,
for (i = 0; i < num_engines; i++)
fd[i] = open_group(val[i], fd[0]);
+ pmu_read_multi(fd[0], num_engines, tval[0]);
slept = measured_usleep(batch_duration_ns / 1000);
- pmu_read_multi(fd[0], num_engines, val);
+ pmu_read_multi(fd[0], num_engines, tval[1]);
igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
+ for (i = 0; i < num_engines; i++)
+ val[i] = tval[1][i] - tval[0][i];
+
log_busy(num_engines, val);
for (i = 0; i < num_engines; i++) {
@@ -384,6 +397,7 @@ static void
all_busy_check_all(int gem_fd, const unsigned int num_engines)
{
const struct intel_execution_engine2 *e;
+ uint64_t tval[2][num_engines];
uint64_t val[num_engines];
int fd[num_engines];
unsigned long slept;
@@ -418,13 +432,17 @@ all_busy_check_all(int gem_fd, const unsigned int num_engines)
for (i = 0; i < num_engines; i++)
fd[i] = open_group(val[i], fd[0]);
+ pmu_read_multi(fd[0], num_engines, tval[0]);
slept = measured_usleep(batch_duration_ns / 1000);
- pmu_read_multi(fd[0], num_engines, val);
+ pmu_read_multi(fd[0], num_engines, tval[1]);
igt_spin_batch_end(spin);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
+ for (i = 0; i < num_engines; i++)
+ val[i] = tval[1][i] - tval[0][i];
+
log_busy(num_engines, val);
for (i = 0; i < num_engines; i++)
@@ -765,13 +783,15 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
igt_spin_batch_set_timeout(spin, 2 * batch_duration_ns);
- slept = measured_usleep(batch_duration_ns / 1000);
+ val[0] = pmu_read_single(fd[0]);
val[1] = pmu_read_single(fd[1]);
+ slept = measured_usleep(batch_duration_ns / 1000);
+ val[1] = pmu_read_single(fd[1]) - val[1];
close(fd[1]);
gem_sync(gem_fd, spin->handle);
- val[0] = pmu_read_single(fd[0]);
+ val[0] = pmu_read_single(fd[0]) - val[0];
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
--
2.14.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [igt-dev] [PATCH i-g-t 3/3] tests/perf_pmu: Use measured sleep in all time based tests
2018-02-01 12:47 [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening Tvrtko Ursulin
@ 2018-02-01 12:47 ` Tvrtko Ursulin
2018-02-01 17:37 ` Chris Wilson
2018-02-01 13:22 ` [igt-dev] ✓ Fi.CI.BAT: success for perf_pmu reliability improvements Patchwork
2018-02-01 16:38 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
4 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 12:47 UTC (permalink / raw)
To: igt-dev; +Cc: Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Stop relying on timers to end spin batches but use measured sleep, which
was established to work better, in all time based tests.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 42 +++++++++++++++++++++---------------------
1 file changed, 21 insertions(+), 21 deletions(-)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index bdf452c8347c..e27e74abe686 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -454,30 +454,31 @@ static void
no_sema(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
{
igt_spin_t *spin;
- uint64_t val[2];
+ uint64_t val[2][2];
int fd;
fd = open_group(I915_PMU_ENGINE_SEMA(e->class, e->instance), -1);
open_group(I915_PMU_ENGINE_WAIT(e->class, e->instance), fd);
- if (busy) {
+ if (busy)
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
- igt_spin_batch_set_timeout(spin, batch_duration_ns);
- } else {
- usleep(batch_duration_ns / 1000);
- }
- if (busy)
- gem_sync(gem_fd, spin->handle);
+ pmu_read_multi(fd, 2, val[0]);
+ measured_usleep(batch_duration_ns / 1000);
+ pmu_read_multi(fd, 2, val[1]);
- pmu_read_multi(fd, 2, val);
+ val[0][0] = val[1][0] - val[0][0];
+ val[0][1] = val[1][1] - val[0][1];
- if (busy)
+ if (busy) {
+ igt_spin_batch_end(spin);
+ gem_sync(gem_fd, spin->handle);
igt_spin_batch_free(gem_fd, spin);
+ }
close(fd);
- assert_within_epsilon(val[0], 0.0f, tolerance);
- assert_within_epsilon(val[1], 0.0f, tolerance);
+ assert_within_epsilon(val[0][0], 0.0f, tolerance);
+ assert_within_epsilon(val[0][1], 0.0f, tolerance);
}
#define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags))
@@ -766,7 +767,7 @@ static void
multi_client(int gem_fd, const struct intel_execution_engine2 *e)
{
uint64_t config = I915_PMU_ENGINE_BUSY(e->class, e->instance);
- unsigned int slept;
+ unsigned int slept[2];
igt_spin_t *spin;
uint64_t val[2];
int fd[2];
@@ -781,23 +782,22 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
fd[1] = open_pmu(config);
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
- igt_spin_batch_set_timeout(spin, 2 * batch_duration_ns);
- val[0] = pmu_read_single(fd[0]);
- val[1] = pmu_read_single(fd[1]);
- slept = measured_usleep(batch_duration_ns / 1000);
+ val[0] = val[1] = pmu_read_single(fd[0]);
+ slept[1] = measured_usleep(batch_duration_ns / 1000);
val[1] = pmu_read_single(fd[1]) - val[1];
close(fd[1]);
- gem_sync(gem_fd, spin->handle);
-
+ slept[0] = measured_usleep(batch_duration_ns / 1000) + slept[1];
val[0] = pmu_read_single(fd[0]) - val[0];
+ igt_spin_batch_end(spin);
+ gem_sync(gem_fd, spin->handle);
igt_spin_batch_free(gem_fd, spin);
close(fd[0]);
- assert_within_epsilon(val[0], 2 * batch_duration_ns, tolerance);
- assert_within_epsilon(val[1], slept, tolerance);
+ assert_within_epsilon(val[0], slept[0], tolerance);
+ assert_within_epsilon(val[1], slept[1], tolerance);
}
/**
--
2.14.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement Tvrtko Ursulin
@ 2018-02-01 12:57 ` Chris Wilson
2018-02-01 16:26 ` Tvrtko Ursulin
0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 12:57 UTC (permalink / raw)
To: Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-01 12:47:44)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> In cases where we manually terminate the busy batch, we always want to
> sample busyness while the batch is running, just before we will
> terminate it, and not the other way around. This way we make the window
> for unwated idleness getting sampled smaller.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> tests/perf_pmu.c | 28 +++++++++++++---------------
> 1 file changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 2f7d33414a53..bf16e5e8b1f9 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
> spin = NULL;
>
> slept = measured_usleep(batch_duration_ns / 1000);
> - igt_spin_batch_end(spin);
> -
> val = pmu_read_single(fd);
>
> + igt_spin_batch_end(spin);
But that's the wrong way round as we are measuring busyness, and the
sampling should terminate as soon as the spin-batch ends, before we even
read the PMU sample? For the timer sampler, it's lost in the noise.
So the idea was to cancel the busyness asap so that the sampler stops
updating before we even have to cross into the kernel for the PMU read.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening Tvrtko Ursulin
@ 2018-02-01 12:59 ` Chris Wilson
2018-02-01 16:37 ` Tvrtko Ursulin
0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 12:59 UTC (permalink / raw)
To: Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-01 12:47:45)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Where we use measured sleeps, take PMU samples immediately before and
> after and look at their delta in order to minimize the effect of any
> test setup delays.
The system and pmu were meant to be idle at the start of the test,
right? So val should always be zero?
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* [igt-dev] ✓ Fi.CI.BAT: success for perf_pmu reliability improvements
2018-02-01 12:47 [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements Tvrtko Ursulin
` (2 preceding siblings ...)
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 3/3] tests/perf_pmu: Use measured sleep in all time based tests Tvrtko Ursulin
@ 2018-02-01 13:22 ` Patchwork
2018-02-01 16:38 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
4 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-02-01 13:22 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: perf_pmu reliability improvements
URL : https://patchwork.freedesktop.org/series/37481/
State : success
== Summary ==
IGT patchset tested on top of latest successful build
689ea8ddcaf769c14b26df8c0fcd8fcc74e06dd2 igt/gem_exec_params: Drop drm master privileges only on drm master fds
with latest DRM-Tip kernel build CI_DRM_3711
4244f989341c drm-tip: 2018y-02m-01d-12h-39m-55s UTC integration manifest
No testlist changes.
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:426s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:425s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:372s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:489s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:282s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:483s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:485s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:468s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:456s
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:566s
fi-elk-e7500 total:288 pass:229 dwarn:0 dfail:0 fail:0 skip:59 time:418s
fi-gdg-551 total:288 pass:179 dwarn:0 dfail:0 fail:1 skip:108 time:283s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:509s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:393s
fi-hsw-4770r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:401s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:410s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:449s
fi-ivb-3770 total:288 pass:255 dwarn:0 dfail:0 fail:0 skip:33 time:412s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:459s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:496s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:454s
fi-kbl-r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:503s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:578s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:426s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:507s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:526s
fi-skl-6700k2 total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:486s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:485s
fi-skl-guc total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:417s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:429s
fi-snb-2520m total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:528s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:399s
Blacklisted hosts:
fi-glk-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:468s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_848/issues.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 12:57 ` Chris Wilson
@ 2018-02-01 16:26 ` Tvrtko Ursulin
2018-02-01 16:39 ` Chris Wilson
0 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 16:26 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
On 01/02/2018 12:57, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 12:47:44)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> In cases where we manually terminate the busy batch, we always want to
>> sample busyness while the batch is running, just before we will
>> terminate it, and not the other way around. This way we make the window
>> for unwated idleness getting sampled smaller.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> tests/perf_pmu.c | 28 +++++++++++++---------------
>> 1 file changed, 13 insertions(+), 15 deletions(-)
>>
>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>> index 2f7d33414a53..bf16e5e8b1f9 100644
>> --- a/tests/perf_pmu.c
>> +++ b/tests/perf_pmu.c
>> @@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
>> spin = NULL;
>>
>> slept = measured_usleep(batch_duration_ns / 1000);
>> - igt_spin_batch_end(spin);
>> -
>> val = pmu_read_single(fd);
>>
>> + igt_spin_batch_end(spin);
>
> But that's the wrong way round as we are measuring busyness, and the
> sampling should terminate as soon as the spin-batch ends, before we even
> read the PMU sample? For the timer sampler, it's lost in the noise.
>
> So the idea was to cancel the busyness asap so that the sampler stops
> updating before we even have to cross into the kernel for the PMU read.
I don't follow on the problem statement. This is how code used to looks
in many places:
slept = measured_usleep(batch_duration_ns / 1000);
igt_spin_batch_end(spin);
pmu_read_multi(fd[0], num_engines, val);
Problem here is there is indeterministic time gap, depending on test
execution speed, between requesting the batch to end to reading the
counter. This can add a random amount of idleness to the read value.
And another indeterminism in how long it takes for the batch end request
to get picked up by the GPU. If that is slower in some cases, the
counter will drift from the expected value toward other direction,
overestimating busyness relative to sleep duration.
Attempted improvement was simply to reverse the last two lines, so we
read the counter when we know it is busy (after the sleep), and then
request batch termination.
This only leaves the scheduling delay between end of sleep and counter
read, which is smaller than end of sleep - batch end - counter read.
These tests are not testing for edge conditions, just that the busy
engines are reported as busy, in various combinations, so that sounded
like a reasonable change.
I hope I did not get confused here, it wouldn't be the first time in
these tests...
Regards,
Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening
2018-02-01 12:59 ` Chris Wilson
@ 2018-02-01 16:37 ` Tvrtko Ursulin
2018-02-01 16:48 ` Chris Wilson
0 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 16:37 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
On 01/02/2018 12:59, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 12:47:45)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Where we use measured sleeps, take PMU samples immediately before and
>> after and look at their delta in order to minimize the effect of any
>> test setup delays.
>
> The system and pmu were meant to be idle at the start of the test,
> right? So val should always be zero?
Yes, but there is a time delay between starting the counters and
applying busyness. For instance, busy-check-all, current version:
... pmu open somewhere before ...
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
slept = measured_usleep(batch_duration_ns / 1000);
pmu_read_multi(fd[0], num_engines, val);
In this case the slept value vs the read busyness will miss a tiny bit
between igt_spin_batch_new to measured_usleep. Probably minimal indeed,
but I thought just for extra safety to take explicit initial read just
before the sleep, so:
spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
pmu_read_multi(fd[0], num_engines, tval[0]);
slept = measured_usleep(batch_duration_ns / 1000);
pmu_read_multi(fd[0], num_engines, tval[1]);
More importantly, it is a potentially larger time delta in tests which
open multiple counters after starting the spinner. Like
most_busy_check_all for instance:
... start spin batch...
for (i = 0; i < num_engines; i++)
fd[i] = open_group(val[i], fd[0]);
slept = measured_usleep(batch_duration_ns / 1000);
pmu_read_multi(fd[0], num_engines, val);
So the counter value relative to slept value will include time spent
opening num_engines event. Once again change to take an explicit initial
value just before the sleep looked reasonable to me.
Regards,
Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* [igt-dev] ✗ Fi.CI.IGT: warning for perf_pmu reliability improvements
2018-02-01 12:47 [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements Tvrtko Ursulin
` (3 preceding siblings ...)
2018-02-01 13:22 ` [igt-dev] ✓ Fi.CI.BAT: success for perf_pmu reliability improvements Patchwork
@ 2018-02-01 16:38 ` Patchwork
4 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-02-01 16:38 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: perf_pmu reliability improvements
URL : https://patchwork.freedesktop.org/series/37481/
State : warning
== Summary ==
Test kms_flip:
Subgroup plain-flip-ts-check:
fail -> PASS (shard-hsw) fdo#100368
Subgroup 2x-flip-vs-expired-vblank:
pass -> FAIL (shard-hsw) fdo#102887
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
skip -> PASS (shard-snb) fdo#103375
Test kms_vblank:
Subgroup pipe-a-wait-idle-hang:
pass -> SKIP (shard-snb)
Subgroup pipe-a-accuracy-idle:
fail -> PASS (shard-apl) fdo#102583
Test gem_eio:
Subgroup in-flight-contexts:
fail -> PASS (shard-hsw) fdo#104676
Test kms_setmode:
Subgroup basic:
pass -> FAIL (shard-hsw) fdo#99912
Test kms_sysfs_edid_timing:
warn -> PASS (shard-apl) fdo#100047
Test perf:
Subgroup enable-disable:
pass -> FAIL (shard-apl) fdo#103715
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
fdo#102583 https://bugs.freedesktop.org/show_bug.cgi?id=102583
fdo#104676 https://bugs.freedesktop.org/show_bug.cgi?id=104676
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
fdo#103715 https://bugs.freedesktop.org/show_bug.cgi?id=103715
shard-apl total:2838 pass:1751 dwarn:1 dfail:0 fail:22 skip:1064 time:12530s
shard-hsw total:2838 pass:1735 dwarn:1 dfail:0 fail:11 skip:1090 time:11737s
shard-snb total:2838 pass:1329 dwarn:1 dfail:0 fail:10 skip:1498 time:6602s
Blacklisted hosts:
shard-kbl total:2827 pass:1865 dwarn:1 dfail:0 fail:22 skip:938 time:9349s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_848/shards.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 16:26 ` Tvrtko Ursulin
@ 2018-02-01 16:39 ` Chris Wilson
2018-02-01 16:58 ` Tvrtko Ursulin
0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 16:39 UTC (permalink / raw)
To: Tvrtko Ursulin, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-01 16:26:58)
>
> On 01/02/2018 12:57, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-01 12:47:44)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> In cases where we manually terminate the busy batch, we always want to
> >> sample busyness while the batch is running, just before we will
> >> terminate it, and not the other way around. This way we make the window
> >> for unwated idleness getting sampled smaller.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> tests/perf_pmu.c | 28 +++++++++++++---------------
> >> 1 file changed, 13 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> >> index 2f7d33414a53..bf16e5e8b1f9 100644
> >> --- a/tests/perf_pmu.c
> >> +++ b/tests/perf_pmu.c
> >> @@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
> >> spin = NULL;
> >>
> >> slept = measured_usleep(batch_duration_ns / 1000);
> >> - igt_spin_batch_end(spin);
> >> -
> >> val = pmu_read_single(fd);
> >>
> >> + igt_spin_batch_end(spin);
> >
> > But that's the wrong way round as we are measuring busyness, and the
> > sampling should terminate as soon as the spin-batch ends, before we even
> > read the PMU sample? For the timer sampler, it's lost in the noise.
> >
> > So the idea was to cancel the busyness asap so that the sampler stops
> > updating before we even have to cross into the kernel for the PMU read.
>
> I don't follow on the problem statement. This is how code used to looks
> in many places:
>
> slept = measured_usleep(batch_duration_ns / 1000);
> igt_spin_batch_end(spin);
>
> pmu_read_multi(fd[0], num_engines, val);
>
> Problem here is there is indeterministic time gap, depending on test
> execution speed, between requesting the batch to end to reading the
> counter. This can add a random amount of idleness to the read value.
But we are not measuring idleness, but busyness. The batch will end a
few tens of nanoseconds after the write hits memory. The desire is that
time is zero so that the sleep exactly corresponds with the interval the
batch was spinning (busy).
> And another indeterminism in how long it takes for the batch end request
> to get picked up by the GPU. If that is slower in some cases, the
> counter will drift from the expected value toward other direction,
> overestimating busyness relative to sleep duration.
>
> Attempted improvement was simply to reverse the last two lines, so we
> read the counter when we know it is busy (after the sleep), and then
> request batch termination.
>
> This only leaves the scheduling delay between end of sleep and counter
> read, which is smaller than end of sleep - batch end - counter read.
>
> These tests are not testing for edge conditions, just that the busy
> engines are reported as busy, in various combinations, so that sounded
> like a reasonable change.
>
> I hope I did not get confused here, it wouldn't be the first time in
> these tests...
Aiui, the PMU measure busyness which is the duration of the spin-batch
(or first enabling of the PMU event). The choice is between a context
switch into the kernel to stop the counter, or a single write to memory
to stop the batch).
My bet is the write to memory will turn off the counter within 100ns,
worst case including the interrupt processing. The PMU event stopping
I estimate 100ns best case (including the kernel context switch, and
don't ask how KPTI perturbs that switch as I don't know off hand how
much worse it will get).
Now, the write to memory is asynchronous to the PMU event stopping. So
if you write first, whichever is processed first (the kernel context
switch + event stop, or the CS interrupt) causes the busy counter to
cease. So best of both worlds?
I too may be wrong...
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening
2018-02-01 16:37 ` Tvrtko Ursulin
@ 2018-02-01 16:48 ` Chris Wilson
2018-02-01 17:02 ` Tvrtko Ursulin
0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 16:48 UTC (permalink / raw)
To: Tvrtko Ursulin, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-01 16:37:29)
>
> On 01/02/2018 12:59, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-01 12:47:45)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Where we use measured sleeps, take PMU samples immediately before and
> >> after and look at their delta in order to minimize the effect of any
> >> test setup delays.
> >
> > The system and pmu were meant to be idle at the start of the test,
> > right? So val should always be zero?
>
> Yes, but there is a time delay between starting the counters and
> applying busyness. For instance, busy-check-all, current version:
>
> ... pmu open somewhere before ...
>
> spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> slept = measured_usleep(batch_duration_ns / 1000);
> pmu_read_multi(fd[0], num_engines, val);
>
> In this case the slept value vs the read busyness will miss a tiny bit
> between igt_spin_batch_new to measured_usleep. Probably minimal indeed,
> but I thought just for extra safety to take explicit initial read just
> before the sleep, so:
>
> spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> pmu_read_multi(fd[0], num_engines, tval[0]);
Time gap here as well. How do we know this is better than before?
> slept = measured_usleep(batch_duration_ns / 1000);
> pmu_read_multi(fd[0], num_engines, tval[1]);
>
> More importantly, it is a potentially larger time delta in tests which
> open multiple counters after starting the spinner. Like
> most_busy_check_all for instance:
>
> ... start spin batch...
>
> for (i = 0; i < num_engines; i++)
> fd[i] = open_group(val[i], fd[0]);
>
> slept = measured_usleep(batch_duration_ns / 1000);
> pmu_read_multi(fd[0], num_engines, val);
>
> So the counter value relative to slept value will include time spent
> opening num_engines event. Once again change to take an explicit initial
> value just before the sleep looked reasonable to me.
I was working on open being minimal delay and insignificant. I have no
idea what the relative costs are. That would be nice to know.
The issue I have is that the scheduler can preempt us at time (so
other than the argument one is quicker and so gives less systematic
error in the ideal case), we are at the mercy of the scheduler which can
inject unknown sleeps between any point. I fear we may need RT, mlocking
and more?, but would much rather avoid it.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(in exchange for a small test to benchmark open_(single|group),
read_(single|group), pretty please :)
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 16:39 ` Chris Wilson
@ 2018-02-01 16:58 ` Tvrtko Ursulin
2018-02-01 17:08 ` Chris Wilson
0 siblings, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 16:58 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
On 01/02/2018 16:39, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 16:26:58)
>>
>> On 01/02/2018 12:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-02-01 12:47:44)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> In cases where we manually terminate the busy batch, we always want to
>>>> sample busyness while the batch is running, just before we will
>>>> terminate it, and not the other way around. This way we make the window
>>>> for unwated idleness getting sampled smaller.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> tests/perf_pmu.c | 28 +++++++++++++---------------
>>>> 1 file changed, 13 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>>>> index 2f7d33414a53..bf16e5e8b1f9 100644
>>>> --- a/tests/perf_pmu.c
>>>> +++ b/tests/perf_pmu.c
>>>> @@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
>>>> spin = NULL;
>>>>
>>>> slept = measured_usleep(batch_duration_ns / 1000);
>>>> - igt_spin_batch_end(spin);
>>>> -
>>>> val = pmu_read_single(fd);
>>>>
>>>> + igt_spin_batch_end(spin);
>>>
>>> But that's the wrong way round as we are measuring busyness, and the
>>> sampling should terminate as soon as the spin-batch ends, before we even
>>> read the PMU sample? For the timer sampler, it's lost in the noise.
>>>
>>> So the idea was to cancel the busyness asap so that the sampler stops
>>> updating before we even have to cross into the kernel for the PMU read.
>>
>> I don't follow on the problem statement. This is how code used to looks
>> in many places:
>>
>> slept = measured_usleep(batch_duration_ns / 1000);
>> igt_spin_batch_end(spin);
>>
>> pmu_read_multi(fd[0], num_engines, val);
>>
>> Problem here is there is indeterministic time gap, depending on test
>> execution speed, between requesting the batch to end to reading the
>> counter. This can add a random amount of idleness to the read value.
>
> But we are not measuring idleness, but busyness. The batch will end a
Yep, and that's why I removed some idleness from the busyness! :) Plus
removed the time required to signal batch end from it as well.
> few tens of nanoseconds after the write hits memory. The desire is that
> time is zero so that the sleep exactly corresponds with the interval the
> batch was spinning (busy).
>
>> And another indeterminism in how long it takes for the batch end request
>> to get picked up by the GPU. If that is slower in some cases, the
>> counter will drift from the expected value toward other direction,
>> overestimating busyness relative to sleep duration.
>>
>> Attempted improvement was simply to reverse the last two lines, so we
>> read the counter when we know it is busy (after the sleep), and then
>> request batch termination.
>>
>> This only leaves the scheduling delay between end of sleep and counter
>> read, which is smaller than end of sleep - batch end - counter read.
>>
>> These tests are not testing for edge conditions, just that the busy
>> engines are reported as busy, in various combinations, so that sounded
>> like a reasonable change.
>>
>> I hope I did not get confused here, it wouldn't be the first time in
>> these tests...
>
> Aiui, the PMU measure busyness which is the duration of the spin-batch
> (or first enabling of the PMU event). The choice is between a context
> switch into the kernel to stop the counter, or a single write to memory
> to stop the batch).
>
> My bet is the write to memory will turn off the counter within 100ns,
> worst case including the interrupt processing. The PMU event stopping
> I estimate 100ns best case (including the kernel context switch, and
> don't ask how KPTI perturbs that switch as I don't know off hand how
> much worse it will get).
There is not PMU event stopping in the changed bits, so not being under
measure either before or after.
> Now, the write to memory is asynchronous to the PMU event stopping. So
> if you write first, whichever is processed first (the kernel context
> switch + event stop, or the CS interrupt) causes the busy counter to
> cease. So best of both worlds?
I am just moving the two sample read-outs closer together in time. Two
samples are measure of how long we slept, and measure of PMU busyness.
There should be as little time as possible between the two readouts.
Ending the spinner I don't see how it belongs in this sandwich, or
anything else really.
Regards,
Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening
2018-02-01 16:48 ` Chris Wilson
@ 2018-02-01 17:02 ` Tvrtko Ursulin
0 siblings, 0 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 17:02 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
On 01/02/2018 16:48, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 16:37:29)
>>
>> On 01/02/2018 12:59, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-02-01 12:47:45)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Where we use measured sleeps, take PMU samples immediately before and
>>>> after and look at their delta in order to minimize the effect of any
>>>> test setup delays.
>>>
>>> The system and pmu were meant to be idle at the start of the test,
>>> right? So val should always be zero?
>>
>> Yes, but there is a time delay between starting the counters and
>> applying busyness. For instance, busy-check-all, current version:
>>
>> ... pmu open somewhere before ...
>>
>> spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
>> slept = measured_usleep(batch_duration_ns / 1000);
>> pmu_read_multi(fd[0], num_engines, val);
>>
>> In this case the slept value vs the read busyness will miss a tiny bit
>> between igt_spin_batch_new to measured_usleep. Probably minimal indeed,
>> but I thought just for extra safety to take explicit initial read just
>> before the sleep, so:
>>
>> spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
>> pmu_read_multi(fd[0], num_engines, tval[0]);
>
> Time gap here as well. How do we know this is better than before?
It is theoretically marginally better on the leading edge in this
example. Is it practically significant I don't know. At least it isn't
worse. I can compare the numbers before and after once the results get
in. I suspect Apollo Lake will the interesting machine to look at.
>> slept = measured_usleep(batch_duration_ns / 1000);
>> pmu_read_multi(fd[0], num_engines, tval[1]);
>>
>> More importantly, it is a potentially larger time delta in tests which
>> open multiple counters after starting the spinner. Like
>> most_busy_check_all for instance:
>>
>> ... start spin batch...
>>
>> for (i = 0; i < num_engines; i++)
>> fd[i] = open_group(val[i], fd[0]);
>>
>> slept = measured_usleep(batch_duration_ns / 1000);
>> pmu_read_multi(fd[0], num_engines, val);
>>
>> So the counter value relative to slept value will include time spent
>> opening num_engines event. Once again change to take an explicit initial
>> value just before the sleep looked reasonable to me.
>
> I was working on open being minimal delay and insignificant. I have no
> idea what the relative costs are. That would be nice to know.
>
> The issue I have is that the scheduler can preempt us at time (so
> other than the argument one is quicker and so gives less systematic
> error in the ideal case), we are at the mercy of the scheduler which can
> inject unknown sleeps between any point. I fear we may need RT, mlocking
> and more?, but would much rather avoid it.
Yep, that was exactly my thinking. Lets avoid using big ugly hammers
(RT) unless there is no other way. With this series applied, if there
are still random unexplained discrepancies, I think there will be
nothing else to try than to go RT.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> (in exchange for a small test to benchmark open_(single|group),
> read_(single|group), pretty please :)
Okay, marking the email as TODO.
Regards,
Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 16:58 ` Tvrtko Ursulin
@ 2018-02-01 17:08 ` Chris Wilson
2018-02-01 17:16 ` Chris Wilson
2018-02-01 17:20 ` Tvrtko Ursulin
0 siblings, 2 replies; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 17:08 UTC (permalink / raw)
To: Tvrtko Ursulin, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-01 16:58:21)
>
> I am just moving the two sample read-outs closer together in time. Two
> samples are measure of how long we slept, and measure of PMU busyness.
> There should be as little time as possible between the two readouts.
> Ending the spinner I don't see how it belongs in this sandwich, or
> anything else really.
Then we disagree on what we think we are measuring. The usleep should
equal the busy value defined by the spin-batch.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 17:08 ` Chris Wilson
@ 2018-02-01 17:16 ` Chris Wilson
2018-02-01 17:34 ` Chris Wilson
2018-02-01 17:20 ` Tvrtko Ursulin
1 sibling, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 17:16 UTC (permalink / raw)
To: Tvrtko Ursulin, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Chris Wilson (2018-02-01 17:08:09)
> Quoting Tvrtko Ursulin (2018-02-01 16:58:21)
> >
> > I am just moving the two sample read-outs closer together in time. Two
> > samples are measure of how long we slept, and measure of PMU busyness.
> > There should be as little time as possible between the two readouts.
> > Ending the spinner I don't see how it belongs in this sandwich, or
> > anything else really.
>
> Then we disagree on what we think we are measuring. The usleep should
> equal the busy value defined by the spin-batch.
How about if it was phrased:
create spin-batch per-engine queued on a unsignal vgem
open_group
signal vgem // submit all queued spinners simultaneously (well, that's our goal)
duration = measured_usleep
end spinners
busy = read_group
That's what I want to measure. The passage of time outside of the
spinner phase is then irrelevant, and the error inside is something that
we are trying to minimise as part of the driver overhead.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 17:08 ` Chris Wilson
2018-02-01 17:16 ` Chris Wilson
@ 2018-02-01 17:20 ` Tvrtko Ursulin
1 sibling, 0 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 17:20 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
On 01/02/2018 17:08, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 16:58:21)
>>
>> I am just moving the two sample read-outs closer together in time. Two
>> samples are measure of how long we slept, and measure of PMU busyness.
>> There should be as little time as possible between the two readouts.
>> Ending the spinner I don't see how it belongs in this sandwich, or
>> anything else really.
>
> Then we disagree on what we think we are measuring. The usleep should
> equal the busy value defined by the spin-batch.
I agree, and that is why there should be no extra work between the sleep
ending and PMU read-out.
Sleep is defining the spin-batch duration and that is the most accurate
value for it we got.
Regards,
Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement
2018-02-01 17:16 ` Chris Wilson
@ 2018-02-01 17:34 ` Chris Wilson
0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 17:34 UTC (permalink / raw)
To: Tvrtko Ursulin, Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Chris Wilson (2018-02-01 17:16:28)
> Quoting Chris Wilson (2018-02-01 17:08:09)
> > Quoting Tvrtko Ursulin (2018-02-01 16:58:21)
> > >
> > > I am just moving the two sample read-outs closer together in time. Two
> > > samples are measure of how long we slept, and measure of PMU busyness.
> > > There should be as little time as possible between the two readouts.
> > > Ending the spinner I don't see how it belongs in this sandwich, or
> > > anything else really.
> >
> > Then we disagree on what we think we are measuring. The usleep should
> > equal the busy value defined by the spin-batch.
>
> How about if it was phrased:
>
> create spin-batch per-engine queued on a unsignal vgem
> open_group
>
> signal vgem // submit all queued spinners simultaneously (well, that's our goal)
> duration = measured_usleep
> end spinners
>
> busy = read_group
>
> That's what I want to measure. The passage of time outside of the
> spinner phase is then irrelevant, and the error inside is something that
> we are trying to minimise as part of the driver overhead.
Ok, so we are looking at slightly different angles here. The patch
applies your idea of excluding the vagaries in controlling the spinning
batch from measuring the accuracy of the PMU event itself,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
But I think there's some merit in my pov still :)
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [igt-dev] [PATCH i-g-t 3/3] tests/perf_pmu: Use measured sleep in all time based tests
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 3/3] tests/perf_pmu: Use measured sleep in all time based tests Tvrtko Ursulin
@ 2018-02-01 17:37 ` Chris Wilson
0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-02-01 17:37 UTC (permalink / raw)
To: Tvrtko Ursulin, igt-dev; +Cc: Tvrtko Ursulin
Quoting Tvrtko Ursulin (2018-02-01 12:47:46)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Stop relying on timers to end spin batches but use measured sleep, which
> was established to work better, in all time based tests.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> tests/perf_pmu.c | 42 +++++++++++++++++++++---------------------
> 1 file changed, 21 insertions(+), 21 deletions(-)
>
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index bdf452c8347c..e27e74abe686 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -454,30 +454,31 @@ static void
> no_sema(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
> {
> igt_spin_t *spin;
> - uint64_t val[2];
> + uint64_t val[2][2];
> int fd;
>
> fd = open_group(I915_PMU_ENGINE_SEMA(e->class, e->instance), -1);
> open_group(I915_PMU_ENGINE_WAIT(e->class, e->instance), fd);
>
> - if (busy) {
> + if (busy)
> spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> - igt_spin_batch_set_timeout(spin, batch_duration_ns);
> - } else {
> - usleep(batch_duration_ns / 1000);
> - }
>
> - if (busy)
> - gem_sync(gem_fd, spin->handle);
> + pmu_read_multi(fd, 2, val[0]);
> + measured_usleep(batch_duration_ns / 1000);
> + pmu_read_multi(fd, 2, val[1]);
>
> - pmu_read_multi(fd, 2, val);
> + val[0][0] = val[1][0] - val[0][0];
> + val[0][1] = val[1][1] - val[0][1];
>
> - if (busy)
> + if (busy) {
> + igt_spin_batch_end(spin);
> + gem_sync(gem_fd, spin->handle);
> igt_spin_batch_free(gem_fd, spin);
> + }
> close(fd);
>
> - assert_within_epsilon(val[0], 0.0f, tolerance);
> - assert_within_epsilon(val[1], 0.0f, tolerance);
> + assert_within_epsilon(val[0][0], 0.0f, tolerance);
> + assert_within_epsilon(val[0][1], 0.0f, tolerance);
> }
This one looks innocent, but I don't see anything gained from being
inconsistent here. (Most bug discovery is serendipitous, so I welcome
inconsistency between tests :)
> #define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags))
> @@ -766,7 +767,7 @@ static void
> multi_client(int gem_fd, const struct intel_execution_engine2 *e)
> {
> uint64_t config = I915_PMU_ENGINE_BUSY(e->class, e->instance);
> - unsigned int slept;
> + unsigned int slept[2];
> igt_spin_t *spin;
> uint64_t val[2];
> int fd[2];
> @@ -781,23 +782,22 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
> fd[1] = open_pmu(config);
>
> spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> - igt_spin_batch_set_timeout(spin, 2 * batch_duration_ns);
>
> - val[0] = pmu_read_single(fd[0]);
> - val[1] = pmu_read_single(fd[1]);
> - slept = measured_usleep(batch_duration_ns / 1000);
> + val[0] = val[1] = pmu_read_single(fd[0]);
> + slept[1] = measured_usleep(batch_duration_ns / 1000);
> val[1] = pmu_read_single(fd[1]) - val[1];
> close(fd[1]);
>
> - gem_sync(gem_fd, spin->handle);
> -
> + slept[0] = measured_usleep(batch_duration_ns / 1000) + slept[1];
> val[0] = pmu_read_single(fd[0]) - val[0];
>
> + igt_spin_batch_end(spin);
> + gem_sync(gem_fd, spin->handle);
> igt_spin_batch_free(gem_fd, spin);
> close(fd[0]);
>
> - assert_within_epsilon(val[0], 2 * batch_duration_ns, tolerance);
> - assert_within_epsilon(val[1], slept, tolerance);
> + assert_within_epsilon(val[0], slept[0], tolerance);
> + assert_within_epsilon(val[1], slept[1], tolerance);
Whereas this should, given our experience elsewhere with unknown sleeps,
make a difference.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2018-02-01 17:37 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-01 12:47 [igt-dev] [PATCH i-g-t 0/3] perf_pmu reliability improvements Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement Tvrtko Ursulin
2018-02-01 12:57 ` Chris Wilson
2018-02-01 16:26 ` Tvrtko Ursulin
2018-02-01 16:39 ` Chris Wilson
2018-02-01 16:58 ` Tvrtko Ursulin
2018-02-01 17:08 ` Chris Wilson
2018-02-01 17:16 ` Chris Wilson
2018-02-01 17:34 ` Chris Wilson
2018-02-01 17:20 ` Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening Tvrtko Ursulin
2018-02-01 12:59 ` Chris Wilson
2018-02-01 16:37 ` Tvrtko Ursulin
2018-02-01 16:48 ` Chris Wilson
2018-02-01 17:02 ` Tvrtko Ursulin
2018-02-01 12:47 ` [igt-dev] [PATCH i-g-t 3/3] tests/perf_pmu: Use measured sleep in all time based tests Tvrtko Ursulin
2018-02-01 17:37 ` Chris Wilson
2018-02-01 13:22 ` [igt-dev] ✓ Fi.CI.BAT: success for perf_pmu reliability improvements Patchwork
2018-02-01 16:38 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox