From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
Tvrtko Ursulin <tursulin@ursulin.net>,
igt-dev@lists.freedesktop.org
Cc: Intel-gfx@lists.freedesktop.org,
Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Subject: Re: [igt-dev] [PATCH i-g-t] tests/perf_pmu: Improve accuracy by waiting on spinner to start
Date: Thu, 15 Mar 2018 13:36:26 +0000 [thread overview]
Message-ID: <1a8be18d-e456-b3b0-3483-d95594ea8b27@linux.intel.com> (raw)
In-Reply-To: <152111966955.25315.15544334273574135347@mail.alporthouse.com>
On 15/03/2018 13:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-03-15 12:56:17)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> More than one test assumes that the spinner is running pretty much
>> immediately after we have create or submitted it.
>>
>> In actuality there is a variable delay, especially on execlists platforms,
>> between submission and spin batch starting to run on the hardware.
>>
>> To enable tests which care about this level of timing to account for this,
>> we add a new spin batch constructor which provides an output field which
>> can be polled to determine when the batch actually started running.
>>
>> This is implemented via MI_STOREDW_IMM from the spin batch, writing into
>> memory mapped page shared with userspace.
>>
>> Using this facility from perf_pmu, where applicable, should improve very
>> occasional test fails across the set and platforms.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>> lib/igt_dummyload.c | 99 +++++++++++++++++++++++++++++++----
>> lib/igt_dummyload.h | 9 ++++
>> tests/perf_pmu.c | 145 +++++++++++++++++++++++++++++++++++-----------------
>> 3 files changed, 196 insertions(+), 57 deletions(-)
>>
>> diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c
>> index 4b20f23dfe26..0447d2f14d57 100644
>> --- a/lib/igt_dummyload.c
>> +++ b/lib/igt_dummyload.c
>> @@ -74,9 +74,12 @@ fill_reloc(struct drm_i915_gem_relocation_entry *reloc,
>> reloc->write_domain = write_domains;
>> }
>>
>> -static int emit_recursive_batch(igt_spin_t *spin,
>> - int fd, uint32_t ctx, unsigned engine,
>> - uint32_t dep, bool out_fence)
>> +#define OUT_FENCE (1 << 0)
>> +#define POLL_RUN (1 << 1)
>> +
>> +static int
>> +emit_recursive_batch(igt_spin_t *spin, int fd, uint32_t ctx, unsigned engine,
>> + uint32_t dep, unsigned int flags)
>> {
>> #define SCRATCH 0
>> #define BATCH 1
>> @@ -116,6 +119,8 @@ static int emit_recursive_batch(igt_spin_t *spin,
>> execbuf.buffer_count++;
>>
>> if (dep) {
>> + igt_assert(!(flags & POLL_RUN));
>> +
>
> Challenge left to the reader :)
Well not the reader, whoever gets to need both. :)
>> /* dummy write to dependency */
>> obj[SCRATCH].handle = dep;
>> fill_reloc(&relocs[obj[BATCH].relocation_count++],
>> @@ -123,6 +128,41 @@ static int emit_recursive_batch(igt_spin_t *spin,
>> I915_GEM_DOMAIN_RENDER,
>> I915_GEM_DOMAIN_RENDER);
>> execbuf.buffer_count++;
>> + } else if (flags & POLL_RUN) {
>> + unsigned int offset;
>> +
>> + igt_assert(!dep);
>> +
>> + spin->poll_handle = gem_create(fd, 4096);
>> + spin->running = __gem_mmap__wc(fd, spin->poll_handle,
>> + 0, 4096, PROT_READ | PROT_WRITE);
>
> Use mmap_cpu and gem_set_caching().
Wouldn't that get us into coherency issues on some platforms? I keep the
page mapped for API users to poll on.
>
>> + igt_assert(spin->running);
>> + igt_assert_eq(*spin->running, 0);
>> +
>> + *batch++ = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
>
> Hmm, have we forgot the (len-2) or is this an unusual command that knows
> its own length?
I lifted the code from elsewhere.
>
>> +
>> + if (gen >= 8) {
>> + offset = sizeof(uint32_t);
>> + *batch++ = 0;
>> + *batch++ = 0;
>> + } else if (gen >= 4) {
>> + offset = 2 * sizeof(uint32_t);
>> + *batch++ = 0;
>> + *batch++ = 0;
>> + } else {
>> + offset = sizeof(uint32_t);
>> + batch[-1]--;
>> + *batch++ = 0;
>> + }
>> +
>> + *batch++ = 1;
>> +
>> + obj[SCRATCH].handle = spin->poll_handle;
>> + fill_reloc(&relocs[obj[BATCH].relocation_count++],
>> + spin->poll_handle, offset,
>> + I915_GEM_DOMAIN_INSTRUCTION,
>> + I915_GEM_DOMAIN_INSTRUCTION);
>
> DOMAIN_RENDER preferably. You don't need the w/a. Could we not lie about
> the write-hazard? Removes the need for EXEC_OBJECT_ASYNC and opens up
Can do.
> the possibility for using different dwords for different engines and then
> waiting for all-engines.
Yes, I could even use this so good to not let me be lazy. :)
>> + execbuf.buffer_count++;
>
> gen4 and gen5 require I915_EXEC_SECURE and a DRM_MASTER fd.
>
> We can just do something like
> if (gen == 4 || gen == 5)
> igt_require(igt_device_set_master(fd) == 0));
Okay.
>
>> +/**
>> + * igt_spin_batch_new_poll:
>> + * @fd: open i915 drm file descriptor
>> + * @engine: Ring to execute batch OR'd with execbuf flags. If value is less
>> + * than 0, execute on all available rings.
>> + *
>> + * Start a recursive batch on a ring. Immediately returns a #igt_spin_t that
>> + * contains the batch's handle that can be waited upon. The returned structure
>> + * must be passed to igt_spin_batch_free() for post-processing.
>> + *
>> + * igt_spin_t->running will containt a pointer which target will change from
>> + * zero to one once the spinner actually starts executing on the GPU.
>> + *
>> + * Returns:
>> + * Structure with helper internal state for igt_spin_batch_free().
>> + */
>> +igt_spin_t *
>> +igt_spin_batch_new_poll(int fd, uint32_t ctx, unsigned engine)
>> +{
>> + igt_spin_t *spin;
>> +
>> + igt_require_gem(fd);
>> + igt_require(gem_mmap__has_wc(fd));
>
> igt_require(gem_can_store_dword(fd, engine));
>
> Not all platforms have a MI_STORE_DWORD/DATA_IMM (with virtual addresses
> at least) and some platforms will die (*cough* snb *cough*).
Grr that makes it all problematic. Well, maybe not completely, I can
just fall back to less accurate method on those platforms.
>
>> +
>> + spin = __igt_spin_batch_new_poll(fd, ctx, engine);
>> + igt_assert(gem_bo_busy(fd, spin->handle));
>> +
>> + return spin;
>> +}
>
>> igt_spin_t *__igt_spin_batch_new(int fd,
>> @@ -55,6 +57,13 @@ igt_spin_t *igt_spin_batch_new_fence(int fd,
>> uint32_t ctx,
>> unsigned engine);
>>
>> +igt_spin_t *__igt_spin_batch_new_poll(int fd,
>> + uint32_t ctx,
>> + unsigned engine);
>> +igt_spin_t *igt_spin_batch_new_poll(int fd,
>> + uint32_t ctx,
>> + unsigned engine);
>> +
>> void igt_spin_batch_set_timeout(igt_spin_t *spin, int64_t ns);
>> void igt_spin_batch_end(igt_spin_t *spin);
>> void igt_spin_batch_free(int fd, igt_spin_t *spin);
>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>> index 19fcc95ffc7f..d1b7b23bc646 100644
>> --- a/tests/perf_pmu.c
>> +++ b/tests/perf_pmu.c
>> @@ -184,6 +184,38 @@ static void end_spin(int fd, igt_spin_t *spin, unsigned int flags)
>> usleep(batch_duration_ns / 5000);
>> }
>>
>> +static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
>> +{
>> + return __igt_spin_batch_new_poll(fd, ctx, flags);
>> +}
>> +
>> +static unsigned long __spin_wait(igt_spin_t *spin)
>> +{
>> + struct timespec start = { };
>> +
>> + igt_nsec_elapsed(&start);
>> +
>> + while (!spin->running);
>
> Put ';' on a new line so it's clearly visible.
Okay.
>
>> +
>> + return igt_nsec_elapsed(&start);
>> +}
>> +
>> +static igt_spin_t * __spin_sync(int fd, uint32_t ctx, unsigned long flags)
>> +{
>> + igt_spin_t *spin = __spin_poll(fd, ctx, flags);
>> +
>> + __spin_wait(spin);
>> +
>> + return spin;
>> +}
>> +
>> +static igt_spin_t * spin_sync(int fd, uint32_t ctx, unsigned long flags)
>
> spin_sync() has connotations with gem_sync(). gem_sync is wait for end,
> but spin_sync is wait_for_start. Maybe spin_wait_for_execute? Nah.
>
>> +{
>> + igt_require_gem(fd);
>> +
>> + return __spin_sync(fd, ctx, flags);
>> +}
>
>> static void
>> __submit_spin_batch(int gem_fd,
>> + igt_spin_t *spin,
>> struct drm_i915_gem_exec_object2 *obj,
>> const struct intel_execution_engine2 *e)
>> {
>> struct drm_i915_gem_execbuffer2 eb = {
>> - .buffer_count = 1,
>> .buffers_ptr = to_user_pointer(obj),
>> .flags = e2ring(gem_fd, e),
>> };
>>
>> + if (spin->running) {
>> + obj[0].handle = spin->poll_handle;
>> + obj[0].flags = EXEC_OBJECT_ASYNC;
>> + obj[1].handle = spin->handle;
>> + eb.buffer_count = 2;
>> + } else {
>> + obj[0].handle = spin->handle;
>> + eb.buffer_count = 1;
>> + }
>
> obj[] must be set up by the caller; the EXEC_OBJECT_PINNED are
> essential. Or else the kernel *will* move spin->poll_handle and then it
> is fubar.
Why the caller has to do it? It is providing obj array which gets
populated by the helper and by the kernel. If I add EXEC_OBJECT_PINNED
to the helper is there a remaining problem?
Regards,
Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
next prev parent reply other threads:[~2018-03-15 13:36 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-15 12:56 [igt-dev] [PATCH i-g-t] tests/perf_pmu: Improve accuracy by waiting on spinner to start Tvrtko Ursulin
2018-03-15 13:14 ` Chris Wilson
2018-03-15 13:36 ` Tvrtko Ursulin [this message]
2018-03-15 13:45 ` [Intel-gfx] " Chris Wilson
2018-03-15 14:37 ` Tvrtko Ursulin
2018-03-15 14:46 ` Chris Wilson
2018-03-15 14:53 ` [Intel-gfx] " Tvrtko Ursulin
2018-03-15 14:58 ` Chris Wilson
2018-03-15 14:03 ` [igt-dev] ✓ Fi.CI.BAT: success for " Patchwork
2018-03-15 15:46 ` [Intel-gfx] [PATCH i-g-t v2] " Tvrtko Ursulin
2018-03-15 16:01 ` [igt-dev] " Chris Wilson
2018-03-16 7:36 ` [igt-dev] [PATCH i-g-t v3] " Tvrtko Ursulin
2018-03-16 10:17 ` [igt-dev] [PATCH i-g-t v4] " Tvrtko Ursulin
2018-03-16 12:18 ` [igt-dev] [PATCH i-g-t v5] " Tvrtko Ursulin
2018-03-16 13:31 ` [Intel-gfx] [PATCH i-g-t v6] " Tvrtko Ursulin
2018-03-19 13:56 ` [Intel-gfx] [PATCH i-g-t v7] " Tvrtko Ursulin
2018-03-19 14:02 ` [igt-dev] " Chris Wilson
2018-03-19 15:29 ` Chris Wilson
2018-03-19 15:33 ` Chris Wilson
2018-03-19 16:59 ` [igt-dev] [PATCH i-g-t v8] " Tvrtko Ursulin
2018-03-20 13:51 ` [igt-dev] [PATCH i-g-t v9] " Tvrtko Ursulin
2018-03-15 16:35 ` [igt-dev] ✗ Fi.CI.IGT: failure for " Patchwork
2018-03-15 16:49 ` [igt-dev] ✗ Fi.CI.BAT: failure for tests/perf_pmu: Improve accuracy by waiting on spinner to start (rev2) Patchwork
2018-03-16 8:02 ` [igt-dev] ✓ Fi.CI.BAT: success for tests/perf_pmu: Improve accuracy by waiting on spinner to start (rev3) Patchwork
2018-03-16 8:52 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
2018-03-16 11:20 ` [igt-dev] ✓ Fi.CI.BAT: success for tests/perf_pmu: Improve accuracy by waiting on spinner to start (rev4) Patchwork
2018-03-16 12:27 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
2018-03-16 14:30 ` [igt-dev] ✓ Fi.CI.BAT: success for tests/perf_pmu: Improve accuracy by waiting on spinner to start (rev6) Patchwork
2018-03-19 10:24 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
2018-03-19 21:12 ` [igt-dev] ✓ Fi.CI.BAT: success for tests/perf_pmu: Improve accuracy by waiting on spinner to start (rev9) Patchwork
2018-03-20 0:26 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
2018-03-20 17:12 ` [igt-dev] ✓ Fi.CI.BAT: success for tests/perf_pmu: Improve accuracy by waiting on spinner to start (rev10) Patchwork
2018-03-20 20:31 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1a8be18d-e456-b3b0-3483-d95594ea8b27@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=Intel-gfx@lists.freedesktop.org \
--cc=chris@chris-wilson.co.uk \
--cc=igt-dev@lists.freedesktop.org \
--cc=tursulin@ursulin.net \
--cc=tvrtko.ursulin@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox