Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 12/20] drm/i915/selftests: Verify LRC isolation
Date: Fri, 28 Feb 2020 14:13:03 +0200	[thread overview]
Message-ID: <875zfq9agg.fsf@gaia.fi.intel.com> (raw)
In-Reply-To: <158289077278.24106.12975460776411775775@skylake-alporthouse-com>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2020-02-28 11:30:21)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> > +     x = 0;
>> > +     dw = 0;
>> > +     hw = engine->pinned_default_state;
>> > +     hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw);
>> > +     do {
>> > +             u32 lri = hw[dw];
>> > +
>> > +             if (lri == 0) {
>> > +                     dw++;
>> > +                     continue;
>> > +             }
>> > +
>> > +             if ((lri & GENMASK(31, 23)) != MI_INSTR(0x22, 0)) {
>> > +                     lri &= 0x7f;
>> > +                     dw += lri + 2;
>> > +                     continue;
>> > +             }
>> > +
>> > +             lri &= 0x7f;
>> > +             lri++;
>> > +             dw++;
>> 
>> The pattern for weeding out everything except lris starts
>> to stand out as a boilerplate. 
>
> Ssh. Just do a visual replacement with for_each_register()
>
>> 
>> > +
>> > +             while (lri) {
>> > +                     if (!is_moving(A[0][x], A[1][x]) &&
>> > +                         (A[0][x] != B[0][x] || A[1][x] != B[1][x])) {
>> > +                             switch (hw[dw] & 4095) {
>> > +                             case 0x30: /* RING_HEAD */
>> > +                             case 0x34: /* RING_TAIL */
>> > +                                     break;
>> > +
>> > +                             default:
>> > +                                     pr_err("%s[%d]: Mismatch for register %4x, default %08x, reference %08x, result (%08x, %08x), poison %08x, context %08x\n",
>> 
>> 0x left out on all for compactness or by accident?
>
> I tend not to use 0x, unless it looks odd or we need machine parsing :) 
> Here the compactness helps with a very long line.
>
>> > +                                            engine->name, x,
>> > +                                            hw[dw], hw[dw + 1],
>> > +                                            A[0][x], B[0][x], B[1][x],
>> > +                                            poison, lrc[dw + 1]);
>> > +                                     err = -EINVAL;
>> > +                                     break;
>> > +                             }
>> > +                     }
>> > +                     dw += 2;
>> > +                     lri -= 2;
>> > +                     x++;
>> > +             }
>> > +     } while (dw < PAGE_SIZE / sizeof(u32) &&
>> > +              (hw[dw] & ~BIT(0)) != MI_BATCH_BUFFER_END);
>> > +
>> > +     i915_gem_object_unpin_map(ce->state->obj);
>> > +err_B1:
>> > +     i915_gem_object_unpin_map(result[1]->obj);
>> > +err_B0:
>> > +     i915_gem_object_unpin_map(result[0]->obj);
>> > +err_A1:
>> > +     i915_gem_object_unpin_map(ref[1]->obj);
>> > +err_A0:
>> > +     i915_gem_object_unpin_map(ref[0]->obj);
>> > +     return err;
>> > +}
>> > +
>> > +static int __lrc_isolation(struct intel_engine_cs *engine, u32 poison)
>> > +{
>> > +     u32 *sema = memset32(engine->status_page.addr + 1000, 0, 1);
>> > +     struct i915_vma *ref[2], *result[2];
>> > +     struct intel_context *A, *B;
>> > +     struct i915_request *rq;
>> > +     int err;
>> > +
>> > +     A = intel_context_create(engine);
>> > +     if (IS_ERR(A))
>> > +             return PTR_ERR(A);
>> > +
>> > +     B = intel_context_create(engine);
>> > +     if (IS_ERR(B)) {
>> > +             err = PTR_ERR(B);
>> > +             goto err_A;
>> > +     }
>> > +
>> > +     ref[0] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(ref[0])) {
>> > +             err = PTR_ERR(ref[0]);
>> > +             goto err_B;
>> > +     }
>> > +
>> > +     ref[1] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(ref[1])) {
>> > +             err = PTR_ERR(ref[1]);
>> > +             goto err_ref0;
>> > +     }
>> > +
>> > +     rq = record_registers(A, ref[0], ref[1], sema);
>> > +     if (IS_ERR(rq)) {
>> > +             err = PTR_ERR(rq);
>> > +             goto err_ref1;
>> > +     }
>> > +
>> > +     WRITE_ONCE(*sema, 1);
>> > +     wmb();
>> 
>> So with this you get reference base for before and after...
>> 
>> > +
>> > +     if (i915_request_wait(rq, 0, HZ / 2) < 0) {
>> > +             i915_request_put(rq);
>> > +             err = -ETIME;
>> > +             goto err_ref1;
>> > +     }
>> > +     i915_request_put(rq);
>> > +
>> > +     result[0] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(result[0])) {
>> > +             err = PTR_ERR(result[0]);
>> > +             goto err_ref1;
>> > +     }
>> > +
>> > +     result[1] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(result[1])) {
>> > +             err = PTR_ERR(result[1]);
>> > +             goto err_result0;
>> > +     }
>> > +
>> > +     rq = record_registers(A, result[0], result[1], sema);
>> > +     if (IS_ERR(rq)) {
>> > +             err = PTR_ERR(rq);
>> > +             goto err_result1;
>> > +     }
>> > +
>> > +     err = poison_registers(B, poison, sema);
>> 
>> ..and apparently the poisoning releases the semaphore
>> triggering the preemption?
>
> Correct.
>
>> How can you make preempt happen deterministically with this?
>
> We use MI_ARB_ONOFF to ensure that we can only be preempted within the
> semaphore.

Ah yes this give you the resolution you want.

>
>> Released semaphore generates interrupt and you have it ready
>> with maximum prio?
>
> On submission of the I915_PRIORITY_BARRIER request, we will immediately
> submit it to ELSP, and so queue the preemption request to HW. Since we
> disable preemption in record_registers() until it hits the semaphore, we
> know that the second batch must run only when the first is waiting on
> that semaphore.

Thanks for the explanation.

>  
>> I am also puzzled why there is a need for two set of reference
>> values?
>
> One is without preemption from a second context (so just one context
> running on the HW). This reference should match the context image
> before/after. The other is after having been preempted between the
> two reads. This /should/ also match its context image...
>
> The first checks our mechanism for reading back the registers and
> comparing them to the context, the second is the actual test.

I remember yesterday pondering in irc that it would be nice to have
an machanism to test the machinery. You must have been confused
that what do I really mean as this was the mechanism I was looking
for but didn't undestand =)

I like that it adapts to context image content and the
relative mmio trick should work for gens to come.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2020-02-28 12:14 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-27  8:57 [Intel-gfx] [PATCH 01/20] drm/i915: Skip barriers inside waits Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 02/20] drm/i915/perf: Mark up the racy use of perf->exclusive_stream Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 03/20] drm/i915/perf: Manually acquire engine-wakeref around use of kernel_context Chris Wilson
2020-02-28 11:53   ` Mika Kuoppala
2020-02-28 11:56     ` Chris Wilson
2020-02-28 12:18       ` Mika Kuoppala
2020-02-27  8:57 ` [Intel-gfx] [PATCH 04/20] drm/i915/perf: Wait for lrc_reconfigure on disable Chris Wilson
2020-02-27 11:17   ` [Intel-gfx] [PATCH] " Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 05/20] drm/i915/gem: Consolidate ctx->engines[] release Chris Wilson
2020-02-27  9:51   ` [Intel-gfx] [PATCH] " Chris Wilson
2020-02-27 11:01   ` Chris Wilson
2020-02-28 12:08     ` Tvrtko Ursulin
2020-02-28 12:19       ` Chris Wilson
2020-03-02 14:20         ` Tvrtko Ursulin
2020-02-27  8:57 ` [Intel-gfx] [PATCH 06/20] drm/i915/gt: Prevent allocation on a banned context Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 07/20] drm/i915/gem: Check that the context wasn't closed during setup Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 08/20] drm/i915/selftests: Disable heartbeat around manual pulse tests Chris Wilson
2020-02-27 22:51   ` Andi Shyti
2020-02-27  8:57 ` [Intel-gfx] [PATCH 09/20] drm/i915/gt: Reset queue_priority_hint after wedging Chris Wilson
2020-02-28 12:10   ` Tvrtko Ursulin
2020-02-28 12:31     ` Chris Wilson
2020-02-28 12:59       ` Tvrtko Ursulin
2020-02-28 13:10         ` Chris Wilson
2020-02-28 13:20           ` Tvrtko Ursulin
2020-02-28 13:34             ` Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 10/20] drm/i915/gt: Pull marking vm as closed underneath the vm->mutex Chris Wilson
2020-02-28 12:12   ` Tvrtko Ursulin
2020-02-27  8:57 ` [Intel-gfx] [PATCH 11/20] drm/i915: Protect i915_request_await_start from early waits Chris Wilson
2020-02-28 12:41   ` Tvrtko Ursulin
2020-02-27  8:57 ` [Intel-gfx] [PATCH 12/20] drm/i915/selftests: Verify LRC isolation Chris Wilson
2020-02-28 11:30   ` Mika Kuoppala
2020-02-28 11:52     ` Chris Wilson
2020-02-28 12:13       ` Mika Kuoppala [this message]
2020-02-27  8:57 ` [Intel-gfx] [PATCH 13/20] drm/i915/selftests: Check recovery from corrupted LRC Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 14/20] drm/i915/selftests: Wait for the kernel context switch Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 15/20] drm/i915/selftests: Be a little more lenient for reset workers Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 16/20] drm/i915/selftests: Add request throughput measurement to perf Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 17/20] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
2020-02-28 12:45   ` Tvrtko Ursulin
2020-02-28 13:14     ` Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 18/20] drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 19/20] drm/i915/execlists: Check the sentinel is alone in the ELSP Chris Wilson
2020-02-27  8:57 ` [Intel-gfx] [PATCH 20/20] drm/i915/execlists: Reduce preempt-to-busy roundtrip delay Chris Wilson
2020-02-27  9:14 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/20] drm/i915: Skip barriers inside waits Patchwork
2020-02-27  9:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-02-27 15:06 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/20] drm/i915: Skip barriers inside waits (rev4) Patchwork
2020-02-27 15:37 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-02-27 22:38 ` [Intel-gfx] [PATCH 01/20] drm/i915: Skip barriers inside waits Andi Shyti
2020-02-28 11:53 ` Tvrtko Ursulin
2020-02-28 12:08   ` Chris Wilson
2020-02-28 16:33 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [01/20] drm/i915: Skip barriers inside waits (rev4) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875zfq9agg.fsf@gaia.fi.intel.com \
    --to=mika.kuoppala@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox