All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915/execlists: Verify context register state before execution
Date: Thu, 31 Oct 2019 16:52:33 +0200	[thread overview]
Message-ID: <87tv7p6k7i.fsf@gaia.fi.intel.com> (raw)
In-Reply-To: <157253285086.11954.12097715789006191101@skylake-alporthouse-com>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-10-31 14:32:05)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > Check that the context's ring register state still matches our
>> > expectations prior to execution.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/gt/intel_lrc.c     | 73 ++++++++++++++++++++-----
>> >  drivers/gpu/drm/i915/gt/intel_lrc_reg.h |  4 +-
>> >  drivers/gpu/drm/i915/gt/selftest_lrc.c  |  4 +-
>> >  3 files changed, 63 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > index 5f61cd128d9c..666e70931524 100644
>> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > @@ -1146,6 +1146,60 @@ execlists_schedule_out(struct i915_request *rq)
>> >       i915_request_put(rq);
>> >  }
>> >  
>> > +static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
>> > +{
>> > +     if (INTEL_GEN(engine->i915) >= 12)
>> > +             return 0x60;
>> > +     else if (INTEL_GEN(engine->i915) >= 9)
>> > +             return 0x54;
>> > +     else if (engine->class == RENDER_CLASS)
>> > +             return 0x58;
>> > +     else
>> > +             return -1;
>> > +}
>> > +
>> > +static void
>> > +execlists_check_context(const struct intel_context *ce,
>> > +                     const struct intel_engine_cs *engine)
>> > +{
>> > +     const struct intel_ring *ring = ce->ring;
>> > +     u32 *regs = ce->lrc_reg_state;
>> > +     int x;
>> > +
>> > +     if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) {
>> > +             pr_err_once("%s: context submitted with incorrect RING_BUFFER_START [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_RING_START],
>> > +                         i915_ggtt_offset(ring->vma));
>> > +             regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
>> > +     }
>> > +
>> > +     if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) !=
>> > +         (RING_CTL_SIZE(ring->size) | RING_VALID)) {
>> > +             pr_err_once("%s: context submitted with incorrect RING_BUFFER_CONTROL [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_RING_CTL],
>> > +                         (u32)(RING_CTL_SIZE(ring->size) | RING_VALID));
>> > +             regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
>> 
>> We are going to submit by clearing the waits. First I thought this will
>> lead to disaster but the hardware works so that we clear on writing '1'.
>
> Afaik, they are only indicator bits. So the HW ignores those bits when
> we write to the register.
>

I did check on reviewing. For Gen12 it states that: writing 0 == no
effect, writing 1 == clear.

-Mika


>> > +     if (regs[CTX_BB_STATE] != RING_BB_PPGTT) {
>> > +             pr_err_once("%s: context submitted with incorrect BB_STATE [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_BB_STATE],
>> > +                         RING_BB_PPGTT);
>> > +             regs[CTX_BB_STATE] = RING_BB_PPGTT;
>> > +     }
>> > +
>> > +     x = lrc_ring_mi_mode(engine);
>> > +     if (x != -1 && regs[x + 1] & STOP_RING) {
>> > +             pr_err_once("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n",
>> > +                         engine->name, regs[x + 1]);
>> > +             regs[x + 1] &= ~STOP_RING;
>> > +             regs[x + 1] |= STOP_RING << 16;
>> 
>> Ok you want only to care about this one. Was pondering of restoring rest
>> of the state from previous.
>
> It was mostly in the spirit of "now that we are here, we might as well
> fix it up". It's mainly the paranoia in trying to ascertain if we are
> feeding garbage into the GPU. There's probably a lot more we can check,
> but the ring registers are essential :)
>
>> Will the hardware even care on masks at this point or is the saving part
>> setting them accordingly?
>
> Aiui, it uses the masks on the implicit LRI in the context restore to
> control where or not to actually apply the bits. Not that I have checked
> to see what the HW is doing (we will see if it ever flags an error), but
> memory says from elsewere it uses 0xffffxxxx.
> -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

WARNING: multiple messages have this Message-ID (diff)
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915/execlists: Verify context register state before execution
Date: Thu, 31 Oct 2019 16:52:33 +0200	[thread overview]
Message-ID: <87tv7p6k7i.fsf@gaia.fi.intel.com> (raw)
Message-ID: <20191031145233.mYITVeq2sE-rmbq6-0K4rvRnUTr_drRcEkhulW89i8k@z> (raw)
In-Reply-To: <157253285086.11954.12097715789006191101@skylake-alporthouse-com>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-10-31 14:32:05)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > Check that the context's ring register state still matches our
>> > expectations prior to execution.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/gt/intel_lrc.c     | 73 ++++++++++++++++++++-----
>> >  drivers/gpu/drm/i915/gt/intel_lrc_reg.h |  4 +-
>> >  drivers/gpu/drm/i915/gt/selftest_lrc.c  |  4 +-
>> >  3 files changed, 63 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > index 5f61cd128d9c..666e70931524 100644
>> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > @@ -1146,6 +1146,60 @@ execlists_schedule_out(struct i915_request *rq)
>> >       i915_request_put(rq);
>> >  }
>> >  
>> > +static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
>> > +{
>> > +     if (INTEL_GEN(engine->i915) >= 12)
>> > +             return 0x60;
>> > +     else if (INTEL_GEN(engine->i915) >= 9)
>> > +             return 0x54;
>> > +     else if (engine->class == RENDER_CLASS)
>> > +             return 0x58;
>> > +     else
>> > +             return -1;
>> > +}
>> > +
>> > +static void
>> > +execlists_check_context(const struct intel_context *ce,
>> > +                     const struct intel_engine_cs *engine)
>> > +{
>> > +     const struct intel_ring *ring = ce->ring;
>> > +     u32 *regs = ce->lrc_reg_state;
>> > +     int x;
>> > +
>> > +     if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) {
>> > +             pr_err_once("%s: context submitted with incorrect RING_BUFFER_START [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_RING_START],
>> > +                         i915_ggtt_offset(ring->vma));
>> > +             regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
>> > +     }
>> > +
>> > +     if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) !=
>> > +         (RING_CTL_SIZE(ring->size) | RING_VALID)) {
>> > +             pr_err_once("%s: context submitted with incorrect RING_BUFFER_CONTROL [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_RING_CTL],
>> > +                         (u32)(RING_CTL_SIZE(ring->size) | RING_VALID));
>> > +             regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
>> 
>> We are going to submit by clearing the waits. First I thought this will
>> lead to disaster but the hardware works so that we clear on writing '1'.
>
> Afaik, they are only indicator bits. So the HW ignores those bits when
> we write to the register.
>

I did check on reviewing. For Gen12 it states that: writing 0 == no
effect, writing 1 == clear.

-Mika


>> > +     if (regs[CTX_BB_STATE] != RING_BB_PPGTT) {
>> > +             pr_err_once("%s: context submitted with incorrect BB_STATE [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_BB_STATE],
>> > +                         RING_BB_PPGTT);
>> > +             regs[CTX_BB_STATE] = RING_BB_PPGTT;
>> > +     }
>> > +
>> > +     x = lrc_ring_mi_mode(engine);
>> > +     if (x != -1 && regs[x + 1] & STOP_RING) {
>> > +             pr_err_once("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n",
>> > +                         engine->name, regs[x + 1]);
>> > +             regs[x + 1] &= ~STOP_RING;
>> > +             regs[x + 1] |= STOP_RING << 16;
>> 
>> Ok you want only to care about this one. Was pondering of restoring rest
>> of the state from previous.
>
> It was mostly in the spirit of "now that we are here, we might as well
> fix it up". It's mainly the paranoia in trying to ascertain if we are
> feeding garbage into the GPU. There's probably a lot more we can check,
> but the ring registers are essential :)
>
>> Will the hardware even care on masks at this point or is the saving part
>> setting them accordingly?
>
> Aiui, it uses the masks on the implicit LRI in the context restore to
> control where or not to actually apply the bits. Not that I have checked
> to see what the HW is doing (we will see if it ever flags an error), but
> memory says from elsewere it uses 0xffffxxxx.
> -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2019-10-31 14:53 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-31 10:47 [PATCH] drm/i915/execlists: Verify context register state before execution Chris Wilson
2019-10-31 10:47 ` [Intel-gfx] " Chris Wilson
2019-10-31 10:51 ` ✗ Fi.CI.BUILD: failure for drm/i915/execlists: Verify context register state before execution (rev2) Patchwork
2019-10-31 10:51   ` [Intel-gfx] " Patchwork
2019-10-31 14:32 ` [PATCH] drm/i915/execlists: Verify context register state before execution Mika Kuoppala
2019-10-31 14:32   ` [Intel-gfx] " Mika Kuoppala
2019-10-31 14:40   ` Chris Wilson
2019-10-31 14:40     ` [Intel-gfx] " Chris Wilson
2019-10-31 14:52     ` Mika Kuoppala [this message]
2019-10-31 14:52       ` Mika Kuoppala
  -- strict thread matches above, loose matches on Subject: below --
2019-10-26  9:44 Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tv7p6k7i.fsf@gaia.fi.intel.com \
    --to=mika.kuoppala@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.