All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Gordon <david.s.gordon@intel.com>
To: intel-gfx@lists.freedesktop.org, "Dai, Yu" <yu.dai@intel.com>,
	"Barnes, Jesse" <jesse.barnes@intel.com>
Subject: Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms
Date: Mon, 14 Sep 2015 17:45:29 +0100	[thread overview]
Message-ID: <55F6F9A9.70700@intel.com> (raw)
In-Reply-To: <20150914092139.GU3383@phenom.ffwll.local>

On 14/09/15 10:21, Daniel Vetter wrote:
> On Thu, Sep 10, 2015 at 03:07:00PM -0700, Yu Dai wrote:
>> Agree. The LRC prefix is confusing. Thanks for the patch. -Alex
>
> Care to do an official r-b?
>
> Thanks, Daniel
>
>>
>> On 09/10/2015 02:58 PM, Jesse Barnes wrote:
>>> That looks like it would, but I think it's still confusing to reference LRC state when we haven't initialized execlists at all...
>>>
>>> Jesse
>>>
>>> On 09/10/2015 02:56 PM, Yu Dai wrote:
>>>> Jesse,
>>>>
>>>> Will the patch here fix the issue? It should help other cases where LRC_PPHWSP_PN is referenced on non-execlist / guc platforms.
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>>>> index 4cc54b3..233a930 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>>>> @@ -71,7 +71,7 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
>>>>
>>>>   /* One extra page is added before LRC for GuC as shared data */
>>>>   #define LRC_GUCSHR_PN  (0)
>>>> -#define LRC_PPHWSP_PN  (LRC_GUCSHR_PN + 1)
>>>> +#define LRC_PPHWSP_PN  (LRC_GUCSHR_PN + i915.enable_guc_submission ? 1 : 0)
>>>>   #define LRC_STATE_PN   (LRC_PPHWSP_PN + 1)

I don't like this approach of hiding the runtime conditional offset in 
the macro. I think it would be better to leave this alone and use the 
technique in the code below where the (const) macros named 'LRC_*' are 
used only inside clauses guarded by an "if (i915.enable_execlists)". 
That way the code won't mention the LRC_* symbols except where they're 
known tp be meaningful.

Of these symbols, LRC_GUCSHR_PN is not used except in GuC-specific code, 
and LRC_STATE_PN is used only in GuC and/or LRC (execlist-mode) specific 
code. So it's only LRC_PPHWSP_PN that needs to be used selectively 
(specifically, added to the base of a context object when using execlist 
mode). And the code below already accomplishes that without changing the 
macro.

>>>>   void intel_lr_context_free(struct intel_context *ctx);
>>>>
>>>> Thanks,
>>>> Alex
>>>>
>>>> On 09/10/2015 02:40 PM, Jesse Barnes wrote:
>>>>> Looks like this was introduced in:
>>>>> commit d1675198ed1f21aec6e036336e4340c40b726497
>>>>> Author: Alex Dai <yu.dai@intel.com>
>>>>> Date:   Wed Aug 12 15:43:43 2015 +0100
>>>>>
>>>>>       drm/i915: Integrate GuC-based command submission
>>>>>
>>>>> This patch assumed LRC contexts and HWS layout, which is incorrect on
>>>>> platforms without execlists.  This can lead to a crash in GPU error
>>>>> state readout on those platforms.
>>>>>
>>>>> I don't see a bug filed for this, but there may be one that I haven't
>>>>> found.
>>>>>
>>>>> Cc: Alex Dai <yu.dai@intel.com>
>>>>> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
>>>>> ---
>>>>>    drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------
>>>>>    1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>> index 3379f9c..d0822f8 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>>>>> @@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>>>>>            }
>>>>>              if ((obj = error->ring[i].hws_page)) {
>>>>> +            u64 hws_offset = lower_32_bits(obj->gtt_offset);

No need to take only lower 32 bits here, the variable is already u64 and 
the format below is 0x%08llx, so we'll get a 32-bit value in the output 
(unless it actually is bigger than 32 bits, which shouldn't happen and 
which we'd want to know about!).

>>>>> +            u32 *hws_page = &obj->pages[0][0];
>>>>> +
>>>>> +            if (i915.enable_execlists) {
>>>>> +                hws_offset = obj->gtt_offset + LRC_PPHWSP_PN *
>>>>> +                    PAGE_SIZE;

If we keep the full 64-bit value (or even if we don't), this should 
simplify to "hws_offset += LRC_PPHWSP_PN*PAGE_SIZE;"

>>>>> +                hws_page = &obj->pages[LRC_PPHWSP_PN][0];
>>>>> +            }
>>>>>                err_printf(m, "%s --- HW Status = 0x%08llx\n",
>>>>> -                dev_priv->ring[i].name,
>>>>> -                obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE);
>>>>> +                   dev_priv->ring[i].name, hws_offset);
>>>>>                offset = 0;
>>>>>                for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
>>>>>                    err_printf(m, "[%04x] %08x %08x %08x %08x\n",
>>>>>                           offset,
>>>>> -                       obj->pages[LRC_PPHWSP_PN][elt],
>>>>> -                       obj->pages[LRC_PPHWSP_PN][elt+1],
>>>>> -                       obj->pages[LRC_PPHWSP_PN][elt+2],
>>>>> -                       obj->pages[LRC_PPHWSP_PN][elt+3]);
>>>>> +                       hws_page[elt],
>>>>> +                       hws_page[elt+1],
>>>>> +                       hws_page[elt+2],
>>>>> +                       hws_page[elt+3]);
>>>>>                        offset += 16;
>>>>>                }
>>>>>            }

With the 64-bit handling changed, I'd be happy to r-b Jesse's original 
patch, but not Alex's redefinition of LRC_PPHWSP_PN.

Cheers,
.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2015-09-14 16:45 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-10 21:40 [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms Jesse Barnes
2015-09-10 21:56 ` Yu Dai
2015-09-10 21:58   ` Jesse Barnes
2015-09-10 22:07     ` Yu Dai
2015-09-14  9:21       ` Daniel Vetter
2015-09-14 16:45         ` Dave Gordon [this message]
2015-09-15 17:03 ` [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 Jesse Barnes
2015-09-23  8:34   ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55F6F9A9.70700@intel.com \
    --to=david.s.gordon@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jesse.barnes@intel.com \
    --cc=yu.dai@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.