* [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms
@ 2015-09-10 21:40 Jesse Barnes
2015-09-10 21:56 ` Yu Dai
2015-09-15 17:03 ` [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 Jesse Barnes
0 siblings, 2 replies; 8+ messages in thread
From: Jesse Barnes @ 2015-09-10 21:40 UTC (permalink / raw)
To: intel-gfx
Looks like this was introduced in:
commit d1675198ed1f21aec6e036336e4340c40b726497
Author: Alex Dai <yu.dai@intel.com>
Date: Wed Aug 12 15:43:43 2015 +0100
drm/i915: Integrate GuC-based command submission
This patch assumed LRC contexts and HWS layout, which is incorrect on
platforms without execlists. This can lead to a crash in GPU error
state readout on those platforms.
I don't see a bug filed for this, but there may be one that I haven't
found.
Cc: Alex Dai <yu.dai@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 3379f9c..d0822f8 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
if ((obj = error->ring[i].hws_page)) {
+ u64 hws_offset = lower_32_bits(obj->gtt_offset);
+ u32 *hws_page = &obj->pages[0][0];
+
+ if (i915.enable_execlists) {
+ hws_offset = obj->gtt_offset + LRC_PPHWSP_PN *
+ PAGE_SIZE;
+ hws_page = &obj->pages[LRC_PPHWSP_PN][0];
+ }
err_printf(m, "%s --- HW Status = 0x%08llx\n",
- dev_priv->ring[i].name,
- obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE);
+ dev_priv->ring[i].name, hws_offset);
offset = 0;
for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
err_printf(m, "[%04x] %08x %08x %08x %08x\n",
offset,
- obj->pages[LRC_PPHWSP_PN][elt],
- obj->pages[LRC_PPHWSP_PN][elt+1],
- obj->pages[LRC_PPHWSP_PN][elt+2],
- obj->pages[LRC_PPHWSP_PN][elt+3]);
+ hws_page[elt],
+ hws_page[elt+1],
+ hws_page[elt+2],
+ hws_page[elt+3]);
offset += 16;
}
}
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms 2015-09-10 21:40 [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms Jesse Barnes @ 2015-09-10 21:56 ` Yu Dai 2015-09-10 21:58 ` Jesse Barnes 2015-09-15 17:03 ` [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 Jesse Barnes 1 sibling, 1 reply; 8+ messages in thread From: Yu Dai @ 2015-09-10 21:56 UTC (permalink / raw) To: Jesse Barnes, intel-gfx Jesse, Will the patch here fix the issue? It should help other cases where LRC_PPHWSP_PN is referenced on non-execlist / guc platforms. diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 4cc54b3..233a930 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -71,7 +71,7 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf, /* One extra page is added before LRC for GuC as shared data */ #define LRC_GUCSHR_PN (0) -#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1) +#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + i915.enable_guc_submission ? 1 : 0) #define LRC_STATE_PN (LRC_PPHWSP_PN + 1) void intel_lr_context_free(struct intel_context *ctx); Thanks, Alex On 09/10/2015 02:40 PM, Jesse Barnes wrote: > Looks like this was introduced in: > commit d1675198ed1f21aec6e036336e4340c40b726497 > Author: Alex Dai <yu.dai@intel.com> > Date: Wed Aug 12 15:43:43 2015 +0100 > > drm/i915: Integrate GuC-based command submission > > This patch assumed LRC contexts and HWS layout, which is incorrect on > platforms without execlists. This can lead to a crash in GPU error > state readout on those platforms. > > I don't see a bug filed for this, but there may be one that I haven't > found. > > Cc: Alex Dai <yu.dai@intel.com> > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> > --- > drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > index 3379f9c..d0822f8 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, > } > > if ((obj = error->ring[i].hws_page)) { > + u64 hws_offset = lower_32_bits(obj->gtt_offset); > + u32 *hws_page = &obj->pages[0][0]; > + > + if (i915.enable_execlists) { > + hws_offset = obj->gtt_offset + LRC_PPHWSP_PN * > + PAGE_SIZE; > + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; > + } > err_printf(m, "%s --- HW Status = 0x%08llx\n", > - dev_priv->ring[i].name, > - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); > + dev_priv->ring[i].name, hws_offset); > offset = 0; > for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { > err_printf(m, "[%04x] %08x %08x %08x %08x\n", > offset, > - obj->pages[LRC_PPHWSP_PN][elt], > - obj->pages[LRC_PPHWSP_PN][elt+1], > - obj->pages[LRC_PPHWSP_PN][elt+2], > - obj->pages[LRC_PPHWSP_PN][elt+3]); > + hws_page[elt], > + hws_page[elt+1], > + hws_page[elt+2], > + hws_page[elt+3]); > offset += 16; > } > } _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms 2015-09-10 21:56 ` Yu Dai @ 2015-09-10 21:58 ` Jesse Barnes 2015-09-10 22:07 ` Yu Dai 0 siblings, 1 reply; 8+ messages in thread From: Jesse Barnes @ 2015-09-10 21:58 UTC (permalink / raw) To: Yu Dai, intel-gfx That looks like it would, but I think it's still confusing to reference LRC state when we haven't initialized execlists at all... Jesse On 09/10/2015 02:56 PM, Yu Dai wrote: > Jesse, > > Will the patch here fix the issue? It should help other cases where LRC_PPHWSP_PN is referenced on non-execlist / guc platforms. > > diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h > index 4cc54b3..233a930 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.h > +++ b/drivers/gpu/drm/i915/intel_lrc.h > @@ -71,7 +71,7 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf, > > /* One extra page is added before LRC for GuC as shared data */ > #define LRC_GUCSHR_PN (0) > -#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1) > +#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + i915.enable_guc_submission ? 1 : 0) > #define LRC_STATE_PN (LRC_PPHWSP_PN + 1) > > void intel_lr_context_free(struct intel_context *ctx); > > Thanks, > Alex > > On 09/10/2015 02:40 PM, Jesse Barnes wrote: >> Looks like this was introduced in: >> commit d1675198ed1f21aec6e036336e4340c40b726497 >> Author: Alex Dai <yu.dai@intel.com> >> Date: Wed Aug 12 15:43:43 2015 +0100 >> >> drm/i915: Integrate GuC-based command submission >> >> This patch assumed LRC contexts and HWS layout, which is incorrect on >> platforms without execlists. This can lead to a crash in GPU error >> state readout on those platforms. >> >> I don't see a bug filed for this, but there may be one that I haven't >> found. >> >> Cc: Alex Dai <yu.dai@intel.com> >> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> >> --- >> drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------ >> 1 file changed, 13 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c >> index 3379f9c..d0822f8 100644 >> --- a/drivers/gpu/drm/i915/i915_gpu_error.c >> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c >> @@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, >> } >> if ((obj = error->ring[i].hws_page)) { >> + u64 hws_offset = lower_32_bits(obj->gtt_offset); >> + u32 *hws_page = &obj->pages[0][0]; >> + >> + if (i915.enable_execlists) { >> + hws_offset = obj->gtt_offset + LRC_PPHWSP_PN * >> + PAGE_SIZE; >> + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; >> + } >> err_printf(m, "%s --- HW Status = 0x%08llx\n", >> - dev_priv->ring[i].name, >> - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); >> + dev_priv->ring[i].name, hws_offset); >> offset = 0; >> for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { >> err_printf(m, "[%04x] %08x %08x %08x %08x\n", >> offset, >> - obj->pages[LRC_PPHWSP_PN][elt], >> - obj->pages[LRC_PPHWSP_PN][elt+1], >> - obj->pages[LRC_PPHWSP_PN][elt+2], >> - obj->pages[LRC_PPHWSP_PN][elt+3]); >> + hws_page[elt], >> + hws_page[elt+1], >> + hws_page[elt+2], >> + hws_page[elt+3]); >> offset += 16; >> } >> } > > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms 2015-09-10 21:58 ` Jesse Barnes @ 2015-09-10 22:07 ` Yu Dai 2015-09-14 9:21 ` Daniel Vetter 0 siblings, 1 reply; 8+ messages in thread From: Yu Dai @ 2015-09-10 22:07 UTC (permalink / raw) To: Jesse Barnes, intel-gfx Agree. The LRC prefix is confusing. Thanks for the patch. -Alex On 09/10/2015 02:58 PM, Jesse Barnes wrote: > That looks like it would, but I think it's still confusing to reference LRC state when we haven't initialized execlists at all... > > Jesse > > On 09/10/2015 02:56 PM, Yu Dai wrote: > > Jesse, > > > > Will the patch here fix the issue? It should help other cases where LRC_PPHWSP_PN is referenced on non-execlist / guc platforms. > > > > diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h > > index 4cc54b3..233a930 100644 > > --- a/drivers/gpu/drm/i915/intel_lrc.h > > +++ b/drivers/gpu/drm/i915/intel_lrc.h > > @@ -71,7 +71,7 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf, > > > > /* One extra page is added before LRC for GuC as shared data */ > > #define LRC_GUCSHR_PN (0) > > -#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1) > > +#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + i915.enable_guc_submission ? 1 : 0) > > #define LRC_STATE_PN (LRC_PPHWSP_PN + 1) > > > > void intel_lr_context_free(struct intel_context *ctx); > > > > Thanks, > > Alex > > > > On 09/10/2015 02:40 PM, Jesse Barnes wrote: > >> Looks like this was introduced in: > >> commit d1675198ed1f21aec6e036336e4340c40b726497 > >> Author: Alex Dai <yu.dai@intel.com> > >> Date: Wed Aug 12 15:43:43 2015 +0100 > >> > >> drm/i915: Integrate GuC-based command submission > >> > >> This patch assumed LRC contexts and HWS layout, which is incorrect on > >> platforms without execlists. This can lead to a crash in GPU error > >> state readout on those platforms. > >> > >> I don't see a bug filed for this, but there may be one that I haven't > >> found. > >> > >> Cc: Alex Dai <yu.dai@intel.com> > >> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> > >> --- > >> drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------ > >> 1 file changed, 13 insertions(+), 6 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > >> index 3379f9c..d0822f8 100644 > >> --- a/drivers/gpu/drm/i915/i915_gpu_error.c > >> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > >> @@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, > >> } > >> if ((obj = error->ring[i].hws_page)) { > >> + u64 hws_offset = lower_32_bits(obj->gtt_offset); > >> + u32 *hws_page = &obj->pages[0][0]; > >> + > >> + if (i915.enable_execlists) { > >> + hws_offset = obj->gtt_offset + LRC_PPHWSP_PN * > >> + PAGE_SIZE; > >> + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; > >> + } > >> err_printf(m, "%s --- HW Status = 0x%08llx\n", > >> - dev_priv->ring[i].name, > >> - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); > >> + dev_priv->ring[i].name, hws_offset); > >> offset = 0; > >> for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { > >> err_printf(m, "[%04x] %08x %08x %08x %08x\n", > >> offset, > >> - obj->pages[LRC_PPHWSP_PN][elt], > >> - obj->pages[LRC_PPHWSP_PN][elt+1], > >> - obj->pages[LRC_PPHWSP_PN][elt+2], > >> - obj->pages[LRC_PPHWSP_PN][elt+3]); > >> + hws_page[elt], > >> + hws_page[elt+1], > >> + hws_page[elt+2], > >> + hws_page[elt+3]); > >> offset += 16; > >> } > >> } > > > > > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms 2015-09-10 22:07 ` Yu Dai @ 2015-09-14 9:21 ` Daniel Vetter 2015-09-14 16:45 ` Dave Gordon 0 siblings, 1 reply; 8+ messages in thread From: Daniel Vetter @ 2015-09-14 9:21 UTC (permalink / raw) To: Yu Dai; +Cc: intel-gfx On Thu, Sep 10, 2015 at 03:07:00PM -0700, Yu Dai wrote: > Agree. The LRC prefix is confusing. Thanks for the patch. -Alex Care to do an official r-b? Thanks, Daniel > > On 09/10/2015 02:58 PM, Jesse Barnes wrote: > >That looks like it would, but I think it's still confusing to reference LRC state when we haven't initialized execlists at all... > > > >Jesse > > > >On 09/10/2015 02:56 PM, Yu Dai wrote: > >> Jesse, > >> > >> Will the patch here fix the issue? It should help other cases where LRC_PPHWSP_PN is referenced on non-execlist / guc platforms. > >> > >> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h > >> index 4cc54b3..233a930 100644 > >> --- a/drivers/gpu/drm/i915/intel_lrc.h > >> +++ b/drivers/gpu/drm/i915/intel_lrc.h > >> @@ -71,7 +71,7 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf, > >> > >> /* One extra page is added before LRC for GuC as shared data */ > >> #define LRC_GUCSHR_PN (0) > >> -#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1) > >> +#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + i915.enable_guc_submission ? 1 : 0) > >> #define LRC_STATE_PN (LRC_PPHWSP_PN + 1) > >> > >> void intel_lr_context_free(struct intel_context *ctx); > >> > >> Thanks, > >> Alex > >> > >> On 09/10/2015 02:40 PM, Jesse Barnes wrote: > >>> Looks like this was introduced in: > >>> commit d1675198ed1f21aec6e036336e4340c40b726497 > >>> Author: Alex Dai <yu.dai@intel.com> > >>> Date: Wed Aug 12 15:43:43 2015 +0100 > >>> > >>> drm/i915: Integrate GuC-based command submission > >>> > >>> This patch assumed LRC contexts and HWS layout, which is incorrect on > >>> platforms without execlists. This can lead to a crash in GPU error > >>> state readout on those platforms. > >>> > >>> I don't see a bug filed for this, but there may be one that I haven't > >>> found. > >>> > >>> Cc: Alex Dai <yu.dai@intel.com> > >>> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> > >>> --- > >>> drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------ > >>> 1 file changed, 13 insertions(+), 6 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > >>> index 3379f9c..d0822f8 100644 > >>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c > >>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > >>> @@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, > >>> } > >>> if ((obj = error->ring[i].hws_page)) { > >>> + u64 hws_offset = lower_32_bits(obj->gtt_offset); > >>> + u32 *hws_page = &obj->pages[0][0]; > >>> + > >>> + if (i915.enable_execlists) { > >>> + hws_offset = obj->gtt_offset + LRC_PPHWSP_PN * > >>> + PAGE_SIZE; > >>> + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; > >>> + } > >>> err_printf(m, "%s --- HW Status = 0x%08llx\n", > >>> - dev_priv->ring[i].name, > >>> - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); > >>> + dev_priv->ring[i].name, hws_offset); > >>> offset = 0; > >>> for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { > >>> err_printf(m, "[%04x] %08x %08x %08x %08x\n", > >>> offset, > >>> - obj->pages[LRC_PPHWSP_PN][elt], > >>> - obj->pages[LRC_PPHWSP_PN][elt+1], > >>> - obj->pages[LRC_PPHWSP_PN][elt+2], > >>> - obj->pages[LRC_PPHWSP_PN][elt+3]); > >>> + hws_page[elt], > >>> + hws_page[elt+1], > >>> + hws_page[elt+2], > >>> + hws_page[elt+3]); > >>> offset += 16; > >>> } > >>> } > >> > >> > > > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms 2015-09-14 9:21 ` Daniel Vetter @ 2015-09-14 16:45 ` Dave Gordon 0 siblings, 0 replies; 8+ messages in thread From: Dave Gordon @ 2015-09-14 16:45 UTC (permalink / raw) To: intel-gfx, Dai, Yu, Barnes, Jesse On 14/09/15 10:21, Daniel Vetter wrote: > On Thu, Sep 10, 2015 at 03:07:00PM -0700, Yu Dai wrote: >> Agree. The LRC prefix is confusing. Thanks for the patch. -Alex > > Care to do an official r-b? > > Thanks, Daniel > >> >> On 09/10/2015 02:58 PM, Jesse Barnes wrote: >>> That looks like it would, but I think it's still confusing to reference LRC state when we haven't initialized execlists at all... >>> >>> Jesse >>> >>> On 09/10/2015 02:56 PM, Yu Dai wrote: >>>> Jesse, >>>> >>>> Will the patch here fix the issue? It should help other cases where LRC_PPHWSP_PN is referenced on non-execlist / guc platforms. >>>> >>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h >>>> index 4cc54b3..233a930 100644 >>>> --- a/drivers/gpu/drm/i915/intel_lrc.h >>>> +++ b/drivers/gpu/drm/i915/intel_lrc.h >>>> @@ -71,7 +71,7 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf, >>>> >>>> /* One extra page is added before LRC for GuC as shared data */ >>>> #define LRC_GUCSHR_PN (0) >>>> -#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1) >>>> +#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + i915.enable_guc_submission ? 1 : 0) >>>> #define LRC_STATE_PN (LRC_PPHWSP_PN + 1) I don't like this approach of hiding the runtime conditional offset in the macro. I think it would be better to leave this alone and use the technique in the code below where the (const) macros named 'LRC_*' are used only inside clauses guarded by an "if (i915.enable_execlists)". That way the code won't mention the LRC_* symbols except where they're known tp be meaningful. Of these symbols, LRC_GUCSHR_PN is not used except in GuC-specific code, and LRC_STATE_PN is used only in GuC and/or LRC (execlist-mode) specific code. So it's only LRC_PPHWSP_PN that needs to be used selectively (specifically, added to the base of a context object when using execlist mode). And the code below already accomplishes that without changing the macro. >>>> void intel_lr_context_free(struct intel_context *ctx); >>>> >>>> Thanks, >>>> Alex >>>> >>>> On 09/10/2015 02:40 PM, Jesse Barnes wrote: >>>>> Looks like this was introduced in: >>>>> commit d1675198ed1f21aec6e036336e4340c40b726497 >>>>> Author: Alex Dai <yu.dai@intel.com> >>>>> Date: Wed Aug 12 15:43:43 2015 +0100 >>>>> >>>>> drm/i915: Integrate GuC-based command submission >>>>> >>>>> This patch assumed LRC contexts and HWS layout, which is incorrect on >>>>> platforms without execlists. This can lead to a crash in GPU error >>>>> state readout on those platforms. >>>>> >>>>> I don't see a bug filed for this, but there may be one that I haven't >>>>> found. >>>>> >>>>> Cc: Alex Dai <yu.dai@intel.com> >>>>> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> >>>>> --- >>>>> drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------ >>>>> 1 file changed, 13 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c >>>>> index 3379f9c..d0822f8 100644 >>>>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c >>>>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c >>>>> @@ -457,17 +457,24 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, >>>>> } >>>>> if ((obj = error->ring[i].hws_page)) { >>>>> + u64 hws_offset = lower_32_bits(obj->gtt_offset); No need to take only lower 32 bits here, the variable is already u64 and the format below is 0x%08llx, so we'll get a 32-bit value in the output (unless it actually is bigger than 32 bits, which shouldn't happen and which we'd want to know about!). >>>>> + u32 *hws_page = &obj->pages[0][0]; >>>>> + >>>>> + if (i915.enable_execlists) { >>>>> + hws_offset = obj->gtt_offset + LRC_PPHWSP_PN * >>>>> + PAGE_SIZE; If we keep the full 64-bit value (or even if we don't), this should simplify to "hws_offset += LRC_PPHWSP_PN*PAGE_SIZE;" >>>>> + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; >>>>> + } >>>>> err_printf(m, "%s --- HW Status = 0x%08llx\n", >>>>> - dev_priv->ring[i].name, >>>>> - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); >>>>> + dev_priv->ring[i].name, hws_offset); >>>>> offset = 0; >>>>> for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { >>>>> err_printf(m, "[%04x] %08x %08x %08x %08x\n", >>>>> offset, >>>>> - obj->pages[LRC_PPHWSP_PN][elt], >>>>> - obj->pages[LRC_PPHWSP_PN][elt+1], >>>>> - obj->pages[LRC_PPHWSP_PN][elt+2], >>>>> - obj->pages[LRC_PPHWSP_PN][elt+3]); >>>>> + hws_page[elt], >>>>> + hws_page[elt+1], >>>>> + hws_page[elt+2], >>>>> + hws_page[elt+3]); >>>>> offset += 16; >>>>> } >>>>> } With the 64-bit handling changed, I'd be happy to r-b Jesse's original patch, but not Alex's redefinition of LRC_PPHWSP_PN. Cheers, .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 2015-09-10 21:40 [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms Jesse Barnes 2015-09-10 21:56 ` Yu Dai @ 2015-09-15 17:03 ` Jesse Barnes 2015-09-23 8:34 ` Daniel Vetter 1 sibling, 1 reply; 8+ messages in thread From: Jesse Barnes @ 2015-09-15 17:03 UTC (permalink / raw) To: intel-gfx Looks like this was introduced in: commit d1675198ed1f21aec6e036336e4340c40b726497 Author: Alex Dai <yu.dai@intel.com> Date: Wed Aug 12 15:43:43 2015 +0100 drm/i915: Integrate GuC-based command submission This patch assumed LRC contexts and HWS layout, which is incorrect on platforms without execlists. This can lead to a crash in GPU error state readout on those platforms. I don't see a bug filed for this, but there may be one that I haven't found. v2: fixup offset handling for error capture fix (Dave) Cc: Alex Dai <yu.dai@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> --- drivers/gpu/drm/i915/i915_gpu_error.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 3379f9c..f95de05 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -457,17 +457,23 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, } if ((obj = error->ring[i].hws_page)) { + u64 hws_offset = obj->gtt_offset; + u32 *hws_page = &obj->pages[0][0]; + + if (i915.enable_execlists) { + hws_offset += LRC_PPHWSP_PN * PAGE_SIZE; + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; + } err_printf(m, "%s --- HW Status = 0x%08llx\n", - dev_priv->ring[i].name, - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); + dev_priv->ring[i].name, hws_offset); offset = 0; for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { err_printf(m, "[%04x] %08x %08x %08x %08x\n", offset, - obj->pages[LRC_PPHWSP_PN][elt], - obj->pages[LRC_PPHWSP_PN][elt+1], - obj->pages[LRC_PPHWSP_PN][elt+2], - obj->pages[LRC_PPHWSP_PN][elt+3]); + hws_page[elt], + hws_page[elt+1], + hws_page[elt+2], + hws_page[elt+3]); offset += 16; } } -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 2015-09-15 17:03 ` [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 Jesse Barnes @ 2015-09-23 8:34 ` Daniel Vetter 0 siblings, 0 replies; 8+ messages in thread From: Daniel Vetter @ 2015-09-23 8:34 UTC (permalink / raw) To: Jesse Barnes; +Cc: intel-gfx On Tue, Sep 15, 2015 at 10:03:01AM -0700, Jesse Barnes wrote: > Looks like this was introduced in: > commit d1675198ed1f21aec6e036336e4340c40b726497 > Author: Alex Dai <yu.dai@intel.com> > Date: Wed Aug 12 15:43:43 2015 +0100 > > drm/i915: Integrate GuC-based command submission > > This patch assumed LRC contexts and HWS layout, which is incorrect on > platforms without execlists. This can lead to a crash in GPU error > state readout on those platforms. > > I don't see a bug filed for this, but there may be one that I haven't > found. > > v2: fixup offset handling for error capture fix (Dave) > > Cc: Alex Dai <yu.dai@intel.com> > Reviewed-by: Dave Gordon <david.s.gordon@intel.com> > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Queued for -next, thanks for the patch. -Daniel > --- > drivers/gpu/drm/i915/i915_gpu_error.c | 18 ++++++++++++------ > 1 file changed, 12 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > index 3379f9c..f95de05 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -457,17 +457,23 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, > } > > if ((obj = error->ring[i].hws_page)) { > + u64 hws_offset = obj->gtt_offset; > + u32 *hws_page = &obj->pages[0][0]; > + > + if (i915.enable_execlists) { > + hws_offset += LRC_PPHWSP_PN * PAGE_SIZE; > + hws_page = &obj->pages[LRC_PPHWSP_PN][0]; > + } > err_printf(m, "%s --- HW Status = 0x%08llx\n", > - dev_priv->ring[i].name, > - obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE); > + dev_priv->ring[i].name, hws_offset); > offset = 0; > for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { > err_printf(m, "[%04x] %08x %08x %08x %08x\n", > offset, > - obj->pages[LRC_PPHWSP_PN][elt], > - obj->pages[LRC_PPHWSP_PN][elt+1], > - obj->pages[LRC_PPHWSP_PN][elt+2], > - obj->pages[LRC_PPHWSP_PN][elt+3]); > + hws_page[elt], > + hws_page[elt+1], > + hws_page[elt+2], > + hws_page[elt+3]); > offset += 16; > } > } > -- > 1.9.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-09-23 8:32 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-10 21:40 [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms Jesse Barnes 2015-09-10 21:56 ` Yu Dai 2015-09-10 21:58 ` Jesse Barnes 2015-09-10 22:07 ` Yu Dai 2015-09-14 9:21 ` Daniel Vetter 2015-09-14 16:45 ` Dave Gordon 2015-09-15 17:03 ` [PATCH] drm/i915: fix crash in error state readout on non-execlist platforms v2 Jesse Barnes 2015-09-23 8:34 ` Daniel Vetter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox