* [PATCH 1/6] drm/i915: Extract register state error capture
@ 2014-01-30 8:19 Ben Widawsky
From: Ben Widawsky @ 2014-01-30 8:19 UTC
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
The code has become quite hairy. By relocating all the generic registers
it will become more obvious where future ones should go. There is still
admittedly a bit of confusion left for things like per ring registers.
A subsequent patch will clean this function up.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 77 +++++++++++++++++++----------------
1 file changed, 43 insertions(+), 34 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 21cf0cf..67c82e5 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1012,43 +1012,13 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
}
}
-/**
- * i915_capture_error_state - capture an error record for later analysis
- * @dev: drm device
- *
- * Should be called when an error is detected (either a hang or an error
- * interrupt) to capture error state from the time of the error. Fills
- * out a structure which becomes available in debugfs for user level tools
- * to pick up.
- */
-void i915_capture_error_state(struct drm_device *dev)
+/* Capture all registers which don't fit into another category. */
+static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
+ struct drm_i915_error_state *error)
{
- struct drm_i915_private *dev_priv = dev->dev_private;
- struct drm_i915_error_state *error;
- unsigned long flags;
+ struct drm_device *dev = dev_priv->dev;
int pipe;
- spin_lock_irqsave(&dev_priv->gpu_error.lock, flags);
- error = dev_priv->gpu_error.first_error;
- spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags);
- if (error)
- return;
-
- /* Account for pipe specific data like PIPE*STAT */
- error = kzalloc(sizeof(*error), GFP_ATOMIC);
- if (!error) {
- DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
- return;
- }
-
- DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n",
- dev->primary->index);
- DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
- DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
- DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
- DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
-
- kref_init(&error->ref);
error->eir = I915_READ(EIR);
error->pgtbl_er = I915_READ(PGTBL_ER);
if (HAS_HW_CONTEXTS(dev))
@@ -1086,7 +1056,46 @@ void i915_capture_error_state(struct drm_device *dev)
error->err_int = I915_READ(GEN7_ERR_INT);
i915_get_extra_instdone(dev, error->extra_instdone);
+}
+
+/**
+ * i915_capture_error_state - capture an error record for later analysis
+ * @dev: drm device
+ *
+ * Should be called when an error is detected (either a hang or an error
+ * interrupt) to capture error state from the time of the error. Fills
+ * out a structure which becomes available in debugfs for user level tools
+ * to pick up.
+ */
+void i915_capture_error_state(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_error_state *error;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev_priv->gpu_error.lock, flags);
+ error = dev_priv->gpu_error.first_error;
+ spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags);
+ if (error)
+ return;
+
+ /* Account for pipe specific data like PIPE*STAT */
+ error = kzalloc(sizeof(*error), GFP_ATOMIC);
+ if (!error) {
+ DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
+ return;
+ }
+
+ DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n",
+ dev->primary->index);
+ DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
+ DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
+ DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
+ DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
+
+ kref_init(&error->ref);
+ i915_capture_reg_state(dev_priv, error);
i915_gem_capture_buffers(dev_priv, error);
i915_gem_record_fences(dev, error);
i915_gem_record_rings(dev, error);
--
1.8.5.3
* [PATCH 2/6] drm/i915: Logically reorder error register capture
From: Ben Widawsky @ 2014-01-30 8:19 UTC
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
Create logical sections in an attempt to clean up, and continue to keep
future additions clean.
v2: Reworded the comments. Added section headers (Chris)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 59 +++++++++++++++++++++--------------
1 file changed, 36 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 67c82e5..c9d4a18 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1019,41 +1019,54 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
struct drm_device *dev = dev_priv->dev;
int pipe;
- error->eir = I915_READ(EIR);
- error->pgtbl_er = I915_READ(PGTBL_ER);
- if (HAS_HW_CONTEXTS(dev))
- error->ccid = I915_READ(CCID);
+ /* General organization
+ * 1. Registers specific to a single generation
+ * 2. Registers which belong to multiple generations
+ * 3. Feature specific registers.
+ * 4. Everything else
+ * Please try to follow the order.
+ */
- if (HAS_PCH_SPLIT(dev))
- error->ier = I915_READ(DEIER) | I915_READ(GTIER);
- else if (IS_VALLEYVIEW(dev))
+ /* 1: Registers specific to a single generation */
+ if (IS_VALLEYVIEW(dev)) {
error->ier = I915_READ(GTIER) | I915_READ(VLV_IER);
- else if (IS_GEN2(dev))
- error->ier = I915_READ16(IER);
- else
- error->ier = I915_READ(IER);
+ error->forcewake = I915_READ(FORCEWAKE_VLV);
+ }
- if (INTEL_INFO(dev)->gen >= 6)
- error->derrmr = I915_READ(DERRMR);
+ if (IS_GEN7(dev))
+ error->err_int = I915_READ(GEN7_ERR_INT);
- if (IS_VALLEYVIEW(dev))
- error->forcewake = I915_READ(FORCEWAKE_VLV);
- else if (INTEL_INFO(dev)->gen >= 7)
- error->forcewake = I915_READ(FORCEWAKE_MT);
- else if (INTEL_INFO(dev)->gen == 6)
+ if (IS_GEN6(dev))
error->forcewake = I915_READ(FORCEWAKE);
- if (!HAS_PCH_SPLIT(dev))
- for_each_pipe(pipe)
- error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
+ if (IS_GEN2(dev))
+ error->ier = I915_READ16(IER);
+
+ /* 2: Registers which belong to multiple generations */
+ if (INTEL_INFO(dev)->gen >= 7)
+ error->forcewake = I915_READ(FORCEWAKE_MT);
if (INTEL_INFO(dev)->gen >= 6) {
+ error->derrmr = I915_READ(DERRMR);
error->error = I915_READ(ERROR_GEN6);
error->done_reg = I915_READ(DONE_REG);
}
- if (INTEL_INFO(dev)->gen == 7)
- error->err_int = I915_READ(GEN7_ERR_INT);
+ /* 3: Feature specific registers */
+ if (HAS_HW_CONTEXTS(dev))
+ error->ccid = I915_READ(CCID);
+
+ if (HAS_PCH_SPLIT(dev))
+ error->ier = I915_READ(DEIER) | I915_READ(GTIER);
+ else {
+ error->ier = I915_READ(IER);
+ for_each_pipe(pipe)
+ error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
+ }
+
+ /* 4: Everything else */
+ error->eir = I915_READ(EIR);
+ error->pgtbl_er = I915_READ(PGTBL_ER);
i915_get_extra_instdone(dev, error->extra_instdone);
}
--
1.8.5.3
* [PATCH 3/6] drm/i915: Reorder struct members
From: Ben Widawsky @ 2014-01-30 8:19 UTC
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
This helps make an upcoming patch a bit more reviewable
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 43 ++++++++++++++++++++++++-----------------
1 file changed, 25 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3782b36..cd97c86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -295,14 +295,26 @@ struct intel_display_error_state;
struct drm_i915_error_state {
struct kref ref;
+ struct timeval time;
+
+ /* Generic register state */
u32 eir;
u32 pgtbl_er;
u32 ier;
u32 ccid;
u32 derrmr;
u32 forcewake;
- bool waiting[I915_NUM_RINGS];
+ u32 error; /* gen6+ */
+ u32 err_int; /* gen7 */
+ u32 done_reg;
+ u32 extra_instdone[I915_NUM_INSTDONE_REG];
u32 pipestat[I915_MAX_PIPES];
+ u64 fence[I915_MAX_NUM_FENCES];
+ struct intel_overlay_error_state *overlay;
+ struct intel_display_error_state *display;
+
+ /* Per ring register state
+ * TODO: Move these to per ring */
u32 tail[I915_NUM_RINGS];
u32 head[I915_NUM_RINGS];
u32 ctl[I915_NUM_RINGS];
@@ -311,25 +323,25 @@ struct drm_i915_error_state {
u32 ipehr[I915_NUM_RINGS];
u32 instdone[I915_NUM_RINGS];
u32 acthd[I915_NUM_RINGS];
- u32 semaphore_mboxes[I915_NUM_RINGS][I915_NUM_RINGS - 1];
- u32 semaphore_seqno[I915_NUM_RINGS][I915_NUM_RINGS - 1];
- u32 rc_psmi[I915_NUM_RINGS]; /* sleep state */
- /* our own tracking of ring head and tail */
- u32 cpu_ring_head[I915_NUM_RINGS];
- u32 cpu_ring_tail[I915_NUM_RINGS];
- u32 error; /* gen6+ */
- u32 err_int; /* gen7 */
u32 bbstate[I915_NUM_RINGS];
u32 instpm[I915_NUM_RINGS];
u32 instps[I915_NUM_RINGS];
- u32 extra_instdone[I915_NUM_INSTDONE_REG];
u32 seqno[I915_NUM_RINGS];
u64 bbaddr[I915_NUM_RINGS];
u32 fault_reg[I915_NUM_RINGS];
- u32 done_reg;
u32 faddr[I915_NUM_RINGS];
- u64 fence[I915_MAX_NUM_FENCES];
- struct timeval time;
+ u32 rc_psmi[I915_NUM_RINGS]; /* sleep state */
+ u32 semaphore_mboxes[I915_NUM_RINGS][I915_NUM_RINGS - 1];
+
+ /* Software tracked state */
+ bool waiting[I915_NUM_RINGS];
+ int hangcheck_score[I915_NUM_RINGS];
+ enum intel_ring_hangcheck_action hangcheck_action[I915_NUM_RINGS];
+
+ /* our own tracking of ring head and tail */
+ u32 cpu_ring_head[I915_NUM_RINGS];
+ u32 cpu_ring_tail[I915_NUM_RINGS];
+ u32 semaphore_seqno[I915_NUM_RINGS][I915_NUM_RINGS - 1];
struct drm_i915_error_ring {
bool valid;
@@ -363,11 +375,6 @@ struct drm_i915_error_state {
} **active_bo, **pinned_bo;
u32 *active_bo_count, *pinned_bo_count;
u32 vm_count;
-
- struct intel_overlay_error_state *overlay;
- struct intel_display_error_state *display;
- int hangcheck_score[I915_NUM_RINGS];
- enum intel_ring_hangcheck_action hangcheck_action[I915_NUM_RINGS];
};
struct intel_connector;
--
1.8.5.3
* [PATCH 4/6] drm/i915: Move per ring error state to ring_error
From: Ben Widawsky @ 2014-01-30 8:19 UTC
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
v2: Moved num_requests up (Chris)
Rebased on new hws page capture which required a rename since it made
two members named, 'hws' in the per ring error state. (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 65 ++++++++--------
drivers/gpu/drm/i915/i915_gpu_error.c | 143 +++++++++++++++++-----------------
2 files changed, 104 insertions(+), 104 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cd97c86..d20fc80 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -313,49 +313,50 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
- /* Per ring register state
- * TODO: Move these to per ring */
- u32 tail[I915_NUM_RINGS];
- u32 head[I915_NUM_RINGS];
- u32 ctl[I915_NUM_RINGS];
- u32 hws[I915_NUM_RINGS];
- u32 ipeir[I915_NUM_RINGS];
- u32 ipehr[I915_NUM_RINGS];
- u32 instdone[I915_NUM_RINGS];
- u32 acthd[I915_NUM_RINGS];
- u32 bbstate[I915_NUM_RINGS];
- u32 instpm[I915_NUM_RINGS];
- u32 instps[I915_NUM_RINGS];
- u32 seqno[I915_NUM_RINGS];
- u64 bbaddr[I915_NUM_RINGS];
- u32 fault_reg[I915_NUM_RINGS];
- u32 faddr[I915_NUM_RINGS];
- u32 rc_psmi[I915_NUM_RINGS]; /* sleep state */
- u32 semaphore_mboxes[I915_NUM_RINGS][I915_NUM_RINGS - 1];
-
- /* Software tracked state */
- bool waiting[I915_NUM_RINGS];
- int hangcheck_score[I915_NUM_RINGS];
- enum intel_ring_hangcheck_action hangcheck_action[I915_NUM_RINGS];
-
- /* our own tracking of ring head and tail */
- u32 cpu_ring_head[I915_NUM_RINGS];
- u32 cpu_ring_tail[I915_NUM_RINGS];
- u32 semaphore_seqno[I915_NUM_RINGS][I915_NUM_RINGS - 1];
-
struct drm_i915_error_ring {
bool valid;
+ /* Software tracked state */
+ bool waiting;
+ int hangcheck_score;
+ enum intel_ring_hangcheck_action hangcheck_action;
+ int num_requests;
+
+ /* our own tracking of ring head and tail */
+ u32 cpu_ring_head;
+ u32 cpu_ring_tail;
+
+ u32 semaphore_seqno[I915_NUM_RINGS - 1];
+
+ /* Register state */
+ u32 tail;
+ u32 head;
+ u32 ctl;
+ u32 hws;
+ u32 ipeir;
+ u32 ipehr;
+ u32 instdone;
+ u32 acthd;
+ u32 bbstate;
+ u32 instpm;
+ u32 instps;
+ u32 seqno;
+ u64 bbaddr;
+ u32 fault_reg;
+ u32 faddr;
+ u32 rc_psmi; /* sleep state */
+ u32 semaphore_mboxes[I915_NUM_RINGS - 1];
+
struct drm_i915_error_object {
int page_count;
u32 gtt_offset;
u32 *pages[0];
- } *ringbuffer, *batchbuffer, *ctx, *hws;
+ } *ringbuffer, *batchbuffer, *ctx, *hws_page;
+
struct drm_i915_error_request {
long jiffies;
u32 seqno;
u32 tail;
} *requests;
- int num_requests;
} ring[I915_NUM_RINGS];
struct drm_i915_error_buffer {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index c9d4a18..07433bc 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -235,51 +235,48 @@ static const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a)
static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
struct drm_device *dev,
- struct drm_i915_error_state *error,
- unsigned ring)
+ struct drm_i915_error_ring *ring)
{
- BUG_ON(ring >= I915_NUM_RINGS); /* shut up confused gcc */
- if (!error->ring[ring].valid)
+ if (!ring->valid)
return;
- err_printf(m, "%s command stream:\n", ring_str(ring));
- err_printf(m, " HEAD: 0x%08x\n", error->head[ring]);
- err_printf(m, " TAIL: 0x%08x\n", error->tail[ring]);
- err_printf(m, " CTL: 0x%08x\n", error->ctl[ring]);
- err_printf(m, " HWS: 0x%08x\n", error->hws[ring]);
- err_printf(m, " ACTHD: 0x%08x\n", error->acthd[ring]);
- err_printf(m, " IPEIR: 0x%08x\n", error->ipeir[ring]);
- err_printf(m, " IPEHR: 0x%08x\n", error->ipehr[ring]);
- err_printf(m, " INSTDONE: 0x%08x\n", error->instdone[ring]);
+ err_printf(m, " HEAD: 0x%08x\n", ring->head);
+ err_printf(m, " TAIL: 0x%08x\n", ring->tail);
+ err_printf(m, " CTL: 0x%08x\n", ring->ctl);
+ err_printf(m, " HWS: 0x%08x\n", ring->hws);
+ err_printf(m, " ACTHD: 0x%08x\n", ring->acthd);
+ err_printf(m, " IPEIR: 0x%08x\n", ring->ipeir);
+ err_printf(m, " IPEHR: 0x%08x\n", ring->ipehr);
+ err_printf(m, " INSTDONE: 0x%08x\n", ring->instdone);
if (INTEL_INFO(dev)->gen >= 4) {
- err_printf(m, " BBADDR: 0x%08llx\n", error->bbaddr[ring]);
- err_printf(m, " BB_STATE: 0x%08x\n", error->bbstate[ring]);
- err_printf(m, " INSTPS: 0x%08x\n", error->instps[ring]);
+ err_printf(m, " BBADDR: 0x%08llx\n", ring->bbaddr);
+ err_printf(m, " BB_STATE: 0x%08x\n", ring->bbstate);
+ err_printf(m, " INSTPS: 0x%08x\n", ring->instps);
}
- err_printf(m, " INSTPM: 0x%08x\n", error->instpm[ring]);
- err_printf(m, " FADDR: 0x%08x\n", error->faddr[ring]);
+ err_printf(m, " INSTPM: 0x%08x\n", ring->instpm);
+ err_printf(m, " FADDR: 0x%08x\n", ring->faddr);
if (INTEL_INFO(dev)->gen >= 6) {
- err_printf(m, " RC PSMI: 0x%08x\n", error->rc_psmi[ring]);
- err_printf(m, " FAULT_REG: 0x%08x\n", error->fault_reg[ring]);
+ err_printf(m, " RC PSMI: 0x%08x\n", ring->rc_psmi);
+ err_printf(m, " FAULT_REG: 0x%08x\n", ring->fault_reg);
err_printf(m, " SYNC_0: 0x%08x [last synced 0x%08x]\n",
- error->semaphore_mboxes[ring][0],
- error->semaphore_seqno[ring][0]);
+ ring->semaphore_mboxes[0],
+ ring->semaphore_seqno[0]);
err_printf(m, " SYNC_1: 0x%08x [last synced 0x%08x]\n",
- error->semaphore_mboxes[ring][1],
- error->semaphore_seqno[ring][1]);
+ ring->semaphore_mboxes[1],
+ ring->semaphore_seqno[1]);
if (HAS_VEBOX(dev)) {
err_printf(m, " SYNC_2: 0x%08x [last synced 0x%08x]\n",
- error->semaphore_mboxes[ring][2],
- error->semaphore_seqno[ring][2]);
+ ring->semaphore_mboxes[2],
+ ring->semaphore_seqno[2]);
}
}
- err_printf(m, " seqno: 0x%08x\n", error->seqno[ring]);
- err_printf(m, " waiting: %s\n", yesno(error->waiting[ring]));
- err_printf(m, " ring->head: 0x%08x\n", error->cpu_ring_head[ring]);
- err_printf(m, " ring->tail: 0x%08x\n", error->cpu_ring_tail[ring]);
+ err_printf(m, " seqno: 0x%08x\n", ring->seqno);
+ err_printf(m, " waiting: %s\n", yesno(ring->waiting));
+ err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
+ err_printf(m, " ring->tail: 0x%08x\n", ring->cpu_ring_tail);
err_printf(m, " hangcheck: %s [%d]\n",
- hangcheck_action_to_str(error->hangcheck_action[ring]),
- error->hangcheck_score[ring]);
+ hangcheck_action_to_str(ring->hangcheck_action),
+ ring->hangcheck_score);
}
void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...)
@@ -331,8 +328,10 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
if (INTEL_INFO(dev)->gen == 7)
err_printf(m, "ERR_INT: 0x%08x\n", error->err_int);
- for (i = 0; i < ARRAY_SIZE(error->ring); i++)
- i915_ring_error_state(m, dev, error, i);
+ for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
+ err_printf(m, "%s command stream:\n", ring_str(i));
+ i915_ring_error_state(m, dev, &error->ring[i]);
+ }
for (i = 0; i < error->vm_count; i++) {
err_printf(m, "vm[%d]\n", i);
@@ -390,7 +389,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
}
- if ((obj = error->ring[i].hws)) {
+ if ((obj = error->ring[i].hws_page)) {
err_printf(m, "%s --- HW Status = 0x%08x\n",
dev_priv->ring[i].name,
obj->gtt_offset);
@@ -488,7 +487,7 @@ static void i915_error_state_free(struct kref *error_ref)
for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
i915_error_object_free(error->ring[i].batchbuffer);
i915_error_object_free(error->ring[i].ringbuffer);
- i915_error_object_free(error->ring[i].hws);
+ i915_error_object_free(error->ring[i].hws_page);
i915_error_object_free(error->ring[i].ctx);
kfree(error->ring[i].requests);
}
@@ -767,52 +766,52 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
}
static void i915_record_ring_state(struct drm_device *dev,
- struct drm_i915_error_state *error,
- struct intel_ring_buffer *ring)
+ struct intel_ring_buffer *ring,
+ struct drm_i915_error_ring *ering)
{
struct drm_i915_private *dev_priv = dev->dev_private;
if (INTEL_INFO(dev)->gen >= 6) {
- error->rc_psmi[ring->id] = I915_READ(ring->mmio_base + 0x50);
- error->fault_reg[ring->id] = I915_READ(RING_FAULT_REG(ring));
- error->semaphore_mboxes[ring->id][0]
+ ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
+ ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
+ ering->semaphore_mboxes[0]
= I915_READ(RING_SYNC_0(ring->mmio_base));
- error->semaphore_mboxes[ring->id][1]
+ ering->semaphore_mboxes[1]
= I915_READ(RING_SYNC_1(ring->mmio_base));
- error->semaphore_seqno[ring->id][0] = ring->sync_seqno[0];
- error->semaphore_seqno[ring->id][1] = ring->sync_seqno[1];
+ ering->semaphore_seqno[0] = ring->sync_seqno[0];
+ ering->semaphore_seqno[1] = ring->sync_seqno[1];
}
if (HAS_VEBOX(dev)) {
- error->semaphore_mboxes[ring->id][2] =
+ ering->semaphore_mboxes[2] =
I915_READ(RING_SYNC_2(ring->mmio_base));
- error->semaphore_seqno[ring->id][2] = ring->sync_seqno[2];
+ ering->semaphore_seqno[2] = ring->sync_seqno[2];
}
if (INTEL_INFO(dev)->gen >= 4) {
- error->faddr[ring->id] = I915_READ(RING_DMA_FADD(ring->mmio_base));
- error->ipeir[ring->id] = I915_READ(RING_IPEIR(ring->mmio_base));
- error->ipehr[ring->id] = I915_READ(RING_IPEHR(ring->mmio_base));
- error->instdone[ring->id] = I915_READ(RING_INSTDONE(ring->mmio_base));
- error->instps[ring->id] = I915_READ(RING_INSTPS(ring->mmio_base));
- error->bbaddr[ring->id] = I915_READ(RING_BBADDR(ring->mmio_base));
+ ering->faddr = I915_READ(RING_DMA_FADD(ring->mmio_base));
+ ering->ipeir = I915_READ(RING_IPEIR(ring->mmio_base));
+ ering->ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
+ ering->instdone = I915_READ(RING_INSTDONE(ring->mmio_base));
+ ering->instps = I915_READ(RING_INSTPS(ring->mmio_base));
+ ering->bbaddr = I915_READ(RING_BBADDR(ring->mmio_base));
if (INTEL_INFO(dev)->gen >= 8)
- error->bbaddr[ring->id] |= (u64) I915_READ(RING_BBADDR_UDW(ring->mmio_base)) << 32;
- error->bbstate[ring->id] = I915_READ(RING_BBSTATE(ring->mmio_base));
+ ering->bbaddr |= (u64) I915_READ(RING_BBADDR_UDW(ring->mmio_base)) << 32;
+ ering->bbstate = I915_READ(RING_BBSTATE(ring->mmio_base));
} else {
- error->faddr[ring->id] = I915_READ(DMA_FADD_I8XX);
- error->ipeir[ring->id] = I915_READ(IPEIR);
- error->ipehr[ring->id] = I915_READ(IPEHR);
- error->instdone[ring->id] = I915_READ(INSTDONE);
+ ering->faddr = I915_READ(DMA_FADD_I8XX);
+ ering->ipeir = I915_READ(IPEIR);
+ ering->ipehr = I915_READ(IPEHR);
+ ering->instdone = I915_READ(INSTDONE);
}
- error->waiting[ring->id] = waitqueue_active(&ring->irq_queue);
- error->instpm[ring->id] = I915_READ(RING_INSTPM(ring->mmio_base));
- error->seqno[ring->id] = ring->get_seqno(ring, false);
- error->acthd[ring->id] = intel_ring_get_active_head(ring);
- error->head[ring->id] = I915_READ_HEAD(ring);
- error->tail[ring->id] = I915_READ_TAIL(ring);
- error->ctl[ring->id] = I915_READ_CTL(ring);
+ ering->waiting = waitqueue_active(&ring->irq_queue);
+ ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
+ ering->seqno = ring->get_seqno(ring, false);
+ ering->acthd = intel_ring_get_active_head(ring);
+ ering->head = I915_READ_HEAD(ring);
+ ering->tail = I915_READ_TAIL(ring);
+ ering->ctl = I915_READ_CTL(ring);
if (I915_NEED_GFX_HWS(dev)) {
int mmio;
@@ -840,14 +839,14 @@ static void i915_record_ring_state(struct drm_device *dev,
mmio = RING_HWS_PGA(ring->mmio_base);
}
- error->hws[ring->id] = I915_READ(mmio);
+ ering->hws = I915_READ(mmio);
}
- error->cpu_ring_head[ring->id] = ring->head;
- error->cpu_ring_tail[ring->id] = ring->tail;
+ ering->cpu_ring_head = ring->head;
+ ering->cpu_ring_tail = ring->tail;
- error->hangcheck_score[ring->id] = ring->hangcheck.score;
- error->hangcheck_action[ring->id] = ring->hangcheck.action;
+ ering->hangcheck_score = ring->hangcheck.score;
+ ering->hangcheck_action = ring->hangcheck.action;
}
@@ -888,7 +887,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
error->ring[i].valid = true;
- i915_record_ring_state(dev, error, ring);
+ i915_record_ring_state(dev, ring, &error->ring[i]);
error->ring[i].batchbuffer =
i915_error_first_batchbuffer(dev_priv, ring);
@@ -897,7 +896,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
i915_error_ggtt_object_create(dev_priv, ring->obj);
if (ring->status_page.obj)
- error->ring[i].hws =
+ error->ring[i].hws_page =
i915_error_ggtt_object_create(dev_priv, ring->status_page.obj);
i915_gem_record_active_context(ring, error, &error->ring[i]);
--
1.8.5.3
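For readers skimming the series, here is a minimal standalone sketch of the layout change PATCH 4/6 makes: per-ring state lives in one record per ring instead of parallel arrays indexed by ring id, so a helper only needs a pointer to a single record. This is not driver code. NUM_RINGS, struct ring_error, struct error_state, ring_pending() and count_pending() are hypothetical names made up for the illustration, and the "head != tail" check is only a stand-in for real per-ring logic.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_RINGS 3	/* hypothetical stand-in for I915_NUM_RINGS */

/* Per-ring record: one struct per ring rather than one array per field. */
struct ring_error {
	bool valid;
	uint32_t head;
	uint32_t tail;
};

struct error_state {
	struct ring_error ring[NUM_RINGS];
};

/* The helper takes the per-ring record directly, so it needs no ring
 * index and no bounds check on one. */
static bool ring_pending(const struct ring_error *ring)
{
	return ring->valid && ring->head != ring->tail;
}

/* The caller walks the embedded array and passes each element down. */
static int count_pending(const struct error_state *error)
{
	int i, n = 0;

	for (i = 0; i < NUM_RINGS; i++)
		if (ring_pending(&error->ring[i]))
			n++;
	return n;
}
```

The call shape mirrors i915_ring_error_state(m, dev, &error->ring[i]) in the diff: the loop owns the index, the helper owns one ring.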
* [PATCH 5/6] drm/i915: Add some more registers to error state
From: Ben Widawsky @ 2014-01-30 8:19 UTC
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
Chris: Do we also want to capture?
GAC_ECO_BITS /* gen6,7 */
GAM_ECOCHK /* gen6,7 */
GAB_CTL /* gen6 */
GFX_MODE /* gen6 */
Requested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 4 ++++
drivers/gpu/drm/i915/i915_gpu_error.c | 11 ++++++++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d20fc80..e41f30a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -307,6 +307,10 @@ struct drm_i915_error_state {
u32 error; /* gen6+ */
u32 err_int; /* gen7 */
u32 done_reg;
+ u32 gac_eco;
+ u32 gam_ecochk;
+ u32 gab_ctl;
+ u32 gfx_mode;
u32 extra_instdone[I915_NUM_INSTDONE_REG];
u32 pipestat[I915_MAX_PIPES];
u64 fence[I915_MAX_NUM_FENCES];
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 07433bc..4c3ca11 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1035,8 +1035,11 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
if (IS_GEN7(dev))
error->err_int = I915_READ(GEN7_ERR_INT);
- if (IS_GEN6(dev))
+ if (IS_GEN6(dev)) {
error->forcewake = I915_READ(FORCEWAKE);
+ error->gab_ctl = I915_READ(GAB_CTL);
+ error->gfx_mode = I915_READ(GFX_MODE);
+ }
if (IS_GEN2(dev))
error->ier = I915_READ16(IER);
@@ -1052,6 +1055,12 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
}
/* 3: Feature specific registers */
+ if (IS_GEN6(dev) || IS_GEN7(dev)) {
+ error->gam_ecochk = I915_READ(GAM_ECOCHK);
+ error->gac_eco = I915_READ(GAC_ECO_BITS);
+ }
+
+ /* 4: Everything else */
if (HAS_HW_CONTEXTS(dev))
error->ccid = I915_READ(CCID);
--
1.8.5.3
* [PATCH 6/6] drm/i915: Capture PPGTT info on error capture
From: Ben Widawsky @ 2014-01-30 8:19 UTC
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
v2: Rebased upon cleaned up error state
v3: Make sure hangcheck info remains last (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
drivers/gpu/drm/i915/i915_gpu_error.c | 37 +++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e41f30a..3035bf3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -361,6 +361,14 @@ struct drm_i915_error_state {
u32 seqno;
u32 tail;
} *requests;
+
+ struct {
+ u32 gfx_mode;
+ union {
+ u64 pdp[4];
+ u32 pp_dir_base;
+ };
+ } vm_info;
} ring[I915_NUM_RINGS];
struct drm_i915_error_buffer {
@@ -378,6 +386,7 @@ struct drm_i915_error_state {
s32 ring:4;
u32 cache_level:3;
} **active_bo, **pinned_bo;
+
u32 *active_bo_count, *pinned_bo_count;
u32 vm_count;
};
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 4c3ca11..9d04e6a 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -270,6 +270,19 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
ring->semaphore_seqno[2]);
}
}
+ if (USES_PPGTT(dev)) {
+ err_printf(m, " GFX_MODE: 0x%08x\n", ring->vm_info.gfx_mode);
+
+ if (INTEL_INFO(dev)->gen >= 8) {
+ int i;
+ for (i = 0; i < 4; i++)
+ err_printf(m, " PDP%d: 0x%016llx\n",
+ i, ring->vm_info.pdp[i]);
+ } else {
+ err_printf(m, " PP_DIR_BASE: 0x%08x\n",
+ ring->vm_info.pp_dir_base);
+ }
+ }
err_printf(m, " seqno: 0x%08x\n", ring->seqno);
err_printf(m, " waiting: %s\n", yesno(ring->waiting));
err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
@@ -847,6 +860,30 @@ static void i915_record_ring_state(struct drm_device *dev,
ering->hangcheck_score = ring->hangcheck.score;
ering->hangcheck_action = ring->hangcheck.action;
+
+ if (USES_PPGTT(dev)) {
+ int i;
+
+ ering->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(ring));
+
+ switch (INTEL_INFO(dev)->gen) {
+ case 8:
+ for (i = 0; i < 4; i++) {
+ ering->vm_info.pdp[i] =
+ I915_READ(GEN8_RING_PDP_UDW(ring, i));
+ ering->vm_info.pdp[i] <<= 32;
+ ering->vm_info.pdp[i] |=
+ I915_READ(GEN8_RING_PDP_LDW(ring, i));
+ }
+ break;
+ case 7:
+ ering->vm_info.pp_dir_base = RING_PP_DIR_BASE(ring);
+ break;
+ case 6:
+ ering->vm_info.pp_dir_base = RING_PP_DIR_BASE_READ(ring);
+ break;
+ }
+ }
}
--
1.8.5.3
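The gen8 branch in PATCH 6/6 builds each 64-bit PDP value from an upper and a lower 32-bit register read. A minimal standalone sketch of that combine step follows, assuming nothing about the hardware beyond what the diff shows. combine_dwords() is a hypothetical helper, not a function in i915; in the driver the two halves come from I915_READ() of GEN8_RING_PDP_UDW and GEN8_RING_PDP_LDW.

```c
#include <stdint.h>

/* Hypothetical helper (not part of i915): build a 64-bit value from two
 * 32-bit halves in the same order as the patch: store the upper half,
 * shift it up by 32, then OR in the lower half. */
static uint64_t combine_dwords(uint32_t udw, uint32_t ldw)
{
	uint64_t v = udw;	/* widen first so the shift keeps all bits */

	v <<= 32;
	v |= ldw;
	return v;
}
```

The widening matters: shifting a 32-bit value by 32 is undefined in C, which is why the patch assigns into the u64 pdp[i] before shifting.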
* Re: [PATCH 6/6] drm/i915: Capture PPGTT info on error capture
From: Daniel Vetter @ 2014-01-30 11:26 UTC
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Thu, Jan 30, 2014 at 12:19:40AM -0800, Ben Widawsky wrote:
> v2: Rebased upon cleaned up error state
> v3: Make sure hangcheck info remains last (Chris)
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Pulled in entire series with Chris' irc ack on the remaining two patches.
Note that there have been a few conflicts around ring_error_state->valid
(dunno how that happened, I guess this series wasn't strictly based on
-nightly), please double-check things.
Thanks, Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
> drivers/gpu/drm/i915/i915_gpu_error.c | 37 +++++++++++++++++++++++++++++++++++
> 2 files changed, 46 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e41f30a..3035bf3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -361,6 +361,14 @@ struct drm_i915_error_state {
> u32 seqno;
> u32 tail;
> } *requests;
> +
> + struct {
> + u32 gfx_mode;
> + union {
> + u64 pdp[4];
> + u32 pp_dir_base;
> + };
> + } vm_info;
> } ring[I915_NUM_RINGS];
>
> struct drm_i915_error_buffer {
> @@ -378,6 +386,7 @@ struct drm_i915_error_state {
> s32 ring:4;
> u32 cache_level:3;
> } **active_bo, **pinned_bo;
> +
> u32 *active_bo_count, *pinned_bo_count;
> u32 vm_count;
> };
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 4c3ca11..9d04e6a 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -270,6 +270,19 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
> ring->semaphore_seqno[2]);
> }
> }
> + if (USES_PPGTT(dev)) {
> + err_printf(m, " GFX_MODE: 0x%08x\n", ring->vm_info.gfx_mode);
> +
> + if (INTEL_INFO(dev)->gen >= 8) {
> + int i;
> + for (i = 0; i < 4; i++)
> + err_printf(m, " PDP%d: 0x%016llx\n",
> + i, ring->vm_info.pdp[i]);
> + } else {
> + err_printf(m, " PP_DIR_BASE: 0x%08x\n",
> + ring->vm_info.pp_dir_base);
> + }
> + }
> err_printf(m, " seqno: 0x%08x\n", ring->seqno);
> err_printf(m, " waiting: %s\n", yesno(ring->waiting));
> err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
> @@ -847,6 +860,30 @@ static void i915_record_ring_state(struct drm_device *dev,
>
> ering->hangcheck_score = ring->hangcheck.score;
> ering->hangcheck_action = ring->hangcheck.action;
> +
> + if (USES_PPGTT(dev)) {
> + int i;
> +
> + ering->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(ring));
> +
> + switch (INTEL_INFO(dev)->gen) {
> + case 8:
> + for (i = 0; i < 4; i++) {
> + ering->vm_info.pdp[i] =
> + I915_READ(GEN8_RING_PDP_UDW(ring, i));
> + ering->vm_info.pdp[i] <<= 32;
> + ering->vm_info.pdp[i] |=
> + I915_READ(GEN8_RING_PDP_LDW(ring, i));
> + }
> + break;
> + case 7:
> + ering->vm_info.pp_dir_base = RING_PP_DIR_BASE(ring);
> + break;
> + case 6:
> + ering->vm_info.pp_dir_base = RING_PP_DIR_BASE_READ(ring);
> + break;
> + }
> + }
> }
>
>
> --
> 1.8.5.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH 6/6] drm/i915: Capture PPGTT info on error capture
From: Daniel Vetter @ 2014-01-30 11:34 UTC
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Thu, Jan 30, 2014 at 12:26:49PM +0100, Daniel Vetter wrote:
> On Thu, Jan 30, 2014 at 12:19:40AM -0800, Ben Widawsky wrote:
> > v2: Rebased upon cleaned up error state
> > v3: Make sure hangcheck info remains last (Chris)
> >
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> Pulled in entire series with Chris' irc ack on the remaining two patches.
> Note that there have been a few conflicts around ring_error_state->valid
> (dunno how that happened, I guess this series wasn't strictly based on
> -nightly), please double-check things.
Meh, I've forgotten to check -fixes - the ring->valid patch I've been
looking for was obviously there. Coffee doesn't seem to work today, I'll
do a backmerge and sort this out.
Sorry for the fuss.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch