* Per-engine reset, preliminary recovery patches
@ 2016-08-06 16:49 Chris Wilson
2016-08-06 16:49 ` [PATCH 1/4] drm/i915: Record the position of the start of the request Chris Wilson
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: Chris Wilson @ 2016-08-06 16:49 UTC (permalink / raw)
To: intel-gfx
Before we get into the fine-grained TDR driven, per-engine reset, we
want to have a solid base for doing the recovery.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/4] drm/i915: Record the position of the start of the request
2016-08-06 16:49 Per-engine reset, preliminary recovery patches Chris Wilson
@ 2016-08-06 16:49 ` Chris Wilson
2016-08-10 12:19 ` Mika Kuoppala
2016-08-06 16:49 ` [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking Chris Wilson
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2016-08-06 16:49 UTC (permalink / raw)
To: intel-gfx
Not only does it make for good documentation and debugging aide, but it is
also vital for when we want to unwind requests - such as when throwing away
an incomplete request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluvery@linux.intel.com
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem_request.c | 13 +++++++++----
drivers/gpu/drm/i915/i915_gpu_error.c | 6 ++++--
3 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index feec00f769e1..040c3b6300ac 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -557,6 +557,7 @@ struct drm_i915_error_state {
struct drm_i915_error_request {
long jiffies;
u32 seqno;
+ u32 head;
u32 tail;
} *requests;
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 6a1661643d3d..6c2553715263 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -391,6 +391,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
if (ret)
goto err_ctx;
+ /* Record the position of the start of the request so that
+ * should we detect the updated seqno part-way through the
+ * GPU processing the request, we never over-estimate the
+ * position of the head.
+ */
+ req->head = req->ring->tail;
+
return req;
err_ctx:
@@ -467,8 +474,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
trace_i915_gem_request_add(request);
- request->head = request_start;
-
/* Whilst this request exists, batch_obj will be on the
* active_list, and so will hold the active reference. Only when this
* request is retired will the the batch_obj be moved onto the
@@ -489,10 +494,10 @@ void __i915_add_request(struct drm_i915_gem_request *request,
list_add_tail(&request->link, &engine->request_list);
list_add_tail(&request->ring_link, &ring->request_list);
- /* Record the position of the start of the request so that
+ /* Record the position of the start of the breadcrumb so that
* should we detect the updated seqno part-way through the
* GPU processing the request, we never over-estimate the
- * position of the head.
+ * position of the ring's HEAD.
*/
request->postfix = ring->tail;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb87063c88..c9e1246ecd36 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -455,9 +455,10 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
dev_priv->engine[i].name,
ee->num_requests);
for (j = 0; j < ee->num_requests; j++) {
- err_printf(m, " seqno 0x%08x, emitted %ld, tail 0x%08x\n",
+ err_printf(m, " seqno 0x%08x, emitted %ld, head 0x%08x tail 0x%08x\n",
ee->requests[j].seqno,
ee->requests[j].jiffies,
+ ee->requests[j].head,
ee->requests[j].tail);
}
}
@@ -1205,7 +1206,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
erq = &ee->requests[count++];
erq->seqno = request->fence.seqno;
erq->jiffies = request->emitted_jiffies;
- erq->tail = request->postfix;
+ erq->head = request->head;
+ erq->tail = request->tail;
}
}
}
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking
2016-08-06 16:49 Per-engine reset, preliminary recovery patches Chris Wilson
2016-08-06 16:49 ` [PATCH 1/4] drm/i915: Record the position of the start of the request Chris Wilson
@ 2016-08-06 16:49 ` Chris Wilson
2016-08-23 14:28 ` Mika Kuoppala
2016-08-06 16:49 ` [PATCH 3/4] drm/i915: Update reset path to fix incomplete requests Chris Wilson
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2016-08-06 16:49 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala
Emulate HW to track and manage ELSP queue. A set of SW ports are defined
and requests are assigned to these ports before submitting them to HW. This
helps in cleaning up incomplete requests during reset recovery easier
especially after engine reset by decoupling elsp queue management. This
will become more clear in the next patch.
In the engine reset case we want to resume where we left-off after skipping
the incomplete batch which requires checking the elsp queue, removing
element and fixing elsp_submitted counts in some cases. Instead of directly
manipulating the elsp queue from reset path we can examine these ports, fix
up ringbuffer pointers using the incomplete request and restart submissions
again after reset.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-3-git-send-email-arun.siluvery@linux.intel.com
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_gem.c | 6 +-
drivers/gpu/drm/i915/i915_gem_request.c | 11 +-
drivers/gpu/drm/i915/i915_gem_request.h | 24 +-
drivers/gpu/drm/i915/intel_lrc.c | 417 +++++++++++---------------------
drivers/gpu/drm/i915/intel_lrc.h | 2 -
drivers/gpu/drm/i915/intel_ringbuffer.h | 7 +-
7 files changed, 160 insertions(+), 309 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 9bd41581b592..f1f937a64c27 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2200,7 +2200,7 @@ static int i915_execlists(struct seq_file *m, void *data)
status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(engine));
seq_printf(m, "\tStatus pointer: 0x%08X\n", status_pointer);
- read_pointer = engine->next_context_status_buffer;
+ read_pointer = GEN8_CSB_READ_PTR(status_pointer);
write_pointer = GEN8_CSB_WRITE_PTR(status_pointer);
if (read_pointer > write_pointer)
write_pointer += GEN8_CSB_ENTRIES;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f4f8eaa90f2a..19715f5e698f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2443,7 +2443,11 @@ static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
/* Ensure irq handler finishes or is cancelled. */
tasklet_kill(&engine->irq_tasklet);
- intel_execlists_cancel_requests(engine);
+ INIT_LIST_HEAD(&engine->execlist_queue);
+ i915_gem_request_assign(&engine->execlist_port[0].request,
+ NULL);
+ i915_gem_request_assign(&engine->execlist_port[1].request,
+ NULL);
}
/*
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 6c2553715263..49cca4233a8e 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -736,16 +736,18 @@ complete:
return ret;
}
-static void engine_retire_requests(struct intel_engine_cs *engine)
+static bool engine_retire_requests(struct intel_engine_cs *engine)
{
struct drm_i915_gem_request *request, *next;
list_for_each_entry_safe(request, next, &engine->request_list, link) {
if (!i915_gem_request_completed(request))
- break;
+ return false;
i915_gem_request_retire(request);
}
+
+ return true;
}
void i915_gem_retire_requests(struct drm_i915_private *dev_priv)
@@ -759,9 +761,8 @@ void i915_gem_retire_requests(struct drm_i915_private *dev_priv)
GEM_BUG_ON(!dev_priv->gt.awake);
- for_each_engine(engine, dev_priv) {
- engine_retire_requests(engine);
- if (!intel_engine_is_active(engine))
+ for_each_engine_masked(engine, dev_priv, dev_priv->gt.active_engines) {
+ if (engine_retire_requests(engine))
dev_priv->gt.active_engines &= ~intel_engine_flag(engine);
}
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 3496e28785e7..32123e38ef4b 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -94,6 +94,9 @@ struct drm_i915_gem_request {
/** Position in the ringbuffer of the end of the whole request */
u32 tail;
+ /** Position in the ringbuffer after the end of the whole request */
+ u32 wa_tail;
+
/** Preallocate space in the ringbuffer for the emitting the request */
u32 reserved_space;
@@ -130,27 +133,8 @@ struct drm_i915_gem_request {
/** process identifier submitting this request */
struct pid *pid;
- /**
- * The ELSP only accepts two elements at a time, so we queue
- * context/tail pairs on a given queue (ring->execlist_queue) until the
- * hardware is available. The queue serves a double purpose: we also use
- * it to keep track of the up to 2 contexts currently in the hardware
- * (usually one in execution and the other queued up by the GPU): We
- * only remove elements from the head of the queue when the hardware
- * informs us that an element has been completed.
- *
- * All accesses to the queue are mediated by a spinlock
- * (ring->execlist_lock).
- */
-
- /** Execlist link in the submission queue.*/
+ /** Link in the execlist submission queue, guarded by execlist_lock. */
struct list_head execlist_link;
-
- /** Execlists no. of times this request has been sent to the ELSP */
- int elsp_submitted;
-
- /** Execlists context hardware id. */
- unsigned int ctx_hw_id;
};
extern const struct fence_ops i915_fence_ops;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 309c5d9b1c57..56b5aa8a35c4 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -156,6 +156,11 @@
#define GEN8_CTX_STATUS_COMPLETE (1 << 4)
#define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15)
+#define GEN8_CTX_STATUS_COMPLETED_MASK \
+ (GEN8_CTX_STATUS_ACTIVE_IDLE | \
+ GEN8_CTX_STATUS_PREEMPTED | \
+ GEN8_CTX_STATUS_ELEMENT_SWITCH)
+
#define CTX_LRI_HEADER_0 0x01
#define CTX_CONTEXT_CONTROL 0x02
#define CTX_RING_HEAD 0x04
@@ -263,12 +268,10 @@ logical_ring_init_platform_invariants(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
- if (IS_GEN8(dev_priv) || IS_GEN9(dev_priv))
- engine->idle_lite_restore_wa = ~0;
-
- engine->disable_lite_restore_wa = (IS_SKL_REVID(dev_priv, 0, SKL_REVID_B0) ||
- IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) &&
- (engine->id == VCS || engine->id == VCS2);
+ engine->disable_lite_restore_wa =
+ (IS_SKL_REVID(dev_priv, 0, SKL_REVID_B0) ||
+ IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) &&
+ (engine->id == VCS || engine->id == VCS2);
engine->ctx_desc_template = GEN8_CTX_VALID;
if (IS_GEN8(dev_priv))
@@ -328,36 +331,6 @@ uint64_t intel_lr_context_descriptor(struct i915_gem_context *ctx,
return ctx->engine[engine->id].lrc_desc;
}
-static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
- struct drm_i915_gem_request *rq1)
-{
-
- struct intel_engine_cs *engine = rq0->engine;
- struct drm_i915_private *dev_priv = rq0->i915;
- uint64_t desc[2];
-
- if (rq1) {
- desc[1] = intel_lr_context_descriptor(rq1->ctx, rq1->engine);
- rq1->elsp_submitted++;
- } else {
- desc[1] = 0;
- }
-
- desc[0] = intel_lr_context_descriptor(rq0->ctx, rq0->engine);
- rq0->elsp_submitted++;
-
- /* You must always write both descriptors in the order below. */
- I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[1]));
- I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[1]));
-
- I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[0]));
- /* The context is automatically loaded after the following */
- I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[0]));
-
- /* ELSP is a wo register, use another nearby reg for posting */
- POSTING_READ_FW(RING_EXECLIST_STATUS_LO(engine));
-}
-
static void
execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
{
@@ -367,13 +340,12 @@ execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
}
-static void execlists_update_context(struct drm_i915_gem_request *rq)
+static u64 execlists_update_context(struct drm_i915_gem_request *rq)
{
- struct intel_engine_cs *engine = rq->engine;
+ struct intel_context *ce = &rq->ctx->engine[rq->engine->id];
struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
- uint32_t *reg_state = rq->ctx->engine[engine->id].lrc_reg_state;
- reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail);
+ ce->lrc_reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail);
/* True 32b PPGTT with dynamic page allocation: update PDP
* registers and point the unallocated PDPs to scratch page.
@@ -381,32 +353,14 @@ static void execlists_update_context(struct drm_i915_gem_request *rq)
* in 48-bit mode.
*/
if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
- execlists_update_context_pdps(ppgtt, reg_state);
-}
-
-static void execlists_elsp_submit_contexts(struct drm_i915_gem_request *rq0,
- struct drm_i915_gem_request *rq1)
-{
- struct drm_i915_private *dev_priv = rq0->i915;
- unsigned int fw_domains = rq0->engine->fw_domains;
-
- execlists_update_context(rq0);
-
- if (rq1)
- execlists_update_context(rq1);
+ execlists_update_context_pdps(ppgtt, ce->lrc_reg_state);
- spin_lock_irq(&dev_priv->uncore.lock);
- intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
-
- execlists_elsp_write(rq0, rq1);
-
- intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
- spin_unlock_irq(&dev_priv->uncore.lock);
+ return ce->lrc_desc;
}
-static inline void execlists_context_status_change(
- struct drm_i915_gem_request *rq,
- unsigned long status)
+static inline void
+execlists_context_status_change(struct drm_i915_gem_request *rq,
+ unsigned long status)
{
/*
* Only used when GVT-g is enabled now. When GVT-g is disabled,
@@ -418,122 +372,104 @@ static inline void execlists_context_status_change(
atomic_notifier_call_chain(&rq->ctx->status_notifier, status, rq);
}
-static void execlists_unqueue(struct intel_engine_cs *engine)
+static void execlists_submit_ports(struct intel_engine_cs *engine)
{
- struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
- struct drm_i915_gem_request *cursor, *tmp;
-
- assert_spin_locked(&engine->execlist_lock);
-
- /*
- * If irqs are not active generate a warning as batches that finish
- * without the irqs may get lost and a GPU Hang may occur.
- */
- WARN_ON(!intel_irqs_enabled(engine->i915));
-
- /* Try to read in pairs */
- list_for_each_entry_safe(cursor, tmp, &engine->execlist_queue,
- execlist_link) {
- if (!req0) {
- req0 = cursor;
- } else if (req0->ctx == cursor->ctx) {
- /* Same ctx: ignore first request, as second request
- * will update tail past first request's workload */
- cursor->elsp_submitted = req0->elsp_submitted;
- list_del(&req0->execlist_link);
- i915_gem_request_put(req0);
- req0 = cursor;
- } else {
- if (IS_ENABLED(CONFIG_DRM_I915_GVT)) {
- /*
- * req0 (after merged) ctx requires single
- * submission, stop picking
- */
- if (req0->ctx->execlists_force_single_submission)
- break;
- /*
- * req0 ctx doesn't require single submission,
- * but next req ctx requires, stop picking
- */
- if (cursor->ctx->execlists_force_single_submission)
- break;
- }
- req1 = cursor;
- WARN_ON(req1->elsp_submitted);
- break;
- }
- }
-
- if (unlikely(!req0))
- return;
-
- execlists_context_status_change(req0, INTEL_CONTEXT_SCHEDULE_IN);
+ struct drm_i915_private *dev_priv = engine->i915;
+ u32 *elsp = dev_priv->regs + i915_mmio_reg_offset(RING_ELSP(engine));
+ u64 desc[2];
- if (req1)
- execlists_context_status_change(req1,
+ if (!engine->execlist_port[0].count)
+ execlists_context_status_change(engine->execlist_port[0].request,
INTEL_CONTEXT_SCHEDULE_IN);
+ desc[0] = execlists_update_context(engine->execlist_port[0].request);
+ engine->preempt_wa = engine->execlist_port[0].count++;
- if (req0->elsp_submitted & engine->idle_lite_restore_wa) {
- /*
- * WaIdleLiteRestore: make sure we never cause a lite restore
- * with HEAD==TAIL.
- *
- * Apply the wa NOOPS to prevent ring:HEAD == req:TAIL as we
- * resubmit the request. See gen8_emit_request() for where we
- * prepare the padding after the end of the request.
- */
- req0->tail += 8;
- req0->tail &= req0->ring->size - 1;
+ if (engine->execlist_port[1].request) {
+ GEM_BUG_ON(engine->execlist_port[1].count);
+ execlists_context_status_change(engine->execlist_port[1].request,
+ INTEL_CONTEXT_SCHEDULE_IN);
+ desc[1] = execlists_update_context(engine->execlist_port[1].request);
+ engine->execlist_port[1].count = 1;
+ } else {
+ desc[1] = 0;
}
- execlists_elsp_submit_contexts(req0, req1);
+ /* You must always write both descriptors in the order below. */
+ writel(upper_32_bits(desc[1]), elsp);
+ writel(lower_32_bits(desc[1]), elsp);
+
+ writel(upper_32_bits(desc[0]), elsp);
+ /* The context is automatically loaded after the following */
+ writel(lower_32_bits(desc[0]), elsp);
}
-static unsigned int
-execlists_check_remove_request(struct intel_engine_cs *engine, u32 ctx_id)
+static bool merge_ctx(struct i915_gem_context *prev,
+ struct i915_gem_context *next)
{
- struct drm_i915_gem_request *head_req;
+ if (prev != next)
+ return false;
- assert_spin_locked(&engine->execlist_lock);
+ if (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
+ prev->execlists_force_single_submission)
+ return false;
- head_req = list_first_entry_or_null(&engine->execlist_queue,
- struct drm_i915_gem_request,
- execlist_link);
-
- if (WARN_ON(!head_req || (head_req->ctx_hw_id != ctx_id)))
- return 0;
+ return true;
+}
- WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
+static void execlists_context_unqueue(struct intel_engine_cs *engine)
+{
+ struct drm_i915_gem_request *cursor, *last;
+ struct execlist_port *port = engine->execlist_port;
+ bool submit = false;
+
+ last = port->request;
+ if (last)
+ /* WaIdleLiteRestore:bdw,skl
+ * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
+ * as we resubmit the request. See gen8_emit_request()
+ * for where we prepare the padding after the end of the
+ * request.
+ */
+ last->tail = last->wa_tail;
- if (--head_req->elsp_submitted > 0)
- return 0;
+ /* Try to read in pairs and fill both submission ports */
+ spin_lock(&engine->execlist_lock);
+ list_for_each_entry(cursor, &engine->execlist_queue, execlist_link) {
+ if (last && !merge_ctx(cursor->ctx, last->ctx)) {
+ if (port != engine->execlist_port)
+ break;
- execlists_context_status_change(head_req, INTEL_CONTEXT_SCHEDULE_OUT);
+ if (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
+ cursor->ctx->execlists_force_single_submission)
+ break;
- list_del(&head_req->execlist_link);
- i915_gem_request_put(head_req);
+ i915_gem_request_assign(&port->request, last);
+ port++;
+ }
+ last = cursor;
+ submit = true;
+ }
+ if (submit) {
+ i915_gem_request_assign(&port->request, last);
+ engine->execlist_queue.next = &cursor->execlist_link;
+ cursor->execlist_link.prev = &engine->execlist_queue;
+ }
+ spin_unlock(&engine->execlist_lock);
- return 1;
+ if (submit)
+ execlists_submit_ports(engine);
}
-static u32
-get_context_status(struct intel_engine_cs *engine, unsigned int read_pointer,
- u32 *context_id)
+static bool execlists_elsp_idle(struct intel_engine_cs *engine)
{
- struct drm_i915_private *dev_priv = engine->i915;
- u32 status;
-
- read_pointer %= GEN8_CSB_ENTRIES;
-
- status = I915_READ_FW(RING_CONTEXT_STATUS_BUF_LO(engine, read_pointer));
-
- if (status & GEN8_CTX_STATUS_IDLE_ACTIVE)
- return 0;
+ return !engine->execlist_port[0].request;
+}
- *context_id = I915_READ_FW(RING_CONTEXT_STATUS_BUF_HI(engine,
- read_pointer));
+static bool execlists_elsp_ready(struct intel_engine_cs *engine)
+{
+ int idx = !engine->disable_lite_restore_wa && !engine->preempt_wa;
- return status;
+ return !engine->execlist_port[idx].request;
}
/*
@@ -544,100 +480,63 @@ static void intel_lrc_irq_handler(unsigned long data)
{
struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
struct drm_i915_private *dev_priv = engine->i915;
- u32 status_pointer;
- unsigned int read_pointer, write_pointer;
- u32 csb[GEN8_CSB_ENTRIES][2];
- unsigned int csb_read = 0, i;
- unsigned int submit_contexts = 0;
intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
- status_pointer = I915_READ_FW(RING_CONTEXT_STATUS_PTR(engine));
-
- read_pointer = engine->next_context_status_buffer;
- write_pointer = GEN8_CSB_WRITE_PTR(status_pointer);
- if (read_pointer > write_pointer)
- write_pointer += GEN8_CSB_ENTRIES;
-
- while (read_pointer < write_pointer) {
- if (WARN_ON_ONCE(csb_read == GEN8_CSB_ENTRIES))
- break;
- csb[csb_read][0] = get_context_status(engine, ++read_pointer,
- &csb[csb_read][1]);
- csb_read++;
- }
-
- engine->next_context_status_buffer = write_pointer % GEN8_CSB_ENTRIES;
-
- /* Update the read pointer to the old write pointer. Manual ringbuffer
- * management ftw </sarcasm> */
- I915_WRITE_FW(RING_CONTEXT_STATUS_PTR(engine),
- _MASKED_FIELD(GEN8_CSB_READ_PTR_MASK,
- engine->next_context_status_buffer << 8));
-
- intel_uncore_forcewake_put(dev_priv, engine->fw_domains);
-
- spin_lock(&engine->execlist_lock);
-
- for (i = 0; i < csb_read; i++) {
- if (unlikely(csb[i][0] & GEN8_CTX_STATUS_PREEMPTED)) {
- if (csb[i][0] & GEN8_CTX_STATUS_LITE_RESTORE) {
- if (execlists_check_remove_request(engine, csb[i][1]))
- WARN(1, "Lite Restored request removed from queue\n");
- } else
- WARN(1, "Preemption without Lite Restore\n");
+ if (!execlists_elsp_idle(engine)) {
+ u32 *ring_mmio =
+ dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
+ u32 *csb_mmio =
+ dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
+ unsigned ring, head, tail;
+
+ ring = readl(ring_mmio);
+ head = GEN8_CSB_READ_PTR(ring);
+ tail = GEN8_CSB_WRITE_PTR(ring);
+ if (tail < head)
+ tail += GEN8_CSB_ENTRIES;
+
+ while (head < tail) {
+ unsigned idx = ++head % GEN8_CSB_ENTRIES;
+ unsigned status = readl(&csb_mmio[2*idx]);
+
+ if (status & GEN8_CTX_STATUS_COMPLETED_MASK) {
+ GEM_BUG_ON(engine->execlist_port[0].count == 0);
+ if (--engine->execlist_port[0].count == 0) {
+ GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);
+ execlists_context_status_change(engine->execlist_port[0].request,
+ INTEL_CONTEXT_SCHEDULE_OUT);
+ i915_gem_request_put(engine->execlist_port[0].request);
+ engine->execlist_port[0] = engine->execlist_port[1];
+ memset(&engine->execlist_port[1], 0,
+ sizeof(engine->execlist_port[1]));
+ engine->preempt_wa = false;
+ }
+ }
+ GEM_BUG_ON(engine->execlist_port[0].count == 0 &&
+ !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
}
- if (csb[i][0] & (GEN8_CTX_STATUS_ACTIVE_IDLE |
- GEN8_CTX_STATUS_ELEMENT_SWITCH))
- submit_contexts +=
- execlists_check_remove_request(engine, csb[i][1]);
+ writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, GEN8_CSB_WRITE_PTR(ring) << 8),
+ ring_mmio);
}
- if (submit_contexts) {
- if (!engine->disable_lite_restore_wa ||
- (csb[i][0] & GEN8_CTX_STATUS_ACTIVE_IDLE))
- execlists_unqueue(engine);
- }
+ if (execlists_elsp_ready(engine))
+ execlists_context_unqueue(engine);
- spin_unlock(&engine->execlist_lock);
-
- if (unlikely(submit_contexts > 2))
- DRM_ERROR("More than two context complete events?\n");
+ intel_uncore_forcewake_put(dev_priv, engine->fw_domains);
}
static void execlists_submit_request(struct drm_i915_gem_request *request)
{
struct intel_engine_cs *engine = request->engine;
- struct drm_i915_gem_request *cursor;
- int num_elements = 0;
spin_lock_bh(&engine->execlist_lock);
- list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
- if (++num_elements > 2)
- break;
-
- if (num_elements > 2) {
- struct drm_i915_gem_request *tail_req;
-
- tail_req = list_last_entry(&engine->execlist_queue,
- struct drm_i915_gem_request,
- execlist_link);
-
- if (request->ctx == tail_req->ctx) {
- WARN(tail_req->elsp_submitted != 0,
- "More than 2 already-submitted reqs queued\n");
- list_del(&tail_req->execlist_link);
- i915_gem_request_put(tail_req);
- }
- }
-
- i915_gem_request_get(request);
list_add_tail(&request->execlist_link, &engine->execlist_queue);
- request->ctx_hw_id = request->ctx->hw_id;
- if (num_elements == 0)
- execlists_unqueue(engine);
+
+ if (execlists_elsp_idle(engine))
+ tasklet_hi_schedule(&engine->irq_tasklet);
spin_unlock_bh(&engine->execlist_lock);
}
@@ -731,6 +630,7 @@ intel_logical_ring_advance(struct drm_i915_gem_request *request)
intel_ring_emit(ring, MI_NOOP);
intel_ring_emit(ring, MI_NOOP);
intel_ring_advance(ring);
+ request->wa_tail = request->ring->tail;
/* We keep the previous context alive until we retire the following
* request. This ensures that any the context object is still pinned
@@ -743,23 +643,6 @@ intel_logical_ring_advance(struct drm_i915_gem_request *request)
return 0;
}
-void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
-{
- struct drm_i915_gem_request *req, *tmp;
- LIST_HEAD(cancel_list);
-
- WARN_ON(!mutex_is_locked(&engine->i915->drm.struct_mutex));
-
- spin_lock_bh(&engine->execlist_lock);
- list_replace_init(&engine->execlist_queue, &cancel_list);
- spin_unlock_bh(&engine->execlist_lock);
-
- list_for_each_entry_safe(req, tmp, &cancel_list, execlist_link) {
- list_del(&req->execlist_link);
- i915_gem_request_put(req);
- }
-}
-
static int intel_lr_context_pin(struct i915_gem_context *ctx,
struct intel_engine_cs *engine)
{
@@ -1285,7 +1168,6 @@ static void lrc_init_hws(struct intel_engine_cs *engine)
static int gen8_init_common_ring(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
- unsigned int next_context_status_buffer_hw;
lrc_init_hws(engine);
@@ -1296,32 +1178,12 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
I915_WRITE(RING_MODE_GEN7(engine),
_MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
_MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
- POSTING_READ(RING_MODE_GEN7(engine));
- /*
- * Instead of resetting the Context Status Buffer (CSB) read pointer to
- * zero, we need to read the write pointer from hardware and use its
- * value because "this register is power context save restored".
- * Effectively, these states have been observed:
- *
- * | Suspend-to-idle (freeze) | Suspend-to-RAM (mem) |
- * BDW | CSB regs not reset | CSB regs reset |
- * CHT | CSB regs not reset | CSB regs not reset |
- * SKL | ? | ? |
- * BXT | ? | ? |
- */
- next_context_status_buffer_hw =
- GEN8_CSB_WRITE_PTR(I915_READ(RING_CONTEXT_STATUS_PTR(engine)));
-
- /*
- * When the CSB registers are reset (also after power-up / gpu reset),
- * CSB write pointer is set to all 1's, which is not valid, use '5' in
- * this special case, so the first element read is CSB[0].
- */
- if (next_context_status_buffer_hw == GEN8_CSB_PTR_MASK)
- next_context_status_buffer_hw = (GEN8_CSB_ENTRIES - 1);
+ I915_WRITE(RING_CONTEXT_STATUS_PTR(engine),
+ _MASKED_FIELD(GEN8_CSB_READ_PTR_MASK |
+ GEN8_CSB_WRITE_PTR_MASK,
+ 0));
- engine->next_context_status_buffer = next_context_status_buffer_hw;
DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
intel_engine_init_hangcheck(engine);
@@ -1706,7 +1568,6 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
}
intel_lr_context_unpin(dev_priv->kernel_context, engine);
- engine->idle_lite_restore_wa = 0;
engine->disable_lite_restore_wa = false;
engine->ctx_desc_template = 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index a52cf57dbd40..4d70346500c2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -97,6 +97,4 @@ int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
int enable_execlists);
void intel_execlists_enable_submission(struct drm_i915_private *dev_priv);
-void intel_execlists_cancel_requests(struct intel_engine_cs *engine);
-
#endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 43e545e44352..ed8a8482a7fb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -288,11 +288,14 @@ struct intel_engine_cs {
/* Execlists */
struct tasklet_struct irq_tasklet;
spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */
+ struct execlist_port {
+ struct drm_i915_gem_request *request;
+ unsigned int count;
+ } execlist_port[2];
struct list_head execlist_queue;
unsigned int fw_domains;
- unsigned int next_context_status_buffer;
- unsigned int idle_lite_restore_wa;
bool disable_lite_restore_wa;
+ bool preempt_wa;
u32 ctx_desc_template;
/**
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/4] drm/i915: Update reset path to fix incomplete requests
2016-08-06 16:49 Per-engine reset, preliminary recovery patches Chris Wilson
2016-08-06 16:49 ` [PATCH 1/4] drm/i915: Record the position of the start of the request Chris Wilson
2016-08-06 16:49 ` [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking Chris Wilson
@ 2016-08-06 16:49 ` Chris Wilson
2016-08-06 16:49 ` [PATCH 4/4] drm/i915: Separate out reset flags from the reset counter Chris Wilson
2016-08-06 17:16 ` ✗ Ro.CI.BAT: failure for series starting with [1/4] drm/i915: Record the position of the start of the request Patchwork
4 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2016-08-06 16:49 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala
Update reset path in preparation for engine reset which requires
identification of incomplete requests and associated context and fixing
their state so that engine can resume correctly after reset.
The request that caused the hang will be skipped and head is reset to the
start of breadcrumb. This allows us to resume from where we left-off.
Since this request didn't complete normally we also need to cleanup elsp
queue manually.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.c | 4 +-
drivers/gpu/drm/i915/i915_drv.h | 2 +-
drivers/gpu/drm/i915/i915_gem.c | 68 ++++++++++++++++++++++-----------
drivers/gpu/drm/i915/i915_gem_context.c | 16 --------
drivers/gpu/drm/i915/intel_lrc.c | 34 ++++++++++++++++-
drivers/gpu/drm/i915/intel_ringbuffer.c | 33 ++++++++++------
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +
7 files changed, 105 insertions(+), 54 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8cfc264ec9f6..09f7372a2e3b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1578,7 +1578,7 @@ static int i915_drm_resume(struct drm_device *dev)
mutex_lock(&dev->struct_mutex);
if (i915_gem_init_hw(dev)) {
DRM_ERROR("failed to re-initialize GPU, declaring wedged!\n");
- atomic_or(I915_WEDGED, &dev_priv->gpu_error.reset_counter);
+ i915_gem_set_wedged(dev_priv);
}
mutex_unlock(&dev->struct_mutex);
@@ -1802,7 +1802,7 @@ int i915_reset(struct drm_i915_private *dev_priv)
return 0;
error:
- atomic_or(I915_WEDGED, &error->reset_counter);
+ i915_gem_set_wedged(dev_priv);
mutex_unlock(&dev->struct_mutex);
return ret;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 040c3b6300ac..fa1b6e64b55a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3245,6 +3245,7 @@ static inline u32 i915_reset_count(struct i915_gpu_error *error)
}
void i915_gem_reset(struct drm_device *dev);
+void i915_gem_set_wedged(struct drm_i915_private *dev_priv);
bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
int __must_check i915_gem_init(struct drm_device *dev);
int __must_check i915_gem_init_hw(struct drm_device *dev);
@@ -3364,7 +3365,6 @@ void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj);
int __must_check i915_gem_context_init(struct drm_device *dev);
void i915_gem_context_lost(struct drm_i915_private *dev_priv);
void i915_gem_context_fini(struct drm_device *dev);
-void i915_gem_context_reset(struct drm_device *dev);
int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
int i915_switch_context(struct drm_i915_gem_request *req);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 19715f5e698f..d4bcf88636ec 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2402,23 +2402,61 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
return NULL;
}
-static void i915_gem_reset_engine_status(struct intel_engine_cs *engine)
+static void i915_gem_reset_engine(struct intel_engine_cs *engine)
{
struct drm_i915_gem_request *request;
+ struct i915_gem_context *incomplete_ctx;
bool ring_hung;
+ /* Ensure irq handler finishes or is cancelled. */
+ if (i915.enable_execlists)
+ tasklet_kill(&engine->irq_tasklet);
+
request = i915_gem_find_active_request(engine);
- if (request == NULL)
+ if (!request)
return;
ring_hung = engine->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG;
i915_set_reset_status(request->ctx, ring_hung);
- list_for_each_entry_continue(request, &engine->request_list, link)
+
+ engine->reset_hw(engine, request);
+
+ incomplete_ctx = request->ctx;
+ if (i915_gem_context_is_default(incomplete_ctx))
+ incomplete_ctx = NULL;
+
+ list_for_each_entry_continue(request, &engine->request_list, link) {
+ void *vaddr = request->ring->vaddr;
+ u32 head;
+
+ if (request->ctx != incomplete_ctx)
+ continue;
+
+ head = request->head;
+ if (request->postfix < head) {
+ memset(vaddr + head, 0, request->ring->size - head);
+ head = 0;
+ }
+ memset(vaddr + head, 0, request->postfix - head);
+
i915_set_reset_status(request->ctx, false);
+ }
}
-static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
+void i915_gem_reset(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = to_i915(dev);
+ struct intel_engine_cs *engine;
+
+ for_each_engine(engine, dev_priv)
+ i915_gem_reset_engine(engine);
+
+ i915_gem_context_lost(dev_priv);
+ i915_gem_restore_fences(dev);
+}
+
+static void i915_gem_cleanup_engine(struct intel_engine_cs *engine)
{
struct drm_i915_gem_request *request;
struct intel_ring *ring;
@@ -2440,9 +2478,6 @@ static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
*/
if (i915.enable_execlists) {
- /* Ensure irq handler finishes or is cancelled. */
- tasklet_kill(&engine->irq_tasklet);
-
INIT_LIST_HEAD(&engine->execlist_queue);
i915_gem_request_assign(&engine->execlist_port[0].request,
NULL);
@@ -2476,26 +2511,15 @@ static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
engine->i915->gt.active_engines &= ~intel_engine_flag(engine);
}
-void i915_gem_reset(struct drm_device *dev)
+void i915_gem_set_wedged(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = to_i915(dev);
struct intel_engine_cs *engine;
- /*
- * Before we free the objects from the requests, we need to inspect
- * them for finding the guilty party. As the requests only borrow
- * their reference to the objects, the inspection must be done first.
- */
- for_each_engine(engine, dev_priv)
- i915_gem_reset_engine_status(engine);
+ atomic_or(I915_WEDGED, &dev_priv->gpu_error.reset_counter);
for_each_engine(engine, dev_priv)
- i915_gem_reset_engine_cleanup(engine);
+ i915_gem_cleanup_engine(engine);
mod_delayed_work(dev_priv->wq, &dev_priv->gt.idle_work, 0);
-
- i915_gem_context_reset(dev);
-
- i915_gem_restore_fences(dev);
}
static void
@@ -4356,7 +4380,7 @@ int i915_gem_init(struct drm_device *dev)
* for all other failure, such as an allocation failure, bail.
*/
DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
- atomic_or(I915_WEDGED, &dev_priv->gpu_error.reset_counter);
+ i915_gem_set_wedged(dev_priv);
ret = 0;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index bb72af5320b0..a5b3ba4cca4a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -405,22 +405,6 @@ static void i915_gem_context_unpin(struct i915_gem_context *ctx,
}
}
-void i915_gem_context_reset(struct drm_device *dev)
-{
- struct drm_i915_private *dev_priv = to_i915(dev);
-
- lockdep_assert_held(&dev->struct_mutex);
-
- if (i915.enable_execlists) {
- struct i915_gem_context *ctx;
-
- list_for_each_entry(ctx, &dev_priv->context_list, link)
- intel_lr_context_reset(dev_priv, ctx);
- }
-
- i915_gem_context_lost(dev_priv);
-}
-
int i915_gem_context_init(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = to_i915(dev);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 56b5aa8a35c4..3bc23c2f0b89 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1169,6 +1169,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
+ intel_mocs_init_engine(engine);
lrc_init_hws(engine);
I915_WRITE_IMR(engine,
@@ -1188,7 +1189,10 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
intel_engine_init_hangcheck(engine);
- return intel_mocs_init_engine(engine);
+ if (engine->execlist_port[0].request)
+ execlists_submit_ports(engine);
+
+ return 0;
}
static int gen8_init_render_ring(struct intel_engine_cs *engine)
@@ -1224,6 +1228,33 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
return init_workarounds_ring(engine);
}
+static void reset_common_ring(struct intel_engine_cs *engine,
+ struct drm_i915_gem_request *request)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ struct i915_gem_context *ctx = request->ctx;
+ struct intel_context *ce = &ctx->engine[engine->id];
+ u32 *reg_state;
+
+ reg_state = ce->lrc_reg_state;
+ reg_state[CTX_RING_HEAD+1] = request->postfix;
+
+ request->ring->head = request->postfix;
+ request->ring->last_retired_head = -1;
+ intel_ring_update_space(request->ring);
+
+ if (request == engine->execlist_port[1].request) {
+ i915_gem_request_put(engine->execlist_port[0].request);
+ engine->execlist_port[0] = engine->execlist_port[1];
+ memset(&engine->execlist_port[1], 0,
+ sizeof(engine->execlist_port[1]));
+ }
+
+ engine->execlist_port[0].count = 0;
+
+ I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0xffff, 0));
+}
+
static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
{
struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
@@ -1588,6 +1619,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
{
/* Default vfuncs which can be overriden by each engine. */
engine->init_hw = gen8_init_common_ring;
+ engine->reset_hw = reset_common_ring;
engine->emit_flush = gen8_emit_flush;
engine->emit_request = gen8_emit_request;
engine->submit_request = execlists_submit_request;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e08a1e1b04e4..3c070e9306b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -577,34 +577,33 @@ static int init_ring_common(struct intel_engine_cs *engine)
if (I915_READ_HEAD(engine))
DRM_DEBUG("%s initialization failed [head=%08x], fudging\n",
engine->name, I915_READ_HEAD(engine));
- I915_WRITE_HEAD(engine, 0);
- (void)I915_READ_HEAD(engine);
+
+ intel_ring_update_space(ring);
+ I915_WRITE_HEAD(engine, ring->head);
+ I915_WRITE_TAIL(engine, ring->tail);
+ (void)I915_READ_TAIL(engine);
I915_WRITE_CTL(engine,
((ring->size - PAGE_SIZE) & RING_NR_PAGES)
| RING_VALID);
/* If the head is still not zero, the ring is dead */
- if (wait_for((I915_READ_CTL(engine) & RING_VALID) != 0 &&
- I915_READ_START(engine) == i915_gem_obj_ggtt_offset(obj) &&
- (I915_READ_HEAD(engine) & HEAD_ADDR) == 0, 50)) {
+ if (intel_wait_for_register_fw(dev_priv, RING_CTL(engine->mmio_base),
+ RING_VALID, RING_VALID,
+ 50)) {
DRM_ERROR("%s initialization failed "
- "ctl %08x (valid? %d) head %08x tail %08x start %08x [expected %08lx]\n",
+ "ctl %08x (valid? %d) head %08x [%08x] tail %08x [%08x] start %08x [expected %08lx]\n",
engine->name,
I915_READ_CTL(engine),
I915_READ_CTL(engine) & RING_VALID,
- I915_READ_HEAD(engine), I915_READ_TAIL(engine),
+ I915_READ_HEAD(engine), ring->head,
+ I915_READ_TAIL(engine), ring->tail,
I915_READ_START(engine),
(unsigned long)i915_gem_obj_ggtt_offset(obj));
ret = -EIO;
goto out;
}
- ring->last_retired_head = -1;
- ring->head = I915_READ_HEAD(engine);
- ring->tail = I915_READ_TAIL(engine) & TAIL_ADDR;
- intel_ring_update_space(ring);
-
intel_engine_init_hangcheck(engine);
out:
@@ -613,6 +612,15 @@ out:
return ret;
}
+static void reset_ring_common(struct intel_engine_cs *engine,
+ struct drm_i915_gem_request *request)
+{
+ struct intel_ring *ring = request->ring;
+
+ ring->head = request->postfix;
+ ring->last_retired_head = -1;
+}
+
void intel_fini_pipe_control(struct intel_engine_cs *engine)
{
if (engine->scratch.obj == NULL)
@@ -2748,6 +2756,7 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
intel_ring_init_semaphores(dev_priv, engine);
engine->init_hw = init_ring_common;
+ engine->reset_hw = reset_ring_common;
engine->emit_request = i9xx_emit_request;
if (i915.semaphores)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index ed8a8482a7fb..854523bdf0b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -204,6 +204,8 @@ struct intel_engine_cs {
void (*irq_disable)(struct intel_engine_cs *engine);
int (*init_hw)(struct intel_engine_cs *engine);
+ void (*reset_hw)(struct intel_engine_cs *engine,
+ struct drm_i915_gem_request *req);
int (*init_context)(struct drm_i915_gem_request *req);
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/4] drm/i915: Separate out reset flags from the reset counter
2016-08-06 16:49 Per-engine reset, preliminary recovery patches Chris Wilson
` (2 preceding siblings ...)
2016-08-06 16:49 ` [PATCH 3/4] drm/i915: Update reset path to fix incomplete requests Chris Wilson
@ 2016-08-06 16:49 ` Chris Wilson
2016-08-06 17:16 ` ✗ Ro.CI.BAT: failure for series starting with [1/4] drm/i915: Record the position of the start of the request Patchwork
4 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2016-08-06 16:49 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala
In preparation for introducing a per-engine reset, we can first separate
the mixing of the reset state from the global reset counter.
The loss of atomicity in updating the reset state poses a small problem
for handling the waiters. For requests, this is solved by advancing the
seqno so that a waiter waking up after the reset knows the request is
complete. For pending flips, we still rely on the increment of the
global reset epoch (as well as the reset-in-progress flag) to signify
when the hardware was reset.
The advantage, now that we do not inspect the reset state during reset
itself i.e. we no longer emit requests during reset, is that we can use
the atomic updates of the state flags to ensure that only one reset
worker is active.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-6-git-send-email-arun.siluvery@linux.intel.com
---
drivers/gpu/drm/i915/i915_drv.c | 12 +---
drivers/gpu/drm/i915/i915_drv.h | 46 +++++---------
drivers/gpu/drm/i915/i915_gem.c | 2 +-
drivers/gpu/drm/i915/i915_gem_request.c | 13 ++--
drivers/gpu/drm/i915/i915_irq.c | 104 +++++++++++++++-----------------
drivers/gpu/drm/i915/intel_display.c | 25 +++++---
drivers/gpu/drm/i915/intel_drv.h | 4 +-
7 files changed, 91 insertions(+), 115 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 09f7372a2e3b..cacb6a201d72 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1739,20 +1739,13 @@ int i915_reset(struct drm_i915_private *dev_priv)
{
struct drm_device *dev = &dev_priv->drm;
struct i915_gpu_error *error = &dev_priv->gpu_error;
- unsigned reset_counter;
int ret;
mutex_lock(&dev->struct_mutex);
/* Clear any previous failed attempts at recovery. Time to try again. */
- atomic_andnot(I915_WEDGED, &error->reset_counter);
-
- /* Clear the reset-in-progress flag and increment the reset epoch. */
- reset_counter = atomic_inc_return(&error->reset_counter);
- if (WARN_ON(__i915_reset_in_progress(reset_counter))) {
- ret = -EIO;
- goto error;
- }
+ __clear_bit(I915_WEDGED, &error->flags);
+ error->reset_count++;
pr_notice("drm/i915: Resetting chip after gpu hang\n");
@@ -1789,6 +1782,7 @@ int i915_reset(struct drm_i915_private *dev_priv)
goto error;
}
+ clear_bit(I915_RESET_IN_PROGRESS, &error->flags);
mutex_unlock(&dev->struct_mutex);
/*
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fa1b6e64b55a..29eaada40793 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1378,9 +1378,10 @@ struct i915_gpu_error {
* State variable controlling the reset flow and count
*
* This is a counter which gets incremented when reset is triggered,
- * and again when reset has been handled. So odd values (lowest bit set)
- * means that reset is in progress and even values that
- * (reset_counter >> 1):th reset was successfully completed.
+ *
+ * Before the reset commences, the I915_RESET_IN_PROGRESS bit is set
+ * meaning that any waiters holding onto the struct_mutex should
+ * relinquish the lock immediately in order for the reset to start.
*
* If reset is not completed succesfully, the I915_WEDGE bit is
* set meaning that hardware is terminally sour and there is no
@@ -1395,10 +1396,11 @@ struct i915_gpu_error {
* naturally enforces the correct ordering between the bail-out of the
* waiter and the gpu reset work code.
*/
- atomic_t reset_counter;
+ unsigned long reset_count;
-#define I915_RESET_IN_PROGRESS_FLAG 1
-#define I915_WEDGED (1 << 31)
+ unsigned long flags;
+#define I915_RESET_IN_PROGRESS 0
+#define I915_WEDGED (BITS_PER_LONG - 1)
/**
* Waitqueue to signal when a hang is detected. Used to for waiters
@@ -3204,44 +3206,24 @@ i915_gem_find_active_request(struct intel_engine_cs *engine);
void i915_gem_retire_requests(struct drm_i915_private *dev_priv);
-static inline u32 i915_reset_counter(struct i915_gpu_error *error)
-{
- return atomic_read(&error->reset_counter);
-}
-
-static inline bool __i915_reset_in_progress(u32 reset)
-{
- return unlikely(reset & I915_RESET_IN_PROGRESS_FLAG);
-}
-
-static inline bool __i915_reset_in_progress_or_wedged(u32 reset)
-{
- return unlikely(reset & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
-}
-
-static inline bool __i915_terminally_wedged(u32 reset)
-{
- return unlikely(reset & I915_WEDGED);
-}
-
static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
{
- return __i915_reset_in_progress(i915_reset_counter(error));
+ return test_bit(I915_RESET_IN_PROGRESS, &error->flags);
}
-static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
+static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
{
- return __i915_reset_in_progress_or_wedged(i915_reset_counter(error));
+ return test_bit(I915_WEDGED, &error->flags);
}
-static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
+static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
{
- return __i915_terminally_wedged(i915_reset_counter(error));
+ return i915_reset_in_progress(error) | i915_terminally_wedged(error);
}
static inline u32 i915_reset_count(struct i915_gpu_error *error)
{
- return ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2;
+ return READ_ONCE(error->reset_count);
}
void i915_gem_reset(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d4bcf88636ec..669fb358b448 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2515,7 +2515,7 @@ void i915_gem_set_wedged(struct drm_i915_private *dev_priv)
{
struct intel_engine_cs *engine;
- atomic_or(I915_WEDGED, &dev_priv->gpu_error.reset_counter);
+ set_bit(I915_WEDGED, &dev_priv->gpu_error.flags);
for_each_engine(engine, dev_priv)
i915_gem_cleanup_engine(engine);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 49cca4233a8e..391c310f1690 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -238,16 +238,18 @@ void i915_gem_request_retire_upto(struct drm_i915_gem_request *req)
} while (tmp != req);
}
-static int i915_gem_check_wedge(unsigned int reset_counter, bool interruptible)
+static int i915_gem_check_wedge(struct drm_i915_private *dev_priv)
{
- if (__i915_terminally_wedged(reset_counter))
+ struct i915_gpu_error *error = &dev_priv->gpu_error;
+
+ if (i915_terminally_wedged(error))
return -EIO;
- if (__i915_reset_in_progress(reset_counter)) {
+ if (i915_reset_in_progress(error)) {
/* Non-interruptible callers can't handle -EAGAIN, hence return
* -EIO unconditionally for these.
*/
- if (!interruptible)
+ if (!dev_priv->mm.interruptible)
return -EIO;
return -EAGAIN;
@@ -336,7 +338,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
struct i915_gem_context *ctx)
{
struct drm_i915_private *dev_priv = engine->i915;
- unsigned int reset_counter = i915_reset_counter(&dev_priv->gpu_error);
struct drm_i915_gem_request *req;
u32 seqno;
int ret;
@@ -345,7 +346,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
* EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
* and restart.
*/
- ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
+ ret = i915_gem_check_wedge(dev_priv);
if (ret)
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 591f452ece68..f4fe931584a1 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2503,53 +2503,41 @@ static void i915_reset_and_wakeup(struct drm_i915_private *dev_priv)
kobject_uevent_env(kobj, KOBJ_CHANGE, error_event);
+ DRM_DEBUG_DRIVER("resetting chip\n");
+ kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
+
/*
- * Note that there's only one work item which does gpu resets, so we
- * need not worry about concurrent gpu resets potentially incrementing
- * error->reset_counter twice. We only need to take care of another
- * racing irq/hangcheck declaring the gpu dead for a second time. A
- * quick check for that is good enough: schedule_work ensures the
- * correct ordering between hang detection and this work item, and since
- * the reset in-progress bit is only ever set by code outside of this
- * work we don't need to worry about any other races.
+ * In most cases it's guaranteed that we get here with an RPM
+ * reference held, for example because there is a pending GPU
+ * request that won't finish until the reset is done. This
+ * isn't the case at least when we get here by doing a
+ * simulated reset via debugs, so get an RPM reference.
*/
- if (i915_reset_in_progress(&dev_priv->gpu_error)) {
- DRM_DEBUG_DRIVER("resetting chip\n");
- kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
-
- /*
- * In most cases it's guaranteed that we get here with an RPM
- * reference held, for example because there is a pending GPU
- * request that won't finish until the reset is done. This
- * isn't the case at least when we get here by doing a
- * simulated reset via debugs, so get an RPM reference.
- */
- intel_runtime_pm_get(dev_priv);
+ intel_runtime_pm_get(dev_priv);
- intel_prepare_reset(dev_priv);
+ intel_prepare_reset(dev_priv);
- /*
- * All state reset _must_ be completed before we update the
- * reset counter, for otherwise waiters might miss the reset
- * pending state and not properly drop locks, resulting in
- * deadlocks with the reset work.
- */
- ret = i915_reset(dev_priv);
+ /*
+ * All state reset _must_ be completed before we update the
+ * reset counter, for otherwise waiters might miss the reset
+ * pending state and not properly drop locks, resulting in
+ * deadlocks with the reset work.
+ */
+ ret = i915_reset(dev_priv);
- intel_finish_reset(dev_priv);
+ intel_finish_reset(dev_priv);
- intel_runtime_pm_put(dev_priv);
+ intel_runtime_pm_put(dev_priv);
- if (ret == 0)
- kobject_uevent_env(kobj,
- KOBJ_CHANGE, reset_done_event);
+ if (ret == 0)
+ kobject_uevent_env(kobj,
+ KOBJ_CHANGE, reset_done_event);
- /*
- * Note: The wake_up also serves as a memory barrier so that
- * waiters see the update value of the reset counter atomic_t.
- */
- wake_up_all(&dev_priv->gpu_error.reset_queue);
- }
+ /*
+ * Note: The wake_up also serves as a memory barrier so that
+ * waiters see the update value of the reset counter atomic_t.
+ */
+ i915_error_wake_up(dev_priv);
}
static void i915_report_and_clear_eir(struct drm_i915_private *dev_priv)
@@ -2668,25 +2656,27 @@ void i915_handle_error(struct drm_i915_private *dev_priv,
i915_capture_error_state(dev_priv, engine_mask, error_msg);
i915_report_and_clear_eir(dev_priv);
- if (engine_mask) {
- atomic_or(I915_RESET_IN_PROGRESS_FLAG,
- &dev_priv->gpu_error.reset_counter);
+ if (!engine_mask)
+ return;
- /*
- * Wakeup waiting processes so that the reset function
- * i915_reset_and_wakeup doesn't deadlock trying to grab
- * various locks. By bumping the reset counter first, the woken
- * processes will see a reset in progress and back off,
- * releasing their locks and then wait for the reset completion.
- * We must do this for _all_ gpu waiters that might hold locks
- * that the reset work needs to acquire.
- *
- * Note: The wake_up serves as the required memory barrier to
- * ensure that the waiters see the updated value of the reset
- * counter atomic_t.
- */
- i915_error_wake_up(dev_priv);
- }
+ if (test_and_set_bit(I915_RESET_IN_PROGRESS,
+ &dev_priv->gpu_error.flags))
+ return;
+
+ /*
+ * Wakeup waiting processes so that the reset function
+ * i915_reset_and_wakeup doesn't deadlock trying to grab
+ * various locks. By bumping the reset counter first, the woken
+ * processes will see a reset in progress and back off,
+ * releasing their locks and then wait for the reset completion.
+ * We must do this for _all_ gpu waiters that might hold locks
+ * that the reset work needs to acquire.
+ *
+ * Note: The wake_up serves as the required memory barrier to
+ * ensure that the waiters see the updated value of the reset
+ * counter atomic_t.
+ */
+ i915_error_wake_up(dev_priv);
i915_reset_and_wakeup(dev_priv);
}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9cbf5431c1e3..5934789747cf 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3174,15 +3174,26 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
drm_modeset_unlock_all(&dev_priv->drm);
}
+static bool abort_flip_on_reset(struct intel_crtc *crtc)
+{
+ struct i915_gpu_error *error = &to_i915(crtc->base.dev)->gpu_error;
+
+ if (i915_reset_in_progress(error))
+ return true;
+
+ if (crtc->reset_count != i915_reset_count(error))
+ return true;
+
+ return false;
+}
+
static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
{
struct drm_device *dev = crtc->dev;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
- unsigned reset_counter;
bool pending;
- reset_counter = i915_reset_counter(&to_i915(dev)->gpu_error);
- if (intel_crtc->reset_counter != reset_counter)
+ if (abort_flip_on_reset(intel_crtc))
return false;
spin_lock_irq(&dev->event_lock);
@@ -10971,10 +10982,8 @@ static bool __pageflip_finished_cs(struct intel_crtc *crtc,
{
struct drm_device *dev = crtc->base.dev;
struct drm_i915_private *dev_priv = to_i915(dev);
- unsigned reset_counter;
- reset_counter = i915_reset_counter(&dev_priv->gpu_error);
- if (crtc->reset_counter != reset_counter)
+ if (abort_flip_on_reset(crtc))
return true;
/*
@@ -11655,8 +11664,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
if (ret)
goto cleanup;
- intel_crtc->reset_counter = i915_reset_counter(&dev_priv->gpu_error);
- if (__i915_reset_in_progress_or_wedged(intel_crtc->reset_counter)) {
+ intel_crtc->reset_count = i915_reset_count(&dev_priv->gpu_error);
+ if (i915_reset_in_progress_or_wedged(&dev_priv->gpu_error)) {
ret = -EIO;
goto cleanup;
}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 1ad2e2c5f580..3b90db27ea1e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -683,8 +683,8 @@ struct intel_crtc {
struct intel_crtc_state *config;
- /* reset counter value when the last flip was submitted */
- unsigned int reset_counter;
+ /* global reset count when the last flip was submitted */
+ unsigned int reset_count;
/* Access to these should be protected by dev_priv->irq_lock. */
bool cpu_fifo_underrun_disabled;
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 10+ messages in thread
* ✗ Ro.CI.BAT: failure for series starting with [1/4] drm/i915: Record the position of the start of the request
2016-08-06 16:49 Per-engine reset, preliminary recovery patches Chris Wilson
` (3 preceding siblings ...)
2016-08-06 16:49 ` [PATCH 4/4] drm/i915: Separate out reset flags from the reset counter Chris Wilson
@ 2016-08-06 17:16 ` Patchwork
4 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2016-08-06 17:16 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/4] drm/i915: Record the position of the start of the request
URL : https://patchwork.freedesktop.org/series/10758/
State : failure
== Summary ==
Series 10758v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/10758/revisions/1/mbox
Test kms_cursor_legacy:
Subgroup basic-flip-vs-cursor-legacy:
pass -> FAIL (ro-skl3-i5-6260u)
fi-kbl-qkkr total:244 pass:185 dwarn:30 dfail:0 fail:2 skip:27
ro-bdw-i5-5250u total:240 pass:218 dwarn:4 dfail:0 fail:2 skip:16
ro-bdw-i7-5557U total:240 pass:224 dwarn:0 dfail:0 fail:0 skip:16
ro-bdw-i7-5600u total:240 pass:207 dwarn:0 dfail:0 fail:1 skip:32
ro-bsw-n3050 total:240 pass:194 dwarn:0 dfail:0 fail:4 skip:42
ro-byt-n2820 total:240 pass:197 dwarn:0 dfail:0 fail:3 skip:40
ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26
ro-hsw-i7-4770r total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26
ro-ilk-i7-620lm total:240 pass:172 dwarn:1 dfail:0 fail:2 skip:65
ro-ivb-i7-3770 total:240 pass:205 dwarn:0 dfail:0 fail:0 skip:35
ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31
ro-skl3-i5-6260u total:240 pass:222 dwarn:0 dfail:0 fail:4 skip:14
ro-snb-i7-2620M total:240 pass:198 dwarn:0 dfail:0 fail:1 skip:41
ro-ilk1-i5-650 failed to connect after reboot
Results at /archive/results/CI_IGT_test/RO_Patchwork_1751/
b834992 drm-intel-nightly: 2016y-08m-05d-20h-40m-44s UTC integration manifest
dfdbc69 drm/i915: Separate out reset flags from the reset counter
1aed894 drm/i915: Update reset path to fix incomplete requests
dcaefae drm/i915: Simplify ELSP queue request tracking
ed8e59a drm/i915: Record the position of the start of the request
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/4] drm/i915: Record the position of the start of the request
2016-08-06 16:49 ` [PATCH 1/4] drm/i915: Record the position of the start of the request Chris Wilson
@ 2016-08-10 12:19 ` Mika Kuoppala
2016-08-10 13:03 ` Chris Wilson
0 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2016-08-10 12:19 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> Not only does it make for good documentation and debugging aide, but it is
> also vital for when we want to unwind requests - such as when throwing away
> an incomplete request.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluvery@linux.intel.com
> ---
> drivers/gpu/drm/i915/i915_drv.h | 1 +
> drivers/gpu/drm/i915/i915_gem_request.c | 13 +++++++++----
> drivers/gpu/drm/i915/i915_gpu_error.c | 6 ++++--
> 3 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index feec00f769e1..040c3b6300ac 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -557,6 +557,7 @@ struct drm_i915_error_state {
> struct drm_i915_error_request {
> long jiffies;
> u32 seqno;
> + u32 head;
> u32 tail;
> } *requests;
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 6a1661643d3d..6c2553715263 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -391,6 +391,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> if (ret)
> goto err_ctx;
>
> + /* Record the position of the start of the request so that
> + * should we detect the updated seqno part-way through the
> + * GPU processing the request, we never over-estimate the
> + * position of the head.
> + */
> + req->head = req->ring->tail;
> +
> return req;
>
> err_ctx:
> @@ -467,8 +474,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>
> trace_i915_gem_request_add(request);
>
> - request->head = request_start;
Nuke the local request_start also?
-Mika
> -
> /* Whilst this request exists, batch_obj will be on the
> * active_list, and so will hold the active reference. Only when this
> * request is retired will the the batch_obj be moved onto the
> @@ -489,10 +494,10 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> list_add_tail(&request->link, &engine->request_list);
> list_add_tail(&request->ring_link, &ring->request_list);
>
> - /* Record the position of the start of the request so that
> + /* Record the position of the start of the breadcrumb so that
> * should we detect the updated seqno part-way through the
> * GPU processing the request, we never over-estimate the
> - * position of the head.
> + * position of the ring's HEAD.
> */
> request->postfix = ring->tail;
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index eecb87063c88..c9e1246ecd36 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -455,9 +455,10 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
> dev_priv->engine[i].name,
> ee->num_requests);
> for (j = 0; j < ee->num_requests; j++) {
> - err_printf(m, " seqno 0x%08x, emitted %ld, tail 0x%08x\n",
> + err_printf(m, " seqno 0x%08x, emitted %ld, head 0x%08x tail 0x%08x\n",
> ee->requests[j].seqno,
> ee->requests[j].jiffies,
> + ee->requests[j].head,
> ee->requests[j].tail);
> }
> }
> @@ -1205,7 +1206,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
> erq = &ee->requests[count++];
> erq->seqno = request->fence.seqno;
> erq->jiffies = request->emitted_jiffies;
> - erq->tail = request->postfix;
> + erq->head = request->head;
> + erq->tail = request->tail;
> }
> }
> }
> --
> 2.8.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/4] drm/i915: Record the position of the start of the request
2016-08-10 12:19 ` Mika Kuoppala
@ 2016-08-10 13:03 ` Chris Wilson
0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2016-08-10 13:03 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Wed, Aug 10, 2016 at 03:19:34PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > Not only does it make for good documentation and debugging aide, but it is
> > also vital for when we want to unwind requests - such as when throwing away
> > an incomplete request.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluvery@linux.intel.com
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 1 +
> > drivers/gpu/drm/i915/i915_gem_request.c | 13 +++++++++----
> > drivers/gpu/drm/i915/i915_gpu_error.c | 6 ++++--
> > 3 files changed, 14 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index feec00f769e1..040c3b6300ac 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -557,6 +557,7 @@ struct drm_i915_error_state {
> > struct drm_i915_error_request {
> > long jiffies;
> > u32 seqno;
> > + u32 head;
> > u32 tail;
> > } *requests;
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index 6a1661643d3d..6c2553715263 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -391,6 +391,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> > if (ret)
> > goto err_ctx;
> >
> > + /* Record the position of the start of the request so that
> > + * should we detect the updated seqno part-way through the
> > + * GPU processing the request, we never over-estimate the
> > + * position of the head.
> > + */
> > + req->head = req->ring->tail;
> > +
> > return req;
> >
> > err_ctx:
> > @@ -467,8 +474,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> >
> > trace_i915_gem_request_add(request);
> >
> > - request->head = request_start;
>
> Nuke the local request_start also?
Not quite. It still is used for making sure that the emission during
i915_add_request() fits within the reserved space. There is no other
convenient marker.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking
2016-08-06 16:49 ` [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking Chris Wilson
@ 2016-08-23 14:28 ` Mika Kuoppala
2016-08-23 14:51 ` Chris Wilson
0 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2016-08-23 14:28 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> Emulate HW to track and manage ELSP queue. A set of SW ports are defined
> and requests are assigned to these ports before submitting them to HW. This
> helps in cleaning up incomplete requests during reset recovery easier
> especially after engine reset by decoupling elsp queue management. This
> will become more clear in the next patch.
>
> In the engine reset case we want to resume where we left-off after skipping
> the incomplete batch which requires checking the elsp queue, removing
> element and fixing elsp_submitted counts in some cases. Instead of directly
> manipulating the elsp queue from reset path we can examine these ports, fix
> up ringbuffer pointers using the incomplete request and restart submissions
> again after reset.
>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-3-git-send-email-arun.siluvery@linux.intel.com
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
> drivers/gpu/drm/i915/i915_gem.c | 6 +-
> drivers/gpu/drm/i915/i915_gem_request.c | 11 +-
> drivers/gpu/drm/i915/i915_gem_request.h | 24 +-
> drivers/gpu/drm/i915/intel_lrc.c | 417 +++++++++++---------------------
> drivers/gpu/drm/i915/intel_lrc.h | 2 -
> drivers/gpu/drm/i915/intel_ringbuffer.h | 7 +-
> 7 files changed, 160 insertions(+), 309 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 9bd41581b592..f1f937a64c27 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2200,7 +2200,7 @@ static int i915_execlists(struct seq_file *m, void *data)
> status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(engine));
> seq_printf(m, "\tStatus pointer: 0x%08X\n", status_pointer);
>
> - read_pointer = engine->next_context_status_buffer;
> + read_pointer = GEN8_CSB_READ_PTR(status_pointer);
> write_pointer = GEN8_CSB_WRITE_PTR(status_pointer);
> if (read_pointer > write_pointer)
> write_pointer += GEN8_CSB_ENTRIES;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f4f8eaa90f2a..19715f5e698f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2443,7 +2443,11 @@ static void i915_gem_reset_engine_cleanup(struct intel_engine_cs *engine)
> /* Ensure irq handler finishes or is cancelled. */
> tasklet_kill(&engine->irq_tasklet);
>
> - intel_execlists_cancel_requests(engine);
> + INIT_LIST_HEAD(&engine->execlist_queue);
> + i915_gem_request_assign(&engine->execlist_port[0].request,
> + NULL);
> + i915_gem_request_assign(&engine->execlist_port[1].request,
> + NULL);
How about engine->reset() which would contain these?
> }
>
> /*
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 6c2553715263..49cca4233a8e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -736,16 +736,18 @@ complete:
> return ret;
> }
>
> -static void engine_retire_requests(struct intel_engine_cs *engine)
> +static bool engine_retire_requests(struct intel_engine_cs *engine)
> {
> struct drm_i915_gem_request *request, *next;
>
> list_for_each_entry_safe(request, next, &engine->request_list, link) {
> if (!i915_gem_request_completed(request))
> - break;
> + return false;
>
> i915_gem_request_retire(request);
> }
> +
> + return true;
> }
>
> void i915_gem_retire_requests(struct drm_i915_private *dev_priv)
> @@ -759,9 +761,8 @@ void i915_gem_retire_requests(struct drm_i915_private *dev_priv)
>
> GEM_BUG_ON(!dev_priv->gt.awake);
>
> - for_each_engine(engine, dev_priv) {
> - engine_retire_requests(engine);
> - if (!intel_engine_is_active(engine))
> + for_each_engine_masked(engine, dev_priv, dev_priv->gt.active_engines) {
> + if (engine_retire_requests(engine))
> dev_priv->gt.active_engines &= ~intel_engine_flag(engine);
> }
>
These were already stripped to another patch. Good.
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
> index 3496e28785e7..32123e38ef4b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.h
> +++ b/drivers/gpu/drm/i915/i915_gem_request.h
> @@ -94,6 +94,9 @@ struct drm_i915_gem_request {
> /** Position in the ringbuffer of the end of the whole request */
> u32 tail;
>
> + /** Position in the ringbuffer after the end of the whole request */
> + u32 wa_tail;
> +
> /** Preallocate space in the ringbuffer for the emitting the request */
> u32 reserved_space;
>
> @@ -130,27 +133,8 @@ struct drm_i915_gem_request {
> /** process identifier submitting this request */
> struct pid *pid;
>
> - /**
> - * The ELSP only accepts two elements at a time, so we queue
> - * context/tail pairs on a given queue (ring->execlist_queue) until the
> - * hardware is available. The queue serves a double purpose: we also use
> - * it to keep track of the up to 2 contexts currently in the hardware
> - * (usually one in execution and the other queued up by the GPU): We
> - * only remove elements from the head of the queue when the hardware
> - * informs us that an element has been completed.
> - *
> - * All accesses to the queue are mediated by a spinlock
> - * (ring->execlist_lock).
> - */
> -
> - /** Execlist link in the submission queue.*/
> + /** Link in the execlist submission queue, guarded by execlist_lock. */
> struct list_head execlist_link;
> -
> - /** Execlists no. of times this request has been sent to the ELSP */
> - int elsp_submitted;
> -
> - /** Execlists context hardware id. */
> - unsigned int ctx_hw_id;
> };
>
> extern const struct fence_ops i915_fence_ops;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 309c5d9b1c57..56b5aa8a35c4 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -156,6 +156,11 @@
> #define GEN8_CTX_STATUS_COMPLETE (1 << 4)
> #define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15)
>
> +#define GEN8_CTX_STATUS_COMPLETED_MASK \
> + (GEN8_CTX_STATUS_ACTIVE_IDLE | \
> + GEN8_CTX_STATUS_PREEMPTED | \
> + GEN8_CTX_STATUS_ELEMENT_SWITCH)
> +
> #define CTX_LRI_HEADER_0 0x01
> #define CTX_CONTEXT_CONTROL 0x02
> #define CTX_RING_HEAD 0x04
> @@ -263,12 +268,10 @@ logical_ring_init_platform_invariants(struct intel_engine_cs *engine)
> {
> struct drm_i915_private *dev_priv = engine->i915;
>
> - if (IS_GEN8(dev_priv) || IS_GEN9(dev_priv))
> - engine->idle_lite_restore_wa = ~0;
> -
> - engine->disable_lite_restore_wa = (IS_SKL_REVID(dev_priv, 0, SKL_REVID_B0) ||
> - IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) &&
> - (engine->id == VCS || engine->id == VCS2);
> + engine->disable_lite_restore_wa =
> + (IS_SKL_REVID(dev_priv, 0, SKL_REVID_B0) ||
> + IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) &&
> + (engine->id == VCS || engine->id == VCS2);
Candidate for another cleanup patch after as both revids smells obsolete.
>
> engine->ctx_desc_template = GEN8_CTX_VALID;
> if (IS_GEN8(dev_priv))
> @@ -328,36 +331,6 @@ uint64_t intel_lr_context_descriptor(struct i915_gem_context *ctx,
> return ctx->engine[engine->id].lrc_desc;
> }
>
> -static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
> - struct drm_i915_gem_request *rq1)
> -{
> -
> - struct intel_engine_cs *engine = rq0->engine;
> - struct drm_i915_private *dev_priv = rq0->i915;
> - uint64_t desc[2];
> -
> - if (rq1) {
> - desc[1] = intel_lr_context_descriptor(rq1->ctx, rq1->engine);
> - rq1->elsp_submitted++;
> - } else {
> - desc[1] = 0;
> - }
> -
> - desc[0] = intel_lr_context_descriptor(rq0->ctx, rq0->engine);
> - rq0->elsp_submitted++;
> -
> - /* You must always write both descriptors in the order below. */
> - I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[1]));
> - I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[1]));
> -
> - I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[0]));
> - /* The context is automatically loaded after the following */
> - I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[0]));
> -
> - /* ELSP is a wo register, use another nearby reg for posting */
> - POSTING_READ_FW(RING_EXECLIST_STATUS_LO(engine));
> -}
> -
> static void
> execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
> {
> @@ -367,13 +340,12 @@ execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
> ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
> }
>
> -static void execlists_update_context(struct drm_i915_gem_request *rq)
> +static u64 execlists_update_context(struct drm_i915_gem_request *rq)
> {
> - struct intel_engine_cs *engine = rq->engine;
> + struct intel_context *ce = &rq->ctx->engine[rq->engine->id];
> struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
> - uint32_t *reg_state = rq->ctx->engine[engine->id].lrc_reg_state;
>
> - reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail);
> + ce->lrc_reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail);
>
> /* True 32b PPGTT with dynamic page allocation: update PDP
> * registers and point the unallocated PDPs to scratch page.
> @@ -381,32 +353,14 @@ static void execlists_update_context(struct drm_i915_gem_request *rq)
> * in 48-bit mode.
> */
> if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> - execlists_update_context_pdps(ppgtt, reg_state);
> -}
> -
> -static void execlists_elsp_submit_contexts(struct drm_i915_gem_request *rq0,
> - struct drm_i915_gem_request *rq1)
> -{
> - struct drm_i915_private *dev_priv = rq0->i915;
> - unsigned int fw_domains = rq0->engine->fw_domains;
> -
> - execlists_update_context(rq0);
> -
> - if (rq1)
> - execlists_update_context(rq1);
> + execlists_update_context_pdps(ppgtt, ce->lrc_reg_state);
>
> - spin_lock_irq(&dev_priv->uncore.lock);
> - intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
> -
> - execlists_elsp_write(rq0, rq1);
> -
> - intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
> - spin_unlock_irq(&dev_priv->uncore.lock);
> + return ce->lrc_desc;
> }
>
> -static inline void execlists_context_status_change(
> - struct drm_i915_gem_request *rq,
> - unsigned long status)
> +static inline void
> +execlists_context_status_change(struct drm_i915_gem_request *rq,
> + unsigned long status)
> {
> /*
> * Only used when GVT-g is enabled now. When GVT-g is disabled,
> @@ -418,122 +372,104 @@ static inline void execlists_context_status_change(
> atomic_notifier_call_chain(&rq->ctx->status_notifier, status, rq);
> }
>
> -static void execlists_unqueue(struct intel_engine_cs *engine)
> +static void execlists_submit_ports(struct intel_engine_cs *engine)
> {
> - struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
> - struct drm_i915_gem_request *cursor, *tmp;
> -
> - assert_spin_locked(&engine->execlist_lock);
> -
> - /*
> - * If irqs are not active generate a warning as batches that finish
> - * without the irqs may get lost and a GPU Hang may occur.
> - */
> - WARN_ON(!intel_irqs_enabled(engine->i915));
> -
> - /* Try to read in pairs */
> - list_for_each_entry_safe(cursor, tmp, &engine->execlist_queue,
> - execlist_link) {
> - if (!req0) {
> - req0 = cursor;
> - } else if (req0->ctx == cursor->ctx) {
> - /* Same ctx: ignore first request, as second request
> - * will update tail past first request's workload */
> - cursor->elsp_submitted = req0->elsp_submitted;
> - list_del(&req0->execlist_link);
> - i915_gem_request_put(req0);
> - req0 = cursor;
> - } else {
> - if (IS_ENABLED(CONFIG_DRM_I915_GVT)) {
> - /*
> - * req0 (after merged) ctx requires single
> - * submission, stop picking
> - */
> - if (req0->ctx->execlists_force_single_submission)
> - break;
> - /*
> - * req0 ctx doesn't require single submission,
> - * but next req ctx requires, stop picking
> - */
> - if (cursor->ctx->execlists_force_single_submission)
> - break;
> - }
> - req1 = cursor;
> - WARN_ON(req1->elsp_submitted);
> - break;
> - }
> - }
> -
> - if (unlikely(!req0))
> - return;
> -
> - execlists_context_status_change(req0, INTEL_CONTEXT_SCHEDULE_IN);
> + struct drm_i915_private *dev_priv = engine->i915;
> + u32 *elsp = dev_priv->regs + i915_mmio_reg_offset(RING_ELSP(engine));
> + u64 desc[2];
>
> - if (req1)
> - execlists_context_status_change(req1,
> + if (!engine->execlist_port[0].count)
> + execlists_context_status_change(engine->execlist_port[0].request,
> INTEL_CONTEXT_SCHEDULE_IN);
> + desc[0] = execlists_update_context(engine->execlist_port[0].request);
> + engine->preempt_wa = engine->execlist_port[0].count++;
>
> - if (req0->elsp_submitted & engine->idle_lite_restore_wa) {
> - /*
> - * WaIdleLiteRestore: make sure we never cause a lite restore
> - * with HEAD==TAIL.
> - *
> - * Apply the wa NOOPS to prevent ring:HEAD == req:TAIL as we
> - * resubmit the request. See gen8_emit_request() for where we
> - * prepare the padding after the end of the request.
> - */
> - req0->tail += 8;
> - req0->tail &= req0->ring->size - 1;
> + if (engine->execlist_port[1].request) {
> + GEM_BUG_ON(engine->execlist_port[1].count);
> + execlists_context_status_change(engine->execlist_port[1].request,
> + INTEL_CONTEXT_SCHEDULE_IN);
> + desc[1] = execlists_update_context(engine->execlist_port[1].request);
> + engine->execlist_port[1].count = 1;
> + } else {
> + desc[1] = 0;
> }
>
> - execlists_elsp_submit_contexts(req0, req1);
> + /* You must always write both descriptors in the order below. */
> + writel(upper_32_bits(desc[1]), elsp);
> + writel(lower_32_bits(desc[1]), elsp);
> +
> + writel(upper_32_bits(desc[0]), elsp);
> + /* The context is automatically loaded after the following */
> + writel(lower_32_bits(desc[0]), elsp);
> }
>
> -static unsigned int
> -execlists_check_remove_request(struct intel_engine_cs *engine, u32 ctx_id)
> +static bool merge_ctx(struct i915_gem_context *prev,
> + struct i915_gem_context *next)
> {
> - struct drm_i915_gem_request *head_req;
> + if (prev != next)
> + return false;
>
> - assert_spin_locked(&engine->execlist_lock);
> + if (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> + prev->execlists_force_single_submission)
> + return false;
>
> - head_req = list_first_entry_or_null(&engine->execlist_queue,
> - struct drm_i915_gem_request,
> - execlist_link);
> -
> - if (WARN_ON(!head_req || (head_req->ctx_hw_id != ctx_id)))
> - return 0;
> + return true;
> +}
>
> - WARN(head_req->elsp_submitted == 0, "Never submitted head request\n");
> +static void execlists_context_unqueue(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_gem_request *cursor, *last;
> + struct execlist_port *port = engine->execlist_port;
> + bool submit = false;
> +
> + last = port->request;
> + if (last)
> + /* WaIdleLiteRestore:bdw,skl
> + * Apply the wa NOOPs to prevent ring:HEAD == req:TAIL
> + * as we resubmit the request. See gen8_emit_request()
> + * for where we prepare the padding after the end of the
> + * request.
> + */
> + last->tail = last->wa_tail;
>
> - if (--head_req->elsp_submitted > 0)
> - return 0;
> + /* Try to read in pairs and fill both submission ports */
> + spin_lock(&engine->execlist_lock);
> + list_for_each_entry(cursor, &engine->execlist_queue, execlist_link) {
> + if (last && !merge_ctx(cursor->ctx, last->ctx)) {
> + if (port != engine->execlist_port)
> + break;
In here you will merge ctx also for the second port?
The previous version of the code was careful to only merge for
the first port/request.
>
> - execlists_context_status_change(head_req, INTEL_CONTEXT_SCHEDULE_OUT);
> + if (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> + cursor->ctx->execlists_force_single_submission)
> + break;
>
> - list_del(&head_req->execlist_link);
> - i915_gem_request_put(head_req);
> + i915_gem_request_assign(&port->request, last);
> + port++;
> + }
> + last = cursor;
> + submit = true;
> + }
> + if (submit) {
> + i915_gem_request_assign(&port->request, last);
> + engine->execlist_queue.next = &cursor->execlist_link;
> + cursor->execlist_link.prev = &engine->execlist_queue;
We keep the cursor pointing to the request that will be in port[0].request?
while (!list_empty(engine->execlist_queue)) and then removing from
the list would have made the fill loop more complex?
> + }
> + spin_unlock(&engine->execlist_lock);
>
> - return 1;
> + if (submit)
> + execlists_submit_ports(engine);
> }
>
> -static u32
> -get_context_status(struct intel_engine_cs *engine, unsigned int read_pointer,
> - u32 *context_id)
> +static bool execlists_elsp_idle(struct intel_engine_cs *engine)
> {
> - struct drm_i915_private *dev_priv = engine->i915;
> - u32 status;
> -
> - read_pointer %= GEN8_CSB_ENTRIES;
> -
> - status = I915_READ_FW(RING_CONTEXT_STATUS_BUF_LO(engine, read_pointer));
> -
> - if (status & GEN8_CTX_STATUS_IDLE_ACTIVE)
> - return 0;
> + return !engine->execlist_port[0].request;
> +}
>
> - *context_id = I915_READ_FW(RING_CONTEXT_STATUS_BUF_HI(engine,
> - read_pointer));
> +static bool execlists_elsp_ready(struct intel_engine_cs *engine)
> +{
> + int idx = !engine->disable_lite_restore_wa && !engine->preempt_wa;
>
> - return status;
> + return !engine->execlist_port[idx].request;
> }
>
> /*
> @@ -544,100 +480,63 @@ static void intel_lrc_irq_handler(unsigned long data)
> {
> struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
> struct drm_i915_private *dev_priv = engine->i915;
> - u32 status_pointer;
> - unsigned int read_pointer, write_pointer;
> - u32 csb[GEN8_CSB_ENTRIES][2];
> - unsigned int csb_read = 0, i;
> - unsigned int submit_contexts = 0;
>
> intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
>
> - status_pointer = I915_READ_FW(RING_CONTEXT_STATUS_PTR(engine));
> -
> - read_pointer = engine->next_context_status_buffer;
> - write_pointer = GEN8_CSB_WRITE_PTR(status_pointer);
> - if (read_pointer > write_pointer)
> - write_pointer += GEN8_CSB_ENTRIES;
> -
> - while (read_pointer < write_pointer) {
> - if (WARN_ON_ONCE(csb_read == GEN8_CSB_ENTRIES))
> - break;
> - csb[csb_read][0] = get_context_status(engine, ++read_pointer,
> - &csb[csb_read][1]);
> - csb_read++;
> - }
> -
> - engine->next_context_status_buffer = write_pointer % GEN8_CSB_ENTRIES;
> -
> - /* Update the read pointer to the old write pointer. Manual ringbuffer
> - * management ftw </sarcasm> */
> - I915_WRITE_FW(RING_CONTEXT_STATUS_PTR(engine),
> - _MASKED_FIELD(GEN8_CSB_READ_PTR_MASK,
> - engine->next_context_status_buffer << 8));
> -
> - intel_uncore_forcewake_put(dev_priv, engine->fw_domains);
> -
> - spin_lock(&engine->execlist_lock);
> -
> - for (i = 0; i < csb_read; i++) {
> - if (unlikely(csb[i][0] & GEN8_CTX_STATUS_PREEMPTED)) {
> - if (csb[i][0] & GEN8_CTX_STATUS_LITE_RESTORE) {
> - if (execlists_check_remove_request(engine, csb[i][1]))
> - WARN(1, "Lite Restored request removed from queue\n");
> - } else
> - WARN(1, "Preemption without Lite Restore\n");
> + if (!execlists_elsp_idle(engine)) {
> + u32 *ring_mmio =
> + dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
> + u32 *csb_mmio =
> + dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
> + unsigned ring, head, tail;
> +
> + ring = readl(ring_mmio);
> + head = GEN8_CSB_READ_PTR(ring);
> + tail = GEN8_CSB_WRITE_PTR(ring);
> + if (tail < head)
> + tail += GEN8_CSB_ENTRIES;
> +
> + while (head < tail) {
> + unsigned idx = ++head % GEN8_CSB_ENTRIES;
> + unsigned status = readl(&csb_mmio[2*idx]);
> +
> + if (status & GEN8_CTX_STATUS_COMPLETED_MASK) {
> + GEM_BUG_ON(engine->execlist_port[0].count == 0);
Would this be the safety valve if we go out of sync vrt submit vs head?
> + if (--engine->execlist_port[0].count == 0) {
> + GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);
> + execlists_context_status_change(engine->execlist_port[0].request,
> + INTEL_CONTEXT_SCHEDULE_OUT);
> + i915_gem_request_put(engine->execlist_port[0].request);
> + engine->execlist_port[0] = engine->execlist_port[1];
Intention here is to always retire through port zero, and thus the
refcounts matches the submitted ones?
> + memset(&engine->execlist_port[1], 0,
> + sizeof(engine->execlist_port[1]));
> + engine->preempt_wa = false;
> + }
> + }
> + GEM_BUG_ON(engine->execlist_port[0].count == 0 &&
> + !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
> }
>
> - if (csb[i][0] & (GEN8_CTX_STATUS_ACTIVE_IDLE |
> - GEN8_CTX_STATUS_ELEMENT_SWITCH))
> - submit_contexts +=
> - execlists_check_remove_request(engine, csb[i][1]);
> + writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, GEN8_CSB_WRITE_PTR(ring) << 8),
> + ring_mmio);
> }
>
> - if (submit_contexts) {
> - if (!engine->disable_lite_restore_wa ||
> - (csb[i][0] & GEN8_CTX_STATUS_ACTIVE_IDLE))
> - execlists_unqueue(engine);
> - }
> + if (execlists_elsp_ready(engine))
> + execlists_context_unqueue(engine);
>
> - spin_unlock(&engine->execlist_lock);
> -
> - if (unlikely(submit_contexts > 2))
> - DRM_ERROR("More than two context complete events?\n");
> + intel_uncore_forcewake_put(dev_priv, engine->fw_domains);
> }
>
> static void execlists_submit_request(struct drm_i915_gem_request *request)
> {
> struct intel_engine_cs *engine = request->engine;
> - struct drm_i915_gem_request *cursor;
> - int num_elements = 0;
>
> spin_lock_bh(&engine->execlist_lock);
>
> - list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
> - if (++num_elements > 2)
> - break;
> -
> - if (num_elements > 2) {
> - struct drm_i915_gem_request *tail_req;
> -
> - tail_req = list_last_entry(&engine->execlist_queue,
> - struct drm_i915_gem_request,
> - execlist_link);
> -
> - if (request->ctx == tail_req->ctx) {
> - WARN(tail_req->elsp_submitted != 0,
> - "More than 2 already-submitted reqs queued\n");
> - list_del(&tail_req->execlist_link);
> - i915_gem_request_put(tail_req);
> - }
> - }
> -
> - i915_gem_request_get(request);
> list_add_tail(&request->execlist_link, &engine->execlist_queue);
> - request->ctx_hw_id = request->ctx->hw_id;
> - if (num_elements == 0)
> - execlists_unqueue(engine);
> +
> + if (execlists_elsp_idle(engine))
> + tasklet_hi_schedule(&engine->irq_tasklet);
>
> spin_unlock_bh(&engine->execlist_lock);
> }
> @@ -731,6 +630,7 @@ intel_logical_ring_advance(struct drm_i915_gem_request *request)
> intel_ring_emit(ring, MI_NOOP);
> intel_ring_emit(ring, MI_NOOP);
> intel_ring_advance(ring);
> + request->wa_tail = request->ring->tail;
>
> /* We keep the previous context alive until we retire the following
> * request. This ensures that any the context object is still pinned
> @@ -743,23 +643,6 @@ intel_logical_ring_advance(struct drm_i915_gem_request *request)
> return 0;
> }
>
> -void intel_execlists_cancel_requests(struct intel_engine_cs *engine)
> -{
> - struct drm_i915_gem_request *req, *tmp;
> - LIST_HEAD(cancel_list);
> -
> - WARN_ON(!mutex_is_locked(&engine->i915->drm.struct_mutex));
> -
> - spin_lock_bh(&engine->execlist_lock);
> - list_replace_init(&engine->execlist_queue, &cancel_list);
> - spin_unlock_bh(&engine->execlist_lock);
> -
> - list_for_each_entry_safe(req, tmp, &cancel_list, execlist_link) {
> - list_del(&req->execlist_link);
> - i915_gem_request_put(req);
> - }
> -}
> -
> static int intel_lr_context_pin(struct i915_gem_context *ctx,
> struct intel_engine_cs *engine)
> {
> @@ -1285,7 +1168,6 @@ static void lrc_init_hws(struct intel_engine_cs *engine)
> static int gen8_init_common_ring(struct intel_engine_cs *engine)
> {
> struct drm_i915_private *dev_priv = engine->i915;
> - unsigned int next_context_status_buffer_hw;
>
> lrc_init_hws(engine);
>
> @@ -1296,32 +1178,12 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
> I915_WRITE(RING_MODE_GEN7(engine),
> _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
> _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> - POSTING_READ(RING_MODE_GEN7(engine));
>
> - /*
> - * Instead of resetting the Context Status Buffer (CSB) read pointer to
> - * zero, we need to read the write pointer from hardware and use its
> - * value because "this register is power context save restored".
> - * Effectively, these states have been observed:
> - *
> - * | Suspend-to-idle (freeze) | Suspend-to-RAM (mem) |
> - * BDW | CSB regs not reset | CSB regs reset |
> - * CHT | CSB regs not reset | CSB regs not reset |
> - * SKL | ? | ? |
> - * BXT | ? | ? |
> - */
> - next_context_status_buffer_hw =
> - GEN8_CSB_WRITE_PTR(I915_READ(RING_CONTEXT_STATUS_PTR(engine)));
> -
> - /*
> - * When the CSB registers are reset (also after power-up / gpu reset),
> - * CSB write pointer is set to all 1's, which is not valid, use '5' in
> - * this special case, so the first element read is CSB[0].
> - */
> - if (next_context_status_buffer_hw == GEN8_CSB_PTR_MASK)
> - next_context_status_buffer_hw = (GEN8_CSB_ENTRIES - 1);
> + I915_WRITE(RING_CONTEXT_STATUS_PTR(engine),
> + _MASKED_FIELD(GEN8_CSB_READ_PTR_MASK |
> + GEN8_CSB_WRITE_PTR_MASK,
> + 0));
>
> - engine->next_context_status_buffer = next_context_status_buffer_hw;
> DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
>
> intel_engine_init_hangcheck(engine);
> @@ -1706,7 +1568,6 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
> }
> intel_lr_context_unpin(dev_priv->kernel_context, engine);
>
> - engine->idle_lite_restore_wa = 0;
> engine->disable_lite_restore_wa = false;
> engine->ctx_desc_template = 0;
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index a52cf57dbd40..4d70346500c2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -97,6 +97,4 @@ int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
> int enable_execlists);
> void intel_execlists_enable_submission(struct drm_i915_private *dev_priv);
>
> -void intel_execlists_cancel_requests(struct intel_engine_cs *engine);
> -
> #endif /* _INTEL_LRC_H_ */
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 43e545e44352..ed8a8482a7fb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -288,11 +288,14 @@ struct intel_engine_cs {
> /* Execlists */
> struct tasklet_struct irq_tasklet;
> spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */
> + struct execlist_port {
> + struct drm_i915_gem_request *request;
> + unsigned int count;
> + } execlist_port[2];
> struct list_head execlist_queue;
> unsigned int fw_domains;
> - unsigned int next_context_status_buffer;
> - unsigned int idle_lite_restore_wa;
> bool disable_lite_restore_wa;
> + bool preempt_wa;
This needs to be reset also in logical_ring_cleanup().
Overall this patch really makes the submission/retirement much more
compact and simpler. The exception being the unqueuing part which
is difficult due to the merging logic.
Some bikescheds I made during review can be found in:
https://cgit.freedesktop.org/~miku/drm-intel/commit/?h=review&id=3c806a9f6665db3a08f949224560bb039af1be1a
-Mika
> u32 ctx_desc_template;
>
> /**
> --
> 2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking
2016-08-23 14:28 ` Mika Kuoppala
@ 2016-08-23 14:51 ` Chris Wilson
0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2016-08-23 14:51 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Tue, Aug 23, 2016 at 05:28:07PM +0300, Mika Kuoppala wrote:
> > + /* Try to read in pairs and fill both submission ports */
> > + spin_lock(&engine->execlist_lock);
> > + list_for_each_entry(cursor, &engine->execlist_queue, execlist_link) {
> > + if (last && !merge_ctx(cursor->ctx, last->ctx)) {
> > + if (port != engine->execlist_port)
> > + break;
>
> In here you will merge ctx also for the second port?
>
> The previous version of the code was careful to only merge for
> the first port/request.
Careful? Hah.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-08-23 14:52 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-06 16:49 Per-engine reset, preliminary recovery patches Chris Wilson
2016-08-06 16:49 ` [PATCH 1/4] drm/i915: Record the position of the start of the request Chris Wilson
2016-08-10 12:19 ` Mika Kuoppala
2016-08-10 13:03 ` Chris Wilson
2016-08-06 16:49 ` [PATCH 2/4] drm/i915: Simplify ELSP queue request tracking Chris Wilson
2016-08-23 14:28 ` Mika Kuoppala
2016-08-23 14:51 ` Chris Wilson
2016-08-06 16:49 ` [PATCH 3/4] drm/i915: Update reset path to fix incomplete requests Chris Wilson
2016-08-06 16:49 ` [PATCH 4/4] drm/i915: Separate out reset flags from the reset counter Chris Wilson
2016-08-06 17:16 ` ✗ Ro.CI.BAT: failure for series starting with [1/4] drm/i915: Record the position of the start of the request Patchwork
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.