From: John.C.Harrison@Intel.com
To: Intel-GFX@Lists.FreeDesktop.Org
Subject: [RFC 36/38] drm/i915/preempt: update (LRC) ringbuffer-filling code to create preemptive requests
Date: Fri, 11 Dec 2015 14:49:52 +0000	[thread overview]
Message-ID: <1449845392-14799-1-git-send-email-John.C.Harrison@Intel.com> (raw)
In-Reply-To: <1448278932-31551-36-git-send-email-John.C.Harrison@Intel.com>

From: Dave Gordon <david.s.gordon@intel.com>

This patch refactors the ringbuffer-level code (in execlists/GuC mode
only) and enhances it so that it can emit the proper sequence of opcodes
for preemption requests.

A preemption request is similar to a batch submission, but doesn't
actually invoke a batchbuffer; its purpose is simply to get the engine
to stop what it's doing so that the scheduler can then send it a new
workload instead.
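
In outline (using the helper functions this patch introduces below), the
command sequence written into the ring for each request becomes:

    emit_preamble(req);                 /* log the 'active' seqno        */
    if (!(req->scheduler_flags & i915_req_sf_preempt))
        ring->emit_bb_start(req, exec_start, params->dispatch_flags);
    emit_postamble(req);                /* log 'done' seqno + interrupt  */

so a preemptive request carries the same book-keeping commands as a
normal one, just with no batchbuffer start in the middle.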

Preemption requests use different locations in the hardware status page
than regular batches do to hold their 'active' and 'done' seqnos, so
that information pertaining to a preempted batch is not overwritten.
Also, whereas a regular batch clears its 'active' flag when it finishes
(so that TDR knows it's no longer to blame), a preemption request leaves
it set, and the driver clears it once it has noticed that the preemption
request has completed. Only one preemption (per ring) can be in progress
at a time, so this handshake ensures correct sequencing of the request
between the GPU and CPU.
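
As an illustration only (not part of this patch), the driver's half of
that handshake could look roughly like the sketch below. The HWS index
names are the ones this series defines; intel_read_status_page() and
intel_write_status_page() are assumed to be the usual status-page
accessors, and the function itself is hypothetical:

    /* Hypothetical helper, called from wherever completion of the
     * preemption request is noticed (e.g. the interrupt handler).
     */
    static void i915_preempt_acknowledge(struct intel_engine_cs *ring)
    {
        /* 'done' seqno is written by the GPU in the postamble */
        if (!intel_read_status_page(ring, I915_PREEMPTIVE_DONE_SEQNO))
            return;

        /* The GPU deliberately leaves 'active' set; the driver clears
         * it here so that TDR stops blaming the finished preemption
         * request. Only one preemption per ring is in flight at a
         * time, so this read-then-clear is sufficient sequencing.
         */
        intel_write_status_page(ring, I915_PREEMPTIVE_ACTIVE_SEQNO, 0);
    }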

Actually-preemptive requests are still disabled via a module parameter
at this stage, but all the components should now be ready for us to turn
it on :)

v2: Updated to use locally cached request pointer and to fix the
location of the dispatch trace point.

For: VIZ-2021
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 177 ++++++++++++++++++++++++++++++---------
 1 file changed, 136 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 36d63b7..31645a3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -748,7 +748,7 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 	struct drm_i915_private *dev_priv = request->i915;
 	struct i915_guc_client *client = dev_priv->guc.execbuf_client;
 	const static bool fake = false;	/* true => only pretend to preempt */
-	bool preemptive = false;	/* for now */
+	bool preemptive;
 
 	intel_logical_ring_advance(request->ringbuf);
 
@@ -757,6 +757,7 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 	if (intel_ring_stopped(ring))
 		return;
 
+	preemptive = (request->scheduler_flags & i915_req_sf_preempt) != 0;
 	if (preemptive && dev_priv->guc.preempt_client && !fake)
 		client = dev_priv->guc.preempt_client;
 
@@ -951,6 +952,117 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 }
 
 /*
+ * This function stores the specified constant value in the (index)th DWORD of the
+ * hardware status page (execlist mode only). See separate code for legacy mode.
+ */
+static void
+emit_store_dw_index(struct drm_i915_gem_request *req, uint32_t value, uint32_t index)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	uint64_t hwpa = req->ring->status_page.gfx_addr;
+	hwpa += index << MI_STORE_DWORD_INDEX_SHIFT;
+
+	intel_logical_ring_emit(ringbuf, MI_STORE_DWORD_IMM_GEN4 | MI_GLOBAL_GTT);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(hwpa));
+	intel_logical_ring_emit(ringbuf, upper_32_bits(hwpa)); /* GEN8+ */
+	intel_logical_ring_emit(ringbuf, value);
+
+	req->ring->gpu_caches_dirty = true;
+}
+
+/*
+ * This function stores the specified register value in the (index)th DWORD
+ * of the hardware status page (execlist mode only). See separate code for
+ * legacy mode.
+ */
+static void
+emit_store_reg_index(struct drm_i915_gem_request *req, uint32_t reg, uint32_t index)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	uint64_t hwpa = req->ring->status_page.gfx_addr;
+	hwpa += index << MI_STORE_DWORD_INDEX_SHIFT;
+
+	intel_logical_ring_emit(ringbuf, (MI_STORE_REG_MEM+1) | MI_GLOBAL_GTT);
+	intel_logical_ring_emit(ringbuf, reg);
+	intel_logical_ring_emit(ringbuf, lower_32_bits(hwpa));
+	intel_logical_ring_emit(ringbuf, upper_32_bits(hwpa)); /* GEN8+ */
+
+	req->ring->gpu_caches_dirty = true;
+}
+
+/*
+ * Emit the commands to execute when preparing to start a batch
+ *
+ * The GPU will log the seqno of the batch before it starts
+ * running any of the commands to actually execute that batch
+ */
+static void
+emit_preamble(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	uint32_t seqno = i915_gem_request_get_seqno(req);
+
+	BUG_ON(!seqno);
+	if (req->scheduler_flags & i915_req_sf_preempt)
+		emit_store_dw_index(req, seqno, I915_PREEMPTIVE_ACTIVE_SEQNO);
+	else
+		emit_store_dw_index(req, seqno, I915_BATCH_ACTIVE_SEQNO);
+
+	intel_logical_ring_emit(ringbuf, MI_REPORT_HEAD);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+
+	req->ring->gpu_caches_dirty = true;
+}
+
+static void
+emit_relconsts_mode(struct i915_execbuffer_params *params)
+{
+	if (params->ctx->relative_constants_mode != params->instp_mode) {
+		struct intel_ringbuffer *ringbuf = params->request->ringbuf;
+
+		intel_logical_ring_emit(ringbuf, MI_NOOP);
+		intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+		intel_logical_ring_emit(ringbuf, INSTPM);
+		intel_logical_ring_emit(ringbuf, params->instp_mask << 16 | params->instp_mode);
+
+		params->ctx->relative_constants_mode = params->instp_mode;
+	}
+}
+
+/*
+ * Emit the commands that flag the end of execution of a batch.
+ *
+ * The GPU will:
+ * 1) log the seqno of the batch we've just completed.
+ * 2) in the case of a non-preemptive batch, clear the in-progress sequence
+ *    number; otherwise, issue a dummy register store to flush the above
+ *    write before the interrupt happens.
+ * 3) issue a USER INTERRUPT to notify the driver that the sequence number
+ *    has been updated.
+ */
+static void
+emit_postamble(struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf = req->ringbuf;
+	uint32_t seqno = i915_gem_request_get_seqno(req);
+
+	BUG_ON(!seqno);
+
+	if (req->scheduler_flags & i915_req_sf_preempt) {
+		emit_store_dw_index(req, seqno, I915_PREEMPTIVE_DONE_SEQNO);
+		emit_store_reg_index(req, NOPID, I915_GEM_HWS_SCRATCH_INDEX);
+		logical_ring_invalidate_all_caches(req);
+	} else {
+		emit_store_dw_index(req, seqno, I915_BATCH_DONE_SEQNO);
+		emit_store_dw_index(req, 0, I915_BATCH_ACTIVE_SEQNO);
+		logical_ring_flush_all_caches(req);
+	}
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
+}
+
+/*
  * This is the main function for adding a batch to the ring.
  * It is called from the scheduler, with the struct_mutex already held.
  */
@@ -1028,6 +1140,11 @@ int intel_execlists_submission_final(struct i915_execbuffer_params *params)
 	req->head = intel_ring_get_tail(ringbuf);
 
 	/*
+	 * Log the seqno of the batch we're starting
+	 */
+	emit_preamble(req);
+
+	/*
 	 * Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
@@ -1035,25 +1152,19 @@ int intel_execlists_submission_final(struct i915_execbuffer_params *params)
 	if (ret)
 		goto err;
 
-	if (ring == &dev_priv->ring[RCS] &&
-	    params->instp_mode != params->ctx->relative_constants_mode) {
-		intel_logical_ring_emit(ringbuf, MI_NOOP);
-		intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
-		intel_logical_ring_emit(ringbuf, INSTPM);
-		intel_logical_ring_emit(ringbuf, params->instp_mask << 16 | params->instp_mode);
-		intel_logical_ring_advance(ringbuf);
-
-		params->ctx->relative_constants_mode = params->instp_mode;
-	}
+	if (!(req->scheduler_flags & i915_req_sf_preempt)) {
+		if (ring == &dev_priv->ring[RCS])
+			emit_relconsts_mode(params);
 
-	exec_start = params->batch_obj_vm_offset +
-		     params->args_batch_start_offset;
+		exec_start = params->batch_obj_vm_offset +
+			     params->args_batch_start_offset;
 
-	ret = ring->emit_bb_start(req, exec_start, params->dispatch_flags);
-	if (ret)
-		goto err;
+		ret = ring->emit_bb_start(req, exec_start, params->dispatch_flags);
+		if (ret)
+			goto err;
 
-	trace_i915_gem_ring_dispatch(req, params->dispatch_flags);
+		trace_i915_gem_ring_dispatch(req, params->dispatch_flags);
+	}
 
 	i915_gem_execbuffer_retire_commands(params);
 
@@ -1914,38 +2025,22 @@ static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 static int gen8_emit_request(struct drm_i915_gem_request *request)
 {
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
-	struct intel_engine_cs *ring = ringbuf->ring;
-	u64 addr;
-	u32 cmd;
 	int ret;
 
+	emit_postamble(request);
+	intel_logical_ring_advance_and_submit(request);
+
 	/*
-	 * Reserve space for 2 NOOPs at the end of each request to be
-	 * used as a workaround for not being allowed to do lite
-	 * restore with HEAD==TAIL (WaIdleLiteRestore).
+	 * Add 4 NOOPs to the end of each request. These can
+	 * be used as a workaround for not being allowed to
+	 * do lite restore with HEAD==TAIL (WaIdleLiteRestore).
 	 */
-	ret = intel_logical_ring_begin(request, 8);
+	ret = intel_logical_ring_begin(request, 4);
 	if (ret)
 		return ret;
 
-	cmd = MI_STORE_DWORD_IMM_GEN4 | MI_GLOBAL_GTT;
-	intel_logical_ring_emit(ringbuf, cmd);
-
-	addr = I915_GEM_HWS_INDEX;
-	addr <<= MI_STORE_DWORD_INDEX_SHIFT;
-	addr += ring->status_page.gfx_addr;
-	intel_logical_ring_emit(ringbuf, lower_32_bits(addr));
-	intel_logical_ring_emit(ringbuf, upper_32_bits(addr));
-
-	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
-	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
-	intel_logical_ring_advance_and_submit(request);
-
-	/*
-	 * Here we add two extra NOOPs as padding to avoid
-	 * lite restore of a context with HEAD==TAIL.
-	 */
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
 	intel_logical_ring_advance(ringbuf);
-- 
1.9.1
