public inbox for intel-gfx@lists.freedesktop.org
From: Daniel Vetter <daniel@ffwll.ch>
To: Thomas Daniel <thomas.daniel@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 28/43] drm/i915/bdw: Implement context switching (somewhat)
Date: Mon, 11 Aug 2014 23:29:11 +0200
Message-ID: <20140811212910.GF10500@phenom.ffwll.local>
In-Reply-To: <1406217891-8912-29-git-send-email-thomas.daniel@intel.com>

On Thu, Jul 24, 2014 at 05:04:36PM +0100, Thomas Daniel wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> A context switch occurs by submitting a context descriptor to the
> ExecList Submission Port. Given that we can now initialize a context,
> it's possible to begin implementing the context switch by creating the
> descriptor and submitting it to ELSP (actually two, since the ELSP
> has two ports).
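
For readers following along, here is a rough userspace sketch of the two-port write sequence, mirroring the order used by execlists_elsp_write() further down in this patch. The fake_elsp type and its helpers are made up for illustration; they only record the 32-bit writes and do not model forcewake or the posting read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the write-only ELSP register: it just records
 * each 32-bit write so that the ordering can be inspected. */
struct fake_elsp {
	uint32_t writes[4];
	size_t count;
};

static void fake_elsp_write(struct fake_elsp *elsp, uint32_t val)
{
	assert(elsp->count < 4);
	elsp->writes[elsp->count++] = val;
}

/* Submit two 64-bit descriptors: element 1 first (upper dword, then lower),
 * then element 0 (upper, then lower). desc1 may be 0 for an empty port. */
static void fake_elsp_submit(struct fake_elsp *elsp,
			     uint64_t desc0, uint64_t desc1)
{
	elsp->count = 0;
	fake_elsp_write(elsp, (uint32_t)(desc1 >> 32));
	fake_elsp_write(elsp, (uint32_t)desc1);
	fake_elsp_write(elsp, (uint32_t)(desc0 >> 32));
	/* In the patch, this last write is the one that loads the context. */
	fake_elsp_write(elsp, (uint32_t)desc0);
}
```

Submitting a single context with the second port empty still produces four writes, the first two being zero.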
> 
> The context object must be mapped in the GGTT, which means it must exist
> in the 0-4GB graphics VA range.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> v2: This code has changed quite a lot in various rebases. Of particular
> importance is that now we use the globally unique Submission ID to send
> to the hardware. Also, context pages are now pinned unconditionally to
> GGTT, so there is no need to bind them.
> 
> v3: Use LRCA[31:12] as hwCtxId[19:0]. This guarantees that the HW context
> ID we submit to the ELSP is globally unique and != 0 (Bspec requirements
> of the software use-only bits of the Context ID in the Context Descriptor
> Format) without the hassle of the previous submission Id construction.
> Also, re-add the ELSP posting read (it was dropped somewhere during the
> rebases).
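
The LRCA-to-ID packing described in v3 can be tried out in isolation. The sketch below is standalone userspace C, not driver code: the bit positions are copied from the defines in this patch, while the CTX_* names and the two helper functions are invented here. It does not check that the resulting ID is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the patch; the names are local to this sketch. */
#define CTX_VALID		(1ULL << 0)
#define CTX_MODE_SHIFT		3
#define CTX_LEGACY_MODE		1ULL	/* LEGACY_CONTEXT in the patch's enum */
#define CTX_L3LLC_COHERENT	(1ULL << 5)
#define CTX_PRIVILEGE		(1ULL << 8)
#define CTX_ID_SHIFT		32

/* LRCA[31:12] becomes the 20-bit context ID. */
static uint32_t ctx_id_from_lrca(uint32_t lrca)
{
	return lrca >> 12;
}

static uint64_t ctx_descriptor(uint64_t lrca)
{
	/* Same mask as the BUG_ON in the patch: below 4GB and 4K aligned. */
	assert((lrca & 0xFFFFFFFF00000FFFULL) == 0);

	return CTX_VALID |
	       (CTX_LEGACY_MODE << CTX_MODE_SHIFT) |
	       CTX_L3LLC_COHERENT |
	       CTX_PRIVILEGE |
	       lrca |
	       ((uint64_t)ctx_id_from_lrca((uint32_t)lrca) << CTX_ID_SHIFT);
}
```

With an LRCA of 0x00ABC000, ctx_descriptor() returns 0x00000ABC00ABC129: the ID 0xABC in the upper dword, and the LRCA plus the flag bits 0x129 in the lower dword.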
> 
> v4:
> - Squash with "drm/i915/bdw: Add forcewake lock around ELSP writes" (BSPEC
>   says: "SW must set Force Wakeup bit to prevent GT from entering C6 while
>   ELSP writes are in progress") as noted by Thomas Daniel
>   (thomas.daniel@intel.com).
> - Rename functions and use an execlists/intel_execlists_ namespace.
> - The BUG_ON only checked that the LRCA was <32 bits, but it didn't make
>   sure that it was properly aligned. Spotted by Alistair Mcaulay
>   <alistair.mcaulay@intel.com>.
> 
> v5:
> - Improved source code comments as suggested by Chris Wilson.
> - No need to abstract submit_ctx away, as pointed out by Brad Volkin.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c |  116 +++++++++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_lrc.h |    1 +
>  2 files changed, 115 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 4549eec..535ef98 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -47,6 +47,7 @@
>  #define GEN8_LR_CONTEXT_ALIGN 4096
>  
>  #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
> +#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
>  #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
>  
>  #define CTX_LRI_HEADER_0		0x01
> @@ -78,6 +79,26 @@
>  #define CTX_R_PWR_CLK_STATE		0x42
>  #define CTX_GPGPU_CSR_BASE_ADDRESS	0x44
>  
> +#define GEN8_CTX_VALID (1<<0)
> +#define GEN8_CTX_FORCE_PD_RESTORE (1<<1)
> +#define GEN8_CTX_FORCE_RESTORE (1<<2)
> +#define GEN8_CTX_L3LLC_COHERENT (1<<5)
> +#define GEN8_CTX_PRIVILEGE (1<<8)
> +enum {
> +	ADVANCED_CONTEXT=0,
> +	LEGACY_CONTEXT,
> +	ADVANCED_AD_CONTEXT,
> +	LEGACY_64B_CONTEXT
> +};
> +#define GEN8_CTX_MODE_SHIFT 3
> +enum {
> +	FAULT_AND_HANG=0,
> +	FAULT_AND_HALT, /* Debug only */
> +	FAULT_AND_STREAM,
> +	FAULT_AND_CONTINUE /* Unsupported */
> +};
> +#define GEN8_CTX_ID_SHIFT 32
> +
>  int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists)
>  {
>  	if (enable_execlists == 0)
> @@ -90,6 +111,93 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
>  	return 0;
>  }
>  
> +u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
> +{
> +	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> +
> +	/* LRCA is required to be 4K aligned so the more significant 20 bits
> +	 * are globally unique */
> +	return lrca >> 12;
> +}
> +
> +static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_object *ctx_obj)
> +{
> +	uint64_t desc;
> +	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> +	BUG_ON(lrca & 0xFFFFFFFF00000FFFULL);
> +
> +	desc = GEN8_CTX_VALID;
> +	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
> +	desc |= GEN8_CTX_L3LLC_COHERENT;
> +	desc |= GEN8_CTX_PRIVILEGE;
> +	desc |= lrca;
> +	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;
> +
> +	/* TODO: WaDisableLiteRestore when we start using semaphore
> +	 * signalling between Command Streamers */
> +	/* desc |= GEN8_CTX_FORCE_RESTORE; */
> +
> +	return desc;
> +}
> +
> +static void execlists_elsp_write(struct intel_engine_cs *ring,
> +				 struct drm_i915_gem_object *ctx_obj0,
> +				 struct drm_i915_gem_object *ctx_obj1)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	uint64_t temp = 0;
> +	uint32_t desc[4];
> +
> +	/* XXX: You must always write both descriptors in the order below. */
> +	if (ctx_obj1)
> +		temp = execlists_ctx_descriptor(ctx_obj1);
> +	else
> +		temp = 0;
> +	desc[1] = (u32)(temp >> 32);
> +	desc[0] = (u32)temp;
> +
> +	temp = execlists_ctx_descriptor(ctx_obj0);
> +	desc[3] = (u32)(temp >> 32);
> +	desc[2] = (u32)temp;
> +
> +	/* Set Force Wakeup bit to prevent GT from entering C6 while
> +	 * ELSP writes are in progress */
> +	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> +
> +	I915_WRITE(RING_ELSP(ring), desc[1]);
> +	I915_WRITE(RING_ELSP(ring), desc[0]);
> +	I915_WRITE(RING_ELSP(ring), desc[3]);
> +	/* The context is automatically loaded after the following */
> +	I915_WRITE(RING_ELSP(ring), desc[2]);
> +
> +	/* ELSP is a wo register, so use another nearby reg for posting instead */
> +	POSTING_READ(RING_EXECLIST_STATUS(ring));
> +
> +	gen6_gt_force_wake_put(dev_priv, FORCEWAKE_ALL);
> +}
> +
> +static int execlists_submit_context(struct intel_engine_cs *ring,
> +				    struct intel_context *to0, u32 tail0,
> +				    struct intel_context *to1, u32 tail1)
> +{
> +	struct drm_i915_gem_object *ctx_obj0;
> +	struct drm_i915_gem_object *ctx_obj1 = NULL;
> +
> +	ctx_obj0 = to0->engine[ring->id].state;
> +	BUG_ON(!ctx_obj0);
> +	BUG_ON(!i915_gem_obj_is_pinned(ctx_obj0));
> +
> +	if (to1) {
> +		ctx_obj1 = to1->engine[ring->id].state;
> +		BUG_ON(!ctx_obj1);
> +		BUG_ON(!i915_gem_obj_is_pinned(ctx_obj1));
> +	}
> +
> +	execlists_elsp_write(ring, ctx_obj0, ctx_obj1);
> +
> +	return 0;
> +}
> +
>  static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf)
>  {
>  	struct intel_engine_cs *ring = ringbuf->ring;
> @@ -270,12 +378,16 @@ int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf)
>  
>  void intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf)
>  {
> +	struct intel_engine_cs *ring = ringbuf->ring;
> +	struct intel_context *ctx = ringbuf->ctx;
> +
>  	intel_logical_ring_advance(ringbuf);
>  
> -	if (intel_ring_stopped(ringbuf->ring))
> +	if (intel_ring_stopped(ring))
>  		return;
>  
> -	/* TODO: how to submit a context to the ELSP is not here yet */
> +	/* FIXME: too cheeky, we don't even check if the ELSP is ready */
> +	execlists_submit_context(ring, ctx, ringbuf->tail, NULL, 0);

So this is the 2nd user of ringbuf->ctx I've spotted (well gcc did) and
imo it shouldn't be here. We should have one ELSP submit for each batch,
not one for each ring_advance. Heck even the ring_advance should probably
be done just once.
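
To make the shape concrete, a very rough model of what I mean, assuming nothing about the final code. All names here (model_ring and friends) are made up for illustration; the point is only that advancing moves a software tail and the submit happens once, after the whole batch has been emitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a ringbuffer; only the fields needed here. */
struct model_ring {
	uint32_t tail;		/* software tail, in bytes */
	uint32_t size;		/* ring size in bytes, a power of two */
	unsigned int submits;	/* how many times the ELSP would be written */
};

/* Advancing only moves the software tail; it never touches the hardware. */
static void model_ring_advance(struct model_ring *ring, uint32_t bytes)
{
	assert(ring->size && (ring->size & (ring->size - 1)) == 0);
	ring->tail = (ring->tail + bytes) & (ring->size - 1);
}

/* Called once per batch, after all commands have been emitted. */
static void model_ring_submit(struct model_ring *ring)
{
	ring->submits++;
}

/* Emit a batch made of several command packets, then submit once. */
static void model_emit_batch(struct model_ring *ring,
			     const uint32_t *packet_bytes, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		model_ring_advance(ring, packet_bytes[i]);

	model_ring_submit(ring);
}
```

Emitting three packets of 16, 8 and 32 bytes into an empty 4096-byte ring leaves the tail at 56 with a single submit. Folding the per-packet advances into one advance per batch would be the next step.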

This is one of the reasons why I wanted to have a parallel execlist
function hierarchy, so that we could implement such stuff correctly
without jumping through hoops (or doing the lazy update trick we do for
legacy rings).

Punt on this for now.
-Daniel

>  }
>  
>  static int logical_ring_alloc_seqno(struct intel_engine_cs *ring,
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index f20c3d2..b59965b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -58,5 +58,6 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
>  			       struct list_head *vmas,
>  			       struct drm_i915_gem_object *batch_obj,
>  			       u64 exec_start, u32 flags);
> +u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
>  
>  #endif /* _INTEL_LRC_H_ */
> -- 
> 1.7.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
