[PATCH 1/2] drm/i915: Trim the retired request queue after submitting

* [PATCH 1/2] drm/i915: Trim the retired request queue after submitting
@ 2018-02-07  8:43 Chris Wilson
  2018-02-07  8:43 ` [PATCH 2/2] drm/i915: Skip request serialisation if the timeline is already complete Chris Wilson
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Chris Wilson @ 2018-02-07  8:43 UTC
  To: intel-gfx

If we submit a request and see that the previous request on this
timeline was already signaled, we first do not need to add the
dependency tracker for that completed request and secondly we know that
there is then a large backlog in retiring requests affecting this
timeline. Given that we just submitted more work to the HW, now would be
a good time to catch up on those retirements.

v2: Try to sum up the compromises involved in flushing the retirement
queue after submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
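
For readers less familiar with the retirement flow, here is a minimal
userspace sketch of the catch-up rule described above. It is a toy model,
not the i915 code: toy_request, toy_timeline, toy_retire_upto() and
toy_add_request() are hypothetical names, and it assumes (as a
simplification of timeline->last_request) that a retired request is no
longer visible as the timeline's previous request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REQUESTS 16

/* Hypothetical stand-in for a request; not the i915 structure. */
struct toy_request {
	unsigned int seqno;
	bool completed;	/* the hardware has finished this request */
	bool retired;	/* the CPU has released its bookkeeping */
};

/* Hypothetical timeline: requests are stored in submission order. */
struct toy_timeline {
	struct toy_request requests[TOY_MAX_REQUESTS];
	size_t count;		/* number of requests submitted so far */
	size_t first_unretired;	/* index of the oldest unretired request */
};

/* Retire everything up to and including index 'upto'; return how many. */
static size_t toy_retire_upto(struct toy_timeline *tl, size_t upto)
{
	size_t retired = 0;

	assert(upto < tl->count);
	while (tl->first_unretired <= upto) {
		tl->requests[tl->first_unretired++].retired = true;
		retired++;
	}
	return retired;
}

/*
 * Submit a new request. If the previous request is still tracked but has
 * already completed, retirement has fallen behind, so do a catch-up pass.
 * Returns the number of requests retired by that pass.
 */
static size_t toy_add_request(struct toy_timeline *tl)
{
	struct toy_request *prev = NULL;
	struct toy_request *rq;

	assert(tl->count < TOY_MAX_REQUESTS);

	/* Assumption: only an unretired request is still tracked as "prev". */
	if (tl->count > 0 && !tl->requests[tl->count - 1].retired)
		prev = &tl->requests[tl->count - 1];

	rq = &tl->requests[tl->count++];
	rq->seqno = (unsigned int)tl->count;
	rq->completed = false;
	rq->retired = false;

	if (prev && prev->completed)
		return toy_retire_upto(tl, (size_t)(prev - tl->requests));
	return 0;
}
```

In this model the catch-up pass only walks the array and flips flags,
which loosely mirrors the argument in the code comment below that
retirement is now mostly list management.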
 drivers/gpu/drm/i915/i915_gem_request.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 794263421aa0..384cb49ae4cc 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -1075,6 +1075,26 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	local_bh_disable();
 	i915_sw_fence_commit(&request->submit);
 	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
+
+	/*
+	 * In typical scenarios, we do not expect the previous request on
+	 * the timeline to be still tracked by timeline->last_request if it
+	 * has been completed. If the completed request is still here, that
+	 * implies that request retirement is a long way behind submission,
+	 * suggesting that we haven't been retiring frequently enough from
+	 * the combination of retire-before-alloc, waiters and the background
+	 * retirement worker. So if the last request on this timeline was
+	 * already completed, do a catch up pass, flushing the retirement queue
+	 * up to this client. Since we have now moved the heaviest operations
+	 * during retirement onto secondary workers, such as freeing objects
+	 * or contexts, retiring a bunch of requests is mostly list management
+	 * (and cache misses), and so we should not be overly penalizing this
+	 * client by performing excess work, though we may still be performing
+	 * work on behalf of others -- but instead we should benefit from
+	 * improved resource management. (Well, that's the theory at least.)
+	 */
+	if (prev && i915_gem_request_completed(prev))
+		i915_gem_request_retire_upto(prev);
 }
 
 static unsigned long local_clock_us(unsigned int *cpu)
-- 
2.16.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 2/2] drm/i915: Skip request serialisation if the timeline is already complete
  2018-02-07  8:43 [PATCH 1/2] drm/i915: Trim the retired request queue after submitting Chris Wilson
@ 2018-02-07  8:43 ` Chris Wilson
  2018-02-07 10:45   ` Tvrtko Ursulin
  2018-02-07  9:21 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Trim the retired request queue after submitting Patchwork
  2018-02-07 10:45 ` [PATCH 1/2] " Tvrtko Ursulin
  2 siblings, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2018-02-07  8:43 UTC
  To: intel-gfx

If the last request on the timeline is already complete, we do not need
to emit the serialisation barriers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
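
As an illustration of the condition changed in the hunk at line 1039
below, here is a minimal userspace sketch. It is a toy model under stated
assumptions, not the driver code: sketch_request and sketch_order_after()
are hypothetical names, and a single "awaits" pointer stands in for the
submit fence and scheduler dependency that the real code sets up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not the i915 structure. */
struct sketch_request {
	bool completed;				/* already finished executing */
	const struct sketch_request *awaits;	/* NULL: nothing to wait for */
};

/*
 * Order 'rq' after 'prev' only when that ordering can still matter.
 * A completed predecessor cannot delay us, so no dependency is recorded.
 * Returns true if a dependency was added.
 */
static bool sketch_order_after(struct sketch_request *rq,
			       const struct sketch_request *prev)
{
	rq->awaits = NULL;

	if (prev && !prev->completed) {
		rq->awaits = prev;
		return true;
	}
	return false;
}
```

The three cases are: no previous request, a previous request that is
still in flight (dependency added), and a previous request that has
already completed (dependency skipped).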
 drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 384cb49ae4cc..8a35b5591e0e 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -995,7 +995,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	lockdep_assert_held(&request->i915->drm.struct_mutex);
 	trace_i915_gem_request_add(request);
 
-	/* Make sure that no request gazumped us - if it was allocated after
+	/*
+	 * Make sure that no request gazumped us - if it was allocated after
 	 * our i915_gem_request_alloc() and called __i915_add_request() before
 	 * us, the timeline will hold its seqno which is later than ours.
 	 */
@@ -1022,7 +1023,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 		WARN(err, "engine->emit_flush() failed: %d!\n", err);
 	}
 
-	/* Record the position of the start of the breadcrumb so that
+	/*
+	 * Record the position of the start of the breadcrumb so that
 	 * should we detect the updated seqno part-way through the
 	 * GPU processing the request, we never over-estimate the
 	 * position of the ring's HEAD.
@@ -1031,7 +1033,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	GEM_BUG_ON(IS_ERR(cs));
 	request->postfix = intel_ring_offset(request, cs);
 
-	/* Seal the request and mark it as pending execution. Note that
+	/*
+	 * Seal the request and mark it as pending execution. Note that
 	 * we may inspect this state, without holding any locks, during
 	 * hangcheck. Hence we apply the barrier to ensure that we do not
 	 * see a more recent value in the hws than we are tracking.
@@ -1039,7 +1042,7 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 
 	prev = i915_gem_active_raw(&timeline->last_request,
 				   &request->i915->drm.struct_mutex);
-	if (prev) {
+	if (prev && !i915_gem_request_completed(prev)) {
 		i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
 					     &request->submitq);
 		if (engine->schedule)
@@ -1059,7 +1062,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	list_add_tail(&request->ring_link, &ring->request_list);
 	request->emitted_jiffies = jiffies;
 
-	/* Let the backend know a new request has arrived that may need
+	/*
+	 * Let the backend know a new request has arrived that may need
 	 * to adjust the existing execution schedule due to a high priority
 	 * request - i.e. we may want to preempt the current request in order
 	 * to run a high priority dependency chain *before* we can execute this
-- 
2.16.1


* Re: [PATCH 2/2] drm/i915: Skip request serialisation if the timeline is already complete
  2018-02-07  8:43 ` [PATCH 2/2] drm/i915: Skip request serialisation if the timeline is already complete Chris Wilson
@ 2018-02-07 10:45   ` Tvrtko Ursulin
  0 siblings, 0 replies; 5+ messages in thread
From: Tvrtko Ursulin @ 2018-02-07 10:45 UTC
  To: Chris Wilson, intel-gfx


On 07/02/2018 08:43, Chris Wilson wrote:
> If the last request on the timeline is already complete, we do not need
> to emit the serialisation barriers.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++-----
>   1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 384cb49ae4cc..8a35b5591e0e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -995,7 +995,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>   	lockdep_assert_held(&request->i915->drm.struct_mutex);
>   	trace_i915_gem_request_add(request);
>   
> -	/* Make sure that no request gazumped us - if it was allocated after
> +	/*
> +	 * Make sure that no request gazumped us - if it was allocated after
>   	 * our i915_gem_request_alloc() and called __i915_add_request() before
>   	 * us, the timeline will hold its seqno which is later than ours.
>   	 */
> @@ -1022,7 +1023,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>   		WARN(err, "engine->emit_flush() failed: %d!\n", err);
>   	}
>   
> -	/* Record the position of the start of the breadcrumb so that
> +	/*
> +	 * Record the position of the start of the breadcrumb so that
>   	 * should we detect the updated seqno part-way through the
>   	 * GPU processing the request, we never over-estimate the
>   	 * position of the ring's HEAD.
> @@ -1031,7 +1033,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>   	GEM_BUG_ON(IS_ERR(cs));
>   	request->postfix = intel_ring_offset(request, cs);
>   
> -	/* Seal the request and mark it as pending execution. Note that
> +	/*
> +	 * Seal the request and mark it as pending execution. Note that
>   	 * we may inspect this state, without holding any locks, during
>   	 * hangcheck. Hence we apply the barrier to ensure that we do not
>   	 * see a more recent value in the hws than we are tracking.
> @@ -1039,7 +1042,7 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>   
>   	prev = i915_gem_active_raw(&timeline->last_request,
>   				   &request->i915->drm.struct_mutex);
> -	if (prev) {
> +	if (prev && !i915_gem_request_completed(prev)) {
>   		i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
>   					     &request->submitq);
>   		if (engine->schedule)
> @@ -1059,7 +1062,8 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>   	list_add_tail(&request->ring_link, &ring->request_list);
>   	request->emitted_jiffies = jiffies;
>   
> -	/* Let the backend know a new request has arrived that may need
> +	/*
> +	 * Let the backend know a new request has arrived that may need
>   	 * to adjust the existing execution schedule due to a high priority
>   	 * request - i.e. we may want to preempt the current request in order
>   	 * to run a high priority dependency chain *before* we can execute this
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Trim the retired request queue after submitting
  2018-02-07  8:43 [PATCH 1/2] drm/i915: Trim the retired request queue after submitting Chris Wilson
  2018-02-07  8:43 ` [PATCH 2/2] drm/i915: Skip request serialisation if the timeline is already complete Chris Wilson
@ 2018-02-07  9:21 ` Patchwork
  2018-02-07 10:45 ` [PATCH 1/2] " Tvrtko Ursulin
  2 siblings, 0 replies; 5+ messages in thread
From: Patchwork @ 2018-02-07  9:21 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Trim the retired request queue after submitting
URL   : https://patchwork.freedesktop.org/series/37796/
State : failure

== Summary ==

Series 37796v1 series starting with [1/2] drm/i915: Trim the retired request queue after submitting
https://patchwork.freedesktop.org/api/1.0/series/37796/revisions/1/mbox/

Test gem_sync:
        Subgroup basic-all:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-each:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-many-each:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-store-all:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-store-each:
                skip       -> PASS       (fi-pnv-d510)
Test gem_tiled_blits:
        Subgroup basic:
                skip       -> PASS       (fi-pnv-d510)
Test gem_tiled_fence_blits:
        Subgroup basic:
                skip       -> PASS       (fi-pnv-d510)
Test gem_wait:
        Subgroup basic-busy-all:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-wait-all:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-await-all:
                skip       -> PASS       (fi-pnv-d510)
Test kms_busy:
        Subgroup basic-flip-a:
                skip       -> PASS       (fi-pnv-d510)
        Subgroup basic-flip-b:
                skip       -> PASS       (fi-pnv-d510)
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-legacy:
                skip       -> PASS       (fi-pnv-d510)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                pass       -> FAIL       (fi-skl-guc)

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:427s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:428s
fi-blb-e6850     total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:372s
fi-bsw-n3050     total:288  pass:242  dwarn:0   dfail:0   fail:0   skip:46  time:490s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:285s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:482s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:488s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:472s
fi-byt-n2820     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:463s
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:577s
fi-cnl-y3        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:585s
fi-elk-e7500     total:288  pass:229  dwarn:0   dfail:0   fail:0   skip:59  time:421s
fi-gdg-551       total:288  pass:179  dwarn:0   dfail:0   fail:1   skip:108 time:278s
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:514s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:390s
fi-ilk-650       total:288  pass:228  dwarn:0   dfail:0   fail:0   skip:60  time:412s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:456s
fi-ivb-3770      total:288  pass:255  dwarn:0   dfail:0   fail:0   skip:33  time:418s
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:458s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:498s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:452s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:503s
fi-pnv-d510      total:288  pass:222  dwarn:1   dfail:0   fail:0   skip:65  time:606s
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:431s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:509s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:526s
fi-skl-6700k2    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:485s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:488s
fi-skl-guc       total:288  pass:259  dwarn:0   dfail:0   fail:1   skip:28  time:414s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:429s
fi-snb-2520m     total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:528s
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:403s
Blacklisted hosts:
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:470s

e5f22cbeec1da222b22367ee3ac165188fb2a36d drm-tip: 2018y-02m-07d-08h-09m-07s UTC integration manifest
e5372cb6d2b1 drm/i915: Skip request serialisation if the timeline is already complete
a7e2e6dd82d4 drm/i915: Trim the retired request queue after submitting

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7916/issues.html

* Re: [PATCH 1/2] drm/i915: Trim the retired request queue after submitting
  2018-02-07  8:43 [PATCH 1/2] drm/i915: Trim the retired request queue after submitting Chris Wilson
  2018-02-07  8:43 ` [PATCH 2/2] drm/i915: Skip request serialisation if the timeline is already complete Chris Wilson
  2018-02-07  9:21 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Trim the retired request queue after submitting Patchwork
@ 2018-02-07 10:45 ` Tvrtko Ursulin
  2 siblings, 0 replies; 5+ messages in thread
From: Tvrtko Ursulin @ 2018-02-07 10:45 UTC
  To: Chris Wilson, intel-gfx


On 07/02/2018 08:43, Chris Wilson wrote:
> If we submit a request and see that the previous request on this
> timeline was already signaled, we first do not need to add the
> dependency tracker for that completed request and secondly we know that
> there is then a large backlog in retiring requests affecting this
> timeline. Given that we just submitted more work to the HW, now would be
> a good time to catch up on those retirements.
> 
> v2: Try to sum up the compromises involved in flushing the retirement
> queue after submission.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_request.c | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 794263421aa0..384cb49ae4cc 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -1075,6 +1075,26 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
>   	local_bh_disable();
>   	i915_sw_fence_commit(&request->submit);
>   	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
> +
> +	/*
> +	 * In typical scenarios, we do not expect the previous request on
> +	 * the timeline to be still tracked by timeline->last_request if it
> +	 * has been completed. If the completed request is still here, that
> +	 * implies that request retirement is a long way behind submission,
> +	 * suggesting that we haven't been retiring frequently enough from
> +	 * the combination of retire-before-alloc, waiters and the background
> +	 * retirement worker. So if the last request on this timeline was
> +	 * already completed, do a catch up pass, flushing the retirement queue
> +	 * up to this client. Since we have now moved the heaviest operations
> +	 * during retirement onto secondary workers, such as freeing objects
> +	 * or contexts, retiring a bunch of requests is mostly list management
> +	 * (and cache misses), and so we should not be overly penalizing this
> +	 * client by performing excess work, though we may still be performing
> +	 * work on behalf of others -- but instead we should benefit from
> +	 * improved resource management. (Well, that's the theory at least.)
> +	 */
> +	if (prev && i915_gem_request_completed(prev))
> +		i915_gem_request_retire_upto(prev);
>   }
>   
>   static unsigned long local_clock_us(unsigned int *cpu)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko