* [PATCH] drm/i915: Optimistically spin for the request completion
@ 2015-03-10 16:06 Chris Wilson
2015-03-10 16:14 ` Chris Wilson
2015-03-11 1:56 ` shuang.he
0 siblings, 2 replies; 6+ messages in thread
From: Chris Wilson @ 2015-03-10 16:06 UTC (permalink / raw)
To: intel-gfx
This provides a nice boost to mesa in swap bound scenarios (as mesa
throttles itself to the previous frame and given the scenario that will
complete shortly). It will also provide a good boost to systems running
with semaphores disabled and so frequently waiting on the GPU as it
switches rings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem.c | 43 +++++++++++++++++++++++++++++++++++------
1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index de15bd319bd0..2aaa3cab2be4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1191,6 +1191,28 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
}
+static int __i915_spin_request(struct drm_i915_gem_request *req)
+{
+ struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
+ struct drm_i915_private *dev_priv = to_i915(ring->dev);
+ int ret = -EBUSY;
+
+ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+ while (!need_resched()) {
+ if (i915_gem_request_completed(req, true)) {
+ ret = 0;
+ goto out;
+ }
+
+ cpu_relax_lowlatency();
+ }
+ if (i915_gem_request_completed(req, false))
+ ret = 0;
+out:
+ intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+ return ret;
+}
+
/**
* __i915_wait_request - wait until execution of request has finished
* @req: duh!
@@ -1235,12 +1257,20 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
if (ring->id == RCS && INTEL_INFO(dev)->gen >= 6)
gen6_rps_boost(dev_priv, file_priv);
- if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring)))
- return -ENODEV;
-
/* Record current time in case interrupted by signal, or wedged */
trace_i915_gem_request_wait_begin(req);
before = ktime_get_raw_ns();
+
+ /* Optimistic spin before touching IRQs */
+ ret = __i915_spin_request(req);
+ if (ret == 0)
+ goto out;
+
+ if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
+ ret = -ENODEV;
+ goto out;
+ }
+
for (;;) {
struct timer_list timer;
@@ -1289,14 +1319,15 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
destroy_timer_on_stack(&timer);
}
}
- now = ktime_get_raw_ns();
- trace_i915_gem_request_wait_end(req);
-
if (!irq_test_in_progress)
ring->irq_put(ring);
finish_wait(&ring->irq_queue, &wait);
+out:
+ now = ktime_get_raw_ns();
+ trace_i915_gem_request_wait_end(req);
+
if (timeout) {
s64 tres = *timeout - (now - before);
--
2.1.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/i915: Optimistically spin for the request completion
2015-03-10 16:06 [PATCH] drm/i915: Optimistically spin for the request completion Chris Wilson
@ 2015-03-10 16:14 ` Chris Wilson
2015-03-11 10:13 ` Daniel Vetter
2015-03-11 1:56 ` shuang.he
1 sibling, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2015-03-10 16:14 UTC (permalink / raw)
To: intel-gfx
On Tue, Mar 10, 2015 at 04:06:19PM +0000, Chris Wilson wrote:
> @@ -1235,12 +1257,20 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
> if (ring->id == RCS && INTEL_INFO(dev)->gen >= 6)
> gen6_rps_boost(dev_priv, file_priv);
>
> - if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring)))
> - return -ENODEV;
> -
> /* Record current time in case interrupted by signal, or wedged */
> trace_i915_gem_request_wait_begin(req);
> before = ktime_get_raw_ns();
> +
> + /* Optimistic spin before touching IRQs */
Perhaps iff timeout == NULL, or pass it along and add a
if (timeout && timeout_after_eq(jiffies, timeout))
break;
before the cpu_relax()?
> + ret = __i915_spin_request(req);
> + if (ret == 0)
> + goto out;
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/i915: Optimistically spin for the request completion
2015-03-10 16:06 [PATCH] drm/i915: Optimistically spin for the request completion Chris Wilson
2015-03-10 16:14 ` Chris Wilson
@ 2015-03-11 1:56 ` shuang.he
1 sibling, 0 replies; 6+ messages in thread
From: shuang.he @ 2015-03-11 1:56 UTC (permalink / raw)
To: shuang.he, ethan.gao, intel-gfx, chris
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5928
-------------------------------------Summary-------------------------------------
Platform Delta drm-intel-nightly Series Applied
PNV -1 282/282 281/282
ILK 308/308 308/308
SNB 307/307 307/307
IVB 375/375 375/375
BYT 294/294 294/294
HSW 385/385 385/385
BDW 315/315 315/315
-------------------------------------Detailed-------------------------------------
Platform Test drm-intel-nightly Series Applied
*PNV igt_gem_userptr_blits_minor-unsync-normal PASS(4) DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/i915: Optimistically spin for the request completion
2015-03-10 16:14 ` Chris Wilson
@ 2015-03-11 10:13 ` Daniel Vetter
2015-03-11 10:30 ` Chris Wilson
0 siblings, 1 reply; 6+ messages in thread
From: Daniel Vetter @ 2015-03-11 10:13 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On Tue, Mar 10, 2015 at 04:14:14PM +0000, Chris Wilson wrote:
> On Tue, Mar 10, 2015 at 04:06:19PM +0000, Chris Wilson wrote:
> > @@ -1235,12 +1257,20 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
> > if (ring->id == RCS && INTEL_INFO(dev)->gen >= 6)
> > gen6_rps_boost(dev_priv, file_priv);
> >
> > - if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring)))
> > - return -ENODEV;
> > -
> > /* Record current time in case interrupted by signal, or wedged */
> > trace_i915_gem_request_wait_begin(req);
> > before = ktime_get_raw_ns();
> > +
> > + /* Optimistic spin before touching IRQs */
> Perhaps iff timeout == NULL, or pass it along and add a
>
> if (timeout && timeout_after_eq(jiffies, timeout))
> break;
>
> before the cpu_relax()?
I guess the answer for that is asking how many apps use short
opportunistic waits in the frame rendering loop, e.g. to wait for query
results and fall back to the ones from the previous frame if they're not
available?
Also do you have microbenchmark numbers for something midly ridiculous
like a loop of very short batches (enough ofc to cause a bit of delay) and
immediately stalling for them? It's definitely an awesome idea given that
every other lock and sync primitive does it too.
-Daniel
>
> > + ret = __i915_spin_request(req);
> > + if (ret == 0)
> > + goto out;
>
> --
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/i915: Optimistically spin for the request completion
2015-03-11 10:13 ` Daniel Vetter
@ 2015-03-11 10:30 ` Chris Wilson
2015-03-11 15:24 ` Chris Wilson
0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2015-03-11 10:30 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Wed, Mar 11, 2015 at 11:13:59AM +0100, Daniel Vetter wrote:
> On Tue, Mar 10, 2015 at 04:14:14PM +0000, Chris Wilson wrote:
> > On Tue, Mar 10, 2015 at 04:06:19PM +0000, Chris Wilson wrote:
> > > @@ -1235,12 +1257,20 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
> > > if (ring->id == RCS && INTEL_INFO(dev)->gen >= 6)
> > > gen6_rps_boost(dev_priv, file_priv);
> > >
> > > - if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring)))
> > > - return -ENODEV;
> > > -
> > > /* Record current time in case interrupted by signal, or wedged */
> > > trace_i915_gem_request_wait_begin(req);
> > > before = ktime_get_raw_ns();
> > > +
> > > + /* Optimistic spin before touching IRQs */
> > Perhaps iff timeout == NULL, or pass it along and add a
> >
> > if (timeout && timeout_after_eq(jiffies, timeout))
> > break;
> >
> > before the cpu_relax()?
>
> I guess the answer for that is asking how many apps use short
> opportunistic waits in the frame rendering loop, e.g. to wait for query
> results and fall back to the ones from the previous frame if they're not
> available?
My presumption is that this would help most with occlusion query bound
applications (in terms of real world impact). Applications that have
readback in their critical path aren't really that exciting...
I was trying to see if cities skylines would benefit... But that seems
broken with multimonitor setups :|
> Also do you have microbenchmark numbers for something midly ridiculous
> like a loop of very short batches (enough ofc to cause a bit of delay) and
> immediately stalling for them? It's definitely an awesome idea given that
> every other lock and sync primitive does it too.
Urm, you are describing exactly how mesa behaves in swap benchmarks.
Admittedly there isn't much room for improvement after the throttle
adjustment patches land in mesa, but it is still there. It does increase
CPU load greatly in memory bound swap benchmarks, and I wonder how much
of the performance increase is from keeping the CPU from going to sleep
(i.e. preventing cpufreq from destroying the benchmark). I guess I have
an exciting morning of letting synmark run on one machine in various
configs.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] drm/i915: Optimistically spin for the request completion
2015-03-11 10:30 ` Chris Wilson
@ 2015-03-11 15:24 ` Chris Wilson
0 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2015-03-11 15:24 UTC (permalink / raw)
To: Daniel Vetter, intel-gfx
On Wed, Mar 11, 2015 at 10:30:13AM +0000, Chris Wilson wrote:
> On Wed, Mar 11, 2015 at 11:13:59AM +0100, Daniel Vetter wrote:
> > Also do you have microbenchmark numbers for something midly ridiculous
> > like a loop of very short batches (enough ofc to cause a bit of delay) and
> > immediately stalling for them? It's definitely an awesome idea given that
> > every other lock and sync primitive does it too.
>
> I guess I have
> an exciting morning of letting synmark run on one machine in various
> configs.
So looking at the basics on HSW:gt3e, comparing mesa + throttle
improvements and this patch, we see a consistent improvement throughout
the synmark suite, of anything between 4 - 16%. That is actually
reassuring as one of my main worries is that this patch will acerbate
fallout from thermal throttling. However, a quick look at a few games
suggests that any improvement here is well into the noise, if any.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-03-11 15:25 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-03-10 16:06 [PATCH] drm/i915: Optimistically spin for the request completion Chris Wilson
2015-03-10 16:14 ` Chris Wilson
2015-03-11 10:13 ` Daniel Vetter
2015-03-11 10:30 ` Chris Wilson
2015-03-11 15:24 ` Chris Wilson
2015-03-11 1:56 ` shuang.he
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox