* [PATCH 0/2] semaphorify the pageflip BO (if possible)
@ 2012-03-22 0:19 Ben Widawsky
2012-03-22 0:19 ` [PATCH 1/2] drm/i915: extract ring sync code Ben Widawsky
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Ben Widawsky @ 2012-03-22 0:19 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
I've not observed any FPS changes with my limited testing.
Here is the performance data I collected with nexuiz, measuring the
latency of i915_gem_object_pin_to_display_plane. Top is before, bottom
is after.
N Min Max Median Avg Stddev
x 2246 0.352 37.538 2.791 3.3091901 1.8631844
+ 2239 0.281 44.517 2.699 3.088992 1.583838
Difference at 95.0% confidence
-0.220198 +/- 0.101227
-6.65414% +/- 3.05896%
(Student's t, pooled s = 1.72938)
Ben Widawsky (2):
drm/i915: extract ring sync code
drm/i915: use semaphores for the display plane
drivers/gpu/drm/i915/i915_drv.h | 19 +++++++++
drivers/gpu/drm/i915/i915_gem.c | 51 ++++++++++++++++++++---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 +---------------------------
3 files changed, 66 insertions(+), 64 deletions(-)
--
1.7.9.4
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH 1/2] drm/i915: extract ring sync code 2012-03-22 0:19 [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky @ 2012-03-22 0:19 ` Ben Widawsky 2012-03-22 9:50 ` Chris Wilson 2012-03-22 0:19 ` [PATCH 2/2] drm/i915: use semaphores for the display plane Ben Widawsky 2012-03-29 0:23 ` [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky 2 siblings, 1 reply; 12+ messages in thread From: Ben Widawsky @ 2012-03-22 0:19 UTC (permalink / raw) To: intel-gfx; +Cc: Ben Widawsky We want to use this function elsewhere... Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_drv.h | 19 +++++++++ drivers/gpu/drm/i915/i915_gem.c | 43 ++++++++++++++++++++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 +--------------------------- 3 files changed, 63 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b6098b0..ee691ce 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -38,6 +38,7 @@ #include <linux/i2c-algo-bit.h> #include <drm/intel-gtt.h> #include <linux/backlight.h> +#include <linux/intel-iommu.h> /* General customization: */ @@ -1188,6 +1189,8 @@ void i915_gem_lastclose(struct drm_device *dev); int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int __must_check i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj); +int i915_gem_object_sync(struct drm_i915_gem_object *obj, + struct intel_ring_buffer *to); void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, struct intel_ring_buffer *ring, u32 seqno); @@ -1282,6 +1285,22 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_device *dev, int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level); +static inline bool +intel_enable_semaphores(struct drm_device *dev) +{ + if (INTEL_INFO(dev)->gen < 6) + return 0; + + if (i915_semaphores >= 0) + return i915_semaphores; + + /* Enable semaphores on SNB when IO remapping is off */ + if (INTEL_INFO(dev)->gen == 6) + return !intel_iommu_enabled; + + return 1; +} + /* i915_gem_gtt.c */ int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev); void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 863e14a..ce2fee5 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2026,6 +2026,49 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj) return 0; } +int +i915_gem_object_sync(struct drm_i915_gem_object *obj, + struct intel_ring_buffer *to) +{ + struct intel_ring_buffer *from = obj->ring; + u32 seqno; + int ret, idx; + + if (from == NULL || to == from) + return 0; + + /* XXX gpu semaphores are implicated in various hard hangs on SNB */ + if (!intel_enable_semaphores(obj->base.dev)) + return i915_gem_object_wait_rendering(obj); + + idx = intel_ring_sync_index(from, to); + + seqno = obj->last_rendering_seqno; + if (seqno <= from->sync_seqno[idx]) + return 0; + + if (seqno == from->outstanding_lazy_request) { + struct drm_i915_gem_request *request; + + request = kzalloc(sizeof(*request), GFP_KERNEL); + if (request == NULL) + return -ENOMEM; + + ret = i915_add_request(from, NULL, request); + if (ret) { + kfree(request); + return ret; + } + + seqno = request->seqno; + } + + from->sync_seqno[idx] = seqno; + + return to->sync_to(to, from, seqno - 1); + +} + static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) { u32 old_write_domain, old_read_domains; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 0e051ec..e150f4b 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -810,64 +810,6 @@ i915_gem_execbuffer_flush(struct drm_device *dev, return 0; } -static bool -intel_enable_semaphores(struct drm_device *dev) -{ - if (INTEL_INFO(dev)->gen < 6) - return 0; - - if (i915_semaphores >= 0) - return i915_semaphores; - - /* Disable semaphores on SNB */ - if (INTEL_INFO(dev)->gen == 6) - return 0; - - return 1; -} - -static int -i915_gem_execbuffer_sync_rings(struct drm_i915_gem_object *obj, - struct intel_ring_buffer *to) -{ - struct intel_ring_buffer *from = obj->ring; - u32 seqno; - int ret, idx; - - if (from == NULL || to == from) - return 0; - - /* XXX gpu semaphores are implicated in various hard hangs on SNB */ - if (!intel_enable_semaphores(obj->base.dev)) - return i915_gem_object_wait_rendering(obj); - - idx = intel_ring_sync_index(from, to); - - seqno = obj->last_rendering_seqno; - if (seqno <= from->sync_seqno[idx]) - return 0; - - if (seqno == from->outstanding_lazy_request) { - struct drm_i915_gem_request *request; - - request = kzalloc(sizeof(*request), GFP_KERNEL); - if (request == NULL) - return -ENOMEM; - - ret = i915_add_request(from, NULL, request); - if (ret) { - kfree(request); - return ret; - } - - seqno = request->seqno; - } - - from->sync_seqno[idx] = seqno; - - return to->sync_to(to, from, seqno - 1); -} - static int i915_gem_execbuffer_wait_for_flips(struct intel_ring_buffer *ring, u32 flips) { @@ -929,7 +871,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring, } list_for_each_entry(obj, objects, exec_list) { - ret = i915_gem_execbuffer_sync_rings(obj, ring); + ret = i915_gem_object_sync(obj, ring); if (ret) return ret; } -- 1.7.9.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915: extract ring sync code 2012-03-22 0:19 ` [PATCH 1/2] drm/i915: extract ring sync code Ben Widawsky @ 2012-03-22 9:50 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2012-03-22 9:50 UTC (permalink / raw) To: intel-gfx; +Cc: Ben Widawsky On Wed, 21 Mar 2012 17:19:12 -0700, Ben Widawsky <ben@bwidawsk.net> wrote: > We want to use this function elsewhere... > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_drv.h | 19 +++++++++ > drivers/gpu/drm/i915/i915_gem.c | 43 ++++++++++++++++++++ > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 +--------------------------- > 3 files changed, 63 insertions(+), 59 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index b6098b0..ee691ce 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -38,6 +38,7 @@ > #include <linux/i2c-algo-bit.h> > #include <drm/intel-gtt.h> > #include <linux/backlight.h> > +#include <linux/intel-iommu.h> > > /* General customization: > */ > @@ -1188,6 +1189,8 @@ void i915_gem_lastclose(struct drm_device *dev); > > int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); > int __must_check i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj); > +int i915_gem_object_sync(struct drm_i915_gem_object *obj, > + struct intel_ring_buffer *to); > void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, > struct intel_ring_buffer *ring, > u32 seqno); > @@ -1282,6 +1285,22 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_device *dev, > int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > > +static inline bool > +intel_enable_semaphores(struct drm_device *dev) > +{ > + if (INTEL_INFO(dev)->gen < 6) > + return 0; > + > + if (i915_semaphores >= 0) > + return i915_semaphores; > + > + /* Enable semaphores on SNB when IO remapping is off */ > + if (INTEL_INFO(dev)->gen == 6) > + return !intel_iommu_enabled; > + > + return 1; > +} This doesn't need to go in a header as it is still only used in one location, unless you are planning more? > + > /* i915_gem_gtt.c */ > int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev); > void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev); > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 863e14a..ce2fee5 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2026,6 +2026,49 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj) > return 0; > } > > +int > +i915_gem_object_sync(struct drm_i915_gem_object *obj, > + struct intel_ring_buffer *to) > +{ > + struct intel_ring_buffer *from = obj->ring; > + u32 seqno; > + int ret, idx; > + > + if (from == NULL || to == from) > + return 0; > + > + /* XXX gpu semaphores are implicated in various hard hangs on SNB */ This comment is antiquated as we now know the exact hw snafu that trigger those hangs. And since you already have explained this comment in intel_enable_semaphores() we can kill it here. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-22 0:19 [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky 2012-03-22 0:19 ` [PATCH 1/2] drm/i915: extract ring sync code Ben Widawsky @ 2012-03-22 0:19 ` Ben Widawsky 2012-03-22 9:56 ` Chris Wilson 2012-03-30 23:38 ` Daniel Vetter 2012-03-29 0:23 ` [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky 2 siblings, 2 replies; 12+ messages in thread From: Ben Widawsky @ 2012-03-22 0:19 UTC (permalink / raw) To: intel-gfx; +Cc: Ben Widawsky In theory this will have performance and power improvements. Performance because we don't need to stall when the scanout BO is busy, and power because we don't have to stall when the BO is busy ie. we can get the work done sooner and put the CPU to sleep (and one less interrupt required). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_gem.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ce2fee5..96ad162 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3041,9 +3041,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, * any flushes to be pipelined (for pageflips). * * For the display plane, we want to be in the GTT but out of any write - * domains. So in many ways this looks like set_to_gtt_domain() apart from the - * ability to pipeline the waits, pinning and any additional subtleties - * that may differentiate the display plane from ordinary buffers. + * domains. So in many ways this looks like set_to_gtt_domain(). */ int i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, @@ -3058,8 +3056,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, return ret; if (pipelined != obj->ring) { - ret = i915_gem_object_wait_rendering(obj); - if (ret == -ERESTARTSYS) + ret = i915_gem_object_sync(obj, pipelined); + if (ret) return ret; } -- 1.7.9.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-22 0:19 ` [PATCH 2/2] drm/i915: use semaphores for the display plane Ben Widawsky @ 2012-03-22 9:56 ` Chris Wilson 2012-03-22 16:07 ` Ben Widawsky 2012-03-30 23:38 ` Daniel Vetter 1 sibling, 1 reply; 12+ messages in thread From: Chris Wilson @ 2012-03-22 9:56 UTC (permalink / raw) To: intel-gfx; +Cc: Ben Widawsky On Wed, 21 Mar 2012 17:19:13 -0700, Ben Widawsky <ben@bwidawsk.net> wrote: > In theory this will have performance and power improvements. Performance > because we don't need to stall when the scanout BO is busy, and power > because we don't have to stall when the BO is busy ie. we can get the > work done sooner and put the CPU to sleep (and one less interrupt > required). > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_gem.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index ce2fee5..96ad162 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3041,9 +3041,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > * any flushes to be pipelined (for pageflips). > * > * For the display plane, we want to be in the GTT but out of any write > - * domains. So in many ways this looks like set_to_gtt_domain() apart from the > - * ability to pipeline the waits, pinning and any additional subtleties > - * that may differentiate the display plane from ordinary buffers. > + * domains. So in many ways this looks like set_to_gtt_domain(). ...apart from the whole pinning and pipelining. It looks less like set-to-gtt-domain over time. Not an improvement. I'll be the first to admit that it is not a great comment, but at least it does try to capture why we don't just treat the display plane as GTT. A better comment would explain our concept of display plane and pipelining. > */ > int > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > @@ -3058,8 +3056,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > return ret; > > if (pipelined != obj->ring) { > - ret = i915_gem_object_wait_rendering(obj); > - if (ret == -ERESTARTSYS) > + ret = i915_gem_object_sync(obj, pipelined); > + if (ret) > return ret; > } This looks eerily familiar. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-22 9:56 ` Chris Wilson @ 2012-03-22 16:07 ` Ben Widawsky 2012-03-22 16:18 ` Chris Wilson 0 siblings, 1 reply; 12+ messages in thread From: Ben Widawsky @ 2012-03-22 16:07 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Thu, 22 Mar 2012 09:56:07 +0000 Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Wed, 21 Mar 2012 17:19:13 -0700, Ben Widawsky <ben@bwidawsk.net> > wrote: > > In theory this will have performance and power improvements. > > Performance because we don't need to stall when the scanout BO is > > busy, and power because we don't have to stall when the BO is busy > > ie. we can get the work done sooner and put the CPU to sleep (and > > one less interrupt required). > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > --- > > drivers/gpu/drm/i915/i915_gem.c | 8 +++----- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c > > b/drivers/gpu/drm/i915/i915_gem.c index ce2fee5..96ad162 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -3041,9 +3041,7 @@ int i915_gem_object_set_cache_level(struct > > drm_i915_gem_object *obj, > > * any flushes to be pipelined (for pageflips). > > * > > * For the display plane, we want to be in the GTT but out of any > > write > > - * domains. So in many ways this looks like set_to_gtt_domain() > > apart from the > > - * ability to pipeline the waits, pinning and any additional > > subtleties > > - * that may differentiate the display plane from ordinary buffers. > > + * domains. So in many ways this looks like set_to_gtt_domain(). > ...apart from the whole pinning and pipelining. It looks less like > set-to-gtt-domain over time. > > Not an improvement. I'll be the first to admit that it is not a great > comment, but at least it does try to capture why we don't just treat > the display plane as GTT. A better comment would explain our concept > of display plane and pipelining. Separate patch though, don't you think? And it is now pipe-lining the waits, so I'm confused why you don't think it's an improvement. > > > */ > > int > > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object > > *obj, @@ -3058,8 +3056,8 @@ > > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object > > *obj, return ret; > > if (pipelined != obj->ring) { > > - ret = i915_gem_object_wait_rendering(obj); > > - if (ret == -ERESTARTSYS) > > + ret = i915_gem_object_sync(obj, pipelined); > > + if (ret) > > return ret; > > } > > This looks eerily familiar. Yes. Danvet enlightened me that you'd done these patches before, but he said they no longer applied and I couldn't find them in my mail anywhere so I wasn't sure how similar it was or wasn't. Assuming I redo this one, I'll be sure to add a note in the commit. > -Chris > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-22 16:07 ` Ben Widawsky @ 2012-03-22 16:18 ` Chris Wilson 2012-03-22 16:48 ` Ben Widawsky 0 siblings, 1 reply; 12+ messages in thread From: Chris Wilson @ 2012-03-22 16:18 UTC (permalink / raw) To: Ben Widawsky; +Cc: intel-gfx On Thu, 22 Mar 2012 09:07:09 -0700, Ben Widawsky <ben@bwidawsk.net> wrote: > On Thu, 22 Mar 2012 09:56:07 +0000 > Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > On Wed, 21 Mar 2012 17:19:13 -0700, Ben Widawsky <ben@bwidawsk.net> > > wrote: > > > * For the display plane, we want to be in the GTT but out of any > > > write > > > - * domains. So in many ways this looks like set_to_gtt_domain() > > > apart from the > > > - * ability to pipeline the waits, pinning and any additional > > > subtleties > > > - * that may differentiate the display plane from ordinary buffers. > > > + * domains. So in many ways this looks like set_to_gtt_domain(). > > ...apart from the whole pinning and pipelining. It looks less like > > set-to-gtt-domain over time. > > > > Not an improvement. I'll be the first to admit that it is not a great > > comment, but at least it does try to capture why we don't just treat > > the display plane as GTT. A better comment would explain our concept > > of display plane and pipelining. > > Separate patch though, don't you think? And it is now pipe-lining the > waits, so I'm confused why you don't think it's an improvement. Right. So how does removing the hint of why pin-to-display is different from set-to-gtt when it now performs said function help improve the comment? I completely agree that changing that comment is outside of the scope of this patch. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-22 16:18 ` Chris Wilson @ 2012-03-22 16:48 ` Ben Widawsky 0 siblings, 0 replies; 12+ messages in thread From: Ben Widawsky @ 2012-03-22 16:48 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Thu, 22 Mar 2012 16:18:54 +0000 Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Thu, 22 Mar 2012 09:07:09 -0700, Ben Widawsky <ben@bwidawsk.net> wrote: > > On Thu, 22 Mar 2012 09:56:07 +0000 > > Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > > > On Wed, 21 Mar 2012 17:19:13 -0700, Ben Widawsky <ben@bwidawsk.net> > > > wrote: > > > > * For the display plane, we want to be in the GTT but out of any > > > > write > > > > - * domains. So in many ways this looks like set_to_gtt_domain() > > > > apart from the > > > > - * ability to pipeline the waits, pinning and any additional > > > > subtleties > > > > - * that may differentiate the display plane from ordinary buffers. > > > > + * domains. So in many ways this looks like set_to_gtt_domain(). > > > ...apart from the whole pinning and pipelining. It looks less like > > > set-to-gtt-domain over time. > > > > > > Not an improvement. I'll be the first to admit that it is not a great > > > comment, but at least it does try to capture why we don't just treat > > > the display plane as GTT. A better comment would explain our concept > > > of display plane and pipelining. > > > > Separate patch though, don't you think? And it is now pipe-lining the > > waits, so I'm confused why you don't think it's an improvement. > > Right. So how does removing the hint of why pin-to-display is different > from set-to-gtt when it now performs said function help improve the > comment? I completely agree that changing that comment is outside of the > scope of this patch. > -Chris > Can you volunteer a better description? I'll undo the comment change in my patch #2 and use whatever you come up with. If not, I'll get to this when I get to it. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-22 0:19 ` [PATCH 2/2] drm/i915: use semaphores for the display plane Ben Widawsky 2012-03-22 9:56 ` Chris Wilson @ 2012-03-30 23:38 ` Daniel Vetter 2012-03-30 23:42 ` Chris Wilson 1 sibling, 1 reply; 12+ messages in thread From: Daniel Vetter @ 2012-03-30 23:38 UTC (permalink / raw) To: Ben Widawsky; +Cc: intel-gfx On Wed, Mar 21, 2012 at 05:19:13PM -0700, Ben Widawsky wrote: > In theory this will have performance and power improvements. Performance > because we don't need to stall when the scanout BO is busy, and power > because we don't have to stall when the BO is busy ie. we can get the > work done sooner and put the CPU to sleep (and one less interrupt > required). > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > --- > drivers/gpu/drm/i915/i915_gem.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index ce2fee5..96ad162 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3041,9 +3041,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > * any flushes to be pipelined (for pageflips). > * > * For the display plane, we want to be in the GTT but out of any write > - * domains. So in many ways this looks like set_to_gtt_domain() apart from the > - * ability to pipeline the waits, pinning and any additional subtleties > - * that may differentiate the display plane from ordinary buffers. > + * domains. So in many ways this looks like set_to_gtt_domain(). > */ For the comment bikeshed, what about: "Prepare buffer for display plane (scanout, cursors, etc). Can be called from an uninterruptible phase (modesetting). "For display planes we need to flush, synchronize with any outstanding rendering (pipelined using semaphores, if available, in case of a pageflip) and additionally need to ensure that the cache_level is coherent for the scanout engine." Cheers, Daniel > int > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > @@ -3058,8 +3056,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > return ret; > > if (pipelined != obj->ring) { > - ret = i915_gem_object_wait_rendering(obj); > - if (ret == -ERESTARTSYS) > + ret = i915_gem_object_sync(obj, pipelined); > + if (ret) > return ret; > } > > -- > 1.7.9.4 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Mail: daniel@ffwll.ch Mobile: +41 (0)79 365 57 48 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] drm/i915: use semaphores for the display plane 2012-03-30 23:38 ` Daniel Vetter @ 2012-03-30 23:42 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2012-03-30 23:42 UTC (permalink / raw) To: Daniel Vetter, Ben Widawsky; +Cc: intel-gfx On Sat, 31 Mar 2012 01:38:24 +0200, Daniel Vetter <daniel@ffwll.ch> wrote: > On Wed, Mar 21, 2012 at 05:19:13PM -0700, Ben Widawsky wrote: > > In theory this will have performance and power improvements. Performance > > because we don't need to stall when the scanout BO is busy, and power > > because we don't have to stall when the BO is busy ie. we can get the > > work done sooner and put the CPU to sleep (and one less interrupt > > required). > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > --- > > drivers/gpu/drm/i915/i915_gem.c | 8 +++----- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index ce2fee5..96ad162 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -3041,9 +3041,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > * any flushes to be pipelined (for pageflips). > > * > > * For the display plane, we want to be in the GTT but out of any write > > - * domains. So in many ways this looks like set_to_gtt_domain() apart from the > > - * ability to pipeline the waits, pinning and any additional subtleties > > - * that may differentiate the display plane from ordinary buffers. > > + * domains. So in many ways this looks like set_to_gtt_domain(). > > */ > > For the comment bikeshed, what about: > > "Prepare buffer for display plane (scanout, cursors, etc). > Can be called from an uninterruptible phase (modesetting). > > "For display planes we need to flush, synchronize with any outstanding > rendering (pipelined using semaphores, if available, in case of a > pageflip) and additionally need to ensure that the cache_level is coherent > for the scanout engine." It's an improvement over what we have certainly. Still doesn't explain the why, but I've writers block as well. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] semaphorify the pageflip BO (if possible) 2012-03-22 0:19 [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky 2012-03-22 0:19 ` [PATCH 1/2] drm/i915: extract ring sync code Ben Widawsky 2012-03-22 0:19 ` [PATCH 2/2] drm/i915: use semaphores for the display plane Ben Widawsky @ 2012-03-29 0:23 ` Ben Widawsky 2012-03-29 8:49 ` Daniel Vetter 2 siblings, 1 reply; 12+ messages in thread From: Ben Widawsky @ 2012-03-29 0:23 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx Daniel, how do you want to handle this? On Wed, 21 Mar 2012 17:19:11 -0700 Ben Widawsky <ben@bwidawsk.net> wrote: > I've not observed any FPS changes with my limited testing. > > Here is the performance data I collected with nexuiz, measuring the > latency of i915_gem_object_pin_to_display_plane. Top is before, bottom > is after. > > N Min Max Median Avg > Stddev x 2246 0.352 37.538 2.791 > 3.3091901 1.8631844 > + 2239 0.281 44.517 2.699 3.088992 > 1.583838 Difference at 95.0% confidence > -0.220198 +/- 0.101227 > -6.65414% +/- 3.05896% > (Student's t, pooled s = 1.72938) > > > Ben Widawsky (2): > drm/i915: extract ring sync code > drm/i915: use semaphores for the display plane > > drivers/gpu/drm/i915/i915_drv.h | 19 +++++++++ > drivers/gpu/drm/i915/i915_gem.c | 51 > ++++++++++++++++++++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c > | 60 +--------------------------- 3 files changed, 66 > insertions(+), 64 deletions(-) > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] semaphorify the pageflip BO (if possible) 2012-03-29 0:23 ` [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky @ 2012-03-29 8:49 ` Daniel Vetter 0 siblings, 0 replies; 12+ messages in thread From: Daniel Vetter @ 2012-03-29 8:49 UTC (permalink / raw) To: Ben Widawsky; +Cc: intel-gfx On Wed, Mar 28, 2012 at 05:23:08PM -0700, Ben Widawsky wrote: > Daniel, how do you want to handle this? It's somewhere on my list of things to do. I'd like to play around with this on my ivb first just to see what's going on, but otherwise nothing to complain about here. Yours, Daniel > > On Wed, 21 Mar 2012 17:19:11 -0700 > Ben Widawsky <ben@bwidawsk.net> wrote: > > > I've not observed any FPS changes with my limited testing. > > > > Here is the performance data I collected with nexuiz, measuring the > > latency of i915_gem_object_pin_to_display_plane. Top is before, bottom > > is after. > > > > N Min Max Median Avg > > Stddev x 2246 0.352 37.538 2.791 > > 3.3091901 1.8631844 > > + 2239 0.281 44.517 2.699 3.088992 > > 1.583838 Difference at 95.0% confidence > > -0.220198 +/- 0.101227 > > -6.65414% +/- 3.05896% > > (Student's t, pooled s = 1.72938) > > > > > > Ben Widawsky (2): > > drm/i915: extract ring sync code > > drm/i915: use semaphores for the display plane > > > > drivers/gpu/drm/i915/i915_drv.h | 19 +++++++++ > > drivers/gpu/drm/i915/i915_gem.c | 51 > > ++++++++++++++++++++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c > > | 60 +--------------------------- 3 files changed, 66 > > insertions(+), 64 deletions(-) > > > -- Daniel Vetter Mail: daniel@ffwll.ch Mobile: +41 (0)79 365 57 48 ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-03-30 23:42 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-03-22 0:19 [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky 2012-03-22 0:19 ` [PATCH 1/2] drm/i915: extract ring sync code Ben Widawsky 2012-03-22 9:50 ` Chris Wilson 2012-03-22 0:19 ` [PATCH 2/2] drm/i915: use semaphores for the display plane Ben Widawsky 2012-03-22 9:56 ` Chris Wilson 2012-03-22 16:07 ` Ben Widawsky 2012-03-22 16:18 ` Chris Wilson 2012-03-22 16:48 ` Ben Widawsky 2012-03-30 23:38 ` Daniel Vetter 2012-03-30 23:42 ` Chris Wilson 2012-03-29 0:23 ` [PATCH 0/2] semaphorify the pageflip BO (if possible) Ben Widawsky 2012-03-29 8:49 ` Daniel Vetter
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.