* [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread @ 2014-03-07 8:30 Chris Wilson 2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson 2014-03-07 18:14 ` [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Volkin, Bradley D 0 siblings, 2 replies; 7+ messages in thread From: Chris Wilson @ 2014-03-07 8:30 UTC (permalink / raw) To: intel-gfx We used to lock individual pages inside the buffer object and so needed to update the page flags every time. However, we now pin the pages into the object for the duration of the pwrite/pread (and hopefully much longer) and so we can forgo the flag updates until we release all the pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ba7dc4868066..877afb2c576d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -627,12 +627,10 @@ i915_gem_shmem_pread(struct drm_device *dev, mutex_lock(&dev->struct_mutex); -next_page: - mark_page_accessed(page); - if (ret) goto out; +next_page: remain -= page_length; user_data += page_length; offset += page_length; @@ -950,13 +948,10 @@ i915_gem_shmem_pwrite(struct drm_device *dev, mutex_lock(&dev->struct_mutex); -next_page: - set_page_dirty(page); - mark_page_accessed(page); - if (ret) goto out; +next_page: remain -= page_length; user_data += page_length; offset += page_length; -- 1.9.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path 2014-03-07 8:30 [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Chris Wilson @ 2014-03-07 8:30 ` Chris Wilson 2014-03-07 8:39 ` Daniel Vetter 2014-03-07 18:14 ` Volkin, Bradley D 2014-03-07 18:14 ` [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Volkin, Bradley D 1 sibling, 2 replies; 7+ messages in thread From: Chris Wilson @ 2014-03-07 8:30 UTC (permalink / raw) To: intel-gfx We don't always want to write into main memory with pwrite. The shmem fast path in particular is used for memory that is cacheable - under such circumstances forcing the cache eviction is undesirable. As we will always flush the cache when targeting incoherent buffers, we can rely on that second pass to apply the cache coherency rules and so benefit from in-cache copies otherwise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 877afb2c576d..e0ca6d6be2ae 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length, if (needs_clflush_before) drm_clflush_virt_range(vaddr + shmem_page_offset, page_length); - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset, - user_data, - page_length); + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset, + user_data, page_length); if (needs_clflush_after) drm_clflush_virt_range(vaddr + shmem_page_offset, page_length); -- 1.9.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path 2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson @ 2014-03-07 8:39 ` Daniel Vetter 2014-03-07 9:50 ` Chris Wilson 2014-03-07 18:14 ` Volkin, Bradley D 1 sibling, 1 reply; 7+ messages in thread From: Daniel Vetter @ 2014-03-07 8:39 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote: > We don't always want to write into main memory with pwrite. The shmem > fast path in particular is used for memory that is cacheable - under > such circumstances forcing the cache eviction is undesirable. As we will > always flush the cache when targeting incoherent buffers, we can rely on > that second pass to apply the cache coherency rules and so benefit from > in-cache copies otherwise. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Do you have some numbers on this? Looks good otherwise. -Daniel > --- > drivers/gpu/drm/i915/i915_gem.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 877afb2c576d..e0ca6d6be2ae 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length, > if (needs_clflush_before) > drm_clflush_virt_range(vaddr + shmem_page_offset, > page_length); > - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset, > - user_data, > - page_length); > + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset, > + user_data, page_length); > if (needs_clflush_after) > drm_clflush_virt_range(vaddr + shmem_page_offset, > page_length); > -- > 1.9.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path 2014-03-07 8:39 ` Daniel Vetter @ 2014-03-07 9:50 ` Chris Wilson 0 siblings, 0 replies; 7+ messages in thread From: Chris Wilson @ 2014-03-07 9:50 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Fri, Mar 07, 2014 at 09:39:44AM +0100, Daniel Vetter wrote: > On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote: > > We don't always want to write into main memory with pwrite. The shmem > > fast path in particular is used for memory that is cacheable - under > > such circumstances forcing the cache eviction is undesirable. As we will > > always flush the cache when targeting incoherent buffers, we can rely on > > that second pass to apply the cache coherency rules and so benefit from > > in-cache copies otherwise. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Do you have some numbers on this? Looks good otherwise. Comparative figures with 1333MHz DDR3 on crw: 0: Time to snooped copy 16384 bytes x 131072: 19.520µs, 839.4MiB/s 1: Time to snooped copy 16384 bytes x 131072: 19.444µs, 842.6MiB/s 2: Time to snooped copy 16384 bytes x 131072: 18.808µs, 871.1MiB/s Oddly enough though, it was the removing the page flag accesses that made the most impact at a higher level. It will take a while longer to complete checks on pnv. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path 2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson 2014-03-07 8:39 ` Daniel Vetter @ 2014-03-07 18:14 ` Volkin, Bradley D 2014-03-07 23:03 ` Daniel Vetter 1 sibling, 1 reply; 7+ messages in thread From: Volkin, Bradley D @ 2014-03-07 18:14 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote: > We don't always want to write into main memory with pwrite. The shmem > fast path in particular is used for memory that is cacheable - under > such circumstances forcing the cache eviction is undesirable. As we will > always flush the cache when targeting incoherent buffers, we can rely on > that second pass to apply the cache coherency rules and so benefit from > in-cache copies otherwise. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_gem.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 877afb2c576d..e0ca6d6be2ae 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length, > if (needs_clflush_before) > drm_clflush_virt_range(vaddr + shmem_page_offset, > page_length); > - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset, > - user_data, > - page_length); > + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset, > + user_data, page_length); > if (needs_clflush_after) > drm_clflush_virt_range(vaddr + shmem_page_offset, > page_length); > -- > 1.9.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path 2014-03-07 18:14 ` Volkin, Bradley D @ 2014-03-07 23:03 ` Daniel Vetter 0 siblings, 0 replies; 7+ messages in thread From: Daniel Vetter @ 2014-03-07 23:03 UTC (permalink / raw) To: Volkin, Bradley D; +Cc: intel-gfx@lists.freedesktop.org On Fri, Mar 07, 2014 at 10:14:58AM -0800, Volkin, Bradley D wrote: > Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> > > On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote: > > We don't always want to write into main memory with pwrite. The shmem > > fast path in particular is used for memory that is cacheable - under > > such circumstances forcing the cache eviction is undesirable. As we will > > always flush the cache when targeting incoherent buffers, we can rely on > > that second pass to apply the cache coherency rules and so benefit from > > in-cache copies otherwise. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Both patches merged, thanks. -Daniel > > --- > > drivers/gpu/drm/i915/i915_gem.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 877afb2c576d..e0ca6d6be2ae 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length, > > if (needs_clflush_before) > > drm_clflush_virt_range(vaddr + shmem_page_offset, > > page_length); > > - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset, > > - user_data, > > - page_length); > > + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset, > > + user_data, page_length); > > if (needs_clflush_after) > > drm_clflush_virt_range(vaddr + shmem_page_offset, > > page_length); > > -- > > 1.9.0 > > > > _______________________________________________ > > Intel-gfx mailing list > > Intel-gfx@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread 2014-03-07 8:30 [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Chris Wilson 2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson @ 2014-03-07 18:14 ` Volkin, Bradley D 1 sibling, 0 replies; 7+ messages in thread From: Volkin, Bradley D @ 2014-03-07 18:14 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> On Fri, Mar 07, 2014 at 12:30:36AM -0800, Chris Wilson wrote: > We used to lock individual pages inside the buffer object and so needed > to update the page flags every time. However, we now pin the pages into > the object for the duration of the pwrite/pread (and hopefully much > longer) and so we can forgo the flag updates until we release all the > pages. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_gem.c | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index ba7dc4868066..877afb2c576d 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -627,12 +627,10 @@ i915_gem_shmem_pread(struct drm_device *dev, > > mutex_lock(&dev->struct_mutex); > > -next_page: > - mark_page_accessed(page); > - > if (ret) > goto out; > > +next_page: > remain -= page_length; > user_data += page_length; > offset += page_length; > @@ -950,13 +948,10 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > > mutex_lock(&dev->struct_mutex); > > -next_page: > - set_page_dirty(page); > - mark_page_accessed(page); > - > if (ret) > goto out; > > +next_page: > remain -= page_length; > user_data += page_length; > offset += page_length; > -- > 1.9.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-03-07 23:03 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-03-07 8:30 [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Chris Wilson 2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson 2014-03-07 8:39 ` Daniel Vetter 2014-03-07 9:50 ` Chris Wilson 2014-03-07 18:14 ` Volkin, Bradley D 2014-03-07 23:03 ` Daniel Vetter 2014-03-07 18:14 ` [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Volkin, Bradley D
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox