* [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread
@ 2014-03-07 8:30 Chris Wilson
2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson
2014-03-07 18:14 ` [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Volkin, Bradley D
0 siblings, 2 replies; 7+ messages in thread
From: Chris Wilson @ 2014-03-07 8:30 UTC (permalink / raw)
To: intel-gfx
We used to lock individual pages inside the buffer object and so needed
to update the page flags every time. However, we now pin the pages into
the object for the duration of the pwrite/pread (and hopefully much
longer) and so we can forgo the flag updates until we release all the
pages.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ba7dc4868066..877afb2c576d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -627,12 +627,10 @@ i915_gem_shmem_pread(struct drm_device *dev,
mutex_lock(&dev->struct_mutex);
-next_page:
- mark_page_accessed(page);
-
if (ret)
goto out;
+next_page:
remain -= page_length;
user_data += page_length;
offset += page_length;
@@ -950,13 +948,10 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
mutex_lock(&dev->struct_mutex);
-next_page:
- set_page_dirty(page);
- mark_page_accessed(page);
-
if (ret)
goto out;
+next_page:
remain -= page_length;
user_data += page_length;
offset += page_length;
--
1.9.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path
2014-03-07 8:30 [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Chris Wilson
@ 2014-03-07 8:30 ` Chris Wilson
2014-03-07 8:39 ` Daniel Vetter
2014-03-07 18:14 ` Volkin, Bradley D
2014-03-07 18:14 ` [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Volkin, Bradley D
1 sibling, 2 replies; 7+ messages in thread
From: Chris Wilson @ 2014-03-07 8:30 UTC (permalink / raw)
To: intel-gfx
We don't always want to write into main memory with pwrite. The shmem
fast path in particular is used for memory that is cacheable - under
such circumstances forcing the cache eviction is undesirable. As we will
always flush the cache when targeting incoherent buffers, we can rely on
that second pass to apply the cache coherency rules and so benefit from
in-cache copies otherwise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 877afb2c576d..e0ca6d6be2ae 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
if (needs_clflush_before)
drm_clflush_virt_range(vaddr + shmem_page_offset,
page_length);
- ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
- user_data,
- page_length);
+ ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
+ user_data, page_length);
if (needs_clflush_after)
drm_clflush_virt_range(vaddr + shmem_page_offset,
page_length);
--
1.9.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path
2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson
@ 2014-03-07 8:39 ` Daniel Vetter
2014-03-07 9:50 ` Chris Wilson
2014-03-07 18:14 ` Volkin, Bradley D
1 sibling, 1 reply; 7+ messages in thread
From: Daniel Vetter @ 2014-03-07 8:39 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote:
> We don't always want to write into main memory with pwrite. The shmem
> fast path in particular is used for memory that is cacheable - under
> such circumstances forcing the cache eviction is undesirable. As we will
> always flush the cache when targeting incoherent buffers, we can rely on
> that second pass to apply the cache coherency rules and so benefit from
> in-cache copies otherwise.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Do you have some numbers on this? Looks good otherwise.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 877afb2c576d..e0ca6d6be2ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
> if (needs_clflush_before)
> drm_clflush_virt_range(vaddr + shmem_page_offset,
> page_length);
> - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> - user_data,
> - page_length);
> + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> + user_data, page_length);
> if (needs_clflush_after)
> drm_clflush_virt_range(vaddr + shmem_page_offset,
> page_length);
> --
> 1.9.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path
2014-03-07 8:39 ` Daniel Vetter
@ 2014-03-07 9:50 ` Chris Wilson
0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2014-03-07 9:50 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Fri, Mar 07, 2014 at 09:39:44AM +0100, Daniel Vetter wrote:
> On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote:
> > We don't always want to write into main memory with pwrite. The shmem
> > fast path in particular is used for memory that is cacheable - under
> > such circumstances forcing the cache eviction is undesirable. As we will
> > always flush the cache when targeting incoherent buffers, we can rely on
> > that second pass to apply the cache coherency rules and so benefit from
> > in-cache copies otherwise.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Do you have some numbers on this? Looks good otherwise.
Comparative figures with 1333MHz DDR3 on crw:
0: Time to snooped copy 16384 bytes x 131072: 19.520µs, 839.4MiB/s
1: Time to snooped copy 16384 bytes x 131072: 19.444µs, 842.6MiB/s
2: Time to snooped copy 16384 bytes x 131072: 18.808µs, 871.1MiB/s
Oddly enough though, it was the removing the page flag accesses that
made the most impact at a higher level.
It will take a while longer to complete checks on pnv.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread
2014-03-07 8:30 [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Chris Wilson
2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson
@ 2014-03-07 18:14 ` Volkin, Bradley D
1 sibling, 0 replies; 7+ messages in thread
From: Volkin, Bradley D @ 2014-03-07 18:14 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
On Fri, Mar 07, 2014 at 12:30:36AM -0800, Chris Wilson wrote:
> We used to lock individual pages inside the buffer object and so needed
> to update the page flags every time. However, we now pin the pages into
> the object for the duration of the pwrite/pread (and hopefully much
> longer) and so we can forgo the flag updates until we release all the
> pages.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ba7dc4868066..877afb2c576d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -627,12 +627,10 @@ i915_gem_shmem_pread(struct drm_device *dev,
>
> mutex_lock(&dev->struct_mutex);
>
> -next_page:
> - mark_page_accessed(page);
> -
> if (ret)
> goto out;
>
> +next_page:
> remain -= page_length;
> user_data += page_length;
> offset += page_length;
> @@ -950,13 +948,10 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>
> mutex_lock(&dev->struct_mutex);
>
> -next_page:
> - set_page_dirty(page);
> - mark_page_accessed(page);
> -
> if (ret)
> goto out;
>
> +next_page:
> remain -= page_length;
> user_data += page_length;
> offset += page_length;
> --
> 1.9.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path
2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson
2014-03-07 8:39 ` Daniel Vetter
@ 2014-03-07 18:14 ` Volkin, Bradley D
2014-03-07 23:03 ` Daniel Vetter
1 sibling, 1 reply; 7+ messages in thread
From: Volkin, Bradley D @ 2014-03-07 18:14 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote:
> We don't always want to write into main memory with pwrite. The shmem
> fast path in particular is used for memory that is cacheable - under
> such circumstances forcing the cache eviction is undesirable. As we will
> always flush the cache when targeting incoherent buffers, we can rely on
> that second pass to apply the cache coherency rules and so benefit from
> in-cache copies otherwise.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 877afb2c576d..e0ca6d6be2ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
> if (needs_clflush_before)
> drm_clflush_virt_range(vaddr + shmem_page_offset,
> page_length);
> - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> - user_data,
> - page_length);
> + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> + user_data, page_length);
> if (needs_clflush_after)
> drm_clflush_virt_range(vaddr + shmem_page_offset,
> page_length);
> --
> 1.9.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path
2014-03-07 18:14 ` Volkin, Bradley D
@ 2014-03-07 23:03 ` Daniel Vetter
0 siblings, 0 replies; 7+ messages in thread
From: Daniel Vetter @ 2014-03-07 23:03 UTC (permalink / raw)
To: Volkin, Bradley D; +Cc: intel-gfx@lists.freedesktop.org
On Fri, Mar 07, 2014 at 10:14:58AM -0800, Volkin, Bradley D wrote:
> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
>
> On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote:
> > We don't always want to write into main memory with pwrite. The shmem
> > fast path in particular is used for memory that is cacheable - under
> > such circumstances forcing the cache eviction is undesirable. As we will
> > always flush the cache when targeting incoherent buffers, we can rely on
> > that second pass to apply the cache coherency rules and so benefit from
> > in-cache copies otherwise.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Both patches merged, thanks.
-Daniel
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 877afb2c576d..e0ca6d6be2ae 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
> > if (needs_clflush_before)
> > drm_clflush_virt_range(vaddr + shmem_page_offset,
> > page_length);
> > - ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> > - user_data,
> > - page_length);
> > + ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> > + user_data, page_length);
> > if (needs_clflush_after)
> > drm_clflush_virt_range(vaddr + shmem_page_offset,
> > page_length);
> > --
> > 1.9.0
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-03-07 23:03 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-07 8:30 [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Chris Wilson
2014-03-07 8:30 ` [PATCH 2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path Chris Wilson
2014-03-07 8:39 ` Daniel Vetter
2014-03-07 9:50 ` Chris Wilson
2014-03-07 18:14 ` Volkin, Bradley D
2014-03-07 23:03 ` Daniel Vetter
2014-03-07 18:14 ` [PATCH 1/2] drm/i915: Process page flags once rather than per pwrite/pread Volkin, Bradley D
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox