* [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread
@ 2014-11-12 14:57 ville.syrjala
2014-11-12 15:01 ` Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: ville.syrjala @ 2014-11-12 14:57 UTC (permalink / raw)
To: intel-gfx
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Currently it's possible to get visible cache dirt on scanout on LLC
machines when using pwrite on the future scanout bo if its cache_level
is already NONE.
pwrite's "does this need clflush?" checks would decide that no clflush
is necessary since the bo isn't currently pinned to the display and LLC
makes everything else coherent. The subsequent set_cache_level(NONE)
would also do nothing since cache_level is already correct. And hence
no clflush will be performed and we flip to a bo which can still have
dirty data in the caches.
To correctly track the cache dirtiness, move the object to the CPU write
domain in pwrite. This cures the cache dirt since we explicitly flush
the CPU write domain in the pin_to_display path.
Give pread the same treatment simply in the name of symmetry.
v2: Use trace_i915_gem_object_change_domain() and provide some kind
of commit message
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
v1 was floated on irc.
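For illustration, the failure mode and the fix can be sketched with a standalone mock (the struct and helper names below are invented for the sketch; only the I915_GEM_DOMAIN_CPU flag mirrors the real driver):

```c
#include <assert.h>

/* Standalone mock of the domain bookkeeping; not the kernel's structs. */
#define I915_GEM_DOMAIN_CPU (1u << 0)

struct mock_bo {
        unsigned write_domain;
        int pinned_to_display;
};

/* Old behaviour on LLC: pwrite saw no reason to clflush unless the bo was
 * already pinned for display, so a future scanout bo slipped through. */
static int old_needs_clflush(const struct mock_bo *bo)
{
        return bo->pinned_to_display;
}

/* Patched behaviour: pwrite always moves the bo to the CPU write domain... */
static void patched_pwrite(struct mock_bo *bo)
{
        bo->write_domain = I915_GEM_DOMAIN_CPU;
}

/* ...and the pin_to_display path flushes any bo with a dirty CPU write
 * domain, which is what cures the visible cache dirt. */
static int pin_to_display_flushes(const struct mock_bo *bo)
{
        return (bo->write_domain & I915_GEM_DOMAIN_CPU) != 0;
}
```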
drivers/gpu/drm/i915/i915_gem.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3e0cabe..12a3c16 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -456,6 +456,7 @@ __copy_from_user_swizzled(char *gpu_vaddr, int gpu_offset,
 int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
                                     int *needs_clflush)
 {
+        u32 old_read_domains;
         int ret;
 
         *needs_clflush = 0;
@@ -475,6 +476,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
                         return ret;
 
                 i915_gem_object_retire(obj);
+                i915_gem_object_flush_gtt_write_domain(obj);
         }
 
         ret = i915_gem_object_get_pages(obj);
@@ -483,6 +485,13 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 
         i915_gem_object_pin_pages(obj);
 
+        old_read_domains = obj->base.read_domains;
+        obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
+
+        trace_i915_gem_object_change_domain(obj,
+                                            old_read_domains,
+                                            obj->base.write_domain);
+
         return ret;
 }
@@ -875,6 +884,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
         int needs_clflush_after = 0;
         int needs_clflush_before = 0;
         struct sg_page_iter sg_iter;
+        u32 old_write_domain, old_read_domains;
 
         user_data = to_user_ptr(args->data_ptr);
         remain = args->size;
@@ -892,6 +902,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
                         return ret;
 
                 i915_gem_object_retire(obj);
+                i915_gem_object_flush_gtt_write_domain(obj);
         }
         /* Same trick applies to invalidate partially written cachelines read
          * before writing. */
@@ -908,6 +919,15 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
         offset = args->offset;
         obj->dirty = 1;
 
+        old_read_domains = obj->base.read_domains;
+        old_write_domain = obj->base.write_domain;
+        obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+        obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+
+        trace_i915_gem_object_change_domain(obj,
+                                            old_read_domains,
+                                            old_write_domain);
+
         for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents,
                          offset >> PAGE_SHIFT) {
                 struct page *page = sg_page_iter_page(&sg_iter);
@@ -978,9 +998,19 @@ out:
                 }
         }
 
-        if (needs_clflush_after)
+        if (needs_clflush_after) {
                 i915_gem_chipset_flush(dev);
+                if (obj->base.write_domain == I915_GEM_DOMAIN_CPU) {
+                        old_write_domain = obj->base.write_domain;
+                        obj->base.write_domain = 0;
+
+                        trace_i915_gem_object_change_domain(obj,
+                                                            obj->base.read_domains,
+                                                            old_write_domain);
+                }
+        }
+
         return ret;
 }
--
2.0.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread
2014-11-12 14:57 [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread ville.syrjala
@ 2014-11-12 15:01 ` Chris Wilson
2014-11-12 15:10 ` Ville Syrjälä
2014-11-12 15:30 ` Daniel Vetter
2014-11-13 4:10 ` shuang.he
2 siblings, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2014-11-12 15:01 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
On Wed, Nov 12, 2014 at 04:57:10PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Currently it's possible to get visible cache dirt on scanout on LLC
> machines when using pwrite on the future scanout bo if its cache_level
> is already NONE.
>
> pwrite's "does this need clflush?" checks would decide that no clflush
> is necessary since the bo isn't currently pinned to the display and LLC
> makes everything else coherent. The subsequent set_cache_level(NONE)
> would also do nothing since cache_level is already correct. And hence
> no clflush will be performed and we flip to a bo which can still have
> dirty data in the caches.
>
> To correctly track the cache dirtiness, move the object to the CPU write
> domain in pwrite. This cures the cache dirt since we explicitly flush
> the CPU write domain in the pin_to_display path.
>
> Give pread the same treatment simply in the name of symmetry.
>
> v2: Use trace_i915_gem_object_change_domain() and provide some kind
> of commit message
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> v1 was floated on irc.
Doesn't it still suffer the same problem that only accessed cache lines
are clflushed, but we declare the entire object as valid?
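Concretely, the hazard can be modelled in isolation (all names below are invented for the sketch): the clflush covers only the lines the pwrite touched, while the object-level bookkeeping would declare the whole object clean.

```c
#include <assert.h>
#include <string.h>

/* Standalone model: 'cache_buf' is the CPU cache view, 'mem' is what the
 * display engine would scan out, 'dirty' tracks unflushed cachelines. */
#define OBJ_SIZE 256
#define LINE 64                      /* cacheline granularity */

static char cache_buf[OBJ_SIZE];
static char mem[OBJ_SIZE];
static int dirty[OBJ_SIZE / LINE];

/* A CPU write lands in the cache and dirties its line. */
static void cpu_write(int off, char v)
{
        cache_buf[off] = v;
        dirty[off / LINE] = 1;
}

/* Flush only the lines covered by [start, start+len); lines dirtied by
 * earlier CPU writes outside this range stay dirty in the cache. */
static void clflush_range(int start, int len)
{
        int l;

        for (l = start / LINE; l <= (start + len - 1) / LINE; l++) {
                if (dirty[l]) {
                        memcpy(mem + l * LINE, cache_buf + l * LINE, LINE);
                        dirty[l] = 0;
                }
        }
}
```

Writing offsets 0..63 and flushing only that range leaves a line dirtied earlier at offset 128 stale in memory, even though clearing write_domain afterwards claims the entire object is valid.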
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread
2014-11-12 15:01 ` Chris Wilson
@ 2014-11-12 15:10 ` Ville Syrjälä
0 siblings, 0 replies; 5+ messages in thread
From: Ville Syrjälä @ 2014-11-12 15:10 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On Wed, Nov 12, 2014 at 03:01:30PM +0000, Chris Wilson wrote:
> On Wed, Nov 12, 2014 at 04:57:10PM +0200, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Currently it's possible to get visible cache dirt on scanout on LLC
> > machines when using pwrite on the future scanout bo if its cache_level
> > is already NONE.
> >
> > pwrite's "does this need clflush?" checks would decide that no clflush
> > is necessary since the bo isn't currently pinned to the display and LLC
> > makes everything else coherent. The subsequent set_cache_level(NONE)
> > would also do nothing since cache_level is already correct. And hence
> > no clflush will be performed and we flip to a bo which can still have
> > dirty data in the caches.
> >
> > To correctly track the cache dirtiness, move the object to the CPU write
> > domain in pwrite. This cures the cache dirt since we explicitly flush
> > the CPU write domain in the pin_to_display path.
> >
> > Give pread the same treatment simply in the name of symmetry.
> >
> > v2: Use trace_i915_gem_object_change_domain() and provide some kind
> > of commit message
> >
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > v1 was floated on irc.
>
> Doesn't it still suffer the same problem that only accessed cache lines
> are clflushed, but we declare the entire object as valid?
Ah in case there's already dirty stuff in the cache before the pwrite?
Hmm, yeah that would be a problem.
How about clearing write_domain at the end only if write_domain
was != CPU at the start?
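A standalone sketch of that idea (mock types, invented names):

```c
#include <assert.h>

/* Sample write_domain when pwrite starts; only declare the object flushed
 * at the end if the CPU write domain was clean at the start. */
#define DOMAIN_CPU (1u << 0)

struct mock_bo {
        unsigned write_domain;
};

/* pwrite entry: remember the old domain, then dirty via the CPU. */
static unsigned pwrite_begin(struct mock_bo *bo)
{
        unsigned was = bo->write_domain;

        bo->write_domain = DOMAIN_CPU;
        return was;
}

/* pwrite exit: if we clflushed what we wrote AND there was no pre-existing
 * CPU dirt, the whole object really is clean; otherwise leave it marked
 * dirty so a later flush (e.g. pin_to_display) still happens. */
static void pwrite_finish(struct mock_bo *bo, unsigned was, int clflushed_after)
{
        if (clflushed_after && was != DOMAIN_CPU)
                bo->write_domain = 0;
}
```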
--
Ville Syrjälä
Intel OTC
* Re: [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread
2014-11-12 14:57 [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread ville.syrjala
2014-11-12 15:01 ` Chris Wilson
@ 2014-11-12 15:30 ` Daniel Vetter
2014-11-13 4:10 ` shuang.he
2 siblings, 0 replies; 5+ messages in thread
From: Daniel Vetter @ 2014-11-12 15:30 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
On Wed, Nov 12, 2014 at 04:57:10PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Currently it's possible to get visible cache dirt on scanout on LLC
> machines when using pwrite on the future scanout bo if its cache_level
> is already NONE.
>
> pwrite's "does this need clflush?" checks would decide that no clflush
> is necessary since the bo isn't currently pinned to the display and LLC
> makes everything else coherent. The subsequent set_cache_level(NONE)
> would also do nothing since cache_level is already correct. And hence
> no clflush will be performed and we flip to a bo which can still have
> dirty data in the caches.
>
> To correctly track the cache dirtiness, move the object to the CPU write
> domain in pwrite. This cures the cache dirt since we explicitly flush
> the CPU write domain in the pin_to_display path.
>
> Give pread the same treatment simply in the name of symmetry.
>
> v2: Use trace_i915_gem_object_change_domain() and provide some kind
> of commit message
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Also usual drill: Can I haz testcase pls?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread
2014-11-12 14:57 [PATCH v2] drm/i915: Move to CPU domain in pwrite/pread ville.syrjala
2014-11-12 15:01 ` Chris Wilson
2014-11-12 15:30 ` Daniel Vetter
@ 2014-11-13 4:10 ` shuang.he
2 siblings, 0 replies; 5+ messages in thread
From: shuang.he @ 2014-11-13 4:10 UTC (permalink / raw)
To: shuang.he, intel-gfx, ville.syrjala
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
-------------------------------------Summary-------------------------------------
Platform: baseline_drm_intel_nightly_pass_rate->patch_applied_pass_rate
BYT: pass/total=291/291->284/291
PNV: pass/total=356/356->350/356
ILK: pass/total=372/372->365/372
IVB: pass/total=545/546->545/546
SNB: pass/total=380/380->379/380
HSW: pass/total=579/579->579/579
BDW: pass/total=434/435->434/435
-------------------------------------Detailed-------------------------------------
test_platform: test_suite, test_case, result_with_drm_intel_nightly(count, machine_id...)...->result_with_patch_applied(count, machine_id)...
BYT: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible, FAIL(3, M36)PASS(1, M31) -> FAIL(1, M36)PASS(3, M36)
BYT: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible-display, FAIL(3, M36)PASS(1, M31) -> FAIL(1, M36)PASS(3, M36)
BYT: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible-uncached, FAIL(3, M36)PASS(1, M31) -> FAIL(1, M36)PASS(3, M36)
BYT: Intel_gpu_tools, igt_gem_pread_after_blit_normal, FAIL(3, M36)PASS(1, M31) -> FAIL(1, M36)PASS(3, M36)
BYT: Intel_gpu_tools, igt_gem_pread_after_blit_normal-display, FAIL(3, M36)PASS(1, M31) -> FAIL(1, M36)PASS(3, M36)
BYT: Intel_gpu_tools, igt_gem_pread_after_blit_normal-uncached, FAIL(3, M36)PASS(1, M31) -> FAIL(1, M36)PASS(3, M36)
BYT: Intel_gpu_tools, igt_kms_setmode_invalid-clone-single-crtc, TIMEOUT(9, M36M31)PASS(1, M31) -> TIMEOUT(1, M36)PASS(3, M36)
PNV: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible, FAIL(3, M7)PASS(1, M25) -> FAIL(1, M7)PASS(3, M7)
PNV: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible-display, FAIL(3, M7)PASS(1, M25) -> FAIL(1, M7)PASS(3, M7)
PNV: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible-uncached, FAIL(3, M7)PASS(1, M25) -> FAIL(1, M7)PASS(2, M7)
PNV: Intel_gpu_tools, igt_gem_pread_after_blit_normal, FAIL(3, M7)PASS(1, M25) -> FAIL(1, M7)PASS(3, M7)
PNV: Intel_gpu_tools, igt_gem_pread_after_blit_normal-display, FAIL(3, M7)PASS(1, M25) -> FAIL(1, M7)PASS(3, M7)
PNV: Intel_gpu_tools, igt_gem_pread_after_blit_normal-uncached, FAIL(3, M7)PASS(1, M25) -> FAIL(1, M7)PASS(3, M7)
ILK: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible, FAIL(3, M6)PASS(1, M26) -> FAIL(1, M6)PASS(3, M6)
ILK: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible-display, FAIL(3, M6)PASS(1, M26) -> FAIL(1, M6)PASS(3, M6)
ILK: Intel_gpu_tools, igt_gem_pread_after_blit_interruptible-uncached, FAIL(3, M6)PASS(1, M26) -> FAIL(1, M6)PASS(3, M6)
ILK: Intel_gpu_tools, igt_gem_pread_after_blit_normal, FAIL(3, M6)PASS(1, M26) -> FAIL(1, M6)PASS(3, M6)
ILK: Intel_gpu_tools, igt_gem_pread_after_blit_normal-display, FAIL(3, M6)PASS(1, M26) -> FAIL(1, M6)PASS(3, M6)
ILK: Intel_gpu_tools, igt_gem_pread_after_blit_normal-uncached, FAIL(3, M6)PASS(1, M26) -> FAIL(1, M6)PASS(3, M6)
ILK: Intel_gpu_tools, igt_kms_setmode_invalid-clone-single-crtc, FAIL(3, M26)DMESG_FAIL(1, M26)TIMEOUT(16, M37M6M26)PASS(1, M26) -> TIMEOUT(1, M6)PASS(3, M6)
IVB: Intel_gpu_tools, igt_gem_bad_reloc_negative-reloc, NSPT(6, M21M34M4)PASS(10, M34M21M4) -> NSPT(1, M4)PASS(3, M4)
IVB: Intel_gpu_tools, igt_kms_setmode_invalid-clone-single-crtc, TIMEOUT(15, M34M21M4)PASS(1, M21) -> TIMEOUT(1, M4)PASS(3, M4)
SNB: Intel_gpu_tools, igt_kms_setmode_invalid-clone-single-crtc, TIMEOUT(18, M35M22)PASS(1, M35) -> TIMEOUT(1, M35)PASS(3, M35)
BDW: Intel_gpu_tools, igt_gem_reset_stats_ban-bsd, DMESG_WARN(1, M28)PASS(18, M42M30M28) -> PASS(4, M30)
BDW: Intel_gpu_tools, igt_kms_setmode_invalid-clone-single-crtc, TIMEOUT(18, M42M30M28)PASS(1, M28) -> TIMEOUT(1, M30)PASS(3, M30)