* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
2017-04-27 14:46 [PATCH 1/2] " Chris Wilson
@ 2017-04-27 15:07 ` Patchwork
0 siblings, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-04-27 15:07 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
URL : https://patchwork.freedesktop.org/series/23634/
State : success
== Summary ==
Series 23634v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23634/revisions/1/mbox/
Test drv_module_reload:
Subgroup basic-reload-inject:
pass -> INCOMPLETE (fi-bdw-5557u) fdo#100750
fdo#100750 https://bugs.freedesktop.org/show_bug.cgi?id=100750
fi-bdw-5557u total:276 pass:265 dwarn:0 dfail:0 fail:0 skip:10
fi-bdw-gvtdvm total:278 pass:256 dwarn:8 dfail:0 fail:0 skip:14 time:425s
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time:503s
fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time:554s
fi-byt-j1900 total:278 pass:254 dwarn:0 dfail:0 fail:0 skip:24 time:483s
fi-byt-n2820 total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:477s
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:406s
fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:406s
fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time:416s
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:497s
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:472s
fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:453s
fi-kbl-7560u total:278 pass:267 dwarn:1 dfail:0 fail:0 skip:10 time:567s
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:455s
fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time:572s
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time:458s
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:485s
fi-skl-gvtdvm total:278 pass:265 dwarn:0 dfail:0 fail:0 skip:13 time:437s
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:531s
fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time:401s
fi-bsw-n3050 failed to collect. IGT log at Patchwork_4569/fi-bsw-n3050/igt.log
8b5a41bbd270c3a8db6d48bc1d6d6bafb59e6753 drm-tip: 2017y-04m-27d-13h-10m-59s UTC integration manifest
5bb12ef drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
1b954e1 drm/i915: Mark CPU cache as dirty on every transition for CPU writes
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4569/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
@ 2017-06-01 16:59 Chris Wilson
2017-06-01 16:59 ` [PATCH 2/2] drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty Chris Wilson
2017-06-01 17:15 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes Patchwork
0 siblings, 2 replies; 4+ messages in thread
From: Chris Wilson @ 2017-06-01 16:59 UTC (permalink / raw)
To: intel-gfx; +Cc: Dongwon Kim
Currently, we only mark the CPU cache as dirty if we skip a clflush.
This leads to some confusion where we have to ask if the object is in
the write domain or missed a clflush. If we always mark the cache as
dirty, this becomes a much simply question to answer.
The goal remains to do as few clflushes as required and to do them as
late as possible, in the hope of deferring the work to a kthread and not
block the caller (e.g. execbuf, flips).
v2: Always call clflush before GPU execution when the cache_dirty flag
is set. This may cause some extra work on llc systems that migrate dirty
buffers back and forth - but we do try to limit that by only setting
cache_dirty at the end of the gpu sequence.
v3: Always mark the cache as dirty upon a level change, as we need to
invalidate any stale cachelines due to external writes.
Reported-by: Dongwon Kim <dongwon.kim@intel.com>
Fixes: a6a7cc4b7db6 ("drm/i915: Always flush the dirty CPU cache when pinning the scanout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Dongwon Kim <dongwon.kim@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 76 ++++++++++++++----------
drivers/gpu/drm/i915/i915_gem_clflush.c | 15 +++--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 21 +++----
drivers/gpu/drm/i915/i915_gem_internal.c | 3 +-
drivers/gpu/drm/i915/i915_gem_userptr.c | 5 +-
drivers/gpu/drm/i915/selftests/huge_gem_object.c | 3 +-
6 files changed, 67 insertions(+), 56 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 53c51787d2ed..9fa571523641 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -49,7 +49,7 @@ static void i915_gem_flush_free_objects(struct drm_i915_private *i915);
static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
{
- if (obj->base.write_domain == I915_GEM_DOMAIN_CPU)
+ if (obj->cache_dirty)
return false;
if (!i915_gem_object_is_coherent(obj))
@@ -233,6 +233,14 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
return st;
}
+static void __start_cpu_write(struct drm_i915_gem_object *obj)
+{
+ obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+ obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+ if (cpu_write_needs_clflush(obj))
+ obj->cache_dirty = true;
+}
+
static void
__i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
struct sg_table *pages,
@@ -248,8 +256,7 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
!i915_gem_object_is_coherent(obj))
drm_clflush_sg(pages);
- obj->base.read_domains = I915_GEM_DOMAIN_CPU;
- obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+ __start_cpu_write(obj);
}
static void
@@ -684,6 +691,12 @@ i915_gem_dumb_create(struct drm_file *file,
args->size, &args->handle);
}
+static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
+{
+ return !(obj->cache_level == I915_CACHE_NONE ||
+ obj->cache_level == I915_CACHE_WT);
+}
+
/**
* Creates a new mm object and returns a handle to it.
* @dev: drm device pointer
@@ -753,6 +766,11 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
case I915_GEM_DOMAIN_CPU:
i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
break;
+
+ case I915_GEM_DOMAIN_RENDER:
+ if (gpu_write_needs_clflush(obj))
+ obj->cache_dirty = true;
+ break;
}
obj->base.write_domain = 0;
@@ -854,7 +872,8 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
* optimizes for the case when the gpu will dirty the data
* anyway again before the next pread happens.
*/
- if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU))
+ if (!obj->cache_dirty &&
+ !(obj->base.read_domains & I915_GEM_DOMAIN_CPU))
*needs_clflush = CLFLUSH_BEFORE;
out:
@@ -906,14 +925,16 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj,
* This optimizes for the case when the gpu will use the data
* right away and we therefore have to clflush anyway.
*/
- if (obj->base.write_domain != I915_GEM_DOMAIN_CPU)
+ if (!obj->cache_dirty) {
*needs_clflush |= CLFLUSH_AFTER;
- /* Same trick applies to invalidate partially written cachelines read
- * before writing.
- */
- if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU))
- *needs_clflush |= CLFLUSH_BEFORE;
+ /*
+ * Same trick applies to invalidate partially written
+ * cachelines read before writing.
+ */
+ if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU))
+ *needs_clflush |= CLFLUSH_BEFORE;
+ }
out:
intel_fb_obj_invalidate(obj, ORIGIN_CPU);
@@ -3399,10 +3420,13 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915, unsigned int flags)
static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj)
{
- if (obj->base.write_domain != I915_GEM_DOMAIN_CPU && !obj->cache_dirty)
- return;
-
- i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
+ /*
+ * We manually flush the CPU domain so that we can override and
+ * force the flush for the display, and perform it asyncrhonously.
+ */
+ flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+ if (obj->cache_dirty)
+ i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
obj->base.write_domain = 0;
}
@@ -3661,13 +3685,10 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
}
}
- if (obj->base.write_domain == I915_GEM_DOMAIN_CPU &&
- i915_gem_object_is_coherent(obj))
- obj->cache_dirty = true;
-
list_for_each_entry(vma, &obj->vma_list, obj_link)
vma->node.color = cache_level;
obj->cache_level = cache_level;
+ obj->cache_dirty = true; /* Always invalidate stale cachelines */
return 0;
}
@@ -3889,9 +3910,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
if (ret)
return ret;
- if (obj->base.write_domain == I915_GEM_DOMAIN_CPU)
- return 0;
-
flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
/* Flush the CPU cache if it's still invalid. */
@@ -3903,15 +3921,13 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
/* It should now be out of any other write domains, and we can update
* the domain values for our changes.
*/
- GEM_BUG_ON((obj->base.write_domain & ~I915_GEM_DOMAIN_CPU) != 0);
+ GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
/* If we're writing through the CPU, then the GPU read domains will
* need to be invalidated at next use.
*/
- if (write) {
- obj->base.read_domains = I915_GEM_DOMAIN_CPU;
- obj->base.write_domain = I915_GEM_DOMAIN_CPU;
- }
+ if (write)
+ __start_cpu_write(obj);
return 0;
}
@@ -4332,6 +4348,8 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
} else
obj->cache_level = I915_CACHE_NONE;
+ obj->cache_dirty = !i915_gem_object_is_coherent(obj);
+
trace_i915_gem_object_create(obj);
return obj;
@@ -4998,10 +5016,8 @@ int i915_gem_freeze_late(struct drm_i915_private *dev_priv)
mutex_lock(&dev_priv->drm.struct_mutex);
for (p = phases; *p; p++) {
- list_for_each_entry(obj, *p, global_link) {
- obj->base.read_domains = I915_GEM_DOMAIN_CPU;
- obj->base.write_domain = I915_GEM_DOMAIN_CPU;
- }
+ list_for_each_entry(obj, *p, global_link)
+ __start_cpu_write(obj);
}
mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/i915_gem_clflush.c
index ffac7a1f0caf..17b207e963c2 100644
--- a/drivers/gpu/drm/i915/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/i915_gem_clflush.c
@@ -71,8 +71,6 @@ static const struct dma_fence_ops i915_clflush_ops = {
static void __i915_do_clflush(struct drm_i915_gem_object *obj)
{
drm_clflush_sg(obj->mm.pages);
- obj->cache_dirty = false;
-
intel_fb_obj_flush(obj, ORIGIN_CPU);
}
@@ -81,9 +79,6 @@ static void i915_clflush_work(struct work_struct *work)
struct clflush *clflush = container_of(work, typeof(*clflush), work);
struct drm_i915_gem_object *obj = clflush->obj;
- if (!obj->cache_dirty)
- goto out;
-
if (i915_gem_object_pin_pages(obj)) {
DRM_ERROR("Failed to acquire obj->pages for clflushing\n");
goto out;
@@ -131,10 +126,10 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
* anything not backed by physical memory we consider to be always
* coherent and not need clflushing.
*/
- if (!i915_gem_object_has_struct_page(obj))
+ if (!i915_gem_object_has_struct_page(obj)) {
+ obj->cache_dirty = false;
return;
-
- obj->cache_dirty = true;
+ }
/* If the GPU is snooping the contents of the CPU cache,
* we do not need to manually clear the CPU cache lines. However,
@@ -153,6 +148,8 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
if (!(flags & I915_CLFLUSH_SYNC))
clflush = kmalloc(sizeof(*clflush), GFP_KERNEL);
if (clflush) {
+ GEM_BUG_ON(!obj->cache_dirty);
+
dma_fence_init(&clflush->dma,
&i915_clflush_ops,
&clflush_lock,
@@ -180,4 +177,6 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
} else {
GEM_BUG_ON(obj->base.write_domain != I915_GEM_DOMAIN_CPU);
}
+
+ obj->cache_dirty = false;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 04211c970b9f..87505cfb75c2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -291,7 +291,7 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
return DBG_USE_CPU_RELOC > 0;
return (HAS_LLC(to_i915(obj->base.dev)) ||
- obj->base.write_domain == I915_GEM_DOMAIN_CPU ||
+ obj->cache_dirty ||
obj->cache_level != I915_CACHE_NONE);
}
@@ -1129,10 +1129,8 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
if (vma->exec_entry->flags & EXEC_OBJECT_ASYNC)
continue;
- if (obj->base.write_domain & I915_GEM_DOMAIN_CPU) {
+ if (obj->cache_dirty)
i915_gem_clflush_object(obj, 0);
- obj->base.write_domain = 0;
- }
ret = i915_gem_request_await_object
(req, obj, obj->base.pending_write_domain);
@@ -1265,12 +1263,6 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
return ctx;
}
-static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
-{
- return !(obj->cache_level == I915_CACHE_NONE ||
- obj->cache_level == I915_CACHE_WT);
-}
-
void i915_vma_move_to_active(struct i915_vma *vma,
struct drm_i915_gem_request *req,
unsigned int flags)
@@ -1294,15 +1286,16 @@ void i915_vma_move_to_active(struct i915_vma *vma,
i915_gem_active_set(&vma->last_read[idx], req);
list_move_tail(&vma->vm_link, &vma->vm->active_list);
+ obj->base.write_domain = 0;
if (flags & EXEC_OBJECT_WRITE) {
+ obj->base.write_domain = I915_GEM_DOMAIN_RENDER;
+
if (intel_fb_obj_invalidate(obj, ORIGIN_CS))
i915_gem_active_set(&obj->frontbuffer_write, req);
- /* update for the implicit flush after a batch */
- obj->base.write_domain &= ~I915_GEM_GPU_DOMAINS;
- if (!obj->cache_dirty && gpu_write_needs_clflush(obj))
- obj->cache_dirty = true;
+ obj->base.read_domains = 0;
}
+ obj->base.read_domains |= I915_GEM_GPU_DOMAINS;
if (flags & EXEC_OBJECT_NEEDS_FENCE)
i915_gem_active_set(&vma->last_fence, req);
diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
index fc950abbe400..58e93e87d573 100644
--- a/drivers/gpu/drm/i915/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/i915_gem_internal.c
@@ -188,9 +188,10 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
drm_gem_private_object_init(&i915->drm, &obj->base, size);
i915_gem_object_init(obj, &i915_gem_object_internal_ops);
- obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+ obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+ obj->cache_dirty = !i915_gem_object_is_coherent(obj);
return obj;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 1a0ce1dc68f5..34461e1928bc 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -802,9 +802,10 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
drm_gem_private_object_init(dev, &obj->base, args->user_size);
i915_gem_object_init(obj, &i915_gem_userptr_ops);
- obj->cache_level = I915_CACHE_LLC;
- obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+ obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+ obj->cache_level = I915_CACHE_LLC;
+ obj->cache_dirty = !i915_gem_object_is_coherent(obj);
obj->userptr.ptr = args->user_ptr;
obj->userptr.read_only = !!(args->flags & I915_USERPTR_READ_ONLY);
diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
index 4e681fc13be4..0ca867a877b6 100644
--- a/drivers/gpu/drm/i915/selftests/huge_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
@@ -126,9 +126,10 @@ huge_gem_object(struct drm_i915_private *i915,
drm_gem_private_object_init(&i915->drm, &obj->base, dma_size);
i915_gem_object_init(obj, &huge_ops);
- obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+ obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+ obj->cache_dirty = !i915_gem_object_is_coherent(obj);
obj->scratch = phys_size;
return obj;
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2] drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
2017-06-01 16:59 [PATCH 1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes Chris Wilson
@ 2017-06-01 16:59 ` Chris Wilson
2017-06-01 17:15 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes Patchwork
1 sibling, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2017-06-01 16:59 UTC (permalink / raw)
To: intel-gfx; +Cc: Dongwon Kim
For ease of use (i.e. avoiding a few checks and function calls), store
the object's cache coherency next to the cache is dirty bit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Dongwon Kim <dongwon.kim@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 14 +++++++-------
drivers/gpu/drm/i915/i915_gem_clflush.c | 2 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +-
drivers/gpu/drm/i915/i915_gem_internal.c | 3 ++-
drivers/gpu/drm/i915/i915_gem_object.h | 1 +
drivers/gpu/drm/i915/i915_gem_stolen.c | 1 +
drivers/gpu/drm/i915/i915_gem_userptr.c | 3 ++-
drivers/gpu/drm/i915/selftests/huge_gem_object.c | 3 ++-
8 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9fa571523641..bd1be8f8ce9d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -52,7 +52,7 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
if (obj->cache_dirty)
return false;
- if (!i915_gem_object_is_coherent(obj))
+ if (!obj->cache_coherent)
return true;
return obj->pin_display;
@@ -253,7 +253,7 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
if (needs_clflush &&
(obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0 &&
- !i915_gem_object_is_coherent(obj))
+ !obj->cache_coherent)
drm_clflush_sg(pages);
__start_cpu_write(obj);
@@ -856,8 +856,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
if (ret)
return ret;
- if (i915_gem_object_is_coherent(obj) ||
- !static_cpu_has(X86_FEATURE_CLFLUSH)) {
+ if (obj->cache_coherent || !static_cpu_has(X86_FEATURE_CLFLUSH)) {
ret = i915_gem_object_set_to_cpu_domain(obj, false);
if (ret)
goto err_unpin;
@@ -909,8 +908,7 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj,
if (ret)
return ret;
- if (i915_gem_object_is_coherent(obj) ||
- !static_cpu_has(X86_FEATURE_CLFLUSH)) {
+ if (obj->cache_coherent || !static_cpu_has(X86_FEATURE_CLFLUSH)) {
ret = i915_gem_object_set_to_cpu_domain(obj, true);
if (ret)
goto err_unpin;
@@ -3688,6 +3686,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
list_for_each_entry(vma, &obj->vma_list, obj_link)
vma->node.color = cache_level;
obj->cache_level = cache_level;
+ obj->cache_coherent = i915_gem_object_is_coherent(obj);
obj->cache_dirty = true; /* Always invalidate stale cachelines */
return 0;
@@ -4348,7 +4347,8 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
} else
obj->cache_level = I915_CACHE_NONE;
- obj->cache_dirty = !i915_gem_object_is_coherent(obj);
+ obj->cache_coherent = i915_gem_object_is_coherent(obj);
+ obj->cache_dirty = !obj->cache_coherent;
trace_i915_gem_object_create(obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/i915_gem_clflush.c
index 17b207e963c2..152f16c11878 100644
--- a/drivers/gpu/drm/i915/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/i915_gem_clflush.c
@@ -139,7 +139,7 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
* snooping behaviour occurs naturally as the result of our domain
* tracking.
*/
- if (!(flags & I915_CLFLUSH_FORCE) && i915_gem_object_is_coherent(obj))
+ if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent)
return;
trace_i915_gem_object_clflush(obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 87505cfb75c2..3d34badc4838 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1129,7 +1129,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req,
if (vma->exec_entry->flags & EXEC_OBJECT_ASYNC)
continue;
- if (obj->cache_dirty)
+ if (obj->cache_dirty & ~obj->cache_coherent)
i915_gem_clflush_object(obj, 0);
ret = i915_gem_request_await_object
diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
index 58e93e87d573..568bf83af1f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/i915_gem_internal.c
@@ -191,7 +191,8 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
- obj->cache_dirty = !i915_gem_object_is_coherent(obj);
+ obj->cache_coherent = i915_gem_object_is_coherent(obj);
+ obj->cache_dirty = !obj->cache_coherent;
return obj;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 915057824284..adb482b00271 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -121,6 +121,7 @@ struct drm_i915_gem_object {
unsigned long gt_ro:1;
unsigned int cache_level:3;
unsigned int cache_dirty:1;
+ unsigned int cache_coherent:1;
atomic_t frontbuffer_bits;
unsigned int frontbuffer_ggtt_origin; /* write once */
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 681db6083f4d..a817b3e0b17e 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -590,6 +590,7 @@ _i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
obj->stolen = stolen;
obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
obj->cache_level = HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE;
+ obj->cache_coherent = true; /* assumptions! more like cache_oblivious */
if (i915_gem_object_pin_pages(obj))
goto cleanup;
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 34461e1928bc..05c36f663550 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -805,7 +805,8 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->cache_level = I915_CACHE_LLC;
- obj->cache_dirty = !i915_gem_object_is_coherent(obj);
+ obj->cache_coherent = i915_gem_object_is_coherent(obj);
+ obj->cache_dirty = !obj->cache_coherent;
obj->userptr.ptr = args->user_ptr;
obj->userptr.read_only = !!(args->flags & I915_USERPTR_READ_ONLY);
diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
index 0ca867a877b6..caf76af36aba 100644
--- a/drivers/gpu/drm/i915/selftests/huge_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
@@ -129,7 +129,8 @@ huge_gem_object(struct drm_i915_private *i915,
obj->base.read_domains = I915_GEM_DOMAIN_CPU;
obj->base.write_domain = I915_GEM_DOMAIN_CPU;
obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
- obj->cache_dirty = !i915_gem_object_is_coherent(obj);
+ obj->cache_coherent = i915_gem_object_is_coherent(obj);
+ obj->cache_dirty = !obj->cache_coherent;
obj->scratch = phys_size;
return obj;
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 4+ messages in thread
* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
2017-06-01 16:59 [PATCH 1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes Chris Wilson
2017-06-01 16:59 ` [PATCH 2/2] drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty Chris Wilson
@ 2017-06-01 17:15 ` Patchwork
1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-06-01 17:15 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
URL : https://patchwork.freedesktop.org/series/25173/
State : success
== Summary ==
Series 25173v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/25173/revisions/1/mbox/
Test kms_cursor_legacy:
Subgroup basic-busy-flip-before-cursor-legacy:
pass -> FAIL (fi-snb-2600) fdo#100215
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time:444s
fi-bdw-gvtdvm total:278 pass:256 dwarn:8 dfail:0 fail:0 skip:14 time:432s
fi-bsw-n3050 total:278 pass:242 dwarn:0 dfail:0 fail:0 skip:36 time:573s
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time:515s
fi-byt-j1900 total:278 pass:254 dwarn:0 dfail:0 fail:0 skip:24 time:482s
fi-byt-n2820 total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:489s
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:426s
fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:406s
fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time:424s
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:490s
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:464s
fi-kbl-7500u total:278 pass:255 dwarn:5 dfail:0 fail:0 skip:18 time:465s
fi-kbl-7560u total:278 pass:263 dwarn:5 dfail:0 fail:0 skip:10 time:568s
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:460s
fi-skl-6700hq total:278 pass:239 dwarn:0 dfail:1 fail:17 skip:21 time:427s
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time:468s
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:498s
fi-skl-gvtdvm total:278 pass:265 dwarn:0 dfail:0 fail:0 skip:13 time:436s
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:539s
fi-snb-2600 total:278 pass:248 dwarn:0 dfail:0 fail:1 skip:29 time:406s
2c9abf8ec88cf4b3d3735976bb0ff7a4991946b2 drm-tip: 2017y-06m-01d-13h-29m-05s UTC integration manifest
77ab5e1 drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
9823c27 drm/i915: Mark CPU cache as dirty on every transition for CPU writes
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4859/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-06-01 17:15 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-01 16:59 [PATCH 1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes Chris Wilson
2017-06-01 16:59 ` [PATCH 2/2] drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty Chris Wilson
2017-06-01 17:15 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes Patchwork
-- strict thread matches above, loose matches on Subject: below --
2017-04-27 14:46 [PATCH 1/2] " Chris Wilson
2017-04-27 15:07 ` ✓ Fi.CI.BAT: success for series starting with [1/2] " Patchwork
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).