* [PATCH 0/3] drm/i915: mark GEM objects as dirtied by CPU @ 2015-12-08 16:51 Dave Gordon 2015-12-08 16:51 ` [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU Dave Gordon ` (3 more replies) 0 siblings, 4 replies; 35+ messages in thread From: Dave Gordon @ 2015-12-08 16:51 UTC (permalink / raw) To: intel-gfx This patchset covers three variations on GEM objects being dirtied by means of CPU writes. The first is simple: an object has been entirely filled with data via CPU writes, and the whole object is therefore dirty (i.e. backing store is out-of-date w.r.t. current contents). For these cases, marking the individual pages at the point of writing would not be a win. The second patch covers cases where only one page is actually written; these are candidates for a future optimisation where only the specific page is marked dirty (only applicable to objects with page-tracked backing store). The third deals with a couple of other apparently missed cases where there might be an opportunity for updates to be dropped. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU 2015-12-08 16:51 [PATCH 0/3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon @ 2015-12-08 16:51 ` Dave Gordon 2015-12-08 17:00 ` Chris Wilson 2015-12-08 16:51 ` [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated " Dave Gordon ` (2 subsequent siblings) 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-08 16:51 UTC (permalink / raw) To: intel-gfx In various places, a GEM object is filled with data by means of CPU writes. In such cases, the object should be marked dirty, to ensure that the data is not discarded if the object is evicted under memory pressure. This incorporates and supercedes Alex Dai's earlier patch [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Dai <yu.dai@intel.com> --- drivers/gpu/drm/i915/i915_cmd_parser.c | 1 + drivers/gpu/drm/i915/i915_gem.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 814d894..292bd5d 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -945,6 +945,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, drm_clflush_virt_range(src, batch_len); memcpy(dst, src, batch_len); + dest_obj->dirty = 1; unmap_src: vunmap(src_base); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index dfaf25b..12f68f4 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5209,6 +5209,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, i915_gem_object_pin_pages(obj); sg = obj->pages; bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); + obj->dirty = 1; i915_gem_object_unpin_pages(obj); if (WARN_ON(bytes != size)) { -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU 2015-12-08 16:51 ` [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU Dave Gordon @ 2015-12-08 17:00 ` Chris Wilson 2015-12-08 18:06 ` Dave Gordon 0 siblings, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-08 17:00 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 04:51:16PM +0000, Dave Gordon wrote: > In various places, a GEM object is filled with data by means of CPU > writes. In such cases, the object should be marked dirty, to ensure that > the data is not discarded if the object is evicted under memory > pressure. > > This incorporates and supercedes Alex Dai's earlier patch > [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Alex Dai <yu.dai@intel.com> > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 1 + > drivers/gpu/drm/i915/i915_gem.c | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 814d894..292bd5d 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -945,6 +945,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, > drm_clflush_virt_range(src, batch_len); > > memcpy(dst, src, batch_len); > + dest_obj->dirty = 1; There is no bug here. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU 2015-12-08 17:00 ` Chris Wilson @ 2015-12-08 18:06 ` Dave Gordon 2015-12-10 10:49 ` Daniel Vetter 0 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-08 18:06 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Alex Dai On 08/12/15 17:00, Chris Wilson wrote: > On Tue, Dec 08, 2015 at 04:51:16PM +0000, Dave Gordon wrote: >> In various places, a GEM object is filled with data by means of CPU >> writes. In such cases, the object should be marked dirty, to ensure that >> the data is not discarded if the object is evicted under memory >> pressure. >> >> This incorporates and supercedes Alex Dai's earlier patch >> [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted >> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> Cc: Alex Dai <yu.dai@intel.com> >> --- >> drivers/gpu/drm/i915/i915_cmd_parser.c | 1 + >> drivers/gpu/drm/i915/i915_gem.c | 1 + >> 2 files changed, 2 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c >> index 814d894..292bd5d 100644 >> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c >> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c >> @@ -945,6 +945,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, >> drm_clflush_virt_range(src, batch_len); >> >> memcpy(dst, src, batch_len); >> + dest_obj->dirty = 1; > > There is no bug here. > -Chris Because the shadow batch has been page-pinned by the caller? Two questions, then: 1. Do we reuse the same shadow batch if the same source batch is resubmitted? If so, can we be sure that it has stayed pinned (one way or another) for the entire intervening period? 2. If the shadow batch can never be unpinned, why do we allocate it as a GEM object with backing store which can apparently never be used. We could get rid of the shmfs setup overhead by choosing an object type without backing store. .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU 2015-12-08 18:06 ` Dave Gordon @ 2015-12-10 10:49 ` Daniel Vetter 0 siblings, 0 replies; 35+ messages in thread From: Daniel Vetter @ 2015-12-10 10:49 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 06:06:42PM +0000, Dave Gordon wrote: > On 08/12/15 17:00, Chris Wilson wrote: > >On Tue, Dec 08, 2015 at 04:51:16PM +0000, Dave Gordon wrote: > >>In various places, a GEM object is filled with data by means of CPU > >>writes. In such cases, the object should be marked dirty, to ensure that > >>the data is not discarded if the object is evicted under memory > >>pressure. > >> > >>This incorporates and supercedes Alex Dai's earlier patch > >>[PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted > >> > >>Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > >>Cc: Alex Dai <yu.dai@intel.com> > >>--- > >> drivers/gpu/drm/i915/i915_cmd_parser.c | 1 + > >> drivers/gpu/drm/i915/i915_gem.c | 1 + > >> 2 files changed, 2 insertions(+) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > >>index 814d894..292bd5d 100644 > >>--- a/drivers/gpu/drm/i915/i915_cmd_parser.c > >>+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > >>@@ -945,6 +945,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, > >> drm_clflush_virt_range(src, batch_len); > >> > >> memcpy(dst, src, batch_len); > >>+ dest_obj->dirty = 1; > > > >There is no bug here. > >-Chris > > Because the shadow batch has been page-pinned by the caller? Two questions, > then: > > 1. Do we reuse the same shadow batch if the same source batch is > resubmitted? If so, can we be sure that it has stayed pinned > (one way or another) for the entire intervening period? > > 2. If the shadow batch can never be unpinned, why do we allocate > it as a GEM object with backing store which can apparently > never be used. We could get rid of the shmfs setup overhead by > choosing an object type without backing store. We don't care about coherency once the shadow batch is retired (we mark it as purgeable even, which flat-out ignores ->dirty). We use shmem because we need some memory for it, there's no copying to the backing storage (like ttm does when evicting from vram to shmem). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated by the CPU 2015-12-08 16:51 [PATCH 0/3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-08 16:51 ` [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU Dave Gordon @ 2015-12-08 16:51 ` Dave Gordon 2015-12-08 17:00 ` Chris Wilson 2015-12-08 16:51 ` [PATCH 3/3] drm/i915: mark GEM objects as dirty when written " Dave Gordon 2015-12-09 15:52 ` [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-08 16:51 UTC (permalink / raw) To: intel-gfx In various places, one or more pages of a GEM object are mapped into CPU address space and updated. In each such case, either the page or the the object should be marked dirty, to ensure that the modifications are not discarded if the object is evicted under memory pressure. Ideally, we would like to mark only the updated pages dirty; but it isn't clear at this point whether this will work for all types of GEM objects (regular/gtt, phys, stolen, userptr, dmabuf, ...). So for now, let's ensure correctness by marking the whole object dirty. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++ drivers/gpu/drm/i915/i915_gem_render_state.c | 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 1 + drivers/gpu/drm/i915/intel_lrc.c | 6 +++++- 4 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a4c243c..bc28a10 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, } kunmap_atomic(vaddr); + obj->dirty = 1; return 0; } @@ -372,6 +373,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj, } kunmap_atomic(vaddr); + obj->dirty = 1; return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 5026a62..dd1976c 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -144,6 +144,7 @@ static int render_state_setup(struct render_state *so) so->aux_batch_size = ALIGN(so->aux_batch_size, 8); kunmap(page); + so->obj->dirty = 1; ret = i915_gem_object_set_to_gtt_domain(so->obj, false); if (ret) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 0d23785b..c0e58f8 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -574,6 +574,7 @@ static void lr_context_update(struct drm_i915_gem_request *rq) reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj); kunmap_atomic(reg_state); + ctx_obj->dirty = 1; } /** diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 4ebafab..bc77794 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -391,6 +391,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq) } kunmap_atomic(reg_state); + ctx_obj->dirty = 1; return 0; } @@ -1030,7 +1031,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring, if (ret) goto unpin_ctx_obj; - ctx_obj->dirty = true; + ctx_obj->dirty = 1; /* Invalidate GuC TLB. */ if (i915.enable_guc_submission) @@ -1461,6 +1462,8 @@ static int intel_init_workaround_bb(struct intel_engine_cs *ring) out: kunmap_atomic(batch); + wa_ctx->obj->dirty = 1; + if (ret) lrc_destroy_wa_ctx_obj(ring); @@ -2536,6 +2539,7 @@ void intel_lr_context_reset(struct drm_device *dev, reg_state[CTX_RING_TAIL+1] = 0; kunmap_atomic(reg_state); + ctx_obj->dirty = 1; ringbuf->head = 0; ringbuf->tail = 0; -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated by the CPU 2015-12-08 16:51 ` [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated " Dave Gordon @ 2015-12-08 17:00 ` Chris Wilson 2015-12-08 18:43 ` Dave Gordon 0 siblings, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-08 17:00 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 04:51:17PM +0000, Dave Gordon wrote: > In various places, one or more pages of a GEM object are mapped into CPU > address space and updated. In each such case, either the page or the the > object should be marked dirty, to ensure that the modifications are not > discarded if the object is evicted under memory pressure. > > Ideally, we would like to mark only the updated pages dirty; but it > isn't clear at this point whether this will work for all types of GEM > objects (regular/gtt, phys, stolen, userptr, dmabuf, ...). So for now, > let's ensure correctness by marking the whole object dirty. > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++ > drivers/gpu/drm/i915/i915_gem_render_state.c | 1 + > drivers/gpu/drm/i915/i915_guc_submission.c | 1 + > drivers/gpu/drm/i915/intel_lrc.c | 6 +++++- > 4 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index a4c243c..bc28a10 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, > } > > kunmap_atomic(vaddr); > + obj->dirty = 1; Nak. CPU dirtying is a per-page interface. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated by the CPU 2015-12-08 17:00 ` Chris Wilson @ 2015-12-08 18:43 ` Dave Gordon 0 siblings, 0 replies; 35+ messages in thread From: Dave Gordon @ 2015-12-08 18:43 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Alex Dai On 08/12/15 17:00, Chris Wilson wrote: > On Tue, Dec 08, 2015 at 04:51:17PM +0000, Dave Gordon wrote: >> In various places, one or more pages of a GEM object are mapped into CPU >> address space and updated. In each such case, either the page or the the >> object should be marked dirty, to ensure that the modifications are not >> discarded if the object is evicted under memory pressure. >> >> Ideally, we would like to mark only the updated pages dirty; but it >> isn't clear at this point whether this will work for all types of GEM >> objects (regular/gtt, phys, stolen, userptr, dmabuf, ...). So for now, >> let's ensure correctness by marking the whole object dirty. >> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> --- >> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++ >> drivers/gpu/drm/i915/i915_gem_render_state.c | 1 + >> drivers/gpu/drm/i915/i915_guc_submission.c | 1 + >> drivers/gpu/drm/i915/intel_lrc.c | 6 +++++- >> 4 files changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c >> index a4c243c..bc28a10 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c >> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c >> @@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, >> } >> >> kunmap_atomic(vaddr); >> + obj->dirty = 1; > Nak. CPU dirtying is a per-page interface. > -Chris That's what my commit message said. But let's at least have /correct/ behaviour while we work out which object types we (can) support here. Also, in: if (use_cpu_reloc(obj)) ret = relocate_entry_cpu(obj, reloc, target_offset); else if (obj->map_and_fenceable) ret = relocate_entry_gtt(obj, reloc, target_offset); else if (cpu_has_clflush) ret = relocate_entry_clflush(obj, reloc, target_offset); both the other routines parallel to relocate_entry_cpu() [i.e. relocate_entry_gtt() and relocate_entry_clflush()] mark the whole object dirty. Why be inconsistent? Can we be sure that the object in question actually has per-page tracking of dirty pages. shmfs objects do, but not phys, which only has object-level dirty tracking. Can we guarantee that only the right sort of objects will be handled here? And when stolen objects are exposed to the user? .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 3/3] drm/i915: mark GEM objects as dirty when written by the CPU 2015-12-08 16:51 [PATCH 0/3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-08 16:51 ` [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU Dave Gordon 2015-12-08 16:51 ` [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated " Dave Gordon @ 2015-12-08 16:51 ` Dave Gordon 2015-12-08 17:03 ` Chris Wilson 2015-12-09 15:52 ` [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-08 16:51 UTC (permalink / raw) To: intel-gfx This patch covers a couple more places where a GEM object is (or may be) modified by means of CPU writes, and should therefore be marked dirty to ensure that the changes are not lost in the evenof of the object is evicted under memory pressure. It may be possible to optimise these paths later, by marking only specific pages of the object as dirty (for objects backed by shmfs pages); but for now let's ensure correctness by dirtying the whole object. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem.c | 4 +++- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 12f68f4..36b9539 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, i915_gem_object_pin_pages(obj); offset = args->offset; - obj->dirty = 1; for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, offset >> PAGE_SHIFT) { @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, goto out; } + /* Object backing store will be out of date hereafter */ + obj->dirty = 1; + trace_i915_gem_object_pwrite(obj, args->offset, args->size); ret = -EFAULT; diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index e9c2bfd..49a74c6 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -208,6 +208,8 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size return ret; ret = i915_gem_object_set_to_cpu_domain(obj, write); + if (write) + obj->dirty = 1; mutex_unlock(&dev->struct_mutex); return ret; } -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 3/3] drm/i915: mark GEM objects as dirty when written by the CPU 2015-12-08 16:51 ` [PATCH 3/3] drm/i915: mark GEM objects as dirty when written " Dave Gordon @ 2015-12-08 17:03 ` Chris Wilson 2015-12-08 18:24 ` Dave Gordon 0 siblings, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-08 17:03 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 04:51:18PM +0000, Dave Gordon wrote: > This patch covers a couple more places where a GEM object is (or may be) > modified by means of CPU writes, and should therefore be marked dirty to > ensure that the changes are not lost in the evenof of the object is > evicted under memory pressure. > > It may be possible to optimise these paths later, by marking only > specific pages of the object as dirty (for objects backed by shmfs > pages); but for now let's ensure correctness by dirtying the whole > object. > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_gem.c | 4 +++- > drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 ++ > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 12f68f4..36b9539 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > i915_gem_object_pin_pages(obj); > > offset = args->offset; > - obj->dirty = 1; > > for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, > offset >> PAGE_SHIFT) { > @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, > goto out; > } > > + /* Object backing store will be out of date hereafter */ > + obj->dirty = 1; Possibly. I'd rather just have shmem_pwrite be consistent and use set_page_dirty. It is baked into the code that it doesn't access every page. > trace_i915_gem_object_pwrite(obj, args->offset, args->size); > > ret = -EFAULT; > diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c > index e9c2bfd..49a74c6 100644 > --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c > @@ -208,6 +208,8 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size > return ret; > > ret = i915_gem_object_set_to_cpu_domain(obj, write); > + if (write) > + obj->dirty = 1; No. The accessor here should already be using set_page_dirty. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 3/3] drm/i915: mark GEM objects as dirty when written by the CPU 2015-12-08 17:03 ` Chris Wilson @ 2015-12-08 18:24 ` Dave Gordon 2015-12-10 13:10 ` Daniel Vetter 0 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-08 18:24 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Alex Dai On 08/12/15 17:03, Chris Wilson wrote: > On Tue, Dec 08, 2015 at 04:51:18PM +0000, Dave Gordon wrote: >> This patch covers a couple more places where a GEM object is (or may be) >> modified by means of CPU writes, and should therefore be marked dirty to >> ensure that the changes are not lost in the evenof of the object is >> evicted under memory pressure. >> >> It may be possible to optimise these paths later, by marking only >> specific pages of the object as dirty (for objects backed by shmfs >> pages); but for now let's ensure correctness by dirtying the whole >> object. >> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> --- >> drivers/gpu/drm/i915/i915_gem.c | 4 +++- >> drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 ++ >> 2 files changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c >> index 12f68f4..36b9539 100644 >> --- a/drivers/gpu/drm/i915/i915_gem.c >> +++ b/drivers/gpu/drm/i915/i915_gem.c >> @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, >> i915_gem_object_pin_pages(obj); >> >> offset = args->offset; >> - obj->dirty = 1; >> >> for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, >> offset >> PAGE_SHIFT) { >> @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, >> goto out; >> } >> >> + /* Object backing store will be out of date hereafter */ >> + obj->dirty = 1; > > Possibly. I'd rather just have shmem_pwrite be consistent and use > set_page_dirty. It is baked into the code that it doesn't access every > page. It wasn't the shmem path that was the problem; this line was previously inside i915_gem_shmem_pwrite() above. But i915_gem_phys_pwrite() was missing the corresponding line, so it was simpler to move marking the object dirty up to the top level of the ioctl for now, especially as i915_gem_gtt_pwrite_fast() might or might not have marked the object in the case where it returns early. We could at some time in the future devolve object marking to a class-specific vfunc, at which point this line would disappear again; but we'd have to implement it in each class, or at least the ones that users can call pwrite on (shmem, phys, and eventually stolen?). Of those, shmem can do per-page dirtying, but phys can stolen can't (stolen doesn't even have "struct page" entries available). Which is why it's simpler to just mark the whole object here and let put_pages() deal with it later (if ever -- if the object is never actually swapped out then marking the object incurs LESS overhead than marking all the pages). >> trace_i915_gem_object_pwrite(obj, args->offset, args->size); >> >> ret = -EFAULT; >> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c >> index e9c2bfd..49a74c6 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c >> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c >> @@ -208,6 +208,8 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size >> return ret; >> >> ret = i915_gem_object_set_to_cpu_domain(obj, write); >> + if (write) >> + obj->dirty = 1; > > No. The accessor here should already be using set_page_dirty. > -Chris What function would that be? I can't find any calls to set_page_dirty() in this source file. OTOH, does a dmabuf object have shmfs backing store anyway? .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 3/3] drm/i915: mark GEM objects as dirty when written by the CPU 2015-12-08 18:24 ` Dave Gordon @ 2015-12-10 13:10 ` Daniel Vetter 0 siblings, 0 replies; 35+ messages in thread From: Daniel Vetter @ 2015-12-10 13:10 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 06:24:40PM +0000, Dave Gordon wrote: > On 08/12/15 17:03, Chris Wilson wrote: > >On Tue, Dec 08, 2015 at 04:51:18PM +0000, Dave Gordon wrote: > >>This patch covers a couple more places where a GEM object is (or may be) > >>modified by means of CPU writes, and should therefore be marked dirty to > >>ensure that the changes are not lost in the evenof of the object is > >>evicted under memory pressure. > >> > >>It may be possible to optimise these paths later, by marking only > >>specific pages of the object as dirty (for objects backed by shmfs > >>pages); but for now let's ensure correctness by dirtying the whole > >>object. > >> > >>Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > >>--- > >> drivers/gpu/drm/i915/i915_gem.c | 4 +++- > >> drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 ++ > >> 2 files changed, 5 insertions(+), 1 deletion(-) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > >>index 12f68f4..36b9539 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem.c > >>+++ b/drivers/gpu/drm/i915/i915_gem.c > >>@@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > >> i915_gem_object_pin_pages(obj); > >> > >> offset = args->offset; > >>- obj->dirty = 1; > >> > >> for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, > >> offset >> PAGE_SHIFT) { > >>@@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, > >> goto out; > >> } > >> > >>+ /* Object backing store will be out of date hereafter */ > >>+ obj->dirty = 1; > > > >Possibly. I'd rather just have shmem_pwrite be consistent and use > >set_page_dirty. It is baked into the code that it doesn't access every > >page. > > It wasn't the shmem path that was the problem; this line was previously > inside i915_gem_shmem_pwrite() above. But i915_gem_phys_pwrite() was missing > the corresponding line, so it was simpler to move marking the object dirty > up to the top level of the ioctl for now, especially as > i915_gem_gtt_pwrite_fast() might or might not have marked the object in the > case where it returns early. > > We could at some time in the future devolve object marking to a > class-specific vfunc, at which point this line would disappear again; but > we'd have to implement it in each class, or at least the ones that users can > call pwrite on (shmem, phys, and eventually stolen?). Of those, shmem can do > per-page dirtying, but phys can stolen can't (stolen doesn't even have > "struct page" entries available). > > Which is why it's simpler to just mark the whole object here and let > put_pages() deal with it later (if ever -- if the object is never actually > swapped out then marking the object incurs LESS overhead than marking all > the pages). > > >> trace_i915_gem_object_pwrite(obj, args->offset, args->size); > >> > >> ret = -EFAULT; > >>diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c > >>index e9c2bfd..49a74c6 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c > >>+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c > >>@@ -208,6 +208,8 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size > >> return ret; > >> > >> ret = i915_gem_object_set_to_cpu_domain(obj, write); > >>+ if (write) > >>+ obj->dirty = 1; > > > >No. The accessor here should already be using set_page_dirty. > >-Chris > > What function would that be? I can't find any calls to set_page_dirty() in > this source file. OTOH, does a dmabuf object have shmfs backing store > anyway? I think there's indeed a bug here, and setting obj->dirty is the right defensive solution. For mmap access from userspace (once that happens) we'd be covered by the set_page_dirty shmem would do. But for kernel-internal users (dma-buf vmap, used e.g. by udl) there indeed seems to be no one setting obj->dirty. And that code definitely needs to be somewhere in i915. I guess we could make a case whether we should set obj->dirty in vmap instead - since we don't support the dma-buf kmap stuff there's no problem there. But indeed this should be set somewhere. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU 2015-12-08 16:51 [PATCH 0/3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon ` (2 preceding siblings ...) 2015-12-08 16:51 ` [PATCH 3/3] drm/i915: mark GEM objects as dirty when written " Dave Gordon @ 2015-12-09 15:52 ` Dave Gordon 2015-12-09 15:52 ` [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon ` (2 more replies) 3 siblings, 3 replies; 35+ messages in thread From: Dave Gordon @ 2015-12-09 15:52 UTC (permalink / raw) To: intel-gfx This patchset covers various places where GEM objects are dirtied by means of CPU writes. The first patch covers cases where only one page is actually written; here we can mark just the specific page in the pagecache dirty. This applies to regular (shmfs-backed) objects only. The second patch covers situations where a subrange that is not limited to a single page is modified, or a whole object is filled with data via CPU writes. In either case, the object is now dirty (i.e. backing store is out-of-date w.r.t. current contents) and must be marked so or risk losing its contents if evicted. For the whole-object cases, marking the individual pages at the point of writing would not be a win; instead put_pages() will propagate the object-dirty flag to each page iff the object is ever evicted. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-09 15:52 ` [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon @ 2015-12-09 15:52 ` Dave Gordon 2015-12-10 13:29 ` Chris Wilson 2015-12-09 15:52 ` [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents Dave Gordon 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-09 15:52 UTC (permalink / raw) To: intel-gfx In various places, a single page of a (regular) GEM object is mapped into CPU address space and updated. In each such case, either the page or the the object should be marked dirty, to ensure that the modifications are not discarded if the object is evicted under memory pressure. The typical sequence is: va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); *(va+offset) = ... kunmap_atomic(va); Here we introduce i915_gem_object_get_dirty_page(), which performs the same operation as i915_gem_object_get_page() but with the side-effect of marking the returned page dirty in the pagecache. This will ensure that if the object is subsequently evicted (due to memory pressure), the changes are written to backing store rather than discarded. Note that it works only for regular (shmfs-backed) GEM objects, but (at least for now) those are the only ones that are updated in this way -- the objects in question are contexts and batchbuffers, which are always shmfs-backed. A separate patch deals with the case where whole objects are (or may be) dirtied. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/i915_gem.c | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 ++-- drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c | 11 ++++------- 6 files changed, 26 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f1a8a53..ca77392 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2894,6 +2894,9 @@ static inline int __sg_page_count(struct scatterlist *sg) return sg->length >> PAGE_SHIFT; } +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n); + static inline struct page * i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index dfaf25b..06a5f39 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5184,6 +5184,21 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) return false; } +/* Like i915_gem_object_get_page(), but mark the returned page dirty */ +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n) +{ + struct page *page; + + /* Only default objects have per-page dirty tracking */ + if (WARN_ON(obj->ops != &i915_gem_object_ops)) + return NULL; + + page = i915_gem_object_get_page(obj, n); + set_page_dirty(page); + return page; +} + /* Allocate a new GEM object and fill it with the supplied data */ struct drm_i915_gem_object * i915_gem_object_create_from_data(struct drm_device *dev, diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a4c243c..81796cc 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -264,7 +264,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, if (ret) return ret; - vaddr = kmap_atomic(i915_gem_object_get_page(obj, + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, reloc->offset >> PAGE_SHIFT)); *(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta); @@ -355,7 +355,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj, if (ret) return ret; - vaddr = kmap_atomic(i915_gem_object_get_page(obj, + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, reloc->offset >> PAGE_SHIFT)); clflush_write32(vaddr + page_offset, lower_32_bits(delta)); diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 5026a62..fc7e6d5 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -103,7 +103,7 @@ static int render_state_setup(struct render_state *so) if (ret) return ret; - page = sg_page(so->obj->pages->sgl); + page = i915_gem_object_get_dirty_page(so->obj, 0); d = kmap(page); while (i < rodata->batch_items) { diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 0d23785b..05aa7e6 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -568,7 +568,7 @@ static void lr_context_update(struct drm_i915_gem_request *rq) WARN_ON(!i915_gem_obj_is_pinned(ctx_obj)); WARN_ON(!i915_gem_obj_is_pinned(rb_obj)); - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 4ebafab..ceccecc 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -372,7 +372,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq) WARN_ON(!i915_gem_obj_is_pinned(ctx_obj)); WARN_ON(!i915_gem_obj_is_pinned(rb_obj)); - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); reg_state[CTX_RING_TAIL+1] = rq->tail; @@ -1425,7 +1425,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *ring) return ret; } - page = i915_gem_object_get_page(wa_ctx->obj, 0); + page = i915_gem_object_get_dirty_page(wa_ctx->obj, 0); batch = kmap_atomic(page); offset = 0; @@ -2257,7 +2257,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o /* The second page of the context object contains some fields which must * be set up prior to the first execution. */ - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); /* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM @@ -2343,9 +2343,6 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o } kunmap_atomic(reg_state); - - ctx_obj->dirty = 1; - set_page_dirty(page); i915_gem_object_unpin_pages(ctx_obj); return 0; @@ -2529,7 +2526,7 @@ void intel_lr_context_reset(struct drm_device *dev, WARN(1, "Failed get_pages for context obj\n"); continue; } - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); reg_state[CTX_RING_HEAD+1] = 0; -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-09 15:52 ` [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon @ 2015-12-10 13:29 ` Chris Wilson 2015-12-10 17:24 ` Dave Gordon 0 siblings, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-10 13:29 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Wed, Dec 09, 2015 at 03:52:51PM +0000, Dave Gordon wrote: > In various places, a single page of a (regular) GEM object is mapped into > CPU address space and updated. In each such case, either the page or the > the object should be marked dirty, to ensure that the modifications are > not discarded if the object is evicted under memory pressure. > > The typical sequence is: > va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); > *(va+offset) = ... > kunmap_atomic(va); > > Here we introduce i915_gem_object_get_dirty_page(), which performs the > same operation as i915_gem_object_get_page() but with the side-effect > of marking the returned page dirty in the pagecache. This will ensure > that if the object is subsequently evicted (due to memory pressure), > the changes are written to backing store rather than discarded. > > Note that it works only for regular (shmfs-backed) GEM objects, but (at > least for now) those are the only ones that are updated in this way -- > the objects in question are contexts and batchbuffers, which are always > shmfs-backed. > > A separate patch deals with the case where whole objects are (or may > be) dirtied. > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> I like this. There are places were we do both obj->dirty and set_page_dirty(), but this so much more clearly shows what is going on. All of these locations should be infrequent (or at least have patches to make them so), so moving the call out-of-line will also be a benefit. > /* Allocate a new GEM object and fill it with the supplied data */ > struct drm_i915_gem_object * > i915_gem_object_create_from_data(struct drm_device *dev, > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index a4c243c..81796cc 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -264,7 +264,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, > if (ret) > return ret; > > - vaddr = kmap_atomic(i915_gem_object_get_page(obj, > + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, > reloc->offset >> PAGE_SHIFT)); > *(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta); > > @@ -355,7 +355,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj, > if (ret) > return ret; > > - vaddr = kmap_atomic(i915_gem_object_get_page(obj, > + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, > reloc->offset >> PAGE_SHIFT)); > clflush_write32(vaddr + page_offset, lower_32_bits(delta)); > The relocation functions may dirty pairs of pages. Other than that, I think you have the right mix of callsites. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-10 13:29 ` Chris Wilson @ 2015-12-10 17:24 ` Dave Gordon 2015-12-10 21:04 ` Chris Wilson 0 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-10 17:24 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Alex Dai On 10/12/15 13:29, Chris Wilson wrote: > On Wed, Dec 09, 2015 at 03:52:51PM +0000, Dave Gordon wrote: >> In various places, a single page of a (regular) GEM object is mapped into >> CPU address space and updated. In each such case, either the page or the >> the object should be marked dirty, to ensure that the modifications are >> not discarded if the object is evicted under memory pressure. >> >> The typical sequence is: >> va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); >> *(va+offset) = ... >> kunmap_atomic(va); >> >> Here we introduce i915_gem_object_get_dirty_page(), which performs the >> same operation as i915_gem_object_get_page() but with the side-effect >> of marking the returned page dirty in the pagecache. This will ensure >> that if the object is subsequently evicted (due to memory pressure), >> the changes are written to backing store rather than discarded. >> >> Note that it works only for regular (shmfs-backed) GEM objects, but (at >> least for now) those are the only ones that are updated in this way -- >> the objects in question are contexts and batchbuffers, which are always >> shmfs-backed. >> >> A separate patch deals with the case where whole objects are (or may >> be) dirtied. >> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> > > I like this. There are places were we do both obj->dirty and > set_page_dirty(), but this so much more clearly shows what is going on. > All of these locations should be infrequent (or at least have patches to > make them so), so moving the call out-of-line will also be a benefit. I think there was only one place that both called set_page_dirty() AND set obj->dirty, which was in populate_lr_context(). You'll see that I've eliminated both in favour of a call to get_dirty_page() :) >> /* Allocate a new GEM object and fill it with the supplied data */ >> struct drm_i915_gem_object * >> i915_gem_object_create_from_data(struct drm_device *dev, >> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c >> index a4c243c..81796cc 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c >> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c >> @@ -264,7 +264,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, >> if (ret) >> return ret; >> >> - vaddr = kmap_atomic(i915_gem_object_get_page(obj, >> + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, >> reloc->offset >> PAGE_SHIFT)); >> *(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta); >> >> @@ -355,7 +355,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj, >> if (ret) >> return ret; >> >> - vaddr = kmap_atomic(i915_gem_object_get_page(obj, >> + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, >> reloc->offset >> PAGE_SHIFT)); >> clflush_write32(vaddr + page_offset, lower_32_bits(delta)); >> > > The relocation functions may dirty pairs of pages. Other than that, I > think you have the right mix of callsites. > -Chris Thanks, I've added the other two to the next (v3) version :) .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-10 17:24 ` Dave Gordon @ 2015-12-10 21:04 ` Chris Wilson 2015-12-11 17:08 ` Daniel Vetter 0 siblings, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-10 21:04 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Thu, Dec 10, 2015 at 05:24:42PM +0000, Dave Gordon wrote: > On 10/12/15 13:29, Chris Wilson wrote: > >On Wed, Dec 09, 2015 at 03:52:51PM +0000, Dave Gordon wrote: > >>In various places, a single page of a (regular) GEM object is mapped into > >>CPU address space and updated. In each such case, either the page or the > >>the object should be marked dirty, to ensure that the modifications are > >>not discarded if the object is evicted under memory pressure. > >> > >>The typical sequence is: > >> va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); > >> *(va+offset) = ... > >> kunmap_atomic(va); > >> > >>Here we introduce i915_gem_object_get_dirty_page(), which performs the > >>same operation as i915_gem_object_get_page() but with the side-effect > >>of marking the returned page dirty in the pagecache. This will ensure > >>that if the object is subsequently evicted (due to memory pressure), > >>the changes are written to backing store rather than discarded. > >> > >>Note that it works only for regular (shmfs-backed) GEM objects, but (at > >>least for now) those are the only ones that are updated in this way -- > >>the objects in question are contexts and batchbuffers, which are always > >>shmfs-backed. > >> > >>A separate patch deals with the case where whole objects are (or may > >>be) dirtied. > >> > >>Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > >I like this. There are places were we do both obj->dirty and > >set_page_dirty(), but this so much more clearly shows what is going on. > >All of these locations should be infrequent (or at least have patches to > >make them so), so moving the call out-of-line will also be a benefit. > > I think there was only one place that both called set_page_dirty() > AND set obj->dirty, which was in populate_lr_context(). You'll see > that I've eliminated both in favour of a call to get_dirty_page() :) It was things like all GPU objects have obj->dirty set, so really the importance of using get_dirty_page() in the relocations and context pinning is for documentation. Which is a very good reason, nevertheless. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-10 21:04 ` Chris Wilson @ 2015-12-11 17:08 ` Daniel Vetter 2015-12-11 17:27 ` Chris Wilson 0 siblings, 1 reply; 35+ messages in thread From: Daniel Vetter @ 2015-12-11 17:08 UTC (permalink / raw) To: Chris Wilson, Dave Gordon, intel-gfx, Alex Dai On Thu, Dec 10, 2015 at 09:04:23PM +0000, Chris Wilson wrote: > On Thu, Dec 10, 2015 at 05:24:42PM +0000, Dave Gordon wrote: > > On 10/12/15 13:29, Chris Wilson wrote: > > >On Wed, Dec 09, 2015 at 03:52:51PM +0000, Dave Gordon wrote: > > >>In various places, a single page of a (regular) GEM object is mapped into > > >>CPU address space and updated. In each such case, either the page or the > > >>the object should be marked dirty, to ensure that the modifications are > > >>not discarded if the object is evicted under memory pressure. > > >> > > >>The typical sequence is: > > >> va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); > > >> *(va+offset) = ... > > >> kunmap_atomic(va); > > >> > > >>Here we introduce i915_gem_object_get_dirty_page(), which performs the > > >>same operation as i915_gem_object_get_page() but with the side-effect > > >>of marking the returned page dirty in the pagecache. This will ensure > > >>that if the object is subsequently evicted (due to memory pressure), > > >>the changes are written to backing store rather than discarded. > > >> > > >>Note that it works only for regular (shmfs-backed) GEM objects, but (at > > >>least for now) those are the only ones that are updated in this way -- > > >>the objects in question are contexts and batchbuffers, which are always > > >>shmfs-backed. > > >> > > >>A separate patch deals with the case where whole objects are (or may > > >>be) dirtied. > > >> > > >>Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > > > >I like this. There are places were we do both obj->dirty and > > >set_page_dirty(), but this so much more clearly shows what is going on. > > >All of these locations should be infrequent (or at least have patches to > > >make them so), so moving the call out-of-line will also be a benefit. > > > > I think there was only one place that both called set_page_dirty() > > AND set obj->dirty, which was in populate_lr_context(). You'll see > > that I've eliminated both in favour of a call to get_dirty_page() :) > > It was things like all GPU objects have obj->dirty set, so really the > importance of using get_dirty_page() in the relocations and context > pinning is for documentation. Which is a very good reason, nevertheless. Hm, I think if you force a fault on relocs and then shrink everything really hard before actually managing to submit the batch you could provoke this into a proper bug. one-in-a-billion perhaps ;-) -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-11 17:08 ` Daniel Vetter @ 2015-12-11 17:27 ` Chris Wilson 0 siblings, 0 replies; 35+ messages in thread From: Chris Wilson @ 2015-12-11 17:27 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Fri, Dec 11, 2015 at 06:08:10PM +0100, Daniel Vetter wrote: > Hm, I think if you force a fault on relocs and then shrink everything > really hard before actually managing to submit the batch you could provoke > this into a proper bug. one-in-a-billion perhaps ;-) Hmm, you would need to force the slowpath (otherwise all the objects are reserved and so not swappable). And then we force the presumed_offset to be invalid but only on the user side - so we don't force the relocations in this batch. Ok, plausible. But who hits the slowpath? Sigh. Fancy reviewing some mesa patches? -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents 2015-12-09 15:52 ` [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-09 15:52 ` [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon @ 2015-12-09 15:52 ` Dave Gordon 2015-12-10 13:22 ` Chris Wilson 2015-12-10 14:06 ` Daniel Vetter 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2 siblings, 2 replies; 35+ messages in thread From: Dave Gordon @ 2015-12-09 15:52 UTC (permalink / raw) To: intel-gfx In a few places, we fill a GEM object with data, or overwrite some portion of its contents other than a single page. In such cases, we should mark the object dirty so that its pages in the pagecache are written to backing store (rather than discarded) if the object is evicted due to memory pressure. The cases where only a single page is touched are dealt with in a separate patch. This incorporates and supercedes Alex Dai's earlier patch [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Alex Dai <yu.dai@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++ drivers/gpu/drm/i915/i915_gem.c | 5 ++++- drivers/gpu/drm/i915/intel_lrc.c | 2 +- 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 814d894..81a4fa2 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -946,6 +946,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, memcpy(dst, src, batch_len); + /* After writing on the dest_obj, its backing store is out-of-date */ + dest_obj->dirty = 1; + unmap_src: vunmap(src_base); unpin_src: diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 06a5f39..81a770f 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, i915_gem_object_pin_pages(obj); offset = args->offset; - obj->dirty = 1; for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, offset >> PAGE_SHIFT) { @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, goto out; } + /* Object backing store will be out of date hereafter */ + obj->dirty = 1; + trace_i915_gem_object_pwrite(obj, args->offset, args->size); ret = -EFAULT; @@ -5224,6 +5226,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, i915_gem_object_pin_pages(obj); sg = obj->pages; bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); + obj->dirty = 1; /* Backing store is now out of date */ i915_gem_object_unpin_pages(obj); if (WARN_ON(bytes != size)) { diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index ceccecc..c7520b7 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1030,7 +1030,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring, if (ret) goto unpin_ctx_obj; - ctx_obj->dirty = true; + ctx_obj->dirty = 1; /* Invalidate GuC TLB. */ if (i915.enable_guc_submission) -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents 2015-12-09 15:52 ` [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents Dave Gordon @ 2015-12-10 13:22 ` Chris Wilson 2015-12-10 14:06 ` Daniel Vetter 1 sibling, 0 replies; 35+ messages in thread From: Chris Wilson @ 2015-12-10 13:22 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Wed, Dec 09, 2015 at 03:52:52PM +0000, Dave Gordon wrote: > In a few places, we fill a GEM object with data, or overwrite some > portion of its contents other than a single page. In such cases, we > should mark the object dirty so that its pages in the pagecache are > written to backing store (rather than discarded) if the object is > evicted due to memory pressure. > > The cases where only a single page is touched are dealt with in a > separate patch. > > This incorporates and supercedes Alex Dai's earlier patch > [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Alex Dai <yu.dai@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++ > drivers/gpu/drm/i915/i915_gem.c | 5 ++++- > drivers/gpu/drm/i915/intel_lrc.c | 2 +- > 3 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 814d894..81a4fa2 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -946,6 +946,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, > > memcpy(dst, src, batch_len); > > + /* After writing on the dest_obj, its backing store is out-of-date */ > + dest_obj->dirty = 1; I still think this is superfluous as it doesn't fix any bug and would rather not introduce new obj->dirty (I regret the limitations of our coarse whole object granularity), especially just before deleting copy_batch. > unmap_src: > vunmap(src_base); > unpin_src: > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 06a5f39..81a770f 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > i915_gem_object_pin_pages(obj); > > offset = args->offset; > - obj->dirty = 1; > > for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, > offset >> PAGE_SHIFT) { > @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, > goto out; > } > > + /* Object backing store will be out of date hereafter */ > + obj->dirty = 1; I don't feel that reflects the asymmetry of pwrite. > trace_i915_gem_object_pwrite(obj, args->offset, args->size); > > ret = -EFAULT; > @@ -5224,6 +5226,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, > i915_gem_object_pin_pages(obj); > sg = obj->pages; > bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); > + obj->dirty = 1; /* Backing store is now out of date */ This is the bug, so should be by itself so that we don't lose it admist the churn. I still think that it a bug in the general library function to not mark the buffer dirty upon copying. However, I can accept this as we do create the object from the data, so sematically the object is dirty. > i915_gem_object_unpin_pages(obj); > > if (WARN_ON(bytes != size)) { > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index ceccecc..c7520b7 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1030,7 +1030,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring, > if (ret) > goto unpin_ctx_obj; > > - ctx_obj->dirty = true; > + ctx_obj->dirty = 1; That's just being obstinate! Going the other way and using the novel bool:1 would be just as useful. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents 2015-12-09 15:52 ` [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents Dave Gordon 2015-12-10 13:22 ` Chris Wilson @ 2015-12-10 14:06 ` Daniel Vetter 2015-12-10 14:52 ` Chris Wilson 2015-12-10 16:19 ` Dave Gordon 1 sibling, 2 replies; 35+ messages in thread From: Daniel Vetter @ 2015-12-10 14:06 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Wed, Dec 09, 2015 at 03:52:52PM +0000, Dave Gordon wrote: > In a few places, we fill a GEM object with data, or overwrite some > portion of its contents other than a single page. In such cases, we > should mark the object dirty so that its pages in the pagecache are > written to backing store (rather than discarded) if the object is > evicted due to memory pressure. > > The cases where only a single page is touched are dealt with in a > separate patch. > > This incorporates and supercedes Alex Dai's earlier patch > [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Alex Dai <yu.dai@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> Hm, did you drop the begin_cpu_access dma-buf one here? See my other mail, I think that one was legit. -Daniel > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++ > drivers/gpu/drm/i915/i915_gem.c | 5 ++++- > drivers/gpu/drm/i915/intel_lrc.c | 2 +- > 3 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 814d894..81a4fa2 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -946,6 +946,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, > > memcpy(dst, src, batch_len); > > + /* After writing on the dest_obj, its backing store is out-of-date */ > + dest_obj->dirty = 1; > + > unmap_src: > vunmap(src_base); > unpin_src: > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 06a5f39..81a770f 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, > i915_gem_object_pin_pages(obj); > > offset = args->offset; > - obj->dirty = 1; > > for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, > offset >> PAGE_SHIFT) { > @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, > goto out; > } > > + /* Object backing store will be out of date hereafter */ > + obj->dirty = 1; > + > trace_i915_gem_object_pwrite(obj, args->offset, args->size); > > ret = -EFAULT; > @@ -5224,6 +5226,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, > i915_gem_object_pin_pages(obj); > sg = obj->pages; > bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); > + obj->dirty = 1; /* Backing store is now out of date */ > i915_gem_object_unpin_pages(obj); > > if (WARN_ON(bytes != size)) { > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index ceccecc..c7520b7 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1030,7 +1030,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring, > if (ret) > goto unpin_ctx_obj; > > - ctx_obj->dirty = true; > + ctx_obj->dirty = 1; > > /* Invalidate GuC TLB. */ > if (i915.enable_guc_submission) > -- > 1.9.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents 2015-12-10 14:06 ` Daniel Vetter @ 2015-12-10 14:52 ` Chris Wilson 2015-12-11 17:09 ` Daniel Vetter 2015-12-10 16:19 ` Dave Gordon 1 sibling, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-10 14:52 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Thu, Dec 10, 2015 at 03:06:55PM +0100, Daniel Vetter wrote: > On Wed, Dec 09, 2015 at 03:52:52PM +0000, Dave Gordon wrote: > > In a few places, we fill a GEM object with data, or overwrite some > > portion of its contents other than a single page. In such cases, we > > should mark the object dirty so that its pages in the pagecache are > > written to backing store (rather than discarded) if the object is > > evicted due to memory pressure. > > > > The cases where only a single page is touched are dealt with in a > > separate patch. > > > > This incorporates and supercedes Alex Dai's earlier patch > > [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted > > > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > > Cc: Alex Dai <yu.dai@intel.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Hm, did you drop the begin_cpu_access dma-buf one here? See my other mail, > I think that one was legit. I thought begin-access was the sync point around the per-page mmap(), but reading the current code "in kernel users", of which it is just drm/udl. How would we interact with a future dma-buf mmap()? -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents 2015-12-10 14:52 ` Chris Wilson @ 2015-12-11 17:09 ` Daniel Vetter 0 siblings, 0 replies; 35+ messages in thread From: Daniel Vetter @ 2015-12-11 17:09 UTC (permalink / raw) To: Chris Wilson, Daniel Vetter, Dave Gordon, intel-gfx On Thu, Dec 10, 2015 at 02:52:57PM +0000, Chris Wilson wrote: > On Thu, Dec 10, 2015 at 03:06:55PM +0100, Daniel Vetter wrote: > > On Wed, Dec 09, 2015 at 03:52:52PM +0000, Dave Gordon wrote: > > > In a few places, we fill a GEM object with data, or overwrite some > > > portion of its contents other than a single page. In such cases, we > > > should mark the object dirty so that its pages in the pagecache are > > > written to backing store (rather than discarded) if the object is > > > evicted due to memory pressure. > > > > > > The cases where only a single page is touched are dealt with in a > > > separate patch. > > > > > > This incorporates and supercedes Alex Dai's earlier patch > > > [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted > > > > > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > > > Cc: Alex Dai <yu.dai@intel.com> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > > Hm, did you drop the begin_cpu_access dma-buf one here? See my other mail, > > I think that one was legit. > > I thought begin-access was the sync point around the per-page mmap(), > but reading the current code "in kernel users", of which it is just > drm/udl. How would we interact with a future dma-buf mmap()? We might end up with a bool userspace. Or we could check obj->pages and only set dirty if that's set. A bit evil, but should work even for mmap. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents 2015-12-10 14:06 ` Daniel Vetter 2015-12-10 14:52 ` Chris Wilson @ 2015-12-10 16:19 ` Dave Gordon 1 sibling, 0 replies; 35+ messages in thread From: Dave Gordon @ 2015-12-10 16:19 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On 10/12/15 14:06, Daniel Vetter wrote: > On Wed, Dec 09, 2015 at 03:52:52PM +0000, Dave Gordon wrote: >> In a few places, we fill a GEM object with data, or overwrite some >> portion of its contents other than a single page. In such cases, we >> should mark the object dirty so that its pages in the pagecache are >> written to backing store (rather than discarded) if the object is >> evicted due to memory pressure. >> >> The cases where only a single page is touched are dealt with in a >> separate patch. >> >> This incorporates and supercedes Alex Dai's earlier patch >> [PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted >> >> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> >> Cc: Alex Dai <yu.dai@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Hm, did you drop the begin_cpu_access dma-buf one here? See my other mail, > I think that one was legit. > -Daniel I did, because I didn't know whether it was necessary and Chris was dubious. But I can easily reinstate it in the next (v3) series. .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU 2015-12-09 15:52 ` [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-09 15:52 ` [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon 2015-12-09 15:52 ` [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents Dave Gordon @ 2015-12-10 18:51 ` Dave Gordon 2015-12-10 18:51 ` [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon ` (3 more replies) 2 siblings, 4 replies; 35+ messages in thread From: Dave Gordon @ 2015-12-10 18:51 UTC (permalink / raw) To: intel-gfx Another reworking of this patchset. Changes since v2 include: * added two more calls to get_dirty_page() in the relocation code [Chris Wilson] * split the remaining changes into multiple tiny patches [Chris Wilson] * reinstated setting obj->dirty in i915_gem_begin_cpu_access() [Daniel Vetter] Enjoy :) _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon @ 2015-12-10 18:51 ` Dave Gordon 2015-12-10 21:07 ` Chris Wilson 2015-12-10 18:51 ` [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data Dave Gordon ` (2 subsequent siblings) 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-10 18:51 UTC (permalink / raw) To: intel-gfx In various places, a single page of a (regular) GEM object is mapped into CPU address space and updated. In each such case, either the page or the the object should be marked dirty, to ensure that the modifications are not discarded if the object is evicted under memory pressure. The typical sequence is: va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); *(va+offset) = ... kunmap_atomic(va); Here we introduce i915_gem_object_get_dirty_page(), which performs the same operation as i915_gem_object_get_page() but with the side-effect of marking the returned page dirty in the pagecache. This will ensure that if the object is subsequently evicted (due to memory pressure), the changes are written to backing store rather than discarded. Note that it works only for regular (shmfs-backed) GEM objects, but (at least for now) those are the only ones that are updated in this way -- the objects in question are contexts and batchbuffers, which are always shmfs-backed. Separate patches deal with the cases where whole objects are (or may be) dirtied. v3: Mark two more pages dirty in the page-boundary-crossing cases of the execbuffer relocation code [Chris Wilson] Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/i915_gem.c | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 ++++---- drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c | 11 ++++------- 6 files changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f1a8a53..ca77392 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2894,6 +2894,9 @@ static inline int __sg_page_count(struct scatterlist *sg) return sg->length >> PAGE_SHIFT; } +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n); + static inline struct page * i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index dfaf25b..06a5f39 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5184,6 +5184,21 @@ bool i915_gem_obj_is_pinned(struct drm_i915_gem_object *obj) return false; } +/* Like i915_gem_object_get_page(), but mark the returned page dirty */ +struct page * +i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int n) +{ + struct page *page; + + /* Only default objects have per-page dirty tracking */ + if (WARN_ON(obj->ops != &i915_gem_object_ops)) + return NULL; + + page = i915_gem_object_get_page(obj, n); + set_page_dirty(page); + return page; +} + /* Allocate a new GEM object and fill it with the supplied data */ struct drm_i915_gem_object * i915_gem_object_create_from_data(struct drm_device *dev, diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a4c243c..1a1b979 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -264,7 +264,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, if (ret) return ret; - vaddr = kmap_atomic(i915_gem_object_get_page(obj, + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, reloc->offset >> PAGE_SHIFT)); *(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta); @@ -273,7 +273,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj, if (page_offset == 0) { kunmap_atomic(vaddr); - vaddr = kmap_atomic(i915_gem_object_get_page(obj, + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, (reloc->offset + sizeof(uint32_t)) >> PAGE_SHIFT)); } @@ -355,7 +355,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj, if (ret) return ret; - vaddr = kmap_atomic(i915_gem_object_get_page(obj, + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, reloc->offset >> PAGE_SHIFT)); clflush_write32(vaddr + page_offset, lower_32_bits(delta)); @@ -364,7 +364,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj, if (page_offset == 0) { kunmap_atomic(vaddr); - vaddr = kmap_atomic(i915_gem_object_get_page(obj, + vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, (reloc->offset + sizeof(uint32_t)) >> PAGE_SHIFT)); } diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 5026a62..fc7e6d5 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -103,7 +103,7 @@ static int render_state_setup(struct render_state *so) if (ret) return ret; - page = sg_page(so->obj->pages->sgl); + page = i915_gem_object_get_dirty_page(so->obj, 0); d = kmap(page); while (i < rodata->batch_items) { diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 0d23785b..05aa7e6 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -568,7 +568,7 @@ static void lr_context_update(struct drm_i915_gem_request *rq) WARN_ON(!i915_gem_obj_is_pinned(ctx_obj)); WARN_ON(!i915_gem_obj_is_pinned(rb_obj)); - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 4ebafab..ceccecc 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -372,7 +372,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq) WARN_ON(!i915_gem_obj_is_pinned(ctx_obj)); WARN_ON(!i915_gem_obj_is_pinned(rb_obj)); - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); reg_state[CTX_RING_TAIL+1] = rq->tail; @@ -1425,7 +1425,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *ring) return ret; } - page = i915_gem_object_get_page(wa_ctx->obj, 0); + page = i915_gem_object_get_dirty_page(wa_ctx->obj, 0); batch = kmap_atomic(page); offset = 0; @@ -2257,7 +2257,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o /* The second page of the context object contains some fields which must * be set up prior to the first execution. */ - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); /* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM @@ -2343,9 +2343,6 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o } kunmap_atomic(reg_state); - - ctx_obj->dirty = 1; - set_page_dirty(page); i915_gem_object_unpin_pages(ctx_obj); return 0; @@ -2529,7 +2526,7 @@ void intel_lr_context_reset(struct drm_device *dev, WARN(1, "Failed get_pages for context obj\n"); continue; } - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); + page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN); reg_state = kmap_atomic(page); reg_state[CTX_RING_HEAD+1] = 0; -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU 2015-12-10 18:51 ` [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon @ 2015-12-10 21:07 ` Chris Wilson 0 siblings, 0 replies; 35+ messages in thread From: Chris Wilson @ 2015-12-10 21:07 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Thu, Dec 10, 2015 at 06:51:23PM +0000, Dave Gordon wrote: > In various places, a single page of a (regular) GEM object is mapped into > CPU address space and updated. In each such case, either the page or the > the object should be marked dirty, to ensure that the modifications are > not discarded if the object is evicted under memory pressure. > > The typical sequence is: > va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); > *(va+offset) = ... > kunmap_atomic(va); > > Here we introduce i915_gem_object_get_dirty_page(), which performs the > same operation as i915_gem_object_get_page() but with the side-effect > of marking the returned page dirty in the pagecache. This will ensure > that if the object is subsequently evicted (due to memory pressure), > the changes are written to backing store rather than discarded. > > Note that it works only for regular (shmfs-backed) GEM objects, but (at > least for now) those are the only ones that are updated in this way -- > the objects in question are contexts and batchbuffers, which are always > shmfs-backed. > > Separate patches deal with the cases where whole objects are (or may > be) dirtied. > > v3: Mark two more pages dirty in the page-boundary-crossing > cases of the execbuffer relocation code [Chris Wilson] > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-10 18:51 ` [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon @ 2015-12-10 18:51 ` Dave Gordon 2015-12-10 21:06 ` Chris Wilson 2015-12-10 18:51 ` [PATCH 3/4 v3] drm/i915: always mark the target of pwrite() as dirty Dave Gordon 2015-12-10 18:51 ` [PATCH 4/4 v3] drm/i915: miscellaneous tiny tweaks to GEM object->dirty Dave Gordon 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-10 18:51 UTC (permalink / raw) To: intel-gfx When creating a new (pageable) GEM object and filling it with data, we must mark it as 'dirty', i.e. backing store is out-of-date w.r.t. the newly-written content. This ensures that if the object is evicted under memory pressure, its pages in the pagecache will be written to backing store rather than discarded. Based on an original version by Alex Dai. Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 06a5f39..936f0a9 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5224,6 +5224,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, i915_gem_object_pin_pages(obj); sg = obj->pages; bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); + obj->dirty = 1; /* Backing store is now out of date */ i915_gem_object_unpin_pages(obj); if (WARN_ON(bytes != size)) { -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data 2015-12-10 18:51 ` [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data Dave Gordon @ 2015-12-10 21:06 ` Chris Wilson 2015-12-11 17:21 ` Daniel Vetter 0 siblings, 1 reply; 35+ messages in thread From: Chris Wilson @ 2015-12-10 21:06 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Thu, Dec 10, 2015 at 06:51:24PM +0000, Dave Gordon wrote: > When creating a new (pageable) GEM object and filling it with data, we > must mark it as 'dirty', i.e. backing store is out-of-date w.r.t. the > newly-written content. This ensures that if the object is evicted under > memory pressure, its pages in the pagecache will be written to backing > store rather than discarded. > > Based on an original version by Alex Dai. > > Signed-off-by: Alex Dai <yu.dai@intel.com> > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> I've made my peace with this patch finally. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_gem.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 06a5f39..936f0a9 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -5224,6 +5224,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, > i915_gem_object_pin_pages(obj); > sg = obj->pages; > bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); > + obj->dirty = 1; /* Backing store is now out of date */ That seems like it would be better served as an improvement to the existing obj->dirty /** doc */ -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data 2015-12-10 21:06 ` Chris Wilson @ 2015-12-11 17:21 ` Daniel Vetter 0 siblings, 0 replies; 35+ messages in thread From: Daniel Vetter @ 2015-12-11 17:21 UTC (permalink / raw) To: Chris Wilson, Dave Gordon, intel-gfx, Alex Dai On Thu, Dec 10, 2015 at 09:06:25PM +0000, Chris Wilson wrote: > On Thu, Dec 10, 2015 at 06:51:24PM +0000, Dave Gordon wrote: > > When creating a new (pageable) GEM object and filling it with data, we > > must mark it as 'dirty', i.e. backing store is out-of-date w.r.t. the > > newly-written content. This ensures that if the object is evicted under > > memory pressure, its pages in the pagecache will be written to backing > > store rather than discarded. > > > > Based on an original version by Alex Dai. > > > > Signed-off-by: Alex Dai <yu.dai@intel.com> > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > I've made my peace with this patch finally. > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > > > --- > > drivers/gpu/drm/i915/i915_gem.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 06a5f39..936f0a9 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -5224,6 +5224,7 @@ i915_gem_object_create_from_data(struct drm_device *dev, > > i915_gem_object_pin_pages(obj); > > sg = obj->pages; > > bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size); > > + obj->dirty = 1; /* Backing store is now out of date */ > > That seems like it would be better served as an improvement to the > existing obj->dirty /** doc */ Yeah doc polish at the end would be stellar. Merged the first 2 patches from this series meanwhile. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 3/4 v3] drm/i915: always mark the target of pwrite() as dirty 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-10 18:51 ` [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon 2015-12-10 18:51 ` [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data Dave Gordon @ 2015-12-10 18:51 ` Dave Gordon 2015-12-10 21:09 ` Chris Wilson 2015-12-10 18:51 ` [PATCH 4/4 v3] drm/i915: miscellaneous tiny tweaks to GEM object->dirty Dave Gordon 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-10 18:51 UTC (permalink / raw) To: intel-gfx Currently, the target object being written *may* be marked dirty, either in i915_gem_gtt_pwrite_fast() (as a side-effect of setting its domain to GTT!), or in i915_gem_shmem_pwrite() (if it's a shmfs-backed object). While these two are the common cases, it's not obvious that they cover every possible path through the pwrite code, for every possible type of object (e.g. phys, stolen, etc). So here we move setting-the-mark to the top level so that it is obvious that it applies no matter which subsequent path is followed. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 936f0a9..81a770f 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -937,7 +937,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev, i915_gem_object_pin_pages(obj); offset = args->offset; - obj->dirty = 1; for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, offset >> PAGE_SHIFT) { @@ -1074,6 +1073,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, goto out; } + /* Object backing store will be out of date hereafter */ + obj->dirty = 1; + trace_i915_gem_object_pwrite(obj, args->offset, args->size); ret = -EFAULT; -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 3/4 v3] drm/i915: always mark the target of pwrite() as dirty 2015-12-10 18:51 ` [PATCH 3/4 v3] drm/i915: always mark the target of pwrite() as dirty Dave Gordon @ 2015-12-10 21:09 ` Chris Wilson 0 siblings, 0 replies; 35+ messages in thread From: Chris Wilson @ 2015-12-10 21:09 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Thu, Dec 10, 2015 at 06:51:25PM +0000, Dave Gordon wrote: > Currently, the target object being written *may* be marked dirty, either > in i915_gem_gtt_pwrite_fast() (as a side-effect of setting its domain to > GTT!), or in i915_gem_shmem_pwrite() (if it's a shmfs-backed object). > While these two are the common cases, it's not obvious that they cover > every possible path through the pwrite code, for every possible type > of object (e.g. phys, stolen, etc). So here we move setting-the-mark > to the top level so that it is obvious that it applies no matter which > subsequent path is followed. > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> I don't like this patch - I feel like it divorces the information that we are dirtying the pages from the actual copy. Especially as some paths don't actually dirty the object's backing storage (for extra confusion). -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 4/4 v3] drm/i915: miscellaneous tiny tweaks to GEM object->dirty 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon ` (2 preceding siblings ...) 2015-12-10 18:51 ` [PATCH 3/4 v3] drm/i915: always mark the target of pwrite() as dirty Dave Gordon @ 2015-12-10 18:51 ` Dave Gordon 2015-12-10 21:16 ` Chris Wilson 3 siblings, 1 reply; 35+ messages in thread From: Dave Gordon @ 2015-12-10 18:51 UTC (permalink / raw) To: intel-gfx This patch covers a couple more places where a GEM object is (or may be) modified by means of CPU writes, and should therefore be marked dirty to ensure that the changes are not lost in the event that the object is evicted under memory pressure. One is in i915_gem_begin_cpu_access(); after this call, the GEM object may be written to by the caller (which may not be part of the i915 driver e.g. udl). We must therefore assume that the object is dirty hereafter if the caller has asked for write access. Another is in copy_batch(); the destination object is obviously dirty after being written, but failing to mark it doesn't cause a bug at present, because the object is page-pinned at this point, and should remain either page- pinned or GTT-pinned until it's retired, at which point its content can be discarded. However, if the lifecycle of shadow batches is ever changed (e.g. by the introduction of a GPU scheduler) this might no longer be true, so it's safer to mark it correctly (this introduces no overhead if the buffer is never swappable). It also makes the content cycle clearer: ---allocate--> [empty buffer acquired from pool] ---fill--> [valid buffer full of unsaved data] ---use--> [buffer full of unsaved but unwanted data] --retire--> [purgeable buffer returned to pool] ... repeat ... The last change here is just for consistency; since 'dirty' has been declared as an (unsigned int) bitfield, let's not treat it as a bool. Maybe it should be a byte instead? Signed-off-by: Dave Gordon <david.s.gordon@intel.com> --- drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++ drivers/gpu/drm/i915/i915_gem_dmabuf.c | 3 +++ drivers/gpu/drm/i915/intel_lrc.c | 2 +- 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 814d894..81a4fa2 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -946,6 +946,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, memcpy(dst, src, batch_len); + /* After writing on the dest_obj, its backing store is out-of-date */ + dest_obj->dirty = 1; + unmap_src: vunmap(src_base); unpin_src: diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index e9c2bfd..5eb7887 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -208,6 +208,9 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size return ret; ret = i915_gem_object_set_to_cpu_domain(obj, write); + if (write) + obj->dirty = 1; + mutex_unlock(&dev->struct_mutex); return ret; } diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index ceccecc..c7520b7 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1030,7 +1030,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs *ring, if (ret) goto unpin_ctx_obj; - ctx_obj->dirty = true; + ctx_obj->dirty = 1; /* Invalidate GuC TLB. */ if (i915.enable_guc_submission) -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 4/4 v3] drm/i915: miscellaneous tiny tweaks to GEM object->dirty 2015-12-10 18:51 ` [PATCH 4/4 v3] drm/i915: miscellaneous tiny tweaks to GEM object->dirty Dave Gordon @ 2015-12-10 21:16 ` Chris Wilson 0 siblings, 0 replies; 35+ messages in thread From: Chris Wilson @ 2015-12-10 21:16 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Thu, Dec 10, 2015 at 06:51:26PM +0000, Dave Gordon wrote: > This patch covers a couple more places where a GEM object is (or may be) > modified by means of CPU writes, and should therefore be marked dirty to > ensure that the changes are not lost in the event that the object is > evicted under memory pressure. > > One is in i915_gem_begin_cpu_access(); after this call, the GEM object may > be written to by the caller (which may not be part of the i915 driver e.g. > udl). We must therefore assume that the object is dirty hereafter if > the caller has asked for write access. > > Another is in copy_batch(); the destination object is obviously dirty > after being written, but failing to mark it doesn't cause a bug at > present, because the object is page-pinned at this point, and should > remain either page- pinned or GTT-pinned until it's retired, at which > point its content can be discarded. However, if the lifecycle of shadow > batches is ever changed (e.g. by the introduction of a GPU scheduler) > this might no longer be true, so it's safer to mark it correctly (this > introduces no overhead if the buffer is never swappable). It also makes > the content cycle clearer: > > ---allocate--> > [empty buffer acquired from pool] > ---fill--> > [valid buffer full of unsaved data] > ---use--> > [buffer full of unsaved but unwanted data] > --retire--> > [purgeable buffer returned to pool] > ... repeat ... > > The last change here is just for consistency; since 'dirty' has been > declared as an (unsigned int) bitfield, let's not treat it as a bool. > Maybe it should be a byte instead? No, it's just because obj->dirty is older than C's bool type. Changing it to be bool obj->dirty:1 would be fine - except that there is one particular very hot path where moving it to an unsigned obj->flags field would be even better. > diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c > index e9c2bfd..5eb7887 100644 > --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c > @@ -208,6 +208,9 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size > return ret; > > ret = i915_gem_object_set_to_cpu_domain(obj, write); > + if (write) > + obj->dirty = 1; > + So looking at the only existing example (drm/udl which only reads from te object anyway) this would fall into bug category. Hence separate patch. But I'll defer to Daniel as to whether the dma-buf is meant to operate at the object level or at the page level with regards to dirty tracking (certainly we would struggle at the moment with dma-buf on massive objects). -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2015-12-11 17:27 UTC | newest] Thread overview: 35+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-12-08 16:51 [PATCH 0/3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-08 16:51 ` [PATCH 1/3] drm/i915: mark GEM objects as dirty when filled by the CPU Dave Gordon 2015-12-08 17:00 ` Chris Wilson 2015-12-08 18:06 ` Dave Gordon 2015-12-10 10:49 ` Daniel Vetter 2015-12-08 16:51 ` [PATCH 2/3] drm/i915: mark GEM objects as dirty when updated " Dave Gordon 2015-12-08 17:00 ` Chris Wilson 2015-12-08 18:43 ` Dave Gordon 2015-12-08 16:51 ` [PATCH 3/3] drm/i915: mark GEM objects as dirty when written " Dave Gordon 2015-12-08 17:03 ` Chris Wilson 2015-12-08 18:24 ` Dave Gordon 2015-12-10 13:10 ` Daniel Vetter 2015-12-09 15:52 ` [PATCH 0/2 v2] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-09 15:52 ` [PATCH 1/2 v2] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon 2015-12-10 13:29 ` Chris Wilson 2015-12-10 17:24 ` Dave Gordon 2015-12-10 21:04 ` Chris Wilson 2015-12-11 17:08 ` Daniel Vetter 2015-12-11 17:27 ` Chris Wilson 2015-12-09 15:52 ` [PATCH 2/2 v2] drm/i915: mark GEM objects dirty after overwriting their contents Dave Gordon 2015-12-10 13:22 ` Chris Wilson 2015-12-10 14:06 ` Daniel Vetter 2015-12-10 14:52 ` Chris Wilson 2015-12-11 17:09 ` Daniel Vetter 2015-12-10 16:19 ` Dave Gordon 2015-12-10 18:51 ` [PATCH 0/4 v3] drm/i915: mark GEM objects as dirtied by CPU Dave Gordon 2015-12-10 18:51 ` [PATCH 1/4 v3] drm/i915: mark GEM object pages dirty when mapped & written by the CPU Dave Gordon 2015-12-10 21:07 ` Chris Wilson 2015-12-10 18:51 ` [PATCH 2/4 v3] drm/i915: mark a newly-created GEM object dirty when filled with data Dave Gordon 2015-12-10 21:06 ` Chris Wilson 2015-12-11 17:21 ` Daniel Vetter 2015-12-10 18:51 ` [PATCH 3/4 v3] drm/i915: always mark the target of pwrite() as dirty Dave Gordon 2015-12-10 21:09 ` Chris Wilson 2015-12-10 18:51 ` [PATCH 4/4 v3] drm/i915: miscellaneous tiny tweaks to GEM object->dirty Dave Gordon 2015-12-10 21:16 ` Chris Wilson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).