* [PATCH v3] drm/i915: Clean up associated VMAs on context destruction
@ 2015-09-22 13:16 Tvrtko Ursulin
2015-10-05 11:21 ` Michel Thierry
0 siblings, 1 reply; 2+ messages in thread
From: Tvrtko Ursulin @ 2015-09-22 13:16 UTC (permalink / raw)
To: Intel-gfx; +Cc: Daniel Vetter
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Prevent leaking VMAs and PPGTT VMs when objects are imported
via flink.
Scenario is that any VMAs created by the importer will be left
dangling after the importer exits, or destroys the PPGTT context
with which they are associated.
This is caused by object destruction not running when the
importer closes the buffer object handle due the reference held
by the exporter. This also leaks the VM since the VMA has a
reference on it.
In practice these leaks can be observed by stopping and starting
the X server on a kernel with fbcon compiled in. Every time
X server exits another VMA will be leaked against the fbcon's
frame buffer object.
Also on systems where flink buffer sharing is used extensively,
like Android, this leak has even more serious consequences.
This version is takes a general approach from the earlier work
by Rafael Barabalho (drm/i915: Clean-up PPGTT on context
destruction) and tries to incorporate the subsequent discussion
between Chris Wilson and Daniel Vetter.
v2:
Removed immediate cleanup on object retire - it was causing a
recursive VMA unbind via i915_gem_object_wait_rendering. And
it is in fact not even needed since by definition context
cleanup worker runs only after the last context reference has
been dropped, hence all VMAs against the VM belonging to the
context are already on the inactive list.
v3:
Previous version could deadlock since VMA unbind waits on any
rendering on an object to complete. Objects can be busy in a
different VM which would mean that the cleanup loop would do
the wait with the struct mutex held.
This is an even simpler approach where we just unbind VMAs
without waiting since we know all VMAs belonging to this VM
are idle, and there is nothing in flight, at the point
context destructor runs.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rafael Barbalho <rafael.barbalho@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 5 +++++
drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++++++++----
drivers/gpu/drm/i915/i915_gem_context.c | 24 ++++++++++++++++++++++++
3 files changed, 45 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bc9c3a58d498..6dfd8830fe6b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2832,6 +2832,11 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
u32 flags);
int __must_check i915_vma_unbind(struct i915_vma *vma);
+/*
+ * BEWARE: Do not use the function below unless you can _absolutely_
+ * _guarantee_ VMA in question is _not in use_ anywhere.
+ */
+int __must_check _i915_vma_unbind_no_wait(struct i915_vma *vma);
int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv);
void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 64a5847f5107..3685080985df 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3208,7 +3208,7 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
old_write_domain);
}
-int i915_vma_unbind(struct i915_vma *vma)
+static int __i915_vma_unbind(struct i915_vma *vma, bool wait)
{
struct drm_i915_gem_object *obj = vma->obj;
struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
@@ -3227,9 +3227,11 @@ int i915_vma_unbind(struct i915_vma *vma)
BUG_ON(obj->pages == NULL);
- ret = i915_gem_object_wait_rendering(obj, false);
- if (ret)
- return ret;
+ if (wait) {
+ ret = i915_gem_object_wait_rendering(obj, false);
+ if (ret)
+ return ret;
+ }
if (i915_is_ggtt(vma->vm) &&
vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
@@ -3274,6 +3276,16 @@ int i915_vma_unbind(struct i915_vma *vma)
return 0;
}
+int i915_vma_unbind(struct i915_vma *vma)
+{
+ return __i915_vma_unbind(vma, true);
+}
+
+int _i915_vma_unbind_no_wait(struct i915_vma *vma)
+{
+ return __i915_vma_unbind(vma, false);
+}
+
int i915_gpu_idle(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 74aa0c9929ba..d9d7ba8af3b4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -133,6 +133,23 @@ static int get_context_size(struct drm_device *dev)
return ret;
}
+static void i915_gem_context_clean(struct intel_context *ctx)
+{
+ struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
+ struct i915_vma *vma, *next;
+
+ if (WARN_ON_ONCE(!ppgtt))
+ return;
+
+ WARN_ON(!list_empty(&ppgtt->base.active_list));
+
+ list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
+ mm_list) {
+ if (WARN_ON(_i915_vma_unbind_no_wait(vma)))
+ break;
+ }
+}
+
void i915_gem_context_free(struct kref *ctx_ref)
{
struct intel_context *ctx = container_of(ctx_ref, typeof(*ctx), ref);
@@ -142,6 +159,13 @@ void i915_gem_context_free(struct kref *ctx_ref)
if (i915.enable_execlists)
intel_lr_context_free(ctx);
+ /*
+ * This context is going away and we need to remove all VMAs still
+ * around. This is to handle imported shared objects for which
+ * destructor did not run when their handles were closed.
+ */
+ i915_gem_context_clean(ctx);
+
i915_ppgtt_put(ctx->ppgtt);
if (ctx->legacy_hw_ctx.rcs_state)
--
2.5.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [PATCH v3] drm/i915: Clean up associated VMAs on context destruction
2015-09-22 13:16 [PATCH v3] drm/i915: Clean up associated VMAs on context destruction Tvrtko Ursulin
@ 2015-10-05 11:21 ` Michel Thierry
0 siblings, 0 replies; 2+ messages in thread
From: Michel Thierry @ 2015-10-05 11:21 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx@lists.freedesktop.org; +Cc: Daniel Vetter
On 9/22/2015 2:16 PM, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Prevent leaking VMAs and PPGTT VMs when objects are imported
> via flink.
>
> Scenario is that any VMAs created by the importer will be left
> dangling after the importer exits, or destroys the PPGTT context
> with which they are associated.
>
> This is caused by object destruction not running when the
> importer closes the buffer object handle due the reference held
> by the exporter. This also leaks the VM since the VMA has a
> reference on it.
>
> In practice these leaks can be observed by stopping and starting
> the X server on a kernel with fbcon compiled in. Every time
> X server exits another VMA will be leaked against the fbcon's
> frame buffer object.
>
> Also on systems where flink buffer sharing is used extensively,
> like Android, this leak has even more serious consequences.
>
> This version is takes a general approach from the earlier work
> by Rafael Barabalho (drm/i915: Clean-up PPGTT on context
^Barbalho
> destruction) and tries to incorporate the subsequent discussion
> between Chris Wilson and Daniel Vetter.
>
> v2:
>
> Removed immediate cleanup on object retire - it was causing a
> recursive VMA unbind via i915_gem_object_wait_rendering. And
> it is in fact not even needed since by definition context
> cleanup worker runs only after the last context reference has
> been dropped, hence all VMAs against the VM belonging to the
> context are already on the inactive list.
>
> v3:
>
> Previous version could deadlock since VMA unbind waits on any
> rendering on an object to complete. Objects can be busy in a
> different VM which would mean that the cleanup loop would do
> the wait with the struct mutex held.
>
> This is an even simpler approach where we just unbind VMAs
> without waiting since we know all VMAs belonging to this VM
> are idle, and there is nothing in flight, at the point
> context destructor runs.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak
I guess we'll have to wait a bit more to have a fix for
flink-and-close-vma-leak subtest... Anyway, this addresses the flink
leak (also seen in bug 87729 - igt/gem_close_race)
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Rafael Barbalho <rafael.barbalho@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++++++++----
> drivers/gpu/drm/i915/i915_gem_context.c | 24 ++++++++++++++++++++++++
> 3 files changed, 45 insertions(+), 4 deletions(-)
>
> @@ -3274,6 +3276,16 @@ int i915_vma_unbind(struct i915_vma *vma)
> return 0;
> }
>
> +int i915_vma_unbind(struct i915_vma *vma)
> +{
> + return __i915_vma_unbind(vma, true);
> +}
> +
> +int _i915_vma_unbind_no_wait(struct i915_vma *vma)
^ it needs one more underscore (__i915_vma_unbind_no_wait).
> +{
> + return __i915_vma_unbind(vma, false);
> +}
> +
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2015-10-05 11:21 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-22 13:16 [PATCH v3] drm/i915: Clean up associated VMAs on context destruction Tvrtko Ursulin
2015-10-05 11:21 ` Michel Thierry
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).