From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>,
Daniel Vetter <daniel@ffwll.ch>,
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Intel-gfx@lists.freedesktop.org,
Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
Daniel Vetter <daniel.vetter@ffwll.ch>,
Rafael Barbalho <rafael.barbalho@intel.com>,
Michel Thierry <michel.thierry@intel.com>
Subject: Re: [PATCH v4] drm/i915: Clean up associated VMAs on context destruction
Date: Tue, 6 Oct 2015 11:28:49 +0200 [thread overview]
Message-ID: <20151006092849.GC3383@phenom.ffwll.local> (raw)
In-Reply-To: <20151006090431.GA26237@nuc-i3427.alporthouse.com>
On Tue, Oct 06, 2015 at 10:04:31AM +0100, Chris Wilson wrote:
> On Tue, Oct 06, 2015 at 10:58:01AM +0200, Daniel Vetter wrote:
> > On Mon, Oct 05, 2015 at 01:26:36PM +0100, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >
> > > Prevent leaking VMAs and PPGTT VMs when objects are imported
> > > via flink.
> > >
> > > Scenario is that any VMAs created by the importer will be left
> > > dangling after the importer exits, or destroys the PPGTT context
> > > with which they are associated.
> > >
> > > This is caused by object destruction not running when the
> > > importer closes the buffer object handle due the reference held
> > > by the exporter. This also leaks the VM since the VMA has a
> > > reference on it.
> > >
> > > In practice these leaks can be observed by stopping and starting
> > > the X server on a kernel with fbcon compiled in. Every time
> > > X server exits another VMA will be leaked against the fbcon's
> > > frame buffer object.
> > >
> > > Also on systems where flink buffer sharing is used extensively,
> > > like Android, this leak has even more serious consequences.
> > >
> > > This version is takes a general approach from the earlier work
> > > by Rafael Barbalho (drm/i915: Clean-up PPGTT on context
> > > destruction) and tries to incorporate the subsequent discussion
> > > between Chris Wilson and Daniel Vetter.
> > >
> > > v2:
> > >
> > > Removed immediate cleanup on object retire - it was causing a
> > > recursive VMA unbind via i915_gem_object_wait_rendering. And
> > > it is in fact not even needed since by definition context
> > > cleanup worker runs only after the last context reference has
> > > been dropped, hence all VMAs against the VM belonging to the
> > > context are already on the inactive list.
> > >
> > > v3:
> > >
> > > Previous version could deadlock since VMA unbind waits on any
> > > rendering on an object to complete. Objects can be busy in a
> > > different VM which would mean that the cleanup loop would do
> > > the wait with the struct mutex held.
> > >
> > > This is an even simpler approach where we just unbind VMAs
> > > without waiting since we know all VMAs belonging to this VM
> > > are idle, and there is nothing in flight, at the point
> > > context destructor runs.
> > >
> > > v4:
> > >
> > > Double underscore prefix for __915_vma_unbind_no_wait and a
> > > commit message typo fix. (Michel Thierry)
> > >
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak
> > > Reviewed-by: Michel Thierry <michel.thierry@intel.com>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Rafael Barbalho <rafael.barbalho@intel.com>
> > > Cc: Michel Thierry <michel.thierry@intel.com>
> >
> > Queued for -next, thanks for the patch.
>
> Please no, it's an awful patch and does not even fix the root cause of
> the leak (that the vma are not closed when the handle is).
It's a lose-lose situation for me as maintainer (and this holds in general
really, not just for this patch):
- Either I wield my considerable maintainer powers and force proper
solution, which will piss of a lot of people.
- Or I just merge intermediate stuff and piss of another set of people
(including likely our all future selves because we're slowly digging a
tech debt grave).
What I can't do with maintainer fu is force collaboration, I can only try
to piss off everyone equally. And today the die rolled a "merge".
Oh well.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-10-06 9:25 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-05 12:26 [PATCH v4] drm/i915: Clean up associated VMAs on context destruction Tvrtko Ursulin
2015-10-06 8:58 ` Daniel Vetter
2015-10-06 9:04 ` Chris Wilson
2015-10-06 9:28 ` Daniel Vetter [this message]
2015-10-06 9:34 ` Chris Wilson
2015-10-06 9:48 ` Tvrtko Ursulin
2015-10-06 12:15 ` Daniel Vetter
2015-10-08 13:45 ` Ville Syrjälä
2015-10-08 14:03 ` Tvrtko Ursulin
2015-10-08 16:03 ` Ville Syrjälä
2015-10-08 16:26 ` Tvrtko Ursulin
2015-10-08 16:41 ` Ville Syrjälä
2015-10-23 13:27 ` Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151006092849.GC3383@phenom.ffwll.local \
--to=daniel@ffwll.ch \
--cc=Intel-gfx@lists.freedesktop.org \
--cc=chris@chris-wilson.co.uk \
--cc=daniel.vetter@ffwll.ch \
--cc=michel.thierry@intel.com \
--cc=rafael.barbalho@intel.com \
--cc=tvrtko.ursulin@intel.com \
--cc=tvrtko.ursulin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.