From: Yu Dai <yu.dai@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v1] drm/i915: Fix a false alert of memory leak when free LRC
Date: Mon, 23 Nov 2015 14:30:38 -0800 [thread overview]
Message-ID: <5653938E.20404@intel.com> (raw)
In-Reply-To: <5652EBD3.90002@linux.intel.com>
On 11/23/2015 02:34 AM, Tvrtko Ursulin wrote:
> On 20/11/15 08:31, Daniel Vetter wrote:
> > On Thu, Nov 19, 2015 at 04:10:26PM -0800, yu.dai@intel.com wrote:
> >> From: Alex Dai <yu.dai@intel.com>
> >>
> >> There is a memory leak warning message from i915_gem_context_clean
> >> when GuC submission is enabled. The reason is that when LRC is
> >> released, its ppgtt could be still referenced. The assumption that
> >> all VMAs are unbound during release of LRC is not true.
> >>
> >> v1: Move the code inside i915_gem_context_clean() to where ppgtt is
> >> released because it is not cleaning context anyway but ppgtt.
> >>
> >> Signed-off-by: Alex Dai <yu.dai@intel.com>
> >
> > retire__read drops the ctx (and hence ppgtt) reference too early,
> > resulting in us hitting the WARNING. See the giant thread with Tvrtko,
> > Chris and me:
> >
> > http://www.spinics.net/lists/intel-gfx/msg78918.html
> >
> > Would be great if someone could test the diff I posted in there.
>
> It doesn't work - I have posted my IGT snippet which I thought explained it.
I thought moving the VMA list clean up into i915_ppgtt_release() should
work. However, it creates a chicken & egg problem. ppgtt_release() rely
on vma_unbound() to be called first to decrease its refcount. So calling
vma_unbound() inside ppgtt_release() is not right.
> Problem req unreference in obj->active case. When it hits that path it
> will not move the VMA to inactive and the
> intel_execlists_retire_requests will be the last unreference from the
> retire worker which will trigger the WARN.
I still think the problem comes from the assumption that when lrc is
released, its all VMAs should be unbound. Precisely I mean the comments
made for i915_gem_context_clean() - "This context is going away and we
need to remove all VMAs still around." Really the lrc life cycle is
different from ppgtt / VMAs. Check the line after
i915_gem_context_clean(). It is ppgtt_put(). In the case lrc is freed
early, It won't release ppgtt anyway because it is still referenced by
VMAs. An it will be freed when no ref of GEM obj.
> I posted an IGT which hits that ->
> http://patchwork.freedesktop.org/patch/65369/
>
> And posted one give up on the active VMA mem leak patch ->
> http://patchwork.freedesktop.org/patch/65529/
This patch will silent the warning. But I think the
i915_gem_context_clean() itself is unnecessary. I don't see any issue by
deleting it. The check of VMA list is inside ppgtt_release() and the
unbound should be aligned to GEM obj's life cycle but not lrc life cycle.
> I have no idea yet of GuC implications, I just spotted this parallel thread.
>
> And Mika has proposed something interesting - that we could just clean
> up the active VMA in context cleanup since we know it is a false one.
>
> However, again I don't know how that interacts with the GuC. Surely it
> cannot be freeing the context with stuff genuinely still active in the GuC?
>
There is no interacts with GuC though. Just very easy to see the warning
when GuC is enabled, says when run gem_close_race. The reason is that
GuC does not use the execlist_queue (execlist_retired_req_list) which is
deferred to retire worker. Same as ring submission mode, when GuC is
enabled, whenever driver submits a new batch, it will try to release
previous request. I don't know why intel_execlists_retire_requests is
not called for this case. Probably because of the unpin. Deferring the
retirement may just hide the issue. I bet you will see the warning more
often if you change i915_gem_retire_requests_ring() to
i915_gem_retire_requests() in i915_gem_execbuffer_reserve().
Thanks,
Alex
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2015-11-23 22:32 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-19 22:05 [PATCH] drm/i915/guc: Fix a false alert of memory leak when free LRC yu.dai
2015-10-20 7:45 ` Daniel Vetter
2015-10-21 18:27 ` yu.dai
2015-10-23 21:40 ` Dave Gordon
2015-10-24 8:52 ` Chris Wilson
2015-11-20 0:10 ` [PATCH v1] drm/i915: " yu.dai
2015-11-20 8:31 ` Daniel Vetter
2015-11-20 18:38 ` Yu Dai
2015-11-23 10:34 ` Tvrtko Ursulin
2015-11-23 22:30 ` Yu Dai [this message]
2015-11-24 10:46 ` Tvrtko Ursulin
2015-11-24 10:57 ` Daniel Vetter
2015-11-24 12:50 ` Chris Wilson
2015-11-24 12:51 ` Chris Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5653938E.20404@intel.com \
--to=yu.dai@intel.com \
--cc=daniel@ffwll.ch \
--cc=intel-gfx@lists.freedesktop.org \
--cc=tvrtko.ursulin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.