From: Yu Dai <yu.dai@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v1] drm/i915: Fix a false alert of memory leak when free LRC
Date: Mon, 23 Nov 2015 14:30:38 -0800
Message-ID: <5653938E.20404@intel.com>
In-Reply-To: <5652EBD3.90002@linux.intel.com>



On 11/23/2015 02:34 AM, Tvrtko Ursulin wrote:
> On 20/11/15 08:31, Daniel Vetter wrote:
> > On Thu, Nov 19, 2015 at 04:10:26PM -0800, yu.dai@intel.com wrote:
> >> From: Alex Dai <yu.dai@intel.com>
> >>
> >> There is a memory leak warning message from i915_gem_context_clean
> >> when GuC submission is enabled. The reason is that when LRC is
> >> released, its ppgtt could be still referenced. The assumption that
> >> all VMAs are unbound during release of LRC is not true.
> >>
> >> v1: Move the code inside i915_gem_context_clean() to where ppgtt is
> >> released because it is not cleaning context anyway but ppgtt.
> >>
> >> Signed-off-by: Alex Dai <yu.dai@intel.com>
> >
> > retire__read drops the ctx (and hence ppgtt) reference too early,
> > resulting in us hitting the WARNING. See the giant thread with Tvrtko,
> > Chris and me:
> >
> > http://www.spinics.net/lists/intel-gfx/msg78918.html
> >
> > Would be great if someone could test the diff I posted in there.
>
> It doesn't work - I have posted my IGT snippet which I thought explained it.

I thought moving the VMA list cleanup into i915_ppgtt_release() should
work. However, it creates a chicken-and-egg problem: ppgtt_release()
relies on vma_unbound() being called first to decrease its refcount, so
calling vma_unbound() inside ppgtt_release() is not right.

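
A minimal sketch of that ordering problem, assuming a toy refcount model.
None of the toy_* names below exist in i915; they only stand in for the
ppgtt reference held by the context and by each bound VMA. The release
hook runs only once the count reaches zero, so it can never be the place
that unbinds the VMAs that are keeping the count above zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an address space; not the real i915 structure. */
struct toy_ppgtt {
	int refcount;    /* one reference per holder: context, each bound VMA */
	int bound_vmas;  /* VMAs still bound into this address space */
	bool released;
};

static void toy_ppgtt_release(struct toy_ppgtt *p)
{
	/*
	 * Only reached at refcount == 0. Every bound VMA holds a
	 * reference, so none can still be bound at this point.
	 */
	assert(p->bound_vmas == 0);
	p->released = true;
}

static void toy_ppgtt_put(struct toy_ppgtt *p)
{
	if (--p->refcount == 0)
		toy_ppgtt_release(p);
}

static void toy_vma_bind(struct toy_ppgtt *p)
{
	p->refcount++;
	p->bound_vmas++;
}

static void toy_vma_unbind(struct toy_ppgtt *p)
{
	p->bound_vmas--;
	toy_ppgtt_put(p);
}

/*
 * Returns true if the address space survives the context dropping its
 * reference and is released only after the last VMA is unbound.
 */
bool toy_release_waits_for_vma(void)
{
	struct toy_ppgtt p = { .refcount = 1 };  /* the context's reference */
	bool alive_after_ctx_put;

	toy_vma_bind(&p);                  /* refcount: 2 */
	toy_ppgtt_put(&p);                 /* context goes away, refcount: 1 */
	alive_after_ctx_put = !p.released;
	toy_vma_unbind(&p);                /* refcount: 0, release runs */

	return alive_after_ctx_put && p.released;
}
```

In this model, moving the unbind into toy_ppgtt_release() would leave the
last reference held forever, which is the chicken-and-egg situation
described above.
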
> Problem req unreference in obj->active case. When it hits that path it
> will not move the VMA to inactive and the
> intel_execlists_retire_requests will be the last unreference from the
> retire worker which will trigger the WARN.

I still think the problem comes from the assumption that when an LRC is
released, all of its VMAs should be unbound. Specifically, I mean the
comment in i915_gem_context_clean(): "This context is going away and we
need to remove all VMAs still around." In reality the LRC life cycle is
different from that of the ppgtt / VMAs. Check the line after
i915_gem_context_clean(): it is ppgtt_put(). If the LRC is freed early,
that call won't release the ppgtt anyway, because it is still referenced
by VMAs. It will be freed when there are no more references to the GEM
objects.
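
To illustrate why the warning is a false alert rather than a leak, here
is a small sketch under the same kind of assumptions: a toy context, a
toy address space and a toy GEM object, all hypothetical names that do
not come from the driver. The context is freed while the object's VMA is
still bound, the "no VMAs left" check fires, and the address space is
still freed correctly once the object goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these mirror the real i915 layouts. */
struct toy_vm  { int refcount; int bound_vmas; bool freed; };
struct toy_ctx { struct toy_vm *vm; };
struct toy_obj { struct toy_vm *vm; };  /* VM its single VMA is bound into */

static int toy_alerts;  /* how often the "VMAs still around" check fired */

static void toy_vm_put(struct toy_vm *vm)
{
	assert(vm->refcount > 0);
	if (--vm->refcount == 0)
		vm->freed = true;
}

static void toy_obj_bind(struct toy_obj *obj, struct toy_vm *vm)
{
	obj->vm = vm;
	vm->refcount++;   /* the bound VMA keeps the VM alive */
	vm->bound_vmas++;
}

static void toy_obj_free(struct toy_obj *obj)
{
	if (obj->vm) {
		obj->vm->bound_vmas--;
		toy_vm_put(obj->vm);
		obj->vm = NULL;
	}
}

static void toy_ctx_free(struct toy_ctx *ctx)
{
	/* Models the assumption under discussion: all VMAs are gone by now. */
	if (ctx->vm->bound_vmas != 0)
		toy_alerts++;
	toy_vm_put(ctx->vm);
	ctx->vm = NULL;
}

/*
 * Frees the context before the object. Returns the number of alerts
 * raised and reports through *leaked whether the VM was left unfreed.
 */
int toy_ctx_freed_before_obj(bool *leaked)
{
	struct toy_vm vm = { .refcount = 1 };  /* the context's reference */
	struct toy_ctx ctx = { .vm = &vm };
	struct toy_obj obj = { .vm = NULL };

	toy_alerts = 0;
	toy_obj_bind(&obj, &vm);  /* refcount: 2 */
	toy_ctx_free(&ctx);       /* alert fires, refcount: 1, VM still alive */
	toy_obj_free(&obj);       /* refcount: 0, VM freed */

	*leaked = !vm.freed;
	return toy_alerts;
}
```

The check in toy_ctx_free() fires once, yet nothing leaks, because the
final put arrives with the object's life cycle and not the context's.
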

> I posted an IGT which hits that ->
> http://patchwork.freedesktop.org/patch/65369/
>
> And posted one give up on the active VMA mem leak patch ->
> http://patchwork.freedesktop.org/patch/65529/

This patch will silence the warning. But I think
i915_gem_context_clean() itself is unnecessary; I don't see any issue
with deleting it. The check of the VMA list is inside ppgtt_release(),
and unbinding should follow the GEM object's life cycle, not the LRC's.

> I have no idea yet of GuC implications, I just spotted this parallel thread.
>
> And Mika has proposed something interesting - that we could just clean
> up the active VMA in context cleanup since we know it is a false one.
>
> However, again I don't know how that interacts with the GuC. Surely it
> cannot be freeing the context with stuff genuinely still active in the GuC?
>

There is no interaction with GuC, though. It is just very easy to see
the warning when GuC is enabled, say when running gem_close_race. The
reason is that GuC does not use the execlist_queue
(execlist_retired_req_list), which is deferred to the retire worker. As
in ring submission mode, when GuC is enabled the driver tries to release
the previous request whenever it submits a new batch. I don't know why
intel_execlists_retire_requests is not called in this case; probably
because of the unpin. Deferring the retirement may just hide the issue.
I bet you will see the warning more often if you change
i915_gem_retire_requests_ring() to i915_gem_retire_requests() in
i915_gem_execbuffer_reserve().

Thanks,
Alex
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

