All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yu Dai <yu.dai@intel.com>
To: Dave Gordon <david.s.gordon@intel.com>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed
Date: Tue, 12 Jan 2016 14:57:28 -0800	[thread overview]
Message-ID: <569584D8.9080203@intel.com> (raw)
In-Reply-To: <5694ED8C.9070603@intel.com>



On 01/12/2016 04:11 AM, Dave Gordon wrote:
> On 06/01/16 20:53, yu.dai@intel.com wrote:
> > From: Alex Dai <yu.dai@intel.com>
> >
> > During driver unloading, the guc_client created for command submission
> > needs to be released to avoid memory leak.
> >
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
> >   1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index 9c24424..8ce4f32 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -995,6 +995,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
> >   	struct drm_i915_private *dev_priv = dev->dev_private;
> >   	struct intel_guc *guc = &dev_priv->guc;
> >
> > +	if (i915.enable_guc_submission)
> > +		i915_guc_submission_disable(dev);
> > +
> >   	gem_release_guc_obj(dev_priv->guc.ads_obj);
> >   	guc->ads_obj = NULL;
>
> This looks like the right thing to do, but the wrong place to do it.
>
> i915_guc_submission_{init,enable,disable,fini} are the top-level
> functions exported from this source file and called (only) from
> intel_guc_loader.c
>
> Therefore, the code in intel_guc_ucode_fini() should call
> submission_disable() before submission_fini(), like this:
>
> /**
>    * intel_guc_ucode_fini() - clean up all allocated resources
>    * @dev:        drm device
>    */
> void intel_guc_ucode_fini(struct drm_device *dev)
> {
>           struct drm_i915_private *dev_priv = dev->dev_private;
>           struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
>
>           direct_interrupts_to_host(dev_priv);
> +	i915_guc_submission_disable(dev);
> 	i915_guc_submission_fini(dev);
>
>           mutex_lock(&dev->struct_mutex);
>           if (guc_fw->guc_fw_obj)
>                   drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
>           guc_fw->guc_fw_obj = NULL;
>           mutex_unlock(&dev->struct_mutex);
>
>           guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
> }
>
> There's no need for it to be conditional, as disable (and fini) are
> idempotent; if a thing hasn't been allocated, or has already been
> deallocated, then these functions will just do nothing.

I agree. We will keep the symmetry here 
i915_guc_submission_init(_enable, _disable and _fini).
> HOWEVER,
>
> while reviewing this I've noticed that the locking is all screwed up;
> basically "bf248ca drm/i915: Fix locking around GuC firmware load"
> removed locking round the calls into i915_guc_loader.c and added it back
> in a few places, but not enough.
>
> It would probably have been better to have left the locking in the
> caller, and hence round the entirety of the calls to _init, _load,
> _fini, and then explicitly DROP the mutex only for the duration of the
> request_firmware call.
>
> It would have been better still not to insist on synchronous firmware
> load in the first place; the original generic (and asynchronous) loader
> didn't require struct_mutex or any other locking around the
> request_firmware() call, so we wouldn't now have to fix it (again).
>
> At present, in intel_guc_loader.c, intel_guc_ucode_load() is called with
> the struct_mutex already held by the caller, but _init() and _fini() are
> called with it NOT held.
>
> All exported functions in i915_guc_submission.c expect it to be held
> when they're called.
>
> On that basis, what we need now is:
>
> guc_fw_fetch() needs to take & release the mutex round the unreference
> in the fail: path (like the code in _fini above).

I prefer the current approach that only takes lock for necessary 
critical session.
> intel_guc_ucode_fini() needs to extend the scope of the lock to enclose
> all calls to _submission_ functions. So the above becomes:
>
> /**
> * intel_guc_ucode_fini() - clean up all allocated resources
> * @dev: drm device
> */
> void intel_guc_ucode_fini(struct drm_device *dev)
> {
> 	struct drm_i915_private *dev_priv = dev->dev_private;
> 	struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
>
> 	mutex_lock(&dev->struct_mutex);
> 	direct_interrupts_to_host(dev_priv);
> 	i915_guc_submission_disable(dev);
> 	i915_guc_submission_fini(dev);
>
> 	if (guc_fw->guc_fw_obj)
> 		drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
> 	guc_fw->guc_fw_obj = NULL;
> 	mutex_unlock(&dev->struct_mutex);
>
> 	guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
> }

This is done by patch https://patchwork.freedesktop.org/patch/68708/. 
Please review this one.
> FINALLY,
>
> intel_guc_ucode_load() should probably call i915_guc_submission_fini()
> in the failure path (after submission_disable()) as it called
> submission_init() earlier. Not critical, as it will get called from
> ucode_fini() anyway, but it improves symmetry.
>
>

We don't have ucode_unload(). The ucode_fini() is actually doing the 
unload job. Because ucode_fini() needs to acquire the lock but 
ucode_load() expects that lock is held by caller, calling ucode_fini() 
inside ucode_load() is not good. I don't think it is worth to wrap up a 
ucode_unload() call which only includes two lines 
(direct_interrupts_to_host and i915_guc_submission_disable).

Thanks,
Alex
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-01-12 23:01 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-06 20:53 [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed yu.dai
2016-01-07  7:49 ` ✓ success: Fi.CI.BAT Patchwork
2016-01-12 12:11 ` [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed Dave Gordon
2016-01-12 22:57   ` Yu Dai [this message]
2016-01-12 23:17 ` [PATCH v1] " yu.dai
2016-01-13 18:15   ` Dave Gordon
2016-01-13 18:17     ` Yu Dai
2016-01-13 18:51       ` Dave Gordon
2016-01-13  8:49 ` ✗ failure: Fi.CI.BAT Patchwork
2016-01-13 19:01 ` [PATCH v2] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed yu.dai
2016-01-13 19:11   ` Dave Gordon
2016-01-18 10:01     ` Tvrtko Ursulin
2016-01-19  8:48       ` Daniel Vetter
2016-01-14  9:49 ` ✗ warning: Fi.CI.BAT Patchwork
2016-01-19 10:07   ` Tvrtko Ursulin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=569584D8.9080203@intel.com \
    --to=yu.dai@intel.com \
    --cc=david.s.gordon@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.