From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stable-owner@vger.kernel.org>
Received: from mga11.intel.com ([192.55.52.93]:9109 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750915AbcDRIQj (ORCPT <rfc822;stable@vger.kernel.org>);
	Mon, 18 Apr 2016 04:16:39 -0400
Message-ID: <1460967394.3172.11.camel@intel.com>
Subject: Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI
 device remained enabled
From: Imre Deak <imre.deak@intel.com>
Reply-To: imre.deak@intel.com
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, stable@vger.kernel.org
Date: Mon, 18 Apr 2016 11:16:34 +0300
In-Reply-To: <20160418080646.GD10708@nuc-i3427.alporthouse.com>
References: <1460963062-13211-1-git-send-email-imre.deak@intel.com>
	 <1460963062-13211-3-git-send-email-imre.deak@intel.com>
	 <20160418080646.GD10708@nuc-i3427.alporthouse.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: stable-owner@vger.kernel.org
List-ID: <stable.vger.kernel.org>

On Mon, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > During system resume we depended on pci_enable_device() also putting the
> > device into PCI D0 state. This won't work if the PCI device was already
> > enabled but still in D3 state. This is because pci_enable_device() is
> > refcounted and will not change the HW state if called with a non-zero
> > refcount. Leaving the device in D3 will make all subsequent device
> > accesses fail.
> > 
> > This didn't cause a problem most of the time, since we resumed with an
> > enable refcount of 0. But it fails at least after module reload because
> > after that we also happen to leak a PCI device enable reference: During
> > probing we call drm_get_pci_dev() which will enable the PCI device, but
> > during device removal drm_put_dev() won't disable it. This is a bug of
> > its own in DRM core, but without much harm as it only leaves the PCI
> > device enabled. Fixing it is also a bit more involved, due to DRM
> > mid-layering and because it affects non-i915 drivers too. The fix in
> > this patch is valid regardless of the problem in DRM core.
> > 
> > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > CC: stable@vger.kernel.org
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index d550ae2..7eaa93e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
> >  static int i915_drm_resume_early(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	int ret = 0;
> > +	int ret;
> >  
> >  	/*
> >  	 * We have a resume ordering issue with the snd-hda driver also
> > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
> >  	 * FIXME: This should be solved with a special hdmi sink device or
> >  	 * similar so that power domains can be employed.
> >  	 */
> > +
> > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > +	if (ret) {
> > +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> > +		goto out;
> > +	}
> 
> The device should be enabled first, otherwise we are not meant to be
> touching its IO space at all (such as twiddling power state). At least
> that is the order pci_enable_device() uses.

pci_set_power_state() requires neither MMIO nor (port) IO access, only
PCI config space access, so it doesn't need the PCI resources to be
enabled. AFAICS pci_enable_device() also powers up the device as its
first step.
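
To illustrate the refcount problem described in the commit message, here
is a minimal user-space sketch. It is a toy model and not kernel code:
every toy_* name is hypothetical, and it assumes only what the commit
message states, namely that a refcounted enable leaves the HW state alone
when the refcount is already non-zero.

```c
#include <stdbool.h>

/* Toy model only: all names below are hypothetical, not kernel APIs. */
enum toy_power_state { TOY_D0, TOY_D3 };

struct toy_pci_dev {
	int enable_cnt;			/* models the enable refcount */
	enum toy_power_state state;	/* models the HW power state */
};

/* Refcounted enable: HW state changes only on the 0 -> 1 transition. */
int toy_enable_device(struct toy_pci_dev *dev)
{
	if (dev->enable_cnt++ > 0)
		return 0;	/* already enabled, HW state left as-is */
	dev->state = TOY_D0;
	return 0;
}

/* Explicit power state change, independent of the enable refcount. */
int toy_set_power_state(struct toy_pci_dev *dev, enum toy_power_state s)
{
	dev->state = s;
	return 0;
}

/* Old resume order: rely on the enable call alone to reach D0. */
bool toy_resume_old(struct toy_pci_dev *dev)
{
	toy_enable_device(dev);
	return dev->state == TOY_D0;
}

/* Patched resume order: force D0 first, then enable. */
bool toy_resume_new(struct toy_pci_dev *dev)
{
	if (toy_set_power_state(dev, TOY_D0))
		return false;
	toy_enable_device(dev);
	return dev->state == TOY_D0;
}
```

With a leaked reference (enable_cnt == 1, state == TOY_D3), the old order
leaves the toy device in D3, while the patched order reaches D0 regardless
of the refcount.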

> Either way, upon failure we should be unwinding.

I'd rather not put the device back into D3 state, as further device
access may still be possible even though resume failed.

--Imre