From mboxrd@z Thu Jan 1 00:00:00 1970
From: Imre Deak
Subject: Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
Date: Mon, 18 Apr 2016 11:54:31 +0300
Message-ID: <1460969671.3172.25.camel@intel.com>
References: <1460963062-13211-1-git-send-email-imre.deak@intel.com>
 <1460963062-13211-3-git-send-email-imre.deak@intel.com>
 <20160418082822.GY4329@intel.com>
 <1460968358.3172.17.camel@intel.com>
 <20160418084451.GB4329@intel.com>
Reply-To: imre.deak@intel.com
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
In-Reply-To: <20160418084451.GB4329@intel.com>
Sender: stable-owner@vger.kernel.org
To: Ville Syrjälä
Cc: intel-gfx@lists.freedesktop.org, stable@vger.kernel.org
List-Id: intel-gfx@lists.freedesktop.org

On ma, 2016-04-18 at 11:44 +0300, Ville Syrjälä wrote:
> On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> > On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > During system resume we depended on pci_enable_device() also putting the
> > > > device into PCI D0 state. This won't work if the PCI device was already
> > > > enabled but still in D3 state. This is because pci_enable_device() is
> > > > refcounted and will not change the HW state if called with a non-zero
> > > > refcount. Leaving the device in D3 will make all subsequent device
> > > > accesses fail.
> > > >
> > > > This didn't cause a problem most of the time, since we resumed with an
> > > > enable refcount of 0. But it fails at least after module reload because
> > > > after that we also happen to leak a PCI device enable reference: During
> > > > probing we call drm_get_pci_dev() which will enable the PCI device, but
> > > > during device removal drm_put_dev() won't disable it.
> > > > This is a bug of its own in DRM core, but without much harm as it
> > > > only leaves the PCI device enabled. Fixing it is also a bit more
> > > > involved, due to DRM mid-layering and because it affects non-i915
> > > > drivers too. The fix in this patch is valid regardless of the
> > > > problem in DRM core.
> > > >
> > > > CC: Ville Syrjälä
> > > > CC: stable@vger.kernel.org
> > > > Signed-off-by: Imre Deak
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > index d550ae2..7eaa93e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
> > > >  static int i915_drm_resume_early(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	int ret = 0;
> > > > +	int ret;
> > > >
> > > >  	/*
> > > >  	 * We have a resume ordering issue with the snd-hda driver also
> > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
> > > >  	 * FIXME: This should be solved with a special hdmi sink device or
> > > >  	 * similar so that power domains can be employed.
> > > >  	 */
> > > > +
> > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > +	if (ret) {
> > > > +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> > > > +		goto out;
> > > > +	}
> > >
> > > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> >
> > It does, but not during thaw_noirq.
>
> Maybe put that into a comment?
> If we ever get to dropping the device
> state frobbery from freeze/thaw, then we should also be able to throw
> out the pci_set_power_state() call as well.

Yes, can add a comment.

> Perhaps we should have some asserts about the state of the PCI device?

You mean after calling pci_enable_device() that it's indeed in D0 and
enabled? Can do that as a follow-up.

--Imre
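
The refcounting behaviour described in the quoted commit message can be
illustrated with a small userspace model. This is only a sketch and not
kernel code: the toy_* names, the struct and the two-state power enum are
hypothetical, and the only behaviour modelled is what the commit message
states, namely that an enable call made with a non-zero refcount leaves the
HW power state untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
enum toy_power_state { TOY_D0, TOY_D3 };

struct toy_pci_dev {
	int enable_cnt;             /* models the enable refcount */
	enum toy_power_state state; /* models the HW power state */
};

/* Models the explicit power-state change that the patch adds. */
static int toy_set_power_state(struct toy_pci_dev *dev, enum toy_power_state s)
{
	dev->state = s;
	return 0;
}

/* Models a refcounted enable: only the first reference touches the HW. */
static int toy_enable_device(struct toy_pci_dev *dev)
{
	if (++dev->enable_cnt > 1)
		return 0; /* already enabled, HW state is left alone */
	return toy_set_power_state(dev, TOY_D0);
}

/*
 * Models resume of a device that sits in D3 with leaked_refs enable
 * references left over from before. Returns true if it ends up in D0.
 */
bool toy_resume_reaches_d0(int leaked_refs, bool with_fix)
{
	struct toy_pci_dev dev = { .enable_cnt = leaked_refs, .state = TOY_D3 };

	assert(leaked_refs >= 0);

	/* With the fix, D0 is requested explicitly before enabling. */
	if (with_fix && toy_set_power_state(&dev, TOY_D0))
		return false;
	if (toy_enable_device(&dev))
		return false;
	return dev.state == TOY_D0;
}
```

In this model toy_resume_reaches_d0(1, false) corresponds to the module
reload case with one leaked enable reference and returns false, since the
device stays in D3. With with_fix set to true the explicit D0 transition
happens first, so the result no longer depends on the refcount.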