[PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled

stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
       [not found] <1460963062-13211-1-git-send-email-imre.deak@intel.com>
@ 2016-04-18  7:04 ` Imre Deak
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
                     ` (3 more replies)
  2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
  1 sibling, 4 replies; 17+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ville Syrjälä, stable

During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d550ae2..7eaa93e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
 static int i915_drm_resume_early(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret = 0;
+	int ret;

 	/*
 	 * We have a resume ordering issue with the snd-hda driver also
@@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
 	 * FIXME: This should be solved with a special hdmi sink device or
 	 * similar so that power domains can be employed.
 	 */
+
+	ret = pci_set_power_state(dev->pdev, PCI_D0);
+	if (ret) {
+		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
+		goto out;
+	}
+
 	if (pci_enable_device(dev->pdev)) {
 		ret = -EIO;
 		goto out;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
       [not found] <1460963062-13211-1-git-send-email-imre.deak@intel.com>
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
@ 2016-04-18  7:04 ` Imre Deak
  2016-04-18 11:05   ` Ville Syrjälä
  1 sibling, 1 reply; 17+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ville Syrjälä, stable

The driver's VDD on/off logic assumes that whenever the VDD is on we
also hold an AUX power domain reference. Since BIOS can leave the VDD on
during booting and resuming and on DDI platforms we won't take a
corresponding power reference, the above assumption won't hold on those
platforms and an eventual delayed VDD off work will do an extraneous AUX
power domain put resulting in a refcount underflow. Fix this the same
way we did this for non-DDI DP encoders:

6d93c0c41760c0 ("drm/i915: fix VDD state tracking after system resume")

At the same time call the DP encoder suspend handler the same way as the
non-DDI DP encoders do to flush any pending VDD off work. Leaving the
work running may cause a HW access where we don't expect this (at a point
where power domains are suspended already).

While at it remove an unnecessary function call indirection.

This fixed for me AUX refcount underflow problems on BXT during
suspend/resume.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_ddi.c | 10 +++-------
 drivers/gpu/drm/i915/intel_dp.c  |  4 ++--
 drivers/gpu/drm/i915/intel_drv.h |  2 ++
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 96234c5..c2348fb 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -2206,12 +2206,6 @@ void intel_ddi_get_config(struct intel_encoder *encoder,
 	intel_ddi_clock_get(encoder, pipe_config);
 }
 
-static void intel_ddi_destroy(struct drm_encoder *encoder)
-{
-	/* HDMI has nothing special to destroy, so we can go with this. */
-	intel_dp_encoder_destroy(encoder);
-}
-
 static bool intel_ddi_compute_config(struct intel_encoder *encoder,
 				     struct intel_crtc_state *pipe_config)
 {
@@ -2230,7 +2224,8 @@ static bool intel_ddi_compute_config(struct intel_encoder *encoder,
 }
 
 static const struct drm_encoder_funcs intel_ddi_funcs = {
-	.destroy = intel_ddi_destroy,
+	.reset = intel_dp_encoder_reset,
+	.destroy = intel_dp_encoder_destroy,
 };
 
 static struct intel_connector *
@@ -2329,6 +2324,7 @@ void intel_ddi_init(struct drm_device *dev, enum port port)
 	intel_encoder->post_disable = intel_ddi_post_disable;
 	intel_encoder->get_hw_state = intel_ddi_get_hw_state;
 	intel_encoder->get_config = intel_ddi_get_config;
+	intel_encoder->suspend = intel_dp_encoder_suspend;
 
 	intel_dig_port->port = port;
 	intel_dig_port->saved_port_bits = I915_READ(DDI_BUF_CTL(port)) &
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 61ee226..c6af3d0 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4889,7 +4889,7 @@ void intel_dp_encoder_destroy(struct drm_encoder *encoder)
 	kfree(intel_dig_port);
 }
 
-static void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
+void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
 {
 	struct intel_dp *intel_dp = enc_to_intel_dp(&intel_encoder->base);
 
@@ -4931,7 +4931,7 @@ static void intel_edp_panel_vdd_sanitize(struct intel_dp *intel_dp)
 	edp_panel_vdd_schedule_off(intel_dp);
 }
 
-static void intel_dp_encoder_reset(struct drm_encoder *encoder)
+void intel_dp_encoder_reset(struct drm_encoder *encoder)
 {
 	struct intel_dp *intel_dp;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index e13ce22..10dfe72 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1285,6 +1285,8 @@ void intel_dp_set_link_params(struct intel_dp *intel_dp,
 void intel_dp_start_link_train(struct intel_dp *intel_dp);
 void intel_dp_stop_link_train(struct intel_dp *intel_dp);
 void intel_dp_sink_dpms(struct intel_dp *intel_dp, int mode);
+void intel_dp_encoder_reset(struct drm_encoder *encoder);
+void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder);
 void intel_dp_encoder_destroy(struct drm_encoder *encoder);
 int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc);
 bool intel_dp_compute_config(struct intel_encoder *encoder,
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
@ 2016-04-18  8:06   ` Chris Wilson
  2016-04-18  8:16     ` Imre Deak
  2016-04-18  8:28   ` Ville Syrjälä
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-04-18  8:06 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> During system resume we depended on pci_enable_device() also putting the
> device into PCI D0 state. This won't work if the PCI device was already
> enabled but still in D3 state. This is because pci_enable_device() is
> refcounted and will not change the HW state if called with a non-zero
> refcount. Leaving the device in D3 will make all subsequent device
> accesses fail.
> 
> This didn't cause a problem most of the time, since we resumed with an
> enable refcount of 0. But it fails at least after module reload because
> after that we also happen to leak a PCI device enable reference: During
> probing we call drm_get_pci_dev() which will enable the PCI device, but
> during device removal drm_put_dev() won't disable it. This is a bug of
> its own in DRM core, but without much harm as it only leaves the PCI
> device enabled. Fixing it is also a bit more involved, due to DRM
> mid-layering and because it affects non-i915 drivers too. The fix in
> this patch is valid regardless of the problem in DRM core.
> 
> CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d550ae2..7eaa93e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  static int i915_drm_resume_early(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	int ret = 0;
> +	int ret;
>  
>  	/*
>  	 * We have a resume ordering issue with the snd-hda driver also
> @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	 * FIXME: This should be solved with a special hdmi sink device or
>  	 * similar so that power domains can be employed.
>  	 */
> +
> +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> +	if (ret) {
> +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> +		goto out;
> +	}

The device should be enabled first, otherwise we are not meant to be
touching its IO space at all (such as twiddling power state). At least
that is the order pci_enable_device() uses.

Either way, upon failure we should be unwinding.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
@ 2016-04-18  8:16     ` Imre Deak
  2016-04-18  8:24       ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Imre Deak @ 2016-04-18  8:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > During system resume we depended on pci_enable_device() also
> > putting the
> > device into PCI D0 state. This won't work if the PCI device was
> > already
> > enabled but still in D3 state. This is because pci_enable_device()
> > is
> > refcounted and will not change the HW state if called with a non-
> > zero
> > refcount. Leaving the device in D3 will make all subsequent device
> > accesses fail.
> > 
> > This didn't cause a problem most of the time, since we resumed with
> > an
> > enable refcount of 0. But it fails at least after module reload
> > because
> > after that we also happen to leak a PCI device enable reference:
> > During
> > probing we call drm_get_pci_dev() which will enable the PCI device,
> > but
> > during device removal drm_put_dev() won't disable it. This is a bug
> > of
> > its own in DRM core, but without much harm as it only leaves the
> > PCI
> > device enabled. Fixing it is also a bit more involved, due to DRM
> > mid-layering and because it affects non-i915 drivers too. The fix
> > in
> > this patch is valid regardless of the problem in DRM core.
> > 
> > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > CC: stable@vger.kernel.org
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index d550ae2..7eaa93e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > *dev)
> >  static int i915_drm_resume_early(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	int ret = 0;
> > +	int ret;
> >  
> >  	/*
> >  	 * We have a resume ordering issue with the snd-hda driver
> > also
> > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > drm_device *dev)
> >  	 * FIXME: This should be solved with a special hdmi sink
> > device or
> >  	 * similar so that power domains can be employed.
> >  	 */
> > +
> > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > +	if (ret) {
> > +		DRM_ERROR("failed to set PCI D0 power state
> > (%d)\n", ret);
> > +		goto out;
> > +	}
> 
> The device should be enabled first, otherwise we are not meant to be
> touching its IO space at all (such as twiddling power state). At
> least
> that is the order pci_enable_device() uses.

It's not MMIO or (port) IO but only a PCI config space access
that pci_set_power_state() requires, so doesn't need the enabling 
of PCI resources. AFAICS pci_enable_device() enables power as the first
thing.

> Either way, upon failure we should be unwinding.

I'd rather wouldn't put back the device to D3 state, as further device
access may still possible even though resume failed.

--Imre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:16     ` Imre Deak
@ 2016-04-18  8:24       ` Chris Wilson
  2016-04-18  8:37         ` Imre Deak
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-04-18  8:24 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:16:34AM +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > During system resume we depended on pci_enable_device() also
> > > putting the
> > > device into PCI D0 state. This won't work if the PCI device was
> > > already
> > > enabled but still in D3 state. This is because pci_enable_device()
> > > is
> > > refcounted and will not change the HW state if called with a non-
> > > zero
> > > refcount. Leaving the device in D3 will make all subsequent device
> > > accesses fail.
> > > 
> > > This didn't cause a problem most of the time, since we resumed with
> > > an
> > > enable refcount of 0. But it fails at least after module reload
> > > because
> > > after that we also happen to leak a PCI device enable reference:
> > > During
> > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > but
> > > during device removal drm_put_dev() won't disable it. This is a bug
> > > of
> > > its own in DRM core, but without much harm as it only leaves the
> > > PCI
> > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > mid-layering and because it affects non-i915 drivers too. The fix
> > > in
> > > this patch is valid regardless of the problem in DRM core.
> > > 
> > > CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> > > CC: stable@vger.kernel.org
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > ---
> > > ï¿½drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > ï¿½1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index d550ae2..7eaa93e 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > *dev)
> > > ï¿½static int i915_drm_resume_early(struct drm_device *dev)
> > > ï¿½{
> > > ï¿½	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	int ret = 0;
> > > +	int ret;
> > > ï¿½
> > > ï¿½	/*
> > > ï¿½	ï¿½* We have a resume ordering issue with the snd-hda driver
> > > also
> > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > drm_device *dev)
> > > ï¿½	ï¿½* FIXME: This should be solved with a special hdmi sink
> > > device or
> > > ï¿½	ï¿½* similar so that power domains can be employed.
> > > ï¿½	ï¿½*/
> > > +
> > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > +	if (ret) {
> > > +		DRM_ERROR("failed to set PCI D0 power state
> > > (%d)\n", ret);
> > > +		goto out;
> > > +	}
> > 
> > The device should be enabled first, otherwise we are not meant to be
> > touching its IO space at all (such as twiddling power state). At
> > least
> > that is the order pci_enable_device() uses.
> 
> It's not MMIO or (port) IO but only a PCI config space access
> thatï¿½pci_set_power_state() requires, so doesn't need the enabling 
> of PCI resources. AFAICSï¿½pci_enable_device() enables power as the first
> thing.
> 
> > Either way, upon failure we should be unwinding.
> 
> I'd rather wouldn't put back the device to D3 state, as further device
> access may still possible even though resume failed.

On the other hand, if you order it thusly:

	if (!pci_enable_device())
		return -EIO;

	ret = pci_set_power_state()
	if (ret < 0) {
		pci_disable_device()
		return ret;
	}

it doesn't raise as many eyebrows :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
@ 2016-04-18  8:28   ` Ville Syrjälä
  2016-04-18  8:32     ` Imre Deak
  2016-04-18 11:44   ` [PATCH v2 " Imre Deak
  2016-04-18 11:45   ` Imre Deak
  3 siblings, 1 reply; 17+ messages in thread
From: Ville Syrjälä @ 2016-04-18  8:28 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> During system resume we depended on pci_enable_device() also putting the
> device into PCI D0 state. This won't work if the PCI device was already
> enabled but still in D3 state. This is because pci_enable_device() is
> refcounted and will not change the HW state if called with a non-zero
> refcount. Leaving the device in D3 will make all subsequent device
> accesses fail.
> 
> This didn't cause a problem most of the time, since we resumed with an
> enable refcount of 0. But it fails at least after module reload because
> after that we also happen to leak a PCI device enable reference: During
> probing we call drm_get_pci_dev() which will enable the PCI device, but
> during device removal drm_put_dev() won't disable it. This is a bug of
> its own in DRM core, but without much harm as it only leaves the PCI
> device enabled. Fixing it is also a bit more involved, due to DRM
> mid-layering and because it affects non-i915 drivers too. The fix in
> this patch is valid regardless of the problem in DRM core.
> 
> CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d550ae2..7eaa93e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  static int i915_drm_resume_early(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	int ret = 0;
> +	int ret;
>  
>  	/*
>  	 * We have a resume ordering issue with the snd-hda driver also
> @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	 * FIXME: This should be solved with a special hdmi sink device or
>  	 * similar so that power domains can be employed.
>  	 */
> +
> +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> +	if (ret) {
> +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> +		goto out;
> +	}

Hmm. Doesn't this already happen from pci bus resume_noirq hook?

> +
>  	if (pci_enable_device(dev->pdev)) {
>  		ret = -EIO;
>  		goto out;
> -- 
> 2.5.0

-- 
Ville Syrjï¿½lï¿½
Intel OTC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:28   ` Ville Syrjälä
@ 2016-04-18  8:32     ` Imre Deak
  2016-04-18  8:44       ` Ville Syrjälä
  0 siblings, 1 reply; 17+ messages in thread
From: Imre Deak @ 2016-04-18  8:32 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > During system resume we depended on pci_enable_device() also
> > putting the
> > device into PCI D0 state. This won't work if the PCI device was
> > already
> > enabled but still in D3 state. This is because pci_enable_device()
> > is
> > refcounted and will not change the HW state if called with a non-
> > zero
> > refcount. Leaving the device in D3 will make all subsequent device
> > accesses fail.
> > 
> > This didn't cause a problem most of the time, since we resumed with
> > an
> > enable refcount of 0. But it fails at least after module reload
> > because
> > after that we also happen to leak a PCI device enable reference:
> > During
> > probing we call drm_get_pci_dev() which will enable the PCI device,
> > but
> > during device removal drm_put_dev() won't disable it. This is a bug
> > of
> > its own in DRM core, but without much harm as it only leaves the
> > PCI
> > device enabled. Fixing it is also a bit more involved, due to DRM
> > mid-layering and because it affects non-i915 drivers too. The fix
> > in
> > this patch is valid regardless of the problem in DRM core.
> > 
> > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > CC: stable@vger.kernel.org
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index d550ae2..7eaa93e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > *dev)
> >  static int i915_drm_resume_early(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	int ret = 0;
> > +	int ret;
> >  
> >  	/*
> >  	 * We have a resume ordering issue with the snd-hda driver
> > also
> > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > drm_device *dev)
> >  	 * FIXME: This should be solved with a special hdmi sink
> > device or
> >  	 * similar so that power domains can be employed.
> >  	 */
> > +
> > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > +	if (ret) {
> > +		DRM_ERROR("failed to set PCI D0 power state
> > (%d)\n", ret);
> > +		goto out;
> > +	}
> 
> Hmm. Doesn't this already happen from pci bus resume_noirq hook?

It does, but not during thaw_noirq.

> 
> > +
> >  	if (pci_enable_device(dev->pdev)) {
> >  		ret = -EIO;
> >  		goto out;
> > -- 
> > 2.5.0
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:24       ` Chris Wilson
@ 2016-04-18  8:37         ` Imre Deak
  2016-04-18  8:52           ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Imre Deak @ 2016-04-18  8:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 09:24 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 11:16:34AM +0300, Imre Deak wrote:
> > On ma, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > During system resume we depended on pci_enable_device() also
> > > > putting the
> > > > device into PCI D0 state. This won't work if the PCI device was
> > > > already
> > > > enabled but still in D3 state. This is because
> > > > pci_enable_device()
> > > > is
> > > > refcounted and will not change the HW state if called with a
> > > > non-
> > > > zero
> > > > refcount. Leaving the device in D3 will make all subsequent
> > > > device
> > > > accesses fail.
> > > > 
> > > > This didn't cause a problem most of the time, since we resumed
> > > > with
> > > > an
> > > > enable refcount of 0. But it fails at least after module reload
> > > > because
> > > > after that we also happen to leak a PCI device enable
> > > > reference:
> > > > During
> > > > probing we call drm_get_pci_dev() which will enable the PCI
> > > > device,
> > > > but
> > > > during device removal drm_put_dev() won't disable it. This is a
> > > > bug
> > > > of
> > > > its own in DRM core, but without much harm as it only leaves
> > > > the
> > > > PCI
> > > > device enabled. Fixing it is also a bit more involved, due to
> > > > DRM
> > > > mid-layering and because it affects non-i915 drivers too. The
> > > > fix
> > > > in
> > > > this patch is valid regardless of the problem in DRM core.
> > > > 
> > > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > CC: stable@vger.kernel.org
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index d550ae2..7eaa93e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct
> > > > drm_device
> > > > *dev)
> > > >  static int i915_drm_resume_early(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	int ret = 0;
> > > > +	int ret;
> > > >  
> > > >  	/*
> > > >  	 * We have a resume ordering issue with the snd-hda
> > > > driver
> > > > also
> > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > > drm_device *dev)
> > > >  	 * FIXME: This should be solved with a special hdmi
> > > > sink
> > > > device or
> > > >  	 * similar so that power domains can be employed.
> > > >  	 */
> > > > +
> > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > +	if (ret) {
> > > > +		DRM_ERROR("failed to set PCI D0 power state
> > > > (%d)\n", ret);
> > > > +		goto out;
> > > > +	}
> > > 
> > > The device should be enabled first, otherwise we are not meant to
> > > be
> > > touching its IO space at all (such as twiddling power state). At
> > > least
> > > that is the order pci_enable_device() uses.
> > 
> > It's not MMIO or (port) IO but only a PCI config space access
> > that pci_set_power_state() requires, so doesn't need the enabling 
> > of PCI resources. AFAICS pci_enable_device() enables power as the
> > first
> > thing.
> > 
> > > Either way, upon failure we should be unwinding.
> > 
> > I'd rather wouldn't put back the device to D3 state, as further
> > device
> > access may still possible even though resume failed.
> 
> On the other hand, if you order it thusly:
> 
> 	if (!pci_enable_device())
> 		return -EIO;
> 
> 	ret = pci_set_power_state()
> 	if (ret < 0) {
> 		pci_disable_device()
> 		return ret;
> 	}
> 
> it doesn't raise as many eyebrows :)

I don't know. I guess you mean that enabling the device first seems
like the first logical step. To me enabling power is the first logical
step and enabling the device itself is the second.

In any case if we decide to change this, I would suggest doing it
separately since then we also need to change the disabling order during
suspend.

--Imre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:32     ` Imre Deak
@ 2016-04-18  8:44       ` Ville Syrjälä
  2016-04-18  8:54         ` Imre Deak
  0 siblings, 1 reply; 17+ messages in thread
From: Ville Syrjälä @ 2016-04-18  8:44 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 11:28 +0300, Ville Syrjï¿½lï¿½ wrote:
> > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > During system resume we depended on pci_enable_device() also
> > > putting the
> > > device into PCI D0 state. This won't work if the PCI device was
> > > already
> > > enabled but still in D3 state. This is because pci_enable_device()
> > > is
> > > refcounted and will not change the HW state if called with a non-
> > > zero
> > > refcount. Leaving the device in D3 will make all subsequent device
> > > accesses fail.
> > > 
> > > This didn't cause a problem most of the time, since we resumed with
> > > an
> > > enable refcount of 0. But it fails at least after module reload
> > > because
> > > after that we also happen to leak a PCI device enable reference:
> > > During
> > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > but
> > > during device removal drm_put_dev() won't disable it. This is a bug
> > > of
> > > its own in DRM core, but without much harm as it only leaves the
> > > PCI
> > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > mid-layering and because it affects non-i915 drivers too. The fix
> > > in
> > > this patch is valid regardless of the problem in DRM core.
> > > 
> > > CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> > > CC: stable@vger.kernel.org
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > ---
> > > ï¿½drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > ï¿½1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index d550ae2..7eaa93e 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > *dev)
> > > ï¿½static int i915_drm_resume_early(struct drm_device *dev)
> > > ï¿½{
> > > ï¿½	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	int ret = 0;
> > > +	int ret;
> > > ï¿½
> > > ï¿½	/*
> > > ï¿½	ï¿½* We have a resume ordering issue with the snd-hda driver
> > > also
> > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > drm_device *dev)
> > > ï¿½	ï¿½* FIXME: This should be solved with a special hdmi sink
> > > device or
> > > ï¿½	ï¿½* similar so that power domains can be employed.
> > > ï¿½	ï¿½*/
> > > +
> > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > +	if (ret) {
> > > +		DRM_ERROR("failed to set PCI D0 power state
> > > (%d)\n", ret);
> > > +		goto out;
> > > +	}
> > 
> > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> 
> It does, but not during thaw_noirq.

Maybe put that into a comment? If we ever get to dropping the device
state frobbery from freeze/thaw, then we should also be able to throw
out the pci_set_power_state() call as well.

Perhaps we should have some asserts about the state of the PCI device?

> 
> > 
> > > +
> > > ï¿½	if (pci_enable_device(dev->pdev)) {
> > > ï¿½		ret = -EIO;
> > > ï¿½		goto out;
> > > -- 
> > > 2.5.0
> > 

-- 
Ville Syrjï¿½lï¿½
Intel OTC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:37         ` Imre Deak
@ 2016-04-18  8:52           ` Chris Wilson
  2016-04-18 11:05             ` Imre Deak
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2016-04-18  8:52 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:37:05AM +0300, Imre Deak wrote:
> I don't know. I guess you mean that enabling the device first seems
> like the first logical step. To me enabling power is the first logical
> step and enabling the device itself is the second.

It is not clear exactly, but what is clear is that pci_enable_device()
sets the power state of this device only after doing so for the bridge
hierachy.

The order in this patch is inconsistent.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:44       ` Ville Syrjälä
@ 2016-04-18  8:54         ` Imre Deak
  2016-04-18  9:04           ` Ville Syrjälä
  0 siblings, 1 reply; 17+ messages in thread
From: Imre Deak @ 2016-04-18  8:54 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 11:44 +0300, Ville Syrjälä wrote:
> On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> > On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > During system resume we depended on pci_enable_device() also
> > > > putting the
> > > > device into PCI D0 state. This won't work if the PCI device was
> > > > already
> > > > enabled but still in D3 state. This is because pci_enable_device()
> > > > is
> > > > refcounted and will not change the HW state if called with a non-
> > > > zero
> > > > refcount. Leaving the device in D3 will make all subsequent device
> > > > accesses fail.
> > > > 
> > > > This didn't cause a problem most of the time, since we resumed with
> > > > an
> > > > enable refcount of 0. But it fails at least after module reload
> > > > because
> > > > after that we also happen to leak a PCI device enable reference:
> > > > During
> > > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > > but
> > > > during device removal drm_put_dev() won't disable it. This is a bug
> > > > of
> > > > its own in DRM core, but without much harm as it only leaves the
> > > > PCI
> > > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > > mid-layering and because it affects non-i915 drivers too. The fix
> > > > in
> > > > this patch is valid regardless of the problem in DRM core.
> > > > 
> > > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > CC: stable@vger.kernel.org
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index d550ae2..7eaa93e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > > *dev)
> > > >  static int i915_drm_resume_early(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	int ret = 0;
> > > > +	int ret;
> > > >  
> > > >  	/*
> > > >  	 * We have a resume ordering issue with the snd-hda driver
> > > > also
> > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > > drm_device *dev)
> > > >  	 * FIXME: This should be solved with a special hdmi sink
> > > > device or
> > > >  	 * similar so that power domains can be employed.
> > > >  	 */
> > > > +
> > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > +	if (ret) {
> > > > +		DRM_ERROR("failed to set PCI D0 power state
> > > > (%d)\n", ret);
> > > > +		goto out;
> > > > +	}
> > > 
> > > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> > 
> > It does, but not during thaw_noirq.
> 
> Maybe put that into a comment? If we ever get to dropping the device
> state frobbery from freeze/thaw, then we should also be able to throw
> out the pci_set_power_state() call as well.

Yes, can add a comment.

> Perhaps we should have some asserts about the state of the PCI device?

You mean after calling pci_enable_device() that it's indeed in D0 and
enabled? Can do that as a follow-up.

--Imre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:54         ` Imre Deak
@ 2016-04-18  9:04           ` Ville Syrjälä
  0 siblings, 0 replies; 17+ messages in thread
From: Ville Syrjälä @ 2016-04-18  9:04 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:54:31AM +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 11:44 +0300, Ville Syrjï¿½lï¿½ wrote:
> > On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> > > On ma, 2016-04-18 at 11:28 +0300, Ville Syrjï¿½lï¿½ wrote:
> > > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > > During system resume we depended on pci_enable_device() also
> > > > > putting the
> > > > > device into PCI D0 state. This won't work if the PCI device was
> > > > > already
> > > > > enabled but still in D3 state. This is because pci_enable_device()
> > > > > is
> > > > > refcounted and will not change the HW state if called with a non-
> > > > > zero
> > > > > refcount. Leaving the device in D3 will make all subsequent device
> > > > > accesses fail.
> > > > > 
> > > > > This didn't cause a problem most of the time, since we resumed with
> > > > > an
> > > > > enable refcount of 0. But it fails at least after module reload
> > > > > because
> > > > > after that we also happen to leak a PCI device enable reference:
> > > > > During
> > > > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > > > but
> > > > > during device removal drm_put_dev() won't disable it. This is a bug
> > > > > of
> > > > > its own in DRM core, but without much harm as it only leaves the
> > > > > PCI
> > > > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > > > mid-layering and because it affects non-i915 drivers too. The fix
> > > > > in
> > > > > this patch is valid regardless of the problem in DRM core.
> > > > > 
> > > > > CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> > > > > CC: stable@vger.kernel.org
> > > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > > ---
> > > > > ï¿½drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > > > ï¿½1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index d550ae2..7eaa93e 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > > > *dev)
> > > > > ï¿½static int i915_drm_resume_early(struct drm_device *dev)
> > > > > ï¿½{
> > > > > ï¿½	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > > -	int ret = 0;
> > > > > +	int ret;
> > > > > ï¿½
> > > > > ï¿½	/*
> > > > > ï¿½	ï¿½* We have a resume ordering issue with the snd-hda driver
> > > > > also
> > > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > > > drm_device *dev)
> > > > > ï¿½	ï¿½* FIXME: This should be solved with a special hdmi sink
> > > > > device or
> > > > > ï¿½	ï¿½* similar so that power domains can be employed.
> > > > > ï¿½	ï¿½*/
> > > > > +
> > > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > > +	if (ret) {
> > > > > +		DRM_ERROR("failed to set PCI D0 power state
> > > > > (%d)\n", ret);
> > > > > +		goto out;
> > > > > +	}
> > > > 
> > > > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> > > 
> > > It does, but not during thaw_noirq.
> > 
> > Maybe put that into a comment? If we ever get to dropping the device
> > state frobbery from freeze/thaw, then we should also be able to throw
> > out the pci_set_power_state() call as well.
> 
> Yes, can add a comment.
> 
> > Perhaps we should have some asserts about the state of the PCI device?
> 
> You mean after calling pci_enable_device() that it's indeed in D0 and
> enabled? Can do that as a follow-up.

Yeah, was thinking that we could assert that we're in D0, required BARs,
BM, pci int/msi etc. are enabled.

-- 
Ville Syrjï¿½lï¿½
Intel OTC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
@ 2016-04-18 11:05   ` Ville Syrjälä
  0 siblings, 0 replies; 17+ messages in thread
From: Ville Syrjälä @ 2016-04-18 11:05 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 10:04:21AM +0300, Imre Deak wrote:
> The driver's VDD on/off logic assumes that whenever the VDD is on we
> also hold an AUX power domain reference. Since BIOS can leave the VDD on
> during booting and resuming and on DDI platforms we won't take a
> corresponding power reference, the above assumption won't hold on those
> platforms and an eventual delayed VDD off work will do an extraneous AUX
> power domain put resulting in a refcount underflow. Fix this the same
> way we did this for non-DDI DP encoders:
> 
> 6d93c0c41760c0 ("drm/i915: fix VDD state tracking after system resume")
> 
> At the same time call the DP encoder suspend handler the same way as the
> non-DDI DP encoders do to flush any pending VDD off work. Leaving the
> work running may cause a HW access where we don't expect this (at a point
> where power domains are suspended already).
> 
> While at it remove an unnecessary function call indirection.
> 
> This fixed for me AUX refcount underflow problems on BXT during
> suspend/resume.
> 
> CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Reviewed-by: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_ddi.c | 10 +++-------
>  drivers/gpu/drm/i915/intel_dp.c  |  4 ++--
>  drivers/gpu/drm/i915/intel_drv.h |  2 ++
>  3 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> index 96234c5..c2348fb 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -2206,12 +2206,6 @@ void intel_ddi_get_config(struct intel_encoder *encoder,
>  	intel_ddi_clock_get(encoder, pipe_config);
>  }
>  
> -static void intel_ddi_destroy(struct drm_encoder *encoder)
> -{
> -	/* HDMI has nothing special to destroy, so we can go with this. */
> -	intel_dp_encoder_destroy(encoder);
> -}
> -
>  static bool intel_ddi_compute_config(struct intel_encoder *encoder,
>  				     struct intel_crtc_state *pipe_config)
>  {
> @@ -2230,7 +2224,8 @@ static bool intel_ddi_compute_config(struct intel_encoder *encoder,
>  }
>  
>  static const struct drm_encoder_funcs intel_ddi_funcs = {
> -	.destroy = intel_ddi_destroy,
> +	.reset = intel_dp_encoder_reset,
> +	.destroy = intel_dp_encoder_destroy,
>  };
>  
>  static struct intel_connector *
> @@ -2329,6 +2324,7 @@ void intel_ddi_init(struct drm_device *dev, enum port port)
>  	intel_encoder->post_disable = intel_ddi_post_disable;
>  	intel_encoder->get_hw_state = intel_ddi_get_hw_state;
>  	intel_encoder->get_config = intel_ddi_get_config;
> +	intel_encoder->suspend = intel_dp_encoder_suspend;
>  
>  	intel_dig_port->port = port;
>  	intel_dig_port->saved_port_bits = I915_READ(DDI_BUF_CTL(port)) &
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 61ee226..c6af3d0 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -4889,7 +4889,7 @@ void intel_dp_encoder_destroy(struct drm_encoder *encoder)
>  	kfree(intel_dig_port);
>  }
>  
> -static void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
> +void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
>  {
>  	struct intel_dp *intel_dp = enc_to_intel_dp(&intel_encoder->base);
>  
> @@ -4931,7 +4931,7 @@ static void intel_edp_panel_vdd_sanitize(struct intel_dp *intel_dp)
>  	edp_panel_vdd_schedule_off(intel_dp);
>  }
>  
> -static void intel_dp_encoder_reset(struct drm_encoder *encoder)
> +void intel_dp_encoder_reset(struct drm_encoder *encoder)
>  {
>  	struct intel_dp *intel_dp;
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index e13ce22..10dfe72 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1285,6 +1285,8 @@ void intel_dp_set_link_params(struct intel_dp *intel_dp,
>  void intel_dp_start_link_train(struct intel_dp *intel_dp);
>  void intel_dp_stop_link_train(struct intel_dp *intel_dp);
>  void intel_dp_sink_dpms(struct intel_dp *intel_dp, int mode);
> +void intel_dp_encoder_reset(struct drm_encoder *encoder);
> +void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder);
>  void intel_dp_encoder_destroy(struct drm_encoder *encoder);
>  int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc);
>  bool intel_dp_compute_config(struct intel_encoder *encoder,
> -- 
> 2.5.0

-- 
Ville Syrjï¿½lï¿½
Intel OTC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:52           ` Chris Wilson
@ 2016-04-18 11:05             ` Imre Deak
  0 siblings, 0 replies; 17+ messages in thread
From: Imre Deak @ 2016-04-18 11:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 09:52 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 11:37:05AM +0300, Imre Deak wrote:
> > I don't know. I guess you mean that enabling the device first seems
> > like the first logical step. To me enabling power is the first
> > logical
> > step and enabling the device itself is the second.
> 
> It is not clear exactly, but what is clear is that
> pci_enable_device()
> sets the power state of this device only after doing so for the
> bridge
> hierachy.

That bridge enabling in pci_enable_deivce() won't make a difference
since the suspend/resume order is already defined so that bridge
devices are suspended last and resumed first. Otoh, the current order
is to call pci_set_power_state() first and then pci_enable_device()
both for our device and all the rest of PCI drivers/devices that I
checked so far. Note that we can't even change this order during the
suspend/resume phase (as opposed to the freeze/thaw phase) since then
it's the PCI core that imposes the order already. So if I would change
this now as you suggest, we would have a different order during the
suspend/resume and the freeze/thaw phases.

> The order in this patch is inconsistent.

I attribute this inconsistency to the sloppiness of the PCI API. I
don't want to change the order in this fix, but I can follow up with a
patch that removes the pci_disable_device()/pci_enable_device() from
our suspend/resume hooks. We can't anyway depend on these doing
anything, since they are dependent on the device enable refcount which
is out of reach of the driver.

I can also add now a code comment about this.

--Imre

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
  2016-04-18  8:28   ` Ville Syrjälä
@ 2016-04-18 11:44   ` Imre Deak
  2016-04-18 11:45   ` Imre Deak
  3 siblings, 0 replies; 17+ messages in thread
From: Imre Deak @ 2016-04-18 11:44 UTC (permalink / raw)
  To: intel-gf; +Cc: Ville Syrjälä, Chris Wilson, stable

During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

v2:
- Add a code comment about the relation of this fix to the freeze/thaw
  vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
  and device enable calls. (Chris)

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d550ae2..3b79e97 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
 static int i915_drm_resume_early(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret = 0;
+	int ret;
 
 	/*
 	 * We have a resume ordering issue with the snd-hda driver also
@@ -814,6 +814,36 @@ static int i915_drm_resume_early(struct drm_device *dev)
 	 * FIXME: This should be solved with a special hdmi sink device or
 	 * similar so that power domains can be employed.
 	 */
+
+	/*
+	 * Note that we need to set the power state explicitly, since we
+	 * powered off the device during freeze and the PCI core won't power
+	 * it back up for us during thaw. Powering off the device during
+	 * freeze is not a hard requirement though, and during the
+	 * suspend/resume phases the PCI core makes sure we get here with the
+	 * device powered on. So in case we change our freeze logic and keep
+	 * the device powered we can also remove the following set power state
+	 * call.
+	 */
+	ret = pci_set_power_state(dev->pdev, PCI_D0);
+	if (ret) {
+		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
+		goto out;
+	}
+
+	/*
+	 * Note that pci_enable_device() first enables any parent bridge
+	 * device and only then sets the power state for this device. The
+	 * bridge enabling is a nop though, since bridge devices are resumed
+	 * first. The order of enabling power and enabling the device is
+	 * imposed by the PCI core as described above, so here we preserve the
+	 * same order for the freeze/thaw phases.
+	 *
+	 * TODO: eventually we should remove pci_disable_device() /
+	 * pci_enable_enable_device() from suspend/resume. Due to how they
+	 * depend on the device enable refcount we can't anyway depend on them
+	 * disabling/enabling the device.
+	 */
 	if (pci_enable_device(dev->pdev)) {
 		ret = -EIO;
 		goto out;
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
                     ` (2 preceding siblings ...)
  2016-04-18 11:44   ` [PATCH v2 " Imre Deak
@ 2016-04-18 11:45   ` Imre Deak
  2016-04-18 14:59     ` Ville Syrjälä
  3 siblings, 1 reply; 17+ messages in thread
From: Imre Deak @ 2016-04-18 11:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ville Syrjälä, Chris Wilson, stable

During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

v2:
- Add a code comment about the relation of this fix to the freeze/thaw
  vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
  and device enable calls. (Chris)

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d550ae2..3b79e97 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
 static int i915_drm_resume_early(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret = 0;
+	int ret;
 
 	/*
 	 * We have a resume ordering issue with the snd-hda driver also
@@ -814,6 +814,36 @@ static int i915_drm_resume_early(struct drm_device *dev)
 	 * FIXME: This should be solved with a special hdmi sink device or
 	 * similar so that power domains can be employed.
 	 */
+
+	/*
+	 * Note that we need to set the power state explicitly, since we
+	 * powered off the device during freeze and the PCI core won't power
+	 * it back up for us during thaw. Powering off the device during
+	 * freeze is not a hard requirement though, and during the
+	 * suspend/resume phases the PCI core makes sure we get here with the
+	 * device powered on. So in case we change our freeze logic and keep
+	 * the device powered we can also remove the following set power state
+	 * call.
+	 */
+	ret = pci_set_power_state(dev->pdev, PCI_D0);
+	if (ret) {
+		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
+		goto out;
+	}
+
+	/*
+	 * Note that pci_enable_device() first enables any parent bridge
+	 * device and only then sets the power state for this device. The
+	 * bridge enabling is a nop though, since bridge devices are resumed
+	 * first. The order of enabling power and enabling the device is
+	 * imposed by the PCI core as described above, so here we preserve the
+	 * same order for the freeze/thaw phases.
+	 *
+	 * TODO: eventually we should remove pci_disable_device() /
+	 * pci_enable_enable_device() from suspend/resume. Due to how they
+	 * depend on the device enable refcount we can't anyway depend on them
+	 * disabling/enabling the device.
+	 */
 	if (pci_enable_device(dev->pdev)) {
 		ret = -EIO;
 		goto out;
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18 11:45   ` Imre Deak
@ 2016-04-18 14:59     ` Ville Syrjälä
  0 siblings, 0 replies; 17+ messages in thread
From: Ville Syrjälä @ 2016-04-18 14:59 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, Chris Wilson, stable

On Mon, Apr 18, 2016 at 02:45:54PM +0300, Imre Deak wrote:
> During system resume we depended on pci_enable_device() also putting the
> device into PCI D0 state. This won't work if the PCI device was already
> enabled but still in D3 state. This is because pci_enable_device() is
> refcounted and will not change the HW state if called with a non-zero
> refcount. Leaving the device in D3 will make all subsequent device
> accesses fail.
> 
> This didn't cause a problem most of the time, since we resumed with an
> enable refcount of 0. But it fails at least after module reload because
> after that we also happen to leak a PCI device enable reference: During
> probing we call drm_get_pci_dev() which will enable the PCI device, but
> during device removal drm_put_dev() won't disable it. This is a bug of
> its own in DRM core, but without much harm as it only leaves the PCI
> device enabled. Fixing it is also a bit more involved, due to DRM
> mid-layering and because it affects non-i915 drivers too. The fix in
> this patch is valid regardless of the problem in DRM core.
> 
> v2:
> - Add a code comment about the relation of this fix to the freeze/thaw
>   vs. the suspend/resume phases. (Ville)
> - Add a code comment about the inconsistent ordering of set power state
>   and device enable calls. (Chris)
> 
> CC: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>

The PCI PM code is a bit of a mess, but it would be appear this should
do what we need.

Reviewed-by: Ville Syrjï¿½lï¿½ <ville.syrjala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d550ae2..3b79e97 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  static int i915_drm_resume_early(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	int ret = 0;
> +	int ret;
>  
>  	/*
>  	 * We have a resume ordering issue with the snd-hda driver also
> @@ -814,6 +814,36 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	 * FIXME: This should be solved with a special hdmi sink device or
>  	 * similar so that power domains can be employed.
>  	 */
> +
> +	/*
> +	 * Note that we need to set the power state explicitly, since we
> +	 * powered off the device during freeze and the PCI core won't power
> +	 * it back up for us during thaw. Powering off the device during
> +	 * freeze is not a hard requirement though, and during the
> +	 * suspend/resume phases the PCI core makes sure we get here with the
> +	 * device powered on. So in case we change our freeze logic and keep
> +	 * the device powered we can also remove the following set power state
> +	 * call.
> +	 */
> +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> +	if (ret) {
> +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> +		goto out;
> +	}
> +
> +	/*
> +	 * Note that pci_enable_device() first enables any parent bridge
> +	 * device and only then sets the power state for this device. The
> +	 * bridge enabling is a nop though, since bridge devices are resumed
> +	 * first. The order of enabling power and enabling the device is
> +	 * imposed by the PCI core as described above, so here we preserve the
> +	 * same order for the freeze/thaw phases.
> +	 *
> +	 * TODO: eventually we should remove pci_disable_device() /
> +	 * pci_enable_enable_device() from suspend/resume. Due to how they
> +	 * depend on the device enable refcount we can't anyway depend on them
> +	 * disabling/enabling the device.
> +	 */
>  	if (pci_enable_device(dev->pdev)) {
>  		ret = -EIO;
>  		goto out;
> -- 
> 2.5.0

-- 
Ville Syrjï¿½lï¿½
Intel OTC

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-04-18 14:59 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1460963062-13211-1-git-send-email-imre.deak@intel.com>
2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
2016-04-18  8:16     ` Imre Deak
2016-04-18  8:24       ` Chris Wilson
2016-04-18  8:37         ` Imre Deak
2016-04-18  8:52           ` Chris Wilson
2016-04-18 11:05             ` Imre Deak
2016-04-18  8:28   ` Ville Syrjälä
2016-04-18  8:32     ` Imre Deak
2016-04-18  8:44       ` Ville Syrjälä
2016-04-18  8:54         ` Imre Deak
2016-04-18  9:04           ` Ville Syrjälä
2016-04-18 11:44   ` [PATCH v2 " Imre Deak
2016-04-18 11:45   ` Imre Deak
2016-04-18 14:59     ` Ville Syrjälä
2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
2016-04-18 11:05   ` Ville Syrjälä

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).