[PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs

public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed

* [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs
@ 2016-04-18  7:04 Imre Deak
  2016-04-18  7:04 ` [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early Imre Deak
                   ` (5 more replies)
  0 siblings, 6 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx

This fixes a few issues I found while testing suspend/resume on BXT.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>

Imre Deak (4):
  drm/i915: Fix error path in i915_drm_resume_early
  drm/i915: Fix system resume if PCI device remained enabled
  drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't
    loaded

 drivers/gpu/drm/i915/i915_drv.c         | 18 +++++++++++++-----
 drivers/gpu/drm/i915/intel_csr.c        | 29 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ddi.c        | 10 +++-------
 drivers/gpu/drm/i915/intel_dp.c         |  4 ++--
 drivers/gpu/drm/i915/intel_drv.h        |  4 ++++
 drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
 6 files changed, 54 insertions(+), 14 deletions(-)

-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early
  2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
@ 2016-04-18  7:04 ` Imre Deak
  2016-04-18  8:00   ` Chris Wilson
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx

If system resume fails, this may lead to a runtime PM wake reference
underflow used for runtime PM state checking.

Fixes: 1f814daca43a ("drm/i915: add support for checking if we hold an RPM reference")
Signed-off-by: Imre Deak <imre.deak@intel.com>

---

As this only fixes a debugging feature, it's not for -fixes or stable.
---
 drivers/gpu/drm/i915/i915_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 50bc05f..d550ae2 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -842,11 +842,11 @@ static int i915_drm_resume_early(struct drm_device *dev)
 	    !(dev_priv->suspended_to_idle && dev_priv->csr.dmc_payload))
 		intel_power_domains_init_hw(dev_priv, true);
 
+	enable_rpm_wakeref_asserts(dev_priv);
+
 out:
 	dev_priv->suspended_to_idle = false;
 
-	enable_rpm_wakeref_asserts(dev_priv);
-
 	return ret;
 }
 
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
  2016-04-18  7:04 ` [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early Imre Deak
@ 2016-04-18  7:04 ` Imre Deak
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
                     ` (2 more replies)
  2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ville Syrjälä, stable

During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d550ae2..7eaa93e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
 static int i915_drm_resume_early(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret = 0;
+	int ret;
 
 	/*
 	 * We have a resume ordering issue with the snd-hda driver also
@@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
 	 * FIXME: This should be solved with a special hdmi sink device or
 	 * similar so that power domains can be employed.
 	 */
+
+	ret = pci_set_power_state(dev->pdev, PCI_D0);
+	if (ret) {
+		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
+		goto out;
+	}
+
 	if (pci_enable_device(dev->pdev)) {
 		ret = -EIO;
 		goto out;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
  2016-04-18  7:04 ` [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early Imre Deak
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
@ 2016-04-18  7:04 ` Imre Deak
  2016-04-18 11:05   ` Ville Syrjälä
  2016-04-18  7:04 ` [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded Imre Deak
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ville Syrjälä, stable

The driver's VDD on/off logic assumes that whenever the VDD is on we
also hold an AUX power domain reference. Since BIOS can leave the VDD on
during booting and resuming and on DDI platforms we won't take a
corresponding power reference, the above assumption won't hold on those
platforms and an eventual delayed VDD off work will do an extraneous AUX
power domain put resulting in a refcount underflow. Fix this the same
way we did this for non-DDI DP encoders:

6d93c0c41760c0 ("drm/i915: fix VDD state tracking after system resume")

At the same time call the DP encoder suspend handler the same way as the
non-DDI DP encoders do to flush any pending VDD off work. Leaving the
work running may cause a HW access where we don't expect this (at a point
where power domains are suspended already).

While at it remove an unnecessary function call indirection.

This fixed for me AUX refcount underflow problems on BXT during
suspend/resume.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_ddi.c | 10 +++-------
 drivers/gpu/drm/i915/intel_dp.c  |  4 ++--
 drivers/gpu/drm/i915/intel_drv.h |  2 ++
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 96234c5..c2348fb 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -2206,12 +2206,6 @@ void intel_ddi_get_config(struct intel_encoder *encoder,
 	intel_ddi_clock_get(encoder, pipe_config);
 }
 
-static void intel_ddi_destroy(struct drm_encoder *encoder)
-{
-	/* HDMI has nothing special to destroy, so we can go with this. */
-	intel_dp_encoder_destroy(encoder);
-}
-
 static bool intel_ddi_compute_config(struct intel_encoder *encoder,
 				     struct intel_crtc_state *pipe_config)
 {
@@ -2230,7 +2224,8 @@ static bool intel_ddi_compute_config(struct intel_encoder *encoder,
 }
 
 static const struct drm_encoder_funcs intel_ddi_funcs = {
-	.destroy = intel_ddi_destroy,
+	.reset = intel_dp_encoder_reset,
+	.destroy = intel_dp_encoder_destroy,
 };
 
 static struct intel_connector *
@@ -2329,6 +2324,7 @@ void intel_ddi_init(struct drm_device *dev, enum port port)
 	intel_encoder->post_disable = intel_ddi_post_disable;
 	intel_encoder->get_hw_state = intel_ddi_get_hw_state;
 	intel_encoder->get_config = intel_ddi_get_config;
+	intel_encoder->suspend = intel_dp_encoder_suspend;
 
 	intel_dig_port->port = port;
 	intel_dig_port->saved_port_bits = I915_READ(DDI_BUF_CTL(port)) &
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 61ee226..c6af3d0 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4889,7 +4889,7 @@ void intel_dp_encoder_destroy(struct drm_encoder *encoder)
 	kfree(intel_dig_port);
 }
 
-static void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
+void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
 {
 	struct intel_dp *intel_dp = enc_to_intel_dp(&intel_encoder->base);
 
@@ -4931,7 +4931,7 @@ static void intel_edp_panel_vdd_sanitize(struct intel_dp *intel_dp)
 	edp_panel_vdd_schedule_off(intel_dp);
 }
 
-static void intel_dp_encoder_reset(struct drm_encoder *encoder)
+void intel_dp_encoder_reset(struct drm_encoder *encoder)
 {
 	struct intel_dp *intel_dp;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index e13ce22..10dfe72 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1285,6 +1285,8 @@ void intel_dp_set_link_params(struct intel_dp *intel_dp,
 void intel_dp_start_link_train(struct intel_dp *intel_dp);
 void intel_dp_stop_link_train(struct intel_dp *intel_dp);
 void intel_dp_sink_dpms(struct intel_dp *intel_dp, int mode);
+void intel_dp_encoder_reset(struct drm_encoder *encoder);
+void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder);
 void intel_dp_encoder_destroy(struct drm_encoder *encoder);
 int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc);
 bool intel_dp_compute_config(struct intel_encoder *encoder,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
                   ` (2 preceding siblings ...)
  2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
@ 2016-04-18  7:04 ` Imre Deak
  2016-04-18  7:49   ` Chris Wilson
  2016-04-18 11:48   ` [PATCH v2 " Imre Deak
  2016-04-18  7:24 ` ✓ Fi.CI.BAT: success for drm/i915: Fix a few suspend/resume and driver unload bugs Patchwork
  2016-04-18 13:57 ` ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3) Patchwork
  5 siblings, 2 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18  7:04 UTC (permalink / raw)
  To: intel-gfx

While we disable runtime PM and with that display power well support if
the DMC firmware isn't loaded, we still want to disable power wells
during system suspend and driver unload. So drop/reacquire the
corresponding power refcount during suspend/resume and driver unloading.
This also means we have to check if DMC is not loaded and skip enabling
DC states in the power well code.

Signed-off-by: Imre Deak <imre.deak@intel.com>

---

As this fixes the special case when firmware loading failed with the
result of possibly leaving power wells enabled during suspend/resume and
driver unloading it's not for -fixes or stable.
---
 drivers/gpu/drm/i915/i915_drv.c         |  5 +++--
 drivers/gpu/drm/i915/intel_csr.c        | 29 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
 4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7eaa93e..6c70d0b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -640,8 +640,7 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_display_set_init_power(dev_priv, false);
 
-	if (HAS_CSR(dev_priv))
-		flush_work(&dev_priv->csr.work);
+	intel_csr_ucode_suspend(dev_priv);
 
 out:
 	enable_rpm_wakeref_asserts(dev_priv);
@@ -733,6 +732,8 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	disable_rpm_wakeref_asserts(dev_priv);
 
+	intel_csr_ucode_resume(dev_priv);
+
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_restore_gtt_mappings(dev);
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index d57b00e..fd9f5b3 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -480,5 +480,34 @@ void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
 
 	flush_work(&dev_priv->csr.work);
 
+	/* Drop the reference held in case DMC isn't loaded. */
+	if (!dev_priv->csr.dmc_payload)
+		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+
 	kfree(dev_priv->csr.dmc_payload);
 }
+
+void intel_csr_ucode_suspend(struct drm_i915_private *dev_priv)
+{
+	if (!HAS_CSR(dev_priv))
+		return;
+
+	flush_work(&dev_priv->csr.work);
+
+	/* Drop the reference held in case DMC isn't loaded. */
+	if (!dev_priv->csr.dmc_payload)
+		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+}
+
+void intel_csr_ucode_resume(struct drm_i915_private *dev_priv)
+{
+	if (!HAS_CSR(dev_priv))
+		return;
+
+	/*
+	 * Reacquire the reference to keep RPM disabled in case DMC isn't
+	 * loaded.
+	 */
+	if (!dev_priv->csr.dmc_payload)
+		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 10dfe72..beed9e8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1275,6 +1275,8 @@ u32 skl_plane_ctl_rotation(unsigned int rotation);
 void intel_csr_ucode_init(struct drm_i915_private *);
 void intel_csr_load_program(struct drm_i915_private *);
 void intel_csr_ucode_fini(struct drm_i915_private *);
+void intel_csr_ucode_suspend(struct drm_i915_private *);
+void intel_csr_ucode_resume(struct drm_i915_private *);
 
 /* intel_dp.c */
 void intel_dp_init(struct drm_device *dev, i915_reg_t output_reg, enum port port);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 259f66f..bfa8548 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -809,6 +809,9 @@ static void gen9_dc_off_power_well_enable(struct drm_i915_private *dev_priv,
 static void gen9_dc_off_power_well_disable(struct drm_i915_private *dev_priv,
 					   struct i915_power_well *power_well)
 {
+	if (!dev_priv->csr.dmc_payload)
+		return;
+
 	if (dev_priv->csr.allowed_dc_mask & DC_STATE_EN_UPTO_DC6)
 		skl_enable_dc6(dev_priv);
 	else if (dev_priv->csr.allowed_dc_mask & DC_STATE_EN_UPTO_DC5)
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Fix a few suspend/resume and driver unload bugs
  2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
                   ` (3 preceding siblings ...)
  2016-04-18  7:04 ` [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded Imre Deak
@ 2016-04-18  7:24 ` Patchwork
  2016-04-18 13:57 ` ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3) Patchwork
  5 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2016-04-18  7:24 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Fix a few suspend/resume and driver unload bugs
URL   : https://patchwork.freedesktop.org/series/5856/
State : success

== Summary ==

Series 5856v1 drm/i915: Fix a few suspend/resume and driver unload bugs
http://patchwork.freedesktop.org/api/1.0/series/5856/revisions/1/mbox/

Test gem_busy:
        Subgroup basic-blt:
                skip       -> PASS       (bsw-nuc-2)
Test kms_force_connector_basic:
        Subgroup force-load-detect:
                skip       -> PASS       (ivb-t430s)

bdw-ultra        total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:23 
bsw-nuc-2        total:202  pass:163  dwarn:0   dfail:0   fail:0   skip:39 
byt-nuc          total:202  pass:164  dwarn:0   dfail:0   fail:0   skip:38 
hsw-brixbox      total:203  pass:179  dwarn:0   dfail:0   fail:0   skip:24 
ilk-hp8440p      total:203  pass:135  dwarn:0   dfail:0   fail:0   skip:68 
ivb-t430s        total:203  pass:175  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:203  pass:178  dwarn:0   dfail:0   fail:0   skip:25 
skl-nuci5        total:203  pass:192  dwarn:0   dfail:0   fail:0   skip:11 
snb-x220t        total:203  pass:165  dwarn:0   dfail:0   fail:1   skip:37 

Results at /archive/results/CI_IGT_test/Patchwork_1922/

d9131d62d18ba94fb3ca019f1156c22b5f4ce23c drm-intel-nightly: 2016y-04m-15d-13h-53m-44s UTC integration manifest
3577326 drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
e5de482 drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
41c3968 drm/i915: Fix system resume if PCI device remained enabled
cfa1570 drm/i915: Fix error path in i915_drm_resume_early

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-18  7:04 ` [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded Imre Deak
@ 2016-04-18  7:49   ` Chris Wilson
  2016-04-18  7:59     ` Imre Deak
  2016-04-18 11:48   ` [PATCH v2 " Imre Deak
  1 sibling, 1 reply; 34+ messages in thread
From: Chris Wilson @ 2016-04-18  7:49 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Apr 18, 2016 at 10:04:22AM +0300, Imre Deak wrote:
> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> index d57b00e..fd9f5b3 100644
> --- a/drivers/gpu/drm/i915/intel_csr.c
> +++ b/drivers/gpu/drm/i915/intel_csr.c

> +void intel_csr_ucode_suspend(struct drm_i915_private *dev_priv)
> +{
> +	if (!HAS_CSR(dev_priv))
> +		return;
> +
> +	flush_work(&dev_priv->csr.work);
> +
> +	/* Drop the reference held in case DMC isn't loaded. */
> +	if (!dev_priv->csr.dmc_payload)
> +		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
> +}

> @@ -480,5 +480,34 @@ void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
>  
>  	flush_work(&dev_priv->csr.work);
>  
> +	/* Drop the reference held in case DMC isn't loaded. */
> +	if (!dev_priv->csr.dmc_payload)
> +		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
> +
>  	kfree(dev_priv->csr.dmc_payload);
>  }

Should fini just be calling suspend first?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-18  7:49   ` Chris Wilson
@ 2016-04-18  7:59     ` Imre Deak
  0 siblings, 0 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18  7:59 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On ma, 2016-04-18 at 08:49 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 10:04:22AM +0300, Imre Deak wrote:
> > diff --git a/drivers/gpu/drm/i915/intel_csr.c
> > b/drivers/gpu/drm/i915/intel_csr.c
> > index d57b00e..fd9f5b3 100644
> > --- a/drivers/gpu/drm/i915/intel_csr.c
> > +++ b/drivers/gpu/drm/i915/intel_csr.c
> 
> > +void intel_csr_ucode_suspend(struct drm_i915_private *dev_priv)
> > +{
> > +	if (!HAS_CSR(dev_priv))
> > +		return;
> > +
> > +	flush_work(&dev_priv->csr.work);
> > +
> > +	/* Drop the reference held in case DMC isn't loaded. */
> > +	if (!dev_priv->csr.dmc_payload)
> > +		intel_display_power_put(dev_priv,
> > POWER_DOMAIN_INIT);
> > +}
> 
> > @@ -480,5 +480,34 @@ void intel_csr_ucode_fini(struct
> > drm_i915_private *dev_priv)
> >  
> >  	flush_work(&dev_priv->csr.work);
> >  
> > +	/* Drop the reference held in case DMC isn't loaded. */
> > +	if (!dev_priv->csr.dmc_payload)
> > +		intel_display_power_put(dev_priv,
> > POWER_DOMAIN_INIT);
> > +
> >  	kfree(dev_priv->csr.dmc_payload);
> >  }
> 
> Should fini just be calling suspend first?

Yep didn't notice, will do that.

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early
  2016-04-18  7:04 ` [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early Imre Deak
@ 2016-04-18  8:00   ` Chris Wilson
  2016-04-18  8:06     ` Imre Deak
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Wilson @ 2016-04-18  8:00 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Apr 18, 2016 at 10:04:19AM +0300, Imre Deak wrote:
> If system resume fails, this may lead to a runtime PM wake reference
> underflow used for runtime PM state checking.

If a later resume succeeds!

> Fixes: 1f814daca43a ("drm/i915: add support for checking if we hold an RPM reference")
> Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

> ---
> 
> As this only fixes a debugging feature, it's not for -fixes or stable.

Following a catastrophic device failure.

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 50bc05f..d550ae2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -842,11 +842,11 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	    !(dev_priv->suspended_to_idle && dev_priv->csr.dmc_payload))
>  		intel_power_domains_init_hw(dev_priv, true);
>  
> +	enable_rpm_wakeref_asserts(dev_priv);
> +
>  out:
>  	dev_priv->suspended_to_idle = false;

I'm not even sure about twiddling anything here, an early return if
pci_enable_device() fails would be clearer.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
@ 2016-04-18  8:06   ` Chris Wilson
  2016-04-18  8:16     ` Imre Deak
  2016-04-18  8:28   ` Ville Syrjälä
  2016-04-18 11:45   ` [PATCH v2 " Imre Deak
  2 siblings, 1 reply; 34+ messages in thread
From: Chris Wilson @ 2016-04-18  8:06 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> During system resume we depended on pci_enable_device() also putting the
> device into PCI D0 state. This won't work if the PCI device was already
> enabled but still in D3 state. This is because pci_enable_device() is
> refcounted and will not change the HW state if called with a non-zero
> refcount. Leaving the device in D3 will make all subsequent device
> accesses fail.
> 
> This didn't cause a problem most of the time, since we resumed with an
> enable refcount of 0. But it fails at least after module reload because
> after that we also happen to leak a PCI device enable reference: During
> probing we call drm_get_pci_dev() which will enable the PCI device, but
> during device removal drm_put_dev() won't disable it. This is a bug of
> its own in DRM core, but without much harm as it only leaves the PCI
> device enabled. Fixing it is also a bit more involved, due to DRM
> mid-layering and because it affects non-i915 drivers too. The fix in
> this patch is valid regardless of the problem in DRM core.
> 
> CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d550ae2..7eaa93e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  static int i915_drm_resume_early(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	int ret = 0;
> +	int ret;
>  
>  	/*
>  	 * We have a resume ordering issue with the snd-hda driver also
> @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	 * FIXME: This should be solved with a special hdmi sink device or
>  	 * similar so that power domains can be employed.
>  	 */
> +
> +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> +	if (ret) {
> +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> +		goto out;
> +	}

The device should be enabled first, otherwise we are not meant to be
touching its IO space at all (such as twiddling power state). At least
that is the order pci_enable_device() uses.

Either way, upon failure we should be unwinding.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early
  2016-04-18  8:00   ` Chris Wilson
@ 2016-04-18  8:06     ` Imre Deak
  0 siblings, 0 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18  8:06 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On ma, 2016-04-18 at 09:00 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 10:04:19AM +0300, Imre Deak wrote:
> > If system resume fails, this may lead to a runtime PM wake
> > reference
> > underflow used for runtime PM state checking.
> 
> If a later resume succeeds!
> 
> > Fixes: 1f814daca43a ("drm/i915: add support for checking if we hold
> > an RPM reference")
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> > ---
> > 
> > As this only fixes a debugging feature, it's not for -fixes or
> > stable.
> 
> Following a catastrophic device failure.
> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 50bc05f..d550ae2 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -842,11 +842,11 @@ static int i915_drm_resume_early(struct
> > drm_device *dev)
> >  	    !(dev_priv->suspended_to_idle && dev_priv-
> > >csr.dmc_payload))
> >  		intel_power_domains_init_hw(dev_priv, true);
> >  
> > +	enable_rpm_wakeref_asserts(dev_priv);
> > +
> >  out:
> >  	dev_priv->suspended_to_idle = false;
> 
> I'm not even sure about twiddling anything here, an early return if
> pci_enable_device() fails would be clearer.

Yea good point I was also thinking about it. I have to check a bit what
exactly happens in case resume fails (for example wrt. subsequent
IOCTLs) and can fix this up.

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
@ 2016-04-18  8:16     ` Imre Deak
  2016-04-18  8:24       ` [Intel-gfx] " Chris Wilson
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18  8:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > During system resume we depended on pci_enable_device() also
> > putting the
> > device into PCI D0 state. This won't work if the PCI device was
> > already
> > enabled but still in D3 state. This is because pci_enable_device()
> > is
> > refcounted and will not change the HW state if called with a non-
> > zero
> > refcount. Leaving the device in D3 will make all subsequent device
> > accesses fail.
> > 
> > This didn't cause a problem most of the time, since we resumed with
> > an
> > enable refcount of 0. But it fails at least after module reload
> > because
> > after that we also happen to leak a PCI device enable reference:
> > During
> > probing we call drm_get_pci_dev() which will enable the PCI device,
> > but
> > during device removal drm_put_dev() won't disable it. This is a bug
> > of
> > its own in DRM core, but without much harm as it only leaves the
> > PCI
> > device enabled. Fixing it is also a bit more involved, due to DRM
> > mid-layering and because it affects non-i915 drivers too. The fix
> > in
> > this patch is valid regardless of the problem in DRM core.
> > 
> > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > CC: stable@vger.kernel.org
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index d550ae2..7eaa93e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > *dev)
> >  static int i915_drm_resume_early(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	int ret = 0;
> > +	int ret;
> >  
> >  	/*
> >  	 * We have a resume ordering issue with the snd-hda driver
> > also
> > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > drm_device *dev)
> >  	 * FIXME: This should be solved with a special hdmi sink
> > device or
> >  	 * similar so that power domains can be employed.
> >  	 */
> > +
> > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > +	if (ret) {
> > +		DRM_ERROR("failed to set PCI D0 power state
> > (%d)\n", ret);
> > +		goto out;
> > +	}
> 
> The device should be enabled first, otherwise we are not meant to be
> touching its IO space at all (such as twiddling power state). At
> least
> that is the order pci_enable_device() uses.

It's not MMIO or (port) IO but only a PCI config space access
that pci_set_power_state() requires, so doesn't need the enabling 
of PCI resources. AFAICS pci_enable_device() enables power as the first
thing.

> Either way, upon failure we should be unwinding.

I'd rather wouldn't put back the device to D3 state, as further device
access may still possible even though resume failed.

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:16     ` Imre Deak
@ 2016-04-18  8:24       ` Chris Wilson
  2016-04-18  8:37         ` Imre Deak
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Wilson @ 2016-04-18  8:24 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:16:34AM +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > During system resume we depended on pci_enable_device() also
> > > putting the
> > > device into PCI D0 state. This won't work if the PCI device was
> > > already
> > > enabled but still in D3 state. This is because pci_enable_device()
> > > is
> > > refcounted and will not change the HW state if called with a non-
> > > zero
> > > refcount. Leaving the device in D3 will make all subsequent device
> > > accesses fail.
> > > 
> > > This didn't cause a problem most of the time, since we resumed with
> > > an
> > > enable refcount of 0. But it fails at least after module reload
> > > because
> > > after that we also happen to leak a PCI device enable reference:
> > > During
> > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > but
> > > during device removal drm_put_dev() won't disable it. This is a bug
> > > of
> > > its own in DRM core, but without much harm as it only leaves the
> > > PCI
> > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > mid-layering and because it affects non-i915 drivers too. The fix
> > > in
> > > this patch is valid regardless of the problem in DRM core.
> > > 
> > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > CC: stable@vger.kernel.org
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index d550ae2..7eaa93e 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > *dev)
> > >  static int i915_drm_resume_early(struct drm_device *dev)
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	int ret = 0;
> > > +	int ret;
> > >  
> > >  	/*
> > >  	 * We have a resume ordering issue with the snd-hda driver
> > > also
> > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > drm_device *dev)
> > >  	 * FIXME: This should be solved with a special hdmi sink
> > > device or
> > >  	 * similar so that power domains can be employed.
> > >  	 */
> > > +
> > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > +	if (ret) {
> > > +		DRM_ERROR("failed to set PCI D0 power state
> > > (%d)\n", ret);
> > > +		goto out;
> > > +	}
> > 
> > The device should be enabled first, otherwise we are not meant to be
> > touching its IO space at all (such as twiddling power state). At
> > least
> > that is the order pci_enable_device() uses.
> 
> It's not MMIO or (port) IO but only a PCI config space access
> that pci_set_power_state() requires, so doesn't need the enabling 
> of PCI resources. AFAICS pci_enable_device() enables power as the first
> thing.
> 
> > Either way, upon failure we should be unwinding.
> 
> I'd rather wouldn't put back the device to D3 state, as further device
> access may still possible even though resume failed.

On the other hand, if you order it thusly:

	if (!pci_enable_device())
		return -EIO;

	ret = pci_set_power_state()
	if (ret < 0) {
		pci_disable_device()
		return ret;
	}

it doesn't raise as many eyebrows :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
@ 2016-04-18  8:28   ` Ville Syrjälä
  2016-04-18  8:32     ` Imre Deak
  2016-04-18 11:45   ` [PATCH v2 " Imre Deak
  2 siblings, 1 reply; 34+ messages in thread
From: Ville Syrjälä @ 2016-04-18  8:28 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> During system resume we depended on pci_enable_device() also putting the
> device into PCI D0 state. This won't work if the PCI device was already
> enabled but still in D3 state. This is because pci_enable_device() is
> refcounted and will not change the HW state if called with a non-zero
> refcount. Leaving the device in D3 will make all subsequent device
> accesses fail.
> 
> This didn't cause a problem most of the time, since we resumed with an
> enable refcount of 0. But it fails at least after module reload because
> after that we also happen to leak a PCI device enable reference: During
> probing we call drm_get_pci_dev() which will enable the PCI device, but
> during device removal drm_put_dev() won't disable it. This is a bug of
> its own in DRM core, but without much harm as it only leaves the PCI
> device enabled. Fixing it is also a bit more involved, due to DRM
> mid-layering and because it affects non-i915 drivers too. The fix in
> this patch is valid regardless of the problem in DRM core.
> 
> CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d550ae2..7eaa93e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  static int i915_drm_resume_early(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	int ret = 0;
> +	int ret;
>  
>  	/*
>  	 * We have a resume ordering issue with the snd-hda driver also
> @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	 * FIXME: This should be solved with a special hdmi sink device or
>  	 * similar so that power domains can be employed.
>  	 */
> +
> +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> +	if (ret) {
> +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> +		goto out;
> +	}

Hmm. Doesn't this already happen from pci bus resume_noirq hook?

> +
>  	if (pci_enable_device(dev->pdev)) {
>  		ret = -EIO;
>  		goto out;
> -- 
> 2.5.0

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:28   ` Ville Syrjälä
@ 2016-04-18  8:32     ` Imre Deak
  2016-04-18  8:44       ` Ville Syrjälä
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18  8:32 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > During system resume we depended on pci_enable_device() also
> > putting the
> > device into PCI D0 state. This won't work if the PCI device was
> > already
> > enabled but still in D3 state. This is because pci_enable_device()
> > is
> > refcounted and will not change the HW state if called with a non-
> > zero
> > refcount. Leaving the device in D3 will make all subsequent device
> > accesses fail.
> > 
> > This didn't cause a problem most of the time, since we resumed with
> > an
> > enable refcount of 0. But it fails at least after module reload
> > because
> > after that we also happen to leak a PCI device enable reference:
> > During
> > probing we call drm_get_pci_dev() which will enable the PCI device,
> > but
> > during device removal drm_put_dev() won't disable it. This is a bug
> > of
> > its own in DRM core, but without much harm as it only leaves the
> > PCI
> > device enabled. Fixing it is also a bit more involved, due to DRM
> > mid-layering and because it affects non-i915 drivers too. The fix
> > in
> > this patch is valid regardless of the problem in DRM core.
> > 
> > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > CC: stable@vger.kernel.org
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index d550ae2..7eaa93e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > *dev)
> >  static int i915_drm_resume_early(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	int ret = 0;
> > +	int ret;
> >  
> >  	/*
> >  	 * We have a resume ordering issue with the snd-hda driver
> > also
> > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > drm_device *dev)
> >  	 * FIXME: This should be solved with a special hdmi sink
> > device or
> >  	 * similar so that power domains can be employed.
> >  	 */
> > +
> > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > +	if (ret) {
> > +		DRM_ERROR("failed to set PCI D0 power state
> > (%d)\n", ret);
> > +		goto out;
> > +	}
> 
> Hmm. Doesn't this already happen from pci bus resume_noirq hook?

It does, but not during thaw_noirq.

> 
> > +
> >  	if (pci_enable_device(dev->pdev)) {
> >  		ret = -EIO;
> >  		goto out;
> > -- 
> > 2.5.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:24       ` [Intel-gfx] " Chris Wilson
@ 2016-04-18  8:37         ` Imre Deak
  2016-04-18  8:52           ` Chris Wilson
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18  8:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 09:24 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 11:16:34AM +0300, Imre Deak wrote:
> > On ma, 2016-04-18 at 09:06 +0100, Chris Wilson wrote:
> > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > During system resume we depended on pci_enable_device() also
> > > > putting the
> > > > device into PCI D0 state. This won't work if the PCI device was
> > > > already
> > > > enabled but still in D3 state. This is because
> > > > pci_enable_device()
> > > > is
> > > > refcounted and will not change the HW state if called with a
> > > > non-
> > > > zero
> > > > refcount. Leaving the device in D3 will make all subsequent
> > > > device
> > > > accesses fail.
> > > > 
> > > > This didn't cause a problem most of the time, since we resumed
> > > > with
> > > > an
> > > > enable refcount of 0. But it fails at least after module reload
> > > > because
> > > > after that we also happen to leak a PCI device enable
> > > > reference:
> > > > During
> > > > probing we call drm_get_pci_dev() which will enable the PCI
> > > > device,
> > > > but
> > > > during device removal drm_put_dev() won't disable it. This is a
> > > > bug
> > > > of
> > > > its own in DRM core, but without much harm as it only leaves
> > > > the
> > > > PCI
> > > > device enabled. Fixing it is also a bit more involved, due to
> > > > DRM
> > > > mid-layering and because it affects non-i915 drivers too. The
> > > > fix
> > > > in
> > > > this patch is valid regardless of the problem in DRM core.
> > > > 
> > > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > CC: stable@vger.kernel.org
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index d550ae2..7eaa93e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct
> > > > drm_device
> > > > *dev)
> > > >  static int i915_drm_resume_early(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	int ret = 0;
> > > > +	int ret;
> > > >  
> > > >  	/*
> > > >  	 * We have a resume ordering issue with the snd-hda
> > > > driver
> > > > also
> > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > > drm_device *dev)
> > > >  	 * FIXME: This should be solved with a special hdmi
> > > > sink
> > > > device or
> > > >  	 * similar so that power domains can be employed.
> > > >  	 */
> > > > +
> > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > +	if (ret) {
> > > > +		DRM_ERROR("failed to set PCI D0 power state
> > > > (%d)\n", ret);
> > > > +		goto out;
> > > > +	}
> > > 
> > > The device should be enabled first, otherwise we are not meant to
> > > be
> > > touching its IO space at all (such as twiddling power state). At
> > > least
> > > that is the order pci_enable_device() uses.
> > 
> > It's not MMIO or (port) IO but only a PCI config space access
> > that pci_set_power_state() requires, so doesn't need the enabling 
> > of PCI resources. AFAICS pci_enable_device() enables power as the
> > first
> > thing.
> > 
> > > Either way, upon failure we should be unwinding.
> > 
> > I'd rather wouldn't put back the device to D3 state, as further
> > device
> > access may still possible even though resume failed.
> 
> On the other hand, if you order it thusly:
> 
> 	if (!pci_enable_device())
> 		return -EIO;
> 
> 	ret = pci_set_power_state()
> 	if (ret < 0) {
> 		pci_disable_device()
> 		return ret;
> 	}
> 
> it doesn't raise as many eyebrows :)

I don't know. I guess you mean that enabling the device first seems
like the first logical step. To me enabling power is the first logical
step and enabling the device itself is the second.

In any case if we decide to change this, I would suggest doing it
separately since then we also need to change the disabling order during
suspend.

--Imre

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:32     ` Imre Deak
@ 2016-04-18  8:44       ` Ville Syrjälä
  2016-04-18  8:54         ` Imre Deak
  0 siblings, 1 reply; 34+ messages in thread
From: Ville Syrjälä @ 2016-04-18  8:44 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > During system resume we depended on pci_enable_device() also
> > > putting the
> > > device into PCI D0 state. This won't work if the PCI device was
> > > already
> > > enabled but still in D3 state. This is because pci_enable_device()
> > > is
> > > refcounted and will not change the HW state if called with a non-
> > > zero
> > > refcount. Leaving the device in D3 will make all subsequent device
> > > accesses fail.
> > > 
> > > This didn't cause a problem most of the time, since we resumed with
> > > an
> > > enable refcount of 0. But it fails at least after module reload
> > > because
> > > after that we also happen to leak a PCI device enable reference:
> > > During
> > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > but
> > > during device removal drm_put_dev() won't disable it. This is a bug
> > > of
> > > its own in DRM core, but without much harm as it only leaves the
> > > PCI
> > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > mid-layering and because it affects non-i915 drivers too. The fix
> > > in
> > > this patch is valid regardless of the problem in DRM core.
> > > 
> > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > CC: stable@vger.kernel.org
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index d550ae2..7eaa93e 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > *dev)
> > >  static int i915_drm_resume_early(struct drm_device *dev)
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	int ret = 0;
> > > +	int ret;
> > >  
> > >  	/*
> > >  	 * We have a resume ordering issue with the snd-hda driver
> > > also
> > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > drm_device *dev)
> > >  	 * FIXME: This should be solved with a special hdmi sink
> > > device or
> > >  	 * similar so that power domains can be employed.
> > >  	 */
> > > +
> > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > +	if (ret) {
> > > +		DRM_ERROR("failed to set PCI D0 power state
> > > (%d)\n", ret);
> > > +		goto out;
> > > +	}
> > 
> > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> 
> It does, but not during thaw_noirq.

Maybe put that into a comment? If we ever get to dropping the device
state frobbery from freeze/thaw, then we should also be able to throw
out the pci_set_power_state() call as well.

Perhaps we should have some asserts about the state of the PCI device?

> 
> > 
> > > +
> > >  	if (pci_enable_device(dev->pdev)) {
> > >  		ret = -EIO;
> > >  		goto out;
> > > -- 
> > > 2.5.0
> > 

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:37         ` Imre Deak
@ 2016-04-18  8:52           ` Chris Wilson
  2016-04-18 11:05             ` Imre Deak
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Wilson @ 2016-04-18  8:52 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:37:05AM +0300, Imre Deak wrote:
> I don't know. I guess you mean that enabling the device first seems
> like the first logical step. To me enabling power is the first logical
> step and enabling the device itself is the second.

It is not clear exactly, but what is clear is that pci_enable_device()
sets the power state of this device only after doing so for the bridge
hierachy.

The order in this patch is inconsistent.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:44       ` Ville Syrjälä
@ 2016-04-18  8:54         ` Imre Deak
  2016-04-18  9:04           ` Ville Syrjälä
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18  8:54 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 11:44 +0300, Ville Syrjälä wrote:
> On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> > On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > During system resume we depended on pci_enable_device() also
> > > > putting the
> > > > device into PCI D0 state. This won't work if the PCI device was
> > > > already
> > > > enabled but still in D3 state. This is because pci_enable_device()
> > > > is
> > > > refcounted and will not change the HW state if called with a non-
> > > > zero
> > > > refcount. Leaving the device in D3 will make all subsequent device
> > > > accesses fail.
> > > > 
> > > > This didn't cause a problem most of the time, since we resumed with
> > > > an
> > > > enable refcount of 0. But it fails at least after module reload
> > > > because
> > > > after that we also happen to leak a PCI device enable reference:
> > > > During
> > > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > > but
> > > > during device removal drm_put_dev() won't disable it. This is a bug
> > > > of
> > > > its own in DRM core, but without much harm as it only leaves the
> > > > PCI
> > > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > > mid-layering and because it affects non-i915 drivers too. The fix
> > > > in
> > > > this patch is valid regardless of the problem in DRM core.
> > > > 
> > > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > CC: stable@vger.kernel.org
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index d550ae2..7eaa93e 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > > *dev)
> > > >  static int i915_drm_resume_early(struct drm_device *dev)
> > > >  {
> > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > -	int ret = 0;
> > > > +	int ret;
> > > >  
> > > >  	/*
> > > >  	 * We have a resume ordering issue with the snd-hda driver
> > > > also
> > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > > drm_device *dev)
> > > >  	 * FIXME: This should be solved with a special hdmi sink
> > > > device or
> > > >  	 * similar so that power domains can be employed.
> > > >  	 */
> > > > +
> > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > +	if (ret) {
> > > > +		DRM_ERROR("failed to set PCI D0 power state
> > > > (%d)\n", ret);
> > > > +		goto out;
> > > > +	}
> > > 
> > > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> > 
> > It does, but not during thaw_noirq.
> 
> Maybe put that into a comment? If we ever get to dropping the device
> state frobbery from freeze/thaw, then we should also be able to throw
> out the pci_set_power_state() call as well.

Yes, can add a comment.

> Perhaps we should have some asserts about the state of the PCI device?

You mean after calling pci_enable_device() that it's indeed in D0 and
enabled? Can do that as a follow-up.

--Imre

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:54         ` Imre Deak
@ 2016-04-18  9:04           ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2016-04-18  9:04 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 11:54:31AM +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 11:44 +0300, Ville Syrjälä wrote:
> > On Mon, Apr 18, 2016 at 11:32:38AM +0300, Imre Deak wrote:
> > > On ma, 2016-04-18 at 11:28 +0300, Ville Syrjälä wrote:
> > > > On Mon, Apr 18, 2016 at 10:04:20AM +0300, Imre Deak wrote:
> > > > > During system resume we depended on pci_enable_device() also
> > > > > putting the
> > > > > device into PCI D0 state. This won't work if the PCI device was
> > > > > already
> > > > > enabled but still in D3 state. This is because pci_enable_device()
> > > > > is
> > > > > refcounted and will not change the HW state if called with a non-
> > > > > zero
> > > > > refcount. Leaving the device in D3 will make all subsequent device
> > > > > accesses fail.
> > > > > 
> > > > > This didn't cause a problem most of the time, since we resumed with
> > > > > an
> > > > > enable refcount of 0. But it fails at least after module reload
> > > > > because
> > > > > after that we also happen to leak a PCI device enable reference:
> > > > > During
> > > > > probing we call drm_get_pci_dev() which will enable the PCI device,
> > > > > but
> > > > > during device removal drm_put_dev() won't disable it. This is a bug
> > > > > of
> > > > > its own in DRM core, but without much harm as it only leaves the
> > > > > PCI
> > > > > device enabled. Fixing it is also a bit more involved, due to DRM
> > > > > mid-layering and because it affects non-i915 drivers too. The fix
> > > > > in
> > > > > this patch is valid regardless of the problem in DRM core.
> > > > > 
> > > > > CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > CC: stable@vger.kernel.org
> > > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_drv.c | 9 ++++++++-
> > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index d550ae2..7eaa93e 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device
> > > > > *dev)
> > > > >  static int i915_drm_resume_early(struct drm_device *dev)
> > > > >  {
> > > > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > > > > -	int ret = 0;
> > > > > +	int ret;
> > > > >  
> > > > >  	/*
> > > > >  	 * We have a resume ordering issue with the snd-hda driver
> > > > > also
> > > > > @@ -814,6 +814,13 @@ static int i915_drm_resume_early(struct
> > > > > drm_device *dev)
> > > > >  	 * FIXME: This should be solved with a special hdmi sink
> > > > > device or
> > > > >  	 * similar so that power domains can be employed.
> > > > >  	 */
> > > > > +
> > > > > +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> > > > > +	if (ret) {
> > > > > +		DRM_ERROR("failed to set PCI D0 power state
> > > > > (%d)\n", ret);
> > > > > +		goto out;
> > > > > +	}
> > > > 
> > > > Hmm. Doesn't this already happen from pci bus resume_noirq hook?
> > > 
> > > It does, but not during thaw_noirq.
> > 
> > Maybe put that into a comment? If we ever get to dropping the device
> > state frobbery from freeze/thaw, then we should also be able to throw
> > out the pci_set_power_state() call as well.
> 
> Yes, can add a comment.
> 
> > Perhaps we should have some asserts about the state of the PCI device?
> 
> You mean after calling pci_enable_device() that it's indeed in D0 and
> enabled? Can do that as a follow-up.

Yeah, was thinking that we could assert that we're in D0, required BARs,
BM, pci int/msi etc. are enabled.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
@ 2016-04-18 11:05   ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2016-04-18 11:05 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, stable

On Mon, Apr 18, 2016 at 10:04:21AM +0300, Imre Deak wrote:
> The driver's VDD on/off logic assumes that whenever the VDD is on we
> also hold an AUX power domain reference. Since BIOS can leave the VDD on
> during booting and resuming and on DDI platforms we won't take a
> corresponding power reference, the above assumption won't hold on those
> platforms and an eventual delayed VDD off work will do an extraneous AUX
> power domain put resulting in a refcount underflow. Fix this the same
> way we did this for non-DDI DP encoders:
> 
> 6d93c0c41760c0 ("drm/i915: fix VDD state tracking after system resume")
> 
> At the same time call the DP encoder suspend handler the same way as the
> non-DDI DP encoders do to flush any pending VDD off work. Leaving the
> work running may cause a HW access where we don't expect this (at a point
> where power domains are suspended already).
> 
> While at it remove an unnecessary function call indirection.
> 
> This fixed for me AUX refcount underflow problems on BXT during
> suspend/resume.
> 
> CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_ddi.c | 10 +++-------
>  drivers/gpu/drm/i915/intel_dp.c  |  4 ++--
>  drivers/gpu/drm/i915/intel_drv.h |  2 ++
>  3 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> index 96234c5..c2348fb 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -2206,12 +2206,6 @@ void intel_ddi_get_config(struct intel_encoder *encoder,
>  	intel_ddi_clock_get(encoder, pipe_config);
>  }
>  
> -static void intel_ddi_destroy(struct drm_encoder *encoder)
> -{
> -	/* HDMI has nothing special to destroy, so we can go with this. */
> -	intel_dp_encoder_destroy(encoder);
> -}
> -
>  static bool intel_ddi_compute_config(struct intel_encoder *encoder,
>  				     struct intel_crtc_state *pipe_config)
>  {
> @@ -2230,7 +2224,8 @@ static bool intel_ddi_compute_config(struct intel_encoder *encoder,
>  }
>  
>  static const struct drm_encoder_funcs intel_ddi_funcs = {
> -	.destroy = intel_ddi_destroy,
> +	.reset = intel_dp_encoder_reset,
> +	.destroy = intel_dp_encoder_destroy,
>  };
>  
>  static struct intel_connector *
> @@ -2329,6 +2324,7 @@ void intel_ddi_init(struct drm_device *dev, enum port port)
>  	intel_encoder->post_disable = intel_ddi_post_disable;
>  	intel_encoder->get_hw_state = intel_ddi_get_hw_state;
>  	intel_encoder->get_config = intel_ddi_get_config;
> +	intel_encoder->suspend = intel_dp_encoder_suspend;
>  
>  	intel_dig_port->port = port;
>  	intel_dig_port->saved_port_bits = I915_READ(DDI_BUF_CTL(port)) &
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 61ee226..c6af3d0 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -4889,7 +4889,7 @@ void intel_dp_encoder_destroy(struct drm_encoder *encoder)
>  	kfree(intel_dig_port);
>  }
>  
> -static void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
> +void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
>  {
>  	struct intel_dp *intel_dp = enc_to_intel_dp(&intel_encoder->base);
>  
> @@ -4931,7 +4931,7 @@ static void intel_edp_panel_vdd_sanitize(struct intel_dp *intel_dp)
>  	edp_panel_vdd_schedule_off(intel_dp);
>  }
>  
> -static void intel_dp_encoder_reset(struct drm_encoder *encoder)
> +void intel_dp_encoder_reset(struct drm_encoder *encoder)
>  {
>  	struct intel_dp *intel_dp;
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index e13ce22..10dfe72 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1285,6 +1285,8 @@ void intel_dp_set_link_params(struct intel_dp *intel_dp,
>  void intel_dp_start_link_train(struct intel_dp *intel_dp);
>  void intel_dp_stop_link_train(struct intel_dp *intel_dp);
>  void intel_dp_sink_dpms(struct intel_dp *intel_dp, int mode);
> +void intel_dp_encoder_reset(struct drm_encoder *encoder);
> +void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder);
>  void intel_dp_encoder_destroy(struct drm_encoder *encoder);
>  int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc);
>  bool intel_dp_compute_config(struct intel_encoder *encoder,
> -- 
> 2.5.0

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  8:52           ` Chris Wilson
@ 2016-04-18 11:05             ` Imre Deak
  0 siblings, 0 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18 11:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On ma, 2016-04-18 at 09:52 +0100, Chris Wilson wrote:
> On Mon, Apr 18, 2016 at 11:37:05AM +0300, Imre Deak wrote:
> > I don't know. I guess you mean that enabling the device first seems
> > like the first logical step. To me enabling power is the first
> > logical
> > step and enabling the device itself is the second.
> 
> It is not clear exactly, but what is clear is that
> pci_enable_device()
> sets the power state of this device only after doing so for the
> bridge
> hierachy.

That bridge enabling in pci_enable_deivce() won't make a difference
since the suspend/resume order is already defined so that bridge
devices are suspended last and resumed first. Otoh, the current order
is to call pci_set_power_state() first and then pci_enable_device()
both for our device and all the rest of PCI drivers/devices that I
checked so far. Note that we can't even change this order during the
suspend/resume phase (as opposed to the freeze/thaw phase) since then
it's the PCI core that imposes the order already. So if I would change
this now as you suggest, we would have a different order during the
suspend/resume and the freeze/thaw phases.

> The order in this patch is inconsistent.

I attribute this inconsistency to the sloppiness of the PCI API. I
don't want to change the order in this fix, but I can follow up with a
patch that removes the pci_disable_device()/pci_enable_device() from
our suspend/resume hooks. We can't anyway depend on these doing
anything, since they are dependent on the device enable refcount which
is out of reach of the driver.

I can also add now a code comment about this.

--Imre

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
  2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
  2016-04-18  8:28   ` Ville Syrjälä
@ 2016-04-18 11:45   ` Imre Deak
  2016-04-18 14:59     ` Ville Syrjälä
  2 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18 11:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: stable

During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

v2:
- Add a code comment about the relation of this fix to the freeze/thaw
  vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
  and device enable calls. (Chris)

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d550ae2..3b79e97 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
 static int i915_drm_resume_early(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret = 0;
+	int ret;
 
 	/*
 	 * We have a resume ordering issue with the snd-hda driver also
@@ -814,6 +814,36 @@ static int i915_drm_resume_early(struct drm_device *dev)
 	 * FIXME: This should be solved with a special hdmi sink device or
 	 * similar so that power domains can be employed.
 	 */
+
+	/*
+	 * Note that we need to set the power state explicitly, since we
+	 * powered off the device during freeze and the PCI core won't power
+	 * it back up for us during thaw. Powering off the device during
+	 * freeze is not a hard requirement though, and during the
+	 * suspend/resume phases the PCI core makes sure we get here with the
+	 * device powered on. So in case we change our freeze logic and keep
+	 * the device powered we can also remove the following set power state
+	 * call.
+	 */
+	ret = pci_set_power_state(dev->pdev, PCI_D0);
+	if (ret) {
+		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
+		goto out;
+	}
+
+	/*
+	 * Note that pci_enable_device() first enables any parent bridge
+	 * device and only then sets the power state for this device. The
+	 * bridge enabling is a nop though, since bridge devices are resumed
+	 * first. The order of enabling power and enabling the device is
+	 * imposed by the PCI core as described above, so here we preserve the
+	 * same order for the freeze/thaw phases.
+	 *
+	 * TODO: eventually we should remove pci_disable_device() /
+	 * pci_enable_enable_device() from suspend/resume. Due to how they
+	 * depend on the device enable refcount we can't anyway depend on them
+	 * disabling/enabling the device.
+	 */
 	if (pci_enable_device(dev->pdev)) {
 		ret = -EIO;
 		goto out;
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-18  7:04 ` [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded Imre Deak
  2016-04-18  7:49   ` Chris Wilson
@ 2016-04-18 11:48   ` Imre Deak
  2016-04-18 12:07     ` Chris Wilson
  2016-04-20 13:02     ` Daniel Vetter
  1 sibling, 2 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-18 11:48 UTC (permalink / raw)
  To: intel-gfx

While we disable runtime PM and with that display power well support if
the DMC firmware isn't loaded, we still want to disable power wells
during system suspend and driver unload. So drop/reacquire the
corresponding power refcount during suspend/resume and driver unloading.
This also means we have to check if DMC is not loaded and skip enabling
DC states in the power well code.

v2:
- Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
  opencoding the former. (Chris)
- Add docbook comment to the public resume and suspend functions.

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>

---

As this fixes the special case when firmware loading failed with the
result of possibly leaving power wells enabled during suspend/resume and
driver unloading it's not for -fixes or stable.
---
 drivers/gpu/drm/i915/i915_drv.c         |  5 ++--
 drivers/gpu/drm/i915/intel_csr.c        | 44 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
 4 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 3b79e97..9a016f1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -640,8 +640,7 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_display_set_init_power(dev_priv, false);
 
-	if (HAS_CSR(dev_priv))
-		flush_work(&dev_priv->csr.work);
+	intel_csr_ucode_suspend(dev_priv);
 
 out:
 	enable_rpm_wakeref_asserts(dev_priv);
@@ -733,6 +732,8 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	disable_rpm_wakeref_asserts(dev_priv);
 
+	intel_csr_ucode_resume(dev_priv);
+
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_restore_gtt_mappings(dev);
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index d57b00e..a34c23e 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -467,10 +467,50 @@ void intel_csr_ucode_init(struct drm_i915_private *dev_priv)
 }
 
 /**
+ * intel_csr_ucode_suspend() - prepare CSR firmware before system suspend
+ * @dev_priv: i915 drm device
+ *
+ * Prepare the DMC firmware before entering system suspend. This includes
+ * flushing pending work items and releasing any resources acquired during
+ * init.
+ */
+void intel_csr_ucode_suspend(struct drm_i915_private *dev_priv)
+{
+	if (!HAS_CSR(dev_priv))
+		return;
+
+	flush_work(&dev_priv->csr.work);
+
+	/* Drop the reference held in case DMC isn't loaded. */
+	if (!dev_priv->csr.dmc_payload)
+		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+}
+
+/**
+ * intel_csr_ucode_resume() - init CSR firmware during system resume
+ * @dev_priv: i915 drm device
+ *
+ * Reinitialize the DMC firmware during system resume, reacquiring any
+ * resources released in intel_csr_ucode_suspend().
+ */
+void intel_csr_ucode_resume(struct drm_i915_private *dev_priv)
+{
+	if (!HAS_CSR(dev_priv))
+		return;
+
+	/*
+	 * Reacquire the reference to keep RPM disabled in case DMC isn't
+	 * loaded.
+	 */
+	if (!dev_priv->csr.dmc_payload)
+		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+}
+
+/**
  * intel_csr_ucode_fini() - unload the CSR firmware.
  * @dev_priv: i915 drm device.
  *
- * Firmmware unloading includes freeing the internal momory and reset the
+ * Firmmware unloading includes freeing the internal memory and reset the
  * firmware loading status.
  */
 void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
@@ -478,7 +518,7 @@ void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
 	if (!HAS_CSR(dev_priv))
 		return;
 
-	flush_work(&dev_priv->csr.work);
+	intel_csr_ucode_suspend(dev_priv);
 
 	kfree(dev_priv->csr.dmc_payload);
 }
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 10dfe72..beed9e8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1275,6 +1275,8 @@ u32 skl_plane_ctl_rotation(unsigned int rotation);
 void intel_csr_ucode_init(struct drm_i915_private *);
 void intel_csr_load_program(struct drm_i915_private *);
 void intel_csr_ucode_fini(struct drm_i915_private *);
+void intel_csr_ucode_suspend(struct drm_i915_private *);
+void intel_csr_ucode_resume(struct drm_i915_private *);
 
 /* intel_dp.c */
 void intel_dp_init(struct drm_device *dev, i915_reg_t output_reg, enum port port);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 259f66f..bfa8548 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -809,6 +809,9 @@ static void gen9_dc_off_power_well_enable(struct drm_i915_private *dev_priv,
 static void gen9_dc_off_power_well_disable(struct drm_i915_private *dev_priv,
 					   struct i915_power_well *power_well)
 {
+	if (!dev_priv->csr.dmc_payload)
+		return;
+
 	if (dev_priv->csr.allowed_dc_mask & DC_STATE_EN_UPTO_DC6)
 		skl_enable_dc6(dev_priv);
 	else if (dev_priv->csr.allowed_dc_mask & DC_STATE_EN_UPTO_DC5)
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-18 11:48   ` [PATCH v2 " Imre Deak
@ 2016-04-18 12:07     ` Chris Wilson
  2016-04-20 13:02     ` Daniel Vetter
  1 sibling, 0 replies; 34+ messages in thread
From: Chris Wilson @ 2016-04-18 12:07 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Apr 18, 2016 at 02:48:21PM +0300, Imre Deak wrote:
> While we disable runtime PM and with that display power well support if
> the DMC firmware isn't loaded, we still want to disable power wells
> during system suspend and driver unload. So drop/reacquire the
> corresponding power refcount during suspend/resume and driver unloading.
> This also means we have to check if DMC is not loaded and skip enabling
> DC states in the power well code.
> 
> v2:
> - Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
>   opencoding the former. (Chris)
> - Add docbook comment to the public resume and suspend functions.
> 
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Makes sense to me (I'm will say I am not that familiar with CSR though)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3)
  2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
                   ` (4 preceding siblings ...)
  2016-04-18  7:24 ` ✓ Fi.CI.BAT: success for drm/i915: Fix a few suspend/resume and driver unload bugs Patchwork
@ 2016-04-18 13:57 ` Patchwork
  2016-04-18 15:45   ` Imre Deak
  5 siblings, 1 reply; 34+ messages in thread
From: Patchwork @ 2016-04-18 13:57 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Fix a few suspend/resume and driver unload bugs (rev3)
URL   : https://patchwork.freedesktop.org/series/5856/
State : failure

== Summary ==

Series 5856v3 drm/i915: Fix a few suspend/resume and driver unload bugs
http://patchwork.freedesktop.org/api/1.0/series/5856/revisions/3/mbox/

Test gem_exec_whisper:
        Subgroup basic:
                incomplete -> PASS       (bdw-nuci7)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup basic-flip-vs-modeset:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> FAIL       (snb-x220t)
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-c:
                pass       -> SKIP       (hsw-brixbox)

bdw-nuci7        total:203  pass:191  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:23 
bsw-nuc-2        total:202  pass:163  dwarn:0   dfail:0   fail:0   skip:39 
byt-nuc          total:202  pass:164  dwarn:0   dfail:0   fail:0   skip:38 
hsw-brixbox      total:203  pass:178  dwarn:0   dfail:0   fail:0   skip:25 
hsw-gt2          total:203  pass:184  dwarn:0   dfail:0   fail:0   skip:19 
ilk-hp8440p      total:203  pass:135  dwarn:0   dfail:0   fail:0   skip:68 
ivb-t430s        total:203  pass:175  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:203  pass:176  dwarn:2   dfail:0   fail:0   skip:25 
skl-nuci5        total:203  pass:192  dwarn:0   dfail:0   fail:0   skip:11 
snb-dellxps      total:203  pass:165  dwarn:0   dfail:0   fail:0   skip:38 
snb-x220t        total:203  pass:164  dwarn:0   dfail:0   fail:2   skip:37 

Results at /archive/results/CI_IGT_test/Patchwork_1927/

78673b24b60c1a884c947ee5a45ad860ca618418 drm-intel-nightly: 2016y-04m-18d-12h-32m-33s UTC integration manifest
a0cb3b0 drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
c48ce3c drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
e2672e0 drm/i915: Fix system resume if PCI device remained enabled
5e7b1ad drm/i915: Fix error path in i915_drm_resume_early

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/4] drm/i915: Fix system resume if PCI device remained enabled
  2016-04-18 11:45   ` [PATCH v2 " Imre Deak
@ 2016-04-18 14:59     ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2016-04-18 14:59 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, Chris Wilson, stable

On Mon, Apr 18, 2016 at 02:45:54PM +0300, Imre Deak wrote:
> During system resume we depended on pci_enable_device() also putting the
> device into PCI D0 state. This won't work if the PCI device was already
> enabled but still in D3 state. This is because pci_enable_device() is
> refcounted and will not change the HW state if called with a non-zero
> refcount. Leaving the device in D3 will make all subsequent device
> accesses fail.
> 
> This didn't cause a problem most of the time, since we resumed with an
> enable refcount of 0. But it fails at least after module reload because
> after that we also happen to leak a PCI device enable reference: During
> probing we call drm_get_pci_dev() which will enable the PCI device, but
> during device removal drm_put_dev() won't disable it. This is a bug of
> its own in DRM core, but without much harm as it only leaves the PCI
> device enabled. Fixing it is also a bit more involved, due to DRM
> mid-layering and because it affects non-i915 drivers too. The fix in
> this patch is valid regardless of the problem in DRM core.
> 
> v2:
> - Add a code comment about the relation of this fix to the freeze/thaw
>   vs. the suspend/resume phases. (Ville)
> - Add a code comment about the inconsistent ordering of set power state
>   and device enable calls. (Chris)
> 
> CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> CC: stable@vger.kernel.org
> Signed-off-by: Imre Deak <imre.deak@intel.com>

The PCI PM code is a bit of a mess, but it would be appear this should
do what we need.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d550ae2..3b79e97 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -803,7 +803,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  static int i915_drm_resume_early(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	int ret = 0;
> +	int ret;
>  
>  	/*
>  	 * We have a resume ordering issue with the snd-hda driver also
> @@ -814,6 +814,36 @@ static int i915_drm_resume_early(struct drm_device *dev)
>  	 * FIXME: This should be solved with a special hdmi sink device or
>  	 * similar so that power domains can be employed.
>  	 */
> +
> +	/*
> +	 * Note that we need to set the power state explicitly, since we
> +	 * powered off the device during freeze and the PCI core won't power
> +	 * it back up for us during thaw. Powering off the device during
> +	 * freeze is not a hard requirement though, and during the
> +	 * suspend/resume phases the PCI core makes sure we get here with the
> +	 * device powered on. So in case we change our freeze logic and keep
> +	 * the device powered we can also remove the following set power state
> +	 * call.
> +	 */
> +	ret = pci_set_power_state(dev->pdev, PCI_D0);
> +	if (ret) {
> +		DRM_ERROR("failed to set PCI D0 power state (%d)\n", ret);
> +		goto out;
> +	}
> +
> +	/*
> +	 * Note that pci_enable_device() first enables any parent bridge
> +	 * device and only then sets the power state for this device. The
> +	 * bridge enabling is a nop though, since bridge devices are resumed
> +	 * first. The order of enabling power and enabling the device is
> +	 * imposed by the PCI core as described above, so here we preserve the
> +	 * same order for the freeze/thaw phases.
> +	 *
> +	 * TODO: eventually we should remove pci_disable_device() /
> +	 * pci_enable_enable_device() from suspend/resume. Due to how they
> +	 * depend on the device enable refcount we can't anyway depend on them
> +	 * disabling/enabling the device.
> +	 */
>  	if (pci_enable_device(dev->pdev)) {
>  		ret = -EIO;
>  		goto out;
> -- 
> 2.5.0

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3)
  2016-04-18 13:57 ` ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3) Patchwork
@ 2016-04-18 15:45   ` Imre Deak
  2016-04-19  9:42     ` Imre Deak
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-18 15:45 UTC (permalink / raw)
  To: intel-gfx

On ma, 2016-04-18 at 13:57 +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Fix a few suspend/resume and driver unload bugs
> (rev3)
> URL   : https://patchwork.freedesktop.org/series/5856/
> State : failure
> 
> == Summary ==
> 
> Series 5856v3 drm/i915: Fix a few suspend/resume and driver unload
> bugs
> http://patchwork.freedesktop.org/api/1.0/series/5856/revisions/3/mbox
> /
> 
> Test gem_exec_whisper:
>         Subgroup basic:
>                 incomplete -> PASS       (bdw-nuci7)
> Test kms_flip:
>         Subgroup basic-flip-vs-dpms:
>                 pass       -> DMESG-WARN (skl-i7k-2)

Pre-existing issue from an earlier CI run, intel_atomic_commit()
failure during DPMS set property followed by a wait for vblank timeout:
http://benchsrv.fi.intel.com//archive/results/CI_IGT_test/CI_DRM_1177/skl-i7k-2/html/skl-i7k-2@CI_DRM_1177@1/igt@kms_flip@basic-flip-vs-dpms.html

I opened a new bug for this:
https://bugs.freedesktop.org/show_bug.cgi?id=94992

>         Subgroup basic-flip-vs-modeset:
>                 pass       -> DMESG-WARN (skl-i7k-2)
Pre-existing issue from an earlier CI run, intel_atomic_commit()
failure during set CRTC IOCTL followed by a wait for vblank timeout:
http://benchsrv.fi.intel.com//archive/results/CI_IGT_test/CI_DRM_1224/skl-i7k-2/html/skl-i7k-2@CI_DRM_1224@1/igt@kms_flip@basic-flip-vs-modeset.html

I opened a new bug for this:
https://bugs.freedesktop.org/show_bug.cgi?id=94993

>         Subgroup basic-flip-vs-wf_vblank:
>                 pass       -> FAIL       (snb-x220t)

Pre-existing 1 frame jitter bug:
https://bugs.freedesktop.org/show_bug.cgi?id=94294

> Test kms_pipe_crc_basic:
>         Subgroup read-crc-pipe-c:
>                 pass       -> SKIP       (hsw-brixbox)

Looks like a problem with detecting the display output (HDMI):
"""
Test requirement not met in function test_read_crc, file kms_pipe_crc_basic.c:178:
Test requirement: valid_connectors
No connector found for pipe 2
"""

Maybe the live status check is acting up? There were previous cases
where CI didn't run this test on hsw-brixbox:
http://benchsrv.fi.intel.com/archive/results/CI_IGT_test/igt@kms_pipe_crc_basic@read-crc-pipe-c.html

> 
> bdw-
> nuci7        total:203  pass:191  dwarn:0   dfail:0   fail:0   skip:1
> 2 
> bdw-
> ultra        total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:2
> 3 
> bsw-nuc-
> 2        total:202  pass:163  dwarn:0   dfail:0   fail:0   skip:39 
> byt-
> nuc          total:202  pass:164  dwarn:0   dfail:0   fail:0   skip:3
> 8 
> hsw-
> brixbox      total:203  pass:178  dwarn:0   dfail:0   fail:0   skip:2
> 5 
> hsw-
> gt2          total:203  pass:184  dwarn:0   dfail:0   fail:0   skip:1
> 9 
> ilk-
> hp8440p      total:203  pass:135  dwarn:0   dfail:0   fail:0   skip:6
> 8 
> ivb-
> t430s        total:203  pass:175  dwarn:0   dfail:0   fail:0   skip:2
> 8 
> skl-i7k-
> 2        total:203  pass:176  dwarn:2   dfail:0   fail:0   skip:25 
> skl-
> nuci5        total:203  pass:192  dwarn:0   dfail:0   fail:0   skip:1
> 1 
> snb-
> dellxps      total:203  pass:165  dwarn:0   dfail:0   fail:0   skip:3
> 8 
> snb-
> x220t        total:203  pass:164  dwarn:0   dfail:0   fail:2   skip:3
> 7 
> 
> Results at /archive/results/CI_IGT_test/Patchwork_1927/
> 
> 78673b24b60c1a884c947ee5a45ad860ca618418 drm-intel-nightly: 2016y-
> 04m-18d-12h-32m-33s UTC integration manifest
> a0cb3b0 drm/i915/gen9: Fix runtime PM refcounting in case DMC
> firmware isn't loaded
> c48ce3c drm/i915/ddi: Fix eDP VDD handling during booting and
> suspend/resume
> e2672e0 drm/i915: Fix system resume if PCI device remained enabled
> 5e7b1ad drm/i915: Fix error path in i915_drm_resume_early
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3)
  2016-04-18 15:45   ` Imre Deak
@ 2016-04-19  9:42     ` Imre Deak
  0 siblings, 0 replies; 34+ messages in thread
From: Imre Deak @ 2016-04-19  9:42 UTC (permalink / raw)
  To: intel-gfx, Chris Wilson, Ville Syrjälä

On ma, 2016-04-18 at 18:45 +0300, Imre Deak wrote:
> On ma, 2016-04-18 at 13:57 +0000, Patchwork wrote:
> > == Series Details ==
> > 
> > Series: drm/i915: Fix a few suspend/resume and driver unload bugs
> > (rev3)
> > URL   : https://patchwork.freedesktop.org/series/5856/
> > State : failure
> > 
> > == Summary ==
> > 
> > Series 5856v3 drm/i915: Fix a few suspend/resume and driver unload
> > bugs
> > http://patchwork.freedesktop.org/api/1.0/series/5856/revisions/3/mbox
> > /
> > 
> > Test gem_exec_whisper:
> >         Subgroup basic:
> >                 incomplete -> PASS       (bdw-nuci7)
> > Test kms_flip:
> >         Subgroup basic-flip-vs-dpms:
> >                 pass       -> DMESG-WARN (skl-i7k-2)
> 
> Pre-existing issue from an earlier CI run, intel_atomic_commit()
> failure during DPMS set property followed by a wait for vblank timeout:
> http://benchsrv.fi.intel.com//archive/results/CI_IGT_test/CI_DRM_1177/skl-i7k-2/html/skl-i7k-2@CI_DRM_1177@1/igt@kms_flip@basic-flip-vs-dpms.html
> 
> I opened a new bug for this:
> https://bugs.freedesktop.org/show_bug.cgi?id=94992
> 
> >         Subgroup basic-flip-vs-modeset:
> >                 pass       -> DMESG-WARN (skl-i7k-2)
> Pre-existing issue from an earlier CI run, intel_atomic_commit()
> failure during set CRTC IOCTL followed by a wait for vblank timeout:
> http://benchsrv.fi.intel.com//archive/results/CI_IGT_test/CI_DRM_1224/skl-i7k-2/html/skl-i7k-2@CI_DRM_1224@1/igt@kms_flip@basic-flip-vs-modeset.html
> 
> I opened a new bug for this:
> https://bugs.freedesktop.org/show_bug.cgi?id=94993
> 
> >         Subgroup basic-flip-vs-wf_vblank:
> >                 pass       -> FAIL       (snb-x220t)
> 
> Pre-existing 1 frame jitter bug:
> https://bugs.freedesktop.org/show_bug.cgi?id=94294
> 
> > Test kms_pipe_crc_basic:
> >         Subgroup read-crc-pipe-c:
> >                 pass       -> SKIP       (hsw-brixbox)
> 
> Looks like a problem with detecting the display output (HDMI):
> """
> Test requirement not met in function test_read_crc, file kms_pipe_crc_basic.c:178:
> Test requirement: valid_connectors
> No connector found for pipe 2
> """
> 
> Maybe the live status check is acting up? There were previous cases
> where CI didn't run this test on hsw-brixbox:
> http://benchsrv.fi.intel.com/archive/results/CI_IGT_test/igt@kms_pipe_crc_basic@read-crc-pipe-c.html

Thanks for the reviews, I pushed the patches to -dinq.

--Imre

> > bdw-
> > nuci7        total:203  pass:191  dwarn:0   dfail:0   fail:0   skip:1
> > 2 
> > bdw-
> > ultra        total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:2
> > 3 
> > bsw-nuc-
> > 2        total:202  pass:163  dwarn:0   dfail:0   fail:0   skip:39 
> > byt-
> > nuc          total:202  pass:164  dwarn:0   dfail:0   fail:0   skip:3
> > 8 
> > hsw-
> > brixbox      total:203  pass:178  dwarn:0   dfail:0   fail:0   skip:2
> > 5 
> > hsw-
> > gt2          total:203  pass:184  dwarn:0   dfail:0   fail:0   skip:1
> > 9 
> > ilk-
> > hp8440p      total:203  pass:135  dwarn:0   dfail:0   fail:0   skip:6
> > 8 
> > ivb-
> > t430s        total:203  pass:175  dwarn:0   dfail:0   fail:0   skip:2
> > 8 
> > skl-i7k-
> > 2        total:203  pass:176  dwarn:2   dfail:0   fail:0   skip:25 
> > skl-
> > nuci5        total:203  pass:192  dwarn:0   dfail:0   fail:0   skip:1
> > 1 
> > snb-
> > dellxps      total:203  pass:165  dwarn:0   dfail:0   fail:0   skip:3
> > 8 
> > snb-
> > x220t        total:203  pass:164  dwarn:0   dfail:0   fail:2   skip:3
> > 7 
> > 
> > Results at /archive/results/CI_IGT_test/Patchwork_1927/
> > 
> > 78673b24b60c1a884c947ee5a45ad860ca618418 drm-intel-nightly: 2016y-
> > 04m-18d-12h-32m-33s UTC integration manifest
> > a0cb3b0 drm/i915/gen9: Fix runtime PM refcounting in case DMC
> > firmware isn't loaded
> > c48ce3c drm/i915/ddi: Fix eDP VDD handling during booting and
> > suspend/resume
> > e2672e0 drm/i915: Fix system resume if PCI device remained enabled
> > 5e7b1ad drm/i915: Fix error path in i915_drm_resume_early
> > 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-18 11:48   ` [PATCH v2 " Imre Deak
  2016-04-18 12:07     ` Chris Wilson
@ 2016-04-20 13:02     ` Daniel Vetter
  2016-04-20 13:13       ` Imre Deak
  1 sibling, 1 reply; 34+ messages in thread
From: Daniel Vetter @ 2016-04-20 13:02 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Apr 18, 2016 at 02:48:21PM +0300, Imre Deak wrote:
> While we disable runtime PM and with that display power well support if
> the DMC firmware isn't loaded, we still want to disable power wells
> during system suspend and driver unload. So drop/reacquire the
> corresponding power refcount during suspend/resume and driver unloading.
> This also means we have to check if DMC is not loaded and skip enabling
> DC states in the power well code.
> 
> v2:
> - Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
>   opencoding the former. (Chris)
> - Add docbook comment to the public resume and suspend functions.
> 
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> 
> ---
> 
> As this fixes the special case when firmware loading failed with the
> result of possibly leaving power wells enabled during suspend/resume and
> driver unloading it's not for -fixes or stable.

Do we even care about this to bother fixing it? We pretty much agreed that
we need dmc to make power management work on skl, so if it doesn't work
too well over suspend/resume (soix), meh.

And your patch here does make the suspend/resume code even more fragile
and filled with special cases, so not too sure we should apply this one.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.c         |  5 ++--
>  drivers/gpu/drm/i915/intel_csr.c        | 44 +++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
>  4 files changed, 50 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 3b79e97..9a016f1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -640,8 +640,7 @@ static int i915_drm_suspend(struct drm_device *dev)
>  
>  	intel_display_set_init_power(dev_priv, false);
>  
> -	if (HAS_CSR(dev_priv))
> -		flush_work(&dev_priv->csr.work);
> +	intel_csr_ucode_suspend(dev_priv);
>  
>  out:
>  	enable_rpm_wakeref_asserts(dev_priv);
> @@ -733,6 +732,8 @@ static int i915_drm_resume(struct drm_device *dev)
>  
>  	disable_rpm_wakeref_asserts(dev_priv);
>  
> +	intel_csr_ucode_resume(dev_priv);
> +
>  	mutex_lock(&dev->struct_mutex);
>  	i915_gem_restore_gtt_mappings(dev);
>  	mutex_unlock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> index d57b00e..a34c23e 100644
> --- a/drivers/gpu/drm/i915/intel_csr.c
> +++ b/drivers/gpu/drm/i915/intel_csr.c
> @@ -467,10 +467,50 @@ void intel_csr_ucode_init(struct drm_i915_private *dev_priv)
>  }
>  
>  /**
> + * intel_csr_ucode_suspend() - prepare CSR firmware before system suspend
> + * @dev_priv: i915 drm device
> + *
> + * Prepare the DMC firmware before entering system suspend. This includes
> + * flushing pending work items and releasing any resources acquired during
> + * init.
> + */
> +void intel_csr_ucode_suspend(struct drm_i915_private *dev_priv)
> +{
> +	if (!HAS_CSR(dev_priv))
> +		return;
> +
> +	flush_work(&dev_priv->csr.work);
> +
> +	/* Drop the reference held in case DMC isn't loaded. */
> +	if (!dev_priv->csr.dmc_payload)
> +		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
> +}
> +
> +/**
> + * intel_csr_ucode_resume() - init CSR firmware during system resume
> + * @dev_priv: i915 drm device
> + *
> + * Reinitialize the DMC firmware during system resume, reacquiring any
> + * resources released in intel_csr_ucode_suspend().
> + */
> +void intel_csr_ucode_resume(struct drm_i915_private *dev_priv)
> +{
> +	if (!HAS_CSR(dev_priv))
> +		return;
> +
> +	/*
> +	 * Reacquire the reference to keep RPM disabled in case DMC isn't
> +	 * loaded.
> +	 */
> +	if (!dev_priv->csr.dmc_payload)
> +		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
> +}
> +
> +/**
>   * intel_csr_ucode_fini() - unload the CSR firmware.
>   * @dev_priv: i915 drm device.
>   *
> - * Firmmware unloading includes freeing the internal momory and reset the
> + * Firmmware unloading includes freeing the internal memory and reset the
>   * firmware loading status.
>   */
>  void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
> @@ -478,7 +518,7 @@ void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
>  	if (!HAS_CSR(dev_priv))
>  		return;
>  
> -	flush_work(&dev_priv->csr.work);
> +	intel_csr_ucode_suspend(dev_priv);
>  
>  	kfree(dev_priv->csr.dmc_payload);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 10dfe72..beed9e8 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1275,6 +1275,8 @@ u32 skl_plane_ctl_rotation(unsigned int rotation);
>  void intel_csr_ucode_init(struct drm_i915_private *);
>  void intel_csr_load_program(struct drm_i915_private *);
>  void intel_csr_ucode_fini(struct drm_i915_private *);
> +void intel_csr_ucode_suspend(struct drm_i915_private *);
> +void intel_csr_ucode_resume(struct drm_i915_private *);
>  
>  /* intel_dp.c */
>  void intel_dp_init(struct drm_device *dev, i915_reg_t output_reg, enum port port);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 259f66f..bfa8548 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -809,6 +809,9 @@ static void gen9_dc_off_power_well_enable(struct drm_i915_private *dev_priv,
>  static void gen9_dc_off_power_well_disable(struct drm_i915_private *dev_priv,
>  					   struct i915_power_well *power_well)
>  {
> +	if (!dev_priv->csr.dmc_payload)
> +		return;
> +
>  	if (dev_priv->csr.allowed_dc_mask & DC_STATE_EN_UPTO_DC6)
>  		skl_enable_dc6(dev_priv);
>  	else if (dev_priv->csr.allowed_dc_mask & DC_STATE_EN_UPTO_DC5)
> -- 
> 2.5.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-20 13:02     ` Daniel Vetter
@ 2016-04-20 13:13       ` Imre Deak
  2016-04-20 13:16         ` Daniel Vetter
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-20 13:13 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On ke, 2016-04-20 at 15:02 +0200, Daniel Vetter wrote:
> On Mon, Apr 18, 2016 at 02:48:21PM +0300, Imre Deak wrote:
> > While we disable runtime PM and with that display power well support if
> > the DMC firmware isn't loaded, we still want to disable power wells
> > during system suspend and driver unload. So drop/reacquire the
> > corresponding power refcount during suspend/resume and driver unloading.
> > This also means we have to check if DMC is not loaded and skip enabling
> > DC states in the power well code.
> > 
> > v2:
> > - Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
> >   opencoding the former. (Chris)
> > - Add docbook comment to the public resume and suspend functions.
> > 
> > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > 
> > ---
> > 
> > As this fixes the special case when firmware loading failed with the
> > result of possibly leaving power wells enabled during suspend/resume and
> > driver unloading it's not for -fixes or stable.
> 
> Do we even care about this to bother fixing it? We pretty much agreed that
> we need dmc to make power management work on skl, so if it doesn't work
> too well over suspend/resume (soix), meh.

It's possible that someone runs w/o the firmware, either because of not
having the latest version for a given kernel or on purpose disabling
DMC functionality (for example for debugging). We want to keep things
working in this case too. And it's also the right thing to do during
driver unload, we'd leak the RPM reference otherwise.

> And your patch here does make the suspend/resume code even more fragile
> and filled with special cases, so not too sure we should apply this one.

It's the same case for all DMC platforms.

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-20 13:13       ` Imre Deak
@ 2016-04-20 13:16         ` Daniel Vetter
  2016-04-20 13:35           ` Imre Deak
  0 siblings, 1 reply; 34+ messages in thread
From: Daniel Vetter @ 2016-04-20 13:16 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Wed, Apr 20, 2016 at 04:13:25PM +0300, Imre Deak wrote:
> On ke, 2016-04-20 at 15:02 +0200, Daniel Vetter wrote:
> > On Mon, Apr 18, 2016 at 02:48:21PM +0300, Imre Deak wrote:
> > > While we disable runtime PM and with that display power well support if
> > > the DMC firmware isn't loaded, we still want to disable power wells
> > > during system suspend and driver unload. So drop/reacquire the
> > > corresponding power refcount during suspend/resume and driver unloading.
> > > This also means we have to check if DMC is not loaded and skip enabling
> > > DC states in the power well code.
> > > 
> > > v2:
> > > - Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
> > >   opencoding the former. (Chris)
> > > - Add docbook comment to the public resume and suspend functions.
> > > 
> > > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > 
> > > ---
> > > 
> > > As this fixes the special case when firmware loading failed with the
> > > result of possibly leaving power wells enabled during suspend/resume and
> > > driver unloading it's not for -fixes or stable.
> > 
> > Do we even care about this to bother fixing it? We pretty much agreed that
> > we need dmc to make power management work on skl, so if it doesn't work
> > too well over suspend/resume (soix), meh.
> 
> It's possible that someone runs w/o the firmware, either because of not
> having the latest version for a given kernel or on purpose disabling
> DMC functionality (for example for debugging). We want to keep things
> working in this case too. And it's also the right thing to do during
> driver unload, we'd leak the RPM reference otherwise.

Ok, if this is needed anyway for driver unload I think it's ok. But
otherwise keeping stuff working without DMC is imo a fringe use case we
shouldn't bother about.

> > And your patch here does make the suspend/resume code even more fragile
> > and filled with special cases, so not too sure we should apply this one.
> 
> It's the same case for all DMC platforms.

I more meant that there's not yet another little bit we need to do at
exactly the right time in an already massive sequence of events. It
certainly doesn't make suspend/resume simpler ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-20 13:16         ` Daniel Vetter
@ 2016-04-20 13:35           ` Imre Deak
  2016-04-20 14:11             ` Daniel Vetter
  0 siblings, 1 reply; 34+ messages in thread
From: Imre Deak @ 2016-04-20 13:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On ke, 2016-04-20 at 15:16 +0200, Daniel Vetter wrote:
> On Wed, Apr 20, 2016 at 04:13:25PM +0300, Imre Deak wrote:
> > On ke, 2016-04-20 at 15:02 +0200, Daniel Vetter wrote:
> > > On Mon, Apr 18, 2016 at 02:48:21PM +0300, Imre Deak wrote:
> > > > While we disable runtime PM and with that display power well support if
> > > > the DMC firmware isn't loaded, we still want to disable power wells
> > > > during system suspend and driver unload. So drop/reacquire the
> > > > corresponding power refcount during suspend/resume and driver unloading.
> > > > This also means we have to check if DMC is not loaded and skip enabling
> > > > DC states in the power well code.
> > > > 
> > > > v2:
> > > > - Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
> > > >   opencoding the former. (Chris)
> > > > - Add docbook comment to the public resume and suspend functions.
> > > > 
> > > > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > 
> > > > ---
> > > > 
> > > > As this fixes the special case when firmware loading failed with the
> > > > result of possibly leaving power wells enabled during suspend/resume and
> > > > driver unloading it's not for -fixes or stable.
> > > 
> > > Do we even care about this to bother fixing it? We pretty much agreed that
> > > we need dmc to make power management work on skl, so if it doesn't work
> > > too well over suspend/resume (soix), meh.
> > 
> > It's possible that someone runs w/o the firmware, either because of not
> > having the latest version for a given kernel or on purpose disabling
> > DMC functionality (for example for debugging). We want to keep things
> > working in this case too. And it's also the right thing to do during
> > driver unload, we'd leak the RPM reference otherwise.
> 
> Ok, if this is needed anyway for driver unload I think it's ok. But
> otherwise keeping stuff working without DMC is imo a fringe use case we
> shouldn't bother about.

Well we could still not do the suspend/resume time put/get, but I'd
still argue for those see below.

> > > And your patch here does make the suspend/resume code even more fragile
> > > and filled with special cases, so not too sure we should apply this one.
> > 
> > It's the same case for all DMC platforms.
> 
> I more meant that there's not yet another little bit we need to do at
> exactly the right time in an already massive sequence of events. It
> certainly doesn't make suspend/resume simpler ;-)

Right agreed, and annoyed a lot by this fact. But we are already
talking about the exception case anyway where DMC isn't loaded so it's
not risking to regress the usual case. Also while the no DMC case is an
exception, it's still possible. And in that case, I'm not sure what
would happen if we left power wells on during system s/r. I suppose
BIOS would still do the right thing (and not hang the machine for
example), but I'm not sure. If we go with white listing the DMC version
then this becomes even more of an issue. Would it be possible to have
CI test both cases?

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded
  2016-04-20 13:35           ` Imre Deak
@ 2016-04-20 14:11             ` Daniel Vetter
  0 siblings, 0 replies; 34+ messages in thread
From: Daniel Vetter @ 2016-04-20 14:11 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Wed, Apr 20, 2016 at 04:35:59PM +0300, Imre Deak wrote:
> On ke, 2016-04-20 at 15:16 +0200, Daniel Vetter wrote:
> > On Wed, Apr 20, 2016 at 04:13:25PM +0300, Imre Deak wrote:
> > > On ke, 2016-04-20 at 15:02 +0200, Daniel Vetter wrote:
> > > > On Mon, Apr 18, 2016 at 02:48:21PM +0300, Imre Deak wrote:
> > > > > While we disable runtime PM and with that display power well support if
> > > > > the DMC firmware isn't loaded, we still want to disable power wells
> > > > > during system suspend and driver unload. So drop/reacquire the
> > > > > corresponding power refcount during suspend/resume and driver unloading.
> > > > > This also means we have to check if DMC is not loaded and skip enabling
> > > > > DC states in the power well code.
> > > > > 
> > > > > v2:
> > > > > - Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
> > > > >   opencoding the former. (Chris)
> > > > > - Add docbook comment to the public resume and suspend functions.
> > > > > 
> > > > > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > > > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > > > 
> > > > > ---
> > > > > 
> > > > > As this fixes the special case when firmware loading failed with the
> > > > > result of possibly leaving power wells enabled during suspend/resume and
> > > > > driver unloading it's not for -fixes or stable.
> > > > 
> > > > Do we even care about this to bother fixing it? We pretty much agreed that
> > > > we need dmc to make power management work on skl, so if it doesn't work
> > > > too well over suspend/resume (soix), meh.
> > > 
> > > It's possible that someone runs w/o the firmware, either because of not
> > > having the latest version for a given kernel or on purpose disabling
> > > DMC functionality (for example for debugging). We want to keep things
> > > working in this case too. And it's also the right thing to do during
> > > driver unload, we'd leak the RPM reference otherwise.
> > 
> > Ok, if this is needed anyway for driver unload I think it's ok. But
> > otherwise keeping stuff working without DMC is imo a fringe use case we
> > shouldn't bother about.
> 
> Well we could still not do the suspend/resume time put/get, but I'd
> still argue for those see below.
> 
> > > > And your patch here does make the suspend/resume code even more fragile
> > > > and filled with special cases, so not too sure we should apply this one.
> > > 
> > > It's the same case for all DMC platforms.
> > 
> > I more meant that there's not yet another little bit we need to do at
> > exactly the right time in an already massive sequence of events. It
> > certainly doesn't make suspend/resume simpler ;-)
> 
> Right agreed, and annoyed a lot by this fact. But we are already
> talking about the exception case anyway where DMC isn't loaded so it's
> not risking to regress the usual case. Also while the no DMC case is an
> exception, it's still possible. And in that case, I'm not sure what
> would happen if we left power wells on during system s/r. I suppose
> BIOS would still do the right thing (and not hang the machine for
> example), but I'm not sure. If we go with white listing the DMC version
> then this becomes even more of an issue. Would it be possible to have
> CI test both cases?

The combinatorial explosion in testing is why I don't want to support
everything. So no, I don't think CI should test with and without DMC.
That's also why I'm so much opposed to features merged into upstream, but
disabled by default.

Imo non-default cases should be best effort only, and we fix up if it's a
fireworks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2016-04-20 14:11 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-18  7:04 [PATCH 0/4] drm/i915: Fix a few suspend/resume and driver unload bugs Imre Deak
2016-04-18  7:04 ` [PATCH 1/4] drm/i915: Fix error path in i915_drm_resume_early Imre Deak
2016-04-18  8:00   ` Chris Wilson
2016-04-18  8:06     ` Imre Deak
2016-04-18  7:04 ` [PATCH 2/4] drm/i915: Fix system resume if PCI device remained enabled Imre Deak
2016-04-18  8:06   ` [Intel-gfx] " Chris Wilson
2016-04-18  8:16     ` Imre Deak
2016-04-18  8:24       ` [Intel-gfx] " Chris Wilson
2016-04-18  8:37         ` Imre Deak
2016-04-18  8:52           ` Chris Wilson
2016-04-18 11:05             ` Imre Deak
2016-04-18  8:28   ` Ville Syrjälä
2016-04-18  8:32     ` Imre Deak
2016-04-18  8:44       ` Ville Syrjälä
2016-04-18  8:54         ` Imre Deak
2016-04-18  9:04           ` Ville Syrjälä
2016-04-18 11:45   ` [PATCH v2 " Imre Deak
2016-04-18 14:59     ` Ville Syrjälä
2016-04-18  7:04 ` [PATCH 3/4] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Imre Deak
2016-04-18 11:05   ` Ville Syrjälä
2016-04-18  7:04 ` [PATCH 4/4] drm/i915/gen9: Fix runtime PM refcounting in case DMC firmware isn't loaded Imre Deak
2016-04-18  7:49   ` Chris Wilson
2016-04-18  7:59     ` Imre Deak
2016-04-18 11:48   ` [PATCH v2 " Imre Deak
2016-04-18 12:07     ` Chris Wilson
2016-04-20 13:02     ` Daniel Vetter
2016-04-20 13:13       ` Imre Deak
2016-04-20 13:16         ` Daniel Vetter
2016-04-20 13:35           ` Imre Deak
2016-04-20 14:11             ` Daniel Vetter
2016-04-18  7:24 ` ✓ Fi.CI.BAT: success for drm/i915: Fix a few suspend/resume and driver unload bugs Patchwork
2016-04-18 13:57 ` ✗ Fi.CI.BAT: failure for drm/i915: Fix a few suspend/resume and driver unload bugs (rev3) Patchwork
2016-04-18 15:45   ` Imre Deak
2016-04-19  9:42     ` Imre Deak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox