public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
@ 2016-12-05 16:27 Imre Deak
  2016-12-05 16:27 ` [PATCH v6 2/2] drm/i915/gen9: Fix PCODE polling during SAGV disabling Imre Deak
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Imre Deak @ 2016-12-05 16:27 UTC (permalink / raw)
  To: intel-gfx; +Cc: Art Runyan

commit 848496e5902833600f7992f4faa82dc1546051ba
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Wed Jul 13 16:32:03 2016 +0300

    drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL

increased the timeout to match the spec, but we still see a timeout on
at least one SKL. A CDCLK change request following the failed one will
succeed nevertheless.

I could reproduce this problem easily by running kms_pipe_crc_basic in a
loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
the worst case - when the pre-emption happened right after calculating
timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
once which failed and so _wait_for() timed out. As opposed to this the
spec says to keep retrying the request for at most a 3ms period.

To fix this send the first request explicitly to guarantee that there is
3ms between the first and last request. Though this matches the spec, I
noticed that in rare cases this can still time out if we sent only a few
requests (in the worst case 2) _and_ PCODE is busy for some reason even
after a previous request and a 3ms delay. To work around this retry the
polling with pre-emption disabled to maximize the number of requests.
Also increase the timeout to 10ms to account for interrupts that could
reduce the number of requests. With this change I couldn't trigger
the problem.

v2:
- Use 1ms poll period instead of 10us. (Chris)
v3:
- Poll with pre-emption disabled to increase the number of request
  attempts. (Ville, Chris)
- Factor out a helper to poll, it's also needed by the next patch.
v4:
- Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
  reply is generic. (Ville)
v5:
- List the request specific timeout values as code comment. (Ville)
v6:
- Try the poll first with preemption enabled.
- Add code comment about first request being queued by PCODE. (Art)
- Add timeout_base_ms argument. (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Art Runyan <arthur.j.runyan@intel.com>
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  2 +
 drivers/gpu/drm/i915/intel_display.c | 31 +++++----------
 drivers/gpu/drm/i915/intel_pm.c      | 75 ++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ca9786c..a2462bf 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3597,6 +3597,8 @@ extern void intel_display_print_error_state(struct drm_i915_error_state_buf *e,
 
 int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val);
 int sandybridge_pcode_write(struct drm_i915_private *dev_priv, u32 mbox, u32 val);
+int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
+		      u32 reply_mask, u32 reply, int timeout_base_ms);
 
 /* intel_sideband.c */
 u32 vlv_punit_read(struct drm_i915_private *dev_priv, u32 addr);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 1fafcce..4ef7392 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6280,35 +6280,24 @@ skl_dpll0_disable(struct drm_i915_private *dev_priv)
 	dev_priv->cdclk_pll.vco = 0;
 }
 
-static bool skl_cdclk_pcu_ready(struct drm_i915_private *dev_priv)
-{
-	int ret;
-	u32 val;
-
-	/* inform PCU we want to change CDCLK */
-	val = SKL_CDCLK_PREPARE_FOR_CHANGE;
-	mutex_lock(&dev_priv->rps.hw_lock);
-	ret = sandybridge_pcode_read(dev_priv, SKL_PCODE_CDCLK_CONTROL, &val);
-	mutex_unlock(&dev_priv->rps.hw_lock);
-
-	return ret == 0 && (val & SKL_CDCLK_READY_FOR_CHANGE);
-}
-
-static bool skl_cdclk_wait_for_pcu_ready(struct drm_i915_private *dev_priv)
-{
-	return _wait_for(skl_cdclk_pcu_ready(dev_priv), 3000, 10) == 0;
-}
-
 static void skl_set_cdclk(struct drm_i915_private *dev_priv, int cdclk, int vco)
 {
 	u32 freq_select, pcu_ack;
+	int ret;
 
 	WARN_ON((cdclk == 24000) != (vco == 0));
 
 	DRM_DEBUG_DRIVER("Changing CDCLK to %d kHz (VCO %d kHz)\n", cdclk, vco);
 
-	if (!skl_cdclk_wait_for_pcu_ready(dev_priv)) {
-		DRM_ERROR("failed to inform PCU about cdclk change\n");
+	mutex_lock(&dev_priv->rps.hw_lock);
+	ret = skl_pcode_request(dev_priv, SKL_PCODE_CDCLK_CONTROL,
+				SKL_CDCLK_PREPARE_FOR_CHANGE,
+				SKL_CDCLK_READY_FOR_CHANGE,
+				SKL_CDCLK_READY_FOR_CHANGE, 3);
+	mutex_unlock(&dev_priv->rps.hw_lock);
+	if (ret) {
+		DRM_ERROR("Failed to inform PCU about cdclk change (%d)\n",
+			  ret);
 		return;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 59a88de..6c2fa34 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7864,6 +7864,81 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
 	return 0;
 }
 
+static bool skl_pcode_try_request(struct drm_i915_private *dev_priv, u32 mbox,
+				  u32 request, u32 reply_mask, u32 reply,
+				  u32 *status)
+{
+	u32 val = request;
+
+	*status = sandybridge_pcode_read(dev_priv, mbox, &val);
+
+	return *status || ((val & reply_mask) == reply);
+}
+
+/**
+ * skl_pcode_request - send PCODE request until acknowledgment
+ * @dev_priv: device private
+ * @mbox: PCODE mailbox ID the request is targeted for
+ * @request: request ID
+ * @reply_mask: mask used to check for request acknowledgment
+ * @reply: value used to check for request acknowledgement
+ * @timeout_base_ms: timeout for polling with preemption enabled
+ *
+ * Keep resending the @request to @mbox until PCODE acknowledges it, PCODE
+ * reports an error or an overall timeout of @timeout_base_ms+10 ms expires.
+ * The request is acknowledged once the PCODE reply dword equals @reply after
+ * applying @reply_mask. Polling is first attempted with preemption enabled
+ * for @timeout_base_ms and if this times out for another 10 ms with
+ * preemption disabled.
+ *
+ * Returns 0 on success, %-ETIMEDOUT in case of a timeout, <0 in case of some
+ * other error as reported by PCODE.
+ */
+int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
+		      u32 reply_mask, u32 reply, int timeout_base_ms)
+{
+	u32 status;
+	int ret;
+
+	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
+
+#define COND skl_pcode_try_request(dev_priv, mbox, request, reply_mask, reply, \
+				   &status)
+
+	/*
+	 * The first request is queued by PCODE, which normally guarantees
+	 * that a subsequent request at most @timeout_base_ms later succeeds.
+	 * _wait_for() doesn't guarantee when its passed condition is evaluated
+	 * first, so send the first request explicitly.
+	 */
+	if (COND) {
+		ret = 0;
+		goto out;
+	}
+	ret = _wait_for(COND, timeout_base_ms * 1000, 10);
+	if (!ret)
+		goto out;
+
+	/*
+	 * The above can time out if the number of requests was low (2 in the
+	 * worst case) _and_ PCODE was busy for some reason even after a
+	 * (queued) request and @timeout_base_ms delay. As a workaround retry
+	 * the poll with preemption disabled to maximize the number of
+	 * requests. Increase the timeout from @timeout_base_ms to 10ms to
+	 * account for interrupts that could reduce the number of these
+	 * requests.
+	 */
+	DRM_DEBUG_KMS("PCODE timeout, retrying with preemption disabled\n");
+	WARN_ON_ONCE(timeout_base_ms > 3);
+	preempt_disable();
+	ret = wait_for_atomic(COND, 10);
+	preempt_enable();
+
+out:
+	return ret ? ret : status;
+#undef COND
+}
+
 static int byt_gpu_freq(struct drm_i915_private *dev_priv, int val)
 {
 	/*
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v6 2/2] drm/i915/gen9: Fix PCODE polling during SAGV disabling
  2016-12-05 16:27 [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Imre Deak
@ 2016-12-05 16:27 ` Imre Deak
  2016-12-05 18:59 ` ✓ Fi.CI.BAT: success for series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Patchwork
  2016-12-08 14:02 ` [PATCH v6 1/2] " Chris Wilson
  2 siblings, 0 replies; 7+ messages in thread
From: Imre Deak @ 2016-12-05 16:27 UTC (permalink / raw)
  To: intel-gfx

According to the previous patch, it's possible atm that we call
intel_do_sagv_disable() only once during the 1ms period and time out if
that call fails. As opposed to this the spec says that we need to keep
retrying this request for a 1ms duration, so let's do this similarly to
the CDCLK change notification request.

v4-5:
- Rebased on the reply_mask, reply change.
v6:
- Remove w/s change. (Lyude)
- Rebased on the timeout_base argument change.

Cc: Lyude <cpaul@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lyude <lyude@redhat.com> (v4)
---
 drivers/gpu/drm/i915/intel_pm.c | 34 +++++++++-------------------------
 1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 6c2fa34..e7c4891 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2954,24 +2954,10 @@ intel_enable_sagv(struct drm_i915_private *dev_priv)
 	return 0;
 }
 
-static int
-intel_do_sagv_disable(struct drm_i915_private *dev_priv)
-{
-	int ret;
-	uint32_t temp = GEN9_SAGV_DISABLE;
-
-	ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL,
-				     &temp);
-	if (ret)
-		return ret;
-	else
-		return temp & GEN9_SAGV_IS_DISABLED;
-}
-
 int
 intel_disable_sagv(struct drm_i915_private *dev_priv)
 {
-	int ret, result;
+	int ret;
 
 	if (!intel_has_sagv(dev_priv))
 		return 0;
@@ -2983,25 +2969,23 @@ intel_disable_sagv(struct drm_i915_private *dev_priv)
 	mutex_lock(&dev_priv->rps.hw_lock);
 
 	/* bspec says to keep retrying for at least 1 ms */
-	ret = wait_for(result = intel_do_sagv_disable(dev_priv), 1);
+	ret = skl_pcode_request(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+				GEN9_SAGV_DISABLE,
+				GEN9_SAGV_IS_DISABLED, GEN9_SAGV_IS_DISABLED,
+				1);
 	mutex_unlock(&dev_priv->rps.hw_lock);
 
-	if (ret == -ETIMEDOUT) {
-		DRM_ERROR("Request to disable SAGV timed out\n");
-		return -ETIMEDOUT;
-	}
-
 	/*
 	 * Some skl systems, pre-release machines in particular,
 	 * don't actually have an SAGV.
 	 */
-	if (IS_SKYLAKE(dev_priv) && result == -ENXIO) {
+	if (IS_SKYLAKE(dev_priv) && ret == -ENXIO) {
 		DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
 		dev_priv->sagv_status = I915_SAGV_NOT_CONTROLLED;
 		return 0;
-	} else if (result < 0) {
-		DRM_ERROR("Failed to disable the SAGV\n");
-		return result;
+	} else if (ret < 0) {
+		DRM_ERROR("Failed to disable the SAGV (%d)\n", ret);
+		return ret;
 	}
 
 	dev_priv->sagv_status = I915_SAGV_DISABLED;
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
  2016-12-05 16:27 [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Imre Deak
  2016-12-05 16:27 ` [PATCH v6 2/2] drm/i915/gen9: Fix PCODE polling during SAGV disabling Imre Deak
@ 2016-12-05 18:59 ` Patchwork
  2016-12-08 20:57   ` Imre Deak
  2016-12-08 14:02 ` [PATCH v6 1/2] " Chris Wilson
  2 siblings, 1 reply; 7+ messages in thread
From: Patchwork @ 2016-12-05 18:59 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
URL   : https://patchwork.freedesktop.org/series/16375/
State : success

== Summary ==

Series 16375v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/16375/revisions/1/mbox/

Test gem_busy:
        Subgroup basic-hang-default:
                fail       -> PASS       (fi-hsw-4770r)

fi-bdw-5557u     total:247  pass:232  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050     total:247  pass:207  dwarn:0   dfail:0   fail:0   skip:40 
fi-bxt-t5700     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
fi-byt-j1900     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
fi-byt-n2820     total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 
fi-hsw-4770      total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
fi-hsw-4770r     total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
fi-ilk-650       total:247  pass:194  dwarn:0   dfail:0   fail:0   skip:53 
fi-ivb-3520m     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-ivb-3770      total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-7500u     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-skl-6260u     total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq    total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
fi-snb-2520m     total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 
fi-snb-2600      total:247  pass:214  dwarn:0   dfail:0   fail:0   skip:33 

a947529bb652fdcf50e8c0e8ee5102b737bae23e drm-tip: 2016y-12m-05d-14h-27m-06s UTC integration manifest
25d5a8b drm/i915/gen9: Fix PCODE polling during SAGV disabling
9357396 drm/i915/gen9: Fix PCODE polling during CDCLK change notification

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3197/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
  2016-12-05 16:27 [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Imre Deak
  2016-12-05 16:27 ` [PATCH v6 2/2] drm/i915/gen9: Fix PCODE polling during SAGV disabling Imre Deak
  2016-12-05 18:59 ` ✓ Fi.CI.BAT: success for series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Patchwork
@ 2016-12-08 14:02 ` Chris Wilson
  2016-12-08 14:18   ` Imre Deak
  2 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2016-12-08 14:02 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, Art Runyan

On Mon, Dec 05, 2016 at 06:27:37PM +0200, Imre Deak wrote:
> commit 848496e5902833600f7992f4faa82dc1546051ba
> Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Date:   Wed Jul 13 16:32:03 2016 +0300
> 
>     drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
> 
> increased the timeout to match the spec, but we still see a timeout on
> at least one SKL. A CDCLK change request following the failed one will
> succeed nevertheless.
> 
> I could reproduce this problem easily by running kms_pipe_crc_basic in a
> loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
> the worst case - when the pre-emption happened right after calculating
> timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
> once which failed and so _wait_for() timed out. As opposed to this the
> spec says to keep retrying the request for at most a 3ms period.
> 
> To fix this send the first request explicitly to guarantee that there is
> 3ms between the first and last request. Though this matches the spec, I
> noticed that in rare cases this can still time out if we sent only a few
> requests (in the worst case 2) _and_ PCODE is busy for some reason even
> after a previous request and a 3ms delay. To work around this retry the
> polling with pre-emption disabled to maximize the number of requests.
> Also increase the timeout to 10ms to account for interrupts that could
> reduce the number of requests. With this change I couldn't trigger
> the problem.
> 
> v2:
> - Use 1ms poll period instead of 10us. (Chris)
> v3:
> - Poll with pre-emption disabled to increase the number of request
>   attempts. (Ville, Chris)
> - Factor out a helper to poll, it's also needed by the next patch.
> v4:
> - Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
>   reply is generic. (Ville)
> v5:
> - List the request specific timeout values as code comment. (Ville)
> v6:
> - Try the poll first with preemption enabled.
> - Add code comment about first request being queued by PCODE. (Art)
> - Add timeout_base_ms argument. (Ville)
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Art Runyan <arthur.j.runyan@intel.com>
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
> Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
> Signed-off-by: Imre Deak <imre.deak@intel.com>

The only issue I have is that buried within snb_pcode_read() is another
wait_for, now in an atomic section and we have been trying to erradicate
those. It should be happy enough, just a pita to fix later!

Other than that and a minor nit, it looks like a reasonable compromise.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

>  drivers/gpu/drm/i915/i915_drv.h      |  2 +
>  drivers/gpu/drm/i915/intel_display.c | 31 +++++----------
>  drivers/gpu/drm/i915/intel_pm.c      | 75 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 87 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index ca9786c..a2462bf 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3597,6 +3597,8 @@ extern void intel_display_print_error_state(struct drm_i915_error_state_buf *e,
>  
>  int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val);
>  int sandybridge_pcode_write(struct drm_i915_private *dev_priv, u32 mbox, u32 val);
> +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> +		      u32 reply_mask, u32 reply, int timeout_base_ms);
>  
>  /* intel_sideband.c */
>  u32 vlv_punit_read(struct drm_i915_private *dev_priv, u32 addr);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 1fafcce..4ef7392 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -6280,35 +6280,24 @@ skl_dpll0_disable(struct drm_i915_private *dev_priv)
>  	dev_priv->cdclk_pll.vco = 0;
>  }
>  
> -static bool skl_cdclk_pcu_ready(struct drm_i915_private *dev_priv)
> -{
> -	int ret;
> -	u32 val;
> -
> -	/* inform PCU we want to change CDCLK */
> -	val = SKL_CDCLK_PREPARE_FOR_CHANGE;
> -	mutex_lock(&dev_priv->rps.hw_lock);
> -	ret = sandybridge_pcode_read(dev_priv, SKL_PCODE_CDCLK_CONTROL, &val);
> -	mutex_unlock(&dev_priv->rps.hw_lock);
> -
> -	return ret == 0 && (val & SKL_CDCLK_READY_FOR_CHANGE);
> -}
> -
> -static bool skl_cdclk_wait_for_pcu_ready(struct drm_i915_private *dev_priv)
> -{
> -	return _wait_for(skl_cdclk_pcu_ready(dev_priv), 3000, 10) == 0;
> -}
> -
>  static void skl_set_cdclk(struct drm_i915_private *dev_priv, int cdclk, int vco)
>  {
>  	u32 freq_select, pcu_ack;
> +	int ret;
>  
>  	WARN_ON((cdclk == 24000) != (vco == 0));
>  
>  	DRM_DEBUG_DRIVER("Changing CDCLK to %d kHz (VCO %d kHz)\n", cdclk, vco);
>  
> -	if (!skl_cdclk_wait_for_pcu_ready(dev_priv)) {
> -		DRM_ERROR("failed to inform PCU about cdclk change\n");
> +	mutex_lock(&dev_priv->rps.hw_lock);
> +	ret = skl_pcode_request(dev_priv, SKL_PCODE_CDCLK_CONTROL,
> +				SKL_CDCLK_PREPARE_FOR_CHANGE,
> +				SKL_CDCLK_READY_FOR_CHANGE,
> +				SKL_CDCLK_READY_FOR_CHANGE, 3);
> +	mutex_unlock(&dev_priv->rps.hw_lock);
> +	if (ret) {
> +		DRM_ERROR("Failed to inform PCU about cdclk change (%d)\n",
> +			  ret);
>  		return;
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 59a88de..6c2fa34 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -7864,6 +7864,81 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
>  	return 0;
>  }
>  
> +static bool skl_pcode_try_request(struct drm_i915_private *dev_priv, u32 mbox,
> +				  u32 request, u32 reply_mask, u32 reply,
> +				  u32 *status)
> +{
> +	u32 val = request;
> +
> +	*status = sandybridge_pcode_read(dev_priv, mbox, &val);
> +
> +	return *status || ((val & reply_mask) == reply);
> +}
> +
> +/**
> + * skl_pcode_request - send PCODE request until acknowledgment
> + * @dev_priv: device private
> + * @mbox: PCODE mailbox ID the request is targeted for
> + * @request: request ID
> + * @reply_mask: mask used to check for request acknowledgment
> + * @reply: value used to check for request acknowledgement
> + * @timeout_base_ms: timeout for polling with preemption enabled
> + *
> + * Keep resending the @request to @mbox until PCODE acknowledges it, PCODE
> + * reports an error or an overall timeout of @timeout_base_ms+10 ms expires.
> + * The request is acknowledged once the PCODE reply dword equals @reply after
> + * applying @reply_mask. Polling is first attempted with preemption enabled
> + * for @timeout_base_ms and if this times out for another 10 ms with
> + * preemption disabled.
> + *
> + * Returns 0 on success, %-ETIMEDOUT in case of a timeout, <0 in case of some
> + * other error as reported by PCODE.
> + */
> +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> +		      u32 reply_mask, u32 reply, int timeout_base_ms)
> +{
> +	u32 status;
> +	int ret;
> +
> +	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
> +
> +#define COND skl_pcode_try_request(dev_priv, mbox, request, reply_mask, reply, \
> +				   &status)
> +
> +	/*
> +	 * The first request is queued by PCODE, which normally guarantees
> +	 * that a subsequent request at most @timeout_base_ms later succeeds.
> +	 * _wait_for() doesn't guarantee when its passed condition is evaluated
> +	 * first, so send the first request explicitly.
> +	 */

Scratching my head here.

/* Prime the PCODE by doing a request first. Normally it guarantees that
 * a subsequent request, at most @timeout_base_ms later, succeeds.
 */

As what I think you mean is that given preemption wait_for() doesn't
guarantee that at least two CONDs are executed. Which is reasonable, so
we want to write the code to look for a reply within timeout.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
  2016-12-08 14:02 ` [PATCH v6 1/2] " Chris Wilson
@ 2016-12-08 14:18   ` Imre Deak
  2016-12-08 14:34     ` Chris Wilson
  0 siblings, 1 reply; 7+ messages in thread
From: Imre Deak @ 2016-12-08 14:18 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Art Runyan

On to, 2016-12-08 at 14:02 +0000, Chris Wilson wrote:
> On Mon, Dec 05, 2016 at 06:27:37PM +0200, Imre Deak wrote:
> > commit 848496e5902833600f7992f4faa82dc1546051ba
> > Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Date:   Wed Jul 13 16:32:03 2016 +0300
> > 
> >     drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
> > 
> > increased the timeout to match the spec, but we still see a timeout on
> > at least one SKL. A CDCLK change request following the failed one will
> > succeed nevertheless.
> > 
> > I could reproduce this problem easily by running kms_pipe_crc_basic in a
> > loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
> > the worst case - when the pre-emption happened right after calculating
> > timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
> > once which failed and so _wait_for() timed out. As opposed to this the
> > spec says to keep retrying the request for at most a 3ms period.
> > 
> > To fix this send the first request explicitly to guarantee that there is
> > 3ms between the first and last request. Though this matches the spec, I
> > noticed that in rare cases this can still time out if we sent only a few
> > requests (in the worst case 2) _and_ PCODE is busy for some reason even
> > after a previous request and a 3ms delay. To work around this retry the
> > polling with pre-emption disabled to maximize the number of requests.
> > Also increase the timeout to 10ms to account for interrupts that could
> > reduce the number of requests. With this change I couldn't trigger
> > the problem.
> > 
> > v2:
> > - Use 1ms poll period instead of 10us. (Chris)
> > v3:
> > - Poll with pre-emption disabled to increase the number of request
> >   attempts. (Ville, Chris)
> > - Factor out a helper to poll, it's also needed by the next patch.
> > v4:
> > - Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
> >   reply is generic. (Ville)
> > v5:
> > - List the request specific timeout values as code comment. (Ville)
> > v6:
> > - Try the poll first with preemption enabled.
> > - Add code comment about first request being queued by PCODE. (Art)
> > - Add timeout_base_ms argument. (Ville)
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Art Runyan <arthur.j.runyan@intel.com>
> > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
> > Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> 
> The only issue I have is that buried within snb_pcode_read() is another
> wait_for, now in an atomic section and we have been trying to erradicate
> those. It should be happy enough, just a pita to fix later!

Yea, agreed with the rational for that. I can volunteer to refactor
this part as a follow-up, although passing a flag to snb_pcode_read()
would be still the clearest to me.

Btw, this is also for -stable imo.

> Other than that and a minor nit, it looks like a reasonable compromise.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> >  drivers/gpu/drm/i915/i915_drv.h      |  2 +
> >  drivers/gpu/drm/i915/intel_display.c | 31 +++++----------
> >  drivers/gpu/drm/i915/intel_pm.c      | 75 ++++++++++++++++++++++++++++++++++++
> >  3 files changed, 87 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index ca9786c..a2462bf 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -3597,6 +3597,8 @@ extern void intel_display_print_error_state(struct drm_i915_error_state_buf *e,
> >  
> >  int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val);
> >  int sandybridge_pcode_write(struct drm_i915_private *dev_priv, u32 mbox, u32 val);
> > +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> > +		      u32 reply_mask, u32 reply, int timeout_base_ms);
> >  
> >  /* intel_sideband.c */
> >  u32 vlv_punit_read(struct drm_i915_private *dev_priv, u32 addr);
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 1fafcce..4ef7392 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -6280,35 +6280,24 @@ skl_dpll0_disable(struct drm_i915_private *dev_priv)
> >  	dev_priv->cdclk_pll.vco = 0;
> >  }
> >  
> > -static bool skl_cdclk_pcu_ready(struct drm_i915_private *dev_priv)
> > -{
> > -	int ret;
> > -	u32 val;
> > -
> > -	/* inform PCU we want to change CDCLK */
> > -	val = SKL_CDCLK_PREPARE_FOR_CHANGE;
> > -	mutex_lock(&dev_priv->rps.hw_lock);
> > -	ret = sandybridge_pcode_read(dev_priv, SKL_PCODE_CDCLK_CONTROL, &val);
> > -	mutex_unlock(&dev_priv->rps.hw_lock);
> > -
> > -	return ret == 0 && (val & SKL_CDCLK_READY_FOR_CHANGE);
> > -}
> > -
> > -static bool skl_cdclk_wait_for_pcu_ready(struct drm_i915_private *dev_priv)
> > -{
> > -	return _wait_for(skl_cdclk_pcu_ready(dev_priv), 3000, 10) == 0;
> > -}
> > -
> >  static void skl_set_cdclk(struct drm_i915_private *dev_priv, int cdclk, int vco)
> >  {
> >  	u32 freq_select, pcu_ack;
> > +	int ret;
> >  
> >  	WARN_ON((cdclk == 24000) != (vco == 0));
> >  
> >  	DRM_DEBUG_DRIVER("Changing CDCLK to %d kHz (VCO %d kHz)\n", cdclk, vco);
> >  
> > -	if (!skl_cdclk_wait_for_pcu_ready(dev_priv)) {
> > -		DRM_ERROR("failed to inform PCU about cdclk change\n");
> > +	mutex_lock(&dev_priv->rps.hw_lock);
> > +	ret = skl_pcode_request(dev_priv, SKL_PCODE_CDCLK_CONTROL,
> > +				SKL_CDCLK_PREPARE_FOR_CHANGE,
> > +				SKL_CDCLK_READY_FOR_CHANGE,
> > +				SKL_CDCLK_READY_FOR_CHANGE, 3);
> > +	mutex_unlock(&dev_priv->rps.hw_lock);
> > +	if (ret) {
> > +		DRM_ERROR("Failed to inform PCU about cdclk change (%d)\n",
> > +			  ret);
> >  		return;
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 59a88de..6c2fa34 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -7864,6 +7864,81 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
> >  	return 0;
> >  }
> >  
> > +static bool skl_pcode_try_request(struct drm_i915_private *dev_priv, u32 mbox,
> > +				  u32 request, u32 reply_mask, u32 reply,
> > +				  u32 *status)
> > +{
> > +	u32 val = request;
> > +
> > +	*status = sandybridge_pcode_read(dev_priv, mbox, &val);
> > +
> > +	return *status || ((val & reply_mask) == reply);
> > +}
> > +
> > +/**
> > + * skl_pcode_request - send PCODE request until acknowledgment
> > + * @dev_priv: device private
> > + * @mbox: PCODE mailbox ID the request is targeted for
> > + * @request: request ID
> > + * @reply_mask: mask used to check for request acknowledgment
> > + * @reply: value used to check for request acknowledgement
> > + * @timeout_base_ms: timeout for polling with preemption enabled
> > + *
> > + * Keep resending the @request to @mbox until PCODE acknowledges it, PCODE
> > + * reports an error or an overall timeout of @timeout_base_ms+10 ms expires.
> > + * The request is acknowledged once the PCODE reply dword equals @reply after
> > + * applying @reply_mask. Polling is first attempted with preemption enabled
> > + * for @timeout_base_ms and if this times out for another 10 ms with
> > + * preemption disabled.
> > + *
> > + * Returns 0 on success, %-ETIMEDOUT in case of a timeout, <0 in case of some
> > + * other error as reported by PCODE.
> > + */
> > +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> > +		      u32 reply_mask, u32 reply, int timeout_base_ms)
> > +{
> > +	u32 status;
> > +	int ret;
> > +
> > +	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
> > +
> > +#define COND skl_pcode_try_request(dev_priv, mbox, request, reply_mask, reply, \
> > +				   &status)
> > +
> > +	/*
> > +	 * The first request is queued by PCODE, which normally guarantees
> > +	 * that a subsequent request at most @timeout_base_ms later succeeds.
> > +	 * _wait_for() doesn't guarantee when its passed condition is evaluated
> > +	 * first, so send the first request explicitly.
> > +	 */
> 
> Scratching my head here.
> 
> /* Prime the PCODE by doing a request first. Normally it guarantees that
>  * a subsequent request, at most @timeout_base_ms later, succeeds.
>  */

Right, this is what I meant (explained to me by Art). I can use the
above instead.

> As what I think you mean is that given preemption wait_for() doesn't
> guarantee that at least two CONDs are executed. Which is reasonable, so
> we want to write the code to look for a reply within timeout.

Yep. Btw, I also pondered if we could just make this part of
wait_for(), but not sure if we want the corresponding code increase
(and for -stable we'd want a minimal diff). It's not required in other
cases, although it could speed up the wait in some cases. AFAIR Ville
did some measurements on this.

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
  2016-12-08 14:18   ` Imre Deak
@ 2016-12-08 14:34     ` Chris Wilson
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2016-12-08 14:34 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, Art Runyan

On Thu, Dec 08, 2016 at 04:18:50PM +0200, Imre Deak wrote:
> Yep. Btw, I also pondered if we could just make this part of
> wait_for(), but not sure if we want the corresponding code increase
> (and for -stable we'd want a minimal diff). It's not required in other
> cases, although it could speed up the wait in some cases. AFAIR Ville
> did some measurements on this.

The majority use intel_wait_for_register() which includes the spin then
sleep (i.e. a double wait_for()) as suggested by Ville.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ✓ Fi.CI.BAT: success for series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
  2016-12-05 18:59 ` ✓ Fi.CI.BAT: success for series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Patchwork
@ 2016-12-08 20:57   ` Imre Deak
  0 siblings, 0 replies; 7+ messages in thread
From: Imre Deak @ 2016-12-08 20:57 UTC (permalink / raw)
  To: intel-gfx, Ville Syrjälä, Chris Wilson, Lyude; +Cc: Jani Nikula

On Mon, 2016-12-05 at 18:59 +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
> URL   : https://patchwork.freedesktop.org/series/16375/
> State : success
> 
> == Summary ==
> 
> Series 16375v1 Series without cover letter
> https://patchwork.freedesktop.org/api/1.0/series/16375/revisions/1/mbox/

I pushed the patches to -dinq with Chris' code comment change, thanks
for the reviews.

I also added

Cc: <stable@vger.kernel.org> # v4.2- : 3b2c171 : drm/i915: Wait up to 3ms
Cc: <stable@vger.kernel.org> # v4.2-
Fixes: 5d96d8afcfbb ("drm/i915/skl: Deinit/init the display at suspend/resume")

to the first and

Fixes: 656d1b89e5ff ("drm/i915/skl: Add support for the SAGV, fix underrun hangs")

to the second patch.

--Imre

> Test gem_busy:
>         Subgroup basic-hang-default:
>                 fail       -> PASS       (fi-hsw-4770r)
> 
> fi-bdw-5557u     total:247  pass:232  dwarn:0   dfail:0   fail:0   skip:15 
> fi-bsw-n3050     total:247  pass:207  dwarn:0   dfail:0   fail:0   skip:40 
> fi-bxt-t5700     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
> fi-byt-j1900     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
> fi-byt-n2820     total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 
> fi-hsw-4770      total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
> fi-hsw-4770r     total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
> fi-ilk-650       total:247  pass:194  dwarn:0   dfail:0   fail:0   skip:53 
> fi-ivb-3520m     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
> fi-ivb-3770      total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
> fi-kbl-7500u     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
> fi-skl-6260u     total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
> fi-skl-6700hq    total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
> fi-skl-6770hq    total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
> fi-snb-2520m     total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 
> fi-snb-2600      total:247  pass:214  dwarn:0   dfail:0   fail:0   skip:33 
> 
> a947529bb652fdcf50e8c0e8ee5102b737bae23e drm-tip: 2016y-12m-05d-14h-27m-06s UTC integration manifest
> 25d5a8b drm/i915/gen9: Fix PCODE polling during SAGV disabling
> 9357396 drm/i915/gen9: Fix PCODE polling during CDCLK change notification
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3197/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-12-08 20:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-05 16:27 [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Imre Deak
2016-12-05 16:27 ` [PATCH v6 2/2] drm/i915/gen9: Fix PCODE polling during SAGV disabling Imre Deak
2016-12-05 18:59 ` ✓ Fi.CI.BAT: success for series starting with [v6,1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification Patchwork
2016-12-08 20:57   ` Imre Deak
2016-12-08 14:02 ` [PATCH v6 1/2] " Chris Wilson
2016-12-08 14:18   ` Imre Deak
2016-12-08 14:34     ` Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox