Re: [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification

From: Imre Deak <imre.deak@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, Art Runyan <arthur.j.runyan@intel.com>
Subject: Re: [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
Date: Thu, 08 Dec 2016 16:18:50 +0200
Message-ID: <1481206730.17555.16.camel@intel.com>
In-Reply-To: <20161208140252.GP4815@nuc-i3427.alporthouse.com>

On Thu, 2016-12-08 at 14:02 +0000, Chris Wilson wrote:
> On Mon, Dec 05, 2016 at 06:27:37PM +0200, Imre Deak wrote:
> > commit 848496e5902833600f7992f4faa82dc1546051ba
> > Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Date:   Wed Jul 13 16:32:03 2016 +0300
> > 
> >     drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
> > 
> > increased the timeout to match the spec, but we still see a timeout on
> > at least one SKL. A CDCLK change request following the failed one will
> > succeed nevertheless.
> > 
> > I could reproduce this problem easily by running kms_pipe_crc_basic in a
> > loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
> > the worst case - when the pre-emption happened right after calculating
> > timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
> > once which failed and so _wait_for() timed out. As opposed to this the
> > spec says to keep retrying the request for at most a 3ms period.
> > 
> > To fix this send the first request explicitly to guarantee that there is
> > 3ms between the first and last request. Though this matches the spec, I
> > noticed that in rare cases this can still time out if we sent only a few
> > requests (in the worst case 2) _and_ PCODE is busy for some reason even
> > after a previous request and a 3ms delay. To work around this retry the
> > polling with pre-emption disabled to maximize the number of requests.
> > Also increase the timeout to 10ms to account for interrupts that could
> > reduce the number of requests. With this change I couldn't trigger
> > the problem.
> > 
> > v2:
> > - Use 1ms poll period instead of 10us. (Chris)
> > v3:
> > - Poll with pre-emption disabled to increase the number of request
> >   attempts. (Ville, Chris)
> > - Factor out a helper to poll, it's also needed by the next patch.
> > v4:
> > - Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
> >   reply is generic. (Ville)
> > v5:
> > - List the request specific timeout values as code comment. (Ville)
> > v6:
> > - Try the poll first with preemption enabled.
> > - Add code comment about first request being queued by PCODE. (Art)
> > - Add timeout_base_ms argument. (Ville)
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Art Runyan <arthur.j.runyan@intel.com>
> > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
> > Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> 
> The only issue I have is that buried within snb_pcode_read() is another
> wait_for, now in an atomic section and we have been trying to eradicate
> those. It should be happy enough, just a pita to fix later!

Yeah, agreed with the rationale for that. I can volunteer to refactor
this part as a follow-up, although passing a flag to snb_pcode_read()
would still be the clearest to me.

Btw, this is also for -stable imo.

> Other than that and a minor nit, it looks like a reasonable compromise.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> >  drivers/gpu/drm/i915/i915_drv.h      |  2 +
> >  drivers/gpu/drm/i915/intel_display.c | 31 +++++----------
> >  drivers/gpu/drm/i915/intel_pm.c      | 75 ++++++++++++++++++++++++++++++++++++
> >  3 files changed, 87 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index ca9786c..a2462bf 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -3597,6 +3597,8 @@ extern void intel_display_print_error_state(struct drm_i915_error_state_buf *e,
> >  
> >  int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val);
> >  int sandybridge_pcode_write(struct drm_i915_private *dev_priv, u32 mbox, u32 val);
> > +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> > +		      u32 reply_mask, u32 reply, int timeout_base_ms);
> >  
> >  /* intel_sideband.c */
> >  u32 vlv_punit_read(struct drm_i915_private *dev_priv, u32 addr);
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 1fafcce..4ef7392 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -6280,35 +6280,24 @@ skl_dpll0_disable(struct drm_i915_private *dev_priv)
> >  	dev_priv->cdclk_pll.vco = 0;
> >  }
> >  
> > -static bool skl_cdclk_pcu_ready(struct drm_i915_private *dev_priv)
> > -{
> > -	int ret;
> > -	u32 val;
> > -
> > -	/* inform PCU we want to change CDCLK */
> > -	val = SKL_CDCLK_PREPARE_FOR_CHANGE;
> > -	mutex_lock(&dev_priv->rps.hw_lock);
> > -	ret = sandybridge_pcode_read(dev_priv, SKL_PCODE_CDCLK_CONTROL, &val);
> > -	mutex_unlock(&dev_priv->rps.hw_lock);
> > -
> > -	return ret == 0 && (val & SKL_CDCLK_READY_FOR_CHANGE);
> > -}
> > -
> > -static bool skl_cdclk_wait_for_pcu_ready(struct drm_i915_private *dev_priv)
> > -{
> > -	return _wait_for(skl_cdclk_pcu_ready(dev_priv), 3000, 10) == 0;
> > -}
> > -
> >  static void skl_set_cdclk(struct drm_i915_private *dev_priv, int cdclk, int vco)
> >  {
> >  	u32 freq_select, pcu_ack;
> > +	int ret;
> >  
> >  	WARN_ON((cdclk == 24000) != (vco == 0));
> >  
> >  	DRM_DEBUG_DRIVER("Changing CDCLK to %d kHz (VCO %d kHz)\n", cdclk, vco);
> >  
> > -	if (!skl_cdclk_wait_for_pcu_ready(dev_priv)) {
> > -		DRM_ERROR("failed to inform PCU about cdclk change\n");
> > +	mutex_lock(&dev_priv->rps.hw_lock);
> > +	ret = skl_pcode_request(dev_priv, SKL_PCODE_CDCLK_CONTROL,
> > +				SKL_CDCLK_PREPARE_FOR_CHANGE,
> > +				SKL_CDCLK_READY_FOR_CHANGE,
> > +				SKL_CDCLK_READY_FOR_CHANGE, 3);
> > +	mutex_unlock(&dev_priv->rps.hw_lock);
> > +	if (ret) {
> > +		DRM_ERROR("Failed to inform PCU about cdclk change (%d)\n",
> > +			  ret);
> >  		return;
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 59a88de..6c2fa34 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -7864,6 +7864,81 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
> >  	return 0;
> >  }
> >  
> > +static bool skl_pcode_try_request(struct drm_i915_private *dev_priv, u32 mbox,
> > +				  u32 request, u32 reply_mask, u32 reply,
> > +				  u32 *status)
> > +{
> > +	u32 val = request;
> > +
> > +	*status = sandybridge_pcode_read(dev_priv, mbox, &val);
> > +
> > +	return *status || ((val & reply_mask) == reply);
> > +}
> > +
> > +/**
> > + * skl_pcode_request - send PCODE request until acknowledgment
> > + * @dev_priv: device private
> > + * @mbox: PCODE mailbox ID the request is targeted for
> > + * @request: request ID
> > + * @reply_mask: mask used to check for request acknowledgment
> > + * @reply: value used to check for request acknowledgement
> > + * @timeout_base_ms: timeout for polling with preemption enabled
> > + *
> > + * Keep resending the @request to @mbox until PCODE acknowledges it, PCODE
> > + * reports an error or an overall timeout of @timeout_base_ms+10 ms expires.
> > + * The request is acknowledged once the PCODE reply dword equals @reply after
> > + * applying @reply_mask. Polling is first attempted with preemption enabled
> > + * for @timeout_base_ms and if this times out for another 10 ms with
> > + * preemption disabled.
> > + *
> > + * Returns 0 on success, %-ETIMEDOUT in case of a timeout, <0 in case of some
> > + * other error as reported by PCODE.
> > + */
> > +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> > +		      u32 reply_mask, u32 reply, int timeout_base_ms)
> > +{
> > +	u32 status;
> > +	int ret;
> > +
> > +	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
> > +
> > +#define COND skl_pcode_try_request(dev_priv, mbox, request, reply_mask, reply, \
> > +				   &status)
> > +
> > +	/*
> > +	 * The first request is queued by PCODE, which normally guarantees
> > +	 * that a subsequent request at most @timeout_base_ms later succeeds.
> > +	 * _wait_for() doesn't guarantee when its passed condition is evaluated
> > +	 * first, so send the first request explicitly.
> > +	 */
> 
> Scratching my head here.
> 
> /* Prime the PCODE by doing a request first. Normally it guarantees that
>  * a subsequent request, at most @timeout_base_ms later, succeeds.
>  */

Right, this is what I meant (explained to me by Art). I can use the
above instead.

> As what I think you mean is that given preemption wait_for() doesn't
> guarantee that at least two CONDs are executed. Which is reasonable, so
> we want to write the code to look for a reply within timeout.

Yep. Btw, I also pondered whether we could just make this part of
wait_for(), but I'm not sure we want the corresponding code increase
(and for -stable we'd want a minimal diff). It's not required in other
cases, although it could speed up the wait in some of them. AFAIR Ville
did some measurements on this.
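
To illustrate the priming point, here is a rough userspace sketch; it is
not the driver code. Everything in it is made up for illustration:
struct fake_pcode, try_request(), poll_only() and prime_then_poll() are
hypothetical names, the clock is a simulated counter, and preempt_us
models being pre-empted right after the deadline has been calculated.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PCODE: acks once ack_after requests arrived. */
struct fake_pcode {
	int requests_seen;
	int ack_after;
};

/* Simulated clock in microseconds (assumption: no real timing involved). */
static uint64_t fake_now_us;

/* Send one request; returns true if the fake PCODE acknowledged it. */
static bool try_request(struct fake_pcode *p)
{
	p->requests_seen++;
	return p->requests_seen >= p->ack_after;
}

/*
 * Plain polling: the deadline is calculated first and the condition is
 * evaluated afterwards. If we lose the CPU for longer than timeout_us in
 * between (preempt_us), only a single request is ever sent.
 */
static int poll_only(struct fake_pcode *p, uint64_t timeout_us,
		     uint64_t period_us, uint64_t preempt_us)
{
	uint64_t deadline = fake_now_us + timeout_us;

	fake_now_us += preempt_us;	/* simulated pre-emption */

	for (;;) {
		bool expired = fake_now_us > deadline;

		if (try_request(p))
			return 0;
		if (expired)
			return -ETIMEDOUT;

		fake_now_us += period_us;	/* simulated sleep */
	}
}

/*
 * Primed polling: the first request goes out before the deadline is
 * calculated, so the first and the last request are always at least
 * timeout_us apart, however long the pre-emption lasts.
 */
static int prime_then_poll(struct fake_pcode *p, uint64_t timeout_us,
			   uint64_t period_us, uint64_t preempt_us)
{
	if (try_request(p))
		return 0;

	return poll_only(p, timeout_us, period_us, preempt_us);
}
```

With a fake PCODE that acks the second request, a 3000 us timeout and a
5000 us pre-emption, poll_only() sends one request and times out, while
prime_then_poll() sends two and succeeds.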

--Imre