From: Imre Deak <imre.deak@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, Art Runyan <arthur.j.runyan@intel.com>
Subject: Re: [PATCH v6 1/2] drm/i915/gen9: Fix PCODE polling during CDCLK change notification
Date: Thu, 08 Dec 2016 16:18:50 +0200
Message-ID: <1481206730.17555.16.camel@intel.com>
In-Reply-To: <20161208140252.GP4815@nuc-i3427.alporthouse.com>

On Thu, 2016-12-08 at 14:02 +0000, Chris Wilson wrote:
> On Mon, Dec 05, 2016 at 06:27:37PM +0200, Imre Deak wrote:
> > commit 848496e5902833600f7992f4faa82dc1546051ba
> > Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Date:   Wed Jul 13 16:32:03 2016 +0300
> > 
> >     drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
> > 
> > increased the timeout to match the spec, but we still see a timeout on
> > at least one SKL. A CDCLK change request following the failed one will
> > succeed nevertheless.
> > 
> > I could reproduce this problem easily by running kms_pipe_crc_basic in a
> > loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
> > the worst case - when the pre-emption happened right after calculating
> > timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
> > once which failed and so _wait_for() timed out. As opposed to this the
> > spec says to keep retrying the request for at most a 3ms period.
> > 
> > To fix this send the first request explicitly to guarantee that there is
> > 3ms between the first and last request. Though this matches the spec, I
> > noticed that in rare cases this can still time out if we sent only a few
> > requests (in the worst case 2) _and_ PCODE is busy for some reason even
> > after a previous request and a 3ms delay. To work around this retry the
> > polling with pre-emption disabled to maximize the number of requests.
> > Also increase the timeout to 10ms to account for interrupts that could
> > reduce the number of requests. With this change I couldn't trigger
> > the problem.
> > 
> > v2:
> > - Use 1ms poll period instead of 10us. (Chris)
> > v3:
> > - Poll with pre-emption disabled to increase the number of request
> >   attempts. (Ville, Chris)
> > - Factor out a helper to poll, it's also needed by the next patch.
> > v4:
> > - Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
> >   reply is generic. (Ville)
> > v5:
> > - List the request specific timeout values as code comment. (Ville)
> > v6:
> > - Try the poll first with preemption enabled.
> > - Add code comment about first request being queued by PCODE. (Art)
> > - Add timeout_base_ms argument. (Ville)
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Art Runyan <arthur.j.runyan@intel.com>
> > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
> > Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> 
> The only issue I have is that buried within snb_pcode_read() is another
> wait_for, now in an atomic section and we have been trying to eradicate
> those. It should be happy enough, just a pita to fix later!

Yeah, agreed with the rationale for that. I can volunteer to refactor
this part as a follow-up, although passing a flag to snb_pcode_read()
would still be the clearest to me.
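
To illustrate the flag idea, here is a rough user-space model, not driver
code. Every name in it (model_wait_ready(), is_atomic, the simulated clock)
is made up for the example, and the 10us sleep step is an arbitrary
assumption. It only shows the shape: the caller says whether it is in an
atomic section, and the helper then busy-waits instead of sleeping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated clock in microseconds; not a kernel API. */
static uint64_t clock_us;

/* Counts how the wait was spent, so the two modes can be told apart. */
struct wait_stats {
	unsigned int sleeps;
	unsigned int spins;
};

/* Stand-in for a sleeping wait: advances the simulated clock by 'us'. */
static void model_sleep(struct wait_stats *s, uint64_t us)
{
	s->sleeps++;
	clock_us += us;
}

/* Stand-in for one busy-wait iteration: advances the clock by 1us. */
static void model_spin(struct wait_stats *s)
{
	s->spins++;
	clock_us += 1;
}

/*
 * Wait until the simulated clock reaches ready_at_us or timeout_us elapses.
 * is_atomic is the hypothetical flag: when set, the helper never sleeps.
 * Returns 0 on success, -1 on timeout.
 */
static int model_wait_ready(uint64_t ready_at_us, uint64_t timeout_us,
			    bool is_atomic, struct wait_stats *s)
{
	uint64_t deadline = clock_us + timeout_us;

	while (clock_us < ready_at_us) {
		if (clock_us >= deadline)
			return -1;
		if (is_atomic)
			model_spin(s);
		else
			model_sleep(s, 10);
	}
	return 0;
}
```

With the flag set the model only ever spins, which is the property a caller
polling with preemption disabled would rely on.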

Btw, this is also for -stable imo.

> Other than that and a minor nit, it looks like a reasonable compromise.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> >  drivers/gpu/drm/i915/i915_drv.h      |  2 +
> >  drivers/gpu/drm/i915/intel_display.c | 31 +++++----------
> >  drivers/gpu/drm/i915/intel_pm.c      | 75 ++++++++++++++++++++++++++++++++++++
> >  3 files changed, 87 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index ca9786c..a2462bf 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -3597,6 +3597,8 @@ extern void intel_display_print_error_state(struct drm_i915_error_state_buf *e,
> >  
> >  int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val);
> >  int sandybridge_pcode_write(struct drm_i915_private *dev_priv, u32 mbox, u32 val);
> > +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> > +		      u32 reply_mask, u32 reply, int timeout_base_ms);
> >  
> >  /* intel_sideband.c */
> >  u32 vlv_punit_read(struct drm_i915_private *dev_priv, u32 addr);
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 1fafcce..4ef7392 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -6280,35 +6280,24 @@ skl_dpll0_disable(struct drm_i915_private *dev_priv)
> >  	dev_priv->cdclk_pll.vco = 0;
> >  }
> >  
> > -static bool skl_cdclk_pcu_ready(struct drm_i915_private *dev_priv)
> > -{
> > -	int ret;
> > -	u32 val;
> > -
> > -	/* inform PCU we want to change CDCLK */
> > -	val = SKL_CDCLK_PREPARE_FOR_CHANGE;
> > -	mutex_lock(&dev_priv->rps.hw_lock);
> > -	ret = sandybridge_pcode_read(dev_priv, SKL_PCODE_CDCLK_CONTROL, &val);
> > -	mutex_unlock(&dev_priv->rps.hw_lock);
> > -
> > -	return ret == 0 && (val & SKL_CDCLK_READY_FOR_CHANGE);
> > -}
> > -
> > -static bool skl_cdclk_wait_for_pcu_ready(struct drm_i915_private *dev_priv)
> > -{
> > -	return _wait_for(skl_cdclk_pcu_ready(dev_priv), 3000, 10) == 0;
> > -}
> > -
> >  static void skl_set_cdclk(struct drm_i915_private *dev_priv, int cdclk, int vco)
> >  {
> >  	u32 freq_select, pcu_ack;
> > +	int ret;
> >  
> >  	WARN_ON((cdclk == 24000) != (vco == 0));
> >  
> >  	DRM_DEBUG_DRIVER("Changing CDCLK to %d kHz (VCO %d kHz)\n", cdclk, vco);
> >  
> > -	if (!skl_cdclk_wait_for_pcu_ready(dev_priv)) {
> > -		DRM_ERROR("failed to inform PCU about cdclk change\n");
> > +	mutex_lock(&dev_priv->rps.hw_lock);
> > +	ret = skl_pcode_request(dev_priv, SKL_PCODE_CDCLK_CONTROL,
> > +				SKL_CDCLK_PREPARE_FOR_CHANGE,
> > +				SKL_CDCLK_READY_FOR_CHANGE,
> > +				SKL_CDCLK_READY_FOR_CHANGE, 3);
> > +	mutex_unlock(&dev_priv->rps.hw_lock);
> > +	if (ret) {
> > +		DRM_ERROR("Failed to inform PCU about cdclk change (%d)\n",
> > +			  ret);
> >  		return;
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 59a88de..6c2fa34 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -7864,6 +7864,81 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
> >  	return 0;
> >  }
> >  
> > +static bool skl_pcode_try_request(struct drm_i915_private *dev_priv, u32 mbox,
> > +				  u32 request, u32 reply_mask, u32 reply,
> > +				  u32 *status)
> > +{
> > +	u32 val = request;
> > +
> > +	*status = sandybridge_pcode_read(dev_priv, mbox, &val);
> > +
> > +	return *status || ((val & reply_mask) == reply);
> > +}
> > +
> > +/**
> > + * skl_pcode_request - send PCODE request until acknowledgment
> > + * @dev_priv: device private
> > + * @mbox: PCODE mailbox ID the request is targeted for
> > + * @request: request ID
> > + * @reply_mask: mask used to check for request acknowledgment
> > + * @reply: value used to check for request acknowledgement
> > + * @timeout_base_ms: timeout for polling with preemption enabled
> > + *
> > + * Keep resending the @request to @mbox until PCODE acknowledges it, PCODE
> > + * reports an error or an overall timeout of @timeout_base_ms+10 ms expires.
> > + * The request is acknowledged once the PCODE reply dword equals @reply after
> > + * applying @reply_mask. Polling is first attempted with preemption enabled
> > + * for @timeout_base_ms and if this times out for another 10 ms with
> > + * preemption disabled.
> > + *
> > + * Returns 0 on success, %-ETIMEDOUT in case of a timeout, <0 in case of some
> > + * other error as reported by PCODE.
> > + */
> > +int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
> > +		      u32 reply_mask, u32 reply, int timeout_base_ms)
> > +{
> > +	u32 status;
> > +	int ret;
> > +
> > +	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
> > +
> > +#define COND skl_pcode_try_request(dev_priv, mbox, request, reply_mask, reply, \
> > +				   &status)
> > +
> > +	/*
> > +	 * The first request is queued by PCODE, which normally guarantees
> > +	 * that a subsequent request at most @timeout_base_ms later succeeds.
> > +	 * _wait_for() doesn't guarantee when its passed condition is evaluated
> > +	 * first, so send the first request explicitly.
> > +	 */
> 
> Scratching my head here.
> 
> /* Prime the PCODE by doing a request first. Normally it guarantees that
>  * a subsequent request, at most @timeout_base_ms later, succeeds.
>  */

Right, this is what I meant (explained to me by Art). I can use the
above instead.

> What I think you mean is that, given preemption, wait_for() doesn't
> guarantee that at least two CONDs are executed. Which is reasonable, so
> we want to write the code to look for a reply within timeout.

Yep. Btw, I also pondered if we could just make this part of
wait_for(), but not sure if we want the corresponding code increase
(and for -stable we'd want a minimal diff). It's not required in other
cases, although it could speed up the wait in some cases. AFAIR Ville
did some measurements on this.
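
For reference, the failure mode and the explicit first request can be
reproduced in a small user-space model. This is only a sketch under
assumptions: the clock is simulated, fake_pcode is a made-up stand-in that
acks a request only once busy_us has passed since the first one, and
poll_until_ack() is a simplified deadline loop, not the actual _wait_for()
macro. stall_us models being preempted right after the deadline is computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated clock in microseconds. */
static uint64_t now_us;

/*
 * Made-up PCODE model: the first request is only queued; a later request
 * is acknowledged once busy_us has elapsed since that first request.
 */
struct fake_pcode {
	bool primed;
	uint64_t first_request_us;
	uint64_t busy_us;
	unsigned int requests;
};

static bool fake_pcode_request(struct fake_pcode *p)
{
	p->requests++;
	if (!p->primed) {
		p->primed = true;
		p->first_request_us = now_us;
		return false;
	}
	return now_us - p->first_request_us >= p->busy_us;
}

/*
 * Simplified deadline poll. Returns 0 on ack, -1 on timeout. The stall is
 * injected after the deadline is computed and before the first request.
 */
static int poll_until_ack(struct fake_pcode *p, uint64_t timeout_us,
			  uint64_t period_us, uint64_t stall_us)
{
	uint64_t deadline;

	assert(period_us > 0);
	deadline = now_us + timeout_us;
	now_us += stall_us;

	for (;;) {
		bool expired = now_us > deadline;

		if (fake_pcode_request(p))
			return 0;
		if (expired)
			return -1;
		now_us += period_us;
	}
}

/* Same poll, but the first request is sent before the deadline is set. */
static int poll_primed(struct fake_pcode *p, uint64_t timeout_us,
		       uint64_t period_us, uint64_t stall_us)
{
	if (fake_pcode_request(p))
		return 0;
	return poll_until_ack(p, timeout_us, period_us, stall_us);
}
```

In this model a 5ms stall against a 3ms timeout makes poll_until_ack() time
out after a single request, while poll_primed() succeeds with two requests,
which is the worst case mentioned in the commit message.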

--Imre
