All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
Date: Mon, 10 Apr 2023 15:17:33 -0700	[thread overview]
Message-ID: <878reze60y.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <ZC/5r96UYIZCKjW9@intel.com>

On Fri, 07 Apr 2023 04:08:31 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Wed, Apr 05, 2023 at 09:45:21PM -0700, Ashutosh Dixit wrote:
> > On dGfx, the PL1 power limit being enabled and set to a low value results
> > in a low GPU operating freq. It also negates the freq raise operation which
> > is done before GuC firmware load. As a result GuC firmware load can time
> > out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
> > limit was enabled and set to a low value). Therefore disable the PL1 power
> > limit when allowed by HW when loading GuC firmware.
> >
> > v2:
> >  - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
> >  - Add hwm_power_max_restore to error return code path
> >
> > v3 (Jani N):
> >  - Add/remove explanatory comments
> >  - Function renames
> >  - Type corrections
> >  - Locking annotation
> >
> > v4:
> >  - Don't hold the lock across GuC reset (Rodrigo)
> >  - New locking scheme (suggested by Rodrigo)
> >  - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)
> >
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++++++
> >  drivers/gpu/drm/i915/i915_hwmon.c     | 40 +++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++++
> >  3 files changed, 56 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > index 4ccb4be4c9cba..aa8e35a5636a0 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> > @@ -18,6 +18,7 @@
> >  #include "intel_uc.h"
> >
> >  #include "i915_drv.h"
> > +#include "i915_hwmon.h"
> >
> >  static const struct intel_uc_ops uc_ops_off;
> >  static const struct intel_uc_ops uc_ops_on;
> > @@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
> >	struct intel_guc *guc = &uc->guc;
> >	struct intel_huc *huc = &uc->huc;
> >	int ret, attempts;
> > +	bool pl1en;
>
> we need to initialize this to make warn free builds happy...
> what's our default btw? false? true? we need to read it back?

Yes this was a real bug caught by the kernel build robot. We don't know the
default till we read it back, which would mean exposing a new function. I
have avoided exposing the new function, i.e. I have fixed this by creating a
new (err_rps) label which will make sure that the variable is not used
unless it is initialized. I am not expecting to see warnings from the build
robot with this fix now.

> >
> >	GEM_BUG_ON(!intel_uc_supports_guc(uc));
> >	GEM_BUG_ON(!intel_uc_wants_guc(uc));
> > @@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
> >	else
> >		attempts = 1;
> >
> > +	/* Disable a potentially low PL1 power limit to allow freq to be raised */
> > +	i915_hwmon_power_max_disable(gt->i915, &pl1en);
> > +
> >	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
> >
> >	while (attempts--) {
> > @@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> >		intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >	}
> >
> > +	i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> >	guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
> >	guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
> >
> > @@ -563,6 +570,8 @@ static int __uc_init_hw(struct intel_uc *uc)
> >	/* Return GT back to RPn */
> >	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
> >
> > +	i915_hwmon_power_max_restore(gt->i915, pl1en);
> > +
> >	__uc_sanitize(uc);
> >
> >	if (!ret) {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 7f44e809ca155..9ab8971679fe3 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -50,6 +50,7 @@ struct hwm_drvdata {
> >	struct hwm_energy_info ei;		/*  Energy info for energy1_input */
> >	char name[12];
> >	int gt_n;
> > +	bool reset_in_progress;
> >  };
> >
> >  struct i915_hwmon {
> > @@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >	u32 nval;
> >
> >	mutex_lock(&hwmon->hwmon_lock);
> > +	if (hwmon->ddat.reset_in_progress) {
> > +		ret = -EAGAIN;
> > +		goto unlock;
> > +	}
> >	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> >	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
> > @@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> >  exit:
> >	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > +unlock:
> >	mutex_unlock(&hwmon->hwmon_lock);
> >	return ret;
> >  }
> > @@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int chan, long val)
> >	}
> >  }
> >
> > +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
> > +{
> > +	struct i915_hwmon *hwmon = i915->hwmon;
> > +	u32 r;
> > +
> > +	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +		return;
> > +
> > +	mutex_lock(&hwmon->hwmon_lock);
> > +
> > +	hwmon->ddat.reset_in_progress = true;
> > +	r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > +			     PKG_PWR_LIM_1_EN, 0);
> > +	*old = !!(r & PKG_PWR_LIM_1_EN);
> > +
> > +	mutex_unlock(&hwmon->hwmon_lock);
> > +}
> > +
> > +void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > +{
> > +	struct i915_hwmon *hwmon = i915->hwmon;
> > +
> > +	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
> > +		return;
> > +
> > +	mutex_lock(&hwmon->hwmon_lock);
> > +
> > +	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > +			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > +	hwmon->ddat.reset_in_progress = false;
> > +
> > +	mutex_unlock(&hwmon->hwmon_lock);
> > +}
>
> you could have combined both functions in a
> i915_hwmon_power_max_set(struct drm_i915_private *i915, bool val, bool *old)
>
> then pass NULL to old on the restoration times
> and have
>     if (old)
>		*old = !!(r & PKG_PWR_LIM_1_EN);
>
> But really up to you here, the current code is clear to follow imho
> so, with the pl1en initialization fixed:

Yes, left this as is.

>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

Have retained the R-b since the fix in __uc_init_hw is minor.

Thanks!
Ashutosh

> > +
> >  static umode_t
> >  hwm_energy_is_visible(const struct hwm_drvdata *ddat, u32 attr)
> >  {
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h
> > index 7ca9cf2c34c96..0fcb7de844061 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.h
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.h
> > @@ -7,14 +7,21 @@
> >  #ifndef __I915_HWMON_H__
> >  #define __I915_HWMON_H__
> >
> > +#include <linux/types.h>
> > +
> >  struct drm_i915_private;
> > +struct intel_gt;
> >
> >  #if IS_REACHABLE(CONFIG_HWMON)
> >  void i915_hwmon_register(struct drm_i915_private *i915);
> >  void i915_hwmon_unregister(struct drm_i915_private *i915);
> > +void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old);
> > +void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old);
> >  #else
> >  static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
> >  static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
> > +static inline void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) { };
> > +static inline void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) { };
> >  #endif
> >
> >  #endif /* __I915_HWMON_H__ */
> > --
> > 2.38.0
> >

  reply	other threads:[~2023-04-10 22:38 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
2023-04-06  4:45 ` Ashutosh Dixit
2023-04-06  4:45 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
2023-04-06  4:45   ` Ashutosh Dixit
2023-04-07 11:08   ` [Intel-gfx] " Rodrigo Vivi
2023-04-06  4:45 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
2023-04-06  4:45   ` Ashutosh Dixit
2023-04-06  9:16   ` [Intel-gfx] " kernel test robot
2023-04-06  9:16     ` kernel test robot
2023-04-07 11:08   ` Rodrigo Vivi
2023-04-10 22:17     ` Dixit, Ashutosh [this message]
2023-04-06  4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
2023-04-06  4:45   ` Ashutosh Dixit
2023-04-07 11:04   ` [Intel-gfx] " Rodrigo Vivi
2023-04-10 22:40     ` Dixit, Ashutosh
2023-04-10 22:40       ` Dixit, Ashutosh
2023-04-06  5:15 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware Patchwork
2023-04-06  5:32 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-04-06 17:42 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2023-04-10 22:35 [Intel-gfx] [PATCH 0/3] " Ashutosh Dixit
2023-04-10 22:35 ` [Intel-gfx] [PATCH 2/3] " Ashutosh Dixit
2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] " Ashutosh Dixit
2023-04-20 16:40 ` [Intel-gfx] [PATCH 2/3] " Ashutosh Dixit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878reze60y.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=rodrigo.vivi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.