From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] i915/pmu: Use a faster read for 2x32 mmio reads
Date: Thu, 03 Nov 2022 22:10:14 -0700
Message-ID: <87wn8bl2yx.wl-ashutosh.dixit@intel.com>
In-Reply-To: <20221103180705.1315142-1-umesh.nerlige.ramappa@intel.com>

On Thu, 03 Nov 2022 11:07:05 -0700, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

> PMU reads the GT timestamp as a 2x32 mmio read and since upper and lower
> 32 bit registers are read in a loop, there is a latency involved in
> getting the GT timestamp. To reduce the latency, define another version
> of the helper that requires caller to acquire uncore->spinlock and
> necessary forcewakes.

Why does this reduce the latency compared to intel_uncore_read64_2x32?

Thanks.
--
Ashutosh

> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 26 ++++++++++++++++---
>  drivers/gpu/drm/i915/intel_uncore.h           | 24 +++++++++++++++++
>  2 files changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 693b07a97789..64b0193c9ee4 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1252,6 +1252,28 @@ static u32 gpm_timestamp_shift(struct intel_gt *gt)
>	return 3 - shift;
>  }
>
> +static u64 gpm_timestamp(struct intel_uncore *uncore, ktime_t *now)
> +{
> +	enum forcewake_domains fw_domains;
> +	u64 reg;
> +
> +	/* Assume MISC_STATUS0 and MISC_STATUS1 are in the same fw_domain */
> +	fw_domains = intel_uncore_forcewake_for_reg(uncore,
> +						    MISC_STATUS0,
> +						    FW_REG_READ);
> +
> +	spin_lock_irq(&uncore->lock);
> +	intel_uncore_forcewake_get__locked(uncore, fw_domains);
> +
> +	reg = intel_uncore_read64_2x32_fw(uncore, MISC_STATUS0, MISC_STATUS1);
> +	*now = ktime_get();
> +
> +	intel_uncore_forcewake_put__locked(uncore, fw_domains);
> +	spin_unlock_irq(&uncore->lock);
> +
> +	return reg;
> +}
> +
>  static void guc_update_pm_timestamp(struct intel_guc *guc, ktime_t *now)
>  {
>	struct intel_gt *gt = guc_to_gt(guc);
> @@ -1261,10 +1283,8 @@ static void guc_update_pm_timestamp(struct intel_guc *guc, ktime_t *now)
>	lockdep_assert_held(&guc->timestamp.lock);
>
>	gt_stamp_hi = upper_32_bits(guc->timestamp.gt_stamp);
> -	gpm_ts = intel_uncore_read64_2x32(gt->uncore, MISC_STATUS0,
> -					  MISC_STATUS1) >> guc->timestamp.shift;
> +	gpm_ts = gpm_timestamp(gt->uncore, now) >> guc->timestamp.shift;
>	gt_stamp_lo = lower_32_bits(gpm_ts);
> -	*now = ktime_get();
>
>	if (gt_stamp_lo < lower_32_bits(guc->timestamp.gt_stamp))
>		gt_stamp_hi++;
> diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
> index 5449146a0624..dd0cf7d4ce6c 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.h
> +++ b/drivers/gpu/drm/i915/intel_uncore.h
> @@ -455,6 +455,30 @@ static inline void intel_uncore_rmw_fw(struct intel_uncore *uncore,
>		intel_uncore_write_fw(uncore, reg, val);
>  }
>
> +/*
> + * Introduce a _fw version of intel_uncore_read64_2x32 so that the 64 bit
> + * register read is as quick as possible.
> + *
> + * NOTE:
> + * Prior to calling this function, the caller must
> + * 1. obtain the uncore->lock
> + * 2. acquire forcewakes for the upper and lower register
> + */
> +static inline u64
> +intel_uncore_read64_2x32_fw(struct intel_uncore *uncore,
> +			    i915_reg_t lower_reg, i915_reg_t upper_reg)
> +{
> +	u32 upper, lower, old_upper, loop = 0;
> +
> +	upper = intel_uncore_read_fw(uncore, upper_reg);
> +	do {
> +		old_upper = upper;
> +		lower = intel_uncore_read_fw(uncore, lower_reg);
> +		upper = intel_uncore_read_fw(uncore, upper_reg);
> +	} while (upper != old_upper && loop++ < 2);
> +	return (u64)upper << 32 | lower;
> +}
> +
>  static inline int intel_uncore_write_and_verify(struct intel_uncore *uncore,
>						i915_reg_t reg, u32 val,
>						u32 mask, u32 expected_val)
> --
> 2.36.1
>
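
For readers following the question above, here is a minimal user-space sketch of
the two read strategies under discussion. It is not i915 code: struct fake_dev,
the accessors and the lock_ops counter are all hypothetical, and it assumes
(this message does not confirm it) that the cost being saved is a lock and
forcewake round trip paid on every 32-bit access instead of once per 64-bit
read. The retry loop mirrors the one in the quoted intel_uncore_read64_2x32_fw.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical device model: a free-running 64-bit counter exposed as two
 * 32-bit registers. The counter advances on every access to mimic time
 * passing between reads.
 */
struct fake_dev {
	uint64_t counter;      /* current counter value */
	uint64_t step;         /* advance per register access */
	unsigned int lock_ops; /* stand-in for lock + forcewake round trips */
};

/* Raw accessors: the caller is expected to hold the lock already. */
static uint32_t raw_read_lo(struct fake_dev *d)
{
	d->counter += d->step;
	return (uint32_t)d->counter;
}

static uint32_t raw_read_hi(struct fake_dev *d)
{
	d->counter += d->step;
	return (uint32_t)(d->counter >> 32);
}

/* Checked accessors: each call pays for its own lock round trip. */
static uint32_t locked_read_lo(struct fake_dev *d)
{
	d->lock_ops++;
	return raw_read_lo(d);
}

static uint32_t locked_read_hi(struct fake_dev *d)
{
	d->lock_ops++;
	return raw_read_hi(d);
}

typedef uint32_t (*read_fn)(struct fake_dev *);

/* Read upper, lower, upper; retry at most twice if upper changed meanwhile. */
static uint64_t read64_2x32(struct fake_dev *d, read_fn lo, read_fn hi)
{
	uint32_t upper, lower, old_upper, loop = 0;

	assert(d && lo && hi);
	upper = hi(d);
	do {
		old_upper = upper;
		lower = lo(d);
		upper = hi(d);
	} while (upper != old_upper && loop++ < 2);
	return (uint64_t)upper << 32 | lower;
}

/* Variant A: every 32-bit access takes the lock on its own. */
static uint64_t read64_per_access_lock(struct fake_dev *d)
{
	return read64_2x32(d, locked_read_lo, locked_read_hi);
}

/* Variant B: take the lock once, then use raw accesses for the sequence. */
static uint64_t read64_single_lock(struct fake_dev *d)
{
	d->lock_ops++;
	return read64_2x32(d, raw_read_lo, raw_read_hi);
}
```

In this model both variants return the same value, but variant A pays one lock
round trip per 32-bit access (three at minimum, five when the upper half rolls
over mid-read), while variant B pays exactly one. Whether that is where the
real latency goes in the driver is the open question in this reply.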
