From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: <intel-gfx@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH] i915/pmu: Use a faster read for 2x32 mmio reads
Date: Thu, 3 Nov 2022 11:09:36 -0700
Message-ID: <Y2QD4FAM8miZAluS@unerlige-ril>
In-Reply-To: <20221103180705.1315142-1-umesh.nerlige.ramappa@intel.com>
On Thu, Nov 03, 2022 at 11:07:05AM -0700, Umesh Nerlige Ramappa wrote:
>PMU reads the GT timestamp as a 2x32 mmio read, and since the upper and
>lower 32-bit registers are read in a loop, there is latency involved in
>getting the GT timestamp. To reduce the latency, define another version
>of the helper that requires the caller to acquire uncore->lock and the
>necessary forcewakes.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Not for review; this only demonstrates one possible solution to a DG1 BAT
issue.

Thanks,
Umesh
>---
> .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 26 ++++++++++++++++---
> drivers/gpu/drm/i915/intel_uncore.h | 24 +++++++++++++++++
> 2 files changed, 47 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>index 693b07a97789..64b0193c9ee4 100644
>--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>@@ -1252,6 +1252,28 @@ static u32 gpm_timestamp_shift(struct intel_gt *gt)
> 	return 3 - shift;
> }
>
>+static u64 gpm_timestamp(struct intel_uncore *uncore, ktime_t *now)
>+{
>+	enum forcewake_domains fw_domains;
>+	u64 reg;
>+
>+	/* Assume MISC_STATUS0 and MISC_STATUS1 are in the same fw_domain */
>+	fw_domains = intel_uncore_forcewake_for_reg(uncore,
>+						    MISC_STATUS0,
>+						    FW_REG_READ);
>+
>+	spin_lock_irq(&uncore->lock);
>+	intel_uncore_forcewake_get__locked(uncore, fw_domains);
>+
>+	reg = intel_uncore_read64_2x32_fw(uncore, MISC_STATUS0, MISC_STATUS1);
>+	*now = ktime_get();
>+
>+	intel_uncore_forcewake_put__locked(uncore, fw_domains);
>+	spin_unlock_irq(&uncore->lock);
>+
>+	return reg;
>+}
>+
> static void guc_update_pm_timestamp(struct intel_guc *guc, ktime_t *now)
> {
> 	struct intel_gt *gt = guc_to_gt(guc);
>@@ -1261,10 +1283,8 @@ static void guc_update_pm_timestamp(struct intel_guc *guc, ktime_t *now)
> 	lockdep_assert_held(&guc->timestamp.lock);
>
> 	gt_stamp_hi = upper_32_bits(guc->timestamp.gt_stamp);
>-	gpm_ts = intel_uncore_read64_2x32(gt->uncore, MISC_STATUS0,
>-					  MISC_STATUS1) >> guc->timestamp.shift;
>+	gpm_ts = gpm_timestamp(gt->uncore, now) >> guc->timestamp.shift;
> 	gt_stamp_lo = lower_32_bits(gpm_ts);
>-	*now = ktime_get();
>
> 	if (gt_stamp_lo < lower_32_bits(guc->timestamp.gt_stamp))
> 		gt_stamp_hi++;
>diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
>index 5449146a0624..dd0cf7d4ce6c 100644
>--- a/drivers/gpu/drm/i915/intel_uncore.h
>+++ b/drivers/gpu/drm/i915/intel_uncore.h
>@@ -455,6 +455,30 @@ static inline void intel_uncore_rmw_fw(struct intel_uncore *uncore,
> 	intel_uncore_write_fw(uncore, reg, val);
> }
>
>+/*
>+ * Introduce a _fw version of intel_uncore_read64_2x32 so that the 64 bit
>+ * register read is as quick as possible.
>+ *
>+ * NOTE:
>+ * Prior to calling this function, the caller must
>+ * 1. obtain the uncore->lock
>+ * 2. acquire forcewakes for the upper and lower register
>+ */
>+static inline u64
>+intel_uncore_read64_2x32_fw(struct intel_uncore *uncore,
>+			    i915_reg_t lower_reg, i915_reg_t upper_reg)
>+{
>+	u32 upper, lower, old_upper, loop = 0;
>+
>+	upper = intel_uncore_read_fw(uncore, upper_reg);
>+	do {
>+		old_upper = upper;
>+		lower = intel_uncore_read_fw(uncore, lower_reg);
>+		upper = intel_uncore_read_fw(uncore, upper_reg);
>+	} while (upper != old_upper && loop++ < 2);
>+	return (u64)upper << 32 | lower;
>+}
>+
> static inline int intel_uncore_write_and_verify(struct intel_uncore *uncore,
> 						i915_reg_t reg, u32 val,
> 						u32 mask, u32 expected_val)
>--
>2.36.1
>
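
For readers who want to see why the upper/lower/upper read order in the
patch matters, here is a minimal user-space sketch of the same bounded
retry loop. It is not i915 code: struct fake_counter, read_lower(),
read_upper() and read64_2x32() are hypothetical stand-ins, and the
"hardware" is simulated by a counter that advances by a fixed step after
every 32-bit read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit counter exposed as two 32-bit registers. */
struct fake_counter {
	uint64_t value;	/* current counter value */
	uint64_t step;	/* how far the counter advances after each 32-bit read */
};

/* Simulated read of the lower 32 bits; the counter keeps ticking afterwards. */
static uint32_t read_lower(struct fake_counter *c)
{
	uint32_t v = (uint32_t)c->value;

	c->value += c->step;
	return v;
}

/* Simulated read of the upper 32 bits; the counter keeps ticking afterwards. */
static uint32_t read_upper(struct fake_counter *c)
{
	uint32_t v = (uint32_t)(c->value >> 32);

	c->value += c->step;
	return v;
}

/*
 * Same shape as the loop in the patch: read upper, then lower, then upper
 * again, and retry (at most a few times) if the upper half changed while
 * the lower half was being read.
 */
static uint64_t read64_2x32(struct fake_counter *c)
{
	uint32_t upper, lower, old_upper, loop = 0;

	assert(c != NULL);

	upper = read_upper(c);
	do {
		old_upper = upper;
		lower = read_lower(c);
		upper = read_upper(c);
	} while (upper != old_upper && loop++ < 2);

	return (uint64_t)upper << 32 | lower;
}
```

In this sketch, if the counter starts at 0x1FFFFFFFF with a step of 1, the
lower half wraps between the first upper read and the lower read. A naive
"upper, then lower" read would combine the stale upper half with the
wrapped lower half, whereas the loop notices that the upper half changed
and reads both halves again.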