public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 1/2] i915/uncore: Acquire fw before loop in intel_uncore_read64_2x32
Date: Mon, 7 Nov 2022 16:11:27 -0800	[thread overview]
Message-ID: <Y2mer/algb78p4Fh@unerlige-ril> (raw)
In-Reply-To: <87pmdylarc.wl-ashutosh.dixit@intel.com>

On Mon, Nov 07, 2022 at 01:23:19PM -0800, Dixit, Ashutosh wrote:
>On Mon, 07 Nov 2022 02:13:46 -0800, Tvrtko Ursulin wrote:
>>
>> On 05/11/2022 00:32, Umesh Nerlige Ramappa wrote:
>> > PMU reads the GT timestamp as a 2x32 mmio read and since upper and lower
>> > 32 bit registers are read in a loop, there is a latency involved between
>> > getting the GT timestamp and the CPU timestamp. As part of the
>> > resolution, refactor intel_uncore_read64_2x32 to acquire forcewake and
>> > uncore lock prior to reading upper and lower regs.
>> >
>> > Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> > ---
>> >   drivers/gpu/drm/i915/intel_uncore.h | 44 ++++++++++++++++++++---------
>> >   1 file changed, 30 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
>> > index 5449146a0624..e9e38490815d 100644
>> > --- a/drivers/gpu/drm/i915/intel_uncore.h
>> > +++ b/drivers/gpu/drm/i915/intel_uncore.h
>> > @@ -382,20 +382,6 @@ __uncore_write(write_notrace, 32, l, false)
>> >    */
>> >   __uncore_read(read64, 64, q, true)
>> >   -static inline u64
>> > -intel_uncore_read64_2x32(struct intel_uncore *uncore,
>> > -			 i915_reg_t lower_reg, i915_reg_t upper_reg)
>> > -{
>> > -	u32 upper, lower, old_upper, loop = 0;
>> > -	upper = intel_uncore_read(uncore, upper_reg);
>> > -	do {
>> > -		old_upper = upper;
>> > -		lower = intel_uncore_read(uncore, lower_reg);
>> > -		upper = intel_uncore_read(uncore, upper_reg);
>> > -	} while (upper != old_upper && loop++ < 2);
>> > -	return (u64)upper << 32 | lower;
>> > -}
>> > -
>> >   #define intel_uncore_posting_read(...) ((void)intel_uncore_read_notrace(__VA_ARGS__))
>> >   #define intel_uncore_posting_read16(...) ((void)intel_uncore_read16_notrace(__VA_ARGS__))
>> >   @@ -455,6 +441,36 @@ static inline void intel_uncore_rmw_fw(struct
>> > intel_uncore *uncore,
>> >		intel_uncore_write_fw(uncore, reg, val);
>> >   }
>> >   +static inline u64
>> > +intel_uncore_read64_2x32(struct intel_uncore *uncore,
>> > +			 i915_reg_t lower_reg, i915_reg_t upper_reg)
>> > +{
>> > +	u32 upper, lower, old_upper, loop = 0;
>> > +	enum forcewake_domains fw_domains;
>> > +	unsigned long flags;
>> > +
>> > +	fw_domains = intel_uncore_forcewake_for_reg(uncore, lower_reg,
>> > +						    FW_REG_READ);
>> > +
>> > +	fw_domains |= intel_uncore_forcewake_for_reg(uncore, upper_reg,
>> > +						    FW_REG_READ);
>> > +
>> > +	spin_lock_irqsave(&uncore->lock, flags);
>> > +	intel_uncore_forcewake_get__locked(uncore, fw_domains);
>> > +
>> > +	upper = intel_uncore_read_fw(uncore, upper_reg);
>> > +	do {
>> > +		old_upper = upper;
>> > +		lower = intel_uncore_read_fw(uncore, lower_reg);
>> > +		upper = intel_uncore_read_fw(uncore, upper_reg);
>> > +	} while (upper != old_upper && loop++ < 2);
>> > +
>> > +	intel_uncore_forcewake_put__locked(uncore, fw_domains);
>>
>> I mulled over the fact this no longer applies the put hysteresis, but then
>> I saw GuC busyness is essentially the only current caller so thought it
>> doesn't really warrant adding a super long named
>> intel_uncore_forcewake_put_delayed__locked helper.
>>
>> Perhaps it would make sense to move this out of static inline,

Are you saying - drop the inline OR drop static inline? I am assuming 
the former.

>> in which
>> case it would also be easier to have the hysteresis without needing to
>> export any new helpers,

I don't understand this part. Do you mean that it makes it easier to 
just call __intel_uncore_forcewake_put(uncore, fw_domains, true) then?  
Just wondering how 'static inline' has any effect on that.

>> but mostly because it does not feel the static
>> inline is justified.

Agree, just carried it over from the previous helper definition.

>> Sounds an attractive option but it is passable as is.
>
>Yup, copy that. Also see now how this reduces the read latency. And also it
>would increase the latency a bit for a different thread trying to do an
>uncore read/write since we hold uncore->lock longer but should be ok I
>think.

Didn't think about it from that perspective. Worst case is that 
gt_park/gt_unpark may happen very frequently (as seen on some use 
cases). In that case, the unpark would end up calling this helper each 
time.

>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
>Copy that too:
>
>Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

Thanks,
Umesh
>

>>
>> > +	spin_unlock_irqrestore(&uncore->lock, flags);
>> > +
>> > +	return (u64)upper << 32 | lower;
>> > +}
>> > +
>> >   static inline int intel_uncore_write_and_verify(struct intel_uncore *uncore,
>> >						i915_reg_t reg, u32 val,
>> >						u32 mask, u32 expected_val)

  reply	other threads:[~2022-11-08  0:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-05  0:32 [Intel-gfx] [PATCH 0/2] Fix live busy stats selftest failure Umesh Nerlige Ramappa
2022-11-05  0:32 ` [Intel-gfx] [PATCH 1/2] i915/uncore: Acquire fw before loop in intel_uncore_read64_2x32 Umesh Nerlige Ramappa
2022-11-07 10:13   ` Tvrtko Ursulin
2022-11-07 21:23     ` Dixit, Ashutosh
2022-11-08  0:11       ` Umesh Nerlige Ramappa [this message]
2022-11-08  0:45         ` Dixit, Ashutosh
2022-11-08 10:06           ` Tvrtko Ursulin
2022-11-05  0:32 ` [Intel-gfx] [PATCH 2/2] drm/i915/selftest: Bump up sample period for busy stats selftest Umesh Nerlige Ramappa
2022-11-07 10:16   ` Tvrtko Ursulin
2022-11-07 19:01     ` Umesh Nerlige Ramappa
2022-11-07 23:33   ` Dixit, Ashutosh
2022-11-05  0:57 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Fix live busy stats selftest failure Patchwork
2022-11-05  1:19 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-11-05 13:59 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2mer/algb78p4Fh@unerlige-ril \
    --to=umesh.nerlige.ramappa@intel.com \
    --cc=ashutosh.dixit@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox