From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>,
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 1/2] i915/uncore: Acquire fw before loop in intel_uncore_read64_2x32
Date: Tue, 8 Nov 2022 10:06:52 +0000 [thread overview]
Message-ID: <4482cdde-8c8b-261a-cba7-36595d849a0e@linux.intel.com> (raw)
In-Reply-To: <87h6zauvdt.wl-ashutosh.dixit@intel.com>
On 08/11/2022 00:45, Dixit, Ashutosh wrote:
> On Mon, 07 Nov 2022 16:11:27 -0800, Umesh Nerlige Ramappa wrote:
>>
>> On Mon, Nov 07, 2022 at 01:23:19PM -0800, Dixit, Ashutosh wrote:
>>> On Mon, 07 Nov 2022 02:13:46 -0800, Tvrtko Ursulin wrote:
>>>>
>>>> On 05/11/2022 00:32, Umesh Nerlige Ramappa wrote:
>>>>> PMU reads the GT timestamp as a 2x32 mmio read and since upper and lower
>>>>> 32 bit registers are read in a loop, there is a latency involved between
>>>>> getting the GT timestamp and the CPU timestamp. As part of the
>>>>> resolution, refactor intel_uncore_read64_2x32 to acquire forcewake and
>>>>> uncore lock prior to reading upper and lower regs.
>>>>>
>>>>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/intel_uncore.h | 44 ++++++++++++++++++++---------
>>>>> 1 file changed, 30 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
>>>>> index 5449146a0624..e9e38490815d 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_uncore.h
>>>>> +++ b/drivers/gpu/drm/i915/intel_uncore.h
>>>>> @@ -382,20 +382,6 @@ __uncore_write(write_notrace, 32, l, false)
>>>>> */
>>>>> __uncore_read(read64, 64, q, true)
>>>>> -static inline u64
>>>>> -intel_uncore_read64_2x32(struct intel_uncore *uncore,
>>>>> - i915_reg_t lower_reg, i915_reg_t upper_reg)
>>>>> -{
>>>>> - u32 upper, lower, old_upper, loop = 0;
>>>>> - upper = intel_uncore_read(uncore, upper_reg);
>>>>> - do {
>>>>> - old_upper = upper;
>>>>> - lower = intel_uncore_read(uncore, lower_reg);
>>>>> - upper = intel_uncore_read(uncore, upper_reg);
>>>>> - } while (upper != old_upper && loop++ < 2);
>>>>> - return (u64)upper << 32 | lower;
>>>>> -}
>>>>> -
>>>>> #define intel_uncore_posting_read(...) ((void)intel_uncore_read_notrace(__VA_ARGS__))
>>>>> #define intel_uncore_posting_read16(...) ((void)intel_uncore_read16_notrace(__VA_ARGS__))
>>>>> @@ -455,6 +441,36 @@ static inline void intel_uncore_rmw_fw(struct
>>>>> intel_uncore *uncore,
>>>>> intel_uncore_write_fw(uncore, reg, val);
>>>>> }
>>>>> +static inline u64
>>>>> +intel_uncore_read64_2x32(struct intel_uncore *uncore,
>>>>> + i915_reg_t lower_reg, i915_reg_t upper_reg)
>>>>> +{
>>>>> + u32 upper, lower, old_upper, loop = 0;
>>>>> + enum forcewake_domains fw_domains;
>>>>> + unsigned long flags;
>>>>> +
>>>>> + fw_domains = intel_uncore_forcewake_for_reg(uncore, lower_reg,
>>>>> + FW_REG_READ);
>>>>> +
>>>>> + fw_domains |= intel_uncore_forcewake_for_reg(uncore, upper_reg,
>>>>> + FW_REG_READ);
>>>>> +
>>>>> + spin_lock_irqsave(&uncore->lock, flags);
>>>>> + intel_uncore_forcewake_get__locked(uncore, fw_domains);
>>>>> +
>>>>> + upper = intel_uncore_read_fw(uncore, upper_reg);
>>>>> + do {
>>>>> + old_upper = upper;
>>>>> + lower = intel_uncore_read_fw(uncore, lower_reg);
>>>>> + upper = intel_uncore_read_fw(uncore, upper_reg);
>>>>> + } while (upper != old_upper && loop++ < 2);
>>>>> +
>>>>> + intel_uncore_forcewake_put__locked(uncore, fw_domains);
>>>>
>>>> I mulled over the fact this no longer applies the put hysteresis, but then
>>>> I saw GuC busyness is essentially the only current caller so thought it
>>>> doesn't really warrant adding a super long named
>>>> intel_uncore_forcewake_put_delayed__locked helper.
>>>>
>>>> Perhaps it would make sense to move this out of static inline,
>>
>> Are you saying - drop the inline OR drop static inline? I am assuming the
>> former.
>
> No you need to have 'static inline' for functions defined in a header
> file. I also don't understand completely but seems what Tvrtko is saying is
> move the function to the .c leaving only the declarations in the .h? Anyway
> let Tvrtko explain more.
Yes I does not feel warranted for it to be a static inline so I'd just
move it to .c. In which case..
>>>> in which
>>>> case it would also be easier to have the hysteresis without needing to
>>>> export any new helpers,
>>
>> I don't understand this part. Do you mean that it makes it easier to just
>> call __intel_uncore_forcewake_put(uncore, fw_domains, true) then?
.. you could indeed call this and keep the put hysteresis. But I don't
think that it matters really. You can go with the patch as is for what I
am concerned.
> Yes I think this will work, drop the lock and call
> __intel_uncore_forcewake_put.
>
>> Just
>> wondering how 'static inline' has any effect on that.
>>
>>>> but mostly because it does not feel the static
>>>> inline is justified.
>>
>> Agree, just carried it over from the previous helper definition.
>>
>>>> Sounds an attractive option but it is passable as is.
>>>
>>> Yup, copy that. Also see now how this reduces the read latency. And also it
>>> would increase the latency a bit for a different thread trying to do an
>>> uncore read/write since we hold uncore->lock longer but should be ok I
>>> think.
>>
>> Didn't think about it from that perspective. Worst case is that
>> gt_park/gt_unpark may happen very frequently (as seen on some use
>> cases). In that case, the unpark would end up calling this helper each
>> time.
Concern is two mmio reads under the uncore lock versus two lock-unlock
cycles with one mmio read under them each? Feels like a meh. I guess
with this DC induced latency issue it's a worse worst case but
difference between normal times and pathological spike is probably
orders of magnitude right?
Regards,
Tvrtko
next prev parent reply other threads:[~2022-11-08 10:07 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-05 0:32 [Intel-gfx] [PATCH 0/2] Fix live busy stats selftest failure Umesh Nerlige Ramappa
2022-11-05 0:32 ` [Intel-gfx] [PATCH 1/2] i915/uncore: Acquire fw before loop in intel_uncore_read64_2x32 Umesh Nerlige Ramappa
2022-11-07 10:13 ` Tvrtko Ursulin
2022-11-07 21:23 ` Dixit, Ashutosh
2022-11-08 0:11 ` Umesh Nerlige Ramappa
2022-11-08 0:45 ` Dixit, Ashutosh
2022-11-08 10:06 ` Tvrtko Ursulin [this message]
2022-11-05 0:32 ` [Intel-gfx] [PATCH 2/2] drm/i915/selftest: Bump up sample period for busy stats selftest Umesh Nerlige Ramappa
2022-11-07 10:16 ` Tvrtko Ursulin
2022-11-07 19:01 ` Umesh Nerlige Ramappa
2022-11-07 23:33 ` Dixit, Ashutosh
2022-11-05 0:57 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Fix live busy stats selftest failure Patchwork
2022-11-05 1:19 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-11-05 13:59 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4482cdde-8c8b-261a-cba7-36595d849a0e@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=ashutosh.dixit@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
--cc=umesh.nerlige.ramappa@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox