Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Ashutosh Dixit <ashutosh.dixit@intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915/pmu: Match frequencies reported by PMU and sysfs
Date: Tue, 4 Oct 2022 14:05:55 +0100	[thread overview]
Message-ID: <7b7fe361-bdeb-7fe0-7572-c64cd8eb6ad6@linux.intel.com> (raw)
In-Reply-To: <dce4c792-fd6d-0647-1977-613026c8c5b3@linux.intel.com>


On 04/10/2022 14:00, Tvrtko Ursulin wrote:
> 
> On 04/10/2022 10:29, Tvrtko Ursulin wrote:
>>
>> On 03/10/2022 20:24, Ashutosh Dixit wrote:
>>> PMU and sysfs use different wakeref's to "interpret" zero freq. Sysfs 
>>> uses
>>> runtime PM wakeref (see intel_rps_read_punit_req and
>>> intel_rps_read_actual_frequency). PMU uses the GT parked/unparked
>>> wakeref. In general the GT wakeref is held for less time that the 
>>> runtime
>>> PM wakeref which causes PMU to report a lower average freq than the 
>>> average
>>> freq obtained from sampling sysfs.
>>>
>>> To resolve this, use the same freq functions (and wakeref's) in PMU as
>>> those used in sysfs.
>>>
>>> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7025
>>> Reported-by: Ashwin Kumar Kulkarni <ashwin.kumar.kulkarni@intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_pmu.c | 27 ++-------------------------
>>>   1 file changed, 2 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c 
>>> b/drivers/gpu/drm/i915/i915_pmu.c
>>> index 958b37123bf1..eda03f264792 100644
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -371,37 +371,16 @@ static void
>>>   frequency_sample(struct intel_gt *gt, unsigned int period_ns)
>>>   {
>>>       struct drm_i915_private *i915 = gt->i915;
>>> -    struct intel_uncore *uncore = gt->uncore;
>>>       struct i915_pmu *pmu = &i915->pmu;
>>>       struct intel_rps *rps = &gt->rps;
>>>       if (!frequency_sampling_enabled(pmu))
>>>           return;
>>> -    /* Report 0/0 (actual/requested) frequency while parked. */
>>> -    if (!intel_gt_pm_get_if_awake(gt))
>>> -        return;
>>> -
>>>       if (pmu->enable & config_mask(I915_PMU_ACTUAL_FREQUENCY)) {
>>> -        u32 val;
>>> -
>>> -        /*
>>> -         * We take a quick peek here without using forcewake
>>> -         * so that we don't perturb the system under observation
>>> -         * (forcewake => !rc6 => increased power use). We expect
>>> -         * that if the read fails because it is outside of the
>>> -         * mmio power well, then it will return 0 -- in which
>>> -         * case we assume the system is running at the intended
>>> -         * frequency. Fortunately, the read should rarely fail!
>>> -         */
>>> -        val = intel_uncore_read_fw(uncore, GEN6_RPSTAT1);
>>> -        if (val)
>>> -            val = intel_rps_get_cagf(rps, val);
>>> -        else
>>> -            val = rps->cur_freq;
>>> -
>>>           add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
>>> -                intel_gpu_freq(rps, val), period_ns / 1000);
>>> +                intel_rps_read_actual_frequency(rps),
>>> +                period_ns / 1000);
>>>       }
>>>       if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) {
>>
>> What is software tracking of requested frequency showing when GT is 
>> parked or runtime suspended? With this change sampling would be 
>> outside any such checks so we need to be sure reported value makes sense.
>>
>> Although more important open is around what is actually correct.
>>
>> For instance how does the patch affect RC6 and power? I don't know how 
>> power management of different blocks is wired up, so personally I 
>> would only be able to look at it empirically. In other words what I am 
>> asking is this - if we changed from skipping obtaining forcewake even 
>> when unparked, to obtaining forcewake if not runtime suspended - what 
>> hardware blocks does that power up and how it affects RC6 and power? 
>> Can it affect actual frequency or not? (Will "something" power up the 
>> clocks just because we will be getting forcewake?)
>>
>> Or maybe question simplified - does 200Hz polling on existing sysfs 
>> actual frequency field disturbs the system under some circumstances? 
>> (Increases power and decreases RC6.) If it does then that would be a 
>> problem. We want a solution which shows the real data, but where the 
>> act of monitoring itself does not change it too much. If it doesn't 
>> then it's okay.
>>
>> Could you somehow investigate on these topics? Maybe log RAPL GPU 
>> power while polling on sysfs, versus getting the actual frequency from 
>> the existing PMU implementation and see if that shows anything? Or 
>> actually simpler - RAPL GPU power for current PMU intel_gpu_top versus 
>> this patch? On idle(-ish) desktop workloads perhaps? Power and 
>> frequency graphed for both.
> 
> Another thought - considering that bspec says for 0xa01c "This register 
> reflects real-time values and thus does not have a pre-determined 
> default value out of reset" - could it be that it also does not reflect 
> a real value when GPU is not executing anything (so zero), just happens 
> to be not runtime suspended? That would mean sysfs reads could maybe 
> show last known value? Just a thought to check.
> 
> I've also tried on my Alderlake desktop:
> 
> 1)
> 
> while true; do cat gt_act_freq_mhz >/dev/null; sleep 0.005; done
> 
> This costs ~120mW of GPU power and ~20% decrease in RC6.
> 
> 
> 2)
> 
> intel_gpu_top -l -s 5 >/dev/null

This "-s 5" was pointless though. :)

Regards,

Tvrtko

> 
> This costs no power or RC6.
> 
> I have also never observed sysfs to show below min freq. This was with 
> no desktop so it's possible this register indeed does not reflect the 
> real situation when things are idle.
> 
> So I think it is possible sysfs value is the misleading one.
> 
> Regards,
> 
> Tvrtko

  reply	other threads:[~2022-10-04 13:06 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-03 19:24 [Intel-gfx] [PATCH] drm/i915/pmu: Match frequencies reported by PMU and sysfs Ashutosh Dixit
2022-10-03 23:34 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2022-10-04  6:20 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2022-10-04  9:29 ` [Intel-gfx] [PATCH] " Tvrtko Ursulin
2022-10-04 13:00   ` Tvrtko Ursulin
2022-10-04 13:05     ` Tvrtko Ursulin [this message]
2022-10-05  7:40     ` Dixit, Ashutosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7b7fe361-bdeb-7fe0-7572-c64cd8eb6ad6@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=ashutosh.dixit@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox