From: "Anirban, Sk" <sk.anirban@intel.com>
To: "Nilawar, Badal" <badal.nilawar@intel.com>,
"Gupta, Anshuman" <anshuman.gupta@intel.com>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>
Cc: "Poosa, Karthik" <karthik.poosa@intel.com>,
"Pottumuttu, Sai Teja" <sai.teja.pottumuttu@intel.com>
Subject: Re: [PATCH v6] drm/i915/selftests: Implement frequency logging for energy reading validation
Date: Wed, 20 Nov 2024 23:38:34 +0530
Message-ID: <ffedbbf9-6841-4440-ae9d-2afd12cc6c5c@intel.com>
In-Reply-To: <4cb8acc9-eb46-45cc-8f1b-08ce8de3d969@intel.com>
On 20-11-2024 20:20, Nilawar, Badal wrote:
>
>
> On 20-11-2024 16:00, Gupta, Anshuman wrote:
>>
>>
>>> -----Original Message-----
>>> From: Nilawar, Badal <badal.nilawar@intel.com>
>>> Sent: Wednesday, November 20, 2024 1:44 PM
>>> To: Anirban, Sk <sk.anirban@intel.com>; intel-gfx@lists.freedesktop.org
>>> Cc: Gupta, Anshuman <anshuman.gupta@intel.com>; Poosa, Karthik
>>> <karthik.poosa@intel.com>; Pottumuttu, Sai Teja
>>> <sai.teja.pottumuttu@intel.com>
>>> Subject: Re: [PATCH v6] drm/i915/selftests: Implement frequency logging for energy reading validation
>>>
>>>
>>>
>>> On 13-11-2024 15:20, Sk Anirban wrote:
>>>> Introduce RC6 & RC0 frequency logging mechanism to ensure accurate
>>>> energy readings aimed at addressing GPU energy leaks and power
>>>> measurement failures.
>>>> This enhancement will help ensure the accuracy of energy readings.
>>>>
>>>> v2:
>>>> - Improved commit message.
>>>> v3:
>>>> - Used pr_err log to display frequency (Anshuman)
>>>> - Sorted headers alphabetically (Sai Teja)
>>>> v4:
>>>> - Improved commit message.
>>>> - Fix pr_err log (Sai Teja)
>>>> v5:
>>>> - Add error & debug logging for RC0 power and frequency checks (Anshuman)
>>>> v6:
>>>> - Modify debug logging for RC0 power and frequency checks (Sai Teja)
>>>>
>>>> Signed-off-by: Sk Anirban <sk.anirban@intel.com>
>>>> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/gt/selftest_rc6.c | 15 +++++++++++++--
>>>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
>>>> index 1aa1446c8fb0..a8776f88d6a1 100644
>>>> --- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
>>>> +++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
>>>> @@ -8,6 +8,7 @@
>>>> #include "intel_gpu_commands.h"
>>>> #include "intel_gt_requests.h"
>>>> #include "intel_ring.h"
>>>> +#include "intel_rps.h"
>>>> #include "selftest_rc6.h"
>>>>
>>>> #include "selftests/i915_random.h"
>>>> @@ -38,6 +39,9 @@ int live_rc6_manual(void *arg)
>>>> ktime_t dt;
>>>> u64 res[2];
>>>> int err = 0;
>>>> + u32 rc0_freq = 0;
>>>> + u32 rc6_freq = 0;
>>>> + struct intel_rps *rps = &gt->rps;
>>>>
>>>> /*
>>>> * Our claim is that we can "encourage" the GPU to enter rc6 at will.
>>>> @@ -66,6 +70,7 @@ int live_rc6_manual(void *arg)
>>>> rc0_power = librapl_energy_uJ() - rc0_power;
>>>> dt = ktime_sub(ktime_get(), dt);
>>>> res[1] = rc6_residency(rc6);
>>>> + rc0_freq = intel_rps_read_actual_frequency(rps);
>>>> if ((res[1] - res[0]) >> 10) {
>>>> pr_err("RC6 residency increased by %lldus while disabled for 1000ms!\n",
>>>> (res[1] - res[0]) >> 10);
>>>> @@ -77,7 +82,11 @@ int live_rc6_manual(void *arg)
>>>> rc0_power = div64_u64(NSEC_PER_SEC * rc0_power,
>>>> ktime_to_ns(dt));
>>>> if (!rc0_power) {
>>>> - pr_err("No power measured while in RC0\n");
>>>> + if (rc0_freq)
>>>> + pr_err("No power measured while in RC0! GPU Freq: %u in RC0\n",
>>>> + rc0_freq);
>> If the RC0 frequency is available, then this has to be pr_dbg; otherwise,
>> what is the purpose of this patch?
>
> I too didn't understand the purpose of this. How is this going to help
> with accurate energy readings?
When the RC0 power is 0, I just want to confirm whether the frequency is
available or not. It is logged with pr_err because there is no point in
validating the RC0 power below if no power is being measured.
>
>>>> + else
>>>> + pr_err("No power and freq measured while in RC0\n");
>>>> err = -EINVAL;
>>>> goto out_unlock;
>>>> }
>>>> @@ -91,6 +100,7 @@ int live_rc6_manual(void *arg)
>>>> dt = ktime_get();
>>>> rc6_power = librapl_energy_uJ();
>>>> msleep(100);
>>>> + rc6_freq = intel_rps_read_actual_frequency(rps);
>>>
>>> I think the intention of reading the frequency here is to know whether the
>>> device was not in RC6 when there is a failure. But on platforms below gen12,
>>> reading the actual frequency will wake the GT, as the GEN6_RPSTAT register
>>> requires forcewake. To avoid a wake while the device is in RC6, read the
>>> actual frequency without applying forcewake.
>> If reading act_freq will wake the device, how do we read the frequency
>> without forcewake then?
>
> Use the intel_rps_read_actual_frequency_fw API to read the frequency
> without forcewake.
>
> Regards,
> Badal
>
>>
>> Thanks,
>> Anshuman
>>>
>>> Additionally, add a delay, maybe 1 second, after re-enabling RC6
>>> manually and flushing forcewake.
Yeah, I can use that to read the actual frequency, but there is already a
check in place to cross-verify whether the system has entered RC6 or not.
>>>
>>> Regards,
>>> Badal
Thanks,
Anirban
>>>
>>>> rc6_power = librapl_energy_uJ() - rc6_power;
>>>> dt = ktime_sub(ktime_get(), dt);
>>>> res[1] = rc6_residency(rc6);
>>>> @@ -108,7 +118,8 @@ int live_rc6_manual(void *arg)
>>>> pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
>>>> rc0_power, rc6_power);
>>>> if (2 * rc6_power > rc0_power) {
>>>> - pr_err("GPU leaked energy while in RC6!\n");
>>>> + pr_err("GPU leaked energy while in RC6! GPU Freq: %u in RC6 and %u in RC0\n",
>>>> + rc6_freq, rc0_freq);
>>>> err = -EINVAL;
>>>> goto out_unlock;
>>>> }
>>
>
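
For reference, the power arithmetic in the quoted hunks can be sketched as
standalone userspace C. This is only an illustration under assumptions, not
the selftest itself: the helper names avg_power_uW() and rc6_leaks_energy()
are hypothetical, NSEC_PER_SEC is redefined locally, and plain 64-bit division
stands in for the kernel's div64_u64(). The energy delta is taken to be in
microjoules (as the name librapl_energy_uJ() suggests) and the elapsed time in
nanoseconds (as returned by ktime_to_ns()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel constant of the same name. */
#define NSEC_PER_SEC 1000000000ULL

/*
 * Average power in microwatts from an energy delta in microjoules measured
 * over dt_ns nanoseconds: uJ * (ns per s) / ns = uJ/s = uW.
 * Hypothetical helper; the product can overflow 64 bits for energy deltas
 * above roughly 18 kJ, which this sketch does not guard against.
 */
static uint64_t avg_power_uW(uint64_t energy_uJ, uint64_t dt_ns)
{
	assert(dt_ns != 0);	/* a zero interval has no meaningful average */
	return NSEC_PER_SEC * energy_uJ / dt_ns;
}

/*
 * Threshold from the quoted hunk: the check fails when the RC6 power is
 * more than half of the RC0 power. Hypothetical helper name.
 */
static bool rc6_leaks_energy(uint64_t rc0_uW, uint64_t rc6_uW)
{
	return 2 * rc6_uW > rc0_uW;
}
```

With these helpers, 500000 uJ measured over a 100 ms window works out to
5000000 uW, and an RC0 reading of 1000 uW tolerates an RC6 reading of up to
500 uW before the leak check trips.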