* [PATCH v3] drm/i915/selftests: Refactor RC6 power measurement and error handling
@ 2025-03-06 19:11 sk.anirban
2025-03-06 21:42 ` ✗ i915.CI.BAT: failure for drm/i915/selftests: Refactor RC6 power measurement and error handling (rev2) Patchwork
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: sk.anirban @ 2025-03-06 19:11 UTC (permalink / raw)
To: intel-gfx
Cc: anshuman.gupta, badal.nilawar, riana.tauro, karthik.poosa,
Sk Anirban
From: Sk Anirban <sk.anirban@intel.com>
Refactor power measurement logic to store and compare energy values.
Introduce a threshold check to ensure the GPU enters RC6 properly.
v2:
- Improved commit message (Badal)
v3:
- Reorder threshold check (Badal)
Signed-off-by: Sk Anirban <sk.anirban@intel.com>
---
drivers/gpu/drm/i915/gt/selftest_rc6.c | 59 +++++++++++++++++---------
1 file changed, 38 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
index 908483ab0bc8..5364e50be638 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
@@ -33,15 +33,20 @@ int live_rc6_manual(void *arg)
{
struct intel_gt *gt = arg;
struct intel_rc6 *rc6 = &gt->rc6;
- u64 rc0_power, rc6_power;
+ struct intel_rps *rps = &gt->rps;
intel_wakeref_t wakeref;
+ u64 sleep_time = 1000;
+ u32 rc0_freq = 0;
+ u32 rc6_freq = 0;
+ u64 rc0_power[3];
+ u64 rc6_power[3];
bool has_power;
+ u64 threshold;
ktime_t dt;
u64 res[2];
int err = 0;
- u32 rc0_freq = 0;
- u32 rc6_freq = 0;
- struct intel_rps *rps = &gt->rps;
+ u64 diff;
+
/*
* Our claim is that we can "encourage" the GPU to enter rc6 at will.
@@ -65,9 +70,9 @@ int live_rc6_manual(void *arg)
res[0] = rc6_residency(rc6);
dt = ktime_get();
- rc0_power = librapl_energy_uJ();
- msleep(1000);
- rc0_power = librapl_energy_uJ() - rc0_power;
+ rc0_power[0] = librapl_energy_uJ();
+ msleep(sleep_time);
+ rc0_power[1] = librapl_energy_uJ() - rc0_power[0];
dt = ktime_sub(ktime_get(), dt);
res[1] = rc6_residency(rc6);
rc0_freq = intel_rps_read_actual_frequency_fw(rps);
@@ -79,11 +84,12 @@ int live_rc6_manual(void *arg)
}
if (has_power) {
- rc0_power = div64_u64(NSEC_PER_SEC * rc0_power,
- ktime_to_ns(dt));
- if (!rc0_power) {
+ rc0_power[2] = div64_u64(NSEC_PER_SEC * rc0_power[1],
+ ktime_to_ns(dt));
+
+ if (!rc0_power[2]) {
if (rc0_freq)
- pr_debug("No power measured while in RC0! GPU Freq: %u in RC0\n",
+ pr_debug("No power measured while in RC0! GPU Freq: %uMHz in RC0\n",
rc0_freq);
else
pr_err("No power and freq measured while in RC0\n");
@@ -98,10 +104,10 @@ int live_rc6_manual(void *arg)
res[0] = rc6_residency(rc6);
intel_uncore_forcewake_flush(rc6_to_uncore(rc6), FORCEWAKE_ALL);
dt = ktime_get();
- rc6_power = librapl_energy_uJ();
- msleep(1000);
+ rc6_power[0] = librapl_energy_uJ();
+ msleep(sleep_time);
rc6_freq = intel_rps_read_actual_frequency_fw(rps);
- rc6_power = librapl_energy_uJ() - rc6_power;
+ rc6_power[1] = librapl_energy_uJ() - rc6_power[0];
dt = ktime_sub(ktime_get(), dt);
res[1] = rc6_residency(rc6);
if (res[1] == res[0]) {
@@ -113,13 +119,24 @@ int live_rc6_manual(void *arg)
}
if (has_power) {
- rc6_power = div64_u64(NSEC_PER_SEC * rc6_power,
- ktime_to_ns(dt));
- pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
- rc0_power, rc6_power);
- if (2 * rc6_power > rc0_power) {
- pr_err("GPU leaked energy while in RC6! GPU Freq: %u in RC6 and %u in RC0\n",
- rc6_freq, rc0_freq);
+ rc6_power[2] = div64_u64(NSEC_PER_SEC * rc6_power[1],
+ ktime_to_ns(dt));
+ pr_info("GPU consumed %lluuW in RC0 and %lluuW in RC6\n",
+ rc0_power[2], rc6_power[2]);
+
+ if (2 * rc6_power[2] > rc0_power[2]) {
+ pr_err("GPU leaked energy while in RC6!\n"
+ "GPU Freq: %uMHz in RC6 and %uMHz in RC0\n"
+ "RC0 energy before & after sleep respectively: %lluuJ %lluuJ\n"
+ "RC6 energy before & after sleep respectively: %lluuJ %lluuJ\n",
+ rc6_freq, rc0_freq, rc0_power[0], rc0_power[1],
+ rc6_power[0], rc6_power[1]);
+
+ diff = res[1] - res[0];
+ threshold = (9 * NSEC_PER_MSEC * sleep_time) / 10;
+ if (diff < threshold)
+ pr_err("Did not enter RC6 properly, RC6 start residency=%lluns, RC6 end residency=%lluns\n",
+ res[0], res[1]);
err = -EINVAL;
goto out_unlock;
}
--
2.34.1
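
For readers following the arithmetic in the patch, here is a minimal user-space sketch of the three calculations it relies on. It is not the kernel code: it assumes energy readings in uJ, elapsed time in ns, and residency counters in ns (as the patch's log messages suggest). The helper names avg_power_uW, rc6_leaks_power, rc6_residency_threshold_ns and rc6_entered_properly are hypothetical and do not exist in i915. Note that in the patch, index [1] of each power array holds the energy delta across the sleep, and index [2] holds the derived average power.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as the kernel constants of the same names. */
#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical helper: average power in uW from an energy delta (uJ)
 * accumulated over dt_ns nanoseconds. Mirrors the div64_u64() step. */
static uint64_t avg_power_uW(uint64_t energy_delta_uJ, uint64_t dt_ns)
{
	assert(dt_ns != 0); /* the measurement window must be non-empty */
	return (NSEC_PER_SEC * energy_delta_uJ) / dt_ns;
}

/* Hypothetical helper: the leak condition used by the selftest.
 * RC6 is flagged when it draws more than half of the RC0 power. */
static bool rc6_leaks_power(uint64_t rc0_uW, uint64_t rc6_uW)
{
	return 2 * rc6_uW > rc0_uW;
}

/* Hypothetical helper: residency threshold, 90% of the sleep time, in ns. */
static uint64_t rc6_residency_threshold_ns(uint64_t sleep_ms)
{
	return (9 * NSEC_PER_MSEC * sleep_ms) / 10;
}

/* Hypothetical helper: true when the residency counter (assumed to be in ns)
 * advanced by at least the threshold during the sleep. */
static bool rc6_entered_properly(uint64_t res_start_ns, uint64_t res_end_ns,
				 uint64_t sleep_ms)
{
	return res_end_ns - res_start_ns >= rc6_residency_threshold_ns(sleep_ms);
}
```

With sleep_time = 1000 ms the threshold is 900000000 ns, so under these assumptions the extra "Did not enter RC6 properly" message fires only when the RC6 residency counter advanced by less than 0.9 s during the one-second sleep.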
* ✗ i915.CI.BAT: failure for drm/i915/selftests: Refactor RC6 power measurement and error handling (rev2)
From: Patchwork @ 2025-03-06 21:42 UTC (permalink / raw)
To: sk.anirban; +Cc: intel-gfx
[-- Attachment #1: Type: text/plain, Size: 6857 bytes --]
== Series Details ==
Series: drm/i915/selftests: Refactor RC6 power measurement and error handling (rev2)
URL : https://patchwork.freedesktop.org/series/145766/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_16237 -> Patchwork_145766v2
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_145766v2 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_145766v2, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/index.html
Participating hosts (44 -> 43)
------------------------------
Missing (1): fi-snb-2520m
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_145766v2:
### IGT changes ###
#### Possible regressions ####
* igt@i915_pm_rpm@module-reload:
- fi-rkl-11600: [PASS][1] -> [ABORT][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/fi-rkl-11600/igt@i915_pm_rpm@module-reload.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-rkl-11600/igt@i915_pm_rpm@module-reload.html
* igt@kms_flip@basic-flip-vs-dpms@c-dp2:
- fi-cfl-8109u: [PASS][3] -> [DMESG-WARN][4] +2 other tests dmesg-warn
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/fi-cfl-8109u/igt@kms_flip@basic-flip-vs-dpms@c-dp2.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-cfl-8109u/igt@kms_flip@basic-flip-vs-dpms@c-dp2.html
Known issues
------------
Here are the changes found in Patchwork_145766v2 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@fbdev@info:
- fi-kbl-8809g: NOTRUN -> [SKIP][5] ([i915#1849])
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@fbdev@info.html
* igt@gem_huc_copy@huc-copy:
- fi-kbl-8809g: NOTRUN -> [SKIP][6] ([i915#2190])
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@gem_huc_copy@huc-copy.html
* igt@gem_lmem_swapping@parallel-random-engines:
- fi-kbl-8809g: NOTRUN -> [SKIP][7] ([i915#4613]) +3 other tests skip
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@gem_lmem_swapping@parallel-random-engines.html
* igt@i915_module_load@load:
- bat-mtlp-9: [PASS][8] -> [DMESG-WARN][9] ([i915#13494])
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-mtlp-9/igt@i915_module_load@load.html
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-mtlp-9/igt@i915_module_load@load.html
* igt@i915_selftest@live:
- bat-arlh-3: [PASS][10] -> [DMESG-FAIL][11] ([i915#12061] / [i915#12435])
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-arlh-3/igt@i915_selftest@live.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-arlh-3/igt@i915_selftest@live.html
- bat-rplp-1: [PASS][12] -> [ABORT][13] ([i915#9413]) +1 other test abort
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-rplp-1/igt@i915_selftest@live.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-rplp-1/igt@i915_selftest@live.html
* igt@i915_selftest@live@workarounds:
- bat-arlh-3: [PASS][14] -> [DMESG-FAIL][15] ([i915#12061])
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-arlh-3/igt@i915_selftest@live@workarounds.html
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-arlh-3/igt@i915_selftest@live@workarounds.html
* igt@kms_dsc@dsc-basic:
- fi-kbl-8809g: NOTRUN -> [SKIP][16] +34 other tests skip
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@kms_dsc@dsc-basic.html
* igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
- bat-dg2-11: [PASS][17] -> [SKIP][18] ([i915#9197]) +1 other test skip
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
#### Possible fixes ####
* igt@dmabuf@all-tests@dma_fence_chain:
- fi-bsw-nick: [INCOMPLETE][19] ([i915#12904]) -> [PASS][20] +1 other test pass
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/fi-bsw-nick/igt@dmabuf@all-tests@dma_fence_chain.html
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-bsw-nick/igt@dmabuf@all-tests@dma_fence_chain.html
* igt@i915_module_load@reload:
- bat-twl-2: [DMESG-WARN][21] ([i915#13736]) -> [PASS][22]
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-twl-2/igt@i915_module_load@reload.html
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-twl-2/igt@i915_module_load@reload.html
* igt@i915_selftest@live@workarounds:
- bat-arls-5: [DMESG-FAIL][23] ([i915#12061]) -> [PASS][24] +1 other test pass
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-arls-5/igt@i915_selftest@live@workarounds.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-arls-5/igt@i915_selftest@live@workarounds.html
[i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
[i915#12435]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12435
[i915#12904]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
[i915#13494]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13494
[i915#13736]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13736
[i915#1849]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/1849
[i915#2190]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/2190
[i915#4613]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/4613
[i915#9197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9197
[i915#9413]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9413
Build changes
-------------
* Linux: CI_DRM_16237 -> Patchwork_145766v2
CI-20190529: 20190529
CI_DRM_16237: 6f2e5afc45e3ca8bf46427ae21a9c1029ea6cdb3 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_8263: 25f60274b3dd14d35a7f32558b489ab7a02b6223 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
Patchwork_145766v2: 6f2e5afc45e3ca8bf46427ae21a9c1029ea6cdb3 @ git://anongit.freedesktop.org/gfx-ci/linux
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/index.html
* Re: [PATCH v3] drm/i915/selftests: Refactor RC6 power measurement and error handling
From: Nilawar, Badal @ 2025-03-11 11:17 UTC (permalink / raw)
To: sk.anirban, intel-gfx; +Cc: anshuman.gupta, riana.tauro, karthik.poosa
On 07-03-2025 00:41, sk.anirban@intel.com wrote:
> From: Sk Anirban <sk.anirban@intel.com>
>
> Refactor power measurement logic to store and compare energy values.
> Introduce a threshold check to ensure the GPU enters RC6 properly.
>
> v2:
> - Improved commit message (Badal)
>
> v3:
> - Reorder threshold check (Badal)
>
> Signed-off-by: Sk Anirban <sk.anirban@intel.com>
> ---
> drivers/gpu/drm/i915/gt/selftest_rc6.c | 59 +++++++++++++++++---------
> 1 file changed, 38 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
> index 908483ab0bc8..5364e50be638 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
> @@ -33,15 +33,20 @@ int live_rc6_manual(void *arg)
> {
> struct intel_gt *gt = arg;
> struct intel_rc6 *rc6 = &gt->rc6;
> - u64 rc0_power, rc6_power;
> + struct intel_rps *rps = &gt->rps;
> intel_wakeref_t wakeref;
> + u64 sleep_time = 1000;
> + u32 rc0_freq = 0;
> + u32 rc6_freq = 0;
> + u64 rc0_power[3];
> + u64 rc6_power[3];
> bool has_power;
> + u64 threshold;
> ktime_t dt;
> u64 res[2];
> int err = 0;
> - u32 rc0_freq = 0;
> - u32 rc6_freq = 0;
> - struct intel_rps *rps = &gt->rps;
> + u64 diff;
> +
>
> /*
> * Our claim is that we can "encourage" the GPU to enter rc6 at will.
> @@ -65,9 +70,9 @@ int live_rc6_manual(void *arg)
> res[0] = rc6_residency(rc6);
>
> dt = ktime_get();
> - rc0_power = librapl_energy_uJ();
> - msleep(1000);
> - rc0_power = librapl_energy_uJ() - rc0_power;
> + rc0_power[0] = librapl_energy_uJ();
> + msleep(sleep_time);
> + rc0_power[1] = librapl_energy_uJ() - rc0_power[0];
> dt = ktime_sub(ktime_get(), dt);
> res[1] = rc6_residency(rc6);
> rc0_freq = intel_rps_read_actual_frequency_fw(rps);
> @@ -79,11 +84,12 @@ int live_rc6_manual(void *arg)
> }
>
> if (has_power) {
> - rc0_power = div64_u64(NSEC_PER_SEC * rc0_power,
> - ktime_to_ns(dt));
> - if (!rc0_power) {
> + rc0_power[2] = div64_u64(NSEC_PER_SEC * rc0_power[1],
> + ktime_to_ns(dt));
> +
> + if (!rc0_power[2]) {
> if (rc0_freq)
> - pr_debug("No power measured while in RC0! GPU Freq: %u in RC0\n",
> + pr_debug("No power measured while in RC0! GPU Freq: %uMHz in RC0\n",
> rc0_freq);
> else
> pr_err("No power and freq measured while in RC0\n");
> @@ -98,10 +104,10 @@ int live_rc6_manual(void *arg)
> res[0] = rc6_residency(rc6);
> intel_uncore_forcewake_flush(rc6_to_uncore(rc6), FORCEWAKE_ALL);
> dt = ktime_get();
> - rc6_power = librapl_energy_uJ();
> - msleep(1000);
> + rc6_power[0] = librapl_energy_uJ();
> + msleep(sleep_time);
> rc6_freq = intel_rps_read_actual_frequency_fw(rps);
> - rc6_power = librapl_energy_uJ() - rc6_power;
> + rc6_power[1] = librapl_energy_uJ() - rc6_power[0];
> dt = ktime_sub(ktime_get(), dt);
> res[1] = rc6_residency(rc6);
> if (res[1] == res[0]) {
> @@ -113,13 +119,24 @@ int live_rc6_manual(void *arg)
> }
>
> if (has_power) {
> - rc6_power = div64_u64(NSEC_PER_SEC * rc6_power,
> - ktime_to_ns(dt));
> - pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
> - rc0_power, rc6_power);
> - if (2 * rc6_power > rc0_power) {
> - pr_err("GPU leaked energy while in RC6! GPU Freq: %u in RC6 and %u in RC0\n",
> - rc6_freq, rc0_freq);
> + rc6_power[2] = div64_u64(NSEC_PER_SEC * rc6_power[1],
> + ktime_to_ns(dt));
> + pr_info("GPU consumed %lluuW in RC0 and %lluuW in RC6\n",
> + rc0_power[2], rc6_power[2]);
> +
> + if (2 * rc6_power[2] > rc0_power[2]) {
> + pr_err("GPU leaked energy while in RC6!\n"
> + "GPU Freq: %uMHz in RC6 and %uMHz in RC0\n"
> + "RC0 energy before & after sleep respectively: %lluuJ %lluuJ\n"
> + "RC6 energy before & after sleep respectively: %lluuJ %lluuJ\n",
> + rc6_freq, rc0_freq, rc0_power[0], rc0_power[1],
> + rc6_power[0], rc6_power[1]);
> +
> + diff = res[1] - res[0];
> + threshold = (9 * NSEC_PER_MSEC * sleep_time) / 10;
> + if (diff < threshold)
> + pr_err("Did not enter RC6 properly, RC6 start residency=%lluns, RC6 end residency=%lluns\n",
> + res[0], res[1]);
Please check whether the reported BAT failures are related to this patch.
Similar errors were seen with other selftest-related patches too.
Otherwise this looks good to me.
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Regards,
Badal
> err = -EINVAL;
> goto out_unlock;
> }
* ✓ i915.CI.BAT: success for drm/i915/selftests: Refactor RC6 power measurement and error handling (rev2)
From: Patchwork @ 2025-03-11 13:57 UTC (permalink / raw)
To: Sk Anirban; +Cc: intel-gfx
[-- Attachment #1: Type: text/plain, Size: 6516 bytes --]
== Series Details ==
Series: drm/i915/selftests: Refactor RC6 power measurement and error handling (rev2)
URL : https://patchwork.freedesktop.org/series/145766/
State : success
== Summary ==
CI Bug Log - changes from CI_DRM_16237 -> Patchwork_145766v2
====================================================
Summary
-------
**SUCCESS**
No regressions found.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/index.html
Participating hosts (44 -> 43)
------------------------------
Missing (1): fi-snb-2520m
Known issues
------------
Here are the changes found in Patchwork_145766v2 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@fbdev@info:
- fi-kbl-8809g: NOTRUN -> [SKIP][1] ([i915#1849])
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@fbdev@info.html
* igt@gem_huc_copy@huc-copy:
- fi-kbl-8809g: NOTRUN -> [SKIP][2] ([i915#2190])
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@gem_huc_copy@huc-copy.html
* igt@gem_lmem_swapping@parallel-random-engines:
- fi-kbl-8809g: NOTRUN -> [SKIP][3] ([i915#4613]) +3 other tests skip
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@gem_lmem_swapping@parallel-random-engines.html
* igt@i915_module_load@load:
- bat-mtlp-9: [PASS][4] -> [DMESG-WARN][5] ([i915#13494])
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-mtlp-9/igt@i915_module_load@load.html
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-mtlp-9/igt@i915_module_load@load.html
* igt@i915_pm_rpm@module-reload:
- fi-rkl-11600: [PASS][6] -> [ABORT][7] ([i915#13894])
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/fi-rkl-11600/igt@i915_pm_rpm@module-reload.html
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-rkl-11600/igt@i915_pm_rpm@module-reload.html
* igt@i915_selftest@live:
- bat-arlh-3: [PASS][8] -> [DMESG-FAIL][9] ([i915#12061] / [i915#12435])
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-arlh-3/igt@i915_selftest@live.html
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-arlh-3/igt@i915_selftest@live.html
- bat-rplp-1: [PASS][10] -> [ABORT][11] ([i915#9413]) +1 other test abort
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-rplp-1/igt@i915_selftest@live.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-rplp-1/igt@i915_selftest@live.html
* igt@i915_selftest@live@workarounds:
- bat-arlh-3: [PASS][12] -> [DMESG-FAIL][13] ([i915#12061])
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-arlh-3/igt@i915_selftest@live@workarounds.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-arlh-3/igt@i915_selftest@live@workarounds.html
* igt@kms_dsc@dsc-basic:
- fi-kbl-8809g: NOTRUN -> [SKIP][14] +34 other tests skip
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-kbl-8809g/igt@kms_dsc@dsc-basic.html
* igt@kms_flip@basic-flip-vs-dpms@c-dp2:
- fi-cfl-8109u: [PASS][15] -> [DMESG-WARN][16] ([i915#13770]) +2 other tests dmesg-warn
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/fi-cfl-8109u/igt@kms_flip@basic-flip-vs-dpms@c-dp2.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-cfl-8109u/igt@kms_flip@basic-flip-vs-dpms@c-dp2.html
* igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
- bat-dg2-11: [PASS][17] -> [SKIP][18] ([i915#9197]) +1 other test skip
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
#### Possible fixes ####
* igt@dmabuf@all-tests@dma_fence_chain:
- fi-bsw-nick: [INCOMPLETE][19] ([i915#12904]) -> [PASS][20] +1 other test pass
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/fi-bsw-nick/igt@dmabuf@all-tests@dma_fence_chain.html
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/fi-bsw-nick/igt@dmabuf@all-tests@dma_fence_chain.html
* igt@i915_module_load@reload:
- bat-twl-2: [DMESG-WARN][21] ([i915#13736]) -> [PASS][22]
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-twl-2/igt@i915_module_load@reload.html
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-twl-2/igt@i915_module_load@reload.html
* igt@i915_selftest@live@workarounds:
- bat-arls-5: [DMESG-FAIL][23] ([i915#12061]) -> [PASS][24] +1 other test pass
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16237/bat-arls-5/igt@i915_selftest@live@workarounds.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/bat-arls-5/igt@i915_selftest@live@workarounds.html
[i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
[i915#12435]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12435
[i915#12904]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
[i915#13494]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13494
[i915#13736]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13736
[i915#13770]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13770
[i915#13894]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13894
[i915#1849]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/1849
[i915#2190]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/2190
[i915#4613]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/4613
[i915#9197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9197
[i915#9413]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9413
Build changes
-------------
* Linux: CI_DRM_16237 -> Patchwork_145766v2
CI-20190529: 20190529
CI_DRM_16237: 6f2e5afc45e3ca8bf46427ae21a9c1029ea6cdb3 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_8263: 25f60274b3dd14d35a7f32558b489ab7a02b6223 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
Patchwork_145766v2: 6f2e5afc45e3ca8bf46427ae21a9c1029ea6cdb3 @ git://anongit.freedesktop.org/gfx-ci/linux
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_145766v2/index.html
* Re: [PATCH v3] drm/i915/selftests: Refactor RC6 power measurement and error handling
From: Anirban, Sk @ 2025-03-11 16:17 UTC (permalink / raw)
To: Nilawar, Badal, intel-gfx; +Cc: anshuman.gupta, riana.tauro, karthik.poosa
On 11-03-2025 16:47, Nilawar, Badal wrote:
>
> On 07-03-2025 00:41, sk.anirban@intel.com wrote:
>> From: Sk Anirban <sk.anirban@intel.com>
>>
>> Refactor power measurement logic to store and compare energy values.
>> Introduce a threshold check to ensure the GPU enters RC6 properly.
>>
>> v2:
>> - Improved commit message (Badal)
>>
>> v3:
>> - Reorder threshold check (Badal)
>>
>> Signed-off-by: Sk Anirban <sk.anirban@intel.com>
>> ---
>> drivers/gpu/drm/i915/gt/selftest_rc6.c | 59 +++++++++++++++++---------
>> 1 file changed, 38 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c
>> b/drivers/gpu/drm/i915/gt/selftest_rc6.c
>> index 908483ab0bc8..5364e50be638 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_rc6.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
>> @@ -33,15 +33,20 @@ int live_rc6_manual(void *arg)
>> {
>> struct intel_gt *gt = arg;
>> struct intel_rc6 *rc6 = &gt->rc6;
>> - u64 rc0_power, rc6_power;
>> + struct intel_rps *rps = &gt->rps;
>> intel_wakeref_t wakeref;
>> + u64 sleep_time = 1000;
>> + u32 rc0_freq = 0;
>> + u32 rc6_freq = 0;
>> + u64 rc0_power[3];
>> + u64 rc6_power[3];
>> bool has_power;
>> + u64 threshold;
>> ktime_t dt;
>> u64 res[2];
>> int err = 0;
>> - u32 rc0_freq = 0;
>> - u32 rc6_freq = 0;
>> - struct intel_rps *rps = &gt->rps;
>> + u64 diff;
>> +
>> /*
>> * Our claim is that we can "encourage" the GPU to enter rc6 at
>> will.
>> @@ -65,9 +70,9 @@ int live_rc6_manual(void *arg)
>> res[0] = rc6_residency(rc6);
>> dt = ktime_get();
>> - rc0_power = librapl_energy_uJ();
>> - msleep(1000);
>> - rc0_power = librapl_energy_uJ() - rc0_power;
>> + rc0_power[0] = librapl_energy_uJ();
>> + msleep(sleep_time);
>> + rc0_power[1] = librapl_energy_uJ() - rc0_power[0];
>> dt = ktime_sub(ktime_get(), dt);
>> res[1] = rc6_residency(rc6);
>> rc0_freq = intel_rps_read_actual_frequency_fw(rps);
>> @@ -79,11 +84,12 @@ int live_rc6_manual(void *arg)
>> }
>> if (has_power) {
>> - rc0_power = div64_u64(NSEC_PER_SEC * rc0_power,
>> - ktime_to_ns(dt));
>> - if (!rc0_power) {
>> + rc0_power[2] = div64_u64(NSEC_PER_SEC * rc0_power[1],
>> + ktime_to_ns(dt));
>> +
>> + if (!rc0_power[2]) {
>> if (rc0_freq)
>> - pr_debug("No power measured while in RC0! GPU Freq:
>> %u in RC0\n",
>> + pr_debug("No power measured while in RC0! GPU Freq:
>> %uMHz in RC0\n",
>> rc0_freq);
>> else
>> pr_err("No power and freq measured while in RC0\n");
>> @@ -98,10 +104,10 @@ int live_rc6_manual(void *arg)
>> res[0] = rc6_residency(rc6);
>> intel_uncore_forcewake_flush(rc6_to_uncore(rc6), FORCEWAKE_ALL);
>> dt = ktime_get();
>> - rc6_power = librapl_energy_uJ();
>> - msleep(1000);
>> + rc6_power[0] = librapl_energy_uJ();
>> + msleep(sleep_time);
>> rc6_freq = intel_rps_read_actual_frequency_fw(rps);
>> - rc6_power = librapl_energy_uJ() - rc6_power;
>> + rc6_power[1] = librapl_energy_uJ() - rc6_power[0];
>> dt = ktime_sub(ktime_get(), dt);
>> res[1] = rc6_residency(rc6);
>> if (res[1] == res[0]) {
>> @@ -113,13 +119,24 @@ int live_rc6_manual(void *arg)
>> }
>> if (has_power) {
>> - rc6_power = div64_u64(NSEC_PER_SEC * rc6_power,
>> - ktime_to_ns(dt));
>> - pr_info("GPU consumed %llduW in RC0 and %llduW in RC6\n",
>> - rc0_power, rc6_power);
>> - if (2 * rc6_power > rc0_power) {
>> - pr_err("GPU leaked energy while in RC6! GPU Freq: %u in
>> RC6 and %u in RC0\n",
>> - rc6_freq, rc0_freq);
>> + rc6_power[2] = div64_u64(NSEC_PER_SEC * rc6_power[1],
>> + ktime_to_ns(dt));
>> + pr_info("GPU consumed %lluuW in RC0 and %lluuW in RC6\n",
>> + rc0_power[2], rc6_power[2]);
>> +
>> + if (2 * rc6_power[2] > rc0_power[2]) {
>> + pr_err("GPU leaked energy while in RC6!\n"
>> + "GPU Freq: %uMHz in RC6 and %uMHz in RC0\n"
>> + "RC0 energy before & after sleep respectively:
>> %lluuJ %lluuJ\n"
>> + "RC6 energy before & after sleep respectively:
>> %lluuJ %lluuJ\n",
>> + rc6_freq, rc0_freq, rc0_power[0], rc0_power[1],
>> + rc6_power[0], rc6_power[1]);
>> +
>> + diff = res[1] - res[0];
>> + threshold = (9 * NSEC_PER_MSEC * sleep_time) / 10;
>> + if (diff < threshold)
>> + pr_err("Did not enter RC6 properly, RC6 start
>> residency=%lluns, RC6 end residency=%lluns\n",
>> + res[0], res[1]);
>
> Please check whether the reported BAT failures are related to this patch.
> Similar errors were seen with other selftest-related patches too.
> Otherwise this looks good to me.
>
> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
>
> Regards,
> Badal
The BAT failures were not related to these changes and have been re-reported.
Thanks,
Anirban
>
>> err = -EINVAL;
>> goto out_unlock;
>> }