* [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-20 0:29 ` Vinay Belgaumkar 0 siblings, 0 replies; 15+ messages in thread From: Vinay Belgaumkar @ 2022-10-20 0:29 UTC (permalink / raw) To: intel-gfx, dri-devel Waitboost (when SLPC is enabled) results in a H2G message. This can result in thousands of messages during a stress test and fill up an already full CTB. There is no need to request for RP0 if GuC is already requesting the same. v2: Add the tracing back, and check requested freq in the worker thread (Tvrtko) v3: Check requested freq in dec_waiters as well Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> --- drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index fc23c562d9b2..18b75cf08d1b 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) if (rps_uses_slpc(rps)) { slpc = rps_to_slpc(rps); + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", + rq->fence.context, rq->fence.seqno); + /* Return if old value is non zero */ if (!atomic_fetch_inc(&slpc->num_waiters)) schedule_work(&slpc->boost_work); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c index b7cdeec44bd3..9dbdbab1515a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) static void slpc_boost_work(struct work_struct *work) { struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; int err; /* * Raise min freq to boost. It's possible that * this is greater than current max. But it will * certainly be limited by RP0. An error setting - * the min param is not fatal. + * the min param is not fatal. No need to boost + * if we are already requesting it. */ + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) + return; + mutex_lock(&slpc->lock); if (atomic_read(&slpc->num_waiters)) { err = slpc_force_min_freq(slpc, slpc->boost_freq); @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) { + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; /* * Return min back to the softlimit. * This is called during request retire, @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) * set_param fails. */ mutex_lock(&slpc->lock); - if (atomic_dec_and_test(&slpc->num_waiters)) - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); + if (atomic_dec_and_test(&slpc->num_waiters)) { + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); + } mutex_unlock(&slpc->lock); } -- 2.35.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-20 0:29 ` Vinay Belgaumkar 0 siblings, 0 replies; 15+ messages in thread From: Vinay Belgaumkar @ 2022-10-20 0:29 UTC (permalink / raw) To: intel-gfx, dri-devel; +Cc: Vinay Belgaumkar Waitboost (when SLPC is enabled) results in a H2G message. This can result in thousands of messages during a stress test and fill up an already full CTB. There is no need to request for RP0 if GuC is already requesting the same. v2: Add the tracing back, and check requested freq in the worker thread (Tvrtko) v3: Check requested freq in dec_waiters as well Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> --- drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index fc23c562d9b2..18b75cf08d1b 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) if (rps_uses_slpc(rps)) { slpc = rps_to_slpc(rps); + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", + rq->fence.context, rq->fence.seqno); + /* Return if old value is non zero */ if (!atomic_fetch_inc(&slpc->num_waiters)) schedule_work(&slpc->boost_work); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c index b7cdeec44bd3..9dbdbab1515a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) static void slpc_boost_work(struct work_struct *work) { struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; int err; /* * Raise min freq to boost. It's possible that * this is greater than current max. But it will * certainly be limited by RP0. An error setting - * the min param is not fatal. + * the min param is not fatal. No need to boost + * if we are already requesting it. */ + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) + return; + mutex_lock(&slpc->lock); if (atomic_read(&slpc->num_waiters)) { err = slpc_force_min_freq(slpc, slpc->boost_freq); @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) { + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; /* * Return min back to the softlimit. * This is called during request retire, @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) * set_param fails. */ mutex_lock(&slpc->lock); - if (atomic_dec_and_test(&slpc->num_waiters)) - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); + if (atomic_dec_and_test(&slpc->num_waiters)) { + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); + } mutex_unlock(&slpc->lock); } -- 2.35.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/slpc: Optmize waitboost for SLPC (rev3) 2022-10-20 0:29 ` Vinay Belgaumkar (?) @ 2022-10-20 1:22 ` Patchwork -1 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2022-10-20 1:22 UTC (permalink / raw) To: Vinay Belgaumkar; +Cc: intel-gfx [-- Attachment #1: Type: text/plain, Size: 14463 bytes --] == Series Details == Series: drm/i915/slpc: Optmize waitboost for SLPC (rev3) URL : https://patchwork.freedesktop.org/series/109840/ State : failure == Summary == CI Bug Log - changes from CI_DRM_12261 -> Patchwork_109840v3 ==================================================== Summary ------- **FAILURE** Serious unknown changes coming with Patchwork_109840v3 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_109840v3, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/index.html Participating hosts (42 -> 41) ------------------------------ Additional (2): fi-rkl-11600 fi-icl-u2 Missing (3): fi-kbl-soraka fi-cml-u2 fi-hsw-4770 Possible new issues ------------------- Here are the unknown changes that may have been introduced in Patchwork_109840v3: ### IGT changes ### #### Possible regressions #### * igt@i915_selftest@live@gem_contexts: - fi-apl-guc: [PASS][1] -> [DMESG-FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/fi-apl-guc/igt@i915_selftest@live@gem_contexts.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-apl-guc/igt@i915_selftest@live@gem_contexts.html * igt@i915_selftest@live@migrate: - bat-adlp-4: [PASS][3] -> [INCOMPLETE][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-adlp-4/igt@i915_selftest@live@migrate.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-adlp-4/igt@i915_selftest@live@migrate.html #### Suppressed #### The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. * igt@i915_selftest@live@slpc: - {bat-dg2-11}: [PASS][5] -> [DMESG-FAIL][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-dg2-11/igt@i915_selftest@live@slpc.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-dg2-11/igt@i915_selftest@live@slpc.html Known issues ------------ Here are the changes found in Patchwork_109840v3 that come from known issues: ### IGT changes ### #### Issues hit #### * igt@gem_exec_suspend@basic-s3@smem: - fi-rkl-11600: NOTRUN -> [INCOMPLETE][7] ([i915#6179]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@gem_exec_suspend@basic-s3@smem.html * igt@gem_huc_copy@huc-copy: - fi-icl-u2: NOTRUN -> [SKIP][8] ([i915#2190]) [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@gem_huc_copy@huc-copy.html - fi-rkl-11600: NOTRUN -> [SKIP][9] ([i915#2190]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@gem_huc_copy@huc-copy.html * igt@gem_lmem_swapping@basic: - fi-rkl-11600: NOTRUN -> [SKIP][10] ([i915#4613]) +3 similar issues [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@gem_lmem_swapping@basic.html * igt@gem_lmem_swapping@random-engines: - fi-icl-u2: NOTRUN -> [SKIP][11] ([i915#4613]) +3 similar issues [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@gem_lmem_swapping@random-engines.html * igt@gem_tiled_pread_basic: - fi-rkl-11600: NOTRUN -> [SKIP][12] ([i915#3282]) [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@gem_tiled_pread_basic.html * igt@i915_pm_backlight@basic-brightness: - fi-rkl-11600: NOTRUN -> [SKIP][13] ([i915#3012]) [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@i915_pm_backlight@basic-brightness.html * igt@i915_suspend@basic-s3-without-i915: - fi-rkl-11600: NOTRUN -> [FAIL][14] ([fdo#103375]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@i915_suspend@basic-s3-without-i915.html * igt@kms_chamelium@common-hpd-after-suspend: - fi-ivb-3770: NOTRUN -> [SKIP][15] ([fdo#109271] / [fdo#111827]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-ivb-3770/igt@kms_chamelium@common-hpd-after-suspend.html - fi-bdw-5557u: NOTRUN -> [SKIP][16] ([fdo#109271] / [fdo#111827]) [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-bdw-5557u/igt@kms_chamelium@common-hpd-after-suspend.html * igt@kms_chamelium@hdmi-edid-read: - fi-rkl-11600: NOTRUN -> [SKIP][17] ([fdo#111827]) +7 similar issues [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@kms_chamelium@hdmi-edid-read.html * igt@kms_chamelium@hdmi-hpd-fast: - fi-icl-u2: NOTRUN -> [SKIP][18] ([fdo#111827]) +8 similar issues [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor: - fi-rkl-11600: NOTRUN -> [SKIP][19] ([i915#4103]) [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@kms_cursor_legacy@basic-busy-flip-before-cursor.html - fi-icl-u2: NOTRUN -> [SKIP][20] ([i915#4103]) [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor.html * igt@kms_force_connector_basic@force-load-detect: - fi-rkl-11600: NOTRUN -> [SKIP][21] ([fdo#109285] / [i915#4098]) [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@kms_force_connector_basic@force-load-detect.html - fi-icl-u2: NOTRUN -> [SKIP][22] ([fdo#109285]) [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@kms_force_connector_basic@force-load-detect.html * igt@kms_psr@primary_page_flip: - fi-rkl-11600: NOTRUN -> [SKIP][23] ([i915#1072]) +3 similar issues [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@kms_psr@primary_page_flip.html * igt@kms_setmode@basic-clone-single-crtc: - fi-rkl-11600: NOTRUN -> [SKIP][24] ([i915#3555] / [i915#4098]) [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@kms_setmode@basic-clone-single-crtc.html - fi-icl-u2: NOTRUN -> [SKIP][25] ([i915#3555]) [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@kms_setmode@basic-clone-single-crtc.html * igt@prime_vgem@basic-read: - fi-rkl-11600: NOTRUN -> [SKIP][26] ([fdo#109295] / [i915#3291] / [i915#3708]) +2 similar issues [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@prime_vgem@basic-read.html * igt@prime_vgem@basic-userptr: - fi-icl-u2: NOTRUN -> [SKIP][27] ([fdo#109295] / [i915#3301]) [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-icl-u2/igt@prime_vgem@basic-userptr.html - fi-rkl-11600: NOTRUN -> [SKIP][28] ([fdo#109295] / [i915#3301] / [i915#3708]) [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-rkl-11600/igt@prime_vgem@basic-userptr.html * igt@runner@aborted: - bat-adlp-4: NOTRUN -> [FAIL][29] ([i915#4312]) [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-adlp-4/igt@runner@aborted.html #### Possible fixes #### * igt@fbdev@read: - {bat-rpls-2}: [SKIP][30] ([i915#2582]) -> [PASS][31] +4 similar issues [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-rpls-2/igt@fbdev@read.html [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-rpls-2/igt@fbdev@read.html * igt@gem_exec_gttfill@basic: - fi-pnv-d510: [SKIP][32] ([fdo#109271]) -> [PASS][33] +1 similar issue [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/fi-pnv-d510/igt@gem_exec_gttfill@basic.html [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-pnv-d510/igt@gem_exec_gttfill@basic.html * igt@gem_exec_suspend@basic-s3@smem: - {bat-adlm-1}: [DMESG-WARN][34] ([i915#2867]) -> [PASS][35] [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-adlm-1/igt@gem_exec_suspend@basic-s3@smem.html [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-adlm-1/igt@gem_exec_suspend@basic-s3@smem.html * igt@gem_huc_copy@huc-copy: - {bat-dg2-8}: [FAIL][36] ([i915#7029]) -> [PASS][37] [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-dg2-8/igt@gem_huc_copy@huc-copy.html [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-dg2-8/igt@gem_huc_copy@huc-copy.html * igt@i915_pm_rpm@basic-rte: - {bat-rplp-1}: [DMESG-WARN][38] -> [PASS][39] [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-rplp-1/igt@i915_pm_rpm@basic-rte.html [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-rplp-1/igt@i915_pm_rpm@basic-rte.html * igt@i915_pm_rpm@module-reload: - {bat-rpls-2}: [DMESG-WARN][40] ([i915#5537]) -> [PASS][41] [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-rpls-2/igt@i915_pm_rpm@module-reload.html [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-rpls-2/igt@i915_pm_rpm@module-reload.html * igt@i915_selftest@live@gem_contexts: - fi-skl-guc: [DMESG-FAIL][42] ([i915#7270]) -> [PASS][43] [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/fi-skl-guc/igt@i915_selftest@live@gem_contexts.html [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-skl-guc/igt@i915_selftest@live@gem_contexts.html - {fi-jsl-1}: [DMESG-FAIL][44] ([i915#7270]) -> [PASS][45] [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/fi-jsl-1/igt@i915_selftest@live@gem_contexts.html [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-jsl-1/igt@i915_selftest@live@gem_contexts.html * igt@i915_selftest@live@gt_heartbeat: - {bat-jsl-1}: [DMESG-FAIL][46] -> [PASS][47] [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-jsl-1/igt@i915_selftest@live@gt_heartbeat.html [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-jsl-1/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@hangcheck: - fi-ivb-3770: [INCOMPLETE][48] ([i915#3303] / [i915#7122]) -> [PASS][49] [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/fi-ivb-3770/igt@i915_selftest@live@hangcheck.html [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-ivb-3770/igt@i915_selftest@live@hangcheck.html * igt@i915_selftest@live@hugepages: - {bat-rpls-1}: [DMESG-WARN][50] ([i915#5278]) -> [PASS][51] [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/bat-rpls-1/igt@i915_selftest@live@hugepages.html [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/bat-rpls-1/igt@i915_selftest@live@hugepages.html * igt@i915_suspend@basic-s3-without-i915: - fi-bdw-5557u: [INCOMPLETE][52] ([i915#146] / [i915#6712]) -> [PASS][53] [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12261/fi-bdw-5557u/igt@i915_suspend@basic-s3-without-i915.html [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/fi-bdw-5557u/igt@i915_suspend@basic-s3-without-i915.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375 [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285 [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295 [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827 [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072 [i915#146]: https://gitlab.freedesktop.org/drm/intel/issues/146 [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845 [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190 [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582 [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867 [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012 [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282 [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291 [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301 [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303 [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555 [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708 [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098 [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103 [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312 [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613 [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983 [i915#5278]: https://gitlab.freedesktop.org/drm/intel/issues/5278 [i915#5537]: https://gitlab.freedesktop.org/drm/intel/issues/5537 [i915#6179]: https://gitlab.freedesktop.org/drm/intel/issues/6179 [i915#6621]: https://gitlab.freedesktop.org/drm/intel/issues/6621 [i915#6687]: https://gitlab.freedesktop.org/drm/intel/issues/6687 [i915#6712]: https://gitlab.freedesktop.org/drm/intel/issues/6712 [i915#6816]: https://gitlab.freedesktop.org/drm/intel/issues/6816 [i915#7029]: https://gitlab.freedesktop.org/drm/intel/issues/7029 [i915#7122]: https://gitlab.freedesktop.org/drm/intel/issues/7122 [i915#7270]: https://gitlab.freedesktop.org/drm/intel/issues/7270 Build changes ------------- * Linux: CI_DRM_12261 -> Patchwork_109840v3 CI-20190529: 20190529 CI_DRM_12261: 41447224fdfbfbfd1e9ffa5fabc9d277f9c02f8a @ git://anongit.freedesktop.org/gfx-ci/linux IGT_7019: fdbafce2b74e84739bb1d81223ae6f01fb442980 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_109840v3: 41447224fdfbfbfd1e9ffa5fabc9d277f9c02f8a @ git://anongit.freedesktop.org/gfx-ci/linux ### Linux commits a137487f5cc4 drm/i915/slpc: Optmize waitboost for SLPC == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109840v3/index.html [-- Attachment #2: Type: text/html, Size: 16752 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC 2022-10-20 0:29 ` Vinay Belgaumkar @ 2022-10-20 18:33 ` Dixit, Ashutosh -1 siblings, 0 replies; 15+ messages in thread From: Dixit, Ashutosh @ 2022-10-20 18:33 UTC (permalink / raw) To: Vinay Belgaumkar; +Cc: intel-gfx, dri-devel On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > Hi Vinay, > Waitboost (when SLPC is enabled) results in a H2G message. This can result > in thousands of messages during a stress test and fill up an already full > CTB. There is no need to request for RP0 if GuC is already requesting the > same. But how are we sure that the freq will remain at RP0 in the future (when the waiting request or any requests which are ahead execute)? In the current waitboost implementation, set_param is sent to GuC ahead of the waiting request to ensure that the freq would be max when this waiting request executed on the GPU and the freq is kept at max till this request retires (considering just one waiting request). How can we ensure this if we don't send the waitboost set_param to GuC? I had assumed we'll do this optimization for server parts where min is already RP0 in which case we can completely disable waitboost. But this patch is something else. Thanks. -- Ashutosh > > v2: Add the tracing back, and check requested freq > in the worker thread (Tvrtko) > v3: Check requested freq in dec_waiters as well > > Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ > drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- > 2 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > index fc23c562d9b2..18b75cf08d1b 100644 > --- a/drivers/gpu/drm/i915/gt/intel_rps.c > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) > if (rps_uses_slpc(rps)) { > slpc = rps_to_slpc(rps); > > + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > + rq->fence.context, rq->fence.seqno); > + > /* Return if old value is non zero */ > if (!atomic_fetch_inc(&slpc->num_waiters)) > schedule_work(&slpc->boost_work); > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > index b7cdeec44bd3..9dbdbab1515a 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) > static void slpc_boost_work(struct work_struct *work) > { > struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); > + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > int err; > > /* > * Raise min freq to boost. It's possible that > * this is greater than current max. But it will > * certainly be limited by RP0. An error setting > - * the min param is not fatal. > + * the min param is not fatal. No need to boost > + * if we are already requesting it. > */ > + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) > + return; > + > mutex_lock(&slpc->lock); > if (atomic_read(&slpc->num_waiters)) { > err = slpc_force_min_freq(slpc, slpc->boost_freq); > @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) > > void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > { > + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > /* > * Return min back to the softlimit. > * This is called during request retire, > @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > * set_param fails. > */ > mutex_lock(&slpc->lock); > - if (atomic_dec_and_test(&slpc->num_waiters)) > - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > + if (atomic_dec_and_test(&slpc->num_waiters)) { > + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) > + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > + } > mutex_unlock(&slpc->lock); > } > > -- > 2.35.1 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-20 18:33 ` Dixit, Ashutosh 0 siblings, 0 replies; 15+ messages in thread From: Dixit, Ashutosh @ 2022-10-20 18:33 UTC (permalink / raw) To: Vinay Belgaumkar; +Cc: Tvrtko Ursulin, intel-gfx, dri-devel On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > Hi Vinay, > Waitboost (when SLPC is enabled) results in a H2G message. This can result > in thousands of messages during a stress test and fill up an already full > CTB. There is no need to request for RP0 if GuC is already requesting the > same. But how are we sure that the freq will remain at RP0 in the future (when the waiting request or any requests which are ahead execute)? In the current waitboost implementation, set_param is sent to GuC ahead of the waiting request to ensure that the freq would be max when this waiting request executed on the GPU and the freq is kept at max till this request retires (considering just one waiting request). How can we ensure this if we don't send the waitboost set_param to GuC? I had assumed we'll do this optimization for server parts where min is already RP0 in which case we can completely disable waitboost. But this patch is something else. Thanks. -- Ashutosh > > v2: Add the tracing back, and check requested freq > in the worker thread (Tvrtko) > v3: Check requested freq in dec_waiters as well > > Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ > drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- > 2 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > index fc23c562d9b2..18b75cf08d1b 100644 > --- a/drivers/gpu/drm/i915/gt/intel_rps.c > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) > if (rps_uses_slpc(rps)) { > slpc = rps_to_slpc(rps); > > + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > + rq->fence.context, rq->fence.seqno); > + > /* Return if old value is non zero */ > if (!atomic_fetch_inc(&slpc->num_waiters)) > schedule_work(&slpc->boost_work); > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > index b7cdeec44bd3..9dbdbab1515a 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) > static void slpc_boost_work(struct work_struct *work) > { > struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); > + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > int err; > > /* > * Raise min freq to boost. It's possible that > * this is greater than current max. But it will > * certainly be limited by RP0. An error setting > - * the min param is not fatal. > + * the min param is not fatal. No need to boost > + * if we are already requesting it. > */ > + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) > + return; > + > mutex_lock(&slpc->lock); > if (atomic_read(&slpc->num_waiters)) { > err = slpc_force_min_freq(slpc, slpc->boost_freq); > @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) > > void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > { > + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > /* > * Return min back to the softlimit. > * This is called during request retire, > @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > * set_param fails. > */ > mutex_lock(&slpc->lock); > - if (atomic_dec_and_test(&slpc->num_waiters)) > - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > + if (atomic_dec_and_test(&slpc->num_waiters)) { > + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) > + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > + } > mutex_unlock(&slpc->lock); > } > > -- > 2.35.1 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC 2022-10-20 18:33 ` Dixit, Ashutosh @ 2022-10-20 20:16 ` Belgaumkar, Vinay -1 siblings, 0 replies; 15+ messages in thread From: Belgaumkar, Vinay @ 2022-10-20 20:16 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: > On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > Hi Vinay, > >> Waitboost (when SLPC is enabled) results in a H2G message. This can result >> in thousands of messages during a stress test and fill up an already full >> CTB. There is no need to request for RP0 if GuC is already requesting the >> same. > But how are we sure that the freq will remain at RP0 in the future (when > the waiting request or any requests which are ahead execute)? > > In the current waitboost implementation, set_param is sent to GuC ahead of > the waiting request to ensure that the freq would be max when this waiting > request executed on the GPU and the freq is kept at max till this request > retires (considering just one waiting request). How can we ensure this if > we don't send the waitboost set_param to GuC? There is no way to guarantee the frequency will remain at RP0 till the request retires. As a theoretical example, lets say the request boosted freq to RP0, but a user changed min freq using sysfs immediately after. Waitboost is done by a pending request to "hurry" the current requests. If GT is already at boost frequency, that purpose is served. Also, host algorithm already has this optimization as well. Thanks, Vinay. > > I had assumed we'll do this optimization for server parts where min is > already RP0 in which case we can completely disable waitboost. But this > patch is something else. > > Thanks. > -- > Ashutosh > > >> v2: Add the tracing back, and check requested freq >> in the worker thread (Tvrtko) >> v3: Check requested freq in dec_waiters as well >> >> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >> --- >> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ >> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- >> 2 files changed, 14 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >> index fc23c562d9b2..18b75cf08d1b 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) >> if (rps_uses_slpc(rps)) { >> slpc = rps_to_slpc(rps); >> >> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >> + rq->fence.context, rq->fence.seqno); >> + >> /* Return if old value is non zero */ >> if (!atomic_fetch_inc(&slpc->num_waiters)) >> schedule_work(&slpc->boost_work); >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >> index b7cdeec44bd3..9dbdbab1515a 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) >> static void slpc_boost_work(struct work_struct *work) >> { >> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >> int err; >> >> /* >> * Raise min freq to boost. It's possible that >> * this is greater than current max. But it will >> * certainly be limited by RP0. An error setting >> - * the min param is not fatal. >> + * the min param is not fatal. No need to boost >> + * if we are already requesting it. >> */ >> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) >> + return; >> + >> mutex_lock(&slpc->lock); >> if (atomic_read(&slpc->num_waiters)) { >> err = slpc_force_min_freq(slpc, slpc->boost_freq); >> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) >> >> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >> { >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >> /* >> * Return min back to the softlimit. >> * This is called during request retire, >> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >> * set_param fails. >> */ >> mutex_lock(&slpc->lock); >> - if (atomic_dec_and_test(&slpc->num_waiters)) >> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >> + if (atomic_dec_and_test(&slpc->num_waiters)) { >> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) >> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >> + } >> mutex_unlock(&slpc->lock); >> } >> >> -- >> 2.35.1 >> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-20 20:16 ` Belgaumkar, Vinay 0 siblings, 0 replies; 15+ messages in thread From: Belgaumkar, Vinay @ 2022-10-20 20:16 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: Tvrtko Ursulin, intel-gfx, dri-devel On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: > On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > Hi Vinay, > >> Waitboost (when SLPC is enabled) results in a H2G message. This can result >> in thousands of messages during a stress test and fill up an already full >> CTB. There is no need to request for RP0 if GuC is already requesting the >> same. > But how are we sure that the freq will remain at RP0 in the future (when > the waiting request or any requests which are ahead execute)? > > In the current waitboost implementation, set_param is sent to GuC ahead of > the waiting request to ensure that the freq would be max when this waiting > request executed on the GPU and the freq is kept at max till this request > retires (considering just one waiting request). How can we ensure this if > we don't send the waitboost set_param to GuC? There is no way to guarantee the frequency will remain at RP0 till the request retires. As a theoretical example, lets say the request boosted freq to RP0, but a user changed min freq using sysfs immediately after. Waitboost is done by a pending request to "hurry" the current requests. If GT is already at boost frequency, that purpose is served. Also, host algorithm already has this optimization as well. Thanks, Vinay. > > I had assumed we'll do this optimization for server parts where min is > already RP0 in which case we can completely disable waitboost. But this > patch is something else. > > Thanks. > -- > Ashutosh > > >> v2: Add the tracing back, and check requested freq >> in the worker thread (Tvrtko) >> v3: Check requested freq in dec_waiters as well >> >> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >> --- >> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ >> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- >> 2 files changed, 14 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >> index fc23c562d9b2..18b75cf08d1b 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) >> if (rps_uses_slpc(rps)) { >> slpc = rps_to_slpc(rps); >> >> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >> + rq->fence.context, rq->fence.seqno); >> + >> /* Return if old value is non zero */ >> if (!atomic_fetch_inc(&slpc->num_waiters)) >> schedule_work(&slpc->boost_work); >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >> index b7cdeec44bd3..9dbdbab1515a 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) >> static void slpc_boost_work(struct work_struct *work) >> { >> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >> int err; >> >> /* >> * Raise min freq to boost. It's possible that >> * this is greater than current max. But it will >> * certainly be limited by RP0. An error setting >> - * the min param is not fatal. >> + * the min param is not fatal. No need to boost >> + * if we are already requesting it. >> */ >> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) >> + return; >> + >> mutex_lock(&slpc->lock); >> if (atomic_read(&slpc->num_waiters)) { >> err = slpc_force_min_freq(slpc, slpc->boost_freq); >> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) >> >> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >> { >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >> /* >> * Return min back to the softlimit. >> * This is called during request retire, >> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >> * set_param fails. >> */ >> mutex_lock(&slpc->lock); >> - if (atomic_dec_and_test(&slpc->num_waiters)) >> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >> + if (atomic_dec_and_test(&slpc->num_waiters)) { >> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) >> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >> + } >> mutex_unlock(&slpc->lock); >> } >> >> -- >> 2.35.1 >> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC 2022-10-20 20:16 ` Belgaumkar, Vinay @ 2022-10-20 23:36 ` Dixit, Ashutosh -1 siblings, 0 replies; 15+ messages in thread From: Dixit, Ashutosh @ 2022-10-20 23:36 UTC (permalink / raw) To: Belgaumkar, Vinay; +Cc: intel-gfx, dri-devel On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: > > On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: > > On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > > Hi Vinay, > > > >> Waitboost (when SLPC is enabled) results in a H2G message. This can result > >> in thousands of messages during a stress test and fill up an already full > >> CTB. There is no need to request for RP0 if GuC is already requesting the > >> same. > > But how are we sure that the freq will remain at RP0 in the future (when > > the waiting request or any requests which are ahead execute)? > > > > In the current waitboost implementation, set_param is sent to GuC ahead of > > the waiting request to ensure that the freq would be max when this waiting > > request executed on the GPU and the freq is kept at max till this request > > retires (considering just one waiting request). How can we ensure this if > > we don't send the waitboost set_param to GuC? > > There is no way to guarantee the frequency will remain at RP0 till the > request retires. As a theoretical example, lets say the request boosted > freq to RP0, but a user changed min freq using sysfs immediately after. That would be a bug. If waitboost is in progress and in the middle user changed min freq, I would expect the freq to revert to the new min only after the waitboost phase was over. In any case, I am not referring to this case. Since FW controls the freq there is nothing preventing FW to change the freq unless we raise min to max which is what waitboost does. > Waitboost is done by a pending request to "hurry" the current requests. If > GT is already at boost frequency, that purpose is served. FW can bring the freq down later before the waiting request is scheduled. > Also, host algorithm already has this optimization as well. Host turbo is different from SLPC. Host turbo controls the freq algorithm so it knows freq will not come down till it itself brings the freq down. Unlike SLPC where FW is controling the freq. Therefore host turbo doesn't ever need to do a MMIO read but only needs to refer to its own state (rps->cur_freq etc.). > > > > I had assumed we'll do this optimization for server parts where min is > > already RP0 in which case we can completely disable waitboost. But this > > patch is something else. Thanks. -- Ashutosh > >> v2: Add the tracing back, and check requested freq > >> in the worker thread (Tvrtko) > >> v3: Check requested freq in dec_waiters as well > >> > >> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > >> --- > >> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ > >> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- > >> 2 files changed, 14 insertions(+), 3 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > >> index fc23c562d9b2..18b75cf08d1b 100644 > >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c > >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > >> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) > >> if (rps_uses_slpc(rps)) { > >> slpc = rps_to_slpc(rps); > >> > >> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > >> + rq->fence.context, rq->fence.seqno); > >> + > >> /* Return if old value is non zero */ > >> if (!atomic_fetch_inc(&slpc->num_waiters)) > >> schedule_work(&slpc->boost_work); > >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >> index b7cdeec44bd3..9dbdbab1515a 100644 > >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) > >> static void slpc_boost_work(struct work_struct *work) > >> { > >> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); > >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >> int err; > >> > >> /* > >> * Raise min freq to boost. It's possible that > >> * this is greater than current max. But it will > >> * certainly be limited by RP0. An error setting > >> - * the min param is not fatal. > >> + * the min param is not fatal. No need to boost > >> + * if we are already requesting it. > >> */ > >> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) > >> + return; > >> + > >> mutex_lock(&slpc->lock); > >> if (atomic_read(&slpc->num_waiters)) { > >> err = slpc_force_min_freq(slpc, slpc->boost_freq); > >> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) > >> > >> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >> { > >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >> /* > >> * Return min back to the softlimit. > >> * This is called during request retire, > >> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >> * set_param fails. > >> */ > >> mutex_lock(&slpc->lock); > >> - if (atomic_dec_and_test(&slpc->num_waiters)) > >> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >> + if (atomic_dec_and_test(&slpc->num_waiters)) { > >> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) > >> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >> + } > >> mutex_unlock(&slpc->lock); > >> } > >> > >> -- > >> 2.35.1 > >> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-20 23:36 ` Dixit, Ashutosh 0 siblings, 0 replies; 15+ messages in thread From: Dixit, Ashutosh @ 2022-10-20 23:36 UTC (permalink / raw) To: Belgaumkar, Vinay; +Cc: Tvrtko Ursulin, intel-gfx, dri-devel On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: > > On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: > > On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > > Hi Vinay, > > > >> Waitboost (when SLPC is enabled) results in a H2G message. This can result > >> in thousands of messages during a stress test and fill up an already full > >> CTB. There is no need to request for RP0 if GuC is already requesting the > >> same. > > But how are we sure that the freq will remain at RP0 in the future (when > > the waiting request or any requests which are ahead execute)? > > > > In the current waitboost implementation, set_param is sent to GuC ahead of > > the waiting request to ensure that the freq would be max when this waiting > > request executed on the GPU and the freq is kept at max till this request > > retires (considering just one waiting request). How can we ensure this if > > we don't send the waitboost set_param to GuC? > > There is no way to guarantee the frequency will remain at RP0 till the > request retires. As a theoretical example, lets say the request boosted > freq to RP0, but a user changed min freq using sysfs immediately after. That would be a bug. If waitboost is in progress and in the middle user changed min freq, I would expect the freq to revert to the new min only after the waitboost phase was over. In any case, I am not referring to this case. Since FW controls the freq there is nothing preventing FW to change the freq unless we raise min to max which is what waitboost does. > Waitboost is done by a pending request to "hurry" the current requests. If > GT is already at boost frequency, that purpose is served. FW can bring the freq down later before the waiting request is scheduled. > Also, host algorithm already has this optimization as well. Host turbo is different from SLPC. Host turbo controls the freq algorithm so it knows freq will not come down till it itself brings the freq down. Unlike SLPC where FW is controling the freq. Therefore host turbo doesn't ever need to do a MMIO read but only needs to refer to its own state (rps->cur_freq etc.). > > > > I had assumed we'll do this optimization for server parts where min is > > already RP0 in which case we can completely disable waitboost. But this > > patch is something else. Thanks. -- Ashutosh > >> v2: Add the tracing back, and check requested freq > >> in the worker thread (Tvrtko) > >> v3: Check requested freq in dec_waiters as well > >> > >> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > >> --- > >> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ > >> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- > >> 2 files changed, 14 insertions(+), 3 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > >> index fc23c562d9b2..18b75cf08d1b 100644 > >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c > >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > >> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) > >> if (rps_uses_slpc(rps)) { > >> slpc = rps_to_slpc(rps); > >> > >> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > >> + rq->fence.context, rq->fence.seqno); > >> + > >> /* Return if old value is non zero */ > >> if (!atomic_fetch_inc(&slpc->num_waiters)) > >> schedule_work(&slpc->boost_work); > >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >> index b7cdeec44bd3..9dbdbab1515a 100644 > >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) > >> static void slpc_boost_work(struct work_struct *work) > >> { > >> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); > >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >> int err; > >> > >> /* > >> * Raise min freq to boost. It's possible that > >> * this is greater than current max. But it will > >> * certainly be limited by RP0. An error setting > >> - * the min param is not fatal. > >> + * the min param is not fatal. No need to boost > >> + * if we are already requesting it. > >> */ > >> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) > >> + return; > >> + > >> mutex_lock(&slpc->lock); > >> if (atomic_read(&slpc->num_waiters)) { > >> err = slpc_force_min_freq(slpc, slpc->boost_freq); > >> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) > >> > >> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >> { > >> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >> /* > >> * Return min back to the softlimit. > >> * This is called during request retire, > >> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >> * set_param fails. > >> */ > >> mutex_lock(&slpc->lock); > >> - if (atomic_dec_and_test(&slpc->num_waiters)) > >> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >> + if (atomic_dec_and_test(&slpc->num_waiters)) { > >> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) > >> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >> + } > >> mutex_unlock(&slpc->lock); > >> } > >> > >> -- > >> 2.35.1 > >> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC 2022-10-20 23:36 ` Dixit, Ashutosh @ 2022-10-21 18:24 ` Belgaumkar, Vinay -1 siblings, 0 replies; 15+ messages in thread From: Belgaumkar, Vinay @ 2022-10-21 18:24 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote: > On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: >> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: >>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: >>> Hi Vinay, >>> >>>> Waitboost (when SLPC is enabled) results in a H2G message. This can result >>>> in thousands of messages during a stress test and fill up an already full >>>> CTB. There is no need to request for RP0 if GuC is already requesting the >>>> same. >>> But how are we sure that the freq will remain at RP0 in the future (when >>> the waiting request or any requests which are ahead execute)? >>> >>> In the current waitboost implementation, set_param is sent to GuC ahead of >>> the waiting request to ensure that the freq would be max when this waiting >>> request executed on the GPU and the freq is kept at max till this request >>> retires (considering just one waiting request). How can we ensure this if >>> we don't send the waitboost set_param to GuC? >> There is no way to guarantee the frequency will remain at RP0 till the >> request retires. As a theoretical example, lets say the request boosted >> freq to RP0, but a user changed min freq using sysfs immediately after. > That would be a bug. If waitboost is in progress and in the middle user > changed min freq, I would expect the freq to revert to the new min only > after the waitboost phase was over. The problem here is that GuC is unaware of this "boosting" phenomenon. Setting the min_freq_softlimit as well to boost when we send a boost request might help with this issue. > > In any case, I am not referring to this case. Since FW controls the freq > there is nothing preventing FW to change the freq unless we raise min to > max which is what waitboost does. Ok, so maybe the solution here is to check if min_softlimit is already at boost freq, as it tracks the min freq changes. That should take care of server parts automatically as well. > >> Waitboost is done by a pending request to "hurry" the current requests. If >> GT is already at boost frequency, that purpose is served. > FW can bring the freq down later before the waiting request is scheduled. >> Also, host algorithm already has this optimization as well. > Host turbo is different from SLPC. Host turbo controls the freq algorithm > so it knows freq will not come down till it itself brings the freq > down. Unlike SLPC where FW is controling the freq. Therefore host turbo > doesn't ever need to do a MMIO read but only needs to refer to its own > state (rps->cur_freq etc.). True. Host algorithm has a periodic timer where it updates frequency. Here, it checks num_waiters and sets client_boost every time that is non-zero. >>> I had assumed we'll do this optimization for server parts where min is >>> already RP0 in which case we can completely disable waitboost. But this >>> patch is something else. Hopefully the softlimit changes above will help with client and server. Thanks, Vinay. > Thanks. > -- > Ashutosh > >>>> v2: Add the tracing back, and check requested freq >>>> in the worker thread (Tvrtko) >>>> v3: Check requested freq in dec_waiters as well >>>> >>>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >>>> --- >>>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ >>>> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- >>>> 2 files changed, 14 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >>>> index fc23c562d9b2..18b75cf08d1b 100644 >>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) >>>> if (rps_uses_slpc(rps)) { >>>> slpc = rps_to_slpc(rps); >>>> >>>> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >>>> + rq->fence.context, rq->fence.seqno); >>>> + >>>> /* Return if old value is non zero */ >>>> if (!atomic_fetch_inc(&slpc->num_waiters)) >>>> schedule_work(&slpc->boost_work); >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>> index b7cdeec44bd3..9dbdbab1515a 100644 >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) >>>> static void slpc_boost_work(struct work_struct *work) >>>> { >>>> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>> int err; >>>> >>>> /* >>>> * Raise min freq to boost. It's possible that >>>> * this is greater than current max. But it will >>>> * certainly be limited by RP0. An error setting >>>> - * the min param is not fatal. >>>> + * the min param is not fatal. No need to boost >>>> + * if we are already requesting it. >>>> */ >>>> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) >>>> + return; >>>> + >>>> mutex_lock(&slpc->lock); >>>> if (atomic_read(&slpc->num_waiters)) { >>>> err = slpc_force_min_freq(slpc, slpc->boost_freq); >>>> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) >>>> >>>> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>> { >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>> /* >>>> * Return min back to the softlimit. >>>> * This is called during request retire, >>>> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>> * set_param fails. >>>> */ >>>> mutex_lock(&slpc->lock); >>>> - if (atomic_dec_and_test(&slpc->num_waiters)) >>>> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>> + if (atomic_dec_and_test(&slpc->num_waiters)) { >>>> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) >>>> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>> + } >>>> mutex_unlock(&slpc->lock); >>>> } >>>> >>>> -- >>>> 2.35.1 >>>> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-21 18:24 ` Belgaumkar, Vinay 0 siblings, 0 replies; 15+ messages in thread From: Belgaumkar, Vinay @ 2022-10-21 18:24 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: Tvrtko Ursulin, intel-gfx, dri-devel On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote: > On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: >> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: >>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: >>> Hi Vinay, >>> >>>> Waitboost (when SLPC is enabled) results in a H2G message. This can result >>>> in thousands of messages during a stress test and fill up an already full >>>> CTB. There is no need to request for RP0 if GuC is already requesting the >>>> same. >>> But how are we sure that the freq will remain at RP0 in the future (when >>> the waiting request or any requests which are ahead execute)? >>> >>> In the current waitboost implementation, set_param is sent to GuC ahead of >>> the waiting request to ensure that the freq would be max when this waiting >>> request executed on the GPU and the freq is kept at max till this request >>> retires (considering just one waiting request). How can we ensure this if >>> we don't send the waitboost set_param to GuC? >> There is no way to guarantee the frequency will remain at RP0 till the >> request retires. As a theoretical example, lets say the request boosted >> freq to RP0, but a user changed min freq using sysfs immediately after. > That would be a bug. If waitboost is in progress and in the middle user > changed min freq, I would expect the freq to revert to the new min only > after the waitboost phase was over. The problem here is that GuC is unaware of this "boosting" phenomenon. Setting the min_freq_softlimit as well to boost when we send a boost request might help with this issue. > > In any case, I am not referring to this case. Since FW controls the freq > there is nothing preventing FW to change the freq unless we raise min to > max which is what waitboost does. Ok, so maybe the solution here is to check if min_softlimit is already at boost freq, as it tracks the min freq changes. That should take care of server parts automatically as well. > >> Waitboost is done by a pending request to "hurry" the current requests. If >> GT is already at boost frequency, that purpose is served. > FW can bring the freq down later before the waiting request is scheduled. >> Also, host algorithm already has this optimization as well. > Host turbo is different from SLPC. Host turbo controls the freq algorithm > so it knows freq will not come down till it itself brings the freq > down. Unlike SLPC where FW is controling the freq. Therefore host turbo > doesn't ever need to do a MMIO read but only needs to refer to its own > state (rps->cur_freq etc.). True. Host algorithm has a periodic timer where it updates frequency. Here, it checks num_waiters and sets client_boost every time that is non-zero. >>> I had assumed we'll do this optimization for server parts where min is >>> already RP0 in which case we can completely disable waitboost. But this >>> patch is something else. Hopefully the softlimit changes above will help with client and server. Thanks, Vinay. > Thanks. > -- > Ashutosh > >>>> v2: Add the tracing back, and check requested freq >>>> in the worker thread (Tvrtko) >>>> v3: Check requested freq in dec_waiters as well >>>> >>>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >>>> --- >>>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ >>>> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- >>>> 2 files changed, 14 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >>>> index fc23c562d9b2..18b75cf08d1b 100644 >>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) >>>> if (rps_uses_slpc(rps)) { >>>> slpc = rps_to_slpc(rps); >>>> >>>> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >>>> + rq->fence.context, rq->fence.seqno); >>>> + >>>> /* Return if old value is non zero */ >>>> if (!atomic_fetch_inc(&slpc->num_waiters)) >>>> schedule_work(&slpc->boost_work); >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>> index b7cdeec44bd3..9dbdbab1515a 100644 >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) >>>> static void slpc_boost_work(struct work_struct *work) >>>> { >>>> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>> int err; >>>> >>>> /* >>>> * Raise min freq to boost. It's possible that >>>> * this is greater than current max. But it will >>>> * certainly be limited by RP0. An error setting >>>> - * the min param is not fatal. >>>> + * the min param is not fatal. No need to boost >>>> + * if we are already requesting it. >>>> */ >>>> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) >>>> + return; >>>> + >>>> mutex_lock(&slpc->lock); >>>> if (atomic_read(&slpc->num_waiters)) { >>>> err = slpc_force_min_freq(slpc, slpc->boost_freq); >>>> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) >>>> >>>> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>> { >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>> /* >>>> * Return min back to the softlimit. >>>> * This is called during request retire, >>>> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>> * set_param fails. >>>> */ >>>> mutex_lock(&slpc->lock); >>>> - if (atomic_dec_and_test(&slpc->num_waiters)) >>>> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>> + if (atomic_dec_and_test(&slpc->num_waiters)) { >>>> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) >>>> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>> + } >>>> mutex_unlock(&slpc->lock); >>>> } >>>> >>>> -- >>>> 2.35.1 >>>> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC 2022-10-21 18:24 ` Belgaumkar, Vinay @ 2022-10-21 18:40 ` Dixit, Ashutosh -1 siblings, 0 replies; 15+ messages in thread From: Dixit, Ashutosh @ 2022-10-21 18:40 UTC (permalink / raw) To: Belgaumkar, Vinay; +Cc: intel-gfx, dri-devel On Fri, 21 Oct 2022 11:24:42 -0700, Belgaumkar, Vinay wrote: > > > On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote: > > On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: > >> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: > >>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > >>> Hi Vinay, > >>> > >>>> Waitboost (when SLPC is enabled) results in a H2G message. This can result > >>>> in thousands of messages during a stress test and fill up an already full > >>>> CTB. There is no need to request for RP0 if GuC is already requesting the > >>>> same. > >>> But how are we sure that the freq will remain at RP0 in the future (when > >>> the waiting request or any requests which are ahead execute)? > >>> > >>> In the current waitboost implementation, set_param is sent to GuC ahead of > >>> the waiting request to ensure that the freq would be max when this waiting > >>> request executed on the GPU and the freq is kept at max till this request > >>> retires (considering just one waiting request). How can we ensure this if > >>> we don't send the waitboost set_param to GuC? > >> There is no way to guarantee the frequency will remain at RP0 till the > >> request retires. As a theoretical example, lets say the request boosted > >> freq to RP0, but a user changed min freq using sysfs immediately after. > > That would be a bug. If waitboost is in progress and in the middle user > > changed min freq, I would expect the freq to revert to the new min only > > after the waitboost phase was over. > > The problem here is that GuC is unaware of this "boosting" > phenomenon. Setting the min_freq_softlimit as well to boost when we send a > boost request might help with this issue. > > > > > In any case, I am not referring to this case. Since FW controls the freq > > there is nothing preventing FW to change the freq unless we raise min to > > max which is what waitboost does. > Ok, so maybe the solution here is to check if min_softlimit is already at > boost freq, as it tracks the min freq changes. That should take care of > server parts automatically as well. Correct, yes that would be the right way to do it. Thanks. -- Ashutosh > >> Waitboost is done by a pending request to "hurry" the current requests. If > >> GT is already at boost frequency, that purpose is served. > > FW can bring the freq down later before the waiting request is scheduled. > >> Also, host algorithm already has this optimization as well. > > Host turbo is different from SLPC. Host turbo controls the freq algorithm > > so it knows freq will not come down till it itself brings the freq > > down. Unlike SLPC where FW is controling the freq. Therefore host turbo > > doesn't ever need to do a MMIO read but only needs to refer to its own > > state (rps->cur_freq etc.). > True. Host algorithm has a periodic timer where it updates frequency. Here, > it checks num_waiters and sets client_boost every time that is non-zero. > >>> I had assumed we'll do this optimization for server parts where min is > >>> already RP0 in which case we can completely disable waitboost. But this > >>> patch is something else. > > Hopefully the softlimit changes above will help with client and server. > > Thanks, > > Vinay. > > > Thanks. > > -- > > Ashutosh > > > >>>> v2: Add the tracing back, and check requested freq > >>>> in the worker thread (Tvrtko) > >>>> v3: Check requested freq in dec_waiters as well > >>>> > >>>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > >>>> --- > >>>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ > >>>> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- > >>>> 2 files changed, 14 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > >>>> index fc23c562d9b2..18b75cf08d1b 100644 > >>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c > >>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > >>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) > >>>> if (rps_uses_slpc(rps)) { > >>>> slpc = rps_to_slpc(rps); > >>>> > >>>> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > >>>> + rq->fence.context, rq->fence.seqno); > >>>> + > >>>> /* Return if old value is non zero */ > >>>> if (!atomic_fetch_inc(&slpc->num_waiters)) > >>>> schedule_work(&slpc->boost_work); > >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >>>> index b7cdeec44bd3..9dbdbab1515a 100644 > >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >>>> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) > >>>> static void slpc_boost_work(struct work_struct *work) > >>>> { > >>>> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); > >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >>>> int err; > >>>> > >>>> /* > >>>> * Raise min freq to boost. It's possible that > >>>> * this is greater than current max. But it will > >>>> * certainly be limited by RP0. An error setting > >>>> - * the min param is not fatal. > >>>> + * the min param is not fatal. No need to boost > >>>> + * if we are already requesting it. > >>>> */ > >>>> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) > >>>> + return; > >>>> + > >>>> mutex_lock(&slpc->lock); > >>>> if (atomic_read(&slpc->num_waiters)) { > >>>> err = slpc_force_min_freq(slpc, slpc->boost_freq); > >>>> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) > >>>> > >>>> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >>>> { > >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >>>> /* > >>>> * Return min back to the softlimit. > >>>> * This is called during request retire, > >>>> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >>>> * set_param fails. > >>>> */ > >>>> mutex_lock(&slpc->lock); > >>>> - if (atomic_dec_and_test(&slpc->num_waiters)) > >>>> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >>>> + if (atomic_dec_and_test(&slpc->num_waiters)) { > >>>> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) > >>>> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >>>> + } > >>>> mutex_unlock(&slpc->lock); > >>>> } > >>>> > >>>> -- > >>>> 2.35.1 > >>>> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-21 18:40 ` Dixit, Ashutosh 0 siblings, 0 replies; 15+ messages in thread From: Dixit, Ashutosh @ 2022-10-21 18:40 UTC (permalink / raw) To: Belgaumkar, Vinay; +Cc: Tvrtko Ursulin, intel-gfx, dri-devel On Fri, 21 Oct 2022 11:24:42 -0700, Belgaumkar, Vinay wrote: > > > On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote: > > On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: > >> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: > >>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: > >>> Hi Vinay, > >>> > >>>> Waitboost (when SLPC is enabled) results in a H2G message. This can result > >>>> in thousands of messages during a stress test and fill up an already full > >>>> CTB. There is no need to request for RP0 if GuC is already requesting the > >>>> same. > >>> But how are we sure that the freq will remain at RP0 in the future (when > >>> the waiting request or any requests which are ahead execute)? > >>> > >>> In the current waitboost implementation, set_param is sent to GuC ahead of > >>> the waiting request to ensure that the freq would be max when this waiting > >>> request executed on the GPU and the freq is kept at max till this request > >>> retires (considering just one waiting request). How can we ensure this if > >>> we don't send the waitboost set_param to GuC? > >> There is no way to guarantee the frequency will remain at RP0 till the > >> request retires. As a theoretical example, lets say the request boosted > >> freq to RP0, but a user changed min freq using sysfs immediately after. > > That would be a bug. If waitboost is in progress and in the middle user > > changed min freq, I would expect the freq to revert to the new min only > > after the waitboost phase was over. > > The problem here is that GuC is unaware of this "boosting" > phenomenon. Setting the min_freq_softlimit as well to boost when we send a > boost request might help with this issue. > > > > > In any case, I am not referring to this case. Since FW controls the freq > > there is nothing preventing FW to change the freq unless we raise min to > > max which is what waitboost does. > Ok, so maybe the solution here is to check if min_softlimit is already at > boost freq, as it tracks the min freq changes. That should take care of > server parts automatically as well. Correct, yes that would be the right way to do it. Thanks. -- Ashutosh > >> Waitboost is done by a pending request to "hurry" the current requests. If > >> GT is already at boost frequency, that purpose is served. > > FW can bring the freq down later before the waiting request is scheduled. > >> Also, host algorithm already has this optimization as well. > > Host turbo is different from SLPC. Host turbo controls the freq algorithm > > so it knows freq will not come down till it itself brings the freq > > down. Unlike SLPC where FW is controling the freq. Therefore host turbo > > doesn't ever need to do a MMIO read but only needs to refer to its own > > state (rps->cur_freq etc.). > True. Host algorithm has a periodic timer where it updates frequency. Here, > it checks num_waiters and sets client_boost every time that is non-zero. > >>> I had assumed we'll do this optimization for server parts where min is > >>> already RP0 in which case we can completely disable waitboost. But this > >>> patch is something else. > > Hopefully the softlimit changes above will help with client and server. > > Thanks, > > Vinay. > > > Thanks. > > -- > > Ashutosh > > > >>>> v2: Add the tracing back, and check requested freq > >>>> in the worker thread (Tvrtko) > >>>> v3: Check requested freq in dec_waiters as well > >>>> > >>>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > >>>> --- > >>>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ > >>>> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- > >>>> 2 files changed, 14 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > >>>> index fc23c562d9b2..18b75cf08d1b 100644 > >>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c > >>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > >>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) > >>>> if (rps_uses_slpc(rps)) { > >>>> slpc = rps_to_slpc(rps); > >>>> > >>>> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > >>>> + rq->fence.context, rq->fence.seqno); > >>>> + > >>>> /* Return if old value is non zero */ > >>>> if (!atomic_fetch_inc(&slpc->num_waiters)) > >>>> schedule_work(&slpc->boost_work); > >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >>>> index b7cdeec44bd3..9dbdbab1515a 100644 > >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > >>>> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) > >>>> static void slpc_boost_work(struct work_struct *work) > >>>> { > >>>> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); > >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >>>> int err; > >>>> > >>>> /* > >>>> * Raise min freq to boost. It's possible that > >>>> * this is greater than current max. But it will > >>>> * certainly be limited by RP0. An error setting > >>>> - * the min param is not fatal. > >>>> + * the min param is not fatal. No need to boost > >>>> + * if we are already requesting it. > >>>> */ > >>>> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) > >>>> + return; > >>>> + > >>>> mutex_lock(&slpc->lock); > >>>> if (atomic_read(&slpc->num_waiters)) { > >>>> err = slpc_force_min_freq(slpc, slpc->boost_freq); > >>>> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) > >>>> > >>>> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >>>> { > >>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; > >>>> /* > >>>> * Return min back to the softlimit. > >>>> * This is called during request retire, > >>>> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) > >>>> * set_param fails. > >>>> */ > >>>> mutex_lock(&slpc->lock); > >>>> - if (atomic_dec_and_test(&slpc->num_waiters)) > >>>> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >>>> + if (atomic_dec_and_test(&slpc->num_waiters)) { > >>>> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) > >>>> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); > >>>> + } > >>>> mutex_unlock(&slpc->lock); > >>>> } > >>>> > >>>> -- > >>>> 2.35.1 > >>>> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC 2022-10-21 18:40 ` Dixit, Ashutosh @ 2022-10-21 22:46 ` Belgaumkar, Vinay -1 siblings, 0 replies; 15+ messages in thread From: Belgaumkar, Vinay @ 2022-10-21 22:46 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel On 10/21/2022 11:40 AM, Dixit, Ashutosh wrote: > On Fri, 21 Oct 2022 11:24:42 -0700, Belgaumkar, Vinay wrote: >> >> On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote: >>> On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: >>>> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: >>>>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: >>>>> Hi Vinay, >>>>> >>>>>> Waitboost (when SLPC is enabled) results in a H2G message. This can result >>>>>> in thousands of messages during a stress test and fill up an already full >>>>>> CTB. There is no need to request for RP0 if GuC is already requesting the >>>>>> same. >>>>> But how are we sure that the freq will remain at RP0 in the future (when >>>>> the waiting request or any requests which are ahead execute)? >>>>> >>>>> In the current waitboost implementation, set_param is sent to GuC ahead of >>>>> the waiting request to ensure that the freq would be max when this waiting >>>>> request executed on the GPU and the freq is kept at max till this request >>>>> retires (considering just one waiting request). How can we ensure this if >>>>> we don't send the waitboost set_param to GuC? >>>> There is no way to guarantee the frequency will remain at RP0 till the >>>> request retires. As a theoretical example, lets say the request boosted >>>> freq to RP0, but a user changed min freq using sysfs immediately after. >>> That would be a bug. If waitboost is in progress and in the middle user >>> changed min freq, I would expect the freq to revert to the new min only >>> after the waitboost phase was over. >> The problem here is that GuC is unaware of this "boosting" >> phenomenon. Setting the min_freq_softlimit as well to boost when we send a >> boost request might help with this issue. >> >>> In any case, I am not referring to this case. Since FW controls the freq >>> there is nothing preventing FW to change the freq unless we raise min to >>> max which is what waitboost does. >> Ok, so maybe the solution here is to check if min_softlimit is already at >> boost freq, as it tracks the min freq changes. That should take care of >> server parts automatically as well. > Correct, yes that would be the right way to do it. Actually, rethinking, it's not going to work for client GPUs. We cannot clobber the min_softlimit as the user may have set it. So, I'll just make this change to optimize it for server parts for now. Thanks, Vinay. > > Thanks. > -- > Ashutosh > >>>> Waitboost is done by a pending request to "hurry" the current requests. If >>>> GT is already at boost frequency, that purpose is served. >>> FW can bring the freq down later before the waiting request is scheduled. >>>> Also, host algorithm already has this optimization as well. >>> Host turbo is different from SLPC. Host turbo controls the freq algorithm >>> so it knows freq will not come down till it itself brings the freq >>> down. Unlike SLPC where FW is controling the freq. Therefore host turbo >>> doesn't ever need to do a MMIO read but only needs to refer to its own >>> state (rps->cur_freq etc.). >> True. Host algorithm has a periodic timer where it updates frequency. Here, >> it checks num_waiters and sets client_boost every time that is non-zero. >>>>> I had assumed we'll do this optimization for server parts where min is >>>>> already RP0 in which case we can completely disable waitboost. But this >>>>> patch is something else. >> Hopefully the softlimit changes above will help with client and server. >> >> Thanks, >> >> Vinay. >> >>> Thanks. >>> -- >>> Ashutosh >>> >>>>>> v2: Add the tracing back, and check requested freq >>>>>> in the worker thread (Tvrtko) >>>>>> v3: Check requested freq in dec_waiters as well >>>>>> >>>>>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >>>>>> --- >>>>>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ >>>>>> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- >>>>>> 2 files changed, 14 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >>>>>> index fc23c562d9b2..18b75cf08d1b 100644 >>>>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >>>>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) >>>>>> if (rps_uses_slpc(rps)) { >>>>>> slpc = rps_to_slpc(rps); >>>>>> >>>>>> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >>>>>> + rq->fence.context, rq->fence.seqno); >>>>>> + >>>>>> /* Return if old value is non zero */ >>>>>> if (!atomic_fetch_inc(&slpc->num_waiters)) >>>>>> schedule_work(&slpc->boost_work); >>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>>>> index b7cdeec44bd3..9dbdbab1515a 100644 >>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>>>> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) >>>>>> static void slpc_boost_work(struct work_struct *work) >>>>>> { >>>>>> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); >>>>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>>>> int err; >>>>>> >>>>>> /* >>>>>> * Raise min freq to boost. It's possible that >>>>>> * this is greater than current max. But it will >>>>>> * certainly be limited by RP0. An error setting >>>>>> - * the min param is not fatal. >>>>>> + * the min param is not fatal. No need to boost >>>>>> + * if we are already requesting it. >>>>>> */ >>>>>> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) >>>>>> + return; >>>>>> + >>>>>> mutex_lock(&slpc->lock); >>>>>> if (atomic_read(&slpc->num_waiters)) { >>>>>> err = slpc_force_min_freq(slpc, slpc->boost_freq); >>>>>> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) >>>>>> >>>>>> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>>>> { >>>>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>>>> /* >>>>>> * Return min back to the softlimit. >>>>>> * This is called during request retire, >>>>>> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>>>> * set_param fails. >>>>>> */ >>>>>> mutex_lock(&slpc->lock); >>>>>> - if (atomic_dec_and_test(&slpc->num_waiters)) >>>>>> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>>>> + if (atomic_dec_and_test(&slpc->num_waiters)) { >>>>>> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) >>>>>> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>>>> + } >>>>>> mutex_unlock(&slpc->lock); >>>>>> } >>>>>> >>>>>> -- >>>>>> 2.35.1 >>>>>> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC @ 2022-10-21 22:46 ` Belgaumkar, Vinay 0 siblings, 0 replies; 15+ messages in thread From: Belgaumkar, Vinay @ 2022-10-21 22:46 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: Tvrtko Ursulin, intel-gfx, dri-devel On 10/21/2022 11:40 AM, Dixit, Ashutosh wrote: > On Fri, 21 Oct 2022 11:24:42 -0700, Belgaumkar, Vinay wrote: >> >> On 10/20/2022 4:36 PM, Dixit, Ashutosh wrote: >>> On Thu, 20 Oct 2022 13:16:00 -0700, Belgaumkar, Vinay wrote: >>>> On 10/20/2022 11:33 AM, Dixit, Ashutosh wrote: >>>>> On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote: >>>>> Hi Vinay, >>>>> >>>>>> Waitboost (when SLPC is enabled) results in a H2G message. This can result >>>>>> in thousands of messages during a stress test and fill up an already full >>>>>> CTB. There is no need to request for RP0 if GuC is already requesting the >>>>>> same. >>>>> But how are we sure that the freq will remain at RP0 in the future (when >>>>> the waiting request or any requests which are ahead execute)? >>>>> >>>>> In the current waitboost implementation, set_param is sent to GuC ahead of >>>>> the waiting request to ensure that the freq would be max when this waiting >>>>> request executed on the GPU and the freq is kept at max till this request >>>>> retires (considering just one waiting request). How can we ensure this if >>>>> we don't send the waitboost set_param to GuC? >>>> There is no way to guarantee the frequency will remain at RP0 till the >>>> request retires. As a theoretical example, lets say the request boosted >>>> freq to RP0, but a user changed min freq using sysfs immediately after. >>> That would be a bug. If waitboost is in progress and in the middle user >>> changed min freq, I would expect the freq to revert to the new min only >>> after the waitboost phase was over. >> The problem here is that GuC is unaware of this "boosting" >> phenomenon. Setting the min_freq_softlimit as well to boost when we send a >> boost request might help with this issue. >> >>> In any case, I am not referring to this case. Since FW controls the freq >>> there is nothing preventing FW to change the freq unless we raise min to >>> max which is what waitboost does. >> Ok, so maybe the solution here is to check if min_softlimit is already at >> boost freq, as it tracks the min freq changes. That should take care of >> server parts automatically as well. > Correct, yes that would be the right way to do it. Actually, rethinking, it's not going to work for client GPUs. We cannot clobber the min_softlimit as the user may have set it. So, I'll just make this change to optimize it for server parts for now. Thanks, Vinay. > > Thanks. > -- > Ashutosh > >>>> Waitboost is done by a pending request to "hurry" the current requests. If >>>> GT is already at boost frequency, that purpose is served. >>> FW can bring the freq down later before the waiting request is scheduled. >>>> Also, host algorithm already has this optimization as well. >>> Host turbo is different from SLPC. Host turbo controls the freq algorithm >>> so it knows freq will not come down till it itself brings the freq >>> down. Unlike SLPC where FW is controling the freq. Therefore host turbo >>> doesn't ever need to do a MMIO read but only needs to refer to its own >>> state (rps->cur_freq etc.). >> True. Host algorithm has a periodic timer where it updates frequency. Here, >> it checks num_waiters and sets client_boost every time that is non-zero. >>>>> I had assumed we'll do this optimization for server parts where min is >>>>> already RP0 in which case we can completely disable waitboost. But this >>>>> patch is something else. >> Hopefully the softlimit changes above will help with client and server. >> >> Thanks, >> >> Vinay. >> >>> Thanks. >>> -- >>> Ashutosh >>> >>>>>> v2: Add the tracing back, and check requested freq >>>>>> in the worker thread (Tvrtko) >>>>>> v3: Check requested freq in dec_waiters as well >>>>>> >>>>>> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >>>>>> --- >>>>>> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++ >>>>>> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++--- >>>>>> 2 files changed, 14 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >>>>>> index fc23c562d9b2..18b75cf08d1b 100644 >>>>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >>>>>> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq) >>>>>> if (rps_uses_slpc(rps)) { >>>>>> slpc = rps_to_slpc(rps); >>>>>> >>>>>> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >>>>>> + rq->fence.context, rq->fence.seqno); >>>>>> + >>>>>> /* Return if old value is non zero */ >>>>>> if (!atomic_fetch_inc(&slpc->num_waiters)) >>>>>> schedule_work(&slpc->boost_work); >>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>>>> index b7cdeec44bd3..9dbdbab1515a 100644 >>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c >>>>>> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq) >>>>>> static void slpc_boost_work(struct work_struct *work) >>>>>> { >>>>>> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work); >>>>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>>>> int err; >>>>>> >>>>>> /* >>>>>> * Raise min freq to boost. It's possible that >>>>>> * this is greater than current max. But it will >>>>>> * certainly be limited by RP0. An error setting >>>>>> - * the min param is not fatal. >>>>>> + * the min param is not fatal. No need to boost >>>>>> + * if we are already requesting it. >>>>>> */ >>>>>> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq) >>>>>> + return; >>>>>> + >>>>>> mutex_lock(&slpc->lock); >>>>>> if (atomic_read(&slpc->num_waiters)) { >>>>>> err = slpc_force_min_freq(slpc, slpc->boost_freq); >>>>>> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val) >>>>>> >>>>>> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>>>> { >>>>>> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps; >>>>>> /* >>>>>> * Return min back to the softlimit. >>>>>> * This is called during request retire, >>>>>> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc) >>>>>> * set_param fails. >>>>>> */ >>>>>> mutex_lock(&slpc->lock); >>>>>> - if (atomic_dec_and_test(&slpc->num_waiters)) >>>>>> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>>>> + if (atomic_dec_and_test(&slpc->num_waiters)) { >>>>>> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit) >>>>>> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit); >>>>>> + } >>>>>> mutex_unlock(&slpc->lock); >>>>>> } >>>>>> >>>>>> -- >>>>>> 2.35.1 >>>>>> ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2022-10-21 22:47 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-10-20 0:29 [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC Vinay Belgaumkar 2022-10-20 0:29 ` Vinay Belgaumkar 2022-10-20 1:22 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/slpc: Optmize waitboost for SLPC (rev3) Patchwork 2022-10-20 18:33 ` [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC Dixit, Ashutosh 2022-10-20 18:33 ` Dixit, Ashutosh 2022-10-20 20:16 ` Belgaumkar, Vinay 2022-10-20 20:16 ` Belgaumkar, Vinay 2022-10-20 23:36 ` Dixit, Ashutosh 2022-10-20 23:36 ` Dixit, Ashutosh 2022-10-21 18:24 ` Belgaumkar, Vinay 2022-10-21 18:24 ` Belgaumkar, Vinay 2022-10-21 18:40 ` Dixit, Ashutosh 2022-10-21 18:40 ` Dixit, Ashutosh 2022-10-21 22:46 ` Belgaumkar, Vinay 2022-10-21 22:46 ` Belgaumkar, Vinay
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.