* [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
@ 2023-04-20 16:40 Ashutosh Dixit
2023-04-20 16:40 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
` (5 more replies)
0 siblings, 6 replies; 22+ messages in thread
From: Ashutosh Dixit @ 2023-04-20 16:40 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi
v6: Update Patch 3 to remove the timeout when blocked
v1-v5: Please see individual patches for revision history
Ashutosh Dixit (3):
drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
drm/i915/guc: Disable PL1 power limit when loading GuC firmware
drm/i915/hwmon: Block waiting for GuC reset to complete
drivers/gpu/drm/i915/gt/uc/intel_uc.c | 13 +++-
drivers/gpu/drm/i915/i915_hwmon.c | 87 +++++++++++++++++++++++----
drivers/gpu/drm/i915/i915_hwmon.h | 7 +++
3 files changed, 93 insertions(+), 14 deletions(-)
--
2.38.0
^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
  2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-20 16:40 ` Ashutosh Dixit
  2023-04-20 16:40 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Ashutosh Dixit @ 2023-04-20 16:40 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

In preparation for follow-on patches, refactor hwm_power_max_write to take
hwmon_lock and runtime pm wakeref at start of the function and release them
at the end, therefore acquiring these just once each.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 8e7dccc8d3a0e..7f44e809ca155 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -396,31 +396,33 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
 	struct i915_hwmon *hwmon = ddat->hwmon;
 	intel_wakeref_t wakeref;
+	int ret = 0;
 	u32 nval;
 
+	mutex_lock(&hwmon->hwmon_lock);
+	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
+
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
 	if (val == PL1_DISABLE) {
-		mutex_lock(&hwmon->hwmon_lock);
-		with_intel_runtime_pm(ddat->uncore->rpm, wakeref) {
-			intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
-					 PKG_PWR_LIM_1_EN, 0);
-			nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
-		}
-		mutex_unlock(&hwmon->hwmon_lock);
+		intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
+				 PKG_PWR_LIM_1_EN, 0);
+		nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit);
 
 		if (nval & PKG_PWR_LIM_1_EN)
-			return -ENODEV;
-		return 0;
+			ret = -ENODEV;
+		goto exit;
 	}
 
 	/* Computation in 64-bits to avoid overflow. Round to nearest. */
 	nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER);
 	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
 
-	hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
-					    PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1,
-					    nval);
-	return 0;
+	intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit,
+			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
+exit:
+	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
+	mutex_unlock(&hwmon->hwmon_lock);
+	return ret;
 }
 
 static int
-- 
2.38.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  2023-04-20 16:40 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
@ 2023-04-20 16:40 ` Ashutosh Dixit
  2023-04-20 16:40 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
  ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Ashutosh Dixit @ 2023-04-20 16:40 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

On dGfx, the PL1 power limit being enabled and set to a low value results
in a low GPU operating freq. It also negates the freq raise operation which
is done before GuC firmware load. As a result, GuC firmware load can time
out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power
limit was enabled and set to a low value). Therefore, when allowed by HW,
disable the PL1 power limit while loading GuC firmware.

v2: - Take mutex (to disallow writes to power1_max) across GuC reset/fw load
    - Add hwm_power_max_restore to error return code path
v3 (Jani N):
    - Add/remove explanatory comments
    - Function renames
    - Type corrections
    - Locking annotation
v4: - Don't hold the lock across GuC reset (Rodrigo)
    - New locking scheme (suggested by Rodrigo)
    - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko)
v5: - Fix uninitialized pl1en variable compile warning reported by kernel
      build robot by creating new err_rps label

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.c | 13 +++++++--
 drivers/gpu/drm/i915/i915_hwmon.c     | 40 +++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++++
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 4ccb4be4c9cba..996168312340e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -18,6 +18,7 @@
 #include "intel_uc.h"
 
 #include "i915_drv.h"
+#include "i915_hwmon.h"
 
 static const struct intel_uc_ops uc_ops_off;
 static const struct intel_uc_ops uc_ops_on;
@@ -461,6 +462,7 @@ static int __uc_init_hw(struct intel_uc *uc)
 	struct intel_guc *guc = &uc->guc;
 	struct intel_huc *huc = &uc->huc;
 	int ret, attempts;
+	bool pl1en;
 
 	GEM_BUG_ON(!intel_uc_supports_guc(uc));
 	GEM_BUG_ON(!intel_uc_wants_guc(uc));
@@ -491,6 +493,9 @@ static int __uc_init_hw(struct intel_uc *uc)
 	else
 		attempts = 1;
 
+	/* Disable a potentially low PL1 power limit to allow freq to be raised */
+	i915_hwmon_power_max_disable(gt->i915, &pl1en);
+
 	intel_rps_raise_unslice(&uc_to_gt(uc)->rps);
 
 	while (attempts--) {
@@ -500,7 +505,7 @@ static int __uc_init_hw(struct intel_uc *uc)
 		 */
 		ret = __uc_sanitize(uc);
 		if (ret)
-			goto err_out;
+			goto err_rps;
 
 		intel_huc_fw_upload(huc);
 		intel_guc_ads_reset(guc);
@@ -547,6 +552,8 @@ static int __uc_init_hw(struct intel_uc *uc)
 		intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
 	}
 
+	i915_hwmon_power_max_restore(gt->i915, pl1en);
+
 	guc_info(guc, "submission %s\n",
 		 str_enabled_disabled(intel_uc_uses_guc_submission(uc)));
 	guc_info(guc, "SLPC %s\n",
 		 str_enabled_disabled(intel_uc_uses_guc_slpc(uc)));
@@ -559,10 +566,12 @@ static int __uc_init_hw(struct intel_uc *uc)
 	intel_guc_submission_disable(guc);
 err_log_capture:
 	__uc_capture_load_err_log(uc);
-err_out:
+err_rps:
 	/* Return GT back to RPn */
 	intel_rps_lower_unslice(&uc_to_gt(uc)->rps);
 
+	i915_hwmon_power_max_restore(gt->i915, pl1en);
+err_out:
 	__uc_sanitize(uc);
 
 	if (!ret) {
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 7f44e809ca155..9ab8971679fe3 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -50,6 +50,7 @@ struct hwm_drvdata {
 	struct hwm_energy_info ei;		/*  Energy info for energy1_input */
 	char name[12];
 	int gt_n;
+	bool reset_in_progress;
 };
 
 struct i915_hwmon {
@@ -400,6 +401,10 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 	u32 nval;
 
 	mutex_lock(&hwmon->hwmon_lock);
+	if (hwmon->ddat.reset_in_progress) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
@@ -421,6 +426,7 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 			 PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
 exit:
 	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
+unlock:
 	mutex_unlock(&hwmon->hwmon_lock);
 	return ret;
 }
@@ -472,6 +478,40 @@ hwm_power_write(struct hwm_drvdata *ddat, u32 attr, int chan, long val)
 	}
 }
 
+void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old)
+{
+	struct i915_hwmon *hwmon = i915->hwmon;
+	u32 r;
+
+	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+		return;
+
+	mutex_lock(&hwmon->hwmon_lock);
+
+	hwmon->ddat.reset_in_progress = true;
+	r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
+			     PKG_PWR_LIM_1_EN, 0);
+	*old = !!(r & PKG_PWR_LIM_1_EN);
+
+	mutex_unlock(&hwmon->hwmon_lock);
+}
+
+void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
+{
+	struct i915_hwmon *hwmon = i915->hwmon;
+
+	if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+		return;
+
+	mutex_lock(&hwmon->hwmon_lock);
+
+	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
+			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
+	hwmon->ddat.reset_in_progress = false;
+
+	mutex_unlock(&hwmon->hwmon_lock);
+}
+
 static umode_t
 hwm_energy_is_visible(const struct hwm_drvdata *ddat, u32 attr)
 {
diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h
index 7ca9cf2c34c96..0fcb7de844061 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.h
+++ b/drivers/gpu/drm/i915/i915_hwmon.h
@@ -7,14 +7,21 @@
 #ifndef __I915_HWMON_H__
 #define __I915_HWMON_H__
 
+#include <linux/types.h>
+
 struct drm_i915_private;
+struct intel_gt;
 
 #if IS_REACHABLE(CONFIG_HWMON)
 void i915_hwmon_register(struct drm_i915_private *i915);
 void i915_hwmon_unregister(struct drm_i915_private *i915);
+void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old);
+void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old);
 #else
 static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
 static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
+static inline void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) { };
+static inline void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) { };
#endif
 
 #endif /* __I915_HWMON_H__ */
-- 
2.38.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  2023-04-20 16:40 ` [Intel-gfx] [PATCH 1/3] drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write Ashutosh Dixit
  2023-04-20 16:40 ` [Intel-gfx] [PATCH 2/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-20 16:40 ` Ashutosh Dixit
  2023-04-20 17:26 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware Patchwork
  ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Ashutosh Dixit @ 2023-04-20 16:40 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

Instead of erroring out when a GuC reset is in progress, block waiting for
the reset to complete, which is more reasonable uapi behavior.

v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
v3: Remove timeout when blocked (Tvrtko)

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 9ab8971679fe3..a3bdd9f68a458 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -51,6 +51,7 @@ struct hwm_drvdata {
 	char name[12];
 	int gt_n;
 	bool reset_in_progress;
+	wait_queue_head_t waitq;
 };
 
 struct i915_hwmon {
@@ -397,14 +398,32 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 {
 	struct i915_hwmon *hwmon = ddat->hwmon;
 	intel_wakeref_t wakeref;
+	DEFINE_WAIT(wait);
 	int ret = 0;
 	u32 nval;
 
-	mutex_lock(&hwmon->hwmon_lock);
-	if (hwmon->ddat.reset_in_progress) {
-		ret = -EAGAIN;
-		goto unlock;
+	/* Block waiting for GuC reset to complete when needed */
+	for (;;) {
+		mutex_lock(&hwmon->hwmon_lock);
+
+		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
+
+		if (!hwmon->ddat.reset_in_progress)
+			break;
+
+		if (signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
+
+		mutex_unlock(&hwmon->hwmon_lock);
+
+		schedule();
 	}
+	finish_wait(&ddat->waitq, &wait);
+	if (ret)
+		goto unlock;
+
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
 	/* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */
@@ -508,6 +527,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
 	hwmon->ddat.reset_in_progress = false;
+	wake_up_all(&hwmon->ddat.waitq);
 
 	mutex_unlock(&hwmon->hwmon_lock);
 }
@@ -784,6 +804,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
 	ddat->uncore = &i915->uncore;
 	snprintf(ddat->name, sizeof(ddat->name), "i915");
 	ddat->gt_n = -1;
+	init_waitqueue_head(&ddat->waitq);
 
 	for_each_gt(gt, i915, i) {
 		ddat_gt = hwmon->ddat_gt + i;
-- 
2.38.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  ` (2 preceding siblings ...)
  2023-04-20 16:40 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
@ 2023-04-20 17:26 ` Patchwork
  2023-04-20 17:37 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2023-04-21  0:08 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  5 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2023-04-20 17:26 UTC (permalink / raw)
To: Dixit, Ashutosh; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware
URL   : https://patchwork.freedesktop.org/series/116768/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.

+./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:148:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:150:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:154:26: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:156:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:156:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:174:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:176:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:180:35: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:182:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:182:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:186:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:188:9: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:192:35: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:195:16: warning: unreplaced symbol 'oldbit'
+./arch/x86/include/asm/bitops.h:195:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:237:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:239:9: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:66:1: warning: unreplaced symbol 'return'
+./arch/x86/include/asm/bitops.h:92:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:100:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:100:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:100:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:105:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:107:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:108:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:109:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:111:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:111:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:111:20: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:112:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:112:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:112:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:121:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:128:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:166:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:168:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:169:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:170:9: warning: unreplaced symbol 'val'
+./include/asm-generic/bitops/generic-non-atomic.h:172:19: warning: unreplaced symbol 'val'
+./include/asm-generic/bitops/generic-non-atomic.h:172:25: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:172:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:28:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:30:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:31:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:33:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:33:16: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:37:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:39:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:40:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:42:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:42:16: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:55:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:57:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:58:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:60:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:60:15: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:73:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:75:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:76:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:77:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:79:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:79:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:79:20: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:80:17: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:80:23: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:80:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:93:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/generic-non-atomic.h:95:9: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/generic-non-atomic.h:96:9: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:97:9: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:99:10: warning: unreplaced symbol 'p'
+./include/asm-generic/bitops/generic-non-atomic.h:99:14: warning: unreplaced symbol 'old'
+./include/asm-generic/bitops/generic-non-atomic.h:99:21: warning: unreplaced symbol 'mask'
+./include/asm-generic/bitops/instrumented-non-atomic.h:100:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:112:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:115:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:127:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:130:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:139:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:142:9: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:26:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:42:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:58:1: warning: unreplaced symbol 'return'
+./include/asm-generic/bitops/instrumented-non-atomic.h:97:1: warning: unreplaced symbol 'return'

^ permalink raw reply	[flat|nested] 22+ messages in thread
* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
  ` (3 preceding siblings ...)
  2023-04-20 17:26 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/guc: Disable PL1 power limit when loading GuC firmware Patchwork
@ 2023-04-20 17:37 ` Patchwork
  2023-04-21  0:08 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  5 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2023-04-20 17:37 UTC (permalink / raw)
To: Dixit, Ashutosh; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 2722 bytes --]

== Series Details ==

Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware
URL   : https://patchwork.freedesktop.org/series/116768/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_13034 -> Patchwork_116768v1
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/index.html

Participating hosts (37 -> 35)
------------------------------

  Missing    (2): bat-mtlp-8 fi-snb-2520m

Known issues
------------

  Here are the changes found in Patchwork_116768v1 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live@requests:
    - bat-rpls-1:         [PASS][1] -> [ABORT][2] ([i915#4983] / [i915#7911])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/bat-rpls-1/igt@i915_selftest@live@requests.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/bat-rpls-1/igt@i915_selftest@live@requests.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-adlp-6:         NOTRUN -> [SKIP][3] ([i915#7828])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/bat-adlp-6/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

#### Possible fixes ####

  * igt@i915_suspend@basic-s3-without-i915:
    - bat-adlp-6:         [INCOMPLETE][4] ([i915#7443]) -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/bat-adlp-6/igt@i915_suspend@basic-s3-without-i915.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/bat-adlp-6/igt@i915_suspend@basic-s3-without-i915.html

  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#7443]: https://gitlab.freedesktop.org/drm/intel/issues/7443
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7911]: https://gitlab.freedesktop.org/drm/intel/issues/7911

Build changes
-------------

  * Linux: CI_DRM_13034 -> Patchwork_116768v1

  CI-20190529: 20190529
  CI_DRM_13034: bf6a3fbb5a6d7afbbaeb93043fd0001d6b83591b @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7263: a6bd8f415c4ec41b5a014c7db47e46c81ffd0074 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_116768v1: bf6a3fbb5a6d7afbbaeb93043fd0001d6b83591b @ git://anongit.freedesktop.org/gfx-ci/linux

### Linux commits

39178baf855c drm/i915/hwmon: Block waiting for GuC reset to complete
d6f8cf406cba drm/i915/guc: Disable PL1 power limit when loading GuC firmware
1341c465cd24 drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/index.html

[-- Attachment #2: Type: text/html, Size: 3389 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread
* [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/guc: Disable PL1 power limit when loading GuC firmware 2023-04-20 16:40 [Intel-gfx] [PATCH v6 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit ` (4 preceding siblings ...) 2023-04-20 17:37 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork @ 2023-04-21 0:08 ` Patchwork 2023-04-25 0:26 ` Dixit, Ashutosh 5 siblings, 1 reply; 22+ messages in thread From: Patchwork @ 2023-04-21 0:08 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx [-- Attachment #1: Type: text/plain, Size: 12461 bytes --] == Series Details == Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware URL : https://patchwork.freedesktop.org/series/116768/ State : failure == Summary == CI Bug Log - changes from CI_DRM_13034_full -> Patchwork_116768v1_full ==================================================== Summary ------- **FAILURE** Serious unknown changes coming with Patchwork_116768v1_full absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_116768v1_full, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. Participating hosts (7 -> 7) ------------------------------ No changes in participating hosts Possible new issues ------------------- Here are the unknown changes that may have been introduced in Patchwork_116768v1_full: ### IGT changes ### #### Possible regressions #### * igt@gem_ppgtt@blt-vs-render-ctx0: - shard-snb: [PASS][1] -> [DMESG-FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-snb6/igt@gem_ppgtt@blt-vs-render-ctx0.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-snb6/igt@gem_ppgtt@blt-vs-render-ctx0.html #### Suppressed #### The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. 
* igt@kms_flip@2x-flip-vs-fences-interruptible: - {shard-dg1}: NOTRUN -> [SKIP][3] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-dg1-17/igt@kms_flip@2x-flip-vs-fences-interruptible.html Known issues ------------ Here are the changes found in Patchwork_116768v1_full that come from known issues: ### IGT changes ### #### Issues hit #### * igt@gem_barrier_race@remote-request@rcs0: - shard-glk: [PASS][4] -> [ABORT][5] ([i915#8211]) [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-glk9/igt@gem_barrier_race@remote-request@rcs0.html [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-glk9/igt@gem_barrier_race@remote-request@rcs0.html * igt@kms_color@ctm-0-25@pipe-b-vga-1: - shard-snb: NOTRUN -> [SKIP][6] ([fdo#109271]) +8 similar issues [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-snb7/igt@kms_color@ctm-0-25@pipe-b-vga-1.html * igt@kms_flip@flip-vs-blocking-wf-vblank@c-hdmi-a2: - shard-glk: [PASS][7] -> [FAIL][8] ([i915#2122]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-glk4/igt@kms_flip@flip-vs-blocking-wf-vblank@c-hdmi-a2.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-glk1/igt@kms_flip@flip-vs-blocking-wf-vblank@c-hdmi-a2.html * igt@kms_flip@plain-flip-fb-recreate@b-dp1: - shard-apl: [PASS][9] -> [FAIL][10] ([i915#2122]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-apl3/igt@kms_flip@plain-flip-fb-recreate@b-dp1.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-apl2/igt@kms_flip@plain-flip-fb-recreate@b-dp1.html #### Possible fixes #### * igt@drm_fdinfo@most-busy-check-all@rcs0: - {shard-rkl}: [FAIL][11] ([i915#7742]) -> [PASS][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-rkl-3/igt@drm_fdinfo@most-busy-check-all@rcs0.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-rkl-7/igt@drm_fdinfo@most-busy-check-all@rcs0.html * 
igt@gem_barrier_race@remote-request@rcs0: - {shard-rkl}: [ABORT][13] ([i915#8211]) -> [PASS][14] [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-rkl-7/igt@gem_barrier_race@remote-request@rcs0.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-rkl-4/igt@gem_barrier_race@remote-request@rcs0.html * igt@gem_exec_endless@dispatch@rcs0: - {shard-tglu}: [TIMEOUT][15] ([i915#3778]) -> [PASS][16] [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-tglu-8/igt@gem_exec_endless@dispatch@rcs0.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-tglu-8/igt@gem_exec_endless@dispatch@rcs0.html * igt@gem_exec_fair@basic-none@vecs0: - {shard-rkl}: [FAIL][17] ([i915#2842]) -> [PASS][18] +1 similar issue [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-rkl-7/igt@gem_exec_fair@basic-none@vecs0.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-rkl-6/igt@gem_exec_fair@basic-none@vecs0.html * igt@gem_exec_fair@basic-pace-solo@rcs0: - shard-apl: [FAIL][19] ([i915#2842]) -> [PASS][20] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-apl2/igt@gem_exec_fair@basic-pace-solo@rcs0.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-apl7/igt@gem_exec_fair@basic-pace-solo@rcs0.html * igt@i915_pm_rpm@modeset-lpsp: - {shard-rkl}: [SKIP][21] ([i915#1397]) -> [PASS][22] [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-rkl-1/igt@i915_pm_rpm@modeset-lpsp.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-rkl-7/igt@i915_pm_rpm@modeset-lpsp.html * igt@i915_pm_rpm@modeset-non-lpsp: - {shard-dg1}: [SKIP][23] ([i915#1397]) -> [PASS][24] [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-dg1-14/igt@i915_pm_rpm@modeset-non-lpsp.html [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-dg1-15/igt@i915_pm_rpm@modeset-non-lpsp.html * igt@i915_pm_rps@engine-order: - 
shard-apl: [FAIL][25] ([i915#6537]) -> [PASS][26] [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-apl4/igt@i915_pm_rps@engine-order.html [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-apl3/igt@i915_pm_rps@engine-order.html * igt@kms_cursor_legacy@forked-bo@pipe-b: - {shard-rkl}: [INCOMPLETE][27] ([i915#8011]) -> [PASS][28] [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-rkl-7/igt@kms_cursor_legacy@forked-bo@pipe-b.html [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-rkl-4/igt@kms_cursor_legacy@forked-bo@pipe-b.html * igt@kms_flip@flip-vs-expired-vblank@c-dp1: - shard-apl: [FAIL][29] ([i915#79]) -> [PASS][30] [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13034/shard-apl1/igt@kms_flip@flip-vs-expired-vblank@c-dp1.html [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/shard-apl6/igt@kms_flip@flip-vs-expired-vblank@c-dp1.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). 
[IGT#2]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/2 [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289 [fdo#109314]: https://bugs.freedesktop.org/show_bug.cgi?id=109314 [fdo#110189]: https://bugs.freedesktop.org/show_bug.cgi?id=110189 [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068 [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615 [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825 [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827 [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072 [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397 [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122 [i915#2433]: https://gitlab.freedesktop.org/drm/intel/issues/2433 [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527 [i915#2575]: https://gitlab.freedesktop.org/drm/intel/issues/2575 [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587 [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672 [i915#284]: https://gitlab.freedesktop.org/drm/intel/issues/284 [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842 [i915#3281]: https://gitlab.freedesktop.org/drm/intel/issues/3281 [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282 [i915#3297]: https://gitlab.freedesktop.org/drm/intel/issues/3297 [i915#3299]: https://gitlab.freedesktop.org/drm/intel/issues/3299 [i915#3359]: https://gitlab.freedesktop.org/drm/intel/issues/3359 [i915#3458]: https://gitlab.freedesktop.org/drm/intel/issues/3458 [i915#3528]: https://gitlab.freedesktop.org/drm/intel/issues/3528 [i915#3539]: https://gitlab.freedesktop.org/drm/intel/issues/3539 [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555 [i915#3591]: https://gitlab.freedesktop.org/drm/intel/issues/3591 [i915#3638]: https://gitlab.freedesktop.org/drm/intel/issues/3638 [i915#3689]: 
https://gitlab.freedesktop.org/drm/intel/issues/3689 [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708 [i915#3778]: https://gitlab.freedesktop.org/drm/intel/issues/3778 [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886 [i915#3938]: https://gitlab.freedesktop.org/drm/intel/issues/3938 [i915#3955]: https://gitlab.freedesktop.org/drm/intel/issues/3955 [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070 [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077 [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083 [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212 [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270 [i915#4391]: https://gitlab.freedesktop.org/drm/intel/issues/4391 [i915#4538]: https://gitlab.freedesktop.org/drm/intel/issues/4538 [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579 [i915#4771]: https://gitlab.freedesktop.org/drm/intel/issues/4771 [i915#4812]: https://gitlab.freedesktop.org/drm/intel/issues/4812 [i915#4816]: https://gitlab.freedesktop.org/drm/intel/issues/4816 [i915#4833]: https://gitlab.freedesktop.org/drm/intel/issues/4833 [i915#4852]: https://gitlab.freedesktop.org/drm/intel/issues/4852 [i915#4860]: https://gitlab.freedesktop.org/drm/intel/issues/4860 [i915#4880]: https://gitlab.freedesktop.org/drm/intel/issues/4880 [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176 [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235 [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286 [i915#5289]: https://gitlab.freedesktop.org/drm/intel/issues/5289 [i915#5325]: https://gitlab.freedesktop.org/drm/intel/issues/5325 [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354 [i915#5563]: https://gitlab.freedesktop.org/drm/intel/issues/5563 [i915#5784]: https://gitlab.freedesktop.org/drm/intel/issues/5784 [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095 [i915#6433]: 
https://gitlab.freedesktop.org/drm/intel/issues/6433 [i915#6493]: https://gitlab.freedesktop.org/drm/intel/issues/6493 [i915#6524]: https://gitlab.freedesktop.org/drm/intel/issues/6524 [i915#6537]: https://gitlab.freedesktop.org/drm/intel/issues/6537 [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658 [i915#7116]: https://gitlab.freedesktop.org/drm/intel/issues/7116 [i915#7561]: https://gitlab.freedesktop.org/drm/intel/issues/7561 [i915#7711]: https://gitlab.freedesktop.org/drm/intel/issues/7711 [i915#7742]: https://gitlab.freedesktop.org/drm/intel/issues/7742 [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828 [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79 [i915#7975]: https://gitlab.freedesktop.org/drm/intel/issues/7975 [i915#8011]: https://gitlab.freedesktop.org/drm/intel/issues/8011 [i915#8211]: https://gitlab.freedesktop.org/drm/intel/issues/8211 Build changes ------------- * Linux: CI_DRM_13034 -> Patchwork_116768v1 CI-20190529: 20190529 CI_DRM_13034: bf6a3fbb5a6d7afbbaeb93043fd0001d6b83591b @ git://anongit.freedesktop.org/gfx-ci/linux IGT_7263: a6bd8f415c4ec41b5a014c7db47e46c81ffd0074 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_116768v1: bf6a3fbb5a6d7afbbaeb93043fd0001d6b83591b @ git://anongit.freedesktop.org/gfx-ci/linux piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/index.html [-- Attachment #2: Type: text/html, Size: 9407 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/guc: Disable PL1 power limit when loading GuC firmware 2023-04-21 0:08 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork @ 2023-04-25 0:26 ` Dixit, Ashutosh 0 siblings, 0 replies; 22+ messages in thread From: Dixit, Ashutosh @ 2023-04-25 0:26 UTC (permalink / raw) To: intel-gfx On Thu, 20 Apr 2023 17:08:42 -0700, Patchwork wrote: > > Patch Details > > Series: drm/i915/guc: Disable PL1 power limit when loading GuC firmware > URL: https://patchwork.freedesktop.org/series/116768/ > State: failure > Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_116768v1/index.html > > CI Bug Log - changes from CI_DRM_13034_full -> Patchwork_116768v1_full > > Summary > > FAILURE > > Serious unknown changes coming with Patchwork_116768v1_full absolutely need to be > verified manually. > > If you think the reported changes have nothing to do with the changes > introduced in Patchwork_116768v1_full, please notify your bug team to allow them > to document this new failure mode, which will reduce false positives in CI. > > Participating hosts (7 -> 7) > > No changes in participating hosts > > Possible new issues > > Here are the unknown changes that may have been introduced in Patchwork_116768v1_full: > > IGT changes > > Possible regressions > > * igt@gem_ppgtt@blt-vs-render-ctx0: > > * shard-snb: PASS -> DMESG-FAIL This failure is unrelated to this series. The series has effect only for dGFX. Thanks. -- Ashutosh ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Intel-gfx] [PATCH 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware @ 2023-04-10 22:35 Ashutosh Dixit 2023-04-10 22:35 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit 0 siblings, 1 reply; 22+ messages in thread From: Ashutosh Dixit @ 2023-04-10 22:35 UTC (permalink / raw) To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi Updates to Patch 2/3 and Patch 3/3 in this version. Ashutosh Dixit (3): drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write drm/i915/guc: Disable PL1 power limit when loading GuC firmware drm/i915/hwmon: Block waiting for GuC reset to complete drivers/gpu/drm/i915/gt/uc/intel_uc.c | 13 +++- drivers/gpu/drm/i915/i915_hwmon.c | 94 +++++++++++++++++++++++---- drivers/gpu/drm/i915/i915_hwmon.h | 7 ++ 3 files changed, 100 insertions(+), 14 deletions(-) -- 2.38.0 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-10 22:35 [Intel-gfx] [PATCH 0/3] " Ashutosh Dixit @ 2023-04-10 22:35 ` Ashutosh Dixit 2023-04-18 5:35 ` Rodrigo Vivi 2023-04-19 13:21 ` Tvrtko Ursulin 0 siblings, 2 replies; 22+ messages in thread From: Ashutosh Dixit @ 2023-04-10 22:35 UTC (permalink / raw) To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi Instead of erroring out when GuC reset is in progress, block waiting for GuC reset to complete which is a more reasonable uapi behavior. v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> --- drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index 9ab8971679fe3..8471a667dfc71 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -51,6 +51,7 @@ struct hwm_drvdata { char name[12]; int gt_n; bool reset_in_progress; + wait_queue_head_t waitq; }; struct i915_hwmon { @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) static int hwm_power_max_write(struct hwm_drvdata *ddat, long val) { +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) + + int ret = 0, timeout = GUC_RESET_TIMEOUT; struct i915_hwmon *hwmon = ddat->hwmon; intel_wakeref_t wakeref; - int ret = 0; + DEFINE_WAIT(wait); u32 nval; - mutex_lock(&hwmon->hwmon_lock); - if (hwmon->ddat.reset_in_progress) { - ret = -EAGAIN; - goto unlock; + /* Block waiting for GuC reset to complete when needed */ + for (;;) { + mutex_lock(&hwmon->hwmon_lock); + + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); + + if (!hwmon->ddat.reset_in_progress) + break; + + if (signal_pending(current)) { + ret = -EINTR; + break; + } + + if (!timeout) { + ret = -ETIME; + break; + } + + mutex_unlock(&hwmon->hwmon_lock); + + timeout = schedule_timeout(timeout); } + 
finish_wait(&ddat->waitq, &wait); + if (ret) + goto unlock; + wakeref = intel_runtime_pm_get(ddat->uncore->rpm); /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); hwmon->ddat.reset_in_progress = false; + wake_up_all(&hwmon->ddat.waitq); mutex_unlock(&hwmon->hwmon_lock); } @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) ddat->uncore = &i915->uncore; snprintf(ddat->name, sizeof(ddat->name), "i915"); ddat->gt_n = -1; + init_waitqueue_head(&ddat->waitq); for_each_gt(gt, i915, i) { ddat_gt = hwmon->ddat_gt + i; -- 2.38.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-10 22:35 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit @ 2023-04-18 5:35 ` Rodrigo Vivi 2023-04-18 17:23 ` Dixit, Ashutosh 2023-04-19 13:21 ` Tvrtko Ursulin 1 sibling, 1 reply; 22+ messages in thread From: Rodrigo Vivi @ 2023-04-18 5:35 UTC (permalink / raw) To: Ashutosh Dixit; +Cc: intel-gfx, dri-devel On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote: > Instead of erroring out when GuC reset is in progress, block waiting for > GuC reset to complete which is a more reasonable uapi behavior. > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > --- > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > 1 file changed, 33 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > index 9ab8971679fe3..8471a667dfc71 100644 > --- a/drivers/gpu/drm/i915/i915_hwmon.c > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > @@ -51,6 +51,7 @@ struct hwm_drvdata { > char name[12]; > int gt_n; > bool reset_in_progress; > + wait_queue_head_t waitq; > }; > > struct i915_hwmon { > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > static int > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > { > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > + > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > struct i915_hwmon *hwmon = ddat->hwmon; > intel_wakeref_t wakeref; > - int ret = 0; > + DEFINE_WAIT(wait); > u32 nval; > > - mutex_lock(&hwmon->hwmon_lock); > - if (hwmon->ddat.reset_in_progress) { > - ret = -EAGAIN; > - goto unlock; > + /* Block waiting for GuC reset to complete when needed */ > + for (;;) { > + mutex_lock(&hwmon->hwmon_lock); I'm really afraid of how this mutex is handled with the wait queue. 
some initial thought it looks like it is trying to reimplement ww_mutex? all other examples of the wait_queue usages like this or didn't use locks or had it in a total different flow that I could not correlate. > + > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > + > + if (!hwmon->ddat.reset_in_progress) > + break; If this breaks we never unlock it? > + > + if (signal_pending(current)) { > + ret = -EINTR; > + break; > + } > + > + if (!timeout) { > + ret = -ETIME; > + break; > + } > + > + mutex_unlock(&hwmon->hwmon_lock); do we need to lock the signal pending and timeout as well? or only wrapping it around the hwmon->ddat access would be enough? > + > + timeout = schedule_timeout(timeout); > } > + finish_wait(&ddat->waitq, &wait); > + if (ret) > + goto unlock; > + > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > hwmon->ddat.reset_in_progress = false; > + wake_up_all(&hwmon->ddat.waitq); > > mutex_unlock(&hwmon->hwmon_lock); > } > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > ddat->uncore = &i915->uncore; > snprintf(ddat->name, sizeof(ddat->name), "i915"); > ddat->gt_n = -1; > + init_waitqueue_head(&ddat->waitq); > > for_each_gt(gt, i915, i) { > ddat_gt = hwmon->ddat_gt + i; > -- > 2.38.0 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-18 5:35 ` Rodrigo Vivi @ 2023-04-18 17:23 ` Dixit, Ashutosh 2023-04-19 19:40 ` Rodrigo Vivi 0 siblings, 1 reply; 22+ messages in thread From: Dixit, Ashutosh @ 2023-04-18 17:23 UTC (permalink / raw) To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote: > Hi Rodrigo, > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote: > > Instead of erroring out when GuC reset is in progress, block waiting for > > GuC reset to complete which is a more reasonable uapi behavior. > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > --- > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > index 9ab8971679fe3..8471a667dfc71 100644 > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > char name[12]; > > int gt_n; > > bool reset_in_progress; > > + wait_queue_head_t waitq; > > }; > > > > struct i915_hwmon { > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > static int > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > { > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > + > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > struct i915_hwmon *hwmon = ddat->hwmon; > > intel_wakeref_t wakeref; > > - int ret = 0; > > + DEFINE_WAIT(wait); > > u32 nval; > > > > - mutex_lock(&hwmon->hwmon_lock); > > - if (hwmon->ddat.reset_in_progress) { > > - ret = -EAGAIN; > > - goto unlock; > > + /* Block waiting for GuC reset to complete when needed */ > > + for (;;) { > > + mutex_lock(&hwmon->hwmon_lock); > > I'm really afraid of how this mutex is handled with the 
wait queue. > some initial thought it looks like it is trying to reimplement ww_mutex? Sorry, but I am missing the relation with ww_mutex. No such relation is intended. > all other examples of the wait_queue usages like this or didn't use > locks or had it in a total different flow that I could not correlate. Actually there are several examples of prepare_to_wait/finish_wait sequences with both spinlock and mutex in the kernel. See e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read(). Also, as I mentioned, except for the lock, the sequence here is identical to intel_guc_wait_for_pending_msg(). > > > + > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > + > > + if (!hwmon->ddat.reset_in_progress) > > + break; > > If this breaks we never unlock it? Correct, this is the original case in Patch 2 where the mutex is acquired in the beginning of the function and released just before the final exit from the function (so the mutex is held for the entire duration of the function). > > > + > > + if (signal_pending(current)) { > > + ret = -EINTR; > > + break; > > + } > > + > > + if (!timeout) { > > + ret = -ETIME; > > + break; > > + } > > + > > + mutex_unlock(&hwmon->hwmon_lock); > > do we need to lock the signal pending and timeout as well? > or only wrapping it around the hwmon->ddat access would be > enough? Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress flag. But because this is not a performance path, implementing it as done in the patch simplifies the code flow (since there are several if/else, goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider). So if possible I *really* want to not try to over-optimize here (I did try a few other things when writing the patch but it was getting ugly). The only real requirement is to drop the lock before calling schedule_timeout() below (and we are reacquiring the lock as soon as we are scheduled back in, as you can see in the loop above). 
> > > + > > + timeout = schedule_timeout(timeout); > > } > > + finish_wait(&ddat->waitq, &wait); > > + if (ret) > > + goto unlock; > > + > > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > > > /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > > hwmon->ddat.reset_in_progress = false; > > + wake_up_all(&hwmon->ddat.waitq); > > > > mutex_unlock(&hwmon->hwmon_lock); > > } > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > > ddat->uncore = &i915->uncore; > > snprintf(ddat->name, sizeof(ddat->name), "i915"); > > ddat->gt_n = -1; > > + init_waitqueue_head(&ddat->waitq); > > > > for_each_gt(gt, i915, i) { > > ddat_gt = hwmon->ddat_gt + i; > > -- > > 2.38.0 > > From what I understand is the locking above is fine and is not the point. The real race is between schedule_timeout() (which suspends the thread) and wake_up_all() (which schedules it back in). But this prepare_to_wait/finish_wait pattern is so widespread that the kernel guarantees that this works correctly as long as you do things in the correct order (otherwise we'd see a lot more kernel hangs/deadlocks). Thanks, Ashutosh ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-18 17:23 ` Dixit, Ashutosh @ 2023-04-19 19:40 ` Rodrigo Vivi 2023-04-19 22:13 ` Dixit, Ashutosh 0 siblings, 1 reply; 22+ messages in thread From: Rodrigo Vivi @ 2023-04-19 19:40 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote: > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote: > > > > Hi Rodrigo, > > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote: > > > Instead of erroring out when GuC reset is in progress, block waiting for > > > GuC reset to complete which is a more reasonable uapi behavior. > > > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > > --- > > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > index 9ab8971679fe3..8471a667dfc71 100644 > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > > char name[12]; > > > int gt_n; > > > bool reset_in_progress; > > > + wait_queue_head_t waitq; > > > }; > > > > > > struct i915_hwmon { > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > > static int > > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > { > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > > + > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > > struct i915_hwmon *hwmon = ddat->hwmon; > > > intel_wakeref_t wakeref; > > > - int ret = 0; > > > + DEFINE_WAIT(wait); > > > u32 nval; > > > > > > - mutex_lock(&hwmon->hwmon_lock); > > > - if (hwmon->ddat.reset_in_progress) { > > > - ret = -EAGAIN; > > > - goto unlock; > > > + /* Block 
waiting for GuC reset to complete when needed */ > > > + for (;;) { > > > + mutex_lock(&hwmon->hwmon_lock); > > > > I'm really afraid of how this mutex is handled with the wait queue. > > some initial thought it looks like it is trying to reimplement ww_mutex? > > Sorry, but I am missing the relation with ww_mutex. No such relation is > intended. > > > all other examples of the wait_queue usages like this or didn't use > > locks or had it in a total different flow that I could not correlate. > > Actually there are several examples of prepare_to_wait/finish_wait > sequences with both spinlock and mutex in the kernel. See > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read(). > > Also, as I mentioned, except for the lock, the sequence here is identical > to intel_guc_wait_for_pending_msg(). > > > > > > + > > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > > + > > > + if (!hwmon->ddat.reset_in_progress) > > > + break; > > > > If this breaks we never unlock it? > > Correct, this is the original case in Patch 2 where the mutex is acquired > in the beginning of the function and released just before the final exit > from the function (so the mutex is held for the entire duration of the > function). I got really confused here... I looked at the patch 2 again and I don't see any place where the lock remains outside of the function. What was what I asked to remove on the initial versions. But now with this one I'm even more confused because I couldn't follow to understand who will remove the lock and when. > > > > > > + > > > + if (signal_pending(current)) { > > > + ret = -EINTR; > > > + break; > > > + } > > > + > > > + if (!timeout) { > > > + ret = -ETIME; > > > + break; > > > + } > > > + > > > + mutex_unlock(&hwmon->hwmon_lock); > > > > do we need to lock the signal pending and timeout as well? > > or only wrapping it around the hwmon->ddat access would be > > enough? 
> > Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress > flag. But because this is not a performance path, implementing it as done > in the patch simplifies the code flow (since there are several if/else, > goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider). > > So if possible I *really* want to not try to over-optimize here (I did try > a few other things when writing the patch but it was getting ugly). The > only real requirement is to drop the lock before calling schedule_timeout() > below (and we are reacquiring the lock as soon as we are scheduled back in, > as you can see in the loop above). > > > > > > + > > > + timeout = schedule_timeout(timeout); > > > } > > > + finish_wait(&ddat->waitq, &wait); > > > + if (ret) > > > + goto unlock; > > > + > > > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > > > > > /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > > > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > > > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > > > hwmon->ddat.reset_in_progress = false; > > > + wake_up_all(&hwmon->ddat.waitq); > > > > > > mutex_unlock(&hwmon->hwmon_lock); > > > } > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > > > ddat->uncore = &i915->uncore; > > > snprintf(ddat->name, sizeof(ddat->name), "i915"); > > > ddat->gt_n = -1; > > > + init_waitqueue_head(&ddat->waitq); > > > > > > for_each_gt(gt, i915, i) { > > > ddat_gt = hwmon->ddat_gt + i; > > > -- > > > 2.38.0 > > > > > From what I understand is the locking above is fine and is not the > point. The real race is between schedule_timeout() (which suspends the > thread) and wake_up_all() (which schedules it back in). 
But this > prepare_to_wait/finish_wait pattern is so widespread that the kernel > guarantees that this works correctly as long as you do things in the > correct order (otherwise we'd see a lot more kernel hangs/deadlocks). > > Thanks, > Ashutosh ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-19 19:40 ` Rodrigo Vivi @ 2023-04-19 22:13 ` Dixit, Ashutosh 2023-04-20 15:45 ` Rodrigo Vivi 0 siblings, 1 reply; 22+ messages in thread From: Dixit, Ashutosh @ 2023-04-19 22:13 UTC (permalink / raw) To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel On Wed, 19 Apr 2023 12:40:44 -0700, Rodrigo Vivi wrote: > Hi Rodrigo, > On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote: > > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote: > > > > > > > Hi Rodrigo, > > > > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote: > > > > Instead of erroring out when GuC reset is in progress, block waiting for > > > > GuC reset to complete which is a more reasonable uapi behavior. > > > > > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > > > --- > > > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > > index 9ab8971679fe3..8471a667dfc71 100644 > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > > > char name[12]; > > > > int gt_n; > > > > bool reset_in_progress; > > > > + wait_queue_head_t waitq; > > > > }; > > > > > > > > struct i915_hwmon { > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > > > static int > > > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > { > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > > > + > > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > > > struct i915_hwmon *hwmon = ddat->hwmon; > > > > intel_wakeref_t wakeref; > > > > - int ret = 0; > > > > + DEFINE_WAIT(wait); > > > > u32 nval; > > > > > 
> > > - mutex_lock(&hwmon->hwmon_lock); > > > > - if (hwmon->ddat.reset_in_progress) { > > > > - ret = -EAGAIN; > > > > - goto unlock; > > > > + /* Block waiting for GuC reset to complete when needed */ > > > > + for (;;) { > > > > + mutex_lock(&hwmon->hwmon_lock); > > > > > > I'm really afraid of how this mutex is handled with the wait queue. > > > some initial thought it looks like it is trying to reimplement ww_mutex? > > > > Sorry, but I am missing the relation with ww_mutex. No such relation is > > intended. > > > > > all other examples of the wait_queue usages like this or didn't use > > > locks or had it in a total different flow that I could not correlate. > > > > Actually there are several examples of prepare_to_wait/finish_wait > > sequences with both spinlock and mutex in the kernel. See > > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read(). > > > > Also, as I mentioned, except for the lock, the sequence here is identical > > to intel_guc_wait_for_pending_msg(). > > > > > > > > > + > > > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > > > + > > > > + if (!hwmon->ddat.reset_in_progress) > > > > + break; > > > > > > If this breaks we never unlock it? > > > > Correct, this is the original case in Patch 2 where the mutex is acquired > > in the beginning of the function and released just before the final exit > > from the function (so the mutex is held for the entire duration of the > > function). > > I got really confused here... Sorry, the patch is a little confusing/tricky but I thought I'd better stick to the standard 'for (;;)' loop pattern otherwise it will also be hard to review. > I looked at the patch 2 again and I don't see any place where the lock > remains outside of the function. What was what I asked to remove on the > initial versions. So it was in Patch 1 where we changed the code to take the lock in the beginning of the function and release it at the end of the function (you can see it Patch 1). 
In Patch 2 the 'unlock' label and 'goto unlock' are introduced and the lock is released at the 'unlock' label (it is visible in Patch 2). > But now with this one I'm even more confused because I couldn't follow > to understand who will remove the lock and when. In Patch 3 again the lock is released at the 'unlock' label (i.e. the destination of 'goto unlock', not visible in Patch 3). But we execute 'goto unlock' only when 'ret != 0' in the 'for (;;)' loop. When 'ret == 0' (when the 'ddat.reset_in_progress' flag is clear) we hold the mutex, execute the entire function and finally release the lock at the end of the function. Hopefully this helps. Thanks. -- Ashutosh > > > > > > > > > > + > > > > + if (signal_pending(current)) { > > > > + ret = -EINTR; > > > > + break; > > > > + } > > > > + > > > > + if (!timeout) { > > > > + ret = -ETIME; > > > > + break; > > > > + } > > > > + > > > > + mutex_unlock(&hwmon->hwmon_lock); > > > > > > do we need to lock the signal pending and timeout as well? > > > or only wrapping it around the hwmon->ddat access would be > > > enough? > > > > Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress > > flag. But because this is not a performance path, implementing it as done > > in the patch simplifies the code flow (since there are several if/else, > > goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider). > > > > So if possible I *really* want to not try to over-optimize here (I did try > > a few other things when writing the patch but it was getting ugly). The > > only real requirement is to drop the lock before calling schedule_timeout() > > below (and we are reacquiring the lock as soon as we are scheduled back in, > > as you can see in the loop above).
> > > > > > > > > + > > > > + timeout = schedule_timeout(timeout); > > > > } > > > > + finish_wait(&ddat->waitq, &wait); > > > > + if (ret) > > > > + goto unlock; > > > > + > > > > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > > > > > > > /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ > > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > > > > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > > > > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > > > > hwmon->ddat.reset_in_progress = false; > > > > + wake_up_all(&hwmon->ddat.waitq); > > > > > > > > mutex_unlock(&hwmon->hwmon_lock); > > > > } > > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > > > > ddat->uncore = &i915->uncore; > > > > snprintf(ddat->name, sizeof(ddat->name), "i915"); > > > > ddat->gt_n = -1; > > > > + init_waitqueue_head(&ddat->waitq); > > > > > > > > for_each_gt(gt, i915, i) { > > > > ddat_gt = hwmon->ddat_gt + i; > > > > -- > > > > 2.38.0 > > > > > > > > From what I understand is the locking above is fine and is not the > > point. The real race is between schedule_timeout() (which suspends the > > thread) and wake_up_all() (which schedules it back in). But this > > prepare_to_wait/finish_wait pattern is so widespread that the kernel > > guarantees that this works correctly as long as you do things in the > > correct order (otherwise we'd see a lot more kernel hangs/deadlocks). > > > > Thanks, > > Ashutosh ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-19 22:13 ` Dixit, Ashutosh @ 2023-04-20 15:45 ` Rodrigo Vivi 0 siblings, 0 replies; 22+ messages in thread From: Rodrigo Vivi @ 2023-04-20 15:45 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel On Wed, Apr 19, 2023 at 03:13:08PM -0700, Dixit, Ashutosh wrote: > On Wed, 19 Apr 2023 12:40:44 -0700, Rodrigo Vivi wrote: > > > > Hi Rodrigo, > > > On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote: > > > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote: > > > > > > > > > > Hi Rodrigo, > > > > > > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote: > > > > > Instead of erroring out when GuC reset is in progress, block waiting for > > > > > GuC reset to complete which is a more reasonable uapi behavior. > > > > > > > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > > > > --- > > > > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > > > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > > > index 9ab8971679fe3..8471a667dfc71 100644 > > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > > > > char name[12]; > > > > > int gt_n; > > > > > bool reset_in_progress; > > > > > + wait_queue_head_t waitq; > > > > > }; > > > > > > > > > > struct i915_hwmon { > > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > > > > static int > > > > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > > { > > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > > > > + > > > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > > > > struct i915_hwmon *hwmon = 
ddat->hwmon; > > > > > intel_wakeref_t wakeref; > > > > > - int ret = 0; > > > > > + DEFINE_WAIT(wait); > > > > > u32 nval; > > > > > > > > > > - mutex_lock(&hwmon->hwmon_lock); > > > > > - if (hwmon->ddat.reset_in_progress) { > > > > > - ret = -EAGAIN; > > > > > - goto unlock; > > > > > + /* Block waiting for GuC reset to complete when needed */ > > > > > + for (;;) { > > > > > + mutex_lock(&hwmon->hwmon_lock); > > > > > > > > I'm really afraid of how this mutex is handled with the wait queue. > > > > some initial thought it looks like it is trying to reimplement ww_mutex? > > > > > > Sorry, but I am missing the relation with ww_mutex. No such relation is > > > intended. > > > > > > > all other examples of the wait_queue usages like this or didn't use > > > > locks or had it in a total different flow that I could not correlate. > > > > > > Actually there are several examples of prepare_to_wait/finish_wait > > > sequences with both spinlock and mutex in the kernel. See > > > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read(). > > > > > > Also, as I mentioned, except for the lock, the sequence here is identical > > > to intel_guc_wait_for_pending_msg(). > > > > > > > > > > > > + > > > > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > > > > + > > > > > + if (!hwmon->ddat.reset_in_progress) > > > > > + break; > > > > > > > > If this breaks we never unlock it? > > > > > > Correct, this is the original case in Patch 2 where the mutex is acquired > > > in the beginning of the function and released just before the final exit > > > from the function (so the mutex is held for the entire duration of the > > > function). > > > > I got really confused here... > > Sorry, the patch is a little confusing/tricky but I thought I'd better > stick to the standard 'for (;;)' loop pattern otherwise it will also be > hard to review. > > > I looked at the patch 2 again and I don't see any place where the lock > > remains outside of the function. 
What was what I asked to remove on the > > initial versions. > > So it was in Patch 1 where we changed the code to take the lock in the > beginning of the function and release it at the end of the function (you > can see it Patch 1). > > In Patch 2 the 'unlock' label and 'goto unlock' is introduced and the lock > is released at the 'unlock' label (it is visible in Patch 2). > > > But now with this one I'm even more confused because I couldn't follow > > to understand who will remove the lock and when. > > In Patch 3 again the lock is released at the the 'unlock' label (i.e. the > destination of 'goto unlock', not visible in Patch 3). But we execute 'goto > unlock' only when 'ret != 0' in the 'for (;;)' loop. But when 'ret == 0' > (when 'ddat.reset_in_progress' flag is clear) we hold the mutex, execute > the entire function and finally release the lock at the end of the > function. > > Hopefully this helps. more coffee also helped! I'm sorry for the noise. with the timeout thing sorted out: Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > > Thanks. > -- > Ashutosh > > > > > > > > > > > > > > > + > > > > > + if (signal_pending(current)) { > > > > > + ret = -EINTR; > > > > > + break; > > > > > + } > > > > > + > > > > > + if (!timeout) { > > > > > + ret = -ETIME; > > > > > + break; > > > > > + } > > > > > + > > > > > + mutex_unlock(&hwmon->hwmon_lock); > > > > > > > > do we need to lock the signal pending and timeout as well? > > > > or only wrapping it around the hwmon->ddat access would be > > > > enough? > > > > > > Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress > > > flag. But because this is not a performance path, implementing it as done > > > in the patch simplifies the code flow (since there are several if/else, > > > goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider). 
> > > > > > So if possible I *really* want to not try to over-optimize here (I did try > > > a few other things when writing the patch but it was getting ugly). The > > > only real requirement is to drop the lock before calling schedule_timeout() > > > below (and we are reacquiring the lock as soon as we are scheduled back in, > > > as you can see in the loop above). > > > > > > > > > > > > + > > > > > + timeout = schedule_timeout(timeout); > > > > > } > > > > > + finish_wait(&ddat->waitq, &wait); > > > > > + if (ret) > > > > > + goto unlock; > > > > > + > > > > > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > > > > > > > > > /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ > > > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > > > > > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > > > > > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > > > > > hwmon->ddat.reset_in_progress = false; > > > > > + wake_up_all(&hwmon->ddat.waitq); > > > > > > > > > > mutex_unlock(&hwmon->hwmon_lock); > > > > > } > > > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > > > > > ddat->uncore = &i915->uncore; > > > > > snprintf(ddat->name, sizeof(ddat->name), "i915"); > > > > > ddat->gt_n = -1; > > > > > + init_waitqueue_head(&ddat->waitq); > > > > > > > > > > for_each_gt(gt, i915, i) { > > > > > ddat_gt = hwmon->ddat_gt + i; > > > > > -- > > > > > 2.38.0 > > > > > > > > > > > From what I understand is the locking above is fine and is not the > > > point. The real race is between schedule_timeout() (which suspends the > > > thread) and wake_up_all() (which schedules it back in). But this > > > prepare_to_wait/finish_wait pattern is so widespread that the kernel > > > guarantees that this works correctly as long as you do things in the > > > correct order (otherwise we'd see a lot more kernel hangs/deadlocks). 
> > > > > > Thanks, > > > Ashutosh ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-10 22:35 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit 2023-04-18 5:35 ` Rodrigo Vivi @ 2023-04-19 13:21 ` Tvrtko Ursulin 2023-04-19 22:10 ` Dixit, Ashutosh 1 sibling, 1 reply; 22+ messages in thread From: Tvrtko Ursulin @ 2023-04-19 13:21 UTC (permalink / raw) To: Ashutosh Dixit, intel-gfx; +Cc: dri-devel, Rodrigo Vivi On 10/04/2023 23:35, Ashutosh Dixit wrote: > Instead of erroring out when GuC reset is in progress, block waiting for > GuC reset to complete which is a more reasonable uapi behavior. > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > --- > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > 1 file changed, 33 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > index 9ab8971679fe3..8471a667dfc71 100644 > --- a/drivers/gpu/drm/i915/i915_hwmon.c > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > @@ -51,6 +51,7 @@ struct hwm_drvdata { > char name[12]; > int gt_n; > bool reset_in_progress; > + wait_queue_head_t waitq; > }; > > struct i915_hwmon { > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > static int > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > { > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > + > + int ret = 0, timeout = GUC_RESET_TIMEOUT; Patch looks good to me apart that I am not sure what is the purpose of the timeout? This is just the sysfs write path or has more callers? If the former perhaps it would be better to just use interruptible everything (mutex and sleep) and wait for as long as it takes or until user presses Ctrl-C? 
Regards, Tvrtko > struct i915_hwmon *hwmon = ddat->hwmon; > intel_wakeref_t wakeref; > - int ret = 0; > + DEFINE_WAIT(wait); > u32 nval; > > - mutex_lock(&hwmon->hwmon_lock); > - if (hwmon->ddat.reset_in_progress) { > - ret = -EAGAIN; > - goto unlock; > + /* Block waiting for GuC reset to complete when needed */ > + for (;;) { > + mutex_lock(&hwmon->hwmon_lock); > + > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > + > + if (!hwmon->ddat.reset_in_progress) > + break; > + > + if (signal_pending(current)) { > + ret = -EINTR; > + break; > + } > + > + if (!timeout) { > + ret = -ETIME; > + break; > + } > + > + mutex_unlock(&hwmon->hwmon_lock); > + > + timeout = schedule_timeout(timeout); > } > + finish_wait(&ddat->waitq, &wait); > + if (ret) > + goto unlock; > + > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > hwmon->ddat.reset_in_progress = false; > + wake_up_all(&hwmon->ddat.waitq); > > mutex_unlock(&hwmon->hwmon_lock); > } > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > ddat->uncore = &i915->uncore; > snprintf(ddat->name, sizeof(ddat->name), "i915"); > ddat->gt_n = -1; > + init_waitqueue_head(&ddat->waitq); > > for_each_gt(gt, i915, i) { > ddat_gt = hwmon->ddat_gt + i; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-19 13:21 ` Tvrtko Ursulin @ 2023-04-19 22:10 ` Dixit, Ashutosh 2023-04-20 7:57 ` Tvrtko Ursulin 0 siblings, 1 reply; 22+ messages in thread From: Dixit, Ashutosh @ 2023-04-19 22:10 UTC (permalink / raw) To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel, Rodrigo Vivi On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote: > Hi Tvrtko, > On 10/04/2023 23:35, Ashutosh Dixit wrote: > > Instead of erroring out when GuC reset is in progress, block waiting for > > GuC reset to complete which is a more reasonable uapi behavior. > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > --- > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > index 9ab8971679fe3..8471a667dfc71 100644 > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > char name[12]; > > int gt_n; > > bool reset_in_progress; > > + wait_queue_head_t waitq; > > }; > > struct i915_hwmon { > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > static int > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > { > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > + > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > Patch looks good to me Great, thanks :) > apart that I am not sure what is the purpose of the timeout? This is just > the sysfs write path or has more callers? It is just the sysfs path, but the sysfs is accessed also by the oneAPI stack (Level 0). In the initial version I also didn't have the timeout thinking that the app can send a signal to the blocked thread to unblock it. 
I introduced the timeout after Rodrigo brought it up and I am now thinking maybe it's better to have the timeout in the driver since the app has no knowledge of how long GuC resets can take. But I can remove it if you think it's not needed. > If the > former perhaps it would be better to just use interruptible everything > (mutex and sleep) and wait for as long as it takes or until user presses > Ctrl-C? Now we are not holding the mutexes for long, just long enough to do register rmw's. So we are not holding the mutex across the GuC reset as we were originally. Therefore I am thinking mutex_lock_interruptible is not needed? The sleep is already interruptible (TASK_INTERRUPTIBLE). Anyway please let me know if you think we need to change anything. Thanks. -- Ashutosh > > struct i915_hwmon *hwmon = ddat->hwmon; > > intel_wakeref_t wakeref; > > - int ret = 0; > > + DEFINE_WAIT(wait); > > u32 nval; > > - mutex_lock(&hwmon->hwmon_lock); > > - if (hwmon->ddat.reset_in_progress) { > > - ret = -EAGAIN; > > - goto unlock; > > + /* Block waiting for GuC reset to complete when needed */ > > + for (;;) { > > + mutex_lock(&hwmon->hwmon_lock); > > + > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > + > > + if (!hwmon->ddat.reset_in_progress) > > + break; > > + > > + if (signal_pending(current)) { > > + ret = -EINTR; > > + break; > > + } > > + > > + if (!timeout) { > > + ret = -ETIME; > > + break; > > + } > > + > > + mutex_unlock(&hwmon->hwmon_lock); > > + > > + timeout = schedule_timeout(timeout); > > } > > + finish_wait(&ddat->waitq, &wait); > > + if (ret) > > + goto unlock; > > + > > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > /* Disable PL1 limit and verify, because the limit cannot be > > disabled on all platforms */ > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > > PKG_PWR_LIM_1_EN, old ?
PKG_PWR_LIM_1_EN : 0); > > hwmon->ddat.reset_in_progress = false; > > + wake_up_all(&hwmon->ddat.waitq); > > mutex_unlock(&hwmon->hwmon_lock); > > } > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > > ddat->uncore = &i915->uncore; > > snprintf(ddat->name, sizeof(ddat->name), "i915"); > > ddat->gt_n = -1; > > + init_waitqueue_head(&ddat->waitq); > > for_each_gt(gt, i915, i) { > > ddat_gt = hwmon->ddat_gt + i; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-19 22:10 ` Dixit, Ashutosh @ 2023-04-20 7:57 ` Tvrtko Ursulin 2023-04-20 15:43 ` Rodrigo Vivi 0 siblings, 1 reply; 22+ messages in thread From: Tvrtko Ursulin @ 2023-04-20 7:57 UTC (permalink / raw) To: Dixit, Ashutosh; +Cc: intel-gfx, dri-devel, Rodrigo Vivi On 19/04/2023 23:10, Dixit, Ashutosh wrote: > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote: >> > > Hi Tvrtko, > >> On 10/04/2023 23:35, Ashutosh Dixit wrote: >>> Instead of erroring out when GuC reset is in progress, block waiting for >>> GuC reset to complete which is a more reasonable uapi behavior. >>> >>> v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) >>> >>> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> >>> --- >>> drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- >>> 1 file changed, 33 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c >>> index 9ab8971679fe3..8471a667dfc71 100644 >>> --- a/drivers/gpu/drm/i915/i915_hwmon.c >>> +++ b/drivers/gpu/drm/i915/i915_hwmon.c >>> @@ -51,6 +51,7 @@ struct hwm_drvdata { >>> char name[12]; >>> int gt_n; >>> bool reset_in_progress; >>> + wait_queue_head_t waitq; >>> }; >>> struct i915_hwmon { >>> @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) >>> static int >>> hwm_power_max_write(struct hwm_drvdata *ddat, long val) >>> { >>> +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) >>> + >>> + int ret = 0, timeout = GUC_RESET_TIMEOUT; >> >> Patch looks good to me > > Great, thanks :) > >> apart that I am not sure what is the purpose of the timeout? This is just >> the sysfs write path or has more callers? > > It is just the sysfs path, but the sysfs is accessed also by the oneAPI > stack (Level 0). 
In the initial version I also didn't have the timeout > thinking that the app can send a signal to the blocked thread to unblock > it. I introduced the timeout after Rodrigo brought it up and I am now > thinking maybe it's better to have the timeout in the driver since the app > has no knowledge of how long GuC resets can take. But I can remove it if > you think it's not needed. Maybe I am missing something but I don't get why we would need to provide a timeout facility in sysfs? If the library writes here to configure something it already has to expect a blocking write by the nature of a write(2) and the sysfs contract. It can take a long time for any reason so I hope we are not guaranteeing some latency number to someone? Or the concern is just about things getting stuck? In which case I think Ctrl-C is the answer because ETIME is not even listed as an errno for write(2). >> If the >> former perhaps it would be better to just use interruptible everything >> (mutex and sleep) and wait for as long as it takes or until user presses >> Ctrl-C? > > Now we are not holding the mutexes for long, just long enough do register > rmw's. So not holding the mutex across GuC reset as we were > originally. Therefore I am thinking mutex_lock_interruptible is not needed? > The sleep is already interruptible (TASK_INTERRUPTIBLE). Ah yes, you are right. Regards, Tvrtko > Anyway please let me know if you think we need to change anything. > > Thanks.
> -- > Ashutosh > >>> struct i915_hwmon *hwmon = ddat->hwmon; >>> intel_wakeref_t wakeref; >>> - int ret = 0; >>> + DEFINE_WAIT(wait); >>> u32 nval; >>> - mutex_lock(&hwmon->hwmon_lock); >>> - if (hwmon->ddat.reset_in_progress) { >>> - ret = -EAGAIN; >>> - goto unlock; >>> + /* Block waiting for GuC reset to complete when needed */ >>> + for (;;) { >>> + mutex_lock(&hwmon->hwmon_lock); >>> + >>> + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); >>> + >>> + if (!hwmon->ddat.reset_in_progress) >>> + break; >>> + >>> + if (signal_pending(current)) { >>> + ret = -EINTR; >>> + break; >>> + } >>> + >>> + if (!timeout) { >>> + ret = -ETIME; >>> + break; >>> + } >>> + >>> + mutex_unlock(&hwmon->hwmon_lock); >>> + >>> + timeout = schedule_timeout(timeout); >>> } >>> + finish_wait(&ddat->waitq, &wait); >>> + if (ret) >>> + goto unlock; >>> + >>> wakeref = intel_runtime_pm_get(ddat->uncore->rpm); >>> /* Disable PL1 limit and verify, because the limit cannot be >>> disabled on all platforms */ >>> @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) >>> intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, >>> PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); >>> hwmon->ddat.reset_in_progress = false; >>> + wake_up_all(&hwmon->ddat.waitq); >>> mutex_unlock(&hwmon->hwmon_lock); >>> } >>> @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) >>> ddat->uncore = &i915->uncore; >>> snprintf(ddat->name, sizeof(ddat->name), "i915"); >>> ddat->gt_n = -1; >>> + init_waitqueue_head(&ddat->waitq); >>> for_each_gt(gt, i915, i) { >>> ddat_gt = hwmon->ddat_gt + i; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-20 7:57 ` Tvrtko Ursulin @ 2023-04-20 15:43 ` Rodrigo Vivi 2023-04-20 16:26 ` Dixit, Ashutosh 0 siblings, 1 reply; 22+ messages in thread From: Rodrigo Vivi @ 2023-04-20 15:43 UTC (permalink / raw) To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel On Thu, Apr 20, 2023 at 08:57:24AM +0100, Tvrtko Ursulin wrote: > > On 19/04/2023 23:10, Dixit, Ashutosh wrote: > > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote: > > > > > > > Hi Tvrtko, > > > > > On 10/04/2023 23:35, Ashutosh Dixit wrote: > > > > Instead of erroring out when GuC reset is in progress, block waiting for > > > > GuC reset to complete which is a more reasonable uapi behavior. > > > > > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > > > --- > > > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > > index 9ab8971679fe3..8471a667dfc71 100644 > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > > > char name[12]; > > > > int gt_n; > > > > bool reset_in_progress; > > > > + wait_queue_head_t waitq; > > > > }; > > > > struct i915_hwmon { > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > > > static int > > > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > { > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > > > + > > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > > > > > Patch looks good to me > > > > Great, thanks :) > > > > > apart that I am not sure what is the purpose of the timeout? This is just > > > the sysfs write path or has more callers? 
> > > > It is just the sysfs path, but the sysfs is accessed also by the oneAPI > > stack (Level 0). In the initial version I also didn't have the timeout > > thinking that the app can send a signal to the blocked thread to unblock > > it. I introduced the timeout after Rodrigo brought it up and I am now > > thinking maybe it's better to have the timeout in the driver since the app > > has no knowledge of how long GuC resets can take. But I can remove it if > > you think it's not needed. > > Maybe I am missing something but I don't get why we would need to provide a > timeout facility in sysfs? If the library writes here to configure something > it already has to expect a blocking write by the nature of a a write(2) and > sysfs contract. It can take long for any reason so I hope we are not > guaranteeing some latency number to someone? Or the concern is just about > things getting stuck? In which case I think Ctrl-C is the answer because > ETIME is not even listed as an errno for write(2). I suggested the timeout on the other version because of that race, which is fixed now with this approach. It is probably better to remove it now to avoid confusions. I'm sorry about that. > > > > If the > > > former perhaps it would be better to just use interruptible everything > > > (mutex and sleep) and wait for as long as it takes or until user presses > > > Ctrl-C? > > > > Now we are not holding the mutexes for long, just long enough do register > > rmw's. So not holding the mutex across GuC reset as we were > > originally. Therefore I am thinking mutex_lock_interruptible is not needed? > > The sleep is already interruptible (TASK_INTERRUPTIBLE). > > Ah yes, you are right. > > Regards, > > Tvrtko > > > Anyway please let me know if you think we need to change anything. > > > > Thanks. 
> > -- > > Ashutosh > > > > > > struct i915_hwmon *hwmon = ddat->hwmon; > > > > intel_wakeref_t wakeref; > > > > - int ret = 0; > > > > + DEFINE_WAIT(wait); > > > > u32 nval; > > > > - mutex_lock(&hwmon->hwmon_lock); > > > > - if (hwmon->ddat.reset_in_progress) { > > > > - ret = -EAGAIN; > > > > - goto unlock; > > > > + /* Block waiting for GuC reset to complete when needed */ > > > > + for (;;) { > > > > + mutex_lock(&hwmon->hwmon_lock); > > > > + > > > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > > > + > > > > + if (!hwmon->ddat.reset_in_progress) > > > > + break; > > > > + > > > > + if (signal_pending(current)) { > > > > + ret = -EINTR; > > > > + break; > > > > + } > > > > + > > > > + if (!timeout) { > > > > + ret = -ETIME; > > > > + break; > > > > + } > > > > + > > > > + mutex_unlock(&hwmon->hwmon_lock); > > > > + > > > > + timeout = schedule_timeout(timeout); > > > > } > > > > + finish_wait(&ddat->waitq, &wait); > > > > + if (ret) > > > > + goto unlock; > > > > + > > > > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > > > /* Disable PL1 limit and verify, because the limit cannot be > > > > disabled on all platforms */ > > > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > > > > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > > > > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > > > > hwmon->ddat.reset_in_progress = false; > > > > + wake_up_all(&hwmon->ddat.waitq); > > > > mutex_unlock(&hwmon->hwmon_lock); > > > > } > > > > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > > > > ddat->uncore = &i915->uncore; > > > > snprintf(ddat->name, sizeof(ddat->name), "i915"); > > > > ddat->gt_n = -1; > > > > + init_waitqueue_head(&ddat->waitq); > > > > for_each_gt(gt, i915, i) { > > > > ddat_gt = hwmon->ddat_gt + i; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-20 15:43 ` Rodrigo Vivi @ 2023-04-20 16:26 ` Dixit, Ashutosh 0 siblings, 0 replies; 22+ messages in thread From: Dixit, Ashutosh @ 2023-04-20 16:26 UTC (permalink / raw) To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel On Thu, 20 Apr 2023 08:43:52 -0700, Rodrigo Vivi wrote: > Hi Rodrigo, > On Thu, Apr 20, 2023 at 08:57:24AM +0100, Tvrtko Ursulin wrote: > > > > On 19/04/2023 23:10, Dixit, Ashutosh wrote: > > > On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote: > > > > > > > > > > Hi Tvrtko, > > > > > > > On 10/04/2023 23:35, Ashutosh Dixit wrote: > > > > > Instead of erroring out when GuC reset is in progress, block waiting for > > > > > GuC reset to complete which is a more reasonable uapi behavior. > > > > > > > > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo) > > > > > > > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > > > > --- > > > > > drivers/gpu/drm/i915/i915_hwmon.c | 38 +++++++++++++++++++++++++++---- > > > > > 1 file changed, 33 insertions(+), 5 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > > > > index 9ab8971679fe3..8471a667dfc71 100644 > > > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > > > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > > > > char name[12]; > > > > > int gt_n; > > > > > bool reset_in_progress; > > > > > + wait_queue_head_t waitq; > > > > > }; > > > > > struct i915_hwmon { > > > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long *val) > > > > > static int > > > > > hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > > > > { > > > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000) > > > > > + > > > > > + int ret = 0, timeout = GUC_RESET_TIMEOUT; > > > > > > > > Patch looks good to me > > > > > > Great, thanks :) > > > > > > > apart that I am 
not sure what is the purpose of the timeout? This is just > > > > the sysfs write path or has more callers? > > > > > > It is just the sysfs path, but the sysfs is accessed also by the oneAPI > > > stack (Level 0). In the initial version I also didn't have the timeout > > > thinking that the app can send a signal to the blocked thread to unblock > > > it. I introduced the timeout after Rodrigo brought it up and I am now > > > thinking maybe it's better to have the timeout in the driver since the app > > > has no knowledge of how long GuC resets can take. But I can remove it if > > > you think it's not needed. > > > > Maybe I am missing something but I don't get why we would need to provide a > > timeout facility in sysfs? If the library writes here to configure something > > it already has to expect a blocking write by the nature of a a write(2) and > > sysfs contract. It can take long for any reason so I hope we are not > > guaranteeing some latency number to someone? Or the concern is just about > > things getting stuck? In which case I think Ctrl-C is the answer because > > ETIME is not even listed as an errno for write(2). Hmm, good point. > I suggested the timeout on the other version because of that race, > which is fixed now with this approach. It is probably better to remove > it now to avoid confusions. I'm sorry about that. No problem, I've removed the timeout in the latest version. Thanks for the R-b. Ashutosh ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware
@ 2023-04-06  4:45 Ashutosh Dixit
  2023-04-06  4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit
  0 siblings, 1 reply; 22+ messages in thread
From: Ashutosh Dixit @ 2023-04-06  4:45 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

Split the v3 patch into 3 patches for easier review, can squash later if
needed.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

Ashutosh Dixit (3):
  drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write
  drm/i915/guc: Disable PL1 power limit when loading GuC firmware
  drm/i915/hwmon: Block waiting for GuC reset to complete

 drivers/gpu/drm/i915/gt/uc/intel_uc.c |  9 ++++
 drivers/gpu/drm/i915/i915_hwmon.c     | 75 ++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_hwmon.h     |  7 +++
 3 files changed, 78 insertions(+), 13 deletions(-)

-- 
2.38.0

^ permalink raw reply	[flat|nested] 22+ messages in thread
* [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete
  2023-04-06  4:45 [Intel-gfx] [PATCH v4 0/3] drm/i915/guc: Disable PL1 power limit when loading GuC firmware Ashutosh Dixit
@ 2023-04-06  4:45 ` Ashutosh Dixit
  2023-04-07 11:04   ` Rodrigo Vivi
  0 siblings, 1 reply; 22+ messages in thread
From: Ashutosh Dixit @ 2023-04-06  4:45 UTC (permalink / raw)
To: intel-gfx; +Cc: dri-devel, Rodrigo Vivi

Instead of erroring out when GuC reset is in progress, block waiting for
GuC reset to complete, which is a more reasonable uapi behavior.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/gpu/drm/i915/i915_hwmon.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c
index 9ab8971679fe3..4343efb48e61b 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -51,6 +51,7 @@ struct hwm_drvdata {
 	char name[12];
 	int gt_n;
 	bool reset_in_progress;
+	wait_queue_head_t wqh;
 };
 
 struct i915_hwmon {
@@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 	int ret = 0;
 	u32 nval;
 
+retry:
 	mutex_lock(&hwmon->hwmon_lock);
 	if (hwmon->ddat.reset_in_progress) {
-		ret = -EAGAIN;
-		goto unlock;
+		mutex_unlock(&hwmon->hwmon_lock);
+		ret = wait_event_interruptible(ddat->wqh,
+					       !hwmon->ddat.reset_in_progress);
+		if (ret)
+			return ret;
+		goto retry;
 	}
 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
 
@@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
 		       PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
 exit:
 	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
-unlock:
 	mutex_unlock(&hwmon->hwmon_lock);
 	return ret;
 }
@@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
 	hwmon->ddat.reset_in_progress = false;
+	wake_up_all(&hwmon->ddat.wqh);
 
 	mutex_unlock(&hwmon->hwmon_lock);
 }
@@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
 	ddat->uncore = &i915->uncore;
 	snprintf(ddat->name, sizeof(ddat->name), "i915");
 	ddat->gt_n = -1;
+	init_waitqueue_head(&ddat->wqh);
 
 	for_each_gt(gt, i915, i) {
 		ddat_gt = hwmon->ddat_gt + i;
-- 
2.38.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-06 4:45 ` [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete Ashutosh Dixit @ 2023-04-07 11:04 ` Rodrigo Vivi 2023-04-10 22:40 ` Dixit, Ashutosh 0 siblings, 1 reply; 22+ messages in thread From: Rodrigo Vivi @ 2023-04-07 11:04 UTC (permalink / raw) To: Ashutosh Dixit; +Cc: intel-gfx, dri-devel On Wed, Apr 05, 2023 at 09:45:22PM -0700, Ashutosh Dixit wrote: > Instead of erroring out when GuC reset is in progress, block waiting for > GuC reset to complete which is a more reasonable uapi behavior. > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > --- > drivers/gpu/drm/i915/i915_hwmon.c | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > index 9ab8971679fe3..4343efb48e61b 100644 > --- a/drivers/gpu/drm/i915/i915_hwmon.c > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > @@ -51,6 +51,7 @@ struct hwm_drvdata { > char name[12]; > int gt_n; > bool reset_in_progress; > + wait_queue_head_t wqh; > }; > > struct i915_hwmon { > @@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) > int ret = 0; > u32 nval; > > +retry: > mutex_lock(&hwmon->hwmon_lock); > if (hwmon->ddat.reset_in_progress) { > - ret = -EAGAIN; > - goto unlock; > + mutex_unlock(&hwmon->hwmon_lock); > + ret = wait_event_interruptible(ddat->wqh, > + !hwmon->ddat.reset_in_progress); this is indeed very clever! maybe just use the timeout version to be on the safeside and then return the -EAGAIN on timeout? my fear is probably due to the lack of knowledge on this wait queue, but I'm wondering what could go wrong if due to some funny race you enter this check right after wake_up_all below has passed and then you will be here indefinitely waiting... 
> + if (ret) > + return ret; > + goto retry; > } > wakeref = intel_runtime_pm_get(ddat->uncore->rpm); > > @@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) > PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval); > exit: > intel_runtime_pm_put(ddat->uncore->rpm, wakeref); > -unlock: > mutex_unlock(&hwmon->hwmon_lock); > return ret; > } > @@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, > PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0); > hwmon->ddat.reset_in_progress = false; > + wake_up_all(&hwmon->ddat.wqh); > > mutex_unlock(&hwmon->hwmon_lock); > } > @@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915) > ddat->uncore = &i915->uncore; > snprintf(ddat->name, sizeof(ddat->name), "i915"); > ddat->gt_n = -1; > + init_waitqueue_head(&ddat->wqh); > > for_each_gt(gt, i915, i) { > ddat_gt = hwmon->ddat_gt + i; > -- > 2.38.0 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Intel-gfx] [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete 2023-04-07 11:04 ` Rodrigo Vivi @ 2023-04-10 22:40 ` Dixit, Ashutosh 0 siblings, 0 replies; 22+ messages in thread From: Dixit, Ashutosh @ 2023-04-10 22:40 UTC (permalink / raw) To: Rodrigo Vivi; +Cc: intel-gfx, dri-devel On Fri, 07 Apr 2023 04:04:06 -0700, Rodrigo Vivi wrote: > Hi Rodrigo, > On Wed, Apr 05, 2023 at 09:45:22PM -0700, Ashutosh Dixit wrote: > > Instead of erroring out when GuC reset is in progress, block waiting for > > GuC reset to complete which is a more reasonable uapi behavior. > > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> > > --- > > drivers/gpu/drm/i915/i915_hwmon.c | 13 ++++++++++--- > > 1 file changed, 10 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c > > index 9ab8971679fe3..4343efb48e61b 100644 > > --- a/drivers/gpu/drm/i915/i915_hwmon.c > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c > > @@ -51,6 +51,7 @@ struct hwm_drvdata { > > char name[12]; > > int gt_n; > > bool reset_in_progress; > > + wait_queue_head_t wqh; > > }; > > > > struct i915_hwmon { > > @@ -400,10 +401,15 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val) > > int ret = 0; > > u32 nval; > > > > +retry: > > mutex_lock(&hwmon->hwmon_lock); > > if (hwmon->ddat.reset_in_progress) { > > - ret = -EAGAIN; > > - goto unlock; > > + mutex_unlock(&hwmon->hwmon_lock); > > + ret = wait_event_interruptible(ddat->wqh, > > + !hwmon->ddat.reset_in_progress); > > this is indeed very clever! Not clever, see below :/ > my fear is probably due to the lack of knowledge on this wait queue, but > I'm wondering what could go wrong if due to some funny race you enter this > check right after wake_up_all below has passed and then you will be here > indefinitely waiting... 
You are absolutely right, there is indeed a race in the patch: in the above
code, when we drop the mutex (mutex_unlock), the wake_up_all can happen
before we have queued ourselves for the wake up.

Solving this race needs a more complicated prepare_to_wait/finish_wait
sequence, which I have gone ahead and implemented in patch v2. The v2 code
is also a standard code pattern, and the pattern I have implemented is
basically the same as that in intel_guc_wait_for_pending_msg() in i915,
which I liked. I have read in several places (e.g. in the Advanced Sleeping
section in https://static.lwn.net/images/pdf/LDD3/ch06.pdf and in the
kernel documentation for try_to_wake_up()) that this sequence avoids the
race (between schedule() and wake_up()). The crucial difference from the v1
patch is that in v2 the mutex is dropped after we queue ourselves in
prepare_to_wait(), just before calling schedule_timeout().

> maybe just use the timeout version to be on the safe side and then return
> the -EAGAIN on timeout?

Also incorporated a timeout in the new version. All code paths in the new
patch have been tested.

Thanks.
--
Ashutosh

> > +		if (ret)
> > +			return ret;
> > +		goto retry;
> > 	}
> > 	wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> >
> > @@ -426,7 +432,6 @@ hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > 		       PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> > exit:
> > 	intel_runtime_pm_put(ddat->uncore->rpm, wakeref);
> > -unlock:
> > 	mutex_unlock(&hwmon->hwmon_lock);
> > 	return ret;
> > }
> > @@ -508,6 +513,7 @@ void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old)
> > 	intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > 			 PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > 	hwmon->ddat.reset_in_progress = false;
> > +	wake_up_all(&hwmon->ddat.wqh);
> >
> > 	mutex_unlock(&hwmon->hwmon_lock);
> > }
> > @@ -784,6 +790,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > 	ddat->uncore = &i915->uncore;
> > 	snprintf(ddat->name, sizeof(ddat->name), "i915");
> > 	ddat->gt_n = -1;
> > +	init_waitqueue_head(&ddat->wqh);
> >
> > 	for_each_gt(gt, i915, i) {
> > 		ddat_gt = hwmon->ddat_gt + i;
> > --
> > 2.38.0

^ permalink raw reply	[flat|nested] 22+ messages in thread