* [PATCH] watchdog: starfive: use pm_runtime_resume_and_get() to fix refcount leak @ 2026-06-05 12:23 William Theesfeld 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 0 siblings, 1 reply; 17+ messages in thread From: William Theesfeld @ 2026-06-05 12:23 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel starfive_wdt_pm_start() calls pm_runtime_get_sync(), but pm_runtime_get_sync() increments the runtime PM usage counter even when it returns an error. On the error path this function just returns the error code without a matching pm_runtime_put_noidle(), so the usage counter is leaked on every failed resume. Switch to pm_runtime_resume_and_get(), which decrements the usage counter automatically on failure, leaving the caller with the simple "check return value and propagate" pattern that the existing code was already trying to follow. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index af55adc4a..5a3254c83 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -371,7 +371,7 @@ static void starfive_wdt_stop(struct starfive_wdt *wdt) static int starfive_wdt_pm_start(struct watchdog_device *wdd) { struct starfive_wdt *wdt = watchdog_get_drvdata(wdd); - int ret = pm_runtime_get_sync(wdd->parent); + int ret = pm_runtime_resume_and_get(wdd->parent); if (ret < 0) return ret; -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup 2026-06-05 12:23 [PATCH] watchdog: starfive: use pm_runtime_resume_and_get() to fix refcount leak William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 13:11 ` [PATCH v2 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld ` (6 more replies) 0 siblings, 7 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel The v1 patch was a one-line conversion of pm_runtime_get_sync() to pm_runtime_resume_and_get() in starfive_wdt_pm_start(). On review, the StarFive maintainers pointed out one edge case the v1 fix had missed, plus five pre-existing issues in the surrounding runtime PM handling that all need to be addressed before the original fix is useful in practice. This v2 turns the single fix into a six-patch series. Each patch addresses one independent issue and can be reviewed in isolation; they are ordered roughly from the simplest correctness fix to the ones requiring a small amount of restructuring. The series: 1/6 Expand the original v1 fix so the runtime PM reference is also released when starfive_wdt_start() fails after a successful pm_runtime_resume_and_get(). 2/6 Treat a return value of 1 from pm_runtime_put_sync() as success in both starfive_wdt_pm_stop() and the probe path; the watchdog framework currently propagates the value verbatim and the probe takes the err_unregister_wdt path on a non-error return. 3/6 Restructure starfive_wdt_probe() error handling into three labels (err_pm_disable / err_put_pm / err_unregister_wdt) so every failure path balances exactly the resources it has acquired. Previously two early failure paths returned without calling pm_runtime_disable() and several later paths leaked the runtime PM reference. 4/6 Gate the system suspend / resume callbacks with pm_runtime_status_suspended() so the WDOGLOAD / WDOGCONTROL accesses run only when the device is actually clocked. When the device was already runtime-suspended the previous code triggered a synchronous external abort. 5/6 Make starfive_wdt_shutdown() only drop the PM refcount when WDOG_ACTIVE is set, and stop just the hardware (no PM touch) when only WDOG_HW_RUNNING is set. This avoids the refcount underflow that the previous unconditional pm_stop() produced. 6/6 Release the early_enable PM refcount in starfive_wdt_remove() when WDOG_HW_RUNNING is still asserted at unregister time. The watchdog framework's stop-on-unregister path only runs the stop op when WDOG_ACTIVE is set, so the early-enable + never-opened combination otherwise leaks the probe-time reference. The whole series builds cleanly with CONFIG_STARFIVE_WATCHDOG=m on x86_64 defconfig. Runtime testing would require StarFive JH7100 / JH7110 hardware and has not been performed. William Theesfeld (6): watchdog: starfive: balance PM refcount when start operation fails watchdog: starfive: treat pm_runtime_put_sync() positive return as success watchdog: starfive: balance PM refcount and disable in probe error paths watchdog: starfive: guard system suspend/resume hardware access watchdog: starfive: avoid PM refcount underflow in shutdown watchdog: starfive: release early_enable PM refcount on remove drivers/watchdog/starfive-wdt.c | 92 ++++++++++++++++++++++++++------- 1 file changed, 73 insertions(+), 19 deletions(-) -- 2.54.0 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 1/6] watchdog: starfive: balance PM refcount when start operation fails 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 13:11 ` [PATCH v2 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success William Theesfeld ` (5 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel After pm_runtime_resume_and_get() succeeds in starfive_wdt_pm_start(), the runtime PM usage counter is incremented. If starfive_wdt_start() subsequently fails, the function returns the error without releasing that reference, leaking the usage counter. Call pm_runtime_put_sync() on the start failure path so the counter is balanced. This also folds in the v1 conversion of pm_runtime_get_sync() to pm_runtime_resume_and_get(), which was the original purpose of this patch; both fixes together close the refcount-leak window in this function as flagged by the maintainer on the v1 thread. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index af55adc4a..ed8c5711a 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -371,12 +371,16 @@ static void starfive_wdt_stop(struct starfive_wdt *wdt) static int starfive_wdt_pm_start(struct watchdog_device *wdd) { struct starfive_wdt *wdt = watchdog_get_drvdata(wdd); - int ret = pm_runtime_get_sync(wdd->parent); + int ret = pm_runtime_resume_and_get(wdd->parent); if (ret < 0) return ret; - return starfive_wdt_start(wdt); + ret = starfive_wdt_start(wdt); + if (ret) + pm_runtime_put_sync(wdd->parent); + + return ret; } static int starfive_wdt_pm_stop(struct watchdog_device *wdd) -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 2026-06-05 13:11 ` [PATCH v2 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 13:11 ` [PATCH v2 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths William Theesfeld ` (4 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel pm_runtime_put_sync() returns 1 when the device cannot be suspended because it is still in use (typically due to other runtime PM users holding references); only negative return values are real errors. Both starfive_wdt_pm_stop() and starfive_wdt_probe() currently treat any non-zero return as failure: pm_stop returns the value verbatim, which the watchdog framework propagates as an error, and probe takes the err_unregister_wdt path even on a successful but non-zero return. Mask off the positive return value in pm_stop and tighten the probe check to "< 0" so the legitimate non-suspended case is no longer mishandled. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index ed8c5711a..c821b38d2 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -386,9 +386,17 @@ static int starfive_wdt_pm_start(struct watchdog_device *wdd) static int starfive_wdt_pm_stop(struct watchdog_device *wdd) { struct starfive_wdt *wdt = watchdog_get_drvdata(wdd); + int ret; starfive_wdt_stop(wdt); - return pm_runtime_put_sync(wdd->parent); + ret = pm_runtime_put_sync(wdd->parent); + /* + * pm_runtime_put_sync() can return 1 when the device is still in + * use (and therefore did not actually suspend). That is not an + * error from the caller's point of view; only propagate negative + * return values. + */ + return ret < 0 ? ret : 0; } static int starfive_wdt_set_timeout(struct watchdog_device *wdd, @@ -503,7 +511,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (!early_enable) { if (pm_runtime_enabled(&pdev->dev)) { ret = pm_runtime_put_sync(&pdev->dev); - if (ret) + if (ret < 0) goto err_unregister_wdt; } } -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 2026-06-05 13:11 ` [PATCH v2 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld 2026-06-05 13:11 ` [PATCH v2 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 13:11 ` [PATCH v2 4/6] watchdog: starfive: guard system suspend/resume hardware access William Theesfeld ` (3 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel The probe path takes a runtime PM reference via pm_runtime_resume_and_get() (or enables the clocks directly when runtime PM is unavailable), but several error paths after that point do not release that reference before returning, and the two earliest error paths return without calling pm_runtime_disable() at all even though pm_runtime_enable() has already run. Restructure the error handling into three labels so every failure path balances exactly the resources it has acquired: err_pm_disable: pm_runtime_enable() ran but no clock/refcount was taken yet (resume_and_get / enable_clock failed). err_put_pm: clock or PM refcount is held; release it, then fall through to disable runtime PM. err_unregister_wdt: watchdog_register_device() succeeded; unregister before releasing the refcount. Update the in-function goto targets to use these labels and remove the early "return ret;" paths so pm_runtime_disable() is always run once pm_runtime_enable() has been called. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index c821b38d2..6539bf089 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -460,17 +460,17 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (pm_runtime_enabled(&pdev->dev)) { ret = pm_runtime_resume_and_get(&pdev->dev); if (ret < 0) - return ret; + goto err_pm_disable; } else { /* runtime PM is disabled but clocks need to be enabled */ ret = starfive_wdt_enable_clock(wdt); if (ret) - return ret; + goto err_pm_disable; } ret = starfive_wdt_reset_init(&pdev->dev); if (ret) - goto err_exit; + goto err_put_pm; watchdog_set_drvdata(&wdt->wdd, wdt); wdt->wdd.info = &starfive_wdt_info; @@ -482,7 +482,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (!wdt->freq) { dev_err(&pdev->dev, "get clock rate failed.\n"); ret = -EINVAL; - goto err_exit; + goto err_put_pm; } wdt->wdd.min_timeout = 1; @@ -498,7 +498,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (early_enable) { ret = starfive_wdt_start(wdt); if (ret) - goto err_exit; + goto err_put_pm; set_bit(WDOG_HW_RUNNING, &wdt->wdd.status); } else { starfive_wdt_stop(wdt); @@ -506,7 +506,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) ret = watchdog_register_device(&wdt->wdd); if (ret) - goto err_exit; + goto err_put_pm; if (!early_enable) { if (pm_runtime_enabled(&pdev->dev)) { @@ -520,8 +520,12 @@ static int starfive_wdt_probe(struct platform_device *pdev) err_unregister_wdt: watchdog_unregister_device(&wdt->wdd); -err_exit: - starfive_wdt_disable_clock(wdt); +err_put_pm: + if (pm_runtime_enabled(&pdev->dev)) + pm_runtime_put_sync(&pdev->dev); + else + starfive_wdt_disable_clock(wdt); +err_pm_disable: pm_runtime_disable(&pdev->dev); return ret; -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 4/6] watchdog: starfive: guard system suspend/resume hardware access 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (2 preceding siblings ...) 2026-06-05 13:11 ` [PATCH v2 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 13:11 ` [PATCH v2 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown William Theesfeld ` (2 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel starfive_wdt_suspend() reads from STARFIVE_WDT_VALUE and writes to WDOGCONTROL via starfive_wdt_get_count() and starfive_wdt_stop() before calling pm_runtime_force_suspend(). If the device was already runtime-suspended when system suspend began, the apb and core clocks are gated and those register accesses produce a synchronous external abort. starfive_wdt_resume() has the mirror problem: it unconditionally restores the WDOGLOAD value after pm_runtime_force_resume() returns, but pm_runtime_force_resume() leaves the device in whichever runtime PM state it was in pre-suspend. If the device stays suspended, the restore writes race with gated clocks. Gate both halves with pm_runtime_status_suspended() so the hardware is only touched when it is actually active. When the watchdog was suspended at runtime PM level there is no state to save or restore. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index 6539bf089..4926984bf 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -556,11 +556,16 @@ static int starfive_wdt_suspend(struct device *dev) { struct starfive_wdt *wdt = dev_get_drvdata(dev); - /* Save watchdog state, and turn it off. */ - wdt->reload = starfive_wdt_get_count(wdt); - - /* Note that WTCNT doesn't need to be saved. */ - starfive_wdt_stop(wdt); + /* + * Only access the hardware while the device is runtime-resumed; if + * runtime PM has already suspended the device, its clocks are gated + * and a register read/write would trigger a synchronous external + * abort. WTCNT does not need to be saved. + */ + if (!pm_runtime_status_suspended(dev)) { + wdt->reload = starfive_wdt_get_count(wdt); + starfive_wdt_stop(wdt); + } return pm_runtime_force_suspend(dev); } @@ -574,8 +579,16 @@ static int starfive_wdt_resume(struct device *dev) if (ret) return ret; + /* + * pm_runtime_force_resume() leaves the device in whichever runtime + * PM state it was in before system suspend. Only restore hardware + * state when the device is actually resumed; otherwise the register + * writes below would race with gated clocks. + */ + if (pm_runtime_status_suspended(dev)) + return 0; + starfive_wdt_unlock(wdt); - /* Restore watchdog state. */ starfive_wdt_set_reload_count(wdt, wdt->reload); starfive_wdt_lock(wdt); -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (3 preceding siblings ...) 2026-06-05 13:11 ` [PATCH v2 4/6] watchdog: starfive: guard system suspend/resume hardware access William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 13:11 ` [PATCH v2 6/6] watchdog: starfive: release early_enable PM refcount on remove William Theesfeld 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel starfive_wdt_shutdown() unconditionally calls starfive_wdt_pm_stop(), which drops a runtime PM reference via pm_runtime_put_sync(). If the device was never opened (so there was no matching pm_start()), and early_enable is also off (so probe already balanced its own get/put), the counter is at zero and put_sync() drives it negative. Only call pm_stop() when WDOG_ACTIVE is set (meaning a paired pm_start() happened on open). For the early_enable-and-never-opened case, the framework state is WDOG_HW_RUNNING without WDOG_ACTIVE; in that case stop just the hardware so the watchdog does not keep ticking through shutdown, while leaving the PM refcount for the remove path to release. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index 4926984bf..848935dc6 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -549,7 +549,19 @@ static void starfive_wdt_shutdown(struct platform_device *pdev) { struct starfive_wdt *wdt = platform_get_drvdata(pdev); - starfive_wdt_pm_stop(&wdt->wdd); + /* + * Only drop a runtime PM reference when we actually hold one. The + * watchdog framework increments it via pm_start() on open; when + * early_enable started the hardware without an open, the reference + * was taken at probe time and the framework state reads as + * WDOG_HW_RUNNING without WDOG_ACTIVE. In the inactive-and-not- + * running case there is nothing for us to do and dropping the + * counter unconditionally would underflow it. + */ + if (watchdog_active(&wdt->wdd)) + starfive_wdt_pm_stop(&wdt->wdd); + else if (test_bit(WDOG_HW_RUNNING, &wdt->wdd.status)) + starfive_wdt_stop(wdt); } static int starfive_wdt_suspend(struct device *dev) -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 6/6] watchdog: starfive: release early_enable PM refcount on remove 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (4 preceding siblings ...) 2026-06-05 13:11 ` [PATCH v2 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown William Theesfeld @ 2026-06-05 13:11 ` William Theesfeld 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 13:11 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel When early_enable starts the hardware at probe time, the runtime PM reference taken by pm_runtime_resume_and_get() is intentionally kept to hold the device runtime-resumed. In the normal (userspace-opened) case the watchdog framework unwinds that reference via pm_stop() on close, and again via stop_on_unregister() during watchdog_unregister_device(). But watchdog_unregister_device() only runs the stop path when WDOG_ACTIVE is set (see drivers/watchdog/watchdog_dev.c). If early_enable started the hardware and userspace never opened it, WDOG_HW_RUNNING is set without WDOG_ACTIVE: unregister returns without touching pm_stop, and the probe-time PM reference is leaked. Drop that reference in starfive_wdt_remove() before pm_runtime_disable() runs, when WDOG_HW_RUNNING is still asserted. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index 848935dc6..d9175b045 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -538,6 +538,19 @@ static void starfive_wdt_remove(struct platform_device *pdev) starfive_wdt_stop(wdt); watchdog_unregister_device(&wdt->wdd); + /* + * watchdog_unregister_device() only calls our pm_stop op when + * WDOG_ACTIVE is set. When early_enable started the hardware at + * probe time and userspace never opened the watchdog, the framework + * state is WDOG_HW_RUNNING without WDOG_ACTIVE: the unregister path + * leaves the runtime PM reference taken at probe outstanding. Drop + * it here before disabling runtime PM so the usage counter does not + * leak. + */ + if (test_bit(WDOG_HW_RUNNING, &wdt->wdd.status) && + pm_runtime_enabled(&pdev->dev)) + pm_runtime_put_sync(&pdev->dev); + if (pm_runtime_enabled(&pdev->dev)) pm_runtime_disable(&pdev->dev); else -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (5 preceding siblings ...) 2026-06-05 13:11 ` [PATCH v2 6/6] watchdog: starfive: release early_enable PM refcount on remove William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-05 17:19 ` [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld ` (6 more replies) 6 siblings, 7 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel This series addresses runtime PM correctness issues in the StarFive watchdog driver. v3 fixes three issues that were raised against v2: Changes in v3: - 2/6 (pm_runtime_put_sync positive return): the v2 commit message and code comment described the semantics of a "1" return as "the device is still in use", which is wrong — that case is signalled by -EAGAIN. A positive return indicates a non-error condition such as the device already being in the requested state. Reword both the commit message and the in-function comment to describe this correctly. No code change beyond that. - 3/6 (probe error path restructure): the v2 patch left a path that put the runtime PM usage counter twice. When the success-path pm_runtime_put_sync() (line ~513) returns an error, the function jumps to err_unregister_wdt. In v2 that label fell through to err_put_pm, which called pm_runtime_put_sync() a second time on a counter that had already been decremented by the first put. Insert an explicit "goto err_pm_disable;" after the unregister so the post-register error path skips the second put. - 4/6 (system suspend/resume guard): the v2 patch added a pm_runtime_status_suspended() check on the resume path but kept the final restart predicate as watchdog_active(). For watchdogs started via early_enable, the framework state is WDOG_HW_RUNNING without WDOG_ACTIVE, so the existing predicate left the hardware permanently stopped after a system suspend/resume cycle (the matching suspend path had already stopped it). Extend the predicate to also accept watchdog_hw_running(). Patches 1, 5, and 6 are unchanged from v2. Several pre-existing issues in adjacent code paths were flagged during v2 review (hardware access during driver unbind, possible races between device_shutdown() and the watchdog reboot notifier, SETTIMEOUT/GETTIMELEFT ioctls touching gated registers). They are not addressed in this series; each is a separate logical change that deserves its own patch and its own justification. Happy to follow up on any the maintainers want addressed first. The whole series builds cleanly with CONFIG_STARFIVE_WATCHDOG=m on x86_64 defconfig. Runtime testing would require StarFive JH7100 / JH7110 hardware and has not been performed. William Theesfeld (6): watchdog: starfive: balance PM refcount when start operation fails watchdog: starfive: treat pm_runtime_put_sync() positive return as success watchdog: starfive: balance PM refcount and disable in probe error paths watchdog: starfive: guard system suspend/resume hardware access watchdog: starfive: avoid PM refcount underflow in shutdown watchdog: starfive: release early_enable PM refcount on remove drivers/watchdog/starfive-wdt.c | 109 ++++++++++++++++++++++++++------ 1 file changed, 89 insertions(+), 20 deletions(-) -- 2.54.0 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-08 19:39 ` Guenter Roeck 2026-06-05 17:19 ` [PATCH v3 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success William Theesfeld ` (5 subsequent siblings) 6 siblings, 1 reply; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel After pm_runtime_resume_and_get() succeeds in starfive_wdt_pm_start(), the runtime PM usage counter is incremented. If starfive_wdt_start() subsequently fails, the function returns the error without releasing that reference, leaking the usage counter. Call pm_runtime_put_sync() on the start failure path so the counter is balanced. This also folds in the v1 conversion of pm_runtime_get_sync() to pm_runtime_resume_and_get(), which was the original purpose of this patch; both fixes together close the refcount-leak window in this function as flagged by the maintainer on the v1 thread. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index af55adc4a..ed8c5711a 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -371,12 +371,16 @@ static void starfive_wdt_stop(struct starfive_wdt *wdt) static int starfive_wdt_pm_start(struct watchdog_device *wdd) { struct starfive_wdt *wdt = watchdog_get_drvdata(wdd); - int ret = pm_runtime_get_sync(wdd->parent); + int ret = pm_runtime_resume_and_get(wdd->parent); if (ret < 0) return ret; - return starfive_wdt_start(wdt); + ret = starfive_wdt_start(wdt); + if (ret) + pm_runtime_put_sync(wdd->parent); + + return ret; } static int starfive_wdt_pm_stop(struct watchdog_device *wdd) -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails 2026-06-05 17:19 ` [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld @ 2026-06-08 19:39 ` Guenter Roeck 0 siblings, 0 replies; 17+ messages in thread From: Guenter Roeck @ 2026-06-08 19:39 UTC (permalink / raw) To: William Theesfeld Cc: Xingyu Wu, Ziv Xu, Wim Van Sebroeck, linux-watchdog, linux-kernel On Fri, Jun 05, 2026 at 01:19:37PM -0400, William Theesfeld wrote: > After pm_runtime_resume_and_get() succeeds in starfive_wdt_pm_start(), > the runtime PM usage counter is incremented. If starfive_wdt_start() > subsequently fails, the function returns the error without releasing > that reference, leaking the usage counter. > > Call pm_runtime_put_sync() on the start failure path so the counter > is balanced. > > This also folds in the v1 conversion of pm_runtime_get_sync() to > pm_runtime_resume_and_get(), which was the original purpose of this > patch; both fixes together close the refcount-leak window in this > function as flagged by the maintainer on the v1 thread. That description doesn't really match the code. The purpose of this patch is to fix a refcount leak in the upstream code, not to fix the refcount leak introduced by the first version of this patch. That the first version of the patch introduced a new refcount leak is irrelevant (and actually confusing). Guenter > > Signed-off-by: William Theesfeld <william@theesfeld.net> > --- > drivers/watchdog/starfive-wdt.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c > index af55adc4a..ed8c5711a 100644 > --- a/drivers/watchdog/starfive-wdt.c > +++ b/drivers/watchdog/starfive-wdt.c > @@ -371,12 +371,16 @@ static void starfive_wdt_stop(struct starfive_wdt *wdt) > static int starfive_wdt_pm_start(struct watchdog_device *wdd) > { > struct starfive_wdt *wdt = watchdog_get_drvdata(wdd); > - int ret = pm_runtime_get_sync(wdd->parent); > + int ret = pm_runtime_resume_and_get(wdd->parent); > > if (ret < 0) > return ret; > > - return starfive_wdt_start(wdt); > + ret = starfive_wdt_start(wdt); > + if (ret) > + pm_runtime_put_sync(wdd->parent); > + > + return ret; > } > > static int starfive_wdt_pm_stop(struct watchdog_device *wdd) ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v3 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 2026-06-05 17:19 ` [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-05 17:19 ` [PATCH v3 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths William Theesfeld ` (4 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel pm_runtime_put_sync() can return a positive value to signal a non-error condition (for example, the device was already in the requested state); only negative return values are real errors. Both starfive_wdt_pm_stop() and starfive_wdt_probe() currently treat any non-zero return as failure: pm_stop returns the value verbatim, which the watchdog framework propagates as an error, and probe takes the err_unregister_wdt path even on a successful but non-zero return. Mask off the positive return value in pm_stop and tighten the probe check to "< 0" so the legitimate positive return is no longer mishandled. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index ed8c5711a..e047f52b0 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -386,9 +386,17 @@ static int starfive_wdt_pm_start(struct watchdog_device *wdd) static int starfive_wdt_pm_stop(struct watchdog_device *wdd) { struct starfive_wdt *wdt = watchdog_get_drvdata(wdd); + int ret; starfive_wdt_stop(wdt); - return pm_runtime_put_sync(wdd->parent); + ret = pm_runtime_put_sync(wdd->parent); + /* + * pm_runtime_put_sync() can return a positive value to signal a + * non-error condition (for example, the device was already in the + * requested state and no suspend callback was needed). Only + * propagate negative return values as failures. + */ + return ret < 0 ? ret : 0; } static int starfive_wdt_set_timeout(struct watchdog_device *wdd, @@ -503,7 +511,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (!early_enable) { if (pm_runtime_enabled(&pdev->dev)) { ret = pm_runtime_put_sync(&pdev->dev); - if (ret) + if (ret < 0) goto err_unregister_wdt; } } -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v3 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 2026-06-05 17:19 ` [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld 2026-06-05 17:19 ` [PATCH v3 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-05 17:19 ` [PATCH v3 4/6] watchdog: starfive: guard system suspend/resume hardware access William Theesfeld ` (3 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel The probe path takes a runtime PM reference via pm_runtime_resume_and_get() (or enables the clocks directly when runtime PM is unavailable), but several error paths after that point do not release that reference before returning, and the two earliest error paths return without calling pm_runtime_disable() at all even though pm_runtime_enable() has already run. Restructure the error handling into three labels so every failure path balances exactly the resources it has acquired: err_pm_disable: pm_runtime_enable() ran but no clock/refcount was taken yet (resume_and_get / enable_clock failed). err_put_pm: clock or PM refcount is held; release it, then fall through to disable runtime PM. err_unregister_wdt: watchdog_register_device() succeeded and the success-path pm_runtime_put_sync() returned an error. The put has already decremented the counter, so this path jumps directly to err_pm_disable rather than falling through to err_put_pm; otherwise the counter would be decremented a second time and underflow. Update the in-function goto targets to use these labels and remove the early "return ret;" paths so pm_runtime_disable() is always run once pm_runtime_enable() has been called. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index e047f52b0..856e55f04 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -460,17 +460,17 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (pm_runtime_enabled(&pdev->dev)) { ret = pm_runtime_resume_and_get(&pdev->dev); if (ret < 0) - return ret; + goto err_pm_disable; } else { /* runtime PM is disabled but clocks need to be enabled */ ret = starfive_wdt_enable_clock(wdt); if (ret) - return ret; + goto err_pm_disable; } ret = starfive_wdt_reset_init(&pdev->dev); if (ret) - goto err_exit; + goto err_put_pm; watchdog_set_drvdata(&wdt->wdd, wdt); wdt->wdd.info = &starfive_wdt_info; @@ -482,7 +482,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (!wdt->freq) { dev_err(&pdev->dev, "get clock rate failed.\n"); ret = -EINVAL; - goto err_exit; + goto err_put_pm; } wdt->wdd.min_timeout = 1; @@ -498,7 +498,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) if (early_enable) { ret = starfive_wdt_start(wdt); if (ret) - goto err_exit; + goto err_put_pm; set_bit(WDOG_HW_RUNNING, &wdt->wdd.status); } else { starfive_wdt_stop(wdt); @@ -506,7 +506,7 @@ static int starfive_wdt_probe(struct platform_device *pdev) ret = watchdog_register_device(&wdt->wdd); if (ret) - goto err_exit; + goto err_put_pm; if (!early_enable) { if (pm_runtime_enabled(&pdev->dev)) { @@ -520,8 +520,20 @@ static int starfive_wdt_probe(struct platform_device *pdev) err_unregister_wdt: watchdog_unregister_device(&wdt->wdd); -err_exit: - starfive_wdt_disable_clock(wdt); + /* + * The only path into err_unregister_wdt is the post-register + * pm_runtime_put_sync() that returned an error. That call already + * decremented the runtime PM usage counter, so falling through to + * err_put_pm would put again and underflow the counter. Jump + * straight to err_pm_disable. + */ + goto err_pm_disable; +err_put_pm: + if (pm_runtime_enabled(&pdev->dev)) + pm_runtime_put_sync(&pdev->dev); + else + starfive_wdt_disable_clock(wdt); +err_pm_disable: pm_runtime_disable(&pdev->dev); return ret; -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v3 4/6] watchdog: starfive: guard system suspend/resume hardware access 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (2 preceding siblings ...) 2026-06-05 17:19 ` [PATCH v3 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-05 17:19 ` [PATCH v3 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown William Theesfeld ` (2 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel starfive_wdt_suspend() reads from STARFIVE_WDT_VALUE and writes to WDOGCONTROL via starfive_wdt_get_count() and starfive_wdt_stop() before calling pm_runtime_force_suspend(). If the device was already runtime-suspended when system suspend began, the apb and core clocks are gated and those register accesses produce a synchronous external abort. starfive_wdt_resume() has the mirror problem: it unconditionally restores the WDOGLOAD value after pm_runtime_force_resume() returns, but pm_runtime_force_resume() leaves the device in whichever runtime PM state it was in pre-suspend. If the device stays suspended, the restore writes race with gated clocks. Gate both halves with pm_runtime_status_suspended() so the hardware is only touched when it is actually active. When the watchdog was suspended at runtime PM level there is no state to save or restore. Additionally, restart the hardware on resume for the early_enable case (WDOG_HW_RUNNING set without WDOG_ACTIVE). Without this, a system suspend stops the early-enabled hardware via the suspend callback above, but the resume callback only restarts when watchdog_active() is true, leaving the watchdog permanently stopped after the suspend/resume cycle. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 34 ++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index 856e55f04..e3a47c0c3 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -564,11 +564,16 @@ static int starfive_wdt_suspend(struct device *dev) { struct starfive_wdt *wdt = dev_get_drvdata(dev); - /* Save watchdog state, and turn it off. */ - wdt->reload = starfive_wdt_get_count(wdt); - - /* Note that WTCNT doesn't need to be saved. */ - starfive_wdt_stop(wdt); + /* + * Only access the hardware while the device is runtime-resumed; if + * runtime PM has already suspended the device, its clocks are gated + * and a register read/write would trigger a synchronous external + * abort. WTCNT does not need to be saved. + */ + if (!pm_runtime_status_suspended(dev)) { + wdt->reload = starfive_wdt_get_count(wdt); + starfive_wdt_stop(wdt); + } return pm_runtime_force_suspend(dev); } @@ -582,12 +587,27 @@ static int starfive_wdt_resume(struct device *dev) if (ret) return ret; + /* + * pm_runtime_force_resume() leaves the device in whichever runtime + * PM state it was in before system suspend. Only restore hardware + * state when the device is actually resumed; otherwise the register + * writes below would race with gated clocks. + */ + if (pm_runtime_status_suspended(dev)) + return 0; + starfive_wdt_unlock(wdt); - /* Restore watchdog state. */ starfive_wdt_set_reload_count(wdt, wdt->reload); starfive_wdt_lock(wdt); - if (watchdog_active(&wdt->wdd)) + /* + * Restart the hardware on resume for both userspace-opened + * watchdogs (WDOG_ACTIVE) and watchdogs started via early_enable + * (WDOG_HW_RUNNING). Checking only WDOG_ACTIVE leaves the + * early_enable case stopped after a suspend/resume cycle even + * though the framework still considers the hardware running. + */ + if (watchdog_active(&wdt->wdd) || watchdog_hw_running(&wdt->wdd)) return starfive_wdt_start(wdt); return 0; -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v3 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (3 preceding siblings ...) 2026-06-05 17:19 ` [PATCH v3 4/6] watchdog: starfive: guard system suspend/resume hardware access William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-05 17:19 ` [PATCH v3 6/6] watchdog: starfive: release early_enable PM refcount on remove William Theesfeld 2026-06-08 20:46 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup Guenter Roeck 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel starfive_wdt_shutdown() unconditionally calls starfive_wdt_pm_stop(), which drops a runtime PM reference via pm_runtime_put_sync(). If the device was never opened (so there was no matching pm_start()), and early_enable is also off (so probe already balanced its own get/put), the counter is at zero and put_sync() drives it negative. Only call pm_stop() when WDOG_ACTIVE is set (meaning a paired pm_start() happened on open). For the early_enable-and-never-opened case, the framework state is WDOG_HW_RUNNING without WDOG_ACTIVE; in that case stop just the hardware so the watchdog does not keep ticking through shutdown, while leaving the PM refcount for the remove path to release. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index e3a47c0c3..6137e98f1 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -557,7 +557,19 @@ static void starfive_wdt_shutdown(struct platform_device *pdev) { struct starfive_wdt *wdt = platform_get_drvdata(pdev); - starfive_wdt_pm_stop(&wdt->wdd); + /* + * Only drop a runtime PM reference when we actually hold one. The + * watchdog framework increments it via pm_start() on open; when + * early_enable started the hardware without an open, the reference + * was taken at probe time and the framework state reads as + * WDOG_HW_RUNNING without WDOG_ACTIVE. In the inactive-and-not- + * running case there is nothing for us to do and dropping the + * counter unconditionally would underflow it. + */ + if (watchdog_active(&wdt->wdd)) + starfive_wdt_pm_stop(&wdt->wdd); + else if (test_bit(WDOG_HW_RUNNING, &wdt->wdd.status)) + starfive_wdt_stop(wdt); } static int starfive_wdt_suspend(struct device *dev) -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v3 6/6] watchdog: starfive: release early_enable PM refcount on remove 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (4 preceding siblings ...) 2026-06-05 17:19 ` [PATCH v3 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown William Theesfeld @ 2026-06-05 17:19 ` William Theesfeld 2026-06-08 20:46 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup Guenter Roeck 6 siblings, 0 replies; 17+ messages in thread From: William Theesfeld @ 2026-06-05 17:19 UTC (permalink / raw) To: Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, Guenter Roeck, linux-watchdog, linux-kernel When early_enable starts the hardware at probe time, the runtime PM reference taken by pm_runtime_resume_and_get() is intentionally kept to hold the device runtime-resumed. In the normal (userspace-opened) case the watchdog framework unwinds that reference via pm_stop() on close, and again via stop_on_unregister() during watchdog_unregister_device(). But watchdog_unregister_device() only runs the stop path when WDOG_ACTIVE is set (see drivers/watchdog/watchdog_dev.c). If early_enable started the hardware and userspace never opened it, WDOG_HW_RUNNING is set without WDOG_ACTIVE: unregister returns without touching pm_stop, and the probe-time PM reference is leaked. Drop that reference in starfive_wdt_remove() before pm_runtime_disable() runs, when WDOG_HW_RUNNING is still asserted. Signed-off-by: William Theesfeld <william@theesfeld.net> --- drivers/watchdog/starfive-wdt.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/drivers/watchdog/starfive-wdt.c b/drivers/watchdog/starfive-wdt.c index 6137e98f1..ad5bb67d4 100644 --- a/drivers/watchdog/starfive-wdt.c +++ b/drivers/watchdog/starfive-wdt.c @@ -546,6 +546,19 @@ static void starfive_wdt_remove(struct platform_device *pdev) starfive_wdt_stop(wdt); watchdog_unregister_device(&wdt->wdd); + /* + * watchdog_unregister_device() only calls our pm_stop op when + * WDOG_ACTIVE is set. When early_enable started the hardware at + * probe time and userspace never opened the watchdog, the framework + * state is WDOG_HW_RUNNING without WDOG_ACTIVE: the unregister path + * leaves the runtime PM reference taken at probe outstanding. Drop + * it here before disabling runtime PM so the usage counter does not + * leak. + */ + if (test_bit(WDOG_HW_RUNNING, &wdt->wdd.status) && + pm_runtime_enabled(&pdev->dev)) + pm_runtime_put_sync(&pdev->dev); + if (pm_runtime_enabled(&pdev->dev)) pm_runtime_disable(&pdev->dev); else -- 2.54.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld ` (5 preceding siblings ...) 2026-06-05 17:19 ` [PATCH v3 6/6] watchdog: starfive: release early_enable PM refcount on remove William Theesfeld @ 2026-06-08 20:46 ` Guenter Roeck 6 siblings, 0 replies; 17+ messages in thread From: Guenter Roeck @ 2026-06-08 20:46 UTC (permalink / raw) To: William Theesfeld, Xingyu Wu Cc: Ziv Xu, Wim Van Sebroeck, linux-watchdog, linux-kernel On 6/5/26 10:19, William Theesfeld wrote: > This series addresses runtime PM correctness issues in the StarFive > watchdog driver. v3 fixes three issues that were raised against v2: > I just realized that you are sending new versions of this series as reply to previous versions. No surprise that I am having trouble finding them. Please follow the guidance for submitting patches and stop doing that. Thanks, Guenter > Changes in v3: > - 2/6 (pm_runtime_put_sync positive return): the v2 commit message > and code comment described the semantics of a "1" return as "the > device is still in use", which is wrong — that case is signalled > by -EAGAIN. A positive return indicates a non-error condition such > as the device already being in the requested state. Reword both > the commit message and the in-function comment to describe this > correctly. No code change beyond that. > > - 3/6 (probe error path restructure): the v2 patch left a path that > put the runtime PM usage counter twice. When the success-path > pm_runtime_put_sync() (line ~513) returns an error, the function > jumps to err_unregister_wdt. In v2 that label fell through to > err_put_pm, which called pm_runtime_put_sync() a second time on a > counter that had already been decremented by the first put. > Insert an explicit "goto err_pm_disable;" after the unregister so > the post-register error path skips the second put. > > - 4/6 (system suspend/resume guard): the v2 patch added a > pm_runtime_status_suspended() check on the resume path but kept > the final restart predicate as watchdog_active(). For watchdogs > started via early_enable, the framework state is WDOG_HW_RUNNING > without WDOG_ACTIVE, so the existing predicate left the hardware > permanently stopped after a system suspend/resume cycle (the > matching suspend path had already stopped it). Extend the > predicate to also accept watchdog_hw_running(). > > Patches 1, 5, and 6 are unchanged from v2. > > Several pre-existing issues in adjacent code paths were flagged > during v2 review (hardware access during driver unbind, possible > races between device_shutdown() and the watchdog reboot notifier, > SETTIMEOUT/GETTIMELEFT ioctls touching gated registers). They are > not addressed in this series; each is a separate logical change > that deserves its own patch and its own justification. Happy to > follow up on any the maintainers want addressed first. > > The whole series builds cleanly with CONFIG_STARFIVE_WATCHDOG=m on > x86_64 defconfig. Runtime testing would require StarFive JH7100 / > JH7110 hardware and has not been performed. > > William Theesfeld (6): > watchdog: starfive: balance PM refcount when start operation fails > watchdog: starfive: treat pm_runtime_put_sync() positive return as > success > watchdog: starfive: balance PM refcount and disable in probe error > paths > watchdog: starfive: guard system suspend/resume hardware access > watchdog: starfive: avoid PM refcount underflow in shutdown > watchdog: starfive: release early_enable PM refcount on remove > > drivers/watchdog/starfive-wdt.c | 109 ++++++++++++++++++++++++++------ > 1 file changed, 89 insertions(+), 20 deletions(-) > ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2026-06-08 20:46 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-05 12:23 [PATCH] watchdog: starfive: use pm_runtime_resume_and_get() to fix refcount leak William Theesfeld 2026-06-05 13:11 ` [PATCH v2 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 2026-06-05 13:11 ` [PATCH v2 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld 2026-06-05 13:11 ` [PATCH v2 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success William Theesfeld 2026-06-05 13:11 ` [PATCH v2 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths William Theesfeld 2026-06-05 13:11 ` [PATCH v2 4/6] watchdog: starfive: guard system suspend/resume hardware access William Theesfeld 2026-06-05 13:11 ` [PATCH v2 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown William Theesfeld 2026-06-05 13:11 ` [PATCH v2 6/6] watchdog: starfive: release early_enable PM refcount on remove William Theesfeld 2026-06-05 17:19 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup William Theesfeld 2026-06-05 17:19 ` [PATCH v3 1/6] watchdog: starfive: balance PM refcount when start operation fails William Theesfeld 2026-06-08 19:39 ` Guenter Roeck 2026-06-05 17:19 ` [PATCH v3 2/6] watchdog: starfive: treat pm_runtime_put_sync() positive return as success William Theesfeld 2026-06-05 17:19 ` [PATCH v3 3/6] watchdog: starfive: balance PM refcount and disable in probe error paths William Theesfeld 2026-06-05 17:19 ` [PATCH v3 4/6] watchdog: starfive: guard system suspend/resume hardware access William Theesfeld 2026-06-05 17:19 ` [PATCH v3 5/6] watchdog: starfive: avoid PM refcount underflow in shutdown William Theesfeld 2026-06-05 17:19 ` [PATCH v3 6/6] watchdog: starfive: release early_enable PM refcount on remove William Theesfeld 2026-06-08 20:46 ` [PATCH v3 0/6] watchdog: starfive: runtime PM cleanup Guenter Roeck
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox