Linux Watchdog driver development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Kartik Rajput" <kkartik@nvidia.com>
Cc: linux-watchdog@vger.kernel.org
Subject: Re: [PATCH 4/4] clocksource/drivers/timer-tegra186: Reserve and service a kernel watchdog
Date: Fri, 08 May 2026 00:23:41 +0000	[thread overview]
Message-ID: <20260508002341.ED38AC2BCB2@smtp.kernel.org> (raw)
In-Reply-To: <20260507154557.2082697-5-kkartik@nvidia.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] The interrupt handler fails to clear shared interrupts from userspace watchdogs, causing an infinite IRQ storm.
- [High] Pinging the kernel watchdog from a hardware interrupt handler defeats the purpose of detecting software hangs.
- [High] Missing teardown of the kernel watchdog on driver unbind causes an unintended system reset.
- [High] Race condition between system suspend and the watchdog IRQ handler can leave the watchdog incorrectly armed during sleep.
--

commit 8065750aadbeb77247fd706790ed9368ba7e0064
Author: Kartik Rajput <kkartik@nvidia.com>

clocksource/drivers/timer-tegra186: Reserve and service a kernel watchdog

Tegra SoCs supports multiple watchdog timers. Reserve first available watchdog
as a kernel-only watchdog for Tegra186 and Tegra234. Arm it during probe (120s
timeout) and keep it alive in the driver IRQ handler.

> diff --git a/drivers/clocksource/timer-tegra186.c b/drivers/clocksource/timer-tegra186.c
> index dd1d1a0dd63eb..78600ddeb1c61 100644
> --- a/drivers/clocksource/timer-tegra186.c
> +++ b/drivers/clocksource/timer-tegra186.c
[ ... ]
> @@ -219,6 +226,16 @@ static int tegra186_wdt_ping(struct watchdog_device *wdd)
>  	return 0;
>  }
>  
> +static irqreturn_t tegra186_wdt_irq(int irq, void *data)
> +{
> +	struct tegra186_wdt *wdt = data;
> +
> +	tegra186_wdt_disable(wdt);
> +	tegra186_wdt_enable(wdt);
> +
> +	return IRQ_HANDLED;
> +}

The commit message states the kernel watchdog is intended to recover the system
if "the kernel crashes or hangs before userspace enables a watchdog" and
handles cases where "userspace never starts".

Because this handler runs as a hardirq, won't it continue to execute and feed
the watchdog even if the kernel deadlocks in task context, enters an infinite
loop, or userspace completely freezes? As long as local interrupts are enabled,
it seems the watchdog will silently ping itself indefinitely without resetting
the system.

Also, if userspace enables a different watchdog on the system, will this
handler cause an infinite IRQ storm?

Looking at tegra186_timer_probe(), the single shared hardware interrupt is
requested but only passed the kernel_wdt as data:

	err = platform_get_irq(pdev, 0);
	irq = err;
	...
	err = devm_request_irq(dev, irq, tegra186_wdt_irq, 0,
			       dev_name(dev), kernel_wdt);

When a userspace watchdog is enabled, its interrupt is unmasked. If it expires
and asserts the shared IRQ line, this handler will only clear the interrupt
status for kernel_wdt. Since the userspace watchdog's interrupt remains
asserted, won't the CPU immediately re-enter this handler after it returns
IRQ_HANDLED?

[ ... ]
> @@ -501,8 +529,22 @@ static int tegra186_timer_probe(struct platform_device *pdev)
>  		goto unregister_osc;
>  	}
>  
> +	if (kernel_wdt) {
> +		err = devm_request_irq(dev, irq, tegra186_wdt_irq, 0,
> +				       dev_name(dev), kernel_wdt);
> +		if (err < 0) {
> +			dev_err(dev, "failed to request kernel WDT IRQ: %d\n", err);
> +			goto unregister_usec;
> +		}
> +
> +		tegra186_wdt_set_timeout(&kernel_wdt->base, TEGRA186_KERNEL_WDT_TIMEOUT);
> +		tegra186_wdt_enable(kernel_wdt);
> +	}
> +

When the driver is unbound (e.g., via sysfs or module unloading), devres
automatically frees the IRQ handler, but there doesn't appear to be any
teardown of the hardware timer itself.

Will the hardware timer continue to tick unserviced and forcefully hard-reset
the system within 120 seconds of driver removal?

[ ... ]
> @@ -525,8 +567,10 @@ static int __maybe_unused tegra186_timer_suspend(struct device *dev)
>  	unsigned int i;
>  
>  	for (i = 0; i < tegra->soc->num_wdts; i++) {
> -		if (tegra->wdts[i] && watchdog_active(&tegra->wdts[i]->base))
> -			tegra186_wdt_disable(tegra->wdts[i]);
> +		struct tegra186_wdt *wdt = tegra->wdts[i];
> +
> +		if (wdt && (wdt->is_kernel_wdt || watchdog_active(&wdt->base)))
> +			tegra186_wdt_disable(wdt);
>  	}
>  
>  	return 0;

If the watchdog's timer expires concurrently with suspend, can the
tegra186_wdt_irq handler execute on another CPU or immediately after suspend
unmasks interrupts?

Since the IRQ handler unconditionally calls tegra186_wdt_enable() without
checking any suspend state flags, it seems it could re-arm the watchdog after
this suspend callback has paused it, leading to a spurious hardware reset
during suspend.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260507154557.2082697-1-kkartik@nvidia.com?part=4

  reply	other threads:[~2026-05-08  0:23 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-07 15:45 [PATCH 0/4] Add support for Kernel WDT Kartik Rajput
2026-05-07 15:45 ` [PATCH 1/4] clocksource/drivers/timer-tegra186: Fix support for multiple watchdog instances Kartik Rajput
2026-05-07 22:58   ` sashiko-bot
2026-05-07 15:45 ` [PATCH 2/4] clocksource/drivers/timer-tegra186: Correct num_wdts for Tegra186 and Tegra234 Kartik Rajput
2026-05-07 15:45 ` [PATCH 3/4] clocksource/drivers/timer-tegra186: Register all accessible watchdog timers Kartik Rajput
2026-05-07 23:38   ` sashiko-bot
2026-05-07 15:45 ` [PATCH 4/4] clocksource/drivers/timer-tegra186: Reserve and service a kernel watchdog Kartik Rajput
2026-05-08  0:23   ` sashiko-bot [this message]
2026-05-12 13:56 ` [PATCH 0/4] Add support for Kernel WDT Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260508002341.ED38AC2BCB2@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kkartik@nvidia.com \
    --cc=linux-watchdog@vger.kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox