Linux Watchdog driver development
From: w15303746062  <w15303746062@163.com>
To: "Guenter Roeck" <linux@roeck-us.net>
Cc: wim@linux-watchdog.org, linux-watchdog@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	"Mingyu Wang" <25181214217@stu.xidian.edu.cn>
Subject: Re:Re: [PATCH] watchdog: wdt_pci: Fix shared IRQ storm and complete system lockup
Date: Mon, 11 May 2026 09:27:16 +0800 (CST)
Message-ID: <1fc8bb5a.1346.19e14a548fb.Coremail.w15303746062@163.com>
In-Reply-To: <f182256d-0623-4c9b-a18a-acca8457bb08@roeck-us.net>

From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

Hi Guenter,

Thank you for your prompt response and review.

To answer your questions directly:

1. Is this an actual observed problem?
Yes, this is a real, observed problem. However, it was observed in a virtualized fuzzing environment (QEMU + Syzkaller), not on physical legacy hardware.

2. How was it triggered?
In our QEMU setup, PCI IRQ lines are heavily shared. The fuzzer loaded the `wdt_pci` driver while simultaneously fuzzing other devices on the same shared IRQ line (e.g., the i2c-i801 controller). When another device generated a heavy interrupt load, `wdtpci_interrupt()` ran for every one of those interrupts. Because the handler has no IRQ ownership check, it claimed each interrupt as its own and produced a massive printk storm (repeating "wdt_pci: Reset in 5ms" and "wdt_pci: status 114").

This overwhelmed the CPU in hard IRQ context, defeated the spurious IRQ detector, and resulted in a 145-second Hung Task panic. Here is a brief snippet of the observed log:

[  375.485491] wdt_pci: Reset in 5ms
[  375.487467] wdt_pci: status 114
[  375.489171] wdt_pci: Reset in 5ms
...
[  375.484244] systemd-journald[4771]: /dev/kmsg buffer overrun, some messages lost.
[  519.189528] INFO: task syz.2.507 blocked for more than 145 seconds.
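
As a rough illustration of the mechanism in point 2, here is a simplified user-space model in C (not kernel code). The names `shared_line`, `model_dev`, `checking_handler`, `blind_handler`, `deliver`, and `count_unhandled` are made up for this sketch; only the IRQ_NONE/IRQ_HANDLED return convention mirrors the kernel. It assumes a shared line on which no device has anything pending, and counts how many deliveries end with no handler claiming the interrupt, which is the signal a spurious-interrupt check depends on.

```c
#include <stddef.h>

/* Mirrors the kernel's return convention for this user-space model only. */
enum irq_ret { IRQ_NONE = 0, IRQ_HANDLED = 1 };

typedef enum irq_ret (*irq_handler_fn)(void *dev);

/* Hypothetical device state: nonzero when the device raised an interrupt. */
struct model_dev {
	int pending;
};

/* Hypothetical shared IRQ line with per-line accounting. */
struct shared_line {
	irq_handler_fn handlers[4];
	void *devs[4];
	size_t nr_handlers;
	unsigned long total;
	unsigned long unhandled;
};

/* Claims the interrupt only if its own device has one pending. */
static enum irq_ret checking_handler(void *dev)
{
	struct model_dev *d = dev;

	if (!d->pending)
		return IRQ_NONE;
	d->pending = 0;
	return IRQ_HANDLED;
}

/* No ownership check: claims every interrupt on the line. */
static enum irq_ret blind_handler(void *dev)
{
	(void)dev;
	return IRQ_HANDLED;
}

/* Call every handler on the line; count the delivery as unhandled
 * if none of them returned IRQ_HANDLED. */
static void deliver(struct shared_line *line)
{
	int handled = 0;

	for (size_t i = 0; i < line->nr_handlers; i++)
		if (line->handlers[i](line->devs[i]) == IRQ_HANDLED)
			handled = 1;

	line->total++;
	if (!handled)
		line->unhandled++;
}

/* Deliver n interrupts that no device actually raised and report
 * how many of them the accounting saw as unhandled. */
static unsigned long count_unhandled(irq_handler_fn wdt_handler,
				     unsigned long n)
{
	struct model_dev wdt = { 0 }, other = { 0 };
	struct shared_line line = {
		.handlers = { wdt_handler, checking_handler },
		.devs = { &wdt, &other },
		.nr_handlers = 2,
	};

	for (unsigned long i = 0; i < n; i++)
		deliver(&line);
	return line.unhandled;
}
```

Under these assumptions, `count_unhandled(checking_handler, 1000)` returns 1000, while `count_unhandled(blind_handler, 1000)` returns 0: a handler that always returns IRQ_HANDLED hides every unclaimed interrupt from the accounting.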

3. Did I confirm the register bit?
Yes, the assertion that `WDC_SR_IRQ` is active low relies directly on the hardware definition documented in the driver's own source code at line 66:
`#define WDC_SR_IRQ      128 /* Active low */ /* X   X   X  */`
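
To make the active-low reading concrete, here is a minimal sketch of how the bit would be decoded. The helper `wdt_irq_asserted()` is hypothetical and does not exist in wdt_pci.c; the value of `WDC_SR_IRQ` and the "Active low" note are taken from the define quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value copied from the driver's define; its comment says "Active low". */
#define WDC_SR_IRQ 128

/* Hypothetical helper for illustration only, not a wdt_pci.c function.
 * Active low: the device is signalling an interrupt when the bit reads 0. */
static bool wdt_irq_asserted(uint8_t status)
{
	return (status & WDC_SR_IRQ) == 0;
}
```

With this reading, a status byte with bit 7 set (for example 0x80) means the interrupt came from another device on the line, while a value with bit 7 clear, such as the 114 seen in the log, would be treated as the watchdog's own.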

I completely understand your perspective. This is legacy hardware from the 1990s, and it is extremely unlikely to be used in production today. My intention was solely to report a reproducible local DoS vector found by the fuzzer, since handlers registered with `IRQF_SHARED` are required to verify that the interrupt came from their own device.

If you feel that patching this outdated driver adds unnecessary churn and consumes maintainers' time, please feel free to drop this patch. Alternatively, if the hardware is truly obsolete, perhaps the driver should be marked as BROKEN or removed entirely. I leave that entirely to your expert judgment.

Thanks again for your time!

Best regards,
Mingyu Wang

At 2026-05-09 21:52:17, "Guenter Roeck" <linux@roeck-us.net> wrote:
>On 5/9/26 05:16, w15303746062@163.com wrote:
>> From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
>> 
>> The wdt_pci driver registers its interrupt handler with the IRQF_SHARED
>> flag. However, the interrupt handler wdtpci_interrupt() fails to check
>> whether the interrupt actually originated from the watchdog device.
>> 
>> If another device on the same shared IRQ line (e.g., an I2C controller)
>> triggers an interrupt, wdtpci_interrupt() will erroneously process it,
>> blindly log hardware status (e.g., "Reset in 5ms") to the console, and
>> unconditionally return IRQ_HANDLED.
>> 
>> This behavior defeats the kernel's spurious interrupt detector. Under
>> heavy load from other devices sharing the IRQ, it causes a severe printk
>> storm and keeps the CPU trapped in hard IRQ context. This eventually
>> leads to a complete system lockup and RCU/Hung Task panics.
>> 
>> Fix this by checking the WDC_SR_IRQ bit (which is active low) in the
>> status register. If the bit is high, the interrupt is not ours, and we
>> must release the lock and return IRQ_NONE immediately.
>> 
>> Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
>
>Is this an actual observed problem ? Are you using this driver ?
>Did you confirm that the change solves the problem, and that the
>newly checked register bit is indeed active low as claimed ?
>
>If so, please provide evidence that the problem is observed.
>If not, please refrain from making such changes. This driver,
>specifically, is completely outdated, the hardware it supports
>is very unlikely to still be used, and making such changes without
>actual need serves no practical purpose but to keep maintainers busy.
>
>Thanks,
>Guenter
>
>> ---
>>   drivers/watchdog/wdt_pci.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>> 
>> diff --git a/drivers/watchdog/wdt_pci.c b/drivers/watchdog/wdt_pci.c
>> index 3918a600f2a0..a35ac064b690 100644
>> --- a/drivers/watchdog/wdt_pci.c
>> +++ b/drivers/watchdog/wdt_pci.c
>> @@ -304,6 +304,17 @@ static irqreturn_t wdtpci_interrupt(int irq, void *dev_id)
>>   	spin_lock(&wdtpci_lock);
>>   
>>   	status = inb(WDT_SR);
>> +	/*
>> +	 * The WDT500/501 supports shared interrupts (IRQF_SHARED).
>> +	 * We must check if the interrupt was generated by this device.
>> +	 * WDC_SR_IRQ is active low, so if it is set (1), the interrupt
>> +	 * belongs to another device on the shared line.
>> +	 */
>> +	if (status & WDC_SR_IRQ) {
>> +		spin_unlock(&wdtpci_lock);
>> +		return IRQ_NONE;
>> +	}
>> +
>>   	udelay(8);
>>   
>>   	pr_crit("status %d\n", status);

Thread overview: 7+ messages
2026-05-09 12:16 [PATCH] watchdog: wdt_pci: Fix shared IRQ storm and complete system lockup w15303746062
2026-05-09 13:52 ` Guenter Roeck
2026-05-11  1:27   ` w15303746062 [this message]
2026-05-11  3:00     ` Guenter Roeck
2026-05-11  3:08       ` w15303746062
2026-05-11  4:39         ` Guenter Roeck
2026-05-11  4:52           ` w15303746062