* [PATCH v4] fix it87_wdt early reboot by reporting running timer
@ 2025-11-17 12:11 René Rebe
2025-11-17 15:24 ` Guenter Roeck
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: René Rebe @ 2025-11-17 12:11 UTC (permalink / raw)
To: linux; +Cc: wim, linux-watchdog
Some products, such as the Ugreen DXP4800 Plus NAS, ship with the it87
wdt enabled by the firmware and a broken BIOS option that does not
allow changing the time or turning it off. As this makes installing
Linux rather difficult, change it87_wdt to report the timer as running
to the watchdog core.
Signed-off-by: René Rebe <rene@exactco.de>
---
v1:
- just clear hw timer register
v2:
- detect running hw timer and report to watchdog core
v3:
- multiply TOV1 in _wdt_get_timeout
- don't wrongly and superfluously set .max_hw_heartbeat_ms
- don't call set_timeout manually
v4:
- simplify to wdt_running
- move code up to not move superio_exit
diff --git a/drivers/watchdog/it87_wdt.c b/drivers/watchdog/it87_wdt.c
index 3b8488c86a2f..8ba7e03857ca 100644
--- a/drivers/watchdog/it87_wdt.c
+++ b/drivers/watchdog/it87_wdt.c
@@ -188,6 +188,12 @@ static void _wdt_update_timeout(unsigned int t)
 		superio_outb(t >> 8, WDTVALMSB);
 }
 
+/* Internal function, should be called after superio_select(GPIO) */
+static bool _wdt_running(void)
+{
+	return superio_inb(WDTVALLSB) || (max_units > 255 && superio_inb(WDTVALMSB));
+}
+
 static int wdt_update_timeout(unsigned int t)
 {
 	int ret;
@@ -374,6 +381,12 @@ static int __init it87_wdt_init(void)
 		}
 	}
 
+	/* wdt already left running by firmware? */
+	if (_wdt_running()) {
+		pr_info("Left running by firmware.\n");
+		set_bit(WDOG_HW_RUNNING, &wdt_dev.status);
+	}
+
 	superio_exit();
 
 	if (timeout < 1 || timeout > max_units * 60) {
--
René Rebe, ExactCODE GmbH, Berlin, Germany
https://exactco.de | https://t2linux.com | https://rene.rebe.de
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-11-17 12:11 [PATCH v4] fix it87_wdt early reboot by reporting running timer René Rebe
@ 2025-11-17 15:24 ` Guenter Roeck
2025-12-12 20:21 ` James Hilliard
[not found] ` <CADvTj4po1bx6AVfGKoxF38pzKURxryC17Up5Z7Ne+P5XBMZFmQ@mail.gmail.com>
2 siblings, 0 replies; 12+ messages in thread
From: Guenter Roeck @ 2025-11-17 15:24 UTC (permalink / raw)
To: René Rebe; +Cc: wim, linux-watchdog

On 11/17/25 04:11, René Rebe wrote:
> Signed-off-by: René Rebe <rene@exactco.de>

Reviewed-by: Guenter Roeck <linux@roeck-us.net>
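The check added by `_wdt_running()` can be modeled outside the kernel. What follows is a minimal userspace sketch, not driver code: `struct wdt_regs` and `wdt_running_model()` are hypothetical names, and the register values are passed in instead of being read through `superio_inb()`. It only illustrates why the high byte is consulted solely when `max_units` exceeds 255.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two timer value registers. */
struct wdt_regs {
	uint8_t val_lsb;	/* models a read of WDTVALLSB */
	uint8_t val_msb;	/* models a read of WDTVALMSB */
};

/*
 * Mirrors the expression in the patch: the timer counts as running when
 * the low byte is non-zero, or when the counter is wider than 8 bits
 * (max_units > 255) and the high byte is non-zero.
 */
static bool wdt_running_model(const struct wdt_regs *regs,
			      unsigned int max_units)
{
	assert(regs != NULL);
	return regs->val_lsb || (max_units > 255 && regs->val_msb);
}
```

With this model, a high byte of 1 and a low byte of 0 counts as running only when `max_units` is above 255; with `max_units` at 255 the high byte is ignored.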
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-11-17 12:11 [PATCH v4] fix it87_wdt early reboot by reporting running timer René Rebe
2025-11-17 15:24 ` Guenter Roeck
@ 2025-12-12 20:21 ` James Hilliard
[not found] ` <CADvTj4po1bx6AVfGKoxF38pzKURxryC17Up5Z7Ne+P5XBMZFmQ@mail.gmail.com>
2 siblings, 0 replies; 12+ messages in thread
From: James Hilliard @ 2025-12-12 20:21 UTC (permalink / raw)
To: René Rebe; +Cc: linux, wim, linux-watchdog

On Mon, Nov 17, 2025 at 5:11 AM René Rebe <rene@exactco.de> wrote:
>
> +	/* wdt already left running by firmware? */
> +	if (_wdt_running()) {
> +		pr_info("Left running by firmware.\n");

I'm wondering, is there a way other than looking at dmesg to identify if
a wdt was left running by the firmware? I'm thinking having an ioctl or
similar could be useful as a way to notify a user that a BIOS or firmware
configuration change may be needed.

> +		set_bit(WDOG_HW_RUNNING, &wdt_dev.status);
> +	}
[parent not found: <CADvTj4po1bx6AVfGKoxF38pzKURxryC17Up5Z7Ne+P5XBMZFmQ@mail.gmail.com>]
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
[not found] ` <CADvTj4po1bx6AVfGKoxF38pzKURxryC17Up5Z7Ne+P5XBMZFmQ@mail.gmail.com>
@ 2025-12-12 21:50 ` Guenter Roeck
2025-12-12 22:04 ` James Hilliard
0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2025-12-12 21:50 UTC (permalink / raw)
To: James Hilliard, René Rebe; +Cc: wim, linux-watchdog

On 12/12/25 12:17, James Hilliard wrote:
...
> +	/* wdt already left running by firmware? */
> +	if (_wdt_running()) {
> +		pr_info("Left running by firmware.\n");
>
> I'm wondering, is there a way other than looking at dmesg to identify if
> a wdt was left running by the firmware? I'm thinking having an ioctl or
> similar could be useful as a way to notify a user that a BIOS or firmware
> configuration change may be needed.

This is not a bug, so there is no need to notify the user in the first place.
The only reason for accepting the message is that I was tired of arguing.
It is even misleading, because loading the driver, starting the watchdog
by touching the watchdog device, unloading it, and loading it again will
likely trigger the message.

Userspace can check if a watchdog is running by reading
/sys/class/watchdog/watchdog<index>/state. Do that after loading the driver
and before starting the watchdog daemon and you'll see if the watchdog
was running when the driver was loaded. But that doesn't mean it was
running when the system booted; it only means that the watchdog was running
when the driver was loaded.

Guenter
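The sysfs read described above can be sketched in C as follows. This is only an illustration under stated assumptions: `read_sysfs_line()` and `watchdog_state_is_active()` are hypothetical helper names, the comparison string "active" is an assumption about the attribute's format, and the attribute is only present when the kernel exposes the watchdog sysfs interface. Whether the value reflects a timer that firmware left running, as opposed to a device opened by userspace, should be verified on the target kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs attribute into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened or is empty.
 */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Returns 1 if the attribute reads "active" (assumed format), 0 for any
 * other content, -1 if it cannot be read.
 */
static int watchdog_state_is_active(const char *state_path)
{
	char state[32];

	if (read_sysfs_line(state_path, state, sizeof(state)) < 0)
		return -1;
	return strcmp(state, "active") == 0;
}
```

A caller would pass a path such as /sys/class/watchdog/watchdog0/state, where the index 0 is itself an assumption, and would have to run before any watchdog daemon opens the device for the result to say anything about the state at driver load.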
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 21:50 ` Guenter Roeck
@ 2025-12-12 22:04 ` James Hilliard
2025-12-12 22:16 ` René Rebe
0 siblings, 1 reply; 12+ messages in thread
From: James Hilliard @ 2025-12-12 22:04 UTC (permalink / raw)
To: Guenter Roeck; +Cc: René Rebe, wim, linux-watchdog

On Fri, Dec 12, 2025 at 2:50 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> This is not a bug, so there is no need to notify the user in the first place.
> The only reason for accepting the message is that I was tired of arguing.
> It is even misleading, because loading the driver, starting the watchdog
> by touching the watchdog device, unloading it, and loading it again will
> likely trigger the message.

Yeah, I'm aware it's not a bug, I'm just thinking it might be good to have
watchdog drivers record the initial running state.

> Userspace can check if a watchdog is running by reading
> /sys/class/watchdog/watchdog<index>/state. Do that after loading the driver
> and before starting the watchdog daemon and you'll see if the watchdog
> was running when the driver was loaded. But that doesn't mean it was
> running when the system booted; it only means that the watchdog was running
> when the driver was loaded.

Hmm, this seems impossible in some configurations, AFAIU systemd's
watchdog is integrated into PID 1, so loading a watchdog daemon later
doesn't appear possible.

Maybe it would make sense to have a sysfs variable like
/sys/class/watchdog/watchdog<index>/initial_state so that
there's a way for userspace to determine if a watchdog was
already armed by the time the driver was loaded?
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 22:04 ` James Hilliard
@ 2025-12-12 22:16 ` René Rebe
2025-12-12 22:28 ` James Hilliard
0 siblings, 1 reply; 12+ messages in thread
From: René Rebe @ 2025-12-12 22:16 UTC (permalink / raw)
To: James Hilliard; +Cc: Guenter Roeck, wim, linux-watchdog

> On 12. Dec 2025, at 23:04, James Hilliard <james.hilliard1@gmail.com> wrote:
>
> Yeah, I'm aware it's not a bug, I'm just thinking it might be good to have
> watchdog drivers record the initial running state.

The kernel logs so much pointless random stuff; an info about a
running watchdog timer is more than warranted in this case IMHO.
It wasted quite a bit of my valuable time.

> Hmm, this seems impossible in some configurations, AFAIU systemd's
> watchdog is integrated into PID 1, so loading a watchdog daemon later
> doesn't appear possible.
>
> Maybe it would make sense to have a sysfs variable like
> /sys/class/watchdog/watchdog<index>/initial_state so that
> there's a way for userspace to determine if a watchdog was
> already armed by the time the driver was loaded?

This would be quite wasteful overkill for something that unimportant.
It is rare that firmware leaves a watchdog timer enabled in any case.

René

--
https://exactco.de • https://t2linux.com • https://patreon.com/renerebe
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 22:16 ` René Rebe
@ 2025-12-12 22:28 ` James Hilliard
2025-12-12 22:34 ` René Rebe
0 siblings, 1 reply; 12+ messages in thread
From: James Hilliard @ 2025-12-12 22:28 UTC (permalink / raw)
To: René Rebe; +Cc: Guenter Roeck, wim, linux-watchdog

On Fri, Dec 12, 2025 at 3:16 PM René Rebe <rene@exactco.de> wrote:
>
> This would be quite wasteful overkill for something that unimportant.
> It is rare that firmware leaves a watchdog timer enabled in any case.

I think your presumption that a watchdog is unimportant is wrong,
in my case I want to identify systems and send alerts if it's detected
that a watchdog was NOT armed by the firmware.

I manage a bunch of x86_64 based embedded systems and
we always want the watchdog enabled, including in the BIOS,
however unlike on your system the watchdogs on my systems are
disabled by default and must be manually configured in the BIOS.
We do still arm them from Linux either way but it would be nice
to warn users that their systems have bad BIOS settings. On the
systems I work with, failing to arm the watchdog in both the BIOS
and Linux can result in the watchdog failing to fire when we need
it to under some circumstances (we're not sure exactly why this
happens, but failing to arm the watchdog in the BIOS can result
in the watchdog not always firing if the system freezes during a
reboot from what we can tell, although it's difficult to reproduce
this issue in our hardware testing lab).
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 22:28 ` James Hilliard
@ 2025-12-12 22:34 ` René Rebe
2025-12-12 22:41 ` James Hilliard
0 siblings, 1 reply; 12+ messages in thread
From: René Rebe @ 2025-12-12 22:34 UTC (permalink / raw)
To: James Hilliard; +Cc: Guenter Roeck, wim, linux-watchdog

> On 12. Dec 2025, at 23:28, James Hilliard <james.hilliard1@gmail.com> wrote:
>
> I think your presumption that a watchdog is unimportant is wrong,
> in my case I want to identify systems and send alerts if it's detected
> that a watchdog was NOT armed by the firmware.

Instead of adding new kernel state, you could probably just read
the initial state as suggested by Guenter.

René

--
https://exactco.de • https://t2linux.com • https://patreon.com/renerebe
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 22:34 ` René Rebe
@ 2025-12-12 22:41 ` James Hilliard
2025-12-12 22:44 ` René Rebe
0 siblings, 1 reply; 12+ messages in thread
From: James Hilliard @ 2025-12-12 22:41 UTC (permalink / raw)
To: René Rebe; +Cc: Guenter Roeck, wim, linux-watchdog

On Fri, Dec 12, 2025 at 3:34 PM René Rebe <rene@exactco.de> wrote:
>
> Instead of adding new kernel state, you could probably just read
> the initial state as suggested by Guenter.

As I mentioned earlier, I don't think we can read initial state since
AFAIU systemd PID1 will immediately arm the watchdog prior
to anything else running, so by the time we could read the state
variable the watchdog would have already been armed so the
state var would be meaningless in regards to determining if the
firmware armed the watchdog.
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 22:41 ` James Hilliard
@ 2025-12-12 22:44 ` René Rebe
2025-12-12 23:00 ` James Hilliard
0 siblings, 1 reply; 12+ messages in thread
From: René Rebe @ 2025-12-12 22:44 UTC (permalink / raw)
To: James Hilliard; +Cc: Guenter Roeck, wim, linux-watchdog

> On 12. Dec 2025, at 23:41, James Hilliard <james.hilliard1@gmail.com> wrote:
>
> As I mentioned earlier, I don't think we can read initial state since
> AFAIU systemd PID1 will immediately arm the watchdog prior
> to anything else running, so by the time we could read the state
> variable the watchdog would have already been armed so the
> state var would be meaningless in regards to determining if the
> firmware armed the watchdog.

You can either adjust systemd accordingly or add a /sbin/init
wrapper for systemd to check the state before executing init.
With the added benefit of working with older kernels.

René

--
https://exactco.de • https://t2linux.com • https://patreon.com/renerebe
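The /sbin/init wrapper idea can be sketched roughly as below. This is a hedged illustration, not a tested boot component: `snapshot_state()` and `snapshot_then_exec_init()` are hypothetical names, and the sketch assumes that sysfs is already mounted, that the watchdog driver is loaded before the wrapper runs, and that the snapshot destination is writable at that point. None of this is guaranteed this early in boot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy the first line of src_path (e.g. the watchdog state attribute)
 * to dst_path. Returns 0 on success, -1 on failure.
 */
static int snapshot_state(const char *src_path, const char *dst_path)
{
	char line[64];
	FILE *in, *out;
	int rc;

	assert(src_path != NULL && dst_path != NULL);

	in = fopen(src_path, "r");
	if (!in)
		return -1;
	if (!fgets(line, sizeof(line), in)) {
		fclose(in);
		return -1;
	}
	fclose(in);
	line[strcspn(line, "\n")] = '\0';

	out = fopen(dst_path, "w");
	if (!out)
		return -1;
	rc = fprintf(out, "%s\n", line) < 0 ? -1 : 0;
	if (fclose(out) != 0)
		rc = -1;
	return rc;
}

/*
 * Record the watchdog state, then hand control to the real init.
 * A failed snapshot is deliberately ignored so the system still boots.
 * Returns -1 only if execv() itself fails.
 */
static int snapshot_then_exec_init(const char *state_path,
				   const char *snapshot_path,
				   const char *real_init, char *const argv[])
{
	(void)snapshot_state(state_path, snapshot_path);
	execv(real_init, argv);
	return -1;	/* reached only if execv() failed */
}
```

A wrapper's main() would call `snapshot_then_exec_init()` with the state attribute path, a snapshot location, and the path of the real init binary, forwarding its own argv. All three paths are assumptions that vary between systems.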
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer 2025-12-12 22:44 ` René Rebe @ 2025-12-12 23:00 ` James Hilliard 2025-12-13 16:01 ` Guenter Roeck 0 siblings, 1 reply; 12+ messages in thread From: James Hilliard @ 2025-12-12 23:00 UTC (permalink / raw) To: René Rebe; +Cc: Guenter Roeck, wim, linux-watchdog On Fri, Dec 12, 2025 at 3:44 PM René Rebe <rene@exactco.de> wrote: > > > > On 12. Dec 2025, at 23:41, James Hilliard <james.hilliard1@gmail.com> wrote: > > > > On Fri, Dec 12, 2025 at 3:34 PM René Rebe <rene@exactco.de> wrote: > >> > >>> On 12. Dec 2025, at 23:28, James Hilliard <james.hilliard1@gmail.com> wrote: > >>> > >>> On Fri, Dec 12, 2025 at 3:16 PM René Rebe <rene@exactco.de> wrote: > >>>> > >>>> > >>>>> On 12. Dec 2025, at 23:04, James Hilliard <james.hilliard1@gmail.com> wrote: > >>>>> > >>>>> On Fri, Dec 12, 2025 at 2:50 PM Guenter Roeck <linux@roeck-us.net> wrote: > >>>>>> > >>>>>> On 12/12/25 12:17, James Hilliard wrote: > >>>>>> ... > >>>>>>> + /* wdt already left running by firmware? */ > >>>>>>> + if (_wdt_running()) { > >>>>>>> + pr_info("Left running by firmware.\n"); > >>>>>>> > >>>>>>> > >>>>>>> I'm wondering, is there a way other than looking at dmesg to identify if > >>>>>>> a wdt was left running by the firmware? I'm thinking having an ioctl or > >>>>>>> similar could be useful as a way to notify a user that a BIOS or firmware > >>>>>>> configuration change may be needed. > >>>>>>> > >>>>>> > >>>>>> This is not a bug, so there is no need to notify the user in the first place. > >>>>>> The only reason for accepting the message is that I was tired arguing. > >>>>>> It is even misleading, because loading the driver, starting the watchdog > >>>>>> by touching the watchdog device, unloading it, and loading it again will > >>>>>> likely trigger the message. > >>>>> > >>>>> Yeah, I'm aware it's not a bug, I'm just thinking it might be good to have > >>>>> watchdog drivers record the initial running state. 
> >>>>
> >>>> The kernel logs so much pointless random stuff; an info about a
> >>>> running watchdog timer is more than warranted in this case IMHO.
> >>>> It wasted quite a bit of my valuable time.
> >>>>
> >>>>>> Userspace can check if a watchdog is running by reading
> >>>>>> /sys/class/watchdog/watchdog<index>/state. Do that after loading the driver
> >>>>>> and before starting the watchdog daemon and you'll see if the watchdog
> >>>>>> was running when the driver was loaded. But that doesn't mean it was
> >>>>>> running when the system booted; it only means that the watchdog was running
> >>>>>> when the driver was loaded.
> >>>>>
> >>>>> Hmm, this seems impossible in some configurations, AFAIU systemd's
> >>>>> watchdog is integrated into PID 1, so loading a watchdog daemon later
> >>>>> doesn't appear possible.
> >>>>>
> >>>>> Maybe it would make sense to have a sysfs variable like
> >>>>> /sys/class/watchdog/watchdog<index>/initial_state so that
> >>>>> there's a way for userspace to determine if a watchdog was
> >>>>> already armed by the time the driver was loaded?
> >>>>
> >>>> This would be quite wasteful overkill for something that unimportant.
> >>>> It is rare that firmware leaves a watchdog timer enabled in any case.
> >>>
> >>> I think your presumption that a watchdog is unimportant is wrong,
> >>> in my case I want to identify systems and send alerts if it's detected
> >>> that a watchdog was NOT armed by the firmware.
> >>>
> >>> I manage a bunch of x86_64 based embedded systems and
> >>> we always want the watchdog enabled, including in the BIOS,
> >>> however unlike on your system the watchdogs on my systems are
> >>> disabled by default and must be manually configured in the BIOS.
> >>> We do still arm them from Linux either way but it would be nice
> >>> to warn users that their systems have bad BIOS settings, on the
> >>> systems I work with failing to arm the watchdog in both the BIOS
> >>> and Linux can result in the watchdog failing to fire when we need
> >>> it to under some circumstances(we're not sure exactly why this
> >>> happens but failing to arm the watchdog in the BIOS can result
> >>> in the watchdog not always firing if the system freezes during a
> >>> reboot from what we can tell, although it's difficult to reproduce
> >>> this issue in our hardware testing lab).
> >>
> >> Instead of adding new kernel state, you could probably just read
> >> the initial state as suggested by Guenter.
> >
> > As I mentioned earlier, I don't think we can read initial state since
> > AFAIU systemd PID1 will immediately arm the watchdog prior
> > to anything else running, so by the time we could read the state
> > variable the watchdog would have already been armed so the
> > state var would be meaningless in regards to determining if the
> > firmware armed the watchdog.
>
> You can either adjust systemd accordingly or add a /sbin/init
> wrapper for systemd to check the state before executing init.

This seems super hacky to me and likely to cause weird issues.

I'm thinking recording initial_state would probably be something
that makes more sense to have for all watchdog drivers that have
the ability to read the initial state at least, not sure how common
a use case that would be but I think it would be helpful for debugging
watchdog issues in general, especially on systems that might expose
multiple types of watchdog timers with different drivers.

> With the added benefit of working with older kernels.

I don't think that's all that important. When updating we use A/B
partition rotation and always swap both the userspace and kernel
at the same time so that we don't have to worry about that sort of
issue at least for our use case.
>
> René
>
> --
> https://exactco.de • https://t2linux.com • https://patreon.com/renerebe
>

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] fix it87_wdt early reboot by reporting running timer
2025-12-12 23:00 ` James Hilliard
@ 2025-12-13 16:01 ` Guenter Roeck
0 siblings, 0 replies; 12+ messages in thread
From: Guenter Roeck @ 2025-12-13 16:01 UTC (permalink / raw)
To: James Hilliard, René Rebe; +Cc: wim, linux-watchdog

On 12/12/25 15:00, James Hilliard wrote:
> On Fri, Dec 12, 2025 at 3:44 PM René Rebe <rene@exactco.de> wrote:
>>
>>
>>> On 12. Dec 2025, at 23:41, James Hilliard <james.hilliard1@gmail.com> wrote:
>>>
>>> On Fri, Dec 12, 2025 at 3:34 PM René Rebe <rene@exactco.de> wrote:
>>>>
>>>>> On 12. Dec 2025, at 23:28, James Hilliard <james.hilliard1@gmail.com> wrote:
>>>>>
>>>>> On Fri, Dec 12, 2025 at 3:16 PM René Rebe <rene@exactco.de> wrote:
>>>>>>
>>>>>>
>>>>>>> On 12. Dec 2025, at 23:04, James Hilliard <james.hilliard1@gmail.com> wrote:
>>>>>>>
>>>>>>> On Fri, Dec 12, 2025 at 2:50 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>>>>>
>>>>>>>> On 12/12/25 12:17, James Hilliard wrote:
>>>>>>>> ...
>>>>>>>>> +	/* wdt already left running by firmware? */
>>>>>>>>> +	if (_wdt_running()) {
>>>>>>>>> +		pr_info("Left running by firmware.\n");
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm wondering, is there a way other than looking at dmesg to identify if
>>>>>>>>> a wdt was left running by the firmware? I'm thinking having an ioctl or
>>>>>>>>> similar could be useful as a way to notify a user that a BIOS or firmware
>>>>>>>>> configuration change may be needed.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This is not a bug, so there is no need to notify the user in the first place.
>>>>>>>> The only reason for accepting the message is that I was tired arguing.
>>>>>>>> It is even misleading, because loading the driver, starting the watchdog
>>>>>>>> by touching the watchdog device, unloading it, and loading it again will
>>>>>>>> likely trigger the message.
>>>>>>>
>>>>>>> Yeah, I'm aware it's not a bug, I'm just thinking it might be good to have
>>>>>>> watchdog drivers record the initial running state.
>>>>>>
>>>>>> The kernel logs so much pointless random stuff; an info about a
>>>>>> running watchdog timer is more than warranted in this case IMHO.
>>>>>> It wasted quite a bit of my valuable time.
>>>>>>
>>>>>>>> Userspace can check if a watchdog is running by reading
>>>>>>>> /sys/class/watchdog/watchdog<index>/state. Do that after loading the driver
>>>>>>>> and before starting the watchdog daemon and you'll see if the watchdog
>>>>>>>> was running when the driver was loaded. But that doesn't mean it was
>>>>>>>> running when the system booted; it only means that the watchdog was running
>>>>>>>> when the driver was loaded.
>>>>>>>
>>>>>>> Hmm, this seems impossible in some configurations, AFAIU systemd's
>>>>>>> watchdog is integrated into PID 1, so loading a watchdog daemon later
>>>>>>> doesn't appear possible.
>>>>>>>
>>>>>>> Maybe it would make sense to have a sysfs variable like
>>>>>>> /sys/class/watchdog/watchdog<index>/initial_state so that
>>>>>>> there's a way for userspace to determine if a watchdog was
>>>>>>> already armed by the time the driver was loaded?
>>>>>>
>>>>>> This would be quite wasteful overkill for something that unimportant.
>>>>>> It is rare that firmware leaves a watchdog timer enabled in any case.
>>>>>
>>>>> I think your presumption that a watchdog is unimportant is wrong,
>>>>> in my case I want to identify systems and send alerts if it's detected
>>>>> that a watchdog was NOT armed by the firmware.
>>>>>
>>>>> I manage a bunch of x86_64 based embedded systems and
>>>>> we always want the watchdog enabled, including in the BIOS,
>>>>> however unlike on your system the watchdogs on my systems are
>>>>> disabled by default and must be manually configured in the BIOS.
>>>>> We do still arm them from Linux either way but it would be nice
>>>>> to warn users that their systems have bad BIOS settings, on the
>>>>> systems I work with failing to arm the watchdog in both the BIOS
>>>>> and Linux can result in the watchdog failing to fire when we need
>>>>> it to under some circumstances(we're not sure exactly why this
>>>>> happens but failing to arm the watchdog in the BIOS can result
>>>>> in the watchdog not always firing if the system freezes during a
>>>>> reboot from what we can tell, although it's difficult to reproduce
>>>>> this issue in our hardware testing lab).
>>>>
>>>> Instead of adding new kernel state, you could probably just read
>>>> the initial state as suggested by Guenter.
>>>
>>> As I mentioned earlier, I don't think we can read initial state since
>>> AFAIU systemd PID1 will immediately arm the watchdog prior
>>> to anything else running, so by the time we could read the state
>>> variable the watchdog would have already been armed so the
>>> state var would be meaningless in regards to determining if the
>>> firmware armed the watchdog.
>>
>> You can either adjust systemd accordingly or add a /sbin/init
>> wrapper for systemd to check the state before executing init.
>
> This seems super hacky to me and likely to cause weird issues.
>

systemd executes a number of actions before opening the watchdog device.
At the very least that includes loading all the modules. I am quite sure
that some script can be configured to run after loading the modules and
before opening the watchdog device. I don't see that as hacky.

Guenter

^ permalink raw reply [flat|nested] 12+ messages in thread
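The userspace check discussed in this thread (read
/sys/class/watchdog/watchdog<index>/state after the driver is loaded and
before anything opens the watchdog device) could be sketched in C roughly
as follows. This is only an illustration, not code from the patch or from
systemd: read_attr() and record_initial_state() are hypothetical helper
names, and the sketch assumes the running kernel exposes the watchdog
sysfs attributes. It returns whatever text the attribute holds and does
not interpret it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style attribute file into buf and strip
 * the trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened or read. (Hypothetical helper, not part of any existing API.)
 */
int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* sysfs attributes end in a newline; cut the string there. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Hypothetical helper: fetch the "state" attribute of watchdog<index>.
 * Meant to be called after the driver is loaded and before the watchdog
 * device is opened. The path follows the one quoted in the thread.
 */
int record_initial_state(unsigned int index, char *buf, size_t len)
{
	char path[64];

	snprintf(path, sizeof(path),
		 "/sys/class/watchdog/watchdog%u/state", index);
	return read_attr(path, buf, len);
}
```

An init wrapper or an early boot script would call
record_initial_state(0, buf, sizeof(buf)) and log or store the result
before handing control to whatever opens the watchdog device. As pointed
out above, the value only describes the watchdog at the time the driver
was loaded, not necessarily at the time the system booted.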
Thread overview: 12+ messages
2025-11-17 12:11 [PATCH v4] fix it87_wdt early reboot by reporting running timer René Rebe
2025-11-17 15:24 ` Guenter Roeck
2025-12-12 20:21 ` James Hilliard
[not found] ` <CADvTj4po1bx6AVfGKoxF38pzKURxryC17Up5Z7Ne+P5XBMZFmQ@mail.gmail.com>
2025-12-12 21:50 ` Guenter Roeck
2025-12-12 22:04 ` James Hilliard
2025-12-12 22:16 ` René Rebe
2025-12-12 22:28 ` James Hilliard
2025-12-12 22:34 ` René Rebe
2025-12-12 22:41 ` James Hilliard
2025-12-12 22:44 ` René Rebe
2025-12-12 23:00 ` James Hilliard
2025-12-13 16:01 ` Guenter Roeck