From: Petr Mladek <pmladek@suse.com>
To: mrungta@google.com
Cc: Jonathan Corbet <corbet@lwn.net>,
Jinchao Wang <wangjinchao600@gmail.com>,
Yunhui Cui <cuiyunhui@bytedance.com>,
Stephane Eranian <eranian@google.com>,
Ian Rogers <irogers@google.com>, Li Huafei <lihuafei1@huawei.com>,
Feng Tang <feng.tang@linux.alibaba.com>,
Max Kellermann <max.kellermann@ionos.com>,
Douglas Anderson <dianders@chromium.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH 1/4] watchdog/hardlockup: Always update saved interrupts during check
Date: Wed, 4 Mar 2026 15:44:17 +0100 [thread overview]
Message-ID: <aahFQaHxNFsoaxEb@pathway.suse.cz> (raw)
In-Reply-To: <20260212-hardlockup-watchdog-fixes-v1-1-745f1dce04c3@google.com>
On Thu 2026-02-12 14:12:10, Mayank Rungta via B4 Relay wrote:
> From: Mayank Rungta <mrungta@google.com>
>
> Currently, arch_touch_nmi_watchdog() causes an early return that
> skips updating hrtimer_interrupts_saved. This leads to stale
> comparisons and delayed lockup detection.
>
> Update the saved interrupt count before checking the touched flag
> to ensure detection timeliness.
IMHO, it is not that easy, see below.
Also I am curious: did you find this while debugging a particular
problem, or just by reading the code, please?
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -186,7 +186,21 @@ static void watchdog_hardlockup_kick(void)
>
> void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> {
> + bool is_hl;
> int hardlockup_all_cpu_backtrace;
> + /*
> + * Check for a hardlockup by making sure the CPU's timer
> + * interrupt is incrementing. The timer interrupt should have
> + * fired multiple times before we overflow'd. If it hasn't
> + * then this is a good indication the cpu is stuck
> + *
> + * Purposely check this _before_ checking watchdog_hardlockup_touched
> + * so we make sure we still update the saved value of the interrupts.
> + * Without that we'll take an extra round through this function before
> + * we can detect a lockup.
> + */
> +
> + is_hl = is_hardlockup(cpu);
>
> if (per_cpu(watchdog_hardlockup_touched, cpu)) {
> per_cpu(watchdog_hardlockup_touched, cpu) = false;
Hmm, this does not look correct to me.
1. First, let's agree on the meaning of "watchdog_hardlockup_touched".
My understanding is that arch_touch_nmi_watchdog() is called by code
which might block interrupts (timers) for a long time and wants to
hide it.
Blocking interrupts for too long is _bad_. In an ideal world,
nobody would call arch_touch_nmi_watchdog() because we want
to know about all sinners.

In the real world, we allow some sinners to stay hidden because
they might produce "false" positives, see touch_nmi_watchdog()
callers:
+ Most callers are printk() related.
We might argue whether it is a false positive or not.
The argument for "touching the watchdog" is that slow serial
consoles might block IRQs for a long time. But they work as
expected and can't do better.
Also the stall is kind of expected in this case. We could
confuse users and/or hide the original problem if the stall
was reported.
Note that there is a bigger problem with the legacy console
drivers. printk() tries to emit messages immediately, and the
current console_lock() owner becomes responsible for flushing
new messages added by other CPUs in parallel.
It is better with the new NBCON console drivers, which are
offloaded to a kthread. There, printk() tries to flush them
directly only when called in an emergency mode (WARN, stall
report) or in panic().
+ There are some other callers, for example, multi_stop_cpu(),
or hv_do_rep_hypercall_ex(). IMHO, they create stalls on
purpose.
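For reference, the touch handshake itself is tiny. Here is a user-space
model of it (hypothetical names, single CPU; in the kernel the flag is
the per-CPU watchdog_hardlockup_touched and the consumer runs from NMI):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model of the touch handshake. In the kernel the flag is
 * per-CPU and read/cleared from the NMI check; here it is a plain bool.
 */
static bool watchdog_touched;

/* Called by code that knowingly blocks interrupts for a long time. */
static void touch_watchdog(void)
{
	watchdog_touched = true;
}

/* Returns true when the periodic check should be skipped. */
static bool check_consumes_touch(void)
{
	if (watchdog_touched) {
		/* One touch hides exactly one check. */
		watchdog_touched = false;
		return true;
	}
	return false;
}
```

Note that one touch hides exactly one check, which is why the behavior
depends on how often touch_nmi_watchdog() is called relative to the
check period.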
2. Let's look at is_hardlockup() in detail:
static bool is_hardlockup(unsigned int cpu)
{
	int hrint = atomic_read(&per_cpu(hrtimer_interrupts, cpu));

	if (per_cpu(hrtimer_interrupts_saved, cpu) == hrint) {
		per_cpu(hrtimer_interrupts_missed, cpu)++;
		if (per_cpu(hrtimer_interrupts_missed, cpu) >= watchdog_hardlockup_miss_thresh)
			return true;
		return false;
	}

	per_cpu(hrtimer_interrupts_saved, cpu) = hrint;
	per_cpu(hrtimer_interrupts_missed, cpu) = 0;
	return false;
}
If we call it while the watchdog was touched, then the
(per_cpu(hrtimer_interrupts_saved, cpu) == hrint) branch still
increments per_cpu(hrtimer_interrupts_missed, cpu).

As a result, we might report a stall which should have been
hidden, for example:
  CPU0                                  CPU1

  <NMI>
    watchdog_hardlockup_check()  # passes
      is_hardlockup()            # no
        hr_int_saved = hr_int;
        hr_int_missed = 0;
  </NMI>
                                        local_irq_save()
                                        printk()
                                          console_trylock()
                                          console_unlock()
                                            console_flush_all()
                                              touch_nmi_watchdog()

                                        // Other CPUs print many messages,
                                        // e.g. during boot when
                                        // initializing a lot of HW
                                        for (i=0; i<1000; i++) do
                                          printk();
  <NMI>
    watchdog_hardlockup_check()
      is_hardlockup()            # yes
        hr_int_missed++          # 1
      # skip because touched
  </NMI>
                                        touch_nmi_watchdog()
  <NMI>
    watchdog_hardlockup_check()
      is_hardlockup()            # yes
        hr_int_missed++          # 2
      # skip because touched
  </NMI>

  ... repeat many times ...

                                        local_irq_restore()
                                        # this might normally trigger
                                        # handling of pending IRQs,
                                        # including the timers. But IMHO,
                                        # it can be offloaded to a kthread
                                        # (at least on RT)
  <NMI>
    watchdog_hardlockup_check()
      is_hardlockup()            # yes
        hr_int_missed++          # might be already 3, 4, ...

    Report hardlockup even though all the "hr_int_missed"
    increments should have been ignored because of
    touch_watchdog.
  </NMI>
A solution might be clearing "hrtimer_interrupts_missed"
when the watchdog was touched.
But honestly, I am not sure if this is worth the complexity.
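For illustration, the clearing idea could be modelled outside the
kernel like this. All names are hypothetical stand-ins for the per-CPU
variables, only one CPU is modelled, and the threshold is made up:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model of the suggested fix: keep updating the saved
 * interrupt count on every NMI, but clear the missed counter when the
 * watchdog was touched, so touched periods never accumulate toward
 * the threshold.
 */
enum { MISS_THRESH = 3 };

static unsigned int hrtimer_interrupts;
static unsigned int hrtimer_interrupts_saved;
static unsigned int hrtimer_interrupts_missed;
static bool watchdog_touched;

static void timer_fired(void)    { hrtimer_interrupts++; }
static void touch_watchdog(void) { watchdog_touched = true; }

/* One NMI-time check; returns true when a hardlockup would be reported. */
static bool watchdog_check(void)
{
	bool is_hl = false;

	if (hrtimer_interrupts_saved == hrtimer_interrupts) {
		hrtimer_interrupts_missed++;
		if (hrtimer_interrupts_missed >= MISS_THRESH)
			is_hl = true;
	} else {
		hrtimer_interrupts_saved = hrtimer_interrupts;
		hrtimer_interrupts_missed = 0;
	}

	if (watchdog_touched) {
		watchdog_touched = false;
		/* The fix: forget misses that happened while touched. */
		hrtimer_interrupts_missed = 0;
		return false;
	}

	return is_hl;
}
```

With this, the scenario above no longer reports: the touched NMIs reset
the counter, and a real stall still needs the full threshold of
untouched misses.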
Higher level look:
------------------
My understanding is that this patch has an effect only when
touch_nmi_watchdog() is called as frequently as
watchdog_hardlockup_check().
The original code gives the system more time to recover after
a known stall (touch_nmi_watchdog() called).
The new code is more eager to report a stall. It might be more prone
to report "false" positives.
IMHO, the root of the problem is that touch_nmi_watchdog() is
called too frequently. And this patch is rather dancing around
the problem than fixing it.
Alternative:
------------
An alternative solution might be to detect and report when too many
watchdog_hardlockup_check() calls are ignored because of
touch_nmi_watchdog().
It might help to find a mis-use of touch_nmi_watchdog(). The question
is what details should be reported in this case.
It should be optional because touch_nmi_watchdog() is supposed
to hide "well-known" sinners after all.
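Such accounting could be sketched as follows (user-space model;
the threshold, counter name, and message are all made up for
illustration, and a real patch would make the report optional):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Count consecutive checks that were skipped because of a touch, and
 * warn once when the streak gets suspiciously long.
 */
enum { TOUCH_REPORT_THRESH = 5 };

static bool watchdog_touched;
static unsigned int touched_skip_count;
static bool touch_overuse_reported;

static void touch_watchdog(void) { watchdog_touched = true; }

/* Returns true when the check was skipped because of a touch. */
static bool watchdog_check_with_accounting(void)
{
	if (watchdog_touched) {
		watchdog_touched = false;
		touched_skip_count++;
		if (touched_skip_count >= TOUCH_REPORT_THRESH &&
		    !touch_overuse_reported) {
			touch_overuse_reported = true;
			printf("watchdog: %u consecutive checks hidden by touch\n",
			       touched_skip_count);
		}
		return true;
	}
	/* A normal check ran; the streak is over. */
	touched_skip_count = 0;
	return false;
}
```

This would not change detection behavior at all; it would only make
the over-use of touch_nmi_watchdog() visible.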
> @@ -195,13 +209,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>
> hardlockup_all_cpu_backtrace = (hardlockup_si_mask & SYS_INFO_ALL_BT) ?
> 1 : sysctl_hardlockup_all_cpu_backtrace;
> - /*
> - * Check for a hardlockup by making sure the CPU's timer
> - * interrupt is incrementing. The timer interrupt should have
> - * fired multiple times before we overflow'd. If it hasn't
> - * then this is a good indication the cpu is stuck
> - */
> - if (is_hardlockup(cpu)) {
> +
> + if (is_hl) {
> unsigned int this_cpu = smp_processor_id();
> unsigned long flags;
>
Best Regards,
Petr