From: Jinchao Wang <wangjinchao600@gmail.com>
To: Ian Rogers <irogers@google.com>
Cc: Doug Anderson <dianders@chromium.org>,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will@kernel.org>,
Yunhui Cui <cuiyunhui@bytedance.com>,
akpm@linux-foundation.org, catalin.marinas@arm.com,
maddy@linux.ibm.com, mpe@ellerman.id.au, npiggin@gmail.com,
christophe.leroy@csgroup.eu, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
hpa@zytor.com, acme@kernel.org, namhyung@kernel.org,
mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
jolsa@kernel.org, adrian.hunter@intel.com,
kan.liang@linux.intel.com, kees@kernel.org, masahiroy@kernel.org,
aliceryhl@google.com, ojeda@kernel.org,
thomas.weissschuh@linutronix.de, xur@google.com,
ruanjinjie@huawei.com, gshan@redhat.com, maz@kernel.org,
suzuki.poulose@arm.com, zhanjie9@hisilicon.com,
yangyicong@hisilicon.com, gautam@linux.ibm.com, arnd@arndb.de,
zhao.xichao@vivo.com, rppt@kernel.org, lihuafei1@huawei.com,
coxu@redhat.com, jpoimboe@kernel.org, yaozhenguo1@gmail.com,
luogengkun@huaweicloud.com, max.kellermann@ionos.com,
tj@kernel.org, yury.norov@gmail.com, thorsten.blum@linux.dev,
x86@kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, linux-perf-users@vger.kernel.org
Subject: Re: [RFC PATCH V1] watchdog: Add boot-time selection for hard lockup detector
Date: Wed, 17 Sep 2025 09:47:37 +0800
Message-ID: <aMoTOXIKBYVTj7PV@mdev>
In-Reply-To: <CAP-5=fWWOQ-6SWiNVBvb5mCofe0kZUURG_bm0PDsVFWqwDwrXg@mail.gmail.com>
On Tue, Sep 16, 2025 at 05:03:48PM -0700, Ian Rogers wrote:
> On Tue, Sep 16, 2025 at 7:51 AM Jinchao Wang <wangjinchao600@gmail.com> wrote:
> >
> > Currently, the hard lockup detector is selected at compile time via
> > Kconfig, which requires a kernel rebuild to switch implementations.
> > This is inflexible, especially on systems where a perf event may not
> > be available or may be needed for other tasks.
> >
> > This commit refactors the hard lockup detector to replace a rigid
> > compile-time choice with a flexible build-time and boot-time solution.
> > The patch supports building the kernel with either detector
> > independently, or with both. When both are built, a new boot parameter
> > `hardlockup_detector="perf|buddy"` allows the selection at boot time.
> > This is a more robust and user-friendly design.
> >
> > This patch is a follow-up to the discussion on the kernel mailing list
> > regarding the preference and future of the hard lockup detectors. It
> > implements a flexible solution that addresses the community's need to
> > select an appropriate detector at boot time.
> >
> > The core changes are:
> > - The `perf` and `buddy` watchdog implementations are separated into
> > distinct functions (e.g., `watchdog_perf_hardlockup_enable`).
> > - Global function pointers are introduced (`watchdog_hardlockup_enable_ptr`)
> > to serve as a single API for the entire feature.
> > - A new `hardlockup_detector=` boot parameter is added to allow the
> > user to select the desired detector at boot time.
> > - The Kconfig options are simplified by removing the complex
> > `HARDLOCKUP_DETECTOR_PREFER_BUDDY` and allowing both detectors to be
> > built without mutual exclusion.
> > - The weak stubs are updated to call the new function pointers,
> > centralizing the watchdog logic.
>
> What is the impact on /proc/sys/kernel/nmi_watchdog ? Is that
> enabling and disabling whatever the boot time choice was? I'm not sure
> why this has to be a boot time option given the ability to configure
> via /proc/sys/kernel/nmi_watchdog.

The new hardlockup_detector= boot parameter and the existing
/proc/sys/kernel/nmi_watchdog file serve different purposes.

The boot parameter selects the type of hard lockup detector (perf or
buddy). That choice is made once, at boot.

/proc/sys/kernel/nmi_watchdog, on the other hand, is only an on/off
switch for whichever detector was selected at boot. It does not change
the detector's type.
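
To make the split concrete, here is a rough user-space model of the two
knobs. It is only a sketch, not kernel code: every name in it
(model_boot_param, model_sysctl_write, model_active_detector) is made up
for illustration, and it only mirrors the behaviour described above,
with "perf" as the default as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* User-space model only: none of these names are kernel symbols. */
enum detector { DETECTOR_PERF, DETECTOR_BUDDY };

static enum detector selected = DETECTOR_PERF; /* the patch defaults to perf */
static bool enabled = true;                    /* models nmi_watchdog = 1 */

/* Models hardlockup_detector=<name>: picks the detector type at boot. */
static void model_boot_param(const char *name)
{
	assert(name != NULL);

	if (!strncmp(name, "perf", 4))
		selected = DETECTOR_PERF;
	else if (!strncmp(name, "buddy", 5))
		selected = DETECTOR_BUDDY;
	/* Any other string leaves the default in place. */
}

/* Models a write to /proc/sys/kernel/nmi_watchdog: on/off only. */
static void model_sysctl_write(int value)
{
	enabled = (value != 0);
}

/* Reports which detector, if any, would be running in this model. */
static const char *model_active_detector(void)
{
	if (!enabled)
		return "none";
	return selected == DETECTOR_PERF ? "perf" : "buddy";
}
```

In this model, calling model_boot_param("buddy") and then toggling
model_sysctl_write(0) and model_sysctl_write(1) switches the buddy
detector off and back on; nothing the sysctl side does can turn it into
the perf detector.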
>
> > Link: https://lore.kernel.org/all/20250915035355.10846-1-cuiyunhui@bytedance.com/
> > Link: https://lore.kernel.org/all/CAD=FV=WWUiCi6bZCs_gseFpDDWNkuJMoL6XCftEo6W7q6jRCkg@mail.gmail.com/
> >
> > Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
> > ---
> > .../admin-guide/kernel-parameters.txt | 7 +++
> > include/linux/nmi.h | 6 +++
> > kernel/watchdog.c | 46 ++++++++++++++++++-
> > kernel/watchdog_buddy.c | 7 +--
> > kernel/watchdog_perf.c | 10 ++--
> > lib/Kconfig.debug | 37 +++++++--------
> > 6 files changed, 85 insertions(+), 28 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 5a7a83c411e9..0af214ee566c 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -1828,6 +1828,13 @@
> > backtraces on all cpus.
> > Format: 0 | 1
> >
> > + hardlockup_detector=
> > + [perf, buddy] Selects the hard lockup detector to use at
> > + boot time.
> > + Format: <string>
> > + - "perf": Use the perf-based detector.
> > + - "buddy": Use the buddy-based detector.
> > +
> > hash_pointers=
> > [KNL,EARLY]
> > By default, when pointers are printed to the console
> > diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> > index cf3c6ab408aa..9298980ce572 100644
> > --- a/include/linux/nmi.h
> > +++ b/include/linux/nmi.h
> > @@ -100,6 +100,9 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs);
> > #endif
> >
> > #if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
> > +void watchdog_perf_hardlockup_enable(unsigned int cpu);
> > +void watchdog_perf_hardlockup_disable(unsigned int cpu);
> > +extern int watchdog_perf_hardlockup_probe(void);
> > extern void hardlockup_detector_perf_stop(void);
> > extern void hardlockup_detector_perf_restart(void);
> > extern void hardlockup_config_perf_event(const char *str);
> > @@ -120,6 +123,9 @@ void watchdog_hardlockup_disable(unsigned int cpu);
> > void lockup_detector_reconfigure(void);
> >
> > #ifdef CONFIG_HARDLOCKUP_DETECTOR_BUDDY
> > +void watchdog_buddy_hardlockup_enable(unsigned int cpu);
> > +void watchdog_buddy_hardlockup_disable(unsigned int cpu);
> > +int watchdog_buddy_hardlockup_probe(void);
> > void watchdog_buddy_check_hardlockup(int hrtimer_interrupts);
> > #else
> > static inline void watchdog_buddy_check_hardlockup(int hrtimer_interrupts) {}
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index 80b56c002c7f..85451d24a77d 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -55,6 +55,37 @@ unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
> >
> > #ifdef CONFIG_HARDLOCKUP_DETECTOR
> >
> > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
> > +/* The global function pointers */
> > +void (*watchdog_hardlockup_enable_ptr)(unsigned int cpu) = watchdog_perf_hardlockup_enable;
> > +void (*watchdog_hardlockup_disable_ptr)(unsigned int cpu) = watchdog_perf_hardlockup_disable;
> > +int (*watchdog_hardlockup_probe_ptr)(void) = watchdog_perf_hardlockup_probe;
> > +#elif defined(CONFIG_HARDLOCKUP_DETECTOR_BUDDY)
> > +void (*watchdog_hardlockup_enable_ptr)(unsigned int cpu) = watchdog_buddy_hardlockup_enable;
> > +void (*watchdog_hardlockup_disable_ptr)(unsigned int cpu) = watchdog_buddy_hardlockup_disable;
> > +int (*watchdog_hardlockup_probe_ptr)(void) = watchdog_buddy_hardlockup_probe;
> > +#endif
> > +
> > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_MULTIPLE
> > +static char *hardlockup_detector_type = "perf"; /* Default to perf */
> > +static int __init set_hardlockup_detector_type(char *str)
> > +{
> > + if (!strncmp(str, "perf", 4)) {
> > + watchdog_hardlockup_enable_ptr = watchdog_perf_hardlockup_enable;
> > + watchdog_hardlockup_disable_ptr = watchdog_perf_hardlockup_disable;
> > + watchdog_hardlockup_probe_ptr = watchdog_perf_hardlockup_probe;
> > + } else if (!strncmp(str, "buddy", 5)) {
> > + watchdog_hardlockup_enable_ptr = watchdog_buddy_hardlockup_enable;
> > + watchdog_hardlockup_disable_ptr = watchdog_buddy_hardlockup_disable;
> > + watchdog_hardlockup_probe_ptr = watchdog_buddy_hardlockup_probe;
> > + }
> > + return 1;
> > +}
> > +
> > +__setup("hardlockup_detector=", set_hardlockup_detector_type);
> > +
> > +#endif
> > +
> > # ifdef CONFIG_SMP
> > int __read_mostly sysctl_hardlockup_all_cpu_backtrace;
> > # endif /* CONFIG_SMP */
> > @@ -262,9 +293,17 @@ static inline void watchdog_hardlockup_kick(void) { }
> > * softlockup watchdog start and stop. The detector must select the
> > * SOFTLOCKUP_DETECTOR Kconfig.
> > */
> > -void __weak watchdog_hardlockup_enable(unsigned int cpu) { }
> > +void __weak watchdog_hardlockup_enable(unsigned int cpu)
> > +{
> > + if (watchdog_hardlockup_enable_ptr)
> > + watchdog_hardlockup_enable_ptr(cpu);
> > +}
> >
> > -void __weak watchdog_hardlockup_disable(unsigned int cpu) { }
> > +void __weak watchdog_hardlockup_disable(unsigned int cpu)
> > +{
> > + if (watchdog_hardlockup_disable_ptr)
> > + watchdog_hardlockup_disable_ptr(cpu);
> > +}
> >
> > /*
> > * Watchdog-detector specific API.
> > @@ -275,6 +314,9 @@ void __weak watchdog_hardlockup_disable(unsigned int cpu) { }
> > */
> > int __weak __init watchdog_hardlockup_probe(void)
> > {
> > + if (watchdog_hardlockup_probe_ptr)
> > + return watchdog_hardlockup_probe_ptr();
> > +
> > return -ENODEV;
> > }
> >
> > diff --git a/kernel/watchdog_buddy.c b/kernel/watchdog_buddy.c
> > index ee754d767c21..390d89bfcafa 100644
> > --- a/kernel/watchdog_buddy.c
> > +++ b/kernel/watchdog_buddy.c
> > @@ -19,15 +19,16 @@ static unsigned int watchdog_next_cpu(unsigned int cpu)
> > return next_cpu;
> > }
> >
> > -int __init watchdog_hardlockup_probe(void)
> > +int __init watchdog_buddy_hardlockup_probe(void)
> > {
> > return 0;
> > }
> >
> > -void watchdog_hardlockup_enable(unsigned int cpu)
> > +void watchdog_buddy_hardlockup_enable(unsigned int cpu)
> > {
> > unsigned int next_cpu;
> >
> > + pr_info("ddddd %s\n", __func__);
> > /*
> > * The new CPU will be marked online before the hrtimer interrupt
> > * gets a chance to run on it. If another CPU tests for a
> > @@ -58,7 +59,7 @@ void watchdog_hardlockup_enable(unsigned int cpu)
> > cpumask_set_cpu(cpu, &watchdog_cpus);
> > }
> >
> > -void watchdog_hardlockup_disable(unsigned int cpu)
> > +void watchdog_buddy_hardlockup_disable(unsigned int cpu)
> > {
> > unsigned int next_cpu = watchdog_next_cpu(cpu);
> >
> > diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
> > index 9c58f5b4381d..270110e58f20 100644
> > --- a/kernel/watchdog_perf.c
> > +++ b/kernel/watchdog_perf.c
> > @@ -153,10 +153,12 @@ static int hardlockup_detector_event_create(void)
> > * watchdog_hardlockup_enable - Enable the local event
> > * @cpu: The CPU to enable hard lockup on.
> > */
> > -void watchdog_hardlockup_enable(unsigned int cpu)
> > +void watchdog_perf_hardlockup_enable(unsigned int cpu)
> > {
> > WARN_ON_ONCE(cpu != smp_processor_id());
> >
> > + pr_info("ddddd %s\n", __func__);
> > +
> > if (hardlockup_detector_event_create())
> > return;
> >
> > @@ -172,7 +174,7 @@ void watchdog_hardlockup_enable(unsigned int cpu)
> > * watchdog_hardlockup_disable - Disable the local event
> > * @cpu: The CPU to enable hard lockup on.
> > */
> > -void watchdog_hardlockup_disable(unsigned int cpu)
> > +void watchdog_perf_hardlockup_disable(unsigned int cpu)
> > {
> > struct perf_event *event = this_cpu_read(watchdog_ev);
> >
> > @@ -257,10 +259,12 @@ bool __weak __init arch_perf_nmi_is_available(void)
> > /**
> > * watchdog_hardlockup_probe - Probe whether NMI event is available at all
> > */
> > -int __init watchdog_hardlockup_probe(void)
> > +int __init watchdog_perf_hardlockup_probe(void)
> > {
> > int ret;
> >
> > + pr_info("ddddd %s\n", __func__);
> > +
> > if (!arch_perf_nmi_is_available())
> > return -ENODEV;
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index dc0e0c6ed075..443353fad1c1 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1167,36 +1167,33 @@ config HARDLOCKUP_DETECTOR
> > #
> > # Note that arch-specific variants are always preferred.
> > #
> > -config HARDLOCKUP_DETECTOR_PREFER_BUDDY
> > - bool "Prefer the buddy CPU hardlockup detector"
> > - depends on HARDLOCKUP_DETECTOR
> > - depends on HAVE_HARDLOCKUP_DETECTOR_PERF && HAVE_HARDLOCKUP_DETECTOR_BUDDY
> > - depends on !HAVE_HARDLOCKUP_DETECTOR_ARCH
> > - help
> > - Say Y here to prefer the buddy hardlockup detector over the perf one.
> > -
> > - With the buddy detector, each CPU uses its softlockup hrtimer
> > - to check that the next CPU is processing hrtimer interrupts by
> > - verifying that a counter is increasing.
> > -
> > - This hardlockup detector is useful on systems that don't have
> > - an arch-specific hardlockup detector or if resources needed
> > - for the hardlockup detector are better used for other things.
> > -
> > config HARDLOCKUP_DETECTOR_PERF
> > - bool
> > + bool "Enable perf-based hard lockup detector (preferred)"
> > depends on HARDLOCKUP_DETECTOR
> > - depends on HAVE_HARDLOCKUP_DETECTOR_PERF && !HARDLOCKUP_DETECTOR_PREFER_BUDDY
> > + depends on HAVE_HARDLOCKUP_DETECTOR_PERF
> > depends on !HAVE_HARDLOCKUP_DETECTOR_ARCH
> > select HARDLOCKUP_DETECTOR_COUNTS_HRTIMER
> > + help
> > + This detector uses a perf event on the CPU to detect when a CPU
> > + has become non-maskable interrupt (NMI) stuck. This is the
> > + preferred method on modern systems as it can detect lockups on
> > + all CPUs at the same time.
>
> I'd say this option should be the default for kernel developers but
> shouldn't be used by default to free the perf event and due to the
> extra power overhead.
>
> Thanks,
> Ian
>
> > config HARDLOCKUP_DETECTOR_BUDDY
> > - bool
> > + bool "Enable buddy-based hard lockup detector"
> > depends on HARDLOCKUP_DETECTOR
> > depends on HAVE_HARDLOCKUP_DETECTOR_BUDDY
> > - depends on !HAVE_HARDLOCKUP_DETECTOR_PERF || HARDLOCKUP_DETECTOR_PREFER_BUDDY
> > depends on !HAVE_HARDLOCKUP_DETECTOR_ARCH
> > select HARDLOCKUP_DETECTOR_COUNTS_HRTIMER
> > + help
> > + This is an alternative lockup detector that uses a heartbeat
> > + mechanism between CPUs to detect when one has stopped responding.
> > + It is less precise than the perf-based detector and cannot detect
> > + all-CPU lockups, but it does not require a perf counter.
> > +
> > +config CONFIG_HARDLOCKUP_DETECTOR_MULTIPLE
> > + bool
> > + depends on HARDLOCKUP_DETECTOR_PERF && HARDLOCKUP_DETECTOR_BUDDY
> >
> > config HARDLOCKUP_DETECTOR_ARCH
> > bool
> > --
> > 2.43.0
> >
--
Jinchao