From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Jinchao Wang <wangjinchao600@gmail.com>,
	Doug Anderson <dianders@chromium.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>,
	Yunhui Cui <cuiyunhui@bytedance.com>,
	akpm@linux-foundation.org, catalin.marinas@arm.com,
	maddy@linux.ibm.com, mpe@ellerman.id.au, npiggin@gmail.com,
	christophe.leroy@csgroup.eu, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, acme@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	adrian.hunter@intel.com, kan.liang@linux.intel.com,
	kees@kernel.org, masahiroy@kernel.org, aliceryhl@google.com,
	ojeda@kernel.org, thomas.weissschuh@linutronix.de,
	xur@google.com, ruanjinjie@huawei.com, gshan@redhat.com,
	maz@kernel.org, suzuki.poulose@arm.com, zhanjie9@hisilicon.com,
	yangyicong@hisilicon.com, gautam@linux.ibm.com, arnd@arndb.de,
	zhao.xichao@vivo.com, rppt@kernel.org, lihuafei1@huawei.com,
	coxu@redhat.com, jpoimboe@kernel.org, yaozhenguo1@gmail.com,
	luogengkun@huaweicloud.com, max.kellermann@ionos.com,
	tj@kernel.org, yury.norov@gmail.com, thorsten.blum@linux.dev,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, linux-perf-users@vger.kernel.org
Subject: Re: [RFC PATCH V1] watchdog: Add boot-time selection for hard lockup detector
Date: Tue, 16 Sep 2025 22:35:46 -0700
Message-ID: <aMpIsqcgpOH1AObN@z2>
In-Reply-To: <CAP-5=fX7NJmBjd1v5y4xCa0Ce5rNZ8Dqg0LXd12gPrdEQCERVA@mail.gmail.com>

Hello,

On Tue, Sep 16, 2025 at 10:13:12PM -0700, Ian Rogers wrote:
> On Tue, Sep 16, 2025 at 6:47 PM Jinchao Wang <wangjinchao600@gmail.com> wrote:
> >
> > On Tue, Sep 16, 2025 at 05:03:48PM -0700, Ian Rogers wrote:
> > > On Tue, Sep 16, 2025 at 7:51 AM Jinchao Wang <wangjinchao600@gmail.com> wrote:
> > > >
> > > > Currently, the hard lockup detector is selected at compile time via
> > > > Kconfig, which requires a kernel rebuild to switch implementations.
> > > > This is inflexible, especially on systems where a perf event may not
> > > > be available or may be needed for other tasks.
> > > >
> > > > This commit refactors the hard lockup detector to replace a rigid
> > > > compile-time choice with a flexible build-time and boot-time solution.
> > > > The patch supports building the kernel with either detector
> > > > independently, or with both. When both are built, a new boot parameter
> > > > `hardlockup_detector="perf|buddy"` allows the selection at boot time.
> > > > This is a more robust and user-friendly design.
> > > >
> > > > This patch is a follow-up to the discussion on the kernel mailing list
> > > > regarding the preference and future of the hard lockup detectors. It
> > > > implements a flexible solution that addresses the community's need to
> > > > select an appropriate detector at boot time.
> > > >
> > > > The core changes are:
> > > > - The `perf` and `buddy` watchdog implementations are separated into
> > > >   distinct functions (e.g., `watchdog_perf_hardlockup_enable`).
> > > > - Global function pointers are introduced (`watchdog_hardlockup_enable_ptr`)
> > > >   to serve as a single API for the entire feature.
> > > > - A new `hardlockup_detector=` boot parameter is added to allow the
> > > >   user to select the desired detector at boot time.
> > > > - The Kconfig options are simplified by removing the complex
> > > >   `HARDLOCKUP_DETECTOR_PREFER_BUDDY` and allowing both detectors to be
> > > >   built without mutual exclusion.
> > > > - The weak stubs are updated to call the new function pointers,
> > > >   centralizing the watchdog logic.
> > >
> > > What is the impact on /proc/sys/kernel/nmi_watchdog? Is that
> > > enabling and disabling whatever the boot time choice was? I'm not sure
> > > why this has to be a boot time option given the ability to configure
> > > via /proc/sys/kernel/nmi_watchdog.
> > The new hardlockup_detector boot parameter and the existing
> > /proc/sys/kernel/nmi_watchdog file serve different purposes.
> >
> > The boot parameter selects the type of hard lockup detector (perf or buddy).
> > This choice is made once at boot.
> >
> >  /proc/sys/kernel/nmi_watchdog, on the other hand, is only a simple on/off
> > switch for the currently selected detector. It does not change the detector's
> > type.
> 
> So the name "nmi_watchdog" for the buddy watchdog is wrong for fairly
> obvious naming reasons but also because we can't differentiate when a
> perf event has been taken or not - this impacts perf that is choosing
> not to group events in metrics because of it, reducing the metric's
> accuracy. We need an equivalent "buddy_watchdog" file to the
> "nmi_watchdog" file. If we have such a file then if I did "echo 1 >
> /proc/sys/kernel/nmi_watchdog" I'd expect the buddy watchdog to be
> disabled and the perf event one to be enabled. Similarly, if I did
> "echo 1 > /proc/sys/kernel/buddy_watchdog" then I would expect the
> perf event watchdog to be disabled and the buddy one enabled. If I did
>  "echo 0 > /proc/sys/kernel/nmi_watchdog; echo 0 >
> /proc/sys/kernel/buddy_watchdog" then I'd expect neither to be
> enabled. I don't see why choosing the type of watchdog implementation
> at boot time is particularly desirable. It seems sensible to default
> normal people to using the buddy watchdog (more perf events, power...)
> and CONFIG_DEBUG_KERNEL type people to using the perf event one. As
> the "nmi_watchdog" file may be assumed to control the buddy watchdog,
> perhaps a compatibility option (where the "nmi_watchdog" file controls
> the buddy watchdog) is needed so that user code has time to migrate.

Sounds good to me.  For perf tools, it'd be great if we could have a
runtime check for which watchdog is selected.

Thanks,
Namhyung
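
A minimal sketch of what such a runtime check might look like from user
space, assuming the split sysctl files proposed above. Only
/proc/sys/kernel/nmi_watchdog exists today; the buddy_watchdog path, the
enum, and both helper names are hypothetical and used for illustration
only. It is not the perf tools implementation.

```c
#include <assert.h>
#include <stdio.h>

#define NMI_WATCHDOG_PATH   "/proc/sys/kernel/nmi_watchdog"
/* Hypothetical: this file is only proposed in the thread, it does not exist. */
#define BUDDY_WATCHDOG_PATH "/proc/sys/kernel/buddy_watchdog"

enum watchdog_kind {
	WATCHDOG_NONE,		/* no hard lockup detector is enabled */
	WATCHDOG_PERF,		/* perf-event (NMI) detector, holds a counter */
	WATCHDOG_BUDDY,		/* buddy detector, no perf event taken */
	WATCHDOG_UNKNOWN,	/* cannot tell from the files available */
};

/* Return 1 for a nonzero value, 0 for zero, -1 if missing or unparsable. */
static int read_sysctl_flag(const char *path)
{
	FILE *f;
	int value;
	int ret = -1;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) == 1)
		ret = (value != 0);
	fclose(f);
	return ret;
}

/* Classify the active detector from the two (assumed) sysctl files. */
static enum watchdog_kind detect_watchdog(const char *nmi_path,
					  const char *buddy_path)
{
	int nmi = read_sysctl_flag(nmi_path);
	int buddy = read_sysctl_flag(buddy_path);

	if (buddy == 1)
		return WATCHDOG_BUDDY;
	if (nmi == 1) {
		/*
		 * With no buddy file, nmi_watchdog=1 is ambiguous: it may
		 * be controlling either detector, as noted in the thread.
		 */
		return buddy == 0 ? WATCHDOG_PERF : WATCHDOG_UNKNOWN;
	}
	if (nmi == 0)
		return WATCHDOG_NONE;
	return WATCHDOG_UNKNOWN;
}
```

A caller would use detect_watchdog(NMI_WATCHDOG_PATH, BUDDY_WATCHDOG_PATH)
and only avoid grouping metric events when the result is WATCHDOG_PERF.
The WATCHDOG_UNKNOWN case corresponds to the current situation, where the
single nmi_watchdog file does not say whether a perf event has been taken.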

