From: Pingfan Liu <kernelfans@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Pingfan Liu <kernelfans@gmail.com>,
Sumit Garg <sumit.garg@linaro.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
Marc Zyngier <maz@kernel.org>, Kees Cook <keescook@chromium.org>,
Masahiro Yamada <masahiroy@kernel.org>,
Sami Tolvanen <samitolvanen@google.com>,
Petr Mladek <pmladek@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Wang Qing <wangqing@vivo.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Santosh Sivaraj <santosh@fossix.org>,
linux-arm-kernel@lists.infradead.org
Subject: [PATCHv3 3/4] kernel/watchdog: Adapt the watchdog_hld interface for async model
Date: Thu, 14 Oct 2021 10:41:54 +0800
Message-ID: <20211014024155.15253-4-kernelfans@gmail.com>
In-Reply-To: <20211014024155.15253-1-kernelfans@gmail.com>
When lockup_detector_init() calls watchdog_nmi_probe(), the PMU may not
be ready yet. E.g. on arm64, the PMU is not ready until
device_initcall(armv8_pmu_driver_init), and that initialization is
deeply integrated with the driver model and cpuhp. Hence it is hard to
move it before smp_init().
It is easier to take the opposite approach and let watchdog_hld pick up
the PMU capability asynchronously.
The async model is implemented by letting watchdog_nmi_probe() return
-EBUSY, and by adding a work_struct that waits on a wait_queue_head and
then redoes the initialization.
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
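Note for readers, not part of the commit: the async flow described in the
changelog can be modelled in user space. The sketch below is only an
illustration and is not kernel code. It assumes a pthread mutex and
condition variable standing in for hld_detector_wait, a thread standing in
for detector_work, and hypothetical model_* helpers plus a pmu_ready flag
that do not exist in the kernel.

```c
/* User-space model of the deferred probe. Illustration only, not kernel code. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

enum hld_detector_state { DELAY_INIT_NOP, DELAY_INIT_WAIT, DELAY_INIT_READY };

static enum hld_detector_state detector_delay_init_state = DELAY_INIT_NOP;
static bool pmu_ready;			/* hypothetical stand-in for PMU driver state */
static bool nmi_watchdog_available;

/* A mutex/condvar pair plays the role of hld_detector_wait. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t hld_detector_wait = PTHREAD_COND_INITIALIZER;
static pthread_t detector_work;		/* a thread plays the role of the work item */

/* Models watchdog_nmi_probe(): -EBUSY until the PMU is up. Call with lock held. */
static int model_probe(void)
{
	return pmu_ready ? 0 : -EBUSY;
}

/* Models lockup_detector_delay_init(): sleep until READY, then probe again. */
static void *model_delay_init(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&lock);
	while (detector_delay_init_state != DELAY_INIT_READY)
		pthread_cond_wait(&hld_detector_wait, &lock);
	assert(detector_delay_init_state == DELAY_INIT_READY);
	if (model_probe() == 0)
		nmi_watchdog_available = true;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Models lockup_detector_init(): returns the result of the first probe. */
static int model_lockup_detector_init(void)
{
	int ret;

	pthread_mutex_lock(&lock);
	ret = model_probe();
	if (ret == 0)
		nmi_watchdog_available = true;
	else if (ret == -EBUSY)
		detector_delay_init_state = DELAY_INIT_WAIT;
	pthread_mutex_unlock(&lock);

	/* Defer the second probe to the worker, like queue_work_on() in the patch. */
	if (ret == -EBUSY &&
	    pthread_create(&detector_work, NULL, model_delay_init, NULL) != 0)
		return -EAGAIN;
	return ret;
}

/* Models the arch side: the PMU driver has finished, so wake the waiter. */
static void model_pmu_driver_ready(void)
{
	pthread_mutex_lock(&lock);
	pmu_ready = true;
	detector_delay_init_state = DELAY_INIT_READY;
	pthread_cond_broadcast(&hld_detector_wait);
	pthread_mutex_unlock(&lock);
}

/* Models flush_work(&detector_work): wait for the deferred probe to finish. */
static void model_flush(void)
{
	pthread_join(detector_work, NULL);
}
```

In this model, calling model_lockup_detector_init() before the PMU is up
returns -EBUSY and leaves the state at DELAY_INIT_WAIT; after
model_pmu_driver_ready() and model_flush(), nmi_watchdog_available is set.
Older toolchains may need -pthread to build it.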
include/linux/nmi.h | 9 +++++++
kernel/watchdog.c | 57 +++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 64 insertions(+), 2 deletions(-)
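Note for readers, not part of the commit: lockup_detector_check() in the
diff below acts as a safety net at late_initcall_sync time. Its decision
logic can be written as a pure function over the three states. The sketch
below is a user-space illustration; struct check_result and
model_lockup_detector_check() are hypothetical names. The reading that the
flush exists so that the __init work function and __initdata objects are
no longer in use once initcalls finish is an assumption, the patch does
not say so.

```c
/* User-space model of the late_initcall_sync check. Illustration only. */
#include <assert.h>
#include <stdbool.h>

enum hld_detector_state { DELAY_INIT_NOP, DELAY_INIT_WAIT, DELAY_INIT_READY };

/* Hypothetical record of what the check would do for a given state. */
struct check_result {
	enum hld_detector_state next;	/* state after the check */
	bool warned;			/* WARN_ON() would have fired */
	bool flushed;			/* flush_work() would have been called */
};

static struct check_result
model_lockup_detector_check(enum hld_detector_state s)
{
	struct check_result r = { .next = s, .warned = false, .flushed = false };

	assert(s >= DELAY_INIT_NOP && s <= DELAY_INIT_READY);

	/* No deferred probe was ever queued: nothing to do. */
	if (s < DELAY_INIT_WAIT)
		return r;

	/*
	 * Still waiting this late means nobody woke the worker: warn, then
	 * force READY so the worker cannot sleep forever.
	 */
	if (s == DELAY_INIT_WAIT) {
		r.warned = true;
		r.next = DELAY_INIT_READY;
	}

	/* In both remaining cases, wait for the deferred probe to complete. */
	r.flushed = true;
	return r;
}
```

For DELAY_INIT_NOP the model does nothing; for DELAY_INIT_READY it only
flushes; for DELAY_INIT_WAIT it warns, forces DELAY_INIT_READY and then
flushes.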
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b7bcd63c36b4..9def85c00bd8 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -118,6 +118,15 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
+
+enum hld_detector_state {
+	DELAY_INIT_NOP,
+	DELAY_INIT_WAIT,
+	DELAY_INIT_READY
+};
+
+extern enum hld_detector_state detector_delay_init_state;
+extern struct wait_queue_head hld_detector_wait;
 int watchdog_nmi_probe(void);
 void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 6e6dd5f0bc3e..2f267d21a7a1 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -103,7 +103,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 	hardlockup_detector_perf_disable();
 }
-/* Return 0, if a NMI watchdog is available. Error code otherwise */
+/*
+ * Arch specific API. Return 0 if an NMI watchdog is available, -EBUSY if it
+ * is not ready yet (arch code should then wake up hld_detector_wait once it
+ * is), or another negative value if it is not supported.
+ */
 int __weak __init watchdog_nmi_probe(void)
 {
 	return hardlockup_detector_perf_init();
@@ -739,15 +743,64 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 }
 #endif /* CONFIG_SYSCTL */
+static void lockup_detector_delay_init(struct work_struct *work);
+enum hld_detector_state detector_delay_init_state __initdata;
+
+struct wait_queue_head hld_detector_wait __initdata =
+	__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
+
+static struct work_struct detector_work __initdata =
+	__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
+
+static void __init lockup_detector_delay_init(struct work_struct *work)
+{
+	int ret;
+
+	wait_event(hld_detector_wait,
+			detector_delay_init_state == DELAY_INIT_READY);
+	ret = watchdog_nmi_probe();
+	if (!ret) {
+		nmi_watchdog_available = true;
+		lockup_detector_setup();
+	} else {
+		WARN_ON(ret == -EBUSY);
+		pr_info("Perf NMI watchdog permanently disabled\n");
+	}
+}
+
+/* Ensure the check is called after the initialization of PMU driver */
+static int __init lockup_detector_check(void)
+{
+	if (detector_delay_init_state < DELAY_INIT_WAIT)
+		return 0;
+
+	if (WARN_ON(detector_delay_init_state == DELAY_INIT_WAIT)) {
+		detector_delay_init_state = DELAY_INIT_READY;
+		wake_up(&hld_detector_wait);
+	}
+	flush_work(&detector_work);
+	return 0;
+}
+late_initcall_sync(lockup_detector_check);
+
+
 void __init lockup_detector_init(void)
 {
+	int ret;
+
 	if (tick_nohz_full_enabled())
 		pr_info("Disabling watchdog on nohz_full cores by default\n");
 	cpumask_copy(&watchdog_cpumask,
 		     housekeeping_cpumask(HK_FLAG_TIMER));
-	if (!watchdog_nmi_probe())
+	ret = watchdog_nmi_probe();
+	if (!ret)
 		nmi_watchdog_available = true;
+	else if (ret == -EBUSY) {
+		detector_delay_init_state = DELAY_INIT_WAIT;
+		queue_work_on(smp_processor_id(), system_wq, &detector_work);
+	}
+
 	lockup_detector_setup();
 }
--
2.31.1