* [v4 PATCH 0/2] hung_task: Provide runtime reset interface for hung task detector
@ 2025-12-22 1:42 Aaron Tomlin
2025-12-22 1:42 ` [v4 PATCH 1/2] hung_task: Introduce helper for hung task warning Aaron Tomlin
2025-12-22 1:42 ` [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count Aaron Tomlin
0 siblings, 2 replies; 5+ messages in thread
From: Aaron Tomlin @ 2025-12-22 1:42 UTC
To: akpm, lance.yang, mhiramat, gregkh, pmladek, joel.granados
Cc: sean, linux-kernel
Hi Lance, Greg, Petr, Joel,
This series introduces the ability to reset
/proc/sys/kernel/hung_task_detect_count.
Writing a zero value to this file atomically resets the counter of detected
hung tasks. This functionality provides system administrators with the
means to clear the cumulative diagnostic history following incident
resolution, thereby simplifying subsequent monitoring without necessitating
a system restart.
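As a rough illustration of the write semantics described above (a zero resets the counter, anything else is rejected), here is a minimal userspace sketch in C11. The function name reset_on_zero_write() and the use of strtoul() are assumptions made for this example only; the kernel handler in patch 2/2 parses the value with proc_doulongvec_minmax() instead.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of the reset-on-zero rule.
 * Returns 0 and clears *counter when buf parses as the value 0;
 * returns -EINVAL for non-zero or unparsable input and leaves
 * the counter untouched.
 */
int reset_on_zero_write(atomic_ulong *counter, const char *buf)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;		/* nothing parsed, or out of range */
	if (val != 0)
		return -EINVAL;		/* only zero is a valid reset value */

	atomic_store(counter, 0);
	return 0;
}
```

On a kernel carrying this series, the equivalent administrator action would be "echo 0 > /proc/sys/kernel/hung_task_detect_count".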
The implementation uses atomic acquire/release semantics to ensure that
diagnostic metadata published by one CPU is correctly observed by the
monitoring thread on another CPU.
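For readers less familiar with the pattern, the general acquire/release pairing referred to above can be sketched in userspace C11 as follows. This is not the kernel's atomic_long_*() API and does not reproduce the exact orderings chosen in patch 2/2; the struct hang_report type and the publish_hang()/observe_hangs() names are hypothetical, and the sketch assumes a single report is published before the reader runs.

```c
#include <stdatomic.h>

/* Hypothetical diagnostic record, for illustration only. */
struct hang_report {
	unsigned long pid;
	unsigned long blocked_secs;
};

static struct hang_report last_report;
static atomic_ulong detect_count;

/* Writer: fill in the details, then publish with a release increment. */
void publish_hang(unsigned long pid, unsigned long blocked_secs)
{
	last_report.pid = pid;
	last_report.blocked_secs = blocked_secs;
	/* Release: the stores above are ordered before the new count. */
	atomic_fetch_add_explicit(&detect_count, 1, memory_order_release);
}

/* Reader: an acquire load that sees the count also sees the details. */
unsigned long observe_hangs(struct hang_report *out)
{
	unsigned long n = atomic_load_explicit(&detect_count,
					       memory_order_acquire);

	if (n)
		*out = last_report;
	return n;
}
```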
Please let me know your thoughts.
Changes since v3 [1]:
- Use atomic operations to ensure cross-CPU visibility and prevent an integer underflow
- Use acquire/release semantics for memory ordering (Petr Mladek)
- Move quoted string to a single line (Petr Mladek)
- Remove variables coredump_msg and disable_msg to simplify code (Petr Mladek)
- Add trailing "\n" to all strings to ensure immediate console flushing (Petr Mladek)
- Improve the hung task counter documentation (Joel Granados)
- Reject non-zero writes with -EINVAL (Joel Granados)
- Translate to the new sysctl API (Petr Mladek)
Changes since v2 [2]:
- Avoided a needless double update to hung_task_detect_count (Lance Yang)
- Restored previous use of pr_err() for each message (Greg KH)
- Provided a complete descriptive comment for the helper
Changes since v1 [3]:
- Removed write-only sysfs attribute (Lance Yang)
- Modified procfs hung_task_detect_count instead (Lance Yang)
- Introduced a custom proc_handler
- Updated documentation (Lance Yang)
- Added 'static inline' as a hint to eliminate any function call overhead
- Removed clutter through encapsulation
[1]: https://lore.kernel.org/all/20251216030036.1822217-1-atomlin@atomlin.com/
[2]: https://lore.kernel.org/lkml/20251211033004.1628875-1-atomlin@atomlin.com/
[3]: https://lore.kernel.org/lkml/20251209041218.1583600-1-atomlin@atomlin.com/
Aaron Tomlin (2):
hung_task: Introduce helper for hung task warning
hung_task: Enable runtime reset of hung_task_detect_count
Documentation/admin-guide/sysctl/kernel.rst | 3 +-
kernel/hung_task.c | 109 ++++++++++++++++----
2 files changed, 90 insertions(+), 22 deletions(-)
--
2.51.0
* [v4 PATCH 1/2] hung_task: Introduce helper for hung task warning
2025-12-22 1:42 [v4 PATCH 0/2] hung_task: Provide runtime reset interface for hung task detector Aaron Tomlin
@ 2025-12-22 1:42 ` Aaron Tomlin
2025-12-22 1:42 ` [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count Aaron Tomlin
1 sibling, 0 replies; 5+ messages in thread
From: Aaron Tomlin @ 2025-12-22 1:42 UTC
To: akpm, lance.yang, mhiramat, gregkh, pmladek, joel.granados
Cc: sean, linux-kernel
Consolidate the multi-line console output block for reporting a hung
task into a new helper function, hung_task_diagnostics(). This improves
readability in the main check_hung_task() loop and makes the diagnostic
output structure easier to maintain and update in the future.
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
kernel/hung_task.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d2254c91450b..00c3296fd692 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -223,6 +223,28 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
}
#endif
+/**
+ * hung_task_diagnostics - Print structured diagnostic info for a hung task.
+ * @t: Pointer to the detected hung task.
+ *
+ * This function consolidates the printing of core diagnostic information
+ * for a task found to be blocked.
+ */
+static inline void hung_task_diagnostics(struct task_struct *t)
+{
+ unsigned long blocked_secs = (jiffies - t->last_switch_time) / HZ;
+
+ pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
+ t->comm, t->pid, blocked_secs);
+ pr_err(" %s %s %.*s\n",
+ print_tainted(), init_utsname()->release,
+ (int)strcspn(init_utsname()->version, " "),
+ init_utsname()->version);
+ if (t->flags & PF_POSTCOREDUMP)
+ pr_err(" Blocked by coredump.\n");
+ pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\" disables this message.\n");
+}
+
static void check_hung_task(struct task_struct *t, unsigned long timeout,
unsigned long prev_detect_count)
{
@@ -252,16 +274,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout,
if (sysctl_hung_task_warnings || hung_task_call_panic) {
if (sysctl_hung_task_warnings > 0)
sysctl_hung_task_warnings--;
- pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
- t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
- pr_err(" %s %s %.*s\n",
- print_tainted(), init_utsname()->release,
- (int)strcspn(init_utsname()->version, " "),
- init_utsname()->version);
- if (t->flags & PF_POSTCOREDUMP)
- pr_err(" Blocked by coredump.\n");
- pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
- " disables this message.\n");
+ hung_task_diagnostics(t);
sched_show_task(t);
debug_show_blocker(t, timeout);
--
2.51.0
* [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count
2025-12-22 1:42 [v4 PATCH 0/2] hung_task: Provide runtime reset interface for hung task detector Aaron Tomlin
2025-12-22 1:42 ` [v4 PATCH 1/2] hung_task: Introduce helper for hung task warning Aaron Tomlin
@ 2025-12-22 1:42 ` Aaron Tomlin
2025-12-22 2:21 ` Lance Yang
1 sibling, 1 reply; 5+ messages in thread
From: Aaron Tomlin @ 2025-12-22 1:42 UTC
To: akpm, lance.yang, mhiramat, gregkh, pmladek, joel.granados
Cc: sean, linux-kernel
Introduce support for writing to /proc/sys/kernel/hung_task_detect_count.
Writing a value of zero to this file atomically resets the counter of
detected hung tasks. This grants system administrators the ability to
clear the cumulative diagnostic history after resolving an incident,
simplifying monitoring without requiring a system restart.
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
Documentation/admin-guide/sysctl/kernel.rst | 3 +-
kernel/hung_task.c | 76 ++++++++++++++++++---
2 files changed, 67 insertions(+), 12 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 239da22c4e28..68da4235225a 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -418,7 +418,8 @@ hung_task_detect_count
======================
Indicates the total number of tasks that have been detected as hung since
-the system boot.
+the system boot or since the counter was reset. The counter is zeroed when
+a value of 0 is written.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 00c3296fd692..70b3db047f5d 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -17,6 +17,7 @@
#include <linux/export.h>
#include <linux/panic_notifier.h>
#include <linux/sysctl.h>
+#include <linux/atomic.h>
#include <linux/suspend.h>
#include <linux/utsname.h>
#include <linux/sched/signal.h>
@@ -36,7 +37,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
/*
* Total number of tasks detected as hung since boot:
*/
-static unsigned long __read_mostly sysctl_hung_task_detect_count;
+static atomic_long_t sysctl_hung_task_detect_count = ATOMIC_LONG_INIT(0);
/*
* Limit number of tasks checked in a batch.
@@ -246,20 +247,26 @@ static inline void hung_task_diagnostics(struct task_struct *t)
}
static void check_hung_task(struct task_struct *t, unsigned long timeout,
- unsigned long prev_detect_count)
+ unsigned long prev_detect_count)
{
- unsigned long total_hung_task;
+ unsigned long total_hung_task, current_detect;
if (!task_is_hung(t, timeout))
return;
/*
* This counter tracks the total number of tasks detected as hung
- * since boot.
+ * since boot. If a reset occurred during the scan, we treat the
+ * current count as the new delta to avoid an underflow error.
+ * Ensure hang details are globally visible before the counter
+ * update.
*/
- sysctl_hung_task_detect_count++;
+ current_detect = atomic_long_inc_return_acquire(&sysctl_hung_task_detect_count);
+ if (current_detect >= prev_detect_count)
+ total_hung_task = current_detect - prev_detect_count;
+ else
+ total_hung_task = current_detect;
- total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
trace_sched_process_hang(t);
if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
@@ -318,7 +325,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
int max_count = sysctl_hung_task_check_count;
unsigned long last_break = jiffies;
struct task_struct *g, *t;
- unsigned long prev_detect_count = sysctl_hung_task_detect_count;
+ /* Acquire prevents reordering task checks before this point. */
+ unsigned long prev_detect_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count);
int need_warning = sysctl_hung_task_warnings;
unsigned long si_mask = hung_task_si_mask;
@@ -346,7 +354,9 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
unlock:
rcu_read_unlock();
- if (!(sysctl_hung_task_detect_count - prev_detect_count))
+ /* Ensures we see all hang details recorded during the scan. */
+ if (!(atomic_long_read_acquire(&sysctl_hung_task_detect_count) -
+ prev_detect_count))
return;
if (need_warning || hung_task_call_panic) {
@@ -371,6 +381,51 @@ static long hung_timeout_jiffies(unsigned long last_checked,
}
#ifdef CONFIG_SYSCTL
+
+/**
+ * proc_dohung_task_detect_count - proc handler for hung_task_detect_count
+ * @table: Pointer to the struct ctl_table definition for this proc entry
+ * @dir: Flag indicating the operation
+ * @buffer: User space buffer for data transfer
+ * @lenp: Pointer to the length of the data being transferred
+ * @ppos: Pointer to the current file offset
+ *
+ * This handler is used for reading the current hung task detection count
+ * and for resetting it to zero when a write operation is performed using a
+ * zero value only. Returns 0 on success or a negative error code on
+ * failure.
+ */
+static int proc_dohung_task_detect_count(const struct ctl_table *table, int dir,
+ void *buffer, size_t *lenp, loff_t *ppos)
+{
+ unsigned long detect_count;
+ struct ctl_table proxy_table;
+ int err;
+
+ proxy_table = *table;
+ proxy_table.data = &detect_count;
+
+ if (SYSCTL_KERN_TO_USER(dir)) {
+ detect_count = atomic_long_read(&sysctl_hung_task_detect_count);
+
+ return proc_doulongvec_minmax(&proxy_table, dir, buffer, lenp, ppos);
+ }
+
+ err = proc_doulongvec_minmax(&proxy_table, dir, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+
+ if (SYSCTL_USER_TO_KERN(dir)) {
+ /* The only valid value for clearing is zero. */
+ if (detect_count)
+ return -EINVAL;
+ atomic_long_set(&sysctl_hung_task_detect_count, 0);
+ }
+
+ *ppos += *lenp;
+ return err;
+}
+
/*
* Process updating of timeout sysctl
*/
@@ -451,10 +506,9 @@ static const struct ctl_table hung_task_sysctls[] = {
},
{
.procname = "hung_task_detect_count",
- .data = &sysctl_hung_task_detect_count,
.maxlen = sizeof(unsigned long),
- .mode = 0444,
- .proc_handler = proc_doulongvec_minmax,
+ .mode = 0644,
+ .proc_handler = proc_dohung_task_detect_count,
},
{
.procname = "hung_task_sys_info",
--
2.51.0
* Re: [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count
2025-12-22 1:42 ` [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count Aaron Tomlin
@ 2025-12-22 2:21 ` Lance Yang
2025-12-23 1:27 ` Aaron Tomlin
0 siblings, 1 reply; 5+ messages in thread
From: Lance Yang @ 2025-12-22 2:21 UTC
To: Aaron Tomlin
Cc: sean, linux-kernel, pmladek, joel.granados, gregkh, akpm,
mhiramat
On 2025/12/22 09:42, Aaron Tomlin wrote:
> Introduce support for writing to /proc/sys/kernel/hung_task_detect_count.
>
> Writing a value of zero to this file atomically resets the counter of
> detected hung tasks. This grants system administrators the ability to
> clear the cumulative diagnostic history after resolving an incident,
> simplifying monitoring without requiring a system restart.
>
> Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
> ---
> Documentation/admin-guide/sysctl/kernel.rst | 3 +-
> kernel/hung_task.c | 76 ++++++++++++++++++---
> 2 files changed, 67 insertions(+), 12 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 239da22c4e28..68da4235225a 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -418,7 +418,8 @@ hung_task_detect_count
> ======================
>
> Indicates the total number of tasks that have been detected as hung since
> -the system boot.
> +the system boot or since the counter was reset. The counter is zeroed when
> +a value of 0 is written.
>
> This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 00c3296fd692..70b3db047f5d 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -17,6 +17,7 @@
> #include <linux/export.h>
> #include <linux/panic_notifier.h>
> #include <linux/sysctl.h>
> +#include <linux/atomic.h>
> #include <linux/suspend.h>
> #include <linux/utsname.h>
> #include <linux/sched/signal.h>
> @@ -36,7 +37,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
> /*
> * Total number of tasks detected as hung since boot:
> */
> -static unsigned long __read_mostly sysctl_hung_task_detect_count;
> +static atomic_long_t sysctl_hung_task_detect_count = ATOMIC_LONG_INIT(0);
>
> /*
> * Limit number of tasks checked in a batch.
> @@ -246,20 +247,26 @@ static inline void hung_task_diagnostics(struct task_struct *t)
> }
>
> static void check_hung_task(struct task_struct *t, unsigned long timeout,
> - unsigned long prev_detect_count)
> + unsigned long prev_detect_count)
> {
> - unsigned long total_hung_task;
> + unsigned long total_hung_task, current_detect;
>
> if (!task_is_hung(t, timeout))
> return;
>
> /*
> * This counter tracks the total number of tasks detected as hung
> - * since boot.
> + * since boot. If a reset occurred during the scan, we treat the
> + * current count as the new delta to avoid an underflow error.
> + * Ensure hang details are globally visible before the counter
> + * update.
> */
> - sysctl_hung_task_detect_count++;
> + current_detect = atomic_long_inc_return_acquire(&sysctl_hung_task_detect_count);
> + if (current_detect >= prev_detect_count)
> + total_hung_task = current_detect - prev_detect_count;
> + else
> + total_hung_task = current_detect;
>
> - total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
> trace_sched_process_hang(t);
>
> if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
> @@ -318,7 +325,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> int max_count = sysctl_hung_task_check_count;
> unsigned long last_break = jiffies;
> struct task_struct *g, *t;
> - unsigned long prev_detect_count = sysctl_hung_task_detect_count;
> + /* Acquire prevents reordering task checks before this point. */
> + unsigned long prev_detect_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count);
> int need_warning = sysctl_hung_task_warnings;
> unsigned long si_mask = hung_task_si_mask;
>
> @@ -346,7 +354,9 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> unlock:
> rcu_read_unlock();
>
> - if (!(sysctl_hung_task_detect_count - prev_detect_count))
> + /* Ensures we see all hang details recorded during the scan. */
> + if (!(atomic_long_read_acquire(&sysctl_hung_task_detect_count) -
> + prev_detect_count))
> return;
Hmm, I think we're missing the same underflow check here ...
If reset happens mid-scan, it can also underflow and cause
false positives in the diagnostics :)
We should apply the same "if (current < prev) use current" logic
here, as Petr mentioned before.
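A minimal userspace sketch of that clamp, assuming a hypothetical helper named hung_delta() (the name does not appear in the patch): if the counter was reset mid-scan, the current value is smaller than the snapshot, and a plain unsigned subtraction would wrap to a huge number.

```c
#include <limits.h>

/*
 * Hypothetical helper: number of hung tasks seen since the snapshot
 * 'prev'. If the counter was reset in the meantime (cur < prev),
 * treat 'cur' itself as the delta instead of letting the unsigned
 * subtraction wrap around.
 */
unsigned long hung_delta(unsigned long cur, unsigned long prev)
{
	return cur >= prev ? cur - prev : cur;
}
```

For example, with prev = 3 and a reset followed by one new detection (cur = 1), the unclamped expression 1UL - 3UL evaluates to ULONG_MAX - 1, whereas hung_delta(1, 3) yields 1.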
[...]
Cheers,
Lance
* Re: [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count
2025-12-22 2:21 ` Lance Yang
@ 2025-12-23 1:27 ` Aaron Tomlin
0 siblings, 0 replies; 5+ messages in thread
From: Aaron Tomlin @ 2025-12-23 1:27 UTC
To: Lance Yang
Cc: sean, linux-kernel, pmladek, joel.granados, gregkh, akpm,
mhiramat
On Mon, Dec 22, 2025 at 10:21:31AM +0800, Lance Yang wrote:
> Hmm, I think we're missing the same underflow check here ...
>
> If reset happens mid-scan, it can also underflow and cause
> false positives in the diagnostics :)
>
> We should apply the same "if (current < prev) use current" logic
> here, as Petr mentioned before.
Hi Lance,
You are quite right; that was an oversight on my part. The underflow check
was indeed omitted in error.
Kind regards,
--
Aaron Tomlin