From: Lance Yang <lance.yang@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
pmladek@suse.com, atomlin@atomlin.com
Cc: mm-commits@vger.kernel.org, mhiramat@kernel.org,
gregkh@linuxfoundation.org
Subject: Re: + hung_task-enable-runtime-reset-of-hung_task_detect_count.patch added to mm-nonmm-unstable branch
Date: Thu, 18 Dec 2025 10:30:10 +0800
Message-ID: <20d511d2-1eed-4896-bd7d-c1e080319f61@linux.dev>
In-Reply-To: <20251216033809.8B31AC4CEF5@smtp.kernel.org>

Hi Andrew,

Could you please drop this patch from mm-nonmm-unstable?

As Petr pointed out[1], resetting sysctl_hung_task_detect_count has a
serious race condition ...

check_hung_uninterruptible_tasks() saves prev_detect_count and then
iterates through all processes. If userspace resets the counter during
this window, the unsigned subtraction underflows:

    total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
    if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
        hung_task_call_panic = true; // false positive!!!
    }

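To make the underflow concrete, here is a minimal userspace sketch of the
arithmetic above. It is not kernel code: hung_delta(), would_panic() and
run_examples() are hypothetical helpers, and it assumes the counter, the
saved value and the panic threshold are all unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the delta the detector computes. */
unsigned long hung_delta(unsigned long detect_count,
			 unsigned long prev_detect_count)
{
	/* Unsigned subtraction wraps modulo ULONG_MAX + 1. */
	return detect_count - prev_detect_count;
}

/* Hypothetical stand-in for the panic decision quoted above. */
bool would_panic(unsigned long detect_count,
		 unsigned long prev_detect_count,
		 unsigned long panic_threshold)
{
	unsigned long total = hung_delta(detect_count, prev_detect_count);

	return panic_threshold && total >= panic_threshold;
}

/* Worked examples with made-up numbers (threshold of 3). */
void run_examples(void)
{
	/* No reset: 5 detections saved, 6 after the scan, delta is 1. */
	assert(hung_delta(6, 5) == 1);
	assert(!would_panic(6, 5, 3));

	/* Reset to 0 during the scan: 0 - 5 wraps to ULONG_MAX - 4. */
	assert(hung_delta(0, 5) == ULONG_MAX - 4);
	assert(would_panic(0, 5, 3));
}
```

With a reset landing inside the scan window, the wrapped delta exceeds any
non-zero threshold, so the panic condition fires even though no new hung
task was found.
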
The race window is large, since it spans the iteration over all tasks.
Closing it would require adding a lock to the detector code, which is
risky ...

As Petr said, keeping the detector working correctly matters more than
adding a reset feature. I'm not sure this convenience is worth it either.

[1] https://lore.kernel.org/all/aUKresftPnbndSBo@pathway.suse.cz/

Thanks,
Lance

On 2025/12/16 11:38, Andrew Morton wrote:
> The patch titled
> Subject: hung_task: enable runtime reset of hung_task_detect_count
> has been added to the -mm mm-nonmm-unstable branch. Its filename is
> hung_task-enable-runtime-reset-of-hung_task_detect_count.patch
>
> This patch will shortly appear at
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/hung_task-enable-runtime-reset-of-hung_task_detect_count.patch
>
> This patch will later appear in the mm-nonmm-unstable branch at
> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> Before you just go and hit "reply", please:
> a) Consider who else should be cc'ed
> b) Prefer to cc a suitable mailing list as well
> c) Ideally: find the original patch on the mailing list and do a
> reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
>
> The -mm tree is included into linux-next via the mm-everything
> branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> and is updated there every 2-3 working days
>
> ------------------------------------------------------
> From: Aaron Tomlin <atomlin@atomlin.com>
> Subject: hung_task: enable runtime reset of hung_task_detect_count
> Date: Mon, 15 Dec 2025 22:00:36 -0500
>
> Introduce support for writing to /proc/sys/kernel/hung_task_detect_count.
>
> Writing any value to this file atomically resets the counter of detected
> hung tasks to zero. This grants system administrators the ability to
> clear the cumulative diagnostic history after resolving an incident,
> simplifying monitoring without requiring a system restart.
>
> Link: https://lkml.kernel.org/r/20251216030036.1822217-3-atomlin@atomlin.com
> Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Lance Yang <lance.yang@linux.dev>
> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
> Cc: Petr Mladek <pmladek@suse.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> Documentation/admin-guide/sysctl/kernel.rst | 2 -
> kernel/hung_task.c | 29 ++++++++++++++++--
> 2 files changed, 28 insertions(+), 3 deletions(-)
>
> --- a/Documentation/admin-guide/sysctl/kernel.rst~hung_task-enable-runtime-reset-of-hung_task_detect_count
> +++ a/Documentation/admin-guide/sysctl/kernel.rst
> @@ -418,7 +418,7 @@ hung_task_detect_count
> ======================
>
> Indicates the total number of tasks that have been detected as hung since
> -the system boot.
> +the system boot. The counter can be reset to zero when written to.
>
> This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
>
> --- a/kernel/hung_task.c~hung_task-enable-runtime-reset-of-hung_task_detect_count
> +++ a/kernel/hung_task.c
> @@ -375,6 +375,31 @@ static long hung_timeout_jiffies(unsigne
> }
>
> #ifdef CONFIG_SYSCTL
> +
> +/**
> + * proc_dohung_task_detect_count - proc handler for hung_task_detect_count
> + * @table: Pointer to the struct ctl_table definition for this proc entry
> + * @write: Flag indicating the operation
> + * @buffer: User space buffer for data transfer
> + * @lenp: Pointer to the length of the data being transferred
> + * @ppos: Pointer to the current file offset
> + *
> + * This handler is used for reading the current hung task detection count
> + * and for resetting it to zero when a write operation is performed.
> + * Returns 0 on success or a negative error code on failure.
> + */
> +static int proc_dohung_task_detect_count(const struct ctl_table *table, int write,
> +					 void *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	if (!write)
> +		return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
> +
> +	WRITE_ONCE(sysctl_hung_task_detect_count, 0);
> +	*ppos += *lenp;
> +
> +	return 0;
> +}
> +
> /*
> * Process updating of timeout sysctl
> */
> @@ -457,8 +482,8 @@ static const struct ctl_table hung_task_
>  		.procname	= "hung_task_detect_count",
>  		.data		= &sysctl_hung_task_detect_count,
>  		.maxlen		= sizeof(unsigned long),
> -		.mode		= 0444,
> -		.proc_handler	= proc_doulongvec_minmax,
> +		.mode		= 0644,
> +		.proc_handler	= proc_dohung_task_detect_count,
>  	},
>  	{
>  		.procname	= "hung_task_sys_info",
> _
>
> Patches currently in -mm which might be from atomlin@atomlin.com are
>
> hung_task-introduce-helper-for-hung-task-warning.patch
> hung_task-enable-runtime-reset-of-hung_task_detect_count.patch
>