From: Lance Yang <lance.yang@linux.dev>
To: Feng Tang <feng.tang@linux.alibaba.com>
Cc: Petr Mladek <pmladek@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Steven Rostedt <rostedt@goodmis.org>,
Lance Yang <ioworker0@gmail.com>,
linux-kernel@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
paulmck@kernel.org, lirongqing@baidu.com, leonylgao@tencent.com
Subject: Re: [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung
Date: Sun, 16 Nov 2025 21:22:43 +0800
Message-ID: <cf6c6442-e160-4aad-9c12-a49225b73501@linux.dev>
In-Reply-To: <aRmVVE3klGZuX6aV@U-2FWC9VHC-2323.local>
On 2025/11/16 17:11, Feng Tang wrote:
> On Sun, Nov 16, 2025 at 03:58:32PM +0800, Lance Yang wrote:
>>
>>
>> On 2025/11/13 19:10, Feng Tang wrote:
>>> When task-hung happens, developers may need different kinds of system
>>> information (call-stacks, memory info, locks, etc.) to help debugging.
>>>
>>> Add a 'hung_task_sys_info' sysctl knob that takes a human-readable string
>>> like "tasks,mem,timers,locks,ftrace,...", and when task-hung happens, all
>>> requested information will be dumped. (Refer to kernel/sys_info.c for more
>>> details.)
>>>
>>> Meanwhile, the newly introduced sys_info() call is used to unify some
>>> existing info-dumping knobs.
>>>
>>> Suggested-by: Petr Mladek <pmladek@suse.com>
>>> Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
>>> ---
>>> Documentation/admin-guide/sysctl/kernel.rst | 5 ++
>>> kernel/hung_task.c | 62 +++++++++++++--------
>>> 2 files changed, 43 insertions(+), 24 deletions(-)
>>> * Ok, the task did not get scheduled for more than 2 minutes,
>>> * complain:
>>> */
>>> - if (sysctl_hung_task_warnings || hung_task_call_panic) {
>>> + if (sysctl_hung_task_warnings) {
>>
>> It seems like the behavior changes when sysctl_hung_task_warnings is
>> 0 but a panic is about to be triggered ...
>>
>> Looking at the history:
>>
>> 1) Commit ("hung_task: ignore hung_task_warnings when hung_task_panic
>> is enabled")[1] ensured that hung task information is always dumped
>> when a panic is configured, even if the warning counter is exhausted.
>>
>> 2) Later, commit ("hung_task: panic when there are more than N hung
>> tasks at the same time")[2] refined the logic to trigger a panic based
>> on the number of hung tasks found in a single scan.
>>
>> To stay consistent with the established behavior, I think we should
>> continue to dump the information for hung tasks as long as
>> sysctl_hung_task_panic is enabled :)
>>
>> [1] https://lore.kernel.org/all/20240613033159.3446265-1-leonylgao@gmail.com
>> [2] https://lore.kernel.org/all/20251015063615.2632-1-lirongqing@baidu.com
>> [...]
>
> Aha, Petr asked a similar question during his review. Thanks for the catch!
>
> How about the following fixup patch to restore that part of the logic?
>
> Thanks,
> Feng
>
> ---
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 5b3a7785d3a2..d2254c91450b 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -223,8 +223,11 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
> }
> #endif
>
> -static void check_hung_task(struct task_struct *t, unsigned long timeout)
> +static void check_hung_task(struct task_struct *t, unsigned long timeout,
> + unsigned long prev_detect_count)
> {
> + unsigned long total_hung_task;
> +
> if (!task_is_hung(t, timeout))
> return;
>
> @@ -234,13 +237,19 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> */
> sysctl_hung_task_detect_count++;
>
> + total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
> trace_sched_process_hang(t);
>
> + if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
> + console_verbose();
> + hung_task_call_panic = true;
> + }
> +
> /*
> * Ok, the task did not get scheduled for more than 2 minutes,
> * complain:
> */
> - if (sysctl_hung_task_warnings) {
> + if (sysctl_hung_task_warnings || hung_task_call_panic) {
> if (sysctl_hung_task_warnings > 0)
> sysctl_hung_task_warnings--;
> pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> @@ -295,7 +304,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> {
> int max_count = sysctl_hung_task_check_count;
> unsigned long last_break = jiffies;
> - unsigned long total_hung_task;
> struct task_struct *g, *t;
> unsigned long prev_detect_count = sysctl_hung_task_detect_count;
> int need_warning = sysctl_hung_task_warnings;
> @@ -320,20 +328,14 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> last_break = jiffies;
> }
>
> - check_hung_task(t, timeout);
> + check_hung_task(t, timeout, prev_detect_count);
> }
> unlock:
> rcu_read_unlock();
>
> - total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
> - if (!total_hung_task)
> + if (!(sysctl_hung_task_detect_count - prev_detect_count))
> return;
>
> - if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
> - console_verbose();
> - hung_task_call_panic = true;
> - }
> -
> if (need_warning || hung_task_call_panic) {
> si_mask |= SYS_INFO_LOCKS;
Looks good to me now! I assume a v3 is expected; could you
post a new version?
Cheers,
Lance
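
For readers following the thread, the decision logic that the fixup
restores can be modelled in user space. This is only a rough sketch, not
kernel code: struct hung_state, model_check_hung_task() and model_scan()
are hypothetical names made up for illustration, and the fields merely
mirror the kernel variables quoted in the diff above. It assumes every
task passed in has already been judged hung.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; the fields mirror the kernel variables. */
struct hung_state {
	int warnings;               /* models sysctl_hung_task_warnings */
	unsigned long panic_thresh; /* models sysctl_hung_task_panic, 0 = off */
	unsigned long detect_count; /* models sysctl_hung_task_detect_count */
	bool call_panic;            /* models hung_task_call_panic */
};

/*
 * Models the per-task path after the fixup: the panic threshold is checked
 * before the report condition, so a pending panic forces the report even
 * when the warning budget is already 0.
 * Returns true if the per-task report would be printed.
 */
static bool model_check_hung_task(struct hung_state *s,
				  unsigned long prev_detect_count)
{
	unsigned long total_hung_task;

	s->detect_count++;
	total_hung_task = s->detect_count - prev_detect_count;

	if (s->panic_thresh && total_hung_task >= s->panic_thresh)
		s->call_panic = true;

	if (s->warnings || s->call_panic) {
		/* As in the diff, only a positive budget is decremented. */
		if (s->warnings > 0)
			s->warnings--;
		return true;
	}
	return false;
}

/* Models one scan that finds nr_hung hung tasks; returns reports printed. */
static int model_scan(struct hung_state *s, int nr_hung)
{
	unsigned long prev_detect_count = s->detect_count;
	int reported = 0;

	assert(nr_hung >= 0);
	for (int i = 0; i < nr_hung; i++)
		if (model_check_hung_task(s, prev_detect_count))
			reported++;
	return reported;
}
```

In this model, with warnings = 0 and panic_thresh = 2, a scan that finds
three hung tasks reports the second and third: the first is skipped
because the threshold has not been reached yet. With the v2 condition
(warnings only), none of the three would have been reported.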
Thread overview: 18+ messages
2025-11-13 11:10 [PATCH v2 0/4] Enable hung_task and lockup cases to dump system info on demand Feng Tang
2025-11-13 11:10 ` [PATCH v2 1/4] docs: panic: correct some sys_ifo names in sysctl doc Feng Tang
2025-11-13 11:10 ` [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung Feng Tang
2025-11-14 15:36 ` Petr Mladek
2025-11-16 7:16 ` Feng Tang
2025-11-16 7:58 ` Lance Yang
2025-11-16 9:11 ` Feng Tang
2025-11-16 13:22 ` Lance Yang [this message]
2025-11-16 14:13 ` Feng Tang
2025-11-17 17:53 ` Andrew Morton
2025-11-18 2:26 ` Feng Tang
2025-11-18 6:06 ` Lance Yang
2025-11-18 15:20 ` Petr Mladek
2025-11-18 17:57 ` Lance Yang
2025-11-19 12:31 ` Petr Mladek
2025-11-13 11:10 ` [PATCH v2 3/4] watchdog: add sys_info sysctls to dump sys info on system lockup Feng Tang
2025-11-14 15:44 ` Petr Mladek
2025-11-13 11:10 ` [PATCH v2 4/4] sys_info: add a default kernel sys_info mask Feng Tang