From: Petr Mladek <pmladek@suse.com>
To: Lance Yang <lance.yang@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Feng Tang <feng.tang@linux.alibaba.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Lance Yang <ioworker0@gmail.com>,
	linux-kernel@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
	paulmck@kernel.org, lirongqing@baidu.com, leonylgao@tencent.com
Subject: Re: [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung
Date: Wed, 19 Nov 2025 13:31:06 +0100
Message-ID: <aR24iloIoSjb6X1t@pathway.suse.cz>
In-Reply-To: <e0d12460-3ed8-43d4-8c0b-a7aa544d946e@linux.dev>

On Wed 2025-11-19 01:57:36, Lance Yang wrote:
> On 2025/11/18 23:20, Petr Mladek wrote:
> > Well, the behavior is still not ideal. It would be better if
> > we printed backtraces from _all_ "hung" tasks before panicking.
> > But it prints the backtraces only when sysctl_hung_task_panic
> > limit is reached.
> > 
> > I mean, for example, let's have:
> > 
> >    + sysctl_hung_task_warnings = 2;
> >    + sysctl_hung_task_panic = 5;
> >    + and detect 6 hung tasks.
> > 
> > The code will report 1st and 2nd hung tasks. It will skip 3rd and 4th
> > because sysctl_hung_task_warnings reached 0. It will report 5th and
> > 6th tasks because (total_hung_task >= 5).
> > 
> > It is better than nothing. But it might be confusing.
> 
> Right, I can see how it might be confusing.
> 
> IMHO, sysctl_hung_task_warnings is a user-configured limit on verbosity.
> It makes sense that reports are suppressed after the limit is exhausted,
> except when the sysctl_hung_task_panic threshold is reached ;)
> 
> > I am not sure how to fix it. A minimalist solution would be to print
> > a warning. Something like:
> > 
> > 	if (sysctl_hung_task_panic > 1 &&
> > 	    (total_hung_task == sysctl_hung_task_panic) &&
> > 	    !sysctl_hung_task_warnings) {
> > 		pr_err("INFO: %d blocked tasks might have been skipped because the hung_task_warnings limit was reached\n",
> > 			sysctl_hung_task_panic - 1);
> > 	}
> > 
> > Or we could print the "total_hung_task" counter somewhere, for
> > example,
> > 
> > 		pr_err("INFO[%lu]: task %s:%d blocked for more than %ld seconds.\n",
> > 			total_hung_task, ...
> > 
> > Or we could restart the for_each_process_thread() cycle and make sure
> > that all hung tasks will get reported.
> > 
> > Or we could ignore it until anyone complains.
> 
> It looks like we already inform the user when that happens. When
> sysctl_hung_task_warnings is finally decremented to zero, the code prints:
> 
> ```
> if (!sysctl_hung_task_warnings)
> 	pr_info("Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings\n");
> ```
> 
> Given that this explicit warning is already in place, perhaps the current
> behavior is sufficient and clear enough?

The warning might get lost, or it might be printed a long time before
the critical stall, so people might miss it.

But you are right. There is a warning. And my worries are rather
theoretical. Let's keep the code simple until someone complains.

Best Regards,
Petr
