From: "Li,Rongqing" <lirongqing@baidu.com>
To: "paulmck@kernel.org" <paulmck@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"corbet@lwn.net" <corbet@lwn.net>,
"lance.yang@linux.dev" <lance.yang@linux.dev>,
"mhiramat@kernel.org" <mhiramat@kernel.org>,
"pawan.kumar.gupta@linux.intel.com"
<pawan.kumar.gupta@linux.intel.com>,
"mingo@kernel.org" <mingo@kernel.org>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"kees@kernel.org" <kees@kernel.org>,
"arnd@arndb.de" <arnd@arndb.de>,
"feng.tang@linux.alibaba.com" <feng.tang@linux.alibaba.com>,
"pauld@redhat.com" <pauld@redhat.com>,
"joel.granados@kernel.org" <joel.granados@kernel.org>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [External Mail] Re: [????] Re: [PATCH][RFC] hung_task: Support to panic when the maximum number of hung task warnings is reached
Date: Tue, 23 Sep 2025 06:16:03 +0000
Message-ID: <d334c33bc11243cd9ab31ebe8e4310ca@baidu.com>
In-Reply-To: <36db2f10-ebbe-4ecd-b27f-e02d9e1569c2@paulmck-laptop>
> -----Original Message-----
> From: Paul E. McKenney <paulmck@kernel.org>
> Sent: September 23, 2025 14:04
> To: Li,Rongqing <lirongqing@baidu.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>; corbet@lwn.net;
> lance.yang@linux.dev; mhiramat@kernel.org;
> pawan.kumar.gupta@linux.intel.com; mingo@kernel.org;
> dave.hansen@linux.intel.com; rostedt@goodmis.org; kees@kernel.org;
> arnd@arndb.de; feng.tang@linux.alibaba.com; pauld@redhat.com;
> joel.granados@kernel.org; linux-doc@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: [External Mail] Re: [????] Re: [PATCH][RFC] hung_task: Support to panic
> when the maximum number of hung task warnings is reached
>
> On Tue, Sep 23, 2025 at 04:00:03AM +0000, Li,Rongqing wrote:
> >
> >
> > > -----Original Message-----
> > > From: Andrew Morton <akpm@linux-foundation.org>
> > > Sent: September 23, 2025 11:46
> > > To: Li,Rongqing <lirongqing@baidu.com>
> > > Cc: corbet@lwn.net; lance.yang@linux.dev; mhiramat@kernel.org;
> > > paulmck@kernel.org; pawan.kumar.gupta@linux.intel.com;
> > > mingo@kernel.org; dave.hansen@linux.intel.com; rostedt@goodmis.org;
> > > kees@kernel.org; arnd@arndb.de; feng.tang@linux.alibaba.com;
> > > pauld@redhat.com; joel.granados@kernel.org;
> > > linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: [????] Re: [PATCH][RFC] hung_task: Support to panic when
> > > the maximum number of hung task warnings is reached
> > >
> > > On Tue, 23 Sep 2025 11:37:40 +0800 lirongqing <lirongqing@baidu.com> wrote:
> > >
> > > > Currently the hung task detector can either panic immediately or
> > > > continue operation when hung tasks are detected. However, there
> > > > are scenarios where we want a more balanced approach:
> > > >
> > > > - We don't want the system to panic immediately when a few hung tasks
> > > > are detected, as the system may be able to recover
> > > > - And we also don't want the system to stall indefinitely with multiple
> > > > hung tasks
> > > >
> > > > This commit introduces a new mode (value 2) for the hung task panic
> > > > behavior. When set to 2, the system will panic only after the maximum
> > > > number of hung task warnings (hung_task_warnings) has been reached.
> > > >
> > > > This provides a middle ground between immediate panic and
> > > > potentially infinite stall, allowing for automated vmcore
> > > > generation after a reasonable
> > >
> > > I assume the same argument applies to the NMI watchdog, to the
> > > softlockup detector and to the RCU stall detector?
> >
> > True, especially the RCU stall detector.
>
> There are the panic_on_rcu_stall and max_rcu_stall_to_panic sysctls, which
> together allow you to panic after (say) three RCU CPU stall warnings.
> Do those do what you need?
Yes, this is what I need. RCU already implements it.
Thanks
-Li
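
For reference, the two sysctls Paul mentions could be set together along
these lines. This is only a minimal sketch: it assumes both knobs are
exposed under /proc/sys/kernel on the running kernel, and the helper name
configure_rcu_stall_panic and its sysctl_dir parameter are hypothetical,
introduced here for illustration.

```python
from pathlib import Path


def configure_rcu_stall_panic(max_stalls: int,
                              sysctl_dir: str = "/proc/sys/kernel") -> None:
    """Ask the kernel to panic after `max_stalls` RCU CPU stall warnings.

    Hypothetical helper: `sysctl_dir` is a parameter only so the function
    can be pointed at a scratch directory instead of the live sysctl tree.
    """
    if max_stalls < 1:
        raise ValueError("max_stalls must be at least 1")
    base = Path(sysctl_dir)
    # Write the threshold first so it is in place before panicking is enabled.
    (base / "max_rcu_stall_to_panic").write_text(f"{max_stalls}\n")
    # Enable panic-on-stall; the threshold above is only consulted when this is 1.
    (base / "panic_on_rcu_stall").write_text("1\n")
```

Calling configure_rcu_stall_panic(3) as root would correspond to Paul's
"panic after (say) three RCU CPU stall warnings" example.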
>
> Thanx, Paul
>
> > > A general framework to handle all of these might be better. But why
> > > do it in kernel at all? What about a userspace detector which
> > > parses kernel logs (or new procfs counters) and makes such decisions?
> >
> >
> > I think an in-kernel implementation that leverages existing kernel
> > mechanisms is very simple and reliable.
> >
> > Thanks
> >
> > -Li
> >
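
The userspace alternative Andrew raises above (parse the kernel log and
decide there) might look roughly like the sketch below. It is not taken
from the patch; it assumes hung task reports appear in the log as
"INFO: task <comm>:<pid> blocked for more than <N> seconds", and the names
HUNG_TASK_RE and should_escalate are hypothetical.

```python
import re
from typing import Iterable

# Assumed format of the hung task detector's report line, e.g.
# "INFO: task kworker/0:1:123 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(r"INFO: task .+:\d+ blocked for more than \d+ seconds")


def should_escalate(log_lines: Iterable[str], max_warnings: int) -> bool:
    """Return True once `max_warnings` hung task reports have been seen."""
    seen = 0
    for line in log_lines:
        if HUNG_TASK_RE.search(line):
            seen += 1
            if seen >= max_warnings:
                return True
    return False
```

A caller would feed it lines from the kernel log and, once it returns
True, take whatever action the site prefers, for example forcing a crash
dump. Note that such a detector depends on userspace still being able to
run, which is the reliability concern raised in the reply quoted above.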
Thread overview: 11+ messages
2025-09-23 3:37 [PATCH][RFC] hung_task: Support to panic when the maximum number of hung task warnings is reached lirongqing
2025-09-23 3:45 ` Andrew Morton
2025-09-23 3:59 ` Lance Yang
2025-09-23 5:22 ` [External Mail] " Li,Rongqing
2025-09-23 5:55 ` Lance Yang
2025-09-23 6:19 ` Li,Rongqing
2025-09-23 4:00 ` [????] " Li,Rongqing
2025-09-23 6:03 ` Paul E. McKenney
2025-09-23 6:16 ` Li,Rongqing [this message]
2025-09-23 7:01 ` [External Mail] " Li,Rongqing
2025-09-23 4:35 ` Randy Dunlap