From: Lance Yang <lance.yang@linux.dev>
To: Ye Liu <ye.liu@linux.dev>
Cc: Ye Liu <liuye@kylinos.cn>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Zi Li <zi.li@linux.dev>
Subject: Re: [PATCH] hung_task: add warning counter to blocked task report
Date: Mon, 21 Jul 2025 14:19:10 +0800
Message-ID: <83ac6ac0-a7c5-4475-8800-0beefa117164@linux.dev>
In-Reply-To: <582cf973-1290-493c-b821-f23480e75014@linux.dev>



On 2025/7/21 13:45, Ye Liu wrote:
> 
> 
> On 2025/7/21 12:56, Lance Yang wrote:
>> Hi Ye,
>>
>> Thanks for your patch!
>>
>> On 2025/7/21 11:17, Ye Liu wrote:
>>> From: Ye Liu <liuye@kylinos.cn>
>>>
>>> Add a warning counter to each hung task message to make it easier
>>> to analyze and locate issues in the logs.
>>>
>>> Signed-off-by: Ye Liu <liuye@kylinos.cn>
>>> ---
>>>    kernel/hung_task.c | 6 ++++--
>>>    1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>>> index 8708a1205f82..9e5f86148d47 100644
>>> --- a/kernel/hung_task.c
>>> +++ b/kernel/hung_task.c
>>> @@ -58,6 +58,7 @@ EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);
>>>    static unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
>>>      static int __read_mostly sysctl_hung_task_warnings = 10;
>>> +static int hung_task_warning_count;
>>>      static int __read_mostly did_panic;
>>>    static bool hung_task_show_lock;
>>> @@ -232,8 +233,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>>>        if (sysctl_hung_task_warnings || hung_task_call_panic) {
>>>            if (sysctl_hung_task_warnings > 0)
>>>                sysctl_hung_task_warnings--;
>>> -        pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
>>> -               t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
>>> +        pr_err("INFO: task %s:%d blocked for more than %ld seconds. [Warning #%d]\n",
>>> +               t->comm, t->pid, (jiffies - t->last_switch_time) / HZ,
>>> +               ++hung_task_warning_count);
>>>            pr_err("      %s %s %.*s\n",
>>>                print_tainted(), init_utsname()->release,
>>>                (int)strcspn(init_utsname()->version, " "),
>>
>> A quick thought on this: we already have the hung_task_detect_count
>> counter, which tracks the total number of hung tasks detected since
>> boot ;)
>>
>> While this patch adds a counter inline with the warning message, the
>> existing counter already provides a way to know how many hung task
>> events have occurred.
>>
>> Could you elaborate on the specific benefit of printing this count
>> directly in the log, compared to checking the global hung_task_detect_count?
>>
>> Also, if the goal is to give each warning a unique sequence number,
>> I think the dmesg timestamp prefix serves the same purpose ;)
>>
>> Thanks,
>> Lance
> 
> Sorry for not noticing sysctl_hung_task_detect_count.
> I just thought adding it directly to the warning message would make the
> log easier to read and more intuitive than relying on timestamps.
> 
> If accepted, I will send V2, like this:

Let's step back and consider the practical use case. When we are
troubleshooting hung task issues in a production log, what information
do we actually use?

Typically, we look for:
1) The timestamp, to correlate with other system events
2) The task name and PID (%s:%d)
3) The kernel stack trace that follows, to see where it's stuck

So, my question is: in what specific troubleshooting scenario would
knowing the sequence number, like [#N], provide actionable information
that the above data points do not?

Unless there's a compelling use case I'm missing, I'd prefer to keep
the code as it is ;)

Thanks,
Lance

> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 8708a1205f82..231afdb68bb2 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -232,8 +232,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>          if (sysctl_hung_task_warnings || hung_task_call_panic) {
>                  if (sysctl_hung_task_warnings > 0)
>                          sysctl_hung_task_warnings--;
> -               pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> -                      t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
> +               pr_err("INFO: task %s:%d blocked for more than %ld seconds. [#%ld]\n",
> +                      t->comm, t->pid, (jiffies - t->last_switch_time) / HZ,
> +                      sysctl_hung_task_detect_count);
>                  pr_err("      %s %s %.*s\n",
>                          print_tainted(), init_utsname()->release,
>                          (int)strcspn(init_utsname()->version, " "),
> 
> 
> 
> 


Thread overview: 6+ messages
2025-07-21  3:17 [PATCH] hung_task: add warning counter to blocked task report Ye Liu
2025-07-21  4:56 ` Lance Yang
2025-07-21  5:45   ` Ye Liu
2025-07-21  6:19     ` Lance Yang [this message]
2025-07-23  7:31       ` Ye Liu
2025-07-23 10:46         ` Lance Yang