public inbox for linux-kernel@vger.kernel.org
From: Lance Yang <lance.yang@linux.dev>
To: Aaron Tomlin <atomlin@atomlin.com>
Cc: neelx@suse.com, sean@ashe.io, pmladek@suse.com,
	mhiramat@kernel.org, akpm@linux-foundation.org,
	joel.granados@kernel.org, mproche@gmail.com, chjohnst@gmail.com,
	nick.lange@gmail.com, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] hung_task: Skip scan on idle systems
Date: Mon, 26 Jan 2026 13:23:01 +0800
Message-ID: <4db98cdc-132b-4034-a6f1-25a3df6d0d01@linux.dev>
In-Reply-To: <20260126034539.3407903-1-atomlin@atomlin.com>

Hi Aaron,

Keep one patch or series under review at a time, especially in the
same subsystem ...

Maintainers/Reviewers have limited bandwidth and can focus better
on one thing at a time.

Please be patient and wait for it to be merged or rejected before
sending the next one.

On 2026/1/26 11:45, Aaron Tomlin wrote:
> At present, the hung task detector behaves in an unoptimised manner: it
> wakes up periodically (every check_interval_secs, defaulting to 120
> seconds) and performs an O(N) scan of the entire process list,
> regardless of the system's actual state. On idle embedded devices,
> virtual machines, or large servers with no activity, this behaviour
> unnecessarily consumes CPU cycles and memory bandwidth, hindering
> power-saving states.
> 
> To rectify this, this patch introduces an adaptive "green" polling
> mechanism. The detector will now verify whether the system is
> effectively idle before committing to a full process scan.
> 
> To implement this, we utilise the standard get_avenrun() API to verify
> the global system load. Tasks in the TASK_UNINTERRUPTIBLE (D) state
> explicitly contribute to the system load average; consequently, if the
> 1-minute load average is zero, we can confidently infer that no tasks
> are currently hung, allowing us to bypass the expensive process scan.
> 
> Crucially, we invoke get_avenrun(load, 0, 0) with both the offset and
> shift parameters set to zero. This configuration is deliberate and
> necessary for safety:
> 
>          1. Zero Offset: Prevents the application of any artificial
>             rounding bias usually intended for human-readable display.
> 
>          2. Zero Shift: Retrieves the raw fixed-point value (where 1.0
>             load = 2048) rather than shifting it down to an integer.
> 
> This ensures maximum sensitivity: even a microscopic fractional load
> (e.g., a single task entering D state momentarily) will register as a
> non-zero raw value. This guarantees that we never encounter a false
> negative where a valid hung task is ignored due to integer truncation or
> rounding errors.
> 
> This heuristic significantly minimises the detector's footprint on
> healthy systems whilst maintaining robust reliability for genuine hangs.
> 
> Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
> ---
>   kernel/hung_task.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index d2254c91450b..7b9f5c1bd35e 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -17,6 +17,7 @@
>   #include <linux/export.h>
>   #include <linux/panic_notifier.h>
>   #include <linux/sysctl.h>
> +#include <linux/sched/loadavg.h>
>   #include <linux/suspend.h>
>   #include <linux/utsname.h>
>   #include <linux/sched/signal.h>
> @@ -503,6 +504,7 @@ static int watchdog(void *dummy)
>   	for ( ; ; ) {
>   		unsigned long timeout = sysctl_hung_task_timeout_secs;
>   		unsigned long interval = sysctl_hung_task_check_interval_secs;
> +		unsigned long load[3];
>   		long t;
>   
>   		if (interval == 0)
> @@ -511,8 +513,12 @@ static int watchdog(void *dummy)
>   		t = hung_timeout_jiffies(hung_last_checked, interval);
>   		if (t <= 0) {
>   			if (!atomic_xchg(&reset_hung_task, 0) &&
> -			    !hung_detector_suspended)
> -				check_hung_uninterruptible_tasks(timeout);
> +			    !hung_detector_suspended) {
> +				/* Check 1-min load to detect idle system */
> +				get_avenrun(load, 0, 0);
> +				if (load[0] > 0)
> +					check_hung_uninterruptible_tasks(timeout);

The optimization is not worth the trouble.

I don't think the assumption that "load[0] == 0 means no hung tasks" is
100% correct.

If it ever fails, we would miss actual hung tasks - a false negative,
which is worse than the "wasted scan" you're trying to avoid.
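
For anyone who wants to check the fixed-point reasoning, here is a rough
userspace model of how the raw 1-minute value moves from one sample to the
next. The constants (FSHIFT = 11, EXP_1 = 1884) and the round-up rule are
written from my reading of include/linux/sched/loadavg.h and are
assumptions, not a copy of the kernel code; the model_* names are
hypothetical. It only models the arithmetic. It says nothing about when
samples are taken or which tasks get counted, so it neither proves nor
disproves the "load[0] == 0" assumption.

```c
/*
 * Userspace model of the fixed-point load average update.
 * ASSUMPTION: constants and rounding mirror include/linux/sched/loadavg.h;
 * verify against your tree before relying on the numbers.
 */
#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point, i.e. 2048 */
#define EXP_1    1884UL             /* assumed 1-minute decay factor */

/* One exponential-moving-average step on raw fixed-point values. */
static inline unsigned long
model_calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
	unsigned long newload = load * exp + active * (FIXED_1 - exp);

	/* Round up while the average is climbing towards 'active'. */
	if (active >= load)
		newload += FIXED_1 - 1;

	return newload / FIXED_1;
}

/* Apply one sample with nr_active tasks counted (hypothetical helper). */
static inline unsigned long
model_sample(unsigned long load, unsigned long nr_active)
{
	return model_calc_load(load, EXP_1, nr_active * FIXED_1);
}

/* Count idle samples needed for a raw value to decay to exactly zero. */
static inline unsigned int
model_samples_until_zero(unsigned long load)
{
	unsigned int n = 0;

	while (load > 0) {
		load = model_sample(load, 0);
		n++;
	}
	return n;
}
```

Under these assumptions, model_sample(0, 1) gives a raw value of 164 for a
single counted task, and model_samples_until_zero() tells you how many idle
samples it takes for a given raw value to read as zero again.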

Also, I don't *really* care about optimizing something that runs once
every 120 seconds :)

Nacked-by: Lance Yang <lance.yang@linux.dev>
