* [PATCH] sched/psi: Skip CPUs with zero non-idle jiffies in per-cpu aggregation
@ 2026-02-03 10:00 Zhan Xusheng
  2026-02-03 16:46 ` Johannes Weiner
  0 siblings, 1 reply; 7+ messages in thread
From: Zhan Xusheng @ 2026-02-03 10:00 UTC
  To: Johannes Weiner; +Cc: linux-kernel, linux-sched, Zhan Xusheng

PSI per-cpu aggregation weights each CPU's contribution by its
non-idle time converted to jiffies. CPUs with zero non-idle jiffies
do not contribute to the weighted result, but are still processed in
the current implementation.

Skip CPUs with zero non-idle jiffies early to avoid unnecessary
per-cpu arithmetic during aggregation.

No functional change intended.

Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 59fdb7ebbf22..ce2321793a67 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -387,6 +387,13 @@ static void collect_percpu_times(struct psi_group *group,
 		changed_states |= cpu_changed_states;
 
 		nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
+		/*
+		 * A CPU with zero non-idle jiffies does not contribute to the
+		 * weighted per-CPU aggregation. There is no need to include it
+		 * in deltas or total accumulation.
+		 */
+		if (!nonidle)
+			continue;
 		nonidle_total += nonidle;
 
 		for (s = 0; s < PSI_NONIDLE; s++)
-- 
2.43.0



* Re: [PATCH] sched/psi: Skip CPUs with zero non-idle jiffies in per-cpu aggregation
  2026-02-03 10:00 [PATCH] sched/psi: Skip CPUs with zero non-idle jiffies in per-cpu aggregation Zhan Xusheng
@ 2026-02-03 16:46 ` Johannes Weiner
  2026-02-04  2:23   ` [PATCH v2] sched/psi: Skip CPUs with zero non-idle jiffies in per-CPU aggregation Zhan Xusheng
  2026-04-29 10:05   ` [PATCH v4] sched/psi: Skip CPUs with zero non-idle delta " Zhan Xusheng
  0 siblings, 2 replies; 7+ messages in thread
From: Johannes Weiner @ 2026-02-03 16:46 UTC
  To: Zhan Xusheng; +Cc: linux-kernel, linux-sched, Zhan Xusheng

On Tue, Feb 03, 2026 at 06:00:07PM +0800, Zhan Xusheng wrote:
> PSI per-cpu aggregation weights each CPU's contribution by its
> non-idle time converted to jiffies. CPUs with zero non-idle jiffies
> do not contribute to the weighted result, but are still processed in
> the current implementation.
> 
> Skip CPUs with zero non-idle jiffies early to avoid unnecessary
> per-cpu arithmetic during aggregation.
> 
> No functional change intended.
> 
> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>

Makes sense.

> ---
>  kernel/sched/psi.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 59fdb7ebbf22..ce2321793a67 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -387,6 +387,13 @@ static void collect_percpu_times(struct psi_group *group,
>  		changed_states |= cpu_changed_states;
>  
>  		nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
> +		/*
> +		 * A CPU with zero non-idle jiffies does not contribute to the
> +		 * weighted per-CPU aggregation. There is no need to include it
> +		 * in deltas or total accumulation.
> +		 */
> +		if (!nonidle)
> +			continue;

You could save the nsecs_to_jiffies() dance as well by doing:

		if (!(cpu_changed_states & (1 << PSI_NONIDLE)))
			continue;
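
The two skip conditions are not identical, but they should give the same
aggregate. The sketch below is a simplified userspace model, not kernel
code: `MODEL_HZ`, `MODEL_PSI_NONIDLE` and every `model_*`/`skip_*` name are
hypothetical, the tick rate of 250 is an assumption, and
`model_nsecs_to_jiffies()` is only a truncating division. It shows the one
case where the checks differ: a non-zero delta shorter than one tick is
skipped by the v1 test but not by the bit test, and in that case the weight
is zero jiffies anyway.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tick rate for this model; the kernel's HZ is a build option. */
#define MODEL_HZ 250u
#define MODEL_NSEC_PER_SEC 1000000000ull

/* Hypothetical stand-in for the kernel's PSI_NONIDLE state index. */
enum { MODEL_PSI_NONIDLE = 2 };

/* Simplified model of nsecs_to_jiffies(): truncating division by tick length. */
uint64_t model_nsecs_to_jiffies(uint64_t ns)
{
	return ns / (MODEL_NSEC_PER_SEC / MODEL_HZ);
}

/* Model of the mask bit: set only when the non-idle delta is non-zero. */
uint32_t model_changed_states(uint64_t nonidle_delta_ns)
{
	return nonidle_delta_ns ? (1u << MODEL_PSI_NONIDLE) : 0;
}

/* v1 condition: skip when the delta rounds down to zero jiffies. */
int skip_v1(uint64_t nonidle_delta_ns)
{
	return model_nsecs_to_jiffies(nonidle_delta_ns) == 0;
}

/* Suggested condition: skip when the non-idle bit is clear in the mask. */
int skip_suggested(uint32_t cpu_changed_states)
{
	return !(cpu_changed_states & (1u << MODEL_PSI_NONIDLE));
}
```

With a 4 ms tick, a 1 ms delta makes `skip_v1()` true and `skip_suggested()`
false; a zero delta makes both true, and an 8 ms delta makes both false.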


* [PATCH v2] sched/psi: Skip CPUs with zero non-idle jiffies in per-CPU aggregation
  2026-02-03 16:46 ` Johannes Weiner
@ 2026-02-04  2:23   ` Zhan Xusheng
  2026-03-13  3:48     ` [PATCH v3] " Zhan Xusheng
  2026-04-29 10:05   ` [PATCH v4] sched/psi: Skip CPUs with zero non-idle delta " Zhan Xusheng
  1 sibling, 1 reply; 7+ messages in thread
From: Zhan Xusheng @ 2026-02-04  2:23 UTC
  To: Johannes Weiner; +Cc: linux-kernel, linux-sched, Zhan Xusheng

To improve performance during per-CPU aggregation, skip CPUs that have
zero non-idle jiffies early in the process. These CPUs do not contribute
to the weighted result and can be excluded from the per-CPU calculations.
The change directly checks the `cpu_changed_states` for `PSI_NONIDLE`
instead of performing unnecessary arithmetic.

No functional change intended.

Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 59fdb7ebbf22..f1bf5449d3f9 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -384,6 +384,13 @@ static void collect_percpu_times(struct psi_group *group,
 
 		get_recent_times(group, cpu, aggregator, times,
 				&cpu_changed_states);
+		/*
+		 * Skip CPUs with no non-idle time. These CPUs do not contribute
+		 * to the weighted per-CPU aggregation, so we can avoid unnecessary
+		 * calculations for them by checking cpu_changed_states.
+		 */
+		if (!(cpu_changed_states & (1 << PSI_NONIDLE)))
+			continue;
 		changed_states |= cpu_changed_states;
 
 		nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
-- 
2.43.0



* [PATCH v3] sched/psi: Skip CPUs with zero non-idle jiffies in per-CPU aggregation
  2026-02-04  2:23   ` [PATCH v2] sched/psi: Skip CPUs with zero non-idle jiffies in per-CPU aggregation Zhan Xusheng
@ 2026-03-13  3:48     ` Zhan Xusheng
  0 siblings, 0 replies; 7+ messages in thread
From: Zhan Xusheng @ 2026-03-13  3:48 UTC
  To: Johannes Weiner; +Cc: Juri Lelli, linux-kernel, Zhan Xusheng

To improve performance during per-CPU aggregation, skip CPUs that have
zero non-idle jiffies early in the process. These CPUs do not contribute
to the weighted result and can be excluded from the per-CPU calculations.
The change directly checks the `cpu_changed_states` for `PSI_NONIDLE`
instead of performing unnecessary arithmetic.

No functional change intended.

Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45..65f09282143e 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -384,6 +384,13 @@ static void collect_percpu_times(struct psi_group *group,
 
 		get_recent_times(group, cpu, aggregator, times,
 				&cpu_changed_states);
+		/*
+		 * Skip CPUs with no non-idle time. These CPUs do not contribute
+		 * to the weighted per-CPU aggregation, so we can avoid unnecessary
+		 * calculations for them by checking cpu_changed_states.
+		 */
+		if (!(cpu_changed_states & (1 << PSI_NONIDLE)))
+			continue;
 		changed_states |= cpu_changed_states;
 
 		nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
-- 
2.43.0



* [PATCH v4] sched/psi: Skip CPUs with zero non-idle delta in per-CPU aggregation
  2026-02-03 16:46 ` Johannes Weiner
  2026-02-04  2:23   ` [PATCH v2] sched/psi: Skip CPUs with zero non-idle jiffies in per-CPU aggregation Zhan Xusheng
@ 2026-04-29 10:05   ` Zhan Xusheng
  2026-05-04  8:04     ` Peter Zijlstra
  1 sibling, 1 reply; 7+ messages in thread
From: Zhan Xusheng @ 2026-04-29 10:05 UTC
  To: Johannes Weiner, Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel, Zhan Xusheng

collect_percpu_times() iterates over every possible CPU to build a
non-idle-weighted average of the PSI state times. When a CPU has
no PSI_NONIDLE delta for the current sampling interval:
  nonidle  = nsecs_to_jiffies(times[PSI_NONIDLE]) = 0
  deltas[s] += times[s] * nonidle               /* += 0 */

so the weighted accumulation contributes nothing.

get_recent_times() already sets the PSI_NONIDLE bit in
cpu_changed_states iff the PSI_NONIDLE delta is non-zero. Use that
bit to skip such CPUs early, as suggested by Johannes, avoiding the
nsecs_to_jiffies() call.

No functional change intended.

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
v4:
 - Drop the incorrect Reviewed-by added in v2/v3; replace with
   Suggested-by. Johannes' "Makes sense." on v1 was an
   acknowledgement and an implementation suggestion, not a review
   tag.
 - Reword the commit message to describe "PSI_NONIDLE delta"
   rather than "non-idle jiffies", matching the actual check.
v3: https://lore.kernel.org/all/20260313034847.1422-1-zhanxusheng@xiaomi.com/
 - Resend of v2.
v2: https://lore.kernel.org/all/20260204022328.23938-1-zhanxusheng@xiaomi.com/
 - Use cpu_changed_states & (1 << PSI_NONIDLE) per Johannes'
   suggestion, saving the nsecs_to_jiffies() call.
v1: https://lore.kernel.org/all/20260203100007.22044-1-zhanxusheng@xiaomi.com/
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45..cd1174f0b5e5 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -384,6 +384,13 @@ static void collect_percpu_times(struct psi_group *group,
 
 		get_recent_times(group, cpu, aggregator, times,
 				&cpu_changed_states);
+		/*
+		 * If this CPU's PSI_NONIDLE delta is zero, it contributes
+		 * nothing to nonidle_total or to any deltas[] entry below,
+		 * so skip it early.
+		 */
+		if (!(cpu_changed_states & (1 << PSI_NONIDLE)))
+			continue;
 		changed_states |= cpu_changed_states;
 
 		nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
-- 
2.43.0
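
The "no functional change" argument in the commit message above can be
checked with a small userspace model. This is a sketch under stated
assumptions, not the kernel function: `model_collect()`, `struct
model_totals` and the `MODEL_*` constants are hypothetical, the model has
only two weighted states, the tick is assumed to be 4 ms (HZ=250), and a
zero non-idle delta stands in for a clear PSI_NONIDLE bit in
`cpu_changed_states`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_STATES 3          /* hypothetical: two weighted states + non-idle */
#define MODEL_NONIDLE   2          /* index of the non-idle slot */
#define MODEL_TICK_NS   4000000ull /* assumes HZ=250 */

struct model_totals {
	uint64_t deltas[MODEL_NONIDLE];
	uint64_t nonidle_total;
};

/*
 * Accumulate per-CPU state times weighted by non-idle jiffies.
 * When skip_idle is set, CPUs with a zero non-idle delta are skipped
 * before any arithmetic, mirroring the early continue in the patch.
 */
void model_collect(const uint64_t times[][MODEL_NR_STATES], size_t nr_cpus,
		   int skip_idle, struct model_totals *out)
{
	size_t cpu;
	int s;

	for (s = 0; s < MODEL_NONIDLE; s++)
		out->deltas[s] = 0;
	out->nonidle_total = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		uint64_t nonidle;

		if (skip_idle && times[cpu][MODEL_NONIDLE] == 0)
			continue;

		/* Truncating conversion, standing in for nsecs_to_jiffies(). */
		nonidle = times[cpu][MODEL_NONIDLE] / MODEL_TICK_NS;
		out->nonidle_total += nonidle;

		for (s = 0; s < MODEL_NONIDLE; s++)
			out->deltas[s] += times[cpu][s] * nonidle;
	}
}
```

Running the model twice on the same input, once with `skip_idle` set and
once without, should produce identical `deltas[]` and `nonidle_total`,
because every skipped CPU would have added zero to each of them.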



* Re: [PATCH v4] sched/psi: Skip CPUs with zero non-idle delta in per-CPU aggregation
  2026-04-29 10:05   ` [PATCH v4] sched/psi: Skip CPUs with zero non-idle delta " Zhan Xusheng
@ 2026-05-04  8:04     ` Peter Zijlstra
  2026-05-07 13:56       ` [PATCH v5] " Zhan Xusheng
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2026-05-04  8:04 UTC
  To: Zhan Xusheng; +Cc: Johannes Weiner, Ingo Molnar, linux-kernel, Zhan Xusheng

On Wed, Apr 29, 2026 at 06:05:55PM +0800, Zhan Xusheng wrote:
> collect_percpu_times() iterates over every possible CPU to build a
> non-idle-weighted average of the PSI state times. When a CPU has
> no PSI_NONIDLE delta for the current sampling interval:
>   nonidle  = nsecs_to_jiffies(times[PSI_NONIDLE]) = 0
>   deltas[s] += times[s] * nonidle               /* += 0 */
> 
> so the weighted accumulation contributes nothing.
> 
> get_recent_times() already sets the PSI_NONIDLE bit in
> cpu_changed_states iff the PSI_NONIDLE delta is non-zero. Use that
> bit to skip such CPUs early, as suggested by Johannes, avoiding the
> nsecs_to_jiffies() call.
> 
> No functional change intended.

So presumably this is an optimization. Where is the data that justifies
this?


* [PATCH v5] sched/psi: Skip CPUs with zero non-idle delta in per-CPU aggregation
  2026-05-04  8:04     ` Peter Zijlstra
@ 2026-05-07 13:56       ` Zhan Xusheng
  0 siblings, 0 replies; 7+ messages in thread
From: Zhan Xusheng @ 2026-05-07 13:56 UTC
  To: peterz
  Cc: hannes, mingo, Suren Baghdasaryan, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, linux-kernel, Zhan Xusheng

collect_percpu_times() iterates over every possible CPU to build a
non-idle-weighted average of the PSI state times. When a CPU has no
PSI_NONIDLE delta for the current sampling interval:
  nonidle     = nsecs_to_jiffies(times[PSI_NONIDLE]) = 0
  deltas[s]  += times[s] * nonidle               /* += 0 */

so the weighted accumulation contributes nothing.

get_recent_times() already sets the PSI_NONIDLE bit in
cpu_changed_states iff the PSI_NONIDLE delta is non-zero. Use that
bit to skip such CPUs early, as suggested by Johannes, avoiding the
nsecs_to_jiffies() call and the PSI_NONIDLE * u64 mul-adds that
follow.

No functional change: on the skipped path the old code adds zero to
deltas[] and zero to nonidle_total, which is exactly the result of
not iterating.

Measured on i7-8700 (6C/12T), same mainline base and same build
flags for both kernels. Reader is a pinned userspace loop of
open()+read()+close() on /proc/pressure/cpu, 100k iterations inside
a KVM guest with -smp matching the host LCPU count (12):
                            baseline    patched     diff
  idle             p50       2438 ns    2270 ns    -6.9%
  idle             p99       2598 ns    2449 ns    -5.7%
  1 busy / 12      p50       2479 ns    2281 ns    -8.0%
  all 12 busy      p50       3738 ns    3537 ns    -5.4%

The all-busy improvement shows the skip also kicks in when the box
is hot: between two samples, many CPUs record no PSI_NONIDLE state
transition even if they've been 100% utilised.

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
 kernel/sched/psi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45..f220debc3fe0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -386,6 +386,9 @@ static void collect_percpu_times(struct psi_group *group,
 				&cpu_changed_states);
 		changed_states |= cpu_changed_states;
 
+		if (!(cpu_changed_states & (1 << PSI_NONIDLE)))
+			continue;
+
 		nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
 		nonidle_total += nonidle;
 
-- 
2.43.0
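
The reader used for the measurement above is not included in the thread.
A minimal sketch of such a reader might look like the following; the
helper names `time_one_read()` and `percentile_ns()` are hypothetical, the
nearest-rank percentile is an assumption about how p50/p99 were derived,
and CPU pinning is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Time one open()+read()+close() cycle on path; returns ns, or -1 on error. */
int64_t time_one_read(const char *path)
{
	char buf[512];
	struct timespec t0, t1;
	ssize_t n;
	int fd;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, buf, sizeof(buf));
	close(fd);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (n < 0)
		return -1;

	return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}

/* qsort() comparator for int64_t samples, ascending. */
static int cmp_i64(const void *a, const void *b)
{
	int64_t x = *(const int64_t *)a;
	int64_t y = *(const int64_t *)b;

	return (x > y) - (x < y);
}

/* Nearest-rank percentile (pct in 0..100); sorts samples in place. */
int64_t percentile_ns(int64_t *samples, size_t n, unsigned int pct)
{
	size_t rank;

	assert(n > 0 && pct <= 100);
	qsort(samples, n, sizeof(*samples), cmp_i64);
	rank = (n * pct + 99) / 100;	/* ceil(n * pct / 100) */
	if (rank == 0)
		rank = 1;
	return samples[rank - 1];
}
```

A caller would fill an array with 100k results of
`time_one_read("/proc/pressure/cpu")` and then ask `percentile_ns()` for
the 50th and 99th percentiles, once on each kernel.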


