The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator
       [not found] <20260429085147.1029505-1-wangqing7171@gmail.com>
@ 2026-05-06  2:44 ` Qing Wang
  2026-05-13  2:21   ` Qing Wang
  2026-05-13  5:29   ` K Prateek Nayak
  0 siblings, 2 replies; 4+ messages in thread
From: Qing Wang @ 2026-05-06  2:44 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, Li Chen, linux-kernel
  Cc: Qing Wang

The current NI_RANDOM implementation uses a random dice roll to allow
newidle_balance attempts according to the success rate. There is a better
way to implememte it.

Replace the random dice with a Bresenham accumulator that distributes the
allowed attempts with uniformly spaced evenly across a 1024-step window:

Each step do those:
  - Accumulate (1 + newidle_ratio) into newidle_window_pos.
  - If the accumulator reaches 1024, allow the balance attempt and
    subtract 1024.

This guarantees exactly (1 + newidle_ratio) newidle_balance per 1024 steps
and per newidle_balance with uniformly spaced for any ratio in [0, 1023].

Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
 include/linux/sched/topology.h |  1 +
 kernel/sched/fair.c            | 18 +++++++++++++-----
 kernel/sched/topology.c        |  1 +
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 36553e14866d..fd6de240fb2b 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -95,6 +95,7 @@ struct sched_domain {
 	unsigned int newidle_call;
 	unsigned int newidle_success;
 	unsigned int newidle_ratio;
+	unsigned int newidle_window_pos;
 	u64 newidle_stamp;
 	u64 max_newidle_lb_cost;
 	unsigned long last_decay_max_lb_cost;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 69361c63353a..7874f795f62d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12509,6 +12509,7 @@ static inline void update_newidle_stats(struct sched_domain *sd, unsigned int su
 		ratio += sd->newidle_success;
 
 		sd->newidle_ratio = min(1024, ratio);
+		sd->newidle_window_pos = 0;
 		sd->newidle_call /= 2;
 		sd->newidle_success /= 2;
 	}
@@ -13217,16 +13218,23 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
 
 			if (sched_feat(NI_RANDOM) && sd->newidle_ratio < 1024) {
 				/*
-				 * Throw a 1k sided dice; and only run
-				 * newidle_balance according to the success
-				 * rate.
+				 * Base on accumulator of Bresenham's line algorithm
+				 *
+				 * Accumulate ratio each time and trigger balance
+				 * when sum ≥ 1024.
+				 *
+				 * Ensures exactly ratio balances per 1024 calls
+				 * and distributes ratio balance operations uniformly
+				 * across every 1024 invocations.
 				 */
-				u32 d1k = sched_rng() % 1024;
 				weight = 1 + sd->newidle_ratio;
-				if (d1k > weight) {
+
+				sd->newidle_window_pos += weight;
+				if (sd->newidle_window_pos < 1024) {
 					update_newidle_stats(sd, 0);
 					continue;
 				}
+				sd->newidle_window_pos -= 1024;
 				weight = (1024 + weight/2) / weight;
 			}
 
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 5847b83d9d55..0d8fa413f66c 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1708,6 +1708,7 @@ sd_init(struct sched_domain_topology_level *tl,
 		.newidle_call		= 512,
 		.newidle_success	= 256,
 		.newidle_ratio		= 512,
+		.newidle_window_pos	= 0,
 		.newidle_stamp		= now,
 
 		.max_newidle_lb_cost	= 0,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator
  2026-05-06  2:44 ` [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator Qing Wang
@ 2026-05-13  2:21   ` Qing Wang
  2026-05-13  5:29   ` K Prateek Nayak
  1 sibling, 0 replies; 4+ messages in thread
From: Qing Wang @ 2026-05-13  2:21 UTC (permalink / raw)
  To: bsegall, chenl311, dietmar.eggemann, juri.lelli, kprateek.nayak,
	mgorman, mingo, peterz, rostedt, vincent.guittot, vschneid
  Cc: wangqing7171, linux-kernel

Hi Peter,
Just a gentle ping on this patch. How about this idea?
I'm looking forward to your reply.
--
Best regards,
Qing

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator
  2026-05-06  2:44 ` [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator Qing Wang
  2026-05-13  2:21   ` Qing Wang
@ 2026-05-13  5:29   ` K Prateek Nayak
  2026-05-14  2:22     ` Qing Wang
  1 sibling, 1 reply; 4+ messages in thread
From: K Prateek Nayak @ 2026-05-13  5:29 UTC (permalink / raw)
  To: Qing Wang, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Li Chen, linux-kernel

Hello Qing,

On 5/6/2026 8:14 AM, Qing Wang wrote:
> The current NI_RANDOM implementation uses a random dice roll to allow
> newidle_balance attempts according to the success rate. There is a better
> way to implememte it.
> 
> Replace the random dice with a Bresenham accumulator that distributes the
> allowed attempts with uniformly spaced evenly across a 1024-step window:
> 
> Each step do those:
>   - Accumulate (1 + newidle_ratio) into newidle_window_pos.
>   - If the accumulator reaches 1024, allow the balance attempt and
>     subtract 1024.
> 
> This guarantees exactly (1 + newidle_ratio) newidle_balance per 1024 steps
> and per newidle_balance with uniformly spaced for any ratio in [0, 1023].

I took the most sensitive workload I have for newidle balance (tbench)
and took it for a spin with these changes. Following are the results:

    Clients:        tip                  bresenham_accm
        1      321.65 (0.00 pct)       313.97 (-2.38 pct)
        2      641.92 (0.00 pct)       638.74 (-0.49 pct)
        4     1245.65 (0.00 pct)      1237.26 (-0.67 pct)
        8     2435.80 (0.00 pct)      2442.23 ( 0.26 pct)
       16     4717.66 (0.00 pct)      4688.19 (-0.62 pct)
       32     9303.53 (0.00 pct)      9390.71 ( 0.93 pct)
       64    18002.57 (0.00 pct)     17911.56 (-0.50 pct)
      128    27729.26 (0.00 pct)     27621.95 (-0.38 pct)
      256    47134.77 (0.00 pct)     46137.36 (-2.11 pct)
      512    43179.41 (0.00 pct)     43277.53 ( 0.22 pct)
     1024    40339.30 (0.00 pct)     40176.49 (-0.40 pct)

The %diff is in noise range which is a good indication that there
shouldn't be any surprises. I'll queue a run overnight to see if
there are other benchmarks that like / dislike these changes.

I'll let Peter comment on the change itself since he knows these
bits best ;-)

-- 
Thanks and Regards,
Prateek


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator
  2026-05-13  5:29   ` K Prateek Nayak
@ 2026-05-14  2:22     ` Qing Wang
  0 siblings, 0 replies; 4+ messages in thread
From: Qing Wang @ 2026-05-14  2:22 UTC (permalink / raw)
  To: kprateek.nayak
  Cc: bsegall, chenl311, dietmar.eggemann, juri.lelli, linux-kernel,
	mgorman, mingo, peterz, rostedt, vincent.guittot, vschneid,
	wangqing7171

On Wed, 13 May 2026 at 13:29, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
> I took the most sensitive workload I have for newidle balance (tbench)
> and took it for a spin with these changes. Following are the results:
> 
>     Clients:        tip                  bresenham_accm
>         1      321.65 (0.00 pct)       313.97 (-2.38 pct)
>         2      641.92 (0.00 pct)       638.74 (-0.49 pct)
>         4     1245.65 (0.00 pct)      1237.26 (-0.67 pct)
>         8     2435.80 (0.00 pct)      2442.23 ( 0.26 pct)
>        16     4717.66 (0.00 pct)      4688.19 (-0.62 pct)
>        32     9303.53 (0.00 pct)      9390.71 ( 0.93 pct)
>        64    18002.57 (0.00 pct)     17911.56 (-0.50 pct)
>       128    27729.26 (0.00 pct)     27621.95 (-0.38 pct)
>       256    47134.77 (0.00 pct)     46137.36 (-2.11 pct)
>       512    43179.41 (0.00 pct)     43277.53 ( 0.22 pct)
>      1024    40339.30 (0.00 pct)     40176.49 (-0.40 pct)
> 
> The %diff is in noise range which is a good indication that there
> shouldn't be any surprises. I'll queue a run overnight to see if
> there are other benchmarks that like / dislike these changes.
> 
> I'll let Peter comment on the change itself since he knows these
> bits best ;-)

Oh! Thanks for your benchmarks :) The %diff looks good.
Let's wait for Peter's comments.

--
Best regards,
Qing

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-05-14  2:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20260429085147.1029505-1-wangqing7171@gmail.com>
2026-05-06  2:44 ` [PATCH] sched/fair: Replace random newidle_balance with Bresenham accumulator Qing Wang
2026-05-13  2:21   ` Qing Wang
2026-05-13  5:29   ` K Prateek Nayak
2026-05-14  2:22     ` Qing Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox