public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched: Skip useless sched_balance_running acquisition if load balance is not due
@ 2025-04-16  3:58 Tim Chen
  2025-04-16  5:30 ` Shrikanth Hegde
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Tim Chen @ 2025-04-16  3:58 UTC
  To: Peter Zijlstra, Ingo Molnar
  Cc: Tim Chen, Vincent Guittot, Chen Yu, Doug Nelson, Mohini Narkhede,
	linux-kernel

At load balance time, balancing of last-level cache domains and above
needs to be serialized. The scheduler first acquires the atomic variable
sched_balance_running and only then checks whether a load balance is
due. The acquisition is expensive because multiple CPUs can attempt it
at the same time.

On a 2-socket Granite Rapids system with sub-NUMA clustering enabled and
running OLTP workloads, 7.6% of CPU cycles are spent on the cmpxchg of
sched_balance_running. Most of the time, the balance attempt is aborted
immediately after acquiring sched_balance_running because the load
balance is not yet due.

Instead, check whether the balance is due before acquiring
sched_balance_running. This skips many useless acquisitions of
sched_balance_running and reduces the 7.6% CPU overhead in
sched_balance_domains() to 0.05%. Throughput of the OLTP workload
improved by 11%.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Reported-by: Mohini Narkhede <mohini.narkhede@intel.com>
Tested-by: Mohini Narkhede <mohini.narkhede@intel.com>
---
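Note: the reordering described in the changelog can be sketched in
userspace C11 as below. This is a simplified, single-threaded
illustration and not the kernel code: struct domain, balance_running,
cmpxchg_attempts, try_acquire(), release(), tick_after_eq(),
balance_old() and balance_new() are hypothetical stand-ins, and the
actual balancing work is reduced to updating last_balance.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real scheduler types. */
static atomic_int balance_running;      /* models sched_balance_running */
static unsigned long cmpxchg_attempts;  /* plain counter; sketch is single-threaded */

struct domain {
	unsigned long last_balance;  /* tick of the last balance */
	unsigned long interval;      /* minimum ticks between balances */
	bool serialize;              /* models the SD_SERIALIZE flag */
};

/* Try to take the global flag; returns true on success. */
static bool try_acquire(void)
{
	int expected = 0;

	cmpxchg_attempts++;
	return atomic_compare_exchange_strong_explicit(&balance_running,
			&expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

static void release(void)
{
	/* Only the holder may release the flag. */
	assert(atomic_load_explicit(&balance_running, memory_order_relaxed) == 1);
	atomic_store_explicit(&balance_running, 0, memory_order_release);
}

/* Wrap-safe "a is at or after b", in the spirit of time_after_eq(). */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Old ordering: acquire first, then find out whether a balance is due. */
static bool balance_old(struct domain *sd, unsigned long now)
{
	bool balanced = false;

	if (sd->serialize && !try_acquire())
		return false;
	if (tick_after_eq(now, sd->last_balance + sd->interval)) {
		sd->last_balance = now;  /* real balance work would go here */
		balanced = true;
	}
	if (sd->serialize)
		release();
	return balanced;
}

/* New ordering: touch the shared flag only once a balance is due. */
static bool balance_new(struct domain *sd, unsigned long now)
{
	if (!tick_after_eq(now, sd->last_balance + sd->interval))
		return false;
	if (sd->serialize && !try_acquire())
		return false;
	sd->last_balance = now;  /* real balance work would go here */
	if (sd->serialize)
		release();
	return true;
}
```

Under these assumptions, driving one domain with interval 10 over 100
ticks performs 100 compare-and-exchange attempts with balance_old() and
10 with balance_new(), while both perform the same 10 balances.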
 kernel/sched/fair.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e43993a4e580..5e5f7a770b2f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12220,13 +12220,13 @@ static void sched_balance_domains(struct rq *rq, enum cpu_idle_type idle)
 
 		interval = get_sd_balance_interval(sd, busy);
 
-		need_serialize = sd->flags & SD_SERIALIZE;
-		if (need_serialize) {
-			if (atomic_cmpxchg_acquire(&sched_balance_running, 0, 1))
-				goto out;
-		}
-
 		if (time_after_eq(jiffies, sd->last_balance + interval)) {
+			need_serialize = sd->flags & SD_SERIALIZE;
+			if (need_serialize) {
+				if (atomic_cmpxchg_acquire(&sched_balance_running, 0, 1))
+					goto out;
+			}
+
 			if (sched_balance_rq(cpu, rq, sd, idle, &continue_balancing)) {
 				/*
 				 * The LBF_DST_PINNED logic could have changed
@@ -12238,9 +12238,9 @@ static void sched_balance_domains(struct rq *rq, enum cpu_idle_type idle)
 			}
 			sd->last_balance = jiffies;
 			interval = get_sd_balance_interval(sd, busy);
+			if (need_serialize)
+				atomic_set_release(&sched_balance_running, 0);
 		}
-		if (need_serialize)
-			atomic_set_release(&sched_balance_running, 0);
 out:
 		if (time_after(next_balance, sd->last_balance + interval)) {
 			next_balance = sd->last_balance + interval;
-- 
2.32.0



end of thread, other threads:[~2025-10-27 18:06 UTC | newest]

Thread overview: 23+ messages
2025-04-16  3:58 [PATCH] sched: Skip useless sched_balance_running acquisition if load balance is not due Tim Chen
2025-04-16  5:30 ` Shrikanth Hegde
2025-04-16  6:28   ` Chen, Yu C
2025-04-16  9:16     ` Shrikanth Hegde
2025-04-16  9:29       ` Shrikanth Hegde
2025-04-16  9:47         ` Vincent Guittot
2025-04-16 14:14           ` Shrikanth Hegde
2025-04-17 11:10             ` K Prateek Nayak
2025-04-18 15:02             ` Vincent Guittot
2025-04-18 17:55               ` Shrikanth Hegde
2025-04-17 11:31           ` K Prateek Nayak
2025-04-17 12:01             ` Peter Zijlstra
2025-04-18  5:26               ` K Prateek Nayak
2025-04-18  9:28                 ` Peter Zijlstra
2025-04-18 12:13                   ` K Prateek Nayak
2025-04-16 16:19       ` Tim Chen
2025-04-16 17:11         ` Shrikanth Hegde
2025-04-17  9:19         ` Shrikanth Hegde
2025-04-17 17:12           ` Tim Chen
2025-05-29  9:00 ` K Prateek Nayak
2025-06-04  4:26 ` Chen, Yu C
2025-06-06 13:51 ` Vincent Guittot
2025-10-27 18:06   ` Mel Gorman
