public inbox for linux-kernel@vger.kernel.org
From: Srikar Dronamraju <srikar@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
	Ingo Molnar <mingo@kernel.org>, Chen Yu <yu.c.chen@intel.com>,
	Doug Nelson <doug.nelson@intel.com>,
	Mohini Narkhede <mohini.narkhede@intel.com>,
	linux-kernel@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>
Subject: Re: [RESEND PATCH] sched/fair: Skip sched_balance_running cmpxchg when balance is not due
Date: Tue, 14 Oct 2025 19:20:35 +0530
Message-ID: <aO5VK4PO_REXNhnN@linux.ibm.com>
In-Reply-To: <20251014092436.GK4067720@noisy.programming.kicks-ass.net>

* Peter Zijlstra <peterz@infradead.org> [2025-10-14 11:24:36]:

> On Mon, Oct 13, 2025 at 02:54:19PM -0700, Tim Chen wrote:
> 
> 
> Right, Yu Chen said something like that as well, should_we_balance() is
> too late.
> 
> Should we instead move the whole serialize thing inside
> sched_balance_rq() like so:
> 
> @@ -12122,21 +12148,6 @@ static int active_load_balance_cpu_stop(void *data)
>  	return 0;
>  }
>  
> -/*
> - * This flag serializes load-balancing passes over large domains
> - * (above the NODE topology level) - only one load-balancing instance
> - * may run at a time, to reduce overhead on very large systems with
> - * lots of CPUs and large NUMA distances.
> - *
> - * - Note that load-balancing passes triggered while another one
> - *   is executing are skipped and not re-tried.
> - *
> - * - Also note that this does not serialize rebalance_domains()
> - *   execution, as non-SD_SERIALIZE domains will still be
> - *   load-balanced in parallel.
> - */
> -static atomic_t sched_balance_running = ATOMIC_INIT(0);
> -
>  /*
>   * Scale the max sched_balance_rq interval with the number of CPUs in the system.
>   * This trades load-balance latency on larger machines for less cross talk.
> @@ -12192,7 +12203,7 @@ static void sched_balance_domains(struct rq *rq, enum cpu_idle_type idle)
>  	/* Earliest time when we have to do rebalance again */
>  	unsigned long next_balance = jiffies + 60*HZ;
>  	int update_next_balance = 0;
> -	int need_serialize, need_decay = 0;
> +	int need_decay = 0;
>  	u64 max_cost = 0;
>  
>  	rcu_read_lock();
> @@ -12216,13 +12227,6 @@ static void sched_balance_domains(struct rq *rq, enum cpu_idle_type idle)
>  		}
>  
>  		interval = get_sd_balance_interval(sd, busy);
> -
> -		need_serialize = sd->flags & SD_SERIALIZE;
> -		if (need_serialize) {
> -			if (atomic_cmpxchg_acquire(&sched_balance_running, 0, 1))
> -				goto out;
> -		}
> -
>  		if (time_after_eq(jiffies, sd->last_balance + interval)) {
>  			if (sched_balance_rq(cpu, rq, sd, idle, &continue_balancing)) {
>  				/*
> @@ -12236,9 +12240,7 @@ static void sched_balance_domains(struct rq *rq, enum cpu_idle_type idle)
>  			sd->last_balance = jiffies;
>  			interval = get_sd_balance_interval(sd, busy);
>  		}
> -		if (need_serialize)
> -			atomic_set_release(&sched_balance_running, 0);
> -out:
> +
>  		if (time_after(next_balance, sd->last_balance + interval)) {
>  			next_balance = sd->last_balance + interval;
>  			update_next_balance = 1;

I think this is better, since previously a CPU that was not supposed to do
the balancing could still set the atomic variable. If the CPU that was
supposed to do the balance then tries, it may fail because the variable has
not yet been cleared.
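
To make the ordering problem concrete, here is a small user-space sketch
using C11 atomics. It is not kernel code: balance_running stands in for
sched_balance_running, and try_serialize(), end_serialize(),
begin_balance() and due_cpu_balances() are hypothetical helpers made up
for illustration. It only models the effect of taking the flag before
versus after the "is balance due" check, with the interleaving replayed
sequentially rather than on real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for sched_balance_running: 0 = free, 1 = held. */
static atomic_int balance_running;

/* Mirrors the intent of atomic_cmpxchg_acquire(&flag, 0, 1) == 0. */
static bool try_serialize(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&balance_running,
						       &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Mirrors the intent of atomic_set_release(&flag, 0). */
static void end_serialize(void)
{
	assert(atomic_load_explicit(&balance_running,
				    memory_order_relaxed) == 1);
	atomic_store_explicit(&balance_running, 0, memory_order_release);
}

/*
 * Returns true if this CPU now holds the flag.
 * check_due_first == false: take the flag before looking at 'due'.
 * check_due_first == true:  only take the flag when balance is due.
 */
static bool begin_balance(bool due, bool check_due_first)
{
	if (check_due_first && !due)
		return false;
	return try_serialize();
}

/*
 * Replay one interleaving: CPU A (balance not due) arrives first, and
 * CPU B (balance due) arrives before A has released the flag.
 * Returns true if CPU B got to balance.
 */
bool due_cpu_balances(bool check_due_first)
{
	bool a_holds, b_holds;

	atomic_store(&balance_running, 0);

	a_holds = begin_balance(false, check_due_first);	/* CPU A */
	b_holds = begin_balance(true, check_due_first);		/* CPU B */

	if (a_holds)
		end_serialize();
	if (b_holds)
		end_serialize();

	return b_holds;
}
```

In this sketch due_cpu_balances(false) returns false, because CPU A holds
the flag for no useful work and CPU B's attempt is skipped, while
due_cpu_balances(true) returns true, because CPU A never touches the flag.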

-- 
Thanks and Regards
Srikar Dronamraju
