Message-ID: <551A5CCE.70008@linux.vnet.ibm.com>
Date: Tue, 31 Mar 2015 14:07:34 +0530
From: Preeti U Murthy
To: Jason Low , peterz@infradead.org, mingo@kernel.org, Daniel Lezcano
CC: riel@redhat.com, vincent.guittot@linaro.org, srikar@linux.vnet.ibm.com,
 pjt@google.com, benh@kernel.crashing.org, efault@gmx.de,
 linux-kernel@vger.kernel.org, iamjoonsoo.kim@lge.com,
 svaidy@linux.vnet.ibm.com, tim.c.chen@linux.intel.com,
 morten.rasmussen@arm.com
Subject: Re: sched: Improve load balancing in the presence of idle CPUs
References: <1427741729.5694.24.camel@j-VirtualBox>
In-Reply-To: <1427741729.5694.24.camel@j-VirtualBox>

Hi Jason,

On 03/31/2015 12:25 AM, Jason Low wrote:
> Hi Preeti,
>
> I noticed that another commit 4a725627f21d converted the check in
> nohz_kick_needed() from idle_cpu() to rq->idle_balance, causing a
> potentially outdated value to be used if this cpu is able to pull tasks
> using rebalance_domains(), and nohz_kick_needed() directly returning
> false.

I see that rebalance_domains() will be run at the end of the scheduler
tick interrupt handling. trigger_load_balance() only sets the softirq,
it does not call rebalance_domains() immediately.
So the call graph would be:

rq->idle_balance = idle_cpu()
|____trigger_load_balance()
     |_____raise SCHED_SOFTIRQ - we are handling interrupt, hence defer
     |____nohz_kick_needed()
|____rebalance_domains() run through the softirqd.

Correct me if I am wrong but since we do not pull any load between the
rq->idle_balance update and nohz_kick_needed(), we are safe in reading
rq->idle_balance in nohz_kick_needed().

>
> Would this patch also help address some of the issue you are seeing?
>
> Signed-off-by: Jason Low
> ---
>  kernel/sched/fair.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fdae26e..ba8ec1a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7644,7 +7644,7 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  		 * balancing owner will pick it up.
>  		 */
>  		if (need_resched())
> -			break;
> +			goto end;

Why is this hunk needed?

Regards
Preeti U Murthy

>
>  		rq = cpu_rq(balance_cpu);
>
> @@ -7687,7 +7687,7 @@ static inline bool nohz_kick_needed(struct rq *rq)
>  	int nr_busy, cpu = rq->cpu;
>  	bool kick = false;
>
> -	if (unlikely(rq->idle_balance))
> +	if (unlikely(idle_cpu(cpu)))
>  		return false;
>
>  	/*
>
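
The ordering argued in the call graph above can be modelled in user
space. The sketch below is not kernel code: struct rq is reduced to two
fields; struct tick_result, softirq_pending and run_softirq() are
hypothetical names introduced only for illustration; and
nohz_kick_needed() is cut down to the single check under discussion. It
only shows that, in this model, the snapshot in rq->idle_balance still
matches idle_cpu() when nohz_kick_needed() reads it, and can go stale
only once the deferred softirq work has run.

```c
/*
 * User-space model of the tick-time ordering described above.
 * This is NOT kernel code: struct rq, the helpers and the pending flag
 * are simplified stand-ins that only mimic the order of the calls.
 */
#include <assert.h>
#include <stdbool.h>

struct rq {
	int nr_running;     /* runnable tasks on this CPU (model only) */
	bool idle_balance;  /* snapshot of idle_cpu() taken in the tick */
};

struct tick_result {
	bool snapshot_fresh;  /* idle_balance still equals idle_cpu() */
	bool kick;            /* what nohz_kick_needed() returned */
};

static bool softirq_pending;  /* stands in for a raised SCHED_SOFTIRQ */

static bool idle_cpu(const struct rq *rq)
{
	return rq->nr_running == 0;
}

/* Simplified: the real function applies more checks for a busy CPU. */
static bool nohz_kick_needed(const struct rq *rq)
{
	if (rq->idle_balance)
		return false;
	return true;
}

/* Only marks the softirq as pending; no load is pulled here. */
static bool trigger_load_balance(struct rq *rq)
{
	softirq_pending = true;
	return nohz_kick_needed(rq);
}

/* Tick path: take the snapshot, then run the check, with no pull between. */
static struct tick_result scheduler_tick(struct rq *rq)
{
	struct tick_result res;

	rq->idle_balance = idle_cpu(rq);
	res.kick = trigger_load_balance(rq);
	/* Compared right after the check; nothing above changes nr_running. */
	res.snapshot_fresh = (rq->idle_balance == idle_cpu(rq));
	return res;
}

/* Deferred part: runs after the tick and may pull a task (model only). */
static void run_softirq(struct rq *rq)
{
	assert(softirq_pending);  /* must have been raised by the tick */
	softirq_pending = false;
	if (idle_cpu(rq))
		rq->nr_running++;  /* pretend rebalance_domains() pulled a task */
}
```

With an idle runqueue (nr_running == 0), scheduler_tick() reports
snapshot_fresh as true and kick as false. Calling run_softirq()
afterwards raises nr_running to 1 while idle_balance keeps its old
value, which is the only point in this model where the two diverge.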