* [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-20 20:17 UTC
To: Ingo Molnar, Peter Zijlstra
Cc: Andrew Morton, Len Brown, Russ Anderson, Dimitri Sivanich,
    Hedi Berriche, Andi Kleen, Michel Lespinasse, Rik van Riel,
    Peter Hurley, linux-kernel

The current no_hz idle load balancer does load balancing on *all* idle
cpus, even though the time at which a particular idle cpu is due for
load balancing could still be a while in the future. This introduces a
much higher load balancing rate than is necessary. The patch changes
the behavior by doing idle load balancing on behalf of an idle cpu
only when it is due for load balancing.

On SGI's systems with over 3000 cores, the cpu responsible for idle
balancing got overwhelmed with idle balancing and introduced a lot of
OS noise to workloads. This patch fixes the issue.

Thanks.

Tim

Acked-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/fair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9b4c4f3..97132db 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 		rq = cpu_rq(balance_cpu);
 
-		raw_spin_lock_irq(&rq->lock);
-		update_rq_clock(rq);
-		update_idle_cpu_load(rq);
-		raw_spin_unlock_irq(&rq->lock);
-
-		rebalance_domains(rq, CPU_IDLE);
+		/*
+		 * If time for next balance is due,
+		 * do the balance.
+		 */
+		if (time_after(jiffies + 1, rq->next_balance)) {
+			raw_spin_lock_irq(&rq->lock);
+			update_rq_clock(rq);
+			update_idle_cpu_load(rq);
+			raw_spin_unlock_irq(&rq->lock);
+			rebalance_domains(rq, CPU_IDLE);
+		}
 
 		if (time_after(this_rq->next_balance, rq->next_balance))
 			this_rq->next_balance = rq->next_balance;
-- 
1.7.11.7

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Jason Low @ 2014-05-20 20:51 UTC
To: Tim Chen
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9b4c4f3..97132db 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>
>  		rq = cpu_rq(balance_cpu);
>
> -		raw_spin_lock_irq(&rq->lock);
> -		update_rq_clock(rq);
> -		update_idle_cpu_load(rq);
> -		raw_spin_unlock_irq(&rq->lock);
> -
> -		rebalance_domains(rq, CPU_IDLE);
> +		/*
> +		 * If time for next balance is due,
> +		 * do the balance.
> +		 */
> +		if (time_after(jiffies + 1, rq->next_balance)) {

Hi Tim,

If we want to do idle load balancing only when it is due for a
balance, shouldn't the above just be "if (time_after(jiffies,
rq->next_balance))"?

> +			raw_spin_lock_irq(&rq->lock);
> +			update_rq_clock(rq);
> +			update_idle_cpu_load(rq);
> +			raw_spin_unlock_irq(&rq->lock);
> +			rebalance_domains(rq, CPU_IDLE);
> +		}
>
>  		if (time_after(this_rq->next_balance, rq->next_balance))
>  			this_rq->next_balance = rq->next_balance;

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Rik van Riel @ 2014-05-20 20:58 UTC
To: Jason Low, Tim Chen
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Peter Hurley, Linux Kernel Mailing List

On 05/20/2014 04:51 PM, Jason Low wrote:
> On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 9b4c4f3..97132db 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>>
>>  		rq = cpu_rq(balance_cpu);
>>
>> -		raw_spin_lock_irq(&rq->lock);
>> -		update_rq_clock(rq);
>> -		update_idle_cpu_load(rq);
>> -		raw_spin_unlock_irq(&rq->lock);
>> -
>> -		rebalance_domains(rq, CPU_IDLE);
>> +		/*
>> +		 * If time for next balance is due,
>> +		 * do the balance.
>> +		 */
>> +		if (time_after(jiffies + 1, rq->next_balance)) {
>
> Hi Tim,
>
> If we want to do idle load balancing only when it is due for a
> balance, shouldn't the above just be "if (time_after(jiffies,
> rq->next_balance))"?

I was wondering the same.

Everything else gets my

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-20 20:59 UTC
To: Jason Low
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 9b4c4f3..97132db 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
> >
> >  		rq = cpu_rq(balance_cpu);
> >
> > -		raw_spin_lock_irq(&rq->lock);
> > -		update_rq_clock(rq);
> > -		update_idle_cpu_load(rq);
> > -		raw_spin_unlock_irq(&rq->lock);
> > -
> > -		rebalance_domains(rq, CPU_IDLE);
> > +		/*
> > +		 * If time for next balance is due,
> > +		 * do the balance.
> > +		 */
> > +		if (time_after(jiffies + 1, rq->next_balance)) {
>
> Hi Tim,
>
> If we want to do idle load balancing only when it is due for a
> balance, shouldn't the above just be "if (time_after(jiffies,
> rq->next_balance))"?

If rq->next_balance and jiffies are equal, then the
time_after(jiffies, rq->next_balance) check will be false and
you will not do the balance. But you actually want to balance
in this case, so jiffies + 1 was used.

Tim
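
The equal-jiffies case raised above can be checked outside the kernel. What follows is only a user-space sketch under stated assumptions: the two macros are simplified copies of time_after() and time_after_eq() from include/linux/jiffies.h with the typecheck() guards dropped, and the helper names due_strict(), due_plus_one(), due_after_eq() and check_boundary_cases() are made up for this illustration. It suggests that the "jiffies + 1" form and time_after_eq() agree at the boundary, and that both stay correct when the tick counter wraps.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Simplified user-space copies of the kernel helpers (assumption: the
 * typecheck() guards from include/linux/jiffies.h are omitted). The
 * signed cast of the unsigned difference is what makes the comparison
 * safe across counter wraparound.
 */
#define time_after(a, b)	((long)((b) - (a)) < 0)
#define time_after_eq(a, b)	((long)((a) - (b)) >= 0)

/* Hypothetical names for the three candidate checks from the thread. */
static inline bool due_strict(unsigned long now, unsigned long next_balance)
{
	return time_after(now, next_balance);
}

static inline bool due_plus_one(unsigned long now, unsigned long next_balance)
{
	return time_after(now + 1, next_balance);
}

static inline bool due_after_eq(unsigned long now, unsigned long next_balance)
{
	return time_after_eq(now, next_balance);
}

static inline void check_boundary_cases(void)
{
	/* now == next_balance: only the strict check skips the balance */
	assert(!due_strict(100, 100));
	assert(due_plus_one(100, 100));
	assert(due_after_eq(100, 100));

	/* one tick early: nothing is due yet */
	assert(!due_plus_one(99, 100));
	assert(!due_after_eq(99, 100));

	/* counter wrapped: next_balance just before the wrap, now just after */
	assert(due_plus_one(2UL, ULONG_MAX - 1));
	assert(due_after_eq(2UL, ULONG_MAX - 1));
}
```

Calling check_boundary_cases() from a small main() exercises all of the cases. The two non-strict forms differ only when the two values are exactly half the counter range apart, which should not matter for balance intervals.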

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-20 21:04 UTC
To: Jason Low
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, 2014-05-20 at 13:59 -0700, Tim Chen wrote:
> On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> > On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 9b4c4f3..97132db 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
> > >
> > >  		rq = cpu_rq(balance_cpu);
> > >
> > > -		raw_spin_lock_irq(&rq->lock);
> > > -		update_rq_clock(rq);
> > > -		update_idle_cpu_load(rq);
> > > -		raw_spin_unlock_irq(&rq->lock);
> > > -
> > > -		rebalance_domains(rq, CPU_IDLE);
> > > +		/*
> > > +		 * If time for next balance is due,
> > > +		 * do the balance.
> > > +		 */
> > > +		if (time_after(jiffies + 1, rq->next_balance)) {
> >
> > Hi Tim,
> >
> > If we want to do idle load balancing only when it is due for a
> > balance, shouldn't the above just be "if (time_after(jiffies,
> > rq->next_balance))"?
>
> If rq->next_balance and jiffies are equal, then the
> time_after(jiffies, rq->next_balance) check will be false and
> you will not do the balance. But you actually want to balance
> in this case, so jiffies + 1 was used.

So maybe I should switch the check to

	if (time_before(rq->next_balance, jiffies))

Tim

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Joe Perches @ 2014-05-21 1:15 UTC
To: Tim Chen
Cc: Jason Low, Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, 2014-05-20 at 14:04 -0700, Tim Chen wrote:
> On Tue, 2014-05-20 at 13:59 -0700, Tim Chen wrote:
> > On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> > > On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
[]
> > > If we want to do idle load balancing only when it is due for a
> > > balance, shouldn't the above just be "if (time_after(jiffies,
> > > rq->next_balance))"?
> >
> > If rq->next_balance and jiffies are equal, then the
> > time_after(jiffies, rq->next_balance) check will be false and
> > you will not do the balance. But you actually want to balance
> > in this case, so jiffies + 1 was used.
>
> So maybe I should switch the check to
> 	if (time_before(rq->next_balance, jiffies))

time_after_eq() or time_is_after_eq_jiffies()

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-21 16:37 UTC
To: Joe Perches
Cc: Jason Low, Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, 2014-05-20 at 18:15 -0700, Joe Perches wrote:
> On Tue, 2014-05-20 at 14:04 -0700, Tim Chen wrote:
> > On Tue, 2014-05-20 at 13:59 -0700, Tim Chen wrote:
> > > On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> > > > On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> []
> > > > If we want to do idle load balancing only when it is due for a
> > > > balance, shouldn't the above just be "if (time_after(jiffies,
> > > > rq->next_balance))"?
> > >
> > > If rq->next_balance and jiffies are equal, then the
> > > time_after(jiffies, rq->next_balance) check will be false and
> > > you will not do the balance. But you actually want to balance
> > > in this case, so jiffies + 1 was used.
> >
> > So maybe I should switch the check to
> > 	if (time_before(rq->next_balance, jiffies))
>
> time_after_eq() or time_is_after_eq_jiffies()

I prefer time_after_eq() to keep the code style consistent with the
rest of the code in fair.c.

Tim

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Davidlohr Bueso @ 2014-05-21 18:26 UTC
To: Tim Chen
Cc: Joe Perches, Jason Low, Ingo Molnar, Peter Zijlstra,
    Andrew Morton, Len Brown, Russ Anderson, Dimitri Sivanich,
    Hedi Berriche, Andi Kleen, Michel Lespinasse, Rik van Riel,
    Peter Hurley, Linux Kernel Mailing List

On Wed, 2014-05-21 at 09:37 -0700, Tim Chen wrote:
> On Tue, 2014-05-20 at 18:15 -0700, Joe Perches wrote:
> > On Tue, 2014-05-20 at 14:04 -0700, Tim Chen wrote:
> > > On Tue, 2014-05-20 at 13:59 -0700, Tim Chen wrote:
> > > > On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> > > > > On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > []
> > > > > If we want to do idle load balancing only when it is due for a
> > > > > balance, shouldn't the above just be "if (time_after(jiffies,
> > > > > rq->next_balance))"?
> > > >
> > > > If rq->next_balance and jiffies are equal, then the
> > > > time_after(jiffies, rq->next_balance) check will be false and
> > > > you will not do the balance. But you actually want to balance
> > > > in this case, so jiffies + 1 was used.
> > >
> > > So maybe I should switch the check to
> > > 	if (time_before(rq->next_balance, jiffies))
> >
> > time_after_eq() or time_is_after_eq_jiffies()
>
> I prefer time_after_eq() to keep the code style consistent with the
> rest of the code in fair.c.

Should all the code be updated then? We should use the existing
interfaces if available.

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-21 18:49 UTC
To: Davidlohr Bueso
Cc: Joe Perches, Jason Low, Ingo Molnar, Peter Zijlstra,
    Andrew Morton, Len Brown, Russ Anderson, Dimitri Sivanich,
    Hedi Berriche, Andi Kleen, Michel Lespinasse, Rik van Riel,
    Peter Hurley, Linux Kernel Mailing List

On Wed, 2014-05-21 at 11:26 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-05-21 at 09:37 -0700, Tim Chen wrote:
> > On Tue, 2014-05-20 at 18:15 -0700, Joe Perches wrote:
> > > On Tue, 2014-05-20 at 14:04 -0700, Tim Chen wrote:
> > > > On Tue, 2014-05-20 at 13:59 -0700, Tim Chen wrote:
> > > > > On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> > > > > > On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> > > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > []
> > > > > > If we want to do idle load balancing only when it is due for a
> > > > > > balance, shouldn't the above just be "if (time_after(jiffies,
> > > > > > rq->next_balance))"?
> > > > >
> > > > > If rq->next_balance and jiffies are equal, then the
> > > > > time_after(jiffies, rq->next_balance) check will be false and
> > > > > you will not do the balance. But you actually want to balance
> > > > > in this case, so jiffies + 1 was used.
> > > >
> > > > So maybe I should switch the check to
> > > > 	if (time_before(rq->next_balance, jiffies))
> > >
> > > time_after_eq() or time_is_after_eq_jiffies()
> >
> > I prefer time_after_eq() to keep the code style consistent with the
> > rest of the code in fair.c.
>
> Should all the code be updated then? We should use the existing
> interfaces if available.

BTW, if this code were to be updated, a
time_is_before_eq_jiffies(rq->next_balance) check would be the correct
one to use for the patch. This expands to
time_after_eq(jiffies, rq->next_balance), which is what we want.

So something like:

	if (time_is_before_eq_jiffies(rq->next_balance)) {
		raw_spin_lock_irq(&rq->lock);
		update_rq_clock(rq);
		update_idle_cpu_load(rq);
		raw_spin_unlock_irq(&rq->lock);
		rebalance_domains(rq, CPU_IDLE);
	}

But I don't think this change makes the code logic any clearer.
I prefer time_after_eq(jiffies, rq->next_balance), which is more
readable.

Tim
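
The naming of the *_jiffies helpers discussed in this subthread is easy to get backwards, so here is a small user-space sketch of the two "_eq" variants. It rests on assumptions: `jiffies` is an ordinary variable standing in for the kernel tick counter, the macro bodies are simplified from include/linux/jiffies.h (no typecheck() guards), and balance_due() and check_jiffies_helpers() are hypothetical names used only here. It illustrates that time_is_before_eq_jiffies(a) is true once `a` has been reached, while time_is_after_eq_jiffies(a) is true while `a` is still now or in the future.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's tick counter (assumption: plain variable). */
static unsigned long jiffies;

/* Simplified from include/linux/jiffies.h, typecheck() guards omitted. */
#define time_after_eq(a, b)	((long)((a) - (b)) >= 0)
#define time_before_eq(a, b)	time_after_eq(b, a)

/* true if a is before or equal to jiffies */
#define time_is_before_eq_jiffies(a)	time_after_eq(jiffies, a)
/* true if a is after or equal to jiffies */
#define time_is_after_eq_jiffies(a)	time_before_eq(jiffies, a)

/* Hypothetical wrapper: is a balance scheduled for next_balance due now? */
static inline bool balance_due(unsigned long next_balance)
{
	return time_is_before_eq_jiffies(next_balance);
}

static inline void check_jiffies_helpers(void)
{
	jiffies = 100;

	/* due when next_balance is in the past or exactly now */
	assert(balance_due(90));
	assert(balance_due(100));
	assert(!balance_due(110));

	/* the "after_eq" variant answers the opposite question */
	assert(!time_is_after_eq_jiffies(90UL));
	assert(time_is_after_eq_jiffies(100UL));
	assert(time_is_after_eq_jiffies(110UL));
}
```

Under these assumptions, only the "before_eq" helper matches the intent of the patch, which is consistent with the correction in the message above.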

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Jason Low @ 2014-05-20 21:09 UTC
To: Tim Chen
Cc: Jason Low, Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, May 20, 2014 at 1:59 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
>> On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>>
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index 9b4c4f3..97132db 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>> >
>> >  		rq = cpu_rq(balance_cpu);
>> >
>> > -		raw_spin_lock_irq(&rq->lock);
>> > -		update_rq_clock(rq);
>> > -		update_idle_cpu_load(rq);
>> > -		raw_spin_unlock_irq(&rq->lock);
>> > -
>> > -		rebalance_domains(rq, CPU_IDLE);
>> > +		/*
>> > +		 * If time for next balance is due,
>> > +		 * do the balance.
>> > +		 */
>> > +		if (time_after(jiffies + 1, rq->next_balance)) {
>>
>> Hi Tim,
>>
>> If we want to do idle load balancing only when it is due for a
>> balance, shouldn't the above just be "if (time_after(jiffies,
>> rq->next_balance))"?
>
> If rq->next_balance and jiffies are equal, then the
> time_after(jiffies, rq->next_balance) check will be false and
> you will not do the balance. But you actually want to balance
> in this case, so jiffies + 1 was used.

Hi Tim, Rik

Yes, it makes sense that we want to balance if they are equal. We
may also consider using "if (time_after_eq(jiffies,
rq->next_balance))".

Reviewed-by: Jason Low <jason.low2@hp.com>

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-20 21:12 UTC
To: Jason Low
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, 2014-05-20 at 14:09 -0700, Jason Low wrote:
> On Tue, May 20, 2014 at 1:59 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> > On Tue, 2014-05-20 at 13:51 -0700, Jason Low wrote:
> >> On Tue, May 20, 2014 at 1:17 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> >>
> >> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> > index 9b4c4f3..97132db 100644
> >> > --- a/kernel/sched/fair.c
> >> > +++ b/kernel/sched/fair.c
> >> > @@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
> >> >
> >> >  		rq = cpu_rq(balance_cpu);
> >> >
> >> > -		raw_spin_lock_irq(&rq->lock);
> >> > -		update_rq_clock(rq);
> >> > -		update_idle_cpu_load(rq);
> >> > -		raw_spin_unlock_irq(&rq->lock);
> >> > -
> >> > -		rebalance_domains(rq, CPU_IDLE);
> >> > +		/*
> >> > +		 * If time for next balance is due,
> >> > +		 * do the balance.
> >> > +		 */
> >> > +		if (time_after(jiffies + 1, rq->next_balance)) {
> >>
> >> Hi Tim,
> >>
> >> If we want to do idle load balancing only when it is due for a
> >> balance, shouldn't the above just be "if (time_after(jiffies,
> >> rq->next_balance))"?
> >
> > If rq->next_balance and jiffies are equal, then the
> > time_after(jiffies, rq->next_balance) check will be false and
> > you will not do the balance. But you actually want to balance
> > in this case, so jiffies + 1 was used.
>
> Hi Tim, Rik
>
> Yes, it makes sense that we want to balance if they are equal. We
> may also consider using "if (time_after_eq(jiffies,
> rq->next_balance))".

That sounds good. Thanks.

> Reviewed-by: Jason Low <jason.low2@hp.com>

Tim

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Tim Chen @ 2014-05-20 21:39 UTC
To: Jason Low
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Len Brown,
    Russ Anderson, Dimitri Sivanich, Hedi Berriche, Andi Kleen,
    Michel Lespinasse, Rik van Riel, Peter Hurley,
    Linux Kernel Mailing List

On Tue, 2014-05-20 at 14:09 -0700, Jason Low wrote:
> Hi Tim, Rik
>
> Yes, it makes sense that we want to balance if they are equal. We
> may also consider using "if (time_after_eq(jiffies,
> rq->next_balance))".
>
> Reviewed-by: Jason Low <jason.low2@hp.com>

Jason & Rik,

Thanks for reviewing the patch. I've updated the patch below as
suggested.

Tim

---
From: Tim Chen <tim.c.chen@linux.intel.com>
Subject: [PATCH v2] sched: Reduce the rate of needless idle load balancing

The current no_hz idle load balancer does load balancing for *all* idle
cpus, even though the time at which a particular idle cpu is due for
load balancing could still be a while in the future. This introduces a
much higher load balancing rate than is necessary. The patch changes
the behavior by doing idle load balancing on behalf of an idle cpu
only when it is due for load balancing.

On SGI's systems with over 3000 cores, the cpu responsible for idle
balancing got overwhelmed with idle balancing and introduced a lot of
OS noise to workloads. This patch fixes the issue.

Acked-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/fair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9b4c4f3..b826c3a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6764,12 +6764,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 		rq = cpu_rq(balance_cpu);
 
-		raw_spin_lock_irq(&rq->lock);
-		update_rq_clock(rq);
-		update_idle_cpu_load(rq);
-		raw_spin_unlock_irq(&rq->lock);
-
-		rebalance_domains(rq, CPU_IDLE);
+		/*
+		 * If time for next balance is due,
+		 * do the balance.
+		 */
+		if (time_after_eq(jiffies, rq->next_balance)) {
+			raw_spin_lock_irq(&rq->lock);
+			update_rq_clock(rq);
+			update_idle_cpu_load(rq);
+			raw_spin_unlock_irq(&rq->lock);
+			rebalance_domains(rq, CPU_IDLE);
+		}
 
 		if (time_after(this_rq->next_balance, rq->next_balance))
 			this_rq->next_balance = rq->next_balance;
-- 
1.7.11.7
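
To make the effect of the v2 check concrete, here is a toy user-space model of one pass over the idle cpus. It is a sketch under loud assumptions, not the scheduler: struct fake_rq, its `interval` field and idle_balance_pass() are invented for this illustration, and the stand-in for rebalance_domains() only pushes next_balance forward by a fixed interval. With the gate enabled, a pass touches only the runqueues whose next_balance has been reached; with it disabled, every idle runqueue is balanced on every pass, which models the behaviour the patch description complains about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified from include/linux/jiffies.h, typecheck() guard omitted. */
#define time_after_eq(a, b)	((long)((a) - (b)) >= 0)

/* Hypothetical, heavily reduced stand-in for struct rq. */
struct fake_rq {
	unsigned long next_balance;	/* tick at which a balance is due */
	unsigned long interval;		/* ticks between balances (assumed fixed) */
};

/*
 * One pass over n idle runqueues at time `now`. Returns how many of
 * them were balanced. gate == false models the pre-patch loop that
 * balances every idle cpu regardless of next_balance.
 */
static inline unsigned int idle_balance_pass(struct fake_rq *rqs, size_t n,
					     unsigned long now, bool gate)
{
	unsigned int balanced = 0;

	assert(rqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (gate && !time_after_eq(now, rqs[i].next_balance))
			continue;	/* not due yet: skip the expensive work */

		/* stand-in for rebalance_domains(): schedule the next balance */
		rqs[i].next_balance = now + rqs[i].interval;
		balanced++;
	}
	return balanced;
}
```

In this model, three runqueues due at ticks 100, 105 and 120 cost one balance each at ticks 104 and 105 with the gate, versus three balances per pass without it. Real intervals vary per scheduling domain, so the actual savings depend on the machine.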

* Re: [PATCH] sched: Reduce the rate of needless idle load balancing
From: Peter Zijlstra @ 2014-05-21 6:38 UTC
To: Tim Chen
Cc: Jason Low, Ingo Molnar, Andrew Morton, Len Brown, Russ Anderson,
    Dimitri Sivanich, Hedi Berriche, Andi Kleen, Michel Lespinasse,
    Rik van Riel, Peter Hurley, Linux Kernel Mailing List

On Tue, May 20, 2014 at 02:39:27PM -0700, Tim Chen wrote:
> From: Tim Chen <tim.c.chen@linux.intel.com>
> Subject: [PATCH v2] sched: Reduce the rate of needless idle load balancing
>
> The current no_hz idle load balancer does load balancing for *all* idle
> cpus, even though the time at which a particular idle cpu is due for
> load balancing could still be a while in the future. This introduces a
> much higher load balancing rate than is necessary. The patch changes
> the behavior by doing idle load balancing on behalf of an idle cpu
> only when it is due for load balancing.
>
> On SGI's systems with over 3000 cores, the cpu responsible for idle
> balancing got overwhelmed with idle balancing and introduced a lot of
> OS noise to workloads. This patch fixes the issue.
>
> Acked-by: Russ Anderson <rja@sgi.com>
> Reviewed-by: Rik van Riel <riel@redhat.com>
> Reviewed-by: Jason Low <jason.low2@hp.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---

Thanks!

* [tip:sched/core] sched/balancing: Reduce the rate of needless idle load balancing
From: tip-bot for Tim Chen @ 2014-06-05 14:34 UTC
To: linux-tip-commits
Cc: mingo, torvalds, peterz, peter, jason.low2, sivanich, riel, akpm,
    tglx, len.brown, linux-kernel, hpa, andi, tim.c.chen, hedi, rja,
    walken

Commit-ID:  ed61bbc69c773465782476c7e5869fa5607fa73a
Gitweb:     http://git.kernel.org/tip/ed61bbc69c773465782476c7e5869fa5607fa73a
Author:     Tim Chen <tim.c.chen@linux.intel.com>
AuthorDate: Tue, 20 May 2014 14:39:27 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 5 Jun 2014 11:52:01 +0200

sched/balancing: Reduce the rate of needless idle load balancing

The current no_hz idle load balancer does load balancing for *all* idle
cpus, even though the time at which a particular idle cpu is due for
load balancing could still be a while in the future. This introduces a
much higher load balancing rate than is necessary. The patch changes
the behavior by doing idle load balancing on behalf of an idle cpu
only when it is due for load balancing.

On SGI's systems with over 3000 cores, the cpu responsible for idle
balancing got overwhelmed with idle balancing and introduced a lot of
OS noise to workloads. This patch fixes the issue.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400621967.2970.280.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b71d8c3..7a0c000 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7193,12 +7193,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 		rq = cpu_rq(balance_cpu);
 
-		raw_spin_lock_irq(&rq->lock);
-		update_rq_clock(rq);
-		update_idle_cpu_load(rq);
-		raw_spin_unlock_irq(&rq->lock);
-
-		rebalance_domains(rq, CPU_IDLE);
+		/*
+		 * If time for next balance is due,
+		 * do the balance.
+		 */
+		if (time_after_eq(jiffies, rq->next_balance)) {
+			raw_spin_lock_irq(&rq->lock);
+			update_rq_clock(rq);
+			update_idle_cpu_load(rq);
+			raw_spin_unlock_irq(&rq->lock);
+			rebalance_domains(rq, CPU_IDLE);
+		}
 
 		if (time_after(this_rq->next_balance, rq->next_balance))
 			this_rq->next_balance = rq->next_balance;

Thread overview: 14+ messages (end of thread, newest ~2014-06-05 14:35 UTC)

2014-05-20 20:17 [PATCH] sched: Reduce the rate of needless idle load balancing Tim Chen
2014-05-20 20:51 ` Jason Low
2014-05-20 20:58   ` Rik van Riel
2014-05-20 20:59   ` Tim Chen
2014-05-20 21:04     ` Tim Chen
2014-05-21  1:15       ` Joe Perches
2014-05-21 16:37         ` Tim Chen
2014-05-21 18:26           ` Davidlohr Bueso
2014-05-21 18:49             ` Tim Chen
2014-05-20 21:09     ` Jason Low
2014-05-20 21:12       ` Tim Chen
2014-05-20 21:39       ` Tim Chen
2014-05-21  6:38         ` Peter Zijlstra
2014-06-05 14:34         ` [tip:sched/core] sched/balancing: " tip-bot for Tim Chen