All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, Venki Pallipadi <venki@google.com>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Mike Galbraith <efault@gmx.de>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Tim Chen <tim.c.chen@linux.jf.intel.com>,
	alex.shi@intel.com
Subject: Re: [patch 3/6] sched, nohz: sched group, domain aware nohz idle load balancing
Date: Thu, 24 Nov 2011 12:47:43 +0100	[thread overview]
Message-ID: <1322135263.2921.12.camel@twins> (raw)
In-Reply-To: <20111118230553.995756330@sbsiddha-desk.sc.intel.com>

On Fri, 2011-11-18 at 15:03 -0800, Suresh Siddha wrote:
>  static inline int nohz_kick_needed(struct rq *rq, int cpu)
>  {
>         unsigned long now = jiffies;
>         struct sched_domain *sd;
>  
> +       if (unlikely(idle_cpu(cpu)))
> +               return 0;
> +
>         /*
>          * We were recently in tickless idle mode. At the first busy tick
>          * after returning from idle, we will update the busy stats.
> @@ -5120,36 +5047,43 @@ static inline int nohz_kick_needed(struc
>         if (unlikely(test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))) {
>                 clear_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu));
>  
> +               cpumask_clear_cpu(cpu, nohz.idle_cpus_mask);
> +               atomic_dec(&nohz.nr_cpus);
> +
>                 for_each_domain(cpu, sd)
>                         atomic_inc(&sd->groups->sgp->nr_busy_cpus);
>         }
>  
> +       /*
> +        * None are in tickless mode and hence no need for NOHZ idle load
> +        * balancing.
> +        */
> +       if (likely(!atomic_read(&nohz.nr_cpus)))
>                 return 0;
>  
> +       if (time_before(now, nohz.next_balance))
>                 return 0;
>  
> +       if (rq->nr_running >= 2)
> +               goto need_kick;
>  
> +       for_each_domain(cpu, sd) {
> +               struct sched_group *sg = sd->groups;
> +               struct sched_group_power *sgp = sg->sgp;
> +               int nr_busy = atomic_read(&sgp->nr_busy_cpus);
> +
> +               if (nr_busy > 1 && (nr_busy * SCHED_LOAD_SCALE > sgp->power))
> +                       goto need_kick;

This looks wrong, its basically always true for a box with HT.

sgp->power is a measure of how much compute power this group has, its
basic form is sg->weight * SCHED_POWER_SCALE and is reduced from there;
HT siblings get less since they're not as powerful as two actual cores
and we deduct time spend on RT-tasks and IRQs etc..

So how does comparing the load of non-nohz cpus to that make sense?

> +
> +               if (sd->flags & SD_ASYM_PACKING && nr_busy != sg->group_weight
> +                   && (cpumask_first_and(nohz.idle_cpus_mask,
> +                                         sched_domain_span(sd)) < cpu))
> +                       goto need_kick;
>         }
> +
>         return 0;
> +need_kick:
> +       return 1;
>  } 

  reply	other threads:[~2011-11-24 11:48 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-18 23:03 [patch 0/6] sched, nohz: load balancing patches Suresh Siddha
2011-11-18 23:03 ` [patch 1/6] sched, nohz: introduce nohz_flags in the struct rq Suresh Siddha
2011-11-24 10:24   ` Peter Zijlstra
2011-11-28 23:59     ` Suresh Siddha
2011-11-29  9:47       ` Peter Zijlstra
2011-11-18 23:03 ` [patch 2/6] sched, nohz: track nr_busy_cpus in the sched_group_power Suresh Siddha
2011-11-18 23:03 ` [patch 3/6] sched, nohz: sched group, domain aware nohz idle load balancing Suresh Siddha
2011-11-24 11:47   ` Peter Zijlstra [this message]
2011-11-28 23:51     ` Suresh Siddha
2011-11-29  9:44       ` Peter Zijlstra
2011-12-01  1:03         ` Suresh Siddha
2011-12-01  1:17         ` Suresh Siddha
2011-12-01  8:36           ` Peter Zijlstra
2011-11-24 11:53   ` Peter Zijlstra
2011-11-28 23:58     ` Suresh Siddha
2011-11-29  9:45       ` Peter Zijlstra
2011-11-18 23:03 ` [patch 4/6] sched, nohz: cleanup the find_new_ilb() using sched groups nr_busy_cpus Suresh Siddha
2011-11-18 23:03 ` [patch 5/6] sched: disable sched feature TTWU_QUEUE by default Suresh Siddha
2011-11-19  4:30   ` Mike Galbraith
2011-11-19  4:41     ` Mike Galbraith
2011-11-18 23:03 ` [patch 6/6] sched: fix the sched group node allocation for SD_OVERLAP domain Suresh Siddha
2011-12-06  9:51   ` [tip:sched/core] sched: Fix the sched group node allocation for SD_OVERLAP domains tip-bot for Suresh Siddha

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1322135263.2921.12.camel@twins \
    --to=peterz@infradead.org \
    --cc=alex.shi@intel.com \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=suresh.b.siddha@intel.com \
    --cc=tim.c.chen@linux.jf.intel.com \
    --cc=vatsa@linux.vnet.ibm.com \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.