Re: [PATCH] sched: make update_sd_pick_busiest return true on a busier sd

From: Rik van Riel <riel@redhat.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michael Neuling <mikey@neuling.org>,
	Ingo Molnar <mingo@kernel.org>, Paul Turner <pjt@google.com>,
	jhladky@redhat.com, ktkhai@parallels.com,
	tim.c.chen@linux.intel.com,
	Nicolas Pitre <nicolas.pitre@linaro.org>
Subject: Re: [PATCH] sched: make update_sd_pick_busiest return true on a busier sd
Date: Fri, 25 Jul 2014 10:02:43 -0400
Message-ID: <53D26383.60707@redhat.com>
In-Reply-To: <CAKfTPtDdAzZcMA5bJvhyrdH9J1F69jMy5q3w5xc4t+PSKPQ0eA@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/23/2014 03:41 AM, Vincent Guittot wrote:

> Regarding your issue with "perf bench numa mem" that is not spread
> on all nodes, SD_PREFER_SIBLING flag (of DIE level) should do the
> job by reducing the capacity of "not local DIE" group at NUMA
> level to 1 task during the load balance computation. So you should
> have 1 task per sched_group at NUMA level.

Looking at the code some more, it is clear why this does not
happen. If sd->flags & SD_NUMA, then SD_PREFER_SIBLING will
never be set.

On a related note, that part of the load balancing code probably
needs to be rewritten to deal with unequal group_capacity_factors
anyway.

Say that one group has a group_capacity_factor twice that of
another group.

The group with the smaller group_capacity_factor is overloaded
by a factor of 1.3. The larger group is loaded by a factor of 0.8.
In absolute terms the larger group then carries more load than
the first group (0.8 * 2 = 1.6 versus 1.3 * 1 = 1.3), so the
current code in update_sd_pick_busiest will not select the
overloaded group as the busiest one, because it does not scale
load with the capacity...

static bool update_sd_pick_busiest(struct lb_env *env,
                                   struct sd_lb_stats *sds,
                                   struct sched_group *sg,
                                   struct sg_lb_stats *sgs)
{
        if (sgs->avg_load <= sds->busiest_stat.avg_load)
                return false;

I believe we may need to factor the group_capacity_factor
into this calculation, in order to properly identify which
group is busiest.
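
To make the example above concrete, here is a minimal user-space
sketch, not the kernel's code. The struct toy_group and both helper
functions are hypothetical names made up for illustration, and load
is expressed in arbitrary units where a capacity factor of 1
corresponds to 100 units. It contrasts a raw load comparison with
one that scales load by the capacity factor.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-group statistics; not the kernel's sg_lb_stats. */
struct toy_group {
	unsigned long load;            /* total load, arbitrary units */
	unsigned long capacity_factor; /* relative capacity of the group */
};

/* Raw comparison: looks at load only and ignores capacity. */
static bool busier_raw(const struct toy_group *a, const struct toy_group *b)
{
	return a->load > b->load;
}

/*
 * Capacity-aware comparison: a is busier than b when
 *   a->load / a->capacity_factor > b->load / b->capacity_factor.
 * Cross-multiplying avoids integer division and its rounding.
 */
static bool busier_scaled(const struct toy_group *a, const struct toy_group *b)
{
	return a->load * b->capacity_factor > b->load * a->capacity_factor;
}
```

With a small group of { .load = 130, .capacity_factor = 1 } and a
large group of { .load = 160, .capacity_factor = 2 }, busier_raw()
reports that the small group is not busier than the large one, while
busier_scaled() reports that it is.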

However, if we do that we may need to get rid of the
SD_PREFER_SIBLING hack that forces group_capacity_factor
to 1 on domains that have SD_PREFER_SIBLING set.

I suspect that should be ok though, if we make sure
update_sd_pick_busiest does the right thing...
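
For reference, the effect of the SD_PREFER_SIBLING clamp described
above can be modelled in a few lines. This is a toy model only: the
TOY_PREFER_SIBLING bit value and the function toy_capacity_factor
are made up for illustration, and any further conditions the real
code checks before clamping are not modelled here.

```c
/* Hypothetical flag bit for this sketch; not the kernel's SD_PREFER_SIBLING value. */
#define TOY_PREFER_SIBLING 0x1u

/*
 * Return the capacity factor the balancer would work with:
 * clamped down to 1 when the (toy) prefer-sibling flag is set,
 * unchanged otherwise. A factor of 0 stays 0.
 */
static unsigned int toy_capacity_factor(unsigned int domain_flags,
					unsigned int factor)
{
	if ((domain_flags & TOY_PREFER_SIBLING) && factor > 1)
		return 1;
	return factor;
}
```

Once the factor has been clamped to 1, a comparison that scales load
by the capacity factor can presumably no longer tell a large group
from a small one, which is why the two changes interact.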

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT0mOCAAoJEM553pKExN6DHq4H/2THfH33d+JYvfOq95OpGLaD
HATAp8Dv0kTiGjnbZrHPp8TqqgLLXuM6HhLvsvURuhoJw6F/nOX6qOQWEtjcMyYp
omShkDSLnPjs/0Iwf9vNocT7K7Sn3Gk0hOj6+ICW7wchyug8JYtuiHunP8pYrpzW
G6l2qHMRqRs5mSENY/uWwH9qh6Z6jcfDoDDDKRTNBe0z67FzwMnX1IYCUA6XOBsZ
iRdXe8E0CIgio+ek8HVzRm5sUlkRyfJpTXJj+pemVJhTrNCCbMGTHxzADU4Ngc8S
+JQ+G6bsHz9R4pffsuzYFbL0avK0mm5SrjCIatE7MX171dQJ1cKpju+fAmnwuNg=
=EAzG
-----END PGP SIGNATURE-----
