All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rik van Riel <riel@redhat.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michael Neuling <mikey@neuling.org>,
	Ingo Molnar <mingo@kernel.org>, Paul Turner <pjt@google.com>,
	jhladky@redhat.com, ktkhai@parallels.com,
	tim.c.chen@linux.intel.com,
	Nicolas Pitre <nicolas.pitre@linaro.org>
Subject: Re: [PATCH] sched: make update_sd_pick_busiest return true on a busier sd
Date: Fri, 25 Jul 2014 11:13:17 -0400	[thread overview]
Message-ID: <53D2740D.9040609@redhat.com> (raw)
In-Reply-To: <CAKfTPtCjw9LTh_LYT=rRs=Q1gS2mVcz9Ef6E7KHLqFhdm8Y8pA@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/25/2014 11:02 AM, Vincent Guittot wrote:
> On 25 July 2014 16:02, Rik van Riel <riel@redhat.com> wrote: On
> 07/23/2014 03:41 AM, Vincent Guittot wrote:
> 
>>>> Regarding your issue with "perf bench numa mem" that is not
>>>> spread on all nodes, SD_PREFER_SIBLING flag (of DIE level)
>>>> should do the job by reducing the capacity of  "not local
>>>> DIE" group at NUMA level to 1 task during the load balance
>>>> computation. So you should have 1 task per sched_group at
>>>> NUMA level.
>> 
>> Looking at the code some more, it is clear why this does not 
>> happen. If sd->flags & SD_NUMA, then SD_PREFER_SIBLING will never
>> be set.
> 
> I don't have a lot of experience on NUMA system and how their 
> sched_domain topology is described but IIUC, you don't have other 
> sched_domain level than NUMA ones ? otherwise the flag should be 
> present in one of the non NUMA level (SMT, MC or DIE)

The system I am testing on has 3 or 4 sched_domain levels,
one for each HT sibling(?), one for each core, one for
each node/socket, and one parent domain for the whole system.

SD_PREFER_SIBLING should be set at the HT sibling level
and at the core level.

However, it is not set at the levels above that.

That means the SD_PREFER_SIBLING flag does its thing within
each CPU core and between cores on a socket, but not between
NUMA nodes...

>> On a related note, that part of the load balancing code probably 
>> needs to be rewritten to deal with unequal
>> group_capacity_factors anyway.

> AFAICT, sgs->avg_load is weighted by the capacity in
> update_sg_lb_stats

Indeed, I dug into that code after sending the email, and found
that piece of the code just before I read Peter's email pointing
it out to me.

> I'm working on a patchset that get ride of capacity_factor (as 
> mentioned by Peter) and directly uses capacity instead. I should
> send the v4 next week.

I am looking forward to anything that will make this code easier
to follow :)

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT0nQMAAoJEM553pKExN6DeFAIAK7KHHRl+lfWIxiZGRyarYHL
SHJydCA4a9Lkd2D60dULGWY/8ylB8+IMfwv69/jXHZzbxlg7Nu1+da7ZUF3Lx35k
AYxpOhC94eTJvp9KQX2W0nGiDZ0Di7YnWAWdoLsd1kZZDZjd82gtLVh63ossWZlF
hn+YH6E4n0iAe6CZ2PO4QMz7dDYVGzUnKuuQVZKl3DBJWSe6ZvcBhDS4xDT+uUe1
IheA5aQcY8XmAkcXbLs736iBiOCKH6Jts6trJUPaVxw3jkD8lI/CMuMss2dk7RPH
xl3Y8CKridywdtvGN6WrOwzUxVBxaEN1da/VuN7nF4OnoipYApSWwKTBe9wd7rQ=
=AsCN
-----END PGP SIGNATURE-----

  reply	other threads:[~2014-07-25 15:13 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-22 18:45 [PATCH] sched: make update_sd_pick_busiest return true on a busier sd Rik van Riel
2014-07-23  7:41 ` Vincent Guittot
2014-07-25 13:33   ` Rik van Riel
2014-07-25 14:29     ` Vincent Guittot
2014-07-25 14:46       ` Rik van Riel
2014-07-25 14:02   ` Rik van Riel
2014-07-25 14:15     ` Peter Zijlstra
2014-07-25 15:02     ` Vincent Guittot
2014-07-25 15:13       ` Rik van Riel [this message]
2014-07-25 15:27 ` Peter Zijlstra
2014-07-25 15:45   ` Rik van Riel
2014-07-25 16:05     ` Peter Zijlstra
2014-07-25 16:22       ` Rik van Riel
2014-07-25 17:57         ` Vincent Guittot
2014-07-25 19:32           ` [PATCH v2] " Rik van Riel
2014-07-28  8:23             ` Vincent Guittot
2014-07-28 15:04               ` Rik van Riel
2014-07-28 14:30             ` Peter Zijlstra
2014-07-27 23:57   ` [PATCH] " Michael Neuling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53D2740D.9040609@redhat.com \
    --to=riel@redhat.com \
    --cc=jhladky@redhat.com \
    --cc=ktkhai@parallels.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikey@neuling.org \
    --cc=mingo@kernel.org \
    --cc=nicolas.pitre@linaro.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.