From: Peter Zijlstra <peterz@infradead.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	James Hartsock <hartsjc@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Kirill Tkhai <ktkhai@parallels.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] sched: unused cpu in affine workload
Date: Mon, 4 Apr 2016 10:44:16 +0200
Message-ID: <20160404084416.GV3448@twins.programming.kicks-ass.net>
In-Reply-To: <20160404082302.GB2137@krava.local>

On Mon, Apr 04, 2016 at 10:23:02AM +0200, Jiri Olsa wrote:
> hi,
> we've noticed the following issue in one of our workloads.
> 
> I have a 24-CPU server with the following sched domains:
>   domain 0: (pairs)
>   domain 1: 0-5,12-17 (group1)  6-11,18-23 (group2)
>   domain 2: 0-23 level NUMA
> 
> I run a CPU-hogging workload on the following CPUs:
>   4,6,14,18,19,20,23
> 
> that is:
>   4,14          CPUs from group1
>   6,18,19,20,23 CPUs from group2
> 
> the workload process gets its affinity set up via
> 'taskset -c ${CPUs} workload ...' and forks a child for every CPU
> 
> very often we notice CPUs 4 and 14 running 3 processes of the workload
> while CPUs 6,18,19,20,23 run just 4 processes, leaving one of the
> CPUs from group2 idle
> 
> AFAICS from the code, the reason for this is that the load balancing
> follows the domain setup (topology) and does not take affinity setups
> like this into account. The code in find_busiest_group running on the
> idle CPU from group2 will find group1 as busiest, but its average load
> will be smaller than the one on the local group, so no task is pulled.
> 
> It's obvious that the load balancer follows the sched domain topology.
> However, is there some sched feature I'm missing that could help
> with this? Or do we need to follow the sched domain topology when
> we select CPUs for the workload to get even balancing?

Yeah, this is 'hard', there is some code that tries not to totally blow
up with this but it's all a bit of a mess. See
kernel/sched/fair.c:sg_imbalanced().

The easiest solution is to simply not do this and stick with the topo
like you suggest.

So far I've not come up with a sane/stable solution for this problem.
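
To make the arithmetic in the quoted report concrete, here is a rough
sketch of the comparison as the report describes it. It is a toy model,
not the kernel's code: the real find_busiest_group works with scaled
load and capacity values and imbalance thresholds. The helper names
(avg_load, avg_load_allowed, would_pull) and the exact task placement
in `observed` are assumptions made up for illustration; only the CPU
lists and the 3-versus-4 task split come from the report.

```python
# Toy model of the per-group average-load comparison (NOT kernel code).
GROUP1 = set(range(0, 6)) | set(range(12, 18))    # 0-5,12-17
GROUP2 = set(range(6, 12)) | set(range(18, 24))   # 6-11,18-23
ALLOWED = {4, 6, 14, 18, 19, 20, 23}              # taskset -c mask

def avg_load(tasks_per_cpu, group):
    """Runnable tasks divided by ALL CPUs in the group (affinity ignored)."""
    return sum(tasks_per_cpu.get(cpu, 0) for cpu in group) / len(group)

def avg_load_allowed(tasks_per_cpu, group, allowed):
    """Hypothetical variant: divide only by CPUs the workload may use."""
    usable = group & allowed
    return sum(tasks_per_cpu.get(cpu, 0) for cpu in usable) / len(usable)

def would_pull(local_avg, remote_avg):
    """An idle CPU pulls only if the remote group looks busier than its own."""
    return remote_avg > local_avg

# Assumed placement matching the report: 3 tasks on CPUs 4 and 14,
# 4 tasks spread over group2, CPU 23 left idle.
observed = {4: 2, 14: 1, 6: 1, 18: 1, 19: 1, 20: 1, 23: 0}

g1 = avg_load(observed, GROUP1)   # 3 tasks / 12 CPUs
g2 = avg_load(observed, GROUP2)   # 4 tasks / 12 CPUs
print("topology view:", g1, g2, "pull:", would_pull(g2, g1))

a1 = avg_load_allowed(observed, GROUP1, ALLOWED)  # 3 tasks / 2 CPUs
a2 = avg_load_allowed(observed, GROUP2, ALLOWED)  # 4 tasks / 5 CPUs
print("affinity view:", a1, a2, "pull:", would_pull(a2, a1))
```

In the topology view group1 looks lighter than group2 (3/12 against
4/12), so the idle CPU in group2 sees nothing worth pulling. Counting
only the allowed CPUs reverses the picture (3/2 against 4/5), which is
the mismatch the report points at. The second function is only there to
show that mismatch; it is not a proposed fix.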
