From: Mel Gorman <mgorman@suse.de>
To: riel@redhat.com
Cc: linux-kernel@vger.kernel.org, jhladky@redhat.com,
peterz@infradead.org, mingo@kernel.org, dedekind1@gmail.com
Subject: Re: [PATCH 2/2] numa,sched: only consider less busy nodes as numa balancing destination
Date: Thu, 28 May 2015 09:29:27 +0100 [thread overview]
Message-ID: <20150528082927.GD13750@suse.de> (raw)
In-Reply-To: <1432753468-7785-3-git-send-email-riel@redhat.com>
On Wed, May 27, 2015 at 03:04:28PM -0400, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
>
> Changeset a43455a1 ("sched/numa: Ensure task_numa_migrate() checks the
> preferred node") fixes an issue where workloads would never converge
> on a fully loaded (or overloaded) system.
>
> However, it introduces a regression on less than fully loaded systems,
> where workloads converge on a few NUMA nodes, instead of properly staying
> spread out across the whole system. This leads to a reduction in available
> memory bandwidth, and usable CPU cache, with predictable performance problems.
>
> The root cause appears to be an interaction between the load balancer and
> NUMA balancing, where the short term load represented by the load balancer
> differs from the long term load the NUMA balancing code would like to base
> its decisions on.
>
> Simply reverting a43455a1 would re-introduce the non-convergence of
> workloads on fully loaded systems, so that is not a good option. As
> an aside, the check done before a43455a1 only applied to a task's
> preferred node, not to other candidate nodes in the system, so the
> converge-on-too-few-nodes problem still happens, just to a lesser
> degree.
>
> Instead, try to compensate for the impedance mismatch between the
> load balancer and NUMA balancing by only ever considering a lesser
> loaded node as a destination for NUMA balancing, regardless of
> whether the task is trying to move to the preferred node, or to another
> node.
>
> This patch also addresses the issue that a system with a single runnable
> thread would never migrate that thread to near its memory, introduced by
> 095bebf61a46 ("sched/numa: Do not move past the balance point if unbalanced").
>
> A test where the main thread creates a large memory area, and spawns
> a worker thread to iterate over the memory (placed on another node
> by select_task_rq_fair), after which the main thread goes to sleep
> and waits for the worker thread to loop over all the memory now sees
> the worker thread migrated to where the memory is, instead of having
> all the memory migrated over like before.
>
> Jirka has run a number of performance tests on several systems:
> single instance SpecJBB 2005 performance is 7-15% higher on a 4 node
> system, with higher gains on systems with more cores per socket.
> Multi-instance SpecJBB 2005 (one per node), linpack, and stream see
> little or no changes with the revert of 095bebf61a46 and this patch.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> Reported-by: Artem Bityutski <dedekind1@gmail.com>
> Reported-by: Jirka Hladky <jhladky@redhat.com>
> Tested-by: Jirka Hladky <jhladky@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2015-05-28 8:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-27 19:04 [PATCH 0/2] numa,sched: resolve conflict between load balancing and NUMA balancing riel
2015-05-27 19:04 ` [PATCH 1/2] revert 095bebf61a46 ("sched/numa: Do not move past the balance point if unbalanced") riel
2015-06-07 17:47 ` [tip:sched/core] Revert " tip-bot for Rik van Riel
2015-05-27 19:04 ` [PATCH 2/2] numa,sched: only consider less busy nodes as numa balancing destination riel
2015-05-28 8:29 ` Mel Gorman [this message]
2015-05-28 11:07 ` Srikar Dronamraju
2015-05-28 13:33 ` Rik van Riel
2015-05-28 13:37 ` Peter Zijlstra
2015-05-28 13:52 ` [PATCH v2 " Rik van Riel
2015-06-07 17:47 ` [tip:sched/core] sched/numa: Only consider less busy nodes as numa balancing destinations tip-bot for Rik van Riel
2015-05-29 7:44 ` [PATCH 0/2] numa,sched: resolve conflict between load balancing and NUMA balancing Artem Bityutskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150528082927.GD13750@suse.de \
--to=mgorman@suse.de \
--cc=dedekind1@gmail.com \
--cc=jhladky@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.