From: Mel Gorman <mgorman@techsingularity.net>
To: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org,
	"Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com>,
	"Lendacky, Thomas" <Thomas.Lendacky@amd.com>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH v3] sched/topology: Improve load balancing on AMD EPYC
Date: Tue, 23 Jul 2019 12:42:48 +0100
Message-ID: <20190723114248.GJ24383@techsingularity.net>
In-Reply-To: <20190723104830.26623-1-matt@codeblueprint.co.uk>

On Tue, Jul 23, 2019 at 11:48:30AM +0100, Matt Fleming wrote:
> SD_BALANCE_{FORK,EXEC} and SD_WAKE_AFFINE are stripped in sd_init()
> for any sched domains with a NUMA distance greater than 2 hops
> (RECLAIM_DISTANCE). The idea being that it's expensive to balance
> across domains that far apart.
>
> However, as is rather unfortunately explained in
>
> commit 32e45ff43eaf ("mm: increase RECLAIM_DISTANCE to 30")
>
> the value for RECLAIM_DISTANCE is based on node distance tables from
> 2011-era hardware.
>
> Current AMD EPYC machines have the following NUMA node distances:
>
> node distances:
> node   0   1   2   3   4   5   6   7
>   0:  10  16  16  16  32  32  32  32
>   1:  16  10  16  16  32  32  32  32
>   2:  16  16  10  16  32  32  32  32
>   3:  16  16  16  10  32  32  32  32
>   4:  32  32  32  32  10  16  16  16
>   5:  32  32  32  32  16  10  16  16
>   6:  32  32  32  32  16  16  10  16
>   7:  32  32  32  32  16  16  16  10
>
> where 2 hops is 32.
>
> The result is that the scheduler fails to load balance properly across
> NUMA nodes on different sockets -- 2 hops apart.
>
> For example, pinning 16 busy threads to NUMA nodes 0 (CPUs 0-7) and 4
> (CPUs 32-39) like so,
>
> $ numactl -C 0-7,32-39 ./spinner 16
>
> causes all threads to fork and remain on node 0 until the active
> balancer kicks in after a few seconds and forcibly moves some threads
> to node 4.
>
> Override node_reclaim_distance for AMD Zen.
>
> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
> Cc: "Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "Lendacky, Thomas" <Thomas.Lendacky@amd.com>
> Cc: Borislav Petkov <bp@alien8.de>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

The only caveat I can think of is that a future generation of Zen might
use a magic number other than 32 as its remote distance. If or when
that happens, it'll need additional smarts but, lacking a crystal ball,
we can cross that bridge when we come to it.
--
Mel Gorman
SUSE Labs