From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, Paul Mackerras <paulus@samba.org>,
Anton Blanchard <anton@samba.org>,
David Rientjes <rientjes@google.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC,1/2] powerpc/numa: fix cpu_to_node() usage during boot
Date: Wed, 8 Jul 2015 16:16:23 -0700
Message-ID: <20150708231623.GB44862@linux.vnet.ibm.com>
In-Reply-To: <20150708040056.948A1140770@ozlabs.org>
On 08.07.2015 [14:00:56 +1000], Michael Ellerman wrote:
> On Thu, 2015-07-02 at 23:02:02 UTC, Nishanth Aravamudan wrote:
> > Much like on x86, now that powerpc is using USE_PERCPU_NUMA_NODE_ID, we
> > have an ordering issue during boot with early calls to cpu_to_node().
>
> "now that .." implies we changed something and broke this. What commit was
> it that changed the behaviour?
Well, that's something I'm still trying to unearth. In the commits
before and after adding USE_PERCPU_NUMA_NODE_ID (8c272261194d
"powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), the dmesg reports:
pcpu-alloc: [0] 0 1 2 3 4 5 6 7
At least prior to 8c272261194d, this might have been due to the old
powerpc-specific cpu_to_node():
static inline int cpu_to_node(int cpu)
{
	int nid;

	nid = numa_cpu_lookup_table[cpu];

	/*
	 * During early boot, the numa-cpu lookup table might not have been
	 * setup for all CPUs yet. In such cases, default to node 0.
	 */
	return (nid < 0) ? 0 : nid;
}
which might imply that no one cares or that simply no one noticed.
> > The value returned by those calls now depends on the per-cpu area being
> > set up, but that is not guaranteed to be the case during boot. Instead,
> > we need to add an early_cpu_to_node() which doesn't use the per-CPU area
> > and call that from certain spots that are known to invoke cpu_to_node()
> > before the per-CPU areas are configured.
> >
> > On an example 2-node NUMA system with the following topology:
> >
> > available: 2 nodes (0-1)
> > node 0 cpus: 0 1 2 3
> > node 0 size: 2029 MB
> > node 0 free: 1753 MB
> > node 1 cpus: 4 5 6 7
> > node 1 size: 2045 MB
> > node 1 free: 1945 MB
> > node distances:
> > node 0 1
> > 0: 10 40
> > 1: 40 10
> >
> > we currently emit at boot:
> >
> > [ 0.000000] pcpu-alloc: [0] 0 1 2 3 [0] 4 5 6 7
> >
> > After this commit, we correctly emit:
> >
> > [ 0.000000] pcpu-alloc: [0] 0 1 2 3 [1] 4 5 6 7
>
>
> So it looks fairly sane, and I guess it's a bug fix.
>
> But I'm a bit reluctant to put it in straight away without some time in next.
I'm fine with that -- it could use some more extensive testing,
admittedly (so far I have only been able to verify that the pcpu areas
are being allocated on the right node).
I still need to test with hotplug and things like that. Hence the RFC.
> It looks like the symptom is that the per-cpu areas are all allocated on node
> 0, is that all that goes wrong?
Yes, that's the symptom. I cc'd a few folks to see if they could help
indicate the performance implications of such a setup -- sorry, I should
have been more explicit about that.
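
For anyone following along, the ordering problem can be modelled in
userspace. This is only a rough sketch under stated assumptions, not the
patch itself: percpu_numa_node[], setup_percpu_numa_nodes() and
cpus_on_node() are hypothetical stand-ins, the table contents are taken
from the 2-node example topology quoted above, and early_cpu_to_node()
is assumed to read the lookup table the same way the old
powerpc-specific cpu_to_node() did.

```c
#include <assert.h>

#define NR_CPUS 8

/* Stand-in for numa_cpu_lookup_table, filled from the example topology. */
static int numa_cpu_lookup_table[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Hypothetical stand-in for the per-CPU node id; all zero until set up. */
static int percpu_numa_node[NR_CPUS];

/* Models a lookup that depends on the per-CPU area being populated. */
static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return percpu_numa_node[cpu];
}

/* Lookup that avoids the per-CPU area; defaults to node 0 if unset. */
static int early_cpu_to_node(int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < NR_CPUS);
	nid = numa_cpu_lookup_table[cpu];

	return (nid < 0) ? 0 : nid;
}

/* Models the later boot step after which cpu_to_node() is reliable. */
static void setup_percpu_numa_nodes(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		percpu_numa_node[cpu] = early_cpu_to_node(cpu);
}

/* Count how many CPUs the given lookup function places on node nid. */
static int cpus_on_node(int (*lookup)(int), int nid)
{
	int count = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (lookup(cpu) == nid)
			count++;

	return count;
}
```

In this model, calling cpus_on_node(cpu_to_node, 0) before
setup_percpu_numa_nodes() puts all eight CPUs on node 0, which mirrors
the "[0] 0 1 2 3 [0] 4 5 6 7" line, while early_cpu_to_node() already
splits them four and four.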
Thanks,
Nish