From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, Paul Mackerras <paulus@samba.org>,
Anton Blanchard <anton@samba.org>,
David Rientjes <rientjes@google.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC,1/2] powerpc/numa: fix cpu_to_node() usage during boot
Date: Wed, 8 Jul 2015 16:16:23 -0700
Message-ID: <20150708231623.GB44862@linux.vnet.ibm.com>
In-Reply-To: <20150708040056.948A1140770@ozlabs.org>
On 08.07.2015 [14:00:56 +1000], Michael Ellerman wrote:
> On Thu, 2015-02-07 at 23:02:02 UTC, Nishanth Aravamudan wrote:
> > Much like on x86, now that powerpc is using USE_PERCPU_NUMA_NODE_ID, we
> > have an ordering issue during boot with early calls to cpu_to_node().
>
> "now that .." implies we changed something and broke this. What commit was
> it that changed the behaviour?
Well, that's something I'm still trying to unearth. In the commits
before and after adding USE_PERCPU_NUMA_NODE_ID (8c272261194d
"powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), the dmesg reports:

    pcpu-alloc: [0] 0 1 2 3 4 5 6 7

At least prior to 8c272261194d, this might have been due to the old
powerpc-specific cpu_to_node():
static inline int cpu_to_node(int cpu)
{
	int nid;

	nid = numa_cpu_lookup_table[cpu];

	/*
	 * During early boot, the numa-cpu lookup table might not have been
	 * setup for all CPUs yet. In such cases, default to node 0.
	 */
	return (nid < 0) ? 0 : nid;
}
which might imply that no one cares or that simply no one noticed.
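
To make the ordering problem concrete, here is a minimal userspace sketch,
not the kernel code. It assumes the per-CPU node value reads as 0 until the
per-CPU areas are populated, and that the lookup table holds -1 for CPUs
that are not set up yet. The names percpu_numa_node, percpu_cpu_to_node(),
table_cpu_to_node() and setup_percpu_numa_nodes() are made up for
illustration.

```c
#include <assert.h>

#define NR_CPUS 8

/* Stand-in for numa_cpu_lookup_table; -1 means "not set up yet". */
static int numa_cpu_lookup_table[NR_CPUS] = { 0, 0, 0, 0, 1, 1, -1, -1 };

/* Stand-in for the per-CPU numa_node value; zero until explicitly set. */
static int percpu_numa_node[NR_CPUS];

/* Hypothetical model of a lookup that depends on the per-CPU area. */
static int percpu_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return percpu_numa_node[cpu];
}

/* Model of the table-based lookup with the node 0 fallback. */
static int table_cpu_to_node(int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < NR_CPUS);
	nid = numa_cpu_lookup_table[cpu];
	return (nid < 0) ? 0 : nid;
}

/* Model of the later boot step that fills in the per-CPU values. */
static void setup_percpu_numa_nodes(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		percpu_numa_node[cpu] = table_cpu_to_node(cpu);
}
```

In this model, percpu_cpu_to_node(4) returns 0 until
setup_percpu_numa_nodes() has run, while table_cpu_to_node(4) already
returns 1. A CPU whose table entry is still -1 falls back to node 0 either
way, which is the case the old comment describes.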
> > The value returned by those calls now depends on the per-cpu area being
> > setup, but that is not guaranteed to be the case during boot. Instead,
> > we need to add an early_cpu_to_node() which doesn't use the per-CPU area
> > and call that from certain spots that are known to invoke cpu_to_node()
> > before the per-CPU areas are configured.
> >
> > On an example 2-node NUMA system with the following topology:
> >
> > available: 2 nodes (0-1)
> > node 0 cpus: 0 1 2 3
> > node 0 size: 2029 MB
> > node 0 free: 1753 MB
> > node 1 cpus: 4 5 6 7
> > node 1 size: 2045 MB
> > node 1 free: 1945 MB
> > node distances:
> > node 0 1
> > 0: 10 40
> > 1: 40 10
> >
> > we currently emit at boot:
> >
> > [ 0.000000] pcpu-alloc: [0] 0 1 2 3 [0] 4 5 6 7
> >
> > After this commit, we correctly emit:
> >
> > [ 0.000000] pcpu-alloc: [0] 0 1 2 3 [1] 4 5 6 7
>
>
> So it looks fairly sane, and I guess it's a bug fix.
>
> But I'm a bit reluctant to put it in straight away without some time in next.
I'm fine with that -- it could use some more extensive testing,
admittedly (so far I have only been able to verify that the pcpu areas
are being allocated on the right node).

I still need to test with hotplug and things like that. Hence the RFC.
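
For checking the pcpu-alloc line mechanically rather than by eye, a small
userspace sketch follows. It assumes the bracketed number is the allocation
group for the CPU numbers that follow it, as the example output suggests,
and that anything else on the line can be skipped. The function name
pcpu_group_of_cpu() is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the bracketed group that 'cpu' is listed under in a
 * "pcpu-alloc:" dmesg line, or -1 if the line or the CPU is not found.
 */
int pcpu_group_of_cpu(const char *line, int cpu)
{
	const char *p;
	int group = -1;

	assert(line != NULL);

	/* Skip the timestamp and anything else before the tag. */
	p = strstr(line, "pcpu-alloc:");
	if (!p)
		return -1;
	p += strlen("pcpu-alloc:");

	while (*p) {
		char *end;
		long val;

		if (*p == '[') {
			/* "[N]" starts a new group. */
			val = strtol(p + 1, &end, 10);
			if (end == p + 1 || *end != ']')
				return -1;
			group = (int)val;
			p = end + 1;
		} else if (isdigit((unsigned char)*p)) {
			/* A bare number is a CPU in the current group. */
			val = strtol(p, &end, 10);
			if (val == cpu)
				return group;
			p = end;
		} else {
			p++;
		}
	}
	return -1;
}
```

Fed the two lines from the patch description, this reports group 0 for CPU
5 on the "before" line and group 1 on the "after" line.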
> It looks like the symptom is that the per-cpu areas are all allocated on node
> 0, is that all that goes wrong?
Yes, that's the symptom. I cc'd a few folks to see if they could help
indicate the performance implications of such a setup -- sorry, I should
have been more explicit about that.
Thanks,
Nish