From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Dave Hansen <dave@sr71.net>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
"linuxppc-dev@lists.ozlabs.org list"
<linuxppc-dev@lists.ozlabs.org>, Linux MM <linux-mm@kvack.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
nfont@linux.vnet.ibm.com,
Cody P Schafer <cody@linux.vnet.ibm.com>,
Anton Blanchard <anton@samba.org>
Subject: Re: NUMA topology question wrt. d4edc5b6
Date: Fri, 23 May 2014 02:18:05 +0530
Message-ID: <537E6285.3050000@linux.vnet.ibm.com>
In-Reply-To: <20140521200451.GB5755@linux.vnet.ibm.com>
[ Adding a few more CC's ]
On 05/22/2014 01:34 AM, Nishanth Aravamudan wrote:
> Hi Srivatsa,
>
> After d4edc5b6 ("powerpc: Fix the setup of CPU-to-Node mappings during
> CPU online"), cpu_to_node() looks like:
>
> static inline int cpu_to_node(int cpu)
> {
> 	int nid;
>
> 	nid = numa_cpu_lookup_table[cpu];
>
> 	/*
> 	 * During early boot, the numa-cpu lookup table might not have been
> 	 * setup for all CPUs yet. In such cases, default to node 0.
> 	 */
> 	return (nid < 0) ? 0 : nid;
> }
>
> However, I'm curious if this is correct in all cases. I have seen
> several LPARs that do not have any CPUs on node 0. In fact, because node
> 0 is statically set online in the initialization of the N_ONLINE
> nodemask, 0 is always present to Linux, whether it is present on the
> system. I'm not sure what the best thing to do here is, but I'm curious
> if you have any ideas? I would like to remove the static initialization
> of node 0, as it's confusing to users to see an empty node (particularly
> when it's completely separate in the numbering from other nodes), but
> we trip a panic (refer to:
> http://www.spinics.net/lists/linux-mm/msg73321.html).
>
Ah, I see. I didn't have any particular reason to default it to zero.
I did that only because the code before this patch behaved the same way:
numa_cpu_lookup_table[] is a global array, so it is zero-initialized, and
reading it before numa_setup_cpu() populates it returns 0. I retained that
behaviour with the above conditional.
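
To make the zero-initialization point concrete, here is a minimal user-space
sketch (not kernel code). The names DEMO_NR_CPUS, demo_table and demo_lookup
are hypothetical stand-ins for the kernel's table and cpu_to_node(); the only
thing it relies on is the C rule that objects with static storage duration
start out as zero.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Static storage duration: C guarantees every element starts as 0. */
static int demo_table[DEMO_NR_CPUS];

/* Stand-in for cpu_to_node(): a negative entry falls back to node 0. */
static int demo_lookup(int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	nid = demo_table[cpu];

	return (nid < 0) ? 0 : nid;
}
```

Calling demo_lookup() before anything has written to demo_table yields 0, and
so does an entry explicitly marked -1, which is why either way an early caller
ends up on node 0.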

Will something like the below [totally untested] patch solve the boot panic?
I understand that, as of today, first_online_node will still pick 0 since
N_ONLINE is initialized statically, but combined with your proposed change to
that init code, I guess the following patch should avoid the boot panic.

[ But note that first_online_node is hard-coded to 0 if MAX_NUMNODES == 1.
So we'll have to fix that if powerpc can have a single-node system whose node
is numbered something other than 0. Can that happen as well? ]
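
As a rough user-space model of the fallback being proposed (again, not kernel
code, and just as untested against a real boot), the sketch below uses a plain
bitmask in place of the N_ONLINE nodemask. All model_* and MODEL_* names are
hypothetical, and the -1 initial entries are an assumption of the model. It
also mirrors the single-node special case mentioned above, where the answer is
fixed at 0.

```c
#include <assert.h>

#define MODEL_MAX_NUMNODES 8	/* hypothetical node limit for this sketch */
#define MODEL_NR_CPUS      4	/* hypothetical CPU count */

/* Bit n set means node n is online; a stand-in for the N_ONLINE nodemask. */
static unsigned int model_online_mask;

/* Per-CPU node ids; -1 marks a CPU whose mapping is not set up yet. */
static int model_table[MODEL_NR_CPUS] = { -1, -1, -1, -1 };

/* Lowest-numbered online node, or MODEL_MAX_NUMNODES if none is online. */
static int model_first_online_node(void)
{
#if MODEL_MAX_NUMNODES > 1
	int nid;

	for (nid = 0; nid < MODEL_MAX_NUMNODES; nid++)
		if (model_online_mask & (1u << nid))
			return nid;
	return MODEL_MAX_NUMNODES;
#else
	return 0;	/* single-node builds: the answer is fixed at 0 */
#endif
}

/* Stand-in for the patched cpu_to_node(). */
static int model_cpu_to_node(int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	nid = model_table[cpu];

	return (nid < 0) ? model_first_online_node() : nid;
}
```

With only nodes 1 and 2 online in the mask, an unmapped CPU resolves to node 1
rather than node 0. As soon as bit 0 is set in the mask, the fallback goes back
to 0, which is the situation today with N_ONLINE statically initialized.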

And regarding your question about the best way to fix the Linux MM's
assumption about node 0: I'm not really sure, since I don't know how deeply
the MM subsystem is intertwined with this assumption or what it would take
to cure that :-(

Regards,
Srivatsa S. Bhat

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index c920215..58e6469 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -18,6 +18,7 @@ struct device_node;
  */
 #define RECLAIM_DISTANCE 10
 
+#include <linux/nodemask.h>
 #include <asm/mmzone.h>
 
 static inline int cpu_to_node(int cpu)
@@ -30,7 +31,7 @@ static inline int cpu_to_node(int cpu)
 	 * During early boot, the numa-cpu lookup table might not have been
 	 * setup for all CPUs yet. In such cases, default to node 0.
 	 */
-	return (nid < 0) ? 0 : nid;
+	return (nid < 0) ? first_online_node : nid;
 }
 
 #define parent_node(node)	(node)