From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: devicetree@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
Rob Herring <robh+dt@kernel.org>,
Grant Likely <grant.likely@linaro.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] of: return NUMA_NO_NODE from fallback of_node_to_nid()
Date: Wed, 8 Apr 2015 16:07:40 -0700
Message-ID: <20150408230740.GB53918@linux.vnet.ibm.com>
In-Reply-To: <55255F84.6060608@yandex-team.ru>
On 08.04.2015 [20:04:04 +0300], Konstantin Khlebnikov wrote:
> On 08.04.2015 19:59, Konstantin Khlebnikov wrote:
> >Node 0 might be offline as well as any other numa node,
> >in this case kernel cannot handle memory allocation and crashes.
Isn't the bug that numa_node_id() returned an offline node? That
shouldn't happen.
#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
...
#ifndef numa_node_id
/* Returns the number of the current Node. */
static inline int numa_node_id(void)
{
return raw_cpu_read(numa_node);
}
#endif
...
#else /* !CONFIG_USE_PERCPU_NUMA_NODE_ID */
/* Returns the number of the current Node. */
#ifndef numa_node_id
static inline int numa_node_id(void)
{
return cpu_to_node(raw_smp_processor_id());
}
#endif
...
So that's either the per-cpu numa_node value or the result of
cpu_to_node() on the current processor, right?
> Example:
>
> [ 0.027133] ------------[ cut here ]------------
> [ 0.027938] kernel BUG at include/linux/gfp.h:322!
This is
VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid));
in
alloc_pages_exact_node().
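To make that condition concrete, here is a small user-space sketch of the same three-part check. It is only a model: the `model_` functions, the bitmask, and the MAX_NUMNODES value of 4 are assumptions for illustration, not kernel code.

```c
#include <stdbool.h>

#define MAX_NUMNODES 4	/* illustrative value, not from any real config */

/* Hypothetical stand-in for the online-node state: bit n set means
 * node n is online. Here only node 1 is online, so node 0 is offline. */
static unsigned long model_online_mask = 0x2;

bool model_node_online(int nid)
{
	return (model_online_mask >> nid) & 1UL;
}

/* Same shape as the VM_BUG_ON condition quoted above: true when the
 * nid would trip the BUG. The range checks come first, so the shift
 * in model_node_online() is never reached with an out-of-range nid. */
bool model_nid_would_bug(int nid)
{
	return nid < 0 || nid >= MAX_NUMNODES || !model_node_online(nid);
}
```

With this mask, model_nid_would_bug(0) is true even though 0 is in range, which is the offline-node-0 case being discussed.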
And based on the trace below, that's
__slab_alloc -> alloc
alloc_pages_exact_node
<- alloc_slab_page
<- allocate_slab
<- new_slab
<- new_slab_objects
<- __slab_alloc?
which is just passing the node value down, right? Which I think was
from:
domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size),
GFP_KERNEL, of_node_to_nid(of_node));
?
What platform is this on? It looks to be x86. Is it a qemu emulation of a
pathological topology? What was the topology?
Note that there is a ton of code that seems to assume node 0 is online.
I started working on removing this assumption myself and it just led
down a rathole (on power, we always have node 0 online, even if it is
memoryless and cpuless, as a result).
I am guessing this is just happening early in boot before the per-cpu
areas are set up? That's why (I think) x86 has the early_cpu_to_node()
function...
Or do you not have CONFIG_OF set? So isn't the only necessary change in
the include file, which should just return first_online_node rather
than 0?
Ah and there's more of those node 0 assumptions :)
#define first_online_node 0
#define first_memory_node 0
if MAX_NUMNODES == 1...
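A rough user-space sketch of the difference between a fallback that hard-codes 0 and one that picks the first online node. Everything here is an assumption for illustration: the `model_` names, the bitmask, and the MAX_NUMNODES value are made up, and returning NUMA_NO_NODE when nothing is online is a choice of this sketch, not a statement about what the kernel helper does.

```c
#include <stdbool.h>

#define MAX_NUMNODES 4		/* illustrative value */
#define NUMA_NO_NODE (-1)

/* Hypothetical online mask: bit n set means node n is online.
 * Only node 2 is online, so node 0 is offline. */
static unsigned long model_online_mask = 0x4;

bool model_node_online(int nid)
{
	if (nid < 0 || nid >= MAX_NUMNODES)
		return false;
	return (model_online_mask >> nid) & 1UL;
}

/* Lowest-numbered online node; NUMA_NO_NODE if none (sketch choice). */
int model_first_online_node(void)
{
	for (int nid = 0; nid < MAX_NUMNODES; nid++)
		if (model_node_online(nid))
			return nid;
	return NUMA_NO_NODE;
}

/* Fallback that assumes node 0, like the hard-coded defines above. */
int model_fallback_nid_zero(void)
{
	return 0;
}
```

With this mask the hard-coded fallback hands back an offline node, while the scan returns node 2. When first_online_node is itself defined as 0, both collapse to the same assumption.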
-Nish