linux-mm.kvack.org archive mirror
From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Tejun Heo <htejun@gmail.com>
Cc: Christoph Lameter <cl@linux.com>,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	anton@samba.org, David Rientjes <rientjes@google.com>,
	benh@kernel.crashing.org, tony.luck@intel.com
Subject: Re: Node 0 not necessary for powerpc?
Date: Wed, 21 May 2014 12:57:43 -0700	[thread overview]
Message-ID: <20140521195743.GA5755@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140521185812.GA5259@htj.dyndns.org>

Hi Tejun,

On 21.05.2014 [14:58:12 -0400], Tejun Heo wrote:
> Hello,
> 
> On Wed, May 21, 2014 at 09:16:27AM -0500, Christoph Lameter wrote:
> > On Mon, 19 May 2014, Nishanth Aravamudan wrote:
> > > I'm seeing a panic at boot with this change on an LPAR which actually
> > > has no Node 0. Here's what I think is happening:
> > >
> > > start_kernel
> > >     ...
> > >     -> setup_per_cpu_areas
> > >         -> pcpu_embed_first_chunk
> > >             -> pcpu_fc_alloc
> > >                 -> ___alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu), ...
> > >     -> smp_prepare_boot_cpu
> > >         -> set_numa_node(boot_cpuid)
> > >
> > > So we panic on the NODE_DATA call. It seems that ia64, at least, uses
> > > pcpu_alloc_first_chunk rather than embed. x86 has some code to handle
> > > early calls of cpu_to_node (early_cpu_to_node) and sets the mapping for
> > > all CPUs in setup_per_cpu_areas().
> > 
> > Maybe we can switch ia64 too embed? Tejun: Why are there these
> > dependencies?
> > 
> > > Thoughts? Does that mean we need something similar to x86 for powerpc?
> 
> I'm missing context to properly understand what's going on but the
> specific allocator in use shouldn't matter.  e.g. x86 can use both
> embed and page allocators.  If the problem is that the arch is
> accessing percpu memory before percpu allocator is initialized and the
> problem was masked before somehow, the right thing to do would be
> removing those premature percpu accesses.  If early percpu variables
> are really necessary, doing similar early_percpu thing as in x86 would
> be necessary.

For context: I was looking at why N_ONLINE was statically setting Node 0
to be online, whether or not the topology is that way -- I've been
getting several bugs lately where Node 0 is online, but has no CPUs and
no memory on it, on powerpc. 

On powerpc, setup_per_cpu_areas calls into ___alloc_bootmem_node using
NODE_DATA(cpu_to_node(cpu)).

Currently, cpu_to_node() in arch/powerpc/include/asm/topology.h does:

        /*
         * During early boot, the numa-cpu lookup table might not have been
         * setup for all CPUs yet. In such cases, default to node 0.
         */
        return (nid < 0) ? 0 : nid;

And so early at boot, if node 0 is not present, we end up accessing an
uninitialized NODE_DATA(). So this seems buggy (I'll contact the powerpc
developers separately on that).
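
To make the failure mode concrete, here is a minimal userspace sketch
(not kernel code). Everything in it is a hypothetical stand-in: struct
node_data models the per-node data behind NODE_DATA(), the lookup table
models the numa-cpu table with all entries still unset, and
cpu_to_node_first_online() is only one possible alternative fallback,
not an existing kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4
#define MAX_NODES 4

/* Hypothetical stand-in for the kernel's per-node data. */
struct node_data { int nid; };

/* Model of an LPAR with no node 0: only node 1 is populated. */
static struct node_data node1 = { .nid = 1 };
static struct node_data *node_data[MAX_NODES] = { NULL, &node1, NULL, NULL };

/* Early boot: no CPU has been assigned to a node yet. */
static int numa_cpu_lookup_table[NR_CPUS] = { -1, -1, -1, -1 };

/* Mirrors the fallback quoted above: an unset entry becomes node 0. */
static int cpu_to_node_default0(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    int nid = numa_cpu_lookup_table[cpu];
    return (nid < 0) ? 0 : nid;
}

/* Returns the lowest node that has data, or -1 if there is none. */
static int first_populated_node(void)
{
    for (int nid = 0; nid < MAX_NODES; nid++)
        if (node_data[nid])
            return nid;
    return -1;
}

/* Hypothetical alternative: fall back to a node that actually exists. */
static int cpu_to_node_first_online(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    int nid = numa_cpu_lookup_table[cpu];
    return (nid < 0) ? first_populated_node() : nid;
}
```

With the table unset, node_data[cpu_to_node_default0(cpu)] is NULL for
every CPU, which is the uninitialized NODE_DATA() access described
above, while the alternative fallback lands on node 1.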

I recently submitted patches to have powerpc turn on
USE_PERCPU_NUMA_NODEID and HAVE_MEMORYLESS_NODES. But then, cpu_to_node
will be accessing percpu data in setup_per_cpu_areas, which seems like a
no-no. And more specifically, since we haven't yet run
smp_prepare_boot_cpu() at this point, the per-CPU node id that
cpu_to_node reads has not yet been set to a sane value.
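
As a rough illustration of the x86-style approach mentioned earlier in
the thread (an early lookup that is valid before the percpu areas
exist, with the mapping for all CPUs published during percpu setup),
here is a userspace sketch. All model_* names are hypothetical, the
node numbers in the early map are made up, and this is not the actual
x86 or powerpc code.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical early map, filled from firmware topology before percpu setup. */
static int model_early_cpu_to_node_map[MODEL_NR_CPUS] = { 1, 1, 2, 2 };

/* Stand-in for the per-CPU numa_node variable; not valid until setup runs. */
static int model_percpu_numa_node[MODEL_NR_CPUS];
static bool model_percpu_ready;

/* Safe at any point in boot: reads only the early map. */
static int model_early_cpu_to_node(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return model_early_cpu_to_node_map[cpu];
}

/* Uses the per-CPU copy once it exists, and the early map before that. */
static int model_cpu_to_node(int cpu)
{
    if (!model_percpu_ready)
        return model_early_cpu_to_node(cpu);
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return model_percpu_numa_node[cpu];
}

/* Models setup_per_cpu_areas() publishing the mapping for all CPUs. */
static void model_setup_per_cpu_areas(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_percpu_numa_node[cpu] = model_early_cpu_to_node(cpu);
    model_percpu_ready = true;
}
```

The point of the sketch is only the ordering: callers that run before
model_setup_per_cpu_areas() never touch the per-CPU copy, so they get
the same answer before and after setup.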

Thanks,
Nish
