From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: Jack Steiner <steiner@sgi.com>
Cc: David Rientjes <rientjes@google.com>,
	alex.shi@intel.com, LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH] Fix early panic issue on machines with memless node
Date: Wed, 06 May 2009 13:19:52 +0800
Message-ID: <1241587192.27664.56.camel@ymzhang>
In-Reply-To: <20090505202730.GA9831@sgi.com>

On Tue, 2009-05-05 at 15:27 -0500, Jack Steiner wrote:
> On Tue, May 05, 2009 at 12:52:54PM -0700, David Rientjes wrote:
> > On Tue, 5 May 2009, Jack Steiner wrote:
> > 
> > > I was able to duplicate your original problem. Your patch below solves the
> > > problem. AFAICT, it causes no new regressions to the various configurations
> > > that I'm testing. (I'll add the "mem=2G" to my configs that I test).
> > > 
> > 
> > Great, it would be helpful to catch these problems before 2.6.30 is 
> > released.  I've passed my patch along to Ingo.
> > 
> > > However, I see a new regression that was not present a couple of weeks ago.
> > > Configurations that have nodes with cpus and no memory panic during
> > > boot. This occurs both with and without your patch and is not related to "mem=".
> > > 
> > > I need to isolate the problem but here is the stack trace:
> > > 	Pid: 0, comm: swapper Not tainted 2.6.30-rc4-next-20090505-medusa #12
> > > 	Call Trace:
> > > 	 [<ffffffff806b919e>] early_idt_handler+0x5e/0x71
> > > 	 [<ffffffff802920fe>] ? build_zonelists_node+0x4c/0x8d
> > > 	 [<ffffffff8029333f>] __build_all_zonelists+0x1ae/0x55a
> > > 	 [<ffffffff80293932>] build_all_zonelists+0x1b5/0x263
> > > 	 [<ffffffff806b9b6e>] start_kernel+0x17a/0x3c5
> > > 	 [<ffffffff806b9140>] ? early_idt_handler+0x0/0x71
> > > 	 [<ffffffff806b92a7>] x86_64_start_reservations+0xae/0xb2
> > > 	 [<ffffffff806b93fd>] x86_64_start_kernel+0x152/0x161
> > > 
> > 
> > Please post your .config since it apparently differs from x86_64 defconfig 
> > judging by my debugging symbols and also the full output of the panic.
> 
> I suspect I misled you when I mentioned "configurations". I did not mean
> the .config file. I use a more-or-less standard .config file.
> 
> I do much of my testing on a system simulator. Using a simulator config file,
> I specify the system configuration such as number of nodes, sockets per node,
> cpus per socket, memory per socket, address map, boot options, etc. This
> makes it easy to quickly test a lot of strange (but real) configurations.
> 
> The configuration above that is failing is a 2-socket Nehalem blade that has no
> memory on socket 0. All memory is located on socket 1.  The panic is caused by a
> null dereference of NODE_DATA(0).
> 
> Still looking....
It seems that in function setup_node_bootmem, the early return

        if (!end)
                return;

skips the initialization of node_data[nodeid]. Later on, the kernel panics when
build_zonelists dereferences NODE_DATA(0).
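
To make the failure mode concrete, here is a minimal user-space sketch. It is
not kernel code: the *_model names, the two-node layout and the page counts are
assumptions made up for illustration. It only shows how an early return on
end == 0 leaves the node_data slot unset, so that a later NODE_DATA(0) lookup
yields NULL.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 2  /* assumed: the 2-socket blade from the report */

/* Hypothetical stand-in for the kernel's per-node data structure. */
struct pglist_data_model {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

static struct pglist_data_model node_storage[MAX_NUMNODES];
static struct pglist_data_model *node_data[MAX_NUMNODES]; /* all NULL at start */

#define NODE_DATA(nid) (node_data[(nid)])

/* Models the early return: a node whose end is 0 never gets node_data set. */
static void setup_node_bootmem_model(int nid, unsigned long start,
				     unsigned long end)
{
	assert(nid >= 0 && nid < MAX_NUMNODES);

	if (!end)
		return;

	node_storage[nid].start_pfn = start;
	node_storage[nid].spanned_pages = end - start;
	node_data[nid] = &node_storage[nid];
}

/*
 * Models the consumer. Returns 0 if the node can be used, -1 at the point
 * where code that dereferences NODE_DATA(nid) unconditionally would fault.
 */
static int build_zonelists_model(int nid)
{
	struct pglist_data_model *pgdat = NODE_DATA(nid);

	if (pgdat == NULL)
		return -1;
	return 0;
}

/* Socket 0 has no memory, socket 1 has all of it. Returns the failure count. */
static int demo_two_socket_blade(void)
{
	int nid, failures = 0;

	setup_node_bootmem_model(0, 0, 0);        /* memoryless node */
	setup_node_bootmem_model(1, 0, 0x80000);  /* assumed size, in pages */

	for (nid = 0; nid < MAX_NUMNODES; nid++)
		if (build_zonelists_model(nid) != 0)
			failures++;
	return failures;
}
```

In this model only node 0 fails the check. Either the setup path must still
initialize node_data for a memoryless node, or the node must be kept out of
the set that the zonelist code walks.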

Even a node that is called memoryless usually has some small blocks of memory, so
function acpi_scan_nodes marks such nodes offline. However, if the node info comes
from acpi_numa_processor_affinity_init, the node might have no memory at all, and
acpi_scan_nodes doesn't mark it offline.

The logic became confusing with patch dc09855191809. Could you revert it and retest?


