public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ravikiran G Thirumalai <kiran@scalex86.org>
To: Petr Vandrovec <vandrove@vc.cvut.cz>
Cc: Andrew Morton <akpm@osdl.org>,
	Christoph Lameter <clameter@engr.sgi.com>,
	alokk@calsoftinc.com, linux-kernel@vger.kernel.org,
	manfred@colorfullife.com,
	"Shai Fultheim (Shai@scalex86.org)" <shai@scalex86.org>,
	ananth@in.ibm.com, Andi Kleen <ak@suse.de>
Subject: Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849
Date: Wed, 28 Sep 2005 14:02:45 -0700	[thread overview]
Message-ID: <20050928210245.GA3760@localhost.localdomain> (raw)
In-Reply-To: <43301578.8040305@vc.cvut.cz>

On Tue, Sep 20, 2005 at 03:58:16PM +0200, Petr Vandrovec wrote:
> Andrew Morton wrote:
> >Christoph Lameter <clameter@engr.sgi.com> wrote:
> >
> >...
> >They do.  I don't believe that preemption is the source of this BUG. 
> >(Petr, does CONFIG_PREEMPT=n fix it?)
> 
> No, it does not.  I've even added printks here and there to show node 
> number,
> and everything works as it should.  Maybe there are some problems with
> numa_node_id() and migrating between processors when memory gets released,
> I do not know.
> 
> Only thing I know that if I'll add WARN_ON below to the free_block(), it
> triggers...
> 
> @free_block
>   slabp = GET_PAGE_SLAB(virt_to_page(objp));
>   nodeid = slabp->nodeid;
> +  WARN_ON(nodeid != numa_node_id());             <<<<<
>   l3 = cachep->nodelist[nodeid];
>   list_del(&slabp->list);
>   objnr = (objp - slabp->s_mem) / cachep->objsize;
>   check_spinlock_acquired_node(cachep, nodeid);
>   check_slabp(cachep, slabp);
> 
> ... saying that keventd/0 tries to operate on
> slab belonging to node#1, while having acquired lock for cachep belonging
  ^^^^^^^^^^^^^^^^^^^^^^^^^
> to node #0
  ^^^^^^^^^^

Just might be relevant here, I found a bug with the recent
x86_64 changes to 2.6.14-rc* which causes the node_to_cpumask[] to go bad for
the boot processor.  This happens on both amd and em64t boxes. I guess the
kevent/0 cpus_allowed mask might be changed by the bad node_to_cpumask[]
here?

On a opteron box (courtesy Ananth M)
# cat /sys/devices/system/node/node0/cpumap
00000000

# cat /sys/devices/system/node/node1/cpumap
00000003

On our em64t IBM x460 NUMA,

# cat /sys/devices/system/node/node0/cpumap
0000000e

# cat /sys/devices/system/node/node1/cpumap
000000f1

Here is a fix for that, I have sounded out Andi Kleen on this and waiting
for his comments.  Maybe somebody can test the patch below on amds?

Thanks,
Kiran

---
Patch to fix the BP node_to_cpumask.  2.6.14-rc* broke the boot cpu bit as
the cpu_to_node(0) is now not setup early enough for numa_init_array.
cpu_to_node[] is setup much later at srat_detect_node on acpi srat based
em64t machines.  This seems like a problem on amd machines too,  Tested on
em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after
this patch.

Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>


Index: linux-2.6.14-rc1/arch/x86_64/mm/numa.c
===================================================================
--- linux-2.6.14-rc1.orig/arch/x86_64/mm/numa.c	2005-09-19 17:58:10.000000000 -0700
+++ linux-2.6.14-rc1/arch/x86_64/mm/numa.c	2005-09-27 01:34:20.000000000 -0700
@@ -178,7 +178,6 @@
 		rr++; 
 	}
 
-	set_bit(0, &node_to_cpumask[cpu_to_node(0)]);
 }
 
 #ifdef CONFIG_NUMA_EMU
@@ -266,9 +265,7 @@
 
 __cpuinit void numa_add_cpu(int cpu)
 {
-	/* BP is initialized elsewhere */
-	if (cpu) 
-		set_bit(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+	set_bit(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
 } 
 
 unsigned long __init numa_free_all_bootmem(void) 

  parent reply	other threads:[~2005-09-28 21:03 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-15 16:51 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849 Petr Vandrovec
2005-09-15 17:33 ` Petr Vandrovec
     [not found] ` <20050916023005.4146e499.akpm@osdl.org>
     [not found]   ` <432AA00D.4030706@vc.cvut.cz>
     [not found]     ` <20050916230809.789d6b0b.akpm@osdl.org>
2005-09-19 16:02       ` Petr Vandrovec
2005-09-19 18:29         ` Andrew Morton
2005-09-19 18:51           ` Christoph Lameter
2005-09-19 19:28             ` Andrew Morton
2005-09-19 21:20               ` Christoph Lameter
2005-09-20  5:16                 ` Andrew Morton
2005-09-20  8:34                   ` Alok Kataria
2005-09-20 13:58                   ` Petr Vandrovec
2005-09-21  1:03                     ` Christoph Lameter
2005-09-21  1:22                       ` Petr Vandrovec
2005-09-21 15:59                         ` Christoph Lameter
2005-09-22 19:52                           ` Christoph Lameter
2005-09-22 20:01                             ` Andrew Morton
2005-09-22 21:25                               ` Petr Vandrovec
2005-09-22 21:32                                 ` Christoph Lameter
2005-09-22 21:46                                 ` Andrew Morton
2005-09-22 21:54                                   ` Christoph Lameter
2005-09-23  0:25                                     ` Petr Vandrovec
2005-09-28 21:02                     ` Ravikiran G Thirumalai [this message]
2005-09-28 22:50                       ` Christoph Lameter
2005-09-29 16:43                       ` Petr Vandrovec
2005-09-29 18:11                         ` Ravikiran G Thirumalai
2005-09-29 18:38                           ` Christoph Lameter
2005-09-30  5:45                         ` Ravikiran G Thirumalai
2005-09-30  6:05                           ` Andrew Morton
2005-09-30  6:28                             ` Ravikiran G Thirumalai
2005-09-30 15:16                               ` Bryan O'Sullivan
2005-09-30 15:57                                 ` Christoph Lameter
2005-09-30 16:45                                   ` Bryan O'Sullivan
2005-09-30 20:11                                 ` Andi Kleen
2005-09-30 20:23                                   ` Ravikiran G Thirumalai
2005-09-30 16:55                           ` Christoph Lameter
2005-09-19 18:56           ` Petr Vandrovec
2005-09-19 19:08             ` Christoph Lameter
  -- strict thread matches above, loose matches on Subject: below --
2005-09-23 19:34 Alok Kataria
2005-09-23 23:57 ` Christoph Lameter
2005-09-24  0:05 ` Christoph Lameter
2005-09-24 12:52 ` Manfred Spraul
2005-09-25 14:16 Alok Kataria
2005-09-26 18:00 ` Christoph Lameter
2005-09-26 19:34   ` Alok Kataria

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050928210245.GA3760@localhost.localdomain \
    --to=kiran@scalex86.org \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=alokk@calsoftinc.com \
    --cc=ananth@in.ibm.com \
    --cc=clameter@engr.sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=shai@scalex86.org \
    --cc=vandrove@vc.cvut.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox