From: Ravikiran G Thirumalai <kiran@scalex86.org>
To: Petr Vandrovec <vandrove@vc.cvut.cz>
Cc: Andrew Morton <akpm@osdl.org>,
	Christoph Lameter <clameter@engr.sgi.com>,
	alokk@calsoftinc.com, linux-kernel@vger.kernel.org,
	manfred@colorfullife.com,
	"Shai Fultheim (Shai@scalex86.org)" <shai@scalex86.org>,
	ananth@in.ibm.com, Andi Kleen <ak@suse.de>
Subject: Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849
Date: Wed, 28 Sep 2005 14:02:45 -0700
Message-ID: <20050928210245.GA3760@localhost.localdomain>
In-Reply-To: <43301578.8040305@vc.cvut.cz>

On Tue, Sep 20, 2005 at 03:58:16PM +0200, Petr Vandrovec wrote:
> Andrew Morton wrote:
> >Christoph Lameter <clameter@engr.sgi.com> wrote:
> >
> >...
> >They do.  I don't believe that preemption is the source of this BUG. 
> >(Petr, does CONFIG_PREEMPT=n fix it?)
> 
> No, it does not.  I've even added printks here and there to show node
> number, and everything works as it should.  Maybe there are some problems
> with numa_node_id() and migrating between processors when memory gets
> released, I do not know.
> 
> Only thing I know that if I'll add WARN_ON below to the free_block(), it
> triggers...
> 
> @free_block
>   slabp = GET_PAGE_SLAB(virt_to_page(objp));
>   nodeid = slabp->nodeid;
> +  WARN_ON(nodeid != numa_node_id());             <<<<<
>   l3 = cachep->nodelist[nodeid];
>   list_del(&slabp->list);
>   objnr = (objp - slabp->s_mem) / cachep->objsize;
>   check_spinlock_acquired_node(cachep, nodeid);
>   check_slabp(cachep, slabp);
> 
> ... saying that keventd/0 tries to operate on
> slab belonging to node#1, while having acquired lock for cachep belonging
  ^^^^^^^^^^^^^^^^^^^^^^^^^
> to node #0
  ^^^^^^^^^^

This just might be relevant here: I found a bug in the recent x86_64
changes in 2.6.14-rc* which causes node_to_cpumask[] to go bad for the
boot processor.  This happens on both AMD and em64t boxes.  I guess the
keventd/0 cpus_allowed mask might be changed by the bad node_to_cpumask[]
here?

On an Opteron box (courtesy Ananth M):
# cat /sys/devices/system/node/node0/cpumap
00000000

# cat /sys/devices/system/node/node1/cpumap
00000003

On our em64t IBM x460 NUMA box:

# cat /sys/devices/system/node/node0/cpumap
0000000e

# cat /sys/devices/system/node/node1/cpumap
000000f1
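
For anyone reading the masks above by eye, here is a minimal user-space
sketch (not kernel code) that decodes a cpumap string into CPU numbers.
The helper names cpumap_parse(), cpumap_has_cpu() and cpumap_format() are
made up for illustration, and the sketch assumes a single hex word of at
most 64 CPUs with no comma separators.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hex cpumap string such as "000000f1".
 * Assumes one word, at most 64 CPUs, no comma-separated groups. */
static unsigned long long cpumap_parse(const char *hex)
{
	return strtoull(hex, NULL, 16);
}

/* Hypothetical helper: nonzero if the bit for this CPU is set in the mask. */
static int cpumap_has_cpu(unsigned long long mask, unsigned int cpu)
{
	return cpu < 64 && ((mask >> cpu) & 1ULL);
}

/* Hypothetical helper: write the CPUs in the mask to buf as a
 * space-separated list, stopping early if buf is too small. */
static void cpumap_format(unsigned long long mask, char *buf, size_t len)
{
	size_t used = 0;

	assert(len > 0);
	buf[0] = '\0';
	for (unsigned int cpu = 0; cpu < 64; cpu++) {
		int n;

		if (!cpumap_has_cpu(mask, cpu))
			continue;
		n = snprintf(buf + used, len - used, "%s%u",
			     used ? " " : "", cpu);
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
	}
}
```

Read this way, 0000000e is CPUs 1 2 3 and 000000f1 is CPUs 0 4 5 6 7, so on
the x460 the boot CPU shows up under node1; on the Opteron box, 00000003 is
CPUs 0 1, and node0 has no CPUs at all.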

Here is a fix for that.  I have sounded out Andi Kleen on this and am
waiting for his comments.  Maybe somebody can test the patch below on AMD
machines?

Thanks,
Kiran

---
Patch to fix the BP node_to_cpumask.  2.6.14-rc* broke the boot cpu bit as
cpu_to_node(0) is now not set up early enough for numa_init_array.
cpu_to_node[] is set up much later, at srat_detect_node, on ACPI SRAT based
em64t machines.  This seems to be a problem on AMD machines too.  Tested on
em64t, though.  /sys/devices/system/node/node0/cpumap shows up sanely after
this patch.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>


Index: linux-2.6.14-rc1/arch/x86_64/mm/numa.c
===================================================================
--- linux-2.6.14-rc1.orig/arch/x86_64/mm/numa.c	2005-09-19 17:58:10.000000000 -0700
+++ linux-2.6.14-rc1/arch/x86_64/mm/numa.c	2005-09-27 01:34:20.000000000 -0700
@@ -178,7 +178,6 @@
 		rr++; 
 	}
 
-	set_bit(0, &node_to_cpumask[cpu_to_node(0)]);
 }
 
 #ifdef CONFIG_NUMA_EMU
@@ -266,9 +265,7 @@
 
 __cpuinit void numa_add_cpu(int cpu)
 {
-	/* BP is initialized elsewhere */
-	if (cpu) 
-		set_bit(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
+	set_bit(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
 } 
 
 unsigned long __init numa_free_all_bootmem(void) 
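
To illustrate the ordering problem the patch addresses, here is a small
user-space model.  It is only a sketch and not the kernel's code: struct
numa_model, model_old_flow() and model_new_flow() are hypothetical names,
the tables are plain arrays, and the early (provisional) node for the boot
CPU is passed in as an assumption rather than derived the way the kernel
derives it.

```c
#include <assert.h>

#define NR_CPUS  8
#define NR_NODES 2

/* Simplified stand-ins for the kernel tables (hypothetical model). */
struct numa_model {
	int cpu_to_node[NR_CPUS];
	unsigned long node_to_cpumask[NR_NODES];
};

static void model_reset(struct numa_model *m)
{
	for (int node = 0; node < NR_NODES; node++)
		m->node_to_cpumask[node] = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		m->cpu_to_node[cpu] = 0;
}

/* Stands in for the late (SRAT-style) detection of the real mapping. */
static void model_set_final_map(struct numa_model *m, const int *final_map)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		assert(final_map[cpu] >= 0 && final_map[cpu] < NR_NODES);
		m->cpu_to_node[cpu] = final_map[cpu];
	}
}

/* Pre-patch ordering: the boot CPU bit is set early, from a provisional
 * node (early_bp_node, an assumed input), and the per-CPU add skips CPU 0. */
static void model_old_flow(struct numa_model *m, int early_bp_node,
			   const int *final_map)
{
	assert(early_bp_node >= 0 && early_bp_node < NR_NODES);
	model_reset(m);
	m->node_to_cpumask[early_bp_node] |= 1UL << 0;
	model_set_final_map(m, final_map);
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		m->node_to_cpumask[m->cpu_to_node[cpu]] |= 1UL << cpu;
}

/* Post-patch ordering: every CPU, including CPU 0, is added only after
 * the final mapping is known. */
static void model_new_flow(struct numa_model *m, const int *final_map)
{
	model_reset(m);
	model_set_final_map(m, final_map);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		m->node_to_cpumask[m->cpu_to_node[cpu]] |= 1UL << cpu;
}
```

With an assumed final mapping of CPUs 0-3 on node 0 and CPUs 4-7 on node 1,
and an assumed provisional boot CPU node of 1, model_old_flow() yields
masks 0x0e and 0xf1, while model_new_flow() yields 0x0f and 0xf0.  Those
inputs were picked only to mirror the x460 output quoted earlier; they are
not measured values.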
