From: Petr Vandrovec <vandrove@vc.cvut.cz>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org, Christoph Lameter <christoph@lameter.com>
Subject: Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849
Date: Mon, 19 Sep 2005 20:56:26 +0200
Message-ID: <432F09DA.7050408@vc.cvut.cz>
In-Reply-To: <20050919112912.18daf2eb.akpm@osdl.org>

Andrew Morton wrote:
> Petr Vandrovec <vandrove@vc.cvut.cz> wrote:
> 
>>Andrew Morton wrote:
>>
>>>Petr Vandrovec <vandrove@vc.cvut.cz> wrote:
>>>
>>>
>>>>Andrew Morton wrote:
>>>>
>>>>>Petr Vandrovec <vandrove@vc.cvut.cz> wrote:
>>>>>
>>>>>
>>>>>>  so now once crashes on UP system were sorted out, I tried to
>>>>>>put new kernel on my SMP host - and sorry to say, but it does not
>>>>>>seem to work as advertised :-(
>>>>>
>>>>>.config (again), please.
>>>>
>>>>Any SMP with NUMA.  One which I'm trying to debug now is attached.
>>>>It is available at http://vana.vc.cvut.cz/config as well.
>>>
>>>I can get 2.6.14-rc1 to crash with your .config, but current -linus is OK.
>>
>>It still dies for me, with current git (tree 7513cdadc661cfe0bd1625145a4876e54df191ca,
>>commit 6c0741fbdee5bd0f8ed13ac287c4ab18e8ba7d83).  Config is available at
>>http://platan.vc.cvut.cz/config-vana.txt.  Box is dual opteron Tyan K8W, S2885.
>>
>>...
>>
>>     ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
>>     ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
>>----------- [cut here ] --------- [please bite here ] ---------
>>Kernel BUG at mm/slab.c:1849
>>invalid operand: 0000 [1] SMP
>>CPU 0
>>Modules linked in:
>>Pid: 8, comm: events/0 Not tainted 2.6.14-rc1-6c07 #1
>>RIP: 0010:[<ffffffff8016d316>] <ffffffff8016d316>{free_block+294}
>>RSP: 0000:ffff81007ff21d88  EFLAGS: 00010002
>>RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000310
>>RDX: 0000000000000000 RSI: ffff81007ffddd10 RDI: ffff81007ffda080
>>RBP: ffff81007ffde000 R08: ffff81003ffa0d50 R09: 0000000000000000
>>R10: 00000000ffffffff R11: 0000000000000000 R12: ffff81007ffc9b50
>>R13: ffff81007ffde048 R14: ffff81007ffda080 R15: ffff81007ffda080
>>FS:  0000000000000000(0000) GS:ffffffff805f2800(0000) knlGS:0000000000000000
>>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>>CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
>>Process events/0 (pid: 8, threadinfo ffff81007ff20000, task ffff81003ff8c790)
>>Stack: 0000000000000000 0000000000000000 0000000000000292 hda: _NEC DVD_RW ND-3500AG, ATAPI CD/DVD-ROM drive
>>0000000200000000
>>        ffff81007ffddd10 ffff81007ffddd10 ffff81007ffddce8 0000000000000002
>>        0000000000000000 ffff81007ffda080
>>Call Trace:<ffffffff8016e8b7>{drain_array_locked+167} <ffffffff8016e9f7>{cache_reap+231}
>>        <ffffffff80131e23>{__wake_up+67} <ffffffff8016e910>{cache_reap+0}
>>        <ffffffff8014930c>{worker_thread+476} <ffffffff80131d60>{default_wake_function+0}
>>        <ffffffff80131d60>{default_wake_function+0} <ffffffff80149130>{worker_thread+0}
>>        <ffffffff8014db82>{kthread+146} <ffffffff8010ec22>{child_rip+8}
>>        <ffffffff80149130>{worker_thread+0} <ffffffff8014daf0>{kthread+0}
>>        <ffffffff8010ec1a>{child_rip+0}
>>
>>Code: 0f 0b 68 9d 26 3d 80 c2 39 07 48 89 ee 4c 89 ff 4c 8d 75 30
>>RIP <ffffffff8016d316>{free_block+294} RSP <ffff81007ff21d88>
>>  ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> 
> 
> Well.  The CPU_UP_CANCELED locking in cpuup_callback() looks borked to me -
> it takes cachep->nodelists[node]->list_lock and then calls
> drain_alien_cache() which appears to take the same lock.  But that's not
> the problem here.
> 
> The code in cache_reap() recalculates numa_node_id() multiple times, so if
> the caller changes CPUs then this assertion will trigger.  However it's
> running under keventd here, which is pinned to a single CPU.  Still, it
> would be useful if you could try putting preempt_disable()s in
> cache_reap(), or change cache_reap() to evaluate numa_node_id() just the
> once, and cache that in a local variable.
> 
> I wonder why numa_node_id() uses raw_smp_processor_id()?  That's just
> asking for preempt non-atomicity bugs.
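
For reference, a minimal user-space sketch of the second suggestion
above (look the node up once and reuse it).  Everything here is a
hypothetical stand-in, not the real mm/slab.c code: fake_numa_node_id(),
node_locked[], migrate_to_node1() and the two reap_*() helpers exist only
to show why a repeated lookup can touch two different nodes.

```c
#include <assert.h>

#define MAX_NODES 2

/* Hypothetical stand-in for numa_node_id(): reports a global that the
 * "work" callback may change to mimic the task moving to another node. */
int current_node = 0;
int fake_numa_node_id(void) { return current_node; }

/* One flag per node, standing in for the per-node list lock. */
int node_locked[MAX_NODES];

void reset_model(void)
{
    current_node = 0;
    for (int i = 0; i < MAX_NODES; i++)
        node_locked[i] = 0;
}

/* Simulated migration to node 1 in the middle of the work. */
void migrate_to_node1(void) { current_node = 1; }

int locked_count(void)
{
    int n = 0;
    for (int i = 0; i < MAX_NODES; i++)
        n += node_locked[i];
    return n;
}

/* Fragile pattern: the node is looked up again for the unlock, so a
 * migration in between releases the wrong node's lock. */
int reap_recompute(void (*work)(void))
{
    node_locked[fake_numa_node_id()] = 1;
    work();
    node_locked[fake_numa_node_id()] = 0;
    return locked_count();      /* locks left held by mistake */
}

/* Suggested pattern: evaluate the node once and keep it in a local. */
int reap_cached(void (*work)(void))
{
    int node = fake_numa_node_id();

    assert(node >= 0 && node < MAX_NODES);
    node_locked[node] = 1;
    work();
    node_locked[node] = 0;
    return locked_count();
}
```

In this model, reap_recompute() leaves one lock held after a simulated
migration, while reap_cached() always releases the lock it took.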

I had suspected that too, but as far as I can tell, while it is a real
problem, it is not what happens here.  free_block() simply finds that a
pointer it got from its caller belongs to a slab that belongs to
CPU#1/node#1, while the caller took the lock on the CPU#0/node#0
structures.  That suggests drain_array_locked() was called for node #0
with an array_cache->entry that contains blocks belonging to node #1,
which I cannot explain.
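
A minimal user-space sketch of the kind of check described above,
assuming a simplified model in which every object points at its slab and
every slab records its home node.  The names slab_model, obj_model and
check_free_block() are hypothetical and are not taken from mm/slab.c; the
kernel treats a mismatch as a BUG, whereas this sketch only reports it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a slab knows which node it lives on. */
struct slab_model {
    int nodeid;
};

/* Hypothetical model: an object knows the slab it was carved from. */
struct obj_model {
    struct slab_model *slab;
};

/*
 * Walk the objects being freed and compare each slab's node with `node`,
 * the node whose list lock the caller is assumed to hold.
 * Returns the index of the first mismatching object, or -1 if every
 * object belongs to `node`.
 */
int check_free_block(struct obj_model *const *objs, size_t n, int node)
{
    for (size_t i = 0; i < n; i++) {
        assert(objs[i] != NULL && objs[i]->slab != NULL);
        if (objs[i]->slab->nodeid != node)
            return (int)i;
    }
    return -1;
}
```

Draining an array that holds a node #1 object while passing node 0 makes
check_free_block() return that object's index, which corresponds to the
situation in the trace above.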
								Petr