From: Petr Vandrovec <vandrove@vc.cvut.cz>
To: Andrew Morton <akpm@osdl.org>
Cc: Christoph Lameter <clameter@engr.sgi.com>,
alokk@calsoftinc.com, linux-kernel@vger.kernel.org,
manfred@colorfullife.com
Subject: Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849
Date: Tue, 20 Sep 2005 15:58:16 +0200
Message-ID: <43301578.8040305@vc.cvut.cz>
In-Reply-To: <20050919221614.6c01c2d1.akpm@osdl.org>
Andrew Morton wrote:
> Christoph Lameter <clameter@engr.sgi.com> wrote:
>
>>On Mon, 19 Sep 2005, Andrew Morton wrote:
>>
>>
>>> list_for_each(walk, &cache_chain) {
>>> kmem_cache_t *searchp;
>>> struct list_head* p;
>>> int tofree;
>>> struct slab *slabp;
>>>
>>> searchp = list_entry(walk, kmem_cache_t, next);
>>>
>>> if (searchp->flags & SLAB_NO_REAP)
>>> goto next;
>>>
>>> check_irq_on();
>>>
>>> l3 = searchp->nodelists[numa_node_id()];
>>> if (l3->alien)
>>> drain_alien_cache(searchp, l3);
>>>->preempt here
>>> spin_lock_irq(&l3->list_lock);
>>>
>>> drain_array_locked(searchp, ac_data(searchp), 0,
>>> numa_node_id());
>>>->oops, wrong node.
>>
>>This is called from keventd which exists per processor. Hmmm... This looks
>>as if it can change processors after all
>
>
> Well no, it would be a big bug if a keventd thread were to change CPUs.
>
> It's OK to rely upon the pinnedness of keventd I guess - a comment would be
> nice.
>
>
>>but the slab allocator depends on
>>it running on the right processor. So does the page allocator. sigh. What
>>is the point of having per processor workqueues if they do not stay on
>>the assigned processor?
>
>
> They do. I don't believe that preemption is the source of this BUG.
> (Petr, does CONFIG_PREEMPT=n fix it?)
No, it does not. I've even added printks here and there to show the node number,
and everything works as it should. Maybe there is some problem with
numa_node_id() and migration between processors when memory gets released;
I do not know.
The only thing I know is that if I add the WARN_ON below to free_block(), it
triggers...
@free_block
slabp = GET_PAGE_SLAB(virt_to_page(objp));
nodeid = slabp->nodeid;
+ WARN_ON(nodeid != numa_node_id()); <<<<<
l3 = cachep->nodelists[nodeid];
list_del(&slabp->list);
objnr = (objp - slabp->s_mem) / cachep->objsize;
check_spinlock_acquired_node(cachep, nodeid);
check_slabp(cachep, slabp);
... saying that keventd/0 tries to operate on a
slab belonging to node #1 while holding the cachep lock belonging
to node #0. Because of this, check_spinlock_acquired_node(cachep, nodeid) fails
(check_spinlock_acquired_node(cachep, 0) would succeed).
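To make the mismatch concrete, here is a minimal userspace sketch of the situation. It is not kernel code: struct node_lists, struct cache, struct slab, lock_held_for_node() and free_object() are made-up names for illustration, and a plain bool stands in for the per-node list_lock. It only models the check "is the lock for the slab's own node held?", assuming the caller locked the lists of the node it believes it is running on.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 2

/* Hypothetical stand-in for one node's slab lists and their list_lock. */
struct node_lists {
    bool locked;        /* models the per-node list_lock being held */
    int  free_objects;  /* objects returned to this node's lists */
};

/* Hypothetical stand-in for a cache with one set of lists per node. */
struct cache {
    struct node_lists nodelists[MAX_NODES];
};

/* Hypothetical stand-in for a slab; it records the node that owns it. */
struct slab {
    int nodeid;
};

/* Models the debug check: is the lock for `node` currently held? */
static bool lock_held_for_node(const struct cache *c, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    return c->nodelists[node].locked;
}

/*
 * The caller takes the lock of `locked_node`, then the object is returned
 * to the lists of the node recorded in the slab. Returns true when the
 * held lock covers those lists, false when the wrong lock is held.
 */
static bool free_object(struct cache *c, const struct slab *s, int locked_node)
{
    bool ok;

    c->nodelists[locked_node].locked = true;    /* "lock" caller's node */
    ok = lock_held_for_node(c, s->nodeid);      /* check the slab's node */
    if (ok)
        c->nodelists[s->nodeid].free_objects++;
    c->nodelists[locked_node].locked = false;   /* "unlock" */
    return ok;
}
```

In this model, free_object(&c, &s, 0) returns true for a slab with nodeid 0 and false for a slab with nodeid 1, which corresponds to the WARN_ON above firing.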
Petr