From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Jens Axboe <axboe@kernel.dk>,
Andrew Morton <akpm@linux-foundation.org>,
Pekka Enberg <penberg@cs.helsinki.fi>
Cc: werner <w.landgraf@ru.ru>, "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [block IO crash] Re: 2.6.39-rc5-git2 boot crashs
Date: Wed, 4 May 2011 14:47:29 +0200
Message-ID: <20110504124729.GA9731@elte.hu>
In-Reply-To: <20110504123753.GA8646@elte.hu>
* Ingo Molnar <mingo@elte.hu> wrote:
> > > index 94d2a33..27bc3be 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -30,6 +30,8 @@
> > >
> > > #include <trace/events/kmem.h>
> > >
> > > +#undef CONFIG_CMPXCHG_LOCAL
> > > +
> > > /*
> > > * Lock order:
> > > * 1. slab_lock(page)
> >
> > This seems rock solid after half an hour of testing. I'll keep it running
> > longer, i still have no good data for how frequently the crashes are occurring.
>
> It's still rock solid after 2 hours: neither crashes nor IO/IRQ timeouts are
> occurring.
So i removed the above patch and rebooted, and within minutes of starting the
FS test i got:
skb_over_panic: text:c19fe045 len:98 put:98 head: (null) data: (null) tail:0x62 end:0x0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/net/eth0/address
Modules linked in:
Pid: 3535, comm: dd Not tainted 2.6.39-rc5-i486-1sys+ #122586 System manufacturer System Product Name/A8N-E
EIP: 0060:[<c1bda60d>] EFLAGS: 00010292 CPU: 1
EIP is at skb_put+0x89/0x92
EAX: 0000006b EBX: 00000000 ECX: 00000046 EDX: 00000000
ESI: c19fe045 EDI: 00000062 EBP: f64cdf20 ESP: f64cdef4
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process dd (pid: 3535, ti=f64cc000 task=f5f4b570 task.ti=f53f4000)
Stack:
c2143545 c19fe045 00000062 00000062 00000000 00000000 00000062 00000000
c207d136 f6506000 f408d600 f64cdf4c c19fe045 c19fd92b f64cdf4c 00000040
f6506428 00000000 34020062 f6506000 00000246 c21b799c f64cdf90 c1a004c1
Call Trace:
[<c19fe045>] ? nv_rx_process_optimized+0x101/0x1de
[<c19fe045>] nv_rx_process_optimized+0x101/0x1de
[<c19fd92b>] ? nv_alloc_rx_optimized+0xe/0x18f
[<c1a004c1>] nv_napi_poll+0x496/0x4a5
[<c105838c>] ? hrtimer_run_pending+0xe/0xd1
[<c1d734b4>] ? _raw_spin_lock+0x8/0x1e
[<c1be1d59>] net_rx_action+0x94/0x1ab
[<c1042fcd>] __do_softirq+0x9f/0x14f
[<c1042f2e>] ? remote_softirq_receive+0x33/0x33
<IRQ>
[<c10431e7>] ? irq_exit+0x3a/0x43
[<c10047ce>] ? do_IRQ+0x8c/0xa0
[<c116366d>] ? __ext3_journal_dirty_metadata+0x1e/0x45
[<c1054f23>] ? wake_up_bit+0x1c/0x20
[<c10ec726>] ? __brelse+0xb/0x36
[<c102ea1c>] ? __wake_up_common+0xe/0x62
[<c1d74eb0>] ? common_interrupt+0x30/0x40
[<c14fb1ea>] ? sha_transform+0x9a/0x1be
[<c15ff44e>] ? extract_buf+0x50/0xe3
[<c14fe7ab>] ? __copy_to_user_ll+0xb/0x37
[<c14fe9b5>] ? copy_to_user+0x3e/0x49
[<c15ffd83>] ? extract_entropy_user+0x80/0xe5
[<c15ffdfa>] ? urandom_read+0x12/0x14
[<c10cc888>] ? vfs_read+0x93/0x115
[<c15ffde8>] ? extract_entropy_user+0xe5/0xe5
[<c10cc94c>] ? sys_read+0x42/0x66
[<c1d74903>] ? sysenter_do_call+0x12/0x28
Code: 00 00 89 44 24 14 8b 81 a8 00 00 00 89 44 24 10 89 54 24 0c 8b 41 50 89 44 24 08 89 74 24 04 c7 04 24 45 35 14 c2 e8 fa 09 18 00 <0f> 0b 83 c4 24 5b 5e 5d c3 55 89 e5 57 56 53 83 ec 30 e8 ac a8
EIP: [<c1bda60d>] skb_put+0x89/0x92 SS:ESP 0068:f64cdef4
---[ end trace 1d38b9741c67ed6b ]---
And in hindsight i have to admit that i saw this in randconfig testing in the
past few weeks, i just never managed to reproduce it ...
So yes, the fact that this time it crashed in networking (not in block IO)
clearly implicates SLUB as well.
And the trigger condition is the lockless SLUB code on 32-bit,
non-64-bit-cmpxchg platforms. I'd not be surprised if some embedded platforms
triggered this too.
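
For readers unfamiliar with this kind of fast path, here is a rough user-space sketch of the general technique. It assumes (this mail does not spell it out) that the lockless path updates a freelist head together with a transaction id in one double-word compare-and-exchange. All names below (cpu_slab, slab_alloc, NIL, ...) are hypothetical and this is not the mm/slub.c code; it only shows why the two fields have to change as a single atomic unit. On a CPU without a 64-bit cmpxchg that single update has to be emulated, and the emulation must provide the same atomicity as the native instruction.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SLAB_OBJS 8u
#define NIL 0xffffffffu          /* hypothetical "empty freelist" marker */

/*
 * Hypothetical per-CPU slab state: the freelist head (an object index)
 * and a transaction id are packed into one 64-bit word so that both can
 * be replaced by a single compare-and-exchange.
 */
struct cpu_slab {
    _Atomic uint64_t head_tid;   /* low 32 bits: head, high 32 bits: tid */
    uint32_t next[SLAB_OBJS];    /* next[i] = object following object i  */
};

static uint64_t pack(uint32_t head, uint32_t tid)
{
    return ((uint64_t)tid << 32) | head;
}

/* Chain objects 0..n-1 into a freelist and reset the transaction id. */
static void slab_init(struct cpu_slab *s, uint32_t n)
{
    if (n > SLAB_OBJS)
        n = SLAB_OBJS;
    for (uint32_t i = 0; i < n; i++)
        s->next[i] = (i + 1 < n) ? i + 1 : NIL;
    atomic_store(&s->head_tid, pack(n ? 0 : NIL, 0));
}

/*
 * Lockless allocation: read (head, tid), compute the new pair, and
 * install it only if nothing changed in between.  If head and tid were
 * updated by two separate stores, an interrupt landing between them
 * could leave the freelist inconsistent.
 */
static uint32_t slab_alloc(struct cpu_slab *s)
{
    uint64_t old = atomic_load(&s->head_tid);

    for (;;) {
        uint32_t head = (uint32_t)old;
        uint32_t tid = (uint32_t)(old >> 32);

        if (head == NIL)
            return NIL;          /* freelist is empty */

        uint64_t next_val = pack(s->next[head], tid + 1);

        /* On failure 'old' is reloaded with the current value; retry. */
        if (atomic_compare_exchange_weak(&s->head_tid, &old, next_val))
            return head;
    }
}
```

Calling slab_init(&s, 3) and then slab_alloc(&s) repeatedly hands out objects 0, 1 and 2 and then returns NIL, with the transaction id advancing by one per successful allocation.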
Ingo