From: Jens Axboe <axboe@suse.de>
To: Linus Torvalds <torvalds@osdl.org>
Cc: "David Martínez Moreno" <ender@debian.org>,
"Kernel Mailing List" <linux-kernel@vger.kernel.org>,
clubinfo.servers@adi.uam.es, "Ingo Molnar" <mingo@elte.hu>,
"Neil Brown" <neilb@cse.unsw.edu.au>
Subject: Re: Errors and later panics in 2.6.0-test11.
Date: Thu, 4 Dec 2003 13:43:42 +0100
Message-ID: <20031204124342.GD1086@suse.de>
In-Reply-To: <Pine.LNX.4.58.0312030916080.6950@home.osdl.org>

On Wed, Dec 03 2003, Linus Torvalds wrote:
>
>
> On Wed, 3 Dec 2003, David Martínez Moreno wrote:
> >
> > I've just rebooted about six hours ago, and it's giving panics elsewhere:
> >
> > [...]
> > Ending XFS recovery on filesystem: md0 (dev: md0)
> > b44: eth0: Link is down.
> > b44: eth0: Link is up at 100 Mbps, full duplex.
> > b44: eth0: Flow control is on for TX and on for RX.
> > eth0: no IPv6 routers present
> > Unable to handle kernel paging request at virtual address 00100104
>
> That's the LIST_POISON stuff: 00100100 is the "bad list pointer". Somebody
> tried to remove a page twice.
>
> Doesn't mean a lot - if your "struct page" got corrupted, anything can
> happen. Quite possibly it's a double free.
>
> > I can rebuild the Debian mirror so that it uses the SATA disks separately
> > instead of the RAID, but that will have to wait until tomorrow: it's a lot
> > of data to move, and I need remote intervention.
> >
> > Anyway, before doing that I'd love to know whether it would be useful, given
> > what Jens said just ten minutes ago about RAID 0/5. Would it help you? Say
> > so and I'll go for it.
>
> It might be more useful to leave it as RAID0, if you're willing to try out
> patches to try to debug this. The slab-debugging thing I sent out earlier
> is one such patch (but may well cause out-of-memory problems under load),
> and possibly the atomic-decrement checker patch (appended). And maybe Jens
> and Neil can come up with something..

I can reproduce on raid5 with linear dm on top (using XFS). I need to
kill the slab and memory debugging; I've put some bio debugging in there
instead (the memory debugging interferes with it). It's definitely a bio
use-after-free case: clone_endio() ends up with a freed bio.

Program received signal SIGEMT, Emulation trap.
0xc02dd454 in handle_stripe (sh=0xc17cf630) at drivers/md/raid5.c:1009
1009 wbi = wbi2;
(gdb) bt
#0 0xc02dd454 in handle_stripe (sh=0xc17cf630) at drivers/md/raid5.c:1009
#1 0xc02de31f in raid5d (mddev=0xdfd2d200) at drivers/md/raid5.c:1436
#2 0xc02e675a in md_thread (arg=0xdffdc1a0) at drivers/md/md.c:2692
#3 0xc010752d in kernel_thread_helper () at arch/i386/kernel/process.c:226

wbi (dev->written) has already been freed by someone else.

My puny 512MB test box cannot use your slab-debug patch :). The
atomic-checker didn't catch anything.

--
Jens Axboe