From: Jens Axboe <axboe@suse.de>
To: Linus Torvalds <torvalds@osdl.org>
Cc: "David Martínez Moreno" <ender@debian.org>,
"Kernel Mailing List" <linux-kernel@vger.kernel.org>,
clubinfo.servers@adi.uam.es, "Ingo Molnar" <mingo@elte.hu>,
"Neil Brown" <neilb@cse.unsw.edu.au>
Subject: Re: Errors and later panics in 2.6.0-test11.
Date: Thu, 4 Dec 2003 13:43:42 +0100
Message-ID: <20031204124342.GD1086@suse.de>
In-Reply-To: <Pine.LNX.4.58.0312030916080.6950@home.osdl.org>

On Wed, Dec 03 2003, Linus Torvalds wrote:
>
>
> On Wed, 3 Dec 2003, David Martínez Moreno wrote:
> >
> > I've just rebooted about six hours ago, and it's giving panics elsewhere:
> >
> > [...]
> > Ending XFS recovery on filesystem: md0 (dev: md0)
> > b44: eth0: Link is down.
> > b44: eth0: Link is up at 100 Mbps, full duplex.
> > b44: eth0: Flow control is on for TX and on for RX.
> > eth0: no IPv6 routers present
> > Unable to handle kernel paging request at virtual address 00100104
>
> That's the LIST_POISON stuff: 00100100 is the "bad list pointer". Somebody
> tried to remove a page twice.
>
> Doesn't mean a lot - if your "struct page" got corrupted, anything can
> happen. Quite possibly it's a double free.
>
> > I can rebuild the Debian mirror for not using the RAID and using the SATA
> > disks separately, but will be tomorrow, it's a lot of space to move, and I
> > need remote intervention.
> >
> > Anyway I'd love to know before doing if it will be useful, looking at what
> > Jens has said just ten minutes ago about RAIDs 0/5. Will it help to you? Say
> > so and I'll go for it.
>
> It might be more useful to leave it as RAID0, if you're willing to try out
> patches to try to debug this. The slab-debugging thing I sent out earlier
> is one such patch (but may well cause out-of-memory problems under load),
> and possibly the atomic-decrement checker patch (appended). And maybe Jens
> and Neil can come up with something..

I can reproduce on raid5 with linear dm on top (using XFS). I had to
kill the slab and memory debugging and put some bio debugging in there
instead (the memory debugging interferes with it). It's definitely a bio
use-after-free: clone_endio() ends up with a freed bio.

Program received signal SIGEMT, Emulation trap.
0xc02dd454 in handle_stripe (sh=0xc17cf630) at drivers/md/raid5.c:1009
1009 wbi = wbi2;
(gdb) bt
#0 0xc02dd454 in handle_stripe (sh=0xc17cf630) at drivers/md/raid5.c:1009
#1 0xc02de31f in raid5d (mddev=0xdfd2d200) at drivers/md/raid5.c:1436
#2 0xc02e675a in md_thread (arg=0xdffdc1a0) at drivers/md/md.c:2692
#3 0xc010752d in kernel_thread_helper () at arch/i386/kernel/process.c:226

wbi (dev->written) has already been freed by someone else.

My puny 512MB test box cannot use your slab-debug patch :). The
atomic-checker didn't catch anything.
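
For anyone following the LIST_POISON remark quoted above, here is a
minimal userspace sketch of why a second list_del() on the same entry
would fault at 00100104 on i386. The poison constants are the ones from
the 2.6-era include/linux/list.h. The offset macro and the helper name
are hypothetical, made up for this illustration, and assume 4-byte
pointers. It only does the address arithmetic and dereferences nothing.

```c
#include <stdint.h>

/* Poison values written by list_del() in 2.6-era include/linux/list.h. */
#define LIST_POISON1 0x00100100u  /* stored in entry->next */
#define LIST_POISON2 0x00200200u  /* stored in entry->prev */

/*
 * Assumption: i386 layout of struct list_head { next; prev; } with
 * 4-byte pointers, so 'prev' sits 4 bytes into the structure.
 */
#define I386_OFFSET_PREV 4u

/*
 * list_del() starts with __list_del(entry->prev, entry->next), whose
 * first statement is "next->prev = prev". If the entry was already
 * deleted once, next == LIST_POISON1, so that store targets the
 * address returned here.
 */
static uint32_t second_list_del_store_address(void)
{
    return LIST_POISON1 + I386_OFFSET_PREV;
}
```

The value returned is 0x00100104, which matches the "virtual address
00100104" in the oops quoted above and is consistent with a page being
removed from a list twice.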
--
Jens Axboe