From: Jens Axboe <jens.axboe@oracle.com>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org
Subject: Re: 2.6.25-$sha1: RIP call_for_each_cic+0x25/0x50
Date: Sun, 4 May 2008 21:08:11 +0200
Message-ID: <20080504190811.GP12774@kernel.dk>
In-Reply-To: <20080430221250.GA10150@martell.zuzino.mipt.ru>
On Thu, May 01 2008, Alexey Dobriyan wrote:
> On Tue, Apr 29, 2008 at 11:06:05AM +0200, Jens Axboe wrote:
> > On Tue, Apr 29 2008, Alexey Dobriyan wrote:
> > > On Mon, Apr 28, 2008 at 11:55:09PM +0400, Alexey Dobriyan wrote:
> > > > On Mon, Apr 28, 2008 at 02:04:13PM +0200, Jens Axboe wrote:
> > > > > On Mon, Apr 28 2008, Andrew Morton wrote:
> > > > > > On Mon, 28 Apr 2008 02:55:53 +0400 Alexey Dobriyan <adobriyan@gmail.com> wrote:
> > > > > >
> > > > > > > This happened while ~90 cross-compile jobs were running in parallel on
> > > > > > > ext2/noatime partition (slowly -- much debugging was on)
> > > > > > >
> > > > > > >
> > > > > > > general protection fault: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
> > > > > > > CPU 0
> > > > > > > Modules linked in: ext2 nf_conntrack_irc ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables usblp uhci_hcd ehci_hcd usbcore sr_mod cdrom
> > > > > > > Pid: 16483, comm: as Not tainted 2.6.25-c3bf9bc243092c53946fd6d8ebd6dc2f4e572d48 #1
> > > > > > > RIP: 0010:[<ffffffff80307525>] [<ffffffff80307525>] call_for_each_cic+0x25/0x50
> > > > > > > RSP: 0018:ffff810170811e58 EFLAGS: 00010202
> > > > > > > RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
> > > > > > > RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff81010ff92000
> > > > > > > RBP: ffff810170811e78 R08: 0000000000000001 R09: 0000000000000000
> > > > > > > R10: 0000000000000000 R11: ffff8100010069d8 R12: ffff810138ada300
> > > > > > > R13: ffffffff803075b0 R14: ffff81017fcd2000 R15: ffff81010ff92168
> > > > > > > FS: 00002ac3462426f0(0000) GS:ffffffff805d0000(0000) knlGS:0000000000000000
> > > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > > > > > CR2: 00002ab602550000 CR3: 000000013609d000 CR4: 0000000000000660
> > > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > > > Process as (pid: 16483, threadinfo ffff810170810000, task ffff81010ff92000)
> > > > > > > Stack: ffff810170811e88 ffff810138ada300 0000000000000010 ffff81010ff92100
> > > > > > > ffff810170811e88 ffffffff80307580 ffff810170811ea8 ffffffff80302a55
> > > > > > > ffff81010ff92100 ffff810138ada300 ffff810170811ec8 ffffffff80302b1f
> > > > > > > Call Trace:
> > > > > > > [<ffffffff80307580>] cfq_free_io_context+0x10/0x20
> > > > > > > [<ffffffff80302a55>] put_io_context+0x85/0x90
> > > > > > > [<ffffffff80302b1f>] exit_io_context+0x8f/0xb0
> > > > > > > [<ffffffff80235d19>] do_exit+0x549/0x780
> > > > > > > [<ffffffff80235f8e>] do_group_exit+0x3e/0xb0
> > > > > > > [<ffffffff80236012>] sys_exit_group+0x12/0x20
> > > > > > > [<ffffffff8020b6db>] system_call_after_swapgs+0x7b/0x80
> > > > > > >
> > > > > > >
> > > > > > > Code: 84 00 00 00 00 00 55 48 89 e5 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 08 e8 18 e1 f5 ff 49 8b 44 24 68 48 85 c0 74 1e 48 89 c3 <48> 8b 03 48 8d 73 88 4c 89 e7 0f 18 08 41 ff d5 48 8b 03 48 85
> > > > > > > RIP [<ffffffff80307525>] call_for_each_cic+0x25/0x50
> > > > > > > RSP <ffff810170811e58>
> > > > > > > ---[ end trace ca143223eefdc828 ]---
> > > > > > > Fixing recursive fault but reboot is needed!
> > > > > > cfq-iosched.c hasn't been altered (yet) so it might not be a regression.
> > >
> > >
> > > > > > It's not a regression; it's definitely in 2.6.25 as well. So that's a
> > > > > > bit scary. I've been looking over this stuff this morning but haven't
> > > > > > pinpointed anything yet.
> > > > >
> > > > > Alexey, is this something that reproduces for you?
> > > >
> > > > > Not yet. The second run of the same workload went fine, and I've never
> > > > > seen such oopses before.
> > >
> > > And it oopses the very same way on the third run. as(1) again.
> > > So if there are any debugging patches, let me know.
> >
> > > There seems to be a small race in the destructor path. Can you see if
> > > this makes a difference?
> >
> > diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> > index e34df7c..012f065 100644
> > --- a/block/blk-ioc.c
> > +++ b/block/blk-ioc.c
> > @@ -41,8 +41,8 @@ int put_io_context(struct io_context *ioc)
> > >  		rcu_read_lock();
> > >  		if (ioc->aic && ioc->aic->dtor)
> > >  			ioc->aic->dtor(ioc->aic);
> > > -		rcu_read_unlock();
> > >  		cfq_dtor(ioc);
> > > +		rcu_read_unlock();
> > >
> > >  		kmem_cache_free(iocontext_cachep, ioc);
> > >  		return 1;
>
> This helps in the sense that three bulk cross-compile runs finished to the end.
> You'll hear from me if another such oops resurfaces.
Still looking good?
--
Jens Axboe
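
For readers following the quoted patch: its only change is to move
rcu_read_unlock() below cfq_dtor(ioc), so that the destructor's walk over the
cic list (the call_for_each_cic in the trace) runs inside the read-side
section. The sketch below is a single-threaded userspace toy of that ordering,
not kernel code. Every toy_* name, the depth counter and demo_unprotected()
are hypothetical stand-ins invented for illustration, and the assumption that
cfq_dtor() is what walks the list is inferred from the call trace. It counts
how many list nodes are visited with no read-side section held under each
ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for rcu_read_lock()/rcu_read_unlock(); NOT the real RCU API. */
static int toy_rcu_depth;        /* > 0 while inside a read-side section */
static int unprotected_visits;   /* nodes visited outside any section */

static void toy_rcu_read_lock(void)   { toy_rcu_depth++; }
static void toy_rcu_read_unlock(void) { assert(toy_rcu_depth > 0); toy_rcu_depth--; }

/* Hypothetical, heavily simplified versions of the cic list and io_context. */
struct toy_cic { struct toy_cic *next; };
struct toy_ioc { struct toy_cic *cic_list; };

/* Walk the list as the destructor would, counting unprotected node visits. */
static void toy_cfq_dtor(struct toy_ioc *ioc)
{
	for (struct toy_cic *c = ioc->cic_list; c != NULL; c = c->next)
		if (toy_rcu_depth == 0)
			unprotected_visits++;
}

/* Ordering before the patch: unlock first, then walk the list. */
static int toy_put_before_patch(struct toy_ioc *ioc)
{
	unprotected_visits = 0;
	toy_rcu_read_lock();
	/* the aic destructor call would sit here */
	toy_rcu_read_unlock();
	toy_cfq_dtor(ioc);
	return unprotected_visits;
}

/* Ordering after the patch: walk the list, then unlock. */
static int toy_put_after_patch(struct toy_ioc *ioc)
{
	unprotected_visits = 0;
	toy_rcu_read_lock();
	/* the aic destructor call would sit here */
	toy_cfq_dtor(ioc);
	toy_rcu_read_unlock();
	return unprotected_visits;
}

/* Build a two-node list and report unprotected visits for one ordering. */
int demo_unprotected(int patched)
{
	struct toy_cic b = { NULL };
	struct toy_cic a = { &b };
	struct toy_ioc ioc = { &a };

	return patched ? toy_put_after_patch(&ioc) : toy_put_before_patch(&ioc);
}
```

With the pre-patch ordering both nodes are visited outside the section; with
the patched ordering none are. The toy cannot reproduce the race itself. It
only makes visible which accesses the read-side section covers.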