From: Tejun Heo <tj@kernel.org>
To: Jens Axboe <jaxboe@fusionio.com>
Cc: Rob Landley <rlandley@parallels.com>,
Pete Clements <clem@clem.clem-digital.net>,
linux-kernel <linux-kernel@vger.kernel.org>,
"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>
Subject: Re: Commit 7eaceaccab5f40 causing boot hang.
Date: Thu, 31 Mar 2011 14:11:00 +0200 [thread overview]
Message-ID: <20110331121100.GD3385@htj.dyndns.org> (raw)
In-Reply-To: <4D945976.8000401@fusionio.com>
Hello,
On Thu, Mar 31, 2011 at 12:37:42PM +0200, Jens Axboe wrote:
> It seems to hard hang, looks very odd:
>
> [ 84.056007] BUG: soft lockup - CPU#0 stuck for 67s! [kworker/0:2:743]
> [ 84.056008] Modules linked in:
> [ 84.056008] irq event stamp: 334859658
> [ 84.056008] hardirqs last enabled at (334859657): [<ffffffff815c40c7>] _raw_spin_unlock_irq+0x2b/0x30
> [ 84.056008] hardirqs last disabled at (334859658): [<ffffffff815c42e7>] save_args+0x67/0x70
> [ 84.056008] softirqs last enabled at (334855538): [<ffffffff81044819>] __do_softirq+0x1a3/0x1c2
> [ 84.056008] softirqs last disabled at (334855525): [<ffffffff815cb9cc>] call_softirq+0x1c/0x30
> [ 84.056008] CPU 0
> [ 84.056008] Modules linked in:
> [ 84.056008]
> [ 84.056008] Pid: 743, comm: kworker/0:2 Not tainted 2.6.39-rc1+ #12 Bochs Bochs
> [ 84.056008] RIP: 0010:[<ffffffff815c40c9>] [<ffffffff815c40c9>] _raw_spin_unlock_irq+0x2d/0x30
> [ 84.056008] RSP: 0018:ffff88003d343d98 EFLAGS: 00000202
> [ 84.056008] RAX: 0000000013f58d89 RBX: 0000000000000006 RCX: ffff88003d2c5998
> [ 84.056008] RDX: 0000000000000006 RSI: ffff88003d343da0 RDI: ffff88003db19508
> [ 84.056008] RBP: ffff88003d343da0 R08: ffff88003fc15c00 R09: 0000000000000001
> [ 84.056008] R10: ffffffff81e0d040 R11: ffff88003d343d60 R12: ffffffff815cb18e
> [ 84.056008] R13: 0000000000000001 R14: ffff88003d2c5998 R15: ffffffff81069aec
> [ 84.056008] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
> [ 84.056008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 84.056008] CR2: 000000000060d828 CR3: 000000003d3f8000 CR4: 00000000000006f0
> [ 84.056008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 84.056008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 84.056008] Process kworker/0:2 (pid: 743, threadinfo ffff88003d342000, task ffff88003db18f60)
> [ 84.056008] Stack:
> [ 84.056008] ffff88003d2c5870 ffff88003d343dc0 ffffffff812171d3 ffff88003fc15c00
> [ 84.056008] ffff88003d31e6c0 ffff88003d343e50 ffffffff81053e99 ffffffff81053e0b
> [ 84.056008] ffff88003d342010 ffff88003db18f60 0000000000000046 ffff88003fc15c05
> [ 84.056008] Call Trace:
> [ 84.056008] [<ffffffff812171d3>] blk_delay_work+0x32/0x36
> [ 84.056008] [<ffffffff81053e99>] process_one_work+0x230/0x397
> [ 84.056008] [<ffffffff81053e0b>] ? process_one_work+0x1a2/0x397
> [ 84.056008] [<ffffffff8105612a>] worker_thread+0x136/0x255
> [ 84.056008] [<ffffffff81055ff4>] ? manage_workers+0x190/0x190
> [ 84.056008] [<ffffffff8105974a>] kthread+0x7d/0x85
> [ 84.056008] [<ffffffff815cb8d4>] kernel_thread_helper+0x4/0x10
> [ 84.056008] [<ffffffff815c4440>] ? retint_restore_args+0xe/0xe
> [ 84.056008] [<ffffffff810596cd>] ? __init_kthread_worker+0x56/0x56
> [ 84.056008] [<ffffffff815cb8d0>] ? gs_change+0xb/0xb
> [ 84.056008] Code: 01 00 00 00 48 89 e5 53 48 89 fb 48 83 c7 18 48 83 ec 08 48 8b 55 08 e8 11 7b aa ff 48 89 df e8 03 05 c7 ff e8 f3 5e aa ff fb 5e <5b> c9 c3 55 48 89 e5 41 54 49 89 fc 48 8b 55 08 48 83 c7 18 53
Is CONFIG_PREEMPT[_VOLUNTARY] set? The soft lockup detection works by
checking whether a high pri RT task is scheduled periodically and busy
looping in kernel code with preemption disabled or CONFIG_PREEMPT
disabled would trigger it. The backtrace doesn't mean the CPU is
stuck there not progressing. It just shows where the CPU is at the
moment of triggering and the softlockup triggering itself indicates
that IRQ and bottom halves are running fine.
I think more likely failure mode is something is looping in block path
expecting someone else to do something but as it is busy looping and
preemption isn't enabled the someone else can't proceed. We had a
similar problem in md lately.
http://thread.gmane.org/gmane.linux.raid/33020
If this is the case, a good way to debug would be triggering sysrq-l
multiple times while the machine is hung and see where the CPU is busy
looping.
Thanks.
--
tejun
next prev parent reply other threads:[~2011-03-31 12:11 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-29 6:19 Commit 7eaceaccab5f40 causing boot hang Rob Landley
2011-03-29 6:35 ` Jens Axboe
2011-03-29 8:52 ` Rob Landley
2011-03-29 13:59 ` Jens Axboe
2011-03-29 14:13 ` Rob Landley
2011-03-29 14:16 ` Jens Axboe
2011-03-29 15:51 ` Pete Clements
2011-03-30 6:06 ` Rob Landley
2011-03-30 11:38 ` Jens Axboe
2011-03-30 13:52 ` Rob Landley
2011-03-31 9:02 ` Jens Axboe
2011-03-31 9:11 ` Rob Landley
2011-03-31 10:07 ` Jens Axboe
2011-03-31 10:37 ` Jens Axboe
2011-03-31 10:47 ` Rob Landley
2011-03-31 11:09 ` Jens Axboe
2011-04-01 3:11 ` Nicholas A. Bellinger
2011-04-03 7:05 ` Andreas Mohr
2011-04-01 12:26 ` Mark Lord
2011-04-09 2:56 ` Rob Landley
2011-04-09 3:00 ` Jens Axboe
2011-03-31 12:11 ` Tejun Heo [this message]
2011-03-31 12:33 ` Jens Axboe
2011-03-31 13:09 ` Richard Kennedy
2011-03-31 13:18 ` Jens Axboe
2011-03-31 13:23 ` Richard Kennedy
2011-03-31 13:33 ` Jens Axboe
2011-03-31 14:49 ` Richard Kennedy
2011-04-04 13:47 ` Richard Kennedy
2011-04-13 10:25 ` Richard Kennedy
2011-04-13 10:36 ` Jens Axboe
2011-04-06 11:55 ` Rob Landley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110331121100.GD3385@htj.dyndns.org \
--to=tj@kernel.org \
--cc=clem@clem.clem-digital.net \
--cc=jaxboe@fusionio.com \
--cc=linux-ide@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rlandley@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).