From: Steven Whitehouse <swhiteho@redhat.com>
To: linux-kernel@vger.kernel.org, Jens Axboe <jaxboe@fusionio.com>,
Tejun Heo <tj@kernel.org>
Subject: Strange block/scsi/workqueue issue
Date: Mon, 11 Apr 2011 15:56:03 +0100 [thread overview]
Message-ID: <1302533763.2596.23.camel@dolmen> (raw)
Hi,
Sometime around the -rc1 time (I'm not sure exactly, but it seemed to
start around then) my test box stopped booting. Lacking a serial console
I didn't get around to looking further before I went on holiday, but
since the problem is still there in the latest Linus' tree, I've been
looking again today.
My original problem was that the messages were scrolling off the screen
before I could read them. Using a video camera to try and capture them
was unsuccessful.
Today I used a serial console, and magically it seemed that the initial
boot issue had gone away - so I guess there is at least an element of
timing dependence. However, not very long after boot, I got the attached
messages. These look suspiciously similar (bearing in mind I could only
see the last few lines) to the messages I was getting with the boot
issue.
I should also say that the box was idle aside from the usual f14 start
up tasks, so there was not much I/O going on at the time.
Is the workqueue by any chance trying to wait synchronously for itself
to finish?
Steve.
INFO: task kworker/2:1:382 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/2:1 D ffff8800cb5d2500 4528 382 2 0x00000000
ffff8800c88d5880 0000000000000086 00000000001d2500 ffff8800c88d57c8
0000000000000086 ffff8800c88d5fd8 ffff8800c92d8b80 ffff8800c88d5fd8
ffff8800c88d4000 ffff8800c88d4000 ffff8800c88d4000 ffff8800c88d5fd8
Call Trace:
[<ffffffff8107fdea>] ? finish_task_switch+0x3a/0x110
[<ffffffff8167ebe6>] ? _raw_spin_unlock_irq+0x46/0x70
[<ffffffff8107fe1f>] ? finish_task_switch+0x6f/0x110
[<ffffffff8107fdea>] ? finish_task_switch+0x3a/0x110
[<ffffffff8167b8e5>] schedule_timeout+0x295/0x310
[<ffffffff816826b9>] ? sub_preempt_count+0xa9/0xe0
[<ffffffff8167a650>] wait_for_common+0x120/0x170
[<ffffffff810898c0>] ? try_to_wake_up+0x360/0x360
[<ffffffff8167a748>] wait_for_completion+0x18/0x20
[<ffffffff810aba4c>] wait_on_cpu_work+0xec/0x100
[<ffffffff810aa460>] ? do_work_for_cpu+0x30/0x30
[<ffffffff810abb3b>] wait_on_work+0xdb/0x150
[<ffffffff810aba60>] ? wait_on_cpu_work+0x100/0x100
[<ffffffff810abc33>] __cancel_work_timer+0x83/0x130
[<ffffffff810abced>] cancel_delayed_work_sync+0xd/0x10
[<ffffffff813b24b4>] blk_sync_queue+0x24/0x50
[<ffffffff813b24ef>] blk_cleanup_queue+0xf/0x60
[<ffffffff81479a89>] scsi_free_queue+0x9/0x10
[<ffffffff8147d30b>] scsi_device_dev_release_usercontext+0xeb/0x140
[<ffffffff810cbd5d>] ? trace_hardirqs_on_caller+0x14d/0x190
[<ffffffff8147d220>] ? scsi_device_cls_release+0x20/0x20
[<ffffffff810ac826>] execute_in_process_context+0x86/0xa0
[<ffffffff8147d1f7>] scsi_device_dev_release+0x17/0x20
[<ffffffff814609f2>] device_release+0x22/0x90
[<ffffffff8167ef98>] ? retint_restore_args+0x13/0x13
[<ffffffff813c8165>] kobject_release+0x45/0x90
[<ffffffff813c8120>] ? kobject_del+0x40/0x40
[<ffffffff813c9767>] kref_put+0x37/0x70
[<ffffffff813c8027>] kobject_put+0x27/0x60
[<ffffffff81460822>] put_device+0x12/0x20
[<ffffffff81478bd9>] scsi_request_fn+0xb9/0x4a0
[<ffffffff813aff2a>] __blk_run_queue+0x6a/0x110
[<ffffffff813b1f66>] blk_delay_work+0x26/0x40
[<ffffffff810aa9c7>] process_one_work+0x197/0x520
[<ffffffff810aa961>] ? process_one_work+0x131/0x520
[<ffffffff813b1f40>] ? blk_make_request+0x90/0x90
[<ffffffff810acfec>] worker_thread+0x15c/0x330
[<ffffffff810ace90>] ? manage_workers.clone.20+0x240/0x240
[<ffffffff810ace90>] ? manage_workers.clone.20+0x240/0x240
[<ffffffff810b1f16>] kthread+0xa6/0xb0
[<ffffffff81687064>] kernel_thread_helper+0x4/0x10
[<ffffffff8107fe1f>] ? finish_task_switch+0x6f/0x110
[<ffffffff8167ebe6>] ? _raw_spin_unlock_irq+0x46/0x70
[<ffffffff8167ef98>] ? retint_restore_args+0x13/0x13
[<ffffffff810b1e70>] ? __init_kthread_worker+0x70/0x70
[<ffffffff81687060>] ? gs_change+0x13/0x13
INFO: lockdep is turned off.
INFO: task kworker/2:1:382 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/2:1 D ffff8800cb5d2500 4528 382 2 0x00000000
ffff8800c88d5880 0000000000000086 00000000001d2500 ffff8800c88d57c8
0000000000000086 ffff8800c88d5fd8 ffff8800c92d8b80 ffff8800c88d5fd8
ffff8800c88d4000 ffff8800c88d4000 ffff8800c88d4000 ffff8800c88d5fd8
Call Trace:
[<ffffffff8107fdea>] ? finish_task_switch+0x3a/0x110
[<ffffffff8167ebe6>] ? _raw_spin_unlock_irq+0x46/0x70
[<ffffffff8107fe1f>] ? finish_task_switch+0x6f/0x110
[<ffffffff8107fdea>] ? finish_task_switch+0x3a/0x110
[<ffffffff8167b8e5>] schedule_timeout+0x295/0x310
[<ffffffff816826b9>] ? sub_preempt_count+0xa9/0xe0
[<ffffffff8167a650>] wait_for_common+0x120/0x170
[<ffffffff810898c0>] ? try_to_wake_up+0x360/0x360
[<ffffffff8167a748>] wait_for_completion+0x18/0x20
[<ffffffff810aba4c>] wait_on_cpu_work+0xec/0x100
[<ffffffff810aa460>] ? do_work_for_cpu+0x30/0x30
[<ffffffff810abb3b>] wait_on_work+0xdb/0x150
[<ffffffff810aba60>] ? wait_on_cpu_work+0x100/0x100
[<ffffffff810abc33>] __cancel_work_timer+0x83/0x130
[<ffffffff810abced>] cancel_delayed_work_sync+0xd/0x10
[<ffffffff813b24b4>] blk_sync_queue+0x24/0x50
[<ffffffff813b24ef>] blk_cleanup_queue+0xf/0x60
[<ffffffff81479a89>] scsi_free_queue+0x9/0x10
[<ffffffff8147d30b>] scsi_device_dev_release_usercontext+0xeb/0x140
[<ffffffff810cbd5d>] ? trace_hardirqs_on_caller+0x14d/0x190
[<ffffffff8147d220>] ? scsi_device_cls_release+0x20/0x20
[<ffffffff810ac826>] execute_in_process_context+0x86/0xa0
[<ffffffff8147d1f7>] scsi_device_dev_release+0x17/0x20
[<ffffffff814609f2>] device_release+0x22/0x90
[<ffffffff8167ef98>] ? retint_restore_args+0x13/0x13
[<ffffffff813c8165>] kobject_release+0x45/0x90
[<ffffffff813c8120>] ? kobject_del+0x40/0x40
[<ffffffff813c9767>] kref_put+0x37/0x70
[<ffffffff813c8027>] kobject_put+0x27/0x60
[<ffffffff81460822>] put_device+0x12/0x20
[<ffffffff81478bd9>] scsi_request_fn+0xb9/0x4a0
[<ffffffff813aff2a>] __blk_run_queue+0x6a/0x110
[<ffffffff813b1f66>] blk_delay_work+0x26/0x40
[<ffffffff810aa9c7>] process_one_work+0x197/0x520
[<ffffffff810aa961>] ? process_one_work+0x131/0x520
[<ffffffff813b1f40>] ? blk_make_request+0x90/0x90
[<ffffffff810acfec>] worker_thread+0x15c/0x330
[<ffffffff810ace90>] ? manage_workers.clone.20+0x240/0x240
[<ffffffff810ace90>] ? manage_workers.clone.20+0x240/0x240
[<ffffffff810b1f16>] kthread+0xa6/0xb0
[<ffffffff81687064>] kernel_thread_helper+0x4/0x10
[<ffffffff8107fe1f>] ? finish_task_switch+0x6f/0x110
[<ffffffff8167ebe6>] ? _raw_spin_unlock_irq+0x46/0x70
[<ffffffff8167ef98>] ? retint_restore_args+0x13/0x13
[<ffffffff810b1e70>] ? __init_kthread_worker+0x70/0x70
[<ffffffff81687060>] ? gs_change+0x13/0x13
INFO: lockdep is turned off.
next reply other threads:[~2011-04-11 14:53 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-11 14:56 Steven Whitehouse [this message]
2011-04-11 17:18 ` Strange block/scsi/workqueue issue Tejun Heo
2011-04-11 17:29 ` Jens Axboe
2011-04-11 17:52 ` Steven Whitehouse
2011-04-12 0:14 ` Tejun Heo
2011-04-12 8:49 ` Steven Whitehouse
2011-04-12 0:47 ` James Bottomley
2011-04-12 2:51 ` Tejun Heo
2011-04-12 4:49 ` James Bottomley
2011-04-12 5:02 ` James Bottomley
2011-04-12 8:42 ` Steven Whitehouse
2011-04-12 13:42 ` James Bottomley
2011-04-12 14:06 ` Steven Whitehouse
2011-04-12 15:14 ` James Bottomley
2011-04-12 16:04 ` Steven Whitehouse
2011-04-12 16:27 ` James Bottomley
2011-04-12 16:51 ` Steven Whitehouse
2011-04-12 17:41 ` James Bottomley
2011-04-12 18:33 ` Steven Whitehouse
2011-04-12 19:56 ` James Bottomley
2011-04-12 20:30 ` Steven Whitehouse
2011-04-12 20:43 ` James Bottomley
2011-04-13 5:18 ` Tejun Heo
2011-04-13 6:06 ` Tejun Heo
2011-04-13 9:20 ` Steven Whitehouse
2011-04-13 14:00 ` Steven Whitehouse
2011-04-13 17:01 ` James Bottomley
2011-04-13 19:35 ` Steven Whitehouse
2011-04-13 20:12 ` Jens Axboe
2011-04-13 20:17 ` James Bottomley
2011-04-22 18:01 ` Tejun Heo
2011-04-22 18:06 ` James Bottomley
2011-04-22 18:30 ` Tejun Heo
2011-05-31 6:05 ` Anton V. Boyarshinov
2011-04-22 18:03 ` Tejun Heo
2011-04-12 5:15 ` Tejun Heo
2011-04-12 15:15 ` James Bottomley
2011-04-13 5:11 ` Tejun Heo
2011-04-13 14:15 ` James Bottomley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1302533763.2596.23.camel@dolmen \
--to=swhiteho@redhat.com \
--cc=jaxboe@fusionio.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox