* filesystem access vs 120 seconds timeouts
From: Harald Dunkel @ 2011-08-20 6:57 UTC
To: Kernel Mailing List
Hi folks,
on huge disk IO operations I get something like this from time
to time:
[ 6220.508495] INFO: task jbd2/sdb3-8:1616 blocked for more than 120 seconds.
[ 6220.540831] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6220.573046] jbd2/sdb3-8 D 0000000000000000 0 1616 2 0x00000000
[ 6220.573053] ffff88021216e050 0000000000000046 ffff8801eab35a40 0000000000000000
[ 6220.573058] ffffffff81401020 ffff8802121bbfd8 0000000000010300 0000000000004000
[ 6220.573063] ffff8802106cbac0 ffff8802106cba70 ffff88020ec1c000 ffffffff81136cc1
[ 6220.573069] Call Trace:
[ 6220.573078] [<ffffffff81136cc1>] ? cfq_add_rq_rb+0xb6/0xc7
[ 6220.573085] [<ffffffff8113a973>] ? kobject_get+0x12/0x17
[ 6220.573093] [<ffffffff811cf573>] ? scsi_request_fn+0x374/0x44f
[ 6220.573100] [<ffffffff81083800>] ? find_get_page+0x4a/0x76
[ 6220.573105] [<ffffffff810838f8>] ? __lock_page+0x66/0x66
[ 6220.573111] [<ffffffff812a97aa>] ? io_schedule+0x4b/0x5d
[ 6220.573116] [<ffffffff810838fe>] ? sleep_on_page+0x6/0xa
[ 6220.573121] [<ffffffff812a9c8e>] ? __wait_on_bit+0x3e/0x71
[ 6220.573127] [<ffffffff81083a54>] ? wait_on_page_bit+0x6e/0x73
[ 6220.573133] [<ffffffff8104960b>] ? autoremove_wake_function+0x2a/0x2a
[ 6220.573138] [<ffffffff81083b04>] ? filemap_fdatawait_range+0x73/0x121
[ 6220.573155] [<ffffffff81129921>] ? submit_bio+0xb3/0xbc
[ 6220.573166] [<ffffffffa017aabb>] ? jbd2_journal_commit_transaction+0x75f/0xf84 [jbd2]
[ 6220.573170] [<ffffffff8103d6b7>] ? lock_timer_base.isra.25+0x22/0x47
[ 6220.573174] [<ffffffffa017d70c>] ? kjournald2+0xc0/0x20a [jbd2]
[ 6220.573177] [<ffffffff810495e1>] ? abort_exclusive_wait+0x79/0x79
[ 6220.573181] [<ffffffffa017d64c>] ? commit_timeout+0x5/0x5 [jbd2]
[ 6220.573184] [<ffffffff81049016>] ? kthread+0x76/0x7e
[ 6220.573187] [<ffffffff812ac814>] ? kernel_thread_helper+0x4/0x10
[ 6220.573190] [<ffffffff81048fa0>] ? kthread_worker_fn+0x139/0x139
[ 6220.573192] [<ffffffff812ac810>] ? gs_change+0xb/0xb
Is the timeout of 120 seconds still reasonable? Should I simply switch
off the message, as suggested?
Regards
Harri
* Re: filesystem access vs 120 seconds timeouts
From: Jan Kara @ 2011-09-05 14:17 UTC
To: Harald Dunkel; +Cc: Kernel Mailing List
Hello,
On Sat 20-08-11 08:57:12, Harald Dunkel wrote:
> on huge disk IO operations I get something like this from time
> to time:
>
> [ 6220.508495] INFO: task jbd2/sdb3-8:1616 blocked for more than 120 seconds.
> [ 6220.540831] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 6220.573046] jbd2/sdb3-8 D 0000000000000000 0 1616 2 0x00000000
> [ 6220.573053] ffff88021216e050 0000000000000046 ffff8801eab35a40 0000000000000000
> [ 6220.573058] ffffffff81401020 ffff8802121bbfd8 0000000000010300 0000000000004000
> [ 6220.573063] ffff8802106cbac0 ffff8802106cba70 ffff88020ec1c000 ffffffff81136cc1
> [ 6220.573069] Call Trace:
> [ 6220.573078] [<ffffffff81136cc1>] ? cfq_add_rq_rb+0xb6/0xc7
> [ 6220.573085] [<ffffffff8113a973>] ? kobject_get+0x12/0x17
> [ 6220.573093] [<ffffffff811cf573>] ? scsi_request_fn+0x374/0x44f
> [ 6220.573100] [<ffffffff81083800>] ? find_get_page+0x4a/0x76
> [ 6220.573105] [<ffffffff810838f8>] ? __lock_page+0x66/0x66
> [ 6220.573111] [<ffffffff812a97aa>] ? io_schedule+0x4b/0x5d
> [ 6220.573116] [<ffffffff810838fe>] ? sleep_on_page+0x6/0xa
> [ 6220.573121] [<ffffffff812a9c8e>] ? __wait_on_bit+0x3e/0x71
> [ 6220.573127] [<ffffffff81083a54>] ? wait_on_page_bit+0x6e/0x73
> [ 6220.573133] [<ffffffff8104960b>] ? autoremove_wake_function+0x2a/0x2a
> [ 6220.573138] [<ffffffff81083b04>] ? filemap_fdatawait_range+0x73/0x121
> [ 6220.573155] [<ffffffff81129921>] ? submit_bio+0xb3/0xbc
> [ 6220.573166] [<ffffffffa017aabb>] ? jbd2_journal_commit_transaction+0x75f/0xf84 [jbd2]
> [ 6220.573170] [<ffffffff8103d6b7>] ? lock_timer_base.isra.25+0x22/0x47
> [ 6220.573174] [<ffffffffa017d70c>] ? kjournald2+0xc0/0x20a [jbd2]
> [ 6220.573177] [<ffffffff810495e1>] ? abort_exclusive_wait+0x79/0x79
> [ 6220.573181] [<ffffffffa017d64c>] ? commit_timeout+0x5/0x5 [jbd2]
> [ 6220.573184] [<ffffffff81049016>] ? kthread+0x76/0x7e
> [ 6220.573187] [<ffffffff812ac814>] ? kernel_thread_helper+0x4/0x10
> [ 6220.573190] [<ffffffff81048fa0>] ? kthread_worker_fn+0x139/0x139
> [ 6220.573192] [<ffffffff812ac810>] ? gs_change+0xb/0xb
>
>
> Is the timeout of 120 seconds still reasonable? Should I simply switch
> off the message, as suggested?
Hmm, yeah. The warning is in fact saying that some process has been blocked
for more than 120s, e.g. on some lock. Usually that indicates that something
went really wrong, but there are some cases, like waiting for IO, where it can
simply take that long for the IO to finish when the load is big enough... So if
these messages annoy you, just switch the warning off.
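For reference, a minimal sketch of how the warning could be inspected and
switched off (or the timeout raised) from a shell. The helper names
hung_task_get and hung_task_set and the HUNG_TASK_FILE override variable are
made up for illustration; only the /proc path comes from the kernel message
itself. Writing to the real file needs root.

```shell
#!/bin/sh
# Path of the knob named in the kernel message quoted above.
# HUNG_TASK_FILE is a hypothetical override, present only so that the
# helpers can be tried against a scratch file without root.
HUNG_TASK_FILE="${HUNG_TASK_FILE:-/proc/sys/kernel/hung_task_timeout_secs}"

# Print the current timeout in seconds (0 means the check is disabled).
hung_task_get() {
    cat "$HUNG_TASK_FILE"
}

# Set the timeout in seconds; rejects anything that is not a plain
# non-negative integer and leaves the file untouched in that case.
hung_task_set() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: hung_task_set SECONDS" >&2; return 1 ;;
    esac
    echo "$1" > "$HUNG_TASK_FILE"
}
```

With these, "hung_task_set 0" does the same as the echo line in the log, while
something like "hung_task_set 600" would keep the check but only warn after
ten minutes. Either change is lost on reboot; a kernel.hung_task_timeout_secs
entry in /etc/sysctl.conf should make it persistent.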
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR