* deadlock with fallocate
From: Thomas Neumann @ 2009-10-07 3:00 UTC
To: linux-kernel
Under 2.6.32-rc1 I observe a very annoying deadlock in fallocate calls with
XFS. When running a very I/O-intensive workload (using posix_fallocate and a
lot of mmaps), the kernel deadlocks semi-reproducibly (ca. 60% of the time).
I have included an excerpt from dmesg below, but I am not sure how useful it
is for debugging.
Is there any way to isolate the problem? Unfortunately it is not 100%
reproducible, and it takes a while to reach the deadlock.
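For illustration, a minimal sketch of the kind of workload described above
(the file name, chunk size, and loop bounds are hypothetical, not taken from
the actual test program):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const off_t chunk = 1 << 20;  /* grow the file in 1 MB steps */
        int fd = open("testfile", O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return 1;
        for (int i = 0; i < 4096; i++) {
            /* reserve the next chunk; glibc issues the fallocate
             * syscall here when the kernel supports it */
            if (posix_fallocate(fd, (off_t)i * chunk, chunk))
                break;
            /* dirty the new range through a mapping, as the
             * workload does */
            char *p = mmap(NULL, chunk, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, (off_t)i * chunk);
            if (p == MAP_FAILED)
                break;
            memset(p, 0xab, chunk);
            munmap(p, chunk);
        }
        close(fd);
        return 0;
    }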
[36600.700131] INFO: task xfsconvertd/0:411 blocked for more than 120 seconds.
[36600.700138] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[36600.700143] xfsconvertd/0 D 0000000000000000 0 411 2 0x00000000
[36600.700153] ffff88007b21d3e0 0000000000000046 ffff88007d4e8c40 ffff88007b21dfd8
[36600.700163] ffff88007adfdb40 0000000000015980 0000000000015980 ffff88007b21dfd8
[36600.700171] 0000000000015980 ffff88007b21dfd8 0000000000015980 ffff88007adfdf00
[36600.700180] Call Trace:
[36600.700195] [<ffffffff81526162>] io_schedule+0x42/0x60
[36600.700204] [<ffffffff810df6d5>] sync_page+0x35/0x50
[36600.700211] [<ffffffff815268e5>] __wait_on_bit+0x55/0x80
[36600.700218] [<ffffffff810df6a0>] ? sync_page+0x0/0x50
[36600.700225] [<ffffffff810df7f0>] wait_on_page_bit+0x70/0x80
[36600.700233] [<ffffffff8107afb0>] ? wake_bit_function+0x0/0x30
[36600.700242] [<ffffffff810ecce8>] shrink_page_list+0x3d8/0x550
[36600.700249] [<ffffffff8107b0de>] ? finish_wait+0x5e/0x80
[36600.700258] [<ffffffff810f85da>] ? congestion_wait+0x7a/0x90
[36600.700265] [<ffffffff8107af70>] ? autoremove_wake_function+0x0/0x40
[36600.700273] [<ffffffff810ed7e6>] shrink_inactive_list+0x6b6/0x700
[36600.700281] [<ffffffff81044250>] ? wake_affine+0x220/0x240
[36600.700289] [<ffffffff81048210>] ? __enqueue_entity+0x80/0x90
[36600.700296] [<ffffffff810ed881>] shrink_list+0x51/0xb0
[36600.700303] [<ffffffff810eddea>] shrink_zone+0x1ea/0x200
[36600.700311] [<ffffffff810ee823>] shrink_zones+0x63/0xf0
[36600.700318] [<ffffffff810ee920>] do_try_to_free_pages+0x70/0x280
[36600.700325] [<ffffffff810eec9c>] try_to_free_pages+0x9c/0xc0
[36600.700332] [<ffffffff810ed0e0>] ? isolate_pages_global+0x0/0x50
[36600.700340] [<ffffffff810e6342>] __alloc_pages_slowpath+0x232/0x520
[36600.700347] [<ffffffff810e6776>] __alloc_pages_nodemask+0x146/0x180
[36600.700355] [<ffffffff811143f7>] alloc_pages_current+0x87/0xd0
[36600.700363] [<ffffffff8111939c>] allocate_slab+0x11c/0x1b0
[36600.700369] [<ffffffff8111945b>] new_slab+0x2b/0x190
[36600.700376] [<ffffffff8111b641>] __slab_alloc+0x121/0x230
[36600.700422] [<ffffffffa009b57d>] ? kmem_zone_alloc+0x5d/0xd0 [xfs]
[36600.700452] [<ffffffffa009b57d>] ? kmem_zone_alloc+0x5d/0xd0 [xfs]
[36600.700459] [<ffffffff8111b980>] kmem_cache_alloc+0xf0/0x130
[36600.700489] [<ffffffffa009b57d>] kmem_zone_alloc+0x5d/0xd0 [xfs]
[36600.700519] [<ffffffffa009b609>] kmem_zone_zalloc+0x19/0x50 [xfs]
[36600.700550] [<ffffffffa009368f>] _xfs_trans_alloc+0x2f/0x70 [xfs]
[36600.700580] [<ffffffffa0093832>] xfs_trans_alloc+0x92/0xa0 [xfs]
[36600.700588] [<ffffffff810f7710>] ? __dec_zone_page_state+0x30/0x40
[36600.700619] [<ffffffffa0083691>] xfs_iomap_write_unwritten+0x71/0x200 [xfs]
[36600.700627] [<ffffffff810e1ad2>] ? mempool_free_slab+0x12/0x20
[36600.700634] [<ffffffff810e1b6a>] ? mempool_free+0x8a/0xa0
[36600.700663] [<ffffffffa009c3d0>] ? xfs_end_bio_unwritten+0x0/0x80 [xfs]
[36600.700692] [<ffffffffa009c435>] xfs_end_bio_unwritten+0x65/0x80 [xfs]
[36600.700701] [<ffffffff81075c47>] run_workqueue+0xb7/0x190
[36600.700707] [<ffffffff81076fa6>] worker_thread+0x96/0xf0
[36600.700714] [<ffffffff8107af70>] ? autoremove_wake_function+0x0/0x40
[36600.700721] [<ffffffff81076f10>] ? worker_thread+0x0/0xf0
[36600.700727] [<ffffffff8107ab8e>] kthread+0x8e/0xa0
[36600.700735] [<ffffffff81012f8a>] child_rip+0xa/0x20
[36600.700742] [<ffffffff8107ab00>] ? kthread+0x0/0xa0
[36600.700748] [<ffffffff81012f80>] ? child_rip+0x0/0x20
[36600.700784] INFO: task dataload:3947 blocked for more than 120 seconds.
[36600.700788] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[36600.700792] dataload D 0000000000000001 0 3947 2121 0x00000000
[36600.700801] ffff880074f37ce8 0000000000000082 0000000000000086 ffff880074f37fd8
[36600.700810] ffff88007b0316d0 0000000000015980 0000000000015980 ffff880074f37fd8
[36600.700818] 0000000000015980 ffff880074f37fd8 0000000000015980 ffff88007b031a90
[36600.700826] Call Trace:
[36600.700859] [<ffffffffa009c0e5>] xfs_ioend_wait+0x85/0xc0 [xfs]
[36600.700868] [<ffffffff8107af70>] ? autoremove_wake_function+0x0/0x40
[36600.700899] [<ffffffffa0097d1d>] xfs_setattr+0x85d/0xb20 [xfs]
[36600.700930] [<ffffffffa0094aa8>] ? xfs_trans_unlocked_item+0x38/0x50 [xfs]
[36600.700960] [<ffffffffa00a2ebd>] xfs_vn_fallocate+0xed/0x100 [xfs]
[36600.700968] [<ffffffff810334a9>] ? default_spin_lock_flags+0x9/0x10
[36600.700976] [<ffffffff81527ecf>] ? _spin_lock_irqsave+0x2f/0x50
[36600.700984] [<ffffffff8112556d>] do_fallocate+0xfd/0x110
[36600.700990] [<ffffffff811255c9>] sys_fallocate+0x49/0x70
[36600.700998] [<ffffffff81011f42>] system_call_fastpath+0x16/0x1b
I have also opened a bug here:
http://oss.sgi.com/bugzilla/show_bug.cgi?id=852
* Re: deadlock with fallocate
From: Andrew Morton @ 2009-10-11 7:57 UTC
To: Thomas Neumann; +Cc: linux-kernel, xfs-masters, Christoph Hellwig
On Wed, 07 Oct 2009 05:00:36 +0200 Thomas Neumann <tneumann@users.sourceforge.net> wrote:
> Under 2.6.32-rc1 I observe a very annoying deadlock in fallocate calls with
> XFS. When running a very I/O-intensive workload (using posix_fallocate and a
> lot of mmaps), the kernel deadlocks semi-reproducibly (ca. 60% of the time).
> I have included an excerpt from dmesg below, but I am not sure how useful it
> is for debugging.
> Is there any way to isolate the problem? Unfortunately it is not 100%
> reproducible, and it takes a while to reach the deadlock.
I'm wondering how serious this bug is.
Will legacy applications fail on newer kernels? Or is it the case that
only recently-written applications which utilise new kernel
functionality will hit this bug?
If the former, then it's time to run around in a panic looking for changes
to revert.
[trace retained for the newly cc'ed]
> [36600.700131] INFO: task xfsconvertd/0:411 blocked for more than 120 seconds.
> [36600.700138] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [36600.700143] xfsconvertd/0 D 0000000000000000 0 411 2 0x00000000
> [36600.700153] ffff88007b21d3e0 0000000000000046 ffff88007d4e8c40 ffff88007b21dfd8
> [36600.700163] ffff88007adfdb40 0000000000015980 0000000000015980 ffff88007b21dfd8
> [36600.700171] 0000000000015980 ffff88007b21dfd8 0000000000015980 ffff88007adfdf00
> [36600.700180] Call Trace:
> [36600.700195] [<ffffffff81526162>] io_schedule+0x42/0x60
> [36600.700204] [<ffffffff810df6d5>] sync_page+0x35/0x50
> [36600.700211] [<ffffffff815268e5>] __wait_on_bit+0x55/0x80
> [36600.700218] [<ffffffff810df6a0>] ? sync_page+0x0/0x50
> [36600.700225] [<ffffffff810df7f0>] wait_on_page_bit+0x70/0x80
> [36600.700233] [<ffffffff8107afb0>] ? wake_bit_function+0x0/0x30
> [36600.700242] [<ffffffff810ecce8>] shrink_page_list+0x3d8/0x550
> [36600.700249] [<ffffffff8107b0de>] ? finish_wait+0x5e/0x80
> [36600.700258] [<ffffffff810f85da>] ? congestion_wait+0x7a/0x90
> [36600.700265] [<ffffffff8107af70>] ? autoremove_wake_function+0x0/0x40
> [36600.700273] [<ffffffff810ed7e6>] shrink_inactive_list+0x6b6/0x700
> [36600.700281] [<ffffffff81044250>] ? wake_affine+0x220/0x240
> [36600.700289] [<ffffffff81048210>] ? __enqueue_entity+0x80/0x90
> [36600.700296] [<ffffffff810ed881>] shrink_list+0x51/0xb0
> [36600.700303] [<ffffffff810eddea>] shrink_zone+0x1ea/0x200
> [36600.700311] [<ffffffff810ee823>] shrink_zones+0x63/0xf0
> [36600.700318] [<ffffffff810ee920>] do_try_to_free_pages+0x70/0x280
> [36600.700325] [<ffffffff810eec9c>] try_to_free_pages+0x9c/0xc0
> [36600.700332] [<ffffffff810ed0e0>] ? isolate_pages_global+0x0/0x50
> [36600.700340] [<ffffffff810e6342>] __alloc_pages_slowpath+0x232/0x520
> [36600.700347] [<ffffffff810e6776>] __alloc_pages_nodemask+0x146/0x180
> [36600.700355] [<ffffffff811143f7>] alloc_pages_current+0x87/0xd0
> [36600.700363] [<ffffffff8111939c>] allocate_slab+0x11c/0x1b0
> [36600.700369] [<ffffffff8111945b>] new_slab+0x2b/0x190
> [36600.700376] [<ffffffff8111b641>] __slab_alloc+0x121/0x230
> [36600.700422] [<ffffffffa009b57d>] ? kmem_zone_alloc+0x5d/0xd0 [xfs]
> [36600.700452] [<ffffffffa009b57d>] ? kmem_zone_alloc+0x5d/0xd0 [xfs]
> [36600.700459] [<ffffffff8111b980>] kmem_cache_alloc+0xf0/0x130
> [36600.700489] [<ffffffffa009b57d>] kmem_zone_alloc+0x5d/0xd0 [xfs]
> [36600.700519] [<ffffffffa009b609>] kmem_zone_zalloc+0x19/0x50 [xfs]
> [36600.700550] [<ffffffffa009368f>] _xfs_trans_alloc+0x2f/0x70 [xfs]
> [36600.700580] [<ffffffffa0093832>] xfs_trans_alloc+0x92/0xa0 [xfs]
> [36600.700588] [<ffffffff810f7710>] ? __dec_zone_page_state+0x30/0x40
> [36600.700619] [<ffffffffa0083691>] xfs_iomap_write_unwritten+0x71/0x200 [xfs]
> [36600.700627] [<ffffffff810e1ad2>] ? mempool_free_slab+0x12/0x20
> [36600.700634] [<ffffffff810e1b6a>] ? mempool_free+0x8a/0xa0
> [36600.700663] [<ffffffffa009c3d0>] ? xfs_end_bio_unwritten+0x0/0x80 [xfs]
> [36600.700692] [<ffffffffa009c435>] xfs_end_bio_unwritten+0x65/0x80 [xfs]
> [36600.700701] [<ffffffff81075c47>] run_workqueue+0xb7/0x190
> [36600.700707] [<ffffffff81076fa6>] worker_thread+0x96/0xf0
> [36600.700714] [<ffffffff8107af70>] ? autoremove_wake_function+0x0/0x40
> [36600.700721] [<ffffffff81076f10>] ? worker_thread+0x0/0xf0
> [36600.700727] [<ffffffff8107ab8e>] kthread+0x8e/0xa0
> [36600.700735] [<ffffffff81012f8a>] child_rip+0xa/0x20
> [36600.700742] [<ffffffff8107ab00>] ? kthread+0x0/0xa0
> [36600.700748] [<ffffffff81012f80>] ? child_rip+0x0/0x20
> [36600.700784] INFO: task dataload:3947 blocked for more than 120 seconds.
> [36600.700788] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [36600.700792] dataload D 0000000000000001 0 3947 2121 0x00000000
> [36600.700801] ffff880074f37ce8 0000000000000082 0000000000000086 ffff880074f37fd8
> [36600.700810] ffff88007b0316d0 0000000000015980 0000000000015980 ffff880074f37fd8
> [36600.700818] 0000000000015980 ffff880074f37fd8 0000000000015980 ffff88007b031a90
> [36600.700826] Call Trace:
> [36600.700859] [<ffffffffa009c0e5>] xfs_ioend_wait+0x85/0xc0 [xfs]
> [36600.700868] [<ffffffff8107af70>] ? autoremove_wake_function+0x0/0x40
> [36600.700899] [<ffffffffa0097d1d>] xfs_setattr+0x85d/0xb20 [xfs]
> [36600.700930] [<ffffffffa0094aa8>] ? xfs_trans_unlocked_item+0x38/0x50 [xfs]
> [36600.700960] [<ffffffffa00a2ebd>] xfs_vn_fallocate+0xed/0x100 [xfs]
> [36600.700968] [<ffffffff810334a9>] ? default_spin_lock_flags+0x9/0x10
> [36600.700976] [<ffffffff81527ecf>] ? _spin_lock_irqsave+0x2f/0x50
> [36600.700984] [<ffffffff8112556d>] do_fallocate+0xfd/0x110
> [36600.700990] [<ffffffff811255c9>] sys_fallocate+0x49/0x70
> [36600.700998] [<ffffffff81011f42>] system_call_fastpath+0x16/0x1b
>
> I have also opened a bug here:
>
> http://oss.sgi.com/bugzilla/show_bug.cgi?id=852
>
* Re: deadlock with fallocate
From: Thomas Neumann @ 2009-10-11 8:53 UTC
To: Andrew Morton; +Cc: linux-kernel, xfs-masters, Christoph Hellwig
> Will legacy applications fail on newer kernels? Or is it the case that
> only recently-written applications which utilise new kernel
> functionality will hit this bug?
In theory posix_fallocate has been around for a while, and glibc will use the
kernel functionality if available, so applications might break. In practice
it is perhaps not that common for applications to use fallocate.
The problem is definitely fallocate-related. When I replace posix_fallocate
with an equivalent ftruncate64 call, the problem goes away. (But then again,
the two calls are not really equivalent, and fallocate has the semantics that
I need.)
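For reference, a sketch of the two variants (fd and size are placeholders;
the difference in allocation behavior is why the calls are not equivalent):

    #define _LARGEFILE64_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    static void grow_with_fallocate(int fd, off_t size)
    {
        /* reserves disk blocks for the whole range up front, so
         * later writes into the range cannot fail with ENOSPC */
        posix_fallocate(fd, 0, size);
    }

    static void grow_with_truncate(int fd, off64_t size)
    {
        /* only sets the file size; the file stays sparse and
         * blocks are allocated lazily on first write */
        ftruncate64(fd, size);
    }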
Furthermore, the deadlocks seem to start occurring after writing more data
than fits in main memory, but I have not investigated this thoroughly.
Thomas
* Re: deadlock with fallocate
From: Christoph Hellwig @ 2009-10-11 12:17 UTC
To: Thomas Neumann
Cc: Andrew Morton, linux-kernel, xfs-masters, Christoph Hellwig
Thomas, I'll look at this ASAP. It's been a bit of a busy time, just
returning from a conference. Any sort of self-contained testcase would
be perfect, though.
* Re: deadlock with fallocate
From: Thomas Neumann @ 2009-10-11 14:19 UTC
To: Christoph Hellwig; +Cc: Andrew Morton, linux-kernel, xfs-masters
On Sunday, 11 October 2009 at 14:17:52, Christoph Hellwig wrote:
> Thomas, I'll look at this ASAP. It's been a bit of a busy time, just
> returning from a conference. Any sort of self-contained testcase would
> be perfect, though.
I will send you an email tomorrow with a pointer to the test data and code.
It is "self-contained" in the sense that it has no external dependencies, but
it is somewhat large...
Thomas