* XFS handling of synchronous buffers in case of EIO error
@ 2010-12-30 12:28 Ajeet Yadav
2010-12-30 23:13 ` Dave Chinner
From: Ajeet Yadav @ 2010-12-30 12:28 UTC (permalink / raw)
To: xfs
Kernel: 2.6.30.9
I am troubleshooting a hang in XFS during umount.
Test scenario: copy a large number of files using the script below, and
remove the USB device after 3-5 seconds.
index=0
while [ "$?" == 0 ]
do
    index=$((index+1))
    sync
    cp $1/1KB.txt $2/"$index".test
done
In a rare scenario during USB unplug, the umount process hangs at xfs_buf_lock.
The log below shows the hung process.
We have added printk calls to the buffer handling functions xfs_buf_iodone_callbacks(),
xfs_buf_error_relse(), xfs_buf_relse() and xfs_buf_rele().
We have observed that the hang only occurs when bp->b_relse =
xfs_buf_error_relse, i.e. when xfs_buf_iodone_callbacks() executes
XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
XFS_BUF_DONE(bp);
XFS_BUF_FINISH_IOWAIT(bp);
but it is never called by xfs_buf_relse() because b_hold = 3.
Also, we have seen that this problem always occurs when bp->b_relse != NULL &&
bp->b_hold > 1.
I do not know whether the prints below will help you, but I have added printk
tracing for the superblock buffer:
S-functionname (start of function)
E-functionname (end of function)
I think the problem is related to the xfs_buf_iodone_callbacks() synchronous
buffer error handling path and the buffer release.
------------------------------------------------
buf_lock (Here buffer lock success, taken after down())
S-xfs_buf_rele - 3 (Start of xfs_buf_rele(), b_hold = 3)
Call Trace:
[<8032c1bc>] dump_stack+0x8/0x34 from[<801cab2c>] xfs_buf_rele+0xc8/0x28c
[<801cab2c>] xfs_buf_rele+0xc8/0x28c from[<801cad9c>]
xfs_buf_delwri_dequeue+0xac/0x13c
[<801cad9c>] xfs_buf_delwri_dequeue+0xac/0x13c from[<801cb798>]
xfs_bwrite+0x5c/0x128
[<801cb798>] xfs_bwrite+0x5c/0x128 from[<801d5b48>]
xfs_sync_fsdata+0xbc/0x17c
[<801d5b48>] xfs_sync_fsdata+0xbc/0x17c from[<801d690c>]
xfs_quiesce_data+0x34/0x68
[<801d690c>] xfs_quiesce_data+0x34/0x68 from[<801d2b34>]
xfs_fs_sync_fs+0x30/0xec
[<801d2b34>] xfs_fs_sync_fs+0x30/0xec<7>hub 2-0:1.0: state 7 ports 1 chg
0000 evt 0002
usb 2-1: USB disconnect, address 3
from[<800b7ff0>] sync_filesystems+0x118/0x19c
[<800b7ff0>] sync_filesystems+0x118/0x19c from[<800db490>] do_sync+0x38/0x7c
[<800db490>] do_sync+0x38/0x7c from[<800db510>] sys_sync+0x10/0x20
[<800db510>] sys_sync+0x10/0x20 from[<8000ff44>] stack_done+0x20/0x3c
E-xfs_buf_rele (End of function)
S-xfs_bdstrat_cb
S-xfs_buf_rele - 3
S-xfs_buf_iodone_callbacks
Device sda2, XFS metadata write error block 0x0 in sda2
E-xfs_buf_iodone_callbacks
Call Trace:
[<8032c1bc>] dump_stack+0x8/0x34 from[<801cab2c>] xfs_buf_rele+0xc8/0x28c
[<801cab2c>] xfs_buf_rele+0xc8/0x28c from[<801cb2fc>]
xfs_buf_iorequest+0xe8/0x104
[<801cb2fc>] xfs_buf_iorequest+0xe8/0x104 from[<801cbd2c>]
xfs_bdstrat_cb+0x140/0x178
[<801cbd2c>] xfs_bdstrat_cb+0x140/0x178 from[<801cb7ac>]
xfs_bwrite+0x70/0x128
[<801cb7ac>] xfs_bwrite+0x70/0x128 from[<801d5b48>]
xfs_sync_fsdata+0xbc/0x17c
[<801d5b48>] xfs_sync_fsdata+0xbc/0x17c from[<801d690c>]
xfs_quiesce_data+0x34/0x68
[<801d690c>] xfs_quiesce_data+0x34/0x68 from[<801d2b34>]
xfs_fs_sync_fs+0x30/0xec
[<801d2b34>] xfs_fs_sync_fs+0x30/0xec from[<800b7ff0>]
sync_filesystems+0x118/0x19c
[<800b7ff0>] sync_filesystems+0x118/0x19c from[<800db490>] do_sync+0x38/0x7c
[<800db490>] do_sync+0x38/0x7c from[<800db510>] sys_sync+0x10/0x20
[<800db510>] sys_sync+0x10/0x20 from[<8000ff44>] stack_done+0x20/0x3c
E-xfs_buf_rele
E-xfs_bdstrat_cb
xfs_force_shutdown(sda2,0x1) called from line 1020 of file
fs/xfs/linux-2.6/xfs_buf.c. Return address = 0x801cb7f0
Filesystem "sda2": I/O Error Detected. Shutting down filesystem: sda2
Please umount the filesystem, and rectify the problem(s)
S-xfs_buf_relse
S-xfs_buf_rele - 2
Call Trace:
[<8032c1bc>] dump_stack+0x8/0x34 from[<801cab2c>] xfs_buf_rele+0xc8/0x28c
[<801cab2c>] xfs_buf_rele+0xc8/0x28c from[<801cb828>] xfs_bwrite+0xec/0x128
[<801cb828>] xfs_bwrite+0xec/0x128 from[<801d5b48>]
xfs_sync_fsdata+0xbc/0x17c
[<801d5b48>] xfs_sync_fsdata+0xbc/0x17c from[<801d690c>]
xfs_quiesce_data+0x34/0x68
[<801d690c>] xfs_quiesce_data+0x34/0x68 from[<801d2b34>]
xfs_fs_sync_fs+0x30/0xec
[<801d2b34>] xfs_fs_sync_fs+0x30/0xec from[<800b7ff0>]
sync_filesystems+0x118/0x19c
[<800b7ff0>] sync_filesystems+0x118/0x19c from[<800db490>] do_sync+0x38/0x7c
[<800db490>] do_sync+0x38/0x7c from[<800db510>] sys_sync+0x10/0x20
[<800db510>] sys_sync+0x10/0x20 from[<8000ff44>] stack_done+0x20/0x3c
E-xfs_buf_rele
E-xfs_buf_relse
cp: cannot stat '/dtv/usb/sda2/6.test': Input/output error
Filesystem "sda2": xfs_log_force: error 5 returned.
INFO: task khubd:56 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
khubd D [86a516a8] 8032cb2c 0 56 2
237 26 (kernel thread)
Stack : 00000000 801a2a6c 8643ebe0 86b43a8c 86a51830 8032cb2c 7fffffff
86a516a8
00000002 00000001 86b6dc00 80243a10 e0364448 8032c344 e0373f10
8032cb2c
00000087 0000000b 86b43ab8 801d61bc 3b9aca00 8032d16c 00000008
8019f0dc
86b43a88 00008000 8643ebe0 8616d900 0000000b 00000000 00000000
86bae000
86b43b08 801d62e0 7fffffff 86a516a8 00000002 8032e374 00000000
8008fae4
...
Call Trace:
[<8032ca74>] __schedule+0x618/0x6b8 from[<8032cb2c>] schedule+0x18/0x3c
[<8032cb2c>] schedule+0x18/0x3c from[<8032d16c>] schedule_timeout+0x2c/0x1c0
[<8032d16c>] schedule_timeout+0x2c/0x1c0 from[<8032e374>] __down+0x8c/0xdc
[<8032e374>] __down+0x8c/0xdc from[<8004500c>] down+0x40/0x88
[<8004500c>] down+0x40/0x88 from[<801c9c10>] xfs_buf_lock+0xcc/0x178
[<801c9c10>] xfs_buf_lock+0xcc/0x178 from[<801b6550>] xfs_getsb+0x38/0x54
[<801b6550>] xfs_getsb+0x38/0x54 from[<801d5b00>] xfs_sync_fsdata+0x74/0x17c
[<801d5b00>] xfs_sync_fsdata+0x74/0x17c from[<801d690c>]
xfs_quiesce_data+0x34/0x68
[<801d690c>] xfs_quiesce_data+0x34/0x68 from[<801d2b34>]
xfs_fs_sync_fs+0x30/0xec
[<801d2b34>] xfs_fs_sync_fs+0x30/0xec from[<800b878c>]
__fsync_super+0xa4/0xc8
[<800b878c>] __fsync_super+0xa4/0xc8 from[<800b87c4>] fsync_super+0x14/0x28
[<800b87c4>] fsync_super+0x14/0x28 from[<800e5cc4>] fsync_bdev+0x28/0x64
[<800e5cc4>] fsync_bdev+0x28/0x64 from[<801faaa8>]
invalidate_partition+0x28/0x60
[<801faaa8>] invalidate_partition+0x28/0x60 from[<801001b0>]
del_gendisk+0x40/0xf0
[<801001b0>] del_gendisk+0x40/0xf0 from[<8025cb50>] sd_remove+0x40/0xc8
[<8025cb50>] sd_remove+0x40/0xc8 from[<80259b5c>] scsi_bus_remove+0x44/0x5c
[<80259b5c>] scsi_bus_remove+0x44/0x5c from[<802463d4>]
__device_release_driver+0x80/0xbc
[<802463d4>] __device_release_driver+0x80/0xbc from[<80246540>]
device_release_driver+0x28/0x40
[<80246540>] device_release_driver+0x28/0x40 from[<802457b8>]
bus_remove_device+0xb0/0xf0
[<802457b8>] bus_remove_device+0xb0/0xf0 from[<80243b30>]
device_del+0x120/0x1a8
[<80243b30>] device_del+0x120/0x1a8 from[<80259f50>]
__scsi_remove_device+0x40/0x98
[<80259f50>] __scsi_remove_device+0x40/0x98 from[<802569cc>]
scsi_forget_host+0x88/0xfc
[<802569cc>] scsi_forget_host+0x88/0xfc from[<8024f34c>]
scsi_remove_host+0xf8/0x1ac
[<8024f34c>] scsi_remove_host+0xf8/0x1ac from[<e039d6bc>]
quiesce_and_remove_host+0x9c/0x12c [usb_storage]
[<e039d6bc>] quiesce_and_remove_host+0x9c/0x12c [usb_storage]
from[<e039d83c>] usb_stor_disconnect+0x20/0x3c [usb_storage]
[<e039d83c>] usb_stor_disconnect+0x20/0x3c [usb_storage] from[<e0367c58>]
usb_unbind_interface+0x68/0x128 [usbcore]
[<e0367c58>] usb_unbind_interface+0x68/0x128 [usbcore] from[<802463d4>]
__device_release_driver+0x80/0xbc
[<802463d4>] __device_release_driver+0x80/0xbc from[<80246540>]
device_release_driver+0x28/0x40
[<80246540>] device_release_driver+0x28/0x40 from[<802457b8>]
bus_remove_device+0xb0/0xf0
[<802457b8>] bus_remove_device+0xb0/0xf0 from[<80243b30>]
device_del+0x120/0x1a8
[<80243b30>] device_del+0x120/0x1a8 from[<e0364854>]
usb_disable_device+0x14c/0x234 [usbcore]
[<e0364854>] usb_disable_device+0x14c/0x234 [usbcore] from[<e035d6b8>]
usb_disconnect+0x170/0x37c [usbcore]
[<e035d6b8>] usb_disconnect+0x170/0x37c [usbcore] from[<e035f264>]
hub_thread+0x85c/0x18f8 [usbcore]
[<e035f264>] hub_thread+0x85c/0x18f8 [usbcore] from[<8003ff3c>]
kthread+0x5c/0xa0
[<8003ff3c>] kthread+0x5c/0xa0 from[<80008908>]
kernel_thread_helper+0x10/0x18
-------------------------------------------------------------------------------------
INFO: task usb_mount:395 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
usb_mount D [86163ae8] 8032cb2c 0 395
371 (user thread)
Stack : ffffff9c 86b71ef0 00400ae0 800c0150 86163c70 8032cb2c 86163ae8
00000002
86bbbf18 86bbbf00 00000000 00000000 86b71f00 86b71ef8 7f8b0670
8032cb2c
86bbb780 800b2510 00000001 86bbbf00 86bbbf00 8032f024 86bbbf00
800b21c0
86bbbf00 86bbbf00 86b6dc44 86b6dc44 86163ae8 00000002 86b6dc00
86b6dc00
803fe190 800b9350 86bbbf00 86bbbf18 86bbbef8 86bbbf00 86bbbef8
86bbbef8
...
Call Trace:
[<8032ca74>] __schedule+0x618/0x6b8 from[<8032cb2c>] schedule+0x18/0x3c
[<8032cb2c>] schedule+0x18/0x3c from[<8032f024>]
__down_write_nested+0x104/0x128
[<8032f024>] __down_write_nested+0x104/0x128 from[<800b9350>]
deactivate_super+0x70/0x110
[<800b9350>] deactivate_super+0x70/0x110 from[<800d122c>]
sys_umount+0x310/0x358
[<800d122c>] sys_umount+0x310/0x358 from[<8000ff44>] stack_done+0x20/0x3c
-------------------------------------------------------------------------------------
Filesystem "sda2": xfs_log_force: error 5 returned.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS handling of synchronous buffers in case of EIO error
2010-12-30 12:28 XFS handling of synchronous buffers in case of EIO error Ajeet Yadav
@ 2010-12-30 23:13 ` Dave Chinner
2010-12-31 6:47 ` Ajeet Yadav
From: Dave Chinner @ 2010-12-30 23:13 UTC (permalink / raw)
To: Ajeet Yadav; +Cc: xfs
On Thu, Dec 30, 2010 at 05:58:36PM +0530, Ajeet Yadav wrote:
> Kernel: 2.6.30.9
>
> I am trouble shooting a hang in XFS during umount.
> Test scenerio: Copy large number of files files using below script, and
> remove the USB after 3-5 second
FWIW, in future can you please report what kernel you are testing on?
>
> index=0
> while [ "$?" == 0 ]
> do
> index=$((index+1))
> sync
> cp $1/1KB.txt $2/"$index".test
> done
>
> In rare scenerio during USB unplug the umount process hang at xfs_buf_lock.
> Below log shows the hung process
>
> We have put printk to buffer handling functions xfs_buf_iodone_callbacks(),
> xfs_buf_error_relse(), xfs_buf_relse() and xfs_buf_rele()
>
> We always observed the hang only comes when bp->b_relse =
> xfs_buf_error_relse(). i.e when xfs_buf_iodone_callbacks() execute
> XFS_BUF_SET_BRELSE_FUNC(bp,xfs_buf_error_relse);
> XFS_BUF_DONE(bp);
> XFS_BUF_FINISH_IOWAIT(bp);
>
> buf its never called by xfs_buf_relse() because b_hold = 3.
>
> Also we have seen that this problem always comes when bp->relse != NULL &&
> bp->hold > 1.
This appears to be the same problem as reported here:
http://oss.sgi.com/archives/xfs/2010-12/msg00380.html
> I do not know whether below prints will help you, but I have taken printk
> for super block buffer tracing
> S-functionname ( Start of function)
> E-functionname (End of function)
If you have a recent enough kernel, you can get all this information
from the tracing built into XFS.
As it is, the cause of the problem is that setting bp->b_relse
changes the behaviour of xfs_buf_relse(): if bp->b_relse is set, it
doesn't unlock the buffer. This is normally just fine, because
xfs_buf_rele() has a special case to handle buffers with
bp->b_relse set, which adds a hold count and calls the release function
when the hold count drops to zero. The b_relse function is supposed
to unlock the buffer by calling xfs_buf_relse() again.
Unfortunately, the superblock buffer is special: the hold count on
it never drops to zero until very late in the unmount process, because
it is managed by the filesystem. Hence the bp->b_relse function is
never called, the buffer is never unlocked in this case, and
future attempts to access it hang.
I'll need to think about this one for a bit...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS handling of synchronous buffers in case of EIO error
2010-12-30 23:13 ` Dave Chinner
@ 2010-12-31 6:47 ` Ajeet Yadav
2011-01-04 5:19 ` Dave Chinner
From: Ajeet Yadav @ 2010-12-31 6:47 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Dear Dave,
Our kernel is 2.6.30.9, but XFS is backported from 2.6.34.
I have seen similar behaviour in another post, about an ls process
hanging on 2.6.35.9:
http://oss.sgi.com/pipermail/xfs/2010-December/048691.html
I have always seen the hang only when b_relse != NULL and b_hold > 2.
I have made the workaround below; it solves the problem in our case because,
when the USB device is removed, we know we get an EIO error.
But I think we need to review xfs_buf_error_relse() and xfs_buf_relse()
considering the XBF_LOCK flow path.
@@ -1047,9 +1047,19 @@ xfs_buf_iodone_callbacks(
 			/* We actually overwrite the existing b-relse
 			   function at times, but we're gonna be shutting
 			   down anyway. */
-			XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
-			XFS_BUF_DONE(bp);
-			XFS_BUF_FINISH_IOWAIT(bp);
+			if (XFS_BUF_GETERROR(bp) == EIO) {
+				ASSERT(XFS_BUF_TARGET(bp) == mp->m_ddev_targp);
+				XFS_BUF_SUPER_STALE(bp);
+				trace_xfs_buf_item_iodone(bp, _RET_IP_);
+				xfs_buf_do_callbacks(bp, lip);
+				XFS_BUF_SET_FSPRIVATE(bp, NULL);
+				XFS_BUF_CLR_IODONE_FUNC(bp);
+				xfs_biodone(bp);
+			} else {
+				XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
+				XFS_BUF_DONE(bp);
+				XFS_BUF_FINISH_IOWAIT(bp);
+			}
 		}
 		return;
 	}
On Dec 31, 2010 at 4:43 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Dec 30, 2010 at 05:58:36PM +0530, Ajeet Yadav wrote:
> > Kernel: 2.6.30.9
> >
> > I am trouble shooting a hang in XFS during umount.
> > Test scenerio: Copy large number of files files using below script, and
> > remove the USB after 3-5 second
>
> FWIW, in future can you please report what kernel you are testing on?
>
> >
> > index=0
> > while [ "$?" == 0 ]
> > do
> > index=$((index+1))
> > sync
> > cp $1/1KB.txt $2/"$index".test
> > done
> >
> > In rare scenerio during USB unplug the umount process hang at
> xfs_buf_lock.
> > Below log shows the hung process
> >
> > We have put printk to buffer handling functions
> xfs_buf_iodone_callbacks(),
> > xfs_buf_error_relse(), xfs_buf_relse() and xfs_buf_rele()
> >
> > We always observed the hang only comes when bp->b_relse =
> > xfs_buf_error_relse(). i.e when xfs_buf_iodone_callbacks() execute
> > XFS_BUF_SET_BRELSE_FUNC(bp,xfs_buf_error_relse);
> > XFS_BUF_DONE(bp);
> > XFS_BUF_FINISH_IOWAIT(bp);
> >
> > buf its never called by xfs_buf_relse() because b_hold = 3.
> >
> > Also we have seen that this problem always comes when bp->relse != NULL
> &&
> > bp->hold > 1.
>
> This appears to be the same problem as reported here:
>
> http://oss.sgi.com/archives/xfs/2010-12/msg00380.html
>
>
> > I do not know whether below prints will help you, but I have taken printk
> > for super block buffer tracing
> > S-functionname ( Start of function)
> > E-functionname (End of function)
>
> If you have a recent enough kernel, you can get all this information
> from the tracing built into XFS.
>
> As it is, the cause of the problem is that setting bp->b_relse
> changes the behaviour of xfs_buf_relse() - if bp->b_relse is set, it
> doesn't unlock the buffer. This is normally just fine, because
> xfs_buf_rele() has a special case to handle buffers with
> bp->b_relse(), which adds a hold count and call the release function
> when the hold count drops to zero. The b_relse function is supposed
> to unlock the buffer by calling xfs_buf_relse() again.
>
> Unfortunately, the superblock buffer is special - the hold count on
> it never drops to zero until very late in the unmont process because
> it is managed by the filesystem. Hence the bp->b_relse function is
> never called, and hence the buffer is never unlocked in this case.
> Hence future attempts to access it hang.
>
> I'll need to think about this one for a bit...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: XFS handling of synchronous buffers in case of EIO error
2010-12-31 6:47 ` Ajeet Yadav
@ 2011-01-04 5:19 ` Dave Chinner
2011-01-05 8:26 ` Ajeet Yadav
From: Dave Chinner @ 2011-01-04 5:19 UTC (permalink / raw)
To: Ajeet Yadav; +Cc: xfs
On Fri, Dec 31, 2010 at 12:17:12PM +0530, Ajeet Yadav wrote:
> Dear Dave,
>
> Our Kernel is 2.6.30.9 but XFS is backported from 2.6.34.
> But I have seen similar behaviour in another post related to process ls hang
> in 2.6.35.9
> *
>
> http://oss.sgi.com/pipermail/xfs/2010-December/048691.html
>
> *I have always seen the hang problem comes only if comes when b_relse !=
> NULL, and b_hold > 2
>
> I have made below workaround it solved the problem in our case because when
> USB is removed we know we get EIO error.
>
> But I think we need to review xfs_buf_error_relse() and xfs_buf_relse()
> considering XBF_LOCK flow path.
>
> @@ -1047,9 +1047,19 @@ xfs_buf_iodone_callbacks(
>  			/* We actually overwrite the existing b-relse
>  			   function at times, but we're gonna be shutting
>  			   down anyway. */
> -			XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
> -			XFS_BUF_DONE(bp);
> -			XFS_BUF_FINISH_IOWAIT(bp);
> +			if (XFS_BUF_GETERROR(bp) == EIO) {
> +				ASSERT(XFS_BUF_TARGET(bp) == mp->m_ddev_targp);
> +				XFS_BUF_SUPER_STALE(bp);
> +				trace_xfs_buf_item_iodone(bp, _RET_IP_);
> +				xfs_buf_do_callbacks(bp, lip);
> +				XFS_BUF_SET_FSPRIVATE(bp, NULL);
> +				XFS_BUF_CLR_IODONE_FUNC(bp);
> +				xfs_biodone(bp);
> +			} else {
> +				XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
> +				XFS_BUF_DONE(bp);
> +				XFS_BUF_FINISH_IOWAIT(bp);
> +			}
>  		}
>  		return;
>  	}
This won't work reliably because it only handles one specific type
of error. We can get more than just EIO back from the lower layers,
and so if the superblock write gets a different error then we'll
still get the same hang.
Effectively what you are doing here is running the
xfs_buf_error_relse() callback directly in line. This will result in
the buffer being unlocked before the error is pulled off the buffer
after xfs_buf_iowait() completes. Essentially that means that some
other thread can reuse the buffer and clear the error before the
waiter has received the error.
I think the correct fix is to call the bp->b_relse function when the
waiter is woken to clear the error and unlock the buffer. I've just
posted a patch to do this for 2.6.38, but it won't trivially backport
to 2.6.34 or 2.6.30 as the synchronous write interfaces into the
buffer cache have been cleaned up and simplified recently. It should
still be relatively easy to handle, though.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS handling of synchronous buffers in case of EIO error
2011-01-04 5:19 ` Dave Chinner
@ 2011-01-05 8:26 ` Ajeet Yadav
From: Ajeet Yadav @ 2011-01-05 8:26 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Thanks. I think it's better to end this thread by referring to your patch:
http://oss.sgi.com/archives/xfs/2011-01/msg00020.html
On Tue, Jan 4, 2011 at 2:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Dec 31, 2010 at 12:17:12PM +0530, Ajeet Yadav wrote:
> > Dear Dave,
> >
> > Our Kernel is 2.6.30.9 but XFS is backported from 2.6.34.
> > But I have seen similar behaviour in another post related to process ls
> hang
> > in 2.6.35.9
> > *
> >
> > http://oss.sgi.com/pipermail/xfs/2010-December/048691.html
> >
> > *I have always seen the hang problem comes only if comes when b_relse !=
> > NULL, and b_hold > 2
> >
> > I have made below workaround it solved the problem in our case because
> when
> > USB is removed we know we get EIO error.
> >
> > But I think we need to review xfs_buf_error_relse() and xfs_buf_relse()
> > considering XBF_LOCK flow path.
> >
> > @@ -1047,9 +1047,19 @@ xfs_buf_iodone_callbacks(
> >  			/* We actually overwrite the existing b-relse
> >  			   function at times, but we're gonna be shutting
> >  			   down anyway. */
> > -			XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
> > -			XFS_BUF_DONE(bp);
> > -			XFS_BUF_FINISH_IOWAIT(bp);
> > +			if (XFS_BUF_GETERROR(bp) == EIO) {
> > +				ASSERT(XFS_BUF_TARGET(bp) == mp->m_ddev_targp);
> > +				XFS_BUF_SUPER_STALE(bp);
> > +				trace_xfs_buf_item_iodone(bp, _RET_IP_);
> > +				xfs_buf_do_callbacks(bp, lip);
> > +				XFS_BUF_SET_FSPRIVATE(bp, NULL);
> > +				XFS_BUF_CLR_IODONE_FUNC(bp);
> > +				xfs_biodone(bp);
> > +			} else {
> > +				XFS_BUF_SET_BRELSE_FUNC(bp, xfs_buf_error_relse);
> > +				XFS_BUF_DONE(bp);
> > +				XFS_BUF_FINISH_IOWAIT(bp);
> > +			}
> >  		}
> >  		return;
> >  	}
>
> This won't work reliably because it only handles one specific type
> of error. We can get more than just EIO back from the lower layers,
> and so if the superblock write gets a different error then we'll
> still get the same hang.
>
> Effectively what you are doing here is running the
> xfs_buf_error_relse() callback directly in line. This will result in
> the buffer being unlocked before the error is pulled off the buffer
> after xfs_buf_iowait() completes. Essentially that means that some
> other thread can reuse the buffer and clear the error before the
> waiter has received the error.
>
> I think the correct fix is to call the bp->b_relse function when the
> waiter is woken to clear the error and unlock the buffer. I've just
> posted a patch to do this for 2.6.38, but it won't trivially backport
> to 2.6.34 or 2.6.30 as the synchronous write interfaces into the
> buffer cache have been cleaned up and simplified recently. It should
> still be relatively easy to handle, though.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>