* XFS umount with IO errors seems to lead to memory corruption
@ 2015-02-09 21:24 Chris Holcombe
2015-02-09 22:18 ` Dave Chinner
0 siblings, 1 reply; 7+ messages in thread
From: Chris Holcombe @ 2015-02-09 21:24 UTC (permalink / raw)
To: xfs
Hi Dave,
http://www.spinics.net/lists/linux-xfs/msg00061.html
Back in Dec 2013 you responded to this message saying that you would
take a look at it. Was a fix for this ever issued? I'm seeing very
similar stacktraces:
Feb 7 00:27:32 node008-cont001 kernel: [83405.490909] INFO: task umount:29224 blocked for more than 120 seconds.
Feb 7 00:27:32 node008-cont001 kernel: [83405.498645] Tainted: G W 3.13.0-39-generic #66-Ubuntu
Feb 7 00:27:32 node008-cont001 kernel: [83405.506273] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 7 00:27:32 node008-cont001 kernel: [83405.515244] umount D ffff880c4fc34480 0 29224 29221 0x00000082
Feb 7 00:27:32 node008-cont001 kernel: [83405.515253] ffff880201211db0 0000000000000086 ffff880c39cb1800 ffff880201211fd8
Feb 7 00:27:32 node008-cont001 kernel: [83405.515260] 0000000000014480 0000000000014480 ffff880c39cb1800 ffff880c33386480
Feb 7 00:27:32 node008-cont001 kernel: [83405.515267] ffff880c395e4bc8 ffff880c333864c0 ffff880c333864e8 ffff880c33386490
Feb 7 00:27:32 node008-cont001 kernel: [83405.515274] Call Trace:
Feb 7 00:27:32 node008-cont001 kernel: [83405.515298] [<ffffffff81723109>] schedule+0x29/0x70
Feb 7 00:27:32 node008-cont001 kernel: [83405.515384] [<ffffffffa023b0c9>] xfs_ail_push_all_sync+0xa9/0xe0 [xfs]
Feb 7 00:27:32 node008-cont001 kernel: [83405.515396] [<ffffffff810aafd0>] ? prepare_to_wait_event+0x100/0x100
Feb 7 00:27:32 node008-cont001 kernel: [83405.515438] [<ffffffffa0236f13>] xfs_log_quiesce+0x33/0x70 [xfs]
Feb 7 00:27:32 node008-cont001 kernel: [83405.515479] [<ffffffffa0236f62>] xfs_log_unmount+0x12/0x30 [xfs]
Feb 7 00:27:32 node008-cont001 kernel: [83405.515510] [<ffffffffa01ed846>] xfs_unmountfs+0xc6/0x150 [xfs]
Feb 7 00:27:32 node008-cont001 kernel: [83405.515541] [<ffffffffa01ef211>] xfs_fs_put_super+0x21/0x60 [xfs]
Feb 7 00:27:32 node008-cont001 kernel: [83405.515550] [<ffffffff811bf452>] generic_shutdown_super+0x72/0xf0
Feb 7 00:27:32 node008-cont001 kernel: [83405.515557] [<ffffffff811bf707>] kill_block_super+0x27/0x70
Feb 7 00:27:32 node008-cont001 kernel: [83405.515565] [<ffffffff811bf9ed>] deactivate_locked_super+0x3d/0x60
Feb 7 00:27:32 node008-cont001 kernel: [83405.515572] [<ffffffff811bffa6>] deactivate_super+0x46/0x60
Feb 7 00:27:32 node008-cont001 kernel: [83405.515578] [<ffffffff811dcd96>] mntput_no_expire+0xd6/0x170
Feb 7 00:27:32 node008-cont001 kernel: [83405.515584] [<ffffffff811de31e>] SyS_umount+0x8e/0x100
Feb 7 00:27:32 node008-cont001 kernel: [83405.515591] [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f
These types of errors are showing up in the logs:
Feb 7 00:27:34 node008-cont001 kernel: [83407.466853] XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1
...
Feb 7 00:27:39 node008-cont001 kernel: [83412.510982] XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1
Feb 7 00:27:44 node008-cont001 kernel: [83417.555152] XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1
...
Feb 7 00:27:54 node008-cont001 kernel: [83427.643428] XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1
Feb 7 00:27:57 node008-cont001 kernel: [83429.879442] XFS:: 568 callbacks suppressed
Feb 7 00:27:57 node008-cont001 kernel: [83429.879450] XFS (dm-8): Detected failing async write on buffer block 0x0. Retrying async write.
Feb 7 00:27:57 node008-cont001 kernel: [83429.931438] XFS (dm-8): Detected failing async write on buffer block 0x0. Retrying async write.
Feb 7 00:27:57 node008-cont001 kernel: [83429.983444] XFS (dm-8): Detected failing async write on buffer block 0x0. Retrying async write.
Thanks for the help!
Chris
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 7+ messages in thread

* Re: XFS umount with IO errors seems to lead to memory corruption
  2015-02-09 21:24 XFS umount with IO errors seems to lead to memory corruption Chris Holcombe
@ 2015-02-09 22:18 ` Dave Chinner
  2015-02-09 22:25 ` Eric Sandeen
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2015-02-09 22:18 UTC (permalink / raw)
To: Chris Holcombe; +Cc: xfs

On Mon, Feb 09, 2015 at 01:24:15PM -0800, Chris Holcombe wrote:
> Hi Dave,
>
> http://www.spinics.net/lists/linux-xfs/msg00061.html
> Back in Dec 2013 you responded to this message saying that you would
> take a look at it. Was a fix for this ever issued?

Yes, it's been fixed, but that's not your problem.

> I'm seeing very similar stacktraces:
>
> INFO: task umount:29224 blocked for more than 120 seconds.
> Tainted: G W 3.13.0-39-generic #66-Ubuntu
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> umount D ffff880c4fc34480 0 29224 29221 0x00000082
> ffff880201211db0 0000000000000086 ffff880c39cb1800 ffff880201211fd8
> 0000000000014480 0000000000014480 ffff880c39cb1800 ffff880c33386480
> ffff880c395e4bc8 ffff880c333864c0 ffff880c333864e8 ffff880c33386490
> Call Trace:
> [<ffffffff81723109>] schedule+0x29/0x70
> [<ffffffffa023b0c9>] xfs_ail_push_all_sync+0xa9/0xe0 [xfs]
> [<ffffffff810aafd0>] ? prepare_to_wait_event+0x100/0x100
> [<ffffffffa0236f13>] xfs_log_quiesce+0x33/0x70 [xfs]
> [<ffffffffa0236f62>] xfs_log_unmount+0x12/0x30 [xfs]
> [<ffffffffa01ed846>] xfs_unmountfs+0xc6/0x150 [xfs]
> [<ffffffffa01ef211>] xfs_fs_put_super+0x21/0x60 [xfs]
> [<ffffffff811bf452>] generic_shutdown_super+0x72/0xf0
> [<ffffffff811bf707>] kill_block_super+0x27/0x70
> [<ffffffff811bf9ed>] deactivate_locked_super+0x3d/0x60
> [<ffffffff811bffa6>] deactivate_super+0x46/0x60
> [<ffffffff811dcd96>] mntput_no_expire+0xd6/0x170
> [<ffffffff811de31e>] SyS_umount+0x8e/0x100
> [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f

That's XFS hung waiting for IO to complete during unmount.

> These types of errors are showing up in the logs:
>
> XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1

Error 19 = ENODEV.

You pulled the drive out before you tried to unmount?

> XFS (dm-8): Detected failing async write on buffer block 0x0. Retrying async write.

Which means it's detecting that the write is failing, but the higher
level has been told to keep trying until all metadata has been
flushed. We probably need to tweak this slightly....

Eric - this is another case where transient vs permanent error is
somewhat squishy, and treating ENODEV as a permanent error would
solve this issue (i.e. trigger a shutdown). Did you start doing
anything in this area?

AFAICT an ENODEV error on Linux is a permanent error, because if you
replug the device it will come back as a different device and the
ENODEV on the removed device will still persist. However, I'm not
sure what dm-multipath ends up doing in this case - it's supposed to
hide the same devices coming and going, so maybe it won't trigger
this error at all...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

^ permalink raw reply [flat|nested] 7+ messages in thread
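The error numbers discussed above are raw Linux errno values. As a quick aside (not part of the thread's tooling), all three codes that appear in this thread can be decoded with Python's standard errno module:

```python
import errno
import os

# Decode the raw errno values seen in this thread's logs:
# 5 in the dm-error reproduction, 19 here, 125 in the 2013 report.
for num in (5, 19, 125):
    print(num, errno.errorcode[num], "-", os.strerror(num))
# 5 EIO, 19 ENODEV, 125 ECANCELED
```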
* Re: XFS umount with IO errors seems to lead to memory corruption
  2015-02-09 22:18 ` Dave Chinner
@ 2015-02-09 22:25 ` Eric Sandeen
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2015-02-09 22:25 UTC (permalink / raw)
To: Dave Chinner, Chris Holcombe; +Cc: xfs

On 2/9/15 4:18 PM, Dave Chinner wrote:
> On Mon, Feb 09, 2015 at 01:24:15PM -0800, Chris Holcombe wrote:
>> Hi Dave,
>>
>> http://www.spinics.net/lists/linux-xfs/msg00061.html
>> Back in Dec 2013 you responded to this message saying that you would
>> take a look at it. Was a fix for this ever issued?
>
> Yes, it's been fixed, but that's not your problem.
>
>> I'm seeing very similar stacktraces:
>>
>> INFO: task umount:29224 blocked for more than 120 seconds.
>> Tainted: G W 3.13.0-39-generic #66-Ubuntu
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> umount D ffff880c4fc34480 0 29224 29221 0x00000082
>> ffff880201211db0 0000000000000086 ffff880c39cb1800 ffff880201211fd8
>> 0000000000014480 0000000000014480 ffff880c39cb1800 ffff880c33386480
>> ffff880c395e4bc8 ffff880c333864c0 ffff880c333864e8 ffff880c33386490
>> Call Trace:
>> [<ffffffff81723109>] schedule+0x29/0x70
>> [<ffffffffa023b0c9>] xfs_ail_push_all_sync+0xa9/0xe0 [xfs]
>> [<ffffffff810aafd0>] ? prepare_to_wait_event+0x100/0x100
>> [<ffffffffa0236f13>] xfs_log_quiesce+0x33/0x70 [xfs]
>> [<ffffffffa0236f62>] xfs_log_unmount+0x12/0x30 [xfs]
>> [<ffffffffa01ed846>] xfs_unmountfs+0xc6/0x150 [xfs]
>> [<ffffffffa01ef211>] xfs_fs_put_super+0x21/0x60 [xfs]
>> [<ffffffff811bf452>] generic_shutdown_super+0x72/0xf0
>> [<ffffffff811bf707>] kill_block_super+0x27/0x70
>> [<ffffffff811bf9ed>] deactivate_locked_super+0x3d/0x60
>> [<ffffffff811bffa6>] deactivate_super+0x46/0x60
>> [<ffffffff811dcd96>] mntput_no_expire+0xd6/0x170
>> [<ffffffff811de31e>] SyS_umount+0x8e/0x100
>> [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f
>
> That's XFS hung waiting for IO to complete during unmount.
>
>> These types of errors are showing up in the logs:
>>
>> XFS (dm-8): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 19 numblks 1
>
> Error 19 = ENODEV.
>
> You pulled the drive out before you tried to unmount?
>
>> XFS (dm-8): Detected failing async write on buffer block 0x0. Retrying async write.
>
> Which means it's detecting that the write is failing, but the higher
> level has been told to keep trying until all metadata has been
> flushed. We probably need to tweak this slightly....
>
> Eric - this is another case where transient vs permanent error is
> somewhat squishy, and treating ENODEV as a permanent error would
> solve this issue (i.e. trigger a shutdown). Did you start doing
> anything in this area?

That's (probably) a little more clear; ENODEV is unlikely to be
transparently resolved. Even if it comes back, there's no mechanism to
see that it came back with the same name, right?

...

> AFAICT an ENODEV error on Linux is a permanent error, because if you
> replug the device it will come back as a different device and the
> ENODEV on the removed device will still persist.

Yes, right. :)

> However, I'm not
> sure what dm-multipath ends up doing in this case - it's supposed to
> hide the same devices coming and going, so maybe it won't trigger
> this error at all...

Anyway, I had started a hack of accumulating consecutive failed IOs,
but didn't go too far yet; the initial try didn't do what I expected
and I haven't gotten back to it yet...

-Eric

> Cheers,
>
> Dave.

^ permalink raw reply [flat|nested] 7+ messages in thread
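The policy Dave and Eric are debating above can be summarized abstractly: classify each failed metadata write as transient or permanent, keep retrying transient failures, and force a filesystem shutdown on permanent ones. A toy model of that decision (my own sketch under the assumption they discuss, i.e. ENODEV treated as permanent; this is not XFS code):

```python
import errno

# Toy model (not XFS code) of the retry-vs-shutdown policy discussed
# above. Assumption: ENODEV is permanent; EIO may be transient.
PERMANENT_ERRORS = {errno.ENODEV}

def on_metadata_write_error(err):
    """Decide what the buffer I/O completion path should do."""
    if err in PERMANENT_ERRORS:
        return "shutdown"   # permanent: force-shutdown the filesystem
    return "retry"          # transient: re-submit the async write

print(on_metadata_write_error(errno.ENODEV))  # shutdown
print(on_metadata_write_error(errno.EIO))     # retry
```

Under this policy the hung unmount in this thread would instead hit the shutdown path on the first ENODEV, so xfs_ail_push_all_sync() would not wait forever on writes that can never succeed.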
[parent not found: <CAOcd+r3i0mDK2vAnZ-0s6VGnSsJwWxnEB2uMrcz+WSJAxx2bmA@mail.gmail.com>]
* Re: XFS umount with IO errors seems to lead to memory corruption
  [not found] <CAOcd+r3i0mDK2vAnZ-0s6VGnSsJwWxnEB2uMrcz+WSJAxx2bmA@mail.gmail.com>
@ 2013-11-21 22:07 ` Dave Chinner
  2013-11-24 10:27 ` Alex Lyakas
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2013-11-21 22:07 UTC (permalink / raw)
To: Alex Lyakas; +Cc: linux-xfs, xfs

[cc'd the correct xfs list. Please use xfs@oss.sgi.com in future.]

On Thu, Nov 21, 2013 at 08:04:36PM +0200, Alex Lyakas wrote:
> Greetings,
> I am using stock XFS from kernel 3.8.13, compiled with kmemleak
> enabled. I am testing a particular scenario, in which the underlying
> block device returns IO errors during XFS umount. Almost in all cases
> this results in kernel crashes in various places, and sometimes
> kmemleak complains, and sometimes CPU soft lockup happens. It always
> happens after XFS messages like:

What testing are you doing?

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

> kernel: [ 600.190509] XFS (dm-22): metadata I/O error: block
> 0x7600030 ("xlog_iodone") error 125 numblks 64
> kernel: [ 600.192267] XFS (dm-22): xfs_do_force_shutdown(0x2) called
> from line 1115 of file
> /mnt/compile/linux-stable/source/fs/xfs/xfs_log.c. Return address =
> 0xffffffffa05cffa1
> kernel: [ 600.192319] XFS (dm-22): Log I/O Error Detected. Shutting
> down filesystem
> kernel: [ 600.192392] XFS (dm-22): Unable to update superblock
> counters. Freespace may not be correct on next mount.
> kernel: [ 600.192398] XFS (dm-22): xfs_log_force: error 5 returned.
> kernel: [ 600.193687] XFS (º.Z): Please umount the filesystem and
> rectify the problem(s)
>
> you can see here the garbage that XFS prints instead of the block device name.
> In [1] and [2] I am attaching more kernel log from two such crashes.

So, something is corrupting memory and stamping all over the XFS
structures.

What's error 125?

#define ECANCELED 125 /* Operation Canceled */

I can't find a driver that actually returns that error to
filesystems, which....

> kernel: [ 600.227881] Modules linked in: xfs raid1 xfrm_user
> xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 scst_vdisk(O)
> iscsi_scst(O) scst(O) dm_zcache(O) dm_btrfs(O) btrfs(O) libcrc32c
> dm_iostat(O)

.... given you have a bunch of out-of-tree modules loaded (and some of
which are experimental) suggests that you have a problem with your
storage...

So, something is corrupting memory across a large number of
subsystems, and the trigger is some custom code to run error
injection. Can you reproduce the problem with something like
dm-faulty?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: XFS umount with IO errors seems to lead to memory corruption
  2013-11-21 22:07 ` Dave Chinner
@ 2013-11-24 10:27 ` Alex Lyakas
  2013-12-10 7:36 ` Alex Lyakas
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Lyakas @ 2013-11-24 10:27 UTC (permalink / raw)
To: Dave Chinner, xfs; +Cc: linux-xfs

Hi Dave,
thank you for your comments.
The test that I am doing is unmounting the XFS filesystem while its
underlying block device returns intermittent IO errors. The block
device in question is a custom DeviceMapper target. It returns
-ECANCELED in this case. Should I return some other errno instead?

The same exact test works alright with ext4. Its unmount finishes, the
system seems to continue functioning normally, and kmemleak is also
happy.

When doing a simpler reproduction with the "error" Device-Mapper
target, umount gets stuck and never returns, while the kernel keeps
printing:

XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
...

What is the expected behavior of umount in this case? (when there are
IO errors during it).
The reproduction that I do is:

root@vc-00-00-1075-dev:~# cat /etc/xfs.protofile
dummy : bootfilename, not used, backward compatibility
0 0 : numbers of blocks and inodes, not used, backward compatibility
d--777 0 0 : set 777 perms for the root dir
$
$
root@vc-00-00-1075-dev:~# mkfs.xfs -f /dev/vde -p /etc/xfs.protofile
meta-data=/dev/vde isize=256 agcount=4, agsize=1310720 blks
         = sectsz=512 attr=2, projid32bit=0
data     = bsize=4096 blocks=5242880, imaxpct=25
         = sunit=0 swidth=0 blks
naming   =version 2 bsize=4096 ascii-ci=0
log      =internal log bsize=4096 blocks=2560, version=2
         = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
root@vc-00-00-1075-dev:~# blockdev --getsz /dev/vde
41943040
root@vc-00-00-1075-dev:~# dmsetup create VDE --table "0 41943040 linear /dev/vde 0"
root@vc-00-00-1075-dev:~# mount -o noatime,sync /dev/mapper/VDE /export/v122_nas/

copy some files to the mountpoint, then:

dmsetup reload VDE --table "0 41943040 error"
dmsetup resume VDE
umount /export/v122_nas

this never returns and /proc shows:

root@vc-00-00-1075-dev:~# cat /proc/2684/stack
[<ffffffffa033ac6a>] xfs_ail_push_all_sync+0x9a/0xd0 [xfs]
[<ffffffffa0330123>] xfs_unmountfs+0x63/0x160 [xfs]
[<ffffffffa02ee265>] xfs_fs_put_super+0x25/0x60 [xfs]
[<ffffffff8118fd12>] generic_shutdown_super+0x62/0xf0
[<ffffffff8118fdd0>] kill_block_super+0x30/0x80
[<ffffffff811903dc>] deactivate_locked_super+0x3c/0x90
[<ffffffff81190d7e>] deactivate_super+0x4e/0x70
[<ffffffff811ad086>] mntput_no_expire+0x106/0x160
[<ffffffff811ae760>] sys_umount+0xa0/0xe0
[<ffffffff816ab919>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

And after some time, a hung task warning shows:

INFO: task kworker/2:1:39 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/2:1 D ffffffff8180cf00 0 39 2 0x00000000
ffff88007c54db38 0000000000000046 000000027d003700 ffff88007fd03fc0
ffff88007c54dfd8 ffff88007c54dfd8 ffff88007c54dfd8 0000000000013e40
ffff88007c9e9710 ffff88007c4bdc40 00000000000000b8 7fffffffffffffff
Call Trace:
[<ffffffff816a1b99>] schedule+0x29/0x70
[<ffffffff816a02d5>] schedule_timeout+0x1e5/0x250
[<ffffffffa02f3987>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
[<ffffffff816798e6>] ? kmemleak_alloc+0x26/0x50
[<ffffffff816a0f1b>] __down_common+0xa0/0xf0
[<ffffffffa032f37c>] ? xfs_getsb+0x3c/0x70 [xfs]
[<ffffffff816a0fde>] __down+0x1d/0x1f
[<ffffffff81084591>] down+0x41/0x50
[<ffffffffa02dcd44>] xfs_buf_lock+0x44/0x110 [xfs]
[<ffffffffa032f37c>] xfs_getsb+0x3c/0x70 [xfs]
[<ffffffffa033b4bc>] xfs_trans_getsb+0x4c/0x140 [xfs]
[<ffffffffa032f06e>] xfs_mod_sb+0x4e/0xc0 [xfs]
[<ffffffffa02e3b24>] xfs_fs_log_dummy+0x54/0x90 [xfs]
[<ffffffffa0335bf8>] xfs_log_worker+0x48/0x50 [xfs]
[<ffffffff81077a11>] process_one_work+0x141/0x4a0
[<ffffffff810789e8>] worker_thread+0x168/0x410
[<ffffffff81078880>] ? manage_workers+0x120/0x120
[<ffffffff8107df10>] kthread+0xc0/0xd0
[<ffffffff813a3ea4>] ? acpi_get_child+0x47/0x4d
[<ffffffff813a3fb7>] ? acpi_platform_notify.part.0+0xbb/0xda
[<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
[<ffffffff816ab86c>] ret_from_fork+0x7c/0xb0
[<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0

[3] contains some of the information that the wiki asks for, captured
right before doing the unmount in the original test. I have also
re-run the original test with slub_debug=FZ, but it did not show up
any warnings (perhaps I am missing something on how to enable the
red-zone debugging).

Thanks,
Alex.

[3]
root@vc-00-00-1075-dev:/mnt/work/alex# uname -a
Linux vc-00-00-1075-dev 3.8.13-557-generic #1382000791 SMP Thu Oct 17 11:22:20 IST 2013 x86_64 x86_64 x86_64 GNU/Linux

this is a manually-compiled 3.8.13 kernel, in which:
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
but no code changes.
root@vc-00-00-1075-dev:/mnt/work/alex# xfs_repair -V xfs_repair version 3.1.7 number of CPUs: 4 root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/meminfo MemTotal: 2050940 kB MemFree: 996376 kB Buffers: 32836 kB Cached: 389208 kB SwapCached: 0 kB Active: 490232 kB Inactive: 239988 kB Active(anon): 311336 kB Inactive(anon): 1336 kB Active(file): 178896 kB Inactive(file): 238652 kB Unevictable: 4632 kB Mlocked: 4632 kB SwapTotal: 522236 kB SwapFree: 522236 kB Dirty: 96 kB Writeback: 0 kB AnonPages: 312920 kB Mapped: 46656 kB Shmem: 1456 kB Slab: 224640 kB SReclaimable: 30040 kB SUnreclaim: 194600 kB KernelStack: 3632 kB PageTables: 14584 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 1547704 kB Committed_AS: 1667900 kB VmallocTotal: 34359738367 kB VmallocUsed: 30860 kB VmallocChunk: 34359648832 kB HardwareCorrupted: 0 kB AnonHugePages: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 153588 kB DirectMap2M: 1943552 kB root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/mounts rootfs / rootfs rw 0 0 sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0 proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0 udev /dev devtmpfs rw,relatime,size=1016840k,nr_inodes=254210,mode=755 0 0 devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0 tmpfs /run tmpfs rw,nosuid,relatime,size=410188k,mode=755 0 0 /dev/disk/by-uuid/c2a6ff28-cf03-4785-9ac8-60dd4ed7d3af / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0 none /sys/fs/fuse/connections fusectl rw,relatime 0 0 none /sys/kernel/debug debugfs rw,relatime 0 0 none /sys/kernel/security securityfs rw,relatime 0 0 none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=10256k 0 0 none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0 rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0 172.16.0.12:/export2/work /mnt/work nfs4 
rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.35,local_lock=none,addr=172.16.0.12 0 0 172.16.0.12:/export2/share /mnt/share nfs4 rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.35,local_lock=none,addr=172.16.0.12 0 0 nfsd /proc/fs/nfsd nfsd rw,relatime 0 0 cgroup_root /sys/fs/cgroup tmpfs rw,relatime 0 0 blkio /sys/fs/cgroup/blkio cgroup rw,relatime,perf_event,blkio,freezer,devices,cpuacct,cpu,cpuset 0 0 /dev/mapper/vpart-1 /mnt/bootdir ext4 rw,relatime,(null),debug,data=ordered 0 0 /dev/dm-20 /btrfs/pool-00000007 btrfs rw,noatime,nodatasum,nodatacow,notreelog,nospace_cache,clear_cache 0 0 /dev/dm-22 /export/v122_nas xfs rw,sync,noatime,wsync,attr2,inode64,noquota 0 0 root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/partitions major minor #blocks name 254 0 52428800 vda 254 16 52428800 vdb 254 32 20971520 vdc 254 48 20971520 vdd 254 64 20971520 vde 254 80 512000 vdf 254 96 512000 vdg 254 112 512000 vdh 254 128 512000 vdi 254 144 512000 vdj 254 160 1048576 vdk 254 176 1048576 vdl 254 192 312320 vdm 254 208 312320 vdn 8 0 5242880 sda 8 1 4717568 sda1 8 2 1 sda2 8 5 522240 sda5 11 0 1048575 sr0 253 0 140737488355327 dm-0 253 1 512000 dm-1 253 2 512000 dm-2 253 3 512000 dm-3 9 1 511988 md1 253 4 511988 dm-4 253 5 1048576 dm-5 253 7 312320 dm-7 253 6 20971520 dm-6 253 8 52428800 dm-8 253 9 20971520 dm-9 253 10 1048576 dm-10 253 11 312320 dm-11 253 12 52428800 dm-12 9 5 18872320 md5 9 2 1048564 md2 9 3 312308 md3 9 4 50329600 md4 253 13 18872320 dm-13 253 14 312308 dm-14 253 15 1048564 dm-15 253 16 50329600 dm-16 253 17 49283072 dm-17 253 18 49283072 dm-18 253 19 49283072 dm-19 253 20 49283072 dm-20 253 21 127926272 dm-21 253 22 127926272 dm-22 root@vc-00-00-1075-dev:/mnt/work/alex# xfs_info /export/v122_nas/ meta-data=/dev/dm-22 isize=256 agcount=4, agsize=7995392 blks = sectsz=512 attr=2 data = bsize=4096 
blocks=31981568, imaxpct=25
         = sunit=0 swidth=0 blks
naming   =version 2 bsize=4096 ascii-ci=0
log      =internal bsize=4096 blocks=15616, version=2
         = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0

On Fri, Nov 22, 2013 at 12:07 AM, Dave Chinner <david@fromorbit.com> wrote:
> [cc'd the correct xfs list. Please use xfs@oss.sgi.com in future.]
>
> On Thu, Nov 21, 2013 at 08:04:36PM +0200, Alex Lyakas wrote:
>> Greetings,
>> I am using stock XFS from kernel 3.8.13, compiled with kmemleak
>> enabled. I am testing a particular scenario, in which the underlying
>> block device returns IO errors during XFS umount. Almost in all cases
>> this results in kernel crashes in various places, and sometimes
>> kmemleak complains, and sometimes CPU soft lockup happens. It always
>> happens after XFS messages like:
>
> What testing are you doing?
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
>> kernel: [ 600.190509] XFS (dm-22): metadata I/O error: block
>> 0x7600030 ("xlog_iodone") error 125 numblks 64
>> kernel: [ 600.192267] XFS (dm-22): xfs_do_force_shutdown(0x2) called
>> from line 1115 of file
>> /mnt/compile/linux-stable/source/fs/xfs/xfs_log.c. Return address =
>> 0xffffffffa05cffa1
>> kernel: [ 600.192319] XFS (dm-22): Log I/O Error Detected. Shutting
>> down filesystem
>> kernel: [ 600.192392] XFS (dm-22): Unable to update superblock
>> counters. Freespace may not be correct on next mount.
>> kernel: [ 600.192398] XFS (dm-22): xfs_log_force: error 5 returned.
>> kernel: [ 600.193687] XFS (声.Z): Please umount the filesystem and
>> rectify the problem(s)
>>
>> you can see here the garbage that XFS prints instead of the block device name.
>> In [1] and [2] I am attaching more kernel log from two such crashes.
>
> So, something is corrupting memory and stamping all over the XFS
> structures.
>
> What's error 125?
>
> #define ECANCELED 125 /* Operation Canceled */
>
> I can't find a driver that actually returns that error to
> filesystems, which....
>
>> kernel: [ 600.227881] Modules linked in: xfs raid1 xfrm_user
>> xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 scst_vdisk(O)
>> iscsi_scst(O) scst(O) dm_zcache(O) dm_btrfs(O) btrfs(O) libcrc32c
>> dm_iostat(O)
>
> .... given you have a bunch of out-of-tree modules loaded (and some of
> which are experimental) suggests that you have a problem with your
> storage...
>
> So, something is corrupting memory across a large number of
> subsystems, and the trigger is some custom code to run error
> injection. Can you reproduce the problem with something like
> dm-faulty?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply [flat|nested] 7+ messages in thread
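When triaging log floods like the repeated "metadata I/O error" lines quoted in this thread, it can help to tally the messages per device and errno before digging in. A small helper along those lines (my own sketch, matched to the message format shown above, not a tool from the thread):

```python
import re
from collections import Counter

# Sketch: count XFS metadata I/O errors per (device, errno) in
# dmesg-style text, using the message format quoted in this thread.
ERROR_RE = re.compile(
    r'XFS \((\S+)\): metadata I/O error: block (\S+) '
    r'\("([^"]+)"\) error (\d+)'
)

def tally_xfs_errors(log_text):
    counts = Counter()
    for dev, _block, _callsite, err in ERROR_RE.findall(log_text):
        counts[(dev, int(err))] += 1
    return counts

sample = '\n'.join(
    'XFS (dm-0): metadata I/O error: block 0x0 '
    '("xfs_buf_iodone_callbacks") error 5 numblks 1'
    for _ in range(3)
)
print(tally_xfs_errors(sample))  # Counter({('dm-0', 5): 3})
```

A tally like {('dm-8', 19): N} versus {('dm-0', 5): N} immediately distinguishes the ENODEV (pulled-device) case from the EIO (dm-error) case discussed in this thread.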
* Re: XFS umount with IO errors seems to lead to memory corruption
  2013-11-24 10:27 ` Alex Lyakas
@ 2013-12-10 7:36 ` Alex Lyakas
  2013-12-11 0:40 ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Lyakas @ 2013-12-10 7:36 UTC (permalink / raw)
To: Dave Chinner, xfs; +Cc: linux-xfs

Hi Dave,
any insight on this issue? At least on the simpler reproduction with
the "error" DeviceMapper target?

Thanks,
Alex.

-----Original Message-----
From: Alex Lyakas
Sent: 24 November, 2013 12:27 PM
To: Dave Chinner ; xfs@oss.sgi.com
Cc: linux-xfs@vger.kernel.org
Subject: Re: XFS umount with IO errors seems to lead to memory corruption

Hi Dave,
thank you for your comments.
The test that I am doing is unmounting the XFS filesystem while its
underlying block device returns intermittent IO errors. The block
device in question is a custom DeviceMapper target. It returns
-ECANCELED in this case. Should I return some other errno instead?

The same exact test works alright with ext4. Its unmount finishes, the
system seems to continue functioning normally, and kmemleak is also
happy.

When doing a simpler reproduction with the "error" Device-Mapper
target, umount gets stuck and never returns, while the kernel keeps
printing:

XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
...

What is the expected behavior of umount in this case? (when there are
IO errors during it).
This reproduction that I do is: root@vc-00-00-1075-dev:~# cat /etc/xfs.protofile dummy : bootfilename, not used, backward compatibility 0 0 : numbers of blocks and inodes, not used, backward compatibility d--777 0 0 : set 777 perms for the root dir $ $ root@vc-00-00-1075-dev:~# mkfs.xfs -f /dev/vde -p /etc/xfs.protofile meta-data=/dev/vde isize=256 agcount=4, agsize=1310720 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=5242880, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 root@vc-00-00-1075-dev:~# blockdev --getsz /dev/vde 41943040 root@vc-00-00-1075-dev:~# dmsetup create VDE --table "0 41943040 linear /dev/vde 0" root@vc-00-00-1075-dev:~# mount -o noatime,sync /dev/mapper/VDE /export/v122_nas/ copy some files to the mountpoint, then: dmsetup reload VDE --table "0 41943040 error" dmsetup resume VDE umount /export/v122_nas this never returns and /proc shows: root@vc-00-00-1075-dev:~# cat /proc/2684/stack [<ffffffffa033ac6a>] xfs_ail_push_all_sync+0x9a/0xd0 [xfs] [<ffffffffa0330123>] xfs_unmountfs+0x63/0x160 [xfs] [<ffffffffa02ee265>] xfs_fs_put_super+0x25/0x60 [xfs] [<ffffffff8118fd12>] generic_shutdown_super+0x62/0xf0 [<ffffffff8118fdd0>] kill_block_super+0x30/0x80 [<ffffffff811903dc>] deactivate_locked_super+0x3c/0x90 [<ffffffff81190d7e>] deactivate_super+0x4e/0x70 [<ffffffff811ad086>] mntput_no_expire+0x106/0x160 [<ffffffff811ae760>] sys_umount+0xa0/0xe0 [<ffffffff816ab919>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff And after some time, hung task warning shows: INFO: task kworker/2:1:39 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
kworker/2:1 D ffffffff8180cf00 0 39 2 0x00000000 ffff88007c54db38 0000000000000046 000000027d003700 ffff88007fd03fc0 ffff88007c54dfd8 ffff88007c54dfd8 ffff88007c54dfd8 0000000000013e40 ffff88007c9e9710 ffff88007c4bdc40 00000000000000b8 7fffffffffffffff Call Trace: [<ffffffff816a1b99>] schedule+0x29/0x70 [<ffffffff816a02d5>] schedule_timeout+0x1e5/0x250 [<ffffffffa02f3987>] ? kmem_zone_alloc+0x67/0xe0 [xfs] [<ffffffff816798e6>] ? kmemleak_alloc+0x26/0x50 [<ffffffff816a0f1b>] __down_common+0xa0/0xf0 [<ffffffffa032f37c>] ? xfs_getsb+0x3c/0x70 [xfs] [<ffffffff816a0fde>] __down+0x1d/0x1f [<ffffffff81084591>] down+0x41/0x50 [<ffffffffa02dcd44>] xfs_buf_lock+0x44/0x110 [xfs] [<ffffffffa032f37c>] xfs_getsb+0x3c/0x70 [xfs] [<ffffffffa033b4bc>] xfs_trans_getsb+0x4c/0x140 [xfs] [<ffffffffa032f06e>] xfs_mod_sb+0x4e/0xc0 [xfs] [<ffffffffa02e3b24>] xfs_fs_log_dummy+0x54/0x90 [xfs] [<ffffffffa0335bf8>] xfs_log_worker+0x48/0x50 [xfs] [<ffffffff81077a11>] process_one_work+0x141/0x4a0 [<ffffffff810789e8>] worker_thread+0x168/0x410 [<ffffffff81078880>] ? manage_workers+0x120/0x120 [<ffffffff8107df10>] kthread+0xc0/0xd0 [<ffffffff813a3ea4>] ? acpi_get_child+0x47/0x4d [<ffffffff813a3fb7>] ? acpi_platform_notify.part.0+0xbb/0xda [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0 [<ffffffff816ab86c>] ret_from_fork+0x7c/0xb0 [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0 [3] contains some of information that wiki asks for, right before doing the unmount in the original test. I have also re-ran the original test with slub_debug=FZ, but it did not show up any warnings (perhaps I am missing something on how to enable the red-zone debugging). Thanks, Alex. [3] root@vc-00-00-1075-dev:/mnt/work/alex# uname -a Linux vc-00-00-1075-dev 3.8.13-557-generic #1382000791 SMP Thu Oct 17 11:22:20 IST 2013 x86_64 x86_64 x86_64 GNU/Linux this is a manually-compiled 3.8.13 kernel, in which: CONFIG_HAVE_DEBUG_KMEMLEAK=y CONFIG_DEBUG_KMEMLEAK=y but no code changes. 
root@vc-00-00-1075-dev:/mnt/work/alex# xfs_repair -V
xfs_repair version 3.1.7

number of CPUs: 4

root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/meminfo
MemTotal:        2050940 kB
MemFree:          996376 kB
Buffers:           32836 kB
Cached:           389208 kB
SwapCached:            0 kB
Active:           490232 kB
Inactive:         239988 kB
Active(anon):     311336 kB
Inactive(anon):     1336 kB
Active(file):     178896 kB
Inactive(file):   238652 kB
Unevictable:        4632 kB
Mlocked:            4632 kB
SwapTotal:        522236 kB
SwapFree:         522236 kB
Dirty:                96 kB
Writeback:             0 kB
AnonPages:        312920 kB
Mapped:            46656 kB
Shmem:              1456 kB
Slab:             224640 kB
SReclaimable:      30040 kB
SUnreclaim:       194600 kB
KernelStack:        3632 kB
PageTables:        14584 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1547704 kB
Committed_AS:    1667900 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       30860 kB
VmallocChunk:   34359648832 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      153588 kB
DirectMap2M:     1943552 kB

root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=1016840k,nr_inodes=254210,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=410188k,mode=755 0 0
/dev/disk/by-uuid/c2a6ff28-cf03-4785-9ac8-60dd4ed7d3af / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=10256k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
172.16.0.12:/export2/work /mnt/work nfs4 rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.35,local_lock=none,addr=172.16.0.12 0 0
172.16.0.12:/export2/share /mnt/share nfs4 rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.35,local_lock=none,addr=172.16.0.12 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
cgroup_root /sys/fs/cgroup tmpfs rw,relatime 0 0
blkio /sys/fs/cgroup/blkio cgroup rw,relatime,perf_event,blkio,freezer,devices,cpuacct,cpu,cpuset 0 0
/dev/mapper/vpart-1 /mnt/bootdir ext4 rw,relatime,(null),debug,data=ordered 0 0
/dev/dm-20 /btrfs/pool-00000007 btrfs rw,noatime,nodatasum,nodatacow,notreelog,nospace_cache,clear_cache 0 0
/dev/dm-22 /export/v122_nas xfs rw,sync,noatime,wsync,attr2,inode64,noquota 0 0

root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/partitions
major minor  #blocks  name

 254        0   52428800 vda
 254       16   52428800 vdb
 254       32   20971520 vdc
 254       48   20971520 vdd
 254       64   20971520 vde
 254       80     512000 vdf
 254       96     512000 vdg
 254      112     512000 vdh
 254      128     512000 vdi
 254      144     512000 vdj
 254      160    1048576 vdk
 254      176    1048576 vdl
 254      192     312320 vdm
 254      208     312320 vdn
   8        0    5242880 sda
   8        1    4717568 sda1
   8        2          1 sda2
   8        5     522240 sda5
  11        0    1048575 sr0
 253        0 140737488355327 dm-0
 253        1     512000 dm-1
 253        2     512000 dm-2
 253        3     512000 dm-3
   9        1     511988 md1
 253        4     511988 dm-4
 253        5    1048576 dm-5
 253        7     312320 dm-7
 253        6   20971520 dm-6
 253        8   52428800 dm-8
 253        9   20971520 dm-9
 253       10    1048576 dm-10
 253       11     312320 dm-11
 253       12   52428800 dm-12
   9        5   18872320 md5
   9        2    1048564 md2
   9        3     312308 md3
   9        4   50329600 md4
 253       13   18872320 dm-13
 253       14     312308 dm-14
 253       15    1048564 dm-15
 253       16   50329600 dm-16
 253       17   49283072 dm-17
 253       18   49283072 dm-18
 253       19   49283072 dm-19
 253       20   49283072 dm-20
 253       21  127926272 dm-21
 253       22  127926272 dm-22

root@vc-00-00-1075-dev:/mnt/work/alex# xfs_info /export/v122_nas/
meta-data=/dev/dm-22             isize=256    agcount=4, agsize=7995392 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=31981568, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=15616, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

On Fri, Nov 22, 2013 at 12:07 AM, Dave Chinner <david@fromorbit.com> wrote:
> [cc'd the correct xfs list. Please use xfs@oss.sgi.com in future.]
>
> On Thu, Nov 21, 2013 at 08:04:36PM +0200, Alex Lyakas wrote:
>> Greetings,
>> I am using stock XFS from kernel 3.8.13, compiled with kmemleak
>> enabled. I am testing a particular scenario, in which the underlying
>> block device returns IO errors during XFS umount. Almost in all cases
>> this results in kernel crashes in various places, and sometimes
>> kmemleak complains, and sometimes CPU soft lockup happens. It always
>> happens after XFS messages like:
>
> What testing are you doing?
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
>> kernel: [ 600.190509] XFS (dm-22): metadata I/O error: block
>> 0x7600030 ("xlog_iodone") error 125 numblks 64
>> kernel: [ 600.192267] XFS (dm-22): xfs_do_force_shutdown(0x2) called
>> from line 1115 of file
>> /mnt/compile/linux-stable/source/fs/xfs/xfs_log.c. Return address =
>> 0xffffffffa05cffa1
>> kernel: [ 600.192319] XFS (dm-22): Log I/O Error Detected. Shutting
>> down filesystem
>> kernel: [ 600.192392] XFS (dm-22): Unable to update superblock
>> counters. Freespace may not be correct on next mount.
>> kernel: [ 600.192398] XFS (dm-22): xfs_log_force: error 5 returned.
>> kernel: [ 600.193687] XFS (声.Z): Please umount the filesystem and
>> rectify the problem(s)
>>
>> You can see here the garbage that XFS prints instead of the block
>> device name.
>> In [1] and [2] I am attaching more kernel log from two such crashes.
>
> So, something is corrupting memory and stamping all over the XFS
> structures.
>
> What's error 125?
>
> #define ECANCELED 125 /* Operation Canceled */
>
> I can't find a driver that actually returns that error to
> filesystems, which....
>
>> kernel: [ 600.227881] Modules linked in: xfs raid1 xfrm_user
>> xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 scst_vdisk(O)
>> iscsi_scst(O) scst(O) dm_zcache(O) dm_btrfs(O) btrfs(O) libcrc32c
>> dm_iostat(O)
>
> .... given you have a bunch of out-of-tree modules loaded (and some
> which are experimental) suggests that you have a problem with your
> storage...
>
> So, something is corrupting memory across a large number of
> subsystems, and the trigger is some custom code to run error
> injection. Can you reproduce the problem with something like
> dm-faulty?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS umount with IO errors seems to lead to memory corruption
  2013-12-10  7:36         ` Alex Lyakas
@ 2013-12-11  0:40           ` Dave Chinner
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2013-12-11  0:40 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: linux-xfs, xfs

[ Sorry, Alex, I missed your last email. Thanks for pinging me to
remind me to look at it. ]

On Tue, Dec 10, 2013 at 09:36:11AM +0200, Alex Lyakas wrote:
> Hi Dave,
> any insight on this issue? At least on the simpler reproduction with
> "error" DeviceMapper?

Yes, it does point to the problem.

> -----Original Message----- From: Alex Lyakas
> Sent: 24 November, 2013 12:27 PM
> To: Dave Chinner ; xfs@oss.sgi.com
> Cc: linux-xfs@vger.kernel.org
> Subject: Re: XFS umount with IO errors seems to lead to memory corruption
>
> Hi Dave,
> thank you for your comments.
>
> The test that I am doing is unmounting the XFS, while its underlying
> block device returns intermittent IO errors. The block device in
> question is a custom DeviceMapper target. It returns -ECANCELED in
> this case. Should I return some other errno instead?
> The same exact test works fine with ext4: its unmount finishes, the
> system seems to continue functioning normally, and kmemleak is also
> happy.
>
> When doing a simpler reproduction with the "error" device-mapper
> target, umount gets stuck and never returns, while the kernel keeps
> printing:
> XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1

It's trying to write the superblock - it's an async, background
metadata write, and it's failing.

	/*
	 * If the write was asynchronous then no one will be looking for the
	 * error.  Clear the error state and write the buffer out again.
	 *
	 * XXX: This helps against transient write errors, but we need to find
	 * a way to shut the filesystem down if the writes keep failing.
	 *
	 * In practice we'll shut the filesystem down soon as non-transient
	 * errors tend to affect the whole device and a failing log write
	 * will make us give up. But we really ought to do better here.
	 */
	if (XFS_BUF_ISASYNC(bp)) {
		ASSERT(bp->b_iodone != NULL);

		trace_xfs_buf_item_iodone_async(bp, _RET_IP_);

		xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag */

		if (!XFS_BUF_ISSTALE(bp)) {
			bp->b_flags |= XBF_WRITE | XBF_ASYNC | XBF_DONE;
			xfs_buf_iorequest(bp);
		} else {
			xfs_buf_relse(bp);
		}
		return;
	}

There's the problem code - it just keeps resubmitting the failed IO,
so the buffer is never unlocked and the IO never completes.

> This never returns and /proc shows:
> root@vc-00-00-1075-dev:~# cat /proc/2684/stack
> [<ffffffffa033ac6a>] xfs_ail_push_all_sync+0x9a/0xd0 [xfs]
> [<ffffffffa0330123>] xfs_unmountfs+0x63/0x160 [xfs]
> [<ffffffffa02ee265>] xfs_fs_put_super+0x25/0x60 [xfs]
> [<ffffffff8118fd12>] generic_shutdown_super+0x62/0xf0
> [<ffffffff8118fdd0>] kill_block_super+0x30/0x80
> [<ffffffff811903dc>] deactivate_locked_super+0x3c/0x90
> [<ffffffff81190d7e>] deactivate_super+0x4e/0x70
> [<ffffffff811ad086>] mntput_no_expire+0x106/0x160
> [<ffffffff811ae760>] sys_umount+0xa0/0xe0
> [<ffffffff816ab919>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff

That's waiting for the superblock to be marked clean.

> And after some time, a hung task warning shows:
> INFO: task kworker/2:1:39 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/2:1     D ffffffff8180cf00     0    39      2 0x00000000
>  ffff88007c54db38 0000000000000046 000000027d003700 ffff88007fd03fc0
>  ffff88007c54dfd8 ffff88007c54dfd8 ffff88007c54dfd8 0000000000013e40
>  ffff88007c9e9710 ffff88007c4bdc40 00000000000000b8 7fffffffffffffff
> Call Trace:
> [<ffffffff816a1b99>] schedule+0x29/0x70
> [<ffffffff816a02d5>] schedule_timeout+0x1e5/0x250
> [<ffffffffa02f3987>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
> [<ffffffff816798e6>] ? kmemleak_alloc+0x26/0x50
> [<ffffffff816a0f1b>] __down_common+0xa0/0xf0
> [<ffffffffa032f37c>] ? xfs_getsb+0x3c/0x70 [xfs]
> [<ffffffff816a0fde>] __down+0x1d/0x1f
> [<ffffffff81084591>] down+0x41/0x50
> [<ffffffffa02dcd44>] xfs_buf_lock+0x44/0x110 [xfs]
> [<ffffffffa032f37c>] xfs_getsb+0x3c/0x70 [xfs]
> [<ffffffffa033b4bc>] xfs_trans_getsb+0x4c/0x140 [xfs]
> [<ffffffffa032f06e>] xfs_mod_sb+0x4e/0xc0 [xfs]
> [<ffffffffa02e3b24>] xfs_fs_log_dummy+0x54/0x90 [xfs]
> [<ffffffffa0335bf8>] xfs_log_worker+0x48/0x50 [xfs]
> [<ffffffff81077a11>] process_one_work+0x141/0x4a0
> [<ffffffff810789e8>] worker_thread+0x168/0x410
> [<ffffffff81078880>] ? manage_workers+0x120/0x120
> [<ffffffff8107df10>] kthread+0xc0/0xd0
> [<ffffffff813a3ea4>] ? acpi_get_child+0x47/0x4d
> [<ffffffff813a3fb7>] ? acpi_platform_notify.part.0+0xbb/0xda
> [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
> [<ffffffff816ab86c>] ret_from_fork+0x7c/0xb0
> [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0

And that's blocked on the superblock buffer, because it hasn't been
unlocked due to the failing write never completing.

I'll have a think about how to fix it.

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
Thread overview: 7+ messages
2015-02-09 21:24 XFS umount with IO errors seems to lead to memory corruption Chris Holcombe
2015-02-09 22:18 ` Dave Chinner
2015-02-09 22:25 ` Eric Sandeen
[not found] <CAOcd+r3i0mDK2vAnZ-0s6VGnSsJwWxnEB2uMrcz+WSJAxx2bmA@mail.gmail.com>
2013-11-21 22:07 ` Dave Chinner
2013-11-24 10:27 ` Alex Lyakas
2013-12-10 7:36 ` Alex Lyakas
2013-12-11 0:40 ` Dave Chinner