From: Pavel Hofman
Subject: Re: More 'D' state processes
Date: Wed, 24 Sep 2014 13:14:27 +0200
Message-ID: <5422A793.7090502@ivitera.com>
Sender: linux-raid-owner@vger.kernel.org
To: linux RAID
List-Id: linux-raid.ids

> On Mon, 06 Jan 2014 17:00:35 +0100 hans@xxxxxxxxxxxxxx wrote:
>>
>> Hi Neil,
>>
>> the output of 'uname -r' is: 3.2.0-4-amd64. A standard Debian Wheezy.
>>
>> I rebooted the system in the meantime. The re-sync started anew but
>> finished otherwise without flaws.
>
> I guess there must be some other bug then. I still expect that it is
> related to md_flush_request(). Probably related to the fact that you
> have an md device on top of other md devices....
>
> I had a quick look and cannot find the cause of the problem. Maybe it
> will have to wait until someone else hits it :-(
>
> Thanks for the report,
> NeilBrown

Hi Neil,

I believe I have been hitting the same bug. I have a layered setup with
several MDs on top of each other (a backup server with offline copies).
Sometimes, while mdadm is checking/resyncing, my XFS filesystem gets stuck.

Recovery of a small partition (raid1, md1, ext3) finishes, while recovery
of a large one (raid1, md6) gets stuck. The XFS filesystem is on md7, a
raid1 composed of md5 + md6.

Kernel 3.2 was OK; kernels 3.14 and 3.16 produce the lockup.

Please let me know what other diagnostics I should run.

Thanks a lot for your kind help.

Pavel.
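
For clarity, the nesting of the md devices is roughly equivalent to the
commands below. This is only a sketch reconstructed from the /proc/mdstat
output further down; the actual creation commands, chunk and bitmap
settings may have differed.

# Bottom layer: raid0 pairs striped across the large partitions.
mdadm --create /dev/md3 --metadata=1.0 --level=0 --chunk=512 --raid-devices=2 /dev/sda4 /dev/sdc3
mdadm --create /dev/md4 --metadata=1.0 --level=0 --chunk=512 --raid-devices=2 /dev/sdb4 /dev/sdd3
mdadm --create /dev/md9 --metadata=1.0 --level=0 --chunk=512 --raid-devices=2 /dev/sde4 /dev/sdf3

# Middle layer: raid1 mirrors built on top of the raid0 arrays
# (md5 currently runs with one member missing, md6 is resyncing onto md9).
mdadm --create /dev/md5 --metadata=1.0 --level=1 --raid-devices=2 --bitmap=internal /dev/md3 missing
mdadm --create /dev/md6 --metadata=1.0 --level=1 --raid-devices=2 --bitmap=internal /dev/md4 /dev/md9

# Top layer: a raid1 of the two mirrors, carrying the XFS filesystem.
mdadm --create /dev/md7 --metadata=1.0 --level=1 --raid-devices=2 /dev/md5 /dev/md6
mkfs.xfs /dev/md7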

dmesg
----------
[1295936.011721] md: recovery of RAID array md6
[1295936.011730] md: minimum _guaranteed_ speed: 100000 KB/sec/disk.
[1295936.011733] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[1295936.011738] md: using 128k window, over a total of 4356343672k.
[1295936.033832] md: export_rdev(sde2)
[1295936.114241] md: bind<sde2>
[1295936.188266] md1: Warning: Device sde2 is misaligned
[1295936.188273] RAID1 conf printout:
[1295936.188275]  --- wd:4 rd:6
[1295936.188278]  disk 0, wo:0, o:1, dev:sdc1
[1295936.188280]  disk 1, wo:0, o:1, dev:sdd1
[1295936.188283]  disk 2, wo:0, o:1, dev:sdb2
[1295936.188285]  disk 3, wo:0, o:1, dev:sda2
[1295936.188287]  disk 4, wo:1, o:1, dev:sde2
[1296065.585142] md: recovery of RAID array md1
[1296065.585145] md: minimum _guaranteed_ speed: 100000 KB/sec/disk.
[1296065.585147] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[1296065.585152] md: using 128k window, over a total of 10739328k.
[1296196.841709] md: md1: recovery done.
[1296197.003548] RAID1 conf printout:
[1296197.003558]  --- wd:6 rd:6
[1296197.003565]  disk 0, wo:0, o:1, dev:sdc1
[1296197.003570]  disk 1, wo:0, o:1, dev:sdd1
[1296197.003574]  disk 2, wo:0, o:1, dev:sdb2
[1296197.003578]  disk 3, wo:0, o:1, dev:sda2
[1296197.003582]  disk 4, wo:0, o:1, dev:sde2
[1296197.003586]  disk 5, wo:0, o:1, dev:sdf1
[1296360.756072] INFO: task BackupPC_tarExt:17592 blocked for more than 120 seconds.
[1296360.756146]       Tainted: G          I  3.16.0-tecmintkernel #2
[1296360.756197] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1296360.756261] BackupPC_tarExt D 0000000000000000     0 17592  17581 0x00000000
[1296360.756270]  ffff8800b9680b00 0000000000000082 0000000000014640 ffff88006cb77fd8
[1296360.756277]  0000000000014640 ffff8800b9680b00 ffff880039476cf0 ffff88006cb77940
[1296360.756284]  7fffffffffffffff ffff8800b9680b00 0000000000000000 ffff8800ba0ec4c0
[1296360.756290] Call Trace:
[1296360.756307]  [] ? schedule_timeout+0x1f9/0x270
[1296360.756316]  [] ? __down+0x6b/0xa0
[1296360.756362]  [] ? _xfs_buf_find+0x135/0x280 [xfs]
[1296360.756371]  [] ? down+0x3c/0x50
[1296360.756391]  [] ? xfs_buf_lock+0x2a/0xc0 [xfs]
[1296360.756411]  [] ? _xfs_buf_find+0x135/0x280 [xfs]
[1296360.756430]  [] ? xfs_buf_get_map+0x1e/0x1a0 [xfs]
[1296360.756451]  [] ? xfs_buf_read_map+0x1f/0x130 [xfs]
[1296360.756484]  [] ? xfs_trans_read_buf_map+0x20b/0x480 [xfs]
[1296360.756510]  [] ? xfs_imap_to_bp+0x59/0xe0 [xfs]
[1296360.756537]  [] ? xfs_iread+0x6b/0x390 [xfs]
[1296360.756558]  [] ? xfs_iget+0x24b/0x6b0 [xfs]
[1296360.756586]  [] ? xfs_lookup+0xce/0xf0 [xfs]
[1296360.756607]  [] ? xfs_vn_lookup+0x4e/0xa0 [xfs]
[1296360.756616]  [] ? lookup_dcache+0x7b/0xb0
[1296360.756622]  [] ? lookup_real+0x14/0x60
[1296360.756628]  [] ? __lookup_hash+0x2a/0x40
[1296360.756635]  [] ? link_path_walk+0x46b/0xeb0
[1296360.756643]  [] ? path_lookupat+0x52/0xd60
[1296360.756667]  [] ? xfs_bmap_search_extents+0x59/0xe0 [xfs]
[1296360.756675]  [] ? filename_lookup+0x20/0xc0
[1296360.756682]  [] ? user_path_at_empty+0x4f/0xa0
[1296360.756688]  [] ? vfs_fstatat+0x48/0xa0
[1296360.756695]  [] ? SYSC_newstat+0x1a/0x40
[1296360.756703]  [] ? vfs_read+0x11c/0x170
[1296360.756709]  [] ? SyS_read+0x3d/0xa0
[1296360.756715]  [] ? system_call_fastpath+0x16/0x1b
[1296360.756724] INFO: task BackupPC_link:22685 blocked for more than 120 seconds.
[1296360.756784]       Tainted: G          I  3.16.0-tecmintkernel #2
[1296360.756834] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1296360.756896] BackupPC_link   D 0000000000000000     0 22685   1718 0x00000000
[1296360.756902]  ffff8800b87792f0 0000000000000082 0000000000014640 ffff88000041bfd8
[1296360.756909]  0000000000014640 ffff8800b87792f0 ffff880016d569f0 ffff88000041b9f0
[1296360.756914]  7fffffffffffffff ffff8800b87792f0 0000000000000000 ffff8800bafdd880
[1296360.756921] Call Trace:
[1296360.756929]  [] ? schedule_timeout+0x1f9/0x270
[1296360.756952]  [] ? nv_swncq_qc_issue+0x5b/0x90 [sata_nv]
[1296360.756960]  [] ? sched_clock+0x5/0x10
[1296360.756967]  [] ? __down+0x6b/0xa0
[1296360.756974]  [] ? __clear_sched_clock_stable+0x20/0x20
[1296360.756996]  [] ? _xfs_buf_find+0x135/0x280 [xfs]
[1296360.757003]  [] ? down+0x3c/0x50
[1296360.757021]  [] ? xfs_buf_lock+0x2a/0xc0 [xfs]
[1296360.757040]  [] ? _xfs_buf_find+0x135/0x280 [xfs]
[1296360.757060]  [] ? xfs_buf_get_map+0x1e/0x1a0 [xfs]
[1296360.757079]  [] ? xfs_buf_read_map+0x1f/0x130 [xfs]
[1296360.757105]  [] ? xfs_trans_read_buf_map+0x20b/0x480 [xfs]
[1296360.757131]  [] ? xfs_imap_to_bp+0x59/0xe0 [xfs]
[1296360.757156]  [] ? xfs_iread+0x6b/0x390 [xfs]
[1296360.757177]  [] ? xfs_iget+0x24b/0x6b0 [xfs]
[1296360.757204]  [] ? xfs_lookup+0xce/0xf0 [xfs]
[1296360.757225]  [] ? xfs_vn_lookup+0x4e/0xa0 [xfs]
[1296360.757232]  [] ? lookup_dcache+0x7b/0xb0
[1296360.757239]  [] ? lookup_real+0x14/0x60
[1296360.757245]  [] ? __lookup_hash+0x2a/0x40
[1296360.757251]  [] ? path_lookupat+0x903/0xd60
[1296360.757276]  [] ? xfs_trans_free_items+0x78/0xa0 [xfs]
[1296360.757283]  [] ? filename_lookup+0x20/0xc0

Processes in D state:
----------------------
   29 ?        D     72:29 [kswapd0]
 1815 ?        D     36:22 [xfsaild/md7]
17592 ?        D     33:37 /usr/bin/perl /usr/share/backuppc/bin/BackupPC_tarExtract orion-1000 / 3
21086 ?        D      3:58 /usr/bin/rsync --server --sender --numeric-ids --perms --owner --group -D --links --hard-links --times --block-size=2048 --recursive -D --one-file-system . /mnt/raid/local-copies/nemeton/
22341 ?        D      0:00 [md6_resync]
22685 ?        D      0:00 /usr/bin/perl /usr/share/backuppc/bin/BackupPC_link hestia
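
If it helps, I can also capture the kernel stacks of the blocked tasks and
a dump of all blocked tasks, roughly along these lines (a sketch, run as
root; the PID is taken from the listing above and sysrq must be enabled):

# List D-state tasks together with the kernel function they sleep in.
ps axo pid,stat,wchan:32,cmd | awk '$2 ~ /^D/'

# Kernel stack of one blocked task (PID from the listing above).
cat /proc/17592/stack

# Dump all blocked tasks into the kernel log (SysRq 'w'), then read it back.
echo w > /proc/sysrq-trigger
dmesg | tail -n 200

# Current state of the resync.
cat /proc/mdstat
cat /sys/block/md6/md/sync_action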

orfeus:~# cat /proc/mdstat
---------------------------
Personalities : [raid1] [raid0]
md9 : active raid0 sde4[0] sdf3[1]
      4356343808 blocks super 1.0 512k chunks

md7 : active raid1 md5[2] md6[1]
      4356343536 blocks super 1.0 [2/2] [UU]

md6 : active raid1 md9[2] md4[0]
      4356343672 blocks super 1.0 [2/1] [U_]
      [>....................]  recovery =  0.0% (6208/4356343672) finish=200447042.5min speed=0K/sec
      bitmap: 9/9 pages [36KB], 262144KB chunk

md5 : active raid1 md3[0]
      4356343616 blocks super 1.0 [2/1] [U_]
      bitmap: 33/33 pages [132KB], 65536KB chunk

md4 : active raid0 sdb4[0] sdd3[1]
      4356343808 blocks super 1.0 512k chunks

md3 : active raid0 sda4[0] sdc3[1]
      4356343808 blocks super 1.0 512k chunks

md2 : active raid1 sdc2[0] sda3[3] sdb3[2] sdd2[1]
      8787456 blocks [4/4] [UUUU]

md1 : active raid1 sdf1[5] sde2[4] sdc1[0] sda2[3] sdb2[2] sdd1[1]
      10739328 blocks [6/6] [UUUUUU]

>
>
>>
>> Regards, Hans
>>
>> On 05.01.2014 23:00, NeilBrown wrote:
>>
>> > On Sun, 22 Dec 2013 17:43:43 +0100 Hans Kraus wrote:
>> >
>> >> Hi Neil,
>> >
>> > Hi Hans,
>> > sorry for the delay - Christmas/New Year vacation...
>> >
[40800.777037] xfsaild/dm-0    D ffff88003754c300     0 20798      2 0x00000000
[40800.777042]  ffff88003754c300 0000000000000046 0000000000000000 ffff88005d7c51a0
[40800.777047]  0000000000013780 ffff88001b01ffd8 ffff88001b01ffd8 ffff88003754c300
[40800.777052]  ffff880071fede00 ffffffff81070fc1 0000000000000046 ffff88003734a400
[40800.777057] Call Trace:
[40800.777061]  [] ? arch_local_irq_save+0x11/0x17
[40800.777071]  [] ? md_flush_request+0x96/0x111 [md_mod]
[40800.777076]  [] ? try_to_wake_up+0x197/0x197
[40800.777082]  [] ? make_request+0x25/0x37a [raid456]
[40800.777091]  [] ?
>>
>> This looks like the most likely root of the problem - something wrong in
>> md_flush_request.
>>
>> There was a bug here fixed in 2.6.37
>>
>> commit a035fc3e2531703b539f23bec4ca7943cfc69349
>> Author: NeilBrown
>> Date:   Thu Dec 9 16:17:51 2010 +1100
>>
>>     md: fix possible deadlock in handling flush requests.
>>
>> You didn't say which kernel you were running. Could it be earlier than
>> 2.6.37???
>>
>> NeilBrown
>>
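
P.S. In case it is useful: whether a given kernel tree already contains the
fix referenced above can be checked along these lines (a sketch; it assumes
a clone of the mainline kernel git repository):

# Kernel currently running here.
uname -r

# Show the earliest tag that contains the flush-deadlock fix
# (it should be in the v2.6.37 series).
git describe --contains a035fc3e2531703b539f23bec4ca7943cfc69349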