Re: task xfssyncd blocked while raid5 was in recovery

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: NeilBrown <neilb@suse.de>
To: hanguozhong <hanguozhong@meganovo.com>
Cc: linux-raid <linux-raid@vger.kernel.org>, stan <stan@hardwarefreak.com>
Subject: Re: task xfssyncd blocked while raid5 was in recovery
Date: Wed, 24 Oct 2012 16:14:40 +1100	[thread overview]
Message-ID: <20121024161440.239ea8b8@notabene.brown> (raw)
In-Reply-To: <2012102411171145340220@meganovo.com>

[-- Attachment #1: Type: text/plain, Size: 11499 bytes --]

On Wed, 24 Oct 2012 11:17:15 +0800 hanguozhong <hanguozhong@meganovo.com>
wrote:

> >From: GuoZhong Han
> >Date: 2012-10-10 10:44
> >To: linux-raid
> >Subject: task xfssyncd blocked while raid5 was in recovery
>   
> >Hello, every one:
> >Recently, a problem has troubled me for a long time.
> >I created a 4*2T (sda, sdb, sdc, sdd) raid5 with XFS file system, 128K chuck size and 2048 strip_cache_size. The mdadm 3.2.2, kernel 2.6.38 and mkfs.xfs 3.1.1 were used. When the raid5 was in >recovery and the schedule reached 47%, I/O errors occurred in sdb. The following was the output:
> 
> >......
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >ata2: status=0x41 { DriveReady Error }
> >ata2: error=0x04 { DriveStatusError }
> >ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> >……
> >d 0:0:1:0: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> >sd 0:0:1:0: [sdb]  Sense Key : Aborted Command [current] [descriptor]
> >Descriptor sense data with sense descriptors (in hex):
> >        72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
> >        00 00 00 f7 
> > sd 0:0:1:0: [sdb]  Add. Sense: No additional sense information
> > sd 0:0:1:0: [sdb] CDB: Read(10): 28 00 6f 4c cc 80 00 04 00 00
> > end_request: I/O error, dev sdb, sector 1867304064
> > hrtimer: interrupt took 28024956 ns
> > ata2: status=0x41 { DriveReady Error }
> > ata2: error=0x04 { DriveStatusError }
> > ata2: translatedserver RspCode F688 Ctrl 0 Idx A6F Len 70 ATA stat/err 0x
> > 41/04 to SCSI SK00 00 64 65 35 64 62 65 65 31 3A 32 64 38 35 63 37 30 63 3A 39 64 31 63 33 63 38 39 3A 32 32 64 39 32 65 32 31 00 00 00 00 00 00/ASC/ASCQ 0xb/00 /00
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2F 00 00 00 ata2: status=0x4
> > 1 { DriveReady Error }
> > ata2: error=0x04 { DriveStatusError }
> > ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> > sd 0:0:1:0: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > sd 0:0:1:0: [sdb]  Sense Key : Aborted Command [current] [descriptor]
> > Descriptor sense data with sense descriptors (in hex):
> >         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
> >         00 00 00 f7 
> > sd 0:0:1:0: [sdb]  Add. Sense: No additional sense information
> > sd 0:0:1:0: [sdb] CDB: Read(10): 28 00 6f 4c a4 80 00 04 00 00
> > end_request: I/O error, dev sdb, sector 1867293824
> > ata2: status=0x41 { DriveReady Error }
> > ……
> 
> > Then, there were lots of error messages about the file system. The following was the output:
> 
> > ......
> > INFO: task xfssyncd/md127:1058 blocked for more than 120 seconds.
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > xfssyncd/md127  D fffffff7000216d0     0  1058      2 0x00000000
> >   frame 0: 0xfffffff700020570 __switch_to+0x1b8/0x1c0 (sp 0xfffffe008d7ff900)
> >   frame 1: 0xfffffff7000216d0 schedule+0x918/0x1538 (sp 0xfffffe008d7ff9d0)
> >   frame 2: 0xfffffff700022a90 schedule_timeout+0x268/0x5b0 (sp 0xfffffe008d7ffd18)
> >   frame 3: 0xfffffff700024ee0 __down+0xd8/0x158 (sp 0xfffffe008d7ffda8)
> >   frame 4: 0xfffffff70085da78 down.cold+0x8/0x28 (sp 0xfffffe008d7ffe18)
> >   frame 5: 0xfffffff700750788 xfs_buf_lock+0xd0/0x120 (sp 0xfffffe008d7ffe38)
> >  frame 6: 0xfffffff700821b40 xfs_getsb+0x38/0x78 (sp 0xfffffe008d7ffe50)
> >   frame 7: 0xfffffff70077e230 xfs_trans_getsb+0xe0/0x100 (sp 0xfffffe008d7ffe68)
> >   frame 8: 0xfffffff7006babc0 xfs_mod_sb+0x88/0x198 (sp 0xfffffe008d7ffe88)
> >   frame 9: 0xfffffff7007a6480 xfs_fs_log_dummy+0x68/0xe0 (sp 0xfffffe008d7ffeb8)
> >   frame 10: 0xfffffff70079c6c0 xfs_sync_worker+0xe0/0xe8 (sp 0xfffffe008d7ffed8)
> >   frame 11: 0xfffffff700570a00 xfssyncd+0x240/0x328 (sp 0xfffffe008d7ffef0)
> >   frame 12: 0xfffffff7000f0530 kthread+0xe0/0xe8 (sp 0xfffffe008d7fff80)
> >   frame 13: 0xfffffff7000bab38 start_kernel_thread+0x18/0x20 (sp 0xfffffe008d7fffe8)
> > INFO: task xfssyncd/md127:1058 blocked for more than 120 seconds.
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > xfssyncd/md127  D fffffff7000216d0     0  1058      2 0x00000000
> >   frame 0: 0xfffffff700020570 __switch_to+0x1b8/0x1c0 (sp 0xfffffe008d7ff900)
> >   frame 1: 0xfffffff7000216d0 schedule+0x918/0x1538 (sp 0xfffffe008d7ff9d0)
> >   frame 2: 0xfffffff700022a90 schedule_timeout+0x268/0x5b0 (sp 0xfffffe008d7ffd18)
> >   frame 3: 0xfffffff700024ee0 __down+0xd8/0x158 (sp 0xfffffe008d7ffda8)
> >   frame 4: 0xfffffff70085da78 down.cold+0x8/0x28 (sp 0xfffffe008d7ffe18)
> >   frame 5: 0xfffffff700750788 xfs_buf_lock+0xd0/0x120 (sp 0xfffffe008d7ffe38)
> >   frame 6: 0xfffffff700821b40 xfs_getsb+0x38/0x78 (sp 0xfffffe008d7ffe50)
> >   frame 7: 0xfffffff70077e230 xfs_trans_getsb+0xe0/0x100 (sp 0xfffffe008d7ffe68)
> >   frame 8: 0xfffffff7006babc0 xfs_mod_sb+0x88/0x198 (sp 0xfffffe008d7ffe88)
> >   frame 9: 0xfffffff7007a6480 xfs_fs_log_dummy+0x68/0xe0 (sp 0xfffffe008d7ffeb8)
> >   frame 10: 0xfffffff70079c6c0 xfs_sync_worker+0xe0/0xe8 (sp 0xfffffe008d7ffed8)
> >   frame 11: 0xfffffff700570a00 xfssyncd+0x240/0x328 (sp 0xfffffe008d7ffef0)
> >   frame 12: 0xfffffff7000f0530 kthread+0xe0/0xe8 (sp 0xfffffe008d7fff80)
> >   frame 13: 0xfffffff7000bab38 start_kernel_thread+0x18/0x20 (sp 0xfffffe008d7fffe8)
> > INFO: task xfssyncd/md127:1058 blocked for more than 120 seconds.
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > xfssyncd/md127  D fffffff7000216d0     0  1058      2 0x00000000
> >   frame 0: 0xfffffff700020570 __switch_to+0x1b8/0x1c0 (sp 0xfffffe008d7ff900)
> >   frame 1: 0xfffffff7000216d0 schedule+0x918/0x1538 (sp 0xfffffe008d7ff9d0)
> >   frame 2: 0xfffffff700022a90 schedule_timeout+0x268/0x5b0 (sp 0xfffffe008d7ffd18)
> >   frame 3: 0xfffffff700024ee0 __down+0xd8/0x158 (sp 0xfffffe008d7ffda8)
> >   frame 4: 0xfffffff70085da78 down.cold+0x8/0x28 (sp 0xfffffe008d7ffe18)
> >   frame 5: 0xfffffff700750788 xfs_buf_lock+0xd0/0x120 (sp 0xfffffe008d7ffe38)
> >   frame 6: 0xfffffff700821b40 xfs_getsb+0x38/0x78 (sp 0xfffffe008d7ffe50)
> >   frame 7: 0xfffffff70077e230 xfs_trans_getsb+0xe0/0x100 (sp 0xfffffe008d7ffe68)
> >   frame 8: 0xfffffff7006babc0 xfs_mod_sb+0x88/0x198 (sp 0xfffffe008d7ffe88)
> >   frame 9: 0xfffffff7007a6480 xfs_fs_log_dummy+0x68/0xe0 (sp 0xfffffe008d7ffeb8)
> >   frame 10: 0xfffffff70079c6c0 xfs_sync_worker+0xe0/0xe8 (sp 0xfffffe008d7ffed8)
> >   frame 11: 0xfffffff700570a00 xfssyncd+0x240/0x328 (sp 0xfffffe008d7ffef0)
> >   frame 12: 0xfffffff7000f0530 kthread+0xe0/0xe8 (sp 0xfffffe008d7fff80)
> > ......
> 
> > The output said “INFO: task xfssyncd/md127:1058 blocked for more than 120 seconds”. What did that mean? I used “cat /proc/mdstat” to see the state of the raid5. The output was:
> 
> > Personalities : [raid0] [raid6] [raid5] [raid4] 
> > md127 : active raid5 sdd[3] sdc[2] sdb[1](F) sda[0]
> >       5860540032 blocks super 1.2 level 5, 128k chunk, algorithm 2 [4/3] [U_UU]
> >         resync=PENDING
>       
> > unused devices: <none>
> 
> >      The state of the raid5 was “PENDING”. I had never seen such a state of raid5 before. After that, I wrote a program to access the raid5, there was no response any more. Then I used “ps aux| > task xfssyncd” to see the state of “xfssyncd”. Unfortunately, there was no response yet. Then I tried “ps aux”. There were outputs, but the program could exit with “Ctrl+d” or “Ctrl+z”. And > when I tested the write performance for raid5, I/O errors often occurred. I did not know why this I/O errors occurred so frequently. 
> >      What was the problem? Can any one help me?
> 
>       At first, I thought it was a bug of xfs filesystem. Then I change the filesystem to ext4. I used sda, sdb, sdc and sdd to build a raid5 again. 
>       When the raid5 was in recovery, I/O errors occurred in "sdb". And the state of the raid5 tend to "PENDING" again. 
>       I used "ps aux|grep 127" to see the status of the programs. The following were the outputs:
> 
>       root      1197  0.0  0.0      0     0 ?        D    Oct23   0:03 [jbd2/md127-8]
>       root      1157 75.6  0.0      0     0 ?        R    Oct23 908:02 [md127_raid5]
>       root      1159  1.2  0.0      0     0 ?        S    Oct23  14:36 [md127_resync]
>       root      1381  0.0  0.0      0     0 ?        D    Oct23   0:12 [flush-9:127]
> 
>       The state of "md127_raid5", "jbd2/md127-8", "md127_resync", and "flush-9:127" did not change to other state when I used "ps aux|grep 127" many times.
>       And the %cpu of "md127_raid5" had been the highest in about 75.6%. Programs to read and write to "md127" seemed hang, there was no response.
>       I has no other way, unless restart the machine.
>       
>       The following were the outputs of "smartctl -A /dev/sde":
>       # ./smartctl -A /dev/sde 
>       smartctl 5.39.1 2010-01-28 r3054 [tilegx-unknown-linux-gnu] (local build)
>       Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
> 
>       Probable ATA device behind a SAT layer
>       Try an additional '-d ata' or '-d sat' argument.
>  
>       I use the tilera platform, the newest kernel I can update is 3.0.38. Now I try to update my kernel to this version.
>       But I do not know if this problem have fixed in 3.0.38. If you have better solutions, please tell me. 


Might be fixed by upstream commit:

commit fab363b5ff502d1b39ddcfec04271f5858d9f26e
Author: Shaohua Li <shli@kernel.org>
Date:   Tue Jul 3 15:57:19 2012 +1000

    raid5: delayed stripe fix

which is in 3.0.38.
If it isn't that, I don't know what it is.

So it is worth trying 3.0.38.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

next prev parent reply	other threads:[~2012-10-24  5:14 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CACY-59cbWX9Gu_xsfqv_p8=Q7CabWZuj=ZH2K41j4N0-o-8WLw@mail.gmail.com>
2012-10-24  3:17 ` task xfssyncd blocked while raid5 was in recovery hanguozhong
2012-10-24  5:14   ` NeilBrown [this message]
2012-10-30  2:19     ` hanguozhong
2012-10-30  4:49       ` Stan Hoeppner
2012-10-10  3:14 GuoZhong Han
2012-10-10 11:54 ` Stan Hoeppner
2012-10-11  2:42   ` hanguozhong
2012-10-11  3:47     ` Chris Murphy
2012-10-11 11:20     ` Stan Hoeppner
2012-10-11  6:12   ` Mikael Abrahamsson
2012-10-11 11:01     ` Stan Hoeppner
2012-10-11 11:16       ` Mikael Abrahamsson
     [not found]         ` <201210112054336567511@meganovo.com>
2012-10-11 14:47           ` Stan Hoeppner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121024161440.239ea8b8@notabene.brown \
    --to=neilb@suse.de \
    --cc=hanguozhong@meganovo.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=stan@hardwarefreak.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).