linux-raid.vger.kernel.org archive mirror
From: Stan Hoeppner <stan@hardwarefreak.com>
To: GuoZhong Han <hanguozhong@meganovo.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: task xfssyncd blocked while raid5 was in recovery
Date: Wed, 10 Oct 2012 06:54:19 -0500
Message-ID: <507561EB.1050308@hardwarefreak.com>
In-Reply-To: <CACY-59dr2Gy5XgddD01yy+c5ktCgYvOdU9w-VX8GK27bvbbgRA@mail.gmail.com>

On 10/9/2012 10:14 PM, GuoZhong Han wrote:

> Recently, a problem has troubled me for a long time.
> 
> I created a 4*2T (sda, sdb, sdc, sdd) raid5 with XFS file system, 128K
> chunk size and 2048 stripe_cache_size. The mdadm 3.2.2, kernel 2.6.38
> and mkfs.xfs 3.1.1 were used. When the raid5 was in recovery and the
> progress reached 47%, I/O errors occurred in sdb. The following was
> the output:

> ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> 
> ata2: status=0x41 { DriveReady Error }
> 
> ata2: error=0x04 { DriveStatusError }
<snip repeated log entries>


> end_request: I/O error, dev sdb, sector 1867304064

Run smartctl and post this section:
"Vendor Specific SMART Attributes with Thresholds"

The drive that is sdb may or may not be bad.  smartctl may tell you
(us).  If the drive is not bad, you'll need to force relocation of this
bad sector to a spare.  If you don't know how, we can assist.
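
For reference, `smartctl -A /dev/sdb` prints that attribute table.  The
Python sketch below picks out the attributes that usually matter for a
suspected bad sector (IDs 5, 197 and 198).  The sample table is made up
for illustration and is not output from your drive, and the function
names are hypothetical.

```python
import subprocess

# Attribute IDs commonly checked when a drive reports unreadable sectors.
SECTOR_ATTRS = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# Made-up sample in the usual `smartctl -A` layout (NOT real data from sdb).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""


def parse_attributes(text):
    """Return {id: raw_value} for every attribute row in `smartctl -A` output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[int(fields[0])] = int(fields[9])
            except ValueError:
                pass  # raw value is not a plain integer; skip this row
    return attrs


def suspicious(attrs):
    """Names of the sector-related attributes whose raw value is non-zero."""
    return [SECTOR_ATTRS[i] for i in sorted(SECTOR_ATTRS) if attrs.get(i, 0) > 0]


def read_drive(device):
    """Run smartctl on a real device (needs root and smartmontools)."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False)
    return parse_attributes(out.stdout)


if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

Non-zero raw values for pending or uncorrectable sectors would point at
the drive, but how to weigh them is a judgment call and varies by vendor.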

> INFO: task xfssyncd/md127:1058 blocked for more than 120 seconds.

> The output said “INFO: task xfssyncd/md127:1058 blocked for more than
> 120 seconds”. What did that mean?

Precisely what it says.  It doesn't tell you WHY it was blocked, as it
can't know.  The fact that your md array was in recovery and having
problems with one of the member drives is a good reason for xfssyncd to
block.
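
The 120 second figure comes from the kernel's hung-task detector, whose
threshold is the `kernel.hung_task_timeout_secs` sysctl (120 by default).
A minimal Python sketch for pulling the task name, PID and threshold out
of such a log line, and for reading the current threshold, might look
like this.  The function names are hypothetical.

```python
import re

# Matches the hung-task warning the kernel prints for a blocked task.
HUNG_RE = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def parse_hung_task(line):
    """Return (task, pid, seconds) from a hung-task log line, or None."""
    m = HUNG_RE.search(line)
    if m is None:
        return None
    return m.group("task"), int(m.group("pid")), int(m.group("secs"))


def current_timeout(path="/proc/sys/kernel/hung_task_timeout_secs"):
    """Read the hung-task threshold; None if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    line = "INFO: task xfssyncd/md127:1058 blocked for more than 120 seconds."
    print(parse_hung_task(line))
    print(current_timeout())
```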

>      The state of the raid5 was “PENDING”. I had never seen such a
> state of raid5 before. After that, I wrote a program to access the
> raid5, there was no response any more. Then I used “ps aux| task
> xfssyncd” to see the state of “xfssyncd”. Unfortunately, there was no
> response yet. Then I tried “ps aux”. There were outputs, but the
> program could exit with “Ctrl+d” or “Ctrl+z”. And when I tested the
> write performance for raid5, I/O errors often occurred. I did not know
> why these I/O errors occurred so frequently.
> 
>      What was the problem? Can anyone help me?

It looks like drive sdb is bad or going bad.  smartctl output or
additional testing should confirm this.
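
One form of additional testing is to read the reported sector directly.
The Python sketch below only does the arithmetic and builds a read-only
`hdparm --read-sector` command without running it.  It assumes the drive
uses 512-byte logical sectors, and the 2 * 10**12 byte drive size is a
nominal figure, not the real capacity of sdb.  The function names are
hypothetical.

```python
SECTOR_SIZE = 512  # the kernel reports sector numbers in 512-byte units


def sector_to_bytes(sector):
    """Byte offset of a sector from the start of the device."""
    return sector * SECTOR_SIZE


def fraction_of_drive(sector, drive_bytes):
    """How far into the device the sector lies, as a fraction from 0 to 1."""
    return sector_to_bytes(sector) / drive_bytes


def read_sector_command(device, sector):
    """Build (but do not run) a read-only hdparm check for one sector."""
    return ["hdparm", "--read-sector", str(sector), device]


if __name__ == "__main__":
    bad = 1867304064          # from the end_request line in the log
    nominal_2tb = 2 * 10**12  # assumption: nominal size, not measured
    print(sector_to_bytes(bad))                                  # 956059680768
    print(round(100 * fraction_of_drive(bad, nominal_2tb), 1))   # 47.8
    print(" ".join(read_sector_command("/dev/sdb", bad)))
```

Under those assumptions the sector sits about 47.8% of the way into the
drive, which is roughly where the recovery stalled, though that match is
only suggestive since the md data offset is not accounted for.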

Also, your "XFS...blocked for 120s" error reminds me there are some
known XFS bugs in kernel 2.6.38 that cause a similar error, but they are
not the cause of your error.  Yours is a drive problem.  Nonetheless,
there have been dozens of XFS bugs fixed since 2.6.38 and I recommend you
upgrade to kernel 3.2.31 or 3.4.13 if you roll your own kernels.  If you
use distro kernels, get the latest 3.x series in the repos.
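
To check where a machine stands against those versions, a small Python
sketch can compare the running kernel release with a minimum.  The
helper names are hypothetical, and the parsing only looks at the leading
numeric part of the release string, ignoring distro suffixes.

```python
import platform
import re


def kernel_tuple(release):
    """Turn a release such as '3.2.31-1-amd64' into a tuple like (3, 2, 31)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    # A missing patch level (e.g. '6.1') is treated as 0.
    return tuple(int(part) if part else 0 for part in m.groups())


def older_than(release, minimum):
    """True if the release is older than the (major, minor, patch) minimum."""
    return kernel_tuple(release) < minimum


if __name__ == "__main__":
    running = platform.release()
    print(running, "older than 3.2.31:", older_than(running, (3, 2, 31)))
```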

-- 
Stan

