linux-raid.vger.kernel.org archive mirror
From: Stan Hoeppner <stan@hardwarefreak.com>
To: hanguozhong <hanguozhong@meganovo.com>
Cc: Mikael Abrahamsson <swmike@swm.pp.se>,
	Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: task xfssyncd blocked while raid5 was in recovery
Date: Thu, 11 Oct 2012 09:47:47 -0500
Message-ID: <5076DC13.2020408@hardwarefreak.com>
In-Reply-To: <201210112054336567511@meganovo.com>

On 10/11/2012 7:54 AM, hanguozhong wrote:
>>> Doesn't he still have 3 good drives? So since sdb was failed, there
>>> would be no reason for sdb to cause blocking or writes to the (now
>>> degraded) raid5? OP said he saw write IO errors to the array (?), which
>>> I thought was strange.
>>
>> I think the more important question is, why was the OP writing to a
>> filesystem on a small RAID5 array while it was doing a rebuild?
> 
>>> Why is that an important question?
> 
>>> Even if he was, should there ever be IO write errors on it, even if it has 
>>> a lot of load on it?
> 
> The problem was, there was no response to my program any more after xfssyncd was blocked.
> And I could use "rm -rf /mnt/md127/*" to remove the data in the raid5.

Please always reply to the linux-raid list, not to individual subscribers.

None of the above really matters at this point.  We know you have one
disk with at least one bad sector which isn't being reassigned for some
reason.  We know that the error recovery procedure in the drive and the
block layer was causing problems.  We also know you were generating a
non-trivial amount of IO on the array with a rebuild and application
write load when xfssyncd blocked.
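
As a rough illustration of the last point, an application could check
whether the array is rebuilding before it starts a heavy write load.
This is only a sketch: it assumes the md driver exposes the array state
in the sysfs file md/sync_action, and the array name "md127" is taken
from the mount point quoted above. The helper names are hypothetical.

```python
from pathlib import Path


def md_sync_action(md_name="md127", sysfs_root="/sys/block"):
    """Return the md array's current sync action, e.g. 'idle' or 'recover'.

    Assumption: the kernel exposes <sysfs_root>/<md_name>/md/sync_action.
    """
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    return path.read_text().strip()


def is_rebuilding(action):
    """True if the reported action means the array is rewriting itself."""
    return action in {"recover", "resync", "reshape"}


if __name__ == "__main__":
    try:
        action = md_sync_action()
    except OSError as exc:
        # No such array on this machine, or sysfs is not readable.
        print(f"could not read sync_action: {exc}")
    else:
        print(f"sync_action={action} rebuilding={is_rebuilding(action)}")
```

A benchmark wrapper could poll this and hold off its write load until
the action returns to "idle".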

It seems your application was likely doing sync, fsync, or fdatasync
operations.  Writes to the XFS journal are always synchronous barrier
writes, so if you were running a metadata-heavy benchmark program you
were issuing lots of fsyncs.  It seems that, due to the underlying IO
problems, xfssyncd was blocking while waiting for those fsyncs to be
acknowledged.  If that's not the case, then you're hitting one of the
XFS bugs I mentioned that have already been fixed in newer kernels.
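
One way to see whether fsync is what stalls is to time it directly.
The sketch below is not the OP's program, just a minimal probe: it
writes small blocks to a temporary file in a given directory and
records how long each os.fsync() call takes. The function name, block
size, and count are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def time_fsyncs(directory, count=20, payload=b"x" * 4096):
    """Write `count` blocks into a temp file under `directory`, calling
    fsync after each write, and return the fsync latencies in seconds."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            os.write(fd, payload)
            start = time.monotonic()
            os.fsync(fd)  # blocks until the kernel reports the data as stable
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Point this at the filesystem under test, e.g. the array's mount point.
    results = time_fsyncs(tempfile.gettempdir(), count=5)
    print("max fsync latency: %.4f s" % max(results))
```

If individual fsync calls take many seconds on the array but not on a
healthy disk, that points at the IO path underneath the filesystem
rather than at the application.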

Thus, your solution is:

1.  Fix or replace the drive with the bad sector(s)
2.  Update to a 3.x series kernel

Discussing anything else before you complete these tasks is a waste of
keystrokes, yours and ours.
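
For step 2, a quick way to confirm which kernel series a box is running
is to parse the release string. This is a small sketch using Python's
platform module; the helper names are hypothetical and the sample
strings in the comments are only examples, not the OP's actual kernel.

```python
import platform
import re


def kernel_version(release=None):
    """Return (major, minor) parsed from a kernel release string.

    With no argument, the running kernel's release string is used.
    """
    if release is None:
        release = platform.release()
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_3x_or_newer(release=None):
    """True if the kernel is from the 3.x series or later."""
    return kernel_version(release)[0] >= 3


if __name__ == "__main__":
    # Example strings: "2.6.32-279.el6.x86_64" is pre-3.x, "3.5.4" is not.
    print(platform.release(), "3.x or newer:", is_3x_or_newer())
```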

-- 
Stan

