From: Phil Turmel <philip@turmel.org>
To: Julie Ashworth <ashworth@berkeley.edu>
Cc: linux-raid@vger.kernel.org
Subject: Re: request help with RAID1 array that endlessly attempts to sync
Date: Tue, 17 Dec 2013 14:43:18 -0500
Message-ID: <52B0A956.7030501@turmel.org>
In-Reply-To: <20131217192637.GB5070@localhost.localdomain>
On 12/17/2013 02:26 PM, Julie Ashworth wrote:
> Thanks Phil,
> I should note that the drives are labelled "enterprise", purchased from a hw RAID vendor (ACNC.com).
>
> On 17-12-2013 12.55 -0500, Phil Turmel wrote:
>> Please post the output of "smartctl -x" for both of these drives.
>
> The Centos5 smartctl (from smartmontools rpm) doesn't support the -x option. However, it's apparently equivalent to:
> smartctl -H -i -g all -c -A -f brief -l xerror,error -l xselftest,selftest -l selective -l directory -l scttemp -l scterc -l devstat -l sataphy
>
> Centos5 smartctl supports the following:
> smartctl -H -i -c -A -l error -l selftest -l selective -l directory -l scttemp -l scttempsts -l scttemphist
>
> ... and I enclosed the output for sda and sdb.
> If you think it would be useful to have the additional options (provided by -x), then let me know, and I'll try to build it.
I was interested in the reallocation counts, the current pending
sectors, and the scterc timeouts. The latter were not present, and are
important.
But /dev/sdb has three reallocations and only one pending sector. That's
an old drive, but not sick. I'd be concerned that there are other
hardware issues in your system if the timeout issue is not part of the
problem.
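For reference, a minimal sketch of the usual timeout-mismatch workaround. It tries to set a 7.0 second error recovery limit on each drive and, where that fails, raises the kernel's command timeout instead. The 70 (units of 100 ms) and 180 second values are commonly suggested figures, not numbers from this thread; the function name, the sysfs-root argument and the --run switch are illustrative. It assumes a smartctl new enough to accept "-l scterc" and that it exits non-zero when the drive rejects the command. An older smartctl that lacks the option also fails, so those drives simply get the longer kernel timeout.

```shell
#!/bin/sh
# Sketch: align drive error recovery with the kernel's command timeout.
# fix_timeouts and its optional sysfs-root argument are illustrative names.

fix_timeouts() {
    root="${1:-/sys}"
    for dev in "$root"/block/sd*; do
        # Skip when the glob matched nothing or the device has no timeout file.
        [ -e "$dev/device/timeout" ] || continue
        name=${dev##*/}
        # Try to set read/write ERC to 7.0 seconds (units are 100 ms).
        if smartctl -l scterc,70,70 "/dev/$name" >/dev/null 2>&1; then
            echo "$name: scterc set to 7.0s"
        else
            # No usable ERC: give the drive more time before the kernel
            # resets the link.
            echo 180 > "$dev/device/timeout"
            echo "$name: scterc unavailable, kernel timeout raised to 180s"
        fi
    done
}

# Touch the real devices only when asked explicitly (needs root).
if [ "${1:-}" = "--run" ]; then
    fix_timeouts /sys
fi
```

Neither setting survives a power cycle, so something like this would normally be run at every boot, for example from rc.local.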
>> timeout mismatches combined with lack of scrubbing.
>
> I've read about mismatches, but not about scrubbing. I'll investigate this.
> What program/options do you use for your weekly scrub?
Simple weekly cron job does "echo check >>/sys/block/mdX/md/sync_action"
for each array.
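As a rough sketch of what such a cron job could look like (not the exact script in use here): loop over every md array and request a check only when the array is idle, so a resync or recovery already in progress is left alone. The function name, the sysfs-root argument and the --run switch are illustrative.

```shell
#!/bin/sh
# Sketch of a weekly md scrub. start_scrub and its optional sysfs-root
# argument are illustrative names, not part of any existing tool.

start_scrub() {
    root="${1:-/sys}"
    for action in "$root"/block/md*/md/sync_action; do
        # Skip when the glob matched nothing or the file is not writable.
        [ -w "$action" ] || continue
        # Only request a check when the array is idle; a resync or
        # recovery already in progress is left alone.
        state=$(cat "$action")
        if [ "$state" = "idle" ]; then
            echo check > "$action"
        fi
    done
}

# Touch the real arrays only when asked explicitly (needs root).
if [ "${1:-}" = "--run" ]; then
    start_scrub /sys
fi
```

Saved under a hypothetical path such as /usr/local/sbin/md-scrub, a root crontab entry like "0 3 * * 0 /usr/local/sbin/md-scrub --run" would start it early every Sunday. Progress shows up in /proc/mdstat, and the mismatch count for each array can be read from /sys/block/mdX/md/mismatch_cnt afterwards.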
>> Maybe not. Please tell us you know all about error recovery timeouts
>
> Instead of stopping the sync, I decided to slow it down:
> echo 1001 > /proc/sys/dev/raid/speed_limit_max
>
>> and the timeout mismatch problem commonly encountered with
>> consumer-grade hard drives. Otherwise, you might want to search the list
>> archives for various combinations of the keywords "scterc", "error
>> recovery", "timeout mismatch", "URE", and/or "bit error rate".
>
> I'm not a big fan of Seagate (enterprise or not). The drives I purchased before these (~2008) needed to have firmware updates to prevent bricking. Sigh.
That Seagate part number twigged an old memory... I didn't think it was
an enterprise drive. I have had good experiences with Hitachi, FWIW.
Recent purchases have all been WD Red just for this issue.
Phil