From: Mike Accetta <maccetta@laurelnetworks.com>
To: Alberto Alonso <alberto@ggsys.net>
Cc: Neil Brown <neilb@suse.de>, linux-raid@vger.kernel.org
Subject: Re: Software RAID when it works and when it doesn't
Date: Tue, 16 Oct 2007 17:57:13 -0400
Message-ID: <14526.1192571833@mdt.ecitele.com>
In-Reply-To: Your message of "Sun, 14 Oct 2007 00:57:58 CDT." <1192341478.16416.298.camel@w100>
Alberto Alonso writes:
> On Sun, 2007-10-14 at 08:50 +1000, Neil Brown wrote:
> > On Saturday October 13, alberto@ggsys.net wrote:
> > > Over the past several months I have encountered 3
> > > cases where the software RAID didn't work in keeping
> > > the servers up and running.
> > >
> > > In all cases, the failure has been on a single drive,
> > > yet the whole md device and server become unresponsive.
> > >
> > > (usb-storage)
> > > In one situation a RAID 0 across 2 USB drives failed
> > > when one of the drives accidentally got turned off.
> >
> > RAID0 is not true RAID - there is no redundancy. If one device in a
> > RAID0 fails, the whole array will fail. This is expected.
>
> Sorry, I meant RAID 1. Currently, we only use RAID 1 and RAID 5 on all
> our systems.
>
> >
> > >
> > > (sata)
> > > A second case a disk started generating reports like:
> > > end_request: I/O error, dev sdb, sector 42644555
> >
> > So the drive had errors - not uncommon. What happened to the array?
>
> The array never became degraded, it just made the system
> hang. I reported it back in May, but couldn't get it
> resolved. I replaced the system and unfortunately went
> to a non-RAID solution for that server.

Was the disk driver generating any low-level errors at the time, or
otherwise indicating that it might be retrying operations on the bad
drive (e.g. in the console diagnostics)? As Neil mentioned later, the
md layer is at the mercy of the low-level disk driver. We've observed
abysmal RAID1 recovery times on failing SATA disks because all the time
is being spent in the driver retrying operations which will never
succeed. Also, read errors don't tend to fail the array, so when the bad
disk is accessed again for some subsequent read, the whole hopeless
retry process begins anew.

I posted a patch about 6 weeks ago which attempts to improve this
situation for RAID1 by telling the driver not to retry on failures and
by giving some weight to read errors for failing the array. Hopefully,
Neil is still mulling it over and it or something similar will
eventually make it into the mainline kernel as a solution for this
problem.
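
As a rough illustration of why those two changes help, here is a toy
model in Python. It is not the patch itself and does not reflect any
real md or driver interface: the Mirror class, the read() and simulate()
functions, and every number in it (retry counts, one second per failed
attempt, the error limit) are invented for the sketch. It also assumes
the worst case, where the bad disk is always tried first.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    name: str
    bad: bool = False       # hypothetical: every read from a bad mirror fails
    read_errors: int = 0    # read errors seen so far on this mirror
    failed: bool = False    # True once the mirror is kicked out of the array


def read(mirrors, driver_retries, error_limit, attempt_cost_s):
    """Serve one read; return the seconds wasted on attempts that failed."""
    wasted = 0.0
    for m in mirrors:
        if m.failed:
            continue
        if not m.bad:
            return wasted   # a good mirror satisfied the read
        # The driver makes one attempt plus its internal retries before
        # it finally reports the error upward.
        wasted += (1 + driver_retries) * attempt_cost_s
        m.read_errors += 1
        # Optional policy: enough read errors fail the mirror for good.
        if error_limit is not None and m.read_errors >= error_limit:
            m.failed = True
    raise IOError("no working mirror left")


def simulate(n_reads, driver_retries, error_limit, attempt_cost_s=1.0):
    """Total seconds wasted over n_reads against a two-disk mirror."""
    mirrors = [Mirror("sdb", bad=True), Mirror("sda")]
    return sum(
        read(mirrors, driver_retries, error_limit, attempt_cost_s)
        for _ in range(n_reads)
    )


if __name__ == "__main__":
    # Driver retries, read errors never fail the disk: every read pays.
    print(simulate(100, driver_retries=5, error_limit=None))
    # No driver retries, three read errors fail the disk: cost is bounded.
    print(simulate(100, driver_retries=0, error_limit=3))
```

With these made-up numbers the first case wastes 600 seconds over 100
reads and the second wastes 3 seconds in total, because the bad disk
stops being consulted after its third read error.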
--
Mike Accetta
ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)