From: David R <david@unsolicited.net>
To: Cynbe ru Taren <cynbe@muq.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: FYI: RAID5 unusably unstable through 2.6.14
Date: Tue, 17 Jan 2006 19:58:59 +0000
Message-ID: <43CD4C83.9090608@unsolicited.net>
In-Reply-To: <E1EywcM-0004Oz-IE@laurel.muq.org>

Cynbe ru Taren wrote:
> The current Linux kernel RAID5 implementation is just
> too fragile to be used for most of the applications
> where it would be most useful.

I'm not sure I agree.

> What happens repeatedly, at least in my experience over
> a variety of boxes running a variety of 2.4 and 2.6
> Linux kernel releases, is that any transient I/O problem
> results in a critical mass of RAID5 drives being marked
> 'failed', at which point there is no longer any supported

What "transient" I/O problem would this be? I've had loads of issues with
flaky motherboard/PCI bus implementations that make RAID using add-in cards
(all 5 slots filled with other devices) a nightmare. The built-in controllers
seem to be more reliable.

> way of retrieving the data on the RAID5 device, even
> though the underlying drives are all fine, and the underlying
> data on those drives almost certainly intact.

This is no problem; just use something like:

	mdadm --assemble --force /dev/md5 /dev/sda1 /dev/sdb1 /dev/sdc1 \
		/dev/sdd1 /dev/sde1

(Then, of course, run fsck on the result.)

You can even do this with one drive fewer than the full set (N-1 drives),
then add the last one back so it gets synced up in the background.
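
A minimal sketch of that N-1 sequence, assuming the same device names as the
example above and that /dev/sde1 is the member being left out. It only prints
the commands (a dry run) so nothing touches a real array until each line has
been reviewed. The --run flag is my addition: mdadm may otherwise decline to
start an array when given fewer drives than it last ran with. MD, MEMBERS,
LAST and plan are illustrative names, not anything mdadm defines.

```shell
#!/bin/sh
# Dry run: print the recovery commands instead of executing them.
# Device names are taken from the example above; adjust before use.
MD=/dev/md5
MEMBERS="/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1"   # N-1 members
LAST=/dev/sde1                                      # member left out for now

plan() {
    # 1. Force-assemble the array degraded; --run starts it even though
    #    fewer drives are listed than were present last time.
    echo "mdadm --assemble --force --run $MD $MEMBERS"
    # 2. Check the filesystem before trusting the data.
    echo "fsck $MD"
    # 3. Add the remaining drive; md rebuilds onto it in the background.
    echo "mdadm $MD --add $LAST"
    # 4. Watch the rebuild progress.
    echo "cat /proc/mdstat"
}

plan
```

Once the printed commands look right for the machine in question, they can be
run by hand one at a time as root.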

> This has just happened to me for at least the sixth time,
> this time in a brand new RAID5 consisting of 8 200G hotswap
> SATA drives backing up the contents of about a dozen onsite
> and offsite boxes via dirvish, which took me the better part
> of December to get initialized and running, and now two weeks
> later I'm back to square one.

:-( ... maybe try the forced assemble?

> I'm currently digging through the md kernel source code
> trying to work out some ad-hoc recovery method, but this
> level of flakiness just isn't acceptable on systems where
> reliable mass storage is a must -- and when else would
> one bother with RAID5?

It isn't flaky for me now that I'm using a better-quality motherboard; in
fact, it has saved me through 3 near-simultaneous failures of WD 250GB drives.

> We need RAID5 to be equally resilient in the face of
> real-world problems, people -- it isn't enough to
> just be able to function under ideal lab conditions!

I think it already is resilient. The automatic handling is paranoid (as it
should be) when failures are noticed. The array can still be assembled
manually, though.

> A design bug is -still- a bug, and -still- needs to
> get fixed.

It's not a design bug, in my opinion.

> Something HAS to be done to make the RAID5 logic
> MUCH more conservative about destroying RAID5
> systems in response to a transient burst of I/O
> errors, before it can in good conscience be declared

If such things are common, you should investigate the hardware.

> ready for production use -- or at MINIMUM to provide
> a SUPPORTED way of restoring a butchered RAID5 to
> last-known-good configuration or such once transient
> hardware issues have been resolved.

There is a supported way: the forced assemble. See above.

> In the meantime, IMHO Linux RAID5 should be prominently flagged
> EXPERIMENTAL -- NONCRITICAL USE ONLY or some such, to avoid
> building up ill-will and undeserved distrust of Linux
> software quality generally.

I'd calm down if I were you.

Cheers
David
