linux-raid.vger.kernel.org archive mirror
From: Alberto Alonso <alberto@ggsys.net>
To: linux-raid@vger.kernel.org
Subject: Software RAID when it works and when it doesn't
Date: Sat, 13 Oct 2007 13:40:51 -0500
Message-ID: <1192300851.16416.265.camel@w100>

Over the past several months I have encountered 3
cases where software RAID failed to keep the
servers up and running.

In all cases, the failure was on a single drive, yet
the whole md device and the server became unresponsive.

(usb-storage)
In one situation a RAID 0 across 2 USB drives failed
when one of the drives accidentally got turned off.

(sata)
In a second case, a disk started generating errors like:
end_request: I/O error, dev sdb, sector 42644555

(sata)
The third case (which I'm living through right now) is
a disk that I can see during the boot process, but
operations on it never return (e.g. fdisk -l /dev/sdc).
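
A minimal sketch of how a monitoring script might notice this kind of hang, assuming the probe is an external command such as the fdisk call above. The helper name probe_returns and the timeout value are hypothetical, not part of any md tooling. Note that a process blocked in uninterruptible I/O may ignore the kill, so this only detects the hang; it does not clear it.

```python
import subprocess


def probe_returns(cmd, timeout_s=10.0):
    """Return True if cmd exits within timeout_s seconds, else False.

    Hypothetical helper: cmd is any command that touches the suspect
    disk. The exit status is ignored; only "did it come back" matters.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck in uninterruptible I/O
        # may not die, so we deliberately do not wait for it again.
        proc.kill()
        return False
```

For example, probe_returns(["fdisk", "-l", "/dev/sdc"], timeout_s=15.0) would return False for a disk that behaves as described, provided it is run with enough privileges to read the device.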

(pata)
I have had at least 4 situations on old servers based
on PATA disks where disk failures were successfully
flagged and the arrays were degraded automatically.

So, this is all making me wonder under what circumstances
software RAID may have problems detecting disk failures.
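
One way to confirm whether md actually flagged a failure is to look at /proc/mdstat, where a degraded array shows an underscore in its member status (for example [U_] instead of [UU]). The following is a rough sketch under that assumption; the function name degraded_arrays and the sample text are made up for illustration.

```python
import re


def degraded_arrays(mdstat_text):
    """Map array name -> member status string for arrays missing a member.

    Assumes the usual /proc/mdstat layout: an "mdN : ..." line followed
    by a line ending in a status such as [UU] or [U_].
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


# Illustrative sample only, not output from the servers discussed here.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      104320 blocks [2/1] [U_]
md1 : active raid1 sdb2[1] sda2[0]
      1048576 blocks [2/2] [UU]
unused devices: <none>
"""
```

On a live system the input would come from reading /proc/mdstat. In the hang cases described above the array may never reach the degraded state, so an empty result does not prove the disks are healthy.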

I need to come up with a best-practices solution and also
need to understand more as I move into RAID over the local
network (e.g. iSCSI, AoE or NBD). Could a disk failure in
one of the servers, or a server going offline, bring the
whole array down?

Thanks for any information or comments,

Alberto


