linux-raid.vger.kernel.org archive mirror
* Joys of spare disks!
@ 2005-02-28 14:24 Robin Bowes
  2005-02-28 15:04 ` Jon Lewis
  2005-03-02  2:48 ` Robin Bowes
  0 siblings, 2 replies; 22+ messages in thread
From: Robin Bowes @ 2005-02-28 14:24 UTC (permalink / raw)
  To: linux-raid

Hi,

I run a RAID5 array built from six 250GB Maxtor Maxline II SATA disks.
After several problems with Maxtor disks I decided to configure a hot
spare, i.e. five active disks plus one spare.
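
For reference, an array in this 5+1 arrangement would be created with
something along these lines (a sketch only -- the exact device list is
illustrative, not copied from my setup):

	mdadm --create /dev/md5 --level=5 --chunk=128 \
		--raid-devices=5 --spare-devices=1 \
		/dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2 /dev/sdf2 /dev/sdd2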

Well, *another* disk failed last week. The spare disk was brought into 
play seamlessly:

[root@dude ~]# mdadm --detail /dev/md5
/dev/md5:
         Version : 00.90.01
   Creation Time : Thu Jul 29 21:41:38 2004
      Raid Level : raid5
      Array Size : 974566400 (929.42 GiB 997.96 GB)
     Device Size : 243641600 (232.35 GiB 249.49 GB)
    Raid Devices : 5
   Total Devices : 6
Preferred Minor : 5
     Persistence : Superblock is persistent

     Update Time : Mon Feb 28 14:00:54 2005
           State : clean
  Active Devices : 5
Working Devices : 5
  Failed Devices : 1
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 128K

            UUID : a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
          Events : 0.6941488

     Number   Major   Minor   RaidDevice State
        0       8        2        0      active sync   /dev/sda2
        1       8       18        1      active sync   /dev/sdb2
        2       8       34        2      active sync   /dev/sdc2
        3       8       82        3      active sync   /dev/sdf2
        4       8       66        4      active sync   /dev/sde2

        5       8       50        -      faulty   /dev/sdd2

I've done a quick test of /dev/sdd2:

[root@dude ~]# dd if=/dev/sdd2 of=/dev/null bs=64k
dd: reading `/dev/sdd2': Input/output error
50921+1 records in
50921+1 records out

So, I guess it's time to raise another return with Maxtor <sigh>.

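For the RMA paperwork it might help to capture a SMART self-test log as
well (assuming smartmontools is installed; libata SATA drives may need
the -d ata flag):

	smartctl -t long /dev/sdd	# start the drive's extended self-test
	smartctl -a /dev/sdd		# results, incl. reallocated sector count
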
/dev/sdd1 is used in /dev/md0. So, just to confirm, is this what I need 
to do to remove the bad disk and add its replacement:

Remove faulty partition:

	mdadm --manage /dev/md5 --remove /dev/sdd2

Remove "good" from RAID1 array:

	mdadm --manage /dev/md0 --fail /dev/sdd1
	mdadm --manage /dev/md0 --remove /dev/sdd1

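To be sure of pulling the right physical drive, grab its serial number
first and match it against the label (smartmontools again):

	smartctl -i /dev/sdd	# prints model and serial number
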
[pull out bad disk, install replacement]

Partition the new disk (it will appear as /dev/sdd; all six disks are 
partitioned identically) by copying the layout from an existing disk:

	sfdisk -d /dev/sda | sfdisk /dev/sdd

(The `fdisk -l /dev/sda | fdisk /dev/sdd` form I used before doesn't 
actually work -- fdisk's listing isn't valid fdisk input -- hence sfdisk's 
dump format. I seem to remember having a problem when I did this last 
time; something about sfdisk balking at a brand-new disk with no existing 
partition table?)
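
Worth verifying the copied table before re-adding anything:

	sfdisk -l /dev/sdd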

Add new partitions to arrays:

	mdadm --manage /dev/md0 --add /dev/sdd1
	mdadm --manage /dev/md5 --add /dev/sdd2
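
Then keep an eye on the rebuild until it completes:

	watch cat /proc/mdstat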

Thanks,

R.
-- 
http://robinbowes.com


* Re: Joys of spare disks!
@ 2005-03-07 16:36 LinuxRaid
  2005-03-07 17:09 ` Peter T. Breuer
  0 siblings, 1 reply; 22+ messages in thread
From: LinuxRaid @ 2005-03-07 16:36 UTC (permalink / raw)
  To: linux-raid

With error-correcting RAID, where the whole idea is to do everything possible
to maintain data reliability, it seems to me the correct behavior of the RAID
subsystem is to attempt to re-write ECC-failed data blocks whenever possible.

This is especially true where software-controlled timeouts are being
implemented on ATA/SATA drives.

I'm running several RAID-5 arrays against mixed PATA/SATA systems, and I am
amazed at how fragile Linux software RAID-5 really is.  It makes no sense to
me that one soft ECC error should kick an entire disk out of a volume,
forcing a rebuild or a run in "degraded" mode, with the inherent risk that
another event on another disk results in the loss of all data on the
storage system.

And from what I can tell, Linux software RAID never gives the drive a
chance to reallocate "weak" sectors...

What should be happening (a rough manual approximation with today's tools
follows the list):
1) The drive has a read error, or does not deliver the data within the
command timeout parameters that have been issued to it.
2) The RAID driver collects the blocks from the "working" drives and
regenerates the missing data from the problem drive.
3) The RAID driver both returns the data to the calling process and issues
a re-write of the bad block on the disk drive in question.
4) The RAID driver generates a log message tracking the problem.
5) When the number of "event messages" for block re-writes exceeds a
certain threshold, alert the sysadmin that a specific drive is unreliable.
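
The closest manual approximation with today's tools, as far as I can tell,
is to re-add a kicked drive and let the full resync re-write every sector,
which at least gives the drive its chance to reallocate (a sketch, borrowing
Robin's device names purely for illustration):

	mdadm /dev/md5 --remove /dev/sdd2	# clear the faulty slot
	mdadm /dev/md5 --add /dev/sdd2		# resync re-writes every block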

I've been going through the MD driver source and, to tell the truth, can't
figure out where the read error is detected or how to "hook" that event and
force a re-write of the failing sector.  I would very much appreciate it if
someone out there could send me some hints/tips/pointers on how to
implement this.  I'm not a Linux / kernel hacker (yet), but this should not
be hard to fix....
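
In case it helps anyone else digging, the code paths I've been staring at
can be located with something like this (assuming a 2.6-era tree; the
symbol names are my best guesses, not gospel):

	grep -n end_read_request drivers/md/raid5.c
	grep -n md_error drivers/md/*.c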

John Suykerbuyk


At Wed, 2 Mar 2005 13:05:04 +0100, you wrote
>Hm..  I said partial resync, because a full resync would be a waste of
>time if it's just a thousand sectors or so that needs to be relocated.
> Anyhow.
>
>There's no overhead to the application with the (theoretically
>"partial") degraded mode, since it happens in parallel.
>
>The latency of doing it while the read operation is ongoing would be,
>say, 3 seconds or so per bad sector on a standard disk?  Imagine a
>thousand bad sectors, and any sane person would quickly pull the plug
>from the dead box and have it resync when it boots instead of staring
>at a hung system.  When that happens there's even the risk that the
>resync fails completely, if md decides to pull one of the disks other
>than the one with bad blocks on it from the array before it resyncs.
>
>I prefer the first scenario (the system keeps running, the array isn't
>potentially destroyed), even if it means a slightly lower I/O rate and
>thus a minor overhead if and only if running applications utilize the
>I/O subsystem 100%..
>
>Am I wrong?
>
>Guy wrote:
>> I think the overhead related to fixing the bad blocks would be insignificant
>> compared to the overhead of degraded mode.
>>
>> Guy
>>
>> -----Original Message-----
>> From: linux-raid-owner@vger.kernel.org
>> [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Molle Bestefich
>> Sent: Tuesday, March 01, 2005 10:51 PM
>> To: linux-raid@vger.kernel.org
>> Subject: Re: Joys of spare disks!
>>
>> Robin Bowes wrote:
>> > I envisage something like:
>> >
>> > md attempts read
>> > one disk/partition fails with a bad block
>> > md re-calculates correct data from other disks
>> > md writes correct data to "bad" disk
>> >   - disk will re-locate the bad block
>>
>> Probably not that simple, since sometimes multiple blocks will go
>> bad, and you wouldn't want the entire system to come to a screeching
>> halt whenever that happens.
>>
>> A more consistent and risk-free way of doing it would probably be to
>> do the above partial resync in a background thread or so?..


* Re: Joys of spare disks!
@ 2005-03-07 20:15 LinuxRaid
  0 siblings, 0 replies; 22+ messages in thread
From: LinuxRaid @ 2005-03-07 20:15 UTC (permalink / raw)
  To: linux-raid

Well,

With this much interest, I will tear back into the bowels of RAID-5. Again,
anyone else reading this with a shred of a clue as to where to start, please
chime in!

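For anyone wanting to follow along, a throwaway rig along the lines Mike
describes below might look something like this (sizes and file names are
my own invention, not his script):

	# five 100MB backing files on loop devices, then a RAID-5 on top
	for i in 0 1 2 3 4; do
		dd if=/dev/zero of=/tmp/md-test-$i bs=1M count=100
		losetup /dev/loop$i /tmp/md-test-$i
	done
	mdadm --create /dev/md9 --level=5 --raid-devices=5 \
		/dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4
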
- John "S"

At Mon, 07 Mar 2005 10:18:14 -0800, you wrote
>
>
>
>LinuxRaid@Suykerbuyk.org wrote:
>
>> I'm running several RAID-5 arrays against mixed PATA/SATA systems, and I am
>> amazed at how fragile Linux software RAID-5 really is. [...]
>
>Amen!
>
>> What should be happening:
>> 1) The drive has a read error, or does not deliver the data within the
>> command timeout parameters that have been issued to it.
>> 2) The RAID driver collects the blocks from the "working" drives and
>> regenerates the missing data from the problem drive.
>> 3) The RAID driver both returns the data to the calling process and issues
>> a re-write of the bad block on the disk drive in question.
>> 4) The RAID driver generates a log message tracking the problem.
>> 5) When the number of "event messages" for block re-writes exceeds a
>> certain threshold, alert the sysadmin that a specific drive is unreliable.
>
>Absolutely
>
>> implement this.  I'm not a Linux / kernel hacker (yet), but this should not
>> be hard to fix....
>
>You will have a very willing tester in me if you generate any patches. I
>haven't played with device mapper yet (though that is apparently the way
>to get fake "faulty" devices for testing), but I have created a quick
>script to create/destroy a loopback-mounted set of files and a raid5 array
>on top of them. It's in the archives and may or may not help as a test rig
>while you're hacking on the code.
>
>There are lots more people than just me interested too, if you've got the
>motivation.
>
>-Mike
>



Thread overview: 22+ messages
2005-02-28 14:24 Joys of spare disks! Robin Bowes
2005-02-28 15:04 ` Jon Lewis
2005-02-28 15:23   ` Robin Bowes
2005-02-28 15:54     ` Nicola Fankhauser
2005-02-28 17:04       ` Robin Bowes
2005-02-28 18:58         ` Nicola Fankhauser
2005-02-28 19:25           ` Robin Bowes
2005-03-02  2:48 ` Robin Bowes
2005-03-02  2:59   ` Neil Brown
2005-03-02  3:50   ` Molle Bestefich
2005-03-02  3:52     ` Molle Bestefich
2005-03-02  5:52     ` Guy
2005-03-02 12:05       ` Molle Bestefich
2005-03-02 16:16         ` Guy
2005-03-03  9:37           ` Molle Bestefich
2005-03-02  4:57   ` Brad Campbell
2005-03-02  5:53     ` Guy
     [not found]       ` <eaa6dfe0503080915276466a1@mail.gmail.com>
2005-03-08 17:15         ` Derek Piper
     [not found]         ` <200503091704.j29H4l517152@www.watkins-home.com>
2005-03-10 19:24           ` Derek Piper
  -- strict thread matches above, loose matches on Subject: below --
2005-03-07 16:36 LinuxRaid
2005-03-07 17:09 ` Peter T. Breuer
2005-03-07 20:15 LinuxRaid
