RAID5 drive failure, please verify my commands

linux-raid.vger.kernel.org archive mirror

* RAID5 drive failure, please verify my commands
@ 2005-01-16 18:14 Gerd Knops
  2005-01-16 20:34 ` Robin Bowes
  2005-01-16 21:25 ` Mike Hardy
  0 siblings, 2 replies; 11+ messages in thread
From: Gerd Knops @ 2005-01-16 18:14 UTC (permalink / raw)
  To: linux-raid

Hello all,

One of the dreaded Maxtor SATA drives in my RAID5 failed, after just 3 
months of light use. Anyhow I neither have the disk capacity nor the 
money to buy it to make a backup. To make sure I do it correctly, could 
you folks please double-check my intended course of action? I would 
really appreciate that.

Current state:

xanadu:~# uname -a
Linux xanadu 2.6.8-1-386 #1 Mon Sep 13 23:29:55 EDT 2004 i686 GNU/Linux
xanadu:~# mdadm --detail /dev/md0
/dev/md0:
         Version : 00.90.01
   Creation Time : Fri Oct  8 13:01:49 2004
      Raid Level : raid5
      Array Size : 490223104 (467.51 GiB 501.99 GB)
     Device Size : 245111552 (233.76 GiB 250.99 GB)
    Raid Devices : 3
   Total Devices : 3
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Sun Jan 16 06:25:31 2005
           State : clean, degraded
  Active Devices : 2
Working Devices : 2
  Failed Devices : 1
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 256K

            UUID : 0c758197:f3af8f3b:d68050c0:04ea429e
          Events : 0.221727

     Number   Major   Minor   RaidDevice State
        0       0        0        -      removed
        1       8       17        1      active sync   /dev/sdb1
        2       8       33        2      active sync   /dev/sdc1

        3       8        1        -      faulty   /dev/sda1
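
For scripting, the state line of the output above can be pulled out with sed. This is only a sketch: the helper name md_state is made up for illustration, and it assumes the "State :" line keeps the layout shown above.

```sh
#!/bin/sh
# Hypothetical helper: reads `mdadm --detail` output on stdin and prints
# whatever follows "State :" (for example "clean, degraded").
md_state() {
    sed -n 's/^[[:space:]]*State : *//p'
}

# Typical use (needs root and a real array):
#   mdadm --detail /dev/md0 | md_state
```

Fed the output above, this would print "clean, degraded", which is the condition to watch for before and after the replacement.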

Here is what I think I should be doing:

- Remove failed disk from array:

	mdadm /dev/md0 --remove /dev/sda1

- Physically remove disk from system
- Add new disk to system, partition
- Add to array:

	mdadm /dev/md0 --add /dev/sda1

Anything else to trigger rebuilding of the disk?

That should be it, correct? Also since I lost all confidence in the 
Maxtor drives (had a long history of problems with that brand, I don't 
think any Maxtor drive I ever owned made it to retirement) I probably 
will buy a new drive immediately and replace the broken one. Then when 
the broken one is repaired/exchanged, I would like to add it as spare. 
To do so, would the sequence be

	mdadm /dev/md0 --add /dev/sdd1

Is that it?

Also one last question: Foolishly I allocated all available space in 
the Maxtors for the RAID. Now, should the replacement drive have a 
slightly smaller capacity, is there some way to deal with that? I think 
I can use resize2fs to reduce the size of the filesystem (does this
work with ext3 file systems?). Assuming that works, is there some way
to convince the RAID to accept a smaller partition and adjust its size
accordingly?

Thanks all

Gerd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RAID5 drive failure, please verify my commands
  2005-01-16 18:14 RAID5 drive failure, please verify my commands Gerd Knops
@ 2005-01-16 20:34 ` Robin Bowes
  2005-01-16 22:08   ` Gerd Knops
  2005-01-16 21:25 ` Mike Hardy
  1 sibling, 1 reply; 11+ messages in thread
From: Robin Bowes @ 2005-01-16 20:34 UTC (permalink / raw)
  To: linux-raid

Gerd Knops wrote:
> Hello all,
> 
> One of the dreaded Maxtor SATA drives in my RAID5 failed, after just 3 
> months of light use. Anyhow I neither have the disk capacity nor the 
> money to buy it to make a backup. To make sure I do it correctly, could 
> you folks please double-check my intended course of action? I would 
> really appreciate that.

Gerd,

If you've got a credit card you can get Maxtor to send out a replacement
without having to pull the failed drive.

I've done this several times :) (b****y Maxtor drives!)

R.
-- 
http://robinbowes.com



* Re: RAID5 drive failure, please verify my commands
  2005-01-16 18:14 RAID5 drive failure, please verify my commands Gerd Knops
  2005-01-16 20:34 ` Robin Bowes
@ 2005-01-16 21:25 ` Mike Hardy
  2005-01-16 22:14   ` Gerd Knops
  2005-01-17 23:53   ` Robin Bowes
  1 sibling, 2 replies; 11+ messages in thread
From: Mike Hardy @ 2005-01-16 21:25 UTC (permalink / raw)
  To: linux-raid

Gerd Knops wrote:
> Hello all,
> 
> One of the dreaded Maxtor SATA drives in my RAID5 failed, after just 3 
> months of light use. Anyhow I neither have the disk capacity nor the 
> money to buy it to make a backup. To make sure I do it correctly, could 
> you folks please double-check my intended course of action? I would 
> really appreciate that.

Failed how? I have tons and tons of Maxtor drives in service, and only 
one actually had a complete failure (verified by their utility, which is 
present on some bootable CD I got called "UltimateBootDisk").

Most of the time it's just a bad sector causing a single unreadable
sector error, which causes the RAID code to kick the drive out. You can 
see what the problem is by using the SMART utilities 
(http://smartmontools.sf.net) - run a long self test 'smartctl -t long 
/dev/sda' to verify things.
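
A minimal sketch of that check, assuming smartmontools is installed and the drive answers SMART commands at all (see the SATA caveat below). The function name is made up, and it only prints the commands so they can be reviewed before being run as root.

```sh
#!/bin/sh
# Hypothetical helper: prints (does not run) the smartctl calls for a
# long self-test on one drive. The device defaults to /dev/sda.
smart_long_test_cmds() {
    dev="${1:-/dev/sda}"
    echo "smartctl -t long $dev"       # start the extended self-test on the drive
    echo "smartctl -l selftest $dev"   # later: read the self-test log
    echo "smartctl -H $dev"            # overall health verdict
}

smart_long_test_cmds /dev/sda
```

The self-test runs inside the drive, so the log has to be read again once the test has had time to finish.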

I get a bad sector around once a week with maybe 40 250GB drives in 
service, 15 of which are medium to heavy use, all of which get nightly 
short tests and weekly long tests (they usually show up then, hardly 
ever in actual request processing). Usually never the same drive either, 
it's pretty random.

One of the problems with Linux + SATA at the moment is that SMART 
doesn't work out of the box, but there are patches available that let it 
work if I recall correctly.

I believe those would be well worth applying if you haven't done so yet, 
as I can't imagine managing a bunch of disks without smartd and the bad 
block howto (google://BadBlockHowto) to fix sectors when they pop up. It 
happens on all disks, it's not brand-specific.

Ok, sorry if I'm preaching to a convert, but it's one of the few things
that makes me feel like I'm managing the disks, instead of the opposite. 
Other than that...

> Here is what I think I should be doing:
> 
> - Remove failed disk from array:
> 
>     mdadm /dev/md0 --remove /dev/sda1

Looks relatively correct, although mdadm --manage --remove /dev/md0
/dev/sda1 would be the way I would say it. I do think they're identical
though - I'm not nitpicking.

Someone else mentioned that you can RMA the drive - I'd definitely get 
my money from them if it really was a drive failure. Grab the 
UltimateBootDisk (or make a bootable CD with the Maxtor utility on it) 
and verify the drive with their utility so you can get the magic code 
their website demands before it spits out an RMA.

> - Physically remove disk from system
> - Add new disk to system, partition
> - Add to array:
> 
>     mdadm /dev/md0 --add /dev/sda1

Again, I'd mdadm --manage --add /dev/md0 /dev/sda1, but I'm not sure 
they're any different at all

> Anything else to trigger rebuilding of the disk?

It'll rebuild automagically after the add - make sure the other drives 
don't have bad sectors first though or you'll have a nasty surprise.
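
Put together, the two mdadm steps could be wrapped like this. It is a sketch under assumptions: the function names and the DRY_RUN switch are made up, and by default the commands are only printed, not executed.

```sh
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1, before pulling the disk: drop the faulty member.
remove_failed() {        # usage: remove_failed /dev/md0 /dev/sda1
    run mdadm "$1" --remove "$2"
}

# Step 2, after the new disk is installed and partitioned:
# adding the partition is what starts the rebuild.
add_replacement() {      # usage: add_replacement /dev/md0 /dev/sda1
    run mdadm "$1" --add "$2"
    run cat /proc/mdstat   # shows the recovery progress
}
```

The two functions are kept separate on purpose, because the physical disk swap and the partitioning happen between them.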

I just posted a script yesterday that makes a bunch of disk files,
binds them to loop devices and creates a raid set out of them. You could 
use that to practice if you want (though you'd want to change the target 
array name from /dev/md0 to /dev/md1). Practice is always good if you're 
not confident :-). The archives should have it.
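
For readers without the archive at hand, a rough sketch of the same idea follows. It is not the script referred to above: the file names, sizes and loop device numbers are illustrative assumptions, and the function only prints the commands so they can be checked before being run by hand as root.

```sh
#!/bin/sh
# Prints the commands for a throwaway 3-member RAID5 on loop devices.
# Nothing is executed. File names, sizes and loop numbers are
# illustrative assumptions, not taken from the thread.
practice_array_cmds() {
    md="${1:-/dev/md1}"
    members=""
    for i in 0 1 2; do
        echo "dd if=/dev/zero of=/tmp/raidtest$i.img bs=1M count=64"
        echo "losetup /dev/loop$i /tmp/raidtest$i.img"
        members="$members /dev/loop$i"
    done
    echo "mdadm --create $md --level=5 --raid-devices=3$members"
    # Rehearse the failure and replacement discussed in this thread.
    echo "mdadm $md --fail /dev/loop0"
    echo "mdadm $md --remove /dev/loop0"
    echo "mdadm $md --add /dev/loop0"
}

practice_array_cmds /dev/md1
```

Using /dev/md1 keeps the rehearsal away from the real /dev/md0.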

> That should be it, correct? Also since I lost all confidence in the 
> Maxtor drives (had a long history of problems with that brand, I don't 
> think any Maxtor drive I ever owned made it to retirement) I probably 

I've only had one Maxtor drive that didn't make it actually, and I don't 
even have good temperature control for a couple of my arrays (40C+
temps). Which is just to say that anecdotal evidence isn't worth much. 
Check power, check cooling, and if those are all good, switch brands by 
all means but be ready for more of the same, most likely ;-)

> Also one last question: Foolishly I allocated all available space in the 
> Maxtors for the RAID. Now, should the replacement drive have a slightly 
> smaller capacity, is there some way to deal with that? I think I can use
> resize2fs to reduce the size of the filesystem (does this work with ext3
> file systems?). Assuming that works, is there some way to convince the
> RAID to accept a smaller partition and adjust its size accordingly?

I'm batting .333 on raidreconf. I'd make sure you get a replacement 
drive of the same size if you can. If you don't, I'd run ext3resize 
*first*, so you shrink the filesystem *before* you shrink the array. 
Then you could try shrinking the array.
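
The order matters because the filesystem has to fit inside whatever the array ends up being. As a rough sketch of the arithmetic: a RAID5 array offers one member less than its member count in usable space. The function name and the smaller member size below are made up for illustration; only the first figure comes from the mdadm --detail output in the original post.

```sh
#!/bin/sh
# Usable RAID5 capacity is (members - 1) * per-member size.
# Sizes are in KiB, as in the "Array Size" and "Device Size" lines
# that mdadm --detail prints.
raid5_array_kib() {
    members="$1"
    member_kib="$2"
    echo $(( (members - 1) * member_kib ))
}

raid5_array_kib 3 245111552   # member size from the original post
raid5_array_kib 3 245000000   # hypothetical, slightly smaller replacement
```

The first call reproduces the 490223104 KiB array size reported in the original post. The second shows the ceiling the filesystem would have to be shrunk below before the array itself is touched.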

What I would really do though (given my recent history with raidreconf), 
assuming you've followed the rule of thumb to never have so much space 
you can't back it up, is to do a full backup, verify the backup, verify 
all the drives (with a smartctl -t long test, or full dd test), then 
attempt the resize, with an eye towards punting and just rebuilding it 
and restoring it if things don't work right.

Hopefully some of this was helpful, good luck resurrecting the array!

-Mike


* Re: RAID5 drive failure, please verify my commands
  2005-01-16 20:34 ` Robin Bowes
@ 2005-01-16 22:08   ` Gerd Knops
  0 siblings, 0 replies; 11+ messages in thread
From: Gerd Knops @ 2005-01-16 22:08 UTC (permalink / raw)
  To: Robin Bowes; +Cc: linux-raid


On Jan 16, 2005, at 2:34 PM, Robin Bowes wrote:

> Gerd Knops wrote:
>> Hello all,
>> One of the dreaded Maxtor SATA drives in my RAID5 failed, after just 
>> 3 months of light use. Anyhow I neither have the disk capacity nor 
>> the money to buy it to make a backup. To make sure I do it correctly, 
>> could you folks please double-check my intended course of action? I 
>> would really appreciate that.
>
> Gerd,
>
> If you've got a credit card you can get Maxtor to send out a 
> replacement  without having to pull the failed drive.
>
> I've done this several times :) (b****y Maxtor drives!)
>
Thanks for the tip, I'll give that a try. Their website's RMA stuff 
doesn't seem to work (application unavailable bla bla), so I guess I'll 
call them Monday... Probably be on hold for half a day :-(

Gerd


> R.
> -- 
> http://robinbowes.com
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: RAID5 drive failure, please verify my commands
  2005-01-16 21:25 ` Mike Hardy
@ 2005-01-16 22:14   ` Gerd Knops
  2005-01-16 23:13     ` Mike Hardy
  2005-01-17 23:53   ` Robin Bowes
  1 sibling, 1 reply; 11+ messages in thread
From: Gerd Knops @ 2005-01-16 22:14 UTC (permalink / raw)
  To: Mike Hardy; +Cc: linux-raid


On Jan 16, 2005, at 3:25 PM, Mike Hardy wrote:

>
>
> Gerd Knops wrote:
>> Hello all,
>> One of the dreaded Maxtor SATA drives in my RAID5 failed, after just 
>> 3 months of light use. Anyhow I neither have the disk capacity nor 
>> the money to buy it to make a backup. To make sure I do it correctly, 
>> could you folks please double-check my intended course of action? I 
>> would really appreciate that.
>
> Failed how? I have tons and tons of Maxtor drives in service, and only 
> one actually had a complete failure (verified by their utility, which 
> is present on some bootable CD I got called "UltimateBootDisk").
>
> Most of the time it's just a bad sector causing a single unreadable
> sector error, which causes the RAID code to kick the drive out. You 
> can see what the problem is by using the SMART utilities 
> (http://smartmontools.sf.net) - run a long self test 'smartctl -t long 
> /dev/sda' to verify things.
>
Seems to be a whole slew of bad sectors, here is what the log says:

scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 3f 
00 00 08 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957119
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 40 
00 00 07 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957120
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 41 
00 00 06 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957121
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 42 
00 00 05 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957122
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 43 
00 00 04 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957123
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 44 
00 00 03 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957124
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 45 
00 00 02 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957125
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 15 65 40 46 
00 00 01 00
Current sda: sense key Medium Error
Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 358957126
raid5: Disk failure on sda1, disabling device. Operation continuing on 
2 devices

That looks to me to be more severe than a single bad sector.
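
To see the extent at a glance, the failing sector numbers can be pulled out of the log. A minimal sketch, assuming the "end_request" lines keep the format shown above; the function name is made up.

```sh
#!/bin/sh
# Reads kernel log text on stdin and prints the sector number from each
# "end_request: I/O error, dev <dev>, sector N" line for the given device.
failed_sectors() {
    dev="${1:-sda}"
    sed -n "s|.*end_request: I/O error, dev $dev, sector \([0-9][0-9]*\).*|\1|p"
}

# Typical use:
#   dmesg | failed_sectors sda | sort -n | uniq
```

For the log above this lists eight consecutive sectors, 358957119 through 358957126.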

Thanks much for all your other remarks, cut to keep this post 
reasonably short. Much appreciated!

Gerd



* Re: RAID5 drive failure, please verify my commands
  2005-01-16 22:14   ` Gerd Knops
@ 2005-01-16 23:13     ` Mike Hardy
  2005-01-17  0:39       ` Mike Hardy
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Hardy @ 2005-01-16 23:13 UTC (permalink / raw)
  To: linux-raid

Gerd Knops wrote:

> Seems to be a whole slew of bad sectors, here is what the log says:

I typically get bad sectors in little mini-batches. Under a hundred - 
typically 40 or so in one chunk. There are 8 sectors per block, and I 
guess when a block goes, whatever took it out (some dust or something? a 
bad alignment of metal in the media?) gets the blocks near it too.

Anyway, it still looks to me like it's semi-normal. Definitely check out
the bad block howto and (after removing the drive from the array) try 
doing a dd if=/dev/sda of=/dev/null bs=4096 skip=(whatever the sector is 
divided by 8) count=1

You should see the read errors in the log then, that verifies that 
you've got the right offset in the disk.

At that point, do a dd if=/dev/zero of=/dev/sda bs=4096 skip=(sector 
address divided by 8) count=1

If you repeat the read after that block write, the read should succeed - 
you've told the drive you don't need that data any more (you wrote into 
the block) so it should be able to reallocate

If you take that single-block idea, and expand it to the first block 
address and a count= number that includes all the bad blocks, you should 
clear them all

??

Or the drive is shot :-). Without a SMART self-test and a check of the 
results where you can really interrogate the drive, it's hard to say
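
Expanding the single-block idea to the whole run of bad sectors means turning the first and last failing sector into a block offset and a count. A sketch of that arithmetic, assuming 512-byte sectors (so 8 per 4096-byte block, as stated above) and using the sector numbers from the posted log; the function name is made up.

```sh
#!/bin/sh
# Convert a range of 512-byte sector numbers into a 4096-byte block
# offset and a block count for dd (8 sectors per block).
sector_range_to_blocks() {
    first="$1"
    last="$2"
    first_block=$(( first / 8 ))
    last_block=$(( last / 8 ))
    echo "$first_block $(( last_block - first_block + 1 ))"
}

sector_range_to_blocks 358957119 358957126   # prints "44869639 2"
```

The eight logged sectors straddle a block boundary, which is why the count comes out as 2 rather than 1.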

Glad this helps though

-Mike


* Re: RAID5 drive failure, please verify my commands
  2005-01-16 23:13     ` Mike Hardy
@ 2005-01-17  0:39       ` Mike Hardy
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Hardy @ 2005-01-17  0:39 UTC (permalink / raw)
  To: linux-raid

Mike Hardy wrote:

> At that point, do a dd if=/dev/zero of=/dev/sda bs=4096 skip=(sector 
> address divided by 8) count=1

I messed this command up, if you didn't catch it. You use skip to skip 
on the input, you use that for reading in. When you're writing the zeros 
out, you use "seek=..." to seek into the output to the right spot.
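
Side by side, the read check and the corrected write look like this. It is a sketch that only prints the commands: the function names are made up, and the block number is the first failing sector from the posted log divided by 8. The write destroys whatever is in that block, so it is only for a drive that is already out of the array.

```sh
#!/bin/sh
# Print the dd commands for one 4096-byte block; nothing is executed.
# skip= positions the INPUT (reading from the disk),
# seek= positions the OUTPUT (writing to the disk).
dd_read_cmd() { echo "dd if=$1 of=/dev/null bs=4096 skip=$2 count=1"; }
dd_zero_cmd() { echo "dd if=/dev/zero of=$1 bs=4096 seek=$2 count=1"; }

block=$(( 358957119 / 8 ))     # first failing sector from the log
dd_read_cmd /dev/sda "$block"
dd_zero_cmd /dev/sda "$block"
```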

It's all in the bad block howto though - that's the definitive guide

-Mike


* Re: RAID5 drive failure, please verify my commands
  2005-01-16 21:25 ` Mike Hardy
  2005-01-16 22:14   ` Gerd Knops
@ 2005-01-17 23:53   ` Robin Bowes
  2005-01-18 15:46     ` Derek Piper
  1 sibling, 1 reply; 11+ messages in thread
From: Robin Bowes @ 2005-01-17 23:53 UTC (permalink / raw)
  To: linux-raid

Mike Hardy wrote:
> 
> 
> Gerd Knops wrote:
> 
>> Hello all,
>>
>> One of the dreaded Maxtor SATA drives in my RAID5 failed, after just 3 
>> months of light use. Anyhow I neither have the disk capacity nor the 
>> money to buy it to make a backup. To make sure I do it correctly, 
>> could you folks please double-check my intended course of action? I 
>> would really appreciate that.
> 
> 
> Failed how? I have tons and tons of Maxtor drives in service, and only 
> one actually had a complete failure (verified by their utility, which is 
> present on some bootable CD I got called "UltimateBootDisk").

Mike,

When one of my drives fails I test it with the Maxtor Powermax tool - 
it's this tool that is confirming that the drive(s) is(are) dying.

The only saving grace is the three-year warranty.

R.
-- 
http://robinbowes.com



* Re: RAID5 drive failure, please verify my commands
  2005-01-17 23:53   ` Robin Bowes
@ 2005-01-18 15:46     ` Derek Piper
  2005-01-18 17:10       ` Guy
  0 siblings, 1 reply; 11+ messages in thread
From: Derek Piper @ 2005-01-18 15:46 UTC (permalink / raw)
  To: linux-raid

Since we're on the subject of RMA'ing hard drives, I have this
question: I have a Seagate drive that during the course of running dd
if=/dev/zero (a simple write-test I've done on all the 4 drives I am
planning on using in my RAID1 set-up) on it has been shown to have
errors (Drive Ready, Seek Complete .. blah blah). I used smartctl to
have a look at the information the drive is giving and to get it to
run a self-test. It apparently has reallocated 136 sectors, has 3
uncorrectable sector errors and failed the read part of the short
self-test with 90% to go. It says its health status is 'PASSED'
though. 2 of the other drives have just a couple of errors and
reallocated sectors, and one has never had an error.
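
Counters like these can be tracked over time by reading the raw value from the attribute table. A minimal sketch, assuming the usual ten-column layout of `smartctl -A` output with the raw value in the last column; the function name is made up.

```sh
#!/bin/sh
# Reads `smartctl -A` output on stdin and prints the raw value
# (10th column) of the named attribute.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Typical use (device name is a placeholder):
#   smartctl -A /dev/hda | smart_raw Reallocated_Sector_Ct
```

A value that keeps climbing between runs says more than the overall 'PASSED' verdict does.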

My question is this, to you knowledgeable people, should I RMA it or do
disk mfrs only replace drives if they are really DEAD? It's under
warranty until June. Since I've never done it before I thought I'd
ask, I'm thinking yes. The chick at Seagate kept telling me how to do
the RMA process  when I was asking if a few unrecoverable sectors were
covered. Has anyone RMA'd a drive only to have it sent back with 'nah,
it's working fine'?

Opinions are appreciated by anyone. Anyone that's dealt with Seagate,
all the better :)

Thanks,

Derek


-- 
Derek Piper - derek.piper@gmail.com
http://doofer.org/


* RE: RAID5 drive failure, please verify my commands
  2005-01-18 15:46     ` Derek Piper
@ 2005-01-18 17:10       ` Guy
       [not found]         ` <eaa6dfe050118094248eb03ad@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Guy @ 2005-01-18 17:10 UTC (permalink / raw)
  To: 'Derek Piper', linux-raid

You should download "SeaTools Enterprise".
If this tool fails the drive, I think it is safe to return.
The tool uses the sg devices.  I am not sure, but I think these are for SCSI
devices.

Guy


* Fwd: RAID5 drive failure, please verify my commands
       [not found]         ` <eaa6dfe050118094248eb03ad@mail.gmail.com>
@ 2005-01-18 17:42           ` Derek Piper
  0 siblings, 0 replies; 11+ messages in thread
From: Derek Piper @ 2005-01-18 17:42 UTC (permalink / raw)
  To: linux-raid

Yea, SeaTools Enterprise says it uses the sg devices and it's for SCSI
disks. It looks like it runs tests similar to those that are triggered
by the -t option in 'smartctl'.

Hmm..I tried running the extended test (using smartctl). It completed
that okay and now says it's reallocated 137 sectors and has 2
uncorrectable ones, and the short test completes okay this time. Darn
drive, are you dying on me or not?

I've emailed the Seagate tech support people and will see what they
think too. You can get a Seagate RMA just by entering the drive model and
serial # on the website. It doesn't say what constitutes a 'failure'
though, although I agree with you Mike, if it's got bad blocks then
it's 'bad' imho.

Thanks,

Derek


-- 
Derek Piper - derek.piper@gmail.com
http://doofer.org/
