linux-raid.vger.kernel.org archive mirror
* Good news / bad news - The joys of RAID
@ 2004-11-19 21:06 Robin Bowes
  2004-11-19 21:28 ` Guy
                   ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Robin Bowes @ 2004-11-19 21:06 UTC (permalink / raw)
  To: linux-raid

The bad news is I lost another disk tonight. Remind me *never* to buy 
Maxtor drives again.

The good news is that my RAID5 array was configured as 5 + 1 spare. I 
powered down the server, used the Maxtor PowerMax utility to identify 
the bad disk, pulled it out and re-booted. My array is currently re-syncing.

[root@dude root]# mdadm --detail /dev/md5
/dev/md5:
         Version : 00.90.01
   Creation Time : Thu Jul 29 21:41:38 2004
      Raid Level : raid5
      Array Size : 974566400 (929.42 GiB 997.96 GB)
     Device Size : 243641600 (232.35 GiB 249.49 GB)
    Raid Devices : 5
   Total Devices : 5
Preferred Minor : 5
     Persistence : Superblock is persistent

     Update Time : Fri Nov 19 20:52:58 2004
           State : dirty, resyncing
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 128K

  Rebuild Status : 0% complete

            UUID : a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
          Events : 0.1765551

     Number   Major   Minor   RaidDevice State
        0       8        2        0      active sync   /dev/sda2
        1       8       18        1      active sync   /dev/sdb2
        2       8       34        2      active sync   /dev/sdc2
        3       8       50        3      active sync   /dev/sdd2
        4       8       66        4      active sync   /dev/sde2


Thinking about what happened, I would have expected that the bad drive 
would just be removed from the array, the spare activated, and re-syncing 
started automatically.
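
For reference, the manual equivalent of what I expected md to do (mark the
partition faulty, drop it from the array, and let the spare rebuild) would
be roughly the following; I haven't run these exact commands, and the
device names are just the ones from my setup:

   mdadm /dev/md5 --fail /dev/sdf2
   mdadm /dev/md5 --remove /dev/sdf2
   # later, once a replacement disk is in place:
   mdadm /dev/md5 --add /dev/sdf2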

What actually happened was that I rebooted to activate a new kernel and 
the box didn't come back up. As the machine runs headless, I had to 
power it off and take it to a monitor/keyboard to check it. In the new 
location it came up fine so I shut it down again and put it back in my 
"server room" (read: cellar). I still couldn't see it from the network 
so I dragged an old 14" CRT out of the shed and connected it up. The 
login prompt was there but there was an "ata2 timeout" error message and 
the console was dead. I power-cycled to reboot and as it booted I saw a 
message something like "postponing resync of md0 as it uses the same 
device as md5, waiting for md5 to resync". I then got a further ata 
timeout error. I had to physically disconnect the bad drive and reboot 
in order to re-start the re-sync.

Further md information:

[root@dude log]# mdadm --detail --scan
ARRAY /dev/md2 level=raid1 num-devices=2 
UUID=11caa547:1ba8d185:1f1f771f:d66368c9
    devices=/dev/sdc1
ARRAY /dev/md1 level=raid1 num-devices=2 
UUID=be8ad31a:f13b6f4b:c39732fc:c84f32a8
    devices=/dev/sdb1,/dev/sde1
ARRAY /dev/md5 level=raid5 num-devices=5 
UUID=a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
    devices=/dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2,/dev/sde2
ARRAY /dev/md0 level=raid1 num-devices=2 
UUID=4b28338c:bf08d0bc:bb2899fc:e7f35eae
    devices=/dev/sda1,/dev/sdd1

It was /dev/sdf that failed; it contained two partitions, one of them 
part of md2 (now running un-mirrored but still showing two devices) and 
the other part of md5 (now re-syncing but only showing five devices).

Is this normal behaviour?

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-19 21:06 Good news / bad news - The joys of RAID Robin Bowes
@ 2004-11-19 21:28 ` Guy
  2004-11-20 18:42   ` Mark Hahn
  2004-11-19 21:42 ` Good news / bad news - The joys of RAID Guy
  2004-11-19 21:58 ` Gordon Henderson
  2 siblings, 1 reply; 34+ messages in thread
From: Guy @ 2004-11-19 21:28 UTC (permalink / raw)
  To: 'Robin Bowes', linux-raid

Reminder....
Never buy Maxtor drives again!

Guy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-19 21:06 Good news / bad news - The joys of RAID Robin Bowes
  2004-11-19 21:28 ` Guy
@ 2004-11-19 21:42 ` Guy
  2004-11-28 13:15   ` Robin Bowes
  2004-11-19 21:58 ` Gordon Henderson
  2 siblings, 1 reply; 34+ messages in thread
From: Guy @ 2004-11-19 21:42 UTC (permalink / raw)
  To: 'Robin Bowes', linux-raid

The re-sync to the spare should have been automatic, without a re-boot.
Your errors related to the ata timeout are not a Linux issue.  My guess is
the bios could see the drive, but the drive was not responding correctly.  I
think this is life with ata.  I have had similar problems with SCSI: one
drive failed in a way that caused problems with other drives on the same
SCSI bus.

It could be that your array was re-building, but did not finish.  In that
case it would start over from the beginning, which may make it look like it
did not attempt to re-build until the re-boot.  Did you check the status before you
shut it down?  I use mdadm's monitor mode to send me email when events
occur.  By the time I read my emails, a drive has failed and the re-sync to
the spare is done.  No need to check logs.
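
For anyone who wants to set that up, the invocation I would expect to work
is roughly this (the mail address and delay are just examples, adjust to
taste):

   mdadm --monitor --scan --daemonise --delay=300 --mail=root@localhost

With --scan it watches every array listed in /etc/mdadm.conf and sends mail
on events such as Fail, FailSpare and DegradedArray.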

Yes, it is normal that md will not re-sync 2 arrays that share a common
device.  One will be delayed until the other finishes.
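
You can watch that happen in /proc/mdstat; the postponed array is marked
DELAYED.  Illustrative output (block counts made up, not your exact arrays):

   md0 : active raid1 sdd1[1] sda1[0]
         104320 blocks [2/2] [UU]
           resync=DELAYED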

Second reminder....
Never buy Maxtor drives again!

This quote seems to fit real well!
"Sure you saved money, but at what cost?" - Guy Watkins

Guy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-19 21:06 Good news / bad news - The joys of RAID Robin Bowes
  2004-11-19 21:28 ` Guy
  2004-11-19 21:42 ` Good news / bad news - The joys of RAID Guy
@ 2004-11-19 21:58 ` Gordon Henderson
  2 siblings, 0 replies; 34+ messages in thread
From: Gordon Henderson @ 2004-11-19 21:58 UTC (permalink / raw)
  To: Robin Bowes; +Cc: linux-raid

On Fri, 19 Nov 2004, Robin Bowes wrote:

> What actually happened was that I rebooted to activate a new kernel and
> the box didn't come back up. As the machine runs headless, I had to
> power it off and take it to a monitor/keyboard to check it.

Not directly related to your RAID issue, but I've been running headless
servers with console on serial ports of late. LILO has an option to put
output on a serial line, and there's a kernel compile flag and an append
instruction to make it all work.
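
For anyone wanting to do the same, the pieces are roughly these (port and
speed are whatever suits your hardware, and I'm quoting from memory): in
/etc/lilo.conf something like

   serial=0,9600n8
   append="console=ttyS0,9600 console=tty0"

plus serial console support compiled into the kernel (CONFIG_SERIAL_CONSOLE
on 2.4), and a getty on ttyS0 if you want a login prompt as well as the
boot messages.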

That combined with a power cycler makes me feel more at ease about the
remote servers I run.

Just don't connect 2 PCs back to back and run a getty on each serial
line...

Gordon

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-19 21:28 ` Guy
@ 2004-11-20 18:42   ` Mark Hahn
  2004-11-20 19:37     ` Guy
                       ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Mark Hahn @ 2004-11-20 18:42 UTC (permalink / raw)
  To: linux-raid

> Never buy Maxtor drives again!

you imply that Maxtor drives are somehow inherently flawed.
can you explain why you think millions of people/companies
are naive idiots for continuing to buy Maxtor disks?

this sort of thing is just not plausible: Maxtor competes 
with the other top-tier disk vendors with similar products 
and prices and reliability.  yes, if you buy a 1-year disk,
you can expect it to have been less carefully tested, possibly
be of lower-end design and reliability, and to have been handled
more poorly by the supply chain.  thankfully, you don't have 
to buy 1-year disks any more.

read the specs.  make sure your supply chain knows how to 
handle disks.  make sure your disks are mounted correctly,
both mechanically and with enough airflow.  use raid and 
some form of archiving/backups.  don't get hung up on which 
of the 4-5 top-tier vendors makes your disk.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-20 18:42   ` Mark Hahn
@ 2004-11-20 19:37     ` Guy
  2004-11-20 20:03       ` Mark Klarzynski
  2004-11-20 23:30       ` Mark Hahn
  2004-11-20 19:40     ` David Greaves
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 34+ messages in thread
From: Guy @ 2004-11-20 19:37 UTC (permalink / raw)
  To: 'Mark Hahn', linux-raid

I have had far more failures of Maxtor drives than any other.  I have also
had problems with WD drives.  I know someone that had 4-6 IBM disks, most of
which have failed.


I am talking about disks with 3 year warranties, based on the spec!  But
OEM disks have none; you must return them to the PC manufacturer.
Most of my failures were within 3 years, but beyond the warranty period of
the system.  So the OEM issue has occurred too often.

I have had good luck with Seagate.

I use RAID; it is a must with the failure rate!
I do backup also, but RAID tends to save me.

Most people have a PC with 1 disk.  They don't understand RAID, and they
don't understand that everything will be lost if the disk breaks!  They
think "Dell will just fix it".  But they're wrong: Dell will just replace
it!  Big difference.

Today's disks claim a MTBF of about 1,000,000 hours!  That's about 114
years.  So, if I had 10 disks I should expect 1 failure every 11.4 years.
That would be so cool!  But not in the real world.
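(That is just 1,000,000 hours / 10 disks = 100,000 hours, and 100,000 hours
/ 8,766 hours per year is about 11.4 years.)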

Can you explain how the disks have a MTBF of 1,000,000 hours?  But fail more
often than that?  Maybe I just don't understand some aspect of MTBF.

Guy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-20 18:42   ` Mark Hahn
  2004-11-20 19:37     ` Guy
@ 2004-11-20 19:40     ` David Greaves
  2004-11-21  4:33       ` Guy
  2004-11-21  1:01     ` berk walker
  2004-11-23 19:10     ` H. Peter Anvin
  3 siblings, 1 reply; 34+ messages in thread
From: David Greaves @ 2004-11-20 19:40 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-raid

Mark Hahn wrote:

>>Never buy Maxtor drives again!
>>    
>>
>
>you imply that Maxtor drives are somehow inherently flawed.
>can you explain why you think millions of people/companies
>are naive idiots for continuing to buy Maxtor disks?
>
>this sort of thing is just not plausible: Maxtor competes 
>with the other top-tier disk vendors with similar products 
>and prices and reliability.  yes, if you buy a 1-year disk,
>you can expect it to have been less carefully tested, possibly
>be of lower-end design and reliability, and to have been handle
>more poorly by the supply chain.  thankfully, you don't have 
>to buy 1-year disks any more.
>
>read the specs.  make sure your supply chain knows how to 
>handle disks.  make sure your disks are mounted correctly,
>both mechanically and with enough airflow.  use raid and 
>some form of archiving/backups.  don't get hung up on which 
>of the 4-5 top-tier vendors makes your disk.
>
>  
>

Yeah, you're right.
Of course - the fact that 2 of *my* 6 Maxtor 250Gb SATA drives (3 year 
warranty), date stamped at various times in 2004, have failed is a 
coincidence and should, of course, be expected with an MTBF of millions 
of hours.

Oh, please note I'm not Robin - that must be a coincidence too :)

Personally I'm waiting for the revelation that they are recycled IBM 
Deskstar 70's ;)

I take your point about supply chain though - anything that's shipped by 
courier is suspect.

David



^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-20 19:37     ` Guy
@ 2004-11-20 20:03       ` Mark Klarzynski
  2004-11-20 22:17         ` Mark Hahn
  2004-11-20 23:30       ` Mark Hahn
  1 sibling, 1 reply; 34+ messages in thread
From: Mark Klarzynski @ 2004-11-20 20:03 UTC (permalink / raw)
  To: linux-raid

MTBF is a statistic based upon the expected 'use' of the drive and the
replacement of the drive after its end of life (3-5 years)...

It's extremely complex and boring, but the figure is only meaningful if the
drive is being used within an environment that matches the assumptions
behind the calculations.

SATA / IDE drives have an MTBF similar to that of SCSI / Fibre. But this
is based upon their expected use... i.e. SCSI used to be [power on hours
= 24hr] [use = 8 hours].. whilst SATA used to be [power on = 8 hours]
and [use = 20 mins].

Regardless of what some people claim (usually those that only sell sata
based raids), the drives are not constructed the same in any way.

SATAs fail more within a raid environment (probably around 10:1)
because of the heavy use and also because they are not as intelligent...
therefore when they do not respond we have no way of interrogating them
or resetting them, whilst with scsi we can do both.  This means that a raid
controller / driver has no option but to simply fail the drive.

Maxtor lead the way in capacity and also reliability... I personally had
to recall countless earlier IBMs and replace them with Maxtor drives.  But
the new generation of IBMs (Hitachi) have got it together.

So - I guess you are all right :) 




^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-20 20:03       ` Mark Klarzynski
@ 2004-11-20 22:17         ` Mark Hahn
  2004-11-20 23:09           ` Guy
  2004-12-02 16:47           ` TJ
  0 siblings, 2 replies; 34+ messages in thread
From: Mark Hahn @ 2004-11-20 22:17 UTC (permalink / raw)
  To: linux-raid

> SATA / IDE drives have an MTBF similar to that of SCSI / Fibre. But this
> is based upon their expected use... i.e. SCSI used to be [power on hours
> = 24hr] [use = 8 hours].. whilst SATA used to be [power on = 8 hours]
> and [use = 20 mins].

the vendors I talk to always quote SCSI/FC at 100% power 100% duty,
and PATA/SATA at 100% power 20% duty.

> Regardless of what some people clam (usually those that only sell sata
> based raids), the drives are not constructed the same in any way.

obviously, there *have* been pairs of SCSI/ATA disks which had 
identical mech/analog sections.  but the mech/analog fall into 
just two kinds:

	- optimized for IOPS: 10-15K rpm for minimal rotational 
	latency, narrow recording area for low seek distance,
	quite low bit and track density to avoid long waits for 
	the head to stabilize after a seek.

	- optimized for density/bandwidth: high bit/track density,
	wide recording area, modest seeks/rotation speed.

the first is SCSI/FC and the second ATA, mainly for historic reasons.

> SATA's fail more within a raid environment (probably around 10:1)
> because of the heavy use and also because they are not as intelligent...

what connection are you drawing between raid and "heavy use"?
how does being in a raid increase the IO load per disk?

> therefore when they do not respond we have no way of interrogating them
> or resetting them, whilst with scsi we do both. 

you've never seen a SCSI reset that looks just like an ATA reset?
sorry, but SCSI has no magic.

> This means that a raid
> controller / driver has no option to but simply fail the drive.

no.

> Maxtor lead the way in capacity and also reliability... I personal had
> to recall countless earlier IBMs and replace them with maxtor.  But the

afaikt, the deathstar incident was actually bad firmware 
(didn't correctly flush data when hard powered off, resulting in 
blocks on disk with bogus ECC, which had to be considered bad from
then on, even if the media was perfect.)


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-20 22:17         ` Mark Hahn
@ 2004-11-20 23:09           ` Guy
  2004-12-02 16:47           ` TJ
  1 sibling, 0 replies; 34+ messages in thread
From: Guy @ 2004-11-20 23:09 UTC (permalink / raw)
  To: 'Mark Hahn', linux-raid

You got any links related to this?
"the deathstar incident was actually bad firmware"

Can a user download and update the firmware?

If so, I know someone that may have some bad disks that are not so bad.

If he can repair his disks, I will report the status back on this list.

Previously I thought IBM made very good disks, until my friend had more than
a 75% failure rate.  And within the warranty period.

I personally have an IBM SCSI disk that is running 100% of the time, and the
cooling is real bad.  The drive is much too hot to touch.  Been like that
for 5+ years.  Never had any issues.  The system also has a Seagate that is
too hot to touch, but it has only been running 3+ years.  Both are 18 Gig.
The disks are in a system my wife uses!  Don't tell her. :)  I've got to fix
that someday.

Guy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-20 19:37     ` Guy
  2004-11-20 20:03       ` Mark Klarzynski
@ 2004-11-20 23:30       ` Mark Hahn
  1 sibling, 0 replies; 34+ messages in thread
From: Mark Hahn @ 2004-11-20 23:30 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid

> Can you explain how the disks have a MTBF of 1,000,000 hours?  But fail more
> often than that?  Maybe I just don't understand some aspect of MTBF.

simple: the MTBF applies to very large sets of disks.  if you had 
millions of disks, you'd expect to average mtbf/ndisks between failures.
with statistically trivial sample sizes (10 disks), you can't 
really say much.  of course, a proper model of the failure rate 
would have a lot more than 1 parameter...

for instance, my organization will be buying about .5 PB
of storage soon.  here are some options:

disk		n	mtbf	hours	$/disk	$K total

250GB SATA	1920	1e6	500	399	766
600GB SATA	800	1e6	1250	600?	480

73GB SCSI/FC	6575	1.3e6	198	389	2558
146GB SCSI/FC	3288	1.3e6	395	600	1973
300GB SCSI/FC	1600	1.3e6	813	1200	1920
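
(with these numbers the 250GB SATA option works out to 1e6 / 1920 ~= 520
hours between failures somewhere in the pool - roughly one dead disk every
three weeks - even though each individual disk is "rated" for over 100
years.)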

these mtbf's are basically made up, since disk vendors aren't really
very helpful in publishing their true reliability distributions.
these disk counts are starting to be big enough to give some meaning
to the hours=mtbf/n calculation - I'd WAG that "hours" is within
a factor of two.  (I looked at only three lines of SCSI disks to 
get 1.3e6 - two quoted 1.2 and the newer was 1.4.)  vendors seem 
to be switching to quoting "annualized failure rates", which are 
probably easier to understand - 1.2e6 MTBF or 0.73% AFR, for instance.
the latter makes it more clear that we're talking about gambling ;)
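(the two are quoting the same thing: AFR ~= hours-per-year / MTBF, so 
8760 / 1.2e6 ~= 0.73% of your drives failing per year.)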

but the message is clear: for a fixed, large capacity, your main 
concern should be bigger disks.  since our money is also fixed,
you can see that SCSI/FC prices are a big problem (these are 
real list prices from a tier-1 vendor who marks up their SATA
by an embarrassing amount...)  further, there's absolutely no chance
we could ever keep .5 PB of disks busy at 100% duty cycle, so that's
not a reason to buy SCSI/FC either...

regards, mark hahn.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-20 18:42   ` Mark Hahn
  2004-11-20 19:37     ` Guy
  2004-11-20 19:40     ` David Greaves
@ 2004-11-21  1:01     ` berk walker
  2004-11-23 19:10     ` H. Peter Anvin
  3 siblings, 0 replies; 34+ messages in thread
From: berk walker @ 2004-11-21  1:01 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-raid

ALL of the Maxtor junk that I have sitting next to me was in factory 
packaging, and not likely to have been affected by either physical or 
electrical shock.

HE might have implied it; I am saying it!  Why ask someone, as you did in 
sentence #2?  Ask them - or yourself.

Of course, he probably missed the warranty statement to not run Linux.

Mark Hahn wrote:

>>Never buy Maxtor drives again!
>>    
>>
>
>you imply that Maxtor drives are somehow inherently flawed.
>can you explain why you think millions of people/companies
>are naive idiots for continuing to buy Maxtor disks?
>
>this sort of thing is just not plausible: Maxtor competes 
>with the other top-tier disk vendors with similar products 
>and prices and reliability.  yes, if you buy a 1-year disk,
>you can expect it to have been less carefully tested, possibly
>be of lower-end design and reliability, and to have been handle
>more poorly by the supply chain.  thankfully, you don't have 
>to buy 1-year disks any more.
>
>read the specs.  make sure your supply chain knows how to 
>handle disks.  make sure your disks are mounted correctly,
>both mechanically and with enough airflow.  use raid and 
>some form of archiving/backups.  don't get hung up on which 
>of the 4-5 top-tier vendors makes your disk.
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>  
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-20 19:40     ` David Greaves
@ 2004-11-21  4:33       ` Guy
  0 siblings, 0 replies; 34+ messages in thread
From: Guy @ 2004-11-21  4:33 UTC (permalink / raw)
  To: 'David Greaves', 'Mark Hahn'; +Cc: linux-raid

You said:
"anything that's shipped by courier is suspect."

Humm, the way the drives are packed you would have a hard time exceeding
300Gs.  Even UPS can't do that, I bet.  But I must admit, I have no idea what
force a drive would "feel" in a 4 foot drop.  Remember, the drive is packed
very well!  Also, they only refer to 2 ms, so I have no idea if that is
equal to 150 Gs for 4 ms, or 75 Gs for 8 ms.

From a 300G Maxtor drive.

Reliability 
- Shock Tolerance: 60Gs @ 2 ms half-sine pulse (Operating), 300Gs @ 2 ms
half-sine pulse (Non-operating) 
- Data Error Rate: < 1 /10E15 bits read (Non-recoverable) 
- MTBF: 1000000 Hours

Guy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-20 18:42   ` Mark Hahn
                       ` (2 preceding siblings ...)
  2004-11-21  1:01     ` berk walker
@ 2004-11-23 19:10     ` H. Peter Anvin
  2004-11-23 20:03       ` Guy
  3 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2004-11-23 19:10 UTC (permalink / raw)
  To: linux-raid

Followup to:  <Pine.LNX.4.44.0411201238320.19120-100000@coffee.psychology.mcmaster.ca>
By author:    Mark Hahn <hahn@physics.mcmaster.ca>
In newsgroup: linux.dev.raid
>
> > Never buy Maxtor drives again!
> 
> you imply that Maxtor drives are somehow inherently flawed.
> can you explain why you think millions of people/companies
> are naive idiots for continuing to buy Maxtor disks?
> 
> this sort of thing is just not plausible: Maxtor competes 
> with the other top-tier disk vendors with similar products 
> and prices and reliability.
> 

In my experience, that is bullshit.  Maxtor competes on price using
inferior products.  I bought two Maxtor drives, both of them failed
within 13 months.  That was my first attempt at trying Maxtor again
after taking them off my sh*tlist from last time.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-23 19:10     ` H. Peter Anvin
@ 2004-11-23 20:03       ` Guy
  2004-11-23 21:18         ` Mark Hahn
  0 siblings, 1 reply; 34+ messages in thread
From: Guy @ 2004-11-23 20:03 UTC (permalink / raw)
  To: 'H. Peter Anvin', linux-raid

When will you learn?  :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-23 20:03       ` Guy
@ 2004-11-23 21:18         ` Mark Hahn
  2004-11-23 23:02           ` Robin Bowes
  2004-11-24  1:45           ` berk walker
  0 siblings, 2 replies; 34+ messages in thread
From: Mark Hahn @ 2004-11-23 21:18 UTC (permalink / raw)
  To: Guy; +Cc: 'H. Peter Anvin', linux-raid

> When will you learn?  :)

exactly - you can conclude absolutely nothing from two samples.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-23 21:18         ` Mark Hahn
@ 2004-11-23 23:02           ` Robin Bowes
  2004-11-24  0:33             ` Guy
  2004-11-24  1:45           ` berk walker
  1 sibling, 1 reply; 34+ messages in thread
From: Robin Bowes @ 2004-11-23 23:02 UTC (permalink / raw)
  To: linux-raid

Mark Hahn wrote:
>>When will you learn?  :)
> 
> 
> exactly - you can conclude absolutely nothing from two samples.
> 

I read that mail as "I stopped buying Maxtor (for whatever reason) then 
tried them again and had a 100% failure rate (albeit with a small 
sample size) so have stopped buying them again" rather than "I bought 
two Maxtor drives that failed so Maxtor drives are shit".

My own personal experience (I'm the OP in this thread) is that the 250GB 
SATA Maxtor Maxline II drives I have purchased have an unacceptable 
failure rate (something like 40% in 5 months).

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-11-23 23:02           ` Robin Bowes
@ 2004-11-24  0:33             ` Guy
  0 siblings, 0 replies; 34+ messages in thread
From: Guy @ 2004-11-24  0:33 UTC (permalink / raw)
  To: 'Robin Bowes', linux-raid

I understood!  I was poking fun that you tried them again, and again lost!
I hope you understood me.  "When will you learn? :)"

Also, I thought of this about 4 years ago.  Describes many managers!
"Sure you saved money, but at what cost?" - Guy Watkins

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-23 21:18         ` Mark Hahn
  2004-11-23 23:02           ` Robin Bowes
@ 2004-11-24  1:45           ` berk walker
  2004-11-24  2:00             ` H. Peter Anvin
  1 sibling, 1 reply; 34+ messages in thread
From: berk walker @ 2004-11-24  1:45 UTC (permalink / raw)
  To: Mark Hahn; +Cc: Guy, 'H. Peter Anvin', linux-raid

I think I have 4 1/2 out of 6.  Better?

Mark Hahn wrote:

>>When will you learn?  :)
>>    
>>
>
>exactly - you can conclude absolutely nothing from two samples.
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>  
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-24  1:45           ` berk walker
@ 2004-11-24  2:00             ` H. Peter Anvin
  2004-11-24  8:01               ` Good news / bad news - The joys of hardware Guy
  0 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2004-11-24  2:00 UTC (permalink / raw)
  To: berk walker; +Cc: Mark Hahn, Guy, linux-raid

berk walker wrote:
> I think I have 4 1/2 out of 6.  Better?
> 
> Mark Hahn wrote:
> 
>>> When will you learn?  :)
>>>   
>>
>>
>> exactly - you can conclude absolutely nothing from two samples.
>>

Actually, you can.  Having two fail in short order should be an extremely rare 
event.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Good news / bad news - The joys of hardware
  2004-11-24  2:00             ` H. Peter Anvin
@ 2004-11-24  8:01               ` Guy
  2004-11-24  8:57                 ` Robin Bowes
  0 siblings, 1 reply; 34+ messages in thread
From: Guy @ 2004-11-24  8:01 UTC (permalink / raw)
  Cc: linux-raid

About 2 years ago I had a disk fail, not 100%, but intermittent problems.
So I replaced it.  The replacement started acting up about 6-12 months ago.
Read errors about every 1-2 months, finally it went off-line.  But,
intermittently.  I did think it was odd that the drive in the same position
was failing, and with similar problems, but figured it was just a
quincidence.  Today I replaced it, and after replacing it I had some problems.
It is in a case with 6 other disks, so I could tell by the LEDs that the
replacement drive was acting wrong, intermittently.  I determined that the
Molex power plug going to the drive was causing the problems.  What a pain!
So, the 2 drives that I replaced may have been good.  The first drive I took
apart.  I have the magnets to prove it!  But it may have been a good drive!

To make a long story short, check the cables for failures, including the
power cables.

The drives are Seagate, and I have at least 26 in service, so 2 failures out
of 26 in 3 years is not so bad.  However, if the Molex connector was at
fault, then 0 failures out of 26 in 3 years is just fine.

The drive is model ST118282LC, MTBF 1,000,000.  I think with 26 drives I
should have 1 failure in about 4.4 years.  The drives have a 5 year
warranty, but they are OEM, so I get nothing.  I am not the first owner, but
they were unused.  And I bet they are about 5 years old now.

Too much info?  Sorry.  Maybe I need a blog?  :)

Can anyone spell "quincidence"?

Guy


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of hardware
  2004-11-24  8:01               ` Good news / bad news - The joys of hardware Guy
@ 2004-11-24  8:57                 ` Robin Bowes
  0 siblings, 0 replies; 34+ messages in thread
From: Robin Bowes @ 2004-11-24  8:57 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid

Guy wrote:
> Can anyone spell "quincidence"?

http://dictionary.reference.com/search?q=coincidence

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-19 21:42 ` Good news / bad news - The joys of RAID Guy
@ 2004-11-28 13:15   ` Robin Bowes
  2004-11-30  2:05     ` Neil Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Robin Bowes @ 2004-11-28 13:15 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid

Guy wrote:
> I use mdadm's monitor mode to send me email when events occur.

Guy,

I've been meaning to write this for a while...

I tried monitoring once but had a problem when shutting down as the 
arrays were reported as "busy" because mdadm --monitor was running on 
them. I guess it needs to be killed earlier in the shutdown process.

So, can you share with me how you start/stop mdadm to run in monitor mode?

Thanks,

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-28 13:15   ` Robin Bowes
@ 2004-11-30  2:05     ` Neil Brown
  2004-12-01  3:34       ` Doug Ledford
  0 siblings, 1 reply; 34+ messages in thread
From: Neil Brown @ 2004-11-30  2:05 UTC (permalink / raw)
  To: Robin Bowes; +Cc: Guy, linux-raid

On Sunday November 28, robin-lists@robinbowes.com wrote:
> Guy wrote:
> > I use mdadm's monitor mode to send me email when events occur.
> 
> Guy,
> 
> I've been meaning to write this for a while...
> 
> I tried monitoring once but had a problem when shutting down as the 
> arrays were reported as "busy" because mdadm --monitor was running on 
> them. I guess it needs to be killed earlier in the shutdown process.

That bug was fixed in mdadm 1.6.0

NeilBrown


 From the ChangeLog:
Changes Prior to 1.6.0 release
...
    -   Fix bug in --monitor where an array could be held open and so
	could not be stopped without killing mdadm.
...


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-30  2:05     ` Neil Brown
@ 2004-12-01  3:34       ` Doug Ledford
  2004-12-01 11:50         ` Robin Bowes
  0 siblings, 1 reply; 34+ messages in thread
From: Doug Ledford @ 2004-12-01  3:34 UTC (permalink / raw)
  To: Neil Brown; +Cc: Robin Bowes, Guy, linux-raid

On Tue, 2004-11-30 at 13:05 +1100, Neil Brown wrote:
> On Sunday November 28, robin-lists@robinbowes.com wrote:
> > Guy wrote:
> > > I use mdadm's monitor mode to send me email when events occur.
> > 
> > Guy,
> > 
> > I've been meaning to write this for a while...
> > 
> > I tried monitoring once but had a problem when shutting down as the 
> > arrays were reported as "busy" because mdadm --monitor was running on 
> > them. I guess it needs to be killed earlier in the shutdown process.
> 
> That bug was fixed in mdadm 1.6.0
> 
> NeilBrown
> 
> 
>  From the ChangeLog:
> Changes Prior to 1.6.0 release
> ...
>     -   Fix bug in --monitor where an array could be held open and so
> 	could not be stopped without killing mdadm.
> ...

If I recall correctly, this fixes the primary symptom, but not the whole
problem.  When in --monitor mode, mdadm will reopen each device every 15
seconds to scan its status.  As such, a shutdown could still fail if
mdadm is still running and the timing is right.  In that instance,
retrying the shutdown on failure would likely be enough to solve the
problem, but that sounds icky to me.  Would be much better if mdadm
could open a control device of some sort and query about running arrays
instead of opening the arrays themselves.

-- 
  Doug Ledford <dledford@redhat.com>
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-12-01  3:34       ` Doug Ledford
@ 2004-12-01 11:50         ` Robin Bowes
  0 siblings, 0 replies; 34+ messages in thread
From: Robin Bowes @ 2004-12-01 11:50 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Neil Brown, Guy, linux-raid

Doug Ledford wrote:
> 
> If I recall correctly, this fixes the primary symptom, but not the whole
> problem.  When in --monitor mode, mdadm will reopen each device every 15
> seconds to scan its status.  As such, a shutdown could still fail if
> mdadm is still running and the timing is right.  In that instance,
> retrying the shutdown on failure would likely be enough to solve the
> problem, but that sounds icky to me.  Would be much better if mdadm
> could open a control device of some sort and query about running arrays
> instead of opening the arrays themselves.

Wouldn't simply killing the "mdadm --monitor" process early on in the 
shutdown process achieve the same result?
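
i.e. something near the top of the shutdown sequence, before the script
that stops the arrays - roughly (names and ordering will vary by distro,
and I haven't tried this exact sequence):

   killall -q mdadm        # stop the monitor so nothing holds the arrays open
   mdadm --stop --scan     # now the arrays can be stopped cleanly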

R.
-- 
http://robinbowes.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-11-20 22:17         ` Mark Hahn
  2004-11-20 23:09           ` Guy
@ 2004-12-02 16:47           ` TJ
  2004-12-02 17:29             ` Stephen C Woods
                               ` (2 more replies)
  1 sibling, 3 replies; 34+ messages in thread
From: TJ @ 2004-12-02 16:47 UTC (permalink / raw)
  To: linux-raid

> afaikt, the deathstar incident was actually bad firmware
> (didn't correctly flush data when hard powered off, resulting in
> blocks on disk with bogus ECC, which had to be considered bad from
> then on, even if the media was perfect.)

I do not think the deathstar incident was due to a firmware problem as you 
describe at all. I had a lot of these drives fail, and I read as much as I 
could find on the subject. The problem was most likely caused by the fact 
that these drives used IBM's new glass substrate technology. This substrate 
had heat expansion issues which caused the heads to misalign on tracks and 
eventually cross write over tracks, corrupting data. The classic "click of 
death" was the sound of the drive searching for a track repetitively. In some 
cases a format would allow the drive to be used again, in many cases it would 
not. It is my belief that formatting was ineffective at fixing the drive 
because the cross writing probably hit some of the low level data, which the 
drive cannot repair on a format.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-12-02 16:47           ` TJ
@ 2004-12-02 17:29             ` Stephen C Woods
  2004-12-03  3:37             ` Mark Hahn
  2004-12-09  0:17             ` H. Peter Anvin
  2 siblings, 0 replies; 34+ messages in thread
From: Stephen C Woods @ 2004-12-02 17:29 UTC (permalink / raw)
  To: TJ, linux-raid


  Perhaps servo/timing data?  Also I recall some Kennedy Winchester
drives back in the early 80s that, if you had a power outage, would get
header CRC errors at pairs of blocks arranged in a spiral as the head
headed for the landing zone.    I recall writing a standalone program
that would read the entire drive and then 'correct' the CRC errors as it
found them.   Since much of the drive was unused I finally figured out
that the data was fine; it was the header CRC that got clobbered.  Apparently
there was a bug in the powerdown hardware so it would enable the write head
when it was in the interblock zone as it was flying to land....
when it was in the the interblock zone as it was flying to land....

    Ahh for the days of poking into device registers  (in Memory) to get I/O to
happen (from the console).
<scw>


On Thu, Dec 02, 2004 at 11:47:12AM -0500, TJ wrote:
> > afaikt, the deathstar incident was actually bad firmware
> > (didn't correctly flush data when hard powered off, resulting in
> > blocks on disk with bogus ECC, which had to be considered bad from
> > then on, even if the media was perfect.)
> 
> I do not think the deathstar incident was due to a firmware problem as you 
> describe at all. I had a lot of these drives fail, and I read as much as I 
> could find on the subject. The problem was most likely caused by the fact 
> that these drives used IBM's new glass substrate technology. This substrate 
> had heat expansion issues which caused the heads to misalign on tracks and 
> eventually cross write over tracks, corrupting data. The classic "click of 
> death" was the sound of the drive searching for a track repetitively. In some 
> cases a format would allow the drive to be used again, in many cases it would 
> not. It is my belief that formatting was inneffective at fixing the drive 
> because the cross writing probably hit some of the low level data, which the 
> drive cannot repair on a format.

-- 
-----
Stephen C. Woods; UCLA SEASnet; 2567 Boelter hall; LA CA 90095; (310)-825-8614
Unless otherwise noted these statements are my own, Not those of the 
University of California.                      Internet mail:scw@seas.ucla.edu

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-12-02 16:47           ` TJ
  2004-12-02 17:29             ` Stephen C Woods
@ 2004-12-03  3:37             ` Mark Hahn
  2004-12-03  4:16               ` Guy
  2004-12-09  0:17             ` H. Peter Anvin
  2 siblings, 1 reply; 34+ messages in thread
From: Mark Hahn @ 2004-12-03  3:37 UTC (permalink / raw)
  To: TJ; +Cc: linux-raid

> not. It is my belief that formatting was inneffective at fixing the drive 
> because the cross writing probably hit some of the low level data, which the 
> drive cannot repair on a format.

the ecc *is* the low-level data.  without performing a controlled experiment
that recreates the power-off scenario, there's no way to distinguish a block
whose media is actually bad from one whose ecc fails because the ecc is bad.

the firmware theory is supported by the fact that many deathstars 
performed perfectly well for many years.  I have at least one that lasted
for 4+ years, and was powered off only a few times, and all of those cleanly.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-12-03  3:37             ` Mark Hahn
@ 2004-12-03  4:16               ` Guy
  2004-12-03  4:46                 ` Alvin Oga
  2004-12-03  5:24                 ` Richard Scobie
  0 siblings, 2 replies; 34+ messages in thread
From: Guy @ 2004-12-03  4:16 UTC (permalink / raw)
  To: 'Mark Hahn', 'TJ'; +Cc: linux-raid

The ECC is not the low level data.  The servo tracks are.  I bet there are
start of track/sector header marks also.  I believe a low level format will
not re-write the servo tracks.  Some drives reserve 1 side of 1 platter for
servo data.  Others mix the servo data with user data.  I don't know the
full details, just tidbits I have read over the years.

If your drives were cooled better than most, that may explain why you did
not have the "substrate had heat expansion issues".  Just a guess.

If the problem was a firmware issue, why didn't IBM release a firmware
update?

You said:
"the firmware theory is supported by the fact that many deathstars 
performed perfectly well for many years"

Are you saying some drives had good firmware, while others had bad firmware?
Otherwise, I don't understand your logic, since a drive not failing does not
prove a firmware bug.

Guy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: Good news / bad news - The joys of RAID
  2004-12-03  4:16               ` Guy
@ 2004-12-03  4:46                 ` Alvin Oga
  2004-12-03  5:24                 ` Richard Scobie
  1 sibling, 0 replies; 34+ messages in thread
From: Alvin Oga @ 2004-12-03  4:46 UTC (permalink / raw)
  To: Guy; +Cc: linux-raid


On Thu, 2 Dec 2004, Guy wrote:

> The ECC is not the low-level data.  The servo tracks are.  I expect there
> are start-of-track/sector header marks as well.  I believe a low-level format
> will not re-write the servo tracks.  Some drives reserve one side of one
> platter for servo data.  Others mix the servo data in with the user data.  I
> don't know the full details, just tidbits I have read over the years.

ECC lives on the disk controller, alongside the phase-locked loop and the
other analog circuitry that turns the signal from the head back into 1's and
0's, so the ECC code in the firmware can correct any obvious head read errors.

Track/sector info is written to the disk by a low-level format
        ( usually done at the manufacturer;
          you can also do a low-level format with superformat )

        - it contains sector and track info and other header info,
          along with the gaps and timing/spacing between each field

        - disks are now soft-sectored ( no servo info ),
          with 512-byte, 1K, 2K or 4K(?) sectors

        - there is just one "index" mark to indicate one full
          platter rotation

- you can change any/all of the data, as long as the apps can read
  back what their lower-level drivers wrote to the disk
        - firmware is the lowest level at which changes happen
          ( on the disk controller )

        - some brave souls put "raid" in firmware
          ( risky in my book )

We use mke2fs, mkreiserfs etc. to write filesystem data and make the
platter useful.

We use software and other utilities to do further checking on the
data we expect to get back.

- if the system memory is bad,
        we overwrite good disk data with bad data from bad memory

- if the disk read/write is bad,
        we can sometimes compensate by keeping the disk
        cooler ( a disk temperature of <= 30C is good )

        if the ECC on the disk controller cannot fix it, the disk
        is basically worthless
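
For the temperature point, the drive's firmware usually reports its own
temperature through SMART attribute 194 (Temperature_Celsius). A rough
sketch, assuming smartmontools is installed; /dev/sdX is a placeholder, and
the attribute name and raw-value format differ between vendors.

import subprocess

DEV = "/dev/sdX"   # placeholder
LIMIT_C = 30       # the rule-of-thumb ceiling mentioned above

out = subprocess.run(["smartctl", "-A", DEV],
                     capture_output=True, text=True, check=False).stdout
for line in out.splitlines():
    fields = line.split()
    if len(fields) >= 10 and fields[1] in ("Temperature_Celsius",
                                           "Airflow_Temperature_Cel"):
        temp = int(fields[9])   # raw value; some drives append min/max info
        marker = "  (warmer than %dC)" % LIMIT_C if temp > LIMIT_C else ""
        print(f"{DEV}: {temp}C{marker}")
        break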

c ya
alvin


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-12-03  4:16               ` Guy
  2004-12-03  4:46                 ` Alvin Oga
@ 2004-12-03  5:24                 ` Richard Scobie
  2004-12-03  5:40                   ` Konstantin Olchanski
  1 sibling, 1 reply; 34+ messages in thread
From: Richard Scobie @ 2004-12-03  5:24 UTC (permalink / raw)
  To: linux-raid

Guy wrote:

> If the problem was a firmware issue, why didn't IBM release a firmware
> update?

I believe they did. I recall downloading something similar to this:

http://support.dell.com/support/downloads/format.aspx?releaseid=r37239&c=us&l=en&s=biz&cs=555

at the time, to fix one of my drives.
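
For anyone trying to work out whether a given drive is still on the old
firmware, the revision the drive reports is easy to read back. A small
sketch, assuming smartmontools is installed (hdparm -I shows the same
information); the /dev/sd[a-z] glob is just a convenient placeholder for
"all the drives in the box".

import glob
import subprocess

for dev in sorted(glob.glob("/dev/sd[a-z]")):
    out = subprocess.run(["smartctl", "-i", dev],
                         capture_output=True, text=True, check=False).stdout
    for line in out.splitlines():
        # smartctl -i prints "Device Model:" and "Firmware Version:" lines
        # for ATA drives; echo just those two.
        if line.startswith(("Device Model", "Firmware Version")):
            print(f"{dev}: {line.strip()}")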

Regards,

Richard


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-12-03  5:24                 ` Richard Scobie
@ 2004-12-03  5:40                   ` Konstantin Olchanski
  0 siblings, 0 replies; 34+ messages in thread
From: Konstantin Olchanski @ 2004-12-03  5:40 UTC (permalink / raw)
  To: Richard Scobie; +Cc: linux-raid

On Fri, Dec 03, 2004 at 06:24:13PM +1300, Richard Scobie wrote:
> >If the problem was a firmware issue, why didn't IBM release a firmware
> >update?
> 
> I believe they did. I recall downloading something similar to this:
> http://support.dell.com/support/downloads/format.aspx?releaseid=r37239&c=us&l=en&s=biz&cs=555

The updated IBM firmware helped. Before, every power outage would
produce disks with unreadable sectors. Now, all our IBM disks have
the "new" firmware and they hardly ever develop unreadable sectors.

This makes me suspect that there are *two* unrelated problems:
1) the "scribble at power down" problem, fixed by the firmware update;
2) the "overheated disks lose data due to platter thermal expansion" problem,
   probably unfixable, other than by keeping the disks cool.
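
One way to catch problem (1) early is a periodic read pass over each member
disk, looking for sectors that have gone unreadable since the last power
cycle. A minimal sketch; the device path and chunk size are arbitrary, on a
RAID set you would scan the underlying disks rather than the md device, and a
full pass over a large drive takes hours.

import os

DEV = "/dev/sdX"            # placeholder: one member disk
CHUNK = 1024 * 1024         # read 1 MiB at a time

fd = os.open(DEV, os.O_RDONLY)
offset = 0
bad = []
while True:
    try:
        data = os.read(fd, CHUNK)
        if not data:
            break                     # end of device
        offset += len(data)
    except OSError:
        bad.append(offset)            # note the unreadable region ...
        offset += CHUNK               # ... and skip past it
        os.lseek(fd, offset, os.SEEK_SET)
os.close(fd)

print("unreadable regions at byte offsets:", bad if bad else "none")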

-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Good news / bad news - The joys of RAID
  2004-12-02 16:47           ` TJ
  2004-12-02 17:29             ` Stephen C Woods
  2004-12-03  3:37             ` Mark Hahn
@ 2004-12-09  0:17             ` H. Peter Anvin
  2 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2004-12-09  0:17 UTC (permalink / raw)
  To: linux-raid

Followup to:  <200412021147.12410.systemloc@earthlink.net>
By author:    TJ <systemloc@earthlink.net>
In newsgroup: linux.dev.raid
> 
> I do not think the deathstar incident was due to a firmware problem as you 
> describe at all. I had a lot of these drives fail, and I read as much as I 
> could find on the subject. The problem was most likely caused by the fact 
> that these drives used IBM's new glass substrate technology. This substrate 
> had heat expansion issues which caused the heads to misalign on tracks and 
> eventually cross write over tracks, corrupting data. The classic "click of 
> death" was the sound of the drive searching for a track repetitively. In some 
> cases a format would allow the drive to be used again, in many cases it would 
> not. It is my belief that formatting was ineffective at fixing the drive 
> because the cross writing probably hit some of the low level data, which the 
> drive cannot repair on a format.
> 

It's also worth noting that there was an extremely high correlation
between which factory built the drives and the failure rates.
Apparently some factories had virtually zero instances of this
problem.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2004-12-09  0:17 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-19 21:06 Good news / bad news - The joys of RAID Robin Bowes
2004-11-19 21:28 ` Guy
2004-11-20 18:42   ` Mark Hahn
2004-11-20 19:37     ` Guy
2004-11-20 20:03       ` Mark Klarzynski
2004-11-20 22:17         ` Mark Hahn
2004-11-20 23:09           ` Guy
2004-12-02 16:47           ` TJ
2004-12-02 17:29             ` Stephen C Woods
2004-12-03  3:37             ` Mark Hahn
2004-12-03  4:16               ` Guy
2004-12-03  4:46                 ` Alvin Oga
2004-12-03  5:24                 ` Richard Scobie
2004-12-03  5:40                   ` Konstantin Olchanski
2004-12-09  0:17             ` H. Peter Anvin
2004-11-20 23:30       ` Mark Hahn
2004-11-20 19:40     ` David Greaves
2004-11-21  4:33       ` Guy
2004-11-21  1:01     ` berk walker
2004-11-23 19:10     ` H. Peter Anvin
2004-11-23 20:03       ` Guy
2004-11-23 21:18         ` Mark Hahn
2004-11-23 23:02           ` Robin Bowes
2004-11-24  0:33             ` Guy
2004-11-24  1:45           ` berk walker
2004-11-24  2:00             ` H. Peter Anvin
2004-11-24  8:01               ` Good news / bad news - The joys of hardware Guy
2004-11-24  8:57                 ` Robin Bowes
2004-11-19 21:42 ` Good news / bad news - The joys of RAID Guy
2004-11-28 13:15   ` Robin Bowes
2004-11-30  2:05     ` Neil Brown
2004-12-01  3:34       ` Doug Ledford
2004-12-01 11:50         ` Robin Bowes
2004-11-19 21:58 ` Gordon Henderson
