* Good news / bad news - The joys of RAID
@ 2004-11-19 21:06 Robin Bowes
From: Robin Bowes @ 2004-11-19 21:06 UTC
To: linux-raid
The bad news is I lost another disk tonight. Remind me *never* to buy
Maxtor drives again.
The good news is that my RAID5 array was configured as 5 + 1 spare. I
powered down the server, used the Maxtor PowerMax utility to identify
the bad disk, pulled it out and re-booted. My array is currently re-syncing.
[root@dude root]# mdadm --detail /dev/md5
/dev/md5:
Version : 00.90.01
Creation Time : Thu Jul 29 21:41:38 2004
Raid Level : raid5
Array Size : 974566400 (929.42 GiB 997.96 GB)
Device Size : 243641600 (232.35 GiB 249.49 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 5
Persistence : Superblock is persistent
Update Time : Fri Nov 19 20:52:58 2004
State : dirty, resyncing
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 128K
Rebuild Status : 0% complete
UUID : a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
Events : 0.1765551
   Number   Major   Minor   RaidDevice   State
      0       8       2         0        active sync   /dev/sda2
      1       8      18         1        active sync   /dev/sdb2
      2       8      34         2        active sync   /dev/sdc2
      3       8      50         3        active sync   /dev/sdd2
      4       8      66         4        active sync   /dev/sde2
Thinking about what happened, I would have expected that the bad drive
would simply be removed from the array, the spare activated, and
re-syncing started automatically.
What actually happened was that I rebooted to activate a new kernel and
the box didn't come back up. As the machine runs headless, I had to
power it off and take it to a monitor/keyboard to check it. In the new
location it came up fine so I shut it down again and put it back in my
"server room" (read: cellar). I still couldn't see it from the network
so I dragged an old 14" CRT out of the shed and connected it up. The
login prompt was there but there was an "ata2 timeout" error message and
the console was dead. I power-cycled to reboot and as it booted I saw a
message something like "postponing resync of md0 as it uses the same
device as md5. waiting for md5 to resync". I then got a further ata
timeout error. I had to physically disconnect the bad drive and reboot
in order to re-start the re-sync.
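[For reference, a sketch of how I'd have expected to handle it by hand
if the kernel had flagged the drive cleanly - the replacement device
name is just a placeholder:

   mdadm /dev/md5 --fail /dev/sdf2 --remove /dev/sdf2   # drop the dead member
   mdadm /dev/md5 --add /dev/sdg2                       # add a replacement later
   cat /proc/mdstat                                     # watch the rebuild

Instead the spare never kicked in until the bad drive was physically
out of the box.]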
Further md information:
[root@dude log]# mdadm --detail --scan
ARRAY /dev/md2 level=raid1 num-devices=2
UUID=11caa547:1ba8d185:1f1f771f:d66368c9
devices=/dev/sdc1
ARRAY /dev/md1 level=raid1 num-devices=2
UUID=be8ad31a:f13b6f4b:c39732fc:c84f32a8
devices=/dev/sdb1,/dev/sde1
ARRAY /dev/md5 level=raid5 num-devices=5
UUID=a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1
devices=/dev/sda2,/dev/sdb2,/dev/sdc2,/dev/sdd2,/dev/sde2
ARRAY /dev/md0 level=raid1 num-devices=2
UUID=4b28338c:bf08d0bc:bb2899fc:e7f35eae
devices=/dev/sda1,/dev/sdd1
It was /dev/sdf that failed. It contained two partitions: one part of
md2 (now running un-mirrored but still showing two devices) and the
other part of md5 (now re-syncing but only showing five devices).
Is this normal behaviour?
R.
--
http://robinbowes.com

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-19 21:28 UTC
To: 'Robin Bowes', linux-raid

Reminder.... Never buy Maxtor drives again!

Guy

* RE: Good news / bad news - The joys of RAID
From: Mark Hahn @ 2004-11-20 18:42 UTC
To: linux-raid

> Never buy Maxtor drives again!

you imply that Maxtor drives are somehow inherently flawed. can you
explain why you think millions of people/companies are naive idiots
for continuing to buy Maxtor disks?

this sort of thing is just not plausible: Maxtor competes with the
other top-tier disk vendors with similar products and prices and
reliability. yes, if you buy a 1-year disk, you can expect it to have
been less carefully tested, possibly be of lower-end design and
reliability, and to have been handled more poorly by the supply chain.
thankfully, you don't have to buy 1-year disks any more.

read the specs. make sure your supply chain knows how to handle disks.
make sure your disks are mounted correctly, both mechanically and with
enough airflow. use raid and some form of archiving/backups. don't get
hung up on which of the 4-5 top-tier vendors makes your disk.

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-20 19:37 UTC
To: 'Mark Hahn', linux-raid

I have had far more failures of Maxtor drives than any other. I have
also had problems with WD drives. I know someone that had 4-6 IBM
disks, most of which have failed. I am talking about disks with 3 year
warranties, based on the spec. But OEM disks have none. You must
return them to the PC manufacturer. Most of my failures were within 3
years, but beyond the warranty period of the system. So the OEM issue
has occurred too often. I have had good luck with Seagate.

I use RAID, it is a must with the failure rate! I do backup also, but
RAID tends to save me. Most people have a PC with 1 disk. They don't
understand RAID, and they don't understand that everything will be
lost if the disk breaks! They think "Dell will just fix it". But
wrong, Dell will just replace it! Big difference.

Today's disks claim an MTBF of about 1,000,000 hours! That's about 114
years. So, if I had 10 disks I should expect 1 failure every 11.4
years. That would be so cool! But not in the real world. Can you
explain how the disks have an MTBF of 1,000,000 hours, but fail more
often than that? Maybe I just don't understand some aspect of MTBF.

Guy

* RE: Good news / bad news - The joys of RAID
From: Mark Klarzynski @ 2004-11-20 20:03 UTC
To: linux-raid

MTBF is a statistic based upon the expected 'use' of the drive and the
replacement of the drive after its end of life (3-5 years)... It's
extremely complex and boring but the figure is only relevant if the
drive is being used within an environment that matches those of the
calculations.

SATA / IDE drives have an MTBF similar to that of SCSI / Fibre. But
this is based upon their expected use... i.e. SCSI used to be [power
on hours = 24hr] [use = 8 hours].. whilst SATA used to be [power on =
8 hours] and [use = 20 mins].

Regardless of what some people claim (usually those that only sell
sata based raids), the drives are not constructed the same in any way.
SATAs fail more within a raid environment (probably around 10:1)
because of the heavy use and also because they are not as
intelligent... therefore when they do not respond we have no way of
interrogating them or resetting them, whilst with scsi we can do both.
This means that a raid controller / driver has no option but to simply
fail the drive.

Maxtor lead the way in capacity and also reliability... I personally
had to recall countless earlier IBMs and replace them with maxtor. But
the new generation of IBMs (Hitachi) have got it together.

So - I guess you are all right :)

* RE: Good news / bad news - The joys of RAID
From: Mark Hahn @ 2004-11-20 22:17 UTC
To: linux-raid

> SATA / IDE drives have an MTBF similar to that of SCSI / Fibre. But
> this is based upon their expected use... i.e. SCSI used to be [power
> on hours = 24hr] [use = 8 hours].. whilst SATA used to be [power on
> = 8 hours] and [use = 20 mins].

the vendors I talk to always quote SCSI/FC at 100% power 100% duty,
and PATA/SATA at 100% power 20% duty.

> Regardless of what some people claim (usually those that only sell
> sata based raids), the drives are not constructed the same in any
> way.

obviously, there *have* been pairs of SCSI/ATA disks which had
identical mech/analog sections. but the mech/analog fall into just two
kinds:

- optimized for IOPS: 10-15K rpm for minimal rotational latency,
  narrow recording area for low seek distance, quite low bit and track
  density to avoid long waits for the head to stabilize after a seek.

- optimized for density/bandwidth: high bit/track density, wide
  recording area, modest seeks/rotation speed.

the first is SCSI/FC and the second ATA, mainly for historic reasons.

> SATAs fail more within a raid environment (probably around 10:1)
> because of the heavy use and also because they are not as
> intelligent...

what connection are you drawing between raid and "heavy use"? how does
being in a raid increase the IO load per disk?

> therefore when they do not respond we have no way of interrogating
> them or resetting them, whilst with scsi we can do both.

you've never seen a SCSI reset that looks just like an ATA reset?
sorry, but SCSI has no magic.

> This means that a raid controller / driver has no option but to
> simply fail the drive.

no.

> Maxtor lead the way in capacity and also reliability... I personally
> had to recall countless earlier IBMs and replace them with maxtor.

afaict, the deathstar incident was actually bad firmware (didn't
correctly flush data when hard powered off, resulting in blocks on
disk with bogus ECC, which had to be considered bad from then on, even
if the media was perfect.)

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-20 23:09 UTC
To: 'Mark Hahn', linux-raid

You got any links related to this?
"the deathstar incident was actually bad firmware"

Can a user download and update the firmware? If so, I know someone
that may have some bad disks that are not so bad. If he can repair his
disks, I will report the status back on this list.

Previously I thought IBM made very good disks, until my friend had
more than a 75% failure rate. And within the warranty period.

I personally have an IBM SCSI disk that is running 100% of the time,
and the cooling is real bad. The drive is much too hot to touch. Been
like that for 5+ years. Never had any issues. The system also has a
Seagate that is too hot to touch, but only been running 3+ years. Both
are 18 Gig. The disks are in a system my wife uses! Don't tell her. :)
I got to fix that someday.

Guy

* Re: Good news / bad news - The joys of RAID
From: TJ @ 2004-12-02 16:47 UTC
To: linux-raid

> afaict, the deathstar incident was actually bad firmware (didn't
> correctly flush data when hard powered off, resulting in blocks on
> disk with bogus ECC, which had to be considered bad from then on,
> even if the media was perfect.)

I do not think the deathstar incident was due to a firmware problem as
you describe at all. I had a lot of these drives fail, and I read as
much as I could find on the subject. The problem was most likely
caused by the fact that these drives used IBM's new glass substrate
technology. This substrate had heat expansion issues which caused the
heads to misalign on tracks and eventually cross write over tracks,
corrupting data. The classic "click of death" was the sound of the
drive searching for a track repetitively. In some cases a format would
allow the drive to be used again, in many cases it would not. It is my
belief that formatting was ineffective at fixing the drive because the
cross writing probably hit some of the low level data, which the drive
cannot repair on a format.

* Re: Good news / bad news - The joys of RAID
From: Stephen C Woods @ 2004-12-02 17:29 UTC
To: TJ, linux-raid

Perhaps servo/timing data?

Also I recall some Kennedy Winchester drives back in the early 80s
that, if you had a power outage, would get header CRC errors at pairs
of blocks that were arranged in a spiral as the head headed for the
landing zone. I recall writing a standalone program that would read
the entire drive and then 'correct' the CRC errors as it found them.
Since much of the drive was unused I finally figured out that the data
was fine; it was the header CRC that got clobbered. Apparently there
was a bug in the powerdown hardware so it would enable the write head
when it was in the interblock zone as it was flying to land....

Ahh for the days of poking into device registers (in memory) to get
I/O to happen (from the console).

<scw>
--
Stephen C. Woods; UCLA SEASnet; 2567 Boelter hall; LA CA 90095;
(310)-825-8614
Unless otherwise noted these statements are my own, not those of the
University of California.  Internet mail: scw@seas.ucla.edu

* Re: Good news / bad news - The joys of RAID
From: Mark Hahn @ 2004-12-03 3:37 UTC
To: TJ; +Cc: linux-raid

> It is my belief that formatting was ineffective at fixing the drive
> because the cross writing probably hit some of the low level data,
> which the drive cannot repair on a format.

the ecc *is* the low-level data. without performing a controlled
experiment that recreates the power-off scenario, there's no way to
distinguish a block whose media is actually bad from one whose ecc
fails because the ecc is bad.

the firmware theory is supported by the fact that many deathstars
performed perfectly well for many years. I have at least one that
lasted for 4+ years, and was powered off only a few times, and all of
those cleanly.

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-12-03 4:16 UTC
To: 'Mark Hahn', 'TJ'; +Cc: linux-raid

The ECC is not the low level data. The servo tracks are. I bet there
are start of track/sector header marks also. I believe a low level
format will not re-write the servo tracks. Some drives reserve 1 side
of 1 platter for servo data. Others mix the servo data with user data.
I don't know the full details, just tidbits I have read over the
years.

If your drives were cooled better than most, that may explain why you
did not have the "substrate had heat expansion issues". Just a guess.

If the problem was a firmware issue, why didn't IBM release a firmware
update?

You said:
"the firmware theory is supported by the fact that many deathstars
performed perfectly well for many years"

Are you saying some drives had good firmware, while others had bad
firmware? Otherwise, I don't understand your logic, since a drive not
failing does not prove a firmware bug.

Guy

* RE: Good news / bad news - The joys of RAID
From: Alvin Oga @ 2004-12-03 4:46 UTC
To: Guy; +Cc: linux-raid

On Thu, 2 Dec 2004, Guy wrote:
> The ECC is not the low level data. The servo tracks are. I bet there
> are start of track/sector header marks also. I believe a low level
> format will not re-write the servo tracks. Some drives reserve 1
> side of 1 platter for servo data. Others mix the servo data with
> user data. I don't know the full details, just tidbits I have read
> over the years.

ecc is in the disk controller with the phaselock loop and other analog
circuitry to convert the analog signal from the head back into 1's and
0's for the ecc code in firmware to correct any obvious head read
errors

track/sector info is written to the disk with low level format
( usually at the manufacturer - you can also do lowlevel format with
superformat )
 - it contains sector and track info and other header info along with
   gaps and timing/spacing between each field
 - disks are now soft sectored .. ( no servo info )
   ( 512 bytes or 1K or 2K or 4K(?) bytes per sector )
 - there is just one "index" mark to indicate one full platter
   rotation
 - you can change any/all of the data ... as long as the apps can read
   the data its lower-level drivers did to the disk
 - firmware level is the lowest changes ( on the disk controller )
 - some brave souls put "raid" in firmware .. ( risky in my book )

we use mke2fs, mkreiserfs etc to write file system data to make the
platter useful

we use software and other utilities to do more ecc checking on the
data we expect to get back
 - if the system memory is bad .. we overwrite good disk data with bad
   data from bad memory
 - if the disk read/write is bad ... we can sometimes compensate for
   it by keeping the disk cooler ( <= 30C for disk temp is good )

if ecc on the disk controller cannot fix it .. the disk is basically
worthless

c ya
alvin

* Re: Good news / bad news - The joys of RAID
From: Richard Scobie @ 2004-12-03 5:24 UTC
To: linux-raid

Guy wrote:
> If the problem was a firmware issue, why didn't IBM release a
> firmware update?

I believe they did. I recall downloading something similar to this:

http://support.dell.com/support/downloads/format.aspx?releaseid=r37239&c=us&l=en&s=biz&cs=555

at the time, to fix one of my drives.

Regards,

Richard

* Re: Good news / bad news - The joys of RAID
From: Konstantin Olchanski @ 2004-12-03 5:40 UTC
To: Richard Scobie; +Cc: linux-raid

On Fri, Dec 03, 2004 at 06:24:13PM +1300, Richard Scobie wrote:
> > If the problem was a firmware issue, why didn't IBM release a
> > firmware update?
>
> I believe they did. I recall downloading something similar to this:
> http://support.dell.com/support/downloads/format.aspx?releaseid=r37239&c=us&l=en&s=biz&cs=555

The updated IBM firmware helped. Before, every power outage would
produce disks with unreadable sectors. Now, all our IBM disks have the
"new" firmware and they hardly ever develop unreadable sectors.

This makes me suspect that there are *two* unrelated problems:
1) the "scribble at power down" problem, fixed by the firmware update;
2) the "overheated disks lose data due to platter thermal expansion"
   problem, probably unfixable, other than by keeping the disks cool.

--
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada

* Re: Good news / bad news - The joys of RAID
From: H. Peter Anvin @ 2004-12-09 0:17 UTC
To: linux-raid

Followup to: <200412021147.12410.systemloc@earthlink.net>
By author: TJ <systemloc@earthlink.net>
In newsgroup: linux.dev.raid

> I do not think the deathstar incident was due to a firmware problem
> as you describe at all. [...] The problem was most likely caused by
> the fact that these drives used IBM's new glass substrate technology.

It's also worth noting that there was extremely high correlation
between which factory built the drives and the failure rates.
Apparently some factories had virtually zero instances of this
problem.

	-hpa

* RE: Good news / bad news - The joys of RAID
From: Mark Hahn @ 2004-11-20 23:30 UTC
To: Guy; +Cc: linux-raid

> Can you explain how the disks have an MTBF of 1,000,000 hours, but
> fail more often than that? Maybe I just don't understand some aspect
> of MTBF.

simple: the MTBF applies to very large sets of disks. if you had
millions of disks, you'd expect to average mtbf/ndisks between
failures. with statistically trivial sample sizes (10 disks), you
can't really say much. of course, a proper model of the failure rate
would have a lot more than 1 parameter...

for instance, my organization will be buying about .5 PB of storage
soon. here are some options:

  disk            n      mtbf    hours   $/disk   $K total
  250GB SATA      1920   1e6     500     399      766
  600GB SATA      800    1e6     1250    600?     480
  73GB SCSI/FC    6575   1.3e6   198     389      2558
  146GB SCSI/FC   3288   1.3e6   395     600      1973
  300GB SCSI/FC   1600   1.3e6   813     1200     1920

these mtbf's are basically made up, since disk vendors aren't really
very helpful in publishing their true reliability distributions. these
disk counts are starting to be big enough to give some meaning to the
hours=mtbf/n calculation - I'd WAG that "hours" is within a factor of
two. (I looked at only three lines of SCSI disks to get 1.3e6 - two
quoted 1.2 and the newer was 1.4.)

vendors seem to be switching to quoting "annualized failure rates",
which are probably easier to understand - 1.2e6 MTBF or 0.73% AFR, for
instance. the latter makes it more clear that we're talking about
gambling ;)

but the message is clear: for a fixed, large capacity, your main
concern should be bigger disks. since our money is also fixed, you can
see that SCSI/FC prices are a big problem (these are real list prices
from a tier-1 vendor who marks up their SATA by an embarrassing
amount...) further, there's absolutely no chance we could ever keep
.5 PB of disks busy at 100% duty cycle, so that's not a reason to buy
SCSI/FC either...

regards, mark hahn.
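
[To spell out the arithmetic behind the "hours" column and the AFR
figure above - rough numbers only, since the MTBFs themselves are
guesses:

   expected time between failures across a pool  =  MTBF / n
       1e6 hours   / 1920 disks  ~=  520 hours  ~=  22 days
       1.3e6 hours / 6575 disks  ~=  198 hours  ~=   8 days

   annualized failure rate per drive  ~=  8760 / MTBF
       8760 / 1.2e6  ~=  0.0073  ~=  0.73% per drive per year  ]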

* Re: Good news / bad news - The joys of RAID
From: David Greaves @ 2004-11-20 19:40 UTC
To: Mark Hahn; +Cc: linux-raid

Mark Hahn wrote:
> you imply that Maxtor drives are somehow inherently flawed.
> can you explain why you think millions of people/companies
> are naive idiots for continuing to buy Maxtor disks?

Yeah, you're right.

Of course - the fact that 2 of *my* 6 Maxtor 250Gb SATA drives (3 year
warranty) date stamped at various times in 2004 have failed is
coincidence and should, of course, be expected with an MTBF of
millions of hours.

Oh, please note I'm not Robin - that must be a coincidence too :)

Personally I'm waiting for the revelation that they are recycled IBM
Deskstar 70's ;)

I take your point about supply chain though - anything that's shipped
by courier is suspect.

David

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-21 4:33 UTC
To: 'David Greaves', 'Mark Hahn'; +Cc: linux-raid

You said: "anything that's shipped by courier is suspect."

Humm, the way the drives are packed you would have a hard time
exceeding 300Gs. Even UPS can't do that I bet. But I must admit, I
have no idea what force a drive would "feel" in a 4 foot drop.
Remember, the drive is packed very well! Also, they only refer to
2 ms. So, no idea if that is equal to 150 Gs for 4 ms, or 75 Gs for
8 ms.

From the spec of a 300GB Maxtor drive:

Reliability
- Shock Tolerance: 60Gs @ 2 ms half-sine pulse (Operating),
  300Gs @ 2 ms half-sine pulse (Non-operating)
- Data Error Rate: < 1/10E15 bits read (Non-recoverable)
- MTBF: 1000000 Hours

Guy
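
[Rough numbers for the 4 foot drop question - back-of-envelope only,
and the stopping distances are assumptions:

   impact speed from 4 ft (~1.2 m):   v = sqrt(2 x 9.8 x 1.2)   ~=  4.9 m/s
   stopped by ~5 cm of packing foam:  a = v^2 / (2 x 0.05)      ~=  240 m/s^2  ~=  25 Gs
   stopped by ~2 mm on a bare floor:  a = v^2 / (2 x 0.002)     ~= 6000 m/s^2  ~= 600 Gs

so a well-packed drive stays comfortably inside the 300G non-operating
rating, and an unpacked one doesn't.]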

* Re: Good news / bad news - The joys of RAID
From: berk walker @ 2004-11-21 1:01 UTC
To: Mark Hahn; +Cc: linux-raid

ALL of the Maxtor junk that I have sitting next to me was in factory
packaging, and not likely to have been affected by either physical or
electrical shock.

HE might have implied, I am saying it! Why ask someone as you did in
sentence #2? Ask them - or yourself.

Of course, he probably missed the warranty statement to not run Linux.

Mark Hahn wrote:
> you imply that Maxtor drives are somehow inherently flawed.
> can you explain why you think millions of people/companies
> are naive idiots for continuing to buy Maxtor disks?

* Re: Good news / bad news - The joys of RAID
From: H. Peter Anvin @ 2004-11-23 19:10 UTC
To: linux-raid

Followup to: <Pine.LNX.4.44.0411201238320.19120-100000@coffee.psychology.mcmaster.ca>
By author: Mark Hahn <hahn@physics.mcmaster.ca>
In newsgroup: linux.dev.raid

> > Never buy Maxtor drives again!
>
> you imply that Maxtor drives are somehow inherently flawed.
> can you explain why you think millions of people/companies
> are naive idiots for continuing to buy Maxtor disks?
>
> this sort of thing is just not plausible: Maxtor competes
> with the other top-tier disk vendors with similar products
> and prices and reliability.

In my experience, that is bullshit. Maxtor competes on price using
inferior products.

I bought two Maxtor drives, both of them failed within 13 months. That
was my first attempt at trying Maxtor again after taking them off my
sh*tlist from last time.

	-hpa

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-23 20:03 UTC
To: 'H. Peter Anvin', linux-raid

When will you learn? :)

> I bought two Maxtor drives, both of them failed within 13 months.
> That was my first attempt at trying Maxtor again after taking them
> off my sh*tlist from last time.

* RE: Good news / bad news - The joys of RAID
From: Mark Hahn @ 2004-11-23 21:18 UTC
To: Guy; +Cc: 'H. Peter Anvin', linux-raid

> When will you learn? :)

exactly - you can conclude absolutely nothing from two samples.

* Re: Good news / bad news - The joys of RAID
From: Robin Bowes @ 2004-11-23 23:02 UTC
To: linux-raid

Mark Hahn wrote:
>> When will you learn? :)
>
> exactly - you can conclude absolutely nothing from two samples.

I read that mail as "I stopped buying Maxtor (for whatever reason)
then tried them again and had a 100% failure rate (albeit with a small
sample size) so have stopped buying them again" rather than "I bought
two Maxtor drives that failed so Maxtor drives are shit".

My own personal experience (I'm the OP in this thread) is that the
250GB SATA Maxtor MaxLine II drives I have purchased have an
unacceptable failure rate (something like 40% in 5 months).

R.
--
http://robinbowes.com

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-24 0:33 UTC
To: 'Robin Bowes', linux-raid

I understood! I was poking fun that you tried them again, and again
lost! I hope you understood me. "When will you learn? :)"

Also, I thought of this about 4 years ago. Describes many managers!
"Sure you saved money, but at what cost?" - Guy Watkins

* Re: Good news / bad news - The joys of RAID
From: berk walker @ 2004-11-24 1:45 UTC
To: Mark Hahn; +Cc: Guy, 'H. Peter Anvin', linux-raid

I think I have 4 1/2 out of 6. Better?

Mark Hahn wrote:
>> When will you learn? :)
>
> exactly - you can conclude absolutely nothing from two samples.

* Re: Good news / bad news - The joys of RAID
From: H. Peter Anvin @ 2004-11-24 2:00 UTC
To: berk walker; +Cc: Mark Hahn, Guy, linux-raid

berk walker wrote:
> I think I have 4 1/2 out of 6. Better?
>
> Mark Hahn wrote:
>> exactly - you can conclude absolutely nothing from two samples.

Actually, you can. Having two fail in short order should be an
extremely rare event.

	-hpa

* Good news / bad news - The joys of hardware
From: Guy @ 2004-11-24 8:01 UTC
Cc: linux-raid

About 2 years ago I had a disk fail, not 100%, but intermittent
problems. So I replaced it. The replacement started acting up about
6-12 months ago. Read errors about every 1-2 months, finally it went
off-line. But, intermittently. I did think it was odd that the drive
in the same position was failing, and with similar problems, but
figured it was just a quincidence.

Today I replaced it; after replacing it, I had some problems. It is in
a case with 6 other disks, so I could tell by the LEDs that the
replacement drive was acting wrong, intermittently. I determined that
the Molex power plug going to the drive was causing the problems. What
a pain! So, the 2 drives that I replaced may have been good. The first
drive I took apart. I have the magnets to prove it! But it may have
been a good drive!

To make a long story short, check the cables for failures, including
the power cables.

The drives are Seagate, and I have at least 26 in service, so 2
failures out of 26 in 3 years is not so bad. However, if the Molex
connector was at fault, then 0 failures out of 26 in 3 years is just
fine. The drive is model ST118282LC, MTBF 1,000,000 hours. I think
with 26 drives I should have 1 failure in about 4.4 years. The drives
have a 5 year warranty, but they are OEM, so I get nothing. I am not
the first owner, but they were unused. And I bet they are about 5
years old now.

Too much info? Sorry. Maybe I need a blog? :)

Can anyone spell "quincidence"?

Guy

* Re: Good news / bad news - The joys of hardware
From: Robin Bowes @ 2004-11-24 8:57 UTC
To: Guy; +Cc: linux-raid

Guy wrote:
> Can anyone spell "quincidence"?

http://dictionary.reference.com/search?q=coincidence

R.
--
http://robinbowes.com

* RE: Good news / bad news - The joys of RAID
From: Guy @ 2004-11-19 21:42 UTC
To: 'Robin Bowes', linux-raid

The re-sync to the spare should have been automatic, without a
re-boot.

Your errors related to ata timeout are not a Linux issue. My guess is
the bios could see the drive, but the drive was not responding
correctly. I think this is life with ata. I have had similar problems
with SCSI. 1 drive failed in a way that it caused problems with other
drives on the same SCSI bus.

It could be that your array was re-building, but did not finish. In
that case it would start over from the beginning. Which may look like
it did not attempt to re-build until the re-boot. Did you check the
status before you shut it down?

I use mdadm's monitor mode to send me email when events occur. By the
time I read my emails, a drive has failed and the re-sync to the spare
is done. No need to check logs.

Yes, it is normal that md will not re-sync 2 arrays that share a
common device. One will be delayed until the other finishes.

Second reminder.... Never buy Maxtor drives again!

This quote seems to fit real well!
"Sure you saved money, but at what cost?" - Guy Watkins

Guy
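
[For anyone wanting to copy the setup: roughly what a monitor
configuration looks like - treat this as a sketch, since the exact
flags vary a little between mdadm versions and the mail address is
obviously a placeholder:

   # /etc/mdadm.conf
   MAILADDR admin@example.com
   ARRAY /dev/md5 UUID=a4bbcd09:5e178c5b:3bf8bd45:8c31d2a1

   # started once at boot, e.g. from an init script
   mdadm --monitor --scan --daemonise --delay=300

mdadm then sends mail when it sees an event such as a failed drive or
a finished rebuild on any of the arrays it knows about.]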

* Re: Good news / bad news - The joys of RAID
From: Robin Bowes @ 2004-11-28 13:15 UTC
To: Guy; +Cc: linux-raid

Guy wrote:
> I use mdadm's monitor mode to send me email when events occur.

Guy,

I've been meaning to write this for a while...

I tried monitoring once but had a problem when shutting down as the
arrays were reported as "busy" because mdadm --monitor was running on
them. I guess it needs to be killed earlier in the shutdown process.

So, can you share with me how you start/stop mdadm to run in monitor
mode?

Thanks,

R.
--
http://robinbowes.com

* Re: Good news / bad news - The joys of RAID
From: Neil Brown @ 2004-11-30 2:05 UTC
To: Robin Bowes; +Cc: Guy, linux-raid

On Sunday November 28, robin-lists@robinbowes.com wrote:
> I tried monitoring once but had a problem when shutting down as the
> arrays were reported as "busy" because mdadm --monitor was running
> on them. I guess it needs to be killed earlier in the shutdown
> process.

That bug was fixed in mdadm 1.6.0

NeilBrown

From the ChangeLog:
Changes Prior to 1.6.0 release
    ...
    - Fix bug in --monitor where an array could be held open and so
      could not be stopped without killing mdadm.
    ...

* Re: Good news / bad news - The joys of RAID
From: Doug Ledford @ 2004-12-01 3:34 UTC
To: Neil Brown; +Cc: Robin Bowes, Guy, linux-raid

On Tue, 2004-11-30 at 13:05 +1100, Neil Brown wrote:
> That bug was fixed in mdadm 1.6.0
>
> From the ChangeLog:
> Changes Prior to 1.6.0 release
>     - Fix bug in --monitor where an array could be held open and so
>       could not be stopped without killing mdadm.

If I recall correctly, this fixes the primary symptom, but not the
whole problem. When in --monitor mode, mdadm will reopen each device
every 15 seconds to scan its status. As such, a shutdown could still
fail if mdadm is still running and the timing is right. In that
instance, retrying the shutdown on failure would likely be enough to
solve the problem, but that sounds icky to me. Would be much better if
mdadm could open a control device of some sort and query about running
arrays instead of opening the arrays themselves.

--
Doug Ledford <dledford@redhat.com>
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
* Re: Good news / bad news - The joys of RAID 2004-12-01 3:34 ` Doug Ledford @ 2004-12-01 11:50 ` Robin Bowes 0 siblings, 0 replies; 50+ messages in thread From: Robin Bowes @ 2004-12-01 11:50 UTC (permalink / raw) To: Doug Ledford; +Cc: Neil Brown, Guy, linux-raid Doug Ledford wrote: > > If I recall correctly, this fixes the primary symptom, but not the whole > problem. When in --monitor mode, mdadm will reopen each device every 15 > seconds to scan its status. As such, a shutdown could still fail if > mdadm is still running and the timing is right. In that instance, > retrying the shutdown on failure would likely be enough to solve the > problem, but that sounds icky to me. Would be much better if mdadm > could open a control device of some sort and query about running arrays > instead of opening the arrays themselves. Wouldn't simply killing the "mdadm --monitor" process early on in the shutdown process achieve the same result? R. -- http://robinbowes.com ^ permalink raw reply [flat|nested] 50+ messages in thread
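That ordering is easy enough to arrange on a SysV-style init: it is just a matter of where the monitor's K-link sorts relative to the scripts that stop the arrays. The script name and numbers below are illustrative and vary by distribution:

    # stop stanza of a small init script for the monitor
    stop() {
        killall -q mdadm
    }
    # link it early in the halt and reboot runlevels:
    #   ln -s ../init.d/mdadm-monitor /etc/rc0.d/K05mdadm-monitor
    #   ln -s ../init.d/mdadm-monitor /etc/rc6.d/K05mdadm-monitor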
* Re: Good news / bad news - The joys of RAID 2004-11-19 21:06 Good news / bad news - The joys of RAID Robin Bowes 2004-11-19 21:28 ` Guy 2004-11-19 21:42 ` Good news / bad news - The joys of RAID Guy @ 2004-11-19 21:58 ` Gordon Henderson 2 siblings, 0 replies; 50+ messages in thread From: Gordon Henderson @ 2004-11-19 21:58 UTC (permalink / raw) To: Robin Bowes; +Cc: linux-raid On Fri, 19 Nov 2004, Robin Bowes wrote: > What actually happened was that I rebooted to activate a new kernel and > the box didn't come back up. As the machine runs headless, I had to > power it off and take it to a monitor/keyboard to check it. Not directly related to your RAID issue, but I've been running headless servers with console on serial ports as of late. LILO has an option to put output on a serial line, and there's a kernel compile flag and an append instruction to make it all work. That combined with a power cycler makes me feel more at ease about the remote servers I run. Just don't connect 2 PCs back to back and run a getty on each serial line... Gordon ^ permalink raw reply [flat|nested] 50+ messages in thread
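The LILO and kernel pieces Gordon mentions look roughly like this; a sketch assuming a 2.4-era kernel built with serial console support (CONFIG_SERIAL_CONSOLE) and the first serial port at 9600 baud:

    # /etc/lilo.conf
    serial=0,9600n8                             # LILO's own prompt on ttyS0
    append="console=ttyS0,9600 console=tty0"    # kernel messages on serial and VGA
    # and a login on the serial line, e.g. in /etc/inittab:
    #   T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100

Run lilo again after editing, and the next boot, boot loader prompt included, is reachable over a null-modem cable.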
* RE: Good news / bad news - The joys of RAID [not found] <037401c4cf3b$ee75bc90$030a0a0a@musicroom> @ 2004-11-21 4:33 ` Guy 2004-11-22 14:13 ` Yu Chen 0 siblings, 1 reply; 50+ messages in thread From: Guy @ 2004-11-21 4:33 UTC (permalink / raw) To: 'Mark Klarzynski', linux-raid Humm, the Maxtor spec I am looking at does not limit the duty cycle. It makes no reference at all. I think it is reasonable to assume 24 hours per day, unless they claim less. The drive should fail on average of once per 114 years, but end of life is 3-5 years? I did find this on the Maxtor web site: No MTBF, but ARR of <1%. I think they are saying if I had 100 drives less than 1 failure per year. That is a MTFB of more than 100 years. Design life (min) 5 years. So, the disk should last al least 5 years. I have no problem with this. If this is running time, not time powered off. No limits on duty cycle listed, so got to assume 24/7. So, if I had 100 disks that lasted at least 5 years with less than 1 failure per year... I would be happy. After all, in 5 years I could replace the 100 drives with 6 new drives with the same total capacity. This is based on drive size doubling every 1.5 years. Of course my requirements double every year! :) http://maxtor.com/_files/maxtor/en_us/documentation/data_sheets/diamondmax_1 0_data_sheet.pdf Now if someone made an affordable tape drive and tapes that could backup 200G per tape, that would be cool! Guy -----Original Message----- From: Mark Klarzynski [mailto:mark.k@computer-design.co.uk] Sent: Saturday, November 20, 2004 3:03 PM To: 'Guy' Subject: RE: Good news / bad news - The joys of RAID MTBF is statistic based upon the expected 'use' of the drive and the replacement of the drive after its end of life (3-5 years)... It's extremely complex and boring but the figure is only relative if the drive is being used within an environment that matches those of the calculations. SATA / IDE drives have an MTBF similar to that of SCSI / Fibre. But this is based upon their expected use... i.e. SCSI used to be [power on hours = 24hr] [use = 8 hours].. whilst SATA used to be [power on = 8 hours] and [use = 20 mins]. Regardless of what some people clam (usually those that only sell sata based raids), the drives are not constructed the same in any way. SATA's fail more within a raid environment (probably around 10:1) because of the heavy use and also because they are not as intelligent... therefore when they do not respond we have no way of interrogating them or resetting them, whilst with scsi we do both. This means that a raid controller / driver has no option to but simply fail the drive. Maxtor lead the way in capacity and also reliability... I personal had to recall countless earlier IBMs and replace them with maxtor. But the new generation of IBM's (Hitachi) have got it together. So - I guess you are all right :) -----Original Message----- From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Guy Sent: 20 November 2004 19:38 To: 'Mark Hahn'; linux-raid@vger.kernel.org Subject: RE: Good news / bad news - The joys of RAID I have had far more failures of Maxtor drives than any other. I have also had problems with WD drives. I know someone that had 4-6 IBM disks, most of which have failed. I am talking about disks with 3 year warranties! Based on the spec. But OEM disks have none. You must return them to the PC manufacture. Most of my failures were within 3 years, but beyond the warranty period of the system. So the OEM issue has occurred too often. 
I have had good luck with Seagate. I use RAID, it is a must with the failure rate! I do backup also, but RAID tends to save me. Most people have a PC with 1 disk. I don't understand RAID, and they don't understand that everything will be lost if the disk breaks! They think "Dell will just fix it". But wrong, Dell will just replace it! Big difference. Today's disks claim a MTBF of about 1,000,000 hours! That's about 114 years. So, if I had 10 disks I should expect 1 failure every 11.4 years. That would be so cool! But not in the real world. Can you explain how the disks have a MTBF of 1,000,000 hours? But fail more often than that? Maybe I just don't understand some aspect of MTBF. Guy -----Original Message----- From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Mark Hahn Sent: Saturday, November 20, 2004 1:43 PM To: linux-raid@vger.kernel.org Subject: RE: Good news / bad news - The joys of RAID > Never buy Maxtor drives again! you imply that Maxtor drives are somehow inherently flawed. can you explain why you think millions of people/companies are naive idiots for continuing to buy Maxtor disks? this sort of thing is just not plausible: Maxtor competes with the other top-tier disk vendors with similar products and prices and reliability. yes, if you buy a 1-year disk, you can expect it to have been less carefully tested, possibly be of lower-end design and reliability, and to have been handle more poorly by the supply chain. thankfully, you don't have to buy 1-year disks any more. read the specs. make sure your supply chain knows how to handle disks. make sure your disks are mounted correctly, both mechanically and with enough airflow. use raid and some form of archiving/backups. don't get hung up on which of the 4-5 top-tier vendors makes your disk. - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 50+ messages in thread
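For anyone who wants to plug in their own numbers, the arithmetic behind the 114-year and 11.4-year figures is just hours-per-year over MTBF; a quick sketch:

    awk 'BEGIN {
        mtbf = 1000000      # hours, as quoted on the spec sheet
        printf "per-drive annualised failure rate : %.2f%%\n", 8760 / mtbf * 100
        printf "one failure roughly every %.1f years across 10 drives\n", mtbf / 8760 / 10
    }'

Which is the same claim as Maxtor's "<1% ARR", just in different units. The catch is that MTBF/ARR is a population statistic taken over the drive's few-year design life, not a promise that an individual drive lasts anywhere near 114 years, which is why both figures can be true and still feel contradictory.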
* RE: Good news / bad news - The joys of RAID 2004-11-21 4:33 ` Guy @ 2004-11-22 14:13 ` Yu Chen 2004-11-22 14:34 ` Gordon Henderson 2004-11-23 0:17 ` berk walker 0 siblings, 2 replies; 50+ messages in thread From: Yu Chen @ 2004-11-22 14:13 UTC (permalink / raw) To: Guy; +Cc: 'Mark Klarzynski', linux-raid > Now if someone made an affordable tape drive and tapes that could backup > 200G per tape, that would be cool! > You don't know? they have that already, AIT-4, LTO as I know. =========================================== Yu Chen Howard Hughes Medical Institute Chemistry Building, Rm 182 University of Maryland at Baltimore County 1000 Hilltop Circle Baltimore, MD 21250 phone: (410)455-6347 (primary) (410)455-2718 (secondary) fax: (410)455-1174 email: chen@hhmi.umbc.edu =========================================== ^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: Good news / bad news - The joys of RAID
  2004-11-22 14:13 ` Yu Chen
@ 2004-11-22 14:34 ` Gordon Henderson
  2004-11-22 17:51 ` Guy
  0 siblings, 1 reply; 50+ messages in thread
From: Gordon Henderson @ 2004-11-22 14:34 UTC (permalink / raw)
To: linux-raid

On Mon, 22 Nov 2004, Yu Chen wrote:

> > Now if someone made an affordable tape drive and tapes that could backup
> > 200G per tape, that would be cool!
>
> You don't know? they have that already, AIT-4, LTO as I know.

I think the key word here was "affordable".

I use DLT drives, which I think go up to 220GB native right now (I'm only
currently using 160GB native drives), but right now the cost of media at
about £60 each is about the same as a 160GB IDE drive. Easier to manage
though, and the cost of the tape drive is still round about £3500. But how
valuable is your data? (As I keep telling my clients!!!)

I've tried to build servers that have a max. capacity of 200GB per
partition, but I have clients chomping at the bit for bigger partitions,
and then it becomes a PITA to back up to tape.

I don't think the requirement for tape backup is going to go away in the
near future, anyway. I just wish tape technology would keep up with disk
technology. RAID is great, but it's not for archive and backup.

Gordon

^ permalink raw reply	[flat|nested] 50+ messages in thread
* RE: Good news / bad news - The joys of RAID 2004-11-22 14:34 ` Gordon Henderson @ 2004-11-22 17:51 ` Guy 2004-11-22 23:26 ` Gordon Henderson 0 siblings, 1 reply; 50+ messages in thread From: Guy @ 2004-11-22 17:51 UTC (permalink / raw) To: 'Gordon Henderson', linux-raid Yes, I was going for affordable! A tape drive with native capacity of 160 Gig costs over $2600 US (SDLT). And tapes cost $89 each. You need to do a lot of backups before tapes cost less than an IDE disk. An IDE disk is so much faster too. The best price I could find for a 160Gig ultra 100 was $107 Hitachi A Hitachi 160 Gig SATA disk is $113. SDLT tapes cost $89 each (10 for $890) I am sure you could get a quantity discount on tapes, but disk drives too. Now we just need to be able to hot plug ultra 100 disk drives. SATA hardware supports hot plug, but I read Linux does not support that yet. I do want to be able to remove my backup and put it in the shelf. A business should have 2 copies where one goes off site. I did have a power supply fail in a way that it fried everything in the box. I think line voltage was send directly to the 12V or 5V line. DVD drive, disk drive, motherboard, RAM, video card, ... all gone. So if my backups were on-line with the same power supply as the main disk(s), all would have been lost. Some people seem to think tape is better than disk. Somehow since there is no filesystem, so you can't delete a file by mistake. So, fine, just use the disk drive the same way. Use cpio and output to /dev/hda or similar. The only thing tapes have that is better than disk drives is the eof and eot marks. I can put 10-20 daily backups on the same tape and let the hardware track the position of each backup. With disk, you would need to count the blocks used, and track the start and length of each. Or you could use a file system, but like I said, some people seem to think that has too much risk. Guy -----Original Message----- From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Gordon Henderson Sent: Monday, November 22, 2004 9:35 AM To: linux-raid@vger.kernel.org Subject: RE: Good news / bad news - The joys of RAID On Mon, 22 Nov 2004, Yu Chen wrote: > > Now if someone made an affordable tape drive and tapes that could backup > > 200G per tape, that would be cool! > > You don't know? they have that already, AIT-4, LTO as I know. I think the key-word here was "affordable" I use DLT drives, which I think go up to 220GB native right now, ('m only currently using 160GB native drives), but right now the cost of media at about £60 each is about the same as a 160GB IDE drive.. Easier to manage though, and the cost of the tape drive is still round about £3500. But how valuable is your data? (As I keep telling my clients!!!) I've tried to build servers that have a max. capacity of 200GB per partition, but I have clients chomping at the bit for bigger partitions, thn it becomes a PITA to backup to tape. I don't think the requirement for tape backup is going to go away in the near future, anyway. I just wish tape technology would keep up with disk technology. RAID is great, but it's not for archive and backup. 
Gordon - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 50+ messages in thread
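Putting a rough number on "a lot of backups", with the street prices quoted above (about $2600 for the drive, $89 per SDLT tape, $107 per 160GB IDE disk) and the simplifying assumption that each full backup gets its own piece of media:

    awk 'BEGIN {
        drive = 2600; tape = 89; disk = 107
        printf "tape saves $%d of media per full backup\n", disk - tape
        printf "the drive pays for itself after about %d full backups\n", drive / (disk - tape)
    }'

Call it roughly 144 full backups, ignoring reuse, quantity discounts and drive wear; close to three years of weekly fulls before SDLT wins on cost alone.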
* RE: Good news / bad news - The joys of RAID 2004-11-22 17:51 ` Guy @ 2004-11-22 23:26 ` Gordon Henderson 2004-11-22 23:48 ` Guy 0 siblings, 1 reply; 50+ messages in thread From: Gordon Henderson @ 2004-11-22 23:26 UTC (permalink / raw) To: Guy; +Cc: linux-raid On Mon, 22 Nov 2004, Guy wrote: > Yes, I was going for affordable! A tape drive with native capacity of 160 > Gig costs over $2600 US (SDLT). And tapes cost $89 each. You need to do a > lot of backups before tapes cost less than an IDE disk. An IDE disk is so > much faster too. True (on the speed side) Although right now it's only just over 2 hours to dump ~200GB on one of the servers I look after. I can see a time where the only real solution is a combined disk/tape system - right now, I'm taking a snapshot overnight off some servers, then backing up from that - that at least gives the punters a "yesterday" snapshot which is great for those "accidental" deletions where getting stuff off tape might take 4-5 hours. Using rsync, or LVM, you can even make multiple days of snapshots. (Although I'm not sure about LVM even now, after having some problems with it causing crashes, and very slow performance after snapshots had been taken, maybe it's time to look at it again though) > The best price I could find for a 160Gig ultra 100 was $107 Hitachi > A Hitachi 160 Gig SATA disk is $113. > > SDLT tapes cost $89 each (10 for $890) > > I am sure you could get a quantity discount on tapes, but disk drives too. > > Now we just need to be able to hot plug ultra 100 disk drives. > SATA hardware supports hot plug, but I read Linux does not support that yet. I've had good results with SCSI hot pluggability and with a FireWire drive where the underlying hardware uses the SCSI stack, also with USB mass storage devices which look like SCSI drives (eg. my digital camera!) So-far I've just used a little script to do the echo "scsi-hot-add 0 0 1 0" > /proc/scsi, etc. then mount /dev/sda1 and so on. I'm hoping that SATA using the SCSI stack will be able to do this too, but I'm hearing mutterings about problems with the device numbers, but so-far I've not had any problems myself... So in that respect, going SCSI, or things that look like SCSI drives might be the way to go... > I do want to be able to remove my backup and put it in the shelf. A > business should have 2 copies where one goes off site. I did have a power > supply fail in a way that it fried everything in the box. I think line > voltage was send directly to the 12V or 5V line. DVD drive, disk drive, > motherboard, RAM, video card, ... all gone. So if my backups were on-line > with the same power supply as the main disk(s), all would have been lost. Ouch. I've not had anythng this bad, (yet?) Different businesses have different ideas about backup and archive (and there are legal implications too for some companies) One of my clients is a small web design house - their in-house server gets backed up to a firewire drive ("lacie" I think the brand is) once a week, as well as a daily snapshot on-line, and is remote backed up over the net to one of my servers, they have 2 other servers for their client web sites which I manage and I back these up to each other overnight - not perfect, but usable, and as these are 200 miles away from me, I need these to be as reliable as possible within the money restaints put upon me by my client (mutter) > Some people seem to think tape is better than disk. Somehow since there is > no filesystem, so you can't delete a file by mistake. 
So, fine, just use > the disk drive the same way. Use cpio and output to /dev/hda or similar. I actually use 'dump' to a file on their removable firewire drive which is formatted ext2 - they have a 120GB drive and only 20GB of live data, so plenty of room for multiple backups - all on the same drive... I'm going to set them up with 'amanda' soon to try to automate it. I've used amanda for many years no - PITA to setup, but once going, it's very good (with tapes, anyway - I'm not actually sure I'll be able to get it to backup to individual files on the single drive) > The only thing tapes have that is better than disk drives is the eof and eot > marks. I can put 10-20 daily backups on the same tape and let the hardware > track the position of each backup. With disk, you would need to count the > blocks used, and track the start and length of each. Or you could use a > file system, but like I said, some people seem to think that has too much > risk. I haven't found anything that beats tapes for ease of handling (physical stacking and storage in nice boxes) and archiving. I have DLT tapes that are 5 years old now that still read - the real problem with archiving is a good management system, as well as realising the fact that nothing lasts forever, so at some point you have to take those old tapes, read them back onto disk and re-write them using the current technology, and hope the current technology will still be about in 5 years time when you do it again... (The good side is that densities have improved immensely, so long-term storage costs ought to decrease...) Gordon ^ permalink raw reply [flat|nested] 50+ messages in thread
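The rsync flavour of the multi-day snapshot idea is usually done with hard links, so unchanged files cost no extra space; a sketch with made-up paths, needing a reasonably recent rsync for --link-dest:

    #!/bin/sh
    # keep a week of rotating snapshots of /data under /backup
    rm -rf /backup/daily.6
    for i in 5 4 3 2 1 0; do
        [ -d /backup/daily.$i ] && mv /backup/daily.$i /backup/daily.$((i+1))
    done
    rsync -a --delete --link-dest=/backup/daily.1 /data/ /backup/daily.0/

Restores are then plain file copies from whichever daily.N is wanted, which covers the "accidental deletion" case without touching tape.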
* RE: Good news / bad news - The joys of RAID 2004-11-22 23:26 ` Gordon Henderson @ 2004-11-22 23:48 ` Guy 2004-11-23 0:09 ` Måns Rullgård 2004-11-23 15:33 ` Gordon Henderson 0 siblings, 2 replies; 50+ messages in thread From: Guy @ 2004-11-22 23:48 UTC (permalink / raw) To: 'Gordon Henderson'; +Cc: linux-raid Amanda... I looked into this about 2 years ago. From what I found, each daily backup used a different tape. This is crazy! I can put 10-20 days on 1 tape. Maybe more, not really sure. Of course it is based on how much data changes each day. So, my full backups are only needed about once every 2-3 weeks. Since Amanda uses up too many tapes, I use a home grown set of scripts that maintain the tape position and use cpio for the backup. Do you know if the above is true about Amanda? About tape age. I know of a system that has DLT tapes that are over 7 years old. They have 21 tape drives total in 7 tape juke boxes. No idea about the number of tapes, but well over 4000. These very in age from less than 1 year to over 7 years old. They also have 2 copies of all data. So, if a tape fails, just find the copy. Also make a new copy to maintain 2 copies. Guy -----Original Message----- From: Gordon Henderson [mailto:gordon@drogon.net] Sent: Monday, November 22, 2004 6:27 PM To: Guy Cc: linux-raid@vger.kernel.org Subject: RE: Good news / bad news - The joys of RAID On Mon, 22 Nov 2004, Guy wrote: > Yes, I was going for affordable! A tape drive with native capacity of 160 > Gig costs over $2600 US (SDLT). And tapes cost $89 each. You need to do a > lot of backups before tapes cost less than an IDE disk. An IDE disk is so > much faster too. True (on the speed side) Although right now it's only just over 2 hours to dump ~200GB on one of the servers I look after. I can see a time where the only real solution is a combined disk/tape system - right now, I'm taking a snapshot overnight off some servers, then backing up from that - that at least gives the punters a "yesterday" snapshot which is great for those "accidental" deletions where getting stuff off tape might take 4-5 hours. Using rsync, or LVM, you can even make multiple days of snapshots. (Although I'm not sure about LVM even now, after having some problems with it causing crashes, and very slow performance after snapshots had been taken, maybe it's time to look at it again though) > The best price I could find for a 160Gig ultra 100 was $107 Hitachi > A Hitachi 160 Gig SATA disk is $113. > > SDLT tapes cost $89 each (10 for $890) > > I am sure you could get a quantity discount on tapes, but disk drives too. > > Now we just need to be able to hot plug ultra 100 disk drives. > SATA hardware supports hot plug, but I read Linux does not support that yet. I've had good results with SCSI hot pluggability and with a FireWire drive where the underlying hardware uses the SCSI stack, also with USB mass storage devices which look like SCSI drives (eg. my digital camera!) So-far I've just used a little script to do the echo "scsi-hot-add 0 0 1 0" > /proc/scsi, etc. then mount /dev/sda1 and so on. I'm hoping that SATA using the SCSI stack will be able to do this too, but I'm hearing mutterings about problems with the device numbers, but so-far I've not had any problems myself... So in that respect, going SCSI, or things that look like SCSI drives might be the way to go... > I do want to be able to remove my backup and put it in the shelf. A > business should have 2 copies where one goes off site. 
I did have a power > supply fail in a way that it fried everything in the box. I think line > voltage was send directly to the 12V or 5V line. DVD drive, disk drive, > motherboard, RAM, video card, ... all gone. So if my backups were on-line > with the same power supply as the main disk(s), all would have been lost. Ouch. I've not had anythng this bad, (yet?) Different businesses have different ideas about backup and archive (and there are legal implications too for some companies) One of my clients is a small web design house - their in-house server gets backed up to a firewire drive ("lacie" I think the brand is) once a week, as well as a daily snapshot on-line, and is remote backed up over the net to one of my servers, they have 2 other servers for their client web sites which I manage and I back these up to each other overnight - not perfect, but usable, and as these are 200 miles away from me, I need these to be as reliable as possible within the money restaints put upon me by my client (mutter) > Some people seem to think tape is better than disk. Somehow since there is > no filesystem, so you can't delete a file by mistake. So, fine, just use > the disk drive the same way. Use cpio and output to /dev/hda or similar. I actually use 'dump' to a file on their removable firewire drive which is formatted ext2 - they have a 120GB drive and only 20GB of live data, so plenty of room for multiple backups - all on the same drive... I'm going to set them up with 'amanda' soon to try to automate it. I've used amanda for many years no - PITA to setup, but once going, it's very good (with tapes, anyway - I'm not actually sure I'll be able to get it to backup to individual files on the single drive) > The only thing tapes have that is better than disk drives is the eof and eot > marks. I can put 10-20 daily backups on the same tape and let the hardware > track the position of each backup. With disk, you would need to count the > blocks used, and track the start and length of each. Or you could use a > file system, but like I said, some people seem to think that has too much > risk. I haven't found anything that beats tapes for ease of handling (physical stacking and storage in nice boxes) and archiving. I have DLT tapes that are 5 years old now that still read - the real problem with archiving is a good management system, as well as realising the fact that nothing lasts forever, so at some point you have to take those old tapes, read them back onto disk and re-write them using the current technology, and hope the current technology will still be about in 5 years time when you do it again... (The good side is that densities have improved immensely, so long-term storage costs ought to decrease...) Gordon ^ permalink raw reply [flat|nested] 50+ messages in thread
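For what it's worth, the tape-position bookkeeping described above usually comes down to a counter plus the non-rewinding device; a rough sketch of the idea, with an illustrative device name and file list:

    # append tonight's backup as tape file number N (N kept in a state file)
    N=$(cat /var/backups/tapepos)
    mt -f /dev/nst0 rewind
    [ "$N" -gt 0 ] && mt -f /dev/nst0 fsf "$N"    # skip the backups already on tape
    find /home -depth -print | cpio -o -H crc -B > /dev/nst0
    echo $((N + 1)) > /var/backups/tapepos
    mt -f /dev/nst0 offline
    # to restore backup K later: rewind, "mt -f /dev/nst0 fsf K", then cpio -i -d -B < /dev/nst0

The tape driver writes the filemark when the device is closed after a write, so the next run can position itself purely from the counter.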
* Re: Good news / bad news - The joys of RAID 2004-11-22 23:48 ` Guy @ 2004-11-23 0:09 ` Måns Rullgård 2004-11-23 15:33 ` Gordon Henderson 1 sibling, 0 replies; 50+ messages in thread From: Måns Rullgård @ 2004-11-23 0:09 UTC (permalink / raw) To: linux-raid "Guy" <bugzilla@watkins-home.com> writes: > Amanda... > I looked into this about 2 years ago. From what I found, each daily backup > used a different tape. This is crazy! I can put 10-20 days on 1 tape. > Maybe more, not really sure. Of course it is based on how much data changes > each day. So, my full backups are only needed about once every 2-3 weeks. Using several tapes, switching every day (or however often you make backups), is a good idea. If the tapes can hold more than one backup, just keep adding to the oldest tape when all have been used once. That way, if the system explodes during a backup, it won't take the most recent backup with it. -- Måns Rullgård mru@inprovide.com - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: Good news / bad news - The joys of RAID 2004-11-22 23:48 ` Guy 2004-11-23 0:09 ` Måns Rullgård @ 2004-11-23 15:33 ` Gordon Henderson 1 sibling, 0 replies; 50+ messages in thread From: Gordon Henderson @ 2004-11-23 15:33 UTC (permalink / raw) To: Guy; +Cc: linux-raid On Mon, 22 Nov 2004, Guy wrote: > Amanda... > I looked into this about 2 years ago. From what I found, each daily backup > used a different tape. This is crazy! I can put 10-20 days on 1 tape. > Maybe more, not really sure. Of course it is based on how much data changes > each day. So, my full backups are only needed about once every 2-3 weeks. > > Since Amanda uses up too many tapes, I use a home grown set of scripts that > maintain the tape position and use cpio for the backup. > > Do you know if the above is true about Amanda? Yes. Amanda uses one tape per backup. It writes a label at the start of every tape to make sure it's writing the backup to the right tape. With your system, if you lose one tape, you lose a lot of backups (however, I have a client who uses one removable disk and I store multiple backups on that disk)... > About tape age. I know of a system that has DLT tapes that are over 7 years > old. They have 21 tape drives total in 7 tape juke boxes. No idea about > the number of tapes, but well over 4000. These very in age from less than 1 > year to over 7 years old. They also have 2 copies of all data. So, if a > tape fails, just find the copy. Also make a new copy to maintain 2 copies. As long as they remember to take a set of tapes out of the jukebox from time to time and replace with fresh :) Gordon ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Good news / bad news - The joys of RAID
  2004-11-22 14:13 ` Yu Chen
  2004-11-22 14:34 ` Gordon Henderson
@ 2004-11-23  0:17 ` berk walker
  2004-11-23  9:24 ` Robin Bowes
  1 sibling, 1 reply; 50+ messages in thread
From: berk walker @ 2004-11-23 0:17 UTC (permalink / raw)
To: Yu Chen; +Cc: Guy, 'Mark Klarzynski', linux-raid

you must be NUTS! hehe.. I don't know what these cost on the street, but
earlier, Computerworld forecast the price @ $3500, and $79 for the media.
If one does the traditional multi-level backup routine, the drive and
fodder would buy a heck of a lot of alternative storage.

My idea of affordable is..$179 + 9.99.

Yu Chen wrote:
>> Now if someone made an affordable tape drive and tapes that could backup
>> 200G per tape, that would be cool!
>
> You don't know? they have that already, AIT-4, LTO as I know.
>
> ===========================================
> Yu Chen
> Howard Hughes Medical Institute
> Chemistry Building, Rm 182
> University of Maryland at Baltimore County
> 1000 Hilltop Circle
> Baltimore, MD 21250
>
> phone: (410)455-6347 (primary)
>        (410)455-2718 (secondary)
> fax:   (410)455-1174
> email: chen@hhmi.umbc.edu
> ===========================================

^ permalink raw reply	[flat|nested] 50+ messages in thread
* Re: Good news / bad news - The joys of RAID 2004-11-23 0:17 ` berk walker @ 2004-11-23 9:24 ` Robin Bowes 2004-11-23 12:31 ` Bob Hillegas 0 siblings, 1 reply; 50+ messages in thread From: Robin Bowes @ 2004-11-23 9:24 UTC (permalink / raw) To: berk walker; +Cc: Yu Chen, Guy, 'Mark Klarzynski', linux-raid berk walker wrote: > My idea of affordable is..$179 + 9.99. You pay delivery ??? :) R. -- http://robinbowes.com ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Good news / bad news - The joys of RAID
  2004-11-23  9:24 ` Robin Bowes
@ 2004-11-23 12:31 ` Bob Hillegas
  2004-11-23 13:00 ` berk walker
  0 siblings, 1 reply; 50+ messages in thread
From: Bob Hillegas @ 2004-11-23 12:31 UTC (permalink / raw)
To: linux-raid

On Tue, 2004-11-23 at 03:24, Robin Bowes wrote:
> berk walker wrote:
> > My idea of affordable is..$179 + 9.99.

Has anyone considered Iomega's REV drive? It's kind of smallish when
talking about backing up terabytes. It's 35 gigs per removable cartridge.
But it is random access, in the $375 + $20 range.

Thanks, BobH
--
Bob Hillegas <bobhillegas@houston.rr.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread
* Re: Good news / bad news - The joys of RAID 2004-11-23 12:31 ` Bob Hillegas @ 2004-11-23 13:00 ` berk walker 0 siblings, 0 replies; 50+ messages in thread From: berk walker @ 2004-11-23 13:00 UTC (permalink / raw) To: Bob Hillegas; +Cc: linux-raid gotta dash, but i just checked, $343 + $58. Back after work for comment. thx - b- Bob Hillegas wrote: >On Tue, 2004-11-23 at 03:24, Robin Bowes wrote: > > >>berk walker wrote: >> >> >>>My idea of affordable is..$179 + 9.99. >>> >>> > >Has anyone considered Omega's REV drive? It's kind of smallish when >talking about backing up terabytes. It's 35 gigs per removable >cartridge. But it is random access in the $375 + $20 range. > >Thanks, BobH > > ^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: Good news / bad news - The joys of RAID [not found] <Pine.LNX.4.44.0411201655400.19120-100000@coffee.psychology.mcmaster.ca> @ 2004-11-21 21:28 ` Mark Klarzynski 2004-11-21 21:58 ` Mark Hahn 2004-11-22 6:29 ` Mikael Abrahamsson 0 siblings, 2 replies; 50+ messages in thread From: Mark Klarzynski @ 2004-11-21 21:28 UTC (permalink / raw) To: 'Mark Hahn'; +Cc: linux-raid I have no idea as to what the tier1 vendors say as I have only worked within the storage business.. the figures I quoted are based on the last time I consulted on this are would been provided by IBM / Seagate as these are the only two scsi vendors we use. If you really want to dig, then ask Seagate, they are respected in both camps and will openly justify the technology and price difference. They produce extremely in-depth docs on the testing methods and assumptions. In terms of reset I am not sure what you mean... we and all raid manufacturers will reset a scsi bus on scsi timeouts.. this is normal practice and simple to achieve. It is not achievable on sata.. I have not used pata much, but I do not recall a reset line that we could trigger from firmware level. RAID in isolation does not increase the i/o load as we all know... but the reality is that raid applications do. Non of us can refuse the cost effective nature of sata drives, this means we can often use raid in places where we could not afford or justify scsi. Add multiple users and the stress on the drives increase dramatically. If you want a real life situation... one of our scsi designs is used around the world and has probably 10m+ users (many systems).. in some cases these have been running for 4 / 5 years and therefore we have to look at drive replacement. For a trial we used sata to obviously see if we could save costs or offer an intermediate solution. We could not keep a single system going for more than 14 days. The load varied between 10-250 users at any one time.. we tried Maxtor and IBM. There was also a 40% occurrence of fatal state errors.. this was simple the rate that the drives were failing meant it was likely to fail whilst in rebuild state and obviously die. Take the sata box and stick it in many applications and it will last you to your dying day. You may be right that there has been ata and scsi drive manufactured with the same components excluding the interface.... but the last time I saw this was a bearing shortage in 95... I don't know of any manufactures today that even hint at this. But I could well be wrong.. The discussion could probably go on forever, but the point is that we are not stupid... sata solutions are probably 30% of the cost of the scsi..... there is a difference and we know it. the important thing is accepting the difference and using the right technology for the right application. -----Original Message----- From: Mark Hahn [mailto:hahn@physics.mcmaster.ca] Sent: 20 November 2004 21:58 To: Mark Klarzynski Subject: RE: Good news / bad news - The joys of RAID > SATA / IDE drives have an MTBF similar to that of SCSI / Fibre. But this > is based upon their expected use... i.e. SCSI used to be [power on hours > = 24hr] [use = 8 hours].. whilst SATA used to be [power on = 8 hours] > and [use = 20 mins]. can you cite a source for these numbers? the vendors I talk to (tier1 system vendors, not disk vendors) usually state 24x7 100% duty cycles for scsi/fc, and 100% poweron, 20% duty cycles for PATA/SATA. ^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: Good news / bad news - The joys of RAID 2004-11-21 21:28 ` Mark Klarzynski @ 2004-11-21 21:58 ` Mark Hahn 2004-11-22 6:29 ` Mikael Abrahamsson 1 sibling, 0 replies; 50+ messages in thread From: Mark Hahn @ 2004-11-21 21:58 UTC (permalink / raw) To: Mark Klarzynski; +Cc: linux-raid > practice and simple to achieve. It is not achievable on sata.. I have Linux certainly appears to be able to reset both pata and sata; perhaps the drivers are just lying. > are not stupid... sata solutions are probably 30% of the cost of the > scsi..... there is a difference and we know it. the important thing is > accepting the difference and using the right technology for the right > application. sure. it's basically only extremely high-end DBs (which require 150 IOPS per disk, 24/7) that need SCSI/FC. anyone designing a storage system needs to actually profile their IO to see whether their workload actually falls into this very tiny niche. do your seeks scale down as ndisks increases? do you need bandwidth (which is almost trivial to obtain with more disks)? do you need reliability (which is easy to achieve with raid)? does your IO drop to near zero once you run it through a battery-backed cache of a few GB? the take-home message is that you need to actually find out whether your workload requires that you pay the huge premium for SCSI/FC infrastructure ("enterprise-class storage"). almost none do, seriously. ^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: Good news / bad news - The joys of RAID 2004-11-21 21:28 ` Mark Klarzynski 2004-11-21 21:58 ` Mark Hahn @ 2004-11-22 6:29 ` Mikael Abrahamsson 1 sibling, 0 replies; 50+ messages in thread From: Mikael Abrahamsson @ 2004-11-22 6:29 UTC (permalink / raw) To: linux-raid On Sun, 21 Nov 2004, Mark Klarzynski wrote: > You may be right that there has been ata and scsi drive manufactured > with the same components excluding the interface.... but the last time I > saw this was a bearing shortage in 95... I don't know of any > manufactures today that even hint at this. But I could well be wrong.. This was in the day of Mac:s only having scsi interface but needing an affordable drive. Since Apple stopped using scsi in their lowend boxes, as far as I know there has been no more "desktop scsi drive". -- Mikael Abrahamsson email: swmike@swm.pp.se ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Good news / bad news - The joys of RAID [not found] <04Nov26.172857est.30052@gpu.utcc.utoronto.ca> @ 2004-11-26 22:41 ` Robin Bowes 0 siblings, 0 replies; 50+ messages in thread From: Robin Bowes @ 2004-11-26 22:41 UTC (permalink / raw) To: Chris Siebenmann, linux-raid Chris Siebenmann wrote: > You write: > | Thinking about what happened, I would have expected that the bad > | drive would just be removed from the array and spare activated and > | re-syncing started automatically. > > This is what is supposed to happen; when the hardware winds are blowing > in the right direction and the software recognizes everything, it even > really does happen. Chris, I suspect that what happened is that the array was in the process of re-syncing when I powered off the box because it had frozen because of an ATA timeout error. When I re-booted, the RAID1 root partition was dirty and wouldn't re-sync while the RAID 5 array was re-syncing. Whatever, I got it back up and running by disconnecting the failed drive. Cheers, R. -- http://robinbowes.com ^ permalink raw reply [flat|nested] 50+ messages in thread