* Use RAID-6!
@ 2013-04-16 16:44 Roy Sigurd Karlsbakk
2013-04-16 17:09 ` Mikael Abrahamsson
` (2 more replies)
0 siblings, 3 replies; 28+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-16 16:44 UTC (permalink / raw)
To: Linux RAID
Hi all
After reading this list for some time, there's a single mode of failure that's repeated over and over: RAID-5 loses a drive and finds bad data on another (or just loses another). This is rather normal, far more than documented by the disk vendors. This is also the case with "professional" systems with "enterprise" drives.
So, if you can afford another drive, please use RAID-6. Do *not* trust RAID-5 with something like 8 drives.
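A rough way to see why a degraded RAID-5 so often trips over a second problem is to estimate the chance of hitting at least one unrecoverable read error (URE) while reading every surviving drive in full during the rebuild. The sketch below is only a back-of-the-envelope model: the 1-in-10^14-bits error rate and the 2 TB drive size are assumptions chosen for illustration, not figures from this thread, and it treats bit errors as independent, which real drives do not guarantee.

```python
import math


def p_ure_during_rebuild(surviving_drives: int, drive_bytes: float,
                         ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading all surviving drives.

    Assumes every bit is read once and errors are independent (a simplification).
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^n, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    for total_drives in (4, 8, 12):
        # A RAID-5 rebuild has to read all remaining (n - 1) drives completely.
        p = p_ure_during_rebuild(total_drives - 1, 2e12)
        print(f"{total_drives}-drive RAID-5, 2 TB disks: P(URE) ~ {p:.0%}")
```

With RAID-6 a single URE during the rebuild of one failed drive is still correctable from the second parity, which is the point of the recommendation above.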
Also, maybe this should be on an FAQ/RAID tutorial somewhere?
Vennlige hilsener / Best regards
roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of xenotypic etymology. In most cases, adequate and relevant synonyms exist in Norwegian.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 16:44 Use RAID-6! Roy Sigurd Karlsbakk
@ 2013-04-16 17:09 ` Mikael Abrahamsson
2013-04-16 17:25 ` Roy Sigurd Karlsbakk
2013-04-16 20:01 ` David Brown
2013-04-16 19:52 ` Robert L Mathews
2013-04-16 23:42 ` md dropping disks too early (was: Use RAID-6!) Ben Bucksch
2 siblings, 2 replies; 28+ messages in thread
From: Mikael Abrahamsson @ 2013-04-16 17:09 UTC (permalink / raw)
To: Roy Sigurd Karlsbakk; +Cc: Linux RAID
On Tue, 16 Apr 2013, Roy Sigurd Karlsbakk wrote:
> Also, maybe this should be on an FAQ/RAID tutorial somewhere?
The question is where it should be put so that people read it and actually understand it.
This article is from 2007:
<http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162>
I've had people argue with me that the above article is wrong, but I never understood their logic. To me it makes perfect sense and I always go RAID6.
I also think the work on having more than 2 parity drives was very promising. I'd rather have a 20 drive volume with 4 parity drives than LVM together two 10 drive RAID6:es (apart from obvious performance penalties).
--
Mikael Abrahamsson email: swmike@swm.pp.se
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 17:09 ` Mikael Abrahamsson
@ 2013-04-16 17:25 ` Roy Sigurd Karlsbakk
2013-04-16 20:01 ` David Brown
1 sibling, 0 replies; 28+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-16 17:25 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Linux RAID
> > Also, maybe this should be on an FAQ/RAID tutorial somewhere?
>
> Question is, where should it be put so that people read it and
> actually understand it.
>
> This article is from 2007:
>
> <http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162>
>
> I've had people argue with me that the above article is wrong, but I
> never understood their logic. To me it makes perfect sense and I always
> go RAID6.
>
> I also think the work having more than 2 parity drives was very
> promising. I'd rather have a 20 drive volume with 4 parity drives than
> to LVM together two 10 drive RAID6:es (apart from obvious performance
> penalties).
I've been running RAIDz3 for a backup machine (zfs receive from the main box), and the write performance was rather low. I'd rather use lvm or raid-something over different raid-6 volumes. Spread out the risk factor. With ~8 drives in each raid-6 set, the risk is low enough to allow for rather large volumes.
--
Vennlige hilsener / Best regards
roy
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 17:09 ` Mikael Abrahamsson
2013-04-16 17:25 ` Roy Sigurd Karlsbakk
@ 2013-04-16 20:01 ` David Brown
2013-04-17 7:56 ` Mikael Abrahamsson
1 sibling, 1 reply; 28+ messages in thread
From: David Brown @ 2013-04-16 20:01 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Roy Sigurd Karlsbakk, Linux RAID
On 16/04/13 19:09, Mikael Abrahamsson wrote:
> On Tue, 16 Apr 2013, Roy Sigurd Karlsbakk wrote:
>
>> Also, maybe this should be on an FAQ/RAID tutorial somewhere?
>
> Question is, where should it be put so that people read it and actually
> understand it.
>
> This article is from 2007:
>
> <http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162>
>
> I've had people argue with me that the above article is wrong, but I
> never understood their logic. To me it makes perfect sense and I always
> go RAID6.
>
> I also think the work having more than 2 parity drives was very
> promising. I'd rather have a 20 drive volume with 4 parity drives than
> to LVM together two 10 drive RAID6:es (apart from obvious performance
> penalties).
>
Raid calculations for a third parity are noticeably more time-consuming than for the second parity of Raid6. And with a bigger array with lots of drives, you are going to have terrible RMW (read-modify-write) performance for small writes. However, as the multi-threaded scaling of Raid5 and Raid6 improves and makes its way into distros' standard kernels, it's going to be more realistic - especially for machines with plenty of cores and lots of RAM for stripe caches.
I hope triple parity raid will make it into the kernel at some point. I've done the main part of the maths involved, but have not had the time to work it into anything resembling real code. I don't know if I personally will ever make it into working code - but if anyone else is at all interested in doing so, then I will certainly help with the maths.
I am not sure there is much real-world need for triple parity raid for normal arrays - even with better cpu scaling, it would still be a lot slower than two raid6 arrays LVM'ed together. I foresee its main use as a temporary measure during array maintenance. For example, if you have a raid6 and you want to swap out the drives for bigger ones, then you could temporarily add an extra drive for a third parity using a non-symmetrical layout. Once this extra drive is synced, you can step through the other drives doing a replace-and-resync, knowing that you still have the double parity safety. Then at the end of the process you drop the third parity again.
Quad parity has some limitations, especially if you want to keep the first 3 parities compatible with triple parity. In particular, you are limited to 21 data disks. There are, of course, ways to handle even greater parity counts - but the cost in complexity and speed is considerable.
^ permalink raw reply [flat|nested] 28+ messages in thread
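For readers who have not seen the parity maths being discussed, here is a minimal sketch of the two RAID-6 syndromes: P is a plain XOR of the data blocks, and Q is a weighted sum in GF(2^8) using the generator 2 and the polynomial 0x11d. It also rebuilds two lost data blocks from P and Q. This is an illustration only, byte-at-a-time and far slower than any real implementation; the function names are made up for this example, and nothing here describes how a third parity would be defined.

```python
from typing import List, Optional, Sequence, Tuple


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(base: int, exp: int) -> int:
    result = 1
    for _ in range(exp):
        result = gf_mul(result, base)
    return result


def gf_inv(a: int) -> int:
    """Inverse of a nonzero element: a^254, because a^255 == 1."""
    return gf_pow(a, 254)


def parities(data: Sequence[bytes]) -> Tuple[bytes, bytes]:
    """Return (P, Q): P = XOR of D_i, Q = sum of 2^i * D_i in GF(2^8)."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(blocks: List[Optional[bytes]], x: int, y: int,
                p: bytes, q: bytes) -> Tuple[bytes, bytes]:
    """Rebuild the missing data blocks at indices x and y (given as None)."""
    zero = bytes(len(p))
    # Missing blocks are replaced by zeros, which contribute nothing to P or Q.
    p_rest, q_rest = parities([b if b is not None else zero for b in blocks])
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for j in range(len(p)):
        pxy = p[j] ^ p_rest[j]          # D_x ^ D_y
        qxy = q[j] ^ q_rest[j]          # 2^x * D_x ^ 2^y * D_y
        dx[j] = gf_mul(denom_inv, qxy ^ gf_mul(gy, pxy))
        dy[j] = pxy ^ dx[j]
    return bytes(dx), bytes(dy)
```

The extra cost of Q over P is visible even in this toy version: P needs one XOR per byte, while Q needs a field multiplication per byte, and any further parity needs another independent set of multiplications.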
* Re: Use RAID-6!
2013-04-16 20:01 ` David Brown
@ 2013-04-17 7:56 ` Mikael Abrahamsson
2013-04-17 9:26 ` David Brown
0 siblings, 1 reply; 28+ messages in thread
From: Mikael Abrahamsson @ 2013-04-17 7:56 UTC (permalink / raw)
To: David Brown; +Cc: Roy Sigurd Karlsbakk, Linux RAID
On Tue, 16 Apr 2013, David Brown wrote:
> you are going to have terrible RMW performance for small writes. However, as
As I said, I don't have a problem with lower performance. My workload is write once and few, read many. If the performance is approximately the same as a 10 drive RAID-6, but with double the storage, I'm fine.
> I am not sure there is much real-world need of triple parity raid for
> normal arrays - even with better cpu scaling, it would still be a lot
> slower than two raid6 arrays LVM'ed together. I foresee its main use
> as a temporary measure during array maintenance. For example, if you
> have a raid6 and you want to swap out the drives for bigger ones, then
> you could temporarily add an extra drive for a third parity using a
> non-symmetrical layout. Once this extra drive is synced, then you can
> step through the other drives doing a replace-and-resync, knowing that
> you still have the double parity safety. Then at the end of the process
> you drop the third parity again.
Well, I run RAID6+spare. I'd rather run a triple parity drive unless the write performance penalty is huge.
--
Mikael Abrahamsson email: swmike@swm.pp.se
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-17 7:56 ` Mikael Abrahamsson
@ 2013-04-17 9:26 ` David Brown
0 siblings, 0 replies; 28+ messages in thread
From: David Brown @ 2013-04-17 9:26 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Roy Sigurd Karlsbakk, Linux RAID
On 17/04/13 09:56, Mikael Abrahamsson wrote:
> On Tue, 16 Apr 2013, David Brown wrote:
>
>> you are going to have terrible RMW performance for small writes.
>> However, as
>
> As I said, I don't have a problem with lower performance. My workload is
> write once and few, read many. If the performance is approximately the
> same as a 10 drive RAID-6, but with double the storage, I'm fine.
I would expect read performance for triple-parity raid to be similar to Raid5 or Raid6 - i.e., you get good striped performance, especially for large files as they are spread over many spindles. Of course, since triple-parity md raid does not yet exist, that's just theoretical...
>
>> I am not sure there is much real-world need of triple parity raid for
>> normal arrays - even with better cpu scaling, it would still be a lot
>> slower than two raid6 arrays LVM'ed together. I foresee its main use
>> as a temporary measure during array maintenance. For example, if you
>> have a raid6 and you want to swap out the drives for bigger ones, then
>> you could temporarily add an extra drive for a third parity using a
>> non-symmetrical layout. Once this extra drive is synced, then you can
>> step through the other drives doing a replace-and-resync, knowing that
>> you still have the double parity safety. Then at the end of the
>> process you drop the third parity again.
>
> Well, I run RAID6+spare. I'd rather run a triple parity drive unless the
> write performance penalty is huge.
>
It's encouraging to hear people are interested in this. But before it can be implemented, there has to be someone with an understanding of Linux md raid who can implement it.
I know the maths involved, but I have no experience with Linux kernel work (I work with embedded systems - while I use the same programming language as the kernel, it's a very different style of programming).
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 16:44 Use RAID-6! Roy Sigurd Karlsbakk
2013-04-16 17:09 ` Mikael Abrahamsson
@ 2013-04-16 19:52 ` Robert L Mathews
2013-04-16 20:05 ` Carsten Aulbert
2013-04-17 17:27 ` Roy Sigurd Karlsbakk
2013-04-16 23:42 ` md dropping disks too early (was: Use RAID-6!) Ben Bucksch
2 siblings, 2 replies; 28+ messages in thread
From: Robert L Mathews @ 2013-04-16 19:52 UTC (permalink / raw)
To: Linux RAID
On 4/16/13 9:44 AM, Roy Sigurd Karlsbakk wrote:
> So, if you can afford another drive, please use RAID-6. Do *not* trust RAID-5 with something like 8 drives.
Yep. This has been true for many years, too:
http://www.miracleas.com/BAARF/
I personally don't even trust RAID 6. All our servers use three-disk RAID 1 setups, with disks from at least two different manufacturers to protect against firmware bricking (although this is becoming more and more difficult as the industry consolidates).
If (no, scratch that, "when") something goes horribly wrong, I can mount any of the disks as normal, non-RAID volumes. All I need is one disk to work. (If none of the disks are working, the RAID level is irrelevant... ;-)
To me, that level of confidence is worth sacrificing quite a bit of performance and capacity. It has saved my bacon at least once.
--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 19:52 ` Robert L Mathews
@ 2013-04-16 20:05 ` Carsten Aulbert
2013-04-16 20:19 ` Roman Mamedov
2013-04-16 22:44 ` Robert L Mathews
2013-04-17 17:27 ` Roy Sigurd Karlsbakk
1 sibling, 2 replies; 28+ messages in thread
From: Carsten Aulbert @ 2013-04-16 20:05 UTC (permalink / raw)
To: Robert L Mathews; +Cc: Linux RAID
Hi
On 04/16/2013 09:52 PM, Robert L Mathews wrote:
> I personally don't even trust RAID 6. All our servers use three-disk
> RAID 1 setups, with disks from at least two different manufacturers to
> prevent against firmware bricking (although this is becoming more and
> more difficult as the industry consolidates).
The problem I find with RAID1 is that it won't protect you against silent corruption (same as RAID5). What do you do if you run a thorough check and both drives claim a data block is valid and intact, but the data differs? Do you trust disk1 or disk2?
In that respect I think RAID1 is a step in the wrong direction :(
Cheers
Carsten
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 20:05 ` Carsten Aulbert
@ 2013-04-16 20:19 ` Roman Mamedov
2013-04-16 22:44 ` Robert L Mathews
1 sibling, 0 replies; 28+ messages in thread
From: Roman Mamedov @ 2013-04-16 20:19 UTC (permalink / raw)
To: Carsten Aulbert; +Cc: Robert L Mathews, Linux RAID
On Tue, 16 Apr 2013 22:05:53 +0200 Carsten Aulbert <Carsten.Aulbert@aei.mpg.de> wrote:
> The problem I find with RAID1 is that it won't protect you against
> silent corruptions (same as RAID5). What do you do if you do a through
> check and both drives claim a data block is valid and intact, but data
> differs? Do you trust disk1 or disk2?
>
> In that respect I think RAID1 is a step into the wrong direction :(
Then use btrfs RAID1, where every data and metadata block is checksummed; if some array member returns blocks with invalid checksums, they are healed from the members that still have the correct ones.
(Although currently btrfs "RAID1" stores data on *two disks*, no matter how many you have in the array; so it's a bit unconventional).
--
With respect,
Roman
^ permalink raw reply [flat|nested] 28+ messages in thread
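The checksum-based healing described in this message can be sketched in a few lines. This is a toy model, not btrfs code: `zlib.crc32` stands in for whatever checksum the filesystem really uses, the copies are plain in-memory buffers, and the function name `heal_mirror` is invented for the example.

```python
import zlib
from typing import List


def heal_mirror(copies: List[bytearray], stored_crc: int) -> bytes:
    """Return the block whose checksum matches, repairing mismatching copies.

    Raises IOError if no copy matches the checksum stored in the metadata.
    """
    good = next((bytes(c) for c in copies if zlib.crc32(c) == stored_crc), None)
    if good is None:
        raise IOError("no copy matches the stored checksum")
    for copy in copies:
        if zlib.crc32(copy) != stored_crc:
            copy[:] = good  # overwrite the corrupted copy in place
    return good


if __name__ == "__main__":
    original = b"hello world"
    crc = zlib.crc32(original)
    mirrors = [bytearray(b"hellO world"), bytearray(original)]  # first copy is silently corrupted
    print(heal_mirror(mirrors, crc), mirrors)
```

The point of the design is that the checksum, stored separately from the data, answers the "disk1 or disk2?" question that a plain two-way mirror cannot.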
* Re: Use RAID-6!
2013-04-16 20:05 ` Carsten Aulbert
2013-04-16 20:19 ` Roman Mamedov
@ 2013-04-16 22:44 ` Robert L Mathews
2013-04-17 0:20 ` Ben Bucksch
2013-04-17 4:20 ` Roman Mamedov
1 sibling, 2 replies; 28+ messages in thread
From: Robert L Mathews @ 2013-04-16 22:44 UTC (permalink / raw)
To: Linux RAID
On 4/16/13 1:05 PM, Carsten Aulbert wrote:
> The problem I find with RAID1 is that it won't protect you against
> silent corruptions (same as RAID5). What do you do if you do a through
> check and both drives claim a data block is valid and intact, but data
> differs? Do you trust disk1 or disk2?
That's partly why we use three-disk arrays instead of two-disk. But as you say, this general issue is a problem with RAID 5 too. We plan to switch to Btrfs as soon as doing so is wise.
In the meantime, I'd rather risk this problem than the endless reports of complete array failures that appear on the list with RAID 5 and even RAID 6 (a recent topic, I note, was "multiple disk failures in an md raid6 array"). I almost never see anyone reporting complete loss of a RAID 1 array.
The fundamental difference between RAID 1 and other levels seems to be that the usefulness of an individual array member doesn't rely on the state of any other member. This vastly reduces the impact of failures on the overall system.
After using mdadm with various RAID levels since 2002 (thanks, Neil), I'm convinced that RAID 1 is by its very nature far less fragile than any other scheme. This belief is sadly reinforced almost every week by a new tale of woe on the mailing list.
--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
^ permalink raw reply [flat|nested] 28+ messages in thread
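Why a third mirror helps with the "disk1 or disk2?" question can be shown with a simple majority vote over the copies of one block. This is only an illustration of the idea, with an invented function name; it is not a description of what md's check or repair actions do.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_block(copies: Sequence[bytes]) -> Optional[bytes]:
    """Return the content held by a strict majority of copies, else None."""
    block, votes = Counter(copies).most_common(1)[0]
    return block if votes * 2 > len(copies) else None


if __name__ == "__main__":
    print(majority_block([b"good", b"bad!"]))           # two mirrors: no way to decide
    print(majority_block([b"good", b"bad!", b"good"]))  # three mirrors: outvoted
```

With two copies a mismatch is a tie; with three, a single silently corrupted copy is outvoted, provided the other two really are independent.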
* Re: Use RAID-6!
2013-04-16 22:44 ` Robert L Mathews
@ 2013-04-17 0:20 ` Ben Bucksch
2013-04-17 1:35 ` Adam Goryachev
2013-04-17 3:32 ` Robert L Mathews
2013-04-17 4:20 ` Roman Mamedov
1 sibling, 2 replies; 28+ messages in thread
From: Ben Bucksch @ 2013-04-17 0:20 UTC (permalink / raw)
To: Robert L Mathews; +Cc: Linux RAID
Robert L Mathews wrote, On 17.04.2013 00:44:
> the endless reports of complete array failures that appear on the list
> with RAID 5 and even RAID 6 (a recent topic, I note, was "multiple
> disk failures in an md raid6 array"). I almost never see anyone
> reporting complete loss of a RAID 1 array.
Correct
> The fundamental difference between RAID 1 and other levels seems to be
> that the usefulness of an individual array member doesn't rely on the
> state of any other member. This vastly reduces the impact of failures
> on the overall system. After using mdadm with various RAID levels
> since 2002 (thanks, Neil), I'm convinced that RAID 1 is by its very
> nature far less fragile than any other scheme. This belief is sadly
> reinforced almost every week by a new tale of woe on the mailing list.
Exactly.
However, I think the RAID5 problems are caused by bad design decisions in the md implementation, not by the inherent concept of RAID5. Many people seem to have problems getting to the data of their RAID5 array: they have enough readable disks, but they can't convince md to read them. RAID1 doesn't have that problem, because you can ignore md when reading the disks. This is a home-made problem of Linux md.
FWIW, my own 10 years of experience with Linux md RAID led to the same conclusion as you had.
See thread "md dropping disks too early"
Ben
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-17 0:20 ` Ben Bucksch
@ 2013-04-17 1:35 ` Adam Goryachev
2013-04-17 4:27 ` Robert L Mathews
2013-04-17 11:13 ` Ben Bucksch
2013-04-17 3:32 ` Robert L Mathews
1 sibling, 2 replies; 28+ messages in thread
From: Adam Goryachev @ 2013-04-17 1:35 UTC (permalink / raw)
To: Ben Bucksch; +Cc: Robert L Mathews, Linux RAID
On 17/04/13 10:20, Ben Bucksch wrote:
> Robert L Mathews wrote, On 17.04.2013 00:44:
>> the endless reports of complete array failures that appear on the
>> list with RAID 5 and even RAID 6 (a recent topic, I note, was
>> "multiple disk failures in an md raid6 array"). I almost never see
>> anyone reporting complete loss of a RAID 1 array.
> Correct
>
Obviously, if they suffered a two disk failure then they won't be here asking for help, will they :)
Although, you are right, there are fewer failure scenarios where they are left with one or more working disks and no possibility to recover the data.
>> The fundamental difference between RAID 1 and other levels seems to
>> be that the usefulness of an individual array member doesn't rely on
>> the state of any other member. This vastly reduces the impact of
>> failures on the overall system. After using mdadm with various RAID
>> levels since 2002 (thanks, Neil), I'm convinced that RAID 1 is by its
>> very nature far less fragile than any other scheme. This belief is
>> sadly reinforced almost every week by a new tale of woe on the
>> mailing list.
>
> Exactly.
>
> However, I think the RAID5 problems are caused by bad design decisions
> in the md implementation, not in the inherent concept of RAID5,
> though. Many people seem to have problems getting to the data of their
> RAID5 array, although they have enough disks that are readable, but
> they can't convince md to read it. RAID1 doesn't have that problem,
> because you can ignore md when reading them. This is a home-made
> problem of Linux md.
Well, you can ignore Linux md when reading from RAID5 member disks, you just need to do some work to make the contents actually useful.
However, I totally disagree with your comment anyway. Linux md is simply a part of the kernel, not the whole kernel. It takes a "block device" and generates read/write commands to that block device. It can get back one of a few possible results:
1) read error
2) write error
3) block device is no longer valid
1) A read error can be generated for a number of causes, but (AFAIK) Linux md will simply read from another member, and try to write the data back to the device that generated the read error. This would fix a URE, for example.
2) A write error is more of a problem. If the block device generates a write error, then there are limited options: we can retry the write, or we can discard the entire device. I think Linux md will discard the entire device, possibly after retrying the write one or more times. I don't know enough about Linux md, but in any case, I think it is rare to get a write error from an otherwise good block device.
3) This is the issue that seems to bite everyone: using block devices that are not configured correctly. Sooner or later, the drive has a URE, the drive goes off to la-la land and Linux patiently waits, tries a drive reset, SATA bus reset, etc; still no response, so it eventually decides the drive has gone. The Linux kernel advises Linux md that the block device is gone, so Linux md discards the block device and stops trying to use it. Personally, I don't see that Linux md has a lot of choice in the matter, without trying to re-implement every SATA/SCSI/SAS controller driver in md itself so that we can keep retrying longer. We are told the device is gone, so it is gone, end of story.
Now, if you truly have this issue, and do NOT make any silly assumptions, and follow the correct advice, you will have no problem resolving the issue (as long as the actual device is working properly).
Generally, this is just a matter of assembling the MD without the oldest/first affected device, and/or using --force or similar. The SECOND problem is caused by the user attempting some other recovery methods which cause additional writes to the array.
Certainly, a hardware raid controller doesn't have this issue: it controls the disk, disk controller, and RAID, and it knows everything about all layers. However, if some strange issue happens such as two disks dropping out of the array, one after the other, then I'm not sure what your recovery options are, but I expect they are a lot more limited compared to having the power of Linux md and tools like dd, GNU ddrescue, etc to manipulate the data in well documented and understood ways (as opposed to being stuck in a limited "BIOS" type tool with limited GUI type options...)
Perhaps it is possible for Linux md to check whether the RAID members support scterc and/or what their timeout is, along with the associated interface timeout. Possibly using user space mdadm rather than the in-kernel md. At least this might catch more broken configurations before they break rather than waiting for them to break first.
> FWIW, my own 10 years of experience with Linux md RAID led to the same
> conclusion as you had.
>
> See thread "md dropping disks too early"
Personally, I'd like to see RAID10 get a lot more attention. We need to be able to grow RAID10 arrays (and shrink), etc, not because this would provide RAID1 type reliability. Of course, you can still get multiple disk failures, and you can still mess up a RAID10 array by trying to "fix" it, yet still have just enough idea that all your data might be there, you just need to know the right magic spell to make it re-appear.
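The configuration check suggested above (comparing the drive's error recovery limit with the kernel's command timeout) could look roughly like this in user space. It is a sketch under stated assumptions: the kernel timeout is read from the sysfs attribute `/sys/block/<dev>/device/timeout` (in seconds), the drive's SCT ERC limit is passed in by the caller in tenths of a second, which is the unit `smartctl -l scterc` reports, and the 120-second worst case for a drive without ERC is a guess for illustration, not a measured value. The function names are made up for this example.

```python
from pathlib import Path
from typing import Optional


def kernel_timeout_seconds(dev: str, sys_root: str = "/sys") -> int:
    """Read the kernel's command timeout for a disk such as 'sda'."""
    path = Path(sys_root, "block", dev, "device", "timeout")
    return int(path.read_text().strip())


def timeout_is_safe(erc_deciseconds: Optional[int], kernel_timeout_s: int,
                    worst_case_recovery_s: float = 120.0) -> bool:
    """True if the drive should give up on a bad sector before the kernel does.

    erc_deciseconds: SCT ERC limit in 0.1 s units; None or 0 means ERC is
    unsupported or disabled, so the assumed worst case is used instead.
    """
    drive_limit_s = erc_deciseconds / 10 if erc_deciseconds else worst_case_recovery_s
    return drive_limit_s < kernel_timeout_s


if __name__ == "__main__":
    print(timeout_is_safe(70, 30))     # ERC 7.0 s against a 30 s kernel timeout
    print(timeout_is_safe(None, 30))   # no ERC: the drive may outlast the kernel
    print(timeout_is_safe(None, 180))  # no ERC, but a raised kernel timeout
```

A check like this only flags the mismatch; fixing it means either enabling ERC on the drive or raising the kernel timeout, and both settings need to be reapplied at boot.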
The best part of Linux md RAID is that the large majority of the time, the people that come to the list with broken arrays are able to recover all of their data *IF* they are patient enough, *AND* follow the advice of the very knowledgeable people on this list, even in cases where that user has broken their RAID array further in their attempts to "fix" it.
In summary, I'll say it again, most Linux md RAID issues seem to be caused by:
1) mis-configured systems that are just waiting for a critical moment to break (Murphy's Law)
2) people who don't know enough about Linux md RAID who try to fix the broken array
PS, I really have no idea what I'm talking about, except lurking and reading this list and the problems (and resolutions) here, if I've made any errors in the above, feel free to fix it. I really think the above (plus whatever corrections/more complete information) should be saved in a FAQ somewhere so we can just point people at the same page all the time instead of discussing it again each time (it invariably seems to be discussed every month or so).
Regards,
Adam
--
Adam Goryachev
Website Managers
www.websitemanagers.com.au
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-17 1:35 ` Adam Goryachev
@ 2013-04-17 4:27 ` Robert L Mathews
2013-04-17 4:45 ` Adam Goryachev
2013-04-17 6:06 ` Stan Hoeppner
2013-04-17 11:13 ` Ben Bucksch
1 sibling, 2 replies; 28+ messages in thread
From: Robert L Mathews @ 2013-04-17 4:27 UTC (permalink / raw)
To: Linux RAID
On 4/16/13 6:35 PM, Adam Goryachev wrote:
> Obviously, if they suffered a two disk [RAID 1] failure then they won't
> be here asking for help will they :)
Heh. Well, no, they won't if the disks are completely and permanently dead. (I know I'm starting to sound like a broken record, but "that's partly why we use three disks instead of two and make sure they don't all use the same company's firmware".)
But complete disk death doesn't seem to be the normal failure mode. If the failure is spurious, as so many seem to be, and temporarily affects an array so that each disk has a different event count, that isn't a disaster under RAID 1. If worst comes to worst, you can pick one disk to use and pretend RAID doesn't even exist. You don't need to get the members to successfully sync into an array to read the data.
But if each disk in a RAID 5 or RAID 6 array gets a different event count, or if the disks refuse to easily assemble into an active array for any other reason, all your data is inaccessible until you fix the RAID problem.
I avidly read the details of every RAID 5 [and 6] disaster on the list, and almost every one would be trivially easy to fix under RAID 1, with no risk of complete data loss. It's heartbreaking.
--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
^ permalink raw reply [flat|nested] 28+ messages in thread
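The event-count argument in this message can be made concrete with a small model. It is deliberately simplified and is not mdadm's actual assembly logic: members whose event counter is behind the highest one are treated as stale, and the array counts as readable without forcing only if the number of stale members does not exceed the redundancy of the RAID level. The device names and counter values are hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_events(events: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split members into (fresh, stale) relative to the highest event count."""
    newest = max(events.values())
    fresh = sorted(dev for dev, count in events.items() if count == newest)
    stale = sorted(dev for dev, count in events.items() if count != newest)
    return fresh, stale


def readable_without_forcing(events: Dict[str, int], redundancy: int) -> bool:
    """redundancy: members the level can lose (RAID 5: 1, RAID 6: 2, n-way RAID 1: n - 1)."""
    _, stale = split_by_events(events)
    return len(stale) <= redundancy


if __name__ == "__main__":
    # Hypothetical four-member array after a spurious double drop-out.
    events = {"sda1": 1024, "sdb1": 1024, "sdc1": 1019, "sdd1": 1001}
    print(split_by_events(events))
    for name, redundancy in (("RAID 5", 1), ("RAID 6", 2), ("4-way RAID 1", 3)):
        print(name, readable_without_forcing(events, redundancy))
```

In this model the same set of counters leaves a RAID 5 array needing manual intervention while a mirror still has a usable member, which is the asymmetry being described.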
* Re: Use RAID-6!
2013-04-17 4:27 ` Robert L Mathews
@ 2013-04-17 4:45 ` Adam Goryachev
2013-04-17 6:06 ` Stan Hoeppner
1 sibling, 0 replies; 28+ messages in thread
From: Adam Goryachev @ 2013-04-17 4:45 UTC (permalink / raw)
To: Robert L Mathews; +Cc: Linux RAID
On 17/04/13 14:27, Robert L Mathews wrote:
> But complete disk death doesn't seem to be the normal failure mode. If
> the failure is spurious, as so many seem to be, and temporarily
> affects an array so that each disk has a different event count, that
> isn't a disaster under RAID 1. If worst comes to worst, you can pick
> one disk to use and pretend RAID doesn't even exist. You don't need to
> get the members to successfully sync into an array to read the data.
> But if each disk in a RAID 5 or RAID 6 array gets a different event
> count, or if the disks refuse to easily assemble into an active array
> for any other reason, all your data is inaccessible until you fix the
> RAID problem. I avidly read the details of every RAID 5 [and 6]
> disaster on the list, and almost every one would be trivially easy to
> fix under RAID 1, with no risk of complete data loss. It's heartbreaking.
RAID1 of course fails the requirement of a single filesystem that requires more space than a single disk can provide. Of course, you can then consider LVM2, multiple mount points, or RAID10 or RAID1 + linear etc.... but most people still prefer to see a single block device. Dealing with multiple RAID1 and a linear could lead to more complex issues as well.
In any case, as mentioned previously, the majority of issues are caused by mis-configuration. If we could add some configuration verification to mdadm or similar, then we might be able to warn more people prior to things failing.
Regards,
Adam
--
Adam Goryachev
Website Managers
www.websitemanagers.com.au
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-17 4:27 ` Robert L Mathews
2013-04-17 4:45 ` Adam Goryachev
@ 2013-04-17 6:06 ` Stan Hoeppner
1 sibling, 0 replies; 28+ messages in thread
From: Stan Hoeppner @ 2013-04-17 6:06 UTC (permalink / raw)
To: Robert L Mathews; +Cc: Linux RAID
On 4/16/2013 11:27 PM, Robert L Mathews wrote:
> I avidly read the details of every RAID 5 [and 6] disaster on the list,
> and almost every one would be trivially easy to fix under RAID 1, with
> no risk of complete data loss. It's heartbreaking.
I do read most of them as well. But mirrors simply don't scale in either capacity or performance and thus aren't suitable. If one needs a 4TB+ filesystem today, or more than ~150MB/s of combined streaming write throughput, one must use one of:
1. RAID10
2. RAID0 over RAID1 pairs/triples
3. A linear concat over pairs/triples w/XFS
4. RAID5 or RAID6
Each of these is most suitable for only a subset of workloads, but all of them can scale to more than 4TB, whereas RAID1 cannot.
When SATA4/SAS1200 arrive offering 1.2GB/s interface rate, and SSDs hit 2-4TB capacity at reasonable prices, then I think you'll see more straight RAID1 being used in more of the systems that don't need any more total capacity. But as many servers will always need more than this and will still use rust, striped/concatenated arrays will be with us for quite some time.
And BTW, regarding your triplets setup, if you want to do that right according to your philosophy, then you need a dedicated SAS/SATA controller for each drive, each controller being of a different make/model with different firmware. The old UNIX/Netware "duplexing" strategy, but triplexing in this case. But I doubt you're doing this. All 3 are probably connected to the single motherboard down SATA controller.
--
Stan
^ permalink raw reply [flat|nested] 28+ messages in thread
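The capacity side of this argument is simple arithmetic, sketched below for equal-sized disks. The function and its `copies` parameter are illustrative only; it ignores metadata overhead and treats RAID10, RAID0 over mirrors, and a linear concat of mirrors as having the same usable space.

```python
def usable_tb(level: str, disks: int, disk_tb: float, copies: int = 2) -> float:
    """Usable capacity in TB for `disks` equal drives of `disk_tb` each."""
    if level == "raid1":
        return disk_tb                    # every member holds the same data
    if level == "raid10":
        return disks * disk_tb / copies   # each block is stored `copies` times
    if level == "raid5":
        return (disks - 1) * disk_tb      # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_tb      # two disks' worth of parity
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    print("3-way RAID1, 4 TB disks:", usable_tb("raid1", 3, 4.0), "TB")
    print("6-disk RAID10 (pairs):  ", usable_tb("raid10", 6, 4.0), "TB")
    print("6-disk RAID10 (triples):", usable_tb("raid10", 6, 4.0, copies=3), "TB")
    print("8-disk RAID5:           ", usable_tb("raid5", 8, 4.0), "TB")
    print("8-disk RAID6:           ", usable_tb("raid6", 8, 4.0), "TB")
```

However many members a plain mirror has, its usable space never exceeds one disk, which is the scaling limit referred to above.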
* Re: Use RAID-6!
2013-04-17 1:35 ` Adam Goryachev
2013-04-17 4:27 ` Robert L Mathews
@ 2013-04-17 11:13 ` Ben Bucksch
2013-04-17 11:32 ` Adam Goryachev
1 sibling, 1 reply; 28+ messages in thread
From: Ben Bucksch @ 2013-04-17 11:13 UTC (permalink / raw)
To: Adam Goryachev; +Cc: Robert L Mathews, Linux RAID
Adam Goryachev wrote, On 17.04.2013 03:35:
> Obviously, if they suffered a two disk failure then they won't be here
> asking for help will they:)
Wrong, sadly. I suffered a 1 disk failure, and I am here asking for help. And nobody can give it.
Again: I have a RAID5, and 1 (one) disk failed, so I should be fine, but I cannot read the data anymore, no way to get at it. That's because md ejected a good (!) drive to start with, and refuses to take it back (!). (And then another drive failed during resync.) If you have a way, please do show me, see thread 'Disk wrongly marked "spare", need to force re-add it'
The problem isn't double disk failure. The problem is bugs in md implementation.
> The Linux kernel advises Linux md that the block
> device is gone, so Linux md discards the block device and stops trying
> to use it. Personally, I don't see that Linux md has a lot of choice in
> the matter
True. But often, such errors are temporary. For example, a loose cable. I must be able to re-add the device as a good device with data. But I can't, md doesn't let me.
My case was even more unbelievable: md ejected perfectly good drives simply because I upgraded the OS. (This happened with 2 independent arrays, so not coincidence.)
Also, a single sector being unreadable/unwritable doesn't count as "disk failure" in my book, and shouldn't eject the whole disk. If I have 2 sectors on 2 different disks that are unreadable, md currently trashes the whole array and doesn't let me read anything at all anymore. That's obviously broken, but unfortunately the sad reality.
See http://neil.brown.name/blog/20110216044002#1 (And, BTW, RAID6 doesn't really help with this problem, because it's quite possible that 3 disks have sectors unreadable/unwritable.) ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6! 2013-04-17 11:13 ` Ben Bucksch @ 2013-04-17 11:32 ` Adam Goryachev 2013-04-17 11:51 ` Ben Bucksch 0 siblings, 1 reply; 28+ messages in thread From: Adam Goryachev @ 2013-04-17 11:32 UTC (permalink / raw) To: Ben Bucksch; +Cc: Robert L Mathews, Linux RAID On 17/04/13 21:13, Ben Bucksch wrote: > Adam Goryachev wrote, On 17.04.2013 03:35: >> Obviously, if they suffered a two disk failure then they won't be here >> asking for help will they:) > > Wrong, sadly. I suffered a 1 disk failure, and I am here asking for > help. And nobody can give it. > > Again: I have a RAID5, and 1 (one) disk failed, so I should be fine, but > I cannot read the data anymore, no way to get at it. That's because md > ejected a good (!) drive to start with, Actually, I think the real problem here is that you don't know why your so called good drive was ejected from the array. You assume that the drive is good, and that it was configured correctly, but obviously Linux and/or MD has a different opinion. > and refuses to take it back (!). It probably would have taken it back, although requiring a resync. > (And then another drive failed during resync.) If you have a way, please > do show me, see thread 'Disk wrongly marked "spare", need to force > re-add it' Like I said, you need to be patient, and follow the expert advice provided from the list. This discussion is just a diversion from your problem, forget the diversion (at least until you get your problem fixed). > The problem isn't double disk failure. The problem is bugs in md > implementation. Or users who expect things to work a certain way, without actually bothering to find out in advance. Hence their expectation is considered a bug when really it is just a lack of knowledge. >> The Linux kernel advises Linux md that the block >> device is gone, so Linux md discards the block device and stops trying >> to use it. Personally, I don't see that Linux md has a lot of choice in >> the matter > > True. 
But often, such errors are temporary. For example, a loose cable. > I must be able to re-add the device as a good device with data. But I > can't, md doesn't let me. It does actually. You can re-add it, with a resync, or if you ensure that no writes occurred since the drive was ejected, you can re-add it without a resync. In addition, even if some writes occurred, if you use a bitmap, only the newly written blocks need to be resynced. > My case was even more unbelievable: md ejected perfectly good drives > simply because I upgraded the OS. (This happened with 2 independent > arrays, so not coincidence.) Like I said, the drives were ejected for a reason. You just don't know what that reason is. > Also, a single sector being unreadable/unwritable doesn't count as "disk > failure" in my book, and shouldn't eject the whole disk. If I have 2 > sectors on 2 different disks that are unreadable, md currently trashes > the whole array and doesn't let me read anything at all anymore. That's > obviously broken, but unfortunately the sad reality. > See http://neil.brown.name/blog/20110216044002#1 This is all true, however, I would hope that when this is implemented, the distributions will properly alert the user that one or more drives are faulty. One failed write is very frequently indicative of more failed writes to come. Personally, I would want to replace that drive ASAP. In addition, the one thing that appeared missing from the blog was the ability for md to clear the bad blocks list when a drive is replaced, and rebuild the content of the "bad blocks" from the other members. > (And, BTW, RAID6 doesn't really help with this problem, because it's > quite possible that 3 disks have sectors unreadable/unwritable.) RAID6 simply improves your odds. There is no RAID level that can provide 100% uptime; at some point you have lost too many disks or too much data, etc. Use the appropriate level of RAID depending on your risk profile. 
Regards, Adam -- Adam Goryachev Website Managers www.websitemanagers.com.au ^ permalink raw reply [flat|nested] 28+ messages in thread
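Adam's remark that, with a bitmap, only the newly written blocks need to be resynced can be illustrated with a toy model. This Python sketch is an assumption-laden simplification and not md's actual on-disk bitmap: the class name, the chunk size, and the method names are all hypothetical. It only tracks which chunks were written while a member was missing.

```python
class WriteIntentBitmap:
    """Toy model of a write-intent bitmap (hypothetical, not md's format)."""

    def __init__(self, chunk_sectors):
        self.chunk_sectors = chunk_sectors
        self.dirty = set()  # chunk numbers written while a member was absent

    def record_write(self, start_sector, n_sectors):
        # Mark every chunk that the write touches, including partial overlaps.
        first = start_sector // self.chunk_sectors
        last = (start_sector + n_sectors - 1) // self.chunk_sectors
        self.dirty.update(range(first, last + 1))

    def chunks_to_resync(self):
        # On re-add, only these chunks need copying to the returning member.
        return sorted(self.dirty)
```

For example, with 1024-sector chunks, writes at sectors 0-7, 1020-1029 and 5000 dirty chunks 0, 1 and 4, so a re-added member would need three chunks copied rather than a full resync.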
* Re: Use RAID-6! 2013-04-17 11:32 ` Adam Goryachev @ 2013-04-17 11:51 ` Ben Bucksch 2013-04-17 17:50 ` Roy Sigurd Karlsbakk 0 siblings, 1 reply; 28+ messages in thread From: Ben Bucksch @ 2013-04-17 11:51 UTC (permalink / raw) To: Adam Goryachev; +Cc: Ben Bucksch, Robert L Mathews, Linux RAID Adam Goryachev wrote, On 17.04.2013 13:32: > On 17/04/13 21:13, Ben Bucksch wrote: >> Adam Goryachev wrote, On 17.04.2013 03:35: >>> Obviously, if they suffered a two disk failure then they won't be here >>> asking for help will they:) >> Wrong, sadly. I suffered a 1 disk failure, and I am here asking for >> help. And nobody can give it. >> >> Again: I have a RAID5, and 1 (one) disk failed, so I should be fine, but >> I cannot read the data anymore, no way to get at it. That's because md >> ejected a good (!) drive to start with, > Actually, I think the real problem here is that you don't know why your > so called good drive was ejected from the array. I know it doesn't have a fatal hardware failure. See my quote above. > obviously Linux and/or MD has a different opinion. See my first post. You see that they have the almost same event count, yet I can't re-add it (considering the fact that another drive failed entirely). > >> and refuses to take it back (!). > It probably would have taken it back, although requiring a resync. It did. And that resync uncovered the failure of the other disk. The combination trashed my array. The problem is that the first drive should never have been ejected, so that the failing drive would not be fatal. > Like I said, you need to be patient, and follow the expert advice > provided from the list. Well, I'm listening. All the info is in my thread: md RAID5: Disk wrongly marked "spare", need to force re-add it (And, FYI, being "patient" is difficult when you can't work until the array is back online.) > This discussion is just a diversion from your > problem, forget the diversion (at least until you get your problem fixed). 
I am interested in both: getting my immediate problem fixed, and making sure that this problem never happens again: not for me, and not for anybody else who isn't aware of it yet. > >> The problem isn't double disk failure. The problem is bugs in md >> implementation. > Or users who expect things to work a certain way, without actually > bothering to find out in advance. Hence their expectation is considered > a bug when really it is just a lack of knowledge. FWIW, I read a lot about RAID before using it, and I have been using it for 10 years. RAID5 is supposed to protect against 1 total harddrive failure. It doesn't. That's a bug, no matter how you put the light on it. > >>> The Linux kernel advises Linux md that the block >>> device is gone, so Linux md discards the block device and stops trying >>> to use it. Personally, I don't see that Linux md has a lot of choice in >>> the matter >> True. But often, such errors are temporary. For example, a loose cable. >> I must be able to re-add the device as a good device with data. But I >> can't, md doesn't let me. > It does actually. You can re-add it, with a resync, or if you ensure > that no writes occurred since the drive was ejected, you can re-add it > without a resync. In addition, even if some writes occurred, if you use > a bitmap, only the newly written blocks need to be resynced. >> My case was even more unbelievable: md ejected perfectly good drives >> simply because I upgraded the OS. (This happened with 2 independent >> arrays, so not coincidence.) > Like I said, the drives were ejected for a reason. You just don't know > what that reason is. > >> Also, a single sector being unreadable/unwritable doesn't count as "disk >> failure" in my book, and shouldn't eject the whole disk. If I have 2 >> sectors on 2 different disks that are unreadable, md currently trashes >> the whole array and doesn't let me read anything at all anymore. That's >> obviously broken, but unfortunately the sad reality. 
>> See http://neil.brown.name/blog/20110216044002#1 > This is all true, however, I would hope that when this is implemented, > the distributions will properly alert the user that one or more drives > are faulty. One failed write is very frequently indicative of more > failed writes to come. Personally, I would want to replace that drive ASAP. > > In addition, the one thing that appeared missing from the blog was the > ability for md to clear the bad blocks list when a drive is replaced, > and rebuild the content of the "bad blocks" from the other members. > >> (And, BTW, RAID6 doesn't really help with this problem, because it's >> quite possible that 3 disks have sectors unreadable/unwritable.) > RAID6 simply improves your odds or chances. There is no RAID level that > can provide a 100% uptime, at some point you have lost too many disks or > too much data, etc. Use the appropriate level of RAID depending on your > risk profile. > > Regards, > Adam > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-17 11:51 ` Ben Bucksch
@ 2013-04-17 17:50 ` Roy Sigurd Karlsbakk
0 siblings, 0 replies; 28+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-17 17:50 UTC (permalink / raw)
To: Ben Bucksch; +Cc: Robert L Mathews, Linux RAID, Adam Goryachev
> FWIW, I read a lot about RAID before using it, and I have been using it for 10
> years. RAID5 is supposed to protect against 1 total harddrive failure.
> It doesn't. That's a bug, no matter how you put the light on it.
Usually, the problem is someone using desktop drives without scterc (SCT Error Recovery Control) enabled. If the drive goes into deep recovery, it'll time out from Linux's point of view and be flagged as bad. See my post in your original thread for more info.
Vennlige hilsener / Best regards
roy
^ permalink raw reply [flat|nested] 28+ messages in thread
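The timeout mismatch Roy describes comes down to comparing two numbers. The Python sketch below is only an illustration under stated assumptions: the function name `drive_at_risk` is hypothetical, the drive's error recovery limit is taken to be expressed in tenths of a second (the unit smartctl uses for scterc), and the kernel command timeout is assumed to be the usual 30-second default found in /sys/block/sdX/device/timeout.

```python
def drive_at_risk(erc_deciseconds, kernel_timeout_s=30):
    """Return True if the drive's internal error recovery could outlast
    the kernel's command timeout (hypothetical helper).

    erc_deciseconds: drive error recovery limit in tenths of a second,
                     or None if the feature is disabled or unsupported.
    kernel_timeout_s: block layer command timeout in seconds (assumed 30).
    """
    if erc_deciseconds is None:
        # No limit set: the drive may keep retrying a bad sector for longer
        # than the kernel is willing to wait.
        return True
    # Convert tenths of a second to seconds before comparing.
    return erc_deciseconds / 10 >= kernel_timeout_s
```

A drive limited to 7.0 seconds (a value of 70) gives up well before the kernel does and simply reports a read error that md can repair from redundancy; a drive with no limit is the case that tends to get the whole device kicked.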
* Re: Use RAID-6!
2013-04-17 0:20 ` Ben Bucksch
2013-04-17 1:35 ` Adam Goryachev
@ 2013-04-17 3:32 ` Robert L Mathews
1 sibling, 0 replies; 28+ messages in thread
From: Robert L Mathews @ 2013-04-17 3:32 UTC (permalink / raw)
To: Linux RAID
On 4/16/13 5:20 PM, Ben Bucksch wrote:
> However, I think the RAID5 problems are caused by bad design decisions
> in the md implementation, not in the inherent concept of RAID5, though.
I'm not so sure this is true. I once lost (backup) data on a proprietary non-mdadm RAID 5 system, too, because some spurious event caused problems for multiple drives at once.
With mdadm, at least there's the opportunity to fix something with the raw disks, which proprietary systems don't allow. Knowing the right recovery steps to take is complex and easy to screw up, but there are many different things that could have gone wrong in the first place.
As Adam Goryachev said, it's amazing how many of the "my RAID died *and* I did something foolish" stories do end with getting the data back. This speaks well of the flexibility and power of mdadm in the true Unix sense, I think.
--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-16 22:44 ` Robert L Mathews
2013-04-17 0:20 ` Ben Bucksch
@ 2013-04-17 4:20 ` Roman Mamedov
2013-04-17 5:22 ` Robert L Mathews
1 sibling, 1 reply; 28+ messages in thread
From: Roman Mamedov @ 2013-04-17 4:20 UTC (permalink / raw)
To: Robert L Mathews; +Cc: Linux RAID
On Tue, 16 Apr 2013 15:44:03 -0700 Robert L Mathews <lists@tigertech.com> wrote:
> On 4/16/13 1:05 PM, Carsten Aulbert wrote:
>
> > The problem I find with RAID1 is that it won't protect you against
> > silent corruptions (same as RAID5). What do you do if you do a thorough
> > check and both drives claim a data block is valid and intact, but data
> > differs? Do you trust disk1 or disk2?
>
> That's partly why we use three-disk arrays instead of two-disk.
You do know there is no "voting" system in md, right? If you imagine that all three disks are being read in parallel, and that if one returns bad data, it is automatically "overruled" by a majority vote from the two other ones with correct data, that's not how it works at all.
The data is read randomly from all three disks (I think it's load-balanced by process ID); if one disk happened to silently return corrupt data, that's it, your app just got corrupt data passed to it, and if it happens to write it back to disk (maybe after some processing), then the incorrect data will be faithfully replicated by md to all three disks. So in the future you have not even a _chance_ to read back the correct data that was previously there.
> In the meantime, I'd rather risk this problem than the endless reports
> of complete array failures that appear on the list with RAID 5 and even
> RAID 6 (a recent topic, I note, was "multiple disk failures in an md
> raid6 array"). I almost never see anyone reporting complete loss of a
> RAID 1 array.
In general, you seem to be WAY too concerned about losing your RAID array; this sounds like you are someone who doesn't make backups and tries to use RAID as a replacement for them. Don't forget that if, for example, a rogue program gets 'root' on your machine and overwrites the md device with zeroes, it will be instantly replicated to all three disks as well.
As for me, if I lose my primary RAID6, it's at most a day's worth of changes, and some data transfer from here and there to get it all copied from backups and be up and running again. (I could reduce even that risk and easily back up 4 times a day, but do not see the need at the moment.)
--
With respect,
Roman
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Use RAID-6!
2013-04-17 4:20 ` Roman Mamedov
@ 2013-04-17 5:22 ` Robert L Mathews
0 siblings, 0 replies; 28+ messages in thread
From: Robert L Mathews @ 2013-04-17 5:22 UTC (permalink / raw)
To: Linux RAID
On 4/16/13 9:20 PM, Roman Mamedov wrote:
> You do know there is no "voting" system in md, right?
Yes, but the question was "What do you do if you do a thorough check and both drives claim a data block is valid and intact, but data differs?" The implication was that the array has failed and you need to manually reconstruct data, perhaps sector-by-sector.
Having three sources for a manual reconstruction outside of md reduces the "someone with two clocks never knows what time it is" problem. With three, you can make an informed guess about which one is wrong.
I'm not saying that this is the primary reason to use three disks in RAID 1, because it's not. I've never needed to do sector-level recovery of an array. The primary reason is so that you can withstand two simultaneous disk failures, just as with RAID 6 vs. RAID 5.
> In general, you seem to be WAY too concerned about losing your RAID array;
> this sounds like you are someone who doesn't make backups and tries to use
> RAID as a replacement for them.
No, that's definitely not the case. We have backup systems in multiple data centers, and our disaster recovery planning includes plane crashes that destroy live servers and so on.
Many businesses require 100% availability. Losing an array on a server means downtime and telling paying customers "we lost the new data you stored since the last backup". Neither is acceptable if it's in any way avoidable, even if the last backup was minutes ago. Like you, I can easily recover from losing a couple of hours' work, but my customers who run online stores are less sanguine about such things.
By the way, I think I'm going to pin "you seem to be WAY too concerned about losing your RAID array" up on my wall. That's wonderful, because it's exactly how concerned I want to be. ;-)
--
Robert L Mathews, Tiger Technologies, http://www.tigertech.net/
^ permalink raw reply [flat|nested] 28+ messages in thread
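The "informed guess" Robert describes for a manual, offline reconstruction from three mirror members amounts to a majority vote per sector. The Python sketch below is hypothetical and, as Roman points out, is not something md does on reads: the function name and the idea of passing sector contents as byte strings are assumptions made for illustration, and a majority is of course only a guess, not proof of which copy is right.

```python
from collections import Counter

def majority_sector(copies):
    """Compare the same sector read from each mirror member.

    copies: list of bytes objects, one per member.
    Returns (data, outlier_indexes) if a strict majority agrees,
    or None if no version has a strict majority.
    """
    counts = Counter(copies)
    data, votes = counts.most_common(1)[0]
    if votes * 2 <= len(copies):
        # Two clocks that disagree: nothing to base a guess on.
        return None
    outliers = [i for i, c in enumerate(copies) if c != data]
    return data, outliers
```

With two members that disagree the function returns None, which is the "two clocks" problem; with three members where one differs, it returns the shared contents and the index of the odd one out.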
* Re: Use RAID-6!
2013-04-16 19:52 ` Robert L Mathews
2013-04-16 20:05 ` Carsten Aulbert
@ 2013-04-17 17:27 ` Roy Sigurd Karlsbakk
1 sibling, 0 replies; 28+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-17 17:27 UTC (permalink / raw)
To: Robert L Mathews; +Cc: Linux RAID
> I personally don't even trust RAID 6. All our servers use three-disk
> RAID 1 setups, with disks from at least two different manufacturers to
> protect against firmware bricking (although this is becoming more and
> more difficult as the industry consolidates).
You can *never* trust RAID alone. Even with ZFS, you can have problems that take down a whole pool, even with RAIDz3 and ZFS' checksumming. A power surge can take down half (or even all) the drives in the array, and even with three-way mirrors, the chances are good your pool will die. So, choose something decent, like RAID-6 (RAIDz2) or mirrors, three-way if you're paranoid, and keep a good backup, preferably offsite and on tape. Tapes in a tape library can't be damaged much by a power surge (except perhaps those in the reader).
Vennlige hilsener / Best regards
roy
^ permalink raw reply [flat|nested] 28+ messages in thread
* md dropping disks too early (was: Use RAID-6!)
2013-04-16 16:44 Use RAID-6! Roy Sigurd Karlsbakk
2013-04-16 17:09 ` Mikael Abrahamsson
2013-04-16 19:52 ` Robert L Mathews
@ 2013-04-16 23:42 ` Ben Bucksch
2013-04-17 8:00 ` Mikael Abrahamsson
2 siblings, 1 reply; 28+ messages in thread
From: Ben Bucksch @ 2013-04-16 23:42 UTC (permalink / raw)
To: Linux RAID
The purpose of my RAID system is 1) to protect against hardware disk failures, i.e. a harddrive that is entirely broken and won't read at all anymore. I know that this *will* happen at some point, but it's still a fairly rare event. The chance that 2 out of 8 drives go bad *in the same week* (!) is very small.
I am also concerned about 2) bit errors and silently broken sectors, and want my RAID to detect and fix those. I am not sure that Linux md does that.
There is a good chance that a controller or some wiring is bad, and many disks fail at the same time. Neither RAID5 nor RAID6 will protect against that, but a re-cabling should fix it without data loss, as the data on the disks is not affected.
Given that this RAID array is for my personal use, and the number of disk slots in a machine is limited, and drives need 24/7 power, too, a RAID5 is the right choice for me, given the above situation.
---
BUT - and this is the main purpose of my post - Linux md causes problems by itself:
In my case, and from what I read in other posts in forums and on this mailing list, many people have the problem that Linux md simply drops a disk from the RAID5, even though there was NOT an unrecoverable hardware failure. There are many situations where this happens:
1. Upgrade (my case)
2. Disk temporarily not accessible
3. Disk has bad sectors (but the other content can still be read)
None of these should be fatal. But it seems that md marks the disk as faulty and requires a resync. There does not seem to be any way to get a disk that was once marked spare or faulty back into the array, unless I do a resync. (If somebody knows a way, please show me, see thread 'Disk wrongly marked "spare", need to force re-add it'.)
Now, the resync needs to read all data from all disks and can be the event that uncovers a problem with one of the other disks. That disk is then dropped as well, again with no way to re-add, and the array is entirely lost. However, that is completely unnecessary, given that there are often only a few bad sectors, and these - while bad - are no reason to say goodbye to several TB of data.
Essentially, by being overly cautious with the data, dropping disks too early and being too hasty about it, md actually achieves the opposite of what it was made for. It was intended to protect my data against disk problems, but md actually turns minor or even temporary problems into total data loss.
I'm not overstating, because that's the exact situation I am in right now. I have only 1 disk that's actually failing, and a RAID5, so in theory I am fine. But I see no way to safely get at my data anymore. My array is offline and I have no idea how to get it online again without risking the loss of all data. And worst of all: the whole situation was triggered by md dropping a disk from the array that wasn't even failing, just because I upgraded. :-(
Ben
^ permalink raw reply [flat|nested] 28+ messages in thread
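Ben's observation that a resync has to read every sector of every remaining disk is what makes a second, unrelated read error so likely to surface at the worst moment. A rough back-of-the-envelope sketch in Python follows. Everything numeric here is an assumption for illustration: the rate of one unrecoverable read error per 10^14 bits is a commonly quoted datasheet figure for desktop drives, errors are treated as independent, and the 2 TB drive size in the note below is made up, since the thread does not give one.

```python
import math

def p_read_error(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading
    every surviving drive in full (independence assumed)."""
    bits = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-ure_per_bit))
```

For seven surviving 2 TB drives (an 8-drive RAID5 with one member gone) this comes out at roughly 0.67 under these assumptions. Real drives often do better than the datasheet figure, so this is best read as a pessimistic bound rather than a prediction.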
* Re: md dropping disks too early (was: Use RAID-6!) 2013-04-16 23:42 ` md dropping disks too early (was: Use RAID-6!) Ben Bucksch @ 2013-04-17 8:00 ` Mikael Abrahamsson 2013-04-17 10:57 ` md dropping disks too early Ben Bucksch 0 siblings, 1 reply; 28+ messages in thread From: Mikael Abrahamsson @ 2013-04-17 8:00 UTC (permalink / raw) To: Ben Bucksch; +Cc: Linux RAID On Wed, 17 Apr 2013, Ben Bucksch wrote: > I am also concerned about 2) bit errors and silently broken sectors, and > want my RAID to detect and fix those. I am not sure that Linux md does > that. Yes it does, but you need to do frequent scrubbing to reduce the risk of hitting this when you actually need it, ie after complete drive failure. > Given that this RAID array is for my personal use, and the amount of > disk slots in a machine is limited, and drives need 24/7 power, too, a > RAID5 is the right choice for me, given the above situation. It's the combination of drive failure and other drive having read errors that RAID6 protects against. At least that's my primary use for it. -- Mikael Abrahamsson email: swmike@swm.pp.se ^ permalink raw reply [flat|nested] 28+ messages in thread
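The scrubbing Mikael mentions is driven through md's sysfs interface: writing `check` to an array's `sync_action` file starts a pass that reads every member, and `mismatch_cnt` reports inconsistencies found. The Python sketch below is a minimal illustration only. The function names and the `sysfs_root` parameter are hypothetical conveniences (the parameter exists so the code can be tried against a scratch directory), and running it against a real array needs root.

```python
from pathlib import Path

def start_check(md_name, sysfs_root="/sys"):
    """Ask md to start a consistency check of the given array, e.g. 'md0'."""
    action = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    action.write_text("check\n")

def mismatch_count(md_name, sysfs_root="/sys"):
    """Read the mismatch counter that md maintains for the array."""
    path = Path(sysfs_root) / "block" / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())
```

Distributions that scrub by default typically wrap this same sysfs write in a monthly cron job, which is what the follow-ups about CentOS and Ubuntu refer to.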
* Re: md dropping disks too early 2013-04-17 8:00 ` Mikael Abrahamsson @ 2013-04-17 10:57 ` Ben Bucksch 2013-04-17 15:03 ` Keith Keller 2013-04-17 18:09 ` Roy Sigurd Karlsbakk 0 siblings, 2 replies; 28+ messages in thread From: Ben Bucksch @ 2013-04-17 10:57 UTC (permalink / raw) To: Mikael Abrahamsson; +Cc: Linux RAID Mikael Abrahamsson wrote, On 17.04.2013 10:00: > Yes it does, but you need to do frequent scrubbing to reduce the risk > of hitting this when you actually need it, ie after complete drive > failure. No, it's not "me" who needs to do that. The software needs to be set up by default to do that, be it the kernel or some userland cron job from the distro (advantage of latter: configurable). Apparently, Ubuntu 10.04 didn't do that. Please stop blaming users, start blaming the software, and fix it. > It's the combination of drive failure and other drive having read > errors that RAID6 protects against. At least that's my primary use for > it. But a single read error is no reason to send the whole array to the trash. RAID6 is merely a workaround here. With joy, I read that this problem was described, recognized and intended to be fixed by the developers: http://neil.brown.name/blog/20110216044002#1 "Bad Block Log" Unfortunately, that doesn't seem to be done, as I was running into exactly that problem he describes. I hope somebody will fix that, because he eloquently describes how the RAID achieves the opposite of what it's intended to do. Ben ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: md dropping disks too early 2013-04-17 10:57 ` md dropping disks too early Ben Bucksch @ 2013-04-17 15:03 ` Keith Keller 2013-04-17 18:09 ` Roy Sigurd Karlsbakk 1 sibling, 0 replies; 28+ messages in thread From: Keith Keller @ 2013-04-17 15:03 UTC (permalink / raw) To: linux-raid On 2013-04-17, Ben Bucksch <linux.news@bucksch.org> wrote: > > No, it's not "me" who needs to do that. The software needs to be set up > by default to do that, be it the kernel or some userland cron job from > the distro (advantage of latter: configurable). Apparently, Ubuntu 10.04 > didn't do that. CentOS (and, by implication, RHEL) has had this check since version 5 (not sure which minor version). --keith -- kkeller@wombat.san-francisco.ca.us ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: md dropping disks too early
2013-04-17 10:57 ` md dropping disks too early Ben Bucksch
2013-04-17 15:03 ` Keith Keller
@ 2013-04-17 18:09 ` Roy Sigurd Karlsbakk
1 sibling, 0 replies; 28+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-17 18:09 UTC (permalink / raw)
To: Ben Bucksch; +Cc: Linux RAID, Mikael Abrahamsson
> No, it's not "me" who needs to do that. The software needs to be set up
> by default to do that, be it the kernel or some userland cron job from
> the distro (advantage of latter: configurable). Apparently, Ubuntu 10.04
> didn't do that.
> Please stop blaming users, start blaming the software, and fix it.
Not sure about Ubuntu 10.04, but 12.04 and later have this cron'ed for the first Sunday of the month.
Vennlige hilsener / Best regards
roy
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2013-04-17 18:09 UTC | newest]
Thread overview: 28+ messages
2013-04-16 16:44 Use RAID-6! Roy Sigurd Karlsbakk
2013-04-16 17:09 ` Mikael Abrahamsson
2013-04-16 17:25 ` Roy Sigurd Karlsbakk
2013-04-16 20:01 ` David Brown
2013-04-17 7:56 ` Mikael Abrahamsson
2013-04-17 9:26 ` David Brown
2013-04-16 19:52 ` Robert L Mathews
2013-04-16 20:05 ` Carsten Aulbert
2013-04-16 20:19 ` Roman Mamedov
2013-04-16 22:44 ` Robert L Mathews
2013-04-17 0:20 ` Ben Bucksch
2013-04-17 1:35 ` Adam Goryachev
2013-04-17 4:27 ` Robert L Mathews
2013-04-17 4:45 ` Adam Goryachev
2013-04-17 6:06 ` Stan Hoeppner
2013-04-17 11:13 ` Ben Bucksch
2013-04-17 11:32 ` Adam Goryachev
2013-04-17 11:51 ` Ben Bucksch
2013-04-17 17:50 ` Roy Sigurd Karlsbakk
2013-04-17 3:32 ` Robert L Mathews
2013-04-17 4:20 ` Roman Mamedov
2013-04-17 5:22 ` Robert L Mathews
2013-04-17 17:27 ` Roy Sigurd Karlsbakk
2013-04-16 23:42 ` md dropping disks too early (was: Use RAID-6!) Ben Bucksch
2013-04-17 8:00 ` Mikael Abrahamsson
2013-04-17 10:57 ` md dropping disks too early Ben Bucksch
2013-04-17 15:03 ` Keith Keller
2013-04-17 18:09 ` Roy Sigurd Karlsbakk