From: David Brown
Subject: Re: RAID 5 - One drive dropped while replacing another
Date: Wed, 02 Feb 2011 22:15:31 +0100
To: linux-raid@vger.kernel.org

On 02/02/11 17:29, hansbkk@gmail.com wrote:
> On Wed, Feb 2, 2011 at 11:03 PM, Scott E. Armitage wrote:
>> RAID1+0 can lose up to half the drives in the array, as long as no
>> single mirror loses all its drives. Instead of only being able to
>> survive "the right pair", it's quite the opposite: RAID1+0 will only
>> fail if "the wrong pair" of drives fails.
>
> AFAICT it's a glass half-full/half-empty thing. Maybe it's just my
> personality, but I don't like leaving such things to chance. Maybe if
> I had more than two drives per array, but that would be *very*
> inefficient (i.e. an expensive usable-space ratio).
>
> However, following up on the "spare-group" idea, I'd like
> confirmation please that this scenario would work.
>
> From the man page:
>
>     mdadm may move a spare drive from one array to another if they
>     are in the same spare-group and if the destination array has a
>     failed drive but no spares.
>
> Given that all component drives are the same size, mdadm.conf
> contains:
>
>     ARRAY /dev/md0 level=raid1 num-devices=2 spare-group=bigraid10
>     ARRAY /dev/md1 level=raid1 num-devices=2 spare-group=bigraid10
>     etc.
>
> I then add any number of spares to any of the RAID1 arrays (which
> under RAID 1+0 would in turn be components of the RAID0 span one
> layer up - personally I'd use LVM for this), and the follow/monitor
> mode feature would allocate the spares to whichever RAID1 array
> needed them.
>
> Does this make sense?
>
> If so, I would regard this as more fault-tolerant than RAID6, with
> the big advantage of fast rebuild times - and performance advantages
> too, especially on writes - but obviously at a relatively higher
> cost.

You have to be precise about what you mean by fault-tolerant.

With RAID6, /any/ two drives can fail and your system is still
running. Hot spares don't change that - they just minimise the time
before one of the failed drives is replaced.

If you have a set of RAID1 pairs that are striped together (by LVM or
RAID0), then you are only guaranteed to tolerate a single failed
drive. You /might/ tolerate more failures. For example, if you have 4
pairs and one drive has already failed, a random second failure lands
on one of the 7 remaining drives, and 6 of those are on a different
pair - so there is a 6/7 chance the second failure is safe. If you
crunch the numbers, the average or expected number of failures you can
tolerate comes out above 2. But in the guaranteed worst case, the set
can only tolerate a single drive failure. Again, hot spares don't
change that - they only reduce your degraded (and therefore risky)
time.
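For anyone who wants to crunch those numbers themselves, below is a
small stand-alone Python sketch (my own illustration, nothing
mdadm-specific - the drive count and pair layout are just the 4-pair
example above). It enumerates every combination of failed drives and
also computes the expected number of failures the set tolerates:

    #!/usr/bin/env python3
    # Brute-force check of the striped-RAID1-pairs argument:
    # 8 drives form 4 mirrors: (0,1), (2,3), (4,5), (6,7).
    from itertools import combinations

    PAIRS = 4
    drives = range(2 * PAIRS)

    def survives(failed):
        # The set survives as long as no mirror has lost both drives.
        return all({2 * k, 2 * k + 1} - failed for k in range(PAIRS))

    expected = 0.0
    for f in range(1, 2 * PAIRS + 1):
        combos = list(combinations(drives, f))
        ok = sum(survives(set(c)) for c in combos)
        p = ok / len(combos)
        expected += p  # E[tolerated] = sum over f of P(survive f failures)
        print(f"{f} failed: {ok}/{len(combos)} combinations survive ({p:.3f})")

    print(f"expected tolerated failures: {expected:.2f}")

It reports 24/28 = 6/7 survival for a second failure, and an
expectation of about 2.66 tolerated failures - more than 2 on average,
as suggested, but the guaranteed worst case is still a single drive.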