From: Lars Marowsky-Bree
Subject: Re: [PATCH] proactive raid5 disk replacement for 2.6.11, updated
Date: Thu, 18 Aug 2005 12:24:58 +0200
Message-ID: <20050818102458.GJ13344@marowsky-bree.de>
In-Reply-To: <17156.7305.638579.812295@cse.unsw.edu.au>
References: <1124322731.3810.77.camel@localhost.localdomain>
 <17156.7305.638579.812295@cse.unsw.edu.au>
To: Neil Brown, Pallai Roland
Cc: linux-raid@vger.kernel.org

On 2005-08-18T15:28:41, Neil Brown wrote:

> If we want to mirror a single drive in a raid5 array, I would really
> like to do that using the raid1 personality.
> e.g.
>    suspend io
>    remove the drive
>    build a raid1 (with no superblock) using the drive
>    add that back into the array
>    resume io

I hate to say this, but this is something the Device Mapper framework
already provides, with its suspend/resume operations and the ability
to change the mapping atomically. Maybe copying some of the ideas
would be useful.

Freeze, reconfigure one disk to be a RAID1, resume - all IO goes on
while, at the same time, said RAID1 re-mirrors to the new disk. Repeat
with a removal later.

> To handle read failures, I would like the first step to be to re-write
> the failed block. I believe most (all?) drives will relocate the
> block if a write cannot succeed at the normal location, so this will
> often fix the problem.

Yes. This would be highly useful.

> A userspace process can then notice an unacceptable failure rate and
> start a mirror/swap process as above.

Agreed. Combined with SMART monitoring, this could provide highly
useful features.

> This possibly doesn't handle the possibility of a write failing very
> well, but I'm not sure what your approach does in that case. Could
> you explain that?

I think a failed write can't really be handled - it might be retried
once or twice, but then the way to proceed is to kick the drive and
rebuild the array.

> It also means that if the raid1 rebuild hits a read error it cannot
> cope, whereas your code would just reconstruct the block from the
> rest of the raid5.

Good point. One way to fix this would be a callback one level up:
"Hi, I can't read this section, can you reconstruct it and give it to
me?" (Which is a pretty ugly hack.)

However, that would also assume that the data on the disk which _can_
be read can still be trusted. I'm not sure I'd buy that myself,
unverified. But a periodic background consistency check for RAID might
help convince users that this is indeed the case ;-)

If you can no longer proactively reconstruct the disk because it has
indeed failed, maybe treating it like a failed disk and rebuilding the
array in the "classic" fashion isn't the worst idea, though.

Sincerely,
    Lars Marowsky-Brée

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
	-- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"
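
For concreteness, a minimal sketch of the raid1 mirror/swap sequence
Neil describes, using mdadm invocations as they exist today; the device
names (/dev/md0, /dev/md1, /dev/sdc1, /dev/sdd1) are hypothetical, and
the "suspend io"/"resume io" steps have no md primitive in 2.6.11 -
that atomic swap is exactly the missing piece under discussion, so
they appear only as comments:

    # Assumed layout: /dev/md0 is the raid5, /dev/sdc1 the suspect
    # member, /dev/sdd1 the replacement disk.

    # "suspend io" - no md command for this yet; the array would have
    # to be quiesced externally.

    # Take the suspect drive out of the raid5.
    mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1

    # Build a superblock-less raid1 over it, second slot left empty.
    mdadm --build /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 missing

    # Put the raid1 back into the raid5 in place of the raw disk.
    mdadm /dev/md0 --add /dev/md1

    # "resume io", then attach the new disk; the raid1 resync copies
    # the data across while the raid5 keeps running on top.
    mdadm /dev/md1 --add /dev/sdd1

Without the atomic swap, the --fail/--add pair above would make the
raid5 treat /dev/md1 as a fresh spare and run a full reconstruction,
which is precisely the cost the proposal wants to avoid.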
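
Similarly, a sketch of the Device Mapper freeze/reload/resume idea
mentioned above, assuming (hypothetically) that the raid5 member had
been set up from the start as a dm device named "raiddisk" mapping
linearly onto /dev/sdc1, so its table can be swapped while the array
stays live:

    # Size of the member in 512-byte sectors.
    SECTORS=$(blockdev --getsz /dev/sdc1)

    # Freeze: I/O to the mapping queues above it.
    dmsetup suspend raiddisk

    # Swap in a dm-mirror table over the old and new disks; the mirror
    # target re-syncs /dev/sdc1 onto /dev/sdd1 in the background.
    echo "0 $SECTORS mirror core 1 1024 2 /dev/sdc1 0 /dev/sdd1 0" |
        dmsetup load raiddisk

    # Resume: I/O continues while the re-mirror runs.
    dmsetup resume raiddisk

Once the resync finishes, a second suspend/load/resume can replace the
mirror with a plain linear mapping onto /dev/sdd1 - the "removal
later" step.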