From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Greaves
Subject: Re: Raid Recovery after Machine Failure
Date: Sun, 13 Mar 2005 09:47:49 +0000
Message-ID: <42340C45.7060905@dgreaves.com>
References: <94a5aa6f5e94171df4c070e741014f02@stanford.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <94a5aa6f5e94171df4c070e741014f02@stanford.edu>
Sender: linux-raid-owner@vger.kernel.org
To: Can Sar
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

I *think* this is correct - I'm a user, not a coder. If nothing else it
should help you search the archives for clarification :)

In general I think the answer lies with md's superblocks.

Can Sar wrote:
> Hi,
>
> I am working with a research group that is currently building a tool
> to automatically find bugs in file systems, and I have some related
> questions. We are trying to check whether file systems really
> guarantee the consistency they promise, and one aspect we are looking
> at is running them on top of RAID devices. In order to do this we
> have to understand a few things about the Linux RAID driver/tools,
> and I haven't been able to figure this out from the
> documentation/source code, so maybe you can help me.
> I asked this same question a few days ago, but I don't think I stated
> it clearly, so let me try to rephrase it.
>
> For RAID 4-6, say with 5 disks, suppose we write a block that is
> striped across all the disks, and after 4 of the disks have written
> their part of the stripe the machine crashes before the 5th disk can
> complete its write. Because of this, the checksum for this stripe
> should be incorrect, right?

If I understand correctly, each device's superblock is updated as the
array runs - in this case the superblock on disk 5 ends up out of step
with those on disks 1-4. That means disk 5 is kicked out on restart and
the array re-syncs, using disks 1-4 to rebuild (or verify - I'm not
sure which) disk 5. (I've put a small worked example of the stripe
parity going stale after my sig.)

> The RAID array is a Linux soft RAID array set up using mdadm, and
> none of the disks actually crashed or had any write errors during
> this operation (the machine crashed for some other reason). We then
> reboot the machine and recreate the array; this should be
> 'automatic',

It's not so much 'recreate' ('recreate' has a special, recovery-related
meaning in md terminology) as 'assemble and start the md device' -
mdadm --assemble rather than mdadm --create. (The commands I'd use are
also after my sig.)

> then remount it and then try to read the sector that was previously
> written (that has an incorrect checksum). At what point will the RAID
> driver discover that something is wrong?

As it starts, it checks the superblock sequence numbers, notices that
one disk is out of date and doesn't use it.

> Will it ever? (I feel that it should discover this during the read at
> the latest.) Will it try to perform any kind of recovery or simply
> fail?

So, since that disk's superblock is stale, the array starts in
'degraded' mode and resyncs.

> How would this change if only 3 of the 5 disk writes made it to disk?
> Fixing the error would be impossible of course (at least with RAID 4
> and 5; I know little about 6), but detection should still work. Will
> the driver complain?

I don't know what happens if the superblock fails to update on, say, 3
out of 6 disks in an array. The driver _will_ complain.

Newer kernels have an experimental fault-injection facility that you
may be interested in, CONFIG_MD_FAULTY:

  The "faulty" module allows for a block device that occasionally
  returns read or write errors. It is useful for testing.

(Again, there's an example after my sig.)

HTH

David
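
P.S. To make the parity question concrete (this is my own illustration
with made-up values - RAID 4/5 keep XOR parity rather than a checksum
as such): take a 3-data-disk stripe with parity P = D1 xor D2 xor D3.

  before the write:    D1=0x0f  D2=0x33  D3=0x55  P=0x69  (= 0x0f^0x33^0x55)
  intended new state:  D1=0xff  D2=0x33  D3=0x55  P=0x99
  crash after D1 hits disk but before P does:
                       D1=0xff  D2=0x33  D3=0x55  P=0x69  (stale)

Now D1 xor D2 xor D3 = 0x99, which no longer matches the stored parity
0x69. Nothing on the disks themselves flags that mismatch, which is why
md falls back on the superblocks to decide that a resync is needed.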
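
P.P.S. In case it helps your testing, this is roughly what I'd run
after the crash - the device names are only examples and I'm typing
from memory, so check the mdadm man page:

  # look at each member's superblock (including its event count)
  mdadm --examine /dev/sda1
  mdadm --examine /dev/sde1

  # assemble (not --create!) the existing array from its members
  mdadm --assemble /dev/md0 /dev/sd[a-e]1
  # or, if the array is listed in /etc/mdadm.conf:
  mdadm --assemble --scan

  # watch it come up degraded and then resync
  cat /proc/mdstat
  mdadm --detail /dev/md0

--examine is where you can see the per-device event count that the
kernel compares when deciding a member is stale.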
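
P.P.P.S. For CONFIG_MD_FAULTY: I haven't used it myself, so treat this
as a sketch and check the mdadm man page for the exact layout names,
but I believe it is set up as a single-device md array something like:

  # wrap a real device in a 'faulty' md device
  mdadm --create /dev/md1 --level=faulty --raid-devices=1 /dev/sdf1
  # then switch on an error mode, e.g. transient read errors
  mdadm --grow /dev/md1 --layout=read-transient

You would then build your RAID 5 array (or run the filesystem) on top
of /dev/md1 and let it inject errors for you.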