linux-raid.vger.kernel.org archive mirror
* How to avoid complete rebuild of RAID 6 array (6/8 active devices)
From: Dave Moon @ 2008-06-25  6:37 UTC
  To: linux-raid

Hi Everyone,

First off, a little background on my setup.

Ubuntu 7.10 i386 Server (2.6.22-14-server)
upgraded to
Ubuntu 8.04 i386 Server (2.6.24-19-server)

I have 8 SATA drives connected and the drives are organized into three  
md RAID arrays as follows:

/dev/md1: ext3 partition mounted as /boot, composed of 8 members (RAID  
1) (sda1/b1/c1/d1/e1/f1/g1/h1)
/dev/md2: ext3 partition mounted as /root, composed of 8 members (RAID  
1) (sda2/b2/c2/d2/e2/f2/g2/h2)
/dev/md3: ext3 partition mounted as /mnt/raid-md3, composed of 8
members (RAID 6) (sda3/b3/c3/d3/e3/f3/g3/h3). This is the main data
partition, holding 2.7 TiB of data.

All the raid member partitions are set to type "fd" (Linux RAID  
Autodetect).

Important Note: 6 of the drives are connected to two Sil3114 SATA  
controller cards whilst 2 of the drives are connected to the on-board  
SATA controller (I don't know which model it is).

After upgrading my Ubuntu installation to 8.04, the system restarted
with an error message saying that my RAID arrays were degraded and
that it was therefore unable to boot from them.

At the time, not knowing the cause of the sudden RAID failure, I
attempted to force mdadm to start the arrays anyway (the RAID 1
arrays with 8 members each were no cause for concern, of course, but
I wanted to back up my data on the degraded md3 array as soon as
possible).

Then it hit me: why would it recognize only 6 drives? Apparently the
kernel has compatibility problems with certain SATA controllers,
and my on-board controller chip was one of them.

Sure enough, after moving all 8 drives to the Silicon Image  
controllers, the drives were all recognized without any problems.

Had the missing drives been recognized before the array was brought
up again, everything would have been fine. Unfortunately, I had
already forced mdadm (--run switch) to bring it online with 2 missing
members.

This is where the problem began. I know that as soon as I re-add the
two missing drives to the md3 (RAID 6) array, the system will
attempt to rebuild them, using the information from the remaining
6 drives.

Given the size of the array and the type of disk drives being used
(off-the-shelf SATA drives with a specified bit error rate of 1 in
10^14 bits), I think it is highly likely that the system will
encounter one or more bit errors during the rebuild.
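
As a rough check on that estimate, here is a minimal Python sketch of
the arithmetic. It assumes that bit errors are independent, that the
spec figure of 1 in 10^14 is the actual rate rather than a worst-case
bound, and that a rebuild reads about 2.7 TiB in total from the 6
remaining drives. The function name and the 2.7 TiB read volume are
assumptions for illustration, not something mdadm reports.

```python
import math

TIB = 2 ** 40  # bytes in one tebibyte


def p_at_least_one_ure(bytes_read: float, ber: float = 1e-14) -> float:
    """Probability of at least one bit error while reading bytes_read bytes.

    Assumes independent bit errors at rate `ber` (errors per bit read).
    """
    bits = bytes_read * 8
    # 1 - (1 - ber) ** bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-ber))


if __name__ == "__main__":
    # Assumed read volume for the rebuild: 2.7 TiB (hypothetical figure)
    p = p_at_least_one_ure(2.7 * TIB)
    print(f"Chance of at least one bit error: {p:.1%}")
```

Under those assumptions the chance of hitting at least one error comes
out at roughly one in five, and it grows with the amount of data the
rebuild actually has to read.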

Anyway, I panicked and brought the md3 array down first to prevent  
possible further damage.

So, at this stage what I'm wondering is:

1. If mdadm encounters a bit error during a RAID 6 rebuild, will it  
just give up on that particular file and move on to recover other data  
on the array? Or will it trash the entire array?

2. Is it possible to cheat mdadm by somehow replacing the new "raid
metadata" on the 6 drives with the old metadata from the 2 drives?
Would that make mdadm think the array is clean and consistent, as if
nothing had ever happened?

Please do note that I did not write ANY new data onto the RAID 6 array
from the time it was degraded until the time I brought it down with
--stop.
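
For reference, the metadata difference between the members can be
inspected before changing anything. The sketch below assumes the
"Events" counter that `mdadm --examine` prints for each member, and
flags members whose counter is behind the newest one. The helper
names, the member labels and the sample text are made up for
illustration; real `--examine` output contains many more fields, and
this only reads text, it never touches the disks.

```python
import re


def parse_events(examine_output: str) -> tuple[int, ...]:
    """Extract the Events counter from `mdadm --examine` style text.

    Accepts both a plain number ("1500") and a dotted pair ("0.1500").
    """
    match = re.search(r"^\s*Events\s*:\s*([\d.]+)\s*$",
                      examine_output, re.MULTILINE)
    if match is None:
        raise ValueError("no Events line found")
    return tuple(int(part) for part in match.group(1).split("."))


def stale_members(outputs: dict[str, str]) -> list[str]:
    """Return the members whose event counter is behind the newest one."""
    events = {dev: parse_events(text) for dev, text in outputs.items()}
    newest = max(events.values())
    return sorted(dev for dev, ev in events.items() if ev < newest)


# Hypothetical sample text, not taken from the real array
sample = {
    "uptodate-member": "     Raid Level : raid6\n         Events : 0.1500\n",
    "dropped-member": "     Raid Level : raid6\n         Events : 0.1492\n",
}

if __name__ == "__main__":
    print(stale_members(sample))
```

With the made-up sample, only the member with the lower counter is
reported as stale.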

Sorry for the long post, and thank you in advance for your time. I
really hope to get this RAID array back up without data corruption,
because I don't have a working backup of the array (I know, very
stupid of me).

Dave

Thread overview: 11+ messages
2008-06-25  6:37 How to avoid complete rebuild of RAID 6 array (6/8 active devices) Dave Moon
2008-06-25 16:13 ` Andre Noll
2008-06-27 10:40   ` Neil Brown
2008-06-29 21:58     ` Bill Davidsen
2008-07-14 10:44       ` Matthias Urlichs
2008-07-14 16:14         ` David Greaves
2008-07-14 16:54           ` David Lethe
2008-07-14 22:58           ` Matthias Urlichs
2008-07-14 23:54             ` Richard Scobie
2008-07-15  0:05               ` Matthias Urlichs
2008-07-15 14:24             ` Keld Jørn Simonsen
