From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Myers
Subject: Re: Need urgent help in fixing raid5 array
Date: Fri, 2 Jan 2009 12:56:31 -0800 (PST)
Message-ID: <680071.18057.qm@web30803.mail.mud.yahoo.com>
References: <451872.61166.qm@web30802.mail.mud.yahoo.com>
 <467705.96388.qm@web30807.mail.mud.yahoo.com>
 <344038.60917.qm@web30808.mail.mud.yahoo.com>
 <17665.11566.qm@web30804.mail.mud.yahoo.com>
 <746863.34803.qm@web30803.mail.mud.yahoo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Sender: linux-raid-owner@vger.kernel.org
To: Justin Piszcz
Cc: linux-raid@vger.kernel.org, john lists
List-Id: linux-raid.ids

BTW, several of the disks that md didn't want to assemble have perfectly valid uuid info on them. If I manually specify the devices on the --assemble line and add the ones that are missing (but have a valid uuid for the array), md refuses to assemble them, even with the force option. The disks can't all be bad, can they? That's 4 drives out of 14 that would all have had to go bad at once.

thx
Mike

----- Original Message ----
From: Justin Piszcz
To: Mike Myers
Cc: linux-raid@vger.kernel.org; john lists
Sent: Friday, January 2, 2009 10:57:13 AM
Subject: Re: Need urgent help in fixing raid5 array

On Fri, 2 Jan 2009, Mike Myers wrote:

> Well, I can read from sdg1 just fine. It seems to work ok, at least for a few GB of data. I'll try this on some of the other disks, but it is possible to pull the disks out of the backplane, run the SFF-8087 fanout cables direct to each drive, and bypass the backplane completely. It certainly would be easy to do this for at least the sdo1 drive and see if I can get better results going direct to the disk. I have moved the disks around the backplane a bit to deal with the issues of the controller failure, so I am pretty sure it's not just one bad slot or the like.
>
> So you've seen a backplane fail in a way that the disks come up fine at boot but have corrupted data transfers across them? I wonder about the SATA cables in that case as well. I could hook up a pair of PMPs to my SI3132s and bypass the 8087 cables as well.

1. Try bypassing the backplane.
2. Bad cables will usually cause the SMART attribute UDMA_CRC_Error_Count
   to climb quite high; if it is 0 or close to it, the cable is unlikely
   to be the issue.
3. I have seen all kinds of weirdness with bad backplanes: drives dropping
   out of the array, drives producing I/O errors, etc.

Justin.
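
For anyone wanting to script the check in point 2 across all 14 drives, here is a rough Python sketch that pulls the raw UDMA_CRC_Error_Count out of `smartctl -A` output. It assumes smartmontools is installed and that the attribute table uses the usual ten-column layout with the raw value last. The function names, the sample text, and the /dev/sdg device path are made up for illustration; they are not from the thread.

```python
import subprocess

# Illustrative sample of `smartctl -A` attribute rows (values are made up).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""


def parse_crc_error_count(smartctl_output):
    """Return the raw UDMA_CRC_Error_Count as an int, or None if not found."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value was not a plain integer
    return None


def read_crc_error_count(device):
    """Run `smartctl -A` on a whole-disk device (hypothetical helper)."""
    # smartctl uses non-zero exit bits for warnings, so do not raise on them.
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_crc_error_count(result.stdout)


if __name__ == "__main__":
    # Parse the canned sample; on a real box call e.g.
    # read_crc_error_count("/dev/sdg") as root instead.
    print(parse_crc_error_count(SAMPLE))
```

A count at or near zero on every drive would point away from the cables, per point 2; it says nothing either way about the backplane itself.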