From mboxrd@z Thu Jan 1 00:00:00 1970
From: berk walker
Subject: Re: Severe, huge data corruption with softraid
Date: Wed, 02 Mar 2005 19:10:57 -0500
Message-ID: <42265611.4020307@panix.com>
References: <42264AF4.4000600@tls.msk.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <42264AF4.4000600@tls.msk.ru>
Sender: linux-raid-owner@vger.kernel.org
To: Michael Tokarev
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Just a thought: have you tried swapping power supplies, and have you
checked/improved the system's earth ground?

b-

Michael Tokarev wrote:

> Too bad I can't diagnose the problem correctly, but it is
> here somewhere, and is (hardly) reproducible.
>
> I'm doing a lot of experiments right now with various raid options
> and read/write speeds.  And 3 times now, the whole system went boom
> during the experiments.  It is writing into random places on all
> disks, including boot sectors, partition tables and whatnot, so
> obviously every filesystem out there becomes corrupt to hell.
>
> It seems the problem is due to an integer overflow somewhere in the raid
> (very probably raid5) or ext3fs code, as it starts writing to the
> beginning of all disks instead of to the raid partitions being tested.
> It *may* be related to direct-io (O_DIRECT) into a file on an ext3
> filesystem which is on top of a softraid5 array.  It may also be
> related to the raid10 code, but that is less likely.
>
> Here's the scenario.
>
> I have 7 scsi disks, sda..sdg, 36GB each.
> On each drive there's a 3GB partition at the end (sdX10)
> where I'm testing stuff.
> I tried to create various raid arrays out of those sdX10 partitions,
> including raid5 (various chunk sizes), raid1+raid0 and raid10.
> On top of the raid array, I also tried to create an ext3 fs.
> And I did various read/write tests on both the md device (without the
> filesystem) and a file on the filesystem.
> The tests: just sequential reads and writes with various I/O sizes
> (8k, 16k, 32k, ..., 1m) and various O_DIRECT/O_SYNC/fsync() combinations.
>
> Of course I created/stopped raid arrays (all on the same sdX10
> partitions), created, mounted and unmounted filesystems on those arrays,
> and did a lot of reading and writing.  I'm sure I didn't access other
> devices during all this testing (like trying to write to /dev/sdX instead
> of /dev/sdX10), and did not write to the device while there was a
> filesystem mounted on it.  And yes, my /dev/ major/minor numbers are
> correct (just verified to be sure).
>
> The symptom is simple: at some point, the partition table on /dev/sdX
> becomes corrupt (either the primary one or the extended one, which sits
> at about 1.2GB from the start of each disk), along with a lot of other
> stuff, mostly at the beginning of the disks -- on all but one or two of
> the disks involved in the testing.
>
> We lost the system this way after the first series of tests, and during
> the re-install (as there's no data anymore anyway), I decided to perform
> some more testing, and hit the same problem again and (after restoring
> the partition tables) yet again.
>
> All my attempts to reproduce it deliberately have failed so far, but
> when I didn't watch the partition tables after each operation, it
> happened again after yet another series of tests.
>
> One note: every time before it "crashed", I had tried to create/use a
> raid5 array out of 3, 4 or 5 drives with a chunk size of 4KB (each
> partition is 3GB in size), and -- if I recall correctly -- experimented
> with direct writes to the filesystem created on top of the array.
> Maybe it dislikes a chunk size this small...
>
> Now it's 02:18 here, deep in the night, and I'm still in the office -- I
> have to re-install the server by morning so our users will have something
> to do, so I have very limited time for more testing.  Any quick
> suggestions about what/where to look at right now are welcome...
>
> BTW, the hardware is good: drives, memory, mobo and CPUs.
> This happened on either 2.6.10 or 2.6.9 the first time; the box is now
> running 2.6.9.
>
> /mjt
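
For reference, the kind of sequential O_DIRECT write test described above
looks roughly like the sketch below.  This is illustrative only, not the
actual test program: the target path, block size and total size are
made-up placeholders, and the block size would be varied (8k ... 1m) as in
the quoted tests.

/* Minimal sketch of a sequential O_DIRECT write test.
 * TARGET, BLKSZ and SIZE_MB are illustrative placeholders. */
#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define TARGET  "/mnt/test/bigfile"   /* e.g. a file on the md array */
#define BLKSZ   (64 * 1024)           /* vary: 8k, 16k, ..., 1m */
#define SIZE_MB 512

int main(void)
{
    void *buf;
    int fd, i;

    /* O_DIRECT needs a buffer aligned to the device block size */
    if (posix_memalign(&buf, 4096, BLKSZ))
        return 1;
    memset(buf, 0xAA, BLKSZ);

    fd = open(TARGET, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* sequential writes of BLKSZ bytes until SIZE_MB is written */
    for (i = 0; i < SIZE_MB * 1024 * 1024 / BLKSZ; i++) {
        if (write(fd, buf, BLKSZ) != BLKSZ) {
            perror("write");
            return 1;
        }
    }
    if (fsync(fd) < 0)
        perror("fsync");
    close(fd);
    free(buf);
    return 0;
}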
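Since the corruption shows up first in the partition sectors, one cheap
way to catch it between test runs is to compare the first 512 bytes of
each disk against a copy saved before the run (this only covers the
primary table in sector 0; the extended table around 1.2GB would need the
same treatment).  A rough sketch -- the device list and the path of the
reference copies are assumptions:

/* Sketch: compare sector 0 of each disk to a saved reference copy,
 * so partition-table corruption is noticed right after a test run. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int read_sector0(const char *path, unsigned char *buf)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, 512);
    close(fd);
    return n == 512 ? 0 : -1;
}

int main(void)
{
    const char *disks[] = { "/dev/sda", "/dev/sdb", "/dev/sdc",
                            "/dev/sdd", "/dev/sde", "/dev/sdf", "/dev/sdg" };
    unsigned char now[512], ref[512];
    char refpath[64];
    size_t i;

    for (i = 0; i < sizeof(disks) / sizeof(disks[0]); i++) {
        /* reference copies assumed to be saved as /root/mbr-ref/sdX.bin */
        snprintf(refpath, sizeof(refpath), "/root/mbr-ref/sd%c.bin",
                 'a' + (int)i);
        FILE *f = fopen(refpath, "rb");
        if (!f || fread(ref, 1, 512, f) != 512) {
            fprintf(stderr, "%s: no reference copy\n", disks[i]);
            if (f) fclose(f);
            continue;
        }
        fclose(f);
        if (read_sector0(disks[i], now) < 0) {
            fprintf(stderr, "%s: read failed\n", disks[i]);
            continue;
        }
        printf("%s: %s\n", disks[i],
               memcmp(now, ref, 512) ? "CHANGED - partition sector differs!"
                                     : "ok");
    }
    return 0;
}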