From mboxrd@z Thu Jan 1 00:00:00 1970
From: Robert Hancock
Subject: Re: Corrupt data - RAID sata_sil 3114 chip
Date: Mon, 19 Jan 2009 20:50:06 -0600
Message-ID: <49753BDE.8050403@shaw.ca>
References: <200901032104.15242.bs@q-leap.de> <496436C4.4070305@kernel.org>
 <49643FD4.9080100@shaw.ca> <200901071632.02264.bs@q-leap.de>
 <49693E08.3050209@shaw.ca> <49694094.60501@shaw.ca>
 <496A9D42.4000302@kernel.org> <20090119184304.GB30365@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20090119184304.GB30365@redhat.com>
Sender: linux-raid-owner@vger.kernel.org
To: Dave Jones
Cc: Tejun Heo, Bernd Schubert, Alan Cox, Justin Piszcz,
 debian-user@lists.debian.org, linux-raid@vger.kernel.org,
 linux-ide@vger.kernel.org
List-Id: linux-ide@vger.kernel.org

Dave Jones wrote:
> On Mon, Jan 12, 2009 at 10:30:42AM +0900, Tejun Heo wrote:
> > Robert Hancock wrote:
> > >> There are apparently some reports of issues on NVidia chipsets as
> > >> well, though I don't have any details at hand.
> > >
> > > Well, Carlos' email bounces, so much for that one. Anyone have any other
> > > contacts at Silicon Image?
> >
> > I'll ping my SIMG contacts but I've pinged about this problem in the
> > past but it didn't get anywhere.
>
> I wish I'd read this thread last week.. I've been beating my head
> against this problem all weekend.
>
> I picked up a cheap 3114 card, and found that when I created a filesystem
> with it on a 250GB disk, it got massive corruption very quickly.
>
> My experience echos most the other peoples in this thread, but here's
> a few data points I've been able to figure out..
>
> I ran badblocks -v -w -s on the disk, and after running
> for nearly 24 hours, it reported a huge number of blocks
> failing at the upper part of the disk.
>
> I created a partition in this bad area to speed up testing..
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sde1               1       30000   240974968+  83  Linux
> /dev/sde2           30001       30200     1606500   83  Linux
> /dev/sde3           30201       30401     1614532+  83  Linux
>
> Rerunning badblocks on /dev/sde2 consistently fails when
> it gets to the reading back 0x00 stage.
> (Somehow it passes reading back 0xff, 0xaa and 0x55)
>
> I was beginning to suspect the disk may be bad, but when I
> moved it to a box with Intel sata, the badblocks run on that
> same partition succeeds with no problems at all.
>
> Given the corruption happens at high block numbers, I'm wondering
> if maybe there's some kind of wraparound bug happening here.
> (Though why only the 0x00 pattern fails would still be a mystery).

Yeah, that seems a bit bizarre.. Apparently somehow zeros are being
converted into non-zero.. Can you try zeroing out the partition by
dd'ing into it from /dev/zero or something, then dumping it back out to
see what kind of data is showing up?

>
> After reading about the firmware update fixing it, I thought I'd
> give that a shot. This was pretty much complete fail.
>
> The DOS utility for flashing claims I'm running BIOS 5.0.39,
> which looking at http://www.siliconimage.com/support/searchresults.aspx?pid=28&cat=15
> is quite ancient. So I tried the newer ones.
> Same experience with both 5.4.0.3, and 5.0.73
>
> "BIOS version in the input file is not a newer version"
>
> Forcing it to write anyway gets..
>
> "Data is different at address 65f6h"
>
> 	Dave
>
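
For anyone wanting to try the zero-fill check suggested above (write
zeros over the partition, then read it back and see what is no longer
zero), here is a minimal sketch in Python. It is only an illustration,
not a tested diagnostic for this controller: the function name
find_nonzero_runs, the chunk size and the 16-byte preview are all
assumptions made for the example. It assumes the partition has already
been filled with zeros (for instance with dd from /dev/zero) and that
the script is allowed to read the device.

```python
def find_nonzero_runs(path, chunk_size=1 << 20, max_runs=100):
    """Scan a file or block device that is expected to be all zeros.

    Returns a list of (offset, length, head) tuples, one per contiguous
    run of non-zero bytes, where head holds up to the first 16 bytes of
    the run. The name and return shape are hypothetical, chosen only
    for this sketch.
    """
    runs = []
    start = None          # offset where the current non-zero run began
    length = 0            # length of the current run so far
    head = bytearray()    # first few bytes of the current run
    pos = 0               # absolute offset of the byte being examined

    with open(path, "rb") as f:
        while len(runs) < max_runs:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Fast path: skip chunks that are entirely zero, unless a
            # run from the previous chunk still has to be closed.
            if start is None and chunk.count(0) == len(chunk):
                pos += len(chunk)
                continue
            for b in chunk:
                if b:
                    if start is None:
                        start, length, head = pos, 0, bytearray()
                    length += 1
                    if len(head) < 16:
                        head.append(b)
                elif start is not None:
                    runs.append((start, length, bytes(head)))
                    start = None
                pos += 1

    # Close a run that extends to the end of the device.
    if start is not None:
        runs.append((start, length, bytes(head)))
    return runs[:max_runs]
```

Calling find_nonzero_runs on the zeroed partition (/dev/sde2 in the
report above) and printing the offsets in hex would show whether the
bad data clusters at particular addresses and what the bogus bytes
look like, which might help with the wraparound theory.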