From: Bernd Schubert <bs@q-leap.de>
To: Robert Hancock <hancockr@shaw.ca>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
Justin Piszcz <jpiszcz@lucidpixels.com>,
debian-user@lists.debian.org, linux-raid@vger.kernel.org,
linux-ide@vger.kernel.org
Subject: Re: Corrupt data - RAID sata_sil 3114 chip
Date: Sat, 3 Jan 2009 22:11:47 +0100
Message-ID: <20090103211147.GA3707@lanczos.q-leap.de>
In-Reply-To: <495FD035.8080501@shaw.ca>
On Sat, Jan 03, 2009 at 02:53:09PM -0600, Robert Hancock wrote:
> Bernd Schubert wrote:
>> [sorry sent again, since Robert dropped all mailing list CCs and I
>> didn't notice first]
>>
>> On Sat, Jan 03, 2009 at 12:31:12PM -0600, Robert Hancock wrote:
>>> Bernd Schubert wrote:
>>>> On Sat, Jan 03, 2009 at 01:39:36PM +0000, Alan Cox wrote:
>>>>> On Fri, 2 Jan 2009 22:30:07 +0100
>>>>> Bernd Schubert <bs@q-leap.de> wrote:
>>>>>
>>>>>> Hello Bengt,
>>>>>>
>>>>>> sil3114 is known to cause data corruption with some disks.
>>>>> News to me. There are a few people with lots of SI and other devices
>>>> No no, you just forgot about it, since you even reviewed the patches ;)
>>>>
>>>> http://lkml.org/lkml/2007/10/11/137
>>> And Jeff explained why they were not merged:
>>>
>>> http://lkml.org/lkml/2007/10/11/166
>>>
>>> All the patch does is try to reduce the speed impact of the
>>> workaround. But as was pointed out, they don't reliably solve the
>>> problem the workaround is trying to fix, and besides, the workaround
>>> is already not applied to SiI3114 at all, as it is apparently not
>>> applicable on that controller (only 3112).
>>
>> Well, they do reliably solve the problem in our case (before taking the patch
>> into production I ran checksum tests for about 2 weeks). Anyway, I entirely
>> understand why the patches didn't get accepted.
>>
>> But now more than a year has passed again without doing anything
>> about it and actually this is what I strongly criticize. Most people don't
>> know about issues like that and don't run file checksum tests as I now always
>> do before taking a disk into production. So users are exposed to known
>> data corruption problems without even being warned about it. Usually
>> even backups don't help, since one creates a backup of the corrupted data.
>>
>> So IMHO, the driver should be deactivated for sil3114 until a real
>> solution is found. And it should only be possible to force-activate it
>> by a kernel flag, which then also would print a huuuge warning about
>> possible data corruption (unfortunately most distributions disable
>> initial kernel messages *grumble*).
>
> If the corruption was happening on all such controllers then people
> would have been complaining in droves and something would have been
> done. It seems much more likely that in this case the problem is some
> kind of hardware fault or combination of hardware which is causing the
> problem. Unfortunately these kind of not-easily-reproducible issues tend
> to be very hard to track down.
>
Well yes, it only happens with certain drives, but those same drives work
fine on other controllers. Still, these have been known issues for a while
now, and nothing is being done about them.

I would happily help to solve the problem; I just don't have any knowledge
of hardware programming. What would be your next step if you had remote
access to such a system?
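
For anyone who wants to try the kind of file checksum test mentioned in the
quoted text above, here is a rough sketch. It is not the exact test I ran;
the function names (write_pattern, file_digest, verify) and the chunk size
are made up for illustration. It writes seeded pseudo-random data, remembers
the SHA-256, and re-reads the file several times to compare.

```python
import hashlib
import os
import random

CHUNK = 1 << 20  # 1 MiB per read/write; an arbitrary choice for this sketch


def write_pattern(path, size, seed=0):
    """Write `size` bytes of seeded pseudo-random data and return its SHA-256."""
    rng = random.Random(seed)
    digest = hashlib.sha256()
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            block = rng.getrandbits(8 * n).to_bytes(n, "little")
            f.write(block)
            digest.update(block)
            remaining -= n
        # Push the data out of userspace buffers and ask the kernel to
        # write it to the device before we start reading it back.
        f.flush()
        os.fsync(f.fileno())
    return digest.hexdigest()


def file_digest(path):
    """Return the SHA-256 of the file at `path`, read in CHUNK-sized blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected, passes=3):
    """Re-read the file `passes` times; return the pass numbers that mismatch."""
    return [i for i in range(passes) if file_digest(path) != expected]
```

Usage would be something like expected = write_pattern("/mnt/test/blob",
size) followed by verify("/mnt/test/blob", expected), where the path is a
placeholder for a file on the disk under test. Note that re-reads may be
served from the page cache instead of the disk, so a meaningful run needs
test data larger than RAM or the caches dropped between passes.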
Thanks,
Bernd