From mboxrd@z Thu Jan 1 00:00:00 1970
From: bl0
Subject: Re: sata_sil data corruption, possible workarounds
Date: Tue, 08 Jan 2013 13:25:30 +0100
Message-ID:
References: <50CCF1E0.9070804@gmail.com> <50CEB13B.9010100@gmail.com> <50D13831.9040105@gmail.com> <50EA4AEE.1050401@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7Bit
Return-path:
Received: from plane.gmane.org ([80.91.229.3]:42818 "EHLO plane.gmane.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755981Ab3AHMZs (ORCPT ); Tue, 8 Jan 2013 07:25:48 -0500
Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1TsYFj-0002LP-32 for linux-ide@vger.kernel.org; Tue, 08 Jan 2013 13:25:59 +0100
Received: from 91.150.147.9.internetia.net.pl ([91.150.147.9]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 08 Jan 2013 13:25:59 +0100
Received: from bl0-052 by 91.150.147.9.internetia.net.pl with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 08 Jan 2013 13:25:59 +0100
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: linux-ide@vger.kernel.org
Cc: linux-pci@vger.kernel.org

On Monday 07 January 2013 05:11, Robert Hancock wrote:

> On 12/20/2012 02:54 AM, bl0 wrote:
>> On Wednesday 19 December 2012 04:44, Robert Hancock wrote:
>>
>>> On 12/18/2012 09:23 AM, bl0 wrote:
>>>> Do you think something should be done about it in the linux sata_sil
>>>> driver? For a lack of a better solution, here is my suggestion. There
>>>> is already one option 'slow_down' for problematic disks. Another
>>>> option, for example 'cache_line_workaround', could be added for
>>>> problematic motherboards. If enabled, the most straightforward way is
>>>> to set cache line size to 0 and not worry about the fifo_cfg register.
>>>> If someone else confirms that it solves the problem for them, this
>>>> option could be enabled automatically if certain motherboard chipset is
>>>> detected.
>>>
>>> We'd have to somehow narrow down which chipsets were involved, which
>>> might be a hard task. Do you have an idea of how much the performance is
>>> hurt by these workarounds? If it's not a lot, it might make sense to do
>>> it by default.
>>
>> After setting cache line size to 0, write speed as shown by 'dd
>> if=/tmpfs/testfile of=/dev/sdc9 bs=1M count=256' goes down from about
>> 45 MB/s to 17 MB/s. Personally I don't care about performance,
>> reliability and data safety are more important to me.
>
> Yeah, cutting performance by 2/3rds is fairly bad though.

Yes, it's probably not a good thing to do by default for everyone.

>> The other workaround is to increase cache line size to 64 bytes, if
>> necessary, and set fifo_cfg to 0. No difference in performance measured.
>> This workaround is more of a hit or miss. It seems to contradict that
>> code commit made back in 2005, which was also about data corruption. In
>> the worst case, what solves data corruption problem on some motherboards
>> might introduce this problem on some other motherboards.
>
> That's possible, which is why I suspect that someone from Silicon Image
> would have to confirm a possible fix - might be hard to get their
> attention about this old chipset..

I still recommend, for a start, a kernel module option, along with a message
in dmesg. (If you haven't seen it yet, the code diff in my last message shows
a possible way to do this.)
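
To make the two workarounds discussed in this thread concrete, here is a
rough user-space model in C of what such an option would select. It is only
a sketch, not the diff from my earlier message and not code from sata_sil:
the enum, struct and function names are hypothetical, the cache line size is
kept in bytes for readability (the real PCI register counts 32-bit words),
and no hardware is touched.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical option values; these names do not exist in sata_sil. */
enum cl_workaround {
    CL_WA_NONE = 0,        /* leave the controller as the BIOS/driver set it */
    CL_WA_ZERO_CLS = 1,    /* set cache line size to 0, ignore fifo_cfg */
    CL_WA_CLS64_FIFO0 = 2  /* cache line size >= 64 bytes and fifo_cfg = 0 */
};

/* Simplified model of the two settings discussed in the thread. */
struct sil_cfg {
    unsigned int cache_line_bytes; /* assumption: bytes, not PCI dwords */
    uint8_t fifo_cfg;
};

/* Return the configuration the chosen workaround would program. */
static struct sil_cfg apply_workaround(struct sil_cfg cur,
                                       enum cl_workaround mode)
{
    switch (mode) {
    case CL_WA_ZERO_CLS:
        cur.cache_line_bytes = 0;      /* fifo_cfg is left alone */
        break;
    case CL_WA_CLS64_FIFO0:
        if (cur.cache_line_bytes < 64) /* raise only if necessary */
            cur.cache_line_bytes = 64;
        cur.fifo_cfg = 0;
        break;
    case CL_WA_NONE:
    default:
        break;                         /* unknown values change nothing */
    }
    return cur;
}

/* Format a dmesg-style notice into buf; returns the snprintf() result. */
static int describe_workaround(char *buf, size_t len,
                               enum cl_workaround mode, struct sil_cfg cfg)
{
    return snprintf(buf, len,
                    "cache line workaround %d: cls=%u bytes, fifo_cfg=%u",
                    (int)mode, cfg.cache_line_bytes,
                    (unsigned int)cfg.fifo_cfg);
}
```

In a real driver the mode would come from a module parameter and the result
would be written to PCI config space and the chip's FIFO register; mode 1 is
the variant that cost me roughly 45 MB/s -> 17 MB/s, mode 2 is the hit-or-miss
one with no measured slowdown.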