From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from plane.gmane.org ([80.91.229.3]:38672 "EHLO plane.gmane.org"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1752375Ab2LPMYx (ORCPT ); Sun, 16 Dec 2012 07:24:53 -0500
Received: from list by plane.gmane.org with local (Exim 4.69)
    (envelope-from ) id 1TkDHD-0002Lk-GA
    for linux-pci@vger.kernel.org; Sun, 16 Dec 2012 13:25:03 +0100
Received: from 91.150.147.9.internetia.net.pl ([91.150.147.9])
    by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
    id 1AlnuQ-0007hv-00 for ; Sun, 16 Dec 2012 13:25:03 +0100
Received: from bl0-052 by 91.150.147.9.internetia.net.pl with local
    (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ;
    Sun, 16 Dec 2012 13:25:03 +0100
To: linux-pci@vger.kernel.org
From: bl0
Subject: Re: sata_sil data corruption, possible workarounds
Date: Sun, 16 Dec 2012 13:21:01 +0100
Message-ID:
References: <50CCF1E0.9070804@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="nextPart2169060.BJCX0EGrHf"
Cc: linux-ide@vger.kernel.org
Sender: linux-pci-owner@vger.kernel.org
List-ID:

--nextPart2169060.BJCX0EGrHf
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8Bit

Thanks for your response.

On Saturday 15 December 2012 22:55, Robert Hancock wrote:

> On 12/15/2012 02:02 AM, bl0 wrote:
>> I have a PCI card based on Silicon Image 3114 SATA controller. Like many
>> people in the past I have experienced silent data corruption.
>> I am lucky to have a hardware configuration where it is easy to reproduce
>> this behavior with 100% rate by copying a file from a USB stick plugged
>> into another PCI card. My motherboard has nvidia chipset.
>>
>> Going through messages and bug reports about this problem, someone
>> mentioned that PCI cache line size may be relevant. I did some testing
>> with different CLS values and found that the problem of data corruption
>> is solved if either
>> A). CLS is set to 0, before or after sata_sil kernel driver is loaded
>> # setpci -d 1095:3114 CACHE_LINE_SIZE=0
>> where 1095:3114 is the device id as shown by 'lspci -nn'. The same
>> command can also be used in grub2 (recent versions) shell or
>> configuration file before booting linux.
>> or
>> B). CLS is set to a sufficiently large value, only after sata_sil is
>> loaded.
>> # setpci -d 1095:3114 CACHE_LINE_SIZE=28
>> (value is hexadecimal, in 4-byte units, here it's 160 bytes)
>> What is a sufficiently large value depends on the value that is set
>> before the driver is loaded. If the value before the driver is loaded is
>> 32 or 64 bytes, I have to increase it (after the driver is loaded) to 128
>> or 160 bytes, respectively.
>>
>> In sata_sil.c source in sil_init_controller it writes some
>> hardware-specific value depending on PCI cache line size. By lowering
>> this value I can get it to work with lower CLS. The lowest value 0 works
>> with CLS 64 bytes. If the CLS is 32 bytes, I have to increase the CLS.
>
> The meaning of that value from the datasheet is: "This bit field is used
> to specify the system cacheline size in terms of 32-bit words. The upper
> 2 bits are not used, resulting a maximum size of 64 32-bit words. With
> the SiI3114 as a master, initiating a read transaction, it issues PCI
> command Read Multiple in place, when empty space in its FIFO is larger
> than the value programmed in this register."
>
> I think this value is likely the key. The cache line size itself
> shouldn't make any difference with this controller as it only really
> affects Memory Write & Invalidate (MWI) and the driver doesn't try to
> enable that for this device. But it's being used to derive the value
> written into this register.

In practice, on my hardware configuration, increasing the CLS after the
internal value has already been derived does make a difference.
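As a side note on the units used in the setpci commands quoted above: the PCI
Cache Line Size register is 8 bits wide and counts 4-byte units, and setpci
takes the value in hexadecimal. A minimal Python sketch of the conversion
follows. The helper name cls_to_setpci is made up for illustration and is not
part of pciutils or the kernel.

```python
def cls_to_setpci(cls_bytes: int) -> str:
    """Return the hex string to use as CACHE_LINE_SIZE=<value> for a CLS in bytes.

    Hypothetical helper for illustration only.
    """
    if cls_bytes < 0 or cls_bytes % 4 != 0:
        raise ValueError("CLS must be a non-negative multiple of 4 bytes")
    units = cls_bytes // 4          # the register counts 4-byte (32-bit) units
    if units > 0xFF:
        raise ValueError("value does not fit the 8-bit register")
    return format(units, "02x")     # setpci expects hexadecimal


if __name__ == "__main__":
    # Sizes mentioned in this thread: 0 (workaround A) and 160 bytes (workaround B).
    for size in (0, 32, 64, 128, 160):
        print(f"{size:3d} bytes -> CACHE_LINE_SIZE={cls_to_setpci(size)}")
```

For 160 bytes this gives 28, matching the value used in workaround B.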
> Can you add in some output to figure out what values are being written
> to this register

If the CLS is 32 or 64 bytes, it writes 2 or 3, respectively.

> and see which values are working or not working?

That depends on the CLS. If the CLS is 32 bytes, it doesn't work (by work I
mean it's safe from data corruption) no matter what value I write to that
hardware register. If the CLS is 64 bytes, the only value that works is 0.

CLS (bytes)   A   B
 32           2   none
 64           3   0
 96           4   1
128           5   2
160           6   3

A: value written by the driver by default
B: maximum value safe from data corruption, based on my testing; it probably
   only applies to similar problematic hardware configurations.

Looking at this table you can see that increasing the CLS to a large value
can be a workaround after the driver has set the default value.

By default on my system this part of the sata_sil code just overwrites the
same value (2 for a 32-byte CLS) that is already in place (as retrieved using
readw()) because the same value gets set (by the SATA controller BIOS?) after
reboot. Changing this logic can work around the data corruption problem.
There is another problem, the SATA link becoming inaccessible (I wrote more
about it in the first post), which is not affected by this part of the
sata_sil code. My guess is that the main cause of the problems is elsewhere.
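The table above can be summarised in a short Python sketch. The formulas are
only fitted to the five measured rows (A = CLS/32 + 1, B = CLS/32 - 2) and
have not been checked against the driver source or the datasheet. Anything
outside those rows is an assumption, and all function names are made up for
illustration.

```python
from typing import Optional


def default_fifo_value(cls_bytes: int) -> int:
    """Column A: value written by default, fitted to the table as CLS/32 + 1."""
    return cls_bytes // 32 + 1


def max_safe_fifo_value(cls_bytes: int) -> Optional[int]:
    """Column B: largest value observed safe on this hardware, or None if none was."""
    safe = cls_bytes // 32 - 2
    return safe if safe >= 0 else None


def min_safe_cls(programmed_value: int) -> int:
    """Smallest CLS in bytes for which an already-programmed value was observed safe."""
    return (programmed_value + 2) * 32


if __name__ == "__main__":
    for cls in (32, 64, 96, 128, 160):
        print(cls, default_fifo_value(cls), max_safe_fifo_value(cls))
    # Workaround B: driver loaded with a 32- or 64-byte CLS, then CLS raised.
    for boot_cls in (32, 64):
        print(boot_cls, "->", min_safe_cls(default_fifo_value(boot_cls)))
```

With a CLS of 32 or 64 bytes at driver load time, the sketch gives 128 and 160
bytes as the smallest CLS to set afterwards, which agrees with the values
reported for workaround B.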
>> Data corruption is the biggest problem for me and these workarounds help
>> but another problem remains, sometimes when accessing multiple PCI
>> devices at the same time sata becomes inaccessible and times out with log
>> messages similar to:
>> [ 411.351805] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
>> [ 411.351824] ata3.00: cmd c8/00:00:00:af:00/00:00:00:00:00/e0 tag 0 dma 131072 in
>> [ 411.351826] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> [ 411.351830] ata3.00: status: { DRDY }
>> [ 411.351843] ata3: hard resetting link
>> [ 411.671775] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>> [ 411.697059] ata3.00: configured for UDMA/100
>> [ 411.697080] ata3: EH complete
>>
>> Reboot is needed to access sata drives again. If I had the root
>> filesystem on a sata drive it would probably crash the system.
>>
>> Another thing that may be related. Comparing lspci output reveals that
>> when accessing multiple PCI devices at the same time, the flag
>> DiscTmrStat (Discard Timer Status) gets toggled on for device "00:08.0
>> PCI bridge: nVidia Corporation nForce2 External PCI Bridge". I don't know
>> if it's normal or not.
>
> I'm not an expert on the whole PCI bridge/delayed completion stuff but
> it appears that this means that a device (either the host bridge/CPU or
> a device behind that bridge) initiated a delayed transaction for a read,
> but then didn't retry the request to pick up the read data later. From
> what I can tell this seems abnormal, at least in most cases.
>
> Can you post the full lspci -vv output? Do the problems only occur if
> there are multiple devices plugged in behind that bridge?

'lspci -vvv' output attached. Yes, I've only encountered problems with the
sata controller if at least one other external PCI card is in active use.
(The built-in devices which appear as PCI under another bridge do not cause
problems.)
>> Finally, the same simple test that I use on Linux does not produce data
>> corruption on FreeBSD. Either this problem doesn't occur there or it's
>> not trivial to reproduce.
>>
>> This bug has been around for so long. I hope someone will find this
>> information useful.

--nextPart2169060.BJCX0EGrHf
Content-Type: text/plain; name="3-lspci-vvv"
Content-Transfer-Encoding: 8Bit
Content-Disposition: attachment; filename="3-lspci-vvv"

00:00.0 Host bridge: nVidia Corporation nForce2 IGP2 (rev a2)
    Subsystem: ASUSTeK Computer Inc. Unknown device 80ac
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr+ DiscTmrStat+ DiscTmrSERREn-
    Kernel modules: shpchp

00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2) (prog-if 8a [Master SecP PriP])
    Subsystem: ASUSTeK Computer Inc. Unknown device 0c11
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Kernel modules: shpchp

01:07.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
    Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-
    Kernel driver in use: nvidia
    Kernel modules: nvidia, nvidiafb

--nextPart2169060.BJCX0EGrHf--