linux-ide.vger.kernel.org archive mirror
* Sata Sil3512 bug?
@ 2007-09-27 13:51 MisterE
  2007-09-28 12:25 ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-09-27 13:51 UTC
  To: jgarzik, linux-ide, benh

[-- Attachment #1: Type: text/plain, Size: 1037 bytes --]

Hello,

First off, I'm quite new to Linux. I don't know the official way to
report bugs. I'm not even sure that the bug is related to the SATA driver.
I hope you can make some suggestions.

I recently bought 2 Sweex SATA controllers (without RAID). This device
contains the Sil3512 chip.
I connected it to my D815EEA motherboard with a Samsung hard drive.
When I mounted it and connected to it with WinSCP or Samba I got
"hangs": the file transfer stopped for a couple of seconds.
The logs in var and the screen are flooded with errors like those in "samsung error.txt".

I have now bought some Western Digital drives. I get similar problems
(wd error.txt), but no "hangs".

I've tried the controller in another motherboard, the ASUS CUSL2 (with similar specs),
and I don't have any problems there. Can you help? I've included some logs
which may be of use.

btw: I use Debian unstable. I use the same HD with the OS (IDE drive)
on both systems, so we can exclude a faulty OS.
  

-- 
Best regards,
 MisterE                          mailto:MisterE2002@zonnet.nl

[-- Attachment #2: samsung error.txt --]
[-- Type: text/plain, Size: 3880 bytes --]

Sep 24 17:42:43 fileserver kernel: NTFS driver 2.1.28 [Flags: R/W MODULE].
Sep 24 17:42:43 fileserver kernel: NTFS volume version 3.1.
Sep 24 17:44:05 fileserver kernel: NTFS volume version 3.1.
Sep 24 17:44:36 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 24 17:44:36 fileserver kernel: ata2.00: (BMDMA2 stat 0x6c0009)
Sep 24 17:44:36 fileserver kernel: ata2.00: cmd c8/00:0c:a8:01:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 6144 in
Sep 24 17:44:36 fileserver kernel: res 51/04:00:b3:01:bd/00:00:00:00:00/e1 Emask 0x1 (device error)
Sep 24 17:44:36 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:44:36 fileserver kernel: ata2: EH complete
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 24 17:44:36 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 24 17:44:36 fileserver kernel: ata2.00: (BMDMA2 stat 0x6d0009)
Sep 24 17:44:36 fileserver kernel: ata2.00: cmd c8/00:07:38:19:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 3584 in
Sep 24 17:44:36 fileserver kernel: res 51/04:00:3e:19:bd/00:00:00:00:00/e1 Emask 0x1 (device error)
Sep 24 17:44:36 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:44:36 fileserver kernel: ata2: EH complete
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 24 17:45:06 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 24 17:45:06 fileserver kernel: ata2.00: cmd c8/00:80:3f:21:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 65536 in
Sep 24 17:45:06 fileserver kernel: res 40/00:00:3e:19:bd/00:00:00:00:00/e1 Emask 0x4 (timeout)
Sep 24 17:45:07 fileserver kernel: ata2: soft resetting port
Sep 24 17:45:07 fileserver kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 24 17:45:07 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:45:07 fileserver kernel: ata2: EH complete
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 24 17:45:37 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 24 17:45:37 fileserver kernel: ata2.00: cmd c8/00:80:3f:24:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 65536 in
Sep 24 17:45:37 fileserver kernel: res 40/00:00:3e:19:bd/00:00:00:00:00/e1 Emask 0x4 (timeout)
Sep 24 17:45:37 fileserver kernel: ata2: soft resetting port
Sep 24 17:45:37 fileserver kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 24 17:45:37 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:45:37 fileserver kernel: ata2: EH complete
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
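Each `cmd` / `res` line in the log above is a dump of the ATA taskfile registers. Below is a minimal sketch of a decoder, assuming the field order used by libata's error report (command or status, feature or error, sector count, LBA low/mid/high, five HOB bytes, device). The helper names `decode` and `flags` and the two-entry command table are illustrative and not part of any kernel tool. The bit names follow the classic ATA status and error register layout.

```python
import re

# Classic ATA status / error register bit names (assumed layout).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "TK0NF", 0x01: "AMNF"}
# Only the two opcodes that appear in the attached logs.
COMMANDS = {0xC8: "READ DMA", 0xCA: "WRITE DMA"}

# Twelve hex bytes separated by '/' or ':' after "cmd" or "res".
TASKFILE = re.compile(r"(cmd|res) ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def flags(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


def decode(line):
    """Decode one libata 'cmd' or 'res' line; return None if it is neither."""
    m = TASKFILE.search(line)
    if m is None:
        return None
    kind = m.group(1)
    f = [int(x, 16) for x in re.split(r"[/:]", m.group(2))]
    # 28-bit LBA: low nibble of the device byte, then lbah, lbam, lbal.
    lba28 = ((f[11] & 0x0F) << 24) | (f[5] << 16) | (f[4] << 8) | f[3]
    out = {"kind": kind, "lba28": lba28}
    if kind == "cmd":
        out["command"] = COMMANDS.get(f[0], hex(f[0]))
        # A sector count of 0 means 256 sectors for 28-bit commands.
        out["sectors"] = f[2] or 256
    else:
        out["status"] = flags(f[0], STATUS_BITS)
        out["error"] = flags(f[1], ERROR_BITS)
    return out
```

Applied to the first pair in the log, this gives a READ DMA of 12 sectors (6144 bytes, matching `data 6144 in`) at LBA 0x1bd01a8, and a result with DRDY, DSC and ERR set in the status register and ABRT in the error register. In other words, the drive reported that it aborted the command.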

[-- Attachment #3: wd error.txt --]
[-- Type: text/plain, Size: 883 bytes --]

Sep 25 14:09:57 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
Sep 25 14:09:57 fileserver kernel: ata2.00: (BMDMA2 stat 0x650001)
Sep 25 14:09:57 fileserver kernel: ata2.00: cmd ca/00:00:9b:39:02/00:00:00:00:00/e0 tag 0 cdb 0x0 data 131072 out
Sep 25 14:09:57 fileserver kernel: res 51/04:20:7b:3a:02/00:00:00:00:00/e0 Emask 0x1 (device error)
Sep 25 14:09:57 fileserver kernel: ata2.00: configured for UDMA/33
Sep 25 14:09:57 fileserver kernel: ata2: EH complete
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
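Unlike the Samsung log, the WD log reports a non-zero SErr value (0x2400000). The sketch below decodes it, assuming the SError bit layout of the Serial ATA specification as mirrored by libata's `SERR_*` constants. The descriptions are paraphrased, and `decode_serr` is a hypothetical helper name.

```python
# SError register bits (assumed layout: low half = error, high half = diagnostics).
SERR_BITS = {
    0: "recovered data integrity error",
    1: "recovered communication error",
    8: "transient data integrity error",
    9: "persistent communication or data integrity error",
    10: "protocol error",
    11: "internal error",
    16: "PhyRdy change",
    17: "PHY internal error",
    18: "COMWAKE detected",
    19: "10b to 8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unrecognized FIS type",
    26: "device exchanged",
}


def decode_serr(value):
    """Return descriptions of all known bits set in an SErr value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items())
            if value & (1 << bit)]


print(decode_serr(0x2400000))
```

For 0x2400000, bits 22 and 25 are set, so under the assumed layout the link reported a handshake error and an unrecognized FIS type. This points at trouble on the link or bus side, not at a media problem on the drive.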

[-- Attachment #4: lshw asus.txt --]
[-- Type: text/plain, Size: 10549 bytes --]

fileserver
    description: Tower Computer
    product: System Name
    vendor: System Manufacturer
    version: System Version
    serial: SYS-1234567890
    width: 32 bits
    capabilities: smbios-2.3 dmi-2.3
    configuration: boot=normal chassis=tower
  *-core
       description: Motherboard
       product: CUSL2-C
       vendor: ASUSTeK Computer INC.
       physical id: 0
       version: REV 1.xx
       serial: xxxxxxxxxxx
     *-firmware
          description: BIOS
          vendor: Award Software, Inc.
          physical id: 0
          version: ASUS CUSL2-C ACPI BIOS Revision 1014 Beta 001 (09/20/2002)
          size: 64KB
          capacity: 448KB
          capabilities: pci pnp apm upgrade shadowing escd cdboot bootselect socketedrom edd int13floppy360 int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer int10video acpi usb agp ls120boot zipboot
     *-cpu
          description: CPU
          product: Pentium III (Coppermine)
          vendor: Intel Corp.
          physical id: 4
          bus info: cpu@0
          version: 6.8.10
          slot: PGA 370
          size: 1GHz
          capacity: 1600MHz
          width: 32 bits
          clock: 133MHz
          capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse up
        *-cache:0
             description: L1 cache
             physical id: 9
             slot: L1 Cache
             size: 32KB
             capacity: 32KB
             capabilities: pipeline-burst synchronous internal write-back data
        *-cache:1
             description: L2 cache
             physical id: a
             slot: L2 Cache
             size: 256KB
             capacity: 256KB
             capabilities: pipeline-burst synchronous internal write-back data
     *-memory
          description: System Memory
          physical id: 22
          slot: System board or motherboard
          size: 512MB
          capacity: 512MB
        *-bank:0
             description: DIMM DRAM Synchronous
             physical id: 0
             slot: DIMM 1
             size: 256MB
             width: 64 bits
        *-bank:1
             description: DIMM DRAM Synchronous
             physical id: 1
             slot: DIMM 2
             size: 256MB
             width: 64 bits
        *-bank:2
             description: DIMM DRAM Synchronous [empty]
             physical id: 2
             slot: DIMM 3
     *-pci
          description: Host bridge
          product: 82815 815 Chipset Host Bridge and Memory Controller Hub
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 02
          width: 32 bits
          clock: 33MHz
          configuration: driver=agpgart-intel module=intel_agp
        *-pci:0
             description: PCI bridge
             product: 82815 815 Chipset AGP Bridge
             vendor: Intel Corporation
             physical id: 1
             bus info: pci@0000:00:01.0
             version: 02
             width: 32 bits
             clock: 66MHz
             capabilities: pci normal_decode bus_master
           *-display
                description: VGA compatible controller
                product: MGA G400/G450
                vendor: Matrox Graphics, Inc.
                physical id: 0
                bus info: pci@0000:01:00.0
                version: 82
                width: 32 bits
                clock: 33MHz
                capabilities: pm agp agp-2.0 vga bus_master cap_list
                configuration: latency=64 maxlatency=32 mingnt=16
        *-pci:1
             description: PCI bridge
             product: 82801 PCI Bridge
             vendor: Intel Corporation
             physical id: 1e
             bus info: pci@0000:00:1e.0
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: pci normal_decode bus_master
           *-storage
                description: Mass storage controller
                product: SiI 3512 [SATALink/SATARaid] Serial ATA Controller
                vendor: Silicon Image, Inc.
                physical id: b
                bus info: pci@0000:02:0b.0
                version: 01
                width: 32 bits
                clock: 66MHz
                capabilities: storage pm bus_master cap_list
                configuration: driver=sata_sil latency=32 module=sata_sil
           *-network
                description: Ethernet interface
                product: 83c170 EPIC/100 Fast Ethernet Adapter
                vendor: Standard Microsystems Corp [SMC]
                physical id: d
                bus info: pci@0000:02:0d.0
                logical name: eth2
                version: 08
                serial: 00:e0:29:6c:26:d2
                size: 100MB/s
                capacity: 100MB/s
                width: 32 bits
                clock: 33MHz
                capabilities: pm bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=epic100 driverversion=2.1 duplex=full ip=10.0.0.12 latency=32 link=yes maxlatency=28 mingnt=8 module=epic100 multicast=yes port=MII speed=100MB/s
        *-isa
             description: ISA bridge
             product: 82801BA ISA Bridge (LPC)
             vendor: Intel Corporation
             physical id: 1f
             bus info: pci@0000:00:1f.0
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: isa bus_master
             configuration: latency=0
        *-ide
             description: IDE interface
             product: 82801BA IDE U100
             vendor: Intel Corporation
             physical id: 1f.1
             bus info: pci@0000:00:1f.1
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: ide bus_master
             configuration: driver=PIIX_IDE latency=0 module=piix
           *-ide:0
                description: IDE Channel 0
                physical id: 0
                bus info: ide@0
                logical name: ide0
                clock: 33MHz
              *-disk
                   description: ATA Disk
                   product: WDC AC36400L
                   vendor: Western Digital
                   physical id: 0
                   bus info: ide@0.0
                   logical name: /dev/hda
                   version: 09.09M08
                   serial: WD-WM4200668163
                   size: 6149MB
                   capacity: 6149MB
                   capabilities: ata dma lba iordy smart pm partitioned partitioned:dos
                   configuration: mode=udma2 smart=on
                 *-volume:0
                      description: Linux filesystem partition
                      physical id: 1
                      bus info: ide@0.0,1
                      logical name: /dev/hda1
                      capacity: 5828MB
                      capabilities: primary bootable
                 *-volume:1
                      description: Extended partition
                      physical id: 2
                      bus info: ide@0.0,2
                      logical name: /dev/hda2
                      size: 321MB
                      capacity: 321MB
                      capabilities: primary extended partitioned partitioned:extended
                    *-logicalvolume
                         description: Linux swap / Solaris partition
                         physical id: 5
                         logical name: /dev/hda5
                         capacity: 321MB
                         capabilities: nofs
           *-ide:1
                description: IDE Channel 1
                physical id: 1
                bus info: ide@1
                logical name: ide1
                clock: 33MHz
              *-cdrom
                   description: DVD-RAM writer
                   product: HL-DT-ST DVDRAM GSA-H42L
                   physical id: 0
                   bus info: ide@1.0
                   logical name: /dev/hdc
                   version: SL00
                   serial: K176CJ80810
                   capabilities: packet atapi cdrom removable nonmagnetic dma lba iordy pm audio cd-r cd-rw dvd dvd-r dvd-ram
                   configuration: mode=udma4 status=nodisc
        *-usb:0
             description: USB Controller
             product: 82801BA/BAM USB (Hub #1)
             vendor: Intel Corporation
             physical id: 1f.2
             bus info: pci@0000:00:1f.2
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: uhci bus_master
             configuration: driver=uhci_hcd latency=0 module=uhci_hcd
           *-usbhost
                product: UHCI Host Controller
                vendor: Linux 2.6.22-2-686 uhci_hcd
                physical id: 1
                bus info: usb@1
                logical name: usb1
                version: 2.06
                capabilities: usb-1.10
                configuration: maxpower=0mA slots=2 speed=12.0MB/s
        *-serial
             description: SMBus
             product: 82801BA/BAM SMBus
             vendor: Intel Corporation
             physical id: 1f.3
             bus info: pci@0000:00:1f.3
             version: 02
             width: 32 bits
             clock: 33MHz
             configuration: driver=i801_smbus latency=0 module=i2c_i801
        *-usb:1
             description: USB Controller
             product: 82801BA/BAM USB (Hub #2)
             vendor: Intel Corporation
             physical id: 1f.4
             bus info: pci@0000:00:1f.4
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: uhci bus_master
             configuration: driver=uhci_hcd latency=0 module=uhci_hcd
           *-usbhost
                product: UHCI Host Controller
                vendor: Linux 2.6.22-2-686 uhci_hcd
                physical id: 1
                bus info: usb@2
                logical name: usb2
                version: 2.06
                capabilities: usb-1.10
                configuration: maxpower=0mA slots=2 speed=12.0MB/s

[-- Attachment #5: lspci asus.txt --]
[-- Type: text/plain, Size: 908 bytes --]

00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82815 815 Chipset AGP Bridge (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #1) (rev 02)
00:1f.3 SMBus: Intel Corporation 82801BA/BAM SMBus (rev 02)
00:1f.4 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #2) (rev 02)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400/G450 (rev 82)
02:0b.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
02:0d.0 Ethernet controller: Standard Microsystems Corp [SMC] 83c170 EPIC/100 Fast Ethernet Adapter (rev 08)

[-- Attachment #6: lshw intel.txt --]
[-- Type: text/plain, Size: 9902 bytes --]

fileserver
    description: Computer
    width: 32 bits
    capabilities: smbios-2.3 dmi-2.3
    configuration: boot=normal uuid=82F98A6F-1F59-11D5-BDC8-001083FDCE08
  *-core
       description: Motherboard
       product: D815EEA
       vendor: Intel Corporation
       physical id: 0
       version: AAA10378-405
       serial: BLEA11230259
       slot: LPT1
     *-firmware
          description: BIOS
          vendor: Intel Corp.
          physical id: 0
          version: EA81510A.86A.0051.P11.0106190714 (06/19/2001)
          size: 64KB
          capacity: 448KB
          capabilities: pci pnp apm upgrade shadowing escd cdboot bootselect edd int13floppynec int13floppytoshiba int13floppy360 int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer int10video acpi usb agp ls120boot zipboot biosbootspecification
     *-cpu
          description: CPU
          product: Pentium III (Coppermine)
          vendor: Intel Corp.
          physical id: 4
          bus info: cpu@0
          version: 6.8.6
          slot: J4L1
          size: 933MHz
          capacity: 1100MHz
          width: 32 bits
          clock: 133MHz
          capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse up
        *-cache:0
             description: L1 cache
             physical id: 5
             slot: None
             size: 32KB
             capacity: 32KB
             clock: 25MHz (40.0ns)
             capabilities: pipeline-burst synchronous internal write-back unified
        *-cache:1
             description: L2 cache
             physical id: 6
             slot: None
             size: 256KB
             capacity: 256KB
             capabilities: synchronous internal write-back unified
     *-memory
          description: System Memory
          physical id: 2e
          slot: System board or motherboard
          size: 256MB
        *-bank:0
             description: [empty]
             physical id: 0
             slot: DIMM0
        *-bank:1
             description: DIMM DRAM Synchronous 100 MHz (10.0 ns)
             physical id: 1
             slot: DIMM1
             size: 128MB
             width: 64 bits
             clock: 100MHz (10.0ns)
        *-bank:2
             description: DIMM DRAM Synchronous 100 MHz (10.0 ns)
             physical id: 2
             slot: DIMM2
             size: 128MB
             width: 64 bits
             clock: 100MHz (10.0ns)
     *-pci
          description: Host bridge
          product: 82815 815 Chipset Host Bridge and Memory Controller Hub
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 02
          width: 32 bits
          clock: 33MHz
          configuration: driver=agpgart-intel module=intel_agp
        *-display
             description: VGA compatible controller
             product: 82815 CGC [Chipset Graphics Controller]
             vendor: Intel Corporation
             physical id: 2
             bus info: pci@0000:00:02.0
             version: 02
             width: 32 bits
             clock: 66MHz
             capabilities: pm vga bus_master cap_list
             configuration: latency=0
        *-pci
             description: PCI bridge
             product: 82801 PCI Bridge
             vendor: Intel Corporation
             physical id: 1e
             bus info: pci@0000:00:1e.0
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: pci normal_decode bus_master
           *-multimedia
                description: Multimedia audio controller
                product: ES1371 [AudioPCI-97]
                vendor: Ensoniq
                physical id: 7
                bus info: pci@0000:01:07.0
                version: 08
                width: 32 bits
                clock: 33MHz
                capabilities: pm bus_master cap_list
                configuration: driver=ENS1371 latency=32 maxlatency=128 mingnt=12 module=snd_ens1371
           *-storage
                description: Mass storage controller
                product: SiI 3512 [SATALink/SATARaid] Serial ATA Controller
                vendor: Silicon Image, Inc.
                physical id: 9
                bus info: pci@0000:01:09.0
                version: 01
                width: 32 bits
                clock: 66MHz
                capabilities: storage pm bus_master cap_list
                configuration: driver=sata_sil latency=32 module=sata_sil
           *-network
                description: Ethernet interface
                product: 3c905C-TX/TX-M [Tornado]
                vendor: 3Com Corporation
                physical id: c
                bus info: pci@0000:01:0c.0
                logical name: eth0
                version: 78
                serial: 00:01:02:e3:12:bb
                size: 100MB/s
                capacity: 100MB/s
                width: 32 bits
                clock: 33MHz
                capabilities: pm bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=3c59x duplex=full ip=10.0.0.8 latency=32 link=yes maxlatency=10 mingnt=10 module=3c59x multicast=yes port=MII speed=100MB/s
        *-isa
             description: ISA bridge
             product: 82801BA ISA Bridge (LPC)
             vendor: Intel Corporation
             physical id: 1f
             bus info: pci@0000:00:1f.0
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: isa bus_master
             configuration: latency=0
        *-ide
             description: IDE interface
             product: 82801BA IDE U100
             vendor: Intel Corporation
             physical id: 1f.1
             bus info: pci@0000:00:1f.1
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: ide bus_master
             configuration: driver=PIIX_IDE latency=0 module=piix
           *-ide
                description: IDE Channel 0
                physical id: 0
                bus info: ide@0
                logical name: ide0
                clock: 33MHz
              *-disk
                   description: ATA Disk
                   product: WDC AC36400L
                   vendor: Western Digital
                   physical id: 0
                   bus info: ide@0.0
                   logical name: /dev/hda
                   version: 09.09M08
                   serial: WD-WM4200668163
                   size: 6149MB
                   capacity: 6149MB
                   capabilities: ata dma lba iordy smart pm partitioned partitioned:dos
                   configuration: mode=udma2 smart=on
                 *-volume:0
                      description: Linux filesystem partition
                      physical id: 1
                      bus info: ide@0.0,1
                      logical name: /dev/hda1
                      capacity: 5828MB
                      capabilities: primary bootable
                 *-volume:1
                      description: Extended partition
                      physical id: 2
                      bus info: ide@0.0,2
                      logical name: /dev/hda2
                      size: 321MB
                      capacity: 321MB
                      capabilities: primary extended partitioned partitioned:extended
                    *-logicalvolume
                         description: Linux swap / Solaris partition
                         physical id: 5
                         logical name: /dev/hda5
                         capacity: 321MB
                         capabilities: nofs
        *-usb:0
             description: USB Controller
             product: 82801BA/BAM USB (Hub #1)
             vendor: Intel Corporation
             physical id: 1f.2
             bus info: pci@0000:00:1f.2
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: uhci bus_master
             configuration: driver=uhci_hcd latency=0 module=uhci_hcd
           *-usbhost
                product: UHCI Host Controller
                vendor: Linux 2.6.22-2-686 uhci_hcd
                physical id: 1
                bus info: usb@1
                logical name: usb1
                version: 2.06
                capabilities: usb-1.10
                configuration: maxpower=0mA slots=2 speed=12.0MB/s
        *-serial
             description: SMBus
             product: 82801BA/BAM SMBus
             vendor: Intel Corporation
             physical id: 1f.3
             bus info: pci@0000:00:1f.3
             version: 02
             width: 32 bits
             clock: 33MHz
             configuration: driver=i801_smbus latency=0 module=i2c_i801
        *-usb:1
             description: USB Controller
             product: 82801BA/BAM USB (Hub #2)
             vendor: Intel Corporation
             physical id: 1f.4
             bus info: pci@0000:00:1f.4
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: uhci bus_master
             configuration: driver=uhci_hcd latency=0 module=uhci_hcd
           *-usbhost
                product: UHCI Host Controller
                vendor: Linux 2.6.22-2-686 uhci_hcd
                physical id: 1
                bus info: usb@2
                logical name: usb2
                version: 2.06
                capabilities: usb-1.10
                configuration: maxpower=0mA slots=2 speed=12.0MB/s

[-- Attachment #7: lspci intel.txt --]
[-- Type: text/plain, Size: 900 bytes --]

00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82815 CGC [Chipset Graphics Controller] (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #1) (rev 02)
00:1f.3 SMBus: Intel Corporation 82801BA/BAM SMBus (rev 02)
00:1f.4 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #2) (rev 02)
01:07.0 Multimedia audio controller: Ensoniq ES1371 [AudioPCI-97] (rev 08)
01:09.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
01:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)


* Re: Sata Sil3512 bug?
  2007-09-27 13:51 MisterE
@ 2007-09-28 12:25 ` Tejun Heo
  2007-09-28 15:25   ` Re[2]: " MisterE
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-09-28 12:25 UTC
  To: MisterE; +Cc: jgarzik, linux-ide, benh

Hello,

MisterE wrote:
> I've tried the controller in another motherboard, the ASUS CUSL2 (with similar specs),
> and I don't have any problems there. Can you help? I've included some logs
> which may be of use.

Did you use the same cable on both machines?  Also, does the problem go
away if you power the hard drive from the power supply of the other machine?

-- 
tejun


* Re: Sata Sil3512 bug?
  2007-09-28 15:25   ` Re[2]: " MisterE
@ 2007-09-28 15:51     ` Alan Cox
  2007-09-28 16:55       ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2007-09-28 15:51 UTC
  To: MisterE; +Cc: Tejun Heo, jgarzik, linux-ide, benh

> sda1 are corrupted (2 to 4 blocks missing). Copying that data back to
> Windows and it give the same results in Quickpar. So reading does not
> have problems. The data written to hda1 is correct.

We've got a whole pile of reports like this with the 3512 and almost
always Nvidia chipset, plus reports of BIOS updates fixing it. That you
see something similar on intel boards is a bit worrying.

Alan
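The corruption pattern in the quoted report (a few blocks wrong per file) can also be narrowed down without QuickPar by comparing a known-good copy with the copy read back from the SATA disk, block by block. This is a minimal sketch, assuming both copies are reachable as ordinary files. The 4096-byte block size and the helper name `differing_blocks` are illustrative assumptions and do not correspond to QuickPar's own block size.

```python
import itertools


def differing_blocks(path_a, path_b, block_size=4096):
    """Return the indices of fixed-size blocks that differ between two files."""
    bad = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        for index in itertools.count():
            chunk_a = a.read(block_size)
            chunk_b = b.read(block_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                bad.append(index)
    return bad
```

Running it on the same source file copied once to the IDE disk and once to the SATA disk would show whether the damaged blocks fall at regular offsets, which can help separate a bus or controller problem from a failing drive.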


* Re: Sata Sil3512 bug?
  2007-09-28 15:51     ` Alan Cox
@ 2007-09-28 16:55       ` Tejun Heo
  2007-10-02 19:20         ` Re[2]: " MisterE
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-09-28 16:55 UTC
  To: Alan Cox; +Cc: MisterE, jgarzik, linux-ide, benh

Alan Cox wrote:
>> sda1 are corrupted (2 to 4 blocks missing). Copying that data back to
>> Windows and it give the same results in Quickpar. So reading does not
>> have problems. The data written to hda1 is correct.
> 
> We've got a whole pile of reports like this with the 3512 and almost
> always Nvidia chipset, plus reports of BIOS updates fixing it. That you
> see something similar on intel boards is a bit worrying.

The sil3112/3512 + nvidia chipset problems don't usually involve
device errors or timeouts.  They usually corrupt data silently.  And,
yeah, data corruption on an intel board is really disturbing.

MisterE, do you have any processor powersaving mechanism enabled?  If
so, can you disable all and see whether that changes anything?

-- 
tejun


* Re: Re[2]: Sata Sil3512 bug?
@ 2007-10-03  7:26 Mikael Pettersson
  2007-10-03  8:31 ` Alexander Sabourenkov
  0 siblings, 1 reply; 37+ messages in thread
From: Mikael Pettersson @ 2007-10-03  7:26 UTC
  To: MisterE2002, htejun; +Cc: alan, benh, jgarzik, linux-ide

On Tue, 2 Oct 2007 21:20:23 +0200, MisterE wrote:
> I built another setup with almost the same hardware.
> This motherboard already had the latest BIOS.
> I notice that the computer almost never finds the hard drive,
> although the controller is found every time (with lspci). So I get no
> drive (sda) assigned. I don't always see the "BIOS" screen from the
> controller at startup, and in the past it showed the hard drive.
> So I could not experiment with this motherboard.
> 
> After that I installed Windows XP and used the original (Sweex)
> drivers with the first motherboard. This also corrupts the data.
> So it seems not to be a Linux problem. So there is something wrong with
> the motherboard or the 3512 controller.
> 
> After that I plugged both hard drives (IDE with Windows and SATA disk)
> into the Asus board. No data corruption. So the hard disks aren't the
> problem either.
> 
> I'm thinking of replacing both 3512 controllers with a Promise SATA300
> TX4. Do you know if there are problems with this device?

(please don't top-post)

There are no known data-corruption issues with Promise SATA cards.
However, some of them, especially the 2nd generation SATA300 TX4,
are known to trigger intermittent error interrupts (that are dealt
with but may cause a speed reduction) in some systems. We're still
scratching our heads on that issue.

/Mikael

> Friday, September 28, 2007, 6:55:47 PM, you wrote:
> 
> > Alan Cox wrote:
> >>> sda1 are corrupted (2 to 4 blocks missing). Copying that data back to
> >>> Windows and it give the same results in Quickpar. So reading does not
> >>> have problems. The data written to hda1 is correct.
> >> 
> >> We've got a whole pile of reports like this with the 3512 and almost
> >> always Nvidia chipset, plus reports of BIOS updates fixing it. That you
> >> see something similar on intel boards is a bit worrying.
> 
> > Multiple sil3112/3512 + nvidia chipset problem doesn't usually involve
> > device errors or timeouts.  It usually corrupts data silently.  And,
> > yeah, data corruption on intel board is really disturbing.
> 
> > MisterE, do you have any processor powersaving mechanism enabled?  If
> > so, can you disable all and see whether that changes anything?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?
  2007-10-03  7:26 Re[2]: Sata Sil3512 bug? Mikael Pettersson
@ 2007-10-03  8:31 ` Alexander Sabourenkov
  2007-10-03 14:45   ` Re[2]: " MisterE
                     ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-03  8:31 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: MisterE2002, htejun, alan, benh, jgarzik, linux-ide

Mikael Pettersson wrote:

>>
>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>> TX4. Do you know if there are problems with this device?
> 
> (please don't top-post)
> 
> There are no known data-corruption issues with Promise SATA cards.
> However, some of them, especially the 2nd generation SATA300 TX4,
> are known to trigger intermittent error interrupts (that are dealt
> with but may cause a speed reduction) in some systems. We're still
> scratching our heads on that issue.
> 

But see this thread:

http://marc.info/?l=linux-ide&m=119122463403033&w=2
http://www.spinics.net/lists/linux-ide/msg14868.html

Personally I would not recommend Promise SATA300 TX4 at the moment.

-- 

./lxnt



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: Sata Sil3512 bug?
  2007-10-03  8:31 ` Alexander Sabourenkov
@ 2007-10-03 14:45   ` MisterE
  2007-10-03 14:50     ` Alan Cox
  2007-10-14 12:07   ` Re[2]: " MisterE
  2007-10-17 12:39   ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
  2 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-03 14:45 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide

Hello Alexander,

Wednesday, October 3, 2007, 10:31:17 AM, you wrote:

> Mikael Pettersson wrote:

>>>
>>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>>> TX4. Do you know if there are problems with this device?
>> 
>> (please don't top-post)
>> 
>> There are no known data-corruption issues with Promise SATA cards.
>> However, some of them, especially the 2nd generation SATA300 TX4,
>> are known to trigger intermittent error interrupts (that are dealt
>> with but may cause a speed reduction) in some systems. We're still
>> scratching our heads on that issue.
>> 

> But see this thread:

> http://marc.info/?l=linux-ide&m=119122463403033&w=2
> http://www.spinics.net/lists/linux-ide/msg14868.html

> Personally I would not recommend Promise SATA300 TX4 at the moment.


That is not hopeful. Highpoint does not have sata controllers (except
softraid controllers). Other brands (real raid controllers) are too
expensive and/or do not have a PCI interface.
Maybe i should keep those 3512 cards? How are the user experiences
with these controllers (except on nvidia boards)? Because i don't really
trust the intel boards, using the Asus would be an option.


-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?
  2007-10-03 14:45   ` Re[2]: " MisterE
@ 2007-10-03 14:50     ` Alan Cox
  0 siblings, 0 replies; 37+ messages in thread
From: Alan Cox @ 2007-10-03 14:50 UTC (permalink / raw)
  To: MisterE
  Cc: Alexander Sabourenkov, Mikael Pettersson, htejun, benh, jgarzik,
	linux-ide

> That is not hopefull. Highpoint does not have sata controllers (Except
> softraid controllers). Other (real raid controllers) brands are too

There are pretty much no "real" RAID controllers in the ATA world except
the very high end pricy ones.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?
@ 2007-10-04  0:46 Richard Scobie
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Scobie @ 2007-10-04  0:46 UTC (permalink / raw)
  To: linux-ide

> There are pretty much no "real" RAID controllers in the ATA world
> except the very high end pricy ones.

Can anyone comment on the reliability or otherwise of Marvell 885X6081 
controllers?

Supermicro do a reasonably priced non-RAID 8 drive SATA card using it:

http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm

Regards,

Richard

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?
  2007-10-02 19:20         ` Re[2]: " MisterE
@ 2007-10-04  1:27           ` Tejun Heo
  2007-10-13 16:36             ` Re[2]: " MisterE
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-10-04  1:27 UTC (permalink / raw)
  To: MisterE; +Cc: Alan Cox, jgarzik, linux-ide, benh

Hello,

MisterE wrote:
> I build another setup with almost the same hardware.
> This motherboard had already the latest bios.
> I notice that the computer does almost never find the hard drive
> although the controller is found every time (with lspci).

What do you mean by "almost never"?  Does it find the harddisk
sometimes?  Also, please post the kernel boot log after a disk detection
failure.  The lspci result only indicates that the PCI device is present.

> So i get no
> drive (sda) assigned. I don't always see the "bios" screen from the
> controller at startup. And in the past it showed the hard drive.
> So i could not experiment with this motherboard.

Can you re-seat the controller or move it to another slot and see
whether things change?

> After that i installed Windows XP and used the orginal (sweex)
> drivers with the first motherboard. This also makes the data corrupt.
> So it seems not to be an linux problem. So there is something wrong with
> the motherboard or the 3512 controller.
> 
> After that i plugged both hard drives (ide with windows and sata disk)
> to the Asus board. No data corruption. So the hard disks are'nt the
> problem either.

Hmmm... It's a relief to know that the problem isn't caused by sata_sil,
but I don't have much of an idea beyond that something seems to go wrong
on the PCI bus.  :-(

> I'm thinking of replacing both 3512 controllers with a Promise SATA300
> TX4. Do you know if there are problems with this device?

I see occasional bug reports on sata_promise but AFAIK there haven't
been any data corruption reports.  Mikael knows much more about promise
controllers.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: Sata Sil3512 bug?
  2007-10-03  8:31 ` Alexander Sabourenkov
  2007-10-03 14:45   ` Re[2]: " MisterE
@ 2007-10-14 12:07   ` MisterE
  2007-10-15  8:44     ` Alexander Sabourenkov
  2007-10-17 12:39   ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
  2 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-14 12:07 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide

Hello,

Alexander, do these problems with the Promise SATA300 TX4 happen to
everyone?

The only alternatives are
using soft-raid products as normal controllers. Does anyone have experience
with the following products?
* Highpoint RocketRAID 1640 (150 MB/s)
* Highpoint RocketRAID 1740 (300 MB/s)
* Adaptec 1210SA


Wednesday, October 3, 2007, 10:31:17 AM, you wrote:

> Mikael Pettersson wrote:

>>>
>>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>>> TX4. Do you know if there are problems with this device?
>> 
>> (please don't top-post)
>> 
>> There are no known data-corruption issues with Promise SATA cards.
>> However, some of them, especially the 2nd generation SATA300 TX4,
>> are known to trigger intermittent error interrupts (that are dealt
>> with but may cause a speed reduction) in some systems. We're still
>> scratching our heads on that issue.
>> 

> But see this thread:

> http://marc.info/?l=linux-ide&m=119122463403033&w=2
> http://www.spinics.net/lists/linux-ide/msg14868.html

> Personally I would not recommend Promise SATA300 TX4 at the moment.




-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?
  2007-10-14 12:07   ` Re[2]: " MisterE
@ 2007-10-15  8:44     ` Alexander Sabourenkov
  0 siblings, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-15  8:44 UTC (permalink / raw)
  To: MisterE; +Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide

MisterE wrote:
> Hello,
> 
> Alexander, does these problems with the Promise SATA300 TX4 happen to
> everyone?
> 

Most probably not, as I think it would have been fixed much faster then.

I was waiting for a) release of 2.6.23, and b) me completing the move to
another flat to retest all the latest developments in mainline and
libata-dev.

With a) done and b) almost done, I'll retest and report any issues quite
soon.

Besides, there is a report of TX4 and 2.6.23 not showing problems that
were there with 2.6.22 (see "Bug is fixed in 2.6.23.1: sata_promise: port
is slow to respond, reset failed" thread).


> The only alternatives are
> using soft-raid products as normal controllers. Does anyone have experiences
> with the following products?
> * Highpoint RocketRAID 1640 (150 MB/s)
> * Highpoint RocketRAID 1740 (300 MB/s)
> * Adaptec 1210SA
> 

For any kind of non-hobby task I'd skip trying to build a disk array and
buy a SATA-SCSI/SATA-iSCSI box instead.
While I had many mind-boggling issues with various combinations of SATA
HDDs, onboard and standalone controllers, Promise and Infortrend disk
arrays worked quite reliably.

-- 

./lxnt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-03  8:31 ` Alexander Sabourenkov
  2007-10-03 14:45   ` Re[2]: " MisterE
  2007-10-14 12:07   ` Re[2]: " MisterE
@ 2007-10-17 12:39   ` MisterE
  2007-10-17 12:54     ` Alexander Sabourenkov
  2 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-17 12:39 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide, jeff

Hello,

Wednesday, October 3, 2007, 10:31:17 AM, you wrote:

> Mikael Pettersson wrote:

>>>
>>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>>> TX4. Do you know if there are problems with this device?
>> 
>> (please don't top-post)
>> 
>> There are no known data-corruption issues with Promise SATA cards.
>> However, some of them, especially the 2nd generation SATA300 TX4,
>> are known to trigger intermittent error interrupts (that are dealt
>> with but may cause a speed reduction) in some systems. We're still
>> scratching our heads on that issue.
>> 

> But see this thread:

> http://marc.info/?l=linux-ide&m=119122463403033&w=2
> http://www.spinics.net/lists/linux-ide/msg14868.html

> Personally I would not recommend Promise SATA300 TX4 at the moment.


After all the problems i had with the sweex 3512 cards i returned them
to the shop and decided to buy a Sata300 TX4 (because the shop nearby
had one. Unfortunately the shops in the region don't have Highpoints)

Things looked promising when i inserted the card in both Intel D815EEA
motherboards. No problems detecting the hard drives (unlike with the 3512 cards).
With the 3512 i had LOTS of error messages and corrupt data when writing to it.
Using a separate videocard, instead of the onboard one, seemed to reduce the amount of errors.

But after some heavy reading/writing with the promise i got 2 errors. (see log file).
But i didn't find any corrupt files. I can not reproduce the error.
I'm not sure if these are the "intermittent error interrupts" Mikael
Pettersson mentioned?

ps: as you can see i got some errors at boot from the boot disk
(hda). I'm not sure what is wrong with it. Sometimes it produces these
errors. I used a non-destructive read-write test with badblocks but no
bad sectors were found. I don't know if this could influence the sata controller.

Alexander Sabourenkov can you please tell me where i can find the
"Bug is fixed in 2.6.23.1: sata_promise: port is slow to respond,
reset failed" thread you mentioned?

I also see that the driver is now at version 2.10. Has something
really critical changed? I've tried testing with Debian stable
(2.6.18-4-686; sata_promise: 1.04) and with Debian Unstable
(2.6.22-2-686; sata_promise: 2.07). 2.6.23 is not in the repositories
yet.

So basically the question is this. Can i trust the SATA300 TX4 or
should i buy a Highpoint RocketRAID 1640/1740?. I can order such device
online but i need to be sure that it works correctly :(


-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-17 12:39   ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
@ 2007-10-17 12:54     ` Alexander Sabourenkov
  2007-10-17 15:04       ` Re[2]: " MisterE
  0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-17 12:54 UTC (permalink / raw)
  To: MisterE; +Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide, jeff

MisterE wrote:
> But after some heavy reading/writing with the promise i got 2 errors. (see log file).

Log file got lost. Please post relevant parts inline.


> Alexander Sabourenkov can you please tell me where i can find the
> "Bug is fixed in 2.6.23.1: sata_promise: port is slow to respond,
> reset failed" thread you mentioned?

That would be this one:
(got split into two parts)
http://www.spinics.net/lists/linux-ide/msg14069.html
http://www.spinics.net/lists/linux-ide/msg15299.html


> 
> So basically the question is this. Can i trust the SATA300 TX4 or
> should i buy a Highpoint RocketRAID 1640/1740?. I can order such device
> online but i need to be sure that it works correctly :(
> 

Since you have the hardware, do the tests and decide for yourself.

I'd try copying one (big, preferably over 160G) disk onto another (with
dd) for a start, while waiting for answers on mailing lists.
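
A copy test only says something once the copy has been compared with the
source. The following is a minimal sketch of such a comparison, not part of
any tool mentioned in this thread; the function name mismatched_blocks and
the 1 MiB default block size are assumptions made for illustration. It reads
two files (or devices) block by block and reports the indexes of blocks that
differ.

```python
def mismatched_blocks(path_a, path_b, block_size=1024 * 1024):
    """Return the indexes of blocks whose contents differ.

    A block that exists in only one of the two inputs (length difference)
    is also reported as a mismatch.
    """
    bad = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both inputs are exhausted
            if block_a != block_b:
                bad.append(index)
            index += 1
    return bad
```

An empty list means the two inputs are identical. Note that a read-back done
right after the copy may be served from the page cache rather than from the
disk, so the comparison is more meaningful after a remount or reboot.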


-- 

./lxnt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-17 12:54     ` Alexander Sabourenkov
@ 2007-10-17 15:04       ` MisterE
  2007-10-17 19:21         ` Peter Favrholdt
  2007-10-18 21:07         ` Alexander Sabourenkov
  0 siblings, 2 replies; 37+ messages in thread
From: MisterE @ 2007-10-17 15:04 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide, jeff

Hello Alexander,

Wednesday, October 17, 2007, 2:54:25 PM, you wrote:

> Log file got lost. Please post relevant parts inline.

Sorry, i totally forgot to include them.
I can not reproduce the errors. The last few times hda did not give errors, so i'm
not sure if they are related to each other. (in the thread you mentioned
that you can't explain how the problem from Peter Favrholdt got fixed, so
maybe it has indeed something to do with libata)

Oct 16 14:10:59 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:10:59 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:10:59 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:12:49 fileserver kernel: kjournald starting.  Commit interval 5 seconds
Oct 16 14:12:49 fileserver kernel: EXT3 FS on sda1, internal journal
Oct 16 14:12:49 fileserver kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 16 14:13:34 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:13:34 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:13:34 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hdb: DMA disabled
Oct 16 14:17:21 fileserver kernel: ide0: reset: success
Oct 16 14:32:51 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 16 14:32:51 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 16 14:32:51 fileserver kernel: ata1.00: cmd c8/00:00:77:f6:6c/00:00:00:00:00/e4 tag 0 cdb 0x0 data 131072 in
Oct 16 14:32:51 fileserver kernel:          res 50/00:00:76:f7:6c/00:00:00:00:00/e4 Emask 0x2 (HSM violation)
Oct 16 14:32:51 fileserver kernel: ata1: soft resetting port
Oct 16 14:32:51 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 16 14:32:51 fileserver kernel: ata1.00: configured for UDMA/133
Oct 16 14:32:51 fileserver kernel: ata1: EH complete
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 16 14:44:09 fileserver kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 16 14:48:48 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 16 14:48:48 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 16 14:48:48 fileserver kernel: ata1.00: cmd 25/00:00:3f:d0:26/00:01:23:00:00/e0 tag 0 cdb 0x0 data 131072 in
Oct 16 14:48:48 fileserver kernel:          res 50/00:00:3e:d1:26/00:00:23:00:00/e0 Emask 0x2 (HSM violation)
Oct 16 14:48:48 fileserver kernel: ata1: soft resetting port
Oct 16 14:48:49 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 16 14:48:49 fileserver kernel: ata1.00: configured for UDMA/133
Oct 16 14:48:49 fileserver kernel: ata1: EH complete
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
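
The hda lines in this log spell out their register bits (status=0x51 as
DriveReady SeekComplete Error, error=0x84 as DriveStatusError BadCRC), while
the ata1 "res" lines only show hex. Below is a rough sketch for reading the
first two bytes of a "res" line with the same names. It assumes those two
bytes are the status and error registers, and the bit tables are
deliberately partial: they contain only the bits named in the hda lines. The
names decode_res, STATUS_BITS and ERROR_BITS are made up for this example.

```python
import re

# Partial tables: only the bits that the hda lines in this log name.
STATUS_BITS = {0x40: "DriveReady", 0x10: "SeekComplete", 0x01: "Error"}
ERROR_BITS = {0x80: "BadCRC", 0x04: "DriveStatusError"}

# Assumption: "res SS/EE:..." starts with the status byte, then the error byte.
RES_RE = re.compile(r"\bres ([0-9a-f]{2})/([0-9a-f]{2}):")


def names(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


def decode_res(line):
    """Decode status and error of a libata 'res' line, or return None."""
    match = RES_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1), 16)
    error = int(match.group(2), 16)
    return names(status, STATUS_BITS), names(error, ERROR_BITS)


line = "res 50/00:00:76:f7:6c/00:00:00:00:00/e4 Emask 0x2 (HSM violation)"
print(decode_res(line))
```

For the "res 50/00" lines above this gives DriveReady and SeekComplete with
no error bits from the table set.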



> Since you have the hardware, do the tests and decide for yourself.

> I'd try copying one (big, preferably over 160G ) disk onto another (with
> dd) for a start,
> while waiting for answers on mailing lists.


I can order that 1740 online, but returning something is always more
difficult. So i need to be quite sure that there aren't problems with
this highpoint.

Tonight i will try the Asus motherboard with 1 drive and much I/O. And
i will create a new array which takes 7 hours. But how often/hours do
you need to try something to prove it does not fail :P

-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-17 15:04       ` Re[2]: " MisterE
@ 2007-10-17 19:21         ` Peter Favrholdt
  2007-10-19 12:02           ` Re[2]: " MisterE
  2007-10-18 21:07         ` Alexander Sabourenkov
  1 sibling, 1 reply; 37+ messages in thread
From: Peter Favrholdt @ 2007-10-17 19:21 UTC (permalink / raw)
  To: MisterE; +Cc: Alexander Sabourenkov, Mikael Pettersson, linux-ide

Hi,

MisterE wrote:
> Tonight i will try the Asus motherboard with 1 drive and much I/O. And
> i will create a new array which takes 7 hours. But how often/hours do
> you need to try something to prove it does not fail :P

On one box I had problems with the SATA300 TX4 using 2.6.21 through
2.6.22 (different versions). I have 4x500GB Seagate ES SATA drives
connected. The system would run fine, but when put under stress - i.e.
loaded on all sata ports - one or two ports would fail, one after the
other. I have _always_ been able to make it fail doing:

dd if=/dev/sda of=/dev/null bs=1M &
dd if=/dev/sdb of=/dev/null bs=1M &
dd if=/dev/sdc of=/dev/null bs=1M &
dd if=/dev/sdd of=/dev/null bs=1M &

The ports would freeze before long - e.g. in less than an hour.

This can be done without even starting the array (mdadm). Therefore no 
data corruption will happen.

The above issue was fixed by updating to vanilla 2.6.23.1.
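
For reference, the same read-only load can be sketched in Python. This is
only an illustration of the idea behind the four dd commands above, not a
replacement for them; the names stress_read and read_to_end are made up
here, and the 1 MiB chunk size simply mirrors bs=1M. It starts one reader
per path, discards the data, and records how many bytes were read and
whether an I/O error was raised.

```python
import sys
import threading

CHUNK = 1024 * 1024  # 1 MiB per read, mirroring dd's bs=1M


def read_to_end(path, results):
    """Read `path` sequentially, discard the data, record bytes read and any error."""
    total = 0
    try:
        with open(path, "rb", buffering=0) as dev:
            while True:
                block = dev.read(CHUNK)
                if not block:
                    break
                total += len(block)
        results[path] = (total, None)
    except OSError as exc:
        results[path] = (total, exc)


def stress_read(paths):
    """Start one reader thread per path and wait for all of them to finish."""
    results = {}
    threads = [threading.Thread(target=read_to_end, args=(p, results)) for p in paths]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Example: python stress_read.py /dev/sda /dev/sdb /dev/sdc /dev/sdd
    for path, (count, err) in stress_read(sys.argv[1:]).items():
        status = "ok" if err is None else f"FAILED: {err}"
        print(f"{path}: {count} bytes read, {status}")
```

Like the dd version it only reads, so it does not touch the data on the
drives. A frozen port would show up as a reader that stalls or returns an
error, with the details in the kernel log.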

Until then I have been running with 2.6.21-rc2 with a Mikael Pettersson
patch to force the SATA link to 1.5Gbps (this could possibly be accomplished
by jumpers on the drives as well - but I didn't try that).

I have another system (Dell PE1800 = different from the above) running
24x7 using vanilla linux 2.6.19.5. This system has been running without
hiccups for more than a year (current uptime 135 days).


Hope this helps,

Best regards,

Peter

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?
  2007-10-13 16:36             ` Re[2]: " MisterE
@ 2007-10-18  3:29               ` Tejun Heo
  0 siblings, 0 replies; 37+ messages in thread
From: Tejun Heo @ 2007-10-18  3:29 UTC (permalink / raw)
  To: MisterE; +Cc: Alan Cox, jgarzik, linux-ide, benh

MisterE wrote:
> Oct 13 13:01:26 fileserver kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
> Oct 13 13:01:26 fileserver kernel: ata4.00: (BMDMA2 stat 0x650001)
> Oct 13 13:01:26 fileserver kernel: ata4.00: cmd ca/00:f8:47:e1:5e/00:00:00:00:00/e4 tag 0 cdb 0x0 data 126976 out
> Oct 13 13:01:26 fileserver kernel:          res 51/04:98:a7:e1:5e/00:00:00:00:00/e4 Emask 0x1 (device error)
> Oct 13 13:01:26 fileserver kernel: ata4.00: configured for UDMA/100
> Oct 13 13:01:26 fileserver kernel: ata4: EH complete
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] Write Protect is off
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> 
> I'm not sure if it will result in corrupt data? But i don't trust it
> anymore.

That looks like a data transmission error.  When you're transferring
massive amounts of data, things like that can happen and it won't cause
data corruption.

> You people advise me to not buy the Promise SATA300 TX4 controller and
> this Sweex PU102 (3512) seems to have problems. Not much choices left
> except the really expensive solutions.
> 
> Is it really so hard to build a controller without problems?!? :(

It seems so. :-(

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-17 15:04       ` Re[2]: " MisterE
  2007-10-17 19:21         ` Peter Favrholdt
@ 2007-10-18 21:07         ` Alexander Sabourenkov
  2007-10-19  1:26           ` Tejun Heo
  1 sibling, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-18 21:07 UTC (permalink / raw)
  To: linux-ide; +Cc: MisterE, Mikael Pettersson, htejun, alan, benh, jgarzik, jeff

Hello.


I have done some quick tests with 2.6.23/amd64 and unfortunately, the
very same problem persists.

By the way, 8 in (port_status 0x20080000) stands for
        PDC_OVERRUN_ERR         = (1 << 19), /* S/G byte count larger than HD requires */
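
A quick way to check which bits are set in such a value is sketched below.
Only PDC_OVERRUN_ERR = (1 << 19) is taken from the driver excerpt above; the
helper name set_bits is made up for this example, and the meaning of the
other set bit is not given in this thread, so the sketch does not try to
name it.

```python
PDC_OVERRUN_ERR = 1 << 19  # from the driver excerpt quoted above


def set_bits(value):
    """Return the positions of all bits set in `value`, lowest first."""
    return [bit for bit in range(value.bit_length()) if value & (1 << bit)]


port_status = 0x20080000
print(set_bits(port_status))                # → [19, 29]
print(bool(port_status & PDC_OVERRUN_ERR))  # → True
```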


Does 'S/G' here by any chance relate to the 'sg' in the 'sg-chaining
work' there is so much talk about on the -kernel mailing list?



In a somewhat parallel development, write errors caused my (other) md
RAID-1 to lose one drive while copying data under 2.6.22
from TX4-attached drives to onboard-VIA-attached ones.

Device: VIA VT6420
00:0f.0 0104: 1106:3149 (rev 80)

Boot:

Oct 17 21:28:25 host sata_via 0000:00:0f.0: version 2.2
Oct 17 21:28:25 host ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 17
Oct 17 21:28:25 host sata_via 0000:00:0f.0: routed to hard irq line 10
Oct 17 21:28:25 host scsi4 : sata_via
Oct 17 21:28:25 host scsi5 : sata_via

Oct 17 21:28:25 host ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Oct 17 21:28:25 host ata6.00: ATA-7: ST3200827AS, 3.AAH, max UDMA/133
Oct 17 21:28:25 host ata6.00: 390721968 sectors, multi 0: LBA48 NCQ (depth 0/32)
Oct 17 21:28:25 host ata6.00: configured for UDMA/133

... the first two port resets:

Oct 17 23:10:50 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 17 23:10:50 host ata6.00: (BMDMA stat 0x4)
Oct 17 23:10:50 host ata6.00: cmd ca/00:08:e7:30:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
Oct 17 23:10:50 host res 51/84:08:e7:30:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
Oct 17 23:10:50 host ata6: soft resetting port
Oct 17 23:10:50 host ata6.00: configured for UDMA/133
Oct 17 23:10:50 host ata6: EH complete
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] 390721968 512-byte hardware sectors (200050 MB)
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write Protect is off
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 17 23:10:50 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 17 23:10:50 host ata6.00: (BMDMA stat 0x5)
Oct 17 23:10:50 host ata6.00: cmd ca/00:f8:4f:31:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 126976 out
Oct 17 23:10:50 host res 51/84:f8:4f:31:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
Oct 17 23:10:50 host ata6: soft resetting port
Oct 17 23:10:50 host ata6.00: configured for UDMA/133
Oct 17 23:10:50 host ata6: EH complete
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] 390721968 512-byte hardware sectors (200050 MB)
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write Protect is off
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

... and multiple unsuccessful port resets follow:

Oct 17 23:11:57 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Oct 17 23:11:57 host ata6.00: cmd 25/00:08:7f:bf:28/00:00:16:00:00/e0 tag 0 cdb 0x0 data 4096 in
Oct 17 23:11:57 host res 40/00:f8:4f:31:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
Oct 17 23:12:02 host ata6: port is slow to respond, please be patient (Status 0xd0)
Oct 17 23:12:07 host ata6: soft resetting port
Oct 17 23:12:37 host ata6.00: qc timeout (cmd 0xec)
Oct 17 23:12:37 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 17 23:12:37 host ata6.00: revalidation failed (errno=-5)
Oct 17 23:12:37 host ata6: failed to recover some devices, retrying in 5 secs
Oct 17 23:12:47 host ata6: port is slow to respond, please be patient (Status 0xd0)
Oct 17 23:12:52 host ata6: soft resetting port
Oct 17 23:13:22 host ata6.00: qc timeout (cmd 0xec)
Oct 17 23:13:22 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 17 23:13:22 host ata6.00: revalidation failed (errno=-5)
Oct 17 23:13:22 host ata6.00: limiting speed to UDMA/133:PIO3
Oct 17 23:13:22 host ata6: failed to recover some devices, retrying in 5 secs
Oct 17 23:13:32 host ata6: port is slow to respond, please be patient (Status 0xd0)
Oct 17 23:13:37 host ata6: soft resetting port
Oct 17 23:14:08 host ata6.00: qc timeout (cmd 0xec)
Oct 17 23:14:08 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 17 23:14:08 host ata6.00: revalidation failed (errno=-5)
Oct 17 23:14:08 host ata6.00: disabled
Oct 17 23:14:08 host ata6: EH complete
Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 371769215
Oct 17 23:14:08 host raid1: sdd1: rescheduling sector 371769152
Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 390379327
Oct 17 23:14:08 host md: super_written gets error=-5, uptodate=0
Oct 17 23:14:08 host raid1: Disk failure on sdd1, disabling device.
I'm unable to reproduce this on 2.6.23, so this is of historic interest
only.

-- 

./lxnt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-18 21:07         ` Alexander Sabourenkov
@ 2007-10-19  1:26           ` Tejun Heo
  2007-10-19 21:06             ` Alexander Sabourenkov
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-10-19  1:26 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: linux-ide, MisterE, Mikael Pettersson, alan, benh, jgarzik, jeff

Hello,

Alexander Sabourenkov wrote:
> In a somewhat parallel development, write errors caused my (other) md
> RAID-1 to lose one drive while copying data under 2.6.22
> from TX4-attached drives to onboard-VIA-attached ones.
> 
> ... the first two port resets:
> 
> Oct 17 23:10:50 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
> Oct 17 23:10:50 host ata6.00: (BMDMA stat 0x4)
> Oct 17 23:10:50 host ata6.00: cmd ca/00:08:e7:30:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
> Oct 17 23:10:50 host res 51/84:08:e7:30:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
> Oct 17 23:10:50 host ata6: soft resetting port
> Oct 17 23:10:50 host ata6.00: configured for UDMA/133
> Oct 17 23:10:50 host ata6: EH complete
[--snip--]
> Oct 17 23:13:37 host ata6: soft resetting port
> Oct 17 23:14:08 host ata6.00: qc timeout (cmd 0xec)
> Oct 17 23:14:08 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> Oct 17 23:14:08 host ata6.00: revalidation failed (errno=-5)
> Oct 17 23:14:08 host ata6.00: disabled
> Oct 17 23:14:08 host ata6: EH complete
> Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
> Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 371769215
> Oct 17 23:14:08 host raid1: sdd1: rescheduling sector 371769152
> Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
> Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 390379327
> Oct 17 23:14:08 host md: super_written gets error=-5, uptodate=0
> Oct 17 23:14:08 host raid1: Disk failure on sdd1, disabling device.
> 
> I'm unable to reproduce this on 2.6.23, so this is of historic interest
> only.

It might not have anything to do with the OS or the driver.  Some SATA
controllers and/or drives aren't very reliable and they just fail from
time to time.  My previous desktop used sata_nv with Seagate SATA
drives and was up 24/7.  I used it for about two years, and during that
time there was a single transfer error; it brought the drive down
completely, and I had to reboot and rebuild my RAID 1 array.  ISTR that
what died was the controller port.  IIRC, power-cycling the drive
didn't help.

Another interesting case was the first-generation SATA hard drives from a
certain vendor.  After any transfer error, those drives went completely
deaf.  The only way to recover them was removing power, waiting a bit and
reapplying it.

So, my bet for your second report is that your hardware went through
something similar to the above.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-17 19:21         ` Peter Favrholdt
@ 2007-10-19 12:02           ` MisterE
  0 siblings, 0 replies; 37+ messages in thread
From: MisterE @ 2007-10-19 12:02 UTC (permalink / raw)
  To: Peter Favrholdt; +Cc: Alexander Sabourenkov, Mikael Pettersson, linux-ide

Hello Peter,

Wednesday, October 17, 2007, 9:21:28 PM, you wrote:


> On one box I had problems with the SATA300 TX4 using 2.6.21 through 
> 2.6.22 (different versions). I have 4x500GB Seagate ES SATA drives 
> connected. The system would run fine, but when put to a stress - i.e. 
> loaded on all sata ports one or two ports would fail - one after the 
> other. I have _always_ been able to make it fail doing:

> dd if=/dev/sda of=/dev/null bs=1M &
> dd if=/dev/sdb of=/dev/null bs=1M &
> dd if=/dev/sdc of=/dev/null bs=1M &
> dd if=/dev/sdd of=/dev/null bs=1M &

> The ports would freeze before running long - e.g. in less than an hour.

I followed your advice and tested it. I have 4x500GB drives (Western
Digital Caviar SE16 WD5000AAKS). I tested with and without jumpers
(3.0 Gbps and 1.5 Gbps mode). All tests were done with the Asus CUSL2-C.


1 :: The first run; Debian 2.6.18-4-686 (stable); 3.0 Gbps [3 hours in total]:
Oct 17 18:06:12 debian kernel: ata1: no sense translation for status: 0x50
Oct 17 18:06:12 debian kernel: ata1: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 18:06:12 debian kernel: ata1: status=0x50 { DriveReady SeekComplete }
Oct 17 19:37:15 debian kernel: ata1: no sense translation for status: 0x50
Oct 17 19:37:15 debian kernel: ata1: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 19:37:15 debian kernel: ata1: status=0x50 { DriveReady SeekComplete }
Oct 17 19:42:11 debian kernel: ata3: no sense translation for status: 0x50
Oct 17 19:42:12 debian kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 19:42:12 debian kernel: ata3: status=0x50 { DriveReady SeekComplete }
Oct 17 20:23:38 debian kernel: ata1: no sense translation for status: 0x50
Oct 17 20:23:39 debian kernel: ata1: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 20:23:39 debian kernel: ata1: status=0x50 { DriveReady SeekComplete }
Oct 17 20:31:38 debian kernel: ata2: no sense translation for status: 0x50
Oct 17 20:31:38 debian kernel: ata2: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 20:31:38 debian kernel: ata2: status=0x50 { DriveReady SeekComplete }
Oct 17 20:44:56 debian kernel: ata3: no sense translation for status: 0x50
Oct 17 20:44:56 debian kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 20:44:56 debian kernel: ata3: status=0x50 { DriveReady SeekComplete }

2 :: Second run (1 hour); same settings:
Oct 18 09:27:47 debian kernel: ata4: no sense translation for status: 0x50
Oct 18 09:27:47 debian kernel: ata4: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 18 09:27:47 debian kernel: ata4: status=0x50 { DriveReady SeekComplete }
Oct 18 09:38:18 debian kernel: ata3: no sense translation for status: 0x50
Oct 18 09:38:18 debian kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 18 09:38:18 debian kernel: ata3: status=0x50 { DriveReady SeekComplete }


3 :: After that, 3 to 5 hours with the drives jumpered. No problems.


4 :: 17:15 - 18:28; 2.6.22-2-686; 3.0 Gbps

Oct 18 13:45:25 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 13:45:25 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 18 13:45:25 fileserver kernel: ata1.00: cmd c8/00:08:00:e6:cb/00:00:00:00:00/e2 tag 0 cdb 0x0 data 4096 in
Oct 18 13:45:25 fileserver kernel:          res 50/00:00:07:e6:cb/00:00:00:00:00/e2 Emask 0x2 (HSM violation)
Oct 18 13:45:26 fileserver kernel: ata1: soft resetting port
Oct 18 13:45:26 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 13:45:26 fileserver kernel: ata1.00: configured for UDMA/133
Oct 18 13:45:26 fileserver kernel: ata1: EH complete
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 18 13:57:19 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 13:57:19 fileserver kernel: ata2.00: (port_status 0x20080000)
Oct 18 13:57:19 fileserver kernel: ata2.00: cmd c8/00:08:00:e6:92/00:00:00:00:00/e4 tag 0 cdb 0x0 data 4096 in
Oct 18 13:57:19 fileserver kernel:          res 50/00:00:07:e6:92/00:00:00:00:00/e4 Emask 0x2 (HSM violation)
Oct 18 13:57:19 fileserver kernel: ata2: soft resetting port
Oct 18 13:57:20 fileserver kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 13:57:20 fileserver kernel: ata2.00: configured for UDMA/133
Oct 18 13:57:20 fileserver kernel: ata2: EH complete
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] Write Protect is off
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 18 14:09:44 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 14:09:44 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 18 14:09:44 fileserver kernel: ata1.00: cmd c8/00:e0:20:8d:3b/00:00:00:00:00/e6 tag 0 cdb 0x0 data 114688 in
Oct 18 14:09:44 fileserver kernel:          res 50/00:00:ff:8d:3b/00:00:00:00:00/e6 Emask 0x2 (HSM violation)
Oct 18 14:09:44 fileserver kernel: ata1: soft resetting port
Oct 18 14:09:44 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 14:09:44 fileserver kernel: ata1.00: configured for UDMA/133
Oct 18 14:09:44 fileserver kernel: ata1: EH complete
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 18 14:15:37 fileserver kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 14:15:37 fileserver kernel: ata3.00: (port_status 0x20080000)
Oct 18 14:15:37 fileserver kernel: ata3.00: cmd c8/00:08:00:4a:27/00:00:00:00:00/e7 tag 0 cdb 0x0 data 4096 in
Oct 18 14:15:37 fileserver kernel:          res 50/00:00:07:4a:27/00:00:00:00:00/e7 Emask 0x2 (HSM violation)
Oct 18 14:15:37 fileserver kernel: ata3: soft resetting port
Oct 18 14:15:38 fileserver kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 14:15:38 fileserver kernel: ata3.00: configured for UDMA/133
Oct 18 14:15:38 fileserver kernel: ata3: EH complete
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] Write Protect is off
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


5 :: 2.6.22-2-686 - 2 hours with the drives jumpered. No problems.
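
For anyone decoding the run 4 lines above, here is a minimal C sketch. It
assumes the usual libata print layout for the cmd line
(command/feature:nsect:lbal:lbam:lbah, then the five hob bytes, then
device) and LBA28 addressing. The struct and function names are invented
for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* ATA status register bits. */
#define ATA_STATUS_ERR  0x01u  /* error: details are in the error register */
#define ATA_STATUS_DSC  0x10u  /* seek complete */
#define ATA_STATUS_DRDY 0x40u  /* drive ready */

/* Hypothetical holder for the registers printed on a libata "cmd" line. */
struct tf28 {
	uint8_t command, feature, nsect, lbal, lbam, lbah, device;
};

/* LBA28 start sector: device[3:0] carries LBA bits 27..24. */
static uint32_t tf28_lba(const struct tf28 *tf)
{
	return ((uint32_t)(tf->device & 0x0f) << 24) |
	       ((uint32_t)tf->lbah << 16) |
	       ((uint32_t)tf->lbam << 8) |
	       tf->lbal;
}

/* Transfer length in bytes; a sector count of 0 means 256 sectors in LBA28. */
static uint32_t tf28_bytes(const struct tf28 *tf)
{
	uint32_t sectors = tf->nsect ? tf->nsect : 256u;

	return sectors * 512u;
}

/* Non-zero if the status byte has the ERR bit set. */
static int ata_status_is_error(uint8_t status)
{
	return (status & ATA_STATUS_ERR) != 0;
}
```

With the first run 4 entry (c8/00:08:00:e6:cb/.../e2) this gives a READ DMA
of 8 sectors, 4096 bytes, starting at LBA 0x2cbe600, which matches the
"data 4096 in" in the log. Status 0x50 is DRDY plus DSC with the ERR bit
clear, so the drive itself does not flag an error in those entries.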


> The above issue was fixed by updating to vanilla 2.6.23.1.

So, when running in 1.5 Gbps mode there are no problems.

I'm going to try the same with .23(.1). I'm not really familiar with
updating the kernel. I tried it before with http://www.debianhelp.co.uk/kernel2.6.htm
without much success, but I'm going to try...
I will post the results later.



-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-19  1:26           ` Tejun Heo
@ 2007-10-19 21:06             ` Alexander Sabourenkov
  2007-10-19 22:58               ` Re[2]: " MisterE
  2007-10-19 23:58               ` Tejun Heo
  0 siblings, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-19 21:06 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, MisterE, alan, benh, jgarzik, jeff

Hello.

> 
> So, my bet for your second report is your hardware went through
> something similar as above.
> 

Thanks for the insight. Let's dismiss it then.

Back to the TX4: I tried libata-dev.git, cloned at about 20:00 UTC on
19 October, with no perceived difference. Parallel reads from two drives
still cause a lot of errors.

dmesgs with boot and errors are at http://lxnt.info/linux/libata-dev/

I don't know what to try next. Any ideas?

-- 

./lxnt






^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-19 21:06             ` Alexander Sabourenkov
@ 2007-10-19 22:58               ` MisterE
  2007-10-19 23:58               ` Tejun Heo
  1 sibling, 0 replies; 37+ messages in thread
From: MisterE @ 2007-10-19 22:58 UTC (permalink / raw)
  To: Alexander Sabourenkov; +Cc: Tejun Heo, linux-ide, alan, benh, jgarzik, jeff

Hello Alexander,

Friday, October 19, 2007, 11:06:02 PM, you wrote:


> I don't know what to try next. Any ideas?

I'm no kernel hacker, but I'll take a shot.
I assume you have done most of this already...

* Hardware: test without, or swap out, each component (motherboard,
video card, memory, hard drives, power supply, all other hardware).

* Try a more n00b-proof distribution. As far as I know, Gentoo has you
set all those flags yourself, and a mistake is easily made.

* Test with the latest official drivers (Red Hat) from the Promise site,
installing that OS on a disk. I assume they made working drivers, so
it should work with those...

* Does it work correctly with Windows?

These are the steps I would take to determine the cause of the
problem.

Finally, my 2.6.23 kernel is done. I'm going to try to install it now.
Results tomorrow :)


-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-19 21:06             ` Alexander Sabourenkov
  2007-10-19 22:58               ` Re[2]: " MisterE
@ 2007-10-19 23:58               ` Tejun Heo
  2007-10-20 21:50                 ` Alexander Sabourenkov
  1 sibling, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-10-19 23:58 UTC (permalink / raw)
  To: Alexander Sabourenkov; +Cc: linux-ide, MisterE, alan, benh, jgarzik, jeff

[-- Attachment #1: Type: text/plain, Size: 527 bytes --]

Alexander Sabourenkov wrote:
> Hello.
> 
>> So, my bet for your second report is your hardware went through
>> something similar as above.
>>
> 
> Thanks for the insight. Let's dismiss it then.
> 
> Back to the TX4, I tried libata-dev.git cloned at about 20:00 UTC 19.10,
>   no perceived difference - parallel read from two drives causes a lot
> of  errors.
> 
> dmesgs  with boot and errors are at http://lxnt.info/linux/libata-dev/
> 
> I don't know what to try next. Any ideas?
> 

Does the attached patch help?

-- 
tejun

[-- Attachment #2: limit-PHY-to-1.5Gbps.patch --]
[-- Type: text/plain, Size: 402 bytes --]

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 68699b3..4c93fee 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6435,6 +6435,7 @@ int sata_link_init_spd(struct ata_link *link)
 	spd = (scontrol >> 4) & 0xf;
 	if (spd)
 		link->hw_sata_spd_limit &= (1 << spd) - 1;
+	link->hw_sata_spd_limit = 1;
 
 	link->sata_spd_limit = link->hw_sata_spd_limit;
 

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: Sata Sil3512 bug?;  Promise SATA300 TX4
  2007-10-19 23:58               ` Tejun Heo
@ 2007-10-20 21:50                 ` Alexander Sabourenkov
  2007-10-27 13:24                   ` [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4) Alexander Sabourenkov
  0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-20 21:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, MisterE, alan, benh, jgarzik, jeff

Hello.

Tejun Heo wrote:
> 
> Does the attached patch help?
> 

It does somehow force 1.5 Gbps mode, it changes the pattern of
'configured for UDMAxxx' messages that come along with errors, and it
causes the following error:

ata3: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xb t4
ata3: hotplug_status 0x10
ata3: soft resetting link
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: configured for UDMA/133
ata3: EH complete

for both drives on TX4 on startup, but read errors are still there.

dmesgs at http://lxnt.info/linux/libata-dev/patch0/
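
The SStatus and SControl values in the link messages can be unpacked by
hand. A minimal sketch in C, assuming the values are printed in hex and
use the standard SATA field layout (DET in bits 3:0, SPD in bits 7:4, IPM
in bits 11:8). The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* SStatus and SControl share the same field positions. */
static unsigned scr_det(uint32_t scr) { return scr & 0xf; }
static unsigned scr_spd(uint32_t scr) { return (scr >> 4) & 0xf; }
static unsigned scr_ipm(uint32_t scr) { return (scr >> 8) & 0xf; }

/* Bitmask of allowed generations when the link is limited to generation
 * `spd` or lower: bit 0 = 1.5 Gbps, bit 1 = 3.0 Gbps. This mirrors the
 * (1 << spd) - 1 expression visible in the sata_link_init_spd() hunk. */
static unsigned spd_limit_mask(unsigned spd)
{
	return (1u << spd) - 1u;
}

/* Human-readable name for an SPD field value. */
static const char *spd_name(unsigned spd)
{
	switch (spd) {
	case 0:  return "no limit / not negotiated";
	case 1:  return "1.5 Gbps";
	case 2:  return "3.0 Gbps";
	default: return "unknown";
	}
}
```

SStatus 113 decodes to SPD 1 (1.5 Gbps) and SControl 310 to an SPD limit
of 1, consistent with the patch forcing hw_sata_spd_limit to 1. The
"SStatus 123 SControl 300" lines elsewhere in this thread decode to SPD 2
(3.0 Gbps) with no limit set.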



-- 

./lxnt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH-RFC] (was: Re: Sata Sil3512 bug?;  Promise SATA300 TX4)
  2007-10-20 21:50                 ` Alexander Sabourenkov
@ 2007-10-27 13:24                   ` Alexander Sabourenkov
  2007-10-27 13:44                     ` [PATCH-RFC] Alexander Sabourenkov
  0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 13:24 UTC (permalink / raw)
  To: linux-ide; +Cc: Tejun Heo, MisterE, benh, jgarzik, jeff

Hello.

There appears to be a hardware bug in that it chokes on scatterlist
if the last item is larger than 164 bytes.

The patch that follows fixes my problem on 2.6.22.

I can't think of a way to avoid second pass over scatterlist without
duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).


--- a/drivers/ata/sata_promise.c	2007-07-09 03:32:17.000000000 +0400
+++ b/drivers/ata/sata_promise.c	2007-10-27 17:20:03.000000000 +0400
@@ -531,6 +531,80 @@
 	memcpy(buf+31, cdb, cdb_len);
 }

+/**
+ *	pdc_qc_prep - Fill PCI IDE PRD table
+ *	@qc: Metadata associated with taskfile to be transferred
+ *
+ *	Fill PCI IDE PRD (scatter-gather) table with segments
+ *	associated with the current disk command.
+ *	Make sure hardware does not choke on it.
+ *
+ *	LOCKING:
+ *	spin_lock_irqsave(host lock)
+ *
+ */
+static void pdc_qc_prep(struct ata_queued_cmd *qc)
+{
+	struct ata_port *ap = qc->ap;
+	struct scatterlist *sg;
+	unsigned int idx;
+	const u32 SG_COUNT_ASIC_BUG = 41*4;
+
+	if (!(qc->flags & ATA_QCFLAG_DMAMAP))
+		return;
+	
+	WARN_ON(qc->__sg == NULL);
+	WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
+
+	idx = 0;
+	ata_for_each_sg(sg, qc) {
+		u32 addr, offset;
+		u32 sg_len, len;
+
+		/* determine if physical DMA addr spans 64K boundary.
+		 * Note h/w doesn't support 64-bit, so we unconditionally
+		 * truncate dma_addr_t to u32.
+		 */
+		addr = (u32) sg_dma_address(sg);
+		sg_len = sg_dma_len(sg);
+
+		while (sg_len) {
+			offset = addr & 0xffff;
+			len = sg_len;
+			if ((offset + sg_len) > 0x10000)
+				len = 0x10000 - offset;
+
+			ap->prd[idx].addr = cpu_to_le32(addr);
+			ap->prd[idx].flags_len = cpu_to_le32(len & 0xffff);
+			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+			idx++;
+			sg_len -= len;
+			addr += len;
+		}
+	}
+
+	if (idx) {
+		u32 len = ap->prd[idx - 1].flags_len;
+		if (len > SG_COUNT_ASIC_BUG) {
+			u32 addr, len;
+
+			VPRINTK("Last PRD split\n");
+			
+			len = le32_to_cpu(ap->prd[idx - 1].flags_len) - SG_COUNT_ASIC_BUG;
+			addr = le32_to_cpu(ap->prd[idx - 1].addr);
+			ap->prd[idx - 1].flags_len = cpu_to_le32(len);
+			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+			
+			ap->prd[idx].flags_len = cpu_to_le32(SG_COUNT_ASIC_BUG);
+			ap->prd[idx].addr = cpu_to_le32(addr + len);
+			idx++;
+			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr + len, SG_COUNT_ASIC_BUG);
+		}
+		ap->prd[idx - 1].flags_len |= cpu_to_le32(ATA_PRD_EOT);
+	}
+}
+
 static void pdc_qc_prep(struct ata_queued_cmd *qc)
 {
 	struct pdc_port_priv *pp = qc->ap->private_data;
@@ -540,7 +614,7 @@

 	switch (qc->tf.protocol) {
 	case ATA_PROT_DMA:
-		ata_qc_prep(qc);
+		pdc_qc_prep(qc);
 		/* fall through */

 	case ATA_PROT_NODATA:


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC]
  2007-10-27 13:24                   ` [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4) Alexander Sabourenkov
@ 2007-10-27 13:44                     ` Alexander Sabourenkov
  2007-10-27 14:08                       ` Re[2]: [PATCH-RFC] MisterE
  2007-10-27 15:16                       ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
  0 siblings, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 13:44 UTC (permalink / raw)
  To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff

Alexander Sabourenkov wrote:
> Hello.
> 
> There appears to be a hardware bug in that it chokes on scatterlist
> if the last item is larger than 164 bytes.
> 
> The patch that follows fixes my problem on 2.6.22.
> 
> I can't think of a way to avoid second pass over scatterlist without
> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
> 
> 

Sorry, this was wrong patch :(. Two days looking at vendor code must
have driven me insane. Will send the correct one asap.

-- 

./lxnt


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re[2]: [PATCH-RFC]
  2007-10-27 13:44                     ` [PATCH-RFC] Alexander Sabourenkov
@ 2007-10-27 14:08                       ` MisterE
  2007-10-27 15:09                         ` [PATCH-RFC] Alexander Sabourenkov
  2007-10-27 15:16                       ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
  1 sibling, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-27 14:08 UTC (permalink / raw)
  To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, benh, jgarzik, jeff

Hello Alexander,

Saturday, October 27, 2007, 3:44:51 PM, you wrote:


>> There appears to be a hardware bug in that it chokes on scatterlist
>> if the last item is larger than 164 bytes.

Can you confirm that this only happens when running in 3.0 Gbps mode?
I have the drives jumpered and have no errors. I tested the "copy to
null" method several times with several kernel versions. I'm now in
the phase of copying all my data to the fileserver.

I'm willing to try your patch, but I'm not an experienced Linux guru ;)
I once patched the kernel source (2.6.23 to 2.6.23.1), but I got stuck on
how to install the updated driver....

-- 
Best regards,
 MisterE                            mailto:MisterE2002@zonnet.nl



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC]
  2007-10-27 14:08                       ` Re[2]: [PATCH-RFC] MisterE
@ 2007-10-27 15:09                         ` Alexander Sabourenkov
  0 siblings, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 15:09 UTC (permalink / raw)
  To: MisterE; +Cc: linux-ide

MisterE wrote:
> 
> Can you confirm that this only will happen when running at 300Gb mode?

I confirm that without this patch errors happen on both 150 and 300
modes, on both jumpered and unjumpered drives. It seems that errors are
highly hardware/configuration dependent.

> I'm willing to try your patch but i'm not a experienced linux guru ;)

I would not advise trying this patch now if you do not experience
problems, and certainly not with any valuable data behind the controller.

-- 

./lxnt




^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-27 13:44                     ` [PATCH-RFC] Alexander Sabourenkov
  2007-10-27 14:08                       ` Re[2]: [PATCH-RFC] MisterE
@ 2007-10-27 15:16                       ` Alexander Sabourenkov
  2007-10-27 18:09                         ` Alan Cox
  2007-10-28 10:29                         ` Jeff Garzik
  1 sibling, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 15:16 UTC (permalink / raw)
  To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff

Hello.
Once again,

The controller appears to have a hardware bug: it chokes on a scatterlist
whose last item is larger than 164 bytes. This was discovered by
reading the code of the vendor-supplied driver.

The patch that follows fixes my problem on 2.6.22.

I can't think of a way to avoid second pass over scatterlist without
duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
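
The idea of the workaround can be shown in isolation. Below is a rough
user-space sketch in C, not the driver code: struct prd_entry and
split_last_prd() are hypothetical names, entries are kept in CPU byte
order, and the EOT flag and endianness handling are left out. It assumes
the caller has reserved a spare slot in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SG_COUNT_ASIC_BUG (41u * 4u)  /* 164 bytes, value taken from the patch */

/* Hypothetical CPU-order stand-in for one PRD slot. */
struct prd_entry {
	uint32_t addr;
	uint32_t len;
};

/* If the last entry is longer than the limit, carve its final
 * SG_COUNT_ASIC_BUG bytes into a new trailing entry.
 * Returns the new entry count. The table is left untouched when it is
 * empty or has no free slot, so the caller must reserve one. */
static size_t split_last_prd(struct prd_entry *tbl, size_t n, size_t capacity)
{
	struct prd_entry *last;

	assert(n <= capacity);
	if (n == 0 || n == capacity)
		return n;

	last = &tbl[n - 1];
	if (last->len <= SG_COUNT_ASIC_BUG)
		return n;

	last->len -= SG_COUNT_ASIC_BUG;            /* shorten the old tail */
	tbl[n].addr = last->addr + last->len;      /* new tail starts after it */
	tbl[n].len = SG_COUNT_ASIC_BUG;
	return n + 1;
}
```

For example, a single 4096-byte entry at 0x1000 becomes a 3932-byte entry
followed by a 164-byte entry at 0x1000 + 3932.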





--- a/drivers/ata/sata_promise.c	2007-07-09 03:32:17.000000000 +0400
+++ b/drivers/ata/sata_promise.c	2007-10-27 19:12:46.000000000 +0400
@@ -531,6 +531,87 @@
 	memcpy(buf+31, cdb, cdb_len);
 }

+/**
+ *	pdc_fill_sg - Fill PCI IDE PRD table
+ *	@qc: Metadata associated with taskfile to be transferred
+ *
+ *	Fill PCI IDE PRD (scatter-gather) table with segments
+ *	associated with the current disk command.
+ *	Make sure hardware does not choke on it.
+ *
+ *	LOCKING:
+ *	spin_lock_irqsave(host lock)
+ *
+ */
+static void pdc_fill_sg(struct ata_queued_cmd *qc)
+{
+	struct ata_port *ap = qc->ap;
+	struct scatterlist *sg;
+	unsigned int idx;
+	const u32 SG_COUNT_ASIC_BUG = 41*4;
+
+	if (!(qc->flags & ATA_QCFLAG_DMAMAP))
+		return;
+	
+	WARN_ON(qc->__sg == NULL);
+	WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
+
+	idx = 0;
+	ata_for_each_sg(sg, qc) {
+		u32 addr, offset;
+		u32 sg_len, len;
+
+		/* determine if physical DMA addr spans 64K boundary.
+		 * Note h/w doesn't support 64-bit, so we unconditionally
+		 * truncate dma_addr_t to u32.
+		 */
+		addr = (u32) sg_dma_address(sg);
+		sg_len = sg_dma_len(sg);
+
+		while (sg_len) {
+			offset = addr & 0xffff;
+			len = sg_len;
+			if ((offset + sg_len) > 0x10000)
+				len = 0x10000 - offset;
+
+			ap->prd[idx].addr = cpu_to_le32(addr);
+			ap->prd[idx].flags_len = cpu_to_le32(len & 0xffff);
+			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+			idx++;
+			sg_len -= len;
+			addr += len;
+		}
+	}
+
+	if (idx) {
+		u32 len = le32_to_cpu(ap->prd[idx - 1].flags_len);
+
+		if (len > SG_COUNT_ASIC_BUG) {
+			u32 addr;
+			/* if len < 2*SG_COUNT_ASIC_BUG then last
+			   segment will be larger than next-to-last.
+			   Somewhat ugly :(
+			*/
+
+			VPRINTK("Splitting last PRD.\n");
+
+			addr = le32_to_cpu(ap->prd[idx - 1].addr);
+			ap->prd[idx - 1].flags_len = cpu_to_le32(len - SG_COUNT_ASIC_BUG);
+			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx - 1, addr, len - SG_COUNT_ASIC_BUG);
+			addr += len - SG_COUNT_ASIC_BUG;
+			len  = SG_COUNT_ASIC_BUG;
+			ap->prd[idx].addr = cpu_to_le32(addr);
+			ap->prd[idx].flags_len = cpu_to_le32(len);
+			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+			idx++;
+		}
+
+		ap->prd[idx - 1].flags_len |= cpu_to_le32(ATA_PRD_EOT);
+	}
+}
+
 static void pdc_qc_prep(struct ata_queued_cmd *qc)
 {
 	struct pdc_port_priv *pp = qc->ap->private_data;
@@ -540,7 +621,7 @@

 	switch (qc->tf.protocol) {
 	case ATA_PROT_DMA:
-		ata_qc_prep(qc);
+		pdc_fill_sg(qc);
 		/* fall through */

 	case ATA_PROT_NODATA:
@@ -556,11 +637,11 @@
 		break;

 	case ATA_PROT_ATAPI:
-		ata_qc_prep(qc);
+		pdc_fill_sg(qc);
 		break;

 	case ATA_PROT_ATAPI_DMA:
-		ata_qc_prep(qc);
+		pdc_fill_sg(qc);
 		/*FALLTHROUGH*/
 	case ATA_PROT_ATAPI_NODATA:
 		pdc_atapi_pkt(qc);


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-27 15:16                       ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
@ 2007-10-27 18:09                         ` Alan Cox
  2007-10-27 18:18                           ` Alexander Sabourenkov
  2007-10-28 10:29                         ` Jeff Garzik
  1 sibling, 1 reply; 37+ messages in thread
From: Alan Cox @ 2007-10-27 18:09 UTC (permalink / raw)
  Cc: Alexander Sabourenkov, linux-ide, Tejun Heo, MisterE, benh,
	jgarzik, jeff

> I can't think of a way to avoid second pass over scatterlist without
> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).

This appears to be incomplete:

> +			VPRINTK("Splitting last PRD.\n");
> +
> +			ap->prd[idx - 1].flags_len -= cpu_to_le32(SG_COUNT_ASIC_BUG);
> +			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx - 1, addr, SG_COUNT_ASIC_BUG);
> +			
> +			addr = le32_to_cpu(ap->prd[idx - 1].addr) + len - SG_COUNT_ASIC_BUG;
> +			len  = SG_COUNT_ASIC_BUG;
> +			ap->prd[idx].addr = cpu_to_le32(addr);
> +			ap->prd[idx].flags_len = cpu_to_le32(len);
> +			VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
> +
> +			idx++;

What guarantees you have enough PRD entries to do this without changing
the limit in the structures ?

Otherwise looks good

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-27 18:09                         ` Alan Cox
@ 2007-10-27 18:18                           ` Alexander Sabourenkov
  2007-10-27 18:37                             ` Alexander Sabourenkov
  2007-10-28  8:21                             ` Jeff Garzik
  0 siblings, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 18:18 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff

Alan Cox wrote:
>> I can't think of a way to avoid second pass over scatterlist without
>> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
> 
> This appears to be incomplete:
> 

[...]

> 
> What guarantees you have enough PRD entries to do this without changing
> the limit in the structures ?
> 
> Otherwise looks good

PRD entries count is 256
include/linux/ata.h:
	ATA_MAX_PRD		= 256,
	ATA_PRD_TBL_SZ          = (ATA_MAX_PRD * ATA_PRD_SZ),

drivers/ata/libata-core.c:
 ap->prd = dmam_alloc_coherent(dev, ATA_PRD_TBL_SZ, &ap->prd_dma,

sata_promise Scsi_Host declares support for half of that:

include/linux/libata.h:
LIBATA_MAX_PRD		= ATA_MAX_PRD / 2,

drivers/ata/sata_promise.c
    .sg_tablesize           = LIBATA_MAX_PRD,


PS: Vendor code has this limit at 32.

-- 

./lxnt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-27 18:18                           ` Alexander Sabourenkov
@ 2007-10-27 18:37                             ` Alexander Sabourenkov
  2007-10-28  8:21                             ` Jeff Garzik
  1 sibling, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 18:37 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: Alan Cox, linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff

Alexander Sabourenkov wrote:
> Alan Cox wrote:
>>> I can't think of a way to avoid second pass over scatterlist without
>>> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
>> This appears to be incomplete:
>>
> 
> [...]
> 
>> What guarantees you have enough PRD entries to do this without changing
>> the limit in the structures ?
>>
>> Otherwise looks good
> 
> PRD entries count is 256
> include/linux/ata.h:
> 	ATA_MAX_PRD		= 256,
> 	ATA_PRD_TBL_SZ          = (ATA_MAX_PRD * ATA_PRD_SZ),
> 
> drivers/ata/libata-core.c:
>  ap->prd = dmam_alloc_coherent(dev, ATA_PRD_TBL_SZ, &ap->prd_dma,
> 
> sata_promise Scsi_Host declares support for half of that:
> 
> include/linux/libata.h:
> LIBATA_MAX_PRD		= ATA_MAX_PRD / 2,
> 
> drivers/ata/sata_promise.c
>     .sg_tablesize           = LIBATA_MAX_PRD,
> 
> 
> PS: Vendor code has this limit at 32.
> 

That's an interesting question in itself. I don't know what limits the PRD
count, but if it's the hardware, then the driver should somehow make sure
that it gets no more than the hardware can handle, minus one for this
erratum.

Right now the driver declares that any hardware it supports can handle 128
PRD entries. If this is not true for some existing specimen,
we're inviting trouble.

-- 

./lxnt

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-27 18:18                           ` Alexander Sabourenkov
  2007-10-27 18:37                             ` Alexander Sabourenkov
@ 2007-10-28  8:21                             ` Jeff Garzik
  2007-10-28 20:03                               ` Alexander Sabourenkov
  1 sibling, 1 reply; 37+ messages in thread
From: Jeff Garzik @ 2007-10-28  8:21 UTC (permalink / raw)
  To: Alexander Sabourenkov
  Cc: Alan Cox, linux-ide, Tejun Heo, MisterE, benh, jgarzik

Alexander Sabourenkov wrote:
> Alan Cox wrote:
>>> I can't think of a way to avoid second pass over scatterlist without
>>> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
>> This appears to be incomplete:
>>
> 
> [...]
> 
>> What guarantees you have enough PRD entries to do this without changing
>> the limit in the structures ?
>>
>> Otherwise looks good
> 
> PRD entries count is 256
> include/linux/ata.h:
> 	ATA_MAX_PRD		= 256,
> 	ATA_PRD_TBL_SZ          = (ATA_MAX_PRD * ATA_PRD_SZ),
> 
> drivers/ata/libata-core.c:
>  ap->prd = dmam_alloc_coherent(dev, ATA_PRD_TBL_SZ, &ap->prd_dma,
> 
> sata_promise Scsi_Host declares support for half of that:
> 
> include/linux/libata.h:
> LIBATA_MAX_PRD		= ATA_MAX_PRD / 2,
> 
> drivers/ata/sata_promise.c
>     .sg_tablesize           = LIBATA_MAX_PRD,

Alan's point was that the existing code will give you up to 
LIBATA_MAX_PRD entries.  After the post-virtual-merge splitting code in 
ata_fill_sg() executes, the worst case result is ATA_MAX_PRD entries.

Thus, since your code has the potential to increase the number of s/g 
entries above that, it can potentially corrupt memory, lock up the 
machine, all the wonderful things that can happen when you run off the 
end of the s/g list.

The fix is to decrease .sg_tablesize (LIBATA_MAX_PRD - 2 perhaps?) so 
that you guarantee this worst case never occurs, by guaranteeing that 
the system never sends you enough s/g entries to cause your code to go 
out of bounds.
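
To make the arithmetic concrete, here is a small C sketch. It assumes each
DMA segment is at most 64 KiB, so it can cross at most one 64 KiB boundary
and needs at most two PRD entries. prd_entries_for() and worst_case_prd()
are illustrative names, not libata functions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of PRD entries one DMA segment needs when no entry may cross
 * a 64 KiB boundary (the same splitting rule as the loop in the patch). */
static unsigned prd_entries_for(uint32_t addr, uint32_t len)
{
	unsigned count = 0;

	while (len) {
		uint32_t offset = addr & 0xffffu;
		uint32_t chunk = len;

		/* clamp the chunk so it ends at the next 64 KiB boundary */
		if (chunk > 0x10000u - offset)
			chunk = 0x10000u - offset;

		count++;
		len -= chunk;
		addr += chunk;
	}
	return count;
}

/* Worst-case table usage: every segment (assumed <= 64 KiB) splits in
 * two, plus one extra entry for the last-element workaround. */
static unsigned worst_case_prd(unsigned sg_tablesize)
{
	return 2u * sg_tablesize + 1u;
}
```

With .sg_tablesize = LIBATA_MAX_PRD (128), the worst case is
2 * 128 + 1 = 257 entries, one more than ATA_MAX_PRD (256). With
LIBATA_MAX_PRD - 2 (126) it is 253, which fits.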

	Jeff




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-27 15:16                       ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
  2007-10-27 18:09                         ` Alan Cox
@ 2007-10-28 10:29                         ` Jeff Garzik
  2007-10-28 11:52                           ` Alexander Sabourenkov
  1 sibling, 1 reply; 37+ messages in thread
From: Jeff Garzik @ 2007-10-28 10:29 UTC (permalink / raw)
  To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik

BTW, looking at the Promise code I see

> cam_con.h:
> /* for ASIC bug, limit the last element of SG byteCount must < 32 Dword */
> #define SG_COUNT_ASIC_BUG       32
> //#define SG_COUNT_ASIC_BUG     128

	and in the code itself

> /* check PRD table, last element <= (32 Dword), fix ASIC bug */

(though the code obviously uses SG_COUNT_ASIC_BUG==32, as the first 
paste indicates)

so it seems like Promise first used 128 (32 dwords), but then backed 
down to 32 (8 dwords).

Either way, we definitely have an ASIC bug to work around, it seems...

	Jeff





* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-28 11:52                           ` Alexander Sabourenkov
@ 2007-10-28 11:10                             ` Jeff Garzik
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff Garzik @ 2007-10-28 11:10 UTC (permalink / raw)
  To: Alexander Sabourenkov, Mikael Pettersson
  Cc: linux-ide, Tejun Heo, MisterE, benh

Alexander Sabourenkov wrote:
> Jeff Garzik wrote:
>> BTW, looking at the Promise code I see
>>
>>> cam_con.h:
>>> /* for ASIC bug, limit the last element of SG byteCount must < 32
>>> Dword */
>>> #define SG_COUNT_ASIC_BUG       32
>>> //#define SG_COUNT_ASIC_BUG     128
>>     and in the code itself
>>
>>> /* check PRD table, last element <= (32 Dword), fix ASIC bug */
>> (though the code obviously uses SG_COUNT_ASIC_BUG==32, as the first
>> paste indicates)
>>
>> so it seems like Promise first used 128 (32 dwords), but then backed
>> down to 32 (8 dwords).
>>
> 
> Which version is this define from?
> 
> Both versions that are available now from their website define it at 41*4:

Mikael Pettersson wrote:
> You're looking at the old pdc-ultra2 driver. The newer unified
> sataii150-300 driver (v1.01.0.23) upped the value to 41*4.


I was looking at pdc-ulsata2_1.00.0.15.tgz, which was the latest driver 
that Promise's website gave me when I listed "SATA300 TX4" as my product.

Sounds like that is outdated information, thanks for the correction!

	Jeff




* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-28 10:29                         ` Jeff Garzik
@ 2007-10-28 11:52                           ` Alexander Sabourenkov
  2007-10-28 11:10                             ` Jeff Garzik
  0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-28 11:52 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik

Jeff Garzik wrote:
> BTW, looking at the Promise code I see
> 
>> cam_con.h:
>> /* for ASIC bug, limit the last element of SG byteCount must < 32
>> Dword */
>> #define SG_COUNT_ASIC_BUG       32
>> //#define SG_COUNT_ASIC_BUG     128
> 
>     and in the code itself
> 
>> /* check PRD table, last element <= (32 Dword), fix ASIC bug */
> 
> (though the code obviously uses SG_COUNT_ASIC_BUG==32, as the first
> paste indicates)
> 
> so it seems like Promise first used 128 (32 dwords), but then backed
> down to 32 (8 dwords).
> 

Which version is this define from?

Both versions that are available now from their website define it at 41*4:


/* for ASIC bug, limit the last element of SG byteCount must <= 41 Dword */
#define SG_COUNT_ASIC_BUG       41*4
//#define SG_COUNT_ASIC_BUG     32
//#define SG_COUNT_ASIC_BUG     128
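
For illustration only, one plausible shape for a workaround built on such a limit. This is neither the Promise driver code nor the patch posted in this thread. struct prd_entry and split_last_prd() are hypothetical names, and the sketch assumes the rule is simply "the last element must not exceed SG_COUNT_ASIC_BUG bytes".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 41 dwords = 164 bytes.  Parenthesised here, unlike the vendor define,
 * so the macro is safe inside larger expressions such as x / LIMIT.
 */
#define SG_COUNT_ASIC_BUG (41 * 4)

/* Hypothetical, simplified PRD entry: bus address and byte count. */
struct prd_entry {
	uint32_t addr;
	uint32_t len;
};

/*
 * If the last element is longer than the limit, carve its final
 * SG_COUNT_ASIC_BUG bytes off into a new last element.  Returns the new
 * element count.  The caller must have reserved room for one extra
 * entry (n < cap), which is why .sg_tablesize has to shrink.
 */
static size_t split_last_prd(struct prd_entry *prd, size_t n, size_t cap)
{
	uint32_t len;

	assert(n > 0 && n <= cap);
	len = prd[n - 1].len;
	if (len <= SG_COUNT_ASIC_BUG)
		return n;		/* already within the limit */

	assert(n < cap);		/* one spare slot is required */
	prd[n - 1].len = len - SG_COUNT_ASIC_BUG;
	prd[n].addr = prd[n - 1].addr + prd[n - 1].len;
	prd[n].len = SG_COUNT_ASIC_BUG;
	return n + 1;
}
```

The extra slot consumed by the split is exactly the kind of growth that has to be budgeted for when choosing .sg_tablesize.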


-- 

./lxnt










* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
  2007-10-28  8:21                             ` Jeff Garzik
@ 2007-10-28 20:03                               ` Alexander Sabourenkov
  0 siblings, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-28 20:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Alan Cox, linux-ide, Tejun Heo, MisterE, benh, jgarzik

Jeff Garzik wrote:
> 
> Alan's point was that the existing code will give you up to
> LIBATA_MAX_PRD entries.  After the post-virtual-merge splitting code in
> ata_fill_sg() executes, the worst case result is ATA_MAX_PRD entries.
> 
> Thus, since your code has the potential to increase the number of s/g
> entries above that, it can potentially corrupt memory, lock up the
> machine, all the wonderful things that can happen when you run off the
> end of the s/g list.
> 
> The fix is to decrease .sg_tablesize (LIBATA_MAX_PRD - 2 perhaps?) so
> that you guarantee this worst case never occurs, by guaranteeing that
> the system never sends you enough s/g entries to cause your code to go
> out of bounds.
> 

Ah, now I understand. Thanks for the explanation.

I take it something guarantees that s/g entry size can not exceed 128K.


-- 

./lxnt


Thread overview: 37+ messages
2007-10-03  7:26 Re[2]: Sata Sil3512 bug? Mikael Pettersson
2007-10-03  8:31 ` Alexander Sabourenkov
2007-10-03 14:45   ` Re[2]: " MisterE
2007-10-03 14:50     ` Alan Cox
2007-10-14 12:07   ` Re[2]: " MisterE
2007-10-15  8:44     ` Alexander Sabourenkov
2007-10-17 12:39   ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
2007-10-17 12:54     ` Alexander Sabourenkov
2007-10-17 15:04       ` Re[2]: " MisterE
2007-10-17 19:21         ` Peter Favrholdt
2007-10-19 12:02           ` Re[2]: " MisterE
2007-10-18 21:07         ` Alexander Sabourenkov
2007-10-19  1:26           ` Tejun Heo
2007-10-19 21:06             ` Alexander Sabourenkov
2007-10-19 22:58               ` Re[2]: " MisterE
2007-10-19 23:58               ` Tejun Heo
2007-10-20 21:50                 ` Alexander Sabourenkov
2007-10-27 13:24                   ` [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4) Alexander Sabourenkov
2007-10-27 13:44                     ` [PATCH-RFC] Alexander Sabourenkov
2007-10-27 14:08                       ` Re[2]: [PATCH-RFC] MisterE
2007-10-27 15:09                         ` [PATCH-RFC] Alexander Sabourenkov
2007-10-27 15:16                       ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
2007-10-27 18:09                         ` Alan Cox
2007-10-27 18:18                           ` Alexander Sabourenkov
2007-10-27 18:37                             ` Alexander Sabourenkov
2007-10-28  8:21                             ` Jeff Garzik
2007-10-28 20:03                               ` Alexander Sabourenkov
2007-10-28 10:29                         ` Jeff Garzik
2007-10-28 11:52                           ` Alexander Sabourenkov
2007-10-28 11:10                             ` Jeff Garzik
  -- strict thread matches above, loose matches on Subject: below --
2007-10-04  0:46 Sata Sil3512 bug? Richard Scobie
2007-09-27 13:51 MisterE
2007-09-28 12:25 ` Tejun Heo
2007-09-28 15:25   ` Re[2]: " MisterE
2007-09-28 15:51     ` Alan Cox
2007-09-28 16:55       ` Tejun Heo
2007-10-02 19:20         ` Re[2]: " MisterE
2007-10-04  1:27           ` Tejun Heo
2007-10-13 16:36             ` Re[2]: " MisterE
2007-10-18  3:29               ` Tejun Heo
