* Sata Sil3512 bug?
@ 2007-09-27 13:51 MisterE
2007-09-28 12:25 ` Tejun Heo
0 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-09-27 13:51 UTC (permalink / raw)
To: jgarzik, linux-ide, benh
[-- Attachment #1: Type: text/plain, Size: 1037 bytes --]
Hello,
First off, I'm quite new to Linux. I don't know the official way to
report bugs, and I'm not even sure that the bug is related to the SATA
driver. I hope you can make some suggestions.
I recently bought 2 Sweex SATA controllers (without RAID). This device
contains the Sil3512 chip.
I connected it to my D815EEA motherboard with a Samsung hard drive.
When I mounted it and connected to it with WinSCP or Samba I got
"hangs": the file transfer stopped for a couple of seconds.
The logs in /var and the screen are flooded with errors like those in "samsung error.txt".
I have now bought some Western Digital drives. I get similar problems
(wd error.txt), but no "hangs".
I've tried the controller in another motherboard, the ASUS CUSL2 (with similar specs),
and I don't have any problems there. Can you help? I've included some logs
which may be of use.
btw: I use Debian unstable. I use the same hard drive with the OS (IDE drive)
on both systems, so we can exclude a faulty OS.
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
[-- Attachment #2: samsung error.txt --]
[-- Type: text/plain, Size: 3880 bytes --]
Sep 24 17:42:43 fileserver kernel: NTFS driver 2.1.28 [Flags: R/W MODULE].
Sep 24 17:42:43 fileserver kernel: NTFS volume version 3.1.
Sep 24 17:44:05 fileserver kernel: NTFS volume version 3.1.
Sep 24 17:44:36 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 24 17:44:36 fileserver kernel: ata2.00: (BMDMA2 stat 0x6c0009)
Sep 24 17:44:36 fileserver kernel: ata2.00: cmd c8/00:0c:a8:01:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 6144 in
Sep 24 17:44:36 fileserver kernel: res 51/04:00:b3:01:bd/00:00:00:00:00/e1 Emask 0x1 (device error)
Sep 24 17:44:36 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:44:36 fileserver kernel: ata2: EH complete
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 24 17:44:36 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 24 17:44:36 fileserver kernel: ata2.00: (BMDMA2 stat 0x6d0009)
Sep 24 17:44:36 fileserver kernel: ata2.00: cmd c8/00:07:38:19:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 3584 in
Sep 24 17:44:36 fileserver kernel: res 51/04:00:3e:19:bd/00:00:00:00:00/e1 Emask 0x1 (device error)
Sep 24 17:44:36 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:44:36 fileserver kernel: ata2: EH complete
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:44:36 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 24 17:45:06 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 24 17:45:06 fileserver kernel: ata2.00: cmd c8/00:80:3f:21:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 65536 in
Sep 24 17:45:06 fileserver kernel: res 40/00:00:3e:19:bd/00:00:00:00:00/e1 Emask 0x4 (timeout)
Sep 24 17:45:07 fileserver kernel: ata2: soft resetting port
Sep 24 17:45:07 fileserver kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 24 17:45:07 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:45:07 fileserver kernel: ata2: EH complete
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:45:07 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 24 17:45:37 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 24 17:45:37 fileserver kernel: ata2.00: cmd c8/00:80:3f:24:bd/00:00:00:00:00/e1 tag 0 cdb 0x0 data 65536 in
Sep 24 17:45:37 fileserver kernel: res 40/00:00:3e:19:bd/00:00:00:00:00/e1 Emask 0x4 (timeout)
Sep 24 17:45:37 fileserver kernel: ata2: soft resetting port
Sep 24 17:45:37 fileserver kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 24 17:45:37 fileserver kernel: ata2.00: configured for UDMA/100
Sep 24 17:45:37 fileserver kernel: ata2: EH complete
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 24 17:45:37 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[-- Attachment #3: wd error.txt --]
[-- Type: text/plain, Size: 883 bytes --]
Sep 25 14:09:57 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
Sep 25 14:09:57 fileserver kernel: ata2.00: (BMDMA2 stat 0x650001)
Sep 25 14:09:57 fileserver kernel: ata2.00: cmd ca/00:00:9b:39:02/00:00:00:00:00/e0 tag 0 cdb 0x0 data 131072 out
Sep 25 14:09:57 fileserver kernel: res 51/04:20:7b:3a:02/00:00:00:00:00/e0 Emask 0x1 (device error)
Sep 25 14:09:57 fileserver kernel: ata2.00: configured for UDMA/33
Sep 25 14:09:57 fileserver kernel: ata2: EH complete
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] Write Protect is off
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Sep 25 14:09:57 fileserver kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
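The libata exception lines in the two logs above follow a fixed register layout, so they can be decoded mechanically. Below is a minimal Python sketch, assuming the usual libata `cmd`/`res` layout (command or status, feature or error, sector count, LBA low/mid/high, the high-order bytes, then the device register), 28-bit LBA commands, and the standard ATA status/error and SATA SError bit positions. The function names and the short bit labels are made up for illustration and are not part of any kernel interface.

```python
import re

# ATA status and error register bits (bit mask -> conventional short name).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF"}
# Diagnostic half of the SATA SError register (bit number -> label, labels are illustrative).
SERR_BITS = {16: "PhyRdyChg", 17: "PhyIntErr", 18: "CommWake", 19: "10B8BErr",
             20: "Dispar", 21: "BadCRC", 22: "Handshake", 23: "LinkSeq",
             24: "TrStaTrns", 25: "UnrecogFIS", 26: "DevExch"}
# Only the two opcodes that occur in the logs above.
COMMANDS = {0xC8: "READ DMA", 0xCA: "WRITE DMA"}

# "cmd" or "res" followed by twelve hex bytes separated by '/' or ':'.
TASKFILE_RE = re.compile(r"(cmd|res) ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def flags(value, table):
    """Names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


def decode_serr(value):
    """Names of the SError diagnostic bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


def decode_taskfile(line):
    """Decode one libata 'cmd ...' or 'res ...' line; return None if it is neither."""
    match = TASKFILE_RE.search(line)
    if match is None:
        return None
    kind = match.group(1)
    regs = [int(byte, 16) for byte in re.split(r"[/:]", match.group(2))]
    first, second, nsect, lbal, lbam, lbah = regs[:6]
    device = regs[11]
    # LBA28: bits 27..24 live in the low nibble of the device register.
    result = {"kind": kind,
              "lba28": ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal}
    if kind == "cmd":
        result["command"] = COMMANDS.get(first, hex(first))
        # A sector count of 0 means 256 sectors for 28-bit commands.
        result["sectors"] = nsect or 256
    else:
        result["status"] = flags(first, STATUS_BITS)
        result["error"] = flags(second, ERROR_BITS)
    return result
```

Applied to the Samsung log, the first `cmd` line decodes to a READ DMA of 12 sectors (6144 bytes, matching the log) at LBA 0x1bd01a8, and the `res` line to status DRDY, DSC, ERR with error ABRT, meaning the drive reported the command as aborted. For the WD log, `decode_serr(0x2400000)` returns the handshake and unrecognized-FIS bits, which suggests trouble on the link rather than on the disk surface.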
[-- Attachment #4: lshw asus.txt --]
[-- Type: text/plain, Size: 10549 bytes --]
fileserver
description: Tower Computer
product: System Name
vendor: System Manufacturer
version: System Version
serial: SYS-1234567890
width: 32 bits
capabilities: smbios-2.3 dmi-2.3
configuration: boot=normal chassis=tower
*-core
description: Motherboard
product: CUSL2-C
vendor: ASUSTeK Computer INC.
physical id: 0
version: REV 1.xx
serial: xxxxxxxxxxx
*-firmware
description: BIOS
vendor: Award Software, Inc.
physical id: 0
version: ASUS CUSL2-C ACPI BIOS Revision 1014 Beta 001 (09/20/2002)
size: 64KB
capacity: 448KB
capabilities: pci pnp apm upgrade shadowing escd cdboot bootselect socketedrom edd int13floppy360 int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer int10video acpi usb agp ls120boot zipboot
*-cpu
description: CPU
product: Pentium III (Coppermine)
vendor: Intel Corp.
physical id: 4
bus info: cpu@0
version: 6.8.10
slot: PGA 370
size: 1GHz
capacity: 1600MHz
width: 32 bits
clock: 133MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse up
*-cache:0
description: L1 cache
physical id: 9
slot: L1 Cache
size: 32KB
capacity: 32KB
capabilities: pipeline-burst synchronous internal write-back data
*-cache:1
description: L2 cache
physical id: a
slot: L2 Cache
size: 256KB
capacity: 256KB
capabilities: pipeline-burst synchronous internal write-back data
*-memory
description: System Memory
physical id: 22
slot: System board or motherboard
size: 512MB
capacity: 512MB
*-bank:0
description: DIMM DRAM Synchronous
physical id: 0
slot: DIMM 1
size: 256MB
width: 64 bits
*-bank:1
description: DIMM DRAM Synchronous
physical id: 1
slot: DIMM 2
size: 256MB
width: 64 bits
*-bank:2
description: DIMM DRAM Synchronous [empty]
physical id: 2
slot: DIMM 3
*-pci
description: Host bridge
product: 82815 815 Chipset Host Bridge and Memory Controller Hub
vendor: Intel Corporation
physical id: 100
bus info: pci@0000:00:00.0
version: 02
width: 32 bits
clock: 33MHz
configuration: driver=agpgart-intel module=intel_agp
*-pci:0
description: PCI bridge
product: 82815 815 Chipset AGP Bridge
vendor: Intel Corporation
physical id: 1
bus info: pci@0000:00:01.0
version: 02
width: 32 bits
clock: 66MHz
capabilities: pci normal_decode bus_master
*-display
description: VGA compatible controller
product: MGA G400/G450
vendor: Matrox Graphics, Inc.
physical id: 0
bus info: pci@0000:01:00.0
version: 82
width: 32 bits
clock: 33MHz
capabilities: pm agp agp-2.0 vga bus_master cap_list
configuration: latency=64 maxlatency=32 mingnt=16
*-pci:1
description: PCI bridge
product: 82801 PCI Bridge
vendor: Intel Corporation
physical id: 1e
bus info: pci@0000:00:1e.0
version: 02
width: 32 bits
clock: 33MHz
capabilities: pci normal_decode bus_master
*-storage
description: Mass storage controller
product: SiI 3512 [SATALink/SATARaid] Serial ATA Controller
vendor: Silicon Image, Inc.
physical id: b
bus info: pci@0000:02:0b.0
version: 01
width: 32 bits
clock: 66MHz
capabilities: storage pm bus_master cap_list
configuration: driver=sata_sil latency=32 module=sata_sil
*-network
description: Ethernet interface
product: 83c170 EPIC/100 Fast Ethernet Adapter
vendor: Standard Microsystems Corp [SMC]
physical id: d
bus info: pci@0000:02:0d.0
logical name: eth2
version: 08
serial: 00:e0:29:6c:26:d2
size: 100MB/s
capacity: 100MB/s
width: 32 bits
clock: 33MHz
capabilities: pm bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=epic100 driverversion=2.1 duplex=full ip=10.0.0.12 latency=32 link=yes maxlatency=28 mingnt=8 module=epic100 multicast=yes port=MII speed=100MB/s
*-isa
description: ISA bridge
product: 82801BA ISA Bridge (LPC)
vendor: Intel Corporation
physical id: 1f
bus info: pci@0000:00:1f.0
version: 02
width: 32 bits
clock: 33MHz
capabilities: isa bus_master
configuration: latency=0
*-ide
description: IDE interface
product: 82801BA IDE U100
vendor: Intel Corporation
physical id: 1f.1
bus info: pci@0000:00:1f.1
version: 02
width: 32 bits
clock: 33MHz
capabilities: ide bus_master
configuration: driver=PIIX_IDE latency=0 module=piix
*-ide:0
description: IDE Channel 0
physical id: 0
bus info: ide@0
logical name: ide0
clock: 33MHz
*-disk
description: ATA Disk
product: WDC AC36400L
vendor: Western Digital
physical id: 0
bus info: ide@0.0
logical name: /dev/hda
version: 09.09M08
serial: WD-WM4200668163
size: 6149MB
capacity: 6149MB
capabilities: ata dma lba iordy smart pm partitioned partitioned:dos
configuration: mode=udma2 smart=on
*-volume:0
description: Linux filesystem partition
physical id: 1
bus info: ide@0.0,1
logical name: /dev/hda1
capacity: 5828MB
capabilities: primary bootable
*-volume:1
description: Extended partition
physical id: 2
bus info: ide@0.0,2
logical name: /dev/hda2
size: 321MB
capacity: 321MB
capabilities: primary extended partitioned partitioned:extended
*-logicalvolume
description: Linux swap / Solaris partition
physical id: 5
logical name: /dev/hda5
capacity: 321MB
capabilities: nofs
*-ide:1
description: IDE Channel 1
physical id: 1
bus info: ide@1
logical name: ide1
clock: 33MHz
*-cdrom
description: DVD-RAM writer
product: HL-DT-ST DVDRAM GSA-H42L
physical id: 0
bus info: ide@1.0
logical name: /dev/hdc
version: SL00
serial: K176CJ80810
capabilities: packet atapi cdrom removable nonmagnetic dma lba iordy pm audio cd-r cd-rw dvd dvd-r dvd-ram
configuration: mode=udma4 status=nodisc
*-usb:0
description: USB Controller
product: 82801BA/BAM USB (Hub #1)
vendor: Intel Corporation
physical id: 1f.2
bus info: pci@0000:00:1f.2
version: 02
width: 32 bits
clock: 33MHz
capabilities: uhci bus_master
configuration: driver=uhci_hcd latency=0 module=uhci_hcd
*-usbhost
product: UHCI Host Controller
vendor: Linux 2.6.22-2-686 uhci_hcd
physical id: 1
bus info: usb@1
logical name: usb1
version: 2.06
capabilities: usb-1.10
configuration: maxpower=0mA slots=2 speed=12.0MB/s
*-serial
description: SMBus
product: 82801BA/BAM SMBus
vendor: Intel Corporation
physical id: 1f.3
bus info: pci@0000:00:1f.3
version: 02
width: 32 bits
clock: 33MHz
configuration: driver=i801_smbus latency=0 module=i2c_i801
*-usb:1
description: USB Controller
product: 82801BA/BAM USB (Hub #2)
vendor: Intel Corporation
physical id: 1f.4
bus info: pci@0000:00:1f.4
version: 02
width: 32 bits
clock: 33MHz
capabilities: uhci bus_master
configuration: driver=uhci_hcd latency=0 module=uhci_hcd
*-usbhost
product: UHCI Host Controller
vendor: Linux 2.6.22-2-686 uhci_hcd
physical id: 1
bus info: usb@2
logical name: usb2
version: 2.06
capabilities: usb-1.10
configuration: maxpower=0mA slots=2 speed=12.0MB/s
[-- Attachment #5: lspci asus.txt --]
[-- Type: text/plain, Size: 908 bytes --]
00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82815 815 Chipset AGP Bridge (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #1) (rev 02)
00:1f.3 SMBus: Intel Corporation 82801BA/BAM SMBus (rev 02)
00:1f.4 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #2) (rev 02)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400/G450 (rev 82)
02:0b.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
02:0d.0 Ethernet controller: Standard Microsystems Corp [SMC] 83c170 EPIC/100 Fast Ethernet Adapter (rev 08)
[-- Attachment #6: lshw intel.txt --]
[-- Type: text/plain, Size: 9902 bytes --]
fileserver
description: Computer
width: 32 bits
capabilities: smbios-2.3 dmi-2.3
configuration: boot=normal uuid=82F98A6F-1F59-11D5-BDC8-001083FDCE08
*-core
description: Motherboard
product: D815EEA
vendor: Intel Corporation
physical id: 0
version: AAA10378-405
serial: BLEA11230259
slot: LPT1
*-firmware
description: BIOS
vendor: Intel Corp.
physical id: 0
version: EA81510A.86A.0051.P11.0106190714 (06/19/2001)
size: 64KB
capacity: 448KB
capabilities: pci pnp apm upgrade shadowing escd cdboot bootselect edd int13floppynec int13floppytoshiba int13floppy360 int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer int10video acpi usb agp ls120boot zipboot biosbootspecification
*-cpu
description: CPU
product: Pentium III (Coppermine)
vendor: Intel Corp.
physical id: 4
bus info: cpu@0
version: 6.8.6
slot: J4L1
size: 933MHz
capacity: 1100MHz
width: 32 bits
clock: 133MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse up
*-cache:0
description: L1 cache
physical id: 5
slot: None
size: 32KB
capacity: 32KB
clock: 25MHz (40.0ns)
capabilities: pipeline-burst synchronous internal write-back unified
*-cache:1
description: L2 cache
physical id: 6
slot: None
size: 256KB
capacity: 256KB
capabilities: synchronous internal write-back unified
*-memory
description: System Memory
physical id: 2e
slot: System board or motherboard
size: 256MB
*-bank:0
description: [empty]
physical id: 0
slot: DIMM0
*-bank:1
description: DIMM DRAM Synchronous 100 MHz (10.0 ns)
physical id: 1
slot: DIMM1
size: 128MB
width: 64 bits
clock: 100MHz (10.0ns)
*-bank:2
description: DIMM DRAM Synchronous 100 MHz (10.0 ns)
physical id: 2
slot: DIMM2
size: 128MB
width: 64 bits
clock: 100MHz (10.0ns)
*-pci
description: Host bridge
product: 82815 815 Chipset Host Bridge and Memory Controller Hub
vendor: Intel Corporation
physical id: 100
bus info: pci@0000:00:00.0
version: 02
width: 32 bits
clock: 33MHz
configuration: driver=agpgart-intel module=intel_agp
*-display
description: VGA compatible controller
product: 82815 CGC [Chipset Graphics Controller]
vendor: Intel Corporation
physical id: 2
bus info: pci@0000:00:02.0
version: 02
width: 32 bits
clock: 66MHz
capabilities: pm vga bus_master cap_list
configuration: latency=0
*-pci
description: PCI bridge
product: 82801 PCI Bridge
vendor: Intel Corporation
physical id: 1e
bus info: pci@0000:00:1e.0
version: 02
width: 32 bits
clock: 33MHz
capabilities: pci normal_decode bus_master
*-multimedia
description: Multimedia audio controller
product: ES1371 [AudioPCI-97]
vendor: Ensoniq
physical id: 7
bus info: pci@0000:01:07.0
version: 08
width: 32 bits
clock: 33MHz
capabilities: pm bus_master cap_list
configuration: driver=ENS1371 latency=32 maxlatency=128 mingnt=12 module=snd_ens1371
*-storage
description: Mass storage controller
product: SiI 3512 [SATALink/SATARaid] Serial ATA Controller
vendor: Silicon Image, Inc.
physical id: 9
bus info: pci@0000:01:09.0
version: 01
width: 32 bits
clock: 66MHz
capabilities: storage pm bus_master cap_list
configuration: driver=sata_sil latency=32 module=sata_sil
*-network
description: Ethernet interface
product: 3c905C-TX/TX-M [Tornado]
vendor: 3Com Corporation
physical id: c
bus info: pci@0000:01:0c.0
logical name: eth0
version: 78
serial: 00:01:02:e3:12:bb
size: 100MB/s
capacity: 100MB/s
width: 32 bits
clock: 33MHz
capabilities: pm bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=3c59x duplex=full ip=10.0.0.8 latency=32 link=yes maxlatency=10 mingnt=10 module=3c59x multicast=yes port=MII speed=100MB/s
*-isa
description: ISA bridge
product: 82801BA ISA Bridge (LPC)
vendor: Intel Corporation
physical id: 1f
bus info: pci@0000:00:1f.0
version: 02
width: 32 bits
clock: 33MHz
capabilities: isa bus_master
configuration: latency=0
*-ide
description: IDE interface
product: 82801BA IDE U100
vendor: Intel Corporation
physical id: 1f.1
bus info: pci@0000:00:1f.1
version: 02
width: 32 bits
clock: 33MHz
capabilities: ide bus_master
configuration: driver=PIIX_IDE latency=0 module=piix
*-ide
description: IDE Channel 0
physical id: 0
bus info: ide@0
logical name: ide0
clock: 33MHz
*-disk
description: ATA Disk
product: WDC AC36400L
vendor: Western Digital
physical id: 0
bus info: ide@0.0
logical name: /dev/hda
version: 09.09M08
serial: WD-WM4200668163
size: 6149MB
capacity: 6149MB
capabilities: ata dma lba iordy smart pm partitioned partitioned:dos
configuration: mode=udma2 smart=on
*-volume:0
description: Linux filesystem partition
physical id: 1
bus info: ide@0.0,1
logical name: /dev/hda1
capacity: 5828MB
capabilities: primary bootable
*-volume:1
description: Extended partition
physical id: 2
bus info: ide@0.0,2
logical name: /dev/hda2
size: 321MB
capacity: 321MB
capabilities: primary extended partitioned partitioned:extended
*-logicalvolume
description: Linux swap / Solaris partition
physical id: 5
logical name: /dev/hda5
capacity: 321MB
capabilities: nofs
*-usb:0
description: USB Controller
product: 82801BA/BAM USB (Hub #1)
vendor: Intel Corporation
physical id: 1f.2
bus info: pci@0000:00:1f.2
version: 02
width: 32 bits
clock: 33MHz
capabilities: uhci bus_master
configuration: driver=uhci_hcd latency=0 module=uhci_hcd
*-usbhost
product: UHCI Host Controller
vendor: Linux 2.6.22-2-686 uhci_hcd
physical id: 1
bus info: usb@1
logical name: usb1
version: 2.06
capabilities: usb-1.10
configuration: maxpower=0mA slots=2 speed=12.0MB/s
*-serial
description: SMBus
product: 82801BA/BAM SMBus
vendor: Intel Corporation
physical id: 1f.3
bus info: pci@0000:00:1f.3
version: 02
width: 32 bits
clock: 33MHz
configuration: driver=i801_smbus latency=0 module=i2c_i801
*-usb:1
description: USB Controller
product: 82801BA/BAM USB (Hub #2)
vendor: Intel Corporation
physical id: 1f.4
bus info: pci@0000:00:1f.4
version: 02
width: 32 bits
clock: 33MHz
capabilities: uhci bus_master
configuration: driver=uhci_hcd latency=0 module=uhci_hcd
*-usbhost
product: UHCI Host Controller
vendor: Linux 2.6.22-2-686 uhci_hcd
physical id: 1
bus info: usb@2
logical name: usb2
version: 2.06
capabilities: usb-1.10
configuration: maxpower=0mA slots=2 speed=12.0MB/s
[-- Attachment #7: lspci intel.txt --]
[-- Type: text/plain, Size: 900 bytes --]
00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82815 CGC [Chipset Graphics Controller] (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #1) (rev 02)
00:1f.3 SMBus: Intel Corporation 82801BA/BAM SMBus (rev 02)
00:1f.4 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #2) (rev 02)
01:07.0 Multimedia audio controller: Ensoniq ES1371 [AudioPCI-97] (rev 08)
01:09.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
01:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
* Re: Sata Sil3512 bug?
2007-09-27 13:51 MisterE
@ 2007-09-28 12:25 ` Tejun Heo
2007-09-28 15:25 ` Re[2]: " MisterE
0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-09-28 12:25 UTC (permalink / raw)
To: MisterE; +Cc: jgarzik, linux-ide, benh
Hello,
MisterE wrote:
> I've tried the controller in another motherboard, the ASUS CUSL2 (with similar specs)
> and i don't have any problems. Can you help? I've included some logs
> with may be of use.
Did you use the same cable on both machines? Also, does the problem go
away if you power the hard drive from the power supply of the other machine?
--
tejun
* Re: Sata Sil3512 bug?
2007-09-28 15:25 ` Re[2]: " MisterE
@ 2007-09-28 15:51 ` Alan Cox
2007-09-28 16:55 ` Tejun Heo
0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2007-09-28 15:51 UTC (permalink / raw)
To: MisterE; +Cc: Tejun Heo, jgarzik, linux-ide, benh
> sda1 are corrupted (2 to 4 blocks missing). Copying that data back to
> Windows and it give the same results in Quickpar. So reading does not
> have problems. The data written to hda1 is correct.
We've got a whole pile of reports like this with the 3512 and almost
always Nvidia chipset, plus reports of BIOS updates fixing it. That you
see something similar on intel boards is a bit worrying.
Alan
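The corruption described in the quoted text (a few blocks wrong after writing to sda1) can also be located without Quickpar by comparing the known-good source file with the copy read back from the SATA disk. A minimal Python sketch follows; the function name and the 4096-byte block size are assumptions chosen for illustration, not something taken from this thread.

```python
def differing_blocks(path_a, path_b, block_size=4096):
    """Return the indices of the blocks whose contents differ between two files.

    A block that exists in only one file (different lengths) counts as different.
    """
    bad = []
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        index = 0
        while True:
            block_a = file_a.read(block_size)
            block_b = file_b.read(block_size)
            if not block_a and not block_b:
                break  # both files are exhausted
            if block_a != block_b:
                bad.append(index)
            index += 1
    return bad
```

Something like `differing_blocks("/mnt/hda1/test.bin", "/mnt/sda1/test.bin")` (hypothetical paths) would then list the damaged block numbers. Note that a copy read back right after writing may be served from the page cache instead of the disk, so it is safer to remount the filesystem or reboot before comparing.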
* Re: Sata Sil3512 bug?
2007-09-28 15:51 ` Alan Cox
@ 2007-09-28 16:55 ` Tejun Heo
2007-10-02 19:20 ` Re[2]: " MisterE
0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-09-28 16:55 UTC (permalink / raw)
To: Alan Cox; +Cc: MisterE, jgarzik, linux-ide, benh
Alan Cox wrote:
>> sda1 are corrupted (2 to 4 blocks missing). Copying that data back to
>> Windows and it give the same results in Quickpar. So reading does not
>> have problems. The data written to hda1 is correct.
>
> We've got a whole pile of reports like this with the 3512 and almost
> always Nvidia chipset, plus reports of BIOS updates fixing it. That you
> see something similar on intel boards is a bit worrying.
The sil3112/3512 + nvidia chipset problem usually doesn't involve
device errors or timeouts. It usually corrupts data silently. And,
yeah, data corruption on an Intel board is really disturbing.
MisterE, do you have any processor power-saving mechanism enabled? If
so, can you disable them all and see whether that changes anything?
--
tejun
* Re: Re[2]: Sata Sil3512 bug?
@ 2007-10-03 7:26 Mikael Pettersson
2007-10-03 8:31 ` Alexander Sabourenkov
0 siblings, 1 reply; 37+ messages in thread
From: Mikael Pettersson @ 2007-10-03 7:26 UTC (permalink / raw)
To: MisterE2002, htejun; +Cc: alan, benh, jgarzik, linux-ide
On Tue, 2 Oct 2007 21:20:23 +0200, MisterE wrote:
> I build another setup with almost the same hardware.
> This motherboard had already the latest bios.
> I notice that the computer does almost never find the hard drive
> although the controller is found every time (with lspci). So i get no
> drive (sda) assigned. I don't always see the "bios" screen from the
> controller at startup. And in the past it showed the hard drive.
> So i could not experiment with this motherboard.
>
> After that i installed Windows XP and used the orginal (sweex)
> drivers with the first motherboard. This also makes the data corrupt.
> So it seems not to be an linux problem. So there is something wrong with
> the motherboard or the 3512 controller.
>
> After that i plugged both hard drives (ide with windows and sata disk)
> to the Asus board. No data corruption. So the hard disks are'nt the
> problem either.
>
> I'm thinking of replacing both 3512 controllers with a Promise SATA300
> TX4. Do you know if there are problems with this device?
(please don't top-post)
There are no known data-corruption issues with Promise SATA cards.
However, some of them, especially the 2nd generation SATA300 TX4,
are known to trigger intermittent error interrupts (that are dealt
with but may cause a speed reduction) in some systems. We're still
scratching our heads on that issue.
/Mikael
> Friday, September 28, 2007, 6:55:47 PM, you wrote:
>
> > Alan Cox wrote:
> >>> sda1 are corrupted (2 to 4 blocks missing). Copying that data back to
> >>> Windows and it give the same results in Quickpar. So reading does not
> >>> have problems. The data written to hda1 is correct.
> >>
> >> We've got a whole pile of reports like this with the 3512 and almost
> >> always Nvidia chipset, plus reports of BIOS updates fixing it. That you
> >> see something similar on intel boards is a bit worrying.
>
> > Multiple sil3112/3512 + nvidia chipset problem doesn't usually involve
> > device errors or timeouts. It usually corrupts data silently. And,
> > yeah, data corruption on intel board is really disturbing.
>
> > MisterE, do you have any processor powersaving mechanism enabled? If
> > so, can you disable all and see whether that changes anything?
* Re: Sata Sil3512 bug?
2007-10-03 7:26 Re[2]: Sata Sil3512 bug? Mikael Pettersson
@ 2007-10-03 8:31 ` Alexander Sabourenkov
2007-10-03 14:45 ` Re[2]: " MisterE
` (2 more replies)
0 siblings, 3 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-03 8:31 UTC (permalink / raw)
To: Mikael Pettersson; +Cc: MisterE2002, htejun, alan, benh, jgarzik, linux-ide
Mikael Pettersson wrote:
>>
>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>> TX4. Do you know if there are problems with this device?
>
> (please don't top-post)
>
> There are no known data-corruption issues with Promise SATA cards.
> However, some of them, especially the 2nd generation SATA300 TX4,
> are known to trigger intermittent error interrupts (that are dealt
> with but may cause a speed reduction) in some systems. We're still
> scratching our heads on that issue.
>
But see this thread:
http://marc.info/?l=linux-ide&m=119122463403033&w=2
http://www.spinics.net/lists/linux-ide/msg14868.html
Personally I would not recommend Promise SATA300 TX4 at the moment.
--
./lxnt
* Re[2]: Sata Sil3512 bug?
2007-10-03 8:31 ` Alexander Sabourenkov
@ 2007-10-03 14:45 ` MisterE
2007-10-03 14:50 ` Alan Cox
2007-10-14 12:07 ` Re[2]: " MisterE
2007-10-17 12:39 ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
2 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-03 14:45 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide
Hello Alexander,
Wednesday, October 3, 2007, 10:31:17 AM, you wrote:
> Mikael Pettersson wrote:
>>>
>>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>>> TX4. Do you know if there are problems with this device?
>>
>> (please don't top-post)
>>
>> There are no known data-corruption issues with Promise SATA cards.
>> However, some of them, especially the 2nd generation SATA300 TX4,
>> are known to trigger intermittent error interrupts (that are dealt
>> with but may cause a speed reduction) in some systems. We're still
>> scratching our heads on that issue.
>>
> But see this thread:
> http://marc.info/?l=linux-ide&m=119122463403033&w=2
> http://www.spinics.net/lists/linux-ide/msg14868.html
> Personally I would not recommend Promise SATA300 TX4 at the moment.
That does not sound hopeful. Highpoint does not have plain SATA controllers
(only softraid controllers). Other brands (real RAID controllers) are too
expensive and/or do not have a PCI interface.
Maybe I should keep those 3512 cards? What are users' experiences
with these controllers (other than on nvidia boards)? I don't really
trust the Intel boards, so using the Asus would be an option.
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
* Re: Sata Sil3512 bug?
2007-10-03 14:45 ` Re[2]: " MisterE
@ 2007-10-03 14:50 ` Alan Cox
0 siblings, 0 replies; 37+ messages in thread
From: Alan Cox @ 2007-10-03 14:50 UTC (permalink / raw)
To: MisterE
Cc: Alexander Sabourenkov, Mikael Pettersson, htejun, benh, jgarzik,
linux-ide
> That is not hopefull. Highpoint does not have sata controllers (Except
> softraid controllers). Other (real raid controllers) brands are too
There are pretty much no "real" RAID controllers in the ATA world except
the very high end pricy ones.
* Re: Sata Sil3512 bug?
@ 2007-10-04 0:46 Richard Scobie
0 siblings, 0 replies; 37+ messages in thread
From: Richard Scobie @ 2007-10-04 0:46 UTC (permalink / raw)
To: linux-ide
> There are pretty much no "real" RAID controllers in the ATA world
> except the very high end pricy ones.
Can anyone comment on the reliability or otherwise of Marvell 88SX6081
controllers?
Supermicro do a reasonably priced non-RAID 8 drive SATA card using it:
http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
Regards,
Richard
* Re: Sata Sil3512 bug?
2007-10-02 19:20 ` Re[2]: " MisterE
@ 2007-10-04 1:27 ` Tejun Heo
2007-10-13 16:36 ` Re[2]: " MisterE
0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-10-04 1:27 UTC (permalink / raw)
To: MisterE; +Cc: Alan Cox, jgarzik, linux-ide, benh
Hello,
MisterE wrote:
> I build another setup with almost the same hardware.
> This motherboard had already the latest bios.
> I notice that the computer does almost never find the hard drive
> although the controller is found every time (with lspci).
What do you mean by "almost never"? Does it find the hard disk
sometimes? Also, please post the kernel boot log after a disk detection
failure. The lspci result only indicates that the PCI device is present.
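To keep such a boot log short, the storage-related lines can be picked out before posting. A minimal Python sketch, assuming the usual libata/SCSI message prefixes seen in the logs earlier in this thread (`ataN`, `ataN.NN`, `sata_sil`, `sd H:C:T:L`, `scsiN`); the function name is made up for illustration:

```python
import re

# Prefixes that libata, the sata_sil driver and the SCSI disk layer put on their messages.
STORAGE_LINE = re.compile(
    r"\b(?:ata\d+(?:\.\d+)?|sata_sil|sd \d+:\d+:\d+:\d+|scsi\d+)\b"
)


def storage_lines(log_lines):
    """Return only the kernel log lines that mention an ATA port, sata_sil or a SCSI disk."""
    return [line.rstrip("\n") for line in log_lines if STORAGE_LINE.search(line)]
```

The input can be any iterable of lines, for example the saved output of `dmesg` or an open kernel log file. If no `ataN` lines appear at all after a failed detection, that in itself is useful information to include in the report.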
> So i get no
> drive (sda) assigned. I don't always see the "bios" screen from the
> controller at startup. And in the past it showed the hard drive.
> So i could not experiment with this motherboard.
Can you re-seat the controller or move it to another slot and see
whether things change?
> After that i installed Windows XP and used the orginal (sweex)
> drivers with the first motherboard. This also makes the data corrupt.
> So it seems not to be an linux problem. So there is something wrong with
> the motherboard or the 3512 controller.
>
> After that i plugged both hard drives (ide with windows and sata disk)
> to the Asus board. No data corruption. So the hard disks are'nt the
> problem either.
Hmmm... It's a relief to know that the problem isn't caused by sata_sil,
but I don't have much of an idea beyond that it seems like something goes
wrong on the PCI bus. :-(
> I'm thinking of replacing both 3512 controllers with a Promise SATA300
> TX4. Do you know if there are problems with this device?
I see occasional bug reports on sata_promise but AFAIK there haven't
been any data corruption reports. Mikael knows much more about Promise
controllers.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re[2]: Sata Sil3512 bug?
2007-10-03 8:31 ` Alexander Sabourenkov
2007-10-03 14:45 ` Re[2]: " MisterE
@ 2007-10-14 12:07 ` MisterE
2007-10-15 8:44 ` Alexander Sabourenkov
2007-10-17 12:39 ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
2 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-14 12:07 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide
Hello,
Alexander, do these problems with the Promise SATA300 TX4 happen to
everyone?
The only alternatives are soft-RAID products used as normal
controllers. Does anyone have experience with the following products?
* Highpoint RocketRAID 1640 (150 MB/s)
* Highpoint RocketRAID 1740 (300 MB/s)
* Adaptec 1210SA
Wednesday, October 3, 2007, 10:31:17 AM, you wrote:
> Mikael Pettersson wrote:
>>>
>>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>>> TX4. Do you know if there are problems with this device?
>>
>> (please don't top-post)
>>
>> There are no known data-corruption issues with Promise SATA cards.
>> However, some of them, especially the 2nd generation SATA300 TX4,
>> are known to trigger intermittent error interrupts (that are dealt
>> with but may cause a speed reduction) in some systems. We're still
>> scratching our heads on that issue.
>>
> But see this thread:
> http://marc.info/?l=linux-ide&m=119122463403033&w=2
> http://www.spinics.net/lists/linux-ide/msg14868.html
> Personally I would not recommend Promise SATA300 TX4 at the moment.
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Sata Sil3512 bug?
2007-10-14 12:07 ` Re[2]: " MisterE
@ 2007-10-15 8:44 ` Alexander Sabourenkov
0 siblings, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-15 8:44 UTC (permalink / raw)
To: MisterE; +Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide
MisterE wrote:
> Hello,
>
> Alexander, does these problems with the Promise SATA300 TX4 happen to
> everyone?
>
Most probably not, as I think it would have been fixed much faster then.
I was waiting for a) the release of 2.6.23, and b) completing my move to
another flat, to retest all the latest developments in mainline and
libata-dev. With a) done and b) almost done, I'll retest and report any
issues quite soon.
Besides, there is a report of TX4 and 2.6.23 not showing problems that
were there with 2.6.22 (see the "Bug is fixed in 2.6.23.1: sata_promise:
port is slow to respond, reset failed" thread).
> The only alternatives are
> using soft-raid products as normal controllers. Does anyone have experiences
> with the following products?
> * Highpoint RocketRAID 1640 (150 MB/s)
> * Highpoint RocketRAID 1740 (300 MB/s)
> * Adaptec 1210SA
>
For any kind of non-hobby task I'd skip building a disk array and buy a
SATA-SCSI/SATA-iSCSI box instead.
While I had many mind-boggling issues with various combinations of SATA
HDDs and onboard and standalone controllers, Promise and Infortrend disk
arrays worked quite reliably.
--
./lxnt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-03 8:31 ` Alexander Sabourenkov
2007-10-03 14:45 ` Re[2]: " MisterE
2007-10-14 12:07 ` Re[2]: " MisterE
@ 2007-10-17 12:39 ` MisterE
2007-10-17 12:54 ` Alexander Sabourenkov
2 siblings, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-17 12:39 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide, jeff
Hello,
Wednesday, October 3, 2007, 10:31:17 AM, you wrote:
> Mikael Pettersson wrote:
>>>
>>> I'm thinking of replacing both 3512 controllers with a Promise SATA300
>>> TX4. Do you know if there are problems with this device?
>>
>> (please don't top-post)
>>
>> There are no known data-corruption issues with Promise SATA cards.
>> However, some of them, especially the 2nd generation SATA300 TX4,
>> are known to trigger intermittent error interrupts (that are dealt
>> with but may cause a speed reduction) in some systems. We're still
>> scratching our heads on that issue.
>>
> But see this thread:
> http://marc.info/?l=linux-ide&m=119122463403033&w=2
> http://www.spinics.net/lists/linux-ide/msg14868.html
> Personally I would not recommend Promise SATA300 TX4 at the moment.
After all the problems I had with the Sweex 3512 cards I returned them
to the shop and decided to buy a SATA300 TX4 (because the shop nearby
had one; unfortunately the shops in the region don't stock Highpoints).
Things looked promising when I inserted the card in both Intel D815EEA
motherboards. No problems detecting the hard drives (unlike with the 3512 cards).
With the 3512 I had LOTS of error messages and corrupt data when writing to it.
Using a separate video card, instead of the onboard one, seemed to reduce the number of errors.
But after some heavy reading/writing with the Promise I got 2 errors (see log file).
But I didn't find any corrupt files, and I cannot reproduce the error.
I'm not sure if these are the "intermittent error interrupts" Mikael
Pettersson mentioned?
PS: as you can see, I got some errors at boot from the boot disk
(hda). I'm not sure what is wrong with it. Sometimes it produces these
errors. I ran a non-destructive read-write test with badblocks but no
bad sectors were found. I don't know if this could influence the SATA controller.
Alexander Sabourenkov can you please tell me where i can find the
"Bug is fixed in 2.6.23.1: sata_promise: port is slow to respond,
reset failed" thread you mentioned?
I also see that the driver is now at version 2.10. Has anything
really critical changed? I've tried testing with Debian stable
(2.6.18-4-686; sata_promise: 1.04) and with Debian unstable
(2.6.22-2-686; sata_promise: 2.07). 2.6.23 is not in the repositories
yet.
So basically the question is this: can I trust the SATA300 TX4 or
should I buy a Highpoint RocketRAID 1640/1740? I can order such a device
online but I need to be sure that it works correctly :(
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-17 12:39 ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
@ 2007-10-17 12:54 ` Alexander Sabourenkov
2007-10-17 15:04 ` Re[2]: " MisterE
0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-17 12:54 UTC (permalink / raw)
To: MisterE; +Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide, jeff
MisterE wrote:
> But after some heavy reading/writing with the promise i got 2
> errors. (see log file).
Log file got lost. Please post relevant parts inline.
> Alexander Sabourenkov can you please tell me where i can find the
> "Bug is fixed in 2.6.23.1: sata_promise: port is slow to respond,
> reset failed" thread you mentioned?
That would be this one:
(got split into two parts)
http://www.spinics.net/lists/linux-ide/msg14069.html
http://www.spinics.net/lists/linux-ide/msg15299.html
>
> So basically the question is this. Can i trust the SATA300 TX4 or
> should i buy a Highpoint RocketRAID 1640/1740?. I can order such device
> online but i need to be sure that it works correctly :(
>
Since you have the hardware, do the tests and decide for yourself.
I'd try copying one (big, preferably over 160G ) disk onto another (with
dd) for a start,
while waiting for answers on mailing lists.
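A copy test like the one suggested above is only useful if the result is checked afterwards. The following is a minimal Python sketch, not taken from the thread: it copies a source to a destination in 1 MiB blocks and then re-reads the destination to compare SHA-256 digests. The helper name copy_and_verify and any paths passed to it are hypothetical. On real disks the re-read may be served from the page cache instead of the drive, so a match should be treated as a necessary check rather than proof.

```python
import hashlib

BLOCK = 1024 * 1024  # 1 MiB per read, comparable to dd bs=1M


def copy_and_verify(src_path, dst_path):
    """Copy src to dst, then re-read dst and compare SHA-256 digests.

    Returns (bytes_copied, digests_match). Hypothetical helper for
    illustration only.
    """
    src_hash = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(BLOCK)
            if not block:
                break
            src_hash.update(block)
            dst.write(block)
            copied += len(block)

    # Re-read exactly as many bytes as were written to the destination.
    dst_hash = hashlib.sha256()
    remaining = copied
    with open(dst_path, "rb") as dst:
        while remaining > 0:
            block = dst.read(min(BLOCK, remaining))
            if not block:
                break  # destination is shorter than what was written
            dst_hash.update(block)
            remaining -= len(block)

    return copied, src_hash.digest() == dst_hash.digest()
```

Called on two whole-disk device nodes this overwrites the destination, so it is only suitable for a scratch drive.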
--
./lxnt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-17 12:54 ` Alexander Sabourenkov
@ 2007-10-17 15:04 ` MisterE
2007-10-17 19:21 ` Peter Favrholdt
2007-10-18 21:07 ` Alexander Sabourenkov
0 siblings, 2 replies; 37+ messages in thread
From: MisterE @ 2007-10-17 15:04 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: Mikael Pettersson, htejun, alan, benh, jgarzik, linux-ide, jeff
Hello Alexander,
Wednesday, October 17, 2007, 2:54:25 PM, you wrote:
> Log file got lost. Please post relevant parts inline.
Sorry, i totally forgot to include them.
I cannot reproduce the errors. The last few times hda did not give errors, so I'm
not sure whether the two are related. (In that thread you mentioned
that you can't explain why Peter Favrholdt's problem got fixed, so
maybe it does indeed have something to do with libata.)
Oct 16 14:10:59 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:10:59 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:10:59 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:12:49 fileserver kernel: kjournald starting. Commit interval 5 seconds
Oct 16 14:12:49 fileserver kernel: EXT3 FS on sda1, internal journal
Oct 16 14:12:49 fileserver kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 16 14:13:34 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:13:34 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:13:34 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 16 14:17:21 fileserver kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Oct 16 14:17:21 fileserver kernel: ide: failed opcode was: unknown
Oct 16 14:17:21 fileserver kernel: hdb: DMA disabled
Oct 16 14:17:21 fileserver kernel: ide0: reset: success
Oct 16 14:32:51 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 16 14:32:51 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 16 14:32:51 fileserver kernel: ata1.00: cmd c8/00:00:77:f6:6c/00:00:00:00:00/e4 tag 0 cdb 0x0 data 131072 in
Oct 16 14:32:51 fileserver kernel: res 50/00:00:76:f7:6c/00:00:00:00:00/e4 Emask 0x2 (HSM violation)
Oct 16 14:32:51 fileserver kernel: ata1: soft resetting port
Oct 16 14:32:51 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 16 14:32:51 fileserver kernel: ata1.00: configured for UDMA/133
Oct 16 14:32:51 fileserver kernel: ata1: EH complete
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 16 14:32:51 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 16 14:44:09 fileserver kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 16 14:48:48 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 16 14:48:48 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 16 14:48:48 fileserver kernel: ata1.00: cmd 25/00:00:3f:d0:26/00:01:23:00:00/e0 tag 0 cdb 0x0 data 131072 in
Oct 16 14:48:48 fileserver kernel: res 50/00:00:3e:d1:26/00:00:23:00:00/e0 Emask 0x2 (HSM violation)
Oct 16 14:48:48 fileserver kernel: ata1: soft resetting port
Oct 16 14:48:49 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 16 14:48:49 fileserver kernel: ata1.00: configured for UDMA/133
Oct 16 14:48:49 fileserver kernel: ata1: EH complete
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 16 14:48:49 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
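The status and error bytes in the log above (0x51, 0x84, 0x50) are bit fields. The sketch below, in Python, decodes them into flag names. The names DriveReady, SeekComplete, Error, DriveStatusError and BadCRC are the ones the kernel prints in these logs; the remaining names follow the traditional ATA register layout as I understand it and are included only for completeness.

```python
# Bit -> name tables, highest bit first. Names not seen in the log above
# are assumptions based on the classic ATA status/error register layout.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

ERROR_BITS = {
    0x80: "BadCRC",
    0x40: "UncorrectableError",
    0x20: "MediaChanged",
    0x10: "SectorIdNotFound",
    0x08: "MediaChangeRequested",
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in names.items() if value & bit]


print(decode(0x51, STATUS_BITS))
print(decode(0x84, ERROR_BITS))
```

For the hda lines this reproduces what the old IDE driver already prints in braces; it is mainly useful for the libata `res` lines, which show only the raw bytes.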
> Since you have the hardware, do the tests and decide for yourself.
> I'd try copying one (big, preferably over 160G ) disk onto another (with
> dd) for a start,
> while waiting for answers on mailing lists.
I can order that 1740 online, but returning something is always more
difficult. So I need to be quite sure that there aren't problems with
this Highpoint.
Tonight I will try the Asus motherboard with 1 drive and much I/O. And
I will create a new array, which takes 7 hours. But how often, or for how
many hours, do you need to try something to prove it does not fail :P
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-17 15:04 ` Re[2]: " MisterE
@ 2007-10-17 19:21 ` Peter Favrholdt
2007-10-19 12:02 ` Re[2]: " MisterE
2007-10-18 21:07 ` Alexander Sabourenkov
1 sibling, 1 reply; 37+ messages in thread
From: Peter Favrholdt @ 2007-10-17 19:21 UTC (permalink / raw)
To: MisterE; +Cc: Alexander Sabourenkov, Mikael Pettersson, linux-ide
Hi,
MisterE wrote:
> Tonight i will try the Asus motherboard with 1 drive and much I/O. And
> i will create a new array which takes 7 hours. But how often/hours do
> you need to try something to prove it does not fail :P
On one box I had problems with the SATA300 TX4 using 2.6.21 through
2.6.22 (different versions). I have 4x500GB Seagate ES SATA drives
connected. The system would run fine, but under stress, i.e. with
all SATA ports loaded, one or two ports would fail, one after the
other. I have _always_ been able to make it fail doing:
dd if=/dev/sda of=/dev/null bs=1M &
dd if=/dev/sdb of=/dev/null bs=1M &
dd if=/dev/sdc of=/dev/null bs=1M &
dd if=/dev/sdd of=/dev/null bs=1M &
The ports would freeze before long, e.g. in less than an hour.
This can be done without even starting the array (mdadm), and since the
test only reads, no data corruption will happen.
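The same kind of read-only stress test can be sketched in Python for anyone who wants per-device byte counts and error messages in one place. This is an illustration under assumptions, not the method used in the thread: the function names are hypothetical, and the device paths in the usage note are simply the ones from the dd commands above.

```python
import threading

BLOCK = 1024 * 1024  # 1 MiB per read, like dd bs=1M


def read_all(path, results):
    """Read path to the end and record (bytes_read, error_text_or_None)."""
    total = 0
    try:
        with open(path, "rb") as f:
            while True:
                block = f.read(BLOCK)
                if not block:
                    break
                total += len(block)
        results[path] = (total, None)
    except OSError as exc:
        # An I/O error or a missing device ends up here.
        results[path] = (total, str(exc))


def parallel_read(paths):
    """Read all paths concurrently, one thread per path."""
    results = {}
    threads = [threading.Thread(target=read_all, args=(p, results))
               for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A call such as `parallel_read(["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"])` would need root privileges; the kernel log still has to be checked separately, because errors that libata recovers from do not surface as read failures.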
The above issue was fixed by updating to vanilla 2.6.23.1.
Until then I had been running 2.6.21-rc2 with a Mikael Pettersson
patch to force the SATA link to 1.5Gbps (this could possibly be accomplished
by jumpers on the drives as well, but I didn't try that).
I have another system (Dell PE1800 = different from the above) running
24x7 using vanilla linux 2.6.19.5. This system has been running without
hiccups for more than a year (current uptime 135 days).
Hope this helps,
Best regards,
Peter
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Sata Sil3512 bug?
2007-10-13 16:36 ` Re[2]: " MisterE
@ 2007-10-18 3:29 ` Tejun Heo
0 siblings, 0 replies; 37+ messages in thread
From: Tejun Heo @ 2007-10-18 3:29 UTC (permalink / raw)
To: MisterE; +Cc: Alan Cox, jgarzik, linux-ide, benh
MisterE wrote:
> Oct 13 13:01:26 fileserver kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x2400000 action 0x0
> Oct 13 13:01:26 fileserver kernel: ata4.00: (BMDMA2 stat 0x650001)
> Oct 13 13:01:26 fileserver kernel: ata4.00: cmd ca/00:f8:47:e1:5e/00:00:00:00:00/e4 tag 0 cdb 0x0 data 126976 out
> Oct 13 13:01:26 fileserver kernel: res 51/04:98:a7:e1:5e/00:00:00:00:00/e4 Emask 0x1 (device error)
> Oct 13 13:01:26 fileserver kernel: ata4.00: configured for UDMA/100
> Oct 13 13:01:26 fileserver kernel: ata4: EH complete
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] Write Protect is off
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> Oct 13 13:01:26 fileserver kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> I'm not sure if it will result in corrupt data? But i don't trust it
> anymore.
That looks like a data transmission error. When you're transferring
massive amounts of data, things like that can happen and it won't cause
data corruption.
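For readers trying to interpret the quoted `cmd` lines, here is a small Python sketch that pulls out the command opcode and the transfer size. It assumes the twelve hex bytes are ordered command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, which is my reading of the libata output and is consistent with the "data N" byte counts printed on the same lines. Only the four opcodes that appear in this thread are named, and every opcode other than 0x25 is treated as a 28-bit command, which is a simplification.

```python
import re

# Opcodes seen in this thread; names per the ATA command set.
OPCODES = {
    0x25: "READ DMA EXT",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
    0xEC: "IDENTIFY DEVICE",
}

CMD_RE = re.compile(r"\bcmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def parse_cmd(line):
    """Extract opcode and transfer size from a libata 'cmd' log line."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    f = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]
    command, nsect, hob_nsect = f[0], f[2], f[7]
    if command == 0x25:
        # 48-bit command: 16-bit sector count, 0 means 65536.
        sectors = ((hob_nsect << 8) | nsect) or 65536
    else:
        # Assumed 28-bit command: 8-bit sector count, 0 means 256.
        sectors = nsect or 256
    return {
        "command": command,
        "name": OPCODES.get(command, "unknown"),
        "sectors": sectors,
        "bytes": sectors * 512,
    }
```

Applied to the `cmd ca/00:f8:...` line quoted above, this yields a WRITE DMA of 248 sectors, which matches the "data 126976 out" on that line.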
> You people advise me to not buy the Promise SATA300 TX4 controller and
> this Sweex PU102 (3512) seems to have problems. Not much choices left
> except the really expensive solutions.
>
> Is it really so hard to build a controller without problems?!? :(
It seems so. :-(
--
tejun
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-17 15:04 ` Re[2]: " MisterE
2007-10-17 19:21 ` Peter Favrholdt
@ 2007-10-18 21:07 ` Alexander Sabourenkov
2007-10-19 1:26 ` Tejun Heo
1 sibling, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-18 21:07 UTC (permalink / raw)
To: linux-ide; +Cc: MisterE, Mikael Pettersson, htejun, alan, benh, jgarzik, jeff
Hello.
I have done some quick tests with 2.6.23/amd64 and unfortunately, the
very same problem persists.
By the way, 8 in (port_status 0x20080000) stands for
PDC_OVERRUN_ERR = (1 << 19), /* S/G byte count larger than HD requires */
Does 'S/G' here by any chance relate to the 'sg' in the 'sg-chaining
work' there is so much talk about on the -kernel mailing list?
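To make the bit arithmetic explicit, the following Python sketch lists which bits are set in the port_status value. Only PDC_OVERRUN_ERR (bit 19) is named, because that is the only definition quoted here; the helper name set_bits is hypothetical and the meaning of the other set bit is deliberately left open.

```python
PDC_OVERRUN_ERR = 1 << 19  # from the driver line quoted above


def set_bits(value):
    """Return the positions of all bits set in a 32-bit value."""
    return [n for n in range(32) if value & (1 << n)]


port_status = 0x20080000
print(set_bits(port_status))                 # bit positions that are set
print(bool(port_status & PDC_OVERRUN_ERR))   # is the overrun bit among them?
```

The hex digit 8 in the fifth position from the right is 0x00080000, i.e. 1 << 19, which is why it maps to the overrun flag.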
In a somewhat parallel development, write errors caused my (other) md
RAID-1 to lose one drive while copying data under 2.6.22
from TX4-attached drives to onboard-VIA-attached ones.
Device: VIA VT6420
00:0f.0 0104: 1106:3149 (rev 80)
Boot:
Oct 17 21:28:25 host sata_via 0000:00:0f.0: version 2.2
Oct 17 21:28:25 host ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20
(level, low) -> IRQ 17
Oct 17 21:28:25 host sata_via 0000:00:0f.0: routed to hard irq line 10
Oct 17 21:28:25 host scsi4 : sata_via
Oct 17 21:28:25 host scsi5 : sata_via
Oct 17 21:28:25 host ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Oct 17 21:28:25 host ata6.00: ATA-7: ST3200827AS, 3.AAH, max UDMA/133
Oct 17 21:28:25 host ata6.00: 390721968 sectors, multi 0: LBA48 NCQ
(depth 0/32)
Oct 17 21:28:25 host ata6.00: configured for UDMA/133
... the first two port resets:
Oct 17 23:10:50 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0
action 0x2
Oct 17 23:10:50 host ata6.00: (BMDMA stat 0x4)
Oct 17 23:10:50 host ata6.00: cmd ca/00:08:e7:30:00/00:00:00:00:00/e0
tag 0 cdb 0x0 data 4096 out
Oct 17 23:10:50 host res 51/84:08:e7:30:00/00:00:00:00:00/e0 Emask 0x10
(ATA bus error)
Oct 17 23:10:50 host ata6: soft resetting port
Oct 17 23:10:50 host ata6.00: configured for UDMA/133
Oct 17 23:10:50 host ata6: EH complete
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] 390721968 512-byte hardware
sectors (200050 MB)
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write Protect is off
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Oct 17 23:10:50 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0
action 0x2
Oct 17 23:10:50 host ata6.00: (BMDMA stat 0x5)
Oct 17 23:10:50 host ata6.00: cmd ca/00:f8:4f:31:00/00:00:00:00:00/e0
tag 0 cdb 0x0 data 126976 out
Oct 17 23:10:50 host res 51/84:f8:4f:31:00/00:00:00:00:00/e0 Emask 0x10
(ATA bus error)
Oct 17 23:10:50 host ata6: soft resetting port
Oct 17 23:10:50 host ata6.00: configured for UDMA/133
Oct 17 23:10:50 host ata6: EH complete
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] 390721968 512-byte hardware
sectors (200050 MB)
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write Protect is off
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Mode Sense: 00 3a 00 00
Oct 17 23:10:50 host sd 5:0:0:0: [sdd] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
... and multiple unsuccessful port resets follow:
Oct 17 23:11:57 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0
action 0x2 frozen
Oct 17 23:11:57 host ata6.00: cmd 25/00:08:7f:bf:28/00:00:16:00:00/e0
tag 0 cdb 0x0 data 4096 in
Oct 17 23:11:57 host res 40/00:f8:4f:31:00/00:00:00:00:00/e0 Emask 0x4
(timeout)
Oct 17 23:12:02 host ata6: port is slow to respond, please be patient
(Status 0xd0)
Oct 17 23:12:07 host ata6: soft resetting port
Oct 17 23:12:37 host ata6.00: qc timeout (cmd 0xec)
Oct 17 23:12:37 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 17 23:12:37 host ata6.00: revalidation failed (errno=-5)
Oct 17 23:12:37 host ata6: failed to recover some devices, retrying in 5
secs
Oct 17 23:12:47 host ata6: port is slow to respond, please be patient
(Status 0xd0)
Oct 17 23:12:52 host ata6: soft resetting port
Oct 17 23:13:22 host ata6.00: qc timeout (cmd 0xec)
Oct 17 23:13:22 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 17 23:13:22 host ata6.00: revalidation failed (errno=-5)
Oct 17 23:13:22 host ata6.00: limiting speed to UDMA/133:PIO3
Oct 17 23:13:22 host ata6: failed to recover some devices, retrying in 5
secs
Oct 17 23:13:32 host ata6: port is slow to respond, please be patient
(Status 0xd0)
Oct 17 23:13:37 host ata6: soft resetting port
Oct 17 23:14:08 host ata6.00: qc timeout (cmd 0xec)
Oct 17 23:14:08 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Oct 17 23:14:08 host ata6.00: revalidation failed (errno=-5)
Oct 17 23:14:08 host ata6.00: disabled
Oct 17 23:14:08 host ata6: EH complete
Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET
driverbyte=DRIVER_OK,SUGGEST_OK
Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 371769215
Oct 17 23:14:08 host raid1: sdd1: rescheduling sector 371769152
Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET
driverbyte=DRIVER_OK,SUGGEST_OK
Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 390379327
Oct 17 23:14:08 host md: super_written gets error=-5, uptodate=0
Oct 17 23:14:08 host raid1: Disk failure on sdd1, disabling device.
I'm unable to reproduce this on 2.6.23, so this is of historic interest
only.
--
./lxnt
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-18 21:07 ` Alexander Sabourenkov
@ 2007-10-19 1:26 ` Tejun Heo
2007-10-19 21:06 ` Alexander Sabourenkov
0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-10-19 1:26 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: linux-ide, MisterE, Mikael Pettersson, alan, benh, jgarzik, jeff
Hello,
Alexander Sabourenkov wrote:
> In a somewhat parallel development, write errors caused my (other) md
> RAID-1 to lose one drive while copying data under 2.6.22
> from TX4-attached drives to onboard-VIA-attached ones.
>
> ... the first two port resets:
>
> Oct 17 23:10:50 host ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0
> action 0x2
> Oct 17 23:10:50 host ata6.00: (BMDMA stat 0x4)
> Oct 17 23:10:50 host ata6.00: cmd ca/00:08:e7:30:00/00:00:00:00:00/e0
> tag 0 cdb 0x0 data 4096 out
> Oct 17 23:10:50 host res 51/84:08:e7:30:00/00:00:00:00:00/e0 Emask 0x10
> (ATA bus error)
> Oct 17 23:10:50 host ata6: soft resetting port
> Oct 17 23:10:50 host ata6.00: configured for UDMA/133
> Oct 17 23:10:50 host ata6: EH complete
[--snip--]
> Oct 17 23:13:37 host ata6: soft resetting port
> Oct 17 23:14:08 host ata6.00: qc timeout (cmd 0xec)
> Oct 17 23:14:08 host ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> Oct 17 23:14:08 host ata6.00: revalidation failed (errno=-5)
> Oct 17 23:14:08 host ata6.00: disabled
> Oct 17 23:14:08 host ata6: EH complete
> Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK,SUGGEST_OK
> Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 371769215
> Oct 17 23:14:08 host raid1: sdd1: rescheduling sector 371769152
> Oct 17 23:14:08 host sd 5:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET
> driverbyte=DRIVER_OK,SUGGEST_OK
> Oct 17 23:14:08 host end_request: I/O error, dev sdd, sector 390379327
> Oct 17 23:14:08 host md: super_written gets error=-5, uptodate=0
> Oct 17 23:14:08 host raid1: Disk failure on sdd1, disabling device.
>
> I'm unable to reproduce this on 2.6.23, so this is of historic interest
> only.
It might not have anything to do with the OS and driver. Some SATA
controllers and/or drives aren't very reliable and they just fail from
time to time. My previous desktop was using sata_nv w/ Seagate SATA
drives and was up 24/7. I used it for about two years and during that
time, there was a single transfer error and it brought the drive down
completely and I had to reboot and rebuild my RAID 1 array. ISTR what
died was the controller port. IIRC, powering the drive off and on
didn't help.
Another interesting case was first-gen SATA hard drives from a certain
vendor. After any transfer error, those drives went completely deaf.
The only way to recover them was removing power, waiting a bit and
reapplying it.
So, my bet for your second report is that your hardware went through
something similar to the above.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-17 19:21 ` Peter Favrholdt
@ 2007-10-19 12:02 ` MisterE
0 siblings, 0 replies; 37+ messages in thread
From: MisterE @ 2007-10-19 12:02 UTC (permalink / raw)
To: Peter Favrholdt; +Cc: Alexander Sabourenkov, Mikael Pettersson, linux-ide
Hello Peter,
Wednesday, October 17, 2007, 9:21:28 PM, you wrote:
> On one box I had problems with the SATA300 TX4 using 2.6.21 through
> 2.6.22 (different versions). I have 4x500GB Seagate ES SATA drives
> connected. The system would run fine, but when put to a stress - i.e.
> loaded on all sata ports one or two ports would fail - one after the
> other. I have _always_ been able to make it fail doing:
> dd if=/dev/sda of=/dev/null bs=1M &
> dd if=/dev/sdb of=/dev/null bs=1M &
> dd if=/dev/sdc of=/dev/null bs=1M &
> dd if=/dev/sdd of=/dev/null bs=1M &
> The ports would freeze before running long - e.g. in less than an hour.
I followed your advice and tested it. I have 4x500GB drives (Western
Digital Caviar SE16 WD5000AAKS). I tested with and without jumpers
(300 and 150 MB/s modes). All tests were done with the Asus CUSL2-C.
1 :: The first run; debian 2.6.18-4-686 (stable); 300 MB/s mode [3 hours in total]:
Oct 17 18:06:12 debian kernel: ata1: no sense translation for status: 0x50
Oct 17 18:06:12 debian kernel: ata1: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 18:06:12 debian kernel: ata1: status=0x50 { DriveReady SeekComplete }
Oct 17 19:37:15 debian kernel: ata1: no sense translation for status: 0x50
Oct 17 19:37:15 debian kernel: ata1: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 19:37:15 debian kernel: ata1: status=0x50 { DriveReady SeekComplete }
Oct 17 19:42:11 debian kernel: ata3: no sense translation for status: 0x50
Oct 17 19:42:12 debian kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 19:42:12 debian kernel: ata3: status=0x50 { DriveReady SeekComplete }
Oct 17 20:23:38 debian kernel: ata1: no sense translation for status: 0x50
Oct 17 20:23:39 debian kernel: ata1: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 20:23:39 debian kernel: ata1: status=0x50 { DriveReady SeekComplete }
Oct 17 20:31:38 debian kernel: ata2: no sense translation for status: 0x50
Oct 17 20:31:38 debian kernel: ata2: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 20:31:38 debian kernel: ata2: status=0x50 { DriveReady SeekComplete }
Oct 17 20:44:56 debian kernel: ata3: no sense translation for status: 0x50
Oct 17 20:44:56 debian kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 17 20:44:56 debian kernel: ata3: status=0x50 { DriveReady SeekComplete }
2 :: Second run (1 hour); same settings:
Oct 18 09:27:47 debian kernel: ata4: no sense translation for status: 0x50
Oct 18 09:27:47 debian kernel: ata4: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 18 09:27:47 debian kernel: ata4: status=0x50 { DriveReady SeekComplete }
Oct 18 09:38:18 debian kernel: ata3: no sense translation for status: 0x50
Oct 18 09:38:18 debian kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Oct 18 09:38:18 debian kernel: ata3: status=0x50 { DriveReady SeekComplete }
3 :: After that, 3 to 5 hours with the drives jumpered. No problems.
4 :: 17:15 - 18:28; 2.6.22-2-686; 300 MB/s mode
Oct 18 13:45:25 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 13:45:25 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 18 13:45:25 fileserver kernel: ata1.00: cmd c8/00:08:00:e6:cb/00:00:00:00:00/e2 tag 0 cdb 0x0 data 4096 in
Oct 18 13:45:25 fileserver kernel: res 50/00:00:07:e6:cb/00:00:00:00:00/e2 Emask 0x2 (HSM violation)
Oct 18 13:45:26 fileserver kernel: ata1: soft resetting port
Oct 18 13:45:26 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 13:45:26 fileserver kernel: ata1.00: configured for UDMA/133
Oct 18 13:45:26 fileserver kernel: ata1: EH complete
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 18 13:45:26 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 18 13:57:19 fileserver kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 13:57:19 fileserver kernel: ata2.00: (port_status 0x20080000)
Oct 18 13:57:19 fileserver kernel: ata2.00: cmd c8/00:08:00:e6:92/00:00:00:00:00/e4 tag 0 cdb 0x0 data 4096 in
Oct 18 13:57:19 fileserver kernel: res 50/00:00:07:e6:92/00:00:00:00:00/e4 Emask 0x2 (HSM violation)
Oct 18 13:57:19 fileserver kernel: ata2: soft resetting port
Oct 18 13:57:20 fileserver kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 13:57:20 fileserver kernel: ata2.00: configured for UDMA/133
Oct 18 13:57:20 fileserver kernel: ata2: EH complete
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] Write Protect is off
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Oct 18 13:57:20 fileserver kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 18 14:09:44 fileserver kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 14:09:44 fileserver kernel: ata1.00: (port_status 0x20080000)
Oct 18 14:09:44 fileserver kernel: ata1.00: cmd c8/00:e0:20:8d:3b/00:00:00:00:00/e6 tag 0 cdb 0x0 data 114688 in
Oct 18 14:09:44 fileserver kernel: res 50/00:00:ff:8d:3b/00:00:00:00:00/e6 Emask 0x2 (HSM violation)
Oct 18 14:09:44 fileserver kernel: ata1: soft resetting port
Oct 18 14:09:44 fileserver kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 14:09:44 fileserver kernel: ata1.00: configured for UDMA/133
Oct 18 14:09:44 fileserver kernel: ata1: EH complete
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 18 14:09:44 fileserver kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 18 14:15:37 fileserver kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
Oct 18 14:15:37 fileserver kernel: ata3.00: (port_status 0x20080000)
Oct 18 14:15:37 fileserver kernel: ata3.00: cmd c8/00:08:00:4a:27/00:00:00:00:00/e7 tag 0 cdb 0x0 data 4096 in
Oct 18 14:15:37 fileserver kernel: res 50/00:00:07:4a:27/00:00:00:00:00/e7 Emask 0x2 (HSM violation)
Oct 18 14:15:37 fileserver kernel: ata3: soft resetting port
Oct 18 14:15:38 fileserver kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 18 14:15:38 fileserver kernel: ata3.00: configured for UDMA/133
Oct 18 14:15:38 fileserver kernel: ata3: EH complete
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] Write Protect is off
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Oct 18 14:15:38 fileserver kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
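For readers decoding the cmd/res lines in the log above: libata prints a taskfile as command/feature:nsect:lbal:lbam:lbah, then the five HOB bytes, then the device byte; on a res line the first two bytes are status and error. Below is a minimal user-space sketch that assumes this field order. The names tf_bytes, tf_parse, tf_lba28 and tf_nsect28 are made up for illustration and are not libata APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the twelve bytes printed per taskfile. */
struct tf_bytes {
	uint8_t command, feature, nsect, lbal, lbam, lbah;
	uint8_t hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
	uint8_t device;
};

/* Parse e.g. "c8/00:e0:20:8d:3b/00:00:00:00:00/e6"; returns 0 on success. */
static int tf_parse(const char *s, struct tf_bytes *tf)
{
	unsigned int v[12];
	int n;

	assert(s != NULL && tf != NULL);
	n = sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
		   &v[0], &v[1], &v[2], &v[3], &v[4], &v[5],
		   &v[6], &v[7], &v[8], &v[9], &v[10], &v[11]);
	if (n != 12)
		return -1;

	tf->command = (uint8_t)v[0];
	tf->feature = (uint8_t)v[1];
	tf->nsect = (uint8_t)v[2];
	tf->lbal = (uint8_t)v[3];
	tf->lbam = (uint8_t)v[4];
	tf->lbah = (uint8_t)v[5];
	tf->hob_feature = (uint8_t)v[6];
	tf->hob_nsect = (uint8_t)v[7];
	tf->hob_lbal = (uint8_t)v[8];
	tf->hob_lbam = (uint8_t)v[9];
	tf->hob_lbah = (uint8_t)v[10];
	tf->device = (uint8_t)v[11];
	return 0;
}

/* LBA28: low 24 bits from lbal/lbam/lbah, top 4 bits from device[3:0]. */
static uint32_t tf_lba28(const struct tf_bytes *tf)
{
	return ((uint32_t)(tf->device & 0x0f) << 24) |
	       ((uint32_t)tf->lbah << 16) |
	       ((uint32_t)tf->lbam << 8) |
	       (uint32_t)tf->lbal;
}

/* Sector count of a 28-bit command; a count byte of 0 means 256 sectors. */
static unsigned int tf_nsect28(const struct tf_bytes *tf)
{
	return tf->nsect ? tf->nsect : 256u;
}
```

Applied to the ata1 entry, cmd c8 (READ DMA) requests 0xe0 = 224 sectors, and 224 * 512 = 114688 bytes matches "data 114688 in". The LBA in the res line is the last sector of that request, and status 0x50 has the ERR bit clear.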
5 :: 2.6.22-2-686 - 2 hours with the drives jumpered. No problems.
> The above issue was fixed by updating to vanilla 2.6.23.1.
So when running in 1.5 Gbps (SATA150) mode there are no problems.
I'm going to try the same with 2.6.23(.1). I'm not really familiar with
updating the kernel. I tried it before using http://www.debianhelp.co.uk/kernel2.6.htm
without much success, but I'm going to try again.
I will post the results later.
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-19 1:26 ` Tejun Heo
@ 2007-10-19 21:06 ` Alexander Sabourenkov
2007-10-19 22:58 ` Re[2]: " MisterE
2007-10-19 23:58 ` Tejun Heo
0 siblings, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-19 21:06 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, MisterE, alan, benh, jgarzik, jeff
Hello.
>
> So, my bet for your second report is your hardware went through
> something similar as above.
>
Thanks for the insight. Let's dismiss it then.
Back to the TX4, I tried libata-dev.git cloned at about 20:00 UTC 19.10,
no perceived difference - parallel read from two drives causes a lot
of errors.
dmesgs with boot and errors are at http://lxnt.info/linux/libata-dev/
I don't know what to try next. Any ideas?
--
./lxnt
* Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-19 21:06 ` Alexander Sabourenkov
@ 2007-10-19 22:58 ` MisterE
2007-10-19 23:58 ` Tejun Heo
1 sibling, 0 replies; 37+ messages in thread
From: MisterE @ 2007-10-19 22:58 UTC (permalink / raw)
To: Alexander Sabourenkov; +Cc: Tejun Heo, linux-ide, alan, benh, jgarzik, jeff
Hello Alexander,
Friday, October 19, 2007, 11:06:02 PM, you wrote:
> I don't know what to try next. Any ideas?
I'm no kernel hacker, but I'll take a shot.
I assume you have done most of this already...
* Hardware: tested with, without, or with a different motherboard, video card,
memory, hard drives, power supply, and any other hardware.
* Tried a more beginner-proof distribution. As far as I know, Gentoo has all
those flags, and a mistake is easily made.
* Tested with the latest official drivers (Red Hat) from the Promise site, with
that OS installed on a disk. I assume they made working drivers, so
it should work with them.
* Does it work correctly with Windows?
These are the steps I would take to determine the cause of the
problem.
Finally, my 2.6.23 kernel is built. I'm going to try to install it now.
Results tomorrow :)
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-19 21:06 ` Alexander Sabourenkov
2007-10-19 22:58 ` Re[2]: " MisterE
@ 2007-10-19 23:58 ` Tejun Heo
2007-10-20 21:50 ` Alexander Sabourenkov
1 sibling, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2007-10-19 23:58 UTC (permalink / raw)
To: Alexander Sabourenkov; +Cc: linux-ide, MisterE, alan, benh, jgarzik, jeff
[-- Attachment #1: Type: text/plain, Size: 527 bytes --]
Alexander Sabourenkov wrote:
> Hello.
>
>> So, my bet for your second report is your hardware went through
>> something similar as above.
>>
>
> Thanks for the insight. Let's dismiss it then.
>
> Back to the TX4, I tried libata-dev.git cloned at about 20:00 UTC 19.10,
> no perceived difference - parallel read from two drives causes a lot
> of errors.
>
> dmesgs with boot and errors are at http://lxnt.info/linux/libata-dev/
>
> I don't know what to try next. Any ideas?
>
Does the attached patch help?
--
tejun
[-- Attachment #2: limit-PHY-to-1.5Gbps.patch --]
[-- Type: text/plain, Size: 402 bytes --]
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 68699b3..4c93fee 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6435,6 +6435,7 @@ int sata_link_init_spd(struct ata_link *link)
spd = (scontrol >> 4) & 0xf;
if (spd)
link->hw_sata_spd_limit &= (1 << spd) - 1;
+ link->hw_sata_spd_limit = 1;
link->sata_spd_limit = link->hw_sata_spd_limit;
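To illustrate what the one-line debug patch above does: the speed limit is a bitmask in which bit 0 stands for 1.5 Gbps and bit 1 for 3.0 Gbps, and the SControl SPD field (bits 7:4) can only narrow it. The sketch below is a stand-alone approximation of the hunk's context lines; apply_scontrol_spd and highest_spd are hypothetical names, not libata functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Narrow a speed bitmask by the SControl SPD field (bits 7:4).
 * SPD == 0 means "no limit", so the mask is left alone.
 */
static uint32_t apply_scontrol_spd(uint32_t mask, uint32_t scontrol)
{
	uint32_t spd = (scontrol >> 4) & 0xf;

	if (spd)
		mask &= (1u << spd) - 1;
	return mask;
}

/* Highest generation still allowed by the mask (1 = 1.5 Gbps), 0 if none. */
static unsigned int highest_spd(uint32_t mask)
{
	unsigned int gen = 0;

	while (mask) {
		gen++;
		mask >>= 1;
	}
	return gen;
}

/* What the patch effectively does: ignore the above and force Gen1 only. */
static uint32_t force_gen1(void)
{
	uint32_t mask = 1;

	assert(highest_spd(mask) == 1);
	return mask;
}
```

With SControl 300 the SPD field is 0 and a mask of 0x3 stays 0x3 (3.0 Gbps allowed); forcing the mask to 1 leaves only 1.5 Gbps.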
* Re: Sata Sil3512 bug?; Promise SATA300 TX4
2007-10-19 23:58 ` Tejun Heo
@ 2007-10-20 21:50 ` Alexander Sabourenkov
2007-10-27 13:24 ` [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4) Alexander Sabourenkov
0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-20 21:50 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide, MisterE, alan, benh, jgarzik, jeff
Hello.
Tejun Heo wrote:
>
> Does the attached patch help?
>
It does somehow force 1.5 Gbps mode, it changes the pattern of the
'configured for UDMA/xxx' messages that come along with errors, and it
causes the following error:
ata3: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xb t4
ata3: hotplug_status 0x10
ata3: soft resetting link
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: configured for UDMA/133
ata3: EH complete
for both drives on TX4 on startup, but read errors are still there.
dmesgs at http://lxnt.info/linux/libata-dev/patch0/
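As a reading aid for the "SStatus 113 SControl 310" pair in the log above (and "SStatus 123 SControl 300" in earlier logs): both registers are printed in hex and share the same nibble layout, DET in bits 3:0, SPD in bits 7:4 and IPM in bits 11:8. A small sketch, with sata_sreg, sreg_decode and spd_to_mbps as illustrative names only:

```c
#include <assert.h>
#include <stdint.h>

/* The three low nibbles shared by SStatus and SControl. */
struct sata_sreg {
	unsigned int det;	/* bits 3:0  device detection */
	unsigned int spd;	/* bits 7:4  negotiated speed (SStatus) or speed limit (SControl) */
	unsigned int ipm;	/* bits 11:8 interface power management */
};

static struct sata_sreg sreg_decode(uint32_t reg)
{
	struct sata_sreg r;

	r.det = reg & 0xf;
	r.spd = (reg >> 4) & 0xf;
	r.ipm = (reg >> 8) & 0xf;
	return r;
}

/* Map an SPD value to a line rate in Mbps; 0 for "none / no limit". */
static unsigned int spd_to_mbps(unsigned int spd)
{
	assert(spd <= 0xf);
	switch (spd) {
	case 1:
		return 1500;
	case 2:
		return 3000;
	default:
		return 0;
	}
}
```

So SStatus 113 decodes to SPD = 1 (link up at 1.5 Gbps) and SControl 310 to SPD = 1 (limit to 1.5 Gbps), whereas 123/300 means a 3.0 Gbps link with no limit set.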
--
./lxnt
* [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4)
2007-10-20 21:50 ` Alexander Sabourenkov
@ 2007-10-27 13:24 ` Alexander Sabourenkov
2007-10-27 13:44 ` [PATCH-RFC] Alexander Sabourenkov
0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 13:24 UTC (permalink / raw)
To: linux-ide; +Cc: Tejun Heo, MisterE, benh, jgarzik, jeff
Hello.
There appears to be a hardware bug in that it chokes on scatterlist
if the last item is larger than 164 bytes.
The patch that follows fixes my problem on 2.6.22.
I can't think of a way to avoid second pass over scatterlist without
duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
--- a/drivers/ata/sata_promise.c 2007-07-09 03:32:17.000000000 +0400
+++ b/drivers/ata/sata_promise.c 2007-10-27 17:20:03.000000000 +0400
@@ -531,6 +531,80 @@
memcpy(buf+31, cdb, cdb_len);
}
+/**
+ * pdc_qc_prep - Fill PCI IDE PRD table
+ * @qc: Metadata associated with taskfile to be transferred
+ *
+ * Fill PCI IDE PRD (scatter-gather) table with segments
+ * associated with the current disk command.
+ * Make sure hardware does not choke on it.
+ *
+ * LOCKING:
+ * spin_lock_irqsave(host lock)
+ *
+ */
+static void pdc_qc_prep(struct ata_queued_cmd *qc)
+{
+ struct ata_port *ap = qc->ap;
+ struct scatterlist *sg;
+ unsigned int idx;
+ const u32 SG_COUNT_ASIC_BUG = 41*4;
+
+ if (!(qc->flags & ATA_QCFLAG_DMAMAP))
+ return;
+
+ WARN_ON(qc->__sg == NULL);
+ WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
+
+ idx = 0;
+ ata_for_each_sg(sg, qc) {
+ u32 addr, offset;
+ u32 sg_len, len;
+
+ /* determine if physical DMA addr spans 64K boundary.
+ * Note h/w doesn't support 64-bit, so we unconditionally
+ * truncate dma_addr_t to u32.
+ */
+ addr = (u32) sg_dma_address(sg);
+ sg_len = sg_dma_len(sg);
+
+ while (sg_len) {
+ offset = addr & 0xffff;
+ len = sg_len;
+ if ((offset + sg_len) > 0x10000)
+ len = 0x10000 - offset;
+
+ ap->prd[idx].addr = cpu_to_le32(addr);
+ ap->prd[idx].flags_len = cpu_to_le32(len & 0xffff);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+ idx++;
+ sg_len -= len;
+ addr += len;
+ }
+ }
+
+ if (idx) {
+ u32 len = ap->prd[idx - 1].flags_len;
+ if (len > SG_COUNT_ASIC_BUG) {
+ u32 addr, len;
+
+ VPRINTK("Last PRD split\n");
+
+ len = le32_to_cpu(ap->prd[idx - 1].flags_len) - SG_COUNT_ASIC_BUG;
+ addr = le32_to_cpu(ap->prd[idx - 1].addr);
+ ap->prd[idx - 1].flags_len = cpu_to_le32(len);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+ ap->prd[idx].flags_len = cpu_to_le32(SG_COUNT_ASIC_BUG);
+ ap->prd[idx].addr = cpu_to_le32(addr + len);
+ idx++;
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr + len, SG_COUNT_ASIC_BUG);
+ }
+ ap->prd[idx - 1].flags_len |= cpu_to_le32(ATA_PRD_EOT);
+ }
+}
+
static void pdc_qc_prep(struct ata_queued_cmd *qc)
{
struct pdc_port_priv *pp = qc->ap->private_data;
@@ -540,7 +614,7 @@
switch (qc->tf.protocol) {
case ATA_PROT_DMA:
- ata_qc_prep(qc);
+ pdc_qc_prep(qc);
/* fall through */
case ATA_PROT_NODATA:
* Re: [PATCH-RFC]
2007-10-27 13:24 ` [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4) Alexander Sabourenkov
@ 2007-10-27 13:44 ` Alexander Sabourenkov
2007-10-27 14:08 ` Re[2]: [PATCH-RFC] MisterE
2007-10-27 15:16 ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
0 siblings, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 13:44 UTC (permalink / raw)
To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff
Alexander Sabourenkov wrote:
> Hello.
>
> There appears to be a hardware bug in that it chokes on scatterlist
> if the last item is larger than 164 bytes.
>
> The patch that follows fixes my problem on 2.6.22.
>
> I can't think of a way to avoid second pass over scatterlist without
> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
>
>
Sorry, this was the wrong patch :(. Two days looking at vendor code must
have driven me insane. Will send the correct one ASAP.
--
./lxnt
* Re[2]: [PATCH-RFC]
2007-10-27 13:44 ` [PATCH-RFC] Alexander Sabourenkov
@ 2007-10-27 14:08 ` MisterE
2007-10-27 15:09 ` [PATCH-RFC] Alexander Sabourenkov
2007-10-27 15:16 ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
1 sibling, 1 reply; 37+ messages in thread
From: MisterE @ 2007-10-27 14:08 UTC (permalink / raw)
To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, benh, jgarzik, jeff
Hello Alexander,
Saturday, October 27, 2007, 3:44:51 PM, you wrote:
>> There appears to be a hardware bug in that it chokes on scatterlist
>> if the last item is larger than 164 bytes.
Can you confirm that this will only happen when running in 3.0 Gbps mode?
I have the drives jumpered and have no errors. I tested the "copy to
null" method several times with several kernel versions. I'm now in
the phase of copying all my data to the fileserver.
I'm willing to try your patch but I'm not an experienced Linux guru ;)
I once patched the kernel source (2.6.23 to 2.6.23.1) but got stuck on how
to install the updated driver...
--
Best regards,
MisterE mailto:MisterE2002@zonnet.nl
* Re: [PATCH-RFC]
2007-10-27 14:08 ` Re[2]: [PATCH-RFC] MisterE
@ 2007-10-27 15:09 ` Alexander Sabourenkov
0 siblings, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 15:09 UTC (permalink / raw)
To: MisterE; +Cc: linux-ide
MisterE wrote:
>
> Can you confirm that this only will happen when running at 300Gb mode?
I can confirm that without this patch errors happen in both 1.5 Gbps and
3.0 Gbps modes, on both jumpered and unjumpered drives. The errors seem to be
highly hardware- and configuration-dependent.
> I'm willing to try your patch but i'm not a experienced linux guru ;)
I would not advise trying this patch now if you do not experience
problems, and certainly not with any valuable data behind the controller.
--
./lxnt
* [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-27 13:44 ` [PATCH-RFC] Alexander Sabourenkov
2007-10-27 14:08 ` Re[2]: [PATCH-RFC] MisterE
@ 2007-10-27 15:16 ` Alexander Sabourenkov
2007-10-27 18:09 ` Alan Cox
2007-10-28 10:29 ` Jeff Garzik
1 sibling, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 15:16 UTC (permalink / raw)
To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff
Hello.
Once again,
There appears to be a hardware bug in that it chokes on scatterlist
if the last item is larger than 164 bytes. This was discovered by
reading the code of vendor-supplied driver.
The patch that follows fixes my problem on 2.6.22.
I can't think of a way to avoid second pass over scatterlist without
duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
--- a/drivers/ata/sata_promise.c 2007-07-09 03:32:17.000000000 +0400
+++ b/drivers/ata/sata_promise.c 2007-10-27 19:12:46.000000000 +0400
@@ -531,6 +531,87 @@
memcpy(buf+31, cdb, cdb_len);
}
+/**
+ * pdc_fill_sg - Fill PCI IDE PRD table
+ * @qc: Metadata associated with taskfile to be transferred
+ *
+ * Fill PCI IDE PRD (scatter-gather) table with segments
+ * associated with the current disk command.
+ * Make sure hardware does not choke on it.
+ *
+ * LOCKING:
+ * spin_lock_irqsave(host lock)
+ *
+ */
+static void pdc_fill_sg(struct ata_queued_cmd *qc)
+{
+ struct ata_port *ap = qc->ap;
+ struct scatterlist *sg;
+ unsigned int idx;
+ const u32 SG_COUNT_ASIC_BUG = 41*4;
+
+ if (!(qc->flags & ATA_QCFLAG_DMAMAP))
+ return;
+
+ WARN_ON(qc->__sg == NULL);
+ WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
+
+ idx = 0;
+ ata_for_each_sg(sg, qc) {
+ u32 addr, offset;
+ u32 sg_len, len;
+
+ /* determine if physical DMA addr spans 64K boundary.
+ * Note h/w doesn't support 64-bit, so we unconditionally
+ * truncate dma_addr_t to u32.
+ */
+ addr = (u32) sg_dma_address(sg);
+ sg_len = sg_dma_len(sg);
+
+ while (sg_len) {
+ offset = addr & 0xffff;
+ len = sg_len;
+ if ((offset + sg_len) > 0x10000)
+ len = 0x10000 - offset;
+
+ ap->prd[idx].addr = cpu_to_le32(addr);
+ ap->prd[idx].flags_len = cpu_to_le32(len & 0xffff);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+ idx++;
+ sg_len -= len;
+ addr += len;
+ }
+ }
+
+ if (idx) {
+ u32 len = le32_to_cpu(ap->prd[idx - 1].flags_len);
+
+ if (len > SG_COUNT_ASIC_BUG) {
+ u32 addr;
+ /* if len < 2*SG_COUNT_ASIC_BUG then last
+ segment will be larger than next-to-last.
+ Somewhat ugly :(
+ */
+
+ VPRINTK("Splitting last PRD.\n");
+
+ ap->prd[idx - 1].flags_len -= cpu_to_le32(SG_COUNT_ASIC_BUG);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx - 1, addr, SG_COUNT_ASIC_BUG);
+
+ addr = le32_to_cpu(ap->prd[idx - 1].addr) + len - SG_COUNT_ASIC_BUG;
+ len = SG_COUNT_ASIC_BUG;
+ ap->prd[idx].addr = cpu_to_le32(addr);
+ ap->prd[idx].flags_len = cpu_to_le32(len);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+ idx++;
+ }
+
+ ap->prd[idx - 1].flags_len |= cpu_to_le32(ATA_PRD_EOT);
+ }
+}
+
static void pdc_qc_prep(struct ata_queued_cmd *qc)
{
struct pdc_port_priv *pp = qc->ap->private_data;
@@ -540,7 +621,7 @@
switch (qc->tf.protocol) {
case ATA_PROT_DMA:
- ata_qc_prep(qc);
+ pdc_fill_sg(qc);
/* fall through */
case ATA_PROT_NODATA:
@@ -556,11 +637,11 @@
break;
case ATA_PROT_ATAPI:
- ata_qc_prep(qc);
+ pdc_fill_sg(qc);
break;
case ATA_PROT_ATAPI_DMA:
- ata_qc_prep(qc);
+ pdc_fill_sg(qc);
/*FALLTHROUGH*/
case ATA_PROT_ATAPI_NODATA:
pdc_atapi_pkt(qc);
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-27 15:16 ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
@ 2007-10-27 18:09 ` Alan Cox
2007-10-27 18:18 ` Alexander Sabourenkov
2007-10-28 10:29 ` Jeff Garzik
1 sibling, 1 reply; 37+ messages in thread
From: Alan Cox @ 2007-10-27 18:09 UTC (permalink / raw)
Cc: Alexander Sabourenkov, linux-ide, Tejun Heo, MisterE, benh,
jgarzik, jeff
> I can't think of a way to avoid second pass over scatterlist without
> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
This appears to be incomplete:
> + VPRINTK("Splitting last PRD.\n");
> +
> + ap->prd[idx - 1].flags_len -= cpu_to_le32(SG_COUNT_ASIC_BUG);
> + VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx - 1, addr, SG_COUNT_ASIC_BUG);
> +
> + addr = le32_to_cpu(ap->prd[idx - 1].addr) + len - SG_COUNT_ASIC_BUG;
> + len = SG_COUNT_ASIC_BUG;
> + ap->prd[idx].addr = cpu_to_le32(addr);
> + ap->prd[idx].flags_len = cpu_to_le32(len);
> + VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
> +
> + idx++;
What guarantees you have enough PRD entries to do this without changing
the limit in the structures ?
Otherwise looks good
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-27 18:09 ` Alan Cox
@ 2007-10-27 18:18 ` Alexander Sabourenkov
2007-10-27 18:37 ` Alexander Sabourenkov
2007-10-28 8:21 ` Jeff Garzik
0 siblings, 2 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 18:18 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff
Alan Cox wrote:
>> I can't think of a way to avoid second pass over scatterlist without
>> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
>
> This appears to be incomplete:
>
[...]
>
> What guarantees you have enough PRD entries to do this without changing
> the limit in the structures ?
>
> Otherwise looks good
PRD entries count is 256
include/linux/ata.h:
ATA_MAX_PRD = 256,
ATA_PRD_TBL_SZ = (ATA_MAX_PRD * ATA_PRD_SZ),
drivers/ata/libata-core.c:
ap->prd = dmam_alloc_coherent(dev, ATA_PRD_TBL_SZ, &ap->prd_dma,
sata_promise Scsi_Host declares support for half of that:
include/linux/libata.h:
LIBATA_MAX_PRD = ATA_MAX_PRD / 2,
drivers/ata/sata_promise.c
.sg_tablesize = LIBATA_MAX_PRD,
PS: Vendor code has this limit at 32.
--
./lxnt
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-27 18:18 ` Alexander Sabourenkov
@ 2007-10-27 18:37 ` Alexander Sabourenkov
2007-10-28 8:21 ` Jeff Garzik
1 sibling, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-27 18:37 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: Alan Cox, linux-ide, Tejun Heo, MisterE, benh, jgarzik, jeff
Alexander Sabourenkov wrote:
> Alan Cox wrote:
>>> I can't think of a way to avoid second pass over scatterlist without
>>> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
>> This appears to be incomplete:
>>
>
> [...]
>
>> What guarantees you have enough PRD entries to do this without changing
>> the limit in the structures ?
>>
>> Otherwise looks good
>
> PRD entries count is 256
> include/linux/ata.h:
> ATA_MAX_PRD = 256,
> ATA_PRD_TBL_SZ = (ATA_MAX_PRD * ATA_PRD_SZ),
>
> drivers/ata/libata-core.c:
> ap->prd = dmam_alloc_coherent(dev, ATA_PRD_TBL_SZ, &ap->prd_dma,
>
> sata_promise Scsi_Host declares support for half of that:
>
> include/linux/libata.h:
> LIBATA_MAX_PRD = ATA_MAX_PRD / 2,
>
> drivers/ata/sata_promise.c
> .sg_tablesize = LIBATA_MAX_PRD,
>
>
> PS: Vendor code has this limit at 32.
>
That's an interesting question in itself. I don't know what limits the PRD
count, but if it's hardware, then the driver should somehow make sure
that it gets no more than the hw can handle minus one for this erratum.
Right now the driver declares that any hardware it supports can handle 128
PRD entries. If this is not true for some existing specimen,
we're inviting trouble.
--
./lxnt
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-27 18:18 ` Alexander Sabourenkov
2007-10-27 18:37 ` Alexander Sabourenkov
@ 2007-10-28 8:21 ` Jeff Garzik
2007-10-28 20:03 ` Alexander Sabourenkov
1 sibling, 1 reply; 37+ messages in thread
From: Jeff Garzik @ 2007-10-28 8:21 UTC (permalink / raw)
To: Alexander Sabourenkov
Cc: Alan Cox, linux-ide, Tejun Heo, MisterE, benh, jgarzik
Alexander Sabourenkov wrote:
> Alan Cox wrote:
>>> I can't think of a way to avoid second pass over scatterlist without
>>> duplicating code (ata_qc_prep() and ata_fill_sg() from libata-core.c).
>> This appears to be incomplete:
>>
>
> [...]
>
>> What guarantees you have enough PRD entries to do this without changing
>> the limit in the structures ?
>>
>> Otherwise looks good
>
> PRD entries count is 256
> include/linux/ata.h:
> ATA_MAX_PRD = 256,
> ATA_PRD_TBL_SZ = (ATA_MAX_PRD * ATA_PRD_SZ),
>
> drivers/ata/libata-core.c:
> ap->prd = dmam_alloc_coherent(dev, ATA_PRD_TBL_SZ, &ap->prd_dma,
>
> sata_promise Scsi_Host declares support for half of that:
>
> include/linux/libata.h:
> LIBATA_MAX_PRD = ATA_MAX_PRD / 2,
>
> drivers/ata/sata_promise.c
> .sg_tablesize = LIBATA_MAX_PRD,
Alan's point was that the existing code will give you up to
LIBATA_MAX_PRD entries. After the post-virtual-merge splitting code in
ata_fill_sg() executes, the worst case result is ATA_MAX_PRD entries.
Thus, since your code has the potential to increase the number of s/g
entries above that, it can potentially corrupt memory, lock up the
machine, all the wonderful things that can happen when you run off the
end of the s/g list.
The fix is to decrease .sg_tablesize (LIBATA_MAX_PRD - 2 perhaps?) so
that you guarantee this worst case never occurs, by guaranteeing that
the system never sends you enough s/g entries to cause your code to go
out of bounds.
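The arithmetic behind this can be sketched as follows, assuming (as the explanation above implies) that each s/g entry crosses at most one 64 KiB boundary and so yields at most two PRD entries, and that the ASIC workaround adds at most one more. The constants are the ones quoted earlier in the thread; worst_case_prd and prd_table_fits are illustrative names only.

```c
#include <assert.h>

enum {
	ATA_MAX_PRD = 256,			/* entries allocated per port */
	LIBATA_MAX_PRD = ATA_MAX_PRD / 2	/* default .sg_tablesize */
};

/*
 * Worst-case PRD entries for a given .sg_tablesize: every s/g entry
 * split once at a 64 KiB boundary, plus `extra` entries added by the
 * driver afterwards (1 for the last-entry split).
 */
static unsigned int worst_case_prd(unsigned int sg_tablesize,
				   unsigned int extra)
{
	return 2u * sg_tablesize + extra;
}

/* Nonzero if the worst case still fits in the allocated table. */
static int prd_table_fits(unsigned int sg_tablesize, unsigned int extra)
{
	assert(sg_tablesize > 0);
	return worst_case_prd(sg_tablesize, extra) <= ATA_MAX_PRD;
}
```

Under these assumptions LIBATA_MAX_PRD gives 2 * 128 + 1 = 257 entries, one too many, while LIBATA_MAX_PRD - 2 gives 253 and leaves some headroom.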
Jeff
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-27 15:16 ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
2007-10-27 18:09 ` Alan Cox
@ 2007-10-28 10:29 ` Jeff Garzik
2007-10-28 11:52 ` Alexander Sabourenkov
1 sibling, 1 reply; 37+ messages in thread
From: Jeff Garzik @ 2007-10-28 10:29 UTC (permalink / raw)
To: Alexander Sabourenkov; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik
BTW, looking at the Promise code I see
> cam_con.h:
> /* for ASIC bug, limit the last element of SG byteCount must < 32 Dword */
> #define SG_COUNT_ASIC_BUG 32
> //#define SG_COUNT_ASIC_BUG 128
and in the code itself
> /* check PRD table, last element <= (32 Dword), fix ASIC bug */
(though the code obviously uses SG_COUNT_ASIC_BUG==32, as the first
paste indicates)
so it seems like Promise first used 128 (32 dwords), but then backed
down to 32 (8 dwords).
Either way, we definitely have an ASIC bug to work around, it seems...
Jeff
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-28 11:52 ` Alexander Sabourenkov
@ 2007-10-28 11:10 ` Jeff Garzik
0 siblings, 0 replies; 37+ messages in thread
From: Jeff Garzik @ 2007-10-28 11:10 UTC (permalink / raw)
To: Alexander Sabourenkov, Mikael Pettersson
Cc: linux-ide, Tejun Heo, MisterE, benh
Alexander Sabourenkov wrote:
> Jeff Garzik wrote:
>> BTW, looking at the Promise code I see
>>
>>> cam_con.h:
>>> /* for ASIC bug, limit the last element of SG byteCount must < 32
>>> Dword */
>>> #define SG_COUNT_ASIC_BUG 32
>>> //#define SG_COUNT_ASIC_BUG 128
>> and in the code itself
>>
>>> /* check PRD table, last element <= (32 Dword), fix ASIC bug */
>> (though the code obviously uses SG_COUNT_ASIC_BUG==32, as the first
>> paste indicates)
>>
>> so it seems like Promise first used 128 (32 dwords), but then backed
>> down to 32 (8 dwords).
>>
>
> Which version is this define from?
>
> Both versions that are available now from their website define it at 41*4:
Mikael Pettersson wrote:
> You're looking at the old pdc-ultra2 driver. The newer unified
> sataii150-300 driver (v1.01.0.23) upped the value to 41*4.
I was looking at pdc-ulsata2_1.00.0.15.tgz, which was the latest driver
that Promise's website pointed me to when I listed "SATA300 TX4" as my product.
Sounds like that is outdated information, thanks for the correction!
Jeff
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-28 10:29 ` Jeff Garzik
@ 2007-10-28 11:52 ` Alexander Sabourenkov
2007-10-28 11:10 ` Jeff Garzik
0 siblings, 1 reply; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-28 11:52 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-ide, Tejun Heo, MisterE, benh, jgarzik
Jeff Garzik wrote:
> BTW, looking at the Promise code I see
>
>> cam_con.h:
>> /* for ASIC bug, limit the last element of SG byteCount must < 32
>> Dword */
>> #define SG_COUNT_ASIC_BUG 32
>> //#define SG_COUNT_ASIC_BUG 128
>
> and in the code itself
>
>> /* check PRD table, last element <= (32 Dword), fix ASIC bug */
>
> (though the code obviously uses SG_COUNT_ASIC_BUG==32, as the first
> paste indicates)
>
> so it seems like Promise first used 128 (32 dwords), but then backed
> down to 32 (8 dwords).
>
Which version is this define from?
Both versions that are available now from their website define it at 41*4:
/* for ASIC bug, limit the last element of SG byteCount must <= 41 Dword */
#define SG_COUNT_ASIC_BUG 41*4
//#define SG_COUNT_ASIC_BUG 32
//#define SG_COUNT_ASIC_BUG 128
--
./lxnt
* Re: [PATCH-RFC] Promise TX4 implement hw-bug workaround
2007-10-28 8:21 ` Jeff Garzik
@ 2007-10-28 20:03 ` Alexander Sabourenkov
0 siblings, 0 replies; 37+ messages in thread
From: Alexander Sabourenkov @ 2007-10-28 20:03 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Alan Cox, linux-ide, Tejun Heo, MisterE, benh, jgarzik
Jeff Garzik wrote:
>
> Alan's point was that the existing code will give you up to
> LIBATA_MAX_PRD entries. After the post-virtual-merge splitting code in
> ata_fill_sg() executes, the worst case result is ATA_MAX_PRD entries.
>
> Thus, since your code has the potential to increase the number of s/g
> entries above that, it can potentially corrupt memory, lock up the
> machine, all the wonderful things that can happen when you run off the
> end of the s/g list.
>
> The fix is to decrease .sg_tablesize (LIBATA_MAX_PRD - 2 perhaps?) so
> that you guarantee this worst case never occurs, by guaranteeing that
> the system never sends you enough s/g entries to cause your code to go
> out of bounds.
>
Ah, now I understand. Thanks for the explanation.
I take it something guarantees that s/g entry size can not exceed 128K.
--
./lxnt
end of thread, other threads:[~2007-10-28 19:13 UTC | newest]
Thread overview: 37+ messages
2007-10-03 7:26 Re[2]: Sata Sil3512 bug? Mikael Pettersson
2007-10-03 8:31 ` Alexander Sabourenkov
2007-10-03 14:45 ` Re[2]: " MisterE
2007-10-03 14:50 ` Alan Cox
2007-10-14 12:07 ` Re[2]: " MisterE
2007-10-15 8:44 ` Alexander Sabourenkov
2007-10-17 12:39 ` Re[2]: Sata Sil3512 bug?; Promise SATA300 TX4 MisterE
2007-10-17 12:54 ` Alexander Sabourenkov
2007-10-17 15:04 ` Re[2]: " MisterE
2007-10-17 19:21 ` Peter Favrholdt
2007-10-19 12:02 ` Re[2]: " MisterE
2007-10-18 21:07 ` Alexander Sabourenkov
2007-10-19 1:26 ` Tejun Heo
2007-10-19 21:06 ` Alexander Sabourenkov
2007-10-19 22:58 ` Re[2]: " MisterE
2007-10-19 23:58 ` Tejun Heo
2007-10-20 21:50 ` Alexander Sabourenkov
2007-10-27 13:24 ` [PATCH-RFC] (was: Re: Sata Sil3512 bug?; Promise SATA300 TX4) Alexander Sabourenkov
2007-10-27 13:44 ` [PATCH-RFC] Alexander Sabourenkov
2007-10-27 14:08 ` Re[2]: [PATCH-RFC] MisterE
2007-10-27 15:09 ` [PATCH-RFC] Alexander Sabourenkov
2007-10-27 15:16 ` [PATCH-RFC] Promise TX4 implement hw-bug workaround Alexander Sabourenkov
2007-10-27 18:09 ` Alan Cox
2007-10-27 18:18 ` Alexander Sabourenkov
2007-10-27 18:37 ` Alexander Sabourenkov
2007-10-28 8:21 ` Jeff Garzik
2007-10-28 20:03 ` Alexander Sabourenkov
2007-10-28 10:29 ` Jeff Garzik
2007-10-28 11:52 ` Alexander Sabourenkov
2007-10-28 11:10 ` Jeff Garzik
-- strict thread matches above, loose matches on Subject: below --
2007-10-04 0:46 Sata Sil3512 bug? Richard Scobie
2007-09-27 13:51 MisterE
2007-09-28 12:25 ` Tejun Heo
2007-09-28 15:25 ` Re[2]: " MisterE
2007-09-28 15:51 ` Alan Cox
2007-09-28 16:55 ` Tejun Heo
2007-10-02 19:20 ` Re[2]: " MisterE
2007-10-04 1:27 ` Tejun Heo
2007-10-13 16:36 ` Re[2]: " MisterE
2007-10-18 3:29 ` Tejun Heo