linux-ide.vger.kernel.org archive mirror
* sata_sil24 resetting controller...
@ 2006-04-27 18:58 jdi
  2006-04-27 19:21 ` Jan Dittmer
  2006-04-27 23:52 ` Tejun Heo
  0 siblings, 2 replies; 9+ messages in thread
From: jdi @ 2006-04-27 18:58 UTC
  To: jgarzik; +Cc: jdi, linux-ide, linux-kernel

Hello,

I just received a DawiControl DC-4300 SATA-2 RAID card
(sil3124 based) together with a Western Digital WD3200KS
hard drive, which should support SATA-2 and NCQ in its
full glory. Basic read/write tests with dd went OK,
so I started to reshape my RAID5 array. Since the reshape
started, my kernel log has been swamped with the following messages:

[4297871.909000] sata_sil24 ata1: resetting controller...
[4297871.909000] ata1: status=0x50 { DriveReady SeekComplete }
[4297871.909000] sdc: Current: sense key=0x0
[4297871.909000]     ASC=0x0 ASCQ=0x0
[4297873.266000] ata1: error interrupt on port0
[4297873.266000]   stat=0x80000001 irq=0xb60002 cmd_err=35 sstatus=0x123 serror=0x0
[4297873.266000] sata_sil24 ata1: resetting controller...
[4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
[4297873.267000] sdc: Current: sense key=0x0
[4297873.267000]     ASC=0x0 ASCQ=0x0

The time between these events varies from 0.5s up to 10s. Resync speed is
pretty bad (6 MB/s), but the resync appears(!) to be working.
This is with vanilla 2.6.17-rc3, with the SATA drivers built into the kernel.
Find /proc/interrupts and lspci output below. The boot dmesg output was washed
away by the above messages, sorry.

What's the cause of the error? Can I ignore it, or will it eventually
destroy my RAID? I'm now about 5% through the resync process, with
an estimated finish in 1260 minutes.

Thanks,

Jan


$ lspci -vv -s 03:04.0
0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 01)
	Subsystem: Silicon Image, Inc.: Unknown device 7124
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32
	Interrupt: pin A routed to IRQ 22
	Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
	Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
	Region 4: I/O ports at 9400 [size=16]
	Expansion ROM at fe900000 [disabled] [size=512K]
	Capabilities: [64] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [40] PCI-X non-bridge device.
		Command: DPERE- ERO+ RBC=0 OST=5
		Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
	Capabilities: [54] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
		Address: 0000000000000000  Data: 0000

$ lspci

0000:00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01)
0000:00:00.1 ff00: Intel Corporation E7500/E7501 Host RASUM Controller (rev 01)
0000:00:02.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface B PCI-to-PCI Bridge (rev 01)
0000:00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #1) (rev 02)
0000:00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #2) (rev 02)
0000:00:1d.2 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #3) (rev 02)
0000:00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42)
0000:00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02)
0000:00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02)
0000:01:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
0000:01:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
0000:01:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
0000:01:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
0000:02:03.0 Mass storage controller: Promise Technology, Inc. PDC20268 (Ultra100 TX2) (rev 02)
0000:02:05.0 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
0000:02:05.1 SCSI storage controller: Adaptec AIC-7902B U320 (rev 10)
0000:03:03.0 Ethernet controller: Intel Corporation 82544GC Gigabit Ethernet Controller (LOM) (rev 02)
0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 01)
0000:04:01.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
0000:04:02.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
0000:04:03.0 Multimedia video controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder (rev 05)
0000:04:03.2 Multimedia controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder [MPEG Port] (rev 05)
0000:04:03.4 Multimedia controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder [IR Port] (rev 05)

$ cat /proc/interrupts

           CPU0       CPU1       CPU2       CPU3       
  0:     585143     480193    1343927    1340601    IO-APIC-edge  timer
  1:          9          0          1          0    IO-APIC-edge  i8042
  7:          0          0          0          0    IO-APIC-edge  parport0
  8:          4          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 14:     206162     136159     141145     134456    IO-APIC-edge  ide0
 16:    3024021          0          0          0   IO-APIC-level  eth0
 17:        139          0      48709          0   IO-APIC-level  eth1
 18:          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
 19:      83632     396995     247509     324823   IO-APIC-level  ide2, ide3
 20:      12682      32985      15646      34963   IO-APIC-level  aic79xx
 21:         15          0          0          0   IO-APIC-level  aic79xx
 22:     103851     230769     219349     209174   IO-APIC-level  libata
 23:          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
 24:          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
 25:      46521     202715     249695     116146   IO-APIC-level  cx88[0]
NMI:          0          0          0          0 
LOC:    3749323    3749326    3749308    3749317 
ERR:          0
MIS:          0


* Re: sata_sil24 resetting controller...
  2006-04-27 18:58 sata_sil24 resetting controller jdi
@ 2006-04-27 19:21 ` Jan Dittmer
  2006-04-27 23:52 ` Tejun Heo
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Dittmer @ 2006-04-27 19:21 UTC
  To: jgarzik; +Cc: jdi, linux-ide, linux-kernel

jdi@l4x.org wrote:
> so I started to reshape my raid5 array. Since the reshape
> started my kernel log gets swamped with the following messages:
> 
> [4297871.909000] sata_sil24 ata1: resetting controller...
> [4297871.909000] ata1: status=0x50 { DriveReady SeekComplete }
> [4297871.909000] sdc: Current: sense key=0x0
> [4297871.909000]     ASC=0x0 ASCQ=0x0
> [4297873.266000] ata1: error interrupt on port0
> [4297873.266000]   stat=0x80000001 irq=0xb60002 cmd_err=35 sstatus=0x123
> serror=0x0
> 
> The time between these events varies from .5s to up to 10s, resync speed is

Just one more observation:
The interval between the messages seems to be correlated with the resync
speed: the faster the resync, the more frequent the messages.
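
The correlation described above could be checked by measuring the gaps between reset messages. The following is a minimal sketch in Python and not part of the original report: the function name `reset_intervals` and the regular expression are assumptions made for illustration, and the sample text is copied from the log excerpt earlier in this thread.

```python
import re

# Matches a printk timestamp followed by the sata_sil24 reset message.
# The pattern is not anchored, so it also finds messages that were fused
# onto the end of a previous line.
RESET_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+sata_sil24 (ata\d+): resetting controller"
)


def reset_intervals(log_text):
    """Return the gaps in seconds between consecutive resets, per port."""
    last_seen = {}
    gaps = {}
    for match in RESET_RE.finditer(log_text):
        stamp, port = float(match.group(1)), match.group(2)
        if port in last_seen:
            gap = round(stamp - last_seen[port], 6)
            gaps.setdefault(port, []).append(gap)
        last_seen[port] = stamp
    return gaps


sample = """\
[4297871.909000] sata_sil24 ata1: resetting controller...
[4297871.909000] ata1: status=0x50 { DriveReady SeekComplete }
[4297873.266000] ata1: error interrupt on port0
serror=0x0[4297873.266000] sata_sil24 ata1: resetting controller...
"""

print(reset_intervals(sample))
```

Comparing the average gap with the resync rate reported in /proc/mdstat at the same moment would show whether shorter gaps really go with faster resync.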

Jan


* Re: sata_sil24 resetting controller...
  2006-04-27 18:58 sata_sil24 resetting controller jdi
  2006-04-27 19:21 ` Jan Dittmer
@ 2006-04-27 23:52 ` Tejun Heo
  2006-04-29  0:13   ` Jan Dittmer
  1 sibling, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2006-04-27 23:52 UTC
  To: jdi; +Cc: jgarzik, linux-ide, linux-kernel

jdi@l4x.org wrote:
> Hello,
> 
> I just received a DawiControl DC-4300 SATA-2 RAID card
> (sil3124 based) together with a Western Digital WD3200KS 
> hard drive which should support sata2 and ncq in its
> full glory. Basic read/write tests with dd went ok,
> so I started to reshape my raid5 array. Since the reshape
> started my kernel log gets swamped with the following messages:
> 
> [4297871.909000] sata_sil24 ata1: resetting controller...
> [4297871.909000] ata1: status=0x50 { DriveReady SeekComplete }
> [4297871.909000] sdc: Current: sense key=0x0
> [4297871.909000]     ASC=0x0 ASCQ=0x0
> [4297873.266000] ata1: error interrupt on port0
> [4297873.266000]   stat=0x80000001 irq=0xb60002 cmd_err=35 sstatus=0x123

cmd_err 35 is...

PORT_CERR_XFR_PCIPERR	= 35, /* PSD ecode 11 - PCI parity err during transfer */
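
The fields of the error-interrupt line can be decoded mechanically. Below is a minimal Python sketch under stated assumptions: the helper `decode_error_line` and its lookup tables are hypothetical names, the `cmd_err` table contains only the one value quoted in this thread, and the SStatus bit layout (DET, SPD, IPM) is taken from the SATA specification.

```python
import re

# Only the code quoted in this thread is listed; the remaining values are
# deliberately left out rather than guessed.
CMD_ERR_NAMES = {35: "PORT_CERR_XFR_PCIPERR"}

# SStatus layout per the SATA specification: DET in bits 3:0,
# SPD in bits 7:4, IPM in bits 11:8.
DET = {0: "no device", 1: "device present, no PHY link",
       3: "device present, PHY link up", 4: "PHY offline"}
SPD = {0: "no speed negotiated", 1: "1.5 Gbps", 2: "3.0 Gbps"}
IPM = {0: "not present", 1: "active", 2: "partial", 6: "slumber"}


def decode_error_line(line):
    """Decode cmd_err and sstatus from a sata_sil24 error-interrupt line."""
    fields = dict(re.findall(r"(\w+)=(0x[0-9a-fA-F]+|\d+)", line))
    cmd_err = int(fields["cmd_err"], 0)
    sstatus = int(fields["sstatus"], 0)
    return {
        "cmd_err": CMD_ERR_NAMES.get(cmd_err, "unknown (%d)" % cmd_err),
        "det": DET.get(sstatus & 0xF, "reserved"),
        "spd": SPD.get((sstatus >> 4) & 0xF, "reserved"),
        "ipm": IPM.get((sstatus >> 8) & 0xF, "reserved"),
    }


line = "  stat=0x80000001 irq=0xb60002 cmd_err=35 sstatus=0x123"
print(decode_error_line(line))
```

For the line in the report, the SATA link itself decodes as up and active at 3.0 Gbps, which fits the error being on the PCI side of the controller rather than on the cable to the drive.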

> serror=0x0[4297873.266000] sata_sil24 ata1: resetting controller...
> [4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
> [4297873.267000] sdc: Current: sense key=0x0
> [4297873.267000]     ASC=0x0 ASCQ=0x0
> 
> The time between these events varies from .5s to up to 10s, resync speed is
> pretty bad (6mb/s) but appears(!) to be working.
> This is with vanilla 2.6.17-rc3, sata drivers built into the kernel.
> Find below /proc/interrupts and lspci output. Boot dmesg output was washed
> away by above messages, sorry.
> 
> What's the cause of the error, can I ignore it or will it destroy
> my raid eventually? I'm now about 5% through the resync process, with 
> an estimated finish in 1260 minutes.
> 
> Thanks,
> 
> Jan
> 
> 
> $ lspci -vv -s 03:04.0
> 0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 01)
> 	Subsystem: Silicon Image, Inc.: Unknown device 7124
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
> 	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 32
> 	Interrupt: pin A routed to IRQ 22
> 	Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
> 	Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
> 	Region 4: I/O ports at 9400 [size=16]
> 	Expansion ROM at fe900000 [disabled] [size=512K]
> 	Capabilities: [64] Power Management version 2
> 		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
> 	Capabilities: [40] PCI-X non-bridge device.
> 		Command: DPERE- ERO+ RBC=0 OST=5
> 		Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-

So, slow down the PCI-X bus.  This can usually be done from the BIOS setup 
menu.  Does your machine have a riser board which extends or changes the 
orientation of the PCI-X bus?  Motherboard vendors describe the bus 
frequency limit when using riser boards in the manual, but sometimes 
server vendors forget to set it.  Heck, some of them don't even know 
what that is.
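
To see what the card advertises before and after changing the bus setup, the PCI-X capability flags can be pulled out of `lspci -vv` output. This is a minimal Python sketch: the function name `pcix_status_flags` is an assumption for illustration, and the sample is abbreviated from the output quoted above. As far as I understand lspci, flags such as `133MHz+` describe what the device is capable of, not necessarily the clock the bus is currently running at.

```python
import re


def pcix_status_flags(lspci_text):
    """Return the +/- flags from the Status line of the PCI-X capability."""
    in_pcix = False
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Capabilities:"):
            # Track whether we are inside the PCI-X capability block.
            in_pcix = "PCI-X" in stripped
        elif in_pcix and stripped.startswith("Status:"):
            pairs = re.findall(r"(\w+)([+-])(?=[\s,]|$)", stripped)
            return {name: sign == "+" for name, sign in pairs}
    return {}


sample = """\
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium
Capabilities: [64] Power Management version 2
    Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [40] PCI-X non-bridge device.
    Command: DPERE- ERO+ RBC=0 OST=5
    Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
"""

print(pcix_status_flags(sample))
```

In practice the input would come from `lspci -vv -s 03:04.0`, captured once per slot or BIOS setting so the results can be compared.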

-- 
tejun


* Re: sata_sil24 resetting controller...
  2006-04-27 23:52 ` Tejun Heo
@ 2006-04-29  0:13   ` Jan Dittmer
  2006-04-29  0:20     ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Dittmer @ 2006-04-29  0:13 UTC
  To: Tejun Heo; +Cc: jgarzik, linux-ide, linux-kernel

Tejun Heo wrote:
>> serror=0x0[4297873.266000] sata_sil24 ata1: resetting controller...
>> [4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
>> [4297873.267000] sdc: Current: sense key=0x0
>> [4297873.267000]     ASC=0x0 ASCQ=0x0
>>
>> The time between these events varies from .5s to up to 10s, resync speed is
>> pretty bad (6mb/s) but appears(!) to be working.
>> This is with vanilla 2.6.17-rc3, sata drivers built into the kernel.
>> Find below /proc/interrupts and lspci output. Boot dmesg output was washed
>> away by above messages, sorry.
>>
>> What's the cause of the error, can I ignore it or will it destroy
>> my raid eventually? I'm now about 5% through the resync process, with 
>> an estimated finish in 1260 minutes.
>>
>> Thanks,
>>
>> Jan
>>
>>
>> $ lspci -vv -s 03:04.0
>> 0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 01)
>> 	Subsystem: Silicon Image, Inc.: Unknown device 7124
>> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
>> 	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>> 	Latency: 32
>> 	Interrupt: pin A routed to IRQ 22
>> 	Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
>> 	Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
>> 	Region 4: I/O ports at 9400 [size=16]
>> 	Expansion ROM at fe900000 [disabled] [size=512K]
>> 	Capabilities: [64] Power Management version 2
>> 		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>> 		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>> 	Capabilities: [40] PCI-X non-bridge device.
>> 		Command: DPERE- ERO+ RBC=0 OST=5
>> 		Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
> 
> So, slow down the PCI-X bus.  It can usually be done from BIOS setup 
> menu.  Does your machine has a riser board which extends or changes 
> orientation of PCI-X bus?  Motherboard vendors describe the bus 
> frequency limit when using riser boards in the manual but sometimes 
> server vendors forget to set them.  Heck, some of them don't even know 
> what that is.


Hmm, I don't have a riser card, and I have neither a setting nor a jumper
for the frequency.
I plugged the card into another slot, next to a 66MHz-only card. So now I
have it running at 66MHz (checked with lspci), but my drive isn't initialized
properly anymore:

[4294690.486000] libata version 1.20 loaded.
[4294690.486000] sata_sil24 0000:03:04.0: version 0.23
[4294690.486000] ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 24 (level, low) -> IRQ 22
[4294690.487000] ata1: SATA max UDMA/100 cmd 0xF8810000 ctl 0x0 bmdma 0x0 irq 22
[4294690.487000] ata2: SATA max UDMA/100 cmd 0xF8812000 ctl 0x0 bmdma 0x0 irq 22
[4294690.487000] ata3: SATA max UDMA/100 cmd 0xF8814000 ctl 0x0 bmdma 0x0 irq 22
[4294690.487000] ata4: SATA max UDMA/100 cmd 0xF8816000 ctl 0x0 bmdma 0x0 irq 22
[4294690.800000] ata1: SATA link up 3.0 Gbps (SStatus 123)
[4294690.801000] ata1: dev 0 cfg 49:5145 82:0000 83:0000 84:0000 85:0000 86:0000 87:0000 88:0000
[4294690.801000] ata1: dev 0 ATA-0, max MWDMA2, 16514064 sectors: CHS 16383/16/63
[4294690.802000] ata1: dev 0 model number mismatch 'WDC WD3200re/sasats_li42S' != ''
[4294690.802000] ata1: dev 0 revalidation failed (errno=-19)
[4294690.802000] ata1: failed to revalidate after set xfermode
[4294690.802000] scsi2 : sata_sil24
[4294691.003000] ata2: SATA link down (SStatus 0)
[4294691.003000] scsi3 : sata_sil24
[4294691.204000] ata3: SATA link down (SStatus 0)
[4294691.204000] scsi4 : sata_sil24
[4294691.405000] ata4: SATA link down (SStatus 0)
[4294691.405000] scsi5 : sata_sil24

Can this still be a PCI bus problem? I get the same error on every reboot.

Jan

PS: the RAID survived :-), though it's obviously missing one disk now.


* Re: sata_sil24 resetting controller...
  2006-04-29  0:13   ` Jan Dittmer
@ 2006-04-29  0:20     ` Tejun Heo
  2006-04-29  8:53       ` Mogens Valentin
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2006-04-29  0:20 UTC
  To: Jan Dittmer; +Cc: jgarzik, linux-ide, linux-kernel

Jan Dittmer wrote:
> Tejun Heo wrote:
>>> serror=0x0[4297873.266000] sata_sil24 ata1: resetting controller...
>>> [4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
>>> [4297873.267000] sdc: Current: sense key=0x0
>>> [4297873.267000]     ASC=0x0 ASCQ=0x0
>>>
>>> The time between these events varies from .5s to up to 10s, resync speed is
>>> pretty bad (6mb/s) but appears(!) to be working.
>>> This is with vanilla 2.6.17-rc3, sata drivers built into the kernel.
>>> Find below /proc/interrupts and lspci output. Boot dmesg output was washed
>>> away by above messages, sorry.
>>>
>>> What's the cause of the error, can I ignore it or will it destroy
>>> my raid eventually? I'm now about 5% through the resync process, with 
>>> an estimated finish in 1260 minutes.
>>>
>>> Thanks,
>>>
>>> Jan
>>>
>>>
>>> $ lspci -vv -s 03:04.0
>>> 0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 01)
>>> 	Subsystem: Silicon Image, Inc.: Unknown device 7124
>>> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
>>> 	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>>> 	Latency: 32
>>> 	Interrupt: pin A routed to IRQ 22
>>> 	Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
>>> 	Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
>>> 	Region 4: I/O ports at 9400 [size=16]
>>> 	Expansion ROM at fe900000 [disabled] [size=512K]
>>> 	Capabilities: [64] Power Management version 2
>>> 		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>> 		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>>> 	Capabilities: [40] PCI-X non-bridge device.
>>> 		Command: DPERE- ERO+ RBC=0 OST=5
>>> 		Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
>> So, slow down the PCI-X bus.  It can usually be done from BIOS setup 
>> menu.  Does your machine has a riser board which extends or changes 
>> orientation of PCI-X bus?  Motherboard vendors describe the bus 
>> frequency limit when using riser boards in the manual but sometimes 
>> server vendors forget to set them.  Heck, some of them don't even know 
>> what that is.
> 
> 
> Hmm I don't have a riser card and I don't have a setting for the frequency,
> nor a jumper.
> I plugged the card in another slot, next to a 66MHz only card. So now I've
> it working with 66MHz (checked with lspci), but my drive isn't initialized
> properly anymore:
> 
> [4294690.486000] libata version 1.20 loaded.
> [4294690.486000] sata_sil24 0000:03:04.0: version 0.23
> [4294690.486000] ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 24 (level, low) -> IRQ 22
> [4294690.487000] ata1: SATA max UDMA/100 cmd 0xF8810000 ctl 0x0 bmdma 0x0 irq 22
> [4294690.487000] ata2: SATA max UDMA/100 cmd 0xF8812000 ctl 0x0 bmdma 0x0 irq 22
> [4294690.487000] ata3: SATA max UDMA/100 cmd 0xF8814000 ctl 0x0 bmdma 0x0 irq 22
> [4294690.487000] ata4: SATA max UDMA/100 cmd 0xF8816000 ctl 0x0 bmdma 0x0 irq 22
> [4294690.800000] ata1: SATA link up 3.0 Gbps (SStatus 123)
> [4294690.801000] ata1: dev 0 cfg 49:5145 82:0000 83:0000 84:0000 85:0000 86:0000 87:0000 88:0000
> [4294690.801000] ata1: dev 0 ATA-0, max MWDMA2, 16514064 sectors: CHS 16383/16/63
> [4294690.802000] ata1: dev 0 model number mismatch 'WDC WD3200re/sasats_li42S' != ''
> [4294690.802000] ata1: dev 0 revalidation failed (errno=-19)
> [4294690.802000] ata1: failed to revalidate after set xfermode
> [4294690.802000] scsi2 : sata_sil24
> [4294691.003000] ata2: SATA link down (SStatus 0)
> [4294691.003000] scsi3 : sata_sil24
> [4294691.204000] ata3: SATA link down (SStatus 0)
> [4294691.204000] scsi4 : sata_sil24
> [4294691.405000] ata4: SATA link down (SStatus 0)
> [4294691.405000] scsi5 : sata_sil24
> 
> Can this still be a pci bus problem? I get the same error on every reboot.

Hmmm.. max MWDMA2?  Something is very off with your configuration.  Can 
you try the card in another box or on a regular PCI slot?
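
One way to see how far off the identify data is would be to list which of the printed IDENTIFY words are zero. The sketch below is a hedged illustration in Python: `zero_identify_words` is a hypothetical helper, and it only parses the `cfg` line format shown in the quoted dmesg. In the ATA specification, words 82 to 87 describe supported and enabled command sets and word 88 describes Ultra DMA modes, so a recent SATA drive reporting all of them as zero is suspicious.

```python
import re


def zero_identify_words(dmesg_line):
    """Return the IDENTIFY word numbers that libata printed as 0000."""
    _, found, tail = dmesg_line.partition(" cfg ")
    if not found:
        return []
    pairs = re.findall(r"(\d+):([0-9a-fA-F]{4})", tail)
    return sorted(int(idx) for idx, val in pairs if int(val, 16) == 0)


line = ("[4294690.801000] ata1: dev 0 cfg 49:5145 82:0000 83:0000 84:0000 "
        "85:0000 86:0000 87:0000 88:0000")
print(zero_identify_words(line))
```

With word 88 empty, the drive advertises no Ultra DMA modes at all, which would be consistent with the "max MWDMA2" message in the same log.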

-- 
tejun


* Re: sata_sil24 resetting controller...
  2006-04-29  0:20     ` Tejun Heo
@ 2006-04-29  8:53       ` Mogens Valentin
  2006-04-29 10:06         ` Jan Dittmer
  0 siblings, 1 reply; 9+ messages in thread
From: Mogens Valentin @ 2006-04-29  8:53 UTC
  To: Jan Dittmer; +Cc: Tejun Heo, jgarzik, linux-ide, linux-kernel

Tejun Heo wrote:
> Jan Dittmer wrote:
> 
>> Tejun Heo wrote:
>>
>>>> serror=0x0[4297873.266000] sata_sil24 ata1: resetting controller...
>>>> [4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
>>>> [4297873.267000] sdc: Current: sense key=0x0
>>>> [4297873.267000]     ASC=0x0 ASCQ=0x0
>>>>
>>>> The time between these events varies from .5s to up to 10s, resync 
>>>> speed is
>>>> pretty bad (6mb/s) but appears(!) to be working.
>>>> This is with vanilla 2.6.17-rc3, sata drivers built into the kernel.
>>>> Find below /proc/interrupts and lspci output. Boot dmesg output was 
>>>> washed
>>>> away by above messages, sorry.
>>>>
>>>> What's the cause of the error, can I ignore it or will it destroy
>>>> my raid eventually? I'm now about 5% through the resync process, 
>>>> with an estimated finish in 1260 minutes.
>>>>
>>>>
>>>> $ lspci -vv -s 03:04.0
>>>> 0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X 
>>>> Serial ATA Controller (rev 01)
>>>>     Subsystem: Silicon Image, Inc.: Unknown device 7124
>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
>>>> ParErr- Stepping+ SERR- FastB2B-
>>>>     Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
>>>> <TAbort- <MAbort- >SERR- <PERR-
>>>>     Latency: 32
>>>>     Interrupt: pin A routed to IRQ 22
>>>>     Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
>>>>     Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
>>>>     Region 4: I/O ports at 9400 [size=16]
>>>>     Expansion ROM at fe900000 [disabled] [size=512K]
>>>>     Capabilities: [64] Power Management version 2
>>>>         Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA 
>>>> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>>>         Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>>>>     Capabilities: [40] PCI-X non-bridge device.
>>>>         Command: DPERE- ERO+ RBC=0 OST=5
>>>>         Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, 
>>>> DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
>>>
>>> So, slow down the PCI-X bus.  It can usually be done from BIOS setup 
>>> menu.  Does your machine has a riser board which extends or changes 
>>> orientation of PCI-X bus?  Motherboard vendors describe the bus 
>>> frequency limit when using riser boards in the manual but sometimes 
>>> server vendors forget to set them.  Heck, some of them don't even 
>>> know what that is.
>>
>>
>> Hmm I don't have a riser card and I don't have a setting for the 
>> frequency,
>> nor a jumper.
>> I plugged the card in another slot, next to a 66MHz only card. So now 
>> I've
>> it working with 66MHz (checked with lspci), but my drive isn't 
>> initialized
>> properly anymore:
>>
>> [4294690.486000] libata version 1.20 loaded.
>> [4294690.486000] sata_sil24 0000:03:04.0: version 0.23
>> [4294690.486000] ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 24 (level, 
>> low) -> IRQ 22
>> [4294690.487000] ata1: SATA max UDMA/100 cmd 0xF8810000 ctl 0x0 bmdma 
>> 0x0 irq 22
>> [4294690.487000] ata2: SATA max UDMA/100 cmd 0xF8812000 ctl 0x0 bmdma 
>> 0x0 irq 22
>> [4294690.487000] ata3: SATA max UDMA/100 cmd 0xF8814000 ctl 0x0 bmdma 
>> 0x0 irq 22
>> [4294690.487000] ata4: SATA max UDMA/100 cmd 0xF8816000 ctl 0x0 bmdma 
>> 0x0 irq 22
>> [4294690.800000] ata1: SATA link up 3.0 Gbps (SStatus 123)
>> [4294690.801000] ata1: dev 0 cfg 49:5145 82:0000 83:0000 84:0000 
>> 85:0000 86:0000 87:0000 88:0000
>> [4294690.801000] ata1: dev 0 ATA-0, max MWDMA2, 16514064 sectors: CHS 
>> 16383/16/63
>> [4294690.802000] ata1: dev 0 model number mismatch 'WDC 
>> WD3200re/sasats_li42S' != ''
>> [4294690.802000] ata1: dev 0 revalidation failed (errno=-19)
>> [4294690.802000] ata1: failed to revalidate after set xfermode
>> [4294690.802000] scsi2 : sata_sil24
>> [4294691.003000] ata2: SATA link down (SStatus 0)
>> [4294691.003000] scsi3 : sata_sil24
>> [4294691.204000] ata3: SATA link down (SStatus 0)
>> [4294691.204000] scsi4 : sata_sil24
>> [4294691.405000] ata4: SATA link down (SStatus 0)
>> [4294691.405000] scsi5 : sata_sil24
>>
>> Can this still be a pci bus problem? I get the same error on every 
>> reboot.
> 
> Hmmm.. max MWDMA2?  Something is very off with your configuration.  Can 
> you try the card in another box or on a regular PCI slot?

Since moving the card to another slot changes the behaviour/problem, I'm 
thinking it might be a motherboard implementation problem with slots 
interacting with respect to IRQs, as in the old days of PCI IRQ problems.

You might try shifting that card and other cards in various slots and 
dump the IRQ table for each combination. Maybe simply take out any other 
cards you can live without while trying out the various slots.
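
Dumping the IRQ table for each slot combination is easier to compare if the shared lines are extracted automatically. The following Python sketch is an assumption-laden illustration: `shared_irqs` is a hypothetical helper, and it expects the 2.6-era /proc/interrupts layout shown earlier in this thread (per-CPU counts, one controller-type column, then a comma-separated device list). Newer kernels print extra columns and would need a different parser.

```python
def shared_irqs(proc_interrupts_text):
    """Map each numeric IRQ with more than one device to its device names."""
    lines = proc_interrupts_text.splitlines()
    # The header row lists CPU0, CPU1, ...; its length gives the CPU count.
    ncpu = next(
        (len(l.split()) for l in lines
         if l.split() and l.split()[0].startswith("CPU")),
        1,
    )
    shared = {}
    for line in lines:
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skips the header and rows such as NMI, LOC, ERR
        fields = rest.split()
        # Skip the per-CPU counters and the controller-type column.
        devices = " ".join(fields[ncpu + 1:])
        names = [d.strip() for d in devices.split(",") if d.strip()]
        if len(names) > 1:
            shared[int(label)] = names
    return shared


sample = """\
           CPU0       CPU1       CPU2       CPU3
 18:          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
 19:      83632     396995     247509     324823   IO-APIC-level  ide2, ide3
 22:     103851     230769     219349     209174   IO-APIC-level  libata
NMI:          0          0          0          0
"""

print(shared_irqs(sample))
```

Saving this output once per slot arrangement gives a compact record of which devices ended up sharing a line with the SATA card.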

-- 
Kind regards,
Mogens Valentin



* Re: sata_sil24 resetting controller...
  2006-04-29  8:53       ` Mogens Valentin
@ 2006-04-29 10:06         ` Jan Dittmer
  2006-04-30  2:22           ` jason
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Dittmer @ 2006-04-29 10:06 UTC
  To: mogensv; +Cc: Tejun Heo, jgarzik, linux-ide, linux-kernel

Mogens Valentin wrote:
> Tejun Heo wrote:
>> Jan Dittmer wrote:
>>
>>> Tejun Heo wrote:
>>>
>>>>> serror=0x0[4297873.266000] sata_sil24 ata1: resetting controller...
>>>>> [4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
>>>>> [4297873.267000] sdc: Current: sense key=0x0
>>>>> [4297873.267000]     ASC=0x0 ASCQ=0x0
>>>>>
>>>>> The time between these events varies from .5s to up to 10s, resync 
>>>>> speed is
>>>>> pretty bad (6mb/s) but appears(!) to be working.
>>>>> This is with vanilla 2.6.17-rc3, sata drivers built into the kernel.
>>>>> Find below /proc/interrupts and lspci output. Boot dmesg output was 
>>>>> washed
>>>>> away by above messages, sorry.
>>>>>
>>>>> What's the cause of the error, can I ignore it or will it destroy
>>>>> my raid eventually? I'm now about 5% through the resync process, 
>>>>> with an estimated finish in 1260 minutes.
>>>>>
>>>>>
>>>>> $ lspci -vv -s 03:04.0
>>>>> 0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X 
>>>>> Serial ATA Controller (rev 01)
>>>>>     Subsystem: Silicon Image, Inc.: Unknown device 7124
>>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
>>>>> ParErr- Stepping+ SERR- FastB2B-
>>>>>     Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
>>>>> <TAbort- <MAbort- >SERR- <PERR-
>>>>>     Latency: 32
>>>>>     Interrupt: pin A routed to IRQ 22
>>>>>     Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
>>>>>     Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
>>>>>     Region 4: I/O ports at 9400 [size=16]
>>>>>     Expansion ROM at fe900000 [disabled] [size=512K]
>>>>>     Capabilities: [64] Power Management version 2
>>>>>         Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA 
>>>>> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>>>>         Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>>>>>     Capabilities: [40] PCI-X non-bridge device.
>>>>>         Command: DPERE- ERO+ RBC=0 OST=5
>>>>>         Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, 
>>>>> DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
>>>> So, slow down the PCI-X bus.  It can usually be done from BIOS setup 
>>>> menu.  Does your machine has a riser board which extends or changes 
>>>> orientation of PCI-X bus?  Motherboard vendors describe the bus 
>>>> frequency limit when using riser boards in the manual but sometimes 
>>>> server vendors forget to set them.  Heck, some of them don't even 
>>>> know what that is.
>>>
>>> Hmm I don't have a riser card and I don't have a setting for the 
>>> frequency,
>>> nor a jumper.
>>> I plugged the card in another slot, next to a 66MHz only card. So now 
>>> I've
>>> it working with 66MHz (checked with lspci), but my drive isn't 
>>> initialized
>>> properly anymore:
>>>
>>> [4294690.486000] libata version 1.20 loaded.
>>> [4294690.486000] sata_sil24 0000:03:04.0: version 0.23
>>> [4294690.486000] ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 24 (level, 
>>> low) -> IRQ 22
>>> [4294690.487000] ata1: SATA max UDMA/100 cmd 0xF8810000 ctl 0x0 bmdma 
>>> 0x0 irq 22
>>> [4294690.487000] ata2: SATA max UDMA/100 cmd 0xF8812000 ctl 0x0 bmdma 
>>> 0x0 irq 22
>>> [4294690.487000] ata3: SATA max UDMA/100 cmd 0xF8814000 ctl 0x0 bmdma 
>>> 0x0 irq 22
>>> [4294690.487000] ata4: SATA max UDMA/100 cmd 0xF8816000 ctl 0x0 bmdma 
>>> 0x0 irq 22
>>> [4294690.800000] ata1: SATA link up 3.0 Gbps (SStatus 123)
>>> [4294690.801000] ata1: dev 0 cfg 49:5145 82:0000 83:0000 84:0000 
>>> 85:0000 86:0000 87:0000 88:0000
>>> [4294690.801000] ata1: dev 0 ATA-0, max MWDMA2, 16514064 sectors: CHS 
>>> 16383/16/63
>>> [4294690.802000] ata1: dev 0 model number mismatch 'WDC 
>>> WD3200re/sasats_li42S' != ''
>>> [4294690.802000] ata1: dev 0 revalidation failed (errno=-19)
>>> [4294690.802000] ata1: failed to revalidate after set xfermode
>>> [4294690.802000] scsi2 : sata_sil24
>>> [4294691.003000] ata2: SATA link down (SStatus 0)
>>> [4294691.003000] scsi3 : sata_sil24
>>> [4294691.204000] ata3: SATA link down (SStatus 0)
>>> [4294691.204000] scsi4 : sata_sil24
>>> [4294691.405000] ata4: SATA link down (SStatus 0)
>>> [4294691.405000] scsi5 : sata_sil24
>>>
>>> Can this still be a pci bus problem? I get the same error on every 
>>> reboot.
>> Hmmm.. max MWDMA2?  Something is very off with your configuration.  Can 
>> you try the card in another box or on a regular PCI slot?
> 
> Since moving the card to another slot changes the behaviour/problem, I'm 
> thinking it might be a mobo implementation problem with slots 
> interacting WRT IRQ, like in the older PCI-IRQ problem days.
> 
> You might try shifting that card and other cards in various slots and 
> dump the IRQ table for each combination. Maybe simply take out any other 
> cards you can live without while trying out the various slots.

I have now moved the SATA card into a 66MHz, 32-bit PCI slot and the
problems went away. Just for the record, this is an Asus PU-DLS
mainboard with the E7501 chipset. Now I can dd from all devices without
any error messages, giving me about 360 MB/s continuous throughput for
6 devices, which isn't that bad I suppose.
The card gets assigned IRQ 22 in both configurations, but in the
latter the IRQ is shared with the on-board usb-uhci controller,
which somehow seems to work better...

Thanks for all your help,

Jan




* Re: sata_sil24 resetting controller...
  2006-04-29 10:06         ` Jan Dittmer
@ 2006-04-30  2:22           ` jason
  2006-04-30  2:31             ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: jason @ 2006-04-30  2:22 UTC
  To: Jan Dittmer; +Cc: mogensv, Tejun Heo, jgarzik, linux-ide, linux-kernel

Jan Dittmer wrote


> I shifted the sata card into a 66MHz, 32bit PCI slot now and the
> problems went away. Just for the record, this is an Asus PU-DLS
> mainboard with E7501 chipset. Now I can dd from all devices without
> any error messages, giving me about 360mb/s continuous throughput for
> 6 devices which isn't that bad I suppose.
> The card gets assigned irq 22 in both configurations but in the
> latter the irq is shared with the on-board usb-uhci controller
> which somehow seems to work better...

You put a 64bit+ 133MHz+ non-bridge PCI-X device into a 32-bit slot? Is that right?
I am confused...

--
Yours,
jason


* Re: sata_sil24 resetting controller...
  2006-04-30  2:22           ` jason
@ 2006-04-30  2:31             ` Tejun Heo
  0 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2006-04-30  2:31 UTC
  To: jason; +Cc: Jan Dittmer, mogensv, jgarzik, linux-ide, linux-kernel

jason wrote:
> Jan Dittmer wrote
> 
> 
>> I shifted the sata card into a 66MHz, 32bit PCI slot now and the
>> problems went away. Just for the record, this is an Asus PU-DLS
>> mainboard with E7501 chipset. Now I can dd from all devices without
>> any error messages, giving me about 360mb/s continuous throughput for
>> 6 devices which isn't that bad I suppose.
>> The card gets assigned irq 22 in both configurations but in the
>> latter the irq is shared with the on-board usb-uhci controller
>> which somehow seems to work better...
> 
> push a 64bit+ 133Mhz+ non bridge PCI-X device into a 32bit slot? Is it 
> right?
> I am confused...
> 

Well, it's not optimal, but most PCI-X cards, including the sil3124, are 
backward compatible with PCI, so....
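
To put a number on "not optimal", the theoretical peak of a parallel PCI bus is its width in bytes times its clock. The Python sketch below uses nominal clock figures and ignores protocol overhead; the function name `peak_bandwidth_mb_s` is made up for illustration, and real throughput will be lower than these ceilings.

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz):
    """Theoretical peak in MB/s: one bus-width transfer per clock cycle."""
    return bus_width_bits / 8 * clock_mhz


pci_32_66 = peak_bandwidth_mb_s(32, 66)      # the slot the card ended up in
pcix_64_133 = peak_bandwidth_mb_s(64, 133)   # what the card advertises

print("32-bit/66MHz PCI:    %.0f MB/s" % pci_32_66)
print("64-bit/133MHz PCI-X: %.0f MB/s" % pcix_64_133)
```

The narrower slot leaves roughly a quarter of the advertised ceiling, which is still well above what a single drive on the card can sustain.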

-- 
tejun


end of thread, other threads:[~2006-04-30  2:31 UTC | newest]

Thread overview: 9+ messages
2006-04-27 18:58 sata_sil24 resetting controller jdi
2006-04-27 19:21 ` Jan Dittmer
2006-04-27 23:52 ` Tejun Heo
2006-04-29  0:13   ` Jan Dittmer
2006-04-29  0:20     ` Tejun Heo
2006-04-29  8:53       ` Mogens Valentin
2006-04-29 10:06         ` Jan Dittmer
2006-04-30  2:22           ` jason
2006-04-30  2:31             ` Tejun Heo
