linux-ide.vger.kernel.org archive mirror
* PROBLEM: ata_piix.c for the ICH5 SATA Controller.
@ 2007-06-28 12:46 Johny Mail list
  2007-06-28 17:43 ` Mark Lord
  0 siblings, 1 reply; 7+ messages in thread
From: Johny Mail list @ 2007-06-28 12:46 UTC
  To: Jeff Garzik; +Cc: linux-ide

Hi,
I have a big problem with my Dell SC1425 servers. I use Linux software
RAID on them, and over the last few days I ran a few tests to see how
the server reacts to different situations such as power failure, hard
drive power failure ...
The hard drive power failure was the problem. When I unplug the power
supply (or the SATA cable) of one of my two hard drives in RAID 1, the
server stops responding and I get these messages:
ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata4.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
             res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4: port is slow to respond, please be patient (Status 0xd0)
ata4: port failed to respond (30sec, Status 0xd0)
ata4: soft resetting port
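
The "Status 0xd0" in the messages above can be decoded by hand. The sketch
below is a userspace illustration only, not libata code: the bit values are
the standard ATA Status register bits, while the macro names and the helpers
ata_status_describe() and ata_status_is_busy() are hypothetical names chosen
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard ATA Status register bits; the macro names are illustrative. */
#define ATA_ST_BSY  0x80u /* device busy; other bits are not valid while set */
#define ATA_ST_DRDY 0x40u /* device ready */
#define ATA_ST_DF   0x20u /* device fault */
#define ATA_ST_DSC  0x10u /* bit 4: seek complete on older devices */
#define ATA_ST_DRQ  0x08u /* data request */
#define ATA_ST_ERR  0x01u /* error */

/* Hypothetical helper: write the names of the set bits into buf,
 * separated by '|'. Returns the number of characters written. */
static size_t ata_status_describe(uint8_t status, char *buf, size_t len)
{
    static const struct { uint8_t bit; const char *name; } bits[] = {
        { ATA_ST_BSY, "BSY" }, { ATA_ST_DRDY, "DRDY" }, { ATA_ST_DF, "DF" },
        { ATA_ST_DSC, "DSC" }, { ATA_ST_DRQ, "DRQ" },   { ATA_ST_ERR, "ERR" },
    };
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        int n;

        if (!(status & bits[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated; stop here */
        used += (size_t)n;
    }
    return used;
}

/* Hypothetical helper: nonzero when the device reports BSY. */
static int ata_status_is_busy(uint8_t status)
{
    return (status & ATA_ST_BSY) != 0;
}
```

Calling ata_status_describe(0xd0, buf, sizeof(buf)) fills buf with
"BSY|DRDY|DSC". BSY is set, which fits the "port is slow to respond" and
"port failed to respond" lines: the port never left the busy state after the
drive lost power.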

I ran the same test on an SC1435 (the next generation) with a
Broadcom chipset/driver, and everything is fine when I unplug one hard
drive.
On the SC1425, my BIOS is up to date from the Dell website.

You can contact me for more information or for further tests.

Thanks for your work

My system information:
# sh scripts/ver_linux
Linux raid-test 2.6.21.5-grsec-ipvs #1 SMP Thu Jun 28 13:51:34 CEST 2007 x86_64 GNU/Linux

Gnu C                  4.1.2
Gnu make               3.81
binutils               2.17
util-linux             2.12r
mount                  2.12r
module-init-tools      3.3-pre2
e2fsprogs              1.40-WIP
Linux C Library        2.3.6
Dynamic linker (ldd)   2.3.6
Procps                 3.2.7
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.97
udev                   105

# cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : 0000:00:1f.1
  0170-0177 : libata
01f0-01f7 : 0000:00:1f.1
  01f0-01f7 : libata
0376-0376 : 0000:00:1f.1
  0376-0376 : libata
03c0-03df : vga+
03f6-03f6 : 0000:00:1f.1
  03f6-03f6 : libata
03f8-03ff : serial
0800-087f : 0000:00:1f.0
  0800-087f : pnp 00:07
    0800-0803 : ACPI PM1a_EVT_BLK
    0804-0805 : ACPI PM1a_CNT_BLK
    0808-080b : ACPI PM_TMR
    0828-082f : ACPI GPE0_BLK
0880-08bf : 0000:00:1f.0
  0880-08bf : pnp 00:07
08c0-08df : pnp 00:07
08e0-08e3 : pnp 00:07
0c00-0c0f : pnp 00:07
0c10-0c1f : pnp 00:07
0ca0-0ca7 : pnp 00:07
0ca9-0cab : pnp 00:07
0cf8-0cff : PCI conf1
cc80-cc8f : 0000:00:1f.2
  cc80-cc8f : libata
cc98-cc9b : 0000:00:1f.2
  cc98-cc9b : libata
cca0-cca7 : 0000:00:1f.2
  cca0-cca7 : libata
ccb0-ccb3 : 0000:00:1f.2
  ccb0-ccb3 : libata
ccb8-ccbf : 0000:00:1f.2
  ccb8-ccbf : libata
ccc0-ccdf : 0000:00:1d.1
  ccc0-ccdf : uhci_hcd
cce0-ccff : 0000:00:1d.0
  cce0-ccff : uhci_hcd
d000-dfff : PCI Bus #04
  d800-d8ff : 0000:04:0d.0
  dcc0-dcff : 0000:04:03.0
    dcc0-dcff : e1000
e000-efff : PCI Bus #01
  e000-efff : PCI Bus #02
    ecc0-ecff : 0000:02:04.0
      ecc0-ecff : e1000
fc00-fc0f : 0000:00:1f.1
  fc00-fc0f : libata

# cat /proc/iomem
00000000-0009ffff : System RAM
  00000000-00000000 : Crash kernel
00100000-1ffbffff : System RAM
  00200000-004dcfb9 : Kernel code
  004dcfba-0059c9ef : Kernel data
1ffc0000-1ffcfbff : ACPI Tables
1ffcfc00-1fffefff : reserved
20000000-200003ff : 0000:00:1f.1
e0000000-efffffff : PCI MMCONFIG 0
  e0000000-efffffff : reserved
f0000000-f7ffffff : PCI Bus #04
  f0000000-f7ffffff : 0000:04:0d.0
fe500000-fe6fffff : PCI Bus #04
  fe500000-fe51ffff : 0000:04:0d.0
  fe5d0000-fe5dffff : 0000:04:0d.0
  fe5e0000-fe5fffff : 0000:04:03.0
    fe5e0000-fe5fffff : e1000
fe700000-feafffff : PCI Bus #01
  fe900000-feafffff : PCI Bus #02
    fe9e0000-fe9fffff : 0000:02:04.0
      fe9e0000-fe9fffff : e1000
feb00000-feb003ff : 0000:00:1d.7
  feb00000-feb003ff : ehci_hcd
fec00000-fec8ffff : reserved
  fec00000-fec00fff : IOAPIC 0
  fec80000-fec80fff : IOAPIC 1
fed00000-fed003ff : HPET 0
fee00000-fee00fff : Local APIC
ffb00000-ffffffff : reserved

# lspci -vvv
00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 09)
        Subsystem: Dell PowerEdge SC1425
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Capabilities: [40] Vendor Specific Information

00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express
Port A (rev 09) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=01, subordinate=03, sec-latency=0
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: fe700000-feafffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Message Signalled Interrupts: Mask- 64bit-
Queue=0/1 Enable-
                Address: fee00000  Data: 0000
        Capabilities: [64] Express Root Port (Slot-) IRQ 0
                Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <64ns, L1 <1us
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 256 bytes, MaxReadReq 128 bytes
                Link: Supported Speed 2.5Gb/s, Width x8, ASPM L0s, Port 2
                Link: Latency L0s <4us, L1 unlimited
                Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed 2.5Gb/s, Width x8
                Root: Correctable- Non-Fatal- Fatal- PME-
        Capabilities: [100] Advanced Error Reporting

00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 4: I/O ports at cce0 [size=32]

00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #2 (rev 02) (prog-if 00 [UHCI])
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin B routed to IRQ 19
        Region 4: I/O ports at ccc0 [size=32]

00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2
EHCI Controller (rev 02) (prog-if 20 [EHCI])
        Subsystem: Dell PowerEdge SC1425
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin D routed to IRQ 23
        Region 0: Memory at feb00000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Debug port

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
(prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Bus: primary=00, secondary=04, subordinate=04, sec-latency=32
        I/O behind bridge: 0000d000-0000dfff
        Memory behind bridge: fe500000-fe6fffff
        Prefetchable memory behind bridge: f0000000-f7ffffff
        Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-

00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC
Interface Bridge (rev 02)
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0

00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE
Controller (rev 02) (prog-if 8a [Master SecP PriP])
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin A routed to IRQ 18
        Region 0: I/O ports at 01f0 [size=8]
        Region 1: I/O ports at 03f4 [size=1]
        Region 2: I/O ports at 0170 [size=8]
        Region 3: I/O ports at 0374 [size=1]
        Region 4: I/O ports at fc00 [size=16]
        Region 5: Memory at 20000000 (32-bit, non-prefetchable) [size=1K]

00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA
Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO])
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin A routed to IRQ 18
        Region 0: I/O ports at ccb8 [size=8]
        Region 1: I/O ports at ccb0 [size=4]
        Region 2: I/O ports at cca0 [size=8]
        Region 3: I/O ports at cc98 [size=4]
        Region 4: I/O ports at cc80 [size=16]

01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI
Bridge A (rev 09) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=32
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: fe900000-feafffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
                Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <64ns, L1 <1us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x8, ASPM L0s, Port 0
                Link: Latency L0s unlimited, L1 unlimited
                Link: ASPM Disabled CommClk- ExtSynch-
                Link: Speed 2.5Gb/s, Width x8
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [6c] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [d8] PCI-X bridge device
                Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=conv
                Status: Dev=01:00.0 64bit- 133MHz- SCD- USC- SCO- SRD-
                Upstream: Capacity=65535 CommitmentLimit=65535
                Downstream: Capacity=65535 CommitmentLimit=65535
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [300] Power Budgeting

01:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI
Bridge B (rev 09) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=01, secondary=03, subordinate=03, sec-latency=64
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: fff00000-000fffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
                Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <64ns, L1 <1us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x8, ASPM L0s, Port 0
                Link: Latency L0s unlimited, L1 unlimited
                Link: ASPM Disabled CommClk- ExtSynch-
                Link: Speed 2.5Gb/s, Width x8
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [6c] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [d8] PCI-X bridge device
                Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=133MHz
                Status: Dev=01:00.2 64bit- 133MHz- SCD- USC- SCO- SRD-
                Upstream: Capacity=65535 CommitmentLimit=65535
                Downstream: Capacity=65535 CommitmentLimit=65535
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [300] Power Budgeting

02:04.0 Ethernet controller: Intel Corporation 82541GI Gigabit
Ethernet Controller (rev 05)
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (63750ns min), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 32
        Region 0: Memory at fe9e0000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at ecc0 [size=64]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [e4] PCI-X non-bridge device
                Command: DPERE- ERO+ RBC=512 OST=1
                Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple
DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-

04:03.0 Ethernet controller: Intel Corporation 82541GI Gigabit
Ethernet Controller (rev 05)
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (63750ns min), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 20
        Region 0: Memory at fe5e0000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at dcc0 [size=64]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [e4] PCI-X non-bridge device
                Command: DPERE- ERO+ RBC=512 OST=1
                Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple
DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-

04:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100
QY [Radeon 7000/VE] (prog-if 00 [VGA])
        Subsystem: Dell PowerEdge SC1425
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop+
ParErr- Stepping+ SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 5
        Region 0: Memory at f0000000 (32-bit, prefetchable) [size=128M]
        Region 1: I/O ports at d800 [size=256]
        Region 2: Memory at fe5d0000 (32-bit, non-prefetchable) [size=64K]
        [virtual] Expansion ROM at fe500000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

# cat /proc/scsi/scsi
Attached devices:
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: WDC WD800JD-75LS Rev: 09.0
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: WDC WD800JD-75LS Rev: 09.0
  Type:   Direct-Access                    ANSI  SCSI revision: 05


* Re: PROBLEM: ata_piix.c for the ICH5 SATA Controller.
  2007-06-28 12:46 PROBLEM: ata_piix.c for the ICH5 SATA Controller Johny Mail list
@ 2007-06-28 17:43 ` Mark Lord
  2007-06-29 11:39   ` Johny Mail list
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Lord @ 2007-06-28 17:43 UTC
  To: Johny Mail list; +Cc: Jeff Garzik, linux-ide

Johny Mail list wrote:
> Hi,
> I have a big problem with my Dell SC1425 servers. I use Linux software
> RAID on them, and over the last few days I ran a few tests to see how
> the server reacts to different situations such as power failure, hard
> drive power failure ...
> The hard drive power failure was the problem. When I unplug the power
> supply (or the SATA cable) of one of my two hard drives in RAID 1, the
> server stops responding and I get these messages:
> ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata4.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
>             res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata4: port is slow to respond, please be patient (Status 0xd0)
> ata4: port failed to respond (30sec, Status 0xd0)
> ata4: soft resetting port

Does it hang permanently there, or keep failing with additional messages?

According to Intel, their ICH5 hardware does not support ordinary
SATA drive hot insertion/removal.  In practice, it can be made to work
but not via the standard SATA mechanism.

My own observation is that the hardware (CPU) locks up hard when
libata attempts to issue SRST (reset) to a removed SATA drive on ICH5.

I have an ugly (but working) hack for the ICH5 ata_piix driver
to support hot insertion/removal of drives, but I don't know if/when
I'll be pushing it upstream.

Cheers



* Re: PROBLEM: ata_piix.c for the ICH5 SATA Controller.
  2007-06-28 17:43 ` Mark Lord
@ 2007-06-29 11:39   ` Johny Mail list
  2007-06-29 15:04     ` Mark Lord
  0 siblings, 1 reply; 7+ messages in thread
From: Johny Mail list @ 2007-06-29 11:39 UTC
  To: Mark Lord; +Cc: Jeff Garzik, linux-ide

2007/6/28, Mark Lord <liml@rtr.ca>:
> Johny Mail list wrote:
> > Hi,
> > I have a big problem with my Dell SC1425 servers. I use Linux software
> > RAID on them, and over the last few days I ran a few tests to see how
> > the server reacts to different situations such as power failure, hard
> > drive power failure ...
> > The hard drive power failure was the problem. When I unplug the power
> > supply (or the SATA cable) of one of my two hard drives in RAID 1, the
> > server stops responding and I get these messages:
> > ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > ata4.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
> >             res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> > ata4: port is slow to respond, please be patient (Status 0xd0)
> > ata4: port failed to respond (30sec, Status 0xd0)
> > ata4: soft resetting port
>
> Does it hang permanently there, or keep failing with additional messages?
>
> According to Intel, their ICH5 hardware does not support ordinary
> SATA drive hot insertion/removal.  In practice, it can be made to work
> but not via the standard SATA mechanism.
>
> My own observation is that the hardware (CPU) locks up hard when
> libata attempts to issue SRST (reset) to a removed SATA drive on ICH5.
>
> I have an ugly (but working) hack for the ICH5 ata_piix driver
> to support hot insertion/removal of drives, but I don't know if/when
> I'll be pushing it upstream.
>
> Cheers
>
>

Yes, it hangs permanently there; after these messages I generally reboot
the server.
I understand that the ICH5 does not support SATA drive hot
insertion/removal, but I ran the same test on Windows: I unplugged one
disk while logged in and the system did not stop. The drive was removed
from the device list.

If you can give me the patch for testing, I will report back on how
well it works in my case.

Regards


* Re: PROBLEM: ata_piix.c for the ICH5 SATA Controller.
  2007-06-29 11:39   ` Johny Mail list
@ 2007-06-29 15:04     ` Mark Lord
  2007-06-29 15:56       ` Mark Lord
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Lord @ 2007-06-29 15:04 UTC
  To: Johny Mail list; +Cc: Jeff Garzik, linux-ide

Johny Mail list wrote:
> 2007/6/28, Mark Lord <liml@rtr.ca>:
>> I have an ugly (but working) hack for the ICH5 ata_piix driver
>> to support hot insertion/removal of drives, but I don't know if/when
>> I'll be pushing it upstream.
>
> Yes, it hangs permanently there; after these messages I generally reboot
> the server.
> I understand that the ICH5 does not support SATA drive hot
> insertion/removal, but I ran the same test on Windows: I unplugged one
> disk while logged in and the system did not stop. The drive was removed
> from the device list.
>
> If you can give me the patch for testing, I will report back on how
> well it works in my case.

Okay, here is a working patch for a very specific variant of ICH5.
If your PCI IDs don't match what the patch is looking for,
then it should have no effect -- you may need to patch the patch
to contain the correct PCI IDs (from lspci -n).
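
The ID check can be tried outside the kernel before building anything. The
following is a minimal userspace sketch, assuming the "vendor:device" field
is copied by hand from the lspci -n line for the SATA controller. The values
0x8086 and 0x24d1 are the ones ich_port_start() tests for in the patch; the
macro names and the helpers parse_pci_id() and patch_would_apply() are
hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* IDs that ich_port_start() in the patch compares against. */
#define ICH5_SATA_VENDOR 0x8086u
#define ICH5_SATA_DEVICE 0x24d1u

/* Hypothetical helper: parse a "vvvv:dddd" hex field such as the one
 * printed by lspci -n. Returns 1 on success, 0 on a malformed field. */
static int parse_pci_id(const char *field, uint16_t *vendor, uint16_t *device)
{
    unsigned int v, d;

    assert(field != NULL && vendor != NULL && device != NULL);
    if (sscanf(field, "%4x:%4x", &v, &d) != 2)
        return 0;
    *vendor = (uint16_t)v;
    *device = (uint16_t)d;
    return 1;
}

/* Hypothetical helper: nonzero when the field matches the IDs the patch
 * looks for, so the hotplug polling would be enabled on that controller. */
static int patch_would_apply(const char *field)
{
    uint16_t vendor, device;

    return parse_pci_id(field, &vendor, &device) &&
           vendor == ICH5_SATA_VENDOR && device == ICH5_SATA_DEVICE;
}
```

With this sketch, patch_would_apply("8086:24d1") is nonzero, and any other
device ID gives zero, which corresponds to the "no effect" case described
above.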
* * *

Implement ICH5 chipset handling for drive hot insertion/removal.
This cannot go upstream, as it conflicts with a more generic
polled-hotplug framework that is currently in development.

Hot-inserted drives are automatically detected within a second or two,
and are ready-to-use within 30 seconds or so.  This could be even faster,
but the 2.6.18.8 libata implementation of error-handling is what slows
us down here.

Hot-removed drives are *not* noticed by the kernel until the next
time they are accessed.  If you want this to happen quickly,
then just launch a script like this from /etc/inittab at boot time:

   #!/bin/bash
   ( while ( /bin/true ) ; do /sbin/hdparm -C /dev/sd[a-z] ; sleep 5 ; done ) &>/dev/null &

This hack is not ready for mainline -- it's awaiting Tejun's hp-poll patches,
with which it will eventually be integrated.

Signed-off-by: Mark Lord <mlord@pobox.com>
---

diff -u --recursive --new-file --exclude-from=old/Documentation/dontdiff old/drivers/scsi/ata_piix.c linux/drivers/scsi/ata_piix.c
--- old/drivers/scsi/ata_piix.c	2007-04-20 14:08:46.000000000 -0400
+++ linux/drivers/scsi/ata_piix.c	2007-06-26 07:23:21.000000000 -0400
@@ -106,6 +106,8 @@
 	PIIX_FLAG_AHCI		= (1 << 27), /* AHCI possible */
 	PIIX_FLAG_CHECKINTR	= (1 << 28), /* make sure PCI INTx enabled */
 
+	PIIX_HOTPLUG_POLL_TM	= (2 * (HZ)),	/* polling interval for hotplug */
+
 	/* combined mode.  if set, PATA is channel 0.
 	 * if clear, PATA is channel 1.
 	 */
@@ -150,6 +152,171 @@
 	const struct piix_map_db *map_db;
 };
 
+struct piix_port_priv {
+	int pcs_hotplug_supported;
+	struct timer_list hotplug_timer;
+	u16 old_pcs;
+};
+
+static u32 ich_scr_read (struct ata_port *ap, unsigned int reg)
+{
+	u32 scr = 0;
+
+	if (reg == SCR_STATUS) {
+		struct piix_port_priv *pp = ap->private_data;
+		if (pp && pp->pcs_hotplug_supported) {
+			u16 pcs, port_bit = (1 << ap->hard_port_no);
+			struct pci_dev *pdev = to_pci_dev(ap->dev);
+
+			pci_read_config_word(pdev, ICH5_PCS, &pcs);
+			if (pcs & (port_bit << 4))
+				scr = 0x113;
+		}
+	}
+	return scr;
+}
+
+static int ich_port_offline (struct ata_port *ap)
+{
+	struct pci_dev *pdev;
+	u16 pcs, port_bit = (1 << ap->hard_port_no);
+	struct piix_port_priv *pp = ap->private_data;
+	u8 ostatus;
+	unsigned int offline;
+
+	if (!pp || !pp->pcs_hotplug_supported) {
+		u32 sstatus;
+		if (!sata_scr_read(ap, SCR_STATUS, &sstatus) && (sstatus & 0xf) != 0x3)
+			return 1;
+		return 0;
+	}
+
+	/*
+	 * ICH5 with a mostly good/working PCS register.
+	 * The only flaw is, it doesn't seem to detect *removed* drives
+	 * unless we toggle the enable line before checking.
+	 */
+	ostatus = ata_altstatus(ap);
+	pdev = to_pci_dev(ap->dev);
+	pci_read_config_word(pdev, ICH5_PCS, &pcs);
+	offline = ((pcs & (port_bit << 4)) == 0);
+
+	if (!offline) {
+		unsigned int usecs;
+
+		/* Cycle PCS register to force it to redetect devices: */
+		pci_write_config_word(pdev, ICH5_PCS, pcs & ~port_bit);
+		udelay(1);
+		pci_write_config_word(pdev, ICH5_PCS, 0x0003);
+
+		/* Wait for SATA PHY to sync up; typically 5->6 usecs */
+		for (usecs = 0; usecs < 100; ++usecs) {
+			pci_read_config_word(pdev,  ICH5_PCS, &pcs);
+			offline = ((pcs & (port_bit << 4)) == 0);
+			if (!offline)
+				break;
+			udelay(1);
+		}
+		if (!offline) {
+			unsigned int msecs;
+			/* Wait for drive to become not-BUSY, typically 10->62 msecs */
+			for (msecs = 1; msecs < 150; msecs += 3) {
+				u8 status;
+				msleep(3);
+				status = ata_altstatus(ap);
+				if (status && !(status & ATA_BUSY))
+					break;
+			}
+			usecs += msecs * 1000;
+		}
+		printk(KERN_INFO "ata%u (port %u): status=%02x pcs=0x%04x offline=%u delay=%u usecs\n",
+			ap->id, ap->hard_port_no, ostatus, pcs, offline, usecs);
+	}
+	if (offline)
+		ata_port_disable(ap);
+	return offline;
+}
+
+static void pcs_hotplug_poll (unsigned long data)
+{
+	struct ata_port *ap = (void *)data;
+	struct pci_dev *pdev = to_pci_dev(ap->dev);
+	u16 old, new, port_bit = ((1 << ap->hard_port_no) << 4);
+	struct piix_port_priv *pp = ap->private_data;
+	int check_hotplug = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(ap->lock, flags);
+
+	if (!ap->qc_active) {
+		pci_read_config_word(pdev, ICH5_PCS, &new);
+		old = pp->old_pcs;
+		pp->old_pcs = new;
+
+		//printk("pcs_hotplug_poll(%d.%d) old=%04x new=%04x\n", ap->id, ap->hard_port_no, old, new);
+
+		if ((new & port_bit) != (old & port_bit)) {
+			check_hotplug = 1;
+		} else if (old & port_bit) {
+			//if (ap->hard_port_no == 1)	//FIXME FIXME FIXME
+			//	check_hotplug = 1;
+		}
+
+		if (check_hotplug) {
+			struct ata_eh_info *ehi = &ap->eh_info;
+
+			ata_port_printk(ap, KERN_INFO, "pcs_hotplug_poll: old=%04x new=%04x\n", old, new);
+			ata_ehi_clear_desc(ehi);
+			ata_ehi_hotplugged(ehi);
+			ata_ehi_push_desc(ehi, "hotplug event");
+			ata_port_freeze(ap);
+		}
+	}
+	if (pp->pcs_hotplug_supported)
+		mod_timer(&pp->hotplug_timer, jiffies + PIIX_HOTPLUG_POLL_TM);
+	spin_unlock_irqrestore(ap->lock, flags);
+}
+
+static int ich_port_start (struct ata_port *ap)
+{
+	struct pci_dev *pdev = to_pci_dev(ap->dev);
+	int rc;
+
+	rc = ata_port_start(ap);
+	if (rc == 0) {
+		if (pdev->vendor == 0x8086 && pdev->device == 0x24d1) {
+			struct piix_port_priv *pp;
+			pp = kzalloc(sizeof(*pp), GFP_KERNEL);
+			if (pp) {
+				pp->pcs_hotplug_supported = 1;
+				if (ap->private_data)
+					printk(KERN_ERR "port_start: huh? private_data=%p instead of NULL\n", ap->private_data);
+				ap->private_data = pp;
+				setup_timer(&pp->hotplug_timer, pcs_hotplug_poll, (unsigned long)ap);
+				pp->hotplug_timer.expires = jiffies + PIIX_HOTPLUG_POLL_TM;
+				add_timer(&pp->hotplug_timer);
+			} else {
+				printk(KERN_ERR "ich_port_start: failed to alloc %zu bytes for port_priv\n", sizeof(*pp));
+			}
+		}
+	} else {
+		printk(KERN_ERR "ich_port_start: ata_port_start failed, rc=%d\n", rc);
+	}
+	return rc;
+}
+
+static void ich_port_stop (struct ata_port *ap)
+{
+	struct piix_port_priv *pp = ap->private_data;
+
+	if (pp) {
+		pp->pcs_hotplug_supported = 0;
+		del_timer_sync(&pp->hotplug_timer);
+		ap->private_data = NULL;
+		kfree(pp);
+	}
+}
+
 static int piix_init_one (struct pci_dev *pdev,
 				    const struct pci_device_id *ent);
 static void piix_host_stop(struct ata_host_set *host_set);
@@ -289,8 +456,11 @@
 	.irq_handler		= ata_interrupt,
 	.irq_clear		= ata_bmdma_irq_clear,
 
-	.port_start		= ata_port_start,
-	.port_stop		= ata_port_stop,
+	.scr_read		= ich_scr_read,
+
+	.port_offline		= ich_port_offline,
+	.port_start		= ich_port_start,
+	.port_stop		= ich_port_stop,
 	.host_stop		= piix_host_stop,
 };
 
diff -u --recursive --new-file --exclude-from=old/Documentation/dontdiff old/drivers/scsi/libata-core.c linux/drivers/scsi/libata-core.c
--- old/drivers/scsi/libata-core.c	2007-04-20 14:08:45.000000000 -0400
+++ linux/drivers/scsi/libata-core.c	2007-06-26 07:22:19.000000000 -0400
@@ -4914,7 +4914,7 @@
  */
 int sata_scr_write(struct ata_port *ap, int reg, u32 val)
 {
-	if (sata_scr_valid(ap)) {
+	if (sata_scr_valid(ap) && ap->ops->scr_write) {
 		ap->ops->scr_write(ap, reg, val);
 		return 0;
 	}
@@ -4987,6 +4987,8 @@
 {
 	u32 sstatus;
 
+	if (ap->ops->port_offline)
+		return ap->ops->port_offline(ap);
 	if (!sata_scr_read(ap, SCR_STATUS, &sstatus) && (sstatus & 0xf) != 0x3)
 		return 1;
 	return 0;
diff -u --recursive --new-file --exclude-from=old/Documentation/dontdiff old/include/linux/libata.h linux/include/linux/libata.h
--- old/include/linux/libata.h	2007-06-26 07:22:26.000000000 -0400
+++ linux/include/linux/libata.h	2007-06-26 07:22:19.000000000 -0400
@@ -614,6 +614,7 @@
 
 	int (*port_start) (struct ata_port *ap);
 	void (*port_stop) (struct ata_port *ap);
+	int (*port_offline) (struct ata_port *ap);
 
 	void (*host_stop) (struct ata_host_set *host_set);
 

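The PCS bit tests used throughout the patch can be read in isolation. The
sketch below is a userspace model, not driver code. It assumes the layout the
patch itself implies: an enable bit at (1 << port) and a "device present" bit
four positions higher, for the two ports covered by the 0x0003 write in
ich_port_offline(). All function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Enable bit for a port, as in port_bit = (1 << ap->hard_port_no).
 * Assumption: two ports, matching the 0x0003 written back in the patch. */
static uint16_t pcs_enable_bit(unsigned int port)
{
    assert(port < 2);
    return (uint16_t)(1u << port);
}

/* "Device present" bit for a port, as in (port_bit << 4) in the patch. */
static uint16_t pcs_present_bit(unsigned int port)
{
    return (uint16_t)(pcs_enable_bit(port) << 4);
}

/* Mirrors the test in ich_scr_read(): nonzero when a device is present. */
static int pcs_port_present(uint16_t pcs, unsigned int port)
{
    return (pcs & pcs_present_bit(port)) != 0;
}

/* Mirrors the comparison in pcs_hotplug_poll(): nonzero when the present
 * bit for this port differs between two successive PCS samples. */
static int pcs_hotplug_event(uint16_t old_pcs, uint16_t new_pcs,
                             unsigned int port)
{
    return ((old_pcs ^ new_pcs) & pcs_present_bit(port)) != 0;
}
```

For example, if PCS reads 0x0033 on one poll and 0x0013 on the next,
pcs_hotplug_event() reports a change on port 1 only, which is the condition
under which the patch marks the port as hotplugged and freezes it so that
error handling can re-probe it.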

* Re: PROBLEM: ata_piix.c for the ICH5 SATA Controller.
  2007-06-29 15:04     ` Mark Lord
@ 2007-06-29 15:56       ` Mark Lord
  2007-07-02 14:26         ` Johny Mail list
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Lord @ 2007-06-29 15:56 UTC
  To: Johny Mail list; +Cc: Jeff Garzik, linux-ide

Mark Lord wrote:
> Johny Mail list wrote:
>> 2007/6/28, Mark Lord <liml@rtr.ca>:
>>> I have an ugly (but working) hack for the ICH5 ata_piix driver
>>> to support hot insertion/removal of drives, but I don't know if/when
>>> I'll be pushing it upstream.
>>
>> Yes, it hangs permanently there; after these messages I generally reboot
>> the server.
>> Yes, it does not support SATA drive hot insertion/removal, but I ran
>> the same test on Windows. I unplugged one disk while logged in and the
>> system did not stop. The drive was removed from the device list.
>>
>> If you can give me the patch to test, I will report back on how well
>> it works in my case.
> 
> Okay, here is a working patch for a very specific variant of ICH5.
> If your PCI IDs don't match what the patch is looking for,
> then it should have no effect -- you may need to patch the patch
> to contain the correct PCI IDs (from lspci -n).
> * * *
> 
> Implement ICH5 chipset handling for drive hot insertion/removal.
> This cannot go upstream, as it conflicts with a more generic
> polled-hotplug framework that is currently in development.
> 
> Hot-inserted drives are automatically detected within a second or two,
> and are ready-to-use within 30 seconds or so.  This could be even faster,
> but the 2.6.18.8 libata implementation of error-handling is what slows
> us down here.
...

This patch was for 2.6.18.8 -- it *might* apply to newer kernels,
but I haven't ported it forward yet.

Cheers


* Re: PROBLEM: ata_piix.c for the ICH5 SATA Controller.
  2007-06-29 15:56       ` Mark Lord
@ 2007-07-02 14:26         ` Johny Mail list
  2007-07-03 22:45           ` Mark Lord
  0 siblings, 1 reply; 7+ messages in thread
From: Johny Mail list @ 2007-07-02 14:26 UTC (permalink / raw)
  To: Mark Lord; +Cc: Jeff Garzik, linux-ide

This patch don't work in my case.
Sorry, but I don't understand what you mean by "you may need to patch the
patch to contain the correct PCI IDs (from lspci -n)."
Which line in the patch sets the PCI ID?
My lspci -n line for the SATA controller is "00:1f.2 0101: 8086:24d1 (rev 02)".
I have noticed that my kernel locks up when "ata4: port
failed to respond (30sec, Status 0xd0)" is printed.

Thanks


* Re: PROBLEM: ata_piix.c for the ICH5 SATA Controller.
  2007-07-02 14:26         ` Johny Mail list
@ 2007-07-03 22:45           ` Mark Lord
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Lord @ 2007-07-03 22:45 UTC (permalink / raw)
  To: Johny Mail list; +Cc: Jeff Garzik, linux-ide

Johny Mail list wrote:
>
> This patch don't work in my case.

Please elaborate on "don't work".
Especially with a 2.6.18.8 kernel.

> Sorry, but I don't understand what you mean by "you may need to patch the
> patch to contain the correct PCI IDs (from lspci -n)."
> Which line in the patch sets the PCI ID?
> My lspci -n line for the SATA controller is "00:1f.2 0101: 8086:24d1 (rev 02)".

No need to patch it then, as the patch already has 8086:24d1.
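
To illustrate what "matching the PCI IDs" means here: the `vendor:device` pair printed by `lspci -n` is compared against a table of IDs the driver is willing to handle. The sketch below is a standalone userspace illustration, not code from the patch. The table name and helper functions are hypothetical, and the only ID listed is the 8086:24d1 pair mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct demo_pci_id {
	uint16_t vendor;
	uint16_t device;
};

/* Hypothetical table of IDs that get the hotplug handling. */
static const struct demo_pci_id demo_hotplug_ids[] = {
	{ 0x8086, 0x24d1 },	/* the ICH5 SATA ID discussed in this thread */
};

/* Parse a "vvvv:dddd" string as printed by lspci -n; returns 0 on success.
 * This is deliberately lenient: trailing text after the pair is ignored. */
static int demo_parse_pci_id(const char *s, struct demo_pci_id *out)
{
	unsigned int vendor, device;

	assert(s != NULL && out != NULL);

	if (sscanf(s, "%4x:%4x", &vendor, &device) != 2)
		return -1;
	out->vendor = (uint16_t)vendor;
	out->device = (uint16_t)device;
	return 0;
}

/* Return 1 if the ID appears in the table, 0 otherwise. */
static int demo_id_is_handled(const struct demo_pci_id *id)
{
	size_t i;

	for (i = 0; i < sizeof(demo_hotplug_ids) / sizeof(demo_hotplug_ids[0]); i++) {
		if (demo_hotplug_ids[i].vendor == id->vendor &&
		    demo_hotplug_ids[i].device == id->device)
			return 1;
	}
	return 0;
}
```

With this table, the string "8086:24d1" parses and matches, so no extra entry would be needed; a controller reporting a different device ID would need its own row.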

Cheers

