From: Gerard Sharp <gsharp@ihug.co.nz>
To: linux-kernel@vger.kernel.org
Subject: HPT366 + SMP = slight corruption in 2.3.99 - 2.4.0-11
Date: Sat, 02 Dec 2000 00:04:27 +1300
Message-ID: <3A2785BB.EB36DDE0@ihug.co.nz>

Hello.
[1.] One line summary of the problem:    
Intermittent corruption of 4 bytes in SMP kernels using HPT366

[2.] Full description of the problem/report:
First noticed in 2.3.99-preX, but hard to track down then.
When the system is under load - e.g. cp -r /usr/src/linux /usr/src/l2 -
it occasionally and randomly corrupts some files; possibly multiple
times per file, possibly multiple files. Exactly 4 bytes are always
altered per corruption.
Nothing shows up in the logs: no oopses, no messages.
Tests on 2.3.99 found the problem to be unreproducible on UP kernels.
Tests on the current kernel found the problem to be unreproducible on
the BX chipset's own ATA33 controller.
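
For anyone wanting to check the "exactly 4 bytes per corruption" pattern
on their own files, here is a minimal sketch (not part of the testing
described above) that lists each contiguous run of differing bytes
between an original file and its copy. The names diff_runs and report
are hypothetical. Note that a corrupted run can appear shorter than 4
bytes if some of the altered bytes happen to match the original values.

```python
from pathlib import Path


def diff_runs(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for each contiguous run of differing bytes.

    Only the common prefix of the two buffers is compared; a size
    mismatch is reported separately by report().
    """
    runs = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new run of differences begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:          # run extends to the end of the data
        runs.append((start, common - start))
    return runs


def report(original: str, copy: str) -> None:
    """Print every differing run between two files (reads both fully)."""
    a = Path(original).read_bytes()
    b = Path(copy).read_bytes()
    if len(a) != len(b):
        print(f"{copy}: size differs ({len(a)} vs {len(b)} bytes)")
    for offset, length in diff_runs(a, b):
        print(f"{copy}: {length} byte(s) differ at offset {offset:#x}")
```

Calling report() on a file that diff flagged, together with its
original, prints one line per damaged region along with its offset.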

[3.] Keywords (i.e., modules, networking, kernel):
IDE, HPT366, EXT2, SMP, Corruption, Worrying

[4.] Kernel version (from /proc/version):
#cat /proc/version 
Linux version 2.4.0-test11-ac4-smp (root@midnight) (gcc version
egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #2 SMP Tue Nov 28
22:38:21 NZDT 2000

[5.]
Nada

[6.] A small shell script or example program which triggers the
     problem (if possible)
cp -r /usr/src/linux /usr/src/l2 ; diff -dur /usr/src/linux /usr/src/l2
The problem has shown up if diff produces any output.
The system may 'survive' two copies (I tend to use a different, uncached
kernel tree for each attempt, to rule out/minimise the effect of caching)
but 'fail' the third,
where 'survive' = no corruption; 'fail' = some / lots of corruption.
High memory usage increases the likelihood; hitting swap at ALL seems to
increase the likelihood (swap is on the same drive).
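
The copy-then-compare test above can also be sketched as a small script,
which makes it easier to repeat trials and collect the list of damaged
files. This is only an illustration under assumptions: run_trial,
corrupted_files and same_contents are hypothetical names, and the
comparison sees whatever the kernel returns on read, so data still held
in the page cache may hide on-disk damage.

```python
import os
import shutil


def same_contents(a: str, b: str, chunk: int = 1 << 16) -> bool:
    """Compare two regular files byte for byte, reading in chunks."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(chunk), fb.read(chunk)
            if ca != cb:
                return False
            if not ca:             # both files reached EOF together
                return True


def corrupted_files(src: str, dst: str) -> list[str]:
    """Return relative paths of regular files in src that are missing
    from dst or whose contents differ."""
    bad = []
    for root, _dirs, files in os.walk(src):
        for name in files:
            s = os.path.join(root, name)
            if os.path.islink(s) or not os.path.isfile(s):
                continue           # skip symlinks and special files
            rel = os.path.relpath(s, src)
            d = os.path.join(dst, rel)
            if not os.path.isfile(d) or not same_contents(s, d):
                bad.append(rel)
    return sorted(bad)


def run_trial(src: str, dst: str) -> list[str]:
    """Copy src to a fresh dst, then list files that did not survive."""
    shutil.rmtree(dst, ignore_errors=True)
    shutil.copytree(src, dst, symlinks=True)
    return corrupted_files(src, dst)
```

An empty list from run_trial("/usr/src/linux", "/usr/src/l2") would
correspond to a 'survive'; any entries would correspond to a 'fail'.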

[7.] Environment
Red Hat 6.2 base.
Abit BP6 motherboard.
Dual Celeron 466s
128 MB RAM; 13.6 GB Seagate Barracuda HDD
"hda: ST313620A, ATA DISK drive"
CD-ROM on hdd

[7.1.] Software (add the output of the ver_linux script here)

-- Versions installed: (if some fields are empty or look
-- unusual then possibly you have very old versions)
Linux midnight 2.4.0-test11-ac4-smp #2 SMP Tue Nov 28 22:38:21 NZDT 2000
i686 unknown
Kernel modules         2.3.13
Gnu C                  egcs-2.91.66
Gnu Make               3.78.1
Binutils               2.9.5.0.22
Linux C Library        2.1.3
Dynamic linker         ldd (GNU libc) 2.1.3
Procps                 2.0.6
Mount                  2.10q
Net-tools              1.54
Console-tools          0.3.3
Sh-utils               2.0

[7.2.] Processor information (from /proc/cpuinfo):

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 6
model name      : Celeron (Mendocino)
stepping        : 5
cpu MHz         : 467.000741
cache size      : 128 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips        : 933.89

processor       : 0
vendor_id       : GenuineIntel
...

[7.3.] Module information (from /proc/modules):
Doesn't Impact Problem.

[7.4.] Loaded driver and hardware information (/proc/ioports,
/proc/iomem)
#cat /proc/ioports 
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0070-007f : rtc
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
0220-022f : soundblaster
02f8-02ff : serial(auto)
0376-0376 : ide1
03c0-03df : vga+
  03c0-03df : matrox
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
4000-403f : Intel Corporation 82371AB PIIX4 ACPI
5000-501f : Intel Corporation 82371AB PIIX4 ACPI
  5000-5007 : piix4-smbus
d000-d01f : Intel Corporation 82371AB PIIX4 USB
d400-d4ff : Realtek Semiconductor Co., Ltd. RTL-8139
  d400-d4ff : eth0
d800-d807 : Triones Technologies, Inc. HPT366
dc00-dc03 : Triones Technologies, Inc. HPT366
e000-e0ff : Triones Technologies, Inc. HPT366
  e000-e007 : ide2
  e010-e0ff : HPT366
e400-e407 : Triones Technologies, Inc. HPT366 (#2)
e800-e803 : Triones Technologies, Inc. HPT366 (#2)
ec00-ecff : Triones Technologies, Inc. HPT366 (#2)
  ec00-ec07 : ide3
  ec10-ecff : HPT366
f000-f00f : Intel Corporation 82371AB PIIX4 IDE
  f000-f007 : ide0
  f008-f00f : ide1

#cat /proc/iomem   
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-07ffffff : System RAM
  00100000-0021232f : Kernel code
  00212330-002239ff : Kernel data
e0000000-e3ffffff : Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge
e4000000-e4003fff : Matrox Graphics, Inc. MGA 1064SG [Mystique]
  e4000000-e4003fff : matroxfb MMIO
e5000000-e57fffff : Matrox Graphics, Inc. MGA 1064SG [Mystique]
  e5000000-e57fffff : matroxfb FB
e6000000-e67fffff : Matrox Graphics, Inc. MGA 1064SG [Mystique]
e9000000-e90000ff : Realtek Semiconductor Co., Ltd. RTL-8139
  e9000000-e90000ff : eth0
fec00000-fec00fff : reserved
fee00000-fee00fff : reserved
ffff0000-ffffffff : reserved


[7.5.] PCI information ('lspci -vvv' as root)
===
#lspci -vvv | less
00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 03)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 32 set
        Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 1.0
                Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
                Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>

00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 set
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: fff00000-000fffff
        Prefetchable memory behind bridge: fff00000-000fffff
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B+

00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0 set

00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 set
        Region 4: I/O ports at f000 [size=16]

00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 set
        Interrupt: pin D routed to IRQ 19
        Region 4: I/O ports at d000 [size=32]

00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:0b.0 VGA compatible controller: Matrox Graphics, Inc. MGA 1064SG [Mystique] (rev 02) (prog-if 00 [VGA])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 set
        Interrupt: pin A routed to IRQ 18
        Region 0: Memory at e4000000 (32-bit, non-prefetchable) [size=16K]
        Region 1: Memory at e5000000 (32-bit, prefetchable) [size=8M]
        Region 2: Memory at e6000000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at <unassigned> [disabled] [size=64K]

00:0f.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 min, 64 max, 32 set
        Interrupt: pin A routed to IRQ 16
        Region 0: I/O ports at d400 [size=256]
        Region 1: Memory at e9000000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- AuxPwr- DSI- D1+ D2+ PME-
                Status: D0 PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [60] Vital Product Data

00:13.0 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev 01)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 8 min, 8 max, 120 set, cache line size 08
        Interrupt: pin A routed to IRQ 18
        Region 0: I/O ports at d800 [size=8]
        Region 1: I/O ports at dc00 [size=4]
        Region 4: I/O ports at e000 [size=256]
        Expansion ROM at e8000000 [disabled] [size=128K]

00:13.1 Unknown mass storage controller: Triones Technologies, Inc. HPT366 (rev 01)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 8 min, 8 max, 120 set, cache line size 08
        Interrupt: pin B routed to IRQ 18
        Region 0: I/O ports at e400 [size=8]
        Region 1: I/O ports at e800 [size=4]
        Region 4: I/O ports at ec00 [size=256]
===

[7.6.] SCSI information (from /proc/scsi/scsi)
Nada

[7.7.]
snippets from dmesg:
=== <hard drive on hde> ===
HPT366: onboard version of chipset, pin1=1 pin2=2 
HPT366: IDE controller on PCI bus 00 dev 98 
PCI: Enabling device 00:13.0 (0005 -> 0007) 
HPT366: chipset revision 1 
HPT366: not 100% native mode: will probe irqs later 
    ide2: BM-DMA at 0xe000-0xe007, BIOS settings: hde:DMA, hdf:pio 
HPT366: IDE controller on PCI bus 00 dev 99 
HPT366: chipset revision 1 
HPT366: not 100% native mode: will probe irqs later 
    ide3: BM-DMA at 0xec00-0xec07, BIOS settings: hdg:pio, hdh:pio 
hdd: FX240S, ATAPI CDROM drive 
hde: ST313620A, ATA DISK drive 
ide1 at 0x170-0x177,0x376 on irq 15 
ide2 at 0xd800-0xd807,0xdc02 on irq 18 
hde: 26692776 sectors (13667 MB) w/512KiB Cache, CHS=26480/16/63, UDMA(66)
=== </hard drive on hde> ===

=== <hard drive on hda> ===
HPT366: onboard version of chipset, pin1=1 pin2=2
HPT366: IDE controller on PCI bus 00 dev 98
PCI: Enabling device 00:13.0 (0005 -> 0007)
HPT366: chipset revision 1
HPT366: not 100% native mode: will probe irqs later
    ide2: BM-DMA at 0xe000-0xe007, BIOS settings: hde:pio, hdf:pio
HPT366: IDE controller on PCI bus 00 dev 99
HPT366: chipset revision 1
HPT366: not 100% native mode: will probe irqs later
    ide3: BM-DMA at 0xec00-0xec07, BIOS settings: hdg:pio, hdh:pio
hda: ST313620A, ATA DISK drive
hdd: FX240S, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 26692776 sectors (13667 MB) w/512KiB Cache, CHS=1661/255/63, UDMA(33)
=== </hard drive on hda> ===


[X.] Other notes, patches, fixes, workarounds:

Only current workaround is to avoid the HPT chip :(

I can't help but worry that (especially after the volume of this email)
it's a simple problem / my fault; however, I have not seen anything
specific to this in the past few months.

I can offer to help debug, but my time is limited due to the twin evils
of Work and Sleep, and I don't have too many leads, what with no error
output; just silent corruption :(


Gerard Sharp
Two Penguins at 1024x768
