* New IDE code and DMA failures
@ 2002-04-11 17:39 Denis Vlasenko
2002-04-11 12:05 ` Martin Dalecki
2002-04-11 13:05 ` Ted Deppner
0 siblings, 2 replies; 13+ messages in thread
From: Denis Vlasenko @ 2002-04-11 17:39 UTC (permalink / raw)
To: Jens Axboe, Martin Dalecki, Vojtech Pavlik; +Cc: linux-kernel

Hi Jens, Martin, Vojtech,

I have a flaky IDE subsystem in one box. Reads work fine,
writes sometimes don't work and hang either the IDE/block device
subsystem or the entire box. For example, I dumped a ~40 MB file to
the disk and now I have an additional power LED (i.e. the HDD activity
LED is constantly on) and a bunch of "D" state processes
(kupdated, mount, umount).

This is happening since I decided to try 2.5.7.
2.4.18 reported DMA failures and reverted to PIO.

I did send a detailed report of a similar event with
ksymoopsed stack traces of the hung processes to lkml.

Since you are working on the IDE subsystem, I will be glad to
*retain* my flaky IDE setup and test future kernels
for correct operation in this failure mode.

Please inform me whenever you want me to test your patches.
--
vda
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 17:39 New IDE code and DMA failures Denis Vlasenko
@ 2002-04-11 12:05 ` Martin Dalecki
2002-04-11 18:44 ` Denis Vlasenko
2002-04-11 13:05 ` Ted Deppner
1 sibling, 1 reply; 13+ messages in thread
From: Martin Dalecki @ 2002-04-11 12:05 UTC (permalink / raw)
To: vda; +Cc: Jens Axboe, Martin Dalecki, Vojtech Pavlik, linux-kernel

Denis Vlasenko wrote:
> Hi Jens, Martin, Vojtech,

Zdrastwujtie.

> I have a flaky IDE subsystem in one box. Reads work fine,
> writes sometimes don't work and hang either IDE/block device
> subsystem or entire box. For example, I dumped ~40 MB file to
> the disk and now I have additional power led (i.e. hdd activity
> led is constantly on) and a bunch of "D" state processes
> (kupdated, mount, umount).
>
> This is happening since I decided to try 2.5.7.
> 2.4.18 reported DMA failures and reverted to PIO.
>
> I did send a detailed report of similar event with
> ksymoopsed stack traces of hung processes to lkml.
>
> Since you are working on IDE subsystem, I will be glad to
> *retain* my flaky IDE setup and test future kernels
> for correct operation in this failure mode.
>
> Please inform me whenever you want me to test your patches.

Guessing from the symptoms I would rather suggest that:

1. Are you sure you have the support for your chipset properly
enabled? It's almost a must for DMA.

2. Could you please report about the hardware you have. There are
chipsets around there which are using their own transport layer
implementations. host chip (aka south bridge) disk types and so on.

3. Some timeout values got increased to more generally used values (in esp.
IBM microdrives advice about timeout values. Could you see whether
the data doesn't eventually go to the disk after gorgeous
amounts of time.

4. Could you try to set the DMA mode lower than it's set up
per default by using hdparm and try whether it helps?
^ permalink raw reply [flat|nested] 13+ messages in thread
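Editorial sketch for point 4 above: hdparm's -X option takes a numeric transfer mode, which hdparm(8) documents as 8+n for PIO mode n, 32+n for multiword DMA mode n, and 64+n for UltraDMA mode n. The helper names, the device, and the chosen mode below are illustrative assumptions, not taken from the thread; the function only builds the command line and does not run it.

```python
# Base offsets for hdparm's numeric -X argument, per hdparm(8).
XFER_BASE = {"pio": 8, "mdma": 32, "udma": 64}


def xfer_value(family, mode):
    """Return the numeric -X value for e.g. ("mdma", 1)."""
    return XFER_BASE[family] + mode


def hdparm_command(device, family, mode):
    """Build (but do not execute) an hdparm call selecting the given mode."""
    return ["hdparm", "-X%d" % xfer_value(family, mode), device]


if __name__ == "__main__":
    # Hypothetical example: step a drive down to multiword DMA mode 1.
    print(" ".join(hdparm_command("/dev/hdc", "mdma", 1)))
```

Changing transfer modes on a live disk is risky, so a command built this way is best tried on a drive whose contents do not matter.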
* Re: New IDE code and DMA failures
2002-04-11 12:05 ` Martin Dalecki
@ 2002-04-11 18:44 ` Denis Vlasenko
2002-04-11 12:52 ` Martin Dalecki
2002-04-11 15:48 ` Vojtech Pavlik
0 siblings, 2 replies; 13+ messages in thread
From: Denis Vlasenko @ 2002-04-11 18:44 UTC (permalink / raw)
To: Martin Dalecki; +Cc: Jens Axboe, Martin Dalecki, Vojtech Pavlik, linux-kernel

On 11 April 2002 10:05, Martin Dalecki wrote:
> > Since you are working on IDE subsystem, I will be glad to
> > *retain* my flaky IDE setup and test future kernels
> > for correct operation in this failure mode.
> >
> > Please inform me whenever you want me to test your patches.
>
> Guessing from the symptoms I would rather suggest that:
>
> 1. Are you sure you have the support for your chipset properly
> enabled? It's almost a must for DMA.

I am deadly sure.

lspci:
00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 01)
00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)
00:04.2 USB Controller: Intel Corp. 82371AB PIIX4 USB (rev 01)
00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 01)
00:06.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 24)
00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)

/boot/2.4.7/config:
CONFIG_BLK_DEV_PIIX=y

> 2. Could you please report about the hardware you have. There are
> chipsets around there which are using their own transport layer
> implementations. host chip (aka south bridge) disk types and so on.

# hdparm -i /dev/hda
Model=Maxtor 51369U3, FwRev=DA620CQ0, SerialNo=EK3HAE61C
Config={ Fixed }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
BuffType=3(DualPortCache), BuffSize=2048kB, MaxMultSect=16, MultSect=16
DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
CurCHS=17475/15/63, CurSects=16513875, LBA=yes
LBA CHS=512/511/63 Remapping, LBA=yes, LBAsects=26520480
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 *mode2

# hdparm -i /dev/hdc
Model=ST31277A, FwRev=0.75, SerialNo=VAE07701
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=2482/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=0(?), BuffSize=0kB, MaxMultSect=16, MultSect=16
DblWordIO=no, maxPIO=1(medium), DMA=yes, maxDMA=2(fast)
CurCHS=2482/16/63, CurSects=2501856, LBA=yes
LBA CHS=620/64/63 Remapping, LBA=yes, LBAsects=2501856
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4

I have problems with hdc. hda is mostly unused, so maybe it is prone to
DMA errors too, but I have not seen that yet.

> 3. Some timeout values got increased to more generally used values (in esp.
> IBM microdrives advice about timeout values. Could you see whether
> the data doesn't eventually go to the disk after gorgeous
> amounts of time.

Erm.. my English comprehension fails here... do you say my disk
does not like bigger timeouts?

> 4. Could you try to set the DMA mode lower than it's set up
> per default by using hdparm and try whether it helps?

Current params:

# hdparm /dev/hda /dev/hdc
/dev/hda:
multcount = 16 (on)
I/O support = 1 (32-bit)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
BLKRAGET failed: Invalid argument
geometry = 1754/240/63, sectors = 26520480, start = 0

/dev/hdc:
multcount = 16 (on)
I/O support = 1 (32-bit)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
BLKRAGET failed: Invalid argument
geometry = 620/64/63, sectors = 2501856, start = 0

I can't quite figure what MW/UDMA mode is active.
--
vda
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 18:44 ` Denis Vlasenko
@ 2002-04-11 12:52 ` Martin Dalecki
2002-04-11 19:17 ` Denis Vlasenko
2002-04-11 15:48 ` Vojtech Pavlik
1 sibling, 1 reply; 13+ messages in thread
From: Martin Dalecki @ 2002-04-11 12:52 UTC (permalink / raw)
To: vda; +Cc: linux-kernel

Denis Vlasenko wrote:
> # hdparm -i /dev/hda
> Model=Maxtor 51369U3, FwRev=DA620CQ0, SerialNo=EK3HAE61C
> Config={ Fixed }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
> BuffType=3(DualPortCache), BuffSize=2048kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
> CurCHS=17475/15/63, CurSects=16513875, LBA=yes
> LBA CHS=512/511/63 Remapping, LBA=yes, LBAsects=26520480
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
> IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
> UDMA modes: mode0 mode1 *mode2

To answer a later question: the asterisk here denotes the active UDMA mode!

>
> # hdparm -i /dev/hdc
> Model=ST31277A, FwRev=0.75, SerialNo=VAE07701
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
> RawCHS=2482/16/63, TrkSize=0, SectSize=0, ECCbytes=4
> BuffType=0(?), BuffSize=0kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=1(medium), DMA=yes, maxDMA=2(fast)
> CurCHS=2482/16/63, CurSects=2501856, LBA=yes
> LBA CHS=620/64/63 Remapping, LBA=yes, LBAsects=2501856
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
> IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4
>
> I have problems with hdc. hda is mostly unused, so maybe it is DMA errors
> prone too but I have not seen that yet.
>
>
>>3. Some timeout values got increased to more generally used values (in esp.
>> IBM microdrives advice about timeout values. Could you see whether
>> the data doesn't eventually go to the disk after gorgeous
>> amounts of time.
>
>
> Erm.. my English comprehension fails here... do you say my disk
> does not like bigger timeouts?

Please just wait and look whether the driver actually recovers (can be
minutes...)

>
>>4. Could you try to set the DMA mode lower than it's set up
>> per default by using hdparm and try whether it helps?
>
>
> Current params:
>
> # hdparm /dev/hda /dev/hdc
> /dev/hda:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Could you try to disable this please? This can cause trouble
as well.

> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 1754/240/63, sectors = 26520480, start = 0
>
> /dev/hdc:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 620/64/63, sectors = 2501856, start = 0
>
> I can't quite figure what MW/UDMA mode is active.

See above.
^ permalink raw reply [flat|nested] 13+ messages in thread
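Editorial sketch: the asterisk convention Martin points out can be read off mechanically. The snippet below is a minimal illustration that scans the "... modes:" lines of hdparm -i output for the starred entry. The sample text is copied from the output quoted in this thread; the function name and parsing approach are assumptions made for illustration, not part of hdparm.

```python
import re

# Mode lines copied from the hdparm -i output quoted in this thread.
HDA_IDENTIFY = """\
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 *mode2
"""

HDC_IDENTIFY = """\
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4
"""


def active_mode(identify_text):
    """Return (family, mode) for the starred entry, or None if none is starred."""
    pattern = r"(\w+) modes:[ \t]*([^\n,]*)"
    for family, modes in re.findall(pattern, identify_text):
        for token in modes.split():
            if token.startswith("*"):
                return family, token[1:]
    return None


if __name__ == "__main__":
    print("hda:", active_mode(HDA_IDENTIFY))
    print("hdc:", active_mode(HDC_IDENTIFY))
```

On the quoted output this reports UDMA mode2 for hda and multiword DMA mword2 for hdc, which is the answer to the "what MW/UDMA mode is active" question.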
* Re: New IDE code and DMA failures
2002-04-11 12:52 ` Martin Dalecki
@ 2002-04-11 19:17 ` Denis Vlasenko
0 siblings, 0 replies; 13+ messages in thread
From: Denis Vlasenko @ 2002-04-11 19:17 UTC (permalink / raw)
To: Martin Dalecki; +Cc: linux-kernel

On 11 April 2002 10:52, Martin Dalecki wrote:
> >>3. Some timeout values got increased to more generally used values (in
> >> esp. IBM microdrives advice about timeout values. Could you see whether
> >> the data doesn't eventually go to the disk after gorgeous
> >> amounts of time.
> >
> > Erm.. my English comprehension fails here... do you say my disk
> > does not like bigger timeouts?
>
> Please just wait and look whether the driver actually recovers (can be
> minutes...)

I tried that just today. The box continued to work despite kupdated
being hung in "D" state. After a long while the box froze.
SysRq-B worked though.

In my first report to lkml I said that a live disconnect of hdc
cured the "D" state processes (yes, I know I risk burning my southbridge...).
Do you want me to mail it again (there is a ksymoopsed SysRq-T)?

> > unmaskirq = 1 (on)
> Could you try to disable this please? This can cause trouble
> as well.

Will try this, but I don't specifically seek to eliminate freezes;
I want to help debug the new IDE code so that it will be no worse than 2.4
in this failure mode. I don't want to eliminate DMA failures,
I _want to have them_ to see what the IDE code will do.
--
vda
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 18:44 ` Denis Vlasenko
2002-04-11 12:52 ` Martin Dalecki
@ 2002-04-11 15:48 ` Vojtech Pavlik
2002-04-12 14:47 ` Denis Vlasenko
1 sibling, 1 reply; 13+ messages in thread
From: Vojtech Pavlik @ 2002-04-11 15:48 UTC (permalink / raw)
To: Denis Vlasenko
Cc: Martin Dalecki, Jens Axboe, Martin Dalecki, Vojtech Pavlik, linux-kernel

On Thu, Apr 11, 2002 at 04:44:29PM -0200, Denis Vlasenko wrote:
> On 11 April 2002 10:05, Martin Dalecki wrote:
> > > Since you are working on IDE subsystem, I will be glad to
> > > *retain* my flaky IDE setup and test future kernels
> > > for correct operation in this failure mode.
> > >
> > > Please inform me whenever you want me to test your patches.
> >
> > Guessing from the symptoms I would rather suggest that:
> >
> > 1. Are you sure you have the support for your chipset properly
> > enabled? It's almost a must for DMA.
>
> I am deadly sure. lspci:
> 00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
> 00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
> 00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 01)
> 00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)
> 00:04.2 USB Controller: Intel Corp. 82371AB PIIX4 USB (rev 01)
> 00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 01)
> 00:06.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 24)
> 00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
> 00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
>
> /boot/2.4.7/config:
> CONFIG_BLK_DEV_PIIX=y

There's new PIIX code by me in the 2.5 kernels. Can you provide
/proc/ide/piix data (and lspci -vvxxx) as well?

>
> > 2. Could you please report about the hardware you have. There are
> > chipsets around there which are using their own transport layer
> > implementations. host chip (aka south bridge) disk types and so on.
>
> # hdparm -i /dev/hda
> Model=Maxtor 51369U3, FwRev=DA620CQ0, SerialNo=EK3HAE61C
> Config={ Fixed }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
> BuffType=3(DualPortCache), BuffSize=2048kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=0(slow)
> CurCHS=17475/15/63, CurSects=16513875, LBA=yes
> LBA CHS=512/511/63 Remapping, LBA=yes, LBAsects=26520480
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
> IORDY=on/off, tPIO={min:120,w/IORDY:120}, PIO modes: mode3 mode4
> UDMA modes: mode0 mode1 *mode2
>
> # hdparm -i /dev/hdc
> Model=ST31277A, FwRev=0.75, SerialNo=VAE07701
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
> RawCHS=2482/16/63, TrkSize=0, SectSize=0, ECCbytes=4
> BuffType=0(?), BuffSize=0kB, MaxMultSect=16, MultSect=16
> DblWordIO=no, maxPIO=1(medium), DMA=yes, maxDMA=2(fast)
> CurCHS=2482/16/63, CurSects=2501856, LBA=yes
> LBA CHS=620/64/63 Remapping, LBA=yes, LBAsects=2501856
> tDMA={min:120,rec:120}, DMA modes: mword0 mword1 *mword2
> IORDY=on/off, tPIO={min:383,w/IORDY:120}, PIO modes: mode3 mode4
>
> I have problems with hdc. hda is mostly unused, so maybe it is DMA errors
> prone too but I have not seen that yet.
>
> > 3. Some timeout values got increased to more generally used values (in esp.
> > IBM microdrives advice about timeout values. Could you see whether
> > the data doesn't eventually go to the disk after gorgeous
> > amounts of time.
>
> Erm.. my English comprehension fails here... do you say my disk
> does not like bigger timeouts?
>
> > 4. Could you try to set the DMA mode lower than it's set up
> > per default by using hdparm and try whether it helps?
>
> Current params:
>
> # hdparm /dev/hda /dev/hdc
> /dev/hda:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 1754/240/63, sectors = 26520480, start = 0
>
> /dev/hdc:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> BLKRAGET failed: Invalid argument
> geometry = 620/64/63, sectors = 2501856, start = 0
>
> I can't quite figure what MW/UDMA mode is active.
> --
> vda

--
Vojtech Pavlik
SuSE Labs
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 15:48 ` Vojtech Pavlik
@ 2002-04-12 14:47 ` Denis Vlasenko
0 siblings, 0 replies; 13+ messages in thread
From: Denis Vlasenko @ 2002-04-12 14:47 UTC (permalink / raw)
To: Vojtech Pavlik; +Cc: Martin Dalecki, Jens Axboe, Vojtech Pavlik, linux-kernel

On 11 April 2002 13:48, Vojtech Pavlik wrote:
> > lspci:
> > 00:00.0 Host bridge: Intel Corp. 440LX/EX - 82443LX/EX Host bridge (rev 03)
> > 00:01.0 PCI bridge: Intel Corp. 440LX/EX - 82443LX/EX AGP bridge (rev 03)
> > 00:04.0 ISA bridge: Intel Corp. 82371AB PIIX4 ISA (rev 01)
> > 00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01)
> > 00:04.2 USB Controller: Intel Corp. 82371AB PIIX4 USB (rev 01)
> > 00:04.3 Bridge: Intel Corp. 82371AB PIIX4 ACPI (rev 01)
> > 00:06.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 24)
> > 00:0a.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
> > 00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
> >
> > /boot/2.4.7/config:
> > CONFIG_BLK_DEV_PIIX=y
>
> There's new PIIX code by me in the 2.5 kernels. Can you provide
> /proc/ide/piix data (and lspci -vvxxx) as well?

lspci -vvxxx:
00:04.1 IDE interface: Intel Corp. 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64
Region 4: I/O ports at fcb0 [size=16]
00: 86 80 11 71 05 00 80 02 01 80 01 01 00 40 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: b1 fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40: 77 e3 47 e3 0b 00 00 00 01 00 02 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 28 0f 00 00 00 00 00 00

/proc/ide/piix:
----------PIIX BusMastering IDE Configuration---------------
Driver Version:       1.2
South Bridge:         PCI device 8086:7111
Revision:             IDE 0x1
Highest DMA rate:     UDMA33
BM-DMA base:          0xfcb0
PCI clock:            33.3MHz
-----------------------Primary IDE-------Secondary IDE------
Enabled:                      yes                 yes
Simplex only:                  no                  no
Cable Type:                   40w                 40w
-------------------drive0----drive1----drive2----drive3-----
Prefetch+Post:        yes       yes       yes       yes
Transfer Mode:       UDMA       DMA       DMA       PIO
Address Setup:       90ns      90ns      90ns      90ns
Cmd Active:         360ns     360ns     360ns     360ns
Cmd Recovery:       540ns     540ns     540ns     540ns
Data Active:         90ns      90ns      90ns     360ns
Data Recovery:       30ns      30ns      30ns     540ns
Cycle Time:          60ns     120ns     120ns     900ns
Transfer Rate:   33.3MB/s  16.6MB/s  16.6MB/s   2.2MB/s
--
vda
^ permalink raw reply [flat|nested] 13+ messages in thread
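Editorial sketch: the "Transfer Rate" row in the /proc/ide/piix output follows from the "Cycle Time" row, assuming one 16-bit (2-byte) word moves per cycle on the ATA data bus. The function name is an illustrative assumption; the cycle times are the ones reported above.

```python
BUS_WIDTH_BYTES = 2  # the ATA data bus is 16 bits wide


def transfer_rate_mb_s(cycle_ns):
    """Peak rate in MB/s (10^6 bytes/s), assuming one word per cycle."""
    # bytes per nanosecond * 1000 = megabytes per second
    return BUS_WIDTH_BYTES / cycle_ns * 1000.0


if __name__ == "__main__":
    # Cycle times as reported for drive0..drive3 in the output above.
    for drive, cycle_ns in enumerate([60, 120, 120, 900]):
        print("drive%d: %dns -> %.2f MB/s"
              % (drive, cycle_ns, transfer_rate_mb_s(cycle_ns)))
```

The computed values (about 33.33, 16.67 and 2.22 MB/s) line up with the reported 33.3, 16.6 and 2.2 MB/s if the driver truncates rather than rounds, which is a guess. For the DMA and PIO columns the cycle time also equals Data Active plus Data Recovery (90+30 and 360+540); the UDMA column does not follow that sum, and this sketch does not attempt to model it.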
* Re: New IDE code and DMA failures
2002-04-11 17:39 New IDE code and DMA failures Denis Vlasenko
2002-04-11 12:05 ` Martin Dalecki
@ 2002-04-11 13:05 ` Ted Deppner
2002-04-11 14:10 ` Oleg Drokin
1 sibling, 1 reply; 13+ messages in thread
From: Ted Deppner @ 2002-04-11 13:05 UTC (permalink / raw)
To: Denis Vlasenko
Cc: Jens Axboe, Martin Dalecki, Vojtech Pavlik, linux-kernel, Hans Reiser

On Thu, Apr 11, 2002 at 03:39:33PM -0200, Denis Vlasenko wrote:
> I have a flaky IDE subsystem in one box. Reads work fine,
> writes sometimes don't work and hang either IDE/block device
>
> Please inform me whenever you want me to test your patches.

I've been testing 2.4.17 and 2.4.19-pre6 and see some similar issues. I
have an Asus A7V w/ 1gig Athlon processor. Using the onboard Promise
UDMA100 controller, I can read and write all day long to /dev/hde all by
itself... However, after a few minutes of any type of access to /dev/hdh,
/dev/hde suddenly starts having DMA errors and switches to PIO. I'm on my
third DMA66 cable (yet it fits tightly), and am still seeing the exact
same issues. I don't believe my IDE subsystem to be flaky. hde is a WD
drive, and hdh is a Maxtor.

In one of my tests the contents of /dev/hdh were additionally corrupted (a
write test to /dev/hdh1) so badly that the partition information changed
from type 83 to type 3 (Xenix), and the contents of a reiser partition so
badly damaged that a --rebuild-tree and later a --rebuild-sb to reiserfsck
didn't restore it to a usable state. (I put those options in at the request of
reiserfsck, and I haven't wiped the drive yet if someone would like
further tests against the reiserfs partition).
--
Ted Deppner
http://www.psyber.com/~ted/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 13:05 ` Ted Deppner
@ 2002-04-11 14:10 ` Oleg Drokin
2002-04-13 0:58 ` Ted Deppner
0 siblings, 1 reply; 13+ messages in thread
From: Oleg Drokin @ 2002-04-11 14:10 UTC (permalink / raw)
To: ted, linux-kernel, Hans Reiser

Hello!

On Thu, Apr 11, 2002 at 06:05:44AM -0700, Ted Deppner wrote:
> In one of my tests the contents of /dev/hdh were additionally corrupted (a
> write test to /dev/hdh1) so badly that the partition information changed
> from type 83 to type 3 (Xenix), and the contents of a reiser partition so
> badly damaged that a --rebuild-tree and later a --rebuild-sb to reiserfsck
> didn't restore it to a usable state. (I put those options in at the request of
> reiserfsck, and I haven't wiped the drive yet if someone would like
> further tests against the reiserfs partition).

We are interested in damaged partitions that make the current reiserfsck
segfault or incorrectly repair the FS (incorrectly in the sense that
a subsequent reiserfsck run finds more errors).
Is this the case with you?

Bye,
Oleg
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 14:10 ` Oleg Drokin
@ 2002-04-13 0:58 ` Ted Deppner
2002-04-13 7:07 ` Oleg Drokin
0 siblings, 1 reply; 13+ messages in thread
From: Ted Deppner @ 2002-04-13 0:58 UTC (permalink / raw)
To: Oleg Drokin; +Cc: ted, linux-kernel, Hans Reiser

On Thu, Apr 11, 2002 at 06:10:27PM +0400, Oleg Drokin wrote:
> We are interested in damaged partitions that make the current reiserfsck
> segfault or incorrectly repair the FS (incorrectly in the sense that
> a subsequent reiserfsck run finds more errors).
> Is this the case with you?

Subsequent runs of reiserfsck are no longer finding new errors. There
were several cases where --rebuild-tree segfaulted reiserfsck -- HOWEVER
this was before I got the DMA errors ironed out.

Now that the DMA errors are taken care of, I've not been able to get
reiserfsck to behave oddly.
--
Ted Deppner
http://www.psyber.com/~ted/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-13 0:58 ` Ted Deppner
@ 2002-04-13 7:07 ` Oleg Drokin
0 siblings, 0 replies; 13+ messages in thread
From: Oleg Drokin @ 2002-04-13 7:07 UTC (permalink / raw)
To: ted, linux-kernel, Hans Reiser

Hello!

On Fri, Apr 12, 2002 at 05:58:05PM -0700, Ted Deppner wrote:
> > We are interested in damaged partitions that make the current reiserfsck
> > segfault or incorrectly repair the FS (incorrectly in the sense that
> > a subsequent reiserfsck run finds more errors).
> > Is this the case with you?
> Subsequent runs of reiserfsck are no longer finding new errors. There
> were several cases where --rebuild-tree segfaulted reiserfsck -- HOWEVER
> this was before I got the DMA errors ironed out.

Still, reiserfsck should not segfault.
What version of reiserfsprogs do you have?
Have you saved core files?

Bye,
Oleg
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
@ 2002-04-11 16:27 Petr Vandrovec
2002-04-13 1:02 ` Ted Deppner
0 siblings, 1 reply; 13+ messages in thread
From: Petr Vandrovec @ 2002-04-11 16:27 UTC (permalink / raw)
To: Ted Deppner; +Cc: Jens Axboe, Martin Dalecki, Vojtech Pavlik, linux-kernel, vda
On 11 Apr 02 at 6:05, Ted Deppner wrote:
> On Thu, Apr 11, 2002 at 03:39:33PM -0200, Denis Vlasenko wrote:
> > I have a flaky IDE subsystem in one box. Reads work fine,
> > writes sometimes don't work and hang either IDE/block device
> >
> > Please inform me whenever you want me to test your patches.
>
> I've been testing 2.4.17 and 2.4.19-pre6 and see some similar issues. I
> have an Asus A7V w/ 1gig Athlon processor. Using the onboard Promise
> UDMA100 controller, I can read and write all day long to /dev/hde all by
> itself... However, after a few minutes of any type of access to /dev/hdh,
> /dev/hde suddenly starts having DMA errors and switches to PIO. I'm on my
> third DMA66 cable (yet it fits tightly), and am still seeing the exact
> same issues. I don't believe my IDE subsystem to be flaky. hde is a WD
> drive, and hdh is a Maxtor.
What is your /dev/hdg? Using slave-alone on the A7V's Promise (and maybe
on other motherboards too) will corrupt your disk badly. Under Linux,
and also under Windows98. I did not try other OSes...
Petr Vandrovec
vandrove@vc.cvut.cz
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: New IDE code and DMA failures
2002-04-11 16:27 Petr Vandrovec
@ 2002-04-13 1:02 ` Ted Deppner
0 siblings, 0 replies; 13+ messages in thread
From: Ted Deppner @ 2002-04-13 1:02 UTC (permalink / raw)
To: Petr Vandrovec
Cc: Ted Deppner, Jens Axboe, Martin Dalecki, Vojtech Pavlik, linux-kernel, vda

On Thu, Apr 11, 2002 at 05:27:29PM +0100, Petr Vandrovec wrote:
> What is your /dev/hdg? Using slave-alone on the A7V's Promise (and maybe
> on other motherboards too) will corrupt your disk badly. Under Linux,
> and also under Windows98. I did not try other OSes...

I did not have a /dev/hdg. I searched and found your emails to
linux-kernel regarding your findings on quirks with the PDC20265
controller and moved /dev/hdh to /dev/hdg.

I've not had a single DMA error since, regardless of how much I've tried
to break things. Previously I was able to fail things within a few
minutes.

Thank you!

At this point I am chalking these issues up to hardware quirks of my A7V's
onboard controller... I cannot say that there is anything amiss in the
kernel (or with reiserfs) in light of these findings.
--
Ted Deppner
http://www.psyber.com/~ted/
^ permalink raw reply [flat|nested] 13+ messages in thread
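Editorial sketch: the "slave-alone" condition Petr describes can be spotted from the device names alone. With the legacy Linux IDE naming, devices pair up per channel (hda/hdb, hdc/hdd, hde/hdf, hdg/hdh), the first of each pair being the master and the second the slave. The function name is an illustrative assumption; the two device lists reflect only the drives Ted mentions on the Promise controller.

```python
def lone_slaves(present):
    """Return slave devices whose master on the same channel is absent."""
    found = set(present)
    result = []
    for name in sorted(found):
        index = ord(name[-1]) - ord("a")  # hda -> 0, hdb -> 1, ...
        if index % 2 == 1:                # odd index means slave position
            master = name[:-1] + chr(ord(name[-1]) - 1)
            if master not in found:
                result.append(name)
    return result


if __name__ == "__main__":
    print(lone_slaves(["hde", "hdh"]))  # Ted's original layout
    print(lone_slaves(["hde", "hdg"]))  # after moving hdh to hdg
```

For the original layout this flags hdh, and for the rearranged one it flags nothing, matching the change that made the DMA errors disappear.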