public inbox for linux-kernel@vger.kernel.org
* 2.4 ate my filesystem on rw-mount
@ 2001-01-12  9:15 Tobias Ringstrom
  2001-01-12 10:22 ` Alan Cox
  2001-01-12 19:49 ` Vojtech Pavlik
  0 siblings, 2 replies; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-12  9:15 UTC (permalink / raw)
  To: Kernel Mailing List

I've never seen anything like it before, which I'm happy for.  The system
had been running a standard RedHat 7 kernel for days without any problems,
but who wants to run a 2.2 kernel?  I compiled 2.4.0 for it, rebooted, and
blam!  The RedHat init scripts got to the "remounting root read-write"
point, and just froze solid.

Rebooting into RH7 failed, because inittab could not be found.  In fact
the filesystem was completely messed up, with /dev empty, lots of device
nodes in /etc, and files missing all over the place.  I had to reinstall
RH7 from scratch.

I do not understand how this could happen while remounting root rw.
Is the filesystem really that unstable?

Am I right in suspecting DMA, which was enabled at the time?  Any other
ideas?  Is it a known problem?

This is on a 450 MHz AMD-K6 with the following IDE controller:

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

[I know this is not a very good trouble report, but it will have to do for
the time being.  I hope to do more testing at a later time.]

/Tobias

PS. This is _not_ the same system that I reported IDE busy errors for.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.4 ate my filesystem on rw-mount
  2001-01-12  9:15 2.4 ate my filesystem on rw-mount Tobias Ringstrom
@ 2001-01-12 10:22 ` Alan Cox
  2001-01-12 17:23   ` Martin Laberge
  2001-01-12 19:49 ` Vojtech Pavlik
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Cox @ 2001-01-12 10:22 UTC (permalink / raw)
  To: Tobias Ringstrom; +Cc: Kernel Mailing List

> This is on a 450 MHz AMD-K6 with the following IDE controller:
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

There are several people who have reported that the 2.4.0 VIA IDE driver
trashes hard disks like that. The 2.2 one also did this sometimes but only
with specific chipset versions and if you have dma autotune on (that's why
2.2 currently refuses to do tuning on VP3).


* Re: 2.4 ate my filesystem on rw-mount
  2001-01-12 10:22 ` Alan Cox
@ 2001-01-12 17:23   ` Martin Laberge
  2001-01-12 19:51     ` Vojtech Pavlik
  0 siblings, 1 reply; 16+ messages in thread
From: Martin Laberge @ 2001-01-12 17:23 UTC (permalink / raw)
  To: Alan Cox; +Cc: Tobias Ringstrom, Kernel Mailing List

Alan Cox wrote:

> > This is on a 450 MHz AMD-K6 with the following IDE controller:
> > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
>
> There are several people who have reported that the 2.4.0 VIA IDE driver
> trashes hard disks like that. The 2.2 one also did this sometimes but only
> with specific chipset versions and if you have dma autotune on (that's why
> 2.2 currently refuses to do tuning on VP3).

I had exactly the same problem with my K6-350 and IDE VT82C586a
on kernel 2.2.16 ..... I just ran hdparm to enable DMA and pooffff....
lost all data .... reinstall necessary from scratch

Martin Laberge
mlsoft@videotron.ca



* Re: 2.4 ate my filesystem on rw-mount
  2001-01-12  9:15 2.4 ate my filesystem on rw-mount Tobias Ringstrom
  2001-01-12 10:22 ` Alan Cox
@ 2001-01-12 19:49 ` Vojtech Pavlik
  2001-01-13  8:12   ` Tobias Ringstrom
  1 sibling, 1 reply; 16+ messages in thread
From: Vojtech Pavlik @ 2001-01-12 19:49 UTC (permalink / raw)
  To: Tobias Ringstrom; +Cc: Kernel Mailing List

On Fri, Jan 12, 2001 at 10:15:45AM +0100, Tobias Ringstrom wrote:
> I've never seen anything like it before, which I'm happy for.  The system
> had been running a standard RedHat 7 kernel for days without any problems,
> but who wants to run a 2.2 kernel?  I compiled 2.4.0 for it, rebooted, and
> blam!  The RedHat init scripts got to the "remounting root read-write"
> point, and just froze solid.
> 
> Rebooting into RH7 failed, because inittab could not be found.  In fact
> the filesystem was completely messed up, with /dev empty, lots of device
> nodes in /etc, and files missing all over the place.  I had to reinstall
> RH7 from scratch.
> 
> I do not understand how this could happen while remounting root rw.
> Is the filesystem really that unstable?
> 
> Am I right in suspecting DMA, which was enabled at the time?  Any other
> ideas?  Is it a known problem?
> 
> This is on a 450 MHz AMD-K6 with the following IDE controller:
> 
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> 
> [I know this is not a very good trouble report, but it will have to do for
> the time being.  I hope to do more testing at a later time.]
> 
> /Tobias
> 
> PS. This is _not_ the same system that I reported IDE busy errors for.

Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
about this:

1) What's the ISA bridge revision?
2) What's in /proc/ide/via?
3) What does hdparm -i say for your devices?
4) If you mount your filesystem read-only, does it read garbage?

Thanks.

-- 
Vojtech Pavlik
SuSE Labs

* Re: 2.4 ate my filesystem on rw-mount
  2001-01-12 17:23   ` Martin Laberge
@ 2001-01-12 19:51     ` Vojtech Pavlik
  0 siblings, 0 replies; 16+ messages in thread
From: Vojtech Pavlik @ 2001-01-12 19:51 UTC (permalink / raw)
  To: Martin Laberge; +Cc: Alan Cox, Tobias Ringstrom, Kernel Mailing List

On Fri, Jan 12, 2001 at 12:23:21PM -0500, Martin Laberge wrote:

> > > This is on a 450 MHz AMD-K6 with the following IDE controller:
> > > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> >
> > There are several people who have reported that the 2.4.0 VIA IDE driver
> > trashes hard disks like that. The 2.2 one also did this sometimes but only
> > with specific chipset versions and if you have dma autotune on (that's why
> > 2.2 currently refuses to do tuning on VP3).
> 
> I had exactly the same problem with my K6-350 and IDE VT82C586a
> on kernel 2.2.16 ..... I just ran hdparm to enable DMA and pooffff....
> lost all data .... reinstall necessary from scratch

Is this problem still present with 2.4.0? Well, you don't need to kill
your data to test this - make sure the kernel is mounting the
filesystems read-only in the test. DMA will probably be enabled
automatically for your drives.

-- 
Vojtech Pavlik
SuSE Labs

* Re: 2.4 ate my filesystem on rw-mount
  2001-01-12 19:49 ` Vojtech Pavlik
@ 2001-01-13  8:12   ` Tobias Ringstrom
  2001-01-13 14:35     ` Vojtech Pavlik
  0 siblings, 1 reply; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-13  8:12 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

On Fri, 12 Jan 2001, Vojtech Pavlik wrote:
> Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
> about this:
>
> 1) What's the ISA bridge revision?

00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super AC97/Audio] (rev 21)
00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

> 2) What's in /proc/ide/via?

It's not there since I disabled the VIA driver.

> 3) What does hdparm -i say for your devices?

/dev/hda:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2

> 4) If you mount your filesystem read-only, does it read garbage?

Now here's a strange part, or possibly a crucial clue.  When I booted a
2.4.0 kernel (from floppy using the excellent syslinux) with "ro
init=/bin/sh", I could access the filesystem just fine.  I could even
remount the root filesystem rw, and there were no problems.  But I did not
write anything to the disk, since I was convinced that the problem was
gone (this was the second try).  After this I rebooted with
ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
booted up the RH7 2.2.16 kernel, and fsck was run with no errors.  Now I
thought all was well, rebooted from floppy again, but without the init=
part, and poof, it hung.

More interesting may be that I had to turn the computer off and on again
to get BIOS to find the hard drive.  Repeated long reset button presses
did not help.  It is possible that it hung during BIOS hd detection - I
wish I could remember.

I suspect that I could have hung the drive with init=/bin/sh if I would
have done some reading and writing to the device, besides ls.

I think I can spend some more time today trying it out some more.  I will
also try your 3.11 driver, which seems to be an enormous cleanup.  Btw, do
you have a home page for the VIA driver?  A CVS perhaps?  If not, please
consider using sourceforge or something similar.

/Tobias


* Re: 2.4 ate my filesystem on rw-mount
  2001-01-13  8:12   ` Tobias Ringstrom
@ 2001-01-13 14:35     ` Vojtech Pavlik
  2001-01-13 16:20       ` Tobias Ringstrom
  0 siblings, 1 reply; 16+ messages in thread
From: Vojtech Pavlik @ 2001-01-13 14:35 UTC (permalink / raw)
  To: Tobias Ringstrom; +Cc: Kernel Mailing List

On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:

> > Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
> > about this:
> >
> > 1) What's the ISA bridge revision?
> 
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> 00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
> 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
> 00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super AC97/Audio] (rev 21)
> 00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
> 01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

Ok, your IDE chip is a vt82c686a/ce.

> > 2) What's in /proc/ide/via?
> 
> It's not there since I disabled the VIA driver.

Ok. Could you send me this file when you boot with fs r-o?

> > 3) What does hdparm -i say for your devices?
> 
> /dev/hda:
> 
>  Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
>  Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
>  RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
>  BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
>  CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
>  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
>  PIO modes: pio0 pio1 pio2 pio3 pio4
>  DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2

Looks good, too. A UDMA33 drive.
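
For readers following along, the "UDMA33" reading comes from the starred entries in the "DMA modes" line of the hdparm -i output quoted above. A minimal sketch of that reading follows; the helper name selected_modes is made up for illustration, and the rate table holds the nominal ATA peak figures for UDMA modes 0 to 2, not anything measured in this thread.

```python
# Nominal peak transfer rates in MB/s for UDMA modes 0-2 (ATA figures,
# included here for illustration; they are not taken from the thread).
UDMA_RATE_MB_S = {"udma0": 16.7, "udma1": 25.0, "udma2": 33.3}


def selected_modes(hdparm_line):
    """Return the mode names that the given hdparm -i line marks with '*'."""
    _, _, modes = hdparm_line.partition(":")
    return [m[1:] for m in modes.split() if m.startswith("*")]


# The line as quoted in the message above.
line = " DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2"
active = selected_modes(line)
print(active)
for mode in active:
    if mode in UDMA_RATE_MB_S:
        print(mode, "is nominally", UDMA_RATE_MB_S[mode], "MB/s")
```

The last starred entry here is udma2, which is the mode commonly called UDMA33.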

> > 4) If you mount your filesystem read-only, does it read garbage?
> 
> Now here's a strange part, or possibly a crucial clue.  When I booted a
> 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
> init=/bin/sh", I could access the filesystem just fine.  I could even
> remount the root filesystem rw, and there were no problems.  But I did not
> write anything to the disk, since I was convinced that the problem was
> gone (this was the second try).  After this I rebooted with
> ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
> booted up the RH7 2.2.16 kernel, and fsck was run with no errors.

So far no problem. Rebooting with c-a-d with fs r-o is OK.

> Now I
> thought all was well, rebooted from floppy again, but without the init=
> part, and poof, it hung.

Where? It could be a different reason than IDE setup ...

> More interesting may be that I had to turn the computer off and on again
> to get BIOS to find the hard drive. Repeated long reset button presses
> did not help.  It is possible that it hung during BIOS hd detection - I
> wish I could remember.

I fear this isn't much of a clue, sorry.

> I suspect that I could have hung the drive with init=/bin/sh if I would
> have done some reading and writing to the device, besides ls.

Please try it. Best mke2fs your swap partition and try reading & writing
to that. You can mkswap it back after you finish.
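
A rough sketch of the kind of write-then-read-back check suggested here, assuming a scratch target whose contents may be destroyed. The function name write_and_verify, the seed, and the sizes are all made up for illustration; the demo runs against a temporary file rather than a real partition. Note that the read-back may be served from the page cache instead of the disk, so a real test would need a target larger than RAM or some other way around the cache.

```python
import hashlib
import os
import tempfile


def write_and_verify(path, total_bytes, chunk_bytes=1 << 20, seed=b"scratch-test"):
    """Write a deterministic pattern to path, read it back, compare SHA-256 digests."""

    def pattern():
        # Deterministic chunks derived from the seed and a running counter.
        written = 0
        counter = 0
        while written < total_bytes:
            block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
            size = min(chunk_bytes, total_bytes - written)
            yield (block * (size // len(block) + 1))[:size]
            written += size
            counter += 1

    expected = hashlib.sha256()
    with open(path, "wb") as f:
        for chunk in pattern():
            f.write(chunk)
            expected.update(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data out to the device

    actual = hashlib.sha256()
    remaining = total_bytes
    with open(path, "rb") as f:
        while remaining > 0:
            data = f.read(min(chunk_bytes, remaining))
            if not data:
                break
            actual.update(data)
            remaining -= len(data)
    return expected.digest() == actual.digest()


# Demo on a throwaway file; a real test would point at the scratch partition.
with tempfile.TemporaryDirectory() as scratch_dir:
    demo_path = os.path.join(scratch_dir, "scratch.bin")
    print("match:", write_and_verify(demo_path, 4 * 1024 * 1024))
```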

> I think I can spend some more time today trying it out some more.

Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
driver and with 3.11 driver would help me find the problem.

Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
stuff in your init scripts.

> I will
> also try your 3.11 driver, which seems to be an enormous cleanup.

The 2.1e driver is an enormous cleanup of the original driver from the
2.2 kernels. The 3.11 is an enormous cleanup of 2.1e, yes.

> Btw, do
> you have a home page for the VIA driver?  A CVS perhaps?  If not, please
> consider using sourceforge or something similar.

No, not yet, but working on that.

-- 
Vojtech Pavlik
SuSE Labs

* Re: 2.4 ate my filesystem on rw-mount
  2001-01-13 14:35     ` Vojtech Pavlik
@ 2001-01-13 16:20       ` Tobias Ringstrom
  2001-01-13 22:36         ` 2.4 ate my filesystem on rw-mount, getting closer Tobias Ringstrom
  0 siblings, 1 reply; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-13 16:20 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

On Sat, 13 Jan 2001, Vojtech Pavlik wrote:

> On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:
> > > 2) What's in /proc/ide/via?
> >
> > It's not there since I disabled the VIA driver.
>
> Ok. Could you send me this file when you boot with fs r-o?

Ok, but this is with the wrong disc.  With the bad disc, drive0 looks
exactly like drive2, i.e. normal UDMA(33).  Sorry about that.

----------VIA BusMastering IDE Configuration----------------
Driver Version:                     2.1e
South Bridge:                       VIA vt82c686a rev 0x1b
Command register:                   0x7
Latency timer:                      32
PCI clock:                          33MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
FIFO Output Data 1/2 Clock Advance: off
BM IDE Status Register Read Retry:  on
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:           on                  on
End Sect. FIFO flush:          on                  on
Prefetch Buffer:               on                  on
Post Write Buffer:             on                  on
FIFO size:                      8                   8
Threshold Prim.:              1/2                 1/2
Bytes Per Sector:             512                 512
Both channels togth:          yes                 yes
-------------------drive0----drive1----drive2----drive3-----
BMDMA enabled:        yes       yes       yes       yes
Transfer Mode:       UDMA   DMA/PIO      UDMA   DMA/PIO
Address Setup:       30ns     120ns      30ns     120ns
Active Pulse:        90ns     330ns      90ns     330ns
Recovery Time:       30ns     270ns      30ns     270ns
Cycle Time:          30ns     600ns      60ns     600ns
Transfer Rate:   66.0MB/s   3.3MB/s  33.0MB/s   3.3MB/s
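
As a quick cross-check of the last two rows of this table: assuming one 16-bit word (2 bytes) moves per cycle, the peak rate follows directly from the cycle time. This is only a sketch under that assumption; the function name is made up, and the driver's own figures (66.0 and 33.0) come out slightly lower than the computed ones, presumably because of how it derives them from the PCI clock, which is a guess.

```python
def transfer_rate_mb_s(cycle_ns, bytes_per_cycle=2):
    """Peak rate in MB/s (10^6 bytes/s) when bytes_per_cycle move every cycle_ns."""
    return bytes_per_cycle * 1000.0 / cycle_ns


# Cycle times as listed in the table above.
for cycle in (30, 60, 600):
    print(f"{cycle:>3} ns -> {transfer_rate_mb_s(cycle):.1f} MB/s")
```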

> > > 4) If you mount your filesystem read-only, does it read garbage?
> >
> > Now here's a strange part, or possibly a crucial clue.  When I booted a
> > 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
> > init=/bin/sh", I could access the filesystem just fine.  I could even
> > remount the root filesystem rw, and there were no problems.  But I did not
> > write anything to the disk, since I was convinced that the problem was
> > gone (this was the second try).  After this I rebooted with
> > ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
> > booted up the RH7 2.2.16 kernel, and fsck was run with no errors.
>
> So far no problem. Rebooting with c-a-d with fs r-o is OK.
>
> > Now I
> > thought all was well, rebooted from floppy again, but without the init=
> > part, and poof, it hung.
>
> Where? It could be a different reason than IDE setup ...

Don't think so.  It happens on the "Remounting root read-write".

> > More interesting may be that I had to turn the computer off and on again
> > to get BIOS to find the hard drive. Repeated long reset button presses
> > did not help.  It is possible that it hung during BIOS hd detection - I
> > wish I could remember.
>
> I fear this isn't much of a clue, sorry.

The clue is that the VIA driver messed up either the chipset or the drive
quite a lot, but maybe that is already obvious.

> > I suspect that I could have hung the drive with init=/bin/sh if I would
> > have done some reading and writing to the device, besides ls.
>
> Please try it. Best mke2fs your swap partition and try reading & writing
> to that. You can mkswap it back after you finish.

After more testing, I think I have isolated the problem to this disk, or
at least this disk with this controller.  With another (UDMA66) disk,
there are no problems.  Details at the end.

> > I think I can spend some more time today trying it out some more.
>
> Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
> driver and with 3.11 driver would help me find the problem.

Ok, I'll do that later.

> Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
> stuff in your init scripts.

I'm sure I don't.  This happens with a clean fresh RH7 installation.

> > I will
> > also try your 3.11 driver, which seems to be an enormous cleanup.
>
> The 2.1e driver is an enormous cleanup of the original driver from the
> 2.2 kernels. The 3.11 is an enormous cleanup of 2.1e, yes.

I have not had a chance to try the 3.11 driver yet.

Now for the new details.  When writing to the disk with DMA enabled, I get
the following errors, in two different machines.  Both are VIA IDE
machines.  It is NOT a cable error.  I have tried several cables.
Possibly a connector or soldering problem.  I'll try the disk in more
machines and get back with more info.  I have to run now.

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }
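
For reference, the two hex values above map onto individual bits of the ATA status and error registers. The sketch below decodes them; the bit names are written from memory of how the 2.4-era IDE code printed them and should be treated as an assumption to check against the kernel source, and the function names are made up.

```python
# Status register bits, highest bit first (names as assumed from the IDE code).
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of the bits set in an IDE status byte."""
    return [name for bit, name in STATUS_BITS if status & bit]


def decode_error(err):
    """Return the names of the bits set in an IDE error byte."""
    names = []
    if err & 0x04:
        names.append("DriveStatusError")  # command aborted
    if err & 0x80:
        # With the abort bit also set, bit 7 is read as an interface CRC error.
        names.append("BadCRC" if err & 0x04 else "BadSector")
    if err & 0x40:
        names.append("UncorrectableError")
    if err & 0x10:
        names.append("SectorIdNotFound")
    if err & 0x02:
        names.append("TrackZeroNotFound")
    if err & 0x01:
        names.append("AddrMarkNotFound")
    return names


print(decode_status(0x51))
print(decode_error(0x84))
```

Read this way, error=0x84 is an aborted command together with a CRC error on the interface, which points at the transfer between controller and drive rather than at the platters.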

/Tobias



* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-13 16:20       ` Tobias Ringstrom
@ 2001-01-13 22:36         ` Tobias Ringstrom
  2001-01-14  8:44           ` Vojtech Pavlik
  0 siblings, 1 reply; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-13 22:36 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

I have now tried the SAMSUNG VG34323A disk with two other controllers at
home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
motherboard), and there are no problems to be found with DMA enabled.
Streaming 10 MB/s without glitches.

However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
this machine [1] (at work, using the VIA IDE driver version 3.11)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

or this machine [2] (at work, using the VIA IDE driver version 2.1e)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

I get exactly the following errors on both machines

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }

no matter what cable I use.  When I get this, the machine does not recover
most of the time, and I have to reset or power cycle.  This disc works
flawlessly on two other IDE controllers, so I do not think that the disk
is completely broken. It must be either these chipsets or the driver in
combination with this disk.  Note that I _can_ use another UDMA66 disk
_with_ DMA enabled on both machine [1] and [2] above without problems.
Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
tried 2.2.16-22 with DMA enabled on machine [2].

The problem I reported at first, hence the nasty subject, was a hang and a
nasty fs corruption when RH7 tried to remount the root fs read-write.  I
examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
discovered, to my great disgust, that the stupid thing disables the dmesg
output on the console very early in the script.  It is thus entirely
possible that I do get the above mentioned errors when the computer seems
to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
if my assumption is correct.

SUMMARY:  I have a disk that with DMA enabled gives me CRC errors on two
machines, but not on two others, independent of the cable.  Both
troublesome machines fail to recover from these errors.  Linux 2.2.16-22
from RedHat works fine with DMA enabled on machine [1]; [2] is unknown.

I hope this makes things a lot clearer.

/Tobias


* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-13 22:36         ` 2.4 ate my filesystem on rw-mount, getting closer Tobias Ringstrom
@ 2001-01-14  8:44           ` Vojtech Pavlik
  2001-01-14  8:45             ` Tobias Ringstrom
  0 siblings, 1 reply; 16+ messages in thread
From: Vojtech Pavlik @ 2001-01-14  8:44 UTC (permalink / raw)
  To: Tobias Ringstrom; +Cc: Kernel Mailing List

On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:

> I have now tried the SAMSUNG VG34323A disk with two other controllers at
> home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
> motherboard), and there are no problems to be found with DMA enabled.
> Streaming 10 MB/s without glitches.

So the drive *did* work on the vt82c686a in the A7V board? You tested it
both on the Promise and on the 686a? But doesn't work on the 686a in
your other board?

> However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
> this machine [1] (at work, using the VIA IDE driver version 3.11)
> 
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> 
> or this machine [2] (at work, using the VIA IDE driver version 2.1e)
> 
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

What's the manufacturer/model of these boards? Just for the record ...
What's the PCI bus speed? Or memory speed?

> I get exactly the following errors on both machines
> 
> hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }
> 
> no matter what cable I use.  When I get this, the machine does not recover
> most of the time, and I have to reset or power cycle.

It should be able to recover in a couple (up to 10) minutes ...

> This disc works
> flawlessly on two other IDE controllers, so I do not think that the disk
> is completely broken. It must be either these chipsets or the driver in
> combination with this disk.  Note that I _can_ use another UDMA66 disk
> _with_ DMA enabled on both machine [1] and [2] above without problems.
> Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
> tried 2.2.16-22 with DMA enabled on machine [2].
> 
> The problem I reported at first, hence the nasty subject, was a hang and a
> nasty fs corruption when RH7 tried to remount the root fs read-write.  I
> examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
> discovered, to my great disgust, that the stupid thing disables the dmesg
> output on the console very early in the script.  It is thus entirely
> possible that I do get the above mentioned errors when the computer seems
> to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
> if my assumption is correct.
> 
> SUMMARY:  I have a disk that with DMA enabled gives me CRC errors on two
> machines, but not on two others, independent of the cable.  Both
> troublesome machines fail to recover from these errors.  Linux 2.2.16-22
> from RedHat works fine with DMA enabled on machine [1]; [2] is unknown.
> 
> I hope this makes things a lot clearer.

Yes, indeed it's much clearer now. Now to fix the bug, or at least be
able to track it closer, I'll need 'lspci -vvxxx' of the IDE pci device
in the following cases:

1) SAMSUNG VG34323A on VT82C596b/cf with RH 2.2.16-22 and DMA (working)
2) SAMSUNG VG34323A on VT82C686a/ce with RH 2.2.16-22 and DMA (working)
3) SAMSUNG VG34323A on VT82C596b/cf with 2.4.0+via3.11 and DMA,
	(doesn't work, so fs readonly)
4) SAMSUNG VG34323A on VT82C686a/ce with 2.4.0+via3.11 and DMA,
	(doesn't work, so fs readonly)
5) The other drive on VT82C596b/cf with 2.4.0+via3.11 and DMA (working)
6) The other drive on VT82C686a/ce with 2.4.0+via3.11 and DMA (working)

With these data I should be able to find out what's different between
the working and not working setups ...
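
One way to do that comparison, sketched under the assumption that the dumps are the hex lines printed by 'lspci -xxx' (an offset, a colon, then space-separated bytes). The function names are made up, and the two sample dumps are synthetic placeholders, not register values from any machine in this thread.

```python
import re

# Matches hex-dump lines such as "40: 00 01 02 ..." and nothing else.
HEX_LINE = re.compile(r"^([0-9a-fA-F]{2,3}):((?: [0-9a-fA-F]{2})+)$")


def parse_config_dump(text):
    """Map config-space offset -> byte value for every hex line in text."""
    regs = {}
    for line in text.splitlines():
        m = HEX_LINE.match(line.strip())
        if not m:
            continue  # skip device descriptions and other non-dump lines
        base = int(m.group(1), 16)
        for i, byte in enumerate(m.group(2).split()):
            regs[base + i] = int(byte, 16)
    return regs


def diff_config(a_text, b_text):
    """Return (offset, value_a, value_b) for each offset where the dumps differ."""
    a, b = parse_config_dump(a_text), parse_config_dump(b_text)
    return [(off, a[off], b[off]) for off in sorted(a.keys() & b.keys()) if a[off] != b[off]]


# Synthetic sample data, for illustration only.
WORKING = """\
00:07.1 IDE interface: (placeholder description)
40: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
50: 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
"""
FAILING = """\
00:07.1 IDE interface: (placeholder description)
40: 00 01 02 03 04 05 06 07 ff 09 0a 0b 0c 0d 0e 0f
50: 10 11 12 e0 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
"""

for offset, good, bad in diff_config(WORKING, FAILING):
    print(f"offset 0x{offset:02x}: working=0x{good:02x} failing=0x{bad:02x}")
```

Which offsets actually hold the IDE timing settings depends on the chipset and is not something this sketch knows about.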

........

My current theory: In UDMA, when reading, the drive provides the clock.
The IDE controller thus can read everything OK. When writing, the
controller provides the clock and for some reason the Samsung can't keep
up with the setting the driver selects for it. The question is why and
why the driver selects the incorrect (or just too tight?) value.

-- 
Vojtech Pavlik
SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-14  8:44           ` Vojtech Pavlik
@ 2001-01-14  8:45             ` Tobias Ringstrom
  2001-01-14  9:48               ` Vojtech Pavlik
  0 siblings, 1 reply; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-14  8:45 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:
>
> > I have now tried the SAMSUNG VG34323A disk with two other controllers at
> > home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
> > motherboard), and there are no problems to be found with DMA enabled.
> > Streaming 10 MB/s without glitches.
>
> So the drive *did* work on the vt82c686a in the A7V board? You tested it
> both on the Promise and on the 686a? But doesn't work on the 686a in
> your other board?

Yes, on both the Promise and on the 686a.  But the device revisions are
different.  The machine that does NOT work:

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

The machine that works:

00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

The one that works is a 1 GHz Athlon, and the other is an 800 MHz
Pentium-III.

> > no matter what cable I use.  When I get this, the machine does not recover
> > most of the time, and I have to reset or power cycle.
>
> It should be able to recover in a couple (up to 10) minutes ...

Who waits 10 minutes for a timeout?  Can it be lowered?

Expect another mail with the data you requested within a couple of hours.

/Tobias


* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-14  8:45             ` Tobias Ringstrom
@ 2001-01-14  9:48               ` Vojtech Pavlik
  2001-01-14 16:37                 ` Tobias Ringstrom
  0 siblings, 1 reply; 16+ messages in thread
From: Vojtech Pavlik @ 2001-01-14  9:48 UTC (permalink / raw)
  To: Tobias Ringstrom; +Cc: Kernel Mailing List

On Sun, Jan 14, 2001 at 09:45:09AM +0100, Tobias Ringstrom wrote:
> On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> > On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:
> >
> > > I have now tried the SAMSUNG VG34323A disk with two other controllers at
> > > home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
> > > motherboard), and there are no problems to be found with DMA enabled.
> > > Streaming 10 MB/s without glitches.
> >
> > So the drive *did* work on the vt82c686a in the A7V board? You tested it
> > both on the Promise and on the 686a? But doesn't work on the 686a in
> > your other board?
> 
> Yes, on both the Promise and on the 686a.  But the device revisions are
> different.  The machine that does NOT work:
> 
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> 
> The machine that works:
> 
> 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
> 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> 
> The one that works is a 1 GHz Athlon, and the other is an 800 MHz
> Pentium-III.
> 
> > > no matter what cable I use.  When I get this, the machine does not recover
> > > most of the time, and I have to reset or power cycle.
> >
> > It should be able to recover in a couple (up to 10) minutes ...
> 
> Who waits 10 minutes for a timeout?  Can it be lowered?

It's not a 10 minute timeout, it's a shorter timeout retried many times.
Not my code, though - this is generic PCI IDE code, and is a huge mess.

> Expect another mail with the data you requested within a couple of hours.

Thanks a lot.

-- 
Vojtech Pavlik
SuSE Labs

* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-14  9:48               ` Vojtech Pavlik
@ 2001-01-14 16:37                 ` Tobias Ringstrom
  2001-01-14 17:59                   ` Tobias Ringstrom
  0 siblings, 1 reply; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-14 16:37 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3154 bytes --]

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> > > So the drive *did* work on the vt82c686a in the A7V board? You tested it
> > > both on the Promise and on the 686a? But doesn't work on the 686a in
> > > your other board?
> >
> > Yes, on both the Promise and on the 686a.  But the device revisions are
> > different.  The machine that does NOT work:
> >
> > 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> >
> > The machine that works:
> >
> > 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
> > 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> >
> > The one that works is a 1 GHz Athlon, and the other is an 800 MHz
> > Pentium-III.

Of course it isn't.  The vt82c686 machine that does not work is a 450 MHz
K-6, not a PIII.

> > > > no matter what cable I use.  When I get this, the machine does not recover
> > > > most of the time, and I have to reset or power cycle.
> > >
> > > It should be able to recover in a couple (up to 10) minutes ...
> >
> > Who waits 10 minutes for a timeout?  Can it be lowered?
>
> It's not a 10 minute timeout, it's a shorter timeout retried many times.
> Not my code, though - this is generic PCI IDE code, and is a huge mess.

What I get is a number of "Busy" and "Drive not ready for command" errors
for different sectors.

> > Expect another mail with the data you requested within a couple of hours.
>
> Thanks a lot.

Ok, it took a bit longer than that, mostly because my wife and I had
unexpected (but very welcome) guests at home.  It is Sunday, after all...

I have attached a tar file with "lspci -vvxxx" and "hdparm -i" for machine
1 and 2 to this mail, but first some comments.

I will be talking about three machines:

1) 450 MHz K-6 on an AOpen MX59 PRO II motherboard
2) 800 MHz PIII on an unknown cheap/crappy motherboard.
3) 1 GHz Athlon on an ASUS A7V motherboard.

and the following drives:

A) SAMSUNG VG34323A, sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 udma2
B) ST38421A, mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4

Machine 3 is the machine at home, and it does not have problems with any
disks I have tried so far, and seems very stable, both with ATA100 and
ATA66.

I verified that what happens when RH7 tries to remount / read-write is
that I get the infamous CRC errors.  It does not seem to recover from
this state, at least not within the time I was willing to wait.

I do not think that the RH7 kernel 2.2.16-22 uses udma2 at any time, and
that may be why it works.

Disk B does NOT work with DMA enabled with machine 1 or 2.  It works
better than disk A, but it does still fail after some time.  The
combination 1B was the most stable, and only failed once.

When using disk B, the computer has managed to recover from the CRC error
condition every time, as opposed to disk A which never recovers.  (Busy)

Using hdparm -X65 (udma1) makes disk A work with 2.4 in machine 2.  What
is the difference between udma1 and udma2?
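
As a side note on the -X argument used above: the number encodes the
transfer mode family plus the mode index.  The sketch below (Python, with
a hypothetical helper name xfer_mode_value that is not part of hdparm)
follows the encoding described in the hdparm manual page, where PIO modes
are 8 + n, multiword DMA modes are 32 + n, and UltraDMA modes are 64 + n.

```python
# Base values for the numeric argument of "hdparm -X", as described in the
# hdparm manual page. The helper name below is made up for illustration.
XFER_BASE = {"pio": 8, "mdma": 32, "udma": 64}

def xfer_mode_value(mode):
    """Translate a mode name such as 'udma1' into the number given to -X."""
    family = mode.rstrip("0123456789")   # e.g. 'udma'
    number = mode[len(family):]          # e.g. '1'
    if family not in XFER_BASE or not number:
        raise ValueError("unrecognised transfer mode: %r" % mode)
    return XFER_BASE[family] + int(number)

if __name__ == "__main__":
    for name in ("udma1", "udma2", "mdma2", "pio4"):
        print("%s: hdparm -X%d" % (name, xfer_mode_value(name)))
```

So -X65 selects udma1 and -X66 would select udma2; the two differ only in
the cycle time used on the cable.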

Now I'm almost completely lost.  Hope this helps.  Let me know if you want
me to try something else.

/Tobias


[-- Attachment #2: Type: TEXT/PLAIN, Size: 520 bytes --]


/dev/hde:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4 
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2 

[-- Attachment #3: Type: TEXT/PLAIN, Size: 17878 bytes --]

00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
	Subsystem: Asustek Computer, Inc.: Unknown device 8033
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
	Latency: 0
	Region 0: Memory at e0000000 (32-bit, prefetchable) [size=128M]
	Capabilities: [a0] AGP version 2.0
		Status: RQ=31 SBA+ 64bit- FW+ Rate=x1,x2
		Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>
	Capabilities: [c0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 17 a4 6b b4 4f 81 10 10 80 00 08 10 10 10 10 10
60: 03 ff 00 b0 e6 e5 e5 00 44 7c 86 0f 08 3f 00 00
70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 01
80: 0f 40 00 00 80 00 00 00 02 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 02 c0 20 00 17 02 00 1f 00 00 00 00 6e 02 14 00
b0: 61 ec 80 e5 32 33 28 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 0e 22 00 00 00 00 00 91 06

00:01.0 PCI bridge: VIA Technologies, Inc.: Unknown device 8305 (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
	Latency: 0
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	Memory behind bridge: d6000000-d7dfffff
	Prefetchable memory behind bridge: d7f00000-dfffffff
	BridgeCtl: Parity- SERR- NoISA- VGA+ MAbort- >Reset- FastB2B-
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 83 07 00 30 22 00 00 04 06 00 00 01 00
10: 00 00 00 00 00 00 00 00 00 01 01 00 e0 d0 00 00
20: 00 d6 d0 d7 f0 d7 f0 df 00 00 00 00 00 00 00 00
30: 00 00 00 00 80 00 00 00 00 00 00 00 00 00 08 00
40: cb cd 18 14 27 72 05 83 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 01 00 22 02 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
	Subsystem: Asustek Computer, Inc.: Unknown device 8033
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0
00: 06 11 86 06 87 00 10 02 22 00 01 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40: 00 01 00 00 00 80 62 ee 01 00 44 00 00 00 00 f3
50: 0e 06 34 00 00 b0 9a 50 00 04 ff 08 80 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 2d 08 40 82 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 0f 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10) (prog-if 8a [Master SecP PriP])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32
	Region 4: I/O ports at d800 [size=16]
	Capabilities: [c0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 71 05 07 00 90 02 10 8a 01 01 00 20 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 d8 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 c0 00 00 00 00 00 00 00 ff 00 00 00
40: 03 f2 02 3a 08 03 f0 00 a8 20 a8 20 33 00 20 20
50: 0f 07 0f e2 14 00 00 00 a8 a8 a8 a8 00 00 00 00
60: 00 02 00 00 00 00 00 00 00 02 00 00 00 00 00 00
70: 02 01 00 00 00 00 00 00 02 01 00 00 00 00 00 00
80: 00 30 46 01 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 10 00 71 05 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:04.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) (prog-if 00 [UHCI])
	Subsystem: Unknown device 0925:1234
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32, cache line size 08
	Interrupt: pin D routed to IRQ 5
	Region 4: I/O ports at d400 [size=32]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 38 30 17 00 10 02 10 00 03 0c 08 20 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 d4 00 00 00 00 00 00 00 00 00 00 25 09 34 12
30: 00 00 00 00 80 00 00 00 00 00 00 00 05 04 00 00
40: 00 12 03 00 c2 00 30 cc 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:04.3 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 10) (prog-if 00 [UHCI])
	Subsystem: Unknown device 0925:1234
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32, cache line size 08
	Interrupt: pin D routed to IRQ 5
	Region 4: I/O ports at d000 [size=32]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 38 30 17 00 10 02 10 00 03 0c 08 20 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 d0 00 00 00 00 00 00 00 00 00 00 25 09 34 12
30: 00 00 00 00 80 00 00 00 00 00 00 00 05 04 00 00
40: 00 12 03 00 c6 00 31 18 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Capabilities: [68] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 57 30 00 00 90 02 30 00 00 06 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 68 00 00 00 00 00 00 00 00 00 00 00
40: 20 80 49 00 1a 10 00 00 01 e4 00 00 48 10 00 00
50: 00 ff ff 04 10 04 00 00 00 ff ff 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 01 00 02 00 00 00 00 00
70: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 01 e8 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:0a.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
	Subsystem: 3Com Corporation 3C905C-TX Fast Etherlink for PC Management NIC
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32 (2500ns min, 2500ns max), cache line size 08
	Interrupt: pin A routed to IRQ 9
	Region 0: I/O ports at a400 [size=128]
	Region 1: Memory at d5800000 (32-bit, non-prefetchable) [size=128]
	Expansion ROM at <unassigned> [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=2 PME-
00: b7 10 00 92 17 00 10 02 74 00 00 02 08 20 00 00
10: 01 a4 00 00 00 00 80 d5 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 00 10
30: 00 00 00 00 dc 00 00 00 00 00 00 00 ff 01 0a 0a
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 02 fe
e0: 00 40 00 b7 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:0b.0 Multimedia audio controller: Creative Labs SB Live! EMU10000 (rev 08)
	Subsystem: Creative Labs CT4832 SBLive! Value
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32 (500ns min, 5000ns max)
	Interrupt: pin A routed to IRQ 10
	Region 0: I/O ports at a000 [size=32]
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 02 11 02 00 05 00 90 02 08 00 01 04 00 20 80 00
10: 01 a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 02 11 27 80
30: 00 00 00 00 dc 00 00 00 00 00 00 00 ff 01 02 14
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 06
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:0b.1 Input device controller: Creative Labs SB Live! (rev 08)
	Subsystem: Creative Labs Gameport Joystick
	Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32
	Region 0: I/O ports at 9800 [disabled] [size=8]
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 02 11 02 70 04 00 90 02 08 00 80 09 00 20 80 00
10: 01 98 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 02 11 20 00
30: 00 00 00 00 dc 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 06
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:11.0 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 0d30 (rev 02)
	Subsystem: Promise Technology, Inc.: Unknown device 4d33
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 32
	Interrupt: pin A routed to IRQ 10
	Region 0: I/O ports at 9400 [size=8]
	Region 1: I/O ports at 9000 [size=4]
	Region 2: I/O ports at 8800 [size=8]
	Region 3: I/O ports at 8400 [size=4]
	Region 4: I/O ports at 8000 [size=64]
	Region 5: Memory at d5000000 (32-bit, non-prefetchable) [size=128K]
	Expansion ROM at <unassigned> [disabled] [size=64K]
	Capabilities: [58] Power Management version 1
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 5a 10 30 0d 07 00 10 02 02 00 80 01 00 20 00 00
10: 01 94 00 00 01 90 00 00 01 88 00 00 01 84 00 00
20: 01 80 00 00 00 00 00 d5 00 00 00 00 5a 10 33 4d
30: 00 00 00 00 58 00 00 00 00 00 00 00 0a 01 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 8e 33 00 00 00 00 00 00 01 00 01 00 00 00 00 00
60: f1 24 41 00 c4 f3 4f 00 04 f3 4f 00 04 f3 4f 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

01:00.0 VGA compatible controller: nVidia Corporation NV15 Bladerunner (Geforce2 GTS) (rev a4) (prog-if 00 [VGA])
	Subsystem: Unknown device 1681:0010
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 64 (1250ns min, 250ns max)
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at d6000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at d8000000 (32-bit, prefetchable) [size=128M]
	Expansion ROM at d7ff0000 [disabled] [size=64K]
	Capabilities: [60] Power Management version 1
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [44] AGP version 2.0
		Status: RQ=31 SBA- 64bit- FW+ Rate=x1,x2
		Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>
00: de 10 52 01 07 00 b0 02 a4 00 00 03 00 40 00 00
10: 00 00 00 d6 08 00 00 d8 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 81 16 10 00
30: 00 00 ff d7 60 00 00 00 00 00 00 00 0b 01 05 01
40: 81 16 10 00 02 00 20 00 17 00 00 1f 00 00 00 00
50: 01 00 00 00 01 00 00 00 ce d6 23 00 0f 00 00 00
60: 01 44 01 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


[-- Attachment #4: Type: APPLICATION/x-gzip, Size: 5032 bytes --]


* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-14 16:37                 ` Tobias Ringstrom
@ 2001-01-14 17:59                   ` Tobias Ringstrom
  2001-01-14 22:32                     ` Vojtech Pavlik
  0 siblings, 1 reply; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-14 17:59 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

I should also add that the 3.11 driver seems to make things better, but
not yet perfect.  My intuition tells me that I get CRC errors much sooner
with 2.1e than with 3.11.

Have the timings changed from 2.1e to 3.11, and would it be easy to modify
3.11 to get extra safe/paranoid, but lower-performance, timings?

Some extra data:
* B seems to work in 2 with udma2
* A seems to work in 2 with udma1, but not with udma2.

I wouldn't say it's rock solid, and I would not trust my data to any of
these combinations, but at least it does not break immediately (i.e. for
less than 1 GB written).

The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
to 3.11 helps, but it is still very bad.

I'd really like to be more precise, but there are too many combinations to
try them all, and sometimes it fails right away, and sometimes after
several hundred megabytes.

/Tobias


* Re: 2.4 ate my filesystem on rw-mount, getting closer
  2001-01-14 17:59                   ` Tobias Ringstrom
@ 2001-01-14 22:32                     ` Vojtech Pavlik
  2001-01-23 21:39                       ` 2.4 ate my filesystem on rw-mount, summary Tobias Ringstrom
  0 siblings, 1 reply; 16+ messages in thread
From: Vojtech Pavlik @ 2001-01-14 22:32 UTC (permalink / raw)
  To: Tobias Ringstrom; +Cc: Kernel Mailing List

On Sun, Jan 14, 2001 at 06:59:57PM +0100, Tobias Ringstrom wrote:
> 
> I should also add that the 3.11 driver seems to make things better, but
> not yet perfect.  My intuition tells me that I get CRC errors much sooner
> with 2.1e than with 3.11.
> 
> Have the timings changed from 2.1e to 3.11, and would it be easy to modify
> 3.11 to get extra safe/paranoid, but less high performance, timings?

If you use 'idebus=40' or 'idebus=50', the driver will add an extra
margin to the timings, trying to compensate for the 40 or 50 MHz PCI bus
it will be tricked to think it's working with.
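
A rough way to picture that margin, as a simplified model and not the
actual driver code: assume the driver converts a wanted cycle time into a
whole number of bus clocks using the bus speed it was told about, while
the hardware counts those clocks at the real bus speed.  The function name
and the rounding rule are assumptions made for this sketch.

```python
import math

def programmed_cycle_ns(wanted_ns, assumed_mhz, actual_mhz=33.3):
    """Cycle time produced when clock counts are computed for assumed_mhz
    but the bus really runs at actual_mhz (simplified model)."""
    assumed_period = 1000.0 / assumed_mhz   # ns per clock the driver believes in
    actual_period = 1000.0 / actual_mhz     # ns per clock on the real bus
    # Round up to whole clocks; the small epsilon avoids rounding 3.0 up to 4.
    clocks = math.ceil(wanted_ns / assumed_period - 1e-9)
    return clocks * actual_period

if __name__ == "__main__":
    for idebus in (33.3, 40, 50):
        print("idebus=%s: 60 ns requested -> %.1f ns on the wire"
              % (idebus, programmed_cycle_ns(60, idebus)))
```

In this model a 60 ns request stays at two clocks with the true bus speed,
but becomes three clocks (about 90 ns) once the driver is told 40 or 50
MHz, which is the extra safety margin.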

This could add a data point, yes.

> Some extra data:
> * B seems to work in 2 with udma2
> * A seems to work in 2 with udma1, but not with udma2.

UDMA1 is 22.2 MB/sec, UDMA2 is 33.3. UDMA0 is 16.6.
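
These figures are consistent with one 16-bit word being transferred every
4, 3, or 2 clocks of a 33.3 MHz bus (30 ns per clock).  That reading is an
assumption made for the sketch below, not something stated above; note
that the nominal ATA figure for UDMA1 is 25 MB/sec (80 ns cycle), which
cannot be hit exactly with 30 ns steps.

```python
PCI_CLOCK_NS = 30.0    # one clock of a 33.3 MHz bus (assumed)
BYTES_PER_CYCLE = 2    # the ATA data bus is 16 bits wide

def rate_mb_per_s(clocks_per_cycle):
    """Transfer rate in MB/s (10^6 bytes) for a cycle of N bus clocks."""
    cycle_ns = clocks_per_cycle * PCI_CLOCK_NS
    return BYTES_PER_CYCLE / cycle_ns * 1000.0   # bytes/ns -> MB/s

if __name__ == "__main__":
    for mode, clocks in (("UDMA0", 4), ("UDMA1", 3), ("UDMA2", 2)):
        print("%s: %d clocks -> %.1f MB/s" % (mode, clocks, rate_mb_per_s(clocks)))
```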

Could you (if you didn't already) send me the lspci -vvxxx after the -X65
(UDMA1) command, together with the one before? That could also tell us
something.
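
Comparing two such dumps by eye is tedious, so here is a minimal sketch
of how the hex lines of a single device could be diffed.  The function
names are hypothetical, and in the demo only the "before" line is taken
from the dump posted earlier; the "after" line is made up purely to show
the output format.

```python
import re

# One hex line of "lspci -xxx" output: an offset, then 16 byte values.
HEX_LINE = re.compile(r"^([0-9a-f]{2}):((?: [0-9a-f]{2}){16})\s*$")

def parse_config_dump(text):
    """Return {offset: byte} for the hex lines of ONE device's dump."""
    regs = {}
    for line in text.splitlines():
        match = HEX_LINE.match(line.strip())
        if not match:
            continue   # description or capability line, not hex data
        base = int(match.group(1), 16)
        for i, tok in enumerate(match.group(2).split()):
            regs[base + i] = int(tok, 16)
    return regs

def diff_config(before, after):
    """List (offset, old, new) for every register present in both dumps
    whose value differs."""
    return [(off, before[off], after[off])
            for off in sorted(before)
            if off in after and before[off] != after[off]]

if __name__ == "__main__":
    before = "50: 0f 07 0f e2 14 00 00 00 a8 a8 a8 a8 00 00 00 00"
    after = "50: 0f 07 0f e1 14 00 00 00 a8 a8 a8 a8 00 00 00 00"  # made up
    for off, old, new in diff_config(parse_config_dump(before),
                                     parse_config_dump(after)):
        print("offset 0x%02x: %02x -> %02x" % (off, old, new))
```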

> I wouldn't say it's rock solid, and I would not trust my data to any of
> these combinations, but at least it does not break immediately (i.e. for less
> than 1 GB written).

Actually, the CRC messages are safe and only mean a data transfer is
retried. That is, only if it doesn't fail every time. They happen on
many boards and drives using UDMA even under normal correct operation :(

> The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
> to 3.11 helps, but it is still very bad.
> 
> I'd really like to be more precise, but there are too many combinations to
> try them all, and sometimes it fails right away, and sometimes
> after several hundred megabytes.

If 'fails after several hundred megabytes' only means a single CRC error
which is recovered from correctly, then that actually means 'working, and
would probably work perfectly with a shorter cable'.

-- 
Vojtech Pavlik
SuSE Labs

* Re: 2.4 ate my filesystem on rw-mount, summary
  2001-01-14 22:32                     ` Vojtech Pavlik
@ 2001-01-23 21:39                       ` Tobias Ringstrom
  0 siblings, 0 replies; 16+ messages in thread
From: Tobias Ringstrom @ 2001-01-23 21:39 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Kernel Mailing List

Ok, folks, it's time for a summary.  Since my last post, I've had time to
experiment a bit more, and I've also had some private communication with
Vojtech.

First, I would like to say that you do need quite a bit of bad luck (or
hardware) to have the same problems I did.  Linux 2.4, VIA and IDE work
very well for most users.  But I really recommend making a backup of all
your vital data before installing 2.4 and enabling DMA with IDE disks.
(And, yes, I did this.  Honest! :-) )

Problem log
===========

1. Installed RedHat 7
2. Built 2.4.0 with VIA driver and DMA by default (well, in 2.4.0, the VIA
   driver will always use DMA by default, whether you want it or not.)
3. Rebooted -> 2.4.0
4. The computer froze on the remounting root read-write message.
5. Powercycle
6. Rebooted -> 2.2.16-22
7. Got a corrupt disk, missing files, moved files, incorrect file contents
8. Goto 1

So, why did this happen?

Problem one
===========

This one really makes me upset, because had it not been for this one, it
would have been so much easier to find the cause of the problem.  It is
also so easy to fix.

The problem is that RedHat disables all kernel messages during boot,
except for panics.  In my not so very humble opinion, kernel error
messages, and possibly also warning messages, should of course be shown.
It can easily be fixed by editing /etc/sysconfig/init.

The error messages that were hidden by RH7 were a couple of CRC error
messages, and then an endless stream of "Busy" and "Drive not ready for
command" errors.  More on this later.

Problem two
===========

The computer in question has problems with UDMA(33), otherwise I would not
have gotten CRC errors, and everything would have been fine.  Why I do get
CRC errors, one can so far only speculate, especially since I am able to
use UDMA(66) with another drive, on the same controller, without much
trouble.

One theory is that the PCI bus clock may be too fast, and the drive cannot
catch up.  To check this, I plan to measure the PCI clock to see if this
is true.  Quick measurements with a not too great oscilloscope seem to
indicate a clock speed of around 33.3-33.4 MHz, so it may actually be out
of spec, but not by much.

Another theory is that the CRC errors are caused by bad cables,
connectors, or motherboard, but the fact that I can use UDMA(66) on the
same controller seems to contradict this.  But OTOH I have learnt not to
underestimate the amazing amount of trouble a bad cable can cause.

Possible work-arounds include an "idebus=40" kernel option, or using
hdparm to configure the drive and kernel for UDMA(22).

Problem three
=============

The drive that gave me these problems is a SAMSUNG VG34323A, and the
problem with this drive is that it does not seem to recover from CRC
errors.  Once I get my first CRC error, the drive becomes permanently
busy, until I power cycle.

Problem four
============

<speculation>I do not know exactly what Linux is doing when remounting a
partition read-write, but it does seem to update some very sensitive
sectors, and when the write fails, a lot of very vital data is destroyed.
It is perhaps questionable whether the destruction of a couple of files
would be much better than the destruction of /dev, but I think it is.
</speculation>

Lesson
======

Be very careful when enabling DMA on a Linux machine, especially on cheap
hardware.  It is not enough to test DMA on a read-only partition first,
since writing is a completely different story.

...and probably some more things that I either forgot, or are too painful
to remember...

/Tobias


end of thread, other threads:[~2001-01-23 21:40 UTC | newest]

Thread overview: 16+ messages
2001-01-12  9:15 2.4 ate my filesystem on rw-mount Tobias Ringstrom
2001-01-12 10:22 ` Alan Cox
2001-01-12 17:23   ` Martin Laberge
2001-01-12 19:51     ` Vojtech Pavlik
2001-01-12 19:49 ` Vojtech Pavlik
2001-01-13  8:12   ` Tobias Ringstrom
2001-01-13 14:35     ` Vojtech Pavlik
2001-01-13 16:20       ` Tobias Ringstrom
2001-01-13 22:36         ` 2.4 ate my filesystem on rw-mount, getting closer Tobias Ringstrom
2001-01-14  8:44           ` Vojtech Pavlik
2001-01-14  8:45             ` Tobias Ringstrom
2001-01-14  9:48               ` Vojtech Pavlik
2001-01-14 16:37                 ` Tobias Ringstrom
2001-01-14 17:59                   ` Tobias Ringstrom
2001-01-14 22:32                     ` Vojtech Pavlik
2001-01-23 21:39                       ` 2.4 ate my filesystem on rw-mount, summary Tobias Ringstrom
