* 2.5.49: Severe PIIX4/ATA filesystem corruption
From: H. Peter Anvin @ 2002-11-26 21:07 UTC
To: linux-kernel
So, I finally braved it and tried running 2.5.49 on my workstation to
test out my RAID-6 patches. There were no patches outside the md
area, and the ordinary filesystems aren't on md drives.
The two SCSI drives (SymBIOS controller) work just fine, but I have
gotten repeated, severe data corruption on the one ATA drive in the
system after only a few hours of operation.
Just thought I'd warn people...
-hpa
--
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <amsp@zytor.com>
* Re: 2.5.49: Severe PIIX4/ATA filesystem corruption
From: Alan Cox @ 2002-11-27 0:32 UTC
To: H. Peter Anvin; +Cc: Linux Kernel Mailing List
On Tue, 2002-11-26 at 21:07, H. Peter Anvin wrote:
> So, I finally braved it and tried running 2.5.49 on my workstation to
> test out my RAID-6 patches. There were no patches outside the md
> area, and the ordinary filesystems aren't on md drives.
>
> The two SCSI drives (SymBIOS controller) work just fine, but I have
> gotten repeated, severe data corruption on the one ATA drive in the
> system after only a few hours of operation.
If you mash the innards of the page cache you'll get corruption
everywhere; it's one of the charms of testing out that area of the code
on Linux. You might want to debug using 2.5.49 user mode linux rather
than on raw disks. It's so much easier to use "cp" to generate a
replacement root_fs 8)
Alan
* Re: 2.5.49: Severe PIIX4/ATA filesystem corruption
From: H. Peter Anvin @ 2002-11-27 0:13 UTC
To: Alan Cox; +Cc: Linux Kernel Mailing List
Alan Cox wrote:
> On Tue, 2002-11-26 at 21:07, H. Peter Anvin wrote:
>
>>So, I finally braved it and tried running 2.5.49 on my workstation to
>>test out my RAID-6 patches. There were no patches outside the md
>>area, and the ordinary filesystems aren't on md drives.
>>
>>The two SCSI drives (SymBIOS controller) work just fine, but I have
>>gotten repeated, severe data corruption on the one ATA drive in the
>>system after only a few hours of operation.
>
>
> If you mash the innards of the page cache you'll get corruption
> everywhere; it's one of the charms of testing out that area of the code
> on Linux. You might want to debug using 2.5.49 user mode linux rather
> than on raw disks. It's so much easier to use "cp" to generate a
> replacement root_fs 8)
>
Yes, that's true. However, the two heavily used SCSI disks saw no
corruption whatsoever, whereas the single, lightly used ATA disk saw
heavy corruption; if it were due to unrelated experimental code, one would
have expected corruption everywhere. This does not mean that it is not
my fault (as far as UML is concerned, I tried building it quite a few
times before giving up), but given the severity of the corruption I was
seeing I thought I'd raise a red flag.
-hpa
* Re: 2.5.49: Severe PIIX4/ATA filesystem corruption
From: Alan Cox @ 2002-11-27 1:03 UTC
To: H. Peter Anvin; +Cc: Linux Kernel Mailing List
On Wed, 2002-11-27 at 00:13, H. Peter Anvin wrote:
> Yes, that's true. However, the two heavily used SCSI disks saw no
> corruption whatsoever, whereas the single, lightly used ATA disk saw
> heavy corruption; if it were due to unrelated experimental code, one would
> have expected corruption everywhere. This does not mean that it is not
IDE does handle the I/O quite differently, so I'm not sure about that.
Accidentally fiddling with an in-flight request tends to do horrible
things on IDE but not on SCSI, for example.
The base 2.5.47/8/9 Linus tree PIIX code has had no corruption reports
(except someone whose box failed memtest86), and it's about the most
tested IDE controller.
I would be interested to know what happens if you boot a base 2.5.49
without raid6 adulteration and stress it on your hw there, just to be
sure.
* Re: 2.5.49: Severe PIIX4/ATA filesystem corruption
From: H. Peter Anvin @ 2002-11-27 0:36 UTC
To: Alan Cox; +Cc: Linux Kernel Mailing List
Alan Cox wrote:
>
> The base 2.5.47/8/9 Linus tree PIIX code has had no corruption reports
> (except someone whose box failed memtest86), and it's about the most
> tested IDE controller.
>
> I would be interested to know what happens if you boot a base 2.5.49
> without raid6 adulteration and stress it on your hw there, just to be
> sure.
>
I will try it once the system gets put back together, which will be in
about two weeks (we're moving, and today was computer teardown day.)
All I have ATM is my laptop, which I won't be running experimental stuff
on :-/
-hpa
* Re: 2.5.49: Severe PIIX4/ATA filesystem corruption
From: H. Peter Anvin @ 2002-12-17 20:41 UTC
To: linux-kernel
Followup to: <1038359021.3267.110.camel@irongate.swansea.linux.org.uk>
By author: Alan Cox <alan@lxorguk.ukuu.org.uk>
In newsgroup: linux.dev.kernel
>
> I would be interested to know what happens if you boot a base 2.5.49
> without raid6 adulteration and stress it on your hw there, just to be
> sure.
>
Well, I finally got the system up and running again, after moving, and
ran it without loading any of the md modules (thus nothing modified by
the raid6 code.) Leaving it running overnight at the shell prompt, with
cron jobs running -- including the one that backs up the SCSI drives
onto the IDE drive -- left me with tons of ext3fs error messages in
the morning on the IDE drive in question.
This is unfortunately all the information I have right at the moment.
-hpa