* Re: IDE corruption, 2.2, VIA chipset in PIO mode
From: Neil Conway @ 2001-06-05 8:29 UTC
To: Alan Cox, linux-kernel
Alan Cox wrote:
> > Sigh. Ah, I think I see a nice brown bag, in a nice deep hole.
>
> It's only a pointer. PIO speed cable errors tend to imply a bad cable problem
> (e.g. not properly connected ribbon). So it could still be that the problem is
> elsewhere.
Ah OK. Though a cable fault does seem consistent with the evidence...
(I swear I read the FAQ before posting!)
In practice, does a BadCRC error EVER imply a crap/buggy chipset?
On the flip side, the cable isn't too long, isn't damaged, and was very
definitely seated properly (I did it personally and took some care over
that). On the third hand, I don't know where it came from, and somebody
had spilled coffee on it in a previous life :-) (not the connectors!).
To approach the question from a different angle completely: DARE I use
the VIA 686A in UDMA-33/66[/100 if capable?] mode, or is it not really
up to the job? I've seen so many posts on a search for "linux via ide
corruption" that I'm uneasy about repeating the experiments on what is a
production box...
thanks,
Neil
PS: 80-conductor cables on the way :-)
* IDE corruption, 2.2, VIA chipset in PIO mode
From: Neil Conway @ 2001-06-04 14:43 UTC
To: linux-kernel
Summary: we got IDE trashage in PIO mode with a VIA 686A IDE chipset,
using 2.2.12-20smp (RH6.1 stock).
Disk is an IBM 75GXP 75GB, mobo is Gigabyte GA-6VXDC7 (IIRC).
Story: had the system hooked up with SCSI disk, needed more disk space,
had IBM EIDE handy, stuck it in, no UDMA cable handy so used a 40-conductor
cable. Verified with hdparm that disk was using UDMA mode 2 (rather
than 3 or 4, which would have needed the 80-conductor cable).
Because it was an IDE disk in a box with a SCSI system disk, I made a
little /ideboot partition (50 megs or so) and avoided LILO hassles by
parking the kernel+initrd on that. Rest of disk was just /data
partition.
Used for a while, copied about 7gigs onto it. Then got lots of BadCRC
errors when reading from disk (from dma_intr). Decided to disable DMA
as a result of this...
Sometime later tried to reboot, couldn't. Closer examination showed the
/ideboot partition was hosed. No worries, we thought, it's just been
screwed by the DMA being on earlier. So, I just rebuilt /ideboot (a
little optimistic of me) and got it booting again, and then compared
files on /data with the original data files. When they failed to match,
I decided to blitz the whole lot, repartition both partitions and remake
the fs's (still thinking at this point that the main cause was the
original 7 gigs of copying while DMA was enabled).
All of this rebuilding was done in PIO mode.
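For anyone wanting to repeat the "compare files on /data with the originals" step, here is a minimal sketch of one way to do it, assuming the originals are still readable somewhere. The function names and the two directory arguments are hypothetical, not anything from the setup described above. Note that a comparison made straight after copying may be answered from the page cache rather than the disk, so it is more telling after a reboot or remount.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(original_root, copy_root):
    """Return relative paths whose copy is missing or differs from the original."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(original_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, original_root)
            dst = os.path.join(copy_root, rel)
            # A missing copy counts as a mismatch, as does any content change.
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                mismatches.append(rel)
    return sorted(mismatches)
```

Calling compare_trees() with the source directory and the copy on /data returns a list of files to look at; an empty list means every original file has a byte-identical copy.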
So, having recopied the data onto the disk again, sometime later one of
us reboots it, and hey presto, it doesn't reboot, and yes, it's due to
the little partition /ideboot being hosed again (illegal triply indirect
blocks, bad inodes etc...).
So, I'm now left thinking that this final failure (and thus by inference
maybe the others too) really can't have been caused by DMA problems...
(The only little caveat is that when I blitzed the lot, rebuilding the
partition table and both filesystems, I didn't wipe out the entire boot
sector/cylinder, so in principle some tiny vestigial memories of the
corruption might have persisted??)
I've searched the web, and found plenty of people suffering from broken
DMA on the VIA chipsets, but no clear reports of PIO breakage.
It does seem incredible that a chipset could fail to work reliably in
PIO mode, but it's either that, or the 2.2.12-20smp kernel, or a broken
disk or motherboard. Given VIA's apparent flakiness, the chipset seems
like a good candidate for suspicion...
Anyone out there got the answers?
Neil
PS: 2.2.12-20smp - argh puke, but not my machine so not my kernel
choice...
* Re: IDE corruption, 2.2, VIA chipset in PIO mode
From: Alan Cox @ 2001-06-04 17:02 UTC
To: Neil Conway; +Cc: linux-kernel
> Used for a while, copied about 7gigs onto it. Then got lots of BadCRC
> errors when reading from disk (from dma_intr). Decided to disable DMA
> as a result of this...
Cable errors. Disabling DMA also disabled error checking and correction.
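To illustrate the point: Ultra DMA transfers carry a 16-bit CRC, so data damaged on the cable is detected (the BadCRC message) and the transfer can be retried, whereas PIO transfers have no such check and damaged data is accepted silently. The sketch below is only an illustration of that difference. It uses Python's generic binascii.crc_hqx CRC-16 and is not a bit-exact model of the ATA CRC; the helper names are made up for the example.

```python
import binascii


def send_with_crc(payload):
    """Sender side: return the payload together with its 16-bit CRC."""
    return payload, binascii.crc_hqx(payload, 0)


def receive_with_crc(payload, crc):
    """Receiver side: True if the payload matches the CRC sent with it."""
    return binascii.crc_hqx(payload, 0) == crc


def flip_bit(data, bit_index):
    """Simulate a cable fault by flipping one bit of the data in flight."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


sector = bytes(range(256)) * 2           # 512 bytes of sample data
payload, crc = send_with_crc(sector)
damaged = flip_bit(payload, 1234)

# With a CRC, the receiver notices the damage and can ask for a retry.
print(receive_with_crc(payload, crc))    # clean transfer
print(receive_with_crc(damaged, crc))    # corrupted transfer

# Without a CRC there is nothing to check: the damaged bytes are simply
# accepted and written to disk or handed to the filesystem as-is.
```

The first check passes and the second fails, which is the CRC-protected case. In the unprotected case the same flipped bit goes through unnoticed.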
* Re: IDE corruption, 2.2, VIA chipset in PIO mode
From: Neil Conway @ 2001-06-04 17:29 UTC
To: Alan Cox; +Cc: linux-kernel
Alan Cox wrote:
>
> > Used for a while, copied about 7gigs onto it. Then got lots of BadCRC
> > errors when reading from disk (from dma_intr). Decided to disable DMA
> > as a result of this...
>
> Cable errors. Disabling DMA also disabled error checking and correction.
Jeez. Crappity, crappity, crap. Embarrassing mistake of the year.
I managed to convince myself that the cable couldn't possibly be
responsible, because we got corruption even when using UDMA-33 (which
protects from cable faults). Somewhere in my logic I obviously lost the
plot, and forgot that I switched DMA off *before* seeing corruption.
Sigh. Ah, I think I see a nice brown bag, in a nice deep hole.
Thank you for gently pointing out the error of my ways ;-)
Neil