From: "Daniel B." <dsb@smart.net>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: IDE DMA errors, massive disk corruption: Why? Fixed Yet? Why not re-do failed op?
Date: Mon, 06 Oct 2003 14:42:24 -0400
Message-ID: <3F81B790.B8AF7136@smart.net>
I just got bitten _again_ by IDE DMA timeout errors and massive
filesystem corruption in kernel 2.4.22 (on an Asus A7M266-D dual-Athlon
XP motherboard (AMD 768 chip / amd7441 IDE controller)).
(I had turned DMA off in my init scripts, but apparently Debian
unstable's k7-smp configuration enables DMA by default before my init
scripts get control. Ext3 journal "recovery" trashed my system
partition.)
What's going on with the IDE DMA bugs? They have existed since 2.2
(right?), and even at .22 in the 2.4 series they still exist. Why
have they been around so long? Is it that few kernel developers use
the combinations of hardware or configuration options that expose
the bugs (like my dual-CPU box with IDE, not SCSI, disks)?
Are the DMA bugs believed to be fixed (for real) yet? If so, in which
version?
Is there any consolidated documentation of the combinations of factors
that cause corruption, or of how to reliably avoid corruption (like
all the things to check to make sure your kernel never even tries to
enable DMA)?
Also, why does a DMA timeout cause such corruption? Doesn't the kernel
keep track of uncompleted operations, retain the information needed to
try again, and try again if there's a failure? If not, why not?
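
The bookkeeping asked about here can be sketched in a few lines of
Python. This is only an illustration of the idea, not how the 2.4 IDE
layer works: TimeoutOnce, flush, and max_tries are hypothetical names,
and the "device" is an in-memory stand-in whose first write to a listed
block times out.

```python
from collections import deque


class TimeoutOnce:
    """Hypothetical device: the first write to each listed block times out."""

    def __init__(self, flaky):
        self.flaky = set(flaky)
        self.blocks = {}

    def write(self, n, data):
        if n in self.flaky:
            self.flaky.discard(n)
            raise TimeoutError(f"DMA timeout on block {n}")
        self.blocks[n] = data


def flush(device, pending, max_tries=3):
    """Issue queued writes in order; keep each one until it has completed."""
    while pending:
        n, data = pending[0]      # still recorded as uncompleted
        for _ in range(max_tries):
            try:
                device.write(n, data)
                break
            except TimeoutError:
                continue          # the request is retained, so retry it
        else:
            return False          # give up; later writes are not issued
        pending.popleft()         # completed, safe to forget
    return True


dev = TimeoutOnce(flaky={7})
queue = deque([(7, "a"), (8, "b")])
ok = flush(dev, queue)
```

The request stays at the head of the queue until the device accepts it,
so a timeout loses nothing as long as a retry eventually succeeds.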
If it can't try again, shouldn't the kernel at least abort after one
disk-write failure instead of performing additional writes, which
frequently depend on the previous writes? (E.g., suppose I read
block 1's data and write it to block 2, and then write something new
to block 1. If the first write fails but I continue and do the second
write, data gets destroyed. If the first write fails and I stop right
away, less is destroyed.)
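
A minimal sketch of that two-block scenario, in Python rather than
kernel code. FakeDisk, move_then_overwrite, and stop_on_failure are
hypothetical names made up for the illustration, and the failing write
is simulated rather than caused by a real DMA timeout.

```python
class WriteError(Exception):
    pass


class FakeDisk:
    """Hypothetical in-memory disk; writes to blocks in `failing` are lost."""

    def __init__(self, blocks, failing=()):
        self.blocks = dict(blocks)
        self.failing = set(failing)

    def read(self, n):
        return self.blocks[n]

    def write(self, n, data):
        if n in self.failing:
            raise WriteError(f"write to block {n} failed")
        self.blocks[n] = data


def move_then_overwrite(disk, stop_on_failure):
    """Copy block 1 to block 2, then put new data in block 1."""
    old = disk.read(1)
    try:
        disk.write(2, old)
    except WriteError:
        if stop_on_failure:
            return False          # abort: block 1 still holds the old data
        # otherwise ignore the error and carry on with the dependent write
    disk.write(1, "new")
    return True


def old_data_survives(disk):
    return "old" in disk.blocks.values()


careless = FakeDisk({1: "old", 2: "empty"}, failing={2})
move_then_overwrite(careless, stop_on_failure=False)

careful = FakeDisk({1: "old", 2: "empty"}, failing={2})
move_then_overwrite(careful, stop_on_failure=True)
```

With stop_on_failure=False the old contents of block 1 exist nowhere
afterwards; with stop_on_failure=True they are still in block 1.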
Daniel
--
Daniel Barclay
dsb@smart.net