From: "Daniel B." <dsb@smart.net>
To: unlisted-recipients:; (no To-header on input)
Cc: linux-kernel@vger.kernel.org
Subject: Re: IDE DMA errors, massive disk corruption: Why? Fixed Yet? Why not re-do failed op?
Date: Tue, 07 Oct 2003 09:46:26 -0400
Message-ID: <3F82C3B2.A764EA6D@smart.net>
In-Reply-To: Pine.LNX.4.44.0310071250450.3492-100000@gatemaster.ivimey.org
Ruth Ivimey-Cook wrote:
>
> On Tue, 7 Oct 2003, Valdis.Kletnieks@vt.edu wrote:
> ...
> But surely what Daniel is complaining about is that the disk never did ack the
> bus transfer.
Yes, much closer to what I meant.
Actually, I'm talking about when the kernel doesn't receive the
acknowledgment (the DMA interrupt).
>
> Consider this as a correct sequence of operations (hope I get it right:-) :
>
> 1. Kernel uses IDE controller to initiate ATA disk write request:
...
> 2. IDE controller DMA used to transfer data to disk unit:
...
> 3. Transfer complete actions: when the required number of words are acked:
> a. IDE DMA controller fires end-of-transfer IRQ
> b. ...
>
> 4. Kernel sees end of transfer IRQ and initiates software ACK of transfer,
> e.g. to remove DMA buffer from 'block dirty' list.
>
> 5. If caching enabled, some time later the data in the drive is written to
> the platter.
>
> Now, the case I believe Daniel is complaining about is that things go well
> through step 1 and perhaps some part of step 2. But, because the drive doesn't
> accept the data or some other error, step 3 doesn't happen.
Actually, I think I'm talking about the very beginning of step 4--
the interrupt request doesn't actually make it to the kernel ("interrupt
lost"?), so the kernel doesn't see the interrupt request.
(I've been assuming that the errors I'm getting are just from DMA
interrupt problems, that is, the drive accepted the data just fine, but
the kernel didn't see the acknowledgement. Of course, it's not clear
how that in itself would cause corruption, so I don't know for sure
that drive-side errors or rejections aren't involved.)
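
To make that assumption concrete, here is a toy model of the case I have in mind. It is only a sketch, not real kernel or driver code, and every name in it (ToyDrive, ToyKernel, irq_delivered, and so on) is made up for illustration:

```python
# Toy model of a single-sector DMA write; not real kernel or driver code.
# All class and attribute names here are hypothetical.


class ToyDrive:
    def __init__(self):
        self.sectors = {}  # data the drive has accepted, keyed by sector

    def dma_write(self, lba, data):
        self.sectors[lba] = data  # the drive accepts the data just fine
        return True               # ...and raises its completion interrupt


class ToyKernel:
    def __init__(self, drive, irq_delivered=True):
        self.drive = drive
        self.irq_delivered = irq_delivered  # False models a "lost" interrupt
        self.dirty = {}                     # buffers not yet acknowledged

    def write(self, lba, data):
        self.dirty[lba] = data
        irq_raised = self.drive.dma_write(lba, data)
        if irq_raised and self.irq_delivered:
            del self.dirty[lba]  # software ACK of the transfer (step 4)
            return "acked"
        return "timeout"         # the kernel never saw the interrupt


drive = ToyDrive()
kernel = ToyKernel(drive, irq_delivered=False)
result = kernel.write(7, b"payload")
```

In this model the drive ends up holding the correct data and the kernel still has the buffer marked dirty, so the lost interrupt by itself loses nothing. That is why I don't see how it alone would cause corruption; something further (on the drive side or in the recovery path) would have to go wrong.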
> Consequently, the
> IDE DMA timeout happens, the kernel cries foul and things go wrong. So the
> failure actually looks like this:
>
> 1. Kernel uses IDE controller to initiate ATA disk write request:
> a. Kernel sets up DMA parameters (start, length, timeout)
> b. kernel initiates transfer of 1 sector to disk
> c. (in parallel with b) drive accepts transfer request and waits for data
>
> 2. IDE controller DMA used to transfer data to disk unit:
> a. hardware DMA tries to send 256 16-bit words of data to disk
> b. (in parallel) drive accepts none or, perhaps, some data from bus into
> internal buffer, but not all of it.
>
> 3. After waiting, IDE controller fires DMA timeout IRQ.
>
> 4. Kernel sees IRQ and emits warning message. Tries to reset bus and ....
Actually, I'm thinking of the case where the interrupt request doesn't
make it to the kernel, so the kernel _doesn't_ see any IRQ in the expected
time (and proceeds as you say).
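
Roughly, the kernel-side view I mean looks like the sketch below. Again this is a toy model with made-up names and arbitrary "ticks", not how the IDE driver is actually written:

```python
# Toy model of the kernel-side wait; not real driver code.
# Times are in arbitrary ticks; all names are hypothetical.


def wait_for_dma_irq(irq_arrival_tick, timeout_ticks):
    """Return what the kernel concludes after waiting up to timeout_ticks.

    irq_arrival_tick is the tick at which the kernel sees the interrupt,
    or None if the interrupt request never reaches the kernel.
    """
    for tick in range(timeout_ticks):
        if irq_arrival_tick is not None and irq_arrival_tick == tick:
            return "transfer acknowledged"
    # No interrupt was seen in the expected time, whether or not the
    # drive actually completed the transfer.
    return "dma timeout: warn and reset bus"
```

The point of the sketch is that the timeout branch is reached the same way whether the drive never finished or the drive finished and the interrupt request simply never arrived; the wait alone cannot tell the two apart.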
> Have I got this scenario right?
Just about.
(Also, thanks for the DMA details.)
Daniel
--
Daniel Barclay
dsb@smart.net