From: "Daniel B." <dsb@smart.net>
To: unlisted-recipients:; (no To-header on input)
Cc: linux-kernel@vger.kernel.org
Subject: Re: IDE DMA errors, massive disk corruption: Why? Fixed Yet? Why not re-do failed op?
Date: Tue, 07 Oct 2003 09:46:26 -0400
Message-ID: <3F82C3B2.A764EA6D@smart.net>
In-Reply-To: Pine.LNX.4.44.0310071250450.3492-100000@gatemaster.ivimey.org

Ruth Ivimey-Cook wrote:
> 
> On Tue, 7 Oct 2003, Valdis.Kletnieks@vt.edu wrote:
> ...
> But surely what Daniel is complaining about is that the disk never did ack the
> bus transfer.

Yes, much closer to what I meant.  

Actually, I'm talking about when the kernel doesn't receive the 
acknowledgment (the DMA interrupt).


> 
> Consider this as a correct sequence of operations (hope I get it right:-) :
> 
> 1.   Kernel uses IDE controller to initiate ATA disk write request:
...
> 2.   IDE controller DMA used to transfer data to disk unit:
...
> 3.   Transfer complete actions: when the required number of words are acked:
>      a. IDE DMA controller fires end-of-transfer IRQ
>      b. ...
> 
> 4.   Kernel sees end of transfer IRQ and initiates software ACK of transfer,
>      e.g. to remove DMA buffer from 'block dirty' list.
> 
> 5.   If caching enabled, some time later the data in the drive is written to
>      the platter.
> 
> Now, the case I believe Daniel is complaining about is that things go well
> through step 1 and perhaps some part of step 2. But, because the drive doesn't
> accept the data or some other error, step 3 doesn't happen. 

Actually, I think I'm talking about the very beginning of step 4--
the interrupt request doesn't actually make it to the kernel ("interrupt
lost"?), so the kernel doesn't see the interrupt request.

(I've been assuming that the errors I'm getting are just from DMA 
interrupt problems, that is, the drive accepted the data just fine, but 
the kernel didn't see the acknowledgment.  Of course, it's not clear
how that in itself would cause corruption, so I don't know for sure
that drive-side errors or rejections aren't involved.)
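
To make the ambiguity concrete, here is a toy model in Python. It is not
kernel code; the Transfer type and kernel_view() are made up purely for
illustration. It assumes the end-of-transfer IRQ is raised only after the
drive has acked all the data, and that the kernel's only information is
whether that IRQ arrived before the timer expired.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    # Hypothetical flags describing what really happened on the hardware side.
    drive_accepted_all_data: bool  # step 2 finished on the drive side
    irq_delivered: bool            # the end-of-transfer IRQ reached the kernel


def kernel_view(t: Transfer) -> str:
    """Return what the kernel can observe once the DMA timer has expired."""
    # The IRQ is only raised after all words were acked, and the kernel
    # only sees it if it is actually delivered.
    saw_irq = t.drive_accepted_all_data and t.irq_delivered
    return "completed" if saw_irq else "dma timeout"


normal = Transfer(drive_accepted_all_data=True, irq_delivered=True)
drive_stalled = Transfer(drive_accepted_all_data=False, irq_delivered=False)
irq_lost = Transfer(drive_accepted_all_data=True, irq_delivered=False)

for name, t in [("normal", normal),
                ("drive stalled", drive_stalled),
                ("irq lost", irq_lost)]:
    print(f"{name}: {kernel_view(t)}")
```

In this model the "drive stalled" and "irq lost" cases are
indistinguishable from the kernel's side: both show up only as a timeout.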


> Consequently, the
> IDE DMA timeout happens, the kernel cries foul and things go wrong. So the
> failure actually looks like this:
> 
> 1.   Kernel uses IDE controller to initiate ATA disk write request:
>      a. Kernel sets up DMA parameters (start, length, timeout)
>      b. kernel initiates transfer of 1 sector to disk
>      c. (in parallel with b) drive accepts transfer request and waits for data
> 
> 2.   IDE controller DMA used to transfer data to disk unit:
>      a. hardware DMA tries to send 256 16-bit words of data to disk
>      b. (in parallel) drive accepts none or, perhaps, some data from bus into
>         internal buffer, but not all of it.
> 
> 3.   After waiting, IDE controller fires DMA timeout IRQ.
> 
> 4.   Kernel sees IRQ and emits warning message. Tries to reset bus and ....

Actually, I'm thinking of the case where the interrupt request doesn't
make it to the kernel, so the kernel _doesn't_ see any IRQ in the expected 
time (and proceeds as you say).
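
One way the two cases might be told apart after the timeout is by looking
at the drive's status register. The sketch below is only a toy
classifier, not what the Linux IDE driver does: classify_timeout() and
its return strings are hypothetical, and only the BSY, DRQ and ERR bit
values are taken from the ATA status register layout. It also says
nothing about whether it would be safe to treat the write as done.

```python
# ATA status register bits (values from the ATA status register layout).
BSY = 0x80  # drive is busy
DRQ = 0x08  # drive is ready to transfer data
ERR = 0x01  # previous command ended in error


def classify_timeout(status: int) -> str:
    """Guess why no IRQ was seen, given the status byte read after the timeout."""
    if status & BSY:
        # While BSY is set, the other bits are not meaningful.
        return "drive still busy"
    if status & ERR:
        return "drive reported error"
    if status & DRQ:
        return "drive still waiting for data"
    # Not busy, no error, no pending data request: the drive appears to
    # have finished, so the interrupt itself may have been lost.
    return "drive idle: interrupt probably lost"


for status in (0x50, 0x80, 0x51, 0x58):
    print(f"status 0x{status:02x}: {classify_timeout(status)}")
```

Only the last outcome matches the lost-interrupt case I have in mind; the
others would point to a drive-side problem instead.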


> Have I got this scenario right?

Just about.

(Also, thanks for the DMA details.)



Daniel
-- 
Daniel Barclay
dsb@smart.net
