From: Alan Cox <alan@lxorguk.ukuu.org.uk>
To: John Treubig <jtreubig@hotmail.com>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-ide@vger.kernel.org
Subject: Re: ATA Write Error and Time-out Notification in User Space
Date: Tue, 20 Dec 2005 22:50:35 +0000
Message-ID: <1135119036.25010.21.camel@localhost.localdomain>
In-Reply-To: <BAY101-F33B48301330A7FFF7849A4DF3E0@phx.gbl>

On Tue, 2005-12-20 at 15:55 -0600, John Treubig wrote:
> Where would I look in the LibATA/SCSI chain to permit Write Error and 
> Time-out notification to be passed back to user space without hanging the 
> system?


Some background first:

The 2.6 block layer can generally handle passing errors back up. It has
a load of problems with EOF on media that are block devices but variable
in size, and those need fixing, but the fundamental errors do get back
to the block layer.

Unfortunately, although they get back to the block request, the full
error is not propagated further up the stack. That's actually tricky for
the general file system case, as I/O is asynchronous to the actual file
system accesses and we may even hit errors on pages we didn't actually
need.

One result of that is that on write errors we generally mark a volume
offline and processes accessing it get stuck.
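
To illustrate the asynchronous point from the user space side: write()
normally just fills the page cache, so the earliest place an application
can expect to hear about a failed write is fsync() (or close()). A
minimal sketch follows. The helper name write_and_sync() is made up for
this example, and whether a given media error really surfaces as EIO
here depends on the kernel version, the file system and the driver. If
the volume has gone offline, the call may simply block instead.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and force them out to
 * the device. Returns 0 on success or a negative errno value.
 */
int write_and_sync(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int err = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -errno;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            err = -errno;
            break;
        }
        if (n == 0) {           /* no progress, treat as an I/O error */
            err = -EIO;
            break;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Deferred write errors, if reported at all, tend to show up here. */
    if (err == 0 && fsync(fd) < 0)
        err = -errno;

    if (close(fd) < 0 && err == 0)
        err = -errno;

    return err;
}
```

A caller would check for a negative return and compare it against -EIO
to tell a media error apart from, say, -ENOSPC.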

> drives.  When a SCSI disks report errors, the SCSI handlers perform as 
> expected, reporting the error and recovering.  When ATA drives report 
> errors, only read errors recover and we are able to capture the error.  
> Write and time-out errors hang the system.

The problem with the file system layer at this point is it isn't clear
how you get the device back. What you should see is a sequence of
retries and then the volume going offline.
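
One way to check from user space whether the volume really did go
offline is to read the SCSI device state from sysfs. This is only a
sketch: the helper name device_is_offline() is made up, and the path in
the usage note assumes the disk shows up as sda and that your kernel
exposes the "state" attribute for it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the state file reads "offline",
 * 0 for any other state, and -1 if the file cannot be read.
 */
int device_is_offline(const char *state_path)
{
    char state[32];
    FILE *f = fopen(state_path, "r");

    if (f == NULL)
        return -1;
    if (fgets(state, sizeof(state), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    state[strcspn(state, "\n")] = '\0';     /* strip trailing newline */
    return strcmp(state, "offline") == 0;
}
```

For example, device_is_offline("/sys/block/sda/device/state") could be
polled from a monitoring process that is not itself doing I/O to the
disk, so it does not get stuck along with the processes that are.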

I don't know how complete your log is, but it doesn't end with the
expected 'giving up' and volume offlining. Is that because the final
messages don't hit the log, or are they just not seen? The Promise
devices have some "interesting" behaviour when you reset the chip.

There is a second problem with PATA too. If the drive decides to keel
over while asserting IORDY, it's game over. The bus transactions will
hang and the CPU gets stuck. That would _not_ be my first suspicion,
however.




