linux-ide.vger.kernel.org archive mirror
* Overagressive failing of disk reads, both LIBATA and IDE
From: Norman Diamond @ 2009-03-20  2:12 UTC
  To: linux-kernel, linux-ide

For months I was wondering how a disk could do this:
dd if=/dev/hda of=/dev/null bs=512 skip=551540 count=4  # succeeds
dd if=/dev/hda of=/dev/null bs=512 skip=551544 count=4  # succeeds
dd if=/dev/hda of=/dev/null bs=512 skip=551540 count=8  # fails

It turns out the disk isn't doing that.  Linux is.  The old IDE drivers did
it, and with LIBATA the same thing happens to /dev/sda.  The later examples
below also behave the same way on /dev/sda as on /dev/hda.

Here's what the disk is really responsible for:
dd if=/dev/hda of=/dev/null bs=512 skip=551562 count=1  # really fails

Here's Linux to blame again:
dd if=/dev/hda of=/dev/null bs=512 skip=551561 count=1  # fails

When the drive reports an uncorrectable media error, Linux correctly records
it in the log.  But when the app didn't ask for the bad block, and every
block the app did ask for was read, Linux incorrectly reports failure to the
app.
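
One way to separate the two cases is to probe sectors one at a time while
bypassing the page cache, so that each request covers only the sector asked
for.  The following is a rough sketch, not a tested recovery tool:
scan_sectors and DD_IFLAG are made-up names, and it assumes GNU dd, whose
iflag=direct opens the input with O_DIRECT.

```shell
#!/bin/sh
# Hypothetical helper: read sectors one by one and print the numbers of
# those that fail.  DD_IFLAG defaults to "direct" (GNU dd, O_DIRECT);
# set it to the empty string to go through the page cache instead.
scan_sectors() {
    dev=$1
    s=$2
    last=$(($2 + $3 - 1))
    flag=${DD_IFLAG-direct}
    while [ "$s" -le "$last" ]; do
        # One 512-byte sector per dd invocation.
        if ! dd if="$dev" of=/dev/null bs=512 skip="$s" count=1 \
                ${flag:+iflag="$flag"} 2>/dev/null; then
            echo "$s"
        fi
        s=$((s + 1))
    done
}

# Example (read-only): scan_sectors /dev/hda 551556 12
```

If direct I/O behaves as assumed here, only sectors the drive itself cannot
read should be listed, and their healthy neighbours should not be.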

I don't know how Linux decides how many blocks to read ahead, but no matter
how many it chooses, read ahead is read ahead.  Go ahead and record the
error in the log.  I'd also like to suggest that if a user is logged in on
the screen (whether X11 or text), we try to warn them that their disk is
dying.  But don't return a failure to the app.  If the blocks that the app
asked for were read, we should give them to the app, successfully.

Sheesh.

P.S.
One would expect this to persuade the hard drive to relocate the block:
dd if=/dev/zero of=/dev/hda bs=512 seek=551562 count=1
But it doesn't, because Linux wants to read 4 blocks, modify 1, and write 4
blocks.  The read fails.
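
The arithmetic behind this can be sketched as follows, assuming the kernel
works in aligned groups of 4 sectors, as the read-modify-write described
above suggests.  The group size may well differ on other setups, and
aligned_group is a made-up name.

```shell
#!/bin/sh
# Print the first and last sector of the aligned group that contains a
# given sector.  The group size (4 here) is an assumption taken from the
# read-modify-write behaviour described above.
aligned_group() {
    sector=$1
    group_size=$2
    first=$((sector / group_size * group_size))
    last=$((first + group_size - 1))
    echo "$first $last"
}

aligned_group 551562 4   # the bad sector:  prints "551560 551563"
aligned_group 551561 4   # its neighbour:   prints "551560 551563"
```

Under that assumption, a one-sector read or write of 551561 touches the same
group as bad sector 551562, and seek=551560 count=4 covers the group exactly.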

One would expect this, too, to persuade the hard drive to relocate the block:
dd if=/dev/zero of=/dev/hda bs=512 seek=551560 count=4
But it doesn't: the hard drive reports success, yet if an app tries to
read the bad sector again it still fails.  So the drive has egregiously bad
firmware.  That doesn't excuse Linux.


Thread overview: 12+ messages
2009-03-20  2:12 Overagressive failing of disk reads, both LIBATA and IDE Norman Diamond
2009-03-20  3:32 ` Mark Lord
2009-03-20 10:00   ` Andrew Morton
2009-03-20 13:09     ` Mark Lord
2009-03-21 14:22   ` James Bottomley
2009-03-21 14:55     ` Mark Lord
2009-03-21 15:01       ` Mark Lord
2009-03-21 15:08       ` James Bottomley
2009-03-21 15:20         ` Mark Lord
2009-03-21 15:10     ` Alan Cox
2009-03-21 15:18       ` James Bottomley
2009-03-22  2:15     ` Norman Diamond