From: Ric Wheeler <ric@emc.com>
To: Mark Lord <liml@rtr.ca>
Cc: Tejun Heo <htejun@gmail.com>, jeff@garzik.org,
    linux-ide@vger.kernel.org, alan@lxorguk.ukuu.org.uk
Subject: Re: [PATCHSET #upstream] libata: improve FLUSH error handling
Date: Thu, 27 Mar 2008 14:01:11 -0400
Message-ID: <47EBE0E7.9070205@emc.com>
In-Reply-To: <47EBB09F.9070607@rtr.ca>
Mark Lord wrote:
> Mark Lord wrote:
> ..
>> Absolute theoretical worst case for a drive with a buffer 4X the largest
>> current size: 328 seconds. Not taking into account having bad-sector
>> retries for each of those I/O blocks, but *nobody* is going to wait
>> that long anyway. They'll have long since pulled the power cord or
>> reached for the BIG RED BUTTON.
> ..
>
> Speaking of which.. these are all WRITEs.
>
> In 18 years of IDE/ATA development,
> I have *never* seen a hard disk drive report a WRITE error.
I have seen them in the wild.
>
> Which makes sense, if you think about it -- it's rewriting the sector
> with new ECC info, so it *should* succeed. The only case where it won't,
> is if the sector has been marked as "bad" internally, and the drive is
> too dumb to try anyways after it runs out of remap space.
>
> In which case we've already lost data, and taking more than a hundred
> and twenty seconds isn't going to make a serious difference.
You can definitely start failing writes once the drive's remapped sector table
is exhausted, but to your point, a drive in that state is usually in bad shape
already.

That makes it all the more important to fail quickly, so that we don't hang
waiting on a drive that is most likely on its last legs...
>
> Mmm.. anyone got a spare modern-ish drive to risk destroying?
> Say, one of the few still-functioning DeathStars, or a buggy-NCQ Maxtor?
>
> If so, it might be fun to try and produce a no-more-remaps scenario on it.
> One could use "hdparm --make-bad-sector" to corrupt a few hundred/thousand
> sectors in a row (sequentially numbered).
I don't think that this will do it. What I believe happens with this kind of
sector corruption is that we corrupt the data integrity bits around the sector.
Once we write, the original sector is repaired, since the drive overwrites the
junk bits we gave it. So the remapped sector count should not grow (but it is
worth checking to verify my theory ;-)).
You have my blessing to be mean to a drive that you got from me if that helps ;-)
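
One way to check that theory would be to read the reallocated sector count from SMART before and after the experiment and compare. A minimal sketch, assuming smartmontools is installed, that the drive reports SMART attribute 5 under the usual name Reallocated_Sector_Ct, and that its raw value is a plain integer in the last column of "smartctl -A" output (none of this comes from the thread, and the function names are made up for illustration):

```python
import subprocess
from typing import Optional


def parse_reallocated_count(smart_output: str) -> Optional[int]:
    """Return the raw value of SMART attribute 5, or None if it is absent.

    Assumes the usual ten-column attribute table printed by "smartctl -A",
    where the raw value is the last column.
    """
    for line in smart_output.splitlines():
        fields = line.split()
        if (len(fields) >= 10
                and fields[0] == "5"
                and fields[1] == "Reallocated_Sector_Ct"):
            return int(fields[9])
    return None


def read_reallocated_count(device: str) -> Optional[int]:
    """Run "smartctl -A" on the device and parse attribute 5 from its output."""
    # smartctl uses its exit status as a bitmask of conditions, so a
    # non-zero status is not treated as a failure here.
    proc = subprocess.run(["smartctl", "-A", device],
                          capture_output=True, text=True, check=False)
    return parse_reallocated_count(proc.stdout)
```

Calling read_reallocated_count() on the test drive before and after the corrupt/read/write cycle and comparing the two numbers would show whether the remap count moved at all.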
>
> Then loop and attempt to read from them individually with
> "hdparm --read-sector" (should fail on all, but it might force the
> drive to remap them).
Again, I don't think that reads will ever force a remap.
>
> Then finally try and write back to them with "hdparm --write-sector",
> and see if a WRITE ERROR is ever reported. Maybe time the individual
> WRITEs to see if any of them take more than a few milliseconds.
>
> Perhaps try this whole thing with/without the write cache enabled.
>
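
The three steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: hdparm wants --yes-i-know-what-i-am-doing alongside the destructive --make-bad-sector and --write-sector options, a failed command shows up as a non-zero exit status, and the helper names (hdparm_cmd, timed, run_experiment) are hypothetical. It destroys data on the sectors it touches, so only point it at a drive you are prepared to lose.

```python
import subprocess
import time

# Confirmation flag hdparm expects for its destructive sector options
# (assumption based on the hdparm man page, not on this thread).
CONFIRM = "--yes-i-know-what-i-am-doing"
ACTIONS = ("make-bad-sector", "read-sector", "write-sector")


def hdparm_cmd(action, sector, device):
    """Build the hdparm argument list for one sector operation."""
    if action not in ACTIONS:
        raise ValueError("unsupported action: " + action)
    cmd = ["hdparm"]
    if action != "read-sector":
        cmd.append(CONFIRM)
    cmd += ["--" + action, str(sector), device]
    return cmd


def timed(cmd, runner=subprocess.run):
    """Run one command; return (exit status, elapsed seconds)."""
    start = time.monotonic()
    result = runner(cmd, capture_output=True, text=True)
    return result.returncode, time.monotonic() - start


def run_experiment(device, first_sector, count, runner=subprocess.run):
    """Corrupt a run of sequential sectors, read each one, then write each one.

    Returns a list of (sector, exit status, seconds) for the WRITE step,
    which is the step we want to inspect for errors and long delays.
    """
    sectors = range(first_sector, first_sector + count)
    for sector in sectors:
        timed(hdparm_cmd("make-bad-sector", sector, device), runner)
    for sector in sectors:
        timed(hdparm_cmd("read-sector", sector, device), runner)
    writes = []
    for sector in sectors:
        status, seconds = timed(hdparm_cmd("write-sector", sector, device), runner)
        writes.append((sector, status, seconds))
    return writes
```

Any entry in the returned list with a non-zero status, or with an elapsed time well beyond a few milliseconds, would be the interesting case. The runner parameter is there so the loop can be dry-run with a stand-in function before it is ever pointed at real hardware.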
> Mmm...
>
> Cheers
ric