From: Ric Wheeler <ric@emc.com>
To: Mark Lord <liml@rtr.ca>
Cc: Tejun Heo <htejun@gmail.com>,
	jeff@garzik.org, linux-ide@vger.kernel.org,
	alan@lxorguk.ukuu.org.uk
Subject: Re: [PATCHSET #upstream] libata: improve FLUSH error handling
Date: Fri, 28 Mar 2008 09:36:58 -0400
Message-ID: <47ECF47A.2040508@emc.com>
In-Reply-To: <47EC58F6.3070601@rtr.ca>

Mark Lord wrote:
> Tejun Heo wrote:
>> Hello, Mark.
>>
>> Mark Lord wrote:
>>> Speaking of which.. these are all WRITEs.
>>>
>>> In 18 years of IDE/ATA development,
>>> I have *never* seen a hard disk drive report a WRITE error.
>>>
>>> Which makes sense, if you think about it -- it's rewriting the sector
>>> with new ECC info, so it *should* succeed.  The only case where it won't,
>>> is if the sector has been marked as "bad" internally, and the drive is
>>> too dumb to try anyways after it runs out of remap space.
>>>
>>> In which case we've already lost data, and taking more than a hundred
>>> and twenty seconds isn't going to make a serious difference.
>>
>> Yeah, the disk must be knee deep in shit to report WRITE failure.  I
>> don't really expect the code to be exercised often but was mainly trying
>> to fill the loophole in libata error handling as this type of behavior is
>> what the spec requires on FLUSH errors.
>>
>> I didn't add global timeout because retries are done iff the drive is
>> reporting progress.
>>
>> 1. Drives genuinely deep in shit and getting lots of WRITE errors would
>> report different sectors on each FLUSH and we NEED to keep retrying.
>> That's what the spec requires and the FLUSH could be from shutdown and
>> if so that would be the drive's last chance to write data to the drive.
>>
>> 2. There are other issues causing the command to fail (e.g. timeout, HSM
>> violation or somesuch).  This is the case where EH can take a really long
>> time if it keeps retrying, but the posted code doesn't retry in this case.
>>
>> 3. The drive is crazy and reporting errors for no good reason.  Unless
>> the drive is really anti-social and raises such an error condition only
>> after tens of seconds, this shouldn't take too long.  Also, if the LBA
>> doesn't change on each retry, the tries count is halved.
>>
>> So, I think the code should be safe.  Do you still think we need a
>> global timeout?  It is easy to add.  I'm just not sure whether we need
>> it or not.
> ..
> 
> With EH becoming more and more capable and complex,
> a global deadline for FLUSH looks like a reasonable thing.
> People who have no backups can leave it at the default "near-infinity"
> setting that is there now, and folks with RAID1 (or better) can set it to
> a much shorter number -- so that their system-recovery reboot doesn't take
> 3 hours to get past the FLUSH_CACHE on the failing drive.  :)
> 
> Cheers

I think that is a really important knob to have. Not just for RAID 
systems, but we use the FLUSH_CACHE on systems without barriers mainly 
when we power down & do the unmounts, etc.

If you hit a bad block during power down of a laptop, I can imagine that 
having a worst case of (30?) seconds is infinitely better than multiple 
minutes ;-)
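
To make the knob concrete, here is a minimal userspace sketch of the retry
policy Tejun describes above with a global deadline added on top. It is a
Python simulation, not libata code. Every name in it (FlushResult,
flush_with_retries, max_tries, deadline) is hypothetical, and the starting
budget of 8 tries, the rule that a retry reporting a new LBA costs one try,
and the accounting of elapsed time are assumptions made for illustration
only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class FlushResult:
    """Outcome of one simulated FLUSH attempt (hypothetical type)."""
    ok: bool
    # Set only when the drive itself reports a failed sector; None models
    # timeouts, HSM violations and other non-media failures.
    failed_lba: Optional[int] = None
    # Simulated seconds this attempt took.
    elapsed: float = 0.0


def flush_with_retries(
    issue_flush: Callable[[], FlushResult],
    max_tries: int = 8,               # assumed starting budget
    deadline: float = float("inf"),   # "near-infinity" default
) -> Tuple[str, int]:
    """Return (status, attempts); status is one of
    "ok", "failed", "deadline" or "gave_up"."""
    tries = max_tries
    spent = 0.0
    attempts = 0
    last_lba = None

    while tries > 0:
        res = issue_flush()
        attempts += 1
        spent += res.elapsed

        if res.ok:
            return ("ok", attempts)

        # Case 2 in the quoted list: no sector reported, so do not retry.
        if res.failed_lba is None:
            return ("failed", attempts)

        # The global deadline: stop once the total time budget is used up,
        # even if the drive still reports progress.
        if spent >= deadline:
            return ("deadline", attempts)

        if res.failed_lba == last_lba:
            # Case 3: same LBA again, no progress, halve what is left.
            tries //= 2
        else:
            # Case 1: a different LBA counts as progress; assumed to cost
            # a single try.
            tries -= 1
        last_lba = res.failed_lba

    return ("gave_up", attempts)
```

With the default deadline the loop is bounded only by the tries budget, which
matches the "near-infinity" setting. Passing something like deadline=30.0
caps the total time no matter how many different sectors the drive reports,
which is the behaviour I would want at power down.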

ric

Thread overview: 32+ messages
2008-03-27 10:14 [PATCHSET #upstream] libata: improve FLUSH error handling Tejun Heo
2008-03-27 10:14 ` [PATCH 1/4] libata: make ata_tf_to_lba[48]() generic Tejun Heo
2008-04-04  7:45   ` Jeff Garzik
2008-03-27 10:14 ` [PATCH 2/4] libata: implement ATA_QCFLAG_RETRY Tejun Heo
2008-03-27 10:14 ` [PATCH 3/4] libata: kill unused ata_flush_cache() Tejun Heo
2008-03-27 10:14 ` [PATCH 4/4] libata: improve FLUSH error handling Tejun Heo
2008-04-04  7:46   ` Jeff Garzik
2008-03-27 10:23 ` Debug patch to induce errors on FLUSH Tejun Heo
2008-03-27 14:24 ` [PATCHSET #upstream] libata: improve FLUSH error handling Mark Lord
2008-03-27 14:35   ` Mark Lord
2008-03-27 15:31     ` Alan Cox
2008-03-27 18:01     ` Ric Wheeler
2008-03-28  1:57     ` Tejun Heo
2008-03-28  2:33       ` Mark Lord
2008-03-28 13:36         ` Ric Wheeler [this message]
2008-03-28 14:52           ` Tejun Heo
2008-03-28 14:53             ` Ric Wheeler
2008-03-28 15:16               ` Alan Cox
2008-03-28 16:57                 ` Ric Wheeler
2008-03-28 16:04             ` Mark Lord
2008-03-27 17:53   ` Ric Wheeler
2008-03-27 18:52     ` Jeff Garzik
2008-03-27 20:23       ` Ric Wheeler
2008-03-28  7:46   ` Andi Kleen
2008-03-28  8:30     ` Tejun Heo
2008-03-28  8:48       ` Andi Kleen
2008-03-28  8:53         ` Tejun Heo
2008-03-27 17:51 ` Ric Wheeler
2008-03-27 18:53   ` Jeff Garzik
2008-03-27 22:00   ` Alan Cox
2008-03-28  2:02   ` Tejun Heo
2008-03-28  9:48     ` Alan Cox
