Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

linux-ide.vger.kernel.org archive mirror

From: Ric Wheeler <ric@emc.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mark Lord <liml@rtr.ca>,
	linux-kernel@vger.kernel.org,
	IDE/ATA development list <linux-ide@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR
Date: Fri, 02 Feb 2007 07:20:15 -0500
Message-ID: <45C32C7F.9050706@emc.com>
In-Reply-To: <1170366920.3388.62.camel@mulgrave.il.steeleye.com>



James Bottomley wrote:

>On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote:
>  
>
>>I believe you made the first change in response to my prodding at the time,
>>when libata was not returning valid sense data (no LBA) for media errors.
>>The SCSI EH handling of that was rather poor at the time,
>>and so having it not retry the remaining sectors was actually
>>a very good fix at the time.
>>
>>But now, libata *does* return valid sense data for LBA/DMA drives,
>>and the workaround from circa 2.6.16 is no longer the best we can do.
>>Now that we know which sector failed, we ought to be able to skip
>>over it, and continue with the rest of the merged request.
>>    
>>
>
>We can ... the big concern with your approach, which you haven't
>addressed, is the time factor.  For most SCSI devices, returning a fatal
>MEDIUM ERROR means we're out of remapping table, and also that there's
>probably a bunch of sectors on the track that are now out.  Thus, there
>are almost always multiple sector failures.  In Linux, the average
>request size on a filesystem is around 64-128 KB; that's 128-256 sectors.
>If we fail at the initial sector, we have to go through another 128-256
>attempts, with the internal device retries, before we fail the entire
>request.  Some devices can take a second or so for each read before they
>finally give up and decide they really can't read the sector, so you're
>looking at 2-5 minutes before the machine finally fails this one
>request ... and much worse for devices that retry more times.
>  
>

This is not the case on a read error - we commonly see transient errors 
on reads from disks.  What we push our vendors to do is to keep the 
"worst case" response down to tens of seconds while the device retries 
internally.  When they take that long (and they do), adding retries up 
the stack can translate into minutes per sector.
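
A rough back-of-the-envelope sketch of the timing argument, for illustration 
only. It assumes 512-byte sectors, that every remaining sector in the request 
is bad, and that each failed read costs a fixed time. The function names, the 
30-second device figure and the retry count of 5 are hypothetical and not 
taken from any driver.

```python
SECTOR_BYTES = 512  # assumed sector size


def sectors_in_request(request_kb):
    """Number of sectors in a request of the given size in KB."""
    return request_kb * 1024 // SECTOR_BYTES


def worst_case_seconds(bad_sectors, device_seconds_per_sector, stack_retries=1):
    """Upper bound on the time spent before the whole request is failed.

    Each bad sector costs the device's internal give-up time, multiplied
    by however many times the layers above re-issue the read.
    """
    return bad_sectors * device_seconds_per_sector * stack_retries


if __name__ == "__main__":
    # The quoted figures: 64-128 KB requests, about one second per failed read.
    for kb in (64, 128):
        n = sectors_in_request(kb)
        minutes = worst_case_seconds(n, 1.0) / 60
        print(f"{kb} KB = {n} sectors, about {minutes:.1f} minutes")

    # A device that takes tens of seconds, with retries added up the stack
    # (30 s and 5 retries are made-up illustrative numbers).
    per_sector = worst_case_seconds(1, 30.0, stack_retries=5)
    print(f"one sector: about {per_sector / 60:.1f} minutes")
```

With these assumptions the per-sector cost, not the request size, dominates 
as soon as the device itself takes tens of seconds to give up.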

The interesting point of this question is about the typical pattern of 
IO errors. On a read, it is safe to assume that you will have issues 
with some bounded number of adjacent sectors.

>  
>
>>One thing that could be even better than the patch below,
>>would be to have it perhaps skip the entire bio that includes
>>the failed sector, rather than only the bad sector itself.
>>    
>>
>
>Er ... define "skip over the bio".  A bio is simply a block
>representation for a bunch of sg elements coming in to the elevator.
>Mostly what we see in SCSI is a single bio per request, so skipping the
>bio is really the current behaviour (to fail the rest of the request).
>  
>
This is really a tricky one - failing merged IO requests can produce 
really unpredictable behavior up at the application level, since the 
IO error might not be at all relevant to my part of the request.  
Merging can produce a request that is much larger than the region 
affected by any normal drive error.
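
To make the merged-request concern concrete, here is a simplified model, not 
kernel code. It assumes a merged request is just an ordered list of 
contiguous sector ranges, one per submitter. The `Bio` class, the `owner` 
label and the two policy names are hypothetical names invented for this 
sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bio:
    owner: str   # hypothetical label for whoever submitted this piece
    start: int   # first sector (LBA)
    count: int   # number of sectors

    @property
    def end(self):
        return self.start + self.count  # exclusive


def failed_owners(bios, bad_lba, policy):
    """Return the owners that see an IO error for one bad sector.

    "fail_rest": fail everything from the bad sector to the end of the
                 merged request.
    "skip_bio":  fail only the piece that contains the bad sector and
                 carry on with the rest.
    """
    failed = []
    for bio in bios:
        if policy == "fail_rest":
            hit = bio.end > bad_lba
        elif policy == "skip_bio":
            hit = bio.start <= bad_lba < bio.end
        else:
            raise ValueError(f"unknown policy: {policy}")
        if hit:
            failed.append(bio.owner)
    return failed


if __name__ == "__main__":
    merged = [Bio("A", 0, 8), Bio("B", 8, 8), Bio("C", 16, 8)]
    print(failed_owners(merged, bad_lba=9, policy="fail_rest"))
    print(failed_owners(merged, bad_lba=9, policy="skip_bio"))
```

In this model, failing the rest of the request reports an error to C even 
though the bad sector lies entirely inside B's range.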

I really like the idea of being able to set this kind of policy per 
drive, since what you want here depends on your system requirements and 
on what the system is trying to do (e.g., when trying to recover a 
failing but not yet dead disk, IO errors should come back as quickly as 
possible and we should choose an IO scheduler that does not combine IOs).

>  
>
>>I think doing that might address most concerns expressed here.
>>Have you got an alternate suggestion, James?
>>    
>>
>
>James
>  
>
