* [PATCH v4 0/3] differentiate between I/O errors
@ 2011-01-18 9:13 Hannes Reinecke
2011-01-18 9:13 ` [PATCH v4 1/3] scsi: Detailed " Hannes Reinecke
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Hannes Reinecke @ 2011-01-18 9:13 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-scsi, jaxboe, michaelc, agk, Hannes Reinecke
Changes in v4:
Included new error -EBADE to signal nexus errors.
The diffstat:
Hannes Reinecke (1):
scsi: Detailed I/O errors
Mike Snitzer (2):
dm mpath: propagate target errors immediately
block: improve detail in I/O error messages
block/blk-core.c | 23 ++++++++++++++++++++---
drivers/md/dm-mpath.c | 22 ++++++++++------------
drivers/scsi/scsi_error.c | 24 +++++++++++++++++-------
drivers/scsi/scsi_lib.c | 28 ++++++++++++++++++++++++++--
include/scsi/scsi.h | 5 +++++
5 files changed, 78 insertions(+), 24 deletions(-)
^ permalink raw reply [flat|nested] 12+ messages in thread

* [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-18 9:13 [PATCH v4 0/3] differentiate between I/O errors Hannes Reinecke
@ 2011-01-18 9:13 ` Hannes Reinecke
2011-01-18 11:33 ` Douglas Gilbert
2011-01-18 9:13 ` [PATCH v4 2/3] dm mpath: propagate target errors immediately Hannes Reinecke
2011-01-18 9:13 ` [PATCH v4 3/3] block: improve detail in I/O error messages Hannes Reinecke
2 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2011-01-18 9:13 UTC (permalink / raw)
To: James Bottomley
Cc: linux-scsi, jaxboe, michaelc, agk, Hannes Reinecke, Mike Snitzer

Instead of just passing 'EIO' for any I/O error we should be
notifying the upper layers with more details about the cause
of this error.

Update the possible I/O errors to:

- ENOLINK: Link failure between host and target
- EIO: Retryable I/O error
- EREMOTEIO: Non-retryable I/O error
- EBADE: I/O error restricted to the I_T_L nexus

'Retryable' in this context means that an I/O error _might_ be
restricted to the I_T_L nexus (vulgo: path), so retrying on another
nexus / path might succeed.

'Non-retryable' in general refers to a target failure, so this
error will always be generated regardless of the I_T_L nexus
it was sent on.

I/O errors restricted to the I_T_L nexus might be retried
on another nexus / path, but they should _not_ be queued
if no paths are available.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/scsi/scsi_error.c |   24 +++++++++++++++++-------
 drivers/scsi/scsi_lib.c   |   28 ++++++++++++++++++++++++++--
 include/scsi/scsi.h       |    5 +++++
 3 files changed, 48 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 45c7564..991de3c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -223,7 +223,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
  * @scmd:	Cmd to have sense checked.
  *
  * Return value:
- *	SUCCESS or FAILED or NEEDS_RETRY
+ *	SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR
  *
  * Notes:
  *	When a deferred error is detected the current command has
@@ -326,17 +326,19 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		 */
 		return SUCCESS;
 
-		/* these three are not supported */
+		/* these are not supported */
 	case COPY_ABORTED:
 	case VOLUME_OVERFLOW:
 	case MISCOMPARE:
-		return SUCCESS;
+	case BLANK_CHECK:
+	case DATA_PROTECT:
+		return TARGET_ERROR;
 
 	case MEDIUM_ERROR:
 		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
 		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
 		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-			return SUCCESS;
+			return TARGET_ERROR;
 		}
 		return NEEDS_RETRY;
 
@@ -344,11 +346,9 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (scmd->device->retry_hwerror)
 			return ADD_TO_MLQUEUE;
 		else
-			return SUCCESS;
+			return TARGET_ERROR;
 
 	case ILLEGAL_REQUEST:
-	case BLANK_CHECK:
-	case DATA_PROTECT:
 	default:
 		return SUCCESS;
 	}
@@ -787,6 +787,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 		case SUCCESS:
 		case NEEDS_RETRY:
 		case FAILED:
+		case TARGET_ERROR:
 			break;
 		case ADD_TO_MLQUEUE:
 			rtn = NEEDS_RETRY;
@@ -1469,6 +1470,14 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		rtn = scsi_check_sense(scmd);
 		if (rtn == NEEDS_RETRY)
 			goto maybe_retry;
+		else if (rtn == TARGET_ERROR) {
+			/*
+			 * Need to modify host byte to signal a
+			 * permanent target failure
+			 */
+			scmd->result |= (DID_TARGET_FAILURE << 16);
+			rtn = SUCCESS;
+		}
 		/* if rtn == FAILED, we have no sense information;
 		 * returning FAILED will wake the error handler thread
 		 * to collect the sense and redo the decide
@@ -1486,6 +1495,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	case RESERVATION_CONFLICT:
 		sdev_printk(KERN_INFO, scmd->device,
 			    "reservation conflict\n");
+		scmd->result |= (DID_NEXUS_FAILURE << 16);
 		return SUCCESS; /* causes immediate i/o error */
 	default:
 		return FAILED;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0ed7a66..91f1c13 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -667,6 +667,30 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
 }
 EXPORT_SYMBOL(scsi_release_buffers);
 
+static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
+{
+	int error = 0;
+
+	switch(host_byte(result)) {
+	case DID_TRANSPORT_FAILFAST:
+		error = -ENOLINK;
+		break;
+	case DID_TARGET_FAILURE:
+		cmd->result |= (DID_OK << 16);
+		error = -EREMOTEIO;
+		break;
+	case DID_NEXUS_FAILURE:
+		cmd->result |= (DID_OK << 16);
+		error = -EBADE;
+		break;
+	default:
+		error = -EIO;
+		break;
+	}
+
+	return error;
+}
+
 /*
  * Function:    scsi_io_completion()
  *
@@ -737,7 +761,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 				req->sense_len = len;
 			}
 			if (!sense_deferred)
-				error = -EIO;
+				error = __scsi_error_from_host_byte(cmd, result);
 		}
 
 		req->resid_len = scsi_get_resid(cmd);
@@ -796,7 +820,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
 		return;
 
-	error = -EIO;
+	error = __scsi_error_from_host_byte(cmd, result);
 
 	if (host_byte(result) == DID_RESET) {
 		/* Third party bus reset or reset for error recovery
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 1651fef..078a0ac 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -433,6 +433,10 @@ static inline int scsi_is_wlun(unsigned int lun)
 				 * recover the link. Transport class will
 				 * retry or fail IO */
 #define DID_TRANSPORT_FAILFAST	0x0f /* Transport class fastfailed the io */
+#define DID_TARGET_FAILURE 0x10 /* Permanent target failure, do not retry on
+				 * other paths */
+#define DID_NEXUS_FAILURE 0x11  /* Permanent nexus failure, retry on other
+				 * paths might yield different results */
 #define DRIVER_OK       0x00	/* Driver status */
 
 /*
@@ -462,6 +466,7 @@ static inline int scsi_is_wlun(unsigned int lun)
 #define TIMEOUT_ERROR   0x2007
 #define SCSI_RETURN_NOT_HANDLED   0x2008
 #define FAST_IO_FAIL	0x2009
+#define TARGET_ERROR    0x200A
 
 /*
  * Midlevel queue return values.
-- 
1.6.0.2
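The mapping introduced by patch 1/3 can be sketched outside the kernel as follows. This is a minimal userspace model, not the kernel code: `error_from_host_byte()` and `demo_mapping()` are hypothetical names, `host_byte()` is re-implemented here on the assumption that the host byte occupies bits 16..23 of the result word (which is what the `<< 16` shifts in the patch suggest), and the `DID_*` values are copied from the `include/scsi/scsi.h` hunk. It assumes a Linux `<errno.h>` that provides `EBADE` and `EREMOTEIO`.

```c
#include <assert.h>
#include <errno.h>

/* Host byte values copied from the patch's include/scsi/scsi.h hunk. */
#define DID_TRANSPORT_FAILFAST 0x0f
#define DID_TARGET_FAILURE     0x10
#define DID_NEXUS_FAILURE      0x11

/* Assumption: the host byte lives in bits 16..23 of the result word. */
int host_byte(int result)
{
	return (result >> 16) & 0xff;
}

/* Hypothetical userspace model of the host byte to errno mapping. */
int error_from_host_byte(int result)
{
	switch (host_byte(result)) {
	case DID_TRANSPORT_FAILFAST:
		return -ENOLINK;	/* link failure between host and target */
	case DID_TARGET_FAILURE:
		return -EREMOTEIO;	/* non-retryable target failure */
	case DID_NEXUS_FAILURE:
		return -EBADE;		/* failure restricted to the I_T_L nexus */
	default:
		return -EIO;		/* everything else stays retryable */
	}
}

/* Usage example: the low bits (status byte) do not affect the mapping. */
void demo_mapping(void)
{
	assert(error_from_host_byte((DID_NEXUS_FAILURE << 16) | 0x18) == -EBADE);
	assert(error_from_host_byte(0) == -EIO);
}
```

The model only covers the errno selection; the patch additionally touches `cmd->result` in the target and nexus cases, which is left out here.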
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-18 9:13 ` [PATCH v4 1/3] scsi: Detailed " Hannes Reinecke
@ 2011-01-18 11:33 ` Douglas Gilbert
2011-01-18 12:01 ` Hannes Reinecke
0 siblings, 1 reply; 12+ messages in thread
From: Douglas Gilbert @ 2011-01-18 11:33 UTC (permalink / raw)
To: Hannes Reinecke
Cc: James Bottomley, linux-scsi, jaxboe, michaelc, agk, Mike Snitzer

On 11-01-18 10:13 AM, Hannes Reinecke wrote:
> Instead of just passing 'EIO' for any I/O error we should be
> notifying the upper layers with more details about the cause
> of this error.
>
> Update the possible I/O errors to:
>
> - ENOLINK: Link failure between host and target
> - EIO: Retryable I/O error
> - EREMOTEIO: Non-retryable I/O error
> - EBADE: I/O error restricted to the I_T_L nexus
>
> 'Retryable' in this context means that an I/O error _might_ be
> restricted to the I_T_L nexus (vulgo: path), so retrying on another
> nexus / path might succeed.
>
> 'Non-retryable' in general refers to a target failure, so this
> error will always be generated regardless of the I_T_L nexus
> it was sent on.
>
> I/O errors restricted to the I_T_L nexus might be retried
> on another nexus / path, but they should _not_ be queued
> if no paths are available.

Hannes,
I don't know if it is applicable to this patch but with
SAS when the uplink from an expander is being stressed
(i.e. it temporarily doesn't have enough bandwidth) then
a sense key of ABORTED COMMAND may be generated. In my
experience retrying such a command succeeds.

BTW might "vulgo" be "ergo" [Latin: therefore]?

Doug Gilbert
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-18 11:33 ` Douglas Gilbert
@ 2011-01-18 12:01 ` Hannes Reinecke
2011-01-27 22:35 ` Mike Snitzer
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2011-01-18 12:01 UTC (permalink / raw)
To: dgilbert
Cc: James Bottomley, linux-scsi, jaxboe, michaelc, agk, Mike Snitzer

On 01/18/2011 12:33 PM, Douglas Gilbert wrote:
> On 11-01-18 10:13 AM, Hannes Reinecke wrote:
>> Instead of just passing 'EIO' for any I/O error we should be
>> notifying the upper layers with more details about the cause
>> of this error.
>>
>> Update the possible I/O errors to:
>>
>> - ENOLINK: Link failure between host and target
>> - EIO: Retryable I/O error
>> - EREMOTEIO: Non-retryable I/O error
>> - EBADE: I/O error restricted to the I_T_L nexus
>>
>> 'Retryable' in this context means that an I/O error _might_ be
>> restricted to the I_T_L nexus (vulgo: path), so retrying on another
>> nexus / path might succeed.
>>
>> 'Non-retryable' in general refers to a target failure, so this
>> error will always be generated regardless of the I_T_L nexus
>> it was sent on.
>>
>> I/O errors restricted to the I_T_L nexus might be retried
>> on another nexus / path, but they should _not_ be queued
>> if no paths are available.
>
> Hannes,
> I don't know if it is applicable to this patch but with
> SAS when the uplink from an expander is being stressed
> (i.e. it temporarily doesn't have enough bandwidth) then
> a sense key of ABORTED COMMAND may be generated. In my
> experience retrying such a command succeeds.
>
I guess this should be handled by scsi EH, as there
should be some sensible ASC/ASCQ values to go with it.

This patchset is primarily for fixing up multipathing,
which has the habit of retrying failed I/Os on the
next path. For some errors this is just pointless
(eg MEDIUM ERROR), for some errors this is the desired
behaviour (namely transport errors), and for others
this is positively damaging (persistent reservation
failures).
Just plain EIO simply doesn't cover the whole range :-)

>
> BTW might "vulgo" be "ergo" [Latin: therefore]?
>
Nope. Correct etymology is from 'sermo vulgaris',
ie the language of the common people.
But maybe I should remove it for the next
round to avoid confusion.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-18 12:01 ` Hannes Reinecke
@ 2011-01-27 22:35 ` Mike Snitzer
2011-01-27 22:41 ` James Bottomley
0 siblings, 1 reply; 12+ messages in thread
From: Mike Snitzer @ 2011-01-27 22:35 UTC (permalink / raw)
To: Hannes Reinecke
Cc: dgilbert, James Bottomley, linux-scsi, jaxboe, michaelc, agk

On Tue, Jan 18 2011 at 7:01am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 01/18/2011 12:33 PM, Douglas Gilbert wrote:
>
> This patchset is primarily for fixing up multipathing,
> which has the habit of retrying failed I/Os on the
> next path. For some errors this is just pointless
> (eg MEDIUM ERROR), for some errors this is the desired
> behaviour (namely transport errors), and for others
> this is positively damaging (persistent reservation
> failures).
> Just plain EIO simply doesn't cover the whole range :-)
>
> >
> > BTW might "vulgo" be "ergo" [Latin: therefore]?
> >
> Nope. Correct etymology is from 'sermo vulgaris',
> ie the language of the common people.
> But maybe I should remove it for the next
> round to avoid confusion.

Is a new round even needed given there haven't been any code issues
raised against v4?

James, what are your thoughts on this patchset? Would be great to get
this in scsi-misc for 2.6.39

Please advise,
Mike
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-27 22:35 ` Mike Snitzer
@ 2011-01-27 22:41 ` James Bottomley
2011-01-27 22:54 ` Mike Snitzer
2011-01-28 13:11 ` Alasdair G Kergon
0 siblings, 2 replies; 12+ messages in thread
From: James Bottomley @ 2011-01-27 22:41 UTC (permalink / raw)
To: Mike Snitzer
Cc: Hannes Reinecke, dgilbert, James Bottomley, linux-scsi, jaxboe, michaelc, agk

On Thu, 2011-01-27 at 17:35 -0500, Mike Snitzer wrote:
> On Tue, Jan 18 2011 at 7:01am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
>
> > On 01/18/2011 12:33 PM, Douglas Gilbert wrote:
> >
> > This patchset is primarily for fixing up multipathing,
> > which has the habit of retrying failed I/Os on the
> > next path. For some errors this is just pointless
> > (eg MEDIUM ERROR), for some errors this is the desired
> > behaviour (namely transport errors), and for others
> > this is positively damaging (persistent reservation
> > failures).
> > Just plain EIO simply doesn't cover the whole range :-)
> >
> > >
> > > BTW might "vulgo" be "ergo" [Latin: therefore]?
> > >
> > Nope. Correct etymology is from 'sermo vulgaris',
> > ie the language of the common people.
> > But maybe I should remove it for the next
> > round to avoid confusion.
>
> Is a new round even needed given there haven't been any code issues
> raised against v4?
>
> James, what are your thoughts on this patchset? Would be great to get
> this in scsi-misc for 2.6.39
>
> Please advise,

Well, it covers three subsystems ... I was waiting for Alasdair and Jens
to ack ... but I bet they each were waiting for the other two to ack ...

So, I'll take it if no objections.

James
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-27 22:41 ` James Bottomley
@ 2011-01-27 22:54 ` Mike Snitzer
2011-01-28 8:12 ` Jens Axboe
2011-01-28 13:11 ` Alasdair G Kergon
1 sibling, 1 reply; 12+ messages in thread
From: Mike Snitzer @ 2011-01-27 22:54 UTC (permalink / raw)
To: James Bottomley
Cc: Hannes Reinecke, dgilbert, James Bottomley, linux-scsi, jaxboe, michaelc, agk

On Thu, Jan 27 2011 at 5:41pm -0500,
James Bottomley <James.Bottomley@suse.de> wrote:

> On Thu, 2011-01-27 at 17:35 -0500, Mike Snitzer wrote:
> > On Tue, Jan 18 2011 at 7:01am -0500,
> > Hannes Reinecke <hare@suse.de> wrote:
> >
> > > On 01/18/2011 12:33 PM, Douglas Gilbert wrote:
> > >
> > > This patchset is primarily for fixing up multipathing,
> > > which has the habit of retrying failed I/Os on the
> > > next path. For some errors this is just pointless
> > > (eg MEDIUM ERROR), for some errors this is the desired
> > > behaviour (namely transport errors), and for others
> > > this is positively damaging (persistent reservation
> > > failures).
> > > Just plain EIO simply doesn't cover the whole range :-)
> > >
> > > >
> > > > BTW might "vulgo" be "ergo" [Latin: therefore]?
> > > >
> > > Nope. Correct etymology is from 'sermo vulgaris',
> > > ie the language of the common people.
> > > But maybe I should remove it for the next
> > > round to avoid confusion.
> >
> > Is a new round even needed given there haven't been any code issues
> > raised against v4?
> >
> > James, what are your thoughts on this patchset? Would be great to get
> > this in scsi-misc for 2.6.39
> >
> > Please advise,
>
> Well, it covers three subsystems ... I was waiting for Alasdair and Jens
> to ack ... but I bet they each were waiting for the other two to ack ...
>
> So, I'll take it if no objections.

OK, I just sent a mail to jens and alasdair asking the same ;)

Thanks,
Mike
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-27 22:54 ` Mike Snitzer
@ 2011-01-28 8:12 ` Jens Axboe
0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2011-01-28 8:12 UTC (permalink / raw)
To: Mike Snitzer
Cc: James Bottomley, Hannes Reinecke, dgilbert@interlog.com, James Bottomley, linux-scsi@vger.kernel.org, michaelc@cs.wisc.edu, agk@redhat.com

On 2011-01-27 23:54, Mike Snitzer wrote:
> On Thu, Jan 27 2011 at 5:41pm -0500,
> James Bottomley <James.Bottomley@suse.de> wrote:
>
>> On Thu, 2011-01-27 at 17:35 -0500, Mike Snitzer wrote:
>>> On Tue, Jan 18 2011 at 7:01am -0500,
>>> Hannes Reinecke <hare@suse.de> wrote:
>>>
>>>> On 01/18/2011 12:33 PM, Douglas Gilbert wrote:
>>>>
>>>> This patchset is primarily for fixing up multipathing,
>>>> which has the habit of retrying failed I/Os on the
>>>> next path. For some errors this is just pointless
>>>> (eg MEDIUM ERROR), for some errors this is the desired
>>>> behaviour (namely transport errors), and for others
>>>> this is positively damaging (persistent reservation
>>>> failures).
>>>> Just plain EIO simply doesn't cover the whole range :-)
>>>>
>>>>>
>>>>> BTW might "vulgo" be "ergo" [Latin: therefore]?
>>>>>
>>>> Nope. Correct etymology is from 'sermo vulgaris',
>>>> ie the language of the common people.
>>>> But maybe I should remove it for the next
>>>> round to avoid confusion.
>>>
>>> Is a new round even needed given there haven't been any code issues
>>> raised against v4?
>>>
>>> James, what are your thoughts on this patchset? Would be great to get
>>> this in scsi-misc for 2.6.39
>>>
>>> Please advise,
>>
>> Well, it covers three subsystems ... I was waiting for Alasdair and Jens
>> to ack ... but I bet they each were waiting for the other two to ack ...
>>
>> So, I'll take it if no objections.
>
> OK, I just sent a mail to jens and alasdair asking the same ;)

This one will be easier just to take in the SCSI tree, since there's so
little risk for conflict. You can add my acked-by to the patches.

-- 
Jens Axboe
* Re: [PATCH v4 1/3] scsi: Detailed I/O errors
2011-01-27 22:41 ` James Bottomley
2011-01-27 22:54 ` Mike Snitzer
@ 2011-01-28 13:11 ` Alasdair G Kergon
1 sibling, 0 replies; 12+ messages in thread
From: Alasdair G Kergon @ 2011-01-28 13:11 UTC (permalink / raw)
To: James Bottomley
Cc: Mike Snitzer, Hannes Reinecke, dgilbert, James Bottomley, linux-scsi, jaxboe, michaelc, agk

On Thu, Jan 27, 2011 at 05:41:42PM -0500, James Bottomley wrote:
> So, I'll take it if no objections.

No objections - go for it.

Alasdair
* [PATCH v4 2/3] dm mpath: propagate target errors immediately
2011-01-18 9:13 [PATCH v4 0/3] differentiate between I/O errors Hannes Reinecke
2011-01-18 9:13 ` [PATCH v4 1/3] scsi: Detailed " Hannes Reinecke
@ 2011-01-18 9:13 ` Hannes Reinecke
2011-01-18 9:13 ` [PATCH v4 3/3] block: improve detail in I/O error messages Hannes Reinecke
2 siblings, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2011-01-18 9:13 UTC (permalink / raw)
To: James Bottomley
Cc: linux-scsi, jaxboe, michaelc, agk, Hannes Reinecke, Mike Snitzer

DM now has more information about the nature of the underlying storage
failure. Path failure is avoided if a request failed due to a target
error. Instead the target error is immediately passed up the stack.

Discard requests that fail due to non-target errors may now be retried.

Errors restricted to the path will be retried or returned if no paths
are available, regardless of the no_path_retry setting.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c |   22 ++++++++++------------
 1 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 487ecda..0781683 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1270,24 +1270,22 @@ static int do_end_io(struct multipath *m, struct request *clone,
 	if (!error && !clone->errors)
 		return 0;	/* I/O complete */
 
-	if (error == -EOPNOTSUPP)
-		return error;
-
-	if (clone->cmd_flags & REQ_DISCARD)
-		/*
-		 * Pass all discard request failures up.
-		 * FIXME: only fail_path if the discard failed due to a
-		 * transport problem. This requires precise understanding
-		 * of the underlying failure (e.g. the SCSI sense).
-		 */
+	if (error == -EOPNOTSUPP || error == -EREMOTEIO)
 		return error;
 
 	if (mpio->pgpath)
 		fail_path(mpio->pgpath);
 
 	spin_lock_irqsave(&m->lock, flags);
-	if (!m->nr_valid_paths && !m->queue_if_no_path && !__must_push_back(m))
-		r = -EIO;
+	if (!m->nr_valid_paths) {
+		if (!m->queue_if_no_path) {
+			if (!__must_push_back(m))
+				r = -EIO;
+		} else {
+			if (error == -EBADE)
+				r = error;
+		}
+	}
 	spin_unlock_irqrestore(&m->lock, flags);
 
 	return r;
-- 
1.6.0.2
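The decision logic that patch 2/3 gives `do_end_io()` can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: `struct mpath_state`, `model_end_io()`, `MODEL_COMPLETE` and `MODEL_REQUEUE` are hypothetical names invented for illustration, the `must_push_back` field stands in for the result of `__must_push_back(m)`, and `nr_valid_paths` is taken to be the count after the failing path has already been marked failed. The value the real function starts `r` with is not visible in the hunk, so the model simply calls it "requeue".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the multipath state the patch consults. */
struct mpath_state {
	unsigned int nr_valid_paths;	/* usable paths left after fail_path() */
	bool queue_if_no_path;		/* queue I/O while no path is available */
	bool must_push_back;		/* models __must_push_back(m) */
};

/* Hypothetical return codes; the kernel uses its own constants. */
enum { MODEL_COMPLETE = 0, MODEL_REQUEUE = 1 };

/* Returns MODEL_COMPLETE, MODEL_REQUEUE, or a negative errno to pass up. */
int model_end_io(const struct mpath_state *m, int error)
{
	int r = MODEL_REQUEUE;

	assert(m != NULL);

	if (!error)
		return MODEL_COMPLETE;

	/* Unsupported requests and target errors go straight up the stack. */
	if (error == -EOPNOTSUPP || error == -EREMOTEIO)
		return error;

	/* The real code fails the current path at this point. */
	if (m->nr_valid_paths == 0) {
		if (!m->queue_if_no_path) {
			if (!m->must_push_back)
				r = -EIO;
		} else if (error == -EBADE) {
			/* Nexus errors are not queued when no path is left. */
			r = error;
		}
	}
	return r;
}
```

Under this model a nexus error with `queue_if_no_path` set and no paths left is returned as `-EBADE` instead of being queued, while an ordinary `-EIO` in the same state is still requeued.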
* [PATCH v4 3/3] block: improve detail in I/O error messages
2011-01-18 9:13 [PATCH v4 0/3] differentiate between I/O errors Hannes Reinecke
2011-01-18 9:13 ` [PATCH v4 1/3] scsi: Detailed " Hannes Reinecke
2011-01-18 9:13 ` [PATCH v4 2/3] dm mpath: propagate target errors immediately Hannes Reinecke
@ 2011-01-18 9:13 ` Hannes Reinecke
2011-01-27 22:51 ` Mike Snitzer
2 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2011-01-18 9:13 UTC (permalink / raw)
To: James Bottomley
Cc: linux-scsi, jaxboe, michaelc, agk, Hannes Reinecke, Mike Snitzer

Classify severity of I/O errors for target, nexus, and
transport errors.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 block/blk-core.c |   23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4ce953f..3380a49 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2028,9 +2028,26 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 
 	if (error && req->cmd_type == REQ_TYPE_FS &&
 	    !(req->cmd_flags & REQ_QUIET)) {
-		printk(KERN_ERR "end_request: I/O error, dev %s, sector %llu\n",
-				req->rq_disk ? req->rq_disk->disk_name : "?",
-				(unsigned long long)blk_rq_pos(req));
+		char *error_type;
+
+		switch (error) {
+		case -ENOLINK:
+			error_type = "recoverable transport";
+			break;
+		case -EREMOTEIO:
+			error_type = "critical target";
+			break;
+		case -EBADE:
+			error_type = "critical nexus";
+			break;
+		case -EIO:
+		default:
+			error_type = "I/O";
+			break;
+		}
+		printk(KERN_ERR "end_request: %s error, dev %s, sector %llu\n",
+		       error_type, req->rq_disk ? req->rq_disk->disk_name : "?",
+		       (unsigned long long)blk_rq_pos(req));
 	}
 
 	blk_account_io_completion(req, nr_bytes);
-- 
1.6.0.2
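To see what the log lines from patch 3/3 would look like, the classification can be reproduced in userspace along these lines. This is a hedged sketch, not the kernel code: `io_error_type()` and `format_io_error()` are hypothetical helper names, `snprintf()` stands in for `printk()`, the trailing newline is omitted, and the disk name and sector passed in are made-up example values. It assumes a Linux `<errno.h>` that defines `EBADE` and `EREMOTEIO`.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: maps an errno to the label used in the patch. */
const char *io_error_type(int error)
{
	switch (error) {
	case -ENOLINK:
		return "recoverable transport";
	case -EREMOTEIO:
		return "critical target";
	case -EBADE:
		return "critical nexus";
	case -EIO:
	default:
		return "I/O";
	}
}

/* Hypothetical helper: builds the message text the patch would print. */
int format_io_error(char *buf, size_t len, int error,
		    const char *disk, unsigned long long sector)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "end_request: %s error, dev %s, sector %llu",
			io_error_type(error), disk ? disk : "?", sector);
}
```

For example, calling `format_io_error(buf, sizeof(buf), -EREMOTEIO, "sdb", 1024ULL)` fills `buf` with `end_request: critical target error, dev sdb, sector 1024`, where `sdb` and `1024` are illustrative inputs only.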
* Re: [PATCH v4 3/3] block: improve detail in I/O error messages
2011-01-18 9:13 ` [PATCH v4 3/3] block: improve detail in I/O error messages Hannes Reinecke
@ 2011-01-27 22:51 ` Mike Snitzer
0 siblings, 0 replies; 12+ messages in thread
From: Mike Snitzer @ 2011-01-27 22:51 UTC (permalink / raw)
To: jaxboe, agk
Cc: James Bottomley, linux-scsi, michaelc, Hannes Reinecke

On Tue, Jan 18, 2011 at 4:13 AM, Hannes Reinecke <hare@suse.de> wrote:
> Classify severity of I/O errors for target, nexus, and
> transport errors.
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  block/blk-core.c |   23 ++++++++++++++++++++---
>  1 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 4ce953f..3380a49 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2028,9 +2028,26 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
>
>         if (error && req->cmd_type == REQ_TYPE_FS &&
>             !(req->cmd_flags & REQ_QUIET)) {
> -               printk(KERN_ERR "end_request: I/O error, dev %s, sector %llu\n",
> -                               req->rq_disk ? req->rq_disk->disk_name : "?",
> -                               (unsigned long long)blk_rq_pos(req));
> +               char *error_type;
> +
> +               switch (error) {
> +               case -ENOLINK:
> +                       error_type = "recoverable transport";
> +                       break;
> +               case -EREMOTEIO:
> +                       error_type = "critical target";
> +                       break;
> +               case -EBADE:
> +                       error_type = "critical nexus";
> +                       break;
> +               case -EIO:
> +               default:
> +                       error_type = "I/O";
> +                       break;
> +               }
> +               printk(KERN_ERR "end_request: %s error, dev %s, sector %llu\n",
> +                      error_type, req->rq_disk ? req->rq_disk->disk_name : "?",
> +                      (unsigned long long)blk_rq_pos(req));
>         }
>
>         blk_account_io_completion(req, nr_bytes);
> --

Hi Jens,

Are you OK with this change provided James takes the SCSI patch? If
so, should James pull this block patch in too to avoid dependency
concerns between trees?

Same question goes for the dm-mpath change (patch 2/3), Alasdair?

To ease merging with Linus, might be best if all patches were to go in
through a single tree.. if James were to carry these small block and
DM patches I'd wager chances of collision are slim.

Please advise, thanks.
Mike