Re: If I have a single bad sector, how many failed reads should simple dd report?

Linux ATA/IDE development
 help / color / mirror / Atom feed

From: Mark Lord <kernel@teksavvy.com>
To: Greg Freemyer <greg.freemyer@gmail.com>
Cc: IDE/ATA development list <linux-ide@vger.kernel.org>,
	Mark Lord <liml@rtr.ca>
Subject: Re: If I have a single bad sector, how many failed reads should simple dd report?
Date: Fri, 09 Jul 2010 21:24:08 -0400	[thread overview]
Message-ID: <4C37CBB8.1040909@teksavvy.com> (raw)
In-Reply-To: <4C37CA99.1040104@teksavvy.com>

[-- Attachment #1: Type: text/plain, Size: 4127 bytes --]

On 09/07/10 09:19 PM, Mark Lord wrote:
> On 09/07/10 03:04 PM, Greg Freemyer wrote:
> ..
>>> When I re-ran it, /var/log/messages reported 10 bad logical blocks.
>>> And even worse, dd reported 20 bad blocks. I examined the data dd
>>> read and it had 80KB of zero'ed out data. So that's 160 sectors worth
>>> of data lost because of a single bad sector. At most I was expecting
>>> 4KB of zero'ed out data.
> ..
>
> That's just the standard, undesirable result of the current SCSI EH
> when used with libata for (mainly) desktop computers.
>
> I have patches (against older kernels) to fix it, but have yet to
> get both myself and James B. interested enough simultaneously to
> actually get the kernel fixed. :)
..

Here (attached and inline below) are my most recent patches for this.
Still outdated, though.  These are against the SLES11 2.6.27.19 kernel:

-----------------------------snip----------------------------

Stop the SCSI EH from performing tons of retries on unrecoverable medium errors,
so that error-handling fails more quickly and we (EMC) avoid unneeded node resets.

The ugliness of this patch matches the ugliness of SCSI EH.
Does *anyone* actually understand this code completely?

Signed-off-by: Mark Lord <mlord@pobox.com>

--- old/drivers/scsi/scsi_error.c	2009-06-04 09:46:55.000000000 -0400
+++ linux/drivers/scsi/scsi_error.c	2009-06-04 12:08:48.000000000 -0400
@@ -423,6 +423,52 @@
  	}
  }
  
+/*
+ * The problem with scsi_check_sense(), is that it is (designed to be)
+ * called only after retries are exhausted.  But for MEDIUM_ERRORs (only)
+ * we don't want any retries here at all.
+ *
+ * So this function below is a clone of the necessary parts from scsi_check_sense(),
+ * to check for unrecoverable MEDIUM_ERRORs when deciding whether to retry or not.
+ */
+static int scsi_unrecoverable_medium_error(struct scsi_cmnd *scmd)
+{
+	struct scsi_sense_hdr sshdr;
+
+	if (! scsi_command_normalize_sense(scmd, &sshdr))
+		return 0;	/* no valid sense data */
+
+	if (scsi_sense_is_deferred(&sshdr))
+		return 0;
+
+	if (sshdr.response_code == 0x70) {
+		/* fixed format */
+		if (scmd->sense_buffer[2] & 0xe0)
+			return 0;
+	} else {
+		/*
+		 * descriptor format: look for "stream commands sense data
+		 * descriptor" (see SSC-3). Assume single sense data
+		 * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG.
+		 */
+		if ((sshdr.additional_length > 3) &&
+		    (scmd->sense_buffer[8] == 0x4) &&
+		    (scmd->sense_buffer[11] & 0xe0))
+			return 0;
+	}
+
+	switch (sshdr.sense_key) {
+	case MEDIUM_ERROR:
+		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
+		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
+		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
+			//printk(KERN_WARNING "%s: MEDIUM_ERROR\n", __func__);
+			return 1;
+		}
+	}
+	return 0;
+}
+
  /**
   * scsi_eh_completed_normally - Disposition a eh cmd on return from LLD.
   * @scmd:	SCSI cmd to examine.
@@ -1334,6 +1380,8 @@
  
  	switch (status_byte(scmd->result)) {
  	case CHECK_CONDITION:
+		if (scsi_unrecoverable_medium_error(scmd))
+			return 1;
  		/*
  		 * assume caller has checked sense and determinted
  		 * the check condition was retryable.
-----------------------------snip----------------------------

On encountering a bad sector, report and skip over it,
then continue with the remainder of the request.
Otherwise we would fail perfectly good sectors,
making a bad situation even worse.

Signed-off-by: Mark Lord <mlord@pobox.com>

--- old/drivers/scsi/scsi_lib.c	2009-06-04 12:26:52.000000000 -0400
+++ linux/drivers/scsi/scsi_lib.c	2009-06-04 14:40:11.000000000 -0400
@@ -952,6 +952,12 @@
  	 */
  	if (sense_valid && !sense_deferred) {
  		switch (sshdr.sense_key) {
+		case MEDIUM_ERROR:
+		/* Bad sector.  Fail it, and then continue the rest of the request. */
+		if (this_count && scsi_end_request(cmd, -EIO, cmd->device->sector_size, 1) == NULL) {
+			cmd->retries = 0;       // go around again..
+			return;
+		}
  		case UNIT_ATTENTION:
  			if (cmd->device->removable) {
  				/* Detected disc change.  Set a bit
-----------------------------snip----------------------------

[-- Attachment #2: 01_scsi_no_retry_medium_errors-sles11.patch --]
[-- Type: text/x-diff, Size: 2252 bytes --]

sles11:
Stop the SCSI EH from performing tons of retries on unrecoverable medium errors,
so that error-handling fails more quickly and we (EMC) avoid unneeded node resets.

The ugliness of this patch matches the ugliness of SCSI EH.
Does *anyone* actually understand this code completely?

Signed-off-by: Mark Lord <mlord@pobox.com>

--- old/drivers/scsi/scsi_error.c	2009-06-04 09:46:55.000000000 -0400
+++ linux/drivers/scsi/scsi_error.c	2009-06-04 12:08:48.000000000 -0400
@@ -423,6 +423,52 @@
 	}
 }
 
+/*
+ * The problem with scsi_check_sense(), is that it is (designed to be)
+ * called only after retries are exhausted.  But for MEDIUM_ERRORs (only)
+ * we don't want any retries here at all.
+ *
+ * So this function below is a clone of the necessary parts from scsi_check_sense(),
+ * to check for unrecoverable MEDIUM_ERRORs when deciding whether to retry or not.
+ */
+static int scsi_unrecoverable_medium_error(struct scsi_cmnd *scmd)
+{
+	struct scsi_sense_hdr sshdr;
+
+	if (! scsi_command_normalize_sense(scmd, &sshdr))
+		return 0;	/* no valid sense data */
+
+	if (scsi_sense_is_deferred(&sshdr))
+		return 0;
+
+	if (sshdr.response_code == 0x70) {
+		/* fixed format */
+		if (scmd->sense_buffer[2] & 0xe0)
+			return 0;
+	} else {
+		/*
+		 * descriptor format: look for "stream commands sense data
+		 * descriptor" (see SSC-3). Assume single sense data
+		 * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG.
+		 */
+		if ((sshdr.additional_length > 3) &&
+		    (scmd->sense_buffer[8] == 0x4) &&
+		    (scmd->sense_buffer[11] & 0xe0))
+			return 0;
+	}
+
+	switch (sshdr.sense_key) {
+	case MEDIUM_ERROR:
+		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
+		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
+		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
+			//printk(KERN_WARNING "%s: MEDIUM_ERROR\n", __func__);
+			return 1;
+		}
+	}
+	return 0;
+}
+
 /**
  * scsi_eh_completed_normally - Disposition a eh cmd on return from LLD.
  * @scmd:	SCSI cmd to examine.
@@ -1334,6 +1380,8 @@
 
 	switch (status_byte(scmd->result)) {
 	case CHECK_CONDITION:
+		if (scsi_unrecoverable_medium_error(scmd))
+			return 1;
 		/*
 		 * assume caller has checked sense and determinted
 		 * the check condition was retryable.

[-- Attachment #3: 02_scsi_eh_continue-sles11.patch --]
[-- Type: text/x-diff, Size: 825 bytes --]

sles11:
On encountering a bad sector, report and skip over it,
then continue with the remainder of the request.
Otherwise we would fail perfectly good sectors,
making a bad situation even worse.

Signed-off-by: Mark Lord <mlord@pobox.com>

--- old/drivers/scsi/scsi_lib.c	2009-06-04 12:26:52.000000000 -0400
+++ linux/drivers/scsi/scsi_lib.c	2009-06-04 14:40:11.000000000 -0400
@@ -952,6 +952,12 @@
 	 */
 	if (sense_valid && !sense_deferred) {
 		switch (sshdr.sense_key) {
+		case MEDIUM_ERROR:
+		/* Bad sector.  Fail it, and then continue the rest of the request. */
+		if (this_count && scsi_end_request(cmd, -EIO, cmd->device->sector_size, 1) == NULL) {
+			cmd->retries = 0;       // go around again..
+			return;
+		}
 		case UNIT_ATTENTION:
 			if (cmd->device->removable) {
 				/* Detected disc change.  Set a bit

next prev parent reply	other threads:[~2010-07-10  1:24 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-08 17:14 If I have a single bad sector, how many failed reads should simple dd report? Greg Freemyer
2010-07-09 19:04 ` Greg Freemyer
2010-07-10  1:19   ` Mark Lord
2010-07-10  1:24     ` Mark Lord [this message]
2010-07-10 14:14       ` James Bottomley
2010-07-11 12:58         ` Greg Freemyer
2010-07-13 19:07         ` Mark Lord

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C37CBB8.1040909@teksavvy.com \
    --to=kernel@teksavvy.com \
    --cc=greg.freemyer@gmail.com \
    --cc=liml@rtr.ca \
    --cc=linux-ide@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox