From: James Bottomley <James.Bottomley@suse.de>
To: Hannes Reinecke <hare@suse.de>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH] sd: retry read_capacity on UNIT_ATTENTION
Date: Thu, 01 Apr 2010 10:30:01 -0400
Message-ID: <1270132201.4439.14.camel@mulgrave.site>
In-Reply-To: <20100401134428.7E4D1337C5@ochil.suse.de>
On Thu, 2010-04-01 at 15:44 +0200, Hannes Reinecke wrote:
> Hazard testing uncovered yet another bug in sd. Under heavy
> reset activity the retry counter might be exhausted and
> the command will be returned with sense UNIT_ATTENTION/0x29/00
> (POWER ON, RESET, OR BUS DEVICE RESET OCCURRED). In those
> cases we should just increase the retry counter again,
> retrying one more to clear up this Unit Attention state.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 1962bea..7d75a21 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1454,8 +1454,15 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> if (media_not_present(sdkp, &sshdr))
> return -ENODEV;
>
> - if (the_result)
> + if (the_result) {
> sense_valid = scsi_sense_valid(&sshdr);
> + if (sense_valid &&
> + sshdr.sense_key == UNIT_ATTENTION &&
> + sshdr.asc = 0x29 && sshdr.asq == 0x00)
The "sshdr.asc = 0x29" above is an assignment; it should be ==
> + /* Device reset might occur several times,
> + * give it one more chance */
> + retries++;
> + }
Firstly, this was not even compile tested:
drivers/scsi/sd.c: In function ‘read_capacity_10’:
drivers/scsi/sd.c:1558: error: ‘struct scsi_sense_hdr’ has no member named ‘asq’
Secondly, we can't quite do this. Some devices (only broken ones in my
experience) will reply with UNIT_ATTENTION "I was reset" forever, leading
to an endless loop here. Additionally, a massive reset storm on a shared
bus would DoS the code here, so there must be a give-up point after a
reasonable number of retries.
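
To illustrate the give-up point, here is a minimal user-space sketch,
not kernel code. The fake_dev struct, fake_read_capacity() and the
CMD_* status values are hypothetical stand-ins for the real command
submission and sense decoding; only the shape of the loop matters. It
assumes a separate, bounded budget for reset Unit Attentions on top of
the normal retry count.

```c
#include <assert.h>
#include <stddef.h>

#define NORMAL_RETRIES 3
#define RESET_RETRIES 10	/* assumed bound for reset Unit Attentions */

/* Hypothetical command outcomes; the real code inspects sense data. */
enum cmd_status { CMD_OK, CMD_RESET_UA, CMD_OTHER_ERROR };

/* Hypothetical device: answers with a reset Unit Attention for the
 * first ua_remaining commands, then succeeds. */
struct fake_dev {
	int ua_remaining;
	int commands_sent;
};

static enum cmd_status fake_read_capacity(struct fake_dev *dev)
{
	dev->commands_sent++;
	if (dev->ua_remaining > 0) {
		dev->ua_remaining--;
		return CMD_RESET_UA;
	}
	return CMD_OK;
}

/* Returns 0 on success, -1 once both retry budgets are used up. */
static int read_capacity_bounded(struct fake_dev *dev)
{
	int retries = NORMAL_RETRIES;
	int reset_retries = RESET_RETRIES;
	enum cmd_status st;

	assert(dev != NULL);

	do {
		st = fake_read_capacity(dev);
		/* A reset Unit Attention is retried without touching the
		 * normal counter, but only while the reset budget lasts.
		 * "continue" in a do/while jumps to the loop condition. */
		if (st == CMD_RESET_UA && --reset_retries > 0)
			continue;
		retries--;
	} while (st != CMD_OK && retries);

	return st == CMD_OK ? 0 : -1;
}
```

With these values, a device that reports the reset condition five
times succeeds on the sixth command, and a device that reports it
forever is abandoned after 12 commands (9 reset retries plus the 3
normal ones) instead of looping indefinitely.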
The third problem is that if this happens to a large device, we only
catch it in RC10, so we'll report an undersized capacity if the device
is > SPC2.
How about this instead?
James
---
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7b75c8a..cdb8ed6 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1432,6 +1432,8 @@ static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
#error RC16_LEN must not be more than SD_BUF_SIZE
#endif
+#define READ_CAPACITY_RETRIES_ON_RESET 10
+
static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
unsigned char *buffer)
{
@@ -1439,7 +1441,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
struct scsi_sense_hdr sshdr;
int sense_valid = 0;
int the_result;
- int retries = 3;
+ int retries = 3, reset_retries = READ_CAPACITY_RETRIES_ON_RESET;
unsigned int alignment;
unsigned long long lba;
unsigned sector_size;
@@ -1468,6 +1470,13 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
* Invalid Field in CDB, just retry
* silently with RC10 */
return -EINVAL;
+ if (sense_valid &&
+ sshdr.sense_key == UNIT_ATTENTION &&
+ sshdr.asc == 0x29 && sshdr.ascq == 0x00)
+ /* Device reset might occur several times,
+ * give it one more chance */
+ if (--reset_retries > 0)
+ continue;
}
retries--;
@@ -1526,7 +1535,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
struct scsi_sense_hdr sshdr;
int sense_valid = 0;
int the_result;
- int retries = 3;
+ int retries = 3, reset_retries = READ_CAPACITY_RETRIES_ON_RESET;
sector_t lba;
unsigned sector_size;
@@ -1542,8 +1551,16 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
if (media_not_present(sdkp, &sshdr))
return -ENODEV;
- if (the_result)
+ if (the_result) {
sense_valid = scsi_sense_valid(&sshdr);
+ if (sense_valid &&
+ sshdr.sense_key == UNIT_ATTENTION &&
+ sshdr.asc == 0x29 && sshdr.ascq == 0x00)
+ /* Device reset might occur several times,
+ * give it one more chance */
+ if (--reset_retries > 0)
+ continue;
+ }
retries--;
} while (the_result && retries);