From: David Jeffery <djeffery@redhat.com>
To: linux-scsi@vger.kernel.org
Subject: [PATCH] sd: medium access timeout counter fails to reset
Date: Thu, 10 Apr 2014 11:08:30 -0400 [thread overview]
Message-ID: <20140410150830.GA19457@fury.redhat.com> (raw)
There is an error with the medium access timeout feature of the sd driver. The
sdkp->medium_access_timed_out value is reset to zero in sd_done() in the wrong
place. Currently it is reset to zero only when a command returns sense data.
This can result in cases where the medium access check falsely triggers from
timed out commands which are hours or days apart.
For example, an I/O command times out and is aborted. It then retries and
succeeds. But with no sense data generated and returned, the
medium_access_timed_out value is not reset. If no sd command returns sense
data, then the next command to time out (however far in time from the first
failure) will trigger the medium access timeout and put the device offline.
The resetting of sdkp->medium_access_timed_out should occur before the check
for sense data.
Signed-off-by: David Jeffery <djeffery@redhat.com>
---
To reproduce using scsi_debug, use SCSI_DEBUG_OPT_TIMEOUT or
SCSI_DEBUG_OPT_MAC_TIMEOUT to force an I/O command to timeout. Then, remove
the opt value so the I/O will succeed on retry. Perform more I/O as desired.
Finally, repeat the process to make a new I/O command time out. Without the
patch, the device will be marked offline even though many I/O commands have
succeeded between the 2 instances of timed out commands.
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 470954a..a41e68e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1689,12 +1689,12 @@ static int sd_done(struct scsi_cmnd *SCpnt)
sshdr.ascq));
}
#endif
+ sdkp->medium_access_timed_out = 0;
+
if (driver_byte(result) != DRIVER_SENSE &&
(!sense_valid || sense_deferred))
goto out;
- sdkp->medium_access_timed_out = 0;
-
switch (sshdr.sense_key) {
case HARDWARE_ERROR:
case MEDIUM_ERROR:
next reply other threads:[~2014-04-10 15:11 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-10 15:08 David Jeffery [this message]
2014-04-14 15:50 ` [PATCH] sd: medium access timeout counter fails to reset Ewan Milne
2014-04-17 19:23 ` Martin K. Petersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140410150830.GA19457@fury.redhat.com \
--to=djeffery@redhat.com \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox