linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data
@ 2025-08-14 18:28 Ewan D. Milne
  2025-08-14 18:28 ` [PATCH v2 1/9] scsi: Explicitly specify .ascq = 0x00 for ASC 0x28/0x29 scsi_failures Ewan D. Milne
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:28 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

We encountered a SCSI device that responded to the initial READ CAPACITY command
with a good status, but no data was transferred.  This caused a sudden change of
the device capacity to zero when the device was rescanned, for no obvious reason.

This patch series changes read_capacity_10() and read_capacity_16() in sd.c
to retry the command up to 3 times in an attempt to get valid capacity information.
A message is logged if this is ultimately unsuccessful.

There are some predecessor patches, one from a patch in a series by Mike Christie
which changes read_capacity_16() to use the scsi_failures mechanism (which did
not eventually get merged), this makes the changes here much more similar for
both the read_capacity_10 and read_capacity_16() case.  Another patch corrects
a potential use of an uninitialized variable, and a third one removes a check
for -EOVERFLOW that hasn't been needed since commit 72deb455b5ec
("block: remove CONFIG_LBDAF").  Other patches fill in missing .ascq entries
in the scsi_failures array and address other review comments.

The final patch to scsi_debug is allow insertion of the fault to test this change.

Changes in v2:
  - Added patches to explicitly specify .ascq in scsi_features usage
  - Pass the length of the buffer used through the sd_read_capacity() call chain
  - Simplify a conditional in scsi_probe_lun() that was requested in similar
    code in read_capacity_16()/read_capacity_10()
  - Changed code in scsi_debug() to make only one call to scsi_set_resid()
  - Moved some declarations around in read_capacity_16()/read_capacity_10()
    and memset() the whole buffer instead of the expected data size
  - Add the newly added flag SDEBUG_NO_DATA to SDEBUG_OPT_ALL_INJECTING

Ewan D. Milne (9):
  scsi: Explicitly specify .ascq = 0x00 for ASC 0x28/0x29 scsi_failures
  scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC
    0x3a
  scsi: sd: Pass buffer length as argument to sd_read_capacity() et al.
  scsi: sd: Have scsi-ml retry read_capacity_16 errors
  scsi: sd: Avoid passing potentially uninitialized "sense_valid" to
    read_capacity_error()
  scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity()
  scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16)
    returning no data
  scsi: Simplify nested if conditional in scsi_probe_lun()
  scsi: scsi_debug: Add option to suppress returned data but return good
    status

 drivers/scsi/scsi_debug.c |  47 +++++++---
 drivers/scsi/scsi_scan.c  |   7 +-
 drivers/scsi/sd.c         | 188 +++++++++++++++++++++++++++-----------
 3 files changed, 174 insertions(+), 68 deletions(-)

-- 
2.47.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 1/9] scsi: Explicitly specify .ascq = 0x00 for ASC 0x28/0x29 scsi_failures
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
@ 2025-08-14 18:28 ` Ewan D. Milne
  2025-08-14 18:29 ` [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a Ewan D. Milne
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:28 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

This does not change any behavior (since .ascq was initialized to 0 by
the compiler) but makes explicit that the entry in the scsi_failures
array does not handle cases where ASCQ is nonzero, consistent with other
usage.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/scsi_scan.c | 2 ++
 drivers/scsi/sd.c        | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 3c6e089e80c3..c754b1d566e0 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -660,11 +660,13 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
 		{
 			.sense = UNIT_ATTENTION,
 			.asc = 0x28,
+			.ascq = 0x00,
 			.result = SAM_STAT_CHECK_CONDITION,
 		},
 		{
 			.sense = UNIT_ATTENTION,
 			.asc = 0x29,
+			.ascq = 0x00,
 			.result = SAM_STAT_CHECK_CONDITION,
 		},
 		{
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 5b8668accf8e..78f5903cc8d0 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2740,6 +2740,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		{
 			.sense = UNIT_ATTENTION,
 			.asc = 0x29,
+			.ascq = 0x00,
 			.allowed = READ_CAPACITY_RETRIES_ON_RESET,
 			.result = SAM_STAT_CHECK_CONDITION,
 		},
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
  2025-08-14 18:28 ` [PATCH v2 1/9] scsi: Explicitly specify .ascq = 0x00 for ASC 0x28/0x29 scsi_failures Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:24   ` Bart Van Assche
  2025-08-14 18:29 ` [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al Ewan D. Milne
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

This makes the handling in read_capacity_10() consistent with other
cases, e.g. sd_spinup_disk().  Omitting .ascq in scsi_failure did not
result in wildcard matching, it only handled ASCQ 0x00.  This patch
changes the retry behavior, we no longer retry 3 times on ASC 0x3a
if a nonzero ASCQ is ever returned.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/sd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 78f5903cc8d0..e3b802b26f0e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2729,11 +2729,13 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		{
 			.sense = UNIT_ATTENTION,
 			.asc = 0x3A,
+			.ascq = SCMD_FAILURE_ASCQ_ANY,
 			.result = SAM_STAT_CHECK_CONDITION,
 		},
 		{
 			.sense = NOT_READY,
 			.asc = 0x3A,
+			.ascq = SCMD_FAILURE_ASCQ_ANY,
 			.result = SAM_STAT_CHECK_CONDITION,
 		},
 		 /* Device reset might occur several times so retry a lot */
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al.
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
  2025-08-14 18:28 ` [PATCH v2 1/9] scsi: Explicitly specify .ascq = 0x00 for ASC 0x28/0x29 scsi_failures Ewan D. Milne
  2025-08-14 18:29 ` [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:31   ` Bart Van Assche
  2025-08-15  2:43   ` Damien Le Moal
  2025-08-14 18:29 ` [PATCH v2 4/9] scsi: sd: Have scsi-ml retry read_capacity_16 errors Ewan D. Milne
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

That will make it easier to spot an inconsistent buffer size value.
Also, memset() the entire buffer rather than the 8 or 32 bytes expected
back from READ CAPACITY(10) or READ CAPACITY(16).

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/sd.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e3b802b26f0e..ae8eac4b1cb2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2629,7 +2629,8 @@ static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
 #define READ_CAPACITY_RETRIES_ON_RESET	10
 
 static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
-		struct queue_limits *lim, unsigned char *buffer)
+			    struct queue_limits *lim, unsigned char *buffer,
+			    unsigned int buflen)
 {
 	unsigned char cmd[16];
 	struct scsi_sense_hdr sshdr;
@@ -2651,7 +2652,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		cmd[0] = SERVICE_ACTION_IN_16;
 		cmd[1] = SAI_READ_CAPACITY_16;
 		cmd[13] = RC16_LEN;
-		memset(buffer, 0, RC16_LEN);
+		memset(buffer, 0, buflen);
 
 		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
 					      buffer, RC16_LEN, SD_TIMEOUT,
@@ -2719,8 +2720,13 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	return sector_size;
 }
 
+#define RC10_LEN 8
+#if RC10_LEN > SD_BUF_SIZE
+#error RC10_LEN must not be more than SD_BUF_SIZE
+#endif
+
 static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
-						unsigned char *buffer)
+			    unsigned char *buffer, unsigned int buflen)
 {
 	static const u8 cmd[10] = { READ_CAPACITY };
 	struct scsi_sense_hdr sshdr;
@@ -2765,7 +2771,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	sector_t lba;
 	unsigned sector_size;
 
-	memset(buffer, 0, 8);
+	memset(buffer, 0, buflen);
 
 	the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
 				      8, SD_TIMEOUT, sdkp->max_retries,
@@ -2819,23 +2825,24 @@ static int sd_try_rc16_first(struct scsi_device *sdp)
  */
 static void
 sd_read_capacity(struct scsi_disk *sdkp, struct queue_limits *lim,
-		unsigned char *buffer)
+		 unsigned char *buffer, unsigned int buflen)
 {
 	int sector_size;
 	struct scsi_device *sdp = sdkp->device;
 
 	if (sd_try_rc16_first(sdp)) {
-		sector_size = read_capacity_16(sdkp, sdp, lim, buffer);
+		sector_size = read_capacity_16(sdkp, sdp, lim, buffer, buflen);
 		if (sector_size == -EOVERFLOW)
 			goto got_data;
 		if (sector_size == -ENODEV)
 			return;
 		if (sector_size < 0)
-			sector_size = read_capacity_10(sdkp, sdp, buffer);
+			sector_size = read_capacity_10(sdkp, sdp, buffer,
+						       buflen);
 		if (sector_size < 0)
 			return;
 	} else {
-		sector_size = read_capacity_10(sdkp, sdp, buffer);
+		sector_size = read_capacity_10(sdkp, sdp, buffer, buflen);
 		if (sector_size == -EOVERFLOW)
 			goto got_data;
 		if (sector_size < 0)
@@ -2845,7 +2852,8 @@ sd_read_capacity(struct scsi_disk *sdkp, struct queue_limits *lim,
 			int old_sector_size = sector_size;
 			sd_printk(KERN_NOTICE, sdkp, "Very big device. "
 					"Trying to use READ CAPACITY(16).\n");
-			sector_size = read_capacity_16(sdkp, sdp, lim, buffer);
+			sector_size = read_capacity_16(sdkp, sdp, lim, buffer,
+						       buflen);
 			if (sector_size < 0) {
 				sd_printk(KERN_NOTICE, sdkp,
 					"Using 0xffffffff as device size\n");
@@ -3730,7 +3738,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	 * react badly if we do.
 	 */
 	if (sdkp->media_present) {
-		sd_read_capacity(sdkp, &lim, buffer);
+		sd_read_capacity(sdkp, &lim, buffer, SD_BUF_SIZE);
 		/*
 		 * Some USB/UAS devices return generic values for mode pages
 		 * until the media has been accessed. Trigger a READ operation
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 4/9] scsi: sd: Have scsi-ml retry read_capacity_16 errors
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
                   ` (2 preceding siblings ...)
  2025-08-14 18:29 ` [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:33   ` Bart Van Assche
  2025-08-14 18:29 ` [PATCH v2 5/9] scsi: sd: Avoid passing potentially uninitialized "sense_valid" to read_capacity_error() Ewan D. Milne
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

This has read_capacity_16 have scsi-ml retry errors instead of driving
them itself.

There are 2 behavior changes with this patch:
1. There is one behavior change where we no longer retry when
scsi_execute_cmd returns < 0, but we should be ok. We don't need to retry
for failures like the queue being removed, and for the case where there
are no tags/reqs since the block layer waits/retries for us. For possible
memory allocation failures from blk_rq_map_kern we use GFP_NOIO, so
retrying will probably not help.
2. For the specific UAs we checked for and retried, we would get
READ_CAPACITY_RETRIES_ON_RESET retries plus whatever retries were left
from the main loop's retries. Each UA now gets
READ_CAPACITY_RETRIES_ON_RESET reties, and the other errors get up to 3
retries. This is most likely ok, because READ_CAPACITY_RETRIES_ON_RESET
is already 10 and is not based on anything specific like a spec or
device, so the extra 3 we got from the main loop was probably just an
accident and is not going to help.

Original patch by Mike Christie <michael.christie@oracle.com> modified
based upon review comments for an earlier version of this patch.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/sd.c | 107 +++++++++++++++++++++++++++++++---------------
 1 file changed, 73 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ae8eac4b1cb2..44422751c95e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2632,14 +2632,66 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 			    struct queue_limits *lim, unsigned char *buffer,
 			    unsigned int buflen)
 {
-	unsigned char cmd[16];
+	static const u8 cmd[16] = {
+		[0] = SERVICE_ACTION_IN_16,
+		[1] = SAI_READ_CAPACITY_16,
+		[13] = RC16_LEN,
+	};
 	struct scsi_sense_hdr sshdr;
+	struct scsi_failure failure_defs[] = {
+		/*
+		 * Do not retry Invalid Command Operation Code or Invalid
+		 * Field in CDB.
+		 */
+		{
+			.sense = ILLEGAL_REQUEST,
+			.asc = 0x20,
+			.ascq = 0x00,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		{
+			.sense = ILLEGAL_REQUEST,
+			.asc = 0x24,
+			.ascq = 0x00,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		/* Do not retry Medium Not Present */
+		{
+			.sense = UNIT_ATTENTION,
+			.asc = 0x3A,
+			.ascq = SCMD_FAILURE_ASCQ_ANY,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		{
+			.sense = NOT_READY,
+			.asc = 0x3A,
+			.ascq = SCMD_FAILURE_ASCQ_ANY,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		/* Device reset might occur several times so retry a lot */
+		{
+			.sense = UNIT_ATTENTION,
+			.asc = 0x29,
+			.ascq = 0x00,
+			.allowed = READ_CAPACITY_RETRIES_ON_RESET,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		/* Any other error not listed above retry 3 times */
+		{
+			.result = SCMD_FAILURE_RESULT_ANY,
+			.allowed = 3,
+		},
+		{}
+	};
+	struct scsi_failures failures = {
+		.failure_definitions = failure_defs,
+	};
 	const struct scsi_exec_args exec_args = {
 		.sshdr = &sshdr,
+		.failures = &failures,
 	};
 	int sense_valid = 0;
 	int the_result;
-	int retries = 3, reset_retries = READ_CAPACITY_RETRIES_ON_RESET;
 	unsigned int alignment;
 	unsigned long long lba;
 	unsigned sector_size;
@@ -2647,40 +2699,27 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	if (sdp->no_read_capacity_16)
 		return -EINVAL;
 
-	do {
-		memset(cmd, 0, 16);
-		cmd[0] = SERVICE_ACTION_IN_16;
-		cmd[1] = SAI_READ_CAPACITY_16;
-		cmd[13] = RC16_LEN;
-		memset(buffer, 0, buflen);
-
-		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
-					      buffer, RC16_LEN, SD_TIMEOUT,
-					      sdkp->max_retries, &exec_args);
-		if (the_result > 0) {
-			if (media_not_present(sdkp, &sshdr))
-				return -ENODEV;
+	memset(buffer, 0, buflen);
 
-			sense_valid = scsi_sense_valid(&sshdr);
-			if (sense_valid &&
-			    sshdr.sense_key == ILLEGAL_REQUEST &&
-			    (sshdr.asc == 0x20 || sshdr.asc == 0x24) &&
-			    sshdr.ascq == 0x00)
-				/* Invalid Command Operation Code or
-				 * Invalid Field in CDB, just retry
-				 * silently with RC10 */
-				return -EINVAL;
-			if (sense_valid &&
-			    sshdr.sense_key == UNIT_ATTENTION &&
-			    sshdr.asc == 0x29 && sshdr.ascq == 0x00)
-				/* Device reset might occur several times,
-				 * give it one more chance */
-				if (--reset_retries > 0)
-					continue;
-		}
-		retries--;
+	the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
+				      RC16_LEN, SD_TIMEOUT, sdkp->max_retries,
+				      &exec_args);
 
-	} while (the_result && retries);
+	if (the_result > 0) {
+		if (media_not_present(sdkp, &sshdr))
+			return -ENODEV;
+
+		sense_valid = scsi_sense_valid(&sshdr);
+		if (sense_valid && sshdr.sense_key == ILLEGAL_REQUEST &&
+		    (sshdr.asc == 0x20 || sshdr.asc == 0x24) &&
+		     sshdr.ascq == 0x00) {
+			/*
+			 * Invalid Command Operation Code or Invalid Field in
+			 * CDB, just retry silently with RC10
+			 */
+			return -EINVAL;
+		}
+	}
 
 	if (the_result) {
 		sd_print_result(sdkp, "Read Capacity(16) failed", the_result);
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 5/9] scsi: sd: Avoid passing potentially uninitialized "sense_valid" to read_capacity_error()
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
                   ` (3 preceding siblings ...)
  2025-08-14 18:29 ` [PATCH v2 4/9] scsi: sd: Have scsi-ml retry read_capacity_16 errors Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:34   ` Bart Van Assche
  2025-08-14 18:29 ` [PATCH v2 6/9] scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity() Ewan D. Milne
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

read_capacity_10() sets "sense_valid" in a different conditional statement prior to
calling read_capacity_error(), and does not use this value otherwise.  Move the call
to scsi_sense_valid() to read_capacity_error() instead of passing it as a parameter
from read_capacity_16() and read_capacity_10().

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/sd.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 44422751c95e..10ba6ec5012d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2597,9 +2597,10 @@ static void sd_config_protection(struct scsi_disk *sdkp,
 }
 
 static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
-			struct scsi_sense_hdr *sshdr, int sense_valid,
-			int the_result)
+				struct scsi_sense_hdr *sshdr, int the_result)
 {
+	bool sense_valid = scsi_sense_valid(sshdr);
+
 	if (sense_valid)
 		sd_print_sense_hdr(sdkp, sshdr);
 	else
@@ -2723,7 +2724,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 
 	if (the_result) {
 		sd_print_result(sdkp, "Read Capacity(16) failed", the_result);
-		read_capacity_error(sdkp, sdp, &sshdr, sense_valid, the_result);
+		read_capacity_error(sdkp, sdp, &sshdr, the_result);
 		return -EINVAL;
 	}
 
@@ -2805,7 +2806,6 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		.sshdr = &sshdr,
 		.failures = &failures,
 	};
-	int sense_valid = 0;
 	int the_result;
 	sector_t lba;
 	unsigned sector_size;
@@ -2817,15 +2817,13 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 				      &exec_args);
 
 	if (the_result > 0) {
-		sense_valid = scsi_sense_valid(&sshdr);
-
 		if (media_not_present(sdkp, &sshdr))
 			return -ENODEV;
 	}
 
 	if (the_result) {
 		sd_print_result(sdkp, "Read Capacity(10) failed", the_result);
-		read_capacity_error(sdkp, sdp, &sshdr, sense_valid, the_result);
+		read_capacity_error(sdkp, sdp, &sshdr, the_result);
 		return -EINVAL;
 	}
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 6/9] scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity()
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
                   ` (4 preceding siblings ...)
  2025-08-14 18:29 ` [PATCH v2 5/9] scsi: sd: Avoid passing potentially uninitialized "sense_valid" to read_capacity_error() Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:36   ` Bart Van Assche
  2025-08-14 18:29 ` [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data Ewan D. Milne
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

Remove checks for -EOVERFLOW in sd_read_capacity() because this value has not
been returned to it since commit 72deb455b5ec ("block: remove CONFIG_LBDAF").

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/sd.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 10ba6ec5012d..f1ab2409ea3e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2869,8 +2869,6 @@ sd_read_capacity(struct scsi_disk *sdkp, struct queue_limits *lim,
 
 	if (sd_try_rc16_first(sdp)) {
 		sector_size = read_capacity_16(sdkp, sdp, lim, buffer, buflen);
-		if (sector_size == -EOVERFLOW)
-			goto got_data;
 		if (sector_size == -ENODEV)
 			return;
 		if (sector_size < 0)
@@ -2880,8 +2878,6 @@ sd_read_capacity(struct scsi_disk *sdkp, struct queue_limits *lim,
 			return;
 	} else {
 		sector_size = read_capacity_10(sdkp, sdp, buffer, buflen);
-		if (sector_size == -EOVERFLOW)
-			goto got_data;
 		if (sector_size < 0)
 			return;
 		if ((sizeof(sdkp->capacity) > 4) &&
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
                   ` (5 preceding siblings ...)
  2025-08-14 18:29 ` [PATCH v2 6/9] scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity() Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:37   ` Bart Van Assche
  2025-08-15  2:50   ` Damien Le Moal
  2025-08-14 18:29 ` [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun() Ewan D. Milne
  2025-08-14 18:29 ` [PATCH v2 9/9] scsi: scsi_debug: Add option to suppress returned data but return good status Ewan D. Milne
  8 siblings, 2 replies; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

sd_read_capacity_10() and sd_read_capacity_16() do not check for underflow
and can extract invalid (e.g. zero) data when a malfunctioning device does
not actually transfer any data, but returnes a good status otherwise.
Check for this and retry, and log a message and return -EINVAL if we can't
get the capacity information.

We encountered a device that did this once but returned good data afterwards.

See similar commit 5cd3bbfad088 ("[SCSI] retry with missing data for INQUIRY")

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/sd.c | 56 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f1ab2409ea3e..20b5eebba968 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2639,6 +2639,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		[13] = RC16_LEN,
 	};
 	struct scsi_sense_hdr sshdr;
+	int count, resid;
 	struct scsi_failure failure_defs[] = {
 		/*
 		 * Do not retry Invalid Command Operation Code or Invalid
@@ -2689,6 +2690,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	};
 	const struct scsi_exec_args exec_args = {
 		.sshdr = &sshdr,
+		.resid = &resid,
 		.failures = &failures,
 	};
 	int sense_valid = 0;
@@ -2700,11 +2702,23 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	if (sdp->no_read_capacity_16)
 		return -EINVAL;
 
-	memset(buffer, 0, buflen);
+	for (count = 0; count < 3; ++count) {
+		memset(buffer, 0, buflen);
 
-	the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
-				      RC16_LEN, SD_TIMEOUT, sdkp->max_retries,
-				      &exec_args);
+		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
+					      buffer, RC16_LEN, SD_TIMEOUT,
+					      sdkp->max_retries, &exec_args);
+
+		if ((the_result == 0) && (resid == RC16_LEN)) {
+			/*
+			 * if nothing was transferred, we try
+			 * again. It's a workaround for a broken
+			 * device.
+			 */
+			continue;
+		}
+		break;
+	}
 
 	if (the_result > 0) {
 		if (media_not_present(sdkp, &sshdr))
@@ -2728,6 +2742,12 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		return -EINVAL;
 	}
 
+	if (resid == RC16_LEN) {
+		sd_printk(KERN_ERR, sdkp,
+			  "Read Capacity(16) returned good status but no data");
+		return -EINVAL;
+	}
+
 	sector_size = get_unaligned_be32(&buffer[8]);
 	lba = get_unaligned_be64(&buffer[0]);
 
@@ -2770,6 +2790,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 {
 	static const u8 cmd[10] = { READ_CAPACITY };
 	struct scsi_sense_hdr sshdr;
+	int count, resid;
 	struct scsi_failure failure_defs[] = {
 		/* Do not retry Medium Not Present */
 		{
@@ -2804,17 +2825,30 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 	};
 	const struct scsi_exec_args exec_args = {
 		.sshdr = &sshdr,
+		.resid = &resid,
 		.failures = &failures,
 	};
 	int the_result;
 	sector_t lba;
 	unsigned sector_size;
 
-	memset(buffer, 0, buflen);
+	for (count = 0; count < 3; ++count) {
+		memset(buffer, 0, buflen);
 
-	the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
-				      8, SD_TIMEOUT, sdkp->max_retries,
-				      &exec_args);
+		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
+					      buffer, RC10_LEN, SD_TIMEOUT,
+					      sdkp->max_retries, &exec_args);
+
+		if ((the_result == 0) && (resid == RC16_LEN)) {
+			/*
+			 * if nothing was transferred, we try
+			 * again. It's a workaround for a broken
+			 * device.
+			 */
+			continue;
+		}
+		break;
+	}
 
 	if (the_result > 0) {
 		if (media_not_present(sdkp, &sshdr))
@@ -2827,6 +2861,12 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
 		return -EINVAL;
 	}
 
+	if (resid == RC10_LEN) {
+		sd_printk(KERN_ERR, sdkp,
+			  "Read Capacity(10) returned good status but no data");
+		return -EINVAL;
+	}
+
 	sector_size = get_unaligned_be32(&buffer[4]);
 	lba = get_unaligned_be32(&buffer[0]);
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun()
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
                   ` (6 preceding siblings ...)
  2025-08-14 18:29 ` [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:37   ` Bart Van Assche
  2025-08-15  2:50   ` Damien Le Moal
  2025-08-14 18:29 ` [PATCH v2 9/9] scsi: scsi_debug: Add option to suppress returned data but return good status Ewan D. Milne
  8 siblings, 2 replies; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

Make code congruent with similar code in read_capacity_16()/read_capacity_10().

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/scsi_scan.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index c754b1d566e0..1346a52f55c4 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -717,14 +717,13 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
 				"scsi scan: INQUIRY %s with code 0x%x\n",
 				result ? "failed" : "successful", result));
 
-		if (result == 0) {
+		if ((result == 0) && (resid == try_inquiry_len)) {
 			/*
 			 * if nothing was transferred, we try
 			 * again. It's a workaround for some USB
 			 * devices.
 			 */
-			if (resid == try_inquiry_len)
-				continue;
+			continue;
 		}
 		break;
 	}
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 9/9] scsi: scsi_debug: Add option to suppress returned data but return good status
  2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
                   ` (7 preceding siblings ...)
  2025-08-14 18:29 ` [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun() Ewan D. Milne
@ 2025-08-14 18:29 ` Ewan D. Milne
  2025-08-14 21:41   ` Bart Van Assche
  8 siblings, 1 reply; 24+ messages in thread
From: Ewan D. Milne @ 2025-08-14 18:29 UTC (permalink / raw)
  To: linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

This is used to test the earlier read_capacity_10()/16() retry patch.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
---
 drivers/scsi/scsi_debug.c | 47 ++++++++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 353cb60e1abe..6239783bef21 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -230,6 +230,7 @@ struct tape_block {
 #define SDEBUG_OPT_NO_CDB_NOISE		0x4000
 #define SDEBUG_OPT_HOST_BUSY		0x8000
 #define SDEBUG_OPT_CMD_ABORT		0x10000
+#define SDEBUG_OPT_NO_DATA		0x20000
 #define SDEBUG_OPT_ALL_NOISE (SDEBUG_OPT_NOISE | SDEBUG_OPT_Q_NOISE | \
 			      SDEBUG_OPT_RESET_NOISE)
 #define SDEBUG_OPT_ALL_INJECTING (SDEBUG_OPT_RECOVERED_ERR | \
@@ -237,7 +238,8 @@ struct tape_block {
 				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR | \
 				  SDEBUG_OPT_SHORT_TRANSFER | \
 				  SDEBUG_OPT_HOST_BUSY | \
-				  SDEBUG_OPT_CMD_ABORT)
+				  SDEBUG_OPT_CMD_ABORT | \
+				  SDEBUG_OPT_NO_DATA)
 #define SDEBUG_OPT_RECOV_DIF_DIX (SDEBUG_OPT_RECOVERED_ERR | \
 				  SDEBUG_OPT_DIF_ERR | SDEBUG_OPT_DIX_ERR)
 
@@ -1633,7 +1635,7 @@ static int make_ua(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
 static int fill_from_dev_buffer(struct scsi_cmnd *scp, unsigned char *arr,
 				int arr_len)
 {
-	int act_len;
+	int act_len, resid;
 	struct scsi_data_buffer *sdb = &scp->sdb;
 
 	if (!sdb->length)
@@ -1641,9 +1643,18 @@ static int fill_from_dev_buffer(struct scsi_cmnd *scp, unsigned char *arr,
 	if (scp->sc_data_direction != DMA_FROM_DEVICE)
 		return DID_ERROR << 16;
 
-	act_len = sg_copy_from_buffer(sdb->table.sgl, sdb->table.nents,
-				      arr, arr_len);
-	scsi_set_resid(scp, scsi_bufflen(scp) - act_len);
+	/*
+	 * Conditionally suppress DATA IN transfer and leave resid set to bufflen.
+	 */
+	if (unlikely((sdebug_opts & SDEBUG_OPT_NO_DATA) &&
+		      atomic_read(&sdeb_inject_pending))) {
+		resid = scsi_bufflen(scp);
+	} else {
+		act_len = sg_copy_from_buffer(sdb->table.sgl, sdb->table.nents,
+					      arr, arr_len);
+		resid = scsi_bufflen(scp) - act_len;
+	}
+	scsi_set_resid(scp, resid);
 
 	return 0;
 }
@@ -1656,7 +1667,7 @@ static int fill_from_dev_buffer(struct scsi_cmnd *scp, unsigned char *arr,
 static int p_fill_from_dev_buffer(struct scsi_cmnd *scp, const void *arr,
 				  int arr_len, unsigned int off_dst)
 {
-	unsigned int act_len, n;
+	unsigned int act_len, n, resid;
 	struct scsi_data_buffer *sdb = &scp->sdb;
 	off_t skip = off_dst;
 
@@ -1665,13 +1676,23 @@ static int p_fill_from_dev_buffer(struct scsi_cmnd *scp, const void *arr,
 	if (scp->sc_data_direction != DMA_FROM_DEVICE)
 		return DID_ERROR << 16;
 
-	act_len = sg_pcopy_from_buffer(sdb->table.sgl, sdb->table.nents,
-				       arr, arr_len, skip);
-	pr_debug("%s: off_dst=%u, scsi_bufflen=%u, act_len=%u, resid=%d\n",
-		 __func__, off_dst, scsi_bufflen(scp), act_len,
-		 scsi_get_resid(scp));
-	n = scsi_bufflen(scp) - (off_dst + act_len);
-	scsi_set_resid(scp, min_t(u32, scsi_get_resid(scp), n));
+	/*
+	 * Conditionally suppress DATA IN transfer and leave resid set to bufflen.
+	 */
+	if (unlikely((sdebug_opts & SDEBUG_OPT_NO_DATA) &&
+		      atomic_read(&sdeb_inject_pending))) {
+		resid = scsi_bufflen(scp);
+	} else {
+		act_len = sg_pcopy_from_buffer(sdb->table.sgl, sdb->table.nents,
+					       arr, arr_len, skip);
+		pr_debug("%s: off_dst=%u, scsi_bufflen=%u, act_len=%u, resid=%d\n",
+			 __func__, off_dst, scsi_bufflen(scp), act_len,
+			 scsi_get_resid(scp));
+		n = scsi_bufflen(scp) - (off_dst + act_len);
+		resid = min_t(u32, scsi_get_resid(scp), n);
+	}
+	scsi_set_resid(scp, resid);
+
 	return 0;
 }
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a
  2025-08-14 18:29 ` [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a Ewan D. Milne
@ 2025-08-14 21:24   ` Bart Van Assche
  2025-08-15 19:17     ` Ewan Milne
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:24 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> This patch
> changes the retry behavior, we no longer retry 3 times on ASC 0x3a
> if a nonzero ASCQ is ever returned.

Hmm ... my understanding is that this patch makes read_capacity_10()
retry if ASC == 0x3a and ASCQ != 0 and also that without this patch
these retries don't happen. Did I perhaps misunderstand something?

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al.
  2025-08-14 18:29 ` [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al Ewan D. Milne
@ 2025-08-14 21:31   ` Bart Van Assche
  2025-08-15  2:43   ` Damien Le Moal
  1 sibling, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:31 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> -		memset(buffer, 0, RC16_LEN);
> +		memset(buffer, 0, buflen);

Hmm ... why to clear the entire buffer (512 bytes?) while only the first
32 bytes of the buffer will be used?

> -	memset(buffer, 0, 8);
> +	memset(buffer, 0, buflen);

A similar comment applies here.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/9] scsi: sd: Have scsi-ml retry read_capacity_16 errors
  2025-08-14 18:29 ` [PATCH v2 4/9] scsi: sd: Have scsi-ml retry read_capacity_16 errors Ewan D. Milne
@ 2025-08-14 21:33   ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:33 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> +	memset(buffer, 0, buflen);

Also here, why to clear the entire buffer instead of only the bytes that
will be read by sd.c?

Otherwise this patch looks good to me.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 5/9] scsi: sd: Avoid passing potentially uninitialized "sense_valid" to read_capacity_error()
  2025-08-14 18:29 ` [PATCH v2 5/9] scsi: sd: Avoid passing potentially uninitialized "sense_valid" to read_capacity_error() Ewan D. Milne
@ 2025-08-14 21:34   ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:34 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> read_capacity_10() sets "sense_valid" in a different conditional statement prior to
> calling read_capacity_error(), and does not use this value otherwise.  Move the call
> to scsi_sense_valid() to read_capacity_error() instead of passing it as a parameter
> from read_capacity_16() and read_capacity_10().

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 6/9] scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity()
  2025-08-14 18:29 ` [PATCH v2 6/9] scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity() Ewan D. Milne
@ 2025-08-14 21:36   ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:36 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> Remove checks for -EOVERFLOW in sd_read_capacity() because this value has not
> been returned to it since commit 72deb455b5ec ("block: remove CONFIG_LBDAF").
Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data
  2025-08-14 18:29 ` [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data Ewan D. Milne
@ 2025-08-14 21:37   ` Bart Van Assche
  2025-08-15  2:50   ` Damien Le Moal
  1 sibling, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:37 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> +		if ((the_result == 0) && (resid == RC16_LEN)) {

No superfluous parentheses please. Otherwise this patch looks good to me.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun()
  2025-08-14 18:29 ` [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun() Ewan D. Milne
@ 2025-08-14 21:37   ` Bart Van Assche
  2025-08-15  2:50   ` Damien Le Moal
  1 sibling, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:37 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> -		if (result == 0) {
> +		if ((result == 0) && (resid == try_inquiry_len)) {

Also here, no superfluous parentheses please.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 9/9] scsi: scsi_debug: Add option to suppress returned data but return good status
  2025-08-14 18:29 ` [PATCH v2 9/9] scsi: scsi_debug: Add option to suppress returned data but return good status Ewan D. Milne
@ 2025-08-14 21:41   ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2025-08-14 21:41 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert

On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> This is used to test the earlier read_capacity_10()/16() retry patch.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al.
  2025-08-14 18:29 ` [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al Ewan D. Milne
  2025-08-14 21:31   ` Bart Van Assche
@ 2025-08-15  2:43   ` Damien Le Moal
  2025-08-15 18:53     ` Ewan Milne
  1 sibling, 1 reply; 24+ messages in thread
From: Damien Le Moal @ 2025-08-15  2:43 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

On 8/15/25 03:29, Ewan D. Milne wrote:
> That will make it easier to spot an inconsistent buffer size value.

How ? the buffer size is actually not checked. You only memset the buffer.

> Also, memset() the entire buffer rather than the 8 or 32 bytes expected
> back from READ CAPACITY(10) or READ CAPACITY(16).

Why ? There is no explanation why that is needed. The command will not return
more than the transfer length specified. So memsetting bytes that will never be
used seems useless.

> 
> Signed-off-by: Ewan D. Milne <emilne@redhat.com>
> ---
>  drivers/scsi/sd.c | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index e3b802b26f0e..ae8eac4b1cb2 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2629,7 +2629,8 @@ static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  #define READ_CAPACITY_RETRIES_ON_RESET	10
>  
>  static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> -		struct queue_limits *lim, unsigned char *buffer)
> +			    struct queue_limits *lim, unsigned char *buffer,
> +			    unsigned int buflen)
>  {
>  	unsigned char cmd[16];
>  	struct scsi_sense_hdr sshdr;
> @@ -2651,7 +2652,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  		cmd[0] = SERVICE_ACTION_IN_16;
>  		cmd[1] = SAI_READ_CAPACITY_16;
>  		cmd[13] = RC16_LEN;
> -		memset(buffer, 0, RC16_LEN);
> +		memset(buffer, 0, buflen);

I would leave this as-is.

>  >  		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
>  					      buffer, RC16_LEN, SD_TIMEOUT,
> @@ -2719,8 +2720,13 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  	return sector_size;
>  }
>  
> +#define RC10_LEN 8
> +#if RC10_LEN > SD_BUF_SIZE
> +#error RC10_LEN must not be more than SD_BUF_SIZE
> +#endif
> +
>  static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> -						unsigned char *buffer)
> +			    unsigned char *buffer, unsigned int buflen)
>  {
>  	static const u8 cmd[10] = { READ_CAPACITY };
>  	struct scsi_sense_hdr sshdr;
> @@ -2765,7 +2771,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  	sector_t lba;
>  	unsigned sector_size;
>  
> -	memset(buffer, 0, 8);
> +	memset(buffer, 0, buflen);

Same here, but maybe define RC10_LEN instead of having the magic "8" value
hardcoded ?

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data
  2025-08-14 18:29 ` [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data Ewan D. Milne
  2025-08-14 21:37   ` Bart Van Assche
@ 2025-08-15  2:50   ` Damien Le Moal
  2025-08-15 18:31     ` Ewan Milne
  1 sibling, 1 reply; 24+ messages in thread
From: Damien Le Moal @ 2025-08-15  2:50 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

On 8/15/25 03:29, Ewan D. Milne wrote:
> sd_read_capacity_10() and sd_read_capacity_16() do not check for underflow
> and can extract invalid (e.g. zero) data when a malfunctioning device does
> not actually transfer any data, but returnes a good status otherwise.
> Check for this and retry, and log a message and return -EINVAL if we can't
> get the capacity information.
> 
> We encountered a device that did this once but returned good data afterwards.
> 
> See similar commit 5cd3bbfad088 ("[SCSI] retry with missing data for INQUIRY")
> 
> Signed-off-by: Ewan D. Milne <emilne@redhat.com>
> ---
>  drivers/scsi/sd.c | 56 ++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 48 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index f1ab2409ea3e..20b5eebba968 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2639,6 +2639,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  		[13] = RC16_LEN,
>  	};
>  	struct scsi_sense_hdr sshdr;
> +	int count, resid;
>  	struct scsi_failure failure_defs[] = {
>  		/*
>  		 * Do not retry Invalid Command Operation Code or Invalid
> @@ -2689,6 +2690,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  	};
>  	const struct scsi_exec_args exec_args = {
>  		.sshdr = &sshdr,
> +		.resid = &resid,
>  		.failures = &failures,
>  	};
>  	int sense_valid = 0;
> @@ -2700,11 +2702,23 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  	if (sdp->no_read_capacity_16)
>  		return -EINVAL;
>  
> -	memset(buffer, 0, buflen);
> +	for (count = 0; count < 3; ++count) {
> +		memset(buffer, 0, buflen);
>  
> -	the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
> -				      RC16_LEN, SD_TIMEOUT, sdkp->max_retries,
> -				      &exec_args);
> +		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
> +					      buffer, RC16_LEN, SD_TIMEOUT,
> +					      sdkp->max_retries, &exec_args);
> +
> +		if ((the_result == 0) && (resid == RC16_LEN)) {

You do not need the inner parenthesis. Also, it seems to me that this check
should simply be:

		if (resid)

Because any incomplete read capacity buffer is bound to be invalid.

> +			/*
> +			 * if nothing was transferred, we try
> +			 * again. It's a workaround for a broken
> +			 * device.
> +			 */
> +			continue;
> +		}
> +		break;
> +	}
>  
>  	if (the_result > 0) {
>  		if (media_not_present(sdkp, &sshdr))
> @@ -2728,6 +2742,12 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  		return -EINVAL;
>  	}
>  
> +	if (resid == RC16_LEN) {
> +		sd_printk(KERN_ERR, sdkp,
> +			  "Read Capacity(16) returned good status but no data");

Shouldn't this be a warning instead of error ? After all, there was no error...
And I would prefer seeing a warning for a bad device. The message would also be
better mentioning that this is the device fault.

> +		return -EINVAL;
> +	}
> +
>  	sector_size = get_unaligned_be32(&buffer[8]);
>  	lba = get_unaligned_be64(&buffer[0]);
>  
> @@ -2770,6 +2790,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  {
>  	static const u8 cmd[10] = { READ_CAPACITY };
>  	struct scsi_sense_hdr sshdr;
> +	int count, resid;
>  	struct scsi_failure failure_defs[] = {
>  		/* Do not retry Medium Not Present */
>  		{
> @@ -2804,17 +2825,30 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  	};
>  	const struct scsi_exec_args exec_args = {
>  		.sshdr = &sshdr,
> +		.resid = &resid,
>  		.failures = &failures,
>  	};
>  	int the_result;
>  	sector_t lba;
>  	unsigned sector_size;
>  
> -	memset(buffer, 0, buflen);
> +	for (count = 0; count < 3; ++count) {
> +		memset(buffer, 0, buflen);
>  
> -	the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
> -				      8, SD_TIMEOUT, sdkp->max_retries,
> -				      &exec_args);
> +		the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
> +					      buffer, RC10_LEN, SD_TIMEOUT,
> +					      sdkp->max_retries, &exec_args);
> +
> +		if ((the_result == 0) && (resid == RC16_LEN)) {

Same comment here: if (resid) ?

> +			/*
> +			 * if nothing was transferred, we try
> +			 * again. It's a workaround for a broken
> +			 * device.
> +			 */
> +			continue;
> +		}
> +		break;
> +	}
>  
>  	if (the_result > 0) {
>  		if (media_not_present(sdkp, &sshdr))
> @@ -2827,6 +2861,12 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
>  		return -EINVAL;
>  	}
>  
> +	if (resid == RC10_LEN) {
> +		sd_printk(KERN_ERR, sdkp,
> +			  "Read Capacity(10) returned good status but no data");
> +		return -EINVAL;
> +	}
> +
>  	sector_size = get_unaligned_be32(&buffer[4]);
>  	lba = get_unaligned_be32(&buffer[0]);
>  


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun()
  2025-08-14 18:29 ` [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun() Ewan D. Milne
  2025-08-14 21:37   ` Bart Van Assche
@ 2025-08-15  2:50   ` Damien Le Moal
  1 sibling, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2025-08-15  2:50 UTC (permalink / raw)
  To: Ewan D. Milne, linux-scsi; +Cc: michael.christie, dgilbert, bvanassche

On 8/15/25 03:29, Ewan D. Milne wrote:
> Make code congruent with similar code in read_capacity_16()/read_capacity_10().
> 
> Signed-off-by: Ewan D. Milne <emilne@redhat.com>
> ---
>  drivers/scsi/scsi_scan.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index c754b1d566e0..1346a52f55c4 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -717,14 +717,13 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
>  				"scsi scan: INQUIRY %s with code 0x%x\n",
>  				result ? "failed" : "successful", result));
>  
> -		if (result == 0) {
> +		if ((result == 0) && (resid == try_inquiry_len)) {

No need for the inner parenthesis.

>  			/*
>  			 * if nothing was transferred, we try
>  			 * again. It's a workaround for some USB
>  			 * devices.
>  			 */
> -			if (resid == try_inquiry_len)
> -				continue;
> +			continue;
>  		}
>  		break;
>  	}


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data
  2025-08-15  2:50   ` Damien Le Moal
@ 2025-08-15 18:31     ` Ewan Milne
  0 siblings, 0 replies; 24+ messages in thread
From: Ewan Milne @ 2025-08-15 18:31 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-scsi, michael.christie, dgilbert, bvanassche

On Thu, Aug 14, 2025 at 10:50 PM Damien Le Moal <dlemoal@kernel.org> wrote:
>
> On 8/15/25 03:29, Ewan D. Milne wrote:
> > sd_read_capacity_10() and sd_read_capacity_16() do not check for underflow
> > and can extract invalid (e.g. zero) data when a malfunctioning device does
> > not actually transfer any data, but returnes a good status otherwise.
> > Check for this and retry, and log a message and return -EINVAL if we can't
> > get the capacity information.
> >
> > We encountered a device that did this once but returned good data afterwards.
> >
> > See similar commit 5cd3bbfad088 ("[SCSI] retry with missing data for INQUIRY")
> >
> > Signed-off-by: Ewan D. Milne <emilne@redhat.com>
> > ---
> >  drivers/scsi/sd.c | 56 ++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 48 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > index f1ab2409ea3e..20b5eebba968 100644
> > --- a/drivers/scsi/sd.c
> > +++ b/drivers/scsi/sd.c
> > @@ -2639,6 +2639,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >
> > -     the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
> > -                                   RC16_LEN, SD_TIMEOUT, sdkp->max_retries,
> > -                                   &exec_args);
> > +             the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
> > +                                           buffer, RC16_LEN, SD_TIMEOUT,
> > +                                           sdkp->max_retries, &exec_args);
> > +
> > +             if ((the_result == 0) && (resid == RC16_LEN)) {
>
> You do not need the inner parenthesis. Also, it seems to me that this check
> should simply be:
>
>                 if (resid)
>
> Because any incomplete read capacity buffer is bound to be invalid.

No.  Because if the result is nonzero we want to exit the retry
(break: vs. continue;).
The retry is only intended to handle the case where the status was
good but there
was no data transferred (there will be no data transferred if the result != 0).

Parenthesis, sure.

>
> > +                     /*
> > +                      * if nothing was transferred, we try
> > +                      * again. It's a workaround for a broken
> > +                      * device.
> > +                      */
> > +                     continue;
> > +             }
> > +             break;
> > +     }
> >
> >       if (the_result > 0) {
> >               if (media_not_present(sdkp, &sshdr))
> > @@ -2728,6 +2742,12 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >               return -EINVAL;
> >       }
> >
> > +     if (resid == RC16_LEN) {
> > +             sd_printk(KERN_ERR, sdkp,
> > +                       "Read Capacity(16) returned good status but no data");
>
> Shouldn't this be a warning instead of error ? After all, there was no error...
> And I would prefer seeing a warning for a bad device. The message would also be
> better mentioning that this is the device fault.

KERN_ERR was on purpose, some people have dmesg -n set to suppress warnings.
I mean, it is an error, although the language in SBC is I think
unclear, it is definitely
implied that a certain amount of data is supposed to be returned.
(SBC-3 5.10 / 5.11
don't actually say "shall").

Although the case I looked at was caused by a F/W bug on a hard drive, you could
imagine that a similar problem might be e.g. USB bridge or something
(see the code
in scsi_probe_lun()).  So I prefer to just explain just what the
kernel actually sees.


>
> > +             return -EINVAL;
> > +     }
> > +
> >       sector_size = get_unaligned_be32(&buffer[8]);
> >       lba = get_unaligned_be64(&buffer[0]);
> >
> > @@ -2770,6 +2790,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >  {
> >       static const u8 cmd[10] = { READ_CAPACITY };
> >       struct scsi_sense_hdr sshdr;
> > +     int count, resid;
> >       struct scsi_failure failure_defs[] = {
> >               /* Do not retry Medium Not Present */
> >               {
> > @@ -2804,17 +2825,30 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >       };
> >       const struct scsi_exec_args exec_args = {
> >               .sshdr = &sshdr,
> > +             .resid = &resid,
> >               .failures = &failures,
> >       };
> >       int the_result;
> >       sector_t lba;
> >       unsigned sector_size;
> >
> > -     memset(buffer, 0, buflen);
> > +     for (count = 0; count < 3; ++count) {
> > +             memset(buffer, 0, buflen);
> >
> > -     the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN, buffer,
> > -                                   8, SD_TIMEOUT, sdkp->max_retries,
> > -                                   &exec_args);
> > +             the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
> > +                                           buffer, RC10_LEN, SD_TIMEOUT,
> > +                                           sdkp->max_retries, &exec_args);
> > +
> > +             if ((the_result == 0) && (resid == RC16_LEN)) {
>
> Same comment here: if (resid) ?

See above.

-Ewan

>
> > +                     /*
> > +                      * if nothing was transferred, we try
> > +                      * again. It's a workaround for a broken
> > +                      * device.
> > +                      */
> > +                     continue;
> > +             }
> > +             break;
> > +     }
> >
> >       if (the_result > 0) {
> >               if (media_not_present(sdkp, &sshdr))
> > @@ -2827,6 +2861,12 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >               return -EINVAL;
> >       }
> >
> > +     if (resid == RC10_LEN) {
> > +             sd_printk(KERN_ERR, sdkp,
> > +                       "Read Capacity(10) returned good status but no data");
> > +             return -EINVAL;
> > +     }
> > +
> >       sector_size = get_unaligned_be32(&buffer[4]);
> >       lba = get_unaligned_be32(&buffer[0]);
> >
>
>
> --
> Damien Le Moal
> Western Digital Research
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al.
  2025-08-15  2:43   ` Damien Le Moal
@ 2025-08-15 18:53     ` Ewan Milne
  0 siblings, 0 replies; 24+ messages in thread
From: Ewan Milne @ 2025-08-15 18:53 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-scsi, michael.christie, dgilbert, bvanassche

On Thu, Aug 14, 2025 at 10:47 PM Damien Le Moal <dlemoal@kernel.org> wrote:
>
> On 8/15/25 03:29, Ewan D. Milne wrote:
> > That will make it easier to spot an inconsistent buffer size value.
>
> How ? the buffer size is actually not checked. You only memset the buffer.

Because Bart asked for this in an earlier review (he actually wanted the code to
use ARRAY_SIZE(), but since the buffer is obtained from kmalloc() that
won't work.

I'll just remove this all in v3, it isn't actually required for what
the series fixes.

>
> > Also, memset() the entire buffer rather than the 8 or 32 bytes expected
> > back from READ CAPACITY(10) or READ CAPACITY(16).
>
> Why ? There is no explanation why that is needed. The command will not return
> more than the transfer length specified. So memsetting bytes that will never be
> used seems useless.

It's not really needed, I guess, no other routines that use this
buffer do the memset().
I'll just leave it using RC10_LEN / RC16_LEN as you suggested.

-Ewan

>
> >
> > Signed-off-by: Ewan D. Milne <emilne@redhat.com>
> > ---
> >  drivers/scsi/sd.c | 28 ++++++++++++++++++----------
> >  1 file changed, 18 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > index e3b802b26f0e..ae8eac4b1cb2 100644
> > --- a/drivers/scsi/sd.c
> > +++ b/drivers/scsi/sd.c
> > @@ -2629,7 +2629,8 @@ static void read_capacity_error(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >  #define READ_CAPACITY_RETRIES_ON_RESET       10
> >
> >  static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> > -             struct queue_limits *lim, unsigned char *buffer)
> > +                         struct queue_limits *lim, unsigned char *buffer,
> > +                         unsigned int buflen)
> >  {
> >       unsigned char cmd[16];
> >       struct scsi_sense_hdr sshdr;
> > @@ -2651,7 +2652,7 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >               cmd[0] = SERVICE_ACTION_IN_16;
> >               cmd[1] = SAI_READ_CAPACITY_16;
> >               cmd[13] = RC16_LEN;
> > -             memset(buffer, 0, RC16_LEN);
> > +             memset(buffer, 0, buflen);
>
> I would leave this as-is.

Sure.  See above.

>
> >  >            the_result = scsi_execute_cmd(sdp, cmd, REQ_OP_DRV_IN,
> >                                             buffer, RC16_LEN, SD_TIMEOUT,
> > @@ -2719,8 +2720,13 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >       return sector_size;
> >  }
> >
> > +#define RC10_LEN 8
> > +#if RC10_LEN > SD_BUF_SIZE
> > +#error RC10_LEN must not be more than SD_BUF_SIZE
> > +#endif
> > +
> >  static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> > -                                             unsigned char *buffer)
> > +                         unsigned char *buffer, unsigned int buflen)
> >  {
> >       static const u8 cmd[10] = { READ_CAPACITY };
> >       struct scsi_sense_hdr sshdr;
> > @@ -2765,7 +2771,7 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp,
> >       sector_t lba;
> >       unsigned sector_size;
> >
> > -     memset(buffer, 0, 8);
> > +     memset(buffer, 0, buflen);
>
> Same here, but maybe define RC10_LEN instead of having the magic "8" value
> hardcoded ?

Right.  I'll put this in the other patch.

-Ewan
>
> --
> Damien Le Moal
> Western Digital Research
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a
  2025-08-14 21:24   ` Bart Van Assche
@ 2025-08-15 19:17     ` Ewan Milne
  0 siblings, 0 replies; 24+ messages in thread
From: Ewan Milne @ 2025-08-15 19:17 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-scsi, michael.christie, dgilbert

On Thu, Aug 14, 2025 at 5:25 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 8/14/25 11:29 AM, Ewan D. Milne wrote:
> > This patch
> > changes the retry behavior, we no longer retry 3 times on ASC 0x3a
> > if a nonzero ASCQ is ever returned.
>
> Hmm ... my understanding is that this patch makes read_capacity_10()
> retry if ASC == 0x3a and ASCQ != 0 and also that without this patch
> these retries don't happen. Did I perhaps misunderstand something?
>
> Bart.
>

As I understand the scsi_failure mechanism, scsi_check_passthrough()
goes through the scsi_failure entries looking for a match.  If an entry
doesn't match (i.e. in the case where .ascq was not explicit and is therefore
zero).  It moves on to the next one.  If none match no retry is performed.

But, the last entry in failure_defs[] in read_capacity_10() matches on
any result (SCMD_FAILURE_RESULT_ANY).  So it will retry 3 times
if nothing matches.

Current behavior (before patch):
  Match on either UNIT ATTENTION or NOT READY, with ASC 0x3a:
    ASCQ 00 (default, since not specified) - DO NOT RETRY
    ASCQ not 00 - RETRY, since scsi_failure entry won't match, but later
                            entry SCMD_FAILURE_RESULT_ANY will match

Behavior with patch:
  Match on either UNIT ATTENTION or NOT READY, with ASC 0x3a:
     ASCQ 00 - DO NOT RETRY
     ASCQ not 00 - DO NOT RETRY

Behavior change is not to retry regardless of ASCQ value.
I think this is what we want to do here, I thought this was what you meant
in your review of v1.

-Ewan


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-08-15 19:17 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-14 18:28 [PATCH v2 0/9] Retry READ CAPACITY(10)/(16) with good status but no data Ewan D. Milne
2025-08-14 18:28 ` [PATCH v2 1/9] scsi: Explicitly specify .ascq = 0x00 for ASC 0x28/0x29 scsi_failures Ewan D. Milne
2025-08-14 18:29 ` [PATCH v2 2/9] scsi: sd: Explicitly specify .ascq = SCMD_FAILURE_ASCQ_ANY for ASC 0x3a Ewan D. Milne
2025-08-14 21:24   ` Bart Van Assche
2025-08-15 19:17     ` Ewan Milne
2025-08-14 18:29 ` [PATCH v2 3/9] scsi: sd: Pass buffer length as argument to sd_read_capacity() et al Ewan D. Milne
2025-08-14 21:31   ` Bart Van Assche
2025-08-15  2:43   ` Damien Le Moal
2025-08-15 18:53     ` Ewan Milne
2025-08-14 18:29 ` [PATCH v2 4/9] scsi: sd: Have scsi-ml retry read_capacity_16 errors Ewan D. Milne
2025-08-14 21:33   ` Bart Van Assche
2025-08-14 18:29 ` [PATCH v2 5/9] scsi: sd: Avoid passing potentially uninitialized "sense_valid" to read_capacity_error() Ewan D. Milne
2025-08-14 21:34   ` Bart Van Assche
2025-08-14 18:29 ` [PATCH v2 6/9] scsi: sd: Remove checks for -EOVERFLOW in sd_read_capacity() Ewan D. Milne
2025-08-14 21:36   ` Bart Van Assche
2025-08-14 18:29 ` [PATCH v2 7/9] scsi: sd: Check for and retry in case of READ_CAPCITY(10)/(16) returning no data Ewan D. Milne
2025-08-14 21:37   ` Bart Van Assche
2025-08-15  2:50   ` Damien Le Moal
2025-08-15 18:31     ` Ewan Milne
2025-08-14 18:29 ` [PATCH v2 8/9] scsi: Simplify nested if conditional in scsi_probe_lun() Ewan D. Milne
2025-08-14 21:37   ` Bart Van Assche
2025-08-15  2:50   ` Damien Le Moal
2025-08-14 18:29 ` [PATCH v2 9/9] scsi: scsi_debug: Add option to suppress returned data but return good status Ewan D. Milne
2025-08-14 21:41   ` Bart Van Assche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).