* [PATCH v2 0/1] scsi: scsi_dh_alua: use device timeout instead of constant @ 2026-03-25 15:15 Brian Bunker 2026-03-25 15:15 ` [PATCH v2] scsi: scsi_dh_alua: use the device timeout rather than a constant Brian Bunker 0 siblings, 1 reply; 4+ messages in thread From: Brian Bunker @ 2026-03-25 15:15 UTC (permalink / raw) To: linux-scsi, hare; +Cc: Brian Bunker This patch changes the ALUA device handler to use the SCSI device's configured timeout (rq_timeout) instead of the hardcoded 60-second ALUA_FAILOVER_TIMEOUT constant. This allows administrators to control ALUA-related timeouts via the standard SCSI device timeout sysfs interface (/sys/block/sdX/device/timeout). Changes in v2: - Added READ_ONCE() when accessing sdev->request_queue->rq_timeout since this value can be modified dynamically by userspace via sysfs - Added fallback to ALUA_FAILOVER_TIMEOUT if rq_timeout is 0, using the ?: operator, to handle any edge cases where the timeout might be uninitialized (though sysfs store functions reject 0) - Retained ALUA_FAILOVER_TIMEOUT constant as the fallback value Regarding dynamic changes to rq_timeout: this is acceptable because each SCSI command reads the current timeout when issued. If an admin changes the timeout mid-operation, subsequent commands will use the new value. This is actually desirable as it allows administrators to adjust timeouts on the fly during problematic failovers without reloading modules. Brian Bunker (1): scsi: scsi_dh_alua: use the device timeout rather than a constant drivers/scsi/device_handler/scsi_dh_alua.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) -- 2.50.1 (Apple Git-155) ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2] scsi: scsi_dh_alua: use the device timeout rather than a constant 2026-03-25 15:15 [PATCH v2 0/1] scsi: scsi_dh_alua: use device timeout instead of constant Brian Bunker @ 2026-03-25 15:15 ` Brian Bunker 2026-03-26 10:22 ` Hannes Reinecke 0 siblings, 1 reply; 4+ messages in thread From: Brian Bunker @ 2026-03-25 15:15 UTC (permalink / raw) To: linux-scsi, hare; +Cc: Brian Bunker, Krishna Kant, Riya Savla Instead of using a constant for timeouts, use the timeout of the SCSI device itself. There are reasons why someone might want to extend the SCSI timeout and having the constant out of sync can lead to early timeouts. Signed-off-by: Krishna Kant <krishna.kant@purestorage.com> Signed-off-by: Riya Savla <rsavla@purestorage.com> Signed-off-by: Brian Bunker <brian@purestorage.com> --- drivers/scsi/device_handler/scsi_dh_alua.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index efb08b9b145a..a4ee67109548 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -143,7 +143,7 @@ static int submit_rtpg(struct scsi_device *sdev, unsigned char *buff, put_unaligned_be32(bufflen, &cdb[6]); return scsi_execute_cmd(sdev, cdb, opf, buff, bufflen, - ALUA_FAILOVER_TIMEOUT * HZ, + READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, ALUA_FAILOVER_RETRIES, &exec_args); } @@ -178,7 +178,7 @@ static int submit_stpg(struct scsi_device *sdev, int group_id, put_unaligned_be32(stpg_len, &cdb[6]); return scsi_execute_cmd(sdev, cdb, opf, stpg_data, - stpg_len, ALUA_FAILOVER_TIMEOUT * HZ, + stpg_len, READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, ALUA_FAILOVER_RETRIES, &exec_args); } @@ -512,7 +512,7 @@ static int alua_tur(struct scsi_device *sdev) struct scsi_sense_hdr sense_hdr; int retval; - retval = scsi_test_unit_ready(sdev, ALUA_FAILOVER_TIMEOUT * HZ, + retval = scsi_test_unit_ready(sdev, READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, ALUA_FAILOVER_RETRIES, &sense_hdr); if ((sense_hdr.sense_key == NOT_READY || sense_hdr.sense_key == UNIT_ATTENTION) && @@ -552,7 +552,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) valid_states_old = pg->valid_states; if (!pg->expiry) { - unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ; + unsigned long transition_tmo = min(READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, + (unsigned long)U8_MAX * HZ); if (pg->transition_tmo) transition_tmo = pg->transition_tmo * HZ; @@ -664,7 +665,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) if ((buff[4] & RTPG_FMT_MASK) == RTPG_FMT_EXT_HDR && buff[5] != 0) pg->transition_tmo = buff[5]; else - pg->transition_tmo = ALUA_FAILOVER_TIMEOUT; + pg->transition_tmo = min((READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ) / HZ, + (unsigned long)U8_MAX); if (orig_transition_tmo != pg->transition_tmo) { sdev_printk(KERN_INFO, sdev, -- 2.50.1 (Apple Git-155) ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2] scsi: scsi_dh_alua: use the device timeout rather than a constant 2026-03-25 15:15 ` [PATCH v2] scsi: scsi_dh_alua: use the device timeout rather than a constant Brian Bunker @ 2026-03-26 10:22 ` Hannes Reinecke 2026-03-27 0:12 ` Brian Bunker 0 siblings, 1 reply; 4+ messages in thread From: Hannes Reinecke @ 2026-03-26 10:22 UTC (permalink / raw) To: Brian Bunker, linux-scsi; +Cc: Krishna Kant, Riya Savla On 3/25/26 16:15, Brian Bunker wrote: > Instead of using a constant for timeouts, use the timeout of the SCSI > device itself. There are reasons why someone might want to extend > the SCSI timeout and having the constant out of sync can lead to > early timeouts. > > Signed-off-by: Krishna Kant <krishna.kant@purestorage.com> > Signed-off-by: Riya Savla <rsavla@purestorage.com> > Signed-off-by: Brian Bunker <brian@purestorage.com> > --- > drivers/scsi/device_handler/scsi_dh_alua.c | 12 +++++++----- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > index efb08b9b145a..a4ee67109548 100644 > --- a/drivers/scsi/device_handler/scsi_dh_alua.c > +++ b/drivers/scsi/device_handler/scsi_dh_alua.c > @@ -143,7 +143,7 @@ static int submit_rtpg(struct scsi_device *sdev, unsigned char *buff, > put_unaligned_be32(bufflen, &cdb[6]); > > return scsi_execute_cmd(sdev, cdb, opf, buff, bufflen, > - ALUA_FAILOVER_TIMEOUT * HZ, > + READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > ALUA_FAILOVER_RETRIES, &exec_args); > } > > @@ -178,7 +178,7 @@ static int submit_stpg(struct scsi_device *sdev, int group_id, > put_unaligned_be32(stpg_len, &cdb[6]); > > return scsi_execute_cmd(sdev, cdb, opf, stpg_data, > - stpg_len, ALUA_FAILOVER_TIMEOUT * HZ, > + stpg_len, READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > ALUA_FAILOVER_RETRIES, &exec_args); > } > > @@ -512,7 +512,7 @@ static int alua_tur(struct scsi_device *sdev) > struct scsi_sense_hdr sense_hdr; > int retval; > > - retval = scsi_test_unit_ready(sdev, ALUA_FAILOVER_TIMEOUT * HZ, > + retval = scsi_test_unit_ready(sdev, READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > ALUA_FAILOVER_RETRIES, &sense_hdr); > if ((sense_hdr.sense_key == NOT_READY || > sense_hdr.sense_key == UNIT_ATTENTION) && > @@ -552,7 +552,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) > valid_states_old = pg->valid_states; > > if (!pg->expiry) { > - unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ; > + unsigned long transition_tmo = min(READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > + (unsigned long)U8_MAX * HZ); > > if (pg->transition_tmo) > transition_tmo = pg->transition_tmo * HZ; > @@ -664,7 +665,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) > if ((buff[4] & RTPG_FMT_MASK) == RTPG_FMT_EXT_HDR && buff[5] != 0) > pg->transition_tmo = buff[5]; > else > - pg->transition_tmo = ALUA_FAILOVER_TIMEOUT; > + pg->transition_tmo = min((READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ) / HZ, > + (unsigned long)U8_MAX); > > if (orig_transition_tmo != pg->transition_tmo) { > sdev_printk(KERN_INFO, sdev, Weelll ... The transition timeout is _vastly_ different from the device command timeout. While the latter tends to be rather small (ie in the seconds range), the former can take _really_ long time. Ask you competitors, they regularly require tens of _minuntes_ here. Having is settable is a good idea, but not to the command timeout. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.com +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v2] scsi: scsi_dh_alua: use the device timeout rather than a constant 2026-03-26 10:22 ` Hannes Reinecke @ 2026-03-27 0:12 ` Brian Bunker 0 siblings, 0 replies; 4+ messages in thread From: Brian Bunker @ 2026-03-27 0:12 UTC (permalink / raw) To: Hannes Reinecke; +Cc: linux-scsi, Krishna Kant, Riya Savla On Thu, Mar 26, 2026 at 3:22 AM Hannes Reinecke <hare@suse.com> wrote: > > On 3/25/26 16:15, Brian Bunker wrote: > > Instead of using a constant for timeouts, use the timeout of the SCSI > > device itself. There are reasons why someone might want to extend > > the SCSI timeout and having the constant out of sync can lead to > > early timeouts. > > > > Signed-off-by: Krishna Kant <krishna.kant@purestorage.com> > > Signed-off-by: Riya Savla <rsavla@purestorage.com> > > Signed-off-by: Brian Bunker <brian@purestorage.com> > > --- > > drivers/scsi/device_handler/scsi_dh_alua.c | 12 +++++++----- > > 1 file changed, 7 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c > > index efb08b9b145a..a4ee67109548 100644 > > --- a/drivers/scsi/device_handler/scsi_dh_alua.c > > +++ b/drivers/scsi/device_handler/scsi_dh_alua.c > > @@ -143,7 +143,7 @@ static int submit_rtpg(struct scsi_device *sdev, unsigned char *buff, > > put_unaligned_be32(bufflen, &cdb[6]); > > > > return scsi_execute_cmd(sdev, cdb, opf, buff, bufflen, > > - ALUA_FAILOVER_TIMEOUT * HZ, > > + READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > > ALUA_FAILOVER_RETRIES, &exec_args); > > } > > > > @@ -178,7 +178,7 @@ static int submit_stpg(struct scsi_device *sdev, int group_id, > > put_unaligned_be32(stpg_len, &cdb[6]); > > > > return scsi_execute_cmd(sdev, cdb, opf, stpg_data, > > - stpg_len, ALUA_FAILOVER_TIMEOUT * HZ, > > + stpg_len, READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > > ALUA_FAILOVER_RETRIES, &exec_args); > > } > > > > @@ -512,7 +512,7 @@ static int alua_tur(struct scsi_device *sdev) > > struct scsi_sense_hdr sense_hdr; > > int retval; > > > > - retval = scsi_test_unit_ready(sdev, ALUA_FAILOVER_TIMEOUT * HZ, > > + retval = scsi_test_unit_ready(sdev, READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > > ALUA_FAILOVER_RETRIES, &sense_hdr); > > if ((sense_hdr.sense_key == NOT_READY || > > sense_hdr.sense_key == UNIT_ATTENTION) && > > @@ -552,7 +552,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) > > valid_states_old = pg->valid_states; > > > > if (!pg->expiry) { > > - unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ; > > + unsigned long transition_tmo = min(READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ, > > + (unsigned long)U8_MAX * HZ); > > > > if (pg->transition_tmo) > > transition_tmo = pg->transition_tmo * HZ; > > @@ -664,7 +665,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) > > if ((buff[4] & RTPG_FMT_MASK) == RTPG_FMT_EXT_HDR && buff[5] != 0) > > pg->transition_tmo = buff[5]; > > else > > - pg->transition_tmo = ALUA_FAILOVER_TIMEOUT; > > + pg->transition_tmo = min((READ_ONCE(sdev->request_queue->rq_timeout) ?: ALUA_FAILOVER_TIMEOUT * HZ) / HZ, > > + (unsigned long)U8_MAX); > > > > if (orig_transition_tmo != pg->transition_tmo) { > > sdev_printk(KERN_INFO, sdev, > > Weelll ... The transition timeout is _vastly_ different from the device > command timeout. While the latter tends to be rather small (ie in the > seconds range), the former can take _really_ long time. > Ask you competitors, they regularly require tens of _minuntes_ here. > > Having is settable is a good idea, but not to the command timeout. The SCSI path timeout represents a contract between the initiator and the target. Storage vendors provide recommended path timeout values for their arrays as a best practice, and administrators are expected to configure these values accordingly. This timeout value reflects what the target vendor has determined is the appropriate maximum time to wait for any operation on that path, accounting for the target's internal processing, failover capabilities, and expected behavior under various conditions. When the implicit transition timeout honors this same path timeout, it simply extends that existing contract to cover ALUA state transitions. The target vendor whoever recommended a 30-second or 60-second path timeout did so with full knowledge of their array's behavior, including how long implicit transitions might take. If their array requires longer transitions than the recommended path timeout allows, that is a deficiency in their recommendation, not in the use of it as a default. The SCSI SPC specification itself constrains the implicit transition timeout to a single byte in the RTPG extended header, limiting it to 255 seconds. The kernel's data structure reflects this with an unsigned char for transition_tmo. If the specification authors believed implicit transitions could legitimately require tens of minutes, they would have allocated more than one byte for this field. Beyond the spec and data structure constraints, when the implicit transition timer expires the port group state is forced to STANDBY and new commands fail immediately. Supporting tens of minutes is not possible with the current implementation regardless of the default chosen. Using the path timeout as the default creates consistency with the broader timeout contract that already exists between the initiator and the target, rather than introducing a separate arbitrary value that may conflict with the administrator's I/O expectations behavior on that path. > > Cheers, > > Hannes > -- > Dr. Hannes Reinecke Kernel Storage Architect > hare@suse.com +49 911 74053 688 > SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg > HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich Thanks, Brian -- Brian Bunker PURE Storage, Inc. brian@purestorage.com ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-03-27 0:12 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-25 15:15 [PATCH v2 0/1] scsi: scsi_dh_alua: use device timeout instead of constant Brian Bunker 2026-03-25 15:15 ` [PATCH v2] scsi: scsi_dh_alua: use the device timeout rather than a constant Brian Bunker 2026-03-26 10:22 ` Hannes Reinecke 2026-03-27 0:12 ` Brian Bunker
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox