* [PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
@ 2024-08-01 5:42 Shin'ichiro Kawasaki
2024-08-01 5:49 ` Coelho, Luciano
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Shin'ichiro Kawasaki @ 2024-08-01 5:42 UTC
To: linux-scsi, linux-block
Cc: Christoph Hellwig, Coelho, Luciano, Saarinen, Jani,
Shin'ichiro Kawasaki
Commit 804e498e0496 ("sd: convert to the atomic queue limits API")
introduced pairs of function calls to queue_limits_start_update() and
queue_limits_commit_update(). These two functions lock and unlock
q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called between
the queue_limits_start_update() and queue_limits_commit_update() calls.
sd_read_cpr() locks q->sysfs_dir_lock and q->sysfs_lock. This created new
lock dependencies between q->limits_lock, q->sysfs_dir_lock and
q->sysfs_lock, as follows:

sd_revalidate_disk
  queue_limits_start_update
    mutex_lock(&q->limits_lock)
  sd_read_cpr
    disk_set_independent_access_ranges
      mutex_lock(&q->sysfs_dir_lock)
      mutex_lock(&q->sysfs_lock)
      mutex_unlock(&q->sysfs_lock)
      mutex_unlock(&q->sysfs_dir_lock)
  queue_limits_commit_update
    mutex_unlock(&q->limits_lock)

However, the three locks already had reversed dependencies in other
places. As a result, the new dependencies triggered the lockdep WARN
"possible circular locking dependency detected" [1]. This WARN was
observed by running the blktests test case srp/002.
To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk()
after the queue_limits_commit_update() call. In other words, move the
sd_read_cpr() call out of the q->limits_lock region.
[1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/
Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/scsi/sd.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index adeaa8ab9951..08cbe3815006 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3753,7 +3753,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
 			sd_read_block_limits_ext(sdkp);
 			sd_read_block_characteristics(sdkp, &lim);
 			sd_zbc_read_zones(sdkp, &lim, buffer);
-			sd_read_cpr(sdkp);
 		}
 
 		sd_print_capacity(sdkp, old_capacity);
@@ -3808,6 +3807,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	if (err)
 		return err;
 
+	/*
+	 * Query concurrent positioning ranges after
+	 * queue_limits_commit_update() unlocked q->limits_lock to avoid
+	 * deadlock with q->sysfs_dir_lock and q->sysfs_lock.
+	 */
+	if (sdkp->media_present && scsi_device_supports_vpd(sdp))
+		sd_read_cpr(sdkp);
+
 	/*
 	 * For a zoned drive, revalidating the zones can be done only once
 	 * the gendisk capacity is set. So if this fails, set back the gendisk
--
2.45.2
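
The lock ordering described in the commit message can be modelled in userspace. The following is only a rough sketch, assuming pthread mutexes as stand-ins for q->limits_lock, q->sysfs_dir_lock and q->sysfs_lock. The names read_cpr_model(), limits_lock_is_held(), revalidate_before_patch() and revalidate_after_patch() are hypothetical and do not exist in the kernel. It shows only whether the sysfs locks are taken while the limits lock is held, not the actual sd.c logic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins for the three queue mutexes (illustrative only). */
static pthread_mutex_t limits_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sysfs_dir_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sysfs_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical probe: returns true if limits_lock is currently locked. */
static bool limits_lock_is_held(void)
{
	if (pthread_mutex_trylock(&limits_lock) != 0)
		return true;	/* trylock failed, so the mutex is locked */
	pthread_mutex_unlock(&limits_lock);
	return false;
}

/*
 * Model of sd_read_cpr(): takes the two sysfs locks in the order shown
 * in the call trace and reports whether that happened with limits_lock
 * held, i.e. whether a limits_lock -> sysfs lock dependency was created.
 */
static bool read_cpr_model(void)
{
	bool nested;

	pthread_mutex_lock(&sysfs_dir_lock);
	pthread_mutex_lock(&sysfs_lock);
	nested = limits_lock_is_held();
	pthread_mutex_unlock(&sysfs_lock);
	pthread_mutex_unlock(&sysfs_dir_lock);
	return nested;
}

/* Ordering before the patch: the CPR read sits inside the locked region. */
static bool revalidate_before_patch(void)
{
	bool nested;

	pthread_mutex_lock(&limits_lock);	/* queue_limits_start_update() */
	nested = read_cpr_model();
	pthread_mutex_unlock(&limits_lock);	/* queue_limits_commit_update() */
	return nested;
}

/* Ordering after the patch: the CPR read runs once the lock is released. */
static bool revalidate_after_patch(void)
{
	pthread_mutex_lock(&limits_lock);	/* queue_limits_start_update() */
	pthread_mutex_unlock(&limits_lock);	/* queue_limits_commit_update() */
	return read_cpr_model();
}
```

In this model, revalidate_before_patch() reports that the sysfs locks were nested under the limits lock, while revalidate_after_patch() reports that they were not, which is the dependency the patch removes.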
* Re: [PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
From: Coelho, Luciano @ 2024-08-01 5:49 UTC
To: linux-scsi@vger.kernel.org, shinichiro.kawasaki@wdc.com,
linux-block@vger.kernel.org
Cc: hch@lst.de, Saarinen, Jani, luca@coelho.fi
On Thu, 2024-08-01 at 14:42 +0900, Shin'ichiro Kawasaki wrote:
> Commit 804e498e0496 ("sd: convert to the atomic queue limits API")
> introduced pairs of function calls to queue_limits_start_update() and
> queue_limits_commit_update(). These two functions lock and unlock
> q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called between
> the queue_limits_start_update() and queue_limits_commit_update() calls.
> sd_read_cpr() locks q->sysfs_dir_lock and q->sysfs_lock. This created new
> lock dependencies between q->limits_lock, q->sysfs_dir_lock and
> q->sysfs_lock, as follows:
>
> sd_revalidate_disk
>   queue_limits_start_update
>     mutex_lock(&q->limits_lock)
>   sd_read_cpr
>     disk_set_independent_access_ranges
>       mutex_lock(&q->sysfs_dir_lock)
>       mutex_lock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_dir_lock)
>   queue_limits_commit_update
>     mutex_unlock(&q->limits_lock)
>
> However, the three locks already had reversed dependencies in other
> places. As a result, the new dependencies triggered the lockdep WARN
> "possible circular locking dependency detected" [1]. This WARN was
> observed by running the blktests test case srp/002.
>
> To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk()
> after the queue_limits_commit_update() call. In other words, move the
> sd_read_cpr() call out of the q->limits_lock region.
>
> [1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/
>
> Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
> drivers/scsi/sd.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index adeaa8ab9951..08cbe3815006 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3753,7 +3753,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  			sd_read_block_limits_ext(sdkp);
>  			sd_read_block_characteristics(sdkp, &lim);
>  			sd_zbc_read_zones(sdkp, &lim, buffer);
> -			sd_read_cpr(sdkp);
>  		}
>
>  		sd_print_capacity(sdkp, old_capacity);
> @@ -3808,6 +3807,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  	if (err)
>  		return err;
>
> +	/*
> +	 * Query concurrent positioning ranges after
> +	 * queue_limits_commit_update() unlocked q->limits_lock to avoid
> +	 * deadlock with q->sysfs_dir_lock and q->sysfs_lock.
> +	 */
> +	if (sdkp->media_present && scsi_device_supports_vpd(sdp))
> +		sd_read_cpr(sdkp);
> +
>  	/*
>  	 * For a zoned drive, revalidating the zones can be done only once
>  	 * the gendisk capacity is set. So if this fails, set back the gendisk
This seems to do the trick! At least on our setups we're not seeing the
deadlock issue anymore.
Thanks, Shinichiro!
Tested-by: Luca Coelho <luciano.coelho@intel.com>
--
Cheers,
Luca.
* Re: [PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
From: Damien Le Moal @ 2024-08-01 7:09 UTC
To: Shin'ichiro Kawasaki, linux-scsi, linux-block
Cc: Christoph Hellwig, Coelho, Luciano, Saarinen, Jani
On 8/1/24 2:42 PM, Shin'ichiro Kawasaki wrote:
> Commit 804e498e0496 ("sd: convert to the atomic queue limits API")
> introduced pairs of function calls to queue_limits_start_update() and
> queue_limits_commit_update(). These two functions lock and unlock
> q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called between
> the queue_limits_start_update() and queue_limits_commit_update() calls.
> sd_read_cpr() locks q->sysfs_dir_lock and q->sysfs_lock. This created new
> lock dependencies between q->limits_lock, q->sysfs_dir_lock and
> q->sysfs_lock, as follows:
>
> sd_revalidate_disk
>   queue_limits_start_update
>     mutex_lock(&q->limits_lock)
>   sd_read_cpr
>     disk_set_independent_access_ranges
>       mutex_lock(&q->sysfs_dir_lock)
>       mutex_lock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_dir_lock)
>   queue_limits_commit_update
>     mutex_unlock(&q->limits_lock)
>
> However, the three locks already had reversed dependencies in other
> places. As a result, the new dependencies triggered the lockdep WARN
> "possible circular locking dependency detected" [1]. This WARN was
> observed by running the blktests test case srp/002.
>
> To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk()
> after the queue_limits_commit_update() call. In other words, move the
> sd_read_cpr() call out of the q->limits_lock region.
>
> [1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/
>
> Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Given that sd_read_cpr() does not change any limit, looks good to me.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
* Re: [PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
From: Christoph Hellwig @ 2024-08-01 14:20 UTC
To: Shin'ichiro Kawasaki
Cc: linux-scsi, linux-block, Christoph Hellwig, Coelho, Luciano,
Saarinen, Jani
Looks good, thanks!
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
From: Bart Van Assche @ 2024-08-01 16:07 UTC
To: Shin'ichiro Kawasaki, linux-scsi, linux-block
Cc: Christoph Hellwig, Coelho, Luciano, Saarinen, Jani
On 7/31/24 10:42 PM, Shin'ichiro Kawasaki wrote:
> To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk()
> after the queue_limits_commit_update() call. In other words, move the
> sd_read_cpr() call out of the q->limits_lock region.
Reviewed-by: Bart Van Assche <bvanassche@acm.org>