* [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
@ 2026-02-23 23:27 Junxiao Bi
2026-02-27 18:32 ` Mike Christie
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Junxiao Bi @ 2026-02-23 23:27 UTC (permalink / raw)
To: linux-scsi; +Cc: martin.petersen, James.Bottomley, junxiao.bi
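If scsi_realloc_sdev_budget_map() fails in scsi_alloc_sdev(), the
reference held on the host's "tagset_refcnt" is never dropped.
This leak causes a hang when tearing down the SCSI host.
For example, with iSCSI, iscsid hangs with the following
call trace after this kernel log message: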
[130120.652718] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid"
#0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4
#1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f
#2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0
#3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f
#4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b
#5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at ffffffffc03031c4 [iscsi_tcp]
#6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692 [scsi_transport_iscsi]
#7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2 [scsi_transport_iscsi]
#8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6
#9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef
Fixes: 8fe4ce5836e9 ("scsi: core: Fix a use-after-free")
Cc: stable@vger.kernel.org
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
---
drivers/scsi/scsi_scan.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 7acbfcfc2172..c64ef71633d8 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -361,6 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	 * since we use this queue depth most of times.
 	 */
 	if (scsi_realloc_sdev_budget_map(sdev, depth)) {
+		kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
 		put_device(&starget->dev);
 		kfree(sdev);
 		goto out;
--
2.50.1 (Apple Git-155)
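
As a rough user-space illustration of why one leaked reference is enough to produce the hang in the trace above, the sketch below models the situation with plain integers. It is not kernel code: `model_host`, `model_get()`, `model_put()` and `model_teardown_completes()` are hypothetical names, an `int` stands in for the kref, and a `bool` stands in for the completion that scsi_remove_host() is waiting on in the trace. It assumes the host holds one reference of its own and that the failed device allocation holds one more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host: a counter instead of a kref, and a
 * flag instead of the completion that host removal waits on. */
struct model_host {
	int tagset_refcnt;
	bool tagset_freed;
};

static void model_get(struct model_host *h)
{
	h->tagset_refcnt++;
}

/* Drop one reference; only the final put marks the tag set as freed. */
static void model_put(struct model_host *h)
{
	assert(h->tagset_refcnt > 0);
	if (--h->tagset_refcnt == 0)
		h->tagset_freed = true;
}

/*
 * Model one failed device allocation followed by host removal.
 * With fix == false the error path forgets its put (the leak);
 * with fix == true the error path drops the reference it holds.
 * Returns whether host removal would see the tag set freed.
 */
static bool model_teardown_completes(bool fix)
{
	struct model_host h = { .tagset_refcnt = 1, .tagset_freed = false };

	/* Device allocation takes a reference, then fails late. */
	model_get(&h);
	if (fix)
		model_put(&h);

	/* Host removal drops the host's own reference and then waits. */
	model_put(&h);
	return h.tagset_freed;
}
```

In this model, `model_teardown_completes(false)` returns false, which corresponds to a waiter that is never woken, while `model_teardown_completes(true)` returns true.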
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: Mike Christie @ 2026-02-27 18:32 UTC (permalink / raw)
To: Junxiao Bi, linux-scsi; +Cc: martin.petersen, James.Bottomley
On 2/23/26 5:27 PM, Junxiao Bi wrote:
> This leaking will cause hung when tearing down the scsi host.
> This is an example with iscsi, iscsid hung with the following
> call trace after this kernel log.
>
> [130120.652718] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
>
> PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid"
> #0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4
> #1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f
> #2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0
> #3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f
> #4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b
> #5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at ffffffffc03031c4 [iscsi_tcp]
> #6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692 [scsi_transport_iscsi]
> #7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2 [scsi_transport_iscsi]
> #8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6
> #9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef
>
> Fixes: 8fe4ce5836e9 ("scsi: core: Fix a use-after-free")
> Cc: stable@vger.kernel.org
> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
> ---
> drivers/scsi/scsi_scan.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 7acbfcfc2172..c64ef71633d8 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -361,6 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
> * since we use this queue depth most of times.
> */
> if (scsi_realloc_sdev_budget_map(sdev, depth)) {
> + kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
> put_device(&starget->dev);
> kfree(sdev);
> goto out;
Reviewed-by: Mike Christie <michael.christie@oracle.com>
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: Bart Van Assche @ 2026-02-27 18:46 UTC (permalink / raw)
To: Junxiao Bi, linux-scsi; +Cc: martin.petersen, James.Bottomley
On 2/23/26 3:27 PM, Junxiao Bi wrote:
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 7acbfcfc2172..c64ef71633d8 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -361,6 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
> * since we use this queue depth most of times.
> */
> if (scsi_realloc_sdev_budget_map(sdev, depth)) {
> + kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
> put_device(&starget->dev);
> kfree(sdev);
> goto out;
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: Martin K. Petersen @ 2026-02-28 22:28 UTC (permalink / raw)
To: Junxiao Bi; +Cc: linux-scsi, martin.petersen, James.Bottomley
Junxiao,
> This leaking will cause hung when tearing down the scsi host.
> This is an example with iscsi, iscsid hung with the following
> call trace after this kernel log.
Applied to 7.0/scsi-fixes, thanks!
--
Martin K. Petersen
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: Martin K. Petersen @ 2026-03-01 2:11 UTC (permalink / raw)
To: linux-scsi, Junxiao Bi; +Cc: Martin K . Petersen, James.Bottomley
On Mon, 23 Feb 2026 15:27:28 -0800, Junxiao Bi wrote:
> This leaking will cause hung when tearing down the scsi host.
> This is an example with iscsi, iscsid hung with the following
> call trace after this kernel log.
>
> [130120.652718] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
>
> PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid"
> #0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4
> #1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f
> #2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0
> #3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f
> #4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b
> #5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at ffffffffc03031c4 [iscsi_tcp]
> #6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692 [scsi_transport_iscsi]
> #7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2 [scsi_transport_iscsi]
> #8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6
> #9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef
>
> [...]
Applied to 7.0/scsi-fixes, thanks!
[1/1] scsi: fix refcount leaking for "tagset_refcnt"
https://git.kernel.org/mkp/scsi/c/1ac22c8eae81
--
Martin K. Petersen
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: John Garry @ 2026-03-02 10:30 UTC (permalink / raw)
To: Junxiao Bi, linux-scsi; +Cc: martin.petersen, James.Bottomley
On 23/02/2026 23:27, Junxiao Bi wrote:
> This leaking will cause hung when tearing down the scsi host.
> This is an example with iscsi, iscsid hung with the following
> call trace after this kernel log.
>
> [130120.652718] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
>
> PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid"
> #0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4
> #1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f
> #2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0
> #3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f
> #4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b
> #5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at ffffffffc03031c4 [iscsi_tcp]
> #6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692 [scsi_transport_iscsi]
> #7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2 [scsi_transport_iscsi]
> #8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6
> #9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef
>
> Fixes: 8fe4ce5836e9 ("scsi: core: Fix a use-after-free")
> Cc: stable@vger.kernel.org
> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
> ---
> drivers/scsi/scsi_scan.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 7acbfcfc2172..c64ef71633d8 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -361,6 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
> * since we use this queue depth most of times.
> */
> if (scsi_realloc_sdev_budget_map(sdev, depth)) {
At this point scsi_sysfs_device_initialize() has been called. Then if
you check the comment in __scsi_remove_device():
Paired with kref_get() in scsi_sysfs_device_initialize()*
So I wonder why we don't call __scsi_remove_device() instead, which
calls scsi_target_reap().
Indeed, the current error handling in scsi_alloc_sdev() is odd: we only
call __scsi_remove_device() on ->sdev_init() failure, but nothing
happens between scsi_sysfs_device_initialize() and the ->sdev_init()
call, which means that from this point on we should only be calling
__scsi_remove_device().
* I think that should be scsi_sysfs_initialize() and has always been
incorrect
> + kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
> put_device(&starget->dev);
> kfree(sdev);
> goto out;
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: junxiao.bi @ 2026-03-02 20:36 UTC (permalink / raw)
To: John Garry, linux-scsi; +Cc: martin.petersen, James.Bottomley
On 3/2/26 2:30 AM, John Garry wrote:
> On 23/02/2026 23:27, Junxiao Bi wrote:
>> This leaking will cause hung when tearing down the scsi host.
>> This is an example with iscsi, iscsid hung with the following
>> call trace after this kernel log.
>>
>> [130120.652718] scsi_alloc_sdev: Allocation failure during SCSI
>> scanning, some SCSI devices might not be configured
>>
>> PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid"
>> #0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4
>> #1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f
>> #2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0
>> #3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f
>> #4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b
>> #5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at
>> ffffffffc03031c4 [iscsi_tcp]
>> #6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692
>> [scsi_transport_iscsi]
>> #7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2
>> [scsi_transport_iscsi]
>> #8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6
>> #9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef
>>
>> Fixes: 8fe4ce5836e9 ("scsi: core: Fix a use-after-free")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
>> ---
>> drivers/scsi/scsi_scan.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index 7acbfcfc2172..c64ef71633d8 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -361,6 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct
>> scsi_target *starget,
>> * since we use this queue depth most of times.
>> */
>> if (scsi_realloc_sdev_budget_map(sdev, depth)) {
>
> At this point scsi_sysfs_device_initialize() has been called. Then if
> you check the comment in __scsi_remove_device():
>
> Paired with kref_get() in scsi_sysfs_device_initialize()*
>
> So I wonder why we don't call __scsi_remove_device() instead, which
> calls scsi_target_reap().
>
> Indeed, the current error handling in scsi_alloc_sdev() is odd - we
> only call __scsi_remove_device() for ->sdev_init() failure, but
> nothing happens between calling ->sdev_init() and after
> scsi_sysfs_device_initialize() which means that at this point we
> should only now call __scsi_remove_device().
>
> * I think that should be scsi_sysfs_initialize() and has always been
> incorrect
Good catch. Thanks John. I will send a v2 with this:
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 60c06fa4ec32..c2f70de5c093 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -361,9 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	 * since we use this queue depth most of times.
 	 */
 	if (scsi_realloc_sdev_budget_map(sdev, depth)) {
-		put_device(&starget->dev);
-		kfree(sdev);
-		goto out;
+		goto out_device_destroy;
 	}

 	scsi_change_queue_depth(sdev, depth);
Thanks,
Junxiao.
>
>
>> + kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
>> put_device(&starget->dev);
>> kfree(sdev);
>> goto out;
>
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: John Garry @ 2026-03-03 11:06 UTC (permalink / raw)
To: junxiao.bi, linux-scsi; +Cc: martin.petersen, James.Bottomley
On 02/03/2026 20:36, junxiao.bi@oracle.com wrote:
>> At this point scsi_sysfs_device_initialize() has been called. Then if
>> you check the comment in __scsi_remove_device():
>>
>> Paired with kref_get() in scsi_sysfs_device_initialize()*
>>
>> So I wonder why we don't call __scsi_remove_device() instead, which
>> calls scsi_target_reap().
>>
>> Indeed, the current error handling in scsi_alloc_sdev() is odd - we
>> only call __scsi_remove_device() for ->sdev_init() failure, but
>> nothing happens between calling ->sdev_init() and after
>> scsi_sysfs_device_initialize() which means that at this point we
>> should only now call __scsi_remove_device().
>>
>> * I think that should be scsi_sysfs_initialize() and has always been
>> incorrect
>
> Good catch. Thanks John. I will send a v2 with this:
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 60c06fa4ec32..c2f70de5c093 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -361,9 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct
> scsi_target *starget,
> * since we use this queue depth most of times.
> */
> if (scsi_realloc_sdev_budget_map(sdev, depth)) {
> - put_device(&starget->dev);
> - kfree(sdev);
> - goto out;
> + goto out_device_destroy;
> }
>
> scsi_change_queue_depth(sdev, depth);
NP, and sorry for the late review. I think that Martin has already queued
your v1 in his fixes branch...
* Re: [PATCH] scsi: fix refcount leaking for "tagset_refcnt"
From: junxiao.bi @ 2026-03-03 16:58 UTC (permalink / raw)
To: John Garry, linux-scsi; +Cc: martin.petersen, James.Bottomley
On 3/3/26 3:06 AM, John Garry wrote:
> On 02/03/2026 20:36, junxiao.bi@oracle.com wrote:
>>> At this point scsi_sysfs_device_initialize() has been called. Then
>>> if you check the comment in __scsi_remove_device():
>>>
>>> Paired with kref_get() in scsi_sysfs_device_initialize()*
>>>
>>> So I wonder why we don't call __scsi_remove_device() instead, which
>>> calls scsi_target_reap().
>>>
>>> Indeed, the current error handling in scsi_alloc_sdev() is odd - we
>>> only call __scsi_remove_device() for ->sdev_init() failure, but
>>> nothing happens between calling ->sdev_init() and after
>>> scsi_sysfs_device_initialize() which means that at this point we
>>> should only now call __scsi_remove_device().
>>>
>>> * I think that should be scsi_sysfs_initialize() and has always been
>>> incorrect
>>
>> Good catch. Thanks John. I will send a v2 with this:
>>
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index 60c06fa4ec32..c2f70de5c093 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -361,9 +361,7 @@ static struct scsi_device *scsi_alloc_sdev(struct
>> scsi_target *starget,
>> * since we use this queue depth most of times.
>> */
>> if (scsi_realloc_sdev_budget_map(sdev, depth)) {
>> - put_device(&starget->dev);
>> - kfree(sdev);
>> - goto out;
>> + goto out_device_destroy;
>> }
>>
>> scsi_change_queue_depth(sdev, depth);
>
> NP and sorry the late review. I think that Martin has already queued
> your v1 in his fixes branch...
I see. Then I will post a new patch to fix the sysfs reference leak.
Thanks,
Junxiao.