* [PATCH 0/2] nvme-tcp: fix two problems found with blktests nvme/063 @ 2025-06-02 4:35 Shin'ichiro Kawasaki 2025-06-02 4:35 ` [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset Shin'ichiro Kawasaki 2025-06-02 4:35 ` [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails Shin'ichiro Kawasaki 0 siblings, 2 replies; 16+ messages in thread From: Shin'ichiro Kawasaki @ 2025-06-02 4:35 UTC (permalink / raw) To: linux-nvme Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig, Shin'ichiro Kawasaki When the test case nvme/063 is repeated a few times, a WARN and a KASAN slab-use-after-free BUG are observed [1]. This series addresses them. The first patch fixes the WARN, and the second patch fixes the KASAN BUG. [1] https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ Shin'ichiro Kawasaki (2): nvme-tcp: avoid race between nvme scan and reset nvme-tcp: remove tag set when second admin queue config fails drivers/nvme/host/tcp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) -- 2.49.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-02 4:35 [PATCH 0/2] nvme-tcp: fix two problems found with blktests nvme/063 Shin'ichiro Kawasaki @ 2025-06-02 4:35 ` Shin'ichiro Kawasaki 2025-06-02 6:39 ` Hannes Reinecke ` (2 more replies) 2025-06-02 4:35 ` [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails Shin'ichiro Kawasaki 1 sibling, 3 replies; 16+ messages in thread From: Shin'ichiro Kawasaki @ 2025-06-02 4:35 UTC (permalink / raw) To: linux-nvme Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig, Shin'ichiro Kawasaki When the nvme scan work and the nvme controller reset work race, the WARN below happens: WARNING: CPU: 1 PID: 69 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0 The WARN can be recreated by repeating the blktests test case nvme/063 a few times [1]. Its cause is the new queue allocation for the tag set by the scan work between blk_mq_quiesce_tagset() and blk_mq_unquiesce_tagset() calls by the reset work. Reset work Scan work ------------------------------------------------------------------------ nvme_reset_ctrl_work() nvme_tcp_teardown_ctrl() nvme_tcp_teardown_io_queues() nvme_quiesce_io_queues() blk_mq_quiesce_tagset() list_for_each_entry() .. N queues blk_mq_quiesce_queue() nvme_scan_work() nvme_scan_ns_*() nvme_scan_ns() nvme_alloc_ns() blk_mq_alloc_disk() __blk_mq_alloc_disk() blk_mq_alloc_queue() blk_mq_init_allocate_queue() blk_mq_add_queue_tag_set() list_add_tail() .. N+1 queues nvme_tcp_setup_ctrl() nvme_start_ctrl() nvme_unquiesce_io_queues() blk_mq_unquiesce_tagset() list_for_each_entry() .. N+1 queues blk_mq_unquiesce_queue() WARN_ON_ONCE(q->quiesce_depth <= 0) blk_mq_quiesce_queue() is not called for the new queue added by the scan work, while blk_mq_unquiesce_queue() is called for it. Hence the WARN. To suppress the WARN, avoid the race between the reset work and the scan work by flushing the scan work at the beginning of the reset work. Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> --- drivers/nvme/host/tcp.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index f6379aa33d77..14b9d67329b3 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2491,6 +2491,9 @@ static void nvme_reset_ctrl_work(struct work_struct *work) container_of(work, struct nvme_ctrl, reset_work); int ret; + /* prevent racing with ns scanning */ + flush_work(&ctrl->scan_work); + if (nvme_tcp_key_revoke_needed(ctrl)) nvme_auth_revoke_tls_key(ctrl); nvme_stop_ctrl(ctrl); -- 2.49.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
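To make the failure mode above concrete, here is a minimal stand-alone C model of the quiesce_depth accounting that the block layer keeps per request queue. This is an illustration only, not the actual block/blk-mq.c code: the struct and function names are invented for the example, and it just shows why a queue that joins the tag set after blk_mq_quiesce_tagset() trips the WARN when blk_mq_unquiesce_tagset() later walks all queues.

--
/*
 * Stand-alone model of the quiesce_depth accounting guarded by the WARN in
 * blk_mq_unquiesce_queue().  Illustration only; not the block layer code.
 */
#include <stdbool.h>
#include <stdio.h>

struct model_queue {
	int quiesce_depth;	/* outstanding quiesce calls */
	bool quiesced;		/* models QUEUE_FLAG_QUIESCED */
};

static void model_quiesce(struct model_queue *q)
{
	if (!q->quiesce_depth++)
		q->quiesced = true;
}

static void model_unquiesce(struct model_queue *q, const char *name)
{
	if (q->quiesce_depth <= 0) {
		/* This is the condition the kernel WARNs about. */
		printf("%s: unbalanced unquiesce\n", name);
		return;
	}
	if (!--q->quiesce_depth)
		q->quiesced = false;
}

int main(void)
{
	struct model_queue existing = { 0 }, added_by_scan = { 0 };

	model_quiesce(&existing);	/* reset work: quiesce the N known queues */
	/* scan work adds a new namespace queue to the tag set here */
	model_unquiesce(&existing, "existing queue");	/* balanced, silent */
	model_unquiesce(&added_by_scan, "new ns queue");	/* would trip the WARN */
	return 0;
}
--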
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-02 4:35 ` [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset Shin'ichiro Kawasaki @ 2025-06-02 6:39 ` Hannes Reinecke 2025-06-03 0:48 ` Chaitanya Kulkarni 2025-06-03 10:58 ` Sagi Grimberg 2 siblings, 0 replies; 16+ messages in thread From: Hannes Reinecke @ 2025-06-02 6:39 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig On 6/2/25 06:35, Shin'ichiro Kawasaki wrote: > When the nvme scan work and the nvme controller reset work race, the > WARN below happens: > > WARNING: CPU: 1 PID: 69 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0 > > The WARN can be recreated by repeating the blktests test case nvme/063 a > few times [1]. Its cause is the new queue allocation for the tag set > by the scan work between blk_mq_quiesce_tagset() and > blk_mq_unquiesce_tagset() calls by the reset work. > > Reset work Scan work > ------------------------------------------------------------------------ > nvme_reset_ctrl_work() > nvme_tcp_teardown_ctrl() > nvme_tcp_teardown_io_queues() > nvme_quiesce_io_queues() > blk_mq_quiesce_tagset() > list_for_each_entry() .. N queues > blk_mq_quiesce_queue() > nvme_scan_work() > nvme_scan_ns_*() > nvme_scan_ns() > nvme_alloc_ns() > blk_mq_alloc_disk() > __blk_mq_alloc_disk() > blk_mq_alloc_queue() > blk_mq_init_allocate_queue() > blk_mq_add_queue_tag_set() > list_add_tail() .. N+1 queues > nvme_tcp_setup_ctrl() > nvme_start_ctrl() > nvme_unquiesce_io_queues() > blk_mq_unquiesce_tagset() > list_for_each_entry() .. N+1 queues > blk_mq_unquiesce_queue() > WARN_ON_ONCE(q->quiesce_depth <= 0) > > blk_mq_quiesce_queue() is not called for the new queue added by the scan > work, while blk_mq_unquiesce_queue() is called for it. Hence the WARN. > > To suppress the WARN, avoid the race between the reset work and the scan > work by flushing the scan work at the beginning of the reset work. > > Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> > --- > drivers/nvme/host/tcp.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c > index f6379aa33d77..14b9d67329b3 100644 > --- a/drivers/nvme/host/tcp.c > +++ b/drivers/nvme/host/tcp.c > @@ -2491,6 +2491,9 @@ static void nvme_reset_ctrl_work(struct work_struct *work) > container_of(work, struct nvme_ctrl, reset_work); > int ret; > > + /* prevent racing with ns scanning */ > + flush_work(&ctrl->scan_work); > + > if (nvme_tcp_key_revoke_needed(ctrl)) > nvme_auth_revoke_tls_key(ctrl); > nvme_stop_ctrl(ctrl); Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-02 4:35 ` [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset Shin'ichiro Kawasaki 2025-06-02 6:39 ` Hannes Reinecke @ 2025-06-03 0:48 ` Chaitanya Kulkarni 2025-06-03 10:58 ` Sagi Grimberg 2 siblings, 0 replies; 16+ messages in thread From: Chaitanya Kulkarni @ 2025-06-03 0:48 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme@lists.infradead.org Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig On 6/1/25 21:35, Shin'ichiro Kawasaki wrote: > When the nvme scan work and the nvme controller reset work race, the > WARN below happens: > > WARNING: CPU: 1 PID: 69 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0 > > The WARN can be recreated by repeating the blktests test case nvme/063 a > few times [1]. Its cause is the new queue allocation for the tag set > by the scan work between blk_mq_quiesce_tagset() and > blk_mq_unquiesce_tagset() calls by the reset work. > > Reset work Scan work > ------------------------------------------------------------------------ > nvme_reset_ctrl_work() > nvme_tcp_teardown_ctrl() > nvme_tcp_teardown_io_queues() > nvme_quiesce_io_queues() > blk_mq_quiesce_tagset() > list_for_each_entry() .. N queues > blk_mq_quiesce_queue() > nvme_scan_work() > nvme_scan_ns_*() > nvme_scan_ns() > nvme_alloc_ns() > blk_mq_alloc_disk() > __blk_mq_alloc_disk() > blk_mq_alloc_queue() > blk_mq_init_allocate_queue() > blk_mq_add_queue_tag_set() > list_add_tail() .. N+1 queues > nvme_tcp_setup_ctrl() > nvme_start_ctrl() > nvme_unquiesce_io_queues() > blk_mq_unquiesce_tagset() > list_for_each_entry() .. N+1 queues > blk_mq_unquiesce_queue() > WARN_ON_ONCE(q->quiesce_depth <= 0) > > blk_mq_quiesce_queue() is not called for the new queue added by the scan > work, while blk_mq_unquiesce_queue() is called for it. Hence the WARN. > > To suppress the WARN, avoid the race between the reset work and the scan > work by flushing the scan work at the beginning of the reset work. > > Link:https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] > Signed-off-by: Shin'ichiro Kawasaki<shinichiro.kawasaki@wdc.com> Looks good. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> -ck ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-02 4:35 ` [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset Shin'ichiro Kawasaki 2025-06-02 6:39 ` Hannes Reinecke 2025-06-03 0:48 ` Chaitanya Kulkarni @ 2025-06-03 10:58 ` Sagi Grimberg 2025-06-04 7:10 ` Sagi Grimberg 2 siblings, 1 reply; 16+ messages in thread From: Sagi Grimberg @ 2025-06-03 10:58 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme Cc: Hannes Reinecke, Keith Busch, Jens Axboe, Christoph Hellwig On 02/06/2025 7:35, Shin'ichiro Kawasaki wrote: > When the nvme scan work and the nvme controller reset work race, the > WARN below happens: > > WARNING: CPU: 1 PID: 69 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0 > > The WARN can be recreated by repeating the blktests test case nvme/063 a > few times [1]. Its cause is the new queue allocation for the tag set > by the scan work between blk_mq_quiesce_tagset() and > blk_mq_unquiesce_tagset() calls by the reset work. > > Reset work Scan work > ------------------------------------------------------------------------ > nvme_reset_ctrl_work() > nvme_tcp_teardown_ctrl() > nvme_tcp_teardown_io_queues() > nvme_quiesce_io_queues() > blk_mq_quiesce_tagset() > list_for_each_entry() .. N queues > blk_mq_quiesce_queue() > nvme_scan_work() > nvme_scan_ns_*() > nvme_scan_ns() > nvme_alloc_ns() > blk_mq_alloc_disk() > __blk_mq_alloc_disk() > blk_mq_alloc_queue() > blk_mq_init_allocate_queue() > blk_mq_add_queue_tag_set() > list_add_tail() .. N+1 queues > nvme_tcp_setup_ctrl() > nvme_start_ctrl() > nvme_unquiesce_io_queues() > blk_mq_unquiesce_tagset() > list_for_each_entry() .. N+1 queues > blk_mq_unquiesce_queue() > WARN_ON_ONCE(q->quiesce_depth <= 0) > > blk_mq_quiesce_queue() is not called for the new queue added by the scan > work, while blk_mq_unquiesce_queue() is called for it. Hence the WARN. > > To suppress the WARN, avoid the race between the reset work and the scan > work by flushing the scan work at the beginning of the reset work. > > Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> > --- > drivers/nvme/host/tcp.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c > index f6379aa33d77..14b9d67329b3 100644 > --- a/drivers/nvme/host/tcp.c > +++ b/drivers/nvme/host/tcp.c > @@ -2491,6 +2491,9 @@ static void nvme_reset_ctrl_work(struct work_struct *work) > container_of(work, struct nvme_ctrl, reset_work); > int ret; > > + /* prevent racing with ns scanning */ > + flush_work(&ctrl->scan_work); This is a problem. We are flushing a work that is IO bound, which can take a long time to complete. Up until now, we have deliberately avoided introducing dependency between reset forward progress and scan work IO to completely finish. I would like to keep it this way. BTW, this is not TCP specific. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-03 10:58 ` Sagi Grimberg @ 2025-06-04 7:10 ` Sagi Grimberg 2025-06-04 11:17 ` Shinichiro Kawasaki 0 siblings, 1 reply; 16+ messages in thread From: Sagi Grimberg @ 2025-06-04 7:10 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme, linux-block Cc: Hannes Reinecke, Keith Busch, Jens Axboe, Christoph Hellwig On 03/06/2025 13:58, Sagi Grimberg wrote: > > > On 02/06/2025 7:35, Shin'ichiro Kawasaki wrote: >> When the nvme scan work and the nvme controller reset work race, the >> WARN below happens: >> >> WARNING: CPU: 1 PID: 69 at block/blk-mq.c:330 >> blk_mq_unquiesce_queue+0x8f/0xb0 >> >> The WARN can be recreated by repeating the blktests test case nvme/063 a >> few times [1]. Its cause is the new queue allocation for the tag set >> by the scan work between blk_mq_quiesce_tagset() and >> blk_mq_unquiesce_tagset() calls by the reset work. >> >> Reset work Scan work >> ------------------------------------------------------------------------ >> nvme_reset_ctrl_work() >> nvme_tcp_teardown_ctrl() >> nvme_tcp_teardown_io_queues() >> nvme_quiesce_io_queues() >> blk_mq_quiesce_tagset() >> list_for_each_entry() .. N queues >> blk_mq_quiesce_queue() >> nvme_scan_work() >> nvme_scan_ns_*() >> nvme_scan_ns() >> nvme_alloc_ns() >> blk_mq_alloc_disk() >> __blk_mq_alloc_disk() >> blk_mq_alloc_queue() >> blk_mq_init_allocate_queue() >> blk_mq_add_queue_tag_set() >> list_add_tail() .. N+1 >> queues >> nvme_tcp_setup_ctrl() >> nvme_start_ctrl() >> nvme_unquiesce_io_queues() >> blk_mq_unquiesce_tagset() >> list_for_each_entry() .. N+1 >> queues >> blk_mq_unquiesce_queue() >> WARN_ON_ONCE(q->quiesce_depth <= 0) >> >> blk_mq_quiesce_queue() is not called for the new queue added by the scan >> work, while blk_mq_unquiesce_queue() is called for it. Hence the WARN. >> >> To suppress the WARN, avoid the race between the reset work and the scan >> work by flushing the scan work at the beginning of the reset work. >> >> Link: >> https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ >> [1] >> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> >> --- >> drivers/nvme/host/tcp.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c >> index f6379aa33d77..14b9d67329b3 100644 >> --- a/drivers/nvme/host/tcp.c >> +++ b/drivers/nvme/host/tcp.c >> @@ -2491,6 +2491,9 @@ static void nvme_reset_ctrl_work(struct >> work_struct *work) >> container_of(work, struct nvme_ctrl, reset_work); >> int ret; >> + /* prevent racing with ns scanning */ >> + flush_work(&ctrl->scan_work); > > This is a problem. We are flushing a work that is IO bound, which can > take a long time to complete. > Up until now, we have deliberately avoided introducing dependency > between reset forward progress > and scan work IO to completely finish. > > I would like to keep it this way. > > BTW, this is not TCP specific. blk_mq_unquiesce_queue is still very much safe to call as many times as we want. The only thing that comes in the way is this pesky WARN. How about we make it go away and have a debug print instead? My preference would be to allow nvme to unquiesce queues that were not previously quiesced (just like it historically was) instead of having to block a controller reset until the scan_work is completed (which is admin I/O dependent, and may get stuck until admin timeout, which can be changed by the user for 60 minutes or something arbitrarily long btw). 
How about something like this patch instead: -- diff --git a/block/blk-mq.c b/block/blk-mq.c index c2697db59109..74f3ad16e812 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q) bool run_queue = false; spin_lock_irqsave(&q->queue_lock, flags); - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) { - ; + if (q->quiesce_depth <= 0) { + printk(KERN_DEBUG + "dev %s: unquiescing a non-quiesced queue, expected?\n", + q->disk ? q->disk->disk_name : "?"); } else if (!--q->quiesce_depth) { blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q); run_queue = true; -- ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-04 7:10 ` Sagi Grimberg @ 2025-06-04 11:17 ` Shinichiro Kawasaki 2025-06-13 4:10 ` Shinichiro Kawasaki 2025-06-13 4:56 ` Ming Lei 0 siblings, 2 replies; 16+ messages in thread From: Shinichiro Kawasaki @ 2025-06-04 11:17 UTC (permalink / raw) To: Sagi Grimberg Cc: linux-nvme@lists.infradead.org, linux-block, Hannes Reinecke, Keith Busch, Jens Axboe, hch@infradead.org, Ming Lei Cc+: Ming, Hi Sagi, thanks for the background explanation and the suggestion. On Jun 04, 2025 / 10:10, Sagi Grimberg wrote: ... > > This is a problem. We are flushing a work that is IO bound, which can > > take a long time to complete. > > Up until now, we have deliberately avoided introducing dependency > > between reset forward progress > > and scan work IO to completely finish. > > > > I would like to keep it this way. > > > > BTW, this is not TCP specific. I see. The blktests test case nvme/063 is dedicated to tcp transport, so that's why it was reported for the TCP transport. > > blk_mq_unquiesce_queue is still very much safe to call as many times as we > want. > The only thing that comes in the way is this pesky WARN. How about we make > it go away and have > a debug print instead? > > My preference would be to allow nvme to unquiesce queues that were not > previously quiesced (just > like it historically was) instead of having to block a controller reset > until the scan_work is completed (which > is admin I/O dependent, and may get stuck until admin timeout, which can be > changed by the user for 60 > minutes or something arbitrarily long btw). > > How about something like this patch instead: > -- > diff --git a/block/blk-mq.c b/block/blk-mq.c > index c2697db59109..74f3ad16e812 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q) > bool run_queue = false; > > spin_lock_irqsave(&q->queue_lock, flags); > - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) { > - ; > + if (q->quiesce_depth <= 0) { > + printk(KERN_DEBUG > + "dev %s: unquiescing a non-quiesced queue, > expected?\n", > + q->disk ? q->disk->disk_name : "?", ); > } else if (!--q->quiesce_depth) { > blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q); > run_queue = true; > -- The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I ask your comment on the suggestion by Sagi? In case the WARN will be left as it is, blktests can ignore it by adding the line below to the test case: DMESG_FILTER="grep --invert-match blk_mq_unquiesce_queue" Said that, I think Sagi's solution will be cleaner. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-04 11:17 ` Shinichiro Kawasaki @ 2025-06-13 4:10 ` Shinichiro Kawasaki 2025-06-13 4:56 ` Ming Lei 1 sibling, 0 replies; 16+ messages in thread From: Shinichiro Kawasaki @ 2025-06-13 4:10 UTC (permalink / raw) To: Sagi Grimberg Cc: linux-nvme@lists.infradead.org, linux-block, Hannes Reinecke, Keith Busch, Jens Axboe, hch@infradead.org, Ming Lei On Jun 04, 2025 / 11:17, Shinichiro Kawasaki wrote: ... > On Jun 04, 2025 / 10:10, Sagi Grimberg wrote: > ... > > My preference would be to allow nvme to unquiesce queues that were not > > previously quiesced (just > > like it historically was) instead of having to block a controller reset > > until the scan_work is completed (which > > is admin I/O dependent, and may get stuck until admin timeout, which can be > > changed by the user for 60 > > minutes or something arbitrarily long btw). > > > > How about something like this patch instead: > > -- > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index c2697db59109..74f3ad16e812 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q) > > bool run_queue = false; > > > > spin_lock_irqsave(&q->queue_lock, flags); > > - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) { > > - ; > > + if (q->quiesce_depth <= 0) { > > + printk(KERN_DEBUG > > + "dev %s: unquiescing a non-quiesced queue, > > expected?\n", > > + q->disk ? q->disk->disk_name : "?", ); > > } else if (!--q->quiesce_depth) { > > blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q); > > run_queue = true; > > -- > > The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support > concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I > ask your comment on the suggestion by Sagi? > > In case the WARN will be left as it is, blktests can ignore it by adding the > line below to the test case: > > DMESG_FILTER="grep --invert-match blk_mq_unquiesce_queue" > > Said that, I think Sagi's solution will be cleaner. FYI, I tried to recreate the WARN blk_mq_unquiesce_queue() using the kernel v6.16-rc1, but it was not recreated. AFAIK, the kernel changes between v6.15 and v6.16-rc1 do not address the WARN, so I'm guessing the WARN just disappeared because of timing changes. Anyway, I suggest to put low priority for this problem. Sagi, Hannes, thanks for your actions. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-04 11:17 ` Shinichiro Kawasaki 2025-06-13 4:10 ` Shinichiro Kawasaki @ 2025-06-13 4:56 ` Ming Lei 2025-06-26 4:45 ` Shinichiro Kawasaki 2025-06-28 10:24 ` Sagi Grimberg 1 sibling, 2 replies; 16+ messages in thread From: Ming Lei @ 2025-06-13 4:56 UTC (permalink / raw) To: Shinichiro Kawasaki Cc: Sagi Grimberg, linux-nvme@lists.infradead.org, linux-block, Hannes Reinecke, Keith Busch, Jens Axboe, hch@infradead.org On Wed, Jun 04, 2025 at 11:17:05AM +0000, Shinichiro Kawasaki wrote: > Cc+: Ming, > > Hi Sagi, thanks for the background explanation and the suggestion. > > On Jun 04, 2025 / 10:10, Sagi Grimberg wrote: > ... > > > This is a problem. We are flushing a work that is IO bound, which can > > > take a long time to complete. > > > Up until now, we have deliberately avoided introducing dependency > > > between reset forward progress > > > and scan work IO to completely finish. > > > > > > I would like to keep it this way. > > > > > > BTW, this is not TCP specific. > > I see. The blktests test case nvme/063 is dedicated to tcp transport, so that's > why it was reported for the TCP transport. > > > > > blk_mq_unquiesce_queue is still very much safe to call as many times as we > > want. > > The only thing that comes in the way is this pesky WARN. How about we make > > it go away and have > > a debug print instead? > > > > My preference would be to allow nvme to unquiesce queues that were not > > previously quiesced (just > > like it historically was) instead of having to block a controller reset > > until the scan_work is completed (which > > is admin I/O dependent, and may get stuck until admin timeout, which can be > > changed by the user for 60 > > minutes or something arbitrarily long btw). > > > > How about something like this patch instead: > > -- > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index c2697db59109..74f3ad16e812 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q) > > bool run_queue = false; > > > > spin_lock_irqsave(&q->queue_lock, flags); > > - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) { > > - ; > > + if (q->quiesce_depth <= 0) { > > + printk(KERN_DEBUG > > + "dev %s: unquiescing a non-quiesced queue, > > expected?\n", > > + q->disk ? q->disk->disk_name : "?", ); > > } else if (!--q->quiesce_depth) { > > blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q); > > run_queue = true; > > -- > > The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support > concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I > ask your comment on the suggestion by Sagi? I think it is bad to use one standard block layer API in this unbalanced way, that is why WARN_ON_ONCE() is added. We shouldn't encourage driver to use APIs in this way. Question is why nvme have to unquiesce one non-quiesced queue? Thanks, Ming ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-13 4:56 ` Ming Lei @ 2025-06-26 4:45 ` Shinichiro Kawasaki 2025-06-28 10:24 ` Sagi Grimberg 1 sibling, 0 replies; 16+ messages in thread From: Shinichiro Kawasaki @ 2025-06-26 4:45 UTC (permalink / raw) To: Ming Lei Cc: Sagi Grimberg, linux-nvme@lists.infradead.org, linux-block, Hannes Reinecke, Keith Busch, Jens Axboe, hch@infradead.org On Jun 13, 2025 / 12:56, Ming Lei wrote: > On Wed, Jun 04, 2025 at 11:17:05AM +0000, Shinichiro Kawasaki wrote: > > Cc+: Ming, > > > > Hi Sagi, thanks for the background explanation and the suggestion. > > > > On Jun 04, 2025 / 10:10, Sagi Grimberg wrote: > > ... > > > > This is a problem. We are flushing a work that is IO bound, which can > > > > take a long time to complete. > > > > Up until now, we have deliberately avoided introducing dependency > > > > between reset forward progress > > > > and scan work IO to completely finish. > > > > > > > > I would like to keep it this way. > > > > > > > > BTW, this is not TCP specific. > > > > I see. The blktests test case nvme/063 is dedicated to tcp transport, so that's > > why it was reported for the TCP transport. > > > > > > > > blk_mq_unquiesce_queue is still very much safe to call as many times as we > > > want. > > > The only thing that comes in the way is this pesky WARN. How about we make > > > it go away and have > > > a debug print instead? > > > > > > My preference would be to allow nvme to unquiesce queues that were not > > > previously quiesced (just > > > like it historically was) instead of having to block a controller reset > > > until the scan_work is completed (which > > > is admin I/O dependent, and may get stuck until admin timeout, which can be > > > changed by the user for 60 > > > minutes or something arbitrarily long btw). > > > > > > How about something like this patch instead: > > > -- > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > > index c2697db59109..74f3ad16e812 100644 > > > --- a/block/blk-mq.c > > > +++ b/block/blk-mq.c > > > @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q) > > > bool run_queue = false; > > > > > > spin_lock_irqsave(&q->queue_lock, flags); > > > - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) { > > > - ; > > > + if (q->quiesce_depth <= 0) { > > > + printk(KERN_DEBUG > > > + "dev %s: unquiescing a non-quiesced queue, > > > expected?\n", > > > + q->disk ? q->disk->disk_name : "?", ); > > > } else if (!--q->quiesce_depth) { > > > blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q); > > > run_queue = true; > > > -- > > > > The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support > > concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I > > ask your comment on the suggestion by Sagi? > > I think it is bad to use one standard block layer API in this unbalanced way, > that is why WARN_ON_ONCE() is added. We shouldn't encourage driver to use > APIs in this way. > > Question is why nvme have to unquiesce one non-quiesced queue? Ming, thanks for the comment. I think the call trace in the commit message of my patch will answer your question. blk_mq_add_queue_tag_set() is called between blk_mq_quiesce_tagset() and blk_mq_unquiesce_tagset() calls. Let me show the call trace again, as below. Anyway, the WARN disappeared with v6.16-rc1 kernel, so this problem does not have high priority at this moment from my point of view. 
Reset work Scan work ------------------------------------------------------------------------ nvme_reset_ctrl_work() nvme_tcp_teardown_ctrl() nvme_tcp_teardown_io_queues() nvme_quiesce_io_queues() blk_mq_quiesce_tagset() list_for_each_entry() .. N queues blk_mq_quiesce_queue() nvme_scan_work() nvme_scan_ns_*() nvme_scan_ns() nvme_alloc_ns() blk_mq_alloc_disk() __blk_mq_alloc_disk() blk_mq_alloc_queue() blk_mq_init_allocate_queue() blk_mq_add_queue_tag_set() list_add_tail() .. N+1 queues nvme_tcp_setup_ctrl() nvme_start_ctrl() nvme_unquiesce_io_queues() blk_mq_unquiesce_tagset() list_for_each_entry() .. N+1 queues blk_mq_unquiesce_queue() WARN_ON_ONCE(q->quiesce_depth <= 0) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset 2025-06-13 4:56 ` Ming Lei 2025-06-26 4:45 ` Shinichiro Kawasaki @ 2025-06-28 10:24 ` Sagi Grimberg 1 sibling, 0 replies; 16+ messages in thread From: Sagi Grimberg @ 2025-06-28 10:24 UTC (permalink / raw) To: Ming Lei, Shinichiro Kawasaki Cc: linux-nvme@lists.infradead.org, linux-block, Hannes Reinecke, Keith Busch, Jens Axboe, hch@infradead.org On 13/06/2025 7:56, Ming Lei wrote: > On Wed, Jun 04, 2025 at 11:17:05AM +0000, Shinichiro Kawasaki wrote: >> Cc+: Ming, >> >> Hi Sagi, thanks for the background explanation and the suggestion. >> >> On Jun 04, 2025 / 10:10, Sagi Grimberg wrote: >> ... >>>> This is a problem. We are flushing a work that is IO bound, which can >>>> take a long time to complete. >>>> Up until now, we have deliberately avoided introducing dependency >>>> between reset forward progress >>>> and scan work IO to completely finish. >>>> >>>> I would like to keep it this way. >>>> >>>> BTW, this is not TCP specific. >> I see. The blktests test case nvme/063 is dedicated to tcp transport, so that's >> why it was reported for the TCP transport. >> >>> blk_mq_unquiesce_queue is still very much safe to call as many times as we >>> want. >>> The only thing that comes in the way is this pesky WARN. How about we make >>> it go away and have >>> a debug print instead? >>> >>> My preference would be to allow nvme to unquiesce queues that were not >>> previously quiesced (just >>> like it historically was) instead of having to block a controller reset >>> until the scan_work is completed (which >>> is admin I/O dependent, and may get stuck until admin timeout, which can be >>> changed by the user for 60 >>> minutes or something arbitrarily long btw). >>> >>> How about something like this patch instead: >>> -- >>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>> index c2697db59109..74f3ad16e812 100644 >>> --- a/block/blk-mq.c >>> +++ b/block/blk-mq.c >>> @@ -327,8 +327,10 @@ void blk_mq_unquiesce_queue(struct request_queue *q) >>> bool run_queue = false; >>> >>> spin_lock_irqsave(&q->queue_lock, flags); >>> - if (WARN_ON_ONCE(q->quiesce_depth <= 0)) { >>> - ; >>> + if (q->quiesce_depth <= 0) { >>> + printk(KERN_DEBUG >>> + "dev %s: unquiescing a non-quiesced queue, >>> expected?\n", >>> + q->disk ? q->disk->disk_name : "?", ); >>> } else if (!--q->quiesce_depth) { >>> blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q); >>> run_queue = true; >>> -- >> The WARN was introduced with the commit e70feb8b3e68 ("blk-mq: support >> concurrent queue quiesce/unquiesce") that Ming authored. Ming, may I >> ask your comment on the suggestion by Sagi? > I think it is bad to use one standard block layer API in this unbalanced way, > that is why WARN_ON_ONCE() is added. We shouldn't encourage driver to use > APIs in this way. > > Question is why nvme have to unquiesce one non-quiesced queue? It started before quiesce/unquiesce became an API that had to be balanced. In this case, we are using the tagset quiesce/unquiesce interface. Which iterates over the request queues from the tagset. The problem is that due to namespace scanning, we may add new namespaces (and their request queues) after we quiesced the tagset. Then when we call tagset unquiesce, we have a new request queue that wasn't quiesced (added after the quiesce). I don't mind having a BLK_FEAT_ALLOW_UNBALANCED_QUIESCE for the nvme request queues if you don't want to blindly remove the WARN_ON. ^ permalink raw reply [flat|nested] 16+ messages in thread
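To make the BLK_FEAT_ALLOW_UNBALANCED_QUIESCE idea above a little more concrete, a rough sketch follows. Only the flag name comes from Sagi's message; the bit value, the use of q->limits.features, and the exact placement of the check are assumptions for illustration rather than an existing kernel interface, and the tail of the function (unlock and queue run) is filled in as an assumption beyond the hunk quoted earlier in the thread.

--
/* Hypothetical feature bit: name from the discussion above, value arbitrary. */
#define BLK_FEAT_ALLOW_UNBALANCED_QUIESCE	((__force blk_features_t)(1u << 16))

void blk_mq_unquiesce_queue(struct request_queue *q)
{
	unsigned long flags;
	bool run_queue = false;

	spin_lock_irqsave(&q->queue_lock, flags);
	if (q->quiesce_depth <= 0) {
		/* Tolerate queues added to the tag set after the quiesce. */
		WARN_ON_ONCE(!(q->limits.features &
			       BLK_FEAT_ALLOW_UNBALANCED_QUIESCE));
	} else if (!--q->quiesce_depth) {
		blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
		run_queue = true;
	}
	spin_unlock_irqrestore(&q->queue_lock, flags);

	/* Dispatch requests that were inserted while the queue was quiesced. */
	if (run_queue)
		blk_mq_run_hw_queues(q, true);
}
--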
* [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails 2025-06-02 4:35 [PATCH 0/2] nvme-tcp: fix two problems found with blktests nvme/063 Shin'ichiro Kawasaki 2025-06-02 4:35 ` [PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset Shin'ichiro Kawasaki @ 2025-06-02 4:35 ` Shin'ichiro Kawasaki 2025-06-02 6:39 ` Hannes Reinecke ` (3 more replies) 1 sibling, 4 replies; 16+ messages in thread From: Shin'ichiro Kawasaki @ 2025-06-02 4:35 UTC (permalink / raw) To: linux-nvme Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig, Shin'ichiro Kawasaki Commit 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure concatenation") modified nvme_tcp_setup_ctrl() to call nvme_tcp_configure_admin_queue() twice. The first call prepares for DH-CHAP negotiation, and the second call is required for secure concatenation. However, this change triggered a KASAN slab-use-after-free BUG in blk_mq_queue_tag_busy_iter(). This BUG can be recreated by repeating the blktests test case nvme/063 a few times [1]. When the BUG happens, nvme_tcp_create_ctrl() fails in the call chain below: nvme_tcp_create_ctrl() nvme_tcp_alloc_ctrl() new=true ... Alloc nvme_tcp_ctrl and admin_tag_set nvme_tcp_setup_ctrl() new=true nvme_tcp_configure_admin_queue() new=true ... Succeed nvme_alloc_admin_tag_set() ... Alloc the tag set for admin_tag_set nvme_stop_keep_alive() nvme_tcp_teardown_admin_queue() remove=false nvme_tcp_configure_admin_queue() new=false nvme_tcp_alloc_admin_queue() ... Fail, but do not call nvme_remove_admin_tag_set() nvme_uninit_ctrl() nvme_put_ctrl() ... Free up the nvme_tcp_ctrl and admin_tag_set The first call of nvme_tcp_configure_admin_queue() succeeds with the new=true argument. The second call fails with the new=false argument. This second call does not call nvme_remove_admin_tag_set() on failure, due to the new=false argument. Then the admin tag set is not removed. However, nvme_tcp_create_ctrl() assumes that nvme_tcp_setup_ctrl() would call nvme_remove_admin_tag_set(). Then it frees up struct nvme_tcp_ctrl, which has the admin_tag_set field. Later on, the timeout handler accesses the admin_tag_set field and causes the KASAN slab-use-after-free BUG. To avoid leaving the admin tag set behind, call nvme_remove_admin_tag_set() when the second nvme_tcp_configure_admin_queue() call fails. Do not return from nvme_tcp_setup_ctrl() on failure. Instead, jump to the "destroy_admin" goto label to call nvme_tcp_teardown_admin_queue(), which calls nvme_remove_admin_tag_set(). Fixes: 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure concatenation") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> --- drivers/nvme/host/tcp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 14b9d67329b3..85acb4c9c977 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2350,7 +2350,7 @@ static int nvme_tcp_setup_ctrl(struct nvme_ctrl *ctrl, bool new) nvme_tcp_teardown_admin_queue(ctrl, false); ret = nvme_tcp_configure_admin_queue(ctrl, false); if (ret) - return ret; + goto destroy_admin; } if (ctrl->icdoff) { -- 2.49.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
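As a side note for readers following the unwinding logic: below is a small, self-contained C demonstration of the goto-based cleanup that the one-line fix restores. All fake_* names are invented stand-ins, not nvme symbols; only the control flow mirrors the nvme_tcp_setup_ctrl() paths described in the commit message above.

--
#include <stdbool.h>
#include <stdio.h>

struct fake_ctrl {
	bool needs_second_pass;		/* models the secure concatenation case */
	bool tag_set_allocated;		/* models the admin tag set */
};

static int fake_configure_admin_queue(struct fake_ctrl *c, bool new, bool fail)
{
	if (new)
		c->tag_set_allocated = true;	/* first pass allocates the tag set */
	return fail ? -1 : 0;
}

static void fake_teardown_admin_queue(struct fake_ctrl *c, bool remove)
{
	if (remove)
		c->tag_set_allocated = false;	/* removes the tag set */
}

static int fake_setup_ctrl(struct fake_ctrl *c, bool new)
{
	int ret;

	ret = fake_configure_admin_queue(c, new, false);
	if (ret)
		return ret;			/* nothing left behind yet */

	if (c->needs_second_pass) {
		fake_teardown_admin_queue(c, false);
		ret = fake_configure_admin_queue(c, false, true);	/* fails */
		if (ret)
			goto destroy_admin;	/* returning here would leak the tag set */
	}
	return 0;

destroy_admin:
	fake_teardown_admin_queue(c, new);	/* unwinds the first pass */
	return ret;
}

int main(void)
{
	struct fake_ctrl c = { .needs_second_pass = true };

	fake_setup_ctrl(&c, true);
	printf("tag set leaked: %s\n", c.tag_set_allocated ? "yes" : "no");
	return 0;
}
--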
* Re: [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails 2025-06-02 4:35 ` [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails Shin'ichiro Kawasaki @ 2025-06-02 6:39 ` Hannes Reinecke 2025-06-03 0:49 ` Chaitanya Kulkarni ` (2 subsequent siblings) 3 siblings, 0 replies; 16+ messages in thread From: Hannes Reinecke @ 2025-06-02 6:39 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig On 6/2/25 06:35, Shin'ichiro Kawasaki wrote: > Commit 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure > concatenation") modified nvme_tcp_setup_ctrl() to call > nvme_tcp_configure_admin_queue() twice. The first call prepares for > DH-CHAP negotitation, and the second call is required for secure > concatenation. However, this change triggered BUG KASAN slab-use-after- > free in blk_mq_queue_tag_busy_iter(). This BUG can be recreated by > repeating the blktests test case nvme/063 a few times [1]. > > When the BUG happens, nvme_tcp_create_ctrl() fails in the call chain > below: > > nvme_tcp_create_ctrl() > nvme_tcp_alloc_ctrl() new=true ... Alloc nvme_tcp_ctrl and admin_tag_set > nvme_tcp_setup_ctrl() new=true > nvme_tcp_configure_admin_queue() new=true ... Succeed > nvme_alloc_admin_tag_set() ... Alloc the tag set for admin_tag_set > nvme_stop_keep_alive() > nvme_tcp_teardown_admin_queue() remove=false > nvme_tcp_configure_admin_queue() new=false > nvme_tcp_alloc_admin_queue() ... Fail, but do not call nvme_remove_admin_tag_set() > nvme_uninit_ctrl() > nvme_put_ctrl() ... Free up the nvme_tcp_ctrl and admin_tag_set > > The first call of nvme_tcp_configure_admin_queue() succeeds with > new=true argument. The second call fails with new=false argument. This > second call does not call nvme_remove_admin_tag_set() on failure, due to > the new=false argument. Then the admin tag set is not removed. However, > nvme_tcp_create_ctrl() assumes that nvme_tcp_setup_ctrl() would call > nvme_remove_admin_tag_set(). Then it frees up struct nvme_tcp_ctrl which > has admin_tag_set field. Later on, the timeout handler accesses the > admin_tag_set field and causes the BUG KASAN slab-use-after-free. > > To not leave the admin tag set, call nvme_remove_admin_tag_set() when > the second nvme_tcp_configure_admin_queue() call fails. Do not return > from nvme_tcp_setup_ctrl() on failure. Instead, jump to "destroy_admin" > go-to label to call nvme_tcp_teardown_admin_queue() which calls > nvme_remove_admin_tag_set(). > > Fixes: 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure concatenation") > Cc: stable@vger.kernel.org > Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> > --- > drivers/nvme/host/tcp.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c > index 14b9d67329b3..85acb4c9c977 100644 > --- a/drivers/nvme/host/tcp.c > +++ b/drivers/nvme/host/tcp.c > @@ -2350,7 +2350,7 @@ static int nvme_tcp_setup_ctrl(struct nvme_ctrl *ctrl, bool new) > nvme_tcp_teardown_admin_queue(ctrl, false); > ret = nvme_tcp_configure_admin_queue(ctrl, false); > if (ret) > - return ret; > + goto destroy_admin; > } > > if (ctrl->icdoff) { Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. 
Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails 2025-06-02 4:35 ` [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails Shin'ichiro Kawasaki 2025-06-02 6:39 ` Hannes Reinecke @ 2025-06-03 0:49 ` Chaitanya Kulkarni 2025-06-04 6:50 ` Sagi Grimberg 2025-06-04 7:53 ` Christoph Hellwig 3 siblings, 0 replies; 16+ messages in thread From: Chaitanya Kulkarni @ 2025-06-03 0:49 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme@lists.infradead.org Cc: Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig On 6/1/25 21:35, Shin'ichiro Kawasaki wrote: > Commit 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure > concatenation") modified nvme_tcp_setup_ctrl() to call > nvme_tcp_configure_admin_queue() twice. The first call prepares for > DH-CHAP negotitation, and the second call is required for secure > concatenation. However, this change triggered BUG KASAN slab-use-after- > free in blk_mq_queue_tag_busy_iter(). This BUG can be recreated by > repeating the blktests test case nvme/063 a few times [1]. > > When the BUG happens, nvme_tcp_create_ctrl() fails in the call chain > below: > > nvme_tcp_create_ctrl() > nvme_tcp_alloc_ctrl() new=true ... Alloc nvme_tcp_ctrl and admin_tag_set > nvme_tcp_setup_ctrl() new=true > nvme_tcp_configure_admin_queue() new=true ... Succeed > nvme_alloc_admin_tag_set() ... Alloc the tag set for admin_tag_set > nvme_stop_keep_alive() > nvme_tcp_teardown_admin_queue() remove=false > nvme_tcp_configure_admin_queue() new=false > nvme_tcp_alloc_admin_queue() ... Fail, but do not call nvme_remove_admin_tag_set() > nvme_uninit_ctrl() > nvme_put_ctrl() ... Free up the nvme_tcp_ctrl and admin_tag_set > > The first call of nvme_tcp_configure_admin_queue() succeeds with > new=true argument. The second call fails with new=false argument. This > second call does not call nvme_remove_admin_tag_set() on failure, due to > the new=false argument. Then the admin tag set is not removed. However, > nvme_tcp_create_ctrl() assumes that nvme_tcp_setup_ctrl() would call > nvme_remove_admin_tag_set(). Then it frees up struct nvme_tcp_ctrl which > has admin_tag_set field. Later on, the timeout handler accesses the > admin_tag_set field and causes the BUG KASAN slab-use-after-free. > > To not leave the admin tag set, call nvme_remove_admin_tag_set() when > the second nvme_tcp_configure_admin_queue() call fails. Do not return > from nvme_tcp_setup_ctrl() on failure. Instead, jump to "destroy_admin" > go-to label to call nvme_tcp_teardown_admin_queue() which calls > nvme_remove_admin_tag_set(). > > Fixes: 104d0e2f6222 ("nvme-fabrics: reset admin connection for secure concatenation") > Cc:stable@vger.kernel.org > Link:https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1] > Signed-off-by: Shin'ichiro Kawasaki<shinichiro.kawasaki@wdc.com> Thanks for the really good commit log, it made reviewing both the patches easier. Looks good. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> -ck ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails 2025-06-02 4:35 ` [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails Shin'ichiro Kawasaki 2025-06-02 6:39 ` Hannes Reinecke 2025-06-03 0:49 ` Chaitanya Kulkarni @ 2025-06-04 6:50 ` Sagi Grimberg 2025-06-04 7:53 ` Christoph Hellwig 3 siblings, 0 replies; 16+ messages in thread From: Sagi Grimberg @ 2025-06-04 6:50 UTC (permalink / raw) To: Shin'ichiro Kawasaki, linux-nvme Cc: Hannes Reinecke, Keith Busch, Jens Axboe, Christoph Hellwig Reviewed-by: Sagi Grimberg <sagi@grimberg.me> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails 2025-06-02 4:35 ` [PATCH 2/2] nvme-tcp: remove tag set when second admin queue config fails Shin'ichiro Kawasaki ` (2 preceding siblings ...) 2025-06-04 6:50 ` Sagi Grimberg @ 2025-06-04 7:53 ` Christoph Hellwig 3 siblings, 0 replies; 16+ messages in thread From: Christoph Hellwig @ 2025-06-04 7:53 UTC (permalink / raw) To: Shin'ichiro Kawasaki Cc: linux-nvme, Hannes Reinecke, Sagi Grimberg, Keith Busch, Jens Axboe, Christoph Hellwig Thanks, applied to nvme-6.16. ^ permalink raw reply [flat|nested] 16+ messages in thread