From: Chaitanya Kulkarni <kch@nvidia.com>
To: <skumar47@syr.edu>
Cc: <hch@lst.de>, <sagi@grimberg.me>,
<linux-nvme@lists.infradead.org>, <kbusch@kernel.org>,
Chaitanya Kulkarni <kch@nvidia.com>, <stable@vger.kernel.org>
Subject: [PATCH] nvmet-tcp: fix race between ICReq handling and queue teardown
Date: Wed, 8 Apr 2026 00:51:31 -0700 [thread overview]
Message-ID: <20260408075131.6221-1-kch@nvidia.com> (raw)
nvmet_tcp_handle_icreq() updates queue->state after sending an
Initialization Connection Response (ICResp), but it does so without
serializing against target-side queue teardown.
If an NVMe/TCP host sends an Initialization Connection Request
(ICReq) and immediately closes the connection, target-side teardown
may start in softirq context before io_work drains the already
buffered ICReq. In that case, nvmet_tcp_schedule_release_queue()
sets queue->state to NVMET_TCP_Q_DISCONNECTING and drops the queue
reference under state_lock.
If io_work later processes that ICReq, nvmet_tcp_handle_icreq() can
still overwrite the state back to NVMET_TCP_Q_LIVE. That defeats the
DISCONNECTING-state guard in nvmet_tcp_schedule_release_queue() and
allows a later socket state change to re-enter teardown and issue a
second kref_put() on an already released queue.
The ICResp send failure path has the same problem. If teardown has
already moved the queue to DISCONNECTING, a send error can still
overwrite the state with NVMET_TCP_Q_FAILED, again reopening the
window for a second teardown path to drop the queue reference.
Fix this by serializing both post-send state transitions with
state_lock and bailing out if teardown has already started.
Use -ESHUTDOWN as an internal sentinel for that bail-out path rather
than propagating it as a transport error like -ECONNRESET. Keep
nvmet_tcp_socket_error() setting rcv_state to NVMET_TCP_RECV_ERR before
honoring that sentinel so receive-side parsing stays quiesced until the
existing release path completes.
Reported-by: Shivam Kumar <skumar47@syr.edu>
Fixes: c46a6465bac2 ("nvmet-tcp: add NVMe over TCP target driver")
Cc: stable@vger.kernel.org
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
Hi Shivam,
This patch is different than the one I posted.
Posted patch :-
iov.iov_len = sizeof(*icresp);
ret = kernel_sendmsg(queue->sock, &msg, &iov, 1, iov.iov_len);
if (ret < 0) {
- queue->state = NVMET_TCP_Q_FAILED;
+ spin_lock_bh(&queue->state_lock);
+ if (queue->state != NVMET_TCP_Q_DISCONNECTING)
+ queue->state = NVMET_TCP_Q_FAILED;
+ spin_unlock_bh(&queue->state_lock);
return ret; /* queue removal will cleanup */
}
This patch :-
iov.iov_len = sizeof(*icresp);
ret = kernel_sendmsg(queue->sock, &msg, &iov, 1, iov.iov_len);
if (ret < 0) {
+ spin_lock_bh(&queue->state_lock);
+ if (queue->state == NVMET_TCP_Q_DISCONNECTING) {
+ spin_unlock_bh(&queue->state_lock);
+ return -ESHUTDOWN;
+ }
queue->state = NVMET_TCP_Q_FAILED;
+ spin_unlock_bh(&queue->state_lock);
return ret; /* queue removal will cleanup */
}
It will be great if you can provide tested-by tag on this patch
so we can merge this fix as well.
-ck
---
drivers/nvme/target/tcp.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 69e971b179ae..98b2ce9a70ca 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -407,7 +407,22 @@ static void nvmet_tcp_fatal_error(struct nvmet_tcp_queue *queue)
static void nvmet_tcp_socket_error(struct nvmet_tcp_queue *queue, int status)
{
+ /*
+ * Keep rcv_state at RECV_ERR even for the internal -ESHUTDOWN path.
+ * nvmet_tcp_handle_icreq() can return -ESHUTDOWN after the ICReq has
+ * already been consumed and queue teardown has started.
+ *
+ * If nvmet_tcp_data_ready() or nvmet_tcp_write_space() queues
+ * nvmet_tcp_io_work() again before nvmet_tcp_release_queue_work()
+ * cancels it, the queue must not keep that old receive state.
+ * Otherwise the next nvmet_tcp_io_work() run can reach
+ * nvmet_tcp_done_recv_pdu() and try to handle the same ICReq again.
+ *
+ * That is why queue->rcv_state needs to be updated before we return.
+ */
queue->rcv_state = NVMET_TCP_RECV_ERR;
+ if (status == -ESHUTDOWN)
+ return;
if (status == -EPIPE || status == -ECONNRESET)
kernel_sock_shutdown(queue->sock, SHUT_RDWR);
else
@@ -922,11 +937,24 @@ static int nvmet_tcp_handle_icreq(struct nvmet_tcp_queue *queue)
iov.iov_len = sizeof(*icresp);
ret = kernel_sendmsg(queue->sock, &msg, &iov, 1, iov.iov_len);
if (ret < 0) {
+ spin_lock_bh(&queue->state_lock);
+ if (queue->state == NVMET_TCP_Q_DISCONNECTING) {
+ spin_unlock_bh(&queue->state_lock);
+ return -ESHUTDOWN;
+ }
queue->state = NVMET_TCP_Q_FAILED;
+ spin_unlock_bh(&queue->state_lock);
return ret; /* queue removal will cleanup */
}
+ spin_lock_bh(&queue->state_lock);
+ if (queue->state == NVMET_TCP_Q_DISCONNECTING) {
+ spin_unlock_bh(&queue->state_lock);
+ /* Tell nvmet_tcp_socket_error() teardown is already in progress. */
+ return -ESHUTDOWN;
+ }
queue->state = NVMET_TCP_Q_LIVE;
+ spin_unlock_bh(&queue->state_lock);
nvmet_prepare_receive_pdu(queue);
return 0;
}
--
2.39.5
next reply other threads:[~2026-04-08 7:52 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-08 7:51 Chaitanya Kulkarni [this message]
[not found] ` <BN0PR01MB7007C5B4178BCD8619FC5D91C85B2@BN0PR01MB7007.prod.exchangelabs.com>
2026-04-08 16:50 ` Fw: [PATCH] nvmet-tcp: fix race between ICReq handling and queue teardown Shivam Kumar
2026-04-08 19:22 ` Keith Busch
2026-04-09 6:11 ` Christoph Hellwig
2026-04-09 14:21 ` Keith Busch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260408075131.6221-1-kch@nvidia.com \
--to=kch@nvidia.com \
--cc=hch@lst.de \
--cc=kbusch@kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
--cc=skumar47@syr.edu \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.