From: John Meneghini <jmeneghi@redhat.com>
To: Maurizio Lombardi <mlombard@redhat.com>,
Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org, hch@lst.de, hare@suse.de,
chaitanya.kulkarni@wdc.com
Subject: Re: [PATCH 2/2] nvmet: fix a race condition between release_queue and io_work
Date: Fri, 12 Nov 2021 10:54:42 -0500 [thread overview]
Message-ID: <81bb4d6f-8639-7150-d4fd-e18d42007278@redhat.com> (raw)
In-Reply-To: <20211112105430.GA192791@raketa>
Nice work Maurizio. This should solve some of the problems we are seeing with nvme/tcp shutdown.
Do you think we have a similar problem on the host side, in nvme_tcp_init_connection?
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8cb15ee5b249..adca40c932b7 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1271,8 +1271,12 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
 	memset(&msg, 0, sizeof(msg));
 	iov.iov_base = icresp;
 	iov.iov_len = sizeof(*icresp);
-	ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
-			iov.iov_len, msg.msg_flags);
+
+	do {
+		ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
+				iov.iov_len, msg.msg_flags);
+	} while (ret == 0);
+
 	if (ret < 0)
 		goto free_icresp;
On 11/12/21 05:54, Maurizio Lombardi wrote:
> Hi Sagi,
>
> On Thu, Nov 04, 2021 at 02:59:53PM +0200, Sagi Grimberg wrote:
>>
>> Right, after the call to cancel_work_sync we will know that io_work
>> is not running. Note that it can run as a result of a backend completion
>> but that is ok and we do want to let it run and return completion to the
>> host, but the socket should already be shut down for recv, so we cannot
>> get any other byte from the network.
>
>
> I did some tests and I found out that kernel_recvmsg() sometimes returns
> data even if the socket has already been shut down (maybe it's data that was
> received before the call to kernel_sock_shutdown() and is still waiting in
> the socket's receive buffer?).
>
> So when nvmet_sq_destroy() triggered io_work, recvmsg() still returned data
> and the kernel crashed again even though the socket was closed.
>
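This behavior can be reproduced from userspace with a small illustration (my own sketch, not kernel code, assuming Linux socket semantics): shutting down the read side of a socket does not discard bytes already queued in the receive buffer; recv() first drains them and only then reports end-of-stream with a 0-byte return.

```python
# Userspace analogue of the observation above (illustrative only):
# shutdown(SHUT_RD) does not throw away data already queued for receive.
import socket

a, b = socket.socketpair()       # connected AF_UNIX stream pair
b.sendall(b"pdu")                # lands directly in a's receive buffer
a.shutdown(socket.SHUT_RD)       # userspace analogue of kernel_sock_shutdown()

first = a.recv(16)               # the buffered bytes are still delivered
second = a.recv(16)              # only now does recv() return 0 bytes (EOF)
```

That final 0-byte return is exactly the case the patch below maps to -EAGAIN, so io_work stops requeueing once the buffer is drained.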
> Therefore, I think that after we shut down the socket we should let io_work
> run and requeue itself until it finishes its job and recvmsg() returns no
> more data. One way to achieve this is to repeatedly call flush_work() until
> it returns false.
>
> Right now I am testing the patch below and it works perfectly.
>
> Note that when the socket is closed recvmsg() might return 0;
> nvmet_tcp_try_recv_data() should return -EAGAIN in that case, otherwise we
> end up in an infinite loop (io_work will continuously requeue itself).
>
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 2f03a94725ae..7b441071c6b9 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1139,8 +1139,10 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
>  	while (msg_data_left(&cmd->recv_msg)) {
>  		ret = sock_recvmsg(cmd->queue->sock, &cmd->recv_msg,
>  			cmd->recv_msg.msg_flags);
> -		if (ret <= 0)
> +		if (ret < 0)
>  			return ret;
> +		else if (ret == 0)
> +			return -EAGAIN;
> 
>  		cmd->pdu_recv += ret;
>  		cmd->rbytes_done += ret;
> @@ -1446,8 +1450,10 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
>  	list_del_init(&queue->queue_list);
>  	mutex_unlock(&nvmet_tcp_queue_mutex);
> 
> +	kernel_sock_shutdown(queue->sock, SHUT_RD);
> +
>  	nvmet_tcp_restore_socket_callbacks(queue);
> -	flush_work(&queue->io_work);
> +	while (flush_work(&queue->io_work));
> 
>  	nvmet_tcp_uninit_data_in_cmds(queue);
>  	nvmet_sq_destroy(&queue->nvme_sq);
>
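The `while (flush_work(&queue->io_work));` teardown can be modeled with a short single-threaded sketch (names and structure are mine, purely illustrative; real flush_work() also waits for a concurrently running work item): io_work requeues itself as long as recvmsg() still has data, so a single flush is not enough, and the loop must continue until flush_work() reports that no work was pending.

```python
# Single-threaded model (illustration only) of the "flush until idle" pattern:
# io_work requeues itself while buffered data remains, so release_queue must
# keep flushing until flush_work() returns False.

class QueueModel:
    def __init__(self, buffered_pdus):
        self.buffered = list(buffered_pdus)  # data recvmsg() can still return
        self.work_pending = True             # io_work is currently queued

    def io_work(self):
        """One io_work invocation: consume a PDU, requeue if more remain."""
        self.work_pending = False
        if self.buffered:
            self.buffered.pop(0)             # recvmsg() returned data
            self.work_pending = True         # io_work queues itself again

    def flush_work(self):
        """Like kernel flush_work(): run pending work, report whether any ran."""
        if not self.work_pending:
            return False
        self.io_work()
        return True

q = QueueModel(["pdu0", "pdu1", "pdu2"])
q.flush_work()          # one flush is not enough: io_work requeued itself

while q.flush_work():   # the patch's loop: flush until the queue goes idle
    pass
```

After the loop, no data is buffered and no work is pending, which is the state nvmet_tcp_uninit_data_in_cmds() and nvmet_sq_destroy() require.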
Thread overview: 21+ messages
2021-10-21 8:41 [PATCH 0/2] Fix a race condition when performing a controller reset Maurizio Lombardi
2021-10-21 8:41 ` [PATCH 1/2] nvmet: add an helper to free the iovec Maurizio Lombardi
2021-10-21 14:56 ` John Meneghini
2021-10-21 14:58 ` John Meneghini
2021-10-27 0:15 ` Chaitanya Kulkarni
2021-10-21 8:41 ` [PATCH 2/2] nvmet: fix a race condition between release_queue and io_work Maurizio Lombardi
2021-10-21 14:57 ` John Meneghini
2021-10-26 15:42 ` Sagi Grimberg
2021-10-28 7:55 ` Maurizio Lombardi
2021-11-03 9:28 ` Sagi Grimberg
2021-11-03 11:31 ` Maurizio Lombardi
2021-11-04 12:59 ` Sagi Grimberg
2021-11-12 10:54 ` Maurizio Lombardi
2021-11-12 15:54 ` John Meneghini [this message]
2021-11-15 7:52 ` Maurizio Lombardi
2021-11-14 10:28 ` Sagi Grimberg
2021-11-15 7:47 ` Maurizio Lombardi
2021-11-15 9:48 ` Sagi Grimberg
2021-11-15 10:00 ` Maurizio Lombardi
2021-11-15 10:13 ` Sagi Grimberg
2021-11-15 10:57 ` Maurizio Lombardi