Linux-NVME Archive on lore.kernel.org
From: Hannes Reinecke <hare@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>,
	linux-nvme@lists.infradead.org, Hannes Reinecke <hare@suse.de>,
	Hannes Reinecke <hare@kernel.org>
Subject: [PATCH 6/8] nvme-tcp: reduce callback lock contention
Date: Tue, 16 Jul 2024 09:36:14 +0200
Message-ID: <20240716073616.84417-7-hare@kernel.org>
In-Reply-To: <20240716073616.84417-1-hare@kernel.org>

From: Hannes Reinecke <hare@suse.de>

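We have heavily queued tx and rx flows, so the data_ready and
write_space callbacks might run at the same time. Both callbacks
feed the queue state machine and both take sk_callback_lock, so
contention on that lock can impact I/O performance.

Look up sk_user_data under rcu_read_lock() in these two callbacks
instead of taking sk_callback_lock, and call synchronize_rcu()
after the socket callbacks have been installed or restored.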

Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
 drivers/nvme/host/tcp.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index a758fbb3f9bb..9634c16d7bc0 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1153,28 +1153,28 @@ static void nvme_tcp_data_ready(struct sock *sk)
 
 	trace_sk_data_ready(sk);
 
-	read_lock_bh(&sk->sk_callback_lock);
-	queue = sk->sk_user_data;
+	rcu_read_lock();
+	queue = rcu_dereference_sk_user_data(sk);
 	if (likely(queue && queue->rd_enabled) &&
 	    !test_bit(NVME_TCP_Q_POLLING, &queue->flags)) {
 		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
 		queue->data_ready_cnt++;
 	}
-	read_unlock_bh(&sk->sk_callback_lock);
+	rcu_read_unlock();
 }
 
 static void nvme_tcp_write_space(struct sock *sk)
 {
 	struct nvme_tcp_queue *queue;
 
-	read_lock_bh(&sk->sk_callback_lock);
-	queue = sk->sk_user_data;
+	rcu_read_lock();
+	queue = rcu_dereference_sk_user_data(sk);
 	if (likely(queue && sk_stream_is_writeable(sk))) {
 		clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
 		queue->write_space_cnt++;
 	}
-	read_unlock_bh(&sk->sk_callback_lock);
+	rcu_read_unlock();
 }
 
 static void nvme_tcp_state_change(struct sock *sk)
@@ -2076,6 +2076,7 @@ static void nvme_tcp_restore_sock_ops(struct nvme_tcp_queue *queue)
 	sock->sk->sk_state_change = queue->state_change;
 	sock->sk->sk_write_space  = queue->write_space;
 	write_unlock_bh(&sock->sk->sk_callback_lock);
+	synchronize_rcu();
 }
 
 static void __nvme_tcp_stop_queue(struct nvme_tcp_queue *queue)
@@ -2115,6 +2116,7 @@ static void nvme_tcp_setup_sock_ops(struct nvme_tcp_queue *queue)
 	queue->sock->sk->sk_ll_usec = 1;
 #endif
 	write_unlock_bh(&queue->sock->sk->sk_callback_lock);
+	synchronize_rcu();
 }
 
 static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
-- 
2.35.3
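
For readers less familiar with the pattern, the reader side of the
change above can be sketched in plain userspace C. This is only an
illustration under stated assumptions: demo_queue, demo_user_data,
demo_publish(), demo_data_ready() and demo_run() are hypothetical
names that do not exist in the driver, C11 atomics stand in for
rcu_dereference_sk_user_data(), and the grace period that
synchronize_rcu() provides in the patch is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nvme_tcp_queue; not the kernel type. */
struct demo_queue {
	int rd_enabled;
	int data_ready_cnt;
};

/* Hypothetical stand-in for sk->sk_user_data; starts out as NULL. */
_Atomic(struct demo_queue *) demo_user_data;

/* Writer side: publish the queue pointer, or clear it by passing NULL. */
void demo_publish(struct demo_queue *q)
{
	atomic_store_explicit(&demo_user_data, q, memory_order_release);
}

/*
 * Reader side: load the pointer exactly once and only use the local
 * copy afterwards, so a concurrent clear cannot turn a checked pointer
 * into NULL halfway through. Returns 1 if work would have been queued.
 */
int demo_data_ready(void)
{
	struct demo_queue *q =
		atomic_load_explicit(&demo_user_data, memory_order_acquire);

	if (q && q->rd_enabled) {
		q->data_ready_cnt++;
		return 1;
	}
	return 0;
}

/* Single-threaded walk through: unpublished, published, cleared. */
int demo_run(void)
{
	struct demo_queue q = { .rd_enabled = 1, .data_ready_cnt = 0 };
	int before, during, after;

	before = demo_data_ready();	/* no queue published yet */
	demo_publish(&q);
	during = demo_data_ready();	/* queue visible, callback fires */
	demo_publish(NULL);
	after = demo_data_ready();	/* pointer cleared, callback is a no-op */

	assert(before == 0 && during == 1 && after == 0);
	return q.data_ready_cnt;
}
```

The sketch only shows why the callback has to tolerate a NULL pointer
and work from a single snapshot of it. It says nothing about when the
queue may be freed; in the patch that is what the synchronize_rcu()
calls after write_unlock_bh() are for.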



Thread overview: 24+ messages
2024-07-16  7:36 [PATCHv3 0/8] nvme-tcp: improve scalability Hannes Reinecke
2024-07-16  7:36 ` [PATCH 1/8] nvme-tcp: switch TX deadline to microseconds and make it configurable Hannes Reinecke
2024-07-17 21:03   ` Sagi Grimberg
2024-07-18  6:30     ` Hannes Reinecke
2024-07-16  7:36 ` [PATCH 2/8] nvme-tcp: io_work stall debugging Hannes Reinecke
2024-07-17 21:05   ` Sagi Grimberg
2024-07-16  7:36 ` [PATCH 3/8] nvme-tcp: re-init request list entries Hannes Reinecke
2024-07-17 21:23   ` Sagi Grimberg
2024-07-16  7:36 ` [PATCH 4/8] nvme-tcp: improve stall debugging Hannes Reinecke
2024-07-17 21:11   ` Sagi Grimberg
2024-07-16  7:36 ` [PATCH 5/8] nvme-tcp: debugfs entries for latency statistics Hannes Reinecke
2024-07-17 21:14   ` Sagi Grimberg
2024-07-16  7:36 ` Hannes Reinecke [this message]
2024-07-17 21:19   ` [PATCH 6/8] nvme-tcp: reduce callback lock contention Sagi Grimberg
2024-07-18  6:42     ` Hannes Reinecke
2024-07-21 11:46       ` Sagi Grimberg
2024-07-16  7:36 ` [PATCH 7/8] nvme-tcp: check for SOCK_NOSPACE before sending Hannes Reinecke
2024-07-17 21:19   ` Sagi Grimberg
2024-07-16  7:36 ` [PATCH 8/8] nvme-tcp: align I/O cpu with blk-mq mapping Hannes Reinecke
2024-07-17 21:34   ` Sagi Grimberg
2024-08-13 19:36     ` Sagi Grimberg
2024-07-17 21:01 ` [PATCHv3 0/8] nvme-tcp: improve scalability Sagi Grimberg
2024-07-18  6:20   ` Hannes Reinecke
2024-07-21 12:05     ` Sagi Grimberg
