From: Hannes Reinecke <hare@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Christoph Hellwig <hch@lst.de>, Keith Busch <kbusch@kernel.org>,
linux-nvme@lists.infradead.org, Hannes Reinecke <hare@kernel.org>
Subject: [PATCH 2/4] nvme-tcp: align I/O cpu with blk-mq mapping
Date: Wed, 3 Jul 2024 15:50:19 +0200
Message-ID: <20240703135021.34143-3-hare@kernel.org>
In-Reply-To: <20240703135021.34143-1-hare@kernel.org>
When 'wq_unbound' is selected we should queue the tcp
workqueue item on the first CPU from the blk-mq hctx
mapping of the given queue. With this we can instruct the
workqueue code to keep the I/O affinity and avoid
a performance penalty.
To get the full advantage of this, one should switch the
workqueue to the 'cpu' affinity scope by issuing:
    echo cpu > /sys/devices/virtual/workqueue/nvme_tcp_wq_*/affinity_scope
Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
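Note for reviewers: the lookup done in the 'wq_unbound' branch can be
modelled in userspace roughly as below. This is a minimal sketch, not the
driver code: first_cpu_for_queue(), the 'online' array and CPU_UNBOUND are
hypothetical stand-ins for the kernel's for_each_online_cpu() iteration and
WORK_CPU_UNBOUND, and the assert() plays the role of the WARN_ON() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_UNBOUND (-1) /* hypothetical stand-in for WORK_CPU_UNBOUND */

/*
 * Return the first online CPU whose blk-mq map entry points at hardware
 * queue 'qid', or CPU_UNBOUND if no online CPU maps to that queue.
 */
static int first_cpu_for_queue(const unsigned int *mq_map, const bool *online,
			       size_t nr_cpus, unsigned int qid)
{
	size_t cpu;

	/* Mirrors the WARN_ON(!mq_map) guard in the patch. */
	assert(mq_map != NULL && online != NULL);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/* Skip offline CPUs, like for_each_online_cpu() would. */
		if (online[cpu] && mq_map[cpu] == qid)
			return (int)cpu;
	}
	return CPU_UNBOUND;
}
```

With a map of {0, 0, 1, 1} and all four CPUs online, queue 1 resolves to
CPU 2. If CPUs 2 and 3 are offline the sketch returns CPU_UNBOUND, which
corresponds to queue->io_cpu keeping the WORK_CPU_UNBOUND value assigned in
nvme_tcp_alloc_queue() when the loop in the patch finds no match.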
drivers/nvme/host/tcp.c | 44 +++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 13 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index d43099c562fc..df184004a514 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1559,19 +1559,36 @@ static bool nvme_tcp_poll_queue(struct nvme_tcp_queue *queue)
static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
{
struct nvme_tcp_ctrl *ctrl = queue->ctrl;
- int qid = nvme_tcp_queue_id(queue);
+ struct blk_mq_tag_set *set = &ctrl->tag_set;
+ int qid = nvme_tcp_queue_id(queue) - 1;
+ unsigned int *mq_map = NULL;
int n = 0;
- if (nvme_tcp_default_queue(queue))
- n = qid - 1;
- else if (nvme_tcp_read_queue(queue))
- n = qid - ctrl->io_queues[HCTX_TYPE_DEFAULT] - 1;
- else if (nvme_tcp_poll_queue(queue))
+ if (nvme_tcp_default_queue(queue)) {
+ mq_map = set->map[HCTX_TYPE_DEFAULT].mq_map;
+ n = qid;
+ } else if (nvme_tcp_read_queue(queue)) {
+ mq_map = set->map[HCTX_TYPE_READ].mq_map;
+ n = qid - ctrl->io_queues[HCTX_TYPE_DEFAULT];
+ } else if (nvme_tcp_poll_queue(queue)) {
+ mq_map = set->map[HCTX_TYPE_POLL].mq_map;
n = qid - ctrl->io_queues[HCTX_TYPE_DEFAULT] -
- ctrl->io_queues[HCTX_TYPE_READ] - 1;
- if (wq_unbound)
- queue->io_cpu = WORK_CPU_UNBOUND;
- else
+ ctrl->io_queues[HCTX_TYPE_READ];
+ }
+ if (wq_unbound) {
+ int cpu;
+
+ if (WARN_ON(!mq_map))
+ return;
+ for_each_online_cpu(cpu) {
+ if (mq_map[cpu] == qid) {
+ queue->io_cpu = cpu;
+ break;
+ }
+ }
+ dev_dbg(ctrl->ctrl.device, "queue %d: using cpu %d\n",
+ qid, queue->io_cpu);
+ } else
queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
}
@@ -1716,7 +1733,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
queue->sock->sk->sk_allocation = GFP_ATOMIC;
queue->sock->sk->sk_use_task_frag = false;
- nvme_tcp_set_queue_io_cpu(queue);
+ queue->io_cpu = WORK_CPU_UNBOUND;
queue->request = NULL;
queue->data_remaining = 0;
queue->ddgst_remaining = 0;
@@ -1872,9 +1889,10 @@ static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
nvme_tcp_init_recv_ctx(queue);
nvme_tcp_setup_sock_ops(queue);
- if (idx)
+ if (idx) {
+ nvme_tcp_set_queue_io_cpu(queue);
ret = nvmf_connect_io_queue(nctrl, idx);
- else
+ } else
ret = nvmf_connect_admin_queue(nctrl);
if (!ret) {
--
2.35.3
Thread overview: 37+ messages
2024-07-03 13:50 [PATCH 0/4] nvme-tcp: improve scalability Hannes Reinecke
2024-07-03 13:50 ` [PATCH 1/4] nvme-tcp: per-controller I/O workqueues Hannes Reinecke
2024-07-03 14:11 ` Sagi Grimberg
2024-07-03 14:46 ` Hannes Reinecke
2024-07-03 15:16 ` Sagi Grimberg
2024-07-03 17:07 ` Tejun Heo
2024-07-03 19:14 ` Sagi Grimberg
2024-07-03 19:17 ` Tejun Heo
2024-07-03 19:41 ` Sagi Grimberg
2024-07-04 7:36 ` Hannes Reinecke
2024-07-05 7:10 ` Christoph Hellwig
2024-07-05 8:11 ` Hannes Reinecke
2024-07-05 8:16 ` Jens Axboe
2024-07-04 5:36 ` Christoph Hellwig
2024-07-03 13:50 ` Hannes Reinecke [this message]
2024-07-03 14:19 ` [PATCH 2/4] nvme-tcp: align I/O cpu with blk-mq mapping Sagi Grimberg
2024-07-03 14:53 ` Hannes Reinecke
2024-07-03 15:03 ` Sagi Grimberg
2024-07-03 15:40 ` Hannes Reinecke
2024-07-03 19:38 ` Sagi Grimberg
2024-07-03 19:47 ` Sagi Grimberg
2024-07-04 6:43 ` Hannes Reinecke
2024-07-04 9:07 ` Sagi Grimberg
2024-07-04 14:03 ` Hannes Reinecke
2024-07-04 5:37 ` Christoph Hellwig
2024-07-04 9:13 ` Sagi Grimberg
2024-07-03 13:50 ` [PATCH 3/4] workqueue: introduce helper workqueue_unbound_affinity_scope() Hannes Reinecke
2024-07-03 17:31 ` Tejun Heo
2024-07-04 6:04 ` Hannes Reinecke
2024-07-03 13:50 ` [PATCH 4/4] nvme-tcp: switch to 'cpu' affinity scope for unbound workqueues Hannes Reinecke
2024-07-03 14:22 ` Sagi Grimberg
2024-07-03 15:01 ` Hannes Reinecke
2024-07-03 15:09 ` Sagi Grimberg
2024-07-03 15:50 ` Hannes Reinecke
2024-07-04 9:11 ` Sagi Grimberg
2024-07-04 15:54 ` Hannes Reinecke
2024-07-05 11:48 ` Sagi Grimberg