From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Nilay Shroff <nilay@linux.ibm.com>,
Christoph Hellwig <hch@lst.de>, Keith Busch <kbusch@kernel.org>,
Sasha Levin <sashal@kernel.org>,
sagi@grimberg.me, linux-nvme@lists.infradead.org
Subject: [PATCH AUTOSEL 6.1 01/14] nvme-multipath: find NUMA path only for online numa-node
Date: Wed, 5 Jun 2024 08:04:34 -0400
Message-ID: <20240605120455.2967445-1-sashal@kernel.org>
From: Nilay Shroff <nilay@linux.ibm.com>
[ Upstream commit d3a043733f25d743f3aa617c7f82dbcb5ee2211a ]
In the current native multipath design, when a shared namespace is created,
we loop through each possible numa-node, calculate the NUMA distance of
that node from each nvme controller, and then cache the optimal IO path
for future reference while sending IO. The issue with this design is that
we may refer to the NUMA distance table for an offline node, whose entry
may not be populated at the time, and so we may inadvertently end up
finding and caching a non-optimal path for IO. Later, when the
corresponding numa-node comes online and the NUMA distance table entry for
that node is created, we should ideally re-calculate the multipath node
distance for the newly added node. However, that doesn't happen unless we
rescan/reset the controller. So essentially, we may keep using a
non-optimal IO path for a node which is brought online after the namespace
is created.
This patch fixes the issue by ensuring that when a shared namespace is
created, we calculate the multipath node distance for each online numa-node
instead of each possible numa-node. Later, when a node comes online and we
receive any IO on that newly added node, we calculate the multipath node
distance for it, but this time the NUMA distance table has already been
populated for that node. Hence we are able to correctly calculate the
multipath node distance and choose the optimal path for the IO.
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/host/multipath.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index f96d330d39641..ead42a81cb352 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -558,7 +558,7 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
int node, srcu_idx;
srcu_idx = srcu_read_lock(&head->srcu);
- for_each_node(node)
+ for_each_online_node(node)
__nvme_find_path(head, node);
srcu_read_unlock(&head->srcu, srcu_idx);
}
--
2.43.0
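
For readers who want to see the effect of the one-line change in isolation,
here is a minimal userspace sketch of the caching behaviour described in the
commit message. It is not kernel code: struct topo, find_path(),
set_live_all_possible(), set_live_online_only(), submit_io() and the
distance values are all hypothetical stand-ins, and the sketch assumes that
an offline node's distance row holds an identical placeholder value for
every controller.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4          /* hypothetical count of possible NUMA nodes */
#define NUM_CTRLS 2          /* hypothetical count of controllers (paths) */
#define NO_PATH (-1)         /* no path cached for this node yet */
#define PLACEHOLDER_DIST 10  /* assumed distance row of an offline node */

struct topo {
	bool online[MAX_NODES];
	int distance[MAX_NODES][NUM_CTRLS];
	int cached_path[MAX_NODES];
};

/* Start with every node offline, placeholder distances, empty cache. */
void topo_init(struct topo *t)
{
	for (int n = 0; n < MAX_NODES; n++) {
		t->online[n] = false;
		t->cached_path[n] = NO_PATH;
		for (int c = 0; c < NUM_CTRLS; c++)
			t->distance[n][c] = PLACEHOLDER_DIST;
	}
}

/* Bring a node online and populate its real distance row. */
void node_set_online(struct topo *t, int node, const int dist[NUM_CTRLS])
{
	t->online[node] = true;
	for (int c = 0; c < NUM_CTRLS; c++)
		t->distance[node][c] = dist[c];
}

/* Pick the closest controller for a node and cache the choice. */
int find_path(struct topo *t, int node)
{
	int best = NO_PATH, best_dist = INT_MAX;

	for (int c = 0; c < NUM_CTRLS; c++) {
		if (t->distance[node][c] < best_dist) {
			best_dist = t->distance[node][c];
			best = c;
		}
	}
	t->cached_path[node] = best;
	return best;
}

/* Models the pre-patch loop: warm the cache for every possible node. */
void set_live_all_possible(struct topo *t)
{
	for (int node = 0; node < MAX_NODES; node++)
		find_path(t, node);
}

/* Models the patched loop: warm the cache for online nodes only. */
void set_live_online_only(struct topo *t)
{
	for (int node = 0; node < MAX_NODES; node++)
		if (t->online[node])
			find_path(t, node);
}

/* IO submission: use the cached path, look one up only on a miss. */
int submit_io(struct topo *t, int node)
{
	if (t->cached_path[node] == NO_PATH)
		return find_path(t, node);
	return t->cached_path[node];
}
```

In this model, warming the cache for an offline node stores whichever
controller happens to win the placeholder tie, and submit_io() keeps
returning it after the node comes online. With the online-only loop the
entry stays empty, so the first IO from the newly onlined node triggers a
lookup against the populated distance row.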