* [PATCH AUTOSEL 5.4 1/2] nvme-multipath: find NUMA path only for online numa-node
@ 2024-06-05 12:06 Sasha Levin
2024-06-05 12:06 ` [PATCH AUTOSEL 5.4 2/2] afs: Don't cross .backup mountpoint from backup volume Sasha Levin
0 siblings, 1 reply; 2+ messages in thread
From: Sasha Levin @ 2024-06-05 12:06 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Nilay Shroff, Christoph Hellwig, Keith Busch, Sasha Levin, sagi,
linux-nvme
From: Nilay Shroff <nilay@linux.ibm.com>
[ Upstream commit d3a043733f25d743f3aa617c7f82dbcb5ee2211a ]
In current native multipath design when a shared namespace is created,
we loop through each possible numa-node, calculate the NUMA distance of
that node from each nvme controller and then cache the optimal IO path
for future reference while sending IO. The issue with this design is that
we may refer to the NUMA distance table for an offline node which may not
be populated at the time and so we may inadvertently end up finding and
caching a non-optimal path for IO. Then latter when the corresponding
numa-node becomes online and hence the NUMA distance table entry for that
node is created, ideally we should re-calculate the multipath node distance
for the newly added node however that doesn't happen unless we rescan/reset
the controller. So essentially, we may keep using non-optimal IO path for a
node which is made online after namespace is created.
This patch helps fix this issue ensuring that when a shared namespace is
created, we calculate the multipath node distance for each online numa-node
instead of each possible numa-node. Then latter when a node becomes online
and we receive any IO on that newly added node, we would calculate the
multipath node distance for newly added node but this time NUMA distance
table would have been already populated for newly added node. Hence we
would be able to correctly calculate the multipath node distance and choose
the optimal path for the IO.
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/host/multipath.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 811f7b96b5517..c993548403f5c 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -436,7 +436,7 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
int node, srcu_idx;
srcu_idx = srcu_read_lock(&head->srcu);
- for_each_node(node)
+ for_each_online_node(node)
__nvme_find_path(head, node);
srcu_read_unlock(&head->srcu, srcu_idx);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* [PATCH AUTOSEL 5.4 2/2] afs: Don't cross .backup mountpoint from backup volume
2024-06-05 12:06 [PATCH AUTOSEL 5.4 1/2] nvme-multipath: find NUMA path only for online numa-node Sasha Levin
@ 2024-06-05 12:06 ` Sasha Levin
0 siblings, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2024-06-05 12:06 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Marc Dionne, Jan Henrik Sylvester, Markus Suvanto, David Howells,
Jeffrey Altman, linux-afs, Christian Brauner, Sasha Levin
From: Marc Dionne <marc.dionne@auristor.com>
[ Upstream commit 29be9100aca2915fab54b5693309bc42956542e5 ]
Don't cross a mountpoint that explicitly specifies a backup volume
(target is <vol>.backup) when starting from a backup volume.
It it not uncommon to mount a volume's backup directly in the volume
itself. This can cause tools that are not paying attention to get
into a loop mounting the volume onto itself as they attempt to
traverse the tree, leading to a variety of problems.
This doesn't prevent the general case of loops in a sequence of
mountpoints, but addresses a common special case in the same way
as other afs clients.
Reported-by: Jan Henrik Sylvester <jan.henrik.sylvester@uni-hamburg.de>
Link: http://lists.infradead.org/pipermail/linux-afs/2024-May/008454.html
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Link: http://lists.infradead.org/pipermail/linux-afs/2024-February/008074.html
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/768760.1716567475@warthog.procyon.org.uk
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/afs/mntpt.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index f105f061e89bb..eeb2b4b11e323 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -146,6 +146,11 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt)
put_page(page);
if (ret < 0)
return ret;
+
+ /* Don't cross a backup volume mountpoint from a backup volume */
+ if (src_as->volume && src_as->volume->type == AFSVL_BACKVOL &&
+ ctx->type == AFSVL_BACKVOL)
+ return -ENODEV;
}
return 0;
--
2.43.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-06-05 12:06 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-05 12:06 [PATCH AUTOSEL 5.4 1/2] nvme-multipath: find NUMA path only for online numa-node Sasha Levin
2024-06-05 12:06 ` [PATCH AUTOSEL 5.4 2/2] afs: Don't cross .backup mountpoint from backup volume Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox