Linux-NVME Archive on lore.kernel.org
* [PATCH v2] nvme: recompute multipath zoned limits from ready paths
@ 2026-05-22  7:58 Yao Sang
  2026-05-22  9:21 ` Christoph Hellwig
From: Yao Sang @ 2026-05-22  7:58 UTC
  To: linux-nvme; +Cc: kbusch, axboe, hch, sagi, shinichiro.kawasaki, Yao Sang

This was found while debugging a zoned NVMe multipath setup, where the
namespace head reported 0/0 for max_open_zones and max_active_zones
while the live path still reported finite limits.

These zoned resource limits are namespace-wide, but the head can retain
stale values after the ready-path set changes. Because 0 means "no
limit" for both values, a stale 0/0 on the head makes the multipath
namespace advertise unlimited open and active zones even when the
current ready path still reports finite limits.

Recompute max_open_zones and max_active_zones from the current
NVME_NS_READY paths when refreshing the namespace head limits, and clear
them when the resulting queue is not zoned.

Signed-off-by: Yao Sang <sangyao@kylinos.cn>
---
Changes in v2:
- Address Shin'ichiro Kawasaki's feedback that the v1 generic
  block-layer approach ("block: stack zoned resource limits") could
  break the DM zone resource limit semantics behind blktests zbd/011.
- Narrow the fix to NVMe multipath, rename the patch accordingly, and
  move the CONFIG_NVME_MULTIPATH guard to the call site.
- Rewrite the changelog around the stale 0/0 head-limit symptom and
  refresh the testing summary with directly relevant passing coverage.

Testing:
- Build: CONFIG_NVME_MULTIPATH=n to cover the non-multipath call-site guard.
- blktests: nvme/005, nvme/057, nvme/058 for ready-path changes via reset,
  ANA failover/failback, and namespace remap.
- blktests: zbd/011, zbd/012, zbd/013, block/004
  to cover adjacent zoned/block-limit behavior.
- blktests on QEMU ZNS NVMe /dev/nvme0n1: zbd/001, zbd/002, zbd/003,
  zbd/004, zbd/005, zbd/006 to confirm the fix on the QEMU-backed
  zoned NVMe device used in this VM testbed.

Link: https://lore.kernel.org/r/20260520091237.392802-1-sangyao@kylinos.cn
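
Illustration only, not part of the patch: a minimal userspace sketch of
the recompute described in the changelog, assuming each path simply
carries a ready flag and its two limits. The names path_model,
head_model, min_not_zero_uint() and recompute_head_zone_limits() are
hypothetical and do not exist in the kernel; min_not_zero_uint() is
meant to mirror the semantics of the kernel's min_not_zero() macro,
where 0 ("no limit") never wins over a finite value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one path; "ready" stands in for NVME_NS_READY. */
struct path_model {
	bool ready;
	unsigned int max_open_zones;	/* 0 means "no limit" */
	unsigned int max_active_zones;	/* 0 means "no limit" */
};

/* Hypothetical model of the zone limits kept on the namespace head. */
struct head_model {
	unsigned int max_open_zones;
	unsigned int max_active_zones;
};

/* Smaller of two values, treating 0 as "no limit" rather than as minimum. */
static unsigned int min_not_zero_uint(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Rebuild the head limits from scratch using only the ready paths. */
static void recompute_head_zone_limits(struct head_model *head,
				       const struct path_model *paths,
				       size_t nr_paths, bool zoned)
{
	size_t i;

	assert(head != NULL);
	assert(paths != NULL || nr_paths == 0);

	/* Start from "no limit" so stale head values cannot survive. */
	head->max_open_zones = 0;
	head->max_active_zones = 0;

	/* A non-zoned queue keeps both limits cleared. */
	if (!zoned)
		return;

	for (i = 0; i < nr_paths; i++) {
		if (!paths[i].ready)
			continue;

		head->max_open_zones =
			min_not_zero_uint(head->max_open_zones,
					  paths[i].max_open_zones);
		head->max_active_zones =
			min_not_zero_uint(head->max_active_zones,
					  paths[i].max_active_zones);
	}
}
```

In this model, a head holding stale 3/3 with ready paths reporting 8/12
and 0/10 ends up at 8/10, and a non-ready path reporting 4/4 is ignored.
With no ready paths, or a non-zoned queue, the result is 0/0.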

 drivers/nvme/host/core.c | 42 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c3032d6ad6b1..41fbfbe5f970 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2483,6 +2483,44 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	return ret;
 }

+#ifdef CONFIG_NVME_MULTIPATH
+static void nvme_update_ns_head_zoned_limits(struct nvme_ns_head *head,
+					     struct queue_limits *lim)
+{
+	int srcu_idx;
+	struct nvme_ns *path;
+
+	if (!(lim->features & BLK_FEAT_ZONED)) {
+		lim->max_open_zones = 0;
+		lim->max_active_zones = 0;
+		return;
+	}
+
+	/*
+	 * Zone resource limits are namespace-wide. Recompute them from all ready
+	 * namespace paths instead of incrementally stacking stale head values.
+	 */
+	lim->max_open_zones = 0;
+	lim->max_active_zones = 0;
+
+	srcu_idx = srcu_read_lock(&head->srcu);
+	list_for_each_entry_srcu(path, &head->list, siblings,
+				 srcu_read_lock_held(&head->srcu)) {
+		struct queue_limits *path_lim;
+
+		if (!path->disk || !test_bit(NVME_NS_READY, &path->flags))
+			continue;
+
+		path_lim = &path->disk->queue->limits;
+		lim->max_open_zones = min_not_zero(lim->max_open_zones,
+						   path_lim->max_open_zones);
+		lim->max_active_zones = min_not_zero(lim->max_active_zones,
+						     path_lim->max_active_zones);
+	}
+	srcu_read_unlock(&head->srcu, srcu_idx);
+}
+#endif
+
 static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 {
 	bool unsupported = false;
@@ -2549,6 +2587,10 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
 		lim.io_opt = ns_lim->io_opt;
 		queue_limits_stack_bdev(&lim, ns->disk->part0, 0,
 					ns->head->disk->disk_name);
+
+#ifdef CONFIG_NVME_MULTIPATH
+		nvme_update_ns_head_zoned_limits(ns->head, &lim);
+#endif
 		if (unsupported)
 			ns->head->disk->flags |= GENHD_FL_HIDDEN;
 		else
--
2.25.1


Thread overview: 4+ messages
2026-05-22  7:58 [PATCH v2] nvme: recompute multipath zoned limits from ready paths Yao Sang
2026-05-22  9:21 ` Christoph Hellwig
2026-05-25  6:40   ` Yao Sang
2026-05-25  7:30     ` Christoph Hellwig
