* [PATCH v2] nvme: reserve a keep-alive admin tag for all transports
@ 2026-05-15  7:12 Chao Shi
From: Chao Shi @ 2026-05-15  7:12 UTC (permalink / raw)
  To: linux-nvme, Keith Busch
  Cc: Christoph Hellwig, Sagi Grimberg, Jens Axboe, Tatsuya Sasaki,
	Maurizio Lombardi, linux-kernel, Sungwoo Kim, Dave Tian,
	Weidong Zhu

nvme_keep_alive_work() always allocates with BLK_MQ_REQ_RESERVED, but
nvme_alloc_admin_tag_set() only sets reserved_tags for fabrics. Since
commit b58da2d270db ("nvme: update keep alive interval when kato is
modified"), userspace can start keep-alive on any transport via Set
Features (KATO), after which the allocation trips WARN_ON_ONCE() in
blk_mq_get_tag() and fails with -EWOULDBLOCK:

  nvme nvme0: keep-alive failed: -11

Per NVMe 2.0a section 5.27.1.12 and the transport binding wording,
PCIe MAY support KATO. Reserve one admin tag on all transports so
the host is ready when a controller accepts the feature. Fabrics
keeps two, the second being for the connect command.

A quirk-based approach was considered, but we found no PCIe controller
documented as reporting KAS != 0 (two enterprise SSDs tested locally
report KAS=0), so an allowlist would have no entries today.

Link: https://lore.kernel.org/linux-nvme/20260428022911.1288485-1-coshi036@gmail.com/

Fixes: b58da2d270db ("nvme: update keep alive interval when kato is modified")

Found by FuzzNvme (Syzkaller with FEMU fuzzing framework).

Acked-by: Sungwoo Kim <iam@sung-woo.kim>
Acked-by: Dave Tian <daveti@purdue.edu>
Acked-by: Weidong Zhu <weizhu@fiu.edu>
Signed-off-by: Chao Shi <coshi036@gmail.com>
---

Reproducer (run as root on an unpatched kernel with a PCIe NVMe device):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/nvme_ioctl.h>

    int main(void)
    {
            struct nvme_admin_cmd cmd = {0};
            int fd = open("/dev/nvme0", O_RDWR);
            if (fd < 0) { perror("open"); return 1; }
            cmd.opcode = 0x09;       /* SET_FEATURES */
            cmd.cdw10  = 0x0f;       /* Feature ID: KATO */
            cmd.cdw11  = 5;          /* KATO = 5 seconds */
            if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0) {
                    perror("ioctl");
                    close(fd);
                    return 1;
            }
            close(fd);
            return 0;
    }

Within ~kato/2 seconds after the program exits, dmesg shows:

    nvme nvme0: keep alive interval updated from 0 ms to 5000 ms
    WARNING: CPU: 0 PID: ... at block/blk-mq-tag.c:148 blk_mq_get_tag+...
    nvme nvme0: keep-alive failed: -11
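
For readers unfamiliar with reserved tags, the failure and the fix can
be sketched as a small userspace model. This is an illustration only,
not kernel code: struct model_tag_set, model_reserved_tags() and
model_get_reserved_tag() are hypothetical names, and the model assumes
only what the commit message states, namely that a reserved allocation
from a tag set with no reserved tags fails with -EWOULDBLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of an admin tag set; not the kernel struct. */
struct model_tag_set {
	unsigned int queue_depth;     /* total tags (illustrative value only) */
	unsigned int reserved_tags;   /* tags set aside for reserved requests */
	unsigned int reserved_in_use; /* reserved tags currently handed out */
};

/*
 * Policy after the patch: one reserved tag for keep-alive on every
 * transport, plus a second one for the connect command on fabrics.
 */
static unsigned int model_reserved_tags(bool is_fabrics)
{
	return is_fabrics ? 2 : 1;
}

/*
 * Model of a non-blocking reserved allocation. Returns a tag index on
 * success or -EWOULDBLOCK when no reserved tag is available. With
 * reserved_tags == 0 this is the case where the kernel also warns.
 */
static int model_get_reserved_tag(struct model_tag_set *set)
{
	assert(set != NULL);

	if (set->reserved_tags == 0)
		return -EWOULDBLOCK;
	if (set->reserved_in_use >= set->reserved_tags)
		return -EWOULDBLOCK;
	return (int)set->reserved_in_use++;
}
```

In this model a PCIe tag set built with reserved_tags = 0 (the old
behavior) can never satisfy the keep-alive allocation, while a tag set
built with model_reserved_tags(false) satisfies exactly one.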

Changes since v1:
- Add spec citation (NVMe 2.0a 5.27.1.12 + transport binding wording)
  clarifying that PCIe MAY support KATO.
- Discuss the quirk-based alternative suggested in v1 review and
  note that no PCIe controller declaring KAS != 0 is documented
  today (two enterprise SSDs tested locally report KAS=0).
- Add Link: to v1 thread.
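
For anyone who wants to check KAS on their own hardware, the field can
be read from the 4096-byte Identify Controller data structure. The
helpers below are a minimal sketch with hypothetical names
(model_id_ctrl_kas, model_kas_to_ms); they assume KAS is the
little-endian 16-bit field at bytes 321:320 and is expressed in units
of 100 ms, with 0 meaning keep-alive is not supported. Fetching the
buffer (Identify, CNS 01h) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed byte offset of KAS within Identify Controller data. */
#define MODEL_ID_CTRL_KAS_OFFSET 320

/*
 * Extract KAS from a raw Identify Controller buffer. The field is
 * little-endian, so assemble it byte by byte to stay independent of
 * host endianness.
 */
static uint16_t model_id_ctrl_kas(const uint8_t *id_ctrl)
{
	assert(id_ctrl != NULL);

	return (uint16_t)(id_ctrl[MODEL_ID_CTRL_KAS_OFFSET] |
			  (id_ctrl[MODEL_ID_CTRL_KAS_OFFSET + 1] << 8));
}

/* Convert KAS (100 ms units) to milliseconds; 0 means unsupported. */
static uint32_t model_kas_to_ms(uint16_t kas)
{
	return (uint32_t)kas * 100u;
}
```

A controller whose buffer yields model_id_ctrl_kas() == 0 matches the
two SSDs mentioned above.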

 drivers/nvme/host/core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7bf228df6001..6db02ecde6d1 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4850,8 +4850,13 @@ int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 	memset(set, 0, sizeof(*set));
 	set->ops = ops;
 	set->queue_depth = NVME_AQ_MQ_TAG_DEPTH;
+	/*
+	 * Reserve one tag for keep-alive, which is allocated with
+	 * BLK_MQ_REQ_RESERVED and can be enabled on any transport via the
+	 * KATO feature.  Fabrics needs a second reserved tag for connect.
+	 */
+	set->reserved_tags = 1;
 	if (ctrl->ops->flags & NVME_F_FABRICS)
-		/* Reserved for fabric connect and keep alive */
 		set->reserved_tags = 2;
 	set->numa_node = ctrl->numa_node;
 	if (ctrl->ops->flags & NVME_F_BLOCKING)
-- 
2.43.0

