* [PATCH AUTOSEL 6.6 37/43] nvme: find numa distance only if controller has valid numa id
[not found] <20240507231033.393285-1-sashal@kernel.org>
@ 2024-05-07 23:09 ` Sasha Levin
2024-05-07 23:09 ` [PATCH AUTOSEL 6.6 38/43] nvmet-auth: return the error code to the nvmet_auth_host_hash() callers Sasha Levin
` (4 subsequent siblings)
5 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2024-05-07 23:09 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Nilay Shroff, Christoph Hellwig, Sagi Grimberg,
Chaitanya Kulkarni, Keith Busch, Sasha Levin, linux-nvme
From: Nilay Shroff <nilay@linux.ibm.com>
[ Upstream commit 863fe60ed27f2c85172654a63c5b827e72c8b2e6 ]
On a system where native nvme multipath is configured and iopolicy
is set to numa, but the nvme controller numa node id is undefined
or -1 (NUMA_NO_NODE), avoid calculating node distance when finding
the optimal io path. Otherwise we may access the numa distance table
with an invalid index, which may refer to incorrect memory. So this
patch ensures that if the nvme controller numa node id is -1 then,
instead of calculating node distance for finding the optimal io path,
we set the numa node distance of such a controller to the default of 10
(LOCAL_DISTANCE).
Link: https://lore.kernel.org/all/20240413090614.678353-1-nilay@linux.ibm.com/
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/host/multipath.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 0a88d7bdc5e37..b39553b8378b5 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -246,7 +246,8 @@ static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head, int node)
if (nvme_path_is_disabled(ns))
continue;
- if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_NUMA)
+ if (ns->ctrl->numa_node != NUMA_NO_NODE &&
+ READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_NUMA)
distance = node_distance(node, ns->ctrl->numa_node);
else
distance = LOCAL_DISTANCE;
--
2.43.0
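For readers following the thread, here is a minimal userspace sketch of
the guard described above. It is not the kernel code: the 2-node table
and the names toy_distance, toy_node_distance() and path_distance() are
made up for illustration. Only NUMA_NO_NODE (-1) and LOCAL_DISTANCE (10)
are taken from the commit message.

```c
#include <assert.h>

#define NUMA_NO_NODE   (-1)
#define LOCAL_DISTANCE 10
#define NR_NODES       2	/* hypothetical 2-node system */

/* Toy distance table; the values are illustrative only. */
static const int toy_distance[NR_NODES][NR_NODES] = {
	{ 10, 20 },
	{ 20, 10 },
};

/* Raw table lookup: both indices must be valid node ids. */
static int toy_node_distance(int from, int to)
{
	assert(from >= 0 && from < NR_NODES);
	assert(to >= 0 && to < NR_NODES);
	return toy_distance[from][to];
}

/*
 * Guarded lookup mirroring the patched condition: only consult the
 * table when the controller has a valid node id and the numa policy
 * is in use; otherwise fall back to LOCAL_DISTANCE.
 */
static int path_distance(int node, int ctrl_node, int numa_policy)
{
	if (ctrl_node != NUMA_NO_NODE && numa_policy)
		return toy_node_distance(node, ctrl_node);
	return LOCAL_DISTANCE;
}
```

With the guard in place, path_distance(0, NUMA_NO_NODE, 1) never reaches
the table lookup, so the -1 index is never used.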
* [PATCH AUTOSEL 6.6 38/43] nvmet-auth: return the error code to the nvmet_auth_host_hash() callers
[not found] <20240507231033.393285-1-sashal@kernel.org>
2024-05-07 23:09 ` [PATCH AUTOSEL 6.6 37/43] nvme: find numa distance only if controller has valid numa id Sasha Levin
@ 2024-05-07 23:09 ` Sasha Levin
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 39/43] nvmet-auth: replace pr_debug() with pr_err() to report an error Sasha Levin
` (3 subsequent siblings)
5 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2024-05-07 23:09 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Maurizio Lombardi, Sagi Grimberg, Chaitanya Kulkarni, Keith Busch,
Sasha Levin, hare, linux-nvme
From: Maurizio Lombardi <mlombard@redhat.com>
[ Upstream commit 46b8f9f74f6d500871985e22eb19560b21f3bc81 ]
If the nvmet_auth_host_hash() function fails, the error code should
be returned to its callers.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/target/auth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/target/auth.c b/drivers/nvme/target/auth.c
index 4dcddcf95279b..1f7d492c4dc26 100644
--- a/drivers/nvme/target/auth.c
+++ b/drivers/nvme/target/auth.c
@@ -368,7 +368,7 @@ int nvmet_auth_host_hash(struct nvmet_req *req, u8 *response,
kfree_sensitive(host_response);
out_free_tfm:
crypto_free_shash(shash_tfm);
- return 0;
+ return ret;
}
int nvmet_auth_ctrl_hash(struct nvmet_req *req, u8 *response,
--
2.43.0
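The pattern being fixed can be sketched in userspace as follows. This is
an illustration only: toy_hash_step() and toy_host_hash() are
hypothetical stand-ins, not functions from drivers/nvme/target/auth.c.
The point is that the shared cleanup label must end in "return ret" so
that a failure recorded earlier reaches the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hashing step that can fail. */
static int toy_hash_step(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

static int toy_host_hash(int should_fail)
{
	char *buf;
	int ret;

	buf = malloc(32);
	if (!buf)
		return -ENOMEM;

	ret = toy_hash_step(should_fail);
	if (ret)
		goto out_free;

	/* ... further steps would go here ... */

out_free:
	free(buf);
	return ret;	/* "return 0" here would hide the failure */
}
```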
* [PATCH AUTOSEL 6.6 39/43] nvmet-auth: replace pr_debug() with pr_err() to report an error.
[not found] <20240507231033.393285-1-sashal@kernel.org>
2024-05-07 23:09 ` [PATCH AUTOSEL 6.6 37/43] nvme: find numa distance only if controller has valid numa id Sasha Levin
2024-05-07 23:09 ` [PATCH AUTOSEL 6.6 38/43] nvmet-auth: return the error code to the nvmet_auth_host_hash() callers Sasha Levin
@ 2024-05-07 23:10 ` Sasha Levin
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 40/43] nvme: cancel pending I/O if nvme controller is in terminal state Sasha Levin
` (2 subsequent siblings)
5 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2024-05-07 23:10 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Maurizio Lombardi, Sagi Grimberg, Chaitanya Kulkarni, Keith Busch,
Sasha Levin, hare, linux-nvme
From: Maurizio Lombardi <mlombard@redhat.com>
[ Upstream commit 445f9119e70368ccc964575c2a6d3176966a9d65 ]
In nvmet_auth_host_hash(), if a mismatch is detected in the hash length
the kernel should print an error.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/target/auth.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/auth.c b/drivers/nvme/target/auth.c
index 1f7d492c4dc26..e900525b78665 100644
--- a/drivers/nvme/target/auth.c
+++ b/drivers/nvme/target/auth.c
@@ -284,9 +284,9 @@ int nvmet_auth_host_hash(struct nvmet_req *req, u8 *response,
}
if (shash_len != crypto_shash_digestsize(shash_tfm)) {
- pr_debug("%s: hash len mismatch (len %d digest %d)\n",
- __func__, shash_len,
- crypto_shash_digestsize(shash_tfm));
+ pr_err("%s: hash len mismatch (len %d digest %d)\n",
+ __func__, shash_len,
+ crypto_shash_digestsize(shash_tfm));
ret = -EINVAL;
goto out_free_tfm;
}
--
2.43.0
* [PATCH AUTOSEL 6.6 40/43] nvme: cancel pending I/O if nvme controller is in terminal state
[not found] <20240507231033.393285-1-sashal@kernel.org>
` (2 preceding siblings ...)
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 39/43] nvmet-auth: replace pr_debug() with pr_err() to report an error Sasha Levin
@ 2024-05-07 23:10 ` Sasha Levin
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 41/43] nvmet-tcp: fix possible memory leak when tearing down a controller Sasha Levin
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 42/43] nvmet: fix nvme status code when namespace is disabled Sasha Levin
5 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2024-05-07 23:10 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Nilay Shroff, Sagi Grimberg, Keith Busch, Sasha Levin, linux-nvme
From: Nilay Shroff <nilay@linux.ibm.com>
[ Upstream commit 25bb3534ee21e39eb9301c4edd7182eb83cb0d07 ]
If a pci bus error occurs while I/O is running, the in-flight I/O
cannot complete. Worse, if at this time the user (logically)
hot-unplugs the nvme disk, the nvme_remove() code path can't make
forward progress until the in-flight I/O is cancelled. So this
sequence of events may hang the hot-unplug code path indefinitely.
This patch cancels the pending/in-flight I/O from the nvme request
timeout handler when the nvme controller is in a terminal
(DEAD/DELETING/DELETING_NOIO) state, which lets the nvme_remove()
code path make forward progress and finish successfully.
Link: https://lore.kernel.org/all/199be893-5dfa-41e5-b6f2-40ac90ebccc4@linux.ibm.com/
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/host/core.c | 21 ---------------------
drivers/nvme/host/nvme.h | 21 +++++++++++++++++++++
drivers/nvme/host/pci.c | 8 +++++++-
3 files changed, 28 insertions(+), 22 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 012c8b3f5f9c9..02d9d1b973494 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -587,27 +587,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
}
EXPORT_SYMBOL_GPL(nvme_change_ctrl_state);
-/*
- * Returns true for sink states that can't ever transition back to live.
- */
-static bool nvme_state_terminal(struct nvme_ctrl *ctrl)
-{
- switch (nvme_ctrl_state(ctrl)) {
- case NVME_CTRL_NEW:
- case NVME_CTRL_LIVE:
- case NVME_CTRL_RESETTING:
- case NVME_CTRL_CONNECTING:
- return false;
- case NVME_CTRL_DELETING:
- case NVME_CTRL_DELETING_NOIO:
- case NVME_CTRL_DEAD:
- return true;
- default:
- WARN_ONCE(1, "Unhandled ctrl state:%d", ctrl->state);
- return true;
- }
-}
-
/*
* Waits for the controller state to be resetting, or returns false if it is
* not possible to ever transition to that state.
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ba62d42d2a8b7..43cff851ac5ae 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -735,6 +735,27 @@ static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
nvme_tag_from_cid(command_id) >= NVME_AQ_BLK_MQ_DEPTH;
}
+/*
+ * Returns true for sink states that can't ever transition back to live.
+ */
+static inline bool nvme_state_terminal(struct nvme_ctrl *ctrl)
+{
+ switch (nvme_ctrl_state(ctrl)) {
+ case NVME_CTRL_NEW:
+ case NVME_CTRL_LIVE:
+ case NVME_CTRL_RESETTING:
+ case NVME_CTRL_CONNECTING:
+ return false;
+ case NVME_CTRL_DELETING:
+ case NVME_CTRL_DELETING_NOIO:
+ case NVME_CTRL_DEAD:
+ return true;
+ default:
+ WARN_ONCE(1, "Unhandled ctrl state:%d", ctrl->state);
+ return true;
+ }
+}
+
void nvme_complete_rq(struct request *req);
void nvme_complete_batch_req(struct request *req);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b985142fb84b9..4352206533ede 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1286,6 +1286,9 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
struct nvme_command cmd = { };
u32 csts = readl(dev->bar + NVME_REG_CSTS);
+ if (nvme_state_terminal(&dev->ctrl))
+ goto disable;
+
/* If PCI error recovery process is happening, we cannot reset or
* the recovery mechanism will surely fail.
*/
@@ -1388,8 +1391,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
return BLK_EH_RESET_TIMER;
disable:
- if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING))
+ if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) {
+ if (nvme_state_terminal(&dev->ctrl))
+ nvme_dev_disable(dev, true);
return BLK_EH_DONE;
+ }
nvme_dev_disable(dev, false);
if (nvme_try_sched_reset(&dev->ctrl))
--
2.43.0
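A simplified userspace model of the "disable:" path after this patch may
help when reading the pci.c hunks. All toy_* names are hypothetical, and
the rule that RESETTING can only be entered from NEW or LIVE is a
simplifying assumption of this sketch, not something stated in the
patch.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state {
	TOY_NEW, TOY_LIVE, TOY_RESETTING, TOY_CONNECTING,
	TOY_DELETING, TOY_DELETING_NOIO, TOY_DEAD,
};

enum toy_action { TOY_ACT_NONE, TOY_ACT_SHUTDOWN, TOY_ACT_RESET };

/* True for sink states that can never go back to live. */
static bool toy_state_terminal(enum toy_state s)
{
	switch (s) {
	case TOY_NEW:
	case TOY_LIVE:
	case TOY_RESETTING:
	case TOY_CONNECTING:
		return false;
	default:
		return true;
	}
}

/* Models what the timeout handler's disable path decides to do. */
static enum toy_action toy_disable_path(enum toy_state *state)
{
	/* Assumption: only NEW and LIVE may move to RESETTING. */
	if (*state != TOY_NEW && *state != TOY_LIVE) {
		/* New in this patch: cancel I/O if the state is terminal. */
		if (toy_state_terminal(*state))
			return TOY_ACT_SHUTDOWN;
		return TOY_ACT_NONE;
	}
	*state = TOY_RESETTING;
	return TOY_ACT_RESET;
}
```

In the model, a DEAD controller yields TOY_ACT_SHUTDOWN, which
corresponds to the added nvme_dev_disable(dev, true) call in the diff.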
* [PATCH AUTOSEL 6.6 41/43] nvmet-tcp: fix possible memory leak when tearing down a controller
[not found] <20240507231033.393285-1-sashal@kernel.org>
` (3 preceding siblings ...)
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 40/43] nvme: cancel pending I/O if nvme controller is in terminal state Sasha Levin
@ 2024-05-07 23:10 ` Sasha Levin
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 42/43] nvmet: fix nvme status code when namespace is disabled Sasha Levin
5 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2024-05-07 23:10 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sagi Grimberg, Yi Zhang, Christoph Hellwig, Keith Busch,
Sasha Levin, kch, linux-nvme
From: Sagi Grimberg <sagi@grimberg.me>
[ Upstream commit 6825bdde44340c5a9121f6d6fa25cc885bd9e821 ]
When we tear down the controller, we wait for pending I/Os to complete
(sq->ref on all queues to drop to zero) and then we go over the commands,
and free their command buffers in case they are still fetching data from
the host (e.g. processing nvme writes) and have yet to take a reference
on the sq.
However, we may miss the case where commands have failed before executing
and are queued for sending a response that will never be sent because the
queue socket is already down. In this case we may miss deallocating command
buffers.
Solve this by freeing all command buffers, as nvmet_tcp_free_cmd_buffers is
idempotent anyway.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/target/tcp.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 8e5d547aa16cb..3d302815c6f36 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -324,6 +324,7 @@ static int nvmet_tcp_check_ddgst(struct nvmet_tcp_queue *queue, void *pdu)
return 0;
}
+/* If cmd buffers are NULL, no operation is performed */
static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd)
{
kfree(cmd->iov);
@@ -1476,13 +1477,9 @@ static void nvmet_tcp_free_cmd_data_in_buffers(struct nvmet_tcp_queue *queue)
struct nvmet_tcp_cmd *cmd = queue->cmds;
int i;
- for (i = 0; i < queue->nr_cmds; i++, cmd++) {
- if (nvmet_tcp_need_data_in(cmd))
- nvmet_tcp_free_cmd_buffers(cmd);
- }
-
- if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect))
- nvmet_tcp_free_cmd_buffers(&queue->connect);
+ for (i = 0; i < queue->nr_cmds; i++, cmd++)
+ nvmet_tcp_free_cmd_buffers(cmd);
+ nvmet_tcp_free_cmd_buffers(&queue->connect);
}
static void nvmet_tcp_release_queue_work(struct work_struct *w)
--
2.43.0
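The reason the unconditional loop is safe can be shown with a small
userspace sketch. struct toy_cmd and both functions are hypothetical,
and free() stands in for the kernel's release helpers. The sketch
assumes, as the commit message implies, that the free helper tolerates
NULL buffers and can therefore be called on every command, and more than
once.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_cmd {
	void *iov;
	void *data;
};

/* Idempotent: free(NULL) is a no-op, and pointers are reset after use. */
static void toy_free_cmd_buffers(struct toy_cmd *cmd)
{
	free(cmd->iov);
	free(cmd->data);
	cmd->iov = NULL;
	cmd->data = NULL;
}

/* Free every command plus the connect command, with no state checks. */
static void toy_free_all(struct toy_cmd *cmds, int nr_cmds,
			 struct toy_cmd *connect)
{
	for (int i = 0; i < nr_cmds; i++)
		toy_free_cmd_buffers(&cmds[i]);
	toy_free_cmd_buffers(connect);
}
```

Calling toy_free_all() twice on the same array is harmless, which is the
property that lets the patch drop the nvmet_tcp_need_data_in() checks.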
* [PATCH AUTOSEL 6.6 42/43] nvmet: fix nvme status code when namespace is disabled
[not found] <20240507231033.393285-1-sashal@kernel.org>
` (4 preceding siblings ...)
2024-05-07 23:10 ` [PATCH AUTOSEL 6.6 41/43] nvmet-tcp: fix possible memory leak when tearing down a controller Sasha Levin
@ 2024-05-07 23:10 ` Sasha Levin
5 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2024-05-07 23:10 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sagi Grimberg, Jirong Feng, Christoph Hellwig, Keith Busch,
Sasha Levin, kch, linux-nvme
From: Sagi Grimberg <sagi@grimberg.me>
[ Upstream commit 505363957fad35f7aed9a2b0d8dad73451a80fb5 ]
If the user disables an nvmet namespace, it is removed from the subsystem
namespaces list. When nvmet processes a command directed to an nsid that
was disabled, it cannot differentiate between an nsid that is disabled
and a non-existent namespace, and resorts to returning NVME_SC_INVALID_NS
with the dnr bit set.
This translates to a non-retryable status for the host, which translates
to a user error. We should expect disabled namespaces to not cause an
I/O error in a multipath environment.
Address this by searching for a configfs item for the namespace nvmet
failed to find, and if we find one, conclude that the namespace is disabled
(perhaps temporarily). Return NVME_SC_INTERNAL_PATH_ERROR in this case
and keep the DNR bit cleared.
Reported-by: Jirong Feng <jirong.feng@easystack.cn>
Tested-by: Jirong Feng <jirong.feng@easystack.cn>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/target/configfs.c | 13 +++++++++++++
drivers/nvme/target/core.c | 5 ++++-
drivers/nvme/target/nvmet.h | 1 +
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 01b2a3d1a5e6c..3670a1103863b 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -616,6 +616,19 @@ static struct configfs_attribute *nvmet_ns_attrs[] = {
NULL,
};
+bool nvmet_subsys_nsid_exists(struct nvmet_subsys *subsys, u32 nsid)
+{
+ struct config_item *ns_item;
+ char name[4] = {};
+
+ if (sprintf(name, "%u", nsid) <= 0)
+ return false;
+ mutex_lock(&subsys->namespaces_group.cg_subsys->su_mutex);
+ ns_item = config_group_find_item(&subsys->namespaces_group, name);
+ mutex_unlock(&subsys->namespaces_group.cg_subsys->su_mutex);
+ return ns_item != NULL;
+}
+
static void nvmet_ns_release(struct config_item *item)
{
struct nvmet_ns *ns = to_nvmet_ns(item);
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 3935165048e74..ce7e945cb4f7e 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -425,10 +425,13 @@ void nvmet_stop_keep_alive_timer(struct nvmet_ctrl *ctrl)
u16 nvmet_req_find_ns(struct nvmet_req *req)
{
u32 nsid = le32_to_cpu(req->cmd->common.nsid);
+ struct nvmet_subsys *subsys = nvmet_req_subsys(req);
- req->ns = xa_load(&nvmet_req_subsys(req)->namespaces, nsid);
+ req->ns = xa_load(&subsys->namespaces, nsid);
if (unlikely(!req->ns)) {
req->error_loc = offsetof(struct nvme_common_command, nsid);
+ if (nvmet_subsys_nsid_exists(subsys, nsid))
+ return NVME_SC_INTERNAL_PATH_ERROR;
return NVME_SC_INVALID_NS | NVME_SC_DNR;
}
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 8cfd60f3b5648..15b00ed7be16a 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -530,6 +530,7 @@ void nvmet_subsys_disc_changed(struct nvmet_subsys *subsys,
struct nvmet_host *host);
void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
u8 event_info, u8 log_page);
+bool nvmet_subsys_nsid_exists(struct nvmet_subsys *subsys, u32 nsid);
#define NVMET_QUEUE_SIZE 1024
#define NVMET_NR_QUEUES 128
--
2.43.0
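One detail in the configfs.c hunk is worth a note: the patch formats a
u32 nsid with "%u" into a 4-byte buffer, which has room for at most
three digits plus the terminating NUL. The userspace sketch below shows
a bounded variant. toy_format_nsid() and TOY_NSID_NAME_LEN are
hypothetical names, and this is an illustration of buffer sizing, not
the code in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A u32 prints as at most 10 decimal digits, plus the terminating NUL. */
#define TOY_NSID_NAME_LEN 11

/* Returns 0 if nsid fit into name, -1 if it would have been truncated. */
static int toy_format_nsid(char *name, size_t len, uint32_t nsid)
{
	int n = snprintf(name, len, "%u", (unsigned int)nsid);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

With an 11-byte buffer every u32 value fits. With a 4-byte buffer the
helper reports truncation for any nsid above 999 instead of writing past
the end of the array.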