public inbox for linux-nvme@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd()
@ 2023-02-09 14:38 Hannes Reinecke
  2023-02-09 14:38 ` [PATCH 1/5] nvme: split __nvme_submit_sync_cmd() Hannes Reinecke
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Hi all,

Since Jens and hch complained that the argument list to __nvme_submit_sync_cmd()
has grown too long, here is a patchset to clean things up.
The first patch splits nvme_alloc_request() off from
__nvme_submit_sync_cmd(), which reduces the number of arguments;
it is also a nice cleanup, as the new helper can replace several
open-coded request allocations.
Then a patch to allow blk_rq_map_kern() to accept a NULL pointer; that's
really trivial and simplifies caller handling.
The third one is again pretty simple: it moves the 'result' handling into
nvme_execute_rq() rather than leaving it with the caller.
And with that we can open-code __nvme_submit_sync_cmd() in the callers,
and drop the function altogether.
The last patch to allow for retries of the authentication commands is
then pretty simple.
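For reference, the return convention stays the same throughout the series: a
sync command returns 0 on success, a negative Linux errno on a kernel-side
failure, and a positive NVMe status code from the controller. A small
userspace model of the resulting two-step call pattern (all names here are
illustrative stand-ins, not the kernel API):

```c
#include <stdlib.h>

/* Toy model of the reworked flow: allocation is split from submission,
 * so the submit path no longer needs queue/qid/flags arguments.
 * All names are illustrative stand-ins, not the kernel API. */

struct toy_request {
	int opcode;
	int qid;
};

static struct toy_request *toy_alloc_request(int opcode, int qid)
{
	struct toy_request *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	req->opcode = opcode;
	req->qid = qid;
	return req;
}

/* 0 on success, negative errno-style error, positive "controller status". */
static int toy_submit_sync_cmd(struct toy_request *req)
{
	int ret = req->opcode ? 0 : 1; /* opcode 0: pretend the controller rejected it */

	free(req);
	return ret;
}
```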

As usual, comments and reviews are welcome.

Hannes Reinecke (5):
  nvme: split __nvme_submit_sync_cmd()
  block: make blk_rq_map_kern() accept a NULL buffer
  nvme: move result handling into nvme_execute_rq()
  nvme: open-code __nvme_submit_sync_cmd()
  nvme: retry authentication commands if DNR status bit is not set

 block/blk-map.c                |  2 +-
 drivers/block/pktcdvd.c        | 10 ++--
 drivers/nvme/host/auth.c       | 31 ++++++++---
 drivers/nvme/host/core.c       | 96 +++++++++++++++++++++-------------
 drivers/nvme/host/fabrics.c    | 62 +++++++++++++++++-----
 drivers/nvme/host/ioctl.c      |  5 +-
 drivers/nvme/host/nvme.h       |  9 ++--
 drivers/nvme/host/pci.c        |  8 ++-
 drivers/nvme/target/passthru.c |  6 +--
 drivers/scsi/scsi_ioctl.c      |  8 ++-
 drivers/scsi/scsi_lib.c        | 11 ++--
 11 files changed, 155 insertions(+), 93 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/5] nvme: split __nvme_submit_sync_cmd()
  2023-02-09 14:38 [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd() Hannes Reinecke
@ 2023-02-09 14:38 ` Hannes Reinecke
  2023-02-13  6:19   ` Christoph Hellwig
  2023-02-09 14:38 ` [PATCH 2/5] block: make blk_rq_map_kern() accept a NULL buffer Hannes Reinecke
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Split an nvme_alloc_request() helper off from __nvme_submit_sync_cmd()
to reduce the number of arguments.
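The new nvme_alloc_request() follows the usual kernel idiom of encoding an
errno in the returned pointer, which callers unpack with IS_ERR()/PTR_ERR().
A minimal userspace sketch of that idiom (simplified from the kernel's err.h,
not the exact kernel macros):

```c
#include <stdint.h>

/* Simplified model of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR():
 * small negative errno values live in the top MAX_ERRNO bytes of the
 * address space, so a single pointer return can carry either a valid
 * object or an error code. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)error;
}

static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)ptr;
}
```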

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/auth.c       |  9 +++--
 drivers/nvme/host/core.c       | 60 ++++++++++++++++++++++------------
 drivers/nvme/host/fabrics.c    | 44 ++++++++++++++++++-------
 drivers/nvme/host/nvme.h       |  8 ++---
 drivers/nvme/host/pci.c        |  8 ++---
 drivers/nvme/target/passthru.c |  3 +-
 6 files changed, 85 insertions(+), 47 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 787537454f7f..05d02ab229f7 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -62,6 +62,7 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
 	struct nvme_command cmd = {};
 	blk_mq_req_flags_t flags = nvme_auth_flags_from_qid(qid);
 	struct request_queue *q = nvme_auth_queue_from_qid(ctrl, qid);
+	struct request *req;
 	int ret;
 
 	cmd.auth_common.opcode = nvme_fabrics_command;
@@ -76,9 +77,11 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
 		cmd.auth_receive.al = cpu_to_le32(data_len);
 	}
 
-	ret = __nvme_submit_sync_cmd(q, &cmd, NULL, data, data_len,
-				     qid == 0 ? NVME_QID_ANY : qid,
-				     0, flags);
+	req = nvme_alloc_request(q, &cmd, qid == 0 ? NVME_QID_ANY : qid, flags);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, NULL, data, data_len, 0);
 	if (ret > 0)
 		dev_warn(ctrl->device,
 			"qid %d auth_send failed with status %d\n", qid, ret);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index d1c9402389f9..ad2bcb412944 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1017,29 +1017,35 @@ int nvme_execute_rq(struct request *rq, bool at_head)
 }
 EXPORT_SYMBOL_NS_GPL(nvme_execute_rq, NVME_TARGET_PASSTHRU);
 
-/*
- * Returns 0 on success.  If the result is negative, it's a Linux error code;
- * if the result is positive, it's an NVM Express status code
- */
-int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
-		union nvme_result *result, void *buffer, unsigned bufflen,
-		int qid, int at_head, blk_mq_req_flags_t flags)
+struct request *nvme_alloc_request(struct request_queue *q,
+				   struct nvme_command *cmd, int qid,
+				   blk_mq_req_flags_t flags)
 {
 	struct request *req;
-	int ret;
 
 	if (qid == NVME_QID_ANY)
 		req = blk_mq_alloc_request(q, nvme_req_op(cmd), flags);
 	else
 		req = blk_mq_alloc_request_hctx(q, nvme_req_op(cmd), flags,
 						qid - 1);
+	if (!IS_ERR(req))
+		nvme_init_request(req, cmd);
 
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-	nvme_init_request(req, cmd);
+	return req;
+}
+EXPORT_SYMBOL_GPL(nvme_alloc_request);
+
+/*
+ * Returns 0 on success.  If the result is negative, it's a Linux error code;
+ * if the result is positive, it's an NVM Express status code
+ */
+int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
+			   void *buffer, unsigned bufflen, int at_head)
+{
+	int ret;
 
 	if (buffer && bufflen) {
-		ret = blk_rq_map_kern(q, req, buffer, bufflen, GFP_KERNEL);
+		ret = blk_rq_map_kern(req->q, req, buffer, bufflen, GFP_KERNEL);
 		if (ret)
 			goto out;
 	}
@@ -1056,8 +1062,13 @@ EXPORT_SYMBOL_GPL(__nvme_submit_sync_cmd);
 int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		void *buffer, unsigned bufflen)
 {
-	return __nvme_submit_sync_cmd(q, cmd, NULL, buffer, bufflen,
-			NVME_QID_ANY, 0, 0);
+	struct request *req;
+
+	req = nvme_alloc_request(q, cmd, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	return __nvme_submit_sync_cmd(req, NULL, buffer, bufflen, 0);
 }
 EXPORT_SYMBOL_GPL(nvme_submit_sync_cmd);
 
@@ -1199,15 +1210,14 @@ static void nvme_keep_alive_work(struct work_struct *work)
 		return;
 	}
 
-	rq = blk_mq_alloc_request(ctrl->admin_q, nvme_req_op(&ctrl->ka_cmd),
-				  BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd, NVME_QID_ANY,
+				BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
 	if (IS_ERR(rq)) {
 		/* allocation failure, reset the controller */
 		dev_err(ctrl->device, "keep-alive failed: %ld\n", PTR_ERR(rq));
 		nvme_reset_ctrl(ctrl);
 		return;
 	}
-	nvme_init_request(rq, &ctrl->ka_cmd);
 
 	rq->timeout = ctrl->kato * HZ;
 	rq->end_io = nvme_keep_alive_end_io;
@@ -1475,6 +1485,7 @@ static int nvme_ns_info_from_id_cs_indep(struct nvme_ctrl *ctrl,
 static int nvme_features(struct nvme_ctrl *dev, u8 op, unsigned int fid,
 		unsigned int dword11, void *buffer, size_t buflen, u32 *result)
 {
+	struct request *req;
 	union nvme_result res = { 0 };
 	struct nvme_command c = { };
 	int ret;
@@ -1483,8 +1494,11 @@ static int nvme_features(struct nvme_ctrl *dev, u8 op, unsigned int fid,
 	c.features.fid = cpu_to_le32(fid);
 	c.features.dword11 = cpu_to_le32(dword11);
 
-	ret = __nvme_submit_sync_cmd(dev->admin_q, &c, &res,
-			buffer, buflen, NVME_QID_ANY, 0, 0);
+	req = nvme_alloc_request(dev->admin_q, &c, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, &res, buffer, buflen, 0);
 	if (ret >= 0 && result)
 		*result = le32_to_cpu(res.u32);
 	return ret;
@@ -2206,6 +2220,7 @@ static int nvme_sec_submit(void *data, u16 spsp, u8 secp, void *buffer, size_t l
 {
 	struct nvme_ctrl *ctrl = data;
 	struct nvme_command cmd = { };
+	struct request *req;
 
 	if (send)
 		cmd.common.opcode = nvme_admin_security_send;
@@ -2215,8 +2230,11 @@ static int nvme_sec_submit(void *data, u16 spsp, u8 secp, void *buffer, size_t l
 	cmd.common.cdw10 = cpu_to_le32(((u32)secp) << 24 | ((u32)spsp) << 8);
 	cmd.common.cdw11 = cpu_to_le32(len);
 
-	return __nvme_submit_sync_cmd(ctrl->admin_q, &cmd, NULL, buffer, len,
-			NVME_QID_ANY, 1, 0);
+	req = nvme_alloc_request(ctrl->admin_q, &cmd, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	return __nvme_submit_sync_cmd(req, NULL, buffer, len, 1);
 }
 
 static void nvme_configure_opal(struct nvme_ctrl *ctrl, bool was_suspended)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index bbaa04a0c502..914784b611a2 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -144,6 +144,7 @@ EXPORT_SYMBOL_GPL(nvmf_get_address);
  */
 int nvmf_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
 {
+	struct request *req;
 	struct nvme_command cmd = { };
 	union nvme_result res;
 	int ret;
@@ -152,8 +153,11 @@ int nvmf_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
 	cmd.prop_get.fctype = nvme_fabrics_type_property_get;
 	cmd.prop_get.offset = cpu_to_le32(off);
 
-	ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, &res, NULL, 0,
-			NVME_QID_ANY, 0, 0);
+	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, &res, NULL, 0, 0);
 
 	if (ret >= 0)
 		*val = le64_to_cpu(res.u64);
@@ -189,6 +193,7 @@ EXPORT_SYMBOL_GPL(nvmf_reg_read32);
  */
 int nvmf_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 {
+	struct request *req;
 	struct nvme_command cmd = { };
 	union nvme_result res;
 	int ret;
@@ -198,8 +203,11 @@ int nvmf_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 	cmd.prop_get.attrib = 1;
 	cmd.prop_get.offset = cpu_to_le32(off);
 
-	ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, &res, NULL, 0,
-			NVME_QID_ANY, 0, 0);
+	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, &res, NULL, 0, 0);
 
 	if (ret >= 0)
 		*val = le64_to_cpu(res.u64);
@@ -234,6 +242,7 @@ EXPORT_SYMBOL_GPL(nvmf_reg_read64);
  */
 int nvmf_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
 {
+	struct request *req;
 	struct nvme_command cmd = { };
 	int ret;
 
@@ -243,8 +252,11 @@ int nvmf_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
 	cmd.prop_set.offset = cpu_to_le32(off);
 	cmd.prop_set.value = cpu_to_le64(val);
 
-	ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, NULL, NULL, 0,
-			NVME_QID_ANY, 0, 0);
+	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, NULL, NULL, 0, 0);
 	if (unlikely(ret))
 		dev_err(ctrl->device,
 			"Property Set error: %d, offset %#x\n",
@@ -374,6 +386,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 	struct nvme_command cmd = { };
 	union nvme_result res;
 	struct nvmf_connect_data *data;
+	struct request *req;
 	int ret;
 	u32 result;
 
@@ -399,9 +412,12 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 	strncpy(data->subsysnqn, ctrl->opts->subsysnqn, NVMF_NQN_SIZE);
 	strncpy(data->hostnqn, ctrl->opts->host->nqn, NVMF_NQN_SIZE);
 
-	ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, &res,
-			data, sizeof(*data), NVME_QID_ANY, 1,
-			BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY,
+				 BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, &res, data, sizeof(*data), 1);
 	if (ret) {
 		nvmf_log_connect_error(ctrl, ret, le32_to_cpu(res.u32),
 				       &cmd, data);
@@ -465,6 +481,7 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 	struct nvme_command cmd = { };
 	struct nvmf_connect_data *data;
 	union nvme_result res;
+	struct request *req;
 	int ret;
 	u32 result;
 
@@ -485,9 +502,12 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 	strncpy(data->subsysnqn, ctrl->opts->subsysnqn, NVMF_NQN_SIZE);
 	strncpy(data->hostnqn, ctrl->opts->host->nqn, NVMF_NQN_SIZE);
 
-	ret = __nvme_submit_sync_cmd(ctrl->connect_q, &cmd, &res,
-			data, sizeof(*data), qid, 1,
-			BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+	req = nvme_alloc_request(ctrl->connect_q, &cmd, qid,
+				 BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = __nvme_submit_sync_cmd(req, &res, data, sizeof(*data), 1);
 	if (ret) {
 		nvmf_log_connect_error(ctrl, ret, le32_to_cpu(res.u32),
 				       &cmd, data);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bf46f122e9e1..9b4b3bb69e27 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -813,10 +813,10 @@ static inline bool nvme_is_unique_nsid(struct nvme_ctrl *ctrl,
 
 int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		void *buf, unsigned bufflen);
-int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
-		union nvme_result *result, void *buffer, unsigned bufflen,
-		int qid, int at_head,
-		blk_mq_req_flags_t flags);
+struct request *nvme_alloc_request(struct request_queue *q,
+		struct nvme_command *cmd, int qid, blk_mq_req_flags_t flags);
+int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
+		void *buffer, unsigned bufflen, int at_head);
 int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
 		      u32 *result);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index a331fbfa9a66..94dcc9b65fec 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1376,13 +1376,12 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
 		 nvme_get_opcode_str(nvme_req(req)->cmd->common.opcode),
 		 nvmeq->qid);
 
-	abort_req = blk_mq_alloc_request(dev->ctrl.admin_q, nvme_req_op(&cmd),
-					 BLK_MQ_REQ_NOWAIT);
+	abort_req = nvme_alloc_request(dev->ctrl.admin_q, &cmd,
+			NVME_QID_ANY, BLK_MQ_REQ_NOWAIT);
 	if (IS_ERR(abort_req)) {
 		atomic_inc(&dev->ctrl.abort_limit);
 		return BLK_EH_RESET_TIMER;
 	}
-	nvme_init_request(abort_req, &cmd);
 
 	abort_req->end_io = abort_endio;
 	abort_req->end_io_data = NULL;
@@ -2391,10 +2390,9 @@ static int nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode)
 	cmd.delete_queue.opcode = opcode;
 	cmd.delete_queue.qid = cpu_to_le16(nvmeq->qid);
 
-	req = blk_mq_alloc_request(q, nvme_req_op(&cmd), BLK_MQ_REQ_NOWAIT);
+	req = nvme_alloc_request(q, &cmd, NVME_QID_ANY, BLK_MQ_REQ_NOWAIT);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
-	nvme_init_request(req, &cmd);
 
 	if (opcode == nvme_admin_delete_cq)
 		req->end_io = nvme_del_cq_end;
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 511c980d538d..aae6d5bb4fd8 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -316,12 +316,11 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
 		timeout = nvmet_req_subsys(req)->admin_timeout;
 	}
 
-	rq = blk_mq_alloc_request(q, nvme_req_op(req->cmd), 0);
+	rq = nvme_alloc_request(q, req->cmd, NVME_QID_ANY, 0);
 	if (IS_ERR(rq)) {
 		status = NVME_SC_INTERNAL;
 		goto out_put_ns;
 	}
-	nvme_init_request(rq, req->cmd);
 
 	if (timeout)
 		rq->timeout = timeout;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/5] block: make blk_rq_map_kern() accept a NULL buffer
  2023-02-09 14:38 [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd() Hannes Reinecke
  2023-02-09 14:38 ` [PATCH 1/5] nvme: split __nvme_submit_sync_cmd() Hannes Reinecke
@ 2023-02-09 14:38 ` Hannes Reinecke
  2023-02-13  6:21   ` Christoph Hellwig
  2023-02-13  9:49   ` Sagi Grimberg
  2023-02-09 14:38 ` [PATCH 3/5] nvme: move result handling into nvme_execute_rq() Hannes Reinecke
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

We really should accept a NULL buffer as an argument, avoiding conditional
code in the callers.
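The behavior change is tiny: a NULL buffer or zero length becomes a
successful no-op instead of -EINVAL. A before/after sketch (toy functions,
not the real blk_rq_map_kern()):

```c
#include <stddef.h>

/* Before: a NULL buffer or zero length was rejected outright. */
static int map_kern_old(const void *kbuf, size_t len)
{
	if (!len || !kbuf)
		return -22;	/* -EINVAL */
	return 0;		/* pretend the mapping succeeded */
}

/* After: the same case is a successful no-op, so callers can drop
 * their "if (buflen)" guards around the call. */
static int map_kern_new(const void *kbuf, size_t len)
{
	if (!len || !kbuf)
		return 0;	/* nothing to map */
	return 0;		/* pretend the mapping succeeded */
}
```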

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 block/blk-map.c           |  2 +-
 drivers/block/pktcdvd.c   | 10 ++++------
 drivers/nvme/host/core.c  |  8 +++-----
 drivers/scsi/scsi_ioctl.c |  8 +++-----
 drivers/scsi/scsi_lib.c   | 11 +++++------
 5 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 859590be077e..4615386b85d3 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -783,7 +783,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 	if (len > (queue_max_hw_sectors(q) << 9))
 		return -EINVAL;
 	if (!len || !kbuf)
-		return -EINVAL;
+		return 0;
 
 	if (!blk_rq_aligned(q, addr, len) || object_is_on_stack(kbuf) ||
 	    blk_queue_may_bounce(q))
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 2f1a92509271..9e49cca82bfd 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -694,12 +694,10 @@ static int pkt_generic_packet(struct pktcdvd_device *pd, struct packet_command *
 		return PTR_ERR(rq);
 	scmd = blk_mq_rq_to_pdu(rq);
 
-	if (cgc->buflen) {
-		ret = blk_rq_map_kern(q, rq, cgc->buffer, cgc->buflen,
-				      GFP_NOIO);
-		if (ret)
-			goto out;
-	}
+	ret = blk_rq_map_kern(q, rq, cgc->buffer, cgc->buflen,
+			      GFP_NOIO);
+	if (ret)
+		goto out;
 
 	scmd->cmd_len = COMMAND_SIZE(cgc->cmd[0]);
 	memcpy(scmd->cmnd, cgc->cmd, CDROM_PACKET_SIZE);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ad2bcb412944..78da9c6cbba8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1044,11 +1044,9 @@ int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
 {
 	int ret;
 
-	if (buffer && bufflen) {
-		ret = blk_rq_map_kern(req->q, req, buffer, bufflen, GFP_KERNEL);
-		if (ret)
-			goto out;
-	}
+	ret = blk_rq_map_kern(req->q, req, buffer, bufflen, GFP_KERNEL);
+	if (ret)
+		goto out;
 
 	ret = nvme_execute_rq(req, at_head);
 	if (result && ret >= 0)
diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
index 1126a265d5ee..1e182d71ce74 100644
--- a/drivers/scsi/scsi_ioctl.c
+++ b/drivers/scsi/scsi_ioctl.c
@@ -581,11 +581,9 @@ static int sg_scsi_ioctl(struct request_queue *q, fmode_t mode,
 		break;
 	}
 
-	if (bytes) {
-		err = blk_rq_map_kern(q, rq, buffer, bytes, GFP_NOIO);
-		if (err)
-			goto error;
-	}
+	err = blk_rq_map_kern(q, rq, buffer, bytes, GFP_NOIO);
+	if (err)
+		goto error;
 
 	blk_execute_rq(rq, false);
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9ed1ebcb7443..b15ddf6820a3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -221,12 +221,11 @@ int __scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	if (bufflen) {
-		ret = blk_rq_map_kern(sdev->request_queue, req,
-				      buffer, bufflen, GFP_NOIO);
-		if (ret)
-			goto out;
-	}
+	ret = blk_rq_map_kern(sdev->request_queue, req,
+			buffer, bufflen, GFP_NOIO);
+	if (ret)
+		goto out;
+
 	scmd = blk_mq_rq_to_pdu(req);
 	scmd->cmd_len = COMMAND_SIZE(cmd[0]);
 	memcpy(scmd->cmnd, cmd, scmd->cmd_len);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/5] nvme: move result handling into nvme_execute_rq()
  2023-02-09 14:38 [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd() Hannes Reinecke
  2023-02-09 14:38 ` [PATCH 1/5] nvme: split __nvme_submit_sync_cmd() Hannes Reinecke
  2023-02-09 14:38 ` [PATCH 2/5] block: make blk_rq_map_kern() accept a NULL buffer Hannes Reinecke
@ 2023-02-09 14:38 ` Hannes Reinecke
  2023-02-13  9:59   ` Sagi Grimberg
  2023-02-09 14:38 ` [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd() Hannes Reinecke
  2023-02-09 14:38 ` [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set Hannes Reinecke
  4 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

and simplify the calling convention.
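The point of the move: the executor itself copies the completion result into
an optional out-parameter, so callers no longer have to peek at the request
after execution. A toy model of the new contract (illustrative names only):

```c
#include <stddef.h>

/* Toy model of nvme_execute_rq() after the move: the optional result
 * out-parameter is filled by the executor, and the return value keeps
 * the 0 / negative errno / positive NVMe status convention. */
union toy_result {
	unsigned int u32;
	unsigned long long u64;
};

static int toy_execute_rq(int status, unsigned int cqe_dword0,
			  union toy_result *result)
{
	if (result && status >= 0)
		result->u32 = cqe_dword0; /* filled on success and on NVMe status */
	return status;
}
```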

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c       | 16 ++++++++++------
 drivers/nvme/host/ioctl.c      |  5 +++--
 drivers/nvme/host/nvme.h       |  3 ++-
 drivers/nvme/target/passthru.c |  3 +--
 4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 78da9c6cbba8..11faebe87764 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1004,15 +1004,21 @@ EXPORT_SYMBOL_GPL(nvme_setup_cmd);
  * >0: nvme controller's cqe status response
  * <0: kernel error in lieu of controller response
  */
-int nvme_execute_rq(struct request *rq, bool at_head)
+int nvme_execute_rq(struct request *rq, union nvme_result *result,
+		    bool at_head)
 {
 	blk_status_t status;
+	int ret;
 
 	status = blk_execute_rq(rq, at_head);
 	if (nvme_req(rq)->flags & NVME_REQ_CANCELLED)
 		return -EINTR;
-	if (nvme_req(rq)->status)
-		return nvme_req(rq)->status;
+	ret = nvme_req(rq)->status;
+	if (result && ret >= 0)
+		*result = nvme_req(rq)->result;
+	if (ret)
+		return ret;
+
 	return blk_status_to_errno(status);
 }
 EXPORT_SYMBOL_NS_GPL(nvme_execute_rq, NVME_TARGET_PASSTHRU);
@@ -1048,9 +1054,7 @@ int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
 	if (ret)
 		goto out;
 
-	ret = nvme_execute_rq(req, at_head);
-	if (result && ret >= 0)
-		*result = nvme_req(req)->result;
+	ret = nvme_execute_rq(req, result, at_head);
  out:
 	blk_mq_free_request(req);
 	return ret;
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 723e7d5b778f..cdba543cbd02 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -220,6 +220,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 		u64 *result, unsigned timeout, unsigned int flags)
 {
 	struct nvme_ns *ns = q->queuedata;
+	union nvme_result res;
 	struct nvme_ctrl *ctrl;
 	struct request *req;
 	void *meta = NULL;
@@ -243,9 +244,9 @@ static int nvme_submit_user_cmd(struct request_queue *q,
 	ctrl = nvme_req(req)->ctrl;
 
 	effects = nvme_passthru_start(ctrl, ns, cmd->common.opcode);
-	ret = nvme_execute_rq(req, false);
+	ret = nvme_execute_rq(req, &res, false);
 	if (result)
-		*result = le64_to_cpu(nvme_req(req)->result.u64);
+		*result = le64_to_cpu(res.u64);
 	if (meta)
 		ret = nvme_finish_user_metadata(req, meta_buffer, meta,
 						meta_len, ret);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9b4b3bb69e27..a63650b00d1a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -1071,7 +1071,8 @@ static inline void nvme_auth_free(struct nvme_ctrl *ctrl) {};
 u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 			 u8 opcode);
 u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode);
-int nvme_execute_rq(struct request *rq, bool at_head);
+int nvme_execute_rq(struct request *rq, union nvme_result *result,
+		    bool at_head);
 void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects,
 		       struct nvme_command *cmd, int status);
 struct nvme_ctrl *nvme_ctrl_from_file(struct file *file);
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index aae6d5bb4fd8..f5ab4d7a8c1f 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -221,7 +221,7 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
 	int status;
 
 	effects = nvme_passthru_start(ctrl, ns, req->cmd->common.opcode);
-	status = nvme_execute_rq(rq, false);
+	status = nvme_execute_rq(rq, &req->cqe->result, false);
 	if (status == NVME_SC_SUCCESS &&
 	    req->cmd->common.opcode == nvme_admin_identify) {
 		switch (req->cmd->identify.cns) {
@@ -238,7 +238,6 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
 	} else if (status < 0)
 		status = NVME_SC_INTERNAL;
 
-	req->cqe->result = nvme_req(rq)->result;
 	nvmet_req_complete(req, status);
 	blk_mq_free_request(rq);
 
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd()
  2023-02-09 14:38 [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd() Hannes Reinecke
                   ` (2 preceding siblings ...)
  2023-02-09 14:38 ` [PATCH 3/5] nvme: move result handling into nvme_execute_rq() Hannes Reinecke
@ 2023-02-09 14:38 ` Hannes Reinecke
  2023-02-13  6:26   ` Christoph Hellwig
  2023-02-09 14:38 ` [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set Hannes Reinecke
  4 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

The argument list of __nvme_submit_sync_cmd() really got overloaded
trying to abstract away the various use cases. Plus it's a really
simple function now, so open-code it in the callers and drop the function.
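After the series, each caller carries the same short open-coded sequence:
allocate a request, map an optional buffer, execute, free. Sketched as a
standalone toy (illustrative names, not the kernel functions):

```c
#include <stdlib.h>

struct toy_req { int opcode; };

/* A NULL buffer or zero length is a successful no-op (patch 2). */
static int toy_map(struct toy_req *req, const void *buf, size_t len)
{
	(void)req;
	if (!len || !buf)
		return 0;
	return 0;		/* pretend the mapping succeeded */
}

static int toy_execute(struct toy_req *req)
{
	(void)req;
	return 0;		/* pretend the command completed */
}

/* The open-coded pattern that replaces __nvme_submit_sync_cmd(). */
static int toy_sync_cmd(const void *buf, size_t len)
{
	struct toy_req *req = malloc(sizeof(*req));
	int ret;

	if (!req)
		return -12;	/* -ENOMEM */
	ret = toy_map(req, buf, len);
	if (!ret)
		ret = toy_execute(req);
	free(req);
	return ret;
}
```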

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/auth.c    | 22 ++++++++++----
 drivers/nvme/host/core.c    | 42 ++++++++++++++-------------
 drivers/nvme/host/fabrics.c | 58 +++++++++++++++++++++++--------------
 drivers/nvme/host/nvme.h    |  2 --
 4 files changed, 74 insertions(+), 50 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index 05d02ab229f7..ef5dcd885b0c 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -81,13 +81,23 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	ret = __nvme_submit_sync_cmd(req, NULL, data, data_len, 0);
-	if (ret > 0)
-		dev_warn(ctrl->device,
-			"qid %d auth_send failed with status %d\n", qid, ret);
-	else if (ret < 0)
+	ret = blk_rq_map_kern(req->q, req, data, data_len, GFP_KERNEL);
+	if (ret) {
 		dev_err(ctrl->device,
-			"qid %d auth_send failed with error %d\n", qid, ret);
+			"qid %d auth_submit failed to map, error %d\n",
+			qid, ret);
+	} else {
+		ret = nvme_execute_rq(req, NULL, false);
+		if (ret > 0)
+			dev_warn(ctrl->device,
+				 "qid %d auth_submit failed with status %d\n",
+				 qid, ret);
+		else if (ret < 0)
+			dev_err(ctrl->device,
+				"qid %d auth_submit failed with error %d\n",
+				qid, ret);
+	}
+	blk_mq_free_request(req);
 	return ret;
 }
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 11faebe87764..160261a76f27 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1021,7 +1021,7 @@ int nvme_execute_rq(struct request *rq, union nvme_result *result,
 	}
 	return blk_status_to_errno(status);
 }
-EXPORT_SYMBOL_NS_GPL(nvme_execute_rq, NVME_TARGET_PASSTHRU);
+EXPORT_SYMBOL_GPL(nvme_execute_rq);
 
 struct request *nvme_alloc_request(struct request_queue *q,
 				   struct nvme_command *cmd, int qid,
@@ -1045,32 +1045,22 @@ EXPORT_SYMBOL_GPL(nvme_alloc_request);
  * Returns 0 on success.  If the result is negative, it's a Linux error code;
  * if the result is positive, it's an NVM Express status code
  */
-int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
-			   void *buffer, unsigned bufflen, int at_head)
-{
-	int ret;
-
-	ret = blk_rq_map_kern(req->q, req, buffer, bufflen, GFP_KERNEL);
-	if (ret)
-		goto out;
-
-	ret = nvme_execute_rq(req, result, at_head);
- out:
-	blk_mq_free_request(req);
-	return ret;
-}
-EXPORT_SYMBOL_GPL(__nvme_submit_sync_cmd);
-
 int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		void *buffer, unsigned bufflen)
 {
 	struct request *req;
+	int ret;
 
 	req = nvme_alloc_request(q, cmd, NVME_QID_ANY, 0);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	return __nvme_submit_sync_cmd(req, NULL, buffer, bufflen, 0);
+	ret = blk_rq_map_kern(req->q, req, buffer, bufflen, GFP_KERNEL);
+	if (!ret)
+		ret = nvme_execute_rq(req, NULL, false);
+
+	blk_mq_free_request(req);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_submit_sync_cmd);
 
@@ -1500,7 +1490,13 @@ static int nvme_features(struct nvme_ctrl *dev, u8 op, unsigned int fid,
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	ret = __nvme_submit_sync_cmd(req, &res, buffer, buflen, 0);
+	ret = blk_rq_map_kern(req->q, req, buffer, buflen, GFP_KERNEL);
+	if (ret)
+		goto out;
+
+	ret = nvme_execute_rq(req, &res, false);
+ out:
+	blk_mq_free_request(req);
 	if (ret >= 0 && result)
 		*result = le32_to_cpu(res.u32);
 	return ret;
@@ -2223,6 +2219,7 @@ static int nvme_sec_submit(void *data, u16 spsp, u8 secp, void *buffer, size_t l
 	struct nvme_ctrl *ctrl = data;
 	struct nvme_command cmd = { };
 	struct request *req;
+	int ret;
 
 	if (send)
 		cmd.common.opcode = nvme_admin_security_send;
@@ -2236,7 +2233,12 @@ static int nvme_sec_submit(void *data, u16 spsp, u8 secp, void *buffer, size_t l
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	return __nvme_submit_sync_cmd(req, NULL, buffer, len, 1);
+	ret = blk_rq_map_kern(req->q, req, buffer, len, GFP_KERNEL);
+	if (!ret)
+		ret = nvme_execute_rq(req, NULL, true);
+
+	blk_mq_free_request(req);
+	return ret;
 }
 
 static void nvme_configure_opal(struct nvme_ctrl *ctrl, bool was_suspended)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 914784b611a2..49299fcdcf68 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -121,6 +121,31 @@ int nvmf_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
 }
 EXPORT_SYMBOL_GPL(nvmf_get_address);
 
+/**
+ * nvmf_submit_fabrics_cmd() - Submit NVMe Fabrics command
+ * @ctrl:	Host NVMe controller instance
+ * @cmd:	NVMe command to submit
+ * @res:	NVMe result to be returned (may be NULL)
+ */
+static int nvmf_submit_fabrics_cmd(struct nvme_ctrl *ctrl,
+				   struct nvme_command *cmd,
+				   union nvme_result *res)
+{
+	struct request *req;
+	int ret;
+
+	req = nvme_alloc_request(ctrl->fabrics_q, cmd, NVME_QID_ANY, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	ret = blk_rq_map_kern(req->q, req, NULL, 0, GFP_KERNEL);
+	if (!ret)
+		ret = nvme_execute_rq(req, res, false);
+
+	blk_mq_free_request(req);
+	return ret;
+}
+
 /**
  * nvmf_reg_read32() -  NVMe Fabrics "Property Get" API function.
  * @ctrl:	Host NVMe controller instance maintaining the admin
@@ -144,7 +169,6 @@ EXPORT_SYMBOL_GPL(nvmf_get_address);
  */
 int nvmf_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
 {
-	struct request *req;
 	struct nvme_command cmd = { };
 	union nvme_result res;
 	int ret;
@@ -153,12 +177,7 @@ int nvmf_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
 	cmd.prop_get.fctype = nvme_fabrics_type_property_get;
 	cmd.prop_get.offset = cpu_to_le32(off);
 
-	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY, 0);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	ret = __nvme_submit_sync_cmd(req, &res, NULL, 0, 0);
-
+	ret = nvmf_submit_fabrics_cmd(ctrl, &cmd, &res);
 	if (ret >= 0)
 		*val = le64_to_cpu(res.u64);
 	if (unlikely(ret != 0))
@@ -193,7 +212,6 @@ EXPORT_SYMBOL_GPL(nvmf_reg_read32);
  */
 int nvmf_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 {
-	struct request *req;
 	struct nvme_command cmd = { };
 	union nvme_result res;
 	int ret;
@@ -203,12 +221,7 @@ int nvmf_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 	cmd.prop_get.attrib = 1;
 	cmd.prop_get.offset = cpu_to_le32(off);
 
-	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY, 0);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	ret = __nvme_submit_sync_cmd(req, &res, NULL, 0, 0);
-
+	ret = nvmf_submit_fabrics_cmd(ctrl, &cmd, &res);
 	if (ret >= 0)
 		*val = le64_to_cpu(res.u64);
 	if (unlikely(ret != 0))
@@ -242,7 +255,6 @@ EXPORT_SYMBOL_GPL(nvmf_reg_read64);
  */
 int nvmf_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
 {
-	struct request *req;
 	struct nvme_command cmd = { };
 	int ret;
 
@@ -252,11 +264,7 @@ int nvmf_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
 	cmd.prop_set.offset = cpu_to_le32(off);
 	cmd.prop_set.value = cpu_to_le64(val);
 
-	req = nvme_alloc_request(ctrl->fabrics_q, &cmd, NVME_QID_ANY, 0);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	ret = __nvme_submit_sync_cmd(req, NULL, NULL, 0, 0);
+	ret = nvmf_submit_fabrics_cmd(ctrl, &cmd, NULL);
 	if (unlikely(ret))
 		dev_err(ctrl->device,
 			"Property Set error: %d, offset %#x\n",
@@ -417,7 +425,10 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	ret = __nvme_submit_sync_cmd(req, &res, data, sizeof(*data), 1);
+	ret = blk_rq_map_kern(req->q, req, data, sizeof(*data), GFP_KERNEL);
+	if (!ret)
+		ret = nvme_execute_rq(req, &res, true);
+	blk_mq_free_request(req);
 	if (ret) {
 		nvmf_log_connect_error(ctrl, ret, le32_to_cpu(res.u32),
 				       &cmd, data);
@@ -507,7 +518,10 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	ret = __nvme_submit_sync_cmd(req, &res, data, sizeof(*data), 1);
+	ret = blk_rq_map_kern(req->q, req, data, sizeof(*data), GFP_KERNEL);
+	if (!ret)
+		ret = nvme_execute_rq(req, &res, true);
+	blk_mq_free_request(req);
 	if (ret) {
 		nvmf_log_connect_error(ctrl, ret, le32_to_cpu(res.u32),
 				       &cmd, data);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a63650b00d1a..8f02d981644c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -815,8 +815,6 @@ int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		void *buf, unsigned bufflen);
 struct request *nvme_alloc_request(struct request_queue *q,
 		struct nvme_command *cmd, int qid, blk_mq_req_flags_t flags);
-int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
-		void *buffer, unsigned bufflen, int at_head);
 int nvme_set_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
 		      u32 *result);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 22+ messages in thread
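The nvmf_submit_fabrics_cmd() helper above reduces every synchronous submission to the same four steps: allocate a request, map a (possibly NULL) buffer, execute, free. A rough userspace model of that flow — all names hypothetical, nothing here is kernel API — looks like:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for blk_mq request handling. */
struct mock_request { int mapped_len; int executed; };

static struct mock_request *mock_alloc_request(void)
{
	return calloc(1, sizeof(struct mock_request));
}

static int mock_map_kern(struct mock_request *req, const void *buf, int len)
{
	/* Mirrors patch 2/5: a NULL buffer is accepted and maps nothing. */
	if (!buf || !len)
		return 0;
	req->mapped_len = len;
	return 0;
}

static int mock_execute(struct mock_request *req, int *result)
{
	req->executed = 1;
	if (result)
		*result = 42;	/* stand-in for the CQE result */
	return 0;		/* 0 success, >0 CQE status, <0 -errno */
}

static int mock_submit_sync(const void *buf, int len, int *result)
{
	struct mock_request *req = mock_alloc_request();
	int ret;

	if (!req)
		return -12;	/* -ENOMEM */
	ret = mock_map_kern(req, buf, len);
	if (!ret)
		ret = mock_execute(req, result);
	free(req);		/* request is freed on every path */
	return ret;
}
```

The point of the shape is that mapping and execution share one error path and the request is freed exactly once.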

* [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-09 14:38 [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd() Hannes Reinecke
                   ` (3 preceding siblings ...)
  2023-02-09 14:38 ` [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd() Hannes Reinecke
@ 2023-02-09 14:38 ` Hannes Reinecke
  2023-02-13 10:14   ` Sagi Grimberg
  4 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-09 14:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Clear the FAILFAST_DRIVER bit for authentication commands
allowing them to be retried in nvme_decide_disposition() if the DNR
bit is not set in the command result.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/auth.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index ef5dcd885b0c..935f340607a7 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
 			"qid %d auth_submit failed to map, error %d\n",
 			qid, ret);
 	} else {
+		/* Clear failfast flag to allow for retries */
+		req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
 		ret = nvme_execute_rq(req, NULL, false);
 		if (ret > 0)
 			dev_warn(ctrl->device,
-- 
2.35.3



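The patch relies on nvme_decide_disposition() retrying a completion only when the controller left the DNR bit clear and the request is not marked failfast. A userspace sketch of that decision — bit values illustrative, not the kernel's — could look like:

```c
#include <assert.h>

#define MOCK_SC_DNR		0x4000	/* "Do Not Retry" status bit */
#define MOCK_FAILFAST_DRIVER	0x1	/* the flag patch 5 clears */

enum mock_disp { MOCK_COMPLETE, MOCK_RETRY };

/* Retry only if: command failed, DNR is clear, the request is not
 * failfast, and the retry budget is not exhausted. */
static enum mock_disp mock_decide_disposition(unsigned short status,
					      unsigned cmd_flags,
					      int retries, int max_retries)
{
	if (!status)
		return MOCK_COMPLETE;
	if (status & MOCK_SC_DNR)
		return MOCK_COMPLETE;
	if (cmd_flags & MOCK_FAILFAST_DRIVER)
		return MOCK_COMPLETE;
	if (retries >= max_retries)
		return MOCK_COMPLETE;
	return MOCK_RETRY;
}
```

With REQ_FAILFAST_DRIVER set (the old behaviour) a failed authentication command completes immediately; clearing it makes a DNR-clear failure eligible for retry.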

* Re: [PATCH 1/5] nvme: split __nvme_submit_sync_cmd()
  2023-02-09 14:38 ` [PATCH 1/5] nvme: split __nvme_submit_sync_cmd() Hannes Reinecke
@ 2023-02-13  6:19   ` Christoph Hellwig
  2023-02-13  9:47     ` Sagi Grimberg
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2023-02-13  6:19 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme

> +struct request *nvme_alloc_request(struct request_queue *q,
> +				   struct nvme_command *cmd, int qid,
> +				   blk_mq_req_flags_t flags)
>  {
>  	struct request *req;
> -	int ret;
>  
>  	if (qid == NVME_QID_ANY)
>  		req = blk_mq_alloc_request(q, nvme_req_op(cmd), flags);
>  	else
>  		req = blk_mq_alloc_request_hctx(q, nvme_req_op(cmd), flags,
>  						qid - 1);
> +	if (!IS_ERR(req))

As already said last time, please split out a nvme_alloc_request_qid
function for the few caller that pass a qid.



* Re: [PATCH 2/5] block: make blk_rq_map_kern() to accept a NULL buffer
  2023-02-09 14:38 ` [PATCH 2/5] block: make blk_rq_map_kern() to accept a NULL buffer Hannes Reinecke
@ 2023-02-13  6:21   ` Christoph Hellwig
  2023-02-13  9:49   ` Sagi Grimberg
  1 sibling, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2023-02-13  6:21 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

but you really need to Cc Jens and linux-block for this one.



* Re: [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd()
  2023-02-09 14:38 ` [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd() Hannes Reinecke
@ 2023-02-13  6:26   ` Christoph Hellwig
  2023-02-13 10:07     ` Sagi Grimberg
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2023-02-13  6:26 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme

On Thu, Feb 09, 2023 at 03:38:19PM +0100, Hannes Reinecke wrote:
> +	if (ret) {
>  		dev_err(ctrl->device,
> -			"qid %d auth_send failed with error %d\n", qid, ret);
> +			"qid %d auth_submit failed to map, error %d\n",
> +			qid, ret);

Please do a goto out_free_request here instead of indenting the
normal flow in the else path.  The same comment applies in a bunch of
other places.

> -	ret = __nvme_submit_sync_cmd(req, &res, buffer, buflen, 0);
> +	ret = blk_rq_map_kern(req->q, req, buffer, buflen, GFP_KERNEL);
> +	if (ret)
> +		goto out;
> +
> +	ret = nvme_execute_rq(req, &res, false);

This could use nvmf_submit_fabrics_cmd, which has nothing to do with
fabrics, but is just a variant of nvme_submit_sync_cmd that returns the
result.  So if we have that helper anyway we might as well move it to
the core and turn nvme_submit_sync_cmd into a wrapper for it.


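The goto-out_free_request shape asked for above might be sketched like this (hypothetical names; the point is the unindented success path and the single free):

```c
#include <assert.h>

struct mock_req { int freed; };

static int mock_map(struct mock_req *req, int map_ret)
{
	(void)req;
	return map_ret;		/* knob: nonzero simulates a map failure */
}

static int mock_exec(struct mock_req *req)
{
	(void)req;
	return 0;
}

/* goto-based unwinding keeps the normal flow unindented and still
 * frees the request on every path. */
static int submit(struct mock_req *req, int map_ret)
{
	int ret;

	ret = mock_map(req, map_ret);
	if (ret)
		goto out_free_request;

	ret = mock_exec(req);

out_free_request:
	req->freed = 1;		/* stands in for blk_mq_free_request() */
	return ret;
}

static int run_submit(int map_ret)
{
	struct mock_req req = { 0 };
	int ret = submit(&req, map_ret);

	return req.freed ? ret : -1000;	/* -1000 flags a leaked request */
}
```

Both the failure and success paths end at the same label, so the free cannot be forgotten when more steps are added between map and execute.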

* Re: [PATCH 1/5] nvme: split __nvme_submit_sync_cmd()
  2023-02-13  6:19   ` Christoph Hellwig
@ 2023-02-13  9:47     ` Sagi Grimberg
  0 siblings, 0 replies; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13  9:47 UTC (permalink / raw)
  To: Christoph Hellwig, Hannes Reinecke; +Cc: Keith Busch, linux-nvme


>> +struct request *nvme_alloc_request(struct request_queue *q,
>> +				   struct nvme_command *cmd, int qid,
>> +				   blk_mq_req_flags_t flags)
>>   {
>>   	struct request *req;
>> -	int ret;
>>   
>>   	if (qid == NVME_QID_ANY)
>>   		req = blk_mq_alloc_request(q, nvme_req_op(cmd), flags);
>>   	else
>>   		req = blk_mq_alloc_request_hctx(q, nvme_req_op(cmd), flags,
>>   						qid - 1);
>> +	if (!IS_ERR(req))
> 
> As already said last time, please split out a nvme_alloc_request_qid
> function for the few caller that pass a qid.

Agreed.



* Re: [PATCH 2/5] block: make blk_rq_map_kern() to accept a NULL buffer
  2023-02-09 14:38 ` [PATCH 2/5] block: make blk_rq_map_kern() to accept a NULL buffer Hannes Reinecke
  2023-02-13  6:21   ` Christoph Hellwig
@ 2023-02-13  9:49   ` Sagi Grimberg
  1 sibling, 0 replies; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13  9:49 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>



* Re: [PATCH 3/5] nvme: move result handling into nvme_execute_rq()
  2023-02-09 14:38 ` [PATCH 3/5] nvme: move result handling into nvme_execute_rq() Hannes Reinecke
@ 2023-02-13  9:59   ` Sagi Grimberg
  2023-02-13 10:04     ` Hannes Reinecke
  0 siblings, 1 reply; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13  9:59 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 2/9/23 16:38, Hannes Reinecke wrote:
> and simplify the calling convention.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/core.c       | 16 ++++++++++------
>   drivers/nvme/host/ioctl.c      |  5 +++--
>   drivers/nvme/host/nvme.h       |  3 ++-
>   drivers/nvme/target/passthru.c |  3 +--
>   4 files changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 78da9c6cbba8..11faebe87764 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1004,15 +1004,21 @@ EXPORT_SYMBOL_GPL(nvme_setup_cmd);
>    * >0: nvme controller's cqe status response
>    * <0: kernel error in lieu of controller response
>    */
> -int nvme_execute_rq(struct request *rq, bool at_head)
> +int nvme_execute_rq(struct request *rq, union nvme_result *result,
> +		    bool at_head)
>   {
>   	blk_status_t status;
> +	int ret;
>   
>   	status = blk_execute_rq(rq, at_head);
>   	if (nvme_req(rq)->flags & NVME_REQ_CANCELLED)
>   		return -EINTR;
> -	if (nvme_req(rq)->status)
> -		return nvme_req(rq)->status;
> +	ret = nvme_req(rq)->status;
> +	if (ret) {
> +		if (result && ret >= 0)

How can ret be < 0 here?

Why not always take the result, unconditionally?

Is this really necessary though?

> +			*result = nvme_req(rq)->result;
> +		return ret;
> +	}
>   	return blk_status_to_errno(status);
>   }
>   EXPORT_SYMBOL_NS_GPL(nvme_execute_rq, NVME_TARGET_PASSTHRU);
> @@ -1048,9 +1054,7 @@ int __nvme_submit_sync_cmd(struct request *req, union nvme_result *result,
>   	if (ret)
>   		goto out;
>   
> -	ret = nvme_execute_rq(req, at_head);
> -	if (result && ret >= 0)
> -		*result = nvme_req(req)->result;
> +	ret = nvme_execute_rq(req, result, at_head);
>    out:
>   	blk_mq_free_request(req);
>   	return ret;
> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> index 723e7d5b778f..cdba543cbd02 100644
> --- a/drivers/nvme/host/ioctl.c
> +++ b/drivers/nvme/host/ioctl.c
> @@ -220,6 +220,7 @@ static int nvme_submit_user_cmd(struct request_queue *q,
>   		u64 *result, unsigned timeout, unsigned int flags)
>   {
>   	struct nvme_ns *ns = q->queuedata;
> +	union nvme_result res;
>   	struct nvme_ctrl *ctrl;
>   	struct request *req;
>   	void *meta = NULL;
> @@ -243,9 +244,9 @@ static int nvme_submit_user_cmd(struct request_queue *q,
>   	ctrl = nvme_req(req)->ctrl;
>   
>   	effects = nvme_passthru_start(ctrl, ns, cmd->common.opcode);
> -	ret = nvme_execute_rq(req, false);
> +	ret = nvme_execute_rq(req, &res, false);
>   	if (result)
> -		*result = le64_to_cpu(nvme_req(req)->result.u64);
> +		*result = le64_to_cpu(res.u64);
>   	if (meta)
>   		ret = nvme_finish_user_metadata(req, meta_buffer, meta,
>   						meta_len, ret);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9b4b3bb69e27..a63650b00d1a 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -1071,7 +1071,8 @@ static inline void nvme_auth_free(struct nvme_ctrl *ctrl) {};
>   u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
>   			 u8 opcode);
>   u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode);
> -int nvme_execute_rq(struct request *rq, bool at_head);
> +int nvme_execute_rq(struct request *rq, union nvme_result *result,
> +		    bool at_head);
>   void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects,
>   		       struct nvme_command *cmd, int status);
>   struct nvme_ctrl *nvme_ctrl_from_file(struct file *file);
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index aae6d5bb4fd8..f5ab4d7a8c1f 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -221,7 +221,7 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
>   	int status;
>   
>   	effects = nvme_passthru_start(ctrl, ns, req->cmd->common.opcode);
> -	status = nvme_execute_rq(rq, false);
> +	status = nvme_execute_rq(rq, &req->cqe->result, false);
>   	if (status == NVME_SC_SUCCESS &&
>   	    req->cmd->common.opcode == nvme_admin_identify) {
>   		switch (req->cmd->identify.cns) {
> @@ -238,7 +238,6 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
>   	} else if (status < 0)
>   		status = NVME_SC_INTERNAL;
>   
> -	req->cqe->result = nvme_req(rq)->result;
>   	nvmet_req_complete(req, status);
>   	blk_mq_free_request(rq);
>   


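The return convention under discussion — <0 kernel error, >0 CQE status, 0 success — and why the 'ret >= 0' check is redundant can be modeled in userspace roughly as follows (illustrative only; in the kernel, nvme_req(rq)->status is a u16 and so can never be negative):

```c
#include <assert.h>

#define MOCK_REQ_CANCELLED 0x1

struct mock_rq {
	unsigned flags;
	unsigned short status;	/* u16: once assigned, ret >= 0 always holds */
	int result;
};

static int mock_execute_rq(struct mock_rq *rq, int *result)
{
	int ret;

	if (rq->flags & MOCK_REQ_CANCELLED)
		return -4;	/* -EINTR */
	ret = rq->status;
	if (ret) {
		if (result)	/* the extra 'ret >= 0' guard is always true */
			*result = rq->result;
		return ret;	/* >0: controller's CQE status */
	}
	return 0;
}

static int exec_status(unsigned flags, unsigned short status)
{
	struct mock_rq rq = { .flags = flags, .status = status, .result = 7 };

	return mock_execute_rq(&rq, (int *)0);
}
```

Only the CANCELLED branch can return a negative value, and it returns before the result copy is reached, which is the point behind Sagi's question.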

* Re: [PATCH 3/5] nvme: move result handling into nvme_execute_rq()
  2023-02-13  9:59   ` Sagi Grimberg
@ 2023-02-13 10:04     ` Hannes Reinecke
  2023-02-13 10:08       ` Sagi Grimberg
  0 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-13 10:04 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

On 2/13/23 10:59, Sagi Grimberg wrote:
> 
> 
> On 2/9/23 16:38, Hannes Reinecke wrote:
>> and simplify the calling convention.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/core.c       | 16 ++++++++++------
>>   drivers/nvme/host/ioctl.c      |  5 +++--
>>   drivers/nvme/host/nvme.h       |  3 ++-
>>   drivers/nvme/target/passthru.c |  3 +--
>>   4 files changed, 16 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 78da9c6cbba8..11faebe87764 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -1004,15 +1004,21 @@ EXPORT_SYMBOL_GPL(nvme_setup_cmd);
>>    * >0: nvme controller's cqe status response
>>    * <0: kernel error in lieu of controller response
>>    */
>> -int nvme_execute_rq(struct request *rq, bool at_head)
>> +int nvme_execute_rq(struct request *rq, union nvme_result *result,
>> +            bool at_head)
>>   {
>>       blk_status_t status;
>> +    int ret;
>>       status = blk_execute_rq(rq, at_head);
>>       if (nvme_req(rq)->flags & NVME_REQ_CANCELLED)
>>           return -EINTR;
>> -    if (nvme_req(rq)->status)
>> -        return nvme_req(rq)->status;
>> +    ret = nvme_req(rq)->status;
>> +    if (ret) {
>> +        if (result && ret >= 0)
> 
> How can ret be < 0 here?
> 
> Why not always take the result, unconditionally?
> 
Yeah, right. Will be fixing it up.

> Is this really necessary though?
> 
??
Yes, it is necessary to return the status (if there is one).
Quite a few callers expect that.
And no, it's probably not necessary to check if there is a status;
we might always return a status here. But for the result I guess we
should be checking the status first; I'm slightly hesitant to blindly
return the result even if we have a zero status.

Cheers,

Hannes




* Re: [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd()
  2023-02-13  6:26   ` Christoph Hellwig
@ 2023-02-13 10:07     ` Sagi Grimberg
  0 siblings, 0 replies; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13 10:07 UTC (permalink / raw)
  To: Christoph Hellwig, Hannes Reinecke; +Cc: Keith Busch, linux-nvme


>> -	ret = __nvme_submit_sync_cmd(req, &res, buffer, buflen, 0);
>> +	ret = blk_rq_map_kern(req->q, req, buffer, buflen, GFP_KERNEL);
>> +	if (ret)
>> +		goto out;
>> +
>> +	ret = nvme_execute_rq(req, &res, false);
> 
> This could use nvmf_submit_fabrics_cmd, which has nothing to with
> fabrics, but just is a variant of nvme_submit_sync_cmd that returns the
> result.

I think that the distinction is that it operates on the fabrics request
queue as well.

>  So if we have that helper anyway we might as well move it to
> the core and turn nvme_submit_sync_cmd into a wrapper for it.

I was thinking the other way around, nvmf_submit_fabrics_cmd wraps
nvme_submit_sync_cmd with ctrl->fabrics_q, and we can add



* Re: [PATCH 3/5] nvme: move result handling into nvme_execute_rq()
  2023-02-13 10:04     ` Hannes Reinecke
@ 2023-02-13 10:08       ` Sagi Grimberg
  0 siblings, 0 replies; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13 10:08 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme



On 2/13/23 12:04, Hannes Reinecke wrote:
> On 2/13/23 10:59, Sagi Grimberg wrote:
>>
>>
>> On 2/9/23 16:38, Hannes Reinecke wrote:
>>> and simplify the calling convention.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>>   drivers/nvme/host/core.c       | 16 ++++++++++------
>>>   drivers/nvme/host/ioctl.c      |  5 +++--
>>>   drivers/nvme/host/nvme.h       |  3 ++-
>>>   drivers/nvme/target/passthru.c |  3 +--
>>>   4 files changed, 16 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 78da9c6cbba8..11faebe87764 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -1004,15 +1004,21 @@ EXPORT_SYMBOL_GPL(nvme_setup_cmd);
>>>    * >0: nvme controller's cqe status response
>>>    * <0: kernel error in lieu of controller response
>>>    */
>>> -int nvme_execute_rq(struct request *rq, bool at_head)
>>> +int nvme_execute_rq(struct request *rq, union nvme_result *result,
>>> +            bool at_head)
>>>   {
>>>       blk_status_t status;
>>> +    int ret;
>>>       status = blk_execute_rq(rq, at_head);
>>>       if (nvme_req(rq)->flags & NVME_REQ_CANCELLED)
>>>           return -EINTR;
>>> -    if (nvme_req(rq)->status)
>>> -        return nvme_req(rq)->status;
>>> +    ret = nvme_req(rq)->status;
>>> +    if (ret) {
>>> +        if (result && ret >= 0)
>>
>> How can ret be < 0 here?
>>
>> Why not always take the result, unconditionally?
>>
> Yeah, right. Will be fixing it up.
> 
>> Is this really necessary though?
>>
> ??
> Yes, it is necessary to return the status (if there is one).
> Quite some callers expect that.
> And no, it's probably not necessary to check if there is a status;
> we might always return a status here. But for the result I guess we
> should be checking the status first; I'm slightly hesitant to blindly
> return the result even if we have a zero status.

I meant the simplification altogether. It really forces callers
to pass an nvme_result (or NULL) and convert back. I didn't see that
it dramatically simplifies the code.



* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-09 14:38 ` [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set Hannes Reinecke
@ 2023-02-13 10:14   ` Sagi Grimberg
  2023-02-13 10:28     ` Hannes Reinecke
  0 siblings, 1 reply; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13 10:14 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme

> Clear the FAILFAST_DRIVER bit for authentication commands
> allowing them to be retried in nvme_decide_disposition() if the DNR
> bit is not set in the command result.

Can you please explain what this is fixing? And is there a test
to reproduce this?

> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/auth.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index ef5dcd885b0c..935f340607a7 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
>   			"qid %d auth_submit failed to map, error %d\n",
>   			qid, ret);
>   	} else {
> +		/* Clear failfast flag to allow for retries */
> +		req->cmd_flags &= ~REQ_FAILFAST_DRIVER;

This should come right after the allocation.
But what makes this more special than any other fabrics/admin command?
Why is it different than connect for example?



* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-13 10:14   ` Sagi Grimberg
@ 2023-02-13 10:28     ` Hannes Reinecke
  2023-02-13 10:33       ` Sagi Grimberg
  0 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-13 10:28 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig, George, Martin; +Cc: Keith Busch, linux-nvme

On 2/13/23 11:14, Sagi Grimberg wrote:
>> Clear the FAILFAST_DRIVER bit for authentication commands
>> allowing them to be retried in nvme_decide_disposition() if the DNR
>> bit is not set in the command result.
> 
> Can you please explain what is this fixing? and is there a test
> to reproduce this?
> 
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/auth.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>> index ef5dcd885b0c..935f340607a7 100644
>> --- a/drivers/nvme/host/auth.c
>> +++ b/drivers/nvme/host/auth.c
>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, 
>> int qid,
>>               "qid %d auth_submit failed to map, error %d\n",
>>               qid, ret);
>>       } else {
>> +        /* Clear failfast flag to allow for retries */
>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
> 
> This should come right after the allocation.
> But what makes this more special than any other fabrics/admin command?
> Why is it different than connect for example?

It is not different; it does, however, introduce a difference in behaviour.
And hence I didn't enable it for all commands, as previously we didn't 
retry, and it's questionable if we should for things like 
'reg_read'/'reg_write' and friends.

But you are correct for connect, though; we really should retry that 
one, too, if the DNR bit is not set.

Cheers,

Hannes




* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-13 10:28     ` Hannes Reinecke
@ 2023-02-13 10:33       ` Sagi Grimberg
  2023-02-13 13:24         ` Hannes Reinecke
  0 siblings, 1 reply; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13 10:33 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig, George, Martin
  Cc: Keith Busch, linux-nvme


>>> Clear the FAILFAST_DRIVER bit for authentication commands
>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>> bit is not set in the command result.
>>
>> Can you please explain what is this fixing? and is there a test
>> to reproduce this?
>>
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>>   drivers/nvme/host/auth.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>> index ef5dcd885b0c..935f340607a7 100644
>>> --- a/drivers/nvme/host/auth.c
>>> +++ b/drivers/nvme/host/auth.c
>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, 
>>> int qid,
>>>               "qid %d auth_submit failed to map, error %d\n",
>>>               qid, ret);
>>>       } else {
>>> +        /* Clear failfast flag to allow for retries */
>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>
>> This should come right after the allocation.
>> But what makes this more special than any other fabrics/admin command?
>> Why is it different than connect for example?
> 
> It is not different; it does, however, introduce a difference in behaviour.
> And hence I didn't enable it for all commands, as previously we didn't 
> retry, and it's questionable if we should for things like 
> 'reg_read'/'reg_write' and friends.
> 
> But you are correct for connect, though; we really should retry that 
> one, too, if the DNR bit is not set.

I'm not sure. If the transport is unavailable, we cannot set DNR, which
will lead to a retry of the connect (or equivalent) command. Today the
host will schedule another full controller connect within
reconnect_delay, which is what we want right?



* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-13 10:33       ` Sagi Grimberg
@ 2023-02-13 13:24         ` Hannes Reinecke
  2023-02-13 13:47           ` Sagi Grimberg
  0 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-13 13:24 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig, George, Martin; +Cc: Keith Busch, linux-nvme

On 2/13/23 11:33, Sagi Grimberg wrote:
> 
>>>> Clear the FAILFAST_DRIVER bit for authentication commands
>>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>>> bit is not set in the command result.
>>>
>>> Can you please explain what this is fixing? And is there a test
>>> to reproduce this?
>>>
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>> ---
>>>>   drivers/nvme/host/auth.c | 2 ++
>>>>   1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>>> index ef5dcd885b0c..935f340607a7 100644
>>>> --- a/drivers/nvme/host/auth.c
>>>> +++ b/drivers/nvme/host/auth.c
>>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl 
>>>> *ctrl, int qid,
>>>>               "qid %d auth_submit failed to map, error %d\n",
>>>>               qid, ret);
>>>>       } else {
>>>> +        /* Clear failfast flag to allow for retries */
>>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>>
>>> This should come right after the allocation.
>>> But what makes this more special than any other fabrics/admin command?
>>> Why is it different than connect for example?
>>
>> It is not different; it does, however, introduce a difference in 
>> behaviour.
>> And hence I didn't enable it for all commands, as previously we didn't 
>> retry, and it's questionable if we should for things like 
>> 'reg_read'/'reg_write' and friends.
>>
>> But you are correct for connect, though; we really should retry that 
>> one, too, if the DNR bit is not set.
> 
> I'm not sure. if the transport is unavailable, we cannot set DNR, which
> will lead to a retry of the connect (or equivalent) command. Today the
> host will schedule another full controller connect within
> reconnect_delay, which is what we want right?

I knew this would be coming up, which is why I left it out of the 
initial patchset.

Long explanation:

Thing is, connect handling is not _just_ connect handling.
With Linux we only have a single (ioctl-like) entry for the userspace 
'nvme connect' call.

That really does several things:
- Setting up the transport
- Create the transport association for the admin q
- Create the admin q
- send 'connect' over that queue
- Create transport associations for the i/o qs
- create i/o qs
- send 'connect' over those queues

And my point here is that the actual 'connect' commands really are 
straight NVMe commands. And as such should be subjected to the normal 
status evaluation rules for NVMe commands.
The problem here is the strange 'half-breed' nature of the connect command:

_Technically_ the association and the 'connect' command are independent, 
and as such we would be entirely within spec to allow the 'connect'
command to fail without having the association torn down. But due to
our implementation that would leave us unable to re-use the association
for further connect commands, as that would need to be triggered by
userspace (which will always create a new association).

Consequently I would advocate for having any non-retryable failure of 
the 'connect' command to always tear down the association, too.

That would be a deviation from our normal rules (where we always retry
transport failures), but really is something we need to do if we enable 
DNR evaluation for the 'connect' command.

And we would need an agreement on this, hence I didn't attempt it in 
this patchset.

(Actually I'm wondering if it's worth a session at LSF; this has been a 
long-running topic without us being able to make meaningful progress)

Cheers,

Hannes



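The teardown policy argued for above — keep the association when a connect failure is retryable, tear it down when DNR is set — might be sketched as (purely illustrative; none of these functions exist in the kernel):

```c
#include <assert.h>

enum step_rc { STEP_OK, STEP_RETRYABLE, STEP_FATAL };

static int association_up;

static void setup_association(void)    { association_up = 1; }
static void teardown_association(void) { association_up = 0; }

static int do_connect(enum step_rc connect_rc)
{
	/* Transport association is created first, connect sent over it. */
	setup_association();

	switch (connect_rc) {
	case STEP_OK:
		return 0;
	case STEP_RETRYABLE:
		/* DNR clear: keep the association so a retry can reuse it */
		return 1;
	case STEP_FATAL:
	default:
		/* DNR set: tear down the association as well */
		teardown_association();
		return -1;
	}
}
```

The sketch encodes the deviation being proposed: a non-retryable 'connect' failure takes the association down with it, instead of leaving it for userspace to recreate.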

* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-13 13:24         ` Hannes Reinecke
@ 2023-02-13 13:47           ` Sagi Grimberg
  2023-02-13 14:07             ` Hannes Reinecke
  0 siblings, 1 reply; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-13 13:47 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig, George, Martin
  Cc: Keith Busch, linux-nvme


>>>>> Clear the FAILFAST_DRIVER bit for authentication commands
>>>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>>>> bit is not set in the command result.
>>>>
>>>> Can you please explain what this is fixing? And is there a test
>>>> to reproduce this?
>>>>
>>>>>
>>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>>> ---
>>>>>   drivers/nvme/host/auth.c | 2 ++
>>>>>   1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>>>> index ef5dcd885b0c..935f340607a7 100644
>>>>> --- a/drivers/nvme/host/auth.c
>>>>> +++ b/drivers/nvme/host/auth.c
>>>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl 
>>>>> *ctrl, int qid,
>>>>>               "qid %d auth_submit failed to map, error %d\n",
>>>>>               qid, ret);
>>>>>       } else {
>>>>> +        /* Clear failfast flag to allow for retries */
>>>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>>>
>>>> This should come right after the allocation.
>>>> But what makes this more special than any other fabrics/admin command?
>>>> Why is it different than connect for example?
>>>
>>> It is not different; it does, however, introduce a difference in 
>>> behaviour.
>>> And hence I didn't enable it for all commands, as previously we 
>>> didn't retry, and it's questionable if we should for things like 
>>> 'reg_read'/'reg_write' and friends.
>>>
>>> But you are correct for connect, though; we really should retry that 
>>> one, too, if the DNR bit is not set.
>>
>> I'm not sure. if the transport is unavailable, we cannot set DNR, which
>> will lead to a retry of the connect (or equivalent) command. Today the
>> host will schedule another full controller connect within
>> reconnect_delay, which is what we want right?
> 
> I knew this would be coming up, which is why I left it out of the 
> initial patchset.
> 
> Long explanation:
> 
> Thing is, connect handling is not _just_ connect handling.
> With linux we only have a single (ioctl-like) entry for the userspace 
> 'nvme connect' call.
> 
> That really does several things:
> - Setting up the transport
> - Create the transport association for the admin q
> - Create the admin q
> - send 'connect' over that queue
> - Create transport associations for the i/o qs
> - create i/o qs
> - send 'connect' over those queues
> 
> And my point here is that the actual 'connect' commands really are 
> straight NVMe commands. And as such should be subjected to the normal 
> status evaluation rules for NVMe commands.
> The problem here is the strange 'half-breed' nature of the connect command:
> 
> _Technically_ the association and the 'connect' command are independent, 
> and as such we would be entirely within spec to allow the 'connect'
> command to fail without having the association torn down. But due to
> our implementation that would leave us unable to re-use the association
> for further connect commands, as that would need to be triggered by
> userspace (which will always create a new association).
> 
> Consequently I would advocate that any non-retryable failure of the
> 'connect' command should always tear down the association, too.
> 
> That would be a deviation from our normal rules (where we always retry
> transport failures), but really is something we need to do if we enable 
> DNR evaluation for the 'connect' command.

How the host handles connect or any other request failing is entirely
the host's choice.

In this patch, you decided to effectively retry on transport errors,
which will retry up to nvme_max_retries times. Can you please help
me understand what the patch is actually solving? I'm assuming that this
is an actual completion that you see from the controller; my second
question is what happens with this patch when the transport is the
source of the completion?

Once we understand this, we can discuss any difference between auth and
other driver generated commands.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-13 13:47           ` Sagi Grimberg
@ 2023-02-13 14:07             ` Hannes Reinecke
  2023-02-14  9:39               ` Sagi Grimberg
  0 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2023-02-13 14:07 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig, George, Martin; +Cc: Keith Busch, linux-nvme

On 2/13/23 14:47, Sagi Grimberg wrote:
> 
>>>>>> Clear the FAILFAST_DRIVER bit for authentication commands
>>>>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>>>>> bit is not set in the command result.
>>>>>
>>>>> Can you please explain what this is fixing? And is there a test
>>>>> to reproduce this?
>>>>>
>>>>>>
>>>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>>>> ---
>>>>>>   drivers/nvme/host/auth.c | 2 ++
>>>>>>   1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>>>>> index ef5dcd885b0c..935f340607a7 100644
>>>>>> --- a/drivers/nvme/host/auth.c
>>>>>> +++ b/drivers/nvme/host/auth.c
>>>>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl 
>>>>>> *ctrl, int qid,
>>>>>>               "qid %d auth_submit failed to map, error %d\n",
>>>>>>               qid, ret);
>>>>>>       } else {
>>>>>> +        /* Clear failfast flag to allow for retries */
>>>>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>>>>
>>>>> This should come right after the allocation.
>>>>> But what makes this special compared to any other fabrics/admin command?
>>>>> Why is it different than connect for example?
>>>>
>>>> It is not different; it does, however, introduce a difference in 
>>>> behaviour.
>>>> And hence I didn't enable it for all commands, as previously we 
>>>> didn't retry, and it's questionable if we should for things like 
>>>> 'reg_read'/'reg_write' and friends.
>>>>
>>>> But you are correct for connect, though; we really should retry that 
>>>> one, too, if the DNR bit is not set.
>>>
>>> I'm not sure. If the transport is unavailable, we cannot set DNR, which
>>> will lead to a retry of the connect (or equivalent) command. Today the
>>> host will schedule another full controller connect within
>>> reconnect_delay, which is what we want right?
>>
>> I knew this would be coming up, which is why I left it out of the 
>> initial patchset.
>>
>> Long explanation:
>>
>> Thing is, connect handling is not _just_ connect handling.
>> With Linux we only have a single (ioctl-like) entry for the userspace 
>> 'nvme connect' call.
>>
>> That really does several things:
>> - Setting up the transport
>> - Create the transport association for the admin q
>> - Create the admin q
>> - send 'connect' over that queue
>> - Create transport associations for the i/o qs
>> - create i/o qs
>> - send 'connect' over those queues
>>
>> And my point here is that the actual 'connect' commands really are 
>> straight NVMe commands. And as such should be subjected to the normal 
>> status evaluation rules for NVMe commands.
>> The problem here is the strange 'half-breed' nature of the connect 
>> command:
>>
>> _Technically_ the association and the 'connect' command are 
>> independent, and as such we would be entirely within spec to allow the 
>> 'connect'
>> command to fail without having the association torn down. But due to
>> our implementation that would leave us unable to re-use the association
>> for further connect commands, as that would need to be triggered by
>> userspace (which will always create a new association).
>>
>> Consequently I would advocate that any non-retryable failure of the
>> 'connect' command should always tear down the association, too.
>>
>> That would be a deviation from our normal rules (where we always retry
>> transport failures), but really is something we need to do if we 
>> enable DNR evaluation for the 'connect' command.
> 
> How the host handles connect or any other request failing is entirely
> the host's choice.
> 
> In this patch, you decided to effectively retry on transport errors,
> which will retry up to nvme_max_retries times. Can you please help
> me understand what the patch is actually solving? I'm assuming that this
> is an actual completion that you see from the controller; my second
> question is what happens with this patch when the transport is the
> source of the completion?
> 
Hmm. All this patch does is enable retries by modifying the flow
in nvme_decide_disposition().
And nvme_decide_disposition() only deals with NVMe errors; if
nvme_req(req)->status were an error number we'd have much worse
problems than just retries.

So where do you see the transport errors being retried?

And I need to retry as some controller implementations temporarily 
return the authentication commands with the 'DNR' bit set, presumably 
to initialize the authentication subsystem.

Cheers,

Hannes



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set
  2023-02-13 14:07             ` Hannes Reinecke
@ 2023-02-14  9:39               ` Sagi Grimberg
  0 siblings, 0 replies; 22+ messages in thread
From: Sagi Grimberg @ 2023-02-14  9:39 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig, George, Martin
  Cc: Keith Busch, linux-nvme


>>>>>>> Clear the FAILFAST_DRIVER bit for authentication commands
>>>>>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>>>>>> bit is not set in the command result.
>>>>>>
>>>>>> Can you please explain what this is fixing? And is there a test
>>>>>> to reproduce this?
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>>>>> ---
>>>>>>>   drivers/nvme/host/auth.c | 2 ++
>>>>>>>   1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>>>>>> index ef5dcd885b0c..935f340607a7 100644
>>>>>>> --- a/drivers/nvme/host/auth.c
>>>>>>> +++ b/drivers/nvme/host/auth.c
>>>>>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl 
>>>>>>> *ctrl, int qid,
>>>>>>>               "qid %d auth_submit failed to map, error %d\n",
>>>>>>>               qid, ret);
>>>>>>>       } else {
>>>>>>> +        /* Clear failfast flag to allow for retries */
>>>>>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>>>>>
>>>>>> This should come right after the allocation.
>>>>>> But what makes this special compared to any other fabrics/admin command?
>>>>>> Why is it different than connect for example?
>>>>>
>>>>> It is not different; it does, however, introduce a difference in 
>>>>> behaviour.
>>>>> And hence I didn't enable it for all commands, as previously we 
>>>>> didn't retry, and it's questionable if we should for things like 
>>>>> 'reg_read'/'reg_write' and friends.
>>>>>
>>>>> But you are correct for connect, though; we really should retry 
>>>>> that one, too, if the DNR bit is not set.
>>>>
>>>> I'm not sure. If the transport is unavailable, we cannot set DNR, which
>>>> will lead to a retry of the connect (or equivalent) command. Today the
>>>> host will schedule another full controller connect within
>>>> reconnect_delay, which is what we want right?
>>>
>>> I knew this would be coming up, which is why I left it out of the 
>>> initial patchset.
>>>
>>> Long explanation:
>>>
>>> Thing is, connect handling is not _just_ connect handling.
>>> With Linux we only have a single (ioctl-like) entry for the userspace 
>>> 'nvme connect' call.
>>>
>>> That really does several things:
>>> - Setting up the transport
>>> - Create the transport association for the admin q
>>> - Create the admin q
>>> - send 'connect' over that queue
>>> - Create transport associations for the i/o qs
>>> - create i/o qs
>>> - send 'connect' over those queues
>>>
>>> And my point here is that the actual 'connect' commands really are 
>>> straight NVMe commands. And as such should be subjected to the normal 
>>> status evaluation rules for NVMe commands.
>>> The problem here is the strange 'half-breed' nature of the connect 
>>> command:
>>>
>>> _Technically_ the association and the 'connect' command are 
>>> independent, and as such we would be entirely within spec to allow 
>>> the 'connect'
>>> command to fail without having the association torn down. But due to
>>> our implementation that would leave us unable to re-use the association
>>> for further connect commands, as that would need to be triggered by
>>> userspace (which will always create a new association).
>>>
>>> Consequently I would advocate that any non-retryable failure of the
>>> 'connect' command should always tear down the association, too.
>>>
>>> That would be a deviation from our normal rules (where we always retry
>>> transport failures), but really is something we need to do if we 
>>> enable DNR evaluation for the 'connect' command.
>>
>> How the host handles connect or any other request failing is entirely
>> the host's choice.
>>
>> In this patch, you decided to effectively retry on transport errors,
>> which will retry up to nvme_max_retries times. Can you please help
>> me understand what the patch is actually solving? I'm assuming that this
>> is an actual completion that you see from the controller; my second
>> question is what happens with this patch when the transport is the
>> source of the completion?
>>
> Hmm. All this patch does is enable retries by modifying the flow
> in nvme_decide_disposition().
> And nvme_decide_disposition() only deals with NVMe errors; if
> nvme_req(req)->status were an error number we'd have much worse
> problems than just retries.

Hannes, can you please describe the flow that went wrong here and
that you attempted to fix?

My assumption is that the host got a non-zero nvme status with DNR
cleared. Is that correct? This is not described in the commit msg, hence
me asking.

> So where do you see the transport errors being retried?

Effectively, all transport errors do not set the DNR bit, and will
be retried within crdt, until nvme_max_retries is reached. My question
is whether this affects the reconnect flow given that we go and
cancel all inflight requests (NVME_SC_HOST_ABORTED_CMD).
Does this mean that if the auth command is now inflight, it will retry,
causing the teardown to block?

Same question applies to other driver requests that you say
should comply with the DNR bit.

> And I need to retry as some controller implementations temporarily 
> return the authentication commands with the 'DNR' bit set, presumably 
> to initialize the authentication subsystem.

That is unrelated to this patch set though.


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-02-14  9:39 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-09 14:38 [PATCH 0/5] nvme: rework __nvme_submit_sync_cmd() Hannes Reinecke
2023-02-09 14:38 ` [PATCH 1/5] nvme: split __nvme_submit_sync_cmd() Hannes Reinecke
2023-02-13  6:19   ` Christoph Hellwig
2023-02-13  9:47     ` Sagi Grimberg
2023-02-09 14:38 ` [PATCH 2/5] block: make blk_rq_map_kern() to accept a NULL buffer Hannes Reinecke
2023-02-13  6:21   ` Christoph Hellwig
2023-02-13  9:49   ` Sagi Grimberg
2023-02-09 14:38 ` [PATCH 3/5] nvme: move result handling into nvme_execute_rq() Hannes Reinecke
2023-02-13  9:59   ` Sagi Grimberg
2023-02-13 10:04     ` Hannes Reinecke
2023-02-13 10:08       ` Sagi Grimberg
2023-02-09 14:38 ` [PATCH 4/5] nvme: open-code __nvme_submit_sync_cmd() Hannes Reinecke
2023-02-13  6:26   ` Christoph Hellwig
2023-02-13 10:07     ` Sagi Grimberg
2023-02-09 14:38 ` [PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set Hannes Reinecke
2023-02-13 10:14   ` Sagi Grimberg
2023-02-13 10:28     ` Hannes Reinecke
2023-02-13 10:33       ` Sagi Grimberg
2023-02-13 13:24         ` Hannes Reinecke
2023-02-13 13:47           ` Sagi Grimberg
2023-02-13 14:07             ` Hannes Reinecke
2023-02-14  9:39               ` Sagi Grimberg
