linux-nvme.lists.infradead.org archive mirror
* [PATCH v3 0/4] nvme: protect against possible request reference after completion
@ 2021-06-16 21:19 Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

Nothing in nvme protects against referencing a request after it has completed.
For example, if a buggy controller sends a completion twice for the same
request, the host can access and modify a request that was already completed.

At best this will cause a panic, but in the worst case it can cause silent
data corruption if the request has already been reused and executed by the
time we reference it.

The nvme command_id is an opaque value in which we have simply placed the
request tag thus far. To protect against access after completion, we introduce
a generation counter in the upper 4 bits of the command_id that is incremented
on every invocation and validated upon reception of a completion. This limits
the effective maximum queue depth to 4095, but we hardly ever use such deep
queues (in fabrics the maximum is already 1024).
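
To illustrate the idea, here is a rough sketch (the helper names below are
illustrative only; the real helpers added in patch 4/4 are nvme_cid() and
nvme_find_rq()):

#include <linux/types.h>

/* 16-bit command_id layout: upper 4 bits genctr, lower 12 bits blk-mq tag */
static inline u16 example_encode_cid(u8 genctr, u16 tag)
{
	return ((genctr & 0xf) << 12) | (tag & 0xfff);
}

/* on completion: reject the CQE if the generation does not match */
static inline bool example_cid_is_stale(u16 command_id, u8 cur_genctr)
{
	return ((command_id >> 12) & 0xf) != (cur_genctr & 0xf);
}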

Changes from v2:
- cc linux-nfs,linux-kernel for patch 1/4
- fix expected genctr print in patch 4/4
- match param_set_uint_minmax indentation
- collected review tags

Changes from v1:
- lift param_set_uint_minmax and reuse it
- simplify initialization in patch 3/4

Sagi Grimberg (4):
  params: lift param_set_uint_minmax to common code
  nvme-pci: limit maximum queue depth to 4095
  nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
  nvme: code command_id with a genctr for use-after-free validation

 drivers/nvme/host/core.c    |  3 ++-
 drivers/nvme/host/nvme.h    | 47 ++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/pci.c     | 17 ++++++--------
 drivers/nvme/host/rdma.c    |  4 ++--
 drivers/nvme/host/tcp.c     | 38 ++++++++++++------------------
 drivers/nvme/target/loop.c  |  4 ++--
 include/linux/moduleparam.h |  3 +++
 kernel/params.c             | 19 +++++++++++++++
 net/sunrpc/xprtsock.c       | 18 --------------
 9 files changed, 96 insertions(+), 57 deletions(-)

-- 
2.27.0



* [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  5:45   ` Hannes Reinecke
                     ` (2 more replies)
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

It is a useful helper, hence move it to common code so others can enjoy
it.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 include/linux/moduleparam.h |  3 +++
 kernel/params.c             | 19 +++++++++++++++++++
 net/sunrpc/xprtsock.c       | 18 ------------------
 3 files changed, 22 insertions(+), 18 deletions(-)
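
For reference, a minimal (hypothetical) caller outside of sunrpc would look
roughly like this; the parameter name below is made up purely for
illustration:

#include <linux/moduleparam.h>

static unsigned int example_depth = 16;

static int example_depth_set(const char *val, const struct kernel_param *kp)
{
	/* accept only values in the range [1, 128] */
	return param_set_uint_minmax(val, kp, 1, 128);
}

static const struct kernel_param_ops example_depth_ops = {
	.set = example_depth_set,
	.get = param_get_uint,
};
module_param_cb(example_depth, &example_depth_ops, &example_depth, 0644);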

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index eed280fae433..b36ece513616 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -432,6 +432,9 @@ extern const struct kernel_param_ops param_ops_uint;
 extern int param_set_uint(const char *val, const struct kernel_param *kp);
 extern int param_get_uint(char *buffer, const struct kernel_param *kp);
 #define param_check_uint(name, p) __param_check(name, p, unsigned int)
+int param_set_uint_minmax(const char *val,
+			  const struct kernel_param *kp,
+			  unsigned int min, unsigned int max);
 
 extern const struct kernel_param_ops param_ops_long;
 extern int param_set_long(const char *val, const struct kernel_param *kp);
diff --git a/kernel/params.c b/kernel/params.c
index 2daa2780a92c..7d3c61e64140 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -243,6 +243,25 @@ STANDARD_PARAM_DEF(ulong,	unsigned long,		"%lu",		kstrtoul);
 STANDARD_PARAM_DEF(ullong,	unsigned long long,	"%llu",		kstrtoull);
 STANDARD_PARAM_DEF(hexint,	unsigned int,		"%#08x", 	kstrtouint);
 
+int param_set_uint_minmax(const char *val,
+			  const struct kernel_param *kp,
+			  unsigned int min, unsigned int max)
+{
+	unsigned int num;
+	int ret;
+
+	if (!val)
+		return -EINVAL;
+	ret = kstrtouint(val, 0, &num);
+	if (ret)
+		return ret;
+	if (num < min || num > max)
+		return -EINVAL;
+	*((unsigned int *)kp->arg) = num;
+	return 0;
+}
+EXPORT_SYMBOL(param_set_uint_minmax);
+
 int param_set_charp(const char *val, const struct kernel_param *kp)
 {
 	if (strlen(val) > 1024) {
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 47aa47a2b07c..0cfbf618e8c2 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -3130,24 +3130,6 @@ void cleanup_socket_xprt(void)
 	xprt_unregister_transport(&xs_bc_tcp_transport);
 }
 
-static int param_set_uint_minmax(const char *val,
-		const struct kernel_param *kp,
-		unsigned int min, unsigned int max)
-{
-	unsigned int num;
-	int ret;
-
-	if (!val)
-		return -EINVAL;
-	ret = kstrtouint(val, 0, &num);
-	if (ret)
-		return ret;
-	if (num < min || num > max)
-		return -EINVAL;
-	*((unsigned int *)kp->arg) = num;
-	return 0;
-}
-
 static int param_set_portnr(const char *val, const struct kernel_param *kp)
 {
 	return param_set_uint_minmax(val, kp,
-- 
2.27.0



* [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  5:46   ` Hannes Reinecke
  2021-06-17  8:04   ` Daniel Wagner
  2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

We are going to use the upper 4 bits of the command_id for a generation
counter, so enforce the new queue depth upper limit. As we now enforce
both a min and a max queue depth, use param_set_uint_minmax instead of
open coding it.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/pci.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)
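
With this change the setter behaves roughly as follows (illustrative values
only):

	io_queue_depth_set("1", kp)    -> -EINVAL  (below NVME_PCI_MIN_QUEUE_SIZE)
	io_queue_depth_set("1024", kp) ->  0       (value stored in io_queue_depth)
	io_queue_depth_set("4096", kp) -> -EINVAL  (above NVME_PCI_MAX_QUEUE_SIZE)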

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 9cdf2099027a..d2ee9e46f5d3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -60,6 +60,8 @@ MODULE_PARM_DESC(sgl_threshold,
 		"Use SGLs when average request segment size is larger or equal to "
 		"this size. Use 0 to disable SGLs.");
 
+#define NVME_PCI_MIN_QUEUE_SIZE 2
+#define NVME_PCI_MAX_QUEUE_SIZE 4095
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp);
 static const struct kernel_param_ops io_queue_depth_ops = {
 	.set = io_queue_depth_set,
@@ -68,7 +70,7 @@ static const struct kernel_param_ops io_queue_depth_ops = {
 
 static unsigned int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
-MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2 and < 4096");
 
 static int io_queue_count_set(const char *val, const struct kernel_param *kp)
 {
@@ -157,14 +159,9 @@ struct nvme_dev {
 
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
 {
-	int ret;
-	u32 n;
-
-	ret = kstrtou32(val, 10, &n);
-	if (ret != 0 || n < 2)
-		return -EINVAL;
-
-	return param_set_uint(val, kp);
+	return param_set_uint_minmax(val, kp,
+			NVME_PCI_MIN_QUEUE_SIZE,
+			NVME_PCI_MAX_QUEUE_SIZE);
 }
 
 static inline unsigned int sq_idx(unsigned int qid, u32 stride)
-- 
2.27.0



* [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  8:11   ` Daniel Wagner
  2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
  2021-07-16  7:15 ` [PATCH v3 0/4] nvme: protect against possible request reference after completion Christoph Hellwig
  4 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

We already validate the command_id when receiving the c2hdata pdu header,
and it does not change afterwards, so checking it again here is redundant.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)
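
For context, a rough sketch of the receive path ordering this relies on
(simplified, not the exact code):

	nvme_tcp_handle_c2h_data()	/* c2hdata PDU header */
		rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
		if (!rq)
			return -ENOENT;	/* errors out before any data phase */

	nvme_tcp_recv_data()		/* data portion of the same PDU */
		/* same queue->pdu->command_id, already validated above */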

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c7bd37103cf4..3ad65f42fc1e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -703,17 +703,9 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 			      unsigned int *offset, size_t *len)
 {
 	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
-	struct nvme_tcp_request *req;
-	struct request *rq;
-
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
-	if (!rq) {
-		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag %#x not found\n",
-			nvme_tcp_queue_id(queue), pdu->command_id);
-		return -ENOENT;
-	}
-	req = blk_mq_rq_to_pdu(rq);
+	struct request *rq =
+		blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
 	while (true) {
 		int recv_len, ret;
-- 
2.27.0



* [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
                   ` (2 preceding siblings ...)
  2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  8:56   ` Daniel Wagner
  2021-07-16  7:15 ` [PATCH v3 0/4] nvme: protect against possible request reference after completion Christoph Hellwig
  4 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

We cannot detect a (perhaps buggy) controller that is sending us
a completion for a request that was already completed (for example,
sending the same completion twice); this phenomenon has been seen in
the wild a few times.

To protect against this, we use the upper 4 bits of the nvme sqe
command_id as a 4-bit generation counter and verify that it matches
the request's generation, which is incremented on every execution.

The 16-bit command_id is now constructed as follows:
| xxxx | xxxxxxxxxxxx |
  gen    request tag

This means we are giving up some possible queue depth, as 12 bits allow
for a maximum queue depth of 4095 instead of 65536; however, we never
create such deep queues anyway, so no real harm is done.

Suggested-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c   |  3 ++-
 drivers/nvme/host/nvme.h   | 47 +++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/pci.c    |  2 +-
 drivers/nvme/host/rdma.c   |  4 ++--
 drivers/nvme/host/tcp.c    | 26 ++++++++++-----------
 drivers/nvme/target/loop.c |  4 ++--
 6 files changed, 66 insertions(+), 20 deletions(-)
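
A worked example of the check (numbers purely illustrative):

	setup:            tag 0x02c, genctr 5  ->  command_id 0x502c
	valid completion: command_id 0x502c    ->  genctr 5 == 5, request found
	stale/duplicate:  command_id 0x402c    ->  genctr 4 != 5, rejected

Since the counter is only 4 bits wide, a duplicate would go undetected
only if the same tag had been reused an exact multiple of 16 times in
between, which narrows the window considerably.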

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 25ad9027f928..a75876dfa38c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1026,7 +1026,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		return BLK_STS_IOERR;
 	}
 
-	cmd->common.command_id = req->tag;
+	nvme_req(req)->genctr++;
+	cmd->common.command_id = nvme_cid(req);
 	trace_nvme_setup_cmd(req, cmd);
 	return ret;
 }
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1aab74128d40..ad518b1c0fac 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -158,6 +158,7 @@ enum nvme_quirks {
 struct nvme_request {
 	struct nvme_command	*cmd;
 	union nvme_result	result;
+	u8			genctr;
 	u8			retries;
 	u8			flags;
 	u16			status;
@@ -497,6 +498,49 @@ struct nvme_ctrl_ops {
 	int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
 };
 
+/*
+ * nvme command_id is constructed as such:
+ * | xxxx | xxxxxxxxxxxx |
+ *   gen    request tag
+ */
+#define nvme_genctr_mask(gen)			(gen & 0xf)
+#define nvme_cid_install_genctr(gen)		(nvme_genctr_mask(gen) << 12)
+#define nvme_genctr_from_cid(cid)		((cid & 0xf000) >> 12)
+#define nvme_tag_from_cid(cid)			(cid & 0xfff)
+
+static inline u16 nvme_cid(struct request *rq)
+{
+	return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
+}
+
+static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
+		u16 command_id)
+{
+	u8 genctr = nvme_genctr_from_cid(command_id);
+	u16 tag = nvme_tag_from_cid(command_id);
+	struct request *rq;
+
+	rq = blk_mq_tag_to_rq(tags, tag);
+	if (unlikely(!rq)) {
+		pr_err("could not locate request for tag %#x\n",
+			tag);
+		return NULL;
+	}
+	if (unlikely(nvme_genctr_mask(nvme_req(rq)->genctr) != genctr)) {
+		dev_err(nvme_req(rq)->ctrl->device,
+			"request %#x genctr mismatch (got %#x expected %#x)\n",
+			tag, genctr, nvme_genctr_mask(nvme_req(rq)->genctr));
+		return NULL;
+	}
+	return rq;
+}
+
+static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
+                u16 command_id)
+{
+	return blk_mq_tag_to_rq(tags, nvme_tag_from_cid(command_id));
+}
+
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
 void nvme_fault_inject_init(struct nvme_fault_inject *fault_inj,
 			    const char *dev_name);
@@ -594,7 +638,8 @@ static inline void nvme_put_ctrl(struct nvme_ctrl *ctrl)
 
 static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
 {
-	return !qid && command_id >= NVME_AQ_BLK_MQ_DEPTH;
+	return !qid &&
+		nvme_tag_from_cid(command_id) >= NVME_AQ_BLK_MQ_DEPTH;
 }
 
 void nvme_complete_rq(struct request *req);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d2ee9e46f5d3..e111532c2082 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1012,7 +1012,7 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
 		return;
 	}
 
-	req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), command_id);
+	req = nvme_find_rq(nvme_queue_tagset(nvmeq), command_id);
 	if (unlikely(!req)) {
 		dev_warn(nvmeq->dev->ctrl.device,
 			"invalid id %d completed on queue %d\n",
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 74bf2c7f2b80..14d5023603d7 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1730,10 +1730,10 @@ static void nvme_rdma_process_nvme_rsp(struct nvme_rdma_queue *queue,
 	struct request *rq;
 	struct nvme_rdma_request *req;
 
-	rq = blk_mq_tag_to_rq(nvme_rdma_tagset(queue), cqe->command_id);
+	rq = nvme_find_rq(nvme_rdma_tagset(queue), cqe->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"tag 0x%x on QP %#x not found\n",
+			"got bad command_id %#x on QP %#x\n",
 			cqe->command_id, queue->qp->qp_num);
 		nvme_rdma_error_recovery(queue->ctrl);
 		return;
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3ad65f42fc1e..b906ee41449e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -488,11 +488,11 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 {
 	struct request *rq;
 
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), cqe->command_id);
+	rq = nvme_find_rq(nvme_tcp_tagset(queue), cqe->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag 0x%x not found\n",
-			nvme_tcp_queue_id(queue), cqe->command_id);
+			"got bad cqe.command_id %#x on queue %d\n",
+			cqe->command_id, nvme_tcp_queue_id(queue));
 		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 		return -EINVAL;
 	}
@@ -509,11 +509,11 @@ static int nvme_tcp_handle_c2h_data(struct nvme_tcp_queue *queue,
 {
 	struct request *rq;
 
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+	rq = nvme_find_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag %#x not found\n",
-			nvme_tcp_queue_id(queue), pdu->command_id);
+			"got bad c2hdata.command_id %#x on queue %d\n",
+			pdu->command_id, nvme_tcp_queue_id(queue));
 		return -ENOENT;
 	}
 
@@ -607,7 +607,7 @@ static int nvme_tcp_setup_h2c_data_pdu(struct nvme_tcp_request *req,
 	data->hdr.plen =
 		cpu_to_le32(data->hdr.hlen + hdgst + req->pdu_len + ddgst);
 	data->ttag = pdu->ttag;
-	data->command_id = rq->tag;
+	data->command_id = nvme_cid(rq);
 	data->data_offset = cpu_to_le32(req->data_sent);
 	data->data_length = cpu_to_le32(req->pdu_len);
 	return 0;
@@ -620,11 +620,11 @@ static int nvme_tcp_handle_r2t(struct nvme_tcp_queue *queue,
 	struct request *rq;
 	int ret;
 
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+	rq = nvme_find_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag %#x not found\n",
-			nvme_tcp_queue_id(queue), pdu->command_id);
+			"got bad r2t.command_id %#x on queue %d\n",
+			pdu->command_id, nvme_tcp_queue_id(queue));
 		return -ENOENT;
 	}
 	req = blk_mq_rq_to_pdu(rq);
@@ -704,7 +704,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 {
 	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
 	struct request *rq =
-		blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+		nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
 	while (true) {
@@ -797,8 +797,8 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
 	}
 
 	if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
-		struct request *rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue),
-						pdu->command_id);
+		struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
+					pdu->command_id);
 
 		nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
 		queue->nr_cqe++;
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index cb30cb942e1d..88d34212b6dc 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -107,10 +107,10 @@ static void nvme_loop_queue_response(struct nvmet_req *req)
 	} else {
 		struct request *rq;
 
-		rq = blk_mq_tag_to_rq(nvme_loop_tagset(queue), cqe->command_id);
+		rq = nvme_find_rq(nvme_loop_tagset(queue), cqe->command_id);
 		if (!rq) {
 			dev_err(queue->ctrl->ctrl.device,
-				"tag 0x%x on queue %d not found\n",
+				"got bad command_id %#x on queue %d\n",
 				cqe->command_id, nvme_loop_queue_idx(queue));
 			return;
 		}
-- 
2.27.0



* Re: [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
@ 2021-06-17  5:45   ` Hannes Reinecke
  2021-06-17  8:00   ` Daniel Wagner
  2021-06-17 13:48   ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2021-06-17  5:45 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Chaitanya Kulkarni

On 6/16/21 11:19 PM, Sagi Grimberg wrote:
> It is a useful helper hence move it to common code so others can enjoy
> it.
> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Cc: linux-nfs@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   include/linux/moduleparam.h |  3 +++
>   kernel/params.c             | 19 +++++++++++++++++++
>   net/sunrpc/xprtsock.c       | 18 ------------------
>   3 files changed, 22 insertions(+), 18 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
@ 2021-06-17  5:46   ` Hannes Reinecke
  2021-06-17  8:04   ` Daniel Wagner
  1 sibling, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2021-06-17  5:46 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Chaitanya Kulkarni

On 6/16/21 11:19 PM, Sagi Grimberg wrote:
> We are going to use the upper 4 bits of the command_id for a generation
> counter, so enforce the new queue depth upper limit. As we now enforce
> both a min and a max queue depth, use param_set_uint_minmax instead of
> open coding it.
> 
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/pci.c | 15 ++++++---------
>   1 file changed, 6 insertions(+), 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
  2021-06-17  5:45   ` Hannes Reinecke
@ 2021-06-17  8:00   ` Daniel Wagner
  2021-06-17 13:48   ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:00 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:33PM -0700, Sagi Grimberg wrote:
> It is a useful helper hence move it to common code so others can enjoy
> it.
> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Cc: linux-nfs@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

[...]

> +int param_set_uint_minmax(const char *val,
> +			  const struct kernel_param *kp,
> +			  unsigned int min, unsigned int max)
> +{
> +	unsigned int num;
> +	int ret;
> +
> +	if (!val)
> +		return -EINVAL;
> +	ret = kstrtouint(val, 0, &num);
> +	if (ret)
> +		return ret;
> +	if (num < min || num > max)
> +		return -EINVAL;
> +	*((unsigned int *)kp->arg) = num;
> +	return 0;
> +}
> +EXPORT_SYMBOL(param_set_uint_minmax);

Couldn't this be EXPORT_SYMBOL_GPL?


* Re: [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
  2021-06-17  5:46   ` Hannes Reinecke
@ 2021-06-17  8:04   ` Daniel Wagner
  1 sibling, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:04 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:34PM -0700, Sagi Grimberg wrote:
> We are going to use the upper 4 bits of the command_id for a generation
> counter, so enforce the new queue depth upper limit. As we now enforce
> both a min and a max queue depth, use param_set_uint_minmax instead of
> open coding it.
> 
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---

Reviewed-by: Daniel Wagner <dwagner@suse.de>


* Re: [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
  2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
@ 2021-06-17  8:11   ` Daniel Wagner
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:11 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:35PM -0700, Sagi Grimberg wrote:
> We already validate the command_id when receiving the c2hdata pdu header,
> and it does not change afterwards, so checking it again here is redundant.
> 
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

Reviewed-by: Daniel Wagner <dwagner@suse.de>


* Re: [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation
  2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
@ 2021-06-17  8:56   ` Daniel Wagner
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:56 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:36PM -0700, Sagi Grimberg wrote:
> We cannot detect a (perhaps buggy) controller that is sending us
> a completion for a request that was already completed (for example,
> sending the same completion twice); this phenomenon has been seen in
> the wild a few times.
> 
> To protect against this, we use the upper 4 bits of the nvme sqe
> command_id as a 4-bit generation counter and verify that it matches
> the request's generation, which is incremented on every execution.
> 
> The 16-bit command_id is now constructed as follows:
> | xxxx | xxxxxxxxxxxx |
>   gen    request tag
> 
> This means we are giving up some possible queue depth, as 12 bits allow
> for a maximum queue depth of 4095 instead of 65536; however, we never
> create such deep queues anyway, so no real harm is done.
> 
> Suggested-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

I've tested this on FC (NetApp target), functional testing only. All looks
good.

Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>


* Re: [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
  2021-06-17  5:45   ` Hannes Reinecke
  2021-06-17  8:00   ` Daniel Wagner
@ 2021-06-17 13:48   ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-06-17 13:48 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

This needs a Cc to linux-kernel..

> @@ -432,6 +432,9 @@ extern const struct kernel_param_ops param_ops_uint;
>  extern int param_set_uint(const char *val, const struct kernel_param *kp);
>  extern int param_get_uint(char *buffer, const struct kernel_param *kp);
>  #define param_check_uint(name, p) __param_check(name, p, unsigned int)
> +int param_set_uint_minmax(const char *val,
> +			  const struct kernel_param *kp,
> +			  unsigned int min, unsigned int max);

Super minor nitpick, but for consistency I'd move it above the
param_check_uint definition.
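
I.e. something like this (only illustrating the placement, untested):

 extern int param_set_uint(const char *val, const struct kernel_param *kp);
 extern int param_get_uint(char *buffer, const struct kernel_param *kp);
+int param_set_uint_minmax(const char *val,
+			  const struct kernel_param *kp,
+			  unsigned int min, unsigned int max);
 #define param_check_uint(name, p) __param_check(name, p, unsigned int)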


* Re: [PATCH v3 0/4] nvme: protect against possible request reference after completion
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
                   ` (3 preceding siblings ...)
  2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
@ 2021-07-16  7:15 ` Christoph Hellwig
  4 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-07-16  7:15 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

Thanks,

applied to nvme-5.15.



Thread overview: 13+ messages
2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
2021-06-17  5:45   ` Hannes Reinecke
2021-06-17  8:00   ` Daniel Wagner
2021-06-17 13:48   ` Christoph Hellwig
2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
2021-06-17  5:46   ` Hannes Reinecke
2021-06-17  8:04   ` Daniel Wagner
2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
2021-06-17  8:11   ` Daniel Wagner
2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
2021-06-17  8:56   ` Daniel Wagner
2021-07-16  7:15 ` [PATCH v3 0/4] nvme: protect against possible request reference after completion Christoph Hellwig
