All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/12] io_uring attached nvme queue
       [not found] <CGME20230429094228epcas5p4a80d8ed77433989fa804ecf449f83b0b@epcas5p4.samsung.com>
@ 2023-04-29  9:39 ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 01/12] nvme: refactor nvme_alloc_io_tag_set Kanchan Joshi
                     ` (12 more replies)
  0 siblings, 13 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi

This series shows one way to do what the title says.
This puts up a more direct/lean path that enables
 - submission from io_uring SQE to NVMe SQE
 - completion from NVMe CQE to io_uring CQE
Essentially cutting the hoops (involving request/bio) for nvme io path.

Also, io_uring ring is not to be shared among application threads.
Application is responsible for building the sharing (if it feels the
need). This means ring-associated exclusive queue can do away with some
synchronization costs that occur for shared queue.

Primary objective is to amp up of efficiency of kernel io path further
(towards PCIe gen N, N+1 hardware).
And we are seeing some asks too [1].

Building-blocks
===============
At high level, series can be divided into following parts -

1. nvme driver starts exposing some queue-pairs (SQ+CQ) that can
be attached to other in-kernel user (not just to block-layer, which is
the case at the moment) on demand.

Example:
insmod nvme.ko poll_queus=1 raw_queues=2

nvme0: 24/0/1/2 default/read/poll queues/raw queues

While driver registers other queues with block-layer, raw-queues are
rather reserved for exclusive attachment with other in-kernel users.
At this point, each raw-queue is interrupt-disabled (similar to
poll_queues). Maybe we need a better name for these (e.g. app/user queues).
[Refer: patch 2]

2. register/unregister queue interface
(a) one for io_uring application to ask for device-queue and register
with the ring. [Refer: patch 4]
(b) another at nvme so that other in-kernel users (io_uring for now) can
ask for a raw-queue. [Refer: patch 3, 5, 6]

The latter returns a qid, that io_uring stores internally (not exposed
to user-space) in the ring ctx. At max one queue per ring is enabled.
Ring has no other special properties except the fact that it stores a
qid that it can use exclusively. So application can very well use the
ring to do other things than nvme io.

3. user-interface to send commands down this way
(a) uring-cmd is extended to support a new flag "IORING_URING_CMD_DIRECT"
that application passes in the SQE. That is all.
(b) the flag goes down to provider of ->uring_cmd which may choose to do
  things differently based on it (or ignore it).
[Refer: patch 7]

4. nvme uring-cmd understands the above flag. It submits the command
into the known pre-registered queue, and completes (polled-completion)
from it. Transformation from "struct io_uring_cmd" to "nvme command" is
done directly without building other intermediate constructs.
[Refer: patch 8, 10, 12]

Testing and Performance
=======================
fio and t/io_uring is modified to exercise this path.
- fio: new "registerqueues" option
- t/io_uring: new "k" option

Good part:
2.96M -> 5.02M

nvme io (without this):
# t/io_uring -b512 -d64 -c2 -s2 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k0 /dev/ng0n1
submitter=0, tid=2922, file=/dev/ng0n1, node=-1
polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=0 QD=64
Engine=io_uring, sq_ring=64, cq_ring=64
IOPS=2.89M, BW=1412MiB/s, IOS/call=2/1
IOPS=2.92M, BW=1426MiB/s, IOS/call=2/2
IOPS=2.96M, BW=1444MiB/s, IOS/call=2/1
Exiting on timeout
Maximum IOPS=2.96M

nvme io (with this):
# t/io_uring -b512 -d64 -c2 -s2 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
submitter=0, tid=2927, file=/dev/ng0n1, node=-1
polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
Engine=io_uring, sq_ring=64, cq_ring=64
IOPS=4.99M, BW=2.43GiB/s, IOS/call=2/1
IOPS=5.02M, BW=2.45GiB/s, IOS/call=2/1
IOPS=5.02M, BW=2.45GiB/s, IOS/call=2/1
Exiting on timeout
Maximum IOPS=5.02M

Not so good part:
While single IO is fast this way, we do not have batching abilities for
multi-io scenario. Plugging, submission and completion batching are tied to
block-layer constructs. Things should look better if we could do something
about that.
Particularly something is off with the completion-batching.

With -s32 and -c32, the numbers decline:

# t/io_uring -b512 -d64 -c32 -s32 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
submitter=0, tid=3674, file=/dev/ng0n1, node=-1
polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
Engine=io_uring, sq_ring=64, cq_ring=64
IOPS=3.70M, BW=1806MiB/s, IOS/call=32/31
IOPS=3.71M, BW=1812MiB/s, IOS/call=32/31
IOPS=3.71M, BW=1812MiB/s, IOS/call=32/32
Exiting on timeout
Maximum IOPS=3.71M

And perf gets restored if we go back to -c2

# t/io_uring -b512 -d64 -c2 -s32 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
submitter=0, tid=3677, file=/dev/ng0n1, node=-1
polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
Engine=io_uring, sq_ring=64, cq_ring=64
IOPS=4.99M, BW=2.44GiB/s, IOS/call=5/5
IOPS=5.02M, BW=2.45GiB/s, IOS/call=5/5
IOPS=5.02M, BW=2.45GiB/s, IOS/call=5/5
Exiting on timeout
Maximum IOPS=5.02M

Source
======
Kernel: https://github.com/OpenMPDK/linux/tree/feat/directq-v1
fio: https://github.com/OpenMPDK/fio/commits/feat/rawq-v2

Please take a look.

[1]
https://lore.kernel.org/io-uring/24179a47-ab37-fa32-d177-1086668fbd3d@linux.alibaba.com/

Anuj Gupta (5):
  fs, block: interface to register/unregister the raw-queue
  io_uring, fs: plumb support to register/unregister raw-queue
  nvme: wire-up register/unregister queue f_op callback
  block: add mq_ops to submit and complete commands from raw-queue
  pci: modify nvme_setup_prp_simple parameters

Kanchan Joshi (7):
  nvme: refactor nvme_alloc_io_tag_set
  pci: enable "raw_queues = N" module parameter
  pci: implement register/unregister functionality
  io_uring: support for using registered queue in uring-cmd
  nvme: carve out a helper to prepare nvme_command from ioucmd->cmd
  nvme: submisssion/completion of uring_cmd to/from the registered queue
  pci: implement submission/completion for rawq commands

 drivers/nvme/host/core.c       |  31 ++-
 drivers/nvme/host/fc.c         |   3 +-
 drivers/nvme/host/ioctl.c      | 234 +++++++++++++++----
 drivers/nvme/host/multipath.c  |   2 +
 drivers/nvme/host/nvme.h       |  19 +-
 drivers/nvme/host/pci.c        | 409 +++++++++++++++++++++++++++++++--
 drivers/nvme/host/rdma.c       |   2 +-
 drivers/nvme/host/tcp.c        |   3 +-
 drivers/nvme/target/loop.c     |   3 +-
 fs/file.c                      |  14 ++
 include/linux/blk-mq.h         |   5 +
 include/linux/fs.h             |   4 +
 include/linux/io_uring.h       |   6 +
 include/linux/io_uring_types.h |   3 +
 include/uapi/linux/io_uring.h  |   6 +
 io_uring/io_uring.c            |  60 +++++
 io_uring/uring_cmd.c           |  14 +-
 17 files changed, 739 insertions(+), 79 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 01/12] nvme: refactor nvme_alloc_io_tag_set
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 02/12] pci: enable "raw_queues = N" module parameter Kanchan Joshi
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi

Allow this function to take nr_hw_queues as a parameter. And change all
the callers. This is in preparation to introduce queues which do not
have to be registered with block-layer.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 drivers/nvme/host/core.c   | 5 +++--
 drivers/nvme/host/fc.c     | 3 ++-
 drivers/nvme/host/nvme.h   | 2 +-
 drivers/nvme/host/pci.c    | 3 ++-
 drivers/nvme/host/rdma.c   | 2 +-
 drivers/nvme/host/tcp.c    | 3 ++-
 drivers/nvme/target/loop.c | 3 ++-
 7 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1bfd52eae2ee..ba476c48d566 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4962,7 +4962,7 @@ EXPORT_SYMBOL_GPL(nvme_remove_admin_tag_set);
 
 int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 		const struct blk_mq_ops *ops, unsigned int nr_maps,
-		unsigned int cmd_size)
+		unsigned int cmd_size, unsigned int nr_hw_queues)
 {
 	int ret;
 
@@ -4983,9 +4983,10 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 		set->flags |= BLK_MQ_F_BLOCKING;
 	set->cmd_size = cmd_size,
 	set->driver_data = ctrl;
-	set->nr_hw_queues = ctrl->queue_count - 1;
+	set->nr_hw_queues = nr_hw_queues - 1;
 	set->timeout = NVME_IO_TIMEOUT;
 	set->nr_maps = nr_maps;
+
 	ret = blk_mq_alloc_tag_set(set);
 	if (ret)
 		return ret;
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 456ee42a6133..c7dd084c1999 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2918,7 +2918,8 @@ nvme_fc_create_io_queues(struct nvme_fc_ctrl *ctrl)
 	ret = nvme_alloc_io_tag_set(&ctrl->ctrl, &ctrl->tag_set,
 			&nvme_fc_mq_ops, 1,
 			struct_size((struct nvme_fcp_op_w_sgl *)NULL, priv,
-				    ctrl->lport->ops->fcprqst_priv_sz));
+				    ctrl->lport->ops->fcprqst_priv_sz),
+				    ctrl->ctrl.queue_count);
 	if (ret)
 		return ret;
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bf46f122e9e1..73992dc9dec7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -748,7 +748,7 @@ int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 void nvme_remove_admin_tag_set(struct nvme_ctrl *ctrl);
 int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 		const struct blk_mq_ops *ops, unsigned int nr_maps,
-		unsigned int cmd_size);
+		unsigned int cmd_size, unsigned int nr_hw_queues);
 void nvme_remove_io_tag_set(struct nvme_ctrl *ctrl);
 
 void nvme_remove_namespaces(struct nvme_ctrl *ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7f25c0fe3a0b..3a38ee6ee129 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3033,7 +3033,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	if (dev->online_queues > 1) {
 		nvme_alloc_io_tag_set(&dev->ctrl, &dev->tagset, &nvme_mq_ops,
-				nvme_pci_nr_maps(dev), sizeof(struct nvme_iod));
+				nvme_pci_nr_maps(dev), sizeof(struct nvme_iod),
+				dev->ctrl.queue_count);
 		nvme_dbbuf_set(dev);
 	}
 
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 0eb79696fb73..2e2e131edc62 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -795,7 +795,7 @@ static int nvme_rdma_alloc_tag_set(struct nvme_ctrl *ctrl)
 	return nvme_alloc_io_tag_set(ctrl, &to_rdma_ctrl(ctrl)->tag_set,
 			&nvme_rdma_mq_ops,
 			ctrl->opts->nr_poll_queues ? HCTX_MAX_TYPES : 2,
-			cmd_size);
+			cmd_size, ctrl->queue_count);
 }
 
 static void nvme_rdma_destroy_admin_queue(struct nvme_rdma_ctrl *ctrl)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index bf0230442d57..50efbd724284 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1893,7 +1893,8 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		ret = nvme_alloc_io_tag_set(ctrl, &to_tcp_ctrl(ctrl)->tag_set,
 				&nvme_tcp_mq_ops,
 				ctrl->opts->nr_poll_queues ? HCTX_MAX_TYPES : 2,
-				sizeof(struct nvme_tcp_request));
+				sizeof(struct nvme_tcp_request),
+				ctrl->queue_count);
 		if (ret)
 			goto out_free_io_queues;
 	}
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index f2d24b2d992f..f02419b537f7 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -496,7 +496,8 @@ static int nvme_loop_create_io_queues(struct nvme_loop_ctrl *ctrl)
 	ret = nvme_alloc_io_tag_set(&ctrl->ctrl, &ctrl->tag_set,
 			&nvme_loop_mq_ops, 1,
 			sizeof(struct nvme_loop_iod) +
-			NVME_INLINE_SG_CNT * sizeof(struct scatterlist));
+			NVME_INLINE_SG_CNT * sizeof(struct scatterlist),
+			ctrl->ctrl.queue_count);
 	if (ret)
 		goto out_destroy_queues;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 02/12] pci: enable "raw_queues = N" module parameter
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 01/12] nvme: refactor nvme_alloc_io_tag_set Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 03/12] fs, block: interface to register/unregister the raw-queue Kanchan Joshi
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi

Add the infrastructure that carves out N nvme queue-pairs (struct
nvme_queue) which are not registered with the block layer.
The last N entries in dev->nvmeq[] are available to be attached
on demand.
Similar to poll_queues, these are interrupt-disabled.

This patch does not introduce the interface to attach/detach these
queues with any user. That is to be followed in subsequent patches.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 drivers/nvme/host/pci.c | 49 +++++++++++++++++++++++++++++------------
 1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 3a38ee6ee129..d366a76cc304 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -101,6 +101,10 @@ static unsigned int poll_queues;
 module_param_cb(poll_queues, &io_queue_count_ops, &poll_queues, 0644);
 MODULE_PARM_DESC(poll_queues, "Number of queues to use for polled IO.");
 
+static unsigned int raw_queues;
+module_param_cb(raw_queues, &io_queue_count_ops, &raw_queues, 0644);
+MODULE_PARM_DESC(raw_queues, "Number of polled, unmanaged queues.");
+
 static bool noacpi;
 module_param(noacpi, bool, 0444);
 MODULE_PARM_DESC(noacpi, "disable acpi bios quirks");
@@ -159,6 +163,7 @@ struct nvme_dev {
 	unsigned int nr_allocated_queues;
 	unsigned int nr_write_queues;
 	unsigned int nr_poll_queues;
+	unsigned int nr_raw_queues;
 };
 
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
@@ -209,6 +214,7 @@ struct nvme_queue {
 #define NVMEQ_SQ_CMB		1
 #define NVMEQ_DELETE_ERROR	2
 #define NVMEQ_POLLED		3
+#define NVMEQ_RAW		4
 	__le32 *dbbuf_sq_db;
 	__le32 *dbbuf_cq_db;
 	__le32 *dbbuf_sq_ei;
@@ -1599,7 +1605,8 @@ static int nvme_setup_io_queues_trylock(struct nvme_dev *dev)
 	return 0;
 }
 
-static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
+static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled,
+				bool rawq)
 {
 	struct nvme_dev *dev = nvmeq->dev;
 	int result;
@@ -1613,8 +1620,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
 	 */
 	if (!polled)
 		vector = dev->num_vecs == 1 ? 0 : qid;
-	else
+	else {
 		set_bit(NVMEQ_POLLED, &nvmeq->flags);
+		if (rawq)
+			set_bit(NVMEQ_RAW, &nvmeq->flags);
+	}
 
 	result = adapter_alloc_cq(dev, qid, nvmeq, vector);
 	if (result)
@@ -1770,7 +1780,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 
 static int nvme_create_io_queues(struct nvme_dev *dev)
 {
-	unsigned i, max, rw_queues;
+	unsigned i, max, rw_queues, rw_poll_queues;
 	int ret = 0;
 
 	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
@@ -1781,17 +1791,20 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 	}
 
 	max = min(dev->max_qid, dev->ctrl.queue_count - 1);
-	if (max != 1 && dev->io_queues[HCTX_TYPE_POLL]) {
+	if (max != 1 &&
+		(dev->io_queues[HCTX_TYPE_POLL] || dev->nr_raw_queues)) {
 		rw_queues = dev->io_queues[HCTX_TYPE_DEFAULT] +
 				dev->io_queues[HCTX_TYPE_READ];
+		rw_poll_queues = rw_queues + dev->io_queues[HCTX_TYPE_POLL];
 	} else {
 		rw_queues = max;
+		rw_poll_queues = max;
 	}
-
 	for (i = dev->online_queues; i <= max; i++) {
 		bool polled = i > rw_queues;
+		bool rawq = i > rw_poll_queues;
 
-		ret = nvme_create_queue(&dev->queues[i], i, polled);
+		ret = nvme_create_queue(&dev->queues[i], i, polled, rawq);
 		if (ret)
 			break;
 	}
@@ -2212,7 +2225,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
 		.calc_sets	= nvme_calc_irq_sets,
 		.priv		= dev,
 	};
-	unsigned int irq_queues, poll_queues;
+	unsigned int irq_queues, poll_queues, raw_queues;
 
 	/*
 	 * Poll queues don't need interrupts, but we need at least one I/O queue
@@ -2220,6 +2233,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
 	 */
 	poll_queues = min(dev->nr_poll_queues, nr_io_queues - 1);
 	dev->io_queues[HCTX_TYPE_POLL] = poll_queues;
+	raw_queues = dev->nr_raw_queues;
 
 	/*
 	 * Initialize for the single interrupt case, will be updated in
@@ -2235,7 +2249,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
 	 */
 	irq_queues = 1;
 	if (!(dev->ctrl.quirks & NVME_QUIRK_SINGLE_VECTOR))
-		irq_queues += (nr_io_queues - poll_queues);
+		irq_queues += (nr_io_queues - poll_queues - raw_queues);
 	return pci_alloc_irq_vectors_affinity(pdev, 1, irq_queues,
 			      PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
 }
@@ -2248,7 +2262,9 @@ static unsigned int nvme_max_io_queues(struct nvme_dev *dev)
 	 */
 	if (dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS)
 		return 1;
-	return num_possible_cpus() + dev->nr_write_queues + dev->nr_poll_queues;
+
+	return num_possible_cpus() + dev->nr_write_queues + dev->nr_poll_queues
+		+ dev->nr_raw_queues;
 }
 
 static int nvme_setup_io_queues(struct nvme_dev *dev)
@@ -2265,6 +2281,7 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	 */
 	dev->nr_write_queues = write_queues;
 	dev->nr_poll_queues = poll_queues;
+	dev->nr_raw_queues = raw_queues;
 
 	nr_io_queues = dev->nr_allocated_queues - 1;
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
@@ -2329,7 +2346,8 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 
 	dev->num_vecs = result;
 	result = max(result - 1, 1);
-	dev->max_qid = result + dev->io_queues[HCTX_TYPE_POLL];
+	dev->max_qid = result + dev->io_queues[HCTX_TYPE_POLL] +
+			dev->nr_raw_queues;
 
 	/*
 	 * Should investigate if there's a performance win from allocating
@@ -2356,10 +2374,11 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 		nvme_suspend_io_queues(dev);
 		goto retry;
 	}
-	dev_info(dev->ctrl.device, "%d/%d/%d default/read/poll queues\n",
+	dev_info(dev->ctrl.device, "%d/%d/%d/%d default/read/poll queues/raw queues\n",
 					dev->io_queues[HCTX_TYPE_DEFAULT],
 					dev->io_queues[HCTX_TYPE_READ],
-					dev->io_queues[HCTX_TYPE_POLL]);
+					dev->io_queues[HCTX_TYPE_POLL],
+					dev->nr_raw_queues);
 	return 0;
 out_unlock:
 	mutex_unlock(&dev->shutdown_lock);
@@ -2457,7 +2476,8 @@ static unsigned int nvme_pci_nr_maps(struct nvme_dev *dev)
 
 static void nvme_pci_update_nr_queues(struct nvme_dev *dev)
 {
-	blk_mq_update_nr_hw_queues(&dev->tagset, dev->online_queues - 1);
+	blk_mq_update_nr_hw_queues(&dev->tagset,
+			dev->online_queues - dev->nr_raw_queues - 1);
 	/* free previously allocated queues that are no longer usable */
 	nvme_free_queues(dev, dev->online_queues);
 }
@@ -2921,6 +2941,7 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
 
 	dev->nr_write_queues = write_queues;
 	dev->nr_poll_queues = poll_queues;
+	dev->nr_raw_queues = raw_queues;
 	dev->nr_allocated_queues = nvme_max_io_queues(dev) + 1;
 	dev->queues = kcalloc_node(dev->nr_allocated_queues,
 			sizeof(struct nvme_queue), GFP_KERNEL, node);
@@ -3034,7 +3055,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (dev->online_queues > 1) {
 		nvme_alloc_io_tag_set(&dev->ctrl, &dev->tagset, &nvme_mq_ops,
 				nvme_pci_nr_maps(dev), sizeof(struct nvme_iod),
-				dev->ctrl.queue_count);
+				dev->ctrl.queue_count - dev->nr_raw_queues);
 		nvme_dbbuf_set(dev);
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 03/12] fs, block: interface to register/unregister the raw-queue
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 01/12] nvme: refactor nvme_alloc_io_tag_set Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 02/12] pci: enable "raw_queues = N" module parameter Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 04/12] io_uring, fs: plumb support to register/unregister raw-queue Kanchan Joshi
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Anuj Gupta, Kanchan Joshi

From: Anuj Gupta <anuj20.g@samsung.com>

Introduce register/unregister callbacks to fops and mq_ops, so that
callers can attach/detach to raw-queues.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 include/linux/blk-mq.h | 2 ++
 include/linux/fs.h     | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 06caacd77ed6..7d6790be4847 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -657,6 +657,8 @@ struct blk_mq_ops {
 	 * information about a request.
 	 */
 	void (*show_rq)(struct seq_file *m, struct request *rq);
+	int (*register_queue)(void *data);
+	int (*unregister_queue)(void *data, int qid);
 #endif
 };
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 475d88640d3d..79acccc5e7d4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1798,6 +1798,8 @@ struct file_operations {
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
+	int (*register_queue)(struct file *f);
+	int (*unregister_queue)(struct file *f, int qid);
 } __randomize_layout;
 
 struct inode_operations {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 04/12] io_uring, fs: plumb support to register/unregister raw-queue
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (2 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 03/12] fs, block: interface to register/unregister the raw-queue Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Anuj Gupta, Kanchan Joshi

From: Anuj Gupta <anuj20.g@samsung.com>

Extend io_uring's registration interface with

- IORING_REGISTER_QUEUE: to ask for a queue. It goes down via
fops->register_queue and returns identifier of the queue. This qid is
stored in ring's ctx.

- IORING_UNREGISTER_QUEUE: to return the previously registered queue.

At max one queue is allowed to be attached with the io_uring's ring.
The file for which queue is requested is expected to be in
registered file-set.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
---
 fs/file.c                      | 14 ++++++++
 include/linux/fs.h             |  2 ++
 include/linux/io_uring_types.h |  3 ++
 include/uapi/linux/io_uring.h  |  4 +++
 io_uring/io_uring.c            | 60 ++++++++++++++++++++++++++++++++++
 5 files changed, 83 insertions(+)

diff --git a/fs/file.c b/fs/file.c
index 7893ea161d77..7dada9cd0911 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -1318,3 +1318,17 @@ int iterate_fd(struct files_struct *files, unsigned n,
 	return res;
 }
 EXPORT_SYMBOL(iterate_fd);
+
+int file_register_queue(struct file *file)
+{
+	if (file->f_op->register_queue)
+		return file->f_op->register_queue(file);
+	return -EINVAL;
+}
+
+int file_unregister_queue(struct file *file, int qid)
+{
+	if (file->f_op->unregister_queue)
+		return file->f_op->unregister_queue(file, qid);
+	return -EINVAL;
+}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 79acccc5e7d4..0a82aac6868b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3190,5 +3190,7 @@ extern int vfs_fadvise(struct file *file, loff_t offset, loff_t len,
 		       int advice);
 extern int generic_fadvise(struct file *file, loff_t offset, loff_t len,
 			   int advice);
+int file_register_queue(struct file *file);
+int file_unregister_queue(struct file *file, int qid);
 
 #endif /* _LINUX_FS_H */
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 1b2a20a42413..8d4e721493d6 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -364,6 +364,9 @@ struct io_ring_ctx {
 	unsigned			sq_thread_idle;
 	/* protected by ->completion_lock */
 	unsigned			evfd_last_cq_tail;
+	/* for io_uring attached device queue */
+	int				dev_qid;
+	int				dev_fd;
 };
 
 struct io_tw_state {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 0716cb17e436..a9d59bfd26f7 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -523,6 +523,10 @@ enum {
 	/* register a range of fixed file slots for automatic slot allocation */
 	IORING_REGISTER_FILE_ALLOC_RANGE	= 25,
 
+	/* register a device-queue with the ring */
+	IORING_REGISTER_QUEUE			= 26,
+	IORING_UNREGISTER_QUEUE			= 27,
+
 	/* this goes last */
 	IORING_REGISTER_LAST,
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 3bca7a79efda..5a9b7adf438e 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -337,6 +337,8 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	INIT_WQ_LIST(&ctx->locked_free_list);
 	INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
 	INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
+	/* -EINVAL implies nothing is registered with this ring */
+	ctx->dev_qid = -EINVAL;
 	return ctx;
 err:
 	kfree(ctx->dummy_ubuf);
@@ -2822,6 +2824,51 @@ static void io_req_caches_free(struct io_ring_ctx *ctx)
 	mutex_unlock(&ctx->uring_lock);
 }
 
+static int io_register_queue(struct io_ring_ctx *ctx, void __user *arg)
+{
+	struct io_fixed_file *file_slot;
+	struct file *file;
+	__s32 __user *fds = arg;
+	int fd, qid;
+
+	if (ctx->dev_qid != -EINVAL)
+		return -EINVAL;
+	if (copy_from_user(&fd, fds, sizeof(*fds)))
+		return -EFAULT;
+	file_slot = io_fixed_file_slot(&ctx->file_table,
+			array_index_nospec(fd, ctx->nr_user_files));
+	if (!file_slot->file_ptr)
+		return -EBADF;
+	file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+	qid = file_register_queue(file);
+	if (qid < 0)
+		return qid;
+	ctx->dev_fd = fd;
+	ctx->dev_qid = qid;
+	return 0;
+}
+
+static int io_unregister_queue(struct io_ring_ctx *ctx)
+{
+	struct io_fixed_file *file_slot;
+	struct file *file;
+	int ret;
+
+	if (ctx->dev_qid == -EINVAL)
+		return 0;
+	file_slot = io_fixed_file_slot(&ctx->file_table,
+			array_index_nospec(ctx->dev_fd, ctx->nr_user_files));
+	if (!file_slot)
+		return -EBADF;
+	if (!file_slot->file_ptr)
+		return -EBADF;
+	file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+	ret = file_unregister_queue(file, ctx->dev_qid);
+	if (!ret)
+		ctx->dev_qid = -EINVAL;
+	return ret;
+}
+
 static void io_rsrc_node_cache_free(struct io_cache_entry *entry)
 {
 	kfree(container_of(entry, struct io_rsrc_node, cache));
@@ -2835,6 +2882,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 		return;
 
 	mutex_lock(&ctx->uring_lock);
+	io_unregister_queue(ctx);
 	if (ctx->buf_data)
 		__io_sqe_buffers_unregister(ctx);
 	if (ctx->file_data)
@@ -4418,6 +4466,18 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_register_file_alloc_range(ctx, arg);
 		break;
+	case IORING_REGISTER_QUEUE:
+		ret = -EINVAL;
+		if (!arg || nr_args != 1)
+			break;
+		ret = io_register_queue(ctx, arg);
+		break;
+	case IORING_UNREGISTER_QUEUE:
+		ret = -EINVAL;
+		if (arg || nr_args)
+			break;
+		ret = io_unregister_queue(ctx);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (3 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 04/12] io_uring, fs: plumb support to register/unregister raw-queue Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29 12:02     ` kernel test robot
                       ` (3 more replies)
  2023-04-29  9:39   ` [RFC PATCH 06/12] pci: implement register/unregister functionality Kanchan Joshi
                     ` (7 subsequent siblings)
  12 siblings, 4 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Anuj Gupta, Kanchan Joshi

From: Anuj Gupta <anuj20.g@samsung.com>

Implement register/unregister handlers for char-device file.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 drivers/nvme/host/core.c      | 26 +++++++++++++++++++
 drivers/nvme/host/ioctl.c     | 48 +++++++++++++++++++++++++++++++++++
 drivers/nvme/host/multipath.c |  2 ++
 drivers/nvme/host/nvme.h      |  2 ++
 4 files changed, 78 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ba476c48d566..4462ce50d076 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4059,6 +4059,30 @@ static int nvme_ns_chr_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
+int nvme_register_queue(struct file *file)
+{
+	struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
+			struct nvme_ns, cdev);
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
+
+	if (q->mq_ops && q->mq_ops->register_queue)
+		return q->mq_ops->register_queue(ns);
+	return -EINVAL;
+}
+
+int nvme_unregister_queue(struct file *file, int qid)
+{
+	struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
+			struct nvme_ns, cdev);
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
+
+	if (q->mq_ops && q->mq_ops->unregister_queue)
+		return q->mq_ops->unregister_queue(ns, qid);
+	return -EINVAL;
+}
+
 static const struct file_operations nvme_ns_chr_fops = {
 	.owner		= THIS_MODULE,
 	.open		= nvme_ns_chr_open,
@@ -4067,6 +4091,8 @@ static const struct file_operations nvme_ns_chr_fops = {
 	.compat_ioctl	= compat_ptr_ioctl,
 	.uring_cmd	= nvme_ns_chr_uring_cmd,
 	.uring_cmd_iopoll = nvme_ns_chr_uring_cmd_iopoll,
+	.register_queue	= nvme_register_queue,
+	.unregister_queue = nvme_unregister_queue,
 };
 
 static int nvme_add_ns_cdev(struct nvme_ns *ns)
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d24ea2e05156..292a578686b6 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -871,6 +871,54 @@ long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd,
 	return ret;
 }
 
+int nvme_ns_head_register_queue(struct file *file)
+{
+	struct cdev *cdev = file_inode(file)->i_cdev;
+	struct nvme_ns_head *head =
+		container_of(cdev, struct nvme_ns_head, cdev);
+	struct nvme_ns *ns;
+	struct nvme_ctrl *ctrl;
+	struct request_queue *q;
+	int srcu_idx, ret = -EINVAL;
+
+	srcu_idx = srcu_read_lock(&head->srcu);
+	ns = nvme_find_path(head);
+	if (!ns)
+		goto out_unlock;
+
+	ctrl = ns->ctrl;
+	q = ns ? ns->queue : ctrl->admin_q;
+	if (q->mq_ops && q->mq_ops->register_queue)
+		ret = q->mq_ops->register_queue(ns);
+out_unlock:
+	srcu_read_unlock(&head->srcu, srcu_idx);
+	return ret;
+}
+
+int nvme_ns_head_unregister_queue(struct file *file, int qid)
+{
+	struct cdev *cdev = file_inode(file)->i_cdev;
+	struct nvme_ns_head *head =
+		container_of(cdev, struct nvme_ns_head, cdev);
+	struct nvme_ns *ns;
+	struct nvme_ctrl *ctrl;
+	struct request_queue *q;
+	int srcu_idx, ret = -EINVAL;
+
+	srcu_idx = srcu_read_lock(&head->srcu);
+	ns = nvme_find_path(head);
+	if (!ns)
+		goto out_unlock;
+
+	ctrl = ns->ctrl;
+	q = ns ? ns->queue : ctrl->admin_q;
+	if (q->mq_ops && q->mq_ops->unregister_queue)
+		ret = q->mq_ops->unregister_queue(ns, qid);
+out_unlock:
+	srcu_read_unlock(&head->srcu, srcu_idx);
+	return ret;
+}
+
 int nvme_ns_head_chr_uring_cmd(struct io_uring_cmd *ioucmd,
 		unsigned int issue_flags)
 {
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 9171452e2f6d..eed30daf1a37 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -471,6 +471,8 @@ static const struct file_operations nvme_ns_head_chr_fops = {
 	.compat_ioctl	= compat_ptr_ioctl,
 	.uring_cmd	= nvme_ns_head_chr_uring_cmd,
 	.uring_cmd_iopoll = nvme_ns_head_chr_uring_cmd_iopoll,
+	.register_queue	= nvme_ns_head_register_queue,
+	.unregister_queue = nvme_ns_head_unregister_queue,
 };
 
 static int nvme_add_ns_head_cdev(struct nvme_ns_head *head)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 73992dc9dec7..4619a7498f8e 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -843,6 +843,8 @@ int nvme_ns_head_ioctl(struct block_device *bdev, fmode_t mode,
 		unsigned int cmd, unsigned long arg);
 long nvme_ns_head_chr_ioctl(struct file *file, unsigned int cmd,
 		unsigned long arg);
+int nvme_ns_head_register_queue(struct file *file);
+int nvme_ns_head_unregister_queue(struct file *file, int qid);
 long nvme_dev_ioctl(struct file *file, unsigned int cmd,
 		unsigned long arg);
 int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 06/12] pci: implement register/unregister functionality
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (4 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29 12:13     ` kernel test robot
  2023-04-29 12:23     ` kernel test robot
  2023-04-29  9:39   ` [RFC PATCH 07/12] io_uring: support for using registered queue in uring-cmd Kanchan Joshi
                     ` (6 subsequent siblings)
  12 siblings, 2 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi, Anuj Gupta

Implement the register callback. It checks if any raw-queue is available to
be attached. If found, it returns the qid.
During queue registration, iod and command-id bitmap are also preallocated.

Unregister callback does the opposite and returns the corresponding
queue to the pool of available raw-queues.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
---
 drivers/nvme/host/pci.c | 154 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 154 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d366a76cc304..b4498e198e8a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -116,6 +116,15 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
 static void nvme_delete_io_queues(struct nvme_dev *dev);
 static void nvme_update_attrs(struct nvme_dev *dev);
 
+enum {
+	Q_FREE,
+	Q_ALLOC
+};
+struct rawq_info {
+	int nr_free;
+	int q_state[];
+};
+
 /*
  * Represents an NVM Express device.  Each nvme_dev is a PCI function.
  */
@@ -164,6 +173,8 @@ struct nvme_dev {
 	unsigned int nr_write_queues;
 	unsigned int nr_poll_queues;
 	unsigned int nr_raw_queues;
+	struct mutex rawq_lock;
+	struct rawq_info *rawqi;
 };
 
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
@@ -195,6 +206,10 @@ struct nvme_queue {
 	struct nvme_dev *dev;
 	spinlock_t sq_lock;
 	void *sq_cmds;
+	 /* only used for raw queues: */
+	unsigned long *cmdid_bmp;
+	spinlock_t cmdid_lock;
+	struct nvme_iod *iod;
 	 /* only used for poll queues: */
 	spinlock_t cq_poll_lock ____cacheline_aligned_in_smp;
 	struct nvme_completion *cqes;
@@ -1661,6 +1676,141 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled,
 	return result;
 }
 
+static int setup_rawq_info(struct nvme_dev *dev, int nr_rawq)
+{
+	struct rawq_info *rawqi;
+	int size = sizeof(struct rawq_info) + nr_rawq * sizeof(int);
+
+	rawqi = kzalloc(size, GFP_KERNEL);
+	if (rawqi == NULL)
+		return -ENOMEM;
+	rawqi->nr_free = nr_rawq;
+	dev->rawqi = rawqi;
+	return 0;
+}
+
+static int nvme_pci_get_rawq(struct nvme_dev *dev)
+{
+	int i, qid, nr_rawq;
+	struct rawq_info *rawqi = NULL;
+	int ret = -EINVAL;
+
+	nr_rawq = dev->nr_raw_queues;
+	if (!nr_rawq)
+		return ret;
+
+	mutex_lock(&dev->rawq_lock);
+	if (dev->rawqi == NULL) {
+		ret = setup_rawq_info(dev, nr_rawq);
+		if (ret)
+			goto unlock;
+	}
+	rawqi = dev->rawqi;
+	if (rawqi->nr_free == 0) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+	for (i = 0; i < nr_rawq; i++) {
+		if (rawqi->q_state[i] == Q_FREE) {
+			rawqi->q_state[i] = Q_ALLOC;
+			qid = dev->nr_allocated_queues - nr_rawq - i;
+			rawqi->nr_free--;
+			ret = qid;
+			goto unlock;
+		}
+	}
+unlock:
+	mutex_unlock(&dev->rawq_lock);
+	return ret;
+}
+
+static int nvme_pci_put_rawq(struct nvme_dev *dev, int qid)
+{
+	int i, nr_rawq;
+	struct rawq_info *rawqi = NULL;
+	struct nvme_queue *nvmeq;
+
+	nr_rawq = dev->nr_raw_queues;
+	if (!nr_rawq || dev->rawqi == NULL)
+		return -EINVAL;
+
+	i = dev->nr_allocated_queues - nr_rawq - qid;
+	mutex_lock(&dev->rawq_lock);
+	rawqi = dev->rawqi;
+	if (rawqi->q_state[i] == Q_ALLOC) {
+		rawqi->q_state[i] = Q_FREE;
+		rawqi->nr_free++;
+	}
+	mutex_unlock(&dev->rawq_lock);
+	nvmeq = &dev->queues[qid];
+	kfree(nvmeq->cmdid_bmp);
+	kfree(nvmeq->iod);
+	return 0;
+}
+
+static int nvme_pci_alloc_cmdid_bmp(struct nvme_queue *nvmeq)
+{
+	int size = BITS_TO_LONGS(nvmeq->q_depth) * sizeof(unsigned long);
+
+	if (!test_bit(NVMEQ_RAW, &nvmeq->flags))
+		return -EINVAL;
+	nvmeq->cmdid_bmp = kzalloc(size, GFP_KERNEL);
+	if (!nvmeq->cmdid_bmp)
+		return -ENOMEM;
+	spin_lock_init(&nvmeq->cmdid_lock);
+	return 0;
+}
+
+static int nvme_pci_alloc_iod_array(struct nvme_queue *nvmeq)
+{
+	if (!test_bit(NVMEQ_RAW, &nvmeq->flags))
+		return -EINVAL;
+	nvmeq->iod = kcalloc(nvmeq->q_depth - 1, sizeof(struct nvme_iod),
+				 GFP_KERNEL);
+	if (!nvmeq->iod)
+		return -ENOMEM;
+	return 0;
+}
+
+static int nvme_pci_setup_rawq(struct nvme_queue *nvmeq)
+{
+	int ret;
+
+	ret = nvme_pci_alloc_cmdid_bmp(nvmeq);
+	if (ret)
+		return ret;
+	ret = nvme_pci_alloc_iod_array(nvmeq);
+	if (ret) {
+		kfree(nvmeq->cmdid_bmp);
+		return ret;
+	}
+	return ret;
+}
+
+static int nvme_pci_register_queue(void *data)
+{
+	struct nvme_ns *ns = (struct nvme_ns *) data;
+	struct nvme_dev *dev = to_nvme_dev(ns->ctrl);
+	int qid, ret;
+
+	qid = nvme_pci_get_rawq(dev);
+	if (qid > 0) {
+		/* setup command-id bitmap and iod array */
+		ret = nvme_pci_setup_rawq(&dev->queues[qid]);
+		if (ret < 0)
+			qid = ret;
+	}
+	return qid;
+}
+
+static int nvme_pci_unregister_queue(void *data, int qid)
+{
+	struct nvme_ns *ns = (struct nvme_ns *) data;
+	struct nvme_dev *dev = to_nvme_dev(ns->ctrl);
+
+	return nvme_pci_put_rawq(dev, qid);
+}
+
 static const struct blk_mq_ops nvme_mq_admin_ops = {
 	.queue_rq	= nvme_queue_rq,
 	.complete	= nvme_pci_complete_rq,
@@ -1679,6 +1829,8 @@ static const struct blk_mq_ops nvme_mq_ops = {
 	.map_queues	= nvme_pci_map_queues,
 	.timeout	= nvme_timeout,
 	.poll		= nvme_poll,
+	.register_queue	= nvme_pci_register_queue,
+	.unregister_queue =  nvme_pci_unregister_queue,
 };
 
 static void nvme_dev_remove_admin(struct nvme_dev *dev)
@@ -2698,6 +2850,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
 	nvme_free_tagset(dev);
 	put_device(dev->dev);
 	kfree(dev->queues);
+	kfree(dev->rawqi);
 	kfree(dev);
 }
 
@@ -2938,6 +3091,7 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
 		return ERR_PTR(-ENOMEM);
 	INIT_WORK(&dev->ctrl.reset_work, nvme_reset_work);
 	mutex_init(&dev->shutdown_lock);
+	mutex_init(&dev->rawq_lock);
 
 	dev->nr_write_queues = write_queues;
 	dev->nr_poll_queues = poll_queues;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 07/12] io_uring: support for using registered queue in uring-cmd
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (5 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 06/12] pci: implement register/unregister functionality Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 08/12] block: add mq_ops to submit and complete commands from raw-queue Kanchan Joshi
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi

Add a new flag IORING_URING_CMD_DIRECT that user-space can spcifiy in SQE.
If queue is registered with the ring, this flag goes down to the provider
of ->uring_cmd. Provider may choose to do things differently or ignore
this flag.
Also export a helper that allows retrieving the identifier of the
registered queue.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 include/linux/io_uring.h      |  6 ++++++
 include/uapi/linux/io_uring.h |  2 ++
 io_uring/uring_cmd.c          | 14 +++++++++++++-
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 35b9328ca335..bb6f900411e1 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -37,6 +37,7 @@ struct io_uring_cmd {
 };
 
 #if defined(CONFIG_IO_URING)
+int io_uring_cmd_import_qid(void *ioucmd);
 int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 			      struct iov_iter *iter, void *ioucmd);
 void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2,
@@ -67,6 +68,11 @@ static inline void io_uring_free(struct task_struct *tsk)
 		__io_uring_free(tsk);
 }
 #else
+static inline int io_uring_cmd_import_qid(void *ioucmd)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 			      struct iov_iter *iter, void *ioucmd)
 {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index a9d59bfd26f7..67fbcfd3f676 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -232,8 +232,10 @@ enum io_uring_op {
  * sqe->uring_cmd_flags
  * IORING_URING_CMD_FIXED	use registered buffer; pass this flag
  *				along with setting sqe->buf_index.
+ * IORING_URING_CMD_DIRECT	use registered queue for this cmd.
  */
 #define IORING_URING_CMD_FIXED	(1U << 0)
+#define IORING_URING_CMD_DIRECT	(1U << 1)
 
 
 /*
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 5113c9a48583..2a543b490045 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -89,9 +89,12 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		return -EINVAL;
 
 	ioucmd->flags = READ_ONCE(sqe->uring_cmd_flags);
-	if (ioucmd->flags & ~IORING_URING_CMD_FIXED)
+	if (ioucmd->flags & ~(IORING_URING_CMD_FIXED | IORING_URING_CMD_DIRECT))
 		return -EINVAL;
 
+	if (ioucmd->flags & IORING_URING_CMD_DIRECT &&
+			req->ctx->dev_qid == -EINVAL)
+		return -EINVAL;
 	if (ioucmd->flags & IORING_URING_CMD_FIXED) {
 		struct io_ring_ctx *ctx = req->ctx;
 		u16 index;
@@ -162,3 +165,12 @@ int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 	return io_import_fixed(rw, iter, req->imu, ubuf, len);
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_import_fixed);
+
+int io_uring_cmd_import_qid(void *ioucmd)
+{
+	struct io_kiocb *req = cmd_to_io_kiocb(ioucmd);
+	struct io_ring_ctx *ctx = req->ctx;
+
+	return ctx->dev_qid;
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_import_qid);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 08/12] block: add mq_ops to submit and complete commands from raw-queue
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (6 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 07/12] io_uring: support for using registered queue in uring-cmd Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 09/12] nvme: carve out a helper to prepare nvme_command from ioucmd->cmd Kanchan Joshi
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Anuj Gupta, Kanchan Joshi

From: Anuj Gupta <anuj20.g@samsung.com>

This patch introduces ->queue_uring_cmd and -> poll_uring_cmd in mq_ops,
which will allow to submit and complete (via polling) from the specified
raw-queue (denoted by qid).

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
---
 include/linux/blk-mq.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 7d6790be4847..dcce2939ff1e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -659,6 +659,9 @@ struct blk_mq_ops {
 	void (*show_rq)(struct seq_file *m, struct request *rq);
 	int (*register_queue)(void *data);
 	int (*unregister_queue)(void *data, int qid);
+	int (*queue_uring_cmd)(struct io_uring_cmd *ioucmd, int qid);
+	int (*poll_uring_cmd)(struct io_uring_cmd *ioucmd, int qid,
+			      struct io_comp_batch *);
 #endif
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 09/12] nvme: carve out a helper to prepare nvme_command from ioucmd->cmd
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (7 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 08/12] block: add mq_ops to submit and complete commands from raw-queue Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue Kanchan Joshi
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi, Anuj Gupta

This helper will be used in the subsequent patch.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
---
 drivers/nvme/host/ioctl.c | 81 +++++++++++++++++++++------------------
 drivers/nvme/host/nvme.h  | 11 ++++++
 2 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 292a578686b6..18f4f20f5e76 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -430,14 +430,6 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	return status;
 }
 
-struct nvme_uring_data {
-	__u64	metadata;
-	__u64	addr;
-	__u32	data_len;
-	__u32	metadata_len;
-	__u32	timeout_ms;
-};
-
 /*
  * This overlays struct io_uring_cmd pdu.
  * Expect build errors if this grows larger than that.
@@ -548,11 +540,50 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req,
 	return RQ_END_IO_NONE;
 }
 
+int nvme_prep_cmd_from_ioucmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
+		struct io_uring_cmd *ioucmd, struct nvme_command *c,
+		struct nvme_uring_data *d)
+{
+	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
+
+	c->common.opcode = READ_ONCE(cmd->opcode);
+	c->common.flags = READ_ONCE(cmd->flags);
+	if (c->common.flags)
+		return -EINVAL;
+
+	c->common.command_id = 0;
+	c->common.nsid = cpu_to_le32(cmd->nsid);
+	if (!nvme_validate_passthru_nsid(ctrl, ns, le32_to_cpu(c->common.nsid)))
+		return -EINVAL;
+
+	c->common.cdw2[0] = cpu_to_le32(READ_ONCE(cmd->cdw2));
+	c->common.cdw2[1] = cpu_to_le32(READ_ONCE(cmd->cdw3));
+	c->common.metadata = 0;
+	c->common.dptr.prp1 = c->common.dptr.prp2 = 0;
+	c->common.cdw10 = cpu_to_le32(READ_ONCE(cmd->cdw10));
+	c->common.cdw11 = cpu_to_le32(READ_ONCE(cmd->cdw11));
+	c->common.cdw12 = cpu_to_le32(READ_ONCE(cmd->cdw12));
+	c->common.cdw13 = cpu_to_le32(READ_ONCE(cmd->cdw13));
+	c->common.cdw14 = cpu_to_le32(READ_ONCE(cmd->cdw14));
+	c->common.cdw15 = cpu_to_le32(READ_ONCE(cmd->cdw15));
+
+	if (!nvme_cmd_allowed(ns, c, 0, ioucmd->file->f_mode))
+		return -EACCES;
+
+	d->metadata = READ_ONCE(cmd->metadata);
+	d->addr = READ_ONCE(cmd->addr);
+	d->data_len = READ_ONCE(cmd->data_len);
+	d->metadata_len = READ_ONCE(cmd->metadata_len);
+	d->timeout_ms = READ_ONCE(cmd->timeout_ms);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nvme_prep_cmd_from_ioucmd);
+
+
 static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		struct io_uring_cmd *ioucmd, unsigned int issue_flags, bool vec)
 {
 	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
-	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
 	struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
 	struct nvme_uring_data d;
 	struct nvme_command c;
@@ -562,35 +593,9 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	void *meta = NULL;
 	int ret;
 
-	c.common.opcode = READ_ONCE(cmd->opcode);
-	c.common.flags = READ_ONCE(cmd->flags);
-	if (c.common.flags)
-		return -EINVAL;
-
-	c.common.command_id = 0;
-	c.common.nsid = cpu_to_le32(cmd->nsid);
-	if (!nvme_validate_passthru_nsid(ctrl, ns, le32_to_cpu(c.common.nsid)))
-		return -EINVAL;
-
-	c.common.cdw2[0] = cpu_to_le32(READ_ONCE(cmd->cdw2));
-	c.common.cdw2[1] = cpu_to_le32(READ_ONCE(cmd->cdw3));
-	c.common.metadata = 0;
-	c.common.dptr.prp1 = c.common.dptr.prp2 = 0;
-	c.common.cdw10 = cpu_to_le32(READ_ONCE(cmd->cdw10));
-	c.common.cdw11 = cpu_to_le32(READ_ONCE(cmd->cdw11));
-	c.common.cdw12 = cpu_to_le32(READ_ONCE(cmd->cdw12));
-	c.common.cdw13 = cpu_to_le32(READ_ONCE(cmd->cdw13));
-	c.common.cdw14 = cpu_to_le32(READ_ONCE(cmd->cdw14));
-	c.common.cdw15 = cpu_to_le32(READ_ONCE(cmd->cdw15));
-
-	if (!nvme_cmd_allowed(ns, &c, 0, ioucmd->file->f_mode))
-		return -EACCES;
-
-	d.metadata = READ_ONCE(cmd->metadata);
-	d.addr = READ_ONCE(cmd->addr);
-	d.data_len = READ_ONCE(cmd->data_len);
-	d.metadata_len = READ_ONCE(cmd->metadata_len);
-	d.timeout_ms = READ_ONCE(cmd->timeout_ms);
+	ret = nvme_prep_cmd_from_ioucmd(ctrl, ns, ioucmd, &c, &d);
+	if (ret)
+		return ret;
 
 	if (issue_flags & IO_URING_F_NONBLOCK) {
 		rq_flags |= REQ_NOWAIT;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 4619a7498f8e..4eb45afc9484 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -168,6 +168,14 @@ struct nvme_request {
 	struct nvme_ctrl	*ctrl;
 };
 
+struct nvme_uring_data {
+	__u64	metadata;
+	__u64	addr;
+	__u32	data_len;
+	__u32	metadata_len;
+	__u32	timeout_ms;
+};
+
 /*
  * Mark a bio as coming in through the mpath node.
  */
@@ -811,6 +819,9 @@ static inline bool nvme_is_unique_nsid(struct nvme_ctrl *ctrl,
 		(ctrl->ctratt & NVME_CTRL_CTRATT_NVM_SETS);
 }
 
+int nvme_prep_cmd_from_ioucmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
+		struct io_uring_cmd *ioucmd, struct nvme_command *c,
+		struct nvme_uring_data *d);
 int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 		void *buf, unsigned bufflen);
 int __nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (8 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 09/12] nvme: carve out a helper to prepare nvme_command from ioucmd->cmd Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29 12:23     ` kernel test robot
  2023-04-29 12:44     ` kernel test robot
  2023-04-29  9:39   ` [RFC PATCH 11/12] pci: modify nvme_setup_prp_simple parameters Kanchan Joshi
                     ` (2 subsequent siblings)
  12 siblings, 2 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi

If IORING_URING_CMD_DIRECT flag is set, get the registered qid and
- submit the io_uring command via mq_ops->queue_uring_cmd.
- complete it via mq_ops->poll_uring_cmd.

If the command could not be submitted this way due to any reason,
abstract it and fallback to old way of submission.
This keeps IORING_URING_CMD_DIRECT flag advisory.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 drivers/nvme/host/ioctl.c | 105 +++++++++++++++++++++++++++++++++-----
 drivers/nvme/host/nvme.h  |   4 ++
 2 files changed, 97 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 18f4f20f5e76..df86fb4f132b 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -653,6 +653,23 @@ static bool is_ctrl_ioctl(unsigned int cmd)
 	return false;
 }
 
+static int nvme_uring_cmd_io_direct(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
+		struct io_uring_cmd *ioucmd, unsigned int issue_flags)
+{
+	struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
+	int qid = io_uring_cmd_import_qid(ioucmd);
+	struct nvme_uring_direct_pdu *pdu =
+		(struct nvme_uring_direct_pdu *)&ioucmd->pdu;
+
+	if ((issue_flags & IO_URING_F_IOPOLL) != IO_URING_F_IOPOLL)
+		return -EOPNOTSUPP;
+
+	pdu->ns = ns;
+	if (q->mq_ops && q->mq_ops->queue_uring_cmd)
+		return q->mq_ops->queue_uring_cmd(ioucmd, qid);
+	return -EOPNOTSUPP;
+}
+
 static int nvme_ctrl_ioctl(struct nvme_ctrl *ctrl, unsigned int cmd,
 		void __user *argp, fmode_t mode)
 {
@@ -763,6 +780,14 @@ static int nvme_ns_uring_cmd(struct nvme_ns *ns, struct io_uring_cmd *ioucmd,
 
 	switch (ioucmd->cmd_op) {
 	case NVME_URING_CMD_IO:
+		if (ioucmd->flags & IORING_URING_CMD_DIRECT) {
+			ret = nvme_uring_cmd_io_direct(ctrl, ns, ioucmd,
+					issue_flags);
+			if (ret == -EIOCBQUEUED)
+				return ret;
+			/* in case of any error, just fallback */
+			ioucmd->flags &= ~(IORING_URING_CMD_DIRECT);
+		}
 		ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags, false);
 		break;
 	case NVME_URING_CMD_IO_VEC:
@@ -783,6 +808,38 @@ int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
 	return nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
 }
 
+/* similar to blk_mq_poll; may be possible to unify */
+int nvme_uring_cmd_iopoll_qid(struct request_queue *q,
+				 struct io_uring_cmd *ioucmd, int qid,
+				 struct io_comp_batch *iob,
+				 unsigned int flags)
+{
+	long state = get_current_state();
+	int ret;
+
+	if (!(q->mq_ops && q->mq_ops->poll_uring_cmd))
+		return 0;
+	do {
+		ret = q->mq_ops->poll_uring_cmd(ioucmd, qid, iob);
+		if (ret > 0) {
+			__set_current_state(TASK_RUNNING);
+			return ret;
+		}
+		if (signal_pending_state(state, current))
+			__set_current_state(TASK_RUNNING);
+		if (task_is_running(current))
+			return 1;
+
+		if (ret < 0 || (flags & BLK_POLL_ONESHOT))
+			break;
+		cpu_relax();
+
+	} while (!need_resched());
+
+	__set_current_state(TASK_RUNNING);
+	return 0;
+}
+
 int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd,
 				 struct io_comp_batch *iob,
 				 unsigned int poll_flags)
@@ -792,14 +849,26 @@ int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd,
 	struct nvme_ns *ns;
 	struct request_queue *q;
 
-	rcu_read_lock();
-	bio = READ_ONCE(ioucmd->cookie);
 	ns = container_of(file_inode(ioucmd->file)->i_cdev,
 			struct nvme_ns, cdev);
 	q = ns->queue;
-	if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev)
-		ret = bio_poll(bio, iob, poll_flags);
-	rcu_read_unlock();
+	if (!(ioucmd->flags & IORING_URING_CMD_DIRECT)) {
+		rcu_read_lock();
+		bio = READ_ONCE(ioucmd->cookie);
+		if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio && bio->bi_bdev)
+			ret = bio_poll(bio, iob, poll_flags);
+
+		rcu_read_unlock();
+	} else {
+		int qid = io_uring_cmd_import_qid(ioucmd);
+
+		if (qid <= 0)
+			return 0;
+		if (!percpu_ref_tryget(&q->q_usage_counter))
+			return 0;
+		ret = nvme_uring_cmd_iopoll_qid(q, ioucmd, qid, iob, poll_flags);
+		percpu_ref_put(&q->q_usage_counter);
+	}
 	return ret;
 }
 #ifdef CONFIG_NVME_MULTIPATH
@@ -952,13 +1021,25 @@ int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd,
 	struct request_queue *q;
 
 	if (ns) {
-		rcu_read_lock();
-		bio = READ_ONCE(ioucmd->cookie);
-		q = ns->queue;
-		if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio
-				&& bio->bi_bdev)
-			ret = bio_poll(bio, iob, poll_flags);
-		rcu_read_unlock();
+		if (!(ioucmd->flags & IORING_URING_CMD_DIRECT)) {
+			rcu_read_lock();
+			bio = READ_ONCE(ioucmd->cookie);
+			q = ns->queue;
+			if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio
+					&& bio->bi_bdev)
+				ret = bio_poll(bio, iob, poll_flags);
+			rcu_read_unlock();
+		} else {
+			int qid = io_uring_cmd_import_qid(ioucmd);
+
+			if (qid <= 0)
+				return 0;
+			if (!percpu_ref_tryget(&q->q_usage_counter))
+				return 0;
+			ret = nvme_uring_cmd_iopoll_qid(q, ioucmd, qid, iob,
+							poll_flags);
+			percpu_ref_put(&q->q_usage_counter);
+		}
 	}
 	srcu_read_unlock(&head->srcu, srcu_idx);
 	return ret;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 4eb45afc9484..2fd4432fbe12 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -176,6 +176,10 @@ struct nvme_uring_data {
 	__u32	timeout_ms;
 };
 
+struct nvme_uring_direct_pdu {
+	struct nvme_ns *ns;
+};
+
 /*
  * Mark a bio as coming in through the mpath node.
  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 11/12] pci: modify nvme_setup_prp_simple parameters
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (9 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29  9:39   ` [RFC PATCH 12/12] pci: implement submission/completion for rawq commands Kanchan Joshi
  2023-04-29 17:17   ` [RFC PATCH 00/12] io_uring attached nvme queue Jens Axboe
  12 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Anuj Gupta

From: Anuj Gupta <anuj20.g@samsung.com>

Refactor parameters of nvme_setup_prp_simple a bit.
This is a prep patch.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
---
 drivers/nvme/host/pci.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b4498e198e8a..30d7a1a6eaab 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -750,14 +750,13 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
 }
 
 static blk_status_t nvme_setup_prp_simple(struct nvme_dev *dev,
-		struct request *req, struct nvme_rw_command *cmnd,
-		struct bio_vec *bv)
+		struct nvme_iod *iod, struct nvme_rw_command *cmnd,
+		struct bio_vec *bv, int dma_dir)
 {
-	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	unsigned int offset = bv->bv_offset & (NVME_CTRL_PAGE_SIZE - 1);
 	unsigned int first_prp_len = NVME_CTRL_PAGE_SIZE - offset;
 
-	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req), 0);
+	iod->first_dma = dma_map_bvec(dev->dev, bv, dma_dir, 0);
 	if (dma_mapping_error(dev->dev, iod->first_dma))
 		return BLK_STS_RESOURCE;
 	iod->dma_len = bv->bv_len;
@@ -801,8 +800,10 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 
 		if (!is_pci_p2pdma_page(bv.bv_page)) {
 			if (bv.bv_offset + bv.bv_len <= NVME_CTRL_PAGE_SIZE * 2)
-				return nvme_setup_prp_simple(dev, req,
-							     &cmnd->rw, &bv);
+				return nvme_setup_prp_simple(dev,
+					       blk_mq_rq_to_pdu(req),
+					       &cmnd->rw, &bv,
+					       rq_dma_dir(req));
 
 			if (nvmeq->qid && sgl_threshold &&
 			    nvme_ctrl_sgl_supported(&dev->ctrl))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 12/12] pci: implement submission/completion for rawq commands
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (10 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 11/12] pci: modify nvme_setup_prp_simple parameters Kanchan Joshi
@ 2023-04-29  9:39   ` Kanchan Joshi
  2023-04-29 12:54     ` kernel test robot
  2023-05-03  5:36     ` kernel test robot
  2023-04-29 17:17   ` [RFC PATCH 00/12] io_uring attached nvme queue Jens Axboe
  12 siblings, 2 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-04-29  9:39 UTC (permalink / raw)
  To: axboe, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang, Kanchan Joshi, Anuj Gupta

Implement ->queue_uring_cmd and ->poll_uring_cmd for leaner submission
and completion. Both operate on "struct io_uring_cmd" directly, without
having to take hoops involving request and bio.

This is currently enabled for small (single segement length), premapped
(fixedbufs) IOs.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
---
 drivers/nvme/host/pci.c | 194 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 192 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 30d7a1a6eaab..ca0580c4e977 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -28,6 +28,8 @@
 #include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/sed-opal.h>
 #include <linux/pci-p2pdma.h>
+#include <linux/nvme_ioctl.h>
+#include <linux/io_uring.h>
 
 #include "trace.h"
 #include "nvme.h"
@@ -210,6 +212,7 @@ struct nvme_queue {
 	unsigned long *cmdid_bmp;
 	spinlock_t cmdid_lock;
 	struct nvme_iod *iod;
+	u8	reg_id;
 	 /* only used for poll queues: */
 	spinlock_t cq_poll_lock ____cacheline_aligned_in_smp;
 	struct nvme_completion *cqes;
@@ -249,7 +252,11 @@ union nvme_descriptor {
  * to the actual struct scatterlist.
  */
 struct nvme_iod {
-	struct nvme_request req;
+	union {
+		struct nvme_request req;
+		/* for raw-queue */
+		struct io_uring_cmd *ioucmd;
+	};
 	struct nvme_command cmd;
 	bool aborted;
 	s8 nr_allocations;	/* PRP list pool allocations. 0 means small
@@ -1025,6 +1032,13 @@ static inline void nvme_ring_cq_doorbell(struct nvme_queue *nvmeq)
 		writel(head, nvmeq->q_db + nvmeq->dev->db_stride);
 }
 
+static void nvme_pci_put_cmdid(struct nvme_queue *nvmeq, int id)
+{
+	spin_lock(&nvmeq->cmdid_lock);
+	clear_bit(id, nvmeq->cmdid_bmp);
+	spin_unlock(&nvmeq->cmdid_lock);
+}
+
 static inline struct blk_mq_tags *nvme_queue_tagset(struct nvme_queue *nvmeq)
 {
 	if (!nvmeq->qid)
@@ -1032,6 +1046,37 @@ static inline struct blk_mq_tags *nvme_queue_tagset(struct nvme_queue *nvmeq)
 	return nvmeq->dev->tagset.tags[nvmeq->qid - 1];
 }
 
+static inline struct nvme_uring_direct_pdu *nvme_uring_cmd_direct_pdu(
+		struct io_uring_cmd *ioucmd)
+{
+	return (struct nvme_uring_direct_pdu *)&ioucmd->pdu;
+}
+
+static inline void nvme_handle_cqe_rawq(struct nvme_queue *nvmeq,
+					struct nvme_completion *cqe,
+					__u16 command_id)
+{
+	struct nvme_dev *dev = nvmeq->dev;
+	u32 status = le16_to_cpu(cqe->status) >> 1;
+	u64 result = cqe->result.u64;
+	struct nvme_iod *iod;
+	int id, reg_id;
+
+	reg_id = nvme_genctr_from_cid(command_id);
+	/* we should not encounter completions from past registration*/
+	WARN_ON_ONCE(nvmeq->reg_id != reg_id);
+
+	id = nvme_tag_from_cid(command_id);
+	iod = &nvmeq->iod[id];
+
+	if (iod->dma_len)
+		dma_unmap_page(dev->dev, iod->first_dma, iod->dma_len,
+		nvme_is_write(&iod->cmd) ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
+
+	nvme_pci_put_cmdid(nvmeq, id);
+	io_uring_cmd_done(iod->ioucmd, status, result, IO_URING_F_UNLOCKED);
+}
+
 static inline void nvme_handle_cqe(struct nvme_queue *nvmeq,
 				   struct io_comp_batch *iob, u16 idx)
 {
@@ -1039,6 +1084,9 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq,
 	__u16 command_id = READ_ONCE(cqe->command_id);
 	struct request *req;
 
+	if (test_bit(NVMEQ_RAW, &nvmeq->flags))
+		return nvme_handle_cqe_rawq(nvmeq, cqe, command_id);
+
 	/*
 	 * AEN requests are special as they don't time out and can
 	 * survive any kind of queue freeze and often don't respond to
@@ -1151,6 +1199,25 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	return found;
 }
 
+static int nvme_poll_uring_cmd(struct io_uring_cmd *ioucmd, int qid,
+			       struct io_comp_batch *iob)
+
+{
+	struct nvme_uring_direct_pdu *pdu = nvme_uring_cmd_direct_pdu(ioucmd);
+	struct nvme_dev *dev = to_nvme_dev(pdu->ns->ctrl);
+	struct nvme_queue *nvmeq = &dev->queues[qid];
+	bool found;
+
+	if (!nvme_cqe_pending(nvmeq))
+		return 0;
+
+	spin_lock(&nvmeq->cq_poll_lock);
+	found = nvme_poll_cq(nvmeq, iob);
+	spin_unlock(&nvmeq->cq_poll_lock);
+
+	return found;
+}
+
 static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 {
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
@@ -1762,6 +1829,22 @@ static int nvme_pci_alloc_cmdid_bmp(struct nvme_queue *nvmeq)
 	return 0;
 }
 
+static int nvme_pci_get_cmdid(struct nvme_queue *nvmeq)
+{
+	int id = 0, size = nvmeq->q_depth - 1;
+
+	spin_lock(&nvmeq->cmdid_lock);
+	id = find_first_zero_bit(nvmeq->cmdid_bmp, size);
+	if (id >= size) {
+		id = -EBUSY;
+		goto unlock;
+	}
+	set_bit(id, nvmeq->cmdid_bmp);
+unlock:
+	spin_unlock(&nvmeq->cmdid_lock);
+	return id;
+}
+
 static int nvme_pci_alloc_iod_array(struct nvme_queue *nvmeq)
 {
 	if (!test_bit(NVMEQ_RAW, &nvmeq->flags))
@@ -1793,13 +1876,17 @@ static int nvme_pci_register_queue(void *data)
 	struct nvme_ns *ns = (struct nvme_ns *) data;
 	struct nvme_dev *dev = to_nvme_dev(ns->ctrl);
 	int qid, ret;
+	struct nvme_queue *nvmeq;
 
 	qid = nvme_pci_get_rawq(dev);
 	if (qid > 0) {
 		/* setup command-id bitmap and iod array */
-		ret = nvme_pci_setup_rawq(&dev->queues[qid]);
+		nvmeq = &dev->queues[qid];
+		ret = nvme_pci_setup_rawq(nvmeq);
 		if (ret < 0)
 			qid = ret;
+		else
+			nvmeq->reg_id++;
 	}
 	return qid;
 }
@@ -1812,6 +1899,107 @@ static int nvme_pci_unregister_queue(void *data, int qid)
 	return nvme_pci_put_rawq(dev, qid);
 }
 
+static int nvme_map_io_fb(struct request_queue *q, struct io_uring_cmd *ioucmd,
+			struct nvme_dev *dev, struct nvme_iod *iod,
+			unsigned long addr, unsigned int buf_len)
+{
+	struct nvme_command *cmnd = &iod->cmd;
+	int ret, rw = nvme_is_write(cmnd);
+	struct iov_iter iter;
+	size_t nr_iter, nr_segs;
+	struct bio_vec *bv;
+
+	if (!(ioucmd->flags & IORING_URING_CMD_FIXED))
+		return -EINVAL;
+
+	ret = io_uring_cmd_import_fixed(addr, buf_len, rw, &iter, ioucmd);
+	if (ret < 0)
+		return ret;
+	nr_iter = iov_iter_count(&iter);
+	nr_segs = iter.nr_segs;
+
+	/* device will do these checks anyway, so why do duplicate work? */
+	if (!nr_iter || (nr_iter >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
+		goto err;
+	if (nr_segs > queue_max_segments(q))
+		goto err;
+	/* this will be attempted from regular path instead */
+	if (nr_segs > 1)
+		goto err;
+
+	bv = (struct bio_vec *)&iter.bvec[0];
+	if (bv->bv_offset + bv->bv_len <= NVME_CTRL_PAGE_SIZE * 2)
+		return nvme_setup_prp_simple(dev, iod,
+				(struct nvme_rw_command *)cmnd,
+				bv, rw ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
+err:
+	return -EINVAL;
+}
+
+static int nvme_alloc_cmd_from_q(struct nvme_queue *nvmeq, int *cmdid,
+				struct nvme_iod **iod)
+{
+	int id;
+
+	id = nvme_pci_get_cmdid(nvmeq);
+	if (id < 0)
+		return id;
+	*cmdid = id | nvme_cid_install_genctr(nvmeq->reg_id);
+	*iod = &nvmeq->iod[id];
+	return 0;
+}
+
+static int nvme_pci_queue_ucmd(struct io_uring_cmd *ioucmd, int qid)
+{
+	struct nvme_uring_direct_pdu *pdu = nvme_uring_cmd_direct_pdu(ioucmd);
+	struct nvme_ns *ns = pdu->ns;
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct nvme_dev *dev = to_nvme_dev(ctrl);
+	struct nvme_command *nvme_cmd;
+	struct nvme_uring_data d;
+	int ret, cmdid;
+	struct nvme_iod *iod;
+	struct nvme_queue *nvmeq = &dev->queues[qid];
+
+	ret = nvme_alloc_cmd_from_q(nvmeq, &cmdid, &iod);
+	if (ret)
+		return ret;
+
+	iod->ioucmd = ioucmd;
+	nvme_cmd = &iod->cmd;
+	ret = nvme_prep_cmd_from_ioucmd(ctrl, ns, ioucmd, nvme_cmd, &d);
+	if (ret)
+		goto out;
+
+	nvme_cmd->common.command_id = cmdid;
+	iod->aborted = false;
+	iod->nr_allocations = -1;
+	iod->sgt.nents = 0;
+	ret = nvme_map_io_fb(ns->queue, ioucmd, dev, iod, d.addr, d.data_len);
+	if  (ret)
+		goto out;
+
+	/*
+	 * since this nvmeq is exclusive to single submitter (io_uring
+	 * follows one-thread-per-ring model), we do not need the lock
+	 * ideally. But we have lifetime difference between io_uring
+	 * and nvme sqe. io_uring SQE can be reused just after submission
+	 * but for nvme we have to wait until completion.
+	 * So if we run out of space, submission will be deferred to
+	 * io_uring worker. So we can't skip the lock.
+	 * But not all is lost! Due to one-thread-per-ring model and
+	 * polled-completion, contention is not common in most cases.
+	 */
+	spin_lock(&nvmeq->sq_lock);
+	nvme_sq_copy_cmd(nvmeq, &iod->cmd);
+	nvme_write_sq_db(nvmeq, true);
+	spin_unlock(&nvmeq->sq_lock);
+	return -EIOCBQUEUED;
+out:
+	nvme_pci_put_cmdid(nvmeq, cmdid);
+	return ret;
+}
+
 static const struct blk_mq_ops nvme_mq_admin_ops = {
 	.queue_rq	= nvme_queue_rq,
 	.complete	= nvme_pci_complete_rq,
@@ -1832,6 +2020,8 @@ static const struct blk_mq_ops nvme_mq_ops = {
 	.poll		= nvme_poll,
 	.register_queue	= nvme_pci_register_queue,
 	.unregister_queue =  nvme_pci_unregister_queue,
+	.queue_uring_cmd = nvme_pci_queue_ucmd,
+	.poll_uring_cmd	= nvme_poll_uring_cmd,
 };
 
 static void nvme_dev_remove_admin(struct nvme_dev *dev)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
  2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
@ 2023-04-29 12:02     ` kernel test robot
  2023-04-29 12:02     ` kernel test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:02 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on linus/master next-20230428]
[cannot apply to hch-configfs/for-next v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-6-joshi.k%40samsung.com
patch subject: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20230429/202304291931.mAMjfqtH-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/fb2d573b75b3615d774f841c417790d368b70ccd
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout fb2d573b75b3615d774f841c417790d368b70ccd
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304291931.mAMjfqtH-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/nvme/host/core.c:4062:5: warning: no previous prototype for 'nvme_register_queue' [-Wmissing-prototypes]
    4062 | int nvme_register_queue(struct file *file)
         |     ^~~~~~~~~~~~~~~~~~~
>> drivers/nvme/host/core.c:4074:5: warning: no previous prototype for 'nvme_unregister_queue' [-Wmissing-prototypes]
    4074 | int nvme_unregister_queue(struct file *file, int qid)
         |     ^~~~~~~~~~~~~~~~~~~~~


vim +/nvme_register_queue +4062 drivers/nvme/host/core.c

  4061	
> 4062	int nvme_register_queue(struct file *file)
  4063	{
  4064		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4065				struct nvme_ns, cdev);
  4066		struct nvme_ctrl *ctrl = ns->ctrl;
  4067		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4068	
  4069		if (q->mq_ops && q->mq_ops->register_queue)
  4070			return q->mq_ops->register_queue(ns);
  4071		return -EINVAL;
  4072	}
  4073	
> 4074	int nvme_unregister_queue(struct file *file, int qid)
  4075	{
  4076		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4077				struct nvme_ns, cdev);
  4078		struct nvme_ctrl *ctrl = ns->ctrl;
  4079		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4080	
  4081		if (q->mq_ops && q->mq_ops->unregister_queue)
  4082			return q->mq_ops->unregister_queue(ns, qid);
  4083		return -EINVAL;
  4084	}
  4085	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
  2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
  2023-04-29 12:02     ` kernel test robot
@ 2023-04-29 12:02     ` kernel test robot
  2023-04-29 12:02     ` kernel test robot
  2023-04-29 12:13     ` kernel test robot
  3 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:02 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: llvm, oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on linus/master next-20230428]
[cannot apply to hch-configfs/for-next v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-6-joshi.k%40samsung.com
patch subject: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
config: hexagon-randconfig-r012-20230429 (https://download.01.org/0day-ci/archive/20230429/202304291923.9c6BwInk-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 437b7602e4a998220871de78afcb020b9c14a661)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/fb2d573b75b3615d774f841c417790d368b70ccd
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout fb2d573b75b3615d774f841c417790d368b70ccd
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304291923.9c6BwInk-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from drivers/nvme/host/core.c:7:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from drivers/nvme/host/core.c:7:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from drivers/nvme/host/core.c:7:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   drivers/nvme/host/core.c:4069:30: error: no member named 'register_queue' in 'struct blk_mq_ops'
           if (q->mq_ops && q->mq_ops->register_queue)
                            ~~~~~~~~~  ^
   drivers/nvme/host/core.c:4070:21: error: no member named 'register_queue' in 'struct blk_mq_ops'
                   return q->mq_ops->register_queue(ns);
                          ~~~~~~~~~  ^
>> drivers/nvme/host/core.c:4062:5: warning: no previous prototype for function 'nvme_register_queue' [-Wmissing-prototypes]
   int nvme_register_queue(struct file *file)
       ^
   drivers/nvme/host/core.c:4062:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int nvme_register_queue(struct file *file)
   ^
   static 
   drivers/nvme/host/core.c:4081:30: error: no member named 'unregister_queue' in 'struct blk_mq_ops'
           if (q->mq_ops && q->mq_ops->unregister_queue)
                            ~~~~~~~~~  ^
   drivers/nvme/host/core.c:4082:21: error: no member named 'unregister_queue' in 'struct blk_mq_ops'
                   return q->mq_ops->unregister_queue(ns, qid);
                          ~~~~~~~~~  ^
>> drivers/nvme/host/core.c:4074:5: warning: no previous prototype for function 'nvme_unregister_queue' [-Wmissing-prototypes]
   int nvme_unregister_queue(struct file *file, int qid)
       ^
   drivers/nvme/host/core.c:4074:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int nvme_unregister_queue(struct file *file, int qid)
   ^
   static 
   8 warnings and 4 errors generated.


vim +/nvme_register_queue +4062 drivers/nvme/host/core.c

  4061	
> 4062	int nvme_register_queue(struct file *file)
  4063	{
  4064		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4065				struct nvme_ns, cdev);
  4066		struct nvme_ctrl *ctrl = ns->ctrl;
  4067		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4068	
> 4069		if (q->mq_ops && q->mq_ops->register_queue)
  4070			return q->mq_ops->register_queue(ns);
  4071		return -EINVAL;
  4072	}
  4073	
> 4074	int nvme_unregister_queue(struct file *file, int qid)
  4075	{
  4076		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4077				struct nvme_ns, cdev);
  4078		struct nvme_ctrl *ctrl = ns->ctrl;
  4079		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4080	
> 4081		if (q->mq_ops && q->mq_ops->unregister_queue)
  4082			return q->mq_ops->unregister_queue(ns, qid);
  4083		return -EINVAL;
  4084	}
  4085	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
  2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
  2023-04-29 12:02     ` kernel test robot
  2023-04-29 12:02     ` kernel test robot
@ 2023-04-29 12:02     ` kernel test robot
  2023-04-29 12:13     ` kernel test robot
  3 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:02 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-6-joshi.k%40samsung.com
patch subject: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
config: i386-randconfig-a003 (https://download.01.org/0day-ci/archive/20230429/202304291925.J9c5dVau-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/fb2d573b75b3615d774f841c417790d368b70ccd
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout fb2d573b75b3615d774f841c417790d368b70ccd
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304291925.J9c5dVau-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

>> drivers/nvme/host/core.c:4062:5: warning: no previous prototype for 'nvme_register_queue' [-Wmissing-prototypes]
    4062 | int nvme_register_queue(struct file *file)
         |     ^~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/core.c: In function 'nvme_register_queue':
>> drivers/nvme/host/core.c:4069:35: error: 'const struct blk_mq_ops' has no member named 'register_queue'
    4069 |         if (q->mq_ops && q->mq_ops->register_queue)
         |                                   ^~
   drivers/nvme/host/core.c:4070:33: error: 'const struct blk_mq_ops' has no member named 'register_queue'
    4070 |                 return q->mq_ops->register_queue(ns);
         |                                 ^~
   drivers/nvme/host/core.c: At top level:
>> drivers/nvme/host/core.c:4074:5: warning: no previous prototype for 'nvme_unregister_queue' [-Wmissing-prototypes]
    4074 | int nvme_unregister_queue(struct file *file, int qid)
         |     ^~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/core.c: In function 'nvme_unregister_queue':
>> drivers/nvme/host/core.c:4081:35: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
    4081 |         if (q->mq_ops && q->mq_ops->unregister_queue)
         |                                   ^~
   drivers/nvme/host/core.c:4082:33: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
    4082 |                 return q->mq_ops->unregister_queue(ns, qid);
         |                                 ^~


vim +4069 drivers/nvme/host/core.c

  4061	
> 4062	int nvme_register_queue(struct file *file)
  4063	{
  4064		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4065				struct nvme_ns, cdev);
  4066		struct nvme_ctrl *ctrl = ns->ctrl;
  4067		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4068	
> 4069		if (q->mq_ops && q->mq_ops->register_queue)
  4070			return q->mq_ops->register_queue(ns);
  4071		return -EINVAL;
  4072	}
  4073	
> 4074	int nvme_unregister_queue(struct file *file, int qid)
  4075	{
  4076		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4077				struct nvme_ns, cdev);
  4078		struct nvme_ctrl *ctrl = ns->ctrl;
  4079		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4080	
> 4081		if (q->mq_ops && q->mq_ops->unregister_queue)
  4082			return q->mq_ops->unregister_queue(ns, qid);
  4083		return -EINVAL;
  4084	}
  4085	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 06/12] pci: implement register/unregister functionality
  2023-04-29  9:39   ` [RFC PATCH 06/12] pci: implement register/unregister functionality Kanchan Joshi
@ 2023-04-29 12:13     ` kernel test robot
  2023-04-29 12:23     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:13 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-7-joshi.k%40samsung.com
patch subject: [RFC PATCH 06/12] pci: implement register/unregister functionality
config: csky-randconfig-r013-20230429 (https://download.01.org/0day-ci/archive/20230429/202304292023.jmgnfrZF-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6b117fabad0d77551dee34c2b3335dde220e057a
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout 6b117fabad0d77551dee34c2b3335dde220e057a
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304292023.jmgnfrZF-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/nvme/host/pci.c:1832:10: error: 'const struct blk_mq_ops' has no member named 'register_queue'
    1832 |         .register_queue = nvme_pci_register_queue,
         |          ^~~~~~~~~~~~~~
>> drivers/nvme/host/pci.c:1832:27: error: initialization of 'void (*)(struct request *)' from incompatible pointer type 'int (*)(void *)' [-Werror=incompatible-pointer-types]
    1832 |         .register_queue = nvme_pci_register_queue,
         |                           ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:1832:27: note: (near initialization for 'nvme_mq_ops.complete')
   drivers/nvme/host/pci.c:1832:27: warning: initialized field overwritten [-Woverride-init]
   drivers/nvme/host/pci.c:1832:27: note: (near initialization for 'nvme_mq_ops.complete')
>> drivers/nvme/host/pci.c:1833:10: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
    1833 |         .unregister_queue =  nvme_pci_unregister_queue,
         |          ^~~~~~~~~~~~~~~~
>> drivers/nvme/host/pci.c:1833:30: error: initialization of 'int (*)(struct blk_mq_hw_ctx *, void *, unsigned int)' from incompatible pointer type 'int (*)(void *, int)' [-Werror=incompatible-pointer-types]
    1833 |         .unregister_queue =  nvme_pci_unregister_queue,
         |                              ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:1833:30: note: (near initialization for 'nvme_mq_ops.init_hctx')
   drivers/nvme/host/pci.c:1833:30: warning: initialized field overwritten [-Woverride-init]
   drivers/nvme/host/pci.c:1833:30: note: (near initialization for 'nvme_mq_ops.init_hctx')
   cc1: some warnings being treated as errors


vim +1832 drivers/nvme/host/pci.c

  1821	
  1822	static const struct blk_mq_ops nvme_mq_ops = {
  1823		.queue_rq	= nvme_queue_rq,
  1824		.queue_rqs	= nvme_queue_rqs,
  1825		.complete	= nvme_pci_complete_rq,
  1826		.commit_rqs	= nvme_commit_rqs,
  1827		.init_hctx	= nvme_init_hctx,
  1828		.init_request	= nvme_pci_init_request,
  1829		.map_queues	= nvme_pci_map_queues,
  1830		.timeout	= nvme_timeout,
  1831		.poll		= nvme_poll,
> 1832		.register_queue	= nvme_pci_register_queue,
> 1833		.unregister_queue =  nvme_pci_unregister_queue,
  1834	};
  1835	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
  2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
                       ` (2 preceding siblings ...)
  2023-04-29 12:02     ` kernel test robot
@ 2023-04-29 12:13     ` kernel test robot
  3 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:13 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-6-joshi.k%40samsung.com
patch subject: [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback
config: alpha-randconfig-r032-20230428 (https://download.01.org/0day-ci/archive/20230429/202304291929.tFHH75JE-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/fb2d573b75b3615d774f841c417790d368b70ccd
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout fb2d573b75b3615d774f841c417790d368b70ccd
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304291929.tFHH75JE-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/nvme/host/core.c:4062:5: warning: no previous prototype for 'nvme_register_queue' [-Wmissing-prototypes]
    4062 | int nvme_register_queue(struct file *file)
         |     ^~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/core.c: In function 'nvme_register_queue':
>> drivers/nvme/host/core.c:4069:35: error: 'const struct blk_mq_ops' has no member named 'register_queue'
    4069 |         if (q->mq_ops && q->mq_ops->register_queue)
         |                                   ^~
   drivers/nvme/host/core.c:4070:33: error: 'const struct blk_mq_ops' has no member named 'register_queue'
    4070 |                 return q->mq_ops->register_queue(ns);
         |                                 ^~
   drivers/nvme/host/core.c: At top level:
   drivers/nvme/host/core.c:4074:5: warning: no previous prototype for 'nvme_unregister_queue' [-Wmissing-prototypes]
    4074 | int nvme_unregister_queue(struct file *file, int qid)
         |     ^~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/core.c: In function 'nvme_unregister_queue':
>> drivers/nvme/host/core.c:4081:35: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
    4081 |         if (q->mq_ops && q->mq_ops->unregister_queue)
         |                                   ^~
   drivers/nvme/host/core.c:4082:33: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
    4082 |                 return q->mq_ops->unregister_queue(ns, qid);
         |                                 ^~
--
   drivers/nvme/host/ioctl.c: In function 'nvme_ns_head_register_queue':
>> drivers/nvme/host/ioctl.c:891:35: error: 'const struct blk_mq_ops' has no member named 'register_queue'
     891 |         if (q->mq_ops && q->mq_ops->register_queue)
         |                                   ^~
   drivers/nvme/host/ioctl.c:892:32: error: 'const struct blk_mq_ops' has no member named 'register_queue'
     892 |                 ret = q->mq_ops->register_queue(ns);
         |                                ^~
   drivers/nvme/host/ioctl.c: In function 'nvme_ns_head_unregister_queue':
>> drivers/nvme/host/ioctl.c:915:35: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
     915 |         if (q->mq_ops && q->mq_ops->unregister_queue)
         |                                   ^~
   drivers/nvme/host/ioctl.c:916:32: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
     916 |                 ret = q->mq_ops->unregister_queue(ns, qid);
         |                                ^~


vim +4069 drivers/nvme/host/core.c

  4061	
> 4062	int nvme_register_queue(struct file *file)
  4063	{
  4064		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4065				struct nvme_ns, cdev);
  4066		struct nvme_ctrl *ctrl = ns->ctrl;
  4067		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4068	
> 4069		if (q->mq_ops && q->mq_ops->register_queue)
> 4070			return q->mq_ops->register_queue(ns);
  4071		return -EINVAL;
  4072	}
  4073	
  4074	int nvme_unregister_queue(struct file *file, int qid)
  4075	{
  4076		struct nvme_ns *ns = container_of(file_inode(file)->i_cdev,
  4077				struct nvme_ns, cdev);
  4078		struct nvme_ctrl *ctrl = ns->ctrl;
  4079		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
  4080	
> 4081		if (q->mq_ops && q->mq_ops->unregister_queue)
  4082			return q->mq_ops->unregister_queue(ns, qid);
  4083		return -EINVAL;
  4084	}
  4085	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue
  2023-04-29  9:39   ` [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue Kanchan Joshi
@ 2023-04-29 12:23     ` kernel test robot
  2023-04-29 12:44     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:23 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: llvm, oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-11-joshi.k%40samsung.com
patch subject: [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue
config: arm-randconfig-r032-20230429 (https://download.01.org/0day-ci/archive/20230429/202304292012.oxeNukfB-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 437b7602e4a998220871de78afcb020b9c14a661)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/intel-lab-lkp/linux/commit/e06694e83e548757cf886b603579ce384d95b3ab
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout e06694e83e548757cf886b603579ce384d95b3ab
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304292012.oxeNukfB-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/nvme/host/ioctl.c:812:5: warning: no previous prototype for function 'nvme_uring_cmd_iopoll_qid' [-Wmissing-prototypes]
   int nvme_uring_cmd_iopoll_qid(struct request_queue *q,
       ^
   drivers/nvme/host/ioctl.c:812:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int nvme_uring_cmd_iopoll_qid(struct request_queue *q,
   ^
   static 
>> drivers/nvme/host/ioctl.c:1037:28: warning: variable 'q' is uninitialized when used here [-Wuninitialized]
                           if (!percpu_ref_tryget(&q->q_usage_counter))
                                                   ^
   drivers/nvme/host/ioctl.c:1021:25: note: initialize the variable 'q' to silence this warning
           struct request_queue *q;
                                  ^
                                   = NULL
   2 warnings generated.


vim +/q +1037 drivers/nvme/host/ioctl.c

  1010	
  1011	int nvme_ns_head_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd,
  1012					      struct io_comp_batch *iob,
  1013					      unsigned int poll_flags)
  1014	{
  1015		struct cdev *cdev = file_inode(ioucmd->file)->i_cdev;
  1016		struct nvme_ns_head *head = container_of(cdev, struct nvme_ns_head, cdev);
  1017		int srcu_idx = srcu_read_lock(&head->srcu);
  1018		struct nvme_ns *ns = nvme_find_path(head);
  1019		struct bio *bio;
  1020		int ret = 0;
  1021		struct request_queue *q;
  1022	
  1023		if (ns) {
  1024			if (!(ioucmd->flags & IORING_URING_CMD_DIRECT)) {
  1025				rcu_read_lock();
  1026				bio = READ_ONCE(ioucmd->cookie);
  1027				q = ns->queue;
  1028				if (test_bit(QUEUE_FLAG_POLL, &q->queue_flags) && bio
  1029						&& bio->bi_bdev)
  1030					ret = bio_poll(bio, iob, poll_flags);
  1031				rcu_read_unlock();
  1032			} else {
  1033				int qid = io_uring_cmd_import_qid(ioucmd);
  1034	
  1035				if (qid <= 0)
  1036					return 0;
> 1037				if (!percpu_ref_tryget(&q->q_usage_counter))
  1038					return 0;
  1039				ret = nvme_uring_cmd_iopoll_qid(q, ioucmd, qid, iob,
  1040								poll_flags);
  1041				percpu_ref_put(&q->q_usage_counter);
  1042			}
  1043		}
  1044		srcu_read_unlock(&head->srcu, srcu_idx);
  1045		return ret;
  1046	}
  1047	#endif /* CONFIG_NVME_MULTIPATH */
  1048	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 06/12] pci: implement register/unregister functionality
  2023-04-29  9:39   ` [RFC PATCH 06/12] pci: implement register/unregister functionality Kanchan Joshi
  2023-04-29 12:13     ` kernel test robot
@ 2023-04-29 12:23     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:23 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: llvm, oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-7-joshi.k%40samsung.com
patch subject: [RFC PATCH 06/12] pci: implement register/unregister functionality
config: riscv-randconfig-r034-20230428 (https://download.01.org/0day-ci/archive/20230429/202304292047.JoicCc5d-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 437b7602e4a998220871de78afcb020b9c14a661)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv64-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/6b117fabad0d77551dee34c2b3335dde220e057a
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout 6b117fabad0d77551dee34c2b3335dde220e057a
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304292047.JoicCc5d-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/nvme/host/pci.c:1832:3: error: field designator 'register_queue' does not refer to any field in type 'const struct blk_mq_ops'
           .register_queue = nvme_pci_register_queue,
           ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/nvme/host/pci.c:1833:3: error: field designator 'unregister_queue' does not refer to any field in type 'const struct blk_mq_ops'
           .unregister_queue =  nvme_pci_unregister_queue,
           ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:3125:41: warning: shift count >= width of type [-Wshift-count-overflow]
                   dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
                                                         ^~~~~~~~~~~~~~~~
   include/linux/dma-mapping.h:76:54: note: expanded from macro 'DMA_BIT_MASK'
   #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
                                                        ^ ~~~
   1 warning and 2 errors generated.


vim +1832 drivers/nvme/host/pci.c

  1821	
  1822	static const struct blk_mq_ops nvme_mq_ops = {
  1823		.queue_rq	= nvme_queue_rq,
  1824		.queue_rqs	= nvme_queue_rqs,
  1825		.complete	= nvme_pci_complete_rq,
  1826		.commit_rqs	= nvme_commit_rqs,
  1827		.init_hctx	= nvme_init_hctx,
  1828		.init_request	= nvme_pci_init_request,
  1829		.map_queues	= nvme_pci_map_queues,
  1830		.timeout	= nvme_timeout,
  1831		.poll		= nvme_poll,
> 1832		.register_queue	= nvme_pci_register_queue,
> 1833		.unregister_queue =  nvme_pci_unregister_queue,
  1834	};
  1835	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue
  2023-04-29  9:39   ` [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue Kanchan Joshi
  2023-04-29 12:23     ` kernel test robot
@ 2023-04-29 12:44     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:44 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-11-joshi.k%40samsung.com
patch subject: [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue
config: i386-randconfig-a003 (https://download.01.org/0day-ci/archive/20230429/202304292027.TXDcGEe2-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/e06694e83e548757cf886b603579ce384d95b3ab
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout e06694e83e548757cf886b603579ce384d95b3ab
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304292027.TXDcGEe2-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   drivers/nvme/host/ioctl.c: In function 'nvme_uring_cmd_io_direct':
>> drivers/nvme/host/ioctl.c:668:35: error: 'const struct blk_mq_ops' has no member named 'queue_uring_cmd'
     668 |         if (q->mq_ops && q->mq_ops->queue_uring_cmd)
         |                                   ^~
   drivers/nvme/host/ioctl.c:669:33: error: 'const struct blk_mq_ops' has no member named 'queue_uring_cmd'
     669 |                 return q->mq_ops->queue_uring_cmd(ioucmd, qid);
         |                                 ^~
   drivers/nvme/host/ioctl.c: At top level:
>> drivers/nvme/host/ioctl.c:812:5: warning: no previous prototype for 'nvme_uring_cmd_iopoll_qid' [-Wmissing-prototypes]
     812 | int nvme_uring_cmd_iopoll_qid(struct request_queue *q,
         |     ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/ioctl.c: In function 'nvme_uring_cmd_iopoll_qid':
>> drivers/nvme/host/ioctl.c:820:37: error: 'const struct blk_mq_ops' has no member named 'poll_uring_cmd'
     820 |         if (!(q->mq_ops && q->mq_ops->poll_uring_cmd))
         |                                     ^~
   drivers/nvme/host/ioctl.c:823:32: error: 'const struct blk_mq_ops' has no member named 'poll_uring_cmd'
     823 |                 ret = q->mq_ops->poll_uring_cmd(ioucmd, qid, iob);
         |                                ^~


vim +668 drivers/nvme/host/ioctl.c

   655	
   656	static int nvme_uring_cmd_io_direct(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
   657			struct io_uring_cmd *ioucmd, unsigned int issue_flags)
   658	{
   659		struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
   660		int qid = io_uring_cmd_import_qid(ioucmd);
   661		struct nvme_uring_direct_pdu *pdu =
   662			(struct nvme_uring_direct_pdu *)&ioucmd->pdu;
   663	
   664		if ((issue_flags & IO_URING_F_IOPOLL) != IO_URING_F_IOPOLL)
   665			return -EOPNOTSUPP;
   666	
   667		pdu->ns = ns;
 > 668		if (q->mq_ops && q->mq_ops->queue_uring_cmd)
   669			return q->mq_ops->queue_uring_cmd(ioucmd, qid);
   670		return -EOPNOTSUPP;
   671	}
   672	
   673	static int nvme_ctrl_ioctl(struct nvme_ctrl *ctrl, unsigned int cmd,
   674			void __user *argp, fmode_t mode)
   675	{
   676		switch (cmd) {
   677		case NVME_IOCTL_ADMIN_CMD:
   678			return nvme_user_cmd(ctrl, NULL, argp, 0, mode);
   679		case NVME_IOCTL_ADMIN64_CMD:
   680			return nvme_user_cmd64(ctrl, NULL, argp, 0, mode);
   681		default:
   682			return sed_ioctl(ctrl->opal_dev, cmd, argp);
   683		}
   684	}
   685	
   686	#ifdef COMPAT_FOR_U64_ALIGNMENT
   687	struct nvme_user_io32 {
   688		__u8	opcode;
   689		__u8	flags;
   690		__u16	control;
   691		__u16	nblocks;
   692		__u16	rsvd;
   693		__u64	metadata;
   694		__u64	addr;
   695		__u64	slba;
   696		__u32	dsmgmt;
   697		__u32	reftag;
   698		__u16	apptag;
   699		__u16	appmask;
   700	} __attribute__((__packed__));
   701	#define NVME_IOCTL_SUBMIT_IO32	_IOW('N', 0x42, struct nvme_user_io32)
   702	#endif /* COMPAT_FOR_U64_ALIGNMENT */
   703	
   704	static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned int cmd,
   705			void __user *argp, unsigned int flags, fmode_t mode)
   706	{
   707		switch (cmd) {
   708		case NVME_IOCTL_ID:
   709			force_successful_syscall_return();
   710			return ns->head->ns_id;
   711		case NVME_IOCTL_IO_CMD:
   712			return nvme_user_cmd(ns->ctrl, ns, argp, flags, mode);
   713		/*
   714		 * struct nvme_user_io can have different padding on some 32-bit ABIs.
   715		 * Just accept the compat version as all fields that are used are the
   716		 * same size and at the same offset.
   717		 */
   718	#ifdef COMPAT_FOR_U64_ALIGNMENT
   719		case NVME_IOCTL_SUBMIT_IO32:
   720	#endif
   721		case NVME_IOCTL_SUBMIT_IO:
   722			return nvme_submit_io(ns, argp);
   723		case NVME_IOCTL_IO64_CMD_VEC:
   724			flags |= NVME_IOCTL_VEC;
   725			fallthrough;
   726		case NVME_IOCTL_IO64_CMD:
   727			return nvme_user_cmd64(ns->ctrl, ns, argp, flags, mode);
   728		default:
   729			return -ENOTTY;
   730		}
   731	}
   732	
   733	int nvme_ioctl(struct block_device *bdev, fmode_t mode,
   734			unsigned int cmd, unsigned long arg)
   735	{
   736		struct nvme_ns *ns = bdev->bd_disk->private_data;
   737		void __user *argp = (void __user *)arg;
   738		unsigned int flags = 0;
   739	
   740		if (bdev_is_partition(bdev))
   741			flags |= NVME_IOCTL_PARTITION;
   742	
   743		if (is_ctrl_ioctl(cmd))
   744			return nvme_ctrl_ioctl(ns->ctrl, cmd, argp, mode);
   745		return nvme_ns_ioctl(ns, cmd, argp, flags, mode);
   746	}
   747	
   748	long nvme_ns_chr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
   749	{
   750		struct nvme_ns *ns =
   751			container_of(file_inode(file)->i_cdev, struct nvme_ns, cdev);
   752		void __user *argp = (void __user *)arg;
   753	
   754		if (is_ctrl_ioctl(cmd))
   755			return nvme_ctrl_ioctl(ns->ctrl, cmd, argp, file->f_mode);
   756		return nvme_ns_ioctl(ns, cmd, argp, 0, file->f_mode);
   757	}
   758	
   759	static int nvme_uring_cmd_checks(unsigned int issue_flags)
   760	{
   761	
   762		/* NVMe passthrough requires big SQE/CQE support */
   763		if ((issue_flags & (IO_URING_F_SQE128|IO_URING_F_CQE32)) !=
   764		    (IO_URING_F_SQE128|IO_URING_F_CQE32))
   765			return -EOPNOTSUPP;
   766		return 0;
   767	}
   768	
   769	static int nvme_ns_uring_cmd(struct nvme_ns *ns, struct io_uring_cmd *ioucmd,
   770				     unsigned int issue_flags)
   771	{
   772		struct nvme_ctrl *ctrl = ns->ctrl;
   773		int ret;
   774	
   775		BUILD_BUG_ON(sizeof(struct nvme_uring_cmd_pdu) > sizeof(ioucmd->pdu));
   776	
   777		ret = nvme_uring_cmd_checks(issue_flags);
   778		if (ret)
   779			return ret;
   780	
   781		switch (ioucmd->cmd_op) {
   782		case NVME_URING_CMD_IO:
   783			if (ioucmd->flags & IORING_URING_CMD_DIRECT) {
   784				ret = nvme_uring_cmd_io_direct(ctrl, ns, ioucmd,
   785						issue_flags);
   786				if (ret == -EIOCBQUEUED)
   787					return ret;
   788				/* in case of any error, just fallback */
   789				ioucmd->flags &= ~(IORING_URING_CMD_DIRECT);
   790			}
   791			ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags, false);
   792			break;
   793		case NVME_URING_CMD_IO_VEC:
   794			ret = nvme_uring_cmd_io(ctrl, ns, ioucmd, issue_flags, true);
   795			break;
   796		default:
   797			ret = -ENOTTY;
   798		}
   799	
   800		return ret;
   801	}
   802	
   803	int nvme_ns_chr_uring_cmd(struct io_uring_cmd *ioucmd, unsigned int issue_flags)
   804	{
   805		struct nvme_ns *ns = container_of(file_inode(ioucmd->file)->i_cdev,
   806				struct nvme_ns, cdev);
   807	
   808		return nvme_ns_uring_cmd(ns, ioucmd, issue_flags);
   809	}
   810	
   811	/* similar to blk_mq_poll; may be possible to unify */
 > 812	int nvme_uring_cmd_iopoll_qid(struct request_queue *q,
   813					 struct io_uring_cmd *ioucmd, int qid,
   814					 struct io_comp_batch *iob,
   815					 unsigned int flags)
   816	{
   817		long state = get_current_state();
   818		int ret;
   819	
 > 820		if (!(q->mq_ops && q->mq_ops->poll_uring_cmd))
   821			return 0;
   822		do {
   823			ret = q->mq_ops->poll_uring_cmd(ioucmd, qid, iob);
   824			if (ret > 0) {
   825				__set_current_state(TASK_RUNNING);
   826				return ret;
   827			}
   828			if (signal_pending_state(state, current))
   829				__set_current_state(TASK_RUNNING);
   830			if (task_is_running(current))
   831				return 1;
   832	
   833			if (ret < 0 || (flags & BLK_POLL_ONESHOT))
   834				break;
   835			cpu_relax();
   836	
   837		} while (!need_resched());
   838	
   839		__set_current_state(TASK_RUNNING);
   840		return 0;
   841	}
   842	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 12/12] pci: implement submission/completion for rawq commands
  2023-04-29  9:39   ` [RFC PATCH 12/12] pci: implement submission/completion for rawq commands Kanchan Joshi
@ 2023-04-29 12:54     ` kernel test robot
  2023-05-03  5:36     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-04-29 12:54 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on axboe-block/for-next]
[also build test ERROR on linus/master next-20230428]
[cannot apply to v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-13-joshi.k%40samsung.com
patch subject: [RFC PATCH 12/12] pci: implement submission/completion for rawq commands
config: csky-randconfig-r013-20230429 (https://download.01.org/0day-ci/archive/20230429/202304292003.uwEZSENw-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/5e9295340fad52984048c267b4b4c422c9556c86
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout 5e9295340fad52984048c267b4b4c422c9556c86
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304292003.uwEZSENw-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/nvme/host/pci.c:2021:10: error: 'const struct blk_mq_ops' has no member named 'register_queue'
    2021 |         .register_queue = nvme_pci_register_queue,
         |          ^~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:2021:27: error: initialization of 'void (*)(struct request *)' from incompatible pointer type 'int (*)(void *)' [-Werror=incompatible-pointer-types]
    2021 |         .register_queue = nvme_pci_register_queue,
         |                           ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:2021:27: note: (near initialization for 'nvme_mq_ops.complete')
   drivers/nvme/host/pci.c:2021:27: warning: initialized field overwritten [-Woverride-init]
   drivers/nvme/host/pci.c:2021:27: note: (near initialization for 'nvme_mq_ops.complete')
   drivers/nvme/host/pci.c:2022:10: error: 'const struct blk_mq_ops' has no member named 'unregister_queue'
    2022 |         .unregister_queue =  nvme_pci_unregister_queue,
         |          ^~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:2022:30: error: initialization of 'int (*)(struct blk_mq_hw_ctx *, void *, unsigned int)' from incompatible pointer type 'int (*)(void *, int)' [-Werror=incompatible-pointer-types]
    2022 |         .unregister_queue =  nvme_pci_unregister_queue,
         |                              ^~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:2022:30: note: (near initialization for 'nvme_mq_ops.init_hctx')
   drivers/nvme/host/pci.c:2022:30: warning: initialized field overwritten [-Woverride-init]
   drivers/nvme/host/pci.c:2022:30: note: (near initialization for 'nvme_mq_ops.init_hctx')
>> drivers/nvme/host/pci.c:2023:10: error: 'const struct blk_mq_ops' has no member named 'queue_uring_cmd'
    2023 |         .queue_uring_cmd = nvme_pci_queue_ucmd,
         |          ^~~~~~~~~~~~~~~
>> drivers/nvme/host/pci.c:2023:28: error: initialization of 'void (*)(struct blk_mq_hw_ctx *, unsigned int)' from incompatible pointer type 'int (*)(struct io_uring_cmd *, int)' [-Werror=incompatible-pointer-types]
    2023 |         .queue_uring_cmd = nvme_pci_queue_ucmd,
         |                            ^~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:2023:28: note: (near initialization for 'nvme_mq_ops.exit_hctx')
>> drivers/nvme/host/pci.c:2024:10: error: 'const struct blk_mq_ops' has no member named 'poll_uring_cmd'
    2024 |         .poll_uring_cmd = nvme_poll_uring_cmd,
         |          ^~~~~~~~~~~~~~
>> drivers/nvme/host/pci.c:2024:27: error: initialization of 'int (*)(struct blk_mq_tag_set *, struct request *, unsigned int,  unsigned int)' from incompatible pointer type 'int (*)(struct io_uring_cmd *, int,  struct io_comp_batch *)' [-Werror=incompatible-pointer-types]
    2024 |         .poll_uring_cmd = nvme_poll_uring_cmd,
         |                           ^~~~~~~~~~~~~~~~~~~
   drivers/nvme/host/pci.c:2024:27: note: (near initialization for 'nvme_mq_ops.init_request')
   drivers/nvme/host/pci.c:2024:27: warning: initialized field overwritten [-Woverride-init]
   drivers/nvme/host/pci.c:2024:27: note: (near initialization for 'nvme_mq_ops.init_request')
   cc1: some warnings being treated as errors


vim +2023 drivers/nvme/host/pci.c

  2010	
  2011	static const struct blk_mq_ops nvme_mq_ops = {
  2012		.queue_rq	= nvme_queue_rq,
  2013		.queue_rqs	= nvme_queue_rqs,
  2014		.complete	= nvme_pci_complete_rq,
  2015		.commit_rqs	= nvme_commit_rqs,
  2016		.init_hctx	= nvme_init_hctx,
  2017		.init_request	= nvme_pci_init_request,
  2018		.map_queues	= nvme_pci_map_queues,
  2019		.timeout	= nvme_timeout,
  2020		.poll		= nvme_poll,
  2021		.register_queue	= nvme_pci_register_queue,
> 2022		.unregister_queue =  nvme_pci_unregister_queue,
> 2023		.queue_uring_cmd = nvme_pci_queue_ucmd,
> 2024		.poll_uring_cmd	= nvme_poll_uring_cmd,
  2025	};
  2026	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 00/12] io_uring attached nvme queue
  2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
                     ` (11 preceding siblings ...)
  2023-04-29  9:39   ` [RFC PATCH 12/12] pci: implement submission/completion for rawq commands Kanchan Joshi
@ 2023-04-29 17:17   ` Jens Axboe
  2023-05-01 11:36     ` Kanchan Joshi
  12 siblings, 1 reply; 25+ messages in thread
From: Jens Axboe @ 2023-04-29 17:17 UTC (permalink / raw)
  To: Kanchan Joshi, hch, sagi, kbusch
  Cc: io-uring, linux-nvme, linux-block, gost.dev, anuj1072538,
	xiaoguang.wang

On 4/29/23 3:39?AM, Kanchan Joshi wrote:
> This series shows one way to do what the title says.
> This puts up a more direct/lean path that enables
>  - submission from io_uring SQE to NVMe SQE
>  - completion from NVMe CQE to io_uring CQE
> Essentially cutting the hoops (involving request/bio) for nvme io path.
> 
> Also, io_uring ring is not to be shared among application threads.
> Application is responsible for building the sharing (if it feels the
> need). This means ring-associated exclusive queue can do away with some
> synchronization costs that occur for shared queue.
> 
> Primary objective is to amp up of efficiency of kernel io path further
> (towards PCIe gen N, N+1 hardware).
> And we are seeing some asks too [1].
> 
> Building-blocks
> ===============
> At high level, series can be divided into following parts -
> 
> 1. nvme driver starts exposing some queue-pairs (SQ+CQ) that can
> be attached to other in-kernel user (not just to block-layer, which is
> the case at the moment) on demand.
> 
> Example:
> insmod nvme.ko poll_queus=1 raw_queues=2
> 
> nvme0: 24/0/1/2 default/read/poll queues/raw queues
> 
> While driver registers other queues with block-layer, raw-queues are
> rather reserved for exclusive attachment with other in-kernel users.
> At this point, each raw-queue is interrupt-disabled (similar to
> poll_queues). Maybe we need a better name for these (e.g. app/user queues).
> [Refer: patch 2]
> 
> 2. register/unregister queue interface
> (a) one for io_uring application to ask for device-queue and register
> with the ring. [Refer: patch 4]
> (b) another at nvme so that other in-kernel users (io_uring for now) can
> ask for a raw-queue. [Refer: patch 3, 5, 6]
> 
> The latter returns a qid, that io_uring stores internally (not exposed
> to user-space) in the ring ctx. At max one queue per ring is enabled.
> Ring has no other special properties except the fact that it stores a
> qid that it can use exclusively. So application can very well use the
> ring to do other things than nvme io.
> 
> 3. user-interface to send commands down this way
> (a) uring-cmd is extended to support a new flag "IORING_URING_CMD_DIRECT"
> that application passes in the SQE. That is all.
> (b) the flag goes down to provider of ->uring_cmd which may choose to do
>   things differently based on it (or ignore it).
> [Refer: patch 7]
> 
> 4. nvme uring-cmd understands the above flag. It submits the command
> into the known pre-registered queue, and completes (polled-completion)
> from it. Transformation from "struct io_uring_cmd" to "nvme command" is
> done directly without building other intermediate constructs.
> [Refer: patch 8, 10, 12]
> 
> Testing and Performance
> =======================
> fio and t/io_uring is modified to exercise this path.
> - fio: new "registerqueues" option
> - t/io_uring: new "k" option
> 
> Good part:
> 2.96M -> 5.02M
> 
> nvme io (without this):
> # t/io_uring -b512 -d64 -c2 -s2 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k0 /dev/ng0n1
> submitter=0, tid=2922, file=/dev/ng0n1, node=-1
> polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=0 QD=64
> Engine=io_uring, sq_ring=64, cq_ring=64
> IOPS=2.89M, BW=1412MiB/s, IOS/call=2/1
> IOPS=2.92M, BW=1426MiB/s, IOS/call=2/2
> IOPS=2.96M, BW=1444MiB/s, IOS/call=2/1
> Exiting on timeout
> Maximum IOPS=2.96M
> 
> nvme io (with this):
> # t/io_uring -b512 -d64 -c2 -s2 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
> submitter=0, tid=2927, file=/dev/ng0n1, node=-1
> polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
> Engine=io_uring, sq_ring=64, cq_ring=64
> IOPS=4.99M, BW=2.43GiB/s, IOS/call=2/1
> IOPS=5.02M, BW=2.45GiB/s, IOS/call=2/1
> IOPS=5.02M, BW=2.45GiB/s, IOS/call=2/1
> Exiting on timeout
> Maximum IOPS=5.02M
> 
> Not so good part:
> While single IO is fast this way, we do not have batching abilities for
> multi-io scenario. Plugging, submission and completion batching are tied to
> block-layer constructs. Things should look better if we could do something
> about that.
> Particularly something is off with the completion-batching.
> 
> With -s32 and -c32, the numbers decline:
> 
> # t/io_uring -b512 -d64 -c32 -s32 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
> submitter=0, tid=3674, file=/dev/ng0n1, node=-1
> polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
> Engine=io_uring, sq_ring=64, cq_ring=64
> IOPS=3.70M, BW=1806MiB/s, IOS/call=32/31
> IOPS=3.71M, BW=1812MiB/s, IOS/call=32/31
> IOPS=3.71M, BW=1812MiB/s, IOS/call=32/32
> Exiting on timeout
> Maximum IOPS=3.71M
> 
> And perf gets restored if we go back to -c2
> 
> # t/io_uring -b512 -d64 -c2 -s32 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
> submitter=0, tid=3677, file=/dev/ng0n1, node=-1
> polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
> Engine=io_uring, sq_ring=64, cq_ring=64
> IOPS=4.99M, BW=2.44GiB/s, IOS/call=5/5
> IOPS=5.02M, BW=2.45GiB/s, IOS/call=5/5
> IOPS=5.02M, BW=2.45GiB/s, IOS/call=5/5
> Exiting on timeout
> Maximum IOPS=5.02M
> 
> Source
> ======
> Kernel: https://github.com/OpenMPDK/linux/tree/feat/directq-v1
> fio: https://github.com/OpenMPDK/fio/commits/feat/rawq-v2
> 
> Please take a look.

This looks like a great starting point! Unfortunately I won't be at
LSFMM this year to discuss it in person, but I'll be taking a closer
look at this. Some quick initial reactions:

- I'd call them "user" queues rather than raw or whatever, I think that
  more accurately describes what they are for.

- I guess there's no way around needing to pre-allocate these user
  queues, just like we do for polled_queues right now? In terms of user
  API, it'd be nicer if you could just do IORING_REGISTER_QUEUE (insert
  right name here...) and it'd allocate and return you an ID.

- Need to take a look at the uring_cmd stuff again, but would be nice if
  we did not have to add more stuff to fops for this. Maybe we can set
  aside a range of "ioctl" type commands through uring_cmd for this
  instead, and go that way for registering/unregistering queues.

We do have some users that are CPU constrained, and while my testing
easily maxes out a gen2 optane (actually 2 or 3) with the generic IO
path, that's also with all the fat that adds overhead removed. Most
people don't have this luxury, necessarily, or actually need some of
this fat for their monitoring, for example. This would provide a nice
way to have pretty consistent and efficient performance across distro
type configs, which would be great, while still retaining the fattier
bits for "normal" IO.
 

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 00/12] io_uring attached nvme queue
  2023-04-29 17:17   ` [RFC PATCH 00/12] io_uring attached nvme queue Jens Axboe
@ 2023-05-01 11:36     ` Kanchan Joshi
  0 siblings, 0 replies; 25+ messages in thread
From: Kanchan Joshi @ 2023-05-01 11:36 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Kanchan Joshi, hch, sagi, kbusch, io-uring, linux-nvme,
	linux-block, gost.dev, anuj1072538, xiaoguang.wang

On Sat, Apr 29, 2023 at 10:55 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 4/29/23 3:39?AM, Kanchan Joshi wrote:
> > This series shows one way to do what the title says.
> > This puts up a more direct/lean path that enables
> >  - submission from io_uring SQE to NVMe SQE
> >  - completion from NVMe CQE to io_uring CQE
> > Essentially cutting the hoops (involving request/bio) for nvme io path.
> >
> > Also, io_uring ring is not to be shared among application threads.
> > Application is responsible for building the sharing (if it feels the
> > need). This means ring-associated exclusive queue can do away with some
> > synchronization costs that occur for shared queue.
> >
> > Primary objective is to amp up of efficiency of kernel io path further
> > (towards PCIe gen N, N+1 hardware).
> > And we are seeing some asks too [1].
> >
> > Building-blocks
> > ===============
> > At high level, series can be divided into following parts -
> >
> > 1. nvme driver starts exposing some queue-pairs (SQ+CQ) that can
> > be attached to other in-kernel user (not just to block-layer, which is
> > the case at the moment) on demand.
> >
> > Example:
> > insmod nvme.ko poll_queus=1 raw_queues=2
> >
> > nvme0: 24/0/1/2 default/read/poll queues/raw queues
> >
> > While driver registers other queues with block-layer, raw-queues are
> > rather reserved for exclusive attachment with other in-kernel users.
> > At this point, each raw-queue is interrupt-disabled (similar to
> > poll_queues). Maybe we need a better name for these (e.g. app/user queues).
> > [Refer: patch 2]
> >
> > 2. register/unregister queue interface
> > (a) one for io_uring application to ask for device-queue and register
> > with the ring. [Refer: patch 4]
> > (b) another at nvme so that other in-kernel users (io_uring for now) can
> > ask for a raw-queue. [Refer: patch 3, 5, 6]
> >
> > The latter returns a qid, that io_uring stores internally (not exposed
> > to user-space) in the ring ctx. At max one queue per ring is enabled.
> > Ring has no other special properties except the fact that it stores a
> > qid that it can use exclusively. So application can very well use the
> > ring to do other things than nvme io.
> >
> > 3. user-interface to send commands down this way
> > (a) uring-cmd is extended to support a new flag "IORING_URING_CMD_DIRECT"
> > that application passes in the SQE. That is all.
> > (b) the flag goes down to provider of ->uring_cmd which may choose to do
> >   things differently based on it (or ignore it).
> > [Refer: patch 7]
> >
> > 4. nvme uring-cmd understands the above flag. It submits the command
> > into the known pre-registered queue, and completes (polled-completion)
> > from it. Transformation from "struct io_uring_cmd" to "nvme command" is
> > done directly without building other intermediate constructs.
> > [Refer: patch 8, 10, 12]
> >
> > Testing and Performance
> > =======================
> > fio and t/io_uring is modified to exercise this path.
> > - fio: new "registerqueues" option
> > - t/io_uring: new "k" option
> >
> > Good part:
> > 2.96M -> 5.02M
> >
> > nvme io (without this):
> > # t/io_uring -b512 -d64 -c2 -s2 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k0 /dev/ng0n1
> > submitter=0, tid=2922, file=/dev/ng0n1, node=-1
> > polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=0 QD=64
> > Engine=io_uring, sq_ring=64, cq_ring=64
> > IOPS=2.89M, BW=1412MiB/s, IOS/call=2/1
> > IOPS=2.92M, BW=1426MiB/s, IOS/call=2/2
> > IOPS=2.96M, BW=1444MiB/s, IOS/call=2/1
> > Exiting on timeout
> > Maximum IOPS=2.96M
> >
> > nvme io (with this):
> > # t/io_uring -b512 -d64 -c2 -s2 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
> > submitter=0, tid=2927, file=/dev/ng0n1, node=-1
> > polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
> > Engine=io_uring, sq_ring=64, cq_ring=64
> > IOPS=4.99M, BW=2.43GiB/s, IOS/call=2/1
> > IOPS=5.02M, BW=2.45GiB/s, IOS/call=2/1
> > IOPS=5.02M, BW=2.45GiB/s, IOS/call=2/1
> > Exiting on timeout
> > Maximum IOPS=5.02M
> >
> > Not so good part:
> > While single IO is fast this way, we do not have batching abilities for
> > multi-io scenario. Plugging, submission and completion batching are tied to
> > block-layer constructs. Things should look better if we could do something
> > about that.
> > Particularly something is off with the completion-batching.
> >
> > With -s32 and -c32, the numbers decline:
> >
> > # t/io_uring -b512 -d64 -c32 -s32 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
> > submitter=0, tid=3674, file=/dev/ng0n1, node=-1
> > polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
> > Engine=io_uring, sq_ring=64, cq_ring=64
> > IOPS=3.70M, BW=1806MiB/s, IOS/call=32/31
> > IOPS=3.71M, BW=1812MiB/s, IOS/call=32/31
> > IOPS=3.71M, BW=1812MiB/s, IOS/call=32/32
> > Exiting on timeout
> > Maximum IOPS=3.71M
> >
> > And perf gets restored if we go back to -c2
> >
> > # t/io_uring -b512 -d64 -c2 -s32 -p1 -F1 -B1 -O0 -n1 -u1 -r4 -k1 /dev/ng0n1
> > submitter=0, tid=3677, file=/dev/ng0n1, node=-1
> > polled=1, fixedbufs=1/0, register_files=1, buffered=1, register_queues=1 QD=64
> > Engine=io_uring, sq_ring=64, cq_ring=64
> > IOPS=4.99M, BW=2.44GiB/s, IOS/call=5/5
> > IOPS=5.02M, BW=2.45GiB/s, IOS/call=5/5
> > IOPS=5.02M, BW=2.45GiB/s, IOS/call=5/5
> > Exiting on timeout
> > Maximum IOPS=5.02M
> >
> > Source
> > ======
> > Kernel: https://github.com/OpenMPDK/linux/tree/feat/directq-v1
> > fio: https://github.com/OpenMPDK/fio/commits/feat/rawq-v2
> >
> > Please take a look.
>
> This looks like a great starting point! Unfortunately I won't be at
> LSFMM this year to discuss it in person, but I'll be taking a closer
> look at this.

That will help, thanks.

> Some quick initial reactions:
>
> - I'd call them "user" queues rather than raw or whatever, I think that
>   more accurately describes what they are for.

Right, that is better.

> - I guess there's no way around needing to pre-allocate these user
>   queues, just like we do for polled_queues right now?

Right, we would need to allocate nvme sq/cq in the outset.
Changing the count at run-time is a bit murky. I will have another look though.

>In terms of user
>   API, it'd be nicer if you could just do IORING_REGISTER_QUEUE (insert
>   right name here...) and it'd allocate and return you an ID.

But this is the implemented API (new register code in io_uring) in the
patchset at the moment.
So it seems I am missing your point?

> - Need to take a look at the uring_cmd stuff again, but would be nice if
>   we did not have to add more stuff to fops for this. Maybe we can set
>   aside a range of "ioctl" type commands through uring_cmd for this
>   instead, and go that way for registering/unregistering queues.

Yes, I see your point in not having to add new fops.
But, a new uring_cmd opcode is only at the nvme-level.
It is a good way to allocate/deallocate a nvme queue, but it cannot
attach that with the io_uring's ring.
Or do you have a different view? Seems this is connected to the previous point.

> We do have some users that are CPU constrained, and while my testing
> easily maxes out a gen2 optane (actually 2 or 3) with the generic IO
> path, that's also with all the fat that adds overhead removed. Most
> people don't have this luxury, necessarily, or actually need some of
> this fat for their monitoring, for example. This would provide a nice
> way to have pretty consistent and efficient performance across distro
> type configs, which would be great, while still retaining the fattier
> bits for "normal" IO.
Makes total sense.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 12/12] pci: implement submission/completion for rawq commands
  2023-04-29  9:39   ` [RFC PATCH 12/12] pci: implement submission/completion for rawq commands Kanchan Joshi
  2023-04-29 12:54     ` kernel test robot
@ 2023-05-03  5:36     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2023-05-03  5:36 UTC (permalink / raw)
  To: Kanchan Joshi; +Cc: oe-kbuild-all

Hi Kanchan,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on linus/master next-20230428]
[cannot apply to hch-configfs/for-next v6.3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230429093925.133327-13-joshi.k%40samsung.com
patch subject: [RFC PATCH 12/12] pci: implement submission/completion for rawq commands
config: arm-randconfig-s052-20230430 (https://download.01.org/0day-ci/archive/20230503/202305031330.zWfaf5UM-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.4-39-gce1a6720-dirty
        # https://github.com/intel-lab-lkp/linux/commit/5e9295340fad52984048c267b4b4c422c9556c86
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kanchan-Joshi/pci-enable-raw_queues-N-module-parameter/20230429-183933
        git checkout 5e9295340fad52984048c267b4b4c422c9556c86
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm SHELL=/bin/bash drivers/nvme/host/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202305031330.zWfaf5UM-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> drivers/nvme/host/pci.c:1932:45: sparse: sparse: incorrect type in return expression (different base types) @@     expected int @@     got restricted blk_status_t @@
   drivers/nvme/host/pci.c:1932:45: sparse:     expected int
   drivers/nvme/host/pci.c:1932:45: sparse:     got restricted blk_status_t
>> drivers/nvme/host/pci.c:1061:33: sparse: sparse: incorrect type in initializer (different base types) @@     expected unsigned long long [usertype] result @@     got restricted __le64 [usertype] u64 @@
   drivers/nvme/host/pci.c:1061:33: sparse:     expected unsigned long long [usertype] result
   drivers/nvme/host/pci.c:1061:33: sparse:     got restricted __le64 [usertype] u64
>> drivers/nvme/host/pci.c:1061:33: sparse: sparse: incorrect type in initializer (different base types) @@     expected unsigned long long [usertype] result @@     got restricted __le64 [usertype] u64 @@
   drivers/nvme/host/pci.c:1061:33: sparse:     expected unsigned long long [usertype] result
   drivers/nvme/host/pci.c:1061:33: sparse:     got restricted __le64 [usertype] u64
>> drivers/nvme/host/pci.c:1061:33: sparse: sparse: incorrect type in initializer (different base types) @@     expected unsigned long long [usertype] result @@     got restricted __le64 [usertype] u64 @@
   drivers/nvme/host/pci.c:1061:33: sparse:     expected unsigned long long [usertype] result
   drivers/nvme/host/pci.c:1061:33: sparse:     got restricted __le64 [usertype] u64
>> drivers/nvme/host/pci.c:1061:33: sparse: sparse: incorrect type in initializer (different base types) @@     expected unsigned long long [usertype] result @@     got restricted __le64 [usertype] u64 @@
   drivers/nvme/host/pci.c:1061:33: sparse:     expected unsigned long long [usertype] result
   drivers/nvme/host/pci.c:1061:33: sparse:     got restricted __le64 [usertype] u64
>> drivers/nvme/host/pci.c:1061:33: sparse: sparse: incorrect type in initializer (different base types) @@     expected unsigned long long [usertype] result @@     got restricted __le64 [usertype] u64 @@
   drivers/nvme/host/pci.c:1061:33: sparse:     expected unsigned long long [usertype] result
   drivers/nvme/host/pci.c:1061:33: sparse:     got restricted __le64 [usertype] u64

vim +1932 drivers/nvme/host/pci.c

  1901	
  1902	static int nvme_map_io_fb(struct request_queue *q, struct io_uring_cmd *ioucmd,
  1903				struct nvme_dev *dev, struct nvme_iod *iod,
  1904				unsigned long addr, unsigned int buf_len)
  1905	{
  1906		struct nvme_command *cmnd = &iod->cmd;
  1907		int ret, rw = nvme_is_write(cmnd);
  1908		struct iov_iter iter;
  1909		size_t nr_iter, nr_segs;
  1910		struct bio_vec *bv;
  1911	
  1912		if (!(ioucmd->flags & IORING_URING_CMD_FIXED))
  1913			return -EINVAL;
  1914	
  1915		ret = io_uring_cmd_import_fixed(addr, buf_len, rw, &iter, ioucmd);
  1916		if (ret < 0)
  1917			return ret;
  1918		nr_iter = iov_iter_count(&iter);
  1919		nr_segs = iter.nr_segs;
  1920	
  1921		/* device will do these checks anyway, so why do duplicate work? */
  1922		if (!nr_iter || (nr_iter >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
  1923			goto err;
  1924		if (nr_segs > queue_max_segments(q))
  1925			goto err;
  1926		/* this will be attempted from regular path instead */
  1927		if (nr_segs > 1)
  1928			goto err;
  1929	
  1930		bv = (struct bio_vec *)&iter.bvec[0];
  1931		if (bv->bv_offset + bv->bv_len <= NVME_CTRL_PAGE_SIZE * 2)
> 1932			return nvme_setup_prp_simple(dev, iod,
  1933					(struct nvme_rw_command *)cmnd,
  1934					bv, rw ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
  1935	err:
  1936		return -EINVAL;
  1937	}
  1938	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-05-03  5:36 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20230429094228epcas5p4a80d8ed77433989fa804ecf449f83b0b@epcas5p4.samsung.com>
2023-04-29  9:39 ` [RFC PATCH 00/12] io_uring attached nvme queue Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 01/12] nvme: refactor nvme_alloc_io_tag_set Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 02/12] pci: enable "raw_queues = N" module parameter Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 03/12] fs, block: interface to register/unregister the raw-queue Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 04/12] io_uring, fs: plumb support to register/unregister raw-queue Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 05/12] nvme: wire-up register/unregister queue f_op callback Kanchan Joshi
2023-04-29 12:02     ` kernel test robot
2023-04-29 12:02     ` kernel test robot
2023-04-29 12:02     ` kernel test robot
2023-04-29 12:13     ` kernel test robot
2023-04-29  9:39   ` [RFC PATCH 06/12] pci: implement register/unregister functionality Kanchan Joshi
2023-04-29 12:13     ` kernel test robot
2023-04-29 12:23     ` kernel test robot
2023-04-29  9:39   ` [RFC PATCH 07/12] io_uring: support for using registered queue in uring-cmd Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 08/12] block: add mq_ops to submit and complete commands from raw-queue Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 09/12] nvme: carve out a helper to prepare nvme_command from ioucmd->cmd Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 10/12] nvme: submisssion/completion of uring_cmd to/from the registered queue Kanchan Joshi
2023-04-29 12:23     ` kernel test robot
2023-04-29 12:44     ` kernel test robot
2023-04-29  9:39   ` [RFC PATCH 11/12] pci: modify nvme_setup_prp_simple parameters Kanchan Joshi
2023-04-29  9:39   ` [RFC PATCH 12/12] pci: implement submission/completion for rawq commands Kanchan Joshi
2023-04-29 12:54     ` kernel test robot
2023-05-03  5:36     ` kernel test robot
2023-04-29 17:17   ` [RFC PATCH 00/12] io_uring attached nvme queue Jens Axboe
2023-05-01 11:36     ` Kanchan Joshi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.