* [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough
@ 2022-09-27 1:44 Jens Axboe
2022-09-27 1:44 ` [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() Jens Axboe
` (6 more replies)
0 siblings, 7 replies; 25+ messages in thread
From: Jens Axboe @ 2022-09-27 1:44 UTC (permalink / raw)
To: linux-block; +Cc: linux-scsi, linux-nvme
Hi,
The passthrough IO path currently doesn't do any request allocation
batching like we do for normal IO. Wire this up through the usual
blk_mq_alloc_request() allocation helper.
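As a rough illustration (not part of the patches themselves): a submitter that runs
inside a sized plug now has its passthrough requests served from the per-plug cache.
The sketch below assumes the caller manages its own plug (in practice io_uring already
wraps submission in one), that q is the target request_queue, and uses REQ_OP_DRV_IN
and a batch of 32 purely as examples:

    struct blk_plug plug;
    struct request *rq;
    int i;

    /* Hint that we expect to allocate a batch of requests. */
    blk_start_plug_nr_ios(&plug, 32);

    for (i = 0; i < 32; i++) {
        /* After the first bulk fill, these come from plug->cached_rq. */
        rq = blk_mq_alloc_request(q, REQ_OP_DRV_IN, 0);
        if (IS_ERR(rq))
            break;
        /* ... set up the passthrough command in the request pdu ... */
        blk_execute_rq_nowait(rq, false);
    }

    blk_finish_plug(&plug);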
Similarly, we don't currently support batched completions for
passthrough IO. Allow the request->end_io() handler to return back
whether or not it retains ownership of the request. By default all
handlers are converted to returning RQ_END_IO_NONE, which retains
the existing behavior. But with that in place, we can tweak the
nvme uring_cmd end_io handler to pass back ownership, and hence enable
completion batching for passthrough requests as well.
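To make the new contract concrete, here is a minimal sketch of a converted handler; it
is hypothetical driver code, not taken from this series, and my_drv_end_io plus the
completion stashed in ->end_io_data are made up for illustration. RQ_END_IO_NONE keeps
today's behavior, while RQ_END_IO_FREE hands the request back to the block layer so it
can be freed there, and hence batched:

    static enum rq_end_io_ret my_drv_end_io(struct request *req, blk_status_t err)
    {
        struct completion *done = req->end_io_data;   /* hypothetical per-request context */

        complete(done);

        /*
         * RQ_END_IO_NONE: the handler retains ownership and remains
         * responsible for freeing (or reusing) the request, as before.
         * RQ_END_IO_FREE: the block layer frees the request, which lets
         * blk_mq_end_request_batch() free passthrough requests in batches.
         */
        return RQ_END_IO_FREE;
    }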
This is good for a 10% improvement in passthrough performance. For
a non-drive limited test case, passthrough IO is now more efficient
than the regular bdev O_DIRECT path.
Changes since v1:
- Remove spurious semicolon
- Cleanup struct nvme_uring_cmd_pdu handling
--
Jens Axboe
^ permalink raw reply [flat|nested] 25+ messages in thread* [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe @ 2022-09-27 1:44 ` Jens Axboe 2022-09-28 13:38 ` Anuj gupta 2022-09-27 1:44 ` [PATCH 2/5] block: change request end_io handler to pass back a return value Jens Axboe ` (5 subsequent siblings) 6 siblings, 1 reply; 25+ messages in thread From: Jens Axboe @ 2022-09-27 1:44 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme, Jens Axboe The filesystem IO path can take advantage of allocating batches of requests, if the underlying submitter tells the block layer about it through the blk_plug. For passthrough IO, the exported API is the blk_mq_alloc_request() helper, and that one does not allow for request caching. Wire up request caching for blk_mq_alloc_request(), which is generally done without having a bio available upfront. Signed-off-by: Jens Axboe <axboe@kernel.dk> --- block/blk-mq.c | 80 ++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 71 insertions(+), 9 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index c11949d66163..d3a9f8b9c7ee 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -510,25 +510,87 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data) alloc_time_ns); } -struct request *blk_mq_alloc_request(struct request_queue *q, blk_opf_t opf, - blk_mq_req_flags_t flags) +static struct request *blk_mq_rq_cache_fill(struct request_queue *q, + struct blk_plug *plug, + blk_opf_t opf, + blk_mq_req_flags_t flags) { struct blk_mq_alloc_data data = { .q = q, .flags = flags, .cmd_flags = opf, - .nr_tags = 1, + .nr_tags = plug->nr_ios, + .cached_rq = &plug->cached_rq, }; struct request *rq; - int ret; - ret = blk_queue_enter(q, flags); - if (ret) - return ERR_PTR(ret); + if (blk_queue_enter(q, flags)) + return NULL; + + plug->nr_ios = 1; rq = __blk_mq_alloc_requests(&data); - if (!rq) - goto out_queue_exit; + if (unlikely(!rq)) + blk_queue_exit(q); + return rq; +} + +static struct request *blk_mq_alloc_cached_request(struct request_queue *q, + blk_opf_t opf, + blk_mq_req_flags_t flags) +{ + struct blk_plug *plug = current->plug; + struct request *rq; + + if (!plug) + return NULL; + if (rq_list_empty(plug->cached_rq)) { + if (plug->nr_ios == 1) + return NULL; + rq = blk_mq_rq_cache_fill(q, plug, opf, flags); + if (rq) + goto got_it; + return NULL; + } + rq = rq_list_peek(&plug->cached_rq); + if (!rq || rq->q != q) + return NULL; + + if (blk_mq_get_hctx_type(opf) != rq->mq_hctx->type) + return NULL; + if (op_is_flush(rq->cmd_flags) != op_is_flush(opf)) + return NULL; + + plug->cached_rq = rq_list_next(rq); +got_it: + rq->cmd_flags = opf; + INIT_LIST_HEAD(&rq->queuelist); + return rq; +} + +struct request *blk_mq_alloc_request(struct request_queue *q, blk_opf_t opf, + blk_mq_req_flags_t flags) +{ + struct request *rq; + + rq = blk_mq_alloc_cached_request(q, opf, flags); + if (!rq) { + struct blk_mq_alloc_data data = { + .q = q, + .flags = flags, + .cmd_flags = opf, + .nr_tags = 1, + }; + int ret; + + ret = blk_queue_enter(q, flags); + if (ret) + return ERR_PTR(ret); + + rq = __blk_mq_alloc_requests(&data); + if (!rq) + goto out_queue_exit; + } rq->__data_len = 0; rq->__sector = (sector_t) -1; rq->bio = rq->biotail = NULL; -- 2.35.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() 2022-09-27 1:44 ` [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() Jens Axboe @ 2022-09-28 13:38 ` Anuj gupta 0 siblings, 0 replies; 25+ messages in thread From: Anuj gupta @ 2022-09-28 13:38 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme On Tue, Sep 27, 2022 at 7:19 AM Jens Axboe <axboe@kernel.dk> wrote: > > The filesystem IO path can take advantage of allocating batches of > requests, if the underlying submitter tells the block layer about it > through the blk_plug. For passthrough IO, the exported API is the > blk_mq_alloc_request() helper, and that one does not allow for > request caching. > > Wire up request caching for blk_mq_alloc_request(), which is generally > done without having a bio available upfront. > > Signed-off-by: Jens Axboe <axboe@kernel.dk> > --- > block/blk-mq.c | 80 ++++++++++++++++++++++++++++++++++++++++++++------ > 1 file changed, 71 insertions(+), 9 deletions(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index c11949d66163..d3a9f8b9c7ee 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -510,25 +510,87 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data) > alloc_time_ns); > } > > -struct request *blk_mq_alloc_request(struct request_queue *q, blk_opf_t opf, > - blk_mq_req_flags_t flags) > +static struct request *blk_mq_rq_cache_fill(struct request_queue *q, > + struct blk_plug *plug, > + blk_opf_t opf, > + blk_mq_req_flags_t flags) > { > struct blk_mq_alloc_data data = { > .q = q, > .flags = flags, > .cmd_flags = opf, > - .nr_tags = 1, > + .nr_tags = plug->nr_ios, > + .cached_rq = &plug->cached_rq, > }; > struct request *rq; > - int ret; > > - ret = blk_queue_enter(q, flags); > - if (ret) > - return ERR_PTR(ret); > + if (blk_queue_enter(q, flags)) > + return NULL; > + > + plug->nr_ios = 1; > > rq = __blk_mq_alloc_requests(&data); > - if (!rq) > - goto out_queue_exit; > + if (unlikely(!rq)) > + blk_queue_exit(q); > + return rq; > +} > + > +static struct request *blk_mq_alloc_cached_request(struct request_queue *q, > + blk_opf_t opf, > + blk_mq_req_flags_t flags) > +{ > + struct blk_plug *plug = current->plug; > + struct request *rq; > + > + if (!plug) > + return NULL; > + if (rq_list_empty(plug->cached_rq)) { > + if (plug->nr_ios == 1) > + return NULL; > + rq = blk_mq_rq_cache_fill(q, plug, opf, flags); > + if (rq) > + goto got_it; > + return NULL; > + } > + rq = rq_list_peek(&plug->cached_rq); > + if (!rq || rq->q != q) > + return NULL; > + > + if (blk_mq_get_hctx_type(opf) != rq->mq_hctx->type) > + return NULL; > + if (op_is_flush(rq->cmd_flags) != op_is_flush(opf)) > + return NULL; > + > + plug->cached_rq = rq_list_next(rq); > +got_it: > + rq->cmd_flags = opf; > + INIT_LIST_HEAD(&rq->queuelist); > + return rq; > +} > + > +struct request *blk_mq_alloc_request(struct request_queue *q, blk_opf_t opf, > + blk_mq_req_flags_t flags) > +{ > + struct request *rq; > + > + rq = blk_mq_alloc_cached_request(q, opf, flags); > + if (!rq) { > + struct blk_mq_alloc_data data = { > + .q = q, > + .flags = flags, > + .cmd_flags = opf, > + .nr_tags = 1, > + }; > + int ret; > + > + ret = blk_queue_enter(q, flags); > + if (ret) > + return ERR_PTR(ret); > + > + rq = __blk_mq_alloc_requests(&data); > + if (!rq) > + goto out_queue_exit; > + } > rq->__data_len = 0; > rq->__sector = (sector_t) -1; > rq->bio = rq->biotail = NULL; > -- > 2.35.1 > A large chunk of this improvement in passthrough performance is coming 
from enabling request caching. On my setup, the performance improves from 2.34 to 2.54 MIOPS. I have tested this using the t/io_uring utility (in fio) and I am using an Intel Optane Gen2 device. Tested-by: Anuj Gupta <anuj20.g@samsung.com> -- Anuj Gupta ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 2/5] block: change request end_io handler to pass back a return value 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe 2022-09-27 1:44 ` [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() Jens Axboe @ 2022-09-27 1:44 ` Jens Axboe 2022-09-27 1:44 ` [PATCH 3/5] block: allow end_io based requests in the completion batch handling Jens Axboe ` (4 subsequent siblings) 6 siblings, 0 replies; 25+ messages in thread From: Jens Axboe @ 2022-09-27 1:44 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme, Jens Axboe Everything is just converted to returning RQ_END_IO_NONE, and there should be no functional changes with this patch. In preparation for allowing the end_io handler to pass ownership back to the block layer, rather than retain ownership of the request. Signed-off-by: Jens Axboe <axboe@kernel.dk> --- block/blk-flush.c | 10 +++++++--- block/blk-mq.c | 14 +++++++++----- drivers/md/dm-rq.c | 4 +++- drivers/nvme/host/core.c | 6 ++++-- drivers/nvme/host/ioctl.c | 5 ++++- drivers/nvme/host/pci.c | 12 ++++++++---- drivers/nvme/target/passthru.c | 5 +++-- drivers/scsi/scsi_error.c | 4 +++- drivers/scsi/sg.c | 9 +++++---- drivers/scsi/st.c | 4 +++- drivers/target/target_core_pscsi.c | 6 ++++-- drivers/ufs/core/ufshpb.c | 8 ++++++-- include/linux/blk-mq.h | 7 ++++++- 13 files changed, 65 insertions(+), 29 deletions(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index d20a0c6b2c66..ac850f4d9c4c 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -218,7 +218,8 @@ static void blk_flush_complete_seq(struct request *rq, blk_kick_flush(q, fq, cmd_flags); } -static void flush_end_io(struct request *flush_rq, blk_status_t error) +static enum rq_end_io_ret flush_end_io(struct request *flush_rq, + blk_status_t error) { struct request_queue *q = flush_rq->q; struct list_head *running; @@ -232,7 +233,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error) if (!req_ref_put_and_test(flush_rq)) { fq->rq_status = error; spin_unlock_irqrestore(&fq->mq_flush_lock, flags); - return; + return RQ_END_IO_NONE; } blk_account_io_flush(flush_rq); @@ -269,6 +270,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error) } spin_unlock_irqrestore(&fq->mq_flush_lock, flags); + return RQ_END_IO_NONE; } bool is_flush_rq(struct request *rq) @@ -354,7 +356,8 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq, blk_flush_queue_rq(flush_rq, false); } -static void mq_flush_data_end_io(struct request *rq, blk_status_t error) +static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq, + blk_status_t error) { struct request_queue *q = rq->q; struct blk_mq_hw_ctx *hctx = rq->mq_hctx; @@ -376,6 +379,7 @@ static void mq_flush_data_end_io(struct request *rq, blk_status_t error) spin_unlock_irqrestore(&fq->mq_flush_lock, flags); blk_mq_sched_restart(hctx); + return RQ_END_IO_NONE; } /** diff --git a/block/blk-mq.c b/block/blk-mq.c index d3a9f8b9c7ee..a4e018c82b7c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1001,7 +1001,8 @@ inline void __blk_mq_end_request(struct request *rq, blk_status_t error) if (rq->end_io) { rq_qos_done(rq->q, rq); - rq->end_io(rq, error); + if (rq->end_io(rq, error) == RQ_END_IO_FREE) + blk_mq_free_request(rq); } else { blk_mq_free_request(rq); } @@ -1287,12 +1288,13 @@ struct blk_rq_wait { blk_status_t ret; }; -static void blk_end_sync_rq(struct request *rq, blk_status_t ret) +static enum rq_end_io_ret blk_end_sync_rq(struct 
request *rq, blk_status_t ret) { struct blk_rq_wait *wait = rq->end_io_data; wait->ret = ret; complete(&wait->done); + return RQ_END_IO_NONE; } bool blk_rq_is_poll(struct request *rq) @@ -1526,10 +1528,12 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next) void blk_mq_put_rq_ref(struct request *rq) { - if (is_flush_rq(rq)) - rq->end_io(rq, 0); - else if (req_ref_put_and_test(rq)) + if (is_flush_rq(rq)) { + if (rq->end_io(rq, 0) == RQ_END_IO_FREE) + blk_mq_free_request(rq); + } else if (req_ref_put_and_test(rq)) { __blk_mq_free_request(rq); + } } static bool blk_mq_check_expired(struct request *rq, void *priv) diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 4f49bbcce4f1..3001b10a3fbf 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -292,11 +292,13 @@ static void dm_kill_unmapped_request(struct request *rq, blk_status_t error) dm_complete_request(rq, error); } -static void end_clone_request(struct request *clone, blk_status_t error) +static enum rq_end_io_ret end_clone_request(struct request *clone, + blk_status_t error) { struct dm_rq_target_io *tio = clone->end_io_data; dm_complete_request(tio->orig, error); + return RQ_END_IO_NONE; } static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig, diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 13080a017ecf..f946f85e7a66 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1177,7 +1177,8 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl) queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ / 2); } -static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status) +static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, + blk_status_t status) { struct nvme_ctrl *ctrl = rq->end_io_data; unsigned long flags; @@ -1189,7 +1190,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status) dev_err(ctrl->device, "failed nvme_keep_alive_end_io error=%d\n", status); - return; + return RQ_END_IO_NONE; } ctrl->comp_seen = false; @@ -1200,6 +1201,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status) spin_unlock_irqrestore(&ctrl->lock, flags); if (startka) nvme_queue_keep_alive_work(ctrl); + return RQ_END_IO_NONE; } static void nvme_keep_alive_work(struct work_struct *work) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 548aca8b5b9f..c80b3ecca5c8 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -385,7 +385,8 @@ static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) io_uring_cmd_done(ioucmd, status, result); } -static void nvme_uring_cmd_end_io(struct request *req, blk_status_t err) +static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, + blk_status_t err) { struct io_uring_cmd *ioucmd = req->end_io_data; struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); @@ -404,6 +405,8 @@ static void nvme_uring_cmd_end_io(struct request *req, blk_status_t err) nvme_uring_task_cb(ioucmd); else io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); + + return RQ_END_IO_NONE; } static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 09b5d62f342b..361f09f23648 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1268,7 +1268,7 @@ static int adapter_delete_sq(struct nvme_dev *dev, u16 sqid) return adapter_delete_queue(dev, nvme_admin_delete_sq, sqid); } -static void abort_endio(struct request *req, 
blk_status_t error) +static enum rq_end_io_ret abort_endio(struct request *req, blk_status_t error) { struct nvme_queue *nvmeq = req->mq_hctx->driver_data; @@ -1276,6 +1276,7 @@ static void abort_endio(struct request *req, blk_status_t error) "Abort status: 0x%x", nvme_req(req)->status); atomic_inc(&nvmeq->dev->ctrl.abort_limit); blk_mq_free_request(req); + return RQ_END_IO_NONE; } static bool nvme_should_reset(struct nvme_dev *dev, u32 csts) @@ -2447,22 +2448,25 @@ static int nvme_setup_io_queues(struct nvme_dev *dev) return result; } -static void nvme_del_queue_end(struct request *req, blk_status_t error) +static enum rq_end_io_ret nvme_del_queue_end(struct request *req, + blk_status_t error) { struct nvme_queue *nvmeq = req->end_io_data; blk_mq_free_request(req); complete(&nvmeq->delete_done); + return RQ_END_IO_NONE; } -static void nvme_del_cq_end(struct request *req, blk_status_t error) +static enum rq_end_io_ret nvme_del_cq_end(struct request *req, + blk_status_t error) { struct nvme_queue *nvmeq = req->end_io_data; if (error) set_bit(NVMEQ_DELETE_ERROR, &nvmeq->flags); - nvme_del_queue_end(req, error); + return nvme_del_queue_end(req, error); } static int nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode) diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index 6f39a29828b1..4876ccaac55b 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -240,14 +240,15 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w) blk_mq_free_request(rq); } -static void nvmet_passthru_req_done(struct request *rq, - blk_status_t blk_status) +static enum rq_end_io_ret nvmet_passthru_req_done(struct request *rq, + blk_status_t blk_status) { struct nvmet_req *req = rq->end_io_data; req->cqe->result = nvme_req(rq)->result; nvmet_req_complete(req, nvme_req(rq)->status); blk_mq_free_request(rq); + return RQ_END_IO_NONE; } static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 448748e3fba5..786fb963cf3f 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -2004,9 +2004,11 @@ enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *scmd) } } -static void eh_lock_door_done(struct request *req, blk_status_t status) +static enum rq_end_io_ret eh_lock_door_done(struct request *req, + blk_status_t status) { blk_mq_free_request(req); + return RQ_END_IO_NONE; } /** diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 340b050ad28d..94c5e9a9309c 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -177,7 +177,7 @@ typedef struct sg_device { /* holds the state of each scsi generic device */ } Sg_device; /* tasklet or soft irq callback */ -static void sg_rq_end_io(struct request *rq, blk_status_t status); +static enum rq_end_io_ret sg_rq_end_io(struct request *rq, blk_status_t status); static int sg_start_req(Sg_request *srp, unsigned char *cmd); static int sg_finish_rem_req(Sg_request * srp); static int sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size); @@ -1311,7 +1311,7 @@ sg_rq_end_io_usercontext(struct work_struct *work) * This function is a "bottom half" handler that is called by the mid * level when a command is completed (or has failed). 
*/ -static void +static enum rq_end_io_ret sg_rq_end_io(struct request *rq, blk_status_t status) { struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(rq); @@ -1324,11 +1324,11 @@ sg_rq_end_io(struct request *rq, blk_status_t status) int result, resid, done = 1; if (WARN_ON(srp->done != 0)) - return; + return RQ_END_IO_NONE; sfp = srp->parentfp; if (WARN_ON(sfp == NULL)) - return; + return RQ_END_IO_NONE; sdp = sfp->parentdp; if (unlikely(atomic_read(&sdp->detaching))) @@ -1406,6 +1406,7 @@ sg_rq_end_io(struct request *rq, blk_status_t status) INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext); schedule_work(&srp->ew.work); } + return RQ_END_IO_NONE; } static const struct file_operations sg_fops = { diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c index 850172a2b8f1..55e7c07ebe4c 100644 --- a/drivers/scsi/st.c +++ b/drivers/scsi/st.c @@ -512,7 +512,8 @@ static void st_do_stats(struct scsi_tape *STp, struct request *req) atomic64_dec(&STp->stats->in_flight); } -static void st_scsi_execute_end(struct request *req, blk_status_t status) +static enum rq_end_io_ret st_scsi_execute_end(struct request *req, + blk_status_t status) { struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); struct st_request *SRpnt = req->end_io_data; @@ -532,6 +533,7 @@ static void st_scsi_execute_end(struct request *req, blk_status_t status) blk_rq_unmap_user(tmp); blk_mq_free_request(req); + return RQ_END_IO_NONE; } static int st_scsi_execute(struct st_request *SRpnt, const unsigned char *cmd, diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c index e6a967ddc08c..8a7306e5e133 100644 --- a/drivers/target/target_core_pscsi.c +++ b/drivers/target/target_core_pscsi.c @@ -39,7 +39,7 @@ static inline struct pscsi_dev_virt *PSCSI_DEV(struct se_device *dev) } static sense_reason_t pscsi_execute_cmd(struct se_cmd *cmd); -static void pscsi_req_done(struct request *, blk_status_t); +static enum rq_end_io_ret pscsi_req_done(struct request *, blk_status_t); /* pscsi_attach_hba(): * @@ -1002,7 +1002,8 @@ static sector_t pscsi_get_blocks(struct se_device *dev) return 0; } -static void pscsi_req_done(struct request *req, blk_status_t status) +static enum rq_end_io_ret pscsi_req_done(struct request *req, + blk_status_t status) { struct se_cmd *cmd = req->end_io_data; struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); @@ -1029,6 +1030,7 @@ static void pscsi_req_done(struct request *req, blk_status_t status) } blk_mq_free_request(req); + return RQ_END_IO_NONE; } static const struct target_backend_ops pscsi_ops = { diff --git a/drivers/ufs/core/ufshpb.c b/drivers/ufs/core/ufshpb.c index a1a7a1175a5a..3d69a81c5b17 100644 --- a/drivers/ufs/core/ufshpb.c +++ b/drivers/ufs/core/ufshpb.c @@ -613,14 +613,17 @@ static void ufshpb_activate_subregion(struct ufshpb_lu *hpb, srgn->srgn_state = HPB_SRGN_VALID; } -static void ufshpb_umap_req_compl_fn(struct request *req, blk_status_t error) +static enum rq_end_io_ret ufshpb_umap_req_compl_fn(struct request *req, + blk_status_t error) { struct ufshpb_req *umap_req = (struct ufshpb_req *)req->end_io_data; ufshpb_put_req(umap_req->hpb, umap_req); + return RQ_END_IO_NONE; } -static void ufshpb_map_req_compl_fn(struct request *req, blk_status_t error) +static enum rq_end_io_ret ufshpb_map_req_compl_fn(struct request *req, + blk_status_t error) { struct ufshpb_req *map_req = (struct ufshpb_req *) req->end_io_data; struct ufshpb_lu *hpb = map_req->hpb; @@ -636,6 +639,7 @@ static void ufshpb_map_req_compl_fn(struct request *req, blk_status_t error) 
spin_unlock_irqrestore(&hpb->rgn_state_lock, flags); ufshpb_put_map_req(map_req->hpb, map_req); + return RQ_END_IO_NONE; } static void ufshpb_set_unmap_cmd(unsigned char *cdb, struct ufshpb_region *rgn) diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 00a15808c137..e6fa49dd6196 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -14,7 +14,12 @@ struct blk_flush_queue; #define BLKDEV_MIN_RQ 4 #define BLKDEV_DEFAULT_RQ 128 -typedef void (rq_end_io_fn)(struct request *, blk_status_t); +enum rq_end_io_ret { + RQ_END_IO_NONE, + RQ_END_IO_FREE, +}; + +typedef enum rq_end_io_ret (rq_end_io_fn)(struct request *, blk_status_t); /* * request flags */ -- 2.35.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 3/5] block: allow end_io based requests in the completion batch handling 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe 2022-09-27 1:44 ` [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() Jens Axboe 2022-09-27 1:44 ` [PATCH 2/5] block: change request end_io handler to pass back a return value Jens Axboe @ 2022-09-27 1:44 ` Jens Axboe 2022-09-28 13:42 ` Anuj gupta 2022-09-27 1:44 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe ` (3 subsequent siblings) 6 siblings, 1 reply; 25+ messages in thread From: Jens Axboe @ 2022-09-27 1:44 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme, Jens Axboe, Stefan Roesch With end_io handlers now being able to potentially pass ownership of the request upon completion, we can allow requests with end_io handlers in the batch completion handling. Co-developed-by: Stefan Roesch <shr@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> --- block/blk-mq.c | 13 +++++++++++-- include/linux/blk-mq.h | 3 ++- 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index a4e018c82b7c..a7dfe7a898a4 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -823,8 +823,10 @@ static void blk_complete_request(struct request *req) * can find how many bytes remain in the request * later. */ - req->bio = NULL; - req->__data_len = 0; + if (!req->end_io) { + req->bio = NULL; + req->__data_len = 0; + } } /** @@ -1055,6 +1057,13 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob) rq_qos_done(rq->q, rq); + /* + * If end_io handler returns NONE, then it still has + * ownership of the request. + */ + if (rq->end_io && rq->end_io(rq, 0) == RQ_END_IO_NONE) + continue; + WRITE_ONCE(rq->state, MQ_RQ_IDLE); if (!req_ref_put_and_test(rq)) continue; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index e6fa49dd6196..50811d0fb143 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -853,8 +853,9 @@ static inline bool blk_mq_add_to_batch(struct request *req, struct io_comp_batch *iob, int ioerror, void (*complete)(struct io_comp_batch *)) { - if (!iob || (req->rq_flags & RQF_ELV) || req->end_io || ioerror) + if (!iob || (req->rq_flags & RQF_ELV) || ioerror) return false; + if (!iob->complete) iob->complete = complete; else if (iob->complete != complete) -- 2.35.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 3/5] block: allow end_io based requests in the completion batch handling 2022-09-27 1:44 ` [PATCH 3/5] block: allow end_io based requests in the completion batch handling Jens Axboe @ 2022-09-28 13:42 ` Anuj gupta 0 siblings, 0 replies; 25+ messages in thread From: Anuj gupta @ 2022-09-28 13:42 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch On Tue, Sep 27, 2022 at 7:20 AM Jens Axboe <axboe@kernel.dk> wrote: > > With end_io handlers now being able to potentially pass ownership of > the request upon completion, we can allow requests with end_io handlers > in the batch completion handling. > > Co-developed-by: Stefan Roesch <shr@fb.com> > Signed-off-by: Jens Axboe <axboe@kernel.dk> > --- > block/blk-mq.c | 13 +++++++++++-- > include/linux/blk-mq.h | 3 ++- > 2 files changed, 13 insertions(+), 3 deletions(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index a4e018c82b7c..a7dfe7a898a4 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -823,8 +823,10 @@ static void blk_complete_request(struct request *req) > * can find how many bytes remain in the request > * later. > */ > - req->bio = NULL; > - req->__data_len = 0; > + if (!req->end_io) { > + req->bio = NULL; > + req->__data_len = 0; > + } > } > > /** > @@ -1055,6 +1057,13 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob) > > rq_qos_done(rq->q, rq); > > + /* > + * If end_io handler returns NONE, then it still has > + * ownership of the request. > + */ > + if (rq->end_io && rq->end_io(rq, 0) == RQ_END_IO_NONE) > + continue; > + > WRITE_ONCE(rq->state, MQ_RQ_IDLE); > if (!req_ref_put_and_test(rq)) > continue; > diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h > index e6fa49dd6196..50811d0fb143 100644 > --- a/include/linux/blk-mq.h > +++ b/include/linux/blk-mq.h > @@ -853,8 +853,9 @@ static inline bool blk_mq_add_to_batch(struct request *req, > struct io_comp_batch *iob, int ioerror, > void (*complete)(struct io_comp_batch *)) > { > - if (!iob || (req->rq_flags & RQF_ELV) || req->end_io || ioerror) > + if (!iob || (req->rq_flags & RQF_ELV) || ioerror) > return false; > + > if (!iob->complete) > iob->complete = complete; > else if (iob->complete != complete) > -- > 2.35.1 > Looks good. Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> -- Anuj Gupta ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe ` (2 preceding siblings ...) 2022-09-27 1:44 ` [PATCH 3/5] block: allow end_io based requests in the completion batch handling Jens Axboe @ 2022-09-27 1:44 ` Jens Axboe 2022-09-27 7:50 ` Christoph Hellwig ` (2 more replies) 2022-09-27 1:44 ` [PATCH 5/5] nvme: enable batched completions of passthrough IO Jens Axboe ` (2 subsequent siblings) 6 siblings, 3 replies; 25+ messages in thread From: Jens Axboe @ 2022-09-27 1:44 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme, Jens Axboe, Stefan Roesch By splitting up the metadata and non-metadata end_io handling, we can remove any request dependencies on the normal non-metadata IO path. This is in preparation for enabling the normal IO passthrough path to pass the ownership of the request back to the block layer. Co-developed-by: Stefan Roesch <shr@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> --- drivers/nvme/host/ioctl.c | 79 ++++++++++++++++++++++++++++++--------- 1 file changed, 61 insertions(+), 18 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index c80b3ecca5c8..9e356a6c96c2 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -349,9 +349,15 @@ struct nvme_uring_cmd_pdu { struct bio *bio; struct request *req; }; - void *meta; /* kernel-resident buffer */ - void __user *meta_buffer; u32 meta_len; + u32 nvme_status; + union { + struct { + void *meta; /* kernel-resident buffer */ + void __user *meta_buffer; + }; + u64 result; + } u; }; static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( @@ -360,11 +366,10 @@ static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( return (struct nvme_uring_cmd_pdu *)&ioucmd->pdu; } -static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) +static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd) { struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); struct request *req = pdu->req; - struct bio *bio = req->bio; int status; u64 result; @@ -375,27 +380,39 @@ static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) result = le64_to_cpu(nvme_req(req)->result.u64); - if (pdu->meta) - status = nvme_finish_user_metadata(req, pdu->meta_buffer, - pdu->meta, pdu->meta_len, status); - if (bio) - blk_rq_unmap_user(bio); + if (pdu->meta_len) + status = nvme_finish_user_metadata(req, pdu->u.meta_buffer, + pdu->u.meta, pdu->meta_len, status); + if (req->bio) + blk_rq_unmap_user(req->bio); blk_mq_free_request(req); io_uring_cmd_done(ioucmd, status, result); } +static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) +{ + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); + + if (pdu->bio) + blk_rq_unmap_user(pdu->bio); + + io_uring_cmd_done(ioucmd, pdu->nvme_status, pdu->u.result); +} + static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, blk_status_t err) { struct io_uring_cmd *ioucmd = req->end_io_data; struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - /* extract bio before reusing the same field for request */ - struct bio *bio = pdu->bio; void *cookie = READ_ONCE(ioucmd->cookie); - pdu->req = req; - req->bio = bio; + req->bio = pdu->bio; + if (nvme_req(req)->flags & NVME_REQ_CANCELLED) + pdu->nvme_status = -EINTR; + else + pdu->nvme_status = nvme_req(req)->status; + pdu->u.result = le64_to_cpu(nvme_req(req)->result.u64); /* * For iopoll, complete it directly. 
@@ -406,6 +423,29 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, else io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); + blk_mq_free_request(req); + return RQ_END_IO_NONE; +} + +static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, + blk_status_t err) +{ + struct io_uring_cmd *ioucmd = req->end_io_data; + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); + void *cookie = READ_ONCE(ioucmd->cookie); + + req->bio = pdu->bio; + pdu->req = req; + + /* + * For iopoll, complete it directly. + * Otherwise, move the completion to task work. + */ + if (cookie != NULL && blk_rq_is_poll(req)) + nvme_uring_task_meta_cb(ioucmd); + else + io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_meta_cb); + return RQ_END_IO_NONE; } @@ -467,8 +507,6 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, blk_flags); if (IS_ERR(req)) return PTR_ERR(req); - req->end_io = nvme_uring_cmd_end_io; - req->end_io_data = ioucmd; if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) { if (unlikely(!req->bio)) { @@ -483,10 +521,15 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, } /* to free bio on completion, as req->bio will be null at that time */ pdu->bio = req->bio; - pdu->meta = meta; - pdu->meta_buffer = nvme_to_user_ptr(d.metadata); pdu->meta_len = d.metadata_len; - + req->end_io_data = ioucmd; + if (pdu->meta_len) { + pdu->u.meta = meta; + pdu->u.meta_buffer = nvme_to_user_ptr(d.metadata); + req->end_io = nvme_uring_cmd_end_io_meta; + } else { + req->end_io = nvme_uring_cmd_end_io; + } blk_execute_rq_nowait(req, false); return -EIOCBQUEUED; } -- 2.35.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-27 1:44 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe @ 2022-09-27 7:50 ` Christoph Hellwig 2022-09-28 13:51 ` Anuj gupta 2022-09-28 14:47 ` Sagi Grimberg 2 siblings, 0 replies; 25+ messages in thread From: Christoph Hellwig @ 2022-09-27 7:50 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch On Mon, Sep 26, 2022 at 07:44:19PM -0600, Jens Axboe wrote: > By splitting up the metadata and non-metadata end_io handling, we can > remove any request dependencies on the normal non-metadata IO path. This > is in preparation for enabling the normal IO passthrough path to pass > the ownership of the request back to the block layer. > > Co-developed-by: Stefan Roesch <shr@fb.com> > Signed-off-by: Jens Axboe <axboe@kernel.dk> Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-27 1:44 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe 2022-09-27 7:50 ` Christoph Hellwig @ 2022-09-28 13:51 ` Anuj gupta 2022-09-28 14:47 ` Sagi Grimberg 2 siblings, 0 replies; 25+ messages in thread From: Anuj gupta @ 2022-09-28 13:51 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch On Tue, Sep 27, 2022 at 7:22 AM Jens Axboe <axboe@kernel.dk> wrote: > > By splitting up the metadata and non-metadata end_io handling, we can > remove any request dependencies on the normal non-metadata IO path. This > is in preparation for enabling the normal IO passthrough path to pass > the ownership of the request back to the block layer. > > Co-developed-by: Stefan Roesch <shr@fb.com> > Signed-off-by: Jens Axboe <axboe@kernel.dk> > --- > drivers/nvme/host/ioctl.c | 79 ++++++++++++++++++++++++++++++--------- > 1 file changed, 61 insertions(+), 18 deletions(-) > > diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c > index c80b3ecca5c8..9e356a6c96c2 100644 > --- a/drivers/nvme/host/ioctl.c > +++ b/drivers/nvme/host/ioctl.c > @@ -349,9 +349,15 @@ struct nvme_uring_cmd_pdu { > struct bio *bio; > struct request *req; > }; > - void *meta; /* kernel-resident buffer */ > - void __user *meta_buffer; > u32 meta_len; > + u32 nvme_status; > + union { > + struct { > + void *meta; /* kernel-resident buffer */ > + void __user *meta_buffer; > + }; > + u64 result; > + } u; > }; > > static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( > @@ -360,11 +366,10 @@ static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( > return (struct nvme_uring_cmd_pdu *)&ioucmd->pdu; > } > > -static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) > +static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd) > { > struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); > struct request *req = pdu->req; > - struct bio *bio = req->bio; > int status; > u64 result; > > @@ -375,27 +380,39 @@ static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) > > result = le64_to_cpu(nvme_req(req)->result.u64); > > - if (pdu->meta) > - status = nvme_finish_user_metadata(req, pdu->meta_buffer, > - pdu->meta, pdu->meta_len, status); > - if (bio) > - blk_rq_unmap_user(bio); > + if (pdu->meta_len) > + status = nvme_finish_user_metadata(req, pdu->u.meta_buffer, > + pdu->u.meta, pdu->meta_len, status); > + if (req->bio) > + blk_rq_unmap_user(req->bio); > blk_mq_free_request(req); > > io_uring_cmd_done(ioucmd, status, result); > } > > +static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) > +{ > + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); > + > + if (pdu->bio) > + blk_rq_unmap_user(pdu->bio); > + > + io_uring_cmd_done(ioucmd, pdu->nvme_status, pdu->u.result); > +} > + > static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, > blk_status_t err) > { > struct io_uring_cmd *ioucmd = req->end_io_data; > struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); > - /* extract bio before reusing the same field for request */ > - struct bio *bio = pdu->bio; > void *cookie = READ_ONCE(ioucmd->cookie); > > - pdu->req = req; > - req->bio = bio; > + req->bio = pdu->bio; > + if (nvme_req(req)->flags & NVME_REQ_CANCELLED) > + pdu->nvme_status = -EINTR; > + else > + pdu->nvme_status = nvme_req(req)->status; > + pdu->u.result = le64_to_cpu(nvme_req(req)->result.u64); > > /* > * For iopoll, complete it 
directly. > @@ -406,6 +423,29 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, > else > io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); > > + blk_mq_free_request(req); > + return RQ_END_IO_NONE; > +} > + > +static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, > + blk_status_t err) > +{ > + struct io_uring_cmd *ioucmd = req->end_io_data; > + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); > + void *cookie = READ_ONCE(ioucmd->cookie); > + > + req->bio = pdu->bio; > + pdu->req = req; > + > + /* > + * For iopoll, complete it directly. > + * Otherwise, move the completion to task work. > + */ > + if (cookie != NULL && blk_rq_is_poll(req)) > + nvme_uring_task_meta_cb(ioucmd); > + else > + io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_meta_cb); > + > return RQ_END_IO_NONE; > } > > @@ -467,8 +507,6 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, > blk_flags); > if (IS_ERR(req)) > return PTR_ERR(req); > - req->end_io = nvme_uring_cmd_end_io; > - req->end_io_data = ioucmd; > > if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) { > if (unlikely(!req->bio)) { > @@ -483,10 +521,15 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, > } > /* to free bio on completion, as req->bio will be null at that time */ > pdu->bio = req->bio; > - pdu->meta = meta; > - pdu->meta_buffer = nvme_to_user_ptr(d.metadata); > pdu->meta_len = d.metadata_len; > - > + req->end_io_data = ioucmd; > + if (pdu->meta_len) { > + pdu->u.meta = meta; > + pdu->u.meta_buffer = nvme_to_user_ptr(d.metadata); > + req->end_io = nvme_uring_cmd_end_io_meta; > + } else { > + req->end_io = nvme_uring_cmd_end_io; > + } > blk_execute_rq_nowait(req, false); > return -EIOCBQUEUED; > } > -- > 2.35.1 > Looks good. Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> -- Anuj Gupta ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-27 1:44 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe 2022-09-27 7:50 ` Christoph Hellwig 2022-09-28 13:51 ` Anuj gupta @ 2022-09-28 14:47 ` Sagi Grimberg 2 siblings, 0 replies; 25+ messages in thread From: Sagi Grimberg @ 2022-09-28 14:47 UTC (permalink / raw) To: Jens Axboe, linux-block; +Cc: linux-scsi, linux-nvme, Stefan Roesch > -static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) > +static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd) > { > struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); > struct request *req = pdu->req; > - struct bio *bio = req->bio; Unrelated change I think. But other than that, looks good, Reviewed-by: Sagi Grimberg <sagi@grimberg.me> ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 5/5] nvme: enable batched completions of passthrough IO 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe ` (3 preceding siblings ...) 2022-09-27 1:44 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe @ 2022-09-27 1:44 ` Jens Axboe 2022-09-28 13:55 ` Anuj gupta 2022-09-28 14:47 ` Sagi Grimberg 2 siblings, 2 replies; 25+ messages in thread From: Jens Axboe @ 2022-09-27 1:44 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme, Jens Axboe, Stefan Roesch Now that the normal passthrough end_io path doesn't need the request anymore, we can kill the explicit blk_mq_free_request() and just pass back RQ_END_IO_FREE instead. This enables the batched completion path to free batches of requests at that time. This brings passthrough IO performance at least on par with bdev based O_DIRECT with io_uring. With this and batched allocations, peak performance goes from 110M IOPS to 122M IOPS. For IRQ based, passthrough is now also about 10% faster than previously, going from ~61M to ~67M IOPS. Co-developed-by: Stefan Roesch <shr@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> --- drivers/nvme/host/ioctl.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 9e356a6c96c2..d9633f426690 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -423,8 +423,7 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, else io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); - blk_mq_free_request(req); - return RQ_END_IO_NONE; + return RQ_END_IO_FREE; } static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, -- 2.35.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 5/5] nvme: enable batched completions of passthrough IO 2022-09-27 1:44 ` [PATCH 5/5] nvme: enable batched completions of passthrough IO Jens Axboe @ 2022-09-28 13:55 ` Anuj gupta 2022-09-28 14:47 ` Sagi Grimberg 1 sibling, 0 replies; 25+ messages in thread From: Anuj gupta @ 2022-09-28 13:55 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch On Tue, Sep 27, 2022 at 7:19 AM Jens Axboe <axboe@kernel.dk> wrote: > > Now that the normal passthrough end_io path doesn't need the request > anymore, we can kill the explicit blk_mq_free_request() and just pass > back RQ_END_IO_FREE instead. This enables the batched completion from > freeing batches of requests at the time. > > This brings passthrough IO performance at least on par with bdev based > O_DIRECT with io_uring. With this and batche allocations, peak performance > goes from 110M IOPS to 122M IOPS. For IRQ based, passthrough is now also > about 10% faster than previously, going from ~61M to ~67M IOPS. > > Co-developed-by: Stefan Roesch <shr@fb.com> > Signed-off-by: Jens Axboe <axboe@kernel.dk> > --- > drivers/nvme/host/ioctl.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c > index 9e356a6c96c2..d9633f426690 100644 > --- a/drivers/nvme/host/ioctl.c > +++ b/drivers/nvme/host/ioctl.c > @@ -423,8 +423,7 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, > else > io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); > > - blk_mq_free_request(req); > - return RQ_END_IO_NONE; > + return RQ_END_IO_FREE; > } > > static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, > -- > 2.35.1 > Looks good to me. Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> -- Anuj Gupta ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/5] nvme: enable batched completions of passthrough IO 2022-09-27 1:44 ` [PATCH 5/5] nvme: enable batched completions of passthrough IO Jens Axboe 2022-09-28 13:55 ` Anuj gupta @ 2022-09-28 14:47 ` Sagi Grimberg 1 sibling, 0 replies; 25+ messages in thread From: Sagi Grimberg @ 2022-09-28 14:47 UTC (permalink / raw) To: Jens Axboe, linux-block; +Cc: linux-scsi, linux-nvme, Stefan Roesch Reviewed-by: Sagi Grimberg <sagi@grimberg.me> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe ` (4 preceding siblings ...) 2022-09-27 1:44 ` [PATCH 5/5] nvme: enable batched completions of passthrough IO Jens Axboe @ 2022-09-28 13:23 ` Anuj gupta 2022-09-28 14:22 ` Jens Axboe 2022-09-28 17:05 ` Keith Busch 6 siblings, 1 reply; 25+ messages in thread From: Anuj gupta @ 2022-09-28 13:23 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme On Tue, Sep 27, 2022 at 7:14 AM Jens Axboe <axboe@kernel.dk> wrote: > > Hi, > > The passthrough IO path currently doesn't do any request allocation > batching like we do for normal IO. Wire this up through the usual > blk_mq_alloc_request() allocation helper. > > Similarly, we don't currently supported batched completions for > passthrough IO. Allow the request->end_io() handler to return back > whether or not it retains ownership of the request. By default all > handlers are converted to returning RQ_END_IO_NONE, which retains > the existing behavior. But with that in place, we can tweak the > nvme uring_cmd end_io handler to pass back ownership, and hence enable > completion batching for passthrough requests as well. > > This is good for a 10% improvement for passthrough performance. For > a non-drive limited test case, passthrough IO is now more efficient > than the regular bdev O_DIRECT path. > > Changes since v1: > - Remove spurious semicolon > - Cleanup struct nvme_uring_cmd_pdu handling > > -- > Jens Axboe > > I see an improvement of ~12% (2.34 to 2.63 MIOPS) with polling enabled and an improvement of ~4% (1.84 to 1.92 MIOPS) with polling disabled using the t/io_uring utility (in fio) in my setup with this patch series! -- Anuj Gupta ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough 2022-09-28 13:23 ` [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Anuj gupta @ 2022-09-28 14:22 ` Jens Axboe 0 siblings, 0 replies; 25+ messages in thread From: Jens Axboe @ 2022-09-28 14:22 UTC (permalink / raw) To: Anuj gupta; +Cc: linux-block, linux-scsi, linux-nvme On 9/28/22 7:23 AM, Anuj gupta wrote: > On Tue, Sep 27, 2022 at 7:14 AM Jens Axboe <axboe@kernel.dk> wrote: >> >> Hi, >> >> The passthrough IO path currently doesn't do any request allocation >> batching like we do for normal IO. Wire this up through the usual >> blk_mq_alloc_request() allocation helper. >> >> Similarly, we don't currently supported batched completions for >> passthrough IO. Allow the request->end_io() handler to return back >> whether or not it retains ownership of the request. By default all >> handlers are converted to returning RQ_END_IO_NONE, which retains >> the existing behavior. But with that in place, we can tweak the >> nvme uring_cmd end_io handler to pass back ownership, and hence enable >> completion batching for passthrough requests as well. >> >> This is good for a 10% improvement for passthrough performance. For >> a non-drive limited test case, passthrough IO is now more efficient >> than the regular bdev O_DIRECT path. >> >> Changes since v1: >> - Remove spurious semicolon >> - Cleanup struct nvme_uring_cmd_pdu handling >> >> -- >> Jens Axboe >> >> > I see an improvement of ~12% (2.34 to 2.63 MIOPS) with polling enabled and > an improvement of ~4% (1.84 to 1.92 MIOPS) with polling disabled using the > t/io_uring utility (in fio) in my setup with this patch series! Thanks for your testing! I'll add your reviewed-by to the series. -- Jens Axboe ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough 2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe ` (5 preceding siblings ...) 2022-09-28 13:23 ` [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Anuj gupta @ 2022-09-28 17:05 ` Keith Busch 6 siblings, 0 replies; 25+ messages in thread From: Keith Busch @ 2022-09-28 17:05 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme Series looks good. Reviewed-by: Keith Busch <kbusch@kernel.org> ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCHSET 0/5] Enable alloc caching and batched freeing for passthrough @ 2022-09-22 18:28 Jens Axboe 2022-09-22 18:28 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe 0 siblings, 1 reply; 25+ messages in thread From: Jens Axboe @ 2022-09-22 18:28 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme Hi, The passthrough IO path currently doesn't do any request allocation batching like we do for normal IO. Wire this up through the usual blk_mq_alloc_request() allocation helper. Similarly, we don't currently support batched completions for passthrough IO. Allow the request->end_io() handler to return back whether or not it retains ownership of the request. By default all handlers are converted to returning RQ_END_IO_NONE, which retains the existing behavior. But with that in place, we can tweak the nvme uring_cmd end_io handler to pass back ownership, and hence enable completion batching for passthrough requests as well. This is good for a 10% improvement in passthrough performance. For a non-drive limited test case, passthrough IO is now more efficient than the regular bdev O_DIRECT path. -- Jens Axboe ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-22 18:28 [PATCHSET " Jens Axboe @ 2022-09-22 18:28 ` Jens Axboe 2022-09-23 15:21 ` Christoph Hellwig 0 siblings, 1 reply; 25+ messages in thread From: Jens Axboe @ 2022-09-22 18:28 UTC (permalink / raw) To: linux-block; +Cc: linux-scsi, linux-nvme, Jens Axboe, Stefan Roesch By splitting up the metadata and non-metadata end_io handling, we can remove any request dependencies on the normal non-metadata IO path. This is in preparation for enabling the normal IO passthrough path to pass the ownership of the request back to the block layer. Co-developed-by: Stefan Roesch <shr@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> --- drivers/nvme/host/ioctl.c | 82 +++++++++++++++++++++++++++++++-------- 1 file changed, 66 insertions(+), 16 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index c80b3ecca5c8..1ccc9dd6d434 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -349,9 +349,18 @@ struct nvme_uring_cmd_pdu { struct bio *bio; struct request *req; }; - void *meta; /* kernel-resident buffer */ - void __user *meta_buffer; u32 meta_len; + union { + struct { + void *meta; /* kernel-resident buffer */ + void __user *meta_buffer; + }; + struct { + u32 nvme_flags; + u32 nvme_status; + u64 result; + }; + }; }; static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( @@ -360,11 +369,10 @@ static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( return (struct nvme_uring_cmd_pdu *)&ioucmd->pdu; } -static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) +static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd) { struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); struct request *req = pdu->req; - struct bio *bio = req->bio; int status; u64 result; @@ -375,27 +383,43 @@ static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) result = le64_to_cpu(nvme_req(req)->result.u64); - if (pdu->meta) + if (pdu->meta_len) status = nvme_finish_user_metadata(req, pdu->meta_buffer, pdu->meta, pdu->meta_len, status); - if (bio) - blk_rq_unmap_user(bio); + if (req->bio) + blk_rq_unmap_user(req->bio); blk_mq_free_request(req); io_uring_cmd_done(ioucmd, status, result); } +static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) +{ + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); + int status; + + if (pdu->nvme_flags & NVME_REQ_CANCELLED) + status = -EINTR; + else + status = pdu->nvme_status; + + if (pdu->bio) + blk_rq_unmap_user(pdu->bio); + + io_uring_cmd_done(ioucmd, status, pdu->result); +} + static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, blk_status_t err) { struct io_uring_cmd *ioucmd = req->end_io_data; struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - /* extract bio before reusing the same field for request */ - struct bio *bio = pdu->bio; void *cookie = READ_ONCE(ioucmd->cookie); - pdu->req = req; - req->bio = bio; + req->bio = pdu->bio; + pdu->nvme_flags = nvme_req(req)->flags; + pdu->nvme_status = nvme_req(req)->status; + pdu->result = le64_to_cpu(nvme_req(req)->result.u64); /* * For iopoll, complete it directly. 
@@ -406,6 +430,29 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, else io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); + blk_mq_free_request(req); + return RQ_END_IO_NONE; +} + +static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, + blk_status_t err) +{ + struct io_uring_cmd *ioucmd = req->end_io_data; + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); + void *cookie = READ_ONCE(ioucmd->cookie); + + req->bio = pdu->bio; + pdu->req = req; + + /* + * For iopoll, complete it directly. + * Otherwise, move the completion to task work. + */ + if (cookie != NULL && blk_rq_is_poll(req)) + nvme_uring_task_meta_cb(ioucmd); + else + io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_meta_cb); + return RQ_END_IO_NONE; } @@ -467,8 +514,6 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, blk_flags); if (IS_ERR(req)) return PTR_ERR(req); - req->end_io = nvme_uring_cmd_end_io; - req->end_io_data = ioucmd; if (issue_flags & IO_URING_F_IOPOLL && rq_flags & REQ_POLLED) { if (unlikely(!req->bio)) { @@ -483,10 +528,15 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, } /* to free bio on completion, as req->bio will be null at that time */ pdu->bio = req->bio; - pdu->meta = meta; - pdu->meta_buffer = nvme_to_user_ptr(d.metadata); pdu->meta_len = d.metadata_len; - + req->end_io_data = ioucmd; + if (pdu->meta_len) { + pdu->meta = meta; + pdu->meta_buffer = nvme_to_user_ptr(d.metadata); + req->end_io = nvme_uring_cmd_end_io_meta; + } else { + req->end_io = nvme_uring_cmd_end_io; + } blk_execute_rq_nowait(req, false); return -EIOCBQUEUED; } -- 2.35.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-22 18:28 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe @ 2022-09-23 15:21 ` Christoph Hellwig 2022-09-23 20:52 ` Jens Axboe 0 siblings, 1 reply; 25+ messages in thread From: Christoph Hellwig @ 2022-09-23 15:21 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch > + union { > + struct { > + void *meta; /* kernel-resident buffer */ > + void __user *meta_buffer; > + }; > + struct { > + u32 nvme_flags; > + u32 nvme_status; > + u64 result; > + }; > + }; Without naming the arms of the union this is becoming a bit too much of a mess.. > +static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) > +{ > + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); > + int status; > + > + if (pdu->nvme_flags & NVME_REQ_CANCELLED) > + status = -EINTR; > + else > + status = pdu->nvme_status; If you add a signed int field you only need one field instead of two in the pdu for this (the nvme status is only 15 bits anyway). ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions 2022-09-23 15:21 ` Christoph Hellwig @ 2022-09-23 20:52 ` Jens Axboe 2022-09-26 14:41 ` Christoph Hellwig 0 siblings, 1 reply; 25+ messages in thread From: Jens Axboe @ 2022-09-23 20:52 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch On 9/23/22 9:21 AM, Christoph Hellwig wrote: >> + union { >> + struct { >> + void *meta; /* kernel-resident buffer */ >> + void __user *meta_buffer; >> + }; >> + struct { >> + u32 nvme_flags; >> + u32 nvme_status; >> + u64 result; >> + }; >> + }; > > Without naming the arms of the union this is becoming a bit too much > of a mess.. > >> +static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) >> +{ >> + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); >> + int status; >> + >> + if (pdu->nvme_flags & NVME_REQ_CANCELLED) >> + status = -EINTR; >> + else >> + status = pdu->nvme_status; > > If you add a signed int field you only need one field instead of > two in the pdu for this (the nvme status is only 15 bits anyway). For both of these, how about we just simplify like below? I think at that point it's useless to name them anyway. diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 25f2f6df1602..6f955984ca14 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -350,16 +350,13 @@ struct nvme_uring_cmd_pdu { struct request *req; }; u32 meta_len; + u32 nvme_status; union { struct { void *meta; /* kernel-resident buffer */ void __user *meta_buffer; }; - struct { - u32 nvme_flags; - u32 nvme_status; - u64 result; - }; + u64 result; }; }; @@ -396,17 +393,11 @@ static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd) static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd) { struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - int status; - - if (pdu->nvme_flags & NVME_REQ_CANCELLED) - status = -EINTR; - else - status = pdu->nvme_status; if (pdu->bio) blk_rq_unmap_user(pdu->bio); - io_uring_cmd_done(ioucmd, status, pdu->result); + io_uring_cmd_done(ioucmd, pdu->nvme_status, pdu->result); } static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, @@ -417,8 +408,10 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, void *cookie = READ_ONCE(ioucmd->cookie); req->bio = pdu->bio; - pdu->nvme_flags = nvme_req(req)->flags; - pdu->nvme_status = nvme_req(req)->status; + if (nvme_req(req)->flags & NVME_REQ_CANCELLED) + pdu->nvme_status = -EINTR; + else + pdu->nvme_status = nvme_req(req)->status; pdu->result = le64_to_cpu(nvme_req(req)->result.u64); /* -- Jens Axboe ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions
2022-09-23 20:52 ` Jens Axboe
@ 2022-09-26 14:41 ` Christoph Hellwig
2022-09-26 14:41 ` Jens Axboe
0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2022-09-26 14:41 UTC (permalink / raw)
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, linux-scsi, linux-nvme, Stefan Roesch

On Fri, Sep 23, 2022 at 02:52:54PM -0600, Jens Axboe wrote:
> For both of these, how about we just simplify like below? I think
> at that point it's useless to name them anyway.

I think this version is better than the previous one, but I'd still
prefer a non-anonymous union.

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions
2022-09-26 14:41 ` Christoph Hellwig
@ 2022-09-26 14:41 ` Jens Axboe
2022-09-26 14:43 ` Christoph Hellwig
0 siblings, 1 reply; 25+ messages in thread
From: Jens Axboe @ 2022-09-26 14:41 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch

On 9/26/22 8:41 AM, Christoph Hellwig wrote:
> On Fri, Sep 23, 2022 at 02:52:54PM -0600, Jens Axboe wrote:
>> For both of these, how about we just simplify like below? I think
>> at that point it's useless to name them anyway.
>
> I think this version is better than the previous one, but I'd still
> prefer a non-anonymous union.

Sure, I don't really care. What name do you want for it?

--
Jens Axboe

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions
2022-09-26 14:41 ` Jens Axboe
@ 2022-09-26 14:43 ` Christoph Hellwig
2022-09-26 14:50 ` Jens Axboe
0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2022-09-26 14:43 UTC (permalink / raw)
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, linux-scsi, linux-nvme, Stefan Roesch

On Mon, Sep 26, 2022 at 08:41:38AM -0600, Jens Axboe wrote:
> Sure, I don't really care. What name do you want for it?

Maybe slow and fast? Or simple and meta?

>
> --
> Jens Axboe
>
> ---end quoted text---

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions
2022-09-26 14:43 ` Christoph Hellwig
@ 2022-09-26 14:50 ` Jens Axboe
2022-09-26 14:52 ` Christoph Hellwig
0 siblings, 1 reply; 25+ messages in thread
From: Jens Axboe @ 2022-09-26 14:50 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch

On 9/26/22 8:43 AM, Christoph Hellwig wrote:
> On Mon, Sep 26, 2022 at 08:41:38AM -0600, Jens Axboe wrote:
>> Sure, I don't really care. What name do you want for it?
>
> Maybe slow and fast? Or simple and meta?

So you want 'result' in a named struct too then? Because right now it
looks like this:

struct nvme_uring_cmd_pdu {
    union {
        struct bio *bio;
        struct request *req;
    };
    u32 meta_len;
    u32 nvme_status;
    union {
        struct {
            void *meta; /* kernel-resident buffer */
            void __user *meta_buffer;
        };
        u64 result;
    };
};

Or just the union named so it's clear it's a union? That'd make it

pdu->u.meta

and so forth. I think that might be cleaner.

--
Jens Axboe

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions
2022-09-26 14:50 ` Jens Axboe
@ 2022-09-26 14:52 ` Christoph Hellwig
2022-09-26 14:54 ` Jens Axboe
0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2022-09-26 14:52 UTC (permalink / raw)
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, linux-scsi, linux-nvme, Stefan Roesch

On Mon, Sep 26, 2022 at 08:50:41AM -0600, Jens Axboe wrote:
> Or just the union named so it's clear it's a union? That'd make it
>
> pdu->u.meta
>
> and so forth. I think that might be cleaner.

Ok, that's at least a bit of a warning sign.

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions
2022-09-26 14:52 ` Christoph Hellwig
@ 2022-09-26 14:54 ` Jens Axboe
0 siblings, 0 replies; 25+ messages in thread
From: Jens Axboe @ 2022-09-26 14:54 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-block, linux-scsi, linux-nvme, Stefan Roesch

On 9/26/22 8:52 AM, Christoph Hellwig wrote:
> On Mon, Sep 26, 2022 at 08:50:41AM -0600, Jens Axboe wrote:
>> Or just the union named so it's clear it's a union? That'd make it
>>
>> pdu->u.meta
>>
>> and so forth. I think that might be cleaner.
>
> Ok, that's at least a bit of a warning sign.

I'll go with that.

--
Jens Axboe

^ permalink raw reply [flat|nested] 25+ messages in thread
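[For readers following the thread: a sketch of the layout this exchange converges on, i.e. the diff from the earlier message with the trailing union given the name "u". Treat it as an illustration of the agreement reached here, not necessarily the final committed code.]

/*
 * Sketch of the agreed-upon shape: the trailing union is named "u", so
 * accesses read pdu->u.meta, pdu->u.meta_buffer and pdu->u.result,
 * making the union explicit at every use site.
 */
struct nvme_uring_cmd_pdu {
    union {
        struct bio *bio;
        struct request *req;
    };
    u32 meta_len;
    u32 nvme_status;
    union {
        struct {
            void *meta;               /* kernel-resident buffer */
            void __user *meta_buffer;
        };
        u64 result;
    } u;
};

The non-metadata completion path then ends up with something like io_uring_cmd_done(ioucmd, pdu->nvme_status, pdu->u.result).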
end of thread, other threads:[~2022-09-28 17:05 UTC | newest]

Thread overview: 25+ messages
2022-09-27 1:44 [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Jens Axboe
2022-09-27 1:44 ` [PATCH 1/5] block: enable batched allocation for blk_mq_alloc_request() Jens Axboe
2022-09-28 13:38 ` Anuj gupta
2022-09-27 1:44 ` [PATCH 2/5] block: change request end_io handler to pass back a return value Jens Axboe
2022-09-27 1:44 ` [PATCH 3/5] block: allow end_io based requests in the completion batch handling Jens Axboe
2022-09-28 13:42 ` Anuj gupta
2022-09-27 1:44 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe
2022-09-27 7:50 ` Christoph Hellwig
2022-09-28 13:51 ` Anuj gupta
2022-09-28 14:47 ` Sagi Grimberg
2022-09-27 1:44 ` [PATCH 5/5] nvme: enable batched completions of passthrough IO Jens Axboe
2022-09-28 13:55 ` Anuj gupta
2022-09-28 14:47 ` Sagi Grimberg
2022-09-28 13:23 ` [PATCHSET v2 0/5] Enable alloc caching and batched freeing for passthrough Anuj gupta
2022-09-28 14:22 ` Jens Axboe
2022-09-28 17:05 ` Keith Busch

-- strict thread matches above, loose matches on Subject: below --
2022-09-22 18:28 [PATCHSET " Jens Axboe
2022-09-22 18:28 ` [PATCH 4/5] nvme: split out metadata vs non metadata end_io uring_cmd completions Jens Axboe
2022-09-23 15:21 ` Christoph Hellwig
2022-09-23 20:52 ` Jens Axboe
2022-09-26 14:41 ` Christoph Hellwig
2022-09-26 14:41 ` Jens Axboe
2022-09-26 14:43 ` Christoph Hellwig
2022-09-26 14:50 ` Jens Axboe
2022-09-26 14:52 ` Christoph Hellwig
2022-09-26 14:54 ` Jens Axboe