* [PATCH V4 0/3] blk-mq: issue request directly for blk_insert_cloned_request
@ 2018-01-15 16:58 Ming Lei
2018-01-15 16:58 ` [PATCH V4 1/3] blk-mq: move actual issue into one helper Ming Lei
` (3 more replies)
0 siblings, 4 replies; 22+ messages in thread
From: Ming Lei @ 2018-01-15 16:58 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Mike Snitzer, Christoph Hellwig, Ming Lei
Hi Guys,
These 3 patches change the blk-mq part of blk_insert_cloned_request():
we switch to blk_mq_try_issue_directly(), so that both dm-rq and blk-mq
can see the dispatch result of the underlying queue. With this
information, blk-mq can handle IO merging much better, which improves
sequential I/O performance significantly.
In my dm-mpath over virtio-scsi test, this whole patchset improves
sequential IO by 3X ~ 5X.
V4:
- remove dm patches which are in DM tree already
- cleanup __blk_mq_issue_req as suggested by Jens
V3:
- rebase on the latest for-4.16/block of block tree
- add missed pg_init_all_paths() in patch 1, according to Bart's review
V2:
- drop 'dm-mpath: cache ti->clone during requeue', which is a bit
too complicated and did not show an obvious performance improvement
- make change on blk-mq part cleaner
Ming Lei (3):
blk-mq: move actual issue into one helper
blk-mq: return dispatch result to caller in blk_mq_try_issue_directly
blk-mq: issue request directly for blk_insert_cloned_request
block/blk-mq.c | 85 +++++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 64 insertions(+), 21 deletions(-)
--
2.9.5
^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Ming Lei @ 2018-01-15 16:58 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Mike Snitzer, Christoph Hellwig, Ming Lei

No functional change, just to clean up code a bit, so that the following
change of using direct issue for blk_mq_request_bypass_insert() which is
needed by DM can be easier to do.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index edb1291a42c5..bf8d6651f40e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
 	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
 }
 
-static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-					struct request *rq,
-					blk_qc_t *cookie)
+static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
+				       struct request *rq,
+				       blk_qc_t *new_cookie)
 {
+	blk_status_t ret;
 	struct request_queue *q = rq->q;
 	struct blk_mq_queue_data bd = {
 		.rq = rq,
 		.last = true,
 	};
+
+	if (!blk_mq_get_driver_tag(rq, NULL, false))
+		return BLK_STS_AGAIN;
+
+	if (!blk_mq_get_dispatch_budget(hctx)) {
+		blk_mq_put_driver_tag(rq);
+		return BLK_STS_AGAIN;
+	}
+
+	*new_cookie = request_to_qc_t(hctx, rq);
+
+	ret = q->mq_ops->queue_rq(hctx, &bd);
+
+	return ret;
+}
+
+static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+					struct request *rq,
+					blk_qc_t *cookie)
+{
+	struct request_queue *q = rq->q;
 	blk_qc_t new_cookie;
 	blk_status_t ret;
 	bool run_queue = true;
@@ -1718,22 +1740,15 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (q->elevator)
 		goto insert;
 
-	if (!blk_mq_get_driver_tag(rq, NULL, false))
+	ret = __blk_mq_issue_req(hctx, rq, &new_cookie);
+	if (ret == BLK_STS_AGAIN)
 		goto insert;
 
-	if (!blk_mq_get_dispatch_budget(hctx)) {
-		blk_mq_put_driver_tag(rq);
-		goto insert;
-	}
-
-	new_cookie = request_to_qc_t(hctx, rq);
-
 	/*
 	 * For OK queue, we are done. For error, kill it. Any other
 	 * error (busy), just add it to our list as we previously
 	 * would have done
 	 */
-	ret = q->mq_ops->queue_rq(hctx, &bd);
 	switch (ret) {
 	case BLK_STS_OK:
 		*cookie = new_cookie;
-- 
2.9.5
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Mike Snitzer @ 2018-01-15 17:15 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Mon, Jan 15 2018 at 11:58am -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> No functional change, just to clean up code a bit, so that the following
> change of using direct issue for blk_mq_request_bypass_insert() which is
> needed by DM can be easier to do.
>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
>  1 file changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index edb1291a42c5..bf8d6651f40e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
>  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
>  }
>
> -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> -					struct request *rq,
> -					blk_qc_t *cookie)
> +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> +				       struct request *rq,
> +				       blk_qc_t *new_cookie)
> {
> +	blk_status_t ret;
> 	struct request_queue *q = rq->q;
> 	struct blk_mq_queue_data bd = {
> 		.rq = rq,
> 		.last = true,
> 	};
> +
> +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> +		return BLK_STS_AGAIN;
> +
> +	if (!blk_mq_get_dispatch_budget(hctx)) {
> +		blk_mq_put_driver_tag(rq);
> +		return BLK_STS_AGAIN;
> +	}
> +
> +	*new_cookie = request_to_qc_t(hctx, rq);
> +
> +	ret = q->mq_ops->queue_rq(hctx, &bd);
> +
> +	return ret;
> +}
> +
> +static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> +					struct request *rq,
> +					blk_qc_t *cookie)
> +{
> +	struct request_queue *q = rq->q;
> 	blk_qc_t new_cookie;
> 	blk_status_t ret;
> 	bool run_queue = true;
> @@ -1718,22 +1740,15 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> 	if (q->elevator)
> 		goto insert;
>
> -	if (!blk_mq_get_driver_tag(rq, NULL, false))
> +	ret = __blk_mq_issue_req(hctx, rq, &new_cookie);
> +	if (ret == BLK_STS_AGAIN)
> 		goto insert;

You're (ab)using BLK_STS_AGAIN as a means to an end...
But what if in the future q->mq_ops->queue_rq() returns BLK_STS_AGAIN?

Mike
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Ming Lei @ 2018-01-16 1:36 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Mon, Jan 15, 2018 at 12:15:47PM -0500, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 11:58am -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > No functional change, just to clean up code a bit, so that the following
> > change of using direct issue for blk_mq_request_bypass_insert() which is
> > needed by DM can be easier to do.
> >
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
> >  1 file changed, 27 insertions(+), 12 deletions(-)
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index edb1291a42c5..bf8d6651f40e 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
> >  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
> >  }
> >
> > -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > -					struct request *rq,
> > -					blk_qc_t *cookie)
> > +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> > +				       struct request *rq,
> > +				       blk_qc_t *new_cookie)
> > {
> > +	blk_status_t ret;
> > 	struct request_queue *q = rq->q;
> > 	struct blk_mq_queue_data bd = {
> > 		.rq = rq,
> > 		.last = true,
> > 	};
> > +
> > +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > +		return BLK_STS_AGAIN;
> > +
> > +	if (!blk_mq_get_dispatch_budget(hctx)) {
> > +		blk_mq_put_driver_tag(rq);
> > +		return BLK_STS_AGAIN;
> > +	}
> > +
> > +	*new_cookie = request_to_qc_t(hctx, rq);
> > +
> > +	ret = q->mq_ops->queue_rq(hctx, &bd);
> > +
> > +	return ret;
> > +}
> > +
> > +static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > +					struct request *rq,
> > +					blk_qc_t *cookie)
> > +{
> > +	struct request_queue *q = rq->q;
> > 	blk_qc_t new_cookie;
> > 	blk_status_t ret;
> > 	bool run_queue = true;
> > @@ -1718,22 +1740,15 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > 	if (q->elevator)
> > 		goto insert;
> >
> > -	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > +	ret = __blk_mq_issue_req(hctx, rq, &new_cookie);
> > +	if (ret == BLK_STS_AGAIN)
> > 		goto insert;
>
> You're (ab)using BLK_STS_AGAIN as a means to an end...
> But what if in the future q->mq_ops->queue_rq() returns BLK_STS_AGAIN?

Yeah, but for now only dio uses that, so I'll let Jens decide.

-- 
Ming
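Mike's objection can be made concrete outside the kernel. The following stand-alone C sketch (toy names and types, not kernel code) shows why reusing a driver-visible status value as the helper's internal "couldn't get tag/budget" sentinel is fragile, and one way of carrying the internal case out of band:

```c
#include <stdbool.h>

enum status { STS_OK, STS_RESOURCE, STS_AGAIN };

/* Toy helper in the shape of __blk_mq_issue_req(): STS_AGAIN is returned
 * both when resources (tag/budget) can't be acquired and when the driver
 * itself answers STS_AGAIN -- the caller can't tell the cases apart. */
static enum status issue(bool got_resources, enum status driver_ret)
{
	if (!got_resources)
		return STS_AGAIN;	/* internal sentinel */
	return driver_ret;		/* driver's own answer */
}

/* One possible fix: report the internal "no resources" case out of band,
 * so every status value the caller sees really came from the driver. */
static enum status issue_fixed(bool got_resources, enum status driver_ret,
			       bool *no_resources)
{
	*no_resources = !got_resources;
	if (!got_resources)
		return STS_RESOURCE;
	return driver_ret;
}
```

With issue(), the call issue(false, STS_OK) and the call issue(true, STS_AGAIN) return the same value; with issue_fixed(), the caller inspects *no_resources before interpreting the status.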
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Jens Axboe @ 2018-01-15 17:29 UTC (permalink / raw)
To: Ming Lei, linux-block; +Cc: Mike Snitzer, Christoph Hellwig

On 1/15/18 9:58 AM, Ming Lei wrote:
> No functional change, just to clean up code a bit, so that the following
> change of using direct issue for blk_mq_request_bypass_insert() which is
> needed by DM can be easier to do.
>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
>  1 file changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index edb1291a42c5..bf8d6651f40e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
>  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
>  }
>
> -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> -					struct request *rq,
> -					blk_qc_t *cookie)
> +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> +				       struct request *rq,
> +				       blk_qc_t *new_cookie)
> {
> +	blk_status_t ret;
> 	struct request_queue *q = rq->q;
> 	struct blk_mq_queue_data bd = {
> 		.rq = rq,
> 		.last = true,
> 	};
> +
> +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> +		return BLK_STS_AGAIN;
> +
> +	if (!blk_mq_get_dispatch_budget(hctx)) {
> +		blk_mq_put_driver_tag(rq);
> +		return BLK_STS_AGAIN;
> +	}
> +
> +	*new_cookie = request_to_qc_t(hctx, rq);
> +
> +	ret = q->mq_ops->queue_rq(hctx, &bd);
> +
> +	return ret;

return q->mq_ops->queue_rq(hctx, &bd);

and kill 'ret', it's not used. But more importantly, who puts the
driver tag and the budget if we get != OK for ->queue_rq()?

-- 
Jens Axboe
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Mike Snitzer @ 2018-01-15 19:41 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, linux-block, Christoph Hellwig

On Mon, Jan 15 2018 at 12:29pm -0500,
Jens Axboe <axboe@kernel.dk> wrote:

> On 1/15/18 9:58 AM, Ming Lei wrote:
> > No functional change, just to clean up code a bit, so that the following
> > change of using direct issue for blk_mq_request_bypass_insert() which is
> > needed by DM can be easier to do.
> >
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
> >  1 file changed, 27 insertions(+), 12 deletions(-)
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index edb1291a42c5..bf8d6651f40e 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
> >  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
> >  }
> >
> > -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > -					struct request *rq,
> > -					blk_qc_t *cookie)
> > +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> > +				       struct request *rq,
> > +				       blk_qc_t *new_cookie)
> > {
> > +	blk_status_t ret;
> > 	struct request_queue *q = rq->q;
> > 	struct blk_mq_queue_data bd = {
> > 		.rq = rq,
> > 		.last = true,
> > 	};
> > +
> > +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > +		return BLK_STS_AGAIN;
> > +
> > +	if (!blk_mq_get_dispatch_budget(hctx)) {
> > +		blk_mq_put_driver_tag(rq);
> > +		return BLK_STS_AGAIN;
> > +	}
> > +
> > +	*new_cookie = request_to_qc_t(hctx, rq);
> > +
> > +	ret = q->mq_ops->queue_rq(hctx, &bd);
> > +
> > +	return ret;
>
> return q->mq_ops->queue_rq(hctx, &bd);
>
> and kill 'ret', it's not used.

Yeap, good point.

> But more importantly, who puts the
> driver tag and the budget if we get != OK for ->queue_rq()?

__blk_mq_try_issue_directly() processes the returned value same as before
this patch.  Means this patch isn't making any functional change:
If BLK_STS_RESOURCE: __blk_mq_requeue_request() is called.
__blk_mq_requeue_request() will blk_mq_put_driver_tag().
Otherwise, all other errors result in blk_mq_end_request(rq, ret);

So ignoring this patch, are you concerned that:

1) Does blk_mq_end_request() put both?  Looks like blk_mq_free_request()
handles rq->tag != -1 but why not have it use __blk_mq_put_driver_tag()?
I'm not seeing where the budget is put from blk_mq_end_request()...

2) Nothing seems to be putting the budget in
__blk_mq_try_issue_directly()'s BLK_STS_RESOURCE error path?  I share
your concern now (for drivers who set {get,put}_budget in mq_ops).
Should __blk_mq_requeue_request() be updated to also
blk_mq_put_dispatch_budget()?

Mike
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Ming Lei @ 2018-01-16 1:43 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Mon, Jan 15, 2018 at 02:41:12PM -0500, Mike Snitzer wrote:
> On Mon, Jan 15 2018 at 12:29pm -0500,
> Jens Axboe <axboe@kernel.dk> wrote:
>
> > On 1/15/18 9:58 AM, Ming Lei wrote:
> > > No functional change, just to clean up code a bit, so that the following
> > > change of using direct issue for blk_mq_request_bypass_insert() which is
> > > needed by DM can be easier to do.
> > >
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > ---
> > >  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
> > >  1 file changed, 27 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index edb1291a42c5..bf8d6651f40e 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
> > >  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
> > >  }
> > >
> > > -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > > -					struct request *rq,
> > > -					blk_qc_t *cookie)
> > > +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> > > +				       struct request *rq,
> > > +				       blk_qc_t *new_cookie)
> > > {
> > > +	blk_status_t ret;
> > > 	struct request_queue *q = rq->q;
> > > 	struct blk_mq_queue_data bd = {
> > > 		.rq = rq,
> > > 		.last = true,
> > > 	};
> > > +
> > > +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > > +		return BLK_STS_AGAIN;
> > > +
> > > +	if (!blk_mq_get_dispatch_budget(hctx)) {
> > > +		blk_mq_put_driver_tag(rq);
> > > +		return BLK_STS_AGAIN;
> > > +	}
> > > +
> > > +	*new_cookie = request_to_qc_t(hctx, rq);
> > > +
> > > +	ret = q->mq_ops->queue_rq(hctx, &bd);
> > > +
> > > +	return ret;
> >
> > return q->mq_ops->queue_rq(hctx, &bd);
> >
> > and kill 'ret', it's not used.
>
> Yeap, good point.
>
> > But more importantly, who puts the
> > driver tag and the budget if we get != OK for ->queue_rq()?
>
> __blk_mq_try_issue_directly() processes the returned value same as before
> this patch.  Means this patch isn't making any functional change:
> If BLK_STS_RESOURCE: __blk_mq_requeue_request() is called.
> __blk_mq_requeue_request() will blk_mq_put_driver_tag().
> Otherwise, all other errors result in blk_mq_end_request(rq, ret);
>
> So ignoring this patch, are you concerned that:
> 1) Does blk_mq_end_request() put both?  Looks like blk_mq_free_request()
> handles rq->tag != -1 but why not have it use __blk_mq_put_driver_tag()?
> I'm not seeing where the budget is put from blk_mq_end_request()...

blk_mq_free_request() releases driver tag, and budget is owned by driver
once .queue_rq is called.

> 2) Nothing seems to be putting the budget in
> __blk_mq_try_issue_directly()'s BLK_STS_RESOURCE error path?  I share
> your concern now (for drivers who set {get,put}_budget in mq_ops).
> Should __blk_mq_requeue_request() be updated to also
> blk_mq_put_dispatch_budget()?

No, at least it is current protocol of using budget, please see
scsi_mq_queue_rq() and comment of blk_mq_do_dispatch_sched().

-- 
Ming
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Mike Snitzer @ 2018-01-16 1:45 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Mon, Jan 15 2018 at 8:43pm -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Mon, Jan 15, 2018 at 02:41:12PM -0500, Mike Snitzer wrote:
> > On Mon, Jan 15 2018 at 12:29pm -0500,
> > Jens Axboe <axboe@kernel.dk> wrote:
> >
> > > On 1/15/18 9:58 AM, Ming Lei wrote:
> > > > No functional change, just to clean up code a bit, so that the following
> > > > change of using direct issue for blk_mq_request_bypass_insert() which is
> > > > needed by DM can be easier to do.
> > > >
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
> > > >  1 file changed, 27 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > > index edb1291a42c5..bf8d6651f40e 100644
> > > > --- a/block/blk-mq.c
> > > > +++ b/block/blk-mq.c
> > > > @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
> > > >  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
> > > >  }
> > > >
> > > > -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > > > -					struct request *rq,
> > > > -					blk_qc_t *cookie)
> > > > +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> > > > +				       struct request *rq,
> > > > +				       blk_qc_t *new_cookie)
> > > > {
> > > > +	blk_status_t ret;
> > > > 	struct request_queue *q = rq->q;
> > > > 	struct blk_mq_queue_data bd = {
> > > > 		.rq = rq,
> > > > 		.last = true,
> > > > 	};
> > > > +
> > > > +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > > > +		return BLK_STS_AGAIN;
> > > > +
> > > > +	if (!blk_mq_get_dispatch_budget(hctx)) {
> > > > +		blk_mq_put_driver_tag(rq);
> > > > +		return BLK_STS_AGAIN;
> > > > +	}
> > > > +
> > > > +	*new_cookie = request_to_qc_t(hctx, rq);
> > > > +
> > > > +	ret = q->mq_ops->queue_rq(hctx, &bd);
> > > > +
> > > > +	return ret;
> > >
> > > return q->mq_ops->queue_rq(hctx, &bd);
> > >
> > > and kill 'ret', it's not used.
> >
> > Yeap, good point.
> >
> > > But more importantly, who puts the
> > > driver tag and the budget if we get != OK for ->queue_rq()?
> >
> > __blk_mq_try_issue_directly() processes the returned value same as before
> > this patch.  Means this patch isn't making any functional change:
> > If BLK_STS_RESOURCE: __blk_mq_requeue_request() is called.
> > __blk_mq_requeue_request() will blk_mq_put_driver_tag().
> > Otherwise, all other errors result in blk_mq_end_request(rq, ret);
> >
> > So ignoring this patch, are you concerned that:
> > 1) Does blk_mq_end_request() put both?  Looks like blk_mq_free_request()
> > handles rq->tag != -1 but why not have it use __blk_mq_put_driver_tag()?
> > I'm not seeing where the budget is put from blk_mq_end_request()...
>
> blk_mq_free_request() releases driver tag, and budget is owned by driver
> once .queue_rq is called.
>
> > 2) Nothing seems to be putting the budget in
> > __blk_mq_try_issue_directly()'s BLK_STS_RESOURCE error path?  I share
> > your concern now (for drivers who set {get,put}_budget in mq_ops).
> > Should __blk_mq_requeue_request() be updated to also
> > blk_mq_put_dispatch_budget()?
>
> No, at least it is current protocol of using budget, please see
> scsi_mq_queue_rq() and comment of blk_mq_do_dispatch_sched().

Yeap, thanks for clarifying.
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Ming Lei @ 2018-01-16 1:40 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Mike Snitzer, Christoph Hellwig

On Mon, Jan 15, 2018 at 10:29:46AM -0700, Jens Axboe wrote:
> On 1/15/18 9:58 AM, Ming Lei wrote:
> > No functional change, just to clean up code a bit, so that the following
> > change of using direct issue for blk_mq_request_bypass_insert() which is
> > needed by DM can be easier to do.
> >
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
> >  1 file changed, 27 insertions(+), 12 deletions(-)
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index edb1291a42c5..bf8d6651f40e 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
> >  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
> >  }
> >
> > -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > -					struct request *rq,
> > -					blk_qc_t *cookie)
> > +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
> > +				       struct request *rq,
> > +				       blk_qc_t *new_cookie)
> > {
> > +	blk_status_t ret;
> > 	struct request_queue *q = rq->q;
> > 	struct blk_mq_queue_data bd = {
> > 		.rq = rq,
> > 		.last = true,
> > 	};
> > +
> > +	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > +		return BLK_STS_AGAIN;
> > +
> > +	if (!blk_mq_get_dispatch_budget(hctx)) {
> > +		blk_mq_put_driver_tag(rq);
> > +		return BLK_STS_AGAIN;
> > +	}
> > +
> > +	*new_cookie = request_to_qc_t(hctx, rq);
> > +
> > +	ret = q->mq_ops->queue_rq(hctx, &bd);
> > +
> > +	return ret;
>
> return q->mq_ops->queue_rq(hctx, &bd);
>
> and kill 'ret', it's not used. But more importantly, who puts the

OK.

> driver tag and the budget if we get != OK for ->queue_rq()?

For the budget, the current protocol is that driver is responsible for the
release once .queue_rq is called, see scsi_mq_queue_rq() and comment in
blk_mq_do_dispatch_sched().

For driver tag, it is released in blk_mq_free_request(), which is done
in failure handling of != OK.

-- 
Ming
* Re: [PATCH V4 1/3] blk-mq: move actual issue into one helper
From: Ming Lei @ 2018-01-16 4:05 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Mike Snitzer, Christoph Hellwig

On Tue, Jan 16, 2018 at 9:40 AM, Ming Lei <ming.lei@redhat.com> wrote:
> On Mon, Jan 15, 2018 at 10:29:46AM -0700, Jens Axboe wrote:
>> On 1/15/18 9:58 AM, Ming Lei wrote:
>> > No functional change, just to clean up code a bit, so that the following
>> > change of using direct issue for blk_mq_request_bypass_insert() which is
>> > needed by DM can be easier to do.
>> >
>> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> > ---
>> >  block/blk-mq.c | 39 +++++++++++++++++++++++++++------------
>> >  1 file changed, 27 insertions(+), 12 deletions(-)
>> >
>> > diff --git a/block/blk-mq.c b/block/blk-mq.c
>> > index edb1291a42c5..bf8d6651f40e 100644
>> > --- a/block/blk-mq.c
>> > +++ b/block/blk-mq.c
>> > @@ -1696,15 +1696,37 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
>> >  	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
>> >  }
>> >
>> > -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>> > -					struct request *rq,
>> > -					blk_qc_t *cookie)
>> > +static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
>> > +				       struct request *rq,
>> > +				       blk_qc_t *new_cookie)
>> > {
>> > +	blk_status_t ret;
>> > 	struct request_queue *q = rq->q;
>> > 	struct blk_mq_queue_data bd = {
>> > 		.rq = rq,
>> > 		.last = true,
>> > 	};
>> > +
>> > +	if (!blk_mq_get_driver_tag(rq, NULL, false))
>> > +		return BLK_STS_AGAIN;
>> > +
>> > +	if (!blk_mq_get_dispatch_budget(hctx)) {
>> > +		blk_mq_put_driver_tag(rq);
>> > +		return BLK_STS_AGAIN;
>> > +	}
>> > +
>> > +	*new_cookie = request_to_qc_t(hctx, rq);
>> > +
>> > +	ret = q->mq_ops->queue_rq(hctx, &bd);
>> > +
>> > +	return ret;
>>
>> return q->mq_ops->queue_rq(hctx, &bd);
>>
>> and kill 'ret', it's not used. But more importantly, who puts the
>
> OK.
>
>> driver tag and the budget if we get != OK for ->queue_rq()?
>
> For the budget, the current protocol is that driver is responsible for the
> release once .queue_rq is called, see scsi_mq_queue_rq() and comment in
> blk_mq_do_dispatch_sched().
>
> For driver tag, it is released in blk_mq_free_request(), which is done
> in failure handling of != OK.

Actually the driver tag should be released here when .queue_rq() returns
!OK, and this patch does follow the current logic:

- call __blk_mq_requeue_request() to release tag and make it ready for
  requeue if STS_RESOURCE is returned.

- For other !OK cases, let driver free the request.

-- 
Ming Lei
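The ownership rules spelled out in this exchange — the driver owns the budget once ->queue_rq() is called, while the driver tag is put by the requeue path on BLK_STS_RESOURCE and by freeing the request on other errors — can be sketched as a stand-alone C toy model (illustrative names only, not kernel code). The counters let one check that every acquire is balanced by exactly one release on every outcome:

```c
#include <stdbool.h>

enum status { STS_OK, STS_RESOURCE, STS_IOERR };

static int tags_held;		/* driver tags currently acquired */
static int budgets_held;	/* dispatch budgets currently acquired */

static bool get_driver_tag(void) { tags_held++; return true; }
static void put_driver_tag(void) { tags_held--; }
static bool get_budget(void)     { budgets_held++; return true; }

/* Toy ->queue_rq(): once called, the driver owns the budget and releases
 * it itself on every outcome (the protocol attributed above to
 * scsi_mq_queue_rq()). */
static enum status driver_queue_rq(enum status outcome)
{
	budgets_held--;		/* driver consumes or returns the budget */
	return outcome;
}

/* Shape of __blk_mq_issue_req(): tag first, then budget, then issue. */
static enum status issue_req(enum status driver_outcome)
{
	if (!get_driver_tag())
		return STS_RESOURCE;
	if (!get_budget()) {
		put_driver_tag();
		return STS_RESOURCE;
	}
	return driver_queue_rq(driver_outcome);
}

/* Shape of the caller: the tag is put by the requeue path on
 * STS_RESOURCE, and via freeing the request otherwise (immediately on
 * error, at completion on STS_OK). */
static enum status try_issue(enum status driver_outcome)
{
	enum status ret = issue_req(driver_outcome);

	switch (ret) {
	case STS_RESOURCE:
		put_driver_tag();	/* like __blk_mq_requeue_request() */
		break;
	default:
		put_driver_tag();	/* like blk_mq_free_request() */
		break;
	}
	return ret;
}
```

After any call to try_issue(), both counters return to zero, which is the invariant the thread is arguing about: no outcome may leak a tag or a budget.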
* [PATCH V4 2/3] blk-mq: return dispatch result to caller in blk_mq_try_issue_directly
From: Ming Lei @ 2018-01-15 16:58 UTC (permalink / raw)
To: Jens Axboe, linux-block; +Cc: Mike Snitzer, Christoph Hellwig, Ming Lei

In the following patch, we will use blk_mq_try_issue_directly() for DM
to return the dispatch result, and DM needs this information to improve
IO merge.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index bf8d6651f40e..1654c9c284d8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1722,13 +1722,13 @@ static blk_status_t __blk_mq_issue_req(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-					struct request *rq,
-					blk_qc_t *cookie)
+static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+						struct request *rq,
+						blk_qc_t *cookie)
 {
 	struct request_queue *q = rq->q;
 	blk_qc_t new_cookie;
-	blk_status_t ret;
+	blk_status_t ret = BLK_STS_OK;
 	bool run_queue = true;
 
 	/* RCU or SRCU read lock is needed before checking quiesced flag */
@@ -1752,31 +1752,36 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	switch (ret) {
 	case BLK_STS_OK:
 		*cookie = new_cookie;
-		return;
+		return ret;
 	case BLK_STS_RESOURCE:
 		__blk_mq_requeue_request(rq);
 		goto insert;
 	default:
 		*cookie = BLK_QC_T_NONE;
 		blk_mq_end_request(rq, ret);
-		return;
+		return ret;
 	}
 
 insert:
 	blk_mq_sched_insert_request(rq, false, run_queue, false,
 					hctx->flags & BLK_MQ_F_BLOCKING);
+	return ret;
 }
 
-static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-				      struct request *rq, blk_qc_t *cookie)
+static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+					      struct request *rq,
+					      blk_qc_t *cookie)
 {
 	int srcu_idx;
+	blk_status_t ret;
 
 	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
 
 	hctx_lock(hctx, &srcu_idx);
-	__blk_mq_try_issue_directly(hctx, rq, cookie);
+	ret = __blk_mq_try_issue_directly(hctx, rq, cookie);
 	hctx_unlock(hctx, srcu_idx);
+
+	return ret;
 }
 
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
-- 
2.9.5
* [PATCH V4 3/3] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-15 16:58 [PATCH V4 0/3] blk-mq: issue request directly for blk_insert_cloned_request Ming Lei 2018-01-15 16:58 ` [PATCH V4 1/3] blk-mq: move actual issue into one helper Ming Lei 2018-01-15 16:58 ` [PATCH V4 2/3] blk-mq: return dispatch result to caller in blk_mq_try_issue_directly Ming Lei @ 2018-01-15 16:58 ` Ming Lei 2018-01-16 1:34 ` Mike Snitzer 2018-01-15 17:43 ` [PATCH V4 0/3] " Mike Snitzer 3 siblings, 1 reply; 22+ messages in thread From: Ming Lei @ 2018-01-15 16:58 UTC (permalink / raw) To: Jens Axboe, linux-block; +Cc: Mike Snitzer, Christoph Hellwig, Ming Lei blk_insert_cloned_request() is called in fast path of dm-rq driver, and in this function we append request to hctx->dispatch_list of the underlying queue directly. 1) This way isn't efficient enough because hctx lock is always required 2) With blk_insert_cloned_request(), we bypass underlying queue's IO scheduler totally, and depend on DM rq driver to do IO schedule completely. But DM rq driver can't get underlying queue's dispatch feedback at all, and this information is extreamly useful for IO merge. Without that IO merge can't be done basically by blk-mq, and causes very bad sequential IO performance. This patch makes use of blk_mq_try_issue_directly() to dispatch rq to underlying queue and provides disptch result to dm-rq and blk-mq, and improves the above situations very much. With this patch, seqential IO is improved by 3X in my test over mpath/virtio-scsi. 
Signed-off-by: Ming Lei <ming.lei@redhat.com> --- block/blk-mq.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 1654c9c284d8..ce3965f271e3 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1730,6 +1730,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, blk_qc_t new_cookie; blk_status_t ret = BLK_STS_OK; bool run_queue = true; + /* + * This function is called from blk_insert_cloned_request() if + * 'cookie' is NULL, and for dispatching this request only. + */ + bool dispatch_only = !cookie; /* RCU or SRCU read lock is needed before checking quiesced flag */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { @@ -1737,10 +1742,17 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, goto insert; } - if (q->elevator) + if (q->elevator && !dispatch_only) goto insert; ret = __blk_mq_issue_req(hctx, rq, &new_cookie); + if (dispatch_only) { + if (ret == BLK_STS_AGAIN) + return BLK_STS_RESOURCE; + if (ret == BLK_STS_RESOURCE) + __blk_mq_requeue_request(rq); + return ret; + } if (ret == BLK_STS_AGAIN) goto insert; @@ -1763,8 +1775,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, } insert: - blk_mq_sched_insert_request(rq, false, run_queue, false, - hctx->flags & BLK_MQ_F_BLOCKING); + if (!dispatch_only) + blk_mq_sched_insert_request(rq, false, run_queue, false, + hctx->flags & BLK_MQ_F_BLOCKING); + else + blk_mq_request_bypass_insert(rq, run_queue); return ret; } @@ -1784,6 +1799,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } +blk_status_t blk_mq_request_direct_issue(struct request *rq) +{ + struct blk_mq_ctx *ctx = rq->mq_ctx; + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); + + return blk_mq_try_issue_directly(hctx, rq, NULL); +} + static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { const int 
is_sync = op_is_sync(bio->bi_opf); -- 2.9.5 ^ permalink raw reply related [flat|nested] 22+ messages in thread
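The dispatch-only control flow this patch adds to __blk_mq_try_issue_directly() can be modeled in plain C. The sketch below is illustrative only, not kernel code: the enums, the struct, and the helper name are invented stand-ins for the real blk-mq API (BLK_STS_* values, driver-tag and budget accounting), and it follows the branch structure of the diff above under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for blk_status_t values and insertion outcomes. */
enum sts { STS_OK, STS_RESOURCE, STS_IOERR };
enum action {
    ISSUED,            /* ->queue_rq() accepted the request */
    INSERT_SCHED,      /* fall back to blk_mq_sched_insert_request() */
    INSERT_BYPASS,     /* fall back to blk_mq_request_bypass_insert() */
    FEEDBACK_RESOURCE, /* dispatch-only: report BLK_STS_RESOURCE to caller */
    ENDED_WITH_ERROR,  /* request killed via blk_mq_end_request() */
};

struct hctx_model {
    bool stopped_or_quiesced; /* hctx stopped or queue quiesced */
    bool has_elevator;        /* underlying queue has an IO scheduler */
    bool tag_available;       /* getting a driver tag would succeed */
    bool budget_available;    /* getting dispatch budget would succeed */
    enum sts queue_rq_result; /* what ->queue_rq() would return */
};

/* dispatch_only corresponds to being called with a NULL cookie, i.e. via
 * blk_mq_request_direct_issue() from blk_insert_cloned_request(). */
static enum action try_issue_directly(const struct hctx_model *h,
                                      bool dispatch_only)
{
    if (h->stopped_or_quiesced)
        return dispatch_only ? INSERT_BYPASS : INSERT_SCHED;

    /* A cloned request must bypass the underlying elevator; only the
     * regular make_request path defers to it. */
    if (h->has_elevator && !dispatch_only)
        return INSERT_SCHED;

    if (!h->tag_available || !h->budget_available)
        return dispatch_only ? FEEDBACK_RESOURCE : INSERT_SCHED;

    switch (h->queue_rq_result) {
    case STS_OK:
        return ISSUED;
    case STS_RESOURCE:
        /* request gets requeued; dispatch-only callers also receive the
         * BLK_STS_RESOURCE feedback dm-rq needs for IO merging */
        return dispatch_only ? FEEDBACK_RESOURCE : INSERT_SCHED;
    default:
        return ENDED_WITH_ERROR;
    }
}
```

The key difference from the pre-patch behavior is the FEEDBACK_RESOURCE outcome: instead of silently appending to the dispatch list, resource exhaustion is propagated back up to the dm-rq caller.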
* Re: [PATCH V4 3/3] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-15 16:58 ` [PATCH V4 3/3] blk-mq: issue request directly for blk_insert_cloned_request Ming Lei @ 2018-01-16 1:34 ` Mike Snitzer 0 siblings, 0 replies; 22+ messages in thread From: Mike Snitzer @ 2018-01-16 1:34 UTC (permalink / raw) To: Ming Lei; +Cc: Jens Axboe, linux-block, Christoph Hellwig On Mon, Jan 15 2018 at 11:58am -0500, Ming Lei <ming.lei@redhat.com> wrote: > blk_insert_cloned_request() is called in fast path of dm-rq driver, and > in this function we append request to hctx->dispatch_list of the underlying > queue directly. > > 1) This way isn't efficient enough because hctx lock is always required > > 2) With blk_insert_cloned_request(), we bypass underlying queue's IO scheduler > totally, and depend on DM rq driver to do IO schedule completely. But DM > rq driver can't get underlying queue's dispatch feedback at all, and this > information is extreamly useful for IO merge. Without that IO merge can't > be done basically by blk-mq, and causes very bad sequential IO performance. > > This patch makes use of blk_mq_try_issue_directly() to dispatch rq to > underlying queue and provides disptch result to dm-rq and blk-mq, and > improves the above situations very much. > > With this patch, seqential IO is improved by 3X in my test over > mpath/virtio-scsi. > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > --- > block/blk-mq.c | 29 ++++++++++++++++++++++++++--- > 1 file changed, 26 insertions(+), 3 deletions(-) Hey Ming, I also just noticed your V4 of this patch only includes the block/blk-mq.c changes. 
You're missing: block/blk-core.c | 3 +-- block/blk-mq.h | 3 +++ drivers/md/dm-rq.c | 19 +++++++++++++--- Please let Jens know if you're OK with my V4, tagged here: https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/tag/?h=for-block-4.16/dm-changes-2 And also detailed in this message from earlier in this thread: https://marc.info/?l=linux-block&m=151603824725438&w=2 Or please generate V5 of your series. Hopefully it'd include the header I revised for this 3/3 patch, see: https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-block-4.16/dm-changes-2&id=d86beab5712a8f18123011487dee797a1e3a07e1 We also need to address the issues Jens noticed and I looked at a bit closer: https://marc.info/?l=linux-block&m=151604528127707&w=2 (those issues might need fixing first, and marked for stable@, and the series rebased on top of it?) Thanks! Mike ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V4 0/3] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-15 16:58 [PATCH V4 0/3] blk-mq: issue request directly for blk_insert_cloned_request Ming Lei ` (2 preceding siblings ...) 2018-01-15 16:58 ` [PATCH V4 3/3] blk-mq: issue request directly for blk_insert_cloned_request Ming Lei @ 2018-01-15 17:43 ` Mike Snitzer 2018-01-16 1:57 ` Ming Lei 3 siblings, 1 reply; 22+ messages in thread From: Mike Snitzer @ 2018-01-15 17:43 UTC (permalink / raw) To: Ming Lei, axboe; +Cc: Jens Axboe, linux-block, Christoph Hellwig On Mon, Jan 15 2018 at 11:58am -0500, Ming Lei <ming.lei@redhat.com> wrote: > Hi Guys, > > The 3 paches changes the blk-mq part of blk_insert_cloned_request(), > in which we switch to blk_mq_try_issue_directly(), so that both dm-rq > and blk-mq can get the dispatch result of underlying queue, and with > this information, blk-mq can handle IO merge much better, then > sequential I/O performance is improved much. > > In my dm-mpath over virtio-scsi test, this whole patchset improves > sequential IO by 3X ~ 5X. > > V4: > - remove dm patches which are in DM tree already > - cleanup __blk_mq_issue_req as suggested by Jens > Ming, You dropped the header cleanups that I did in v3 ("blk-mq: issue request directly for blk_insert_cloned_request" being the one header I care about being updated). I also worked in parallel on my own v4 to address Jens' dislike for v3's 3 returns. 
Happy to yield to Jens to decide which he prefers. Jens, if you'd like to pick my variant of v4 up they are here, thanks! The following changes since commit c100ec49fdd2222836ff8a17c7bfcc7611d2ee2b: dm: fix incomplete request_queue initialization (2018-01-15 08:54:32 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git tags/for-block-4.16/dm-changes-2 for you to fetch changes up to d86beab5712a8f18123011487dee797a1e3a07e1: blk-mq: issue request directly for blk_insert_cloned_request (2018-01-15 12:40:44 -0500) ---------------------------------------------------------------- - Ming's blk-mq improvements to blk_insert_cloned_request(), which is used exclusively by request-based DM's blk-mq mode, that enable substantial dm-mpath sequential IO performance improvements. ---------------------------------------------------------------- Ming Lei (2): blk-mq: return dispatch result from blk_mq_try_issue_directly blk-mq: issue request directly for blk_insert_cloned_request block/blk-core.c | 3 +-- block/blk-mq.c | 65 +++++++++++++++++++++++++++++++++++++++++------------- block/blk-mq.h | 3 +++ drivers/md/dm-rq.c | 19 +++++++++++++--- 4 files changed, 70 insertions(+), 20 deletions(-) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V4 0/3] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-15 17:43 ` [PATCH V4 0/3] " Mike Snitzer @ 2018-01-16 1:57 ` Ming Lei 2018-01-16 15:01 ` [for-4.16 PATCH v4-mike 1/2] blk-mq: return dispatch result from blk_mq_try_issue_directly Mike Snitzer 2018-01-16 15:01 ` [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request Mike Snitzer 0 siblings, 2 replies; 22+ messages in thread From: Ming Lei @ 2018-01-16 1:57 UTC (permalink / raw) To: Mike Snitzer; +Cc: axboe, Jens Axboe, linux-block, Christoph Hellwig On Mon, Jan 15, 2018 at 12:43:44PM -0500, Mike Snitzer wrote: > On Mon, Jan 15 2018 at 11:58am -0500, > Ming Lei <ming.lei@redhat.com> wrote: > > > Hi Guys, > > > > The 3 paches changes the blk-mq part of blk_insert_cloned_request(), > > in which we switch to blk_mq_try_issue_directly(), so that both dm-rq > > and blk-mq can get the dispatch result of underlying queue, and with > > this information, blk-mq can handle IO merge much better, then > > sequential I/O performance is improved much. > > > > In my dm-mpath over virtio-scsi test, this whole patchset improves > > sequential IO by 3X ~ 5X. > > > > V4: > > - remove dm patches which are in DM tree already > > - cleanup __blk_mq_issue_req as suggested by Jens > > > > Ming, > > You dropped the header cleanups that I did in v3 ("blk-mq: issue request > directly for blk_insert_cloned_request") being the one header I care > about being updated). > > I also worked in parallel on my own v4 to address Jens' dislike for v3's > 3 returns. 
But I skinned the cat a different way, by dropping your > first patch that introduces the __blk_mq_issue_req helper, please see > these 2 commits: > > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=block-4.16&id=40f8947784128bb83dc5f7a6aed7ed230222f675 > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=block-4.16&id=f641015f42f41df75220313ee62e8241f2feeeed > > I think it makes the changes more obvious (not spread across 2 methods > and doesn't require use of BLK_STS_AGAIN). > > Happy to yield to Jens to decide which he prefers. > > Jens, if you'd like to pick my variant of v4 up they are here, thanks! Hi Mike, Looks we were working on it at the same time, I am fine with your V4. Jens, please let us know if Mike's V4 is OK, if not, we can make V5/.., :-) Thanks, Ming ^ permalink raw reply [flat|nested] 22+ messages in thread
* [for-4.16 PATCH v4-mike 1/2] blk-mq: return dispatch result from blk_mq_try_issue_directly 2018-01-16 1:57 ` Ming Lei @ 2018-01-16 15:01 ` Mike Snitzer 2018-01-16 15:01 ` [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request Mike Snitzer 1 sibling, 0 replies; 22+ messages in thread From: Mike Snitzer @ 2018-01-16 15:01 UTC (permalink / raw) To: axboe; +Cc: Ming Lei, hch, dm-devel, linux-block From: Ming Lei <ming.lei@redhat.com> In the following patch, we will use blk_mq_try_issue_directly() for DM to return the dispatch result. DM needs this information to improve IO merging. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- block/blk-mq.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index c8f62e6be6b6..55f3a27fb2e6 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1694,9 +1694,9 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq) return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true); } -static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, - blk_qc_t *cookie) +static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie) { struct request_queue *q = rq->q; struct blk_mq_queue_data bd = { @@ -1704,7 +1704,7 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, .last = true, }; blk_qc_t new_cookie; - blk_status_t ret; + blk_status_t ret = BLK_STS_OK; bool run_queue = true; /* RCU or SRCU read lock is needed before checking quiesced flag */ @@ -1735,31 +1735,36 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, switch (ret) { case BLK_STS_OK: *cookie = new_cookie; - return; + return ret; case BLK_STS_RESOURCE: __blk_mq_requeue_request(rq); goto insert; default: *cookie = BLK_QC_T_NONE; blk_mq_end_request(rq, ret); - return; + return 
ret; } insert: blk_mq_sched_insert_request(rq, false, run_queue, false, hctx->flags & BLK_MQ_F_BLOCKING); + return ret; } -static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, - struct request *rq, blk_qc_t *cookie) +static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, + struct request *rq, + blk_qc_t *cookie) { int srcu_idx; + blk_status_t ret; might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING); hctx_lock(hctx, &srcu_idx); - __blk_mq_try_issue_directly(hctx, rq, cookie); + ret = __blk_mq_try_issue_directly(hctx, rq, cookie); hctx_unlock(hctx, srcu_idx); + + return ret; } static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) -- 2.15.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-16 1:57 ` Ming Lei 2018-01-16 15:01 ` [for-4.16 PATCH v4-mike 1/2] blk-mq: return dispatch result from blk_mq_try_issue_directly Mike Snitzer @ 2018-01-16 15:01 ` Mike Snitzer 2018-01-16 16:41 ` Mike Snitzer 2018-01-16 17:20 ` Jens Axboe 1 sibling, 2 replies; 22+ messages in thread From: Mike Snitzer @ 2018-01-16 15:01 UTC (permalink / raw) To: axboe; +Cc: Ming Lei, hch, dm-devel, linux-block From: Ming Lei <ming.lei@redhat.com> blk_insert_cloned_request() is called in the fast path of the dm-rq driver, and in this function we append the request directly to the hctx->dispatch_list of the underlying queue. 1) This way isn't efficient enough because the hctx lock is always required 2) With blk_insert_cloned_request(), we bypass the underlying queue's IO scheduler totally, and depend on the DM rq driver to do IO scheduling completely. But the DM rq driver can't get the underlying queue's dispatch feedback at all, and this information is extremely useful for IO merge. Without it, IO merge basically can't be done by blk-mq, which causes very bad sequential IO performance. Fix this by having blk_insert_cloned_request() make use of blk_mq_try_issue_directly() via blk_mq_request_direct_issue(). blk_mq_request_direct_issue() allows a request to be issued directly to the underlying queue and provides the dispatch result to dm-rq and blk-mq. With this, the DM's blk-mq sequential IO performance is vastly improved (as much as 3X in mpath/virtio-scsi testing). 
Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- block/blk-core.c | 3 +-- block/blk-mq.c | 42 ++++++++++++++++++++++++++++++++++++------ block/blk-mq.h | 3 +++ drivers/md/dm-rq.c | 19 ++++++++++++++++--- 4 files changed, 56 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 7ba607527487..55f338020254 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2500,8 +2500,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * * bypass a potential scheduler on the bottom device for * insert. */ - blk_mq_request_bypass_insert(rq, true); - return BLK_STS_OK; + return blk_mq_request_direct_issue(rq); } spin_lock_irqsave(q->queue_lock, flags); diff --git a/block/blk-mq.c b/block/blk-mq.c index 55f3a27fb2e6..3168a13cb012 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, blk_qc_t new_cookie; blk_status_t ret = BLK_STS_OK; bool run_queue = true; + /* + * If @cookie is NULL do not insert the request, this mode is used + * by blk_insert_cloned_request() via blk_mq_request_direct_issue() + */ + bool dispatch_only = !cookie; + bool need_insert = false; /* RCU or SRCU read lock is needed before checking quiesced flag */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { @@ -1713,25 +1719,38 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, goto insert; } - if (q->elevator) + if (q->elevator && !dispatch_only) goto insert; if (!blk_mq_get_driver_tag(rq, NULL, false)) - goto insert; + need_insert = true; - if (!blk_mq_get_dispatch_budget(hctx)) { + if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) { blk_mq_put_driver_tag(rq); + need_insert = true; + } + + if (need_insert) { + if (dispatch_only) + return BLK_STS_RESOURCE; goto insert; } new_cookie = request_to_qc_t(hctx, rq); + ret = q->mq_ops->queue_rq(hctx, &bd); + + if 
(dispatch_only) { + if (ret == BLK_STS_RESOURCE) + __blk_mq_requeue_request(rq); + return ret; + } + /* * For OK queue, we are done. For error, kill it. Any other * error (busy), just add it to our list as we previously * would have done */ - ret = q->mq_ops->queue_rq(hctx, &bd); switch (ret) { case BLK_STS_OK: *cookie = new_cookie; @@ -1746,8 +1765,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, } insert: - blk_mq_sched_insert_request(rq, false, run_queue, false, - hctx->flags & BLK_MQ_F_BLOCKING); + if (!dispatch_only) + blk_mq_sched_insert_request(rq, false, run_queue, false, + hctx->flags & BLK_MQ_F_BLOCKING); + else + blk_mq_request_bypass_insert(rq, run_queue); return ret; } @@ -1767,6 +1789,14 @@ static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, return ret; } +blk_status_t blk_mq_request_direct_issue(struct request *rq) +{ + struct blk_mq_ctx *ctx = rq->mq_ctx; + struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu); + + return blk_mq_try_issue_directly(hctx, rq, NULL); +} + static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio) { const int is_sync = op_is_sync(bio->bi_opf); diff --git a/block/blk-mq.h b/block/blk-mq.h index 8591a54d989b..e3ebc93646ca 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -74,6 +74,9 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); +/* Used by blk_insert_cloned_request() to issue request directly */ +blk_status_t blk_mq_request_direct_issue(struct request *rq); + /* * CPU -> queue mappings */ diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index c28357f5cb0e..e0d84b17c1cd 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -395,7 +395,7 @@ static void end_clone_request(struct request *clone, blk_status_t error) dm_complete_request(tio->orig, error); } -static void dm_dispatch_clone_request(struct 
request *clone, struct request *rq) +static blk_status_t dm_dispatch_clone_request(struct request *clone, struct request *rq) { blk_status_t r; @@ -404,9 +404,10 @@ static void dm_dispatch_clone_request(struct request *clone, struct request *rq) clone->start_time = jiffies; r = blk_insert_cloned_request(clone->q, clone); - if (r) + if (r != BLK_STS_OK && r != BLK_STS_RESOURCE) /* must complete clone in terms of original request */ dm_complete_request(rq, r); + return r; } static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig, @@ -476,8 +477,10 @@ static int map_request(struct dm_rq_target_io *tio) struct mapped_device *md = tio->md; struct request *rq = tio->orig; struct request *clone = NULL; + blk_status_t ret; r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone); + check_again: switch (r) { case DM_MAPIO_SUBMITTED: /* The target has taken the I/O to submit by itself later */ @@ -492,7 +495,17 @@ static int map_request(struct dm_rq_target_io *tio) /* The target has remapped the I/O so dispatch it */ trace_block_rq_remap(clone->q, clone, disk_devt(dm_disk(md)), blk_rq_pos(rq)); - dm_dispatch_clone_request(clone, rq); + ret = dm_dispatch_clone_request(clone, rq); + if (ret == BLK_STS_RESOURCE) { + blk_rq_unprep_clone(clone); + tio->ti->type->release_clone_rq(clone); + tio->clone = NULL; + if (!rq->q->mq_ops) + r = DM_MAPIO_DELAY_REQUEUE; + else + r = DM_MAPIO_REQUEUE; + goto check_again; + } break; case DM_MAPIO_REQUEUE: /* The target wants to requeue the I/O */ -- 2.15.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
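On the dm-rq side, the map_request() change in the diff above boils down to one decision: when the direct issue reports resource exhaustion, release the clone and requeue the original request, using a delayed requeue when the top-level DM queue is the legacy (non-blk-mq) request path. A hedged sketch of that decision, with invented names standing in for the DM_MAPIO_* and BLK_STS_* constants:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins; not the real DM_MAPIO_* / BLK_STS_* values. */
enum dispatch_sts { DISPATCH_OK, DISPATCH_RESOURCE, DISPATCH_ERROR };
enum mapio { MAPIO_REMAPPED, MAPIO_REQUEUE, MAPIO_DELAY_REQUEUE };

/* Models what map_request() does with the new dispatch feedback: on
 * resource exhaustion the clone is released and the original request
 * requeued; a legacy (non-blk-mq) top-level DM queue delays the requeue
 * so it doesn't busy-loop on an exhausted underlying queue. */
static enum mapio handle_clone_dispatch(enum dispatch_sts r,
                                        bool dm_queue_is_mq)
{
    if (r != DISPATCH_RESOURCE)
        return MAPIO_REMAPPED; /* issued, or already completed with error */
    return dm_queue_is_mq ? MAPIO_REQUEUE : MAPIO_DELAY_REQUEUE;
}
```

In the actual patch this corresponds to the `if (ret == BLK_STS_RESOURCE)` block added to map_request(), which also unpreps and releases the clone before jumping back to the requeue handling.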
* Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-16 15:01 ` [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request Mike Snitzer @ 2018-01-16 16:41 ` Mike Snitzer 2018-01-16 17:20 ` Jens Axboe 1 sibling, 0 replies; 22+ messages in thread From: Mike Snitzer @ 2018-01-16 16:41 UTC (permalink / raw) To: axboe; +Cc: linux-block, Ming Lei, dm-devel, hch On Tue, Jan 16 2018 at 10:01P -0500, Mike Snitzer <snitzer@redhat.com> wrote: > From: Ming Lei <ming.lei@redhat.com> > > blk_insert_cloned_request() is called in fast path of dm-rq driver, and > in this function we append request to hctx->dispatch_list of the underlying > queue directly. > > 1) This way isn't efficient enough because hctx lock is always required > > 2) With blk_insert_cloned_request(), we bypass underlying queue's IO > scheduler totally, and depend on DM rq driver to do IO schedule > completely. But DM rq driver can't get underlying queue's dispatch > feedback at all, and this information is extreamly useful for IO merge. > Without that IO merge can't be done basically by blk-mq, which causes > very bad sequential IO performance. > > Fix this by having blk_insert_cloned_request() make use of > blk_mq_try_issue_directly() via blk_mq_request_direct_issue(). > blk_mq_request_direct_issue() allows a request to be dispatched to be > issue directly to the underlying queue and provides dispatch result to > dm-rq and blk-mq. > > With this, the DM's blk-mq sequential IO performance is vastly > improved (as much as 3X in mpath/virtio-scsi testing). 
> > Signed-off-by: Ming Lei <ming.lei@redhat.com> > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > block/blk-core.c | 3 +-- > block/blk-mq.c | 42 ++++++++++++++++++++++++++++++++++++------ > block/blk-mq.h | 3 +++ > drivers/md/dm-rq.c | 19 ++++++++++++++++--- > 4 files changed, 56 insertions(+), 11 deletions(-) > > diff --git a/block/blk-core.c b/block/blk-core.c > index 7ba607527487..55f338020254 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -2500,8 +2500,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request * > * bypass a potential scheduler on the bottom device for > * insert. > */ > - blk_mq_request_bypass_insert(rq, true); > - return BLK_STS_OK; > + return blk_mq_request_direct_issue(rq); > } > > spin_lock_irqsave(q->queue_lock, flags); > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 55f3a27fb2e6..3168a13cb012 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, > blk_qc_t new_cookie; > blk_status_t ret = BLK_STS_OK; > bool run_queue = true; > + /* > + * If @cookie is NULL do not insert the request, this mode is used > + * by blk_insert_cloned_request() via blk_mq_request_direct_issue() > + */ > + bool dispatch_only = !cookie; > + bool need_insert = false; > > /* RCU or SRCU read lock is needed before checking quiesced flag */ > if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { > @@ -1713,25 +1719,38 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, > goto insert; > } > > - if (q->elevator) > + if (q->elevator && !dispatch_only) > goto insert; > > if (!blk_mq_get_driver_tag(rq, NULL, false)) > - goto insert; > + need_insert = true; > > - if (!blk_mq_get_dispatch_budget(hctx)) { > + if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) { > blk_mq_put_driver_tag(rq); > + need_insert = true; > + } > + > + if (need_insert) { > + if (dispatch_only) > + return 
BLK_STS_RESOURCE; > goto insert; > } > > new_cookie = request_to_qc_t(hctx, rq); > > + ret = q->mq_ops->queue_rq(hctx, &bd); > + > + if (dispatch_only) { > + if (ret == BLK_STS_RESOURCE) > + __blk_mq_requeue_request(rq); > + return ret; > + } > + > /* > * For OK queue, we are done. For error, kill it. Any other > * error (busy), just add it to our list as we previously > * would have done > */ > - ret = q->mq_ops->queue_rq(hctx, &bd); > switch (ret) { > case BLK_STS_OK: > *cookie = new_cookie; FYI, because blk_mq_put_driver_tag() is idempotent, the above can be simplified by eliminating 'need_insert' like so (Jens feel free to fold this in?): diff --git a/block/blk-mq.c b/block/blk-mq.c index 3168a13cb012..c2f66a5c5242 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, * by blk_insert_cloned_request() via blk_mq_request_direct_issue() */ bool dispatch_only = !cookie; - bool need_insert = false; /* RCU or SRCU read lock is needed before checking quiesced flag */ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) { @@ -1722,15 +1721,9 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, if (q->elevator && !dispatch_only) goto insert; - if (!blk_mq_get_driver_tag(rq, NULL, false)) - need_insert = true; - - if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) { + if (!blk_mq_get_driver_tag(rq, NULL, false) || + !blk_mq_get_dispatch_budget(hctx)) { blk_mq_put_driver_tag(rq); - need_insert = true; - } - - if (need_insert) { if (dispatch_only) return BLK_STS_RESOURCE; goto insert; ^ permalink raw reply related [flat|nested] 22+ messages in thread
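The simplification in the follow-up diff above works only because blk_mq_put_driver_tag() is a no-op when no tag is held. A toy model of that property (invented names, not the kernel implementation) shows why a single unconditional put can cover both the failed-tag and failed-budget cases, making the need_insert flag unnecessary:

```c
#include <assert.h>
#include <stdbool.h>

#define NO_TAG (-1)

struct req_model { int tag; };

static bool get_driver_tag(struct req_model *rq, bool tags_left)
{
    rq->tag = tags_left ? 0 : NO_TAG;
    return tags_left;
}

/* Idempotent: releasing when nothing is held is a harmless no-op. */
static void put_driver_tag(struct req_model *rq)
{
    if (rq->tag == NO_TAG)
        return;
    rq->tag = NO_TAG;
}

/* The collapsed check from the diff: whether the tag grab or the budget
 * grab fails, one unconditional put cleans up safely in both cases. */
static bool get_tag_and_budget(struct req_model *rq, bool tags_left,
                               bool budget_left)
{
    if (!get_driver_tag(rq, tags_left) || !budget_left) {
        put_driver_tag(rq); /* safe even if get_driver_tag() failed */
        return false;
    }
    return true;
}
```

If put were not idempotent, the short-circuited `||` would double-release in the no-budget case, which is why the original version tracked need_insert separately.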
* Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-16 15:01 ` [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request Mike Snitzer 2018-01-16 16:41 ` Mike Snitzer @ 2018-01-16 17:20 ` Jens Axboe 2018-01-16 17:38 ` Mike Snitzer 1 sibling, 1 reply; 22+ messages in thread From: Jens Axboe @ 2018-01-16 17:20 UTC (permalink / raw) To: Mike Snitzer; +Cc: Ming Lei, hch, dm-devel, linux-block On 1/16/18 8:01 AM, Mike Snitzer wrote: > From: Ming Lei <ming.lei@redhat.com> > > blk_insert_cloned_request() is called in fast path of dm-rq driver, and > in this function we append request to hctx->dispatch_list of the underlying > queue directly. > > 1) This way isn't efficient enough because hctx lock is always required > > 2) With blk_insert_cloned_request(), we bypass underlying queue's IO > scheduler totally, and depend on DM rq driver to do IO schedule > completely. But DM rq driver can't get underlying queue's dispatch > feedback at all, and this information is extreamly useful for IO merge. > Without that IO merge can't be done basically by blk-mq, which causes > very bad sequential IO performance. > > Fix this by having blk_insert_cloned_request() make use of > blk_mq_try_issue_directly() via blk_mq_request_direct_issue(). > blk_mq_request_direct_issue() allows a request to be dispatched to be > issue directly to the underlying queue and provides dispatch result to > dm-rq and blk-mq. > > With this, the DM's blk-mq sequential IO performance is vastly > improved (as much as 3X in mpath/virtio-scsi testing). This still feels pretty hacky... 
> diff --git a/block/blk-mq.c b/block/blk-mq.c > index 55f3a27fb2e6..3168a13cb012 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, > blk_qc_t new_cookie; > blk_status_t ret = BLK_STS_OK; > bool run_queue = true; > + /* > + * If @cookie is NULL do not insert the request, this mode is used > + * by blk_insert_cloned_request() via blk_mq_request_direct_issue() > + */ > + bool dispatch_only = !cookie; > + bool need_insert = false; Overloading 'cookie' to also mean this isn't very future proof or solid. And now __blk_mq_try_issue_directly() is pretty much a mess. Feels like it should be split in two, where the other half would do the actual insert. Then let the caller do it, if we could not issue directly. That would be a lot more solid and easier to read. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-16 17:20 ` Jens Axboe @ 2018-01-16 17:38 ` Mike Snitzer 2018-01-16 17:41 ` Jens Axboe 0 siblings, 1 reply; 22+ messages in thread From: Mike Snitzer @ 2018-01-16 17:38 UTC (permalink / raw) To: Jens Axboe; +Cc: Ming Lei, hch, dm-devel, linux-block On Tue, Jan 16 2018 at 12:20pm -0500, Jens Axboe <axboe@kernel.dk> wrote: > On 1/16/18 8:01 AM, Mike Snitzer wrote: > > From: Ming Lei <ming.lei@redhat.com> > > > > blk_insert_cloned_request() is called in fast path of dm-rq driver, and > > in this function we append request to hctx->dispatch_list of the underlying > > queue directly. > > > > 1) This way isn't efficient enough because hctx lock is always required > > > > 2) With blk_insert_cloned_request(), we bypass underlying queue's IO > > scheduler totally, and depend on DM rq driver to do IO schedule > > completely. But DM rq driver can't get underlying queue's dispatch > > feedback at all, and this information is extreamly useful for IO merge. > > Without that IO merge can't be done basically by blk-mq, which causes > > very bad sequential IO performance. > > > > Fix this by having blk_insert_cloned_request() make use of > > blk_mq_try_issue_directly() via blk_mq_request_direct_issue(). > > blk_mq_request_direct_issue() allows a request to be dispatched to be > > issue directly to the underlying queue and provides dispatch result to > > dm-rq and blk-mq. > > > > With this, the DM's blk-mq sequential IO performance is vastly > > improved (as much as 3X in mpath/virtio-scsi testing). > > This still feels pretty hacky... 
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index 55f3a27fb2e6..3168a13cb012 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, > > blk_qc_t new_cookie; > > blk_status_t ret = BLK_STS_OK; > > bool run_queue = true; > > + /* > > + * If @cookie is NULL do not insert the request, this mode is used > > + * by blk_insert_cloned_request() via blk_mq_request_direct_issue() > > + */ > > + bool dispatch_only = !cookie; > > + bool need_insert = false; > > Overloading 'cookie' to also mean this isn't very future proof or solid. It enables the existing interface to be used without needing to prop up something else that extends out to the edge (blk_insert_cloned_request). > And now __blk_mq_try_issue_directly() is pretty much a mess. Feels like > it should be split in two, where the other half would do the actual > insert. Then let the caller do it, if we could not issue directly. That > would be a lot more solid and easier to read. That is effectively what Ming's variant did (by splitting out the issue to a helper). BUT I'll see what I can come up with... (Ming please stand down until you hear back from me ;) Thanks, Mike ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request 2018-01-16 17:38 ` Mike Snitzer @ 2018-01-16 17:41 ` Jens Axboe 2018-01-16 18:16 ` Mike Snitzer 0 siblings, 1 reply; 22+ messages in thread From: Jens Axboe @ 2018-01-16 17:41 UTC (permalink / raw) To: Mike Snitzer; +Cc: Ming Lei, hch, dm-devel, linux-block On 1/16/18 10:38 AM, Mike Snitzer wrote: > On Tue, Jan 16 2018 at 12:20pm -0500, > Jens Axboe <axboe@kernel.dk> wrote: > >> On 1/16/18 8:01 AM, Mike Snitzer wrote: >>> From: Ming Lei <ming.lei@redhat.com> >>> >>> blk_insert_cloned_request() is called in fast path of dm-rq driver, and >>> in this function we append request to hctx->dispatch_list of the underlying >>> queue directly. >>> >>> 1) This way isn't efficient enough because hctx lock is always required >>> >>> 2) With blk_insert_cloned_request(), we bypass underlying queue's IO >>> scheduler totally, and depend on DM rq driver to do IO schedule >>> completely. But DM rq driver can't get underlying queue's dispatch >>> feedback at all, and this information is extreamly useful for IO merge. >>> Without that IO merge can't be done basically by blk-mq, which causes >>> very bad sequential IO performance. >>> >>> Fix this by having blk_insert_cloned_request() make use of >>> blk_mq_try_issue_directly() via blk_mq_request_direct_issue(). >>> blk_mq_request_direct_issue() allows a request to be dispatched to be >>> issue directly to the underlying queue and provides dispatch result to >>> dm-rq and blk-mq. >>> >>> With this, the DM's blk-mq sequential IO performance is vastly >>> improved (as much as 3X in mpath/virtio-scsi testing). >> >> This still feels pretty hacky... 
>> >>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>> index 55f3a27fb2e6..3168a13cb012 100644 >>> --- a/block/blk-mq.c >>> +++ b/block/blk-mq.c >>> @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, >>> blk_qc_t new_cookie; >>> blk_status_t ret = BLK_STS_OK; >>> bool run_queue = true; >>> + /* >>> + * If @cookie is NULL do not insert the request, this mode is used >>> + * by blk_insert_cloned_request() via blk_mq_request_direct_issue() >>> + */ >>> + bool dispatch_only = !cookie; >>> + bool need_insert = false; >> >> Overloading 'cookie' to also mean this isn't very future proof or solid. > > It enables the existing interface to be used without needing to prop up > something else that extends out to the edge (blk_insert_cloned_request). Doesn't really matter if the end result is too ugly/fragile to live. >> And now __blk_mq_try_issue_directly() is pretty much a mess. Feels like >> it should be split in two, where the other half would do the actual >> insert. Then let the caller do it, if we could not issue directly. That >> would be a lot more solid and easier to read. > > That is effectively what Ming's variant did (by splitting out the issue > to a helper). > > BUT I'll see what I can come up with... Maybe I missed that version, there were many rapid fire versions posted. Please just take your time and get it right, that's much more important. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request
  2018-01-16 17:41 ` Jens Axboe
@ 2018-01-16 18:16 ` Mike Snitzer
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-16 18:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, hch, dm-devel, linux-block

On Tue, Jan 16 2018 at 12:41pm -0500,
Jens Axboe <axboe@kernel.dk> wrote:

> On 1/16/18 10:38 AM, Mike Snitzer wrote:
> > On Tue, Jan 16 2018 at 12:20pm -0500,
> > Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> On 1/16/18 8:01 AM, Mike Snitzer wrote:
> >>> From: Ming Lei <ming.lei@redhat.com>
> >>>
> >>> blk_insert_cloned_request() is called in the fast path of the dm-rq
> >>> driver, where we currently append the request directly to the
> >>> hctx->dispatch_list of the underlying queue.
> >>>
> >>> 1) This isn't efficient, because the hctx lock is always required.
> >>>
> >>> 2) With blk_insert_cloned_request() we bypass the underlying queue's
> >>> IO scheduler entirely and depend on the DM rq driver to do all IO
> >>> scheduling. But the DM rq driver can't get any dispatch feedback from
> >>> the underlying queue, and this information is extremely useful for IO
> >>> merging. Without it, blk-mq basically can't merge IO, which causes
> >>> very bad sequential IO performance.
> >>>
> >>> Fix this by having blk_insert_cloned_request() make use of
> >>> blk_mq_try_issue_directly() via blk_mq_request_direct_issue().
> >>> blk_mq_request_direct_issue() allows a request to be issued directly
> >>> to the underlying queue and provides the dispatch result to dm-rq
> >>> and blk-mq.
> >>>
> >>> With this, DM's blk-mq sequential IO performance is vastly improved
> >>> (as much as 3X in mpath/virtio-scsi testing).
> >>
> >> This still feels pretty hacky...
> >>
> >>> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >>> index 55f3a27fb2e6..3168a13cb012 100644
> >>> --- a/block/blk-mq.c
> >>> +++ b/block/blk-mq.c
> >>> @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> >>>  	blk_qc_t new_cookie;
> >>>  	blk_status_t ret = BLK_STS_OK;
> >>>  	bool run_queue = true;
> >>> +	/*
> >>> +	 * If @cookie is NULL do not insert the request, this mode is used
> >>> +	 * by blk_insert_cloned_request() via blk_mq_request_direct_issue()
> >>> +	 */
> >>> +	bool dispatch_only = !cookie;
> >>> +	bool need_insert = false;
> >>
> >> Overloading 'cookie' to also mean this isn't very future proof or solid.
> >
> > It enables the existing interface to be used without needing to prop up
> > something else that extends out to the edge (blk_insert_cloned_request).
>
> Doesn't really matter if the end result is too ugly/fragile to live.

Agreed.

> >> And now __blk_mq_try_issue_directly() is pretty much a mess. Feels like
> >> it should be split in two, where the other half would do the actual
> >> insert. Then let the caller do it, if we could not issue directly. That
> >> would be a lot more solid and easier to read.
> >
> > That is effectively what Ming's variant did (by splitting out the issue
> > to a helper).
> >
> > BUT I'll see what I can come up with...
>
> Maybe I missed that version, there were many rapid fire versions posted.
> Please just take your time and get it right, that's much more important.

Not trying to rush, going over it carefully now. Think I have a cleaner
way forward.

Thanks,
Mike
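[Editor's note: the point of returning the dispatch result is what dm-rq does with it. The sketch below models the consumer side; the enum names echo the kernel's BLK_STS_*/DM_MAPIO_* values, but the `dm_mapio_t` typedef, the `bool underlying_busy` parameter, and both function bodies are hypothetical simplifications, not the real dm-rq code.]

```c
#include <stdbool.h>

typedef enum { BLK_STS_OK, BLK_STS_RESOURCE, BLK_STS_IOERR } blk_status_t;
typedef enum { DM_MAPIO_SUBMITTED, DM_MAPIO_REQUEUE, DM_MAPIO_KILL } dm_mapio_t;

/* Stand-in for the new direct-issue path: BLK_STS_RESOURCE models the
 * underlying device having no tag/budget for the clone right now. */
static blk_status_t blk_insert_cloned_request(bool underlying_busy)
{
	return underlying_busy ? BLK_STS_RESOURCE : BLK_STS_OK;
}

/*
 * With the dispatch result visible, dm-rq can requeue the original
 * request when the underlying queue is busy instead of leaving the
 * clone stranded on the dispatch list; the requeued request stays
 * visible to blk-mq, which can then merge later sequential IO into it.
 */
static dm_mapio_t map_request(bool underlying_busy)
{
	switch (blk_insert_cloned_request(underlying_busy)) {
	case BLK_STS_OK:
		return DM_MAPIO_SUBMITTED;
	case BLK_STS_RESOURCE:
		return DM_MAPIO_REQUEUE;	/* back off, retry later */
	default:
		return DM_MAPIO_KILL;		/* fatal: fail the IO */
	}
}
```

This feedback loop is what the cover letter credits for the 3X-5X sequential IO improvement in the dm-mpath over virtio-scsi test.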
end of thread, other threads:[~2018-01-16 18:16 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
2018-01-15 16:58 [PATCH V4 0/3] blk-mq: issue request directly for blk_insert_cloned_request Ming Lei
2018-01-15 16:58 ` [PATCH V4 1/3] blk-mq: move actual issue into one helper Ming Lei
2018-01-15 17:15   ` Mike Snitzer
2018-01-16  1:36     ` Ming Lei
2018-01-15 17:29   ` Jens Axboe
2018-01-15 19:41     ` Mike Snitzer
2018-01-16  1:43       ` Ming Lei
2018-01-16  1:45         ` Mike Snitzer
2018-01-16  1:40     ` Ming Lei
2018-01-16  4:05       ` Ming Lei
2018-01-15 16:58 ` [PATCH V4 2/3] blk-mq: return dispatch result to caller in blk_mq_try_issue_directly Ming Lei
2018-01-15 16:58 ` [PATCH V4 3/3] blk-mq: issue request directly for blk_insert_cloned_request Ming Lei
2018-01-16  1:34   ` Mike Snitzer
2018-01-15 17:43 ` [PATCH V4 0/3] " Mike Snitzer
2018-01-16  1:57   ` Ming Lei
2018-01-16 15:01     ` [for-4.16 PATCH v4-mike 1/2] blk-mq: return dispatch result from blk_mq_try_issue_directly Mike Snitzer
2018-01-16 15:01       ` [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request Mike Snitzer
2018-01-16 16:41         ` Mike Snitzer
2018-01-16 17:20           ` Jens Axboe
2018-01-16 17:38             ` Mike Snitzer
2018-01-16 17:41               ` Jens Axboe
2018-01-16 18:16                 ` Mike Snitzer