linux-block.vger.kernel.org archive mirror
From: Ming Lei <ming.lei@redhat.com>
To: JeffleXu <jefflexu@linux.alibaba.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Mike Snitzer <snitzer@redhat.com>,
	dm-devel@redhat.com
Subject: Re: [RFC PATCH 08/11] block: use per-task poll context to implement bio based io poll
Date: Wed, 17 Mar 2021 10:54:24 +0800
Message-ID: <YFFvYH6dRBoqARF6@T590>
In-Reply-To: <3848e80d-e7ad-9372-c96f-d32bfb9f0ae5@linux.alibaba.com>

On Tue, Mar 16, 2021 at 04:52:36PM +0800, JeffleXu wrote:
> 
> 
> On 3/16/21 3:17 PM, Ming Lei wrote:
> > On Tue, Mar 16, 2021 at 02:46:08PM +0800, JeffleXu wrote:
> >> It is great progress to gather all split bios that need to be polled
> >> into a per-task queue. Still, some comments below.
> >>
> >>
> >> On 3/16/21 11:15 AM, Ming Lei wrote:
> >>> Currently bio based IO poll needs to poll all hw queues blindly, which
> >>> is very inefficient; the big reason is that we can't pass the bio
> >>> submission result to the io poll task.
> >>>
> >>> In the IO submission context, store the associated underlying bios into the
> >>> submission queue and save the 'cookie' poll data in bio->bi_iter.bi_private_data,
> >>> and return current->pid to the caller of submit_bio() for any DM or bio based
> >>> driver's IO submitted from the FS.
> >>>
> >>> In the IO poll context, the passed cookie tells us the PID of the submission
> >>> context, so we can find the bios from that submission context. Move the
> >>> bios from the submission queue to the poll queue of the poll context, and
> >>> keep polling until these bios are ended. Remove a bio from the poll queue
> >>> once the bio is ended. Add BIO_DONE and BIO_END_BY_POLL for this purpose.
> >>>
> >>> Usually submission shares its context with io poll. The per-task poll context
> >>> is just like a stack variable, and it is cheap to move data between the two
> >>> per-task queues.
> >>>
> >>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> >>> ---
> >>>  block/bio.c               |   5 ++
> >>>  block/blk-core.c          |  74 +++++++++++++++++-
> >>>  block/blk-mq.c            | 156 +++++++++++++++++++++++++++++++++++++-
> >>>  include/linux/blk_types.h |   3 +
> >>>  4 files changed, 235 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/block/bio.c b/block/bio.c
> >>> index a1c4d2900c7a..bcf5eca0e8e3 100644
> >>> --- a/block/bio.c
> >>> +++ b/block/bio.c
> >>> @@ -1402,6 +1402,11 @@ static inline bool bio_remaining_done(struct bio *bio)
> >>>   **/
> >>>  void bio_endio(struct bio *bio)
> >>>  {
> >>> +	/* BIO_END_BY_POLL has to be set before calling submit_bio */
> >>> +	if (bio_flagged(bio, BIO_END_BY_POLL)) {
> >>> +		bio_set_flag(bio, BIO_DONE);
> >>> +		return;
> >>> +	}
> >>>  again:
> >>>  	if (!bio_remaining_done(bio))
> >>>  		return;
> >>> diff --git a/block/blk-core.c b/block/blk-core.c
> >>> index a082bbc856fb..970b23fa2e6e 100644
> >>> --- a/block/blk-core.c
> >>> +++ b/block/blk-core.c
> >>> @@ -854,6 +854,40 @@ static inline void blk_bio_poll_preprocess(struct request_queue *q,
> >>>  		bio->bi_opf |= REQ_TAG;
> >>>  }
> >>>  
> >>> +static bool blk_bio_poll_prep_submit(struct io_context *ioc, struct bio *bio)
> >>> +{
> >>> +	struct blk_bio_poll_data data = {
> >>> +		.bio	=	bio,
> >>> +	};
> >>> +	struct blk_bio_poll_ctx *pc = ioc->data;
> >>> +	unsigned int queued;
> >>> +
> >>> +	/* lock is required if there is more than one writer */
> >>> +	if (unlikely(atomic_read(&ioc->nr_tasks) > 1)) {
> >>> +		spin_lock(&pc->lock);
> >>> +		queued = kfifo_put(&pc->sq, data);
> >>> +		spin_unlock(&pc->lock);
> >>> +	} else {
> >>> +		queued = kfifo_put(&pc->sq, data);
> >>> +	}
> >>> +
> >>> +	/*
> >>> +	 * Now the bio is added per-task fifo, mark it as END_BY_POLL,
> >>> +	 * so we can save cookie into this bio after submit_bio().
> >>> +	 */
> >>> +	if (queued)
> >>> +		bio_set_flag(bio, BIO_END_BY_POLL);
> >>> +	else
> >>> +		bio->bi_opf &= ~(REQ_HIPRI | REQ_TAG);
> >>> +
> >>> +	return queued;
> >>> +}
> >>
> >> The size of the kfifo is limited, and it seems that once the sq of the kfifo
> >> is full, the REQ_HIPRI flag is cleared and the corresponding bio is actually
> >> enqueued into the default hw queue, which is IRQ driven.
> > 
> > Yeah, this patch starts with a queue depth of 64, and we can increase it to
> > 128, which should cover most cases.
> 
> It seems that the queue depth of the kfifo affects performance, according to
> a quick test I did.
> 
> 
> 
> Test Result:
> 
> BLK_BIO_POLL_SQ_SZ | iodepth | IOPS
> ------------------ | ------- | ----
> 64                 | 128     | 301k (IRQ) -> 340k (iopoll)
> 64                 | 16      | 304k (IRQ) -> 392k (iopoll)
> 128                | 128     | 204k (IRQ) -> 317k (iopoll)
> 256                | 128     | 241k (IRQ) -> 391k (iopoll)
> 
> It seems that BLK_BIO_POLL_SQ_SZ needs to be increased accordingly when
> the iodepth is quite large. But I don't know why the performance in IRQ mode
> decreases when BLK_BIO_POLL_SQ_SZ is increased.

This patchset is not supposed to affect IRQ mode because HIPRI isn't set
in IRQ mode. Or do you mean that '--hipri' and io_uring are set up, but
nvme.poll_queues is set to 0, in your 'IRQ' mode test?
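
Just to make the distinction concrete: the poll path in this series is only
entered for bios carrying REQ_HIPRI, and everything else keeps the normal
IRQ-driven completion (as the quoted hunk shows, REQ_HIPRI is also cleared
when the per-task fifo is full). A simplified sketch of that gate, purely
illustrative and not the actual patch code:

	/* sketch only: without REQ_HIPRI the bio never enters the per-task
	 * submission fifo, so plain IRQ-driven IO is left untouched */
	if (!(bio->bi_opf & REQ_HIPRI))
		return BLK_QC_T_NONE;	/* completed via the normal IRQ path */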

Thanks for starting to run performance tests; so far I have only run tests
in KVM and have not started performance testing yet.
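
For anyone who wants to experiment with the idea outside the kernel, here is
a toy user-space model of the data flow the commit message above describes:
submission stores entries in a bounded per-task submission queue (falling
back to IRQ completion when it is full), and the poll side drains that queue
into a poll queue and reaps the completed entries. All toy_* names are
invented for illustration; the real patch uses a kfifo inside the per-task
io_context, and the poll cookie is the submitting task's pid.

/*
 * Toy model of the per-task submission/poll queues. Illustrative only,
 * not kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

#define SQ_SZ	64			/* models BLK_BIO_POLL_SQ_SZ */

struct toy_bio {
	int	id;
	bool	done;			/* models BIO_DONE */
};

struct toy_poll_ctx {
	struct toy_bio	*sq[SQ_SZ];	/* submission queue, filled at submit time */
	unsigned int	sq_head, sq_tail;
	struct toy_bio	*pq[SQ_SZ];	/* poll queue, drained by the poller */
	unsigned int	nr_pq;
};

/* Submission side: queue the bio for polling; when the queue is full the
 * real patch clears REQ_HIPRI and lets the bio complete via IRQ instead. */
static bool toy_submit(struct toy_poll_ctx *pc, struct toy_bio *bio)
{
	if (pc->sq_tail - pc->sq_head >= SQ_SZ)
		return false;		/* caller falls back to IRQ completion */
	pc->sq[pc->sq_tail++ % SQ_SZ] = bio;
	return true;
}

/* Poll side: move everything from the submission queue to the poll queue,
 * then reap the entries whose underlying IO has completed. */
static int toy_poll(struct toy_poll_ctx *pc)
{
	int reaped = 0;
	unsigned int i = 0;

	while (pc->sq_head != pc->sq_tail)
		pc->pq[pc->nr_pq++] = pc->sq[pc->sq_head++ % SQ_SZ];

	while (i < pc->nr_pq) {
		if (pc->pq[i]->done) {	/* BIO_DONE: the bio can finally be ended */
			pc->pq[i] = pc->pq[--pc->nr_pq];
			reaped++;
		} else {
			i++;
		}
	}
	return reaped;
}

int main(void)
{
	struct toy_poll_ctx pc = { 0 };
	struct toy_bio a = { .id = 1 }, b = { .id = 2 };

	toy_submit(&pc, &a);
	toy_submit(&pc, &b);
	a.done = true;			/* pretend the driver completed bio 'a' */
	printf("reaped %d of 2 bios\n", toy_poll(&pc));
	return 0;
}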



thanks,
Ming



Thread overview: 24+ messages
2021-03-16  3:15 [RFC PATCH 00/11] block: support bio based io polling Ming Lei
2021-03-16  3:15 ` [RFC PATCH 01/11] block: add helper of blk_queue_poll Ming Lei
2021-03-16  3:26   ` Chaitanya Kulkarni
2021-03-16  3:15 ` [RFC PATCH 02/11] block: add one helper to free io_context Ming Lei
2021-03-16  3:15 ` [RFC PATCH 03/11] block: add helper of blk_create_io_context Ming Lei
2021-03-16  3:15 ` [RFC PATCH 04/11] block: create io poll context for submission and poll task Ming Lei
2021-03-16  3:15 ` [RFC PATCH 05/11] block: add req flag of REQ_TAG Ming Lei
2021-03-16  3:15 ` [RFC PATCH 06/11] block: add new field into 'struct bvec_iter' Ming Lei
2021-03-16  3:15 ` [RFC PATCH 07/11] block/mq: extract one helper function polling hw queue Ming Lei
2021-03-16  3:15 ` [RFC PATCH 08/11] block: use per-task poll context to implement bio based io poll Ming Lei
2021-03-16  6:46   ` JeffleXu
2021-03-16  7:17     ` Ming Lei
2021-03-16  8:52       ` JeffleXu
2021-03-17  2:54         ` Ming Lei [this message]
2021-03-17  3:53           ` JeffleXu
2021-03-17  6:54             ` Ming Lei
2021-03-16 11:00       ` JeffleXu
2021-03-17  3:38         ` Ming Lei
2021-03-17  3:49         ` JeffleXu
2021-03-17  7:19           ` Ming Lei
2021-03-18 14:51             ` Mike Snitzer
2021-03-16  3:15 ` [RFC PATCH 09/11] block: add queue_to_disk() to get gendisk from request_queue Ming Lei
2021-03-16  3:15 ` [RFC PATCH 10/11] block: add poll_capable method to support bio-based IO polling Ming Lei
2021-03-16  3:15 ` [RFC PATCH 11/11] dm: support IO polling for bio-based dm device Ming Lei
