From: Ming Lei <ming.lei@redhat.com>
To: JeffleXu <jefflexu@linux.alibaba.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
joseph.qi@linux.alibaba.com, hch@infradead.org
Subject: Re: [PATCH v6] block: disable iopoll for split bio
Date: Wed, 25 Nov 2020 16:19:37 +0800
Message-ID: <20201125081937.GA28463@T590>
In-Reply-To: <f8f52efa-a99e-5fbb-dd92-597a13fd4a2f@linux.alibaba.com>

On Wed, Nov 25, 2020 at 04:05:10PM +0800, JeffleXu wrote:
>
>
> On 11/25/20 3:19 PM, Ming Lei wrote:
> > On Wed, Nov 25, 2020 at 02:41:47PM +0800, Jeffle Xu wrote:
> >> iopoll is initially for small size, latency sensitive IO. It doesn't
> >> work well for big IO, especially when it needs to be split to multiple
> >> bios. In this case, the returned cookie of __submit_bio_noacct_mq() is
> >> indeed the cookie of the last split bio. The completion of *this* last
> >> split bio done by iopoll doesn't mean the whole original bio has
> >> completed. Callers of iopoll still need to wait for completion of other
> >> split bios.
> >>
> >> Besides bio splitting may cause more trouble for iopoll which isn't
> >> supposed to be used in case of big IO.
> >>
> >> iopoll for split bio may cause potential race if CPU migration happens
> >> during bio submission. Since the returned cookie is that of the last
> >> split bio, polling on the corresponding hardware queue doesn't help
> >> complete other split bios, if these split bios are enqueued into
> >> different hardware queues. Since interrupts are disabled for polling
> >> queues, the completion of these other split bios depends on timeout
> >> mechanism, thus causing a potential hang.
> >>
> >> iopoll for split bio may also cause hang for sync polling. Currently
> >> both the blkdev and iomap-based fs (ext4/xfs, etc) support sync polling
> >> in direct IO routine. These routines will submit bio without REQ_NOWAIT
> >> flag set, and then start sync polling in current process context. The
> >> process may hang in blk_mq_get_tag() if the submitted bio has to be
> >> split into multiple bios and can rapidly exhaust the queue depth. The
> >> process are waiting for the completion of the previously allocated
> >> requests, which should be reaped by the following polling, and thus
> >> causing a deadlock.
> >>
> >> To avoid these subtle trouble described above, just disable iopoll for
> >> split bio.
> >>
> >> Suggested-by: Ming Lei <ming.lei@redhat.com>
> >> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> >> Reviewed-by: Christoph Hellwig <hch@lst.de>
> >> ---
> >> block/bio.c | 2 ++
> >> block/blk-merge.c | 12 ++++++++++++
> >> block/blk-mq.c | 3 +++
> >> include/linux/blk_types.h | 1 +
> >> 4 files changed, 18 insertions(+)
> >>
> >> diff --git a/block/bio.c b/block/bio.c
> >> index fa01bef35bb1..7f7ddc22a30d 100644
> >> --- a/block/bio.c
> >> +++ b/block/bio.c
> >> @@ -684,6 +684,8 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src)
> >> bio_set_flag(bio, BIO_CLONED);
> >> if (bio_flagged(bio_src, BIO_THROTTLED))
> >> bio_set_flag(bio, BIO_THROTTLED);
> >> + if (bio_flagged(bio_src, BIO_SPLIT))
> >> + bio_set_flag(bio, BIO_SPLIT);
> >> bio->bi_opf = bio_src->bi_opf;
> >> bio->bi_ioprio = bio_src->bi_ioprio;
> >> bio->bi_write_hint = bio_src->bi_write_hint;
> >> diff --git a/block/blk-merge.c b/block/blk-merge.c
> >> index bcf5e4580603..a2890cebf99f 100644
> >> --- a/block/blk-merge.c
> >> +++ b/block/blk-merge.c
> >> @@ -279,6 +279,18 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> >> return NULL;
> >> split:
> >> *segs = nsegs;
> >> +
> >> + /*
> >> + * Bio splitting may cause subtle trouble such as hang when doing sync
> >> + * iopoll in direct IO routine. Given performance gain of iopoll for
> >> + * big IO can be trival, disable iopoll when split needed. We need
> >> + * BIO_SPLIT to identify bios need this workaround. Since currently
> >> + * only normal IO under mq routine may suffer this issue, BIO_SPLIT is
> >> + * only marked here.
> >> + */
> >> + bio->bi_opf &= ~REQ_HIPRI;
> >> + bio_set_flag(bio, BIO_SPLIT);
> >> +
> >> return bio_split(bio, sectors, GFP_NOIO, bs);
> >> }
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index 55bcee5dc032..ce1f3628e4c2 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -2265,6 +2265,9 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
> >> blk_mq_sched_insert_request(rq, false, true, true);
> >> }
> >>
> >> + if (bio_flagged(bio, BIO_SPLIT))
> >> + return BLK_QC_T_NONE;
> >> +
> >
> > Not sure the new bio flag is really required for this case, just wondering
> > why not take the following simple way? BTW we are really going to run
> > out of bio flag.
> >
>
> Please consider the following case:
>
> One big bio got split into two split bios. At the first call of
> blk_mq_submit_bio(), the input @bio (actually the original big bio)
> indeed gets split. The split bio gets enqueued to hw queue and the
> returned cookie is BLK_QC_T_NONE, while the remained bio gets buffered
> in bio_list. So far so good.

When this bio gets split, REQ_HIPRI is cleared for it, and none of the
resulting split bios will have the flag set either.

>
> Then when calling blk_mq_submit_bio() the second time, the input @bio is
> indeed the remained bio. At this time, it will not get split and you
> will get a *valid* cookie. And since the cookie of last split bio will
> actually overrides the previous cookie, you will get a *valid* cookie as
> a result.

Then a valid cookie can be returned only for a bio with REQ_HIPRI set.

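To make the argument concrete, here is a minimal userspace sketch of the
cookie logic, not kernel code. All model_* names, the MODEL_* constants,
and the sector counts are hypothetical stand-ins made up for illustration.
It assumes the split path clears the HIPRI bit on the bio being split, so
both the split part and the remainder inherit the cleared flag, and that
the submit path hands out a valid cookie only when the bit is still set.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for REQ_HIPRI and BLK_QC_T_NONE (values illustrative). */
#define MODEL_REQ_HIPRI  (1u << 0)
#define MODEL_QC_T_NONE  (-1)

struct model_bio {
	unsigned int opf;      /* models bio->bi_opf */
	unsigned int sectors;  /* size still to be submitted */
};

/*
 * Models the split path: if the bio is larger than max_sectors, clear
 * HIPRI on it first, then carve off the front part. Both the front part
 * and the remainder therefore carry the cleared flag.
 */
static bool model_split(struct model_bio *bio, unsigned int max_sectors,
			struct model_bio *front)
{
	if (bio->sectors <= max_sectors)
		return false;

	bio->opf &= ~MODEL_REQ_HIPRI;
	front->opf = bio->opf;
	front->sectors = max_sectors;
	bio->sectors -= max_sectors;
	return true;
}

/* Models the submit path: a valid cookie only for a bio that still has HIPRI. */
static int model_submit(const struct model_bio *bio, int tag)
{
	return (bio->opf & MODEL_REQ_HIPRI) ? tag : MODEL_QC_T_NONE;
}

/*
 * Models the submission loop: each part is submitted in turn and the
 * cookie of the last submitted bio overrides the previous ones.
 */
static int model_submit_all(struct model_bio bio, unsigned int max_sectors)
{
	struct model_bio front;
	int cookie = MODEL_QC_T_NONE;
	int tag = 0;

	while (model_split(&bio, max_sectors, &front))
		cookie = model_submit(&front, tag++);

	/* The remainder is no longer split, but its HIPRI bit may be gone. */
	cookie = model_submit(&bio, tag++);
	return cookie;
}
```

In this model, a 24-sector HIPRI bio with an 8-sector limit ends up with
MODEL_QC_T_NONE even though the last remainder is not split itself, while
a 4-sector HIPRI bio that never needs splitting gets a valid cookie.
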
Thanks,
Ming