From: Ming Lei <ming.lei@redhat.com>
To: Guoqing Jiang <guoqing.jiang@linux.dev>
Cc: Christoph Hellwig <hch@infradead.org>,
song@kernel.org, linux-raid@vger.kernel.org,
jens@chianterastutte.eu, linux-block@vger.kernel.org
Subject: Re: [PATCH] raid1: ensure bio doesn't have more than BIO_MAX_VECS sectors
Date: Mon, 16 Aug 2021 15:13:01 +0800
Message-ID: <YRoP/XU6XnPna4jU@T590>
In-Reply-To: <05bdd906-2e78-bc85-c186-7bffac9076e0@linux.dev>

On Mon, Aug 16, 2021 at 02:27:48PM +0800, Guoqing Jiang wrote:
> Hi Ming and Christoph,
>
> On 8/14/21 4:57 PM, Ming Lei wrote:
> > On Sat, Aug 14, 2021 at 08:55:21AM +0100, Christoph Hellwig wrote:
> > > On Fri, Aug 13, 2021 at 04:38:59PM +0800, Guoqing Jiang wrote:
> > > > Ok, thanks.
> > > >
> > > > > In general the size of a bio only depends on the number of vectors, not
> > > > > the total I/O size. But alloc_behind_master_bio allocates new backing
> > > > > pages using order 0 allocations, so in this exceptional case the total
> > > > > > size does actually matter.
> > > > >
> > > > > While we're at it: this huge memory allocation looks really deadlock
> > > > > prone.
> > > > > Hmm, let me think more about it, or could you share your thoughts?
> > > Well, you'd need a mempool which can fit the max payload of a bio,
> > > that is BIO_MAX_VECS pages.
>
> IIUC, the behind bio is allocated from bio_set (mddev->bio_set), which is
> allocated in md_run by calling bioset_init, so the mempool (bvec_pool) of
> this bio_set is created by biovec_init_pool, which uses the global biovec
> slabs. Do we really need another mempool? Or is there no potential
> deadlock in this case?
>
> > > FYI, this is what I'd do instead of this patch for now. We don't really
> > > need a vector per sector, just per page. So this limits the I/O
> > > size a little less.
> > >
> > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > index 3c44c4bb40fc..5b27d995302e 100644
> > > --- a/drivers/md/raid1.c
> > > +++ b/drivers/md/raid1.c
> > > @@ -1454,6 +1454,15 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> > > >  		goto retry_write;
> > > >  	}
> > > > +	/*
> > > > +	 * When using a bitmap, we may call alloc_behind_master_bio below.
> > > > +	 * alloc_behind_master_bio allocates a copy of the data payload a page
> > > > +	 * at a time and thus needs a new bio that can fit the whole payload
> > > > +	 * of this bio in page sized chunks.
> > > > +	 */
>
> Thanks for the above, will copy it accordingly. I will check if WriteMostly
> is set before, then check both the flag and bitmap.
>
> > > > +	if (bitmap)
> > > > +		max_sectors = min_t(int, max_sectors, BIO_MAX_VECS * PAGE_SIZE);
> > s/PAGE_SIZE/PAGE_SECTORS
>
> Agree.
>
> > > +
> > > >  	if (max_sectors < bio_sectors(bio)) {
> > > >  		struct bio *split = bio_split(bio, max_sectors,
> > > >  					      GFP_NOIO, &conf->bio_split);
> > > Here the limit is max single-page vectors, and the above way may not work,
> > > such as:
> >
> > 0 ~ 254: each bvec's length is 512
> > 255: bvec's length is 8192
> >
> > the total length is just 512*255 + 8192 = 138752 bytes = 271 sectors, but it
> > still may need 257 bvecs, which can't be allocated via bio_alloc_bioset().
>
> Thanks for looking deeper! I guess it is because of how vcnt is calculated.
>
> > One solution is to add queue limit of max_single_page_bvec, and let
> > blk_queue_split() handle it.
>
> The path (blk_queue_split -> blk_bio_segment_split -> bvec_split_segs)
> respects the max_segments limit. Do you mean introducing
> max_single_page_bvec as a limit, and then performing a check similar to
> the one for max_segments?
Yes, then the bio is guaranteed not to exceed the max single-page bvec
limit, just like what __blk_queue_bounce() does.
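
To make the arithmetic in the example above concrete, here is a minimal
userspace sketch, not kernel code. It assumes 512-byte sectors, 4 KiB
pages and BIO_MAX_VECS == 256, and it counts one single-page vector per
page of each bvec, which is the counting the example uses. All ex_* names
and EX_* constants are made up for illustration.

```c
#include <stddef.h>

/*
 * Illustrative constants, not taken from kernel headers: 512-byte
 * sectors and 4 KiB pages are assumed, and 256 stands in for
 * BIO_MAX_VECS.
 */
#define EX_SECTOR_SIZE	512u
#define EX_PAGE_SIZE	4096u
#define EX_PAGE_SECTORS	(EX_PAGE_SIZE / EX_SECTOR_SIZE)
#define EX_MAX_VECS	256u
#define EX_NR_BVECS	256u

/* Hypothetical stand-in for a bvec: only the length matters here. */
struct ex_bvec {
	unsigned int len;
};

/* Sector cap from the quoted hunk with PAGE_SECTORS instead of PAGE_SIZE. */
unsigned int ex_sector_cap(void)
{
	return EX_MAX_VECS * EX_PAGE_SECTORS;
}

/* Layout from the example: 255 bvecs of 512 bytes, then one of 8192. */
void ex_fill(struct ex_bvec *v)
{
	size_t i;

	for (i = 0; i < EX_NR_BVECS - 1; i++)
		v[i].len = 512;
	v[EX_NR_BVECS - 1].len = 8192;
}

/* Total payload in sectors, which is what a max_sectors cap looks at. */
unsigned int ex_total_sectors(const struct ex_bvec *v, size_t n)
{
	unsigned int bytes = 0;
	size_t i;

	for (i = 0; i < n; i++)
		bytes += v[i].len;
	return bytes / EX_SECTOR_SIZE;
}

/* Single-page vectors needed when each bvec is rounded up to whole pages. */
unsigned int ex_single_page_vecs(const struct ex_bvec *v, size_t n)
{
	unsigned int vecs = 0;
	size_t i;

	for (i = 0; i < n; i++)
		vecs += (v[i].len + EX_PAGE_SIZE - 1) / EX_PAGE_SIZE;
	return vecs;
}
```

For the layout built by ex_fill(), ex_total_sectors() gives 271, well
under the 2048-sector cap from ex_sector_cap(), while
ex_single_page_vecs() gives 257, one more than EX_MAX_VECS. A cap
expressed only in sectors therefore does not bound the vector count under
this way of counting.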

thanks,
Ming