From: Hannes Reinecke <hare@suse.de>
To: device-mapper development <dm-devel@redhat.com>,
Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>,
Ming Lei <ming.lei@canonical.com>,
Alasdair Kergon <agk@redhat.com>,
Lars Ellenberg <drbd-dev@lists.linbit.com>,
Philip Kelleher <pjk1939@linux.vnet.ibm.com>,
Christoph Hellwig <hch@infradead.org>,
Kent Overstreet <kent.overstreet@gmail.com>,
Neil@redhat.com, Ming Lin <ming.l@ssi.samsung.com>,
Al@redhat.com, Oleg Drokin <oleg.drokin@intel.com>,
Viro <viro@zeniv.linux.org.uk>, Nitin Gupta <ngupta@vflare.org>,
Jens Axboe <axboe@kernel.dk>,
Andreas Dilger <andreas.dilger@intel.com>,
Geoff Levand <geoff@infradead.org>, Jiri Kosina <jkosina@suse.cz>,
lkml <linux-kernel@vger.kernel.org>, Jim Paris <jim@jtan.com>,
Minchan Kim <minchan@kernel.org>, Dongsu Park <dpark@posteo.net>,
drbd-user@lists.linbit.com
Subject: Re: [Drbd-dev] [dm-devel] [PATCH v5 01/11] block: make generic_make_request handle arbitrarily sized bios
Date: Fri, 11 Sep 2015 13:21:37 -0000
Message-ID: <55C5C348.1070307@suse.de>
In-Reply-To: <1438990806.24452.8.camel@ssi>
On 08/08/2015 01:40 AM, Ming Lin wrote:
>
> On Fri, 2015-08-07 at 09:30 +0200, Christoph Hellwig wrote:
>> I'm for solution 3:
>>
>> - keep blk_bio_{discard,write_same}_split, but ensure we never build
>> a > 4GB bio in blkdev_issue_{discard,write_same}.
>
> This has a problem, as I mentioned in solution 1.
> We also need to make sure the max discard size is of the proper granularity.
> See the example below.
>
> 4G: 8388608 sectors
> UINT_MAX: 8388607 sectors
>
> dm-thinp block size = default discard granularity = 128 sectors
>
> blkdev_issue_discard(sector=0, nr_sectors=8388608)
>
> 1. Only ensure bi_size does not overflow
>
> It doesn't work.
>
> [start_sector, end_sector]
> [0, 8388607]
> [0, 8388606], then dm-thinp splits it into 2 bios
> [0, 8388479]
> [8388480, 8388606] ---> this is a problem in process_discard_bio(),
> because the discard size (127 sectors) covers less than a block (128 sectors)
> [8388607, 8388607] ---> same problem
>
> 2. Ensure bi_size does not overflow and max discard size is of the proper granularity
>
> It works.
>
> [start_sector, end_sector]
> [0, 8388607]
> [0, 8388479]
> [8388480, 8388607]
>
>
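To make the two cases above easy to check, here is a small user-space sketch of the split arithmetic. It is not kernel code: the helper names max_chunk_sectors() and split_discard() are made up for illustration, and it only models the split done at the blkdev_issue_discard() level, not the further split dm-thinp does on block boundaries.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, as in the ">> 9" in the patch */

/*
 * Hypothetical helper (not a kernel function): the largest per-bio sector
 * count whose byte length still fits in 32 bits, optionally rounded down
 * to a multiple of the discard granularity.
 */
static uint64_t max_chunk_sectors(uint64_t granularity, int round_down)
{
	uint64_t max = (uint64_t)UINT32_MAX >> SECTOR_SHIFT; /* 8388607 */

	assert(granularity > 0);
	if (round_down)
		max -= max % granularity;
	return max;
}

/*
 * Hypothetical helper: print the inclusive [start, end] range of every
 * piece and return the length in sectors of the last piece.
 */
static uint64_t split_discard(uint64_t sector, uint64_t nr_sects, uint64_t max)
{
	uint64_t last = 0;

	assert(max > 0); /* a zero limit would never make progress */
	while (nr_sects) {
		uint64_t req = nr_sects < max ? nr_sects : max;

		printf("[%" PRIu64 ", %" PRIu64 "]\n", sector, sector + req - 1);
		sector += req;
		nr_sects -= req;
		last = req;
	}
	return last;
}
```

With a granularity of 128 sectors, split_discard(0, 8388608, max_chunk_sectors(128, 0)) ends in a 1-sector piece, while passing max_chunk_sectors(128, 1) ends in a full 128-sector piece, which matches cases 1 and 2.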
> So how about the patch below?
>
> commit 1ca2ad977255efb3c339f4ca16fb798ed5ec54f7
> Author: Ming Lin <ming.l@ssi.samsung.com>
> Date: Fri Aug 7 15:07:07 2015 -0700
>
> block: remove split code in blkdev_issue_{discard,write_same}
>
> The split code in blkdev_issue_{discard,write_same} can go away
> now that any driver that cares does the split. We have to make
> sure bio size doesn't overflow.
>
> For discard, we ensure max_discard_sectors is of the proper
> granularity. So if discard size > 4G, blkdev_issue_discard() always
> send multiple granularity requests to lower level, except that the
> last one may be not multiple granularity.
>
> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
> ---
> block/blk-lib.c | 37 +++++++++----------------------------
> 1 file changed, 9 insertions(+), 28 deletions(-)
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 7688ee3..e178a07 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -44,7 +44,6 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> struct request_queue *q = bdev_get_queue(bdev);
> int type = REQ_WRITE | REQ_DISCARD;
> unsigned int max_discard_sectors, granularity;
> - int alignment;
> struct bio_batch bb;
> struct bio *bio;
> int ret = 0;
> @@ -58,18 +57,15 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>
> /* Zero-sector (unknown) and one-sector granularities are the same. */
> granularity = max(q->limits.discard_granularity >> 9, 1U);
> - alignment = (bdev_discard_alignment(bdev) >> 9) % granularity;
>
> /*
> - * Ensure that max_discard_sectors is of the proper
> - * granularity, so that requests stay aligned after a split.
> - */
> - max_discard_sectors = min(q->limits.max_discard_sectors, UINT_MAX >> 9);
> + * Ensure that max_discard_sectors doesn't overflow bi_size and is of
> + * the proper granularity. So if discard size > 4G, blkdev_issue_discard()
> + * always split and send multiple granularity requests to lower level,
> + * except that the last one may be not multiple granularity.
> + */
> + max_discard_sectors = UINT_MAX >> 9;
> max_discard_sectors -= max_discard_sectors % granularity;
> - if (unlikely(!max_discard_sectors)) {
> - /* Avoid infinite loop below. Being cautious never hurts. */
> - return -EOPNOTSUPP;
> - }
>
> if (flags & BLKDEV_DISCARD_SECURE) {
> if (!blk_queue_secdiscard(q))
> @@ -84,7 +80,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> blk_start_plug(&plug);
> while (nr_sects) {
> unsigned int req_sects;
> - sector_t end_sect, tmp;
> + sector_t end_sect;
>
> bio = bio_alloc(gfp_mask, 1);
> if (!bio) {
> @@ -93,20 +89,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> }
>
> req_sects = min_t(sector_t, nr_sects, max_discard_sectors);
> -
> - /*
> - * If splitting a request, and the next starting sector would be
> - * misaligned, stop the discard at the previous aligned sector.
> - */
> end_sect = sector + req_sects;
> - tmp = end_sect;
> - if (req_sects < nr_sects &&
> - sector_div(tmp, granularity) != alignment) {
> - end_sect = end_sect - alignment;
> - sector_div(end_sect, granularity);
> - end_sect = end_sect * granularity + alignment;
> - req_sects = end_sect - sector;
> - }
>
> bio->bi_iter.bi_sector = sector;
> bio->bi_end_io = bio_batch_end_io;
> @@ -166,10 +149,8 @@ int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
> if (!q)
> return -ENXIO;
>
> - max_write_same_sectors = q->limits.max_write_same_sectors;
> -
> - if (max_write_same_sectors == 0)
> - return -EOPNOTSUPP;
> + /* Ensure that max_write_same_sectors doesn't overflow bi_size */
> + max_write_same_sectors = UINT_MAX >> 9;
>
> atomic_set(&bb.done, 1);
> bb.flags = 1 << BIO_UPTODATE;
>
Wouldn't it be easier to move both max_write_same_sectors and
max_discard_sectors to 64 bit (i.e. to type sector_t) and be done with the
overflow?
Seems to me this is far too much coding around self-imposed restrictions...
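A user-space sketch of the overflow in question, assuming the byte count sits in a 32-bit field as the patch comment about bi_size implies. The function names are hypothetical. It only shows where a 32-bit count wraps and a 64-bit one does not, and says nothing about what widening a field would cost elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Byte length as a 32-bit field would hold it: wraps modulo 2^32. */
static uint32_t bytes_in_u32(uint64_t nr_sects)
{
	return (uint32_t)(nr_sects << SECTOR_SHIFT);
}

/* Byte length in a 64-bit field: no wrap for any realistic device size. */
static uint64_t bytes_in_u64(uint64_t nr_sects)
{
	assert(nr_sects <= (UINT64_MAX >> SECTOR_SHIFT)); /* shift must not lose bits */
	return nr_sects << SECTOR_SHIFT;
}
```

For the 4G discard from the example, bytes_in_u32(8388608) comes out as 0, while bytes_in_u64(8388608) is 4294967296.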
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)