* [PATCH 02/60] block drivers: convert to bio_init_with_vec_table()
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
@ 2016-10-29 8:08 ` Ming Lei
2016-10-29 8:08 ` [PATCH 07/60] dm: crypt: use bio_add_page() Ming Lei
` (5 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-29 8:08 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: linux-block, linux-fsdevel, Christoph Hellwig,
Kirill A . Shutemov, Ming Lei, Jiri Kosina, Kent Overstreet,
Shaohua Li, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER LVM, Christoph Hellwig, Sagi Grimberg,
Joern Engel, Prasad Joshi, Mike Christie, Hannes Reinecke,
Rasmus Villemoes, Johannes Thumshirn, Guoqing Jiang
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
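For readers who do not have patch 01 of the series at hand, the
conversions below all follow one pattern: bio_init() followed by manual
assignment of bi_io_vec (and usually bi_max_vecs) collapses into one
call. The following is a minimal userspace sketch of that pattern, not
the kernel implementation. The struct layouts are cut down to the
fields touched here, and the body of the helper is an assumption
inferred from the call sites in this diff.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins; the real kernel structs carry many more fields. */
struct bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct bio {
	unsigned short bi_vcnt;     /* entries currently in use */
	unsigned short bi_max_vecs; /* capacity of the table */
	struct bio_vec *bi_io_vec;  /* the vector table itself */
};

/* Model of bio_init(): put the bio into a known, empty state. */
static void bio_init(struct bio *bio)
{
	memset(bio, 0, sizeof(*bio));
}

/*
 * Assumed shape of the helper: initialize the bio and attach the
 * caller-provided table in a single step, so no caller can forget
 * to set bi_max_vecs.
 */
static void bio_init_with_vec_table(struct bio *bio, struct bio_vec *table,
				    unsigned short max_vecs)
{
	assert(table != NULL && max_vecs > 0);
	bio_init(bio);
	bio->bi_io_vec = table;
	bio->bi_max_vecs = max_vecs;
}
```

One visible consequence in this diff: the floppy.c call site previously
set only bi_io_vec, and with the helper it passes an explicit table
size of 1 as well.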
drivers/block/floppy.c | 3 +--
drivers/md/bcache/io.c | 4 +---
drivers/md/bcache/journal.c | 4 +---
drivers/md/bcache/movinggc.c | 7 +++----
drivers/md/bcache/super.c | 13 ++++---------
drivers/md/bcache/writeback.c | 6 +++---
drivers/md/dm-bufio.c | 4 +---
drivers/md/raid5.c | 9 ++-------
drivers/nvme/target/io-cmd.c | 4 +---
fs/logfs/dev_bdev.c | 4 +---
10 files changed, 18 insertions(+), 40 deletions(-)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index e3d8e4ced4a2..cdc916a95137 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3806,8 +3806,7 @@ static int __floppy_read_block_0(struct block_device *bdev, int drive)
cbdata.drive = drive;
- bio_init(&bio);
- bio.bi_io_vec = &bio_vec;
+ bio_init_with_vec_table(&bio, &bio_vec, 1);
bio_vec.bv_page = page;
bio_vec.bv_len = size;
bio_vec.bv_offset = 0;
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index e97b0acf7b8d..af9489087cd3 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -24,9 +24,7 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
struct bbio *b = mempool_alloc(c->bio_meta, GFP_NOIO);
struct bio *bio = &b->bio;
- bio_init(bio);
- bio->bi_max_vecs = bucket_pages(c);
- bio->bi_io_vec = bio->bi_inline_vecs;
+ bio_init_with_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));
return bio;
}
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 6925023e12d4..b966f28d1b98 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -448,13 +448,11 @@ static void do_journal_discard(struct cache *ca)
atomic_set(&ja->discard_in_flight, DISCARD_IN_FLIGHT);
- bio_init(bio);
+ bio_init_with_vec_table(bio, bio->bi_inline_vecs, 1);
bio_set_op_attrs(bio, REQ_OP_DISCARD, 0);
bio->bi_iter.bi_sector = bucket_to_sector(ca->set,
ca->sb.d[ja->discard_idx]);
bio->bi_bdev = ca->bdev;
- bio->bi_max_vecs = 1;
- bio->bi_io_vec = bio->bi_inline_vecs;
bio->bi_iter.bi_size = bucket_bytes(ca);
bio->bi_end_io = journal_discard_endio;
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index 5c4bddecfaf0..9d7991f69030 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -77,15 +77,14 @@ static void moving_init(struct moving_io *io)
{
struct bio *bio = &io->bio.bio;
- bio_init(bio);
+ bio_init_with_vec_table(bio, bio->bi_inline_vecs,
+ DIV_ROUND_UP(KEY_SIZE(&io->w->key),
+ PAGE_SECTORS));
bio_get(bio);
bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));
bio->bi_iter.bi_size = KEY_SIZE(&io->w->key) << 9;
- bio->bi_max_vecs = DIV_ROUND_UP(KEY_SIZE(&io->w->key),
- PAGE_SECTORS);
bio->bi_private = &io->cl;
- bio->bi_io_vec = bio->bi_inline_vecs;
bch_bio_map(bio, NULL);
}
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 849ad441cd76..d8a6d807b498 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1152,9 +1152,7 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
dc->bdev = bdev;
dc->bdev->bd_holder = dc;
- bio_init(&dc->sb_bio);
- dc->sb_bio.bi_max_vecs = 1;
- dc->sb_bio.bi_io_vec = dc->sb_bio.bi_inline_vecs;
+ bio_init_with_vec_table(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
get_page(sb_page);
@@ -1814,9 +1812,8 @@ static int cache_alloc(struct cache *ca)
__module_get(THIS_MODULE);
kobject_init(&ca->kobj, &bch_cache_ktype);
- bio_init(&ca->journal.bio);
- ca->journal.bio.bi_max_vecs = 8;
- ca->journal.bio.bi_io_vec = ca->journal.bio.bi_inline_vecs;
+ bio_init_with_vec_table(&ca->journal.bio,
+ ca->journal.bio.bi_inline_vecs, 8);
free = roundup_pow_of_two(ca->sb.nbuckets) >> 10;
@@ -1852,9 +1849,7 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
ca->bdev = bdev;
ca->bdev->bd_holder = ca;
- bio_init(&ca->sb_bio);
- ca->sb_bio.bi_max_vecs = 1;
- ca->sb_bio.bi_io_vec = ca->sb_bio.bi_inline_vecs;
+ bio_init_with_vec_table(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
get_page(sb_page);
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index e51644e503a5..b2568cef8c86 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -106,14 +106,14 @@ static void dirty_init(struct keybuf_key *w)
struct dirty_io *io = w->private;
struct bio *bio = &io->bio;
- bio_init(bio);
+ bio_init_with_vec_table(bio, bio->bi_inline_vecs,
+ DIV_ROUND_UP(KEY_SIZE(&w->key),
+ PAGE_SECTORS));
if (!io->dc->writeback_percent)
bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));
bio->bi_iter.bi_size = KEY_SIZE(&w->key) << 9;
- bio->bi_max_vecs = DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS);
bio->bi_private = w;
- bio->bi_io_vec = bio->bi_inline_vecs;
bch_bio_map(bio, NULL);
}
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 125aedc3875f..5b13e7e7c8aa 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -611,9 +611,7 @@ static void use_inline_bio(struct dm_buffer *b, int rw, sector_t block,
char *ptr;
int len;
- bio_init(&b->bio);
- b->bio.bi_io_vec = b->bio_vec;
- b->bio.bi_max_vecs = DM_BUFIO_INLINE_VECS;
+ bio_init_with_vec_table(&b->bio, b->bio_vec, DM_BUFIO_INLINE_VECS);
b->bio.bi_iter.bi_sector = block << b->c->sectors_per_block_bits;
b->bio.bi_bdev = b->c->bdev;
b->bio.bi_end_io = inline_endio;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 92ac251e91e6..eae7b4cf34d4 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2004,13 +2004,8 @@ static struct stripe_head *alloc_stripe(struct kmem_cache *sc, gfp_t gfp,
for (i = 0; i < disks; i++) {
struct r5dev *dev = &sh->dev[i];
- bio_init(&dev->req);
- dev->req.bi_io_vec = &dev->vec;
- dev->req.bi_max_vecs = 1;
-
- bio_init(&dev->rreq);
- dev->rreq.bi_io_vec = &dev->rvec;
- dev->rreq.bi_max_vecs = 1;
+ bio_init_with_vec_table(&dev->req, &dev->vec, 1);
+ bio_init_with_vec_table(&dev->rreq, &dev->rvec, 1);
}
}
return sh;
diff --git a/drivers/nvme/target/io-cmd.c b/drivers/nvme/target/io-cmd.c
index 4a96c2049b7b..6a32b0b68b1e 100644
--- a/drivers/nvme/target/io-cmd.c
+++ b/drivers/nvme/target/io-cmd.c
@@ -37,9 +37,7 @@ static void nvmet_inline_bio_init(struct nvmet_req *req)
{
struct bio *bio = &req->inline_bio;
- bio_init(bio);
- bio->bi_max_vecs = NVMET_MAX_INLINE_BIOVEC;
- bio->bi_io_vec = req->inline_bvec;
+ bio_init_with_vec_table(bio, req->inline_bvec, NVMET_MAX_INLINE_BIOVEC);
}
static void nvmet_execute_rw(struct nvmet_req *req)
diff --git a/fs/logfs/dev_bdev.c b/fs/logfs/dev_bdev.c
index a8329cc47dec..2bf53b0ffe83 100644
--- a/fs/logfs/dev_bdev.c
+++ b/fs/logfs/dev_bdev.c
@@ -19,9 +19,7 @@ static int sync_request(struct page *page, struct block_device *bdev, int op)
struct bio bio;
struct bio_vec bio_vec;
- bio_init(&bio);
- bio.bi_max_vecs = 1;
- bio.bi_io_vec = &bio_vec;
+ bio_init_with_vec_table(&bio, &bio_vec, 1);
bio_vec.bv_page = page;
bio_vec.bv_len = PAGE_SIZE;
bio_vec.bv_offset = 0;
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 07/60] dm: crypt: use bio_add_page()
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
2016-10-29 8:08 ` [PATCH 02/60] block drivers: convert to bio_init_with_vec_table() Ming Lei
@ 2016-10-29 8:08 ` Ming Lei
2016-10-29 8:08 ` [PATCH 08/60] dm: use bvec iterator helpers to implement .get_page and .next_page Ming Lei
` (4 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-29 8:08 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: linux-block, linux-fsdevel, Christoph Hellwig,
Kirill A . Shutemov, Ming Lei, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER LVM, Shaohua Li,
open list:SOFTWARE RAID Multiple Disks SUPPORT
We have a standard interface for adding a page to a bio, so don't
open-code it in a hacky way.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
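To illustrate what the open-coded sequence removed below has to get
right, here is a minimal userspace model of adding a page through one
helper. It is a sketch under stated assumptions: model_bio_add_page()
is a hypothetical name, bi_size stands in for bi_iter.bi_size, and the
real bio_add_page() additionally merges contiguous pages and honors
queue limits, none of which is modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct bio {
	unsigned int bi_size;       /* stands in for bi_iter.bi_size */
	unsigned short bi_vcnt;     /* entries in use */
	unsigned short bi_max_vecs; /* table capacity */
	struct bio_vec *bi_io_vec;
};

/*
 * Hypothetical model: append one page and keep the entry count and
 * the byte count in sync. Returns the number of bytes added, or 0 if
 * the table is already full.
 */
static unsigned int model_bio_add_page(struct bio *bio, void *page,
				       unsigned int len, unsigned int offset)
{
	struct bio_vec *bv;

	if (bio->bi_vcnt >= bio->bi_max_vecs)
		return 0;

	bv = &bio->bi_io_vec[bio->bi_vcnt++];
	bv->bv_page = page;
	bv->bv_len = len;
	bv->bv_offset = offset;
	bio->bi_size += len;
	return len;
}
```

The point of routing callers through one function is that the table
bound check and the size accounting live in a single place, which
matters once one table entry may describe more than one page.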
drivers/md/dm-crypt.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index a2768835d394..4999c7497f95 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -994,7 +994,6 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned size)
gfp_t gfp_mask = GFP_NOWAIT | __GFP_HIGHMEM;
unsigned i, len, remaining_size;
struct page *page;
- struct bio_vec *bvec;
retry:
if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
@@ -1019,12 +1018,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned size)
len = (remaining_size > PAGE_SIZE) ? PAGE_SIZE : remaining_size;
- bvec = &clone->bi_io_vec[clone->bi_vcnt++];
- bvec->bv_page = page;
- bvec->bv_len = len;
- bvec->bv_offset = 0;
-
- clone->bi_iter.bi_size += len;
+ bio_add_page(clone, page, len, 0);
remaining_size -= len;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 08/60] dm: use bvec iterator helpers to implement .get_page and .next_page
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
2016-10-29 8:08 ` [PATCH 02/60] block drivers: convert to bio_init_with_vec_table() Ming Lei
2016-10-29 8:08 ` [PATCH 07/60] dm: crypt: use bio_add_page() Ming Lei
@ 2016-10-29 8:08 ` Ming Lei
2016-10-29 8:08 ` [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments Ming Lei
` (3 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-29 8:08 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: linux-block, linux-fsdevel, Christoph Hellwig,
Kirill A . Shutemov, Ming Lei, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER LVM, Shaohua Li,
open list:SOFTWARE RAID Multiple Disks SUPPORT
First, we already have mature bvec/bio iterator helpers for iterating
over each page in a bio, so there is no need to reinvent the wheel.
Second, the upcoming multipage bvec support requires this patch.
Also add a comment about the direct access to the bvec table.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
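The get_page/next_page pair below amounts to "look at the current
segment through the iterator, then advance by its length". A minimal
userspace sketch of that walk follows. It is an illustration only: the
model_* names are hypothetical, the iterator omits bi_sector, and the
arithmetic mirrors what the bvec iterator helpers are understood to do
rather than quoting them.

```c
#include <assert.h>

struct bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* Cut-down iterator: the kernel version also tracks bi_sector. */
struct bvec_iter {
	unsigned int bi_size;      /* bytes still to be processed */
	unsigned int bi_idx;       /* current index into the table */
	unsigned int bi_bvec_done; /* bytes already consumed in that entry */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Current segment as seen through the iterator (returned by value). */
static struct bio_vec model_iter_bvec(const struct bio_vec *table,
				      struct bvec_iter it)
{
	struct bio_vec bv = table[it.bi_idx];

	bv.bv_len = min_uint(it.bi_size, bv.bv_len - it.bi_bvec_done);
	bv.bv_offset += it.bi_bvec_done;
	return bv;
}

/* Advance the iterator by 'bytes', stepping over table entries. */
static void model_iter_advance(const struct bio_vec *table,
			       struct bvec_iter *it, unsigned int bytes)
{
	assert(bytes <= it->bi_size);

	while (bytes) {
		unsigned int len = min_uint(bytes,
			table[it->bi_idx].bv_len - it->bi_bvec_done);

		bytes -= len;
		it->bi_size -= len;
		it->bi_bvec_done += len;
		if (it->bi_bvec_done == table[it->bi_idx].bv_len) {
			it->bi_bvec_done = 0;
			it->bi_idx++;
		}
	}
}
```

In the patch itself the length returned by bio_get_page() is stashed in
context_bi.bi_sector so that bio_next_page() can advance by the same
amount without recomputing it; the sketch passes the length explicitly
instead.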
drivers/md/dm-io.c | 34 ++++++++++++++++++++++++----------
1 file changed, 24 insertions(+), 10 deletions(-)
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 0bf1a12e35fe..2ef573c220fc 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -162,7 +162,10 @@ struct dpages {
struct page **p, unsigned long *len, unsigned *offset);
void (*next_page)(struct dpages *dp);
- unsigned context_u;
+ union {
+ unsigned context_u;
+ struct bvec_iter context_bi;
+ };
void *context_ptr;
void *vma_invalidate_address;
@@ -204,25 +207,36 @@ static void list_dp_init(struct dpages *dp, struct page_list *pl, unsigned offse
static void bio_get_page(struct dpages *dp, struct page **p,
unsigned long *len, unsigned *offset)
{
- struct bio_vec *bvec = dp->context_ptr;
- *p = bvec->bv_page;
- *len = bvec->bv_len - dp->context_u;
- *offset = bvec->bv_offset + dp->context_u;
+ struct bio_vec bv = bvec_iter_bvec((struct bio_vec *)dp->context_ptr,
+ dp->context_bi);
+
+ *p = bv.bv_page;
+ *len = bv.bv_len;
+ *offset = bv.bv_offset;
+
+ /* stash the length so bio_next_page() need not recompute it */
+ dp->context_bi.bi_sector = (sector_t)bv.bv_len;
}
static void bio_next_page(struct dpages *dp)
{
- struct bio_vec *bvec = dp->context_ptr;
- dp->context_ptr = bvec + 1;
- dp->context_u = 0;
+ unsigned int len = (unsigned int)dp->context_bi.bi_sector;
+
+ bvec_iter_advance((struct bio_vec *)dp->context_ptr,
+ &dp->context_bi, len);
}
static void bio_dp_init(struct dpages *dp, struct bio *bio)
{
dp->get_page = bio_get_page;
dp->next_page = bio_next_page;
- dp->context_ptr = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
- dp->context_u = bio->bi_iter.bi_bvec_done;
+
+ /*
+ * We just use bvec iterator to retrieve pages, so it is ok to
+ * access the bvec table directly here
+ */
+ dp->context_ptr = bio->bi_io_vec;
+ dp->context_bi = bio->bi_iter;
}
/*
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
` (2 preceding siblings ...)
2016-10-29 8:08 ` [PATCH 08/60] dm: use bvec iterator helpers to implement .get_page and .next_page Ming Lei
@ 2016-10-29 8:08 ` Ming Lei
2016-10-31 15:29 ` Christoph Hellwig
2016-10-29 8:08 ` [PATCH 29/60] dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT Ming Lei
` (2 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2016-10-29 8:08 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: linux-block, linux-fsdevel, Christoph Hellwig,
Kirill A . Shutemov, Ming Lei, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER LVM, Shaohua Li,
open list:SOFTWARE RAID Multiple Disks SUPPORT
Avoid accessing .bi_vcnt directly, because it may no longer be what
the driver expects once multipage bvecs are supported.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
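A small userspace sketch of why the two tests can disagree: bi_vcnt
describes the vector table, while a segment test derived from the
iterator describes what the bio still covers. The model_* name is
hypothetical and the comparison is an assumption about how
bio_multiple_segments() behaves (remaining size versus the length of
the current segment); it is not copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

struct bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct bvec_iter {
	unsigned int bi_size;      /* bytes still covered by the bio */
	unsigned int bi_idx;       /* current table index */
	unsigned int bi_bvec_done; /* bytes consumed in the current entry */
};

struct bio {
	unsigned short bi_vcnt;    /* number of entries in the table */
	struct bio_vec *bi_io_vec;
	struct bvec_iter bi_iter;
};

/* Length of the segment the iterator currently points at. */
static unsigned int cur_seg_len(const struct bio *bio)
{
	const struct bio_vec *bv = &bio->bi_io_vec[bio->bi_iter.bi_idx];
	unsigned int left = bv->bv_len - bio->bi_iter.bi_bvec_done;

	return left < bio->bi_iter.bi_size ? left : bio->bi_iter.bi_size;
}

/* Assumed semantics: more data remains than the current segment holds. */
static bool model_bio_multiple_segments(const struct bio *bio)
{
	return bio->bi_iter.bi_size != cur_seg_len(bio);
}
```

With a four-entry table, a bio whose iterator covers only the last
entry has bi_vcnt == 4 yet a single remaining segment, so a test on
bi_vcnt == 1 and a test on !model_bio_multiple_segments() give
different answers for it.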
drivers/md/dm-rq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 1d0d2adc050a..8534cbf8ce35 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -819,7 +819,8 @@ static void dm_old_request_fn(struct request_queue *q)
pos = blk_rq_pos(rq);
if ((dm_old_request_peeked_before_merge_deadline(md) &&
- md_in_flight(md) && rq->bio && rq->bio->bi_vcnt == 1 &&
+ md_in_flight(md) && rq->bio &&
+ !bio_multiple_segments(rq->bio) &&
md->last_rq_pos == pos && md->last_rq_rw == rq_data_dir(rq)) ||
(ti->type->busy && ti->type->busy(ti))) {
blk_delay_queue(q, 10);
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-10-29 8:08 ` [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments Ming Lei
@ 2016-10-31 15:29 ` Christoph Hellwig
2016-10-31 22:59 ` Ming Lei
2016-11-02 3:09 ` Kent Overstreet
0 siblings, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2016-10-31 15:29 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-kernel, linux-block, linux-fsdevel,
Christoph Hellwig, Kirill A . Shutemov, Alasdair Kergon,
Mike Snitzer, maintainer:DEVICE-MAPPER (LVM), Shaohua Li,
open list:SOFTWARE RAID (Multiple Disks) SUPPORT
On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
> Avoid to access .bi_vcnt directly, because it may be not what
> the driver expected any more after supporting multipage bvec.
>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
It would be really nice to have a comment in the code why it's
even checking for multiple segments.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-10-31 15:29 ` Christoph Hellwig
@ 2016-10-31 22:59 ` Ming Lei
2016-11-02 3:09 ` Kent Overstreet
1 sibling, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-31 22:59 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Linux Kernel Mailing List, linux-block,
Linux FS Devel, Kirill A . Shutemov, Alasdair Kergon,
Mike Snitzer, maintainer:DEVICE-MAPPER (LVM), Shaohua Li,
open list:SOFTWARE RAID (Multiple Disks) SUPPORT
On Mon, Oct 31, 2016 at 11:29 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
>> Avoid to access .bi_vcnt directly, because it may be not what
>> the driver expected any more after supporting multipage bvec.
>>
>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>
> It would be really nice to have a comment in the code why it's
> even checking for multiple segments.
>
OK, I will add a comment about using !bio_multiple_segments(rq->bio)
to replace 'rq->bio->bi_vcnt == 1'.
Thanks,
Ming Lei
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-10-31 15:29 ` Christoph Hellwig
2016-10-31 22:59 ` Ming Lei
@ 2016-11-02 3:09 ` Kent Overstreet
2016-11-02 7:56 ` Ming Lei
1 sibling, 1 reply; 15+ messages in thread
From: Kent Overstreet @ 2016-11-02 3:09 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ming Lei, Jens Axboe, linux-kernel, linux-block, linux-fsdevel,
Kirill A . Shutemov, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER (LVM), Shaohua Li,
open list:SOFTWARE RAID (Multiple Disks) SUPPORT
On Mon, Oct 31, 2016 at 08:29:01AM -0700, Christoph Hellwig wrote:
> On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
> > Avoid to access .bi_vcnt directly, because it may be not what
> > the driver expected any more after supporting multipage bvec.
> >
> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>
> It would be really nice to have a comment in the code why it's
> even checking for multiple segments.
Or ideally refactor the code to not care about multiple segments at all.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-11-02 3:09 ` Kent Overstreet
@ 2016-11-02 7:56 ` Ming Lei
2016-11-02 14:24 ` Mike Snitzer
0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2016-11-02 7:56 UTC (permalink / raw)
To: Kent Overstreet
Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
linux-block, Linux FS Devel, Kirill A . Shutemov, Alasdair Kergon,
Mike Snitzer, maintainer:DEVICE-MAPPER (LVM), Shaohua Li,
open list:SOFTWARE RAID (Multiple Disks) SUPPORT
On Wed, Nov 2, 2016 at 11:09 AM, Kent Overstreet
<kent.overstreet@gmail.com> wrote:
> On Mon, Oct 31, 2016 at 08:29:01AM -0700, Christoph Hellwig wrote:
>> On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
>> > Avoid to access .bi_vcnt directly, because it may be not what
>> > the driver expected any more after supporting multipage bvec.
>> >
>> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>>
>> It would be really nice to have a comment in the code why it's
>> even checking for multiple segments.
>
> Or ideally refactor the code to not care about multiple segments at all.
The check on 'bio->bi_vcnt == 1' was introduced in commit de3ec86dff160 (dm:
don't start current request if it would've merged with the previous), which
fixed a performance issue.[1]
It looks like the idea of that patch is to delay dispatching the rq if it
would have merged with the previous request and the rq is small (single bvec).
I guess the motivation is to increase the chance of merging by adding the delay.
But why does the code check 'bio->bi_vcnt == 1'? Once the bio is
submitted, .bi_vcnt no longer changes, and merging doesn't change
it either. So should the check have been on blk_rq_bytes(rq)?
Mike, please correct me if my understanding is wrong.
[1] https://www.redhat.com/archives/dm-devel/2015-March/msg00014.html
thanks,
Ming Lei
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-11-02 7:56 ` Ming Lei
@ 2016-11-02 14:24 ` Mike Snitzer
2016-11-02 23:47 ` Ming Lei
0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2016-11-02 14:24 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, open list:SOFTWARE RAID (Multiple Disks) SUPPORT,
Linux Kernel Mailing List, Christoph Hellwig,
maintainer:DEVICE-MAPPER (LVM), linux-block, Alasdair Kergon,
Linux FS Devel, Shaohua Li, Kent Overstreet, Kirill A . Shutemov
On Wed, Nov 02 2016 at 3:56am -0400,
Ming Lei <tom.leiming@gmail.com> wrote:
> On Wed, Nov 2, 2016 at 11:09 AM, Kent Overstreet
> <kent.overstreet@gmail.com> wrote:
> > On Mon, Oct 31, 2016 at 08:29:01AM -0700, Christoph Hellwig wrote:
> >> On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
> >> > Avoid to access .bi_vcnt directly, because it may be not what
> >> > the driver expected any more after supporting multipage bvec.
> >> >
> >> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> >>
> >> It would be really nice to have a comment in the code why it's
> >> even checking for multiple segments.
> >
> > Or ideally refactor the code to not care about multiple segments at all.
>
> The check on 'bio->bi_vcnt == 1' is introduced in commit de3ec86dff160(dm:
> don't start current request if it would've merged with the previous), which
> fixed one performance issue.[1]
>
> Looks the idea of the patch is to delay dispatching the rq if it
> would've merged with previous request and the rq is small(single bvec).
> I guess the motivation is to try to increase chance of merging with the delay.
>
> But why does the code check on 'bio->bi_vcnt == 1'? Once the bio is
> submitted, .bi_vcnt isn't changed any more and merging doesn't change
> it too. So should the check have been on blk_rq_bytes(rq)?
>
> Mike, please correct me if my understanding is wrong.
>
>
> [1] https://www.redhat.com/archives/dm-devel/2015-March/msg00014.html
The patch was labored over for quite a while and is based on suggestions I
got from Jens when discussing a very problematic aspect of old
.request_fn request-based DM performance for a multi-threaded (64
threads) sequential IO benchmark (vdbench IIRC). The issue was reported
by NetApp.
The patch in question fixed the lack of merging that was seen with this
interleaved sequential IO benchmark. The lack of merging was made worse
if a DM multipath device had more underlying paths (e.g. 4 instead of 2).
As for your question, about using blk_rq_bytes(rq) vs 'bio->bi_vcnt == 1'
.. not sure how that would be a suitable replacement. But it has been a
while since I've delved into these block core merge details of old
.request_fn but _please_ don't change the logic of this code simply
because it is proving itself to be problematic for your current
patchset's cleanliness.
Mike
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
2016-11-02 14:24 ` Mike Snitzer
@ 2016-11-02 23:47 ` Ming Lei
0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-11-02 23:47 UTC (permalink / raw)
To: Mike Snitzer
Cc: Kent Overstreet, Christoph Hellwig, Jens Axboe,
Linux Kernel Mailing List, linux-block, Linux FS Devel,
Kirill A . Shutemov, Alasdair Kergon,
maintainer:DEVICE-MAPPER (LVM), Shaohua Li,
open list:SOFTWARE RAID (Multiple Disks) SUPPORT
On Wed, Nov 2, 2016 at 10:24 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> On Wed, Nov 02 2016 at 3:56am -0400,
> Ming Lei <tom.leiming@gmail.com> wrote:
>
>> On Wed, Nov 2, 2016 at 11:09 AM, Kent Overstreet
>> <kent.overstreet@gmail.com> wrote:
>> > On Mon, Oct 31, 2016 at 08:29:01AM -0700, Christoph Hellwig wrote:
>> >> On Sat, Oct 29, 2016 at 04:08:08PM +0800, Ming Lei wrote:
>> >> > Avoid to access .bi_vcnt directly, because it may be not what
>> >> > the driver expected any more after supporting multipage bvec.
>> >> >
>> >> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>> >>
>> >> It would be really nice to have a comment in the code why it's
>> >> even checking for multiple segments.
>> >
>> > Or ideally refactor the code to not care about multiple segments at all.
>>
>> The check on 'bio->bi_vcnt == 1' is introduced in commit de3ec86dff160(dm:
>> don't start current request if it would've merged with the previous), which
>> fixed one performance issue.[1]
>>
>> Looks the idea of the patch is to delay dispatching the rq if it
>> would've merged with previous request and the rq is small(single bvec).
>> I guess the motivation is to try to increase chance of merging with the delay.
>>
>> But why does the code check on 'bio->bi_vcnt == 1'? Once the bio is
>> submitted, .bi_vcnt isn't changed any more and merging doesn't change
>> it too. So should the check have been on blk_rq_bytes(rq)?
>>
>> Mike, please correct me if my understanding is wrong.
>>
>>
>> [1] https://www.redhat.com/archives/dm-devel/2015-March/msg00014.html
>
> The patch was labored over for quite a while and is based on suggestions I
> got from Jens when discussing a very problematic aspect of old
> .request_fn request-based DM performance for a multi-threaded (64
> threads) sequential IO benchmark (vdbench IIRC). The issue was reported
> by NetApp.
>
> The patch in question fixed the lack of merging that was seen with this
> interleaved sequential IO benchmark. The lack of merging was made worse
> if a DM multipath device had more underlying paths (e.g. 4 instead of 2).
>
> As for your question, about using blk_rq_bytes(rq) vs 'bio->bi_vcnt == 1'
> .. not sure how that would be a suitable replacement. But it has been a
> while since I've delved into these block core merge details of old
That was just last year, which doesn't seem that long ago :-)
> .request_fn but _please_ don't change the logic of this code simply
As I explained before, .bi_vcnt changes neither after submission nor
during merging, so I think the check is wrong. Could you explain your
original motivation for checking 'bio->bi_vcnt == 1'?
> because it is proving itself to be problematic for your current
> patchset's cleanliness.
Could you explain what is problematic for the cleanliness?
Thanks,
Ming Lei
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 29/60] dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
` (3 preceding siblings ...)
2016-10-29 8:08 ` [PATCH 09/60] dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments Ming Lei
@ 2016-10-29 8:08 ` Ming Lei
2016-10-29 8:08 ` [PATCH 58/60] dm-crypt: convert to bio_for_each_segment_all_rd() Ming Lei
2016-10-31 15:25 ` [PATCH 00/60] block: support multipage bvec Christoph Hellwig
6 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-29 8:08 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: linux-block, linux-fsdevel, Christoph Hellwig,
Kirill A . Shutemov, Ming Lei, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER LVM, Shaohua Li,
open list:SOFTWARE RAID Multiple Disks SUPPORT
For bio-based DM, some targets, such as the crypt and log write
targets, aren't ready to handle incoming bios larger than 1 Mbyte.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
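The change below is a clamp: the target's requested length is accepted
only up to a fixed ceiling. A minimal userspace sketch of that logic
follows. MODEL_MAX_IO_LEN_CAP and model_set_max_io_len() are
hypothetical names, and the cap value is an assumption picked to match
the 1 Mbyte figure above at 512-byte sectors; the patch itself uses
BIO_SP_MAX_SECTORS << SECTOR_SHIFT, whose definition is in another
patch of the series and is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical ceiling: 2048 sectors * 512 bytes = 1 MiB. */
#define MODEL_MAX_IO_LEN_CAP 2048u

/*
 * Model of the clamped setter. Lengths that do not fit in 32 bits are
 * rejected first (the existing -EINVAL path, modeled here as -1); the
 * remaining comparison is done on 32-bit values, which is what
 * min_t(uint32_t, ...) does by casting both operands.
 */
static int model_set_max_io_len(sector_t len, uint32_t *max_io_len)
{
	uint32_t v;

	if (len > UINT32_MAX)
		return -1;

	v = (uint32_t)len;
	*max_io_len = v < MODEL_MAX_IO_LEN_CAP ? v : MODEL_MAX_IO_LEN_CAP;
	return 0;
}
```

A request below the ceiling is stored unchanged, a larger one is
silently reduced to the ceiling, and only an out-of-range value is
reported as an error.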
drivers/md/dm.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index ef7bf1dd6900..ce454c6c1a4e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -899,7 +899,16 @@ int dm_set_target_max_io_len(struct dm_target *ti, sector_t len)
return -EINVAL;
}
- ti->max_io_len = (uint32_t) len;
+ /*
+ * A bio-based queue does its own splitting. When multipage bvecs
+ * are switched on, the incoming bio may be too big to be
+ * handled by some targets, such as crypt and log write.
+ *
+ * When these targets are ready for the big bio, we can remove
+ * the limit.
+ */
+ ti->max_io_len = min_t(uint32_t, len,
+ BIO_SP_MAX_SECTORS << SECTOR_SHIFT);
return 0;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 58/60] dm-crypt: convert to bio_for_each_segment_all_rd()
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
` (4 preceding siblings ...)
2016-10-29 8:08 ` [PATCH 29/60] dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT Ming Lei
@ 2016-10-29 8:08 ` Ming Lei
2016-10-31 15:25 ` [PATCH 00/60] block: support multipage bvec Christoph Hellwig
6 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-29 8:08 UTC (permalink / raw)
To: Jens Axboe, linux-kernel
Cc: linux-block, linux-fsdevel, Christoph Hellwig,
Kirill A . Shutemov, Ming Lei, Alasdair Kergon, Mike Snitzer,
maintainer:DEVICE-MAPPER LVM, Shaohua Li,
open list:SOFTWARE RAID Multiple Disks SUPPORT
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
drivers/md/dm-crypt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 4999c7497f95..ed0f54e51638 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1034,8 +1034,9 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone)
{
unsigned int i;
struct bio_vec *bv;
+ struct bvec_iter_all bia;
- bio_for_each_segment_all(bv, clone, i) {
+ bio_for_each_segment_all_rd(bv, clone, i, bia) {
BUG_ON(!bv->bv_page);
mempool_free(bv->bv_page, cc->page_pool);
bv->bv_page = NULL;
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 00/60] block: support multipage bvec
2016-10-29 8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
` (5 preceding siblings ...)
2016-10-29 8:08 ` [PATCH 58/60] dm-crypt: convert to bio_for_each_segment_all_rd() Ming Lei
@ 2016-10-31 15:25 ` Christoph Hellwig
2016-10-31 22:52 ` Ming Lei
6 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2016-10-31 15:25 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-kernel, linux-block, linux-fsdevel,
Christoph Hellwig, Kirill A . Shutemov, Al Viro, Andrew Morton,
Bart Van Assche, open list:GFS2 FILE SYSTEM, Coly Li,
Dan Williams, open list:DEVICE-MAPPER (LVM),
open list:DRBD DRIVER, Eric Wheeler, Guoqing Jiang,
Hannes Reinecke, Hannes Reinecke, Jiri Kosina, Joe Perches,
Johannes Berg, Johannes Thumshirn, Keith Busch
Hi Ming,
can you send a first patch just doing the obvious cleanups like
converting to bio_add_page and replacing direct poking into the
bio with the proper accessors? That should help reducing the
actual series to a sane size, and it should also help to cut
down the Cc list.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 00/60] block: support multipage bvec
2016-10-31 15:25 ` [PATCH 00/60] block: support multipage bvec Christoph Hellwig
@ 2016-10-31 22:52 ` Ming Lei
0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2016-10-31 22:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Linux Kernel Mailing List, linux-block,
Linux FS Devel, Kirill A . Shutemov, Al Viro, Andrew Morton,
Bart Van Assche, open list:GFS2 FILE SYSTEM, Coly Li,
Dan Williams, open list:DEVICE-MAPPER (LVM),
open list:DRBD DRIVER, Eric Wheeler, Guoqing Jiang,
Hannes Reinecke, Hannes Reinecke, Jiri Kosina, Joe Perches,
Johannes Berg, Johannes Thumshirn, Kei
On Mon, Oct 31, 2016 at 11:25 PM, Christoph Hellwig <hch@infradead.org> wrote:
> Hi Ming,
>
> can you send a first patch just doing the obvious cleanups like
> converting to bio_add_page and replacing direct poking into the
> bio with the proper accessors? That should help reducing the
OK, that is just the 1st part of the patchset.
> actual series to a sane size, and it should also help to cut
> down the Cc list.
>
Thanks,
Ming Lei
^ permalink raw reply [flat|nested] 15+ messages in thread