public inbox for linux-bcache@vger.kernel.org
* [PATCH 00/60] block: support multipage bvec
@ 2016-10-29  8:07 Ming Lei
  2016-10-29  8:08 ` [PATCH 02/60] block drivers: convert to bio_init_with_vec_table() Ming Lei
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:07 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Al Viro, Andrew Morton,
	Bart Van Assche, open list:GFS2 FILE SYSTEM, Coly Li,
	Dan Williams, open list:DEVICE-MAPPER  LVM, open list:DRBD DRIVER,
	Eric Wheeler, Guoqing Jiang, Hannes Reinecke, Hannes Reinecke,
	Jiri Kosina, Joe Perches, Johannes Berg, Johannes Thumshirn,
	Keith Busch, Kent

Hi,

This patchset brings multipage bvecs into the block layer. Basic
xfstests (-a auto) runs over virtio-blk/virtio-scsi found no
regressions, so it should be good enough to show the approach
now. Any comments are welcome!

1) what is multipage bvec?

A multipage bvec means that one 'struct bio_vec' can hold
multiple physically contiguous pages, instead of the single
page per bvec that the Linux kernel has used for a long time.
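
As a rough user-space illustration of the idea (not code from this
patchset), the sketch below models a bvec by a page frame number
instead of a 'struct page' pointer and grows the last bvec whenever
the next page is physically contiguous. The names sketch_bvec,
sketch_add_page() and SKETCH_PAGE_SIZE are hypothetical, and a 4K
page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for struct bio_vec; pfn replaces the page pointer. */
struct sketch_bvec {
	unsigned long pfn;	/* page frame number of the first page */
	unsigned int len;	/* bytes covered, may exceed one page */
	unsigned int offset;	/* byte offset into the first page */
};

/*
 * Append one full page to the table. If the page physically follows the
 * end of the last bvec, grow that bvec; otherwise start a new one.
 * Returns the new number of used bvecs, or 0 if the table is full.
 */
size_t sketch_add_page(struct sketch_bvec *tbl, size_t used, size_t max,
		       unsigned long pfn)
{
	assert(tbl != NULL);

	if (used > 0) {
		struct sketch_bvec *last = &tbl[used - 1];
		unsigned long end = last->offset + last->len;

		/*
		 * Mergeable only if the last bvec ends on a page boundary
		 * and the new page is the very next frame.
		 */
		if (end % SKETCH_PAGE_SIZE == 0 &&
		    last->pfn + end / SKETCH_PAGE_SIZE == pfn) {
			last->len += SKETCH_PAGE_SIZE;
			return used;
		}
	}

	if (used == max)
		return 0;

	tbl[used].pfn = pfn;
	tbl[used].len = SKETCH_PAGE_SIZE;
	tbl[used].offset = 0;
	return used + 1;
}
```

With this model, adding frames 100, 101 and 102 leaves a single bvec
of three pages, while a following frame 200 starts a second bvec.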

2) why is multipage bvec introduced?

Kent proposed the idea[1] first. 

As system RAM becomes much bigger than before, and huge pages,
transparent huge pages and memory compaction are widely used,
it is now fairly common to see physically contiguous pages
inside the fs/block stack. On the other hand, from the block
layer's view, it isn't necessary to store the intermediate
pages in the bvec; it is enough to store just the physically
contiguous 'segment'.

Also, huge pages are being brought to filesystems[2]. Doing IO
one huge page at a time[3] requires that one bio can transfer
at least one huge page. It turns out that simply changing
BIO_MAX_PAGES isn't flexible[3]. Multipage bvec fits this case
very well.

With multipage bvec:

- bio size can be increased, which should improve some
high-bandwidth IO cases in theory[4].

- Inside the block layer, both bio splitting and sg mapping can
become more efficient than before by traversing just the
physically contiguous 'segments' instead of each page.

- there is a possibility of reducing the memory footprint of
bvec usage in the future.
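
To make the second point concrete, here is a simplified user-space
sketch (not the blk-merge code from this patchset) that counts
segments by walking one entry per contiguous bvec rather than one
entry per page. The function names and the max_seg parameter are
hypothetical; merging across adjacent bvecs and segment boundary
masks are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Segments needed for one physically contiguous bvec of 'len' bytes
 * when a single segment may hold at most 'max_seg' bytes.
 */
unsigned int sketch_segs_for_bvec(unsigned int len, unsigned int max_seg)
{
	assert(max_seg > 0);
	return (len + max_seg - 1) / max_seg;
}

/*
 * Total segment count for a table of bvec lengths. The loop runs once
 * per bvec, so a 2M huge page stored in one bvec costs one iteration
 * instead of 512 with 4K single-page bvecs.
 */
unsigned int sketch_count_segs(const unsigned int *bvec_len, size_t nr,
			       unsigned int max_seg)
{
	unsigned int segs = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		segs += sketch_segs_for_bvec(bvec_len[i], max_seg);

	return segs;
}
```

The point of the sketch is only the loop count: the work scales with
the number of contiguous segments, not with the number of pages.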

3) how is multipage bvec implemented in this patchset?

The first 22 patches clean up direct access to the bvec table
and add comments on some special cases. With this approach,
most cases are found to be safe for multipage bvec; only
fs/buffer, pktcdvd, dm-io, MD and btrfs need to be dealt
with.

Given that a little more work is involved in cleaning up pktcdvd,
MD and btrfs, this patchset introduces QUEUE_FLAG_NO_MP for
them, so these components can still see/use singlepage bvecs.
In the future, once the cleanup is done, the flag can be killed.

The 2nd part(23 ~ 60) implements multipage bvec in block:

- put all tricks into the bvec/bio/rq iterators; as long as
drivers and filesystems use these standard iterators, they work
with multipage bvecs unchanged

- bio_for_each_segment_all() changes
This helper passes a pointer to each bvec directly to the user, so
it has to be changed. Two new helpers (bio_for_each_segment_all_rd()
and bio_for_each_segment_all_wt()) are introduced.

- bio_clone() changes
By default bio_clone() still clones the new bio in the multipage bvec
way. A single-page version of bio_clone() is also introduced
for special cases in which the cloned bio must use only single-page
bvecs (bio bounce, ...).
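
The iterator idea in the first item can be sketched in user space as
follows. This is a simplified model and not the helpers added by this
patchset: it presents one multipage bvec to the caller as a sequence
of single-page pieces. The names sketch_bvec, sketch_sp_piece() and
sketch_count_sp_pieces() are hypothetical, and a 4K page size is
assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for struct bio_vec; pfn replaces the page pointer. */
struct sketch_bvec {
	unsigned long pfn;	/* page frame number of the first page */
	unsigned int len;	/* bytes covered, may exceed one page */
	unsigned int offset;	/* byte offset into the first page */
};

/*
 * Fill *sp with the single-page piece of *mp that starts 'done' bytes
 * into it. Returns the piece length in bytes, or 0 when 'done' is at
 * or past the end of *mp.
 */
unsigned int sketch_sp_piece(const struct sketch_bvec *mp, unsigned int done,
			     struct sketch_bvec *sp)
{
	unsigned int pos, room, left;

	assert(mp != NULL && sp != NULL);

	if (done >= mp->len)
		return 0;

	pos = mp->offset + done;		/* offset from the first page */
	sp->pfn = mp->pfn + pos / SKETCH_PAGE_SIZE;
	sp->offset = pos % SKETCH_PAGE_SIZE;

	room = SKETCH_PAGE_SIZE - sp->offset;	/* bytes left in this page */
	left = mp->len - done;			/* bytes left in the bvec */
	sp->len = left < room ? left : room;

	return sp->len;
}

/* Walk a multipage bvec page by page; returns the number of pieces. */
unsigned int sketch_count_sp_pieces(const struct sketch_bvec *mp)
{
	struct sketch_bvec sp;
	unsigned int done = 0, n = 0, step;

	while ((step = sketch_sp_piece(mp, done, &sp)) != 0) {
		done += step;
		n++;
	}

	return n;
}
```

A caller that only consumes such single-page pieces never needs to
know how many pages the underlying bvec really covers, which is the
reason for hiding the logic inside the iterators.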

These patches can be found in the following git tree:

	https://github.com/ming1/linux/tree/mp-bvec-0.3-v4.9

Thanks to Christoph for looking at the early version and providing
very good suggestions, such as introducing bio_init_with_vec_table(),
removing other unnecessary helpers for cleanup, and so on.

TODO:
	- cleanup direct access to bvec table for MD & btrfs


[1], http://marc.info/?l=linux-kernel&m=141680246629547&w=2
[2], http://lwn.net/Articles/700781/
[3], http://marc.info/?t=147735447100001&r=1&w=2
[4], http://marc.info/?l=linux-mm&m=147745525801433&w=2


Ming Lei (60):
  block: bio: introduce bio_init_with_vec_table()
  block drivers: convert to bio_init_with_vec_table()
  block: drbd: remove impossible failure handling
  block: floppy: use bio_add_page()
  target: avoid to access .bi_vcnt directly
  bcache: debug: avoid to access .bi_io_vec directly
  dm: crypt: use bio_add_page()
  dm: use bvec iterator helpers to implement .get_page and .next_page
  dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
  fs: logfs: convert to bio_add_page() in sync_request()
  fs: logfs: use bio_add_page() in __bdev_writeseg()
  fs: logfs: use bio_add_page() in do_erase()
  fs: logfs: remove unnecesary check
  block: drbd: comment on direct access bvec table
  block: loop: comment on direct access to bvec table
  block: pktcdvd: comment on direct access to bvec table
  kernel/power/swap.c: comment on direct access to bvec table
  mm: page_io.c: comment on direct access to bvec table
  fs/buffer: comment on direct access to bvec table
  f2fs: f2fs_read_end_io: comment on direct access to bvec table
  bcache: comment on direct access to bvec table
  block: comment on bio_alloc_pages()
  block: introduce flag QUEUE_FLAG_NO_MP
  md: set NO_MP for request queue of md
  block: pktcdvd: set NO_MP for pktcdvd request queue
  btrfs: set NO_MP for request queues behind BTRFS
  block: introduce BIO_SP_MAX_SECTORS
  block: introduce QUEUE_FLAG_SPLIT_MP
  dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT
  bcache: set flag of QUEUE_FLAG_SPLIT_MP
  block: introduce multipage/single page bvec helpers
  block: implement sp version of bvec iterator helpers
  block: introduce bio_for_each_segment_mp()
  block: introduce bio_clone_sp()
  bvec_iter: introduce BVEC_ITER_ALL_INIT
  block: bounce: avoid direct access to bvec from bio->bi_io_vec
  block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq
  block: bounce: convert multipage bvecs into singlepage
  bcache: debug: switch to bio_clone_sp()
  blk-merge: compute bio->bi_seg_front_size efficiently
  block: blk-merge: try to make front segments in full size
  block: use bio_for_each_segment_mp() to compute segments count
  block: use bio_for_each_segment_mp() to map sg
  block: introduce bvec_for_each_sp_bvec()
  block: bio: introduce bio_for_each_segment_all_rd() and its write pair
  block: deal with dirtying pages for multipage bvec
  block: convert to bio_for_each_segment_all_rd()
  fs/mpage: convert to bio_for_each_segment_all_rd()
  fs/direct-io: convert to bio_for_each_segment_all_rd()
  ext4: convert to bio_for_each_segment_all_rd()
  xfs: convert to bio_for_each_segment_all_rd()
  logfs: convert to bio_for_each_segment_all_rd()
  gfs2: convert to bio_for_each_segment_all_rd()
  f2fs: convert to bio_for_each_segment_all_rd()
  exofs: convert to bio_for_each_segment_all_rd()
  fs: crypto: convert to bio_for_each_segment_all_rd()
  bcache: convert to bio_for_each_segment_all_rd()
  dm-crypt: convert to bio_for_each_segment_all_rd()
  fs/buffer.c: use bvec iterator to truncate the bio
  block: enable multipage bvecs

 block/bio.c                        | 104 ++++++++++++++----
 block/blk-merge.c                  | 216 +++++++++++++++++++++++++++++--------
 block/bounce.c                     |  80 ++++++++++----
 drivers/block/drbd/drbd_bitmap.c   |   1 +
 drivers/block/drbd/drbd_receiver.c |  14 +--
 drivers/block/floppy.c             |  10 +-
 drivers/block/loop.c               |   5 +
 drivers/block/pktcdvd.c            |   8 ++
 drivers/md/bcache/btree.c          |   4 +-
 drivers/md/bcache/debug.c          |  19 +++-
 drivers/md/bcache/io.c             |   4 +-
 drivers/md/bcache/journal.c        |   4 +-
 drivers/md/bcache/movinggc.c       |   7 +-
 drivers/md/bcache/super.c          |  25 +++--
 drivers/md/bcache/util.c           |   7 ++
 drivers/md/bcache/writeback.c      |   6 +-
 drivers/md/dm-bufio.c              |   4 +-
 drivers/md/dm-crypt.c              |  11 +-
 drivers/md/dm-io.c                 |  34 ++++--
 drivers/md/dm-rq.c                 |   3 +-
 drivers/md/dm.c                    |  11 +-
 drivers/md/md.c                    |  12 +++
 drivers/md/raid5.c                 |   9 +-
 drivers/nvme/target/io-cmd.c       |   4 +-
 drivers/target/target_core_pscsi.c |   8 +-
 fs/btrfs/volumes.c                 |   3 +
 fs/buffer.c                        |  24 +++--
 fs/crypto/crypto.c                 |   3 +-
 fs/direct-io.c                     |   4 +-
 fs/exofs/ore.c                     |   3 +-
 fs/exofs/ore_raid.c                |   3 +-
 fs/ext4/page-io.c                  |   3 +-
 fs/ext4/readpage.c                 |   3 +-
 fs/f2fs/data.c                     |  13 ++-
 fs/gfs2/lops.c                     |   3 +-
 fs/gfs2/meta_io.c                  |   3 +-
 fs/logfs/dev_bdev.c                | 110 +++++++------------
 fs/mpage.c                         |   3 +-
 fs/xfs/xfs_aops.c                  |   3 +-
 include/linux/bio.h                | 108 +++++++++++++++++--
 include/linux/blk_types.h          |   6 ++
 include/linux/blkdev.h             |   4 +
 include/linux/bvec.h               | 123 +++++++++++++++++++--
 kernel/power/swap.c                |   2 +
 mm/page_io.c                       |   1 +
 45 files changed, 759 insertions(+), 276 deletions(-)

-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH 02/60] block drivers: convert to bio_init_with_vec_table()
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-29  8:08 ` [PATCH 06/60] bcache: debug: avoid to access .bi_io_vec directly Ming Lei
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Jiri Kosina, Kent Overstreet,
	Shaohua Li, Alasdair Kergon, Mike Snitzer,
	maintainer:DEVICE-MAPPER LVM, Christoph Hellwig, Sagi Grimberg,
	Joern Engel, Prasad Joshi, Mike Christie, Hannes Reinecke,
	Rasmus Villemoes, Johannes Thumshirn, Guoqing Jiang

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/block/floppy.c        |  3 +--
 drivers/md/bcache/io.c        |  4 +---
 drivers/md/bcache/journal.c   |  4 +---
 drivers/md/bcache/movinggc.c  |  7 +++----
 drivers/md/bcache/super.c     | 13 ++++---------
 drivers/md/bcache/writeback.c |  6 +++---
 drivers/md/dm-bufio.c         |  4 +---
 drivers/md/raid5.c            |  9 ++-------
 drivers/nvme/target/io-cmd.c  |  4 +---
 fs/logfs/dev_bdev.c           |  4 +---
 10 files changed, 18 insertions(+), 40 deletions(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index e3d8e4ced4a2..cdc916a95137 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3806,8 +3806,7 @@ static int __floppy_read_block_0(struct block_device *bdev, int drive)
 
 	cbdata.drive = drive;
 
-	bio_init(&bio);
-	bio.bi_io_vec = &bio_vec;
+	bio_init_with_vec_table(&bio, &bio_vec, 1);
 	bio_vec.bv_page = page;
 	bio_vec.bv_len = size;
 	bio_vec.bv_offset = 0;
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index e97b0acf7b8d..af9489087cd3 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -24,9 +24,7 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
 	struct bbio *b = mempool_alloc(c->bio_meta, GFP_NOIO);
 	struct bio *bio = &b->bio;
 
-	bio_init(bio);
-	bio->bi_max_vecs	 = bucket_pages(c);
-	bio->bi_io_vec		 = bio->bi_inline_vecs;
+	bio_init_with_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));
 
 	return bio;
 }
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 6925023e12d4..b966f28d1b98 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -448,13 +448,11 @@ static void do_journal_discard(struct cache *ca)
 
 		atomic_set(&ja->discard_in_flight, DISCARD_IN_FLIGHT);
 
-		bio_init(bio);
+		bio_init_with_vec_table(bio, bio->bi_inline_vecs, 1);
 		bio_set_op_attrs(bio, REQ_OP_DISCARD, 0);
 		bio->bi_iter.bi_sector	= bucket_to_sector(ca->set,
 						ca->sb.d[ja->discard_idx]);
 		bio->bi_bdev		= ca->bdev;
-		bio->bi_max_vecs	= 1;
-		bio->bi_io_vec		= bio->bi_inline_vecs;
 		bio->bi_iter.bi_size	= bucket_bytes(ca);
 		bio->bi_end_io		= journal_discard_endio;
 
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index 5c4bddecfaf0..9d7991f69030 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -77,15 +77,14 @@ static void moving_init(struct moving_io *io)
 {
 	struct bio *bio = &io->bio.bio;
 
-	bio_init(bio);
+	bio_init_with_vec_table(bio, bio->bi_inline_vecs,
+				DIV_ROUND_UP(KEY_SIZE(&io->w->key),
+					     PAGE_SECTORS));
 	bio_get(bio);
 	bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));
 
 	bio->bi_iter.bi_size	= KEY_SIZE(&io->w->key) << 9;
-	bio->bi_max_vecs	= DIV_ROUND_UP(KEY_SIZE(&io->w->key),
-					       PAGE_SECTORS);
 	bio->bi_private		= &io->cl;
-	bio->bi_io_vec		= bio->bi_inline_vecs;
 	bch_bio_map(bio, NULL);
 }
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 849ad441cd76..d8a6d807b498 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1152,9 +1152,7 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 	dc->bdev = bdev;
 	dc->bdev->bd_holder = dc;
 
-	bio_init(&dc->sb_bio);
-	dc->sb_bio.bi_max_vecs	= 1;
-	dc->sb_bio.bi_io_vec	= dc->sb_bio.bi_inline_vecs;
+	bio_init_with_vec_table(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
 	dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
 	get_page(sb_page);
 
@@ -1814,9 +1812,8 @@ static int cache_alloc(struct cache *ca)
 	__module_get(THIS_MODULE);
 	kobject_init(&ca->kobj, &bch_cache_ktype);
 
-	bio_init(&ca->journal.bio);
-	ca->journal.bio.bi_max_vecs = 8;
-	ca->journal.bio.bi_io_vec = ca->journal.bio.bi_inline_vecs;
+	bio_init_with_vec_table(&ca->journal.bio,
+				ca->journal.bio.bi_inline_vecs, 8);
 
 	free = roundup_pow_of_two(ca->sb.nbuckets) >> 10;
 
@@ -1852,9 +1849,7 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 	ca->bdev = bdev;
 	ca->bdev->bd_holder = ca;
 
-	bio_init(&ca->sb_bio);
-	ca->sb_bio.bi_max_vecs	= 1;
-	ca->sb_bio.bi_io_vec	= ca->sb_bio.bi_inline_vecs;
+	bio_init_with_vec_table(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
 	ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
 	get_page(sb_page);
 
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index e51644e503a5..b2568cef8c86 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -106,14 +106,14 @@ static void dirty_init(struct keybuf_key *w)
 	struct dirty_io *io = w->private;
 	struct bio *bio = &io->bio;
 
-	bio_init(bio);
+	bio_init_with_vec_table(bio, bio->bi_inline_vecs,
+				DIV_ROUND_UP(KEY_SIZE(&w->key),
+					     PAGE_SECTORS));
 	if (!io->dc->writeback_percent)
 		bio_set_prio(bio, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));
 
 	bio->bi_iter.bi_size	= KEY_SIZE(&w->key) << 9;
-	bio->bi_max_vecs	= DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS);
 	bio->bi_private		= w;
-	bio->bi_io_vec		= bio->bi_inline_vecs;
 	bch_bio_map(bio, NULL);
 }
 
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 125aedc3875f..5b13e7e7c8aa 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -611,9 +611,7 @@ static void use_inline_bio(struct dm_buffer *b, int rw, sector_t block,
 	char *ptr;
 	int len;
 
-	bio_init(&b->bio);
-	b->bio.bi_io_vec = b->bio_vec;
-	b->bio.bi_max_vecs = DM_BUFIO_INLINE_VECS;
+	bio_init_with_vec_table(&b->bio, b->bio_vec, DM_BUFIO_INLINE_VECS);
 	b->bio.bi_iter.bi_sector = block << b->c->sectors_per_block_bits;
 	b->bio.bi_bdev = b->c->bdev;
 	b->bio.bi_end_io = inline_endio;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 92ac251e91e6..eae7b4cf34d4 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2004,13 +2004,8 @@ static struct stripe_head *alloc_stripe(struct kmem_cache *sc, gfp_t gfp,
 		for (i = 0; i < disks; i++) {
 			struct r5dev *dev = &sh->dev[i];
 
-			bio_init(&dev->req);
-			dev->req.bi_io_vec = &dev->vec;
-			dev->req.bi_max_vecs = 1;
-
-			bio_init(&dev->rreq);
-			dev->rreq.bi_io_vec = &dev->rvec;
-			dev->rreq.bi_max_vecs = 1;
+			bio_init_with_vec_table(&dev->req, &dev->vec, 1);
+			bio_init_with_vec_table(&dev->rreq, &dev->rvec, 1);
 		}
 	}
 	return sh;
diff --git a/drivers/nvme/target/io-cmd.c b/drivers/nvme/target/io-cmd.c
index 4a96c2049b7b..6a32b0b68b1e 100644
--- a/drivers/nvme/target/io-cmd.c
+++ b/drivers/nvme/target/io-cmd.c
@@ -37,9 +37,7 @@ static void nvmet_inline_bio_init(struct nvmet_req *req)
 {
 	struct bio *bio = &req->inline_bio;
 
-	bio_init(bio);
-	bio->bi_max_vecs = NVMET_MAX_INLINE_BIOVEC;
-	bio->bi_io_vec = req->inline_bvec;
+	bio_init_with_vec_table(bio, req->inline_bvec, NVMET_MAX_INLINE_BIOVEC);
 }
 
 static void nvmet_execute_rw(struct nvmet_req *req)
diff --git a/fs/logfs/dev_bdev.c b/fs/logfs/dev_bdev.c
index a8329cc47dec..2bf53b0ffe83 100644
--- a/fs/logfs/dev_bdev.c
+++ b/fs/logfs/dev_bdev.c
@@ -19,9 +19,7 @@ static int sync_request(struct page *page, struct block_device *bdev, int op)
 	struct bio bio;
 	struct bio_vec bio_vec;
 
-	bio_init(&bio);
-	bio.bi_max_vecs = 1;
-	bio.bi_io_vec = &bio_vec;
+	bio_init_with_vec_table(&bio, &bio_vec, 1);
 	bio_vec.bv_page = page;
 	bio_vec.bv_len = PAGE_SIZE;
 	bio_vec.bv_offset = 0;
-- 
2.7.4



* [PATCH 06/60] bcache: debug: avoid to access .bi_io_vec directly
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
  2016-10-29  8:08 ` [PATCH 02/60] block drivers: convert to bio_init_with_vec_table() Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-29  8:08 ` [PATCH 21/60] bcache: comment on direct access to bvec table Ming Lei
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Kent Overstreet, Shaohua Li,
	Mike Christie, Hannes Reinecke, Guoqing Jiang,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

Use the standard bvec iterator helpers to walk the cloned bio instead.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/debug.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 333a1e5f6ae6..430f3050663c 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -107,8 +107,8 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 {
 	char name[BDEVNAME_SIZE];
 	struct bio *check;
-	struct bio_vec bv;
-	struct bvec_iter iter;
+	struct bio_vec bv, cbv;
+	struct bvec_iter iter, citer = { 0 };
 
 	check = bio_clone(bio, GFP_NOIO);
 	if (!check)
@@ -120,9 +120,13 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 
 	submit_bio_wait(check);
 
+	citer.bi_size = UINT_MAX;
 	bio_for_each_segment(bv, bio, iter) {
 		void *p1 = kmap_atomic(bv.bv_page);
-		void *p2 = page_address(check->bi_io_vec[iter.bi_idx].bv_page);
+		void *p2;
+
+		cbv = bio_iter_iovec(check, citer);
+		p2 = page_address(cbv.bv_page);
 
 		cache_set_err_on(memcmp(p1 + bv.bv_offset,
 					p2 + bv.bv_offset,
@@ -133,6 +137,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 				 (uint64_t) bio->bi_iter.bi_sector);
 
 		kunmap_atomic(p1);
+		bio_advance_iter(check, &citer, bv.bv_len);
 	}
 
 	bio_free_pages(check);
-- 
2.7.4



* [PATCH 21/60] bcache: comment on direct access to bvec table
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
  2016-10-29  8:08 ` [PATCH 02/60] block drivers: convert to bio_init_with_vec_table() Ming Lei
  2016-10-29  8:08 ` [PATCH 06/60] bcache: debug: avoid to access .bi_io_vec directly Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-29  8:08 ` [PATCH 22/60] block: comment on bio_alloc_pages() Ming Lei
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Kent Overstreet, Shaohua Li,
	Mike Christie, Hannes Reinecke, Guoqing Jiang, Jiri Kosina,
	Zheng Liu, Eric Wheeler, Yijing Wang, Coly Li, Al Viro,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

All of these accesses look safe once multipage bvec is supported.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/btree.c | 1 +
 drivers/md/bcache/super.c | 6 ++++++
 drivers/md/bcache/util.c  | 7 +++++++
 3 files changed, 14 insertions(+)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 81d3db40cd7b..b419bc91ba32 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -428,6 +428,7 @@ static void do_btree_node_write(struct btree *b)
 
 		continue_at(cl, btree_node_write_done, NULL);
 	} else {
+		/* No harm for multipage bvec since the new is just allocated */
 		b->bio->bi_vcnt = 0;
 		bch_bio_map(b->bio, i);
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index d8a6d807b498..52876fcf2b36 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -207,6 +207,7 @@ static void write_bdev_super_endio(struct bio *bio)
 
 static void __write_super(struct cache_sb *sb, struct bio *bio)
 {
+	/* single page bio, safe for multipage bvec */
 	struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page);
 	unsigned i;
 
@@ -1153,6 +1154,8 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 	dc->bdev->bd_holder = dc;
 
 	bio_init_with_vec_table(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
+
+	/* single page bio, safe for multipage bvec */
 	dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
 	get_page(sb_page);
 
@@ -1794,6 +1797,7 @@ void bch_cache_release(struct kobject *kobj)
 	for (i = 0; i < RESERVE_NR; i++)
 		free_fifo(&ca->free[i]);
 
+	/* single page bio, safe for multipage bvec */
 	if (ca->sb_bio.bi_inline_vecs[0].bv_page)
 		put_page(ca->sb_bio.bi_io_vec[0].bv_page);
 
@@ -1850,6 +1854,8 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 	ca->bdev->bd_holder = ca;
 
 	bio_init_with_vec_table(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
+
+	/* single page bio, safe for multipage bvec */
 	ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
 	get_page(sb_page);
 
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index dde6172f3f10..5cc0b49a65fb 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -222,6 +222,13 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
 		: 0;
 }
 
+/*
+ * Generally it isn't good to access .bi_io_vec and .bi_vcnt
+ * directly, the preferred way is bio_add_page, but in
+ * this case, bch_bio_map() supposes that the bvec table
+ * is empty, so it is safe to access .bi_vcnt & .bi_io_vec
+ * in this way even after multipage bvec is supported.
+ */
 void bch_bio_map(struct bio *bio, void *base)
 {
 	size_t size = bio->bi_iter.bi_size;
-- 
2.7.4



* [PATCH 22/60] block: comment on bio_alloc_pages()
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
                   ` (2 preceding siblings ...)
  2016-10-29  8:08 ` [PATCH 21/60] bcache: comment on direct access to bvec table Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-29  8:08 ` [PATCH 30/60] bcache: set flag of QUEUE_FLAG_SPLIT_MP Ming Lei
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Jens Axboe, Kent Overstreet,
	Shaohua Li, Mike Christie, Guoqing Jiang, Hannes Reinecke,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

This patch adds a comment on the usage of bio_alloc_pages(), and
also comments on one special case in bch_data_verify().

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/bio.c               | 4 +++-
 drivers/md/bcache/debug.c | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index db85c5753a76..a49d1d89a85c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -907,7 +907,9 @@ EXPORT_SYMBOL(bio_advance);
  * @bio: bio to allocate pages for
  * @gfp_mask: flags for allocation
  *
- * Allocates pages up to @bio->bi_vcnt.
+ * Allocates pages up to @bio->bi_vcnt. This function should only be
+ * called on a newly initialized bio, which means no page has been added
+ * to the bio via bio_add_page() yet.
  *
  * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
  * freed.
diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 430f3050663c..71a9f05918eb 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -110,6 +110,12 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 	struct bio_vec bv, cbv;
 	struct bvec_iter iter, citer = { 0 };
 
+	/*
+	 * Once multipage bvec is supported, the bio_clone()
+	 * has to make sure page count in this bio can be held
+	 * in the new cloned bio because each single page need
+	 * to assign to each bvec of the new bio.
+	 */
 	check = bio_clone(bio, GFP_NOIO);
 	if (!check)
 		return;
-- 
2.7.4


* [PATCH 30/60] bcache: set flag of QUEUE_FLAG_SPLIT_MP
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
                   ` (3 preceding siblings ...)
  2016-10-29  8:08 ` [PATCH 22/60] block: comment on bio_alloc_pages() Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-29  8:08 ` [PATCH 39/60] bcache: debug: switch to bio_clone_sp() Ming Lei
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Kent Overstreet, Shaohua Li,
	Eric Wheeler, Coly Li, Yijing Wang, Zheng Liu, Mike Christie,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

It isn't safe (see bch_data_verify(), for example) to let bcache
handle bios bigger than 1M built from multipage bvecs, so set this
flag to keep the size of incoming bios within BIO_SP_MAX_SECTORS.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/super.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 52876fcf2b36..fca023a1a026 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -821,6 +821,12 @@ static int bcache_device_init(struct bcache_device *d, unsigned block_size,
 
 	blk_queue_write_cache(q, true, true);
 
+	/*
+	 * Once bcache is audited that it is ready to deal with big
+	 * incoming bio with multipage bvecs, we can remove the flag.
+	 */
+	set_bit(QUEUE_FLAG_SPLIT_MP,	&d->disk->queue->queue_flags);
+
 	return 0;
 }
 
-- 
2.7.4


* [PATCH 39/60] bcache: debug: switch to bio_clone_sp()
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
                   ` (4 preceding siblings ...)
  2016-10-29  8:08 ` [PATCH 30/60] bcache: set flag of QUEUE_FLAG_SPLIT_MP Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-29  8:08 ` [PATCH 57/60] bcache: convert to bio_for_each_segment_all_rd() Ming Lei
  2016-10-31 15:25 ` [PATCH 00/60] block: support multipage bvec Christoph Hellwig
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Kent Overstreet, Shaohua Li,
	Mike Christie, Hannes Reinecke, Guoqing Jiang,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

The cloned bio has to be based on singlepage bvecs, so
use bio_clone_sp(). The allocated bvec table is big
enough to hold the bvecs because QUEUE_FLAG_SPLIT_MP
is set for bcache.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/debug.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 71a9f05918eb..0735015b0842 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -111,12 +111,10 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 	struct bvec_iter iter, citer = { 0 };
 
 	/*
-	 * Once multipage bvec is supported, the bio_clone()
-	 * has to make sure page count in this bio can be held
-	 * in the new cloned bio because each single page need
-	 * to assign to each bvec of the new bio.
+	 * QUEUE_FLAG_SPLIT_MP can make the cloned singlepage
+	 * bvecs to be held in the allocated bvec table.
 	 */
-	check = bio_clone(bio, GFP_NOIO);
+	check = bio_clone_sp(bio, GFP_NOIO);
 	if (!check)
 		return;
 	bio_set_op_attrs(check, REQ_OP_READ, READ_SYNC);
-- 
2.7.4


* [PATCH 57/60] bcache: convert to bio_for_each_segment_all_rd()
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
                   ` (5 preceding siblings ...)
  2016-10-29  8:08 ` [PATCH 39/60] bcache: debug: switch to bio_clone_sp() Ming Lei
@ 2016-10-29  8:08 ` Ming Lei
  2016-10-31 15:25 ` [PATCH 00/60] block: support multipage bvec Christoph Hellwig
  7 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-29  8:08 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Kent Overstreet, Shaohua Li,
	Hannes Reinecke, Jiri Kosina, Mike Christie, Guoqing Jiang,
	Zheng Liu, open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/btree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index b419bc91ba32..89abada6a091 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -419,8 +419,9 @@ static void do_btree_node_write(struct btree *b)
 		int j;
 		struct bio_vec *bv;
 		void *base = (void *) ((unsigned long) i & ~(PAGE_SIZE - 1));
+		struct bvec_iter_all bia;
 
-		bio_for_each_segment_all(bv, b->bio, j)
+		bio_for_each_segment_all_rd(bv, b->bio, j, bia)
 			memcpy(page_address(bv->bv_page),
 			       base + j * PAGE_SIZE, PAGE_SIZE);
 
-- 
2.7.4


* Re: [PATCH 00/60] block: support multipage bvec
  2016-10-29  8:07 [PATCH 00/60] block: support multipage bvec Ming Lei
                   ` (6 preceding siblings ...)
  2016-10-29  8:08 ` [PATCH 57/60] bcache: convert to bio_for_each_segment_all_rd() Ming Lei
@ 2016-10-31 15:25 ` Christoph Hellwig
  2016-10-31 22:52   ` Ming Lei
  7 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2016-10-31 15:25 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, linux-block, linux-fsdevel,
	Christoph Hellwig, Kirill A . Shutemov, Al Viro, Andrew Morton,
	Bart Van Assche, open list:GFS2 FILE SYSTEM, Coly Li,
	Dan Williams, open list:DEVICE-MAPPER  (LVM),
	open list:DRBD DRIVER, Eric Wheeler, Guoqing Jiang,
	Hannes Reinecke, Hannes Reinecke, Jiri Kosina, Joe Perches,
	Johannes Berg, Johannes Thumshirn, Keith Busch

Hi Ming,

can you send a first patch just doing the obvious cleanups like
converting to bio_add_page and replacing direct poking into the
bio with the proper accessors?  That should help reducing the
actual series to a sane size, and it should also help to cut
down the Cc list.


* Re: [PATCH 00/60] block: support multipage bvec
  2016-10-31 15:25 ` [PATCH 00/60] block: support multipage bvec Christoph Hellwig
@ 2016-10-31 22:52   ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2016-10-31 22:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block,
	Linux FS Devel, Kirill A . Shutemov, Al Viro, Andrew Morton,
	Bart Van Assche, open list:GFS2 FILE SYSTEM, Coly Li,
	Dan Williams, open list:DEVICE-MAPPER (LVM),
	open list:DRBD DRIVER, Eric Wheeler, Guoqing Jiang,
	Hannes Reinecke, Hannes Reinecke, Jiri Kosina, Joe Perches,
	Johannes Berg, Johannes Thumshirn, Kei

On Mon, Oct 31, 2016 at 11:25 PM, Christoph Hellwig <hch@infradead.org> wrote:
> Hi Ming,
>
> can you send a first patch just doing the obvious cleanups like
> converting to bio_add_page and replacing direct poking into the
> bio with the proper accessors?  That should help reducing the

OK, that is just the 1st part of the patchset.

> actual series to a sane size, and it should also help to cut
> down the Cc list.
>



Thanks,
Ming Lei

