linux-raid.vger.kernel.org archive mirror
* [PATCH v1 00/54] block: support multipage bvec
@ 2016-12-27 15:55 Ming Lei
  2016-12-27 15:55 ` [PATCH v1 07/54] bcache: comment on direct access to bvec table Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Ming Lei @ 2016-12-27 15:55 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Ming Lei, Al Viro, Andrew Morton,
	Bart Van Assche, Chaitanya Kulkarni, open list:GFS2 FILE SYSTEM,
	Damien Le Moal, Dan Williams, open list:DEVICE-MAPPER  LVM,
	open list:DRBD DRIVER, Eric Wheeler, Guoqing Jiang,
	Hannes Reinecke, Hannes Reinecke, Jiri Kosina, Joe Perches,
	Johannes Berg, Johannes Thumshirn, Kent Overstreet, linux-bcache

Hi,

This patchset brings multipage bvec support into the block layer.
Basic xfstests (-a auto) runs over virtio-blk/virtio-scsi show no
regressions, so it should be good enough to show the approach now.
Any comments are welcome!

1) what is multipage bvec?

Multipage bvec means that one 'struct bio_vec' can hold multiple
physically contiguous pages, instead of the single page per bvec
that the Linux kernel has used for a long time.

2) why is multipage bvec introduced?

Kent proposed the idea[1] first. 

As systems' RAM becomes much bigger than before, and huge pages,
transparent huge pages and memory compaction are widely used, it
is now fairly common to see physically contiguous pages coming
from filesystems in I/O. On the other hand, from the block layer's
view, it isn't necessary to store the intermediate pages in the
bvec; it is enough to store just the physically contiguous
'segment'.

Also, huge pages are being brought to filesystems[2], so that we
can do I/O a huge page at a time[3]. That requires that one bio can
transfer at least one huge page at a time. It turns out that simply
changing BIO_MAX_PAGES isn't flexible[3]. Multipage bvec fits this
case very well.

With multipage bvec:

- bio size can be increased, which should in theory improve some
high-bandwidth I/O cases[4].

- Inside the block layer, both bio splitting and sg mapping can
become more efficient than before by traversing just the
physically contiguous 'segments' instead of each page.

- There is the possibility of reducing the memory footprint of
bvec usage in the future.

3) how is multipage bvec implemented in this patchset?

The first 9 patches comment on some special cases. As we saw,
most of the cases are found to be safe for multipage bvec;
only fs/buffer, MD and btrfs need to be dealt with. Both fs/buffer
and btrfs are dealt with in the following patches, based on some
new block APIs for multipage bvec.

Given that a little more work is involved to clean up MD, this
patchset introduces QUEUE_FLAG_NO_MP for it, so that this component
can still see/use singlepage bvecs. In the future, once the cleanup
is done, the flag can be killed.

The 2nd part (23 ~ 54) implements multipage bvec in block:

- put all tricks into the bvec/bio/rq iterators; as long as
drivers and filesystems use these standard iterators, they are
happy with multipage bvec

- bio_for_each_segment_all() changes
this helper passes a pointer to each bvec directly to the user, so
it has to be changed. Two new helpers (bio_for_each_segment_all_sp()
and bio_for_each_segment_all_mp()) are introduced.

Also convert the current users of bio_for_each_segment_all() to
the above two.

- bio_clone() changes
By default, bio_clone() still clones a new bio in the multipage
bvec way. A single page version of bio_clone() is also introduced
for some special cases in which only singlepage bvecs are used
for the newly cloned bio (bio bounce, ...)

- btrfs cleanup
just three patches for avoiding direct access to the bvec table.

These patches can be found in the following git tree:

	https://github.com/ming1/linux/commits/mp-bvec-0.6-v4.10-rc

Thanks to Christoph for looking at the early version and providing
very good suggestions, such as: introduce bio_init_with_vec_table(),
remove other unnecessary helpers for cleanup, and so on.

TODO:
	- cleanup direct access to bvec table for MD

V1:
	- against v4.10-rc1; some cleanups from V0 are in -linus already
	- handle queue_virt_boundary() in mp bvec change and make NVMe happy
	- further BTRFS cleanup
	- remove QUEUE_FLAG_SPLIT_MP
	- rename the two new helpers of bio_for_each_segment_all()
	- fix bounce conversion
	- address comments on V0

[1], http://marc.info/?l=linux-kernel&m=141680246629547&w=2
[2], https://patchwork.kernel.org/patch/9451523/
[3], http://marc.info/?t=147735447100001&r=1&w=2
[4], http://marc.info/?l=linux-mm&m=147745525801433&w=2


Ming Lei (54):
  block: drbd: comment on direct access bvec table
  block: loop: comment on direct access to bvec table
  kernel/power/swap.c: comment on direct access to bvec table
  mm: page_io.c: comment on direct access to bvec table
  fs/buffer: comment on direct access to bvec table
  f2fs: f2fs_read_end_io: comment on direct access to bvec table
  bcache: comment on direct access to bvec table
  block: comment on bio_alloc_pages()
  block: comment on bio_iov_iter_get_pages()
  block: introduce flag QUEUE_FLAG_NO_MP
  md: set NO_MP for request queue of md
  dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE
  block: comments on bio_for_each_segment[_all]
  block: introduce multipage/single page bvec helpers
  block: implement sp version of bvec iterator helpers
  block: introduce bio_for_each_segment_mp()
  block: introduce bio_clone_sp()
  bvec_iter: introduce BVEC_ITER_ALL_INIT
  block: bounce: avoid direct access to bvec table
  block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq
  block: introduce bio_can_convert_to_sp()
  block: bounce: convert multipage bvecs into singlepage
  bcache: handle bio_clone() & bvec updating for multipage bvecs
  blk-merge: compute bio->bi_seg_front_size efficiently
  block: blk-merge: try to make front segments in full size
  block: blk-merge: remove unnecessary check
  block: use bio_for_each_segment_mp() to compute segments count
  block: use bio_for_each_segment_mp() to map sg
  block: introduce bvec_for_each_sp_bvec()
  block: bio: introduce single/multi page version of
    bio_for_each_segment_all()
  block: introduce bio_segments_all()
  block: introduce bvec_get_last_sp()
  block: deal with dirtying pages for multipage bvec
  block: convert to singe/multi page version of
    bio_for_each_segment_all()
  bcache: convert to bio_for_each_segment_all_sp()
  dm-crypt: don't clear bvec->bv_page in crypt_free_buffer_pages()
  dm-crypt: convert to bio_for_each_segment_all_sp()
  md/raid1.c: convert to bio_for_each_segment_all_sp()
  fs/mpage: convert to bio_for_each_segment_all_sp()
  fs/direct-io: convert to bio_for_each_segment_all_sp()
  ext4: convert to bio_for_each_segment_all_sp()
  xfs: convert to bio_for_each_segment_all_sp()
  gfs2: convert to bio_for_each_segment_all_sp()
  f2fs: convert to bio_for_each_segment_all_sp()
  exofs: convert to bio_for_each_segment_all_sp()
  fs: crypto: convert to bio_for_each_segment_all_sp()
  fs/btrfs: convert to bio_for_each_segment_all_sp()
  fs/block_dev.c: convert to bio_for_each_segment_all_sp()
  fs/iomap.c: convert to bio_for_each_segment_all_sp()
  fs/buffer.c: use bvec iterator to truncate the bio
  btrfs: avoid access to .bi_vcnt directly
  btrfs: use bvec_get_last_sp to get the last singlepage bvec
  btrfs: comment on direct access bvec table
  block: enable multipage bvecs

 block/bio.c                      | 110 +++++++++++++++----
 block/blk-merge.c                | 227 +++++++++++++++++++++++++++++++--------
 block/blk-zoned.c                |   5 +-
 block/bounce.c                   |  75 +++++++++----
 drivers/block/drbd/drbd_bitmap.c |   1 +
 drivers/block/loop.c             |   5 +
 drivers/md/bcache/btree.c        |   4 +-
 drivers/md/bcache/debug.c        |  30 +++++-
 drivers/md/bcache/super.c        |   6 ++
 drivers/md/bcache/util.c         |   7 ++
 drivers/md/dm-crypt.c            |   4 +-
 drivers/md/dm.c                  |  11 +-
 drivers/md/md.c                  |  12 +++
 drivers/md/raid1.c               |   3 +-
 fs/block_dev.c                   |   6 +-
 fs/btrfs/check-integrity.c       |  12 ++-
 fs/btrfs/compression.c           |  12 ++-
 fs/btrfs/disk-io.c               |   3 +-
 fs/btrfs/extent_io.c             |  26 +++--
 fs/btrfs/extent_io.h             |   1 +
 fs/btrfs/file-item.c             |   6 +-
 fs/btrfs/inode.c                 |  34 ++++--
 fs/btrfs/raid56.c                |   6 +-
 fs/buffer.c                      |  24 +++--
 fs/crypto/crypto.c               |   3 +-
 fs/direct-io.c                   |   4 +-
 fs/exofs/ore.c                   |   3 +-
 fs/exofs/ore_raid.c              |   3 +-
 fs/ext4/page-io.c                |   3 +-
 fs/ext4/readpage.c               |   3 +-
 fs/f2fs/data.c                   |  13 ++-
 fs/gfs2/lops.c                   |   3 +-
 fs/gfs2/meta_io.c                |   3 +-
 fs/iomap.c                       |   3 +-
 fs/mpage.c                       |   3 +-
 fs/xfs/xfs_aops.c                |   3 +-
 include/linux/bio.h              | 164 ++++++++++++++++++++++++++--
 include/linux/blk_types.h        |   6 ++
 include/linux/blkdev.h           |   2 +
 include/linux/bvec.h             | 138 ++++++++++++++++++++++--
 kernel/power/swap.c              |   2 +
 mm/page_io.c                     |   2 +
 42 files changed, 829 insertions(+), 162 deletions(-)

-- 
2.7.4



* [PATCH v1 07/54] bcache: comment on direct access to bvec table
  2016-12-27 15:55 [PATCH v1 00/54] block: support multipage bvec Ming Lei
@ 2016-12-27 15:55 ` Ming Lei
  2016-12-30 16:56   ` Coly Li
  2016-12-27 15:55 ` [PATCH v1 08/54] block: comment on bio_alloc_pages() Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Ming Lei @ 2016-12-27 15:55 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Ming Lei, Kent Overstreet,
	Shaohua Li, Guoqing Jiang, Zheng Liu, Mike Christie, Jiri Kosina,
	Eric Wheeler, Yijing Wang, Al Viro,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

All of these direct accesses look safe once multipage bvec is supported.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/btree.c | 1 +
 drivers/md/bcache/super.c | 6 ++++++
 drivers/md/bcache/util.c  | 7 +++++++
 3 files changed, 14 insertions(+)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index a43eedd5804d..fc35cfb4d0f1 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -428,6 +428,7 @@ static void do_btree_node_write(struct btree *b)
 
 		continue_at(cl, btree_node_write_done, NULL);
 	} else {
+		/* No harm for multipage bvec since the new is just allocated */
 		b->bio->bi_vcnt = 0;
 		bch_bio_map(b->bio, i);
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 3a19cbc8b230..607b022259dc 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -208,6 +208,7 @@ static void write_bdev_super_endio(struct bio *bio)
 
 static void __write_super(struct cache_sb *sb, struct bio *bio)
 {
+	/* single page bio, safe for multipage bvec */
 	struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page);
 	unsigned i;
 
@@ -1156,6 +1157,8 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 	dc->bdev->bd_holder = dc;
 
 	bio_init(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
+
+	/* single page bio, safe for multipage bvec */
 	dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
 	get_page(sb_page);
 
@@ -1799,6 +1802,7 @@ void bch_cache_release(struct kobject *kobj)
 	for (i = 0; i < RESERVE_NR; i++)
 		free_fifo(&ca->free[i]);
 
+	/* single page bio, safe for multipage bvec */
 	if (ca->sb_bio.bi_inline_vecs[0].bv_page)
 		put_page(ca->sb_bio.bi_io_vec[0].bv_page);
 
@@ -1854,6 +1858,8 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 	ca->bdev->bd_holder = ca;
 
 	bio_init(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
+
+	/* single page bio, safe for multipage bvec */
 	ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
 	get_page(sb_page);
 
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index dde6172f3f10..5cc0b49a65fb 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -222,6 +222,13 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
 		: 0;
 }
 
+/*
+ * Generally it isn't good to access .bi_io_vec and .bi_vcnt
+ * directly, the preferred way is bio_add_page, but in
+ * this case, bch_bio_map() supposes that the bvec table
+ * is empty, so it is safe to access .bi_vcnt & .bi_io_vec
+ * in this way even after multipage bvec is supported.
+ */
 void bch_bio_map(struct bio *bio, void *base)
 {
 	size_t size = bio->bi_iter.bi_size;
-- 
2.7.4


* [PATCH v1 08/54] block: comment on bio_alloc_pages()
  2016-12-27 15:55 [PATCH v1 00/54] block: support multipage bvec Ming Lei
  2016-12-27 15:55 ` [PATCH v1 07/54] bcache: comment on direct access to bvec table Ming Lei
@ 2016-12-27 15:55 ` Ming Lei
  2016-12-30 10:40   ` Coly Li
  2016-12-30 11:06   ` Coly Li
  2016-12-27 15:56 ` [PATCH v1 11/54] md: set NO_MP for request queue of md Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 11+ messages in thread
From: Ming Lei @ 2016-12-27 15:55 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Ming Lei, Jens Axboe,
	Kent Overstreet, Shaohua Li, Mike Christie, Guoqing Jiang,
	Hannes Reinecke, open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

This patch adds a comment on the usage of bio_alloc_pages(), and
also comments on one special case in bch_data_verify().

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/bio.c               | 4 +++-
 drivers/md/bcache/debug.c | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 2b375020fc49..d4a1e0b63ea0 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -961,7 +961,9 @@ EXPORT_SYMBOL(bio_advance);
  * @bio: bio to allocate pages for
  * @gfp_mask: flags for allocation
  *
- * Allocates pages up to @bio->bi_vcnt.
+ * Allocates pages up to @bio->bi_vcnt, and this function should only
+ * be called on a new initialized bio, which means all pages aren't added
+ * to the bio via bio_add_page() yet.
  *
  * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
  * freed.
diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 06f55056aaae..48d03e8b3385 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -110,6 +110,12 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 	struct bio_vec bv, cbv;
 	struct bvec_iter iter, citer = { 0 };
 
+	/*
+	 * Once multipage bvec is supported, the bio_clone()
+	 * has to make sure page count in this bio can be held
+	 * in the new cloned bio because each single page need
+	 * to assign to each bvec of the new bio.
+	 */
 	check = bio_clone(bio, GFP_NOIO);
 	if (!check)
 		return;
-- 
2.7.4



* [PATCH v1 11/54] md: set NO_MP for request queue of md
  2016-12-27 15:55 [PATCH v1 00/54] block: support multipage bvec Ming Lei
  2016-12-27 15:55 ` [PATCH v1 07/54] bcache: comment on direct access to bvec table Ming Lei
  2016-12-27 15:55 ` [PATCH v1 08/54] block: comment on bio_alloc_pages() Ming Lei
@ 2016-12-27 15:56 ` Ming Lei
  2016-12-27 15:56 ` [PATCH v1 12/54] dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE Ming Lei
  2016-12-27 15:56 ` [PATCH v1 23/54] bcache: handle bio_clone() & bvec updating for multipage bvecs Ming Lei
  4 siblings, 0 replies; 11+ messages in thread
From: Ming Lei @ 2016-12-27 15:56 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Ming Lei, Shaohua Li,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

MD isn't ready for multipage bvecs, so mark it as
NO_MP.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/md.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 82821ee0d57f..63c6326bafde 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5162,6 +5162,16 @@ static void md_safemode_timeout(unsigned long data)
 
 static int start_dirty_degraded;
 
+/*
+ * MD isn't ready for multipage bvecs yet, and set the flag
+ * so that MD still can see singlepage bvecs bio
+ */
+static inline void md_set_no_mp(struct mddev *mddev)
+{
+	if (mddev->queue)
+		set_bit(QUEUE_FLAG_NO_MP, &mddev->queue->queue_flags);
+}
+
 int md_run(struct mddev *mddev)
 {
 	int err;
@@ -5381,6 +5391,8 @@ int md_run(struct mddev *mddev)
 	if (mddev->sb_flags)
 		md_update_sb(mddev, 0);
 
+	md_set_no_mp(mddev);
+
 	md_new_event(mddev);
 	sysfs_notify_dirent_safe(mddev->sysfs_state);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
-- 
2.7.4


* [PATCH v1 12/54] dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE
  2016-12-27 15:55 [PATCH v1 00/54] block: support multipage bvec Ming Lei
                   ` (2 preceding siblings ...)
  2016-12-27 15:56 ` [PATCH v1 11/54] md: set NO_MP for request queue of md Ming Lei
@ 2016-12-27 15:56 ` Ming Lei
  2016-12-27 15:56 ` [PATCH v1 23/54] bcache: handle bio_clone() & bvec updating for multipage bvecs Ming Lei
  4 siblings, 0 replies; 11+ messages in thread
From: Ming Lei @ 2016-12-27 15:56 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Ming Lei, Alasdair Kergon,
	Mike Snitzer, maintainer:DEVICE-MAPPER LVM, Shaohua Li,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

For bio-based DM, some targets, such as the crypt target, aren't
ready to deal with incoming bios bigger than 1 Mbyte.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/dm.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 3086da5664f3..6139bf7623f7 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -899,7 +899,16 @@ int dm_set_target_max_io_len(struct dm_target *ti, sector_t len)
 		return -EINVAL;
 	}
 
-	ti->max_io_len = (uint32_t) len;
+	/*
+	 * BIO based queue uses its own splitting. When multipage bvecs
+	 * is switched on, size of the incoming bio may be too big to
+	 * be handled in some targets, such as crypt.
+	 *
+	 * When these targets are ready for the big bio, we can remove
+	 * the limit.
+	 */
+	ti->max_io_len = min_t(uint32_t, len,
+			       (BIO_MAX_PAGES * PAGE_SIZE));
 
 	return 0;
 }
-- 
2.7.4



* [PATCH v1 23/54] bcache: handle bio_clone() & bvec updating for multipage bvecs
  2016-12-27 15:55 [PATCH v1 00/54] block: support multipage bvec Ming Lei
                   ` (3 preceding siblings ...)
  2016-12-27 15:56 ` [PATCH v1 12/54] dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE Ming Lei
@ 2016-12-27 15:56 ` Ming Lei
  2016-12-30 11:01   ` Coly Li
  4 siblings, 1 reply; 11+ messages in thread
From: Ming Lei @ 2016-12-27 15:56 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Ming Lei, Kent Overstreet,
	Shaohua Li, Mike Christie, Guoqing Jiang,
	open list:BCACHE BLOCK LAYER CACHE,
	open list:SOFTWARE RAID Multiple Disks SUPPORT

The incoming bio may be too big to be cloned into
one singlepage bvecs bio, so split the bio and
check the split bios one by one.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/debug.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 48d03e8b3385..18b2d2d138e3 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -103,7 +103,7 @@ void bch_btree_verify(struct btree *b)
 	up(&b->io_mutex);
 }
 
-void bch_data_verify(struct cached_dev *dc, struct bio *bio)
+static void __bch_data_verify(struct cached_dev *dc, struct bio *bio)
 {
 	char name[BDEVNAME_SIZE];
 	struct bio *check;
@@ -116,7 +116,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 	 * in the new cloned bio because each single page need
 	 * to assign to each bvec of the new bio.
 	 */
-	check = bio_clone(bio, GFP_NOIO);
+	check = bio_clone_sp(bio, GFP_NOIO);
 	if (!check)
 		return;
 	check->bi_opf = REQ_OP_READ;
@@ -151,6 +151,26 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 	bio_put(check);
 }
 
+void bch_data_verify(struct cached_dev *dc, struct bio *bio)
+{
+	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
+	struct bio *clone = bio_clone_fast(bio, GFP_NOIO, q->bio_split);
+	unsigned sectors;
+
+	while (!bio_can_convert_to_sp(clone, &sectors)) {
+		struct bio *split = bio_split(clone, sectors,
+					      GFP_NOIO, q->bio_split);
+
+		__bch_data_verify(dc, split);
+		bio_put(split);
+	}
+
+	if (bio_sectors(clone))
+		__bch_data_verify(dc, clone);
+
+	bio_put(clone);
+}
+
 #endif
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.7.4



* Re: [PATCH v1 08/54] block: comment on bio_alloc_pages()
  2016-12-27 15:55 ` [PATCH v1 08/54] block: comment on bio_alloc_pages() Ming Lei
@ 2016-12-30 10:40   ` Coly Li
  2016-12-30 11:06   ` Coly Li
  1 sibling, 0 replies; 11+ messages in thread
From: Coly Li @ 2016-12-30 10:40 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Jens Axboe, Kent Overstreet,
	Shaohua Li, Mike Christie, Guoqing Jiang, Hannes Reinecke,
	open list:BCACHE (BLOCK LAYER CACHE),
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT

On 2016/12/27 11:55 PM, Ming Lei wrote:
> This patch adds comment on usage of bio_alloc_pages(),
> also comments on one special case of bch_data_verify().
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/bio.c               | 4 +++-
>  drivers/md/bcache/debug.c | 6 ++++++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 2b375020fc49..d4a1e0b63ea0 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -961,7 +961,9 @@ EXPORT_SYMBOL(bio_advance);
>   * @bio: bio to allocate pages for
>   * @gfp_mask: flags for allocation
>   *
> - * Allocates pages up to @bio->bi_vcnt.
> + * Allocates pages up to @bio->bi_vcnt, and this function should only
> + * be called on a new initialized bio, which means all pages aren't added
> + * to the bio via bio_add_page() yet.
>   *
>   * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
>   * freed.
> diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
> index 06f55056aaae..48d03e8b3385 100644
> --- a/drivers/md/bcache/debug.c
> +++ b/drivers/md/bcache/debug.c
> @@ -110,6 +110,12 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>  	struct bio_vec bv, cbv;
>  	struct bvec_iter iter, citer = { 0 };
>  
> +	/*
> +	 * Once multipage bvec is supported, the bio_clone()
> +	 * has to make sure page count in this bio can be held
> +	 * in the new cloned bio because each single page need
> +	 * to assign to each bvec of the new bio.
> +	 */
>  	check = bio_clone(bio, GFP_NOIO);
>  	if (!check)
>  		return;
> 
Acked-by: Coly Li <colyli@suse.de>

-- 
Coly Li


* Re: [PATCH v1 23/54] bcache: handle bio_clone() & bvec updating for multipage bvecs
  2016-12-27 15:56 ` [PATCH v1 23/54] bcache: handle bio_clone() & bvec updating for multipage bvecs Ming Lei
@ 2016-12-30 11:01   ` Coly Li
  2016-12-31 10:29     ` Ming Lei
  0 siblings, 1 reply; 11+ messages in thread
From: Coly Li @ 2016-12-30 11:01 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, linux-block, Christoph Hellwig,
	Kent Overstreet, Shaohua Li, Mike Christie, Guoqing Jiang,
	open list:BCACHE (BLOCK LAYER CACHE),
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT

On 2016/12/27 11:56 PM, Ming Lei wrote:
> The incoming bio may be too big to be cloned into
> one singlepage bvecs bio, so split the bio and
> check the splitted bio one by one.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  drivers/md/bcache/debug.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
> index 48d03e8b3385..18b2d2d138e3 100644
> --- a/drivers/md/bcache/debug.c
> +++ b/drivers/md/bcache/debug.c
> @@ -103,7 +103,7 @@ void bch_btree_verify(struct btree *b)
>  	up(&b->io_mutex);
>  }
>  
> -void bch_data_verify(struct cached_dev *dc, struct bio *bio)
> +static void __bch_data_verify(struct cached_dev *dc, struct bio *bio)
>  {
>  	char name[BDEVNAME_SIZE];
>  	struct bio *check;
> @@ -116,7 +116,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>  	 * in the new cloned bio because each single page need
>  	 * to assign to each bvec of the new bio.
>  	 */
> -	check = bio_clone(bio, GFP_NOIO);
> +	check = bio_clone_sp(bio, GFP_NOIO);
>  	if (!check)
>  		return;
>  	check->bi_opf = REQ_OP_READ;
> @@ -151,6 +151,26 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>  	bio_put(check);
>  }
>  
> +void bch_data_verify(struct cached_dev *dc, struct bio *bio)
> +{
> +	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
> +	struct bio *clone = bio_clone_fast(bio, GFP_NOIO, q->bio_split);
> +	unsigned sectors;
> +
> +	while (!bio_can_convert_to_sp(clone, &sectors)) {
> +		struct bio *split = bio_split(clone, sectors,
> +					      GFP_NOIO, q->bio_split);
> +
> +		__bch_data_verify(dc, split);
> +		bio_put(split);
> +	}
> +
> +	if (bio_sectors(clone))
> +		__bch_data_verify(dc, clone);
> +
> +	bio_put(clone);
> +}
> +

Hi Lei,

The above patch looks good IMHO. Just wondering, why not use the
classical style? Something like:


do {
	if (!bio_can_convert_to_sp(clone, &sectors))
		split = bio_split(clone, sectors,
				  GFP_NOIO, q->bio_split);
	else
		split = clone;

	__bch_data_verify(dc, split);
	bio_put(split);
} while (split != clone);


I guess the above style may generate less binary code.


-- 
Coly Li


* Re: [PATCH v1 08/54] block: comment on bio_alloc_pages()
  2016-12-27 15:55 ` [PATCH v1 08/54] block: comment on bio_alloc_pages() Ming Lei
  2016-12-30 10:40   ` Coly Li
@ 2016-12-30 11:06   ` Coly Li
  1 sibling, 0 replies; 11+ messages in thread
From: Coly Li @ 2016-12-30 11:06 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Jens Axboe, Kent Overstreet,
	Shaohua Li, Mike Christie, Guoqing Jiang, Hannes Reinecke,
	open list:BCACHE (BLOCK LAYER CACHE),
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT

On 2016/12/27 11:55 PM, Ming Lei wrote:
> This patch adds comment on usage of bio_alloc_pages(),
> also comments on one special case of bch_data_verify().
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/bio.c               | 4 +++-
>  drivers/md/bcache/debug.c | 6 ++++++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 2b375020fc49..d4a1e0b63ea0 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -961,7 +961,9 @@ EXPORT_SYMBOL(bio_advance);
>   * @bio: bio to allocate pages for
>   * @gfp_mask: flags for allocation
>   *
> - * Allocates pages up to @bio->bi_vcnt.
> + * Allocates pages up to @bio->bi_vcnt, and this function should only
> + * be called on a new initialized bio, which means all pages aren't added
> + * to the bio via bio_add_page() yet.
>   *
>   * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
>   * freed.
> diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
> index 06f55056aaae..48d03e8b3385 100644
> --- a/drivers/md/bcache/debug.c
> +++ b/drivers/md/bcache/debug.c
> @@ -110,6 +110,12 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>  	struct bio_vec bv, cbv;
>  	struct bvec_iter iter, citer = { 0 };
>  
> +	/*
> +	 * Once multipage bvec is supported, the bio_clone()
> +	 * has to make sure page count in this bio can be held
> +	 * in the new cloned bio because each single page need
> +	 * to assign to each bvec of the new bio.
> +	 */
>  	check = bio_clone(bio, GFP_NOIO);
>  	if (!check)
>  		return;
> 
Acked-by: Coly Li <colyli@suse.de>


* Re: [PATCH v1 07/54] bcache: comment on direct access to bvec table
  2016-12-27 15:55 ` [PATCH v1 07/54] bcache: comment on direct access to bvec table Ming Lei
@ 2016-12-30 16:56   ` Coly Li
  0 siblings, 0 replies; 11+ messages in thread
From: Coly Li @ 2016-12-30 16:56 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, linux-block, Christoph Hellwig,
	Kent Overstreet, Shaohua Li, Guoqing Jiang, Zheng Liu,
	Mike Christie, Jiri Kosina, Eric Wheeler, Yijing Wang, Al Viro,
	open list:BCACHE (BLOCK LAYER CACHE),
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT

On 2016/12/27 11:55 PM, Ming Lei wrote:
> Looks all are safe after multipage bvec is supported.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  drivers/md/bcache/btree.c | 1 +
>  drivers/md/bcache/super.c | 6 ++++++
>  drivers/md/bcache/util.c  | 7 +++++++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index a43eedd5804d..fc35cfb4d0f1 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -428,6 +428,7 @@ static void do_btree_node_write(struct btree *b)
>  
>  		continue_at(cl, btree_node_write_done, NULL);
>  	} else {
> +		/* No harm for multipage bvec since the new is just allocated */
>  		b->bio->bi_vcnt = 0;
>  		bch_bio_map(b->bio, i);
>  
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 3a19cbc8b230..607b022259dc 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -208,6 +208,7 @@ static void write_bdev_super_endio(struct bio *bio)
>  
>  static void __write_super(struct cache_sb *sb, struct bio *bio)
>  {
> +	/* single page bio, safe for multipage bvec */
>  	struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page);
>  	unsigned i;
>  
> @@ -1156,6 +1157,8 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
>  	dc->bdev->bd_holder = dc;
>  
>  	bio_init(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
> +
> +	/* single page bio, safe for multipage bvec */
>  	dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
>  	get_page(sb_page);
>  
> @@ -1799,6 +1802,7 @@ void bch_cache_release(struct kobject *kobj)
>  	for (i = 0; i < RESERVE_NR; i++)
>  		free_fifo(&ca->free[i]);
>  
> +	/* single page bio, safe for multipage bvec */
>  	if (ca->sb_bio.bi_inline_vecs[0].bv_page)
>  		put_page(ca->sb_bio.bi_io_vec[0].bv_page);
>  
> @@ -1854,6 +1858,8 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
>  	ca->bdev->bd_holder = ca;
>  
>  	bio_init(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
> +
> +	/* single page bio, safe for multipage bvec */
>  	ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
>  	get_page(sb_page);
>  
> diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
> index dde6172f3f10..5cc0b49a65fb 100644
> --- a/drivers/md/bcache/util.c
> +++ b/drivers/md/bcache/util.c
> @@ -222,6 +222,13 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
>  		: 0;
>  }
>  
> +/*
> + * Generally it isn't good to access .bi_io_vec and .bi_vcnt
> + * directly, the preferred way is bio_add_page, but in
> + * this case, bch_bio_map() supposes that the bvec table
> + * is empty, so it is safe to access .bi_vcnt & .bi_io_vec
> + * in this way even after multipage bvec is supported.
> + */
>  void bch_bio_map(struct bio *bio, void *base)
>  {
>  	size_t size = bio->bi_iter.bi_size;
> 

Acked-by: Coly Li <colyli@suse.de>
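
The comment added above bch_bio_map() relies on the bvec table being empty on entry. As a rough userspace illustration of why that precondition makes direct table access safe, the sketch below fills a table one page-sized entry at a time and refuses to run if any entry is already in use. All names here (sketch_bio, sketch_bvec, sketch_bio_map, SKETCH_PAGE_SIZE) are hypothetical stand-ins, not kernel APIs, and the 4096-byte page size is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Hypothetical stand-in for struct bio_vec: one single-page segment. */
struct sketch_bvec {
	size_t page_no;		/* page index derived from the address */
	unsigned offset;	/* offset of the data inside that page */
	unsigned len;		/* bytes covered by this entry */
};

/* Hypothetical stand-in for the bio fields the comment talks about. */
struct sketch_bio {
	size_t size;		/* bytes to map, like bi_iter.bi_size */
	unsigned vcnt;		/* entries in use, like bi_vcnt */
	unsigned max_vecs;	/* capacity of the table */
	struct sketch_bvec *vecs;
};

/*
 * Fill the table directly, one entry per page. This is only safe because
 * the table must be empty on entry: there is no earlier entry that a
 * multipage-aware helper might have merged with.
 * Returns 0 on success, -1 if the precondition or the capacity is violated.
 */
static int sketch_bio_map(struct sketch_bio *bio, uintptr_t base)
{
	size_t left = bio->size;
	unsigned offset = (unsigned)(base % SKETCH_PAGE_SIZE);

	if (!left || bio->vcnt)
		return -1;	/* nothing to map, or table not empty */

	while (left) {
		struct sketch_bvec *bv;
		unsigned len = SKETCH_PAGE_SIZE - offset;

		if (len > left)
			len = (unsigned)left;
		if (bio->vcnt == bio->max_vecs)
			return -1;	/* table too small */

		bv = &bio->vecs[bio->vcnt++];
		bv->page_no = base / SKETCH_PAGE_SIZE;
		bv->offset = offset;
		bv->len = len;

		base += len;
		left -= len;
		offset = 0;	/* only the first entry can start mid-page */
	}
	return 0;
}
```

Mapping 8192 bytes that start 512 bytes into a page uses three entries (3584, 4096 and 512 bytes), and a second call on the same bio is rejected because the table is no longer empty.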




* Re: [PATCH v1 23/54] bcache: handle bio_clone() & bvec updating for multipage bvecs
  2016-12-30 11:01   ` Coly Li
@ 2016-12-31 10:29     ` Ming Lei
  0 siblings, 0 replies; 11+ messages in thread
From: Ming Lei @ 2016-12-31 10:29 UTC (permalink / raw)
  To: Coly Li
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block,
	Christoph Hellwig, Kent Overstreet, Shaohua Li, Mike Christie,
	Guoqing Jiang, open list:BCACHE (BLOCK LAYER CACHE),
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT

Hi Coly,

On Fri, Dec 30, 2016 at 7:01 PM, Coly Li <i@coly.li> wrote:
> On 2016/12/27 11:56 PM, Ming Lei wrote:
>> The incoming bio may be too big to be cloned into
>> a single bio of single-page bvecs, so split the bio
>> and check each split bio one by one.
>>
>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>> ---
>>  drivers/md/bcache/debug.c | 24 ++++++++++++++++++++++--
>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
>> index 48d03e8b3385..18b2d2d138e3 100644
>> --- a/drivers/md/bcache/debug.c
>> +++ b/drivers/md/bcache/debug.c
>> @@ -103,7 +103,7 @@ void bch_btree_verify(struct btree *b)
>>       up(&b->io_mutex);
>>  }
>>
>> -void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>> +static void __bch_data_verify(struct cached_dev *dc, struct bio *bio)
>>  {
>>       char name[BDEVNAME_SIZE];
>>       struct bio *check;
>> @@ -116,7 +116,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>>        * in the new cloned bio because each single page need
>>        * to assign to each bvec of the new bio.
>>        */
>> -     check = bio_clone(bio, GFP_NOIO);
>> +     check = bio_clone_sp(bio, GFP_NOIO);
>>       if (!check)
>>               return;
>>       check->bi_opf = REQ_OP_READ;
>> @@ -151,6 +151,26 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>>       bio_put(check);
>>  }
>>
>> +void bch_data_verify(struct cached_dev *dc, struct bio *bio)
>> +{
>> +     struct request_queue *q = bdev_get_queue(bio->bi_bdev);
>> +     struct bio *clone = bio_clone_fast(bio, GFP_NOIO, q->bio_split);
>> +     unsigned sectors;
>> +
>> +     while (!bio_can_convert_to_sp(clone, &sectors)) {
>> +             struct bio *split = bio_split(clone, sectors,
>> +                                           GFP_NOIO, q->bio_split);
>> +
>> +             __bch_data_verify(dc, split);
>> +             bio_put(split);
>> +     }
>> +
>> +     if (bio_sectors(clone))
>> +             __bch_data_verify(dc, clone);
>> +
>> +     bio_put(clone);
>> +}
>> +
>
> Hi Lei,
>
> The above patch looks good IMHO. Just wondering, why not use the
> classical style? Something like:

I didn't know there was a classical style, :-)

>
>
> do {
>         if (!bio_can_convert_to_sp(clone, &sectors))
>                 split = bio_split(clone, sectors,
>                                   GFP_NOIO, q->bio_split);
>         else
>                 split = clone;
>
>         __bch_data_verify(dc, split);
>         bio_put(split);
> } while (split != clone);
>
>
> I guess maybe the above style generates less binary code.

Maybe. I will take this style in V2.
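
For comparison, here is a minimal userspace sketch of the two loop shapes discussed above. It is not kernel code: fake_bio, can_convert(), split_front() and the 8-sector cap are hypothetical stand-ins for struct bio, bio_can_convert_to_sp(), bio_split() and whatever limit a single-page bvec clone really has. It only shows that both shapes hand the same pieces to the verify step.

```c
#include <stdio.h>

/* Hypothetical stand-in for a bio: only the remaining sector count. */
struct fake_bio {
	unsigned sectors;
};

/* Counts what the verify step was asked to check. */
struct tally {
	unsigned calls;
	unsigned sectors;
};

/* Assumed cap on the sectors one single-page-bvec clone can hold. */
#define MAX_SP_SECTORS 8u

/* Model of bio_can_convert_to_sp(): 1 if the bio fits as-is, otherwise
 * 0 with *sectors set to the amount to split off the front. */
static int can_convert(const struct fake_bio *bio, unsigned *sectors)
{
	if (bio->sectors <= MAX_SP_SECTORS)
		return 1;
	*sectors = MAX_SP_SECTORS;
	return 0;
}

/* Model of bio_split(): carve 'sectors' off the front of 'bio'. */
static struct fake_bio split_front(struct fake_bio *bio, unsigned sectors)
{
	struct fake_bio front = { .sectors = sectors };

	bio->sectors -= sectors;
	return front;
}

static void verify(const struct fake_bio *bio, struct tally *t)
{
	t->calls++;
	t->sectors += bio->sectors;
}

/* Shape of the v1 patch: loop over the splits, then handle the tail. */
static void verify_while(unsigned total, struct tally *t)
{
	struct fake_bio clone = { .sectors = total };
	unsigned sectors;

	while (!can_convert(&clone, &sectors)) {
		struct fake_bio split = split_front(&clone, sectors);

		verify(&split, t);
	}
	if (clone.sectors)
		verify(&clone, t);
}

/* Shape suggested in review: one verify call site, the loop ends once
 * the remainder itself has been verified. */
static void verify_do_while(unsigned total, struct tally *t)
{
	struct fake_bio clone = { .sectors = total };
	unsigned sectors;
	int last;

	do {
		struct fake_bio piece;

		last = can_convert(&clone, &sectors);
		piece = last ? clone : split_front(&clone, sectors);
		verify(&piece, t);
	} while (!last);
}
```

With a 20-sector input, both functions issue three verify calls covering 20 sectors in total (8 + 8 + 4). The one behavioural difference in this model is the empty case: the while shape skips a zero-sector remainder, whereas the do/while shape always verifies at least once.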

Thanks for the review!

-- 
Ming Lei

