linux-block.vger.kernel.org archive mirror
* [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO
@ 2019-03-17 10:01 Ming Lei
  2019-03-17 10:01 ` [PATCH V2 01/10] block: pass page to xen_biovec_phys_mergeable Ming Lei
                   ` (10 more replies)
  0 siblings, 11 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Boris Ostrovsky, Juergen Gross, xen-devel,
	Omar Sandoval, Christoph Hellwig

Hi,

Now the whole IO stack is capable of handling multi-page bvec, and it has
been enabled in the normal FS IO path. However, it isn't done for
passthrough IO.

Without enabling multi-page bvec for passthrough IO, we can't go ahead
with optimizing related IO paths, such as bvec merging and
bio_add_pc_page() simplification.

This patchset enables multi-page bvec for passthrough IO. It turns out
bio_add_pc_page() is simplified a lot; in particular, the physical
segment number of a passthrough bio is always the same as bio.bi_vcnt.
Also, the bvec merging inside a bio is killed.

blktests (block/029) is added to cover the passthrough IO path, and this
patchset does pass the new block/029 test.

	https://marc.info/?l=linux-block&m=155175063417139&w=2

V2:
	- add new patch of 'block: avoid to break XEN by multi-page bvec'
	- add one new patch to cleanup bio_add_pc_page()
	- add another two new patches for cleanup of mapping bvec to sg
	- most of others are patch style changes

Ming Lei (10):
  block: pass page to xen_biovec_phys_mergeable
  block: avoid to break XEN by multi-page bvec
  block: don't merge adjacent bvecs to one segment in bio
    blk_queue_split
  block: cleanup bio_add_pc_page
  block: check if page is mergeable in one helper
  block: put the same page when adding it to bio
  block: enable multi-page bvec for passthrough IO
  block: remove argument of 'request_queue' from __blk_bvec_map_sg
  block: reuse __blk_bvec_map_sg() for mapping page sized bvec
  block: don't check if adjacent bvecs in one bio can be mergeable

 block/bio.c            | 134 ++++++++++++++++++++++++++++---------------------
 block/blk-merge.c      | 106 +++++++++++++++++++-------------------
 block/blk.h            |   2 +-
 drivers/xen/biomerge.c |   5 +-
 include/linux/bio.h    |   3 ++
 include/xen/xen.h      |   2 +-
 6 files changed, 135 insertions(+), 117 deletions(-)

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>


-- 
2.9.5



* [PATCH V2 01/10] block: pass page to xen_biovec_phys_mergeable
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 02/10] block: avoid to break XEN by multi-page bvec Ming Lei
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Boris Ostrovsky, Juergen Gross, xen-devel,
	Omar Sandoval, Christoph Hellwig

xen_biovec_phys_mergeable() only needs .bv_page of the 2nd bio bvec
for checking if the two bvecs can be merged, so pass page to
xen_biovec_phys_mergeable() directly.

No function change.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk.h            | 2 +-
 drivers/xen/biomerge.c | 5 +++--
 include/xen/xen.h      | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/blk.h b/block/blk.h
index 5d636ee41663..e27fd1512e4b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -75,7 +75,7 @@ static inline bool biovec_phys_mergeable(struct request_queue *q,
 
 	if (addr1 + vec1->bv_len != addr2)
 		return false;
-	if (xen_domain() && !xen_biovec_phys_mergeable(vec1, vec2))
+	if (xen_domain() && !xen_biovec_phys_mergeable(vec1, vec2->bv_page))
 		return false;
 	if ((addr1 | mask) != ((addr2 + vec2->bv_len - 1) | mask))
 		return false;
diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
index f3fbb700f569..05a286d24f14 100644
--- a/drivers/xen/biomerge.c
+++ b/drivers/xen/biomerge.c
@@ -4,12 +4,13 @@
 #include <xen/xen.h>
 #include <xen/page.h>
 
+/* check if @page can be merged with 'vec1' */
 bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
-			       const struct bio_vec *vec2)
+			       const struct page *page)
 {
 #if XEN_PAGE_SIZE == PAGE_SIZE
 	unsigned long bfn1 = pfn_to_bfn(page_to_pfn(vec1->bv_page));
-	unsigned long bfn2 = pfn_to_bfn(page_to_pfn(vec2->bv_page));
+	unsigned long bfn2 = pfn_to_bfn(page_to_pfn(page));
 
 	return bfn1 + PFN_DOWN(vec1->bv_offset + vec1->bv_len) == bfn2;
 #else
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 19d032373de5..0e2324085b32 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -44,7 +44,7 @@ extern struct hvm_start_info pvh_start_info;
 
 struct bio_vec;
 bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
-		const struct bio_vec *vec2);
+		const struct page *page);
 
 #if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_XEN_BALLOON)
 extern u64 xen_saved_max_mem_size;
-- 
2.9.5



* [PATCH V2 02/10] block: avoid to break XEN by multi-page bvec
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
  2019-03-17 10:01 ` [PATCH V2 01/10] block: pass page to xen_biovec_phys_mergeable Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-19  9:18   ` Juergen Gross
  2019-03-17 10:01 ` [PATCH V2 03/10] block: don't merge adjacent bvecs to one segment in bio blk_queue_split Ming Lei
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Boris Ostrovsky, Juergen Gross, xen-devel,
	Omar Sandoval, Christoph Hellwig

XEN has a special page merge requirement, see xen_biovec_phys_mergeable().
We can't simply merge pages into one bvec for XEN.

So move XEN's specific check on page merge into __bio_try_merge_page(),
so that multi-page bvec doesn't break XEN.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 71a78d9fb8b7..d8f48188937c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -776,6 +776,8 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 
 		if (vec_end_addr + 1 != page_addr + off)
 			return false;
+		if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
+			return false;
 		if (same_page && (vec_end_addr & PAGE_MASK) != page_addr)
 			return false;
 
-- 
2.9.5



* [PATCH V2 03/10] block: don't merge adjacent bvecs to one segment in bio blk_queue_split
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
  2019-03-17 10:01 ` [PATCH V2 01/10] block: pass page to xen_biovec_phys_mergeable Ming Lei
  2019-03-17 10:01 ` [PATCH V2 02/10] block: avoid to break XEN by multi-page bvec Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 04/10] block: cleanup bio_add_pc_page Ming Lei
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

For normal filesystem IO, each page is added via bio_add_page(),
in which bvec (page) merging has already been handled, so it is
basically not possible to merge two adjacent bvecs in one bio.

So don't try to merge two adjacent bvecs in blk_queue_split().

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 1c9d4f0f96ea..aa9164eb7187 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -267,23 +267,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 			goto split;
 		}
 
-		if (bvprvp) {
-			if (seg_size + bv.bv_len > queue_max_segment_size(q))
-				goto new_segment;
-			if (!biovec_phys_mergeable(q, bvprvp, &bv))
-				goto new_segment;
-
-			seg_size += bv.bv_len;
-			bvprv = bv;
-			bvprvp = &bvprv;
-			sectors += bv.bv_len >> 9;
-
-			if (nsegs == 1 && seg_size > front_seg_size)
-				front_seg_size = seg_size;
-
-			continue;
-		}
-new_segment:
 		if (nsegs == max_segs)
 			goto split;
 
-- 
2.9.5



* [PATCH V2 04/10] block: cleanup bio_add_pc_page
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (2 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 03/10] block: don't merge adjacent bvecs to one segment in bio blk_queue_split Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 05/10] block: check if page is mergeable in one helper Ming Lei
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

REQ_PC is out of date, so replace it with passthrough IO.

Also remove the local variable of 'prev' since we can reuse
the top local variable of 'bvec'.

No function change.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index d8f48188937c..c94a3d30bc85 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -648,7 +648,7 @@ struct bio *bio_clone_fast(struct bio *bio, gfp_t gfp_mask, struct bio_set *bs)
 EXPORT_SYMBOL(bio_clone_fast);
 
 /**
- *	bio_add_pc_page	-	attempt to add page to bio
+ *	bio_add_pc_page	-	attempt to add page to passthrough bio
  *	@q: the target queue
  *	@bio: destination bio
  *	@page: page to add
@@ -660,7 +660,7 @@ EXPORT_SYMBOL(bio_clone_fast);
  *	limitations. The target block device must allow bio's up to PAGE_SIZE,
  *	so it is always possible to add a single page to an empty bio.
  *
- *	This should only be used by REQ_PC bios.
+ *	This should only be used by passthrough bios.
  */
 int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
 		    *page, unsigned int len, unsigned int offset)
@@ -683,11 +683,11 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
 	 * a consecutive offset.  Optimize this special case.
 	 */
 	if (bio->bi_vcnt > 0) {
-		struct bio_vec *prev = &bio->bi_io_vec[bio->bi_vcnt - 1];
+		bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
 
-		if (page == prev->bv_page &&
-		    offset == prev->bv_offset + prev->bv_len) {
-			prev->bv_len += len;
+		if (page == bvec->bv_page &&
+		    offset == bvec->bv_offset + bvec->bv_len) {
+			bvec->bv_len += len;
 			bio->bi_iter.bi_size += len;
 			goto done;
 		}
@@ -696,7 +696,7 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
 		 * If the queue doesn't support SG gaps and adding this
 		 * offset would create a gap, disallow it.
 		 */
-		if (bvec_gap_to_prev(q, prev, offset))
+		if (bvec_gap_to_prev(q, bvec, offset))
 			return 0;
 	}
 
-- 
2.9.5



* [PATCH V2 05/10] block: check if page is mergeable in one helper
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (3 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 04/10] block: cleanup bio_add_pc_page Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 06/10] block: put the same page when adding it to bio Ming Lei
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

Now the check for deciding if one page is mergeable to current bvec
becomes a bit complicated, and we need to reuse the code before
adding pc page.

So move the check into one dedicated helper.

No function change.
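
For readers following the thread, the contiguity test that the helper
performs can be modelled in user space. This is only a sketch under
stated assumptions: struct model_bvec, MODEL_PAGE_SIZE and
model_page_is_mergeable() are hypothetical names, a plain physical
address stands in for struct page, and the XEN check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_PAGE_MASK (~(uint64_t)(MODEL_PAGE_SIZE - 1))

/* Hypothetical stand-in for struct bio_vec: a physical base address
 * replaces the struct page pointer. */
struct model_bvec {
	uint64_t page_phys;	/* physical address of the first page */
	unsigned int offset;	/* byte offset into that page */
	unsigned int len;	/* length in bytes, may span pages */
};

/* Models the address checks of page_is_mergeable(), minus the XEN test. */
static bool model_page_is_mergeable(const struct model_bvec *bv,
		uint64_t page_phys, unsigned int off, bool same_page)
{
	uint64_t vec_end_addr;

	assert(bv->len > 0);
	vec_end_addr = bv->page_phys + bv->offset + bv->len - 1;

	/* New data must start right after the last byte of the bvec. */
	if (vec_end_addr + 1 != page_phys + off)
		return false;
	/* Optionally require the bvec's last byte to be in the same page. */
	if (same_page && (vec_end_addr & MODEL_PAGE_MASK) != page_phys)
		return false;

	return true;
}
```

In this model a full page at 0x10000 followed by a page at 0x11000 is
mergeable only when same_page is false, which is the multi-page case.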

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index c94a3d30bc85..5ac8692e3986 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -647,6 +647,24 @@ struct bio *bio_clone_fast(struct bio *bio, gfp_t gfp_mask, struct bio_set *bs)
 }
 EXPORT_SYMBOL(bio_clone_fast);
 
+static inline bool page_is_mergeable(const struct bio_vec *bv,
+		struct page *page, unsigned int len, unsigned int off,
+		bool same_page)
+{
+	phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
+		bv->bv_offset + bv->bv_len - 1;
+	phys_addr_t page_addr = page_to_phys(page);
+
+	if (vec_end_addr + 1 != page_addr + off)
+		return false;
+	if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
+		return false;
+	if (same_page && (vec_end_addr & PAGE_MASK) != page_addr)
+		return false;
+
+	return true;
+}
+
 /**
  *	bio_add_pc_page	-	attempt to add page to passthrough bio
  *	@q: the target queue
@@ -770,20 +788,12 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 
 	if (bio->bi_vcnt > 0) {
 		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
-		phys_addr_t vec_end_addr = page_to_phys(bv->bv_page) +
-			bv->bv_offset + bv->bv_len - 1;
-		phys_addr_t page_addr = page_to_phys(page);
-
-		if (vec_end_addr + 1 != page_addr + off)
-			return false;
-		if (xen_domain() && !xen_biovec_phys_mergeable(bv, page))
-			return false;
-		if (same_page && (vec_end_addr & PAGE_MASK) != page_addr)
-			return false;
-
-		bv->bv_len += len;
-		bio->bi_iter.bi_size += len;
-		return true;
+
+		if (page_is_mergeable(bv, page, len, off, same_page)) {
+			bv->bv_len += len;
+			bio->bi_iter.bi_size += len;
+			return true;
+		}
 	}
 	return false;
 }
-- 
2.9.5



* [PATCH V2 06/10] block: put the same page when adding it to bio
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (4 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 05/10] block: check if page is mergeable in one helper Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 07/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

When the added page is merged into the same page as the last bvec in
bio_add_pc_page(), the user may need to put this page to avoid a page
leak.

bio_map_user_iov() needs this kind of handling, and currently deals with
it by itself in a hacky way.

Move the handling of put page into __bio_add_pc_page(), so
bio_map_user_iov() can be simplified a bit, and maybe more users
can benefit from this change.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c         | 28 ++++++++++++++++------------
 include/linux/bio.h |  3 +++
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 5ac8692e3986..4b37f5173c66 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -666,12 +666,13 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
 }
 
 /**
- *	bio_add_pc_page	-	attempt to add page to passthrough bio
+ *	__bio_add_pc_page	- attempt to add page to passthrough bio
  *	@q: the target queue
  *	@bio: destination bio
  *	@page: page to add
  *	@len: vec entry length
  *	@offset: vec entry offset
+ *	@put_same_page: put the page if it is same with last added page
  *
  *	Attempt to add a page to the bio_vec maplist. This can fail for a
  *	number of reasons, such as the bio being full or target block device
@@ -680,8 +681,9 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
  *
  *	This should only be used by passthrough bios.
  */
-int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
-		    *page, unsigned int len, unsigned int offset)
+int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
+		struct page *page, unsigned int len, unsigned int offset,
+		bool put_same_page)
 {
 	int retried_segments = 0;
 	struct bio_vec *bvec;
@@ -705,6 +707,8 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
 
 		if (page == bvec->bv_page &&
 		    offset == bvec->bv_offset + bvec->bv_len) {
+			if (put_same_page)
+				put_page(page);
 			bvec->bv_len += len;
 			bio->bi_iter.bi_size += len;
 			goto done;
@@ -763,6 +767,13 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
 	blk_recount_segments(q, bio);
 	return 0;
 }
+EXPORT_SYMBOL(__bio_add_pc_page);
+
+int bio_add_pc_page(struct request_queue *q, struct bio *bio,
+		struct page *page, unsigned int len, unsigned int offset)
+{
+	return __bio_add_pc_page(q, bio, page, len, offset, false);
+}
 EXPORT_SYMBOL(bio_add_pc_page);
 
 /**
@@ -1394,21 +1405,14 @@ struct bio *bio_map_user_iov(struct request_queue *q,
 			for (j = 0; j < npages; j++) {
 				struct page *page = pages[j];
 				unsigned int n = PAGE_SIZE - offs;
-				unsigned short prev_bi_vcnt = bio->bi_vcnt;
 
 				if (n > bytes)
 					n = bytes;
 
-				if (!bio_add_pc_page(q, bio, page, n, offs))
+				if (!__bio_add_pc_page(q, bio, page, n, offs,
+							true))
 					break;
 
-				/*
-				 * check if vector was merged with previous
-				 * drop page reference if needed
-				 */
-				if (bio->bi_vcnt == prev_bi_vcnt)
-					put_page(page);
-
 				added += n;
 				bytes -= n;
 				offs = 0;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index bb6090aa165d..bb915591557b 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -432,6 +432,9 @@ void bio_chain(struct bio *, struct bio *);
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
+extern int __bio_add_pc_page(struct request_queue *, struct bio *,
+			     struct page *, unsigned int, unsigned int,
+			     bool);
 bool __bio_try_merge_page(struct bio *bio, struct page *page,
 		unsigned int len, unsigned int off, bool same_page);
 void __bio_add_page(struct bio *bio, struct page *page,
-- 
2.9.5



* [PATCH V2 07/10] block: enable multi-page bvec for passthrough IO
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (5 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 06/10] block: put the same page when adding it to bio Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 08/10] block: remove argument of 'request_queue' from __blk_bvec_map_sg Ming Lei
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

Now block IO stack is basically ready for supporting multi-page bvec,
however it isn't enabled on passthrough IO.

One reason is that passthrough IO is dispatched to LLD directly and bio
split is bypassed, so the bio has to be built correctly for dispatch to
LLD from the beginning.

Implement multi-page support for passthrough IO by limiting each bvec
to one block device segment and applying all the queue limits in
bio_add_pc_page(). Then we no longer need to calculate segments for
passthrough IO, and it turns out the code is simplified a lot.
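
The two queue limits applied when growing a segment can be illustrated
with a small user-space model. This is a sketch, not the kernel code:
struct model_limits and model_can_add_to_seg() are hypothetical names,
and physical addresses are passed directly instead of being derived
from struct page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the queue limits used by the check. */
struct model_limits {
	uint64_t seg_boundary_mask;	/* e.g. 0xffff for a 64 KiB boundary */
	unsigned int max_seg_size;	/* maximum segment size in bytes */
};

/*
 * Models can_add_page_to_seg(): the grown segment must stay inside one
 * boundary window and must not exceed the maximum segment size.
 */
static bool model_can_add_to_seg(const struct model_limits *lim,
		uint64_t seg_start, unsigned int seg_len,
		uint64_t new_start, unsigned int new_len)
{
	uint64_t last;

	assert(new_len > 0);
	last = new_start + new_len - 1;

	/* First and last byte must fall in the same boundary window. */
	if ((seg_start | lim->seg_boundary_mask) !=
	    (last | lim->seg_boundary_mask))
		return false;

	if (seg_len + new_len > lim->max_seg_size)
		return false;

	return true;
}
```

OR-ing both addresses with the mask sets every bit below the boundary,
so the results are equal only when the two addresses share the same
window.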

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c | 60 +++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 4b37f5173c66..3564a57963fa 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -665,6 +665,27 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
 	return true;
 }
 
+/*
+ * Check if the @page can be added to the current segment(@bv), and make
+ * sure to call it only if page_is_mergeable(@bv, @page) is true
+ */
+static bool can_add_page_to_seg(struct request_queue *q,
+		const struct bio_vec *bv, const struct page *page,
+		unsigned len, unsigned offset)
+{
+	unsigned long mask = queue_segment_boundary(q);
+	phys_addr_t addr1 = page_to_phys(bv->bv_page) + bv->bv_offset;
+	phys_addr_t addr2 = page_to_phys(page) + offset + len - 1;
+
+	if ((addr1 | mask) != (addr2 | mask))
+		return false;
+
+	if (bv->bv_len + len > queue_max_segment_size(q))
+		return false;
+
+	return true;
+}
+
 /**
  *	__bio_add_pc_page	- attempt to add page to passthrough bio
  *	@q: the target queue
@@ -685,7 +706,6 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		struct page *page, unsigned int len, unsigned int offset,
 		bool put_same_page)
 {
-	int retried_segments = 0;
 	struct bio_vec *bvec;
 
 	/*
@@ -709,6 +729,7 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		    offset == bvec->bv_offset + bvec->bv_len) {
 			if (put_same_page)
 				put_page(page);
+ bvec_merge:
 			bvec->bv_len += len;
 			bio->bi_iter.bi_size += len;
 			goto done;
@@ -720,11 +741,18 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		 */
 		if (bvec_gap_to_prev(q, bvec, offset))
 			return 0;
+
+		if (page_is_mergeable(bvec, page, len, offset, false) &&
+				can_add_page_to_seg(q, bvec, page, len, offset))
+			goto bvec_merge;
 	}
 
 	if (bio_full(bio))
 		return 0;
 
+	if (bio->bi_phys_segments >= queue_max_segments(q))
+		return 0;
+
 	/*
 	 * setup the new entry, we might clear it again later if we
 	 * cannot add the page
@@ -734,38 +762,12 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 	bvec->bv_len = len;
 	bvec->bv_offset = offset;
 	bio->bi_vcnt++;
-	bio->bi_phys_segments++;
 	bio->bi_iter.bi_size += len;
 
-	/*
-	 * Perform a recount if the number of segments is greater
-	 * than queue_max_segments(q).
-	 */
-
-	while (bio->bi_phys_segments > queue_max_segments(q)) {
-
-		if (retried_segments)
-			goto failed;
-
-		retried_segments = 1;
-		blk_recount_segments(q, bio);
-	}
-
-	/* If we may be able to merge these biovecs, force a recount */
-	if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec))
-		bio_clear_flag(bio, BIO_SEG_VALID);
-
  done:
+	bio->bi_phys_segments = bio->bi_vcnt;
+	bio_set_flag(bio, BIO_SEG_VALID);
 	return len;
-
- failed:
-	bvec->bv_page = NULL;
-	bvec->bv_len = 0;
-	bvec->bv_offset = 0;
-	bio->bi_vcnt--;
-	bio->bi_iter.bi_size -= len;
-	blk_recount_segments(q, bio);
-	return 0;
 }
 EXPORT_SYMBOL(__bio_add_pc_page);
 
-- 
2.9.5



* [PATCH V2 08/10] block: remove argument of 'request_queue' from __blk_bvec_map_sg
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (6 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 07/10] block: enable multi-page bvec for passthrough IO Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 09/10] block: reuse __blk_bvec_map_sg() for mapping page sized bvec Ming Lei
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

The argument of 'request_queue' isn't used by __blk_bvec_map_sg(),
so remove it.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index aa9164eb7187..9ec704bb58ec 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -520,7 +520,7 @@ __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
 	*bvprv = *bvec;
 }
 
-static inline int __blk_bvec_map_sg(struct request_queue *q, struct bio_vec bv,
+static inline int __blk_bvec_map_sg(struct bio_vec bv,
 		struct scatterlist *sglist, struct scatterlist **sg)
 {
 	*sg = sglist;
@@ -555,9 +555,9 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 	int nsegs = 0;
 
 	if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
-		nsegs = __blk_bvec_map_sg(q, rq->special_vec, sglist, &sg);
+		nsegs = __blk_bvec_map_sg(rq->special_vec, sglist, &sg);
 	else if (rq->bio && bio_op(rq->bio) == REQ_OP_WRITE_SAME)
-		nsegs = __blk_bvec_map_sg(q, bio_iovec(rq->bio), sglist, &sg);
+		nsegs = __blk_bvec_map_sg(bio_iovec(rq->bio), sglist, &sg);
 	else if (rq->bio)
 		nsegs = __blk_bios_map_sg(q, rq->bio, sglist, &sg);
 
-- 
2.9.5



* [PATCH V2 09/10] block: reuse __blk_bvec_map_sg() for mapping page sized bvec
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (7 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 08/10] block: remove argument of 'request_queue' from __blk_bvec_map_sg Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-17 10:01 ` [PATCH V2 10/10] block: don't check if adjacent bvecs in one bio can be mergeable Ming Lei
  2019-03-27 15:44 ` [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Jens Axboe
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

Inside __blk_segment_map_sg(), page sized bvec mapping is optimized
a bit with one standalone branch.

So reuse __blk_bvec_map_sg() to do that.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 9ec704bb58ec..3e934ee9a907 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -493,6 +493,14 @@ static unsigned blk_bvec_map_sg(struct request_queue *q,
 	return nsegs;
 }
 
+static inline int __blk_bvec_map_sg(struct bio_vec bv,
+		struct scatterlist *sglist, struct scatterlist **sg)
+{
+	*sg = blk_next_sg(sg, sglist);
+	sg_set_page(*sg, bv.bv_page, bv.bv_len, bv.bv_offset);
+	return 1;
+}
+
 static inline void
 __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
 		     struct scatterlist *sglist, struct bio_vec *bvprv,
@@ -511,23 +519,13 @@ __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
 	} else {
 new_segment:
 		if (bvec->bv_offset + bvec->bv_len <= PAGE_SIZE) {
-			*sg = blk_next_sg(sg, sglist);
-			sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
-			(*nsegs) += 1;
+			(*nsegs) += __blk_bvec_map_sg(*bvec, sglist, sg);
 		} else
 			(*nsegs) += blk_bvec_map_sg(q, bvec, sglist, sg);
 	}
 	*bvprv = *bvec;
 }
 
-static inline int __blk_bvec_map_sg(struct bio_vec bv,
-		struct scatterlist *sglist, struct scatterlist **sg)
-{
-	*sg = sglist;
-	sg_set_page(*sg, bv.bv_page, bv.bv_len, bv.bv_offset);
-	return 1;
-}
-
 static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
 			     struct scatterlist *sglist,
 			     struct scatterlist **sg)
-- 
2.9.5



* [PATCH V2 10/10] block: don't check if adjacent bvecs in one bio can be mergeable
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (8 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 09/10] block: reuse __blk_bvec_map_sg() for mapping page sized bvec Ming Lei
@ 2019-03-17 10:01 ` Ming Lei
  2019-03-27 15:44 ` [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Jens Axboe
  10 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2019-03-17 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Omar Sandoval, Christoph Hellwig

Now both passthrough and FS IO support multi-page bvec, and bvec
merging is already handled when adding a page to a bio, so adjacent
bvecs are no longer mergeable if they belong to the same bio.

So only try to merge bvecs if they are from different bios.
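
The resulting iteration pattern can be sketched in user space as
follows. Everything here is a simplified assumption: struct model_vec,
struct model_bio and model_count_segs() are hypothetical names, bvecs
are reduced to a physical start address plus length, and the segment
boundary mask is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_vec { uint64_t start; unsigned int len; };
struct model_bio { const struct model_vec *vecs; size_t nr; };

/*
 * Count segments over a chain of bios, trying a merge only for the
 * first bvec of each bio after the first one.
 */
static unsigned int model_count_segs(const struct model_bio *bios,
		size_t nr_bios, unsigned int max_seg_size)
{
	unsigned int nsegs = 0, seg_len = 0;
	uint64_t prev_end = 0;
	bool new_bio = false;

	assert(bios != NULL || nr_bios == 0);

	for (size_t i = 0; i < nr_bios; i++) {
		for (size_t j = 0; j < bios[i].nr; j++) {
			const struct model_vec *v = &bios[i].vecs[j];

			if (new_bio && nsegs > 0 && prev_end == v->start &&
			    seg_len + v->len <= max_seg_size) {
				/* Merge across the bio boundary. */
				seg_len += v->len;
			} else {
				/* Inside a bio, always start a new segment. */
				nsegs++;
				seg_len = v->len;
			}
			prev_end = v->start + v->len;
			new_bio = false;
		}
		new_bio = true;
	}
	return nsegs;
}
```

Two bvecs that are physically adjacent inside one bio still count as
two segments in this model; only the first bvec of the following bio
is a merge candidate.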

Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c | 69 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 42 insertions(+), 27 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 3e934ee9a907..8f96d683b577 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -354,11 +354,11 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 					     struct bio *bio)
 {
 	struct bio_vec bv, bvprv = { NULL };
-	int prev = 0;
 	unsigned int seg_size, nr_phys_segs;
 	unsigned front_seg_size;
 	struct bio *fbio, *bbio;
 	struct bvec_iter iter;
+	bool new_bio = false;
 
 	if (!bio)
 		return 0;
@@ -379,7 +379,7 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	nr_phys_segs = 0;
 	for_each_bio(bio) {
 		bio_for_each_bvec(bv, bio, iter) {
-			if (prev) {
+			if (new_bio) {
 				if (seg_size + bv.bv_len
 				    > queue_max_segment_size(q))
 					goto new_segment;
@@ -387,7 +387,6 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 					goto new_segment;
 
 				seg_size += bv.bv_len;
-				bvprv = bv;
 
 				if (nr_phys_segs == 1 && seg_size >
 						front_seg_size)
@@ -396,12 +395,13 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 				continue;
 			}
 new_segment:
-			bvprv = bv;
-			prev = 1;
 			bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
 					&front_seg_size, NULL, UINT_MAX);
+			new_bio = false;
 		}
 		bbio = bio;
+		bvprv = bv;
+		new_bio = true;
 	}
 
 	fbio->bi_seg_front_size = front_seg_size;
@@ -501,29 +501,26 @@ static inline int __blk_bvec_map_sg(struct bio_vec bv,
 	return 1;
 }
 
-static inline void
-__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
-		     struct scatterlist *sglist, struct bio_vec *bvprv,
-		     struct scatterlist **sg, int *nsegs)
+/* only try to merge bvecs into one sg if they are from two bios */
+static inline bool
+__blk_segment_map_sg_merge(struct request_queue *q, struct bio_vec *bvec,
+			   struct bio_vec *bvprv, struct scatterlist **sg)
 {
 
 	int nbytes = bvec->bv_len;
 
-	if (*sg) {
-		if ((*sg)->length + nbytes > queue_max_segment_size(q))
-			goto new_segment;
-		if (!biovec_phys_mergeable(q, bvprv, bvec))
-			goto new_segment;
+	if (!*sg)
+		return false;
 
-		(*sg)->length += nbytes;
-	} else {
-new_segment:
-		if (bvec->bv_offset + bvec->bv_len <= PAGE_SIZE) {
-			(*nsegs) += __blk_bvec_map_sg(*bvec, sglist, sg);
-		} else
-			(*nsegs) += blk_bvec_map_sg(q, bvec, sglist, sg);
-	}
-	*bvprv = *bvec;
+	if ((*sg)->length + nbytes > queue_max_segment_size(q))
+		return false;
+
+	if (!biovec_phys_mergeable(q, bvprv, bvec))
+		return false;
+
+	(*sg)->length += nbytes;
+
+	return true;
 }
 
 static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
@@ -533,11 +530,29 @@ static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
 	struct bio_vec bvec, bvprv = { NULL };
 	struct bvec_iter iter;
 	int nsegs = 0;
+	bool new_bio = false;
 
-	for_each_bio(bio)
-		bio_for_each_bvec(bvec, bio, iter)
-			__blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
-					     &nsegs);
+	for_each_bio(bio) {
+		bio_for_each_bvec(bvec, bio, iter) {
+			/*
+			 * Only try to merge bvecs from two bios given we
+			 * have done bio internal merge when adding pages
+			 * to bio
+			 */
+			if (new_bio &&
+			    __blk_segment_map_sg_merge(q, &bvec, &bvprv, sg))
+				goto next_bvec;
+
+			if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE)
+				nsegs += __blk_bvec_map_sg(bvec, sglist, sg);
+			else
+				nsegs += blk_bvec_map_sg(q, &bvec, sglist, sg);
+ next_bvec:
+			new_bio = false;
+		}
+		bvprv = bvec;
+		new_bio = true;
+	}
 
 	return nsegs;
 }
-- 
2.9.5
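
Note on the merge rule in the patch above: the loop in __blk_bios_map_sg()
now tries to merge a bvec into the previous sg entry only when that bvec is
the first one of a following bio. The userspace sketch below models just
that rule. It is an illustration under stated assumptions, not the kernel
code: struct model_bvec, struct model_bio, struct model_sg and
model_map_sg() are hypothetical names, each bvec is reduced to a physical
address range, "mergeable" is simplified to "physically contiguous and
within the maximum segment size", and the splitting of multi-page bvecs
done by blk_bvec_map_sg() is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for illustration; not kernel types. */
struct model_bvec { unsigned long addr; unsigned int len; };
struct model_bio  { const struct model_bvec *vecs; size_t nr; };
struct model_sg   { unsigned long addr; unsigned int len; };

/*
 * Map every bvec of every bio to an sg entry. A bvec is merged into the
 * previous sg entry only if it is the first bvec of a bio that follows
 * another bio (new_bio == true). Bvecs inside one bio are never merged
 * here, on the assumption that the bio-internal merge already happened
 * when the pages were added to the bio.
 * Returns the number of sg entries used (stops early if sg_cap is hit).
 */
static size_t model_map_sg(const struct model_bio *bios, size_t nr_bios,
			   unsigned int max_seg_size,
			   struct model_sg *sg, size_t sg_cap)
{
	size_t nsegs = 0;
	bool new_bio = false;

	for (size_t b = 0; b < nr_bios; b++) {
		for (size_t i = 0; i < bios[b].nr; i++) {
			const struct model_bvec *bv = &bios[b].vecs[i];

			if (new_bio && nsegs > 0) {
				struct model_sg *last = &sg[nsegs - 1];

				/* simplified stand-in for the mergeable check */
				if (last->len + bv->len <= max_seg_size &&
				    last->addr + last->len == bv->addr) {
					last->len += bv->len;
					new_bio = false;
					continue;
				}
			}

			if (nsegs == sg_cap)
				return nsegs;
			sg[nsegs].addr = bv->addr;
			sg[nsegs].len = bv->len;
			nsegs++;
			new_bio = false;
		}
		/* the next bvec seen, if any, starts a new bio */
		new_bio = true;
	}

	return nsegs;
}
```

With two bios whose bvecs are {0x1000, 0x3000} and {0x4000, 0x5000} (4096
bytes each), the model merges 0x4000 into the entry for 0x3000 because the
two sit on either side of a bio boundary, but keeps 0x5000 as its own entry
even though it is contiguous, since both bvecs belong to the same bio.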



* Re: [PATCH V2 02/10] block: avoid to break XEN by multi-page bvec
  2019-03-17 10:01 ` [PATCH V2 02/10] block: avoid to break XEN by multi-page bvec Ming Lei
@ 2019-03-19  9:18   ` Juergen Gross
  0 siblings, 0 replies; 13+ messages in thread
From: Juergen Gross @ 2019-03-19  9:18 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe
  Cc: linux-block, Boris Ostrovsky, xen-devel, Omar Sandoval,
	Christoph Hellwig

On 17/03/2019 11:01, Ming Lei wrote:
> XEN has a special page merge requirement, see xen_biovec_phys_mergeable().
> We can't simply merge pages into one bvec for XEN.
> 
> So move XEN's specific check on page merge into __bio_try_merge_page(),
> to avoid breaking XEN with multi-page bvecs.
> 
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: xen-devel@lists.xenproject.org
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen



* Re: [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO
  2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
                   ` (9 preceding siblings ...)
  2019-03-17 10:01 ` [PATCH V2 10/10] block: don't check if adjacent bvecs in one bio can be mergeable Ming Lei
@ 2019-03-27 15:44 ` Jens Axboe
  10 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2019-03-27 15:44 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-block, Boris Ostrovsky, Juergen Gross, xen-devel,
	Omar Sandoval, Christoph Hellwig

On 3/17/19 4:01 AM, Ming Lei wrote:
> Hi,
> 
> Now the whole IO stack is capable of handling multi-page bvecs, and they
> have been enabled in the normal FS IO path. However, that isn't done yet
> for passthrough IO.
> 
> Without enabling multi-page bvecs for passthrough IO, we can't go ahead
> with optimizing related IO paths, such as bvec merging and
> bio_add_pc_page() simplification.
> 
> This patchset enables multi-page bvecs for passthrough IO. It turns out
> that bio_add_pc_page() is simplified a lot; in particular, the physical
> segment number of a passthrough bio is always the same as bio.bi_vcnt.
> Also, the bvec merging inside a bio is killed.
> 
> blktests (block/029) is added to cover the passthrough IO path, and this
> patchset passes the new block/029 test.
> 
> 	https://marc.info/?l=linux-block&m=155175063417139&w=2

Merged for 5.2, thanks Ming.

-- 
Jens Axboe



Thread overview: 13+ messages
2019-03-17 10:01 [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Ming Lei
2019-03-17 10:01 ` [PATCH V2 01/10] block: pass page to xen_biovec_phys_mergeable Ming Lei
2019-03-17 10:01 ` [PATCH V2 02/10] block: avoid to break XEN by multi-page bvec Ming Lei
2019-03-19  9:18   ` Juergen Gross
2019-03-17 10:01 ` [PATCH V2 03/10] block: don't merge adjacent bvecs to one segment in bio blk_queue_split Ming Lei
2019-03-17 10:01 ` [PATCH V2 04/10] block: cleanup bio_add_pc_page Ming Lei
2019-03-17 10:01 ` [PATCH V2 05/10] block: check if page is mergeable in one helper Ming Lei
2019-03-17 10:01 ` [PATCH V2 06/10] block: put the same page when adding it to bio Ming Lei
2019-03-17 10:01 ` [PATCH V2 07/10] block: enable multi-page bvec for passthrough IO Ming Lei
2019-03-17 10:01 ` [PATCH V2 08/10] block: remove argument of 'request_queue' from __blk_bvec_map_sg Ming Lei
2019-03-17 10:01 ` [PATCH V2 09/10] block: reuse __blk_bvec_map_sg() for mapping page sized bvec Ming Lei
2019-03-17 10:01 ` [PATCH V2 10/10] block: don't check if adjacent bvecs in one bio can be mergeable Ming Lei
2019-03-27 15:44 ` [PATCH V2 00/10] block: enable multi-page bvec for passthrough IO Jens Axboe
