* [PATCH] fix bio_add_page for non trivial merge_bvec_fn case
@ 2008-06-08 15:45 Dmitri Monakhov
2008-06-30 20:37 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Dmitri Monakhov @ 2008-06-08 15:45 UTC
To: linux-kernel, linux-fsdevel
We have to properly decrease all of the bio's related counters, especially
bi_size, so that merge_bvec_fn returns the right result. Without this, the
usual result is false merge rejects for two absolutely valid bio_vecs. This
may cause a significant performance penalty, for example on Itanium with
page_size == 16k, fs_block_size == 1k, and a RAID block device with a small
chunk_size.
Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
---
fs/bio.c | 16 ++++++++++++----
1 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/fs/bio.c b/fs/bio.c
index 7856257..d713074 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -332,14 +332,21 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
if (page == prev->bv_page &&
offset == prev->bv_offset + prev->bv_len) {
+ /* Temprory detacth last bio_vec. */
+ bio->bi_size -= prev->bv_len;
+ bio->bi_vcnt--;
+ bio->bi_phys_segments--;
+ bio->bi_hw_segments--;
+
prev->bv_len += len;
if (q->merge_bvec_fn &&
q->merge_bvec_fn(q, bio, prev) < len) {
prev->bv_len -= len;
- return 0;
+ len = 0;
}
- goto done;
+ bio->bi_size += prev->bv_len;
+ goto out_add_bvec;
}
}
@@ -394,11 +401,12 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
BIOVEC_VIRT_MERGEABLE(bvec-1, bvec)))
bio->bi_flags &= ~(1 << BIO_SEG_VALID);
+ bio->bi_size += len;
+out_add_bvec:
bio->bi_vcnt++;
bio->bi_phys_segments++;
bio->bi_hw_segments++;
- done:
- bio->bi_size += len;
return len;
}
--
1.5.4.rc4
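The double counting described in the patch description can be modeled in userspace. The following is only a sketch under stated assumptions: the `toy_*` names, the 4k chunk size, and the all-or-nothing callback are hypothetical and are not taken from fs/bio.c or from any driver. How a real merge_bvec_fn reacts depends on how it uses bi_size and bv_len.

```c
#include <assert.h>

/* Toy stand-ins for the kernel structures; only the fields used here. */
struct toy_bvec { unsigned int bv_len; };
struct toy_bio  { unsigned int bi_size; };  /* bytes already in the bio */

#define TOY_CHUNK_BYTES 4096u  /* hypothetical RAID chunk size */

/*
 * Hypothetical all-or-nothing callback: it treats bvec as a NEW segment
 * appended after bi_size bytes and accepts it only if the total stays
 * within one chunk. Returns the number of bytes accepted.
 */
static unsigned int toy_merge_bvec_fn(const struct toy_bio *bio,
                                      const struct toy_bvec *bvec)
{
    if (bio->bi_size + bvec->bv_len <= TOY_CHUNK_BYTES)
        return bvec->bv_len;
    return 0;
}

/* Grow the last bvec by len bytes; detach != 0 models the patched path. */
static unsigned int toy_extend_last(struct toy_bio *bio,
                                    struct toy_bvec *prev,
                                    unsigned int len, int detach)
{
    unsigned int old_len = prev->bv_len;

    assert(bio->bi_size >= old_len);
    if (detach)
        bio->bi_size -= old_len;  /* prev is no longer counted in bi_size */

    prev->bv_len += len;
    if (toy_merge_bvec_fn(bio, prev) < len) {
        prev->bv_len -= len;      /* roll back the tentative growth */
        len = 0;
    }

    if (detach)
        bio->bi_size += prev->bv_len;  /* re-attach prev at its final length */
    else
        bio->bi_size += len;
    return len;
}

/* Append 1k blocks of one page until the callback rejects; return bi_size. */
static unsigned int toy_fill(int detach)
{
    struct toy_bvec prev = { 1024 };
    struct toy_bio bio = { 1024 };

    while (toy_extend_last(&bio, &prev, 1024, detach) != 0)
        ;
    return bio.bi_size;
}
```

In this model toy_fill(0) stops at 2048 bytes, because the bytes already in prev are counted once in bi_size and again in bv_len, while toy_fill(1) fills the whole 4096-byte chunk.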
* Re: [PATCH] fix bio_add_page for non trivial merge_bvec_fn case
2008-06-08 15:45 [PATCH] fix bio_add_page for non trivial merge_bvec_fn case Dmitri Monakhov
@ 2008-06-30 20:37 ` Andrew Morton
2008-07-01 7:14 ` Jens Axboe
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2008-06-30 20:37 UTC
To: Dmitri Monakhov; +Cc: linux-kernel, linux-fsdevel, Jens Axboe
On Sun, 08 Jun 2008 19:45:08 +0400
Dmitri Monakhov <dmonakhov@openvz.org> wrote:
> We have to properly decrease all of the bio's related counters, especially
> bi_size, so that merge_bvec_fn returns the right result. Without this, the
> usual result is false merge rejects for two absolutely valid bio_vecs. This
> may cause a significant performance penalty, for example on Itanium with
> page_size == 16k, fs_block_size == 1k, and a RAID block device with a small
> chunk_size.
>
Please cc Jens on BIO changes.
> ---
> fs/bio.c | 16 ++++++++++++----
> 1 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/fs/bio.c b/fs/bio.c
> index 7856257..d713074 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -332,14 +332,21 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>
> if (page == prev->bv_page &&
> offset == prev->bv_offset + prev->bv_len) {
> + /* Temprory detacth last bio_vec. */
whoa, drunken speling.
> + bio->bi_size -= prev->bv_len;
> + bio->bi_vcnt--;
> + bio->bi_phys_segments--;
> + bio->bi_hw_segments--;
> +
> prev->bv_len += len;
> if (q->merge_bvec_fn &&
> q->merge_bvec_fn(q, bio, prev) < len) {
> prev->bv_len -= len;
> - return 0;
> + len = 0;
> }
>
> - goto done;
> + bio->bi_size += prev->bv_len;
> + goto out_add_bvec;
> }
> }
>
> @@ -394,11 +401,12 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
> BIOVEC_VIRT_MERGEABLE(bvec-1, bvec)))
> bio->bi_flags &= ~(1 << BIO_SEG_VALID);
>
> + bio->bi_size += len;
> +out_add_bvec:
> bio->bi_vcnt++;
> bio->bi_phys_segments++;
> bio->bi_hw_segments++;
> - done:
> - bio->bi_size += len;
> return len;
> }
For some reason patch(1) claims this hunk was corrupted. I typed it in
by hand.
* Re: [PATCH] fix bio_add_page for non trivial merge_bvec_fn case
2008-06-30 20:37 ` Andrew Morton
@ 2008-07-01 7:14 ` Jens Axboe
0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2008-07-01 7:14 UTC
To: Andrew Morton; +Cc: Dmitri Monakhov, linux-kernel, linux-fsdevel
On Mon, Jun 30 2008, Andrew Morton wrote:
> On Sun, 08 Jun 2008 19:45:08 +0400
> Dmitri Monakhov <dmonakhov@openvz.org> wrote:
>
> > We have to properly decrease all of the bio's related counters, especially
> > bi_size, so that merge_bvec_fn returns the right result. Without this, the
> > usual result is false merge rejects for two absolutely valid bio_vecs. This
> > may cause a significant performance penalty, for example on Itanium with
> > page_size == 16k, fs_block_size == 1k, and a RAID block device with a small
> > chunk_size.
> >
>
> Please cc Jens on BIO changes.
>
> > ---
> > fs/bio.c | 16 ++++++++++++----
> > 1 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/bio.c b/fs/bio.c
> > index 7856257..d713074 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -332,14 +332,21 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
> >
> > if (page == prev->bv_page &&
> > offset == prev->bv_offset + prev->bv_len) {
> > + /* Temprory detacth last bio_vec. */
>
> whoa, drunken speling.
>
> > + bio->bi_size -= prev->bv_len;
> > + bio->bi_vcnt--;
> > + bio->bi_phys_segments--;
> > + bio->bi_hw_segments--;
> > +
This logic isn't quite right: the rules for what constitutes a new hw or
phys segment are not a 1:1 mapping with the number of pages in the bio. How
about just dropping the segment decrement? The merge_bvec fn should not
care, and we'll retry and coalesce the segment count if we get to the limit
anyway.
--
Jens Axboe
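One possible reading of the suggestion above, sketched as a userspace model rather than as a kernel patch. The `sketch_*` names and both callbacks are hypothetical. The sketch assumes that only bi_size and bi_vcnt are detached around the callback, and that the merge path therefore also skips the segment increments it would otherwise reach at out_add_bvec.

```c
#include <assert.h>

/* Toy stand-in for struct bio; only the counters discussed in this thread. */
struct sketch_bio {
    unsigned int bi_size;            /* total bytes in the bio */
    unsigned short bi_vcnt;          /* number of bio_vecs */
    unsigned short bi_phys_segments;
    unsigned short bi_hw_segments;
};

struct sketch_bvec { unsigned int bv_len; };

/* Hypothetical callback type: returns how many bytes of bvec it accepts. */
typedef unsigned int (*sketch_merge_fn)(const struct sketch_bio *,
                                        const struct sketch_bvec *);

/* Hypothetical callbacks used only to exercise both outcomes. */
static unsigned int sketch_accept_all(const struct sketch_bio *bio,
                                      const struct sketch_bvec *bvec)
{
    (void)bio;
    return bvec->bv_len;
}

static unsigned int sketch_reject_all(const struct sketch_bio *bio,
                                      const struct sketch_bvec *bvec)
{
    (void)bio;
    (void)bvec;
    return 0;
}

/* Grow the last bvec by len bytes without touching the segment counters. */
static unsigned int sketch_extend_last(struct sketch_bio *bio,
                                       struct sketch_bvec *prev,
                                       unsigned int len, sketch_merge_fn fn)
{
    unsigned int old_len = prev->bv_len;

    assert(bio->bi_size >= old_len && bio->bi_vcnt > 0);

    /* Detach only the counters the callback is assumed to look at. */
    bio->bi_size -= old_len;
    bio->bi_vcnt--;

    prev->bv_len += len;
    if (fn && fn(bio, prev) < len) {
        prev->bv_len -= len;  /* rejected: restore the old length */
        len = 0;
    }

    /* Re-attach prev; bi_phys_segments and bi_hw_segments stay as they were. */
    bio->bi_size += prev->bv_len;
    bio->bi_vcnt++;
    return len;
}
```

Whether the callback accepts or rejects the merge, the segment counters come out unchanged in this model, which leaves any recount to the existing segment-limit handling.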