* [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case @ 2010-01-26 13:15 Dmitry Monakhov 2010-01-26 13:29 ` Jens Axboe 0 siblings, 1 reply; 5+ messages in thread From: Dmitry Monakhov @ 2010-01-26 13:15 UTC (permalink / raw) To: linux-kernel; +Cc: Jens Axboe [-- Attachment #1: Type: text/plain, Size: 133 bytes --] Hi, year ago I've sent a patch which fix false bio merge rejects, but seems patch was missed. Currently the issue is still present. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-PATCH-block-fix-bio_add_page-for-non-trivial-merge_b.patch --] [-- Type: text/x-diff, Size: 1129 bytes --] >From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001 From: Dmitry Monakhov <dmonakhov@openvz.org> Date: Tue, 26 Jan 2010 16:01:34 +0300 Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case We have to properly decrease bi_size in order to merge_bvec_fn return right result. Otherwise this result in false merge rejects for two absolutely valid bio_vecs. This may cause significant performance penalty for example Itanium: page_size == 16k, fs_block_size == 1k and block device is raid with small chunk_size. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/bio.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/fs/bio.c b/fs/bio.c index 76e6713..9f8e517 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page struct bvec_merge_data bvm = { .bi_bdev = bio->bi_bdev, .bi_sector = bio->bi_sector, - .bi_size = bio->bi_size, + .bi_size = bio->bi_size - + (prev->bv_len - len), .bi_rw = bio->bi_rw, }; -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case 2010-01-26 13:15 [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case Dmitry Monakhov @ 2010-01-26 13:29 ` Jens Axboe 2010-01-26 14:17 ` Dmitry Monakhov 0 siblings, 1 reply; 5+ messages in thread From: Jens Axboe @ 2010-01-26 13:29 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: linux-kernel On Tue, Jan 26 2010, Dmitry Monakhov wrote: > Hi, year ago I've sent a patch which fix false bio merge rejects, but > seems patch was missed. Currently the issue is still present. > > From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001 > From: Dmitry Monakhov <dmonakhov@openvz.org> > Date: Tue, 26 Jan 2010 16:01:34 +0300 > Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case > > We have to properly decrease bi_size in order to merge_bvec_fn return > right result. Otherwise this result in false merge rejects for two > absolutely valid bio_vecs. This may cause significant performance penalty > for example Itanium: page_size == 16k, fs_block_size == 1k and block device > is raid with small chunk_size. > > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> > --- > fs/bio.c | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/fs/bio.c b/fs/bio.c > index 76e6713..9f8e517 100644 > --- a/fs/bio.c > +++ b/fs/bio.c > @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page > struct bvec_merge_data bvm = { > .bi_bdev = bio->bi_bdev, > .bi_sector = bio->bi_sector, > - .bi_size = bio->bi_size, > + .bi_size = bio->bi_size - > + (prev->bv_len - len), > .bi_rw = bio->bi_rw, > }; Hmm confused. why isn't this just bio->bi_size - len? -- Jens Axboe ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case 2010-01-26 13:29 ` Jens Axboe @ 2010-01-26 14:17 ` Dmitry Monakhov 2010-01-27 21:11 ` Dmitry Monakhov 0 siblings, 1 reply; 5+ messages in thread From: Dmitry Monakhov @ 2010-01-26 14:17 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-kernel On Tue, Jan 26, 2010 at 4:29 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > On Tue, Jan 26 2010, Dmitry Monakhov wrote: >> Hi, year ago I've sent a patch which fix false bio merge rejects, but >> seems patch was missed. Currently the issue is still present. >> > >> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001 >> From: Dmitry Monakhov <dmonakhov@openvz.org> >> Date: Tue, 26 Jan 2010 16:01:34 +0300 >> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case >> >> We have to properly decrease bi_size in order to merge_bvec_fn return >> right result. Otherwise this result in false merge rejects for two >> absolutely valid bio_vecs. This may cause significant performance penalty >> for example Itanium: page_size == 16k, fs_block_size == 1k and block device >> is raid with small chunk_size. >> >> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> >> --- >> fs/bio.c | 3 ++- >> 1 files changed, 2 insertions(+), 1 deletions(-) >> >> diff --git a/fs/bio.c b/fs/bio.c >> index 76e6713..9f8e517 100644 >> --- a/fs/bio.c >> +++ b/fs/bio.c >> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page >> struct bvec_merge_data bvm = { >> .bi_bdev = bio->bi_bdev, >> .bi_sector = bio->bi_sector, >> - .bi_size = bio->bi_size, >> + .bi_size = bio->bi_size - >> + (prev->bv_len - len), >> .bi_rw = bio->bi_rw, >> }; > > Hmm confused. why isn't this just bio->bi_size - len? because we have following scenario: 0) old_bv_len = prev->bv_len ; 1) prev->bv_len += len; 2)->merge_bvec_fn() usually it looks like follows if (bio->bv_len + bvm->bi_size > max_chunk_size) goto fail So formally we detach last bvec increase it size and try to attach it to bio. but old_bv_len is already accounted in bi_size, so we have to decrease bi_size to this value, and old_bv_len = (prev->bv_len - len) at that moment. I've used math manipuation in order to avoid temporary variable because stack size is matter on block layer. > > -- > Jens Axboe > > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case 2010-01-26 14:17 ` Dmitry Monakhov @ 2010-01-27 21:11 ` Dmitry Monakhov 2010-01-27 21:14 ` Jens Axboe 0 siblings, 1 reply; 5+ messages in thread From: Dmitry Monakhov @ 2010-01-27 21:11 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 2212 bytes --] Dmitry Monakhov <dmonakhov@openvz.org> writes: > On Tue, Jan 26, 2010 at 4:29 PM, Jens Axboe <jens.axboe@oracle.com> wrote: >> On Tue, Jan 26 2010, Dmitry Monakhov wrote: >>> Hi, year ago I've sent a patch which fix false bio merge rejects, but >>> seems patch was missed. Currently the issue is still present. >>> >> >>> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001 >>> From: Dmitry Monakhov <dmonakhov@openvz.org> >>> Date: Tue, 26 Jan 2010 16:01:34 +0300 >>> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case >>> >>> We have to properly decrease bi_size in order to merge_bvec_fn return >>> right result. Otherwise this result in false merge rejects for two >>> absolutely valid bio_vecs. This may cause significant performance penalty >>> for example Itanium: page_size == 16k, fs_block_size == 1k and block device >>> is raid with small chunk_size. >>> >>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> >>> --- >>> fs/bio.c | 3 ++- >>> 1 files changed, 2 insertions(+), 1 deletions(-) >>> >>> diff --git a/fs/bio.c b/fs/bio.c >>> index 76e6713..9f8e517 100644 >>> --- a/fs/bio.c >>> +++ b/fs/bio.c >>> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page >>> struct bvec_merge_data bvm = { >>> .bi_bdev = bio->bi_bdev, >>> .bi_sector = bio->bi_sector, >>> - .bi_size = bio->bi_size, >>> + .bi_size = bio->bi_size - >>> + (prev->bv_len - len), >>> .bi_rw = bio->bi_rw, >>> }; >> >> Hmm confused. why isn't this just bio->bi_size - len? I've attached more descriptive version of the patch. Jens, please clarify your opinion to the patch( do you like it or not?) I don't want it miss again. [-- Attachment #2: 0001-PATCH-block-fix-bio_add_page-for-non-trivial-merge_b.patch --] [-- Type: text/plain, Size: 1521 bytes --] >From 65ac8c9a2988310a79215b68e9c93b6dca6c6665 Mon Sep 17 00:00:00 2001 From: Dmitry Monakhov <dmonakhov@openvz.org> Date: Wed, 27 Jan 2010 22:44:36 +0300 Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case We have to properly decrease bi_size in order to merge_bvec_fn return right result. Otherwise this result in false merge rejects for two absolutely valid bio_vecs. This may cause significant performance penalty for example fs_block_size == 1k and block device is raid0 with small chunk_size = 8k. Then it is impossible to merge 7-th fs-block in to bio which already has 6 fs-blocks. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/bio.c | 7 ++++++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/fs/bio.c b/fs/bio.c index 76e6713..7448c7d 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -542,13 +542,18 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page if (page == prev->bv_page && offset == prev->bv_offset + prev->bv_len) { + unsigned int prev_bv_len = prev->bv_len; prev->bv_len += len; if (q->merge_bvec_fn) { struct bvec_merge_data bvm = { + /* prev_bvec is already charged in + bi_size, discharge it in order to + simulate merging updated prev_bvec + as new bvec. */ .bi_bdev = bio->bi_bdev, .bi_sector = bio->bi_sector, - .bi_size = bio->bi_size, + .bi_size = bio->bi_size - prev_bv_len, .bi_rw = bio->bi_rw, }; -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case 2010-01-27 21:11 ` Dmitry Monakhov @ 2010-01-27 21:14 ` Jens Axboe 0 siblings, 0 replies; 5+ messages in thread From: Jens Axboe @ 2010-01-27 21:14 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: linux-kernel On Thu, Jan 28 2010, Dmitry Monakhov wrote: > Dmitry Monakhov <dmonakhov@openvz.org> writes: > > > On Tue, Jan 26, 2010 at 4:29 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > >> On Tue, Jan 26 2010, Dmitry Monakhov wrote: > >>> Hi, year ago I've sent a patch which fix false bio merge rejects, but > >>> seems patch was missed. Currently the issue is still present. > >>> > >> > >>> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001 > >>> From: Dmitry Monakhov <dmonakhov@openvz.org> > >>> Date: Tue, 26 Jan 2010 16:01:34 +0300 > >>> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case > >>> > >>> We have to properly decrease bi_size in order to merge_bvec_fn return > >>> right result. Otherwise this result in false merge rejects for two > >>> absolutely valid bio_vecs. This may cause significant performance penalty > >>> for example Itanium: page_size == 16k, fs_block_size == 1k and block device > >>> is raid with small chunk_size. > >>> > >>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> > >>> --- > >>> fs/bio.c | 3 ++- > >>> 1 files changed, 2 insertions(+), 1 deletions(-) > >>> > >>> diff --git a/fs/bio.c b/fs/bio.c > >>> index 76e6713..9f8e517 100644 > >>> --- a/fs/bio.c > >>> +++ b/fs/bio.c > >>> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page > >>> struct bvec_merge_data bvm = { > >>> .bi_bdev = bio->bi_bdev, > >>> .bi_sector = bio->bi_sector, > >>> - .bi_size = bio->bi_size, > >>> + .bi_size = bio->bi_size - > >>> + (prev->bv_len - len), > >>> .bi_rw = bio->bi_rw, > >>> }; > >> > >> Hmm confused. why isn't this just bio->bi_size - len? > I've attached more descriptive version of the patch. Jens, please > clarify your opinion to the patch( do you like it or not?) > I don't want it miss again. Yes thanks, I agree with your evaluation, it's definitely correct. I'll add it. -- Jens Axboe ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-01-27 21:14 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-26 13:15 [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case Dmitry Monakhov 2010-01-26 13:29 ` Jens Axboe 2010-01-26 14:17 ` Dmitry Monakhov 2010-01-27 21:11 ` Dmitry Monakhov 2010-01-27 21:14 ` Jens Axboe
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.