linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
@ 2010-01-26 13:15 Dmitry Monakhov
  2010-01-26 13:29 ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Monakhov @ 2010-01-26 13:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 133 bytes --]

Hi, year ago I've sent a patch which fix false bio merge rejects, but
seems patch was missed. Currently the issue is still present.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-PATCH-block-fix-bio_add_page-for-non-trivial-merge_b.patch --]
[-- Type: text/x-diff, Size: 1129 bytes --]

>From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001
From: Dmitry Monakhov <dmonakhov@openvz.org>
Date: Tue, 26 Jan 2010 16:01:34 +0300
Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case

We have to properly decrease bi_size in order to merge_bvec_fn return
right result.  Otherwise this result in false merge rejects for two
absolutely valid bio_vecs.  This may cause significant performance penalty
for example Itanium: page_size == 16k, fs_block_size == 1k and block device
is raid with small chunk_size.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/bio.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 76e6713..9f8e517 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 				struct bvec_merge_data bvm = {
 					.bi_bdev = bio->bi_bdev,
 					.bi_sector = bio->bi_sector,
-					.bi_size = bio->bi_size,
+					.bi_size = bio->bi_size -
+							(prev->bv_len - len),
 					.bi_rw = bio->bi_rw,
 				};
 
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
  2010-01-26 13:15 [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case Dmitry Monakhov
@ 2010-01-26 13:29 ` Jens Axboe
  2010-01-26 14:17   ` Dmitry Monakhov
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2010-01-26 13:29 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-kernel

On Tue, Jan 26 2010, Dmitry Monakhov wrote:
> Hi, year ago I've sent a patch which fix false bio merge rejects, but
> seems patch was missed. Currently the issue is still present.
> 

> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> Date: Tue, 26 Jan 2010 16:01:34 +0300
> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
> 
> We have to properly decrease bi_size in order to merge_bvec_fn return
> right result.  Otherwise this result in false merge rejects for two
> absolutely valid bio_vecs.  This may cause significant performance penalty
> for example Itanium: page_size == 16k, fs_block_size == 1k and block device
> is raid with small chunk_size.
> 
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
>  fs/bio.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/bio.c b/fs/bio.c
> index 76e6713..9f8e517 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>  				struct bvec_merge_data bvm = {
>  					.bi_bdev = bio->bi_bdev,
>  					.bi_sector = bio->bi_sector,
> -					.bi_size = bio->bi_size,
> +					.bi_size = bio->bi_size -
> +							(prev->bv_len - len),
>  					.bi_rw = bio->bi_rw,
>  				};

Hmm confused. why isn't this just bio->bi_size - len?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn  case
  2010-01-26 13:29 ` Jens Axboe
@ 2010-01-26 14:17   ` Dmitry Monakhov
  2010-01-27 21:11     ` Dmitry Monakhov
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Monakhov @ 2010-01-26 14:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

On Tue, Jan 26, 2010 at 4:29 PM, Jens Axboe <jens.axboe@oracle.com> wrote:
> On Tue, Jan 26 2010, Dmitry Monakhov wrote:
>> Hi, year ago I've sent a patch which fix false bio merge rejects, but
>> seems patch was missed. Currently the issue is still present.
>>
>
>> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001
>> From: Dmitry Monakhov <dmonakhov@openvz.org>
>> Date: Tue, 26 Jan 2010 16:01:34 +0300
>> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
>>
>> We have to properly decrease bi_size in order to merge_bvec_fn return
>> right result.  Otherwise this result in false merge rejects for two
>> absolutely valid bio_vecs.  This may cause significant performance penalty
>> for example Itanium: page_size == 16k, fs_block_size == 1k and block device
>> is raid with small chunk_size.
>>
>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
>> ---
>>  fs/bio.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/bio.c b/fs/bio.c
>> index 76e6713..9f8e517 100644
>> --- a/fs/bio.c
>> +++ b/fs/bio.c
>> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>                               struct bvec_merge_data bvm = {
>>                                       .bi_bdev = bio->bi_bdev,
>>                                       .bi_sector = bio->bi_sector,
>> -                                     .bi_size = bio->bi_size,
>> +                                     .bi_size = bio->bi_size -
>> +                                                     (prev->bv_len - len),
>>                                       .bi_rw = bio->bi_rw,
>>                               };
>
> Hmm confused. why isn't this just bio->bi_size - len?
because we have following scenario:
0) old_bv_len = prev->bv_len ;
1) prev->bv_len += len;
2)->merge_bvec_fn()
      usually it looks like follows
      if (bio->bv_len + bvm->bi_size > max_chunk_size)
          goto fail
     So formally we detach last bvec increase it size and try to
attach it to bio.

     but old_bv_len is already accounted in bi_size, so we have to decrease
    bi_size to this value, and old_bv_len = (prev->bv_len - len) at
that moment.
    I've used math manipuation in order to avoid temporary variable
    because stack size is matter on block layer.
>
> --
> Jens Axboe
>
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn  case
  2010-01-26 14:17   ` Dmitry Monakhov
@ 2010-01-27 21:11     ` Dmitry Monakhov
  2010-01-27 21:14       ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Monakhov @ 2010-01-27 21:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2212 bytes --]

Dmitry Monakhov <dmonakhov@openvz.org> writes:

> On Tue, Jan 26, 2010 at 4:29 PM, Jens Axboe <jens.axboe@oracle.com> wrote:
>> On Tue, Jan 26 2010, Dmitry Monakhov wrote:
>>> Hi, year ago I've sent a patch which fix false bio merge rejects, but
>>> seems patch was missed. Currently the issue is still present.
>>>
>>
>>> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001
>>> From: Dmitry Monakhov <dmonakhov@openvz.org>
>>> Date: Tue, 26 Jan 2010 16:01:34 +0300
>>> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
>>>
>>> We have to properly decrease bi_size in order to merge_bvec_fn return
>>> right result.  Otherwise this result in false merge rejects for two
>>> absolutely valid bio_vecs.  This may cause significant performance penalty
>>> for example Itanium: page_size == 16k, fs_block_size == 1k and block device
>>> is raid with small chunk_size.
>>>
>>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
>>> ---
>>>  fs/bio.c |    3 ++-
>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/fs/bio.c b/fs/bio.c
>>> index 76e6713..9f8e517 100644
>>> --- a/fs/bio.c
>>> +++ b/fs/bio.c
>>> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>>                               struct bvec_merge_data bvm = {
>>>                                       .bi_bdev = bio->bi_bdev,
>>>                                       .bi_sector = bio->bi_sector,
>>> -                                     .bi_size = bio->bi_size,
>>> +                                     .bi_size = bio->bi_size -
>>> +                                                     (prev->bv_len - len),
>>>                                       .bi_rw = bio->bi_rw,
>>>                               };
>>
>> Hmm confused. why isn't this just bio->bi_size - len?
I've attached more descriptive  version of the patch. Jens, please
clarify your opinion to the patch( do you like it or not?)
I don't want it miss again.

[-- Attachment #2: 0001-PATCH-block-fix-bio_add_page-for-non-trivial-merge_b.patch --]
[-- Type: text/plain, Size: 1521 bytes --]

>From 65ac8c9a2988310a79215b68e9c93b6dca6c6665 Mon Sep 17 00:00:00 2001
From: Dmitry Monakhov <dmonakhov@openvz.org>
Date: Wed, 27 Jan 2010 22:44:36 +0300
Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case

We have to properly decrease bi_size in order to merge_bvec_fn return
right result.  Otherwise this result in false merge rejects for two
absolutely valid bio_vecs.  This may cause significant performance
penalty for example fs_block_size == 1k and block device is raid0 with
small chunk_size = 8k. Then it is impossible to merge 7-th fs-block in
to bio which already has 6 fs-blocks.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 fs/bio.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 76e6713..7448c7d 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -542,13 +542,18 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 
 		if (page == prev->bv_page &&
 		    offset == prev->bv_offset + prev->bv_len) {
+			unsigned int prev_bv_len = prev->bv_len;
 			prev->bv_len += len;
 
 			if (q->merge_bvec_fn) {
 				struct bvec_merge_data bvm = {
+					/* prev_bvec is already charged in
+					   bi_size, discharge it in order to
+					   simulate merging updated prev_bvec
+					   as new bvec. */
 					.bi_bdev = bio->bi_bdev,
 					.bi_sector = bio->bi_sector,
-					.bi_size = bio->bi_size,
+					.bi_size = bio->bi_size - prev_bv_len,
 					.bi_rw = bio->bi_rw,
 				};
 
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
  2010-01-27 21:11     ` Dmitry Monakhov
@ 2010-01-27 21:14       ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2010-01-27 21:14 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: linux-kernel

On Thu, Jan 28 2010, Dmitry Monakhov wrote:
> Dmitry Monakhov <dmonakhov@openvz.org> writes:
> 
> > On Tue, Jan 26, 2010 at 4:29 PM, Jens Axboe <jens.axboe@oracle.com> wrote:
> >> On Tue, Jan 26 2010, Dmitry Monakhov wrote:
> >>> Hi, year ago I've sent a patch which fix false bio merge rejects, but
> >>> seems patch was missed. Currently the issue is still present.
> >>>
> >>
> >>> From 92a97ef181e15caa94bd56a1ade5c337db599b79 Mon Sep 17 00:00:00 2001
> >>> From: Dmitry Monakhov <dmonakhov@openvz.org>
> >>> Date: Tue, 26 Jan 2010 16:01:34 +0300
> >>> Subject: [PATCH] [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case
> >>>
> >>> We have to properly decrease bi_size in order to merge_bvec_fn return
> >>> right result.  Otherwise this result in false merge rejects for two
> >>> absolutely valid bio_vecs.  This may cause significant performance penalty
> >>> for example Itanium: page_size == 16k, fs_block_size == 1k and block device
> >>> is raid with small chunk_size.
> >>>
> >>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> >>> ---
> >>>  fs/bio.c |    3 ++-
> >>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/fs/bio.c b/fs/bio.c
> >>> index 76e6713..9f8e517 100644
> >>> --- a/fs/bio.c
> >>> +++ b/fs/bio.c
> >>> @@ -548,7 +548,8 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
> >>>                               struct bvec_merge_data bvm = {
> >>>                                       .bi_bdev = bio->bi_bdev,
> >>>                                       .bi_sector = bio->bi_sector,
> >>> -                                     .bi_size = bio->bi_size,
> >>> +                                     .bi_size = bio->bi_size -
> >>> +                                                     (prev->bv_len - len),
> >>>                                       .bi_rw = bio->bi_rw,
> >>>                               };
> >>
> >> Hmm confused. why isn't this just bio->bi_size - len?
> I've attached more descriptive  version of the patch. Jens, please
> clarify your opinion to the patch( do you like it or not?)
> I don't want it miss again.

Yes thanks, I agree with your evaluation, it's definitely correct. I'll
add it.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-01-27 21:14 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-26 13:15 [PATCH] block: fix bio_add_page for non trivial merge_bvec_fn case Dmitry Monakhov
2010-01-26 13:29 ` Jens Axboe
2010-01-26 14:17   ` Dmitry Monakhov
2010-01-27 21:11     ` Dmitry Monakhov
2010-01-27 21:14       ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).