* [PATCH] block: makes bio_split support bio without data
@ 2012-09-24 4:56 NeilBrown
2012-09-24 8:35 ` Namhyung Kim
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: NeilBrown @ 2012-09-24 4:56 UTC (permalink / raw)
To: Jens Axboe; +Cc: Shaohua Li, lkml
Hi Jens,
this patch has been sitting in my -next tree for a little while and I was
hoping for it to go in for the next merge window.
It simply allows bio_split() to be used on bios without a payload, such as
'discard'.
Are you happy with it going in through my 'md' tree, or would you rather take
it through your 'block' tree?
Thanks,
NeilBrown
From: Shaohua Li <shli@fusionio.com>
Date: Thu, 20 Sep 2012 09:36:03 +1000
Subject: [PATCH] block: makes bio_split support bio without data
discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
bio_split works for such bio.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/fs/bio.c b/fs/bio.c
index 71072ab..dbb7a6c 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1501,7 +1501,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
bi->bi_sector + first_sectors);
- BUG_ON(bi->bi_vcnt != 1);
+ BUG_ON(bi->bi_vcnt != 1 && bi->bi_vcnt != 0);
BUG_ON(bi->bi_idx != 0);
atomic_set(&bp->cnt, 3);
bp->error = 0;
@@ -1511,17 +1511,19 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
bp->bio2.bi_size -= first_sectors << 9;
bp->bio1.bi_size = first_sectors << 9;
- bp->bv1 = bi->bi_io_vec[0];
- bp->bv2 = bi->bi_io_vec[0];
- bp->bv2.bv_offset += first_sectors << 9;
- bp->bv2.bv_len -= first_sectors << 9;
- bp->bv1.bv_len = first_sectors << 9;
+ if (bi->bi_vcnt != 0) {
+ bp->bv1 = bi->bi_io_vec[0];
+ bp->bv2 = bi->bi_io_vec[0];
+ bp->bv2.bv_offset += first_sectors << 9;
+ bp->bv2.bv_len -= first_sectors << 9;
+ bp->bv1.bv_len = first_sectors << 9;
- bp->bio1.bi_io_vec = &bp->bv1;
- bp->bio2.bi_io_vec = &bp->bv2;
+ bp->bio1.bi_io_vec = &bp->bv1;
+ bp->bio2.bi_io_vec = &bp->bv2;
- bp->bio1.bi_max_vecs = 1;
- bp->bio2.bi_max_vecs = 1;
+ bp->bio1.bi_max_vecs = 1;
+ bp->bio2.bi_max_vecs = 1;
+ }
bp->bio1.bi_end_io = bio_pair_end_1;
bp->bio2.bi_end_io = bio_pair_end_2;
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-24 4:56 [PATCH] block: makes bio_split support bio without data NeilBrown
@ 2012-09-24 8:35 ` Namhyung Kim
2012-09-24 23:37 ` NeilBrown
2012-09-25 12:51 ` Jens Axboe
2012-09-28 16:23 ` Kent Overstreet
2 siblings, 1 reply; 12+ messages in thread
From: Namhyung Kim @ 2012-09-24 8:35 UTC (permalink / raw)
To: NeilBrown; +Cc: Jens Axboe, Shaohua Li, lkml
Hi,
On Mon, 24 Sep 2012 14:56:39 +1000, NeilBrown wrote:
> Hi Jens,
> this patch has been sitting in my -next tree for a little while and I was
> hoping for it to go in for the next merge window.
> It simply allows bio_split() to be used on bios without a payload, such as
> 'discard'.
> Are you happy with it going in through my 'md' tree, or would you rather take
> it through your 'block' tree?
>
> Thanks,
> NeilBrown
>
>
> From: Shaohua Li <shli@fusionio.com>
> Date: Thu, 20 Sep 2012 09:36:03 +1000
> Subject: [PATCH] block: makes bio_split support bio without data
>
> discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
> bio_split works for such bio.
>
> Signed-off-by: Shaohua Li <shli@fusionio.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/fs/bio.c b/fs/bio.c
> index 71072ab..dbb7a6c 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -1501,7 +1501,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
> bi->bi_sector + first_sectors);
>
> - BUG_ON(bi->bi_vcnt != 1);
> + BUG_ON(bi->bi_vcnt != 1 && bi->bi_vcnt != 0);
Why not
BUG_ON(bi->bi_vcnt > 1);
?
Thanks,
Namhyung
> BUG_ON(bi->bi_idx != 0);
> atomic_set(&bp->cnt, 3);
> bp->error = 0;
> @@ -1511,17 +1511,19 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> bp->bio2.bi_size -= first_sectors << 9;
> bp->bio1.bi_size = first_sectors << 9;
>
> - bp->bv1 = bi->bi_io_vec[0];
> - bp->bv2 = bi->bi_io_vec[0];
> - bp->bv2.bv_offset += first_sectors << 9;
> - bp->bv2.bv_len -= first_sectors << 9;
> - bp->bv1.bv_len = first_sectors << 9;
> + if (bi->bi_vcnt != 0) {
> + bp->bv1 = bi->bi_io_vec[0];
> + bp->bv2 = bi->bi_io_vec[0];
> + bp->bv2.bv_offset += first_sectors << 9;
> + bp->bv2.bv_len -= first_sectors << 9;
> + bp->bv1.bv_len = first_sectors << 9;
>
> - bp->bio1.bi_io_vec = &bp->bv1;
> - bp->bio2.bi_io_vec = &bp->bv2;
> + bp->bio1.bi_io_vec = &bp->bv1;
> + bp->bio2.bi_io_vec = &bp->bv2;
>
> - bp->bio1.bi_max_vecs = 1;
> - bp->bio2.bi_max_vecs = 1;
> + bp->bio1.bi_max_vecs = 1;
> + bp->bio2.bi_max_vecs = 1;
> + }
>
> bp->bio1.bi_end_io = bio_pair_end_1;
> bp->bio2.bi_end_io = bio_pair_end_2;
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-24 8:35 ` Namhyung Kim
@ 2012-09-24 23:37 ` NeilBrown
0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2012-09-24 23:37 UTC (permalink / raw)
To: Namhyung Kim; +Cc: Jens Axboe, Shaohua Li, lkml
On Mon, 24 Sep 2012 17:35:34 +0900 Namhyung Kim <namhyung@kernel.org> wrote:
> Hi,
>
> On Mon, 24 Sep 2012 14:56:39 +1000, NeilBrown wrote:
> > Hi Jens,
> > this patch has been sitting in my -next tree for a little while and I was
> > hoping for it to go in for the next merge window.
> > It simply allows bio_split() to be used on bios without a payload, such as
> > 'discard'.
> > Are you happy with it going in through my 'md' tree, or would you rather take
> > it through your 'block' tree?
> >
> > Thanks,
> > NeilBrown
> >
> >
> > From: Shaohua Li <shli@fusionio.com>
> > Date: Thu, 20 Sep 2012 09:36:03 +1000
> > Subject: [PATCH] block: makes bio_split support bio without data
> >
> > discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
> > bio_split works for such bio.
> >
> > Signed-off-by: Shaohua Li <shli@fusionio.com>
> > Signed-off-by: NeilBrown <neilb@suse.de>
> >
> > diff --git a/fs/bio.c b/fs/bio.c
> > index 71072ab..dbb7a6c 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -1501,7 +1501,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> > trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
> > bi->bi_sector + first_sectors);
> >
> > - BUG_ON(bi->bi_vcnt != 1);
> > + BUG_ON(bi->bi_vcnt != 1 && bi->bi_vcnt != 0);
>
> Why not
> BUG_ON(bi->bi_vcnt > 1);
> ?
Either is fine with me.
'1' and '0' are the cases that bio_split explicitly supports.
'>1' are the cases which will cause problems.
As bi_vcnt is unsigned, both conditions should produce exactly the same
machine code.
As I see no reason to prefer one over the other, I'm happy to go with what the
original author wrote.
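
The logical equivalence of the two conditions can be checked exhaustively in
userspace. This is a minimal sketch, not kernel code: it assumes bi_vcnt is an
unsigned short, and the function names are hypothetical. It says nothing about
the generated machine code, only that both tests accept and reject the same
values.

```c
#include <stdbool.h>
#include <limits.h>

/* Condition as written in the patch. */
static bool bug_as_patched(unsigned short vcnt)
{
	return vcnt != 1 && vcnt != 0;
}

/* Condition suggested in review. */
static bool bug_as_suggested(unsigned short vcnt)
{
	return vcnt > 1;
}

/* Returns true if the two conditions agree for every possible value. */
static bool conditions_agree(void)
{
	for (unsigned int v = 0; v <= USHRT_MAX; v++) {
		if (bug_as_patched((unsigned short)v) !=
		    bug_as_suggested((unsigned short)v))
			return false;
	}
	return true;
}
```

With a signed counter the two would differ for negative values, which is why
the unsignedness matters to the argument.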
Thanks,
NeilBrown
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-24 4:56 [PATCH] block: makes bio_split support bio without data NeilBrown
2012-09-24 8:35 ` Namhyung Kim
@ 2012-09-25 12:51 ` Jens Axboe
2012-09-28 7:36 ` Shaohua Li
2012-09-28 16:23 ` Kent Overstreet
2 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2012-09-25 12:51 UTC (permalink / raw)
To: NeilBrown; +Cc: Shaohua Li, lkml
On 09/24/2012 06:56 AM, NeilBrown wrote:
>
> Hi Jens,
> this patch has been sitting in my -next tree for a little while and I was
> hoping for it to go in for the next merge window.
> It simply allows bio_split() to be used on bios without a payload, such as
> 'discard'.
> Are you happy with it going in through my 'md' tree, or would you rather take
> it through your 'block' tree?
It should go through my tree, especially since we've got conflicts with
other changes. In other words, your patch does not apply to for-3.7/core
as-is...
Shaohua, could you resend an updated variant?
--
Jens Axboe
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-25 12:51 ` Jens Axboe
@ 2012-09-28 7:36 ` Shaohua Li
2012-09-28 8:39 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Shaohua Li @ 2012-09-28 7:36 UTC (permalink / raw)
To: Jens Axboe; +Cc: NeilBrown, Shaohua Li, lkml
On Tue, Sep 25, 2012 at 02:51:54PM +0200, Jens Axboe wrote:
> On 09/24/2012 06:56 AM, NeilBrown wrote:
> >
> > Hi Jens,
> > this patch has been sitting in my -next tree for a little while and I was
> > hoping for it to go in for the next merge window.
> > It simply allows bio_split() to be used on bios without a payload, such as
> > 'discard'.
> > Are you happy with it going in through my 'md' tree, or would you rather take
> > it through your 'block' tree?
>
> It should go through my tree, especially since we've got conflicts with
> other changes. In other words, your patch does not apply to for-3.7/core
> as-is...
>
> Shaohua, could you resend an updated variant?
Here is the one applied to for-3.7/core
Subject: block: makes bio_split support bio without data
discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
bio_split works for such bio.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
---
fs/bio.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)
Index: linux/fs/bio.c
===================================================================
--- linux.orig/fs/bio.c 2012-09-28 15:09:38.000000000 +0800
+++ linux/fs/bio.c 2012-09-28 15:25:38.955252846 +0800
@@ -1475,7 +1475,7 @@ struct bio_pair *bio_split(struct bio *b
trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
bi->bi_sector + first_sectors);
- BUG_ON(bi->bi_vcnt != 1);
+ BUG_ON(bi->bi_vcnt != 1 && bi->bi_vcnt != 0);
BUG_ON(bi->bi_idx != 0);
atomic_set(&bp->cnt, 3);
bp->error = 0;
@@ -1485,20 +1485,22 @@ struct bio_pair *bio_split(struct bio *b
bp->bio2.bi_size -= first_sectors << 9;
bp->bio1.bi_size = first_sectors << 9;
- bp->bv1 = bi->bi_io_vec[0];
- bp->bv2 = bi->bi_io_vec[0];
+ if (bi->bi_vcnt != 0) {
+ bp->bv1 = bi->bi_io_vec[0];
+ bp->bv2 = bi->bi_io_vec[0];
+
+ if (bio_is_rw(bi)) {
+ bp->bv2.bv_offset += first_sectors << 9;
+ bp->bv2.bv_len -= first_sectors << 9;
+ bp->bv1.bv_len = first_sectors << 9;
+ }
- if (bio_is_rw(bi)) {
- bp->bv2.bv_offset += first_sectors << 9;
- bp->bv2.bv_len -= first_sectors << 9;
- bp->bv1.bv_len = first_sectors << 9;
- }
-
- bp->bio1.bi_io_vec = &bp->bv1;
- bp->bio2.bi_io_vec = &bp->bv2;
+ bp->bio1.bi_io_vec = &bp->bv1;
+ bp->bio2.bi_io_vec = &bp->bv2;
- bp->bio1.bi_max_vecs = 1;
- bp->bio2.bi_max_vecs = 1;
+ bp->bio1.bi_max_vecs = 1;
+ bp->bio2.bi_max_vecs = 1;
+ }
bp->bio1.bi_end_io = bio_pair_end_1;
bp->bio2.bi_end_io = bio_pair_end_2;
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-28 7:36 ` Shaohua Li
@ 2012-09-28 8:39 ` Jens Axboe
0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2012-09-28 8:39 UTC (permalink / raw)
To: Shaohua Li; +Cc: NeilBrown, Shaohua Li, lkml
On 2012-09-28 09:36, Shaohua Li wrote:
> On Tue, Sep 25, 2012 at 02:51:54PM +0200, Jens Axboe wrote:
>> On 09/24/2012 06:56 AM, NeilBrown wrote:
>>>
>>> Hi Jens,
>>> this patch has been sitting in my -next tree for a little while and I was
>>> hoping for it to go in for the next merge window.
>>> It simply allows bio_split() to be used on bios without a payload, such as
>>> 'discard'.
>>> Are you happy with it going in through my 'md' tree, or would you rather take
>>> it through your 'block' tree?
>>
>> It should go through my tree, especially since we've got conflicts with
>> other changes. In other words, your patch does not apply to for-3.7/core
>> as-is...
>>
>> Shaohua, could you resend an updated variant?
> Here is the one applied to for-3.7/core
Thanks Shaohua, applied!
--
Jens Axboe
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-24 4:56 [PATCH] block: makes bio_split support bio without data NeilBrown
2012-09-24 8:35 ` Namhyung Kim
2012-09-25 12:51 ` Jens Axboe
@ 2012-09-28 16:23 ` Kent Overstreet
2012-10-02 6:22 ` NeilBrown
2 siblings, 1 reply; 12+ messages in thread
From: Kent Overstreet @ 2012-09-28 16:23 UTC (permalink / raw)
To: NeilBrown; +Cc: Jens Axboe, Shaohua Li, lkml
On Mon, Sep 24, 2012 at 02:56:39PM +1000, NeilBrown wrote:
>
> Hi Jens,
> this patch has been sitting in my -next tree for a little while and I was
> hoping for it to go in for the next merge window.
> It simply allows bio_split() to be used on bios without a payload, such as
> 'discard'.
Thing is, at some point in the stack a discard bio is going to have data
- see blk_add_request_payload(), and it used to be the single page was
added to discard bios above generic_make_request(), in
blkdev_issue_discard() or whatever it's called.
So while I'm sure your code works, it's just a fragile way of doing it.
There's also other types of bios where bi_size has nothing to do with
the amount of data in the bi_io_vec - actually I think this is a new
thing, since Martin Petersen just added REQ_WRITE_SAME and I don't think
there were any other instances besides REQ_DISCARD before.
So my preference would be defining a mask (REQ_DISCARD|REQ_WRITE_SAME),
and if bio->bi_rw & that mask is true, just duplicate the bvec or
whatever.
That way it's much more explicit and less likely to trip someone else
up later.
(I've actually got a patch in my tree that does just that, but it's
special cased in bio_advance() which makes things work out really
nicely).
> Are you happy with it going in through my 'md' tree, or would you rather take
> it through your 'block' tree?
>
> Thanks,
> NeilBrown
>
>
> From: Shaohua Li <shli@fusionio.com>
> Date: Thu, 20 Sep 2012 09:36:03 +1000
> Subject: [PATCH] block: makes bio_split support bio without data
>
> discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
> bio_split works for such bio.
>
> Signed-off-by: Shaohua Li <shli@fusionio.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/fs/bio.c b/fs/bio.c
> index 71072ab..dbb7a6c 100644
> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -1501,7 +1501,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
> bi->bi_sector + first_sectors);
>
> - BUG_ON(bi->bi_vcnt != 1);
> + BUG_ON(bi->bi_vcnt != 1 && bi->bi_vcnt != 0);
> BUG_ON(bi->bi_idx != 0);
> atomic_set(&bp->cnt, 3);
> bp->error = 0;
> @@ -1511,17 +1511,19 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> bp->bio2.bi_size -= first_sectors << 9;
> bp->bio1.bi_size = first_sectors << 9;
>
> - bp->bv1 = bi->bi_io_vec[0];
> - bp->bv2 = bi->bi_io_vec[0];
> - bp->bv2.bv_offset += first_sectors << 9;
> - bp->bv2.bv_len -= first_sectors << 9;
> - bp->bv1.bv_len = first_sectors << 9;
> + if (bi->bi_vcnt != 0) {
> + bp->bv1 = bi->bi_io_vec[0];
> + bp->bv2 = bi->bi_io_vec[0];
> + bp->bv2.bv_offset += first_sectors << 9;
> + bp->bv2.bv_len -= first_sectors << 9;
> + bp->bv1.bv_len = first_sectors << 9;
>
> - bp->bio1.bi_io_vec = &bp->bv1;
> - bp->bio2.bi_io_vec = &bp->bv2;
> + bp->bio1.bi_io_vec = &bp->bv1;
> + bp->bio2.bi_io_vec = &bp->bv2;
>
> - bp->bio1.bi_max_vecs = 1;
> - bp->bio2.bi_max_vecs = 1;
> + bp->bio1.bi_max_vecs = 1;
> + bp->bio2.bi_max_vecs = 1;
> + }
>
> bp->bio1.bi_end_io = bio_pair_end_1;
> bp->bio2.bi_end_io = bio_pair_end_2;
* Re: [PATCH] block: makes bio_split support bio without data
2012-09-28 16:23 ` Kent Overstreet
@ 2012-10-02 6:22 ` NeilBrown
2012-10-02 21:09 ` Kent Overstreet
0 siblings, 1 reply; 12+ messages in thread
From: NeilBrown @ 2012-10-02 6:22 UTC (permalink / raw)
To: Kent Overstreet; +Cc: Jens Axboe, Shaohua Li, lkml
On Fri, 28 Sep 2012 09:23:43 -0700 Kent Overstreet <koverstreet@google.com>
wrote:
> On Mon, Sep 24, 2012 at 02:56:39PM +1000, NeilBrown wrote:
> >
> > Hi Jens,
> > this patch has been sitting in my -next tree for a little while and I was
> > hoping for it to go in for the next merge window.
> > It simply allows bio_split() to be used on bios without a payload, such as
> > 'discard'.
>
> Thing is, at some point in the stack a discard bio is going to have data
> - see blk_add_request_payload(), and it used to be the single page was
> added to discard bios above generic_make_request(), in
> blkdev_issue_discard() or whatever it's called.
>
> So while I'm sure your code works, it's just a fragile way of doing it.
>
> There's also other types of bios where bi_size has nothing to do with
> the amount of data in the bi_io_vec - actually I think this is a new
> thing, since Martin Petersen just added REQ_WRITE_SAME and I don't think
> there were any other instances besides REQ_DISCARD before.
>
> So my preference would be defining a mask (REQ_DISCARD|REQ_WRITE_SAME),
> and if bio->bi_rw & that mask is true, just duplicate the bvec or
> whatever.
Hi Kent,
I'm afraid I don't see the relevance of your comments to the patch.
The current bio_split code can successfully split a bio with zero or one
bi_vec entry. If there are more than that, we cannot split.
How does it matter whether the bio is a DISCARD or a WRITE_SAME or a DATA or
whatever?
NeilBrown
>
> That way it's much more explicit and less likely to trip someone else
> up later.
>
> (I've actually got a patch in my tree that does just that, but it's
> special cased in bio_advance() which makes things work out really
> nicely).
>
> > Are you happy with it going in through my 'md' tree, or would you rather take
> > it through your 'block' tree?
> >
> > Thanks,
> > NeilBrown
> >
> >
> > From: Shaohua Li <shli@fusionio.com>
> > Date: Thu, 20 Sep 2012 09:36:03 +1000
> > Subject: [PATCH] block: makes bio_split support bio without data
> >
> > discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
> > bio_split works for such bio.
> >
> > Signed-off-by: Shaohua Li <shli@fusionio.com>
> > Signed-off-by: NeilBrown <neilb@suse.de>
> >
> > diff --git a/fs/bio.c b/fs/bio.c
> > index 71072ab..dbb7a6c 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -1501,7 +1501,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> > trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
> > bi->bi_sector + first_sectors);
> >
> > - BUG_ON(bi->bi_vcnt != 1);
> > + BUG_ON(bi->bi_vcnt != 1 && bi->bi_vcnt != 0);
> > BUG_ON(bi->bi_idx != 0);
> > atomic_set(&bp->cnt, 3);
> > bp->error = 0;
> > @@ -1511,17 +1511,19 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> > bp->bio2.bi_size -= first_sectors << 9;
> > bp->bio1.bi_size = first_sectors << 9;
> >
> > - bp->bv1 = bi->bi_io_vec[0];
> > - bp->bv2 = bi->bi_io_vec[0];
> > - bp->bv2.bv_offset += first_sectors << 9;
> > - bp->bv2.bv_len -= first_sectors << 9;
> > - bp->bv1.bv_len = first_sectors << 9;
> > + if (bi->bi_vcnt != 0) {
> > + bp->bv1 = bi->bi_io_vec[0];
> > + bp->bv2 = bi->bi_io_vec[0];
> > + bp->bv2.bv_offset += first_sectors << 9;
> > + bp->bv2.bv_len -= first_sectors << 9;
> > + bp->bv1.bv_len = first_sectors << 9;
> >
> > - bp->bio1.bi_io_vec = &bp->bv1;
> > - bp->bio2.bi_io_vec = &bp->bv2;
> > + bp->bio1.bi_io_vec = &bp->bv1;
> > + bp->bio2.bi_io_vec = &bp->bv2;
> >
> > - bp->bio1.bi_max_vecs = 1;
> > - bp->bio2.bi_max_vecs = 1;
> > + bp->bio1.bi_max_vecs = 1;
> > + bp->bio2.bi_max_vecs = 1;
> > + }
> >
> > bp->bio1.bi_end_io = bio_pair_end_1;
> > bp->bio2.bi_end_io = bio_pair_end_2;
* Re: [PATCH] block: makes bio_split support bio without data
2012-10-02 6:22 ` NeilBrown
@ 2012-10-02 21:09 ` Kent Overstreet
2012-10-03 3:30 ` NeilBrown
0 siblings, 1 reply; 12+ messages in thread
From: Kent Overstreet @ 2012-10-02 21:09 UTC (permalink / raw)
To: NeilBrown; +Cc: Jens Axboe, Shaohua Li, lkml
On Tue, Oct 02, 2012 at 04:22:01PM +1000, NeilBrown wrote:
> On Fri, 28 Sep 2012 09:23:43 -0700 Kent Overstreet <koverstreet@google.com>
> wrote:
>
> > On Mon, Sep 24, 2012 at 02:56:39PM +1000, NeilBrown wrote:
> > >
> > > Hi Jens,
> > > this patch has been sitting in my -next tree for a little while and I was
> > > hoping for it to go in for the next merge window.
> > > It simply allows bio_split() to be used on bios without a payload, such as
> > > 'discard'.
> >
> > Thing is, at some point in the stack a discard bio is going to have data
> > - see blk_add_request_payload(), and it used to be the single page was
> > added to discard bios above generic_make_request(), in
> > blkdev_issue_discard() or whatever it's called.
> >
> > So while I'm sure your code works, it's just a fragile way of doing it.
> >
> > There's also other types of bios where bi_size has nothing to do with
> > the amount of data in the bi_io_vec - actually I think this is a new
> > thing, since Martin Petersen just added REQ_WRITE_SAME and I don't think
> > there were any other instances besides REQ_DISCARD before.
> >
> > So my preference would be defining a mask (REQ_DISCARD|REQ_WRITE_SAME),
> > and if bio->bi_rw & that mask is true, just duplicate the bvec or
> > whatever.
>
> Hi Kent,
> I'm afraid I don't see the relevance of your comments to the patch.
>
> The current bio_split code can successfully split a bio with zero or one
> bi_vec entry. If there are more than that, we cannot split.
>
> How does it matter whether the bio is a DISCARD or a WRITE_SAME or a DATA or
> whatever?
Hrm, I think I didn't explain very well.
After your change, if bio->bi_vcnt != 0, then it splits the bvec.
The trouble is that discard bios do under certain circumstances have
bio->bi_vcnt != 0, in which case splitting the bvec is the wrong thing
to do - first_sectors will quite likely be bigger than the bvec.
In practice this isn't currently a problem for discard bios, because
since Christoph added blk_add_request_payload(), discard bios won't have
that bvec added until they hit the scsi layer which will be after any
splitting. But this is a fairly recent and unrelated change, and IMO not
the kind of behaviour I'd want to rely on.
WRITE_SAME is a problem for the same reason - bio_sectors(bio) may be
large, but the bio will always have a single bvec and splitting the bvec
is always the wrong thing to do for WRITE_SAME.
So, I think it makes more sense to make the splitting conditional on
!(bio->bi_rw & (REQ_DISCARD|REQ_WRITE_SAME)), in addition to
bio->bi_vcnt == 1.
..That make more sense?
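
The concern above can be illustrated in userspace. This is a minimal sketch,
assuming a WRITE_SAME-style bio whose single bvec holds 512 bytes while the
bio itself covers many sectors. The SK_ flag values, the struct, and the helper
names are hypothetical and do not match the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch; the kernel's differ. */
#define SK_REQ_DISCARD     (1UL << 0)
#define SK_REQ_WRITE_SAME  (1UL << 1)
#define SK_NO_BVEC_SPLIT   (SK_REQ_DISCARD | SK_REQ_WRITE_SAME)

struct sk_bvec {
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* True when the single bvec should be divided at the split point. */
static bool sk_should_split_bvec(unsigned long rw, unsigned short vcnt)
{
	return vcnt == 1 && !(rw & SK_NO_BVEC_SPLIT);
}

/*
 * Length the second half's bvec would get if the bvec were split
 * unconditionally, using the same unsigned arithmetic as
 * bv2.bv_len -= first_sectors << 9.
 */
static unsigned int sk_naive_second_len(struct sk_bvec bv,
					unsigned int first_sectors)
{
	return bv.bv_len - (first_sectors << 9);
}
```

When first_sectors << 9 exceeds bv_len, the subtraction wraps around and the
second bvec ends up with a huge bogus length, which is the failure the mask is
meant to rule out.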
* Re: [PATCH] block: makes bio_split support bio without data
2012-10-02 21:09 ` Kent Overstreet
@ 2012-10-03 3:30 ` NeilBrown
2012-10-03 3:42 ` Kent Overstreet
0 siblings, 1 reply; 12+ messages in thread
From: NeilBrown @ 2012-10-03 3:30 UTC (permalink / raw)
To: Kent Overstreet; +Cc: Jens Axboe, Shaohua Li, lkml
On Tue, 2 Oct 2012 14:09:23 -0700 Kent Overstreet <koverstreet@google.com>
wrote:
> On Tue, Oct 02, 2012 at 04:22:01PM +1000, NeilBrown wrote:
> > On Fri, 28 Sep 2012 09:23:43 -0700 Kent Overstreet <koverstreet@google.com>
> > wrote:
> >
> > > On Mon, Sep 24, 2012 at 02:56:39PM +1000, NeilBrown wrote:
> > > >
> > > > Hi Jens,
> > > > this patch has been sitting in my -next tree for a little while and I was
> > > > hoping for it to go in for the next merge window.
> > > > It simply allows bio_split() to be used on bios without a payload, such as
> > > > 'discard'.
> > >
> > > Thing is, at some point in the stack a discard bio is going to have data
> > > - see blk_add_request_payload(), and it used to be the single page was
> > > added to discard bios above generic_make_request(), in
> > > blkdev_issue_discard() or whatever it's called.
> > >
> > > So while I'm sure your code works, it's just a fragile way of doing it.
> > >
> > > There's also other types of bios where bi_size has nothing to do with
> > > the amount of data in the bi_io_vec - actually I think this is a new
> > > thing, since Martin Petersen just added REQ_WRITE_SAME and I don't think
> > > there were any other instances besides REQ_DISCARD before.
> > >
> > > So my preference would be defining a mask (REQ_DISCARD|REQ_WRITE_SAME),
> > > and if bio->bi_rw & that mask is true, just duplicate the bvec or
> > > whatever.
> >
> > Hi Kent,
> > I'm afraid I don't see the relevance of your comments to the patch.
> >
> > The current bio_split code can successfully split a bio with zero or one
> > bi_vec entry. If there are more than that, we cannot split.
> >
> > How does it matter whether the bio is a DISCARD or a WRITE_SAME or a DATA or
> > whatever?
>
> Hrm, I think I didn't explain very well.
>
> After your change, if bio->bi_vcnt != 0, then it splits the bvec.
>
> The trouble is that discard bios do under certain circumstances have
> bio->bi_vcnt != 0, in which case splitting the bvec is the wrong thing
> to do - first_sectors will quite likely be bigger than the bvec.
>
> In practice this isn't currently a problem for discard bios, because
> since Christoph added blk_add_request_payload(), discard bios won't have
> that bvec added until they hit the scsi layer which will be after any
> splitting. But this is a fairly recent and unrelated change, and IMO not
> the kind of behaviour I'd want to rely on.
>
> WRITE_SAME is a problem for the same reason - bio_sectors(bio) may be
> large, but the bio will always have a single bvec and splitting the bvec
> is always the wrong thing to do for WRITE_SAME.
>
> So, I think it makes more sense to make the splitting conditional on
> !(bio->bi_rw & (REQ_DISCARD|REQ_WRITE_SAME)), in addition to
> bio->bi_vcnt == 1.
>
> ..That make more sense?
Yes, that does make some more sense, thanks. However it doesn't convince me
that we need to change the patch.
I guess my position is that once we get to this code, we absolutely have to
split the bio - it maps to two separate devices in a RAID0 or similar so
not-splitting is not an option.
Maybe various md devices need to detect and reject REQ_DISCARD requests that
have a payload and REQ_WRITE_SAME requests? Or would they need to explicitly
set a flag to say they accept them?
So maybe there is something to fix, but I don't think it is in bio_split,
except maybe to add a WARN_ON?
Thanks,
NeilBrown
* Re: [PATCH] block: makes bio_split support bio without data
2012-10-03 3:30 ` NeilBrown
@ 2012-10-03 3:42 ` Kent Overstreet
2012-10-03 16:22 ` Martin K. Petersen
0 siblings, 1 reply; 12+ messages in thread
From: Kent Overstreet @ 2012-10-03 3:42 UTC (permalink / raw)
To: NeilBrown; +Cc: Jens Axboe, Shaohua Li, lkml, martin.petersen
Adding Martin to the cc, so he can chime in on WRITE_SAME if I got it
wrong
On Wed, Oct 03, 2012 at 01:30:45PM +1000, NeilBrown wrote:
> On Tue, 2 Oct 2012 14:09:23 -0700 Kent Overstreet <koverstreet@google.com>
> wrote:
>
> > On Tue, Oct 02, 2012 at 04:22:01PM +1000, NeilBrown wrote:
> > > On Fri, 28 Sep 2012 09:23:43 -0700 Kent Overstreet <koverstreet@google.com>
> > > wrote:
> > >
> > > > On Mon, Sep 24, 2012 at 02:56:39PM +1000, NeilBrown wrote:
> > > > >
> > > > > Hi Jens,
> > > > > this patch has been sitting in my -next tree for a little while and I was
> > > > > hoping for it to go in for the next merge window.
> > > > > It simply allows bio_split() to be used on bios without a payload, such as
> > > > > 'discard'.
> > > >
> > > > Thing is, at some point in the stack a discard bio is going to have data
> > > > - see blk_add_request_payload(), and it used to be the single page was
> > > > added to discard bios above generic_make_request(), in
> > > > blkdev_issue_discard() or whatever it's called.
> > > >
> > > > So while I'm sure your code works, it's just a fragile way of doing it.
> > > >
> > > > There's also other types of bios where bi_size has nothing to do with
> > > > the amount of data in the bi_io_vec - actually I think this is a new
> > > > thing, since Martin Petersen just added REQ_WRITE_SAME and I don't think
> > > > there were any other instances besides REQ_DISCARD before.
> > > >
> > > > So my preference would be defining a mask (REQ_DISCARD|REQ_WRITE_SAME),
> > > > and if bio->bi_rw & that mask is true, just duplicate the bvec or
> > > > whatever.
> > >
> > > Hi Kent,
> > > I'm afraid I don't see the relevance of your comments to the patch.
> > >
> > > The current bio_split code can successfully split a bio with zero or one
> > > bi_vec entry. If there are more than that, we cannot split.
> > >
> > > How does it matter whether the bio is a DISCARD or a WRITE_SAME or a DATA or
> > > whatever?
> >
> > Hrm, I think I didn't explain very well.
> >
> > After your change, if bio->bi_vcnt != 0, then it splits the bvec.
> >
> > The trouble is that discard bios do under certain circumstances have
> > bio->bi_vcnt != 0, in which case splitting the bvec is the wrong thing
> > to do - first_sectors will quite likely be bigger than the bvec.
> >
> > In practice this isn't currently a problem for discard bios, because
> > since Christoph added blk_add_request_payload(), discard bios won't have
> > that bvec added until they hit the scsi layer which will be after any
> > splitting. But this is a fairly recent and unrelated change, and IMO not
> > the kind of behaviour I'd want to rely on.
> >
> > WRITE_SAME is a problem for the same reason - bio_sectors(bio) may be
> > large, but the bio will always have a single bvec and splitting the bvec
> > is always the wrong thing to do for WRITE_SAME.
> >
> > So, I think it makes more sense to make the splitting conditional on
> > !(bio->bi_rw & (REQ_DISCARD|REQ_WRITE_SAME)), in addition to
> > bio->bi_vcnt == 1.
> >
> > ..That make more sense?
>
> Yes, that does make some more sense, thanks. However it doesn't convince me
> that we need to change the patch.
>
> I guess my position is that once we get to this code, we absolutely have to
> split the bio - it maps to two separate devices in a RAID0 or similar so
> not-splitting is not an option.
>
> Maybe various md devices need to detect and reject REQ_DISCARD requests that
> have a payload and REQ_WRITE_SAME requests? Or would they need to explicitly
> set a flag to say they accept them?
I think we should be able to split REQ_DISCARD bios that have a payload
or REQ_WRITE_SAME bios just fine though - for both of those cases, the
payload doesn't correspond to a particular sector, so just copy the
original bvec to the two splits and don't do anything else to it.
This gets so much cleaner with immutable bvecs :p
Actually that might be wrong for REQ_DISCARD bios if they had a payload,
I have no idea what that payload is actually for. But that should never
happen anymore; we could add WARN_ON((bio->bi_rw & REQ_DISCARD) &&
bio->bi_vcnt)
* Re: [PATCH] block: makes bio_split support bio without data
2012-10-03 3:42 ` Kent Overstreet
@ 2012-10-03 16:22 ` Martin K. Petersen
0 siblings, 0 replies; 12+ messages in thread
From: Martin K. Petersen @ 2012-10-03 16:22 UTC (permalink / raw)
To: Kent Overstreet; +Cc: NeilBrown, Jens Axboe, Shaohua Li, lkml, martin.petersen
>>>>> "Kent" == Kent Overstreet <koverstreet@google.com> writes:
Kent> I think we should be able to split REQ_DISCARD bios that have a
Kent> payload or REQ_WRITE_SAME bios just fine though - for both of
Kent> those cases, the payload doesn't correspond to a particular
Kent> sector, so just copy the original bvec to the two splits and don't
Kent> do anything else to it.
DISCARD bios come down with a single bvec that is later used in the SCSI
disk driver to describe a memory page that can then be mapped into a
scatter-gather list. The reason for this is that both ATA TRIM and SCSI
UNMAP put the block range descriptors in the payload rather than in the
command itself. By the time MD calls bio_split there will be an empty
bvec in the bio.
For WRITE SAME the parent payload contains a bvec describing a single
logical block of data (i.e. typically 512 bytes). The same bvec is used
for both bios in the pair.
For neither DISCARD, nor WRITE SAME do we need to muck with bv_offset
and bv_len. As a result, my patch uses the bio_is_rw() conditional to
wrap the bvec munging code.
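
The three cases discussed in this thread (no bvec, one bvec on an ordinary
read/write, one bvec on a DISCARD or WRITE SAME) can be modelled in userspace
as follows. This is a rough sketch under stated assumptions, not the kernel
implementation: the mdl_ types are hypothetical, the is_rw field stands in for
bio_is_rw(), and the range check on the split point was added for the sketch.

```c
#include <stdbool.h>

struct mdl_bvec {
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct mdl_bio {
	unsigned int bi_size;     /* bytes covered by the request */
	unsigned short bi_vcnt;   /* 0 or 1 in this model */
	bool is_rw;               /* stand-in for bio_is_rw() */
	struct mdl_bvec bvec;     /* meaningful only when bi_vcnt == 1 */
};

struct mdl_pair {
	struct mdl_bio bio1;
	struct mdl_bio bio2;
};

/*
 * Returns false where the kernel code would hit BUG_ON(), or where the
 * split point lies outside the bio (a check added for this sketch).
 */
static bool mdl_split(const struct mdl_bio *bi, unsigned int first_sectors,
		      struct mdl_pair *bp)
{
	unsigned int first_bytes = first_sectors << 9;

	if (bi->bi_vcnt > 1 || first_bytes == 0 || first_bytes >= bi->bi_size)
		return false;

	/* Both halves start as copies, so the bvec is duplicated as-is. */
	bp->bio1 = *bi;
	bp->bio2 = *bi;
	bp->bio1.bi_size = first_bytes;
	bp->bio2.bi_size = bi->bi_size - first_bytes;

	/* Only ordinary reads and writes have a bvec that tracks bi_size. */
	if (bi->bi_vcnt == 1 && bi->is_rw) {
		bp->bio1.bvec.bv_len = first_bytes;
		bp->bio2.bvec.bv_offset += first_bytes;
		bp->bio2.bvec.bv_len -= first_bytes;
	}
	return true;
}
```

In this model a WRITE SAME bio of 8192 bytes with a 512-byte bvec splits into
two 4096-byte halves that both carry the original 512-byte bvec untouched.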
--
Martin K. Petersen Oracle Linux Engineering