* [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
@ 2012-12-14 19:36 Joe Lawrence
2012-12-14 22:38 ` Martin K. Petersen
2013-01-02 15:42 ` Joe Lawrence
0 siblings, 2 replies; 6+ messages in thread
From: Joe Lawrence @ 2012-12-14 19:36 UTC (permalink / raw)
To: linux-raid; +Cc: Joe Lawrence, Martin K. Petersen, neilb
Submitting patch for review... This sets the max_write_same_sectors for
the mdev request queue to its chunk_sectors (before merging its member
disk limits). I believe this should keep the write same LBA sector count
inside the write intent bitmap granularity, but I'm not sure if the sector
range could cross bits and how could would be handled. Any insight into
the bitmap side of things would be appreciated.
Thanks,
-- Joe
From d31c4569b3883475f293403fa4eed63f107616bb Mon Sep 17 00:00:00 2001
From: Joe Lawrence <joe.lawrence@stratus.com>
Date: Fri, 14 Dec 2012 11:25:27 -0500
Subject: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
Set mddev queue's max_write_same_sectors to its chunk_sector value (before
disk_stack_limits merges the underlying disk limits.) With that in place,
be sure to handle writes coming down from the block layer that have the
REQ_WRITE_SAME flag set. That flag needs to be copied into any newly cloned
write bio.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
---
drivers/md/raid1.c | 5 ++++-
drivers/md/raid10.c | 8 ++++++--
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a0f7309..c9ef213 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1001,6 +1001,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
const unsigned long do_flush_fua = (bio->bi_rw & (REQ_FLUSH | REQ_FUA));
const unsigned long do_discard = (bio->bi_rw
& (REQ_DISCARD | REQ_SECURE));
+ const unsigned long do_same = (bio->bi_rw & REQ_WRITE_SAME);
struct md_rdev *blocked_rdev;
struct blk_plug_cb *cb;
struct raid1_plug_cb *plug = NULL;
@@ -1302,7 +1303,8 @@ read_again:
conf->mirrors[i].rdev->data_offset);
mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
mbio->bi_end_io = raid1_end_write_request;
- mbio->bi_rw = WRITE | do_flush_fua | do_sync | do_discard;
+ mbio->bi_rw =
+ WRITE | do_flush_fua | do_sync | do_discard | do_same;
mbio->bi_private = r1_bio;
atomic_inc(&r1_bio->remaining);
@@ -2819,6 +2821,7 @@ static int run(struct mddev *mddev)
if (IS_ERR(conf))
return PTR_ERR(conf);
+ blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
rdev_for_each(rdev, mddev) {
if (!mddev->gendisk)
continue;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index c9acbd7..3d77a92 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1106,6 +1106,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
const unsigned long do_fua = (bio->bi_rw & REQ_FUA);
const unsigned long do_discard = (bio->bi_rw
& (REQ_DISCARD | REQ_SECURE));
+ const unsigned long do_same = (bio->bi_rw & REQ_WRITE_SAME);
unsigned long flags;
struct md_rdev *blocked_rdev;
struct blk_plug_cb *cb;
@@ -1461,7 +1462,8 @@ retry_write:
rdev));
mbio->bi_bdev = rdev->bdev;
mbio->bi_end_io = raid10_end_write_request;
- mbio->bi_rw = WRITE | do_sync | do_fua | do_discard;
+ mbio->bi_rw =
+ WRITE | do_sync | do_fua | do_discard | do_same;
mbio->bi_private = r10_bio;
atomic_inc(&r10_bio->remaining);
@@ -1503,7 +1505,8 @@ retry_write:
r10_bio, rdev));
mbio->bi_bdev = rdev->bdev;
mbio->bi_end_io = raid10_end_write_request;
- mbio->bi_rw = WRITE | do_sync | do_fua | do_discard;
+ mbio->bi_rw =
+ WRITE | do_sync | do_fua | do_discard | do_same;
mbio->bi_private = r10_bio;
atomic_inc(&r10_bio->remaining);
@@ -3578,6 +3581,7 @@ static int run(struct mddev *mddev)
(conf->geo.raid_disks / conf->geo.near_copies));
}
+ blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
rdev_for_each(rdev, mddev) {
long long diff;
struct request_queue *q;
--
1.7.11.7
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
2012-12-14 19:36 [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios Joe Lawrence
@ 2012-12-14 22:38 ` Martin K. Petersen
2013-01-02 15:42 ` Joe Lawrence
1 sibling, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2012-12-14 22:38 UTC (permalink / raw)
To: Joe Lawrence; +Cc: linux-raid, Martin K. Petersen, neilb
>>>>> "Joe" == Joe Lawrence <Joe.Lawrence@stratus.com> writes:
Joe> This sets the max_write_same_sectors for the mdev request queue to
Joe> its chunk_sectors (before merging its member disk limits).
Yep, that's the right approach.
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
2012-12-14 19:36 [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios Joe Lawrence
2012-12-14 22:38 ` Martin K. Petersen
@ 2013-01-02 15:42 ` Joe Lawrence
2013-01-07 2:22 ` NeilBrown
1 sibling, 1 reply; 6+ messages in thread
From: Joe Lawrence @ 2013-01-02 15:42 UTC (permalink / raw)
To: Joe Lawrence; +Cc: linux-raid, neilb
On Fri, 14 Dec 2012, Joe Lawrence wrote:
> Submitting patch for review... This sets the max_write_same_sectors for
> the mdev request queue to its chunk_sectors (before merging its member
> disk limits). I believe this should keep the write same LBA sector count
> inside the write intent bitmap granularity, but I'm not sure if the sector
> range could cross bits and how could would be handled. Any insight into
> the bitmap side of things would be appreciated.
>
> Thanks,
>
> -- Joe
>
> From d31c4569b3883475f293403fa4eed63f107616bb Mon Sep 17 00:00:00 2001
> From: Joe Lawrence <joe.lawrence@stratus.com>
> Date: Fri, 14 Dec 2012 11:25:27 -0500
> Subject: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
>
> Set mddev queue's max_write_same_sectors to its chunk_sector value (before
> disk_stack_limits merges the underlying disk limits.) With that in place,
> be sure to handle writes coming down from the block layer that have the
> REQ_WRITE_SAME flag set. That flag needs to be copied into any newly cloned
> write bio.
>
> Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
> ---
> drivers/md/raid1.c | 5 ++++-
> drivers/md/raid10.c | 8 ++++++--
> 2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index a0f7309..c9ef213 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1001,6 +1001,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> const unsigned long do_flush_fua = (bio->bi_rw & (REQ_FLUSH | REQ_FUA));
> const unsigned long do_discard = (bio->bi_rw
> & (REQ_DISCARD | REQ_SECURE));
> + const unsigned long do_same = (bio->bi_rw & REQ_WRITE_SAME);
> struct md_rdev *blocked_rdev;
> struct blk_plug_cb *cb;
> struct raid1_plug_cb *plug = NULL;
> @@ -1302,7 +1303,8 @@ read_again:
> conf->mirrors[i].rdev->data_offset);
> mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
> mbio->bi_end_io = raid1_end_write_request;
> - mbio->bi_rw = WRITE | do_flush_fua | do_sync | do_discard;
> + mbio->bi_rw =
> + WRITE | do_flush_fua | do_sync | do_discard | do_same;
> mbio->bi_private = r1_bio;
>
> atomic_inc(&r1_bio->remaining);
> @@ -2819,6 +2821,7 @@ static int run(struct mddev *mddev)
> if (IS_ERR(conf))
> return PTR_ERR(conf);
>
> + blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
> rdev_for_each(rdev, mddev) {
> if (!mddev->gendisk)
> continue;
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index c9acbd7..3d77a92 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1106,6 +1106,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> const unsigned long do_fua = (bio->bi_rw & REQ_FUA);
> const unsigned long do_discard = (bio->bi_rw
> & (REQ_DISCARD | REQ_SECURE));
> + const unsigned long do_same = (bio->bi_rw & REQ_WRITE_SAME);
> unsigned long flags;
> struct md_rdev *blocked_rdev;
> struct blk_plug_cb *cb;
> @@ -1461,7 +1462,8 @@ retry_write:
> rdev));
> mbio->bi_bdev = rdev->bdev;
> mbio->bi_end_io = raid10_end_write_request;
> - mbio->bi_rw = WRITE | do_sync | do_fua | do_discard;
> + mbio->bi_rw =
> + WRITE | do_sync | do_fua | do_discard | do_same;
> mbio->bi_private = r10_bio;
>
> atomic_inc(&r10_bio->remaining);
> @@ -1503,7 +1505,8 @@ retry_write:
> r10_bio, rdev));
> mbio->bi_bdev = rdev->bdev;
> mbio->bi_end_io = raid10_end_write_request;
> - mbio->bi_rw = WRITE | do_sync | do_fua | do_discard;
> + mbio->bi_rw =
> + WRITE | do_sync | do_fua | do_discard | do_same;
> mbio->bi_private = r10_bio;
>
> atomic_inc(&r10_bio->remaining);
> @@ -3578,6 +3581,7 @@ static int run(struct mddev *mddev)
> (conf->geo.raid_disks / conf->geo.near_copies));
> }
>
> + blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
> rdev_for_each(rdev, mddev) {
> long long diff;
> struct request_queue *q;
> --
> 1.7.11.7
>
>
Hi Neil,
Happy new year -- any comments on this bug / patch? I can rebase and
test against 3.8, though I don't believe any related code has changed yet.
Regards,
-- Joe
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
2013-01-02 15:42 ` Joe Lawrence
@ 2013-01-07 2:22 ` NeilBrown
2013-01-07 20:56 ` Joe Lawrence
0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2013-01-07 2:22 UTC (permalink / raw)
To: Joe Lawrence; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 4868 bytes --]
On Wed, 2 Jan 2013 10:42:49 -0500 (EST) Joe Lawrence
<Joe.Lawrence@stratus.com> wrote:
> On Fri, 14 Dec 2012, Joe Lawrence wrote:
>
> > Submitting patch for review... This sets the max_write_same_sectors for
> > the mdev request queue to its chunk_sectors (before merging its member
> > disk limits). I believe this should keep the write same LBA sector count
> > inside the write intent bitmap granularity, but I'm not sure if the sector
> > range could cross bits and how could would be handled. Any insight into
> > the bitmap side of things would be appreciated.
> >
> > Thanks,
> >
> > -- Joe
> >
> > From d31c4569b3883475f293403fa4eed63f107616bb Mon Sep 17 00:00:00 2001
> > From: Joe Lawrence <joe.lawrence@stratus.com>
> > Date: Fri, 14 Dec 2012 11:25:27 -0500
> > Subject: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
> >
> > Set mddev queue's max_write_same_sectors to its chunk_sector value (before
> > disk_stack_limits merges the underlying disk limits.) With that in place,
> > be sure to handle writes coming down from the block layer that have the
> > REQ_WRITE_SAME flag set. That flag needs to be copied into any newly cloned
> > write bio.
> >
> > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
> > ---
> > drivers/md/raid1.c | 5 ++++-
> > drivers/md/raid10.c | 8 ++++++--
> > 2 files changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index a0f7309..c9ef213 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -1001,6 +1001,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> > const unsigned long do_flush_fua = (bio->bi_rw & (REQ_FLUSH | REQ_FUA));
> > const unsigned long do_discard = (bio->bi_rw
> > & (REQ_DISCARD | REQ_SECURE));
> > + const unsigned long do_same = (bio->bi_rw & REQ_WRITE_SAME);
> > struct md_rdev *blocked_rdev;
> > struct blk_plug_cb *cb;
> > struct raid1_plug_cb *plug = NULL;
> > @@ -1302,7 +1303,8 @@ read_again:
> > conf->mirrors[i].rdev->data_offset);
> > mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
> > mbio->bi_end_io = raid1_end_write_request;
> > - mbio->bi_rw = WRITE | do_flush_fua | do_sync | do_discard;
> > + mbio->bi_rw =
> > + WRITE | do_flush_fua | do_sync | do_discard | do_same;
> > mbio->bi_private = r1_bio;
> >
> > atomic_inc(&r1_bio->remaining);
> > @@ -2819,6 +2821,7 @@ static int run(struct mddev *mddev)
> > if (IS_ERR(conf))
> > return PTR_ERR(conf);
> >
> > + blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
> > rdev_for_each(rdev, mddev) {
> > if (!mddev->gendisk)
> > continue;
> > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> > index c9acbd7..3d77a92 100644
> > --- a/drivers/md/raid10.c
> > +++ b/drivers/md/raid10.c
> > @@ -1106,6 +1106,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> > const unsigned long do_fua = (bio->bi_rw & REQ_FUA);
> > const unsigned long do_discard = (bio->bi_rw
> > & (REQ_DISCARD | REQ_SECURE));
> > + const unsigned long do_same = (bio->bi_rw & REQ_WRITE_SAME);
> > unsigned long flags;
> > struct md_rdev *blocked_rdev;
> > struct blk_plug_cb *cb;
> > @@ -1461,7 +1462,8 @@ retry_write:
> > rdev));
> > mbio->bi_bdev = rdev->bdev;
> > mbio->bi_end_io = raid10_end_write_request;
> > - mbio->bi_rw = WRITE | do_sync | do_fua | do_discard;
> > + mbio->bi_rw =
> > + WRITE | do_sync | do_fua | do_discard | do_same;
> > mbio->bi_private = r10_bio;
> >
> > atomic_inc(&r10_bio->remaining);
> > @@ -1503,7 +1505,8 @@ retry_write:
> > r10_bio, rdev));
> > mbio->bi_bdev = rdev->bdev;
> > mbio->bi_end_io = raid10_end_write_request;
> > - mbio->bi_rw = WRITE | do_sync | do_fua | do_discard;
> > + mbio->bi_rw =
> > + WRITE | do_sync | do_fua | do_discard | do_same;
> > mbio->bi_private = r10_bio;
> >
> > atomic_inc(&r10_bio->remaining);
> > @@ -3578,6 +3581,7 @@ static int run(struct mddev *mddev)
> > (conf->geo.raid_disks / conf->geo.near_copies));
> > }
> >
> > + blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors);
> > rdev_for_each(rdev, mddev) {
> > long long diff;
> > struct request_queue *q;
> > --
> > 1.7.11.7
> >
> >
>
> Hi Neil,
>
> Happy new year -- any comments on this bug / patch? I can rebase and
> test against 3.8, though I don't believe any related code has changed yet.
>
Thanks. I've added Martin's and and queue it up.
Assuming that a REQ_WRITE_SAME is just like a normal write except for the
iovec being shorter (is that right?) there should be no interesting issues
with bitmaps - it should just work.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
2013-01-07 2:22 ` NeilBrown
@ 2013-01-07 20:56 ` Joe Lawrence
2013-01-08 0:21 ` Martin K. Petersen
0 siblings, 1 reply; 6+ messages in thread
From: Joe Lawrence @ 2013-01-07 20:56 UTC (permalink / raw)
To: NeilBrown; +Cc: Joe Lawrence, Martin K. Petersen, linux-raid
On Mon, 7 Jan 2013, NeilBrown wrote:
> Assuming that a REQ_WRITE_SAME is just like a normal write except for the
> iovec being shorter (is that right?) there should be no interesting issues
> with bitmaps - it should just work.
I believe this is the case (at least in the instances that I have seen),
but Martin can say for sure.
-- Joe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
2013-01-07 20:56 ` Joe Lawrence
@ 2013-01-08 0:21 ` Martin K. Petersen
0 siblings, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2013-01-08 0:21 UTC (permalink / raw)
To: Joe Lawrence; +Cc: NeilBrown, Martin K. Petersen, linux-raid
>>>>> "Joe" == Joe Lawrence <Joe.Lawrence@stratus.com> writes:
Joe> On Mon, 7 Jan 2013, NeilBrown wrote:
>> Assuming that a REQ_WRITE_SAME is just like a normal write except for
>> the iovec being shorter (is that right?)
Yes, that's correct. bio_sectors() will correspond to the area worked on
but the bio_vec just describes a single logical block.
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-01-08 0:21 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-14 19:36 [PATCH] md: raid1,10: Handle REQ_WRITE_SAME flag in write bios Joe Lawrence
2012-12-14 22:38 ` Martin K. Petersen
2013-01-02 15:42 ` Joe Lawrence
2013-01-07 2:22 ` NeilBrown
2013-01-07 20:56 ` Joe Lawrence
2013-01-08 0:21 ` Martin K. Petersen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).