qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH] drive-mirror: Change the amount of data base on granularity
@ 2014-01-18  8:09 Chentao (Boby)
  2014-02-24 13:20 ` Stefan Hajnoczi
  0 siblings, 1 reply; 3+ messages in thread
From: Chentao (Boby) @ 2014-01-18  8:09 UTC (permalink / raw)
  To: qemu-devel, Stefan Hajnoczi, Fam Zheng
  Cc: Qinling, Zhangmin (Rudy), Wubin (H), Wangting (Kathy)


Previously, a single iteration sent one contiguous run of dirty blocks, up to the mirror buffer size (10M by default).

This gives poor read/write performance. If the image type is raw, all of the data is dirty on the first pass: each iteration reads 10M of data and then writes 10M to the target image, so reads and writes cannot be parallelized.

Now the amount of data sent per iteration is based on the granularity. With the granularity set to 1M, the job can issue 10 read requests and then issue the write requests. As soon as one write request completes, 1M of buffer is freed for the next read request, so reads and writes proceed in parallel.

This change improves read and write performance. On my server:

(write) throughput: 55 MB/s --> 90 MB/s, utilization: 50% --> 85%


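The pipelining argument above can be illustrated with a back-of-envelope timing model (plain Python, not QEMU code; the bandwidth numbers are made up for illustration):

```python
def copy_time(total_mb, chunk_mb, read_mbps, write_mbps):
    """Model the total time to mirror `total_mb` of data when reads and
    writes of *different* chunks may overlap, but each chunk must be
    fully read before it can be written."""
    n = total_mb // chunk_mb
    t_read = chunk_mb / read_mbps    # seconds to read one chunk
    t_write = chunk_mb / write_mbps  # seconds to write one chunk
    # Reads are issued back to back; chunk i finishes reading at (i+1)*t_read.
    # Writes are serialized on the target; each starts once both the
    # previous write and this chunk's read have completed.
    write_done = 0.0
    for i in range(n):
        read_done = (i + 1) * t_read
        write_done = max(write_done, read_done) + t_write
    return write_done

# One 10M chunk: read 10M, then write 10M -- no overlap at all.
serial = copy_time(10, 10, 100, 100)
# Ten 1M chunks: each write overlaps with the following reads.
pipelined = copy_time(10, 1, 100, 100)
```

With equal read and write bandwidth the pipelined variant approaches the time of the reads alone plus one extra write, instead of reads plus writes end to end.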

Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com<mailto:rudy.zhangmin@huawei.com>>

---

block/mirror.c |   68 ++++++++++++++++++++++---------------------------------
1 file changed, 27 insertions(+), 41 deletions(-)



diff --git a/block/mirror.c b/block/mirror.c
index 2932bab..1ba2862 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -183,54 +183,40 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
         qemu_coroutine_yield();
     }

-    do {
-        int added_sectors, added_chunks;
+    int added_sectors, added_chunks;

-        if (!bdrv_get_dirty(source, s->dirty_bitmap, next_sector) ||
-            test_bit(next_chunk, s->in_flight_bitmap)) {
-            assert(nb_sectors > 0);
-            break;
-        }
+    added_sectors = sectors_per_chunk;
+    if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
+        bdrv_round_to_clusters(s->target,
+                               next_sector, added_sectors,
+                               &next_sector, &added_sectors);

-        added_sectors = sectors_per_chunk;
-        if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
-            bdrv_round_to_clusters(s->target,
-                                   next_sector, added_sectors,
-                                   &next_sector, &added_sectors);
-
-            /* On the first iteration, the rounding may make us copy
-             * sectors before the first dirty one.
-             */
-            if (next_sector < sector_num) {
-                assert(nb_sectors == 0);
-                sector_num = next_sector;
-                next_chunk = next_sector / sectors_per_chunk;
-            }
+        /* On the first iteration, the rounding may make us copy
+         * sectors before the first dirty one.
+         */
+        if (next_sector < sector_num) {
+            assert(nb_sectors == 0);
+            sector_num = next_sector;
+            next_chunk = next_sector / sectors_per_chunk;
         }
+    }

-        added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
-        added_chunks = (added_sectors + sectors_per_chunk - 1) / sectors_per_chunk;
+    added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
+    added_chunks = (added_sectors + sectors_per_chunk - 1) / sectors_per_chunk;

-        /* When doing COW, it may happen that there is not enough space for
-         * a full cluster.  Wait if that is the case.
-         */
-        while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
-            trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
-            qemu_coroutine_yield();
-        }
-        if (s->buf_free_count < nb_chunks + added_chunks) {
-            trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
-            break;
-        }
+    /* When doing COW, it may happen that there is not enough space for
+     * a full cluster.  Wait if that is the case.
+     */
+    while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
+        trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
+        qemu_coroutine_yield();
+    }

-        /* We have enough free space to copy these sectors.  */
-        bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
+    /* We have enough free space to copy these sectors.  */
+    bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);

-        nb_sectors += added_sectors;
-        nb_chunks += added_chunks;
-        next_sector += added_sectors;
-        next_chunk += added_chunks;
-    } while (next_sector < end);
+    nb_sectors += added_sectors;
+    nb_chunks += added_chunks;

     /* Allocate a MirrorOp that is used as an AIO callback.  */
     op = g_slice_new(MirrorOp);

--



Zhang Min




* Re: [Qemu-devel] [PATCH] drive-mirror: Change the amount of data base on granularity
  2014-01-18  8:09 [Qemu-devel] [PATCH] drive-mirror: Change the amount of data base on granularity Chentao (Boby)
@ 2014-02-24 13:20 ` Stefan Hajnoczi
  2014-02-24 13:36   ` Paolo Bonzini
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan Hajnoczi @ 2014-02-24 13:20 UTC (permalink / raw)
  To: Chentao (Boby)
  Cc: Kevin Wolf, Fam Zheng, Wangting (Kathy), qemu-devel, Qinling,
	Paolo Bonzini, Zhangmin (Rudy), Wubin (H)

On Sat, Jan 18, 2014 at 08:09:43AM +0000, Chentao (Boby) wrote:

Please CC Kevin Wolf <kwolf@redhat.com> who co-maintains the QEMU block
layer with me.  Use scripts/get_maintainer.pl -f block/mirror.c to find
the list of maintainers to CC.

> Previously, a single iteration sent one contiguous run of dirty blocks, up to the mirror buffer size (10M by default).
> 
> This gives poor read/write performance. If the image type is raw, all of the data is dirty on the first pass: each iteration reads 10M of data and then writes 10M to the target image, so reads and writes cannot be parallelized.
> 
> Now the amount of data sent per iteration is based on the granularity. With the granularity set to 1M, the job can issue 10 read requests and then issue the write requests. As soon as one write request completes, 1M of buffer is freed for the next read request, so reads and writes proceed in parallel.
> 
> This change improves read and write performance. On my server:
> 
> (write) throughput: 55 MB/s --> 90 MB/s, utilization: 50% --> 85%
> 
> 
> 
> Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com<mailto:rudy.zhangmin@huawei.com>>
> 

This patch is not signed off by you.  If you have permission to
contribute this patch on behalf of Zhang Min, please add your own
Signed-off-by: below.

Please also remove the <mailto:> link.  It should just be:
First Last <username@domain.com>

> ---
> 
> block/mirror.c |   68 ++++++++++++++++++++++---------------------------------
> 
> 1 files changed, 27 insertions(+), 41 deletions(-)
> 
> 
> 
> diff --git a/block/mirror.c b/block/mirror.c index 2932bab..1ba2862 100644
> 
> --- a/block/mirror.c

Please do not send patches with Outlook or in HTML.  Use
git-send-email(1) to send patches that are properly formatted and can be
applied with git-am(1) by the maintainers.

For more info on patch submission guidelines, please see:
http://qemu-project.org/Contribute/SubmitAPatch

I have reformatted the remainder of this email.

> +++ b/block/mirror.c
> @@ -183,54 +183,40 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
>          qemu_coroutine_yield();
>      }
> 
> -    do {
> -        int added_sectors, added_chunks;
> +    int added_sectors, added_chunks;
> 
> -        if (!bdrv_get_dirty(source, s->dirty_bitmap, next_sector) ||
> -            test_bit(next_chunk, s->in_flight_bitmap)) {
> -            assert(nb_sectors > 0);
> -            break;
> -        }
> +    added_sectors = sectors_per_chunk;
> +    if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
> +        bdrv_round_to_clusters(s->target,
> +                next_sector, added_sectors,
> +                &next_sector, &added_sectors);
> 
> -        added_sectors = sectors_per_chunk;
> -        if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
> -            bdrv_round_to_clusters(s->target,
> -                                   next_sector, added_sectors,
> -                                   &next_sector, &added_sectors);
> -
> -            /* On the first iteration, the rounding may make us copy
> -             * sectors before the first dirty one.
> -             */
> -            if (next_sector < sector_num) {
> -                assert(nb_sectors == 0);
> -                sector_num = next_sector;
> -                next_chunk = next_sector / sectors_per_chunk;
> -            }
> +        /* On the first iteration, the rounding may make us copy
> +         * sectors before the first dirty one.
> +         */
> +        if (next_sector < sector_num) {
> +            assert(nb_sectors == 0);
> +            sector_num = next_sector;
> +            next_chunk = next_sector / sectors_per_chunk;
>          }
> +    }
> 
> -        added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
> -        added_chunks = (added_sectors + sectors_per_chunk - 1) / sectors_per_chunk;
> +    added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
> +    added_chunks = (added_sectors + sectors_per_chunk - 1) /
> + sectors_per_chunk;
> 
> -        /* When doing COW, it may happen that there is not enough space for
> -         * a full cluster.  Wait if that is the case.
> -         */
> -        while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
> -            trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
> -            qemu_coroutine_yield();
> -        }
> -        if (s->buf_free_count < nb_chunks + added_chunks) {
> -            trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
> -            break;
> -        }
> +    /* When doing COW, it may happen that there is not enough space for
> +     * a full cluster.  Wait if that is the case.
> +     */
> +    while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
> +        trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
> +        qemu_coroutine_yield();
> +    }
> 
> -        /* We have enough free space to copy these sectors.  */
> -        bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
> +    /* We have enough free space to copy these sectors.  */
> +    bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
> 
> -        nb_sectors += added_sectors;
> -        nb_chunks += added_chunks;
> -        next_sector += added_sectors;
> -        next_chunk += added_chunks;
> -    } while (next_sector < end);
> +    nb_sectors += added_sectors;
> +    nb_chunks += added_chunks;

I think further discussion will be required; this patch undoes something
that the code is designed to do.  You forgot to modify the big block
comment above this code, which explains that we are trying to extend the
copy region to be at least one cluster and also to cover adjacent dirty
blocks (to reduce the number of I/O requests).

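The merging behaviour that the removed do/while loop implements can be sketched roughly as follows (a simplified Python model, ignoring cluster rounding and buffer accounting; the names are illustrative, not QEMU's):

```python
def build_copy_region(dirty, in_flight, start_chunk, max_chunks):
    """Extend a copy region over consecutive dirty chunks, so that
    adjacent dirty chunks are merged into one large I/O request
    instead of one request per chunk."""
    nb_chunks = 0
    chunk = start_chunk
    while chunk < len(dirty):
        # Stop at the first clean chunk or one already being copied.
        if not dirty[chunk] or chunk in in_flight:
            break
        # Stop when the buffer budget for this operation is exhausted.
        if nb_chunks == max_chunks:
            break
        in_flight.add(chunk)  # mark the chunk as in flight
        nb_chunks += 1
        chunk += 1
    return start_chunk, nb_chunks

dirty = [True, True, True, False, True]
region = build_copy_region(dirty, set(), 0, 64)  # merges chunks 0..2
```

The patch replaces this loop with a single fixed-size step, which is exactly the request-merging that gets lost.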
I have CCed Paolo Bonzini, who implemented most of block/mirror.c.
Maybe he can discuss the purpose of this change with you.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH] drive-mirror: Change the amount of data base on granularity
  2014-02-24 13:20 ` Stefan Hajnoczi
@ 2014-02-24 13:36   ` Paolo Bonzini
  0 siblings, 0 replies; 3+ messages in thread
From: Paolo Bonzini @ 2014-02-24 13:36 UTC (permalink / raw)
  To: Stefan Hajnoczi, Chentao (Boby)
  Cc: Kevin Wolf, Fam Zheng, Wangting (Kathy), qemu-devel, Qinling,
	Zhangmin (Rudy), Wubin (H)

On 24/02/2014 14:20, Stefan Hajnoczi wrote:
>> Now the amount of data sent per iteration is based on the
>> granularity. With the granularity set to 1M, the job can issue
>> 10 read requests and then issue the write requests. As soon as
>> one write request completes, 1M of buffer is freed for the next
>> read request, so reads and writes proceed in parallel.

This also means that in the dirty phase you have to send chunks of 1M
instead of, say, 64K, and 64K is a common value for the granularity.

Try plotting the I/O rate against the granularity: you'll see that you
do not get top utilization at 10M, but you do not get it at 64K
either. What you need is probably as simple as this:

$ git diff block/mirror.c
diff --git a/block/mirror.c b/block/mirror.c
index e683959..66093da 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -231,7 +231,7 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
         nb_chunks += added_chunks;
         next_sector += added_sectors;
         next_chunk += added_chunks;
-    } while (next_sector < end);
+    } while (nb_sectors < MAX_SECTORS_PER_MIRROR_OP && next_sector < end);
 
     /* Allocate a MirrorOp that is used as an AIO callback.  */
     op = g_slice_new(MirrorOp);

for some value of MAX_SECTORS_PER_MIRROR_OP.  You will need to run
some benchmarks in order to find the right value.

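The trade-off described above (tiny ops pay per-request overhead, huge ops lose read/write overlap) can be sketched with a toy cost model (illustrative Python with a made-up per-request overhead, not a real benchmark):

```python
def mirror_time(total_kb, op_kb, bw_kbps=100_000, overhead_s=0.0005):
    """Toy model: n operations, each paying transfer time plus a fixed
    per-request overhead; writes run pipelined one op behind the reads,
    so the total is roughly (n + 1) op-times."""
    n = total_kb // op_kb
    per_op = op_kb / bw_kbps + overhead_s
    return (n + 1) * per_op

# A 10M image with per-op sizes of 64K, 1M and 10M: in this model the
# middle value wins, matching the "not at 10M, not at 64K" observation.
times = {op_kb: mirror_time(10240, op_kb) for op_kb in (64, 1024, 10240)}
```

Sweeping `op_kb` like this is one way to pick MAX_SECTORS_PER_MIRROR_OP, though only real benchmarks on the target hardware can give the right value.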
Paolo

^ permalink raw reply related	[flat|nested] 3+ messages in thread
