qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks
@ 2015-06-10 11:38 Alexander Yarygin
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 1/3] block-backend: Introduce blk_drain() Alexander Yarygin
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Alexander Yarygin @ 2015-06-10 11:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Alexander Yarygin, Ekaterina Tumanova,
	Christian Borntraeger, Stefan Hajnoczi, Cornelia Huck,
	Paolo Bonzini

Changes in v2:
    - Patch "block-backend: Introduce blk_drain() and replace blk_drain_all()"
      split into two.
    - blk_drain() moved above virtio_blk_data_plane_stop().

During reset, aio_poll() is called at least (number of disks)^2 times:

for_each disk
    virtio_blk_reset()
        bdrv_drain_all()
            for_each disk
                aio_poll()

For example, startup with 1000 disks takes over 13 minutes.

Patches 1 and 2 remove the inner loop by using bdrv_drain() instead
of bdrv_drain_all(); bdrv_drain() works on one disk at a time.

Since bdrv_drain_all() is still called in other places, patch 3 optimizes
it for the case where there are more disks than iothreads.

Thanks.

Alexander Yarygin (3):
  block-backend: Introduce blk_drain()
  virtio-blk: Use blk_drain() to drain IO requests
  block: Let bdrv_drain_all() to call aio_poll() for each AioContext

 block/block-backend.c          |  5 +++++
 block/io.c                     | 42 ++++++++++++++++++++++++++----------------
 hw/block/virtio-blk.c          | 11 ++++++-----
 include/sysemu/block-backend.h |  1 +
 4 files changed, 38 insertions(+), 21 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 1/3] block-backend: Introduce blk_drain()
  2015-06-10 11:38 [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks Alexander Yarygin
@ 2015-06-10 11:38 ` Alexander Yarygin
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests Alexander Yarygin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Alexander Yarygin @ 2015-06-10 11:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Alexander Yarygin, Ekaterina Tumanova,
	Christian Borntraeger, Stefan Hajnoczi, Cornelia Huck,
	Paolo Bonzini

This patch introduces the blk_drain() function, which allows
blk_drain_all() to be replaced when only one BlockDriverState needs to
be drained.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
---
 block/block-backend.c          | 5 +++++
 include/sysemu/block-backend.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 93e46f3..aee8a12 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -700,6 +700,11 @@ int blk_flush_all(void)
     return bdrv_flush_all();
 }
 
+void blk_drain(BlockBackend *blk)
+{
+    bdrv_drain(blk->bs);
+}
+
 void blk_drain_all(void)
 {
     bdrv_drain_all();
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b4a4d5e..8fc960f 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -118,6 +118,7 @@ int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors);
 int blk_co_flush(BlockBackend *blk);
 int blk_flush(BlockBackend *blk);
 int blk_flush_all(void);
+void blk_drain(BlockBackend *blk);
 void blk_drain_all(void);
 BlockdevOnError blk_get_on_error(BlockBackend *blk, bool is_read);
 BlockErrorAction blk_get_error_action(BlockBackend *blk, bool is_read,
-- 
1.9.1


* [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests
  2015-06-10 11:38 [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks Alexander Yarygin
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 1/3] block-backend: Introduce blk_drain() Alexander Yarygin
@ 2015-06-10 11:38 ` Alexander Yarygin
  2015-06-11  2:51   ` Fam Zheng
  2015-06-12 14:13   ` Stefan Hajnoczi
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext Alexander Yarygin
  2015-06-10 18:40 ` [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks Christian Borntraeger
  3 siblings, 2 replies; 13+ messages in thread
From: Alexander Yarygin @ 2015-06-10 11:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Alexander Yarygin, Ekaterina Tumanova,
	Christian Borntraeger, Stefan Hajnoczi, Cornelia Huck,
	Paolo Bonzini

Each call to virtio_blk_reset() invokes blk_drain_all(), which
drains all existing BlockDriverStates, although only one of them
needs to be drained.

This patch replaces blk_drain_all() with blk_drain() in virtio_blk_reset().

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
---
 hw/block/virtio-blk.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e6afe97..2009092 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -652,15 +652,16 @@ static void virtio_blk_reset(VirtIODevice *vdev)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
 
-    if (s->dataplane) {
-        virtio_blk_data_plane_stop(s->dataplane);
-    }
-
     /*
      * This should cancel pending requests, but can't do nicely until there
      * are per-device request lists.
      */
-    blk_drain_all();
+    blk_drain(s->blk);
+
+    if (s->dataplane) {
+        virtio_blk_data_plane_stop(s->dataplane);
+    }
+
     blk_set_enable_write_cache(s->blk, s->original_wce);
 }
 
-- 
1.9.1


* [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext
  2015-06-10 11:38 [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks Alexander Yarygin
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 1/3] block-backend: Introduce blk_drain() Alexander Yarygin
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests Alexander Yarygin
@ 2015-06-10 11:38 ` Alexander Yarygin
  2015-06-12 14:17   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
                     ` (2 more replies)
  2015-06-10 18:40 ` [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks Christian Borntraeger
  3 siblings, 3 replies; 13+ messages in thread
From: Alexander Yarygin @ 2015-06-10 11:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Alexander Yarygin, Ekaterina Tumanova,
	Christian Borntraeger, Stefan Hajnoczi, Cornelia Huck,
	Paolo Bonzini

After commit 9b536adc ("block: acquire AioContext in
bdrv_drain_all()"), the aio_poll() function is called for every
BlockDriverState, on the assumption that every device may have its
own AioContext. If thousands of disks are attached, there are many
BlockDriverStates but only a few AioContexts, leading to tons of
unnecessary aio_poll() calls.

This patch changes bdrv_drain_all() to find the shared AioContexts
and call aio_poll() only once per unique AioContext.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
---
 block/io.c | 42 ++++++++++++++++++++++++++----------------
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/block/io.c b/block/io.c
index e394d92..7502186 100644
--- a/block/io.c
+++ b/block/io.c
@@ -271,17 +271,6 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
     return false;
 }
 
-static bool bdrv_drain_one(BlockDriverState *bs)
-{
-    bool bs_busy;
-
-    bdrv_flush_io_queue(bs);
-    bdrv_start_throttled_reqs(bs);
-    bs_busy = bdrv_requests_pending(bs);
-    bs_busy |= aio_poll(bdrv_get_aio_context(bs), bs_busy);
-    return bs_busy;
-}
-
 /*
  * Wait for pending requests to complete on a single BlockDriverState subtree
  *
@@ -294,8 +283,13 @@ static bool bdrv_drain_one(BlockDriverState *bs)
  */
 void bdrv_drain(BlockDriverState *bs)
 {
-    while (bdrv_drain_one(bs)) {
+    bool busy = true;
+
+    while (busy) {
         /* Keep iterating */
+         bdrv_flush_io_queue(bs);
+         busy = bdrv_requests_pending(bs);
+         busy |= aio_poll(bdrv_get_aio_context(bs), busy);
     }
 }
 
@@ -316,6 +310,7 @@ void bdrv_drain_all(void)
     /* Always run first iteration so any pending completion BHs run */
     bool busy = true;
     BlockDriverState *bs = NULL;
+    GSList *aio_ctxs = NULL, *ctx;
 
     while ((bs = bdrv_next(bs))) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
@@ -325,17 +320,30 @@ void bdrv_drain_all(void)
             block_job_pause(bs->job);
         }
         aio_context_release(aio_context);
+
+        if (!aio_ctxs || !g_slist_find(aio_ctxs, aio_context)) {
+            aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
+        }
     }
 
     while (busy) {
         busy = false;
-        bs = NULL;
 
-        while ((bs = bdrv_next(bs))) {
-            AioContext *aio_context = bdrv_get_aio_context(bs);
+        for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
+            AioContext *aio_context = ctx->data;
+            bs = NULL;
 
             aio_context_acquire(aio_context);
-            busy |= bdrv_drain_one(bs);
+            while ((bs = bdrv_next(bs))) {
+                if (aio_context == bdrv_get_aio_context(bs)) {
+                    bdrv_flush_io_queue(bs);
+                    if (bdrv_requests_pending(bs)) {
+                        busy = true;
+                        aio_poll(aio_context, busy);
+                    }
+                }
+            }
+            busy |= aio_poll(aio_context, false);
             aio_context_release(aio_context);
         }
     }
@@ -350,6 +358,7 @@ void bdrv_drain_all(void)
         }
         aio_context_release(aio_context);
     }
+    g_slist_free(aio_ctxs);
 }
 
 /**
@@ -2600,4 +2609,5 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
     } else if (bs->file) {
         bdrv_flush_io_queue(bs->file);
     }
+    bdrv_start_throttled_reqs(bs);
 }
-- 
1.9.1


* Re: [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks
  2015-06-10 11:38 [Qemu-devel] [PATCH RFC v2 0/3] Fix slow startup with many disks Alexander Yarygin
                   ` (2 preceding siblings ...)
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext Alexander Yarygin
@ 2015-06-10 18:40 ` Christian Borntraeger
  3 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2015-06-10 18:40 UTC (permalink / raw)
  To: Alexander Yarygin, qemu-devel
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, Stefan Hajnoczi,
	Cornelia Huck, Paolo Bonzini

On 10.06.2015 at 13:38, Alexander Yarygin wrote:
> Changes in v2:
>     - Patch "block-backend: Introduce blk_drain() and replace blk_drain_all()"
>       split into two.
>     - blk_drain() moved above virtio_blk_data_plane_stop().
> 
> During reset, aio_poll() is called at least (number of disks)^2 times:
> 
> for_each disk
>     virtio_blk_reset()
>         bdrv_drain_all()
>             for_each disk
>                 aio_poll()
> 
> For example, startup with 1000 disks takes over 13 minutes.
> 
> Patches 1 and 2 remove the inner loop by using bdrv_drain() instead
> of bdrv_drain_all(); bdrv_drain() works on one disk at a time.
> 
> Since bdrv_drain_all() is still called in other places, patch 3 optimizes
> it for the case where there are more disks than iothreads.
> 
> Thanks.
> 
> Alexander Yarygin (3):
>   block-backend: Introduce blk_drain()
>   virtio-blk: Use blk_drain() to drain IO requests
>   block: Let bdrv_drain_all() to call aio_poll() for each AioContext
> 
>  block/block-backend.c          |  5 +++++
>  block/io.c                     | 42 ++++++++++++++++++++++++++----------------
>  hw/block/virtio-blk.c          | 11 ++++++-----
>  include/sysemu/block-backend.h |  1 +
>  4 files changed, 38 insertions(+), 21 deletions(-)
> 

Whole series is 
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>

I think we can remove the RFC. I would like to see this for 2.4.

Christian


* Re: [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests Alexander Yarygin
@ 2015-06-11  2:51   ` Fam Zheng
  2015-06-12 12:50     ` Christian Borntraeger
  2015-06-12 14:17     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2015-06-12 14:13   ` Stefan Hajnoczi
  1 sibling, 2 replies; 13+ messages in thread
From: Fam Zheng @ 2015-06-11  2:51 UTC (permalink / raw)
  To: Alexander Yarygin
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, qemu-devel,
	qemu-stable, Christian Borntraeger, Stefan Hajnoczi,
	Cornelia Huck, Paolo Bonzini

On Wed, 06/10 14:38, Alexander Yarygin wrote:
> Each call to virtio_blk_reset() invokes blk_drain_all(), which
> drains all existing BlockDriverStates, although only one of them
> needs to be drained.
> 
> This patch replaces blk_drain_all() with blk_drain() in virtio_blk_reset().

Please add a note "virtio_blk_data_plane_stop should be called after draining
because it restores vblk->complete_request" as well.
> 
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>

Cc: qemu-stable@nongnu.org

> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
> ---
>  hw/block/virtio-blk.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index e6afe97..2009092 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -652,15 +652,16 @@ static void virtio_blk_reset(VirtIODevice *vdev)
>  {
>      VirtIOBlock *s = VIRTIO_BLK(vdev);
>  
> -    if (s->dataplane) {
> -        virtio_blk_data_plane_stop(s->dataplane);
> -    }
> -
>      /*
>       * This should cancel pending requests, but can't do nicely until there
>       * are per-device request lists.
>       */

This comment can be dropped now.

> -    blk_drain_all();
> +    blk_drain(s->blk);
> +
> +    if (s->dataplane) {
> +        virtio_blk_data_plane_stop(s->dataplane);
> +    }
> +
>      blk_set_enable_write_cache(s->blk, s->original_wce);
>  }
>  
> -- 
> 1.9.1
> 
> 


* Re: [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests
  2015-06-11  2:51   ` Fam Zheng
@ 2015-06-12 12:50     ` Christian Borntraeger
  2015-06-12 14:17     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  1 sibling, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2015-06-12 12:50 UTC (permalink / raw)
  To: Fam Zheng, Alexander Yarygin
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, qemu-devel,
	qemu-stable, Stefan Hajnoczi, Cornelia Huck, Paolo Bonzini

On 11.06.2015 at 04:51, Fam Zheng wrote:
> On Wed, 06/10 14:38, Alexander Yarygin wrote:
>> Each call to virtio_blk_reset() invokes blk_drain_all(), which
>> drains all existing BlockDriverStates, although only one of them
>> needs to be drained.
>>
>> This patch replaces blk_drain_all() with blk_drain() in virtio_blk_reset().
> 
> Please add a note "virtio_blk_data_plane_stop should be called after draining
> because it restores vblk->complete_request" as well.
>>
>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
>> Cc: Kevin Wolf <kwolf@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Cc: qemu-stable@nongnu.org

Only for the move of virtio_blk_data_plane_stop? Are we sure that we can
call blk_drain() safely on old versions?

What about the following:
respin without RFC
patch 1: move  virtio_blk_data_plane_stop, remove comment.  cc stable
patch 2: introduce blk_drain
patch 3: blk_drain_all -> blk_drain
patch 4: As RFC  "Let bdrv_drain_all() to call aio_poll() for each AioContext"

So that Stefan/Kevin can apply 1-3. and we can then review 4.

Makes sense?

Christian


> 
>> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
>> ---
>>  hw/block/virtio-blk.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index e6afe97..2009092 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -652,15 +652,16 @@ static void virtio_blk_reset(VirtIODevice *vdev)
>>  {
>>      VirtIOBlock *s = VIRTIO_BLK(vdev);
>>  
>> -    if (s->dataplane) {
>> -        virtio_blk_data_plane_stop(s->dataplane);
>> -    }
>> -
>>      /*
>>       * This should cancel pending requests, but can't do nicely until there
>>       * are per-device request lists.
>>       */
> 
> This comment can be dropped now.
> 
>> -    blk_drain_all();
>> +    blk_drain(s->blk);
>> +
>> +    if (s->dataplane) {
>> +        virtio_blk_data_plane_stop(s->dataplane);
>> +    }
>> +
>>      blk_set_enable_write_cache(s->blk, s->original_wce);
>>  }
>>  
>> -- 
>> 1.9.1
>>
>>
> 


* Re: [Qemu-devel] [Qemu-block] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests Alexander Yarygin
  2015-06-11  2:51   ` Fam Zheng
@ 2015-06-12 14:13   ` Stefan Hajnoczi
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2015-06-12 14:13 UTC (permalink / raw)
  To: Alexander Yarygin
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi, Cornelia Huck,
	Paolo Bonzini


On Wed, Jun 10, 2015 at 02:38:16PM +0300, Alexander Yarygin wrote:
> Each call to virtio_blk_reset() invokes blk_drain_all(), which
> drains all existing BlockDriverStates, although only one of them
> needs to be drained.
> 
> This patch replaces blk_drain_all() with blk_drain() in virtio_blk_reset().
> 
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
> ---
>  hw/block/virtio-blk.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index e6afe97..2009092 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -652,15 +652,16 @@ static void virtio_blk_reset(VirtIODevice *vdev)
>  {
>      VirtIOBlock *s = VIRTIO_BLK(vdev);
>  
> -    if (s->dataplane) {
> -        virtio_blk_data_plane_stop(s->dataplane);
> -    }
> -
>      /*
>       * This should cancel pending requests, but can't do nicely until there
>       * are per-device request lists.
>       */
> -    blk_drain_all();
> +    blk_drain(s->blk);
> +
> +    if (s->dataplane) {
> +        virtio_blk_data_plane_stop(s->dataplane);
> +    }

This is unsafe now.  virtio_blk_reset() is called from the vcpu thread
while virtqueue processing may be running in an IOThread.
blk_drain() does not acquire the AioContext, so it races with the IOThread.

Try:

ctx = bdrv_get_aio_context(s->blk);
aio_context_acquire(ctx);

blk_drain(s->blk);

if (s->dataplane) {
    virtio_blk_data_plane_stop(s->dataplane);
}

aio_context_release(ctx);



* Re: [Qemu-devel] [Qemu-block] [PATCH 2/3] virtio-blk: Use blk_drain() to drain IO requests
  2015-06-11  2:51   ` Fam Zheng
  2015-06-12 12:50     ` Christian Borntraeger
@ 2015-06-12 14:17     ` Stefan Hajnoczi
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2015-06-12 14:17 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Alexander Yarygin, Ekaterina Tumanova,
	qemu-stable, qemu-devel, Christian Borntraeger, Stefan Hajnoczi,
	Cornelia Huck, Paolo Bonzini


On Thu, Jun 11, 2015 at 10:51:54AM +0800, Fam Zheng wrote:
> On Wed, 06/10 14:38, Alexander Yarygin wrote:
> > Each call to virtio_blk_reset() invokes blk_drain_all(), which
> > drains all existing BlockDriverStates, although only one of them
> > needs to be drained.
> > 
> > This patch replaces blk_drain_all() with blk_drain() in virtio_blk_reset().
> 
> Please add a note "virtio_blk_data_plane_stop should be called after draining
> because it restores vblk->complete_request" as well.
> > 
> > Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> > Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> > Cc: Kevin Wolf <kwolf@redhat.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Cc: qemu-stable@nongnu.org
> 
> > Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
> > ---
> >  hw/block/virtio-blk.c | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index e6afe97..2009092 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -652,15 +652,16 @@ static void virtio_blk_reset(VirtIODevice *vdev)
> >  {
> >      VirtIOBlock *s = VIRTIO_BLK(vdev);
> >  
> > -    if (s->dataplane) {
> > -        virtio_blk_data_plane_stop(s->dataplane);
> > -    }
> > -
> >      /*
> >       * This should cancel pending requests, but can't do nicely until there
> >       * are per-device request lists.
> >       */
> 
> This comment can be dropped now.

The comment still has value.

bdrv_drain != cancel pending requests

We're using the per-device request list now but we're not cancelling
yet.  The comment hasn't been fully addressed.

Stefan



* Re: [Qemu-devel] [Qemu-block] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext Alexander Yarygin
@ 2015-06-12 14:17   ` Stefan Hajnoczi
  2015-06-15 12:24   ` [Qemu-devel] " Christian Borntraeger
  2015-06-16  8:39   ` Stefan Hajnoczi
  2 siblings, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2015-06-12 14:17 UTC (permalink / raw)
  To: Alexander Yarygin
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi, Cornelia Huck,
	Paolo Bonzini


On Wed, Jun 10, 2015 at 02:38:17PM +0300, Alexander Yarygin wrote:
> After commit 9b536adc ("block: acquire AioContext in
> bdrv_drain_all()"), the aio_poll() function is called for every
> BlockDriverState, on the assumption that every device may have its
> own AioContext. If thousands of disks are attached, there are many
> BlockDriverStates but only a few AioContexts, leading to tons of
> unnecessary aio_poll() calls.
> 
> This patch changes bdrv_drain_all() to find the shared AioContexts
> and call aio_poll() only once per unique AioContext.
> 
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
> ---
>  block/io.c | 42 ++++++++++++++++++++++++++----------------
>  1 file changed, 26 insertions(+), 16 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>



* Re: [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext Alexander Yarygin
  2015-06-12 14:17   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2015-06-15 12:24   ` Christian Borntraeger
  2015-06-16  8:39   ` Stefan Hajnoczi
  2 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2015-06-15 12:24 UTC (permalink / raw)
  To: Alexander Yarygin, qemu-devel
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, Stefan Hajnoczi,
	Cornelia Huck, Paolo Bonzini

On 10.06.2015 at 13:38, Alexander Yarygin wrote:
> After commit 9b536adc ("block: acquire AioContext in
> bdrv_drain_all()"), the aio_poll() function is called for every
> BlockDriverState, on the assumption that every device may have its
> own AioContext. If thousands of disks are attached, there are many
> BlockDriverStates but only a few AioContexts, leading to tons of
> unnecessary aio_poll() calls.
> 
> This patch changes bdrv_drain_all() to find the shared AioContexts
> and call aio_poll() only once per unique AioContext.

I read Stefan's replies as "we need another respin of patch 2".
FWIW, I tested patch 3 alone and it also reduces the start time a lot
when there are lots of disks. So maybe apply 1 and 3 and defer 2 to the
next round?


> 
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
> ---
>  block/io.c | 42 ++++++++++++++++++++++++++----------------
>  1 file changed, 26 insertions(+), 16 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index e394d92..7502186 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -271,17 +271,6 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
>      return false;
>  }
> 
> -static bool bdrv_drain_one(BlockDriverState *bs)
> -{
> -    bool bs_busy;
> -
> -    bdrv_flush_io_queue(bs);
> -    bdrv_start_throttled_reqs(bs);
> -    bs_busy = bdrv_requests_pending(bs);
> -    bs_busy |= aio_poll(bdrv_get_aio_context(bs), bs_busy);
> -    return bs_busy;
> -}
> -
>  /*
>   * Wait for pending requests to complete on a single BlockDriverState subtree
>   *
> @@ -294,8 +283,13 @@ static bool bdrv_drain_one(BlockDriverState *bs)
>   */
>  void bdrv_drain(BlockDriverState *bs)
>  {
> -    while (bdrv_drain_one(bs)) {
> +    bool busy = true;
> +
> +    while (busy) {
>          /* Keep iterating */
> +         bdrv_flush_io_queue(bs);
> +         busy = bdrv_requests_pending(bs);
> +         busy |= aio_poll(bdrv_get_aio_context(bs), busy);
>      }
>  }
> 
> @@ -316,6 +310,7 @@ void bdrv_drain_all(void)
>      /* Always run first iteration so any pending completion BHs run */
>      bool busy = true;
>      BlockDriverState *bs = NULL;
> +    GSList *aio_ctxs = NULL, *ctx;
> 
>      while ((bs = bdrv_next(bs))) {
>          AioContext *aio_context = bdrv_get_aio_context(bs);
> @@ -325,17 +320,30 @@ void bdrv_drain_all(void)
>              block_job_pause(bs->job);
>          }
>          aio_context_release(aio_context);
> +
> +        if (!aio_ctxs || !g_slist_find(aio_ctxs, aio_context)) {
> +            aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
> +        }
>      }
> 
>      while (busy) {
>          busy = false;
> -        bs = NULL;
> 
> -        while ((bs = bdrv_next(bs))) {
> -            AioContext *aio_context = bdrv_get_aio_context(bs);
> +        for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
> +            AioContext *aio_context = ctx->data;
> +            bs = NULL;
> 
>              aio_context_acquire(aio_context);
> -            busy |= bdrv_drain_one(bs);
> +            while ((bs = bdrv_next(bs))) {
> +                if (aio_context == bdrv_get_aio_context(bs)) {
> +                    bdrv_flush_io_queue(bs);
> +                    if (bdrv_requests_pending(bs)) {
> +                        busy = true;
> +                        aio_poll(aio_context, busy);
> +                    }
> +                }
> +            }
> +            busy |= aio_poll(aio_context, false);
>              aio_context_release(aio_context);
>          }
>      }
> @@ -350,6 +358,7 @@ void bdrv_drain_all(void)
>          }
>          aio_context_release(aio_context);
>      }
> +    g_slist_free(aio_ctxs);
>  }
> 
>  /**
> @@ -2600,4 +2609,5 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
>      } else if (bs->file) {
>          bdrv_flush_io_queue(bs->file);
>      }
> +    bdrv_start_throttled_reqs(bs);
>  }
> 


* Re: [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext
  2015-06-10 11:38 ` [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext Alexander Yarygin
  2015-06-12 14:17   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2015-06-15 12:24   ` [Qemu-devel] " Christian Borntraeger
@ 2015-06-16  8:39   ` Stefan Hajnoczi
  2015-06-16  9:08     ` Alexander Yarygin
  2 siblings, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2015-06-16  8:39 UTC (permalink / raw)
  To: Alexander Yarygin
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, qemu-devel,
	Christian Borntraeger, Cornelia Huck, Paolo Bonzini


On Wed, Jun 10, 2015 at 02:38:17PM +0300, Alexander Yarygin wrote:
> After commit 9b536adc ("block: acquire AioContext in
> bdrv_drain_all()"), the aio_poll() function is called for every
> BlockDriverState, on the assumption that every device may have its
> own AioContext. If thousands of disks are attached, there are many
> BlockDriverStates but only a few AioContexts, leading to tons of
> unnecessary aio_poll() calls.
> 
> This patch changes bdrv_drain_all() to find the shared AioContexts
> and call aio_poll() only once per unique AioContext.
> 
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
> ---
>  block/io.c | 42 ++++++++++++++++++++++++++----------------
>  1 file changed, 26 insertions(+), 16 deletions(-)

Thanks, applied this patch only to my block tree:
https://github.com/stefanha/qemu/commits/block

Patch 2 has a pending issue and Patch 1 is only needed by Patch 2.

Stefan



* Re: [Qemu-devel] [PATCH 3/3] block: Let bdrv_drain_all() to call aio_poll() for each AioContext
  2015-06-16  8:39   ` Stefan Hajnoczi
@ 2015-06-16  9:08     ` Alexander Yarygin
  0 siblings, 0 replies; 13+ messages in thread
From: Alexander Yarygin @ 2015-06-16  9:08 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, qemu-block, Ekaterina Tumanova, qemu-devel,
	Christian Borntraeger, Cornelia Huck, Paolo Bonzini

Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Wed, Jun 10, 2015 at 02:38:17PM +0300, Alexander Yarygin wrote:
>> After commit 9b536adc ("block: acquire AioContext in
>> bdrv_drain_all()"), the aio_poll() function is called for every
>> BlockDriverState, on the assumption that every device may have its
>> own AioContext. If thousands of disks are attached, there are many
>> BlockDriverStates but only a few AioContexts, leading to tons of
>> unnecessary aio_poll() calls.
>> 
>> This patch changes bdrv_drain_all() to find the shared AioContexts
>> and call aio_poll() only once per unique AioContext.
>> 
>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
>> Cc: Kevin Wolf <kwolf@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
>> ---
>>  block/io.c | 42 ++++++++++++++++++++++++++----------------
>>  1 file changed, 26 insertions(+), 16 deletions(-)
>
> Thanks, applied this patch only to my block tree:
> https://github.com/stefanha/qemu/commits/block
>
> Patch 2 has a pending issue and Patch 1 is only needed by Patch 2.
>
> Stefan

Ok, I will respin patches 1-2.
Thanks!

