qemu-devel.nongnu.org archive mirror
From: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>
Subject: Re: [Qemu-devel] [PATCH] block: Let bdrv_drain_all() to call aio_poll() for each AioContext
Date: Wed, 13 May 2015 19:34:40 +0300	[thread overview]
Message-ID: <87vbfw77xb.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <55536C6B.4040400@redhat.com> (Paolo Bonzini's message of "Wed, 13 May 2015 17:23:23 +0200")

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 13/05/2015 17:18, Alexander Yarygin wrote:
>> After the commit 9b536adc ("block: acquire AioContext in
>> bdrv_drain_all()") the aio_poll() function got called for every
>> BlockDriverState, on the assumption that every device may have its own
>> AioContext. The bdrv_drain_all() function is called in each
>> virtio_reset() call,
>
> ... which should actually call bdrv_drain().  Can you fix that?
>

I thought about it, but couldn't convince myself that it's safe. The
comment above bdrv_drain_all() states "... it is not possible to have a
function to drain a single device's I/O queue."; besides that, what if
several virtual disks share the same host file?
Or am I wrong and it is OK to do?

>> which in turn is called for every virtio-blk
>> device on initialization, so we get aio_poll() called
>> 'length(device_list)^2' times.
>> 
>> If we have thousands of disks attached, there are a lot of
>> BlockDriverStates but only a few AioContexts, leading to tons of
>> unnecessary aio_poll() calls. For example, startup with 1000 disks
>> takes over 13 minutes.
>> 
>> This patch changes the bdrv_drain_all() function, allowing it to find
>> shared AioContexts and to call aio_poll() only for unique ones. This
>> results in much better startup times; e.g., 1000 disks come up within
>> 5 seconds.
>
> I'm not sure this patch is correct.  You may have to call aio_poll
> multiple times before a BlockDriverState is drained.
>
> Paolo
>


Ah, right. We need a second loop, something like this:

@@ -2030,20 +2033,33 @@ void bdrv_drain(BlockDriverState *bs)
 void bdrv_drain_all(void)
 {
     /* Always run first iteration so any pending completion BHs run */
-    bool busy = true;
+    bool busy = true, pending = false;
     BlockDriverState *bs;
+    GList *aio_ctxs = NULL, *ctx;
+    AioContext *aio_context;

     while (busy) {
         busy = false;

         QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
-            AioContext *aio_context = bdrv_get_aio_context(bs);
+            aio_context = bdrv_get_aio_context(bs);

             aio_context_acquire(aio_context);
             busy |= bdrv_drain_one(bs);
             aio_context_release(aio_context);
+            if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context)) {
+                aio_ctxs = g_list_append(aio_ctxs, aio_context);
+            }
+        }
+        pending = busy;
+
+        for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
+            aio_context = ctx->data;
+            aio_context_acquire(aio_context);
+            busy |= aio_poll(aio_context, pending);
+            aio_context_release(aio_context);
         }
     }
+    g_list_free(aio_ctxs);
 }

That looks quite ugly to me, and it breaks the consistency of
bdrv_drain_one(), since it no longer calls aio_poll()...


>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
>> Cc: Kevin Wolf <kwolf@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
>> ---
>>  block.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>> 
>> diff --git a/block.c b/block.c
>> index f2f8ae7..7414815 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1994,7 +1994,6 @@ static bool bdrv_drain_one(BlockDriverState *bs)
>>      bdrv_flush_io_queue(bs);
>>      bdrv_start_throttled_reqs(bs);
>>      bs_busy = bdrv_requests_pending(bs);
>> -    bs_busy |= aio_poll(bdrv_get_aio_context(bs), bs_busy);
>>      return bs_busy;
>>  }
>>  
>> @@ -2010,8 +2009,12 @@ static bool bdrv_drain_one(BlockDriverState *bs)
>>   */
>>  void bdrv_drain(BlockDriverState *bs)
>>  {
>> -    while (bdrv_drain_one(bs)) {
>> +    bool busy = true;
>> +
>> +    while (busy) {
>>          /* Keep iterating */
>> +        busy = bdrv_drain_one(bs);
>> +        busy |= aio_poll(bdrv_get_aio_context(bs), busy);
>>      }
>>  }
>>  
>> @@ -2032,6 +2035,7 @@ void bdrv_drain_all(void)
>>      /* Always run first iteration so any pending completion BHs run */
>>      bool busy = true;
>>      BlockDriverState *bs;
>> +    GList *aio_ctxs = NULL;
>>  
>>      while (busy) {
>>          busy = false;
>> @@ -2041,9 +2045,14 @@ void bdrv_drain_all(void)
>>  
>>              aio_context_acquire(aio_context);
>>              busy |= bdrv_drain_one(bs);
>> +            if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context)) {
>> +                busy |= aio_poll(aio_context, busy);
>> +                aio_ctxs = g_list_append(aio_ctxs, aio_context);
>> +            }
>>              aio_context_release(aio_context);
>>          }
>>      }
>> +    g_list_free(aio_ctxs);
>>  }
>>  
>>  /* make a BlockDriverState anonymous by removing from bdrv_state and
>> 

Thread overview: 10+ messages
2015-05-13 15:18 [Qemu-devel] [PATCH] block: Let bdrv_drain_all() to call aio_poll() for each AioContext Alexander Yarygin
2015-05-13 15:23 ` Paolo Bonzini
2015-05-13 16:34   ` Alexander Yarygin [this message]
2015-05-14  2:25     ` Fam Zheng
2015-05-14 10:57       ` Alexander Yarygin
2015-05-14 12:05     ` Paolo Bonzini
2015-05-14 14:29       ` Alexander Yarygin
2015-05-14 14:34         ` Paolo Bonzini
2015-05-13 16:02 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2015-05-13 16:37   ` Alexander Yarygin
