From: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org,
	Alexander Yarygin <yarygin@linux.vnet.ibm.com>,
	Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH v2] block: Let bdrv_drain_all() call aio_poll() for each AioContext
Date: Thu, 14 May 2015 19:03:20 +0300	[thread overview]
Message-ID: <1431619400-640-1-git-send-email-yarygin@linux.vnet.ibm.com> (raw)

Since commit 9b536adc ("block: acquire AioContext in
bdrv_drain_all()"), aio_poll() is called for every BlockDriverState,
on the assumption that every device may have its own AioContext.
bdrv_drain_all() is called from each virtio_reset() call, which in
turn is invoked for every virtio-blk device during initialization, so
aio_poll() ends up being called 'length(device_list)^2' times.
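
For reference, this is the pre-patch call pattern, condensed from the
code removed and kept as context in the hunks below (not the literal
source):

    /* One aio_poll() per BlockDriverState, and the whole loop runs
     * once per virtio_reset(), i.e. once per virtio-blk device. */
    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
        AioContext *aio_context = bdrv_get_aio_context(bs);

        aio_context_acquire(aio_context);
        busy |= bdrv_drain_one(bs);   /* flushes bs, then calls aio_poll() */
        aio_context_release(aio_context);
    }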

With thousands of disks attached there are a lot of BlockDriverStates
but only a few AioContexts, so most of these aio_poll() calls are
unnecessary. For example, startup with 1000 disks takes over 13
minutes.
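
As a rough back-of-the-envelope estimate (assuming each drain loop
completes in a single pass):

    1000 virtio_reset() calls x ~1000 aio_poll() calls per bdrv_drain_all()
        = on the order of 1,000,000 aio_poll() invocations during startup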

This patch changes bdrv_drain_all() to find the shared AioContexts and
to call aio_poll() only once per unique context. This results in much
better startup times; e.g. 1000 disks now come up within 5 seconds.
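
In sketch form (condensed from the diff below; the full change also
folds bdrv_start_throttled_reqs() into bdrv_flush_io_queue()), each
iteration of the drain loop first flushes every BlockDriverState while
collecting its AioContext into a list of unique contexts, then polls
each unique context once:

    GSList *aio_ctxs = NULL, *ctx;

    /* pass 1: flush each BlockDriverState, remembering its AioContext
     * only once */
    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
        aio_context = bdrv_get_aio_context(bs);
        aio_context_acquire(aio_context);
        bdrv_flush_io_queue(bs);
        busy |= bdrv_requests_pending(bs);
        aio_context_release(aio_context);
        if (!g_slist_find(aio_ctxs, aio_context)) {
            aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
        }
    }

    /* pass 2: one aio_poll() per unique AioContext instead of per BDS */
    for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
        aio_context = ctx->data;
        aio_context_acquire(aio_context);
        busy |= aio_poll(aio_context, pending);
        aio_context_release(aio_context);
    }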

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
---
 block.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/block.c b/block.c
index f2f8ae7..bdfb1ce 100644
--- a/block.c
+++ b/block.c
@@ -1987,17 +1987,6 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
     return false;
 }
 
-static bool bdrv_drain_one(BlockDriverState *bs)
-{
-    bool bs_busy;
-
-    bdrv_flush_io_queue(bs);
-    bdrv_start_throttled_reqs(bs);
-    bs_busy = bdrv_requests_pending(bs);
-    bs_busy |= aio_poll(bdrv_get_aio_context(bs), bs_busy);
-    return bs_busy;
-}
-
 /*
  * Wait for pending requests to complete on a single BlockDriverState subtree
  *
@@ -2010,8 +1999,13 @@ static bool bdrv_drain_one(BlockDriverState *bs)
  */
 void bdrv_drain(BlockDriverState *bs)
 {
-    while (bdrv_drain_one(bs)) {
+    bool busy = true;
+
+    while (busy) {
         /* Keep iterating */
+        bdrv_flush_io_queue(bs);
+        busy = bdrv_requests_pending(bs);
+        busy |= aio_poll(bdrv_get_aio_context(bs), busy);
     }
 }
 
@@ -2030,20 +2024,35 @@ void bdrv_drain(BlockDriverState *bs)
 void bdrv_drain_all(void)
 {
     /* Always run first iteration so any pending completion BHs run */
-    bool busy = true;
+    bool busy = true, pending = false;
     BlockDriverState *bs;
+    GSList *aio_ctxs = NULL, *ctx;
+    AioContext *aio_context;
 
     while (busy) {
         busy = false;
 
         QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
-            AioContext *aio_context = bdrv_get_aio_context(bs);
+            aio_context = bdrv_get_aio_context(bs);
+
+            aio_context_acquire(aio_context);
+            bdrv_flush_io_queue(bs);
+            busy |= bdrv_requests_pending(bs);
+            aio_context_release(aio_context);
+            if (!aio_ctxs || !g_slist_find(aio_ctxs, aio_context)) {
+                aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
+            }
+        }
+        pending = busy;
 
+        for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
+            aio_context = ctx->data;
             aio_context_acquire(aio_context);
-            busy |= bdrv_drain_one(bs);
+            busy |= aio_poll(aio_context, pending);
             aio_context_release(aio_context);
         }
     }
+    g_slist_free(aio_ctxs);
 }
 
 /* make a BlockDriverState anonymous by removing from bdrv_state and
@@ -6087,6 +6096,7 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
     } else if (bs->file) {
         bdrv_flush_io_queue(bs->file);
     }
+    bdrv_start_throttled_reqs(bs);
 }
 
 static bool append_open_options(QDict *d, BlockDriverState *bs)
-- 
1.9.1

Thread overview: 5+ messages
2015-05-14 16:03 Alexander Yarygin [this message]
2015-05-15  2:04 ` [Qemu-devel] [PATCH v2] block: Let bdrv_drain_all() call aio_poll() for each AioContext Fam Zheng
2015-05-15  6:59 ` Christian Borntraeger
2015-05-15  7:00   ` Christian Borntraeger
2015-05-15  8:16   ` Christian Borntraeger
