All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com
Subject: [Qemu-devel] [PULL v2 for-2.1 06/22] AioContext: speed up aio_notify
Date: Mon, 14 Jul 2014 13:42:56 +0200	[thread overview]
Message-ID: <1405338192-18850-7-git-send-email-kwolf@redhat.com> (raw)
In-Reply-To: <1405338192-18850-1-git-send-email-kwolf@redhat.com>

From: Paolo Bonzini <pbonzini@redhat.com>

In many cases, the call to event_notifier_set in aio_notify is unnecessary.
In particular, if we are executing aio_dispatch, or if aio_poll is not
blocking, we know that we will soon get to the next loop iteration (if
necessary); the thread that hosts the AioContext's event loop does not
need any nudging.

The patch includes a Promela formal model that shows that this really
works and does not need any further complication such as generation
counts.  It needs a memory barrier though.

The generation counts are not needed because any change to
ctx->dispatching after the memory barrier is okay for aio_notify.
If it changes from zero to one, it is the right thing to skip
event_notifier_set.  If it changes from one to zero, the
event_notifier_set is unnecessary but harmless.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 aio-posix.c             |  34 +++++++++++++++-
 async.c                 |  19 ++++++++-
 docs/aio_notify.promela | 104 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/block/aio.h     |   9 +++++
 4 files changed, 164 insertions(+), 2 deletions(-)
 create mode 100644 docs/aio_notify.promela

diff --git a/aio-posix.c b/aio-posix.c
index 44c4df3..2eada2e 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -175,11 +175,38 @@ static bool aio_dispatch(AioContext *ctx)
 bool aio_poll(AioContext *ctx, bool blocking)
 {
     AioHandler *node;
+    bool was_dispatching;
     int ret;
     bool progress;
 
+    was_dispatching = ctx->dispatching;
     progress = false;
 
+    /* aio_notify can avoid the expensive event_notifier_set if
+     * everything (file descriptors, bottom halves, timers) will
+     * be re-evaluated before the next blocking poll().  This happens
+     * in two cases:
+     *
+     * 1) when aio_poll is called with blocking == false
+     *
+     * 2) when we are called after poll().  If we are called before
+     *    poll(), bottom halves will not be re-evaluated and we need
+     *    aio_notify() if blocking == true.
+     *
+     * The first aio_dispatch() only does something when AioContext is
+     * running as a GSource, and in that case aio_poll is used only
+     * with blocking == false, so this optimization is already quite
+     * effective.  However, the code is ugly and should be restructured
+     * to have a single aio_dispatch() call.  To do this, we need to
+     * reorganize aio_poll into a prepare/poll/dispatch model like
+     * glib's.
+     *
+     * If we're in a nested event loop, ctx->dispatching might be true.
+     * In that case we can restore it just before returning, but we
+     * have to clear it now.
+     */
+    aio_set_dispatching(ctx, !blocking);
+
     /*
      * If there are callbacks left that have been queued, we need to call them.
      * Do not call select in this case, because it is possible that the caller
@@ -190,12 +217,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
         progress = true;
     }
 
+    /* Re-evaluate condition (1) above.  */
+    aio_set_dispatching(ctx, !blocking);
     if (aio_dispatch(ctx)) {
         progress = true;
     }
 
     if (progress && !blocking) {
-        return true;
+        goto out;
     }
 
     ctx->walking_handlers++;
@@ -234,9 +263,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
     }
 
     /* Run dispatch even if there were no readable fds to run timers */
+    aio_set_dispatching(ctx, true);
     if (aio_dispatch(ctx)) {
         progress = true;
     }
 
+out:
+    aio_set_dispatching(ctx, was_dispatching);
     return progress;
 }
diff --git a/async.c b/async.c
index 5b6fe6b..34af0b2 100644
--- a/async.c
+++ b/async.c
@@ -26,6 +26,7 @@
 #include "block/aio.h"
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"
+#include "qemu/atomic.h"
 
 /***********************************************************/
 /* bottom halves (can be seen as timers which expire ASAP) */
@@ -247,9 +248,25 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
     return ctx->thread_pool;
 }
 
+void aio_set_dispatching(AioContext *ctx, bool dispatching)
+{
+    ctx->dispatching = dispatching;
+    if (!dispatching) {
+        /* Write ctx->dispatching before reading e.g. bh->scheduled.
+         * Optimization: this is only needed when we're entering the "unsafe"
+         * phase where other threads must call event_notifier_set.
+         */
+        smp_mb();
+    }
+}
+
 void aio_notify(AioContext *ctx)
 {
-    event_notifier_set(&ctx->notifier);
+    /* Write e.g. bh->scheduled before reading ctx->dispatching.  */
+    smp_mb();
+    if (!ctx->dispatching) {
+        event_notifier_set(&ctx->notifier);
+    }
 }
 
 static void aio_timerlist_notify(void *opaque)
diff --git a/docs/aio_notify.promela b/docs/aio_notify.promela
new file mode 100644
index 0000000..ad3f6f0
--- /dev/null
+++ b/docs/aio_notify.promela
@@ -0,0 +1,104 @@
+/*
+ * This model describes the interaction between aio_set_dispatching()
+ * and aio_notify().
+ *
+ * Author: Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * This file is in the public domain.  If you really want a license,
+ * the WTFPL will do.
+ *
+ * To simulate it:
+ *     spin -p docs/aio_notify.promela
+ *
+ * To verify it:
+ *     spin -a docs/aio_notify.promela
+ *     gcc -O2 pan.c
+ *     ./a.out -a
+ */
+
+#define MAX   4
+#define LAST  (1 << (MAX - 1))
+#define FINAL ((LAST << 1) - 1)
+
+bool dispatching;
+bool event;
+
+int req, done;
+
+active proctype waiter()
+{
+     int fetch, blocking;
+
+     do
+        :: done != FINAL -> {
+            // Computing "blocking" is separate from execution of the
+            // "bottom half"
+            blocking = (req == 0);
+
+            // This is our "bottom half"
+            atomic { fetch = req; req = 0; }
+            done = done | fetch;
+
+            // Wait for a nudge from the other side
+            do
+                :: event == 1 -> { event = 0; break; }
+                :: !blocking  -> break;
+            od;
+
+            dispatching = 1;
+
+            // If you are simulating this model, you may want to add
+            // something like this here:
+            //
+            //      int foo; foo++; foo++; foo++;
+            //
+            // This only wastes some time and makes it more likely
+            // that the notifier process hits the "fast path".
+
+            dispatching = 0;
+        }
+        :: else -> break;
+    od
+}
+
+active proctype notifier()
+{
+    int next = 1;
+    int sets = 0;
+
+    do
+        :: next <= LAST -> {
+            // generate a request
+            req = req | next;
+            next = next << 1;
+
+            // aio_notify
+            if
+                :: dispatching == 0 -> sets++; event = 1;
+                :: else             -> skip;
+            fi;
+
+            // Test both synchronous and asynchronous delivery
+            if
+                :: 1 -> do
+                            :: req == 0 -> break;
+                        od;
+                :: 1 -> skip;
+            fi;
+        }
+        :: else -> break;
+    od;
+    printf("Skipped %d event_notifier_set\n", MAX - sets);
+}
+
+#define p (done == FINAL)
+
+never  {
+    do
+        :: 1                      // after an arbitrarily long prefix
+        :: p -> break             // p becomes true
+    od;
+    do
+        :: !p -> accept: break    // it then must remains true forever after
+    od
+}
diff --git a/include/block/aio.h b/include/block/aio.h
index d81250c..433e7ff 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -60,8 +60,14 @@ struct AioContext {
      */
     int walking_handlers;
 
+    /* Used to avoid unnecessary event_notifier_set calls in aio_notify.
+     * Writes protected by lock or BQL, reads are lockless.
+     */
+    bool dispatching;
+
     /* lock to protect between bh's adders and deleter */
     QemuMutex bh_lock;
+
     /* Anchor of the list of Bottom Halves belonging to the context */
     struct QEMUBH *first_bh;
 
@@ -83,6 +89,9 @@ struct AioContext {
     QEMUTimerListGroup tlg;
 };
 
+/* Used internally to synchronize aio_poll against qemu_bh_schedule.  */
+void aio_set_dispatching(AioContext *ctx, bool dispatching);
+
 /**
  * aio_context_new: Allocate a new AioContext.
  *
-- 
1.8.3.1

  parent reply	other threads:[~2014-07-14 11:43 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-14 11:42 [Qemu-devel] [PULL v2 for-2.1 00/22] Block patches for 2.1.0-rc2 Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 01/22] block/backup: Fix hang for unaligned image size Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 02/22] block: Fix bdrv_is_allocated() return value Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 03/22] block: prefer aio_poll to qemu_aio_wait Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 04/22] block: drop aio functions that operate on the main AioContext Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 05/22] test-aio: fix GSource-based timer test Kevin Wolf
2014-07-14 11:42 ` Kevin Wolf [this message]
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 07/22] block: Make qiov match the request size until EOF Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 08/22] qcow2: Make qiov match request size until backing file EOF Kevin Wolf
2014-07-14 11:42 ` [Qemu-devel] [PULL v2 for-2.1 09/22] qed: " Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 10/22] block: Assert qiov length matches request length Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 11/22] virtio-blk: avoid dataplane VirtIOBlockReq early free Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 12/22] dataplane: do not free VirtQueueElement in vring_push() Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 13/22] virtio-blk: avoid g_slice_new0() for VirtIOBlockReq and VirtQueueElement Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 14/22] virtio-blk: embed VirtQueueElement in VirtIOBlockReq Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 15/22] AioContext: do not rely on aio_poll(ctx, true) result to end a loop Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 16/22] tests: Fix unterminated string output visitor enum human string Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 17/22] qtest: fix vhost-user-test compilation with old GLib Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 18/22] dma-helpers: Fix too long qiov Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 19/22] virtio-blk: Factor common checks out of virtio_blk_handle_read/write() Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 20/22] virtio-blk: Bypass error action and I/O accounting on invalid r/w Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 21/22] virtio-blk: Treat read/write beyond end as invalid Kevin Wolf
2014-07-14 11:43 ` [Qemu-devel] [PULL v2 for-2.1 22/22] ide: " Kevin Wolf
2014-07-14 15:01 ` [Qemu-devel] [PULL v2 for-2.1 00/22] Block patches for 2.1.0-rc2 Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1405338192-18850-7-git-send-email-kwolf@redhat.com \
    --to=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.