From: Fam Zheng <famz@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kwolf@redhat.com, lersek@redhat.com, qemu-devel@nongnu.org,
stefanha@redhat.com, rjones@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 3/3] AioContext: fix broken ctx->dispatching optimization
Date: Fri, 17 Jul 2015 10:25:25 +0800 [thread overview]
Message-ID: <20150717022525.GA13284@ad.nay.redhat.com> (raw)
In-Reply-To: <1437040609-9878-4-git-send-email-pbonzini@redhat.com>
On Thu, 07/16 11:56, Paolo Bonzini wrote:
> diff --git a/aio-posix.c b/aio-posix.c
> index 4abec38..268d14d 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -233,26 +233,23 @@ static void add_pollfd(AioHandler *node)
> bool aio_poll(AioContext *ctx, bool blocking)
> {
> AioHandler *node;
> - bool was_dispatching;
> int i, ret;
> bool progress;
> int64_t timeout;
>
> aio_context_acquire(ctx);
> - was_dispatching = ctx->dispatching;
> progress = false;
>
> /* aio_notify can avoid the expensive event_notifier_set if
> * everything (file descriptors, bottom halves, timers) will
> * be re-evaluated before the next blocking poll(). This is
> * already true when aio_poll is called with blocking == false;
> - * if blocking == true, it is only true after poll() returns.
> - *
> - * If we're in a nested event loop, ctx->dispatching might be true.
> - * In that case we can restore it just before returning, but we
> - * have to clear it now.
> + * if blocking == true, it is only true after poll() returns,
> + * so disable the optimization now.
> */
> - aio_set_dispatching(ctx, !blocking);
> + if (blocking) {
> + atomic_add(&ctx->notify_me, 2);
> + }
Sorry if my questions are naive, but I'm having difficulty fully
understanding this.
What if aio_notify() happens after the previous aio_dispatch() but before the
next atomic_add? aio_notify() would still skip the
event_notifier_set(), and the next ppoll() would never return. For example:
Thread A                              Thread B
------------------------------------------------------------------------
aio_poll(blocking=true)
                                      aio_notify()
                                      smp_mb()
                                      if (ctx->notify_me) /* false! */
atomic_add(ctx->notify_me, 2)
ppoll()
atomic_sub(ctx->notify_me, 2)         event_notifier_set() /* not run */
And if that's not a problem, why don't we need something like ACCESS_ONCE in
aio_notify()?
Fam
>
> ctx->walking_handlers++;
>
> @@ -286,13 +283,15 @@ bool aio_poll(AioContext *ctx, bool blocking)
> npfd = 0;
> ctx->walking_handlers--;
>
> + if (blocking) {
> + atomic_sub(&ctx->notify_me, 2);
> + }
> +
> /* Run dispatch even if there were no readable fds to run timers */
> - aio_set_dispatching(ctx, true);
> if (aio_dispatch(ctx)) {
> progress = true;
> }
>
> - aio_set_dispatching(ctx, was_dispatching);
> aio_context_release(ctx);
>
> return progress;
> diff --git a/aio-win32.c b/aio-win32.c
> index 9268b5c..9d6c12f 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
> @@ -279,25 +279,23 @@ bool aio_poll(AioContext *ctx, bool blocking)
> {
> AioHandler *node;
> HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
> - bool was_dispatching, progress, have_select_revents, first;
> + bool progress, have_select_revents, first;
> int count;
> int timeout;
>
> aio_context_acquire(ctx);
> - was_dispatching = ctx->dispatching;
> progress = false;
>
> /* aio_notify can avoid the expensive event_notifier_set if
> * everything (file descriptors, bottom halves, timers) will
> * be re-evaluated before the next blocking poll(). This is
> * already true when aio_poll is called with blocking == false;
> - * if blocking == true, it is only true after poll() returns.
> - *
> - * If we're in a nested event loop, ctx->dispatching might be true.
> - * In that case we can restore it just before returning, but we
> - * have to clear it now.
> + * if blocking == true, it is only true after poll() returns,
> + * so disable the optimization now.
> */
> - aio_set_dispatching(ctx, !blocking);
> + if (blocking) {
> + atomic_add(&ctx->notify_me, 2);
> + }
>
> have_select_revents = aio_prepare(ctx);
>
> @@ -334,7 +332,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
> if (timeout) {
> aio_context_acquire(ctx);
> }
> - aio_set_dispatching(ctx, true);
> + if (blocking) {
> + assert(first);
> + atomic_sub(&ctx->notify_me, 2);
> + }
>
> if (first && aio_bh_poll(ctx)) {
> progress = true;
> @@ -358,7 +359,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
>
> progress |= timerlistgroup_run_timers(&ctx->tlg);
>
> - aio_set_dispatching(ctx, was_dispatching);
> aio_context_release(ctx);
> return progress;
> }
> diff --git a/async.c b/async.c
> index 77d080d..a232192 100644
> --- a/async.c
> +++ b/async.c
> @@ -184,6 +184,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
> {
> AioContext *ctx = (AioContext *) source;
>
> + atomic_or(&ctx->notify_me, 1);
> +
> /* We assume there is no timeout already supplied */
> *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
>
> @@ -200,6 +202,7 @@ aio_ctx_check(GSource *source)
> AioContext *ctx = (AioContext *) source;
> QEMUBH *bh;
>
> + atomic_and(&ctx->notify_me, ~1);
> for (bh = ctx->first_bh; bh; bh = bh->next) {
> if (!bh->deleted && bh->scheduled) {
> return true;
> @@ -254,23 +257,13 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
> return ctx->thread_pool;
> }
>
> -void aio_set_dispatching(AioContext *ctx, bool dispatching)
> -{
> - ctx->dispatching = dispatching;
> - if (!dispatching) {
> - /* Write ctx->dispatching before reading e.g. bh->scheduled.
> - * Optimization: this is only needed when we're entering the "unsafe"
> - * phase where other threads must call event_notifier_set.
> - */
> - smp_mb();
> - }
> -}
> -
> void aio_notify(AioContext *ctx)
> {
> - /* Write e.g. bh->scheduled before reading ctx->dispatching. */
> + /* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
> + * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> + */
> smp_mb();
> - if (!ctx->dispatching) {
> + if (ctx->notify_me) {
> event_notifier_set(&ctx->notifier);
> }
> }
> diff --git a/docs/aio_notify.promela b/docs/aio_notify.promela
> index ad3f6f0..fccc7ee 100644
> --- a/docs/aio_notify.promela
> +++ b/docs/aio_notify.promela
> @@ -1,5 +1,5 @@
> /*
> - * This model describes the interaction between aio_set_dispatching()
> + * This model describes the interaction between ctx->notify_me
> * and aio_notify().
> *
> * Author: Paolo Bonzini <pbonzini@redhat.com>
> @@ -14,57 +14,53 @@
> * spin -a docs/aio_notify.promela
> * gcc -O2 pan.c
> * ./a.out -a
> + *
> + * To verify it (with a bug planted in the model):
> + * spin -a -DBUG docs/aio_notify.promela
> + * gcc -O2 pan.c
> + * ./a.out -a
> */
>
> #define MAX 4
> #define LAST (1 << (MAX - 1))
> #define FINAL ((LAST << 1) - 1)
>
> -bool dispatching;
> +bool notify_me;
> bool event;
>
> -int req, done;
> +int req;
> +int done;
>
> active proctype waiter()
> {
> - int fetch, blocking;
> + int fetch;
>
> - do
> - :: done != FINAL -> {
> - // Computing "blocking" is separate from execution of the
> - // "bottom half"
> - blocking = (req == 0);
> -
> - // This is our "bottom half"
> - atomic { fetch = req; req = 0; }
> - done = done | fetch;
> -
> - // Wait for a nudge from the other side
> - do
> - :: event == 1 -> { event = 0; break; }
> - :: !blocking -> break;
> - od;
> + do
> + :: true -> {
> + notify_me++;
>
> - dispatching = 1;
> + if
> +#ifndef BUG
> + :: (req > 0) -> skip;
> +#endif
> + :: else ->
> + // Wait for a nudge from the other side
> + do
> + :: event == 1 -> { event = 0; break; }
> + od;
> + fi;
>
> - // If you are simulating this model, you may want to add
> - // something like this here:
> - //
> - // int foo; foo++; foo++; foo++;
> - //
> - // This only wastes some time and makes it more likely
> - // that the notifier process hits the "fast path".
> + notify_me--;
>
> - dispatching = 0;
> + atomic { fetch = req; req = 0; }
> + done = done | fetch;
> }
> - :: else -> break;
> od
> }
>
> active proctype notifier()
> {
> int next = 1;
> - int sets = 0;
>
> do
> :: next <= LAST -> {
> @@ -74,8 +70,8 @@ active proctype notifier()
>
> // aio_notify
> if
> - :: dispatching == 0 -> sets++; event = 1;
> - :: else -> skip;
> + :: notify_me == 1 -> event = 1;
> + :: else -> printf("Skipped event_notifier_set\n"); skip;
> fi;
>
> // Test both synchronous and asynchronous delivery
> @@ -86,19 +82,12 @@ active proctype notifier()
> :: 1 -> skip;
> fi;
> }
> - :: else -> break;
> od;
> - printf("Skipped %d event_notifier_set\n", MAX - sets);
> }
>
> -#define p (done == FINAL)
> -
> -never {
> - do
> - :: 1 // after an arbitrarily long prefix
> - :: p -> break // p becomes true
> - od;
> - do
> - :: !p -> accept: break // it then must remains true forever after
> - od
> +never { /* [] done < FINAL */
> +accept_init:
> + do
> + :: done < FINAL -> skip;
> + od;
> }
> diff --git a/include/block/aio.h b/include/block/aio.h
> index b46103e..be91e3f 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -63,10 +63,30 @@ struct AioContext {
> */
> int walking_handlers;
>
> - /* Used to avoid unnecessary event_notifier_set calls in aio_notify.
> - * Writes protected by lock or BQL, reads are lockless.
> + /* Used to avoid unnecessary event_notifier_set calls in aio_notify;
> + * accessed with atomic primitives. If this field is 0, everything
> + * (file descriptors, bottom halves, timers) will be re-evaluated
> + * before the next blocking poll(), thus the event_notifier_set call
> + * can be skipped. If it is non-zero, you may need to wake up a
> + * concurrent aio_poll or the glib main event loop, making
> + * event_notifier_set necessary.
> + *
> + * Bit 0 is reserved for GSource usage of the AioContext, and is 1
> + * between a call to aio_ctx_check and the next call to aio_ctx_dispatch.
> + * Bits 1-31 simply count the number of active calls to aio_poll
> + * that are in the prepare or poll phase.
> + *
> + * The GSource and aio_poll must use a different mechanism because
> + * there is no certainty that a call to GSource's prepare callback
> + * (via g_main_context_prepare) is indeed followed by check and
> + * dispatch. It's not clear whether this would be a bug, but let's
> + * play safe and allow it---it will just cause extra calls to
> + * event_notifier_set until the next call to dispatch.
> + *
> + * Instead, the aio_poll calls include both the prepare and the
> + * dispatch phase, hence a simple counter is enough for them.
> */
> - bool dispatching;
> + uint32_t notify_me;
>
> /* lock to protect between bh's adders and deleter */
> QemuMutex bh_lock;
> @@ -89,9 +109,6 @@ struct AioContext {
> QEMUTimerListGroup tlg;
> };
>
> -/* Used internally to synchronize aio_poll against qemu_bh_schedule. */
> -void aio_set_dispatching(AioContext *ctx, bool dispatching);
> -
> /**
> * aio_context_new: Allocate a new AioContext.
> *
> diff --git a/tests/test-aio.c b/tests/test-aio.c
> index e7bbb83..217e337 100644
> --- a/tests/test-aio.c
> +++ b/tests/test-aio.c
> @@ -97,14 +97,6 @@ static void event_ready_cb(EventNotifier *e)
>
> /* Tests using aio_*. */
>
> -static void test_notify(void)
> -{
> - g_assert(!aio_poll(ctx, false));
> - aio_notify(ctx);
> - g_assert(!aio_poll(ctx, true));
> - g_assert(!aio_poll(ctx, false));
> -}
> -
> typedef struct {
> QemuMutex start_lock;
> bool thread_acquired;
> @@ -494,14 +486,6 @@ static void test_timer_schedule(void)
> * works well, and that's what I am using.
> */
>
> -static void test_source_notify(void)
> -{
> - while (g_main_context_iteration(NULL, false));
> - aio_notify(ctx);
> - g_assert(g_main_context_iteration(NULL, true));
> - g_assert(!g_main_context_iteration(NULL, false));
> -}
> -
> static void test_source_flush(void)
> {
> g_assert(!g_main_context_iteration(NULL, false));
> @@ -830,7 +814,6 @@ int main(int argc, char **argv)
> while (g_main_context_iteration(NULL, false));
>
> g_test_init(&argc, &argv, NULL);
> - g_test_add_func("/aio/notify", test_notify);
> g_test_add_func("/aio/acquire", test_acquire);
> g_test_add_func("/aio/bh/schedule", test_bh_schedule);
> g_test_add_func("/aio/bh/schedule10", test_bh_schedule10);
> @@ -845,7 +828,6 @@ int main(int argc, char **argv)
> g_test_add_func("/aio/event/flush", test_flush_event_notifier);
> g_test_add_func("/aio/timer/schedule", test_timer_schedule);
>
> - g_test_add_func("/aio-gsource/notify", test_source_notify);
> g_test_add_func("/aio-gsource/flush", test_source_flush);
> g_test_add_func("/aio-gsource/bh/schedule", test_source_bh_schedule);
> g_test_add_func("/aio-gsource/bh/schedule10", test_source_bh_schedule10);
> --
> 2.4.3
>
>