From: Fam Zheng <famz@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kwolf@redhat.com, lersek@redhat.com, qemu-devel@nongnu.org,
stefanha@redhat.com, rjones@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 3/3] AioContext: fix broken ctx->dispatching optimization
Date: Fri, 17 Jul 2015 10:25:25 +0800 [thread overview]
Message-ID: <20150717022525.GA13284@ad.nay.redhat.com> (raw)
In-Reply-To: <1437040609-9878-4-git-send-email-pbonzini@redhat.com>
On Thu, 07/16 11:56, Paolo Bonzini wrote:
> diff --git a/aio-posix.c b/aio-posix.c
> index 4abec38..268d14d 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -233,26 +233,23 @@ static void add_pollfd(AioHandler *node)
> bool aio_poll(AioContext *ctx, bool blocking)
> {
> AioHandler *node;
> - bool was_dispatching;
> int i, ret;
> bool progress;
> int64_t timeout;
>
> aio_context_acquire(ctx);
> - was_dispatching = ctx->dispatching;
> progress = false;
>
> /* aio_notify can avoid the expensive event_notifier_set if
> * everything (file descriptors, bottom halves, timers) will
> * be re-evaluated before the next blocking poll(). This is
> * already true when aio_poll is called with blocking == false;
> - * if blocking == true, it is only true after poll() returns.
> - *
> - * If we're in a nested event loop, ctx->dispatching might be true.
> - * In that case we can restore it just before returning, but we
> - * have to clear it now.
> + * if blocking == true, it is only true after poll() returns,
> + * so disable the optimization now.
> */
> - aio_set_dispatching(ctx, !blocking);
> + if (blocking) {
> + atomic_add(&ctx->notify_me, 2);
> + }
Sorry if my questions are naive, but I'm having difficulty fully
understanding this.
What if aio_notify() happens after the previous aio_dispatch() but before the
next atomic_add? aio_notify() would still skip the
event_notifier_set(), and the next ppoll() would never return. For example:
Thread A                              Thread B
------------------------------------------------------------------------
aio_poll(blocking=true)
                                      aio_notify()
                                      smp_mb()
                                      if (ctx->notify_me) /* false! */
atomic_add(ctx->notify_me, 2)
ppoll()
atomic_sub(ctx->notify_me, 2)         event_notifier_set() /* not run */
And if that's not a problem, why don't we need something like ACCESS_ONCE in
aio_notify()?
Fam
>
> ctx->walking_handlers++;
>
> @@ -286,13 +283,15 @@ bool aio_poll(AioContext *ctx, bool blocking)
> npfd = 0;
> ctx->walking_handlers--;
>
> + if (blocking) {
> + atomic_sub(&ctx->notify_me, 2);
> + }
> +
> /* Run dispatch even if there were no readable fds to run timers */
> - aio_set_dispatching(ctx, true);
> if (aio_dispatch(ctx)) {
> progress = true;
> }
>
> - aio_set_dispatching(ctx, was_dispatching);
> aio_context_release(ctx);
>
> return progress;
> diff --git a/aio-win32.c b/aio-win32.c
> index 9268b5c..9d6c12f 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
> @@ -279,25 +279,23 @@ bool aio_poll(AioContext *ctx, bool blocking)
> {
> AioHandler *node;
> HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
> - bool was_dispatching, progress, have_select_revents, first;
> + bool progress, have_select_revents, first;
> int count;
> int timeout;
>
> aio_context_acquire(ctx);
> - was_dispatching = ctx->dispatching;
> progress = false;
>
> /* aio_notify can avoid the expensive event_notifier_set if
> * everything (file descriptors, bottom halves, timers) will
> * be re-evaluated before the next blocking poll(). This is
> * already true when aio_poll is called with blocking == false;
> - * if blocking == true, it is only true after poll() returns.
> - *
> - * If we're in a nested event loop, ctx->dispatching might be true.
> - * In that case we can restore it just before returning, but we
> - * have to clear it now.
> + * if blocking == true, it is only true after poll() returns,
> + * so disable the optimization now.
> */
> - aio_set_dispatching(ctx, !blocking);
> + if (blocking) {
> + atomic_add(&ctx->notify_me, 2);
> + }
>
> have_select_revents = aio_prepare(ctx);
>
> @@ -334,7 +332,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
> if (timeout) {
> aio_context_acquire(ctx);
> }
> - aio_set_dispatching(ctx, true);
> + if (blocking) {
> + assert(first);
> + atomic_sub(&ctx->notify_me, 2);
> + }
>
> if (first && aio_bh_poll(ctx)) {
> progress = true;
> @@ -358,7 +359,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
>
> progress |= timerlistgroup_run_timers(&ctx->tlg);
>
> - aio_set_dispatching(ctx, was_dispatching);
> aio_context_release(ctx);
> return progress;
> }
> diff --git a/async.c b/async.c
> index 77d080d..a232192 100644
> --- a/async.c
> +++ b/async.c
> @@ -184,6 +184,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
> {
> AioContext *ctx = (AioContext *) source;
>
> + atomic_or(&ctx->notify_me, 1);
> +
> /* We assume there is no timeout already supplied */
> *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
>
> @@ -200,6 +202,7 @@ aio_ctx_check(GSource *source)
> AioContext *ctx = (AioContext *) source;
> QEMUBH *bh;
>
> + atomic_and(&ctx->notify_me, ~1);
> for (bh = ctx->first_bh; bh; bh = bh->next) {
> if (!bh->deleted && bh->scheduled) {
> return true;
> @@ -254,23 +257,13 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
> return ctx->thread_pool;
> }
>
> -void aio_set_dispatching(AioContext *ctx, bool dispatching)
> -{
> - ctx->dispatching = dispatching;
> - if (!dispatching) {
> - /* Write ctx->dispatching before reading e.g. bh->scheduled.
> - * Optimization: this is only needed when we're entering the "unsafe"
> - * phase where other threads must call event_notifier_set.
> - */
> - smp_mb();
> - }
> -}
> -
> void aio_notify(AioContext *ctx)
> {
> - /* Write e.g. bh->scheduled before reading ctx->dispatching. */
> + /* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
> + * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> + */
> smp_mb();
> - if (!ctx->dispatching) {
> + if (ctx->notify_me) {
> event_notifier_set(&ctx->notifier);
> }
> }
> diff --git a/docs/aio_notify.promela b/docs/aio_notify.promela
> index ad3f6f0..fccc7ee 100644
> --- a/docs/aio_notify.promela
> +++ b/docs/aio_notify.promela
> @@ -1,5 +1,5 @@
> /*
> - * This model describes the interaction between aio_set_dispatching()
> + * This model describes the interaction between ctx->notify_me
> * and aio_notify().
> *
> * Author: Paolo Bonzini <pbonzini@redhat.com>
> @@ -14,57 +14,53 @@
> * spin -a docs/aio_notify.promela
> * gcc -O2 pan.c
> * ./a.out -a
> + *
> + * To verify it (with a bug planted in the model):
> + * spin -a -DBUG docs/aio_notify.promela
> + * gcc -O2 pan.c
> + * ./a.out -a
> */
>
> #define MAX 4
> #define LAST (1 << (MAX - 1))
> #define FINAL ((LAST << 1) - 1)
>
> -bool dispatching;
> +bool notify_me;
> bool event;
>
> -int req, done;
> +int req;
> +int done;
>
> active proctype waiter()
> {
> - int fetch, blocking;
> + int fetch;
>
> - do
> - :: done != FINAL -> {
> - // Computing "blocking" is separate from execution of the
> - // "bottom half"
> - blocking = (req == 0);
> -
> - // This is our "bottom half"
> - atomic { fetch = req; req = 0; }
> - done = done | fetch;
> -
> - // Wait for a nudge from the other side
> - do
> - :: event == 1 -> { event = 0; break; }
> - :: !blocking -> break;
> - od;
> + do
> + :: true -> {
> + notify_me++;
>
> - dispatching = 1;
> + if
> +#ifndef BUG
> + :: (req > 0) -> skip;
> +#endif
> + :: else ->
> + // Wait for a nudge from the other side
> + do
> + :: event == 1 -> { event = 0; break; }
> + od;
> + fi;
>
> - // If you are simulating this model, you may want to add
> - // something like this here:
> - //
> - // int foo; foo++; foo++; foo++;
> - //
> - // This only wastes some time and makes it more likely
> - // that the notifier process hits the "fast path".
> + notify_me--;
>
> - dispatching = 0;
> + atomic { fetch = req; req = 0; }
> + done = done | fetch;
> }
> - :: else -> break;
> od
> }
>
> active proctype notifier()
> {
> int next = 1;
> - int sets = 0;
>
> do
> :: next <= LAST -> {
> @@ -74,8 +70,8 @@ active proctype notifier()
>
> // aio_notify
> if
> - :: dispatching == 0 -> sets++; event = 1;
> - :: else -> skip;
> + :: notify_me == 1 -> event = 1;
> + :: else -> printf("Skipped event_notifier_set\n"); skip;
> fi;
>
> // Test both synchronous and asynchronous delivery
> @@ -86,19 +82,12 @@ active proctype notifier()
> :: 1 -> skip;
> fi;
> }
> - :: else -> break;
> od;
> - printf("Skipped %d event_notifier_set\n", MAX - sets);
> }
>
> -#define p (done == FINAL)
> -
> -never {
> - do
> - :: 1 // after an arbitrarily long prefix
> - :: p -> break // p becomes true
> - od;
> - do
> - :: !p -> accept: break // it then must remains true forever after
> - od
> +never { /* [] done < FINAL */
> +accept_init:
> + do
> + :: done < FINAL -> skip;
> + od;
> }
> diff --git a/include/block/aio.h b/include/block/aio.h
> index b46103e..be91e3f 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -63,10 +63,30 @@ struct AioContext {
> */
> int walking_handlers;
>
> - /* Used to avoid unnecessary event_notifier_set calls in aio_notify.
> - * Writes protected by lock or BQL, reads are lockless.
> + /* Used to avoid unnecessary event_notifier_set calls in aio_notify;
> + * accessed with atomic primitives. If this field is 0, everything
> + * (file descriptors, bottom halves, timers) will be re-evaluated
> + * before the next blocking poll(), thus the event_notifier_set call
> + * can be skipped. If it is non-zero, you may need to wake up a
> + * concurrent aio_poll or the glib main event loop, making
> + * event_notifier_set necessary.
> + *
> + * Bit 0 is reserved for GSource usage of the AioContext, and is 1
> + * between a call to aio_ctx_check and the next call to aio_ctx_dispatch.
> + * Bits 1-31 simply count the number of active calls to aio_poll
> + * that are in the prepare or poll phase.
> + *
> + * The GSource and aio_poll must use a different mechanism because
> + * there is no certainty that a call to GSource's prepare callback
> + * (via g_main_context_prepare) is indeed followed by check and
> + * dispatch. It's not clear whether this would be a bug, but let's
> + * play safe and allow it---it will just cause extra calls to
> + * event_notifier_set until the next call to dispatch.
> + *
> + * Instead, the aio_poll calls include both the prepare and the
> + * dispatch phase, hence a simple counter is enough for them.
> */
> - bool dispatching;
> + uint32_t notify_me;
>
> /* lock to protect between bh's adders and deleter */
> QemuMutex bh_lock;
> @@ -89,9 +109,6 @@ struct AioContext {
> QEMUTimerListGroup tlg;
> };
>
> -/* Used internally to synchronize aio_poll against qemu_bh_schedule. */
> -void aio_set_dispatching(AioContext *ctx, bool dispatching);
> -
> /**
> * aio_context_new: Allocate a new AioContext.
> *
> diff --git a/tests/test-aio.c b/tests/test-aio.c
> index e7bbb83..217e337 100644
> --- a/tests/test-aio.c
> +++ b/tests/test-aio.c
> @@ -97,14 +97,6 @@ static void event_ready_cb(EventNotifier *e)
>
> /* Tests using aio_*. */
>
> -static void test_notify(void)
> -{
> - g_assert(!aio_poll(ctx, false));
> - aio_notify(ctx);
> - g_assert(!aio_poll(ctx, true));
> - g_assert(!aio_poll(ctx, false));
> -}
> -
> typedef struct {
> QemuMutex start_lock;
> bool thread_acquired;
> @@ -494,14 +486,6 @@ static void test_timer_schedule(void)
> * works well, and that's what I am using.
> */
>
> -static void test_source_notify(void)
> -{
> - while (g_main_context_iteration(NULL, false));
> - aio_notify(ctx);
> - g_assert(g_main_context_iteration(NULL, true));
> - g_assert(!g_main_context_iteration(NULL, false));
> -}
> -
> static void test_source_flush(void)
> {
> g_assert(!g_main_context_iteration(NULL, false));
> @@ -830,7 +814,6 @@ int main(int argc, char **argv)
> while (g_main_context_iteration(NULL, false));
>
> g_test_init(&argc, &argv, NULL);
> - g_test_add_func("/aio/notify", test_notify);
> g_test_add_func("/aio/acquire", test_acquire);
> g_test_add_func("/aio/bh/schedule", test_bh_schedule);
> g_test_add_func("/aio/bh/schedule10", test_bh_schedule10);
> @@ -845,7 +828,6 @@ int main(int argc, char **argv)
> g_test_add_func("/aio/event/flush", test_flush_event_notifier);
> g_test_add_func("/aio/timer/schedule", test_timer_schedule);
>
> - g_test_add_func("/aio-gsource/notify", test_source_notify);
> g_test_add_func("/aio-gsource/flush", test_source_flush);
> g_test_add_func("/aio-gsource/bh/schedule", test_source_bh_schedule);
> g_test_add_func("/aio-gsource/bh/schedule10", test_source_bh_schedule10);
> --
> 2.4.3
>
>