All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: kwolf@redhat.com, lersek@redhat.com, qemu-devel@nongnu.org,
	stefanha@redhat.com, rjones@redhat.com
Subject: Re: [Qemu-devel] [PATCH 1/2] AioContext: fix broken placement of event_notifier_test_and_clear
Date: Mon, 20 Jul 2015 07:32:49 +0200	[thread overview]
Message-ID: <55AC8801.5020808@redhat.com> (raw)
In-Reply-To: <20150720035535.GB17582@ad.nay.redhat.com>



On 20/07/2015 05:55, Fam Zheng wrote:
> On Sat, 07/18 22:21, Paolo Bonzini wrote:
>> event_notifier_test_and_clear must be called before processing events.
>> Otherwise, an aio_poll could "eat" the notification before the main
>> I/O thread invokes ppoll().  The main I/O thread then never wakes up.
>> This is an example of what could happen:
>>
>>    i/o thread         vcpu thread                   worker thread
>>    ---------------------------------------------------------------------
>>    lock_iothread
>>    notify_me = 1
>>    ...
>>    unlock_iothread
>>                       lock_iothread
>>                       notify_me = 3
>>                       ppoll
>>                       notify_me = 1
>>                                                      bh->scheduled = 1
>>                                                      event_notifier_set
>>                       event_notifier_test_and_clear
>>    ppoll
>>    *** hang ***
> 
> It still seems possible for event_notifier_test_and_clear to happen before main
> thread ppoll, will that be a problem? How does main thread get waken up in that
> case?

You could have this:


   i/o thread         vcpu thread                   worker thread
   ---------------------------------------------------------------------
   lock_iothread
   notify_me = 1
   ...
   unlock_iothread
                      lock_iothread
                      notify_me = 3
                      ppoll
                      notify_me = 1
                                                    bh->scheduled = 1
                                                    event_notifier_set
                      event_notifier_test_and_clear
   ppoll

But then the bottom half *will* be processed by the VCPU thread.  It's a
similar "rule" to the one that you use with lock-free algorithms: the
consumer, 99% of the time, uses the variables in the opposite order of
the producer.

Here the producer calls event_notifier_set after bh->scheduled (of
course...); the consumer *must* clear the EventNotifier before reading
bh->scheduled.

Paolo


> 
>>
>> In contrast with the previous bug, there wasn't really much debugging
>> to do here.  "Tracing" with qemu_clock_get_ns shows pretty much the
>> same behavior as in the previous patch, so the only wait to find the
>> bug is staring at the code.
>>
>> One could also use a formal model, of course.  The included one shows
>> this with three processes: notifier corresponds to a QEMU thread pool
>> worker, temporary_waiter to a VCPU thread that invokes aio_poll(),
>> waiter to the main I/O thread.  I would be happy to say that the
>> formal model found the bug for me, but actually I wrote it after the
>> fact.
>>
>> This patch is a bit of a big hammer.  The next one optimizes it,
>> with help (this time for real rather than a posteriori :)) from
>> another, similar formal model.
>>
>> Reported-by: Richard W. M. Jones <rjones@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  aio-posix.c                 |   2 +
>>  aio-win32.c                 |   2 +
>>  async.c                     |   8 ++-
>>  docs/aio_notify_bug.promela | 140 ++++++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 151 insertions(+), 1 deletion(-)
>>  create mode 100644 docs/aio_notify_bug.promela
>>
>> diff --git a/aio-posix.c b/aio-posix.c
>> index 249889f..5c8b266 100644
>> --- a/aio-posix.c
>> +++ b/aio-posix.c
>> @@ -276,6 +276,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
>>          aio_context_acquire(ctx);
>>      }
>>  
>> +    event_notifier_test_and_clear(&ctx->notifier);
>> +
>>      /* if we have any readable fds, dispatch event */
>>      if (ret > 0) {
>>          for (i = 0; i < npfd; i++) {
>> diff --git a/aio-win32.c b/aio-win32.c
>> index ea655b0..3e0db20 100644
>> --- a/aio-win32.c
>> +++ b/aio-win32.c
>> @@ -337,6 +337,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
>>              aio_context_acquire(ctx);
>>          }
>>  
>> +        event_notifier_test_and_clear(&ctx->notifier);
>> +
>>          if (first && aio_bh_poll(ctx)) {
>>              progress = true;
>>          }
>> diff --git a/async.c b/async.c
>> index a232192..d625e8a 100644
>> --- a/async.c
>> +++ b/async.c
>> @@ -203,6 +203,8 @@ aio_ctx_check(GSource *source)
>>      QEMUBH *bh;
>>  
>>      atomic_and(&ctx->notify_me, ~1);
>> +    event_notifier_test_and_clear(&ctx->notifier);
>> +
>>      for (bh = ctx->first_bh; bh; bh = bh->next) {
>>          if (!bh->deleted && bh->scheduled) {
>>              return true;
>> @@ -279,6 +281,10 @@ static void aio_rfifolock_cb(void *opaque)
>>      aio_notify(opaque);
>>  }
>>  
>> +static void event_notifier_dummy_cb(EventNotifier *e)
>> +{
>> +}
>> +
>>  AioContext *aio_context_new(Error **errp)
>>  {
>>      int ret;
>> @@ -293,7 +299,7 @@ AioContext *aio_context_new(Error **errp)
>>      g_source_set_can_recurse(&ctx->source, true);
>>      aio_set_event_notifier(ctx, &ctx->notifier,
>>                             (EventNotifierHandler *)
>> -                           event_notifier_test_and_clear);
>> +                           event_notifier_dummy_cb);
>>      ctx->thread_pool = NULL;
>>      qemu_mutex_init(&ctx->bh_lock);
>>      rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
>> diff --git a/docs/aio_notify_bug.promela b/docs/aio_notify_bug.promela
>> new file mode 100644
>> index 0000000..b3bfca1
>> --- /dev/null
>> +++ b/docs/aio_notify_bug.promela
>> @@ -0,0 +1,140 @@
>> +/*
>> + * This model describes a bug in aio_notify.  If ctx->notifier is
>> + * cleared too late, a wakeup could be lost.
>> + *
>> + * Author: Paolo Bonzini <pbonzini@redhat.com>
>> + *
>> + * This file is in the public domain.  If you really want a license,
>> + * the WTFPL will do.
>> + *
>> + * To verify the buggy version:
>> + *     spin -a -DBUG docs/aio_notify_bug.promela
>> + *     gcc -O2 pan.c
>> + *     ./a.out -a -f
>> + *
>> + * To verify the working version:
>> + *     spin -a docs/aio_notify_bug.promela
>> + *     gcc -O2 pan.c
>> + *     ./a.out -a -f
>> + *
>> + * Add -DCHECK_REQ to test an alternative invariant and the
>> + * "notify_me" optimization.
>> + */
>> +
>> +int notify_me;
>> +bool event;
>> +bool req;
>> +bool notifier_done;
>> +
>> +#ifdef CHECK_REQ
>> +#define USE_NOTIFY_ME 1
>> +#else
>> +#define USE_NOTIFY_ME 0
>> +#endif
>> +
>> +active proctype notifier()
>> +{
>> +    do
>> +        :: true -> {
>> +            req = 1;
>> +            if
>> +               :: !USE_NOTIFY_ME || notify_me -> event = 1;
>> +               :: else -> skip;
>> +            fi
>> +        }
>> +        :: true -> break;
>> +    od;
>> +    notifier_done = 1;
>> +}
>> +
>> +#ifdef BUG
>> +#define AIO_POLL                                                    \
>> +    notify_me++;                                                    \
>> +    if                                                              \
>> +        :: !req -> {                                                \
>> +            if                                                      \
>> +                :: event -> skip;                                   \
>> +            fi;                                                     \
>> +        }                                                           \
>> +        :: else -> skip;                                            \
>> +    fi;                                                             \
>> +    notify_me--;                                                    \
>> +                                                                    \
>> +    req = 0;                                                        \
>> +    event = 0;
>> +#else
>> +#define AIO_POLL                                                    \
>> +    notify_me++;                                                    \
>> +    if                                                              \
>> +        :: !req -> {                                                \
>> +            if                                                      \
>> +                :: event -> skip;                                   \
>> +            fi;                                                     \
>> +        }                                                           \
>> +        :: else -> skip;                                            \
>> +    fi;                                                             \
>> +    notify_me--;                                                    \
>> +                                                                    \
>> +    event = 0;                                                      \
>> +    req = 0;
>> +#endif
>> +
>> +active proctype waiter()
>> +{
>> +    do
>> +       :: true -> AIO_POLL;
>> +    od;
>> +}
>> +
>> +/* Same as waiter(), but disappears after a while.  */
>> +active proctype temporary_waiter()
>> +{
>> +    do
>> +       :: true -> AIO_POLL;
>> +       :: true -> break;
>> +    od;
>> +}
>> +
>> +#ifdef CHECK_REQ
>> +never {
>> +    do
>> +        :: req -> goto accept_if_req_not_eventually_false;
>> +        :: true -> skip;
>> +    od;
>> +
>> +accept_if_req_not_eventually_false:
>> +    if
>> +        :: req -> goto accept_if_req_not_eventually_false;
>> +    fi;
>> +    assert(0);
>> +}
>> +
>> +#else
>> +/* There must be infinitely many transitions of event as long
>> + * as the notifier does not exit.
>> + *
>> + * If event stayed always true, the waiters would be busy looping.
>> + * If event stayed always false, the waiters would be sleeping
>> + * forever.
>> + */
>> +never {
>> +    do
>> +        :: !event    -> goto accept_if_event_not_eventually_true;
>> +        :: event     -> goto accept_if_event_not_eventually_false;
>> +        :: true      -> skip;
>> +    od;
>> +
>> +accept_if_event_not_eventually_true:
>> +    if
>> +        :: !event && notifier_done  -> do :: true -> skip; od;
>> +        :: !event && !notifier_done -> goto accept_if_event_not_eventually_true;
>> +    fi;
>> +    assert(0);
>> +
>> +accept_if_event_not_eventually_false:
>> +    if
>> +        :: event     -> goto accept_if_event_not_eventually_false;
>> +    fi;
>> +    assert(0);
>> +}
>> +#endif
>> -- 
>> 2.4.3
>>
>>
> 
> 

  reply	other threads:[~2015-07-20  5:32 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-18 20:21 [Qemu-devel] [PATCH 0/2] AioContext: fix missing wakeups due to event_notifier_test_and_clear Paolo Bonzini
2015-07-18 20:21 ` [Qemu-devel] [PATCH 1/2] AioContext: fix broken placement of event_notifier_test_and_clear Paolo Bonzini
2015-07-20  3:55   ` Fam Zheng
2015-07-20  5:32     ` Paolo Bonzini [this message]
2015-07-18 20:21 ` [Qemu-devel] [PATCH 2/2] AioContext: optimize clearing the EventNotifier Paolo Bonzini
2015-07-20  2:27   ` Fam Zheng
2015-07-20  5:25     ` Paolo Bonzini
2015-07-20  5:34       ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55AC8801.5020808@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=lersek@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.