From: Paolo Bonzini <pbonzini@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>, qemu-devel@nongnu.org
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Subject: Re: [Qemu-devel] [PATCH for-2.1? 2/2] thread-pool: avoid deadlock in nested aio_poll() calls
Date: Mon, 14 Jul 2014 10:36:21 +0200
Message-ID: <53C39685.2090400@redhat.com>
In-Reply-To: <1405077612-7806-3-git-send-email-stefanha@redhat.com>

On 11/07/2014 13:20, Stefan Hajnoczi wrote:
> The thread pool has a race condition if two elements complete before
> thread_pool_completion_bh() runs:
>
>   If element A's callback waits for element B using aio_poll() it will
>   deadlock since pool->completion_bh is not marked scheduled when the
>   nested aio_poll() runs.
>
> Fix this by marking the BH scheduled while thread_pool_completion_bh()
> is executing.  This way any nested aio_poll() loops will enter
> thread_pool_completion_bh() and complete the remaining elements.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  thread-pool.c | 27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/thread-pool.c b/thread-pool.c
> index 4cfd078..0ede168 100644
> --- a/thread-pool.c
> +++ b/thread-pool.c
> @@ -65,6 +65,9 @@ struct ThreadPool {
>      int max_threads;
>      QEMUBH *new_thread_bh;
>
> +    /* Atomic counter to detect completions while completion handler runs */
> +    uint32_t completion_token;
> +
>      /* The following variables are only accessed from one AioContext. */
>      QLIST_HEAD(, ThreadPoolElement) head;
>
> @@ -118,6 +121,7 @@ static void *worker_thread(void *opaque)
>              qemu_cond_broadcast(&pool->check_cancel);
>          }
>
> +        atomic_inc(&pool->completion_token);
>          qemu_bh_schedule(pool->completion_bh);
>      }
>
> @@ -167,9 +171,8 @@ static void spawn_thread(ThreadPool *pool)
>      }
>  }
>
> -static void thread_pool_completion_bh(void *opaque)
> +static void thread_pool_complete_elements(ThreadPool *pool)
>  {
> -    ThreadPool *pool = opaque;
>      ThreadPoolElement *elem, *next;
>
>  restart:
> @@ -196,6 +199,26 @@ restart:
>      }
>  }
>
> +static void thread_pool_completion_bh(void *opaque)
> +{
> +    ThreadPool *pool = opaque;
> +    uint32_t token;
> +
> +    do {
> +        token = atomic_mb_read(&pool->completion_token);
> +
> +        /* Stay scheduled in case elem->common.cb() makes a nested aio_poll()
> +         * call.  This avoids deadlock if element A's callback waits for
> +         * element B and both completed at the same time.
> +         */
> +        qemu_bh_schedule(pool->completion_bh);
> +
> +        thread_pool_complete_elements(pool);
> +
> +        qemu_bh_cancel(pool->completion_bh);
> +    } while (token != pool->completion_token);
> +}
> +
>  static void thread_pool_cancel(BlockDriverAIOCB *acb)
>  {
>      ThreadPoolElement *elem = (ThreadPoolElement *)acb;
>

I am not sure I understand this patch.

The simplest way to fix the deadlock is to change this in
thread_pool_completion_bh():

             elem->common.cb(elem->common.opaque, elem->ret);
             qemu_aio_release(elem);
             goto restart;

to

             /* In case elem->common.cb() makes a nested aio_poll() call,
              * next may become invalid as well.  Instead of just
              * restarting the QLIST_FOREACH_SAFE, go through the BH
              * once more, which also avoids deadlock if element A's
              * callback waits for element B and both completed at the
              * same time.
              */
             qemu_bh_schedule(pool->completion_bh);
             elem->common.cb(elem->common.opaque, elem->ret);
             qemu_aio_release(elem);

There is no change in logic; the goto is simply replaced by a BH that
acts as a continuation.  Given that, I am not sure why
pool->completion_token is necessary.

Perhaps it is just an optimization to avoid going through aio_poll()
multiple times?
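
As a rough illustration of the "BH as continuation" idea, here is a toy,
single-threaded model. It is not QEMU code: Elem, Pool, toy_aio_poll(),
elem_cb() and run_scenario() are hypothetical stand-ins, and a plain bool
takes the place of the real BH scheduling state. It models only the
scheduling question (is the completion BH pending when a callback makes a
nested poll?), not the list handling or the worker threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread-pool element. */
typedef struct Elem {
    bool done;              /* worker finished, callback not yet run */
    bool completed;         /* callback has finished */
    struct Elem *wait_for;  /* element the callback waits on, or NULL */
} Elem;

/* Hypothetical stand-in for the pool plus its completion BH state. */
typedef struct Pool {
    bool bh_scheduled;          /* models "completion_bh is scheduled" */
    bool reschedule_before_cb;  /* true: schedule the BH before each cb */
    bool stalled;               /* a nested poll found nothing to run */
    Elem *elems;
    size_t n;
} Pool;

static void completion_bh(Pool *pool);

/* Toy poll: runs the BH once if it is scheduled; returns whether it ran. */
static bool toy_aio_poll(Pool *pool)
{
    if (!pool->bh_scheduled) {
        return false;
    }
    pool->bh_scheduled = false;  /* a BH is unscheduled before it runs */
    completion_bh(pool);
    return true;
}

/* Callback that may wait for another element using nested polls. */
static void elem_cb(Pool *pool, Elem *elem)
{
    while (elem->wait_for && !elem->wait_for->completed) {
        if (!toy_aio_poll(pool)) {
            pool->stalled = true;  /* a blocking poll would hang here */
            break;
        }
    }
    elem->completed = true;
}

static void completion_bh(Pool *pool)
{
    for (size_t i = 0; i < pool->n; i++) {
        Elem *elem = &pool->elems[i];
        if (!elem->done) {
            continue;
        }
        elem->done = false;  /* take the element off the "done" set */
        if (pool->reschedule_before_cb) {
            pool->bh_scheduled = true;  /* continuation for nested polls */
        }
        elem_cb(pool, elem);
    }
}

/* Elements A and B complete together; A's callback waits for B. */
bool run_scenario(bool reschedule_before_cb)
{
    Elem elems[2] = { { .done = true }, { .done = true } };
    elems[0].wait_for = &elems[1];

    Pool pool = {
        .bh_scheduled = true,
        .reschedule_before_cb = reschedule_before_cb,
        .elems = elems,
        .n = 2,
    };

    while (toy_aio_poll(&pool)) {
        /* drain until the BH is no longer scheduled */
    }
    assert(!pool.bh_scheduled);
    return pool.stalled;
}
```

In this model run_scenario(false) reports a stall, because the nested poll
made from A's callback finds the BH unscheduled and cannot complete B.
run_scenario(true) does not stall: the BH is already pending when the
callback runs, so the nested poll re-enters it and completes B, at the cost
of one extra, empty BH run at the end.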

Paolo
