qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v2] thread-pool: fix deadlock when callbacks depend on each other
@ 2014-06-02  7:15 Marcin Gibuła
  2014-06-04 10:01 ` Stefan Hajnoczi
  0 siblings, 1 reply; 4+ messages in thread
From: Marcin Gibuła @ 2014-06-02  7:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Stefan Hajnoczi

When two coroutines submit I/O and the first coroutine depends on the
second to complete (by calling bdrv_drain_all), a deadlock may occur.

This is because both requests may have completed before the thread pool
notifier got called. Then, when the notifier gets executed and the first
coroutine calls aio_poll() to make progress, it will hang forever, as
the notifier's descriptor has already been cleared.

This patch fixes this by deferring the clearing of the notifier until no
completions are pending.

Without this patch, I could reproduce this bug with snapshot-commit in
about 1 in 10 tries. With this patch, I could not reproduce it any more.

Signed-off-by: Marcin Gibula <m.gibula@beyond.pl>
---

--- thread-pool.c	2014-04-17 15:44:45.000000000 +0200
+++ thread-pool.c	2014-06-02 09:10:25.442260590 +0200
@@ -76,6 +76,8 @@ struct ThreadPool {
      int new_threads;     /* backlog of threads we need to create */
      int pending_threads; /* threads created but not running yet */
      int pending_cancellations; /* whether we need a cond_broadcast */
+    int pending_completions; /* whether we need to rearm notifier when
+                                executing callback */
      bool stopping;
  };

@@ -110,6 +112,10 @@ static void *worker_thread(void *opaque)
          ret = req->func(req->arg);

          req->ret = ret;
+        if (req->common.cb) {
+            atomic_inc(&pool->pending_completions);
+        }
+
          /* Write ret before state.  */
          smp_wmb();
          req->state = THREAD_DONE;
@@ -173,7 +179,6 @@ static void event_notifier_ready(EventNo
      ThreadPool *pool = container_of(notifier, ThreadPool, notifier);
      ThreadPoolElement *elem, *next;

-    event_notifier_test_and_clear(notifier);
  restart:
      QLIST_FOREACH_SAFE(elem, &pool->head, all, next) {
          if (elem->state != THREAD_CANCELED && elem->state != THREAD_DONE) {
@@ -185,6 +190,8 @@ restart:
          }
          if (elem->state == THREAD_DONE && elem->common.cb) {
              QLIST_REMOVE(elem, all);
+            atomic_dec(&pool->pending_completions);
+
              /* Read state before ret.  */
              smp_rmb();
              elem->common.cb(elem->common.opaque, elem->ret);
@@ -196,6 +203,19 @@ restart:
              qemu_aio_release(elem);
          }
      }
+
+    /* Double test of pending_completions is necessary to
+     * ensure that there is no race between testing it and
+     * clearing notifier.
+     */
+    if (atomic_read(&pool->pending_completions)) {
+        goto restart;
+    }
+    event_notifier_test_and_clear(notifier);
+    if (atomic_read(&pool->pending_completions)) {
+        event_notifier_set(notifier);
+        goto restart;
+    }
  }

  static void thread_pool_cancel(BlockDriverAIOCB *acb)
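
For readers who want to see the ordering in the last hunk in isolation,
here is a minimal single-threaded sketch. It is not QEMU code: ToyPool
and the toy_* functions are hypothetical stand-ins, the notifier is
modelled as a plain atomic flag, and the late_arrivals parameter only
simulates a worker finishing between the first test and the clear. It
does not model nested aio_poll() calls; it only illustrates why the
counter is tested both before and after clearing the notifier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model, not QEMU types. */
typedef struct {
    atomic_bool notifier_set;        /* models the notifier being readable */
    atomic_int pending_completions;  /* finished requests with unrun callbacks */
    int handled;                     /* number of callbacks run so far */
} ToyPool;

static void toy_init(ToyPool *p)
{
    atomic_init(&p->notifier_set, false);
    atomic_init(&p->pending_completions, 0);
    p->handled = 0;
}

/* Worker side: account for the completion first, then signal. */
static void toy_complete(ToyPool *p)
{
    atomic_fetch_add(&p->pending_completions, 1);
    atomic_store(&p->notifier_set, true);
}

/* Run the "callbacks" for everything that is pending right now. */
static void toy_drain(ToyPool *p)
{
    while (atomic_load(&p->pending_completions) > 0) {
        /* The notifier is still set while a callback runs, so a nested
         * poll from inside the callback would not block on it. */
        assert(atomic_load(&p->notifier_set));
        atomic_fetch_sub(&p->pending_completions, 1);
        p->handled++;
    }
}

/* Handler side, in the order used by the patch:
 * drain, test, clear, test again, re-arm if needed. */
static void toy_notifier_ready(ToyPool *p, int late_arrivals)
{
restart:
    toy_drain(p);
    if (atomic_load(&p->pending_completions)) {
        goto restart;
    }

    /* Simulated race: completions that land after the first test
     * but before the notifier is cleared. */
    while (late_arrivals > 0) {
        toy_complete(p);
        late_arrivals--;
    }

    atomic_store(&p->notifier_set, false);  /* models test_and_clear */
    if (atomic_load(&p->pending_completions)) {
        atomic_store(&p->notifier_set, true);  /* re-arm, then retry */
        goto restart;
    }
}
```

In this model, a completion that slips in just before the clear is
caught by the second test, so it is handled in the same invocation
instead of being left behind with a cleared notifier.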

Thread overview: 4+ messages
2014-06-02  7:15 [Qemu-devel] [PATCH v2] thread-pool: fix deadlock when callbacks depend on each other Marcin Gibuła
2014-06-04 10:01 ` Stefan Hajnoczi
2014-06-04 10:18   ` Paolo Bonzini
2014-06-04 10:31   ` Marcin Gibuła
