From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44100) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X74Uz-0005SH-Qp for qemu-devel@nongnu.org; Tue, 15 Jul 2014 11:18:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1X74Us-0000E0-2n for qemu-devel@nongnu.org; Tue, 15 Jul 2014 11:18:33 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43102) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X74Ur-0000DQ-Ep for qemu-devel@nongnu.org; Tue, 15 Jul 2014 11:18:25 -0400 Message-ID: <53C5463A.8040401@redhat.com> Date: Tue, 15 Jul 2014 17:18:18 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <1405435466-20801-1-git-send-email-stefanha@redhat.com> <1405435466-20801-3-git-send-email-stefanha@redhat.com> In-Reply-To: <1405435466-20801-3-git-send-email-stefanha@redhat.com> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 2/2] thread-pool: avoid deadlock in nested aio_poll() calls List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi , qemu-devel@nongnu.org Cc: Christian Borntraeger Il 15/07/2014 16:44, Stefan Hajnoczi ha scritto: > The thread pool has a race condition if two elements complete before > thread_pool_completion_bh() runs: > > If element A's callback waits for element B using aio_poll() it will > deadlock since pool->completion_bh is not marked scheduled when the > nested aio_poll() runs. > > Fix this by marking the BH scheduled while thread_pool_completion_bh() > is executing. This way any nested aio_poll() loops will enter > thread_pool_completion_bh() and complete the remaining elements. > > Signed-off-by: Stefan Hajnoczi > --- > thread-pool.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/thread-pool.c b/thread-pool.c > index 4cfd078..23888dc 100644 > --- a/thread-pool.c > +++ b/thread-pool.c > @@ -185,6 +185,12 @@ restart: > QLIST_REMOVE(elem, all); > /* Read state before ret. */ > smp_rmb(); > + > + /* Schedule ourselves in case elem->common.cb() calls aio_poll() to > + * wait for another request that completed at the same time. > + */ > + qemu_bh_schedule(pool->completion_bh); > + > elem->common.cb(elem->common.opaque, elem->ret); > qemu_aio_release(elem); > goto restart; Please make this "return" instead, so that it is clearer that the next invocation of the BH is a continuation of this one. Paolo