Date: Mon, 02 Jun 2014 17:32:30 +0200
From: Paolo Bonzini
To: Marcin Gibuła, qemu-devel@nongnu.org
Cc: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH] thread-pool: fix deadlock when callbacks depend on each other
Message-ID: <538C990E.1080308@redhat.com>
In-Reply-To: <538B78A9.8060804@beyond.pl>

On 01/06/2014 21:02, Marcin Gibuła wrote:
>> Good catch! The main problem with the patch is that you need to use
>> atomic_inc/atomic_dec to increment and decrement
>> pool->pending_completions.
>
> Ok.
>
>> Secondarily, event_notifier_set is pretty heavy-weight; does it work if
>> you wrap the loop like this?
>>
>> restart:
>>     QLIST_FOREACH_SAFE(elem, &pool->head, all, next) {
>>         ...
>>     }
>>     if (pool->pending_completions) {
>>         goto restart;
>>     }
>>     event_notifier_test_and_clear(notifier);
>>     if (pool->pending_completions) {
>>         event_notifier_set(notifier);
>>         goto restart;
>>     }
>
> I'll test it tomorrow. I assume you want to avoid calling
> event_notifier_set() until the function is re-entered via aio_poll?

Yes. But actually, I need to check whether it's possible to fix
bdrv_drain_all. If you're in coroutine context, you can defer the draining
to a safe point using a bottom half. If you're not in coroutine context,
perhaps bdrv_drain_all has to be made illegal. Which means a bunch of code
auditing...

Paolo

>> Finally, the same bug is also in block/linux-aio.c and
>> block/win32-aio.c.
>
> I can try with linux-aio, but my knowledge of the Windows API is zero...
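
The restart/test-and-clear sequence quoted above is easier to follow in
isolation. What follows is a minimal, standalone C sketch of that pattern,
not QEMU code: Notifier, Pool and run_ready_callbacks() are illustrative
stand-ins for QEMU's EventNotifier, ThreadPool and the QLIST_FOREACH_SAFE
walk, and the C11 atomics stand in for atomic_inc/atomic_dec on
pool->pending_completions, assuming the semantics described in the thread
(worker threads increment the counter and signal the notifier; the
completion handler must not clear the notifier while completions are still
pending, or a wakeup is lost and bdrv_drain_all can deadlock).

/*
 * Minimal standalone model of the restart/test-and-clear pattern quoted
 * above.  Notifier and Pool are illustrative stand-ins, not QEMU types:
 * the atomic_bool models the EventNotifier's eventfd state, and the C11
 * atomic_int models pool->pending_completions, which the real patch
 * manipulates with atomic_inc/atomic_dec.  Compile with e.g.
 * "cc -std=c11 sketch.c".
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    atomic_bool signalled;              /* stands in for the eventfd state */
} Notifier;

typedef struct {
    Notifier notifier;
    atomic_int pending_completions;     /* bumped by worker threads */
} Pool;

static void notifier_set(Notifier *n)
{
    atomic_store(&n->signalled, true);
}

static bool notifier_test_and_clear(Notifier *n)
{
    return atomic_exchange(&n->signalled, false);
}

/*
 * Run the completion callbacks that are currently queued.  In the real
 * code this is the QLIST_FOREACH_SAFE walk over pool->head that invokes
 * each element's callback; here each "callback" just decrements the
 * counter.
 */
static void run_ready_callbacks(Pool *pool)
{
    while (atomic_load(&pool->pending_completions) > 0) {
        atomic_fetch_sub(&pool->pending_completions, 1);
        /* ... invoke one completion callback here ... */
    }
}

/*
 * The pattern under discussion: clear the notifier only once the list
 * looks empty, then re-check pending_completions and re-arm + restart if
 * more work was queued in the meantime.  This closes the window in which
 * a worker signals the notifier between the walk and the clear (a lost
 * wakeup), without paying for one event_notifier_set() per callback.
 */
static void completion_handler(Pool *pool)
{
restart:
    run_ready_callbacks(pool);
    if (atomic_load(&pool->pending_completions)) {
        goto restart;
    }
    notifier_test_and_clear(&pool->notifier);
    if (atomic_load(&pool->pending_completions)) {
        notifier_set(&pool->notifier);
        goto restart;
    }
}

int main(void)
{
    /* Single-threaded demo; in the real pool, worker threads would be
     * incrementing pending_completions and setting the notifier
     * concurrently, which is what makes the double check necessary. */
    Pool pool = { .notifier = { .signalled = false }, .pending_completions = 3 };

    notifier_set(&pool.notifier);
    completion_handler(&pool);
    printf("pending completions after drain: %d\n",
           atomic_load(&pool.pending_completions));
    return 0;
}

The point of the second check after notifier_test_and_clear() is that a
completion queued just before the clear would otherwise leave the counter
non-zero with the notifier unarmed; re-setting the notifier before
restarting also keeps re-entry via aio_poll working, which is what the
question about aio_poll above is getting at.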