Message-ID: <4F54E64C.4050506@redhat.com>
Date: Mon, 05 Mar 2012 17:14:04 +0100
From: Paolo Bonzini
References: <1330936455-23802-1-git-send-email-pbonzini@redhat.com>
 <4F548263.1070905@siemens.com> <4F54CC8A.5010509@redhat.com>
 <4F54CDFE.3030309@redhat.com> <4F54D866.30402@redhat.com>
In-Reply-To: <4F54D866.30402@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait
To: Avi Kivity
Cc: anthony@codemonkey.ws, Jan Kiszka, qemu-devel@nongnu.org, kvm@vger.kernel.org, laurent@vivier.eu

On 05/03/2012 16:14, Avi Kivity wrote:
>> > Hmm, I don't think so.  It would need to protect execution of the
>> > iohandlers too, and pretty much everything can happen there including a
>> > nested loop.  Of course recursive mutexes exist, but it sounds like too
>> > big an axe.
> The I/O handlers would still use the qemu mutex, no?  we'd just protect
> the select() (taking the mutex from before releasing the global lock,
> and reacquiring it afterwards).

Yes, that could work, but it is _really_ ugly.  I still prefer this
patch or fixing NBD.  At least both contain the hack in a single place.
>> > I could add a generation count updated by qemu_aio_wait(), and rerun the
>> > select() only if the generation count changes during its execution.
>> >
>> > Or we can call it an NBD bug.  I'm not against that, but it seemed to me
>> > that the problem is more general.
> What about making sure all callers of qemu_aio_wait() run from
> coroutines (or threads)?  Then they just ask the main thread to wake
> them up, instead of dispatching completions themselves.

That would open another Pandora's box.  The point of having a separate
main loop is that only AIO can happen during qemu_aio_wait() or
qemu_aio_flush().  In particular you don't want the monitor to process
input while you're running another monitor command.

Paolo
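
For illustration only, the generation-count idea quoted above could be
sketched roughly as below.  This is not the RFC patch and not QEMU code:
aio_generation, nested_aio_wait(), poll_readable() and main_loop_select()
are hypothetical names, a single fd stands in for the whole fd set, and
the "race" flag merely simulates another thread running a nested wait
while the main loop has dropped the global lock around select().

```c
#include <stdbool.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical counter, bumped whenever a nested wait dispatches a handler. */
static unsigned aio_generation;

/* Stand-in for a nested wait: consumes the pending byte and bumps the count. */
static void nested_aio_wait(int fd)
{
    char byte;

    if (read(fd, &byte, 1) == 1) {
        aio_generation++;
    }
}

/* Non-blocking select() on a single fd; true if it is readable right now. */
static bool poll_readable(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0 && FD_ISSET(fd, &rfds);
}

/*
 * Stand-in for the select step of the main loop.  Returns true if the
 * handler for fd should still be dispatched.  Only pass race=true while
 * data is pending, otherwise the simulated nested read would block.
 */
static bool main_loop_select(int fd, bool race)
{
    unsigned gen = aio_generation;      /* sample before "dropping the lock" */
    bool ready = poll_readable(fd);

    if (race) {
        nested_aio_wait(fd);            /* someone else handled the fd */
    }

    if (gen != aio_generation) {
        /* The result may be stale: rerun select() before dispatching. */
        ready = poll_readable(fd);
    }
    return ready;
}
```

With one byte pending on a pipe, main_loop_select(fd, false) reports the
fd as ready; with race set, the rerun notices that the byte has already
been consumed and the stale readiness is dropped instead of dispatched.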