From: Paolo Bonzini
Date: Mon, 05 Mar 2012 10:25:34 +0100
Message-ID: <4F54868E.1040402@redhat.com>
In-Reply-To: <4F548263.1070905@siemens.com>
References: <1330936455-23802-1-git-send-email-pbonzini@redhat.com> <4F548263.1070905@siemens.com>
Subject: Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait
To: Jan Kiszka
Cc: anthony@codemonkey.ws, qemu-devel@nongnu.org, kvm@vger.kernel.org, laurent@vivier.eu

On 05/03/2012 10:07, Jan Kiszka wrote:
> > This is quite ugly. Two threads, one running main_loop_wait and
> > one running qemu_aio_wait, can race with each other on running the
> > same iohandler. The result is that an iohandler could run while the
> > underlying socket is not readable or writable, with possibly ill effects.
>
> Hmm, isn't it a problem already that a socket is polled by two threads
> at the same time? Can't that be avoided?

We still have synchronous I/O in the device models. That's the root
cause of the bug, I suppose.

> Long-term, I'd like to cut out certain file descriptors from the main
> loop and process them completely in separate threads (for separate
> locking, prioritization etc.). Dunno how NBD works, but maybe it should
> be reworked like this already.

Me too, I even made a very simple proof of concept a couple of weeks ago
(search for a thread "switching the block layer from coroutines to
threads"). It worked, though it is obviously not upstreamable in any way.

In that world order EventNotifiers would replace qemu_aio_set_fd_handler,
and socket-based protocols such as NBD would run with blocking I/O in
their own thread.

In addition to one thread per I/O request (from a thread pool), there
would be one arbiter thread that reads replies and dispatches them to
the appropriate I/O request thread. The arbiter thread replaces the read
callback in qemu_aio_set_fd_handler.

The problem is, even though it worked, making this thread-safe is
another story. I suspect that in practice it is very difficult to do
without resurrecting RCU patches.

Paolo
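
To make the arbiter idea above concrete, here is a rough sketch of how the threads might fit together. It is not QEMU code and not the proof of concept mentioned above: it is written in Python for brevity, and every name in it (Arbiter, toy_server, run_demo, the 8-byte FRAME format) is made up for illustration. Each request runs in a pool thread that sends with blocking I/O and then sleeps on its own queue; a single arbiter thread is the only reader of the socket and hands each reply to the thread that owns the matching request id.

```python
import queue
import socket
import struct
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical wire format (not NBD): request id and value, both 32-bit.
FRAME = struct.Struct("!II")


def recv_exact(sock, n):
    """Blocking read of exactly n bytes; returns None on EOF."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


class Arbiter:
    """The only thread that reads replies from the socket."""

    def __init__(self, sock):
        self.sock = sock
        self.lock = threading.Lock()
        self.pending = {}  # request id -> queue the request thread waits on
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def register(self, req_id):
        reply_q = queue.Queue(maxsize=1)
        with self.lock:
            self.pending[req_id] = reply_q
        return reply_q

    def _run(self):
        while True:
            frame = recv_exact(self.sock, FRAME.size)
            if frame is None:  # peer closed the connection
                return
            req_id, value = FRAME.unpack(frame)
            with self.lock:
                reply_q = self.pending.pop(req_id)
            reply_q.put(value)  # wake up the owning request thread


def toy_server(sock, n):
    """Stand-in server: doubles each value, replies in reverse order."""
    frames = [FRAME.unpack(recv_exact(sock, FRAME.size)) for _ in range(n)]
    for req_id, value in reversed(frames):
        sock.sendall(FRAME.pack(req_id, value * 2))
    sock.close()


def run_demo(n=4):
    client, server = socket.socketpair()
    srv = threading.Thread(target=toy_server, args=(server, n))
    srv.start()
    arbiter = Arbiter(client)
    send_lock = threading.Lock()  # keeps request frames from interleaving

    def request(req_id):
        # Register before sending so the reply always finds its queue.
        reply_q = arbiter.register(req_id)
        with send_lock:
            client.sendall(FRAME.pack(req_id, req_id * 10))
        return reply_q.get(timeout=5)

    # One pool thread per in-flight request.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = dict(zip(range(n), pool.map(request, range(n))))
    srv.join()
    arbiter.thread.join(timeout=5)
    client.close()
    return results
```

Calling run_demo(4) returns a dict mapping each request id to its reply, even though the toy server answers out of order. The sketch sidesteps the hard part mentioned above: the only shared state is one dictionary behind one lock, which is nothing like making the real block layer thread-safe.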