From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=47530 helo=eggs.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1ObKfE-0003mX-MP
    for qemu-devel@nongnu.org; Tue, 20 Jul 2010 17:47:52 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
    (envelope-from )
    id 1ObKfA-0001gi-Cs
    for qemu-devel@nongnu.org; Tue, 20 Jul 2010 17:47:48 -0400
Received: from mail-iw0-f173.google.com ([209.85.214.173]:38284)
    by eggs.gnu.org with esmtp (Exim 4.69)
    (envelope-from )
    id 1ObKfA-0001ge-8T
    for qemu-devel@nongnu.org; Tue, 20 Jul 2010 17:47:44 -0400
Received: by iwn6 with SMTP id 6so6361912iwn.4
    for ; Tue, 20 Jul 2010 14:47:43 -0700 (PDT)
Message-ID: <4C46197E.9010400@codemonkey.ws>
Date: Tue, 20 Jul 2010 16:47:42 -0500
From: Anthony Liguori
MIME-Version: 1.0
Subject: Re: [Qemu-devel] Docs for and debugging of Asynchronous I/O
References:
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Ot ten Thije
Cc: qemu-devel@nongnu.org

On 07/20/2010 01:34 PM, Ot ten Thije wrote:
> Hello,
>
> I am working on fixing the savevm/loadvm functionality in the Android
> emulator, and the two issues I've encountered so far both appear to
> stem from the asynchronous I/O (AIO) code. In both cases, the emulator
> busy-waits indefinitely for an operation that never signals completion.
>
> Unfortunately I am not really familiar with AIO, so I was hoping one
> of the emulator devs could point me some resources (design docs,
> general introduction, etc.). I've done some searching myself and found
> some docs for the Linux kernel AIO implementation
> (http://lse.sourceforge.net/io/aio.html), but I'm not sure to what
> extent it applies to the QEMU code.
>
> Tips for debugging AIO would also be greatly appreciated. I can trace
> the execution until I am within the (emulated) device driver (i.e.
> block/qcow2.c:qcow_aio_writev()), but haven't been able to pinpoint
> the exact location where the actual async call is made. This makes it
> difficult to identify the code that should signal completion back to
> the main process (and apparently fails to do so). I know this code is
> called though, because some asynchronous calls *do* signal completion.

TCG translates guest code into small sequences of host code (basic
blocks). These basic blocks can be chained together such that one block
jumps directly to the next. The effect is that a guest can run a tight
loop in which guest code runs continuously without giving QEMU a chance
to do any work.

To allow QEMU to make forward progress in such a scenario, we program
signals to fire. Currently, the signals fire in a number of
circumstances, including when AIO operations complete or when a
periodic timer needs to fire.

When dealing with multiple threads, it's very easy to screw things up
by not masking signals properly. Oftentimes, this is hidden because the
periodic timer runs often enough that it doesn't matter if you miss a
signal.

An exception, however, would be emulation of synchronous code. This
tends to happen in qcow2 metadata operations since they are still
synchronous. To complete this emulation, we have to block the current
thread until the I/O operation completes. But since QEMU isn't
re-entrant, we can't run the full main loop, as that could trigger
re-entrancy in qcow2. To work around this, we implement "idle bottom
halves", which are special bottom halves that are run by the normal I/O
loop but also by a special I/O loop used exclusively for emulating
synchronous writes.

To further complicate matters, non-x86 platforms (like ARM) are more
likely to not use a periodic timer, which makes these bugs much more
obvious.

>
> I realize that the Android emulator is a rather heavy fork of QEMU, so
> giving specific advice will probably be difficult.
> However, the overall approach is still the same, so I hope you can
> help me get a better understanding of that.

This is the problem with forking. This is very hairy code that requires
careful attention to detail. If you're introducing any type of
threading, disk emulation, or changes to the block subsystem, chances
are you've done it wrong.

Regards,

Anthony Liguori

>
> Ot ten Thije
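
On the signal-masking point in the reply: the sketch below is not QEMU code, only a small illustration of the general POSIX behaviour behind it. A signal sent to a thread that has it blocked is not delivered; it stays pending until the mask is lifted. The choice of SIGUSR1 and the names demo_masked_signal and on_completion are assumptions made for the example, standing in for whatever signal and handler report AIO completion. A thread that waits for a completion signal while (wrongly) masking it would, in the same way, never see it unless something else such as a periodic timer wakes it up.

```python
import signal
import threading


def demo_masked_signal(signum=signal.SIGUSR1):
    """Show that a blocked signal stays pending until it is unblocked.

    Returns (pending_while_blocked, handled_while_blocked,
    handled_after_unblock). Must run in the main thread on a POSIX system.
    """
    handled = []

    def on_completion(received, frame):
        # Hypothetical stand-in for "an AIO request has completed".
        handled.append(received)

    old_handler = signal.signal(signum, on_completion)
    # Block the signal in this thread, as a thread with the wrong mask would.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        # Direct the signal at this thread; it cannot be delivered yet.
        signal.pthread_kill(threading.get_ident(), signum)
        pending_while_blocked = signum in signal.sigpending()
        handled_while_blocked = bool(handled)
    finally:
        # Restoring the old mask lets the pending signal through.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    handled_after_unblock = handled == [signum]
    if old_handler is not None:
        signal.signal(signum, old_handler)
    return pending_while_blocked, handled_while_blocked, handled_after_unblock


if __name__ == "__main__":
    print(demo_masked_signal())
```

When debugging a hang of this kind, checking which signals each thread has blocked and which are pending can help tell a completion that was never raised from one that was raised but masked.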