From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=36569 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Pi7zq-0007sx-Qb for qemu-devel@nongnu.org; Wed, 26 Jan 2011 11:13:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Pi7zp-0006E4-C8 for qemu-devel@nongnu.org; Wed, 26 Jan 2011 11:13:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:3469) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pi7zp-0006Dr-0A for qemu-devel@nongnu.org; Wed, 26 Jan 2011 11:13:25 -0500 Message-ID: <4D40481E.9040309@redhat.com> Date: Wed, 26 Jan 2011 18:13:18 +0200 From: Avi Kivity MIME-Version: 1.0 Subject: Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O References: <1295688567-25496-1-git-send-email-stefanha@linux.vnet.ibm.com> <1295688567-25496-12-git-send-email-stefanha@linux.vnet.ibm.com> <4D40406B.2070302@redhat.com> <4D4042A8.2040903@redhat.com> <4D4046EF.3050108@codemonkey.ws> In-Reply-To: <4D4046EF.3050108@codemonkey.ws> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Anthony Liguori Cc: Kevin Wolf , Anthony Liguori , Stefan Hajnoczi , qemu-devel@nongnu.org On 01/26/2011 06:08 PM, Anthony Liguori wrote: > On 01/26/2011 09:50 AM, Kevin Wolf wrote: >> Am 26.01.2011 16:40, schrieb Avi Kivity: >>> On 01/22/2011 11:29 AM, Stefan Hajnoczi wrote: >>>> Converting qcow2 to use coroutines is fairly simple since most of >>>> qcow2 >>>> is synchronous. The synchronous I/O functions likes bdrv_pread() now >>>> transparently work when called from a coroutine, so all the >>>> synchronous >>>> code just works. >>>> >>>> The explicitly asynchronous code is adjusted to repeatedly call >>>> qcow2_aio_read_cb() or qcow2_aio_write_cb() until the request >>>> completes. >>>> At that point the coroutine will return from its entry function and >>>> its >>>> resources are freed. >>>> >>>> The bdrv_aio_readv() and bdrv_aio_writev() user callback is now >>>> invoked >>>> from a BH. This is necessary since the user callback code does not >>>> expect to be executed from a coroutine. >>>> >>>> This conversion is not completely correct because the safety the >>>> synchronous code does not carry over to the coroutine version. >>>> Previously, a synchronous code path could assume that it will never be >>>> interleaved with another request executing. This is no longer true >>>> because bdrv_pread() and bdrv_pwrite() cause the coroutine to yield >>>> and >>>> other requests can be processed during that time. >>>> >>>> The solution is to carefully introduce checks so that pending requests >>>> do not step on each other's toes. That is left for a future patch... >>> The way I thought of doing this is: >>> >>> qcow_aio_write(...) >>> { >>> execute_in_coroutine { >>> co_mutex_lock(&bs->mutex); >>> do_qcow_aio_write(...); // original qcow code >>> co_mutex_release(&bs->mutex); > > The release has to be executed in the call back. > > I think it's a bit nicer to not do a mutex, but rather to have a > notion of freezing/unfreezing the block queue and instead do: > > completion() { > bdrv_unfreeze(bs); > } > > coroutine { > bdrv_freeze(bs); > do_qcow_aio_write(completion); > } > > Freeze/unfreeze is useful in a number of other places too (like > snapshotting). Serializing against a global mutex has the advantage that it can be treated as a global lock that is decomposed into fine-grained locks. For example, we can start the code conversion from an explict async model to a threaded sync model by converting the mutex into a shared/exclusive lock. Operations like read and write take the lock for shared access (and take a fine-grained mutex on the metadata cache entry), while operation like creating a snapshot take the lock for exclusive access. That doesn't work with freeze/thaw. -- error compiling committee.c: too many arguments to function