From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:35051)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Tj4qZ-000285-5T for qemu-devel@nongnu.org;
	Thu, 13 Dec 2012 04:12:51 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Tj4qY-0002L8-1F for qemu-devel@nongnu.org;
	Thu, 13 Dec 2012 04:12:51 -0500
Received: from mx1.redhat.com ([209.132.183.28]:9357)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Tj4qX-0002Ky-Op for qemu-devel@nongnu.org;
	Thu, 13 Dec 2012 04:12:49 -0500
Message-ID: <50C99C0C.80704@redhat.com>
Date: Thu, 13 Dec 2012 10:12:44 +0100
From: Kevin Wolf
MIME-Version: 1.0
References: <1354925118-23061-1-git-send-email-keith.busch@intel.com>
	<50C37FF9.9090904@suse.de> <50C5D735.8070902@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH] NVMe: Initial commit to add an NVM Express device
To: "Busch, Keith"
Cc: "Michael S. Tsirkin", Stefan Hajnoczi, qemu-devel, Keith Busch,
	Hannes Reinecke, =?ISO-8859-1?Q?Andreas_F=E4rber?=

On 13.12.2012 01:13, Busch, Keith wrote:
> On Mon, Dec 10, 2012 at 7:11 AM, Stefan Hajnoczi wrote:
>> Quick pointers to get started on Kevin's suggestion:
>>
>> bdrv_aio_readv(), bdrv_aio_writev(), bdrv_aio_flush(), and
>> bdrv_aio_discard() provide the block device operations that emulated
>> storage controllers use.
>
> There seems to be an issue with the bdrv_aio_[readv/writev] routines. I
> added some additional tracing in the block and coroutine code, and well,
> long story short, the coroutine "bdrv_co_io_em" needs to call
> "qemu_coroutine_yield" before its aio callback "bdrv_co_io_em_complete"
> invokes "qemu_coroutine_enter". It does not always win this race in my
> experiments, and qemu aborts with a recursive re-entry error.
> I don't know this coroutine code well enough to propose a fix -- I'd say
> maybe use locks but I think that defeats the purpose of using coroutines
> if I understand them correctly?

The block layer, like most other qemu functions, is supposed to run
under the qemu_global_mutex (also called BQL). Do you call into it from
a different thread that doesn't take this lock?

Kevin
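
A minimal sketch of the locking rule above, written against plain pthreads rather than QEMU itself. Everything in it is a hypothetical stand-in: big_lock plays the role of the qemu_global_mutex, block_layer_submit() plays the role of a block-layer entry point such as bdrv_aio_readv(), and device_thread() is an extra thread owned by a device model. In QEMU of this era the lock would, I believe, be taken with qemu_mutex_lock_iothread() and released with qemu_mutex_unlock_iothread() instead of the pthread calls shown here.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS         4
#define REQUESTS_PER_THREAD 100000

/* Hypothetical stand-in for qemu_global_mutex (the BQL). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-ins for block-layer state that is only safe to touch under the lock. */
static int requests_in_flight;
static int requests_submitted;

/*
 * Hypothetical block-layer entry point. It does no locking of its own and
 * relies on the caller holding big_lock, as the block layer relies on the BQL.
 */
static void block_layer_submit(void)
{
    requests_in_flight++;
    requests_submitted++;
    requests_in_flight--;
}

/* A device-model thread that is not the main loop thread. */
static void *device_thread(void *opaque)
{
    (void)opaque;
    for (int i = 0; i < REQUESTS_PER_THREAD; i++) {
        /* Take the global lock before every call into the "block layer". */
        pthread_mutex_lock(&big_lock);
        block_layer_submit();
        pthread_mutex_unlock(&big_lock);
    }
    return NULL;
}

/* Start the threads, wait for them, and return the number of submissions. */
static int run_device_threads(void)
{
    pthread_t threads[NUM_THREADS];

    requests_submitted = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, device_thread, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return requests_submitted;
}
```

With the lock held around every call, run_device_threads() counts every submission. Removing the lock/unlock pair in device_thread() makes the counters racy, which is the same class of problem as a completion callback running concurrently with the code that submitted the request.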