Date: Thu, 4 Jun 2009 13:26:45 +0200
From: Andrea Arcangeli
Message-ID: <20090604112645.GQ25483@random.random>
References: <20090528163309.GJ20464@random.random>
 <20090530100842.GA28053@lst.de>
 <20090530121709.GA22104@random.random>
In-Reply-To: <20090530121709.GA22104@random.random>
Subject: [Qemu-devel] [PATCH] fix qemu_aio_flush
To: Christoph Hellwig
Cc: Kevin Wolf, qemu-devel@nongnu.org

Hello,

Kevin has a good point that when the high-level callback handler
completes, we should be guaranteed that all underlying layers of
callback events have completed for that specific aio operation.

So it seems the main bug was only in qemu_aio_flush() (and it was only
made visible by the debug code included in the ide_dma_cancel patch).
I guess that's a problem for savevm/reset, which assume there is no
outstanding aio waiting to be run, while in fact there can be because
of the bug. The patch is much simpler, as seen below:

----------
From: Andrea Arcangeli

qemu_aio_wait, by invoking the bh or one of the aio completion
callbacks, could end up submitting new pending aio, breaking the
invariant that qemu_aio_flush returns only when no pending aio is
outstanding (possibly a problem for migration as such).

Signed-off-by: Andrea Arcangeli
---

diff --git a/aio.c b/aio.c
index 11fbb6c..dc9b85d 100644
--- a/aio.c
+++ b/aio.c
@@ -103,11 +103,15 @@ void qemu_aio_flush(void)
     do {
         ret = 0;
 
+        /*
+         * If there are pending emulated aio start them now so flush
+         * will be able to return 1.
+         */
+        qemu_aio_wait();
+
         LIST_FOREACH(node, &aio_handlers, node) {
             ret |= node->io_flush(node->opaque);
         }
-
-        qemu_aio_wait();
     } while (ret > 0);
 }
 
diff --git a/qemu-aio.h b/qemu-aio.h
index 7967829..f262344 100644
--- a/qemu-aio.h
+++ b/qemu-aio.h
@@ -24,9 +24,10 @@ typedef int (AioFlushHandler)(void *opaque);
  * outstanding AIO operations have been completed or cancelled. */
 void qemu_aio_flush(void);
 
-/* Wait for a single AIO completion to occur. This function will until a
- * single AIO opeartion has completed. It is intended to be used as a looping
- * primative when simulating synchronous IO based on asynchronous IO. */
+/* Wait for a single AIO completion to occur. This function will wait
+ * until a single AIO event has completed and it will ensure something
+ * has moved before returning. This can issue new pending aio as
+ * result of executing I/O completion or bh callbacks. */
 void qemu_aio_wait(void);
 
 /* Register a file descriptor and associated callbacks. Behaves very similarly
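
To make the ordering problem concrete, here is a minimal standalone C
model of why qemu_aio_wait() has to run before the io_flush handlers
are polled. This is not QEMU code: wait_one(), io_flush(), flush_old()
and flush_fixed() are invented names, and the two counters stand in
for the real aio and bh machinery. With the old ordering the loop can
observe nothing pending, then have a bh submit new aio on the way out,
and still return.

#include <stdio.h>

static int outstanding;   /* aio requests submitted but not yet completed */
static int bh_pending;    /* a queued callback that will submit new aio   */

/* Model of qemu_aio_wait(): run one callback or complete one request. */
static void wait_one(void)
{
    if (bh_pending) {
        bh_pending = 0;
        outstanding++;        /* the callback submits new pending aio */
    } else if (outstanding) {
        outstanding--;        /* an outstanding request completes */
    }
}

/* Model of the io_flush handlers: non-zero while work is outstanding. */
static int io_flush(void)
{
    return outstanding > 0;
}

/* Old ordering: test first, wait afterwards.  The loop can exit with
 * ret == 0 even though wait_one() just submitted new aio on the way out. */
static void flush_old(void)
{
    int ret;
    do {
        ret = 0;
        ret |= io_flush();
        wait_one();
    } while (ret > 0);
}

/* Fixed ordering: wait first, so anything the callbacks submit is seen
 * by io_flush() before the loop decides whether to exit. */
static void flush_fixed(void)
{
    int ret;
    do {
        ret = 0;
        wait_one();
        ret |= io_flush();
    } while (ret > 0);
}

int main(void)
{
    outstanding = 0;
    bh_pending = 1;
    flush_old();
    printf("old ordering:   outstanding = %d\n", outstanding);  /* 1: leaked */

    outstanding = 0;
    bh_pending = 1;
    flush_fixed();
    printf("fixed ordering: outstanding = %d\n", outstanding);  /* 0 */
    return 0;
}

Built with any C compiler, the sketch prints outstanding = 1 for the
old ordering and 0 for the fixed one, which is the invariant the patch
restores for callers like savevm/reset.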