From: Paolo Bonzini
To: Kevin Wolf
Cc: Anthony Liguori, Anand Avati, Stefan Hajnoczi, Vijay Bellur, Amar Tumballi, qemu-devel@nongnu.org, Blue Swirl, bharata@linux.vnet.ibm.com
Date: Thu, 06 Sep 2012 12:18:41 +0200
Subject: Re: [Qemu-devel] [PATCH v6 2/2] block: Support GlusterFS as a QEMU block backend
Message-ID: <50487881.9030207@redhat.com>
In-Reply-To: <504875F6.9090302@redhat.com>

On 06/09/2012 12:07, Kevin Wolf wrote:
>> The AIOCB is already invalid at the time the callback is entered, so we
>> could release it before the call.  However, not all implementations of
>> AIO are ready for that and I'm not really in the mood for large scale
>> refactoring...
>
> By the way, what I'd really want to see in the end is to get rid of
> qemu_aio_flush() and replace it by .bdrv_drain() callbacks in each
> BlockDriver.  The way we're doing it today is a layering violation.

That's quite difficult.  Completion of an I/O operation can trigger
another I/O operation on another block device, and so on until we come
back to the first device (think of a hypothetical RAID-5 device).

> Doesn't change anything about this problem, though.  So the options
> that we have are:
>
> 1. Delay the callback using a BH.  Doing this in each driver is ugly.
>    But is there actually more than one possible callback in today's
>    coroutine world?  I only see bdrv_co_io_em_complete(), which could
>    reenter the coroutine from a BH.

Easy and safe, but it feels a bit like a time bomb.  Also, I'm not
entirely sure _why_ the bottom half works. :)

> 2. Delay the callback by just calling it later when the cleanup has
>    been completed and .io_flush() can return 0.  You say that it's hard
>    to implement for some drivers, except if the AIOCBs are leaked until
>    the end of functions like qcow2_create().

... which is what we do in posix-aio-compat.c; nobody has screamed so
far.  It is not really hard, it just has to be assessed for each driver
separately.  We can do it in gluster now and refactor the others later.

> 3. Add a delay only later in functions like bdrv_drain_all() that
>    assume that the request has completed.  Are there more of this
>    type?  AIOCBs are leaked until a bdrv_drain_all() call.  Does it
>    work with draining specific BDSes instead of everything?
>
> Unless I forgot some important point, it almost looks like option 1 is
> the easiest and safest.

I agree, but I would feel better if I understood why the bottom half
approach works in the first place.

Option 2 can be done easily in each driver (no ugliness) and refactored
later.

Paolo
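
P.S. For concreteness, a rough, untested sketch of what option 1 could
look like for bdrv_co_io_em_complete() in block.c.  Only qemu_bh_new(),
qemu_bh_schedule(), qemu_bh_delete() and qemu_coroutine_enter() are
existing APIs; the extra bh field and the bdrv_co_io_em_bh() helper are
made up for illustration only:

/* Assumes the block.c context: CoroutineIOCompletion is the existing
 * struct, extended here with a QEMUBH pointer for the deferred reentry. */
typedef struct CoroutineIOCompletion {
    Coroutine *coroutine;
    int ret;
    QEMUBH *bh;     /* hypothetical: defers the coroutine reentry */
} CoroutineIOCompletion;

/* Runs from the main loop only after the emitting AIO callback has
 * fully unwound, so the AIOCB can already have been released. */
static void bdrv_co_io_em_bh(void *opaque)
{
    CoroutineIOCompletion *co = opaque;

    qemu_bh_delete(co->bh);
    qemu_coroutine_enter(co->coroutine, NULL);
}

static void bdrv_co_io_em_complete(void *opaque, int ret)
{
    CoroutineIOCompletion *co = opaque;

    co->ret = ret;
    /* Instead of reentering the coroutine directly, schedule a BH. */
    co->bh = qemu_bh_new(bdrv_co_io_em_bh, co);
    qemu_bh_schedule(co->bh);
}

This is only meant to show the shape of the BH delay, not to claim it
answers the "why does it work" question above.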