Message-ID: <4D6281D4.50308@codemonkey.ws>
Date: Mon, 21 Feb 2011 09:16:36 -0600
From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: Strategic decision: COW format
References: <4D5BC467.4070804@redhat.com> <4D5E4271.80501@redhat.com> <20110220221357.GO4580@hall.aurel32.net> <4D62295E.1030504@redhat.com> <4D627257.7010403@redhat.com>
In-Reply-To: <4D627257.7010403@redhat.com>
List-Id: qemu-devel.nongnu.org
To: Kevin Wolf
Cc: Stefan Hajnoczi, qemu-devel@nongnu.org, Markus Armbruster, Chunqiang Tang, Aurelien Jarno

On 02/21/2011 08:10 AM, Kevin Wolf wrote:
> On 21.02.2011 14:44, Stefan Hajnoczi wrote:
>
>> On Mon, Feb 21, 2011 at 8:59 AM, Kevin Wolf wrote:
>>
>>> In fact, the only area where qcow2 performs really badly in 0.14 is
>>> cache=writethrough (which unfortunately is the default...). With
>>> cache=none it's easy to find scenarios where it provides higher
>>> throughput than QED.
>>>
>> Yeah, I'm tempted to implement parallel allocating writes now so I can
>> pick on qcow2 in all benchmarks again ;).
>>
> Heh. ;-)
>
> In the end it just shows that the differences are mainly in the
> implementation, not in the format.
>
>
>>> Anyway, there's really only one crucial difference between QED and
>>> qcow2, which is that qcow2 ensures that metadata is consistent on disk
>>> at any time whereas QED relies on a dirty flag and rebuilds metadata
>>> after a crash (basically requiring an fsck). The obvious solution if you
>>> want to have this in qcow2, is adding a dirty flag there as well.
>>>
>>> Likewise, I think FVD might provide some ideas that we can integrate as
>>> well, I just don't see a justification to include it as a separate format.
>>>
>> You think that QED and FVD can be integrated into a QCOW2-based
>> format. I agree it's possible and has some value. It isn't pretty
>> and I would prefer to work on a clean new format because that, too,
>> has value.
>>
>> In any case, the next step is to get down to specifics. Here is the
>> page with the current QCOW3 roadmap:
>>
>> http://wiki.qemu.org/Qcow3_Roadmap
>>
>> Please raise concrete requirements or features so they can be
>> discussed and captured.
>>
>> For example, journalling is an alternative to the dirty bit approach.
>> If you feel that journalling is the best technique to address
>> consistent updates, then make your case outside the context of today's
>> qcow2, QED, and FVD implementations (although benchmark data will rely
>> on current implementations). Explain how the technique would fit into
>> QCOW3 and what format changes need to be made.
>>
> I think journalling is an interesting option, but I'm not sure if we
> should target it for 0.15. As you know, there's already more than enough
> stuff to do until then, with coroutines etc. The dirty flag thing would
> be way easier to implement. We can always add a journal as a compatible
> feature in 0.16.
>
> To be honest, I'm not even sure any more that the dirty flag is that
> important. Originally we have been talking about cache=none and it
> definitely makes a big difference there because we save flushes.
> However, we're talking about cache=writethrough now and you flush on any
> write. It might be more important to make things parallel for writethrough.
>

One thing I wonder about is whether we really need to have cache=X and
wce=X.

I never really minded the fact that cache=none advertised wce=on because
we behaved effectively as if wce=on. But now that qcow2 triggers on
wce=on, I'm a bit concerned that we're introducing a subtle degradation
that most people won't realize.

Ignoring some of the problems with O_DIRECT, semantically, I think
there's a strong use-case for cache=none, wce=off.

Regards,

Anthony Liguori

> Maybe not writing out refcounts is something we should measure before we
> start implementing anything. (It's easy to disable all writes for a
> benchmark, even if the image will be broken afterwards)
>
>
>> I think this is the level we need to discuss at rather than qcow2 vs QED vs FVD.
>>
> Definitely more productive, yes.
>
> Kevin
>
>
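
For readers following the dirty-flag discussion quoted above, here is a
minimal toy sketch of the general idea: set a flag before the first write of
a session, clear it on clean close, and rebuild metadata on open if the flag
is still set. It is not taken from QED or qcow2; the class ToyImage, its
fields, and the crash_before_table_update parameter are all hypothetical, and
the "disk" is just a dict standing in for whatever survives a crash.

```python
class ToyImage:
    """Toy model of a dirty-flag image; every name here is hypothetical."""

    def __init__(self):
        # 'disk' models persistent state: a header flag, data clusters,
        # and an allocation table (the metadata).
        self.disk = {"dirty": False, "clusters": {}, "table": {}}

    def open(self):
        """Return True if a repair pass (the 'fsck') was needed."""
        repaired = self.disk["dirty"]
        if repaired:
            # The last session did not close cleanly, so the table may be
            # stale. Rebuild it from the clusters actually present.
            self.disk["table"] = {n: True for n in self.disk["clusters"]}
            self.disk["dirty"] = False
        return repaired

    def write(self, cluster, data, crash_before_table_update=False):
        if not self.disk["dirty"]:
            # Set once per session, instead of keeping metadata
            # consistent on disk after every single write.
            self.disk["dirty"] = True
        self.disk["clusters"][cluster] = data
        if crash_before_table_update:
            return  # simulate a crash: table never learns about the cluster
        self.disk["table"][cluster] = True

    def close(self):
        # Clean shutdown: metadata is known to be consistent.
        self.disk["dirty"] = False


img = ToyImage()
img.open()
img.write(0, b"a")
img.close()
img.write(1, b"b", crash_before_table_update=True)
needed_repair = img.open()  # flag was left set, so the table is rebuilt
```

The trade-off this is meant to show is the one raised in the thread: the flag
avoids ordering every metadata update, at the cost of a full rebuild after an
unclean shutdown.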