Date: Tue, 22 Jul 2008 21:04:54 +0100
From: Jamie Lokier
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] qcow2 - safe on kill? safe on power fail?
Message-ID: <20080722200450.GA27753@shareable.org>
In-Reply-To: <4886316E.4080601@qumranet.com>
References: <47CF16C5.6040102@codemonkey.ws> <20080721181031.GA31773@shareable.org>
 <4884E6F1.5020205@codemonkey.ws> <48850A99.7070005@codemonkey.ws>
 <48857926.5020708@qumranet.com> <4885EA8B.5050908@codemonkey.ws>
 <4885F068.2060902@qumranet.com> <20080722161607.GA22535@shareable.org>
 <4886316E.4080601@qumranet.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Avi Kivity wrote:
> >>It's a simple matter of allocating, making sure the allocation is on
> >>disk, and recording that allocation in the tables.
> >
> >The simple implementations are only safe if sector writes are atomic.
> >
> >Opinions from Google seem divided about when you can assume that,
> >especially when the underlying file or device is not directly mapped
> >to disk sectors.
>
> That's worrying.  I guess always-allocate-on-write solves that (with
> versioned roots in well-known places), but that's not qcow2 any more --
> it's btrfs.

Fair.  Simple journalling with checksummed log records also solves the
problem without being half as clever, and it is probably easy to
retrofit to qcow2 without breaking backward compatibility.  (Old qemus
would ignore the journal.)

> And given that btrfs ought to allow file-level snapshots, perhaps
> the direction should be raw files on top of btrfs (which could be
> extended to do block sharing, yay!)

Block/extent sharing would be a nice bonus :-)

Does btrfs work on platforms other than Linux?

Also, is btrfs as good as the hype with respect to things like fsync,
barriers, cache=off consistency etc. which we've talked about?  Maybe,
but I wouldn't assume it.

A userspace btrfs-in-a-file library would be ideal for cross-platform
support, but I don't see it happening.

You can do raw, sparse files on ext3 or any other unix filesystem.
They are about as compact as qcow2, if you ignore compression.

The really big problem I found with sparse files is that copying them
locally, or copying them to another machine (e.g. with rsync), is
*incredibly* slow, because scanning the sparse regions is slow.  It
gets worse if you have, say, a 100GB virtual disk (5GB used, rest to
grow into).  "rsync --sparse" even, bizarrely, transmits a lot of zero
data over the network, or spends an age compressing it.

btrfs flat files will have the same problem.  The FIEMAP interface may
solve it generically on all Linux filesystems, if copying tools are
updated to use it.

-- Jamie
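
To make the "checksummed log records" idea above a bit more concrete,
here is a rough sketch in Python.  It is not qcow2 code and not a
proposed on-disk format: the record layout (length, CRC32, payload) and
the names append_record and replay are made up for illustration, and
the log lives in memory rather than in a file.  The point is only that
a record torn by a partial sector write fails its checksum, so replay
stops there instead of trusting it.

```python
import struct
import zlib

# Hypothetical record layout (an assumption, not qcow2's):
# 4-byte little-endian payload length, 4-byte CRC32 of the payload,
# then the payload bytes.
HEADER = struct.Struct("<II")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append one checksummed record to the in-memory log."""
    log += HEADER.pack(len(payload), zlib.crc32(payload))
    log += payload


def replay(log: bytes) -> list:
    """Return the payloads of all intact records, in order.

    Replay stops at the first record that is truncated or whose
    checksum does not match; that record is treated as a torn write
    and it, plus anything after it, is ignored.
    """
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        if length == 0:
            # Zero-filled space would otherwise look like a run of
            # valid empty records, so treat it as end of log.
            break
        start = offset + HEADER.size
        payload = bytes(log[start:start + length])
        if len(payload) != length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = start + length
    return records
```

In a real journal each record would describe a metadata update (an
allocation, a table entry), and it would have to be flushed to disk
before the update it describes.  That ordering is what makes recovery
possible; the checksum only tells you which records to believe.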