From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <488578CA.4000402@qumranet.com>
Date: Tue, 22 Jul 2008 09:06:02 +0300
From: Avi Kivity <avi@qumranet.com>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] qcow2 - safe on kill?  safe on power fail?
References: <47CF0E0C.9030807@quinthar.com>	<47CF16C5.6040102@codemonkey.ws>	<20080721181031.GA31773@shareable.org>	<4884E6F1.5020205@codemonkey.ws>	<20080721212604.GA2823@shareable.org>
	<48850A5A.3070106@codemonkey.ws>
In-Reply-To: <48850A5A.3070106@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

Anthony Liguori wrote:
> Jamie Lokier wrote:
>>> If the sector hasn't been previously allocated, then a new sector in 
>>> the file needs to be allocated.  This is going to change metadata 
>>> within the QCOW2 file and this is where it is possible to corrupt a 
>>> disk image.  The operation of allocating a new disk sector is 
>>> completely synchronous so no other code runs until this completes.  
>>> Once the disk sector is allocated, you're safe again[1].
>>>     
>>
>> My main concern is corruption of the QCOW2 sector allocation map, and
>> subsequently QEMU/KVM breaking or going wildly haywire with that file.
>>
>> With a normal filesystem, sure, there are lots of ways to get
>> corruption when certain events happen.  But you don't lose the _whole_
>> filesystem.
>>   
>
> Sure you can.  If you don't have a battery backed disk cache and are 
> using write-back (which is usually the default), you can definitely 
> get corruption of the journal.  Likewise, under the right scenarios, 
> you will get journal corruption with the default mount options of ext3 
> because it doesn't use barriers.
>

What about SCSI or SATA NCQ?  On these, barriers don't impact 
performance greatly.

> This is very hard to see happen in practice though because these 
> windows are very small--just like with QEMU.
>

The exposure window with qemu is not small.  It's as large as the page 
cache of the host.
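
To illustrate that window, here is a minimal sketch. It is not QEMU code, and
`durable_write` is a hypothetical helper invented for this example. A plain
write() returns once the data is in the host page cache; only the fsync()
asks the kernel to push it to the device. Whether the device's own write
cache is then flushed depends on the filesystem and barrier settings
discussed in this thread, so treat this as narrowing the window, not as a
guarantee.

```python
import os


def durable_write(path, data):
    """Write data to path, then ask the kernel to flush it before returning.

    Hypothetical helper for illustration only. Without the fsync() call the
    data may sit in the host page cache for an unbounded time, which is the
    exposure window described above.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write() may write fewer bytes than requested, so loop.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Flush file data and metadata from the page cache to the device.
        # Assumption: the filesystem forwards this as a flush/barrier;
        # ext3 with default mount options may not.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(data)
```

A caller that needs the allocation metadata to be on disk before continuing
would use this in place of a bare write(), at the cost of one flush per
update.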

>
>
>>> you are running QEMU with cache=off to disable host write caching.      
>>
>> Doesn't that use O_DIRECT?  O_DIRECT writes don't use barriers, and
>> fsync() does not deterministically issue a disk barrier if there's no
>> metadata change, so O_DIRECT writes are _less_ safe with disks which
>> have write-cache enabled than using normal writes.
>>   
>
> It depends on the filesystem.  ext3 never issues any barriers by 
> default :-)
>
> I would think a good filesystem would issue a barrier after an 
> O_DIRECT write.
>

Using a disk controller that supports queueing means that you can (in 
theory at least) leave writeback turned on and yet have the disk not lie 
to you about completions.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.