From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: JFYI: ext4 bug triggerable by kvm
Date: Tue, 17 Aug 2010 17:59:07 +0300
Message-ID: <4C6AA3BB.5020103@redhat.com>
References: <4C694E7D.3060600@codemonkey.ws> <20100816184237.GA16579@infradead.org> <4C69A0C4.2080102@codemonkey.ws> <20100817090755.GA11110@infradead.org> <4C6A86E4.9080600@codemonkey.ws> <20100817130702.GA16635@infradead.org> <4C6A9AB5.6050404@codemonkey.ws> <20100817142808.GA22412@infradead.org> <4C6A9F4F.8040209@msgid.tls.msk.ru> <4C6AA061.80704@codemonkey.ws> <20100817144651.GB10280@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Anthony Liguori, Michael Tokarev, KVM list, Kevin Wolf
To: Christoph Hellwig
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:8059 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750837Ab0HQO7P (ORCPT); Tue, 17 Aug 2010 10:59:15 -0400
In-Reply-To: <20100817144651.GB10280@infradead.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 08/17/2010 05:46 PM, Christoph Hellwig wrote:
> On Tue, Aug 17, 2010 at 09:44:49AM -0500, Anthony Liguori wrote:
>> I think the real issue is we're mixing host configuration with guest
>> visible state.
>
> The last time I proposed to decouple the two you and Avi were heavily
> opposed to it..

I wasn't, as far as I can recall.

>> With O_SYNC, we're causing cache=writethrough to do writethrough
>> through two layers of the storage hierarchy.  I don't think that's
>> necessary or desirable though.
>
> It's absolutely necessary if we tell the guest that we do not have
> a volatile write cache.  Which is the only good reason to use
> cache=writethrough anyway - except for dealing with old guests that
> can't handle a volatile write cache it's an absolutely stupid mode of
> operation.

I agree, but there's another case: tell the guest that we have a write
cache, use O_DSYNC, but only flush the disk cache on guest flushes.
The reason for this is that if we don't use O_DSYNC, the page cache can
grow to huge proportions.  While this is allowed by the contract between
the virtual drive and the guest, guest software and users won't expect a
huge data loss on power failure, only a minor data loss from the last
fraction of a second before the failure.

I believe this can be approximated by mounting the host filesystem with
barrier=0?

--
error compiling committee.c: too many arguments to function