From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: JFYI: ext4 bug triggerable by kvm
Date: Tue, 17 Aug 2010 09:20:37 -0500
Message-ID: <4C6A9AB5.6050404@codemonkey.ws>
References: <4C694483.5010903@msgid.tls.msk.ru>
 <4C694E7D.3060600@codemonkey.ws>
 <20100816184237.GA16579@infradead.org>
 <4C69A0C4.2080102@codemonkey.ws>
 <20100817090755.GA11110@infradead.org>
 <4C6A86E4.9080600@codemonkey.ws>
 <20100817130702.GA16635@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Michael Tokarev, KVM list, Kevin Wolf
To: Christoph Hellwig
Return-path:
Received: from mail-gw0-f46.google.com ([74.125.83.46]:54554 "EHLO
 mail-gw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755446Ab0HQOUn (ORCPT ); Tue, 17 Aug 2010 10:20:43 -0400
Received: by gwj17 with SMTP id 17so1523801gwj.19 for ;
 Tue, 17 Aug 2010 07:20:41 -0700 (PDT)
In-Reply-To: <20100817130702.GA16635@infradead.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 08/17/2010 08:07 AM, Christoph Hellwig wrote:
>> The point is that we don't want to flush the disk write cache. The
>> intention of writethrough is not to make the disk cache writethrough
>> but to treat the host's cache as writethrough.
>>
>
> We need to make sure data is not in the disk write cache if want to
> provide data integrity.

When the guest explicitly flushes the emulated disk's write cache. Not
on every single write completion.

> It has nothing to do with the qemu caching
> mode - for data=writeback or none it's commited as part of the fdatasync
> call, and for data=writethrough it's commited as part of the O_SYNC
> write. Note that both these path end up calling the filesystems ->fsync
> method which is what's require to make writes stable. That's exactly
> what is missing out in sync_file_range, and that's why that API is not
> useful at all for data integrity operations.
For normal writes from a guest, we don't need to follow the write with
an fsync(). We should only need to issue an fsync() given an explicit
flush from the guest.

> It's also what makes
> fsync slow on extN - but the fix to that is not to not provide data
> integrity but rather to make fsync fast. There's various other
> filesystems that can already do it, and if you insist on using those
> that are slow for this operation you'll have to suffer until that
> issue is fixed for them.
>

fsync() being slow is orthogonal to my point. I don't see why we need
to do an fsync() on *every* write. It should only be necessary when a
guest injects an actual barrier.

Regards,

Anthony Liguori