From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [PATCH, RFC] virtio_blk: add cache flush command
Date: Mon, 11 May 2009 13:29:06 -0500
Message-ID: <4A086E72.5060302@codemonkey.ws>
References: <20090511083908.GB20082@lst.de> <4A083B7C.1000703@codemonkey.ws>
 <20090511154046.GA4226@lst.de> <4A08482E.30100@redhat.com>
 <20090511162810.GA6027@lst.de> <4A085721.2050005@redhat.com>
 <4A0864CE.10505@codemonkey.ws> <4A0867B8.2090601@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Christoph Hellwig , Rusty Russell , kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from an-out-0708.google.com ([209.85.132.249]:52955 "EHLO
 an-out-0708.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1754972AbZEKS3N (ORCPT ); Mon, 11 May 2009 14:29:13 -0400
Received: by an-out-0708.google.com with SMTP id d40so10352405and.1
 for ; Mon, 11 May 2009 11:29:13 -0700 (PDT)
In-Reply-To: <4A0867B8.2090601@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Avi Kivity wrote:
> Anthony Liguori wrote:
>
>>
>> Right now, it's fairly easy to understand.  cache=none and
>> cache=writethrough guarantee that all write operations that the guest
>> thinks have completed are completed.  cache=writeback provides no
>> such guarantee.
>
> cache=none is partially broken as well, since O_DIRECT writes might
> hit an un-battery-packed write cache.  I think cache=writeback will
> send the necessary flushes, if the disk and the underlying filesystem
> support them.

Sure, but this likely doesn't upset people that much, since O_DIRECT has
always had this behavior.  Using non-battery-backed disks with writeback
enabled introduces a larger set of possible data integrity issues.  I
think this case is acceptable to ignore because it's a straightforward
policy.
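
For readers less familiar with the three cache= modes discussed above,
here is a rough sketch of how they could translate into open(2) flags on
the host.  The mapping (none -> O_DIRECT, writethrough -> O_SYNC,
writeback -> no extra flags) is an assumption made for illustration, and
the names CACHE_MODE_FLAGS, open_flags_for_cache_mode and
guest_completion_is_guaranteed are hypothetical, not taken from any
real code base.

```python
import os

# O_DIRECT is Linux-specific; fall back to 0 where Python does not define it.
_O_DIRECT = getattr(os, "O_DIRECT", 0)

# Assumed mapping from the cache= option to extra open(2) flags.
CACHE_MODE_FLAGS = {
    "none": _O_DIRECT,          # bypass the host page cache
    "writethrough": os.O_SYNC,  # write() returns only once the kernel reports the data stable
    "writeback": 0,             # writes may linger in the host page cache
}


def open_flags_for_cache_mode(mode):
    """Return the open(2) flags for a read-write image in the given cache mode."""
    try:
        return os.O_RDWR | CACHE_MODE_FLAGS[mode]
    except KeyError:
        raise ValueError(f"unknown cache mode: {mode!r}") from None


def guest_completion_is_guaranteed(mode):
    """Restate the policy from the mail: does a write the guest sees as
    completed count as completed on the host?

    As noted in the thread, this ignores a volatile write cache on the
    disk itself, which O_DIRECT does not flush.
    """
    return mode in ("none", "writethrough")
```

The second function only encodes the policy as stated in this thread; it
says nothing about what the disk's own write cache does afterwards.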
>> cache=writeback+fsync would guarantee that only operations that
>> include a T_FLUSH are present on disk which currently includes fsyncs
>> but does not include O_DIRECT writes.  I guess whether O_SYNC does a
>> T_FLUSH also has to be determined.
>>
>> It seems too complicated to me.  If we could provide a mode where
>> cache=writeback provided as strong a guarantee as cache=writethrough,
>> then that would be quite interesting.
>
> It don't think we realistically can.

Maybe two fds?  One opened with O_SYNC and one without.  Is such a thing
sane?

>>>> (Or maybe ext3 actually is stupid enough to flush the whole fs even for
>>>> that case
>>>
>>> Sigh.
>>
>> I'm also worried about ext3 here.
>
> I'm just waiting for btrfs.

Even ext4 is saner, but we'll get lots of bug reports while ext3 remains
common.

Regards,

Anthony Liguori
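
A minimal sketch of the two-fd idea floated above, assuming a Unix host:
the same image file is opened twice, once with O_SYNC and once without,
and each write picks a descriptor depending on whether it must be stable
before it completes.  The class TwoFdImage, its methods and demo() are
hypothetical names used only for illustration; this does not answer
whether the approach is sane, it only shows what it would look like.

```python
import os
import tempfile


class TwoFdImage:
    """One image file opened twice: a cached fd and an O_SYNC fd."""

    def __init__(self, path):
        # Ordinary descriptor: writes may sit in the host page cache.
        self.cached_fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        # Synchronous descriptor: each write returns only after the kernel
        # reports that write as stable.
        self.sync_fd = os.open(path, os.O_RDWR | os.O_SYNC)

    def write(self, offset, data, durable=False):
        """Write data at offset, choosing the fd by durability requirement."""
        fd = self.sync_fd if durable else self.cached_fd
        return os.pwrite(fd, data, offset)

    def flush(self):
        """Push out whatever was written through the cached descriptor."""
        os.fsync(self.cached_fd)

    def read(self, offset, length):
        # Both descriptors share the host page cache, so either can read.
        return os.pread(self.cached_fd, length, offset)

    def close(self):
        os.close(self.sync_fd)
        os.close(self.cached_fd)


def demo():
    """Write once through each descriptor and read the result back."""
    path = os.path.join(tempfile.mkdtemp(), "disk.img")
    img = TwoFdImage(path)
    try:
        img.write(0, b"lazy")                # cached write
        img.write(4, b"sync", durable=True)  # synchronous write
        img.flush()
        return img.read(0, 8)
    finally:
        img.close()
```

One caveat with this sketch: a write on the O_SYNC descriptor covers
only that write, so earlier writes made through the cached descriptor
still need the explicit flush() before they can be relied on.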