Date: Sat, 11 Oct 2008 18:50:03 -0700
From: Chris Wright
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Message-ID: <20081012015003.GD9763@acer.localdomain>
References: <48EE38B9.2050106@codemonkey.ws> <48EF1D55.7060307@redhat.com> <48F0E83E.2000907@redhat.com> <48F10DFD.40505@codemonkey.ws> <48F14814.7000805@redhat.com>
In-Reply-To: <48F14814.7000805@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: Mark Wagner
Cc: Chris Wright, Mark McLoughlin, kvm@vger.kernel.org, Laurent Vivier, qemu-devel@nongnu.org, Ryan Harper

* Mark Wagner (mwagner@redhat.com) wrote:
> I think that are two distinct arguments going on here. My main concern is
> that I don't think that this a simple "what do we make the default cache policy
> be" issue. I think that regardless of the cache policy, if something in the
> guest requests O_DIRECT, the host must honor that and not cache the data.

OK, O_DIRECT in the guest is just one example of the guest requesting data
to be synchronously written to disk.
It bypasses the guest page cache, but even page cached writes need to be
written at some point. The area of concern is any time the disk driver
issues an I/O and expects the data to be on disk when it completes
(possible low-level storage subsystem caching aside).

* Mark Wagner (mwagner@redhat.com) wrote:
> Anthony Liguori wrote:
>> It's extremely important to understand what the guarantee is. The
>> guarantee is that upon completion on write(), the data will have been
>> reported as written by the underlying storage subsystem. This does
>> *not* mean that the data is on disk.
>
> I apologize if I worded it poorly, I assume that the guarantee is that
> the data has been sent to the storage controller and said controller
> sent an indication that the write has completed. This could mean
> multiple things likes its in the controllers cache, on the disk, etc.
>
> I do not believe that this means that the data is still sitting in the
> host cache. I realize it may not yet be on a disk, but, at a minimum,
> I would expect that is has been sent to the storage controller. Do you
> consider the hosts cache to be part of the storage subsystem ?

Either wt or uncached (so host O_DSYNC or O_DIRECT) would suffice to get
it through to the host's storage subsystem, and I think that's been the
core of the discussion (plus defaults, etc).

>> In the case of KVM, even using write-back caching with the host page
>> cache, we are still honoring the guarantee of O_DIRECT. We just have
>> another level of caching that happens to be write-back.
>
> I still don't get it. If I have something running on the host that I
> open with O_DIRECT, do you still consider it not to be a violation of
> the system call if that data ends up in the host cache instead of being
> sent to the storage controller?

I suppose an argument could be made for host caching and write-back to
be considered part of the storage subsystem from the guest pov, but then
we also need to bring in the requirement for proper cache flushing.
Given that a popular Linux guest fs can be a little fast and loose, wb
and flushing isn't really an optimal choice for the integrity case.

thanks,
-chris
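
For reference, the mapping discussed above (wt or uncached on the host versus wb plus an explicit flush) could be sketched roughly as follows. This is only an illustration, assuming a Linux host; the mode names are the shorthand from this thread, and `host_open_flags`, `needs_explicit_flush` and `sync_write` are hypothetical helpers, not QEMU code.

```python
import os

# Hypothetical mode names taken from the shorthand in this thread.
# Assumes a Linux host, where os.O_DSYNC and os.O_DIRECT are defined.
HOST_FLAGS = {
    "wt": os.O_DSYNC,         # write-through: write() returns after the data is handed to storage
    "uncached": os.O_DIRECT,  # bypass the host page cache
    "wb": 0,                  # write-back: data may linger in the host page cache
}

def host_open_flags(mode):
    """Return the open(2) flags a host process might use for a guest disk image."""
    return os.O_RDWR | HOST_FLAGS[mode]

def needs_explicit_flush(mode):
    """Only write-back depends on a later flush to reach the storage subsystem."""
    return mode == "wb"

def sync_write(path, offset, data, mode):
    """Write data so that it has reached host storage by the time this returns."""
    if mode == "uncached":
        # O_DIRECT requires block-aligned buffers and offsets; left out of this sketch.
        raise NotImplementedError("aligned I/O for O_DIRECT is not shown here")
    fd = os.open(path, host_open_flags(mode))
    try:
        written = os.pwrite(fd, data, offset)
        if needs_explicit_flush(mode):
            # With write-back the flush is what provides the guarantee.
            os.fsync(fd)
        return written
    finally:
        os.close(fd)
```

The point of the sketch is that with wb the integrity guarantee rests entirely on the `os.fsync()` call being issued, which is why it depends on the guest actually requesting flushes.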