From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1O9Onm-0005BZ-8l for qemu-devel@nongnu.org; Tue, 04 May 2010 16:33:10 -0400
Received: from [140.186.70.92] (port=45371 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1O9Onk-00059O-Qn for qemu-devel@nongnu.org; Tue, 04 May 2010 16:33:09 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1O9Onj-0005kx-2T for qemu-devel@nongnu.org; Tue, 04 May 2010 16:33:08 -0400
Received: from mail2.shareable.org ([80.68.89.115]:59654) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O9Oni-0005ko-Re for qemu-devel@nongnu.org; Tue, 04 May 2010 16:33:07 -0400
Date: Tue, 4 May 2010 21:32:55 +0100
From: Jamie Lokier
Subject: Re: [Qemu-devel] Re: [PATCH] virtio-spec: document block CMD and FLUSH
Message-ID: <20100504203255.GB4360@shareable.org>
References: <20100218222220.GA14847@redhat.com> <201005041408.25069.rusty@rustcorp.com.au>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201005041408.25069.rusty@rustcorp.com.au>
List-Id: qemu-devel.nongnu.org
To: Rusty Russell
Cc: tytso@mit.edu, kvm@vger.kernel.org, "Michael S. Tsirkin" , Neil Brown , qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, Jens Axboe , hch@lst.de

Rusty Russell wrote:
> On Fri, 19 Feb 2010 08:52:20 am Michael S. Tsirkin wrote:
> > I took a stub at documenting CMD and FLUSH request types in virtio
> > block. Christoph, could you look over this please?
> >
> > I note that the interface seems full of warts to me,
> > this might be a first step to cleaning them.
>
> ISTR Christoph had withdrawn some patches in this area, and was waiting
> for him to resubmit?
>
> I've given up on figuring out the block device.
> What seem to me to be sane
> semantics along the lines of memory barriers are foreign to disk people: they
> want (and depend on) flushing everywhere.
>
> For example, tdb transactions do not require a flush, they only require what
> I would call a barrier: that prior data be written out before any future data.
> Surely that would be more efficient in general than a flush! In fact, TDB
> wants only writes to *that file* (and metadata) written out first; it has no
> ordering issues with other I/O on the same device.

I've just posted elsewhere in this thread that an I/O-level flush can
be more efficient than an I/O-level barrier (which is really
implemented using a cache flush), because the barrier has stricter
ordering requirements at the I/O scheduling level.

By the time you work up to tdb, another way to think of it is
distinguishing "eager fsync" from "fsync but I'm not in a hurry -
delay as long as is convenient". The latter makes much more sense
with AIO.

> A generic I/O interface would allow you to specify "this request
> depends on these outstanding requests" and leave it at that. It
> might have some sync flush command for dumb applications and OSes.

For filesystems, it would probably be easy to label in-place
overwrites, and fdatasync data flushes when there's no file extension,
with an opaque per-file identifier for certain operations. Typically,
overwriting in place and fdatasync would match up and wouldn't need
ordering against anything else. Other operations would tend to get
labelled as ordered against everything, including these.

-- Jamie
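
To make the labelling idea above concrete, here is a rough sketch in
Python. It is a toy model only, not any existing kernel, QEMU or virtio
interface: the names Request, DependencyTracker, submit, complete and
file_tag are all hypothetical. It assumes that a labelled fdatasync-style
flush waits only for outstanding requests carrying the same per-file
label, and that an unlabelled request is ordered against everything.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    rid: int
    kind: str                 # "write" or "flush"
    file_tag: Optional[int]   # opaque per-file label; None = ordered against everything
    deps: set = field(default_factory=set)


class DependencyTracker:
    """Toy model: record which outstanding requests a new request must wait for."""

    def __init__(self):
        self.outstanding = {}   # rid -> Request still in flight
        self.next_id = 0

    def submit(self, kind, file_tag=None):
        req = Request(self.next_id, kind, file_tag)
        self.next_id += 1
        for other in self.outstanding.values():
            if file_tag is None or other.file_tag is None:
                # An unlabelled request on either side means full ordering.
                req.deps.add(other.rid)
            elif kind == "flush" and other.file_tag == file_tag:
                # A labelled flush waits only for requests with the same label.
                req.deps.add(other.rid)
        self.outstanding[req.rid] = req
        return req

    def complete(self, rid):
        # Completed requests no longer constrain later submissions.
        del self.outstanding[rid]


t = DependencyTracker()
a = t.submit("write", file_tag=1)   # in-place overwrite in file 1
b = t.submit("write", file_tag=2)   # in-place overwrite in file 2
f = t.submit("flush", file_tag=1)   # fdatasync-style flush of file 1
m = t.submit("write")               # unlabelled, e.g. a metadata update
print(f.deps, m.deps)
```

In this model the flush of file 1 depends only on the earlier write to
file 1 and not on the write to file 2, while the unlabelled write is
ordered after all three outstanding requests.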