From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1O9C3R-0005sx-HZ for qemu-devel@nongnu.org; Tue, 04 May 2010 02:56:29 -0400
Received: from [140.186.70.92] (port=54778 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1O9C3Q-0005sK-5P for qemu-devel@nongnu.org; Tue, 04 May 2010 02:56:29 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1O9C3O-0005lg-80 for qemu-devel@nongnu.org; Tue, 04 May 2010 02:56:27 -0400
Received: from mail-vw0-f45.google.com ([209.85.212.45]:49152) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O9C3O-0005lY-4a for qemu-devel@nongnu.org; Tue, 04 May 2010 02:56:26 -0400
Received: by vws1 with SMTP id 1so429655vws.4 for ; Mon, 03 May 2010 23:56:25 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <201005041408.25069.rusty@rustcorp.com.au>
References: <20100218222220.GA14847@redhat.com> <201005041408.25069.rusty@rustcorp.com.au>
Date: Tue, 4 May 2010 07:56:24 +0100
Message-ID:
Subject: Re: [Qemu-devel] Re: [PATCH] virtio-spec: document block CMD and FLUSH
From: Stefan Hajnoczi
Content-Type: multipart/alternative; boundary=e0cb4e88769d384efd0485bf35f0
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Rusty Russell
Cc: tytso@mit.edu, kvm@vger.kernel.org, "Michael S. Tsirkin" , Neil Brown , qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org, Jens Axboe , hch@lst.de

--e0cb4e88769d384efd0485bf35f0
Content-Type: text/plain; charset=ISO-8859-1

A userspace barrier API would be very useful instead of doing fsync when
only ordering is required. I'd like to follow that discussion too.

Stefan

On 4 May 2010 05:39, "Rusty Russell" wrote:

On Fri, 19 Feb 2010 08:52:20 am Michael S. Tsirkin wrote:
> I took a stub at documenting CMD and FLU...

ISTR Christoph had withdrawn some patches in this area, and was waiting
for him to resubmit?

I've given up on figuring out the block device. What seem to me to be sane
semantics along the lines of memory barriers are foreign to disk people: they
want (and depend on) flushing everywhere.

For example, tdb transactions do not require a flush, they only require what
I would call a barrier: that prior data be written out before any future data.
Surely that would be more efficient in general than a flush! In fact, TDB
wants only writes to *that file* (and metadata) written out first; it has no
ordering issues with other I/O on the same device.

A generic I/O interface would allow you to specify "this request depends on these
outstanding requests" and leave it at that. It might have some sync flush
command for dumb applications and OSes. The userspace API might not be as
precise and only allow such a barrier against all prior writes on this fd.

ISTR someone mentioning a desire for such an API years ago, so CC'ing the
usual I/O suspects...

Cheers,
Rusty.

--e0cb4e88769d384efd0485bf35f0
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

--e0cb4e88769d384efd0485bf35f0--