From: Jamie Lokier <jamie@shareable.org>
To: Jens Axboe <qemu@kernel.dk>
Cc: tytso@mit.edu, kvm@vger.kernel.org,
"Michael S. Tsirkin" <mst@redhat.com>, Neil Brown <neilb@suse.de>,
Rusty Russell <rusty@rustcorp.com.au>,
qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
hch@lst.de
Subject: Re: [Qemu-devel] Re: [PATCH] virtio-spec: document block CMD and FLUSH
Date: Tue, 4 May 2010 21:17:05 +0100
Message-ID: <20100504201705.GA4360@shareable.org>
In-Reply-To: <20100504084133.GH27497@kernel.dk>
Jens Axboe wrote:
> On Tue, May 04 2010, Rusty Russell wrote:
> > ISTR someone mentioning a desire for such an API years ago, so CC'ing the
> > usual I/O suspects...
>
> It would be nice to have a more fuller API for this, but the reality is
> that only the flush approach is really workable. Even just strict
> ordering of requests could only be supported on SCSI, and even there the
> kernel still lacks proper guarantees on error handling to prevent
> reordering there.
There are a few I/O scheduling differences that might be useful:
1. The I/O scheduler could freely move WRITEs before a FLUSH but not
   before a BARRIER. That might be useful for time-critical WRITEs,
   and for those issued at high I/O priority.
2. The I/O scheduler could move WRITEs after a FLUSH if the FLUSH is
   only for data belonging to a particular file (e.g. fdatasync with
   no file size change, even on btrfs if O_DIRECT was used for the
   writes being committed). That would entail tagging FLUSHes and
   WRITEs with a fs-specific identifier (such as inode number), opaque
   to the scheduler, which only checks equality.
3. By delaying FLUSHes through reordering as above, the I/O scheduler
   could merge multiple FLUSHes into a single command.
4. On MD/RAID, BARRIER requires every backing device to quiesce before
   sending the low-level cache-flush, and all of those to finish
   before resuming each backing device. FLUSH doesn't require as much
   synchronising. (With per-file FLUSH, see 2, it could even avoid
   FLUSH altogether to some backing devices for small files.)
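
To make points 1 and 2 concrete, here is a rough sketch of the reordering
checks a scheduler might apply. It is illustration only, not kernel code:
the Request type, the tag field and both function names are hypothetical,
and treating untagged requests conservatively is my assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical queued request; not a real kernel structure."""
    kind: str                  # "WRITE", "FLUSH" or "BARRIER"
    tag: Optional[int] = None  # opaque fs-specific id (e.g. inode number)


def may_move_before(write: Request, sync: Request) -> bool:
    """May a WRITE queued after `sync` be dispatched ahead of it?"""
    # Point 1: a later WRITE may overtake a FLUSH, never a BARRIER.
    return write.kind == "WRITE" and sync.kind == "FLUSH"


def may_move_after(write: Request, sync: Request) -> bool:
    """May a WRITE queued before `sync` be delayed until after it?"""
    # Point 2: only past a per-file FLUSH, never past a BARRIER.
    if write.kind != "WRITE" or sync.kind != "FLUSH":
        return False
    # Assumption: an untagged FLUSH covers all data, and an untagged
    # WRITE might belong to any file, so neither permits reordering.
    if sync.tag is None or write.tag is None:
        return False
    # The scheduler only checks the opaque tags for equality.
    return write.tag != sync.tag
```

For example, a WRITE tagged with inode 7 could be delayed past a FLUSH
tagged with inode 9, but not past a FLUSH tagged 7 or an untagged FLUSH.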
In other words, FLUSH can be more relaxed than BARRIER inside the
kernel. It's ironic that we think of fsync as stronger than
fbarrier outside the kernel :-)
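
The merging in point 3 can be sketched the same way. This is a toy model
under stated assumptions: the queue is a plain list of strings, every FLUSH
is a whole-device flush (no per-file tags), and coalesce_flushes is a
hypothetical name. Each FLUSH is delayed past later WRITEs until the next
BARRIER or the end of the queue, so several FLUSHes collapse into one.

```python
from typing import List


def coalesce_flushes(queue: List[str]) -> List[str]:
    """Delay FLUSHes past later WRITEs and merge them (toy model)."""
    out: List[str] = []
    pending_flush = False
    for kind in queue:
        if kind == "FLUSH":
            # Hold the FLUSH back instead of issuing it immediately.
            pending_flush = True
        elif kind == "BARRIER":
            # Nothing may cross a BARRIER: issue the held FLUSH first.
            if pending_flush:
                out.append("FLUSH")
                pending_flush = False
            out.append("BARRIER")
        else:
            # WRITEs overtake any held FLUSH (point 1).
            out.append(kind)
    if pending_flush:
        out.append("FLUSH")
    return out
```

A queue of WRITE, FLUSH, WRITE, FLUSH comes out as WRITE, WRITE, FLUSH.
The single FLUSH still covers every WRITE the earlier one had to cover;
the cost is that the first FLUSH completes later than it otherwise would.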
-- Jamie