From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] bdrv_aio_flush
Date: Tue, 2 Sep 2008 19:22:15 +0100
Message-ID: <20080902182215.GA15737@shareable.org>
In-Reply-To: <18621.28502.567886.718843@mariner.uk.xensource.com>
Ian Jackson wrote:
> Jens Axboe writes ("Re: [Qemu-devel] [PATCH] bdrv_aio_flush"):
> > On Tue, Sep 02 2008, Ian Jackson wrote:
> > > This is still not perfect because we unnecessarily flush some data
> > > thus delaying reporting completion of the WRITE FUA. But there is at
> > > least no need to wait for _other_ writes to complete.
> >
> > I don't see how the above works. There's no dependency between FUA and
> > non-FUA writes, in fact FUA writes tend to jump the device queue due to
> > certain other operating systems using it for conditions where that is
> > appropriate. So unless you do all writes using FUA, there's no way
> > around a flush for committing dirty data. Unfortunately we don't have a
> > FLUSH_RANGE command, it's just a big sledge hammer.
>
> Yes, certainly you do aio_sync _some_ data that doesn't need to be.
> Without an O_FSYNC flag on aio_write that's almost inevitable.
Btw, in principle for FUA writes you can set O_SYNC or O_DSYNC on the
file descriptor just for this operation. Either using fcntl() (but
I'm not sure I believe that would be portable and really work), or
using two file descriptors.
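To make the two-descriptor idea concrete, here is a minimal sketch, assuming a POSIX host that honours O_DSYNC at open() time. The names fua_file, fua_open, fua_pwrite and fua_close are hypothetical and are not part of qemu. (For what it's worth, the Linux fcntl(2) man page says F_SETFL cannot change the synchronous-write flags, which is one reason to prefer a second descriptor.)

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical pair of descriptors referring to the same image file. */
struct fua_file {
    int fd_plain;   /* ordinary writes, may sit in the host cache */
    int fd_sync;    /* opened with O_DSYNC, used only for FUA writes */
};

/* Open the same path twice. Returns 0 on success, -1 on error. */
static int fua_open(struct fua_file *f, const char *path)
{
    f->fd_plain = open(path, O_RDWR | O_CREAT, 0600);
    if (f->fd_plain < 0)
        return -1;

    f->fd_sync = open(path, O_RDWR | O_DSYNC);
    if (f->fd_sync < 0) {
        close(f->fd_plain);
        return -1;
    }
    return 0;
}

/* Write len bytes at offset off; a non-zero fua selects the O_DSYNC
 * descriptor, so only this write waits for stable storage. */
static ssize_t fua_pwrite(const struct fua_file *f, const void *buf,
                          size_t len, off_t off, int fua)
{
    return pwrite(fua ? f->fd_sync : f->fd_plain, buf, len, off);
}

/* Close both descriptors. */
static void fua_close(struct fua_file *f)
{
    close(f->fd_sync);
    close(f->fd_plain);
}
```

A caller would route guest WRITE FUA requests through fua_pwrite(..., 1) and everything else through fua_pwrite(..., 0). Whether the host then avoids syncing unrelated dirty data depends on the filesystem, so this would need measuring.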
> But if bdrv_aio_fsync also does a flush first then you're going to
> sync _even more_ unnecessarily: the difference between `bdrv_aio_fsync
> does flush first' and `bdrv_aio_fsync does not flush' only affects
> writes that are queued but not completed when bdrv_aio_fsync is called.
>
> That is, non-FUA writes which were submitted after the FUA write.
> There is no need to fsync these and that's what I think qemu should
> do.
I agree, that's a clever reason to make bdrv_aio_fsync() guarantee
less rather than more.

(Who knows, that might be the reason SUS doesn't offer a stronger
guarantee too, although I doubt it. If that were a serious concern they
might have defined a more selective sync instead.)
It would be interesting to see if using aio_fsync(O_DSYNC) were slower
or faster than fdatasync() on a range of hosts - just in case the
former syncs previously submitted AIOs and the latter doesn't.
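A rough way to run that comparison could look like the sketch below. It uses only POSIX calls (aio_fsync, aio_suspend, aio_error, aio_return, fdatasync, clock_gettime); the helper names elapsed, sync_via_aio and time_sync are made up for illustration. On older glibc it may need linking with -lrt. It times a single sync after a single write, so a real test would want many iterations and outstanding AIO writes in flight.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Seconds between two CLOCK_MONOTONIC readings. */
static double elapsed(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) +
           (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Queue aio_fsync(O_DSYNC) on fd and block until it completes.
 * Returns 0 on success, -1 with errno set on failure. */
static int sync_via_aio(int fd)
{
    struct aiocb cb;
    const struct aiocb *list[1] = { &cb };
    int err;
    ssize_t ret;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;

    if (aio_fsync(O_DSYNC, &cb) < 0)
        return -1;

    while ((err = aio_error(&cb)) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    ret = aio_return(&cb);      /* always reap the request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return ret == 0 ? 0 : -1;
}

/* Dirty len bytes at offset 0, then time one sync call.
 * use_aio != 0 selects aio_fsync(O_DSYNC), otherwise fdatasync().
 * Returns the elapsed seconds, or -1.0 on any error. */
static double time_sync(int fd, const void *buf, size_t len, int use_aio)
{
    struct timespec t0, t1;
    int rc;

    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    rc = use_aio ? sync_via_aio(fd) : fdatasync(fd);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return rc == 0 ? elapsed(&t0, &t1) : -1.0;
}
```

Calling time_sync(fd, buf, len, 0) and time_sync(fd, buf, len, 1) on the same file on each host gives the two figures to compare.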
Btw, on Linux aio_fsync(O_DSYNC) does the equivalent of fsync(), not
fdatasync(). This is because Glibc defines O_DSYNC to be the same as
O_SYNC. To get fdatasync(), you have to use the Linux-AIO API and
IOCB_CMD_FDSYNC.
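For reference, submitting IOCB_CMD_FDSYNC through the raw Linux-AIO syscalls could look roughly like this. It is a Linux-only sketch; linux_aio_fdatasync is a made-up helper name, and it assumes the running kernel implements the opcode for the file in question. A kernel that does not support it may reject the request (for example with EINVAL), in which case the helper just reports failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/aio_abi.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Submit a single IOCB_CMD_FDSYNC for fd and wait for its completion.
 * Returns 0 on success, -1 with errno set on failure. */
static int linux_aio_fdatasync(int fd)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    int ret = -1;
    int saved;

    /* Create a context able to hold one in-flight request. */
    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        return -1;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = (unsigned int)fd;
    cb.aio_lio_opcode = IOCB_CMD_FDSYNC;

    if (syscall(SYS_io_submit, ctx, 1, list) == 1 &&
        syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) == 1) {
        if (ev.res == 0)
            ret = 0;
        else
            errno = (int)-ev.res;   /* res carries a negative errno */
    }

    /* Tear down the context without losing the interesting errno. */
    saved = errno;
    syscall(SYS_io_destroy, ctx);
    errno = saved;
    return ret;
}
```

A real device model would keep the context around and collect the completion from its event loop rather than blocking in io_getevents as this sketch does.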
> Andrea was making some comments about scsi and virtio. It's possible
> that these have different intended semantics and perhaps those device
> models (in hw/*) need to call flush explicitly before sync.
Or perhaps they would benefit from an async equivalent, so they don't
have to pause and can queue more requests?
-- Jamie