From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] bdrv_aio_flush
Date: Tue, 2 Sep 2008 19:22:15 +0100
Message-ID: <20080902182215.GA15737@shareable.org>
In-Reply-To: <18621.28502.567886.718843@mariner.uk.xensource.com>

Ian Jackson wrote:
> Jens Axboe writes ("Re: [Qemu-devel] [PATCH] bdrv_aio_flush"):
> > On Tue, Sep 02 2008, Ian Jackson wrote:
> > > This is still not perfect because we unnecessarily flush some data
> > > thus delaying reporting completion of the WRITE FUA.  But there is at
> > > least no need to wait for _other_ writes to complete.
> > 
> > I don't see how the above works. There's no dependency on FUA and
> > non-FUA writes, in fact FUA writes tend to jump the device queue due to
> > certain other operating systems using it for conditions where that is
> > appropriate. So unless you do all writes using FUA, there's no way
> > around a flush for committing dirty data. Unfortunately we don't have a
> > FLUSH_RANGE command, it's just a big sledge hammer.
> 
> Yes, certainly you do aio_sync _some_ data that doesn't need to be.
> Without an O_FSYNC flag on aio_write that's almost inevitable.

Btw, in principle for FUA writes you can set O_SYNC or O_DSYNC on the
file descriptor just for this operation.  Either using fcntl() (but
I'm not sure I believe that would be portable and really work), or
using two file descriptors.
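
A minimal sketch of the two-descriptor idea, assuming a POSIX host
where O_DSYNC is honoured by open().  The names dual_fd, dual_open,
dual_pwrite and dual_close are hypothetical and not part of qemu.  The
fcntl() route is not shown: on Linux, F_SETFL is documented to change
only O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME and O_NONBLOCK, so toggling
O_SYNC that way would probably be ignored.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical pair of descriptors on the same file: one for ordinary
 * writes, one opened with O_DSYNC for FUA-style writes. */
struct dual_fd {
    int fd_normal;  /* plain descriptor, writes may sit in the host cache */
    int fd_sync;    /* O_DSYNC descriptor, writes return once data is stable */
};

/* Open both descriptors; returns 0 on success, -1 on failure. */
static int dual_open(struct dual_fd *d, const char *path)
{
    assert(d != NULL && path != NULL);
    d->fd_normal = open(path, O_RDWR | O_CREAT, 0600);
    if (d->fd_normal < 0)
        return -1;
    d->fd_sync = open(path, O_RDWR | O_DSYNC);
    if (d->fd_sync < 0) {
        close(d->fd_normal);
        return -1;
    }
    return 0;
}

/* Route the write through the O_DSYNC descriptor only when fua is set,
 * so other writes are not forced to stable storage. */
static ssize_t dual_pwrite(struct dual_fd *d, const void *buf, size_t len,
                           off_t off, int fua)
{
    assert(d != NULL);
    return pwrite(fua ? d->fd_sync : d->fd_normal, buf, len, off);
}

static void dual_close(struct dual_fd *d)
{
    assert(d != NULL);
    close(d->fd_sync);
    close(d->fd_normal);
}
```

Only the write issued with fua set pays the synchronous cost here; it
says nothing about writes already queued on the other descriptor.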

> But if bdrv_aio_fsync also does a flush first then you're going to
> sync _even more_ unnecessarily: the difference between `bdrv_aio_fsync
> does flush first' and `bdrv_aio_fsync does not flush' only affects
> writes that are queued but not completed when bdrv_aio_fsync is called.
> 
> That is, non-FUA writes which were submitted after the FUA write.
> There is no need to fsync these and that's what I think qemu should
> do.

I agree, that's a clever reason to make bdrv_aio_fsync() guarantee
less rather than more.

(Who knows, that might be the reason SUS doesn't offer a stronger
guarantee too, although I doubt it - if that was serious they might
have defined a more selective sync instead.)

It would be interesting to see if using aio_fsync(O_DSYNC) were slower
or faster than fdatasync() on a range of hosts - just in case the
former syncs previously submitted AIOs and the latter doesn't.

Btw, on Linux aio_fsync(O_DSYNC) does the equivalent of fsync(), not
fdatasync().  This is because glibc defines O_DSYNC to be the same as
O_SYNC.  To get fdatasync(), you have to use the Linux-AIO API and
IOCB_CMD_FDSYNC.

> Andrea was making some comments about scsi and virtio.  It's possible
> that these have different intended semantics and perhaps those device
> models (in hw/*) need to call flush explicitly before sync.

Or perhaps they would benefit from an async equivalent, so they don't
have to pause and can queue more requests?

-- Jamie
