From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] bdrv_aio_flush
Date: Tue, 2 Sep 2008 19:22:15 +0100
Message-ID: <20080902182215.GA15737@shareable.org>
In-Reply-To: <18621.28502.567886.718843@mariner.uk.xensource.com>

Ian Jackson wrote:
> Jens Axboe writes ("Re: [Qemu-devel] [PATCH] bdrv_aio_flush"):
> > On Tue, Sep 02 2008, Ian Jackson wrote:
> > > This is still not perfect because we unnecessarily flush some data
> > > thus delaying reporting completion of the WRITE FUA.  But there is at
> > > least no need to wait for _other_ writes to complete.
> > 
> > I don't see how the above works. There's no dependency on FUA and
> > non-FUA writes, in fact FUA writes tend to jump the device queue due to
> > certain other operating systems using it for conditions where that is
> > appropriate. So unless you do all writes using FUA, there's no way
> > around a flush for committing dirty data. Unfortunately we don't have a
> > FLUSH_RANGE command, it's just a big sledge hammer.
> 
> Yes, certainly you do aio_sync _some_ data that doesn't need to be.
> Without an O_FSYNC flag on aio_write that's almost inevitable.

Btw, in principle for FUA writes you can set O_SYNC or O_DSYNC on the
file descriptor just for this operation.  Either using fcntl() (but
I'm not sure I believe that would be portable and really work), or
using two file descriptors.
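
A minimal sketch of the two-descriptor variant, assuming a POSIX host
where O_DSYNC is honoured for the file in question.  The names fd_pair,
fd_pair_open, fua_pwrite and fd_pair_close are hypothetical and are not
taken from qemu; the write here is a plain synchronous pwrite() rather
than an AIO submission, just to show which descriptor carries the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: two descriptors open on the same image file. */
struct fd_pair {
    int plain;  /* ordinary writes; made durable later by an explicit sync */
    int dsync;  /* opened with O_DSYNC; used only for FUA-style writes */
};

/* Open both descriptors. Returns 0 on success, -1 on failure (errno set). */
static int fd_pair_open(const char *path, struct fd_pair *p)
{
    assert(path != NULL && p != NULL);

    p->plain = open(path, O_RDWR | O_CREAT, 0600);
    if (p->plain < 0)
        return -1;

    p->dsync = open(path, O_RDWR | O_DSYNC);
    if (p->dsync < 0) {
        close(p->plain);
        p->plain = -1;
        return -1;
    }
    return 0;
}

/* Issue one FUA-style write: it goes through the O_DSYNC descriptor, so
 * only this request pays for synchronous completion. */
static ssize_t fua_pwrite(const struct fd_pair *p, const void *buf,
                          size_t len, off_t off)
{
    return pwrite(p->dsync, buf, len, off);
}

static void fd_pair_close(struct fd_pair *p)
{
    close(p->plain);
    close(p->dsync);
}
```

Non-FUA writes would keep using p->plain, so they are not forced out
as a side effect of the FUA request.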

> But if bdrv_aio_fsync also does a flush first then you're going to
> sync _even more_ unnecessarily: the difference between `bdrv_aio_fsync
> does flush first' and `bdrv_aio_fsync does not flush' only affects
> writes that are queued but not completed when bdrv_aio_fsync is called.
> 
> That is, non-FUA writes which were submitted after the FUA write.
> There is no need to fsync these and that's what I think qemu should
> do.

I agree, that's a clever reason to make bdrv_aio_fsync() guarantee
less rather than more.

(Who knows, that might be the reason SUS doesn't offer a stronger
guarantee too, although I doubt it - if that was serious they might
have defined a more selective sync instead.)

It would be interesting to see if using aio_fsync(O_DSYNC) were slower
or faster than fdatasync() on a range of hosts - just in case the
former syncs previously submitted AIOs and the latter doesn't.
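
A rough sketch of how such a comparison might be timed, assuming POSIX
AIO is available on the host.  The helper names (now_sec,
time_fdatasync, time_aio_fsync) are made up for illustration.  It only
times a single sync after a single write; a real measurement would need
outstanding AIO writes in flight and many repetitions.  On glibc older
than 2.34 this needs -lrt at link time.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in seconds. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Dirty some data, then time fdatasync(). Returns seconds, or -1.0 on error. */
static double time_fdatasync(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        return -1.0;

    double t0 = now_sec();
    if (fdatasync(fd) != 0)
        return -1.0;
    return now_sec() - t0;
}

/* Dirty some data, then time aio_fsync(O_DSYNC) from submission until
 * completion. Returns seconds, or -1.0 on error. */
static double time_aio_fsync(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        return -1.0;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* no signal on completion */
    const struct aiocb *const list[1] = { &cb };

    double t0 = now_sec();
    if (aio_fsync(O_DSYNC, &cb) != 0)
        return -1.0;

    /* Block until the sync request is no longer in progress. */
    while (aio_error(&cb) == EINPROGRESS) {
        if (aio_suspend(list, 1, NULL) != 0 && errno != EINTR)
            return -1.0;
    }
    if (aio_return(&cb) != 0)
        return -1.0;
    return now_sec() - t0;
}
```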

Btw, on Linux aio_fsync(O_DSYNC) does the equivalent of fsync(), not
fdatasync().  This is because Glibc defines O_DSYNC to be the same as
O_SYNC.  To get fdatasync(), you have to use the Linux-AIO API and
IOCB_CMD_FDSYNC.
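
For illustration, a sketch of both points, assuming a Linux host with
<linux/aio_abi.h> and the raw io_* system calls (no libaio wrapper).
dsync_aliases_sync() reports whether the host's headers give O_DSYNC
and O_SYNC the same value.  linux_aio_fdsync() is a hypothetical helper
that submits one IOCB_CMD_FDSYNC request and waits for it.  Whether a
given kernel and filesystem accept that opcode is not assumed; the
helper just returns the negative errno if they don't.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 1 if O_DSYNC and O_SYNC are the same flag value on this host, else 0. */
static int dsync_aliases_sync(void)
{
    return O_DSYNC == O_SYNC;
}

/* Submit a single IOCB_CMD_FDSYNC on fd and wait for its completion.
 * Returns 0 on success or a negative errno value on failure. */
static int linux_aio_fdsync(int fd)
{
    assert(fd >= 0);

    aio_context_t ctx = 0;
    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        return -errno;

    struct iocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = (__u32)fd;
    cb.aio_lio_opcode = IOCB_CMD_FDSYNC;
    struct iocb *list[1] = { &cb };

    int ret = 0;
    long n = syscall(SYS_io_submit, ctx, 1L, list);
    if (n < 0) {
        ret = -errno;
    } else if (n != 1) {
        ret = -EAGAIN;              /* nothing was queued */
    } else {
        struct io_event ev;
        do {
            n = syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL);
        } while (n < 0 && errno == EINTR);

        if (n < 0)
            ret = -errno;
        else if (n != 1)
            ret = -EIO;
        else if (ev.res < 0)
            ret = (int)ev.res;      /* the request itself failed */
    }

    syscall(SYS_io_destroy, ctx);
    return ret;
}
```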

> Andrea was making some comments about scsi and virtio.  It's possible
> that these have different intended semantics and perhaps those device
> models (in hw/*) need to call flush explicitly before sync.

Or perhaps they would benefit from an async equivalent, so they don't
have to pause and can queue more requests?

-- Jamie
