From: Jamie Lokier <jamie@shareable.org>
To: Marcelo Tosatti <marcelo@kvack.org>
Cc: kvm-devel <kvm-devel@lists.sourceforge.net>,
    Paul Brook <paul@codesourcery.com>,
    qemu-devel@nongnu.org
Subject: Re: [kvm-devel] [Qemu-devel] [PATCH] QEMU: fsync AIO writes on flush request
Date: Sat, 29 Mar 2008 01:09:30 +0000
Message-ID: <20080329010930.GA30219@shareable.org>
In-Reply-To: <20080328183628.GB19547@dmt>

Marcelo Tosatti wrote:
> I don't think the first qemu_aio_flush() is necessary because the fsync
> request will be enqueued after pending ones:
>
> aio_fsync() function does a sync on all outstanding
> asynchronous I/O operations associated with
> aiocbp->aio_fildes.
>
> More precisely, if op is O_SYNC, then all currently queued
> I/O operations shall be completed as if by a call of
> fsync(2), and if op is O_DSYNC, this call is the asynchronous
> analog of fdatasync(2). Note that this is a request only —
> this call does not wait for I/O completion.
>
> glibc sets the priority for fsync as 0, which is the same priority AIO
> reads and writes are submitted by QEMU.
Do AIO operations always get executed in the order they are submitted?
I was under the impression this is not guaranteed, as relaxed ordering
permits better I/O scheduling (e.g. to reduce disk seeks), which is
one of the most useful points of AIO. (Otherwise you might as well
just have one worker thread doing synchronous I/O in order.)

And because of that, I was under the impression the only way to
implement a write barrier+flush in AIO was (1) wait for pending writes
to complete, then (2) aio_fsync, then (3) wait for the aio_fsync.
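
A minimal sketch of that three-step sequence, assuming POSIX AIO as
provided by glibc (older glibc versions may need -lrt at link time).
The names wait_for_aio, aio_write_barrier and demo_barrier are made up
for illustration; they are not QEMU functions, and this is not the
patch under discussion:

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: block until one AIO request has finished.
 * Returns 0 on success, otherwise an errno value. */
static int wait_for_aio(struct aiocb *cb)
{
    const struct aiocb *const list[1] = { cb };
    int err;

    while ((err = aio_error(cb)) == EINPROGRESS) {
        /* aio_suspend() may be interrupted by a signal; retry then. */
        if (aio_suspend(list, 1, NULL) == -1 && errno != EINTR)
            return errno;
    }
    if (err != 0)
        return err == -1 ? errno : err;

    /* Reap the request. A real caller would also check the byte count
     * of a write for short writes; this sketch ignores it. */
    (void)aio_return(cb);
    return 0;
}

/* Hypothetical helper: write barrier + flush for requests already
 * submitted on fd. Makes no assumption about AIO execution order. */
int aio_write_barrier(int fd, struct aiocb *pending[], int n)
{
    struct aiocb sync_cb;
    int i, err, first_err = 0;

    /* (1) Wait for every pending write to complete. */
    for (i = 0; i < n; i++) {
        err = wait_for_aio(pending[i]);
        if (err != 0 && first_err == 0)
            first_err = err;
    }
    if (first_err != 0)
        return first_err;

    /* (2) Queue the asynchronous fsync. */
    memset(&sync_cb, 0, sizeof sync_cb);
    sync_cb.aio_fildes = fd;
    if (aio_fsync(O_SYNC, &sync_cb) == -1)
        return errno;

    /* (3) Wait for the aio_fsync itself. */
    return wait_for_aio(&sync_cb);
}

/* Usage example: one AIO write followed by the barrier. */
int demo_barrier(const char *path)
{
    static const char data[] = "guest sector data";
    struct aiocb wr;
    struct aiocb *pending[1] = { &wr };
    int fd, err;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return errno;

    memset(&wr, 0, sizeof wr);
    wr.aio_fildes = fd;
    wr.aio_buf = (void *)data;
    wr.aio_nbytes = sizeof data - 1;
    wr.aio_offset = 0;

    if (aio_write(&wr) == -1)
        err = errno;
    else
        err = aio_write_barrier(fd, pending, 1);

    close(fd);
    unlink(path);
    return err;
}
```

Waiting in step (1) gives up any overlap between the writes and the
flush, but it does not depend on the implementation ordering the
aio_fsync after earlier requests.
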

I could be wrong, but I haven't seen any documentation which says
otherwise, and it's what I'd expect of an implementation. I.e.
aio_fsync() is just an asynchronous version of fsync().

The quoted man page doesn't convince me. It says "all currently
queued I/O operations shall be completed" which _could_ mean that
aio_fsync is an AIO barrier too.

But then "as if by a call of fsync(2)" implies that aio_fsync+aio_suspend
could just be replaced by fsync() with no change of semantics. So
"queued I/O operations" means what fsync() handles: dirty file data,
not in-flight AIO writes.

And you already noticed that fsync() is _not_ guaranteed to flush
in-flight AIO writes. Being the asynchronous analog, aio_fsync()
would not either.

-- Jamie