From: Jamie Lokier <jamie@shareable.org>
To: qemu-devel@nongnu.org
Cc: kvm-devel@lists.sourceforge.net, Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations
Date: Tue, 22 Apr 2008 16:12:20 +0100
Message-ID: <20080422151219.GA10229@shareable.org>
In-Reply-To: <480DFBD9.4030802@codemonkey.ws>
Anthony Liguori wrote:
> >Perhaps. This raises another point about AIO vs. threads:
> >
> >If I submit sequential O_DIRECT reads with aio_read(), will they enter
> >the device read queue in the same order, and reach the disk in that
> >order (allowing for reordering when worthwhile by the elevator)?
>
> There's no guarantee that any sort of order will be preserved by AIO
> requests. The same is true with writes. This is what fdsync is for, to
> guarantee ordering.
You misunderstand. I'm not talking about guarantees, I'm talking
about expectations for the performance effect.
Basically, to do performant streaming read with O_DIRECT you need two
things:
1. Overlap at least 2 requests, so the device is kept busy.
2. Requests must be sent to the disk in a good order, which is usually
   (but not always) sequential offset order.
The kernel does this itself with buffered reads, doing readahead.
It works very well, unless you have other problems caused by readahead.
With O_DIRECT, an application has to do the equivalent of readahead
itself to get performant streaming.
If the app uses two threads calling pread(), it's hard to ensure the
kernel even _sees_ the first two calls in sequential offset order.
You spawn two threads, and then both threads call pread() with
non-deterministic scheduling. The problem starts before even entering
the kernel.
Then, depending on I/O scheduling in the kernel, it might send the
worse-ordered pread() to the disk immediately, followed later by a
backward head seek for the other one. The elevator cannot fix this: it
doesn't have enough information, unless it adds artificial delays. But
artificial delays can hurt too, so that's not optimal either.
After that, the two threads tend to call pread() in the best order
provided there are no scheduling conflicts, but are easily disrupted by
other tasks, especially on SMP (one reading thread per CPU, so when
one of them is descheduled, the other continues and issues a request
in the 'wrong' order).
With AIO, even though you can't be sure what the kernel does, you can
be sure the kernel receives aio_read() calls in the exact order which
is most likely to perform well. Application knowledge of its access
pattern is passed along better.
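A minimal sketch of the submission pattern described above, assuming
POSIX AIO (aio_read(), aio_suspend(), aio_error(), aio_return()). The
function name stream_read_in_order() and the DEPTH and CHUNK values are
hypothetical, chosen only for illustration. It keeps DEPTH requests in
flight and always calls aio_read() in increasing offset order. Note that
whether that order survives all the way to the kernel depends on the AIO
implementation: glibc's POSIX AIO, for example, is built on helper
threads in user space. For O_DIRECT the caller would open the fd with
O_DIRECT; the CHUNK-aligned buffer is assumed to satisfy the device's
alignment rules.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DEPTH 2     /* requests kept in flight (assumed value) */
#define CHUNK 4096  /* bytes per request and buffer alignment (assumed value) */

/* Queue one read of CHUNK bytes at offset `off`. */
static int submit(struct aiocb *cb, int fd, void *buf, off_t off)
{
    memset(cb, 0, sizeof *cb);
    cb->aio_fildes = fd;
    cb->aio_buf = buf;
    cb->aio_nbytes = CHUNK;
    cb->aio_offset = off;
    return aio_read(cb);
}

/* Block until `cb` completes; return its byte count, or -1 on error. */
static ssize_t wait_for(struct aiocb *cb)
{
    const struct aiocb *list[1] = { cb };
    while (aio_error(cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);  /* may return early on a signal; loop retries */
    int err = aio_error(cb);
    ssize_t n = aio_return(cb);
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}

/* Hypothetical helper: read `fd` from offset 0 to EOF, submitting requests
 * in strictly increasing offset order with up to DEPTH of them queued.
 * Returns the total bytes read, or -1 on error. Assumes a regular file,
 * where a short read only happens at EOF. */
long stream_read_in_order(int fd)
{
    struct aiocb cb[DEPTH];
    void *mem;
    off_t next = 0;
    long total = 0;
    int inflight = 0, slot = 0, done = 0, failed = 0;

    assert(fd >= 0);
    if (posix_memalign(&mem, CHUNK, (size_t)DEPTH * CHUNK) != 0)
        return -1;
    char *base = mem;

    /* Prime the queue: offsets 0, CHUNK, 2*CHUNK, ... in that order. */
    for (int i = 0; i < DEPTH; i++) {
        if (submit(&cb[i], fd, base + (size_t)i * CHUNK, next) != 0) {
            failed = 1;
            break;
        }
        next += CHUNK;
        inflight++;
    }

    /* Retire the oldest request, then reuse its slot for the next offset. */
    while (inflight > 0) {
        ssize_t n = wait_for(&cb[slot]);
        inflight--;
        if (n < 0)
            failed = 1;
        else
            total += n;   /* a real reader would consume the data here */
        if (n < CHUNK)
            done = 1;     /* EOF or error: stop submitting, just drain */
        if (!done && !failed) {
            if (submit(&cb[slot], fd, base + (size_t)slot * CHUNK, next) == 0) {
                next += CHUNK;
                inflight++;
            } else {
                failed = 1;
            }
        }
        slot = (slot + 1) % DEPTH;
    }

    free(mem);
    return failed ? -1 : total;
}
```

All submissions come from a single thread, so the order in which
aio_read() is called is fixed by the program rather than by the
scheduler, which is the contrast with the two-thread pread() case. On
older glibc versions this may need linking with -lrt.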
As I've said, I saw a man page which described why this makes AIO
superior to using threads for reading tapes on that OS. So it's not a
completely spurious point.
This has nothing to do with guarantees.
-- Jamie