qemu-devel.nongnu.org archive mirror
From: Avi Kivity <avi@qumranet.com>
To: qemu-devel@nongnu.org, Anthony Liguori <aliguori@us.ibm.com>,
	kvm-devel@lists.sourceforge.net,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations
Date: Tue, 22 Apr 2008 18:03:31 +0300
Message-ID: <480DFE43.8060509@qumranet.com>
In-Reply-To: <20080422142847.GC4849@shareable.org>

Jamie Lokier wrote:
> Avi Kivity wrote:
>   
>>> And video streaming on some embedded devices with no MMU!  (Due to the
>>> page cache heuristics working poorly with no MMU, sustained reliable
>>> streaming is managed with O_DIRECT and the app managing cache itself
>>> (like a database), and that needs AIO to keep the request queue busy.
>>> At least, that's the theory.)
>>>       
>> Could use threads as well, no?
>>     
>
> Perhaps.  This raises another point about AIO vs. threads:
>
> If I submit sequential O_DIRECT reads with aio_read(), will they enter
> the device read queue in the same order, and reach the disk in that
> order (allowing for reordering when worthwhile by the elevator)?
>   

Yes, unless the implementation in the kernel (or glibc) is threaded.

> With threads this isn't guaranteed and scheduling makes it quite
> likely to issue the parallel synchronous reads out of order, and for
> them to reach the disk out of order because the elevator doesn't see
> them simultaneously.
>   

If the disk is busy, it doesn't matter.  The requests will queue and the 
elevator will sort them out.  So it's just the first few requests that 
may get to disk out of order.
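
To make that concrete, here is a toy model in Python.  It is only a
sketch under simplifying assumptions: a one-way elevator, one request
served per tick, and made-up names (dispatch_order, start_time).  It is
not how any real I/O scheduler is implemented.

```python
def dispatch_order(arrivals, start_time=0):
    """Toy one-way elevator (hypothetical model, not a real scheduler).

    arrivals:   list of (arrival_tick, sector) pairs
    start_time: tick at which the disk becomes free to serve requests
    Returns the sectors in the order the disk would serve them.
    """
    remaining = sorted(arrivals)      # ordered by arrival tick
    pending = []                      # requests queued at the elevator
    served = []
    head = 0                          # current head position
    now = start_time
    while remaining or pending:
        # Admit every request that has arrived by now.
        while remaining and remaining[0][0] <= now:
            pending.append(remaining.pop(0)[1])
        if not pending:
            # Disk is idle: jump ahead to the next arrival.
            now = remaining[0][0]
            continue
        # Prefer the nearest sector ahead of the head, else wrap around.
        ahead = [s for s in pending if s >= head]
        nxt = min(ahead) if ahead else min(pending)
        pending.remove(nxt)
        served.append(nxt)
        head = nxt
        now += 1                      # one request per tick
    return served


reqs = [(0, 30), (1, 10), (2, 20)]    # threads issued reads out of order
print(dispatch_order(reqs))                 # idle disk: [30, 10, 20]
print(dispatch_order(reqs, start_time=5))   # busy disk: [10, 20, 30]
```

With an idle disk each request is served as it arrives, so the
out-of-order issue reaches the platter.  With a busy disk all three are
queued before the next dispatch and the elevator restores sector order.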

> With AIO (non-Glibc! (and non-kthreads)) it might be better at
> keeping the intended issue order, I'm not sure.
>
> It is highly desirable: O_DIRECT streaming performance depends on
> avoiding seeks (no reordering) and on keeping the request queue
> non-empty (no gap).
>
> I read a man page for some other unix, describing AIO as better than
> threaded parallel reads for reading tape drives because of this (tape
> seeks are very expensive).  But the rest of the man page didn't say
> anything more.  Unfortunately I don't remember where I read it.  I
> have no idea whether AIO submission order is nearly always preserved
> in general, or expected to be.
>   

I haven't considered tape, but this is a good point indeed.  I expect it 
doesn't make much of a difference for a loaded disk.

>   
>> It's me at fault here.  I just assumed that because it's easy to do aio 
>> in a thread pool efficiently, that's what glibc does.
>>
>> Unfortunately the code does some ridiculous things like not servicing 
>> multiple requests on a single fd in parallel.  I see absolutely no 
>> reason for it (the code says "fight for resources").
>>     
>
> Ouch.  Perhaps that relates to my thought above, about multiple
> requests to the same file causing seek storms when thread scheduling
> is unlucky?
>   

My first thought on seeing this was that it relates to a deficiency in 
older kernels when servicing multiple requests on a single fd (i.e. a 
per-file lock).  I don't know if such a deficiency ever existed, though.

>   
>> It could and should.  It probably doesn't.
>>
>> A simple thread pool implementation could come within 10% of Linux aio 
>> for most workloads.  It will never be "exactly", but for small numbers 
>> of disks, close enough.
>>     
>
> I would wait for benchmark results for I/O patterns like sequential
> reading and writing, because of potential for seeks caused by request
> reordering, before being confident of that.
>
>   

I did have measurements (and a test rig) at a previous job (where I did 
a lot of I/O work); IIRC the performance of a tuned thread pool was not 
far behind aio, for both seek-heavy and sequential workloads.  It was a 
while back though.
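
For reference, a minimal sketch of what "a simple thread pool" means
here, written in Python rather than C for brevity.  It is neither the
test rig mentioned above nor glibc's implementation; pool_read and its
parameters are hypothetical names.  It also uses ordinary buffered
pread rather than O_DIRECT, which would need aligned buffers.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pool_read(fd, requests, workers=4):
    """Emulate AIO reads with a thread pool (illustrative sketch only).

    fd:       an open file descriptor
    requests: list of (offset, length) pairs
    Returns the data for each request, in the order requested.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker issues a blocking positional read.  Several reads
        # on the same fd may be in flight at once; the order in which
        # they reach the kernel depends on thread scheduling.
        futures = [pool.submit(os.pread, fd, length, offset)
                   for offset, length in requests]
        # Results are collected in request order, not completion order.
        return [f.result() for f in futures]
```

Note that only the order of the returned results is fixed.  The order
in which the reads are issued to the kernel is not, which is exactly
the reordering concern raised above.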


-- 
error compiling committee.c: too many arguments to function
