Date: Tue, 22 Apr 2008 16:36:20 +0100
From: Jamie Lokier
Subject: Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations
Message-ID: <20080422153616.GC10229@shareable.org>
In-Reply-To: <480DFE43.8060509@qumranet.com>
To: qemu-devel@nongnu.org
Cc: kvm-devel@lists.sourceforge.net, Anthony Liguori, Marcelo Tosatti

Avi Kivity wrote:
> > Perhaps. This raises another point about AIO vs. threads:
> >
> > If I submit sequential O_DIRECT reads with aio_read(), will they enter
> > the device read queue in the same order, and reach the disk in that
> > order (allowing for reordering when worthwhile by the elevator)?
>
> Yes, unless the implementation in the kernel (or glibc) is threaded.
>
> > With threads this isn't guaranteed and scheduling makes it quite
> > likely to issue the parallel synchronous reads out of order, and for
> > them to reach the disk out of order because the elevator doesn't see
> > them simultaneously.
>
> If the disk is busy, it doesn't matter. The requests will queue and the
> elevator will sort them out. So it's just the first few requests that
> may get to disk out of order.

There are two cases where it matters to a read-streaming app:

1. The disk isn't busy with anything else, and maximum streaming
   performance is desired.

2. The disk is busy with unrelated things, but you're using I/O
   priorities to give the streaming app near-absolute priority. Then
   you need to keep overlapped streaming requests outstanding,
   otherwise the disk is given to lower-priority I/O. If that happens
   often, you lose, and the priority is ineffective.

   Because one of the streaming requests is usually being serviced,
   the elevator has similar limitations as for a disk which is not
   busy with anything else.

> I haven't considered tape, but this is a good point indeed. I expect it
> doesn't make much of a difference for a loaded disk.

Yes, as long as it's loaded with unrelated requests at the same I/O
priority, the elevator has time to sort requests and hide thread
scheduling artifacts.

Btw, regarding QEMU: QEMU receives requests _after_ they have been
sorted by the guest's elevator, then submits them to the host's
elevator. If the guest and host elevators are both configured
'anticipatory', do the anticipatory delays add up?

-- 
Jamie