From: Anton Altaparmakov
Subject: Re: [PATCH] add support for vectored and async I/O to all simple filesystems
Date: Wed, 2 Nov 2005 21:04:44 +0000 (GMT)
To: Jamie Lokier
Cc: Matthew Wilcox, Benjamin LaHaise, Christoph Hellwig, akpm@osdl.org,
	linux-fsdevel@vger.kernel.org
In-Reply-To: <20051102203105.GA20756@mail.shareable.org>

On Wed, 2 Nov 2005, Jamie Lokier wrote:
> Matthew Wilcox wrote:
> > On Wed, Nov 02, 2005 at 11:21:07AM -0500, Benjamin LaHaise wrote:
> > > On Wed, Nov 02, 2005 at 11:06:30AM +0000, Jamie Lokier wrote:
> > > > So it means that any program that mustn't block, must now have a
> > > > stupid kernel version check to make sure it avoids even trying aio
> > > > system calls?  I was under the impression that the right thing to do
> > > > so far was try them, and when EINVAL is returned, use threads instead.
> > >
> > > Yes, that is correct.
> >
> > To be fair, the aio system calls were never _guaranteed_ to not block,
> > were they?  ISTR there were various corner cases that would still get
> > your task blocking while doing an aio submission.
>
> Could we have some documentation of when those corner cases occur?
>
> The main point of aio, as far as I'm aware, is to avoid the need for
> threads (or reduce the number of threads) in programs using I/O that
> shouldn't block, particularly when they are latency sensitive too.
>
> If aio has a habit of blocking from time to time, then it may still be
> useful, but it would be helpful to know that multiple threads are
> still needed to ensure a program (e.g. such as a HTTP or SMB server)
> can continue to make progress - and more helpful to know when.
>
> One particular question is: can aio calls block for a long time due to
> network delays (e.g. over NFS) and I/O delays (e.g. slow disk or CD),
> or are the corner cases restricted to things like paging during memory
> allocation, which is unavoidable one way or another anyway?

Yes, of course aio can block, and in fact it will block for arbitrary
lengths of time.  At least at present, the filesystems' implementations
of ->aio_read and ->aio_write will block left, right and center.

For a start, i_sem is downed, which can block.  Then, once we get inside
readpage or the relevant file write function, buffers may need to be
allocated for the current page, which can block.  Then the filesystem
needs to map the buffers if they are not mapped already; it may have to
take other locks to do so (again, this can block), and, even worse, it
may have to read metadata from disk to find the mapping information for
the buffers.  That is obviously a slow, blocking operation unless your
device is a ram disk.
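
To make those blocking points concrete, here is a stripped-down sketch of
what such a write path roughly looks like.  It is purely illustrative -
the function name is made up, error handling is simplified, it is only
loosely modelled on the 2.6 generic write path, and it assumes the write
fits within a single page - but the numbered comments mark the places
where an "asynchronous" submission can end up sleeping:

#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/highmem.h>
#include <linux/aio.h>

/* Illustrative only: a minimal ->aio_write showing where it can sleep. */
static ssize_t sketch_aio_write(struct kiocb *iocb, const char __user *buf,
		size_t count, loff_t pos)
{
	struct file *file = iocb->ki_filp;
	struct address_space *mapping = file->f_mapping;
	struct inode *inode = mapping->host;
	unsigned long index = pos >> PAGE_CACHE_SHIFT;
	unsigned int offset = pos & (PAGE_CACHE_SIZE - 1);
	struct page *page;
	char *kaddr;
	ssize_t err;

	down(&inode->i_sem);		/* 1: sleeps if another writer holds i_sem. */

	page = grab_cache_page(mapping, index);	/* 2: page allocation can sleep. */
	if (!page) {
		err = -ENOMEM;
		goto out_up;
	}

	/*
	 * 3: prepare_write() attaches and maps buffers.  If they are not
	 *    mapped yet, the filesystem has to look up (or allocate) the
	 *    on-disk blocks, possibly reading metadata from disk first.
	 */
	err = mapping->a_ops->prepare_write(file, page, offset, offset + count);
	if (err)
		goto out_page;

	/* 4: copying from the user buffer may fault and have to page it in. */
	kaddr = kmap(page);
	if (__copy_from_user(kaddr + offset, buf, count)) {
		kunmap(page);
		err = -EFAULT;
		goto out_page;
	}
	kunmap(page);

	err = mapping->a_ops->commit_write(file, page, offset, offset + count);
	if (!err)
		err = count;
out_page:
	unlock_page(page);
	page_cache_release(page);
out_up:
	up(&inode->i_sem);
	return err;
}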
And in the write case the filesystem may need to allocate blocks on disk
first, which in turn involves taking locks (and possibly blocking), as
well as reading/writing metadata in order to find free blocks and mark
them as allocated.  And that of course can mean on-disk accesses and
hence blocking again.

I am not sure we need documentation for all that.  It is kind of obvious
once you sit down and think about what a read and a write actually imply.

The only way you can _really_ have guaranteed async I/O is to queue the
I/O to a kernel thread's work queue and return immediately to the caller.
The only things you might then block on are allocating memory for the
"queue entry item" and waiting for the lock on the "queue" so it is safe
to add to it.

And if you do that, it becomes easy to be truly non-blocking: just
allocate with GFP_ATOMIC (and perhaps add __GFP_NORETRY and
__GFP_NORECLAIM?) and take the queue lock with a trylock.  If either of
those fails, punt the request and return immediately to the user with
-EWOULDBLOCK or whatever...  (A rough sketch of what I mean is appended
below my signature.)

You could even optimise the queue lock away by using an atomic
compare-and-exchange based queue-addition function, but that may not be
worth the extra complexity - I don't know.  I guess the big SMP folks may
see contention on that lock...  You could at least make the queues, and
hence their locks, per superblock or something...

Best regards,

	Anton
-- 
Anton Altaparmakov (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
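
PS: Here is the rough sketch of the queue-and-punt idea promised above.
It is only an illustration of the shape of the thing - every name in it
(aio_req, aio_queue, sketch_aio_submit) is made up, and the kernel thread
that drains the queue and does the actual (blocking) I/O is not shown.
The point is simply that the submission path never sleeps: memory is
allocated atomically and the queue lock is only trylocked, and if either
fails the request is punted back to the caller with -EWOULDBLOCK.

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/wait.h>
#include <linux/errno.h>
#include <linux/aio.h>

/* One queued request - the "queue entry item". */
struct aio_req {
	struct list_head list;
	struct kiocb *iocb;
	loff_t pos;
	size_t count;
};

static LIST_HEAD(aio_queue);			/* work for the kernel thread */
static DEFINE_SPINLOCK(aio_queue_lock);
static DECLARE_WAIT_QUEUE_HEAD(aio_queue_wait);	/* the thread sleeps here */

/* Called on the io_submit() path; must never sleep. */
static int sketch_aio_submit(struct kiocb *iocb, loff_t pos, size_t count)
{
	struct aio_req *req;

	/* Atomic allocation: fail instead of sleeping waiting for memory. */
	req = kmalloc(sizeof(*req), GFP_ATOMIC | __GFP_NORETRY);
	if (!req)
		return -EWOULDBLOCK;
	req->iocb = iocb;
	req->pos = pos;
	req->count = count;

	/* Trylock only: if the queue lock is contended, punt the request. */
	if (!spin_trylock(&aio_queue_lock)) {
		kfree(req);
		return -EWOULDBLOCK;
	}
	list_add_tail(&req->list, &aio_queue);
	spin_unlock(&aio_queue_lock);

	/* Wake the kernel thread; it does all the blocking work later. */
	wake_up(&aio_queue_wait);
	return 0;
}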