From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751394AbVKEBEX (ORCPT ); Fri, 4 Nov 2005 20:04:23 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751395AbVKEBEW (ORCPT ); Fri, 4 Nov 2005 20:04:22 -0500 Received: from agminet01.oracle.com ([141.146.126.228]:31929 "EHLO agminet01.oracle.com") by vger.kernel.org with ESMTP id S1751394AbVKEBEW (ORCPT ); Fri, 4 Nov 2005 20:04:22 -0500 Message-ID: <436C04FE.6000708@oracle.com> Date: Fri, 04 Nov 2005 17:03:58 -0800 From: Zach Brown User-Agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Christoph Hellwig CC: linux-aio@kvack.org, linux-kernel@vger.kernel.org, Benjamin LaHaise , Andrew Morton Subject: Re: [Patch] vectored aio: IO_CMD_P{READ,WRITE}V and fops->aio_{read,write}v References: <20051102233020.27835.89951.sendpatchset@volauvent.pdx.zabbo.net> <20051105002406.GA11235@lst.de> In-Reply-To: <20051105002406.GA11235@lst.de> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Brightmail-Tracker: AAAAAQAAAAI= X-Whitelist: TRUE Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Christoph Hellwig wrote: > On Wed, Nov 02, 2005 at 03:27:29PM -0800, Zach Brown wrote: > >>vectored aio: IO_CMD_P{READ,WRITE}V and fops->aio_{read,write}v > > The aio.c portion looks nice. I'm not happy about the filesystems bits. > The last thing we want is another set of read/write file operations. Yeah, I can't possibly disagree. > So > as part of the patch (it'll probably grow into a series) we should > remove the aio non-vectored and maybe even the plain vectored > operations. What sort of time line are you thinking for this? Removing the plain vectored operation sounds pretty aggressive :) I'll admit to fearing that the simple aio vectored api addition will get backed up behind a grand api reorganization. But, in any case, I'm happy to lend some time to the mechanical API shuffling and testing. Let me know how we can coordinate efforts effectively. If we're going down this path, and find ourselves touching every vectored implementation in the world, I wonder if we shouldn't consider that iovec container. The desire is to avoid the duplicated iovec walking that happens at the various layers by storing the result of a single walk. An ext3 O_DIRECT write walks the iovec no fewer than 7 times: do_readv_writev __generic_file_aio_write_nolock generic_file_direct_IO (via iov_length) ext3_direct_IO (via_iov_length) __blockdev_direct_IO direct_io_worker (twice) That's the pathological case. Buffered ext3 only gets 3 walks, 2 is somewhat obviously the minimum. Maybe we don't care about the cache pressure of huge vectored o_direct writes? I just thought we might be sad if we realized we *did* care about avoiding those duplicated walks after having just spent weeks shuffling the vectored fs ops around. - z