From: Badari Pulavarty
To: suparna@in.ibm.com, akpm@osdl.org
Cc: linux-kernel@vger.kernel.org, linux-aio@kvack.org
Subject: Re: [PATCH][2.6-mm] Readahead issues and AIO read speedup
Date: Thu, 7 Aug 2003 09:01:01 -0700
Message-Id: <200308070901.01119.pbadari@us.ibm.com>
In-Reply-To: <20030807100120.GA5170@in.ibm.com>
References: <20030807100120.GA5170@in.ibm.com>

Suparna,

I noticed exactly the same thing while running a database benchmark on
filesystems (without AIO). I added instrumentation in the SCSI layer to
record the I/O pattern and found that we were doing a very large number
(4 million) of 4K reads during my benchmark run. I traced these and found
that all of those reads were generated by the slow read path, because the
readahead window was maximally shrunk. When I forced the readahead code
to read 16K (my database page size) whenever the ra window was closed,
I saw a 20% improvement in my benchmark.

I have asked Ramchandra Pai (linuxram@us.ibm.com) to investigate it
further.

Thanks,
Badari

On Thursday 07 August 2003 03:01 am, Suparna Bhattacharya wrote:
> I noticed a problem with the way do_generic_mapping_read
> and readahead work for the case of large reads, especially
> random reads. This was leading to very inefficient behaviour
> for a stream of AIO reads. (See the results a little later
> in this note.)
>
> 1) We should be reading ahead at least the pages that are
> required by the current read request (even if the ra window
> is maximally shrunk). I think I've seen this in 2.4 - we
> seem to have lost that in 2.5.
> The result is that sometimes (for large random reads) we end
> up doing reads one page at a time, waiting for each read to
> complete before reading the next page, and so on, even for a
> large read (until we build up a readahead window again).
>
> 2) Once the ra window is maximally shrunk, the responsibility
> for reading the pages and re-building the window is shifted
> to the slow path in read, which breaks down in the case of
> a stream of AIO reads where multiple iocbs submit reads
> to the same file rather than serialise the wait for i/o
> completion.
>
> So here is a patch that fixes this by making sure we do
> (1) and pushing up the handle_ra_miss calls for the maximally
> shrunk case before the loop that waits for I/O completion.
>
> Does it make a difference? A lot, actually.
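
To make the behaviour in (1) concrete: with a maximally shrunk window, a
16K read on 4K pages ends up issued as four serialised single-page reads;
the fix amounts to clamping the number of pages submitted for readahead so
it always covers the whole request. Below is a minimal standalone sketch
of that clamp. The struct and names are illustrative only, not the
kernel's actual file_ra_state or the patch itself.

/*
 * Illustrative sketch, not the 2.6-mm patch: never submit fewer
 * readahead pages than the current read request needs, even when
 * the window has shrunk to its minimum.
 */
#include <stdio.h>

struct ra_state {
        unsigned long size;     /* current readahead window, in pages */
};

/* Clamp the window so it covers at least the request itself. */
static unsigned long ra_pages_to_read(const struct ra_state *ra,
                                      unsigned long req_pages)
{
        return ra->size < req_pages ? req_pages : ra->size;
}

int main(void)
{
        struct ra_state ra = { .size = 1 };     /* maximally shrunk */

        /*
         * A 16K read with 4K pages needs 4 pages; submit all 4 at
         * once instead of one page at a time.
         */
        printf("pages to read: %lu\n", ra_pages_to_read(&ra, 4));
        return 0;
}

With this kind of clamp, a stream of AIO reads no longer falls back to
the one-page-at-a-time slow path while the window is being rebuilt, which
is the serialisation point (2) describes.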