Date: Sat, 28 Apr 2007 01:22:51 -0700
From: Andrew Morton
To: Peter Zijlstra
Cc: Nick Piggin, David Chinner, Christoph Lameter,
	linux-kernel@vger.kernel.org, Mel Gorman, William Lee Irwin III,
	Jens Axboe, Badari Pulavarty, Maxim Levitsky
Subject: Re: [00/17] Large Blocksize Support V3
Message-Id: <20070428012251.fae10a71.akpm@linux-foundation.org>
In-Reply-To: <1177747448.28223.26.camel@twins>
References: <20070426190438.3a856220.akpm@linux-foundation.org>
	<20070427022731.GF65285596@melbourne.sgi.com>
	<20070426195357.597ffd7e.akpm@linux-foundation.org>
	<20070427042046.GI65285596@melbourne.sgi.com>
	<20070426221528.655d79cb.akpm@linux-foundation.org>
	<20070426235542.bad7035a.akpm@linux-foundation.org>
	<20070427002640.22a71d06.akpm@linux-foundation.org>
	<20070427163620.GI32602149@melbourne.sgi.com>
	<20070427173432.GJ32602149@melbourne.sgi.com>
	<20070427121108.9ee05710.akpm@linux-foundation.org>
	<4632A6DF.7080301@yahoo.com.au>
	<1177747448.28223.26.camel@twins>

On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra wrote:

> > The other thing is that we can batch up pagecache page insertions for
> > bulk writes as well (that is, write(2) with buffer size > page size).
> > I should have a patch somewhere for that as well if anyone is
> > interested.
>
> Together with the optimistic locking from my concurrent pagecache that
> should bring most of the gains:
>
>   sequential insert of 8388608 items:
>
>   CONFIG_RADIX_TREE_CONCURRENT=n
>
>   [ffff81007d7f60c0] insert 0 done in 15286 ms
>
>   CONFIG_RADIX_TREE_OPTIMISTIC=y
>
>   [ffff81006b36e040] insert 0 done in 3443 ms
>
> only 4.4 times faster, and more scalable, since we don't bounce the
> upper level locks around.

I'm not sure what we're looking at here.  Radix-tree changes?  Locking
changes?  Both?

If we have a whole pile of pages to insert then there are obvious gains
from not taking the lock once per page (gang insert).  But I expect there
will also be gains from not walking down the radix tree once per page as
well: walk all the way down once and populate slots all the way to the
end of the node.

The implementation could get a bit tricky: handling pages which a racer
instantiated while we had dropped the lock, and suitably adjusting
->index.  Not rocket science though.

The depth of the radix tree matters (ie, the file size).  'twould be
useful to always describe the tree's size when publishing microbenchmark
results like this.