From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S266292AbUHMRdW (ORCPT ); Fri, 13 Aug 2004 13:33:22 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S266284AbUHMRdV (ORCPT ); Fri, 13 Aug 2004 13:33:21 -0400 Received: from smtp109.mail.sc5.yahoo.com ([66.163.170.7]:52364 "HELO smtp109.mail.sc5.yahoo.com") by vger.kernel.org with SMTP id S266292AbUHMRbw (ORCPT ); Fri, 13 Aug 2004 13:31:52 -0400 Message-ID: <411CFB04.603@yahoo.com.au> Date: Sat, 14 Aug 2004 03:31:48 +1000 From: Nick Piggin User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040707 Debian/1.7-5 X-Accept-Language: en MIME-Version: 1.0 To: "Martin J. Bligh" CC: Jesse Barnes , akpm@osdl.org, linux-kernel@vger.kernel.org, steiner@sgi.com Subject: Re: [PATCH] allocate page caches pages in round robin fasion References: <200408121646.50740.jbarnes@engr.sgi.com> <200408130859.24637.jbarnes@engr.sgi.com> <89760000.1092414010@[10.10.2.4]> <200408130934.20913.jbarnes@engr.sgi.com> <92140000.1092415652@[10.10.2.4]> In-Reply-To: <92140000.1092415652@[10.10.2.4]> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Martin J. Bligh wrote: > --Jesse Barnes wrote (on Friday, August 13, 2004 09:34:20 -0700): > > >>On Friday, August 13, 2004 9:20 am, Martin J. Bligh wrote: >> >>>>>I really don't think this is a good idea - you're assuming there's >>>>>really no locality of reference, which I don't think is at all true in >>>>>most cases. >>>> >>>>No, not at all, just that locality of reference matters more for stack >>>>and anonymous pages than it does for page cache pages. I.e. we don't >>>>want a node to be filled up with page cache pages causing all other >>>>memory references from the process to be off node. >>> >>>Does that actually happen though? Looking at the current code makes me >>>think it'll keep some pages free on all nodes at all times, and if kswapd >>>does it's job, we'll never fall back across nodes. Now ... I think that's >>>broken, but I think that's what currently happens - that was what we >>>discussed at KS ... I might be misreading it though, I should test it. >> >>Not nearly enough pages for any sizeable app though. Maybe the behavior could >>be configurable? > > > Well, either we're: > > 1. Falling back and putting all our most recent accesses off-node. > > or. > > 2. Not falling back and only able to use one node's memory for any one > (single threaded) app. > > Either situation is crap, though I'm not sure which turd we picked right > now ... I'd have to look at the code again ;-) I thought it was 2, but > I might be wrong. > I'm looking at this now. We are doing 1 currently. There are a couple of issues. The first is that you need to minimise regressions for when working set size is bigger than the local node. The second is a fundamental tradeoff where purely FIFO touch patterns will never be allowed to expand past local node memory for any sane and simple (ie 2.6 worthy) scanning algorithms that I can think of. I have a patch going now that just reclaims use-once file cache before going off node. Seems to help a bit for basic things that just push pagecache through the system. It definitely reduces remote allocations by several orders of magnitude for those cases.