From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261620AbULTTWt (ORCPT ); Mon, 20 Dec 2004 14:22:49 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261616AbULTTWs (ORCPT ); Mon, 20 Dec 2004 14:22:48 -0500 Received: from mail-relay-1.tiscali.it ([213.205.33.41]:15330 "EHLO mail-relay-1.tiscali.it") by vger.kernel.org with ESMTP id S261617AbULTTVZ (ORCPT ); Mon, 20 Dec 2004 14:21:25 -0500 Date: Mon, 20 Dec 2004 20:20:46 +0100 From: Andrea Arcangeli To: Andrew Morton Cc: James Pearson , marcelo.tosatti@cyclades.com, linux-kernel@vger.kernel.org Subject: Re: Reducing inode cache usage on 2.4? Message-ID: <20041220192046.GM4630@dualathlon.random> References: <41C316BC.1020909@moving-picture.com> <20041217151228.GA17650@logos.cnet> <41C37AB6.10906@moving-picture.com> <20041217172104.00da3517.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041217172104.00da3517.akpm@osdl.org> X-GPG-Key: 1024D/68B9CB43 13D9 8355 295F 4823 7C49 C012 DFA1 686E 68B9 CB43 X-PGP-Key: 1024R/CB4660B9 CC A0 71 81 F4 A0 63 AC C0 4B 81 1D 8C 15 C8 E5 User-Agent: Mutt/1.5.6i Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Dec 17, 2004 at 05:21:04PM -0800, Andrew Morton wrote: > James Pearson wrote: > > > > It seems the inode cache has priority over cached file data. > > It does. If the machine is full of unmapped clean pagecache pages the > kernel won't even try to reclaim inodes. This should help a bit: > > --- 24/mm/vmscan.c~a 2004-12-17 17:18:31.660254712 -0800 > +++ 24-akpm/mm/vmscan.c 2004-12-17 17:18:41.821709936 -0800 > @@ -659,13 +659,13 @@ int fastcall try_to_free_pages_zone(zone > > do { > nr_pages = shrink_caches(classzone, gfp_mask, nr_pages, &failed_swapout); > - if (nr_pages <= 0) > - return 1; > shrink_dcache_memory(vm_vfs_scan_ratio, gfp_mask); > shrink_icache_memory(vm_vfs_scan_ratio, gfp_mask); > #ifdef CONFIG_QUOTA > shrink_dqcache_memory(vm_vfs_scan_ratio, gfp_mask); > #endif > + if (nr_pages <= 0) > + return 1; > if (!failed_swapout) > failed_swapout = !swap_out(classzone); > } while (--tries); I'm worried this is too aggressive by default and it may hurt stuff. The real bug is that we don't do anything when too many collisions happens in the hashtables. That is the thing to work on. We should free colliding entries in the background after a 'touch' timeout. That should work pretty well to age the dcache proprerly too. But the above will just shrink everything all the time and it's going to break stuff. For 2.6 we can talk about the background shrink based on timeout. My only suggestion for 2.4 is to try with vm_cache_scan_ratio = 20 or higher (or alternatively vm_mapped_ratio = 50 or = 20). There's a reason why everything is tunable by sysctl. I don't think the vm_lru_balance_ratio is the one he's interested about. vm_lru_balance_ratio controls how much work is being done at every dcache/icache shrinking. His real objective is to invoke the dcache/icache shrinking more frequently, how much work is being done at each pass is a secondary issue. If we don't invoke it, nothing will be shrunk, no matter what is the value of vm_lru_balance_ratio. Hope this helps funding an optimal tuning for the workload.