From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Chinner
To: Nick Piggin
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH 1/5] inode: Make unused inode LRU per superblock
Date: Thu, 27 May 2010 09:01:29 +1000
Message-ID: <20100526230129.GA1395@dastard>
In-Reply-To: <20100526161732.GC22536@laptop>
References: <1274777588-21494-1-git-send-email-david@fromorbit.com>
 <1274777588-21494-2-git-send-email-david@fromorbit.com>
 <20100526161732.GC22536@laptop>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

On Thu, May 27, 2010 at 02:17:33AM +1000, Nick Piggin wrote:
> On Tue, May 25, 2010 at 06:53:04PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > The inode unused list is currently a global LRU. This does not match
> > the other global filesystem cache - the dentry cache - which uses
> > per-superblock LRU lists. Hence we have related filesystem object
> > types using different LRU reclamation schemes.
>
> Is this an improvement I wonder? The dcache is using per sb lists
> because it specifically requires sb traversal.

Right - I originally implemented the per-sb dentry lists for
scalability purposes, i.e. to avoid monopolising the dentry_lock
during unmount looking for dentries on a specific sb and hanging the
system for several minutes.

However, the reason for doing this to the inode cache is not for
scalability, it's because we have a tight relationship between the
dentry and inode caches. That is, reclaim from the dentry LRU grows
the inode LRU.  Like the registration of the shrinkers, this is kind
of an implicit, undocumented behaviour of the current shrinker
implementation.

What this patch series does is take that implicit relationship and
make it explicit.  It also allows other filesystem caches to tie
into the relationship if they need to (e.g. the XFS inode cache).
What it _doesn't do_ is change the macro-level behaviour of the
shrinkers...

> What allocation/reclaim really wants (for good scalability and NUMA
> characteristics) is per-zone lists for these things. It's easy to
> convert a single list into per-zone lists.
>
> It is much harder to convert per-sb lists into per-sb x per-zone lists.

No, it's not. Just convert the s_{dentry,inode}_lru lists on each
superblock into per-zone lists and call the shrinker with a new zone
mask field to pick the correct LRU. That's no harder than converting
a global LRU. Anyway, you'd still have to do per-sb x per-zone lists
for the dentry LRUs, so changing the inode cache to per-sb makes no
difference.

However, this is a moot point because we don't have per-zone shrinker
interfaces. That's an entirely separate discussion because of the
macro-level behavioural changes it implies....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
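
The per-sb x per-zone conversion argued for in the reply can be sketched
in user-space C. This is only an illustration under stated assumptions,
not the patch series and not kernel code: toy_sb, toy_inode, MAX_ZONES
and toy_shrink_inodes() are hypothetical names, and the zone mask
argument stands in for the "new zone mask field" that, as the message
notes, the shrinker interface does not have.

```c
#include <assert.h>

#define MAX_ZONES 3  /* hypothetical zone count, for illustration only */

/* Minimal circular doubly linked list, in the style of a kernel list head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

/* Append at the tail, so the entry after the head is the least recently used. */
static void list_append(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

static void list_remove(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = node;
}

/* Hypothetical stand-in for an inode that lives in one memory zone. */
struct toy_inode {
    struct list_node lru;
    int zone;
};

/* Hypothetical stand-in for a superblock: one unused-inode LRU per zone. */
struct toy_sb {
    struct list_node inode_lru[MAX_ZONES];
    int nr_unused[MAX_ZONES];
};

void toy_sb_init(struct toy_sb *sb)
{
    for (int z = 0; z < MAX_ZONES; z++) {
        list_init(&sb->inode_lru[z]);
        sb->nr_unused[z] = 0;
    }
}

/* Put an unused inode on the LRU of its own zone within this superblock. */
void toy_inode_mark_unused(struct toy_sb *sb, struct toy_inode *inode)
{
    assert(inode->zone >= 0 && inode->zone < MAX_ZONES);
    list_append(&inode->lru, &sb->inode_lru[inode->zone]);
    sb->nr_unused[inode->zone]++;
}

/*
 * Take up to nr_to_scan inodes off the LRUs selected by zone_mask,
 * oldest first within each zone. Returns the number removed.
 */
int toy_shrink_inodes(struct toy_sb *sb, unsigned int zone_mask, int nr_to_scan)
{
    int freed = 0;

    for (int z = 0; z < MAX_ZONES && freed < nr_to_scan; z++) {
        if (!(zone_mask & (1u << z)))
            continue;  /* the caller did not ask for this zone */
        while (freed < nr_to_scan && !list_is_empty(&sb->inode_lru[z])) {
            list_remove(sb->inode_lru[z].next);
            sb->nr_unused[z]--;
            freed++;
        }
    }
    return freed;
}
```

A caller would initialise a toy_sb, add inodes with
toy_inode_mark_unused(), and then call, say,
toy_shrink_inodes(&sb, 1u << 1, 10) to reclaim from zone 1 only. In this
sketch, splitting by zone inside each superblock amounts to indexing an
array by zone, which is the same step a single global list would need.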