From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757671Ab1ECAvY (ORCPT ); Mon, 2 May 2011 20:51:24 -0400
Received: from ipmail06.adl6.internode.on.net ([150.101.137.145]:35862
"EHLO ipmail06.adl6.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751863Ab1ECAvW (ORCPT ); Mon, 2 May 2011 20:51:22 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Av0EAONQv015LBza/2dsb2JhbACEUKE+eLQckF4OgRyDVYEBBJ0t
Date: Tue, 3 May 2011 10:51:14 +1000
From: Dave Chinner
To: Christian Kujau
Cc: Markus Trippelsdorf , LKML , xfs@oss.sgi.com, minchan.kim@gmail.com
Subject: Re: 2.6.39-rc4+: oom-killer busy killing tasks
Message-ID: <20110503005114.GE2978@dastard>
References: <20110427102824.GI12436@dastard> <20110428233751.GR12436@dastard> <20110429201701.GA13166@x4.trippels.de> <20110501080149.GD13542@dastard> <20110502121958.GA2978@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 02, 2011 at 12:59:50PM -0700, Christian Kujau wrote:
> On Mon, 2 May 2011 at 22:19, Dave Chinner wrote:
> > Yes. Try 2 orders of magnitude as a start. i.e change it to 10000...
>
> I've run the -12 test with vfs_cache_pressure=200 and now the -13 test
> with vfs_cache_pressure=10000. The OOM killer still kicks in, but the
> machine seems to be more usable afterwards and does not get totally stuck:
>
>   http://nerdbynature.de/bits/2.6.39-rc4/oom/
>   - messages-12.txt.gz & slabinfo-12.txt.bz2
>     * oom-debug.sh invoked oom-killer at 01:27:11
>     * sysrq-w works until 01:27:08, but got killed by oom
>
>   - messages-13.txt.gz & slabinfo-13.txt.bz2
>     * find invoked oom-killer at 08:44:07
>     * sysrq-w works until 08:45:48 (listing jbd2/hda6-8), then
>       my debug script got killed

So before the OOM killer kicks in, kswapd is stuck in
congestion_wait(), and after a number of oom-kills over a 5s period
it is still in congestion_wait().
7s later it is still in
congestion_wait() and the oom-killer starts up again, with kswapd
still being in congestion_wait() when the oom-killer stops again 3s
later.

Ok, so kswapd being stuck in congestion wait means it can only be in
balance_pgdat() and it thinks that it is getting into trouble.

Looking at the OOM output:

  active_anon:7992 inactive_anon:8714 isolated_anon:0
  active_file:5995 inactive_file:73780 isolated_file:0
  unevictable:0 dirty:0 writeback:0 unstable:0
  free:35263 slab_reclaimable:182652 slab_unreclaimable:3224
  mapped:6929 shmem:199 pagetables:396 bounce:0
 DMA free:3436kB min:3532kB low:4412kB high:5296kB active_anon:0kB inactive_anon:0kB active_file:236kB inactive_file:248kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:780288kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:730608kB slab_unreclaimable:12896kB kernel_stack:1032kB pagetables:1584kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:680 all_unreclaimable? yes
 lowmem_reserve[]: 0 0 508 508
 HighMem free:137616kB min:508kB low:1096kB high:1684kB active_anon:31968kB inactive_anon:34856kB active_file:23744kB inactive_file:294872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:27708kB shmem:796kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
 lowmem_reserve[]: 0 0 0 0

There are no isolated pages, so that means we aren't in the
congestion_wait() call related to having too many isolated pages.

We see that the ZONE_DMA is all_unreclaimable and had 680 pages
scanned. ZONE_HIGHMEM had _zero_ pages scanned, which means it must
be over the high water marks for free memory and so no attempt is
made to reclaim from this zone.
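
As a rough cross-check of the watermark reasoning above, the sketch
below compares each zone's free memory against its watermarks using
the numbers from the OOM report. It is a simplification: the kernel's
real check also factors in the allocation order and lowmem_reserve,
and the struct and function names here are hypothetical, made up for
illustration only.

```c
#include <stdbool.h>

/* Per-zone figures as printed in the OOM report, all in kB. */
struct zone_report {
	const char *name;
	unsigned long free_kb;
	unsigned long min_kb;
	unsigned long low_kb;
	unsigned long high_kb;
};

/*
 * Simplified check (assumption: order-0, no lowmem_reserve):
 * a zone whose free memory exceeds its high watermark gives
 * kswapd nothing to do, so it is not scanned.
 */
static bool zone_above_high_wmark(const struct zone_report *z)
{
	return z->free_kb > z->high_kb;
}

/* True when free memory has dropped below the min watermark. */
static bool zone_below_min_wmark(const struct zone_report *z)
{
	return z->free_kb < z->min_kb;
}

/* Values copied from the DMA and HighMem lines of the report. */
static const struct zone_report zone_dma = {
	"DMA", 3436, 3532, 4412, 5296
};
static const struct zone_report zone_highmem = {
	"HighMem", 137616, 508, 1096, 1684
};
```

With these numbers, zone_above_high_wmark() is true for HighMem
(137616kB free against a 1684kB high watermark) and false for DMA,
which is even below its min watermark (3436kB against 3532kB). That
fits with only ZONE_DMA being scanned.
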
That means lru_pages is set to
zone_reclaimable_pages(ZONE_DMA), which at this point in time would
be:

	active_anon:0kB inactive_anon:0kB active_file:236kB inactive_file:248kB

about 484kB or 121 pages. To get all_unreclaimable set, the
shrink_slab() call must have returned zero to indicate it didn't
free anything.

So the first pass through would have passed that to shrink_slab, and
assuming they are all mapped pages we'd end up with nr_scanned =
242. For the xfs inode cache with 600,000 reclaimable inodes, this
would have resulted in:

	max_pass = 600000
	delta = 4 * 242 / 2 = 484
	delta = 484 * 600,000 = 290,400,000
	delta = 290,400,000 / (121 + 1) = 2,380,327
	shrinker->nr += delta
	if (shrinker->nr > max_pass * 2)
		shrinker->nr = max_pass * 2;	/* = 1,200,000 */

So, the shrinker->nr should be well above zero, even in the worst
case. The question is now: how on earth is it returning zero?

Two cases: if the shrinker returns -1, or because the cache is growing:

	nr_before = (*shrinker->shrink)(shrinker, 0, gfp_mask);
	shrink_ret = (*shrinker->shrink)(shrinker, this_scan,
					 gfp_mask);
	if (shrink_ret == -1)
		break;
	if (shrink_ret < nr_before)
		ret += nr_before - shrink_ret;

So, the first case will happen for XFS when:

	if (!(gfp_mask & __GFP_FS))
		return -1;

In most of the OOM-killer invocations, the stack trace is:

 out_of_memory+0x27c/0x360
 __alloc_pages_nodemask+0x6f8/0x708
 new_slab+0x1fc/0x234
 T.915+0x1f8/0x388
 kmem_cache_alloc+0x11c/0x124
 kmem_zone_alloc+0xa4/0x114
 xfs_inode_alloc+0x40/0x13c
 xfs_iget+0x2a8/0x620
 xfs_lookup+0xf8/0x114
 xfs_vn_lookup+0x5c/0xb0
 d_alloc_and_lookup+0x54/0x90
 do_lookup+0x248/0x2bc
 path_lookupat+0xfc/0x8f4
 do_path_lookup+0x34/0xac
 user_path_at+0x64/0xb4
 vfs_fstatat+0x58/0xbc
 sys_fstatat64+0x24/0x50
 ret_from_syscall+0x0/0x38

So we are not preventing reclaim via the gfp_mask. That leaves the
other case, where the number of reclaimable inodes is growing faster
than the shrinker is freeing them.
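
The scan-count arithmetic and the progress accounting discussed
above can be sketched in plain userspace C. This is a simplified
model, not the kernel code: the function names are hypothetical, the
divisor 2 from the calculation is assumed to be the shrinker's seeks
value, and 64-bit integers stand in for the kernel's types.

```c
#include <stdint.h>

/*
 * Raw scan delta for one pass (seeks is assumed to be non-zero):
 *   scanned   - LRU pages scanned (mapped pages counted twice)
 *   seeks     - shrinker cost factor, assumed to be 2 here
 *   max_pass  - freeable objects reported by the cache
 *   lru_pages - reclaimable LRU pages in the scanned zones
 */
static uint64_t shrinker_delta(uint64_t scanned, uint64_t seeks,
			       uint64_t max_pass, uint64_t lru_pages)
{
	uint64_t delta = (4 * scanned) / seeks;

	delta *= max_pass;
	delta /= lru_pages + 1;	/* the "+ 1" avoids dividing by zero */
	return delta;
}

/* Add the delta to the pending count, clamped to 2 * max_pass. */
static uint64_t shrinker_scan_target(uint64_t scanned, uint64_t seeks,
				     uint64_t max_pass,
				     uint64_t lru_pages,
				     uint64_t pending)
{
	pending += shrinker_delta(scanned, seeks, max_pass, lru_pages);
	if (pending > max_pass * 2)
		pending = max_pass * 2;
	return pending;
}

/*
 * Progress credited for one batch: nothing if the shrinker bailed
 * out with -1, nothing if the cache did not get smaller, otherwise
 * the drop in the object count.
 */
static unsigned long batch_progress(long nr_before, long shrink_ret)
{
	if (shrink_ret == -1)
		return 0;
	if (shrink_ret < nr_before)
		return (unsigned long)(nr_before - shrink_ret);
	return 0;
}
```

For the numbers above, shrinker_delta(242, 2, 600000, 121) gives
2380327 and shrinker_scan_target(242, 2, 600000, 121, 0) is clamped
to 1200000. batch_progress() returns 0 both when the shrinker returns
-1 and when the object count after the scan is not lower than before,
which are the two cases in which no progress would be reported.
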
I can't really see how the cache can outgrow the shrinker on a
single CPU machine without preempt enabled and, apparently, with no
dirty inodes. Inode reclaim should not block (shrinker or
background), so there's something else going on here.

Can you run an event trace of all the XFS events during a find for
me? Don't do it over the entire subset of the filesystem - only
100,000 inodes is sufficient (i.e. kill the find once the xfs inode
cache slab reaches 100k inodes). While still running the event trace,
can you then drop the caches (echo 3 > /proc/sys/vm/drop_caches) and
check that the xfs inode cache is emptied? If it isn't emptied, drop
caches again to see if that empties it. If you could then post the
event trace, I might be able to see what is going wrong with the
shrinker and/or reclaim.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com