From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f44.google.com (mail-pa0-f44.google.com [209.85.220.44]) by kanga.kvack.org (Postfix) with ESMTP id 309B66B0038 for ; Thu, 12 Nov 2015 15:07:14 -0500 (EST)
Received: by pacdm15 with SMTP id dm15so74555786pac.3 for ; Thu, 12 Nov 2015 12:07:13 -0800 (PST)
Received: from ipmail04.adl6.internode.on.net (ipmail04.adl6.internode.on.net. [150.101.137.141]) by mx.google.com with ESMTP id ip1si21886112pbc.157.2015.11.12.12.07.11 for ; Thu, 12 Nov 2015 12:07:12 -0800 (PST)
Date: Fri, 13 Nov 2015 07:06:41 +1100
From: Dave Chinner
Subject: Re: memory reclaim problems on fs usage
Message-ID: <20151112200641.GR19199@dastard>
References: <201511102313.36685.arekm@maven.pl> <201511111719.44035.arekm@maven.pl> <201511120719.EBF35970.OtSOHOVFJMFQFL@I-love.SAKURA.ne.jp> <201511120706.10739.arekm@maven.pl> <56449E44.7020407@I-love.SAKURA.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <56449E44.7020407@I-love.SAKURA.ne.jp>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Tetsuo Handa
Cc: Arkadiusz =?utf-8?Q?Mi=C5=9Bkiewicz?= , linux-mm@kvack.org, xfs@oss.sgi.com

On Thu, Nov 12, 2015 at 11:12:20PM +0900, Tetsuo Handa wrote:
> On 2015/11/12 15:06, Arkadiusz Miśkiewicz wrote:
> >On Wednesday 11 of November 2015, Tetsuo Handa wrote:
> >>Arkadiusz Miśkiewicz wrote:
> >>>This patch is against which tree? (tried 4.1, 4.2 and 4.3)
> >>
> >>Oops. Whitespace-damaged. This patch is for vanilla 4.1.2.
> >>Reposting with one condition corrected.
> >
> >Here is log:
> >
> >http://ixion.pld-linux.org/~arekm/log-mm-1.txt.gz
> >
> >Uncompresses is 1.4MB, so not posting here.
> >
> Thank you for the log. The result is unexpected for me.
>
> What I feel strange is that free: remained below min: level.
> While GFP_ATOMIC allocations can access memory reserves, I think that
> these free: values are too small. Memory allocated by GFP_ATOMIC should
> be released shortly, or any __GFP_WAIT allocations would stall for long.
>
> [ 8633.753528] Node 0 Normal free:128kB min:7104kB low:8880kB
> high:10656kB active_anon:59008kB inactive_anon:75240kB
> active_file:14712kB inactive_file:3256960kB unevictable:0kB
....
> isolated(anon):0kB isolated(file):0kB present:5242880kB
> managed:5109980kB mlocked:0kB dirty:20kB writeback:0kB mapped:7368kB
....
> pages_scanned:176 all_unreclaimable? no

So: we have 3.2GB (800,000 pages) of immediately reclaimable page
cache in this zone (inactive_file) and a GFP_ATOMIC allocation
context so we can only really reclaim clean page cache pages
reliably.

So why have we only scanned *176* pages during reclaim?  On other
OOM reports in this trace it's as low as 12.  Either that stat is
completely wrong, or we're not doing sufficient page LRU reclaim
scanning....

> [ 9662.234685] MemAlloc-Info: 3 stalling task, 0 dying task, 0 victim task.
>
> vmstat_update() and submit_flushes() remained pending for about 110 seconds.
> If xlog_cil_push_work() were spinning inside GFP_NOFS allocation, it should be
> reported as MemAlloc: traces, but no such lines are recorded. I don't know why
> xlog_cil_push_work() did not call schedule() for so long.

I'd say it is repeatedly waiting for IO completion on log buffers to
write out the checkpoint. It's making progress, but if it's taking
multiple seconds per journal IO it will take a long time to write a
checkpoint. All the other blocked tasks in XFS inode reclaim are
either waiting directly on IO completion or waiting for the log to
complete a flush, so this really just looks like an overloaded IO
subsystem to me....

> Well, what steps should we try next for isolating the problem?

I think there's plenty that needs to be explained from this
information. We don't have a memory allocation livelock - slow
progress is being made on reclaiming inodes, but we
clearly have a zone imbalance and direct page reclaim is not freeing
clean inactive pages when there are huge numbers of them available
for reclaim.
Somebody who understands the page reclaim code needs to
look closely at the code to explain this behaviour now, then we'll
know what information needs to be gathered next...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
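
For anyone wanting to check the zone arithmetic in the message above, here is a minimal sketch that converts the quoted "Node 0 Normal" figures from kB to pages. It assumes 4 kB pages, which the thread does not state; `PAGE_KB` and `parse_zone()` are hypothetical names used only for illustration, and only a subset of the quoted fields is included.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages; the thread does not state the page size

# Subset of the fields from the "Node 0 Normal" line quoted in the message.
ZONE_LINE = (
    "Node 0 Normal free:128kB min:7104kB low:8880kB high:10656kB "
    "active_anon:59008kB inactive_anon:75240kB active_file:14712kB "
    "inactive_file:3256960kB pages_scanned:176"
)


def parse_zone(line):
    """Hypothetical helper: map each 'name:value[kB]' field to an int."""
    return {k: int(v) for k, v in re.findall(r"([a-z_()]+):(\d+)(?:kB)?", line)}


zone = parse_zone(ZONE_LINE)

# kB counters converted to pages; pages_scanned is already a page count.
inactive_file_pages = zone["inactive_file"] // PAGE_KB
free_pages = zone["free"] // PAGE_KB
min_pages = zone["min"] // PAGE_KB
scanned_fraction = zone["pages_scanned"] / inactive_file_pages

print(f"inactive_file: {inactive_file_pages} pages "
      f"({zone['inactive_file'] / 1024 / 1024:.2f} GiB)")
print(f"free: {free_pages} pages, min watermark: {min_pages} pages")
print(f"pages_scanned / inactive_file: {scanned_fraction:.4%}")
```

Under that page-size assumption the inactive file LRU holds a little over 800,000 pages, in line with the rough figure given in the reply, while free memory sits far below the min watermark and the scanned count is a tiny fraction of what is nominally reclaimable.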