From: Mel Gorman <mel@csn.ul.ie>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>,
linux-mm@kvack.org, Nick Piggin <npiggin@suse.de>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Subject: Re: oom-killer killing even if memory is available?
Date: Fri, 20 Mar 2009 15:27:00 +0000 [thread overview]
Message-ID: <20090320152700.GM24586@csn.ul.ie> (raw)
In-Reply-To: <20090317024605.846420e1.akpm@linux-foundation.org>
On Tue, Mar 17, 2009 at 02:46:05AM -0700, Andrew Morton wrote:
> On Tue, 17 Mar 2009 10:00:49 +0100 Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
>
> > Hi all,
> >
> > the below looks like there is some bug in the memory management code.
> > Even though there seems to be plenty of memory available, the oom-killer
> > kills processes.
> >
> > The below happened after 27 days of uptime. Memory seems to be heavily
> > fragmented, but there are still large portions of memory free that
> > could satisfy an order-2 allocation. Any idea why this fails?
> >
In all likelihood you are hitting the watermark checks for the order-2
allocation. This looks like a GFP_KERNEL allocation, so ordinarily that
would be a bit surprising.
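
For reference, the order-aware part of the check looks roughly like this
(paraphrased from memory from zone_watermark_ok() in mm/page_alloc.c circa
2.6.28; the ALLOC_HIGH/ALLOC_HARDER adjustments are omitted, so treat it as
a sketch rather than the exact code):

int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
                      int classzone_idx)
{
        /* free_pages may go negative, that is ok */
        long min = mark;
        long free_pages = zone_page_state(z, NR_FREE_PAGES) - (1 << order) + 1;
        int o;

        if (free_pages <= min + z->lowmem_reserve[classzone_idx])
                return 0;

        for (o = 0; o < order; o++) {
                /* Free pages of this order are useless to a larger request */
                free_pages -= z->free_area[o].nr_free << o;

                /* But proportionally less slack is required at each order */
                min >>= 1;

                if (free_pages <= min)
                        return 0;
        }
        return 1;
}

Note the loop. For an order-2 request, everything free in order-0 and
order-1 blocks is subtracted before the final check. Normal has min:4064kB
(1016 pages), so after two halvings the order-2 requirement is only 254
pages. But if nearly all of the 117664kB free sits in order-0 and order-1
blocks (the per-order buddy breakdown is not in the log, so this is a
guess), the check fails anyway and you go OOM with ~115MB "free" in the
zone.
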
> > [root@t6360003 ~]# uptime
> > 09:33:41 up 27 days, 22:55, 1 user, load average: 0.00, 0.00, 0.00
> >
> > Mar 16 21:40:40 t6360003 kernel: basename invoked oom-killer: gfp_mask=0xd0, order=2, oomkilladj=0
> > Mar 16 21:40:40 t6360003 kernel: CPU: 0 Not tainted 2.6.28 #1
> > Mar 16 21:40:40 t6360003 kernel: Process basename (pid: 30555, task: 000000007baa6838, ksp: 0000000063867968)
> > Mar 16 21:40:40 t6360003 kernel: 0700000084a8c238 0000000063867a90 0000000000000002 0000000000000000
> > Mar 16 21:40:40 t6360003 kernel: 0000000063867b30 0000000063867aa8 0000000063867aa8 000000000010534e
> > Mar 16 21:40:40 t6360003 kernel: 0000000000000000 0000000063867968 0000000000000000 000000000000000a
> > Mar 16 21:40:40 t6360003 kernel: 000000000000000d 0000000000000000 0000000063867a90 0000000063867b08
> > Mar 16 21:40:40 t6360003 kernel: 00000000004a5ab0 000000000010534e 0000000063867a90 0000000063867ae0
> > Mar 16 21:40:40 t6360003 kernel: Call Trace:
> > Mar 16 21:40:40 t6360003 kernel: ([<0000000000105248>] show_trace+0xf4/0x144)
> > Mar 16 21:40:40 t6360003 kernel: [<0000000000105300>] show_stack+0x68/0xf4
> > Mar 16 21:40:40 t6360003 kernel: [<000000000049c84c>] dump_stack+0xb0/0xc0
> > Mar 16 21:40:40 t6360003 kernel: [<000000000019235e>] oom_kill_process+0x9e/0x220
> > Mar 16 21:40:40 t6360003 kernel: [<0000000000192c30>] out_of_memory+0x17c/0x264
> > Mar 16 21:40:40 t6360003 kernel: [<000000000019714e>] __alloc_pages_internal+0x4f6/0x534
> > Mar 16 21:40:40 t6360003 kernel: [<0000000000104058>] crst_table_alloc+0x48/0x108
> > Mar 16 21:40:40 t6360003 kernel: [<00000000001a3f60>] __pmd_alloc+0x3c/0x1a8
> > Mar 16 21:40:40 t6360003 kernel: [<00000000001a802e>] handle_mm_fault+0x262/0x9cc
> > Mar 16 21:40:40 t6360003 kernel: [<00000000004a1a7a>] do_dat_exception+0x30a/0x41c
> > Mar 16 21:40:40 t6360003 kernel: [<0000000000115e5c>] sysc_return+0x0/0x8
> > Mar 16 21:40:40 t6360003 kernel: [<0000004d193bfae0>] 0x4d193bfae0
> > Mar 16 21:40:40 t6360003 kernel: Mem-Info:
> > Mar 16 21:40:40 t6360003 kernel: DMA per-cpu:
> > Mar 16 21:40:40 t6360003 kernel: CPU 0: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 1: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 2: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 3: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 4: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 5: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: Normal per-cpu:
> > Mar 16 21:40:40 t6360003 kernel: CPU 0: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 1: hi: 186, btch: 31 usd: 30
> > Mar 16 21:40:40 t6360003 kernel: CPU 2: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 3: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 4: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: CPU 5: hi: 186, btch: 31 usd: 0
> > Mar 16 21:40:40 t6360003 kernel: Active_anon:372 active_file:45 inactive_anon:154
> > Mar 16 21:40:40 t6360003 kernel: inactive_file:152 unevictable:987 dirty:0 writeback:188 unstable:0
> > Mar 16 21:40:40 t6360003 kernel: free:146348 slab:875833 mapped:805 pagetables:378 bounce:0
> > Mar 16 21:40:40 t6360003 kernel: DMA free:467728kB min:4064kB low:5080kB high:6096kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:116kB unevictable:0kB present:2068480kB pages_scanned:0 all_unreclaimable? no
> > Mar 16 21:40:40 t6360003 kernel: lowmem_reserve[]: 0 2020 2020
> > Mar 16 21:40:40 t6360003 kernel: Normal free:117664kB min:4064kB low:5080kB high:6096kB active_anon:1488kB inactive_anon:616kB active_file:188kB inactive_file:492kB unevictable:3948kB present:2068480kB pages_scanned:128 all_unreclaimable? no
> > Mar 16 21:40:40 t6360003 kernel: lowmem_reserve[]: 0 0 0
>
> The scanner has wrung pretty much all it can out of the reclaimable pages -
> the LRUs are nearly empty. There's a few hundred MB free and apparently we
> don't have four physically contiguous free pages anywhere. It's
> believable.
>
> The question is: where the heck did all your memory go? You have 2GB of
> ZONE_NORMAL memory in that machine, but only a tenth of it is visible to
> the page reclaim code.
>
> Something must have allocated (and possibly leaked) it.
>
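
For what it's worth, a quick back-of-the-envelope from the Mem-Info dump
above (all figures are 4kB pages, so these are estimates) points straight
at slab:

        slab:   875833 pages * 4kB ~= 3.3GB of the ~4GB present in both zones
        free:   146348 pages * 4kB ~= 0.6GB
        LRU lists + unevictable + pagetables: a few MB between them
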
This looks like a memory leak all right. There used to be a patch that
recorded a stack trace for every page allocation, but it was dropped from
-mm ages ago because of a merge conflict. I didn't revive it at the time
because it wasn't of immediate concern.
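
The idea was roughly the following (a from-memory sketch of the approach,
not the dropped patch itself; the struct and function names here are
illustrative):

#include <linux/gfp.h>
#include <linux/stacktrace.h>

#define PAGE_OWNER_DEPTH 8

/* One of these is kept per page, alongside struct page. */
struct page_owner {
        unsigned int order;                     /* allocation order */
        gfp_t gfp_mask;                         /* flags the caller used */
        unsigned long trace[PAGE_OWNER_DEPTH];  /* caller's return addresses */
        unsigned int nr_entries;
};

/* Called from the success path of the page allocator. */
static void record_page_owner(struct page_owner *po, unsigned int order,
                              gfp_t gfp_mask)
{
        struct stack_trace trace = {
                .nr_entries  = 0,
                .max_entries = PAGE_OWNER_DEPTH,
                .entries     = po->trace,
                .skip        = 2,       /* skip the allocator's own frames */
        };

        po->order = order;
        po->gfp_mask = gfp_mask;
        save_stack_trace(&trace);
        po->nr_entries = trace.nr_entries;
}

Dump those records for every page still allocated, sort by trace, and the
leaking call site tends to stand out immediately.
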
Should I revive the patch or do we have preferred ways of tracking down
memory leaks these days?
--
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
Thread overview: 13+ messages
2009-03-17 9:00 oom-killer killing even if memory is available? Heiko Carstens
2009-03-17 9:46 ` Andrew Morton
2009-03-17 10:17 ` Heiko Carstens
2009-03-17 10:28 ` Heiko Carstens
2009-03-17 10:49 ` Nick Piggin
2009-03-17 11:39 ` Heiko Carstens
2009-03-20 5:08 ` Wu Fengguang
2009-03-20 15:27 ` Mel Gorman [this message]
2009-03-20 21:02 ` Andrew Morton
2009-03-23 11:55 ` Mel Gorman
2009-03-23 14:58 ` Mel Gorman
2009-03-17 9:51 ` Nick Piggin
2009-03-17 10:11 ` Heiko Carstens