From: Andrew Morton <akpm@osdl.org>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
xfs@oss.sgi.com
Subject: Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2
Date: Mon, 22 Jan 2007 11:57:03 -0800
Message-ID: <20070122115703.97ed54f3.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.64.0701211424170.2552@p34.internal.lan>
> On Sun, 21 Jan 2007 14:27:34 -0500 (EST) Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
> Why does copying an 18GB file on a 74GB raptor raid1 cause the kernel to invoke
> the OOM killer and kill all of my processes?
What's that? Software raid or hardware raid? If the latter, which driver?
> Doing this on a single disk 2.6.19.2 is OK, no issues. However, this
> happens every time!
>
> Anything to try? Any other output needed? Can someone shed some light on
> this situation?
>
> Thanks.
>
>
> The last lines of vmstat 1 (right before it kill -9'd my shell/ssh)
>
> procs -----------memory---------- ---swap-- -----io---- -system--
> ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id
> wa
> 0 7 764 50348 12 1269988 0 0 53632 172 1902 4600 1 8
> 29 62
> 0 7 764 49420 12 1260004 0 0 53632 34368 1871 6357 2 11
> 48 40
The wordwrapping is painful :(
>
> The last lines of dmesg:
> [ 5947.199985] lowmem_reserve[]: 0 0 0
> [ 5947.199992] DMA: 0*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB
> 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3544kB
> [ 5947.200010] Normal: 1*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 0*256kB
> 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2740kB
> [ 5947.200035] HighMem: 98*4kB 35*8kB 9*16kB 69*32kB 4*64kB 1*128kB
> 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3664kB
> [ 5947.200052] Swap cache: add 789, delete 189, find 16/17, race 0+0
> [ 5947.200055] Free swap = 2197628kB
> [ 5947.200058] Total swap = 2200760kB
> [ 5947.200060] Free swap: 2197628kB
> [ 5947.205664] 517888 pages of RAM
> [ 5947.205671] 288512 pages of HIGHMEM
> [ 5947.205673] 5666 reserved pages
> [ 5947.205675] 257163 pages shared
> [ 5947.205678] 600 pages swap cached
> [ 5947.205680] 88876 pages dirty
> [ 5947.205682] 115111 pages writeback
> [ 5947.205684] 5608 pages mapped
> [ 5947.205686] 49367 pages slab
> [ 5947.205688] 541 pages pagetables
> [ 5947.205795] Out of memory: kill process 1853 (named) score 9937 or a
> child
> [ 5947.205801] Killed process 1853 (named)
> [ 5947.206616] bash invoked oom-killer: gfp_mask=0x84d0, order=0,
> oomkilladj=0
> [ 5947.206621] [<c013e33b>] out_of_memory+0x17b/0x1b0
> [ 5947.206631] [<c013fcac>] __alloc_pages+0x29c/0x2f0
> [ 5947.206636] [<c01479ad>] __pte_alloc+0x1d/0x90
> [ 5947.206643] [<c0148bf7>] copy_page_range+0x357/0x380
> [ 5947.206649] [<c0119d75>] copy_process+0x765/0xfc0
> [ 5947.206655] [<c012c3f9>] alloc_pid+0x1b9/0x280
> [ 5947.206662] [<c011a839>] do_fork+0x79/0x1e0
> [ 5947.206674] [<c015f91f>] do_pipe+0x5f/0xc0
> [ 5947.206680] [<c0101176>] sys_clone+0x36/0x40
> [ 5947.206686] [<c0103138>] syscall_call+0x7/0xb
> [ 5947.206691] [<c0420033>] __sched_text_start+0x853/0x950
> [ 5947.206698] =======================
Important information from the oom-killing event is missing. Please send
it all.
From your earlier reports we have several hundred MB of ZONE_NORMAL memory
which has gone awol.
Please include /proc/meminfo from after the oom-killing.
Please work out what is using all that slab memory, via /proc/slabinfo.
After the oom-killing, please see if you can free up the ZONE_NORMAL memory
via a few `echo 3 > /proc/sys/vm/drop_caches' commands. See if you can
work out what happened to the missing couple-of-hundred MB from
ZONE_NORMAL.
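
A minimal sketch of how the information requested above could be gathered, assuming a POSIX shell with awk available and a root login that survives the oom-killing. The function names `slab_top` and `oom_snapshot` and the report file names are hypothetical, not part of any kernel tooling. The per-cache figure is only an approximation (num_objs * objsize from /proc/slabinfo), so it ignores per-slab overhead.

```shell
#!/bin/sh
# Sketch only: helper names and output format are illustrative assumptions.

# Print the largest slab caches, approximated as num_objs * objsize in kB.
#   $1: path to a slabinfo-format file (normally /proc/slabinfo, root-readable)
#   $2: how many caches to list (default 10)
slab_top() {
    awk '!/^slabinfo/ && !/^#/ && NF >= 4 {
             # column 3 = num_objs, column 4 = objsize in bytes
             printf "%d kB %s\n", $3 * $4 / 1024, $1
         }' "$1" | sort -rn | head -n "${2:-10}"
}

# Save /proc/meminfo and the biggest slab caches into one report file.
#   $1: output file (default oom-report.txt)
oom_snapshot() {
    out=${1:-oom-report.txt}
    {
        echo "== /proc/meminfo =="
        cat /proc/meminfo
        echo "== largest slab caches (approximate) =="
        slab_top /proc/slabinfo 15
    } > "$out"
}

# Possible usage, as root, after the oom-killing:
#   oom_snapshot before.txt
#   sync; echo 3 > /proc/sys/vm/drop_caches    # repeat a few times
#   oom_snapshot after.txt
#   diff before.txt after.txt
```

Comparing the two reports should show whether dropping the caches gives any ZONE_NORMAL memory back (LowFree in /proc/meminfo on a highmem kernel) and which slab caches, if any, shrink.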