public inbox for linux-kernel@vger.kernel.org
* 2.6.20 OOM with 8Gb RAM
@ 2007-04-12 17:38 Cameron Schaus
  2007-04-12 19:15 ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Cameron Schaus @ 2007-04-12 17:38 UTC (permalink / raw)
  To: linux-kernel

I am running the latest FC5-i686-smp kernel, 2.6.20, on a machine with
8Gb of RAM, and 2 Xeon processors.  The system has a 750Mb ramdisk,
and one process allocating and deallocating memory that is also
writing lots of files to the ramdisk.  The process also reads and
writes from the network.  After the process runs for a while, the
linux OOM killer starts killing processes, even though there is lots
of memory available.

The system does not ordinarily use swap space, but I've added swap to
see if it makes a difference, but it only defers the problem.

The OOM dump below shows that memory in the NORMAL_ZONE is exhausted,
but there is still plenty of memory (6Gb+) in the HighMem Zone.  I can
provide .config and dmesg data if these would be helpful.

Why is the OOM killer being invoked when there is still memory
available for use?

java invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
java invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
 [<c0455f84>] out_of_memory+0x69/0x191
 [<c0457460>] __alloc_pages+0x220/0x2aa
 [<c046c80a>] cache_alloc_refill+0x26f/0x468
 [<c046ca76>] __kmalloc+0x73/0x7d
 [<c05bb4ce>] __alloc_skb+0x49/0xf7
 [<c05e483d>] tcp_sendmsg+0x169/0xa04
 [<c05fd76d>] inet_sendmsg+0x3b/0x45
 [<c05b57d5>] sock_aio_write+0xf9/0x105
 [<c0455708>] generic_file_aio_read+0x173/0x1a3
 [<c046fd11>] do_sync_write+0xc7/0x10a
 [<c04379fd>] autoremove_wake_function+0x0/0x35
 [<c05e413e>] tcp_ioctl+0x10a/0x115
 [<c05e4034>] tcp_ioctl+0x0/0x115
 [<c05fd406>] inet_ioctl+0x8d/0x91
 [<c0470564>] vfs_write+0xbc/0x154
 [<c0470b62>] sys_write+0x41/0x67
 [<c0403ef6>] sysenter_past_esp+0x5f/0x85
 =======================
DMA per-cpu:
CPU    0: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
CPU    1: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
CPU    2: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
CPU    3: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: Hot: hi:  186, btch:  31 usd:  84   Cold: hi:   62, btch:  15 usd:  53
CPU    1: Hot: hi:  186, btch:  31 usd:  66   Cold: hi:   62, btch:  15 usd:  57
CPU    2: Hot: hi:  186, btch:  31 usd:  59   Cold: hi:   62, btch:  15 usd:  51
CPU    3: Hot: hi:  186, btch:  31 usd:  60   Cold: hi:   62, btch:  15 usd:  58
HighMem per-cpu:
CPU    0: Hot: hi:  186, btch:  31 usd:  65   Cold: hi:   62, btch:  15 usd:   1
CPU    1: Hot: hi:  186, btch:  31 usd: 172   Cold: hi:   62, btch:  15 usd:   5
CPU    2: Hot: hi:  186, btch:  31 usd:  12   Cold: hi:   62, btch:  15 usd:   4
CPU    3: Hot: hi:  186, btch:  31 usd:   9   Cold: hi:   62, btch:  15 usd:   3
Active:313398 inactive:141504 dirty:0 writeback:0 unstable:0 free:1592770 slab:20743 mapped:7015 pagetables:819
DMA free:3536kB min:68kB low:84kB high:100kB active:3496kB inactive:3088kB present:16224kB pages_scanned:24701 all_unreclaimable? yes
lowmem_reserve[]: 0 871 8603
Normal free:1784kB min:3740kB low:4672kB high:5608kB active:324892kB inactive:394008kB present:892320kB pages_scanned:1419914 all_unreclaimable? yes
lowmem_reserve[]: 0 0 61854
HighMem free:6365760kB min:512kB low:8816kB high:17120kB active:925204kB inactive:168920kB present:7917312kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 2*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3536kB
Normal: 0*4kB 1*8kB 1*16kB 9*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1784kB
HighMem: 1438*4kB 599*8kB 191*16kB 49*32kB 0*64kB 0*128kB 79*256kB 48*512kB 10*1024kB 8*2048kB 1533*4096kB = 6365760kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap  = 9775512kB
Total swap = 9775512kB
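[The per-zone numbers in a dump like the one above can also be watched live, before the oom-killer fires, by reading /proc/zoneinfo. A small unprivileged sketch, assuming 4 KB pages; the "Normal" line corresponds to ZONE_NORMAL in the report:]

```shell
# Print free pages per zone from /proc/zoneinfo (no root needed).
# Zone name is field 4 of the "Node N, zone XXX" header; the free count
# follows on the "pages free" line.  4 KB page size is assumed.
awk '/^Node/ {zone=$4} /pages free/ {printf "%-8s free: %d pages (%.1f MB)\n", zone, $3, $3*4/1024}' /proc/zoneinfo
```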


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-12 17:38 2.6.20 OOM with 8Gb RAM Cameron Schaus
@ 2007-04-12 19:15 ` Andrew Morton
  2007-04-12 21:30   ` Cameron Schaus
  2007-04-13 22:39   ` Jason Lunz
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2007-04-12 19:15 UTC (permalink / raw)
  To: Cameron Schaus; +Cc: linux-kernel

On Thu, 12 Apr 2007 11:38:30 -0600
Cameron Schaus <cam@schaus.ca> wrote:

> I am running the latest FC5-i686-smp kernel, 2.6.20, on a machine with
> 8Gb of RAM, and 2 Xeon processors.  The system has a 750Mb ramdisk,
> and one process allocating and deallocating memory that is also
> writing lots of files to the ramdisk.  The process also reads and
> writes from the network.  After the process runs for a while, the
> linux OOM killer starts killing processes, even though there is lots
> of memory available.
> 
> The system does not ordinarily use swap space, but I've added swap to
> see if it makes a difference, but it only defers the problem.
> 
> The OOM dump below shows that memory in the NORMAL_ZONE is exhausted,
> but there is still plenty of memory (6Gb+) in the HighMem Zone.  I can
> provide .config and dmesg data if these would be helpful.
> 
> Why is the OOM killer being invoked when there is still memory
> available for use?
> 
> java invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
> java invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>  [<c0455f84>] out_of_memory+0x69/0x191
>  [<c0457460>] __alloc_pages+0x220/0x2aa
>  [<c046c80a>] cache_alloc_refill+0x26f/0x468
>  [<c046ca76>] __kmalloc+0x73/0x7d
>  [<c05bb4ce>] __alloc_skb+0x49/0xf7
>  [<c05e483d>] tcp_sendmsg+0x169/0xa04
>  [<c05fd76d>] inet_sendmsg+0x3b/0x45
>  [<c05b57d5>] sock_aio_write+0xf9/0x105
>  [<c0455708>] generic_file_aio_read+0x173/0x1a3
>  [<c046fd11>] do_sync_write+0xc7/0x10a
>  [<c04379fd>] autoremove_wake_function+0x0/0x35
>  [<c05e413e>] tcp_ioctl+0x10a/0x115
>  [<c05e4034>] tcp_ioctl+0x0/0x115
>  [<c05fd406>] inet_ioctl+0x8d/0x91
>  [<c0470564>] vfs_write+0xbc/0x154
>  [<c0470b62>] sys_write+0x41/0x67
>  [<c0403ef6>] sysenter_past_esp+0x5f/0x85

All of ZONE_NORMAL got used by ramdisk, and networking wants to
allocate a page from ZONE_NORMAL.  An oom-killing is the correct
response, although probably not effective.
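[The gfp_mask=0xd0 in the report bears this out: on a 2.6.20-era kernel it decodes to GFP_KERNEL (__GFP_WAIT|__GFP_IO|__GFP_FS), which does not include __GFP_HIGHMEM, so the skb allocation can only be satisfied from the DMA/Normal zones. A decoding sketch; the flag values are taken from 2.6-era include/linux/gfp.h and should be verified against the exact tree:]

```shell
# Decode a gfp_mask using 2.6-era flag bits from include/linux/gfp.h
# (these values are assumptions here; check them against the kernel tree).
decode_gfp() {
    mask=$(( $1 ))
    for pair in 0x01:__GFP_DMA 0x02:__GFP_HIGHMEM 0x10:__GFP_WAIT \
                0x20:__GFP_HIGH 0x40:__GFP_IO 0x80:__GFP_FS; do
        bit=${pair%%:*}; name=${pair#*:}
        [ $(( mask & bit )) -ne 0 ] && printf '%s\n' "$name"
    done
    return 0
}
# 0xd0 = __GFP_WAIT|__GFP_IO|__GFP_FS = GFP_KERNEL: no __GFP_HIGHMEM,
# so the allocation cannot be served from the 6GB+ free in HighMem.
decode_gfp 0xd0
```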

ramdisk is a nasty thing - cannot you use ramfs or tmpfs?
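[A tmpfs replacement for the 750MB ramdisk could look like the following; the mountpoint is illustrative. Unlike rd, tmpfs pages are ordinary pagecache, so they can live in highmem and be swapped out under pressure:]

```shell
# Replace the ramdisk with a size-capped tmpfs (mountpoint is hypothetical).
mkdir -p /mnt/ram
mount -t tmpfs -o size=750m tmpfs /mnt/ram
# Equivalent /etc/fstab entry:
#   tmpfs  /mnt/ram  tmpfs  size=750m  0  0
```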


* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-12 19:15 ` Andrew Morton
@ 2007-04-12 21:30   ` Cameron Schaus
  2007-04-13 22:39   ` Jason Lunz
  1 sibling, 0 replies; 9+ messages in thread
From: Cameron Schaus @ 2007-04-12 21:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton wrote:
> All of ZONE_NORMAL got used by ramdisk, and networking wants to
> allocate a page from ZONE_NORMAL.  An oom-killing is the correct
> response, although probably not effective.
>
> ramdisk is a nasty thing - cannot you use ramfs or tmpfs?
>   
Sure enough, changing the ramdisk to a tmpfs did the trick.  No more OOM 
(at least for now).

Thanks!
Cam




* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-12 19:15 ` Andrew Morton
  2007-04-12 21:30   ` Cameron Schaus
@ 2007-04-13 22:39   ` Jason Lunz
  2007-04-13 22:46     ` Andrew Morton
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Lunz @ 2007-04-13 22:39 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Cameron Schaus, linux-kernel

On Thu, Apr 12, 2007 at 12:15:53PM -0700, Andrew Morton wrote:
> All of ZONE_NORMAL got used by ramdisk, and networking wants to
> allocate a page from ZONE_NORMAL.  An oom-killing is the correct
> response, although probably not effective.
> 
> ramdisk is a nasty thing - cannot you use ramfs or tmpfs?

What do you mean by "nasty thing"? I've heard that about loopback too.

If I want to run a system entirely from ram with a compressed filesystem
image mounted on /, is it better to store that image in a ramdisk, or on
a tmpfs and mount it via loopback?

Jason


* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-13 22:39   ` Jason Lunz
@ 2007-04-13 22:46     ` Andrew Morton
  2007-04-13 22:54       ` William Lee Irwin III
  2007-04-13 23:01       ` Jason Lunz
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2007-04-13 22:46 UTC (permalink / raw)
  To: Jason Lunz; +Cc: Cameron Schaus, linux-kernel

On Fri, 13 Apr 2007 18:39:36 -0400
Jason Lunz <lunz@falooley.org> wrote:

> On Thu, Apr 12, 2007 at 12:15:53PM -0700, Andrew Morton wrote:
> > All of ZONE_NORMAL got used by ramdisk, and networking wants to
> > allocate a page from ZONE_NORMAL.  An oom-killing is the correct
> > response, although probably not effective.
> > 
> > ramdisk is a nasty thing - cannot you use ramfs or tmpfs?
> 
> What do you mean by "nasty thing"?

It's just weird - it exploits internal knowledge of VFS behaviour, diddles
with pagecache within a fake disk strategy handler, etc.

Furthermore, because it pretends to be a block device, the VFS will not use
highmem pages when accessing the ramdisk.  So the 8GB machine will go splat
with only 800MB of ramdisk.

ramfs is much cleaner and does not have that limitation.

> I've heard that about loopback too.

loopback does some pretty weird things too, but it has more of an excuse:
it is a specialised layering thing, whereas ramdisk is, umm, just a
ramdisk.

> If I want to run a system entirely from ram with a compressed filesystem
> image mounted on /, is it better to store that image in a ramdisk, or on
> a tmpfs and mount it via loopback?

Store it all in ramfs, no loopback needed?
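[A sketch of that ramfs approach, with hypothetical mountpoint and image path. ramfs files are plain pagecache, so on an i686 highmem box they are not confined to ZONE_NORMAL:]

```shell
# Unpack the system image straight into a ramfs; no block device, no loop.
mkdir -p /mnt/root
mount -t ramfs ramfs /mnt/root
tar -C /mnt/root -xzf /boot/rootimage.tar.gz   # hypothetical image path
# Caveat: ramfs enforces no size limit and its pages never swap; tmpfs
# with -o size= is safer if the image size is not tightly controlled.
```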


* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-13 22:46     ` Andrew Morton
@ 2007-04-13 22:54       ` William Lee Irwin III
  2007-04-13 23:32         ` Andrew Morton
  2007-04-13 23:01       ` Jason Lunz
  1 sibling, 1 reply; 9+ messages in thread
From: William Lee Irwin III @ 2007-04-13 22:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jason Lunz, Cameron Schaus, linux-kernel

On Fri, Apr 13, 2007 at 03:46:53PM -0700, Andrew Morton wrote:
> It's just weird - it exploits internal knowledge of VFS behaviour, diddles
> with pagecache within a fake disk strategy handler, etc.
> Furthermore, because it pretends to be a block device, the VFS will not use
> highmem pages when accessing the ramdisk.  So the 8GB machine will go splat
> with only 800MB of ramdisk.
> ramfs is much cleaner and does not have that limitation.

After all this time, bdevs are still lowmem etc. Crying shame.


-- wli


* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-13 22:46     ` Andrew Morton
  2007-04-13 22:54       ` William Lee Irwin III
@ 2007-04-13 23:01       ` Jason Lunz
  1 sibling, 0 replies; 9+ messages in thread
From: Jason Lunz @ 2007-04-13 23:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Cameron Schaus, linux-kernel

On Fri, Apr 13, 2007 at 03:46:53PM -0700, Andrew Morton wrote:
> > If I want to run a system entirely from ram with a compressed filesystem
> > image mounted on /, is it better to store that image in a ramdisk, or on
> > a tmpfs and mount it via loopback?
> 
> Store it all in ramfs, no loopback needed?

I used to put everything in a tmpfs on /, and that certainly works. But
most files in a typical image are rarely used and it's a pity to have
lots of little files taking up a 4k page each.

You get pretty big savings by compressing the system into a squashfs and
mounting that, so the question becomes: where to put the squashfs?
ramdisk or loopback mount it from tmpfs/ramfs?

iirc, the problems with loopback have to do with writeout, which isn't a
problem here since squashfs is readonly.
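[The tmpfs-plus-loopback variant being weighed here could be sketched as follows; image path, size, and mountpoints are all illustrative:]

```shell
# Park the squashfs image on a small tmpfs and loop-mount it read-only.
mkdir -p /mnt/ram /mnt/root
mount -t tmpfs -o size=256m tmpfs /mnt/ram    # image data can sit in highmem
cp /boot/root.sqsh /mnt/ram/                  # hypothetical squashfs image
mount -t squashfs -o loop,ro /mnt/ram/root.sqsh /mnt/root
```

Since the loop mount is read-only, the loopback writeout issues mentioned above should not apply.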

Jason


* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-13 22:54       ` William Lee Irwin III
@ 2007-04-13 23:32         ` Andrew Morton
  2007-04-13 23:40           ` William Lee Irwin III
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2007-04-13 23:32 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Jason Lunz, Cameron Schaus, linux-kernel

On Fri, 13 Apr 2007 15:54:33 -0700
William Lee Irwin III <wli@holomorphy.com> wrote:

> On Fri, Apr 13, 2007 at 03:46:53PM -0700, Andrew Morton wrote:
> > It's just weird - it exploits internal knowledge of VFS behaviour, diddles
> > with pagecache within a fake disk strategy handler, etc.
> > Furthermore, because it pretends to be a block device, the VFS will not use
> > highmem pages when accessing the ramdisk.  So the 8GB machine will go splat
> > with only 800MB of ramdisk.
> > ramfs is much cleaner and does not have that limitation.
> 
> After all this time, bdevs are still lowmem etc. Crying shame.

One would need to hunt down every use of b_data in filesystems and switch
them to kmap the page.

Possibly it could be done on a per-fs basis.


* Re: 2.6.20 OOM with 8Gb RAM
  2007-04-13 23:32         ` Andrew Morton
@ 2007-04-13 23:40           ` William Lee Irwin III
  0 siblings, 0 replies; 9+ messages in thread
From: William Lee Irwin III @ 2007-04-13 23:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jason Lunz, Cameron Schaus, linux-kernel

On Fri, 13 Apr 2007 15:54:33 -0700 William Lee Irwin III <wli@holomorphy.com> wrote:
>> After all this time, bdevs are still lowmem etc. Crying shame.

On Fri, Apr 13, 2007 at 04:32:42PM -0700, Andrew Morton wrote:
> One would need to hunt down every use of b_data in filesystems and switch
> them to kmap the page.
> Possibly it could be done on a per-fs basis.

Queued right behind a dozen other massive sweeps for me to do.


-- wli


end of thread

Thread overview: 9+ messages
2007-04-12 17:38 2.6.20 OOM with 8Gb RAM Cameron Schaus
2007-04-12 19:15 ` Andrew Morton
2007-04-12 21:30   ` Cameron Schaus
2007-04-13 22:39   ` Jason Lunz
2007-04-13 22:46     ` Andrew Morton
2007-04-13 22:54       ` William Lee Irwin III
2007-04-13 23:32         ` Andrew Morton
2007-04-13 23:40           ` William Lee Irwin III
2007-04-13 23:01       ` Jason Lunz
