* dom0  oom-killer: gfp_mask=0x1d
@ 2005-10-18 20:34 Ted Kaczmarek
  0 siblings, 0 replies; 5+ messages in thread
From: Ted Kaczmarek @ 2005-10-18 20:34 UTC (permalink / raw)
  To: xen-devel

I had the dom0, which unfortunately didn't have a console on it, hit a
race condition, and I also saw OOM errors on it. This happened after it
had been running for over 36 hours with a domU whose load average was
around 3 most of the time.

changeset:   7396:9b51e7637676

Dom0 - UP i686, Centos 4.1, 768 megs

domU-1 92 megs snmpd
domU-2 92 megs snmpdd
domU-3 410 megs postgres, tomcat 5.5, opennms head on java 5

Got this from syslog

 Oct 18 10:31:28 tarkus kernel: peth0: received packet with  own address
as source address
Oct 18 10:32:03 tarkus kernel: peth0: received packet with  own address
as source address
Oct 18 10:33:53 tarkus kernel: oom-killer: gfp_mask=0xd2
Oct 18 10:33:53 tarkus kernel: DMA per-cpu:
Oct 18 10:33:53 tarkus kernel: cpu 0 hot: low 30, high 90, batch 15
Oct 18 10:33:53 tarkus kernel: cpu 0 cold: low 0, high 30, batch 15
Oct 18 10:33:53 tarkus kernel: Normal per-cpu: empty
Oct 18 10:33:53 tarkus kernel: HighMem per-cpu: empty
Oct 18 10:33:53 tarkus kernel:
Oct 18 10:33:53 tarkus kernel: Free pages:        1948kB (0kB HighMem)
Oct 18 10:34:02 tarkus kernel: Active:899 inactive:695 dirty:0
writeback:7 unstable:0 free:487 slab:28555 mapped:815 pagetables:206
Oct 18 10:34:09 tarkus crond(pam_unix)[25117]: session closed for user
root
Oct 18 10:34:16 tarkus kernel: DMA free:1948kB min:1480kB low:1848kB
high:2220kB active:3596kB inactive:2780kB present:137200kB
pages_scanned:1480 all_unreclaimable? no
Oct 18 10:34:42 tarkus kernel: lowmem_reserve[]: 0 0 0
Oct 18 10:34:53 tarkus kernel: Normal free:0kB min:0kB low:0kB high:0kB
active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable?
no
Oct 18 10:35:01 tarkus kernel: lowmem_reserve[]: 0 0 0
Oct 18 10:35:02 tarkus kernel: HighMem free:0kB min:128kB low:160kB
high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Oct 18 10:35:10 tarkus kernel: lowmem_reserve[]: 0 0 0
Oct 18 10:35:17 tarkus crond(pam_unix)[25252]: session opened for user
root by (uid=0)
Oct 18 10:35:22 tarkus kernel: DMA: 113*4kB 9*8kB 3*16kB 1*32kB 1*64kB
0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1948kB
Oct 18 10:35:28 tarkus kernel: Normal: empty
Oct 18 10:35:37 tarkus kernel: HighMem: empty
Oct 18 10:35:43 tarkus kernel: Swap cache: add 164344, delete 163838,
find 45906/87552, race 0+46
Oct 18 10:35:43 tarkus kernel: Free swap  = 1553068kB
Oct 18 10:35:43 tarkus kernel: Total swap = 1572856kB
Oct 18 10:35:43 tarkus kernel: Out of Memory: Killed process 25124
(sendmail).
Oct 18 10:35:52 tarkus snmpd[3925]: send response:  (if_nameindex()
failed)
Oct 18 10:36:06 tarkus snmpd[3925]: send response:
Oct 18 10:36:26 tarkus snmpd[3925]: send response:
Oct 18 10:37:00 tarkus last message repeated 4 times
Oct 18 10:37:02 tarkus kernel: peth0: received packet with  own address
as source address

Regards,
Ted


* RE: dom0  oom-killer: gfp_mask=0x1d
@ 2005-10-18 22:10 Ian Pratt
  2005-10-18 22:37 ` Ted Kaczmarek
  2005-10-18 23:44 ` Ted Kaczmarek
  0 siblings, 2 replies; 5+ messages in thread
From: Ian Pratt @ 2005-10-18 22:10 UTC (permalink / raw)
  To: Ted Kaczmarek, xen-devel

> I had the dom0 which unfortunately didn't have a console on it hit a
> race condition and saw oom errors on it also. This happened 
> after it was
> running for over 36 hours with a domU whose load average was avg was
> around 3 most of the time.

Were you running anything in the dom0 other than xend etc?

It looks like the slab cache in dom0 is over 100MB, so having the output
of /proc/slabinfo would be interesting when the machine is in this
state. It might be good to look at this over time and see if something is
being leaked.

Ian
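
For reference, the "over 100MB" figure can be reproduced from the
slab:28555 counter in the quoted log. The sketch below assumes that the
counter is in pages and that the page size is 4 kB (usual for i686, but
not stated in the thread); the function name is only illustrative.

```python
# Assumption: the "slab:" counter in the OOM dump is a page count,
# and pages are 4 kB each.
PAGE_SIZE_KB = 4


def pages_to_mb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a page count to megabytes (1 MB = 1024 kB)."""
    return pages * page_size_kb / 1024


slab_pages = 28555          # from "slab:28555" in the log
zone_present_kb = 137200    # from "DMA ... present:137200kB" in the log

slab_mb = pages_to_mb(slab_pages)
slab_share = slab_pages * PAGE_SIZE_KB / zone_present_kb

print(f"slab: {slab_mb:.1f} MB, {slab_share:.0%} of the DMA zone")
```

Under those assumptions the slab accounts for most of the memory the
dom0 kernel reports as present, which fits the suggestion to inspect
/proc/slabinfo.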



* RE: dom0  oom-killer: gfp_mask=0x1d
  2005-10-18 22:10 dom0 oom-killer: gfp_mask=0x1d Ian Pratt
@ 2005-10-18 22:37 ` Ted Kaczmarek
  2005-10-18 23:44 ` Ted Kaczmarek
  1 sibling, 0 replies; 5+ messages in thread
From: Ted Kaczmarek @ 2005-10-18 22:37 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel

On Tue, 2005-10-18 at 23:10 +0100, Ian Pratt wrote:
> > I had the dom0 which unfortunately didn't have a console on it hit a
> > race condition and saw oom errors on it also. This happened 
> > after it was
> > running for over 36 hours with a domU whose load average was avg was
> > around 3 most of the time.
> 
> Were you running anything in the dom0 other than xend etc?
> 
> It looks like the slabcache it dom0 is over 100MB, so having the ouput
> of /proc/slabinfo would be interesting when the machine is in this
> state. It might be good to look at this overtime and see if something is
> being leaked.

zebra, ospfd, bgpd, smartd, cupsd, snmpd, and acpid. Normally I am below
100 megs usage.

[root@tarkus ~]# free
          total       used       free     shared    buffers     cached
Mem:      132212      96796      35416          0      12308      40124
-/+ buffers/cache:    44364      87848
Swap:     1572856         0    1572856

This usage is fairly consistent. I don't have any snapshots from right
before it happened :-)

Anything specific in slabinfo to look for ? 

Regards,
Ted

* RE: dom0  oom-killer: gfp_mask=0x1d
  2005-10-18 22:10 dom0 oom-killer: gfp_mask=0x1d Ian Pratt
  2005-10-18 22:37 ` Ted Kaczmarek
@ 2005-10-18 23:44 ` Ted Kaczmarek
  1 sibling, 0 replies; 5+ messages in thread
From: Ted Kaczmarek @ 2005-10-18 23:44 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel

On Tue, 2005-10-18 at 23:10 +0100, Ian Pratt wrote:
> > I had the dom0 which unfortunately didn't have a console on it hit a
> > race condition and saw oom errors on it also. This happened 
> > after it was
> > running for over 36 hours with a domU whose load average was avg was
> > around 3 most of the time.
> 
> Were you running anything in the dom0 other than xend etc?
> 

zebra, ospfd, bgpd, smartd, cupsd, snmpd, and acpid. Normally I am below
100 megs usage.

[root@tarkus ~]# free
          total       used       free     shared    buffers     cached
Mem:      132212      96796      35416          0      12308      40124
-/+ buffers/cache:    44364      87848
Swap:     1572856         0    1572856

That is fairly consistent. I don't have any snapshots from right before
it happened :-)


Regards,
Ted


* RE: dom0  oom-killer: gfp_mask=0x1d
@ 2005-10-19  6:19 Ian Pratt
  0 siblings, 0 replies; 5+ messages in thread
From: Ian Pratt @ 2005-10-19  6:19 UTC (permalink / raw)
  To: Ted Kaczmarek; +Cc: xen-devel

> > It looks like the slabcache it dom0 is over 100MB, so 
> having the ouput 
> > of /proc/slabinfo would be interesting when the machine is in this 
> > state. It might be good to look at this overtime and see if 

> Anything specific in slabinfo to look for ? 

Look for an object type with an abnormally large number of objects in it
(e.g. where size*num > 40MB). In particular, keep an eye out for an
object type which seems to grow steadily over time.

Ian
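
The check described above could be scripted roughly as follows. This is
only a sketch: it assumes the slabinfo 2.x column layout (name,
active_objs, num_objs, objsize, ...), and the function names and the
40MB threshold constant are illustrative, not part of any existing tool.

```python
THRESHOLD_BYTES = 40 * 1024 * 1024  # the size*num > 40MB rule of thumb


def parse_slabinfo(text):
    """Return {cache_name: num_objs * objsize} from slabinfo 2.x text."""
    totals = {}
    for line in text.splitlines():
        # Skip blank lines and the two header lines.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            num_objs, objsize = int(fields[2]), int(fields[3])
        except ValueError:
            continue  # not a data line in the assumed layout
        totals[fields[0]] = num_objs * objsize
    return totals


def large_caches(text, threshold=THRESHOLD_BYTES):
    """List (name, bytes) for caches above the threshold, largest first."""
    hits = [(n, b) for n, b in parse_slabinfo(text).items() if b > threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


def growth(before_text, after_text):
    """Return {name: byte delta} for caches that grew between two snapshots."""
    before, after = parse_slabinfo(before_text), parse_slabinfo(after_text)
    return {n: b - before.get(n, 0) for n, b in after.items()
            if b > before.get(n, 0)}


def report(path="/proc/slabinfo"):
    """Print the caches above the threshold; reading may require root."""
    with open(path) as f:
        for name, total in large_caches(f.read()):
            print(f"{name}: {total / 2**20:.1f} MB")
```

Calling report() on the dom0 gives the one-shot view. Saving the file
periodically (for example from cron) and passing two saved snapshots to
growth() shows which caches keep growing.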
