* dom0 oom-killer: gfp_mask=0x1d
@ 2005-10-18 20:34 Ted Kaczmarek
0 siblings, 0 replies; 5+ messages in thread
From: Ted Kaczmarek @ 2005-10-18 20:34 UTC (permalink / raw)
To: xen-devel
I had the dom0, which unfortunately didn't have a console on it, hit a
race condition and saw oom errors on it also. This happened after it
had been running for over 36 hours with a domU whose load average was
around 3 most of the time.
changeset: 7396:9b51e7637676
Dom0 - UP i686, Centos 4.1, 768 megs
domU-1 92 megs snmpd
domU-2 92 megs snmpdd
domU-3 410 megs postgres, tomcat 5.5, opennms head on java 5
Got this from syslog
Oct 18 10:31:28 tarkus kernel: peth0: received packet with own address as source address
Oct 18 10:32:03 tarkus kernel: peth0: received packet with own address as source address
Oct 18 10:33:53 tarkus kernel: oom-killer: gfp_mask=0xd2
Oct 18 10:33:53 tarkus kernel: DMA per-cpu:
Oct 18 10:33:53 tarkus kernel: cpu 0 hot: low 30, high 90, batch 15
Oct 18 10:33:53 tarkus kernel: cpu 0 cold: low 0, high 30, batch 15
Oct 18 10:33:53 tarkus kernel: Normal per-cpu: empty
Oct 18 10:33:53 tarkus kernel: HighMem per-cpu: empty
Oct 18 10:33:53 tarkus kernel:
Oct 18 10:33:53 tarkus kernel: Free pages: 1948kB (0kB HighMem)
Oct 18 10:34:02 tarkus kernel: Active:899 inactive:695 dirty:0 writeback:7 unstable:0 free:487 slab:28555 mapped:815 pagetables:206
Oct 18 10:34:09 tarkus crond(pam_unix)[25117]: session closed for user root
Oct 18 10:34:16 tarkus kernel: DMA free:1948kB min:1480kB low:1848kB high:2220kB active:3596kB inactive:2780kB present:137200kB pages_scanned:1480 all_unreclaimable? no
Oct 18 10:34:42 tarkus kernel: lowmem_reserve[]: 0 0 0
Oct 18 10:34:53 tarkus kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Oct 18 10:35:01 tarkus kernel: lowmem_reserve[]: 0 0 0
Oct 18 10:35:02 tarkus kernel: HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Oct 18 10:35:10 tarkus kernel: lowmem_reserve[]: 0 0 0
Oct 18 10:35:17 tarkus crond(pam_unix)[25252]: session opened for user root by (uid=0)
Oct 18 10:35:22 tarkus kernel: DMA: 113*4kB 9*8kB 3*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1948kB
Oct 18 10:35:28 tarkus kernel: Normal: empty
Oct 18 10:35:37 tarkus kernel: HighMem: empty
Oct 18 10:35:43 tarkus kernel: Swap cache: add 164344, delete 163838, find 45906/87552, race 0+46
Oct 18 10:35:43 tarkus kernel: Free swap = 1553068kB
Oct 18 10:35:43 tarkus kernel: Total swap = 1572856kB
Oct 18 10:35:43 tarkus kernel: Out of Memory: Killed process 25124 (sendmail).
Oct 18 10:35:52 tarkus snmpd[3925]: send response: (if_nameindex() failed)
Oct 18 10:36:06 tarkus snmpd[3925]: send response:
Oct 18 10:36:26 tarkus snmpd[3925]: send response:
Oct 18 10:37:00 tarkus last message repeated 4 times
Oct 18 10:37:02 tarkus kernel: peth0: received packet with own address as source address
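As a sanity check on the report above, the per-order free list for the
DMA zone should add up to the "Free pages" total, and so should the
free:487 page count. A minimal sketch, assuming 4 kB pages (the names
BUDDY_LINE, FREE_PAGES and PAGE_KB are made up for this example):

```python
import re

# Figures copied from the oom-killer report in the syslog excerpt.
BUDDY_LINE = ("DMA: 113*4kB 9*8kB 3*16kB 1*32kB 1*64kB 0*128kB 1*256kB "
              "0*512kB 1*1024kB 0*2048kB 0*4096kB = 1948kB")
FREE_PAGES = 487   # "free:487" in the page summary line
PAGE_KB = 4        # assumption: 4 kB pages on i686


def buddy_total_kb(line):
    """Sum count*size over every 'N*SkB' term in a buddy free-list line."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size_kb) for count, size_kb in terms)


total_kb = buddy_total_kb(BUDDY_LINE)
print(total_kb, FREE_PAGES * PAGE_KB)  # 1948 1948
```

Both figures agree with the 1948kB the kernel printed, so the zone
really had under 2 MB free when the oom-killer ran.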
Regards,
Ted
* RE: dom0 oom-killer: gfp_mask=0x1d
@ 2005-10-18 22:10 Ian Pratt
2005-10-18 22:37 ` Ted Kaczmarek
2005-10-18 23:44 ` Ted Kaczmarek
0 siblings, 2 replies; 5+ messages in thread
From: Ian Pratt @ 2005-10-18 22:10 UTC (permalink / raw)
To: Ted Kaczmarek, xen-devel
> I had the dom0, which unfortunately didn't have a console on it, hit a
> race condition and saw oom errors on it also. This happened after it
> had been running for over 36 hours with a domU whose load average was
> around 3 most of the time.
Were you running anything in the dom0 other than xend etc?
It looks like the slab cache in dom0 is over 100MB, so having the output
of /proc/slabinfo would be interesting when the machine is in this
state. It might be good to look at this over time and see if something is
being leaked.
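The 100MB estimate can be reproduced from the slab:28555 figure in the
quoted log, which is a page count. A rough sketch, assuming 4 kB pages
(the variable names are illustrative only):

```python
PAGE_KB = 4                # assumption: 4 kB pages on i686
SLAB_PAGES = 28555         # "slab:28555" from the oom-killer report
ZONE_PRESENT_KB = 137200   # "present:137200kB" for the DMA zone

slab_kb = SLAB_PAGES * PAGE_KB
slab_mib = slab_kb / 1024
share = slab_kb / ZONE_PRESENT_KB

# Prints: slab: 114220 kB (111.5 MiB), 83% of the zone
print(f"slab: {slab_kb} kB ({slab_mib:.1f} MiB), {share:.0%} of the zone")
```

By this reckoning the slab accounts for most of the memory present in
the only populated zone.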
Ian
* RE: dom0 oom-killer: gfp_mask=0x1d
2005-10-18 22:10 Ian Pratt
@ 2005-10-18 22:37 ` Ted Kaczmarek
2005-10-18 23:44 ` Ted Kaczmarek
1 sibling, 0 replies; 5+ messages in thread
From: Ted Kaczmarek @ 2005-10-18 22:37 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
On Tue, 2005-10-18 at 23:10 +0100, Ian Pratt wrote:
> > I had the dom0, which unfortunately didn't have a console on it, hit a
> > race condition and saw oom errors on it also. This happened after it
> > had been running for over 36 hours with a domU whose load average was
> > around 3 most of the time.
>
> Were you running anything in the dom0 other than xend etc?
>
> It looks like the slab cache in dom0 is over 100MB, so having the output
> of /proc/slabinfo would be interesting when the machine is in this
> state. It might be good to look at this over time and see if something is
> being leaked.
zebra, ospfd, bgpd, smartd, cupsd, snmpd and acpid. Normally I am below
100 megs of usage.
[root@tarkus ~]# free
             total       used       free     shared    buffers     cached
Mem:        132212      96796      35416          0      12308      40124
-/+ buffers/cache:      44364      87848
Swap:      1572856          0    1572856
This usage is fairly consistent; I don't have any snapshots from right
before it happened :-)
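For reference, the "-/+ buffers/cache" row in the free output is
derived from the "Mem:" row. A small sketch of that arithmetic, with
values in kB copied from the output (variable names are illustrative):

```python
# Values in kB from the "Mem:" row of the free output.
used, free_kb, buffers, cached = 96796, 35416, 12308, 40124

# "-/+ buffers/cache" used column: memory held by processes and the
# kernel once buffers and page cache are excluded.
used_minus_cache = used - buffers - cached

# "-/+ buffers/cache" free column: memory that is free or reclaimable
# from buffers and page cache.
free_plus_cache = free_kb + buffers + cached

print(used_minus_cache, free_plus_cache)  # 44364 87848
```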
Is there anything specific to look for in slabinfo?
Regards,
Ted
* RE: dom0 oom-killer: gfp_mask=0x1d
2005-10-18 22:10 Ian Pratt
2005-10-18 22:37 ` Ted Kaczmarek
@ 2005-10-18 23:44 ` Ted Kaczmarek
1 sibling, 0 replies; 5+ messages in thread
From: Ted Kaczmarek @ 2005-10-18 23:44 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
On Tue, 2005-10-18 at 23:10 +0100, Ian Pratt wrote:
> > I had the dom0, which unfortunately didn't have a console on it, hit a
> > race condition and saw oom errors on it also. This happened after it
> > had been running for over 36 hours with a domU whose load average was
> > around 3 most of the time.
>
> Were you running anything in the dom0 other than xend etc?
>
zebra, ospfd, bgpd, smartd, cupsd, snmpd and acpid. Normally I am below
100 megs of usage.
[root@tarkus ~]# free
             total       used       free     shared    buffers     cached
Mem:        132212      96796      35416          0      12308      40124
-/+ buffers/cache:      44364      87848
Swap:      1572856          0    1572856
That is fairly consistent; I don't have any snapshots from right before
it happened :-)
Regards,
Ted
* RE: dom0 oom-killer: gfp_mask=0x1d
@ 2005-10-19 6:19 Ian Pratt
0 siblings, 0 replies; 5+ messages in thread
From: Ian Pratt @ 2005-10-19 6:19 UTC (permalink / raw)
To: Ted Kaczmarek; +Cc: xen-devel
> > It looks like the slab cache in dom0 is over 100MB, so having the output
> > of /proc/slabinfo would be interesting when the machine is in this
> > state. It might be good to look at this over time and see if
> Anything specific in slabinfo to look for ?
Look for an object type with an abnormally large number of objects in it
(e.g. where size*num > 40MB). In particular, keep an eye out for an
object type which seems to grow steadily over time.
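One way to apply both checks is sketched below. It assumes the
slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...);
the helper names and the SAMPLE text are made up for illustration and
are not real output from this machine.

```python
THRESHOLD_BYTES = 40 * 1024 * 1024  # the "size*num > 40MB" rule of thumb


def parse_slabinfo(text):
    """Return {cache_name: (num_objs, objsize_bytes)} from slabinfo 2.x text."""
    caches = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue  # skip blank lines, the version banner and the header
        fields = line.split()
        # Assumed columns: name active_objs num_objs objsize ...
        caches[fields[0]] = (int(fields[2]), int(fields[3]))
    return caches


def large_caches(caches, threshold=THRESHOLD_BYTES):
    """Caches whose num_objs * objsize exceeds the threshold, largest first."""
    sized = [(name, num * size) for name, (num, size) in caches.items()]
    big = [item for item in sized if item[1] > threshold]
    return sorted(big, key=lambda item: item[1], reverse=True)


def growth(before, after):
    """Change in object count per cache between two snapshots."""
    return {name: after[name][0] - before.get(name, (0, 0))[0]
            for name in after}


# Invented sample data, for demonstration only.
SAMPLE = """\
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
example_leaky 400000 410000 128 30 1 : tunables 120 60 0 : slabdata 13667 13667 0
example_small 1200 1500 256 15 1 : tunables 120 60 0 : slabdata 100 100 0
"""

snapshot = parse_slabinfo(SAMPLE)
for name, size_bytes in large_caches(snapshot):
    print(f"{name}: {size_bytes / (1024 * 1024):.1f} MiB")
```

On a live system the text would come from reading /proc/slabinfo (which
usually needs root). Saving a snapshot every few minutes and passing
consecutive pairs to growth() would show which cache keeps climbing.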
Ian