* cgroup memory limit problems
@ 2011-11-09 20:57 Arkadiusz Miśkiewicz
From: Arkadiusz Miśkiewicz @ 2011-11-09 20:57 UTC
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Hi,
I have a machine with 6GB of ram and a cgroup for apache processes limited to
memory.limit_in_bytes = "5100M";
memory.soft_limit_in_bytes = "5000M";
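For reference, a minimal sketch of what these two values mean in bytes, assuming the K/M/G suffixes are read as powers of 1024. The helper name to_bytes is hypothetical and only for illustration. The hard-limit result agrees with the hierarchical_memory_limit of 5347737600 and the 5222400kB limit reported later in this thread.

```shell
#!/bin/sh
# to_bytes (hypothetical helper): convert a size such as "5100M" to bytes,
# treating K, M and G as powers of 1024.
to_bytes() {
    value="$1"
    case "$value" in
        *[Kk]) echo $(( ${value%?} * 1024 )) ;;
        *[Mm]) echo $(( ${value%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${value%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$value" ;;   # no suffix: already in bytes
    esac
}

to_bytes 5100M   # hard limit -> 5347737600
to_bytes 5000M   # soft limit -> 5242880000
```

The gap between the soft and the hard limit is therefore 100 MiB (104857600 bytes).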
Unfortunately, when the apache processes eat all the RAM assigned to their cgroup, the load
on the whole machine goes through the roof.
The cgroup-aware OOM killer kicks in and kills one process, but that doesn't help.
If I'm fast enough to notice, the apache processes then need tons of kill -9
(I run "killall -9 apache" in a while (true) loop for 20-30s) to get
killed, and that doesn't always succeed: sometimes I'm unable to kill them and
I just do sysrq u, s, b after a few minutes, if I'm lucky. Sometimes I
cannot run any command at all.
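As an alternative to matching processes by name, the tasks file of a cgroup (v1 interface) lists the IDs of the tasks it contains, so a kill loop can target exactly that group. The following is only a sketch: the function name kill_cgroup_tasks is hypothetical, the cgroup directory is passed in as an argument, and nothing here guarantees that the kill takes effect in the situation described above, since a task stuck in uninterruptible sleep cannot act on SIGKILL until it wakes up.

```shell
#!/bin/sh
# kill_cgroup_tasks (hypothetical helper): send SIGKILL to every ID listed
# in the "tasks" file of the given cgroup directory.
kill_cgroup_tasks() {
    cg_dir="$1"
    while read -r pid; do
        # A task may already have exited between the read and the kill.
        kill -9 "$pid" 2>/dev/null
    done < "$cg_dir/tasks"
    return 0
}

# Example call (needs root; path as shown later in this thread):
#   kill_cgroup_tasks /dev/cgroup/memory/somecgroup/httpd
```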
This all happens on a 2.6.38.8 kernel. See
http://ixion.pld-linux.org/~arekm/cgroup-eaten-memory-failure-1.txt
for the kernel log. It ends with a reboot of the machine.
Now the question is: is this how the cgroup memory limit is supposed to work? If
yes, then it's hardly usable, but maybe there are some patches for this?
Would newer kernels be better? (I cannot test immediately, since newer kernels
kill IPMI connectivity on that machine.) I would expect the apache processes to be
killed without taking down the entire machine.
PS. I found a similar story from a year ago at
http://serverfault.com/questions/211509/how-to-stop-apache-from-crashing-my-entire-server
--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers
* Re: cgroup memory limit problems
From: Daniel Lezcano @ 2011-11-09 22:45 UTC
To: Arkadiusz Miśkiewicz
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On 11/09/2011 09:57 PM, Arkadiusz Miśkiewicz wrote:
> Hi,
>
> I have a machine with 6GB of ram and a cgroup for apache processes limited to
> memory.limit_in_bytes = "5100M";
> memory.soft_limit_in_bytes = "5000M";
>
> Unfortunately when apache processes ate all ram assigned to their cgroup load
> on whole machine jumps the roof.
>
> cgroup aware OOM kicks in, kills one process and that doesn't help.

Did you try disabling the OOM killer through the "memory.oom_control" file by
setting it to "1"?

If the cgroup runs out of memory, its tasks should be stopped until you give
it more memory or kill some tasks.

Hope that helps
-- Daniel
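The suggestion above can be sketched as follows, assuming the cgroup v1 memory controller. The function name disable_memcg_oom is hypothetical and the cgroup directory is passed as an argument; this is an illustration, not a tested recipe for the machine in question.

```shell
#!/bin/sh
# disable_memcg_oom (hypothetical helper): write 1 to memory.oom_control in
# the given cgroup directory, then print the file back for verification.
disable_memcg_oom() {
    cg_dir="$1"
    # With the OOM killer disabled, tasks that hit the limit are paused
    # instead of one of them being killed.
    echo 1 > "$cg_dir/memory.oom_control" || return 1
    # On a real cgroup the kernel reports oom_kill_disable and under_oom here.
    cat "$cg_dir/memory.oom_control"
}

# Example call (needs root; path as shown later in this thread):
#   disable_memcg_oom /dev/cgroup/memory/somecgroup/httpd
```

Writing 0 to the same file turns the OOM killer back on.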
* Re: cgroup memory limit problems
From: Arkadiusz Miśkiewicz @ 2011-11-09 23:01 UTC
To: Daniel Lezcano
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Wednesday 09 of November 2011, Daniel Lezcano wrote:
> On 11/09/2011 09:57 PM, Arkadiusz Miśkiewicz wrote:
> > Hi,
> >
> > I have a machine with 6GB of ram and a cgroup for apache processes
> > limited to
> >
> > memory.limit_in_bytes = "5100M";
> > memory.soft_limit_in_bytes = "5000M";
> >
> > Unfortunately when apache processes ate all ram assigned to their cgroup
> > load on whole machine jumps the roof.
> >
> > cgroup aware OOM kicks in, kills one process and that doesn't help.
>
> Did you try to disable the oom with the "memory.oom_control" file by
> setting it to "1" ?

No, will try.

> If the cgroup runs out of memory they should be stopped until you give
> more memory or kill some tasks.

I hope these tasks can still be killed once OOM is disabled. Right now it's
often impossible to kill them.

Anyway, I would prefer the httpd processes to be killed automatically,
without manual intervention.

> Hope that helps
> -- Daniel

--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/
* Re: cgroup memory limit problems
From: Balbir Singh @ 2011-11-09 22:51 UTC
To: Arkadiusz Miśkiewicz
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

2011/11/10 Arkadiusz Miśkiewicz <arekm@maven.pl>:
>
> Hi,
>
> I have a machine with 6GB of ram and a cgroup for apache processes limited to
> memory.limit_in_bytes = "5100M";
> memory.soft_limit_in_bytes = "5000M";
>

Can you please send the output of memory.stat?

> Unfortunately when apache processes ate all ram assigned to their cgroup load
> on whole machine jumps the roof.
>
> cgroup aware OOM kicks in, kills one process and that doesn't help.
>
> If I'm fast enough I notice and then apache processes require tons of kill -9
> (I'm doing "killall -9 apache" in a while (true) loop for 20-30s) to get
> killed (and that not always succeeds - sometimes I'm unable to kill these and
> I'm just doing sysrq u, s, b after few minutes.. if I'm lucky. Sometimes I
> cannot do any command).
>
> This all happens on 2.6.38.8 kernel.
> http://ixion.pld-linux.org/~arekm/cgroup-eaten-memory-failure-1.txt
> for kernel log. It ends with reboot of the machine.
>

From the logs

"Nov 9 20:48:53 tm2 kernel: [18300.349106] Task in /somecgroup/httpd killed as a result of limit of /somecgroup/httpd
Nov 9 20:48:53 tm2 kernel: [18300.349110] memory: usage 5222400kB, limit 5222400kB, failcnt 282869
Nov 9 20:48:53 tm2 kernel: [18300.349113] memory+swap: usage 0kB, limit 9007199254740991kB, failcnt 0"

It seems like you've disabled swap, is that correct?

Balbir
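The "memory:" line quoted from the log carries the three numbers that matter here: current usage, the limit, and the count of failed charges. A small sketch for pulling them out of such a line, assuming the exact wording shown in the quoted log; the function name parse_memcg_line is hypothetical.

```shell
#!/bin/sh
# parse_memcg_line (hypothetical helper): print "usage limit failcnt" in kB
# from a "memory: usage ..." line of a memcg OOM report. The "memory+swap:"
# line does not match the pattern and produces no output.
parse_memcg_line() {
    printf '%s\n' "$1" | sed -n \
        's/.*memory: usage \([0-9]*\)kB, limit \([0-9]*\)kB, failcnt \([0-9]*\).*/\1 \2 \3/p'
}

line='Nov 9 20:48:53 tm2 kernel: [18300.349110] memory: usage 5222400kB, limit 5222400kB, failcnt 282869'
parse_memcg_line "$line"   # prints: 5222400 5222400 282869
```

Usage equal to the limit, together with a failcnt in the hundreds of thousands, indicates a group pinned at its hard limit with allocations failing repeatedly.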
* Re: cgroup memory limit problems
From: Balbir Singh @ 2011-11-09 22:53 UTC
To: Arkadiusz Miśkiewicz
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Linux Containers

CC'ing cgroups list

Balbir

2011/11/10 Balbir Singh <bsingharora@gmail.com>:
> 2011/11/10 Arkadiusz Miśkiewicz <arekm@maven.pl>:
>>
>> Hi,
>>
>> I have a machine with 6GB of ram and a cgroup for apache processes limited to
>> memory.limit_in_bytes = "5100M";
>> memory.soft_limit_in_bytes = "5000M";
>>
>
> Can you please send the output of memory.stat?
>
>> Unfortunately when apache processes ate all ram assigned to their cgroup load
>> on whole machine jumps the roof.
>>
>> cgroup aware OOM kicks in, kills one process and that doesn't help.
>>
>> If I'm fast enough I notice and then apache processes require tons of kill -9
>> (I'm doing "killall -9 apache" in a while (true) loop for 20-30s) to get
>> killed (and that not always succeeds - sometimes I'm unable to kill these and
>> I'm just doing sysrq u, s, b after few minutes.. if I'm lucky. Sometimes I
>> cannot do any command).
>>
>> This all happens on 2.6.38.8 kernel.
>> http://ixion.pld-linux.org/~arekm/cgroup-eaten-memory-failure-1.txt
>> for kernel log. It ends with reboot of the machine.
>>
>
> From the logs
>
> "Nov 9 20:48:53 tm2 kernel: [18300.349106] Task in /somecgroup/httpd
> killed as a result of limit of /somecgroup/httpd
> Nov 9 20:48:53 tm2 kernel: [18300.349110] memory: usage 5222400kB,
> limit 5222400kB, failcnt 282869
> Nov 9 20:48:53 tm2 kernel: [18300.349113] memory+swap: usage 0kB,
> limit 9007199254740991kB, failcnt 0"
>
> It seems like you've disabled swap, is that correct?
>
> Balbir
* Re: cgroup memory limit problems
From: Arkadiusz Miśkiewicz @ 2011-11-09 22:58 UTC
To: Balbir Singh
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Wednesday 09 of November 2011, Balbir Singh wrote:
> 2011/11/10 Arkadiusz Miśkiewicz <arekm@maven.pl>:
> > Hi,
> >
> > I have a machine with 6GB of ram and a cgroup for apache processes
> > limited to memory.limit_in_bytes = "5100M";
> > memory.soft_limit_in_bytes = "5000M";
>
> Can you please send the output of memory.stat?

Right now only a small number of users are using this server (it's night here), so:

# cat /dev/cgroup/memory/somecgroup/httpd/memory.stat
cache 511426560
rss 2637795328
mapped_file 2129920
pgpgin 32233755
pgpgout 31509870
inactive_anon 720896
active_anon 2637271040
inactive_file 255959040
active_file 255234048
unevictable 0
hierarchical_memory_limit 5347737600
total_cache 511426560
total_rss 2637795328
total_mapped_file 2129920
total_pgpgin 32233755
total_pgpgout 31509870
total_inactive_anon 720896
total_active_anon 2637271040
total_inactive_file 255959040
total_active_file 255234048
total_unevictable 0

> > Unfortunately when apache processes ate all ram assigned to their cgroup
> > load on whole machine jumps the roof.
> >
> > cgroup aware OOM kicks in, kills one process and that doesn't help.
> >
> > If I'm fast enough I notice and then apache processes require tons of
> > kill -9 (I'm doing "killall -9 apache" in a while (true) loop for
> > 20-30s) to get killed (and that not always succeeds - sometimes I'm
> > unable to kill these and I'm just doing sysrq u, s, b after few
> > minutes.. if I'm lucky. Sometimes I cannot do any command).
> >
> > This all happens on 2.6.38.8 kernel.
> > http://ixion.pld-linux.org/~arekm/cgroup-eaten-memory-failure-1.txt
> > for kernel log. It ends with reboot of the machine.
>
> From the logs
>
> "Nov 9 20:48:53 tm2 kernel: [18300.349106] Task in /somecgroup/httpd
> killed as a result of limit of /somecgroup/httpd
> Nov 9 20:48:53 tm2 kernel: [18300.349110] memory: usage 5222400kB,
> limit 5222400kB, failcnt 282869
> Nov 9 20:48:53 tm2 kernel: [18300.349113] memory+swap: usage 0kB,
> limit 9007199254740991kB, failcnt 0"
>
> It seems like you've disabled swap, is that correct?

Yes, this machine has no swap, but the kernel has

# zcat /proc/config.gz |grep CGROUP_MEM
CONFIG_CGROUP_MEM_RES_CTLR=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
# CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is not set

and the swap controller wasn't enabled at boot.

> Balbir

PS. vger.kernel.org doesn't like my address (no idea why, and postmaster@ is
silent), so my reply likely won't reach cgroup@

--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/
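To put the memory.stat output above in relation to the limit, the cache and rss counters can be added up and compared with hierarchical_memory_limit. This is a rough sketch only: the function name memcg_usage is hypothetical, and on a machine with swap the swap counter would have to be added as well. The demo feeds it the three relevant values from this message through a temporary file.

```shell
#!/bin/sh
# memcg_usage (hypothetical helper): read a memory.stat file and print
# cache+rss in bytes, followed by that sum as a percentage of
# hierarchical_memory_limit.
memcg_usage() {
    awk '
        $1 == "cache"                     { cache = $2 }
        $1 == "rss"                       { rss = $2 }
        $1 == "hierarchical_memory_limit" { limit = $2 }
        END {
            used = cache + rss
            if (limit > 0)
                printf "%.0f %.1f\n", used, 100 * used / limit
        }
    ' "$1"
}

# Demo with the values reported in this message.
stat_file=$(mktemp)
printf '%s\n' 'cache 511426560' 'rss 2637795328' \
    'hierarchical_memory_limit 5347737600' > "$stat_file"
memcg_usage "$stat_file"   # prints: 3149221888 58.9
rm -f "$stat_file"
```

At the time of this snapshot the group was therefore at roughly 59% of its hard limit.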
* Re: cgroup memory limit problems
From: KAMEZAWA Hiroyuki @ 2011-11-10 0:10 UTC
To: Arkadiusz Miśkiewicz
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Wed, 9 Nov 2011 23:58:57 +0100
Arkadiusz Miśkiewicz <arekm@maven.pl> wrote:
> On Wednesday 09 of November 2011, Balbir Singh wrote:
> > 2011/11/10 Arkadiusz Miśkiewicz <arekm@maven.pl>:
> > > Hi,
> > >
> > > I have a machine with 6GB of ram and a cgroup for apache processes
> > > limited to memory.limit_in_bytes = "5100M";
> > > memory.soft_limit_in_bytes = "5000M";
> >
> > Can you please send the output of memory.stat?
>
> Right now there is small number of users using this server (night here) so:
>
> # cat /dev/cgroup/memory/somecgroup/httpd/memory.stat
> cache 511426560
> rss 2637795328
> mapped_file 2129920
> pgpgin 32233755
> pgpgout 31509870
> inactive_anon 720896
> active_anon 2637271040
> inactive_file 255959040
> active_file 255234048
> unevictable 0
> hierarchical_memory_limit 5347737600
> total_cache 511426560
> total_rss 2637795328
> total_mapped_file 2129920
> total_pgpgin 32233755
> total_pgpgout 31509870
> total_inactive_anon 720896
> total_active_anon 2637271040
> total_inactive_file 255959040
> total_active_file 255234048
> total_unevictable 0
>
> > > Unfortunately when apache processes ate all ram assigned to their cgroup
> > > load on whole machine jumps the roof.
> > >
> > > cgroup aware OOM kicks in, kills one process and that doesn't help.
> > >
> > > If I'm fast enough I notice and then apache processes require tons of
> > > kill -9 (I'm doing "killall -9 apache" in a while (true) loop for
> > > 20-30s) to get killed (and that not always succeeds - sometimes I'm
> > > unable to kill these and I'm just doing sysrq u, s, b after few
> > > minutes.. if I'm lucky. Sometimes I cannot do any command).
> > >
> > > This all happens on 2.6.38.8 kernel.
> > > http://ixion.pld-linux.org/~arekm/cgroup-eaten-memory-failure-1.txt
> > > for kernel log. It ends with reboot of the machine.
> >
> > From the logs
> >
> > "Nov 9 20:48:53 tm2 kernel: [18300.349106] Task in /somecgroup/httpd
> > killed as a result of limit of /somecgroup/httpd
> > Nov 9 20:48:53 tm2 kernel: [18300.349110] memory: usage 5222400kB,
> > limit 5222400kB, failcnt 282869
> > Nov 9 20:48:53 tm2 kernel: [18300.349113] memory+swap: usage 0kB,
> > limit 9007199254740991kB, failcnt 0"
> >
> > It seems like you've disabled swap, is that correct?
>
> Yes, this machine has no swap but the kernel has
>
> # zcat /proc/config.gz |grep CGROUP_MEM
> CONFIG_CGROUP_MEM_RES_CTLR=y
> CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
> # CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is not set
>
> + ctlr swap wasn't enabled at boot.
>
> > Balbir
>
> ps. vger.kernel.org doesn't like my address (no idea why and postmaster@ is
> silent), so likely my reply won't reach cgroup@

If you can, please try the latest kernels.

Two recent commits

commit 79dfdaccd1d5b40ff7cf4a35a0e63696ebb78b4d
"memcg: make oom_lock 0 and 1 based rather than counter"

commit 1d65f86db14806cf7b1218c7b4ecb8b4db5af27d
"mm: preallocate page before lock_page() at filemap COW"

work well against a fork bomb under memcg. In my test, make -j under a
swapless memcg hangs (or takes a long time) to be oom-killed.

Thanks,
-Kame
* Re: cgroup memory limit problems
From: Michal Hocko @ 2011-11-18 9:13 UTC
To: KAMEZAWA Hiroyuki
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Sorry to reply late

On Thu 10-11-11 09:10:12, KAMEZAWA Hiroyuki wrote:
> On Wed, 9 Nov 2011 23:58:57 +0100
[...]
> If you can, please try the latest kernels.
>
> recent 2 commits
>
> commit 79dfdaccd1d5b40ff7cf4a35a0e63696ebb78b4d
> " memcg: make oom_lock 0 and 1 based rather than counter "

This one needs a follow up fix (23751be0094012eb6b4756fa80ca54b3eb83069f
"memcg: fix hierarchical oom locking")

> commit 1d65f86db14806cf7b1218c7b4ecb8b4db5af27d
> " mm: preallocate page before lock_page() at filemap COW"
>
> are works well against fork-bomb under memcg. In my test, make -j under
> swapless memcg hangs (or takes long time) to be oom-killed.

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic