From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Glauber Costa <glommer-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
kent.overstreet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Subject: Re: memcg creates an unkillable task in 3.11-rc2
Date: Tue, 30 Jul 2013 01:19:31 -0700 [thread overview]
Message-ID: <87ppu0a298.fsf_-_@tw-ebiederman.twitter.com> (raw)
In-Reply-To: <51F71DE2.4020102-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> (Li Zefan's message of "Tue, 30 Jul 2013 09:58:58 +0800")
Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> writes:
>> I am also seeing what looks like a leak somewhere in the cgroup code
>> as well. After some runs of the same reproducer I get into a state
>> where, after everything is cleaned up (all of the control groups have
>> been removed and the cgroup filesystem is unmounted), I can mount a
>> cgroup filesystem with that same combination of subsystems, but I
>> can't mount a cgroup filesystem with any of those subsystems in any
>> other combination. So I am guessing that the superblock from the
>> original mounting is still lingering for some reason.
>>
>
> If this happens again, you can check /proc/cgroups,
>
> #subsys_name hierarchy num_cgroups enabled
> cpuset 0 1 1
> debug 0 1 1
> cpu 0 1 1
> cpuacct 0 1 1
> memory 0 1 1
> devices 0 1 1
> freezer 0 1 1
> blkio 0 1 1
>
> If "hierarchy" is not 0, then the hierarchy wasn't really unmounted. If
> "num_cgroups" is not 1, then there are some cgroups that weren't really
> destroyed even though they've been rmdir'ed.
Interesting. It looks like at some point I had some cpu and cpuacct
hierarchies that were never really unmounted.
#subsys_name hierarchy num_cgroups enabled
cpuset 0 1 1
cpu 89 1 1
cpuacct 89 1 1
memory 0 1 1
devices 0 1 1
freezer 0 1 1
net_cls 0 1 1
blkio 0 1 1
perf_event 0 1 1
hugetlb 0 1 1
And playing a little more I hit the leak scenario again.
#subsys_name hierarchy num_cgroups enabled
cpuset 0 1 1
cpu 90 3 1
cpuacct 90 3 1
memory 90 3 1
devices 0 1 1
freezer 90 3 1
net_cls 0 1 1
blkio 0 1 1
perf_event 0 1 1
hugetlb 0 1 1
So it definitely did not unmount.
After echo 3 > /proc/sys/vm/drop_caches
#subsys_name hierarchy num_cgroups enabled
cpuset 0 1 1
cpu 90 1 1
cpuacct 90 1 1
memory 90 1 1
devices 0 1 1
freezer 90 1 1
net_cls 0 1 1
blkio 0 1 1
perf_event 0 1 1
hugetlb 0 1 1
Hmm. But after some time passes I have
#subsys_name hierarchy num_cgroups enabled
cpuset 0 1 1
cpu 0 1 1
cpuacct 0 1 1
memory 0 1 1
devices 0 1 1
freezer 0 1 1
net_cls 0 1 1
blkio 0 1 1
perf_event 0 1 1
hugetlb 0 1 1
Hmm. Looking further I see what is going on, and it has nothing to do
with the freezer. (I have commented out that code and reproduced the
hang without the freezer to be doubly certain.)
On the exit path, exit_robust_list() triggers a page fault to fault a
page back in. Since the memcg is out of memory, that fault causes the
exit path to get stuck in mem_cgroup_handle_oom.
Which means the following change should fix the hang; I will test it in
just a second.
The problem is that we only handled pending fatal signals and exiting
processes when the memcg OOM killer was enabled. Sigh.
Eric
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 00a7a66..5998a57 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1792,16 +1792,6 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	unsigned int points = 0;
 	struct task_struct *chosen = NULL;
 
-	/*
-	 * If current has a pending SIGKILL or is exiting, then automatically
-	 * select it.  The goal is to allow it to allocate so that it may
-	 * quickly exit and free its memory.
-	 */
-	if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
-		set_thread_flag(TIF_MEMDIE);
-		return;
-	}
-
 	check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL);
 	totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
 	for_each_mem_cgroup_tree(iter, memcg) {
@@ -2220,7 +2210,15 @@ static bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask,
 		mem_cgroup_oom_notify(memcg);
 	spin_unlock(&memcg_oom_lock);
 
-	if (need_to_kill) {
+	/*
+	 * If current has a pending SIGKILL or is exiting, then automatically
+	 * select it.  The goal is to allow it to allocate so that it may
+	 * quickly exit and free its memory.
+	 */
+	if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
+		set_thread_flag(TIF_MEMDIE);
+		finish_wait(&memcg_oom_waitq, &owait.wait);
+	} else if (need_to_kill) {
 		finish_wait(&memcg_oom_waitq, &owait.wait);
 		mem_cgroup_out_of_memory(memcg, mask, order);
 	} else {