From: Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>
To: Tejun Heo <tj@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Glauber Costa <glommer@parallels.com>,
Han Ying <yinghan@google.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RFC][PATCH 8/9 v2] cgroup: avoid creating new cgroup under a cgroup being destroyed
Date: Sat, 28 Apr 2012 18:31:38 +0900
Message-ID: <CABEgKgpPXPu3L6oS6+2+dZmcPS=t-ZR7PnCvm0mo8UFeXPHDog@mail.gmail.com>
In-Reply-To: <20120428020003.GA26573@mtj.dyndns.org>
On Sat, Apr 28, 2012 at 11:00 AM, Tejun Heo <tj@kernel.org> wrote:
> Hi, KAME.
>
> On Sat, Apr 28, 2012 at 09:20:52AM +0900, Hiroyuki Kamezawa wrote:
>> What I thought was...
>> Assume a memory cgroup A, with use_hierarchy==1.
>>
>> 1. thread:0 starts calling pre_destroy() of cgroup A
>> 2. thread:0 sometimes calls cond_resched() or other sleeping functions.
>> 3. thread:1 creates a cgroup B under "A"
>> 4. thread:1 attaches a thread X to cgroup A/B
>> 5. res_counter of A is charged up, but pre_destroy() can't see what happened
>> because it scans only the LRU of A.
>>
>> So, we have -EBUSY now. I considered some options to fix this.
>>
>> option 1) just return 0 instead of -EBUSY when pre_destroy() finds a
>> task or a child.
>>
>> There is a race... even if we return 0 here and expect the cgroup code
>> to catch it, the thread or child we found may be moved to another cgroup
>> before we check it in cgroup's final check.
>> In that case, the cgroup will be freed before the full ack of
>> pre_destroy(), and the charges will be lost.
>
> So, cgroup code won't proceed with rmdir if children are created
> in between, and note that the race condition of lost charge you
> described above existed before this change - ie. new cgroup could be
> created after pre_destroy() is complete.
>
> The current cgroup rmdir code is transitional. It has to support both
> retrying and non-retrying pre_destroy()s and that means we can't mark
> the cgroup DEAD before starting to invoke pre_destroy(); however, we
> can do that once memcg's pre_destroy() is converted which will also
> remove all the WAIT_ON_RMDIR mechanism and the above described race.
>
> There really isn't much point in trying to make the current cgroup
> rmdir behave perfectly when the next step is removing all the fixed up
> parts.
>
> So, IMHO, just making pre_destroy() clean up its own charges and
> always returning 0 is enough. There's no need to fix up old
> non-critical race condition at this point in the patch stream. cgroup
> rmdir simplification will make them disappear anyway.
>
So, hmm, ok. I'll drop patches 7 & 8. memcg may still return -EBUSY in a
very, very rare race case, but users will not see it in most cases.
I'll fix the limit, move-charge and use_hierarchy problems first.
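
For anyone following the thread, the sequence in steps 1-5 quoted above can be
modelled roughly as below. This is only a toy Python sketch, not kernel code:
Cgroup, mkdir(), charge(), pre_destroy(), rmdir() and make_racer() are
hypothetical stand-ins, usage stands in for the hierarchical res_counter, and
the mark_dead_first flag is my assumption of what "mark the cgroup DEAD before
invoking pre_destroy()" could look like once the rmdir code is simplified.

```python
EBUSY = 16


class Cgroup:
    """Toy cgroup: 'usage' models a hierarchical res_counter."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.usage = 0       # charges of this cgroup and all descendants
        self.lru = []        # pages charged directly to this cgroup
        self.children = []
        self.dead = False
        if parent is not None:
            parent.children.append(self)


def mkdir(parent, name):
    """Create a child; refused if the parent is already marked dead."""
    if parent.dead:
        return None
    return Cgroup(name, parent)


def charge(cg, page):
    """Charge a page to cg and, as with use_hierarchy==1, to every ancestor."""
    cg.lru.append(page)
    node = cg
    while node is not None:
        node.usage += 1
        node = node.parent


def pre_destroy(cg, sleep_point=None):
    """Move cg's own pages to its parent, then check that cg is empty.

    Only cg's own LRU is scanned; charges coming from children are invisible.
    'sleep_point' models a place where the real code might reschedule.
    """
    for page in list(cg.lru):
        cg.lru.remove(page)
        cg.parent.lru.append(page)   # the parent was already charged
        cg.usage -= 1
    if sleep_point is not None:
        sleep_point()                # another thread may run here
    return 0 if cg.usage == 0 else -EBUSY


def rmdir(cg, racer=None, mark_dead_first=False):
    if mark_dead_first:
        cg.dead = True               # assumed future behaviour, see above
    return pre_destroy(cg, racer)


def make_racer(parent):
    """thread:1 in the scenario: create B under A and charge a page to it."""
    def racer():
        child = mkdir(parent, "B")
        if child is not None:
            charge(child, "page-x")
    return racer


def demo():
    results = []
    for mark_dead_first in (False, True):
        root = Cgroup("root")
        a = Cgroup("A", root)
        charge(a, "page-0")
        results.append(rmdir(a, make_racer(a), mark_dead_first))
    return tuple(results)


if __name__ == "__main__":
    print(demo())
```

In this model the first rmdir() ends with A's LRU empty but A's usage still
non-zero, so it returns -EBUSY; with the dead flag set first, the racing
mkdir() is refused and rmdir() returns 0.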
Thanks,
-Kame