From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Michal Hocko <mhocko@suse.cz>, akpm@linux-foundation.org
Cc: mm-commits@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
liwanp@linux.vnet.ibm.com, Tejun Heo <htejun@gmail.com>,
Li Zefan <lizefan@huawei.com>,
cgroups mailinglist <cgroups@vger.kernel.org>,
linux-mm@kvack.org
Subject: Re: + hugetlb-cgroup-simplify-pre_destroy-callback.patch added to -mm tree
Date: Thu, 19 Jul 2012 17:51:05 +0530
Message-ID: <87r4s8gcwe.fsf@skywalker.in.ibm.com>
In-Reply-To: <20120719113915.GC2864@tiehlicka.suse.cz>
Michal Hocko <mhocko@suse.cz> writes:
> On Wed 18-07-12 14:26:36, Andrew Morton wrote:
>>
>> The patch titled
>> Subject: hugetlb/cgroup: simplify pre_destroy callback
>> has been added to the -mm tree. Its filename is
>> hugetlb-cgroup-simplify-pre_destroy-callback.patch
>>
>> Before you just go and hit "reply", please:
>> a) Consider who else should be cc'ed
>> b) Prefer to cc a suitable mailing list as well
>> c) Ideally: find the original patch on the mailing list and do a
>> reply-to-all to that, adding suitable additional cc's
>>
>> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>>
>> The -mm tree is included into linux-next and is updated
>> there every 3-4 working days
>>
>> ------------------------------------------------------
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> Subject: hugetlb/cgroup: simplify pre_destroy callback
>>
>> Since we cannot fail in hugetlb_cgroup_move_parent(), we don't really need
>> to check whether the cgroup has any charge left after that. Also skip those
>> hstates for which we don't have any charge in this cgroup.
>
> IIUC this depends on a non-existent (cgroup) patch. I guess something
> like the patch at the end should address it. I haven't tested it though
> so it is not signed-off-by yet.
>
>> Based on an earlier patch from Wanpeng Li.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
>> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> Cc: Michal Hocko <mhocko@suse.cz>
>> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>> ---
>>
>> mm/hugetlb_cgroup.c | 49 ++++++++++++++++++------------------------
>> 1 file changed, 21 insertions(+), 28 deletions(-)
>>
>> diff -puN mm/hugetlb_cgroup.c~hugetlb-cgroup-simplify-pre_destroy-callback mm/hugetlb_cgroup.c
>> --- a/mm/hugetlb_cgroup.c~hugetlb-cgroup-simplify-pre_destroy-callback
>> +++ a/mm/hugetlb_cgroup.c
>> @@ -65,18 +65,6 @@ static inline struct hugetlb_cgroup *par
>>  	return hugetlb_cgroup_from_cgroup(cg->parent);
>>  }
>>  
>> -static inline bool hugetlb_cgroup_have_usage(struct cgroup *cg)
>> -{
>> -	int idx;
>> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cg);
>> -
>> -	for (idx = 0; idx < hugetlb_max_hstate; idx++) {
>> -		if ((res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE)) > 0)
>> -			return true;
>> -	}
>> -	return false;
>> -}
>> -
>>  static struct cgroup_subsys_state *hugetlb_cgroup_create(struct cgroup *cgroup)
>>  {
>>  	int idx;
>> @@ -159,24 +147,29 @@ static int hugetlb_cgroup_pre_destroy(st
>>  {
>>  	struct hstate *h;
>>  	struct page *page;
>> -	int ret = 0, idx = 0;
>> +	int ret = 0, idx;
>> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
>>  
>> -	do {
>> -		if (cgroup_task_count(cgroup) ||
>> -		    !list_empty(&cgroup->children)) {
>> -			ret = -EBUSY;
>> -			goto out;
>> -		}
>> -		for_each_hstate(h) {
>> -			spin_lock(&hugetlb_lock);
>> -			list_for_each_entry(page, &h->hugepage_activelist, lru)
>> -				hugetlb_cgroup_move_parent(idx, cgroup, page);
>> -
>> -			spin_unlock(&hugetlb_lock);
>> -			idx++;
>> -		}
>> -		cond_resched();
>> -	} while (hugetlb_cgroup_have_usage(cgroup));
>> +	if (cgroup_task_count(cgroup) ||
>> +	    !list_empty(&cgroup->children)) {
>> +		ret = -EBUSY;
>> +		goto out;
>> +	}
>> +
>> +	for_each_hstate(h) {
>> +		/*
>> +		 * if we don't have any charge, skip this hstate
>> +		 */
>> +		idx = hstate_index(h);
>> +		if (res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE) == 0)
>> +			continue;
>> +		spin_lock(&hugetlb_lock);
>> +		list_for_each_entry(page, &h->hugepage_activelist, lru)
>> +			hugetlb_cgroup_move_parent(idx, cgroup, page);
>> +		spin_unlock(&hugetlb_lock);
>> +		VM_BUG_ON(res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE));
>> +	}
>>  out:
>>  	return ret;
>>  }
>> _
>
> ---
> From 621ed1c9dab63bd82205bd5266eb9974f86a0a3f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Thu, 19 Jul 2012 13:23:23 +0200
> Subject: [PATCH] cgroup: keep cgroup_mutex locked for pre_destroy
>
> 3fa59dfb (cgroup: fix potential deadlock in pre_destroy) dropped the
> cgroup_mutex lock while calling pre_destroy callbacks because the memory
> controller could deadlock when force_empty triggered reclaim.
> Since "memcg: move charges to root cgroup if use_hierarchy=0", however,
> mem_cgroup_force_empty no longer triggers reclaim, so we can safely
> keep the cgroup_mutex locked. This has the advantage that no tasks can
> be added during the pre_destroy callback, so the handlers don't have to
> consider races where new tasks add new charges. This simplifies the
> implementation.
> ---
> kernel/cgroup.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 0f3527d..9dba05d 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4181,7 +4181,6 @@ again:
>  		mutex_unlock(&cgroup_mutex);
>  		return -EBUSY;
>  	}
> -	mutex_unlock(&cgroup_mutex);
>  
>  	/*
>  	 * In general, subsystem has no css->refcnt after pre_destroy(). But
> @@ -4204,7 +4203,6 @@ again:
>  		return ret;
>  	}
>  
> -	mutex_lock(&cgroup_mutex);
>  	parent = cgrp->parent;
>  	if (atomic_read(&cgrp->count) || !list_empty(&cgrp->children)) {
>  		clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);
mem_cgroup_force_empty still calls

lru_add_drain_all
  -> schedule_on_each_cpu
    -> get_online_cpus
      -> mutex_lock(&cpu_hotplug.lock);

So won't we deadlock?

-aneesh
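
The worry above is a lock-ordering one: with the patch, cgroup_mutex stays
held while pre_destroy reaches get_online_cpus(), which takes
cpu_hotplug.lock. If any other path takes the same two locks in the
opposite order, the pair can deadlock. A minimal userspace sketch of that
check follows, in C. The lock ids, record_nesting() and inversion_demo()
are made-up names for illustration only, and the second path (a CPU
hotplug callback that takes cgroup_mutex) is an assumption, not something
shown in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical ids standing in for the two kernel mutexes. */
enum lock_id { CGROUP_MUTEX, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* order[a][b] is true once some path has taken b while holding a. */
static bool order[NR_LOCKS][NR_LOCKS];

/*
 * Record "inner acquired while outer is held".
 * Returns true if the opposite nesting was recorded earlier, i.e. two
 * paths nest the same pair of locks in opposite directions (ABBA).
 */
static bool record_nesting(enum lock_id outer, enum lock_id inner)
{
	order[outer][inner] = true;
	return order[inner][outer];
}

/* Replays both paths; returns 1 if an inversion is found, else 0. */
int inversion_demo(void)
{
	bool first, second;

	memset(order, 0, sizeof(order));

	/* Path 1 (from the mail): rmdir keeps cgroup_mutex held and
	 * pre_destroy ends up in get_online_cpus(). */
	first = record_nesting(CGROUP_MUTEX, CPU_HOTPLUG_LOCK);
	assert(!first);	/* nothing recorded before, so no inversion yet */

	/* Path 2 (assumed): a hotplug callback takes cgroup_mutex while
	 * cpu_hotplug.lock is held. */
	second = record_nesting(CPU_HOTPLUG_LOCK, CGROUP_MUTEX);

	return second ? 1 : 0;
}
```

Calling inversion_demo() from a small main() reports whether the two
nestings conflict. The sketch only tracks ordering and never blocks, so
it shows why the order matters without reproducing an actual hang.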