From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Michal Hocko <mhocko@suse.cz>, akpm@linux-foundation.org
Cc: mm-commits@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
	liwanp@linux.vnet.ibm.com, Tejun Heo <htejun@gmail.com>,
	Li Zefan <lizefan@huawei.com>,
	cgroups mailinglist <cgroups@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: + hugetlb-cgroup-simplify-pre_destroy-callback.patch added to -mm tree
Date: Thu, 19 Jul 2012 17:51:05 +0530
Message-ID: <87r4s8gcwe.fsf@skywalker.in.ibm.com>
In-Reply-To: <20120719113915.GC2864@tiehlicka.suse.cz>

Michal Hocko <mhocko@suse.cz> writes:

> On Wed 18-07-12 14:26:36, Andrew Morton wrote:
>> 
>> The patch titled
>>      Subject: hugetlb/cgroup: simplify pre_destroy callback
>> has been added to the -mm tree.  Its filename is
>>      hugetlb-cgroup-simplify-pre_destroy-callback.patch
>> 
>> Before you just go and hit "reply", please:
>>    a) Consider who else should be cc'ed
>>    b) Prefer to cc a suitable mailing list as well
>>    c) Ideally: find the original patch on the mailing list and do a
>>       reply-to-all to that, adding suitable additional cc's
>> 
>> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>> 
>> The -mm tree is included into linux-next and is updated
>> there every 3-4 working days
>> 
>> ------------------------------------------------------
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> Subject: hugetlb/cgroup: simplify pre_destroy callback
>> 
>> Since we cannot fail in hugetlb_cgroup_move_parent(), we don't really need
>> to check whether the cgroup has any charge left after that.  Also skip those
>> hstates for which we don't have any charge in this cgroup.
>
> IIUC this depends on a non-existent (cgroup) patch. I guess something
> like the patch at the end should address it. I haven't tested it though
> so it is not signed-off-by yet.
>
>> Based on an earlier patch from Wanpeng Li.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
>> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> Cc: Michal Hocko <mhocko@suse.cz>
>> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>> ---
>> 
>>  mm/hugetlb_cgroup.c |   49 ++++++++++++++++++------------------------
>>  1 file changed, 21 insertions(+), 28 deletions(-)
>> 
>> diff -puN mm/hugetlb_cgroup.c~hugetlb-cgroup-simplify-pre_destroy-callback mm/hugetlb_cgroup.c
>> --- a/mm/hugetlb_cgroup.c~hugetlb-cgroup-simplify-pre_destroy-callback
>> +++ a/mm/hugetlb_cgroup.c
>> @@ -65,18 +65,6 @@ static inline struct hugetlb_cgroup *par
>>  	return hugetlb_cgroup_from_cgroup(cg->parent);
>>  }
>>  
>> -static inline bool hugetlb_cgroup_have_usage(struct cgroup *cg)
>> -{
>> -	int idx;
>> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cg);
>> -
>> -	for (idx = 0; idx < hugetlb_max_hstate; idx++) {
>> -		if ((res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE)) > 0)
>> -			return true;
>> -	}
>> -	return false;
>> -}
>> -
>>  static struct cgroup_subsys_state *hugetlb_cgroup_create(struct cgroup *cgroup)
>>  {
>>  	int idx;
>> @@ -159,24 +147,29 @@ static int hugetlb_cgroup_pre_destroy(st
>>  {
>>  	struct hstate *h;
>>  	struct page *page;
>> -	int ret = 0, idx = 0;
>> +	int ret = 0, idx;
>> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
>>  
>> -	do {
>> -		if (cgroup_task_count(cgroup) ||
>> -		    !list_empty(&cgroup->children)) {
>> -			ret = -EBUSY;
>> -			goto out;
>> -		}
>> -		for_each_hstate(h) {
>> -			spin_lock(&hugetlb_lock);
>> -			list_for_each_entry(page, &h->hugepage_activelist, lru)
>> -				hugetlb_cgroup_move_parent(idx, cgroup, page);
>>  
>> -			spin_unlock(&hugetlb_lock);
>> -			idx++;
>> -		}
>> -		cond_resched();
>> -	} while (hugetlb_cgroup_have_usage(cgroup));
>> +	if (cgroup_task_count(cgroup) ||
>> +	    !list_empty(&cgroup->children)) {
>> +		ret = -EBUSY;
>> +		goto out;
>> +	}
>> +
>> +	for_each_hstate(h) {
>> +		/*
>> +		 * if we don't have any charge, skip this hstate
>> +		 */
>> +		idx = hstate_index(h);
>> +		if (res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE) == 0)
>> +			continue;
>> +		spin_lock(&hugetlb_lock);
>> +		list_for_each_entry(page, &h->hugepage_activelist, lru)
>> +			hugetlb_cgroup_move_parent(idx, cgroup, page);
>> +		spin_unlock(&hugetlb_lock);
>> +		VM_BUG_ON(res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE));
>> +	}
>>  out:
>>  	return ret;
>>  }
>> _
>
> ---
> From 621ed1c9dab63bd82205bd5266eb9974f86a0a3f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Thu, 19 Jul 2012 13:23:23 +0200
> Subject: [PATCH] cgroup: keep cgroup_mutex locked for pre_destroy
>
> 3fa59dfb (cgroup: fix potential deadlock in pre_destroy) dropped the
> cgroup_mutex lock while calling pre_destroy callbacks because the memory
> controller could deadlock when force_empty triggered reclaim.
> Since "memcg: move charges to root cgroup if use_hierarchy=0" there is
> no reclaim going on from mem_cgroup_force_empty, though, so we can safely
> keep the cgroup_mutex locked. This has the advantage that no tasks can
> be added during the pre_destroy callback, so the handlers don't have to
> consider races where new tasks add new charges. This simplifies the
> implementation.
> ---
>  kernel/cgroup.c |    2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 0f3527d..9dba05d 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4181,7 +4181,6 @@ again:
>  		mutex_unlock(&cgroup_mutex);
>  		return -EBUSY;
>  	}
> -	mutex_unlock(&cgroup_mutex);
>
>  	/*
>  	 * In general, subsystem has no css->refcnt after pre_destroy(). But
> @@ -4204,7 +4203,6 @@ again:
>  		return ret;
>  	}
>
> -	mutex_lock(&cgroup_mutex);
>  	parent = cgrp->parent;
>  	if (atomic_read(&cgrp->count) || !list_empty(&cgrp->children)) {
>  		clear_bit(CGRP_WAIT_ON_RMDIR, &cgrp->flags);

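mem_cgroup_force_empty() still calls lru_add_drain_all(), which takes
cpu_hotplug.lock:

lru_add_drain_all
   -> schedule_on_each_cpu
        -> get_online_cpus
           -> mutex_lock(&cpu_hotplug.lock);

So won't we deadlock if cgroup_mutex is kept locked across pre_destroy?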
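
To make the concern concrete, here is a minimal userspace sketch of the
lock-ordering check, not kernel code. It only records which lock is taken
while another is held and reports an inversion. The rmdir path mirrors the
call chain above with cgroup_mutex kept locked. The reverse order (a CPU
hotplug path that holds cpu_hotplug.lock and then takes cgroup_mutex) is an
assumption made for illustration and is not shown in this mail. All helper
names (record_order, rmdir_path, hotplug_path) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The two locks discussed in this thread, modelled as plain identifiers. */
enum lock_id { CGROUP_MUTEX, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* order_seen[a][b] is true once lock b has been taken while a was held. */
static bool order_seen[NR_LOCKS][NR_LOCKS];

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns true if the opposite order was recorded earlier, i.e. the
 * two paths together form an ABBA ordering that can deadlock.
 */
bool record_order(enum lock_id outer, enum lock_id inner)
{
	order_seen[outer][inner] = true;
	return order_seen[inner][outer];
}

/*
 * Ordering from the mail, with cgroup_mutex kept locked:
 * cgroup_rmdir -> pre_destroy -> mem_cgroup_force_empty ->
 * lru_add_drain_all -> schedule_on_each_cpu -> get_online_cpus ->
 * mutex_lock(&cpu_hotplug.lock)
 */
bool rmdir_path(void)
{
	return record_order(CGROUP_MUTEX, CPU_HOTPLUG_LOCK);
}

/*
 * ASSUMPTION: some hotplug path holds cpu_hotplug.lock and then takes
 * cgroup_mutex. This path is hypothetical in this sketch.
 */
bool hotplug_path(void)
{
	return record_order(CPU_HOTPLUG_LOCK, CGROUP_MUTEX);
}
```

Calling rmdir_path() first reports no inversion, and a later hotplug_path()
reports one. If no path ever takes the locks in the reverse order, the
nesting by itself would not deadlock.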

-aneesh
