[BUGFIX][PATCH 0/5] memcg bugfixes in the last week.

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [BUGFIX][PATCH 0/5] memcg bugfixes in the last week.
@ 2011-06-13  3:00 KAMEZAWA Hiroyuki
  2011-06-13  3:03 ` [BUGFIX][PATCH 1/5] memcg fix numa_stat permission KAMEZAWA Hiroyuki
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-13  3:00 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org
  Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	nishimura@mxp.nes.nec.co.jp, bsingharora@gmail.com,
	hannes@cmpxchg.org, Michal Hocko, Ying Han

In the last week, I(and memcg people) posted 5 bug fixes.
I was slightly confued. 

For making things clearer, I post all 5 patches again,
which are I'm now testing. 

If I miss some patches/fixes/bugs, please notify me. 

[1/5] - fix memory.numa_stat permission (this is in mmotm)
[2/5] - fix init_page_cgroup() nid with sparsemem
[3/5] - fix mm->owner update
[4/5] - fix wrong check of noswap with softlimit
[5/5] - fix percpu cached charge draining.

Thank you for all your helps. 

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH 1/5] memcg fix numa_stat permission
  2011-06-13  3:00 [BUGFIX][PATCH 0/5] memcg bugfixes in the last week KAMEZAWA Hiroyuki
@ 2011-06-13  3:03 ` KAMEZAWA Hiroyuki
  2011-06-14  9:42   ` Johannes Weiner
  2011-06-13  3:06 ` [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem KAMEZAWA Hiroyuki
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-13  3:03 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

This is already queued in mmotm.
==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem
  2011-06-13  3:00 [BUGFIX][PATCH 0/5] memcg bugfixes in the last week KAMEZAWA Hiroyuki
  2011-06-13  3:03 ` [BUGFIX][PATCH 1/5] memcg fix numa_stat permission KAMEZAWA Hiroyuki
@ 2011-06-13  3:06 ` KAMEZAWA Hiroyuki
  2011-06-14  9:44   ` Johannes Weiner
  2011-06-16 10:09   ` [-git build bug, PATCH] " Ingo Molnar
  2011-06-13  3:09 ` [BUGFIX][PATCH 3/5] memcg: clear mm->owner when last possible owner leaves KAMEZAWA Hiroyuki
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-13  3:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

added some clean ups.
==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH 3/5] memcg: clear mm->owner when last possible owner leaves
  2011-06-13  3:00 [BUGFIX][PATCH 0/5] memcg bugfixes in the last week KAMEZAWA Hiroyuki
  2011-06-13  3:03 ` [BUGFIX][PATCH 1/5] memcg fix numa_stat permission KAMEZAWA Hiroyuki
  2011-06-13  3:06 ` [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem KAMEZAWA Hiroyuki
@ 2011-06-13  3:09 ` KAMEZAWA Hiroyuki
  2011-06-14  9:45   ` Johannes Weiner
  2011-06-13  3:11 ` [BUGFIX][PATCH 4/5] memcg: fix wrong check of noswap with softlimit KAMEZAWA Hiroyuki
  2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
  4 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-13  3:09 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han,
	Hugh Dickins, davej

This is Hugh's version.
==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH 4/5] memcg:  fix wrong check of noswap with softlimit
  2011-06-13  3:00 [BUGFIX][PATCH 0/5] memcg bugfixes in the last week KAMEZAWA Hiroyuki
                   ` (2 preceding siblings ...)
  2011-06-13  3:09 ` [BUGFIX][PATCH 3/5] memcg: clear mm->owner when last possible owner leaves KAMEZAWA Hiroyuki
@ 2011-06-13  3:11 ` KAMEZAWA Hiroyuki
  2011-06-14  9:48   ` Johannes Weiner
  2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
  4 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-13  3:11 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-13  3:00 [BUGFIX][PATCH 0/5] memcg bugfixes in the last week KAMEZAWA Hiroyuki
                   ` (3 preceding siblings ...)
  2011-06-13  3:11 ` [BUGFIX][PATCH 4/5] memcg: fix wrong check of noswap with softlimit KAMEZAWA Hiroyuki
@ 2011-06-13  3:16 ` KAMEZAWA Hiroyuki
  2011-06-13 21:25   ` Andrew Morton
                     ` (3 more replies)
  4 siblings, 4 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-13  3:16 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
@ 2011-06-13 21:25   ` Andrew Morton
  2011-06-15  0:09     ` KAMEZAWA Hiroyuki
  2011-06-14  7:36   ` Michal Hocko
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2011-06-13 21:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	nishimura@mxp.nes.nec.co.jp, bsingharora@gmail.com,
	hannes@cmpxchg.org, Michal Hocko, Ying Han

On Mon, 13 Jun 2011 12:16:48 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> >From 18b12e53f1cdf6d7feed1f9226c189c34866338c Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 13 Jun 2011 11:25:43 +0900
> Subject: [PATCH 5/5] memcg: fix percpu cached charge draining frequency
> 
>  For performance, memory cgroup caches some "charge" from res_counter
>  into per cpu cache. This works well but because it's cache,
>  it needs to be flushed in some cases. Typical cases are
>          1. when someone hit limit.
>          2. when rmdir() is called and need to charges to be 0.
> 
> But "1" has problem.
> 
> Recently, with large SMP machines, we see many kworker runs because
> of flushing memcg's cache. Bad things in implementation are
> that even if a cpu contains a cache for memcg not related to
> a memcg which hits limit, drain code is called.
> 
> This patch does
> 	A) check percpu cache contains a useful data or not.
> 	B) check other asynchronous percpu draining doesn't run.
> 	C) don't call local cpu callback.
> 	D) don't call at softlimit reclaim.
> 
>
> ...
>
> +DEFINE_MUTEX(percpu_charge_mutex);

I made this static.  If we later wish to give it kernel-wide scope then
"percpu_charge_mutex" will not be a good choice of name.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
  2011-06-13 21:25   ` Andrew Morton
@ 2011-06-14  7:36   ` Michal Hocko
  2011-06-15  0:12     ` KAMEZAWA Hiroyuki
  2011-06-14 10:04   ` Johannes Weiner
  2011-06-15  1:49   ` [BUGFIX][PATCH v6] " KAMEZAWA Hiroyuki
  3 siblings, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2011-06-14  7:36 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Ying Han

On Mon 13-06-11 12:16:48, KAMEZAWA Hiroyuki wrote:
> From 18b12e53f1cdf6d7feed1f9226c189c34866338c Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 13 Jun 2011 11:25:43 +0900
> Subject: [PATCH 5/5] memcg: fix percpu cached charge draining frequency
> 
>  For performance, memory cgroup caches some "charge" from res_counter
>  into per cpu cache. This works well but because it's cache,
>  it needs to be flushed in some cases. Typical cases are
>          1. when someone hit limit.
>          2. when rmdir() is called and need to charges to be 0.
> 
> But "1" has problem.
> 
> Recently, with large SMP machines, we see many kworker runs because
> of flushing memcg's cache. Bad things in implementation are
> that even if a cpu contains a cache for memcg not related to
> a memcg which hits limit, drain code is called.
> 
> This patch does
> 	D) don't call at softlimit reclaim.

I think this needs some justification. The decision is not that
obvious IMO. I would say that this is a good decision because cached
charges will not help to free any memory (at least not directly) during
background reclaim. What about something like:
"
We are not draining per cpu cached charges during soft limit reclaim 
because background reclaim doesn't care about charges. It tries to free
some memory and charges will not give any.
Cached charges might influence only selection of the biggest soft limit
offender but as the call is done only after the selection has been
already done it makes no change.
"

Anyway, wouldn't it be better to have this change separate from the
async draining logic change?
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 1/5] memcg fix numa_stat permission
  2011-06-13  3:03 ` [BUGFIX][PATCH 1/5] memcg fix numa_stat permission KAMEZAWA Hiroyuki
@ 2011-06-14  9:42   ` Johannes Weiner
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2011-06-14  9:42 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

On Mon, Jun 13, 2011 at 12:03:01PM +0900, KAMEZAWA Hiroyuki wrote:
> This is already queued in mmotm.
> ==
> >From 3b1bec1d07ba21339697f069acab47dae6b81097 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 13 Jun 2011 09:32:28 +0900
> Subject: [PATCH 1/5] memcg fix numa_stat permission
> 
> commit 406eb0c9b ("memcg: add memory.numastat api for numa statistics")
>  adds memory.numa_stat file for memory cgroup.  But the file permissions
>  are wrong.
> 
> [kamezawa@bluextal linux-2.6]$ ls -l /cgroup/memory/A/memory.numa_stat
> ---------- 1 root root 0 Jun  9 18:36 /cgroup/memory/A/memory.numa_stat
> 
> This patch fixes the permission as
> 
> [root@bluextal kamezawa]# ls -l /cgroup/memory/A/memory.numa_stat
> -r--r--r-- 1 root root 0 Jun 10 16:49 /cgroup/memory/A/memory.numa_stat
> 
> Acked-by: Ying Han <yinghan@google.com>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem
  2011-06-13  3:06 ` [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem KAMEZAWA Hiroyuki
@ 2011-06-14  9:44   ` Johannes Weiner
  2011-06-16 10:09   ` [-git build bug, PATCH] " Ingo Molnar
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2011-06-14  9:44 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

On Mon, Jun 13, 2011 at 12:06:08PM +0900, KAMEZAWA Hiroyuki wrote:
> added some clean ups.
> ==
> >From 45b8881201f73015c4e0352eb7e434f6e9c53fdd Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 13 Jun 2011 10:09:17 +0900
> Subject: [PATCH 2/5] [BUGFIX] memcg: fix init_page_cgroup nid with sparsemem
> 
> 
> commit 21a3c96 makes page_cgroup allocation as NUMA aware. But that caused
> a problem https://bugzilla.kernel.org/show_bug.cgi?id=36192.
> 
> The problem was getting a NID from invalid struct pages, which was not
> initialized because it was out-of-node, out of [node_start_pfn, node_end_pfn)
> 
> Now, with sparsemem, page_cgroup_init scans pfn from 0 to max_pfn.
> But this may scan a pfn which is not on any node and can access
> memmap which is not initialized.
> 
> This makes page_cgroup_init() for SPARSEMEM node aware and remove
> a code to get nid from page->flags. (Then, we'll use valid NID
> always.)
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 3/5] memcg: clear mm->owner when last possible owner leaves
  2011-06-13  3:09 ` [BUGFIX][PATCH 3/5] memcg: clear mm->owner when last possible owner leaves KAMEZAWA Hiroyuki
@ 2011-06-14  9:45   ` Johannes Weiner
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2011-06-14  9:45 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han,
	Hugh Dickins, davej

On Mon, Jun 13, 2011 at 12:09:51PM +0900, KAMEZAWA Hiroyuki wrote:
> This is Hugh's version.

I approve of Hugh's fixes :-)  Thanks

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 4/5] memcg:  fix wrong check of noswap with softlimit
  2011-06-13  3:11 ` [BUGFIX][PATCH 4/5] memcg: fix wrong check of noswap with softlimit KAMEZAWA Hiroyuki
@ 2011-06-14  9:48   ` Johannes Weiner
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2011-06-14  9:48 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

On Mon, Jun 13, 2011 at 12:11:05PM +0900, KAMEZAWA Hiroyuki wrote:
> >From 0a0358d300330a4ba86e39ea56ed63f1e4519dfd Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 13 Jun 2011 10:31:16 +0900
> Subject: [PATCH 4/5]  fix wrong check of noswap with softlimit
> 
> Hierarchical reclaim doesn't swap out if memsw and resource limits are
> same (memsw_is_minimum == true) because we would hit mem+swap limit
> anyway (during hard limit reclaim).
> If it comes to the solft limit we shouldn't consider memsw_is_minimum at
> all because it doesn't make much sense. Either the soft limit is bellow
> the hard limit and then we cannot hit mem+swap limit or the direct
> reclaim takes a precedence.
> 
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
  2011-06-13 21:25   ` Andrew Morton
  2011-06-14  7:36   ` Michal Hocko
@ 2011-06-14 10:04   ` Johannes Weiner
  2011-06-15  0:16     ` KAMEZAWA Hiroyuki
  2011-06-15  1:49   ` [BUGFIX][PATCH v6] " KAMEZAWA Hiroyuki
  3 siblings, 1 reply; 24+ messages in thread
From: Johannes Weiner @ 2011-06-14 10:04 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

On Mon, Jun 13, 2011 at 12:16:48PM +0900, KAMEZAWA Hiroyuki wrote:
> @@ -1670,8 +1670,8 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
>  		victim = mem_cgroup_select_victim(root_mem);
>  		if (victim == root_mem) {
>  			loop++;
> -			if (loop >= 1)
> -				drain_all_stock_async();
> +			if (!check_soft && loop >= 1)
> +				drain_all_stock_async(root_mem);

I agree with Michal, this should be a separate change.

> @@ -2008,26 +2011,50 @@ static void refill_stock(struct mem_cgroup *mem, unsigned int nr_pages)
>   * expects some charges will be back to res_counter later but cannot wait for
>   * it.
>   */
> -static void drain_all_stock_async(void)
> +static void drain_all_stock_async(struct mem_cgroup *root_mem)
>  {
> -	int cpu;
> -	/* This function is for scheduling "drain" in asynchronous way.
> -	 * The result of "drain" is not directly handled by callers. Then,
> -	 * if someone is calling drain, we don't have to call drain more.
> -	 * Anyway, WORK_STRUCT_PENDING check in queue_work_on() will catch if
> -	 * there is a race. We just do loose check here.
> +	int cpu, curcpu;
> +	/*
> +	 * If someone calls draining, avoid adding more kworker runs.
>  	 */
> -	if (atomic_read(&memcg_drain_count))
> +	if (!mutex_trylock(&percpu_charge_mutex))
>  		return;
>  	/* Notify other cpus that system-wide "drain" is running */
> -	atomic_inc(&memcg_drain_count);
>  	get_online_cpus();
> +
> +	/*
> +	 * get a hint for avoiding draining charges on the current cpu,
> +	 * which must be exhausted by our charging. But this is not
> +	 * required to be a precise check, We use raw_smp_processor_id()
> +	 * instead of getcpu()/putcpu().
> +	 */
> +	curcpu = raw_smp_processor_id();
>  	for_each_online_cpu(cpu) {
>  		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
> -		schedule_work_on(cpu, &stock->work);
> +		struct mem_cgroup *mem;
> +
> +		if (cpu == curcpu)
> +			continue;
> +
> +		mem = stock->cached;
> +		if (!mem)
> +			continue;
> +		if (mem != root_mem) {
> +			if (!root_mem->use_hierarchy)
> +				continue;
> +			/* check whether "mem" is under tree of "root_mem" */
> +			rcu_read_lock();
> +			if (!css_is_ancestor(&mem->css, &root_mem->css)) {
> +				rcu_read_unlock();
> +				continue;
> +			}
> +			rcu_read_unlock();

css_is_ancestor() takes the rcu read lock itself already.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-13 21:25   ` Andrew Morton
@ 2011-06-15  0:09     ` KAMEZAWA Hiroyuki
  2011-06-15  1:19       ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-15  0:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	nishimura@mxp.nes.nec.co.jp, bsingharora@gmail.com,
	hannes@cmpxchg.org, Michal Hocko, Ying Han

On Mon, 13 Jun 2011 14:25:01 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 13 Jun 2011 12:16:48 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> 
> > >From 18b12e53f1cdf6d7feed1f9226c189c34866338c Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Mon, 13 Jun 2011 11:25:43 +0900
> > Subject: [PATCH 5/5] memcg: fix percpu cached charge draining frequency
> > 
> >  For performance, memory cgroup caches some "charge" from res_counter
> >  into per cpu cache. This works well but because it's cache,
> >  it needs to be flushed in some cases. Typical cases are
> >          1. when someone hit limit.
> >          2. when rmdir() is called and need to charges to be 0.
> > 
> > But "1" has problem.
> > 
> > Recently, with large SMP machines, we see many kworker runs because
> > of flushing memcg's cache. Bad things in implementation are
> > that even if a cpu contains a cache for memcg not related to
> > a memcg which hits limit, drain code is called.
> > 
> > This patch does
> > 	A) check percpu cache contains a useful data or not.
> > 	B) check other asynchronous percpu draining doesn't run.
> > 	C) don't call local cpu callback.
> > 	D) don't call at softlimit reclaim.
> > 
> >
> > ...
> >
> > +DEFINE_MUTEX(percpu_charge_mutex);
> 
> I made this static.  If we later wish to give it kernel-wide scope then
> "percpu_charge_mutex" will not be a good choice of name.

Thank you.
And, yes..... memcg_cached_charge_mutex ?

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-14  7:36   ` Michal Hocko
@ 2011-06-15  0:12     ` KAMEZAWA Hiroyuki
  2011-06-15  1:12       ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-15  0:12 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Ying Han

On Tue, 14 Jun 2011 09:36:51 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Mon 13-06-11 12:16:48, KAMEZAWA Hiroyuki wrote:
> > From 18b12e53f1cdf6d7feed1f9226c189c34866338c Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Mon, 13 Jun 2011 11:25:43 +0900
> > Subject: [PATCH 5/5] memcg: fix percpu cached charge draining frequency
> > 
> >  For performance, memory cgroup caches some "charge" from res_counter
> >  into per cpu cache. This works well but because it's cache,
> >  it needs to be flushed in some cases. Typical cases are
> >          1. when someone hit limit.
> >          2. when rmdir() is called and need to charges to be 0.
> > 
> > But "1" has problem.
> > 
> > Recently, with large SMP machines, we see many kworker runs because
> > of flushing memcg's cache. Bad things in implementation are
> > that even if a cpu contains a cache for memcg not related to
> > a memcg which hits limit, drain code is called.
> > 
> > This patch does
> > 	D) don't call at softlimit reclaim.
> 
> I think this needs some justification. The decision is not that
> obvious IMO. I would say that this is a good decision because cached
> charges will not help to free any memory (at least not directly) during
> background reclaim. What about something like:
> "
> We are not draining per cpu cached charges during soft limit reclaim 
> because background reclaim doesn't care about charges. It tries to free
> some memory and charges will not give any.
> Cached charges might influence only selection of the biggest soft limit
> offender but as the call is done only after the selection has been
> already done it makes no change.
> "
> 
> Anyway, wouldn't it be better to have this change separate from the
> async draining logic change?

Hmm. I think calling "draining" at softlimit is just a bug.

Thanks,
-Kame


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-14 10:04   ` Johannes Weiner
@ 2011-06-15  0:16     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-15  0:16 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

On Tue, 14 Jun 2011 12:04:12 +0200
Johannes Weiner <jweiner@redhat.com> wrote:

> On Mon, Jun 13, 2011 at 12:16:48PM +0900, KAMEZAWA Hiroyuki wrote:
> > @@ -1670,8 +1670,8 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> >  		victim = mem_cgroup_select_victim(root_mem);
> >  		if (victim == root_mem) {
> >  			loop++;
> > -			if (loop >= 1)
> > -				drain_all_stock_async();
> > +			if (!check_soft && loop >= 1)
> > +				drain_all_stock_async(root_mem);
> 
> I agree with Michal, this should be a separate change.
> 

Hm, ok, I'll do.

> > @@ -2008,26 +2011,50 @@ static void refill_stock(struct mem_cgroup *mem, unsigned int nr_pages)
> >   * expects some charges will be back to res_counter later but cannot wait for
> >   * it.
> >   */
> > -static void drain_all_stock_async(void)
> > +static void drain_all_stock_async(struct mem_cgroup *root_mem)
> >  {
> > -	int cpu;
> > -	/* This function is for scheduling "drain" in asynchronous way.
> > -	 * The result of "drain" is not directly handled by callers. Then,
> > -	 * if someone is calling drain, we don't have to call drain more.
> > -	 * Anyway, WORK_STRUCT_PENDING check in queue_work_on() will catch if
> > -	 * there is a race. We just do loose check here.
> > +	int cpu, curcpu;
> > +	/*
> > +	 * If someone calls draining, avoid adding more kworker runs.
> >  	 */
> > -	if (atomic_read(&memcg_drain_count))
> > +	if (!mutex_trylock(&percpu_charge_mutex))
> >  		return;
> >  	/* Notify other cpus that system-wide "drain" is running */
> > -	atomic_inc(&memcg_drain_count);
> >  	get_online_cpus();
> > +
> > +	/*
> > +	 * get a hint for avoiding draining charges on the current cpu,
> > +	 * which must be exhausted by our charging. But this is not
> > +	 * required to be a precise check, We use raw_smp_processor_id()
> > +	 * instead of getcpu()/putcpu().
> > +	 */
> > +	curcpu = raw_smp_processor_id();
> >  	for_each_online_cpu(cpu) {
> >  		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
> > -		schedule_work_on(cpu, &stock->work);
> > +		struct mem_cgroup *mem;
> > +
> > +		if (cpu == curcpu)
> > +			continue;
> > +
> > +		mem = stock->cached;
> > +		if (!mem)
> > +			continue;
> > +		if (mem != root_mem) {
> > +			if (!root_mem->use_hierarchy)
> > +				continue;
> > +			/* check whether "mem" is under tree of "root_mem" */
> > +			rcu_read_lock();
> > +			if (!css_is_ancestor(&mem->css, &root_mem->css)) {
> > +				rcu_read_unlock();
> > +				continue;
> > +			}
> > +			rcu_read_unlock();
> 
> css_is_ancestor() takes the rcu read lock itself already.
> 

you're right.

I'll post an update.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-15  0:12     ` KAMEZAWA Hiroyuki
@ 2011-06-15  1:12       ` KAMEZAWA Hiroyuki
  2011-06-15  7:15         ` Michal Hocko
  0 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-15  1:12 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Ying Han

On Wed, 15 Jun 2011 09:12:45 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Tue, 14 Jun 2011 09:36:51 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Mon 13-06-11 12:16:48, KAMEZAWA Hiroyuki wrote:
> > > From 18b12e53f1cdf6d7feed1f9226c189c34866338c Mon Sep 17 00:00:00 2001
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > Date: Mon, 13 Jun 2011 11:25:43 +0900
> > > Subject: [PATCH 5/5] memcg: fix percpu cached charge draining frequency
> > > 
> > >  For performance, memory cgroup caches some "charge" from res_counter
> > >  into per cpu cache. This works well but because it's cache,
> > >  it needs to be flushed in some cases. Typical cases are
> > >          1. when someone hit limit.
> > >          2. when rmdir() is called and need to charges to be 0.
> > > 
> > > But "1" has problem.
> > > 
> > > Recently, with large SMP machines, we see many kworker runs because
> > > of flushing memcg's cache. Bad things in implementation are
> > > that even if a cpu contains a cache for memcg not related to
> > > a memcg which hits limit, drain code is called.
> > > 
> > > This patch does
> > > 	D) don't call at softlimit reclaim.
> > 
> > I think this needs some justification. The decision is not that
> > obvious IMO. I would say that this is a good decision because cached
> > charges will not help to free any memory (at least not directly) during
> > background reclaim. What about something like:
> > "
> > We are not draining per cpu cached charges during soft limit reclaim 
> > because background reclaim doesn't care about charges. It tries to free
> > some memory and charges will not give any.
> > Cached charges might influence only selection of the biggest soft limit
> > offender but as the call is done only after the selection has been
> > already done it makes no change.
> > "
> > 
> > Anyway, wouldn't it be better to have this change separate from the
> > async draining logic change?
> 
> Hmm. I think calling "draining" at softlimit is just a bug.
> 
I'll divide patches.

Thanks,
-kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-15  0:09     ` KAMEZAWA Hiroyuki
@ 2011-06-15  1:19       ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2011-06-15  1:19 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	nishimura@mxp.nes.nec.co.jp, bsingharora@gmail.com,
	hannes@cmpxchg.org, Michal Hocko, Ying Han

On Wed, 15 Jun 2011 09:09:35 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> > I made this static.  If we later wish to give it kernel-wide scope then
> > "percpu_charge_mutex" will not be a good choice of name.
> 
> Thank you.
> And, yes..... memcg_cached_charge_mutex ?

I don't see a need to rename the mutex unless we change it to have
kernel-wide scope.  Given that it's presently static to memcontrol.c,
it's obviously part of memcg ;)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH v6] memcg: fix percpu cached charge draining frequency
  2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
                     ` (2 preceding siblings ...)
  2011-06-14 10:04   ` Johannes Weiner
@ 2011-06-15  1:49   ` KAMEZAWA Hiroyuki
  2011-06-15  1:52     ` [BUGFIX][PATCH v2] memcg: avoid percpu cached charge draining at softlimit KAMEZAWA Hiroyuki
  2011-06-15  5:30     ` [BUGFIX][PATCH v6] memcg: fix percpu cached charge draining frequency Ying Han
  3 siblings, 2 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-15  1:49 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

This is a repleacement for
memcg-fix-percpu-cached-charge-draining-frequency.patch
+
memcg-fix-percpu-cached-charge-draining-frequency-fix.patch


Changelog:
  - removed unnecessary rcu_read_lock()
  - removed a fix for softlimit case (move to another independent patch)
  - make mutex static.
  - applied comment updates from Andrew Morton.

A patch for softlimit will follow this.

==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [BUGFIX][PATCH v2] memcg: avoid percpu cached charge draining at softlimit
  2011-06-15  1:49   ` [BUGFIX][PATCH v6] " KAMEZAWA Hiroyuki
@ 2011-06-15  1:52     ` KAMEZAWA Hiroyuki
  2011-06-15  5:30     ` [BUGFIX][PATCH v6] memcg: fix percpu cached charge draining frequency Ying Han
  1 sibling, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-06-15  1:52 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han

Based on Michal Hokko's comment.

=

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH v6] memcg: fix percpu cached charge draining frequency
  2011-06-15  1:49   ` [BUGFIX][PATCH v6] " KAMEZAWA Hiroyuki
  2011-06-15  1:52     ` [BUGFIX][PATCH v2] memcg: avoid percpu cached charge draining at softlimit KAMEZAWA Hiroyuki
@ 2011-06-15  5:30     ` Ying Han
  1 sibling, 0 replies; 24+ messages in thread
From: Ying Han @ 2011-06-15  5:30 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko

On Tue, Jun 14, 2011 at 6:49 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> This is a repleacement for
> memcg-fix-percpu-cached-charge-draining-frequency.patch
> +
> memcg-fix-percpu-cached-charge-draining-frequency-fix.patch
>
>
> Changelog:
>  - removed unnecessary rcu_read_lock()
>  - removed a fix for softlimit case (move to another independent patch)
>  - make mutex static.
>  - applied comment updates from Andrew Morton.
>
> A patch for softlimit will follow this.
>
> ==
> From f3f41b827d70142858ba8b370510a82d608870d0 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Wed, 15 Jun 2011 10:39:57 +0900
> Subject: [PATCH 5/6] memcg: fix behavior of per cpu charge cache draining.
>
>  For performance, memory cgroup caches some "charge" from res_counter
>  into per cpu cache. This works well but because it's cache,
>  it needs to be flushed in some cases. Typical cases are
>         1. when someone hit limit.
>         2. when rmdir() is called and need to charges to be 0.
>
> But "1" has problem.
>
> Recently, with large SMP machines, we see many kworker runs because
> of flushing memcg's cache. Bad things in implementation are
> that even if a cpu contains a cache for memcg not related to
> a memcg which hits limit, drain code is called.
>
> This patch does
>        A) check percpu cache contains a useful data or not.
>        B) check other asynchronous percpu draining doesn't run.
>        C) don't call local cpu callback.
>
> (*)This patch avoid changing the calling condition with hard-limit.
>
> When I run "cat 1Gfile > /dev/null" under 300M limit memcg,
>
> [Before]
> 13767 kamezawa  20   0 98.6m  424  416 D 10.0  0.0   0:00.61 cat
>   58 root      20   0     0    0    0 S  0.6  0.0   0:00.09 kworker/2:1
>   60 root      20   0     0    0    0 S  0.6  0.0   0:00.08 kworker/4:1
>    4 root      20   0     0    0    0 S  0.3  0.0   0:00.02 kworker/0:0
>   57 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/1:1
>   61 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/5:1
>   62 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/6:1
>   63 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/7:1
>
> [After]
>  2676 root      20   0 98.6m  416  416 D  9.3  0.0   0:00.87 cat
>  2626 kamezawa  20   0 15192 1312  920 R  0.3  0.0   0:00.28 top
>    1 root      20   0 19384 1496 1204 S  0.0  0.0   0:00.66 init
>    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
>    3 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
>    4 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0
>
> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Changelog:
>  - removed unnecessary rcu_read_lock()
>  - removed a fix for softlimit case (move to another independent patch)
>  - make mutex static.
>  - applied comment updates from Andrew Morton.
> ---
>  mm/memcontrol.c |   54 ++++++++++++++++++++++++++++++++++++++----------------
>  1 files changed, 38 insertions(+), 16 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 915c3f3..8fb29de 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -359,7 +359,7 @@ enum charge_type {
>  static void mem_cgroup_get(struct mem_cgroup *mem);
>  static void mem_cgroup_put(struct mem_cgroup *mem);
>  static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *mem);
> -static void drain_all_stock_async(void);
> +static void drain_all_stock_async(struct mem_cgroup *mem);
>
>  static struct mem_cgroup_per_zone *
>  mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid)
> @@ -1671,7 +1671,7 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
>                if (victim == root_mem) {
>                        loop++;
>                        if (loop >= 1)
> -                               drain_all_stock_async();
> +                               drain_all_stock_async(root_mem);
>                        if (loop >= 2) {
>                                /*
>                                 * If we have not been able to reclaim
> @@ -1934,9 +1934,11 @@ struct memcg_stock_pcp {
>        struct mem_cgroup *cached; /* this never be root cgroup */
>        unsigned int nr_pages;
>        struct work_struct work;
> +       unsigned long flags;
> +#define FLUSHING_CACHED_CHARGE (0)
>  };
>  static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
> -static atomic_t memcg_drain_count;
> +static DEFINE_MUTEX(percpu_charge_mutex);
>
>  /*
>  * Try to consume stocked charge on this cpu. If success, one page is consumed
> @@ -1984,6 +1986,7 @@ static void drain_local_stock(struct work_struct *dummy)
>  {
>        struct memcg_stock_pcp *stock = &__get_cpu_var(memcg_stock);
>        drain_stock(stock);
> +       clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
>  }
>
>  /*
> @@ -2008,26 +2011,45 @@ static void refill_stock(struct mem_cgroup *mem, unsigned int nr_pages)
>  * expects some charges will be back to res_counter later but cannot wait for
>  * it.
>  */
> -static void drain_all_stock_async(void)
> +static void drain_all_stock_async(struct mem_cgroup *root_mem)
>  {
> -       int cpu;
> -       /* This function is for scheduling "drain" in asynchronous way.
> -        * The result of "drain" is not directly handled by callers. Then,
> -        * if someone is calling drain, we don't have to call drain more.
> -        * Anyway, WORK_STRUCT_PENDING check in queue_work_on() will catch if
> -        * there is a race. We just do loose check here.
> +       int cpu, curcpu;
> +       /*
> +        * If someone calls draining, avoid adding more kworker runs.
>         */
> -       if (atomic_read(&memcg_drain_count))
> +       if (!mutex_trylock(&percpu_charge_mutex))
>                return;
>        /* Notify other cpus that system-wide "drain" is running */
> -       atomic_inc(&memcg_drain_count);
>        get_online_cpus();
> +       /*
> +        * Get a hint for avoiding draining charges on the current cpu,
> +        * which must be exhausted by our charging.  It is not required that
> +        * this be a precise check, so we use raw_smp_processor_id() instead of
> +        * getcpu()/putcpu().
> +        */
> +       curcpu = raw_smp_processor_id();
>        for_each_online_cpu(cpu) {
>                struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
> -               schedule_work_on(cpu, &stock->work);
> +               struct mem_cgroup *mem;
> +
> +               if (cpu == curcpu)
> +                       continue;
> +
> +               mem = stock->cached;
> +               if (!mem)
> +                       continue;
> +               if (mem != root_mem) {
> +                       if (!root_mem->use_hierarchy)
> +                               continue;
> +                       /* check whether "mem" is under tree of "root_mem" */
> +                       if (!css_is_ancestor(&mem->css, &root_mem->css))
> +                               continue;
> +               }
> +               if (!test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags))
> +                       schedule_work_on(cpu, &stock->work);
>        }
>        put_online_cpus();
> -       atomic_dec(&memcg_drain_count);
> +       mutex_unlock(&percpu_charge_mutex);
>        /* We don't wait for flush_work */
>  }
>
> @@ -2035,9 +2057,9 @@ static void drain_all_stock_async(void)
>  static void drain_all_stock_sync(void)
>  {
>        /* called when force_empty is called */
> -       atomic_inc(&memcg_drain_count);
> +       mutex_lock(&percpu_charge_mutex);
>        schedule_on_each_cpu(drain_local_stock);
> -       atomic_dec(&memcg_drain_count);
> +       mutex_unlock(&percpu_charge_mutex);
>  }
>
>  /*
> --
> 1.7.4.1
>
>
>

Tested-by: Ying Han <yinghan@google.com>


--Ying

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency
  2011-06-15  1:12       ` KAMEZAWA Hiroyuki
@ 2011-06-15  7:15         ` Michal Hocko
  0 siblings, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2011-06-15  7:15 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Ying Han

On Wed 15-06-11 10:12:02, KAMEZAWA Hiroyuki wrote:
> On Wed, 15 Jun 2011 09:12:45 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> 
> > On Tue, 14 Jun 2011 09:36:51 +0200
> > Michal Hocko <mhocko@suse.cz> wrote:
> > 
> > > On Mon 13-06-11 12:16:48, KAMEZAWA Hiroyuki wrote:
> > > > From 18b12e53f1cdf6d7feed1f9226c189c34866338c Mon Sep 17 00:00:00 2001
> > > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > > Date: Mon, 13 Jun 2011 11:25:43 +0900
> > > > Subject: [PATCH 5/5] memcg: fix percpu cached charge draining frequency
> > > > 
> > > >  For performance, memory cgroup caches some "charge" from res_counter
> > > >  into per cpu cache. This works well but because it's cache,
> > > >  it needs to be flushed in some cases. Typical cases are
> > > >          1. when someone hit limit.
> > > >          2. when rmdir() is called and need to charges to be 0.
> > > > 
> > > > But "1" has problem.
> > > > 
> > > > Recently, with large SMP machines, we see many kworker runs because
> > > > of flushing memcg's cache. Bad things in implementation are
> > > > that even if a cpu contains a cache for memcg not related to
> > > > a memcg which hits limit, drain code is called.
> > > > 
> > > > This patch does
> > > > 	D) don't call at softlimit reclaim.
> > > 
> > > I think this needs some justification. The decision is not that
> > > obvious IMO. I would say that this is a good decision because cached
> > > charges will not help to free any memory (at least not directly) during
> > > background reclaim. What about something like:
> > > "
> > > We are not draining per cpu cached charges during soft limit reclaim 
> > > because background reclaim doesn't care about charges. It tries to free
> > > some memory and charges will not give any.
> > > Cached charges might influence only selection of the biggest soft limit
> > > offender but as the call is done only after the selection has been
> > > already done it makes no change.
> > > "
> > > 
> > > Anyway, wouldn't it be better to have this change separate from the
> > > async draining logic change?
> > 
> > Hmm. I think calling "draining" at softlimit is just a bug.
> > 
> I'll divide patches.

Thanks.

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [-git build bug, PATCH] Re: [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem
  2011-06-13  3:06 ` [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem KAMEZAWA Hiroyuki
  2011-06-14  9:44   ` Johannes Weiner
@ 2011-06-16 10:09   ` Ingo Molnar
  2011-06-16 13:01     ` Hiroyuki Kamezawa
  1 sibling, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2011-06-16 10:09 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, nishimura@mxp.nes.nec.co.jp,
	bsingharora@gmail.com, hannes@cmpxchg.org, Michal Hocko, Ying Han,
	Linus Torvalds


* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> Date: Mon, 13 Jun 2011 10:09:17 +0900
> Subject: [PATCH 2/5] [BUGFIX] memcg: fix init_page_cgroup nid with sparsemem

This fresh upstream commit commit:

  37573e8c7182: memcg: fix init_page_cgroup nid with sparsemem

is causing widespread build failures on latest -git, on x86:

  mm/page_cgroup.c:308:3: error: implicit declaration of function a??node_start_pfna?? [-Werror=implicit-function-declaration]
  mm/page_cgroup.c:309:3: error: implicit declaration of function a??node_end_pfna?? [-Werror=implicit-function-declaration]

On any config that has CONFIG_CGROUP_MEM_RES_CTLR=y enabled but 
CONFIG_NUMA disabled.

For now i've worked it around with the patch below, but the real 
solution would be to make the page_cgroup.c code not depend on NUMA.

Thanks,

	Ingo

diff --git a/init/Kconfig b/init/Kconfig
index 412c21b..1593be9 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -639,6 +639,7 @@ config RESOURCE_COUNTERS
 config CGROUP_MEM_RES_CTLR
 	bool "Memory Resource Controller for Control Groups"
 	depends on RESOURCE_COUNTERS
+	depends on NUMA
 	select MM_OWNER
 	help
 	  Provides a memory resource controller that manages both anonymous

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [-git build bug, PATCH] Re: [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem
  2011-06-16 10:09   ` [-git build bug, PATCH] " Ingo Molnar
@ 2011-06-16 13:01     ` Hiroyuki Kamezawa
  0 siblings, 0 replies; 24+ messages in thread
From: Hiroyuki Kamezawa @ 2011-06-16 13:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: KAMEZAWA Hiroyuki, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	nishimura@mxp.nes.nec.co.jp, bsingharora@gmail.com,
	hannes@cmpxchg.org, Michal Hocko, Ying Han, Linus Torvalds

2011/6/16 Ingo Molnar <mingo@elte.hu>:
>
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
>> Date: Mon, 13 Jun 2011 10:09:17 +0900
>> Subject: [PATCH 2/5] [BUGFIX] memcg: fix init_page_cgroup nid with sparsemem
>
> This fresh upstream commit commit:
>
>  37573e8c7182: memcg: fix init_page_cgroup nid with sparsemem
>
> is causing widespread build failures on latest -git, on x86:
>
>  mm/page_cgroup.c:308:3: error: implicit declaration of function ‘node_start_pfn’ [-Werror=implicit-function-declaration]
>  mm/page_cgroup.c:309:3: error: implicit declaration of function ‘node_end_pfn’ [-Werror=implicit-function-declaration]
>
> On any config that has CONFIG_CGROUP_MEM_RES_CTLR=y enabled but
> CONFIG_NUMA disabled.
>
> For now i've worked it around with the patch below, but the real
> solution would be to make the page_cgroup.c code not depend on NUMA.
>
> Thanks,
>
>        Ingo
>
yes, very sorry. I'm now preparing a fix in this thread.

http://marc.info/?t=130819986800002&r=1&w=2

I think I'll be able to post a final fix, tomorrow. I'll cc you when I'll post.
Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2011-06-16 13:01 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-13  3:00 [BUGFIX][PATCH 0/5] memcg bugfixes in the last week KAMEZAWA Hiroyuki
2011-06-13  3:03 ` [BUGFIX][PATCH 1/5] memcg fix numa_stat permission KAMEZAWA Hiroyuki
2011-06-14  9:42   ` Johannes Weiner
2011-06-13  3:06 ` [BUGFIX][PATCH 2/5] memcg: fix init_page_cgroup nid with sparsemem KAMEZAWA Hiroyuki
2011-06-14  9:44   ` Johannes Weiner
2011-06-16 10:09   ` [-git build bug, PATCH] " Ingo Molnar
2011-06-16 13:01     ` Hiroyuki Kamezawa
2011-06-13  3:09 ` [BUGFIX][PATCH 3/5] memcg: clear mm->owner when last possible owner leaves KAMEZAWA Hiroyuki
2011-06-14  9:45   ` Johannes Weiner
2011-06-13  3:11 ` [BUGFIX][PATCH 4/5] memcg: fix wrong check of noswap with softlimit KAMEZAWA Hiroyuki
2011-06-14  9:48   ` Johannes Weiner
2011-06-13  3:16 ` [BUGFIX][PATCH 5/5] memcg: fix percpu cached charge draining frequency KAMEZAWA Hiroyuki
2011-06-13 21:25   ` Andrew Morton
2011-06-15  0:09     ` KAMEZAWA Hiroyuki
2011-06-15  1:19       ` Andrew Morton
2011-06-14  7:36   ` Michal Hocko
2011-06-15  0:12     ` KAMEZAWA Hiroyuki
2011-06-15  1:12       ` KAMEZAWA Hiroyuki
2011-06-15  7:15         ` Michal Hocko
2011-06-14 10:04   ` Johannes Weiner
2011-06-15  0:16     ` KAMEZAWA Hiroyuki
2011-06-15  1:49   ` [BUGFIX][PATCH v6] " KAMEZAWA Hiroyuki
2011-06-15  1:52     ` [BUGFIX][PATCH v2] memcg: avoid percpu cached charge draining at softlimit KAMEZAWA Hiroyuki
2011-06-15  5:30     ` [BUGFIX][PATCH v6] memcg: fix percpu cached charge draining frequency Ying Han

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).