All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	"nishimura@mxp.nes.nec.co.jp" <nis@tyo205.gate.nec.co.jp>
Subject: Re: [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation (fixed coimpile bug)
Date: Thu, 8 Oct 2009 15:17:10 -0700	[thread overview]
Message-ID: <20091008151710.1216a615.akpm@linux-foundation.org> (raw)
In-Reply-To: <20091002160213.32ae2bb5.kamezawa.hiroyu@jp.fujitsu.com>

On Fri, 2 Oct 2009 16:02:13 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

>
> ...
>
> In massive parallel enviroment, res_counter can be a performance bottleneck.
> One strong techinque to reduce lock contention is reducing calls by
> coalescing some amount of calls into one.
> 
> Considering charge/uncharge chatacteristic,
> 	- charge is done one by one via demand-paging.
> 	- uncharge is done by
> 		- in chunk at munmap, truncate, exit, execve...
> 		- one by one via vmscan/paging.
> 
> It seems we have a chance in uncharge at unmap/truncation.
> 
> This patch is a for coalescing uncharge. For avoiding scattering memcg's
> structure to functions under /mm, this patch adds memcg batch uncharge
> information to the task. 
> 
> The degree of coalescing depends on callers
>   - at invalidate/trucate... pagevec size
>   - at unmap ....ZAP_BLOCK_SIZE
> (memory itself will be freed in this degree.)
> Then, we'll not coalescing too much.
> 
>
> ...
>
> +static void
> +__do_uncharge(struct mem_cgroup *mem, const enum charge_type ctype)
> +{
> +	struct memcg_batch_info *batch = NULL;
> +	bool uncharge_memsw = true;
> +	/* If swapout, usage of swap doesn't decrease */
> +	if (!do_swap_account || ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
> +		uncharge_memsw = false;
> +	/*
> +	 * do_batch > 0 when unmapping pages or inode invalidate/truncate.
> +	 * In those cases, all pages freed continously can be expected to be in
> +	 * the same cgroup and we have chance to coalesce uncharges.
> +	 * And, we do uncharge one by one if this is killed by OOM.
> +	 */
> +	if (!current->memcg_batch.do_batch || test_thread_flag(TIF_MEMDIE))
> +		goto direct_uncharge;
> +
> +	batch = &current->memcg_batch;
> +	/*
> +	 * In usual, we do css_get() when we remember memcg pointer.
> +	 * But in this case, we keep res->usage until end of a series of
> +	 * uncharges. Then, it's ok to ignore memcg's refcnt.
> +	 */
> +	if (!batch->memcg)
> +		batch->memcg = mem;
> +	/*
> +	 * In typical case, batch->memcg == mem. This means we can
> +	 * merge a series of uncharges to an uncharge of res_counter.
> +	 * If not, we uncharge res_counter ony by one.
> +	 */
> +	if (batch->memcg != mem)
> +		goto direct_uncharge;
> +	/* remember freed charge and uncharge it later */
> +	batch->pages += PAGE_SIZE;

->pages is really confusingly named.  It doesn't count pages, it counts
bytes!

We could call it `bytes', but perhaps charge_bytes would be more
communicative?

> +/*
> + * batch_start/batch_end is called in unmap_page_range/invlidate/trucate.
> + * In that cases, pages are freed continuously and we can expect pages
> + * are in the same memcg. All these calls itself limits the number of
> + * pages freed at once, then uncharge_start/end() is called properly.
> + */
> +
> +void mem_cgroup_uncharge_start(void)
> +{
> +	if (!current->memcg_batch.do_batch) {
> +		current->memcg_batch.memcg = NULL;
> +		current->memcg_batch.pages = 0;
> +		current->memcg_batch.memsw = 0;

what's memsw?

> +	}
> +	current->memcg_batch.do_batch++;
> +}
> +
>
> ...
>
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR /* memcg uses this to do batch job */
> +	struct memcg_batch_info {
> +		int do_batch;
> +		struct mem_cgroup *memcg;
> +		long pages, memsw;
> +	} memcg_batch;
> +#endif

I find the most valuable documetnation is that which is devoted to the
data structures.  This one didn't get any :(

Negative values of `pages' and `memsw' are meaningless, so it would be
better to give them an unsigned type.  That matches the
res_counter_charge() expectations also.



WARNING: multiple messages have this Message-ID (diff)
From: Andrew Morton <akpm@linux-foundation.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	"nishimura@mxp.nes.nec.co.jp" <nis@tyo205.gate.nec.co.jp>
Subject: Re: [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation (fixed coimpile bug)
Date: Thu, 8 Oct 2009 15:17:10 -0700	[thread overview]
Message-ID: <20091008151710.1216a615.akpm@linux-foundation.org> (raw)
In-Reply-To: <20091002160213.32ae2bb5.kamezawa.hiroyu@jp.fujitsu.com>

On Fri, 2 Oct 2009 16:02:13 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

>
> ...
>
> In massive parallel enviroment, res_counter can be a performance bottleneck.
> One strong techinque to reduce lock contention is reducing calls by
> coalescing some amount of calls into one.
> 
> Considering charge/uncharge chatacteristic,
> 	- charge is done one by one via demand-paging.
> 	- uncharge is done by
> 		- in chunk at munmap, truncate, exit, execve...
> 		- one by one via vmscan/paging.
> 
> It seems we have a chance in uncharge at unmap/truncation.
> 
> This patch is a for coalescing uncharge. For avoiding scattering memcg's
> structure to functions under /mm, this patch adds memcg batch uncharge
> information to the task. 
> 
> The degree of coalescing depends on callers
>   - at invalidate/trucate... pagevec size
>   - at unmap ....ZAP_BLOCK_SIZE
> (memory itself will be freed in this degree.)
> Then, we'll not coalescing too much.
> 
>
> ...
>
> +static void
> +__do_uncharge(struct mem_cgroup *mem, const enum charge_type ctype)
> +{
> +	struct memcg_batch_info *batch = NULL;
> +	bool uncharge_memsw = true;
> +	/* If swapout, usage of swap doesn't decrease */
> +	if (!do_swap_account || ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
> +		uncharge_memsw = false;
> +	/*
> +	 * do_batch > 0 when unmapping pages or inode invalidate/truncate.
> +	 * In those cases, all pages freed continously can be expected to be in
> +	 * the same cgroup and we have chance to coalesce uncharges.
> +	 * And, we do uncharge one by one if this is killed by OOM.
> +	 */
> +	if (!current->memcg_batch.do_batch || test_thread_flag(TIF_MEMDIE))
> +		goto direct_uncharge;
> +
> +	batch = &current->memcg_batch;
> +	/*
> +	 * In usual, we do css_get() when we remember memcg pointer.
> +	 * But in this case, we keep res->usage until end of a series of
> +	 * uncharges. Then, it's ok to ignore memcg's refcnt.
> +	 */
> +	if (!batch->memcg)
> +		batch->memcg = mem;
> +	/*
> +	 * In typical case, batch->memcg == mem. This means we can
> +	 * merge a series of uncharges to an uncharge of res_counter.
> +	 * If not, we uncharge res_counter ony by one.
> +	 */
> +	if (batch->memcg != mem)
> +		goto direct_uncharge;
> +	/* remember freed charge and uncharge it later */
> +	batch->pages += PAGE_SIZE;

->pages is really confusingly named.  It doesn't count pages, it counts
bytes!

We could call it `bytes', but perhaps charge_bytes would be more
communicative?

> +/*
> + * batch_start/batch_end is called in unmap_page_range/invlidate/trucate.
> + * In that cases, pages are freed continuously and we can expect pages
> + * are in the same memcg. All these calls itself limits the number of
> + * pages freed at once, then uncharge_start/end() is called properly.
> + */
> +
> +void mem_cgroup_uncharge_start(void)
> +{
> +	if (!current->memcg_batch.do_batch) {
> +		current->memcg_batch.memcg = NULL;
> +		current->memcg_batch.pages = 0;
> +		current->memcg_batch.memsw = 0;

what's memsw?

> +	}
> +	current->memcg_batch.do_batch++;
> +}
> +
>
> ...
>
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR /* memcg uses this to do batch job */
> +	struct memcg_batch_info {
> +		int do_batch;
> +		struct mem_cgroup *memcg;
> +		long pages, memsw;
> +	} memcg_batch;
> +#endif

I find the most valuable documetnation is that which is devoted to the
data structures.  This one didn't get any :(

Negative values of `pages' and `memsw' are meaningless, so it would be
better to give them an unsigned type.  That matches the
res_counter_charge() expectations also.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2009-10-08 22:18 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-02  4:55 [PATCH 0/2] memcg: improving scalability by reducing lock contention at charge/uncharge KAMEZAWA Hiroyuki
2009-10-02  4:55 ` KAMEZAWA Hiroyuki
2009-10-02  5:01 ` [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation KAMEZAWA Hiroyuki
2009-10-02  5:01   ` KAMEZAWA Hiroyuki
2009-10-02  6:47   ` Hiroshi Shimamoto
2009-10-02  6:47     ` Hiroshi Shimamoto
2009-10-02  6:53     ` Hiroshi Shimamoto
2009-10-02  6:53       ` Hiroshi Shimamoto
2009-10-02  7:04       ` KAMEZAWA Hiroyuki
2009-10-02  7:04         ` KAMEZAWA Hiroyuki
2009-10-02  7:02     ` [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation (fixed coimpile bug) KAMEZAWA Hiroyuki
2009-10-02  7:02       ` KAMEZAWA Hiroyuki
2009-10-08 22:17       ` Andrew Morton [this message]
2009-10-08 22:17         ` Andrew Morton
2009-10-08 23:48         ` KAMEZAWA Hiroyuki
2009-10-08 23:48           ` KAMEZAWA Hiroyuki
2009-10-09  4:01   ` [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation Balbir Singh
2009-10-09  4:01     ` Balbir Singh
2009-10-09  4:17     ` KAMEZAWA Hiroyuki
2009-10-09  4:17       ` KAMEZAWA Hiroyuki
2009-10-02  5:03 ` [PATCH 2/2] memcg: coalescing charges per cpu KAMEZAWA Hiroyuki
2009-10-02  5:03   ` KAMEZAWA Hiroyuki
2009-10-08 22:26   ` Andrew Morton
2009-10-08 22:26     ` Andrew Morton
2009-10-08 23:54     ` KAMEZAWA Hiroyuki
2009-10-08 23:54       ` KAMEZAWA Hiroyuki
2009-10-09  4:15   ` Balbir Singh
2009-10-09  4:15     ` Balbir Singh
2009-10-09  4:25     ` KAMEZAWA Hiroyuki
2009-10-09  4:25       ` KAMEZAWA Hiroyuki
2009-10-02  8:53 ` [PATCH 0/2] memcg: improving scalability by reducing lock contention at charge/uncharge KAMEZAWA Hiroyuki
2009-10-05  7:18   ` KAMEZAWA Hiroyuki
2009-10-05  7:18     ` KAMEZAWA Hiroyuki
2009-10-05 10:37 ` Balbir Singh
2009-10-05 10:37   ` Balbir Singh
     [not found] ` <604427e00910091737s52e11ce9p256c95d533dc2837@mail.gmail.com>
2009-10-11  2:33   ` KAMEZAWA Hiroyuki
2009-10-11  2:33     ` KAMEZAWA Hiroyuki
     [not found]     ` <604427e00910111134o6f22f0ddg2b87124dd334ec02@mail.gmail.com>
2009-10-12 11:38       ` Balbir Singh
2009-10-12 11:38         ` Balbir Singh
2009-10-13  0:29       ` KAMEZAWA Hiroyuki
2009-10-13  0:29         ` KAMEZAWA Hiroyuki
     [not found]         ` <604427e00910121818w71dd4b7dl8781d7f5bc4f7dd9@mail.gmail.com>
2009-10-13  1:28           ` KAMEZAWA Hiroyuki
2009-10-13  1:28             ` KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091008151710.1216a615.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=h-shimamoto@ct.jp.nec.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nis@tyo205.gate.nec.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.