From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"lizf@cn.fujitsu.com" <lizf@cn.fujitsu.com>,
	Rik van Riel <riel@surriel.com>,
	Bharata B Rao <bharata.rao@in.ibm.com>,
	Dhaval Giani <dhaval@linux.vnet.ibm.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [RFI] Shared accounting for memory resource controller
Date: Tue, 7 Apr 2009 12:07:22 +0530
Message-ID: <20090407063722.GQ7082@balbir.in.ibm.com>

Hi, All,

This is a request for input on the design of shared page accounting for
the memory resource controller. Here is what I have so far.

Motivation for shared page accounting
-------------------------------------
1. Memory cgroup administrators will benefit from knowing how much of the
   data is shared; it helps them size the groups correctly.
2. We currently report only the pages brought in by the cgroup. Knowledge
   of shared data will give a complete picture of the actual usage.

Use rmap to account sharing/unsharing through mapcount
-------------------------------------------------------

Each page currently has links to

	+-------+
	|       |
	|	+--->pc->mem_cgroup (first mem_cgroup to touch the page)
	|	|
	| page	|
	|	+--->mapping (used for rmap)
	|	|
	+-------+

Accounting works well while pages are being shared, but I've hit a problem
when pages get unshared. Here is the current flow for shared accounting.

Flow for sharing
----------------
1. Page not yet mapped anywhere (_mapcount is -1; mem_cgroup and mapping
   are NULL)
2. Page gets mapped for the first time (_mapcount is 0, mem_cgroup points
   to the memory resource group that brought in the page, mapping is set)
3. Page gets shared (_mapcount is 1, mem_cgroup points to the cgroup that
   brought in the page, mapping is set and now has rmap information)
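The three steps above can be read directly off _mapcount. A minimal
user-space sketch, assuming only the -1 bias described in the steps; the
toy_* names are made up for illustration and are not kernel symbols:

```c
#include <assert.h>

/* Toy model only; these names are illustrative, not kernel symbols. */
enum toy_map_state {
	TOY_UNMAPPED,	/* step 1: _mapcount == -1 */
	TOY_PRIVATE,	/* step 2: _mapcount == 0, one mapping */
	TOY_SHARED	/* step 3: _mapcount >= 1, two or more mappings */
};

/* _mapcount is biased by -1, so the number of mappings is _mapcount + 1. */
int toy_nr_mappings(int mapcount)
{
	assert(mapcount >= -1);
	return mapcount + 1;
}

/* Classify a page by the flow step its _mapcount value corresponds to. */
enum toy_map_state toy_classify(int mapcount)
{
	int nr = toy_nr_mappings(mapcount);

	if (nr == 0)
		return TOY_UNMAPPED;
	return nr == 1 ? TOY_PRIVATE : TOY_SHARED;
}
```

The transitions of interest for accounting are then 0 -> 1 (the page becomes
shared) and 1 -> 0 (the page becomes private again).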

When a page is being shared at step 3, we detect that we are sharing the
page and:

1. For page->pc->mem_cgroup, we note that the page is being shared
2. For any vma that maps this page, we get to vma->vm_mm and then to the
   other mem_cgroup that is sharing this page and note this page is being
   shared.
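To make the two steps concrete, here is a toy user-space model, not the
actual hooks. The structs, the mappers[] array (standing in for the rmap
walk over vmas) and the nr_shared counter are all assumptions made for
illustration. It also assumes each cgroup counts a shared page at most once,
which the flow above does not specify:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MAPPERS 8

/* Hypothetical stand-ins for mem_cgroup, mm_struct and page. */
struct toy_cgroup { long nr_shared; };		/* shared pages noted so far */
struct toy_mm { struct toy_cgroup *memcg; };	/* like vma->vm_mm -> mem_cgroup */
struct toy_page {
	int mapcount;				/* -1 = unmapped, like _mapcount */
	struct toy_cgroup *owner;		/* like page->pc->mem_cgroup */
	struct toy_mm *mappers[TOY_MAX_MAPPERS];	/* stands in for rmap */
};

/* Does any current mapper of the page belong to memcg? */
static int toy_cgroup_maps(const struct toy_page *page,
			   const struct toy_cgroup *memcg)
{
	int i;

	for (i = 0; i <= page->mapcount; i++)
		if (page->mappers[i]->memcg == memcg)
			return 1;
	return 0;
}

void toy_map(struct toy_page *page, struct toy_mm *mm)
{
	int seen = toy_cgroup_maps(page, mm->memcg);
	int idx = page->mapcount + 1;

	assert(idx < TOY_MAX_MAPPERS);
	page->mappers[idx] = mm;
	page->mapcount++;

	if (page->mapcount == 0) {
		/* First mapping: remember who brought the page in. */
		page->owner = mm->memcg;
		return;
	}
	if (page->mapcount == 1)
		page->owner->nr_shared++;	/* note sharing for the owner */
	if (!seen)
		mm->memcg->nr_shared++;		/* note sharing for the new cgroup */
}
```

In this model the owner is always still a mapper, because the sketch never
unmaps anything. That is exactly the assumption that breaks on the unshare
path.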

So far so good.

When a page is being uncharged:

1. We note that we need to deduct the shared accounting from the mem_cgroup.
2. When the _mapcount reaches 0, we have no way of knowing which of the
   mms or mem_cgroups is left behind. The original page->pc->mem_cgroup
   could have unmapped this page a long time ago. At this point we want
   to note that the only mm that still has this page mapped, and its
   mem_cgroup, is no longer sharing the page; the page is now private to it.

Figuring out the mem_cgroup/mm for the last uncharge requires an rmap
lookup, which we cannot do with the PTE lock held (I have all my hooks in
page_add.*rmap() and page_remove_rmap()).
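The problem can be reproduced in a toy user-space model. Everything below is
hypothetical: the structs are stand-ins, and the loop in toy_last_mapper()
plays the role of the rmap lookup that the real hooks cannot perform under
the PTE lock:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MAPPERS 4

/* Hypothetical stand-ins for mem_cgroup, mm_struct and page. */
struct toy_cgroup { const char *name; };
struct toy_mm { struct toy_cgroup *memcg; };
struct toy_page {
	int mapcount;				/* -1 = unmapped, like _mapcount */
	struct toy_cgroup *owner;		/* like page->pc->mem_cgroup */
	struct toy_mm *mappers[TOY_MAX_MAPPERS];	/* NULL = free slot */
};

/* Drop mm's mapping of the page. The owner pointer is left untouched. */
void toy_unmap(struct toy_page *page, struct toy_mm *mm)
{
	int i;

	for (i = 0; i < TOY_MAX_MAPPERS; i++) {
		if (page->mappers[i] == mm) {
			page->mappers[i] = NULL;
			page->mapcount--;
			return;
		}
	}
	assert(!"mm does not map this page");
}

/*
 * Stand-in for the rmap lookup: once mapcount is back to 0, walk the
 * mappers to find the single mm that still maps the page.
 * Returns NULL if the page is not mapped exactly once.
 */
struct toy_mm *toy_last_mapper(const struct toy_page *page)
{
	int i;

	if (page->mapcount != 0)
		return NULL;
	for (i = 0; i < TOY_MAX_MAPPERS; i++)
		if (page->mappers[i])
			return page->mappers[i];
	return NULL;
}
```

If cgroup A brought the page in, cgroup B later mapped it, and A then unmaps
it, the owner pointer still says A while toy_last_mapper() returns B's mm.
Only the walk reveals that the page is now private to B.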

Questions, suggestions

1. Does it make sense to use the rmap routines for shared accounting?
2. How do we solve the problem of the last unshare causing the page
   to become private?
	a. Can we use rmap?
	b. Can we live with leaving the page marked as shared, even
	   though it is no longer shared?


-- 
	Balbir
