From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org, mgorman@suse.de, dhillf@gmail.com,
	aarcange@redhat.com, mhocko@suse.cz, akpm@linux-foundation.org,
	hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Subject: Re: [PATCH -V2 0/9] memcg: add HugeTLB resource tracking
Date: Sun, 04 Mar 2012 23:44:58 +0530
Message-ID: <87boocdyhp.fsf@linux.vnet.ibm.com>
In-Reply-To: <20120302144828.e985c63a.kamezawa.hiroyu@jp.fujitsu.com>

On Fri, 2 Mar 2012 14:48:28 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Thu,  1 Mar 2012 14:46:11 +0530
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:
> 
> > Hi,
> > 
> > This patchset implements a memory controller extension to control
> > HugeTLB allocations. It is similar to the existing hugetlb quota
> > support in that the limit is enforced at mmap(2) time and not at
> > fault time. HugeTLB's quota mechanism limits the number of huge pages
> > that can be allocated per superblock.
> > 
> 
> Thank you, I think memcg-extension is better than hugetlbfs cgroup.
> 
> 
> > For shared mappings we track the regions mapped by a task along with the
> > memcg. We keep the memory controller charged even after the task
> > that did mmap(2) exits. Uncharge happens during truncate. For Private
> > mappings we charge and uncharge from the current task cgroup.
> > 
> 
> What does "current" mean here? The current task's cgroup?

Yes, the current task's cgroup.


> 
> 
> > A sample strace output for an application doing malloc with hugectl is given
> > below. libhugetlbfs will fall back to normal pagesize if the HugeTLB mmap fails.
> > 
> > open("/mnt/libhugetlbfs.tmp.uhLMgy", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
> > unlink("/mnt/libhugetlbfs.tmp.uhLMgy")  = 0
> > 
> > .........
> > 
> > mmap(0x20000000000, 50331648, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = -1 ENOMEM (Cannot allocate memory)
> > write(2, "libhugetlbfs", 12libhugetlbfs)            = 12
> > write(2, ": WARNING: New heap segment map" ....
> > mmap(NULL, 42008576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfff946c0000
> > ....
> > 
> > 
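For anyone reading along, here is a rough userspace model of what the
trace above shows. It is Python, not kernel code. The names
HugetlbLimit, try_charge and alloc_heap_segment are made up for
illustration. The only things taken from the trace are that the HugeTLB
mmap fails with ENOMEM and that the caller then falls back to a mapping
with the normal page size.

```python
import errno


class HugetlbLimit:
    """Hypothetical per-cgroup HugeTLB limit, charged at mmap time (bytes)."""

    def __init__(self, limit):
        self.limit = limit
        self.usage = 0

    def try_charge(self, nbytes):
        # Refuse the whole mapping up front, mirroring enforcement at
        # mmap(2) time rather than at fault time.
        if self.usage + nbytes > self.limit:
            raise OSError(errno.ENOMEM, "Cannot allocate memory")
        self.usage += nbytes


def alloc_heap_segment(cg, nbytes):
    """Try a huge page mapping first; fall back to normal pages on ENOMEM."""
    try:
        cg.try_charge(nbytes)
        return "hugetlb"
    except OSError as err:
        if err.errno != errno.ENOMEM:
            raise
        # What libhugetlbfs does in the trace: warn, then map anonymous
        # memory with the normal page size. Nothing is charged here.
        return "normal"
```

With a limit of 64 MB (an arbitrary value), a first 50331648-byte
segment is charged and a second one of the same size takes the fallback
path without changing the usage counter.
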
> > Goals:
> > 
> > 1) We want to keep the semantics close to hugetlb quota support, i.e., we want
> >    to extend quota semantics to a group of tasks. Currently the hugetlb quota
> >    mechanism allows one to control the number of hugetlb pages allocated per
> >    hugetlbfs superblock.
> > 
> > 2) Applications using hugetlbfs always fall back to normal page size allocation when they
> >    fail to allocate huge pages. libhugetlbfs internally handles this for malloc(3). We
> >    want to retain this behaviour when we enforce the controller limit, i.e., when huge page
> >    allocation fails due to the controller limit, applications should fall back to
> >    allocation using the normal page size. This implies that we need to enforce the
> >    limit at mmap(2).
> > 
> 
> Hm, ok. 
> 
> > 3) HugeTLBfs doesn't support page reclaim. It also doesn't support write(2). Applications
> >    use hugetlbfs via mmap(2) interface. Important point to note here is hugetlbfs
> >    extends file size in mmap.
> > 
> >    With shared mappings, the file size gets extended in mmap and file will remain in hugetlbfs
> >    consuming huge pages until it is truncated. We want to make sure we keep the controller
> >    charged until the file is truncated. This implies that the controller will remain charged
> >    even after the task that did the mmap exits.
> > 
> 
> O.K. hugetlbfs is charged until the file is removed.
> Then, next question will be 'can we destroy cgroup....'

That is explained later. We don't allow cgroup removal if its
non-reclaim resource usage is nonzero. But that restriction should be
removed before this can be merged.
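
To illustrate the lifetime rule for shared mappings and the current
removal restriction, a small userspace model follows (Python, not
kernel code). Cgroup, SharedFile and their methods are hypothetical
names, and EBUSY is only my assumption for the errno. The model also
simplifies truncate to "drop everything".

```python
import errno


class Cgroup:
    """Hypothetical stand-in for a memcg with a HugeTLB usage counter."""

    def __init__(self, name):
        self.name = name
        self.usage = 0  # bytes charged and not yet uncharged

    def rmdir(self):
        # Current easy route: refuse removal while anything is charged.
        # EBUSY is an assumption; the errno is not named in this thread.
        if self.usage != 0:
            raise OSError(errno.EBUSY, "cgroup still has charges")


class SharedFile:
    """A hugetlbfs file; charges live with the file, not with the task."""

    def __init__(self):
        self.charges = []  # (cgroup, nbytes) pairs

    def mmap_shared(self, cg, nbytes):
        cg.usage += nbytes
        self.charges.append((cg, nbytes))

    def truncate(self):
        # Uncharge happens only here, even if the mapping task has exited.
        for cg, nbytes in self.charges:
            cg.usage -= nbytes
        self.charges.clear()
```

In this model a task exiting changes nothing: the cgroup stays charged
and rmdir() keeps failing until the file is truncated.
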

> 
> > Implementation details:
> > 
> > In order to achieve the above goals we need to track the cgroup information
> > along with mmap range in a charge list in inode for shared mapping and in
> > vm_area_struct for private mapping. We won't be using page to track cgroup
> > information because with the above goals we are not really tracking the pages used.
> > 
> > Since we track cgroup in charge list, if we want to remove the cgroup, we need to update
> > the charge list to point to the parent cgroup. Currently we take the easy route
> > and prevent a cgroup removal if its non-reclaim resource usage is nonzero.
> > 
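
The reparenting step described above could look roughly like this in a
userspace model (Python, not kernel code). The (start, end, cgroup)
tuple layout, the Cgroup class and reparent_charges are assumptions for
illustration only; the patches use the region tracking code, not a
plain list.

```python
class Cgroup:
    """Hypothetical cgroup node that only knows its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def reparent_charges(charge_list, dying):
    """Return a charge list where regions owned by `dying` point at its parent.

    Each entry is (start, end, cgroup), with start/end taken to be
    huge page indices in the file (an assumption of this sketch).
    """
    return [
        (start, end, dying.parent if cg is dying else cg)
        for (start, end, cg) in charge_list
    ]
```

Regions charged to other cgroups are left untouched, so the removed
group's usage simply moves one level up.
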
> 
> As Andrew pointed out, there is some ongoing work on page-range tracking.
> Please check.

Will do


-aneesh


