From: Andrea Righi <arighi@develer.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Vivek Goyal <vgoyal@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Suleiman Souhlal <suleiman@google.com>,
	Greg Thelen <gthelen@google.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH -mmotm 0/5] memcg: per cgroup dirty limit (v6)
Date: Thu, 11 Mar 2010 23:23:48 +0100
Message-ID: <20100311222348.GB2427@linux>
In-Reply-To: <20100311093913.07c9ca8a.kamezawa.hiroyu@jp.fujitsu.com>

On Thu, Mar 11, 2010 at 09:39:13AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 10 Mar 2010 00:00:31 +0100
> Andrea Righi <arighi@develer.com> wrote:
> 
> > Control the maximum amount of dirty pages a cgroup can have at any given time.
> > 
> > Per cgroup dirty limit is like fixing the max amount of dirty (hard to reclaim)
> > page cache used by any cgroup. So, in case of multiple cgroup writers, they
> > will not be able to consume more than their designated share of dirty pages and
> > will be forced to perform write-out if they cross that limit.
> > 
> > The overall design is the following:
> > 
> >  - account dirty pages per cgroup
> >  - limit the number of dirty pages via memory.dirty_ratio / memory.dirty_bytes
> >    and memory.dirty_background_ratio / memory.dirty_background_bytes in
> >    cgroupfs
> >  - start to write-out (background or actively) when the cgroup limits are
> >    exceeded
> > 
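The three design steps quoted above (account, limit, trigger write-out) can be illustrated with a small user-space model. This is only a sketch, not the patchset's code: the class name, the `memory_limit_bytes` field, the default ratios, and the rule that a nonzero `*_bytes` value overrides the matching `*_ratio` (mirroring how the global vm.dirty_bytes knob behaves) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CgroupDirtyState:
    """Hypothetical user-space model of per-cgroup dirty accounting."""
    memory_limit_bytes: int            # assumed: memory available to the cgroup
    dirty_ratio: int = 20              # percent, illustrative default
    dirty_bytes: int = 0               # 0 means "use dirty_ratio"
    dirty_background_ratio: int = 10   # percent, illustrative default
    dirty_background_bytes: int = 0    # 0 means "use dirty_background_ratio"
    dirty_used_bytes: int = 0          # dirty page cache currently accounted

    def _threshold(self, ratio: int, nbytes: int) -> int:
        # Assumption: an explicit byte limit takes precedence over the ratio.
        return nbytes if nbytes else self.memory_limit_bytes * ratio // 100

    def account_dirty(self, nbytes: int) -> None:
        """Step 1: account newly dirtied page cache to this cgroup."""
        self.dirty_used_bytes += nbytes

    def account_cleaned(self, nbytes: int) -> None:
        """Drop accounting once pages have been written back."""
        self.dirty_used_bytes = max(0, self.dirty_used_bytes - nbytes)

    def writeout_action(self) -> str:
        """Steps 2 and 3: compare usage with the limits and pick an action."""
        hard = self._threshold(self.dirty_ratio, self.dirty_bytes)
        background = self._threshold(self.dirty_background_ratio,
                                     self.dirty_background_bytes)
        if self.dirty_used_bytes > hard:
            return "foreground"   # the writer itself must perform write-out
        if self.dirty_used_bytes > background:
            return "background"   # background write-out should start
        return "none"
```

A caller would invoke `account_dirty()` whenever the cgroup dirties page cache and then act on `writeout_action()`; the real patches do this inside the kernel's writeback paths rather than in a class like this one.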
> > This feature is supposed to be strictly connected to any underlying IO
> > controller implementation, so we can stop increasing dirty pages in VM layer
> > and enforce a write-out before any cgroup will consume the global amount of
> > dirty pages defined by the /proc/sys/vm/dirty_ratio|dirty_bytes and
> > /proc/sys/vm/dirty_background_ratio|dirty_background_bytes limits.
> > 
> > Changelog (v5 -> v6)
> > ~~~~~~~~~~~~~~~~~~~~~~
> >  * always disable/enable IRQs at lock/unlock_page_cgroup(): this allows
> >    dropping the previous complicated locking scheme in favor of a simpler
> >    one, even though it adds some overhead (see results below)
> >  * drop FUSE and NILFS2 dirty pages accounting for now (this depends on
> >    charging bounce pages per cgroup)
> > 
> > Results
> > ~~~~~~~
> > I ran some tests using a kernel build (2.6.33 x86_64_defconfig) on an
> > Intel Core 2 @ 1.2GHz as a test case, using different kernels:
> >  - mmotm "vanilla"
> >  - mmotm with cgroup-dirty-memory using the previous "complex" locking scheme
> >    (my previous patchset + the fixes reported by Kame-san and Daisuke-san)
> >  - mmotm with cgroup-dirty-memory using the simple locking scheme
> >    (lock_page_cgroup() with IRQs disabled)
> > 
> > The results follow:
> > <before>
> >  - mmotm "vanilla", root  cgroup:			11m51.983s
> >  - mmotm "vanilla", child cgroup:			11m56.596s
> > 
> > <after>
> >  - mmotm, "complex" locking scheme, root  cgroup:	11m53.037s
> >  - mmotm, "complex" locking scheme, child cgroup:	11m57.896s
> > 
> >  - mmotm, lock_page_cgroup+irq_disabled, root  cgroup:	12m5.499s
> >  - mmotm, lock_page_cgroup+irq_disabled, child cgroup:	12m9.920s
> > 
> > With the "complex" locking solution, the overhead introduced by the
> > cgroup dirty memory accounting is minimal (0.14%), compared with the overhead
> > introduced by the lock_page_cgroup+irq_disabled solution (1.90%).
> > 
> Hmm....isn't this bigger than expected ?

Consider that I'm not running the kernel build on tmpfs, but on a
filesystem on /dev/sda. So the additional overhead is to be expected
compared to vanilla mmotm, where there's only FILE_MAPPED accounting.

-Andrea
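
The overhead percentages quoted above can be re-derived from the raw build times. The following is a small sketch of that arithmetic; the helper names are made up for illustration, and the timings are copied from the quoted results. For the root cgroup it gives roughly 0.15% for the "complex" scheme and 1.90% for lock_page_cgroup+irq_disabled, which agrees with the quoted 0.14% and 1.90% up to rounding, so the quoted figures appear to be based on the root cgroup runs.

```python
def to_seconds(stamp: str) -> float:
    """Convert a time(1)-style string such as '11m51.983s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def overhead_percent(baseline: str, patched: str) -> float:
    """Relative slowdown of `patched` over `baseline`, in percent."""
    base = to_seconds(baseline)
    return (to_seconds(patched) - base) / base * 100.0


# Build times copied from the quoted results.
RESULTS = {
    "root": {"vanilla": "11m51.983s",
             "complex": "11m53.037s",
             "irq_disabled": "12m5.499s"},
    "child": {"vanilla": "11m56.596s",
              "complex": "11m57.896s",
              "irq_disabled": "12m9.920s"},
}

for cgroup, runs in RESULTS.items():
    for scheme in ("complex", "irq_disabled"):
        pct = overhead_percent(runs["vanilla"], runs[scheme])
        print(f"{cgroup:5s} {scheme:12s} {pct:.2f}%")
```

The same calculation on the child cgroup runs gives about 0.18% and 1.86%, so the relative cost of the two locking schemes is similar in both cases.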
