From: Andrea Righi <arighi@develer.com>
To: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Suleiman Souhlal <suleiman@google.com>,
	Greg Thelen <gthelen@google.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH -mmotm 0/4] memcg: per cgroup dirty limit (v4)
Date: Thu, 4 Mar 2010 22:37:34 +0100
Message-ID: <20100304213734.GA4787@linux>
In-Reply-To: <20100304171143.GG3073@balbir.in.ibm.com>

On Thu, Mar 04, 2010 at 10:41:43PM +0530, Balbir Singh wrote:
> * Andrea Righi <arighi@develer.com> [2010-03-04 11:40:11]:
> 
> > Control the maximum amount of dirty pages a cgroup can have at any given time.
> > 
> > Per cgroup dirty limit is like fixing the max amount of dirty (hard to reclaim)
> > page cache used by any cgroup. So, in case of multiple cgroup writers, they
> > will not be able to consume more than their designated share of dirty pages and
> > will be forced to perform write-out if they cross that limit.
> > 
> > The overall design is the following:
> > 
> >  - account dirty pages per cgroup
> >  - limit the number of dirty pages via memory.dirty_ratio / memory.dirty_bytes
> >    and memory.dirty_background_ratio / memory.dirty_background_bytes in
> >    cgroupfs
> >  - start to write-out (background or actively) when the cgroup limits are
> >    exceeded
> > 
> > This feature is supposed to be strictly connected to any underlying IO
> > controller implementation, so we can stop increasing dirty pages in VM layer
> > and enforce a write-out before any cgroup will consume the global amount of
> > dirty pages defined by the /proc/sys/vm/dirty_ratio|dirty_bytes and
> > /proc/sys/vm/dirty_background_ratio|dirty_background_bytes limits.
> > 
> > Changelog (v3 -> v4)
> > ~~~~~~~~~~~~~~~~~~~~~~
> >  * handle the migration of tasks across different cgroups
> >    NOTE: at the moment we don't move charges of file cache pages, so this
> >    functionality is not immediately necessary. However, since the migration of
> >    file cache pages is in plan, it is better to start handling file pages
> >    anyway.
> >  * properly account dirty pages in nilfs2
> >    (thanks to Kirill A. Shutemov <kirill@shutemov.name>)
> >  * lockless access to dirty memory parameters
> >  * fix: page_cgroup lock must not be acquired under mapping->tree_lock
> >    (thanks to Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> and
> >     KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>)
> >  * code restyling
> >
> 
> This seems to be converging, what sort of tests are you running on
> this patchset? 

A very simple test at the moment, just some parallel dd's running in
different cgroups. For example:

 - cgroup A: low dirty limits (writes are almost sync)
   echo 1000 > /cgroups/A/memory.dirty_bytes
   echo 1000 > /cgroups/A/memory.dirty_background_bytes

 - cgroup B: high dirty limits (writes are all buffered in page cache)
   echo 100 > /cgroups/B/memory.dirty_ratio
   echo 50  > /cgroups/B/memory.dirty_background_ratio

Then run the dd's and look at memory.stat:
  - cgroup A: # dd if=/dev/zero of=A bs=1M count=1000
  - cgroup B: # dd if=/dev/zero of=B bs=1M count=1000
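
The steps above could be wrapped in a small script along these lines. This is
only a sketch: the CGROOT variable, the helper names and the use of the cgroup
v1 "tasks" file to attach the writer are assumptions made for illustration,
and the memory.dirty_* files exist only on a kernel with this patchset applied.

```shell
#!/bin/sh
# Sketch only. CGROOT, set_limit and run_dd are hypothetical names; the
# memory.dirty_* knobs are the ones proposed by this patchset.
CGROOT="${CGROOT:-/cgroups}"

# set_limit CGROUP KNOB VALUE
# Write VALUE into $CGROOT/CGROUP/memory.KNOB (e.g. KNOB=dirty_bytes).
set_limit() {
    printf '%s\n' "$3" > "$CGROOT/$1/memory.$2"
}

# run_dd CGROUP OUTFILE [COUNT]
# Start a new shell, attach it to CGROUP through the "tasks" file
# (assumed cgroup v1 interface), then exec dd so the writer keeps that PID.
# COUNT is the number of 1M blocks and defaults to 1000 as in the test.
run_dd() {
    sh -c 'echo $$ > "$1/tasks" && exec dd if=/dev/zero of="$2" bs=1M count="$3"' \
        sh "$CGROOT/$1" "$2" "${3:-1000}"
}
```

With these helpers, the setup for cgroup A would be "set_limit A dirty_bytes 1000"
and "set_limit A dirty_background_bytes 1000", followed by "run_dd A A &" and
the equivalent calls for cgroup B.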

A random snapshot during the writes:

# grep "dirty\|writeback" /cgroups/[AB]/memory.stat
/cgroups/A/memory.stat:filedirty 0
/cgroups/A/memory.stat:writeback 0
/cgroups/A/memory.stat:writeback_tmp 0
/cgroups/A/memory.stat:dirty_pages 0
/cgroups/A/memory.stat:writeback_pages 0
/cgroups/A/memory.stat:writeback_temp_pages 0
/cgroups/B/memory.stat:filedirty 67226
/cgroups/B/memory.stat:writeback 136
/cgroups/B/memory.stat:writeback_tmp 0
/cgroups/B/memory.stat:dirty_pages 67226
/cgroups/B/memory.stat:writeback_pages 136
/cgroups/B/memory.stat:writeback_temp_pages 0
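
To read a snapshot like the one above more easily, the page counts can be
converted to MiB. A minimal sketch, assuming that dirty_pages is expressed in
pages and that pages are 4 KiB (both are assumptions, as are the dirty_mib
helper and the PAGE_KIB variable):

```shell
# dirty_mib: read grep output in the "file:key value" form shown above on
# stdin and print "<file> <MiB>" for every dirty_pages line.
# PAGE_KIB is a hypothetical knob for the page size in KiB (default 4).
dirty_mib() {
    awk -v page_kib="${PAGE_KIB:-4}" -F'[: ]' '
        # $1 = path of memory.stat, $2 = counter name, $3 = counter value
        $2 == "dirty_pages" { printf "%s %d\n", $1, $3 * page_kib / 1024 }
    '
}
```

Piping the grep command above into dirty_mib would then report roughly 262 MiB
of dirty page cache for cgroup B and 0 for cgroup A, under those assumptions.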

I plan to run more detailed IO benchmarks soon.

-Andrea
