From: Andrea Righi <arighi@develer.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Peter Zijlstra <peterz@infradead.org>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Trond Myklebust <trond.myklebust@fys.uio.no>,
Suleiman Souhlal <suleiman@google.com>,
Greg Thelen <gthelen@google.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Andrew Morton <akpm@linux-foundation.org>,
containers@lists.linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH -mmotm 0/5] memcg: per cgroup dirty limit (v6)
Date: Wed, 17 Mar 2010 23:32:22 +0100
Message-ID: <20100317223222.GA8467@linux.develer.com>
In-Reply-To: <20100315143841.GE21127@redhat.com>
On Mon, Mar 15, 2010 at 10:38:41AM -0400, Vivek Goyal wrote:
> > >
> > > bdi_thresh ~= per_memory_cgroup_dirty * bdi_fraction
> > >
> > > But bdi_nr_reclaimable and bdi_nr_writeback stats are still global.
> > >
> > Why doesn't the bdi_thresh of the ROOT cgroup depend on the global number?
> >
>
> I think in the current implementation the ROOT cgroup's bdi_thresh is always the
> same as the global number. It differs from the global number only for child
> groups, because of their reduced dirtyable_memory() limit. And we
> don't seem to allow any control on the root group.
>
> But I am wondering what happens in the following case.
>
> IIUC, with use_hierarchy=0, if I create two test groups test1 and test2, then
> the hierarchy looks as follows.
>
> root test1 test2
>
> Now the root group's DIRTYABLE is still system-wide, but test1's and test2's
> dirtyable will be reduced based on RES_LIMIT in those groups.
>
> Conceptually, the per-cgroup dirty ratio is like fixing the page cache share of
> each group. So effectively we are saying that these limits apply only to
> child groups of root, but not to root as such?
Correct. In this implementation the root cgroup means "outside all cgroups".
I think this is acceptable behaviour, since in general we don't
set any limit on the root cgroup.
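
As a rough illustration of that asymmetry, here is a minimal sketch. It is not
code from the patch set: the function names, the page counts, the 20% dirty
ratio and the 1/2 bdi fraction are all made-up assumptions. It only models the
two statements quoted above, namely that root's dirtyable memory stays
system-wide while a child's is reduced by its RES_LIMIT, and that
bdi_thresh ~= per-cgroup dirty limit * bdi_fraction.

```python
from fractions import Fraction


def dirtyable_pages(global_dirtyable, res_limit=None):
    """Hypothetical model of per-cgroup dirtyable memory, in pages.

    res_limit=None stands for the root cgroup, which sees the system-wide
    figure; a child cgroup is capped by its RES_LIMIT.
    """
    if res_limit is None:
        return global_dirtyable
    return min(global_dirtyable, res_limit)


def bdi_thresh(dirtyable, dirty_ratio_percent, bdi_fraction):
    """bdi_thresh ~= per-cgroup dirty limit * bdi_fraction (illustrative)."""
    dirty_limit = dirtyable * dirty_ratio_percent // 100
    return int(dirty_limit * bdi_fraction)


GLOBAL_DIRTYABLE = 1_000_000  # pages; made-up figure
RATIO = 20                    # percent; made-up figure
FRACTION = Fraction(1, 2)     # this bdi's share; made-up figure

root = bdi_thresh(dirtyable_pages(GLOBAL_DIRTYABLE), RATIO, FRACTION)
test1 = bdi_thresh(dirtyable_pages(GLOBAL_DIRTYABLE, res_limit=100_000),
                   RATIO, FRACTION)
print(root, test1)
```

With these numbers root keeps the global threshold, while test1's threshold is
ten times smaller, even though the bdi_nr_reclaimable and bdi_nr_writeback
counters it is compared against are still global.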
>
> > > So for the same number of dirty pages system-wide on this bdi, we will be
> > > triggering writeouts much more aggressively if somebody has created a few
> > > memory cgroups and tasks are running in those cgroups.
> > >
> > > I guess it might cause performance regressions in the case of small file
> > > writeouts, because previously one could have written the file to cache and
> > > been done with it, but with this patch set there are higher chances that
> > > you will be throttled to write the pages back to disk.
> > >
> > > I guess we need two pieces to resolve this.
> > > - BDI stats per cgroup.
> > > - Writeback of inodes from same cgroup.
> > >
> > > I think BDI stats per cgroup will increase the complexity.
> > >
> > Thank you for the clarification. IIUC, the dirty_limit implementation should
> > assume there is an I/O resource controller; maybe typical users will use the
> > I/O resource controller and memcg at the same time.
> > Then, my question is: what happens when it is used with the I/O resource controller?
> >
>
> Currently the IO resource controller keeps all the async IO queues in the root
> group, so we can't measure exactly. But my guess is that until we
> at least implement "writeback inodes from same cgroup", we will not see
> an increased flow of writes from one cgroup over another cgroup.
Agreed. And I plan to look at the "writeback inodes per cgroup" feature
soon. I'm sorry, but I have some deadlines this week, so I'll probably
start working on this next weekend.
-Andrea