From: Andrew Morton <akpm@linux-foundation.org>
To: Christoph Lameter <cl@linux-foundation.org>
Cc: peterz@infradead.org, rientjes@google.com, npiggin@suse.de,
menage@google.com, dfults@sgi.com, linux-kernel@vger.kernel.org,
containers@lists.osdl.org
Subject: Re: [patch 0/7] cpuset writeback throttling
Date: Tue, 4 Nov 2008 19:05:05 -0800
Message-ID: <20081104190505.769b93ec.akpm@linux-foundation.org>
In-Reply-To: <Pine.LNX.4.64.0811042036000.31167@quilx.com>
On Tue, 4 Nov 2008 20:45:17 -0600 (CST) Christoph Lameter <cl@linux-foundation.org> wrote:
> On Tue, 4 Nov 2008, Andrew Morton wrote:
>
> > In a memcg implementation what we would implement is "throttle
> > page-dirtying tasks in this memcg when the memcg's dirty memory reaches
> > 40% of its total".
>
> Right that is similar to what this patch does for cpusets. A memcg
> implementation would need to figure out if we are currently part of a
> memcg and then determine the percentage of memory that is dirty.
>
> That is one aspect. When performing writeback then we need to figure out
> which inodes have dirty pages in the memcg and we need to start writeout
> on those inodes and not on others that have their dirty pages elsewhere.
> There are two components of this that are in this patch and that would
> also have to be implemented for a memcg.
Doable. lru->page->mapping->host is a good start.
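
For illustration only (this is not code from the patchset), the pointer chain mentioned above can be sketched in userspace C. All `toy_*` types and the function name are hypothetical stand-ins for the kernel's page, address_space and inode structures. The sketch assumes a group's LRU is a simple singly linked list and that an inode carries a flag saying it is already queued for writeout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kernel structures. */
struct toy_inode {
	unsigned long ino;
	bool queued;			/* already picked for writeout */
};

struct toy_mapping {
	struct toy_inode *host;		/* owning inode, as in mapping->host */
};

struct toy_page {
	struct toy_mapping *mapping;	/* NULL for pages with no file behind them */
	bool dirty;
	struct toy_page *lru_next;	/* next page on this group's LRU */
};

/*
 * Walk one group's LRU and collect each inode that owns at least one
 * dirty page on it. Returns the number of inodes stored in out[].
 */
size_t toy_collect_dirty_inodes(struct toy_page *lru_head,
				struct toy_inode **out, size_t max_out)
{
	size_t n = 0;

	for (struct toy_page *p = lru_head; p && n < max_out; p = p->lru_next) {
		if (!p->dirty || !p->mapping)
			continue;	/* clean page, or no mapping to follow */

		struct toy_inode *inode = p->mapping->host;

		if (!inode || inode->queued)
			continue;	/* nothing to write, or already collected */

		inode->queued = true;
		out[n++] = inode;
	}
	return n;
}
```

The walk only sees pages that currently sit on the group's LRU, so an inode whose dirty pages live elsewhere is never found.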
> > But that doesn't solve the problem which this patchset is trying to
> > solve, which is "don't let all the memory in all this group of nodes
> > get dirty".
>
> This patch would solve the problem if the calculation of the dirty pages
> would consider the active memcg and be able to determine the amount of
> dirty pages (through some sort of additional memcg counters). That is just
> the first part though. The second part of finding the inodes that have
> dirty pages for writeback would require an association between memcgs and
> inodes.
We presently have that via the LRU. It has holes, but so does this per-cpuset
scheme.
> > What happens if cpuset A uses nodes 0,1,2,3,4,5,6,7,8,9 and cpuset B
> > uses nodes 0,1? Can activity in cpuset A cause ooms in cpuset B?
>
> Yes if the activities of cpuset A cause all pages to be dirtied in cpuset
> B and then cpuset B attempts to do writeback. This will fail to acquire
> enough memory for writeback and make reclaim impossible.
>
> Typically cpusets are not overlapped like that but used to segment the
> system.
>
> The system would work correctly if the dirty ratio calculation would be
> done on all overlapping cpusets/memcg groups that contain nodes from
> which allocations are permitted.
That.
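
A rough userspace sketch of one reading of that suggestion, not taken from the patchset: compute the dirty ratio over the nodes of every group that overlaps the nodes a task may allocate from, and throttle if any of them is over its limit. The `toy_*` names, the 64-node bitmask and the single shared ratio are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_NODES 64

/* Hypothetical per-node counters. */
struct toy_node_stat {
	unsigned long dirty_pages;
	unsigned long total_pages;
};

/* True if dirty memory summed over the nodes in the mask reaches ratio_percent. */
bool toy_over_dirty_limit(uint64_t nodes,
			  const struct toy_node_stat *stats, /* TOY_MAX_NODES entries */
			  unsigned int ratio_percent)
{
	unsigned long dirty = 0, total = 0;

	for (unsigned int node = 0; node < TOY_MAX_NODES; node++) {
		if (!(nodes & (UINT64_C(1) << node)))
			continue;
		dirty += stats[node].dirty_pages;
		total += stats[node].total_pages;
	}
	return total != 0 && dirty * 100 >= total * ratio_percent;
}

/*
 * Throttle a dirtier if any group sharing a node with the task's allowed
 * nodes is over the limit when measured on that group's own nodes.
 */
bool toy_should_throttle(uint64_t task_nodes,
			 const uint64_t *group_nodes, size_t ngroups,
			 const struct toy_node_stat *stats,
			 unsigned int ratio_percent)
{
	for (size_t i = 0; i < ngroups; i++) {
		if ((group_nodes[i] & task_nodes) &&
		    toy_over_dirty_limit(group_nodes[i], stats, ratio_percent))
			return true;
	}
	return false;
}
```

With cpuset A on nodes 0-9 and cpuset B on nodes 0,1, dirtying 90% of nodes 0 and 1 leaves A at 18% overall while B sits at 90%. A check on A's nodes alone would not throttle, whereas the overlap check would.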
Generally, I worry that this is a specific fix to a specific problem
encountered on specific machines with specific setups and specific
workloads, and that it's just all too low-level and myopic.

And now we're back in the usual position where there's existing code and
everyone says it's terribly wonderful and everyone is reluctant to step
back and look at the big picture. Am I wrong?

Plus: we need per-memcg dirty-memory throttling, and this is more
important than per-cpuset, I suspect. How will the (already rather
buggy) code look once we've stuffed both of them in there?

I agree that there's a problem here, although given the amount of time
that it's been there, I suspect that it is a very small problem.

Someone please convince me that in three years' time we will agree that
merging this fix to that problem was a correct decision?