From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Martin Bligh <mbligh@mbligh.org>
Cc: rohitseth@google.com, Dave Hansen <haveblue@us.ibm.com>,
Kirill Korotaev <dev@sw.ru>,
vatsa@in.ibm.com, Alan Cox <alan@lxorguk.ukuu.org.uk>,
Andrew Morton <akpm@osdl.org>,
mingo@elte.hu, sam@vilain.net, linux-kernel@vger.kernel.org,
dev@openvz.org, efault@gmx.de, balbir@in.ibm.com,
sekharan@us.ibm.com, nagar@watson.ibm.com, pj@sgi.com,
Andrey Savochkin <saw@sw.ru>
Subject: Re: memory resource accounting (was Re: [RFC, PATCH 0/5] Going forward with Resource Management - A cpu controller)
Date: Wed, 09 Aug 2006 11:54:07 +1000
Message-ID: <44D9403F.4070608@yahoo.com.au>
In-Reply-To: <44D8C4F9.3000402@mbligh.org>
Martin Bligh wrote:
>>> It also saves you from maintaining huge lists against each page.
>>>
>>> Worst case, you want to bill everyone who opens that address_space
>>> equally. But the semantics on exit still suck.
>>>
>>> What was Alan's quote again? "unfair, unreliable, inefficient ...
>>> pick at least one out of the three". or something like that.
>>
>>
>> What's the sucking semantics on exit? I haven't looked much at the
>> existing memory controllers going around, but the implementation I
>> imagine looks something like this (I think it is conceptually similar
>> to the basic beancounters idea):
>
>
> You have to increase the other processes allocations, putting them
> over their limits. If you then force them into reclaim, they're going
> to stall, and give bad latency.
Not within a particular container. If the process exits but leaves around
some memory charge, then that just remains within the same container.
If you want to remove a container, then you have a hierarchy of billing
and your charge just gets accounted to the parent.
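A minimal userspace sketch of the charge-back described above, under stated assumptions: `struct container`, `live_owner()`, `charge()`, `uncharge()` and `destroy()` are hypothetical names for illustration, not existing kernel interfaces, and locking is ignored entirely. A destroyed container hands its outstanding charge to its parent and stays around as a small stub so that stale backpointers still resolve to whoever pays now.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting container; not an existing kernel structure. */
struct container {
    struct container *parent;  /* NULL for the root container */
    long charged;              /* pages currently billed to this container */
    int dead;                  /* set once the container has been destroyed */
};

/* Follow parent links past destroyed containers to the one that now pays. */
static struct container *live_owner(struct container *c)
{
    while (c->dead && c->parent)
        c = c->parent;
    return c;
}

/* Bill pages to a container (or to its nearest live ancestor). */
static void charge(struct container *c, long pages)
{
    live_owner(c)->charged += pages;
}

/* Release pages; a stale pointer to a dead container resolves upward. */
static void uncharge(struct container *c, long pages)
{
    struct container *owner = live_owner(c);

    assert(owner->charged >= pages);
    owner->charged -= pages;
}

/*
 * Destroy a container: its outstanding charge moves to the parent.
 * The stub itself is kept so old backpointers do not go stale.
 */
static void destroy(struct container *c)
{
    assert(c->parent != NULL);  /* the root is never destroyed in this sketch */
    live_owner(c->parent)->charged += c->charged;
    c->charged = 0;
    c->dead = 1;
}
```

With a root and one child, charging five pages to the child and then destroying it leaves those five pages billed to the root; a later `uncharge()` through the stale child pointer decrements the root.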
>
>> - anyone who allocates a page for anything gets charged for that page.
>> Except interrupt/softirq context. Could we ignore these for the
>> moment?
>>
>> This does give you kernel (slab, pagetable, etc) allocations as well as
>> userspace. I don't like the idea of doing controllers for inode cache
>> and controllers for dentry cache, etc, etc, ad infinitum.
>>
>> - each struct page has a backpointer to its billed container. At the mm
>> summit Linus said he didn't want back pointers, but I clarified with him
>> and he isn't against them if they are easily configured out when
>> not using memory controllers.
>>
>> - memory accounting containers are in a hierarchy. If you want to destroy a
>> container but it still has billed memory outstanding, that gets charged
>> back to the parent. The data structure itself obviously still needs to
>> stay around, to keep the backpointers from going stale... but that could
>> be as little as a word or two in size.
>>
>> The reason I like this way of accounting is that it can be done with a couple
>> of hooks into page_alloc.c and an ifdef in mm.h, and that is the extent of
>> the impact on core mm/ so I'd be against anything more intrusive unless this
>> really doesn't work.
>>
>
> See "inefficient" above (sorry ;-)) What you've chosen is more correct,
> but much higher overhead. The point was that there are tradeoffs either
> way - the conclusion we came to last time was that to make it 100%
> correct, you'd be better off going with a model like Xen.
So if someone says they want it 100% correct, I can tell them to use
Xen and not put accounting into any place in the kernel that allocates
memory? Sweet OK.
If we're happy with doing userspace only memory, then a similar scheme
can be implemented on an object-accounting basis (eg. vmas). I think
there is something that already implements this.
>
> 1. You're adding a backpointer to struct page.
That's nowhere near the overhead of pte chain rmaps, though. I think it
is perfectly acceptable (assuming you *did* want to account kernel page
allocations) and probably will be difficult to notice on non-crazy-highmem
boxes. Which is just about everyone we care about now.
>
> 2. Each page is not accounted to one container, but shared across them,
> so the billing changes every time someone forks or exits. And not just
> for that container, but all of them. Think pte chain based rmap ...
> except worse.
In my proposed scheme, it is just the first who allocates. You hope that
statistically, that is good enough. Otherwise you could go into tracking
what process has a reference to which dentry... good luck getting that
past Al and Christoph.
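To make the "first who allocates" rule concrete, here is a small userspace sketch, assuming a stand-in `struct page` with a single owner backpointer. All names (`page_alloc()`, `page_share()`, `page_put()`, the `users` count) are hypothetical and chosen for illustration; they are not the kernel's page allocator API. The point it shows is that later sharers coming and going never move the bill, so fork and exit do not touch the accounting until the page is actually freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting container, reduced to a single counter. */
struct container {
    long charged;  /* pages currently billed to this container */
};

/* Stand-in for struct page carrying the optional backpointer. */
struct page {
    struct container *owner;  /* container billed at allocation time */
    int users;                /* how many users currently reference the page */
};

/* Allocation: the allocating container is billed, once. */
static void page_alloc(struct page *pg, struct container *allocator)
{
    pg->owner = allocator;
    pg->users = 1;
    allocator->charged++;
}

/* Another user starts sharing the page: billing does not change. */
static void page_share(struct page *pg)
{
    pg->users++;
}

/* A user drops the page; returns 1 if this was the last reference. */
static int page_put(struct page *pg)
{
    assert(pg->users > 0);
    if (--pg->users > 0)
        return 0;              /* still in use, original owner keeps paying */
    pg->owner->charged--;      /* last user gone: uncharge the original owner */
    pg->owner = NULL;
    return 1;
}
```

If container A allocates a page and a second user later shares it, A stays billed for one page even after one of the two drops its reference; the charge goes away only when the last reference does. This is the statistical approximation: cheap to maintain, but the first toucher may pay for memory others keep alive.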
>
> 3. When a container needs to "shrink" when somebody else exits, how do
> we reclaim pages from a specific container?
Not the problem of accounting. Any other scheme will have a similar
problem.
However, having the container in the struct page *could* actually help
directed reclaim FWIW.