From: Glauber Costa <glommer@parallels.com>
To: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Andi Kleen <andi@firstfloor.org>, <linux-kernel@vger.kernel.org>,
<xemul@parallels.com>, <paul@paulmenage.org>,
<lizf@cn.fujitsu.com>, <daniel.lezcano@free.fr>,
<jbottomley@parallels.com>
Subject: Re: [PATCH 0/9] Per-cgroup /proc/stat
Date: Tue, 20 Sep 2011 18:37:49 -0300
Message-ID: <4E7907AD.3030408@parallels.com>
In-Reply-To: <4E77CB30.3030509@google.com>
On 09/19/2011 08:07 PM, Paul Turner wrote:
> On 09/15/11 01:56, Peter Zijlstra wrote:
>> On Wed, 2011-09-14 at 13:23 -0700, Andi Kleen wrote:
>>> Peter Zijlstra<a.p.zijlstra@chello.nl> writes:
>>>>
>>>> Guys we should seriously trim back a lot of that code, not grow ever
>>>> more and more. The sad fact is that if you build a kernel with
>>>> cpu-cgroup support the context switch cost is more than double that of a
>>>> kernel without, and then you haven't even started creating cgroups yet.
>>>
>>> That sounds indeed quite bad. Is it known why it is so costly?
>>
>> Mostly because all data structures grow and all code paths grow, some by
>> quite a bit, its spread all over the place, lots of little cuts etc..
>>
>> pjt and I tried trimming some of the code paths with static_branch() but
>> didn't really get anywhere.. need to get back to looking at this stuff
>> sometime soon.
>
> When I get some time I think I'm just going to post a patch[*] that
> merges the useful _field_ (usage, usage_percpu) from cpuacct into cpu
> since we are *already* doing the accounting on the entity level making
> this addition free.
Agreed.
> At that point we could !CONFIG_CGROUP_CPUACCT by default and deprecate
> the beast without breaking ABI for those who really need it (either
> because their applications have hard-coded paths or because they really
> like cgroup user/sys time -- which we COULD duplicate into cpu but I'm
> inclined not to).
Well, why not? Now that I look into it, one of the nice ways to achieve
what I am proposing in this patchset is:
1) get rid of cpuacct;
2) do all the accounting per cpu cgroup, and then feed it into fs/proc/stat.c.
> [*]: the only real caveat is how loudly people scream about the code
> duplication; I think it's worth it if it let's us kill cpuacct in the
> long run.
One way to deprecate it is probably to disallow writing any tasks to
cpuacct's tasks file. We then expose whatever information there is in
cpu/.
It may get ugly, since we'll need to touch core cgroup code, but it is
nice from a user PoV.
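To make the two steps above concrete, here is a rough user-space sketch, not kernel code, of what "account per cpu cgroup, then feed fs/proc/stat.c" could look like. Everything in it is an assumption made for illustration: struct cpu_group, group_charge(), group_sum(), group_format_stat(), the fixed NR_CPUS and the four-field layout are hypothetical names and choices, and charging every ancestor on each update is just one possible way to handle the hierarchy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NR_CPUS 4   /* hypothetical fixed CPU count for the sketch */

/* A small subset of the /proc/stat columns, for illustration only. */
enum stat_idx { ST_USER, ST_NICE, ST_SYSTEM, ST_IDLE, ST_NR };

/* Hypothetical stand-in for a cpu cgroup; not the kernel's task_group. */
struct cpu_group {
    struct cpu_group *parent;        /* NULL for the root group */
    uint64_t stat[NR_CPUS][ST_NR];   /* per-cpu counters, in ticks */
};

/* Charge ticks to a group and to every ancestor (assumed policy). */
static void group_charge(struct cpu_group *g, int cpu,
                         enum stat_idx idx, uint64_t ticks)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    for (; g; g = g->parent)
        g->stat[cpu][idx] += ticks;
}

/* Sum one field over all CPUs, as a reader of the stat file would. */
static uint64_t group_sum(const struct cpu_group *g, enum stat_idx idx)
{
    uint64_t sum = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += g->stat[cpu][idx];
    return sum;
}

/* Format the aggregate "cpu" line; returns what snprintf() returns. */
static int group_format_stat(const struct cpu_group *g, char *buf, size_t len)
{
    return snprintf(buf, len, "cpu %llu %llu %llu %llu",
                    (unsigned long long)group_sum(g, ST_USER),
                    (unsigned long long)group_sum(g, ST_NICE),
                    (unsigned long long)group_sum(g, ST_SYSTEM),
                    (unsigned long long)group_sum(g, ST_IDLE));
}
```

In this sketch the writer only ever touches its own CPU's slot, and the summing cost is paid by whoever reads the file. Whether walking the ancestors on every charge is acceptable is a separate question that the sketch does not try to answer.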
> Another unrelated optimization on this path I have sitting around in
> patches/ to push at some point is keeping the left-most entity out of
> tree; since the worst case is an entity with a lower-vruntime comes
> along and we insert the previous left-most and the best case is we get
> to pick it without futzing with the rb-tree. I think this was good for a
> percent or two when I hacked it together before.
>
> Another idea I have kicking around for this path is the introduction of
> a link_entity which bridges over nr_running=1 chains (break it
> opportunistically when an element in the chain goes to nr_running=2).
> This one requires some pretty careful accounting around the breaking of
> a chain though so I'm not touching it until I get the new load tracking
> code out. (Incidentally when I benchmarked it before LPC I had it
> working out to be a little more efficient than the current math good for
> ~2-3% on pipe_test.)
>
> - Paul
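
A minimal user-space sketch of the "left-most entity out of tree" idea quoted above, as one possible reading of it. It is not taken from the patches Paul mentions: struct entity, struct runqueue, enqueue() and pick_next() are hypothetical names, a sorted list stands in for the rb-tree, and tree_ops exists only to count how often the ordered structure is touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scheduling entity; only vruntime matters here. */
struct entity {
    uint64_t vruntime;
    struct entity *next;       /* link inside the ordered structure */
};

struct runqueue {
    struct entity *leftmost;   /* smallest vruntime, kept out of the tree */
    struct entity *tree;       /* sorted list standing in for the rb-tree */
    unsigned int tree_ops;     /* number of inserts/removals on the "tree" */
};

/* Ordered insert; this is the cost the cached entity tries to avoid. */
static void tree_insert(struct runqueue *rq, struct entity *se)
{
    struct entity **link = &rq->tree;

    while (*link && (*link)->vruntime <= se->vruntime)
        link = &(*link)->next;
    se->next = *link;
    *link = se;
    rq->tree_ops++;
}

/* Remove and return the smallest entity in the "tree", or NULL. */
static struct entity *tree_pop_min(struct runqueue *rq)
{
    struct entity *se = rq->tree;

    if (se) {
        rq->tree = se->next;
        se->next = NULL;
        rq->tree_ops++;
    }
    return se;
}

/* Invariant: leftmost is NULL only when the "tree" is empty too. */
static void enqueue(struct runqueue *rq, struct entity *se)
{
    se->next = NULL;
    if (!rq->leftmost) {
        rq->leftmost = se;                /* best case: no tree work */
    } else if (se->vruntime < rq->leftmost->vruntime) {
        tree_insert(rq, rq->leftmost);    /* worst case: old one goes in */
        rq->leftmost = se;
    } else {
        tree_insert(rq, se);
    }
}

/* Pick the cached entity, then refill the cache from the "tree". */
static struct entity *pick_next(struct runqueue *rq)
{
    struct entity *se = rq->leftmost;

    rq->leftmost = tree_pop_min(rq);
    return se;
}
```

With a single runnable entity, enqueue() followed by pick_next() never touches the ordered structure, which is the best case described in the quoted text. A lower-vruntime arrival costs one insert of the previous left-most entity.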