From: Paul Turner <pjt@google.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andi Kleen <andi@firstfloor.org>,
	Glauber Costa <glommer@parallels.com>,
	linux-kernel@vger.kernel.org, xemul@parallels.com,
	paul@paulmenage.org, lizf@cn.fujitsu.com, daniel.lezcano@free.fr,
	mingo@elte.hu, jbottomley@parallels.com
Subject: Re: [PATCH 0/9] Per-cgroup /proc/stat
Date: Mon, 19 Sep 2011 16:07:28 -0700
Message-ID: <4E77CB30.3030509@google.com>
In-Reply-To: <1316076989.3045.8.camel@twins>

On 09/15/11 01:56, Peter Zijlstra wrote:
> On Wed, 2011-09-14 at 13:23 -0700, Andi Kleen wrote:
>> Peter Zijlstra<a.p.zijlstra@chello.nl>  writes:
>>>
>>> Guys we should seriously trim back a lot of that code, not grow ever
>>> more and more. The sad fact is that if you build a kernel with
>>> cpu-cgroup support the context switch cost is more than double that of a
>>> kernel without, and then you haven't even started creating cgroups yet.
>>
>> That sounds indeed quite bad. Is it known why it is so costly?
>
> Mostly because all data structures grow and all code paths grow, some by
> quite a bit, its spread all over the place, lots of little cuts etc..
>
> pjt and I tried trimming some of the code paths with static_branch() but
> didn't really get anywhere.. need to get back to looking at this stuff
> sometime soon.

When I get some time I think I'm just going to post a patch[*] that
merges the useful _fields_ (usage, usage_percpu) from cpuacct into cpu.
We are *already* doing the accounting at the entity level, so this
addition is free.

At that point we could make !CONFIG_CGROUP_CPUACCT the default and
deprecate the beast without breaking ABI for those who really need it
(either because their applications have hard-coded paths or because
they really like cgroup user/sys time -- which we COULD duplicate into
cpu but I'm inclined not to).

[*]: the only real caveat is how loudly people scream about the code
duplication; I think it's worth it if it lets us kill cpuacct in the
long run.
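
To illustrate why the merge above is described as free, here is a
minimal user-space sketch in Python, not kernel code. It assumes a
per-entity accounting walk that already visits every level of the group
hierarchy, so that usage and usage_percpu cost one more add per level.
The names Group, charge and sum_exec are hypothetical stand-ins.

```python
from collections import defaultdict


class Group:
    """Toy stand-in for a cpu cgroup's scheduling entity."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.sum_exec = 0                     # runtime total tracked per entity
        self.usage_percpu = defaultdict(int)  # merged field: ns per CPU

    @property
    def usage(self):
        # Total usage is just the sum of the per-CPU counters.
        return sum(self.usage_percpu.values())


def charge(group, cpu, delta_ns):
    """Charge delta_ns of runtime on `cpu` to `group` and all its ancestors.

    Returns the number of levels visited. The walk exists for sum_exec
    anyway; the usage_percpu update rides along on the same pass.
    """
    levels = 0
    while group is not None:
        group.sum_exec += delta_ns
        group.usage_percpu[cpu] += delta_ns   # the only extra work
        group = group.parent
        levels += 1
    return levels
```

In this model a separate cpuacct-style controller would need its own
second walk over its own hierarchy; folding the counters into the
existing walk avoids it.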

Another unrelated optimization on this path that I have sitting around
in patches/ to push at some point is keeping the left-most entity out of
the tree. The worst case is that an entity with a lower vruntime comes
along and we insert the previous left-most into the tree; the best case
is that we get to pick it without futzing with the rb-tree. I think this
was good for a percent or two when I hacked it together before.
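
The left-most idea can be modelled in a few lines of user-space Python.
This is a sketch under assumptions, not the patch mentioned above: a
sorted list stands in for the rb-tree, and RunQueue, Entity and tree_ops
are hypothetical names. The model keeps the invariant that the cached
entity has a vruntime no higher than anything in the tree.

```python
import bisect


class Entity:
    def __init__(self, name, vruntime):
        self.name = name
        self.vruntime = vruntime


class RunQueue:
    """Toy run queue that holds the lowest-vruntime entity outside the tree."""

    def __init__(self):
        self.leftmost = None  # cached entity, NOT stored in the tree
        self.tree = []        # sorted list standing in for the rb-tree
        self.tree_ops = 0     # number of inserts/removals on the "tree"
        self._seq = 0         # tie-breaker so entities are never compared

    def _tree_insert(self, se):
        self._seq += 1
        bisect.insort(self.tree, (se.vruntime, self._seq, se))
        self.tree_ops += 1

    def _tree_pop_min(self):
        self.tree_ops += 1
        return self.tree.pop(0)[2]

    def enqueue(self, se):
        if self.leftmost is None:
            # Best case: nothing cached, so no tree work at all.
            self.leftmost = se
        elif se.vruntime < self.leftmost.vruntime:
            # Worst case: the previous left-most is pushed into the tree.
            self._tree_insert(self.leftmost)
            self.leftmost = se
        else:
            self._tree_insert(se)

    def pick_next(self):
        se = self.leftmost
        if se is not None:
            # Refill the cache from the tree only if the tree has entries.
            self.leftmost = self._tree_pop_min() if self.tree else None
        return se
```

With a single runnable entity that is repeatedly enqueued and picked,
tree_ops stays at zero in this model, which is the case a ping-pong
workload such as pipe_test would mostly exercise.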

Another idea I have kicking around for this path is the introduction of
a link_entity that bridges over nr_running=1 chains (break it
opportunistically when an element in the chain goes to nr_running=2).
This one requires some pretty careful accounting around the breaking of
a chain though, so I'm not touching it until I get the new load tracking
code out.  (Incidentally, when I benchmarked it before LPC I had it
working out to be a little more efficient than the current math, good
for ~2-3% on pipe_test.)
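
A rough user-space model of the bridging idea, in Python. It only shows
the shortcut over single-child chains and the opportunistic break; it
deliberately leaves out the accounting along the chain, which is the
hard part noted above. Node, link and pick are hypothetical names, and
len(children) plays the role of nr_running.

```python
class Node:
    """A group (has children) or a task (no children) in a toy hierarchy."""

    def __init__(self, name, parent=None, vruntime=0):
        self.name = name
        self.parent = parent
        self.vruntime = vruntime
        self.children = []  # runnable children; len() models nr_running
        self.link = None    # shortcut to the end of a single-child chain
        if parent is not None:
            parent.add(self)

    def add(self, child):
        self.children.append(child)
        # Break links conservatively: any shortcut held by this node or
        # an ancestor may pass through here, so drop them all.
        node = self
        while node is not None:
            node.link = None
            node = node.parent


def pick(root):
    """Return (task, steps), where steps counts hierarchy levels visited."""
    node, steps = root, 0
    while node.children:
        steps += 1
        if node.link is not None:
            node = node.link          # bridge over the whole chain
        elif len(node.children) == 1:
            # Walk the single-child chain once and remember where it ends.
            end = node.children[0]
            while len(end.children) == 1:
                end = end.children[0]
                steps += 1
            node.link = end
            node = end
        else:
            # Stand-in for picking the left-most entity at this level.
            node = min(node.children, key=lambda c: c.vruntime)
    return node, steps
```

On a root -> g1 -> g2 -> task chain, the first pick visits three levels
and later picks visit one, until something is added under the chain and
the link is dropped.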

- Paul
