From: Glauber Costa <glommer@parallels.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-kernel@vger.kernel.org>, <xemul@parallels.com>,
	<paul@paulmenage.org>, <lizf@cn.fujitsu.com>,
	<daniel.lezcano@free.fr>, <mingo@elte.hu>,
	<jbottomley@parallels.com>
Subject: Re: [PATCH 0/9] Per-cgroup /proc/stat
Date: Wed, 14 Sep 2011 17:20:35 -0300
Message-ID: <4E710C93.40609@parallels.com>
In-Reply-To: <1316031196.5040.46.camel@twins>

On 09/14/2011 05:13 PM, Peter Zijlstra wrote:
> On Wed, 2011-09-14 at 17:04 -0300, Glauber Costa wrote:
>> [[ For those getting this twice: I sent it previously to containers
>>     ml, but I guess it was out. Sending now to a broader audience anyway ]]
>>
>> Hi,
>>
>> This patchset is a simple initial proposal for a per-cgroup/container
>> display of /proc/stat. The display method is based on Daniel's idea of
>> exposing a file that can be bind mounted (Daniel, is that more or less
>> what you had in mind?)
>>
>> To grab the stats themselves, I am (ab)using cpuacct cgroup. percpu counters
>> are dropped in favor of normal percpu pointers, so we can easily track
>> per-cpu quantities.
>>
>> In case you guys like this idea, my TODO list would include the removal
>> of the show stat code in fs/proc/stat.c altogether, and the displaying
>> of some fields I haven't touched yet.
>>
>> Also, to demonstrate one of the potential ideas for such a method, I
>> implemented a feature commonly found in hypervisors - steal time - on top
>> of it. I argue that containers can/should also display steal time when
>> available. Turns out that due to the fact that we run on the same kernel,
>> steal time is quite easy to implement once we have per-container tick
>> accounting in place.
>>
>> Please let me know what you guys think
>>
>> Glauber Costa (9):
>>    Remove parent field in cpuacct cgroup
>>    Make cpuacct fields per cpu variables
>>    Include nice values in cpuacct
>>    Include irq and softirq fields in cpuacct
>>    Include guest fields in cpuacct
>>    Include idle and iowait fields in cpuacct
>>    Create cpuacct.proc.stat file
>>    per-cgroup boot time
>>    Report steal time for cgroup
>>
>>   kernel/sched.c |  265 +++++++++++++++++++++++++++++++++++++++++++++++++-------
>>   1 files changed, 234 insertions(+), 31 deletions(-)
>
> I hate it already.. it just smells of more senseless accounting
> overhead.
>
> Guys we should seriously trim back a lot of that code, not grow ever
> more and more. The sad fact is that if you build a kernel with
> cpu-cgroup support the context switch cost is more than double that of a
> kernel without, and then you haven't even started creating cgroups yet.
>
> Also, how doesn't all this duplicate part of cpuacct-cgroup?
>
> /me won't actually look at the patches for a little while longer.

Hey Peter,

Answering just a single point here: if you look closely, it does not
duplicate anything from cpuacct. What it does is split the accounting
into finer-grained groups than just user/system, and the accounting hook
is not called any more often than it already was. I also changed the
counters from percpu counters to plain per-cpu variables, so we can
access per-cpu data. If there is any performance change relative to the
current code, it comes from that, and since per-cpu variables are
cheaper to update (and summing them up is much less frequent), it will
end up even cheaper.
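
To make the per-cpu argument concrete, here is a rough userspace sketch
of the idea, not the code from the patches. All names in it
(acct_group, acct_charge, NR_CPUS_SKETCH, the ST_* indices) are made up
for illustration, and a fixed array stands in for real per-cpu storage.
The update path touches only one CPU's slot, and the sum over CPUs is
only computed when somebody reads the totals:

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count */

/* Hypothetical field indices, loosely following /proc/stat columns. */
enum stat_idx { ST_USER, ST_NICE, ST_SYSTEM, ST_IDLE, ST_NR };

/* One accounting group: a plain array of counters per CPU. */
struct acct_group {
	uint64_t cpustat[NR_CPUS_SKETCH][ST_NR];
};

/* Hot path: touch only the given CPU's slot, no shared state. */
static void acct_charge(struct acct_group *g, int cpu,
			enum stat_idx idx, uint64_t delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	g->cpustat[cpu][idx] += delta;
}

/* Per-cpu read, as needed for the "cpuN" lines of a stat-like file. */
static uint64_t acct_read_cpu(const struct acct_group *g, int cpu,
			      enum stat_idx idx)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return g->cpustat[cpu][idx];
}

/* Slow path: sum all CPUs only when the totals are actually read. */
static uint64_t acct_read_total(const struct acct_group *g,
				enum stat_idx idx)
{
	uint64_t sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += g->cpustat[cpu][idx];
	return sum;
}
```

In this sketch the per-cpu values stay individually readable, which is
what a per-cpu line in a stat file needs, while the aggregate is paid
for only on the read side.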

The steal time feature is really trivial once the per-container tick
accounting is in place.

As for your point about the context switch cost, how would you feel if
we optimized it out using static_branch(), as was done for kvm steal time?
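
Roughly what I have in mind, as a userspace analogy only. A plain
boolean stands in for the jump label key here, so this does not
reproduce the code patching that makes the disabled case nearly free in
the kernel. The names (cgroup_acct_enabled, context_switch_sketch,
cgroup_acct_charge) are invented for the sketch, and the point at which
the key would get enabled is an assumption on my part:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key; assumed to stay false until accounting is wanted. */
static bool cgroup_acct_enabled;

static uint64_t charged_ticks;	/* stand-in for the real accounting */
static uint64_t switches;	/* counts calls into the switch path */

static void cgroup_acct_charge(uint64_t delta)
{
	charged_ticks += delta;
}

/* Stand-in for the context switch path. */
static void context_switch_sketch(uint64_t delta)
{
	switches++;
	/*
	 * In the kernel this test would be a static_branch() on a key,
	 * so the disabled case would not even load a variable.
	 */
	if (cgroup_acct_enabled)
		cgroup_acct_charge(delta);
}

/* Flip the key once; after that the accounting call is taken. */
static void cgroup_acct_enable(void)
{
	cgroup_acct_enabled = true;
}
```

While the key is off, the switch path in the sketch skips the accounting
call entirely, which is the behaviour I would want for kernels built
with the support but not using it.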

I can also commit to taking a look at making the overall performance 
suck less here, but it is really orthogonal to what I just posted.


