From: Glauber Costa <glommer@parallels.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-kernel@vger.kernel.org>, <xemul@parallels.com>,
	<paul@paulmenage.org>, <lizf@cn.fujitsu.com>,
	<daniel.lezcano@free.fr>, <mingo@elte.hu>,
	<jbottomley@parallels.com>
Subject: Re: [PATCH 1/9] Remove parent field in cpuacct cgroup
Date: Mon, 19 Sep 2011 13:30:46 -0300
Message-ID: <4E776E36.6040906@parallels.com>
In-Reply-To: <1316449160.6091.5.camel@twins>

On 09/19/2011 01:19 PM, Peter Zijlstra wrote:
> On Mon, 2011-09-19 at 13:09 -0300, Glauber Costa wrote:
>> On 09/19/2011 01:03 PM, Peter Zijlstra wrote:
>>> On Wed, 2011-09-14 at 17:04 -0300, Glauber Costa wrote:
>>>> +       for (; ca; ca = parent_ca(ca)) {
>>>
>>> It might be good to check that the loop condition and null condition in
>>> the parent_ca() function get folded. Otherwise there's a double branch
>>> in that loop.
>>>
>>> Note that this function is one of the reasons I dislike cpuacct, it adds
>>> a second cgroup hierarchy traversal to every context switch.
>>>
>> Well, it is not that hard to optimize this.
>>
>> Those values are always updated, but they don't really need to, unless
>> they are read.
>>
>> So what we can do, is introduce a marker in the cgroup, representing the
>> last read value. Parent is untouched. We then update parent when 1)
>> reading this value, 2) cgroup destroy, 3) cpu hotplug. (humm, and maybe
>> we don't even need to do it in cpu hotplug, since the per-cpu variables
>> will still be accessible... )
>>
>> How about it ?
>
> Updating that value would involve iterating all tasks in the entire
> cgroup subtree nested at whatever cgroup you're wanting to read.

No, it would not: nothing is stored in the task, everything is stored
in the cgroup. So it is O(h(n)), where n is the number of cgroups and
h(n) is the height of the cgroup tree.
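
As a rough illustration of the marker idea quoted above, here is a
user-space sketch, not kernel code. The names struct acct_group,
charge() and flush_to_ancestors() are hypothetical and do not come
from the patch series. It assumes one counter per group and a marker
recording how much of that counter has already been pushed up. The
hot path touches only the task's own group; the walk to the root
happens only on a flush, and costs one step per ancestor.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpuacct group; not the kernel's struct. */
struct acct_group {
	struct acct_group *parent;
	uint64_t usage;		/* charged here, or flushed up from below */
	uint64_t marker;	/* part of usage already pushed to ancestors */
};

/* Hot path: charge only the group itself, no hierarchy walk. */
static void charge(struct acct_group *g, uint64_t delta)
{
	g->usage += delta;
}

/*
 * Slow path (read, destroy): push the not-yet-flushed delta to every
 * ancestor. Returns the number of ancestors visited, which is bounded
 * by the height of the tree.
 */
static unsigned int flush_to_ancestors(struct acct_group *g)
{
	uint64_t delta = g->usage - g->marker;
	unsigned int steps = 0;
	struct acct_group *p;

	for (p = g->parent; p; p = p->parent) {
		p->usage += delta;
		/* delta has already reached p's ancestors: do not re-flush it */
		p->marker += delta;
		steps++;
	}
	g->marker = g->usage;
	return steps;
}
```

In this sketch an ancestor's counter only reflects what its
descendants have flushed so far, so a reader of an inner group sees
the children's time as of their last flush.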

> The delayed update would be an entire subtree walk, that can be quite
> expensive.
But the subtrees are small, because we are talking about the cgroup
subtree, which can grow quite a lot in breadth, but rarely in depth.

> Who wants these numbers and what for and at what frequency?
> Does that really make sense?

Whoever wants /proc/stat numbers. Once or maybe twice a second would be
the normal interval for most use cases, I guess (top inside a
container, for instance).

Even people reading these values much more frequently would not come
close to doing it every tick, so this option is cheaper than
traversing the tree at each tick.

Btw, this works for cpuacct. For cpuusage, I am not sure this
optimization is valid: since that value is at least intended to
provide a basis for cpu capping in the near future (well, it is not
there, but I think it is), it is expected to be used much more
frequently by the kernel itself.
