All of lore.kernel.org
 help / color / mirror / Atom feed
From: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra
	<a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"Eric W. Biederman"
	<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>,
	Serge Hallyn
	<serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
	devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
	handai.szj-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
	Andrew.Phillips-xheW4WVAX9Y@public.gmane.org
Subject: Re: [PATCH v2 2/5] account guest time per-cgroup as well.
Date: Mon, 28 May 2012 17:26:25 +0400	[thread overview]
Message-ID: <4FC37D01.7080704@parallels.com> (raw)
In-Reply-To: <CAPM31RLwY4d-Ng3-T+-1eLxuZxr8wbdC_+sDQbJQXuqEfe9tfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 1673 bytes --]

On 05/26/2012 08:44 AM, Paul Turner wrote:
> On 04/09/2012 03:25 PM, Glauber Costa wrote:
>> In the interest of providing a per-cgroup figure of common statistics,
>> this patch adds a nr_switches counter to each group runqueue (both cfs
>> and rt).
>>
>> To avoid impact on schedule(), we don't walk the tree at stat gather
>> time. This is because schedule() is called much more frequently than
>> the tick functions, in which we do walk the tree.
>>
>> When this figure needs to be read (different patch), we will
>> aggregate them at read time.
>>
>>

Paul,

How about the following patch instead?

It is still using the cfs_rq and rt_rq's structures, (this code actually 
only touches fair.c as a PoC, rt would be similar).

Tasks in the root cgroup (without an se->parent), will do a branch and 
exit. For the others, we accumulate here, and simplify the reader.

My reasoning for this, is based on the fact that all the se->parent 
relations should be cached by our recent call to put_prev_task (well, 
unless of course we have a really big chain)

This would incur a slightly higher context switch time for tasks inside 
a cgroup.

The reader (in a different patch) would then be the same as the others:

+static u64 tg_nr_switches(struct task_group *tg, int cpu)
+{
+	if (tg != &root_task_group)
+		return rt_rq(rt_nr_switches, tg, cpu)
                        +fair_rq(nr_switches, tg, cpu);
+
+	return cpu_rq(cpu)->nr_switches;
+}

I plan to measure this today, but an extra branch cost for the common 
case of a task in the root cgroup + O(depth) for tasks inside cgroups 
may be acceptable, given the simplification it brings.

Let me know what you think.





[-- Attachment #2: alternative.patch --]
[-- Type: text/x-patch, Size: 1179 bytes --]

Index: linux/kernel/sched/fair.c
===================================================================
--- linux.orig/kernel/sched/fair.c
+++ linux/kernel/sched/fair.c
@@ -2990,6 +2990,22 @@ static struct task_struct *pick_next_tas
 	if (hrtick_enabled(rq))
 		hrtick_start_fair(rq, p);
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	if (!se->parent)
+		goto out;
+	cfs_rq = group_cfs_rq(se->parent);
+	if (cfs_rq->prev == se)
+		goto out;
+	cfs_rq->prev = se;
+
+	while (se->parent) {
+		se = se->parent;
+		cfs_rq = group_cfs_rq(se);
+		cfs_rq->nr_switches++;
+	}
+out:
+#endif
+
 	return p;
 }
 
Index: linux/kernel/sched/sched.h
===================================================================
--- linux.orig/kernel/sched/sched.h
+++ linux/kernel/sched/sched.h
@@ -237,6 +237,8 @@ struct cfs_rq {
 	struct list_head leaf_cfs_rq_list;
 	struct task_group *tg;	/* group that "owns" this runqueue */
 
+	u64 nr_switches;
+	struct sched_entity *prev;
 #ifdef CONFIG_SMP
 	/*
 	 *   h_load = weight * f(tg)
@@ -307,6 +309,9 @@ struct rt_rq {
 	struct rq *rq;
 	struct list_head leaf_rt_rq_list;
 	struct task_group *tg;
+
+	u64 rt_nr_switches;
+	struct sched_rt_entity *prev;
 #endif
 };
 

  parent reply	other threads:[~2012-05-28 13:26 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-09 22:25 [PATCH v2 0/5] per-cgroup /proc/stat statistics Glauber Costa
     [not found] ` <1334010315-4453-1-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-09 22:25   ` [PATCH v2 1/5] measure exec_clock for rt sched entities Glauber Costa
2012-04-09 22:25   ` [PATCH v2 2/5] account guest time per-cgroup as well Glauber Costa
     [not found]     ` <1334010315-4453-3-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-26  4:44       ` Paul Turner
     [not found]         ` <CAPM31RLwY4d-Ng3-T+-1eLxuZxr8wbdC_+sDQbJQXuqEfe9tfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-05-28  9:03           ` Glauber Costa
2012-05-28 13:26           ` Glauber Costa [this message]
     [not found]             ` <4FC37D01.7080704-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-29 10:34               ` Glauber Costa
2012-04-09 22:25   ` [PATCH v2 3/5] record nr_switches per task_group Glauber Costa
2012-04-09 22:25   ` [PATCH v2 4/5] expose fine-grained per-cpu data for cpuacct stats Glauber Costa
     [not found]     ` <1334010315-4453-5-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-18 12:30       ` Sha Zhengju
     [not found]         ` <CAFj3OHUzKDdS_3LrnTk+XaRVt+fGxWkvmh9cjv88Dt4n8Q39MA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-04-18 16:14           ` Glauber Costa
2012-04-09 22:25   ` [PATCH v2 5/5] expose per-taskgroup schedstats in cgroup Glauber Costa
     [not found]     ` <1334010315-4453-6-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-18 14:44       ` Sha Zhengju
     [not found]         ` <CAFj3OHUwF2My5c-+ZCwLNynNTokYwioXP2jTJ4FtKg_=jPed0Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-04-18 14:57           ` Sha Zhengju
2012-04-18 16:24           ` Glauber Costa
     [not found]             ` <4F8EEAC5.7060703-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-04-19 13:30               ` Sha Zhengju
     [not found]                 ` <4F90135C.20203-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2012-04-19 15:00                   ` Glauber Costa
2012-05-24  9:10   ` [PATCH v2 0/5] per-cgroup /proc/stat statistics Glauber Costa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FC37D01.7080704@parallels.com \
    --to=glommer-bzqdu9zft3wakbo8gow8eq@public.gmane.org \
    --cc=Andrew.Phillips-xheW4WVAX9Y@public.gmane.org \
    --cc=a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
    --cc=handai.szj-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
    --cc=pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org \
    --cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.