From: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra
<a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
"Eric W. Biederman"
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>,
Serge Hallyn
<serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
handai.szj-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
lxc-users-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
Andrew.Phillips-xheW4WVAX9Y@public.gmane.org
Subject: Re: [PATCH v2 2/5] account guest time per-cgroup as well.
Date: Mon, 28 May 2012 17:26:25 +0400
Message-ID: <4FC37D01.7080704@parallels.com>
In-Reply-To: <CAPM31RLwY4d-Ng3-T+-1eLxuZxr8wbdC_+sDQbJQXuqEfe9tfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 1673 bytes --]
On 05/26/2012 08:44 AM, Paul Turner wrote:
> On 04/09/2012 03:25 PM, Glauber Costa wrote:
>> In the interest of providing a per-cgroup figure of common statistics,
>> this patch adds a nr_switches counter to each group runqueue (both cfs
>> and rt).
>>
>> To avoid impact on schedule(), we don't walk the tree at stat gather
>> time. This is because schedule() is called much more frequently than
>> the tick functions, in which we do walk the tree.
>>
>> When this figure needs to be read (different patch), we will
>> aggregate them at read time.
>>
>>
Paul,

How about the following patch instead?

It still uses the cfs_rq and rt_rq structures (this code only touches
fair.c as a PoC; rt would be similar).

Tasks in the root cgroup (those without an se->parent) will take one
branch and exit. For the others, we accumulate here, which simplifies
the reader.

My reasoning is that all the se->parent relations should already be
cached by our recent call to put_prev_task (unless, of course, we have
a really long chain).

This would incur a slightly higher context switch time for tasks inside
a cgroup.

The reader (in a different patch) would then be the same as the others:

+static u64 tg_nr_switches(struct task_group *tg, int cpu)
+{
+	if (tg != &root_task_group)
+		return rt_rq(rt_nr_switches, tg, cpu) +
+		       fair_rq(nr_switches, tg, cpu);
+
+	return cpu_rq(cpu)->nr_switches;
+}

I plan to measure this today, but the cost of one extra branch in the
common case of a task in the root cgroup, plus O(depth) for tasks inside
cgroups, may be acceptable given the simplification it brings.

Let me know what you think.
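
To make the accounting concrete, here is a minimal user-space sketch in C
of the walk that the attached patch adds to pick_next_task_fair(). It is
only a model under stated assumptions: struct entity, struct group_rq,
account_switch() and run_demo() are names invented for illustration, not
kernel identifiers, and my_q stands in for what group_cfs_rq() returns.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a group runqueue (cfs_rq in the patch). */
struct group_rq {
	uint64_t nr_switches;	/* switches accounted to this group */
	struct entity *prev;	/* last entity picked directly from this group */
};

/* Hypothetical stand-in for a sched_entity. */
struct entity {
	struct entity *parent;	/* NULL for entities in the root cgroup */
	struct group_rq *my_q;	/* queue owned by a group entity, NULL for tasks */
};

/* Model of the accounting done when `se` is picked to run next. */
void account_switch(struct entity *se)
{
	struct group_rq *rq;

	if (!se->parent)		/* root cgroup: one branch, then exit */
		return;

	rq = se->parent->my_q;
	if (rq->prev == se)		/* same entity picked again: no switch */
		return;
	rq->prev = se;

	while (se->parent) {		/* O(depth) walk towards the root */
		se = se->parent;
		se->my_q->nr_switches++;
	}
}

/* Group A contains group B; tasks t1 and t2 live in B, t0 in the root. */
void run_demo(uint64_t *a_switches, uint64_t *b_switches)
{
	struct group_rq qa = { 0, NULL }, qb = { 0, NULL };
	struct entity grp_a = { NULL, &qa };
	struct entity grp_b = { &grp_a, &qb };
	struct entity t0 = { NULL, NULL };
	struct entity t1 = { &grp_b, NULL };
	struct entity t2 = { &grp_b, NULL };

	account_switch(&t1);	/* counted in B and A */
	account_switch(&t1);	/* repeated pick, skipped */
	account_switch(&t2);	/* counted in B and A */
	account_switch(&t0);	/* root task, skipped */
	account_switch(&t1);	/* counted in B and A */

	*a_switches = qa.nr_switches;
	*b_switches = qb.nr_switches;
}
```

In this model run_demo() leaves both groups with a count of 3: every
counted switch in B is also charged to its ancestor A, so a reader only
has to look at the group's own counter instead of walking its children.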
[-- Attachment #2: alternative.patch --]
[-- Type: text/x-patch, Size: 1179 bytes --]
Index: linux/kernel/sched/fair.c
===================================================================
--- linux.orig/kernel/sched/fair.c
+++ linux/kernel/sched/fair.c
@@ -2990,6 +2990,22 @@ static struct task_struct *pick_next_tas
 	if (hrtick_enabled(rq))
 		hrtick_start_fair(rq, p);
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	if (!se->parent)
+		goto out;
+	cfs_rq = group_cfs_rq(se->parent);
+	if (cfs_rq->prev == se)
+		goto out;
+	cfs_rq->prev = se;
+
+	while (se->parent) {
+		se = se->parent;
+		cfs_rq = group_cfs_rq(se);
+		cfs_rq->nr_switches++;
+	}
+out:
+#endif
+
 	return p;
 }
Index: linux/kernel/sched/sched.h
===================================================================
--- linux.orig/kernel/sched/sched.h
+++ linux/kernel/sched/sched.h
@@ -237,6 +237,8 @@ struct cfs_rq {
 	struct list_head leaf_cfs_rq_list;
 	struct task_group *tg;	/* group that "owns" this runqueue */
+	u64 nr_switches;
+	struct sched_entity *prev;
 #ifdef CONFIG_SMP
 	/*
 	 * h_load = weight * f(tg)
@@ -307,6 +309,9 @@ struct rt_rq {
 	struct rq *rq;
 	struct list_head leaf_rt_rq_list;
 	struct task_group *tg;
+
+	u64 rt_nr_switches;
+	struct sched_rt_entity *prev;
 #endif
 };