From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>,
Shailabh Nagar <nagar1234@in.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
John stultz <johnstul@us.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Roland McGrath <roland@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org
Subject: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
Date: Fri, 19 Nov 2010 21:11:11 +0100
Message-ID: <20101119201144.542948128@linux.vnet.ibm.com>
In-Reply-To: 20101119201108.269346583@linux.vnet.ibm.com
[-- Attachment #1: 03-taskstats-top-improve-ctime-account-cdata_acct.patch --]
[-- Type: text/plain, Size: 4388 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Currently the cumulative time accounting in Linux is not complete.
As required by POSIX.1-2001, the CPU time of a process is not added
to the cumulative time of its parent if the parent ignores SIGCHLD
or has set SA_NOCLDWAIT. The major drawback of this behaviour is that
the total CPU time consumed on a system cannot be calculated by
looking at the current tasks: the CPU time of such children is lost.
This patch adds a new set of cumulative time counters. We then have two
cumulative counter sets:
 * cdata_wait: Traditional cumulative time used e.g. by getrusage.
 * cdata_acct: Cumulative time that also includes dead processes with
   parents that ignore SIGCHLD or have set SA_NOCLDWAIT.
cdata_acct will be exported by taskstats.
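How the two counter sets diverge can be sketched with a small user-space model. This is a simplified illustration, not the kernel code: struct cdata_model, struct proc_model and model_child_exit() are hypothetical names, only utime and stime are tracked, and thread accounting and locking are left out. It assumes, as in this patch, that a dying process is always folded into the parent's cdata_acct, and into cdata_wait only when the parent reaps it with wait.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct cdata (CPU times only). */
struct cdata_model {
	uint64_t utime;
	uint64_t stime;
};

/* Hypothetical stand-in for the per-process accounting state. */
struct proc_model {
	struct cdata_model self;	/* time used by the process itself */
	struct cdata_model cdata_wait;	/* children reaped via wait */
	struct cdata_model cdata_acct;	/* all dead children */
};

/* dst += own + cumulative */
static void fold(struct cdata_model *dst, const struct cdata_model *own,
		 const struct cdata_model *cumulative)
{
	dst->utime += own->utime + cumulative->utime;
	dst->stime += own->stime + cumulative->stime;
}

/* Model of a child exiting: cdata_acct is always updated, cdata_wait
 * only if the parent actually waited for the child. */
static void model_child_exit(struct proc_model *parent,
			     const struct proc_model *child, int reaped)
{
	fold(&parent->cdata_acct, &child->self, &child->cdata_acct);
	if (reaped)
		fold(&parent->cdata_wait, &child->self, &child->cdata_wait);
}
```

For example, if a grandchild dies under a child that ignores SIGCHLD and the child is later reaped by its parent, the parent's cdata_wait contains only the child's own time, whereas cdata_acct also contains the grandchild's time.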
TODO:
-----
With this patch we take the siglock twice: first for the dead task
and then for the parent of the dead task. This gives the following
lockdep warning (a lockdep annotation is probably needed here):
=============================================
[ INFO: possible recursive locking detected ]
2.6.37-rc1-00116-g151f52f-dirty #19
---------------------------------------------
kworker/u:0/15 is trying to acquire lock:
(&(&sighand->siglock)->rlock){......}, at: [<000000000014a426>] __account_cdata+0x6e/0x444
but task is already holding lock:
(&(&sighand->siglock)->rlock){......}, at: [<000000000014b634>] release_task+0x160/0x6a0
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
include/linux/sched.h | 2 ++
kernel/exit.c | 36 +++++++++++++++++++++++++-----------
2 files changed, 27 insertions(+), 11 deletions(-)
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -595,6 +595,8 @@ struct signal_struct {
*/
struct cdata cdata_wait;
struct cdata cdata_threads;
+ struct cdata cdata_acct;
+ struct task_io_accounting ioac_acct;
struct task_io_accounting ioac;
#ifndef CONFIG_VIRT_CPU_ACCOUNTING
cputime_t prev_utime, prev_stime;
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -74,10 +74,10 @@ static void __unhash_process(struct task
list_del_rcu(&p->thread_group);
}
-static void __account_cdata(struct task_struct *p)
+static void __account_cdata(struct task_struct *p, int wait)
{
struct cdata *cd, *pcd, *tcd;
- unsigned long maxrss;
+ unsigned long maxrss, flags;
cputime_t tgutime, tgstime;
/*
@@ -100,11 +100,16 @@ static void __account_cdata(struct task_
* group including the group leader.
*/
thread_group_times(p, &tgutime, &tgstime);
- spin_lock_irq(&p->real_parent->sighand->siglock);
- pcd = &p->real_parent->signal->cdata_wait;
- tcd = &p->signal->cdata_threads;
- cd = &p->signal->cdata_wait;
-
+ spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
+ if (wait) {
+ pcd = &p->real_parent->signal->cdata_wait;
+ tcd = &p->signal->cdata_threads;
+ cd = &p->signal->cdata_wait;
+ } else {
+ pcd = &p->real_parent->signal->cdata_acct;
+ tcd = &p->signal->cdata_threads;
+ cd = &p->signal->cdata_acct;
+ }
pcd->utime =
cputime_add(pcd->utime,
cputime_add(tgutime,
@@ -135,9 +140,17 @@ static void __account_cdata(struct task_
maxrss = max(tcd->maxrss, cd->maxrss);
if (pcd->maxrss < maxrss)
pcd->maxrss = maxrss;
- task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
- task_io_accounting_add(&p->real_parent->signal->ioac, &p->signal->ioac);
- spin_unlock_irq(&p->real_parent->sighand->siglock);
+ if (wait) {
+ task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+ task_io_accounting_add(&p->real_parent->signal->ioac,
+ &p->signal->ioac);
+ } else {
+ task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+ &p->ioac);
+ task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+ &p->signal->ioac_acct);
+ }
+ spin_unlock_irqrestore(&p->real_parent->sighand->siglock, flags);
}
/*
@@ -157,6 +170,7 @@ static void __exit_signal(struct task_st
posix_cpu_timers_exit(tsk);
if (group_dead) {
+ __account_cdata(tsk, 0);
posix_cpu_timers_exit_group(tsk);
tty = sig->tty;
sig->tty = NULL;
@@ -1293,7 +1307,7 @@ static int wait_task_zombie(struct wait_
* !task_detached() to filter out sub-threads.
*/
if (likely(!traced) && likely(!task_detached(p)))
- __account_cdata(p);
+ __account_cdata(p, 1);
/*
* Now we are sure this task is interesting, and no other