From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>,
Shailabh Nagar <nagar1234@in.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
John stultz <johnstul@us.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Roland McGrath <roland@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org
Subject: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
Date: Fri, 19 Nov 2010 21:11:11 +0100
Message-ID: <20101119201144.542948128@linux.vnet.ibm.com>
In-Reply-To: 20101119201108.269346583@linux.vnet.ibm.com
[-- Attachment #1: 03-taskstats-top-improve-ctime-account-cdata_acct.patch --]
[-- Type: text/plain, Size: 4387 bytes --]
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Currently the cumulative time accounting in Linux is not complete.
As required by POSIX.1-2001, the CPU time of processes is not accounted
to the cumulative time of their parents if the parents ignore SIGCHLD
or have set SA_NOCLDWAIT. This behaviour has the major drawback that
it is not possible to calculate all consumed CPU time of a system by
looking at the current tasks. CPU time can be lost.
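The lost time can be observed from user space. The following is a minimal
sketch, not part of the patch: children_cpu_usec() and
children_cpu_delta_usec() are hypothetical helper names, and the sketch
assumes a Linux kernel of 2.6.9 or later, where getrusage(RUSAGE_CHILDREN)
leaves out children that were reaped automatically because SIGCHLD is
ignored.

```c
#include <errno.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* User + system CPU time of all waited-for children, in microseconds. */
static long children_cpu_usec(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_CHILDREN, &ru) != 0)
		return -1;
	return (ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000000L +
	       ru.ru_utime.tv_usec + ru.ru_stime.tv_usec;
}

/*
 * Hypothetical helper: fork a child that burns roughly 100 ms of CPU time
 * and return by how much the parent's RUSAGE_CHILDREN total grew.
 */
static long children_cpu_delta_usec(int ignore_sigchld)
{
	long before = children_cpu_usec();
	pid_t pid;

	signal(SIGCHLD, ignore_sigchld ? SIG_IGN : SIG_DFL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: spin until about 100 ms of CPU time is consumed. */
		clock_t end = clock() + CLOCKS_PER_SEC / 10;

		while (clock() < end)
			;
		_exit(0);
	}

	/*
	 * With SIGCHLD ignored, wait() blocks until the child is gone and
	 * then fails with ECHILD; otherwise it reaps the child first.
	 */
	while (wait(NULL) > 0 || errno == EINTR)
		;

	signal(SIGCHLD, SIG_DFL);
	return children_cpu_usec() - before;
}
```

Calling children_cpu_delta_usec(0) should report roughly the 100 ms the
child consumed, while children_cpu_delta_usec(1) should report close to
nothing: the child's CPU time is not visible in the parent's cumulative
counters.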
This patch adds a new set of cumulative time counters. We then have two
cumulative counter sets:
* cdata_wait: Traditional cumulative time used e.g. by getrusage.
* cdata_acct: Cumulative time that also includes dead processes with
parents that ignore SIGCHLD or have set SA_NOCLDWAIT.
cdata_acct will be exported by taskstats.
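The difference between the two counter sets can be illustrated with a small
user-space model. This is only a sketch under simplifying assumptions:
struct sim_proc, sim_account_cdata() and sim_exit() are hypothetical names,
only user time is tracked, and locking, maxrss and I/O accounting are left
out. It mirrors the idea of the patch: every exiting process is accounted to
the parent's cdata_acct, but to cdata_wait only if the parent reaps it.

```c
/* Simplified stand-in for the kernel's struct cdata (assumption). */
struct cdata {
	unsigned long utime;
};

/* Simplified model of a process with both cumulative counter sets. */
struct sim_proc {
	unsigned long utime;		/* CPU time used by the process itself */
	struct cdata cdata_wait;	/* traditional, wait()-based counters */
	struct cdata cdata_acct;	/* complete cumulative counters */
};

/* Add the child's own time plus its cumulative time to the parent. */
static void sim_account_cdata(const struct sim_proc *child,
			      struct sim_proc *parent, int wait)
{
	const struct cdata *cd = wait ? &child->cdata_wait : &child->cdata_acct;
	struct cdata *pcd = wait ? &parent->cdata_wait : &parent->cdata_acct;

	pcd->utime += child->utime + cd->utime;
}

/*
 * Model of a process exit: cdata_acct is always updated, cdata_wait only
 * if the parent waits for the child (it does not ignore SIGCHLD).
 */
static void sim_exit(const struct sim_proc *child, struct sim_proc *parent,
		     int parent_waits)
{
	sim_account_cdata(child, parent, 0);
	if (parent_waits)
		sim_account_cdata(child, parent, 1);
}
```

For example, if a grandchild uses 30 time units and its parent (the child)
ignores SIGCHLD, uses 20 units itself and is then reaped by the top-level
process, the top-level cdata_wait ends up with 20 while cdata_acct ends up
with 50.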
TODO:
-----
With this patch we take the siglock twice. First for the dead task
and second for the parent of the dead task. This gives the following
lockdep warning (probably a lockdep annotation is needed here):
=============================================
[ INFO: possible recursive locking detected ]
2.6.37-rc1-00116-g151f52f-dirty #19
---------------------------------------------
kworker/u:0/15 is trying to acquire lock:
(&(&sighand->siglock)->rlock){......}, at: [<000000000014a426>] __account_cdata+0x6e/0x444
but task is already holding lock:
(&(&sighand->siglock)->rlock){......}, at: [<000000000014b634>] release_task+0x160/0x6a0
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
include/linux/sched.h | 2 ++
kernel/exit.c | 36 +++++++++++++++++++++++++-----------
2 files changed, 27 insertions(+), 11 deletions(-)
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -595,6 +595,8 @@ struct signal_struct {
 	 */
 	struct cdata cdata_wait;
 	struct cdata cdata_threads;
+	struct cdata cdata_acct;
+	struct task_io_accounting ioac_acct;
 	struct task_io_accounting ioac;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	cputime_t prev_utime, prev_stime;
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -74,10 +74,10 @@ static void __unhash_process(struct task
 	list_del_rcu(&p->thread_group);
 }
-static void __account_cdata(struct task_struct *p)
+static void __account_cdata(struct task_struct *p, int wait)
 {
 	struct cdata *cd, *pcd, *tcd;
-	unsigned long maxrss;
+	unsigned long maxrss, flags;
 	cputime_t tgutime, tgstime;
 	/*
@@ -100,11 +100,16 @@ static void __account_cdata(struct task_
 	 * group including the group leader.
 	 */
 	thread_group_times(p, &tgutime, &tgstime);
-	spin_lock_irq(&p->real_parent->sighand->siglock);
-	pcd = &p->real_parent->signal->cdata_wait;
-	tcd = &p->signal->cdata_threads;
-	cd = &p->signal->cdata_wait;
-
+	spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
+	if (wait) {
+		pcd = &p->real_parent->signal->cdata_wait;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_wait;
+	} else {
+		pcd = &p->real_parent->signal->cdata_acct;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_acct;
+	}
 	pcd->utime =
 		cputime_add(pcd->utime,
 		cputime_add(tgutime,
@@ -135,9 +140,17 @@ static void __account_cdata(struct task_
 	maxrss = max(tcd->maxrss, cd->maxrss);
 	if (pcd->maxrss < maxrss)
 		pcd->maxrss = maxrss;
-	task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
-	task_io_accounting_add(&p->real_parent->signal->ioac, &p->signal->ioac);
-	spin_unlock_irq(&p->real_parent->sighand->siglock);
+	if (wait) {
+		task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac,
+				       &p->signal->ioac);
+	} else {
+		task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+				       &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+				       &p->signal->ioac_acct);
+	}
+	spin_unlock_irqrestore(&p->real_parent->sighand->siglock, flags);
 }
/*
@@ -157,6 +170,7 @@ static void __exit_signal(struct task_st
 	posix_cpu_timers_exit(tsk);
 	if (group_dead) {
+		__account_cdata(tsk, 0);
 		posix_cpu_timers_exit_group(tsk);
 		tty = sig->tty;
 		sig->tty = NULL;
@@ -1293,7 +1307,7 @@ static int wait_task_zombie(struct wait_
 	 * !task_detached() to filter out sub-threads.
 	 */
 	if (likely(!traced) && likely(!task_detached(p)))
-		__account_cdata(p);
+		__account_cdata(p, 1);
 	/*
 	 * Now we are sure this task is interesting, and no other