public inbox for linux-kernel@vger.kernel.org
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>,
	Shailabh Nagar <nagar1234@in.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	John stultz <johnstul@us.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Roland McGrath <roland@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org
Subject: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
Date: Fri, 19 Nov 2010 21:11:11 +0100
Message-ID: <20101119201144.542948128@linux.vnet.ibm.com>
In-Reply-To: 20101119201108.269346583@linux.vnet.ibm.com

[-- Attachment #1: 03-taskstats-top-improve-ctime-account-cdata_acct.patch --]
[-- Type: text/plain, Size: 4388 bytes --]

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

Currently the cumulative time accounting in Linux is incomplete.
As required by POSIX.1-2001, the CPU time of a process is not added
to the cumulative time of its parent if the parent ignores SIGCHLD
or has set SA_NOCLDWAIT. The major drawback of this behaviour is that
the total CPU time consumed on a system cannot be calculated by
looking at the current tasks: the CPU time of such children is lost.
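
The effect can be observed from user space. The following is only a
minimal sketch, assuming a Linux system; the helper names
children_cpu_seconds() and children_cpu_delta() are made up for this
illustration and are not part of the patch. A child burns roughly 0.2s
of CPU, and the parent then checks how much getrusage(RUSAGE_CHILDREN)
grew, once with SIGCHLD ignored and once with the default disposition.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Total (user + system) CPU seconds charged to waited-for children. */
static double children_cpu_seconds(void)
{
	struct rusage ru;
	int ret = getrusage(RUSAGE_CHILDREN, &ru);

	assert(ret == 0);
	return (ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) +
	       (ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) / 1e6;
}

/*
 * Hypothetical helper: fork a child that burns about 0.2s of CPU, wait
 * until it is gone and return how much the RUSAGE_CHILDREN total grew.
 */
static double children_cpu_delta(int ignore_sigchld)
{
	double before;
	pid_t pid;

	signal(SIGCHLD, ignore_sigchld ? SIG_IGN : SIG_DFL);
	before = children_cpu_seconds();

	pid = fork();
	assert(pid >= 0);
	if (pid == 0) {
		clock_t start = clock();

		/* Busy loop until the child has used ~0.2s of CPU time. */
		while ((double)(clock() - start) / CLOCKS_PER_SEC < 0.2)
			;
		_exit(0);
	}

	/*
	 * With SIGCHLD ignored, waitpid() still blocks until the child has
	 * terminated, but then fails with ECHILD: there is nothing to reap.
	 */
	while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
		;

	signal(SIGCHLD, SIG_DFL);
	return children_cpu_seconds() - before;
}
```

Called from a small main(), children_cpu_delta(0) should report roughly
the CPU time the child used, while children_cpu_delta(1) should report
no growth at all, which is the lost CPU time described above.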

This patch adds a new set of cumulative time counters. We then have two
cumulative counter sets:

* cdata_wait: Traditional cumulative time used e.g. by getrusage.
* cdata_acct: Cumulative time that also includes dead processes with
              parents that ignore SIGCHLD or have set SA_NOCLDWAIT.
              cdata_acct will be exported by taskstats.
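
The relation between the two sets can be modelled in plain user-space
C. This is a simplified sketch and not the kernel code: struct proc,
account() and proc_exit() are hypothetical names, cdata_threads is left
out, and reaping is collapsed into the exit step, whereas in the patch
the cdata_wait update happens in wait_task_zombie().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct cdata. */
struct cdata {
	unsigned long long utime, stime;
};

/* Hypothetical process model, for illustration only. */
struct proc {
	struct proc *parent;
	bool ignores_sigchld;		/* SIGCHLD ignored or SA_NOCLDWAIT */
	unsigned long long utime, stime;	/* own CPU time */
	struct cdata cdata_wait;	/* traditional set (getrusage) */
	struct cdata cdata_acct;	/* complete set (taskstats) */
};

/* Add p's own time plus its cumulative set cd to the parent's set pcd. */
static void account(struct cdata *pcd, const struct proc *p,
		    const struct cdata *cd)
{
	pcd->utime += p->utime + cd->utime;
	pcd->stime += p->stime + cd->stime;
}

static void proc_exit(struct proc *p)
{
	assert(p->parent != NULL);

	/* Always accounted, corresponds to __account_cdata(tsk, 0). */
	account(&p->parent->cdata_acct, p, &p->cdata_acct);

	/* Only if the parent reaps, corresponds to __account_cdata(p, 1). */
	if (!p->parent->ignores_sigchld)
		account(&p->parent->cdata_wait, p, &p->cdata_wait);
}
```

In this model, a child exiting under a parent that ignores SIGCHLD
leaves the parent's cdata_wait untouched, while cdata_acct still
carries the child's time up to the grandparent once the parent exits.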

TODO:
-----
With this patch we take the siglock twice: first for the dead task
and then for the parent of the dead task. This gives the following
lockdep warning (a lockdep annotation is probably needed here):
=============================================
[ INFO: possible recursive locking detected ]
2.6.37-rc1-00116-g151f52f-dirty #19
---------------------------------------------
kworker/u:0/15 is trying to acquire lock:
 (&(&sighand->siglock)->rlock){......}, at: [<000000000014a426>] __account_cdata+0x6e/0x444
but task is already holding lock:
 (&(&sighand->siglock)->rlock){......}, at: [<000000000014b634>] release_task+0x160/0x6a0

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 include/linux/sched.h |    2 ++
 kernel/exit.c         |   36 +++++++++++++++++++++++++-----------
 2 files changed, 27 insertions(+), 11 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -595,6 +595,8 @@ struct signal_struct {
 	 */
 	struct cdata cdata_wait;
 	struct cdata cdata_threads;
+	struct cdata cdata_acct;
+	struct task_io_accounting ioac_acct;
 	struct task_io_accounting ioac;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	cputime_t prev_utime, prev_stime;
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -74,10 +74,10 @@ static void __unhash_process(struct task
 	list_del_rcu(&p->thread_group);
 }
 
-static void __account_cdata(struct task_struct *p)
+static void __account_cdata(struct task_struct *p, int wait)
 {
 	struct cdata *cd, *pcd, *tcd;
-	unsigned long maxrss;
+	unsigned long maxrss, flags;
 	cputime_t tgutime, tgstime;
 
 	/*
@@ -100,11 +100,16 @@ static void __account_cdata(struct task_
 	 * group including the group leader.
 	 */
 	thread_group_times(p, &tgutime, &tgstime);
-	spin_lock_irq(&p->real_parent->sighand->siglock);
-	pcd = &p->real_parent->signal->cdata_wait;
-	tcd = &p->signal->cdata_threads;
-	cd = &p->signal->cdata_wait;
-
+	spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
+	if (wait) {
+		pcd = &p->real_parent->signal->cdata_wait;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_wait;
+	} else {
+		pcd = &p->real_parent->signal->cdata_acct;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_acct;
+	}
 	pcd->utime =
 		cputime_add(pcd->utime,
 		cputime_add(tgutime,
@@ -135,9 +140,17 @@ static void __account_cdata(struct task_
 	maxrss = max(tcd->maxrss, cd->maxrss);
 	if (pcd->maxrss < maxrss)
 		pcd->maxrss = maxrss;
-	task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
-	task_io_accounting_add(&p->real_parent->signal->ioac, &p->signal->ioac);
-	spin_unlock_irq(&p->real_parent->sighand->siglock);
+	if (wait) {
+		task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac,
+				       &p->signal->ioac);
+	} else {
+		task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+				       &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+				       &p->signal->ioac_acct);
+	}
+	spin_unlock_irqrestore(&p->real_parent->sighand->siglock, flags);
 }
 
 /*
@@ -157,6 +170,7 @@ static void __exit_signal(struct task_st
 
 	posix_cpu_timers_exit(tsk);
 	if (group_dead) {
+		__account_cdata(tsk, 0);
 		posix_cpu_timers_exit_group(tsk);
 		tty = sig->tty;
 		sig->tty = NULL;
@@ -1293,7 +1307,7 @@ static int wait_task_zombie(struct wait_
 	 * !task_detached() to filter out sub-threads.
 	 */
 	if (likely(!traced) && likely(!task_detached(p)))
-		__account_cdata(p);
+		__account_cdata(p, 1);
 
 	/*
 	 * Now we are sure this task is interesting, and no other

