public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch 0/4] taskstats: Improve cumulative time accounting
@ 2010-11-19 20:11 Michael Holzheu
  2010-11-19 20:11 ` [patch 1/4] taskstats: Introduce "struct cdata" Michael Holzheu
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Michael Holzheu @ 2010-11-19 20:11 UTC (permalink / raw)
  To: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Balbir Singh, Martin Schwidefsky,
	Heiko Carstens, Roland McGrath
  Cc: linux-kernel, linux-s390

Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
to the cumulative time of the parents, if the parents ignore SIGCHLD
or have set SA_NOCLDWAIT. This behaviour has the major drawback that
it is not possible to calculate all consumed CPU time of a system by
looking at the current tasks. CPU time can be lost.

To solve this problem, this patch set duplicates the cumulative accounting
data in the signal_struct. In the second set (cdata_acct) the complete
cumulative resource counters are stored. The new cumulative CPU time (utime
and stime) is then exported via the taskstats interface.

PATCH SET OVERVIEW
------------------
Patches apply on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

[1] Introduce "struct cdata"
[2] Introduce __account_cdata() function
[3] Introduce cdata_acct for complete cumulative accounting
[4] Export "cdata_acct" with taskstats

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [patch 1/4] taskstats: Introduce "struct cdata"
  2010-11-19 20:11 [patch 0/4] taskstats: Improve cumulative time accounting Michael Holzheu
@ 2010-11-19 20:11 ` Michael Holzheu
  2010-11-25 12:29   ` Balbir Singh
  2010-11-25 14:23   ` Oleg Nesterov
  2010-11-19 20:11 ` [patch 2/4] taskstats: Introduce __account_cdata() function Michael Holzheu
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: Michael Holzheu @ 2010-11-19 20:11 UTC (permalink / raw)
  To: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Balbir Singh, Martin Schwidefsky,
	Heiko Carstens, Roland McGrath
  Cc: linux-kernel, linux-s390

[-- Attachment #1: 01-taskstats-top-improve-ctime-add-cdata-struct.patch --]
[-- Type: text/plain, Size: 12061 bytes --]

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

This patch introduces a new structure "struct cdata" that is used to
store cumulative resource counters for dead child processes and threads.

Note that there is one asymmetry:
For "struct task_io_accounting" (ioc) there is no extra accounting field for
dead threads. One field is used for both, dead processes and threads.

This patch introduces no functional change.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 fs/binfmt_elf.c           |    4 +-
 fs/exec.c                 |    2 -
 fs/proc/array.c           |   16 ++++----
 include/linux/sched.h     |   22 +++++++----
 kernel/exit.c             |   86 ++++++++++++++++++++++++----------------------
 kernel/posix-cpu-timers.c |   12 +++---
 kernel/sys.c              |   44 ++++++++++++-----------
 7 files changed, 100 insertions(+), 86 deletions(-)

--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1296,8 +1296,8 @@ static void fill_prstatus(struct elf_prs
 		cputime_to_timeval(p->utime, &prstatus->pr_utime);
 		cputime_to_timeval(p->stime, &prstatus->pr_stime);
 	}
-	cputime_to_timeval(p->signal->cutime, &prstatus->pr_cutime);
-	cputime_to_timeval(p->signal->cstime, &prstatus->pr_cstime);
+	cputime_to_timeval(p->signal->cdata_wait.utime, &prstatus->pr_cutime);
+	cputime_to_timeval(p->signal->cdata_wait.stime, &prstatus->pr_cstime);
 }
 
 static int fill_psinfo(struct elf_prpsinfo *psinfo, struct task_struct *p,
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -894,7 +894,7 @@ static int de_thread(struct task_struct
 
 no_thread_group:
 	if (current->mm)
-		setmax_mm_hiwater_rss(&sig->maxrss, current->mm);
+		setmax_mm_hiwater_rss(&sig->cdata_threads.maxrss, current->mm);
 
 	exit_itimers(sig);
 	flush_itimer_signals();
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -413,11 +413,11 @@ static int do_task_stat(struct seq_file
 		num_threads = get_nr_threads(task);
 		collect_sigign_sigcatch(task, &sigign, &sigcatch);
 
-		cmin_flt = sig->cmin_flt;
-		cmaj_flt = sig->cmaj_flt;
-		cutime = sig->cutime;
-		cstime = sig->cstime;
-		cgtime = sig->cgtime;
+		cmin_flt = sig->cdata_wait.min_flt;
+		cmaj_flt = sig->cdata_wait.maj_flt;
+		cutime = sig->cdata_wait.utime;
+		cstime = sig->cdata_wait.stime;
+		cgtime = sig->cdata_wait.gtime;
 		rsslim = ACCESS_ONCE(sig->rlim[RLIMIT_RSS].rlim_cur);
 
 		/* add up live thread stats at the group level */
@@ -430,10 +430,10 @@ static int do_task_stat(struct seq_file
 				t = next_thread(t);
 			} while (t != task);
 
-			min_flt += sig->min_flt;
-			maj_flt += sig->maj_flt;
+			min_flt += sig->cdata_threads.min_flt;
+			maj_flt += sig->cdata_threads.maj_flt;
 			thread_group_times(task, &utime, &stime);
-			gtime = cputime_add(gtime, sig->gtime);
+			gtime = cputime_add(gtime, sig->cdata_threads.gtime);
 		}
 
 		sid = task_session_nr_ns(task, ns);
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -510,6 +510,17 @@ struct thread_group_cputimer {
 };
 
 /*
+ * Cumulative resource counters
+ */
+struct cdata {
+	cputime_t utime, stime, gtime;
+	unsigned long nvcsw, nivcsw;
+	unsigned long min_flt, maj_flt;
+	unsigned long inblock, oublock;
+	unsigned long maxrss;
+};
+
+/*
  * NOTE! "signal_struct" does not have it's own
  * locking, because a shared signal_struct always
  * implies a shared sighand_struct, so locking
@@ -582,17 +593,12 @@ struct signal_struct {
 	 * Live threads maintain their own counters and add to these
 	 * in __exit_signal, except for the group leader.
 	 */
-	cputime_t utime, stime, cutime, cstime;
-	cputime_t gtime;
-	cputime_t cgtime;
+	struct cdata cdata_wait;
+	struct cdata cdata_threads;
+	struct task_io_accounting ioac;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	cputime_t prev_utime, prev_stime;
 #endif
-	unsigned long nvcsw, nivcsw, cnvcsw, cnivcsw;
-	unsigned long min_flt, maj_flt, cmin_flt, cmaj_flt;
-	unsigned long inblock, oublock, cinblock, coublock;
-	unsigned long maxrss, cmaxrss;
-	struct task_io_accounting ioac;
 
 	/*
 	 * Cumulative ns of schedule CPU time fo dead threads in the
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -95,6 +95,7 @@ static void __exit_signal(struct task_st
 		tty = sig->tty;
 		sig->tty = NULL;
 	} else {
+		struct cdata *tcd = &sig->cdata_threads;
 		/*
 		 * This can only happen if the caller is de_thread().
 		 * FIXME: this is the temporary hack, we should teach
@@ -122,15 +123,15 @@ static void __exit_signal(struct task_st
 		 * We won't ever get here for the group leader, since it
 		 * will have been the last reference on the signal_struct.
 		 */
-		sig->utime = cputime_add(sig->utime, tsk->utime);
-		sig->stime = cputime_add(sig->stime, tsk->stime);
-		sig->gtime = cputime_add(sig->gtime, tsk->gtime);
-		sig->min_flt += tsk->min_flt;
-		sig->maj_flt += tsk->maj_flt;
-		sig->nvcsw += tsk->nvcsw;
-		sig->nivcsw += tsk->nivcsw;
-		sig->inblock += task_io_get_inblock(tsk);
-		sig->oublock += task_io_get_oublock(tsk);
+		tcd->utime = cputime_add(tcd->utime, tsk->utime);
+		tcd->stime = cputime_add(tcd->stime, tsk->stime);
+		tcd->gtime = cputime_add(tcd->gtime, tsk->gtime);
+		tcd->min_flt += tsk->min_flt;
+		tcd->maj_flt += tsk->maj_flt;
+		tcd->nvcsw += tsk->nvcsw;
+		tcd->nivcsw += tsk->nivcsw;
+		tcd->inblock += task_io_get_inblock(tsk);
+		tcd->oublock += task_io_get_oublock(tsk);
 		task_io_accounting_add(&sig->ioac, &tsk->ioac);
 		sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
 	}
@@ -960,10 +961,11 @@ NORET_TYPE void do_exit(long code)
 		sync_mm_rss(tsk, tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
+		struct cdata *tcd = &tsk->signal->cdata_threads;
 		hrtimer_cancel(&tsk->signal->real_timer);
 		exit_itimers(tsk->signal);
 		if (tsk->mm)
-			setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
+			setmax_mm_hiwater_rss(&tcd->maxrss, tsk->mm);
 	}
 	acct_collect(code, group_dead);
 	if (group_dead)
@@ -1225,8 +1227,7 @@ static int wait_task_zombie(struct wait_
 	 * !task_detached() to filter out sub-threads.
 	 */
 	if (likely(!traced) && likely(!task_detached(p))) {
-		struct signal_struct *psig;
-		struct signal_struct *sig;
+		struct cdata *cd, *pcd, *tcd;
 		unsigned long maxrss;
 		cputime_t tgutime, tgstime;
 
@@ -1251,40 +1252,43 @@ static int wait_task_zombie(struct wait_
 		 */
 		thread_group_times(p, &tgutime, &tgstime);
 		spin_lock_irq(&p->real_parent->sighand->siglock);
-		psig = p->real_parent->signal;
-		sig = p->signal;
-		psig->cutime =
-			cputime_add(psig->cutime,
+		pcd = &p->real_parent->signal->cdata_wait;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_wait;
+
+		pcd->utime =
+			cputime_add(pcd->utime,
 			cputime_add(tgutime,
-				    sig->cutime));
-		psig->cstime =
-			cputime_add(psig->cstime,
+				    cd->utime));
+		pcd->stime =
+			cputime_add(pcd->stime,
 			cputime_add(tgstime,
-				    sig->cstime));
-		psig->cgtime =
-			cputime_add(psig->cgtime,
+				    cd->stime));
+		pcd->gtime =
+			cputime_add(pcd->gtime,
 			cputime_add(p->gtime,
-			cputime_add(sig->gtime,
-				    sig->cgtime)));
-		psig->cmin_flt +=
-			p->min_flt + sig->min_flt + sig->cmin_flt;
-		psig->cmaj_flt +=
-			p->maj_flt + sig->maj_flt + sig->cmaj_flt;
-		psig->cnvcsw +=
-			p->nvcsw + sig->nvcsw + sig->cnvcsw;
-		psig->cnivcsw +=
-			p->nivcsw + sig->nivcsw + sig->cnivcsw;
-		psig->cinblock +=
+			cputime_add(tcd->gtime,
+				    cd->gtime)));
+		pcd->min_flt +=
+			p->min_flt + tcd->min_flt + cd->min_flt;
+		pcd->maj_flt +=
+			p->maj_flt + tcd->maj_flt + cd->maj_flt;
+		pcd->nvcsw +=
+			p->nvcsw + tcd->nvcsw + cd->nvcsw;
+		pcd->nivcsw +=
+			p->nivcsw + tcd->nivcsw + cd->nivcsw;
+		pcd->inblock +=
 			task_io_get_inblock(p) +
-			sig->inblock + sig->cinblock;
-		psig->coublock +=
+			tcd->inblock + cd->inblock;
+		pcd->oublock +=
 			task_io_get_oublock(p) +
-			sig->oublock + sig->coublock;
-		maxrss = max(sig->maxrss, sig->cmaxrss);
-		if (psig->cmaxrss < maxrss)
-			psig->cmaxrss = maxrss;
-		task_io_accounting_add(&psig->ioac, &p->ioac);
-		task_io_accounting_add(&psig->ioac, &sig->ioac);
+			tcd->oublock + cd->oublock;
+		maxrss = max(tcd->maxrss, cd->maxrss);
+		if (pcd->maxrss < maxrss)
+			pcd->maxrss = maxrss;
+		task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac,
+				       &p->signal->ioac);
 		spin_unlock_irq(&p->real_parent->sighand->siglock);
 	}
 
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -232,12 +232,12 @@ static int cpu_clock_sample(const clocki
 
 void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 {
-	struct signal_struct *sig = tsk->signal;
+	struct cdata *tcd = &tsk->signal->cdata_threads;
 	struct task_struct *t;
 
-	times->utime = sig->utime;
-	times->stime = sig->stime;
-	times->sum_exec_runtime = sig->sum_sched_runtime;
+	times->utime = tcd->utime;
+	times->stime = tcd->stime;
+	times->sum_exec_runtime = tsk->signal->sum_sched_runtime;
 
 	rcu_read_lock();
 	/* make sure we can trust tsk->thread_group list */
@@ -516,8 +516,8 @@ void posix_cpu_timers_exit_group(struct
 	struct signal_struct *const sig = tsk->signal;
 
 	cleanup_timers(tsk->signal->cpu_timers,
-		       cputime_add(tsk->utime, sig->utime),
-		       cputime_add(tsk->stime, sig->stime),
+		       cputime_add(tsk->utime, sig->cdata_threads.utime),
+		       cputime_add(tsk->stime, sig->cdata_threads.stime),
 		       tsk->se.sum_exec_runtime + sig->sum_sched_runtime);
 }
 
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -884,8 +884,8 @@ void do_sys_times(struct tms *tms)
 
 	spin_lock_irq(&current->sighand->siglock);
 	thread_group_times(current, &tgutime, &tgstime);
-	cutime = current->signal->cutime;
-	cstime = current->signal->cstime;
+	cutime = current->signal->cdata_wait.utime;
+	cstime = current->signal->cdata_wait.stime;
 	spin_unlock_irq(&current->sighand->siglock);
 	tms->tms_utime = cputime_to_clock_t(tgutime);
 	tms->tms_stime = cputime_to_clock_t(tgstime);
@@ -1490,6 +1490,7 @@ static void k_getrusage(struct task_stru
 	unsigned long flags;
 	cputime_t tgutime, tgstime, utime, stime;
 	unsigned long maxrss = 0;
+	struct cdata *cd;
 
 	memset((char *) r, 0, sizeof *r);
 	utime = stime = cputime_zero;
@@ -1497,7 +1498,7 @@ static void k_getrusage(struct task_stru
 	if (who == RUSAGE_THREAD) {
 		task_times(current, &utime, &stime);
 		accumulate_thread_rusage(p, r);
-		maxrss = p->signal->maxrss;
+		maxrss = p->signal->cdata_threads.maxrss;
 		goto out;
 	}
 
@@ -1507,31 +1508,34 @@ static void k_getrusage(struct task_stru
 	switch (who) {
 		case RUSAGE_BOTH:
 		case RUSAGE_CHILDREN:
-			utime = p->signal->cutime;
-			stime = p->signal->cstime;
-			r->ru_nvcsw = p->signal->cnvcsw;
-			r->ru_nivcsw = p->signal->cnivcsw;
-			r->ru_minflt = p->signal->cmin_flt;
-			r->ru_majflt = p->signal->cmaj_flt;
-			r->ru_inblock = p->signal->cinblock;
-			r->ru_oublock = p->signal->coublock;
-			maxrss = p->signal->cmaxrss;
+			cd = &p->signal->cdata_wait;
+			utime = cd->utime;
+			stime = cd->stime;
+			r->ru_nvcsw = cd->nvcsw;
+			r->ru_nivcsw = cd->nivcsw;
+			r->ru_minflt = cd->min_flt;
+			r->ru_majflt = cd->maj_flt;
+			r->ru_inblock = cd->inblock;
+			r->ru_oublock = cd->oublock;
+			maxrss = cd->maxrss;
 
 			if (who == RUSAGE_CHILDREN)
 				break;
 
 		case RUSAGE_SELF:
+			cd = &p->signal->cdata_threads;
 			thread_group_times(p, &tgutime, &tgstime);
 			utime = cputime_add(utime, tgutime);
 			stime = cputime_add(stime, tgstime);
-			r->ru_nvcsw += p->signal->nvcsw;
-			r->ru_nivcsw += p->signal->nivcsw;
-			r->ru_minflt += p->signal->min_flt;
-			r->ru_majflt += p->signal->maj_flt;
-			r->ru_inblock += p->signal->inblock;
-			r->ru_oublock += p->signal->oublock;
-			if (maxrss < p->signal->maxrss)
-				maxrss = p->signal->maxrss;
+			r->ru_nvcsw += cd->nvcsw;
+			r->ru_nivcsw += cd->nivcsw;
+			r->ru_minflt += cd->min_flt;
+			r->ru_majflt += cd->maj_flt;
+			r->ru_inblock += cd->inblock;
+			r->ru_oublock += cd->oublock;
+			if (maxrss < cd->maxrss)
+				maxrss = cd->maxrss;
+
 			t = p;
 			do {
 				accumulate_thread_rusage(t, r);


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [patch 2/4] taskstats: Introduce __account_cdata() function
  2010-11-19 20:11 [patch 0/4] taskstats: Improve cumulative time accounting Michael Holzheu
  2010-11-19 20:11 ` [patch 1/4] taskstats: Introduce "struct cdata" Michael Holzheu
@ 2010-11-19 20:11 ` Michael Holzheu
  2010-11-19 20:11 ` [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting Michael Holzheu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Michael Holzheu @ 2010-11-19 20:11 UTC (permalink / raw)
  To: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Balbir Singh, Martin Schwidefsky,
	Heiko Carstens, Roland McGrath
  Cc: linux-kernel, linux-s390

[-- Attachment #1: 02-taskstats-top-improve-ctime-add_cdata-func.patch --]
[-- Type: text/plain, Size: 5417 bytes --]

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

This patch introduces the function __account_cdata() that does the
cummulative resource accounting for dead processes in sys_wait().
No functional changes are done. This patch is a preparation for cdata_acct
accounting.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 kernel/exit.c |  133 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 68 insertions(+), 65 deletions(-)

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -74,6 +74,72 @@ static void __unhash_process(struct task
 	list_del_rcu(&p->thread_group);
 }
 
+static void __account_cdata(struct task_struct *p)
+{
+	struct cdata *cd, *pcd, *tcd;
+	unsigned long maxrss;
+	cputime_t tgutime, tgstime;
+
+	/*
+	 * The resource counters for the group leader are in its
+	 * own task_struct.  Those for dead threads in the group
+	 * are in its signal_struct, as are those for the child
+	 * processes it has previously reaped.  All these
+	 * accumulate in the parent's signal_struct c* fields.
+	 *
+	 * We don't bother to take a lock here to protect these
+	 * p->signal fields, because they are only touched by
+	 * __exit_signal, which runs with tasklist_lock
+	 * write-locked anyway, and so is excluded here.  We do
+	 * need to protect the access to parent->signal fields,
+	 * as other threads in the parent group can be right
+	 * here reaping other children at the same time.
+	 *
+	 * We use thread_group_times() to get times for the thread
+	 * group, which consolidates times for all threads in the
+	 * group including the group leader.
+	 */
+	thread_group_times(p, &tgutime, &tgstime);
+	spin_lock_irq(&p->real_parent->sighand->siglock);
+	pcd = &p->real_parent->signal->cdata_wait;
+	tcd = &p->signal->cdata_threads;
+	cd = &p->signal->cdata_wait;
+
+	pcd->utime =
+		cputime_add(pcd->utime,
+		cputime_add(tgutime,
+			    cd->utime));
+	pcd->stime =
+		cputime_add(pcd->stime,
+		cputime_add(tgstime,
+			    cd->stime));
+	pcd->gtime =
+		cputime_add(pcd->gtime,
+		cputime_add(p->gtime,
+		cputime_add(tcd->gtime,
+			    cd->gtime)));
+	pcd->min_flt +=
+		p->min_flt + tcd->min_flt + cd->min_flt;
+	pcd->maj_flt +=
+		p->maj_flt + tcd->maj_flt + cd->maj_flt;
+	pcd->nvcsw +=
+		p->nvcsw + tcd->nvcsw + cd->nvcsw;
+	pcd->nivcsw +=
+		p->nivcsw + tcd->nivcsw + cd->nivcsw;
+	pcd->inblock +=
+		task_io_get_inblock(p) +
+		tcd->inblock + cd->inblock;
+	pcd->oublock +=
+		task_io_get_oublock(p) +
+		tcd->oublock + cd->oublock;
+	maxrss = max(tcd->maxrss, cd->maxrss);
+	if (pcd->maxrss < maxrss)
+		pcd->maxrss = maxrss;
+	task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+	task_io_accounting_add(&p->real_parent->signal->ioac, &p->signal->ioac);
+	spin_unlock_irq(&p->real_parent->sighand->siglock);
+}
+
 /*
  * This function expects the tasklist_lock write-locked.
  */
@@ -1226,71 +1292,8 @@ static int wait_task_zombie(struct wait_
 	 * It can be ptraced but not reparented, check
 	 * !task_detached() to filter out sub-threads.
 	 */
-	if (likely(!traced) && likely(!task_detached(p))) {
-		struct cdata *cd, *pcd, *tcd;
-		unsigned long maxrss;
-		cputime_t tgutime, tgstime;
-
-		/*
-		 * The resource counters for the group leader are in its
-		 * own task_struct.  Those for dead threads in the group
-		 * are in its signal_struct, as are those for the child
-		 * processes it has previously reaped.  All these
-		 * accumulate in the parent's signal_struct c* fields.
-		 *
-		 * We don't bother to take a lock here to protect these
-		 * p->signal fields, because they are only touched by
-		 * __exit_signal, which runs with tasklist_lock
-		 * write-locked anyway, and so is excluded here.  We do
-		 * need to protect the access to parent->signal fields,
-		 * as other threads in the parent group can be right
-		 * here reaping other children at the same time.
-		 *
-		 * We use thread_group_times() to get times for the thread
-		 * group, which consolidates times for all threads in the
-		 * group including the group leader.
-		 */
-		thread_group_times(p, &tgutime, &tgstime);
-		spin_lock_irq(&p->real_parent->sighand->siglock);
-		pcd = &p->real_parent->signal->cdata_wait;
-		tcd = &p->signal->cdata_threads;
-		cd = &p->signal->cdata_wait;
-
-		pcd->utime =
-			cputime_add(pcd->utime,
-			cputime_add(tgutime,
-				    cd->utime));
-		pcd->stime =
-			cputime_add(pcd->stime,
-			cputime_add(tgstime,
-				    cd->stime));
-		pcd->gtime =
-			cputime_add(pcd->gtime,
-			cputime_add(p->gtime,
-			cputime_add(tcd->gtime,
-				    cd->gtime)));
-		pcd->min_flt +=
-			p->min_flt + tcd->min_flt + cd->min_flt;
-		pcd->maj_flt +=
-			p->maj_flt + tcd->maj_flt + cd->maj_flt;
-		pcd->nvcsw +=
-			p->nvcsw + tcd->nvcsw + cd->nvcsw;
-		pcd->nivcsw +=
-			p->nivcsw + tcd->nivcsw + cd->nivcsw;
-		pcd->inblock +=
-			task_io_get_inblock(p) +
-			tcd->inblock + cd->inblock;
-		pcd->oublock +=
-			task_io_get_oublock(p) +
-			tcd->oublock + cd->oublock;
-		maxrss = max(tcd->maxrss, cd->maxrss);
-		if (pcd->maxrss < maxrss)
-			pcd->maxrss = maxrss;
-		task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
-		task_io_accounting_add(&p->real_parent->signal->ioac,
-				       &p->signal->ioac);
-		spin_unlock_irq(&p->real_parent->sighand->siglock);
-	}
+	if (likely(!traced) && likely(!task_detached(p)))
+		__account_cdata(p);
 
 	/*
 	 * Now we are sure this task is interesting, and no other


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
  2010-11-19 20:11 [patch 0/4] taskstats: Improve cumulative time accounting Michael Holzheu
  2010-11-19 20:11 ` [patch 1/4] taskstats: Introduce "struct cdata" Michael Holzheu
  2010-11-19 20:11 ` [patch 2/4] taskstats: Introduce __account_cdata() function Michael Holzheu
@ 2010-11-19 20:11 ` Michael Holzheu
  2010-11-23 16:59   ` Oleg Nesterov
  2010-11-19 20:11 ` [patch 4/4] taskstats: Export "cdata_acct" with taskstats Michael Holzheu
  2010-11-19 20:19 ` [patch 0/4] taskstats: Improve cumulative time accounting Peter Zijlstra
  4 siblings, 1 reply; 24+ messages in thread
From: Michael Holzheu @ 2010-11-19 20:11 UTC (permalink / raw)
  To: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Balbir Singh, Martin Schwidefsky,
	Heiko Carstens, Roland McGrath
  Cc: linux-kernel, linux-s390

[-- Attachment #1: 03-taskstats-top-improve-ctime-account-cdata_acct.patch --]
[-- Type: text/plain, Size: 4388 bytes --]

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

Currently the cumulative time accounting in Linux is not complete.
Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
to the cumulative time of the parents, if the parents ignore SIGCHLD
or have set SA_NOCLDWAIT. This behaviour has the major drawback that
it is not possible to calculate all consumed CPU time of a system by
looking at the current tasks. CPU time can be lost.

This patch adds a new set of cumulative time counters. We then have two
cumulative counter sets:

* cdata_wait: Traditional cumulative time used e.g. by getrusage.
* cdata_acct: Cumulative time that also includes dead processes with
              parents that ignore SIGCHLD or have set SA_NOCLDWAIT.
              cdata_acct will be exported by taskstats.

TODO:
-----
With this patch we take the siglock twice. First for the dead task
and second for the parent of the dead task. This give the following
lockdep warning (probably a lockdep annotation is needed here):
=============================================
[ INFO: possible recursive locking detected ]
2.6.37-rc1-00116-g151f52f-dirty #19
---------------------------------------------
kworker/u:0/15 is trying to acquire lock:
 (&(&sighand->siglock)->rlock){......}, at: [<000000000014a426>] __account_cdata+0x6e/0x444
but task is already holding lock:
 (&(&sighand->siglock)->rlock){......}, at: [<000000000014b634>] release_task+0x160/0x6a0

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 include/linux/sched.h |    2 ++
 kernel/exit.c         |   36 +++++++++++++++++++++++++-----------
 2 files changed, 27 insertions(+), 11 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -595,6 +595,8 @@ struct signal_struct {
 	 */
 	struct cdata cdata_wait;
 	struct cdata cdata_threads;
+	struct cdata cdata_acct;
+	struct task_io_accounting ioac_acct;
 	struct task_io_accounting ioac;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	cputime_t prev_utime, prev_stime;
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -74,10 +74,10 @@ static void __unhash_process(struct task
 	list_del_rcu(&p->thread_group);
 }
 
-static void __account_cdata(struct task_struct *p)
+static void __account_cdata(struct task_struct *p, int wait)
 {
 	struct cdata *cd, *pcd, *tcd;
-	unsigned long maxrss;
+	unsigned long maxrss, flags;
 	cputime_t tgutime, tgstime;
 
 	/*
@@ -100,11 +100,16 @@ static void __account_cdata(struct task_
 	 * group including the group leader.
 	 */
 	thread_group_times(p, &tgutime, &tgstime);
-	spin_lock_irq(&p->real_parent->sighand->siglock);
-	pcd = &p->real_parent->signal->cdata_wait;
-	tcd = &p->signal->cdata_threads;
-	cd = &p->signal->cdata_wait;
-
+	spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
+	if (wait) {
+		pcd = &p->real_parent->signal->cdata_wait;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_wait;
+	} else {
+		pcd = &p->real_parent->signal->cdata_acct;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_acct;
+	}
 	pcd->utime =
 		cputime_add(pcd->utime,
 		cputime_add(tgutime,
@@ -135,9 +140,17 @@ static void __account_cdata(struct task_
 	maxrss = max(tcd->maxrss, cd->maxrss);
 	if (pcd->maxrss < maxrss)
 		pcd->maxrss = maxrss;
-	task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
-	task_io_accounting_add(&p->real_parent->signal->ioac, &p->signal->ioac);
-	spin_unlock_irq(&p->real_parent->sighand->siglock);
+	if (wait) {
+		task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac,
+				       &p->signal->ioac);
+	} else {
+		task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+				       &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac_acct,
+				       &p->signal->ioac_acct);
+	}
+	spin_unlock_irqrestore(&p->real_parent->sighand->siglock, flags);
 }
 
 /*
@@ -157,6 +170,7 @@ static void __exit_signal(struct task_st
 
 	posix_cpu_timers_exit(tsk);
 	if (group_dead) {
+		__account_cdata(tsk, 0);
 		posix_cpu_timers_exit_group(tsk);
 		tty = sig->tty;
 		sig->tty = NULL;
@@ -1293,7 +1307,7 @@ static int wait_task_zombie(struct wait_
 	 * !task_detached() to filter out sub-threads.
 	 */
 	if (likely(!traced) && likely(!task_detached(p)))
-		__account_cdata(p);
+		__account_cdata(p, 1);
 
 	/*
 	 * Now we are sure this task is interesting, and no other


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-19 20:11 [patch 0/4] taskstats: Improve cumulative time accounting Michael Holzheu
                   ` (2 preceding siblings ...)
  2010-11-19 20:11 ` [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting Michael Holzheu
@ 2010-11-19 20:11 ` Michael Holzheu
  2010-11-25 13:26   ` Oleg Nesterov
  2010-11-25 16:57   ` Balbir Singh
  2010-11-19 20:19 ` [patch 0/4] taskstats: Improve cumulative time accounting Peter Zijlstra
  4 siblings, 2 replies; 24+ messages in thread
From: Michael Holzheu @ 2010-11-19 20:11 UTC (permalink / raw)
  To: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Balbir Singh, Martin Schwidefsky,
	Heiko Carstens, Roland McGrath
  Cc: linux-kernel, linux-s390

[-- Attachment #1: 04-taskstats-top-improve-ctime-taskstats.patch --]
[-- Type: text/plain, Size: 1637 bytes --]

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

With this patch the (full) cumulative CPU time is added to "struct taskstats".
The CPU time is only returned for the thread group leader.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 include/linux/taskstats.h |    7 ++++++-
 kernel/tsacct.c           |    7 +++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

--- a/include/linux/taskstats.h
+++ b/include/linux/taskstats.h
@@ -33,7 +33,7 @@
  */
 
 
-#define TASKSTATS_VERSION	7
+#define TASKSTATS_VERSION	8
 #define TS_COMM_LEN		32	/* should be >= TASK_COMM_LEN
 					 * in linux/sched.h */
 
@@ -163,6 +163,11 @@ struct taskstats {
 	/* Delay waiting for memory reclaim */
 	__u64	freepages_count;
 	__u64	freepages_delay_total;
+	/* version 7 ends here */
+
+	/* All cumulative CPU time of dead children */
+	__u64	ac_cutime_acct;		/* User CPU time [usec] */
+	__u64	ac_cstime_acct;		/* System CPU time [usec] */
 };
 
 
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -31,6 +31,7 @@ void bacct_add_tsk(struct taskstats *sta
 	const struct cred *tcred;
 	struct timespec uptime, ts;
 	u64 ac_etime;
+	unsigned long flags;
 
 	BUILD_BUG_ON(TS_COMM_LEN < TASK_COMM_LEN);
 
@@ -71,6 +72,12 @@ void bacct_add_tsk(struct taskstats *sta
 	stats->ac_majflt = tsk->maj_flt;
 
 	strncpy(stats->ac_comm, tsk->comm, sizeof(stats->ac_comm));
+	if (tsk->tgid == tsk->pid && lock_task_sighand(tsk, &flags)) {
+		struct cdata *cd = &tsk->signal->cdata_acct;
+		stats->ac_cutime_acct = cputime_to_usecs(cd->utime);
+		stats->ac_cstime_acct = cputime_to_usecs(cd->stime);
+		unlock_task_sighand(tsk, &flags);
+	}
 }
 
 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 0/4] taskstats: Improve cumulative time accounting
  2010-11-19 20:11 [patch 0/4] taskstats: Improve cumulative time accounting Michael Holzheu
                   ` (3 preceding siblings ...)
  2010-11-19 20:11 ` [patch 4/4] taskstats: Export "cdata_acct" with taskstats Michael Holzheu
@ 2010-11-19 20:19 ` Peter Zijlstra
  2010-11-20 15:17   ` Oleg Nesterov
  2010-11-22 11:03   ` Michael Holzheu
  4 siblings, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2010-11-19 20:19 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Oleg Nesterov, Shailabh Nagar, Andrew Morton, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On Fri, 2010-11-19 at 21:11 +0100, Michael Holzheu wrote:
> Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> to the cumulative time of the parents, if the parents ignore SIGCHLD
> or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> it is not possible to calculate all consumed CPU time of a system by
> looking at the current tasks. CPU time can be lost.
> 
> To solve this problem, this patch set duplicates the cumulative accounting
> data in the signal_struct. In the second set (cdata_acct) the complete
> cumulative resource counters are stored. The new cumulative CPU time (utime
> and stime) is then exported via the taskstats interface. 

Maybe this has been treated earlier in the threads and I missed it, but
the obvious solution doesn't get mentioned:

What would break if we violate this silly POSIX rule and account time of
childs regardless of SIGCHLD/SA_NOCLDWAIT?



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 0/4] taskstats: Improve cumulative time accounting
  2010-11-19 20:19 ` [patch 0/4] taskstats: Improve cumulative time accounting Peter Zijlstra
@ 2010-11-20 15:17   ` Oleg Nesterov
  2010-11-22  7:21     ` Balbir Singh
  2010-11-22 11:03   ` Michael Holzheu
  1 sibling, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-20 15:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Holzheu, Shailabh Nagar, Andrew Morton, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/19, Peter Zijlstra wrote:
>
> On Fri, 2010-11-19 at 21:11 +0100, Michael Holzheu wrote:
> > Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> > to the cumulative time of the parents, if the parents ignore SIGCHLD
> > or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> > it is not possible to calculate all consumed CPU time of a system by
> > looking at the current tasks. CPU time can be lost.
> >
> > To solve this problem, this patch set duplicates the cumulative accounting
> > data in the signal_struct. In the second set (cdata_acct) the complete
> > cumulative resource counters are stored. The new cumulative CPU time (utime
> > and stime) is then exported via the taskstats interface.
>
> Maybe this has been treated earlier in the threads and I missed it, but
> the obvious solution doesn't get mentioned:

IIRC, the first version did this.

And it was me who spoiled this approach. But! only because I wasn't sure
this user-visible change is acceptable, and because there was some
misunderstanding. See http://marc.info/?l=linux-kernel&m=128552495203050&w=2

But,

> What would break if we

say, any test-case which does getrusage() after fork() with ignored
SIGCHLD/SA_NOCLDWAIT?.

> violate this silly POSIX rule and account time of
> childs regardless of SIGCHLD/SA_NOCLDWAIT?

+1.

Personally, I'd certainly prefer this way, because I don't care about
POSIX at all ;)


Still. Once again, this breaks the current rules, and we never do
this without strong reason.

I think we should ask Roland. If he thinks this is OK, I'd certainly
agree.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 0/4] taskstats: Improve cumulative time accounting
  2010-11-20 15:17   ` Oleg Nesterov
@ 2010-11-22  7:21     ` Balbir Singh
  0 siblings, 0 replies; 24+ messages in thread
From: Balbir Singh @ 2010-11-22  7:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Michael Holzheu, Shailabh Nagar, Andrew Morton,
	John stultz, Thomas Gleixner, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

* Oleg Nesterov <oleg@redhat.com> [2010-11-20 16:17:11]:

> On 11/19, Peter Zijlstra wrote:
> >
> > On Fri, 2010-11-19 at 21:11 +0100, Michael Holzheu wrote:
> > > Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> > > to the cumulative time of the parents, if the parents ignore SIGCHLD
> > > or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> > > it is not possible to calculate all consumed CPU time of a system by
> > > looking at the current tasks. CPU time can be lost.
> > >
> > > To solve this problem, this patch set duplicates the cumulative accounting
> > > data in the signal_struct. In the second set (cdata_acct) the complete
> > > cumulative resource counters are stored. The new cumulative CPU time (utime
> > > and stime) is then exported via the taskstats interface.
> >
> > Maybe this has been treated earlier in the threads and I missed it, but
> > the obvious solution doesn't get mentioned:
> 
> IIRC, the first version did this.
> 
> And it was me who spoiled this approach. But! only because I wasn't sure
> this user-visible change is acceptable, and because there was some
> misunderstanding. See http://marc.info/?l=linux-kernel&m=128552495203050&w=2
> 
> But,
> 
> > What would break if we
> 
> say, any test-case which does getrusage() after fork() with ignored
> SIGCHLD/SA_NOCLDWAIT?.
> 
> > violate this silly POSIX rule and account time of
> > childs regardless of SIGCHLD/SA_NOCLDWAIT?
> 
> +1.
> 
> Personally, I'd certainly prefer this way, because I don't care about
> POSIX at all ;)
> 
> 
> Still. Once again, this breaks the current rules, and we never do
> this without strong reason.
> 
> I think we should ask Roland. If he thinks this is OK, I'd certainly
> agree.
>

The Linux man page states

In Linux kernel versions before 2.6.9, if the disposition of SIGCHLD
is set to SIG_IGN then the resource usages of child processes are
automatically included in the value returned by RUSAGE_CHILDREN,
although POSIX.1-2001 explicitly prohibits this. This nonconformance
is rectified in Linux 2.6.9 and later. 



-- 
	Three Cheers,
	Balbir

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 0/4] taskstats: Improve cumulative time accounting
  2010-11-19 20:19 ` [patch 0/4] taskstats: Improve cumulative time accounting Peter Zijlstra
  2010-11-20 15:17   ` Oleg Nesterov
@ 2010-11-22 11:03   ` Michael Holzheu
  2010-11-22 12:47     ` Michael Holzheu
  1 sibling, 1 reply; 24+ messages in thread
From: Michael Holzheu @ 2010-11-22 11:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Shailabh Nagar, Andrew Morton, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On Fri, 2010-11-19 at 21:19 +0100, Peter Zijlstra wrote:
> On Fri, 2010-11-19 at 21:11 +0100, Michael Holzheu wrote:
> > Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> > to the cumulative time of the parents, if the parents ignore SIGCHLD
> > or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> > it is not possible to calculate all consumed CPU time of a system by
> > looking at the current tasks. CPU time can be lost.
> > 
> > To solve this problem, this patch set duplicates the cumulative accounting
> > data in the signal_struct. In the second set (cdata_acct) the complete
> > cumulative resource counters are stored. The new cumulative CPU time (utime
> > and stime) is then exported via the taskstats interface. 
> 
> Maybe this has been treated earlier in the threads and I missed it, but
> the obvious solution doesn't get mentioned:
> 
> What would break if we violate this silly POSIX rule and account time of
> childs regardless of SIGCHLD/SA_NOCLDWAIT?

Or maybe we could add a sysctl that allows to switch between the two
semantics.

Michael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 0/4] taskstats: Improve cumulative time accounting
  2010-11-22 11:03   ` Michael Holzheu
@ 2010-11-22 12:47     ` Michael Holzheu
  2010-11-22 18:11       ` Valdis.Kletnieks
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Holzheu @ 2010-11-22 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Shailabh Nagar, Andrew Morton, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On Mon, 2010-11-22 at 12:03 +0100, Michael Holzheu wrote:
> Or maybe we could add a sysctl that allows to switch between the two
> semantics.

Then patch 03/04 would be something like the following:
---
Subject: taskstats: Introduce complete cumulative accounting

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

Currently the cumulative time accounting in Linux is not complete.
Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
to the cumulative time of the parents, if the parents ignore SIGCHLD
or have set SA_NOCLDWAIT. This behaviour has the major drawback that
it is not possible to calculate all consumed CPU time of a system by
looking at the current tasks. CPU time can be lost.

This patch adds a new sysctl "kernel.full_cdata" that allows to switch
between the POSIX behavior and complete cumulative accounting.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 include/linux/sched.h |    1 +
 kernel/exit.c         |   12 ++++++++----
 kernel/sysctl.c       |    7 +++++++
 3 files changed, 16 insertions(+), 4 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1907,6 +1907,7 @@ enum sched_tunable_scaling {
 };
 extern enum sched_tunable_scaling sysctl_sched_tunable_scaling;
 
+extern unsigned int full_cdata_enabled;
 #ifdef CONFIG_SCHED_DEBUG
 extern unsigned int sysctl_sched_migration_cost;
 extern unsigned int sysctl_sched_nr_migrate;
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -57,6 +57,8 @@
 #include <asm/pgtable.h>
 #include <asm/mmu_context.h>
 
+unsigned int full_cdata_enabled = 1;
+
 static void exit_mm(struct task_struct * tsk);
 
 static void __unhash_process(struct task_struct *p, bool group_dead)
@@ -77,7 +79,7 @@ static void __unhash_process(struct task
 static void __account_cdata(struct task_struct *p)
 {
 	struct cdata *cd, *pcd, *tcd;
-	unsigned long maxrss;
+	unsigned long maxrss, flags;
 	cputime_t tgutime, tgstime;
 
 	/*
@@ -100,7 +102,7 @@ static void __account_cdata(struct task_
 	 * group including the group leader.
 	 */
 	thread_group_times(p, &tgutime, &tgstime);
-	spin_lock_irq(&p->real_parent->sighand->siglock);
+	spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
 	pcd = &p->real_parent->signal->cdata_wait;
 	tcd = &p->signal->cdata_threads;
 	cd = &p->signal->cdata_wait;
@@ -137,7 +139,7 @@ static void __account_cdata(struct task_
 		pcd->maxrss = maxrss;
 	task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
 	task_io_accounting_add(&p->real_parent->signal->ioac, &p->signal->ioac);
-	spin_unlock_irq(&p->real_parent->sighand->siglock);
+	spin_unlock_irqrestore(&p->real_parent->sighand->siglock, flags);
 }
 
 /*
@@ -157,6 +159,8 @@ static void __exit_signal(struct task_st
 
 	posix_cpu_timers_exit(tsk);
 	if (group_dead) {
+		if (full_cdata_enabled)
+			__account_cdata(tsk);
 		posix_cpu_timers_exit_group(tsk);
 		tty = sig->tty;
 		sig->tty = NULL;
@@ -1292,7 +1296,7 @@ static int wait_task_zombie(struct wait_
 	 * It can be ptraced but not reparented, check
 	 * !task_detached() to filter out sub-threads.
 	 */
-	if (likely(!traced) && likely(!task_detached(p)))
+	if (likely(!traced) && likely(!task_detached(p)) && !full_cdata_enabled)
 		__account_cdata(p);
 
 	/*
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -963,6 +963,13 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
+	{
+		.procname	= "full_cdata",
+		.data		= &full_cdata_enabled,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 /*
  * NOTE: do not add new entries to this table unless you have read
  * Documentation/sysctl/ctl_unnumbered.txt



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 0/4] taskstats: Improve cumulative time accounting
  2010-11-22 12:47     ` Michael Holzheu
@ 2010-11-22 18:11       ` Valdis.Kletnieks
  0 siblings, 0 replies; 24+ messages in thread
From: Valdis.Kletnieks @ 2010-11-22 18:11 UTC (permalink / raw)
  To: holzheu
  Cc: Peter Zijlstra, Oleg Nesterov, Shailabh Nagar, Andrew Morton,
	John stultz, Thomas Gleixner, Balbir Singh, Martin Schwidefsky,
	Heiko Carstens, Roland McGrath, linux-kernel, linux-s390

[-- Attachment #1: Type: text/plain, Size: 761 bytes --]

On Mon, 22 Nov 2010 13:47:55 +0100, Michael Holzheu said:

> Currently the cumulative time accounting in Linux is not complete.
> Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> to the cumulative time of the parents, if the parents ignore SIGCHLD
> or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> it is not possible to calculate all consumed CPU time of a system by
> looking at the current tasks. CPU time can be lost.
> 
> This patch adds a new sysctl "kernel.full_cdata" that allows to switch
> between the POSIX behavior and complete cumulative accounting.

> +unsigned int full_cdata_enabled = 1;

This probably needs to default to "current kernel behavior" but allow
sysadmins to change to the new behavior.

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
  2010-11-19 20:11 ` [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting Michael Holzheu
@ 2010-11-23 16:59   ` Oleg Nesterov
  2010-11-25  9:40     ` Michael Holzheu
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-23 16:59 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/19, Michael Holzheu wrote:
>
> Currently the cumulative time accounting in Linux is not complete.
> Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> to the cumulative time of the parents, if the parents ignore SIGCHLD
> or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> it is not possible to calculate all consumed CPU time of a system by
> looking at the current tasks. CPU time can be lost.
>
> This patch adds a new set of cumulative time counters. We then have two
> cumulative counter sets:
>
> * cdata_wait: Traditional cumulative time used e.g. by getrusage.
> * cdata_acct: Cumulative time that also includes dead processes with
>               parents that ignore SIGCHLD or have set SA_NOCLDWAIT.
>               cdata_acct will be exported by taskstats.

Looks correct at first glance. A couple of nits below.

> TODO:
> -----
> With this patch we take the siglock twice. First for the dead task
> and second for the parent of the dead task. This give the following
> lockdep warning (probably a lockdep annotation is needed here):

And we already discussed this ;) We do not need 2 siglock's, only
parent's. Just move the callsite in __exit_signal() down, under
another (lockless) group_dead check.

Or I missed something?

> @@ -595,6 +595,8 @@ struct signal_struct {
>  	 */
>  	struct cdata cdata_wait;
>  	struct cdata cdata_threads;
> +	struct cdata cdata_acct;
> +	struct task_io_accounting ioac_acct;
>  	struct task_io_accounting ioac;

Given that task_io_accounting is Linux specific, perhaps we can use
signal->ioac in both cases?

Yes, this is a user-visible change anyway. But, at least we can
forget about POSIX.

> +	spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
> +	if (wait) {
> +		pcd = &p->real_parent->signal->cdata_wait;
> +		tcd = &p->signal->cdata_threads;
> +		cd = &p->signal->cdata_wait;
> +	} else {
> +		pcd = &p->real_parent->signal->cdata_acct;
> +		tcd = &p->signal->cdata_threads;
> +		cd = &p->signal->cdata_acct;
> +	}

We can do this before taking ->siglock. Not that I think this really
matters, but otherwise this looks a bit confusing imho, as if we need
parent's ->siglock to pin something.


And thanks for splitting these changes. It was much, much easier to
read now.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
  2010-11-23 16:59   ` Oleg Nesterov
@ 2010-11-25  9:40     ` Michael Holzheu
  2010-11-25 13:21       ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Holzheu @ 2010-11-25  9:40 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

Hello Oleg,

On Tue, 2010-11-23 at 17:59 +0100, Oleg Nesterov wrote: 
> On 11/19, Michael Holzheu wrote:
> > TODO:
> > -----
> > With this patch we take the siglock twice. First for the dead task
> > and second for the parent of the dead task. This give the following
> > lockdep warning (probably a lockdep annotation is needed here):
> 
> And we already discussed this ;) We do not need 2 siglock's, only
> parent's. Just move the callsite in __exit_signal() down, under
> another (lockless) group_dead check.
> 
> Or I missed something?

The problem with moving this down to the second group_dead check is that
after __unhash_process() is called, pid_alive(tsk) which is checked in
thread_group_cputime() returns false. Therefore we always get zero CPU
times.

So I probably have to introduce a second group_dead check at the
beginning of __exit_signal():

@@ -150,6 +153,9 @@ static void __exit_signal(struct task_st
        struct sighand_struct *sighand;
        struct tty_struct *uninitialized_var(tty);

+       if (group_dead)
+               __account_cdata(...);
+
        sighand = rcu_dereference_check(tsk->sighand,
                                        rcu_read_lock_held() ||
                                        
> We can do this before taking ->siglock. Not that I think this really
> matters, but otherwise this looks a bit confusing imho, as if we need
> parent's ->siglock to pin something.

ok

> 
> 
> And thanks for splitting these changes. It was much, much easier to
> read now.

My personal feeling is that probably the only acceptable thing would be
to make the new behavior configurable with a sysctl and define the
default as it currently is (POSIX compliant).

This would only introduce two additional checks in __exit_signal() and
wait_task_zombie() and would not add any new fields to the
signal_struct.

Michael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 1/4] taskstats: Introduce "struct cdata"
  2010-11-19 20:11 ` [patch 1/4] taskstats: Introduce "struct cdata" Michael Holzheu
@ 2010-11-25 12:29   ` Balbir Singh
  2010-11-25 14:23   ` Oleg Nesterov
  1 sibling, 0 replies; 24+ messages in thread
From: Balbir Singh @ 2010-11-25 12:29 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

* Michael Holzheu <holzheu@linux.vnet.ibm.com> [2010-11-19 21:11:09]:

> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> 
> This patch introduces a new structure "struct cdata" that is used to
> store cumulative resource counters for dead child processes and threads.
> 
> Note that there is one asymmetry:
> For "struct task_io_accounting" (ioc) there is no extra accounting field for
> dead threads. One field is used for both, dead processes and threads.
> 
> This patch introduces no functional change.
> 
> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> ---
>  fs/binfmt_elf.c           |    4 +-
>  fs/exec.c                 |    2 -
>  fs/proc/array.c           |   16 ++++----
>  include/linux/sched.h     |   22 +++++++----
>  kernel/exit.c             |   86 ++++++++++++++++++++++++----------------------
>  kernel/posix-cpu-timers.c |   12 +++---
>  kernel/sys.c              |   44 ++++++++++++-----------
>  7 files changed, 100 insertions(+), 86 deletions(-)

Looks good to me

Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
 
-- 
	Three Cheers,
	Balbir

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
  2010-11-25  9:40     ` Michael Holzheu
@ 2010-11-25 13:21       ` Oleg Nesterov
  2010-11-25 17:45         ` Michael Holzheu
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-25 13:21 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/25, Michael Holzheu wrote:
>
> Hello Oleg,
>
> On Tue, 2010-11-23 at 17:59 +0100, Oleg Nesterov wrote:
> > On 11/19, Michael Holzheu wrote:
> > > TODO:
> > > -----
> > > With this patch we take the siglock twice. First for the dead task
> > > and second for the parent of the dead task. This give the following
> > > lockdep warning (probably a lockdep annotation is needed here):
> >
> > And we already discussed this ;) We do not need 2 siglock's, only
> > parent's. Just move the callsite in __exit_signal() down, under
> > another (lockless) group_dead check.
> >
> > Or I missed something?
>
> The problem with moving this down to the second group_dead check is that
> after __unhash_process() is called, pid_alive(tsk) which is checked in
> thread_group_cputime() returns false. Therefore we always get zero CPU
> times.

I see, thanks.

> So I probably have to introduce a second group_dead check at the
> beginning of __exit_signal():

Probably...

But in fact this reminds we should cleanup this code somehow.
By the time we call thread_group_times() there are no other
threads.

> My personal feeling is that probably the only acceptable thing would be
> to make the new behavior configurable with a sysctl and define the
> default as it currently is (POSIX compliant).
>
> This would only introduce two additional checks in __exit_signal() and
> wait_task_zombie() and would not add any new fields to the
> signal_struct.

Yeah, it would be nice to avoid new fields.


Hmm. Somehow I forgot about 4/4, please see another email...

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-19 20:11 ` [patch 4/4] taskstats: Export "cdata_acct" with taskstats Michael Holzheu
@ 2010-11-25 13:26   ` Oleg Nesterov
  2010-11-25 17:21     ` Michael Holzheu
  2010-11-25 16:57   ` Balbir Singh
  1 sibling, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-25 13:26 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/19, Michael Holzheu wrote:
>
> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
>
> With this patch the (full) cumulative CPU time is added to "struct taskstats".
> The CPU time is only returned for the thread group leader.
>
> ...
>
> +	if (tsk->tgid == tsk->pid

thread_group_leader() ?

> && lock_task_sighand(tsk, &flags)) {

Do you really need ->siglock? Starting from 2.6.35 it is always
safe to access ->signal.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 1/4] taskstats: Introduce "struct cdata"
  2010-11-19 20:11 ` [patch 1/4] taskstats: Introduce "struct cdata" Michael Holzheu
  2010-11-25 12:29   ` Balbir Singh
@ 2010-11-25 14:23   ` Oleg Nesterov
  2010-11-25 16:38     ` Michael Holzheu
  1 sibling, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-25 14:23 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/19, Michael Holzheu wrote:
>
> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
>
> This patch introduces a new structure "struct cdata" that is used to
> store cumulative resource counters for dead child processes and threads.
>
> Note that there is one asymmetry:
> For "struct task_io_accounting" (ioc) there is no extra accounting field for
> dead threads. One field is used for both, dead processes and threads.
>
> This patch introduces no functional change.
>
> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> ---
>  fs/binfmt_elf.c           |    4 +-
>  fs/exec.c                 |    2 -
>  fs/proc/array.c           |   16 ++++----
>  include/linux/sched.h     |   22 +++++++----
>  kernel/exit.c             |   86 ++++++++++++++++++++++++----------------------
>  kernel/posix-cpu-timers.c |   12 +++---
>  kernel/sys.c              |   44 ++++++++++++-----------
>  7 files changed, 100 insertions(+), 86 deletions(-)

Looks good. In fact, to me it looks like a cleanup.

But. You seem to forgot to change kernel/signal.c, no?

And cosmetic nit,

>  void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
>  {
> -	struct signal_struct *sig = tsk->signal;
> +	struct cdata *tcd = &tsk->signal->cdata_threads;
>  	struct task_struct *t;
>
> -	times->utime = sig->utime;
> -	times->stime = sig->stime;
> -	times->sum_exec_runtime = sig->sum_sched_runtime;
> +	times->utime = tcd->utime;
> +	times->stime = tcd->stime;
> +	times->sum_exec_runtime = tsk->signal->sum_sched_runtime;

Feel free to ignore, but I don't understand why you removed "sig".
Afaics,

	-     times->utime = sig->utime;
	-     times->stime = sig->stime;
	+     times->utime = sig->cdata_threads->utime;
	+     times->stime = sig->cdata_threads->stime;

looks a bit better.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 1/4] taskstats: Introduce "struct cdata"
  2010-11-25 14:23   ` Oleg Nesterov
@ 2010-11-25 16:38     ` Michael Holzheu
  0 siblings, 0 replies; 24+ messages in thread
From: Michael Holzheu @ 2010-11-25 16:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

Hello Oleg,

On Thu, 2010-11-25 at 15:23 +0100, Oleg Nesterov wrote:
> On 11/19, Michael Holzheu wrote:
> > Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > ---
> >  fs/binfmt_elf.c           |    4 +-
> >  fs/exec.c                 |    2 -
> >  fs/proc/array.c           |   16 ++++----
> >  include/linux/sched.h     |   22 +++++++----
> >  kernel/exit.c             |   86 ++++++++++++++++++++++++----------------------
> >  kernel/posix-cpu-timers.c |   12 +++---
> >  kernel/sys.c              |   44 ++++++++++++-----------
> >  7 files changed, 100 insertions(+), 86 deletions(-)
> 
> Looks good. In fact, to me it looks like a cleanup.
>
> But. You seem to forgot to change kernel/signal.c, no?

Yes, you are right.

> 
> And cosmetic nit,
> 
> >  void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
> >  {
> > -	struct signal_struct *sig = tsk->signal;
> > +	struct cdata *tcd = &tsk->signal->cdata_threads;
> >  	struct task_struct *t;
> >
> > -	times->utime = sig->utime;
> > -	times->stime = sig->stime;
> > -	times->sum_exec_runtime = sig->sum_sched_runtime;
> > +	times->utime = tcd->utime;
> > +	times->stime = tcd->stime;
> > +	times->sum_exec_runtime = tsk->signal->sum_sched_runtime;
> 
> Feel free to ignore, but I don't understand why you removed "sig".
> Afaics,
> 
> 	-     times->utime = sig->utime;
> 	-     times->stime = sig->stime;
> 	+     times->utime = sig->cdata_threads->utime;
> 	+     times->stime = sig->cdata_threads->stime;
> 
> looks a bit better.

Yes, this looks better.

I attached the corrected patch.

Michael
---
Subject: taskstats: Introduce "struct cdata"

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

This patch introduces a new structure "struct cdata" that is used to
store cumulative resource counters for dead child processes and threads.

Note that there is one asymmetry:
For "struct task_io_accounting" (ioc) there is no extra accounting field for
dead threads. One field is used for both dead processes and threads.

This patch introduces no functional change.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 fs/binfmt_elf.c           |    4 +-
 fs/exec.c                 |    2 -
 fs/proc/array.c           |   16 ++++----
 include/linux/sched.h     |   22 +++++++----
 kernel/exit.c             |   86 ++++++++++++++++++++++++----------------------
 kernel/posix-cpu-timers.c |    8 ++--
 kernel/signal.c           |    4 +-
 kernel/sys.c              |   44 ++++++++++++-----------
 8 files changed, 100 insertions(+), 86 deletions(-)

--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1296,8 +1296,8 @@ static void fill_prstatus(struct elf_prs
 		cputime_to_timeval(p->utime, &prstatus->pr_utime);
 		cputime_to_timeval(p->stime, &prstatus->pr_stime);
 	}
-	cputime_to_timeval(p->signal->cutime, &prstatus->pr_cutime);
-	cputime_to_timeval(p->signal->cstime, &prstatus->pr_cstime);
+	cputime_to_timeval(p->signal->cdata_wait.utime, &prstatus->pr_cutime);
+	cputime_to_timeval(p->signal->cdata_wait.stime, &prstatus->pr_cstime);
 }
 
 static int fill_psinfo(struct elf_prpsinfo *psinfo, struct task_struct *p,
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -894,7 +894,7 @@ static int de_thread(struct task_struct
 
 no_thread_group:
 	if (current->mm)
-		setmax_mm_hiwater_rss(&sig->maxrss, current->mm);
+		setmax_mm_hiwater_rss(&sig->cdata_threads.maxrss, current->mm);
 
 	exit_itimers(sig);
 	flush_itimer_signals();
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -413,11 +413,11 @@ static int do_task_stat(struct seq_file
 		num_threads = get_nr_threads(task);
 		collect_sigign_sigcatch(task, &sigign, &sigcatch);
 
-		cmin_flt = sig->cmin_flt;
-		cmaj_flt = sig->cmaj_flt;
-		cutime = sig->cutime;
-		cstime = sig->cstime;
-		cgtime = sig->cgtime;
+		cmin_flt = sig->cdata_wait.min_flt;
+		cmaj_flt = sig->cdata_wait.maj_flt;
+		cutime = sig->cdata_wait.utime;
+		cstime = sig->cdata_wait.stime;
+		cgtime = sig->cdata_wait.gtime;
 		rsslim = ACCESS_ONCE(sig->rlim[RLIMIT_RSS].rlim_cur);
 
 		/* add up live thread stats at the group level */
@@ -430,10 +430,10 @@ static int do_task_stat(struct seq_file
 				t = next_thread(t);
 			} while (t != task);
 
-			min_flt += sig->min_flt;
-			maj_flt += sig->maj_flt;
+			min_flt += sig->cdata_threads.min_flt;
+			maj_flt += sig->cdata_threads.maj_flt;
 			thread_group_times(task, &utime, &stime);
-			gtime = cputime_add(gtime, sig->gtime);
+			gtime = cputime_add(gtime, sig->cdata_threads.gtime);
 		}
 
 		sid = task_session_nr_ns(task, ns);
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -510,6 +510,17 @@ struct thread_group_cputimer {
 };
 
 /*
+ * Cumulative resource counters
+ */
+struct cdata {
+	cputime_t utime, stime, gtime;
+	unsigned long nvcsw, nivcsw;
+	unsigned long min_flt, maj_flt;
+	unsigned long inblock, oublock;
+	unsigned long maxrss;
+};
+
+/*
  * NOTE! "signal_struct" does not have it's own
  * locking, because a shared signal_struct always
  * implies a shared sighand_struct, so locking
@@ -582,17 +593,12 @@ struct signal_struct {
 	 * Live threads maintain their own counters and add to these
 	 * in __exit_signal, except for the group leader.
 	 */
-	cputime_t utime, stime, cutime, cstime;
-	cputime_t gtime;
-	cputime_t cgtime;
+	struct cdata cdata_wait;
+	struct cdata cdata_threads;
+	struct task_io_accounting ioac;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING
 	cputime_t prev_utime, prev_stime;
 #endif
-	unsigned long nvcsw, nivcsw, cnvcsw, cnivcsw;
-	unsigned long min_flt, maj_flt, cmin_flt, cmaj_flt;
-	unsigned long inblock, oublock, cinblock, coublock;
-	unsigned long maxrss, cmaxrss;
-	struct task_io_accounting ioac;
 
 	/*
 	 * Cumulative ns of schedule CPU time fo dead threads in the
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -95,6 +95,7 @@ static void __exit_signal(struct task_st
 		tty = sig->tty;
 		sig->tty = NULL;
 	} else {
+		struct cdata *tcd = &sig->cdata_threads;
 		/*
 		 * This can only happen if the caller is de_thread().
 		 * FIXME: this is the temporary hack, we should teach
@@ -122,15 +123,15 @@ static void __exit_signal(struct task_st
 		 * We won't ever get here for the group leader, since it
 		 * will have been the last reference on the signal_struct.
 		 */
-		sig->utime = cputime_add(sig->utime, tsk->utime);
-		sig->stime = cputime_add(sig->stime, tsk->stime);
-		sig->gtime = cputime_add(sig->gtime, tsk->gtime);
-		sig->min_flt += tsk->min_flt;
-		sig->maj_flt += tsk->maj_flt;
-		sig->nvcsw += tsk->nvcsw;
-		sig->nivcsw += tsk->nivcsw;
-		sig->inblock += task_io_get_inblock(tsk);
-		sig->oublock += task_io_get_oublock(tsk);
+		tcd->utime = cputime_add(tcd->utime, tsk->utime);
+		tcd->stime = cputime_add(tcd->stime, tsk->stime);
+		tcd->gtime = cputime_add(tcd->gtime, tsk->gtime);
+		tcd->min_flt += tsk->min_flt;
+		tcd->maj_flt += tsk->maj_flt;
+		tcd->nvcsw += tsk->nvcsw;
+		tcd->nivcsw += tsk->nivcsw;
+		tcd->inblock += task_io_get_inblock(tsk);
+		tcd->oublock += task_io_get_oublock(tsk);
 		task_io_accounting_add(&sig->ioac, &tsk->ioac);
 		sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
 	}
@@ -960,10 +961,11 @@ NORET_TYPE void do_exit(long code)
 		sync_mm_rss(tsk, tsk->mm);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
+		struct cdata *tcd = &tsk->signal->cdata_threads;
 		hrtimer_cancel(&tsk->signal->real_timer);
 		exit_itimers(tsk->signal);
 		if (tsk->mm)
-			setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
+			setmax_mm_hiwater_rss(&tcd->maxrss, tsk->mm);
 	}
 	acct_collect(code, group_dead);
 	if (group_dead)
@@ -1225,8 +1227,7 @@ static int wait_task_zombie(struct wait_
 	 * !task_detached() to filter out sub-threads.
 	 */
 	if (likely(!traced) && likely(!task_detached(p))) {
-		struct signal_struct *psig;
-		struct signal_struct *sig;
+		struct cdata *cd, *pcd, *tcd;
 		unsigned long maxrss;
 		cputime_t tgutime, tgstime;
 
@@ -1251,40 +1252,43 @@ static int wait_task_zombie(struct wait_
 		 */
 		thread_group_times(p, &tgutime, &tgstime);
 		spin_lock_irq(&p->real_parent->sighand->siglock);
-		psig = p->real_parent->signal;
-		sig = p->signal;
-		psig->cutime =
-			cputime_add(psig->cutime,
+		pcd = &p->real_parent->signal->cdata_wait;
+		tcd = &p->signal->cdata_threads;
+		cd = &p->signal->cdata_wait;
+
+		pcd->utime =
+			cputime_add(pcd->utime,
 			cputime_add(tgutime,
-				    sig->cutime));
-		psig->cstime =
-			cputime_add(psig->cstime,
+				    cd->utime));
+		pcd->stime =
+			cputime_add(pcd->stime,
 			cputime_add(tgstime,
-				    sig->cstime));
-		psig->cgtime =
-			cputime_add(psig->cgtime,
+				    cd->stime));
+		pcd->gtime =
+			cputime_add(pcd->gtime,
 			cputime_add(p->gtime,
-			cputime_add(sig->gtime,
-				    sig->cgtime)));
-		psig->cmin_flt +=
-			p->min_flt + sig->min_flt + sig->cmin_flt;
-		psig->cmaj_flt +=
-			p->maj_flt + sig->maj_flt + sig->cmaj_flt;
-		psig->cnvcsw +=
-			p->nvcsw + sig->nvcsw + sig->cnvcsw;
-		psig->cnivcsw +=
-			p->nivcsw + sig->nivcsw + sig->cnivcsw;
-		psig->cinblock +=
+			cputime_add(tcd->gtime,
+				    cd->gtime)));
+		pcd->min_flt +=
+			p->min_flt + tcd->min_flt + cd->min_flt;
+		pcd->maj_flt +=
+			p->maj_flt + tcd->maj_flt + cd->maj_flt;
+		pcd->nvcsw +=
+			p->nvcsw + tcd->nvcsw + cd->nvcsw;
+		pcd->nivcsw +=
+			p->nivcsw + tcd->nivcsw + cd->nivcsw;
+		pcd->inblock +=
 			task_io_get_inblock(p) +
-			sig->inblock + sig->cinblock;
-		psig->coublock +=
+			tcd->inblock + cd->inblock;
+		pcd->oublock +=
 			task_io_get_oublock(p) +
-			sig->oublock + sig->coublock;
-		maxrss = max(sig->maxrss, sig->cmaxrss);
-		if (psig->cmaxrss < maxrss)
-			psig->cmaxrss = maxrss;
-		task_io_accounting_add(&psig->ioac, &p->ioac);
-		task_io_accounting_add(&psig->ioac, &sig->ioac);
+			tcd->oublock + cd->oublock;
+		maxrss = max(tcd->maxrss, cd->maxrss);
+		if (pcd->maxrss < maxrss)
+			pcd->maxrss = maxrss;
+		task_io_accounting_add(&p->real_parent->signal->ioac, &p->ioac);
+		task_io_accounting_add(&p->real_parent->signal->ioac,
+				       &p->signal->ioac);
 		spin_unlock_irq(&p->real_parent->sighand->siglock);
 	}
 
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -235,8 +235,8 @@ void thread_group_cputime(struct task_st
 	struct signal_struct *sig = tsk->signal;
 	struct task_struct *t;
 
-	times->utime = sig->utime;
-	times->stime = sig->stime;
+	times->utime = sig->cdata_threads.utime;
+	times->stime = sig->cdata_threads.stime;
 	times->sum_exec_runtime = sig->sum_sched_runtime;
 
 	rcu_read_lock();
@@ -516,8 +516,8 @@ void posix_cpu_timers_exit_group(struct
 	struct signal_struct *const sig = tsk->signal;
 
 	cleanup_timers(tsk->signal->cpu_timers,
-		       cputime_add(tsk->utime, sig->utime),
-		       cputime_add(tsk->stime, sig->stime),
+		       cputime_add(tsk->utime, sig->cdata_threads.utime),
+		       cputime_add(tsk->stime, sig->cdata_threads.stime),
 		       tsk->se.sum_exec_runtime + sig->sum_sched_runtime);
 }
 
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1476,9 +1476,9 @@ int do_notify_parent(struct task_struct
 	rcu_read_unlock();
 
 	info.si_utime = cputime_to_clock_t(cputime_add(tsk->utime,
-				tsk->signal->utime));
+				tsk->signal->cdata_threads.utime));
 	info.si_stime = cputime_to_clock_t(cputime_add(tsk->stime,
-				tsk->signal->stime));
+				tsk->signal->cdata_threads.stime));
 
 	info.si_status = tsk->exit_code & 0x7f;
 	if (tsk->exit_code & 0x80)
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -884,8 +884,8 @@ void do_sys_times(struct tms *tms)
 
 	spin_lock_irq(&current->sighand->siglock);
 	thread_group_times(current, &tgutime, &tgstime);
-	cutime = current->signal->cutime;
-	cstime = current->signal->cstime;
+	cutime = current->signal->cdata_wait.utime;
+	cstime = current->signal->cdata_wait.stime;
 	spin_unlock_irq(&current->sighand->siglock);
 	tms->tms_utime = cputime_to_clock_t(tgutime);
 	tms->tms_stime = cputime_to_clock_t(tgstime);
@@ -1490,6 +1490,7 @@ static void k_getrusage(struct task_stru
 	unsigned long flags;
 	cputime_t tgutime, tgstime, utime, stime;
 	unsigned long maxrss = 0;
+	struct cdata *cd;
 
 	memset((char *) r, 0, sizeof *r);
 	utime = stime = cputime_zero;
@@ -1497,7 +1498,7 @@ static void k_getrusage(struct task_stru
 	if (who == RUSAGE_THREAD) {
 		task_times(current, &utime, &stime);
 		accumulate_thread_rusage(p, r);
-		maxrss = p->signal->maxrss;
+		maxrss = p->signal->cdata_threads.maxrss;
 		goto out;
 	}
 
@@ -1507,31 +1508,34 @@ static void k_getrusage(struct task_stru
 	switch (who) {
 		case RUSAGE_BOTH:
 		case RUSAGE_CHILDREN:
-			utime = p->signal->cutime;
-			stime = p->signal->cstime;
-			r->ru_nvcsw = p->signal->cnvcsw;
-			r->ru_nivcsw = p->signal->cnivcsw;
-			r->ru_minflt = p->signal->cmin_flt;
-			r->ru_majflt = p->signal->cmaj_flt;
-			r->ru_inblock = p->signal->cinblock;
-			r->ru_oublock = p->signal->coublock;
-			maxrss = p->signal->cmaxrss;
+			cd = &p->signal->cdata_wait;
+			utime = cd->utime;
+			stime = cd->stime;
+			r->ru_nvcsw = cd->nvcsw;
+			r->ru_nivcsw = cd->nivcsw;
+			r->ru_minflt = cd->min_flt;
+			r->ru_majflt = cd->maj_flt;
+			r->ru_inblock = cd->inblock;
+			r->ru_oublock = cd->oublock;
+			maxrss = cd->maxrss;
 
 			if (who == RUSAGE_CHILDREN)
 				break;
 
 		case RUSAGE_SELF:
+			cd = &p->signal->cdata_threads;
 			thread_group_times(p, &tgutime, &tgstime);
 			utime = cputime_add(utime, tgutime);
 			stime = cputime_add(stime, tgstime);
-			r->ru_nvcsw += p->signal->nvcsw;
-			r->ru_nivcsw += p->signal->nivcsw;
-			r->ru_minflt += p->signal->min_flt;
-			r->ru_majflt += p->signal->maj_flt;
-			r->ru_inblock += p->signal->inblock;
-			r->ru_oublock += p->signal->oublock;
-			if (maxrss < p->signal->maxrss)
-				maxrss = p->signal->maxrss;
+			r->ru_nvcsw += cd->nvcsw;
+			r->ru_nivcsw += cd->nivcsw;
+			r->ru_minflt += cd->min_flt;
+			r->ru_majflt += cd->maj_flt;
+			r->ru_inblock += cd->inblock;
+			r->ru_oublock += cd->oublock;
+			if (maxrss < cd->maxrss)
+				maxrss = cd->maxrss;
+
 			t = p;
 			do {
 				accumulate_thread_rusage(t, r);



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-19 20:11 ` [patch 4/4] taskstats: Export "cdata_acct" with taskstats Michael Holzheu
  2010-11-25 13:26   ` Oleg Nesterov
@ 2010-11-25 16:57   ` Balbir Singh
  1 sibling, 0 replies; 24+ messages in thread
From: Balbir Singh @ 2010-11-25 16:57 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Oleg Nesterov, Shailabh Nagar, Andrew Morton, Peter Zijlstra,
	John stultz, Thomas Gleixner, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

* Michael Holzheu <holzheu@linux.vnet.ibm.com> [2010-11-19 21:11:12]:

> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> 
> With this patch the (full) cumulative CPU time is added to "struct taskstats".
> The CPU time is only returned for the thread group leader.
> 
> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> ---
>  include/linux/taskstats.h |    7 ++++++-
>  kernel/tsacct.c           |    7 +++++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> --- a/include/linux/taskstats.h
> +++ b/include/linux/taskstats.h
> @@ -33,7 +33,7 @@
>   */
> 
> 
> -#define TASKSTATS_VERSION	7
> +#define TASKSTATS_VERSION	8
>  #define TS_COMM_LEN		32	/* should be >= TASK_COMM_LEN
>  					 * in linux/sched.h */
> 
> @@ -163,6 +163,11 @@ struct taskstats {
>  	/* Delay waiting for memory reclaim */
>  	__u64	freepages_count;
>  	__u64	freepages_delay_total;
> +	/* version 7 ends here */
> +
> +	/* All cumulative CPU time of dead children */
> +	__u64	ac_cutime_acct;		/* User CPU time [usec] */
> +	__u64	ac_cstime_acct;		/* System CPU time [usec] */
>  };
> 
> 
> --- a/kernel/tsacct.c
> +++ b/kernel/tsacct.c
> @@ -31,6 +31,7 @@ void bacct_add_tsk(struct taskstats *sta
>  	const struct cred *tcred;
>  	struct timespec uptime, ts;
>  	u64 ac_etime;
> +	unsigned long flags;
> 
>  	BUILD_BUG_ON(TS_COMM_LEN < TASK_COMM_LEN);
> 
> @@ -71,6 +72,12 @@ void bacct_add_tsk(struct taskstats *sta
>  	stats->ac_majflt = tsk->maj_flt;
> 
>  	strncpy(stats->ac_comm, tsk->comm, sizeof(stats->ac_comm));
> +	if (tsk->tgid == tsk->pid && lock_task_sighand(tsk, &flags)) {

I don't think referring to tgid and pid is a good idea in the context
of namespaces, thread_group_leader makes more sense like Oleg pointed
out.

> +		struct cdata *cd = &tsk->signal->cdata_acct;
> +		stats->ac_cutime_acct = cputime_to_usecs(cd->utime);
> +		stats->ac_cstime_acct = cputime_to_usecs(cd->stime);
> +		unlock_task_sighand(tsk, &flags);
> +	}
>  }
> 
> 
> 

-- 
	Three Cheers,
	Balbir

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-25 13:26   ` Oleg Nesterov
@ 2010-11-25 17:21     ` Michael Holzheu
  2010-11-29 16:43       ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Holzheu @ 2010-11-25 17:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

Hello Oleg,

On Thu, 2010-11-25 at 14:26 +0100, Oleg Nesterov wrote:
> On 11/19, Michael Holzheu wrote:
> >
> > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> >
> > With this patch the (full) cumulative CPU time is added to "struct taskstats".
> > The CPU time is only returned for the thread group leader.
> >
> > ...
> >
> > +	if (tsk->tgid == tsk->pid
> 
> thread_group_leader() ?

Yes, that's better.

> > && lock_task_sighand(tsk, &flags)) {
> 
> Do you really need ->siglock? Starting from 2.6.35 it is always
> safe to access ->signal.

Hmmm, if you say that...

I just did it like it is done in e.g. fs/proc/base.c (proc_pid_limits).
Can we remove the locking there, too?

Michael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting
  2010-11-25 13:21       ` Oleg Nesterov
@ 2010-11-25 17:45         ` Michael Holzheu
  0 siblings, 0 replies; 24+ messages in thread
From: Michael Holzheu @ 2010-11-25 17:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On Thu, 2010-11-25 at 14:21 +0100, Oleg Nesterov wrote:
> On 11/25, Michael Holzheu wrote:
> But in fact this reminds we should cleanup this code somehow.
> By the time we call thread_group_times() there are no other
> threads.

Couldn't we just replace thread_group_times() with the following:

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -101,12 +101,13 @@ static void __account_cdata(struct task_
         * group, which consolidates times for all threads in the
         * group including the group leader.
         */
-       thread_group_times(p, &tgutime, &tgstime);
        pcd = &p->real_parent->signal->cdata_wait;
        tcd = &p->signal->cdata_threads;
        cd = &p->signal->cdata_wait;

        spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
+       tgutime = cputime_add(tcd->utime, p->utime);
+       tgstime = cputime_add(tcd->stime, p->stime);
        pcd->utime =
                cputime_add(pcd->utime,
                cputime_add(tgutime,

Then we can move the __account_cdata() invocation down to the second
group_dead check.

Michael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-25 17:21     ` Michael Holzheu
@ 2010-11-29 16:43       ` Oleg Nesterov
  2010-11-29 16:58         ` Michael Holzheu
  0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-29 16:43 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/25, Michael Holzheu wrote:
>
> Hello Oleg,
>
> On Thu, 2010-11-25 at 14:26 +0100, Oleg Nesterov wrote:
> > On 11/19, Michael Holzheu wrote:
> > >
> > > From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
> > >
> > > With this patch the (full) cumulative CPU time is added to "struct taskstats".
> > > The CPU time is only returned for the thread group leader.
> > >
> > > ...
> > >
> > > +	if (tsk->tgid == tsk->pid
> >
> > thread_group_leader() ?
>
> Yes, that's better.
>
> > > && lock_task_sighand(tsk, &flags)) {
> >
> > Do you really need ->siglock? Starting from 2.6.35 it is always
> > safe to access ->signal.
>
> Hmmm, if you say that...
>
> I just did it like it is done in e.g. fs/proc/base.c (proc_pid_limits).
> Can we remove the locking there, too?

We can certainly remove more siglock's which were previously
needed to access ->signal.

This particular one is just wrong. We need task_lock(group_leader)
to read signal->rlim atomically. However, it is not trivial to do
this correctly. Probably we should ignore this minor problem.

In any case, this ->siglock buys nothing today.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-29 16:43       ` Oleg Nesterov
@ 2010-11-29 16:58         ` Michael Holzheu
  2010-11-29 18:08           ` Oleg Nesterov
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Holzheu @ 2010-11-29 16:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

Hello Oleg,

On Mon, 2010-11-29 at 17:43 +0100, Oleg Nesterov wrote:
> > I just did it like it is done in e.g. fs/proc/base.c (proc_pid_limits).
> > Can we remove the locking there, too?
> 
> We can certainly remove more siglock's which were previously
> needed to access ->signal.
> 
> This particular one is just wrong. We need task_lock(group_leader)
> to read signal->rlim atomically. However, it is not trivial to do
> this correctly. Probably we should ignore this minor problem.
> 
> In any case, this ->siglock buys nothing today.

But at least to get the two values cutime and cstime consistent we need
the siglock? There could be a parallel update for
tsk->signal->cutime/cstime, while the taskstats are created, no?

Michael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 4/4] taskstats: Export "cdata_acct" with taskstats
  2010-11-29 16:58         ` Michael Holzheu
@ 2010-11-29 18:08           ` Oleg Nesterov
  0 siblings, 0 replies; 24+ messages in thread
From: Oleg Nesterov @ 2010-11-29 18:08 UTC (permalink / raw)
  To: Michael Holzheu
  Cc: Shailabh Nagar, Andrew Morton, Peter Zijlstra, John stultz,
	Thomas Gleixner, Balbir Singh, Martin Schwidefsky, Heiko Carstens,
	Roland McGrath, linux-kernel, linux-s390

On 11/29, Michael Holzheu wrote:
>
> Hello Oleg,
>
> On Mon, 2010-11-29 at 17:43 +0100, Oleg Nesterov wrote:
> > > I just did it like it is done in e.g. fs/proc/base.c (proc_pid_limits).
> > > Can we remove the locking there, too?
> >
> > We can certainly remove more siglock's which were previously
> > needed to access ->signal.
> >
> > This particular one is just wrong. We need task_lock(group_leader)
> > to read signal->rlim atomically. However, it is not trivial to do
> > this correctly. Probably we should ignore this minor problem.
> >
> > In any case, this ->siglock buys nothing today.
>
> But at least to get the two values cutime and cstime consistent we need
> the siglock? There could be a parallel update for
> tsk->signal->cutime/cstime, while the taskstats are created, no?

I guess you mean 4/4 you sent. Probably you are right, I'll try
to look tomorrow.

Oleg.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-11-29 18:16 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-19 20:11 [patch 0/4] taskstats: Improve cumulative time accounting Michael Holzheu
2010-11-19 20:11 ` [patch 1/4] taskstats: Introduce "struct cdata" Michael Holzheu
2010-11-25 12:29   ` Balbir Singh
2010-11-25 14:23   ` Oleg Nesterov
2010-11-25 16:38     ` Michael Holzheu
2010-11-19 20:11 ` [patch 2/4] taskstats: Introduce __account_cdata() function Michael Holzheu
2010-11-19 20:11 ` [patch 3/4] taskstats: Introduce cdata_acct for complete cumulative accounting Michael Holzheu
2010-11-23 16:59   ` Oleg Nesterov
2010-11-25  9:40     ` Michael Holzheu
2010-11-25 13:21       ` Oleg Nesterov
2010-11-25 17:45         ` Michael Holzheu
2010-11-19 20:11 ` [patch 4/4] taskstats: Export "cdata_acct" with taskstats Michael Holzheu
2010-11-25 13:26   ` Oleg Nesterov
2010-11-25 17:21     ` Michael Holzheu
2010-11-29 16:43       ` Oleg Nesterov
2010-11-29 16:58         ` Michael Holzheu
2010-11-29 18:08           ` Oleg Nesterov
2010-11-25 16:57   ` Balbir Singh
2010-11-19 20:19 ` [patch 0/4] taskstats: Improve cumulative time accounting Peter Zijlstra
2010-11-20 15:17   ` Oleg Nesterov
2010-11-22  7:21     ` Balbir Singh
2010-11-22 11:03   ` Michael Holzheu
2010-11-22 12:47     ` Michael Holzheu
2010-11-22 18:11       ` Valdis.Kletnieks

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox