All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Shailabh Nagar <nagar1234@in.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Venkatesh Pallipadi <venki@google.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	John stultz <johnstul@us.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 09/10] taskstats: Fix exit CPU time accounting
Date: Tue, 19 Oct 2010 16:17:25 +0200	[thread overview]
Message-ID: <20101019141725.GA32361@redhat.com> (raw)
In-Reply-To: <1287153246.1896.231.camel@holzheu-laptop>

On 10/15, Michael Holzheu wrote:
>
> On Thu, 2010-10-14 at 15:47 +0200, Oleg Nesterov wrote:
>
> > Yes. But __account_to_parent() always sets p->exit_accounting_done = 1.
> > And __exit_signal() calls __account_to_parent() only if it is not set.
> >
> > This means that we update either cdata_wait (if the child was reaped
> > by parent) or cdata_acct (the process auto-reaps itself).
>
> No. The accounting of cdata_acct is done unconditionally in
> __account_to_parent(). It is done for both cases wait=0 and wait=1,
> therefore no CPU time gets lost. Accounting of cdata_wait is done only
> on the sys_wait() path, where "wait" is "1".

Ah, got it, I didn't notice this detail.

Thanks.

> I think it works as it currently is. But as already said, this probably
> could be done better. At least your confusion seems to prove that :-)

Perhaps ;)

To me, it would be cleaner and simpler if you kill ->exit_accounting_done.
Both wait_task_zombie() and __exit_signal() could just call
__account_to_parent(parent_for_accounting) unconditionally passing
either real_parent or acct_parent as an argument. This also saves a
word in task_struct.

> de_thread() is also a very interesting spot for accounting. The thread
> that calls exec() gets a bit of the identity of the old thread group
> leader e.g. PID and start time, but it keeps the old CPU times. This
> looks strange to me.

Well, the main thread represents the whole process for ps/etc, that
is why we update ->start_time.

But,

> Wouldn't it be better to either exchange the accounting data between old
> and new leader

I dunno. The exiting old leader will update sig->utime/etc, so we do not
lose this info from the "whole process" pov. But yes, if user-space
looks at the single thread with that TGID it can notice that, say, utime
goes backward.

> or add the current accounting data of the new leader to
> the signal struct and initialize them with zero again?

Sorry, I don't understand this "initialize them with zero". What
is "them" ?

> > I think you can simplify this, but  I am not sure right now.
> >
> > First of all, ->acct_parent should be moved from task_struct to
> > signal_struct. No need to initialize t->acct_parent unless t is
> > the group leader (this means we can avoid do/while_each_thread
> > loop during re-parenting, but de_thread needs another trivial
> > change).
> > No need to change forget_original_parent() at all, instead we
> > can the single line
> >
> > 	p->signal->acct_parent = father->signal->acct_parent;
> >
> > to reparent_leader(), after the "if (same_thread_group())" check.
> >
> > What do you think?
>
> I think it is not that easy because we still have to maintain the
> children_acct list. This list is used to reparent all the accounting
> children to the new accounting parent.

Yes, sure, reparent_leader() should also do list_move_tail(acct_sibling),
I forget to mention this.

I guess you already understand this, but just in case. Please look at
sibling/children relationship. We do not add the sub-threads on
->children list, only the main thread.

However, every thread has its own ->parent and ->children, this is
because we have __WNOTHREAD. But acct-parenting doesn't have this
problem, only the main thread needs the properly initialized
->acct_parent, it is never needed until the whole process dies.

> But in principle you are right that acct_parent could be moved to the
> signal_struct because we only have to change it, when a thread group
> leader dies.

Yes. And if we move it into signal_struct, then we shouldn't worry
about updating it in de_thread().

However, de_thread() should do list_replace_init(leader->acct_sibling)
to add the new leader to acct_children.

I am not sure this really makes sense, but in fact you can move
->acct_sibling and ->acct_childen from task_struct to signal_struct
as well, note that you can trivially find the group leader looking
at signal->leader_pid. (actually, ->group_leader should be moved
to signal_struct, but this is another story). In this case de_thread()
needs no changes, and we save the space in task_struct.

Oleg.

  reply	other threads:[~2010-10-19 14:17 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-23 13:48 [RFC][PATCH 00/10] taskstats: Enhancements for precise accounting Michael Holzheu
2010-09-23 14:00 ` [RFC][PATCH 01/10] taskstats: Use real microsecond granularity for CPU times Michael Holzheu
2010-10-07  5:08   ` Balbir Singh
2010-10-08 15:08     ` Michael Holzheu
2010-10-08 16:39       ` Balbir Singh
2010-09-23 14:01 ` [RFC][PATCH 02/10] taskstats: Separate taskstats commands Michael Holzheu
2010-09-27  9:32   ` Balbir Singh
2010-10-11  7:40   ` Balbir Singh
2010-09-23 14:01 ` [RFC][PATCH 03/10] taskstats: Split fill_pid function Michael Holzheu
2010-09-23 17:33   ` Oleg Nesterov
2010-09-27  9:33   ` Balbir Singh
2010-10-11  8:31   ` Balbir Singh
2010-09-23 14:01 ` [RFC][PATCH 04/10] taskstats: Add new taskstats command TASKSTATS_CMD_ATTR_PIDS Michael Holzheu
2010-09-23 14:01 ` [RFC][PATCH 05/10] taskstats: Add "/proc/taskstats" Michael Holzheu
2010-09-23 14:01 ` [RFC][PATCH 06/10] taskstats: Add thread group ID to taskstats structure Michael Holzheu
2010-09-23 14:01 ` [RFC][PATCH 07/10] taskstats: Add per task steal time accounting Michael Holzheu
2010-09-23 14:02 ` [RFC][PATCH 08/10] taskstats: Add cumulative CPU time (user, system and steal) Michael Holzheu
2010-09-23 14:02 ` [RFC][PATCH 09/10] taskstats: Fix exit CPU time accounting Michael Holzheu
2010-09-23 17:10   ` Oleg Nesterov
2010-09-24 12:18     ` Michael Holzheu
2010-09-26 18:11       ` Oleg Nesterov
2010-09-27 13:23         ` Michael Holzheu
2010-09-27 13:42         ` Martin Schwidefsky
2010-09-27 16:51           ` Oleg Nesterov
2010-09-28  7:09             ` Martin Schwidefsky
2010-09-29 19:19             ` Roland McGrath
2010-09-30 13:47               ` Michael Holzheu
2010-10-05  8:57                 ` Roland McGrath
2010-10-06  9:29                   ` Michael Holzheu
2010-10-06 15:26                     ` Oleg Nesterov
2010-10-07 15:06                       ` Michael Holzheu
2010-10-11 12:37                         ` Oleg Nesterov
2010-10-12 13:10                           ` Michael Holzheu
2010-10-14 13:47                             ` Oleg Nesterov
2010-10-15 14:34                               ` Michael Holzheu
2010-10-19 14:17                                 ` Oleg Nesterov [this message]
2010-10-22 16:53                                   ` Michael Holzheu
2010-09-28  8:36           ` Balbir Singh
2010-09-28  9:08             ` Martin Schwidefsky
2010-09-28  9:23               ` Balbir Singh
2010-09-28 10:36                 ` Martin Schwidefsky
2010-09-28 10:39                   ` Balbir Singh
2010-09-28  8:21   ` Balbir Singh
2010-09-28 16:50     ` Michael Holzheu
2010-09-23 14:04 ` [RFC][PATCH 10/10] taststats: User space with ptop tool Michael Holzheu
2010-09-23 20:11 ` [RFC][PATCH 00/10] taskstats: Enhancements for precise accounting Andrew Morton
2010-09-23 20:11   ` Andrew Morton
     [not found]   ` <20100923131136.356075f4.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2010-09-23 22:11     ` Matt Helsley
2010-09-24  9:10     ` Michael Holzheu
2010-09-23 22:11   ` Matt Helsley
     [not found]     ` <20100923221139.GI23839-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2010-09-24 12:39       ` Michael Holzheu
2010-09-25 18:19       ` Serge E. Hallyn
2010-09-25 18:19         ` Serge E. Hallyn
2010-09-24 12:39     ` Michael Holzheu
2010-09-24  9:10   ` Michael Holzheu
2010-09-24 18:50     ` Andrew Morton
2010-09-24 18:50       ` Andrew Morton
2010-09-27  9:18       ` Michael Holzheu
2010-09-27 20:02         ` Andrew Morton
2010-09-27 20:02         ` Andrew Morton
2010-09-28  8:17           ` Balbir Singh
     [not found]           ` <20100927130256.5d9a3db8.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2010-09-28  8:17             ` Balbir Singh
     [not found]       ` <20100924115002.fcb4385a.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2010-09-27  9:18         ` Michael Holzheu
2010-09-27 10:49     ` Balbir Singh
2010-09-27 10:49     ` Balbir Singh
2010-09-24  9:16 ` Balbir Singh
2010-09-30  8:38 ` Andi Kleen
2010-09-30 13:56   ` Michael Holzheu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101019141725.GA32361@redhat.com \
    --to=oleg@redhat.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=holzheu@linux.vnet.ibm.com \
    --cc=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=nagar1234@in.ibm.com \
    --cc=roland@redhat.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=suresh.b.siddha@intel.com \
    --cc=tglx@linutronix.de \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.