From: Jay Lan <jlan@engr.sgi.com>
To: Paul Jackson <pj@engr.sgi.com>
Cc: johnpol@2ka.mipt.ru, guillaume.thouvenin@bull.net, akpm@osdl.org,
greg@kroah.com, linux-kernel@vger.kernel.org,
efocht@hpce.nec.com, linuxram@us.ibm.com, gh@us.ibm.com,
elsa-devel@lists.sourceforge.net
Subject: Re: [patch 1/2] fork_connector: add a fork connector
Date: Tue, 29 Mar 2005 13:09:44 -0800
Message-ID: <4249C418.5040007@engr.sgi.com>
In-Reply-To: <20050329090304.23fbb340.pj@engr.sgi.com>

Paul,

The fork_connector is not designed to solve the accounting data
collection problem.

The accounting data collection must be done via a hook from do_exit().
The acct_process() hook invokes do_acct_process() to write BSD
accounting data to disk. CSA needs a similar hook off do_exit() to
collect more accounting data and write it to disk in a different
accounting record format. That hook is not part of fork_connector.
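
As a rough illustration of the arrangement described above (several
collectors hanging off the exit path), here is a userspace C model. It
is not kernel code. Only do_exit() and acct_process() are names from
this mail; struct task, exit_hook_register(), model_do_exit() and
count_record() are hypothetical stand-ins chosen for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXIT_HOOKS 4

/* Hypothetical, minimal stand-in for the exiting task. */
struct task {
    int  pid;
    int  ppid;
    long exit_code;
};

/* A collector that wants to see every exiting task. */
typedef void (*exit_hook_fn)(const struct task *t, void *ctx);

struct exit_hooks {
    exit_hook_fn fn[MAX_EXIT_HOOKS];
    void        *ctx[MAX_EXIT_HOOKS];
    size_t       count;
};

/* Register one collector; returns 0 on success, -1 if the table is full. */
static int exit_hook_register(struct exit_hooks *h, exit_hook_fn fn, void *ctx)
{
    if (h->count >= MAX_EXIT_HOOKS)
        return -1;
    h->fn[h->count] = fn;
    h->ctx[h->count] = ctx;
    h->count++;
    return 0;
}

/* Model of the exit path: each registered collector is called once. */
static void model_do_exit(const struct exit_hooks *h, const struct task *t)
{
    size_t i;

    assert(h != NULL && t != NULL);
    for (i = 0; i < h->count; i++)
        h->fn[i](t, h->ctx[i]);
}

/* Stand-in collector: counts records instead of writing them to disk. */
struct acct_stats {
    unsigned int records;
    int          last_pid;
};

static void count_record(const struct task *t, void *ctx)
{
    struct acct_stats *s = ctx;

    s->records++;
    s->last_pid = t->pid;
}
```

Registering count_record() twice with two separate struct acct_stats
contexts models a BSD-style and a CSA-style collector sitting side by
side: each sees every exit, and neither depends on a fork-time
notification.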

ELSA does not care how the accounting data is written to disk. However,
it needs the accounting data to be reliably and accurately collected
and written to disk by BSD and/or CSA, and it needs the <ppid, pid>
information to aggregate processes. It was never the fork_connector's
intention to piggyback the data onto the accounting file.
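
To make the use of <ppid, pid> pairs concrete, here is one plausible
way a userspace consumer could aggregate processes from fork events.
This is an assumption for illustration, not ELSA's actual code: the
group_table type and the group_of(), group_assign() and
on_fork_event() helpers are hypothetical, and the rule "a child joins
its parent's group" is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 1024
#define NO_GROUP    (-1)

/* One tracked process and the group it belongs to. */
struct pid_group {
    int pid;
    int group;
};

struct group_table {
    struct pid_group e[MAX_TRACKED];
    size_t           n;
};

/* Look up the group of a pid; NO_GROUP if the pid is not tracked. */
static int group_of(const struct group_table *t, int pid)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->n; i++)
        if (t->e[i].pid == pid)
            return t->e[i].group;
    return NO_GROUP;
}

/* Put a pid into a group (or move it); -1 if the table is full. */
static int group_assign(struct group_table *t, int pid, int group)
{
    size_t i;

    for (i = 0; i < t->n; i++) {
        if (t->e[i].pid == pid) {
            t->e[i].group = group;
            return 0;
        }
    }
    if (t->n >= MAX_TRACKED)
        return -1;
    t->e[t->n].pid = pid;
    t->e[t->n].group = group;
    t->n++;
    return 0;
}

/*
 * Handle one <ppid, pid> fork event: the child joins the parent's
 * group, if the parent is tracked. Returns the child's group, or
 * NO_GROUP if the parent is untracked or the table is full.
 */
static int on_fork_event(struct group_table *t, int ppid, int pid)
{
    int g = group_of(t, ppid);

    if (g == NO_GROUP)
        return NO_GROUP;
    if (group_assign(t, pid, g) != 0)
        return NO_GROUP;
    return g;
}
```

The sketch ignores pid reuse and never removes exited processes; a
real consumer would have to handle both.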
Thanks,
- jay
Paul Jackson wrote:
> Evgeniy writes:
>
>>Here forking connector module "exits" and can handle next fork() on the
>>same CPU.
>
>
> Fine ... but it's not about what the fork_connector does. It's about
> getting the accounting data to disk, if I understand correctly.
>
>
>>That is why it is very fast in "fast-path".
>
>
> I don't care how fast a tool is. I care how fast the job gets done. If
> a tool is only doing part of the job, then we can't decide whether to
> use that tool just based on how fast that part of the job gets done.
>
>
>>The most expensive part is cn_netlink_send()/netlink_broadcast(),
>>with CBUS it is deferred to the safe time,
>
>
> This is "safe time" for the immediate purpose of letting the forking
> process continue on its way. But the deferred work of buffering up the
> data and writing it to disk still needs to be done, pretty soon. When
> sizing a system to see how many users or jobs I can run on it at a time,
> I will have to include sufficient cpu, memory and disk i/o to handle
> getting this accounting data to disk, right?
>
>
>>> 2) Using a modified form of what BSD ACCOUNTING does now:
>>> - forking process appends single fork data to in-kernel buffer
>>
>>It is not as simple.
>>It takes global locks several times, and it accesses a bunch of data
>>shared between CPUs.
>>It calls ->stat() and ->write() which may sleep.
>
>
> Hmmm ... good points. The mechanisms in the kernel now (and for the
> last 25 years) to write out BSD ACCOUNTING data may not be numa friendly.
>
> Perhaps there should be a per-cpu 512 byte buffer, which can gather up 8
> accounting records (64 bytes each) and only call the file system write
> once every 8 task exits. Or perhaps a per-node buffer, with a spinlock
> to serialize access by the CPUs on that node. Or perhaps per-node
> accounting files. Or something like that.
>
> Guillaume, Jay - do we (you ?) need to make classic BSD ACCOUNTING data
> collection numa friendly? Based on the various frustrated comments at
> the top of kernel/acct.c, this could be a non-trivial effort to get
> right. Maybe we need it, but can't afford it.
>
> And perhaps my proposed variable length records for supplementary
> accounting, such as <parent pid, child pid> from fork, need to allow
> for some way to pad out the rest of a buffer, when the next record
> won't fit entirely.
>
>
>>That work is deferred and does not affect in-kernel processes.
>
>
> The accounting data collection cannot be deferred for long, perhaps
> just a few minutes. Not until the data hits the disk can we rest
> indefinitely. Unless, that is, I don't understand what problem is
> being solved here (quite possible ;).
>
>
>>And why should a userspace fork connector write data to the disk?
>
>
> I NEVER said it should. I am NOT trying to redesign fork_connector.
>
> Good grief ... how many times and ways do I have to say this ;)?
>
> I am asking what is the best tool for accounting data collection,
> which, if I understand correctly, does need to write to disk.
>
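
The per-cpu buffering idea in the quoted text (a 512 byte buffer
holding 8 records of 64 bytes, written out once every 8 task exits)
can be sketched in userspace C as below. This is a model under stated
assumptions, not code from kernel/acct.c: struct acct_batch, struct
acct_sink and all function names are hypothetical, the sink only
counts writes instead of calling a file system, and no locking is
shown because each buffer is assumed to be touched by one CPU only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ACCT_REC_SIZE 64                            /* record size from the mail */
#define ACCT_BATCH    8                             /* records per buffer */
#define ACCT_BUF_SIZE (ACCT_REC_SIZE * ACCT_BATCH)  /* 512 bytes */

/* Hypothetical sink standing in for the file system write. */
typedef void (*acct_flush_fn)(const void *buf, size_t len, void *ctx);

/* One of these per CPU (hypothetical layout). */
struct acct_batch {
    unsigned char buf[ACCT_BUF_SIZE];
    unsigned int  used;      /* records currently buffered */
    acct_flush_fn flush;
    void         *ctx;
};

static void acct_batch_init(struct acct_batch *b, acct_flush_fn flush, void *ctx)
{
    memset(b, 0, sizeof(*b));
    b->flush = flush;
    b->ctx = ctx;
}

/* Called once per task exit: copy the record, write when the buffer fills. */
static void acct_batch_add(struct acct_batch *b, const void *rec)
{
    assert(b->used < ACCT_BATCH);
    memcpy(b->buf + (size_t)b->used * ACCT_REC_SIZE, rec, ACCT_REC_SIZE);
    if (++b->used == ACCT_BATCH) {
        b->flush(b->buf, ACCT_BUF_SIZE, b->ctx);
        b->used = 0;
    }
}

/* Write out a partially filled buffer, e.g. when accounting is stopped. */
static void acct_batch_drain(struct acct_batch *b)
{
    if (b->used) {
        b->flush(b->buf, (size_t)b->used * ACCT_REC_SIZE, b->ctx);
        b->used = 0;
    }
}

/* Counting sink used to observe how many writes were issued. */
struct acct_sink {
    unsigned int writes;
    size_t       bytes;
};

static void acct_count_flush(const void *buf, size_t len, void *ctx)
{
    struct acct_sink *s = ctx;

    (void)buf;
    s->writes++;
    s->bytes += len;
}
```

With this layout, 20 task exits on one CPU produce two full 512 byte
writes, and acct_batch_drain() pushes out the remaining 4 records. The
drain step matters for the point raised above: records sitting in a
buffer have not yet reached the disk, so something has to bound how
long they can wait there.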