From: Jay Lan <jlan@engr.sgi.com>
To: Paul Jackson <pj@engr.sgi.com>
Cc: johnpol@2ka.mipt.ru, guillaume.thouvenin@bull.net, akpm@osdl.org,
greg@kroah.com, linux-kernel@vger.kernel.org,
efocht@hpce.nec.com, linuxram@us.ibm.com, gh@us.ibm.com,
elsa-devel@lists.sourceforge.net
Subject: Re: [patch 1/2] fork_connector: add a fork connector
Date: Tue, 29 Mar 2005 13:09:44 -0800
Message-ID: <4249C418.5040007@engr.sgi.com>
In-Reply-To: <20050329090304.23fbb340.pj@engr.sgi.com>
Paul,
The fork_connector is not designed to solve the accounting data
collection problem.
Accounting data collection must be done via a hook from do_exit().
The acct_process() hook invokes do_acct_process() to write BSD
accounting data to disk. CSA needs a similar hook off do_exit() to
collect more accounting data and write it to disk in a different
accounting record format. That hook is not part of fork_connector.
ELSA does not care how accounting data is written to disk. However,
it needs the accounting data to be reliably and accurately collected
and written to disk by BSD and/or CSA, and it needs the <ppid, pid>
information to aggregate processes. It was never the fork_connector's
intention to piggyback that data onto the accounting file.
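To make the <ppid, pid> requirement concrete, here is a minimal
userspace sketch of how a listener might use fork events to keep a
child in the same group as its parent. The names job_table, job_assign
and job_on_fork, the table size, and the linear lookup are all
assumptions made for illustration; this is not ELSA's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 1024        /* hypothetical table size */

/* Hypothetical mapping from a process to the job it belongs to. */
struct job_entry {
    int pid;
    int job_id;                 /* must be >= 0; -1 means "no job" */
};

struct job_table {
    struct job_entry entries[MAX_TRACKED];
    size_t count;
};

/* Return the job id of pid, or -1 if pid is not tracked. */
static int job_lookup(const struct job_table *t, int pid)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->entries[i].pid == pid)
            return t->entries[i].job_id;
    return -1;
}

/* Put pid into job_id. Returns 0 on success, -1 if the table is full. */
static int job_assign(struct job_table *t, int pid, int job_id)
{
    assert(job_id >= 0);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].pid == pid) {
            t->entries[i].job_id = job_id;
            return 0;
        }
    }
    if (t->count == MAX_TRACKED)
        return -1;
    t->entries[t->count].pid = pid;
    t->entries[t->count].job_id = job_id;
    t->count++;
    return 0;
}

/*
 * Handle one <ppid, pid> fork event: the child inherits the parent's
 * job. Events whose parent is not tracked are ignored (returns 0).
 */
static int job_on_fork(struct job_table *t, int ppid, int pid)
{
    int job = job_lookup(t, ppid);

    if (job < 0)
        return 0;
    return job_assign(t, pid, job);
}
```

Nothing in this sketch touches the accounting file; it only needs the
pair of pids delivered at fork time.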
Thanks,
- jay
Paul Jackson wrote:
> Evgeniy writes:
>
>>Here forking connector module "exits" and can handle next fork() on the
>>same CPU.
>
>
> Fine ... but it's not about what the fork_connector does. It's about
> getting the accounting data to disk, if I understand correctly.
>
>
>>That is why it is very fast in "fast-path".
>
>
> I don't care how fast a tool is. I care how fast the job gets done. If
> a tool is only doing part of the job, then we can't decide whether to
> use that tool just based on how fast that part of the job gets done.
>
>
>>The most expensive part is cn_netlink_send()/netlink_broadcast(),
>>with CBUS it is deferred to the safe time,
>
>
> This is "safe time" for the immediate purpose of letting the forking
> process continue on its way. But the deferred work of buffering up the
> data and writing it to disk still needs to be done, pretty soon. When
> sizing a system to see how many users or jobs I can run on it at a time,
> I will have to include sufficient cpu, memory and disk i/o to handle
> getting this accounting data to disk, right?
>
>
>>> 2) Using a modified form of what BSD ACCOUNTING does now:
>>> - forking process appends single fork data to in-kernel buffer
>>
>>It is not that simple.
>>It takes global locks several times and accesses a bunch of data
>>shared between CPUs.
>>It calls ->stat() and ->write() which may sleep.
>
>
> Hmmm ... good points. The mechanisms in the kernel now (and for the
> last 25 years) to write out BSD ACCOUNTING data may not be NUMA friendly.
>
> Perhaps there should be a per-cpu 512 byte buffer, which can gather up 8
> accounting records (64 bytes each) and only call the file system write
> once every 8 task exits. Or perhaps a per-node buffer, with a spinlock
> to serialize access by the CPUs on that node. Or perhaps per-node
> accounting files. Or something like that.
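
The per-cpu batching idea quoted above could be sketched roughly as
follows, in plain userspace C. The 512-byte buffer and 64-byte records
come from the quoted text; struct acct_batch, acct_batch_add, the sink
callback standing in for the file system write, and the retry-on-full
behaviour are assumptions for illustration, not existing kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ACCT_REC_SIZE      64   /* record size from the quoted text */
#define ACCT_BUF_SIZE      512  /* per-CPU buffer size from the quoted text */
#define ACCT_RECS_PER_BUF  (ACCT_BUF_SIZE / ACCT_REC_SIZE)  /* 8 */

/* Hypothetical fixed-size record; only its size matters here. */
struct acct_rec {
    unsigned char bytes[ACCT_REC_SIZE];
};

/* Hypothetical per-CPU batch: one instance per CPU, so no lock. */
struct acct_batch {
    struct acct_rec recs[ACCT_RECS_PER_BUF];
    size_t used;                /* records currently buffered */
    size_t flushes;             /* successful calls to the sink */
};

/* Stand-in for the file system write; returns 0 on success. */
typedef int (*acct_sink_fn)(const void *buf, size_t len, void *ctx);

/* Example sink that only counts bytes; ctx points to a size_t total. */
static int acct_count_sink(const void *buf, size_t len, void *ctx)
{
    (void)buf;
    *(size_t *)ctx += len;
    return 0;
}

/* Hand all buffered records to the sink in a single call. */
static int acct_batch_flush(struct acct_batch *b, acct_sink_fn sink, void *ctx)
{
    int ret;

    if (b->used == 0)
        return 0;
    ret = sink(b->recs, b->used * sizeof(b->recs[0]), ctx);
    if (ret == 0) {
        b->used = 0;
        b->flushes++;
    }
    return ret;
}

/*
 * Buffer one record at task exit. The sink is called only when the
 * eighth record arrives. If an earlier flush failed and the buffer is
 * still full, retry first and report the error without storing r.
 */
static int acct_batch_add(struct acct_batch *b, const struct acct_rec *r,
                          acct_sink_fn sink, void *ctx)
{
    if (b->used == ACCT_RECS_PER_BUF) {
        int ret = acct_batch_flush(b, sink, ctx);

        if (ret != 0)
            return ret;
    }
    assert(b->used < ACCT_RECS_PER_BUF);
    b->recs[b->used++] = *r;
    if (b->used == ACCT_RECS_PER_BUF)
        return acct_batch_flush(b, sink, ctx);
    return 0;
}
```

A per-node variant would wrap acct_batch_add in a lock shared by the
CPUs on that node; the batching logic itself would stay the same.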
>
> Guillaume, Jay - do we (you ?) need to make classic BSD ACCOUNTING data
> collection numa friendly? Based on the various frustrated comments at
> the top of kernel/acct.c, this could be a non-trivial effort to get
> right. Maybe we need it, but can't afford it.
>
> And perhaps my proposed variable length records for supplementary
> accounting, such as <parent pid, child pid> from fork, need to allow
> for some way to pad out the rest of a buffer, when the next record
> won't fit entirely.
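
One way the padding idea quoted above might look, as a userspace
sketch: every record carries a small header with its type and length,
and when the next record does not fit, the tail of the buffer is
covered by a pad record that a reader can skip. The header layout, the
REC_PAD and REC_FORK type codes, the 4-byte alignment rule and the
return codes are all assumptions for illustration; the quoted proposal
does not specify any of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 512

enum { REC_PAD = 0, REC_FORK = 1 };     /* hypothetical type codes */

/* Hypothetical 4-byte header preceding every record. */
struct rec_hdr {
    uint16_t type;
    uint16_t len;       /* total record length in bytes, header included */
};

struct rec_buf {
    unsigned char data[BUF_SIZE];
    size_t used;        /* always a multiple of 4 */
};

/* Cover the unused tail with a pad record so readers can skip it. */
static void rec_buf_pad(struct rec_buf *b)
{
    size_t left;

    assert(b->used <= BUF_SIZE);
    left = BUF_SIZE - b->used;
    memset(b->data + b->used, 0, left);
    if (left >= sizeof(struct rec_hdr)) {
        struct rec_hdr h = { REC_PAD, (uint16_t)left };

        memcpy(b->data + b->used, &h, sizeof h);
    }
    b->used = BUF_SIZE;
}

/*
 * Append one variable-length record, rounded up to 4 bytes.
 * Returns 0 if stored, 1 if the record did not fit (the buffer has
 * been padded; the caller should flush it, reset used to 0 and retry),
 * or -1 if the record can never fit.
 */
static int rec_buf_append(struct rec_buf *b, uint16_t type,
                          const void *payload, size_t payload_len)
{
    size_t raw, total;
    struct rec_hdr h;

    if (payload_len > BUF_SIZE - sizeof(struct rec_hdr))
        return -1;
    raw = sizeof(struct rec_hdr) + payload_len;
    total = (raw + 3) & ~(size_t)3;
    if (total > BUF_SIZE - b->used) {
        rec_buf_pad(b);
        return 1;
    }
    h.type = type;
    h.len = (uint16_t)total;
    memcpy(b->data + b->used, &h, sizeof h);
    memcpy(b->data + b->used + sizeof h, payload, payload_len);
    memset(b->data + b->used + raw, 0, total - raw);
    b->used += total;
    return 0;
}
```

With an 8-byte <parent pid, child pid> payload each record takes 12
bytes, so a 512-byte buffer holds 42 of them and ends with an 8-byte
pad record.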
>
>
>>That work is deferred and does not affect in-kernel processes.
>
>
> The accounting data collection cannot be deferred for long, perhaps
> just a few minutes. Not until the data hits the disk can we rest
> indefinitely. Unless, that is, I don't understand what problem is
> being solved here (quite possible ;).
>
>
>>And why should a userspace fork connector write data to the disk?
>
>
> I NEVER said it should. I am NOT trying to redesign fork_connector.
>
> Good grief ... how many times and ways do I have to say this ;)?
>
> I am asking what is the best tool for accounting data collection,
> which, if I understand correctly, does need to write to disk.
>