From: Jay Lan <jlan@engr.sgi.com>
To: Andrew Morton <akpm@osdl.org>
Cc: roland <devzero@web.de>, Fengguang Wu <fengguang.wu@gmail.com>,
linux-kernel@vger.kernel.org, lserinol@gmail.com
Subject: Re: I/O statistics per process
Date: Thu, 28 Sep 2006 15:00:17 -0700
Message-ID: <451C45F1.1050604@engr.sgi.com>
In-Reply-To: <20060928120952.9f09cbf7.akpm@osdl.org>
Andrew Morton wrote:
> On Thu, 28 Sep 2006 11:55:38 -0700
> Jay Lan <jlan@engr.sgi.com> wrote:
>
>> Andrew Morton wrote:
>>> On Wed, 27 Sep 2006 23:22:02 +0200
>>> "roland" <devzero@web.de> wrote:
>>>
>>>> thanks. tried to contact redflag, but they don't answer. maybe support is
>>>> on holiday.... !?
>>>>
>>>> linux kernel hackers - there is really no standard way to watch i/o metrics
>>>> (bytes read/written) at process level?
>>> The patch csa-accounting-taskstats-update.patch in current -mm kernels
>>> (which is planned for 2.6.19) does have per-process chars-read and
>>> chars-written accounting ("Extended accounting fields"). That's probably
>>> not what you really want, although it might tell you what you want to know.
>>>
>>>> it's extremely hard for the admin to track down which process is hogging the
>>>> disk - especially if there is more than one task consuming cpu.
>> Roland,
>>
>> The per-process chars-read and chars-written accounting is made
>> available through the taskstats interface (see
>> Documentation/accounting/taskstats.txt) in the 2.6.18-mm1 kernel.
>> Unfortunately, the user-space CSA package is still a few months away.
>> You may, for now, write your own taskstats application or go a long
>> way to port the in-kernel implementation of pagg/job/csa.
>>
>> However, the "Extended accounting fields" patch does not give you a
>> straightforward answer. The patch provides accounting data only at
>> process termination (just like BSD accounting), and it seems that
>> you want to see which runaway application (i.e., one still alive) is
>> eating up your disk. The taskstats interface offers a query mode
>> (command-response), but currently only delayacct uses that mode. We
>> would need to make those data available in the query mode for
>> applications to see accounting data of live processes.
>
> ow. That is a rather important enhancement to have.
Yes, it is needed to provide accounting on live processes. Both
BSD and CSA traditionally focused on completed processes. I guess
that is the difference between system accounting and system
monitoring?
I can certainly make this enhancement. :)
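
For readers who want to see what the query mode (command-response) mentioned
above looks like on the wire, here is a minimal sketch in C. It only builds
the request: a generic netlink message carrying TASKSTATS_CMD_GET with a
TASKSTATS_CMD_ATTR_PID attribute, using the constants from
<linux/taskstats.h>. The names ts_query and ts_build_pid_query are made up
for this illustration. The family id is assumed to have been resolved
already (normally by asking the generic netlink controller for the
"TASKSTATS" family), and which fields the reply carries depends on the
kernel; per the discussion above, only delay accounting data was exposed
this way at the time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>
#include <linux/taskstats.h>

/* Hypothetical helper type: one complete "query this pid" request. */
struct ts_query {
    struct nlmsghdr   nl;    /* netlink header */
    struct genlmsghdr genl;  /* generic netlink header */
    char              attrs[NLA_ALIGN(NLA_HDRLEN + sizeof(uint32_t))];
};

/*
 * Fill *q with a TASKSTATS_CMD_GET request for one pid.
 * family_id is an assumption: the caller must have resolved it beforehand.
 * Returns the number of bytes that would be handed to send().
 */
static size_t ts_build_pid_query(struct ts_query *q, uint16_t family_id,
                                 uint32_t pid, uint32_t seq)
{
    struct nlattr na;

    assert(q != NULL);
    memset(q, 0, sizeof(*q));

    /* Generic netlink routes on nlmsg_type == family id. */
    q->nl.nlmsg_type  = family_id;
    q->nl.nlmsg_flags = NLM_F_REQUEST;
    q->nl.nlmsg_seq   = seq;

    q->genl.cmd     = TASKSTATS_CMD_GET;
    q->genl.version = TASKSTATS_GENL_VERSION;

    /* Single attribute: the pid whose statistics are requested. */
    na.nla_len  = NLA_HDRLEN + sizeof(pid);
    na.nla_type = TASKSTATS_CMD_ATTR_PID;
    memcpy(q->attrs, &na, sizeof(na));
    memcpy(q->attrs + NLA_HDRLEN, &pid, sizeof(pid));

    q->nl.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN + NLA_ALIGN(na.nla_len));
    return q->nl.nlmsg_len;
}
```

A caller would open a NETLINK_GENERIC socket, send the returned number of
bytes starting at q, and parse the reply attributes into a struct taskstats.
Those steps are left out to keep the sketch short.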
>
>>> csa-accounting-taskstats-update.patch makes that information available to
>>> userspace.
>>>
>>> But it's approximate, because
>>>
>>> - it doesn't account for disk readahead
>>>
>>> - it doesn't account for pagefault-initiated reads (although it easily
>>> could - Jay?)
>>>
>>> - it overaccounts for a process writing to an already-dirty page.
>>>
>>> (We could fix this too: nuke the existing stuff and do
>>>
>>> current->wchar += PAGE_CACHE_SIZE;
>>>
>>> in __set_page_dirty_[no]buffers().) (But that ends up being wrong if
>>> someone truncates the file before it got written)
>>>
>>> - it doesn't account for file readahead (although it easily could)
>>>
>>> - it doesn't account for pagefault-initiated readahead (it could)
>>>
Mmm, I am not a true FS I/O person. The data collection patches I
submitted in Nov 2004 were code I inherited, and that code has been
used in production systems by our CSA customers. We lost a bit of
content and accuracy when CSA was ported from IRIX to Linux. I am
sure there is room for improvement without much overhead. Maybe the FS
I/O guys can chip in?
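
To make the overaccounting point from the quoted list concrete, here is a
small user-space model in C. It is not kernel code. It compares two ways of
charging writes to a task: per write call (bytes requested) and per page
that goes from clean to dirty, which is the idea behind the quoted
"current->wchar += PAGE_CACHE_SIZE" suggestion. All names (sketch_task,
sketch_file, sketch_write, sketch_writeback) and the 4096-byte page size are
assumptions made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed stand-in for PAGE_CACHE_SIZE */
#define SKETCH_NPAGES    16u    /* size of the toy file, in pages */

/* Two hypothetical counters for the same task. */
struct sketch_task {
    unsigned long long wchar_syscall; /* charged per write call */
    unsigned long long wchar_dirty;   /* charged per newly dirtied page */
};

/* Toy page cache state for one file: which pages are dirty. */
struct sketch_file {
    bool dirty[SKETCH_NPAGES];
};

/* Model a write of len bytes at byte offset off. */
static void sketch_write(struct sketch_task *t, struct sketch_file *f,
                         size_t off, size_t len)
{
    size_t first, last, p;

    assert(len > 0);
    assert(off + len <= (size_t)SKETCH_NPAGES * SKETCH_PAGE_SIZE);

    /* Per-call accounting: every call counts, even on a dirty page. */
    t->wchar_syscall += len;

    /* Per-page accounting: charge only on a clean -> dirty transition. */
    first = off / SKETCH_PAGE_SIZE;
    last  = (off + len - 1) / SKETCH_PAGE_SIZE;
    for (p = first; p <= last; p++) {
        if (!f->dirty[p]) {
            f->dirty[p] = true;
            t->wchar_dirty += SKETCH_PAGE_SIZE;
        }
    }
}

/* Model writeback: all pages reach disk and become clean again. */
static void sketch_writeback(struct sketch_file *f)
{
    memset(f->dirty, 0, sizeof(f->dirty));
}
```

In this model, ten 100-byte writes to the same offset raise wchar_syscall by
1000 but wchar_dirty by only one page (4096), since the page reaches disk
once. The model also shows the per-page variant rounding small writes up to
a full page, and it ignores the truncate case raised in the quoted text.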
>>>
>>> hm. There's actually quite a lot we could do here to make these fields
>>> more accurate and useful. A lot of this depends on what the definition of
>>> these fields _is_. Is it just for disk IO? Is it supposed to include
>>> console IO, or what?
Yes, the char_read and char_written fields are only for disk I/O.
>
> I'd be interested in your opinions on all the above, please.
Sorry, I cannot answer you on the data collection code.
Thanks,
- jay
>