public inbox for linux-kernel@vger.kernel.org
From: Jay Lan <jlan@engr.sgi.com>
To: Andrew Morton <akpm@osdl.org>
Cc: roland <devzero@web.de>, Fengguang Wu <fengguang.wu@gmail.com>,
	linux-kernel@vger.kernel.org, lserinol@gmail.com
Subject: Re: I/O statistics per process
Date: Thu, 28 Sep 2006 15:00:17 -0700
Message-ID: <451C45F1.1050604@engr.sgi.com>
In-Reply-To: <20060928120952.9f09cbf7.akpm@osdl.org>

Andrew Morton wrote:
> On Thu, 28 Sep 2006 11:55:38 -0700
> Jay Lan <jlan@engr.sgi.com> wrote:
> 
>> Andrew Morton wrote:
>>> On Wed, 27 Sep 2006 23:22:02 +0200
>>> "roland" <devzero@web.de> wrote:
>>>
>>>> thanks. tried to contact redflag, but they don't answer. maybe support is 
>>>> on holiday...?
>>>>
>>>> linux kernel hackers - there is really no standard way to watch i/o metrics 
>>>> (bytes read/written) at process level?
>>> The patch csa-accounting-taskstats-update.patch in current -mm kernels
>>> (which is planned for 2.6.19) does have per-process chars-read and
>>> chars-written accounting ("Extended accounting fields").  That's probably
>>> not what you really want, although it might tell you what you want to know.
>>>
>>>> it's extremely hard for the admin to track down which process is hogging the 
>>>> disk - especially if there is more than one task consuming cpu.
>> Roland,
>>
>> The per-process chars-read and chars-written accounting is made
>> available through the taskstats interface (see Documentation/accounting/
>> taskstats.txt) in the 2.6.18-mm1 kernel. Unfortunately, the user-space CSA
>> package is still a few months away. You may, for now, write your
>> own taskstats application or go the long way and port the in-kernel
>> implementation of pagg/job/csa.
>>
>> However, the "Extended accounting fields" patch does not give you a
>> straightforward answer. The patch provides accounting data only at
>> process termination (just like BSD accounting), and it seems that
>> you want to see which runaway application (i.e., still alive) is eating
>> up your disk. The taskstats interface offers a query mode (command-response),
>> but currently only delayacct uses that mode. We would need to make
>> those data available in the query mode in order for applications to
>> see accounting data of live processes.
> 
> ow.  That is a rather important enhancement to have.

Yes, it is needed to provide accounting on live processes. Both
BSD and CSA traditionally focused on completed processes. I guess
that is the difference between system accounting and system
monitoring?

I certainly can make this enhancement. :)
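
To make the distinction concrete, here is a minimal sketch in plain
user-space C. It is only a toy model and not the taskstats netlink API:
struct task_io, exit_report() and query_report() are hypothetical names
invented for illustration, with rchar/wchar standing in for the
chars-read and chars-written counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task counters (illustrative, not a kernel struct). */
struct task_io {
    int pid;
    int alive;                  /* 1 while running, 0 after exit */
    unsigned long long rchar;   /* chars read so far */
    unsigned long long wchar;   /* chars written so far */
};

/*
 * Exit-time model (BSD/CSA style): a record exists only once the task
 * has terminated. Returns 0 and fills *out on success, -1 otherwise.
 */
int exit_report(const struct task_io *t, struct task_io *out)
{
    assert(t != NULL && out != NULL);
    if (t->alive)
        return -1;              /* still running: nothing to report yet */
    *out = *t;
    return 0;
}

/*
 * Query model (command-response): a snapshot of the counters is
 * returned on request, whether or not the task has exited.
 */
int query_report(const struct task_io *t, struct task_io *out)
{
    assert(t != NULL && out != NULL);
    *out = *t;
    return 0;
}
```

In this model a runaway process that never exits never produces an
exit-time record, so only the query path can show its counters while it
is still running.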

> 
>>> csa-accounting-taskstats-update.patch makes that information available to
>>> userspace.
>>>
>>> But it's approximate, because
>>>
>>> - it doesn't account for disk readahead
>>>
>>> - it doesn't account for pagefault-initiated reads (although it easily
>>>   could - Jay?)
>>>
>>> - it overaccounts for a process writing to an already-dirty page.
>>>
>>>   (We could fix this too: nuke the existing stuff and do
>>>
>>> 	current->wchar += PAGE_CACHE_SIZE;
>>>
>>>    in __set_page_dirty_[no]buffers().) (But that ends up being wrong if
>>>    someone truncates the file before it got written)
>>>
>>> - it doesn't account for file readahead (although it easily could)
>>>
>>> - it doesn't account for pagefault-initiated readahead (it could)
>>>
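
As a toy illustration of the over-accounting point in the list above,
the following user-space C sketch compares two schemes: adding the byte
count on every write call, and adding one page only when a page goes
from clean to dirty (the idea behind the quoted
"current->wchar += PAGE_CACHE_SIZE" suggestion). Everything here is an
assumption made for illustration: MODEL_PAGE_SIZE, struct model_file,
struct model_task, model_write() and model_writeback() are hypothetical
and do not correspond to kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096ULL   /* stand-in for PAGE_CACHE_SIZE */
#define MODEL_NPAGES    8

/* A file modelled as a fixed array of per-page dirty flags. */
struct model_file {
    bool dirty[MODEL_NPAGES];
};

struct model_task {
    unsigned long long wchar_per_call;   /* bytes added on every write */
    unsigned long long wchar_per_dirty;  /* one page added per clean->dirty */
};

/* Model a write of len bytes at byte offset off. */
void model_write(struct model_task *t, struct model_file *f,
                 size_t off, size_t len)
{
    assert(len > 0 && off + len <= MODEL_NPAGES * MODEL_PAGE_SIZE);

    /* Scheme 1: count every byte passed to write. */
    t->wchar_per_call += len;

    /* Scheme 2: count a page only when it becomes dirty. */
    size_t first = off / MODEL_PAGE_SIZE;
    size_t last = (off + len - 1) / MODEL_PAGE_SIZE;
    for (size_t i = first; i <= last; i++) {
        if (!f->dirty[i]) {
            f->dirty[i] = true;
            t->wchar_per_dirty += MODEL_PAGE_SIZE;
        }
    }
}

/* Model writeback: every dirty page reaches disk and becomes clean. */
void model_writeback(struct model_file *f)
{
    for (size_t i = 0; i < MODEL_NPAGES; i++)
        f->dirty[i] = false;
}
```

Writing the same 100 bytes twice before writeback adds 200 to the
per-call counter but only one page to the per-dirty counter, which is
closer to what eventually goes to disk. The per-dirty scheme counts
whole pages, and, as noted above, it is still wrong if the file is
truncated before the page is written.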

Mmm, I am not a true FS I/O person. The data collection patches I
submitted in Nov 2004 were code I inherited, which has been used
in production systems by our CSA customers. We lost a bit of
content and accuracy when CSA was ported from IRIX to Linux. I am
sure there is room for improvement without much overhead. Maybe the
FS I/O guys can chip in?

>>>
>>> hm.  There's actually quite a lot we could do here to make these fields
>>> more accurate and useful.  A lot of this depends on what the definition of
>>> these fields _is_.  Is it just for disk IO?  Is it supposed to include
>>> console IO, or what?

Yes, the char_read and char_written fields are only for disk I/O.

> 
> I'd be interested in your opinions on all the above, please.

Sorry, I cannot answer you on the data collection code.

Thanks,
 - jay

> 


