From: Jay Lan <jlan@engr.sgi.com>
To: Andrew Morton <akpm@osdl.org>
Cc: roland <devzero@web.de>, Fengguang Wu <fengguang.wu@gmail.com>,
	linux-kernel@vger.kernel.org, lserinol@gmail.com
Subject: Re: I/O statistics per process
Date: Thu, 28 Sep 2006 15:00:17 -0700
Message-ID: <451C45F1.1050604@engr.sgi.com>
In-Reply-To: <20060928120952.9f09cbf7.akpm@osdl.org>

Andrew Morton wrote:
> On Thu, 28 Sep 2006 11:55:38 -0700
> Jay Lan <jlan@engr.sgi.com> wrote:
> 
>> Andrew Morton wrote:
>>> On Wed, 27 Sep 2006 23:22:02 +0200
>>> "roland" <devzero@web.de> wrote:
>>>
>>>> thanks. tried to contact redflag, but they don't answer. maybe support is
>>>> on holiday.... !?
>>>>
>>>> linux kernel hackers - there is really no standard way to watch i/o metrics 
>>>> (bytes read/written) at process level?
>>> The patch csa-accounting-taskstats-update.patch in current -mm kernels
>>> (which is planned for 2.6.19) does have per-process chars-read and
>>> chars-written accounting ("Extended accounting fields").  That's probably
>>> not what you really want, although it might tell you what you want to know.
>>>
>>>> it's extremely hard for the admin to track down what process is hogging the
>>>> disk - especially if there is more than one task consuming cpu.
>> Roland,
>>
>> The per-process chars-read and chars-written accounting is made
>> available through the taskstats interface (see Documentation/accounting/
>> taskstats.txt) in the 2.6.18-mm1 kernel. Unfortunately, the user-space CSA
>> package is still a few months away. You may, for now, write your
>> own taskstats application or go a long way to port the in-kernel
>> implementation of pagg/job/csa.
>>
>> However, the "Extended accounting fields" patch does not give you
>> a straightforward answer. The patch provides accounting data only at
>> process termination (just like BSD accounting), and it seems that
>> you want to see which runaway (i.e., still alive) application is eating up
>> your disk. The taskstats interface offers a query mode (command-response),
>> but currently only delayacct uses that mode. We would need to make
>> those data available in the query mode in order for applications to
>> see accounting data of live processes.
> 
> ow.  That is a rather important enhancement to have.

Yes, that is needed to provide accounting on live processes. Both
BSD and CSA traditionally focused on completed processes. I guess
that is the difference between system accounting and system
monitoring?

I certainly can make this enhancement. :)

> 
>>> csa-accounting-taskstats-update.patch makes that information available to
>>> userspace.
>>>
>>> But it's approximate, because
>>>
>>> - it doesn't account for disk readahead
>>>
>>> - it doesn't account for pagefault-initiated reads (although it easily
>>>   could - Jay?)
>>>
>>> - it overaccounts for a process writing to an already-dirty page.
>>>
>>>   (We could fix this too: nuke the existing stuff and do
>>>
>>> 	current->wchar += PAGE_CACHE_SIZE;
>>>
>>>    in __set_page_dirty_[no]buffers().) (But that ends up being wrong if
>>>    someone truncates the file before it got written)
>>>
>>> - it doesn't account for file readahead (although it easily could)
>>>
>>> - it doesn't account for pagefault-initiated readahead (it could)
>>>

Mmm, I am not a true FS I/O person. The data collection patches I
submitted in Nov 2004 were code I inherited, and that code has been
used in production systems by our CSA customers. We lost a bit of
content and accuracy when CSA was ported from IRIX to Linux. I am
sure there is room for improvement without much overhead. Maybe the FS
I/O guys can chip in?

>>>
>>> hm.  There's actually quite a lot we could do here to make these fields
>>> more accurate and useful.  A lot of this depends on what the definition of
>>> these fields _is_.  Is it just for disk IO?  Is it supposed to include
>>> console IO, or what?

Yes, char_read and char_written are meant to cover disk I/O only.

> 
> I'd be interested in your opinions on all the above, please.

Sorry, I cannot answer you on the data collection code.

Thanks,
 - jay
