From: Shailabh Nagar <nagar@watson.ibm.com>
To: Parag Warudkar <kernel-stuff@comcast.net>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [Patch 1/4] Delay accounting: Initialization
Date: Tue, 15 Nov 2005 19:45:54 -0500
Message-ID: <437A8142.7030106@watson.ibm.com>
In-Reply-To: <4ABDC730-2888-4DBE-B1DC-62362A87EEB7@comcast.net>
Parag Warudkar wrote:
>
> On Nov 15, 2005, at 5:29 PM, Shailabh Nagar wrote:
<snip>
>
>>> Does this mean, whether or not the per task delay accounting is used,
>>> we have a constant overhead of sizeof(spinlock_t) + 2*sizeof(uint32_t)
>>> + 2*sizeof(uint64_t) bytes going into the struct task_struct? Is it
>>> possible/beneficial to use struct task_delay_info *delays instead and
>>> allocate it if task wants to use the information?
>>>
>>
>> Doing so would have value in the case where the feature is configured
>> but no one ever registers to listen for it.
>
>
> Precisely. Such a feature will be used only occasionally I suppose.
> I didn't read the code deeply but are any scheduling decisions altered
> based on this data? If not, then it makes sense to not account unless required.
>
> I think it should be possible to do it on demand, per process instead
> of forcing the accounting on _all_ processes which cumulatively becomes a sizeable
> o/h.
>
> Per Process activation of this feature will add significant value IMHO.
> (Of course, if that's possible in the first place.)
Per-task activation is useful/possible only for long-running tasks.
If one is trying to gather stats for a user-defined grouping of tasks,
requiring monitoring to be turned on for each task individually would
involve too much overhead and inaccuracy.
>
>> The cost of doing this would be
>> - adding more code to the fork path to allocate conditionally
>
>
> Just an unlikely branch for normal code path - not a big deal.
> Also I am thinking it could be handled outside of fork?
Only if per-task activation is done - that's probably what you meant?
>
>> - make the collecting of the delays conditional on a similar check
>
>
> Weighing this against the actual accounting - I think it's a win.
Hmmm.. since there is locking involved in the stats collection, this is
starting to make a lot of sense.
>
>> - cache pollution from following an extra pointer in the pgflt/
>> io_schedule paths
>> I'm not sure if this really matters for these two code paths.
>
>
> Possibly.
>
>> Even if one does this, once the first listener registers, all future
>> tasks
>> (and even the current ones) will have to go ahead and allocate the
>> structure
>> and accounting of delays will have to switch to unconditional mode.
>> This is
>> because the delay data has cumulative value...future listeners will be
>> interested in data collected earlier (as long as task is still
>> running). And
>> once the first listener registers, you can no longer be sure no one's
>> interested
>> in the future.
>>
>
> Is it possible to do it per process? Forcing it on all processes is
> what I was trying to avoid given the feature's usage pattern.
>
>> Another alternative is to let userland control the overhead of
>> allocation and
>> collection completely through a /proc/sys/kernel/delayacct variable.
>> When its switched on, it triggers an allocation for all existing
>> tasks in the
>> system, turns on allocation in fork() for future tasks, and
>> collection of the stats.
>> When turned off, collection of stats stops as does allocation for
>> future tasks
>> (not worth going in and deallocating structs for existing tasks).
>
>
>> Does this seem worth it ?
>>
>
> Definitely not unless we can do it per process and on demand.
Per-task doesn't seem like a good idea. Neither does allowing dynamic
switching off/on of the allocation of the delay struct.
So how about this:
Have /proc/sys/kernel/delayacct and a corresponding kernel boot parameter (for setting
the switch early) which control just the collection of data. Allocation always happens.
>> -- Shailabh
>>
>
> Cheers
>
> Parag
>