From: Shailabh Nagar <nagar@watson.ibm.com>
To: Parag Warudkar <kernel-stuff@comcast.net>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [Patch 1/4] Delay accounting: Initialization
Date: Tue, 15 Nov 2005 19:45:54 -0500 [thread overview]
Message-ID: <437A8142.7030106@watson.ibm.com> (raw)
In-Reply-To: <4ABDC730-2888-4DBE-B1DC-62362A87EEB7@comcast.net>
Parag Warudkar wrote:
>
> On Nov 15, 2005, at 5:29 PM, Shailabh Nagar wrote:
<snip>
>
>>> Does this mean, whether or not the per task delay accounting is used,
>>> we have a constant overhead of sizeof(spinlock_t) + 2*sizeof (uint32_t)
>>> + 2* sizeof(uint64_t) bytes going into the struct task_struct?. Is it
>>> possible/beneficial to use struct task_delay_info *delays instead and
>>> allocate it if task wants to use the information?
>>>
>>
>> Doing so would have value in the case where the feature is configured
>> but no one ever registers to listen for it.
>
>
> Precisely. Such a feature will be used only occasionally I suppose.
> I didn't read the code deeply but are any scheduling decisions altered
> based on this data? If not, then it makes sense to not account unless required.
>
> I think it should be possible to do it on demand, per process instead
> of forcing the accounting on _all_ processes which cumulatively becomes a sizeable
> o/h.
>
> Per Process activation of this feature will add significant value IMHO.
> (Of course, if that's possible in first place.)
Per-task activation is useful/possible only for long-running tasks.
If one is trying to gather stats for a user-defined grouping of tasks
then it would involve too much overhead & inaccuracy to require monitoring
to be turned on individually.
>
>> The cost of doing this would be
>> - adding more code to the fork path to allocate conditionally
>
>
> Just an unlikely branch for normal code path - not a big deal.
> Also I am thinking it could be handled outside of fork?
only if per-task activation is done - thats probably what you meant ?
>
>> - make the collecting of the delays conditional on a similar check
>
>
> Weighing this against the actual accounting - I think it's a win.
Hmmm..since there is locking involved in the stats collection, this is
starting to make a lot of sense.
>
>> - cache pollution from following an extra pointer in the pgflt/
>> io_schedule paths
>> I'm not sure is this really matters for these two code paths.
>
>
> Possibly.
>
>> Even if one does this, once the first listener registers, all future
>> tasks
>> (and even the current ones) will have to go ahead and allocate the
>> structure
>> and accounting of delays will have to switch to unconditional mode.
>> This is
>> because the delay data has cumulative value...future listeners will be
>> interested in data collected earlier (as long as task is still
>> running). And
>> once the first listener registers, you can no longer be sure no one's
>> interested
>> in the future.
>>
>
> Is it possible to do it per process? Forcing it on all processes is
> what I was trying to avoid given the feature's usage pattern.
>
>> Another alternative is to let userland control the overhead of
>> allocation and
>> collection completely through a /proc/sys/kernel/delayacct variable.
>> When its switched on, it triggers an allocation for all existing
>> tasks in the
>> system, turns on allocation in fork() for future tasks, and
>> collection of the stats.
>> When turned off, collection of stats stops as does allocation for
>> future tasks
>> (not worth going in and deallocating structs for existing tasks).
>
>
>> Does this seem worth it ?
>>
>
> Definitely not unless we can do it per process and on demand.
Per-task doesn't seem like a good idea. Neither does allowing dynamic
switching off/on of the allocation of the delay struct.
So how about this:
Have /proc/sys/kernel/delayacct and a corresponding kernel boot parameter (for setting
the switch early) which control just the collection of data. Allocation always happens.
>> -- Shailabh
>>
>
> Cheers
>
> Parag
>
next prev parent reply other threads:[~2005-11-16 0:41 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-11-15 4:35 [Patch 1/4] Delay accounting: Initialization Shailabh Nagar
2005-11-15 4:20 ` Andrew Morton
2005-11-15 6:49 ` Marcelo Tosatti
2005-11-15 15:19 ` Shailabh Nagar
2005-11-15 12:20 ` Marcelo Tosatti
2005-11-15 12:34 ` Marcelo Tosatti
2005-11-15 15:08 ` Shailabh Nagar
2005-11-16 1:06 ` Peter Chubb
2005-11-16 1:48 ` Shailabh Nagar
2005-11-16 1:50 ` Andi Kleen
2005-11-16 1:52 ` Peter Chubb
2005-11-15 4:25 ` Parag Warudkar
2005-11-15 22:29 ` Shailabh Nagar
2005-11-15 22:53 ` Parag Warudkar
2005-11-16 0:45 ` Shailabh Nagar [this message]
2005-11-16 2:41 ` Parag Warudkar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=437A8142.7030106@watson.ibm.com \
--to=nagar@watson.ibm.com \
--cc=kernel-stuff@comcast.net \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.