From: John Hesterberg <jh@sgi.com>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: Jes Sorensen <jes@trained-monkey.org>,
Shailabh Nagar <nagar@watson.ibm.com>,
Andrew Morton <akpm@osdl.org>, Jay Lan <jlan@engr.sgi.com>,
LKML <linux-kernel@vger.kernel.org>,
elsa-devel@lists.sourceforge.net, lse-tech@lists.sourceforge.net,
CKRM-Tech <ckrm-tech@lists.sourceforge.net>,
Paul Jackson <pj@sgi.com>, Erik Jacobson <erikj@sgi.com>,
Jack Steiner <steiner@sgi.com>
Subject: Re: [Lse-tech] Re: [ckrm-tech] Re: [PATCH 00/01] Move Exit Connectors
Date: Wed, 11 Jan 2006 15:39:10 -0600 [thread overview]
Message-ID: <20060111213910.GA17986@sgi.com> (raw)
In-Reply-To: <1137013330.6673.80.camel@stark>
On Wed, Jan 11, 2006 at 01:02:10PM -0800, Matt Helsley wrote:
> On Wed, 2006-01-11 at 05:36 -0500, Jes Sorensen wrote:
> > >>>>> "Shailabh" == Shailabh Nagar <nagar@watson.ibm.com> writes:
> >
> > Shailabh> Jes Sorensen wrote:
> > >> I am quite concerned about that lock your patches put into struct
> > >> task_struct through struct task_delay_info. Have you done any
> > >> measurements on how this impacts performance on highly threaded
> > >> apps on larger system?
> >
> > Shailabh> I don't expect the lock contention to be high. The lock is
> > Shailabh> held for a very short time (across two
> > Shailabh> additions/increments). Moreover, it gets contended only when
> > Shailabh> the stats are being read (either through /proc or
> > Shailabh> connectors). Since the reading of stats won't be that
> > Shailabh> frequent (the utility of these numbers is to influence the
> > Shailabh> I/O priority/rss limit etc. which won't be done at a very
> > Shailabh> small granularity anyway), I wouldn't expect a problem.
> >
> > Hi Shailabh,
> >
> > When this is read through connectors, it's initiated by the connectors
> > code which is called from the task's context hence we don't need
> > locking for that. It's very similar to the task_notify code I am about
> > to post and I think the connector code could fit into that
> > framework. The main issue is /proc, but then one could even have a
> > mechanism with a hook when the task exits that pushes the data to a
> > storage point which is lock protected.
> >
> > Even if a lock isn't contended, you are still going to see the cache
> > lines bounce around due to the writes. It may not show up on a 4-way
> > box but what happens on a 64-way? We have seen some pretty nasty
> > effects on the bigger SN2 boxes with locks that were taken far too
> > frequently, to the point where it would prevent the box from booting
> > (now I don't expect it to that severe here, but I'd still like to
> > explore an approach of doing it lock free).
> >
> > Shailabh> But its better to take some measurements anyway. Any
> > Shailabh> suggestions on a benchmark ?
> >
> > >> IMHO it seems to make more sense to use something like Jack's
> > >> proposed task_notifier code to lock-less collect the data into task
> > >> local data structures and then take the data from there and ship
> > >> off to userland through netlink or similar like you are doing?
> > >>
> > >> I am working on modifying Jack's patch to carry task local data and
> > >> use it for not just accounting but other areas that need optional
> > >> callbacks (optional in the sense that it's a feature that can be
> > >> enabled or disabled). Looking at Shailabh's delayacct_blkio()
> > >> changes it seems that it would be really easy to fit those into
> > >> that framework.
> > >>
> > >> Guess I should post some of this code .....
> >
> > Shailabh> Please do. If this accounting can fit into some other
> > Shailabh> framework, thats fine too.
> >
> > Ok, finally, sorry for the delay. My current code snapshot is
> > available at http://www.trained-monkey.org/~jes/patches/task_notify/ -
> > it's a modified version of Jack's task_notify code, and three example
> > users of it (the SysV IPC semundo semaphore, the key infrastructure
> > and SGI's JOB module). The patch order should be task_notify.diff,
> > task-notify-keys.diff, task-notify-semundo.diff, and
> > task_notify-job.diff last.
>
> I can already tell you I don't like the "magic" mechanism to identify
> notifier blocks. The problem is that it's yet another space of id
> numbers that we have to manage -- either manually, by having a person
> hand the numbers out to developers, or automatically using the idr code.
> You could use the notifier block's address and avoid an additional id
> space.
>
> Also, even if this mechanism goes into task_notify it needs a better
> name than "magic".
>
> > I think task_notify it should be usable for statistics gathering as
> > well, the only issue is how to attach it to the processes we wish to
> > gather accounting for. Personally I am not a big fan of the current
> > concept where statistics are gathered for all tasks at all time but
> > just not exported until accounting is enabled.
>
> Have you looked at Alan Stern's notifier chain fix patch? Could that be
> used in task_notify?
>
> If not, perhaps the patch use the standard kernel list idioms.
>
> Another potential user for the task_notify functionality is the process
> events connector. The problem is it requires the ability to receive
> notifications about all processes. Also, there's a chance that future
> CKRM code could use all-task and per-task notification. Combined with
> John Hesterberg's feedback I think there is strong justification for an
> all-tasks notification interface.
I have two concerns about an all-tasks notification interface.
First, we want this to scale, so don't want more global locks.
One unique part of the task notify is that it doesn't use locks.
Second, in at least some of the cases we're familiar with,
even when we might need all-tasks notification we still need task-local
data. That's been a problem with some of the global mechanisms I've
seen discussed.
Cheers,
John
next prev parent reply other threads:[~2006-01-11 21:40 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-01-03 23:16 [Patch 0/6] Per-task delay accounting Shailabh Nagar
2006-01-03 23:23 ` [Patch 1/6] Delay accounting: timespec diff Shailabh Nagar
2006-01-03 23:26 ` [Patch 2/6] Delay accounting: Initialization, kernel boot option Shailabh Nagar
2006-01-03 23:28 ` [Patch 3/6] Delay accounting: Sync block I/O delays Shailabh Nagar
2006-01-03 23:30 ` [Patch 4/6] Delay accounting: Swap in delays Shailabh Nagar
2006-01-03 23:31 ` [Patch 5/6] Delay accounting: /proc interface Shailabh Nagar
2006-01-03 23:33 ` [Patch 6/6] Delay accounting: Connector interface Shailabh Nagar
2006-01-04 0:21 ` Greg KH
2006-01-04 0:42 ` Shailabh Nagar
2006-01-04 0:51 ` Greg KH
2006-01-04 7:49 ` [Lse-tech] " Shailabh Nagar
2006-01-04 19:04 ` Jay Lan
2006-01-04 21:31 ` Shailabh Nagar
2006-01-04 22:40 ` [ckrm-tech] " Matt Helsley
2006-01-04 23:17 ` Andrew Morton
2006-01-05 18:42 ` [PATCH 00/01] Move Exit Connectors Matt Helsley
2006-01-05 19:17 ` [PATCH 01/01][RFC] " Matt Helsley
2006-01-05 19:20 ` [PATCH 00/01] " Matt Helsley
2006-01-05 23:10 ` Andrew Morton
2006-01-06 0:06 ` [ckrm-tech] " Matt Helsley
2006-01-06 8:57 ` [Lse-tech] " Jes Sorensen
2006-01-06 16:45 ` Shailabh Nagar
2006-01-11 10:36 ` Jes Sorensen
2006-01-11 12:56 ` John Hesterberg
2006-01-11 13:50 ` Jes Sorensen
2006-01-11 21:02 ` Matt Helsley
2006-01-11 21:39 ` John Hesterberg [this message]
2006-01-11 22:42 ` Matt Helsley
2006-01-12 10:01 ` Jes Sorensen
2006-01-12 23:20 ` Matt Helsley
2006-01-13 9:35 ` Jes Sorensen
2006-01-14 7:23 ` Matt Helsley
2006-01-12 3:29 ` Keith Owens
2006-01-12 5:04 ` Paul E. McKenney
2006-01-12 5:38 ` Keith Owens
2006-01-12 6:19 ` Keith Owens
2006-01-12 6:51 ` Paul E. McKenney
2006-01-12 7:50 ` Keith Owens
2006-01-12 15:17 ` Paul E. McKenney
2006-01-17 17:26 ` Paul E. McKenney
2006-01-17 23:57 ` Keith Owens
2006-01-18 2:49 ` Paul E. McKenney
2006-01-18 2:55 ` Lee Revell
2006-01-18 6:29 ` Paul E. McKenney
2006-01-12 5:26 ` Matt Helsley
2006-01-12 5:45 ` Keith Owens
2006-01-12 9:51 ` Jes Sorensen
2006-01-12 23:01 ` Matt Helsley
2006-01-13 9:59 ` Jes Sorensen
2006-01-13 10:38 ` Jes Sorensen
2006-01-13 23:22 ` Matt Helsley
2006-01-12 23:49 ` Matt Helsley
2006-01-05 0:01 ` [ckrm-tech] Re: [Patch 6/6] Delay accounting: Connector interface Shailabh Nagar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060111213910.GA17986@sgi.com \
--to=jh@sgi.com \
--cc=akpm@osdl.org \
--cc=ckrm-tech@lists.sourceforge.net \
--cc=elsa-devel@lists.sourceforge.net \
--cc=erikj@sgi.com \
--cc=jes@trained-monkey.org \
--cc=jlan@engr.sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lse-tech@lists.sourceforge.net \
--cc=matthltc@us.ibm.com \
--cc=nagar@watson.ibm.com \
--cc=pj@sgi.com \
--cc=steiner@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox