nproc: So?

public inbox for linux-kernel@vger.kernel.org

* nproc: So?
  2004-09-08 18:40 [0/1][ANNOUNCE] nproc v2: netlink access to /proc information Roger Luethi
@ 2004-09-16 21:43 ` Roger Luethi
  0 siblings, 0 replies; 7+ messages in thread
From: Roger Luethi @ 2004-09-16 21:43 UTC (permalink / raw)
  To: linux-kernel

I have received some constructive criticism and suggestions, but I didn't
see any comments on the desirability of nproc in mainline. Initially meant
to be a proof-of-concept, nproc has become an interface that is much
cleaner and faster than procfs can ever hope to be (it takes some reading
of procps or libgtop code to appreciate the complexity that is /proc file
parsing today), and every change in /proc files widens the gap. I presented
source code, benchmarks, and design documentation to substantiate my
claims; I can post the user-space code somewhere if there's interest.

So I'm wondering if everybody's waiting for me to answer some important
question I overlooked, or if there is a general sentiment that this
project is not worth pursuing.

Roger


* Re: nproc: So?
@ 2004-09-17 16:55 Albert Cahalan
  2004-09-17 17:51 ` Roger Luethi
  0 siblings, 1 reply; 7+ messages in thread
From: Albert Cahalan @ 2004-09-17 16:55 UTC (permalink / raw)
  To: linux-kernel mailing list; +Cc: rl

Roger Luethi writes:
> I have received some constructive criticism and suggestions,
> but I didn't see any comments on the desirability of nproc in
> mainline. Initially meant to be a proof-of-concept, nproc has
> become an interface that is much cleaner and faster than procfs
> can ever hope to be (it takes some reading of procps or libgtop
> code to appreciate the complexity that is /proc file parsing today),

You spotted the perfect hash lookup? :-)

> and every change in /proc files widens the gap. I presented
> source code, benchmarks, and design documentation to substantiate
> my claims; I can post the user-space code somewhere if there's
> interest.
>
> So I'm wondering if everybody's waiting for me to answer some
> important question I overlooked, or if there is a general
> sentiment that this project is not worth pursuing.

I'm very glad to see numerical proof that /proc is crap.
If nproc does nothing else, it's still been useful.

The funny varargs/vsprintf/whatever encoding is useless to me,
as are the labels.

The nicest thing about netlink is, I think, that it might make
a practical interface for incremental update. As processes run
or get modified, monitoring apps might get notified. I did not
see mention of this being implemented, and I would take quite 
some time to support it, so it's a long-term goal. (of course,
people can always submit procps patches to support this)

I doubt that it is good to break down the data into so many
different items. It seems sensible to break down the data by 
locking requirements. 

I could use an opaque per-process cookie for process identification.
This would protect from PID reuse, and might allow for faster
lookup. Perhaps it contains: PID, address of task_struct, and the
system-wide or per-cpu fork count from process creation.

Something like the stat() syscall would be pretty decent.

Well, whatever... In any case, I'd need to see some working code
for the libproc library. My net connection dies for hours at a
time, so don't expect speedy anything right now.

BTW, I have a 32-bit big-endian system with char being unsigned
by default. The varargs stuff is odd too.


* Re: nproc: So?
  2004-09-17 16:55 nproc: So? Albert Cahalan
@ 2004-09-17 17:51 ` Roger Luethi
  2004-09-18 12:40   ` Albert Cahalan
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Luethi @ 2004-09-17 17:51 UTC (permalink / raw)
  To: Albert Cahalan; +Cc: linux-kernel mailing list

On Fri, 17 Sep 2004 12:55:32 -0400, Albert Cahalan wrote:
> Roger Luethi writes:
> > I have received some constructive criticism and suggestions,
> > but I didn't see any comments on the desirability of nproc in
> > mainline. Initially meant to be a proof-of-concept, nproc has
> > become an interface that is much cleaner and faster than procfs
> > can ever hope to be (it takes some reading of procps or libgtop
> > code to appreciate the complexity that is /proc file parsing today),
> 
> You spotted the perfect hash lookup? :-)

I never claimed nproc is perfect. Solutions with comparable performance
and simplicity are conceivable, but none of them will work anything
like procfs.

> The funny varargs/vsprintf/whatever encoding is useless to me,

Actually, that's just a by-product of the design. It is what you get when
you put all the fields back to back. The only addition I made kernel-side
to make this easy to exploit was the introduction of a NOP field.

> as are the labels.

Yup. The labels are not useful for the tools you maintain.

> The nicest thing about netlink is, I think, that it might make
> a practical interface for incremental update. As processes run
> or get modified, monitoring apps might get notified. I did not
> see mention of this being implemented, and I would take quite 
> some time to support it, so it's a long-term goal. (of course,
> people can always submit procps patches to support this)

Sounds like what wli and I discussed as differential updates a few
weeks ago. I agree that would be nice, but for now the goal was to suggest
something that's cleaner and faster than procfs. Extensions are easy
to add later.

> I doubt that it is good to break down the data into so many
> different items. It seems sensible to break down the data by 
> locking requirements. 

True if you consider a static set of fields that never changes. Problematic
otherwise, because as soon as you start grouping fields together, you need
an agreement between kernel and user-space on the contents of these groups.

With nproc, the kernel is free to group fields together for computation
(even the first release calculated all the fields that needed VMA walks
in one go).

> I could use an opaque per-process cookie for process identification.
> This would protect from PID reuse, and might allow for faster
> lookup. Perhaps it contains: PID, address of task_struct, and the
> system-wide or per-cpu fork count from process creation.

Agreed, that would be useful. And it would be easy to integrate with
nproc. Just add a field to return the cookie and a selector based on
cookies rather than PIDs.

> Something like the stat() syscall would be pretty decent.

You lost me there.

Roger


* Re: nproc: So?
  2004-09-17 17:51 ` Roger Luethi
@ 2004-09-18 12:40   ` Albert Cahalan
  2004-09-19 10:39     ` Roger Luethi
  0 siblings, 1 reply; 7+ messages in thread
From: Albert Cahalan @ 2004-09-18 12:40 UTC (permalink / raw)
  To: Roger Luethi; +Cc: linux-kernel mailing list

On Fri, 2004-09-17 at 13:51, Roger Luethi wrote:
> On Fri, 17 Sep 2004 12:55:32 -0400, Albert Cahalan wrote:

> > The nicest thing about netlink is, I think, that it might make
> > a practical interface for incremental update. As processes run
> > or get modified, monitoring apps might get notified. I did not
> > see mention of this being implemented, and I would take quite 
> > some time to support it, so it's a long-term goal. (of course,
> > people can always submit procps patches to support this)
> 
> Sounds like what wli and I discussed as differential updates
> a few weeks ago. I agree that would be nice, but for now the goal
> was to suggest something that's cleaner and faster than procfs.
> Extensions are easy to add later.

To me, this looks like the killer feature. You could even
skip the regular process info. Simply return process identification
cookies that could be passed into a separate syscall to get
the information.

> > I doubt that it is good to break down the data into so many
> > different items. It seems sensible to break down the data by 
> > locking requirements. 
> 
> True if you consider a static set of fields that never changes. Problematic
> otherwise, because as soon as you start grouping fields together, you need
> an agreement between kernel and user-space on the contents of these groups.

I suppose this is small potatoes compared to the overhead
of dealing with ASCII, but individual field handling would
be a bit slower.

For initial libproc support, I'd start by requesting info
in groups that match what /proc provides today.

> > I could use an opaque per-process cookie for process identification.
> > This would protect from PID reuse, and might allow for faster
> > lookup. Perhaps it contains: PID, address of task_struct, and the
> > system-wide or per-cpu fork count from process creation.
> 
> Agreed, that would be useful. And it would be easy to integrate with
> nproc. Just add a field to return the cookie and a selector based on
> cookies rather than PIDs.
> 
> > Something like the stat() syscall would be pretty decent.
> 
> You lost me there.

The stat() call simply fills in a struct. Given a per-process
cookie (or a PID if you tolerate the race conditions), a syscall
similar to stat() could fill in a struct.




* Re: nproc: So?
  2004-09-18 12:40   ` Albert Cahalan
@ 2004-09-19 10:39     ` Roger Luethi
  2004-09-19 12:29       ` Albert Cahalan
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Luethi @ 2004-09-19 10:39 UTC (permalink / raw)
  To: Albert Cahalan; +Cc: linux-kernel mailing list

On Sat, 18 Sep 2004 08:40:12 -0400, Albert Cahalan wrote:
> To me, this looks like the killer feature. You could even
> skip the regular process info. Simply return process identification
> cookies that could be passed into a separate syscall to get
> the information.

Do you mean "return cookies for all existing processes"? Or "return
cookies for all processes created since X" (if so, what's X?) ?

> > True if you consider a static set of fields that never changes. Problematic
> > otherwise, because as soon as you start grouping fields together, you need
> > an agreement between kernel and user-space on the contents of these groups.
> 
> I suppose this is small potatoes compared to the overhead
> of dealing with ASCII, but individual field handling would
> be a bit slower.

Correct.

> For initial libproc support, I'd start by requesting info
> in groups that match what /proc provides today.

Makes perfect sense. You can pre-assemble an array of field IDs, hand
them over to the kernel, and get the requested fields in the requested
order.
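
To illustrate the field-ID array idea, here is a minimal sketch in C. The
field IDs, their sizes, and the helper reply_layout() are hypothetical and
are not taken from the nproc patch; the sketch also assumes the kernel packs
the requested fields back to back with no padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field IDs; the real nproc IDs are not shown in this thread. */
enum field_id { F_PID, F_NAME, F_RESIDENT, F_SHARED, F_NFIELDS };

/* Hypothetical per-field sizes in bytes. */
static const size_t field_size[F_NFIELDS] = {
	[F_PID] = 4, [F_NAME] = 16, [F_RESIDENT] = 4, [F_SHARED] = 4,
};

/*
 * Given the request array, compute where each field would start in a
 * reply that lays the fields out in request order, back to back.
 * Returns the total payload size in bytes.
 */
static size_t reply_layout(const enum field_id *req, size_t n, size_t *offset)
{
	size_t pos = 0;

	assert(req != NULL && offset != NULL);
	for (size_t i = 0; i < n; i++) {
		assert(req[i] < F_NFIELDS);
		offset[i] = pos;
		pos += field_size[req[i]];
	}
	return pos;
}
```

A tool would build the request array once at startup and reuse both the
array and the computed offsets for every reply.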

> The stat() call simply fills in a struct. Given a per-process
> cookie (or a PID if you tolerate the race conditions), a syscall
> similar to stat() could fill in a struct.

With nproc as-is you can send a request that matches your desired struct
and cast the result to a pointer to your struct.

An application can build its own cookie simply by always requesting a set
of fields that _together_ can be used to identify a process. I reckon that
PID + process creation timestamp would be a good combination (except that
the latter is not currently available). Because the creation of the complete
reply to a request is atomic per process, the race is gone. What is not possible
right now is selecting processes based on a cookie -- the only selectors
so far are "all of them" and "select by PID".
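
A user-space cookie of this kind could look roughly like the following
sketch. The struct and function names are hypothetical, and the creation
timestamp is assumed to be available as a 64-bit value, which nproc does not
currently provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side cookie; not part of nproc itself. */
struct proc_cookie {
	uint32_t pid;        /* process ID, may be reused over time */
	uint64_t start_time; /* creation timestamp (assumed to be available) */
};

/* Build a cookie from two fields taken from the same per-process reply. */
static struct proc_cookie cookie_make(uint32_t pid, uint64_t start_time)
{
	struct proc_cookie c = { pid, start_time };
	return c;
}

/* Two cookies name the same process only if both parts match. */
static bool cookie_same_process(const struct proc_cookie *a,
				const struct proc_cookie *b)
{
	assert(a != NULL && b != NULL);
	return a->pid == b->pid && a->start_time == b->start_time;
}
```

A reused PID then shows up as a cookie mismatch, because the new process
carries a different creation timestamp.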

Roger


* Re: nproc: So?
  2004-09-19 10:39     ` Roger Luethi
@ 2004-09-19 12:29       ` Albert Cahalan
  2004-09-19 13:57         ` Roger Luethi
  0 siblings, 1 reply; 7+ messages in thread
From: Albert Cahalan @ 2004-09-19 12:29 UTC (permalink / raw)
  To: Roger Luethi; +Cc: linux-kernel mailing list

On Sun, 2004-09-19 at 06:39, Roger Luethi wrote:
> On Sat, 18 Sep 2004 08:40:12 -0400, Albert Cahalan wrote:
> > To me, this looks like the killer feature. You could even
> > skip the regular process info. Simply return process identification
> > cookies that could be passed into a separate syscall to get
> > the information.
> 
> Do you mean "return cookies for all existing processes"? Or "return
> cookies for all processes created since X" (if so, what's X?) ?

First, queue cookies for all existing processes.
Then, as process data changes, queue cookies for
processes that need to be examined again. Suppress
queueing of cookies for processes that are already
in the queue so things don't get too backed up.
If memory usage exceeds some adjustable limit, then
switch to supplying all processes until the backlog
is gone.
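
The queueing scheme can be sketched in plain C as follows. This is only a
user-space model under stated assumptions: QUEUE_CAP stands in for the
adjustable limit, processes are identified by a small hypothetical slot
number rather than a real cookie, and "supplying all processes" is reduced
to a single rescan-all signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_CAP 4   /* stands in for the adjustable limit */
#define MAX_PROCS 64  /* hypothetical number of process slots */

struct change_queue {
	uint32_t slots[QUEUE_CAP]; /* ring buffer of changed process slots */
	size_t head, len;
	bool queued[MAX_PROCS];    /* true while a slot sits in the queue */
	bool overflow;             /* reader must rescan every process */
};

enum cq_result { CQ_EMPTY, CQ_ONE, CQ_RESCAN_ALL };

/* Record that a process changed; duplicates are suppressed. */
static void cq_note_change(struct change_queue *q, uint32_t slot)
{
	assert(slot < MAX_PROCS);
	if (q->overflow || q->queued[slot])
		return;
	if (q->len == QUEUE_CAP) {
		/* Too backed up: drop the backlog and ask for a full rescan. */
		q->overflow = true;
		q->head = q->len = 0;
		memset(q->queued, 0, sizeof(q->queued));
		return;
	}
	q->slots[(q->head + q->len) % QUEUE_CAP] = slot;
	q->len++;
	q->queued[slot] = true;
}

/* Fetch the next process to examine, or a rescan-all signal. */
static enum cq_result cq_next(struct change_queue *q, uint32_t *slot)
{
	if (q->overflow) {
		q->overflow = false;
		return CQ_RESCAN_ALL;
	}
	if (q->len == 0)
		return CQ_EMPTY;
	*slot = q->slots[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->len--;
	q->queued[*slot] = false;
	return CQ_ONE;
}
```

The per-slot flag bounds the queue at one entry per process, so the overflow
path only triggers when more distinct processes change than the limit allows.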

I realize that the implementation may prove difficult.

> With nproc as-is you can send a request that matches your desired struct
> and cast the result to a pointer to your struct.

Either that's marketing, or I missed something. :-)

Can I force specific data sizes? Can I force a string to
be NUL-terminated or a NUL-padded fixed-length buffer?
Can I request padding bytes to be skipped over?


* Re: nproc: So?
  2004-09-19 12:29       ` Albert Cahalan
@ 2004-09-19 13:57         ` Roger Luethi
  0 siblings, 0 replies; 7+ messages in thread
From: Roger Luethi @ 2004-09-19 13:57 UTC (permalink / raw)
  To: Albert Cahalan; +Cc: linux-kernel mailing list

On Sun, 19 Sep 2004 08:29:57 -0400, Albert Cahalan wrote:
> > Do you mean "return cookies for all existing processes"? Or "return
> > cookies for all processes created since X" (if so, what's X?) ?
> 
> First, queue cookies for all existing processes.
> Then, as process data changes, queue cookies for
> processes that need to be examined again. Suppress
> queueing of cookies for processes that are already
> in the queue so things don't get too backed up.
> If memory usage exceeds some adjustable limit, then
> switch to supplying all processes until the backlog
> is gone.

How is the kernel to know which changes of process data require
re-examination? In all likelihood, any tool is only going to be
interested in certain changes, not in others.

> I realize that the implementation may prove difficult.

It seems reasonable (and useful) to notify tools if new processes get
created. It is certainly possible to have additional events (like field
changes) trigger notifications, but this would probably become rather
intrusive and expensive.

> > With nproc as-is you can send a request that matches your desired struct
> > and cast the result to a pointer to your struct.
> 
> Either that's marketing, or I missed something. :-)
> 
> Can I force specific data sizes? Can I force a string to
> be NUL-terminated or a NUL-padded fixed-length buffer?
> Can I request padding bytes to be skipped over?

No, your data types have to match what the kernel offers. What I was
referring to was your request for "info in groups that match what /proc
provides today". What you _can_ do with nproc is, say, request fields so
that the reply can be read through a pointer to something like this:

struct statm_extended {
	/* My simple cookie */
	__u32 pid;
	__u32 namelen;
	char name[16];

	/* /proc/PID/statm content */
	__u32 resident;
	__u32 shared;
	__u32 trs;
	__u32 lrs;
	__u32 drs;
	__u32 dt;
};
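
On the user-space side, reading such a reply might look like the sketch
below. It assumes the buffer already points at the start of the field data
(any netlink framing is not modelled), uses uint32_t in place of __u32 so it
builds outside the kernel tree, and copies with memcpy() instead of casting
the pointer, which sidesteps alignment concerns. parse_reply() is a
hypothetical helper, not an nproc function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct statm_extended {
	/* simple cookie */
	uint32_t pid;
	uint32_t namelen;
	char name[16];

	/* /proc/PID/statm content */
	uint32_t resident;
	uint32_t shared;
	uint32_t trs;
	uint32_t lrs;
	uint32_t drs;
	uint32_t dt;
};

/* Eight 32-bit fields plus a 16-byte name; no padding is expected. */
_Static_assert(sizeof(struct statm_extended) == 8 * 4 + 16,
	       "struct layout must match the back-to-back reply");

/* Copy one reply payload into the struct; returns -1 if it is too short. */
static int parse_reply(const unsigned char *buf, size_t len,
		       struct statm_extended *out)
{
	assert(buf != NULL && out != NULL);
	if (len < sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	return 0;
}
```

The static assertion catches a compiler that inserts padding, which would
otherwise silently shift every field after the gap.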

Roger
