* Re: nproc: So?
From: Albert Cahalan @ 2004-09-17 16:55 UTC (permalink / raw)
To: linux-kernel mailing list; +Cc: rl
Roger Luethi writes:
> I have received some constructive criticism and suggestions,
> but I didn't see any comments on the desirability of nproc in
> mainline. Initially meant to be a proof-of-concept, nproc has
> become an interface that is much cleaner and faster than procfs
> can ever hope to be (it takes some reading of procps or libgtop
> code to appreciate the complexity that is /proc file parsing today),
You spotted the perfect hash lookup? :-)
> and every change in /proc files widens the gap. I presented
> source code, benchmarks, and design documentation to substantiate
> my claims; I can post the user-space code somewhere if there's
> interest.
>
> So I'm wondering if everybody's waiting for me to answer some
> important question I overlooked, or if there is a general
> sentiment that this project is not worth pursuing.
I'm very glad to see numerical proof that /proc is crap.
If nproc does nothing else, it's still been useful.
The funny varargs/vsprintf/whatever encoding is useless to me,
as are the labels.
The nicest thing about netlink is, I think, that it might make
a practical interface for incremental updates. As processes run
or get modified, monitoring apps might get notified. I did not
see mention of this being implemented, and it would take me quite
some time to support it, so it's a long-term goal. (Of course,
people can always submit procps patches to support this.)
I doubt that it is good to break down the data into so many
different items. It seems sensible to break down the data by
locking requirements.
I could use an opaque per-process cookie for process identification.
This would protect from PID reuse, and might allow for faster
lookup. Perhaps it contains: PID, address of task_struct, and the
system-wide or per-cpu fork count from process creation.
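Such a cookie might be sketched like this (a user-space illustration only; the struct name, field names, and comparison helper are mine, not an existing kernel interface):

```c
#include <stdint.h>

/* Hypothetical process-identification cookie: opaque to user space,
 * but internally combining the PID, the task_struct address, and the
 * fork sequence number at process creation, as suggested above. */
struct nproc_cookie {
	uint32_t pid;        /* racy on its own: PIDs are reused */
	uint64_t task_addr;  /* address of task_struct (kernel-private) */
	uint64_t fork_seq;   /* system-wide fork count at creation */
};

/* Two cookies name the same process only if all parts match;
 * a recycled PID gets a new fork_seq and fails the comparison. */
static int cookie_same_process(const struct nproc_cookie *a,
                               const struct nproc_cookie *b)
{
	return a->pid == b->pid &&
	       a->task_addr == b->task_addr &&
	       a->fork_seq == b->fork_seq;
}
```

The point of the compound key is that any one part can collide (PID reuse, slab reuse of task_struct), but all three together practically cannot.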
Something like the stat() syscall would be pretty decent.
Well, whatever... In any case, I'd need to see some working code
for the libproc library. My net connection dies for hours at a
time, so don't expect speedy anything right now.
BTW, I have a 32-bit big-endian system with char being unsigned
by default. The varargs stuff is odd too.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: nproc: So?
From: Roger Luethi @ 2004-09-17 17:51 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Fri, 17 Sep 2004 12:55:32 -0400, Albert Cahalan wrote:
> Roger Luethi writes:
> > I have received some constructive criticism and suggestions,
> > but I didn't see any comments on the desirability of nproc in
> > mainline. Initially meant to be a proof-of-concept, nproc has
> > become an interface that is much cleaner and faster than procfs
> > can ever hope to be (it takes some reading of procps or libgtop
> > code to appreciate the complexity that is /proc file parsing today),
>
> You spotted the perfect hash lookup? :-)
I never claimed nproc is perfect. Solutions with comparable performance
and simplicity are conceivable, but none of them will work anything
like procfs.
> The funny varargs/vsprintf/whatever encoding is useless to me,
Actually, that's just a by-product of the design. It is what you get when
you put all the fields back to back. The only addition I made kernel-side
to make this easy to exploit was the introduction of a NOP field.
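The back-to-back layout is easy to picture in user space. Here is a toy sketch (my own, not the actual nproc wire format, and it omits the NOP field mentioned above): fields are simply concatenated into the reply buffer and read back in the same order they were requested.

```c
#include <stdint.h>
#include <string.h>

/* Toy reply buffer: fixed-size fields stored back to back,
 * with no per-field framing or labels. */
struct toybuf {
	unsigned char data[64];
	size_t len;
};

/* "Kernel side": append one 32-bit field to the reply. */
static void put_u32(struct toybuf *b, uint32_t v)
{
	memcpy(b->data + b->len, &v, sizeof(v));
	b->len += sizeof(v);
}

/* "User side": read field number idx back out. The consumer must
 * know the field order and sizes -- that is the whole contract. */
static uint32_t get_u32(const struct toybuf *b, size_t idx)
{
	uint32_t v;
	memcpy(&v, b->data + idx * sizeof(v), sizeof(v));
	return v;
}
```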
> as are the labels.
Yup. The labels are not useful for the tools you maintain.
> The nicest thing about netlink is, I think, that it might make
> a practical interface for incremental updates. As processes run
> or get modified, monitoring apps might get notified. I did not
> see mention of this being implemented, and it would take me quite
> some time to support it, so it's a long-term goal. (Of course,
> people can always submit procps patches to support this.)
Sounds like what wli and I discussed as differential updates a few
weeks ago. I agree that would be nice; for now, the goal was to suggest
something that's cleaner and faster than procfs. Extensions are easy
to add later.
> I doubt that it is good to break down the data into so many
> different items. It seems sensible to break down the data by
> locking requirements.
True if you consider a static set of fields that never changes. Problematic
otherwise, because as soon as you start grouping fields together, you need
an agreement between kernel and user-space on the contents of these groups.
With nproc, the kernel is free to group fields together for computation
(even the first release calculated all the fields that needed VMA walks
in one go).
> I could use an opaque per-process cookie for process identification.
> This would protect from PID reuse, and might allow for faster
> lookup. Perhaps it contains: PID, address of task_struct, and the
> system-wide or per-cpu fork count from process creation.
Agreed, that would be useful. And it would be easy to integrate with
nproc. Just add a field to return the cookie and a selector based on
cookies rather than PIDs.
> Something like the stat() syscall would be pretty decent.
You lost me there.
Roger
* Re: nproc: So?
From: Albert Cahalan @ 2004-09-18 12:40 UTC (permalink / raw)
To: Roger Luethi; +Cc: linux-kernel mailing list
On Fri, 2004-09-17 at 13:51, Roger Luethi wrote:
> On Fri, 17 Sep 2004 12:55:32 -0400, Albert Cahalan wrote:
> > The nicest thing about netlink is, I think, that it might make
> > a practical interface for incremental updates. As processes run
> > or get modified, monitoring apps might get notified. I did not
> > see mention of this being implemented, and it would take me quite
> > some time to support it, so it's a long-term goal. (Of course,
> > people can always submit procps patches to support this.)
>
> Sounds like what wli and I discussed as differential updates
> a few weeks ago. I agree that would be nice; for now, the goal was
> to suggest something that's cleaner and faster than procfs.
> Extensions are easy to add later.
To me, this looks like the killer feature. You could even
skip the regular process info. Simply return process identification
cookies that could be passed into a separate syscall to get
the information.
> > I doubt that it is good to break down the data into so many
> > different items. It seems sensible to break down the data by
> > locking requirements.
>
> True if you consider a static set of fields that never changes. Problematic
> otherwise, because as soon as you start grouping fields together, you need
> an agreement between kernel and user-space on the contents of these groups.
I suppose this is small potatoes compared to the overhead
of dealing with ASCII, but individual field handling would
be a bit slower.
For initial libproc support, I'd start by requesting info
in groups that match what /proc provides today.
> > I could use an opaque per-process cookie for process identification.
> > This would protect from PID reuse, and might allow for faster
> > lookup. Perhaps it contains: PID, address of task_struct, and the
> > system-wide or per-cpu fork count from process creation.
>
> Agreed, that would be useful. And it would be easy to integrate with
> nproc. Just add a field to return the cookie and a selector based on
> cookies rather than PIDs.
>
> > Something like the stat() syscall would be pretty decent.
>
> You lost me there.
The stat() call simply fills in a struct. Given a per-process
cookie (or a PID if you tolerate the race conditions), a syscall
similar to stat() could fill in a struct.
* Re: nproc: So?
From: Roger Luethi @ 2004-09-19 10:39 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Sat, 18 Sep 2004 08:40:12 -0400, Albert Cahalan wrote:
> To me, this looks like the killer feature. You could even
> skip the regular process info. Simply return process identification
> cookies that could be passed into a separate syscall to get
> the information.
Do you mean "return cookies for all existing processes"? Or "return
cookies for all processes created since X" (and if so, what's X)?
> > True if you consider a static set of fields that never changes. Problematic
> > otherwise, because as soon as you start grouping fields together, you need
> > an agreement between kernel and user-space on the contents of these groups.
>
> I suppose this is small potatoes compared to the overhead
> of dealing with ASCII, but individual field handling would
> be a bit slower.
Correct.
> For initial libproc support, I'd start by requesting info
> in groups that match what /proc provides today.
Makes perfect sense. You can pre-assemble an array of field IDs, hand
them over to the kernel, and get the requested fields in the requested
order.
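That request could be pre-assembled like this (the field IDs, their names, and the idea that a request is a flat ID array are my own illustration; the real nproc constants and request layout may differ):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical field IDs, mirroring one /proc file's content. */
enum {
	NPROC_F_PID = 1,
	NPROC_F_NAME,
	NPROC_F_VMSIZE,
	NPROC_F_VMRSS,
};

/* A libproc could keep one static request per /proc file it used to
 * parse, assembled once and reused for every snapshot; the kernel
 * returns the fields in exactly this order. */
static const uint32_t stat_group[] = {
	NPROC_F_PID, NPROC_F_NAME, NPROC_F_VMSIZE, NPROC_F_VMRSS,
};

static size_t stat_group_nfields(void)
{
	return sizeof(stat_group) / sizeof(stat_group[0]);
}
```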
> The stat() call simply fills in a struct. Given a per-process
> cookie (or a PID if you tolerate the race conditions), a syscall
> similar to stat() could fill in a struct.
With nproc as-is you can send a request that matches your desired struct
and cast the result to a pointer to your struct.
An application can build its own cookie simply by always requesting a set
of fields that _together_ can be used to identify a process. I reckon that
PID + process creation timestamp would be a good combination (except that
the latter is not currently available). Since the creation of the complete
reply to a request is atomic per process, the race is gone. What is not
possible right now is selecting processes based on a cookie -- the only
selectors so far are "all of them" and "select by PID".
Roger
* Re: nproc: So?
From: Albert Cahalan @ 2004-09-19 12:29 UTC (permalink / raw)
To: Roger Luethi; +Cc: linux-kernel mailing list
On Sun, 2004-09-19 at 06:39, Roger Luethi wrote:
> On Sat, 18 Sep 2004 08:40:12 -0400, Albert Cahalan wrote:
> > To me, this looks like the killer feature. You could even
> > skip the regular process info. Simply return process identification
> > cookies that could be passed into a separate syscall to get
> > the information.
>
> Do you mean "return cookies for all existing processes"? Or "return
> cookies for all processes created since X" (if so, what's X?) ?
First, queue cookies for all existing processes.
Then, as process data changes, queue cookies for
processes that need to be examined again. Suppress
queueing of cookies for processes that are already
in the queue so things don't get too backed up.
If memory usage exceeds some adjustable limit, then
switch to supplying all processes until the backlog
is gone.
I realize that the implementation may prove difficult.
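The suppression scheme described above might look roughly like this (a user-space toy under my own naming; the real thing would live kernel-side and key on cookies rather than bare PIDs):

```c
#include <stdint.h>
#include <string.h>

#define MAXPID 32768

/* One flag per process: is its cookie already queued? Queueing is
 * suppressed while the flag is set, so a process that changes many
 * times before the monitor catches up appears in the queue once. */
static unsigned char queued[MAXPID];
static uint32_t queue[MAXPID];
static size_t qlen;

/* Called whenever process data changes; returns 1 if a new entry
 * was queued, 0 if an entry was already pending (suppressed). */
static int enqueue_change(uint32_t pid)
{
	if (queued[pid])
		return 0;
	queued[pid] = 1;
	queue[qlen++] = pid;
	return 1;
}

/* The monitor drains the queue; once dequeued, a process may be
 * queued again by its next change. */
static uint32_t dequeue(void)
{
	uint32_t pid = queue[0];
	memmove(queue, queue + 1, --qlen * sizeof(queue[0]));
	queued[pid] = 0;
	return pid;
}
```

The memory bound then falls out naturally: the queue can never hold more entries than there are processes, which is what makes the "switch to supplying all processes" fallback a reasonable overflow strategy.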
> With nproc as-is you can send a request that matches your desired struct
> and cast the result to a pointer to your struct.
Either that's marketing, or I missed something. :-)
Can I force specific data sizes? Can I force a string to
be NUL-terminated or a NUL-padded fixed-length buffer?
Can I request padding bytes to be skipped over?
* Re: nproc: So?
From: Roger Luethi @ 2004-09-19 13:57 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Sun, 19 Sep 2004 08:29:57 -0400, Albert Cahalan wrote:
> > Do you mean "return cookies for all existing processes"? Or "return
> > cookies for all processes created since X" (if so, what's X?) ?
>
> First, queue cookies for all existing processes.
> Then, as process data changes, queue cookies for
> processes that need to be examined again. Suppress
> queueing of cookies for processes that are already
> in the queue so things don't get too backed up.
> If memory usage exceeds some adjustable limit, then
> switch to supplying all processes until the backlog
> is gone.
How is the kernel to know which changes of process data require
re-examination? In all likelihood, any tool is only going to be
interested in certain changes, not in others.
> I realize that the implementation may prove difficult.
It seems reasonable (and useful) to notify tools if new processes get
created. It is certainly possible to have additional events (like field
changes) trigger notifications, but this would probably become rather
intrusive and expensive.
> > With nproc as-is you can send a request that matches your desired struct
> > and cast the result to a pointer to your struct.
>
> Either that's marketing, or I missed something. :-)
>
> Can I force specific data sizes? Can I force a string to
> be NUL-terminated or a NUL-padded fixed-length buffer?
> Can I request padding bytes to be skipped over?
No, your data types have to match what the kernel offers. What I was
referring to was your request for "info in groups that match what /proc
provides today". What you _can_ do with nproc is, say, ask it to return
a pointer to something like this:
struct statm_extended {
	/* My simple cookie */
	__u32 pid;
	__u32 namelen;
	char name[16];
	/* /proc/PID/statm content */
	__u32 resident;
	__u32 shared;
	__u32 trs;
	__u32 lrs;
	__u32 drs;
	__u32 dt;
};
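Consuming such a reply then amounts to a cast, as described. A user-space sketch (the reply bytes are faked here; in real use they would arrive over the netlink socket, field order and sizes must match the request, and I use <stdint.h> types in place of the kernel's __u32):

```c
#include <stdint.h>
#include <string.h>

struct statm_extended {
	uint32_t pid;
	uint32_t namelen;
	char name[16];
	uint32_t resident;
	uint32_t shared;
	uint32_t trs;
	uint32_t lrs;
	uint32_t drs;
	uint32_t dt;
};

/* Pull one per-process record out of a reply buffer. A direct cast
 * of the buffer pointer also works when alignment allows; memcpy is
 * the strictly portable spelling of the same idea. */
static struct statm_extended parse_reply(const void *buf)
{
	struct statm_extended s;
	memcpy(&s, buf, sizeof(s));
	return s;
}
```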
Roger
* [0/1][ANNOUNCE] nproc v2: netlink access to /proc information
From: Roger Luethi @ 2004-09-08 18:40 UTC (permalink / raw)
To: Andrew Morton, linux-kernel
Cc: Albert Cahalan, William Lee Irwin III, Martin J. Bligh,
Paul Jackson
I am submitting nproc, a new netlink interface to process information,
for review and possible inclusion in mainline.
The problems with /proc, as far as parsers go, are widely known: parsing
is both difficult and slow (a more detailed discussion is included by
reference: http://marc.theaimsgroup.com/?l=linux-kernel&m=109361019528995).
What follows is an overview showing how nproc fares in those areas.
Roger
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Clean Interface
---------------
The main motivation was to clean up the mess that is /proc semantics
and provide a clean interface for tools to gather process information.
Nproc does not add new knowledge to the kernel (some redundancy remains
until routines are shared with /proc). Instead, it offers existing
information in a form that works for tools. In fact, a tool can pass
the buffer read from the netlink directly as a va_list to vprintf
(strings require a trivial extra operation).
A small user-space app can present a view like the one below based on
zero prior knowledge about the fields the kernel has to offer. While I
don't envision that as common for tools in the future, it demonstrates
what can be done with little effort. This is not a mock-up, by the way;
the nprocdemo tool exists (lines truncated to fit 80 chars).
MemFree |PageSize|Jiffies |nr_dirty|nr_writeback|nr_unstable|[...]
____page|____byte|__________|____page|________page|_______page|[...]
7546| 4096| 1917203| 1| 0| 0|[...]
PID |Name |VmSize |VmLock |VmRSS |VmData |VmStack |[...]
_____|_______________|_____KiB|_____KiB|_____KiB|_____KiB|_____KiB|[...]
1|init | 1340| 0| 468| 144| 4|[...]
2|ksoftirqd/0 | 0| 0| 0| 0| 0|[...]
3|events/0 | 0| 0| 0| 0| 0|[...]
4|khelper | 0| 0| 0| 0| 0|[...]
5|netlink/0 | 0| 0| 0| 0| 0|[...]
6|kacpid | 0| 0| 0| 0| 0|[...]
23|kblockd/0 | 0| 0| 0| 0| 0|[...]
24|khubd | 0| 0| 0| 0| 0|[...]
36|pdflush | 0| 0| 0| 0| 0|[...]
37|pdflush | 0| 0| 0| 0| 0|[...]
38|kswapd0 | 0| 0| 0| 0| 0|[...]
39|aio/0 | 0| 0| 0| 0| 0|[...]
671|kseriod | 0| 0| 0| 0| 0|[...]
686|reiserfs/0 | 0| 0| 0| 0| 0|[...]
851|udevd | 1320| 0| 360| 144| 4|[...]
9159|syslogd | 1516| 0| 588| 272| 16|[...]
9382|gpm | 1540| 0| 468| 152| 4|[...]
9452|klogd | 1468| 0| 432| 276| 8|[...]
9478|hddtemp | 1692| 0| 848| 472| 16|[...]
9486|login | 2152| 0| 1204| 392| 36|[...]
9487|agetty | 1340| 0| 488| 156| 4|[...]
9488|agetty | 1340| 0| 488| 156| 4|[...]
9489|agetty | 1340| 0| 488| 156| 4|[...]
9490|agetty | 1340| 0| 488| 156| 4|[...]
9491|agetty | 1340| 0| 488| 156| 4|[...]
9598|zsh | 4748| 0| 1688| 532| 20|[...]
[...]
Performance
-----------
I measured the time to write a complete process table dump for 5000
tasks to /dev/null 100 times for "ps ax" and nprocdemo.
ps ax (5 process fields):
real 1m0.472s
user 0m18.227s
sys 0m28.545s
nprocdemo (automatic field discovery, reading and printing 11 process
fields + 9 global fields):
real 0m9.064s
user 0m2.491s
sys 0m1.554s
The details of resource usage for the benchmarks show that /proc based
tools suffer badly from the inefficiency of three(!) conversions between
data and strings: the kernel produces strings from numbers, the app
converts them back to numbers, and the app converts the numbers to
strings again for printing.
For nproc based tools, only one conversion remains.
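The three conversions are easy to see in miniature (a toy illustration of the point, not the benchmark code):

```c
#include <stdio.h>
#include <stdlib.h>

/* /proc-style: the kernel formats the number as text ... */
static int kernel_emit(char *out, size_t n, long vmsize)
{
	return snprintf(out, n, "%ld", vmsize);  /* conversion 1 */
}

/* ... the app parses it back into a number ... */
static long app_parse(const char *in)
{
	return strtol(in, NULL, 10);             /* conversion 2 */
}

/* ... and formats it once more for display. An nproc-style tool
 * receives the value in binary and performs only this last step. */
static int app_print(char *out, size_t n, long vmsize)
{
	return snprintf(out, n, "%ld", vmsize);  /* conversion 3 */
}
```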
# ps ax > /dev/null
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples % image name app name symbol name
6524 14.0613 vmlinux ps number
4828 10.4058 libc-2.3.3.so ps _IO_vfscanf_internal
2740 5.9056 vmlinux ps vsnprintf
2689 5.7956 vmlinux ps proc_pid_stat
1807 3.8946 vmlinux ps __d_lookup
1676 3.6123 libc-2.3.3.so ps ____strtol_l_internal
1335 2.8773 vmlinux ps link_path_walk
1133 2.4420 libproc-3.2.3.so ps status2proc
1094 2.3579 vmlinux ps render_sigset_t
1088 2.3450 libc-2.3.3.so ps _IO_vfprintf_internal
1086 2.3407 libc-2.3.3.so ps __GI_strchr
885 1.9075 libc-2.3.3.so ps ____strtoul_l_internal
800 1.7242 vmlinux ps pid_revalidate
581 1.2522 vmlinux ps proc_pid_status
551 1.1876 libc-2.3.3.so ps _IO_sputbackc_internal
529 1.1402 vmlinux ps system_call
524 1.1294 libc-2.3.3.so ps _IO_default_xsputn_internal
476 1.0259 libc-2.3.3.so ps __i686.get_pc_thunk.bx
466 1.0044 vmlinux ps get_tgid_list
442 0.9526 vmlinux ps atomic_dec_and_lock
373 0.8039 vmlinux ps dput
311 0.6703 libc-2.3.3.so ps __GI___strtol_internal
274 0.5906 vmlinux ps __copy_to_user_ll
272 0.5862 vmlinux ps path_lookup
270 0.5819 vmlinux ps strncpy_from_user
262 0.5647 libproc-3.2.3.so ps escape_str
259 0.5582 vmlinux ps page_address
249 0.5367 libc-2.3.3.so ps __GI_____strtoull_l_internal
244 0.5259 libc-2.3.3.so ps __GI_strlen
# nprocdemo > /dev/null
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples % image name app name symbol name
1142 15.9208 libc-2.3.3.so nprocdemo _IO_vfprintf_internal
1072 14.9449 vmlinux vmlinux __task_mem
611 8.5181 libc-2.3.3.so nprocdemo _IO_new_file_xsputn
445 6.2038 vmlinux vmlinux nproc_pid_fields
244 3.4016 vmlinux vmlinux get_wchan
235 3.2762 vmlinux nprocdemo __copy_to_user_ll
233 3.2483 vmlinux vmlinux find_pid
215 2.9974 vmlinux vmlinux finish_task_switch
208 2.8998 vmlinux nprocdemo netlink_recvmsg
158 2.2027 vmlinux nprocdemo __wake_up
153 2.1330 libc-2.3.3.so nprocdemo __find_specmb
149 2.0772 vmlinux nprocdemo finish_task_switch
146 2.0354 libc-2.3.3.so nprocdemo __i686.get_pc_thunk.bx
114 1.5893 vmlinux vmlinux get_task_mm
94 1.3105 vmlinux nprocdemo skb_release_data
87 1.2129 vmlinux vmlinux nproc_ps_do_pid
76 1.0595 vmlinux vmlinux alloc_skb
72 1.0038 vmlinux nprocdemo system_call
68 0.9480 libc-2.3.3.so nprocdemo _IO_padn_internal
65 0.9062 libc-2.3.3.so nprocdemo read_int
64 0.8922 libc-2.3.3.so nprocdemo __recv
63 0.8783 vmlinux vmlinux netlink_attachskb
61 0.8504 vmlinux nprocdemo kfree
56 0.7807 vmlinux vmlinux __kmalloc
55 0.7668 vmlinux vmlinux schedule
47 0.6552 vmlinux vmlinux __task_mem_cheap
42 0.5855 vmlinux nprocdemo sys_socketcall
40 0.5576 vmlinux nprocdemo fget
37 0.5158 nprocdemo nprocdemo nproc_get_reply
EOT
* nproc: So?
From: Roger Luethi @ 2004-09-16 21:43 UTC (permalink / raw)
To: linux-kernel
I have received some constructive criticism and suggestions, but I didn't
see any comments on the desirability of nproc in mainline. Initially meant
to be a proof-of-concept, nproc has become an interface that is much
cleaner and faster than procfs can ever hope to be (it takes some reading
of procps or libgtop code to appreciate the complexity that is /proc file
parsing today), and every change in /proc files widens the gap. I presented
source code, benchmarks, and design documentation to substantiate my
claims; I can post the user-space code somewhere if there's interest.
So I'm wondering if everybody's waiting for me to answer some important
question I overlooked, or if there is a general sentiment that this
project is not worth pursuing.
Roger