* /proc reliability & performance
@ 2003-10-17 2:07 Albert Cahalan
2003-10-17 2:34 ` Larry McVoy
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Albert Cahalan @ 2003-10-17 2:07 UTC (permalink / raw)
To: linux-kernel mailing list
I created a process with 360 thousand threads,
went into the /proc/*/task directory, and did
a simple /bin/ls. It took over 9 minutes on a
nice fast Opteron. (it's the same at top-level
with processes, but I wasn't about to mess up
my system that much)
OK, that's a bit extreme, but it does show a
scalability problem. There is an O(n*n)
algorithm in there. Here is a proposed fix:
Tie directory readers to a task_struct (or to
some of the PID tracking structs), so that
a directory reader is on a list. When a task
exits, move the list of directory readers on
to a neighboring task.
That is O(1) on task exit, and generally O(n)
for the whole /proc or /proc/42/task read.
It's O(1) per step of the read, excepting
where multiple directory readers wind up at
the same location.
Another benefit is that it is reliable as
long as tasks don't move around on the lists.
Each task will appear at most once, and will
appear exactly once if it doesn't start or
exit during the directory scan.
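The proposal above can be sketched in user space. The following Python model is an illustration only, not kernel code: `Task`, `Reader`, and `TaskList` are hypothetical names, and the per-task `readers` list stands in for whatever list the kernel would hang off a task_struct or PID structure. It assumes tasks never move on the list, as the message requires.

```python
class Task:
    """Node in a doubly linked task list; `readers` holds parked cursors."""
    def __init__(self, pid):
        self.pid = pid
        self.prev = None
        self.next = None
        self.readers = []


class Reader:
    """A directory reader; `at` is the next task to report (None = end)."""
    def __init__(self):
        self.at = None


class TaskList:
    def __init__(self):
        self.head = None
        self.tail = None

    def add(self, pid):
        """Append a new task at the tail of the list."""
        task = Task(pid)
        if self.tail is None:
            self.head = self.tail = task
        else:
            task.prev = self.tail
            self.tail.next = task
            self.tail = task
        return task

    def _park(self, reader, task):
        """Attach the reader to the task it will report next."""
        reader.at = task
        if task is not None:
            task.readers.append(reader)

    def open(self):
        """Start a directory scan at the head of the list."""
        reader = Reader()
        self._park(reader, self.head)
        return reader

    def read(self, reader, count):
        """Return up to `count` pids, one constant-time step per entry
        (except where several readers are parked on the same task)."""
        out = []
        while reader.at is not None and len(out) < count:
            task = reader.at
            task.readers.remove(reader)
            out.append(task.pid)
            self._park(reader, task.next)
        return out

    def exit(self, task):
        """Remove a task, handing its parked readers to the next task."""
        nxt = task.next
        # A kernel version could splice an intrusive list in O(1);
        # this model walks the (usually tiny) list of parked readers.
        for reader in task.readers:
            reader.at = nxt
        if nxt is not None:
            nxt.readers.extend(task.readers)
        task.readers = []
        if task.prev is not None:
            task.prev.next = nxt
        else:
            self.head = nxt
        if nxt is not None:
            nxt.prev = task.prev
        else:
            self.tail = task.prev
```

In this model, a reader that has returned two entries and is parked on the third task keeps its place when that task exits: the next read continues with the following task, and every task that survives the scan is reported exactly once.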
* Re: /proc reliability & performance
2003-10-17 2:07 /proc reliability & performance Albert Cahalan
@ 2003-10-17 2:34 ` Larry McVoy
2003-10-17 8:01 ` dada1
2003-10-17 2:51 ` William Lee Irwin III
` (2 subsequent siblings)
3 siblings, 1 reply; 16+ messages in thread
From: Larry McVoy @ 2003-10-17 2:34 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Thu, Oct 16, 2003 at 10:07:18PM -0400, Albert Cahalan wrote:
> I created a process with 360 thousand threads,
And your real need for 360,000 threads is?
I tend to believe that there are hundreds, nay, thousands, nay, 360 thousand
better things to work on in the kernel.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: /proc reliability & performance
2003-10-17 2:07 /proc reliability & performance Albert Cahalan
2003-10-17 2:34 ` Larry McVoy
@ 2003-10-17 2:51 ` William Lee Irwin III
2003-10-17 3:24 ` Brian McGroarty
2003-10-17 7:40 ` Zan Lynx
3 siblings, 0 replies; 16+ messages in thread
From: William Lee Irwin III @ 2003-10-17 2:51 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Thu, Oct 16, 2003 at 10:07:18PM -0400, Albert Cahalan wrote:
> Tie directory readers to a task_struct (or to
> some of the PID tracking structs), so that
> a directory reader is on a list. When a task
> exits, move the list of directory readers on
> to a neighboring task.
> That is O(1) on task exit, and generally O(n)
> for the whole /proc or /proc/42/task read.
> It's O(1) per step of the read, excepting
> where multiple directory readers wind up at
> the same location.
> Another benefit is that it is reliable as
> long as tasks don't move around on the lists.
> Each task will appear at most once, and will
> appear exactly once if it doesn't start or
> exit during the directory scan.
Several other things have been tried.
(a) something mingo wrote I forgot the nature of
(b) a thing manfred wrote that recovers positions in hashtable
collision chains by sorting them, with O(chain length)
insertion
(c) a thing I wrote that turns the tasklist and pid_chains into
rbtrees and uses the last-seen pid to seek in O(lg(n))
time, and uses a routine to seek and fill buffers as a
drop-in replacement for get_tgid_list()/get_tid_list().
I have a current implementation of (c), as well as a patch to
restore 2.4 semantics to proc_pid_statm() in O(1) time.
-- wli
* Re: /proc reliability & performance
2003-10-17 2:07 /proc reliability & performance Albert Cahalan
2003-10-17 2:34 ` Larry McVoy
2003-10-17 2:51 ` William Lee Irwin III
@ 2003-10-17 3:24 ` Brian McGroarty
2003-10-17 4:31 ` Albert Cahalan
2003-10-17 7:40 ` Zan Lynx
3 siblings, 1 reply; 16+ messages in thread
From: Brian McGroarty @ 2003-10-17 3:24 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Thu, Oct 16, 2003 at 10:07:18PM -0400, Albert Cahalan wrote:
> I created a process with 360 thousand threads,
> went into the /proc/*/task directory, and did
> a simple /bin/ls. It took over 9 minutes on a
> nice fast Opteron. (it's the same at top-level
> with processes, but I wasn't about to mess up
> my system that much)
Are there many cases where the /proc directory contents are read in
this fashion?
I'd be more curious about how performance fares with reading a
thousand entries by name with 1k processes and with 360k processes.
* Re: /proc reliability & performance
2003-10-17 3:24 ` Brian McGroarty
@ 2003-10-17 4:31 ` Albert Cahalan
2003-10-17 4:56 ` William Lee Irwin III
2003-10-17 8:21 ` David Rees
0 siblings, 2 replies; 16+ messages in thread
From: Albert Cahalan @ 2003-10-17 4:31 UTC (permalink / raw)
To: Brian McGroarty; +Cc: linux-kernel mailing list, lm
On Thu, 2003-10-16 at 23:24, Brian McGroarty wrote:
> On Thu, Oct 16, 2003 at 10:07:18PM -0400, Albert Cahalan wrote:
> > I created a process with 360 thousand threads,
> > went into the /proc/*/task directory, and did
> > a simple /bin/ls. It took over 9 minutes on a
> > nice fast Opteron. (it's the same at top-level
> > with processes, but I wasn't about to mess up
> > my system that much)
>
> Are there many cases where the /proc directory
> contents are read in this fashion?
Sure. Run any of: top, ps, lsof, fuser...
> I'd be more curious about how performance fares
> with reading a thousand entries by name with 1k
> processes and with 360k processes.
Judging by the crazy example and the observation
that an O(n*n) algorithm is involved, directory
reads on that very fast machine should get annoying
once you have a few thousand processes. They'd be
perceptible one-by-one, which adds up when you have
multiple reads due to scripts, top, or multiple
users.
Anyway, it's not just about performance! That's
only half of the problem. The other half is
reliability. The way /proc works is this:
Count tasks as you read them. The number is
your directory offset. Return a few dozen entries
at a time. For each read, you'll need to find
back your place. You do this by counting tasks
until you reach your offset. Of course, tasks
will have been created and destroyed between
reads, so who knows where you'll continue from?
That's simply not reliable.
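A rough model of the scheme just described, assuming a plain Python list as a stand-in for the kernel's task list and hypothetical helper names (`read_chunk`, `scan`). It counts list nodes walked, to show where the quadratic cost comes from, and lets tasks come and go between reads through an optional hook.

```python
def read_chunk(tasks, offset, count):
    """One directory read: count `offset` tasks from the head of the
    list to find our place, then return the next `count` entries.

    Returns (entries, new_offset, nodes_walked).
    """
    skipped = min(offset, len(tasks))        # re-finding our place
    chunk = tasks[offset:offset + count]
    return chunk, offset + len(chunk), skipped + len(chunk)


def scan(tasks, chunk_size, between_reads=None):
    """Read the whole 'directory' in chunks, as repeated reads would.

    `between_reads`, if given, is called with the task list after each
    read so a caller can simulate tasks starting or exiting.
    Returns (entries_seen, total_nodes_walked).
    """
    seen, offset, total = [], 0, 0
    while True:
        chunk, offset, walked = read_chunk(tasks, offset, chunk_size)
        total += walked
        if not chunk:
            break
        seen.extend(chunk)
        if between_reads is not None:
            between_reads(tasks)
    return seen, total
```

In this model, doubling the number of tasks roughly quadruples the nodes walked. If the first task exits after the first read, every later task shifts down by one position and the task that slid into the already-consumed offset is never reported.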
* Re: /proc reliability & performance
2003-10-17 4:31 ` Albert Cahalan
@ 2003-10-17 4:56 ` William Lee Irwin III
2003-10-17 8:21 ` David Rees
1 sibling, 0 replies; 16+ messages in thread
From: William Lee Irwin III @ 2003-10-17 4:56 UTC (permalink / raw)
To: Albert Cahalan; +Cc: Brian McGroarty, linux-kernel mailing list, lm
On Fri, Oct 17, 2003 at 12:31:14AM -0400, Albert Cahalan wrote:
> Count tasks as you read them. The number is
> your directory offset. Return a few dozen entries
> at a time. For each read, you'll need to find
> back your place. You do this by counting tasks
> until you reach your offset. Of course, tasks
> will have been created and destroyed between
> reads, so who knows where you'll continue from?
> That's simply not reliable.
That's part of what the rbtree algorithm was meant to address.
It does find_tgids_after(tgids, tgid_array), filling a buffer with the
tgids starting at the first one higher than its first argument. This
way there is no possibility whatsoever of duplicates or deviation from
sorted order.
-- wli
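The idea can be modelled in a few lines. This sketch assumes a sorted Python list plus `bisect` in place of the rbtree, and its `find_tgids_after` is a hypothetical stand-in with a simplified signature, not the routine from the patch. Each read resumes from the last tgid returned instead of from a counted offset.

```python
import bisect


def find_tgids_after(sorted_tgids, last_tgid, count):
    """Return up to `count` tgids strictly greater than `last_tgid`.

    The binary search plays the role of the O(lg(n)) tree seek.
    """
    start = bisect.bisect_right(sorted_tgids, last_tgid)
    return sorted_tgids[start:start + count]


def scan(sorted_tgids, chunk_size, between_reads=None):
    """Read all tgids in chunks, resuming from the last tgid seen.

    `between_reads`, if given, is called with the list after each read
    so a caller can simulate tasks starting or exiting.
    """
    seen, last = [], 0                # real pids are positive
    while True:
        buf = find_tgids_after(sorted_tgids, last, chunk_size)
        if not buf:
            break
        seen.extend(buf)
        last = buf[-1]
        if between_reads is not None:
            between_reads(sorted_tgids)
    return seen
```

Because the cursor is a tgid rather than a position, a task exiting between reads cannot shift the reader's place, and the output is always strictly increasing. A task created mid-scan with a tgid below the cursor is simply not reported by that scan.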
* Re: /proc reliability & performance
2003-10-17 2:07 /proc reliability & performance Albert Cahalan
` (2 preceding siblings ...)
2003-10-17 3:24 ` Brian McGroarty
@ 2003-10-17 7:40 ` Zan Lynx
2003-10-17 7:54 ` William Lee Irwin III
3 siblings, 1 reply; 16+ messages in thread
From: Zan Lynx @ 2003-10-17 7:40 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
On Thu, 2003-10-16 at 20:07, Albert Cahalan wrote:
> I created a process with 360 thousand threads,
> went into the /proc/*/task directory, and did
> a simple /bin/ls. It took over 9 minutes on a
> nice fast Opteron.
Did you try using find instead of ls? ls loads all entries and then
sorts them, so it can create an alphabetical display.
Try using find. It will not take quite so long.
--
Zan Lynx <zlynx@acm.org>
* Re: /proc reliability & performance
2003-10-17 7:40 ` Zan Lynx
@ 2003-10-17 7:54 ` William Lee Irwin III
0 siblings, 0 replies; 16+ messages in thread
From: William Lee Irwin III @ 2003-10-17 7:54 UTC (permalink / raw)
To: Zan Lynx; +Cc: Albert Cahalan, linux-kernel mailing list
On Thu, 2003-10-16 at 20:07, Albert Cahalan wrote:
>> I created a process with 360 thousand threads,
>> went into the /proc/*/task directory, and did
>> a simple /bin/ls. It took over 9 minutes on a
>> nice fast Opteron.
On Fri, Oct 17, 2003 at 01:40:03AM -0600, Zan Lynx wrote:
> Did you try using find instead of ls? ls loads all entries and then
> sorts them, so it can create an alphabetical display.
> Try using find. It will not take quite so long.
GNU ls has a -U flag that should come in handy.
-- wli
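For scripts, the same effect as `ls -U` can be had by iterating the directory without sorting. A minimal Python sketch, with `iter_names` and `count_entries` as hypothetical helper names; it only avoids the user-space cost of collecting and sorting entries and does nothing about the time the kernel spends producing them.

```python
import os


def iter_names(path):
    """Yield entry names in the order the kernel returns them.

    os.scandir() reads the directory lazily, so nothing is sorted and
    the full listing is never held in memory at once.
    """
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def count_entries(path):
    """Count directory entries without building a list of them."""
    return sum(1 for _ in iter_names(path))
```

On a Linux system, `count_entries("/proc/self/task")` would count the calling process's threads this way.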
* Re: /proc reliability & performance
2003-10-17 2:34 ` Larry McVoy
@ 2003-10-17 8:01 ` dada1
2003-10-17 9:10 ` David S. Miller
0 siblings, 1 reply; 16+ messages in thread
From: dada1 @ 2003-10-17 8:01 UTC (permalink / raw)
To: Larry McVoy, Albert Cahalan; +Cc: linux-kernel mailing list
From: "Larry McVoy" <lm@bitmover.com>
>
> And your real need for 360,000 threads is?
>
> I tend to believe that there are hundreds, nay, thousands, nay, 360 thousand
> better things to work on in the kernel.
Same problem here on some servers (real application), but with 280.000 tcp
sockets active.
A "cat /proc/net/tcp" takes too much time to even try it. :(
tools like "netstat" or "lsof", (even with -n flag) are just unusable.
Eric Dumazet
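For reference, a sketch of the text format a procfs-based tool has to parse for every socket. It assumes the IPv4 layout of /proc/net/tcp as printed on a little-endian machine (address as eight hex digits, port as four); `decode_addr` and `parse_tcp_line` are hypothetical helper names, and the sample row in the comment is made up. It does not model the kernel-side cost of generating the file.

```python
import socket
import struct


def decode_addr(hex_addr):
    """Decode 'AABBCCDD:PPPP' into (dotted_ip, port).

    Assumes a little-endian host: 127.0.0.1 is printed as 0100007F.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_tcp_line(line):
    """Parse one data row of /proc/net/tcp (not the header row).

    Returns (local, remote, state) where state is the numeric TCP
    state, e.g. 0x0A for a listening socket.
    """
    fields = line.split()
    local = decode_addr(fields[1])
    remote = decode_addr(fields[2])
    state = int(fields[3], 16)
    return local, remote, state


# Made-up sample row for illustration:
#   0: 0100007F:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 12345
```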
* Re: /proc reliability & performance
2003-10-17 4:31 ` Albert Cahalan
2003-10-17 4:56 ` William Lee Irwin III
@ 2003-10-17 8:21 ` David Rees
1 sibling, 0 replies; 16+ messages in thread
From: David Rees @ 2003-10-17 8:21 UTC (permalink / raw)
To: Albert Cahalan; +Cc: Brian McGroarty, linux-kernel mailing list, lm
On Thu, October 16, 2003 at 9:31 pm, Albert Cahalan sent the following
> On Thu, 2003-10-16 at 23:24, Brian McGroarty wrote:
>> On Thu, Oct 16, 2003 at 10:07:18PM -0400, Albert Cahalan wrote:
>> > I created a process with 360 thousand threads,
>> > went into the /proc/*/task directory, and did
>> > a simple /bin/ls. It took over 9 minutes on a
>> > nice fast Opteron. (it's the same at top-level
>> > with processes, but I wasn't about to mess up
>> > my system that much)
>>
>> Are there many cases where the /proc directory
>> contents are read in this fashion?
>
> Sure. Run any of: top, ps, lsof, fuser...
I can vouch that with as few as 3-5 hundred threads/processes started up
and not necessarily doing much, top starts using a good deal of system time
on a somewhat aging dual PIII server on recent 2.4.x kernels.
-Dave
* Re: /proc reliability & performance
2003-10-17 8:01 ` dada1
@ 2003-10-17 9:10 ` David S. Miller
2003-10-17 14:46 ` Valdis.Kletnieks
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: David S. Miller @ 2003-10-17 9:10 UTC (permalink / raw)
To: dada1; +Cc: lm, albert, linux-kernel
On Fri, 17 Oct 2003 10:01:53 +0200
"dada1" <dada1@cosmosbay.com> wrote:
> A "cat /proc/net/tcp" takes too much time to even try it. :(
>
> tools like "netstat" or "lsof", (even with -n flag) are just unusable.
Because they don't use the netlink TCP socket dumping
facility which is made to handle such things much better
than procfs ever can.
* Re: /proc reliability & performance
2003-10-17 9:10 ` David S. Miller
@ 2003-10-17 14:46 ` Valdis.Kletnieks
2003-10-17 17:24 ` dada1
` (2 subsequent siblings)
3 siblings, 0 replies; 16+ messages in thread
From: Valdis.Kletnieks @ 2003-10-17 14:46 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-kernel
On Fri, 17 Oct 2003 02:10:40 PDT, "David S. Miller" said:
> > tools like "netstat" or "lsof", (even with -n flag) are just unusable.
>
> Because they don't use the netlink TCP socket dumping
> facility which is made to handle such things much better
> than procfs ever can.
The netlink TCP socket dumping facility will also provide the
"open files" list of *non* sockets that lsof wants?
Not all the world's a TCP connection....
* Re: /proc reliability & performance
2003-10-17 9:10 ` David S. Miller
2003-10-17 14:46 ` Valdis.Kletnieks
@ 2003-10-17 17:24 ` dada1
2003-10-17 23:48 ` Albert Cahalan
2003-10-18 6:35 ` Willy Tarreau
3 siblings, 0 replies; 16+ messages in thread
From: dada1 @ 2003-10-17 17:24 UTC (permalink / raw)
To: David S. Miller; +Cc: lm, albert, linux-kernel
From: "David S. Miller" <davem@redhat.com>
> "dada1" <dada1@cosmosbay.com> wrote:
>
> > A "cat /proc/net/tcp" takes too much time to even try it. :(
> >
> > tools like "netstat" or "lsof", (even with -n flag) are just unusable.
>
> Because they don't use the netlink TCP socket dumping
> facility which is made to handle such things much better
> than procfs ever can.
Thanks David for the hint. :) I buy it.
I found that the ss command from iproute2 package does use the 'netlink TCP
dumping' you mention (how many people on earth heard about that ?)
Instead of 15 minutes for a 'netstat -n > FILE', my server takes now 6
seconds with 'ss -n > FILE', with 200000 sockets opened.
Eric
* Re: /proc reliability & performance
2003-10-17 9:10 ` David S. Miller
2003-10-17 14:46 ` Valdis.Kletnieks
2003-10-17 17:24 ` dada1
@ 2003-10-17 23:48 ` Albert Cahalan
2003-10-18 6:35 ` Willy Tarreau
3 siblings, 0 replies; 16+ messages in thread
From: Albert Cahalan @ 2003-10-17 23:48 UTC (permalink / raw)
To: David S. Miller; +Cc: dada1, lm, albert, linux-kernel mailing list
On Fri, 2003-10-17 at 05:10, David S. Miller wrote:
> On Fri, 17 Oct 2003 10:01:53 +0200
> "dada1" <dada1@cosmosbay.com> wrote:
>
> > A "cat /proc/net/tcp" takes too much time to even try it. :(
> >
> > tools like "netstat" or "lsof", (even with -n flag) are just unusable.
>
> Because they don't use the netlink TCP socket dumping
> facility which is made to handle such things much better
> than procfs ever can.
That's an accepted way to do things? Oh cool.
We just need a netlink process info dumping
facility...
* Re: /proc reliability & performance
2003-10-17 9:10 ` David S. Miller
` (2 preceding siblings ...)
2003-10-17 23:48 ` Albert Cahalan
@ 2003-10-18 6:35 ` Willy Tarreau
2003-10-18 6:38 ` David S. Miller
3 siblings, 1 reply; 16+ messages in thread
From: Willy Tarreau @ 2003-10-18 6:35 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-kernel
On Fri, Oct 17, 2003 at 02:10:40AM -0700, David S. Miller wrote:
> On Fri, 17 Oct 2003 10:01:53 +0200
> "dada1" <dada1@cosmosbay.com> wrote:
>
> > A "cat /proc/net/tcp" takes too much time to even try it. :(
> >
> > tools like "netstat" or "lsof", (even with -n flag) are just unusable.
>
> Because they don't use the netlink TCP socket dumping
> facility which is made to handle such things much better
> than procfs ever can.
Hmmm very interesting. And is there an equivalent replacement for
/proc/net/ip_conntrack ? And if not, what would be needed to implement it ?
Willy
* Re: /proc reliability & performance
2003-10-18 6:35 ` Willy Tarreau
@ 2003-10-18 6:38 ` David S. Miller
0 siblings, 0 replies; 16+ messages in thread
From: David S. Miller @ 2003-10-18 6:38 UTC (permalink / raw)
To: Willy Tarreau; +Cc: linux-kernel
On Sat, 18 Oct 2003 08:35:59 +0200
Willy Tarreau <willy@w.ods.org> wrote:
> Hmmm very interesting. And is there an equivalent replacement for
> /proc/net/ip_conntrack ? And if not, what would be needed to implement it ?
I don't know, ask the netfilter developers on the netfilter lists.
:-)