* Re: PROBLEM: high system usage / poor SMP network performance
@ 2002-01-29 18:00 Dan Kegel
2002-01-29 20:09 ` Vincent Sweeney
0 siblings, 1 reply; 33+ messages in thread
From: Dan Kegel @ 2002-01-29 18:00 UTC (permalink / raw)
To: Vincent Sweeney, linux-kernel@vger.kernel.org
"Vincent Sweeney" <v.sweeney@barrysworld.com> wrote:
> > > > CPU0 states: 27.2% user, 62.4% system, 0.0% nice, 9.2% idle
> > > > CPU1 states: 28.4% user, 62.3% system, 0.0% nice, 8.1% idle
> > >
> > > The important bit here is ^^^^^^^^ that one. Something is causing
> > > horrendous lock contention it appears.
> ...
> Right then, here are the results from today so far (snapshot taken with 2000
> users per ircd). Kernel profiling enabled with the eepro100 driver compiled
> statically.
> readprofile -r ; sleep 60; readprofile | sort -n | tail -30
> ...
> 170 sys_poll 0.1897
> 269 do_pollfd 1.4944
> 462 remove_wait_queue 12.8333
> 474 add_wait_queue 9.1154
> 782 fput 3.3707
> 1216 default_idle 23.3846
> 1334 fget 16.6750
> 1347 sock_poll 33.6750
> 2408 tcp_poll 6.9195
> 9366 total 0.0094
> ...
> So with my little knowledge of what this means I would say this is purely
> down to poll(), but surely even with 4000 connections to the box that
> shouldn't stretch a dual P3-800 box as much as it does?
My oldish results,
http://www.kegel.com/dkftpbench/Poller_bench.html#results
show that yes, 4000 connections can really hurt a Linux program
that uses poll(). It is very tempting to port ircd to use
the Poller library (http://www.kegel.com/dkftpbench/dkftpbench-0.38.tar.gz);
that would let us compare poll(), realtime signals, and /dev/epoll
to see how well they do on your workload.
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: PROBLEM: high system usage / poor SMP network performance
2002-01-29 18:00 PROBLEM: high system usage / poor SMP network performance Dan Kegel
@ 2002-01-29 20:09 ` Vincent Sweeney
2002-01-31 5:24 ` Dan Kegel
0 siblings, 1 reply; 33+ messages in thread
From: Vincent Sweeney @ 2002-01-29 20:09 UTC (permalink / raw)
To: Dan Kegel, linux-kernel
----- Original Message -----
From: "Dan Kegel" <dank@kegel.com>
To: "Vincent Sweeney" <v.sweeney@barrysworld.com>;
<linux-kernel@vger.kernel.org>
Sent: Tuesday, January 29, 2002 6:00 PM
Subject: Re: PROBLEM: high system usage / poor SMP network performance
> "Vincent Sweeney" <v.sweeney@barrysworld.com> wrote:
> > > > > CPU0 states: 27.2% user, 62.4% system, 0.0% nice, 9.2% idle
> > > > > CPU1 states: 28.4% user, 62.3% system, 0.0% nice, 8.1% idle
> > > >
> > > > The important bit here is ^^^^^^^^ that one. Something is
> > > > causing horrendous lock contention it appears.
> > ...
> > Right then, here are the results from today so far (snapshot taken with
> > 2000 users per ircd). Kernel profiling enabled with the eepro100 driver
> > compiled statically.
> > readprofile -r ; sleep 60; readprofile | sort -n | tail -30
> > ...
> > 170 sys_poll 0.1897
> > 269 do_pollfd 1.4944
> > 462 remove_wait_queue 12.8333
> > 474 add_wait_queue 9.1154
> > 782 fput 3.3707
> > 1216 default_idle 23.3846
> > 1334 fget 16.6750
> > 1347 sock_poll 33.6750
> > 2408 tcp_poll 6.9195
> > 9366 total 0.0094
> > ...
> > So with my little knowledge of what this means I would say this is purely
> > down to poll(), but surely even with 4000 connections to the box that
> > shouldn't stretch a dual P3-800 box as much as it does?
>
> My oldish results,
> http://www.kegel.com/dkftpbench/Poller_bench.html#results
> show that yes, 4000 connections can really hurt a Linux program
> that uses poll(). It is very tempting to port ircd to use
> the Poller library (http://www.kegel.com/dkftpbench/dkftpbench-0.38.tar.gz);
> that would let us compare poll(), realtime signals, and /dev/epoll
> to see how well they do on your workload.
> - Dan
>
So basically you are telling me these are my options:
1) Someone is going to have to recode the ircd source we use, and possibly
run a modified kernel, in the *hope* that performance improves.
2) Convert the box to FreeBSD which seems to have a better poll()
implementation, and where I could support 8K clients easily as other admins
on my chat network do already.
3) Move the ircd processes to some 400Mhz Ultra 5's running Solaris-8
which run 3-4K users at 60% cpu!
Now I want to run Linux, but unless I get this issue resolved I'm essentially
not utilizing my hardware to the best of its ability.
Vince.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: PROBLEM: high system usage / poor SMP network performance
2002-01-29 20:09 ` Vincent Sweeney
@ 2002-01-31 5:24 ` Dan Kegel
[not found] ` <001d01c1aa8e$2e067e60$0201010a@frodo>
0 siblings, 1 reply; 33+ messages in thread
From: Dan Kegel @ 2002-01-31 5:24 UTC (permalink / raw)
To: Vincent Sweeney; +Cc: linux-kernel
Vincent Sweeney wrote:
> So basically you are telling me these are my options:
>
> 1) Someone is going to have to recode the ircd source we use and
> possibly a modified kernel in the *hope* that performance improves.
> 2) Convert the box to FreeBSD which seems to have a better poll()
> implementation, and where I could support 8K clients easily as other admins
> on my chat network do already.
> 3) Move the ircd processes to some 400Mhz Ultra 5's running Solaris-8
> which run 3-4K users at 60% cpu!
>
> Now I want to run Linux but unless I get this issue resolved I'm essentially
> not utilizing my hardware to the best of its ability.
No need to use a modified kernel; plain old 2.4.18 or so should do
fine, it supports the rtsig stuff. But yeah, you may want to
see if the core of ircd can be recoded. Can you give me the URL
for the source of the version you use? I can peek at it.
It only took me two days to recode betaftpd to use Poller...
I do know that the guys working on aio for linux say they
have code that will make poll() much more efficient, so
I suppose another option is to join the linux-aio list and
say "So you folks say you can make plain old poll() more efficient, eh?
Here's a test case for you." :-)
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: PROBLEM: high system usage / poor SMP network performance
[not found] ` <001d01c1aa8e$2e067e60$0201010a@frodo>
@ 2002-02-03 8:03 ` Dan Kegel
2002-02-03 8:36 ` Andrew Morton
2002-02-03 19:22 ` Kev
[not found] ` <5.1.0.14.2.20020203173247.02c946e8@pop.euronet.nl>
1 sibling, 2 replies; 33+ messages in thread
From: Dan Kegel @ 2002-02-03 8:03 UTC (permalink / raw)
To: Vincent Sweeney, linux-kernel@vger.kernel.org,
coder-com@undernet.org
Cc: Kevin L. Mitchell
Vincent Sweeney wrote:
> > > [I want to use Linux for my irc server, but performance sucks.]
> > > 1) Someone is going to have to recode the ircd source we use and
> > > possibly a modified kernel in the *hope* that performance improves.
> > > 2) Convert the box to FreeBSD which seems to have a better poll()
> > > implementation, and where I could support 8K clients easily as other
> > > admins on my chat network do already....
> >
> > No need to use a modified kernel; plain old 2.4.18 or so should do
> > fine, it supports the rtsig stuff. But yeah, you may want to
> > see if the core of ircd can be recoded. Can you give me the URL
> > for the source of the version you use? I can peek at it.
> > It only took me two days to recode betaftpd to use Poller...
>
> http://dev-com.b2irc.net/ : Undernet's IRCD + Lain 1.1.2 patch
Hmm. Have a look at
http://www.mail-archive.com/coder-com@undernet.org/msg00060.html
It looks like the mainline Undernet ircd was rewritten around May 2001
to support high efficiency techniques like /dev/poll and kqueue.
The source you pointed to is way behind Undernet's current sources.
Undernet's ircd has engine_{select,poll,devpoll,kqueue}.c,
but not yet an engine_rtsig.c, as far as I know.
If you want ircd to handle zillions of simultaneous connections
on a stock 2.4 Linux kernel, rtsignals are the way to go at the
moment. What's needed is to write ircd's engine_rtsig.c, and
modify ircd's os_linux.c to notice EWOULDBLOCK
return values and feed them to engine_rtsig.c (that's the icky
part about the way linux currently does this kind of event
notification - signals are used for 'I'm ready now', but return
values from I/O functions are where you learn 'I'm no longer ready').
So I dunno if I'm going to go ahead and do that myself, but at least I've
scoped out the situation. Before I did any work, I'd measure CPU
usage under a simulated load of 2000 clients, just to verify that
poll() was indeed a bottleneck (ok, can't imagine it not being a
bottleneck, but it's nice to have a baseline to compare the improved
version against).
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: PROBLEM: high system usage / poor SMP network performance
2002-02-03 8:03 ` Dan Kegel
@ 2002-02-03 8:36 ` Andrew Morton
2002-02-04 14:57 ` [Coder-Com] " Darren Smith
2002-02-12 18:48 ` Vincent Sweeney
2002-02-03 19:22 ` Kev
1 sibling, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2002-02-03 8:36 UTC (permalink / raw)
To: Dan Kegel
Cc: Vincent Sweeney, linux-kernel@vger.kernel.org,
coder-com@undernet.org, Kevin L. Mitchell
Dan Kegel wrote:
>
> Before I did any work, I'd measure CPU
> usage under a simulated load of 2000 clients, just to verify that
> poll() was indeed a bottleneck (ok, can't imagine it not being a
> bottleneck, but it's nice to have a baseline to compare the improved
> version against).
I half-did this earlier in the week. It seems that Vincent's
machine is calling poll() maybe 100 times/second. Each call
is taking maybe 10 milliseconds, and is returning approximately
one measly little packet.
select and poll suck for thousands of fds. Always did, always
will. Applications need to work around this.
And the workaround is rather simple:
....
+ usleep(100000);
poll(...);
This will add up to 0.1 seconds latency, but it means that
the poll will gather activity on ten times as many fds,
and that it will be called ten times less often, and that
CPU load will fall by a factor of ten.
This seems an appropriate hack for an IRC server. I guess it
could be souped up a bit:
usleep(nr_fds * 50);
-
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
[not found] ` <5.1.0.14.2.20020203173247.02c946e8@pop.euronet.nl>
@ 2002-02-03 19:16 ` Dan Kegel
2002-02-04 0:07 ` Kev
0 siblings, 1 reply; 33+ messages in thread
From: Dan Kegel @ 2002-02-03 19:16 UTC (permalink / raw)
To: Arjen Wolfs; +Cc: coder-com, feedback, linux-kernel@vger.kernel.org
Arjen Wolfs wrote:
> The ircu version that supports kqueue and /dev/poll is currently being
> beta-tested on a few servers on the Undernet. The graph at
> http://www.break.net/ircu10-to-11.png shows the load average (multiplied by
> 100) on a server with 3000-4000 clients using poll(), and /dev/poll.
> The difference is obviously quite dramatic, and the same effect is being
> seen with kqueue. You could also try some of the /dev/poll patches for
> linux, which might save you writing a new engine. Note that ircu 2.10.11 is
> still beta though, and is known to crash in mysterious ways from time to time.
None of the original /dev/poll patches for Linux were much
good, I seem to recall; they had scaling problems and bugs.
The /dev/epoll patch is good, but the interface is different enough
from /dev/poll that ircd would need a new engine_epoll.c anyway.
(It would look like a cross between engine_devpoll.c and engine_rtsig.c,
as it would need to be notified by os_linux.c of any EWOULDBLOCK return values.
Both rtsigs and /dev/epoll only provide 'I just became ready' notification,
but no 'I'm not ready anymore' notification.)
And then there's /dev/yapoll (http://www.distributopia.com), which
I haven't tried yet (I don't think the author ever published the patch?).
Anyway, the new engine wouldn't be too hard to write, and
would let irc run fast without a patched kernel.
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: PROBLEM: high system usage / poor SMP network performance
2002-02-03 8:03 ` Dan Kegel
2002-02-03 8:36 ` Andrew Morton
@ 2002-02-03 19:22 ` Kev
1 sibling, 0 replies; 33+ messages in thread
From: Kev @ 2002-02-03 19:22 UTC (permalink / raw)
To: Dan Kegel
Cc: Vincent Sweeney, linux-kernel@vger.kernel.org,
coder-com@undernet.org, Kevin L. Mitchell
> Hmm. Have a look at
> http://www.mail-archive.com/coder-com@undernet.org/msg00060.html
> It looks like the mainline Undernet ircd was rewritten around May 2001
> to support high efficiency techniques like /dev/poll and kqueue.
> The source you pointed to is way behind Undernet's current sources.
This code is still in beta testing, by the way. It's certainly not the
prettiest way of doing it, though, and I've started working on a new
implementation of the basic idea in a library, which I will then use in
a future version of Undernet's ircd.
> Undernet's ircd has engine_{select,poll,devpoll,kqueue}.c,
> but not yet an engine_rtsig.c, as far as I know.
> If you want ircd to handle zillions of simultaneous connections
> on a stock 2.4 Linux kernel, rtsignals are the way to go at the
> moment. What's needed is to write ircd's engine_rtsig.c, and
> modify ircd's os_linux.c to notice EWOULDBLOCK
> return values and feed them to engine_rtsig.c (that's the icky
> part about the way linux currently does this kind of event
> notification - signals are used for 'I'm ready now', but return
> values from I/O functions are where you learn 'I'm no longer ready').
I haven't examined the usage of the realtime signals stuff, but I did
originally choose not to bother with it. It may be possible to set up
an engine that uses it, and if anyone gets it working, I sure wouldn't
mind seeing the patches. Still, I'd say that the best bet is probably
to either use the /dev/poll patch for linux, or grab the /dev/epoll patch
and implement a new engine to use it. (I should note that I haven't tried
either of these patches, yet, so YMMV.)
> So I dunno if I'm going to go ahead and do that myself, but at least I've
> scoped out the situation. Before I did any work, I'd measure CPU
> usage under a simulated load of 2000 clients, just to verify that
> poll() was indeed a bottleneck (ok, can't imagine it not being a
> bottleneck, but it's nice to have a baseline to compare the improved
> version against).
I'm very certain that poll() is a bottleneck in any piece of software like
ircd. I have some preliminary data which suggests that not only does the
/dev/poll engine reduce the load averages, but that it scales much better:
Load averages on that beta test server dropped from about 1.30 to about
0.30 for the same number of clients, and adding more clients increases the
load much less than under the previous version using poll(). Of course,
I haven't compared loads under the same server version with two different
engines--it's possible other changes we made have resulted in much of that
load difference.
I should probably note that the beta test server I am referring to is running
Solaris; I have not tried to use the Linux /dev/poll patch as of yet...
--
Kevin L. Mitchell <klmitch@mit.edu>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-03 19:16 ` [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance Dan Kegel
@ 2002-02-04 0:07 ` Kev
2002-02-04 0:37 ` Dan Kegel
0 siblings, 1 reply; 33+ messages in thread
From: Kev @ 2002-02-04 0:07 UTC (permalink / raw)
To: Dan Kegel; +Cc: Arjen Wolfs, coder-com, feedback, linux-kernel@vger.kernel.org
> The /dev/epoll patch is good, but the interface is different enough
> from /dev/poll that ircd would need a new engine_epoll.c anyway.
> (It would look like a cross between engine_devpoll.c and engine_rtsig.c,
> as it would need to be notified by os_linux.c of any EWOULDBLOCK return values.
> Both rtsigs and /dev/epoll only provide 'I just became ready' notification,
> but no 'I'm not ready anymore' notification.)
I don't understand what it is you're saying here. The ircu server uses
non-blocking sockets, and has since long before EfNet and Undernet branched,
so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
--
Kevin L. Mitchell <klmitch@mit.edu>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 0:07 ` Kev
@ 2002-02-04 0:37 ` Dan Kegel
2002-02-04 0:59 ` Aaron Sethman
2002-02-04 2:55 ` Kev
0 siblings, 2 replies; 33+ messages in thread
From: Dan Kegel @ 2002-02-04 0:37 UTC (permalink / raw)
To: Kev; +Cc: Arjen Wolfs, coder-com, feedback, linux-kernel@vger.kernel.org
Kev wrote:
>
> > The /dev/epoll patch is good, but the interface is different enough
> > from /dev/poll that ircd would need a new engine_epoll.c anyway.
> > (It would look like a cross between engine_devpoll.c and engine_rtsig.c,
> > as it would need to be notified by os_linux.c of any EWOULDBLOCK return values.
> > Both rtsigs and /dev/epoll only provide 'I just became ready' notification,
> > but no 'I'm not ready anymore' notification.)
>
> I don't understand what it is you're saying here. The ircu server uses
> non-blocking sockets, and has since long before EfNet and Undernet branched,
> so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
Right. poll() and Solaris /dev/poll are programmer-friendly; they give
you the current readiness status for each socket. ircu handles them fine.
/dev/epoll and Linux 2.4's rtsig feature, on the other hand, are
programmer-hostile; they don't tell you which sockets are ready.
Instead, they tell you when sockets *become* ready;
your only indication that those sockets have become *unready*
is when you see an EWOULDBLOCK from them.
If this didn't make any sense, maybe seeing how it's used might help.
Look at Poller::clearReadiness() in
http://www.kegel.com/dkftpbench/doc/Poller.html#DOC.9.11 or
http://www.kegel.com/dkftpbench/dkftpbench-0.38/Poller_sigio.cc
and the calls to Poller::clearReadiness() in
http://www.kegel.com/dkftpbench/dkftpbench-0.38/ftp_client_pipe.cc
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 0:37 ` Dan Kegel
@ 2002-02-04 0:59 ` Aaron Sethman
2002-02-04 1:16 ` Dan Kegel
2002-02-04 6:11 ` Daniel Phillips
2002-02-04 2:55 ` Kev
1 sibling, 2 replies; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 0:59 UTC (permalink / raw)
To: Dan Kegel
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
On Sun, 3 Feb 2002, Dan Kegel wrote:
> Kev wrote:
> >
> > > The /dev/epoll patch is good, but the interface is different enough
> > > from /dev/poll that ircd would need a new engine_epoll.c anyway.
> > > (It would look like a cross between engine_devpoll.c and engine_rtsig.c,
> > > as it would need to be notified by os_linux.c of any EWOULDBLOCK return values.
> > > Both rtsigs and /dev/epoll only provide 'I just became ready' notification,
> > > but no 'I'm not ready anymore' notification.)
> >
> > I don't understand what it is you're saying here. The ircu server uses
> > non-blocking sockets, and has since long before EfNet and Undernet branched,
> > so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
>
> Right. poll() and Solaris /dev/poll are programmer-friendly; they give
> you the current readiness status for each socket. ircu handles them fine.
I would have to agree with this comment. Hybrid-ircd deals with poll()
and /dev/poll just fine. We have attempted to make it use rtsig, but it
just doesn't seem to agree with the I/O model we are using, which, btw, is
the same model that Squid is (or will be?) using. I haven't played with
/dev/epoll yet, but I pray it is nothing like rtsig.
Basically what we need is something like poll(), but not so nasty.
/dev/poll is okay, but it's a hack. The best thing I've seen so far,
though it perhaps takes the idea too far, is FreeBSD's kqueue stuff
(which Hybrid-ircd handles quite nicely).
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 0:59 ` Aaron Sethman
@ 2002-02-04 1:16 ` Dan Kegel
2002-02-04 1:30 ` Aaron Sethman
2002-02-04 6:11 ` Daniel Phillips
1 sibling, 1 reply; 33+ messages in thread
From: Dan Kegel @ 2002-02-04 1:16 UTC (permalink / raw)
To: Aaron Sethman
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
Aaron Sethman wrote:
>
> On Sun, 3 Feb 2002, Dan Kegel wrote:
>
> > Kev wrote:
> > >
> > > > The /dev/epoll patch is good, but the interface is different enough
> > > > from /dev/poll that ircd would need a new engine_epoll.c anyway.
> > > > (It would look like a cross between engine_devpoll.c and engine_rtsig.c,
> > > > as it would need to be notified by os_linux.c of any EWOULDBLOCK return values.
> > > > Both rtsigs and /dev/epoll only provide 'I just became ready' notification,
> > > > but no 'I'm not ready anymore' notification.)
> > >
> > > I don't understand what it is you're saying here. The ircu server uses
> > > non-blocking sockets, and has since long before EfNet and Undernet branched,
> > > so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
> >
> > Right. poll() and Solaris /dev/poll are programmer-friendly; they give
> > you the current readiness status for each socket. ircu handles them fine.
>
> I would have to agree with this comment. Hybrid-ircd deals with poll()
> and /dev/poll just fine. We have attempted to make it use rtsig, but it
> just doesn't seem to agree with the i/o model we are using...
I'd like to know how it disagrees.
I believe rtsig requires you to tweak your I/O code in three ways:
1. you need to pick a realtime signal number to use for an event queue
2. you need to wrap your read()/write() calls on the socket with code
that notices EWOULDBLOCK
3. you need to fall back to poll() on signal queue overflow.
For what it's worth, my Poller library takes care of fallback to poll
transparently, and makes the EWOULDBLOCK stuff fairly easy. I gather
from the way you quoted my previous message, though, that you
consider rtsig too awful to even think about.
> I haven't played with /dev/epoll yet, but I pray it is nothing like rtsig.
Unfortunately, it is exactly like rtsig in how you need to handle
EWOULDBLOCK.
> Basically what we need is, something like poll() but not so nasty.
> /dev/poll is okay, but its a hack. The best thing I've seen so far, but
> it too seems to take the idea so far is FreeBSD's kqueue stuff(which
> Hybrid-ircd handles quite nicely).
Yes, kqueue is quite easy to use, and doesn't require the gyrations
that rtsig or /dev/epoll require. The only thing that makes rtsig or /dev/epoll
usable is a user-space wrapper library that lets you forget about the
gyrations (mostly).
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 1:16 ` Dan Kegel
@ 2002-02-04 1:30 ` Aaron Sethman
2002-02-04 1:38 ` Dan Kegel
0 siblings, 1 reply; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 1:30 UTC (permalink / raw)
To: Dan Kegel
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
On Sun, 3 Feb 2002, Dan Kegel wrote:
> I'd like to know how it disagrees.
> I believe rtsig requires you to tweak your I/O code in three ways:
> 1. you need to pick a realtime signal number to use for an event queue
Did that.
> 2. you need to wrap your read()/write() calls on the socket with code
> that notices EWOULDBLOCK
This is perhaps the part where it disagrees with our code. I will
investigate this part. The way we normally do things is to have callbacks
per fd that get called when an event occurs, doing the read or write
directly. We do check for the EWOULDBLOCK stuff and re-register the
event. The thing we do not currently do is attempt to read or write
unless we've received notification first. This is what I am assuming is
breaking it.
> 3. you need to fall back to poll() on signal queue overflow.
Did that part too.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 1:30 ` Aaron Sethman
@ 2002-02-04 1:38 ` Dan Kegel
2002-02-04 4:38 ` Aaron Sethman
0 siblings, 1 reply; 33+ messages in thread
From: Dan Kegel @ 2002-02-04 1:38 UTC (permalink / raw)
To: Aaron Sethman
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
Aaron Sethman wrote:
>
> > 2. you need to wrap your read()/write() calls on the socket with code
> > that notices EWOULDBLOCK
> This is perhaps the part where it disagrees with our code. I will
> investigate this part. The way we normally do things is have callbacks
> per fd, that get called when our event occurs doing the read, or, write
> directly.
That sounds totally fine; in fact, it's how my Poller library works.
> We do check for the EWOULDBLOCK stuff and re-register the
> event.
But do you remember that this fd is ready until EWOULDBLOCK?
i.e. if you're notified that an fd is ready, and then you
don't for whatever reason continue to do I/O on it until EWOULDBLOCK,
you'll never ever be notified that it's ready again.
If your code assumes that it will be notified again anyway,
as with poll(), it will be sorely disappointed.
> The thing we do not currently do is, attempt to read or write
> unless we've received notification first. This is what I am assuming is
> breaking it.
Yeah, that would break it, too, I think.
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 0:37 ` Dan Kegel
2002-02-04 0:59 ` Aaron Sethman
@ 2002-02-04 2:55 ` Kev
2002-02-04 3:25 ` Dan Kegel
1 sibling, 1 reply; 33+ messages in thread
From: Kev @ 2002-02-04 2:55 UTC (permalink / raw)
To: Dan Kegel
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
> > I don't understand what it is you're saying here. The ircu server uses
> > non-blocking sockets, and has since long before EfNet and Undernet branched,
> > so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
>
> Right. poll() and Solaris /dev/poll are programmer-friendly; they give
> you the current readiness status for each socket. ircu handles them fine.
>
> /dev/epoll and Linux 2.4's rtsig feature, on the other hand, are
> programmer-hostile; they don't tell you which sockets are ready.
> Instead, they tell you when sockets *become* ready;
> your only indication that those sockets have become *unready*
> is when you see an EWOULDBLOCK from them.
If I'm reading Poller_sigio::waitForEvents correctly, the rtsig stuff at
least tries to return a list of which sockets have become ready, and your
implementation falls back to some other interface when the signal queue
overflows. It also seems to extract what state the socket's in at that
point.
If that's true, I confess I can't quite see your point even still. Once
the event is generated, ircd should read or write as much as it can, then
not pay any attention to the socket until readiness is again signaled by
the generation of an event. Sorry if I'm being dense here...
--
Kevin L. Mitchell <klmitch@mit.edu>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 2:55 ` Kev
@ 2002-02-04 3:25 ` Dan Kegel
2002-02-04 4:47 ` Aaron Sethman
2002-02-04 5:10 ` Kev
0 siblings, 2 replies; 33+ messages in thread
From: Dan Kegel @ 2002-02-04 3:25 UTC (permalink / raw)
To: Kev; +Cc: Arjen Wolfs, coder-com, feedback, linux-kernel@vger.kernel.org
Kev wrote:
>
> > > I don't understand what it is you're saying here. The ircu server uses
> > > non-blocking sockets, and has since long before EfNet and Undernet branched,
> > > so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
> >
> > Right. poll() and Solaris /dev/poll are programmer-friendly; they give
> > you the current readiness status for each socket. ircu handles them fine.
> >
> > /dev/epoll and Linux 2.4's rtsig feature, on the other hand, are
> > programmer-hostile; they don't tell you which sockets are ready.
> > Instead, they tell you when sockets *become* ready;
> > your only indication that those sockets have become *unready*
> > is when you see an EWOULDBLOCK from them.
>
> If I'm reading Poller_sigio::waitForEvents correctly, the rtsig stuff at
> least tries to return a list of which sockets have become ready, and your
> implementation falls back to some other interface when the signal queue
> overflows. It also seems to extract what state the socket's in at that
> point.
>
> If that's true, I confess I can't quite see your point even still. Once
> the event is generated, ircd should read or write as much as it can, then
> not pay any attention to the socket until readiness is again signaled by
> the generation of an event. Sorry if I'm being dense here...
If you actually do read or write *until an EWOULDBLOCK*, no problem.
If your code has a path where it fails to do so, it will get stuck,
as no further readiness events will be forthcoming. That's all.
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 1:38 ` Dan Kegel
@ 2002-02-04 4:38 ` Aaron Sethman
2002-02-04 5:35 ` Dan Kegel
0 siblings, 1 reply; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 4:38 UTC (permalink / raw)
To: Dan Kegel; +Cc: linux-kernel@vger.kernel.org
On Sun, 3 Feb 2002, Dan Kegel wrote:
>
> But do you remember that this fd is ready until EWOULDBLOCK?
> i.e. if you're notified that an fd is ready, and then you
> don't for whatever reason continue to do I/O on it until EWOULDBLOCK,
> you'll never ever be notified that it's ready again.
> If your code assumes that it will be notified again anyway,
> as with poll(), it will be sorely disappointed.
Yeah, that was the problem, and I figured out how to work around it in the
code. If you are interested I can point out the code we have been working
with.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 3:25 ` Dan Kegel
@ 2002-02-04 4:47 ` Aaron Sethman
2002-02-04 5:10 ` Kev
1 sibling, 0 replies; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 4:47 UTC (permalink / raw)
To: Dan Kegel, Kev
Cc: Arjen Wolfs, coder-com, feedback, linux-kernel@vger.kernel.org
On Sun, 3 Feb 2002, Dan Kegel wrote:
> Kev wrote:
> > If that's true, I confess I can't quite see your point even still. Once
> > the event is generated, ircd should read or write as much as it can, then
> > not pay any attention to the socket until readiness is again signaled by
> > the generation of an event. Sorry if I'm being dense here...
>
> If you actually do read or write *until an EWOULDBLOCK*, no problem.
> If your code has a path where it fails to do so, it will get stuck,
> as no further readiness events will be forthcoming. That's all.
It seems kind of odd, at first, but it does make sense in an inverted sort
of way. Basically you aren't going to get any signals from the kernel
until the EWOULDBLOCK state clears. Consider what would happen if you
received a signal every time you could, say, send: your process would be
flooded with signals, which of course wouldn't work. If you want to take
a look at the Hybrid-7 CVS tree, let me know and I can give you a copy of
it. I just got the sigio stuff working correctly in there.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMPnetwork performance
2002-02-04 3:25 ` Dan Kegel
2002-02-04 4:47 ` Aaron Sethman
@ 2002-02-04 5:10 ` Kev
1 sibling, 0 replies; 33+ messages in thread
From: Kev @ 2002-02-04 5:10 UTC (permalink / raw)
To: Dan Kegel
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
> > If I'm reading Poller_sigio::waitForEvents correctly, the rtsig stuff at
> > least tries to return a list of which sockets have become ready, and your
> > implementation falls back to some other interface when the signal queue
> > overflows. It also seems to extract what state the socket's in at that
> > point.
> >
> > If that's true, I confess I can't quite see your point even still. Once
> > the event is generated, ircd should read or write as much as it can, then
> > not pay any attention to the socket until readiness is again signaled by
> > the generation of an event. Sorry if I'm being dense here...
>
> If you actually do read or write *until an EWOULDBLOCK*, no problem.
> If your code has a path where it fails to do so, it will get stuck,
> as no further readiness events will be forthcoming. That's all.
Ah ha! And you may indeed have a point there...
--
Kevin L. Mitchell <klmitch@mit.edu>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 4:38 ` Aaron Sethman
@ 2002-02-04 5:35 ` Dan Kegel
2002-02-04 5:43 ` Aaron Sethman
0 siblings, 1 reply; 33+ messages in thread
From: Dan Kegel @ 2002-02-04 5:35 UTC (permalink / raw)
To: Aaron Sethman; +Cc: linux-kernel@vger.kernel.org
Aaron Sethman wrote:
>
> On Sun, 3 Feb 2002, Dan Kegel wrote:
> >
> > But do you remember that this fd is ready until EWOULDBLOCK?
> > i.e. if you're notified that an fd is ready, and then you
> > don't for whatever reason continue to do I/O on it until EWOULDBLOCK,
> > you'll never ever be notified that it's ready again.
> > If your code assumes that it will be notified again anyway,
> > as with poll(), it will be sorely disappointed.
>
> Yeah that was the problem and I figured out how to work around it in the
> code. If you are interested I can point out the code we have been working
> with.
Yes, I would like to see it; is it part of the mainline undernet ircd cvs tree?
- Dan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 5:35 ` Dan Kegel
@ 2002-02-04 5:43 ` Aaron Sethman
0 siblings, 0 replies; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 5:43 UTC (permalink / raw)
To: Dan Kegel; +Cc: linux-kernel@vger.kernel.org
On Sun, 3 Feb 2002, Dan Kegel wrote:
> Aaron Sethman wrote:
> >
> > On Sun, 3 Feb 2002, Dan Kegel wrote:
> > >
> > > But do you remember that this fd is ready until EWOULDBLOCK?
> > > i.e. if you're notified that an fd is ready, and then you
> > > don't for whatever reason continue to do I/O on it until EWOULDBLOCK,
> > > you'll never ever be notified that it's ready again.
> > > If your code assumes that it will be notified again anyway,
> > > as with poll(), it will be sorely disappointed.
> >
> > Yeah that was the problem and I figured out how to work around it in the
> > code. If you are interested I can point out the code we have been working
> > with.
>
> Yes, I would like to see it; is it part of the mainline undernet ircd cvs tree?
This is part of the Hybrid ircd tree I've been talking about.
http://squeaker.ratbox.org/ircd-hybrid-7.tar.gz has the latest snapshot of
the tree. Look at src/s_bsd_sigio.c for the sigio code.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 0:59 ` Aaron Sethman
2002-02-04 1:16 ` Dan Kegel
@ 2002-02-04 6:11 ` Daniel Phillips
2002-02-04 6:26 ` Aaron Sethman
1 sibling, 1 reply; 33+ messages in thread
From: Daniel Phillips @ 2002-02-04 6:11 UTC (permalink / raw)
To: Aaron Sethman, Dan Kegel
Cc: Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
On February 4, 2002 01:59 am, Aaron Sethman wrote:
> On Sun, 3 Feb 2002, Dan Kegel wrote:
>
> > Kev wrote:
> > >
> > > > The /dev/epoll patch is good, but the interface is different enough
> > > > from /dev/poll that ircd would need a new engine_epoll.c anyway.
> > > > (It would look like a cross between engine_devpoll.c and engine_rtsig.c,
> > > > as it would need to be notified by os_linux.c of any EWOULDBLOCK return values.
> > > > Both rtsigs and /dev/epoll only provide 'I just became ready' notification,
> > > > but no 'I'm not ready anymore' notification.)
> > >
> > > I don't understand what it is you're saying here. The ircu server uses
> > > non-blocking sockets, and has since long before EfNet and Undernet branched,
> > > so it already handles EWOULDBLOCK or EAGAIN intelligently, as far as I know.
> >
> > Right. poll() and Solaris /dev/poll are programmer-friendly; they give
> > you the current readiness status for each socket. ircu handles them fine.
>
> I would have to agree with this comment. Hybrid-ircd deals with poll()
> and /dev/poll just fine. We have attempted to make it use rtsig, but it
> just doesn't seem to agree with the I/O model we are using, which, btw, is
> the same model that Squid (is/will be?) using. I haven't played with
> /dev/epoll yet, but I pray it is nothing like rtsig.
>
> Basically what we need is something like poll() but not so nasty.
> /dev/poll is okay, but it's a hack. The best thing I've seen so far,
> though it too seems to take the idea too far, is FreeBSD's kqueue stuff
> (which Hybrid-ircd handles quite nicely).
In an effort to somehow control the mushrooming number of IO interface
strategies, why not take a look at the work Ben and Suparna are doing in aio,
and see if there's an interface mechanism there that can be repurposed?
Suparna's writeup, for quick orientation:
http://lse.sourceforge.net/io/bionotes.txt
--
Daniel
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 6:11 ` Daniel Phillips
@ 2002-02-04 6:26 ` Aaron Sethman
2002-02-04 6:29 ` Daniel Phillips
0 siblings, 1 reply; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 6:26 UTC (permalink / raw)
To: Daniel Phillips
Cc: Dan Kegel, Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
On Mon, 4 Feb 2002, Daniel Phillips wrote:
> In an effort to somehow control the mushrooming number of IO interface
> strategies, why not take a look at the work Ben and Suparna are doing in aio,
> and see if there's an interface mechanism there that can be repurposed?
When AIO no longer sucks on pretty much every platform on the face of the
planet, I think people will reconsider. In the meantime, we've got to
deal with what is there. That leaves us writing for at least 6 very
similar I/O models with varying attributes.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 6:26 ` Aaron Sethman
@ 2002-02-04 6:29 ` Daniel Phillips
2002-02-04 6:39 ` Aaron Sethman
0 siblings, 1 reply; 33+ messages in thread
From: Daniel Phillips @ 2002-02-04 6:29 UTC (permalink / raw)
To: Aaron Sethman
Cc: Dan Kegel, Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
On February 4, 2002 07:26 am, Aaron Sethman wrote:
> On Mon, 4 Feb 2002, Daniel Phillips wrote:
> > In an effort to somehow control the mushrooming number of IO interface
> > strategies, why not take a look at the work Ben and Suparna are doing in aio,
> > and see if there's an interface mechanism there that can be repurposed?
>
> When AIO no longer sucks on pretty much every platform on the face of the
> planet I think people will reconsider.
What is the hang, as you see it?
> In the meantime, we've got to
> deal with what is there. That leaves us writing for at least 6 very
> similar I/O models with varying attributes.
This is really an unfortunate situation.
--
Daniel
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 6:29 ` Daniel Phillips
@ 2002-02-04 6:39 ` Aaron Sethman
0 siblings, 0 replies; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 6:39 UTC (permalink / raw)
To: Daniel Phillips
Cc: Dan Kegel, Kev, Arjen Wolfs, coder-com, feedback,
linux-kernel@vger.kernel.org
On Mon, 4 Feb 2002, Daniel Phillips wrote:
> On February 4, 2002 07:26 am, Aaron Sethman wrote:
> > On Mon, 4 Feb 2002, Daniel Phillips wrote:
> > > In an effort to somehow control the mushrooming number of IO interface
> > > strategies, why not take a look at the work Ben and Suparna are doing in aio,
> > > and see if there's an interface mechanism there that can be repurposed?
> >
> > When AIO no longer sucks on pretty much every platform on the face of the
> > planet I think people will reconsider.
>
> What is the hang, as you see it?
Well, on many platforms it's implemented via pthreads, which in general
isn't terribly acceptable when you need to deal with 5000 connections in
one process. I would like to see something useful that works well and
performs well. I think the FreeBSD guys had the right idea with their
kqueue interface; shame they couldn't have written it around the POSIX
AIO interface. But I suppose it would be trivial to write a wrapper
around it.
But the real issue is that the standard interfaces, select() and poll(),
are inadequate in the face of current requirements. POSIX AIO seems like
it's heading down the right path, but it just isn't ready in any mature
implementation yet, thus pushing people away from it and making the
problem worse.
> > In the meantime, we've got to
> > deal with what is there. That leaves us writing for at least 6 very
> > similar I/O models with varying attributes.
>
> This is really an unfortunate situation.
I agree with you 150% on that statement. Lots of wasted time reinventing
tires for the latest and greatest wheel.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* RE: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-03 8:36 ` Andrew Morton
@ 2002-02-04 14:57 ` Darren Smith
2002-02-04 17:41 ` Aaron Sethman
2002-02-12 18:48 ` Vincent Sweeney
1 sibling, 1 reply; 33+ messages in thread
From: Darren Smith @ 2002-02-04 14:57 UTC (permalink / raw)
To: 'Andrew Morton', 'Dan Kegel'
Cc: 'Vincent Sweeney', linux-kernel, coder-com,
'Kevin L. Mitchell'
Hi
I've been testing the modified Undernet (2.10.10) code with Vincent
Sweeney based on the simple usleep(100000) addition to s_bsd.c
PRI NICE SIZE RES STATE C TIME WCPU CPU | # USERS
2 0 96348K 96144K poll 0 29.0H 39.01% 39.01% | 1700 <- Without Patch
10 0 77584K 77336K nanslp 0 7:08 5.71% 5.71% | 1500 <- With Patch
Spot the difference!
It doesn't appear to be lagging, yet is using 1/7th the cpu!
Anyone else tried this?
Regards
Darren Smith
-----Original Message-----
From: owner-coder-com@undernet.org [mailto:owner-coder-com@undernet.org]
On Behalf Of Andrew Morton
Sent: 03 February 2002 08:36
To: Dan Kegel
Cc: Vincent Sweeney; linux-kernel@vger.kernel.org;
coder-com@undernet.org; Kevin L. Mitchell
Subject: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network
performance
Dan Kegel wrote:
>
> Before I did any work, I'd measure CPU
> usage under a simulated load of 2000 clients, just to verify that
> poll() was indeed a bottleneck (ok, can't imagine it not being a
> bottleneck, but it's nice to have a baseline to compare the improved
> version against).
I half-did this earlier in the week. It seems that Vincent's
machine is calling poll() maybe 100 times/second. Each call
is taking maybe 10 milliseconds, and is returning approximately
one measly little packet.
select and poll suck for thousands of fds. Always did, always
will. Applications need to work around this.
And the workaround is rather simple:
....
+ usleep(100000);
poll(...);
This will add up to 0.1 seconds latency, but it means that
the poll will gather activity on ten times as many fds,
and that it will be called ten times less often, and that
CPU load will fall by a factor of ten.
This seems an appropriate hack for an IRC server. I guess it
could be souped up a bit:
usleep(nr_fds * 50);
-
^ permalink raw reply [flat|nested] 33+ messages in thread
* RE: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 14:57 ` [Coder-Com] " Darren Smith
@ 2002-02-04 17:41 ` Aaron Sethman
2002-02-04 18:11 ` Darren Smith
0 siblings, 1 reply; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 17:41 UTC (permalink / raw)
To: Darren Smith
Cc: 'Andrew Morton', 'Dan Kegel',
'Vincent Sweeney', linux-kernel, coder-com,
'Kevin L. Mitchell'
On Mon, 4 Feb 2002, Darren Smith wrote:
> Hi
>
> I've been testing the modified Undernet (2.10.10) code with Vincent
> Sweeney based on the simple usleep(100000) addition to s_bsd.c
>
> PRI NICE SIZE RES STATE C TIME WCPU CPU | # USERS
> 2 0 96348K 96144K poll 0 29.0H 39.01% 39.01% | 1700 <- Without Patch
> 10 0 77584K 77336K nanslp 0 7:08 5.71% 5.71% | 1500 <- With Patch
Were you not putting a delay argument into poll(), or perhaps not letting
it delay long enough? If you just do poll() with a timeout of 0, it's
going to suck lots of CPU.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* RE: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 17:41 ` Aaron Sethman
@ 2002-02-04 18:11 ` Darren Smith
2002-02-04 18:30 ` Aaron Sethman
2002-02-08 22:11 ` James Antill
0 siblings, 2 replies; 33+ messages in thread
From: Darren Smith @ 2002-02-04 18:11 UTC (permalink / raw)
To: 'Aaron Sethman'
Cc: 'Andrew Morton', 'Dan Kegel',
'Vincent Sweeney', linux-kernel, coder-com,
'Kevin L. Mitchell'
I mean I added a usleep() before the poll in s_bsd.c for the undernet
2.10.10 code.
timeout = (IRCD_MIN(delay2, delay)) * 1000;
+ usleep(100000); <- New Line
nfds = poll(poll_fds, pfd_count, timeout);
And now we're using 1/8th the cpu! With no noticeable effects.
Regards
Darren.
-----Original Message-----
From: Aaron Sethman [mailto:androsyn@ratbox.org]
Sent: 04 February 2002 17:41
To: Darren Smith
Cc: 'Andrew Morton'; 'Dan Kegel'; 'Vincent Sweeney';
linux-kernel@vger.kernel.org; coder-com@undernet.org; 'Kevin L.
Mitchell'
Subject: RE: [Coder-Com] Re: PROBLEM: high system usage / poor SMP
network performance
On Mon, 4 Feb 2002, Darren Smith wrote:
> Hi
>
> I've been testing the modified Undernet (2.10.10) code with Vincent
> Sweeney based on the simple usleep(100000) addition to s_bsd.c
>
> PRI NICE SIZE RES STATE C TIME WCPU CPU | # USERS
> 2 0 96348K 96144K poll 0 29.0H 39.01% 39.01% | 1700 <- Without Patch
> 10 0 77584K 77336K nanslp 0 7:08 5.71% 5.71% | 1500 <- With Patch
Were you not putting a delay argument into poll(), or perhaps not
letting it delay long enough? If you just do poll() with a timeout of
0, it's going to suck lots of CPU.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* RE: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 18:11 ` Darren Smith
@ 2002-02-04 18:30 ` Aaron Sethman
2002-02-04 18:48 ` Kev
2002-02-04 18:53 ` Doug McNaught
2002-02-08 22:11 ` James Antill
1 sibling, 2 replies; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 18:30 UTC (permalink / raw)
To: Darren Smith
Cc: 'Andrew Morton', 'Dan Kegel',
'Vincent Sweeney', linux-kernel, coder-com,
'Kevin L. Mitchell'
On Mon, 4 Feb 2002, Darren Smith wrote:
> I mean I added a usleep() before the poll in s_bsd.c for the undernet
> 2.10.10 code.
>
> timeout = (IRCD_MIN(delay2, delay)) * 1000;
> + usleep(100000); <- New Line
> nfds = poll(poll_fds, pfd_count, timeout);
Why not just add the additional delay into the poll() timeout? It just
seems like you were not doing enough of a delay in poll().
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 18:30 ` Aaron Sethman
@ 2002-02-04 18:48 ` Kev
2002-02-04 18:59 ` Aaron Sethman
2002-02-04 18:53 ` Doug McNaught
1 sibling, 1 reply; 33+ messages in thread
From: Kev @ 2002-02-04 18:48 UTC (permalink / raw)
To: Aaron Sethman
Cc: Darren Smith, 'Andrew Morton', 'Dan Kegel',
'Vincent Sweeney', linux-kernel, coder-com,
'Kevin L. Mitchell'
> > I mean I added a usleep() before the poll in s_bsd.c for the undernet
> > 2.10.10 code.
> >
> > timeout = (IRCD_MIN(delay2, delay)) * 1000;
> > + usleep(100000); <- New Line
> > nfds = poll(poll_fds, pfd_count, timeout);
> Why not just add the additional delay into the poll() timeout? It just
> seems like you were not doing enough of a delay in poll().
Wouldn't have the same effect. The original point was that adding the
usleep() gives some time for more file descriptors to become ready before
calling poll(), thus increasing the number of file descriptors poll() can
return per system call. Adding the time to the timeout would have no such
effect.
--
Kevin L. Mitchell <klmitch@mit.edu>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 18:30 ` Aaron Sethman
2002-02-04 18:48 ` Kev
@ 2002-02-04 18:53 ` Doug McNaught
1 sibling, 0 replies; 33+ messages in thread
From: Doug McNaught @ 2002-02-04 18:53 UTC (permalink / raw)
To: Aaron Sethman
Cc: Darren Smith, 'Andrew Morton', 'Dan Kegel',
'Vincent Sweeney', linux-kernel, coder-com,
'Kevin L. Mitchell'
Aaron Sethman <androsyn@ratbox.org> writes:
> On Mon, 4 Feb 2002, Darren Smith wrote:
>
> > I mean I added a usleep() before the poll in s_bsd.c for the undernet
> > 2.10.10 code.
> Why not just add the additional delay into the poll() timeout? It just
> seems like you were not doing enough of a delay in poll().
No, because the poll() delay only has an effect if there are no
readable fd's. What the usleep() does is allow time for more fd's to
become readable/writeable before poll() is called, spreading the
poll() overhead over more actual work.
-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 18:48 ` Kev
@ 2002-02-04 18:59 ` Aaron Sethman
0 siblings, 0 replies; 33+ messages in thread
From: Aaron Sethman @ 2002-02-04 18:59 UTC (permalink / raw)
To: Kev
Cc: Darren Smith, 'Andrew Morton', 'Dan Kegel',
'Vincent Sweeney', linux-kernel, coder-com
On Mon, 4 Feb 2002, Kev wrote:
> Wouldn't have the effect. The original point was that adding the usleep()
> gives some time for some more file descriptors to become ready before calling
> poll(), thus increasing the number of file descriptors poll() can return
> per system call. Adding the time to timeout would have no effect.
My fault, I'm not thinking straight today. I don't believe I've had my
daily allowance of caffeine yet.
Regards,
Aaron
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance
2002-02-04 18:11 ` Darren Smith
2002-02-04 18:30 ` Aaron Sethman
@ 2002-02-08 22:11 ` James Antill
1 sibling, 0 replies; 33+ messages in thread
From: James Antill @ 2002-02-08 22:11 UTC (permalink / raw)
To: Darren Smith
Cc: 'Aaron Sethman', 'Andrew Morton',
'Dan Kegel', 'Vincent Sweeney', linux-kernel,
coder-com, 'Kevin L. Mitchell'
"Darren Smith" <data@barrysworld.com> writes:
> I mean I added a usleep() before the poll in s_bsd.c for the undernet
> 2.10.10 code.
>
> timeout = (IRCD_MIN(delay2, delay)) * 1000;
> + usleep(100000); <- New Line
> nfds = poll(poll_fds, pfd_count, timeout);
>
> And now we're using 1/8th the cpu! With no noticeable effects.
Note that something else you want to do is call poll() with a 0
timeout first (and if that doesn't return anything, call again with
the real timeout); this removes all the wait-queue manipulation inside
the kernel when something is ready (most of the time).
--
# James Antill -- james@and.org
:0:
* ^From: .*james@and\.org
/dev/null
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: PROBLEM: high system usage / poor SMP network performance
2002-02-03 8:36 ` Andrew Morton
2002-02-04 14:57 ` [Coder-Com] " Darren Smith
@ 2002-02-12 18:48 ` Vincent Sweeney
1 sibling, 0 replies; 33+ messages in thread
From: Vincent Sweeney @ 2002-02-12 18:48 UTC (permalink / raw)
To: Andrew Morton, Dan Kegel; +Cc: linux-kernel, coder-com, Dan Kegel
Well I've recoded the poll() section in the ircu code base as follows:
Instead of the default :
...
nfds = poll(poll_fds, pfd_count, timeout);
...
we now have
...
nfds = poll(poll_fds, pfd_count, 0);
if (nfds == 0) {
usleep(1000000 / 10); /* sleep 1/10 second */
nfds = poll(poll_fds, pfd_count, timeout);
}
...
And as 'top' results now show, instead of maxing out a dual P3-800 we now
only use a fraction of that, without any noticeable side effects.
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
14684 ircd 15 0 81820 79M 800 S 22.5 21.2 215:39 ircd
14691 ircd 12 0 80716 78M 800 S 21.1 20.9 212:22 ircd
Vince.
----- Original Message -----
From: "Andrew Morton" <akpm@zip.com.au>
To: "Dan Kegel" <dank@kegel.com>
Cc: "Vincent Sweeney" <v.sweeney@barrysworld.com>;
<linux-kernel@vger.kernel.org>; <coder-com@undernet.org>; "Kevin L.
Mitchell" <klmitch@mit.edu>
Sent: Sunday, February 03, 2002 8:36 AM
Subject: Re: PROBLEM: high system usage / poor SMP network performance
> Dan Kegel wrote:
> >
> > Before I did any work, I'd measure CPU
> > usage under a simulated load of 2000 clients, just to verify that
> > poll() was indeed a bottleneck (ok, can't imagine it not being a
> > bottleneck, but it's nice to have a baseline to compare the improved
> > version against).
>
> I half-did this earlier in the week. It seems that Vincent's
> machine is calling poll() maybe 100 times/second. Each call
> is taking maybe 10 milliseconds, and is returning approximately
> one measly little packet.
>
> select and poll suck for thousands of fds. Always did, always
> will. Applications need to work around this.
>
> And the workaround is rather simple:
>
> ....
> + usleep(100000);
> poll(...);
>
> This will add up to 0.1 seconds latency, but it means that
> the poll will gather activity on ten times as many fds,
> and that it will be called ten times less often, and that
> CPU load will fall by a factor of ten.
>
> This seems an appropriate hack for an IRC server. I guess it
> could be souped up a bit:
>
> usleep(nr_fds * 50);
>
> -
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2002-02-12 18:48 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-01-29 18:00 PROBLEM: high system usage / poor SMP network performance Dan Kegel
2002-01-29 20:09 ` Vincent Sweeney
2002-01-31 5:24 ` Dan Kegel
[not found] ` <001d01c1aa8e$2e067e60$0201010a@frodo>
2002-02-03 8:03 ` Dan Kegel
2002-02-03 8:36 ` Andrew Morton
2002-02-04 14:57 ` [Coder-Com] " Darren Smith
2002-02-04 17:41 ` Aaron Sethman
2002-02-04 18:11 ` Darren Smith
2002-02-04 18:30 ` Aaron Sethman
2002-02-04 18:48 ` Kev
2002-02-04 18:59 ` Aaron Sethman
2002-02-04 18:53 ` Doug McNaught
2002-02-08 22:11 ` James Antill
2002-02-12 18:48 ` Vincent Sweeney
2002-02-03 19:22 ` Kev
[not found] ` <5.1.0.14.2.20020203173247.02c946e8@pop.euronet.nl>
2002-02-03 19:16 ` [Coder-Com] Re: PROBLEM: high system usage / poor SMP network performance Dan Kegel
2002-02-04 0:07 ` Kev
2002-02-04 0:37 ` Dan Kegel
2002-02-04 0:59 ` Aaron Sethman
2002-02-04 1:16 ` Dan Kegel
2002-02-04 1:30 ` Aaron Sethman
2002-02-04 1:38 ` Dan Kegel
2002-02-04 4:38 ` Aaron Sethman
2002-02-04 5:35 ` Dan Kegel
2002-02-04 5:43 ` Aaron Sethman
2002-02-04 6:11 ` Daniel Phillips
2002-02-04 6:26 ` Aaron Sethman
2002-02-04 6:29 ` Daniel Phillips
2002-02-04 6:39 ` Aaron Sethman
2002-02-04 2:55 ` Kev
2002-02-04 3:25 ` Dan Kegel
2002-02-04 4:47 ` Aaron Sethman
2002-02-04 5:10 ` Kev