public inbox for linux-kernel@vger.kernel.org
* PROBLEM: high system usage / poor SMP network performance
@ 2002-01-27 22:23 Vincent Sweeney
  2002-01-27 22:42 ` Andrew Morton
  2002-01-27 22:54 ` Alan Cox
  0 siblings, 2 replies; 15+ messages in thread
From: Vincent Sweeney @ 2002-01-27 22:23 UTC (permalink / raw)
  To: linux-kernel

I am the server admin for several very busy IRC servers with an ever-increasing
user load, but I've hit a severe bottleneck which, after numerous system tweaks
and driver configuration changes, I can only assume is related to a performance
problem with the Linux kernel.

The server configurations are all identical:
    Compaq Proliant 330Rs with dual 800 MHz Pentium IIIs & 384 MB RAM
    Intel NICs using the Intel e100 driver
    2.4.17 kernel
    2 ircd processes per box

Here is a snapshot from 'top' :
      9:51pm  up 11 days, 10:13,  1 user,  load average: 0.95, 1.24, 1.21
    44 processes: 40 sleeping, 4 running, 0 zombie, 0 stopped
    CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
    CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
    Mem:   385096K av,  376896K used,    8200K free,       0K shrd,    3588K buff
    Swap:  379416K av,   12744K used,  366672K free                   58980K cached

      PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
     7825 ircd      18   0 84504  82M  5604 R    89.5 21.9  6929m ircd
    31010 ircd      20   0 86352  84M  5676 R    85.0 22.3  7218m ircd

When this snapshot was taken each ircd had 2000 users connected. As you
can see, I am using more than a single CPU's worth of processor power on
system CPU alone, and the ircd processes are using just over 50% of a
single CPU! In comparison, another server admin who runs an ircd on a
single P3-500 Linux 2.4.x system with 3000 users reaches about 60% *total*
CPU usage. Likewise, admins who run *BSD or Solaris handle similar user
connections and barely break a sweat. I have tried setting all the network
performance tweaks mentioned on numerous sites, and also using the cpu saver
option on the e100 driver, but at best I have seen only a 1-2% CPU saving.

Naturally I would really like to know where / what is using up all this
system CPU, so I would like to try profiling the kernel, as I'm sure this
is a pure kernel network-layer performance issue. I have no idea where to
start, though, so does anyone have advice / tips on where to begin?

Vince.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-27 22:23 PROBLEM: high system usage / poor SMP network performance Vincent Sweeney
@ 2002-01-27 22:42 ` Andrew Morton
  2002-01-27 22:54 ` Alan Cox
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Morton @ 2002-01-27 22:42 UTC (permalink / raw)
  To: Vincent Sweeney; +Cc: linux-kernel

Vincent Sweeney wrote:
> 
> Naturally I would really like to know where / what is using up all this
> system cpu so I would like to try profiling the kernel as I'm sure this is a
> pure kernel network layer performance issue but I have no idea where to
> start so does anyone have some advice / tips on where I should start?

Yes, profiling the kernel is clearly the next step.  And it's
really easy.

1: If possible, rebuild your kernel with as few modules as possible.
   Current profiler doesn't cope with code which is in modules.

2: If you're on uniprocessor, enable the "Local APIC support on
   Uniprocessors" option.  This allows higher-resolution profiling.

3: Arrange for the kernel to be provided the `profile=1' boot
   option.  I use

	append="profile=1"

   in /etc/lilo.conf

   After a reboot, the kernel is profiling itself.  The overhead is
   really low.

4: Bring the server online and wait until it starts to play up.

Now we can profile it.  I use this script:

mnm:/home/akpm> cat $(which akpm-prof)
#!/bin/sh
TIMEFILE=/tmp/$(basename $1).time
sudo readprofile -r
sudo readprofile -M10
time "$@"
readprofile -v -m /boot/System.map | sort -n +2 | tail -40 | tee $TIMEFILE
echo created $TIMEFILE

Let's go through it:

	readprofile -r

		This clears out the kernel's current profiling info

	readprofile -M10

		This attempts to set the profiling interval to 10*HZ
		(1000 Hz).  This requires a local APIC, and a recent
		util-linux package.  Not very important if this fails.
		This command also cleans out the kernel's current
		profiling info (it's a superset of -r).

	time "$@"

		Runs the command which we wish to profile

	readprofile ...

		Emits the profile info, sorted in useful order.
		You must make sure that /boot/System.map is the
		correct one for the currently-running kernel!

So in your situation, the command which you want to profile isn't
important - you want to profile kernel activity arising from
*existing* processes.  So you can use:

	akpm-prof sleep 30

Please send the results!

-


* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-27 22:54 ` Alan Cox
@ 2002-01-27 22:52   ` arjan
  2002-01-27 23:08   ` Vincent Sweeney
  2002-01-28 19:34   ` Vincent Sweeney
  2 siblings, 0 replies; 15+ messages in thread
From: arjan @ 2002-01-27 22:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

In article <E16UyCO-0002zE-00@the-village.bc.nu> you wrote:
>>     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
>>     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle

> The important bit here is     ^^^^^^^^ that one. Something is causing 
> horrendous lock contention it appears. Is the e100 driver optimised for SMP 
> yet ?

there are WAY too many busy waits (up to 500 msec with IRQs disabled) in e100
to call it SMP optimized.... also in most tests I've seen, eepro100 won
outright


* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-27 22:23 PROBLEM: high system usage / poor SMP network performance Vincent Sweeney
  2002-01-27 22:42 ` Andrew Morton
@ 2002-01-27 22:54 ` Alan Cox
  2002-01-27 22:52   ` arjan
                     ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Alan Cox @ 2002-01-27 22:54 UTC (permalink / raw)
  To: Vincent Sweeney; +Cc: linux-kernel

>     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
>     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle

The important bit here is     ^^^^^^^^ that one. Something is causing 
horrendous lock contention it appears. Is the e100 driver optimised for SMP 
yet ? Do you get better numbers if you use the eepro100 driver ?


* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-27 22:54 ` Alan Cox
  2002-01-27 22:52   ` arjan
@ 2002-01-27 23:08   ` Vincent Sweeney
  2002-01-28 19:34   ` Vincent Sweeney
  2 siblings, 0 replies; 15+ messages in thread
From: Vincent Sweeney @ 2002-01-27 23:08 UTC (permalink / raw)
  To: linux-kernel


----- Original Message -----
From: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
To: "Vincent Sweeney" <v.sweeney@barrysworld.com>
Cc: <linux-kernel@vger.kernel.org>
Sent: Sunday, January 27, 2002 10:54 PM
Subject: Re: PROBLEM: high system usage / poor SMP network performance


> >     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
> >     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
>
> The important bit here is     ^^^^^^^^ that one. Something is causing
> horrendous lock contention it appears. Is the e100 driver optimised for SMP
> yet ? Do you get better numbers if you use the eepro100 driver ?

It's been a while since I tested with the eepro100 driver (I switched to e100
around 2.4.4 due to some unrelated problems), so I cannot give a comparison
just at present. I will test the eepro100 driver tomorrow on one of the
servers and post results then.

I will also try Andrew Morton's profiling tips on another box with the e100
driver.

Vince.




* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-27 22:54 ` Alan Cox
  2002-01-27 22:52   ` arjan
  2002-01-27 23:08   ` Vincent Sweeney
@ 2002-01-28 19:34   ` Vincent Sweeney
  2002-01-28 19:40     ` Rik van Riel
  2 siblings, 1 reply; 15+ messages in thread
From: Vincent Sweeney @ 2002-01-28 19:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

----- Original Message -----
From: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
To: "Vincent Sweeney" <v.sweeney@barrysworld.com>
Cc: <linux-kernel@vger.kernel.org>
Sent: Sunday, January 27, 2002 10:54 PM
Subject: Re: PROBLEM: high system usage / poor SMP network performance


> >     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
> >     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
>
> The important bit here is     ^^^^^^^^ that one. Something is causing
> horrendous lock contention it appears. Is the e100 driver optimised for
SMP
> yet ? Do you get better numbers if you use the eepro100 driver ?


I've switched a server over to the default eepro100 driver as supplied in
2.4.17 (compiled as a module). This is tonight's snapshot with about 10%
higher user count than above (2200 connections per ircd):

  7:25pm  up  5:44,  2 users,  load average: 0.85, 1.01, 1.09
38 processes: 33 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 27.3% user, 69.3% system,  0.0% nice,  2.2% idle
CPU1 states: 26.1% user, 71.2% system,  0.0% nice,  2.0% idle
Mem:   385096K av,  232960K used,  152136K free,       0K shrd,    4724K buff
Swap:  379416K av,       0K used,  379416K free                   21780K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  659 ircd      15   0 74976  73M   660 R    96.7 19.4 263:21 ircd
  666 ircd      14   0 75004  73M   656 R    95.5 19.4 253:10 ircd

So as you can see the numbers are almost the same, though they were worse
than the e100 driver at lower user counts (~45% system per CPU at 1000 users
per ircd with eepro100, ~30% with e100).

I will try the profiling tomorrow with the eepro100 driver compiled into the
kernel; I was unable to do the same for the Intel e100 driver today, as I
discovered that the Intel driver can currently only be compiled as a module.

Vince.




* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-28 19:34   ` Vincent Sweeney
@ 2002-01-28 19:40     ` Rik van Riel
  2002-01-29 16:32       ` Vincent Sweeney
  0 siblings, 1 reply; 15+ messages in thread
From: Rik van Riel @ 2002-01-28 19:40 UTC (permalink / raw)
  To: Vincent Sweeney; +Cc: Alan Cox, linux-kernel

On Mon, 28 Jan 2002, Vincent Sweeney wrote:

> > >     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
> > >     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
> >
> > The important bit here is     ^^^^^^^^ that one. Something is causing
> > horrendous lock contention it appears.
>
> I've switched a server over to the default eepro100 driver as supplied
> in 2.4.17 (compiled as a module). This is tonights snapshot with about
> 10% higher user count than above (2200 connections per ircd)

Hummm ... poll() / select() ?  ;)

> I will try the profiling tomorrow

readprofile | sort -n | tail -20

kind regards,

Rik
-- 
"Linux holds advantages over the single-vendor commercial OS"
    -- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/



* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-28 19:40     ` Rik van Riel
@ 2002-01-29 16:32       ` Vincent Sweeney
  0 siblings, 0 replies; 15+ messages in thread
From: Vincent Sweeney @ 2002-01-29 16:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox, Rik van Riel, Andrew Morton


----- Original Message -----
From: "Rik van Riel" <riel@conectiva.com.br>
To: "Vincent Sweeney" <v.sweeney@barrysworld.com>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>; <linux-kernel@vger.kernel.org>
Sent: Monday, January 28, 2002 7:40 PM
Subject: Re: PROBLEM: high system usage / poor SMP network performance


> On Mon, 28 Jan 2002, Vincent Sweeney wrote:
>
> > > >     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
> > > >     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
> > >
> > > The important bit here is     ^^^^^^^^ that one. Something is causing
> > > horrendous lock contention it appears.
> >
> > I've switched a server over to the default eepro100 driver as supplied
> > in 2.4.17 (compiled as a module). This is tonights snapshot with about
> > 10% higher user count than above (2200 connections per ircd)
>
> Hummm ... poll() / select() ?  ;)
>
> > I will try the profiling tomorrow
>
> readprofile | sort -n | tail -20
>
> kind regards,
>
> Rik

Right then, here are the results from today so far (snapshot taken with 2000
users per ircd). Kernel profiling enabled, with the eepro100 driver compiled
statically.

---
> readprofile -r ; sleep 60; readprofile | sort -n | tail -30

    11 tcp_rcv_established                        0.0055
    12 do_softirq                                 0.0536
    12 nf_hook_slow                               0.0291
    13 __free_pages_ok                            0.0256
    13 kmalloc                                    0.0378
    13 rmqueue                                    0.0301
    13 tcp_ack                                    0.0159
    14 __kfree_skb                                0.0455
    14 tcp_v4_rcv                                 0.0084
    15 __ip_conntrack_find                        0.0441
    16 handle_IRQ_event                           0.1290
    16 tcp_packet                                 0.0351
    17 speedo_rx                                  0.0227
    17 speedo_start_xmit                          0.0346
    18 ip_route_input                             0.0484
    23 speedo_interrupt                           0.0301
    30 ipt_do_table                               0.0284
    30 tcp_sendmsg                                0.0065
   116 __pollwait                                 0.7838
   140 poll_freewait                              1.7500
   170 sys_poll                                   0.1897
   269 do_pollfd                                  1.4944
   462 remove_wait_queue                         12.8333
   474 add_wait_queue                             9.1154
   782 fput                                       3.3707
  1216 default_idle                              23.3846
  1334 fget                                      16.6750
  1347 sock_poll                                 33.6750
  2408 tcp_poll                                   6.9195
  9366 total                                      0.0094

> top

  4:30pm  up  2:57,  2 users,  load average: 0.76, 0.85, 0.82
36 processes: 33 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 21.4% user, 68.1% system,  0.0% nice,  9.3% idle
CPU1 states: 23.4% user, 67.1% system,  0.0% nice,  8.3% idle
Mem:   382916K av,  191276K used,  191640K free,       0K shrd,    1444K buff
Swap:  379416K av,       0K used,  379416K free                   23188K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  613 ircd      16   0 67140  65M   660 R    89.6 17.5 102:00 ircd
  607 ircd      16   0 64868  63M   656 S    88.7 16.9  98:50 ircd

---

So with my little knowledge of what this means, I would say this is purely
down to poll(). But surely even 4000 connections to the box shouldn't
stretch a dual P3-800 as much as they do?

Vince.




* Re: PROBLEM: high system usage / poor SMP network performance
@ 2002-01-29 18:00 Dan Kegel
  2002-01-29 20:09 ` Vincent Sweeney
  0 siblings, 1 reply; 15+ messages in thread
From: Dan Kegel @ 2002-01-29 18:00 UTC (permalink / raw)
  To: Vincent Sweeney, linux-kernel@vger.kernel.org

"Vincent Sweeney" <v.sweeney@barrysworld.com> wrote:
> > > >     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
> > > >     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
> > >
> > > The important bit here is     ^^^^^^^^ that one. Something is causing
> > > horrendous lock contention it appears.
> ...
> Right then, here is the results from today so far (snapshot taken with 2000
> users per ircd). Kernel profiling enabled with the eepro100 driver compiled
> statically.
>    readprofile -r ; sleep 60; readprofile | sort -n | tail -30
> ...
>    170 sys_poll                                   0.1897
>    269 do_pollfd                                  1.4944
>    462 remove_wait_queue                         12.8333
>    474 add_wait_queue                             9.1154
>    782 fput                                       3.3707
>   1216 default_idle                              23.3846
>   1334 fget                                      16.6750
>   1347 sock_poll                                 33.6750
>   2408 tcp_poll                                   6.9195
>   9366 total                                      0.0094
> ...
> So with my little knowledge of what this means I would say this is purely
> down to poll(), but surely even with 4000 connections to the box that
> shouldn't stretch a dual P3-800 box as much as it does?

My oldish results,
http://www.kegel.com/dkftpbench/Poller_bench.html#results
show that yes, 4000 connections can really hurt a Linux program
that uses poll().  It is very tempting to port ircd to use 
the Poller library (http://www.kegel.com/dkftpbench/dkftpbench-0.38.tar.gz);
that would let us compare poll(), realtime signals, and /dev/epoll
to see how well they do on your workload.
- Dan


* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-29 18:00 Dan Kegel
@ 2002-01-29 20:09 ` Vincent Sweeney
  2002-01-31  5:24   ` Dan Kegel
  0 siblings, 1 reply; 15+ messages in thread
From: Vincent Sweeney @ 2002-01-29 20:09 UTC (permalink / raw)
  To: Dan Kegel, linux-kernel

----- Original Message -----
From: "Dan Kegel" <dank@kegel.com>
To: "Vincent Sweeney" <v.sweeney@barrysworld.com>;
<linux-kernel@vger.kernel.org>
Sent: Tuesday, January 29, 2002 6:00 PM
Subject: Re: PROBLEM: high system usage / poor SMP network performance


> "Vincent Sweeney" <v.sweeney@barrysworld.com> wrote:
> > > > >     CPU0 states: 27.2% user, 62.4% system,  0.0% nice,  9.2% idle
> > > > >     CPU1 states: 28.4% user, 62.3% system,  0.0% nice,  8.1% idle
> > > >
> > > > The important bit here is     ^^^^^^^^ that one. Something is causing
> > > > horrendous lock contention it appears.
> > ...
> > Right then, here are the results from today so far (snapshot taken with
> > 2000 users per ircd). Kernel profiling enabled, with the eepro100 driver
> > compiled statically.
> >    readprofile -r ; sleep 60; readprofile | sort -n | tail -30
> > ...
> >    170 sys_poll                                   0.1897
> >    269 do_pollfd                                  1.4944
> >    462 remove_wait_queue                         12.8333
> >    474 add_wait_queue                             9.1154
> >    782 fput                                       3.3707
> >   1216 default_idle                              23.3846
> >   1334 fget                                      16.6750
> >   1347 sock_poll                                 33.6750
> >   2408 tcp_poll                                   6.9195
> >   9366 total                                      0.0094
> > ...
> > So with my little knowledge of what this means, I would say this is purely
> > down to poll(). But surely even 4000 connections to the box shouldn't
> > stretch a dual P3-800 as much as they do?
>
> My oldish results,
> http://www.kegel.com/dkftpbench/Poller_bench.html#results
> show that yes, 4000 connections can really hurt a Linux program
> that uses poll().  It is very tempting to port ircd to use
> the Poller library (http://www.kegel.com/dkftpbench/dkftpbench-0.38.tar.gz);
> that would let us compare poll(), realtime signals, and /dev/epoll
> to see how well they do on your workload.
> - Dan
>

So basically you are telling me these are my options:

    1) Someone is going to have to recode the ircd source we use, and
possibly run a modified kernel, in the *hope* that performance improves.
    2) Convert the box to FreeBSD, which seems to have a better poll()
implementation, and where I could support 8K clients easily, as other admins
on my chat network already do.
    3) Move the ircd processes to some 400 MHz Ultra 5's running Solaris 8,
which run 3-4K users at 60% CPU!

Now I want to run Linux, but unless I get this issue resolved I'm essentially
not utilizing my hardware to the best of its ability.

Vince.




* Re: PROBLEM: high system usage / poor SMP network performance
  2002-01-29 20:09 ` Vincent Sweeney
@ 2002-01-31  5:24   ` Dan Kegel
       [not found]     ` <001d01c1aa8e$2e067e60$0201010a@frodo>
  0 siblings, 1 reply; 15+ messages in thread
From: Dan Kegel @ 2002-01-31  5:24 UTC (permalink / raw)
  To: Vincent Sweeney; +Cc: linux-kernel

Vincent Sweeney wrote:
> So basically you are telling me these are my options:
> 
>     1) Someone is going to have to recode the ircd source we use, and
> possibly run a modified kernel, in the *hope* that performance improves.
>     2) Convert the box to FreeBSD, which seems to have a better poll()
> implementation, and where I could support 8K clients easily, as other admins
> on my chat network already do.
>     3) Move the ircd processes to some 400 MHz Ultra 5's running Solaris 8,
> which run 3-4K users at 60% CPU!
> 
> Now I want to run Linux, but unless I get this issue resolved I'm essentially
> not utilizing my hardware to the best of its ability.

No need to use a modified kernel; plain old 2.4.18 or so should do
fine, it supports the rtsig stuff.  But yeah, you may want to
see if the core of ircd can be recoded.  Can you give me the URL
for the source of the version you use?  I can peek at it.
It only took me two days to recode betaftpd to use Poller...

I do know that the guys working on aio for linux say they
have code that will make poll() much more efficient, so
I suppose another option is to join the linux-aio list and
say "So you folks say you can make plain old poll() more efficient, eh?
Here's a test case for you." :-)

- Dan


* Re: PROBLEM: high system usage / poor SMP network performance
       [not found]     ` <001d01c1aa8e$2e067e60$0201010a@frodo>
@ 2002-02-03  8:03       ` Dan Kegel
  2002-02-03  8:36         ` Andrew Morton
  2002-02-03 19:22         ` Kev
  0 siblings, 2 replies; 15+ messages in thread
From: Dan Kegel @ 2002-02-03  8:03 UTC (permalink / raw)
  To: Vincent Sweeney, linux-kernel@vger.kernel.org,
	coder-com@undernet.org
  Cc: Kevin L. Mitchell

Vincent Sweeney wrote:
> > > [I want to use Linux for my irc server, but performance sucks.]
> > >     1) Someone is going to have to recode the ircd source we use and
> > > possibly a modified kernel in the *hope* that performance improves.
> > >     2) Convert the box to FreeBSD which seems to have a better poll()
> > > implementation, and where I could support 8K clients easily as other
> > > admins on my chat network do already....
> >
> > No need to use a modified kernel; plain old 2.4.18 or so should do
> > fine, it supports the rtsig stuff.  But yeah, you may want to
> > see if the core of ircd can be recoded.  Can you give me the URL
> > for the source of the version you use?  I can peek at it.
> > It only took me two days to recode betaftpd to use Poller...
> 
> http://dev-com.b2irc.net/ : Undernet's IRCD + Lain 1.1.2 patch

Hmm.  Have a look at
http://www.mail-archive.com/coder-com@undernet.org/msg00060.html
It looks like the mainline Undernet ircd was rewritten around May 2001
to support high efficiency techniques like /dev/poll and kqueue.
The source you pointed to is way behind Undernet's current sources.

Undernet's ircd has engine_{select,poll,devpoll,kqueue}.c, 
but not yet an engine_rtsig.c, as far as I know.
If you want ircd to handle zillions of simultaneous connections
on a stock 2.4 Linux kernel, rtsignals are the way to go at the
moment.  What's needed is to write ircd's engine_rtsig.c, and 
modify ircd's os_linux.c to notice EWOULDBLOCK
return values and feed them to engine_rtsig.c (that's the icky
part about the way linux currently does this kind of event 
notification - signals are used for 'I'm ready now', but return
values from I/O functions are where you learn 'I'm no longer ready').
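
A minimal sketch of the rtsig plumbing described above (assumptions: the
helper names `arm_rtsig` / `wait_ready` are hypothetical, not actual ircd
code, and the chosen realtime signal must be blocked before arming so
events queue up for sigwaitinfo()):

```c
#define _GNU_SOURCE          /* for F_SETSIG */
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Route readiness events for fd to a queued realtime signal instead of
 * plain SIGIO; the siginfo then carries which fd became ready. */
int arm_rtsig(int fd, int signum)
{
    if (fcntl(fd, F_SETOWN, getpid()) == -1) return -1;
    if (fcntl(fd, F_SETSIG, signum) == -1) return -1;
    int flags = fcntl(fd, F_GETFL, 0);
    return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK);
}

/* Dequeue one readiness event; signum must already be blocked.
 * Returns the ready fd, or -1 on error. */
int wait_ready(int signum)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signum);
    siginfo_t si;
    if (sigwaitinfo(&set, &si) == -1) return -1;
    return si.si_fd;     /* filled in because of F_SETSIG */
}
```

An engine_rtsig.c would pair this with the EWOULDBLOCK bookkeeping in
os_linux.c described above, plus a poll() fallback for when the signal
queue overflows and the kernel drops back to SIGIO.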

So I dunno if I'm going to go ahead and do that myself, but at least I've
scoped out the situation.  Before I did any work, I'd measure CPU
usage under a simulated load of 2000 clients, just to verify that
poll() was indeed a bottleneck (ok, can't imagine it not being a
bottleneck, but it's nice to have a baseline to compare the improved
version against).
- Dan


* Re: PROBLEM: high system usage / poor SMP network performance
  2002-02-03  8:03       ` Dan Kegel
@ 2002-02-03  8:36         ` Andrew Morton
  2002-02-12 18:48           ` Vincent Sweeney
  2002-02-03 19:22         ` Kev
  1 sibling, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2002-02-03  8:36 UTC (permalink / raw)
  To: Dan Kegel
  Cc: Vincent Sweeney, linux-kernel@vger.kernel.org,
	coder-com@undernet.org, Kevin L. Mitchell

Dan Kegel wrote:
> 
> Before I did any work, I'd measure CPU
> usage under a simulated load of 2000 clients, just to verify that
> poll() was indeed a bottleneck (ok, can't imagine it not being a
> bottleneck, but it's nice to have a baseline to compare the improved
> version against).

I half-did this earlier in the week.  It seems that Vincent's
machine is calling poll() maybe 100 times/second.  Each call
is taking maybe 10 milliseconds, and is returning approximately
one measly little packet.

select and poll suck for thousands of fds.  Always did, always
will.  Applications need to work around this.

And the workaround is rather simple:

	....
+	usleep(100000);
	poll(...);

This will add up to 0.1 seconds latency, but it means that
the poll will gather activity on ten times as many fds,
and that it will be called ten times less often, and that
CPU load will fall by a factor of ten.

This seems an appropriate hack for an IRC server.  I guess it
could be souped up a bit:

	usleep(nr_fds * 50);

-


* Re: PROBLEM: high system usage / poor SMP network performance
  2002-02-03  8:03       ` Dan Kegel
  2002-02-03  8:36         ` Andrew Morton
@ 2002-02-03 19:22         ` Kev
  1 sibling, 0 replies; 15+ messages in thread
From: Kev @ 2002-02-03 19:22 UTC (permalink / raw)
  To: Dan Kegel
  Cc: Vincent Sweeney, linux-kernel@vger.kernel.org,
	coder-com@undernet.org, Kevin L. Mitchell

> Hmm.  Have a look at
> http://www.mail-archive.com/coder-com@undernet.org/msg00060.html
> It looks like the mainline Undernet ircd was rewritten around May 2001
> to support high efficiency techniques like /dev/poll and kqueue.
> The source you pointed to is way behind Undernet's current sources.

This code is still in beta testing, by the way.  It's certainly not the
prettiest way of doing it, though, and I've started working on a new
implementation of the basic idea in a library, which I will then use in
a future version of Undernet's ircd.

> Undernet's ircd has engine_{select,poll,devpoll,kqueue}.c, 
> but not yet an engine_rtsig.c, as far as I know.
> If you want ircd to handle zillions of simultaneous connections
> on a stock 2.4 Linux kernel, rtsignals are the way to go at the
> moment.  What's needed is to write ircd's engine_rtsig.c, and 
> modify ircd's os_linux.c to notice EWOULDBLOCK
> return values and feed them to engine_rtsig.c (that's the icky
> part about the way linux currently does this kind of event 
> notification - signals are used for 'I'm ready now', but return
> values from I/O functions are where you learn 'I'm no longer ready').

I haven't examined the usage of the realtime signals stuff, but I did
originally choose not to bother with it.  It may be possible to set up
an engine that uses it, and if anyone gets it working, I sure wouldn't
mind seeing the patches.  Still, I'd say that the best bet is probably
to either use the /dev/poll patch for linux, or grab the /dev/epoll patch
and implement a new engine to use it.  (I should note that I haven't tried
either of these patches, yet, so YMMV.)
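
For context, the /dev/epoll patch mentioned here later evolved into the
epoll system calls; an engine against that interface reduces to something
like the following sketch (the `engine_*` names are hypothetical, echoing
ircd's engine_*.c convention, not actual ircd code; `epoll_create1` is the
later spelling, older kernels use `epoll_create(size)`):

```c
#include <sys/epoll.h>

/* Register a connection for level-triggered read events. */
int engine_add(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Fetch up to `max` ready events.  Unlike poll(), the cost scales
 * with the number of *ready* fds, not the number being watched --
 * the interest set lives in the kernel between calls. */
int engine_wait(int epfd, struct epoll_event *evs, int max, int timeout_ms)
{
    return epoll_wait(epfd, evs, max, timeout_ms);
}
```

Because the interest set stays in the kernel, per-call cost tracks ready
fds rather than total connections, which is what the load numbers above
suggest for /dev/poll as well.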

> So I dunno if I'm going to go ahead and do that myself, but at least I've
> scoped out the situation.  Before I did any work, I'd measure CPU
> usage under a simulated load of 2000 clients, just to verify that
> poll() was indeed a bottleneck (ok, can't imagine it not being a
> bottleneck, but it's nice to have a baseline to compare the improved
> version against).

I'm very certain that poll() is a bottleneck in any piece of software like
ircd.  I have some preliminary data which suggests that not only does the
/dev/poll engine reduce the load averages, but that it scales much better:
Load averages on that beta test server dropped from about 1.30 to about
0.30 for the same number of clients, and adding more clients increases the
load much less than under the previous version using poll().  Of course,
I haven't compared loads under the same server version with two different
engines--it's possible other changes we made have resulted in much of that
load difference.

I should probably note that the beta test server I am referring to is running
Solaris; I have not tried to use the Linux /dev/poll patch as of yet...
-- 
Kevin L. Mitchell <klmitch@mit.edu>



* Re: PROBLEM: high system usage / poor SMP network performance
  2002-02-03  8:36         ` Andrew Morton
@ 2002-02-12 18:48           ` Vincent Sweeney
  0 siblings, 0 replies; 15+ messages in thread
From: Vincent Sweeney @ 2002-02-12 18:48 UTC (permalink / raw)
  To: Andrew Morton, Dan Kegel; +Cc: linux-kernel, coder-com, Dan Kegel

Well I've recoded the poll() section in the ircu code base as follows:

Instead of the default :

    ...
    nfds = poll(poll_fds, pfd_count, timeout);
    ...

we now have

    ...
    nfds = poll(poll_fds, pfd_count, 0);
    if (nfds == 0) {
      usleep(1000000 / 10); /* sleep 1/10 second */
      nfds = poll(poll_fds, pfd_count, timeout);
    }
    ...

And as the 'top' results now show, instead of maxing out a dual P3-800 we now
only use a fraction of that, without any noticeable side effects.

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
14684 ircd      15   0 81820  79M   800 S    22.5 21.2 215:39 ircd
14691 ircd      12   0 80716  78M   800 S    21.1 20.9 212:22 ircd


Vince.

----- Original Message -----
From: "Andrew Morton" <akpm@zip.com.au>
To: "Dan Kegel" <dank@kegel.com>
Cc: "Vincent Sweeney" <v.sweeney@barrysworld.com>;
<linux-kernel@vger.kernel.org>; <coder-com@undernet.org>; "Kevin L.
Mitchell" <klmitch@mit.edu>
Sent: Sunday, February 03, 2002 8:36 AM
Subject: Re: PROBLEM: high system usage / poor SMP network performance


> Dan Kegel wrote:
> >
> > Before I did any work, I'd measure CPU
> > usage under a simulated load of 2000 clients, just to verify that
> > poll() was indeed a bottleneck (ok, can't imagine it not being a
> > bottleneck, but it's nice to have a baseline to compare the improved
> > version against).
>
> I half-did this earlier in the week.  It seems that Vincent's
> machine is calling poll() maybe 100 times/second.  Each call
> is taking maybe 10 milliseconds, and is returning approximately
> one measly little packet.
>
> select and poll suck for thousands of fds.  Always did, always
> will.  Applications need to work around this.
>
> And the workaround is rather simple:
>
> ....
> + usleep(100000);
> poll(...);
>
> This will add up to 0.1 seconds latency, but it means that
> the poll will gather activity on ten times as many fds,
> and that it will be called ten times less often, and that
> CPU load will fall by a factor of ten.
>
> This seems an appropriate hack for an IRC server.  I guess it
> could be souped up a bit:
>
> usleep(nr_fds * 50);
>
> -
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>



end of thread, other threads:[~2002-02-12 18:48 UTC | newest]

Thread overview: 15+ messages
-- links below jump to the message on this page --
2002-01-27 22:23 PROBLEM: high system usage / poor SMP network performance Vincent Sweeney
2002-01-27 22:42 ` Andrew Morton
2002-01-27 22:54 ` Alan Cox
2002-01-27 22:52   ` arjan
2002-01-27 23:08   ` Vincent Sweeney
2002-01-28 19:34   ` Vincent Sweeney
2002-01-28 19:40     ` Rik van Riel
2002-01-29 16:32       ` Vincent Sweeney
  -- strict thread matches above, loose matches on Subject: below --
2002-01-29 18:00 Dan Kegel
2002-01-29 20:09 ` Vincent Sweeney
2002-01-31  5:24   ` Dan Kegel
     [not found]     ` <001d01c1aa8e$2e067e60$0201010a@frodo>
2002-02-03  8:03       ` Dan Kegel
2002-02-03  8:36         ` Andrew Morton
2002-02-12 18:48           ` Vincent Sweeney
2002-02-03 19:22         ` Kev
