Linux Newbie help
* debate on 700 threads vs asynchronous code
@ 2003-01-23 23:19 Lee Chin
  2003-01-23 23:28 ` Larry McVoy
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Lee Chin @ 2003-01-23 23:19 UTC (permalink / raw)
  To: linux-kernel, linux-newbie

Hi
I have been discussing different approaches to a scaling problem with a few people, and I have gotten vastly different views.

In a nutshell, as far as this debate is concerned, I can say I am writing a web server.

Now, to cater to 700 clients, I can
a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)

OR

b) Write an asynchronous system with only two or three threads, where I manage the connections and stacks myself (via setcontext, swapcontext, etc.), which is programmatically a little harder

Which way will yield better performance, assuming both approaches are implemented optimally?

Thanks
Lee

-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs


* Re: debate on 700 threads vs asynchronous code
  2003-01-23 23:19 Lee Chin
@ 2003-01-23 23:28 ` Larry McVoy
  2003-01-23 23:31 ` Ben Greear
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Larry McVoy @ 2003-01-23 23:28 UTC (permalink / raw)
  To: Lee Chin; +Cc: linux-kernel, linux-newbie

> b) Write an asynchronous system with only two or three threads, where I manage the connections and stacks myself (via setcontext, swapcontext, etc.), which is programmatically a little harder
> 
> Which way will yield better performance, assuming both approaches are implemented optimally?

If this is a serious question, an async system will by definition do better.
You have either 700 stacks screwing up the data cache or 2-3 stacks nicely
fitting in the data cache.  Ditto for instruction cache, etc.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


* Re: debate on 700 threads vs asynchronous code
  2003-01-23 23:19 Lee Chin
  2003-01-23 23:28 ` Larry McVoy
@ 2003-01-23 23:31 ` Ben Greear
  2003-01-27  9:48 ` Terje Eggestad
  2003-01-27 22:08 ` Bill Davidsen
  3 siblings, 0 replies; 9+ messages in thread
From: Ben Greear @ 2003-01-23 23:31 UTC (permalink / raw)
  To: Lee Chin; +Cc: linux-kernel, linux-newbie

Lee Chin wrote:
> Hi
> I have been discussing different approaches to a scaling problem with a few people, and I have gotten vastly different views.
> 
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
> 
> Now, to cater to 700 clients, I can
> a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)
> 
> OR
> 
> b) Write an asynchronous system with only two or three threads, where I manage the connections and stacks myself (via setcontext, swapcontext, etc.), which is programmatically a little harder

You could also write something with async non-blocking I/O and use NO threads
(i.e., just a single process), which may greatly simplify debugging your
program (unless the developers on your project are already very good at
threaded programming).

I suspect the async I/O will perform better as well, but that is just an
unfounded opinion based on not wanting to think about scheduling 700 processes
that want to do I/O :)
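
A minimal sketch of the single-process, no-threads approach described above. It assumes Python's standard selectors module as a stand-in for a select()/poll() loop written in C; the function names (make_listener, poll_once, serve_forever), the port number, and the echo behaviour are illustrative assumptions, not anything from the thread.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Return a non-blocking listening socket; port 0 lets the OS choose."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    sock.setblocking(False)
    return sock


def poll_once(sel, timeout=0.1):
    """Handle whichever sockets are ready right now; return bytes echoed."""
    echoed = 0
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        if key.data == "listener":
            # A new client is waiting: accept it and watch it for input.
            try:
                conn, _addr = sock.accept()
            except BlockingIOError:
                continue  # the client went away before we accepted it
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data="client")
            continue
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            continue  # spurious wakeup, nothing to read after all
        except ConnectionError:
            data = b""
        if not data:
            # Peer closed (or reset) the connection: stop watching it.
            sel.unregister(sock)
            sock.close()
            continue
        # Simplification: assumes the reply fits in the socket send buffer.
        # A real server would queue the remainder and wait for EVENT_WRITE.
        echoed += sock.send(data)
    return echoed


def serve_forever(host="127.0.0.1", port=8080):
    """Run the whole echo server in one process and one thread."""
    sel = selectors.DefaultSelector()
    sel.register(make_listener(host, port), selectors.EVENT_READ, data="listener")
    while True:
        poll_once(sel)
```

Calling serve_forever() would serve every client from one loop. Nothing ever blocks on a single connection, so there is no locking to debug.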

> 
> Which way will yield better performance, assuming both approaches are implemented optimally?
> 
> Thanks
> Lee


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear



* Re: debate on 700 threads vs asynchronous code
@ 2003-01-24  0:07 Lee Chin
  0 siblings, 0 replies; 9+ messages in thread
From: Lee Chin @ 2003-01-24  0:07 UTC (permalink / raw)
  To: lm, leechin; +Cc: linux-kernel, linux-newbie

Hi,
Thanks for the reply. My question was more: with setcontext and swapcontext, I will still be messing with the data cache, right?

In other words, as long as I have an async system without setcontext, I know I am good. But with it, haven't I degraded to a threaded environment?
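
To make the question concrete, here is a hedged sketch of user-level context switching. Python has no setcontext/swapcontext, so generators are swapped in as a stand-in: each generator keeps its own saved frame, much as each swapcontext-managed connection keeps its own stack. The names connection_task and run_round_robin are hypothetical. The sketch only shows that per-connection saved state still exists under this scheme; it measures nothing about cache behaviour.

```python
from collections import deque


def connection_task(conn_id, requests, log):
    """One cooperative task per connection; its locals live in its own frame."""
    served = 0  # per-connection state kept on the task's private "stack"
    for request in requests:
        yield  # pretend to wait for I/O: hand control back to the scheduler
        served += 1
        log.append((conn_id, request, served))


def run_round_robin(tasks):
    """Resume each task in turn until all have finished; count the switches."""
    queue = deque(tasks)
    switches = 0
    while queue:
        task = queue.popleft()
        try:
            next(task)  # user-level "context switch" into the task
        except StopIteration:
            continue  # task finished; its saved frame is released
        switches += 1
        queue.append(task)
    return switches
```

With 700 connections there would be 700 suspended frames, even though only one operating-system thread is running them.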

Thanks
Lee

* Re: debate on 700 threads vs asynchronous code
  2003-01-23 23:19 Lee Chin
  2003-01-23 23:28 ` Larry McVoy
  2003-01-23 23:31 ` Ben Greear
@ 2003-01-27  9:48 ` Terje Eggestad
  2003-01-27 21:48   ` Bill Davidsen
  2003-01-27 22:08 ` Bill Davidsen
  3 siblings, 1 reply; 9+ messages in thread
From: Terje Eggestad @ 2003-01-27  9:48 UTC (permalink / raw)
  To: Lee Chin; +Cc: linux-kernel, linux-newbie

Apart from the arguments already given in other replies, you should
keep in mind that you probably need to give priority to receiving.
That includes your clients; if you don't, you run the risk of
significantly limiting your bandwidth, because the send queues around your
system fill up.

Try doing that with threads. 


Actually, I would recommend approach c):

c)  Write an asynchronous system with only two or three threads where I
manage the connections and keep the state of each connection in a data
structure.
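
A rough sketch of what approach c) could look like, using Python's selectors module in place of a C select() loop. The Connection record, the line-based protocol, and the upper-casing handle_line handler are all assumptions made for illustration. Each connection's state lives in a plain data structure rather than on a saved stack, and every pass of the loop services all receives before any sends.

```python
import selectors
import socket
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Everything the server remembers about one client between events."""
    sock: socket.socket
    inbuf: bytearray = field(default_factory=bytearray)   # received, not yet parsed
    outbuf: bytearray = field(default_factory=bytearray)  # queued, not yet sent
    closed: bool = False


def handle_line(line):
    """Hypothetical request handler: answer each line in upper case."""
    return line.upper() + b"\n"


def on_readable(conn):
    """Read what is available, then queue a reply for every complete line."""
    try:
        chunk = conn.sock.recv(4096)
    except BlockingIOError:
        return
    except ConnectionError:
        chunk = b""
    if not chunk:
        conn.closed = True
        return
    conn.inbuf += chunk
    while b"\n" in conn.inbuf:
        line, _, rest = bytes(conn.inbuf).partition(b"\n")
        conn.inbuf = bytearray(rest)
        conn.outbuf += handle_line(line)


def on_writable(conn):
    """Send as much queued output as the kernel accepts right now."""
    if not conn.outbuf:
        return
    try:
        sent = conn.sock.send(conn.outbuf)
    except BlockingIOError:
        return
    except ConnectionError:
        conn.closed = True
        return
    del conn.outbuf[:sent]


def poll_once(sel, timeout=0.1):
    """One pass of the event loop: all receives first, then sends."""
    ready = sel.select(timeout)
    for key, events in ready:  # pass 1: receive
        if events & selectors.EVENT_READ:
            on_readable(key.data)
    for key, _events in ready:  # pass 2: send and update interest
        conn = key.data
        if not conn.closed:
            on_writable(conn)
        if conn.closed:
            sel.unregister(conn.sock)
            conn.sock.close()
            continue
        wanted = selectors.EVENT_READ
        if conn.outbuf:
            wanted |= selectors.EVENT_WRITE  # ask to be woken when sendable
        sel.modify(conn.sock, wanted, data=conn)
```

Write interest is only requested while output is pending, so an idle connection does not wake the loop.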


-- 
_________________________________________________________________________

Terje Eggestad                  mailto:terje.eggestad@scali.no
Scali Scalable Linux Systems    http://www.scali.com

Olaf Helsets Vei 6              tel:    +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal                     +47 975 31 574  (MOBILE)
N-0619 Oslo                     fax:    +47 22 62 89 51
NORWAY            
_________________________________________________________________________


* Re: debate on 700 threads vs asynchronous code
  2003-01-27  9:48 ` Terje Eggestad
@ 2003-01-27 21:48   ` Bill Davidsen
  0 siblings, 0 replies; 9+ messages in thread
From: Bill Davidsen @ 2003-01-27 21:48 UTC (permalink / raw)
  To: Terje Eggestad; +Cc: Lee Chin, linux-kernel, linux-newbie

On 27 Jan 2003, Terje Eggestad wrote:

> Apart from the arguments already given in other replies, you should
> keep in mind that you probably need to give priority to receiving.
> That includes your clients; if you don't, you run the risk of
> significantly limiting your bandwidth, because the send queues around your
> system fill up.
> 
> Try doing that with threads.

Okay, I'm running my usenet exchange machines on Linux with Earthquake,
one thread per socket, 300-500 sockets, 700-800GB/day with incoming rate
spikes to 130Mbit on two 100Mbit NICs. What is it I'm supposed to try
doing with threads?

And if this is a webserver or anything like it, the incoming bandwidth is
probably orders of magnitude below the outgoing... Hum, like a usenet
reader server. Below, from a Linux box running Twister, also threaded per
feed in and per reader socket out.

load  free  buffs  swap  pgin  pgou  dk0  dk1  dk2  dk3  ipkt  opkt    int    ctx  usr  sys  idl  i_netK  o_netK
2.98   5.0   1807   0.0   544  2220   71   66   21    0  6173  3390   9600  17983    3   17   80  7170.8   941.9
4.77   4.5   1805   0.0  1117  6267   39  134  134    0  5403  3212   8780  20663    8   34   58  6645.4   978.9
2.35   4.3   1802   0.0  1529  6900   37  176  189    0  6134  3648  10007  18492    9   25   66  7470.4  1087.9
1.10   4.8   1800   0.0  1428  5609   33  149  150    0  5871  3447   9505  18028    9   25   66  7235.2   961.0
1.38   6.7   1798   0.0   970  6671   34  139  134    0  6250  3685  10051  20210    9   26   65  7503.4  1088.8
6.57   5.0   1797   0.0  1589  7673   89  184  188    0  5912  3571   9732  20165    8   33   59  7003.7  1169.3
2.30   4.6   1799   0.0  1648  5900   44  154  146    0  6539  3998  10660  17975    9   27   64  7631.0  1382.6

Forgive the formatting, it kind of breaks with larger numbers...

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


* Re: debate on 700 threads vs asynchronous code
  2003-01-23 23:19 Lee Chin
                   ` (2 preceding siblings ...)
  2003-01-27  9:48 ` Terje Eggestad
@ 2003-01-27 22:08 ` Bill Davidsen
  3 siblings, 0 replies; 9+ messages in thread
From: Bill Davidsen @ 2003-01-27 22:08 UTC (permalink / raw)
  To: Lee Chin; +Cc: linux-kernel, linux-newbie

On Thu, 23 Jan 2003, Lee Chin wrote:

> I am discussing with a few people on different approaches to solving a
> scale problem I am having, and have gotten vastly different views
> 
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
> 
> Now, to cater to 700 clients, I can a) launch 700 threads that each
> block on I/O to disk and to the client (in reading and writing on the
> socket) 
> 
> OR
> 
> b) Write an asynchronous system with only two or three threads, where I
> manage the connections and stacks myself (via setcontext, swapcontext,
> etc.), which is programmatically a little harder

There are many other ways, involving use of async io for disk and select
on some limited number of sockets per thread. If you want to wallow in
analysis paralysis you can certainly do it. Take a look at existing
usenet, mail, web and dns servers and you will see a number of ways to
attack this problem, and correctly implemented most of them work fine.

I believe Ingo mentioned some huge number of practical threads when he was
first talking about the latest thread library. If you believe it, or if
you really will be happy at 700 tasks per server, then thread per socket
is the easiest to implement, at least IMHO.
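
For comparison, a minimal sketch of the thread-per-socket design, assuming Python's threading and socket modules in place of pthreads and C sockets. The echo behaviour, the port, and the names handle_client, spawn_handler and serve_forever are illustrative assumptions. Each handler is ordinary blocking code, which is what makes this style easy to write.

```python
import socket
import threading


def handle_client(conn):
    """Runs in its own thread and simply blocks on I/O for one client."""
    with conn:
        while True:
            data = conn.recv(4096)  # blocks until the client sends or closes
            if not data:
                break
            conn.sendall(data)  # blocks until the kernel has taken all of it


def spawn_handler(conn):
    """Start a dedicated daemon thread for one connected socket."""
    thread = threading.Thread(target=handle_client, args=(conn,), daemon=True)
    thread.start()
    return thread


def serve_forever(host="127.0.0.1", port=8080):
    """Accept loop: one new thread per accepted connection."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _addr = listener.accept()
            spawn_handler(conn)
```

With 700 clients this would mean 700 threads, most of them asleep in recv() at any moment. Whether that is acceptable depends on the thread implementation and the workload, which is the point being made above.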

I'm using various news software which does most combinations of threading,
select, and even full processes per client, and none of them strike me as
being inherently better (as opposed to some being better implementations). 
Ask Ingo how many threads you can really run in six months when the new
kernel and thread bits are more stable, that's the only scaling bit I
can't even guess. Pick one method, write code. I believe implementation
will be more important than method, unless you make a *really* bad choice.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


* Re: debate on 700 threads vs asynchronous code
@ 2003-01-29 17:26 Lee Chin
  2003-01-30  9:36 ` Terje Eggestad
  0 siblings, 1 reply; 9+ messages in thread
From: Lee Chin @ 2003-01-29 17:26 UTC (permalink / raw)
  To: terje.eggestad, leechin; +Cc: linux-kernel, linux-newbie

Today I do method (c), but many people seem to say that pthreads does almost exactly that, with a constant memory overhead for remembering the stack of each blocked thread. So there is no time difference; pthreads just consumes slightly more memory. That is the issue I am trying to get my head around.

No one has answered that particular question. In Linux, the scheduler will not go crazy trying to schedule processes that are all waiting on I/O. The only time I see threads degrading is when all of them are runnable. In that case an async scheme with two threads would let each task run to completion without thrashing the kernel. Is that correct to say?


* Re: debate on 700 threads vs asynchronous code
  2003-01-29 17:26 debate on 700 threads vs asynchronous code Lee Chin
@ 2003-01-30  9:36 ` Terje Eggestad
  0 siblings, 0 replies; 9+ messages in thread
From: Terje Eggestad @ 2003-01-30  9:36 UTC (permalink / raw)
  To: Lee Chin; +Cc: linux-kernel, linux-newbie

On Wed, 2003-01-29 at 18:26, Lee Chin wrote:
> Today I do method (c), but many people seem to say that pthreads
> does almost exactly that, with a constant memory overhead for
> remembering the stack of each blocked thread. So there is no time
> difference; pthreads just consumes slightly more memory.  That is
> the issue I am trying to get my head around.
> 
> No one has answered that particular question. In Linux, the
> scheduler will not go crazy trying to schedule processes that
> are all waiting on I/O.  The only time I see threads degrading
> is when all of them are runnable. In that case an async scheme with
> two threads would let each task run to completion without thrashing
> the kernel.  Is that correct to say?


Yes.

You can add that if you have many runnable threads, there will be
extra overhead from context switching.


-- 
_________________________________________________________________________

Terje Eggestad                  mailto:terje.eggestad@scali.no
Scali Scalable Linux Systems    http://www.scali.com

Olaf Helsets Vei 6              tel:    +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal                     +47 975 31 574  (MOBILE)
N-0619 Oslo                     fax:    +47 22 62 89 51
NORWAY            
_________________________________________________________________________

