* debate on 700 threads vs asynchronous code
@ 2003-01-23 23:19 Lee Chin
2003-01-23 23:28 ` Larry McVoy
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Lee Chin @ 2003-01-23 23:19 UTC (permalink / raw)
To: linux-kernel, linux-newbie
Hi
I am discussing different approaches to a scaling problem with a few people, and have gotten vastly different views.
In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
Now, to cater to 700 clients, I can
a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)
OR
b) Write an asynchronous system with only two or three threads, where I manage the connections and stacks myself (via setcontext, swapcontext, etc.), which is programmatically a little harder
Which way will yield better performance, considering both approaches are implemented optimally?
Thanks
Lee
-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs
* Re: debate on 700 threads vs asynchronous code
2003-01-23 23:19 debate on 700 threads vs asynchronous code Lee Chin
@ 2003-01-23 23:28 ` Larry McVoy
2003-01-23 23:31 ` Ben Greear
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Larry McVoy @ 2003-01-23 23:28 UTC (permalink / raw)
To: Lee Chin; +Cc: linux-kernel, linux-newbie
> b) Write an asynchronous system with only two or three threads, where I manage the connections and stacks myself (via setcontext, swapcontext, etc.), which is programmatically a little harder
>
> Which way will yield better performance, considering both approaches are implemented optimally?
If this is a serious question, an async system will by definition do better.
You have either 700 stacks screwing up the data cache or 2-3 stacks nicely
fitting in the data cache. Ditto for instruction cache, etc.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: debate on 700 threads vs asynchronous code
2003-01-23 23:19 debate on 700 threads vs asynchronous code Lee Chin
2003-01-23 23:28 ` Larry McVoy
@ 2003-01-23 23:31 ` Ben Greear
2003-01-27 9:48 ` Terje Eggestad
2003-01-27 22:08 ` Bill Davidsen
3 siblings, 0 replies; 9+ messages in thread
From: Ben Greear @ 2003-01-23 23:31 UTC (permalink / raw)
To: Lee Chin; +Cc: linux-kernel, linux-newbie
Lee Chin wrote:
> Hi
> I am discussing different approaches to a scaling problem with a few people, and have gotten vastly different views.
>
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
>
> Now, to cater to 700 clients, I can
> a) launch 700 threads that each block on I/O to disk and to the client (in reading and writing on the socket)
>
> OR
>
> b) Write an asynchronous system with only two or three threads, where I manage the connections and stacks myself (via setcontext, swapcontext, etc.), which is programmatically a little harder
You could also write something with async non-blocking IO and use NO threads
(i.e., just a single process), which may greatly simplify the debugging of
your program (unless the developers on your project are already very good at
threaded programming).
I suspect the async IO will perform better as well, but that is just an
unfounded opinion based on not wanting to think about scheduling 700 processes
that want to do IO :)
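
A minimal sketch of the single-process, non-blocking approach described above, written in Python with the standard `selectors` module rather than raw select(). The function name `serve_echo_once` and the "reply with the upper-cased request" protocol are hypothetical, chosen only to keep the example short; a real server would also buffer partial writes.

```python
import selectors
import socket


def serve_echo_once(conns, timeout=1.0):
    """Answer one request per connection from a single thread.

    `conns` is a list of already-connected sockets (stand-ins for
    accepted clients). Returns the number of connections served.
    """
    sel = selectors.DefaultSelector()
    for sock in conns:
        sock.setblocking(False)               # never sleep inside recv/send
        sel.register(sock, selectors.EVENT_READ)

    served = 0
    while served < len(conns):
        events = sel.select(timeout)          # the only place we block
        if not events:
            break                             # nothing arrived in time
        for key, _mask in events:
            sock = key.fileobj
            data = sock.recv(4096)
            if data:
                # Assumption: replies are small enough to fit in the
                # kernel send buffer in one call.
                sock.send(data.upper())
            sel.unregister(sock)
            served += 1
    sel.close()
    return served
```

There is one stack and one blocking point (the `select` call), which is what makes this style easier to step through in a debugger than 700 interleaved threads.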
>
> Which way will yield better performance, considering both approaches are implemented optimally?
>
> Thanks
> Lee
--
Ben Greear <greearb@candelatech.com> <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear
* Re: debate on 700 threads vs asynchronous code
@ 2003-01-24 0:07 Lee Chin
0 siblings, 0 replies; 9+ messages in thread
From: Lee Chin @ 2003-01-24 0:07 UTC (permalink / raw)
To: lm, leechin; +Cc: linux-kernel, linux-newbie
Hi,
Thanks for the reply... my question was more this: with setcontext and swapcontext, I will still be messing with the data cache, right?
In other words, as long as I have an async system without setcontext, I know I am good... but with it, haven't I degraded to a threaded environment?
Thanks
Lee
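
To make the question concrete, here is a rough analogy in Python, not a measurement. Generators stand in for setcontext/swapcontext: each suspended generator keeps its own saved frame, much as each user-level context keeps its own stack. The names `handle` and `run_round_robin` are hypothetical. The sketch shows that the per-connection saved state does not go away with user-level switching; what changes is that every switch happens in user space at a point the program chooses.

```python
from collections import deque


def handle(conn_id, chunks, log):
    """One logical connection; each `yield` is a voluntary switch point."""
    for chunk in chunks:
        log.append((conn_id, chunk))   # pretend to process one chunk of I/O
        yield                          # give up the CPU; our frame stays alive


def run_round_robin(tasks):
    """Resume each task in turn until all finish; return the switch count."""
    ready = deque(tasks)
    switches = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)                 # resume where the task last yielded
        except StopIteration:
            continue                   # task finished; drop its saved frame
        switches += 1
        ready.append(task)
    return switches


log = []
count = run_round_robin([handle(1, ["a", "b"], log), handle(2, ["x"], log)])
print(log, count)
```

With 700 such tasks there are still 700 saved contexts in memory, so how much data each one touches when resumed matters as much as whether the kernel or the program does the switching.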
* Re: debate on 700 threads vs asynchronous code
2003-01-23 23:19 debate on 700 threads vs asynchronous code Lee Chin
2003-01-23 23:28 ` Larry McVoy
2003-01-23 23:31 ` Ben Greear
@ 2003-01-27 9:48 ` Terje Eggestad
2003-01-27 21:48 ` Bill Davidsen
2003-01-27 22:08 ` Bill Davidsen
3 siblings, 1 reply; 9+ messages in thread
From: Terje Eggestad @ 2003-01-27 9:48 UTC (permalink / raw)
To: Lee Chin; +Cc: linux-kernel, linux-newbie
Apart from the arguments already given in other replies, you should
keep in mind that you probably need to give priority to doing receives.
That includes your clients. If you don't, you run the risk of
significantly limiting your bandwidth, since the send queues around your
system fill up.
Try doing that with threads.
Actually, I would recommend an approach c):
c) Write an asynchronous system with only two or three threads where I
manage the connections and keep the state of each connection in a data
structure.
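
A minimal sketch of approach c) in Python, under stated assumptions: the `Conn` record, the `poll_once` function, and the toy protocol (each newline-terminated request is answered with its length) are all hypothetical. It keeps every connection's state in a plain data structure and, in each pass, drains all readable sockets before attempting any send, which is one way to give receives priority.

```python
import selectors
import socket
from dataclasses import dataclass, field


@dataclass
class Conn:
    """All per-connection state lives here instead of on a thread stack."""
    sock: socket.socket
    inbuf: bytearray = field(default_factory=bytearray)
    outbuf: bytearray = field(default_factory=bytearray)


def poll_once(sel, timeout=0.1):
    """One loop iteration: every pending receive first, then the sends."""
    # Pass 1: receive from every readable socket.
    for key, _mask in sel.select(timeout):
        conn = key.data
        data = conn.sock.recv(4096)
        if not data:                          # peer closed the connection
            sel.unregister(conn.sock)
            conn.sock.close()
            continue
        conn.inbuf += data
        while b"\n" in conn.inbuf:            # complete request(s) buffered
            line, _sep, rest = conn.inbuf.partition(b"\n")
            conn.inbuf = rest
            conn.outbuf += b"%d\n" % len(line)

    # Pass 2: only now push out whatever replies are queued.
    for key in list(sel.get_map().values()):
        conn = key.data
        if conn.outbuf:
            try:
                sent = conn.sock.send(conn.outbuf)
            except BlockingIOError:
                continue                      # send queue full; retry later
            del conn.outbuf[:sent]
```

A connection with a half-received request simply sits in `inbuf` until more data arrives; no thread is parked waiting for it.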
--
_________________________________________________________________________
Terje Eggestad mailto:terje.eggestad@scali.no
Scali Scalable Linux Systems http://www.scali.com
Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal +47 975 31 574 (MOBILE)
N-0619 Oslo fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________
* Re: debate on 700 threads vs asynchronous code
2003-01-27 9:48 ` Terje Eggestad
@ 2003-01-27 21:48 ` Bill Davidsen
0 siblings, 0 replies; 9+ messages in thread
From: Bill Davidsen @ 2003-01-27 21:48 UTC (permalink / raw)
To: Terje Eggestad; +Cc: Lee Chin, linux-kernel, linux-newbie
On 27 Jan 2003, Terje Eggestad wrote:
> Apart from the arguments already given in other replies, you should
> keep in mind that you probably need to give priority to doing receives.
> That includes your clients. If you don't, you run the risk of
> significantly limiting your bandwidth, since the send queues around your
> system fill up.
>
> Try doing that with threads.
Okay, I'm running my usenet exchange machines on Linux with Earthquake,
one thread per socket, 300-500 sockets, 700-800GB/day with incoming rate
spikes to 130Mbit on two 100Mbit NICs. What is it I'm supposed to try
doing with threads?
And if this is a webserver or anything like it, the incoming bandwidth is
probably orders of magnitude below the outgoing... Hum, like a usenet
reader server. Below, from a Linux box running Twister, also threaded per
feed in and per reader socket out.
load  free  buffs  swap  pgin  pgou  dk0  dk1  dk2  dk3  ipkt  opkt    int    ctx  usr  sys  idl  i_netK  o_netK
2.98   5.0   1807   0.0   544  2220   71   66   21    0  6173  3390   9600  17983    3   17   80  7170.8   941.9
4.77   4.5   1805   0.0  1117  6267   39  134  134    0  5403  3212   8780  20663    8   34   58  6645.4   978.9
2.35   4.3   1802   0.0  1529  6900   37  176  189    0  6134  3648  10007  18492    9   25   66  7470.4  1087.9
1.10   4.8   1800   0.0  1428  5609   33  149  150    0  5871  3447   9505  18028    9   25   66  7235.2   961.0
1.38   6.7   1798   0.0   970  6671   34  139  134    0  6250  3685  10051  20210    9   26   65  7503.4  1088.8
6.57   5.0   1797   0.0  1589  7673   89  184  188    0  5912  3571   9732  20165    8   33   59  7003.7  1169.3
2.30   4.6   1799   0.0  1648  5900   44  154  146    0  6539  3998  10660  17975    9   27   64  7631.0  1382.6
Forgive the formatting, it kind of breaks with larger numbers...
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: debate on 700 threads vs asynchronous code
2003-01-23 23:19 debate on 700 threads vs asynchronous code Lee Chin
` (2 preceding siblings ...)
2003-01-27 9:48 ` Terje Eggestad
@ 2003-01-27 22:08 ` Bill Davidsen
3 siblings, 0 replies; 9+ messages in thread
From: Bill Davidsen @ 2003-01-27 22:08 UTC (permalink / raw)
To: Lee Chin; +Cc: linux-kernel, linux-newbie
On Thu, 23 Jan 2003, Lee Chin wrote:
> I am discussing different approaches to a scaling problem with a few
> people, and have gotten vastly different views.
>
> In a nutshell, as far as this debate is concerned, I can say I am writing a web server.
>
> Now, to cater to 700 clients, I can a) launch 700 threads that each
> block on I/O to disk and to the client (in reading and writing on the
> socket)
>
> OR
>
> b) Write an asynchronous system with only two or three threads, where I
> manage the connections and stacks myself (via setcontext, swapcontext,
> etc.), which is programmatically a little harder
There are many other ways, involving use of async io for disk and select
on some limited number of sockets per thread. If you want to wallow in
analysis paralysis you can certainly do it. Take a look at existing
usenet, mail, web and DNS servers and you will see a number of ways to
attack this problem; correctly implemented, most of them work fine.
I believe Ingo mentioned some huge number of practical threads when he was
first talking about the latest thread library. If you believe it, or if
you really will be happy at 700 tasks per server, then thread per socket
is the easiest to implement, at least IMHO.
I'm using various news software which does most combinations of threading,
select, and even full processes per client, and none of them strike me as
being inherently better (as opposed to some being better implementations).
Ask Ingo how many threads you can really run in six months when the new
kernel and thread bits are more stable, that's the only scaling bit I
can't even guess. Pick one method, write code. I believe implementation
will be more important than method, unless you make a *really* bad choice.
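
For comparison, a minimal sketch of the thread-per-socket style in Python's standard `threading` module. The names `handle_client` and `serve_threaded` and the "reply with the reversed request" protocol are hypothetical. Each handler is plain blocking code, which is why this style is the easiest to write.

```python
import socket
import threading


def handle_client(sock):
    """Blocking handler: the thread's own stack holds the connection state."""
    with sock:
        while True:
            data = sock.recv(4096)     # thread sleeps here until data or EOF
            if not data:
                break                  # client closed the connection
            sock.sendall(data[::-1])   # toy reply: the request, reversed


def serve_threaded(conns):
    """Start one thread per connected socket and return the threads."""
    threads = [threading.Thread(target=handle_client, args=(c,)) for c in conns]
    for t in threads:
        t.start()
    return threads
```

A thread blocked in `recv` is not runnable, so idle connections cost memory for their stacks but no scheduling work; the cost shows up only when many threads become runnable at once.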
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: debate on 700 threads vs asynchronous code
@ 2003-01-29 17:26 Lee Chin
2003-01-30 9:36 ` Terje Eggestad
0 siblings, 1 reply; 9+ messages in thread
From: Lee Chin @ 2003-01-29 17:26 UTC (permalink / raw)
To: terje.eggestad, leechin; +Cc: linux-kernel, linux-newbie
Today I do method (c)... but many people seem to say that, hey, pthreads does almost exactly that, with a constant memory overhead of remembering the stack per blocking thread... so there is no time difference, just that pthreads consumes slightly more memory. That is the issue I am trying to get my head around.
That particular question no one has answered... in Linux, the scheduler will not go crazy trying to schedule processes that are all waiting on I/O. Now, the only time I see threads degrading would be if all of them are runnable... in that case an async scheme with two threads would let each task run to completion, without thrashing the kernel. Is that correct to say?
* Re: debate on 700 threads vs asynchronous code
2003-01-29 17:26 Lee Chin
@ 2003-01-30 9:36 ` Terje Eggestad
0 siblings, 0 replies; 9+ messages in thread
From: Terje Eggestad @ 2003-01-30 9:36 UTC (permalink / raw)
To: Lee Chin; +Cc: linux-kernel, linux-newbie
On ons, 2003-01-29 at 18:26, Lee Chin wrote:
> Today I do method (C)... but many people seem to say that, hey,
> pthreads does almost just that with a constant memory overhead of
> remembering the stack per blocking thread... so there is no time
> difference, just that pthreads consumes slightly more memory. That is
> the issue I am trying to get my head around.
>
> That particular question no one has answered... in Linux, the
> scheduler will not go crazy trying to schedule processes that
> are all waiting on I/O. Now, the only time I see threads degrading
> would be if all of them are runnable... in that case an async scheme
> with two threads would let each task run to completion, without
> thrashing the kernel. Is that correct to say?
Yes
And you can add that if you have many runnable threads, there will be
extra overhead from context switching.
--
_________________________________________________________________________
Terje Eggestad mailto:terje.eggestad@scali.no
Scali Scalable Linux Systems http://www.scali.com
Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal +47 975 31 574 (MOBILE)
N-0619 Oslo fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________