* Process Creation Speed
From: Stephan T. Lavavej @ 2004-04-17  2:16 UTC
To: linux-kernel

Why does creating and then terminating a process in GNU/Linux take about
6.3 ms on a Prestonia-2.2?  I observe basically the same thing on a
PIII-600.

I'm pretty sure both systems run 2.4.x kernels.  Does this suck less under
2.6.x?  Not sucking at all would mean about 100 microseconds to me.  I
don't understand why it doesn't scale with processor speed.  Does this
interact with the length of a timeslice?

It matters to me because the Common Gateway Interface spawns and destroys
a process to handle each request, and I wish it were just fast, rather
than having to use FastCGI.

A fair amount of Googling and RTFFAQ didn't answer this.

Stephan T. Lavavej
http://nuwen.net
* Re: Process Creation Speed
From: Eric @ 2004-04-18  5:44 UTC
To: stl; +Cc: linux-kernel

On Friday 16 April 2004 21:16, Stephan T. Lavavej wrote:
> Why does creating and then terminating a process in GNU/Linux take about
> 6.3 ms on a Prestonia-2.2?  I observe basically the same thing on a
> PIII-600.
>
> I'm pretty sure both systems run 2.4.x kernels.  Does this suck less
> under 2.6.x?  Not sucking at all would mean about 100 microseconds to
> me.  I don't understand why it doesn't scale with processor speed.  Does
> this interact with the length of a timeslice?
>
> It matters to me because the Common Gateway Interface spawns and
> destroys a process to handle each request, and I wish it were just fast,
> rather than having to use FastCGI.

The difference in speed between regular CGI and FastCGI shouldn't be
related to process creation time.  The speedup you see from FastCGI is
because it doesn't have to be read from disk each time.  So you're really
looking for performance enhancements in the wrong place.  Tweaking process
creation can't make your platters spin faster.

> A fair amount of Googling and RTFFAQ didn't answer this.
>
> Stephan T. Lavavej
> http://nuwen.net
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
* Re: Process Creation Speed
From: Jamie Lokier @ 2004-04-19  0:30 UTC
To: Eric; +Cc: stl, linux-kernel

Eric wrote:
> > It matters to me because the Common Gateway Interface spawns and
> > destroys a process to handle each request, and I wish it were just
> > fast, rather than having to use FastCGI.
>
> The difference in speed between regular CGI and FastCGI shouldn't
> be related to process creation time.  The speedup you see from
> FastCGI is because it doesn't have to be read from disk each
> time.  So you're really looking for performance enhancements in the
> wrong place.  Tweaking process creation can't make your platters spin
> faster.

Wrong explanation.  CGI does not "read from disk each time".  Files,
including executables, are cached in RAM.  Platter speed is irrelevant
unless your server is overloaded, which this one plainly isn't.

-- Jamie
* Re: Process Creation Speed
From: Eric @ 2004-04-19  2:15 UTC
To: Jamie Lokier; +Cc: linux-kernel

On Sunday 18 April 2004 19:30, you wrote:
> Eric wrote:
> > > It matters to me because the Common Gateway Interface spawns and
> > > destroys a process to handle each request, and I wish it were just
> > > fast, rather than having to use FastCGI.
> >
> > The difference in speed between regular CGI and FastCGI shouldn't
> > be related to process creation time.  The speedup you see from
> > FastCGI is because it doesn't have to be read from disk each
> > time.  So you're really looking for performance enhancements in the
> > wrong place.  Tweaking process creation can't make your platters spin
> > faster.
>
> Wrong explanation.  CGI does not "read from disk each time".  Files,
> including executables, are cached in RAM.  Platter speed is irrelevant
> unless your server is overloaded, which this one plainly isn't.

OK, my explanation is a bit off.  But you're still looking in the wrong
place.  100 ms isn't that long, and by just tweaking this you won't
achieve with regular CGI what FastCGI does.

And what happens when your CGI is removed from disk cache due to a spike
in requests?  It has to be read again, degrading performance.  You can't
count on an object being in disk cache every time when the system is
under load.

What about filesystems that use access timestamps?  These will have to be
written to disk every time the application is run, so under some
circumstances just being in disk cache isn't enough.

From http://www.fastcgi.com/devkit/doc/fcgi-perf.htm

"CGI applications couldn't perform in-memory caching, because they exited
after processing just one request.  Web server APIs promised to solve this
problem.  But how effective is the solution?"

"FastCGI is designed to allow effective in-memory caching.  Requests are
routed from any child process to a FastCGI application server.  The
FastCGI application process maintains an in-memory cache."

Look at these two statements and you will realize that they are optimizing
memory access patterns too.  Normally, even if the file is in disk cache
it will still have to get copied to an area that the webserver child
process can work with.  This wastes memory.  So if you have 100-1000
clients and a 100k CGI application, it may be in disk cache once, but
parts of it are getting fed to child processes each time it needs to be
run.  How long, or how many clients, before it gets bumped out of disk
cache?  Or how about a plain waste of memory that could go to more
webserver children?

"With multi-threading you run an application process that is designed to
handle several requests at the same time.  The threads handling concurrent
requests share process memory, so they all have access to the same cache.
Multi-threaded programming is complex -- concurrency makes programs
difficult to test and debug -- but with FastCGI you can write single
threaded or multithreaded applications."

Moreover, they can turn a normal application into a (pseudo)threaded
application, which has significant benefits for SMP systems as well as for
a system that just handles many concurrent connections.

IMHO, the problem still isn't related to creation time, but is an inherent
problem of the webserver's APIs.  Furthermore, if I read correctly,
FastCGI still has to spawn a child process each time a request comes in,
so even if you tuned process creation time, FastCGI would STILL be faster.

Look at it mathematically.  Say the time it takes for FastCGI to run a
CGI (F) is 10 units, and a regular server CGI implementation (C) is 100.
If you shorten process creation time by five units (S), then C-S > F-S
ALWAYS; you would just be helping both implementations by the SAME AMOUNT.

If you want CGI to perform faster, you will need a solution like FastCGI,
or to rewrite your webserver's CGI APIs.  If you want information on how
to optimize CGI, post on your webserver's mailing list or the FastCGI
lists; there is no need to toy with the kernel.  IMHO this is a userspace
issue.

To answer your other question, 2.6 should perform better in a webserver
application because of improvements to the VM system and the scheduler,
but not directly because of shortened process creation time (if it was
even shortened in 2.6).  I would benchmark the server under both kernels.
Also remember there are different scheduler algorithms and VM tunables.
Check the Documentation folder in the kernel source.  However, I have
never tweaked those for a webserver, so someone else would have to
recommend a good setup for a webserver.

Anyone feel free to correct me if I'm wrong on some parts.  Sorry for the
long-winded reply, but I could use a good refresher on this.

--Eric Bambach
* Re: Process Creation Speed
From: Jamie Lokier @ 2004-04-19  3:04 UTC
To: Eric; +Cc: linux-kernel

Eric wrote:
> > Wrong explanation.  CGI does not "read from disk each time".  Files,
> > including executables, are cached in RAM.  Platter speed is irrelevant
> > unless your server is overloaded, which this one plainly isn't.
>
> OK, my explanation is a bit off.  But you're still looking in the
> wrong place.  100 ms isn't that long, and by just tweaking this you
> won't achieve with regular CGI what FastCGI does.

That's true: FastCGI is a good solution, as are mod_perl and similar.

But the reasons you give for it are bogus.

> And what happens when your CGI is removed from disk cache due to a
> spike in requests?  It has to be read again, degrading
> performance.  You can't count on an object being in disk cache every
> time when the system is under load.

What you miss is that pages being removed from the cache affects FastCGI
and CGI identically.

_Parts_ of a CGI image will be removed from memory if there is paging
pressure due to other requests (not for this CGI).  The whole file is not
dropped out in one go, but individual pages are.  If it isn't used for a
long time, it may all go.

Exactly the same thing happens to a long-running FastCGI process:
individual pages are dropped under memory pressure, when those pages
aren't currently being used.  This occurs even though the FastCGI process
lasts over multiple requests.

File paging is determined by the pattern in which pages are actually being
used at any time, and has very little to do with whether pages are part of
a running process.

> What about filesystems that use access timestamps?  These will
> have to be written to disk every time the application is run, so
> under some circumstances just being in disk cache isn't enough.

No: the timestamp is written to disk later, asynchronously, and if there
are many requests which update the timestamp it will still only be written
once per update period (30 seconds or so on ext2, 5 seconds on ext3 I
think).  It's very unlikely to affect response time.

Note that static pages (served directly by the webserver) also update the
access timestamp: the effect of these is much worse than that of any CGI
program.  You should use the "noatime" mount option if this is ever a
problem.

> From http://www.fastcgi.com/devkit/doc/fcgi-perf.htm
>
> "CGI applications couldn't perform in-memory caching, because they
> exited after processing just one request.  Web server APIs promised to
> solve this problem.  But how effective is the solution?"
>
> "FastCGI is designed to allow effective in-memory caching.  Requests
> are routed from any child process to a FastCGI application
> server.  The FastCGI application process maintains an in-memory
> cache."
>
> Look at these two statements and you will realize that they are
> optimizing memory access patterns too.  Normally, even if the file is
> in disk cache it will still have to get copied to an area that the
> webserver child process can work with.  This wastes memory.  So if you
> have 100-1000 clients and a 100k CGI application, it may be in disk
> cache once, but parts of it are getting fed to child processes each
> time it needs to be run.  How long, or how many clients, before it
> gets bumped out of disk cache?  Or how about a plain waste of memory
> that could go to more webserver children?

Every one of these claims is technically bogus, even the ones from
fastcgi.com, but the gist is accurate.

They are bogus because CGI programs are able to maintain in-memory caches
as well.  That's what storing data in cache files, database servers,
shared memory, memcached and so forth accomplishes.  Also, the copying
you describe is not necessary.  That's why we have mmap().

They are accurate because it is much more complicated to do those things
in single-request CGI than in FastCGI (or an equivalent like mod_perl),
and there is no point: writing a persistent server is much easier than
writing a complicated sharing scheme among CGI processes.

Probably the biggest speedup in practice is when people write CGI
programs in scripting languages, or with complex libraries, which incurs
a huge initialisation cost for each request.  The initialisation doesn't
occur with every request when using FastCGI.  That tends to make the
difference between 0.5 requests per second and 100 requests per second.
It's a shame you didn't mention that :)

> "With multi-threading you run an application process that is
> designed to handle several requests at the same time.  The threads
> handling concurrent requests share process memory, so they all have
> access to the same cache.  Multi-threaded programming is complex --
> concurrency makes programs difficult to test and debug -- but with
> FastCGI you can write single threaded or multithreaded
> applications."
>
> Moreover, they can turn a normal application into a (pseudo)threaded
> application, which has significant benefits for SMP systems as well as
> for a system that just handles many concurrent connections.

True, although sometimes you find that forked applications run faster
than threaded, especially on SMP.

> If you want CGI to perform faster, you will need a solution like
> FastCGI, or to rewrite your webserver's CGI APIs.  If you want
> information on how to optimize CGI, post on your webserver's mailing
> list or the FastCGI lists; there is no need to toy with the kernel.
> IMHO this is a userspace issue.
> [...]
> I would benchmark the server under both kernels.  Also remember there
> are different scheduler algorithms and VM tunables.  Check the
> Documentation folder in the kernel source.  However, I have never
> tweaked those for a webserver, so someone else would have to
> recommend a good setup for a webserver.

With all of this I agree.  Especially that it's a userspace issue.

Fwiw, all good webservers have built-in capabilities for persistent
CGI-handling processes, more or less equivalent to FastCGI.  You said
that FastCGI requires a process to be created for every request.  I
thought this wasn't true, as the protocol doesn't require it, but if it
is true that's a large overhead, as 7.5 ms per request is significant,
and that would be a reason to _not_ use FastCGI and to use the web
server's built-in capabilities instead.

None of this answers the question which is relevant to linux-kernel: why
does process creation take 7.5 ms and fail to scale with CPU internal
clock speed over a factor of 4 (600 MHz x86 to 2.2 GHz x86)?

Perhaps it is because we still don't have shared page tables.  That would
be the most likely dominant overhead of fork().

Alternatively, the original poster may have included program
initialisation time in the 7.5 ms, and that could be substantial if there
are many complex libraries being loaded.

-- Jamie
* Re: Process Creation Speed
From: Eric @ 2004-04-19  5:43 UTC
To: Jamie Lokier; +Cc: linux-kernel, Stephan T. Lavavej

On Sunday 18 April 2004 22:04, you wrote:
> Eric wrote:
> > > Wrong explanation.  CGI does not "read from disk each time".  Files,
> > > including executables, are cached in RAM.  Platter speed is
> > > irrelevant unless your server is overloaded, which this one plainly
> > > isn't.
> >
> > OK, my explanation is a bit off.  But you're still looking in the
> > wrong place.  100 ms isn't that long, and by just tweaking this you
> > won't achieve with regular CGI what FastCGI does.
>
> That's true: FastCGI is a good solution, as are mod_perl and similar.
>
> But the reasons you give for it are bogus.

> Every one of these claims is technically bogus, even the ones from
> fastcgi.com, but the gist is accurate.

Yeah, I have the gist down, but I'm by no means an expert programmer
(yet), so I appreciate your well-thought-out and detailed responses
instead of just flaming or something equally bad.

> They are bogus because CGI programs are able to maintain in-memory
> caches as well.

Across instances and concurrently?  It sounds complicated unless you are
using a persistent server CGI application... like FastCGI.

> That's what storing data in cache files,

A file is much different from in-memory.

> database servers, shared memory,

Yes, and these would be persistent applications.

> memcached and so forth accomplishes.  Also, the
> copying you describe is not necessary.  That's why we have mmap().

OK, shot down.  Fair enough.

> They are accurate because it is much more complicated to do those
> things in single-request CGI than in FastCGI (or an equivalent like
> mod_perl), and there is no point: writing a persistent server is much
> easier than writing a complicated sharing scheme among CGI processes.

Yes, that is what I was going for... just in a sketchy fashion.

> Probably the biggest speedup in practice is when people write CGI
> programs in scripting languages, or with complex libraries, which
> incurs a huge initialisation cost for each request.  The
> initialisation doesn't occur with every request when using FastCGI.
> That tends to make the difference between 0.5 requests per second and
> 100 requests per second.  It's a shame you didn't mention that :)

It is.  Now that you mention it, I'm surprised I didn't think of it.  I
did some research in this area because the SysV init scheme uses huge
amounts of scripts, and the total initialization cost per bootup is
probably on the order of 5-15 seconds depending on the machine, etc.
This is the situation I was thinking of: Perl and other interpreted
languages that have a huge startup cost would benefit from FastCGI.
There isn't a whole lot you can do with regular CGI.

> > "With multi-threading you run an application process that is
> > designed to handle several requests at the same time.  The threads
> > handling concurrent requests share process memory, so they all have
> > access to the same cache.  Multi-threaded programming is complex --
> > concurrency makes programs difficult to test and debug -- but with
> > FastCGI you can write single threaded or multithreaded
> > applications."
> >
> > Moreover, they can turn a normal application into a (pseudo)threaded
> > application, which has significant benefits for SMP systems as well
> > as for a system that just handles many concurrent connections.
>
> True, although sometimes you find that forked applications run faster
> than threaded, especially on SMP.

Either way it is still faster.  I haven't looked at the FastCGI specs,
but it seems like they were claiming to do some sort of pseudo
threading/concurrency for performance reasons.

> > If you want CGI to perform faster, you will need a solution like
> > FastCGI, or to rewrite your webserver's CGI APIs.  If you want
> > information on how to optimize CGI, post on your webserver's mailing
> > list or the FastCGI lists; there is no need to toy with the kernel.
> > IMHO this is a userspace issue.
> > [...]
> > I would benchmark the server under both kernels.  Also remember there
> > are different scheduler algorithms and VM tunables.  Check the
> > Documentation folder in the kernel source.  However, I have never
> > tweaked those for a webserver, so someone else would have to
> > recommend a good setup for a webserver.
>
> With all of this I agree.  Especially that it's a userspace issue.

Yep.  There isn't a whole lot the kernel can do to help you here.

> Fwiw, all good webservers have built-in capabilities for persistent
> CGI-handling processes, more or less equivalent to FastCGI.  You said
> that FastCGI requires a process to be created for every request.  I
> thought this wasn't true, as the protocol doesn't require it, but if
> it is true that's a large overhead, as 7.5 ms per request is
> significant, and that would be a reason to _not_ use FastCGI and to
> use the web server's built-in capabilities instead.

Hmmm... let me look a little more deeply into that.  After research I
realize that I glossed over the FastCGI whitepaper a little too much.

http://www.fastcgi.com/devkit/doc/fastcgi-whitepaper/fastcgi.htm

"For each request, the server creates a new process and the process
initializes itself."

is referring to other CGI implementations and not to FastCGI itself.
D'oh.  However, just as with a normal server, performance will suffer if
FastCGI processes have to be created on demand:

"The Web server creates FastCGI application processes to handle requests.
The processes may be created at startup, or created on demand."

> None of this answers the question which is relevant to linux-kernel:
> why does process creation take 7.5 ms and fail to scale with CPU
> internal clock speed over a factor of 4 (600 MHz x86 to 2.2 GHz x86)?

The reason it doesn't scale is probably because the kernel always runs at
a specified speed, 100 Hz, which gives 10 ms timeslices (I believe?).  I
would try a HZ patch and bump it up to 1000; I bet you would see a big
difference then.

> Perhaps it is because we still don't have shared page tables.
> That would be the most likely dominant overhead of fork().
>
> Alternatively, the original poster may have included program
> initialisation time in the 7.5 ms, and that could be substantial if
> there are many complex libraries being loaded.

Yeah, hopefully Stephan can provide a little more insight into how he
obtained 6.3 ms.

> -- Jamie
* Re: Process Creation Speed
From: Jamie Lokier @ 2004-04-19  9:48 UTC
To: Eric; +Cc: linux-kernel, Stephan T. Lavavej

Eric wrote:
> > None of this answers the question which is relevant to linux-kernel:
> > why does process creation take 7.5 ms and fail to scale with CPU
> > internal clock speed over a factor of 4 (600 MHz x86 to 2.2 GHz x86)?
>
> The reason it doesn't scale is probably because the kernel always runs
> at a specified speed, 100 Hz, which gives 10 ms timeslices (I believe?).
> I would try a HZ patch and bump it up to 1000; I bet you would see a big
> difference then.

Hmm.  The timer speed shouldn't affect the measured speed of fork() at
all.  It might show up if the measuring program is dependent on the timer
in some way, though.

-- Jamie
* Re: Process Creation Speed
From: Johannes Stezenbach @ 2004-04-19 12:09 UTC
To: Jamie Lokier; +Cc: Eric, linux-kernel, Stephan T. Lavavej

Jamie Lokier wrote:
> Eric wrote:
> > > None of this answers the question which is relevant to linux-kernel:
> > > why does process creation take 7.5 ms and fail to scale with CPU
> > > internal clock speed over a factor of 4 (600 MHz x86 to 2.2 GHz x86)?
> >
> > The reason it doesn't scale is probably because the kernel always runs
> > at a specified speed, 100 Hz, which gives 10 ms timeslices (I
> > believe?).  I would try a HZ patch and bump it up to 1000; I bet you
> > would see a big difference then.
>
> Hmm.  The timer speed shouldn't affect the measured speed of fork() at
> all.  It might show up if the measuring program is dependent on the
> timer in some way, though.

http://bulk.fefe.de/scalability/ has some benchmarks on the issue.
But I guess the numbers depend heavily on the server/CGI software used.

Johannes
* RE: Process Creation Speed
From: Stephan T. Lavavej @ 2004-04-19 12:44 UTC
To: linux-kernel

Thanks to all who have responded.

I had been measuring the time to create and terminate a do-nothing
program.  I had not been measuring CGI programs, though that was why I
was doing the measurement in the first place.

I changed my measurement strategy, and I now get about 110 microseconds
for creation and termination of a do-nothing process (fork() followed by
execve()).  Statically linking everything gave a significant speedup,
which allowed me to reach that value.  This was on a 2.6.x kernel.  110
microseconds is well within my "doesn't suck" range, so I'm happy - CGI
will be fast enough for my needs, and I can always turn to FastCGI later
if necessary.

I am writing a web-based forum entirely in C++, rejecting interpreted
languages (Perl, PHP, ASP, etc.) and relational databases (MySQL,
PostgreSQL, etc.) entirely.  My forum consists of "kiddy" CGI processes
which talk over the network to a persistent "mommy" daemon who keeps all
forum state in main memory.

My code runs on both Windows and GNU/Linux with no configuration needed,
but separate measurements indicate that XP takes about 3.3 ms to create
and terminate a do-nothing process.  Thus it looks like Linux 2.6.x will
be the kernel of choice for my forum.

Thanks again!

Stephan T. Lavavej
http://nuwen.net
* RE: Process Creation Speed
From: David Lang @ 2004-04-19 22:48 UTC
To: Stephan T. Lavavej; +Cc: linux-kernel

The 2.6 kernel does have significant advantages in process
creation/shutdown time when you have large numbers of processes.  I have
a box that sits with ~3000 processes on it; with a 2.4.25 kernel it
resulted in 400 connections/sec, and with 2.6.4 it got 650
connections/sec (cutting the number of processes down to <50 resulted in
~620 connections/sec with 2.4.25).

However, if you can compile your code statically and not do shared
library lookups, you may see even more drastic improvements.  With the
code above I hard-coded a protocol lookup, eliminated a few hostname
lookups, and under 2.6.4 got it up to 2500 connections/sec!  (I
eliminated hostname lookups and got up to ~700/sec, then changed nsswitch
so that it didn't look for protocols.db and it climbed to 850/sec, I
shortened /etc/protocols to a minimal set and it climbed to 900/sec, and
I eliminated the getprotobyname("ip") call and it jumped to 2500/sec.)

Measurements were on a dual Athlon 2100 box with 1 GB of RAM.  Note that
the code was statically compiled to start with, but doing a name lookup
invoked nsswitch and loaded in libraries from that.

Do a strace of the app, dumping it to a file, and see what files it
opens.  Especially if you have SMP, try the 2.6 kernel, and keep tweaking
your CGIs.

David Lang

On Mon, 19 Apr 2004, Stephan T. Lavavej wrote:
> Date: Mon, 19 Apr 2004 05:44:12 -0700
> From: Stephan T. Lavavej <stl@nuwen.net>
> To: linux-kernel@vger.kernel.org
> Subject: RE: Process Creation Speed
>
> Thanks to all who have responded.
>
> I had been measuring the time to create and terminate a do-nothing
> program.  I had not been measuring CGI programs, though that was why I
> was doing the measurement in the first place.
>
> I changed my measurement strategy, and I now get about 110 microseconds
> for creation and termination of a do-nothing process (fork() followed
> by execve()).  Statically linking everything gave a significant
> speedup, which allowed me to reach that value.  This was on a 2.6.x
> kernel.  110 microseconds is well within my "doesn't suck" range, so
> I'm happy - CGI will be fast enough for my needs, and I can always turn
> to FastCGI later if necessary.
>
> I am writing a web-based forum entirely in C++, rejecting interpreted
> languages (Perl, PHP, ASP, etc.) and relational databases (MySQL,
> PostgreSQL, etc.) entirely.  My forum consists of "kiddy" CGI processes
> which talk over the network to a persistent "mommy" daemon who keeps
> all forum state in main memory.
>
> My code runs on both Windows and GNU/Linux with no configuration
> needed, but separate measurements indicate that XP takes about 3.3 ms
> to create and terminate a do-nothing process.  Thus it looks like Linux
> 2.6.x will be the kernel of choice for my forum.
>
> Thanks again!
>
> Stephan T. Lavavej
> http://nuwen.net

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it." - Brian W. Kernighan
* Re: Process Creation Speed
From: Jakob Oestergaard @ 2004-04-22 13:40 UTC
To: Stephan T. Lavavej; +Cc: linux-kernel

On Mon, Apr 19, 2004 at 05:44:12AM -0700, Stephan T. Lavavej wrote:
> Thanks to all who have responded.
> ...
> I am writing a web-based forum entirely in C++, rejecting interpreted
> languages (Perl, PHP, ASP, etc.) and relational databases (MySQL,
> PostgreSQL, etc.) entirely.  My forum consists of "kiddy" CGI processes
> which talk over the network to a persistent "mommy" daemon who keeps
> all forum state in main memory.

You could consider loading your .o as an Apache module, rather than
executing it as a CGI program.  I was involved in one project where we
did this with good success.

Even segfaults in our module would "only" take down one of the Apache
sub-processes.  So while segfaults incur performance overhead - and of
course should be fixed no matter what, which luckily is very easy to
debug (using, for example, { if (!fork()) abort(); } to create snapshot
coredumps) - they are not catastrophic.

It's entirely realistic to write a good module for Apache in a fairly
short timespan.

/ jakob
* Re: Process Creation Speed
From: Jamie Lokier @ 2004-04-19 13:28 UTC
To: Johannes Stezenbach, Eric, linux-kernel, Stephan T. Lavavej

Johannes Stezenbach wrote:
> > > > None of this answers the question which is relevant to
> > > > linux-kernel: why does process creation take 7.5 ms and fail to
> > > > scale with CPU internal clock speed over a factor of 4 (600 MHz
> > > > x86 to 2.2 GHz x86)?
>
> http://bulk.fefe.de/scalability/ has some benchmarks on the issue.
> But I guess the numbers depend heavily on the server/CGI software used.

Nice page.  The graphs there show fork() taking 250-350 microseconds,
which is quite fast.  Where is the 7.5 ms complaint coming from?

-- Jamie
* Re: Process Creation Speed
From: Andi Kleen @ 2004-04-19 15:43 UTC
To: stl; +Cc: linux-kernel

"Stephan T. Lavavej" <stl@nuwen.net> writes:
> I changed my measurement strategy, and I now get about 110 microseconds
> for creation and termination of a do-nothing process (fork() followed
> by execve()).  Statically linking everything gave a significant
> speedup, which allowed me to reach that value.  This was on a 2.6.x
> kernel.  110 microseconds is well within my "doesn't suck" range, so
> I'm happy - CGI will be fast enough for my needs, and I can always turn
> to FastCGI later if necessary.

This just means ld.so is too slow for you.  Perhaps you should complain
to the glibc people about that?

-Andi