From: Eric <eric@cisu.net>
To: Jamie Lokier <jamie@shareable.org>
Cc: linux-kernel@vger.kernel.org, "Stephan T. Lavavej" <stl@nuwen.net>
Subject: Re: Process Creation Speed
Date: Mon, 19 Apr 2004 00:43:04 -0500
Message-ID: <200404190043.04358.eric@cisu.net>
In-Reply-To: <20040419030456.GA11717@mail.shareable.org>

On Sunday 18 April 2004 22:04, you wrote:
> Eric wrote:
> > > Wrong explanation.  CGI does not "read from disk each time".  Files,
> > > including executables, are cached in RAM.  Platter speed is irrelevant
> > > unless your server is overloaded, which this one plainly isn't.
> >
> > Ok ok my explanation is a bit off. But you're still looking in the
> > wrong place. 100ms isn't that long, and by just tweaking this you
> > won't achieve with regular CGI what fastCGI does.
>
> That's true: FastCGI is a good solution, as are mod_perl and similar.
>
> But the reasons you give for it are bogus.


> Every one of these claims is technically bogus, even the ones from
> fastcgi.com, but the gist is accurate.

Ya, I have the gist down, but I'm by no means an expert programmer (yet), so I
appreciate your well-thought-out and detailed responses instead of just
flaming or something equally bad.

> They are bogus because CGI programs are able to maintain in-memory
> caches as well.
Across instances and concurrently? It sounds complicated unless you are using
a persistent server CGI application... like FastCGI.
> That's what storing data in cache files,
A file is quite different from in-memory data.
> database servers, shared memory,
Yes, and these would be persistent applications.
> memcached and so forth accomplish.  Also, the 
> copying you describe is not necessary.  That's why we have mmap().

Ok, shot down. Fair enough.
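
To make the mmap() point concrete, here is a minimal sketch in C, assuming a
small fixed-size cache kept in a file that every CGI process maps MAP_SHARED.
The struct layout and the cache_open()/cache_close() names are made up for
illustration, and there is no locking, which a real multi-process cache would
need.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical cache layout: a single counter shared by all processes. */
struct cache {
    unsigned long hits;
};

/* Map the cache file at `path` into memory, creating it if needed.
 * Returns NULL on failure. Writes through the returned pointer are
 * visible to every other process that maps the same file MAP_SHARED,
 * without any read()/write() copying. */
struct cache *cache_open(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* A new file is extended with zero bytes; an existing file that is
     * already this size keeps its contents. */
    if (ftruncate(fd, (off_t)sizeof(struct cache)) < 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, sizeof(struct cache), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : (struct cache *)p;
}

/* Unmap a cache obtained from cache_open(). */
void cache_close(struct cache *c)
{
    if (c != NULL)
        munmap(c, sizeof(*c));
}
```

Each short-lived CGI process would call cache_open() on the same path, use the
data in place, and call cache_close() before exiting. The pages stay in the
page cache between requests, so nothing is re-read from disk unless the
machine is under memory pressure.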

> They are accurate because it is much more complicated to do those
> things in single-request CGI than FastCGI (or an equivalent like
> mod_perl), and there is no point: writing a persistent server is much
> easier than writing a complicated sharing scheme among CGI processes.

Yes, that is what I was going for....just in a sketchy fashion.

> Probably the biggest speedup in practice is when people write CGI
> programs in scripting languages, or with complex libraries, which
> incurs a huge initialisation cost for each request.  The
> initialisation doesn't occur with every request when using FastCGI.
> That tends to make the difference between 0.5 requests per second and
> 100 requests per second.  It's a shame you didn't mention that :)

It is. Now that you mention it, I'm surprised I didn't think of it. I did some
research in this area because the sysVinit scheme uses a huge number of scripts
and the total initialization cost per bootup is probably on the order of 5-15
seconds, depending on the machine. This is the situation I was thinking of:
Perl and other interpreted languages that have a huge startup cost would
benefit from FastCGI. There isn't a whole lot you can do with regular CGI.

> > "With multi-threading you run an application process that is
> > designed to handle several requests at the same time. The threads
> > handling concurrent requests share process memory, so they all have
> > access to the same cache.  Multi-threaded programming is complex --
> > concurrency makes programs difficult to test and debug -- but with
> > FastCGI you can write single threaded or multithreaded
> > applications."
> >
> > Moreover they can turn a normal application into a (pseudo)threaded
> > application which has significant benefits for SMP systems as well as a
> > system that just handles many concurrent connections.
>
> True, although sometimes you find that forked applications run faster
> than threaded, especially on SMP.

Either way it is still faster. I haven't looked at the FastCGI specs, but it
seems like they were claiming to do some sort of pseudo-threading/concurrency
for performance reasons.

> > If you want CGI to perform faster, you will need a solution like FastCGI,
> > or to rewrite your webserver's CGI APIs. If you want information on how to
> > optimize CGI, post on your webserver's mailing list or FastCGI lists,
> > there is no need to toy with the kernel. IMHO this is a userspace issue.
> > [...]
> > I would benchmark the server under both kernels. Also remember there
> > are different scheduler algorithms and VM tunables. Check the
> > Documentation folder in the kernel source. However, I have never
> > tweaked those for a webserver so someone else would have to
> > recommend a good setup for a webserver.
>
> With all of this I agree.  Especially that it's a userspace issue.

Yep. There isn't a whole lot the kernel can do to help you here.

> Fwiw, all good webservers have built-in capabilities for persistent
> CGI-handling processes, more or less equivalent to FastCGI.  You said
> that FastCGI requires a process to be created for every request.  I
> thought this wasn't true, as the protocol doesn't require it, but if
> it is true that's a large overhead, as 7.5ms per request is
> significant, and that would be a reason to _not_ use FastCGI and use
> the web server's built-in capabilities instead.

Hmmm... lemme look a little more deeply into that. After some research I
realize that I skimmed the FastCGI whitepaper a little too quickly.

http://www.fastcgi.com/devkit/doc/fastcgi-whitepaper/fastcgi.htm

"For each request, the server creates a new process and the process 
initializes itself."

is referring to other CGI implementations and not to FastCGI itself. DOH.

However, just like with a normal server, performance will suffer if FastCGI
processes have to be created on demand.

"The Web server creates FastCGI application processes to handle requests. The 
processes may be created at startup, or created on demand."

> None of this answers the question which is relevant to linux-kernel:
> why does process creation take 7.5ms and fail to scale with CPU
> internal clock speed over a factor of 4 (600MHz x86 to 2.2GHz x86).

The reason it doesn't scale is probably that the kernel timer always ticks at
a fixed rate, HZ=100, which leaves 10 ms (I believe?) timeslices. I would try
a HZ patch and bump it up to 1000; I bet you would see a big difference then.

> Perhaps it is because we still don't have shared page tables.
> That would be the most likely dominant overhead of fork().
>
> Alternatively, the original poster may have included program
> initialisation time in the 7.5ms, and that could be substantial if
> there are many complex libraries being loaded.

Yea, hopefully Stephan can provide a little more insight into how he obtained 
6.3 ms. 
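
For comparison, here is a minimal sketch in C of one way to time bare process
creation, assuming the goal is to isolate fork() from exec and library
start-up. The avg_fork_ms() name is made up for illustration, and this is not
necessarily how the original number was measured.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock cost, in milliseconds, of one
 * fork() + child _exit() + parent waitpid() cycle over `iterations` runs.
 * Returns -1.0 on bad input or on any system call failure. */
double avg_fork_ms(int iterations)
{
    struct timespec start, end;

    if (iterations <= 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0); /* child: exit at once, no exec, no library init */
        if (waitpid(pid, NULL, 0) < 0)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    double elapsed_ms = (double)(end.tv_sec - start.tv_sec) * 1000.0
                      + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
    assert(elapsed_ms >= 0.0); /* CLOCK_MONOTONIC does not run backwards */

    return elapsed_ms / iterations;
}
```

Calling avg_fork_ms(1000) from a small main() and printing the result gives a
per-process figure with no exec and no library loading. It includes the
child's exit and the parent's waitpid(), so it is an upper bound on fork()
alone. Comparing it with the figure seen through the web server should show
how much of that time is program initialisation rather than process creation.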

> -- Jamie
