* linux scheduler limitations?
@ 2001-03-29 21:19 Fabio Riccardi
2001-03-29 21:26 ` David Lang
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Fabio Riccardi @ 2001-03-29 21:19 UTC (permalink / raw)
To: linux-kernel
Hello,
I'm working on an enhanced version of Apache and I'm hitting my head
against something I don't understand.
I've found a (to me) inexplicable system behaviour: when the number of
Apache forked instances goes somewhere beyond 1050, the machine
suddenly slows down almost to a halt and becomes totally unresponsive
until I stop the test (SpecWeb).
Profiling the kernel shows that the scheduler and the interrupt handler
are taking most of the CPU time.
I understand that there must be a limit to the number of processes that
the scheduler can efficiently handle, but I would expect some sort of
gradual performance degradation when increasing the number of tasks.
Instead I observe that increasing Apache's MaxClients limit by as
little as 10 can cause a sudden transition from smooth operation with
lots (30-40%) of CPU idle to a total lock-up.
Moreover the max number of processes is not even constant. If I increase
the server load gradually then I manage to have 1500 processes running
with no problem, but if the transition is sharp (the SpecWeb case) then
I end up with a lock-up.
Anybody seen this before? Any clues?
- Fabio
* Re: linux scheduler limitations?
2001-03-29 21:19 linux scheduler limitations? Fabio Riccardi
@ 2001-03-29 21:26 ` David Lang
2001-03-29 21:55 ` Fabio Riccardi
2001-03-29 21:35 ` J . A . Magallon
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: David Lang @ 2001-03-29 21:26 UTC (permalink / raw)
To: Fabio Riccardi; +Cc: linux-kernel
2.2 or 2.4 kernel?
the 2.4 does a MUCH better job of dealing with large numbers of processes.
David Lang
On Thu, 29 Mar 2001, Fabio Riccardi wrote:
> Date: Thu, 29 Mar 2001 13:19:05 -0800
> From: Fabio Riccardi <fabio@chromium.com>
> To: linux-kernel@vger.kernel.org
> Subject: linux scheduler limitations?
>
> Hello,
>
> I'm working on an enhanced version of Apache and I'm hitting my head
> against something I don't understand.
>
> I've found a (to me) inexplicable system behaviour: when the number of
> Apache forked instances goes somewhere beyond 1050, the machine
> suddenly slows down almost to a halt and becomes totally unresponsive
> until I stop the test (SpecWeb).
>
> Profiling the kernel shows that the scheduler and the interrupt handler
> are taking most of the CPU time.
>
> I understand that there must be a limit to the number of processes that
> the scheduler can efficiently handle, but I would expect some sort of
> gradual performance degradation when increasing the number of tasks.
> Instead I observe that increasing Apache's MaxClients limit by as
> little as 10 can cause a sudden transition from smooth operation with
> lots (30-40%) of CPU idle to a total lock-up.
>
> Moreover the max number of processes is not even constant. If I increase
> the server load gradually then I manage to have 1500 processes running
> with no problem, but if the transition is sharp (the SpecWeb case) then
> I end up with a lock-up.
>
> Anybody seen this before? Any clues?
>
> - Fabio
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
* Re: linux scheduler limitations?
2001-03-29 21:19 linux scheduler limitations? Fabio Riccardi
2001-03-29 21:26 ` David Lang
@ 2001-03-29 21:35 ` J . A . Magallon
2001-03-29 22:12 ` Fabio Riccardi
2001-03-30 6:52 ` Giuliano Pochini
2001-04-02 22:58 ` Alan Cox
3 siblings, 1 reply; 11+ messages in thread
From: J . A . Magallon @ 2001-03-29 21:35 UTC (permalink / raw)
To: Fabio Riccardi; +Cc: linux-kernel
On 03.29 Fabio Riccardi wrote:
>
> I've found a (to me) inexplicable system behaviour: when the number of
> Apache forked instances goes somewhere beyond 1050, the machine
> suddenly slows down almost to a halt and becomes totally unresponsive
> until I stop the test (SpecWeb).
>
Have you thought about pthreads (when you talk about fork, I suppose you
mean literally 'fork()')?
I give a course on Parallel Programming at the university and the practical
work was done with POSIX threads. One of my students picked up the idea and
applied it to his assignment for another course, on Networks: he replaced
the traditional 'fork()' in a simple ftp server he had to implement with
'pthread_create' and got a 10-30 speedup (conns per second).
And you will get rid of some process-per-user limit. But you will run into
a threads-per-user limit, if there is any.
And you can also control its scheduling, to make each thread compete against
the whole system or only against its siblings.
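The scheduling control described above corresponds to the POSIX contention
scope attribute, set with pthread_attr_setscope(): PTHREAD_SCOPE_SYSTEM makes
a thread compete with every thread on the system, PTHREAD_SCOPE_PROCESS only
with the threads of its own process. The following is a minimal sketch, not
code from the thread; the helper names worker and run_with_scope are invented
for illustration. Note that Linux implements only system scope, so a request
for process scope is expected to fail there with ENOTSUP.

```c
#include <assert.h>
#include <pthread.h>

/* Trivial thread body, used only for this illustration. */
void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Create and join one thread with the requested contention scope.
 * Returns 0 on success, or the error number reported by pthreads
 * (for example ENOTSUP if the scope is not supported).
 */
int run_with_scope(int scope)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    assert(scope == PTHREAD_SCOPE_SYSTEM || scope == PTHREAD_SCOPE_PROCESS);

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    /* Ask for the scope; the thread is only created if this succeeds. */
    err = pthread_attr_setscope(&attr, scope);
    if (err == 0) {
        err = pthread_create(&tid, &attr, worker, NULL);
        if (err == 0)
            err = pthread_join(tid, NULL);
    }

    pthread_attr_destroy(&attr);
    return err;
}
```

Calling run_with_scope(PTHREAD_SCOPE_SYSTEM) should return 0 on Linux; the
return value for PTHREAD_SCOPE_PROCESS depends on the platform.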
--
J.A. Magallon # Let the source
mailto:jamagallon@able.es # be with you, Luke...
Linux werewolf 2.4.2-ac28 #1 SMP Thu Mar 29 16:41:17 CEST 2001 i686
* Re: linux scheduler limitations?
2001-03-29 21:26 ` David Lang
@ 2001-03-29 21:55 ` Fabio Riccardi
2001-03-30 1:45 ` Mike Kravetz
0 siblings, 1 reply; 11+ messages in thread
From: Fabio Riccardi @ 2001-03-29 21:55 UTC (permalink / raw)
To: David Lang; +Cc: linux-kernel
I'm using 2.4.2-ac26, but I've noticed the same behavior with all the 2.4
kernels I've seen so far.
I haven't even tried on 2.2
- Fabio
David Lang wrote:
> 2.2 or 2.4 kernel?
>
> the 2.4 does a MUCH better job of dealing with large numbers of processes.
>
> David Lang
>
> On Thu, 29 Mar 2001, Fabio Riccardi wrote:
>
> > Date: Thu, 29 Mar 2001 13:19:05 -0800
> > From: Fabio Riccardi <fabio@chromium.com>
> > To: linux-kernel@vger.kernel.org
> > Subject: linux scheduler limitations?
> >
> > Hello,
> >
> > I'm working on an enhanced version of Apache and I'm hitting my head
> > against something I don't understand.
> >
> > I've found a (to me) inexplicable system behaviour: when the number of
> > Apache forked instances goes somewhere beyond 1050, the machine
> > suddenly slows down almost to a halt and becomes totally unresponsive
> > until I stop the test (SpecWeb).
> >
> > Profiling the kernel shows that the scheduler and the interrupt handler
> > are taking most of the CPU time.
> >
> > I understand that there must be a limit to the number of processes that
> > the scheduler can efficiently handle, but I would expect some sort of
> > gradual performance degradation when increasing the number of tasks.
> > Instead I observe that increasing Apache's MaxClients limit by as
> > little as 10 can cause a sudden transition from smooth operation with
> > lots (30-40%) of CPU idle to a total lock-up.
> >
> > Moreover the max number of processes is not even constant. If I increase
> > the server load gradually then I manage to have 1500 processes running
> > with no problem, but if the transition is sharp (the SpecWeb case) then
> > I end up with a lock-up.
> >
> > Anybody seen this before? Any clues?
> >
> > - Fabio
* Re: linux scheduler limitations?
2001-03-29 21:35 ` J . A . Magallon
@ 2001-03-29 22:12 ` Fabio Riccardi
2001-03-29 22:33 ` J . A . Magallon
0 siblings, 1 reply; 11+ messages in thread
From: Fabio Riccardi @ 2001-03-29 22:12 UTC (permalink / raw)
To: J . A . Magallon; +Cc: linux-kernel
Apache uses a pre-fork "threading" mechanism: it spawns (fork()s) new instances
of itself whenever it finds out that the number of idle "threads" is below a
certain (configurable) threshold.
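The pre-fork idea can be sketched roughly as below. This is an illustration
of the general technique only, not Apache's actual code: the function names
children_to_spawn and spawn_children, and the parameters min_spare and
max_clients, are hypothetical stand-ins for the configurable threshold and
the process cap.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Decide how many children to fork so that at least min_spare processes
 * are idle, without letting the total exceed max_clients.
 */
int children_to_spawn(int idle, int total, int min_spare, int max_clients)
{
    assert(idle >= 0 && idle <= total);
    assert(min_spare >= 0 && total <= max_clients);

    int wanted = min_spare - idle;   /* shortfall of idle processes */
    int room = max_clients - total;  /* processes we may still create */

    if (wanted <= 0)
        return 0;
    return wanted < room ? wanted : room;
}

/*
 * Fork up to n children. Each child runs serve() and then exits.
 * Returns the number of children actually started; a real parent would
 * also reap them later with waitpid().
 */
int spawn_children(int n, void (*serve)(void))
{
    int started = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                   /* fork failed, e.g. process limit hit */
        if (pid == 0) {
            serve();                 /* child: handle requests */
            _exit(0);
        }
        started++;                   /* parent: count the new child */
    }
    return started;
}
```

A parent loop would periodically call children_to_spawn() with its current
counts and pass the result to spawn_children().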
Despite all appearances, this method performs beautifully on Linux. Pthreads are
actually slower in many cases, since you incur some additional overhead due
to thread synchronization and scheduling.
The problem is that beyond a certain number of processes the scheduler just goes
bananas, or so it seems to me.
Since Linux threads are mapped onto processes, I don't think that (p)threads would
help in any way, unless it is the VM context switch overhead that is playing a
role here, which I don't think is the case.
- Fabio
"J . A . Magallon" wrote:
> On 03.29 Fabio Riccardi wrote:
> >
> > I've found a (to me) inexplicable system behaviour: when the number of
> > Apache forked instances goes somewhere beyond 1050, the machine
> > suddenly slows down almost to a halt and becomes totally unresponsive
> > until I stop the test (SpecWeb).
> >
>
> Have you thought about pthreads (when you talk about fork, I suppose you
> mean literally 'fork()')?
>
> I give a course on Parallel Programming at the university and the practical
> work was done with POSIX threads. One of my students picked up the idea and
> applied it to his assignment for another course, on Networks: he replaced
> the traditional 'fork()' in a simple ftp server he had to implement with
> 'pthread_create' and got a 10-30 speedup (conns per second).
>
> And you will get rid of some process-per-user limit. But you will run into
> a threads-per-user limit, if there is any.
>
> And you can also control its scheduling, to make each thread compete against
> the whole system or only against its siblings.
>
> --
> J.A. Magallon # Let the source
> mailto:jamagallon@able.es # be with you, Luke...
>
> Linux werewolf 2.4.2-ac28 #1 SMP Thu Mar 29 16:41:17 CEST 2001 i686
* Re: linux scheduler limitations?
2001-03-29 22:12 ` Fabio Riccardi
@ 2001-03-29 22:33 ` J . A . Magallon
2001-03-29 22:51 ` Fabio Riccardi
0 siblings, 1 reply; 11+ messages in thread
From: J . A . Magallon @ 2001-03-29 22:33 UTC (permalink / raw)
To: Fabio Riccardi; +Cc: linux-kernel
On 03.30 Fabio Riccardi wrote:
>
> Despite all appearances, this method performs beautifully on Linux. Pthreads
> are actually slower in many cases, since you incur some additional overhead
> due to thread synchronization and scheduling.
>
It all depends on your app, as with every parallel algorithm. In a web-ftp-whatever
server, you do not need any synchronization. You can start threads in free run and
let them die alone.
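Starting threads "in free run" and letting them "die alone" maps onto POSIX
detached threads, which release their resources on exit without being joined.
A minimal sketch follows, under the assumption that each connection is an
already accepted file descriptor; handle_connection and
serve_in_detached_thread are illustrative names, and the handler here only
closes the descriptor where a real server would read a request and reply.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical per-connection handler. A real server would read the
 * request from fd and write a response; this one just closes fd.
 */
void *handle_connection(void *arg)
{
    int fd = (int)(intptr_t)arg;

    close(fd);
    return NULL;
}

/*
 * Hand fd to a new detached thread. Nobody joins the thread; it cleans
 * up after itself when handle_connection() returns.
 * Returns 0 on success or a pthreads error number.
 */
int serve_in_detached_thread(int fd)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    assert(fd >= 0);

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (err == 0)
        err = pthread_create(&tid, &attr, handle_connection,
                             (void *)(intptr_t)fd);

    pthread_attr_destroy(&attr);
    return err;
}
```

In an accept loop, the descriptor returned by accept() would be passed
straight to serve_in_detached_thread(), in place of a fork() per connection.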
> The problem is that beyond a certain number of processes the scheduler just
> goes bananas, or so it seems to me.
>
> Since Linux threads are mapped onto processes, I don't think that (p)threads
> would help in any way, unless it is the VM context switch overhead that is
> playing a role here, which I don't think is the case.
>
You said, 'mapped'.
AFAIK, that is the advantage, you can avoid the VM switch by sharing memory.
--
J.A. Magallon # Let the source
mailto:jamagallon@able.es # be with you, Luke...
Linux werewolf 2.4.2-ac28 #1 SMP Thu Mar 29 16:41:17 CEST 2001 i686
* Re: linux scheduler limitations?
2001-03-29 22:33 ` J . A . Magallon
@ 2001-03-29 22:51 ` Fabio Riccardi
0 siblings, 0 replies; 11+ messages in thread
From: Fabio Riccardi @ 2001-03-29 22:51 UTC (permalink / raw)
To: J . A . Magallon; +Cc: linux-kernel
"J . A . Magallon" wrote:
> It all depends on your app, as with every parallel algorithm. In a web-ftp-whatever
> server, you do not need any synchronization. You can start threads in free run and
> let them die alone.
Even if you don't need synchronization you pay for it anyway, since you will have
to use the pthread version of libc, which is reentrant. Moreover many calls (e.g.
accept) are "scheduling points" for pthreads: whenever you call them the runtime
will perform quite a bit of bookkeeping.
It is instructive to use a profiler on your application and see what happens when
you use pthreads...
> You said, 'mapped'.
> AFAIK, that is the advantage, you can avoid the VM switch by sharing memory.
If your application uses lots of memory then I agree. Apache only uses a tiny
amount of RAM per instance though, so I don't think that is my case.
- Fabio
* Re: linux scheduler limitations?
2001-03-29 21:55 ` Fabio Riccardi
@ 2001-03-30 1:45 ` Mike Kravetz
2001-03-30 2:58 ` Fabio Riccardi
0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2001-03-30 1:45 UTC (permalink / raw)
To: Fabio Riccardi; +Cc: linux-kernel
On Thu, Mar 29, 2001 at 01:55:11PM -0800, Fabio Riccardi wrote:
> I'm using 2.4.2-ac26, but I've noticed the same behavior with all the 2.4
> kernels I've seen so far.
>
> I haven't even tried on 2.2
>
> - Fabio
Fabio,
Just for fun, you might want to try out some of our scheduler patches
located at:
http://lse.sourceforge.net/scheduling/
I would be interested in your observations.
--
Mike Kravetz mkravetz@sequent.com
IBM Linux Technology Center
* Re: linux scheduler limitations?
2001-03-30 1:45 ` Mike Kravetz
@ 2001-03-30 2:58 ` Fabio Riccardi
0 siblings, 0 replies; 11+ messages in thread
From: Fabio Riccardi @ 2001-03-30 2:58 UTC (permalink / raw)
To: Mike Kravetz, linux-kernel
Hi Mike,
Somebody else on the list already pointed me at your stuff and I quickly
downloaded your multiqueue patch for 2.4.1 to try it out.
It works great! I finally manage to get 100% CPU utilization and keep the
machine decently responsive.
On a dual 1GHz Pentium box I went from 1300 SpecWeb to 1600. That's pretty
amazing.
There is a bit more overhead though, I'd say around 5%, when the CPU is not
fully loaded.
What is the status of your code? Is it going to end up in the mainstream
kernel?
Do you have a port to the 2.4.2x kernels?
In my enthusiasm I tried to port the patch to 2.4.2-ac26 but I broke
something and it didn't work anymore... :)
I haven't tried the pooling patch yet; it didn't seem to make much sense on a
2-way box. I have an 8-way on which I'm planning to bench my web server
enhancements, so I'll try the pooling stuff on it.
BTW: interested in the fastest linux web server?
BTW2: what about the HP scheduler patches?
Thanks, ciao,
- Fabio
Mike Kravetz wrote:
> On Thu, Mar 29, 2001 at 01:55:11PM -0800, Fabio Riccardi wrote:
> > I'm using 2.4.2-ac26, but I've noticed the same behavior with all the 2.4
> > kernels I've seen so far.
> >
> > I haven't even tried on 2.2
> >
> > - Fabio
>
> Fabio,
>
> Just for fun, you might want to try out some of our scheduler patches
> located at:
>
> http://lse.sourceforge.net/scheduling/
>
> I would be interested in your observations.
>
> --
> Mike Kravetz mkravetz@sequent.com
> IBM Linux Technology Center
* RE: linux scheduler limitations?
2001-03-29 21:19 linux scheduler limitations? Fabio Riccardi
2001-03-29 21:26 ` David Lang
2001-03-29 21:35 ` J . A . Magallon
@ 2001-03-30 6:52 ` Giuliano Pochini
2001-04-02 22:58 ` Alan Cox
3 siblings, 0 replies; 11+ messages in thread
From: Giuliano Pochini @ 2001-03-30 6:52 UTC (permalink / raw)
To: Fabio Riccardi; +Cc: linux-kernel
On 29-Mar-01 Fabio Riccardi wrote:
> Hello,
>
> I'm working on an enhanced version of Apache and I'm hitting my head
> against something I don't understand.
>
> I've found a (to me) inexplicable system behaviour: when the number of
> Apache forked instances goes somewhere beyond 1050, the machine
> suddenly slows down almost to a halt and becomes totally unresponsive
> until I stop the test (SpecWeb).
Are you using 2.2.x ? I had the same problem here until I switched
to 2.4.x. 2.2 internal locks are not fine grained enough.
Bye.
Giuliano Pochini ->)|(<- Shiny Network {AS6665} ->)|(<-
* Re: linux scheduler limitations?
2001-03-29 21:19 linux scheduler limitations? Fabio Riccardi
` (2 preceding siblings ...)
2001-03-30 6:52 ` Giuliano Pochini
@ 2001-04-02 22:58 ` Alan Cox
3 siblings, 0 replies; 11+ messages in thread
From: Alan Cox @ 2001-04-02 22:58 UTC (permalink / raw)
To: Fabio Riccardi; +Cc: linux-kernel
> I've found a (to me) inexplicable system behaviour: when the number of
> Apache forked instances goes somewhere beyond 1050, the machine
> suddenly slows down almost to a halt and becomes totally unresponsive
> until I stop the test (SpecWeb).
I'm surprised it gets that far.
> Moreover the max number of processes is not even constant. If I increase
> the server load gradually then I manage to have 1500 processes running
> with no problem, but if the transition is sharp (the SpecWeb case) then
> I end up with a lock-up.
With that many servers and a sudden load you are probably causing a lot of
paging. What kernel version? And while this isn't a solution to kernel issues,
take a look at thttpd instead (www.acme.com). If you have 1500 8K stacks
thrashing in your cache you are not going to have good performance.
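To put a number on the stack remark: 1500 processes with 8 KiB of stack each
come to roughly 11.7 MiB. The sketch below just does that arithmetic. The
function names are made up for illustration, and the 256 KiB cache size used
in the usage note is an assumed figure for comparison, not something stated
in this thread. The total is an upper bound, since not every byte of every
stack is touched.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes of stack for nprocs processes with stack_bytes each. */
size_t stack_footprint(size_t nprocs, size_t stack_bytes)
{
    /* Guard against overflow in the multiplication. */
    assert(stack_bytes == 0 || nprocs <= (size_t)-1 / stack_bytes);
    return nprocs * stack_bytes;
}

/*
 * How many whole times the footprint fills a cache of cache_bytes.
 * cache_bytes is a caller-supplied assumption about the hardware.
 */
size_t footprint_to_cache_ratio(size_t footprint, size_t cache_bytes)
{
    assert(cache_bytes > 0);
    return footprint / cache_bytes;
}
```

stack_footprint(1500, 8192) gives 12288000 bytes; against an assumed 256 KiB
cache (262144 bytes) that is about 46 times the cache capacity.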