* fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-20 18:26 UTC
To: linux-kernel
I upgraded to Fedora Core 9 (kernel 2.6.25.3-18.fc9.x86_64), and now I
regularly get this error when I try to log in to the box, kill a process,
start a Python app, or do anything else.
fork: Resource temporarily unavailable
I have over 10 GB of RAM free and zero swap space used. The box is a
dual quad-core Intel Xeon 5405 with 16 GB of RAM.
There is no error message in /var/log/messages or dmesg.
How do I identify the problem?
Thanks!
uname -a
Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
x86_64 x86_64 x86_64 GNU/Linux
free -m
             total       used       free     shared    buffers     cached
Mem:         16086       3189      12896          0         42        666
-/+ buffers/cache:       2481      13605
Swap:         1983          0       1983
have only 505 processes running
ps aux | wc -l
505
uptime
11:24:15 up 39 min, 1 user, load average: 3.54, 3.47, 2.87
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 137216
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 20:28 UTC
To: mark; +Cc: linux-kernel

On Tue, 20 May 2008 11:26:47 -0700 mark wrote:

> I upgraded to 2.6.25.3-18.fc9.x86_64 fedora core 9, now I get this
> error when I try to login to the box, kill a pr start a python app, or
> do anything on a regular basis.
>
> fork: Resource temporarily unavailable
[snip]
> have only 505 processes running
> ps aux | wc -l
> 505
[snip]
> max user processes (-u) 1024

Hi,

The only place that fork() returns EAGAIN is for number of
processes being >= its limit. Does this user already have >= 1024
processes?

---
~Randy
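
Randy's question is about a per-user count, which `ps aux | wc -l` does not answer directly: that pipeline counts processes of all users (plus a header line), while getrlimit(2) describes RLIMIT_NPROC on Linux as a limit on threads for a real user ID. A minimal sketch of a per-UID task count, assuming a Linux /proc layout; `tasks_per_uid` and its `proc_root` parameter are hypothetical names introduced here for illustration, not something used in the thread:

```python
import os
from collections import Counter


def tasks_per_uid(proc_root="/proc"):
    """Count tasks (threads) per real UID by walking a /proc-style tree.

    Assumption: each numeric directory has a 'status' file with a
    'Uid:' line (real UID first) and a 'task' directory holding one
    entry per thread, as on Linux.
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'sys'
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "status")) as fh:
                uid_line = next(line for line in fh if line.startswith("Uid:"))
            real_uid = int(uid_line.split()[1])
            nthreads = len(os.listdir(os.path.join(pid_dir, "task")))
        except (OSError, StopIteration, ValueError, IndexError):
            continue  # process exited while we were reading, or odd entry
        counts[real_uid] += nthreads
    return counts
```

Calling `tasks_per_uid()[500]` on the affected box would give the number of tasks charged to UID 500, which can then be compared with that user's "max user processes" value.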

* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 20:39 UTC
To: Randy Dunlap; +Cc: linux-kernel

On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
[snip]
> The only place that fork() returns EAGAIN is for number of
> processes being >= its limit. Does this user already have >= 1024
> processes?

No, it is around 400:

ps ax | wc -l
417

I also increased max user processes to unlimited, and I still get the
error randomly.

ulimit -u
unlimited

My webserver is now throwing this error:

setuid(500) failed (11: Resource temporarily unavailable)

cat /etc/passwd | grep mark
mark:x:500:500::/home/mark:/bin/bash

I also increased this, but I still get the same error:
kernel.pid_max = 65536
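
The "11" in the webserver log is the same errno that fork reports: on Linux, errno 11 is EAGAIN, and setuid(2) documents EAGAIN for the case where the target UID is at its RLIMIT_NPROC limit. Resource limits are per process and inherited, so running `ulimit -u unlimited` in one shell does not change the limit of a daemon started elsewhere. A minimal sketch for decoding the errno and printing the limit of the current process; the helper names are hypothetical and introduced here only for illustration:

```python
import errno
import os
import resource


def explain_errno(code):
    """Return 'NAME (number): message' for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


def format_limit(value):
    """Render an rlimit value the way `ulimit` prints it."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Soft and hard RLIMIT_NPROC of the *current* process only."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return format_limit(soft), format_limit(hard)
```

Running `nproc_limits()` from inside the process that fails (rather than from an interactive shell) shows the limit that process actually inherited.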

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 20:50 UTC
To: mark; +Cc: linux-kernel

mark wrote:
> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
[snip]
>> The only place that fork() returns EAGAIN is for number of
>> processes being >= its limit. Does this user already have >= 1024
>> processes?
>
> No, it is around 400

Well, my comment was wrong anyway. There are several other tests just
below number of user processes that also return EAGAIN, like:

- total number of threads being too large
- error on grabbing a module reference count (?)
- error on grabbing a binfmt module reference

> ps ax | wc -l
> 417
>
> I also I increased max process to unlimited, and I still get the error
> randomly..
>
> ulimit -u
> unlimited
>
> my webserver is now throwing this error:
>
> setuid(500) failed (11: Resource temporarily unavailable)

That's all of the useful information??

> cat /etc/passwd | grep mark
> mark:x:500:500::/home/mark:/bin/bash
>
> I also increased this, but still the same error
> kernel.pid_max = 65536

--
~Randy

* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 21:08 UTC
To: Randy Dunlap; +Cc: linux-kernel

On Wed, May 21, 2008 at 1:50 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
[snip]
> Well, my comment was wrong anyway. There are several other tests just
> below number of user processes that also return EAGAIN, like:
>
> - total number of threads being too large
> - error on grabbing a module reference count (?)
> - error on grabbing a binfmt module reference

As a user, how do I identify what is wrong and fix it? For the total
number of threads: is there any way I can find out whether this is causing
the problem? My system is running around 80 multi-threaded Python web
apps.

>> my webserver is now throwing this error:
>>
>> setuid(500) failed (11: Resource temporarily unavailable)
>
> That's all of the useful information??

Yes. I get this error when I restart the web server. If I kill all
other apps and then start it again, it starts fine.

This is the complete error message:
2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource temporarily unavailable)
2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with fatal code 2 and can not be respawn

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Jesper Juhl @ 2008-05-21 21:15 UTC
To: mark; +Cc: Randy Dunlap, linux-kernel

2008/5/21 mark <markkicks@gmail.com>:
<snip>
>>> my webserver is now throwing this error:
>>>
>>> setuid(500) failed (11: Resource temporarily unavailable)
>>
>> That's all of the useful information??
>
> Yes. i get this error when i restart the web server. if i kill all
> other apps, and then start it again it starts fine.
>
> this is the complete error message,
> 2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource
> temporarily unavailable)
> 2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with
> fatal code 2 and can not be respawn

What about if you run 'dmesg'? are there any clues in that output?
any kernel stack traces? error messages? warnings? anything out of the
ordinary?

--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 21:27 UTC
To: Jesper Juhl; +Cc: Randy Dunlap, linux-kernel

On Wed, May 21, 2008 at 2:15 PM, Jesper Juhl <jesper.juhl@gmail.com> wrote:
<snip>
> What about if you run 'dmesg'? are there any clues in that output?
> any kernel stack traces? error messages? warnings? anything out of the
> ordinary?

No. No new messages have been added after the kernel boot messages.

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 21:32 UTC
To: mark; +Cc: linux-kernel

On Wed, 21 May 2008 14:08:53 -0700 mark wrote:

[snip]
> > Well, my comment was wrong anyway. There are several other tests just
> > below number of user processes that also return EAGAIN, like:
> >
> > - total number of threads being too large

Total number of threads currently running is in /proc/loadavg:

> cat /proc/loadavg
1.56 0.58 0.27 2/203 28500

It's the number following the '/', e.g., 203 on my desktop system.

max_threads allowed is a sysctl, so you can tune it if needed.
It's in /proc/sys/kernel/threads-max:

> cat /proc/sys/kernel/threads-max
32624

I sort of doubt that one is the problem, but you can tell us.

> > - error on grabbing a module reference count (?)
> > - error on grabbing a binfmt module reference
>
> as a user how do i identify what is wrong, and fix this? for total
> number of threads -> is there anyway i can find out if this is causing
> the problem? my system is running around 80 multi-threaded python web
> apps.

I can send you some debug patches that will print out the specific
problem area. Do you want to do that? Can you rebuild and install
a new kernel?

> >> my webserver is now throwing this error:
> >>
> >> setuid(500) failed (11: Resource temporarily unavailable)
> >
> > That's all of the useful information??
>
> Yes. i get this error when i restart the web server. if i kill all
> other apps, and then start it again it starts fine.
>
> this is the complete error message,
> 2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource
> temporarily unavailable)
> 2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with
> fatal code 2 and can not be respawn

---
~Randy
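
The comparison Randy describes can be sketched as below. In proc(5), the fourth field of /proc/loadavg is "runnable/total", where the number after the slash is the total count of scheduling entities (processes and threads) that currently exist. This is a minimal sketch; `parse_loadavg` and `thread_headroom` are hypothetical helper names, and they take the file contents as strings so the arithmetic can be checked against the sample values quoted above:

```python
def parse_loadavg(text):
    """Split one /proc/loadavg line into named fields."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),    # entities currently runnable
        "total_tasks": int(total),    # all existing processes + threads
        "last_pid": int(fields[4]),
    }


def thread_headroom(loadavg_text, threads_max_text):
    """Tasks that could still be created before hitting threads-max."""
    total = parse_loadavg(loadavg_text)["total_tasks"]
    return int(threads_max_text) - total
```

On a live system the two arguments would come from reading /proc/loadavg and /proc/sys/kernel/threads-max; a small or negative headroom would point at the system-wide thread check.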

* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 22:51 UTC
To: Randy Dunlap; +Cc: linux-kernel

On Wed, May 21, 2008 at 2:32 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
[snip]
> Total number of threads currently running is in /proc/loadavg:
>
>> cat /proc/loadavg
> 1.56 0.58 0.27 2/203 28500
>
> It's the number following the '/', e.g., 203 on my desktop system.
>
> max_threads allowed is a sysctl, so you can tune it if needed.
> It's in /proc/sys/kernel/threads-max:
>
>> cat /proc/sys/kernel/threads-max
> 32624
> I sort of doubt that one is the problem, but you can tell us.

cat /proc/loadavg
0.39 0.45 0.57 1/1412 12032
cat /proc/sys/kernel/threads-max
274432

You are right, I guess this is not the problem.

[snip]
> I can send you some debug patches that will print out the specific
> problem area. Do you want to do that? Can you rebuild and install
> a new kernel?

Is it possible to get these debug messages by turning on some flags?
If not, then yes, please send the debug patches. It's a live box, but I will
try to do it!

This is my system / kernel info:
uname -a
Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
x86_64 x86_64 x86_64 GNU/Linux

Thanks a lot!

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 23:35 UTC
To: mark; +Cc: linux-kernel

On Wed, 21 May 2008 15:51:55 -0700 mark wrote:

[snip]
> cat /proc/loadavg
> 0.39 0.45 0.57 1/1412 12032
> cat /proc/sys/kernel/threads-max
> 274432
> you are right, i guess this is not the problem.

[snip]
> > I can send you some debug patches that will print out the specific
> > problem area. Do you want to do that? Can you rebuild and install
> > a new kernel?
> Is it possible to get this debug messages by turning on some flags?
> If not yes, pl. send debug patches. its a live box and I will try to do it!
>
> This is my system / kernel info:
> uname -a
> Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
> x86_64 x86_64 x86_64 GNU/Linux

I made a small patch to a vanilla kernel.org 2.6.25.3 kernel tree.
Hopefully it applies cleanly to that fc9 kernel source, but check/verify
that first before going any further.

After building and booting with this patch, there will be kernel messages
whenever fork's "copy_process" function fails with -EAGAIN (-11), which
is reported to userspace as errno = 11 (Resource temporarily unavailable).

Hopefully this will identify which test is failing, but there's a chance
that something else is going on and that this patch does not find the
problem. Anyway, good luck and please report back on it.

---
---
 kernel/fork.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- linux-2.6.25.3.orig/kernel/fork.c
+++ linux-2.6.25.3/kernel/fork.c
@@ -1049,8 +1049,10 @@ static struct task_struct *copy_process(
 	if (atomic_read(&p->user->processes) >=
 			p->signal->rlim[RLIMIT_NPROC].rlim_cur) {
 		if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RESOURCE) &&
-		    p->user != current->nsproxy->user_ns->root_user)
+		    p->user != current->nsproxy->user_ns->root_user) {
+			printk(KERN_INFO "%s: error on #processes\n", __func__);
 			goto bad_fork_free;
+		}
 	}
 
 	atomic_inc(&p->user->__count);
@@ -1062,14 +1064,20 @@ static struct task_struct *copy_process(
 	 * triggers too late. This doesn't hurt, the check is only there
 	 * to stop root fork bombs.
 	 */
-	if (nr_threads >= max_threads)
+	if (nr_threads >= max_threads) {
+		printk(KERN_INFO "%s: error on #threads\n", __func__);
 		goto bad_fork_cleanup_count;
+	}
 
-	if (!try_module_get(task_thread_info(p)->exec_domain->module))
+	if (!try_module_get(task_thread_info(p)->exec_domain->module)) {
+		printk(KERN_INFO "%s: error on exec_domain->module\n", __func__);
 		goto bad_fork_cleanup_count;
+	}
 
-	if (p->binfmt && !try_module_get(p->binfmt->module))
+	if (p->binfmt && !try_module_get(p->binfmt->module)) {
+		printk(KERN_INFO "%s: error on binfmt->module\n", __func__);
 		goto bad_fork_cleanup_put_domain;
+	}
 
 	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
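
Once a kernel with this patch is running, each failing check prints its own message, prefixed with the function name via `__func__` (here `copy_process`). A minimal sketch for tallying those messages from saved kernel log text follows. The message strings are taken from the patch above; the function name `classify_fork_failures`, the descriptions in the table, and the log-line layout used in the usage note are assumptions made for illustration:

```python
import re
from collections import Counter

# Message text printed by the debug patch -> which check it belongs to.
PATCH_MESSAGES = {
    "error on #processes": "per-user process limit (RLIMIT_NPROC)",
    "error on #threads": "system-wide nr_threads >= max_threads",
    "error on exec_domain->module": "try_module_get() on exec_domain module",
    "error on binfmt->module": "try_module_get() on binfmt module",
}

# __func__ expands to 'copy_process' in the patched function.
_PATTERN = re.compile(r"copy_process: (error on \S+)")


def classify_fork_failures(lines):
    """Count how often each patched check appears in kernel log lines."""
    counts = Counter()
    for line in lines:
        match = _PATTERN.search(line)
        if match and match.group(1) in PATCH_MESSAGES:
            counts[PATCH_MESSAGES[match.group(1)]] += 1
    return counts
```

For example, feeding it the lines of a saved `dmesg` dump and printing the resulting counter would show which of the four checks fired and how often; lines that do not come from the patch are ignored.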

* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-22 0:09 UTC
To: Randy Dunlap; +Cc: linux-kernel

On Wed, May 21, 2008 at 4:35 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
[snip]
> I made a small patch to a vanilla kernel.org 2.6.25.3 kernel tree.
> Hopefully it applies cleanly to that fc9 kernel source, but check/verify
> that first before going any further.

Thanks a lot for the patch.
This is kind of weird, but there is no file kernel/fork.c:

[mark@localhost 2.6.25.3-18.fc9.i686]$ find . -iname '*fork*' -print
[mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql
[mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql kernel-devel | grep fork
[mark@localhost 2.6.25.3-18.fc9.i686]$ pwd
/usr/src/kernels/2.6.25.3-18.fc9.i686

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-22 0:29 UTC
To: mark; +Cc: linux-kernel

mark wrote:
[snip]
> Thanks a lot for the patch,
> This is kind of weird.. but there is no file kernel/fork.c
>
> [mark@localhost 2.6.25.3-18.fc9.i686]$ find . -iname '*fork*' -print
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql kernel-devel | grep fork
> [mark@localhost 2.6.25.3-18.fc9.i686]$ pwd
> /usr/src/kernels/2.6.25.3-18.fc9.i686

That's not a kernel source tree. I'm no expert on fc nor on src.rpm's,
but I think that you need to get the fc kernel-2.6.25.3-18.src.rpm file
(or something like that). Or use a plain vanilla kernel.org 2.6.25.3
kernel tree.

--
~Randy
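
Before trying to apply the patch, it can help to confirm that a directory really contains the file the patch touches, which is what the empty `find` output above was hinting at. A minimal sketch, assuming only that a usable tree has `kernel/fork.c` relative to its root, as the patch paths imply; `has_fork_source` is a hypothetical helper name:

```python
import os


def has_fork_source(tree_root):
    """True if tree_root contains kernel/fork.c, the file the patch edits."""
    return os.path.isfile(os.path.join(tree_root, "kernel", "fork.c"))
```

A result of `False` for a candidate directory means the patch cannot apply there and a full source tree is needed instead.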

* Re: fork: Resource temporarily unavailable / cant start new threads
From: するくめ @ 2008-05-22 7:11 UTC
To: linux-kernel

Take a look at this guide to install the kernel source on fedora 9
http://www.mjmwired.net/resources/mjm-fedora-f9.html#kernelsrc

2008/5/22 mark <markkicks@gmail.com>:
[snip]
> Thanks a lot for the patch,
> This is kind of weird.. but there is no file kernel/fork.c
>
> [mark@localhost 2.6.25.3-18.fc9.i686]$ find . -iname '*fork*' -print
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql kernel-devel | grep fork
> [mark@localhost 2.6.25.3-18.fc9.i686]$ pwd
> /usr/src/kernels/2.6.25.3-18.fc9.i686

--
するくめ

* Re: fork: Resource temporarily unavailable / cant start new threads
From: Johannes Weiner @ 2008-05-21 20:39 UTC
To: Randy Dunlap; +Cc: mark, linux-kernel

Hi,

Randy Dunlap <randy.dunlap@oracle.com> writes:

> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>> have only 505 processes running
>> ps aux | wc -l
>> 505

[ quoting deleted for clarification ]

>> max user processes (-u) 1024

> The only place that fork() returns EAGAIN is for number of
> processes being >= its limit. Does this user already have >= 1024
> processes?

	Hannes