* fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-20 18:26 UTC (permalink / raw)
To: linux-kernel
I upgraded to 2.6.25.3-18.fc9.x86_64 (Fedora Core 9), and now I get this
error when I try to log in to the box, kill a process, start a Python app, or
do anything else on a regular basis.
fork: Resource temporarily unavailable
I have over 10GB RAM free, and zero swap space used. The box is a
dual quad-core Intel Xeon 5405 with 16GB RAM.
There is no error message in /var/log/messages or dmesg.
How do I identify the problem?
Thanks!
uname -a
Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
x86_64 x86_64 x86_64 GNU/Linux
free -m
             total       used       free     shared    buffers     cached
Mem:         16086       3189      12896          0         42        666
-/+ buffers/cache:       2481      13605
Swap:         1983          0       1983
I have only 505 processes running:
ps aux | wc -l
505
uptime
11:24:15 up 39 min, 1 user, load average: 3.54, 3.47, 2.87
ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 137216
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 32768
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 20:28 UTC (permalink / raw)
To: mark; +Cc: linux-kernel
On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
> [snip]
> max user processes (-u) 1024
Hi,
The only place that fork() returns EAGAIN is for number of
processes being >= its limit. Does this user already have >= 1024
processes?
---
~Randy
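One way to answer that question is to count every task the user owns. On Linux the RLIMIT_NPROC check counts threads as well as processes, while `ps aux | wc -l` prints one line per process, so the two numbers can differ a lot on a box running multi-threaded applications. A minimal sketch, assuming a procps-style ps; the function names are made up for illustration:

```shell
# Sum per-process thread counts (one integer per line) read from stdin.
sum_thread_counts() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Count all tasks (processes plus threads) owned by a real user ID.
# Assumes procps ps: -U selects by real user, "-o nlwp=" prints the
# number of threads per process without a header line.
user_task_count() {
    ps -U "$1" -o nlwp= | sum_thread_counts
}

# Example: compare the result with the soft limit printed by `ulimit -u`.
# user_task_count mark
```

If the count is close to or above the `max user processes` value, the per-user limit is a plausible suspect even though the process count looks low.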
* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 20:39 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-kernel
On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>> [snip]
> The only place that fork() returns EAGAIN is for number of
> processes being >= its limit. Does this user already have >= 1024
> processes?
No, it is around 400
ps ax | wc -l
417
I also increased max processes to unlimited, and I still get the error
randomly.
ulimit -u
unlimited
my webserver is now throwing this error:
setuid(500) failed (11: Resource temporarily unavailable)
cat /etc/passwd | grep mark
mark:x:500:500::/home/mark:/bin/bash
I also increased this, but still the same error
kernel.pid_max = 65536
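Note that `ulimit -u` only changes the limit for the current shell and whatever it starts afterwards; daemons that are already running keep the limits they were started with. On kernels that expose /proc/<pid>/limits (2.6.24 and later, as far as I know), the limit a running process actually has can be read directly. A rough sketch; the function name is hypothetical and /proc/self is only a stand-in for the PID of interest:

```shell
# Print the soft "Max processes" limit from /proc/<pid>/limits-style
# text supplied on stdin (columns: name, soft limit, hard limit, units).
soft_nproc_limit() {
    awk '/^Max processes/ { print $3 }'
}

# Example: limit of the current shell; substitute a daemon's PID for "self".
# soft_nproc_limit < /proc/self/limits
```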
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Johannes Weiner @ 2008-05-21 20:39 UTC (permalink / raw)
To: Randy Dunlap; +Cc: mark, linux-kernel
Hi,
Randy Dunlap <randy.dunlap@oracle.com> writes:
> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>> have only 505 processes running
>> ps aux | wc -l
>> 505
[ quoting deleted for clarification ]
>> max user processes (-u) 1024
> The only place that fork() returns EAGAIN is for number of
> processes being >= its limit. Does this user already have >= 1024
> processes?
Hannes
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 20:50 UTC (permalink / raw)
To: mark; +Cc: linux-kernel
mark wrote:
> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>>> [snip]
>> The only place that fork() returns EAGAIN is for number of
>> processes being >= its limit. Does this user already have >= 1024
>> processes?
>
> No, it is around 400
Well, my comment was wrong anyway. There are several other tests just
below number of user processes that also return EAGAIN, like:
- total number of threads being too large
- error on grabbing a module reference count (?)
- error on grabbing a binfmt module reference
> ps ax | wc -l
> 417
>
> I also I increased max process to unlimited, and I still get the error
> randomly..
>
> ulimit -u
> unlimited
>
> my webserver is now throwing this error:
>
> setuid(500) failed (11: Resource temporarily unavailable)
That's all of the useful information??
>
> cat /etc/passwd | grep mark
> mark:x:500:500::/home/mark:/bin/bash
>
> I also increased this, but still the same error
> kernel.pid_max = 65536
--
~Randy
* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 21:08 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-kernel
On Wed, May 21, 2008 at 1:50 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> mark wrote:
>>
>> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com>
>> wrote:
>>>
>>> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>>>>
>>>> [snip]
>>>
>>> The only place that fork() returns EAGAIN is for number of
>>> processes being >= its limit. Does this user already have >= 1024
>>> processes?
>>
>> No, it is around 400
>
> Well, my comment was wrong anyway. There are several other tests just
> below number of user processes that also return EAGAIN, like:
>
> - total number of threads being too large
> - error on grabbing a module reference count (?)
> - error on grabbing a binfmt module reference
As a user, how do I identify what is wrong and fix it? For the total
number of threads: is there any way I can find out whether this is causing
the problem? My system is running around 80 multi-threaded Python web
apps.
>> my webserver is now throwing this error:
>>
>> setuid(500) failed (11: Resource temporarily unavailable)
>
> That's all of the useful information??
Yes. I get this error when I restart the web server. If I kill all
other apps and then start it again, it starts fine.
This is the complete error message:
2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource temporarily unavailable)
2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with fatal code 2 and can not be respawn
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Jesper Juhl @ 2008-05-21 21:15 UTC (permalink / raw)
To: mark; +Cc: Randy Dunlap, linux-kernel
2008/5/21 mark <markkicks@gmail.com>:
<snip>
>>> my webserver is now throwing this error:
>>>
>>> setuid(500) failed (11: Resource temporarily unavailable)
>>
>> That's all of the useful information??
>
> Yes. i get this error when i restart the web server. if i kill all
> other apps, and then start it again it starts fine.
>
> this is the complete error message,
> 2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource
> temporarily unavailable)
> 2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with
> fatal code 2 and can not be respawn
What about if you run 'dmesg'? are there any clues in that output?
any kernel stack traces? error messages? warnings? anything out of the
ordinary?
--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 21:27 UTC (permalink / raw)
To: Jesper Juhl; +Cc: Randy Dunlap, linux-kernel
On Wed, May 21, 2008 at 2:15 PM, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> 2008/5/21 mark <markkicks@gmail.com>:
> <snip>
>>>> my webserver is now throwing this error:
>>>>
>>>> setuid(500) failed (11: Resource temporarily unavailable)
>>>
>>> That's all of the useful information??
>>
>> Yes. i get this error when i restart the web server. if i kill all
>> other apps, and then start it again it starts fine.
>>
>> this is the complete error message,
>> 2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource
>> temporarily unavailable)
>> 2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with
>> fatal code 2 and can not be respawn
>
> What about if you run 'dmesg'? are there any clues in that output?
> any kernel stack traces? error messages? warnings? anything out of the
> ordinary?
No.
No new messages have been added after the kernel boot messages.
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 21:32 UTC (permalink / raw)
To: mark; +Cc: linux-kernel
On Wed, 21 May 2008 14:08:53 -0700 mark wrote:
> On Wed, May 21, 2008 at 1:50 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> > mark wrote:
> >>
> >> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com>
> >> wrote:
> >>>
> >>> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
> >>>>
> >>>> [snip]
> >>>
> >>> The only place that fork() returns EAGAIN is for number of
> >>> processes being >= its limit. Does this user already have >= 1024
> >>> processes?
> >>
> >> No, it is around 400
> >
> > Well, my comment was wrong anyway. There are several other tests just
> > below number of user processes that also return EAGAIN, like:
> >
> > - total number of threads being too large
Total number of threads currently running is in /proc/loadavg:
> cat /proc/loadavg
1.56 0.58 0.27 2/203 28500
It's the number following the '/', e.g., 203 on my desktop system.
max_threads allowed is a sysctl, so you can tune it if needed.
It's in /proc/sys/kernel/threads-max:
> cat /proc/sys/kernel/threads-max
32624
I sort of doubt that one is the problem, but you can tell us.
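The two reads above can be folded into one check. What follows is only a sketch: the function names are made up for illustration, and the sample numbers in the usage note are the ones shown in this message.

```shell
# Extract the total task count (the number after '/') from a
# /proc/loadavg line passed as the first argument.
total_tasks() {
    printf '%s\n' "$1" | awk '{ split($4, parts, "/"); print parts[2] }'
}

# Remaining tasks before the system-wide threads-max limit is reached.
# $1 = current task count, $2 = value of kernel.threads-max.
task_headroom() {
    echo $(( $2 - $1 ))
}

# On a live system:
# task_headroom "$(total_tasks "$(cat /proc/loadavg)")" \
#               "$(cat /proc/sys/kernel/threads-max)"
```

With the desktop values above, `total_tasks '1.56 0.58 0.27 2/203 28500'` yields 203, and `task_headroom 203 32624` leaves 32421 tasks of headroom.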
> > - error on grabbing a module reference count (?)
> > - error on grabbing a binfmt module reference
>
> as a user how do i identify what is wrong, and fix this? for total
> number of threads -> is there anyway i can find out if this is causing
> the problem? my system is running around 80 multi-threaded python web
> apps.
I can send you some debug patches that will print out the specific
problem area. Do you want to do that? Can you rebuild and install
a new kernel?
> >> my webserver is now throwing this error:
> >>
> >> setuid(500) failed (11: Resource temporarily unavailable)
> >
> > That's all of the useful information??
>
> Yes. i get this error when i restart the web server. if i kill all
> other apps, and then start it again it starts fine.
>
> this is the complete error message,
> 2008/05/21 08:02:19 [emerg] 30558#0: setuid(500) failed (11: Resource
> temporarily unavailable)
> 2008/05/21 08:02:19 [alert] 30557#0: worker process 30558 exited with
> fatal code 2 and can not be respawn
---
~Randy
* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-21 22:51 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-kernel
On Wed, May 21, 2008 at 2:32 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> On Wed, 21 May 2008 14:08:53 -0700 mark wrote:
>
>> On Wed, May 21, 2008 at 1:50 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>> > mark wrote:
>> >>
>> >> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com>
>> >> wrote:
>> >>>
>> >>> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>> >>>>
>> >>>> [snip]
>> >>>
>> >>> The only place that fork() returns EAGAIN is for number of
>> >>> processes being >= its limit. Does this user already have >= 1024
>> >>> processes?
>> >>
>> >> No, it is around 400
>> >
>> > Well, my comment was wrong anyway. There are several other tests just
>> > below number of user processes that also return EAGAIN, like:
>> >
>> > - total number of threads being too large
>
> Total number of threads currently running is in /proc/loadavg:
>
>> cat /proc/loadavg
> 1.56 0.58 0.27 2/203 28500
>
> It's the number following the '/', e.g., 203 on my desktop system.
>
> max_threads allowed is a sysctl, so you can tune it if needed.
> It's in /proc/sys/kernel/threads-max:
>
>> cat /proc/sys/kernel/threads-max
> 32624
> I sort of doubt that one is the problem, but you can tell us.
cat /proc/loadavg
0.39 0.45 0.57 1/1412 12032
cat /proc/sys/kernel/threads-max
274432
You are right, I guess this is not the problem.
>> > - error on grabbing a module reference count (?)
>> > - error on grabbing a binfmt module reference
>>
>> as a user how do i identify what is wrong, and fix this? for total
>> number of threads -> is there anyway i can find out if this is causing
>> the problem? my system is running around 80 multi-threaded python web
>> apps.
>
> I can send you some debug patches that will print out the specific
> problem area. Do you want to do that? Can you rebuild and install
> a new kernel?
Is it possible to get these debug messages by turning on some flags?
If not, then yes, please send the debug patches. It's a live box, but I will try to do it!
This is my system / kernel info:
uname -a
Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
x86_64 x86_64 x86_64 GNU/Linux
thanks a lot!!!!
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-21 23:35 UTC (permalink / raw)
To: mark; +Cc: linux-kernel
On Wed, 21 May 2008 15:51:55 -0700 mark wrote:
> On Wed, May 21, 2008 at 2:32 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> > On Wed, 21 May 2008 14:08:53 -0700 mark wrote:
> >
> >> On Wed, May 21, 2008 at 1:50 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> >> > mark wrote:
> >> >>
> >> >> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@oracle.com>
> >> >> wrote:
> >> >>>
> >> >>> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
> >> >>>>
> >> >>>> I upgraded to 2.6.25.3-18.fc9.x86_64 fedora core 9, now I get this
> >> >>>> error when I try to login to the box, kill a pr start a python app, or
> >> >>>> do anything on a regular basis.
> >> >>>>
> >> >>>> fork: Resource temporarily unavailable
[snip]
> >> >>> The only place that fork() returns EAGAIN is for number of
> >> >>> processes being >= its limit. Does this user already have >= 1024
> >> >>> processes?
> >> >>
> >> >> No, it is around 400
> >> >
> >> > Well, my comment was wrong anyway. There are several other tests just
> >> > below number of user processes that also return EAGAIN, like:
> >> >
> >> > - total number of threads being too large
> >
> > Total number of threads currently running is in /proc/loadavg:
> >
> >> cat /proc/loadavg
> > 1.56 0.58 0.27 2/203 28500
> >
> > It's the number following the '/', e.g., 203 on my desktop system.
> >
> > max_threads allowed is a sysctl, so you can tune it if needed.
> > It's in /proc/sys/kernel/threads-max:
> >
> >> cat /proc/sys/kernel/threads-max
> > 32624
> > I sort of doubt that one is the problem, but you can tell us.
>
> cat /proc/loadavg
> 0.39 0.45 0.57 1/1412 12032
> cat /proc/sys/kernel/threads-max
> 274432
> you are right, i guess this is not the problem.
>
>
> >> > - error on grabbing a module reference count (?)
> >> > - error on grabbing a binfmt module reference
> >>
> >> as a user how do i identify what is wrong, and fix this? for total
> >> number of threads -> is there anyway i can find out if this is causing
> >> the problem? my system is running around 80 multi-threaded python web
> >> apps.
> >
> > I can send you some debug patches that will print out the specific
> > problem area. Do you want to do that? Can you rebuild and install
> > a new kernel?
> Is it possible to get this debug messages by turning on some flags?
> If not yes, pl. send debug patches. its a live box and I will try to do it!
>
> This is my system / kernel info:
> uname -a
> Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
> x86_64 x86_64 x86_64 GNU/Linux
I made a small patch to a vanilla kernel.org 2.6.25.3 kernel tree.
Hopefully it applies cleanly to that fc9 kernel source, but check/verify
that first before going any further.
After building and booting with this patch, there will be kernel
messages whenever fork's "copy_process" function fails with -EAGAIN (-11),
which is reported to userspace as errno = 11 (Resource temporarily
unavailable). Hopefully this will identify which test is failing,
but there's a chance that something else is going on and that this
patch does not find the problem.
Anyway, good luck and please report back on it.
---
---
 kernel/fork.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
--- linux-2.6.25.3.orig/kernel/fork.c
+++ linux-2.6.25.3/kernel/fork.c
@@ -1049,8 +1049,10 @@ static struct task_struct *copy_process(
 	if (atomic_read(&p->user->processes) >=
 			p->signal->rlim[RLIMIT_NPROC].rlim_cur) {
 		if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RESOURCE) &&
-		    p->user != current->nsproxy->user_ns->root_user)
+		    p->user != current->nsproxy->user_ns->root_user) {
+			printk(KERN_INFO "%s: error on #processes\n", __func__);
 			goto bad_fork_free;
+		}
 	}
 
 	atomic_inc(&p->user->__count);
@@ -1062,14 +1064,20 @@ static struct task_struct *copy_process(
 	 * triggers too late. This doesn't hurt, the check is only there
 	 * to stop root fork bombs.
 	 */
-	if (nr_threads >= max_threads)
+	if (nr_threads >= max_threads) {
+		printk(KERN_INFO "%s: error on #threads\n", __func__);
 		goto bad_fork_cleanup_count;
+	}
 
-	if (!try_module_get(task_thread_info(p)->exec_domain->module))
+	if (!try_module_get(task_thread_info(p)->exec_domain->module)) {
+		printk(KERN_INFO "%s: error on exec_domain->module\n", __func__);
 		goto bad_fork_cleanup_count;
+	}
 
-	if (p->binfmt && !try_module_get(p->binfmt->module))
+	if (p->binfmt && !try_module_get(p->binfmt->module)) {
+		printk(KERN_INFO "%s: error on binfmt->module\n", __func__);
 		goto bad_fork_cleanup_put_domain;
+	}
 
 	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
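Once a kernel carrying the patch above is running, the new messages can be picked out of the kernel log and mapped back to the check that failed. This is a sketch under one assumption: that `__func__` expands to `copy_process`, as the hunk headers suggest. The function name and the descriptive labels are invented for illustration.

```shell
# Map one kernel log line produced by the debug patch to the check
# in copy_process() that failed. Lines from other sources map to "unrelated".
classify_fork_failure() {
    case "$1" in
        *"copy_process: error on #processes"*)
            echo "per-user process limit" ;;
        *"copy_process: error on #threads"*)
            echo "system-wide thread limit" ;;
        *"copy_process: error on exec_domain->module"*)
            echo "exec_domain module reference" ;;
        *"copy_process: error on binfmt->module"*)
            echo "binfmt module reference" ;;
        *)
            echo "unrelated" ;;
    esac
}

# On the patched kernel:
# dmesg | grep 'copy_process: error on' | while IFS= read -r line; do
#     classify_fork_failure "$line"
# done
```

If fork keeps failing with errno 11 and none of these messages appear, the failure is probably coming from a path the patch does not instrument.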
* Re: fork: Resource temporarily unavailable / cant start new threads
From: mark @ 2008-05-22 0:09 UTC (permalink / raw)
To: Randy Dunlap; +Cc: linux-kernel
On Wed, May 21, 2008 at 4:35 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>> > I can send you some debug patches that will print out the specific
>> > problem area. Do you want to do that? Can you rebuild and install
>> > a new kernel?
>> Is it possible to get this debug messages by turning on some flags?
>> If not yes, pl. send debug patches. its a live box and I will try to do it!
>>
>> This is my system / kernel info:
>> uname -a
>> Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
>> x86_64 x86_64 x86_64 GNU/Linux
>
> I made a small patch to a vanilla kernel.org 2.6.25.3 kernel tree.
> Hopefully it applies cleanly to that fc9 kernel source, but check/verify
> that first before going any further.
Thanks a lot for the patch,
This is kind of weird.. but there is no file kernel/fork.c
[mark@localhost 2.6.25.3-18.fc9.i686]$ find . -iname '*fork*' -print
[mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql
[mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql kernel-devel | grep fork
[mark@localhost 2.6.25.3-18.fc9.i686]$ pwd
/usr/src/kernels/2.6.25.3-18.fc9.i686
* Re: fork: Resource temporarily unavailable / cant start new threads
From: Randy Dunlap @ 2008-05-22 0:29 UTC (permalink / raw)
To: mark; +Cc: linux-kernel
mark wrote:
> On Wed, May 21, 2008 at 4:35 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>>>> I can send you some debug patches that will print out the specific
>>>> problem area. Do you want to do that? Can you rebuild and install
>>>> a new kernel?
>>> Is it possible to get this debug messages by turning on some flags?
>>> If not yes, pl. send debug patches. its a live box and I will try to do it!
>>>
>>> This is my system / kernel info:
>>> uname -a
>>> Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
>>> x86_64 x86_64 x86_64 GNU/Linux
>> I made a small patch to a vanilla kernel.org 2.6.25.3 kernel tree.
>> Hopefully it applies cleanly to that fc9 kernel source, but check/verify
>> that first before going any further.
>
> Thanks a lot for the patch,
> This is kind of weird.. but there is no file kernel/fork.c
>
> [mark@localhost 2.6.25.3-18.fc9.i686]$ find . -iname '*fork*' -print
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql kernel-devel | grep fork
> [mark@localhost 2.6.25.3-18.fc9.i686]$ pwd
> /usr/src/kernels/2.6.25.3-18.fc9.i686
That's not a kernel source tree.
I'm no expert on fc nor on src.rpm's, but I think that you need to get
the fc kernel-2.6.25.3-18.src.rpm file (or something like that).
Or use a plain vanilla kernel.org 2.6.25.3 kernel tree.
--
~Randy
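A quick way to tell the two kinds of directory apart, and to check that a patch applies before touching anything, might look like the sketch below. The function names are hypothetical, and `--dry-run` is assumed to be available (it is a GNU patch option).

```shell
# A full kernel source tree contains kernel/fork.c; the directory under
# /usr/src/kernels shown above evidently does not.
is_kernel_source_tree() {
    [ -f "$1/kernel/fork.c" ]
}

# Dry-run a -p1 patch against a tree without modifying any files.
# $1 = top of the source tree, $2 = absolute path to the patch file
# (absolute because the subshell changes directory first).
patch_applies_cleanly() {
    ( cd "$1" && patch -p1 --dry-run < "$2" )
}

# Example, with placeholder paths:
# is_kernel_source_tree /path/to/linux-2.6.25.3 &&
#     patch_applies_cleanly /path/to/linux-2.6.25.3 /path/to/fork-debug.patch
```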
* Re: fork: Resource temporarily unavailable / cant start new threads
From: するくめ @ 2008-05-22 7:11 UTC (permalink / raw)
To: linux-kernel
Take a look at this guide to install the kernel source on fedora 9
http://www.mjmwired.net/resources/mjm-fedora-f9.html#kernelsrc
2008/5/22 mark <markkicks@gmail.com>:
> On Wed, May 21, 2008 at 4:35 PM, Randy Dunlap <randy.dunlap@oracle.com> wrote:
>>> > I can send you some debug patches that will print out the specific
>>> > problem area. Do you want to do that? Can you rebuild and install
>>> > a new kernel?
>>> Is it possible to get this debug messages by turning on some flags?
>>> If not yes, pl. send debug patches. its a live box and I will try to do it!
>>>
>>> This is my system / kernel info:
>>> uname -a
>>> Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
>>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> I made a small patch to a vanilla kernel.org 2.6.25.3 kernel tree.
>> Hopefully it applies cleanly to that fc9 kernel source, but check/verify
>> that first before going any further.
>
> Thanks a lot for the patch,
> This is kind of weird.. but there is no file kernel/fork.c
>
> [mark@localhost 2.6.25.3-18.fc9.i686]$ find . -iname '*fork*' -print
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql
> [mark@localhost 2.6.25.3-18.fc9.i686]$ rpm -ql kernel-devel | grep fork
> [mark@localhost 2.6.25.3-18.fc9.i686]$ pwd
> /usr/src/kernels/2.6.25.3-18.fc9.i686
--
するくめ