* [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
@ 2011-07-19 20:03 Harald Laabs
2011-07-19 21:14 ` Eric Dumazet
2011-08-15 7:22 ` scream
0 siblings, 2 replies; 4+ messages in thread
From: Harald Laabs @ 2011-07-19 20:03 UTC
To: linux-kernel
Hi,
reloading an apache httpd can crash the kernel since 2.6.35.
It seems that tasks are removed between creating the task-list and
calling wake_up_sem_queue_do in freeary. The pointers to the
task_struct elements end up in try_to_wake_up and sometimes contain
0x0 there.
The problem did not exist in 2.6.34. It does not show up on single
processor systems. Depending on the apache httpd settings it only
takes a few tries to kill the system on our 8-core servers. A dual-core
machine did not crash; maybe it really needs more than one real CPU.
Various gcc versions (4.1 to 4.6) were used.
If anyone wants to crash a system using a prefork Apache httpd:
<IfModule mpm_prefork_module>
ServerLimit 512
StartServers 50
MinSpareServers 50
MaxSpareServers 100
MaxClients 200
MaxRequestsPerChild 500
</IfModule>
(The exact values do not seem to matter, but some settings did not
crash as quickly.)
I'm not able to fix or understand this bug myself; it's already in
bugzilla with the call trace:
https://bugzilla.kernel.org/show_bug.cgi?id=27142
Is there any more useful information I can provide? Anything to test?
Does anyone know of changes from 2.6.34 to 2.6.35 that might have
broken this? (The diff and the changelog do not enlighten me; too
much changed and I understand little of it.)
Thanks,
Harald
* Re: [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
2011-07-19 20:03 [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7) Harald Laabs
@ 2011-07-19 21:14 ` Eric Dumazet
2011-07-20 18:11 ` Manfred Spraul
2011-08-15 7:22 ` scream
1 sibling, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2011-07-19 21:14 UTC
To: Harald Laabs; +Cc: linux-kernel, Manfred Spraul, Andrew Morton
On Tuesday, 19 July 2011 at 22:03 +0200, Harald Laabs wrote:
> Hi,
> reloading an apache httpd can crash the kernel since 2.6.35.
> It seems that tasks are removed between creating the task-list and
> calling wake_up_sem_queue_do in freeary. The pointers to the
> task_struct elements end up in try_to_wake_up and sometimes contain
> 0x0 there.
> The problem did not exist in 2.6.34. It does not show up on single
> processor systems. Depending on the apache httpd settings it only
> takes a few tries to kill the system on our 8-core servers. A dual-core
> machine did not crash; maybe it really needs more than one real CPU.
> Various gcc versions (4.1 to 4.6) were used.
>
> If anyone wants to crash a system using a prefork Apache httpd:
> <IfModule mpm_prefork_module>
> ServerLimit 512
> StartServers 50
> MinSpareServers 50
> MaxSpareServers 100
> MaxClients 200
> MaxRequestsPerChild 500
> </IfModule>
> (The exact values do not seem to matter, but some settings did not
> crash as quickly.)
>
> I'm not able to fix or understand this bug myself; it's already in
> bugzilla with the call trace:
> https://bugzilla.kernel.org/show_bug.cgi?id=27142
>
> Is there any more useful information I can provide? Anything to test?
> Does anyone know of changes from 2.6.34 to 2.6.35 that might have
> broken this? (The diff and the changelog do not enlighten me; too
> much changed and I understand little of it.)
I feel commit 0a2b9d4c79671b059568 might be the bug origin
(ipc/sem.c: move wake_up_process out of the spinlock section)
CC Manfred & Andrew
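
For readers unfamiliar with the pattern named in that commit title, here is a
minimal user-space sketch of "collect the sleepers under the lock, wake them
after dropping it". It is an analogy only, not the ipc/sem.c code: struct
waiter, wake_all_deferred() and run_demo() are hypothetical names, and POSIX
threads and semaphores stand in for kernel tasks and wake_up_process().

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define NWAITERS 4

/* One record per sleeper; it lives on the sleeper's own stack. */
struct waiter {
    sem_t wake;            /* posted by the waker */
    struct waiter *next;   /* link in the pending list or the private wake list */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct waiter *pending;   /* protected by lock */
static sem_t queued;             /* counts sleepers that have queued themselves */

static void *waiter_thread(void *unused)
{
    struct waiter self;

    (void)unused;
    sem_init(&self.wake, 0, 0);

    pthread_mutex_lock(&lock);
    self.next = pending;
    pending = &self;
    pthread_mutex_unlock(&lock);
    sem_post(&queued);

    while (sem_wait(&self.wake) != 0)
        ;                        /* retry if interrupted */
    sem_destroy(&self.wake);
    return NULL;                 /* 'self' is gone after this point */
}

/* Phase 1: detach the list under the lock. Phase 2: wake without the lock. */
int wake_all_deferred(void)
{
    struct waiter *list, *w, *next;
    int woken = 0;

    pthread_mutex_lock(&lock);
    list = pending;
    pending = NULL;
    pthread_mutex_unlock(&lock);

    for (w = list; w != NULL; w = next) {
        next = w->next;          /* read before posting: the record may vanish */
        sem_post(&w->wake);
        woken++;
    }
    return woken;
}

/* Starts the sleepers, waits until all are queued, wakes them, joins them. */
int run_demo(void)
{
    pthread_t t[NWAITERS];
    int i, created = 0, woken;

    sem_init(&queued, 0, 0);
    for (i = 0; i < NWAITERS; i++) {
        if (pthread_create(&t[created], NULL, waiter_thread, NULL) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        while (sem_wait(&queued) != 0)
            ;
    woken = wake_all_deferred();
    assert(woken == created);
    for (i = 0; i < created; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&queued);
    return woken;
}
```

Build with something like `cc -pthread` and call run_demo(). The point of the
sketch is the window in phase 2: the waker touches each record after the lock
is released, so a sleeper that leaves on its own in that window would hand the
waker a stale pointer unless it first waits for the waker to finish with it.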
* Re: [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
2011-07-19 21:14 ` Eric Dumazet
@ 2011-07-20 18:11 ` Manfred Spraul
0 siblings, 0 replies; 4+ messages in thread
From: Manfred Spraul @ 2011-07-20 18:11 UTC
To: Eric Dumazet; +Cc: Harald Laabs, linux-kernel, Andrew Morton
On 07/19/2011 11:14 PM, Eric Dumazet wrote:
> On Tuesday, 19 July 2011 at 22:03 +0200, Harald Laabs wrote:
>> Hi,
>>
>> I'm not able to fix or understand this bug myself; it's already in
>> bugzilla with the call trace:
>> https://bugzilla.kernel.org/show_bug.cgi?id=27142
>>
>> Is there any more useful information I can provide? Anything to test?
Could you build a kernel with CONFIG_DEBUG_LIST enabled?
Does it report anything?
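
As background for that request: with CONFIG_DEBUG_LIST the kernel's list
helpers verify that neighbouring entries still point at each other before
linking or unlinking and warn if they do not. The following is a rough
user-space sketch of that kind of check, not the kernel's implementation;
struct list_node, list_add_checked() and list_del_checked() are hypothetical
names, and the NULL poisoning is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

/* Insert 'entry' right after 'head', refusing if the list looks corrupted. */
static bool list_add_checked(struct list_node *entry, struct list_node *head)
{
    struct list_node *next;

    assert(entry != NULL && head != NULL);
    next = head->next;
    if (next == NULL || next->prev != head)
        return false;            /* neighbours disagree: corruption */

    entry->next = next;
    entry->prev = head;
    next->prev = entry;
    head->next = entry;
    return true;
}

/* Unlink 'entry', refusing if its neighbours no longer point back at it. */
static bool list_del_checked(struct list_node *entry)
{
    assert(entry != NULL);
    if (entry->prev == NULL || entry->next == NULL)
        return false;            /* already deleted (poisoned below) */
    if (entry->prev->next != entry || entry->next->prev != entry)
        return false;            /* stale or corrupted entry */

    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;   /* poison so a second delete is caught */
    return true;
}
```

A double delete, or a delete of an entry whose neighbours have been reused,
is reported instead of silently corrupting the list, which is the kind of
report being asked for here.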
>> Does anyone know of changes from 2.6.34 to 2.6.35 that might have
>> broken this? (The diff and the changelog do not enlighten me; too
>> much changed and I understand little of it.)
> I feel commit 0a2b9d4c79671b059568 might be the bug origin
> (ipc/sem.c: move wake_up_process out of the spinlock section)
>
I'll try to reproduce the bug tomorrow.
Perhaps a race with multiple processes sleeping, some/all woken up by a
signal and a concurrent IPC_RMID.
But I don't see the bug yet.
--
Manfred
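
The scenario suspected above (several processes sleeping on a semaphore, some
interrupted by a signal while the set is removed) could be exercised from user
space roughly as follows. This is a sketch under assumptions, not a confirmed
reproducer: stress_round(), the use of SIGUSR1, the 10 ms settle delay and the
limit of 64 children are all made up for illustration.

```c
#define _XOPEN_SOURCE 700   /* sigaction(), nanosleep(), SysV IPC */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_PROCS 64

/* On Linux the caller has to define union semun for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

static void on_sigusr1(int sig) { (void)sig; }   /* only interrupts semop() */

/*
 * Fork 'nprocs' children that sleep in semop(), signal every second one,
 * then remove the set. Returns the number of children that exited cleanly,
 * or -1 if the setup failed. Leaves a SIGUSR1 handler installed.
 */
int stress_round(int nprocs)
{
    pid_t pids[MAX_PROCS];
    struct sigaction sa;
    struct timespec settle = { 0, 10 * 1000 * 1000 };
    union semun arg;
    int semid, i, reaped = 0;

    if (nprocs < 1 || nprocs > MAX_PROCS)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;
    arg.val = 0;                         /* a decrement by 1 will block */
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    for (i = 0; i < nprocs; i++) {
        pids[i] = fork();
        if (pids[i] < 0) {               /* continue with what we have */
            nprocs = i;
            break;
        }
        if (pids[i] == 0) {
            struct sembuf op = { 0, -1, 0 };
            semop(semid, &op, 1);        /* ends with EINTR, EIDRM or EINVAL */
            _exit(0);
        }
    }

    nanosleep(&settle, NULL);            /* let the children go to sleep */
    for (i = 0; i < nprocs; i += 2)
        kill(pids[i], SIGUSR1);          /* wake some by signal ... */
    semctl(semid, 0, IPC_RMID);          /* ... and the rest by removal */

    for (i = 0; i < nprocs; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            reaped++;
    }
    assert(reaped <= nprocs);
    return reaped;
}
```

Calling stress_round() in a loop from several processes at once, on a machine
with many cores, would be the obvious way to try it; whether it hits the same
path as the httpd reload is an open question.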
* Re: [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
2011-07-19 20:03 [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7) Harald Laabs
2011-07-19 21:14 ` Eric Dumazet
@ 2011-08-15 7:22 ` scream
1 sibling, 0 replies; 4+ messages in thread
From: scream @ 2011-08-15 7:22 UTC
To: linux-kernel
Harald Laabs <kernelml <at> dasr.de> writes:
>
> Hi,
> reloading an apache httpd can crash the kernel since 2.6.35.
> It seems that tasks are removed between creating the task-list and
> calling wake_up_sem_queue_do in freeary. The pointers to the
> task_struct elements end up in try_to_wake_up and sometimes contain
> 0x0 there.
Had the same in production.
Linux version 2.6.35-22-server (buildd@allspice) (gcc version 4.4.5
(Ubuntu/Linaro 4.4.4-14ubuntu4) ) #33-Ubuntu SMP Sun Sep 19 20:48:58 UTC 2010
(Ubuntu 2.6.35-22.33-server 2.6.35.4)
Apache/2.2.16 (Ubuntu) PHP/5.3.3-1ubuntu9 with Suhosin-Patch mod_ssl/2.2.16
OpenSSL/0.9.8o
It appeared twice after an apache2 reload during the daily cron jobs.