public inbox for linux-kernel@vger.kernel.org
* [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
From: Harald Laabs @ 2011-07-19 20:03 UTC
  To: linux-kernel

Hi,
Reloading Apache httpd has been able to crash the kernel since 2.6.35.
It seems that tasks are removed between building the task list and
calling wake_up_sem_queue_do() in freeary(). The task_struct pointers
end up in try_to_wake_up() and are sometimes NULL (0x0) there.
The problem did not exist in 2.6.34. It does not show up on
single-processor systems. Depending on the Apache httpd settings, it
takes only a few tries to kill the system on our 8-core servers. A
dual-core machine would not crash; maybe it really needs more than one
real CPU. Various gcc versions (4.1 to 4.6) were used.

If anyone wants to crash a system using a prefork Apache httpd:
<IfModule mpm_prefork_module>
        ServerLimit             512
        StartServers             50
        MinSpareServers          50
        MaxSpareServers         100
        MaxClients              200
        MaxRequestsPerChild     500
</IfModule>
(The details do not seem to matter, but some settings did not crash as fast.)

I'm not able to fix or understand this bug myself; it's already in
Bugzilla with the call trace:
https://bugzilla.kernel.org/show_bug.cgi?id=27142

Is there any more useful information I can provide? Anything to test?
Does anyone know of changes from 2.6.34 to 2.6.35 that might have
broken this? (The diff and the changelog do not enlighten me; too
much changed, and I understand little of it.)

Thanks,
Harald

* Re: [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
From: Eric Dumazet @ 2011-07-19 21:14 UTC
  To: Harald Laabs; +Cc: linux-kernel, Manfred Spraul, Andrew Morton

On Tuesday, 19 July 2011 at 22:03 +0200, Harald Laabs wrote:
> Hi,
> reloading an apache httpd can crash the kernel since 2.6.35.
> It seems that tasks are removed between creating the task-list and
> calling wake_up_sem_queue_do in freeary. The pointers to the
> task_struct elements end up in try_to_wake_up and sometimes contain
> 0x0 there.
> The problem did not exist in 2.6.34. It does not show up on single
> processor systems. Depending on the apache httpd settings it only
> takes a few tries to kill the system on our 8-core servers. Dualcore
> did not want to crash, maybe it really needs more than one real CPU.
> Various gcc versions (4.1 to 4.6) were used.
> 
> If anyone wants to crash a system using a prefork Apache httpd:
> <IfModule mpm_prefork_module>
>         ServerLimit             512
>         StartServers             50
>         MinSpareServers          50
>         MaxSpareServers         100
>         MaxClients              200
>         MaxRequestsPerChild     500
> </IfModule>
> (Details do not seem to matter but some settings did not die fast.)
> 
> I'm not able to fix or understand this bug myself; it's already in
> Bugzilla with the call trace:
> https://bugzilla.kernel.org/show_bug.cgi?id=27142
> 
> Is there any more useful information I can provide? Anything to test?
> Does anyone know of changes from 2.6.34 to 2.6.35 that might have
> broken this? (The diff and the changelog do not enlighten me, too
> much changed and I understand little of it.)

I suspect commit 0a2b9d4c79671b059568
("ipc/sem.c: move wake_up_process out of the spinlock section")
might be the origin of the bug.

CCing Manfred and Andrew.

* Re: [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
From: Manfred Spraul @ 2011-07-20 18:11 UTC
  To: Eric Dumazet; +Cc: Harald Laabs, linux-kernel, Andrew Morton

On 07/19/2011 11:14 PM, Eric Dumazet wrote:
> On Tuesday, 19 July 2011 at 22:03 +0200, Harald Laabs wrote:
>> Hi,
>>
>> I'm not able to fix or understand this bug myself; it's already in
>> Bugzilla with the call trace:
>> https://bugzilla.kernel.org/show_bug.cgi?id=27142
>>
>> Is there any more useful information I can provide? Anything to test?
Could you build a kernel with CONFIG_DEBUG_LIST enabled?
Does it report anything?
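
For reference, a sketch of the corresponding .config fragment, assuming
a tree where the option is offered under the kernel debugging menu
(CONFIG_DEBUG_LIST depends on CONFIG_DEBUG_KERNEL). With it enabled, the
list helpers sanity-check the neighbouring pointers and log
"list_add corruption" or "list_del corruption" warnings when they find
an inconsistency:

```
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_LIST=y
```
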
>> Does anyone know of changes from 2.6.34 to 2.6.35 that might have
>> broken this? (The diff and the changelog do not enlighten me, too
>> much changed and I understand little of it.)
> I feel commit 0a2b9d4c79671b059568 might be the bug origin
> (ipc/sem.c: move wake_up_process out of the spinlock section)
>
I'll try to reproduce the bug tomorrow.
Perhaps it is a race with multiple processes sleeping, some or all of
them woken up by a signal, and a concurrent IPC_RMID.

But I don't see the bug yet.
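
A minimal user-space sketch of the scenario guessed at above, in case it
helps with reproducing: several processes sleep in semop() on one
System V semaphore set, half of them are hit by a signal, and the set is
removed right afterwards. This is an assumption about the trigger, not a
confirmed reproducer. The helper names (run_round, sleeper, on_sigusr1),
the 50 ms settle delay and the choice of SIGUSR1 are all hypothetical.
It also relies on Linux zero-initialising a new semaphore set, so the
-1 operation blocks.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_SLEEPERS 64

/* Empty handler: its only job is to make SIGUSR1 interrupt semop(). */
static void on_sigusr1(int sig) { (void)sig; }

/* Child body: block on the semaphore, then report how the sleep ended. */
static void sleeper(int semid)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };

    if (semop(semid, &op, 1) == 0)
        _exit(2);               /* unexpected: nobody ever posts */
    if (errno == EINTR)
        _exit(1);               /* woken by the signal */
    if (errno == EIDRM || errno == EINVAL)
        _exit(0);               /* set removed during (or before) the sleep */
    _exit(2);
}

/*
 * One round of "sleepers + signal + IPC_RMID".
 * Returns the number of children that ended via EINTR or EIDRM/EINVAL,
 * or -1 if the signal handler or the semaphore set could not be set up.
 */
int run_round(int nsleepers)
{
    pid_t pids[MAX_SLEEPERS];
    struct sigaction sa = {0};
    struct timespec settle = { 0, 50 * 1000 * 1000 };   /* 50 ms */
    int semid, i, n, ok = 0;

    assert(nsleepers > 0 && nsleepers <= MAX_SLEEPERS);

    /* Install the handler before fork() so every child inherits it. */
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    n = nsleepers;
    for (i = 0; i < nsleepers; i++) {
        pids[i] = fork();
        if (pids[i] == 0)
            sleeper(semid);     /* never returns */
        if (pids[i] == -1) {
            n = i;              /* fork failed: continue with fewer children */
            break;
        }
    }

    nanosleep(&settle, NULL);   /* give the children time to block */

    /* Signal every second sleeper, then remove the set right away. */
    for (i = 0; i < n; i += 2)
        kill(pids[i], SIGUSR1);
    semctl(semid, 0, IPC_RMID);

    for (i = 0; i < n; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) <= 1)
            ok++;
    }
    return ok;
}
```

Calling run_round(8) in a loop from a small main() should return 8 each
time on a healthy kernel. On an affected kernel a round might oops
instead, so this is only for a test machine.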

--
     Manfred



* Re: [BUG] null-pointer in task_rq_lock (2.6.35 to 3.0-rc7)
From: scream @ 2011-08-15  7:22 UTC
  To: linux-kernel

Harald Laabs <kernelml <at> dasr.de> writes:

> 
> Hi,
> reloading an apache httpd can crash the kernel since 2.6.35.
> It seems that tasks are removed between creating the task-list and
> calling wake_up_sem_queue_do in freeary. The pointers to the
> task_struct elements end up in try_to_wake_up and sometimes contain
> 0x0 there.


We had the same problem in production.

Linux version 2.6.35-22-server (buildd@allspice) (gcc version 4.4.5 
(Ubuntu/Linaro 4.4.4-14ubuntu4) ) #33-Ubuntu SMP Sun Sep 19 20:48:58 UTC 2010 
(Ubuntu 2.6.35-22.33-server 2.6.35.4)

 Apache/2.2.16 (Ubuntu) PHP/5.3.3-1ubuntu9 with Suhosin-Patch mod_ssl/2.2.16 
OpenSSL/0.9.8o

It appeared twice after an apache2 reload during the daily cron jobs.


