linuxppc-dev.lists.ozlabs.org archive mirror
* frequent sig 11 with malloc() on mpc8xx
@ 2006-05-04  7:48 Gautam Borad
  2006-05-04 14:46 ` Wolfgang Denk
  0 siblings, 1 reply; 7+ messages in thread
From: Gautam Borad @ 2006-05-04  7:48 UTC
  To: linuxppc-embedded

We are having a frequent sig 11 problem on our custom MPC852T board
with Linux kernel 2.6.14 and U-Boot version 1.1.3.
We have 32MB SDRAM.
I've written a test program that mallocs 10k chunks and then zeroes
out each area using bzero(). This is repeated 1000 times.
The program crashes with a sig 11.
Given below is the dump of the crash:

$ free
              total         used         free       shared      buffers
Mem:          29988         3040        26948            0            0
Swap:             0            0            0
Total:        29988         3040        26948

$ ./malloctest 10
i=0  malloc'ed : 10k  at 0x10012010
i=1  malloc'ed : 10k  at 0x10014818
i=2  malloc'ed : 10k  at 0x10017020
........
i=222  malloc'ed : 10k  at 0x1023d700
i=223  malloc'ed : 10k  at 0x1023ff08
i=224  malloc'ed : 10Oops: kernel access of bad area, sig: 11 [#1]
NIP: C005AC48 LR: C005B158 SP: C1DB9EC0 REGS: c1db9e10 TRAP: 0300    Not tainted
MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 000000C8, DSISR: C0000000
TASK = c1d71bb0[651] 'malloctest' THREAD: c1db8000
Last syscall: 4
GPR00: C005B158 C1DB9EC0 C1D71BB0 00000001 00000000 C1DB9F20 00000003 00000000
GPR08: 00000000 C1C34468 00000003 00000000 00000003 2EEDBEFB 01FFF000 007FFF40
GPR16: 00000000 00000001 FFFFFFFF 7FB1BAA0 00000000 10068FDC 7FB1BAB8 00000000
GPR24: 10000694 10000A48 7FC4EB30 C1DB9F20 30096288 00000003 C030EE88 00000000
NIP [c005ac48] rw_verify_area+0x50/0xbc
LR [c005b158] vfs_write+0x94/0x1a0
Call trace:
 [c005b158] vfs_write+0x94/0x1a0
 [c005b348] sys_write+0x50/0x94
 [c0002b90] ret_from_syscall+0x0/0x44
k  at 0x10242710
i=225  malloc'ed : 10k  at 0x10244f18
i=226  malloc'ed : 10k  at 0x10247720
i=227  malloc'ed : 10k  at 0x102Oops: kernel access of bad area, sig: 11 [#2]
NIP: C004E54C LR: C004E614 SP: C1DB9CF0 REGS: c1db9c40 TRAP: 0300    Not tainted
MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 000001A0, DSISR: C0000000
TASK = c1d71bb0[651] 'malloctest' THREAD: c1db8000
Last syscall: 4
GPR00: 000001A0 C1DB9CF0 C1D71BB0 C0228BAC C030E348 C022AF04 C1DA4174 00000000
GPR08: 00000000 00000000 C0228BAC C1C34CBC 80004022 2EEDBEFB 01FFF000 007FFF40
GPR16: 00000000 00000001 FFFFFFFF 7FB1BAA0 00000000 10068FDC 7FB1BAB8 00000000
GPR24: 10000694 10000A48 7FC4EB30 0000000B C022AF34 C022AF04 C030E348 C0228BAC
NIP [c004e54c] __remove_shared_vm_struct+0x28/0x94
LR [c004e614] remove_vm_struct+0x5c/0xd0
Call trace:
 [c004e614] remove_vm_struct+0x5c/0xd0
 [c0050adc] exit_mmap+0x11c/0x148
 [c000f9b8] mmput+0x54/0xd0
 [c00141cc] exit_mm+0x190/0x1f0
 [c0014b40] do_exit+0xec/0x3c8
 [c00035b0] _exception+0x0/0xc8
 [c000a47c] bad_page_fault+0x5c/0x60
 [c00030e0] handle_page_fault+0x7c/0x80
 [c022fa68] sysfs_init+0x34/0xd4
 [c005b158] vfs_write+0x94/0x1a0
 [c005b348] sys_write+0x50/0x94
 [c0002b90] ret_from_syscall+0x0/0x44

However, if I call free() after bzero(), I don't get the sig 11.
I had the same problem with the 2.4 kernel and, after posting the problem
here, was asked to move to the 2.6 kernel.
I've done so, but the problem persists.
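
The malloctest source itself was not posted. A minimal sketch of the test as
described above might look like the following. The helper name
run_malloc_test(), its parameters and the exact output format are assumptions
modelled on the log, not the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>   /* bzero() */

/*
 * Hypothetical reconstruction, not the original malloctest source.
 * Allocates `count` chunks of `kbytes` KiB, zeroes each one, and frees it
 * again only if `free_after` is non-zero.
 * Returns the number of chunks that were allocated successfully.
 */
int run_malloc_test(size_t kbytes, int count, int free_after)
{
    int i;

    assert(kbytes > 0);
    for (i = 0; i < count; i++) {
        char *p = malloc(kbytes * 1024);

        if (p == NULL) {
            fprintf(stderr, "i=%d  malloc failed\n", i);
            break;
        }
        printf("i=%d  malloc'ed : %luk  at %p\n",
               i, (unsigned long)kbytes, (void *)p);
        bzero(p, kbytes * 1024);   /* write to every byte of the chunk */
        if (free_after)
            free(p);               /* variant reported not to crash */
    }
    return i;
}
```

Calling run_malloc_test(10, 1000, 0) from main() corresponds to the crashing
case (`./malloctest 10`); a non-zero free_after gives the variant with free()
that does not produce the sig 11.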

Thanking in advance.


* Re: frequent sig 11 with malloc() on mpc8xx
  2006-05-04  7:48 frequent sig 11 with malloc() on mpc8xx Gautam Borad
@ 2006-05-04 14:46 ` Wolfgang Denk
  2006-05-04 15:46   ` David Jander
  2006-05-05  8:43   ` Gautam Borad
  0 siblings, 2 replies; 7+ messages in thread
From: Wolfgang Denk @ 2006-05-04 14:46 UTC
  To: Gautam Borad; +Cc: linuxppc-embedded

In message <4459B1CF.60909@eisodus.com> you wrote:
> We are having a frequent sig 11 problem on our custom mpc852t board
> with linux kernel 2.6.14 and U-boot version 1.1.3

That's a FAQ.

> I had the same problem with 2.4 kernel and after posting the problem 

This confirms that the FAQ matches your problem. See
http://www.denx.de/wiki/view/DULG/LinuxCrashesRandomly

> here, was asked to move to 2.6 kernel.

Ummm... I have yet to see a single case where moving to 2.6  improved
the stability for a MPC8xx system :-(

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
News is what a chap who doesn't care much  about  anything  wants  to
read. And it's only news until he's read it. After that it's dead.
                           - Evelyn Waugh _Scoop_ (1938) bk. 1, ch. 5


* Re: frequent sig 11 with malloc() on mpc8xx
  2006-05-04 14:46 ` Wolfgang Denk
@ 2006-05-04 15:46   ` David Jander
  2006-05-05  8:43   ` Gautam Borad
  1 sibling, 0 replies; 7+ messages in thread
From: David Jander @ 2006-05-04 15:46 UTC
  To: linuxppc-embedded

On Thursday 04 May 2006 16:46, Wolfgang Denk wrote:
> Ummm... I have yet to see a single case where moving to 2.6  improved
> the stability for a MPC8xx system :-(

Our case.

Jffs2's gc thread stopped crashing. This might have more to do with mtd/jffs2 
than with the rest of the kernel, but it sure works better now.

Nevertheless I am very interested in hearing about (potential) stability
problems with 2.6.14, 15 and 16 (besides not booting just because
Marcelo couldn't resist screwing up at the last minute; a problem which is
being worked on ;-).

Greetings,

-- 
David Jander


* Re: frequent sig 11 with malloc() on mpc8xx
  2006-05-04 14:46 ` Wolfgang Denk
  2006-05-04 15:46   ` David Jander
@ 2006-05-05  8:43   ` Gautam Borad
  2006-05-05  9:40     ` Wolfgang Denk
  1 sibling, 1 reply; 7+ messages in thread
From: Gautam Borad @ 2006-05-05  8:43 UTC
  To: linuxppc-embedded

Wolfgang Denk wrote:

>In message <4459B1CF.60909@eisodus.com> you wrote:
>
>>We are having a frequent sig 11 problem on our custom mpc852t board
>>with linux kernel 2.6.14 and U-boot version 1.1.3
>
>That's a FAQ.
>
>>I had the same problem with 2.4 kernel and after posting the problem
>
>This confirms that the FAQ matches your problem. See
>http://www.denx.de/wiki/view/DULG/LinuxCrashesRandomly
>
Thanks for the reply. We have checked the CPU SDRAM settings and will
re-check the SDRAM initialization sequence.
However, the problem we face is the following:
The sig 11 is generated specifically when memory is accessed in the
range 0x00000024 - 0x000000C8 (i.e. the low address range).
AFAIK this is assigned to the kernel area.
We have a printk in arch/ppc/mm/fault.c which shows the frequent page
faults and the recovery from them; however, as soon as the DAR is loaded
with 0x00000024 or a similarly low address, we get a sig 11.
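
A small fault address such as 0xC8 taken in kernel mode is often (though not
always) what a structure field access through a NULL pointer looks like: the
DAR then equals the offset of the field. The sketch below only illustrates
that arithmetic; struct example_object and faulting_address() are hypothetical
and not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure, padded so that `field` sits at offset 0xC8. */
struct example_object {
    unsigned char pad[0xC8];
    uint32_t field;
};

static_assert(offsetof(struct example_object, field) == 0xC8,
              "field must sit at offset 0xC8 for this illustration");

/*
 * Address that a load of base->field would touch.
 * Nothing is dereferenced here; only the address is computed.
 */
uintptr_t faulting_address(uintptr_t base)
{
    return base + offsetof(struct example_object, field);
}
```

With base equal to 0 (a NULL pointer) the computed address is 0xC8, the DAR
value of the first oops in this thread. Whether that is what happened in
rw_verify_area() cannot be concluded from the dump alone.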

Bad emulation malloctest/657
 NIP: 30000c10 instruction: 00000000 opcode: 0 A: 0 B: 0 C: 0 code: 0 rc: 0
 pte @ 0x30000c10:  (0xc1d3b300)->(0xc020f000)->0x01c2b889
 RPN: 01c2b PP: 2 SPS: 1 SH: 0 CI: 0 v: 1
Kernel VA for NIP c1c2bc10  pte @ 0xc1c2bc10: no pmd
Oops: kernel access of bad area, sig: 11 [#1]
NIP: C00286C8 LR: C0186684 SP: C02CDCA0 REGS: c02cdbf0 TRAP: 0300    Not tainted
MSR: 00001032 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 00000000, DSISR: C2000000    <======== here the DAR is 0x00000000
TASK = c1d0e070[657] 'malloctest' THREAD: c02cc000
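
The flag fields on the MSR line of an oops are just individual bits of the
MSR value. A minimal sketch of that decoding, assuming the classic 32-bit
PowerPC bit positions (the macro and function names below are chosen for
illustration), is:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Classic 32-bit PowerPC MSR bit masks. */
#define MSR_EE 0x8000u  /* external interrupts enabled */
#define MSR_PR 0x4000u  /* problem state (user mode) */
#define MSR_FP 0x2000u  /* floating point available */
#define MSR_ME 0x1000u  /* machine check enabled */
#define MSR_IR 0x0020u  /* instruction address translation on */
#define MSR_DR 0x0010u  /* data address translation on */

/* Returns 1 if the bit selected by `mask` is set in `msr`, else 0. */
int msr_bit(uint32_t msr, uint32_t mask)
{
    return (msr & mask) != 0;
}

/* Formats `msr` in the same layout as the MSR line of the oops output. */
int format_msr(uint32_t msr, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "MSR: %08X EE: %d PR: %d FP: %d ME: %d IR/DR: %d%d",
                    (unsigned)msr,
                    msr_bit(msr, MSR_EE), msr_bit(msr, MSR_PR),
                    msr_bit(msr, MSR_FP), msr_bit(msr, MSR_ME),
                    msr_bit(msr, MSR_IR), msr_bit(msr, MSR_DR));
}
```

Decoded this way, 0x00001032 has EE clear, whereas 0x00009032 from the first
report has EE set; PR is 0 in both, i.e. both faults were taken in kernel
mode.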

We have tested the SDRAM in both U-Boot (mtest) and Linux, and the tests
don't show anything wrong with the SDRAM.

thanks in advance.


* Re: frequent sig 11 with malloc() on mpc8xx
  2006-05-05  8:43   ` Gautam Borad
@ 2006-05-05  9:40     ` Wolfgang Denk
  2006-05-05 14:00       ` Gautam Borad
  0 siblings, 1 reply; 7+ messages in thread
From: Wolfgang Denk @ 2006-05-05  9:40 UTC
  To: Gautam Borad; +Cc: linuxppc-embedded

In message <445B1019.9010103@eisodus.com> you wrote:
> 
...
> We have tested the SDRAM in both U-boot (mtest) and linux, and the tests 
> doesnt show anything
> wrong with the SDRAM.

No, of course not. Please read the FAQ to understand why standard RAM
tests will never detect this type of problem. 

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
Here is an Appalachian version of management's answer  to  those  who
are  concerned  with  the fate of the project: "Don't worry about the
mule. Just load the wagon."         - Mike Dennison's hillbilly uncle


* Re: frequent sig 11 with malloc() on mpc8xx
  2006-05-05  9:40     ` Wolfgang Denk
@ 2006-05-05 14:00       ` Gautam Borad
  2006-05-05 14:17         ` Mark Chambers
  0 siblings, 1 reply; 7+ messages in thread
From: Gautam Borad @ 2006-05-05 14:00 UTC
  To: linuxppc-embedded

Wolfgang Denk wrote:

>No, of course not. Please read the FAQ to understand why standard RAM
>tests will never detect this type of problem. 
>
>Best regards,
>
>Wolfgang Denk
>
>  
>
Thanks for the reply.
We are aware that it's a FAQ and we rechecked the SDRAM configuration;
everything seems fine.
We disabled burst mode and tried again, but that didn't help. Now we want to
disable the cache and check.
Is the cache disabled from U-Boot or Linux? Where do we have to modify
the code to disable the cache completely? Basically, we want to run Linux
without using the cache.

regards,
gautam.


* Re: frequent sig 11 with malloc() on mpc8xx
  2006-05-05 14:00       ` Gautam Borad
@ 2006-05-05 14:17         ` Mark Chambers
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Chambers @ 2006-05-05 14:17 UTC
  To: Gautam Borad, linuxppc-embedded

> We are aware that its a FAQ and we rechecked the SDRAM configuration,
> everything seems fine.
> We disabled burst mode and tried but that didnt help. Now we want to
> disable cache and check.

Another thing you can try on the 852 is to change the processor frequency
via the PLPRCR register.  If slowing down the clock helps, don't assume
that you have a timing problem though - these PLL circuits are notoriously
twitchy, and it could be noise (or, more properly, resonant frequencies in
your layout).

Mark Chambers

P.S. Hopefully you have a hardware debugger - you can halt the processor
and change this register on the fly.  Same with cache - no sense compiling
these changes, just do them manually. 

