* U-boot with flat device tree
@ 2007-03-14 21:42 Benedict, Michael
2007-03-14 22:46 ` Kim Phillips
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Benedict, Michael @ 2007-03-14 21:42 UTC (permalink / raw)
To: linuxppc-embedded
Hello,
Sorry, I realize this was discussed a month ago: "device tree /
how to build/compile & use with u-boot to boot uImage?" However, I
wasn't on the list at the time. I read through it and I thought I
understood everything.
I am trying to get an mpc8349e-mITX to boot a recent kernel. I
have tried all permutations of:
dtc - development snapshot "dtc-20060419.tar.gz" and recent git
sources
U-boot - 1.2.0, git sources from denx.de, and git sources from freescale
Kernel - 2.6.20.1, 2.6.20.2, and freescale git sources
Every time I boot, U-Boot tells me the kernel uncompressed fine and
then... nothing. I passed initcall_debug to the kernel and didn't get
anything printed to the serial console. Note that if I do not specify
the flat device tree, the kernel progresses up to the point of mounting
root. It can't mount root because it is an NFS root and the ethernet
drivers aren't working:
0:00 not found
eth0: Could not attach to PHY
Just as you would expect (since eth0 is defined in the dts).
Just to be sure, I start by building the device tree, e.g.:
$ dtc -I dts -O dtb -f -V 16 \
linux/arch/powerpc/boot/dts/mpc8349emitx.dts -o mpc8349emitx.dtb
Then, from U-Boot, download the dtb and kernel and boot:
MPC8349E-mITX> tftp 01000000 mpc8349emitx.dtb
Speed: 100, full duplex
Using TSEC0 device
TFTP from server 10.100.10.74; our IP address is 10.100.10.83
Filename 'mpc8349emitx.dtb'.
Load address: 0x1000000
Loading: #
done
Bytes transferred = 3976 (f88 hex)
MPC8349E-mITX> tftp 01001000 uImage
Speed: 100, full duplex
Using TSEC0 device
TFTP from server 10.100.10.74; our IP address is 10.100.10.83
Filename 'uImage'.
Load address: 0x1001000
Loading:
#################################################################
#################################################################
#################################################################
#################################################################
#################################
done
Bytes transferred = 1499510 (16e176 hex)
MPC8349E-mITX> bootm 1001000 - 1000000
## Booting image at 01001000 ...
Image Name: Linux-2.6.21-rc3-gb5d99e64
Created: 2007-03-12 19:28:43 UTC
Image Type: PowerPC Linux Kernel Image (gzip compressed)
Data Size: 1499446 Bytes = 1.4 MB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
Uncompressing Kernel Image ... OK
Booting using flat device tree at 0x1000000
<stuck>
Any ideas?
Thank you!
Michael
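One thing that can be ruled out from the log alone is the two TFTP downloads landing on top of each other. A minimal Python sketch, using only the load addresses and byte counts printed above (the helper names `region` and `overlaps` are made up for illustration):

```python
# Memory layout check for the two TFTP transfers shown in the log.
def region(start, size):
    """Return a half-open (start, end) address range."""
    return (start, start + size)

def overlaps(a, b):
    """True if two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]

dtb = region(0x01000000, 3976)        # 'Bytes transferred = 3976 (f88 hex)'
uimage = region(0x01001000, 1499510)  # 'Bytes transferred = 1499510 (16e176 hex)'

print(f"dtb    : {dtb[0]:#010x}..{dtb[1]:#010x}")
print(f"uImage : {uimage[0]:#010x}..{uimage[1]:#010x}")
print("overlap:", overlaps(dtb, uimage))
```

The blob ends at 0x01000f88, below the uImage at 0x01001000, so the downloads themselves do not collide. The sketch does not check the uncompressed kernel at load address 00000000, because its size is not in the log.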
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: U-boot with flat device tree
2007-03-14 21:42 U-boot with flat device tree Benedict, Michael
@ 2007-03-14 22:46 ` Kim Phillips
[not found] ` <CF7E46FCFF66AD478BB72724345289EC27953F@twx-exch01.twacs.local>
2007-03-15 15:41 ` Exception in kernel mode Charles Krinke
2007-03-15 16:37 ` U-boot with flat device tree Jon Loeliger
2 siblings, 1 reply; 17+ messages in thread
From: Kim Phillips @ 2007-03-14 22:46 UTC (permalink / raw)
To: Benedict, Michael; +Cc: linuxppc-embedded
On Wed, 14 Mar 2007 16:42:49 -0500
"Benedict, Michael" <MBenedict@twacs.com> wrote:
> Uncompressing Kernel Image ... OK
> Booting using flat device tree at 0x1000000
>
>
>
> <stuck>
are you setting console=ttyS0,115200?
Kim
* Exception in kernel mode
2007-03-14 21:42 U-boot with flat device tree Benedict, Michael
2007-03-14 22:46 ` Kim Phillips
@ 2007-03-15 15:41 ` Charles Krinke
2007-03-15 15:52 ` Sergei Shtylyov
2007-03-15 16:37 ` U-boot with flat device tree Jon Loeliger
2 siblings, 1 reply; 17+ messages in thread
From: Charles Krinke @ 2007-03-15 15:41 UTC (permalink / raw)
To: linuxppc-embedded; +Cc: Chris Carlson, Kevin Smith
I have a PPC8241 which does an
Oops: Exception in kernel mode, sig: 4 [#1]
PREEMPT
NIP: 00000900 LR: C00E579C CTR: 00003A55
When doing a 'tar -xvzf'. The NIP is pointing at the decrementer
interrupt, I believe.
As I understand the decrementer, this is basically the timer tick in the
ppc and goes off every n ms continuously.
A tar is going to be concerned with date/time stamps on files, so it has
some interaction with the clock algorithms.
It seems to me that the decrementer should be able to go off at any time
during kernel operation, so maybe the "Exception in kernel mode" is
pointing us in an unusual direction.
So, with that said:
"What might be the causes of such an exception from the decrementer in a
2.6.17.11 ppc8241 kernel?"
"Where should one concentrate ones efforts in figuring this out?"
Charles
* Re: Exception in kernel mode
2007-03-15 15:41 ` Exception in kernel mode Charles Krinke
@ 2007-03-15 15:52 ` Sergei Shtylyov
2007-03-15 18:19 ` Charles Krinke
0 siblings, 1 reply; 17+ messages in thread
From: Sergei Shtylyov @ 2007-03-15 15:52 UTC (permalink / raw)
To: Charles Krinke; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
Hello.
Charles Krinke wrote:
> I have a PPC8241 which does an
> Oops: Exception in kernel mode, sig: 4 [#1]
> PREEMPT
> NIP: 00000900 LR: C00E579C CTR: 00003A55
> When doing a 'tar -xvzf'. The NIP is pointing at the decrementer
> interrupt, I believe.
> As I understand the decrementer, this is basically the timer tick in the
> ppc and goes off every n ms continuously.
> A tar is going to be concerned with date/time stamps on files, so it has
> some interaction with the clock algorithms.
> It seems to me that the decrementer should be able to go off at any time
> during kernel operation, so maybe the "Exception in kernel mode" is
> pointing us in an unusual direction.
Obviously, you've got an exception in the decrementer exception handler
itself -- and this was something like program check exception, judging on the
signal you've got (SIGILL).
> So, with that said:
> "What might be the causes of such an exception from the decrementer in a
> 2.6.17.11 ppc8241 kernel?"
> "Where should one concentrate ones efforts in figuring this out?"
Hrm, looks like some CPU errata maybe...
WBR, Sergei
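For reference, the reading of NIP 00000900 as the decrementer vector, and of SIGILL as a program check, follows from the fixed exception vector offsets of classic PowerPC cores. A small Python sketch of that lookup, assuming the 8241's core uses the usual 6xx-style vector layout (the table is abridged and `vector_name` is an illustrative helper, not a kernel function):

```python
# Exception vector offsets for classic (6xx/603e-class) PowerPC cores.
VECTORS = {
    0x0100: "system reset",
    0x0200: "machine check",
    0x0300: "data storage (DSI)",
    0x0400: "instruction storage (ISI)",
    0x0500: "external interrupt",
    0x0600: "alignment",
    0x0700: "program check",
    0x0800: "floating-point unavailable",
    0x0900: "decrementer",
    0x0C00: "system call",
    0x0D00: "trace",
}

def vector_name(addr):
    """Name the vector whose 0x100-byte slot contains addr, if known."""
    return VECTORS.get(addr & ~0xFF, "unknown")

print(vector_name(0x00000900))  # the NIP from the oops
print(vector_name(0x00000700))  # where a program check is delivered
```

Only low physical addresses match. A kernel virtual address such as C00E579C falls outside the table and comes back as "unknown".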
* RE: Exception in kernel mode
2007-03-15 15:52 ` Sergei Shtylyov
@ 2007-03-15 18:19 ` Charles Krinke
2007-03-15 18:23 ` Sergei Shtylyov
0 siblings, 1 reply; 17+ messages in thread
From: Charles Krinke @ 2007-03-15 18:19 UTC (permalink / raw)
To: Sergei Shtylyov; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
Obviously, you've got an exception in the decrementer exception handler
itself -- and this was something like program check exception, judging on the
signal you've got (SIGILL).
> So, with that said:
> "What might be the causes of such an exception from the decrementer in a
> 2.6.17.11 ppc8241 kernel?"
> "Where should one concentrate ones efforts in figuring this out?"
Hrm, looks like some CPU errata maybe...
WBR, Sergei
Thank you very much for the great hint. I have a follow-up question.
I can see from arch/ppc/kernel/head.S that the EXCEPTION_PROLOG macro is
the first instruction in the decrementer ISR and it says (amongst other
things):
mtspr SPRN_SPRG0, r10
mtspr SPRN_SPRG1, r11
Which is, I believe, moving r10 to SPRG0 and r11 to SPRG1.
So, how do we know that r10 and r11 are always valid in an interrupt
context? Are we setting aside r10 and r11 somewhere else in
initialization for this purpose? I have looked at the -ffixed-r2 in the
CFLAGS and see it is set aside, but it doesn't appear the build process
sets r10 & r11 aside for the exclusive use of interrupt routines. Is
there a shadow register set in the 8241 I haven't appreciated yet?
I have to admit that I am only modestly familiar with the ppc family but
would appreciate any insight to understand how this interrupt code
works.
Charles
* Re: Exception in kernel mode
2007-03-15 18:19 ` Charles Krinke
@ 2007-03-15 18:23 ` Sergei Shtylyov
2007-03-15 18:55 ` Charles Krinke
0 siblings, 1 reply; 17+ messages in thread
From: Sergei Shtylyov @ 2007-03-15 18:23 UTC (permalink / raw)
To: Charles Krinke; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
Hello.
Charles Krinke wrote:
> Obviously, you've got an exception in the decrementer exception
> handler
> itself -- and this was something like program check exception, judging
> on the
> signal you've got (SIGILL).
>>So, with that said:
>>"What might be the causes of such an exception from the decrementer in a
>>2.6.17.11 ppc8241 kernel?"
>>"Where should one concentrate ones efforts in figuring this out?"
> Hrm, looks like some CPU errata maybe...
Please, tune message quoting in your mailer.
> Thank you very much for the great hint. I have a follow-up question.
> I can see from arch/ppc/kernel/head.S that the EXCEPTION_PROLOG macro is
> the first instruction in the decrementer ISR and it says (amongst other
> things):
> mtspr SPRN_SPRG0, r10
> mtspr SPRN_SPRG1, r11
> Which is, I believe, moving r10 to SPRG0 and r11 to SPRG1.
> So, how do we know that r10 and r11 are always valid in an interrupt
> context? Are we setting aside r10 and r11 somewhere else in
That doesn't matter to kernel at all -- they are just *saved* in SPRG regs
to avoid being trashed by the exception handler.
WBR, Sergei
* RE: Exception in kernel mode
2007-03-15 18:23 ` Sergei Shtylyov
@ 2007-03-15 18:55 ` Charles Krinke
2007-03-15 19:28 ` Kumar Gala
0 siblings, 1 reply; 17+ messages in thread
From: Charles Krinke @ 2007-03-15 18:55 UTC (permalink / raw)
To: Sergei Shtylyov; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
> mtspr SPRN_SPRG0, r10
> mtspr SPRN_SPRG1, r11
> Which is, I believe, moving r10 to SPRG0 and r11 to SPRG1.
> So, how do we know that r10 and r11 are always valid in an interrupt
> context? Are we setting aside r10 and r11 somewhere else in
That doesn't matter to kernel at all -- they are just *saved* in SPRG regs
to avoid being trashed by the exception handler.
WBR, Sergei
Well, unfortunately, now I am more confused.
The original Oops was at an NIP of 00000900, which I think means it
faulted on the first mtspr from r10. I suppose one could argue that
pipeline issues might make it fault on the second one and appear to be
the first.
But, maybe I am confusing myself here. Would I be correct in assuming
that some further instruction in the ISR at 0x900 is the culprit?
Could there possibly be some user versus supervisor mode thing going on?
My key assumption is that the timer_tick (aka Decrementer) has worked
for many hundreds of thousands of interrupts and only when running some
particular user application, like tar is there a side effect from either
a mode or some register value, race condition, or other.
Charles
* Re: Exception in kernel mode
2007-03-15 18:55 ` Charles Krinke
@ 2007-03-15 19:28 ` Kumar Gala
2007-03-15 19:56 ` Charles Krinke
2007-03-15 21:17 ` Charles Krinke
0 siblings, 2 replies; 17+ messages in thread
From: Kumar Gala @ 2007-03-15 19:28 UTC (permalink / raw)
To: Charles Krinke; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
On Mar 15, 2007, at 1:55 PM, Charles Krinke wrote:
>
>> mtspr SPRN_SPRG0, r10
>> mtspr SPRN_SPRG1, r11
>
>> Which is, I believe, moving r10 to SPRG0 and r11 to SPRG1.
>
>> So, how do we know that r10 and r11 are always valid in an interrupt
>> context? Are we setting aside r10 and r11 somewhere else in
>
> That doesn't matter to kernel at all -- they are just *saved* in
> SPRG regs
> to avoid being trashed by the exception handler.
>
> WBR, Sergei
>
> Well, unfortunately, now I am more confused.
>
> The original Oops was at an NIP of 00000900, which I think means it
> faulted on the first mtspr from r10. I suppose one could argue that
> pipeline issues might make it fault on the second one and appear to be
> the first.
>
> But, maybe I am confusing myself here. Would I be correct in assuming
> that some further instruction in the ISR at 0x900 is the culprit?
>
> Could there possibly be some user versus supervisor mode thing
> going on?
>
> My key assumption is that the timer_tick (aka Decrementer) has worked
> for many hundreds of thousands of interrupts and only when running
> some
> particular user application, like tar is there a side effect from
> either
> a mode or some register value, race condition, or other.
Can you post the oops that you are seeing? What you need to find out
is which instruction image is causing the illegal instruction
exception. Once you have that it will be easier to figure out what's
going on.
- k
* RE: Exception in kernel mode
2007-03-15 19:28 ` Kumar Gala
@ 2007-03-15 19:56 ` Charles Krinke
2007-03-15 21:17 ` Charles Krinke
1 sibling, 0 replies; 17+ messages in thread
From: Charles Krinke @ 2007-03-15 19:56 UTC (permalink / raw)
To: Kumar Gala; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
Can you post the oops that you are seeing? What you need to find out
is which instruction image is causing the illegal instruction
exception. Once you have that it will be easier to figure out what's
going on.
- k
Dear Kumar:
Here is the Oops, and thank you for looking at this. Things have come to
a halt here until I can figure this one out as about 6 engineers are
going "We cannot use an unreliable 2.6 kernel, fix it immediately, Mr.
Kernel Guy".
Basically, what was happening was a 'tar -xvzf' was underway whose
source was an nfs mount (root = /dev/nfs) and whose destination was
formatted NAND flash (flash_eraseall -q -j /dev/mtd3, then mounted with
'mount -t jffs2 /dev/mtdblockx /mnt/mtdblockx').
Since I sent the last exchange, I see in the kernel source something
about ALTIVEC??. It isn't possible that ALTIVEC needs to be defined for
the 8241, is it?
Charles
rootfs/sbin/fsck.ext3
Oops: Exception in kernel mode, sig: 4 [#1]
PREEMPT
NIP: 00000900 LR: C00E579C CTR: 00003A55
REGS: c37f3b00 TRAP: 0700 Not tainted (2.6.17.11)
MSR: 00081000 <ME> CR: 24022484 XER: 00000000
TASK = c3eaf810[940] 'tar' THREAD: c37f2000
GPR00: 00003FFF C37F3BB0 C3EAF810 C40216BC 00000000 0000FFFE C4022D64 00008000
GPR08: C4000A74 C40316BC C4000980 C40216BC C4000000 10045C90 C37F3CE0 C01D3B68
GPR16: C3ECC5A0 C37F3C50 00002200 00000000 C2A9E000 C2000000 00000000 00001000
GPR24: 00000000 00000000 C01D0000 C2000000 C37F3C58 00000002 00000000 C4000000
Call Trace:
[C37F3BB0] [C00E5764] (unreliable)
[C37F3BD0] [C00C93C4]
[C37F3BF0] [C00B988C]
[C37F3C40] [C00C0644]
[C37F3C90] [C00BBA94]
[C37F3CD0] [C003B8A8]
[C37F3D90] [C003C270]
[C37F3E20] [C003C320]
[C37F3EC0] [C003C470]
[C37F3EF0] [C0055F80]
[C37F3F10] [C00560C0]
[C37F3F40] [C00041A0]
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
* RE: Exception in kernel mode
2007-03-15 19:28 ` Kumar Gala
2007-03-15 19:56 ` Charles Krinke
@ 2007-03-15 21:17 ` Charles Krinke
2007-03-15 23:44 ` Kumar Gala
2007-03-15 23:50 ` u-boot+linux for ML403 board Leonid
1 sibling, 2 replies; 17+ messages in thread
From: Charles Krinke @ 2007-03-15 21:17 UTC (permalink / raw)
To: Kumar Gala; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
I ran a couple more tests and the system did not Oops in the
timer_interrupt except for the first test this morning. The last two
times, the NIP was
Oops: Exception in kernel mode, sig: 4 [#1]
PREEMPT
NIP: C002CE68 LR: C002CEC8 CTR: 00003B25
REGS: c3119a00 TRAP: 0700 Not tainted (2.6.17.11)
MSR: 00081032 <ME,IR,DR> CR: 88022484 XER: 00000000
TASK = c3ecd870[920] 'tar' THREAD: c3118000
And
Oops: Exception in kernel mode, sig: 4 [#1]
PREEMPT
NIP: C00DEE18 LR: C00DEDD8 CTR: 00000000
REGS: c299dbc0 TRAP: 0700 Not tainted (2.6.17.11)
MSR: 00081032 <ME,IR,DR> CR: 84022488 XER: 20000000
TASK = c3e2b7d0[925] 'tar' THREAD: c299c000
For comparison, this is the original one from this morning
Oops: Exception in kernel mode, sig: 4 [#1]
PREEMPT
NIP: 00000900 LR: C00E579C CTR: 00003A55
REGS: c37f3b00 TRAP: 0700 Not tainted (2.6.17.11)
MSR: 00081000 <ME> CR: 24022484 XER: 00000000
TASK = c3eaf810[940] 'tar' THREAD: c37f2000
I have to conclude this is not necessarily a timer_interrupt problem.
Also, commenting out the innards of the timer_interrupt causes the
kernel to hang in its boot right after the message
Memory: xxxK available
So a properly functioning timer_interrupt is essential to the kernel
booting.
But, ... At this point, I really don't know which way to jump.
Charles
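The MSR values in the three dumps can be compared bit by bit. The sketch below is a rough Python decoder, assuming the classic 32-bit PowerPC MSR layout and assuming that for a program check (TRAP: 0700) the upper bits of the saved value carry the reason codes. The names `MSR_FLAGS`, `PROGRAM_CHECK_REASONS` and `decode` are illustrative, not kernel identifiers:

```python
# Low-half MSR flag bits on classic 32-bit PowerPC (6xx/603e-class) cores.
MSR_FLAGS = [
    (0x8000, "EE"), (0x4000, "PR"), (0x2000, "FP"), (0x1000, "ME"),
    (0x0800, "FE0"), (0x0400, "SE"), (0x0200, "BE"), (0x0100, "FE1"),
    (0x0040, "IP"), (0x0020, "IR"), (0x0010, "DR"),
    (0x0002, "RI"), (0x0001, "LE"),
]

# Reason bits saved alongside the MSR when a program check is taken.
PROGRAM_CHECK_REASONS = [
    (0x00100000, "floating-point enabled exception"),
    (0x00080000, "illegal instruction"),
    (0x00040000, "privileged instruction"),
    (0x00020000, "trap instruction"),
]

def decode(msr):
    """Split a saved MSR value into flag names and program-check reasons."""
    flags = [name for bit, name in MSR_FLAGS if msr & bit]
    reasons = [name for bit, name in PROGRAM_CHECK_REASONS if msr & bit]
    return flags, reasons

for value in (0x00081032, 0x00081000):
    flags, reasons = decode(value)
    print(f"{value:08X}: flags={','.join(flags)} reason={','.join(reasons)}")
```

Under those assumptions all three dumps report an illegal instruction, which matches sig 4. The 0x900 case differs in having IR and DR clear, meaning address translation was off when it faulted. The sketch also lists RI for 00081032, which the oops text does not print.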
* Re: Exception in kernel mode
2007-03-15 21:17 ` Charles Krinke
@ 2007-03-15 23:44 ` Kumar Gala
2007-03-16 14:45 ` Charles Krinke
2007-03-15 23:50 ` u-boot+linux for ML403 board Leonid
1 sibling, 1 reply; 17+ messages in thread
From: Kumar Gala @ 2007-03-15 23:44 UTC (permalink / raw)
To: Charles Krinke; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
On Mar 15, 2007, at 4:17 PM, Charles Krinke wrote:
> I ran a couple more tests and the system did not Oops in the
> timer_interrupt except for the first test this morning. The last two
> times, the NIP was
>
> Oops: Exception in kernel mode, sig: 4 [#1]
> PREEMPT
> NIP: C002CE68 LR: C002CEC8 CTR: 00003B25
> REGS: c3119a00 TRAP: 0700 Not tainted (2.6.17.11)
> MSR: 00081032 <ME,IR,DR> CR: 88022484 XER: 00000000
> TASK = c3ecd870[920] 'tar' THREAD: c3118000
>
> And
>
> Oops: Exception in kernel mode, sig: 4 [#1]
> PREEMPT
> NIP: C00DEE18 LR: C00DEDD8 CTR: 00000000
> REGS: c299dbc0 TRAP: 0700 Not tainted (2.6.17.11)
> MSR: 00081032 <ME,IR,DR> CR: 84022488 XER: 20000000
> TASK = c3e2b7d0[925] 'tar' THREAD: c299c000
>
> For comparison, this is the original one from this morning
>
> Oops: Exception in kernel mode, sig: 4 [#1]
> PREEMPT
> NIP: 00000900 LR: C00E579C CTR: 00003A55
> REGS: c37f3b00 TRAP: 0700 Not tainted (2.6.17.11)
> MSR: 00081000 <ME> CR: 24022484 XER: 00000000
> TASK = c3eaf810[940] 'tar' THREAD: c37f2000
>
> I have to conclude this is not necessarily a timer_interrupt problem.
> Also, commenting out the innards of the timer_interrupt causes the
> kernel to hang in its boot right after the message
>
> Memory: xxxK available
>
> So a properly functioning timer_interrupt is essential to the kernel
> booting.
>
> But, ... At this point, I really don't know which way to jump.
Is this a system you are just bringing up or one that's been running
for a while? It really seems like memory corruption of some form.
I'd suggest checking memory controller settings.
Also, what happens if you disassemble the kernel image and look at
the addresses pointed to by NIP:
C00DEE18 & C002CE68.
- k
* RE: Exception in kernel mode
2007-03-15 23:44 ` Kumar Gala
@ 2007-03-16 14:45 ` Charles Krinke
2007-03-16 15:33 ` Kumar Gala
0 siblings, 1 reply; 17+ messages in thread
From: Charles Krinke @ 2007-03-16 14:45 UTC (permalink / raw)
To: Kumar Gala; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
Is this a system you are just bringing up or one that's been running
for a while? It really seems like memory corruption of some form.
I'd suggest checking memory controller settings.
Also, what happens if you disassemble the kernel image and look at
the addresses pointed to by NIP:
C00DEE18 & C002CE68.
- k
Dear Kumar:
We have two systems. One based on an 8241, and one based on an 8541. The
8241 has been running for some time with Linux 2.4 and the 8541 is
coming up. Both are using the 2.6.17.11 kernel from kernel.org with
modifications for our hardware.

In the case of the 8241, I started out with the 2.4 modifications, which
were originally based on the 8260 and ported them to 2.6. In the case of
the 8541, I started out with the embedded planet 8555EP 2.6 kernel
source and added that to the 2.6.

I don't see this exception in the 8541, although extensive testing has
not yet been completed. The 8241 exhibits this exception on three
different 8241 boards, so I don't suspect the hardware.

We are using the Montavista toolchain and their root filesystem
including 'tar' and 'cp' which are the programs that currently exhibit
the fault.

Yesterday, when I saw an NIP at 0x900, I was ready to jump on the
interrupts not being set up correctly, but after a few hours of going
through that, I am now convinced the interrupts are set up correctly, so
it is something more subtle.

Certainly, memory corruption is the next thing to be concerned with.

One thing that has concerned me a bit is that we have no swap space
available at all. This is an embedded system with 64MByte of RAM and
JFFS2 NAND flash with no swap partitions.

I suspect auditing the MMU setup differences between the original 2.4
kernel and the new 2.6 kernel for the 8241 board is the next step.

The three exceptions I saw yesterday were 1) 0x900 in the
timer_interrupt, 2) C00DEE18 (inside the tar program) and 3) C002CE68
(in one of the kernel routines).

I suspect the actual addresses are red-herrings and this exception can
occur at any address. This certainly would tend to indicate some sort of
memory setup issue.

Changing the Oops logic to print out the NextInstruction as well as the
NIP might be helpful so I could discern the difference between what the
program is trying to do and what it is really doing.

Are there any other thoughts you might have on diagnosis techniques at
this point?

Charles
In the meantime, any thoughts you might have on methods to di
* Re: Exception in kernel mode
2007-03-16 14:45 ` Charles Krinke
@ 2007-03-16 15:33 ` Kumar Gala
2007-03-16 16:05 ` Charles Krinke
0 siblings, 1 reply; 17+ messages in thread
From: Kumar Gala @ 2007-03-16 15:33 UTC (permalink / raw)
To: Charles Krinke; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
On Mar 16, 2007, at 9:45 AM, Charles Krinke wrote:
> Is this a system you are just bringing up or one that's been running
> for a while? It really seems like memory corruption of some form.
> I'd suggest checking memory controller settings.
>
> Also, what happens if you disassemble the kernel image and look at
> the addresses pointed to by NIP:
> C00DEE18 & C002CE68.
>
> - k
> Dear Kumar:
>
> We have two systems. One based on an 8241, and one based on an
> 8541. The 8241 has been running for some time with Linux 2.4 and
> the 8541 is coming up. Both are using the 2.6.17.11 kernel from
> kernel.org with modifications for our hardware.
>
> In the case of the 8241, I started out with the 2.4 modifications,
> which were originally based on the 8260 and ported them to 2.6. In
> the case of the 8541, I started out with the embedded planet 8555EP
> 2.6 kernel source and added that to the 2.6.
>
> I dont see this exception in the 8541, although extensive testing
> has not yet been completed. The 8241 exhibits this exception on
> three different 8241 boards, so I dont suspect the hardware.
>
> We are using the Montavista toolchain and their root filesystem
> including 'tar' and 'cp' which are the programs that currently
> exhibit the fault.
>
> Yesterday, when I saw an NIP at 0x900, I was ready to jump on the
> interrupts not being setup correctly, but after a few hours of
> going through that, I am now convinced the interrupts are setup
> correctly, so it is something more subtle.
>
> Certainly, memory corruption is the next thing to be concerned with.
>
> One thing that has concerned me a bit is that we have no swap space
> available at all. This is an embedded system with 64MByte of RAM
> and JFFS2 NAND flash with no swap partitions.
>
> I suspect auditing the MMU setup differences between the original
> 2.4 kernel and the new 2.6 kernel for the 8241 board is the next step.
>
> The three exceptions I saw yesterday were 1)0x900 in the
> timer_interrupt, 2) C00DEE18 (inside the tar program) and 3)
> C002CE68 (in one of the kernel routines).
#2 is inside the kernel as well. Look at the System.map or
objdump -d vmlinux to see what exactly is at those instructions.
> I suspect the actual addresses are red-herrings and this exception
> can occur at any address. This certainly would tend to indicate
> some sort of memory setup issue.
I think it's useful to know if the instructions at the two offsets
C00DEE18 & C002CE68 are similar in some way before jumping to that
conclusion.
> Changing the Oops logic to printout the NextInstruction as well as
> the NIP might be helpful so I could discern the difference between
> what the program is trying to do and what it is really doing.
>
> Are there any other thoughts you might have on diagnosis techniques
> at this point?
Try turning on KALLSYMS, this should provide more info on the oops as
well.
- k
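As a rough illustration of the System.map lookup, the Python sketch below finds the closest symbol at or below each NIP. The map excerpt and its symbol names are hypothetical placeholders. The real entries have to come from the System.map of the build that produced the oops:

```python
import bisect

def load_system_map(text):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _type, name = parts
            symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols

def lookup(symbols, addr):
    """Return (name, offset) of the closest symbol at or below addr."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base

# Hypothetical excerpt, for illustration only.
SAMPLE_MAP = """\
c002ce00 T example_func_a
c002cf40 T example_func_b
c00dec00 t example_func_c
c00df000 T example_func_d
"""

symbols = load_system_map(SAMPLE_MAP)
for nip in (0xC002CE68, 0xC00DEE18):
    name, off = lookup(symbols, nip)
    print(f"{nip:08X} -> {name}+{off:#x}")
```

With the symbol and offset in hand, the matching lines in the objdump -d vmlinux output show which instruction sits at each NIP, so the two can be compared.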
* RE: Exception in kernel mode
2007-03-16 15:33 ` Kumar Gala
@ 2007-03-16 16:05 ` Charles Krinke
0 siblings, 0 replies; 17+ messages in thread
From: Charles Krinke @ 2007-03-16 16:05 UTC (permalink / raw)
To: Kumar Gala; +Cc: Chris Carlson, Kevin Smith, linuxppc-embedded
#2 is inside the kernel as well. Look at the System.map or
objdump -d vmlinux to see what exactly is at those instructions.
> I suspect the actual addresses are red-herrings and this exception
> can occur at any address. This certainly would tend to indicate
> some sort of memory setup issue.
I think it's useful to know if the instructions at the two offsets
C00DEE18 & C002CE68 are similar in some way before jumping to that
conclusion.
> Changing the Oops logic to printout the NextInstruction as well as
> the NIP might be helpful so I could discern the difference between
> what the program is trying to do and what it is really doing.
>
> Are there any other thoughts you might have on diagnosis techniques
> at this point?
Try turning on KALLSYMS, this should provide more info on the oops as
well.
- k
Dear Kumar:
As always, your advice is appreciated. I will try turning on
CONFIG_KALLSYMS in the .config file and see if additional info is
obtained. Also, I will audit the MMU setup and see if I can see anything
unusual.
Charles
* u-boot+linux for ML403 board.
2007-03-15 21:17 ` Charles Krinke
2007-03-15 23:44 ` Kumar Gala
@ 2007-03-15 23:50 ` Leonid
1 sibling, 0 replies; 17+ messages in thread
From: Leonid @ 2007-03-15 23:50 UTC (permalink / raw)
To: linuxppc-embedded
Hi:
Does anybody use u-boot+linux on the ML403 board? If so, could somebody
give me a kernel, or a kernel patch against any open source kernel tree
(DENX or kernel.org)?
I couldn't prepare a uImage that works with u-boot on the ML403 board -
it hangs forever:
TFTP from server 192.168.0.141; our IP address is 192.168.0.203
Filename 'LM200/rel/1.0.1d-403/uImage'.
Load address: 0x400000
Loading: T
#################################################################
#################################################################
#######################################
done
Bytes transferred = 861911 (d26d7 hex)
## Booting image at 00400000 ...
Image Name: Linux-2.6.19.2
Image Type: PowerPC Linux Kernel Image (gzip compressed)
Data Size: 861847 Bytes = 841.6 kB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
Uncompressing Kernel Image ... OK
## Current stack ends at 0x03E676F0 => set upper limit to 0x00800000
Board Info pointer (kbd) at 007ffe80 ## cmdline at 0x007FFF00 ... 0x007FFFF8
memstart = 0x00000000
memsize = 0x04000000
flashstart = 0x28000000
flashsize = 0x00800000
flashoffset = 0x00000000
sramstart = 0x00000000
sramsize = 0x00000000
bootflags = 0x00000000
procfreq = 300 MHz
plb_busfreq = 100 MHz
ethaddr = 00:01:02:CB:CB:71
IP addr = 192.168.0.203
baudrate = 115200 bps
No initrd
## Transferring control to Linux (at address 00000000) ..
Thanks,
Leonid.
* Re: U-boot with flat device tree
2007-03-14 21:42 U-boot with flat device tree Benedict, Michael
2007-03-14 22:46 ` Kim Phillips
2007-03-15 15:41 ` Exception in kernel mode Charles Krinke
@ 2007-03-15 16:37 ` Jon Loeliger
2 siblings, 0 replies; 17+ messages in thread
From: Jon Loeliger @ 2007-03-15 16:37 UTC (permalink / raw)
To: Benedict, Michael; +Cc: linuxppc-embedded
On Wed, 2007-03-14 at 16:42, Benedict, Michael wrote:
> dtc - development snapshot "dtc-20060419.tar.gz" and recent git
> sources
> U-boot - 1.2.0, git sources from denx.de, and git sources from freescale
> Kernel - 2.6.20.1, 2.6.20.2, and freescale git sources
Do yourself a favor and get an updated DTC at some point:
http://www.jdl.com/git_repos/
http://www.jdl.com/software/dtc-20070216.tgz
Though the latter should have its Maintainer update it too. :-)
Thanks,
jdl
Thread overview: 17+ messages
2007-03-14 21:42 U-boot with flat device tree Benedict, Michael
2007-03-14 22:46 ` Kim Phillips
[not found] ` <CF7E46FCFF66AD478BB72724345289EC27953F@twx-exch01.twacs.local>
[not found] ` <20070314180112.15492178.kim.phillips@freescale.com>
2007-03-14 23:42 ` Benedict, Michael
2007-03-15 15:41 ` Exception in kernel mode Charles Krinke
2007-03-15 15:52 ` Sergei Shtylyov
2007-03-15 18:19 ` Charles Krinke
2007-03-15 18:23 ` Sergei Shtylyov
2007-03-15 18:55 ` Charles Krinke
2007-03-15 19:28 ` Kumar Gala
2007-03-15 19:56 ` Charles Krinke
2007-03-15 21:17 ` Charles Krinke
2007-03-15 23:44 ` Kumar Gala
2007-03-16 14:45 ` Charles Krinke
2007-03-16 15:33 ` Kumar Gala
2007-03-16 16:05 ` Charles Krinke
2007-03-15 23:50 ` u-boot+linux for ML403 board Leonid
2007-03-15 16:37 ` U-boot with flat device tree Jon Loeliger