PPC Linux crash resulting MMU problem !!


* PPC Linux crash resulting MMU problem !!
@ 1999-08-20 12:54 Pierre Juste
  1999-08-20 15:26 ` Magnus Damm
  0 siblings, 1 reply; 2+ messages in thread
From: Pierre Juste @ 1999-08-20 12:54 UTC (permalink / raw)
  To: linuxppc-dev@lists.linuxppc.org

We are working on a port of Linux PPC to a system based on the PowerPC 860.
Quick overview of the problem: our system crashes shortly after boot because
of an MMU error. We are very interested in any feedback on solving MMU
problems on PowerPC, especially on the MPC8xx under Linux.

Thanks

Pierre

Problem:

For debugging, we have connected our target to the FADS860 (emulator)
through the serial BDM connector (SRESET, HRESET, DIN, DOUT, CLK
+ 2 observation pins).

Our software problem is the following:

We are debugging Linux and are very close to a working system: the OS performs
well if our PowerPC board is started with the FADS on (we only configure the
DER register, enable the monitor ROM and run it), but "random" errors appear
when the PowerPC board runs standalone.

The errors are fairly (though not fully) reproducible for a given binary.
When a few dummy lines of code are added (anywhere), the program stops
somewhere else (not at the same location as before; most often execution
stops long before the added code would have been reached). To understand
what happens, we added code to every exception handler to track code
execution (we suspected that the behaviour we describe here had to do
with MMU operation). Some registers are traced; the tracing method is to
use part of the existing memory (not made available to the OS) as a trace
pad. When the program is obviously stuck, we perform a software reset via
SRESET on the BDM; inside the 0x100 routine, we record which instruction
we interrupted, enable branch tracing and return to normal execution
(this is to see whether anything executes afterwards). Then we perform a
hardware reset (PORESET) and look at the content of the scratchpad memory.
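
[Editor's sketch] The trace-pad method described above could look roughly like the following in C. This is only an illustration under stated assumptions: the record layout, the slot count and the names trace_pad, trace_init and trace_log are hypothetical and do not come from the original post, and an ordinary struct stands in for the reserved RAM region that the OS never touches.

```c
#include <stdint.h>

/* Hypothetical record layout: one entry per exception taken. */
struct trace_rec {
    uint32_t vector;  /* exception vector offset, e.g. 0x100 for reset */
    uint32_t srr0;    /* saved program counter */
    uint32_t srr1;    /* saved machine state */
};

#define TRACE_SLOTS 64u  /* assumed size of the trace pad */

/* Stand-in for the reserved memory. On a real target this would be a
 * fixed region that is kept out of the memory handed to the OS. */
struct trace_pad {
    uint32_t next;                       /* records written so far */
    struct trace_rec rec[TRACE_SLOTS];
};

/* Start a fresh trace. This must not be called on the reset path,
 * otherwise the evidence would be wiped before it can be read. */
void trace_init(volatile struct trace_pad *pad)
{
    pad->next = 0;
}

/* Append one record, wrapping around so the newest entries survive. */
void trace_log(volatile struct trace_pad *pad,
               uint32_t vector, uint32_t srr0, uint32_t srr1)
{
    uint32_t slot = pad->next % TRACE_SLOTS;

    pad->rec[slot].vector = vector;
    pad->rec[slot].srr0 = srr0;
    pad->rec[slot].srr1 = srr1;
    pad->next = pad->next + 1;
}
```

Each exception handler would call trace_log() with its own vector offset and the SRR0/SRR1 values it found on entry; after the hardware reset, the pad is read back starting from the slot that `next` points past.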

Here is the result of seven trials with the PowerPC board standalone (each
corresponds to a different binary, the difference being dummy code:
only printfs at the end of OS boot, which most of the time is not
reached). We omit the beginning of the trace and show only the
content of the last TLB miss when necessary, the soft reset trace, and
when relevant the trace that follows. These results are not a selection :
they are seven consecutive logged trials :

Trial 1 :    1 ITLB miss @ 0xC009A000
                1 Sreset : SRR0 = 0xC009A000 SRR1 = 0x00000040

Comment : The TLB miss occurs at an exact page boundary. We cannot be
sure nothing is executed after the ITLB miss exception ends (logging
occurs after standard code execution), but the software reset interrupts
execution at exactly the address reported by the ITLB miss ! In addition,
the SRR1 content is strange (clearly not a valid MSR value to have saved).

Trial 2 :    nothing very special about previous interrupts. Program
stopped. Sreset results are :
                1 Sreset : SRR0 = 0xC001D000    SRR1=0.

Trial 3 :    nothing very special about previous interrupts. Program
stopped. Sreset results are :
                1 Sreset : SRR0 = 0xC0079000    SRR1=0.

Comment : here again exact page boundaries are observed when resetting
the program after it has blocked.

Trial 4 :    1 ITLB miss @ 0xC0012FD8
                1 Sreset : SRR0 = 0xC0012FD8 SRR1 = 0x08209032

Comment : here the ITLB miss does not occur at a page boundary, but it
seems that execution stops here, since the reset we perform after a while
indicates the same address as the ITLB miss. Question : what happens to
the SRR1 reserved bits after an exception is handled (the 0820 bits of SRR1
after an ITLB miss exception, for instance) ?
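
[Editor's sketch] To make the SRR1 values in these trials easier to read, the low half can be split into the usual PowerPC MSR flag names and the upper half kept separate. The function name srr1_decode and the table are illustrative and not from the original post; the sketch deliberately makes no claim about what the upper bits (the 0x0820 part) mean for a given exception.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* MSR flags found in the low 16 bits (classic PowerPC definitions). */
struct msr_flag {
    uint32_t mask;
    const char *name;
};

static const struct msr_flag msr_flags[] = {
    { 0x8000u, "EE"  }, { 0x4000u, "PR" }, { 0x2000u, "FP" },
    { 0x1000u, "ME"  }, { 0x0800u, "FE0" }, { 0x0400u, "SE" },
    { 0x0200u, "BE"  }, { 0x0100u, "FE1" }, { 0x0040u, "IP" },
    { 0x0020u, "IR"  }, { 0x0010u, "DR" }, { 0x0002u, "RI" },
    { 0x0001u, "LE"  },
};

/* Write the names of the MSR flags set in the low half of srr1 into
 * buf (space separated) and return the upper 16 bits unchanged, so the
 * exception-specific part can be inspected on its own. */
uint32_t srr1_decode(uint32_t srr1, char *buf, size_t len)
{
    size_t used = 0;

    if (len > 0)
        buf[0] = '\0';
    for (size_t i = 0; i < sizeof msr_flags / sizeof msr_flags[0]; i++) {
        if (srr1 & msr_flags[i].mask) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", msr_flags[i].name);
            if (n < 0 || (size_t)n >= len - used)
                break;  /* buffer full: stop rather than truncate further */
            used += (size_t)n;
        }
    }
    return srr1 & 0xFFFF0000u;
}
```

With this table, 0x00009032 decodes to "EE ME IR DR RI" (interrupts and both translations enabled), 0x00000040 decodes to "IP" alone (translation and external interrupts off), and 0x08209032 gives the same flags as 0x00009032 with 0x08200000 left over in the upper half.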

Trial 5 :    nothing very special about previous interrupts. Program
stopped. Sreset results are :
                1 Sreset : SRR0 = 0xC00A4000    SRR1=0x00000040.

Trial 6 :    1 DTLB miss at code address 0xC0006638
                1 Sreset : SRR0 = 0xC0006638    SRR1=0x00009032
                1 DTLB error at code address 0xC0006638

Comment : here we have a data TLB miss causing the stop. The code does
not progress thereafter, as indicated by the SRR0 at Sreset, but it
seems the Sreset has unlocked something, because execution tries to
restart this time, but with little success since a DTLB error is
observed immediately.

Trial 7 :    nothing very special about previous interrupts. Program
stopped. Sreset results are :
                1 Sreset : SRR0 = 0xC0060000    SRR1=0x00000040.

Comment : out of seven trials, five have their execution stopped at a
program page boundary, and among these only one signals an immediately
preceding ITLB miss at that address... For the DTLB miss, we did not
find any simple means of identifying which data address had caused
the miss.
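
[Editor's sketch] The "five out of seven" count can be checked mechanically from the SRR0 values logged at Sreset. The sketch below assumes 4 KB pages (the post does not state the page size) and uses illustrative names that are not from the original post.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 0x1000u  /* assumption: 4 KB pages */

/* SRR0 values logged at Sreset in trials 1 to 7. */
const uint32_t trial_srr0[7] = {
    0xC009A000u, 0xC001D000u, 0xC0079000u, 0xC0012FD8u,
    0xC00A4000u, 0xC0006638u, 0xC0060000u,
};

/* True when addr is the first byte of a page. */
int is_page_boundary(uint32_t addr)
{
    return (addr & (PAGE_SIZE_ASSUMED - 1u)) == 0;
}

/* Count how many of the n addresses sit on a page boundary. */
size_t count_page_boundaries(const uint32_t *addrs, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (is_page_boundary(addrs[i]))
            count++;
    return count;
}
```

Calling count_page_boundaries(trial_srr0, 7) returns 5; the two exceptions are trial 4 (0xC0012FD8) and trial 6 (0xC0006638).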

Most of the time, the reset lets us find out where the program is stuck, but
when it does happen to resume execution after the reset, the context seems to
be corrupted: the CPU does nothing coherent and often ends up in an infinite
loop. The software reset therefore seems to be a good means of getting the
information we want, which is basically the saved program counter (SRR0) and
machine state register (SRR1). On the other hand, either the error that
occurred makes resuming impossible, or the software reset modifies the
context too much; in either case the program flow after the incident is too
erratic to be interpreted.

[[ This message was sent via the linuxppc-dev mailing list.  Replies are ]]
[[ not  forced  back  to the list, so be sure to Cc linuxppc-dev if your ]]
[[ reply is of general interest. Please check http://lists.linuxppc.org/ ]]
[[ and http://www.linuxppc.org/ for useful information before posting.   ]]


* Re: PPC Linux crash resulting MMU problem !!
  1999-08-20 12:54 PPC Linux crash resulting MMU problem !! Pierre Juste
@ 1999-08-20 15:26 ` Magnus Damm
  0 siblings, 0 replies; 2+ messages in thread
From: Magnus Damm @ 1999-08-20 15:26 UTC (permalink / raw)
  To: Pierre_Juste; +Cc: linuxppc-dev

Hi there!

I've experienced some strange things with my 860 and linux.

I've got a few different boards around: MBX, ADS, FADS and some custom
ones.

The MBX has some kind of boot software that loads the kernel.
I've written boot code for the rest of the boards.

The strange thing that happened was that my kernel died on all boards
except for the MBX. I got a hint from a guy called Helmut Buchsbaum
<helmut.buchsbaum@siemens.at>.
He had a FADS with an 823 and he told me that he had to reserve a few TLB
entries to make it work.

Some other guy said that he had been using a BDM and had seen the interrupt
code trashed.
Helmut gave him the patch and he said it worked much better with it.

And my kernel works better with it.
I think that without the patch I get some kind of bad opcode in the bad
opcode routine, and everything stops.

That may not be your problem, but you might want to add this code to
head.S and see if anything works better...

Keep the list informed if you find out anything.

Thanks /

Magnus Damm

the code:

#ifdef CONFIG_FADS_FORCE_IO
/* Load a data TLB entry mapping BOOT_IMMR (8M page, cache inhibited),
 * then set the reserve bit in MD_CTR.
 */
fadsForceIO:
#ifdef CONFIG_MPC823
    mfspr   r8,MD_CTR
    ori     r8,r8,0x0700      /* set TLB idx to 7 */
#endif
#if defined(CONFIG_MPC860) || defined(CONFIG_MPC860T)
    mfspr   r8,MD_CTR
    ori     r8,r8,0x1F00      /* set TLB idx to 31 */
#endif
    mtspr   MD_CTR, r8
    lis     r8, BOOT_IMMR@h         /* Create vaddr for TLB */
    ori     r8, r8, MD_EVALID       /* Mark it valid */
    mtspr   MD_EPN, r8
    li      r8, MD_PS8MEG           /* Set 8M byte page */
    ori     r8, r8, MD_SVALID       /* Make it valid */
    mtspr   MD_TWC, r8
    lis     r8, BOOT_IMMR@h         /* Create paddr for TLB */
    ori     r8, r8, MI_BOOTINIT|0x2 /* Inhibit cache -- Cort */
    mtspr   MD_RPN, r8

    mfspr   r8,MD_CTR
    oris    r8,r8,0x0800    /* set RSV2D */
    mtspr   MD_CTR, r8
    blr
#endif /* CONFIG_FADS_FORCE_IO */
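
[Editor's sketch] The ori instructions in the patch OR the index into whatever MD_CTR already holds, which gives the intended entry when every index bit is being set (0x1F00 for entry 31). The C sketch below shows the more general read-modify-write, assuming the index field sits at mask 0x1F00 with a shift of 8, as the 0x1F00 and 0x0700 constants in the patch suggest. The names MD_IDX_MASK, MD_IDX_SHIFT and md_ctr_set_index are illustrative, not taken from the patch or from any header.

```c
#include <stdint.h>

#define MD_IDX_SHIFT 8u       /* assumed position of the TLB index field */
#define MD_IDX_MASK  0x1F00u  /* assumed mask, matching the patch's 0x1F00 */

/* Return md_ctr with its index field replaced by idx. Unlike a plain
 * OR, this clears any index bits that were already set. */
uint32_t md_ctr_set_index(uint32_t md_ctr, unsigned idx)
{
    uint32_t field = ((uint32_t)idx << MD_IDX_SHIFT) & MD_IDX_MASK;

    return (md_ctr & ~MD_IDX_MASK) | field;
}
```

For example, if MD_CTR already held index 3 (0x0300), ORing in index 4 (0x0400) would select entry 7, whereas md_ctr_set_index(0x0300, 4) yields 0x0400. For indexes 31 and 7 starting from a cleared field, both approaches give 0x1F00 and 0x0700 respectively.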
