do_ri exception in Linux (MIPS 4kec)

Linux MIPS Architecture development

* do_ri exception in Linux (MIPS 4kec)
@ 2004-12-23 11:28 Nori, Soma Sekhar
  2004-12-23 11:28 ` Nori, Soma Sekhar
  2004-12-27 12:30 ` Ralf Baechle
  0 siblings, 2 replies; 5+ messages in thread
From: Nori, Soma Sekhar @ 2004-12-23 11:28 UTC (permalink / raw)
  To: linux-mips; +Cc: Iyer, Suraj

Hi all, 

We are hitting a "Reserved Instruction" exception whenever there is any activity on the serial console (typing any command) while there is traffic on the network. We do not suspect the network or serial driver, as the same software has been working on other MIPS boards. Here is the dump I obtained by calling show_registers(regs) in do_ri. (The original do_ri implementation did not produce a register dump; it just printed "kernel BUG at traps.c:627!".)

If there is no activity on the serial console, the network works just fine. Console activity without any network activity also works fine.

We are using MontaVista Linux version 2.4.17, gcc version 2.95.3, running on a MIPS 4KEc.

Here is the dump:
$0 : 00000000 0044def4 000001ac 0000006b 00000000 7fff7c08 00000001 00000000
$8 : 0000fc00 00000001 00000000 941524d0 00004700 00000000 97fc3ea0 7fff7c08
$16: 100048a4 100029d8 100029d8 10003020 00000000 7fff7dc8 10003b60 2d8e2163
$24: 00000001 2ab7bc30                   10008e70 7fff7bf0 04000000 00439e50
Hi : 00000000
Lo : 00000001
epc  : 00439e84    Not tainted
Status: 0000fc13
Cause : 10800028
Process sh (pid: 18, stackpage=97fc2000)
Stack: 00000001 00000000 2abd0ff0 7fff7c28 10008e70 00000000 10008e6c 00000000
       100049a0 0042f188 00000000 100029d8 00000001 00000001 7fff7f04 10008e70
       00427fe4 00427f00 00000000 00000000 10002764 10008e70 10008e70 00000000
       00000000 00000000 10008e70 00422734 00000001 00000001 7fff7f04 10008e70
       10008e70 00000003 10008e70 004315cc 00000001 00000000 10002764 00000000
       10008e70 ...
Call Trace:
Code: 00000000  2421dd48  00220821 <8c220000> 00000000  005c1021  00400008  00000000  8f99802c

The EPC is not in kernel space and ksymoops did not provide any info. The EPC changes to a different user-space location on each run.
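
The Cause and Status values in the dump above can be decoded by hand to cross-check this. The following is a minimal sketch, assuming the standard MIPS32 field layout (ExcCode in Cause bits 6..2, KSU in Status bits 4..3); the helper names are hypothetical and not taken from the kernel source.

```c
#include <stdint.h>

/* Field layout assumed from the MIPS32 privileged architecture. */
#define CAUSE_EXCCODE_SHIFT 2
#define CAUSE_EXCCODE_MASK  0x1fu
#define STATUS_IE           (1u << 0)
#define STATUS_EXL          (1u << 1)
#define STATUS_ERL          (1u << 2)
#define STATUS_KSU_SHIFT    3
#define STATUS_KSU_MASK     0x3u

/* Exception code held in Cause bits 6..2 (10 means Reserved Instruction). */
static unsigned cause_exccode(uint32_t cause)
{
    return (cause >> CAUSE_EXCCODE_SHIFT) & CAUSE_EXCCODE_MASK;
}

/* Base operating mode in Status bits 4..3: 0 kernel, 1 supervisor, 2 user. */
static unsigned status_ksu(uint32_t status)
{
    return (status >> STATUS_KSU_SHIFT) & STATUS_KSU_MASK;
}

/* Nonzero when the address lies in kuseg (below 0x80000000). */
static int is_user_address(uint32_t addr)
{
    return addr < 0x80000000u;
}
```

With the dumped values, Cause 10800028 yields ExcCode 10 (RI), Status 0000fc13 has EXL set with KSU = 2 (user), and EPC 00439e84 lies in kuseg, which agrees with the exception having been taken from user mode.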

I used copy_from_user in do_ri to read the memory at and around the address held in EPC, and the memory was intact (all valid instructions) every time.
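
As an illustration of that check, the word marked `<8c220000>` in the Code: line can be split into its fields. This is a sketch only, assuming the standard MIPS32 I-type encoding; `decode_itype` and `OPC_LW` are hypothetical names introduced here.

```c
#include <stdint.h>

/* Fields of a MIPS32 I-type instruction word. */
struct itype {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21, base register for loads */
    unsigned rt;      /* bits 20..16, destination register for loads */
    int16_t  imm;     /* bits 15..0, signed offset */
};

#define OPC_LW 0x23u  /* opcode of the "lw" (load word) instruction */

/* Split a 32-bit instruction word into its I-type fields. */
static struct itype decode_itype(uint32_t insn)
{
    struct itype d;
    d.opcode = (insn >> 26) & 0x3fu;
    d.rs     = (insn >> 21) & 0x1fu;
    d.rt     = (insn >> 16) & 0x1fu;
    d.imm    = (int16_t)(insn & 0xffffu);
    return d;
}
```

Under that encoding, 8c220000 decodes as `lw $2, 0($1)`, an ordinary load, so the instruction at the reported EPC is indeed not a reserved one.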

Any idea why I am getting do_ri only when console activity and network activity coincide?

Any help in debugging this is greatly appreciated.

Thanks,
Sekhar Nori

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: do_ri exception in Linux (MIPS 4kec)
  2004-12-23 11:28 do_ri exception in Linux (MIPS 4kec) Nori, Soma Sekhar
  2004-12-23 11:28 ` Nori, Soma Sekhar
@ 2004-12-27 12:30 ` Ralf Baechle
  1 sibling, 0 replies; 5+ messages in thread
From: Ralf Baechle @ 2004-12-27 12:30 UTC (permalink / raw)
  To: Nori, Soma Sekhar; +Cc: linux-mips, Iyer, Suraj

On Thu, Dec 23, 2004 at 04:58:03PM +0530, Nori, Soma Sekhar wrote:

> We are using MontaVista Linux version 2.4.17, gcc version 2.95.3, running on a MIPS 4KEc.
> 
> Here is the dump:
> $0 : 00000000 0044def4 000001ac 0000006b 00000000 7fff7c08 00000001 00000000
> $8 : 0000fc00 00000001 00000000 941524d0 00004700 00000000 97fc3ea0 7fff7c08
> $16: 100048a4 100029d8 100029d8 10003020 00000000 7fff7dc8 10003b60 2d8e2163
> $24: 00000001 2ab7bc30                   10008e70 7fff7bf0 04000000 00439e50
> Hi : 00000000
> Lo : 00000001
> epc  : 00439e84    Not tainted
> Status: 0000fc13
> Cause : 10800028
> Process sh (pid: 18, stackpage=97fc2000)
> Stack: 00000001 00000000 2abd0ff0 7fff7c28 10008e70 00000000 10008e6c 00000000
>        100049a0 0042f188 00000000 100029d8 00000001 00000001 7fff7f04 10008e70
>        00427fe4 00427f00 00000000 00000000 10002764 10008e70 10008e70 00000000
>        00000000 00000000 10008e70 00422734 00000001 00000001 7fff7f04 10008e70
>        10008e70 00000003 10008e70 004315cc 00000001 00000000 10002764 00000000
>        10008e70 ...
> Call Trace:
> Code: 00000000  2421dd48  00220821 <8c220000> 00000000  005c1021  00400008  00000000  8f99802c
> 
> The EPC is not in kernel space and ksymoops did not provide any info. The EPC changes to a different user-space location on each run.

In a case like this you're likely dealing with double exceptions.  Your
code is taking an exception, and the exception handler is taking another
exception while still running with c0_status.exl set.  When the second
exception is taken with the EXL bit set, the CPU does not record the PC
of the second exception in EPC, and you are left with a seemingly
unexplainable exception.
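
The rule described above can be sketched as a small software model. This is an illustration under stated assumptions, not the hardware's full behaviour: `cp0_model` and `take_exception` are hypothetical names, and branch-delay handling (Cause.BD) is left out.

```c
#include <stdint.h>

#define STATUS_EXL         (1u << 1)
#define CAUSE_EXCCODE_MASK (0x1fu << 2)

/* Hypothetical software model of the CP0 registers involved. */
struct cp0_model {
    uint32_t status;
    uint32_t cause;
    uint32_t epc;
};

/*
 * Model of exception entry: EPC is captured only when EXL was clear.
 * Cause.ExcCode is always updated, so a nested exception reports its own
 * code next to the EPC of the first exception.
 */
static void take_exception(struct cp0_model *c, uint32_t pc, unsigned exccode)
{
    if (!(c->status & STATUS_EXL))
        c->epc = pc;                 /* first-level exception: record PC */
    c->cause = (c->cause & ~CAUSE_EXCCODE_MASK) | ((exccode & 0x1fu) << 2);
    c->status |= STATUS_EXL;         /* hardware sets EXL on entry */
}
```

In this model, a TLB exception taken at a user address followed by an RI inside the handler leaves ExcCode = 10 but EPC still pointing at the user address, which matches the symptom of an RI reported at a valid user-space instruction.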

A few processors have the nasty habit of throwing RI exceptions or doing
similarly weird things when executing code that is mapped through multiple
TLB pages, but the 4KEc shouldn't.

  Ralf

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: do_ri exception in Linux (MIPS 4kec)
@ 2004-12-28 14:41 Nori, Soma Sekhar
  2004-12-28 14:41 ` Nori, Soma Sekhar
  0 siblings, 1 reply; 5+ messages in thread
From: Nori, Soma Sekhar @ 2004-12-28 14:41 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips, Iyer, Suraj

Ralf,

Thanks for the input.

To understand what exceptions I was getting (apart from RI), I implemented a counter for each exception.
In "except_vec3_generic" (entry.S), I included some code to increment a counter for each exception received.

<code>
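	NESTED(except_vec3_generic, 0, sp)

#if defined(CONFIG_CPU_R5432)
	/* [jsun] work around a nasty bug in R5432  */
	mfc0	k0, CP0_INDEX
#endif
	mfc0	k1, CP0_CAUSE
	la	k0, exception_counter
	andi	k1, k1, 0x7c		/* ExcCode << 2 = byte offset into the array */
	addu	k0, k0, k1		/* k0 = &exception_counter[ExcCode] */
	lw	k1, (k0)
	addiu	k1, k1, 1		/* addiu, so the increment cannot trap on overflow */
	sw	k1, (k0)

	... (original code follows)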

</code>

exception_counter is an array of 32 integers. Printing the array from the do_ri exception handler showed that only TLB Mod (code 1), TLBL (code 2), TLBS (code 3), syscall (code 8), and RI (code 10) exceptions had been received (count >= 1). Given this, is it safe to assume that RI is the only unwanted exception?
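
For reference, the indexing done by the assembly can be mirrored in C. This is a sketch only: the array here is a stand-in for the kernel-side exception_counter, and `counter_index`, `count_exception`, and `exccode_name` are hypothetical helpers covering just the codes mentioned in this thread.

```c
#include <stdint.h>

#define NUM_EXCCODES 32

/* Hypothetical C mirror of the counter array used by the assembly. */
static unsigned long exception_counter[NUM_EXCCODES];

/* Same arithmetic as the handler: (cause & 0x7c) is ExcCode * 4, a byte offset. */
static unsigned counter_index(uint32_t cause)
{
    return (unsigned)((cause & 0x7cu) / sizeof(uint32_t));
}

/* Increment the counter selected by the ExcCode field of Cause. */
static void count_exception(uint32_t cause)
{
    exception_counter[counter_index(cause)]++;
}

/* Names for the codes reported in this thread; anything else is "other". */
static const char *exccode_name(unsigned code)
{
    switch (code) {
    case 1:  return "Mod";
    case 2:  return "TLBL";
    case 3:  return "TLBS";
    case 4:  return "AdEL";
    case 8:  return "Sys";
    case 10: return "RI";
    default: return "other";
    }
}
```

Masking with 0x7c keeps the offset within the 32-entry array whatever the other Cause bits contain, which is why no separate bounds check is needed.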

To get hold of the exact EPC at which the RI is occurring, I tried to clear the EXL bit of the Status register by adding some more code above the exception-counting code in the except_vec3_generic routine.

<code>
	NESTED(except_vec3_generic, 0, sp)

#if defined(CONFIG_CPU_R5432)
	/* [jsun] work around a nasty bug in R5432  */
	mfc0	k0, CP0_INDEX
#endif
	mfc0	k0, CP0_STATUS
	nop
	ori	k0, k0, 0x2		/* force bit 1 (EXL) to 1 ... */
	xori	k0, k0, 0x2		/* ... then flip it, leaving EXL = 0 */
	mtc0	k0, CP0_STATUS
	nop

	... (Exception counting code follows)

</code>

Surprisingly, the processor does not seem to allow me to clear the EXL bit. I get an AdEL (code 4) exception as soon as the "mtc0    k0, CP0_STATUS" instruction completes. The processor goes into an infinite loop of exceptions, and boot-up hangs after printing "Freeing unused kernel memory: 48k freed". Is it not possible for software to clear the EXL bit after the hardware has set it? If not, what else can I do to get hold of the correct EPC value where the RI is occurring?
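
One possible reading of the AdEL, offered as an assumption rather than a confirmed diagnosis: in the MIPS32 mode rules, EXL or ERL forces kernel mode, and once both are clear the KSU field decides. The Status value from the earlier dump (0000fc13) has KSU = user, so clearing only EXL would drop the CPU to user mode while it is still fetching from a kernel address. The sketch below models that rule; the names are hypothetical.

```c
#include <stdint.h>

#define STATUS_IE        (1u << 0)
#define STATUS_EXL       (1u << 1)
#define STATUS_ERL       (1u << 2)
#define STATUS_KSU_SHIFT 3

enum cpu_mode { MODE_KERNEL = 0, MODE_SUPERVISOR = 1, MODE_USER = 2 };

/* EXL or ERL forces kernel mode; otherwise the KSU field selects the mode. */
static enum cpu_mode effective_mode(uint32_t status)
{
    if (status & (STATUS_EXL | STATUS_ERL))
        return MODE_KERNEL;
    return (enum cpu_mode)((status >> STATUS_KSU_SHIFT) & 0x3u);
}

/* In user mode, any access at or above 0x80000000 raises an address error. */
static int fetch_raises_address_error(uint32_t status, uint32_t pc)
{
    return effective_mode(status) == MODE_USER && pc >= 0x80000000u;
}
```

Under this model, software can clear EXL, but the same write would also need to clear the KSU bits (and probably IE, to keep interrupts off) for the handler to stay in kernel mode.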

Thanks,
Sekhar

^ permalink raw reply	[flat|nested] 5+ messages in thread
