* epc register reported zero
From: Lin Ming @ 2014-08-28 0:45 UTC (permalink / raw)
To: linux-mips
Hi list,
Board: Broadcom 963268
CPU model: Broadcom BMIPS4350 V8.0
Kernel: 2.6.30
Toolchain: uclibc-crosstools-gcc-4.4.2-1
I encountered a userspace application crash where epc is reported as zero.
I don't understand how the epc register could be zero.
Any help is appreciated.
wps_monitor/1699: potentially unexpected fatal signal 11.
Cpu 1
$ 0 : 00000000 10008d00 00000004 0000000a
$ 4 : 0000000a 7f88a55c 00000000 00000001
$ 8 : 00000000 00000000 00000001 00000000
$12 : 00000001 00000000 00000008 12182430
$16 : 00438968 00000001 00409620 00000000
$20 : 00000000 00000000 00000000 00406404
$24 : 00000002 2aaecc00
$28 : 2ab39a70 7f88a4c0 7f88a4f0 0041a838
Hi : 00000000
Lo : 00000000
epc : 00000000 (null)
Tainted: P
ra : 0041a838 0x41a838
Status: 00008d13 USER EXL IE
Cause : 00000008
BadVA : 00000000
PrId : 0002a080 (Broadcom4350)
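As a reading aid for the dump above, here is a minimal sketch that decodes the Cause and Status values. The bit positions are the standard MIPS32 ones (ExcCode in Cause bits 6:2, IE/EXL/KSU in Status bits 0, 1 and 4:3); the helper names are made up for this example and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors for the MIPS32 CP0 Cause and Status registers.
 * The function names are hypothetical, chosen only for this sketch. */
static unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1fu; }
static unsigned status_ie(uint32_t status)    { return status & 1u; }
static unsigned status_exl(uint32_t status)   { return (status >> 1) & 1u; }
static unsigned status_ksu(uint32_t status)   { return (status >> 3) & 3u; }

/* Checks the values copied from the register dump above. */
void check_dump_values(void)
{
    /* ExcCode 2 is TLBL: TLB exception on a load or an instruction fetch. */
    assert(cause_exccode(0x00000008u) == 2u);
    /* KSU == 2 means user mode; EXL and IE are both set ("USER EXL IE"). */
    assert(status_ksu(0x00008d13u) == 2u);
    assert(status_exl(0x00008d13u) == 1u);
    assert(status_ie(0x00008d13u) == 1u);
}
```

ExcCode 2 together with BadVA 0 and epc 0 is what one would expect to see if the CPU tried to fetch an instruction from address zero, although the dump alone does not say how the program counter got there.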
mips-linux-addr2line -e wps_monitor 0041a838
This shows that the "ra" address maps to line 328 below.
322 if (max_fd == -1) {
323 TUTRACE((TUTRACE_ERR, "wpsm_readData: no fd set!\n"));
324 return NULL;
325 }
326
327 /* Do select */
328 n = select(max_fd + 1, &fdvar, NULL, NULL, &timeout);
329 if (n <= 0) {
330 /*
331 * to avoid the select operation interferenced by led lighting timer.
332 * this will be removed after led lighting timer is replaced by wireless driver
333 */
334 if (n < 0 && errno != EINTR) {
335 TUTRACE((TUTRACE_ERR, "wpsm_readData: select recv failed\n"));
336 }
337 goto out;
338 }
0000eac0 <__libc_select>:
eac0: 3c1c0006 lui gp,0x6
eac4: 279c1aa0 addiu gp,gp,6816
eac8: 0399e021 addu gp,gp,t9
eacc: 27bdffd8 addiu sp,sp,-40
ead0: afbe0020 sw s8,32(sp)
ead4: 03a0f021 move s8,sp
ead8: afbf0024 sw ra,36(sp)
eadc: afb0001c sw s0,28(sp)
eae0: afbc0010 sw gp,16(sp)
eae4: 27bdfff0 addiu sp,sp,-16
eae8: 8fc20038 lw v0,56(s8)
eaec: 27bdffe0 addiu sp,sp,-32
eaf0: afa20010 sw v0,16(sp)
eaf4: 2402102e li v0,4142
eaf8: 0000000c syscall
eafc: 27bd0020 addiu sp,sp,32
eb00: 10e00006 beqz a3,eb1c <__libc_select+0x5c>
eb04: 00408021 move s0,v0
eb08: 8f9988d0 lw t9,-30512(gp)
eb0c: 0320f809 jalr t9
eb10: 00000000 nop
eb14: ac500000 sw s0,0(v0)
eb18: 2402ffff li v0,-1
eb1c: 03c0e821 move sp,s8
eb20: 8fbf0024 lw ra,36(sp)
eb24: 8fbe0020 lw s8,32(sp)
eb28: 8fb0001c lw s0,28(sp)
eb2c: 03e00008 jr ra
eb30: 27bd0028 addiu sp,sp,40
Regards,
Ming
* Re: epc register reported zero
From: David Daney @ 2014-08-28 1:15 UTC (permalink / raw)
To: Lin Ming; +Cc: linux-mips
On 08/27/2014 05:45 PM, Lin Ming wrote:
> Hi list,
>
> Board: Broadcom 963268
> CPU model: Broadcom BMIPS4350 V8.0
> Kernel: 2.6.30
> Toolchain: uclibc-crosstools-gcc-4.4.2-1
>
> I encountered an userspace application crash with epc reported zero.
> I don't understand how epc register could be zero.
>
> Any help is appreciated.
>
> wps_monitor/1699: potentially unexpected fatal signal 11.
>
> Cpu 1
> $ 0 : 00000000 10008d00 00000004 0000000a
> $ 4 : 0000000a 7f88a55c 00000000 00000001
> $ 8 : 00000000 00000000 00000001 00000000
> $12 : 00000001 00000000 00000008 12182430
> $16 : 00438968 00000001 00409620 00000000
> $20 : 00000000 00000000 00000000 00406404
> $24 : 00000002 2aaecc00
> $28 : 2ab39a70 7f88a4c0 7f88a4f0 0041a838
Disassemble the code surrounding the address in $31.
I am guessing that at 0x41a830, you have an indirect jump (JR
instruction) and that 'rs' contains a value of zero. So the EPC when
you get the SIGSEGV will be ... zero.
This is called a call through a NULL function pointer.
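A minimal sketch of the failure mode described above, and of the usual guard against it. The names `handler_fn`, `call_handler` and `add_one` are hypothetical and do not come from wps_monitor; the point is only that an unchecked `fn(arg)` with `fn == NULL` transfers control to address zero, which is where the reported epc ends up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type, used only for illustration. */
typedef int (*handler_fn)(int);

/* Calls fn(arg) only if fn is non-NULL.
 * Returns 0 and stores the result on success, -1 if fn is NULL.
 * Without the check, the call would jump to address 0 and fault there. */
static int call_handler(handler_fn fn, int arg, int *result)
{
    if (fn == NULL)
        return -1;
    *result = fn(arg);
    return 0;
}

static int add_one(int x) { return x + 1; }

void demo_guarded_call(void)
{
    int out = 0;

    assert(call_handler(add_one, 41, &out) == 0);
    assert(out == 42);

    /* A NULL pointer is rejected instead of being jumped through. */
    assert(call_handler(NULL, 41, &out) == -1);
    assert(out == 42); /* unchanged by the rejected call */
}
```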
> Hi : 00000000
> Lo : 00000000
> epc : 00000000 (null)
> Tainted: P
> ra : 0041a838 0x41a838
> Status: 00008d13 USER EXL IE
> Cause : 00000008
> BadVA : 00000000
> PrId : 0002a080 (Broadcom4350)
>
> mips-linux-addr2line -e wps_monitor 0041a838
> This shows "ra" address mapped to below line 328.
>
> 322 if (max_fd == -1) {
> 323 TUTRACE((TUTRACE_ERR, "wpsm_readData: no fd set!\n"));
> 324 return NULL;
> 325 }
> 326
> 327 /* Do select */
> 328 n = select(max_fd + 1, &fdvar, NULL, NULL, &timeout);
> 329 if (n <= 0) {
> 330 /*
> 331 * to avoid the select operation interferenced by led lighting timer.
> 332 * this will be removed after led lighting timer is replaced by wireless driver
> 333 */
> 334 if (n < 0 && errno != EINTR) {
> 335 TUTRACE((TUTRACE_ERR, "wpsm_readData: select recv failed\n"));
> 336 }
> 337 goto out;
> 338 }
>
>
> 0000eac0 <__libc_select>:
> eac0: 3c1c0006 lui gp,0x6
> eac4: 279c1aa0 addiu gp,gp,6816
> eac8: 0399e021 addu gp,gp,t9
> eacc: 27bdffd8 addiu sp,sp,-40
> ead0: afbe0020 sw s8,32(sp)
> ead4: 03a0f021 move s8,sp
> ead8: afbf0024 sw ra,36(sp)
> eadc: afb0001c sw s0,28(sp)
> eae0: afbc0010 sw gp,16(sp)
> eae4: 27bdfff0 addiu sp,sp,-16
> eae8: 8fc20038 lw v0,56(s8)
> eaec: 27bdffe0 addiu sp,sp,-32
> eaf0: afa20010 sw v0,16(sp)
> eaf4: 2402102e li v0,4142
> eaf8: 0000000c syscall
> eafc: 27bd0020 addiu sp,sp,32
> eb00: 10e00006 beqz a3,eb1c <__libc_select+0x5c>
> eb04: 00408021 move s0,v0
> eb08: 8f9988d0 lw t9,-30512(gp)
> eb0c: 0320f809 jalr t9
> eb10: 00000000 nop
> eb14: ac500000 sw s0,0(v0)
> eb18: 2402ffff li v0,-1
> eb1c: 03c0e821 move sp,s8
> eb20: 8fbf0024 lw ra,36(sp)
> eb24: 8fbe0020 lw s8,32(sp)
> eb28: 8fb0001c lw s0,28(sp)
> eb2c: 03e00008 jr ra
> eb30: 27bd0028 addiu sp,sp,40
>
> Regards,
> Ming
>
>
>
* Re: epc register reported zero
From: Lin Ming @ 2014-08-28 1:33 UTC (permalink / raw)
To: David Daney; +Cc: linux-mips
On Wed, Aug 27, 2014 at 6:15 PM, David Daney <ddaney.cavm@gmail.com> wrote:
> On 08/27/2014 05:45 PM, Lin Ming wrote:
>>
>> Hi list,
>>
>> Board: Broadcom 963268
>> CPU model: Broadcom BMIPS4350 V8.0
>> Kernel: 2.6.30
>> Toolchain: uclibc-crosstools-gcc-4.4.2-1
>>
>> I encountered an userspace application crash with epc reported zero.
>> I don't understand how epc register could be zero.
>>
>> Any help is appreciated.
>>
>> wps_monitor/1699: potentially unexpected fatal signal 11.
>>
>> Cpu 1
>> $ 0 : 00000000 10008d00 00000004 0000000a
>> $ 4 : 0000000a 7f88a55c 00000000 00000001
>> $ 8 : 00000000 00000000 00000001 00000000
>> $12 : 00000001 00000000 00000008 12182430
>> $16 : 00438968 00000001 00409620 00000000
>> $20 : 00000000 00000000 00000000 00406404
>> $24 : 00000002 2aaecc00
>> $28 : 2ab39a70 7f88a4c0 7f88a4f0 0041a838
>
>
> Disassemble the surrounding the address in $31
>
> I am guessing that at 0x41a830, you have an indirect jump (JR instruction)
> and that 'rs' contains a value of zero. So the EPC when you get the SIGSEGV
> will be ... zero.
>
> This is called a call through a NULL function pointer.
Here it is.
There is only a "jalr t9", which I think is the call to __libc_select().
/* Do select */
n = select(max_fd + 1, &fdvar, NULL, NULL, &timeout);
41a804: 8fc20034 lw v0,52(s8)
41a808: 24430001 addiu v1,v0,1
41a80c: 27c20044 addiu v0,s8,68
41a810: 27c400c4 addiu a0,s8,196
41a814: afa40010 sw a0,16(sp)
41a818: 00602021 move a0,v1
41a81c: 00402821 move a1,v0
41a820: 00003021 move a2,zero
41a824: 00003821 move a3,zero
41a828: 8f82843c lw v0,-31684(gp)
41a82c: 0040c821 move t9,v0
41a830: 0320f809 jalr t9
41a834: 00000000 nop
41a838: 8fdc0018 lw gp,24(s8)
41a83c: afc20038 sw v0,56(s8)
if (n <= 0) {
41a840: 8fc20038 lw v0,56(s8)
41a844: 1c40000b bgtz v0,41a874 <wps_osl_wait_for_all_packets+0x21c>
41a848: 00000000 nop
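To tie the disassembly to the register dump, the sketch below decodes the instruction word `0x0320f809` and relates `ra` to the call site. It relies on two standard MIPS32 facts: JALR is a SPECIAL-opcode instruction with function code 0x09, and it writes the address of the instruction after the delay slot (call address + 8) into the link register. The helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* MIPS32 R-type field extractors (names are hypothetical). */
static unsigned insn_opcode(uint32_t w) { return (w >> 26) & 0x3fu; }
static unsigned insn_rs(uint32_t w)     { return (w >> 21) & 0x1fu; }
static unsigned insn_rd(uint32_t w)     { return (w >> 11) & 0x1fu; }
static unsigned insn_funct(uint32_t w)  { return w & 0x3fu; }

/* A jal/jalr stores the address after its delay slot in ra,
 * so the call instruction itself sits 8 bytes before ra. */
static uint32_t call_site_from_ra(uint32_t ra) { return ra - 8u; }

void check_call_site(void)
{
    const uint32_t word = 0x0320f809u; /* instruction at 0x41a830 */

    assert(insn_opcode(word) == 0u);   /* SPECIAL */
    assert(insn_funct(word) == 0x09u); /* JALR */
    assert(insn_rs(word) == 25u);      /* jump target register: $25 (t9) */
    assert(insn_rd(word) == 31u);      /* link register: $31 (ra) */

    /* ra from the dump points just past the delay slot of that jalr. */
    assert(call_site_from_ra(0x0041a838u) == 0x0041a830u);
}
```

So the `ra` value in the dump is consistent with the `jalr t9` at 0x41a830 having executed. Note that $25 in the dump is 0x2aaecc00, not zero, at the time of the fault.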
Here is my crazy thought. One possibility is:
1. The select() syscall entered kernel mode, and the epc register was saved
on the kernel-mode stack.
2. After the select() syscall finished, kernel code read the epc value from
the stack and restored it to the epc register.
3. The CPU jumped to the instruction pointed to by the epc register.
Maybe there is a kernel bug that corrupted the kernel-mode stack, so the
saved epc value became zero.
I added the crazy code below to simulate it.
diff --git a/bcmcpe2/kernel/linux-3.4rt/fs/select.c b/bcmcpe2/kernel/linux-3.4rt/fs/select.c
index 0baa0a3..cd41c4d 100644
--- a/bcmcpe2/kernel/linux-3.4rt/fs/select.c
+++ b/bcmcpe2/kernel/linux-3.4rt/fs/select.c
@@ -597,6 +597,11 @@ SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
 	struct timeval tv;
 	int ret;
+	if (!strcmp(current->comm, "wps_monitor")) {
+		printk("LINMING: hack wps_monitor epc\n");
+		task_pt_regs(current)->cp0_epc = 0;
+	}
+
And got the following:
wps_monitor/1315: potentially unexpected fatal signal 11.
Cpu 1
$ 0 : 00000000 10008d00 00000000 0000f9d8
$ 4 : 00000008 7f7fe624 00000000 00000000
$ 8 : 00000000 7f7fe5f8 00000000 87c78000
$12 : 00504303 00000043 0000000e 0000dd18
$16 : 00000000 0043db30 0043bff8 0043bffc
$20 : 7f7fe624 7f7fe5f0 00000007 00000000
$24 : 00000000 77c59960
$28 : 77cc94d0 7f7fe578 7f7fe5a8 004090a8
Hi : 00000000
Lo : 00000000
epc : 00000000 (null)
Tainted: P
ra : 004090a8 0x4090a8
Status: 00008d13 USER EXL IE
Cause : 00000008
BadVA : 00000000
PrId : 0002a080 (Broadcom BMIPS4350)
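The three-step hypothesis above can be modelled in plain userspace C. This is a toy model only: `struct saved_frame` is a stand-in invented for this sketch and is not the kernel's `struct pt_regs`, and the function names are hypothetical. The one architectural detail it borrows is that the MIPS syscall path advances the saved EPC by 4 so that execution resumes after the `syscall` instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the register frame saved on the kernel stack. */
struct saved_frame {
    uint32_t cp0_epc;
};

/* Step 1: on syscall entry, save the resume address (syscall pc + 4). */
static void enter_syscall(struct saved_frame *f, uint32_t syscall_pc)
{
    f->cp0_epc = syscall_pc + 4u;
}

/* Steps 2 and 3: on return, user code resumes at whatever was saved. */
static uint32_t return_to_user(const struct saved_frame *f)
{
    return f->cp0_epc;
}

void demo_epc_roundtrip(void)
{
    struct saved_frame f;

    /* 0xeaf8 is the library-relative offset of the syscall instruction
     * in the __libc_select listing; normal return lands at 0xeafc. */
    enter_syscall(&f, 0xeaf8u);
    assert(return_to_user(&f) == 0xeafcu);

    /* Simulated corruption, mirroring what the test patch does. */
    f.cp0_epc = 0u;
    assert(return_to_user(&f) == 0u);
}
```

The model shows only that a zeroed saved EPC would produce the same dump; it does not distinguish this from the NULL function pointer explanation, since both end with a fetch from address zero.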