qemu-devel.nongnu.org archive mirror
* [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
@ 2011-04-10 12:29 Artyom Tarasenko
  2011-04-10 13:24 ` Aurelien Jarno
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 12:29 UTC
  To: qemu-devel, Blue Swirl

While trying to boot a proprietary OS, qemu-system-sparc64 crashes with
the following error message:

tcg/tcg.c:1892: tcg fatal error

It looks like it may be a platform-independent bug, though, because
when the '-singlestep' option is present, qemu doesn't crash and seems
to translate the code properly.

(gdb) bt
#0  0x00000032c2e327f5 in raise () from /lib64/libc.so.6
#1  0x00000032c2e33fd5 in abort () from /lib64/libc.so.6
#2  0x000000000051933d in tcg_reg_alloc_call (s=<value optimized out>,
def=0x89d340, opc=INDEX_op_call, args=0x10acc98, dead_iargs=3) at
qemu/tcg/tcg.c:1892
#3  0x000000000051a557 in tcg_gen_code_common (s=0x10b8940,
gen_code_buf=0x40338b60 "I\213n@H\213] 3\355I\211\256\220") at
qemu/tcg/tcg.c:2099
#4  tcg_gen_code (s=0x10b8940, gen_code_buf=0x40338b60 "I\213n@H\213]
3\355I\211\256\220") at qemu/tcg/tcg.c:2142
#5  0x00000000004d38f1 in cpu_sparc_gen_code (env=0x10cce10,
tb=0x7fffe91bc218, gen_code_size_ptr=0x7fffffffd9b4) at
qemu/translate-all.c:93
#6  0x00000000004d1fd7 in tb_gen_code (env=0x10cce10, pc=18868776,
cs_base=18868780, flags=15, cflags=0) at qemu/exec.c:989
#7  0x00000000004d4029 in tb_find_slow (env1=<value optimized out>) at
qemu/cpu-exec.c:167
#8  tb_find_fast (env1=<value optimized out>) at cpu-exec.c:194
#9  cpu_sparc_exec (env1=<value optimized out>) at qemu/cpu-exec.c:556
#10 0x0000000000408868 in tcg_cpu_exec () at qemu/cpus.c:1066
#11 cpu_exec_all () at qemu/cpus.c:1102
#12 0x000000000053c756 in main_loop (argc=<value optimized out>,
argv=<value optimized out>, envp=<value optimized out>) at
qemu/vl.c:1430
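
As a quick arithmetic check on frame #6 of the backtrace above, the decimal pc and cs_base values can be converted to hex and compared with the addresses in qemu.log. This is only a sketch: the helper name is made up for illustration, and it assumes that for the sparc target cs_base carries the npc of the translation block.

```python
def tb_addresses(pc, cs_base):
    """Format the decimal values gdb prints for tb_gen_code() as hex addresses."""
    return (f"0x{pc:x}", f"0x{cs_base:x}")


pc_hex, npc_hex = tb_addresses(18868776, 18868780)
print(pc_hex, npc_hex)  # 0x11fea28 0x11fea2c
```

If that assumption holds, the block being generated when abort() fires starts at 0x11fea28 rather than at some later address.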

I inspected the ts->val_type that leads to the abort() case, and it turned out to be 0.
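
For context, here is a minimal model of the kind of check that can end in this abort(). The constant names and values are assumed to match the TEMP_VAL_* definitions in tcg/tcg.h of that period (with the "dead" state being 0), and the control flow is heavily simplified. It is not the actual QEMU code.

```python
# Assumed to mirror the TEMP_VAL_* constants in tcg/tcg.h (values are an assumption).
TEMP_VAL_DEAD = 0
TEMP_VAL_REG = 1
TEMP_VAL_MEM = 2
TEMP_VAL_CONST = 3


class TcgFatalError(Exception):
    """Stands in for tcg_abort() in this model."""


def load_call_argument(val_type):
    """Describe how a call argument would be materialised, or fail for a dead temp."""
    if val_type == TEMP_VAL_REG:
        return "move from host register"
    if val_type == TEMP_VAL_MEM:
        return "load from memory slot"
    if val_type == TEMP_VAL_CONST:
        return "load immediate"
    # A dead temp holds no value, so there is nothing to pass to the helper.
    raise TcgFatalError("tcg fatal error")
```

Under these assumptions, a val_type of 0 would mean the register allocator was asked to pass a temporary that no longer holds a value.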

The last lines of qemu.log (without -singlestep)
IN:
0x00000000011fe9f0:  rdpr  %pstate, %g1
0x00000000011fe9f4:  wrpr  %g1, 2, %pstate
--------------
IN:
0x00000000011fe9f8:  ldub  [ %o0 ], %o1
0x00000000011fe9fc:  mov  %o1, %o2
0x00000000011fea00:  rdpr  %tick, %o3
0x00000000011fea04:  cmp  %o1, %o2
0x00000000011fea08:  be  %icc, 0x11fea00
0x00000000011fea0c:  ldub  [ %o0 ], %o2

Search PC...
Search PC...
Search PC...
Search PC...
Search PC...
Search PC...
--------------
IN:
0x00000000011fe9f8:  ldub  [ %o0 ], %o1
0x00000000011fe9fc:  mov  %o1, %o2
0x00000000011fea00:  rdpr  %tick, %o3
0x00000000011fea04:  cmp  %o1, %o2
0x00000000011fea08:  be  %icc, 0x11fea00
0x00000000011fea0c:  ldub  [ %o0 ], %o2

110521: Data Access MMU Miss (v=0068) pc=00000000011fe9f8
npc=00000000011fe9fc SP=000000000180ae41
pc: 00000000011fe9f8  npc: 00000000011fe9fc

IN:
0x00000000011fea00:  rdpr  %tick, %o3
0x00000000011fea04:  cmp  %o1, %o2
0x00000000011fea08:  be  %icc, 0x11fea00
0x00000000011fea0c:  ldub  [ %o0 ], %o2
--------------
IN:
0x00000000011fea10:  brz,pn   %o2, 0x11fe9f8
0x00000000011fea14:  mov  %o2, %o4
--------------
IN:
0x00000000011fea18:  rdpr  %tick, %o5
0x00000000011fea1c:  cmp  %o2, %o4
0x00000000011fea20:  be  %icc, 0x11fea18
0x00000000011fea24:  ldub  [ %o0 ], %o4
--------------
IN:
0x00000000011fea28:  brz,pn   %o4, 0x11fe9f4
0x00000000011fea2c:  wrpr  %g0, %g1, %pstate
<EOF>

The crash is 100% reproducible and always happens at the same place,
so it's probably a pure TCG issue, unrelated to the delivery of
external/timer interrupts.

Do you need any additional info?

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 12:29 [Qemu-devel] tcg/tcg.c:1892: tcg fatal error Artyom Tarasenko
@ 2011-04-10 13:24 ` Aurelien Jarno
  2011-04-10 14:09   ` Artyom Tarasenko
  0 siblings, 1 reply; 32+ messages in thread
From: Aurelien Jarno @ 2011-04-10 13:24 UTC
  To: Artyom Tarasenko; +Cc: Blue Swirl, qemu-devel

On Sun, Apr 10, 2011 at 02:29:59PM +0200, Artyom Tarasenko wrote:
> [...]
> Do you need any additional info?
> 

It would be interesting to see the corresponding TCG code
from qemu.log (-d op,op_opt).
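
Once such a log exists, pulling out the last translated block can be scripted. A rough sketch, assuming the section headers are "OP:" and "OP after liveness analysis:" and that op lines are indented by one space; the function name is hypothetical:

```python
def last_section(log_text, header):
    """Return the lines of the last section in the log that starts with `header`."""
    lines = log_text.splitlines()
    starts = [i for i, line in enumerate(lines) if line.strip() == header]
    if not starts:
        return []
    body = []
    for line in lines[starts[-1] + 1:]:
        # Op lines are indented; an unindented non-empty line starts a new section.
        if line and not line.startswith(" "):
            break
        body.append(line)
    # Drop trailing blank lines.
    while body and not body[-1].strip():
        body.pop()
    return body


sample = """OP:
 ---- 0x11fea28
 movi_i64 cond,$0x0

OP after liveness analysis:
 ---- 0x11fea28
 movi_i64 cond,$0x0
 end
"""
print(last_section(sample, "OP:"))
```

Taking the last section rather than the first matters here because the interesting block is the one being translated when qemu aborts.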

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 13:24 ` Aurelien Jarno
@ 2011-04-10 14:09   ` Artyom Tarasenko
  2011-04-10 14:44     ` Blue Swirl
  2011-04-10 14:59     ` Peter Maydell
  0 siblings, 2 replies; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 14:09 UTC
  To: Aurelien Jarno; +Cc: Blue Swirl, qemu-devel

On Sun, Apr 10, 2011 at 3:24 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> On Sun, Apr 10, 2011 at 02:29:59PM +0200, Artyom Tarasenko wrote:
>> [...]
>
> What would be interesting would be to get the corresponding TCG code
> from qemu.log (-d op,op_opt).


OP:
 ---- 0x11fea28
 ld_i64 tmp6,regwptr,$0x20
 movi_i64 cond,$0x0
 movi_i64 tmp8,$0x0
 brcond_i64 tmp6,tmp8,ne,$0x0
 movi_i64 cond,$0x1
 set_label $0x0

 ---- 0x11fea2c
 movi_i64 tmp7,$0x0
 xor_i64 tmp0,tmp7,g1
 movi_i64 pc,$0x11fea2c
 movi_i64 tmp8,$compute_psr
 call tmp8,$0x0,$0
 movi_i64 tmp8,$0x0
 brcond_i64 cond,tmp8,eq,$0x1
 movi_i64 npc,$0x11fe9f4
 br $0x2
 set_label $0x1
 movi_i64 npc,$0x11fea30
 set_label $0x2
 movi_i64 tmp8,$wrpstate
 call tmp8,$0x0,$0,tmp0
 mov_i64 pc,npc
 movi_i64 tmp8,$0x4
 add_i64 npc,npc,tmp8
 exit_tb $0x0

OP after liveness analysis:
 ---- 0x11fea28
 ld_i64 tmp6,regwptr,$0x20
 movi_i64 cond,$0x0
 movi_i64 tmp8,$0x0
 brcond_i64 tmp6,tmp8,ne,$0x0
 movi_i64 cond,$0x1
 set_label $0x0

 ---- 0x11fea2c
 nopn $0x2,$0x2
 nopn $0x3,$0x68,$0x3
 movi_i64 pc,$0x11fea2c
 movi_i64 tmp8,$compute_psr
 call tmp8,$0x0,$0
 movi_i64 tmp8,$0x0
 brcond_i64 cond,tmp8,eq,$0x1
 movi_i64 npc,$0x11fe9f4
 br $0x2
 set_label $0x1
 movi_i64 npc,$0x11fea30
 set_label $0x2
 movi_i64 tmp8,$wrpstate
 call tmp8,$0x0,$0,tmp0
 mov_i64 pc,npc
 movi_i64 tmp8,$0x4
 add_i64 npc,npc,tmp8
 exit_tb $0x0
 end

Does this mean the last block is processed correctly and the crash
happens on the next instruction, which doesn't make it into the log?
The next instruction would be:

0x00000000011fea30:  retl

Since it's a branch instruction, I guess this would also be a TCG block boundary.
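
One thing that stands out in the dump above: after liveness analysis the xor_i64 that produced tmp0 has become a nopn, yet the later "call tmp8,$0x0,$0,tmp0" still reads tmp0, with branch and label ops in between. The sketch below is a deliberately simplistic scanner for that pattern. It assumes that plain temporaries are not preserved across basic block boundaries (my understanding of how TCG treats non-local temps), its op table covers only the ops seen in this dump, and the function name is hypothetical.

```python
import re

# Ops whose first operand is written; every other tmp operand is treated as read.
# This table only covers the ops that appear in the dump above.
WRITES_FIRST = {"ld_i64", "movi_i64", "mov_i64", "xor_i64", "add_i64"}
# Ops that end a basic block, plus the label op that starts a new one.
BLOCK_EDGES = {"brcond_i64", "br", "set_label", "exit_tb"}


def temps_read_before_written(op_lines):
    """Return (op line, temp) pairs where a tmpN is read with no earlier
    write in the same basic block."""
    problems = []
    written = set()
    for raw in op_lines:
        line = raw.strip()
        if not line or line.startswith("----") or line == "end":
            continue
        name, _, rest = line.partition(" ")
        operands = [o for o in rest.split(",") if o]
        reads = operands[1:] if name in WRITES_FIRST else operands
        for operand in reads:
            if re.fullmatch(r"tmp\d+", operand) and operand not in written:
                problems.append((line, operand))
        if name in WRITES_FIRST and re.fullmatch(r"tmp\d+", operands[0]):
            written.add(operands[0])
        if name in BLOCK_EDGES:
            # Assumption: plain temps do not survive a basic block boundary.
            written.clear()
    return problems


LISTING = """
 ---- 0x11fea2c
 nopn $0x2,$0x2
 nopn $0x3,$0x68,$0x3
 movi_i64 pc,$0x11fea2c
 movi_i64 tmp8,$compute_psr
 call tmp8,$0x0,$0
 movi_i64 tmp8,$0x0
 brcond_i64 cond,tmp8,eq,$0x1
 movi_i64 npc,$0x11fe9f4
 br $0x2
 set_label $0x1
 movi_i64 npc,$0x11fea30
 set_label $0x2
 movi_i64 tmp8,$wrpstate
 call tmp8,$0x0,$0,tmp0
 mov_i64 pc,npc
 movi_i64 tmp8,$0x4
 add_i64 npc,npc,tmp8
 exit_tb $0x0
 end
"""
print(temps_read_before_written(LISTING.splitlines()))
```

On this listing the scanner reports only the tmp0 argument of the wrpstate call, so that temp may be the dead one the allocator trips over. That would point at the logged block itself rather than at the following retl, but it is a hypothesis from the dump, not a confirmed diagnosis.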


-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 14:09   ` Artyom Tarasenko
@ 2011-04-10 14:44     ` Blue Swirl
  2011-04-10 17:48       ` Artyom Tarasenko
  2011-04-10 14:59     ` Peter Maydell
  1 sibling, 1 reply; 32+ messages in thread
From: Blue Swirl @ 2011-04-10 14:44 UTC
  To: Artyom Tarasenko; +Cc: qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 5:09 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> [...]
> Does it mean the last block is processed correctly and the crash
> happens on the next instruction which doesn't make it to the log?
> The next instruction would be a
>
> 0x00000000011fea30:  retl
>
> Since it's a branch instruction I guess this would also be a tcg block boundary.

Because abort() was called from tcg_reg_alloc_call, I'd say 'retl'
(synthetic op for 'jmpl %o7 + 8, %g0') was the problem.
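
To make the synthetic-instruction point concrete, here is a small sketch that encodes 'jmpl %o7 + 8, %g0' by hand. The field layout is the standard SPARC format 3 with an immediate operand; the function name is made up for this illustration.

```python
def encode_jmpl_imm(rd, rs1, simm13):
    """Encode SPARC 'jmpl %rs1 + simm13, %rd' (format 3, immediate form)."""
    op, op3, i_bit = 2, 0x38, 1
    return (
        (op << 30)
        | (rd << 25)
        | (op3 << 19)
        | (rs1 << 14)
        | (i_bit << 13)
        | (simm13 & 0x1FFF)
    )


O7, G0 = 15, 0  # %o7 is integer register 15, %g0 is register 0
print(hex(encode_jmpl_imm(G0, O7, 8)))  # 0x81c3e008
```

The result is the familiar retl instruction word, which is why a disassembler prints retl instead of the full jmpl form.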


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 14:09   ` Artyom Tarasenko
  2011-04-10 14:44     ` Blue Swirl
@ 2011-04-10 14:59     ` Peter Maydell
  2011-04-10 17:31       ` Artyom Tarasenko
  1 sibling, 1 reply; 32+ messages in thread
From: Peter Maydell @ 2011-04-10 14:59 UTC
  To: Artyom Tarasenko; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

On 10 April 2011 15:09, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> Does it mean the last block is processed correctly and the crash
> happens on the next instruction which doesn't make it to the log?

...maybe tcg_abort() should do a qemu_log_flush()...?
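
The concern is generic to buffered logging: whatever still sits in the buffer when the process aborts never reaches the file. A minimal sketch of that effect, using an in-memory stand-in for the log file (this is not QEMU code, and the message text is arbitrary):

```python
import io

raw = io.BytesIO()  # stands in for the log file on disk
log = io.BufferedWriter(raw, buffer_size=4096)

log.write(b"last log entry\n")
before_flush = raw.getvalue()  # still empty: the data sits in the buffer
log.flush()
after_flush = raw.getvalue()   # now the entry has reached the "file"

print(before_flush, after_flush)
```

An abort between the write and the flush would lose the entry, which is the case a flush inside the abort path would cover.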

-- PMM


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 14:59     ` Peter Maydell
@ 2011-04-10 17:31       ` Artyom Tarasenko
  0 siblings, 0 replies; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 17:31 UTC
  To: Peter Maydell; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 4:59 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 10 April 2011 15:09, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> Does it mean the last block is processed correctly and the crash
>> happens on the next instruction which doesn't make it to the log?
>
> ...maybe tcg_abort() should do a qemu_log_flush()...?

Sounds reasonable, but in this particular case it makes no difference.

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 14:44     ` Blue Swirl
@ 2011-04-10 17:48       ` Artyom Tarasenko
  2011-04-10 17:57         ` Blue Swirl
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 17:48 UTC
  To: Blue Swirl; +Cc: peter.maydell, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 4:44 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
> [...]
> Because abort() was called from tcg_reg_alloc_call, I'd say 'retl'
> (synthetic op for 'jmpl %o7 + 8, %g0') was the problem.

Any idea why? retl is not a rare instruction...


-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 17:48       ` Artyom Tarasenko
@ 2011-04-10 17:57         ` Blue Swirl
  2011-04-10 18:35           ` Artyom Tarasenko
  0 siblings, 1 reply; 32+ messages in thread
From: Blue Swirl @ 2011-04-10 17:57 UTC
  To: Artyom Tarasenko; +Cc: peter.maydell, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 8:48 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> [...]
> Any idea why? retl is not a rare instruction...

Sorry, calls are generated for helpers, so it's not 'jmpl' but the
call to wrpstate helper.
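
For readers following along: one possible reading of the op dump, together with the val_type of 0 seen in gdb, is that tmp0 is written by the xor before the brcond and only consumed by the wrpstate call after the labels. The sketch below is a toy model in Python, not QEMU code. It assumes that ordinary temporaries count as dead after any op that ends a basic block, that 0 stands for a "dead" state, and that g1, pc, npc and cond are globals. The function name and the (name, outputs, inputs) op encoding are made up for illustration.

```python
# Toy model, NOT QEMU code. Assumption: temporaries that are not globals
# are dead after any op that ends a basic block.
DEAD, REG = 0, 1  # illustrative state codes; 0 mirrors the val_type seen in gdb

BB_END = {"brcond_i64", "br", "set_label"}  # ops assumed to end a basic block


def find_dead_call_args(ops, globals_):
    """Return (op_index, temp) pairs where a call reads a dead temporary."""
    state = {}
    problems = []
    for i, (name, outs, ins) in enumerate(ops):
        if name == "call":
            for t in ins:
                if t not in globals_ and state.get(t, DEAD) == DEAD:
                    problems.append((i, t))
        for t in outs:
            state[t] = REG
        if name in BB_END:
            # Plain temporaries do not survive a basic-block boundary.
            for t in state:
                if t not in globals_:
                    state[t] = DEAD
    return problems


# Ops for 0x11fea2c from the dump, reduced to (name, outputs, inputs).
OPS = [
    ("movi_i64", ["tmp7"], []),
    ("xor_i64", ["tmp0"], ["tmp7", "g1"]),
    ("movi_i64", ["pc"], []),
    ("movi_i64", ["tmp8"], []),
    ("call", [], ["tmp8"]),               # compute_psr
    ("movi_i64", ["tmp8"], []),
    ("brcond_i64", [], ["cond", "tmp8"]),
    ("movi_i64", ["npc"], []),
    ("br", [], []),
    ("set_label", [], []),
    ("movi_i64", ["npc"], []),
    ("set_label", [], []),
    ("movi_i64", ["tmp8"], []),
    ("call", [], ["tmp8", "tmp0"]),       # wrpstate
]
GLOBALS = {"g1", "pc", "npc", "cond"}     # assumed to be TCG globals

print(find_dead_call_args(OPS, GLOBALS))
```

Under these assumptions the only flagged argument is tmp0 in the final call, which would be consistent with an abort in tcg_reg_alloc_call. Whether this is what actually happens in tcg.c is not confirmed by anything in this thread so far.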

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 17:57         ` Blue Swirl
@ 2011-04-10 18:35           ` Artyom Tarasenko
  2011-04-10 18:52             ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 18:35 UTC (permalink / raw)
  To: Blue Swirl; +Cc: peter.maydell, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 7:57 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
> Sorry, calls are generated for helpers, so it's not 'jmpl' but the
> call to wrpstate helper.

And why doesn't it happen in singlestep mode?
I tried commenting out
cpu_check_irqs(env);
in helper_wrpstate, but it made no difference. The only suspicious
thing left is register bank switching. Is it safe to switch register
banks in a helper function? Shouldn't we end the translation block
first?

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 18:35           ` Artyom Tarasenko
@ 2011-04-10 18:52             ` Igor Kovalenko
  2011-04-10 19:37               ` Artyom Tarasenko
  0 siblings, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-10 18:52 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 10:35 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> And why it doesn't happen in a singlestep mode?
> I tried to comment out
> cpu_check_irqs(env);
> in the helper_wrpstate but it made no difference. The only suspicious
> thing left is register bank switching. Is it safe to switch register
> banks in the helper function? Shouldn't we end the translation block
> before?

I'm not sure I have seen a write to pstate in a delay slot before, but
switching globals with PS_AG appears to be safe.
Do you know which bits are changed in pstate?

-- 
Kind regards,
Igor V. Kovalenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 18:52             ` Igor Kovalenko
@ 2011-04-10 19:37               ` Artyom Tarasenko
       [not found]                 ` <BANLkTik2NChYi8hADjCSbjdZeyP_oo8_Qg@mail.gmail.com>
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 19:37 UTC (permalink / raw)
  To: Igor Kovalenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 8:52 PM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> Not sure if I have seen write to pstate in delay slot, but switching
> globals with PS_AG appears to be safe.
> Do you know which bits are changed in the pstate?

Hard to say. With a breakpoint set qemu doesn't crash.
The breakpoint shows the change from 0x14->0x16.
So the only difference is that interrupts are getting enabled. No
register bank change.
(And now also no cpu_check_irqs(env) call, because I commented it out.)
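
The 0x14 -> 0x16 change can be checked by decoding the value bit by bit. The sketch below is only an illustration: the bit positions follow the PSTATE layout of SPARC V9 with the UltraSPARC MG/IG extensions as I understand it, the two-bit MM field is left out, and the names pstate_flags and pstate_changed are made up here.

```python
# PSTATE bit masks (SPARC V9 plus UltraSPARC MG/IG); the MM field is omitted.
PSTATE_BITS = {
    "AG": 0x001,    # alternate globals
    "IE": 0x002,    # interrupt enable
    "PRIV": 0x004,  # privileged mode
    "AM": 0x008,    # address mask
    "PEF": 0x010,   # FPU enable
    "RED": 0x020,   # RED state
    "TLE": 0x100,
    "CLE": 0x200,
    "MG": 0x400,    # MMU globals
    "IG": 0x800,    # interrupt globals
}


def pstate_flags(value):
    """Return the set of flag names that are set in a pstate value."""
    return {name for name, mask in PSTATE_BITS.items() if value & mask}


def pstate_changed(old, new):
    """Return the flag names that differ between two pstate values."""
    return pstate_flags(old) ^ pstate_flags(new)


print(sorted(pstate_flags(0x14)))
print(sorted(pstate_changed(0x14, 0x16)))
```

With these masks, 0x14 decodes to PRIV and PEF, and the only bit that differs in 0x16 is IE; none of the AG, MG or IG bank-select bits is involved.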

But given there was a Data Access MMU Miss, I would expect there must
have been a PS_MG switch.

Also, the breakpoint makes TCG cut the translation block before the wrpr:

IN:
0x00000000011fea18:  rdpr  %tick, %o5
0x00000000011fea1c:  cmp  %o2, %o4
0x00000000011fea20:  be  %icc, 0x11fea18
0x00000000011fea24:  ldub  [ %o0 ], %o4
--------------
IN:
0x00000000011fea28:  brz,pn   %o4, 0x11fe9f4
--------------
IN:
0x00000000011fea2c:  wrpr  %g0, %g1, %pstate
--------------
IN:
0x00000000011fea30:  retl
--------------
IN:
0x00000000011fea30:  retl
0x00000000011fea34:  sub  %o5, %o3, %o0

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
       [not found]                 ` <BANLkTik2NChYi8hADjCSbjdZeyP_oo8_Qg@mail.gmail.com>
@ 2011-04-10 20:00                   ` Artyom Tarasenko
  2011-04-11  3:16                     ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-10 20:00 UTC (permalink / raw)
  To: Igor Kovalenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 9:41 PM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> On Sun, Apr 10, 2011 at 11:37 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> On Sun, Apr 10, 2011 at 8:52 PM, Igor Kovalenko
>> <igor.v.kovalenko@gmail.com> wrote:
>>> On Sun, Apr 10, 2011 at 10:35 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>> On Sun, Apr 10, 2011 at 7:57 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>>>> On Sun, Apr 10, 2011 at 8:48 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>>> On Sun, Apr 10, 2011 at 4:44 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>>>>>> On Sun, Apr 10, 2011 at 5:09 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>>>>> On Sun, Apr 10, 2011 at 3:24 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>>>>>>>> On Sun, Apr 10, 2011 at 02:29:59PM +0200, Artyom Tarasenko wrote:
>>>>>>>>>> Trying to boot some proprietary OS I get qemu-system-sparc64 crash with a
>>>>>>>>>>
>>>>>>>>>> tcg/tcg.c:1892: tcg fatal error
>>>>>>>>>>
>>>>>>>>>> error message.
>>>>>>>>>>
>>>>>>>>>> It looks like it can be a platform independent bug though, because
>>>>>>>>>> when a '-singlestep' option IS present, qemu doesn't crash and seems
>>>>>>>>>> to translate the code properly.
>>>>>>>>>>
>>>>>>>>>> (gdb) bt
>>>>>>>>>> #0  0x00000032c2e327f5 in raise () from /lib64/libc.so.6
>>>>>>>>>> #1  0x00000032c2e33fd5 in abort () from /lib64/libc.so.6
>>>>>>>>>> #2  0x000000000051933d in tcg_reg_alloc_call (s=<value optimized out>,
>>>>>>>>>> def=0x89d340, opc=INDEX_op_call, args=0x10acc98, dead_iargs=3) at
>>>>>>>>>> qemu/tcg/tcg.c:1892
>>>>>>>>>> #3  0x000000000051a557 in tcg_gen_code_common (s=0x10b8940,
>>>>>>>>>> gen_code_buf=0x40338b60 "I\213n@H\213] 3\355I\211\256\220") at
>>>>>>>>>> qemu/tcg/tcg.c:2099
>>>>>>>>>> #4  tcg_gen_code (s=0x10b8940, gen_code_buf=0x40338b60 "I\213n@H\213]
>>>>>>>>>> 3\355I\211\256\220") at qemu/tcg/tcg.c:2142
>>>>>>>>>> #5  0x00000000004d38f1 in cpu_sparc_gen_code (env=0x10cce10,
>>>>>>>>>> tb=0x7fffe91bc218, gen_code_size_ptr=0x7fffffffd9b4) at
>>>>>>>>>> qemu/translate-all.c:93
>>>>>>>>>> #6  0x00000000004d1fd7 in tb_gen_code (env=0x10cce10, pc=18868776,
>>>>>>>>>> cs_base=18868780, flags=15, cflags=0) at qemu/exec.c:989
>>>>>>>>>> #7  0x00000000004d4029 in tb_find_slow (env1=<value optimized out>) at
>>>>>>>>>> qemu/cpu-exec.c:167
>>>>>>>>>> #8  tb_find_fast (env1=<value optimized out>) at cpu-exec.c:194
>>>>>>>>>> #9  cpu_sparc_exec (env1=<value optimized out>) at qemu/cpu-exec.c:556
>>>>>>>>>> #10 0x0000000000408868 in tcg_cpu_exec () at qemu/cpus.c:1066
>>>>>>>>>> #11 cpu_exec_all () at qemu/cpus.c:1102
>>>>>>>>>> #12 0x000000000053c756 in main_loop (argc=<value optimized out>,
>>>>>>>>>> argv=<value optimized out>, envp=<value optimized out>) at
>>>>>>>>>> qemu/vl.c:1430
>>>>>>>>>>
>>>>>>>>>> I inspected ts->val_type causing the abort() case and it turned out to be 0.
>>>>>>>>>>
>>>>>>>>>> The last lines of qemu.log (without -singlestep)
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fe9f0:  rdpr  %pstate, %g1
>>>>>>>>>> 0x00000000011fe9f4:  wrpr  %g1, 2, %pstate
>>>>>>>>>> --------------
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fe9f8:  ldub  [ %o0 ], %o1
>>>>>>>>>> 0x00000000011fe9fc:  mov  %o1, %o2
>>>>>>>>>> 0x00000000011fea00:  rdpr  %tick, %o3
>>>>>>>>>> 0x00000000011fea04:  cmp  %o1, %o2
>>>>>>>>>> 0x00000000011fea08:  be  %icc, 0x11fea00
>>>>>>>>>> 0x00000000011fea0c:  ldub  [ %o0 ], %o2
>>>>>>>>>>
>>>>>>>>>> Search PC...
>>>>>>>>>> Search PC...
>>>>>>>>>> Search PC...
>>>>>>>>>> Search PC...
>>>>>>>>>> Search PC...
>>>>>>>>>> Search PC...
>>>>>>>>>> --------------
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fe9f8:  ldub  [ %o0 ], %o1
>>>>>>>>>> 0x00000000011fe9fc:  mov  %o1, %o2
>>>>>>>>>> 0x00000000011fea00:  rdpr  %tick, %o3
>>>>>>>>>> 0x00000000011fea04:  cmp  %o1, %o2
>>>>>>>>>> 0x00000000011fea08:  be  %icc, 0x11fea00
>>>>>>>>>> 0x00000000011fea0c:  ldub  [ %o0 ], %o2
>>>>>>>>>>
>>>>>>>>>> 110521: Data Access MMU Miss (v=0068) pc=00000000011fe9f8
>>>>>>>>>> npc=00000000011fe9fc SP=000000000180ae41
>>>>>>>>>> pc: 00000000011fe9f8  npc: 00000000011fe9fc
>>>>>>>>>>
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fea00:  rdpr  %tick, %o3
>>>>>>>>>> 0x00000000011fea04:  cmp  %o1, %o2
>>>>>>>>>> 0x00000000011fea08:  be  %icc, 0x11fea00
>>>>>>>>>> 0x00000000011fea0c:  ldub  [ %o0 ], %o2
>>>>>>>>>> --------------
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fea10:  brz,pn   %o2, 0x11fe9f8
>>>>>>>>>> 0x00000000011fea14:  mov  %o2, %o4
>>>>>>>>>> --------------
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fea18:  rdpr  %tick, %o5
>>>>>>>>>> 0x00000000011fea1c:  cmp  %o2, %o4
>>>>>>>>>> 0x00000000011fea20:  be  %icc, 0x11fea18
>>>>>>>>>> 0x00000000011fea24:  ldub  [ %o0 ], %o4
>>>>>>>>>> --------------
>>>>>>>>>> IN:
>>>>>>>>>> 0x00000000011fea28:  brz,pn   %o4, 0x11fe9f4
>>>>>>>>>> 0x00000000011fea2c:  wrpr  %g0, %g1, %pstate
>>>>>>>>>> <EOF>
>>>>>>>>>>
>>>>>>>>>> The crash is 100% reproducible and happens always on the same place,
>>>>>>>>>> so it's probably a pure TCG issue, not related on getting the
>>>>>>>>>> external/timer interrupts.
>>>>>>>>>>
>>>>>>>>>> Do you need any additional info?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> What would be interesting would be to get the corresponding TCG code
>>>>>>>>> from qemu.log (-d op,op_opt).
>>>>>>>>
>>>>>>>>
>>>>>>>> OP:
>>>>>>>>  ---- 0x11fea28
>>>>>>>>  ld_i64 tmp6,regwptr,$0x20
>>>>>>>>  movi_i64 cond,$0x0
>>>>>>>>  movi_i64 tmp8,$0x0
>>>>>>>>  brcond_i64 tmp6,tmp8,ne,$0x0
>>>>>>>>  movi_i64 cond,$0x1
>>>>>>>>  set_label $0x0
>>>>>>>>
>>>>>>>>  ---- 0x11fea2c
>>>>>>>>  movi_i64 tmp7,$0x0
>>>>>>>>  xor_i64 tmp0,tmp7,g1
>>>>>>>>  movi_i64 pc,$0x11fea2c
>>>>>>>>  movi_i64 tmp8,$compute_psr
>>>>>>>>  call tmp8,$0x0,$0
>>>>>>>>  movi_i64 tmp8,$0x0
>>>>>>>>  brcond_i64 cond,tmp8,eq,$0x1
>>>>>>>>  movi_i64 npc,$0x11fe9f4
>>>>>>>>  br $0x2
>>>>>>>>  set_label $0x1
>>>>>>>>  movi_i64 npc,$0x11fea30
>>>>>>>>  set_label $0x2
>>>>>>>>  movi_i64 tmp8,$wrpstate
>>>>>>>>  call tmp8,$0x0,$0,tmp0
>>>>>>>>  mov_i64 pc,npc
>>>>>>>>  movi_i64 tmp8,$0x4
>>>>>>>>  add_i64 npc,npc,tmp8
>>>>>>>>  exit_tb $0x0
>>>>>>>>
>>>>>>>> OP after liveness analysis:
>>>>>>>>  ---- 0x11fea28
>>>>>>>>  ld_i64 tmp6,regwptr,$0x20
>>>>>>>>  movi_i64 cond,$0x0
>>>>>>>>  movi_i64 tmp8,$0x0
>>>>>>>>  brcond_i64 tmp6,tmp8,ne,$0x0
>>>>>>>>  movi_i64 cond,$0x1
>>>>>>>>  set_label $0x0
>>>>>>>>
>>>>>>>>  ---- 0x11fea2c
>>>>>>>>  nopn $0x2,$0x2
>>>>>>>>  nopn $0x3,$0x68,$0x3
>>>>>>>>  movi_i64 pc,$0x11fea2c
>>>>>>>>  movi_i64 tmp8,$compute_psr
>>>>>>>>  call tmp8,$0x0,$0
>>>>>>>>  movi_i64 tmp8,$0x0
>>>>>>>>  brcond_i64 cond,tmp8,eq,$0x1
>>>>>>>>  movi_i64 npc,$0x11fe9f4
>>>>>>>>  br $0x2
>>>>>>>>  set_label $0x1
>>>>>>>>  movi_i64 npc,$0x11fea30
>>>>>>>>  set_label $0x2
>>>>>>>>  movi_i64 tmp8,$wrpstate
>>>>>>>>  call tmp8,$0x0,$0,tmp0
>>>>>>>>  mov_i64 pc,npc
>>>>>>>>  movi_i64 tmp8,$0x4
>>>>>>>>  add_i64 npc,npc,tmp8
>>>>>>>>  exit_tb $0x0
>>>>>>>>  end
>>>>>>>>
>>>>>>>> Does it mean the last block is processed correctly and the crash
>>>>>>>> happens on the next instruction which doesn't make it to the log?
>>>>>>>> The next instruction would be a
>>>>>>>>
>>>>>>>> 0x00000000011fea30:  retl
>>>>>>>>
>>>>>>>> Since it's a branch instruction I guess this would also be a tcg block boundary.
>>>>>>>
>>>>>>> Because abort() was called from tcg_reg_alloc_call, I'd say 'retl'
>>>>>>> (synthetic op for 'jmpl %o8 + 8, %g0') was the problem.
>>>>>>
>>>>>> Any idea why? retl is not a rare instruction...
>>>>>
>>>>> Sorry, calls are generated for helpers, so it's not 'jmpl' but the
>>>>> call to wrpstate helper.
>>>>
>>>> And why it doesn't happen in a singlestep mode?
>>>> I tried to comment out
>>>> cpu_check_irqs(env);
>>>> in the helper_wrpstate but it made no difference. The only suspicious
>>>> thing left is register bank switching. Is it safe to switch register
>>>> banks in the helper function? Shouldn't we end the translation block
>>>> before?
>>>
>>> Not sure if I have seen write to pstate in delay slot, but switching
>>> globals with PS_AG appears to be safe.
>>> Do you know which bits are changed in the pstate?
>>
>> Hard to say. With a breakpoint set qemu doesn't crash.
>> The breakpoint shows the change from 0x14->0x16.
>> So the only difference is that interrupts are getting enabled. No
>> register bank change.
>> (And now also no cpu_check_irqs(env) call, because I commented it out.)
>>
>> But given there was a Data Access MMU Miss, I would expect there must
>> have been a PS_MG switch.
>>
>> Also the breakpoint makes tcg cut the translation block before the wrpr:
>>
>> IN:
>> 0x00000000011fea18:  rdpr  %tick, %o5
>> 0x00000000011fea1c:  cmp  %o2, %o4
>> 0x00000000011fea20:  be  %icc, 0x11fea18
>> 0x00000000011fea24:  ldub  [ %o0 ], %o4
>> --------------
>> IN:
>> 0x00000000011fea28:  brz,pn   %o4, 0x11fe9f4
>> --------------
>> IN:
>> 0x00000000011fea2c:  wrpr  %g0, %g1, %pstate
>> --------------
>> IN:
>> 0x00000000011fea30:  retl
>> --------------
>> IN:
>> 0x00000000011fea30:  retl
>> 0x00000000011fea34:  sub  %o5, %o3, %o0
>>
>
> You can try enabling DEBUG_PSTATE to see which bits are changed.

I put an additional DPRINTF in the helper and it doesn't get executed
at 11fea2c. Only at 11fe9f4 (0x16->0x14).
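For reference, the two pstate values mentioned in this thread (0x14 and 0x16) can be decoded with a few lines of C. This is only a sketch: the bit masks follow the SPARC V9 PSTATE layout and are assumed to match the PS_* constants in target-sparc/cpu.h, the TLE/CLE/RMO bits are left out for brevity, and pstate_changed_bits() and pstate_names() are hypothetical helpers, not QEMU functions.

```c
#include <stdint.h>
#include <stdio.h>

/* PSTATE bit masks following the SPARC V9 register layout.
 * Assumption: these match the PS_* constants in target-sparc/cpu.h. */
#define PS_AG   (1u << 0)
#define PS_IE   (1u << 1)
#define PS_PRIV (1u << 2)
#define PS_AM   (1u << 3)
#define PS_PEF  (1u << 4)
#define PS_RED  (1u << 5)
#define PS_MG   (1u << 10)
#define PS_IG   (1u << 11)

/* Hypothetical helper: bits that differ between two pstate values. */
static uint32_t pstate_changed_bits(uint32_t old_state, uint32_t new_state)
{
    return old_state ^ new_state;
}

/* Hypothetical helper: names of the set bits, separated by spaces.
 * Returns a pointer to a static buffer, so it is not reentrant. */
static const char *pstate_names(uint32_t pstate)
{
    static const struct { uint32_t mask; const char *name; } bits[] = {
        { PS_IG, "IG" },   { PS_MG, "MG" },     { PS_RED, "RED" },
        { PS_PEF, "PEF" }, { PS_AM, "AM" },     { PS_PRIV, "PRIV" },
        { PS_IE, "IE" },   { PS_AG, "AG" },
    };
    static char buf[64];
    size_t used = 0;

    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (pstate & bits[i].mask) {
            int n = snprintf(buf + used, sizeof(buf) - used, "%s%s",
                             used ? " " : "", bits[i].name);
            if (n < 0 || (size_t)n >= sizeof(buf) - used) {
                break; /* output truncated, stop appending */
            }
            used += (size_t)n;
        }
    }
    return buf;
}
```

With these masks, pstate_changed_bits(0x14, 0x16) is PS_IE, which agrees with the observation that only the interrupt enable bit differs and no globals bank (AG/MG/IG) is involved.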

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-10 20:00                   ` Artyom Tarasenko
@ 2011-04-11  3:16                     ` Igor Kovalenko
  2011-04-11 17:53                       ` Artyom Tarasenko
  0 siblings, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-11  3:16 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

[-- Attachment #1: Type: text/plain, Size: 10098 bytes --]

On Mon, Apr 11, 2011 at 12:00 AM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> I put an additional DPRINTF in the helper and it doesn't get executed
> at 11fea2c. Only at 11fe9f4 (0x16->0x14).

In such cases I would run with -d in_asm,int to get more data for
comparing the two runs.
The attached patch may help a bit by adding verbose pstate output.

Do you have a public test case?
It is possible to code this delay slot write test, but the real issue may
be corruption elsewhere.

-- 
Kind regards,
Igor V. Kovalenko

[-- Attachment #2: sparc64-dump-pstate-verbose --]
[-- Type: application/octet-stream, Size: 4225 bytes --]

sparc64: verbose pstate dump

From: Igor V. Kovalenko <igor.v.kovalenko@gmail.com>


---
 target-sparc/cpu.h       |    4 ++++
 target-sparc/helper.c    |   29 +++++++++++++++++++++++++++--
 target-sparc/op_helper.c |   24 ++++++++++++++++++++++++
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 5cd89a4..e0032dd 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -667,4 +667,8 @@ static inline void cpu_get_tb_cpu_state(CPUState *env, target_ulong *pc,
 #endif
 }
 
+#if defined(TARGET_SPARC64)
+void pstate_to_string(uint32_t pstate, char *str, size_t str_limit);
+#endif
+
 #endif
diff --git a/target-sparc/helper.c b/target-sparc/helper.c
index e415b22..89da711 100644
--- a/target-sparc/helper.c
+++ b/target-sparc/helper.c
@@ -1480,6 +1480,26 @@ static void cpu_print_cc(FILE *f, fprintf_function cpu_fprintf,
 #define REGS_PER_LINE 8
 #endif
 
+#if defined (TARGET_SPARC64)
+void pstate_to_string(uint32_t pstate, char *str, size_t str_limit)
+{
+    snprintf(str, str_limit,
+             "%s%s%s%s%s%s%s%s%s%s%s ",
+            (pstate & PS_IG)   ? " IG" : "",
+            (pstate & PS_MG)   ? " MG" : "",
+            (pstate & PS_CLE)  ? " CLE" : "",
+            (pstate & PS_TLE)  ? " TLE" : "",
+            (pstate & PS_RMO)  ? " RMO" : "",
+            (pstate & PS_RED)  ? " RED" : "",
+            (pstate & PS_PEF)  ? " PEF" : "",
+            (pstate & PS_AM)   ? " AM" : "",
+            (pstate & PS_PRIV) ? " PRIV" : "",
+            (pstate & PS_IE)   ? " IE" : "",
+            (pstate & PS_AG)   ? " AG" : ""
+    );
+}
+#endif
+
 void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf,
                     int flags)
 {
@@ -1522,8 +1542,13 @@ void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf,
             cpu_fprintf(f, "\n");
     }
 #ifdef TARGET_SPARC64
-    cpu_fprintf(f, "pstate: %08x ccr: %02x (icc: ", env->pstate,
-                (unsigned)cpu_get_ccr(env));
+    {
+        char pstate_string[128];
+        pstate_to_string(env->pstate, pstate_string, sizeof (pstate_string));
+        cpu_fprintf(f, "pstate: %08x [%s] ccr: %02x (icc: ",
+                    env->pstate, pstate_string,
+                    (unsigned)cpu_get_ccr(env));
+    }
     cpu_print_cc(f, cpu_fprintf, cpu_get_ccr(env) << PSR_CARRY_SHIFT);
     cpu_fprintf(f, " xcc: ");
     cpu_print_cc(f, cpu_fprintf, cpu_get_ccr(env) << (PSR_CARRY_SHIFT - 4));
diff --git a/target-sparc/op_helper.c b/target-sparc/op_helper.c
index f292309..f01bf53 100644
--- a/target-sparc/op_helper.c
+++ b/target-sparc/op_helper.c
@@ -4036,6 +4036,12 @@ void helper_wrpstate(target_ulong new_state)
 {
     change_pstate(new_state & 0xf3f);
 
+    {
+        char pstate_string[128];
+        pstate_to_string(env->pstate, pstate_string, sizeof (pstate_string));
+        DPRINTF_PSTATE("helper_wrpstate pstate=[%s]\n", pstate_string);
+    }
+
 #if !defined(CONFIG_USER_ONLY)
     if (cpu_interrupts_enabled(env)) {
         cpu_check_irqs(env);
@@ -4071,6 +4077,12 @@ void helper_done(void)
 
     DPRINTF_PSTATE("... helper_done tl=%d\n", env->tl);
 
+    {
+        char pstate_string[128];
+        pstate_to_string(env->pstate, pstate_string, sizeof (pstate_string));
+        DPRINTF_PSTATE("helper_done pstate=[%s]\n", pstate_string);
+    }
+
 #if !defined(CONFIG_USER_ONLY)
     if (cpu_interrupts_enabled(env)) {
         cpu_check_irqs(env);
@@ -4092,6 +4104,12 @@ void helper_retry(void)
 
     DPRINTF_PSTATE("... helper_retry tl=%d\n", env->tl);
 
+    {
+        char pstate_string[128];
+        pstate_to_string(env->pstate, pstate_string, sizeof (pstate_string));
+        DPRINTF_PSTATE("helper_retry pstate=[%s]\n", pstate_string);
+    }
+
 #if !defined(CONFIG_USER_ONLY)
     if (cpu_interrupts_enabled(env)) {
         cpu_check_irqs(env);
@@ -4277,6 +4295,12 @@ void do_interrupt(CPUState *env)
     }
     env->npc = env->pc + 4;
     env->exception_index = -1;
+
+    {
+        char pstate_string[128];
+        pstate_to_string(env->pstate, pstate_string, sizeof (pstate_string));
+        DPRINTF_PSTATE("do_interrupt pstate=[%s]\n", pstate_string);
+    }
 }
 #else
 #ifdef DEBUG_PCALL


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-11  3:16                     ` Igor Kovalenko
@ 2011-04-11 17:53                       ` Artyom Tarasenko
  2011-04-12  2:14                         ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-11 17:53 UTC (permalink / raw)
  To: Igor Kovalenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

On Mon, Apr 11, 2011 at 5:16 AM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> In such cases I would run with -d in_asm,int to get more data for
> comparing the two runs.
> The attached patch may help a bit by adding verbose pstate output.

Can do it, but I'd like to understand first what we are looking for.
How does the main loop work in this case? Is it something like the following?

translate {brz,pn ; wrpr} -> optimize -> execute -> translate {retl ;
...} -> optimize -> execute.

The subject error is a tcg error, so it is happening in one of the two
translate/optimize phases drawn above, right?
So, why are we looking at the wrpr helper code?

> Do you have a public test case?
> It is possible to code this delay slot write test, but the real issue may
> be corruption elsewhere.

You assume ts->val_type gets corrupted? But then it must happen before
the wrpr helper call, or actually before the translation of the {brz,pn ;
wrpr} block, no?
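
To make the question concrete, here is a toy model in C of a failure that shows up at translation time rather than at execution time. It is a sketch under a loudly stated assumption, not something established in this thread and not code from tcg.c: it assumes that an ordinary temporary stops being valid once a label is crossed, and that val_type 0 means "dead". All names (toy_op, first_call_with_dead_arg, the two example blocks) are made up for illustration.

```c
#include <stddef.h>

/* Toy value states. DEAD is 0 on purpose, to mirror the val_type of 0
 * seen in the debugger; the real enum in tcg.c may differ. */
enum toy_val_type { TOY_VAL_DEAD = 0, TOY_VAL_REG = 1 };

enum toy_op_kind {
    TOY_DEF,   /* an op that writes a temporary          */
    TOY_LABEL, /* a label: ends the current basic block  */
    TOY_CALL   /* a helper call that reads a temporary   */
};

struct toy_op {
    enum toy_op_kind kind;
    int temp; /* temporary index, ignored for TOY_LABEL */
};

#define TOY_NTEMPS 4

/* Walk the ops the way a register allocator would and return the index of
 * the first call whose argument is dead, or -1 if every call is fine.
 * Assumption: crossing a label kills all ordinary temporaries. */
static int first_call_with_dead_arg(const struct toy_op *ops, size_t n)
{
    enum toy_val_type state[TOY_NTEMPS] = { TOY_VAL_DEAD };

    for (size_t i = 0; i < n; i++) {
        switch (ops[i].kind) {
        case TOY_DEF:
            state[ops[i].temp] = TOY_VAL_REG;
            break;
        case TOY_LABEL:
            for (int t = 0; t < TOY_NTEMPS; t++) {
                state[t] = TOY_VAL_DEAD;
            }
            break;
        case TOY_CALL:
            if (state[ops[i].temp] == TOY_VAL_DEAD) {
                return (int)i;
            }
            break;
        }
    }
    return -1;
}

/* Shape of the {brz,pn ; wrpr} block from the op dump: tmp0 is written,
 * labels follow, then the wrpstate call reads tmp0. */
static const struct toy_op delay_slot_block[] = {
    { TOY_DEF, 0 },    /* xor_i64 tmp0,tmp7,g1      */
    { TOY_LABEL, -1 }, /* set_label $0x1 / $0x2     */
    { TOY_CALL, 0 },   /* call wrpstate with tmp0   */
};

/* Shape of a block holding only the wrpr, as with -singlestep. */
static const struct toy_op single_insn_block[] = {
    { TOY_DEF, 0 },
    { TOY_CALL, 0 },
};
```

In this model first_call_with_dead_arg() flags the call in delay_slot_block and nothing in single_insn_block. If the real code behaves similarly, the abort would happen while the block is being generated, before the helper ever runs, which would fit the DPRINTF in the helper not firing at 11fea2c.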

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-11 17:53                       ` Artyom Tarasenko
@ 2011-04-12  2:14                         ` Igor Kovalenko
  2011-04-21 14:57                           ` Artyom Tarasenko
  0 siblings, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-12  2:14 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

On Mon, Apr 11, 2011 at 9:53 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>
> Can do it, but I'd like to understand first what we are looking for.
> How does the main loop work in this case? Is it something like the following?
>
> translate {brz,pn ; wrpr} -> optimize -> execute -> translate {retl ;
> ...} -> optimize -> execute.
>
> The subject error is a tcg error, so it is happening in one of the two
> translate/optimize phases drawn above, right?
> So, why are we looking at the wrpr helper code?

Because you asked about alternate globals :)

>> Do you have a public test case?
>> It is possible to code this delay slot write test, but the real issue may
>> be corruption elsewhere.
>
> You assume ts->val_type gets corrupted? But then it must happen before
> the wrpr helper call, or actually before the translation of the {brz,pn ;
> wrpr} block, no?

In theory there could be multiple issues, including compiler-induced ones.
I'd prefer to see some kind of reproducible test case.

-- 
Kind regards,
Igor V. Kovalenko


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-12  2:14                         ` Igor Kovalenko
@ 2011-04-21 14:57                           ` Artyom Tarasenko
  2011-04-21 15:44                             ` Laurent Desnogues
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-21 14:57 UTC (permalink / raw)
  To: Igor Kovalenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

[-- Attachment #1: Type: text/plain, Size: 875 bytes --]

On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
>>> Do you have public test case?
>>> It is possible to code this delay slot write test but real issue may
>>> be corruption elsewhere.

The test case is trivial: it's just the two instructions, branch and wrpr.

> In theory there could be multiple issues including compiler induced ones.
> I'd prefer to see some kind of reproducible testcase.

OK, attached is a 40-byte test (the first 32 bytes are unused padding,
needed only because the BIOS entry point is 0x20).

$ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
test-wrpr.bin -nographic
Already up-to-date.
make[1]: Nothing to be done for `all'.
/mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
Aborted

HTH,
Artyom

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

[-- Attachment #2: test-wrpr.bin --]
[-- Type: application/octet-stream, Size: 40 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-21 14:57                           ` Artyom Tarasenko
@ 2011-04-21 15:44                             ` Laurent Desnogues
  2011-04-21 19:45                               ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Laurent Desnogues @ 2011-04-21 15:44 UTC (permalink / raw)
  To: Artyom Tarasenko; +Cc: Blue Swirl, peter.maydell, qemu-devel, Aurelien Jarno

On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
> <igor.v.kovalenko@gmail.com> wrote:
>>>> Do you have public test case?
>>>> It is possible to code this delay slot write test but real issue may
>>>> be corruption elsewhere.
>
> The test case is trivial: it's just the two instructions, branch and wrpr.
>
>> In theory there could be multiple issues including compiler induced ones.
>> I'd prefer to see some kind of reproducible testcase.
>
> Ok, attached a 40 byte long test (the first 32 bytes are not used and
> needed only because the bios entry point is 0x20).
>
> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
> test-wrpr.bin -nographic
> Already up-to-date.
> make[1]: Nothing to be done for `all'.
> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
> Aborted

The problem seems to be that wrpr is using a non-local
TCG tmp (cpu_tmp0).


Laurent

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-21 15:44                             ` Laurent Desnogues
@ 2011-04-21 19:45                               ` Igor Kovalenko
  2011-04-21 22:39                                 ` Laurent Desnogues
  0 siblings, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-21 19:45 UTC (permalink / raw)
  To: Laurent Desnogues
  Cc: Blue Swirl, peter.maydell, Aurelien Jarno, qemu-devel,
	Artyom Tarasenko

On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>> <igor.v.kovalenko@gmail.com> wrote:
>>>>> Do you have public test case?
>>>>> It is possible to code this delay slot write test but real issue may
>>>>> be corruption elsewhere.
>>
>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>
>>> In theory there could be multiple issues including compiler induced ones.
>>> I'd prefer to see some kind of reproducible testcase.
>>
>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>> needed only because the bios entry point is 0x20).
>>
>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>> test-wrpr.bin -nographic
>> Already up-to-date.
>> make[1]: Nothing to be done for `all'.
>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>> Aborted
>
> The problem seems to be that wrpr is using a non-local
> TCG tmp (cpu_tmp0).

I just tried the test case with a write to %pil: the write itself seems OK.
The issue appears to be with the save_state() call, since adding save_state()
to the %pil case provokes the same tcg abort.

-- 
Kind regards,
Igor V. Kovalenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-21 19:45                               ` Igor Kovalenko
@ 2011-04-21 22:39                                 ` Laurent Desnogues
  2011-04-22 14:14                                   ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Laurent Desnogues @ 2011-04-21 22:39 UTC (permalink / raw)
  To: Igor Kovalenko
  Cc: Blue Swirl, peter.maydell, Aurelien Jarno, qemu-devel,
	Artyom Tarasenko

On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
> <laurent.desnogues@gmail.com> wrote:
>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>> <igor.v.kovalenko@gmail.com> wrote:
>>>>>> Do you have public test case?
>>>>>> It is possible to code this delay slot write test but real issue may
>>>>>> be corruption elsewhere.
>>>
>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>
>>>> In theory there could be multiple issues including compiler induced ones.
>>>> I'd prefer to see some kind of reproducible testcase.
>>>
>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>> needed only because the bios entry point is 0x20).
>>>
>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>> test-wrpr.bin -nographic
>>> Already up-to-date.
>>> make[1]: Nothing to be done for `all'.
>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>> Aborted
>>
>> The problem seems to be that wrpr is using a non-local
>> TCG tmp (cpu_tmp0).
>
> Just tried the test case with write to %pil - seems like write itself is OK.
> The issue appears to be with save_state() call since adding save_state
> to %pil case provokes the same tcg abort.

The problem is that cpu_tmp0, not being a local tmp, doesn't
need to be saved across helper calls.  This results in the
TCG "optimizer" getting rid of it even though it's later used.
Look at the log and you'll see what I mean :-)


Laurent

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-21 22:39                                 ` Laurent Desnogues
@ 2011-04-22 14:14                                   ` Igor Kovalenko
  2011-04-25 20:29                                     ` Aurelien Jarno
  0 siblings, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-22 14:14 UTC (permalink / raw)
  To: Laurent Desnogues
  Cc: Blue Swirl, peter.maydell, Aurelien Jarno, qemu-devel,
	Artyom Tarasenko

On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
> On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
> <igor.v.kovalenko@gmail.com> wrote:
>> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>> <laurent.desnogues@gmail.com> wrote:
>>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>> <igor.v.kovalenko@gmail.com> wrote:
>>>>>>> Do you have public test case?
>>>>>>> It is possible to code this delay slot write test but real issue may
>>>>>>> be corruption elsewhere.
>>>>
>>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>>
>>>>> In theory there could be multiple issues including compiler induced ones.
>>>>> I'd prefer to see some kind of reproducible testcase.
>>>>
>>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>> needed only because the bios entry point is 0x20).
>>>>
>>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>> test-wrpr.bin -nographic
>>>> Already up-to-date.
>>>> make[1]: Nothing to be done for `all'.
>>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>> Aborted
>>>
>>> The problem seems to be that wrpr is using a non-local
>>> TCG tmp (cpu_tmp0).
>>
>> Just tried the test case with write to %pil - seems like write itself is OK.
>> The issue appears to be with save_state() call since adding save_state
>> to %pil case provokes the same tcg abort.
>
> The problem is that cpu_tmp0, not being a local tmp, doesn't
> need to be saved across helper calls.  This results in the
> TCG "optimizer" getting rid of it even though it's later used.
> Look at the log and you'll see what I mean :-)

I'm not very comfortable with tcg yet. Would it be possible to teach the
optimizer to work with delay slots? Or am I looking in the wrong place?

-- 
Kind regards,
Igor V. Kovalenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-22 14:14                                   ` Igor Kovalenko
@ 2011-04-25 20:29                                     ` Aurelien Jarno
  2011-04-26  3:34                                       ` Igor Kovalenko
  2011-04-26 17:02                                       ` Artyom Tarasenko
  0 siblings, 2 replies; 32+ messages in thread
From: Aurelien Jarno @ 2011-04-25 20:29 UTC (permalink / raw)
  To: Igor Kovalenko
  Cc: Laurent Desnogues, Blue Swirl, qemu-devel, Artyom Tarasenko,
	peter.maydell

On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
> <laurent.desnogues@gmail.com> wrote:
> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
> > <igor.v.kovalenko@gmail.com> wrote:
> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
> >> <laurent.desnogues@gmail.com> wrote:
> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
> >>>> <igor.v.kovalenko@gmail.com> wrote:
> >>>>>>> Do you have public test case?
> >>>>>>> It is possible to code this delay slot write test but real issue may
> >>>>>>> be corruption elsewhere.
> >>>>
> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
> >>>>
> >>>>> In theory there could be multiple issues including compiler induced ones.
> >>>>> I'd prefer to see some kind of reproducible testcase.
> >>>>
> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
> >>>> needed only because the bios entry point is 0x20).
> >>>>
> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
> >>>> test-wrpr.bin -nographic
> >>>> Already up-to-date.
> >>>> make[1]: Nothing to be done for `all'.
> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
> >>>> Aborted
> >>>
> >>> The problem seems to be that wrpr is using a non-local
> >>> TCG tmp (cpu_tmp0).
> >>
> >> Just tried the test case with write to %pil - seems like write itself is OK.
> >> The issue appears to be with save_state() call since adding save_state
> >> to %pil case provokes the same tcg abort.
> >
> > The problem is that cpu_tmp0, not being a local tmp, doesn't
> > need to be saved across helper calls.  This results in the
> > TCG "optimizer" getting rid of it even though it's later used.
> > Look at the log and you'll see what I mean :-)
> 
> I'm not very comfortable with tcg yet. Would it be possible to teach
> optimizer working with delay slots? Or do I look in the wrong place.
> 

The problem is not on the TCG side, but on the target-sparc/translate.c
side:

|                    case 0x32: /* wrwim, V9 wrpr */
|                         {
|                             if (!supervisor(dc))
|                                 goto priv_insn;
|                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
| #ifdef TARGET_SPARC64

Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
saved across TCG branches.

[...]

|                             case 6: // pstate
|                                 save_state(dc, cpu_cond);
|                                 gen_helper_wrpstate(cpu_tmp0);
|                                 dc->npc = DYNAMIC_PC;
|                                 break;

save_state() calls save_npc(), which in turn might call
gen_generic_branch():

| static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
|                                       TCGv r_cond)
| {
|     int l1, l2;
|
|     l1 = gen_new_label();
|     l2 = gen_new_label();
|
|     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
|
|     tcg_gen_movi_tl(cpu_npc, npc1);
|     tcg_gen_br(l2);
| 
|     gen_set_label(l1);
|     tcg_gen_movi_tl(cpu_npc, npc2);
|     gen_set_label(l2);
| }

And here is the TCG branch, which drops the TCG temp cpu_tmp0.

The solution is either to rewrite gen_generic_branch() without TCG
branches, or to use a TCG temp local instead of a TCG temp.
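
To make the rule above concrete, here is a toy model of it. This is not
QEMU code: the struct, the enum and the function names are all made up for
illustration, and the only assumptions carried over from TCG are that a
plain temp is dead after a branch or label, that a temp local survives one,
and that the dead state is encoded as 0 (which would match the
ts->val_type == 0 seen at the abort).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names mirror TCG concepts but this is not QEMU code. */
enum val_type { VAL_DEAD = 0, VAL_REG, VAL_MEM };

struct temp {
    bool is_local;          /* true for a "temp local", false for a plain temp */
    enum val_type val_type; /* where the value currently lives */
};

/* Writing a temp makes it live in a (host) register. */
static void temp_write(struct temp *t)
{
    t->val_type = VAL_REG;
}

/* At a branch or label: live locals are spilled to memory, plain temps die. */
static void bb_end(struct temp *temps, int n)
{
    for (int i = 0; i < n; i++) {
        if (temps[i].val_type == VAL_DEAD) {
            continue;
        }
        temps[i].val_type = temps[i].is_local ? VAL_MEM : VAL_DEAD;
    }
}

/* A helper call can only take a live temp as input. */
static bool can_pass_to_helper(const struct temp *t)
{
    return t->val_type != VAL_DEAD;
}
```

In this model, writing a plain temp, crossing bb_end() and then passing the
temp to a helper is exactly the sequence that wrpr pstate generates when
save_state() emits a branch.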

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-25 20:29                                     ` Aurelien Jarno
@ 2011-04-26  3:34                                       ` Igor Kovalenko
  2011-04-26 16:26                                         ` Artyom Tarasenko
  2011-04-26 17:02                                       ` Artyom Tarasenko
  1 sibling, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-26  3:34 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: Laurent Desnogues, Blue Swirl, qemu-devel, Artyom Tarasenko,
	peter.maydell

On Tue, Apr 26, 2011 at 12:29 AM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>> <laurent.desnogues@gmail.com> wrote:
>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>> > <igor.v.kovalenko@gmail.com> wrote:
>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>> >> <laurent.desnogues@gmail.com> wrote:
>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>> >>>>>>> Do you have public test case?
>> >>>>>>> It is possible to code this delay slot write test but real issue may
>> >>>>>>> be corruption elsewhere.
>> >>>>
>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>> >>>>
>> >>>>> In theory there could be multiple issues including compiler induced ones.
>> >>>>> I'd prefer to see some kind of reproducible testcase.
>> >>>>
>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>> >>>> needed only because the bios entry point is 0x20).
>> >>>>
>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>> >>>> test-wrpr.bin -nographic
>> >>>> Already up-to-date.
>> >>>> make[1]: Nothing to be done for `all'.
>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>> >>>> Aborted
>> >>>
>> >>> The problem seems to be that wrpr is using a non-local
>> >>> TCG tmp (cpu_tmp0).
>> >>
>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>> >> The issue appears to be with save_state() call since adding save_state
>> >> to %pil case provokes the same tcg abort.
>> >
>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>> > need to be saved across helper calls.  This results in the
>> > TCG "optimizer" getting rid of it even though it's later used.
>> > Look at the log and you'll see what I mean :-)
>>
>> I'm not very comfortable with tcg yet. Would it be possible to teach
>> optimizer working with delay slots? Or do I look in the wrong place.
>>
>
> The problem is not on the TCG side, but on the target-sparc/translate.c
> side:
>
> |                    case 0x32: /* wrwim, V9 wrpr */
> |                         {
> |                             if (!supervisor(dc))
> |                                 goto priv_insn;
> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
> | #ifdef TARGET_SPARC64
>
> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
> saved across TCG branches.
>
> [...]
>
> |                             case 6: // pstate
> |                                 save_state(dc, cpu_cond);
> |                                 gen_helper_wrpstate(cpu_tmp0);
> |                                 dc->npc = DYNAMIC_PC;
> |                                 break;
>
> save_state() calls save_npc(), which in turns might call
> gen_generic_branch():
>
> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
> |                                       TCGv r_cond)
> | {
> |     int l1, l2;
> |
> |     l1 = gen_new_label();
> |     l2 = gen_new_label();
> |
> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
> |
> |     tcg_gen_movi_tl(cpu_npc, npc1);
> |     tcg_gen_br(l2);
> |
> |     gen_set_label(l1);
> |     tcg_gen_movi_tl(cpu_npc, npc2);
> |     gen_set_label(l2);
> | }
>
> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>
> The solution is either to rewrite gen_generic_branch() without TCG
> branches, or to use a TCG temp local instead of a TCG temp.

Thanks!

I think the issue is clearer now, and loading into a local temporary
works in this case.
That does not explain why unmodified qemu works when wrpr pstate is not in a
delay slot.
I looked at my Linux kernel builds and do not see any wrpr pstate in a delay slot.

-- 
Kind regards,
Igor V. Kovalenko

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26  3:34                                       ` Igor Kovalenko
@ 2011-04-26 16:26                                         ` Artyom Tarasenko
  2011-04-26 18:07                                           ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-26 16:26 UTC (permalink / raw)
  To: Igor Kovalenko
  Cc: Laurent Desnogues, Blue Swirl, qemu-devel, Aurelien Jarno,
	peter.maydell

On Tue, Apr 26, 2011 at 5:34 AM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 12:29 AM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>> <laurent.desnogues@gmail.com> wrote:
>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>> > <igor.v.kovalenko@gmail.com> wrote:
>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>> >> <laurent.desnogues@gmail.com> wrote:
>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>> >>>>>>> Do you have public test case?
>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>> >>>>>>> be corruption elsewhere.
>>> >>>>
>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>> >>>>
>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>> >>>>
>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>> >>>> needed only because the bios entry point is 0x20).
>>> >>>>
>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>> >>>> test-wrpr.bin -nographic
>>> >>>> Already up-to-date.
>>> >>>> make[1]: Nothing to be done for `all'.
>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>> >>>> Aborted
>>> >>>
>>> >>> The problem seems to be that wrpr is using a non-local
>>> >>> TCG tmp (cpu_tmp0).
>>> >>
>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>> >> The issue appears to be with save_state() call since adding save_state
>>> >> to %pil case provokes the same tcg abort.
>>> >
>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>> > need to be saved across helper calls.  This results in the
>>> > TCG "optimizer" getting rid of it even though it's later used.
>>> > Look at the log and you'll see what I mean :-)
>>>
>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>
>>
>> The problem is not on the TCG side, but on the target-sparc/translate.c
>> side:
>>
>> |                    case 0x32: /* wrwim, V9 wrpr */
>> |                         {
>> |                             if (!supervisor(dc))
>> |                                 goto priv_insn;
>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>> | #ifdef TARGET_SPARC64
>>
>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>> saved across TCG branches.
>>
>> [...]
>>
>> |                             case 6: // pstate
>> |                                 save_state(dc, cpu_cond);
>> |                                 gen_helper_wrpstate(cpu_tmp0);
>> |                                 dc->npc = DYNAMIC_PC;
>> |                                 break;
>>
>> save_state() calls save_npc(), which in turns might call
>> gen_generic_branch():
>>
>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>> |                                       TCGv r_cond)
>> | {
>> |     int l1, l2;
>> |
>> |     l1 = gen_new_label();
>> |     l2 = gen_new_label();
>> |
>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>> |
>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>> |     tcg_gen_br(l2);
>> |
>> |     gen_set_label(l1);
>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>> |     gen_set_label(l2);
>> | }
>>
>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>
>> The solution is either to rewrite gen_generic_branch() without TCG
>> branches, or to use a TCG temp local instead of a TCG temp.
>
> Thanks!
>
> I think the issue is more clear now, and loading to local temporary
> works in this case.
> Does not explain why unmodified qemu works with wrpr pstate not in delay slot.

Because the TCG branch is not generated in save_npc()?
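
A rough sketch of what I mean, as a toy model rather than the real
save_npc(). The enum and the function are hypothetical, and the mapping is
my assumption about the translator: only an npc with two candidate values
(the state left behind by a conditional branch, so only seen in its delay
slot) needs gen_generic_branch(); a known constant npc is a plain move and
an npc already held in cpu_npc needs nothing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the translator's symbolic npc states. */
enum npc_state {
    NPC_STATIC,  /* npc is a constant known at translation time */
    NPC_DYNAMIC, /* npc is already held in cpu_npc */
    NPC_JUMP     /* npc is one of two targets, selected by a condition */
};

/* Returns true when saving npc would have to emit a TCG branch. */
static bool save_npc_emits_branch(enum npc_state npc)
{
    switch (npc) {
    case NPC_JUMP:
        return true;   /* select between two targets: brcond + label */
    case NPC_STATIC:
        return false;  /* a single move of an immediate */
    case NPC_DYNAMIC:
        return false;  /* nothing to emit */
    }
    return false;
}
```

Under that assumption a wrpr pstate outside a delay slot never ends the
basic block before the helper call, so cpu_tmp0 stays live.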

> I looked at my linux kernel builds and do not see any wrpr pstate in delay slot.

Meaning you are not going to fix the bug? ;-)


-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-25 20:29                                     ` Aurelien Jarno
  2011-04-26  3:34                                       ` Igor Kovalenko
@ 2011-04-26 17:02                                       ` Artyom Tarasenko
  2011-04-26 18:36                                         ` Blue Swirl
  1 sibling, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-26 17:02 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Laurent Desnogues, Blue Swirl, qemu-devel, peter.maydell

On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>> <laurent.desnogues@gmail.com> wrote:
>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>> > <igor.v.kovalenko@gmail.com> wrote:
>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>> >> <laurent.desnogues@gmail.com> wrote:
>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>> >>>>>>> Do you have public test case?
>> >>>>>>> It is possible to code this delay slot write test but real issue may
>> >>>>>>> be corruption elsewhere.
>> >>>>
>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>> >>>>
>> >>>>> In theory there could be multiple issues including compiler induced ones.
>> >>>>> I'd prefer to see some kind of reproducible testcase.
>> >>>>
>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>> >>>> needed only because the bios entry point is 0x20).
>> >>>>
>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>> >>>> test-wrpr.bin -nographic
>> >>>> Already up-to-date.
>> >>>> make[1]: Nothing to be done for `all'.
>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>> >>>> Aborted
>> >>>
>> >>> The problem seems to be that wrpr is using a non-local
>> >>> TCG tmp (cpu_tmp0).
>> >>
>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>> >> The issue appears to be with save_state() call since adding save_state
>> >> to %pil case provokes the same tcg abort.
>> >
>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>> > need to be saved across helper calls.  This results in the
>> > TCG "optimizer" getting rid of it even though it's later used.
>> > Look at the log and you'll see what I mean :-)
>>
>> I'm not very comfortable with tcg yet. Would it be possible to teach
>> optimizer working with delay slots? Or do I look in the wrong place.
>>
>
> The problem is not on the TCG side, but on the target-sparc/translate.c
> side:
>
> |                    case 0x32: /* wrwim, V9 wrpr */
> |                         {
> |                             if (!supervisor(dc))
> |                                 goto priv_insn;
> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
> | #ifdef TARGET_SPARC64
>
> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
> saved across TCG branches.
>
> [...]
>
> |                             case 6: // pstate
> |                                 save_state(dc, cpu_cond);
> |                                 gen_helper_wrpstate(cpu_tmp0);
> |                                 dc->npc = DYNAMIC_PC;
> |                                 break;
>
> save_state() calls save_npc(), which in turns might call
> gen_generic_branch():

Hmm. This is not the only instruction using save_state() and cpu_tmp0.
At least ldd is another example.

> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
> |                                       TCGv r_cond)
> | {
> |     int l1, l2;
> |
> |     l1 = gen_new_label();
> |     l2 = gen_new_label();
> |
> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
> |
> |     tcg_gen_movi_tl(cpu_npc, npc1);
> |     tcg_gen_br(l2);
> |
> |     gen_set_label(l1);
> |     tcg_gen_movi_tl(cpu_npc, npc2);
> |     gen_set_label(l2);
> | }
>
> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>
> The solution is either to rewrite gen_generic_branch() without TCG
> branches, or to use a TCG temp local instead of a TCG temp.

You mean something like

                            case 6: // pstate
                                {
                                    TCGv r_temp;

                                    r_temp = tcg_temp_new();
                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
                                    save_state(dc, cpu_cond);
                                    gen_helper_wrpstate(r_temp);
                                    tcg_temp_free(r_temp);
                                    dc->npc = DYNAMIC_PC;
                                }
                                break;

?
This fails with the same error message.
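
A possible reason, sketched as a toy op stream (not QEMU code; every name
in the block is made up): tcg_temp_new() gives another plain temp, so the
copy would die at the same branch as cpu_tmp0. The model assumes that plain
temps die at the end of a basic block while temp locals, which I believe
come from tcg_temp_local_new(), survive it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy op stream, not QEMU code: every name here is made up for illustration. */
enum op_kind { OP_DEF, OP_MOV, OP_BB_END, OP_CALL };

struct op {
    enum op_kind kind;
    int dst; /* temp written by OP_DEF / OP_MOV, otherwise unused */
    int src; /* temp read by OP_MOV / OP_CALL, otherwise unused */
};

/*
 * Walk the ops and return the index of the first op that reads a dead
 * temp, or -1 if none does.  is_local[i] says whether temp i is a local.
 */
static int first_dead_read(const struct op *ops, size_t n_ops,
                           const bool *is_local, int n_temps)
{
    bool live[8] = { false };

    assert(n_temps <= 8);
    for (size_t i = 0; i < n_ops; i++) {
        switch (ops[i].kind) {
        case OP_DEF:
            live[ops[i].dst] = true;
            break;
        case OP_MOV:
            if (!live[ops[i].src]) {
                return (int)i;
            }
            live[ops[i].dst] = true;
            break;
        case OP_BB_END:
            /* plain temps die at a branch or label; locals survive */
            for (int t = 0; t < n_temps; t++) {
                if (!is_local[t]) {
                    live[t] = false;
                }
            }
            break;
        case OP_CALL:
            if (!live[ops[i].src]) {
                return (int)i;
            }
            break;
        }
    }
    return -1;
}
```

With temp 0 standing for cpu_tmp0 and temp 1 for r_temp, the sequence
DEF 0, MOV 1 <- 0, BB_END, CALL 1 reports a dead read at the call when
r_temp is a plain temp, and no dead read when r_temp is marked local.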

Artyom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 16:26                                         ` Artyom Tarasenko
@ 2011-04-26 18:07                                           ` Igor Kovalenko
  0 siblings, 0 replies; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-26 18:07 UTC (permalink / raw)
  To: Artyom Tarasenko
  Cc: Laurent Desnogues, Blue Swirl, qemu-devel, Aurelien Jarno,
	peter.maydell

On Tue, Apr 26, 2011 at 8:26 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 5:34 AM, Igor Kovalenko
> <igor.v.kovalenko@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 12:29 AM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>> <laurent.desnogues@gmail.com> wrote:
>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>> >>>>>>> Do you have public test case?
>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>> >>>>>>> be corruption elsewhere.
>>>> >>>>
>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>> >>>>
>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>> >>>>
>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>> >>>> needed only because the bios entry point is 0x20).
>>>> >>>>
>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>> >>>> test-wrpr.bin -nographic
>>>> >>>> Already up-to-date.
>>>> >>>> make[1]: Nothing to be done for `all'.
>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>> >>>> Aborted
>>>> >>>
>>>> >>> The problem seems to be that wrpr is using a non-local
>>>> >>> TCG tmp (cpu_tmp0).
>>>> >>
>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>> >> The issue appears to be with save_state() call since adding save_state
>>>> >> to %pil case provokes the same tcg abort.
>>>> >
>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>> > need to be saved across helper calls.  This results in the
>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>> > Look at the log and you'll see what I mean :-)
>>>>
>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>
>>>
>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>> side:
>>>
>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>> |                         {
>>> |                             if (!supervisor(dc))
>>> |                                 goto priv_insn;
>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>> | #ifdef TARGET_SPARC64
>>>
>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>> saved across TCG branches.
>>>
>>> [...]
>>>
>>> |                             case 6: // pstate
>>> |                                 save_state(dc, cpu_cond);
>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>> |                                 dc->npc = DYNAMIC_PC;
>>> |                                 break;
>>>
>>> save_state() calls save_npc(), which in turns might call
>>> gen_generic_branch():
>>>
>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>> |                                       TCGv r_cond)
>>> | {
>>> |     int l1, l2;
>>> |
>>> |     l1 = gen_new_label();
>>> |     l2 = gen_new_label();
>>> |
>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>> |
>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>> |     tcg_gen_br(l2);
>>> |
>>> |     gen_set_label(l1);
>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>> |     gen_set_label(l2);
>>> | }
>>>
>>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>>
>>> The solution is either to rewrite gen_generic_branch() without TCG
>>> branches, or to use a TCG temp local instead of a TCG temp.
>>
>> Thanks!
>>
>> I think the issue is more clear now, and loading to local temporary
>> works in this case.
>> Does not explain why unmodified qemu works with wrpr pstate not in delay slot.
>
> Because the TCG branch is not generated in save_npc()?
>
>> I looked at my linux kernel builds and do not see any wrpr pstate in delay slot.
>
> Meaning you are not going to fix the bug? ;-)

More like I need to know where the bug is,
because there is no issue when wrpr is not in a delay slot.

-- 
Kind regards,
Igor V. Kovalenko


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 17:02                                       ` Artyom Tarasenko
@ 2011-04-26 18:36                                         ` Blue Swirl
  2011-04-26 19:07                                           ` Igor Kovalenko
  2011-04-27 16:29                                           ` Artyom Tarasenko
  0 siblings, 2 replies; 32+ messages in thread
From: Blue Swirl @ 2011-04-26 18:36 UTC (permalink / raw)
  To: Artyom Tarasenko
  Cc: Laurent Desnogues, peter.maydell, qemu-devel, Aurelien Jarno

On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>> <laurent.desnogues@gmail.com> wrote:
>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>> > <igor.v.kovalenko@gmail.com> wrote:
>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>> >> <laurent.desnogues@gmail.com> wrote:
>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>> >>>>>>> Do you have public test case?
>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>> >>>>>>> be corruption elsewhere.
>>> >>>>
>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>> >>>>
>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>> >>>>
>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>> >>>> needed only because the bios entry point is 0x20).
>>> >>>>
>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>> >>>> test-wrpr.bin -nographic
>>> >>>> Already up-to-date.
>>> >>>> make[1]: Nothing to be done for `all'.
>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>> >>>> Aborted
>>> >>>
>>> >>> The problem seems to be that wrpr is using a non-local
>>> >>> TCG tmp (cpu_tmp0).
>>> >>
>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>> >> The issue appears to be with save_state() call since adding save_state
>>> >> to %pil case provokes the same tcg abort.
>>> >
>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>> > need to be saved across helper calls.  This results in the
>>> > TCG "optimizer" getting rid of it even though it's later used.
>>> > Look at the log and you'll see what I mean :-)
>>>
>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>
>>
>> The problem is not on the TCG side, but on the target-sparc/translate.c
>> side:
>>
>> |                    case 0x32: /* wrwim, V9 wrpr */
>> |                         {
>> |                             if (!supervisor(dc))
>> |                                 goto priv_insn;
>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>> | #ifdef TARGET_SPARC64
>>
>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>> saved across TCG branches.
>>
>> [...]
>>
>> |                             case 6: // pstate
>> |                                 save_state(dc, cpu_cond);
>> |                                 gen_helper_wrpstate(cpu_tmp0);
>> |                                 dc->npc = DYNAMIC_PC;
>> |                                 break;
>>
>> save_state() calls save_npc(), which in turns might call
>> gen_generic_branch():
>
> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
> At least ldd is another example.
>
>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>> |                                       TCGv r_cond)
>> | {
>> |     int l1, l2;
>> |
>> |     l1 = gen_new_label();
>> |     l2 = gen_new_label();
>> |
>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>> |
>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>> |     tcg_gen_br(l2);
>> |
>> |     gen_set_label(l1);
>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>> |     gen_set_label(l2);
>> | }
>>
>> And here is the TCG branch, which drop the TCG temp cpu_temp0.
>>
>> The solution is either to rewrite gen_generic_branch() without TCG
>> branches, or to use a TCG temp local instead of a TCG temp.
>
> You mean something like
>
>                            case 6: // pstate
>                                {
>                                    TCGv r_temp;
>
>                                    r_temp = tcg_temp_new();
>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>                                    save_state(dc, cpu_cond);
>                                    gen_helper_wrpstate(r_temp);
>                                    tcg_temp_free(r_temp);
>                                    dc->npc = DYNAMIC_PC;
>                                }
>                                break;
>
> ?
> This fails with the same error message.

Close, but you need to use tcg_temp_local_new(). Does this work?

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 3c958b2..52fa2f1 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
                                 break;
                             case 6: // pstate
-                                save_state(dc, cpu_cond);
-                                gen_helper_wrpstate(cpu_tmp0);
-                                dc->npc = DYNAMIC_PC;
+                                {
+                                    TCGv r_tmp = tcg_temp_local_new();
+
+                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
+                                    save_state(dc, cpu_cond);
+                                    gen_helper_wrpstate(r_tmp);
+                                    tcg_temp_free_ptr(r_tmp);
+                                    dc->npc = DYNAMIC_PC;
+                                }
                                 break;
                             case 7: // tl
                                 save_state(dc, cpu_cond);
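
To illustrate why copying into a plain temp fails the same way while a
local temp works, here is a toy model. It is only a sketch: every toy_*
name is hypothetical and none of this is QEMU's TCG API. It assumes just
the rule discussed in this thread, that a plain temp loses its value at
the end of a basic block while a local temp keeps it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not QEMU's TCG API. */
typedef struct {
    bool local;  /* models tcg_temp_local_new() (true) vs tcg_temp_new() (false) */
    bool valid;  /* does the temp still hold its value? */
} ToyTemp;

static ToyTemp toy_temp_new(bool local)
{
    ToyTemp t = { .local = local, .valid = false };
    return t;
}

/* Writing to a temp gives it a value. */
static void toy_write(ToyTemp *t)
{
    t->valid = true;
}

/* A branch or label ends the basic block: only local temps keep their value. */
static void toy_end_basic_block(ToyTemp *t)
{
    if (!t->local) {
        t->valid = false;
    }
}

/* Returns true if a helper called after the branch would see a valid argument. */
static bool toy_copy_survives_branch(bool local)
{
    ToyTemp r_tmp = toy_temp_new(local);

    toy_write(&r_tmp);            /* models tcg_gen_mov_tl(r_tmp, cpu_tmp0) */
    toy_end_basic_block(&r_tmp);  /* models the brcond emitted via save_state() */
    return r_tmp.valid;           /* models gen_helper_wrpstate(r_tmp) */
}

/* The two cases from this thread, expressed as checks on the toy model. */
static void toy_self_check_temps(void)
{
    assert(!toy_copy_survives_branch(false));  /* plain temp: value is gone */
    assert(toy_copy_survives_branch(true));    /* local temp: value survives */
}
```

In this model the kind of temp is the only thing that changes the
outcome, which matches the observation that the tcg_temp_new() variant
aborts and the tcg_temp_local_new() variant does not.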


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 18:36                                         ` Blue Swirl
@ 2011-04-26 19:07                                           ` Igor Kovalenko
  2011-04-26 20:07                                             ` Blue Swirl
  2011-04-27 16:29                                           ` Artyom Tarasenko
  1 sibling, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-26 19:07 UTC (permalink / raw)
  To: Blue Swirl
  Cc: Laurent Desnogues, peter.maydell, Aurelien Jarno,
	Artyom Tarasenko, qemu-devel

On Tue, Apr 26, 2011 at 10:36 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>> <laurent.desnogues@gmail.com> wrote:
>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>> >>>>>>> Do you have public test case?
>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>> >>>>>>> be corruption elsewhere.
>>>> >>>>
>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>> >>>>
>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>> >>>>
>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>> >>>> needed only because the bios entry point is 0x20).
>>>> >>>>
>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>> >>>> test-wrpr.bin -nographic
>>>> >>>> Already up-to-date.
>>>> >>>> make[1]: Nothing to be done for `all'.
>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>> >>>> Aborted
>>>> >>>
>>>> >>> The problem seems to be that wrpr is using a non-local
>>>> >>> TCG tmp (cpu_tmp0).
>>>> >>
>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>> >> The issue appears to be with save_state() call since adding save_state
>>>> >> to %pil case provokes the same tcg abort.
>>>> >
>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>> > need to be saved across helper calls.  This results in the
>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>> > Look at the log and you'll see what I mean :-)
>>>>
>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>
>>>
>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>> side:
>>>
>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>> |                         {
>>> |                             if (!supervisor(dc))
>>> |                                 goto priv_insn;
>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>> | #ifdef TARGET_SPARC64
>>>
>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>> saved across TCG branches.
>>>
>>> [...]
>>>
>>> |                             case 6: // pstate
>>> |                                 save_state(dc, cpu_cond);
>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>> |                                 dc->npc = DYNAMIC_PC;
>>> |                                 break;
>>>
>>> save_state() calls save_npc(), which in turns might call
>>> gen_generic_branch():
>>
>> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
>> At least ldd is another example.
>>
>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>> |                                       TCGv r_cond)
>>> | {
>>> |     int l1, l2;
>>> |
>>> |     l1 = gen_new_label();
>>> |     l2 = gen_new_label();
>>> |
>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>> |
>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>> |     tcg_gen_br(l2);
>>> |
>>> |     gen_set_label(l1);
>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>> |     gen_set_label(l2);
>>> | }
>>>
>>> And here is the TCG branch, which drop the TCG temp cpu_temp0.
>>>
>>> The solution is either to rewrite gen_generic_branch() without TCG
>>> branches, or to use a TCG temp local instead of a TCG temp.
>>
>> You mean something like
>>
>>                            case 6: // pstate
>>                                {
>>                                    TCGv r_temp;
>>
>>                                    r_temp = tcg_temp_new();
>>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>>                                    save_state(dc, cpu_cond);
>>                                    gen_helper_wrpstate(r_temp);
>>                                    tcg_temp_free(r_temp);
>>                                    dc->npc = DYNAMIC_PC;
>>                                }
>>                                break;
>>
>> ?
>> This fails with the same error message.
>
> Close, but you need to use tcg_temp_local_new(). Does this work?
>
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index 3c958b2..52fa2f1 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
>                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
>                                 break;
>                             case 6: // pstate
> -                                save_state(dc, cpu_cond);
> -                                gen_helper_wrpstate(cpu_tmp0);
> -                                dc->npc = DYNAMIC_PC;
> +                                {
> +                                    TCGv r_tmp = tcg_temp_local_new();
> +
> +                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
> +                                    save_state(dc, cpu_cond);
> +                                    gen_helper_wrpstate(r_tmp);
> +                                    tcg_temp_free_ptr(r_tmp);
> +                                    dc->npc = DYNAMIC_PC;
> +                                }
>                                 break;
>                             case 7: // tl
>                                 save_state(dc, cpu_cond);
>

This one works (I have tested a similar change, but tcg_temp_free() would
be OK as well).
Could anyone please explain why it still works when wrpr is not in a delay
slot? Simply pointing at the optimizer does not explain it :(

-- 
Kind regards,
Igor V. Kovalenko


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 19:07                                           ` Igor Kovalenko
@ 2011-04-26 20:07                                             ` Blue Swirl
  2011-04-26 21:35                                               ` Igor Kovalenko
  0 siblings, 1 reply; 32+ messages in thread
From: Blue Swirl @ 2011-04-26 20:07 UTC (permalink / raw)
  To: Igor Kovalenko
  Cc: Laurent Desnogues, peter.maydell, Aurelien Jarno,
	Artyom Tarasenko, qemu-devel

On Tue, Apr 26, 2011 at 10:07 PM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 10:36 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>>> <laurent.desnogues@gmail.com> wrote:
>>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>>> >>>>>>> Do you have public test case?
>>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>>> >>>>>>> be corruption elsewhere.
>>>>> >>>>
>>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>>> >>>>
>>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>>> >>>>
>>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>>> >>>> needed only because the bios entry point is 0x20).
>>>>> >>>>
>>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>>> >>>> test-wrpr.bin -nographic
>>>>> >>>> Already up-to-date.
>>>>> >>>> make[1]: Nothing to be done for `all'.
>>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>>> >>>> Aborted
>>>>> >>>
>>>>> >>> The problem seems to be that wrpr is using a non-local
>>>>> >>> TCG tmp (cpu_tmp0).
>>>>> >>
>>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>>> >> The issue appears to be with save_state() call since adding save_state
>>>>> >> to %pil case provokes the same tcg abort.
>>>>> >
>>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>>> > need to be saved across helper calls.  This results in the
>>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>>> > Look at the log and you'll see what I mean :-)
>>>>>
>>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>>
>>>>
>>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>>> side:
>>>>
>>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>>> |                         {
>>>> |                             if (!supervisor(dc))
>>>> |                                 goto priv_insn;
>>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>>> | #ifdef TARGET_SPARC64
>>>>
>>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>>> saved across TCG branches.
>>>>
>>>> [...]
>>>>
>>>> |                             case 6: // pstate
>>>> |                                 save_state(dc, cpu_cond);
>>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>>> |                                 dc->npc = DYNAMIC_PC;
>>>> |                                 break;
>>>>
>>>> save_state() calls save_npc(), which in turns might call
>>>> gen_generic_branch():
>>>
>>> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
>>> At least ldd is another example.
>>>
>>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>>> |                                       TCGv r_cond)
>>>> | {
>>>> |     int l1, l2;
>>>> |
>>>> |     l1 = gen_new_label();
>>>> |     l2 = gen_new_label();
>>>> |
>>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>>> |
>>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>>> |     tcg_gen_br(l2);
>>>> |
>>>> |     gen_set_label(l1);
>>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>>> |     gen_set_label(l2);
>>>> | }
>>>>
>>>> And here is the TCG branch, which drop the TCG temp cpu_temp0.
>>>>
>>>> The solution is either to rewrite gen_generic_branch() without TCG
>>>> branches, or to use a TCG temp local instead of a TCG temp.
>>>
>>> You mean something like
>>>
>>>                            case 6: // pstate
>>>                                {
>>>                                    TCGv r_temp;
>>>
>>>                                    r_temp = tcg_temp_new();
>>>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>>>                                    save_state(dc, cpu_cond);
>>>                                    gen_helper_wrpstate(r_temp);
>>>                                    tcg_temp_free(r_temp);
>>>                                    dc->npc = DYNAMIC_PC;
>>>                                }
>>>                                break;
>>>
>>> ?
>>> This fails with the same error message.
>>
>> Close, but you need to use tcg_temp_local_new(). Does this work?
>>
>> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
>> index 3c958b2..52fa2f1 100644
>> --- a/target-sparc/translate.c
>> +++ b/target-sparc/translate.c
>> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
>>                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
>>                                 break;
>>                             case 6: // pstate
>> -                                save_state(dc, cpu_cond);
>> -                                gen_helper_wrpstate(cpu_tmp0);
>> -                                dc->npc = DYNAMIC_PC;
>> +                                {
>> +                                    TCGv r_tmp = tcg_temp_local_new();
>> +
>> +                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
>> +                                    save_state(dc, cpu_cond);
>> +                                    gen_helper_wrpstate(r_tmp);
>> +                                    tcg_temp_free_ptr(r_tmp);
>> +                                    dc->npc = DYNAMIC_PC;
>> +                                }
>>                                 break;
>>                             case 7: // tl
>>                                 save_state(dc, cpu_cond);
>>
>
> This one works (I have tested similar change but tcg_temp_free() would
> be OK as well)

Thanks for spotting, that was not intentional.

> Could anyone please explain why it still works without being in delay
> slot? Simply pointing at optimizer does not work :(

save_npc() calls gen_generic_branch() only if dc->npc == JUMP_PC,
which is used for delay slot handling. Otherwise gen_generic_branch()
is not called and thus brcond is not used, so the temporary
is not dropped in the non-delay-slot cases.
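
As a rough sketch of that condition (a toy model, not the translate.c
code: the toy_* names and the enum values are made-up placeholders), the
only npc state that makes save_npc() emit a branch is JUMP_PC, so that is
the only state in which a plain temp written earlier becomes unusable:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the translator's npc states; values are placeholders. */
enum ToyNpc { TOY_STATIC_NPC, TOY_DYNAMIC_PC, TOY_JUMP_PC };

/* Models save_npc(): a branch is generated only for the JUMP_PC state. */
static bool toy_save_npc_emits_branch(enum ToyNpc npc)
{
    return npc == TOY_JUMP_PC;
}

/* True if a plain (non-local) temp written before save_state() is still usable after it. */
static bool toy_plain_temp_usable(enum ToyNpc npc)
{
    return !toy_save_npc_emits_branch(npc);
}

/* The delay-slot and non-delay-slot cases, expressed as checks on the toy model. */
static void toy_self_check_npc(void)
{
    assert(toy_plain_temp_usable(TOY_STATIC_NPC));  /* wrpr outside a delay slot */
    assert(toy_plain_temp_usable(TOY_DYNAMIC_PC));
    assert(!toy_plain_temp_usable(TOY_JUMP_PC));    /* wrpr in a branch delay slot */
}
```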


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 20:07                                             ` Blue Swirl
@ 2011-04-26 21:35                                               ` Igor Kovalenko
  2011-04-27 17:40                                                 ` Blue Swirl
  0 siblings, 1 reply; 32+ messages in thread
From: Igor Kovalenko @ 2011-04-26 21:35 UTC (permalink / raw)
  To: Blue Swirl
  Cc: Laurent Desnogues, peter.maydell, Aurelien Jarno,
	Artyom Tarasenko, qemu-devel

On Wed, Apr 27, 2011 at 12:07 AM, Blue Swirl <blauwirbel@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 10:07 PM, Igor Kovalenko
> <igor.v.kovalenko@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 10:36 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>>>> <laurent.desnogues@gmail.com> wrote:
>>>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>>>> >>>>>>> Do you have public test case?
>>>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>>>> >>>>>>> be corruption elsewhere.
>>>>>> >>>>
>>>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>>>> >>>>
>>>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>>>> >>>>
>>>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>>>> >>>> needed only because the bios entry point is 0x20).
>>>>>> >>>>
>>>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>>>> >>>> test-wrpr.bin -nographic
>>>>>> >>>> Already up-to-date.
>>>>>> >>>> make[1]: Nothing to be done for `all'.
>>>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>>>> >>>> Aborted
>>>>>> >>>
>>>>>> >>> The problem seems to be that wrpr is using a non-local
>>>>>> >>> TCG tmp (cpu_tmp0).
>>>>>> >>
>>>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>>>> >> The issue appears to be with save_state() call since adding save_state
>>>>>> >> to %pil case provokes the same tcg abort.
>>>>>> >
>>>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>>>> > need to be saved across helper calls.  This results in the
>>>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>>>> > Look at the log and you'll see what I mean :-)
>>>>>>
>>>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>>>
>>>>>
>>>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>>>> side:
>>>>>
>>>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>>>> |                         {
>>>>> |                             if (!supervisor(dc))
>>>>> |                                 goto priv_insn;
>>>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>>>> | #ifdef TARGET_SPARC64
>>>>>
>>>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>>>> saved across TCG branches.
>>>>>
>>>>> [...]
>>>>>
>>>>> |                             case 6: // pstate
>>>>> |                                 save_state(dc, cpu_cond);
>>>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>>>> |                                 dc->npc = DYNAMIC_PC;
>>>>> |                                 break;
>>>>>
>>>>> save_state() calls save_npc(), which in turns might call
>>>>> gen_generic_branch():
>>>>
>>>> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
>>>> At least ldd is another example.
>>>>
>>>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>>>> |                                       TCGv r_cond)
>>>>> | {
>>>>> |     int l1, l2;
>>>>> |
>>>>> |     l1 = gen_new_label();
>>>>> |     l2 = gen_new_label();
>>>>> |
>>>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>>>> |
>>>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>>>> |     tcg_gen_br(l2);
>>>>> |
>>>>> |     gen_set_label(l1);
>>>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>>>> |     gen_set_label(l2);
>>>>> | }
>>>>>
>>>>> And here is the TCG branch, which drop the TCG temp cpu_temp0.
>>>>>
>>>>> The solution is either to rewrite gen_generic_branch() without TCG
>>>>> branches, or to use a TCG temp local instead of a TCG temp.
>>>>
>>>> You mean something like
>>>>
>>>>                            case 6: // pstate
>>>>                                {
>>>>                                    TCGv r_temp;
>>>>
>>>>                                    r_temp = tcg_temp_new();
>>>>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>>>>                                    save_state(dc, cpu_cond);
>>>>                                    gen_helper_wrpstate(r_temp);
>>>>                                    tcg_temp_free(r_temp);
>>>>                                    dc->npc = DYNAMIC_PC;
>>>>                                }
>>>>                                break;
>>>>
>>>> ?
>>>> This fails with the same error message.
>>>
>>> Close, but you need to use tcg_temp_local_new(). Does this work?
>>>
>>> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
>>> index 3c958b2..52fa2f1 100644
>>> --- a/target-sparc/translate.c
>>> +++ b/target-sparc/translate.c
>>> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
>>>                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
>>>                                 break;
>>>                             case 6: // pstate
>>> -                                save_state(dc, cpu_cond);
>>> -                                gen_helper_wrpstate(cpu_tmp0);
>>> -                                dc->npc = DYNAMIC_PC;
>>> +                                {
>>> +                                    TCGv r_tmp = tcg_temp_local_new();
>>> +
>>> +                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
>>> +                                    save_state(dc, cpu_cond);
>>> +                                    gen_helper_wrpstate(r_tmp);
>>> +                                    tcg_temp_free_ptr(r_tmp);
>>> +                                    dc->npc = DYNAMIC_PC;
>>> +                                }
>>>                                 break;
>>>                             case 7: // tl
>>>                                 save_state(dc, cpu_cond);
>>>
>>
>> This one works (I have tested similar change but tcg_temp_free() would
>> be OK as well)
>
> Thanks for spotting, that was not intentional.
>
>> Could anyone please explain why it still works without being in delay
>> slot? Simply pointing at optimizer does not work :(
>
> save_npc() will call gen_generic_branch() only if dc->npc == JUMP_PC,
> which is used for delay slot handling. Otherwise gen_generic_branch()
> is not called and thus  brcond does not get used, so the temporary
> will not be flushed for non-delay slot cases.

Great, that explains it all, thanks! I think an audit of translate.c
would be needed; alternatively, all main switch temps for sparc64 could
be made local, as hinted by Laurent, provided there is no significant
speed impact.

-- 
Kind regards,
Igor V. Kovalenko


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 18:36                                         ` Blue Swirl
  2011-04-26 19:07                                           ` Igor Kovalenko
@ 2011-04-27 16:29                                           ` Artyom Tarasenko
  2011-04-27 17:41                                             ` Blue Swirl
  1 sibling, 1 reply; 32+ messages in thread
From: Artyom Tarasenko @ 2011-04-27 16:29 UTC (permalink / raw)
  To: Blue Swirl; +Cc: Laurent Desnogues, peter.maydell, qemu-devel, Aurelien Jarno

On Tue, Apr 26, 2011 at 8:36 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>> <laurent.desnogues@gmail.com> wrote:
>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>> >>>>>>> Do you have public test case?
>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>> >>>>>>> be corruption elsewhere.
>>>> >>>>
>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>> >>>>
>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>> >>>>
>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>> >>>> needed only because the bios entry point is 0x20).
>>>> >>>>
>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>> >>>> test-wrpr.bin -nographic
>>>> >>>> Already up-to-date.
>>>> >>>> make[1]: Nothing to be done for `all'.
>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>> >>>> Aborted
>>>> >>>
>>>> >>> The problem seems to be that wrpr is using a non-local
>>>> >>> TCG tmp (cpu_tmp0).
>>>> >>
>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>> >> The issue appears to be with save_state() call since adding save_state
>>>> >> to %pil case provokes the same tcg abort.
>>>> >
>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>> > need to be saved across helper calls.  This results in the
>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>> > Look at the log and you'll see what I mean :-)
>>>>
>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>
>>>
>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>> side:
>>>
>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>> |                         {
>>> |                             if (!supervisor(dc))
>>> |                                 goto priv_insn;
>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>> | #ifdef TARGET_SPARC64
>>>
>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>> saved across TCG branches.
>>>
>>> [...]
>>>
>>> |                             case 6: // pstate
>>> |                                 save_state(dc, cpu_cond);
>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>> |                                 dc->npc = DYNAMIC_PC;
>>> |                                 break;
>>>
>>> save_state() calls save_npc(), which in turn might call
>>> gen_generic_branch():
>>
>> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
>> At least ldd is another example.
>>
>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>> |                                       TCGv r_cond)
>>> | {
>>> |     int l1, l2;
>>> |
>>> |     l1 = gen_new_label();
>>> |     l2 = gen_new_label();
>>> |
>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>> |
>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>> |     tcg_gen_br(l2);
>>> |
>>> |     gen_set_label(l1);
>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>> |     gen_set_label(l2);
>>> | }
>>>
>>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>>
>>> The solution is either to rewrite gen_generic_branch() without TCG
>>> branches, or to use a TCG temp local instead of a TCG temp.
>>
>> You mean something like
>>
>>                            case 6: // pstate
>>                                {
>>                                    TCGv r_temp;
>>
>>                                    r_temp = tcg_temp_new();
>>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>>                                    save_state(dc, cpu_cond);
>>                                    gen_helper_wrpstate(r_temp);
>>                                    tcg_temp_free(r_temp);
>>                                    dc->npc = DYNAMIC_PC;
>>                                }
>>                                break;
>>
>> ?
>> This fails with the same error message.
>
> Close, but you need to use tcg_temp_local_new(). Does this work?
>
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index 3c958b2..52fa2f1 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
>                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
>                                 break;
>                             case 6: // pstate
> -                                save_state(dc, cpu_cond);
> -                                gen_helper_wrpstate(cpu_tmp0);
> -                                dc->npc = DYNAMIC_PC;
> +                                {
> +                                    TCGv r_tmp = tcg_temp_local_new();
> +
> +                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
> +                                    save_state(dc, cpu_cond);
> +                                    gen_helper_wrpstate(r_tmp);
> +                                    tcg_temp_free_ptr(r_tmp);
> +                                    dc->npc = DYNAMIC_PC;
> +                                }
>                                 break;
>                             case 7: // tl
>                                 save_state(dc, cpu_cond);
>

Yep. This (with tcg_temp_free) passes my tests, thanks!
A bonus question: how is the immediate case handled?

I don't see any explicit "if (IS_IMM)" and yet it seems to produce the
expected results for the immediate instructions.

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-26 21:35                                               ` Igor Kovalenko
@ 2011-04-27 17:40                                                 ` Blue Swirl
  0 siblings, 0 replies; 32+ messages in thread
From: Blue Swirl @ 2011-04-27 17:40 UTC (permalink / raw)
  To: Igor Kovalenko
  Cc: Laurent Desnogues, peter.maydell, Aurelien Jarno,
	Artyom Tarasenko, qemu-devel

On Wed, Apr 27, 2011 at 12:35 AM, Igor Kovalenko
<igor.v.kovalenko@gmail.com> wrote:
> On Wed, Apr 27, 2011 at 12:07 AM, Blue Swirl <blauwirbel@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 10:07 PM, Igor Kovalenko
>> <igor.v.kovalenko@gmail.com> wrote:
>>> On Tue, Apr 26, 2011 at 10:36 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>>>> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>>>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>>>>> <laurent.desnogues@gmail.com> wrote:
>>>>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>>>>> >>>>>>> Do you have public test case?
>>>>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>>>>> >>>>>>> be corruption elsewhere.
>>>>>>> >>>>
>>>>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>>>>> >>>>
>>>>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>>>>> >>>>
>>>>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>>>>> >>>> needed only because the bios entry point is 0x20).
>>>>>>> >>>>
>>>>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>>>>> >>>> test-wrpr.bin -nographic
>>>>>>> >>>> Already up-to-date.
>>>>>>> >>>> make[1]: Nothing to be done for `all'.
>>>>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>>>>> >>>> Aborted
>>>>>>> >>>
>>>>>>> >>> The problem seems to be that wrpr is using a non-local
>>>>>>> >>> TCG tmp (cpu_tmp0).
>>>>>>> >>
>>>>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>>>>> >> The issue appears to be with save_state() call since adding save_state
>>>>>>> >> to %pil case provokes the same tcg abort.
>>>>>>> >
>>>>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>>>>> > need to be saved across helper calls.  This results in the
>>>>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>>>>> > Look at the log and you'll see what I mean :-)
>>>>>>>
>>>>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>>>>
>>>>>>
>>>>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>>>>> side:
>>>>>>
>>>>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>>>>> |                         {
>>>>>> |                             if (!supervisor(dc))
>>>>>> |                                 goto priv_insn;
>>>>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>>>>> | #ifdef TARGET_SPARC64
>>>>>>
>>>>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>>>>> saved across TCG branches.
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> |                             case 6: // pstate
>>>>>> |                                 save_state(dc, cpu_cond);
>>>>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>>>>> |                                 dc->npc = DYNAMIC_PC;
>>>>>> |                                 break;
>>>>>>
>>>>>> save_state() calls save_npc(), which in turn might call
>>>>>> gen_generic_branch():
>>>>>
>>>>> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
>>>>> At least ldd is another example.
>>>>>
>>>>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>>>>> |                                       TCGv r_cond)
>>>>>> | {
>>>>>> |     int l1, l2;
>>>>>> |
>>>>>> |     l1 = gen_new_label();
>>>>>> |     l2 = gen_new_label();
>>>>>> |
>>>>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>>>>> |
>>>>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>>>>> |     tcg_gen_br(l2);
>>>>>> |
>>>>>> |     gen_set_label(l1);
>>>>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>>>>> |     gen_set_label(l2);
>>>>>> | }
>>>>>>
>>>>>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>>>>>
>>>>>> The solution is either to rewrite gen_generic_branch() without TCG
>>>>>> branches, or to use a TCG temp local instead of a TCG temp.
>>>>>
>>>>> You mean something like
>>>>>
>>>>>                            case 6: // pstate
>>>>>                                {
>>>>>                                    TCGv r_temp;
>>>>>
>>>>>                                    r_temp = tcg_temp_new();
>>>>>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>>>>>                                    save_state(dc, cpu_cond);
>>>>>                                    gen_helper_wrpstate(r_temp);
>>>>>                                    tcg_temp_free(r_temp);
>>>>>                                    dc->npc = DYNAMIC_PC;
>>>>>                                }
>>>>>                                break;
>>>>>
>>>>> ?
>>>>> This fails with the same error message.
>>>>
>>>> Close, but you need to use tcg_temp_local_new(). Does this work?
>>>>
>>>> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
>>>> index 3c958b2..52fa2f1 100644
>>>> --- a/target-sparc/translate.c
>>>> +++ b/target-sparc/translate.c
>>>> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
>>>>                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
>>>>                                 break;
>>>>                             case 6: // pstate
>>>> -                                save_state(dc, cpu_cond);
>>>> -                                gen_helper_wrpstate(cpu_tmp0);
>>>> -                                dc->npc = DYNAMIC_PC;
>>>> +                                {
>>>> +                                    TCGv r_tmp = tcg_temp_local_new();
>>>> +
>>>> +                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
>>>> +                                    save_state(dc, cpu_cond);
>>>> +                                    gen_helper_wrpstate(r_tmp);
>>>> +                                    tcg_temp_free_ptr(r_tmp);
>>>> +                                    dc->npc = DYNAMIC_PC;
>>>> +                                }
>>>>                                 break;
>>>>                             case 7: // tl
>>>>                                 save_state(dc, cpu_cond);
>>>>
>>>
>>> This one works (I have tested similar change but tcg_temp_free() would
>>> be OK as well)
>>
>> Thanks for spotting, that was not intentional.
>>
>>> Could anyone please explain why it still works without being in delay
>>> slot? Simply pointing at optimizer does not work :(
>>
>> save_npc() will call gen_generic_branch() only if dc->npc == JUMP_PC,
>> which is used for delay slot handling. Otherwise gen_generic_branch()
>> is not called and thus  brcond does not get used, so the temporary
>> will not be flushed for non-delay slot cases.
>
> Great that explains it all, thanks! I think audit over translate.c
> would be needed, alternatively all main switch temps for sparc64 could
> be made local as hinted by Laurent - provided there is no significant
> speed impact.

I think there may be a few other cases. Local temps are allocated on the
stack instead of in CPU registers, so there could be a speed impact.

It would be better if this case could be detected more reliably
instead, for example by enhancing the debugging mode for TCG.
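
The behaviour discussed above can be illustrated with a toy model. This
is a sketch, not QEMU code: the enums, the struct and all function names
are hypothetical, and the real TCG register allocator is considerably
more involved. It assumes only what this thread states: a normal temp
does not survive a TCG branch, a local temp does, and the branch is
emitted only when dc->npc == JUMP_PC (the delay slot case).

```c
#include <stdbool.h>

/* Toy model of temp lifetime; every name here is hypothetical. */
enum temp_kind { TEMP_NORMAL, TEMP_LOCAL };
enum val_state { VAL_DEAD, VAL_REG, VAL_MEM };

struct temp {
    enum temp_kind kind;
    enum val_state state;
};

/* Writing a temp leaves its value in a (modelled) host register. */
static void temp_write(struct temp *t)
{
    t->state = VAL_REG;
}

/* Basic-block boundary (branch or label): live local temps are spilled
 * to their stack slot, live normal temps are simply forgotten. */
static void bb_end(struct temp *temps, int n)
{
    for (int i = 0; i < n; i++) {
        if (temps[i].state == VAL_DEAD) {
            continue;
        }
        temps[i].state = (temps[i].kind == TEMP_LOCAL) ? VAL_MEM : VAL_DEAD;
    }
}

/* A temp can be used as a helper argument only if it is not dead. */
static bool temp_readable(const struct temp *t)
{
    return t->state != VAL_DEAD;
}

/* Models the "case 6: pstate" sequence: compute the value, optionally
 * emit the branch that save_npc() generates for JUMP_PC, then pass the
 * temp to the helper. Returns false where the model would hit the
 * fatal error. */
static bool translate_wrpstate(enum temp_kind kind, bool npc_is_jump_pc)
{
    struct temp t = { kind, VAL_DEAD };

    temp_write(&t);            /* xor of src1 and src2 into the temp */
    if (npc_is_jump_pc) {
        bb_end(&t, 1);         /* brcond from the generic branch */
    }
    return temp_readable(&t);  /* helper call consumes the temp */
}
```

In this model translate_wrpstate(TEMP_NORMAL, true) returns false, which
corresponds to the abort in the delay slot case. The same call with
TEMP_LOCAL, or with npc_is_jump_pc set to false, returns true.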


* Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
  2011-04-27 16:29                                           ` Artyom Tarasenko
@ 2011-04-27 17:41                                             ` Blue Swirl
  0 siblings, 0 replies; 32+ messages in thread
From: Blue Swirl @ 2011-04-27 17:41 UTC (permalink / raw)
  To: Artyom Tarasenko
  Cc: Laurent Desnogues, peter.maydell, qemu-devel, Aurelien Jarno

On Wed, Apr 27, 2011 at 7:29 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 8:36 PM, Blue Swirl <blauwirbel@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>>> <laurent.desnogues@gmail.com> wrote:
>>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>>> > <igor.v.kovalenko@gmail.com> wrote:
>>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>>> >> <laurent.desnogues@gmail.com> wrote:
>>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko <atar4qemu@gmail.com> wrote:
>>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>>> >>>> <igor.v.kovalenko@gmail.com> wrote:
>>>>> >>>>>>> Do you have public test case?
>>>>> >>>>>>> It is possible to code this delay slot write test but real issue may
>>>>> >>>>>>> be corruption elsewhere.
>>>>> >>>>
>>>>> >>>> The test case is trivial: it's just the two instructions, branch and wrpr.
>>>>> >>>>
>>>>> >>>>> In theory there could be multiple issues including compiler induced ones.
>>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>>> >>>>
>>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>>> >>>> needed only because the bios entry point is 0x20).
>>>>> >>>>
>>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios
>>>>> >>>> test-wrpr.bin -nographic
>>>>> >>>> Already up-to-date.
>>>>> >>>> make[1]: Nothing to be done for `all'.
>>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>>> >>>> Aborted
>>>>> >>>
>>>>> >>> The problem seems to be that wrpr is using a non-local
>>>>> >>> TCG tmp (cpu_tmp0).
>>>>> >>
>>>>> >> Just tried the test case with write to %pil - seems like write itself is OK.
>>>>> >> The issue appears to be with save_state() call since adding save_state
>>>>> >> to %pil case provokes the same tcg abort.
>>>>> >
>>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>>> > need to be saved across helper calls.  This results in the
>>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>>> > Look at the log and you'll see what I mean :-)
>>>>>
>>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>>
>>>>
>>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>>> side:
>>>>
>>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>>> |                         {
>>>> |                             if (!supervisor(dc))
>>>> |                                 goto priv_insn;
>>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>>> | #ifdef TARGET_SPARC64
>>>>
>>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>>> saved across TCG branches.
>>>>
>>>> [...]
>>>>
>>>> |                             case 6: // pstate
>>>> |                                 save_state(dc, cpu_cond);
>>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>>> |                                 dc->npc = DYNAMIC_PC;
>>>> |                                 break;
>>>>
>>>> save_state() calls save_npc(), which in turn might call
>>>> gen_generic_branch():
>>>
>>> Hmm. This is not the only instruction using save_state() and cpu_tmp0.
>>> At least ldd is another example.
>>>
>>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>>> |                                       TCGv r_cond)
>>>> | {
>>>> |     int l1, l2;
>>>> |
>>>> |     l1 = gen_new_label();
>>>> |     l2 = gen_new_label();
>>>> |
>>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>>> |
>>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>>> |     tcg_gen_br(l2);
>>>> |
>>>> |     gen_set_label(l1);
>>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>>> |     gen_set_label(l2);
>>>> | }
>>>>
>>>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>>>
>>>> The solution is either to rewrite gen_generic_branch() without TCG
>>>> branches, or to use a TCG temp local instead of a TCG temp.
>>>
>>> You mean something like
>>>
>>>                            case 6: // pstate
>>>                                {
>>>                                    TCGv r_temp;
>>>
>>>                                    r_temp = tcg_temp_new();
>>>                                    tcg_gen_mov_tl(r_temp, cpu_tmp0);
>>>                                    save_state(dc, cpu_cond);
>>>                                    gen_helper_wrpstate(r_temp);
>>>                                    tcg_temp_free(r_temp);
>>>                                    dc->npc = DYNAMIC_PC;
>>>                                }
>>>                                break;
>>>
>>> ?
>>> This fails with the same error message.
>>
>> Close, but you need to use tcg_temp_local_new(). Does this work?
>>
>> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
>> index 3c958b2..52fa2f1 100644
>> --- a/target-sparc/translate.c
>> +++ b/target-sparc/translate.c
>> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc)
>>                                 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0);
>>                                 break;
>>                             case 6: // pstate
>> -                                save_state(dc, cpu_cond);
>> -                                gen_helper_wrpstate(cpu_tmp0);
>> -                                dc->npc = DYNAMIC_PC;
>> +                                {
>> +                                    TCGv r_tmp = tcg_temp_local_new();
>> +
>> +                                    tcg_gen_mov_tl(r_tmp, cpu_tmp0);
>> +                                    save_state(dc, cpu_cond);
>> +                                    gen_helper_wrpstate(r_tmp);
>> +                                    tcg_temp_free_ptr(r_tmp);
>> +                                    dc->npc = DYNAMIC_PC;
>> +                                }
>>                                 break;
>>                             case 7: // tl
>>                                 save_state(dc, cpu_cond);
>>
>
> Yep. This (with  tcg_temp_free) passes my tests, thanks!
> A bonus question: how is immediate case handled?
>
> I don't see any explicit "if (IS_IMM)" and yet it seems to produce the
> expected results for the immediate instructions.

Immediates are handled by get_src2().
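
As a rough illustration of what that involves, here is a standalone
sketch of decoding the second operand of a SPARC format 3 instruction.
It is not the get_src2() code from target-sparc/translate.c; the struct
and function names are hypothetical. It relies only on the architected
encoding: bit 13 is the i bit, bits 12..0 hold simm13 when i is set, and
bits 4..0 hold rs2 otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the operand a translator would produce. */
struct src2 {
    bool is_imm;   /* true: use imm, false: use register rs2 */
    int32_t imm;   /* sign-extended simm13 */
    unsigned rs2;  /* register number, 0..31 */
};

static struct src2 decode_src2(uint32_t insn)
{
    struct src2 op = { false, 0, 0 };

    if (insn & (1u << 13)) {
        /* i bit set: bits 12..0 are a signed 13-bit immediate */
        int32_t simm = (int32_t)(insn & 0x1fff);

        if (simm & 0x1000) {
            simm -= 0x2000;    /* sign-extend from 13 bits */
        }
        op.is_imm = true;
        op.imm = simm;
    } else {
        /* i bit clear: bits 4..0 name the second source register */
        op.rs2 = insn & 0x1f;
    }
    return op;
}
```

Because the operand is resolved at this point, the wrpr case can simply
xor cpu_src1 with cpu_src2 without an explicit immediate check of its
own, which would match the observation that no "if (IS_IMM)" appears
there.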


Thread overview: 32+ messages
2011-04-10 12:29 [Qemu-devel] tcg/tcg.c:1892: tcg fatal error Artyom Tarasenko
2011-04-10 13:24 ` Aurelien Jarno
2011-04-10 14:09   ` Artyom Tarasenko
2011-04-10 14:44     ` Blue Swirl
2011-04-10 17:48       ` Artyom Tarasenko
2011-04-10 17:57         ` Blue Swirl
2011-04-10 18:35           ` Artyom Tarasenko
2011-04-10 18:52             ` Igor Kovalenko
2011-04-10 19:37               ` Artyom Tarasenko
     [not found]                 ` <BANLkTik2NChYi8hADjCSbjdZeyP_oo8_Qg@mail.gmail.com>
2011-04-10 20:00                   ` Artyom Tarasenko
2011-04-11  3:16                     ` Igor Kovalenko
2011-04-11 17:53                       ` Artyom Tarasenko
2011-04-12  2:14                         ` Igor Kovalenko
2011-04-21 14:57                           ` Artyom Tarasenko
2011-04-21 15:44                             ` Laurent Desnogues
2011-04-21 19:45                               ` Igor Kovalenko
2011-04-21 22:39                                 ` Laurent Desnogues
2011-04-22 14:14                                   ` Igor Kovalenko
2011-04-25 20:29                                     ` Aurelien Jarno
2011-04-26  3:34                                       ` Igor Kovalenko
2011-04-26 16:26                                         ` Artyom Tarasenko
2011-04-26 18:07                                           ` Igor Kovalenko
2011-04-26 17:02                                       ` Artyom Tarasenko
2011-04-26 18:36                                         ` Blue Swirl
2011-04-26 19:07                                           ` Igor Kovalenko
2011-04-26 20:07                                             ` Blue Swirl
2011-04-26 21:35                                               ` Igor Kovalenko
2011-04-27 17:40                                                 ` Blue Swirl
2011-04-27 16:29                                           ` Artyom Tarasenko
2011-04-27 17:41                                             ` Blue Swirl
2011-04-10 14:59     ` Peter Maydell
2011-04-10 17:31       ` Artyom Tarasenko
