References: <20110410132415.GA6719@volta.aurel32.net>
From: Blue Swirl
Date: Sun, 10 Apr 2011 17:44:13 +0300
Subject: Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
List-Id: qemu-devel.nongnu.org
To: Artyom Tarasenko
Cc: qemu-devel, Aurelien Jarno

On Sun, Apr 10, 2011 at 5:09 PM, Artyom Tarasenko wrote:
> On Sun, Apr 10, 2011 at 3:24 PM, Aurelien Jarno wrote:
>> On Sun, Apr 10, 2011 at 02:29:59PM +0200, Artyom Tarasenko wrote:
>>> Trying to boot some proprietary OS I get qemu-system-sparc64 crash with a
>>>
>>> tcg/tcg.c:1892: tcg fatal error
>>>
>>> error message.
>>>
>>> It looks like it can be a platform independent bug though, because
>>> when a '-singlestep' option IS present, qemu doesn't crash and seems
>>> to translate the code properly.
>>>
>>> (gdb) bt
>>> #0  0x00000032c2e327f5 in raise () from /lib64/libc.so.6
>>> #1  0x00000032c2e33fd5 in abort () from /lib64/libc.so.6
>>> #2  0x000000000051933d in tcg_reg_alloc_call (s=, def=0x89d340, opc=INDEX_op_call, args=0x10acc98, dead_iargs=3) at qemu/tcg/tcg.c:1892
>>> #3  0x000000000051a557 in tcg_gen_code_common (s=0x10b8940, gen_code_buf=0x40338b60 "I\213n@H\213] 3\355I\211\256\220") at qemu/tcg/tcg.c:2099
>>> #4  tcg_gen_code (s=0x10b8940, gen_code_buf=0x40338b60 "I\213n@H\213] 3\355I\211\256\220") at qemu/tcg/tcg.c:2142
>>> #5  0x00000000004d38f1 in cpu_sparc_gen_code (env=0x10cce10, tb=0x7fffe91bc218, gen_code_size_ptr=0x7fffffffd9b4) at qemu/translate-all.c:93
>>> #6  0x00000000004d1fd7 in tb_gen_code (env=0x10cce10, pc=18868776, cs_base=18868780, flags=15, cflags=0) at qemu/exec.c:989
>>> #7  0x00000000004d4029 in tb_find_slow (env1=) at qemu/cpu-exec.c:167
>>> #8  tb_find_fast (env1=) at cpu-exec.c:194
>>> #9  cpu_sparc_exec (env1=) at qemu/cpu-exec.c:556
>>> #10 0x0000000000408868 in tcg_cpu_exec () at qemu/cpus.c:1066
>>> #11 cpu_exec_all () at qemu/cpus.c:1102
>>> #12 0x000000000053c756 in main_loop (argc=, argv=, envp=) at qemu/vl.c:1430
>>>
>>> I inspected ts->val_type causing the abort() case and it turned out to be 0.
>>>
>>> The last lines of qemu.log (without -singlestep)
>>> IN:
>>> 0x00000000011fe9f0:  rdpr  %pstate, %g1
>>> 0x00000000011fe9f4:  wrpr  %g1, 2, %pstate
>>> --------------
>>> IN:
>>> 0x00000000011fe9f8:  ldub  [ %o0 ], %o1
>>> 0x00000000011fe9fc:  mov  %o1, %o2
>>> 0x00000000011fea00:  rdpr  %tick, %o3
>>> 0x00000000011fea04:  cmp  %o1, %o2
>>> 0x00000000011fea08:  be  %icc, 0x11fea00
>>> 0x00000000011fea0c:  ldub  [ %o0 ], %o2
>>>
>>> Search PC...
>>> Search PC...
>>> Search PC...
>>> Search PC...
>>> Search PC...
>>> Search PC...
>>> --------------
>>> IN:
>>> 0x00000000011fe9f8:  ldub  [ %o0 ], %o1
>>> 0x00000000011fe9fc:  mov  %o1, %o2
>>> 0x00000000011fea00:  rdpr  %tick, %o3
>>> 0x00000000011fea04:  cmp  %o1, %o2
>>> 0x00000000011fea08:  be  %icc, 0x11fea00
>>> 0x00000000011fea0c:  ldub  [ %o0 ], %o2
>>>
>>> 110521: Data Access MMU Miss (v=0068) pc=00000000011fe9f8 npc=00000000011fe9fc SP=000000000180ae41
>>> pc: 00000000011fe9f8  npc: 00000000011fe9fc
>>>
>>> IN:
>>> 0x00000000011fea00:  rdpr  %tick, %o3
>>> 0x00000000011fea04:  cmp  %o1, %o2
>>> 0x00000000011fea08:  be  %icc, 0x11fea00
>>> 0x00000000011fea0c:  ldub  [ %o0 ], %o2
>>> --------------
>>> IN:
>>> 0x00000000011fea10:  brz,pn   %o2, 0x11fe9f8
>>> 0x00000000011fea14:  mov  %o2, %o4
>>> --------------
>>> IN:
>>> 0x00000000011fea18:  rdpr  %tick, %o5
>>> 0x00000000011fea1c:  cmp  %o2, %o4
>>> 0x00000000011fea20:  be  %icc, 0x11fea18
>>> 0x00000000011fea24:  ldub  [ %o0 ], %o4
>>> --------------
>>> IN:
>>> 0x00000000011fea28:  brz,pn   %o4, 0x11fe9f4
>>> 0x00000000011fea2c:  wrpr  %g0, %g1, %pstate
>>>
>>>
>>> The crash is 100% reproducible and happens always on the same place,
>>> so it's probably a pure TCG issue, not related on getting the
>>> external/timer interrupts.
>>>
>>> Do you need any additional info?
>>>
>>
>> What would be interesting would be to get the corresponding TCG code
>> from qemu.log (-d op,op_opt).
>
>
> OP:
>  ---- 0x11fea28
>  ld_i64 tmp6,regwptr,$0x20
>  movi_i64 cond,$0x0
>  movi_i64 tmp8,$0x0
>  brcond_i64 tmp6,tmp8,ne,$0x0
>  movi_i64 cond,$0x1
>  set_label $0x0
>
>  ---- 0x11fea2c
>  movi_i64 tmp7,$0x0
>  xor_i64 tmp0,tmp7,g1
>  movi_i64 pc,$0x11fea2c
>  movi_i64 tmp8,$compute_psr
>  call tmp8,$0x0,$0
>  movi_i64 tmp8,$0x0
>  brcond_i64 cond,tmp8,eq,$0x1
>  movi_i64 npc,$0x11fe9f4
>  br $0x2
>  set_label $0x1
>  movi_i64 npc,$0x11fea30
>  set_label $0x2
>  movi_i64 tmp8,$wrpstate
>  call tmp8,$0x0,$0,tmp0
>  mov_i64 pc,npc
>  movi_i64 tmp8,$0x4
>  add_i64 npc,npc,tmp8
>  exit_tb $0x0
>
> OP after liveness analysis:
>  ---- 0x11fea28
>  ld_i64 tmp6,regwptr,$0x20
>  movi_i64 cond,$0x0
>  movi_i64 tmp8,$0x0
>  brcond_i64 tmp6,tmp8,ne,$0x0
>  movi_i64 cond,$0x1
>  set_label $0x0
>
>  ---- 0x11fea2c
>  nopn $0x2,$0x2
>  nopn $0x3,$0x68,$0x3
>  movi_i64 pc,$0x11fea2c
>  movi_i64 tmp8,$compute_psr
>  call tmp8,$0x0,$0
>  movi_i64 tmp8,$0x0
>  brcond_i64 cond,tmp8,eq,$0x1
>  movi_i64 npc,$0x11fe9f4
>  br $0x2
>  set_label $0x1
>  movi_i64 npc,$0x11fea30
>  set_label $0x2
>  movi_i64 tmp8,$wrpstate
>  call tmp8,$0x0,$0,tmp0
>  mov_i64 pc,npc
>  movi_i64 tmp8,$0x4
>  add_i64 npc,npc,tmp8
>  exit_tb $0x0
>  end
>
> Does it mean the last block is processed correctly and the crash
> happens on the next instruction which doesn't make it to the log?
> The next instruction would be a
>
> 0x00000000011fea30:  retl
>
> Since it's a branch instruction I guess this would also be a tcg block boundary.

Because abort() was called from tcg_reg_alloc_call, I'd say 'retl'
(synthetic op for 'jmpl %o7 + 8, %g0') was the problem.
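
For reference, the failure mode reported above (abort() reached from
tcg_reg_alloc_call with ts->val_type equal to 0) can be pictured with a
small standalone C sketch. This is not QEMU's code. The enum names and
their numeric values, struct temp, and load_call_arg() are hypothetical
stand-ins, and treating 0 as a "dead" state is an assumption made for
illustration. The sketch only shows the shape of the check: a call
argument is usable if its value sits in a register, in memory, or is a
constant, and any other state is treated as fatal. It returns -1 where
real code would abort, so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical temp states; the names and numeric values are assumed. */
enum temp_val {
    TEMP_VAL_DEAD = 0,  /* no live value is held anywhere */
    TEMP_VAL_REG,       /* value is in a host register */
    TEMP_VAL_MEM,       /* value is in its memory slot */
    TEMP_VAL_CONST      /* value is a known constant */
};

struct temp {
    enum temp_val val_type;
    const char *name;
};

/*
 * Model of loading one call argument.
 * Returns 0 when the argument has a value that could be materialized,
 * and -1 in the case where an allocator would give up and abort.
 */
static int load_call_arg(const struct temp *ts)
{
    assert(ts != NULL);

    switch (ts->val_type) {
    case TEMP_VAL_REG:
    case TEMP_VAL_MEM:
    case TEMP_VAL_CONST:
        return 0;
    default:
        /* A temp with no live value cannot be passed to a helper. */
        fprintf(stderr, "%s: no live value (val_type=%d)\n",
                ts->name, (int)ts->val_type);
        return -1;
    }
}
```

Passing a struct temp whose val_type is TEMP_VAL_DEAD to load_call_arg()
yields -1, while any of the three other states yields 0.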