From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:59903) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QF8kd-0003xD-5m for qemu-devel@nongnu.org; Wed, 27 Apr 2011 13:42:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QF8kc-00076t-3K for qemu-devel@nongnu.org; Wed, 27 Apr 2011 13:42:11 -0400 Received: from mail-vx0-f173.google.com ([209.85.220.173]:60686) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QF8kb-00076o-UA for qemu-devel@nongnu.org; Wed, 27 Apr 2011 13:42:10 -0400 Received: by vxb41 with SMTP id 41so1741583vxb.4 for ; Wed, 27 Apr 2011 10:42:09 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: <20110425202927.GD21831@volta.aurel32.net> From: Blue Swirl Date: Wed, 27 Apr 2011 20:41:48 +0300 Message-ID: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Artyom Tarasenko Cc: Laurent Desnogues , peter.maydell@linaro.org, qemu-devel , Aurelien Jarno On Wed, Apr 27, 2011 at 7:29 PM, Artyom Tarasenko wro= te: > On Tue, Apr 26, 2011 at 8:36 PM, Blue Swirl wrote: >> On Tue, Apr 26, 2011 at 8:02 PM, Artyom Tarasenko = wrote: >>> On Mon, Apr 25, 2011 at 10:29 PM, Aurelien Jarno = wrote: >>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote: >>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues >>>>> wrote: >>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko >>>>> > wrote: >>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues >>>>> >> wrote: >>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko wrote: >>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko >>>>> >>>> wrote: >>>>> >>>>>>> Do you have public test case? >>>>> >>>>>>> It is possible to code this delay slot write test but real is= sue may >>>>> >>>>>>> be corruption elsewhere. >>>>> >>>> >>>>> >>>> The test case is trivial: it's just the two instructions, branch= and wrpr. >>>>> >>>> >>>>> >>>>> In theory there could be multiple issues including compiler ind= uced ones. >>>>> >>>>> I'd prefer to see some kind of reproducible testcase. >>>>> >>>> >>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not use= d and >>>>> >>>> needed only because the bios entry point is 0x20). >>>>> >>>> >>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios >>>>> >>>> test-wrpr.bin -nographic >>>>> >>>> Already up-to-date. >>>>> >>>> make[1]: Nothing to be done for `all'. >>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error >>>>> >>>> Aborted >>>>> >>> >>>>> >>> The problem seems to be that wrpr is using a non-local >>>>> >>> TCG tmp (cpu_tmp0). >>>>> >> >>>>> >> Just tried the test case with write to %pil - seems like write its= elf is OK. >>>>> >> The issue appears to be with save_state() call since adding save_s= tate >>>>> >> to %pil case provokes the same tcg abort. >>>>> > >>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't >>>>> > need to be saved across helper calls. =C2=A0This results in the >>>>> > TCG "optimizer" getting rid of it even though it's later used. >>>>> > Look at the log and you'll see what I mean :-) >>>>> >>>>> I'm not very comfortable with tcg yet. Would it be possible to teach >>>>> optimizer working with delay slots? Or do I look in the wrong place. >>>>> >>>> >>>> The problem is not on the TCG side, but on the target-sparc/translate.= c >>>> side: >>>> >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0case 0x32: /* wrwim, V9 wrpr */ >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 { >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (!supervisor(dc)) >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto priv_insn; >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2= ); >>>> |=C2=A0#ifdef TARGET_SPARC64 >>>> >>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not >>>> saved across TCG branches. >>>> >>>> [...] >>>> >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 case 6: // pstate >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 save_state(dc, cpu_cond); >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 gen_helper_wrpstate(cpu_tmp0)= ; >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dc->npc =3D DYNAMIC_PC; >>>> | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 break; >>>> >>>> save_state() calls save_npc(), which in turns might call >>>> gen_generic_branch(): >>> >>> Hmm. This is not the only instruction using save_state() and cpu_tmp0. >>> At least ldd is another example. >>> >>>> |=C2=A0static inline void gen_generic_branch(target_ulong npc1, target= _ulong npc2, >>>> |=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0TCGv = r_cond) >>>> | { >>>> | =C2=A0 =C2=A0 int l1, l2; >>>> | >>>> | =C2=A0 =C2=A0 l1 =3D gen_new_label(); >>>> |=C2=A0 =C2=A0 =C2=A0l2 =3D gen_new_label(); >>>> | >>>> | =C2=A0 =C2=A0 tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1); >>>> | >>>> | =C2=A0 =C2=A0 tcg_gen_movi_tl(cpu_npc, npc1); >>>> | =C2=A0 =C2=A0 tcg_gen_br(l2); >>>> | >>>> | =C2=A0 =C2=A0 gen_set_label(l1); >>>> |=C2=A0 =C2=A0 =C2=A0tcg_gen_movi_tl(cpu_npc, npc2); >>>> | =C2=A0 =C2=A0 gen_set_label(l2); >>>> | } >>>> >>>> And here is the TCG branch, which drop the TCG temp cpu_temp0. >>>> >>>> The solution is either to rewrite gen_generic_branch() without TCG >>>> branches, or to use a TCG temp local instead of a TCG temp. >>> >>> You mean something like >>> >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0case 6: // pstate >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0{ >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0TCGv r_temp; >>> >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0r_temp =3D tcg_temp_= new(); >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0tcg_gen_mov_tl(r_tem= p, cpu_tmp0); >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0save_state(dc, cpu_c= ond); >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0gen_helper_wrpstate(= r_temp); >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0tcg_temp_free(r_temp= ); >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0dc->npc =3D DYNAMIC_= PC; >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0} >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0break; >>> >>> ? >>> This fails with the same error message. >> >> Close, but you need to use tcg_temp_local_new(). Does this work? >> >> diff --git a/target-sparc/translate.c b/target-sparc/translate.c >> index 3c958b2..52fa2f1 100644 >> --- a/target-sparc/translate.c >> +++ b/target-sparc/translate.c >> @@ -3505,9 +3505,15 @@ static void disas_sparc_insn(DisasContext * dc) >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 tcg_gen_mov_tl(cpu_tbr, cpu_tmp0)= ; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 break; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 case 6: // pstate >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0save_state(dc, cpu_cond); >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0gen_helper_wrpstate(cpu_tmp0); >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0dc->npc =3D DYNAMIC_PC; >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0{ >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0TCGv r_tmp =3D tcg_t= emp_local_new(); >> + >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0tcg_gen_mov_tl(r_tmp= , cpu_tmp0); >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0save_state(dc, cpu_c= ond); >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0gen_helper_wrpstate(= r_tmp); >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0tcg_temp_free_ptr(r_= tmp); >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0dc->npc =3D DYNAMIC_= PC; >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0} >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 break; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 case 7: // tl >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 save_state(dc, cpu_cond); >> > > Yep. This (with =C2=A0tcg_temp_free) passes my tests, thanks! > A bonus question: how is immediate case handled? > > I don't see any explicit "if (IS_IMM)" and yet it seems to produce the > expected results for the immediate instructions. Immediates are handled by get_src2().