From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <761ea48b0811300940i67f27e38hddb17efa242532ae@mail.gmail.com>
Date: Sun, 30 Nov 2008 18:40:27 +0100
From: "Laurent Desnogues"
Subject: Re: [Qemu-devel] [PATCH] ppc: Convert op_440_dlmzb to TCG
References: <6044C5AC-C1B7-42AF-80F0-2254392A9F37@web.de> <20081130163117.GH11797@hall.aurel32.net>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

On Sun, Nov 30, 2008 at 6:25 PM, Andreas Färber wrote:
>
> On 30.11.2008 at 17:31, Aurélien Jarno wrote:
>>
>> Doing loops with TCG is a bad idea as it is very inefficient.
>
> Did you read the code? There is no loop in TCG here, that's exactly what I
> was saying.
> The labels are for conditionals only and for exiting the non-loop.
>
> If you have a better suggestion, please say so as this is one of the dyngen
> ops that keep me from working on the conversion on my own system.

Going for a helper might be better for larger sequences of TCG ops;
Fabrice mentions a threshold of about 20 ops. And your replacement
is certainly larger than 20 ops :-)

Plus given it uses brcond, it would prevent TCG liveness analysis
from doing its job in the basic block containing that instruction.


Laurent

> Andreas
>
>>
>>
>>> ---
>>> Compile-tested on Linux/amd64.
>>>
>>> diff --git a/target-ppc/op.c b/target-ppc/op.c
>>> index 5d2cfa1..a26b1da 100644
>>> --- a/target-ppc/op.c
>>> +++ b/target-ppc/op.c
>>> @@ -839,25 +839,6 @@ void OPPROTO op_4xx_tlbwe_hi (void)
>>>  }
>>>  #endif
>>>
>>> -/* SPR micro-ops */
>>> -/* 440 specific */
>>> -void OPPROTO op_440_dlmzb (void)
>>> -{
>>> -    do_440_dlmzb();
>>> -    RETURN();
>>> -}
>>> -
>>> -void OPPROTO op_440_dlmzb_update_Rc (void)
>>> -{
>>> -    if (T0 == 8)
>>> -        T0 = 0x2;
>>> -    else if (T0 < 4)
>>> -        T0 = 0x4;
>>> -    else
>>> -        T0 = 0x8;
>>> -    RETURN();
>>> -}
>>> -
>>>  #if !defined(CONFIG_USER_ONLY)
>>>  void OPPROTO op_store_pir (void)
>>>  {
>>> diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c
>>> index 6addc74..a055ee6 100644
>>> --- a/target-ppc/op_helper.c
>>> +++ b/target-ppc/op_helper.c
>>> @@ -1754,27 +1754,6 @@ void do_store_403_pb (int num)
>>>  }
>>>  #endif
>>>
>>> -/* 440 specific */
>>> -void do_440_dlmzb (void)
>>> -{
>>> -    target_ulong mask;
>>> -    int i;
>>> -
>>> -    i = 1;
>>> -    for (mask = 0xFF000000; mask != 0; mask = mask >> 8) {
>>> -        if ((T0 & mask) == 0)
>>> -            goto done;
>>> -        i++;
>>> -    }
>>> -    for (mask = 0xFF000000; mask != 0; mask = mask >> 8) {
>>> -        if ((T1 & mask) == 0)
>>> -            break;
>>> -        i++;
>>> -    }
>>> - done:
>>> -    T0 = i;
>>> -}
>>> -
>>>  /*****************************************************************************/
>>>  /* SPE extension helpers */
>>>  /* Use a table to make this quicker */
>>> diff --git a/target-ppc/op_helper.h b/target-ppc/op_helper.h
>>> index 1c046d8..aaaba5c 100644
>>> --- a/target-ppc/op_helper.h
>>> +++ b/target-ppc/op_helper.h
>>> @@ -112,9 +112,6 @@ void do_4xx_tlbwe_lo (void);
>>>  void do_4xx_tlbwe_hi (void);
>>>  #endif
>>>
>>> -/* PowerPC 440 specific helpers */
>>> -void do_440_dlmzb (void);
>>> -
>>>  /* PowerPC 403 specific helpers */
>>>  #if !defined(CONFIG_USER_ONLY)
>>>  void do_load_403_pb (int num);
>>> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
>>> index 95cb482..59533ac 100644
>>> --- a/target-ppc/translate.c
>>> +++ b/target-ppc/translate.c
>>> @@ -5872,12 +5872,49 @@ GEN_HANDLER(dlmzb, 0x1F, 0x0E, 0x02, 0x00000000, PPC_440_SPEC)
>>>  {
>>>      tcg_gen_mov_tl(cpu_T[0], cpu_gpr[rS(ctx->opcode)]);
>>>      tcg_gen_mov_tl(cpu_T[1], cpu_gpr[rB(ctx->opcode)]);
>>> -    gen_op_440_dlmzb();
>>> +    TCGv t0 = tcg_temp_new();
>>> +    int endLabel = gen_new_label();
>>> +    int i = 1;
>>> +    target_ulong mask;
>>> +    for (mask = 0xFF000000; mask != 0; mask = mask >> 8) {
>>> +        tcg_gen_andi_tl(t0, cpu_T[0], mask);
>>> +        int nextLabel = gen_new_label();
>>> +        tcg_gen_brcondi_tl(TCG_COND_NE, t0, 0, nextLabel);
>>> +        tcg_gen_movi_tl(cpu_T[0], i++);
>>> +        tcg_gen_br(endLabel);
>>> +        gen_set_label(nextLabel);
>>> +    }
>>> +    for (mask = 0xFF000000; mask != 0; mask = mask >> 8) {
>>> +        tcg_gen_andi_tl(t0, cpu_T[1], mask);
>>> +        int nextLabel = gen_new_label();
>>> +        tcg_gen_brcondi_tl(TCG_COND_NE, t0, 0, nextLabel);
>>> +        tcg_gen_movi_tl(cpu_T[0], i++);
>>> +        tcg_gen_br(endLabel);
>>> +        gen_set_label(nextLabel);
>>> +    }
>>> +    tcg_gen_movi_tl(cpu_T[0], i);
>>> +    gen_set_label(endLabel);
>>> +    tcg_temp_free(t0);
>>>      tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], cpu_T[0]);
>>>      tcg_gen_andi_tl(cpu_xer, cpu_xer, ~0x7F);
>>>      tcg_gen_or_tl(cpu_xer, cpu_xer, cpu_T[0]);
>>>      if (Rc(ctx->opcode)) {
>>> -        gen_op_440_dlmzb_update_Rc();
>>> +        endLabel = gen_new_label();
>>> +        int nextLabel = gen_new_label();
>>> +        tcg_gen_brcondi_tl(TCG_COND_NE, cpu_T[0], 8, nextLabel);
>>> +        tcg_gen_movi_tl(cpu_T[0], 0x2);
>>> +        tcg_gen_br(endLabel);
>>> +
>>> +        gen_set_label(nextLabel);
>>> +        nextLabel = gen_new_label();
>>> +        tcg_gen_brcondi_tl(TCG_COND_GE, cpu_T[0], 4, nextLabel);
>>> +        tcg_gen_movi_tl(cpu_T[0], 0x4);
>>> +        tcg_gen_br(endLabel);
>>> +
>>> +        gen_set_label(nextLabel);
>>> +        tcg_gen_movi_tl(cpu_T[0], 0x8);
>>> +
>>> +        gen_set_label(endLabel);
>>>          tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_T[0]);
>>>          tcg_gen_andi_i32(cpu_crf[0], cpu_crf[0], 0xf);
>>>      }
>>>
>>
>>
>> --
>>   .''`.  Aurelien Jarno             | GPG: 1024D/F1BCDB73
>>  : :' :  Debian developer           | Electrical Engineer
>>  `. `'   aurel32@debian.org         | aurelien@aurel32.net
>>    `-    people.debian.org/~aurel32 | www.aurel32.net
>>
>
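
For illustration, the helper route suggested in the reply could look roughly like the sketch below. It only restates the logic of the removed do_440_dlmzb() and op_440_dlmzb_update_Rc() as plain C functions that take their inputs as arguments instead of reading the T0/T1 globals. The names dlmzb_count and dlmzb_cr0 are hypothetical, and the TCG helper declaration and the call from translate.c are deliberately left out. By comparison, the inline version in the patch emits at least four TCG ops per byte (andi, brcondi, movi, br) for eight bytes, which is where the "larger than 20 ops" remark comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the removed do_440_dlmzb(): scans the eight
 * bytes of high:low from the most significant byte down and returns the
 * 1-based position of the first zero byte. As in the removed code, the
 * result is 9 when no byte is zero. */
uint32_t dlmzb_count(uint32_t high, uint32_t low)
{
    uint32_t mask;
    uint32_t i = 1;

    for (mask = 0xFF000000u; mask != 0; mask >>= 8) {
        if ((high & mask) == 0)
            return i;
        i++;
    }
    for (mask = 0xFF000000u; mask != 0; mask >>= 8) {
        if ((low & mask) == 0)
            return i;
        i++;
    }
    return i;
}

/* Hypothetical stand-in for op_440_dlmzb_update_Rc(): maps the count to
 * the value the patch stores into CR0, with the same three cases. */
uint32_t dlmzb_cr0(uint32_t count)
{
    assert(count >= 1 && count <= 9);
    if (count == 8)
        return 0x2;
    if (count < 4)
        return 0x4;
    return 0x8;
}
```

With something like this behind a single helper call, the translated block would contain no brcond for this instruction, which is the liveness-analysis point made above.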