In-Reply-To: <4B2BB7C3.2040203@twiddle.net>
References: <761ea48b0912170620l534dcb02m8ea6b59524d76dbe@mail.gmail.com>
 <761ea48b0912180339k18573822wea90289345c58a84@mail.gmail.com>
 <4B2BB7C3.2040203@twiddle.net>
Date: Fri, 18 Dec 2009 18:41:28 +0100
Message-ID: <761ea48b0912180941y3016d7f4u456f628f7ef36976@mail.gmail.com>
Subject: Re: [Qemu-devel] Re: [PATCH 3/6] tcg-x86_64: Implement setcond and movcond.
From: Laurent Desnogues
To: Richard Henderson
Cc: qemu-devel@nongnu.org

On Fri, Dec 18, 2009 at 6:11 PM, Richard Henderson wrote:
>> Also note that tcg_out_modrm will generate an unneeded prefix
>> for some registers. cf. the patch I sent to the list months ago.
>
> Huh.  Didn't notice since the disassembler printed what I expected to see.
>  Is fixing this at the same time a requirement for acceptance?
> I'd prefer to tackle that separately, since no doubt it affects every use of
> P_REXB.

I agree this change can be delayed.
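
[Editor's sketch, not part of the original message.] The prefix remark
above appears to concern byte-register operands. In 64-bit mode a byte
access needs a REX prefix only for register numbers 4 and up: numbers
4 to 7 mean SPL/BPL/SIL/DIL with a REX prefix and AH/CH/DH/BH without
one, and numbers 8 to 15 need the REX extension bits. Registers 0 to 3
(AL, CL, DL, BL) encode without any prefix. The helper names below are
hypothetical and are not taken from the QEMU tree or from the patch
mentioned above; this is only a minimal illustration of the encoding
rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: does a byte-sized access to register `reg`
 * (0..15) require a REX prefix in 64-bit mode? */
static bool byte_reg_needs_rex(int reg)
{
    return reg >= 4;
}

/* Hypothetical emitter for `movzbl %src8, %dst32` (opcode 0F B6 /r).
 * dst goes in ModRM.reg, src in ModRM.rm.
 * Writes at most 4 bytes to buf and returns the number written. */
static size_t emit_movzbl(uint8_t *buf, int dst, int src)
{
    assert(dst >= 0 && dst < 16 && src >= 0 && src < 16);

    size_t n = 0;
    /* REX layout is 0100WRXB: R extends ModRM.reg, B extends ModRM.rm. */
    uint8_t rex = 0x40 | ((dst >> 3) << 2) | (src >> 3);

    /* Emit the prefix only when an extension bit is set or when the
     * byte source register would otherwise be read as AH/CH/DH/BH. */
    if (rex != 0x40 || byte_reg_needs_rex(src)) {
        buf[n++] = rex;
    }
    buf[n++] = 0x0f;
    buf[n++] = 0xb6;
    buf[n++] = 0xc0 | ((dst & 7) << 3) | (src & 7);
    return n;
}
```

With this rule, `movzbl %al, %eax` takes three bytes, while the same
instruction on %sil or %r8b takes four.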

>>> +        tgen_arithi32(s, ARITH_AND, arg0, 0xff);
>>
>> Wouldn't movzbl be better?
>
> Handled inside tgen_arithi32:
>
>    } else if (c == ARITH_AND && val == 0xffu) {
>        /* movzbl */
>        tcg_out_modrm(s, 0xb6 | P_EXT | P_REXB, r0, r0);
>
> I didn't feel the need to replicate that.

Oops, I compared with my code, which has an explicit movzbl :)

>> Regarding the xor optimization, I tested it on my i7 and it was
>> (very) slightly slower running a 64-bit SPEC2k gcc.
>
> Huh.  It used to be recommended.  The partial word store used to stall the
> pipeline until the old value was ready, and the XOR was special-cased as a
> clear, which broke both the input dependency and also prevented a
> partial-register stall on the output.
>
> Actually, this recommendation is still present: Section 3.5.1.6 in the
> November 2009 revision of the Intel Optimization Reference Manual.
>
> If it's all the same, I'd prefer to keep what I have there.  All other
> things being equal, the XOR is 2 bytes and the MOVZBL is 3.

I agree too. Anyway, my measurement is not representative enough
to mean anything. And in that case I think shorter code is better,
so let's go for XOR.

>>> +static void tcg_out_movcond(TCGContext *s, int cond, TCGArg arg0,
>>> +                            TCGArg arg1, TCGArg arg2, int const_arg2,
>>> +                            TCGArg arg3, TCGArg arg4, int rexw)
>>
>> Perhaps renaming arg0 to dest would make things slightly
>> more readable.
>
> Ok.
>
>> You should also add a note stating that arg3 != arg4.
>
> I don't believe that's true though.  It's caught immediately when we emit
> the movcond opcode, but there's no check later once copy-propagation has
> been done within TCG.
>
> I check for that in the i386 and sparc backends, because dest==arg3 &&
> dest==arg4 would actually generate incorrect code.
> Here in the x86_64 backend, where we always use cmov it doesn't generate
> incorrect code, merely inefficient.
>
> I could add an early out for that case, if you prefer.

No, you can leave it as is unless someone else objects.


Laurent
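
[Editor's sketch, not part of the original message.] The early out
discussed above could look roughly like the following. The sketch
plans the instructions a cmov-based backend might emit for
dest = (arg1 cond arg2) ? arg3 : arg4, given only the register numbers
of dest, arg3 and arg4. The Op enum and plan_movcond are hypothetical
names invented for illustration; the actual tcg_out_movcond from the
patch is not reproduced here and may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds for the plan. */
typedef enum {
    OP_CMP,       /* cmp arg2, arg1: sets the flags            */
    OP_MOV,       /* plain register move, leaves flags intact  */
    OP_CMOV,      /* cmovCC arg3, dest                         */
    OP_CMOV_INV   /* cmov with the inverted condition: arg4    */
} Op;

/* Fill `out` (room for at least 3 entries) with the planned sequence
 * and return its length. vtrue and vfalse stand for arg3 and arg4. */
static size_t plan_movcond(Op *out, int dest, int vtrue, int vfalse)
{
    assert(out != NULL);
    size_t n = 0;

    if (vtrue == vfalse) {
        /* Early out: both arms are the same register, so the
         * comparison is irrelevant and at most one move is needed. */
        if (dest != vtrue) {
            out[n++] = OP_MOV;
        }
        return n;
    }

    out[n++] = OP_CMP;
    if (dest == vtrue) {
        /* dest already holds the "true" value; replace it with the
         * "false" value only when the condition fails. */
        out[n++] = OP_CMOV_INV;
    } else {
        if (dest != vfalse) {
            /* Issued after the compare so that a dest aliasing arg1
             * or arg2 is not clobbered first; mov keeps the flags. */
            out[n++] = OP_MOV;
        }
        out[n++] = OP_CMOV;
    }
    return n;
}
```

Without the first branch, the arg3 == arg4 case would still produce
the right value, only at the cost of a compare and a cmov that do
nothing useful, which matches the "inefficient, not incorrect"
description above.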