From: Peter Maydell
Date: Tue, 15 May 2012 18:27:27 +0100
Subject: Re: [Qemu-devel] x86: cvtsi2s{s,d} etc. array access
To: Blue Swirl
Cc: qemu-devel

On 14 May 2012 22:05, Blue Swirl wrote:
> While working on the AREG0 patches, I noticed strange code in
> target-i386/translate.c.

> It's accessed like this (line 3537):
>             sse_op2 = sse_op_table3[(s->dflag == 2) * 2 + ((b >> 8) - 2)];
>
> b >> 8 can be only either 1 or 0.

I don't think this is true. At this point in the code we're inside
a "switch (b)" so we know that b is either 0x22a (cvtsi2ss) or
0x32a (cvtsi2sd). So "((b >> 8) - 2)" is 0 for cvtsi2ss and 1 for
cvtsi2sd, giving us the lsbit of the array index, with
(s->dflag == 2) providing the next bit, so we end up with
indexes 0,1,2,3 in this table for these two insns in their
doubleword and quadword forms.

You could rewrite "((b >> 8) - 2)" as "((b >> 8) & 1)".
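
To make the arithmetic above concrete, here is a minimal standalone sketch of the first index computation. The helper name cvtsi2_index and the parameter dflag_is_64 (standing in for "s->dflag == 2") are hypothetical and do not appear in target-i386/translate.c; only the expression itself comes from the quoted code.

```c
#include <assert.h>

/* Hypothetical helper, not part of QEMU: computes the sse_op_table3
 * index for cvtsi2ss (b == 0x22a) and cvtsi2sd (b == 0x32a).
 * dflag_is_64 stands in for the (s->dflag == 2) test. */
static int cvtsi2_index(int b, int dflag_is_64)
{
    /* Inside the "switch (b)" only these two opcodes reach this point. */
    assert(b == 0x22a || b == 0x32a);

    int low = (b >> 8) - 2;          /* 0 for 0x22a, 1 for 0x32a */

    /* The suggested alternative spelling gives the same bit. */
    assert(low == ((b >> 8) & 1));

    /* Bit 0: ss vs sd; bit 1: doubleword vs quadword operand. */
    return (dflag_is_64 != 0) * 2 + low;
}
```

Under these assumptions the four combinations map to indexes 0 to 3: cvtsi2ss and cvtsi2sd with a doubleword operand give 0 and 1, and their quadword forms give 2 and 3.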

> The other access is as follows (line 3594):
>             sse_op2 = sse_op_table3[(s->dflag == 2) * 2 + ((b >> 8) - 2) + 4 +
>                                     (b & 1) * 4];
>
> This looks better because of + 4 but I think some array values are not
> accessible (max. 1 * 2 + (1 - 2) + 4 + 1 * 4 == 9).

Here we know b is 0x22c (cvttss2si), 0x32c (cvttsd2si),
0x22d (cvtss2si) or 0x32d (cvtsd2si). ((b >> 8) - 2) distinguishes
0x2XX from 0x3XX, and (b & 1) distinguishes 0xXXc from 0xXXd.
So the index is made up of (lsbit to msbit) "0x2XX or 0x3XX?",
"double or quad?", "0xXXC or 0xXXD?", and then we add a constant
offset of 4 because the entries start after the 4 entries for
the cases we looked at earlier.

I think you could actually split sse_op_table3 into two separate
tables, one for each of these cases, which would be slightly
clearer IMHO.

-- PMM
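
The second index computation, and the split-table idea, can be sketched the same way. The helper names cvt2si_index and cvt2si_split_index and the parameter dflag_is_64 are hypothetical; the split variant only illustrates what indexing a separate 8-entry table might look like and is not taken from any QEMU patch.

```c
#include <assert.h>

/* Hypothetical helper, not part of QEMU: index into the combined
 * sse_op_table3 for cvttss2si (0x22c), cvttsd2si (0x32c),
 * cvtss2si (0x22d) and cvtsd2si (0x32d).
 * dflag_is_64 stands in for the (s->dflag == 2) test. */
static int cvt2si_index(int b, int dflag_is_64)
{
    assert(b == 0x22c || b == 0x32c || b == 0x22d || b == 0x32d);

    int prefix = (b >> 8) - 2;       /* bit 0: 0x2XX (ss) vs 0x3XX (sd) */
    int quad = (dflag_is_64 != 0);   /* bit 1: doubleword vs quadword   */
    int nontrunc = b & 1;            /* bit 2: 0xXXc vs 0xXXd           */

    /* The + 4 skips the four cvtsi2ss/cvtsi2sd entries at the start. */
    return quad * 2 + prefix + 4 + nontrunc * 4;
}

/* Hypothetical split-table variant: the same three bits index a
 * separate 8-entry table, so the constant offset disappears. */
static int cvt2si_split_index(int b, int dflag_is_64)
{
    return cvt2si_index(b, dflag_is_64) - 4;
}
```

With b restricted to these four opcodes, the combined index covers 4 through 11, so every entry after the first four is reachable; the maximum of 9 in the quoted message comes from taking (b >> 8) to be at most 1.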