From: Paolo Bonzini
To: "Emilio G. Cota", Richard Henderson
Cc: MTTCG Devel, Peter Maydell, Peter Crosthwaite, QEMU Developers,
 Sergey Fedorov, Alex Bennée
Subject: Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash
Date: Wed, 6 Apr 2016 13:52:21 +0200
Message-ID: <5704F875.90509@redhat.com>
In-Reply-To: <20160406005239.GA25081@flamenco>
References: <1459834253-8291-1-git-send-email-cota@braap.org>
 <1459834253-8291-8-git-send-email-cota@braap.org>
 <5703DCB7.50302@twiddle.net> <5703DE37.3080306@redhat.com>
 <5703E2DD.3020103@twiddle.net> <20160405194028.GA6671@flamenco>
 <5704293D.1070105@twiddle.net> <20160406005239.GA25081@flamenco>

On 06/04/2016 02:52, Emilio G. Cota wrote:
> +static inline uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e, int seed)

I would keep just this version and unconditionally zero-extend to
64-bits.  The compiler is able to detect the high 32 bits are zero, drop
the more expensive multiplications and constant fold everything.

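To make the zero-extension argument concrete, here is a minimal sketch of
what the compiler can do by hand. It restates the quoted tb_hash_func5 and
adds a hypothetical tb_hash_func5_folded32 in which the two lanes fed by the
(zero) high halves collapse into one seed-dependent constant. Assumptions:
the PRIME32_* values and XXH_rotl32 are the ones from the reference XXH32
algorithm, h32_finish is taken to be the standard XXH32 avalanche step (its
body is not shown in this thread), and round32 and tb_hash_func5_folded32
are names made up for this illustration.

```c
#include <stdint.h>

/* XXH32 primes from the reference xxHash algorithm. */
#define PRIME32_1 2654435761U
#define PRIME32_2 2246822519U
#define PRIME32_3 3266489917U
#define PRIME32_4  668265263U

static inline uint32_t XXH_rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Assumption: h32_finish is the standard XXH32 avalanche. */
static inline uint32_t h32_finish(uint32_t h32)
{
    h32 ^= h32 >> 15;
    h32 *= PRIME32_2;
    h32 ^= h32 >> 13;
    h32 *= PRIME32_3;
    h32 ^= h32 >> 16;
    return h32;
}

/* Hypothetical helper: one xxHash round on a single 32-bit lane. */
static inline uint32_t round32(uint32_t v, uint32_t input)
{
    v += input * PRIME32_2;
    v = XXH_rotl32(v, 13);
    return v * PRIME32_1;
}

/* Same computation as the quoted patch, written with round32(). */
static inline uint32_t tb_hash_func5(uint64_t a0, uint64_t b0,
                                     uint32_t e, int seed)
{
    uint32_t s = (uint32_t)seed;
    uint32_t v1 = round32(s + PRIME32_1 + PRIME32_2, (uint32_t)(a0 >> 32));
    uint32_t v2 = round32(s + PRIME32_2, (uint32_t)a0);
    uint32_t v3 = round32(s + 0, (uint32_t)(b0 >> 32));
    uint32_t v4 = round32(s - PRIME32_1, (uint32_t)b0);
    uint32_t h32;

    h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) +
          XXH_rotl32(v4, 18);
    h32 += 20;

    h32 += e * PRIME32_3;
    h32 = XXH_rotl32(h32, 17) * PRIME32_4;

    return h32_finish(h32);
}

/*
 * Hypothetical hand-folded variant for 32-bit inputs. With zero high
 * halves, lanes v1 and v3 depend only on the seed, so they merge with
 * the "+ 20" into a single value k that is a compile-time constant
 * whenever the seed is one.
 */
static inline uint32_t tb_hash_func5_folded32(uint32_t a0, uint32_t b0,
                                              uint32_t e, int seed)
{
    uint32_t s = (uint32_t)seed;
    uint32_t k = XXH_rotl32(round32(s + PRIME32_1 + PRIME32_2, 0), 1) +
                 XXH_rotl32(round32(s + 0, 0), 12) + 20;
    uint32_t v2 = round32(s + PRIME32_2, a0);
    uint32_t v4 = round32(s - PRIME32_1, b0);
    uint32_t h32 = k + XXH_rotl32(v2, 7) + XXH_rotl32(v4, 18);

    h32 += e * PRIME32_3;
    h32 = XXH_rotl32(h32, 17) * PRIME32_4;

    return h32_finish(h32);
}
```

Both functions should agree for any arguments that fit in 32 bits, since
the only difference is the order in which the modulo-2^32 additions are
grouped. The sketch computes k from the seed rather than hard-coding a
number, so it does not by itself confirm the specific constant a given
compiler emits.
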
For example, if you write

    unsigned tb_hash_func(uint32_t phys_pc, uint32_t pc, int flags)
    {
        return tb_hash_func5(phys_pc, pc, flags, 1);
    }

and check the optimized code with -fdump-tree-optimized, you'll see that
the rotated v1, the rotated v3 and the 20 merge into a single constant,
1733907856.

Thanks,

Paolo

> +{
> +    uint32_t v1 = seed + PRIME32_1 + PRIME32_2;
> +    uint32_t v2 = seed + PRIME32_2;
> +    uint32_t v3 = seed + 0;
> +    uint32_t v4 = seed - PRIME32_1;
> +    uint32_t a = a0 >> 31 >> 1;
> +    uint32_t b = a0;
> +    uint32_t c = b0 >> 31 >> 1;
> +    uint32_t d = b0;
> +    uint32_t h32;
> +
> +    v1 += a * PRIME32_2;
> +    v1 = XXH_rotl32(v1, 13);
> +    v1 *= PRIME32_1;
> +
> +    v2 += b * PRIME32_2;
> +    v2 = XXH_rotl32(v2, 13);
> +    v2 *= PRIME32_1;
> +
> +    v3 += c * PRIME32_2;
> +    v3 = XXH_rotl32(v3, 13);
> +    v3 *= PRIME32_1;
> +
> +    v4 += d * PRIME32_2;
> +    v4 = XXH_rotl32(v4, 13);
> +    v4 *= PRIME32_1;
> +
> +    h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) +
> +          XXH_rotl32(v4, 18);
> +    h32 += 20;
> +
> +    h32 += e * PRIME32_3;
> +    h32 = XXH_rotl32(h32, 17) * PRIME32_4;
> +
> +    return h32_finish(h32);
> +}
> +
> +static __attribute__((noinline))
> +unsigned tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, int flags)
> +{
> +#if TARGET_LONG_BITS == 64
> +
> +    if (sizeof(phys_pc) == sizeof(pc)) {
> +        return tb_hash_func5(phys_pc, pc, flags, 1);
> +    }