From: Paolo Bonzini
Date: Wed, 6 Apr 2016 20:23:42 +0200
Subject: Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash
To: "Emilio G. Cota"
Cc: MTTCG Devel, Peter Maydell, Peter Crosthwaite, QEMU Developers, Sergey Fedorov, Alex Bennée, Richard Henderson
Message-ID: <5705542E.60708@redhat.com>
In-Reply-To: <20160406174439.GB27512@flamenco>
References: <1459834253-8291-1-git-send-email-cota@braap.org>
 <1459834253-8291-8-git-send-email-cota@braap.org>
 <5703DCB7.50302@twiddle.net> <5703DE37.3080306@redhat.com>
 <5703E2DD.3020103@twiddle.net> <20160405194028.GA6671@flamenco>
 <5704293D.1070105@twiddle.net> <20160406005239.GA25081@flamenco>
 <5704F875.90509@redhat.com> <20160406174439.GB27512@flamenco>

On 06/04/2016 19:44, Emilio G. Cota wrote:
> I like this idea, because the ugliness of the sizeof checks is significant.
> However, the quality of the resulting hash is not as good when always using func5.
> For instance, when we'd otherwise use func3, two fifths of every input contain
> exactly the same bits: all 0's. This inevitably leads to more collisions.

It doesn't necessarily lead to more collisions.
For a completely stupid hash function "a+b+c+d+e", for example, adding
zeros doesn't add more collisions.

What if you rearrange the five words so that the "all 0" parts of the
input are all at the beginning, or all at the end?  Perhaps the problem
is that the odd words at the end are hashed less effectively.

Perhaps better is to always use a three-word xxhash, but pick the 64-bit
version if any of phys_pc and pc are 64-bits.  The unrolling would be
very effective, and the performance penalty not too important (64-bit on
32-bit is very slow anyway).

If you can fix the problems with the collisions, it also means that you
get good performance running 32-bit guests on qemu-system-x86_64 or
qemu-system-aarch64.  So, if you can do it, it's a very desirable property.

Paolo
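
To make the "a+b+c+d+e" remark concrete, here is a minimal sketch in C.
It is only an illustration: sum_hash3, sum_hash5, lcg_next and
check_zero_padding_is_neutral are hypothetical names, not functions from
the patch or from QEMU, and the additive hash stands in for the "stupid"
hash mentioned above, not for xxhash.

```c
#include <assert.h>
#include <stdint.h>

/* Deliberately weak additive hash over three 32-bit words (hypothetical). */
static uint32_t sum_hash3(uint32_t a, uint32_t b, uint32_t c)
{
    return a + b + c;
}

/* The same additive hash over five words, as when inputs are zero-padded. */
static uint32_t sum_hash5(uint32_t a, uint32_t b, uint32_t c,
                          uint32_t d, uint32_t e)
{
    return a + b + c + d + e;
}

/* Small linear congruential generator, used only to produce sample inputs. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/*
 * Checks that padding with two zero words never changes the hash value.
 * Because the values are identical, two inputs collide under the padded
 * five-word hash exactly when they collide under the three-word hash.
 * Returns the number of samples checked.
 */
static int check_zero_padding_is_neutral(void)
{
    uint32_t state = 0x12345678u; /* arbitrary seed */
    int checked = 0;

    for (int i = 0; i < 1000; i++) {
        uint32_t a = lcg_next(&state);
        uint32_t b = lcg_next(&state);
        uint32_t c = lcg_next(&state);

        assert(sum_hash3(a, b, c) == sum_hash5(a, b, c, 0, 0));
        checked++;
    }
    return checked;
}
```

This only shows that zero padding is neutral for an additive hash.  It
says nothing about how a real mixing function treats the padded words,
which is the open question in this thread.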