From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1Me8Pw-0004Aq-6z for qemu-devel@nongnu.org; Thu, 20 Aug 2009 10:15:04 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1Me8Pq-0003v0-OJ for qemu-devel@nongnu.org; Thu, 20 Aug 2009 10:15:03 -0400 Received: from [199.232.76.173] (port=51384 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Me8Pq-0003un-FD for qemu-devel@nongnu.org; Thu, 20 Aug 2009 10:14:58 -0400 Date: Thu, 20 Aug 2009 11:14:40 -0300 From: Luiz Capitulino Message-ID: <20090820111440.076a468c@doriath> In-Reply-To: <4A8D10BE.8030804@gnu.org> References: <1250723280-3509-1-git-send-email-lcapitulino@redhat.com> <1250723280-3509-5-git-send-email-lcapitulino@redhat.com> <4A8D10BE.8030804@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: [Qemu-devel] Re: [PATCH 04/29] Introduce QDict List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, avi@redhat.com On Thu, 20 Aug 2009 11:00:46 +0200 Paolo Bonzini wrote: > > + > > +/** > > + * tdb_hash(): based on the hash agorithm from gdbm, via tdb > > + * (from module-init-tools) > > + */ > > +static unsigned int tdb_hash(const char *name) > > +{ > > + unsigned value; /* Used to compute the hash value. */ > > + unsigned i; /* Used to cycle through random values. */ > > + > > + /* Set the initial value from the key size. */ > > + for (value = 0x238F13AF * strlen(name), i=0; name[i]; i++) > > + value = (value + (((const unsigned char *)name)[i]<< (i*5 % 24))); > > This is a _bad_ hash function, especially because 5 is exactly the bit > that is flipped between lowercase and uppercase. So "aa" would collide > with "Ab". > > Besides, since the function is additive, you can initialize the value to > 0 and add "0x238F13AF * i" at the end (but this is just showing off...). > > Here is a very simple but good hash function: > > uintptr_t hashVal = 1497032417; /* arbitrary value */ > > while (len--) > { > hashVal += *str++; > hashVal += (hashVal << 10); > hashVal ^= (hashVal >> 6); > } > > return 1103515243 * value; [ I have to say that I was 100% sure that someone would come with a 'better' hash function, I'm actually impressed it took three submissions to happen. :) ] Well, to be honest I'm not a hash expert and took the first hash function I knew that was used by a widely used program. That said, I would not agree changing the hash function for this series, as we have no code in tree that can be used to really measure this and confirm what is good and what is not. Also, current monitor code will push a maximum of five objects into the dict, looks a minor issue to me.