From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:57001) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UBPO9-00042b-2b for qemu-devel@nongnu.org; Fri, 01 Mar 2013 07:48:40 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UBPO5-00022t-Sl for qemu-devel@nongnu.org; Fri, 01 Mar 2013 07:48:37 -0500 Received: from mx1.redhat.com ([209.132.183.28]:65326) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UBPO5-00022l-KI for qemu-devel@nongnu.org; Fri, 01 Mar 2013 07:48:33 -0500 Message-ID: <5130A419.9020903@redhat.com> Date: Fri, 01 Mar 2013 13:50:33 +0100 From: Laszlo Ersek MIME-Version: 1.0 References: <513096A1.8020008@dlhnet.de> In-Reply-To: <513096A1.8020008@dlhnet.de> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] page_cache: use multiplicative hash for page position calculation List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Lieven Cc: Orit Wasserman , "qemu-devel@nongnu.org" On 03/01/13 12:53, Peter Lieven wrote: > instead of a linear mapping we use a multiplicative hash > with the golden ratio to derive the cache bucket from the > address. this helps to reduce collisions if memory positions > are multiple of the cache size and it avoids a division > in the position calculation. > > Signed-off-by: Peter Lieven > --- > page_cache.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/page_cache.c b/page_cache.c > index 376f1db..45d769a 100644 > --- a/page_cache.c > +++ b/page_cache.c > @@ -24,6 +24,7 @@ > #include > > #include "qemu-common.h" > +#include "qemu/host-utils.h" > #include "migration/page_cache.h" > > #ifdef DEBUG_CACHE > @@ -48,6 +49,7 @@ struct PageCache { > int64_t max_num_items; > uint64_t max_item_age; > int64_t num_items; > + uint64_t hash_shift_bits; > }; > > PageCache *cache_init(int64_t num_pages, unsigned int page_size) > @@ -72,6 +74,7 @@ PageCache *cache_init(int64_t num_pages, unsigned int > page_size) > cache->num_items = 0; > cache->max_item_age = 0; > cache->max_num_items = num_pages; > + cache->hash_shift_bits = clz64(num_pages-1); > > DPRINTF("Setting cache buckets to %" PRId64 "\n", > cache->max_num_items); > > @@ -108,7 +111,7 @@ static size_t cache_get_cache_pos(const PageCache > *cache, > size_t pos; > > g_assert(cache->max_num_items); > - pos = (address / cache->page_size) & (cache->max_num_items - 1); > + pos = (address * 0x9e3779b97f4a7c13) >> cache->hash_shift_bits; > return pos; > } > According to , the multiplier "is chosen as the integer that is relatively prime to" 2^64 "which is closest to" (sqrt(5)-1)/2 * 2^64. (sqrt(5)-1)/2 * 2^64 ~= 11400714819323198485.86699842797038469120 hence the constant would be a=0x9e3779b97f4a7c15. Any reason why a-2 is used in the patch? (Note: this is not a review or any suggestion to change the patch; I'm just curious.) A google-fight between "a" and "a-2" is inconclusive. So is stackoverflow: http://stackoverflow.com/questions/4113278/64-bit-multiplicative-hashing http://stackoverflow.com/questions/8513911/how-to-create-a-good-hash-combine-with-64-bit-output-inspired-by-boosthash-co Thanks Laszlo