[PATCH] [rmap] operator-sparse Fibonacci hashing of waitqueues

All of lore.kernel.org
 help / color / mirror / Atom feed

From: William Lee Irwin III <wli@holomorphy.com>
To: linux-kernel@vger.kernel.org
Cc: riel@surriel.com, davem@redhat.com, phillips@bonn-fries.net,
	rwhron@earthlink.net
Subject: [PATCH] [rmap] operator-sparse Fibonacci hashing of waitqueues
Date: Sun, 17 Feb 2002 01:01:11 -0800	[thread overview]
Message-ID: <20020217090111.GF832@holomorphy.com> (raw)

After distilling with hpa's help the results of some weeks-old
numerological experiments^W^Wnumber crunching, I've devised a patch
here for -rmap to make the waitqueue hashing somewhat more palatable
for SPARC and several others.

This patch uses some operator-sparse Fibonacci hashing primes in order
to allow shift/add implementations of the hash function used for hashed
waitqueues.


Dan, Dave, could you take a look here and please comment?

Randy, any chance you could benchmark -rmap with this on top for
comparison against standard -rmap to ensure there is no regression?


Thanks,
Bill

P.S.: Dave: Sorry I took so long to get this thing out, but here it is.


# This is a BitKeeper generated patch for the following project:
# Project Name: Long-term Linux VM development
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.199   ->        
#	include/linux/mm_inline.h	1.14    -> 1.15   
#	        mm/filemap.c	1.52    -> 1.53   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/02/17	wli@holomorphy.com	1.200
# Use operator-sparse Fibonacci hashing primes for the hashed waitqueues so as to reduce the cost of
# the hashing computation on several major RISC architectures and most 64-bit machines.
# 
# Specifically the search for operator-sparse primes was documented around the definitions of the
# GOLDEN_RATIO_PRIME values, where the number of terms in the sum of powers of 2 was chosen in
# advance and the numbers generated this way were sorted by their continued fraction expansion
# and then filtered for primality.
# 
# In turn the definition of page_waitqueue() also needed to be altered so that for 64-bit machines,
# whose compilers cannot perform the transformation of multiplication by a constant to shifts and
# adds, the transformation is explcitly coded, and documentation is included for it describing how
# the shifts and adds correspond to the signed sum of powers of two representation of the golden
# ratio prime.
# --------------------------------------------
#
diff -Nru a/include/linux/mm_inline.h b/include/linux/mm_inline.h
--- a/include/linux/mm_inline.h	Sun Feb 17 00:49:26 2002
+++ b/include/linux/mm_inline.h	Sun Feb 17 00:49:26 2002
@@ -9,12 +9,43 @@
  * Chuck Lever verified the effectiveness of this technique for several
  * hash tables in his paper documenting the benchmark results:
  * http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf
+ *
+ * These prime numbers were determined by searching all numbers
+ * composed of sums of seven terms of the form 2^n with n decreasing
+ * with the index of the summand and sorting by the continued fraction
+ * expansion of p/2^BITS_PER_LONG, then filtering for primality.
+ * Verifying that the confidence tests on the chi^2 statistics passed
+ * is necessary because this process can produce "bad" factors, but on
+ * the other hand it often produces excellent factors, so it's a very
+ * useful process for generating hash functions.
  */
+
 #if BITS_PER_LONG == 32
-#define GOLDEN_RATIO_PRIME 2654435761UL
+/*
+ * 2654404609 / 2^32 has the continued fraction expansion
+ * 0,1,1,1,1,1,1,1,1,1,1,1,1,18,7,1,3,7,18,1,1,1,1,1,1,1,1,1,1,2,0
+ * which is very close to phi.
+ * It is of the form 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1
+ * which makes it suitable for compiler optimization to shift/add
+ * implementations on machines where multiplication is a slow operation.
+ * gcc is successfully able to optimize this on 32-bit SPARC and MIPS.
+ */
+#define GOLDEN_RATIO_PRIME 0x9e370001UL
 
 #elif BITS_PER_LONG == 64
-#define GOLDEN_RATIO_PRIME 11400714819323198549UL
+/*
+ * 11400862456688148481 / 2^64 has the continued fraction expansion
+ * 0,1,1,1,1,1,1,1,1,1,1,1,2,1,14,1,1048579,15,1,2,1,1,3,9,1,1,8,3,1,7,
+ *    1,1,9,1,2,1,1,1,1,2,0
+ * which is very close to phi = (sqrt(5)-1)/2 = 0,1,1,1,....
+ *
+ * It is of the form 2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1
+ * which makes it suitable for shift/add implementations of the hash
+ * function. 64-bit architectures typically have slow multiplies (or
+ * no hardware multiply) and also gcc is unable to optimize 64-bit
+ * multiplies for these bit patterns so explicit expansion is used.
+ */
+#define GOLDEN_RATIO_PRIME 0x9e37fffffffc0001UL
 
 #else
 #error Define GOLDEN_RATIO_PRIME in mm_inline.h for your wordsize.
diff -Nru a/mm/filemap.c b/mm/filemap.c
--- a/mm/filemap.c	Sun Feb 17 00:49:26 2002
+++ b/mm/filemap.c	Sun Feb 17 00:49:26 2002
@@ -787,6 +787,9 @@
  * collisions. This cost is great enough that effective hashing
  * is necessary to maintain performance.
  */
+
+#if BITS_PER_LONG == 32
+
 static inline wait_queue_head_t *page_waitqueue(struct page *page)
 {
 	const zone_t *zone = page_zone(page);
@@ -798,6 +801,50 @@
 
 	return &wait[hash];
 }
+
+#elif BITS_PER_LONG == 64
+
+static inline wait_queue_head_t *page_waitqueue(struct page *page)
+{
+	const zone_t *zone = page_zone(page);
+	wait_queue_head_t *wait = zone->wait_table;
+	unsigned long hash = (unsigned long)page;
+	unsigned long n;
+
+	/*
+	 * The bit-sparse GOLDEN_RATIO_PRIME is a sum of powers of 2
+	 * with varying signs: 2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1
+	 * The shifts in the code below correspond to the differences
+	 * in the exponents above.
+	 * The primary reason why the work of expanding the multiplication
+	 * this way is not given to the compiler is because 64-bit code
+	 * generation does not seem to be capable of doing it. So the code
+	 * here manually expands it.
+	 * The differences are used in order to both reduce the number of
+	 * variables used and also to reduce the size of the immediates
+	 * needed for the shift instructions, whose precision is limited
+	 * on some architectures.
+	 */
+
+	n = hash;
+	n <<= 18;
+	hash -= n;
+	n <<= 33;
+	hash -= n;
+	n <<= 3;
+	hash += n;
+	n <<= 3;
+	hash -= n;
+	n <<= 4;
+	hash += n;
+	n <<= 2;
+	hash += n;
+
+	hash >>= zone->wait_table_shift;
+	return &wait[hash];
+}
+
+#endif /* BITS_PER_LONG == 64 */
 
 
 /*

next             reply	other threads:[~2002-02-17  9:01 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-02-17  9:01 William Lee Irwin III [this message]
2002-02-18 22:31 ` [PATCH] [rmap] operator-sparse Fibonacci hashing of waitqueues Daniel Phillips
2002-02-19  0:34   ` William Lee Irwin III
2002-02-19  0:46     ` William Lee Irwin III
2002-02-19  1:01     ` Daniel Phillips
2002-02-19  1:27     ` William Lee Irwin III
  -- strict thread matches above, loose matches on Subject: below --
2002-02-17 23:01 rwhron
2002-02-18  0:02 ` William Lee Irwin III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20020217090111.GF832@holomorphy.com \
    --to=wli@holomorphy.com \
    --cc=davem@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=phillips@bonn-fries.net \
    --cc=riel@surriel.com \
    --cc=rwhron@earthlink.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.