[PATCH next-next-2.6] netdev: better dev_name

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH next-next-2.6] netdev: better dev_name_hash
@ 2009-10-25 19:58 Octavian Purdila
  2009-10-25 20:17 ` Hagen Paul Pfeifer
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Octavian Purdila @ 2009-10-25 19:58 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 635 bytes --]


The current dev_name_hash is not very good at spreading entries when a
large number of interfaces of the same type (e.g. ethXXXXX) are used.

Here are some performance numbers for creating 16000 dummy interfaces with
and without the patch (with per device sysctl entries disabled)

    With patch                      Without patch

    real    0m 2.27s                real    0m 4.32s
    user    0m 0.00s                user    0m 0.00s
    sys     0m 1.13s                sys     0m 2.16s

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
 net/core/dev.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

[-- Attachment #2: 5504c10b4f96275a4b60d0705f71614b6eba6b5c.diff --]
[-- Type: text/x-patch, Size: 523 bytes --]

diff --git a/net/core/dev.c b/net/core/dev.c
index fa88dcd..af3cab3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -198,7 +198,13 @@ EXPORT_SYMBOL(dev_base_lock);
 
 static inline struct hlist_head *dev_name_hash(struct net *net, const char *name)
 {
-	unsigned hash = full_name_hash(name, strnlen(name, IFNAMSIZ));
+	unsigned hash = 0;
+	int len = strnlen(name, IFNAMSIZ);
+	int i;
+
+	for (i = 0; i < len; ++i)
+		hash = 31 * hash + name[i];
+
 	return &net->dev_name_head[hash & ((1 << NETDEV_HASHBITS) - 1)];
 }
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 19:58 [PATCH next-next-2.6] netdev: better dev_name_hash Octavian Purdila
@ 2009-10-25 20:17 ` Hagen Paul Pfeifer
  2009-10-25 21:24 ` Eric Dumazet
  2009-10-26  4:43 ` Stephen Hemminger
  2 siblings, 0 replies; 35+ messages in thread
From: Hagen Paul Pfeifer @ 2009-10-25 20:17 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: netdev

* Octavian Purdila | 2009-10-25 21:58:53 [+0200]:

>
>The current dev_name_hash is not very good at spreading entries when a
>large number of interfaces of the same type (e.g. ethXXXXX) are used.
>
>Here are some performance numbers for creating 16000 dummy interfaces with
>and without the patch (with per device sysctl entries disabled)
>
>    With patch                      Without patch
>
>    real    0m 2.27s                real    0m 4.32s
>    user    0m 0.00s                user    0m 0.00s
>    sys     0m 1.13s                sys     0m 2.16s

Can you rerun the test with jhash() as the hash function? Just for clearness -
the spreading of jhash should be more uniformly distributed. At the end: where
is the threshold where a more accurate hash function is superior.

HGN

-- 
Hagen Paul Pfeifer <hagen@jauu.net>  ||  http://jauu.net/
Telephone: +49 174 5455209           ||  Key Id: 0x98350C22
Key Fingerprint: 490F 557B 6C48 6D7E 5706 2EA2 4A22 8D45 9835 0C22

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 19:58 [PATCH next-next-2.6] netdev: better dev_name_hash Octavian Purdila
  2009-10-25 20:17 ` Hagen Paul Pfeifer
@ 2009-10-25 21:24 ` Eric Dumazet
  2009-10-25 21:55   ` Octavian Purdila
  2009-10-26  6:30   ` Stephen Hemminger
  2009-10-26  4:43 ` Stephen Hemminger
  2 siblings, 2 replies; 35+ messages in thread
From: Eric Dumazet @ 2009-10-25 21:24 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: netdev

Octavian Purdila a écrit :
> The current dev_name_hash is not very good at spreading entries when a
> large number of interfaces of the same type (e.g. ethXXXXX) are used.

Very true

> 
> Here are some performance numbers for creating 16000 dummy interfaces with
> and without the patch (with per device sysctl entries disabled)
> 
>     With patch                      Without patch
> 
>     real    0m 2.27s                real    0m 4.32s
>     user    0m 0.00s                user    0m 0.00s
>     sys     0m 1.13s                sys     0m 2.16s
> 

1) A program to show hash distribution would be more convincing,
and could show that 17 multiplier is better, not 31 :)

2) Why full_name_hash() not changed as well ?


$ cat test.c
#include <stdio.h>
#define init_name_hash()                0

/* partial hash update function. Assume roughly 4 bits per character */
static inline unsigned long
partial_name_hash(unsigned long c, unsigned long prevhash)
{
        return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/*
 * Finally: cut down the number of bits to a int value (and try to avoid
 * losing bits)
 */
static inline unsigned long end_name_hash(unsigned long hash)
{
        return (unsigned int) hash;
}

/* Compute the hash for a name string. */
static inline unsigned int
full_name_hash(const unsigned char *name, unsigned int len)
{
        unsigned long hash = init_name_hash();
        while (len--)
                hash = partial_name_hash(*name++, hash);
        return end_name_hash(hash);
}

static inline unsigned int
string_hash31(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 31 * hash + name[i];
        return hash;
}

static inline unsigned int
string_hash17(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 17 * hash + name[i];
        return hash;
}

unsigned int hashdist1[256], hashdist2[256], hashdist3[256];

void print_dist(unsigned int *dist, const char *name)
{
        unsigned int i;

        printf("%s[] = {\n", name);
        for (i = 0; i < 256; i++) {
                printf("%d, ", dist[i]);
                if ((i & 15) == 15) putchar('\n');
        }
        printf("};\n");
}

int main()
{
        int i;
        unsigned int hash;
        unsigned char name[16];

        for (i = 0; i < 16000; i++) {
                unsigned int len = sprintf(name, "dummy%d", i);

                hash = full_name_hash(name, len);
                hashdist1[hash & 255]++;

                hash = string_hash31(name, len);
                hashdist2[hash & 255]++;

                hash = string_hash17(name, len);
                hashdist3[hash & 255]++;
                }
        print_dist(hashdist1, "full_name_hash");
        print_dist(hashdist2, "string_hash31");
        print_dist(hashdist3, "string_hash17");
        return 0;
}

$ gcc -o test test.c
$ ./test
full_name_hash[] = {
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56,
0, 0, 0, 374, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 0, 0, 0, 0, 56,
0, 0, 0, 376, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 57,
0, 0, 0, 374, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 376, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
};
string_hash31[] = {
42, 54, 67, 81, 92, 103, 116, 125, 131, 131, 133, 128, 121, 110, 101, 87,
72, 59, 47, 36, 25, 17, 12, 8, 4, 4, 6, 7, 11, 15, 24, 32,
42, 54, 68, 79, 91, 105, 117, 127, 130, 135, 133, 129, 120, 113, 100, 86,
71, 58, 46, 33, 24, 16, 11, 6, 4, 4, 5, 7, 10, 16, 23, 32,
41, 54, 66, 78, 91, 104, 117, 123, 130, 133, 134, 127, 122, 112, 101, 86,
72, 61, 47, 35, 25, 19, 12, 8, 5, 6, 6, 7, 11, 16, 24, 31,
43, 54, 66, 78, 92, 106, 116, 125, 131, 136, 132, 129, 121, 112, 98, 84,
72, 58, 44, 32, 24, 16, 11, 6, 5, 5, 4, 7, 10, 17, 23, 32,
41, 54, 65, 78, 92, 105, 116, 123, 132, 133, 133, 127, 122, 111, 99, 86,
74, 60, 45, 35, 26, 19, 12, 8, 7, 5, 5, 7, 12, 17, 24, 31,
43, 54, 66, 80, 94, 107, 116, 126, 131, 135, 131, 129, 120, 110, 97, 85,
71, 56, 44, 33, 25, 16, 11, 7, 5, 3, 4, 7, 11, 17, 22, 32,
43, 54, 66, 80, 94, 105, 115, 124, 133, 132, 132, 127, 121, 110, 99, 87,
74, 59, 46, 37, 26, 19, 12, 9, 6, 4, 5, 8, 12, 16, 23, 32,
43, 53, 67, 81, 94, 105, 116, 128, 132, 135, 133, 130, 121, 112, 100, 88,
72, 57, 46, 34, 25, 17, 11, 7, 4, 3, 5, 8, 11, 16, 22, 33,
};
string_hash17[] = {
68, 73, 72, 69, 62, 57, 52, 53, 60, 66, 71, 69, 69, 68, 67, 64,
67, 68, 70, 68, 63, 56, 53, 51, 57, 64, 68, 71, 68, 67, 66, 65,
62, 65, 67, 68, 64, 59, 55, 52, 54, 60, 67, 69, 69, 65, 65, 61,
59, 60, 63, 64, 64, 60, 55, 54, 54, 61, 67, 72, 72, 71, 66, 64,
60, 59, 60, 64, 64, 62, 58, 56, 55, 59, 66, 70, 73, 69, 67, 62,
58, 53, 56, 57, 60, 59, 57, 54, 53, 56, 62, 69, 71, 72, 67, 64,
57, 53, 51, 54, 58, 60, 59, 57, 57, 56, 63, 69, 74, 74, 71, 65,
60, 53, 50, 52, 55, 58, 59, 58, 57, 58, 61, 68, 73, 80, 77, 73,
66, 59, 52, 52, 54, 60, 61, 58, 58, 58, 59, 64, 71, 73, 78, 71,
66, 57, 50, 46, 50, 54, 59, 61, 58, 59, 60, 65, 70, 75, 78, 80,
72, 65, 56, 50, 49, 53, 59, 63, 62, 60, 62, 63, 68, 72, 77, 77,
76, 67, 58, 49, 46, 49, 55, 60, 62, 62, 59, 62, 65, 70, 72, 78,
75, 73, 62, 53, 47, 47, 52, 58, 63, 62, 63, 61, 64, 67, 71, 73,
76, 72, 68, 57, 49, 46, 50, 57, 62, 65, 65, 65, 64, 67, 69, 73,
76, 76, 71, 65, 54, 49, 49, 55, 62, 65, 65, 64, 65, 63, 66, 67,
71, 71, 70, 63, 57, 49, 47, 52, 58, 65, 66, 67, 65, 67, 65, 67,
};


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 21:24 ` Eric Dumazet
@ 2009-10-25 21:55   ` Octavian Purdila
  2009-10-25 22:41     ` Hagen Paul Pfeifer
  2009-10-26  6:30   ` Stephen Hemminger
  1 sibling, 1 reply; 35+ messages in thread
From: Octavian Purdila @ 2009-10-25 21:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

[-- Attachment #1: Type: Text/Plain, Size: 2271 bytes --]

On Sunday 25 October 2009 23:24:10 you wrote:
> Octavian Purdila a écrit :
> > The current dev_name_hash is not very good at spreading entries when a
> > large number of interfaces of the same type (e.g. ethXXXXX) are used.
> 
> Very true
> 
> > Here are some performance numbers for creating 16000 dummy interfaces
> > with and without the patch (with per device sysctl entries disabled)
> >
> >     With patch                      Without patch
> >
> >     real    0m 2.27s                real    0m 4.32s
> >     user    0m 0.00s                user    0m 0.00s
> >     sys     0m 1.13s                sys     0m 2.16s
> 
> 1) A program to show hash distribution would be more convincing,
> and could show that 17 multiplier is better, not 31 :)
> 

You beat me to it :) 

But anyways, I've attached mine as well , which compares orig, jhash, new17, 
new31, with different number of interfaces as well as different hash bits. 

The score it shows its the number of iterations needed to find all elements in 
the hash (good enough metric?)

My results shows that new17 is better or very close to jhash2. And I think its 
lighter then jhash too.

> 2) Why full_name_hash() not changed as well ?

I don't think it is going to be as easy to "benchmark" this, as it is used 
with more diverse inputs. 

AFAIK, full_name_hash is used by the dentry code, meaning that it cashes path 
names? If so, perhaps we can use find {/etc,/bin,/sbin,/usr} as input patterns? 

My results:

$ ./dev_name_hash ixint 255 0 8
score 1140
$ ./dev_name_hash ixint 255 1 8
score 379
$ ./dev_name_hash ixint 255 2 8
score 461
$ ./dev_name_hash ixint 255 3 8
score 329
$ ./dev_name_hash ixint 1000 0 8
score 26074
$ ./dev_name_hash ixint 1000 1 8
score 2887
$ ./dev_name_hash ixint 1000 2 8
score 2853
$ ./dev_name_hash ixint 1000 3 8
score 3713
$ ./dev_name_hash ixint 16000 0 8
score 3689828
$ ./dev_name_hash ixint 16000 1 8
score 516389
$ ./dev_name_hash ixint 16000 2 8
score 515292
$ ./dev_name_hash ixint 16000 3 8
score 554042
$ ./dev_name_hash ixint 16000 0 16
score 24741
$ ./dev_name_hash ixint 16000 1 16
score 17949
$ ./dev_name_hash ixint 16000 2 16
score 16715
$ ./dev_name_hash ixint 16000 3 16
score 18125

[-- Attachment #2: dev_name_hash.c --]
[-- Type: text/x-csrc, Size: 4994 bytes --]

#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <net/if.h>

typedef unsigned int u32;
typedef unsigned char u8;

#define __jhash_mix(a, b, c) \
{ \
  a -= b; a -= c; a ^= (c>>13); \
  b -= c; b -= a; b ^= (a<<8); \
  c -= a; c -= b; c ^= (b>>13); \
  a -= b; a -= c; a ^= (c>>12);  \
  b -= c; b -= a; b ^= (a<<16); \
  c -= a; c -= b; c ^= (b>>5); \
  a -= b; a -= c; a ^= (c>>3);  \
  b -= c; b -= a; b ^= (a<<10); \
  c -= a; c -= b; c ^= (b>>15); \
}

/* The golden ration: an arbitrary value */
#define JHASH_GOLDEN_RATIO	0x9e3779b9

/* The most generic version, hashes an arbitrary sequence
 * of bytes.  No alignment or length assumptions are made about
 * the input key.
 */
static inline u32 jhash(const void *key, u32 length, u32 initval)
{
	u32 a, b, c, len;
	const u8 *k = key;

	len = length;
	a = b = JHASH_GOLDEN_RATIO;
	c = initval;

	while (len >= 12) {
		a += (k[0] +((u32)k[1]<<8) +((u32)k[2]<<16) +((u32)k[3]<<24));
		b += (k[4] +((u32)k[5]<<8) +((u32)k[6]<<16) +((u32)k[7]<<24));
		c += (k[8] +((u32)k[9]<<8) +((u32)k[10]<<16)+((u32)k[11]<<24));

		__jhash_mix(a,b,c);

		k += 12;
		len -= 12;
	}

	c += length;
	switch (len) {
	case 11: c += ((u32)k[10]<<24);
	case 10: c += ((u32)k[9]<<16);
	case 9 : c += ((u32)k[8]<<8);
	case 8 : b += ((u32)k[7]<<24);
	case 7 : b += ((u32)k[6]<<16);
	case 6 : b += ((u32)k[5]<<8);
	case 5 : b += k[4];
	case 4 : a += ((u32)k[3]<<24);
	case 3 : a += ((u32)k[2]<<16);
	case 2 : a += ((u32)k[1]<<8);
	case 1 : a += k[0];
	};

	__jhash_mix(a,b,c);

	return c;
}


/* Name hashing routines. Initial hash value */
/* Hash courtesy of the R5 hash in reiserfs modulo sign bits */
#define init_name_hash()		0

/* partial hash update function. Assume roughly 4 bits per character */
static inline unsigned long
partial_name_hash(unsigned long c, unsigned long prevhash)
{
	return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/*
 * Finally: cut down the number of bits to a int value (and try to avoid
 * losing bits)
 */
static inline unsigned long end_name_hash(unsigned long hash)
{
	return (unsigned int) hash;
}

/* Compute the hash for a name string. */
static inline unsigned int
full_name_hash(const char *name, unsigned int len)
{
	unsigned long hash = init_name_hash();
	while (len--)
		hash = partial_name_hash(*name++, hash);
	return end_name_hash(hash);
}

unsigned int dev_name_hash_new17(const char *name)
{
	unsigned hash = 0;
	int len = strnlen(name, IFNAMSIZ);
	int i;

	for (i = 0; i < len; ++i)
		hash = 17 * hash + name[i];
	return hash;
}

unsigned int dev_name_hash_new31(const char *name)
{
	unsigned hash = 0;
	int len = strnlen(name, IFNAMSIZ);
	int i;

	for (i = 0; i < len; ++i)
		hash = 31 * hash + name[i];
	return hash;
}

unsigned int dev_name_hash_orig(const char *name)
{
	return full_name_hash(name, strnlen(name, IFNAMSIZ));
}

unsigned int dev_name_hash_jhash(const char *name)
{
	return jhash(name, strnlen(name, IFNAMSIZ), 0);
}

typedef unsigned int (*dev_hash_name)(const char *name);

dev_hash_name hfn[] = {
	dev_name_hash_orig,
	dev_name_hash_jhash,
	dev_name_hash_new17,
	dev_name_hash_new31,
};

struct hocs {
	unsigned value;
	int occurences;
};

#if 0
#define debug(x...) printf(x)
#else
#define debug(x...) 
#endif

int main(int argc, char **argv)
{
	char dev_name[64];
	int no, htype, i, j, score, hbits;
	unsigned *h;
	/* stores hash elements by value and number of occurences */
	struct hocs *hocs;

	if (argc != 5) {
		fprintf(stderr, "syntax: dev_name_hash prefix no_of_names hash_type hash_bits\n");
		return -1;
	}

	no = atoi(argv[2]);
	if (no <= 0) {
		fprintf(stderr, "invalid number: %d\n", no);
		return -1;
	}

	htype = atoi(argv[3]);
	if (htype < 0 || htype >= sizeof(hfn)) {
		fprintf(stderr, "invalid hash type: %d\n", htype);
		return -1;
	}

	hbits = atoi(argv[4]);
	if (hbits <= 0) {
		fprintf(stderr, "invalid hash bits: %d\n", hbits);
		return -1;
	}
	
	h = malloc(no * sizeof(unsigned));

	for(i = 0; i < no; i++) {
		snprintf(dev_name, sizeof(dev_name), "%s%d", argv[1], i);
		h[i] = hfn[htype](dev_name) & ((1 << hbits) - 1);
		debug("h[i] = %x\n", h[i]);
	}

	hocs = malloc(no * sizeof(struct hocs));
	memset(hocs, 0, no * sizeof(unsigned));

	/* find occurences */
	for(i = 0; i < no; i++) {
		unsigned hash, dup;

		hash= h[i];
		dup = 0;
		hocs[i].value = hash;
		
		/* did we count this value already? */
		for(j = 0; j < i; j++) {
			if (i == j)
				continue;
			if (hocs[j].value == hash) {
				dup = 1;
				break;
			}
		}
		if (dup)
			continue;

		hocs[i].occurences = 1;

		for(j = 0; j < no; j++) {
			if (i == j)
				continue;
			if (hash == h[j])
				hocs[i].occurences++;
		}
	}

	/* the score is the number of iterations required to find each element
	   once */
	score = 0;
	for(i = 0; i < no; i++) {
		debug("hocs[%d] value = %x occurences = %d\n", i, hocs[i].value, hocs[i].occurences);
		score += hocs[i].occurences * (hocs[i].occurences + 1) / 2;
	}

	printf("score %d\n", score);

	return 0;
}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 21:55   ` Octavian Purdila
@ 2009-10-25 22:41     ` Hagen Paul Pfeifer
  2009-10-25 22:45       ` Octavian Purdila
  2009-10-26  5:28       ` Eric Dumazet
  0 siblings, 2 replies; 35+ messages in thread
From: Hagen Paul Pfeifer @ 2009-10-25 22:41 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: Eric Dumazet, netdev

* Octavian Purdila | 2009-10-25 23:55:32 [+0200]:

>My results shows that new17 is better or very close to jhash2. And I think its 
>lighter then jhash too.

If new17 is very close to jhash/jhash2 then the cycles comes into play.
Anyway, there is already a very potent hash interface in form of jhash{,2}.

+1 for jhash2

HGN

PS: great work! ;)

PPS: http://libhashish.sourceforge.net/ have some real hash benchmarks in form
of avalanche test and some others too. It does not really matter here because
Jenkins performs _nearly_ perfect in all cases. ;)

-- 
Hagen Paul Pfeifer <hagen@jauu.net>  ||  http://jauu.net/
Telephone: +49 174 5455209           ||  Key Id: 0x98350C22
Key Fingerprint: 490F 557B 6C48 6D7E 5706 2EA2 4A22 8D45 9835 0C22

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 22:41     ` Hagen Paul Pfeifer
@ 2009-10-25 22:45       ` Octavian Purdila
  2009-10-26  5:28       ` Eric Dumazet
  1 sibling, 0 replies; 35+ messages in thread
From: Octavian Purdila @ 2009-10-25 22:45 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: Eric Dumazet, netdev

On Monday 26 October 2009 00:41:54 you wrote:
> * Octavian Purdila | 2009-10-25 23:55:32 [+0200]:
> >My results shows that new17 is better or very close to jhash2. And I think
> > its lighter then jhash too.
> 
> If new17 is very close to jhash/jhash2 then the cycles comes into play.
> Anyway, there is already a very potent hash interface in form of jhash{,2}.
> 

Ah, sorry for the confusion, jhash2 was a typo, I've only tested jhash.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 19:58 [PATCH next-next-2.6] netdev: better dev_name_hash Octavian Purdila
  2009-10-25 20:17 ` Hagen Paul Pfeifer
  2009-10-25 21:24 ` Eric Dumazet
@ 2009-10-26  4:43 ` Stephen Hemminger
  2009-10-26 22:36   ` [PATCH] dcache: better name hash function Stephen Hemminger <shemminger@vyatta.com>, Al Viro
  2 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-26  4:43 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: netdev

On Sun, 25 Oct 2009 21:58:53 +0200
Octavian Purdila <opurdila@ixiacom.com> wrote:

> 
> The current dev_name_hash is not very good at spreading entries when a
> large number of interfaces of the same type (e.g. ethXXXXX) are used.
> 
> Here are some performance numbers for creating 16000 dummy interfaces with
> and without the patch (with per device sysctl entries disabled)
> 
>     With patch                      Without patch
> 
>     real    0m 2.27s                real    0m 4.32s
>     user    0m 0.00s                user    0m 0.00s
>     sys     0m 1.13s                sys     0m 2.16s
> 
> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
> ---
>  net/core/dev.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)

My $.02 would for fixing full name hash because other usages would
benefit as well.  Inventing special case for network devices seems
unnecessary.

-- 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 22:41     ` Hagen Paul Pfeifer
  2009-10-25 22:45       ` Octavian Purdila
@ 2009-10-26  5:28       ` Eric Dumazet
  2009-10-26 13:07         ` Krishna Kumar2
  1 sibling, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2009-10-26  5:28 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: Octavian Purdila, netdev

Hagen Paul Pfeifer a écrit :
> * Octavian Purdila | 2009-10-25 23:55:32 [+0200]:
> 
>> My results shows that new17 is better or very close to jhash2. And I think its 
>> lighter then jhash too.
> 
> If new17 is very close to jhash/jhash2 then the cycles comes into play.
> Anyway, there is already a very potent hash interface in form of jhash{,2}.
> 
> +1 for jhash2
> 

In fact, new10 should be the 'perfect' hash for the "eth%d"
netdev use, not jhash (way more expensive in cpu cycles BTW)

Most linux machines in the world have less than 10 interfaces, jhash
would be really overkill.


Thanks


[PATCH net-next-2.6] netdev: better dev_name_hash

Octavian Purdila pointed out that the current dev_name_hash is 
not very good at spreading entries when a large number of interfaces
of the same type (e.g. ethXXXXX) are used.

Here is hash distribution of 16000 "dummy%d" devices :

full_name_hash[] = {
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56,
0, 0, 0, 374, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 0, 0, 0, 0, 56,
0, 0, 0, 376, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 57,
0, 0, 0, 374, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 376, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
};

Instead of using full_name_hash(), netdev should use a hash suited
to its typical uses, which are a common substring followed by a base 10 number.

new hash distribution :

string_hash10[] = {
62, 63, 61, 60, 61, 63, 61, 62, 64, 62, 61, 62, 62, 60, 60, 61,
61, 59, 60, 63, 61, 60, 62, 63, 62, 60, 60, 60, 59, 60, 61, 59,
58, 61, 61, 60, 60, 61, 61, 58, 58, 59, 58, 57, 58, 59, 58, 59,
60, 60, 59, 61, 63, 61, 60, 60, 62, 61, 60, 61, 61, 60, 61, 62,
61, 62, 63, 63, 62, 62, 64, 64, 61, 62, 63, 62, 62, 63, 64, 64,
64, 64, 64, 62, 64, 65, 62, 62, 63, 63, 62, 62, 63, 64, 62, 62,
64, 62, 63, 65, 64, 63, 63, 64, 64, 63, 63, 67, 65, 64, 66, 66,
66, 66, 66, 65, 64, 63, 65, 63, 63, 66, 66, 64, 64, 65, 65, 64,
63, 66, 64, 64, 65, 65, 63, 64, 65, 63, 62, 61, 64, 61, 61, 63,
65, 64, 63, 64, 62, 62, 62, 64, 61, 61, 63, 63, 63, 63, 65, 64,
62, 61, 63, 61, 61, 62, 61, 61, 62, 63, 62, 62, 63, 66, 62, 61,
62, 62, 62, 61, 62, 61, 61, 61, 64, 62, 63, 65, 63, 63, 63, 64,
62, 60, 60, 63, 61, 61, 63, 62, 63, 65, 65, 63, 62, 63, 65, 62,
62, 64, 63, 63, 65, 66, 65, 64, 64, 65, 63, 64, 66, 63, 63, 65,
66, 64, 63, 64, 66, 64, 63, 65, 63, 64, 66, 66, 64, 63, 64, 64,
62, 62, 64, 61, 60, 62, 63, 62, 61, 62, 63, 61, 62, 63, 60, 59,
}; 


Based on a previous patch from Octavian Purdila

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/core/dev.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index fa88dcd..e192068 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -196,9 +196,22 @@ EXPORT_SYMBOL(dev_base_lock);
 #define NETDEV_HASHBITS	8
 #define NETDEV_HASHENTRIES (1 << NETDEV_HASHBITS)
 
+/*
+ * Because of "eth%d" patterns, following hash is giving good distribution
+ */
+static inline unsigned int string_hash10(const char *name, unsigned int len)
+{
+	unsigned int i, hash = 0;
+
+	for (i = 0; i < len; i++)
+		hash = hash * 10 + name[i];
+
+	return hash;
+}
+
 static inline struct hlist_head *dev_name_hash(struct net *net, const char *name)
 {
-	unsigned hash = full_name_hash(name, strnlen(name, IFNAMSIZ));
+	unsigned hash = string_hash10(name, strnlen(name, IFNAMSIZ));
 	return &net->dev_name_head[hash & ((1 << NETDEV_HASHBITS) - 1)];
 }
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-25 21:24 ` Eric Dumazet
  2009-10-25 21:55   ` Octavian Purdila
@ 2009-10-26  6:30   ` Stephen Hemminger
  2009-10-26  7:48     ` Eric Dumazet
  1 sibling, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-26  6:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Octavian Purdila, netdev

[-- Attachment #1: Type: text/plain, Size: 1536 bytes --]

I overkilled this with more functions and compared filenames as well.


genarated names (dummyNNNN)
Algorithm             Time (us)     Ratio       Max   StdDev
kr_hash                  277925   152408.6   468448 543.19
string_hash31            329356     5859.4    16042  44.18
SuperFastHash            324570     4885.9    10502   2.29
djb2                     327908     5608.5    15210  38.08
string_hash17            326769     4883.6     9896   0.76
full_name_hash           343196    63921.0   140628 343.62
jhash_string             463801     4883.8    10085   1.02
sdbm                     398587     9801.7    29634  99.18

filesystem names
Algorithm             Time (us)     Ratio       Max   StdDev
kr_hash                  278840   152314.9   468717 543.01
string_hash31            331206     5802.1    16004  42.87
SuperFastHash            325938     4887.5    13528   2.88
djb2                     330621     5607.1    15333  38.05
string_hash17            331181     4884.9    13274   1.78
full_name_hash           347312    63972.2   141336 343.77
jhash_string             466799     4885.2    13275   1.92
sdbm                     403691     9771.7    29629  98.88

Ratio is the average number of buckets examined when scanning
the whole set of names.


1) Increased hash buckets to 1024 which seems reasonable if we are
   going to test that many names.
2) Increased name size to 256 so that longer filenames could be
   checked and name blocks were not in same cache line

* SuperFastHash is too big to put inline


[-- Attachment #2: hashtest.c --]
[-- Type: text/x-c++src, Size: 6872 bytes --]

#include <stdio.h>
#include <stdint.h>
#include <malloc.h>
#include <sys/types.h>
#include <sys/time.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>

static unsigned int
kr_hash(const unsigned char *str, unsigned int len)
{
	unsigned int hash = 0;

        while (len--)
	    hash += *str++;

	return hash;
}

static unsigned int
sdbm(const unsigned char *name, unsigned int len)
{
        unsigned long hash = 0;

        while (len--)
            hash = *name++ + (hash << 6) + (hash << 16) - hash;

        return hash;
}

#define init_name_hash()                0

/* partial hash update function. Assume roughly 4 bits per character */
static inline unsigned long
partial_name_hash(unsigned long c, unsigned long prevhash)
{
        return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/* Compute the hash for a name string. */
static unsigned int
full_name_hash(const unsigned char *name, unsigned int len)
{
        unsigned long hash = 0;
        while (len--)
                hash = partial_name_hash(*name++, hash);
        return hash;
}

static unsigned int
djb2(const unsigned char *str, unsigned int len)
{
	unsigned long hash = 5381;

        while (len--)
		hash = ((hash << 5) + hash) + *str++; /* hash * 33 + c */

        return hash;
}

static unsigned int
string_hash31(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 31 * hash + name[i];
        return hash;
}

static unsigned int
string_hash17(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 17 * hash + name[i];
        return hash;
}

/* will need to change to unaligned_access() */
static inline uint16_t get16bits(const unsigned char *data)
{
	return *(uint16_t *) data;
}

static inline unsigned int
SuperFastHash (const unsigned char * data, unsigned int len)
{
	uint32_t hash = len, tmp;
	int rem;

	rem = len & 3;
	len >>= 2;

	/* Main loop */
	for (;len > 0; len--) {
		hash  += get16bits (data);
		tmp    = (get16bits (data+2) << 11) ^ hash;
		hash   = (hash << 16) ^ tmp;
		data  += 2*sizeof (uint16_t);
		hash  += hash >> 11;
	}

	/* Handle end cases */
	switch (rem) {
        case 3: hash += get16bits (data);
                hash ^= hash << 16;
                hash ^= data[sizeof (uint16_t)] << 18;
                hash += hash >> 11;
                break;
        case 2: hash += get16bits (data);
                hash ^= hash << 11;
                hash += hash >> 17;
                break;
        case 1: hash += *data;
                hash ^= hash << 10;
                hash += hash >> 1;
	}

	/* Force "avalanching" of final 127 bits */
	hash ^= hash << 3;
	hash += hash >> 5;
	hash ^= hash << 4;
	hash += hash >> 17;
	hash ^= hash << 25;
	hash += hash >> 6;

	return hash;
}


/* NOTE: Arguments are modified. */
#define __jhash_mix(a, b, c) \
{ \
  a -= b; a -= c; a ^= (c>>13); \
  b -= c; b -= a; b ^= (a<<8); \
  c -= a; c -= b; c ^= (b>>13); \
  a -= b; a -= c; a ^= (c>>12);  \
  b -= c; b -= a; b ^= (a<<16); \
  c -= a; c -= b; c ^= (b>>5); \
  a -= b; a -= c; a ^= (c>>3);  \
  b -= c; b -= a; b ^= (a<<10); \
  c -= a; c -= b; c ^= (b>>15); \
}

/* The golden ration: an arbitrary value */
#define JHASH_GOLDEN_RATIO	0x9e3779b9

/* The most generic version, hashes an arbitrary sequence
 * of bytes.  No alignment or length assumptions are made about
 * the input key.
 */
static inline uint32_t jhash(const void *key, uint32_t length, uint32_t initval)
{
	uint32_t a, b, c, len;
	const uint8_t *k = key;

	len = length;
	a = b = JHASH_GOLDEN_RATIO;
	c = initval;

	while (len >= 12) {
		a += (k[0] +((uint32_t)k[1]<<8) +((uint32_t)k[2]<<16) +((uint32_t)k[3]<<24));
		b += (k[4] +((uint32_t)k[5]<<8) +((uint32_t)k[6]<<16) +((uint32_t)k[7]<<24));
		c += (k[8] +((uint32_t)k[9]<<8) +((uint32_t)k[10]<<16)+((uint32_t)k[11]<<24));

		__jhash_mix(a,b,c);

		k += 12;
		len -= 12;
	}

	c += length;
	switch (len) {
	case 11: c += ((uint32_t)k[10]<<24);
	case 10: c += ((uint32_t)k[9]<<16);
	case 9 : c += ((uint32_t)k[8]<<8);
	case 8 : b += ((uint32_t)k[7]<<24);
	case 7 : b += ((uint32_t)k[6]<<16);
	case 6 : b += ((uint32_t)k[5]<<8);
	case 5 : b += k[4];
	case 4 : a += ((uint32_t)k[3]<<24);
	case 3 : a += ((uint32_t)k[2]<<16);
	case 2 : a += ((uint32_t)k[1]<<8);
	case 1 : a += k[0];
	};

	__jhash_mix(a,b,c);

	return c;
}

static unsigned int
jhash_string(const unsigned char *name, unsigned int len)
{
	return jhash(name, len, 0);
}

#define TESTSIZE	10000000ul
#define HASHBITS	10
#define HASHSZ		(1<<HASHBITS)
#define HASHMSK		(HASHSZ-1)
#define IFNAMSIZ	256

static char *names[TESTSIZE];

static void generate_names(void)
{
	unsigned int i;
	char *buf = malloc(TESTSIZE * IFNAMSIZ);

	printf("\ngenarated names (dummyNNNN)\n");
	for (i = 0; i < TESTSIZE; i++) {
		snprintf(buf, IFNAMSIZ, "dummy%d", i);
		names[i] = buf;
		buf += IFNAMSIZ;
	}
}

static inline void measure(const char *name,  
			   unsigned int (*hash)(const unsigned char *, unsigned int))
{
	unsigned int i;
	struct timeval t0, t1;
	unsigned int dist[HASHSZ] = { 0 };
	unsigned long long ratio = 0;
	long us;
	unsigned long m = 0;
	const double avg = TESTSIZE / HASHSZ;
	double std = 0;

	gettimeofday(&t0, NULL);
	
	for (i = 0; i < TESTSIZE; i++) {
		const unsigned char *str = (const unsigned char *) names[i];
		unsigned int h = hash(str, strlen(names[i]));
		++dist[h & HASHMSK];
	}

	gettimeofday(&t1, NULL);
	us = (t1.tv_sec - t0.tv_sec) * 1000000;
	us += (t1.tv_usec - t0.tv_usec);

	for (i = 0; i < HASHSZ; i++) {
		unsigned int n = dist[i];
		unsigned long long s;
		double delta = (n - avg);

		s = n;
		s *= (1 + n);
		ratio += s / 2;
		if (n > m) m = n;
		
		std += delta * delta;
	}

	printf("%-20s %10ld %10.1f %8ld %6.2f\n", name, us, 
	       (double) ratio / (double)TESTSIZE, m, sqrt(std / TESTSIZE));

}
#define MEASURE(func)	measure(#func, &func)


static void filesystem_names(void)
{
	FILE *f = fopen("/tmp/names", "r");
	if (!f) { perror("/tmp/names"); exit(1); }
	unsigned int i = 0;
	char line[IFNAMSIZ];

	printf("\nfilesystem names\n");
	while (fgets(line, sizeof line, f) != NULL) {
		strncpy(names[i], line, IFNAMSIZ);
		if (++i == TESTSIZE)
			break;
	}
	fclose(f);

}


int main()
{

	generate_names();

	printf("Algorithm             Time (us)     Ratio       Max   StdDev\n");
	MEASURE(kr_hash);
	MEASURE(string_hash31);
	MEASURE(SuperFastHash);
	MEASURE(djb2);
	MEASURE(string_hash17);
	MEASURE(full_name_hash);
	MEASURE(jhash_string);
	MEASURE(sdbm);

	filesystem_names();

	printf("Algorithm             Time (us)     Ratio       Max   StdDev\n");
	MEASURE(kr_hash);
	MEASURE(string_hash31);
	MEASURE(SuperFastHash);
	MEASURE(djb2);
	MEASURE(string_hash17);
	MEASURE(full_name_hash);
	MEASURE(jhash_string);
	MEASURE(sdbm);

        return 0;
}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26  6:30   ` Stephen Hemminger
@ 2009-10-26  7:48     ` Eric Dumazet
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Dumazet @ 2009-10-26  7:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Octavian Purdila, netdev

Stephen Hemminger a écrit :
> I overkilled this with more functions and compared filenames as well.
> 
> 
> genarated names (dummyNNNN)
> Algorithm             Time (us)     Ratio       Max   StdDev
> kr_hash                  277925   152408.6   468448 543.19
> string_hash31            329356     5859.4    16042  44.18
> SuperFastHash            324570     4885.9    10502   2.29
> djb2                     327908     5608.5    15210  38.08
> string_hash17            326769     4883.6     9896   0.76
> full_name_hash           343196    63921.0   140628 343.62
> jhash_string             463801     4883.8    10085   1.02
> sdbm                     398587     9801.7    29634  99.18
> 
> filesystem names
> Algorithm             Time (us)     Ratio       Max   StdDev
> kr_hash                  278840   152314.9   468717 543.01
> string_hash31            331206     5802.1    16004  42.87
> SuperFastHash            325938     4887.5    13528   2.88
> djb2                     330621     5607.1    15333  38.05
> string_hash17            331181     4884.9    13274   1.78
> full_name_hash           347312    63972.2   141336 343.77
> jhash_string             466799     4885.2    13275   1.92
> sdbm                     403691     9771.7    29629  98.88
> 
> Ratio is the average number of buckets examined when scanning
> the whole set of names.
> 
> 
> 1) Increased hash buckets to 1024 which seems reasonable if we are
>    going to test that many names.
> 2) Increased name size to 256 so that longer filenames could be
>    checked and name blocks were not in same cache line
> 
> * SuperFastHash is too big to put inline
> 
> 

Thanks Stephen

1) dcache hash is very big on average machines.

2) dcache : We hash last component, against its parent, acting as a base.
   Your hashtest program considers the name as a single entity, giving different
   hash distribution.

3) netdev names are special, since we have only one parent, and
 smaller hash table.

4) jhash is not that expensive, but it might be because of huge working set of your
test program : strings are not in cpu caches and speed is mostly driven by ram bandwidth.

But current full_name_hash() seems a pretty bad choice !

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26  5:28       ` Eric Dumazet
@ 2009-10-26 13:07         ` Krishna Kumar2
  2009-10-26 14:31           ` Octavian Purdila
  0 siblings, 1 reply; 35+ messages in thread
From: Krishna Kumar2 @ 2009-10-26 13:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Hagen Paul Pfeifer, netdev, Octavian Purdila

Eric Dumazet wrote on 10/26/2009 10:58:34 AM:

> In fact, new10 should be the 'perfect' hash for the "eth%d"
> netdev use, not jhash (way more expensive in cpu cycles BTW)
>
> Most linux machines in the world have less than 10 interfaces, jhash
> would be really overkill.
>
>
> Thanks
>
>
> [PATCH net-next-2.6] netdev: better dev_name_hash

Changing Eric's test program to pass a multiplier to string_hash()
and calling string_hash with multipler=<2 - 63> confirms that 10
is almost always the best number for varying netdev names. I print
the number of times perfect 64 was scored, and for most passed
device names, the best is for n=10, followed by n=5 and others.
Almost the worst was n=31, slightly better was n=17.

But other variables matter too, like fewer entries (4K or 1K) but
above values for are still better compared to n=31.

Thanks,

- KK

#include <stdio.h>
#include <string.h>

#define NUM_ENTRIES 1024

static inline unsigned int
string_hash_n(const unsigned char *name, unsigned int len, int n)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = n * hash + name[i];
        return hash;
}

int best_n = -1, best_n_val = -1;

void print_dist(unsigned int *dist, int n)
{
        unsigned int i;
        int perfect = 0;


printf("-----------------------------------------------------------\n");
        printf("n=%d[] = {\n", n);
        for (i = 0; i < 256; i++) {
                if (dist[i] == NUM_ENTRIES/256)
                        perfect++;
                printf("%d, ", dist[i]);
                if ((i & 15) == 15) putchar('\n');
        }
        if (perfect > best_n_val) {
                best_n_val = perfect;
                best_n = n;
        }
        printf("}; (PERFECT = %d (n=%d))\n", perfect, n);
}

int main(int argc, char *argv[])
{
        int n, i;
        char *str, name[16];
        unsigned int hash, hashdist[256];

        if (argc == 1)
                str="eth";
        else
                str=argv[1];

        for (n = 2; n < 63; n++) {
                memset(hashdist, 0, 256*sizeof(int));
                for (i = 0; i < NUM_ENTRIES; i++) {
                        unsigned int len = sprintf(name, "%s%d", str, i);

                        hash = string_hash_n(name, len, n);
                        hashdist[hash & 255]++;
                }
                print_dist(hashdist, n);
        }
        printf("Best N was %d, and perfect entries: %d\n",
                best_n, best_n_val);
        return 0;
}


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26 13:07         ` Krishna Kumar2
@ 2009-10-26 14:31           ` Octavian Purdila
  2009-10-26 14:55             ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Octavian Purdila @ 2009-10-26 14:31 UTC (permalink / raw)
  To: Krishna Kumar2; +Cc: Eric Dumazet, Hagen Paul Pfeifer, netdev

On Monday 26 October 2009 15:07:31 you wrote:
> Eric Dumazet wrote on 10/26/2009 10:58:34 AM:
> > In fact, new10 should be the 'perfect' hash for the "eth%d"
> > netdev use, not jhash (way more expensive in cpu cycles BTW)
> >
> > Most linux machines in the world have less than 10 interfaces, jhash
> > would be really overkill.
> >
> >
> > Thanks
> >
> >
> > [PATCH net-next-2.6] netdev: better dev_name_hash
> 
> Changing Eric's test program to pass a multiplier to string_hash()
> and calling string_hash with multipler=<2 - 63> confirms that 10
> is almost always the best number for varying netdev names. I print
> the number of times perfect 64 was scored, and for most passed
> device names, the best is for n=10, followed by n=5 and others.
> Almost the worst was n=31, slightly better was n=17.
> 
> But other variables matter too, like fewer entries (4K or 1K) but
> above values for are still better compared to n=31.
> 

Hmm, I've found out that it very much depends on the name as well:

0 - orig, 1 - jhash, 2 - new10, 3 - new17, 4 - new31

$ ./dev_name_hash ixint 16000 0 16
score 24741
$ ./dev_name_hash ixint 16000 1 16
score 17949
$ ./dev_name_hash ixint 16000 2 16
score 16000
$ ./dev_name_hash ixint 16000 3 16
score 16715
$ ./dev_name_hash ixint 16000 4 16
score 18125


$ ./dev_name_hash ixunc 16000 0 16
score 24741
$ ./dev_name_hash ixunc 16000 1 16
score 17904
$ ./dev_name_hash ixunc 16000 2 16
score 22180
$ ./dev_name_hash ixunc 16000 3 16
score 17065
$ ./dev_name_hash ixunc 16000 4 16
score 18038

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26 14:31           ` Octavian Purdila
@ 2009-10-26 14:55             ` Eric Dumazet
  2009-10-26 15:52               ` Octavian Purdila
  2009-10-27  1:24               ` David Miller
  0 siblings, 2 replies; 35+ messages in thread
From: Eric Dumazet @ 2009-10-26 14:55 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: Krishna Kumar2, Hagen Paul Pfeifer, netdev

Octavian Purdila a écrit :
> On Monday 26 October 2009 15:07:31 you wrote:
>> Eric Dumazet wrote on 10/26/2009 10:58:34 AM:
>>> In fact, new10 should be the 'perfect' hash for the "eth%d"
>>> netdev use, not jhash (way more expensive in cpu cycles BTW)
>>>
>>> Most linux machines in the world have less than 10 interfaces, jhash
>>> would be really overkill.
>>>
>>>
>>> Thanks
>>>
>>>
>>> [PATCH net-next-2.6] netdev: better dev_name_hash
>> Changing Eric's test program to pass a multiplier to string_hash()
>> and calling string_hash with multipler=<2 - 63> confirms that 10
>> is almost always the best number for varying netdev names. I print
>> the number of times perfect 64 was scored, and for most passed
>> device names, the best is for n=10, followed by n=5 and others.
>> Almost the worst was n=31, slightly better was n=17.
>>
>> But other variables matter too, like fewer entries (4K or 1K) but
>> above values for are still better compared to n=31.
>>
> 
> Hmm, I've found out that it very much depends on the name as well:
> 
> 0 - orig, 1 - jhash, 2 - new10, 3 - new17, 4 - new31
> 
> $ ./dev_name_hash ixint 16000 0 16
> score 24741
> $ ./dev_name_hash ixint 16000 1 16
> score 17949
> $ ./dev_name_hash ixint 16000 2 16
> score 16000
> $ ./dev_name_hash ixint 16000 3 16
> score 16715
> $ ./dev_name_hash ixint 16000 4 16
> score 18125
> 
> 
> $ ./dev_name_hash ixunc 16000 0 16
> score 24741
> $ ./dev_name_hash ixunc 16000 1 16
> score 17904
> $ ./dev_name_hash ixunc 16000 2 16
> score 22180
> $ ./dev_name_hash ixunc 16000 3 16
> score 17065

> $ ./dev_name_hash ixunc 16000 4 16
> score 18038

This is because you chose a 65536 slots hash table, to store 16000 elements

The ideal function should be :

$ ./dev_name_hash ixunc 16000 5 16
score 16000

unsigned int dev_name_hash_new10bis(const char *name)
{
	unsigned hash = 0;
	int len = strnlen(name, IFNAMSIZ);
	int i;

	for (i = 0; i < len; ++i)
		hash = 10 * hash + (name[i] - '0');
	return hash;
}

But should we really care ?



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26 14:55             ` Eric Dumazet
@ 2009-10-26 15:52               ` Octavian Purdila
  2009-10-26 16:55                 ` Stephen Hemminger
  2009-10-27  1:24               ` David Miller
  1 sibling, 1 reply; 35+ messages in thread
From: Octavian Purdila @ 2009-10-26 15:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Krishna Kumar2, Hagen Paul Pfeifer, netdev

On Monday 26 October 2009 16:55:10 you wrote:
> 
> This is because you chose a 65536 slots hash table, to store 16000 elements
> 
> The ideal function should be :
> 
> $ ./dev_name_hash ixunc 16000 5 16
> score 16000
> 
> unsigned int dev_name_hash_new10bis(const char *name)
> {
> 	unsigned hash = 0;
> 	int len = strnlen(name, IFNAMSIZ);
> 	int i;
> 
> 	for (i = 0; i < len; ++i)
> 		hash = 10 * hash + (name[i] - '0');
> 	return hash;
> }
> 

Eric, thanks a lot for your help. This is turning into a very instructive 
thread for me :)

10bis performs better for ixunc but interestingly performs worse for ixint 
now. I also get mixed results for the two when using other names like ppp or 
gtp.

2 - new10, 3 - new10bis

score 49852
$ ./dev_name_hash ixint 32000 3 14
score 53194
$ ./dev_name_hash ixunc 32000 2 14
score 55232
$ ./dev_name_hash ixunc 32000 3 14
score 48168

> But should we really care ?

I'm just testing various common cases we use here ({ixint,ixunc,gtp,ppp,gre} 
{1000,16000,32000,128000} {14,16}). 

Ideally we want a hash function that performs better in all cases  - but I 
understand that it may not be possible.

I will play more with it and see if I can come up with something better, but 
in any case the new{10,10bis,17,31} performs much better than full_name_hash 
and most of the time better that jhash .

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26 15:52               ` Octavian Purdila
@ 2009-10-26 16:55                 ` Stephen Hemminger
  2009-10-26 17:45                   ` Stephen Hemminger
  0 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-26 16:55 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: Eric Dumazet, Krishna Kumar2, Hagen Paul Pfeifer, netdev

Added more algorithms to test...

Time is in seconds for 10000000 entries with hashbits = 8
Ratio is number of probes / ideal hash probes

Result sorted by distribution:

Algorithm             Time       Ratio       Max   StdDev
string10             1.434087       1.00     39064   0.01
SuperFastHash        1.469511       1.00     40497   2.17
string_hash17        1.472544       1.00     39497   1.50
jhash_string         1.501508       1.00     39669   1.04
crc                  2.826795       1.00     39088   0.07
md5_string           3.608253       1.00     39605   0.98
djb2                 1.462722       1.15     60681  76.16
string_hash31        1.457253       1.21     64950  91.12
sdbm                 1.566174       2.38    129900 232.22
pjw                  1.527306       2.45     99990 237.86
elf                  1.576096       2.45     99990 237.86
kr_hash              1.400072       7.80    468451 515.52
fletcher             1.449671       7.80    468451 515.52
full_name_hash       1.487707      13.09    562501 687.24
xor                  1.400403      13.36    583189 694.98
lastchar             1.348798      25.60   1000000 980.27

Another run sorted by speed:
Algorithm             Time       Ratio       Max   StdDev
lastchar             1.338545      25.60   1000000 980.27
kr_hash              1.398453       7.80    468451 515.52
xor                  1.398843      13.36    583189 694.98
string10             1.432756       1.00     39064   0.01
fletcher             1.448499       7.80    468451 515.52
string_hash31        1.457524       1.21     64950  91.12
string_hash17        1.462548       1.00     39497   1.50
djb2                 1.462956       1.15     60681  76.16
SuperFastHash        1.469907       1.00     40497   2.17
full_name_hash       1.486465      13.09    562501 687.24
jhash_string         1.500959       1.00     39669   1.04
pjw                  1.526097       2.45     99990 237.86
sdbm                 1.566533       2.38    129900 232.22
elf                  1.576470       2.45     99990 237.86
crc                  2.811210       1.00     39088   0.07
md5_string           3.604675       1.00     39605   0.98


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26 16:55                 ` Stephen Hemminger
@ 2009-10-26 17:45                   ` Stephen Hemminger
  0 siblings, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-26 17:45 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Octavian Purdila, Eric Dumazet, Krishna Kumar2,
	Hagen Paul Pfeifer, netdev

Another algorithm that scores well in my tests.

http://isthe.com/chongo/tech/comp/fnv/

Algorithm             Time       Ratio       Max   StdDev
string10             1.433267       1.00     39064   0.01
string_hash17        1.461422       1.00     39497   1.50
fnv1a                1.472216       1.00     39895   2.25
jhash_string         1.482494       1.00     39669   1.04


static unsigned int fnv32(const unsigned char *key, unsigned int len)
{
	uint32_t hval = 2166136261;
	unsigned int i;

	for (i = 0; i < len; i++) {
		hval ^= key[i];
		/* optimized multiply by 0x01000193 */
		hval += (hval<<1) + (hval<<4) + (hval<<7)
			+ (hval<<8) + (hval<<24);
	}

	return hval;
}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] dcache: better name hash function
  2009-10-26  4:43 ` Stephen Hemminger
@ 2009-10-26 22:36   ` Stephen Hemminger <shemminger@vyatta.com>, Al Viro
  2009-10-27  2:45     ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger <shemminger@vyatta.com>, Al Viro @ 2009-10-26 22:36 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds; +Cc: Octavian Purdila, netdev, linux-kernel

Some experiments by Octavian with large numbers of network devices identified
that name_hash does not evenly distribute values causing performance
penalties.  The name hashing function is used by dcache et. all
so let's just choose a better one.

Additional standalone tests for 10,000,000 consecutive names
using lots of different algorithms shows fnv as the winner.
It is faster and has almost ideal dispersion. 
string10 is slightly faster, but only works for names like ppp0, ppp1,...

Algorithm             Time       Ratio       Max   StdDev
string10             0.238201       1.00      2444   0.02
fnv32                0.240595       1.00      2576   1.05
fnv64                0.241224       1.00      2556   0.69
SuperFastHash        0.272872       1.00      2871   2.15
string_hash17        0.295160       1.00      2484   0.40
jhash_string         0.300925       1.00      2606   1.00
crc                  1.606741       1.00      2474   0.29
md5_string           2.424771       1.00      2644   0.99
djb2                 0.275424       1.15      3821  19.04
string_hash31        0.264806       1.21      4097  22.78
sdbm                 0.371136       2.87     13016  67.54
elf                  0.371279       3.59      9990  79.50
pjw                  0.401172       3.59      9990  79.50
full_name_hash       0.285851      13.09     35174 171.81
kr_hash              0.245068     124.84    468448 549.89
fletcher             0.267664     124.84    468448 549.89
adler32              0.640668     124.84    468448 549.89
xor                  0.220545     213.82    583189 720.85
lastchar             0.194604     409.57   1000000 998.78

Time is seconds.
Ratio is how many probes required to lookup all values versus
  an ideal hash.
Max is longest chain

Reported-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/include/linux/dcache.h	2009-10-26 14:58:45.220347300 -0700
+++ b/include/linux/dcache.h	2009-10-26 15:12:15.004160122 -0700
@@ -45,15 +45,28 @@ struct dentry_stat_t {
 };
 extern struct dentry_stat_t dentry_stat;
 
-/* Name hashing routines. Initial hash value */
-/* Hash courtesy of the R5 hash in reiserfs modulo sign bits */
-#define init_name_hash()		0
+/*
+ * Fowler / Noll / Vo (FNV) Hash
+ * see: http://www.isthe.com/chongo/tech/comp/fnv/
+ */
+#ifdef CONFIG_64BIT
+#define FNV_PRIME  1099511628211ull
+#define FNV1_INIT  14695981039346656037ull
+#else
+#define FNV_PRIME  16777619u
+#define FNV1_INIT  2166136261u
+#endif
+
+#define init_name_hash()	FNV1_INIT
 
-/* partial hash update function. Assume roughly 4 bits per character */
+/* partial hash update function. */
 static inline unsigned long
-partial_name_hash(unsigned long c, unsigned long prevhash)
+partial_name_hash(unsigned char c, unsigned long prevhash)
 {
-	return (prevhash + (c << 4) + (c >> 4)) * 11;
+	prevhash ^= c;
+	prevhash *= FNV_PRIME;
+
+	return prevhash;
 }
 
 /*

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-26 14:55             ` Eric Dumazet
  2009-10-26 15:52               ` Octavian Purdila
@ 2009-10-27  1:24               ` David Miller
  2009-10-27  1:40                 ` Eric Dumazet
  1 sibling, 1 reply; 35+ messages in thread
From: David Miller @ 2009-10-27  1:24 UTC (permalink / raw)
  To: eric.dumazet; +Cc: opurdila, krkumar2, hagen, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 26 Oct 2009 15:55:10 +0100

> But should we really care ?

The only thing I see consistently in this thread is that
jhash performs consistently well and without any tweaking.

And without any assumptions about the characteristics of
the device names.  I've seen everything from the traditional
"eth%d" to things like "davem_is_a_prick%d" so you really cannot
optimize for anything in particular.

Jenkins is ~50 cycles per round of 4 bytes last time I checked, give
or take, and that was on crappy sparc. :-) So the execution cost is
really not that bad, contrary to what I've seen claimed as an argument
against using jhash here.

And if I-cache footprint is really an issue, we can have one
out-of-line expansion of jhash somewhere under lib/ since we use jhash
in so many places these days.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH next-next-2.6] netdev: better dev_name_hash
  2009-10-27  1:24               ` David Miller
@ 2009-10-27  1:40                 ` Eric Dumazet
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Dumazet @ 2009-10-27  1:40 UTC (permalink / raw)
  To: David Miller; +Cc: opurdila, krkumar2, hagen, netdev

David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 26 Oct 2009 15:55:10 +0100
> 
>> But should we really care ?
> 
> The only thing I see consistently in this thread is that
> jhash performs consistently well and without any tweaking.
> 
> And without any assumptions about the characteristics of
> the device names.  I've seen everything from the traditional
> "eth%d" to things like "davem_is_a_prick%d" so you really cannot
> optimize for anything in particular.
> 
> Jenkins is ~50 cycles per round of 4 bytes last time I checked, give
> or take, and that was on crappy sparc. :-) So the execution cost is
> really not that bad, contrary to what I've seen claimed as an argument
> against using jhash here.
> 
> And if I-cache footprint is really an issue, we can have one
> out-of-line expansion of jhash somewhere under lib/ since we use jhash
> in so many places these days.

Well, since Stephen posted a generic patch on lkml, I suspect we'll take
the dcache hash anyway ?

But yes, last time I checked, jhash was pretty big, so an out-of-line
version is welcome :)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-26 22:36   ` [PATCH] dcache: better name hash function Stephen Hemminger <shemminger@vyatta.com>, Al Viro
@ 2009-10-27  2:45     ` Eric Dumazet
  2009-10-27  3:53       ` Stephen Hemminger
  2009-10-27 16:38       ` Rick Jones
  0 siblings, 2 replies; 35+ messages in thread
From: Eric Dumazet @ 2009-10-27  2:45 UTC (permalink / raw)
  To: Stephen Hemminger, Al Viro
  Cc: Andrew Morton, Linus Torvalds, Octavian Purdila, netdev,
	linux-kernel

Stephen Hemminger <shemminger@vyatta.com>, Al Viro a écrit :
> --- a/include/linux/dcache.h	2009-10-26 14:58:45.220347300 -0700
> +++ b/include/linux/dcache.h	2009-10-26 15:12:15.004160122 -0700
> @@ -45,15 +45,28 @@ struct dentry_stat_t {
>  };
>  extern struct dentry_stat_t dentry_stat;
>  
> -/* Name hashing routines. Initial hash value */
> -/* Hash courtesy of the R5 hash in reiserfs modulo sign bits */
> -#define init_name_hash()		0
> +/*
> + * Fowler / Noll / Vo (FNV) Hash
> + * see: http://www.isthe.com/chongo/tech/comp/fnv/
> + */
> +#ifdef CONFIG_64BIT
> +#define FNV_PRIME  1099511628211ull
> +#define FNV1_INIT  14695981039346656037ull
> +#else
> +#define FNV_PRIME  16777619u
> +#define FNV1_INIT  2166136261u
> +#endif
> +
> +#define init_name_hash()	FNV1_INIT
>  
> -/* partial hash update function. Assume roughly 4 bits per character */
> +/* partial hash update function. */
>  static inline unsigned long
> -partial_name_hash(unsigned long c, unsigned long prevhash)
> +partial_name_hash(unsigned char c, unsigned long prevhash)
>  {
> -	return (prevhash + (c << 4) + (c >> 4)) * 11;
> +	prevhash ^= c;
> +	prevhash *= FNV_PRIME;
> +
> +	return prevhash;
>  }
>  
>  /*

OK, but thats strlen(name) X (long,long) multiplies.

I suspect you tested on recent x86_64 cpu.

Some arches might have slow multiplies, no ?

jhash() (and others) are optimized by compiler to use basic and fast operations.
jhash operates on blocs of 12 chars per round, so it might be a pretty good choice once
out-of-line (because its pretty large and full_name_hash() is now used by
a lot of call sites)

Please provide your test program source, so that other can test on various arches.

Thanks

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  2:45     ` Eric Dumazet
@ 2009-10-27  3:53       ` Stephen Hemminger
  2009-10-27 16:38       ` Rick Jones
  1 sibling, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-27  3:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Al Viro, Andrew Morton, Linus Torvalds, Octavian Purdila, netdev,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2064 bytes --]

On Tue, 27 Oct 2009 03:45:50 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Stephen Hemminger <shemminger@vyatta.com>, Al Viro a écrit :
> > --- a/include/linux/dcache.h	2009-10-26 14:58:45.220347300 -0700
> > +++ b/include/linux/dcache.h	2009-10-26 15:12:15.004160122 -0700
> > @@ -45,15 +45,28 @@ struct dentry_stat_t {
> >  };
> >  extern struct dentry_stat_t dentry_stat;
> >  
> > -/* Name hashing routines. Initial hash value */
> > -/* Hash courtesy of the R5 hash in reiserfs modulo sign bits */
> > -#define init_name_hash()		0
> > +/*
> > + * Fowler / Noll / Vo (FNV) Hash
> > + * see: http://www.isthe.com/chongo/tech/comp/fnv/
> > + */
> > +#ifdef CONFIG_64BIT
> > +#define FNV_PRIME  1099511628211ull
> > +#define FNV1_INIT  14695981039346656037ull
> > +#else
> > +#define FNV_PRIME  16777619u
> > +#define FNV1_INIT  2166136261u
> > +#endif
> > +
> > +#define init_name_hash()	FNV1_INIT
> >  
> > -/* partial hash update function. Assume roughly 4 bits per character */
> > +/* partial hash update function. */
> >  static inline unsigned long
> > -partial_name_hash(unsigned long c, unsigned long prevhash)
> > +partial_name_hash(unsigned char c, unsigned long prevhash)
> >  {
> > -	return (prevhash + (c << 4) + (c >> 4)) * 11;
> > +	prevhash ^= c;
> > +	prevhash *= FNV_PRIME;
> > +
> > +	return prevhash;
> >  }
> >  
> >  /*
> 
> OK, but thats strlen(name) X (long,long) multiplies.
> 
> I suspect you tested on recent x86_64 cpu.
> 
> Some arches might have slow multiplies, no ?
> 
> jhash() (and others) are optimized by compiler to use basic and fast operations.
> jhash operates on blocs of 12 chars per round, so it might be a pretty good choice once
> out-of-line (because its pretty large and full_name_hash() is now used by
> a lot of call sites)
> 
> Please provide your test program source, so that other can test on various arches.
> 
> Thanks

long on i386 is 32 bits so it is 32 bit multiply.  There is also an optimization
that uses shift and adds.




-- 

[-- Attachment #2: hashtest.tar.bz2 --]
[-- Type: application/x-bzip, Size: 7585 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
       [not found] <9986527.24561256620662709.JavaMail.root@tahiti.vyatta.com>
@ 2009-10-27  5:19 ` Stephen Hemminger
  2009-10-27  5:24   ` David Miller
  2009-10-27  6:07   ` Eric Dumazet
  0 siblings, 2 replies; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-27  5:19 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrew Morton, Linus Torvalds, Octavian Purdila, netdev,
	linux-kernel, Al Viro

One of the root causes of slowness in network usage
was my original choice of power of 2 for hash size, to avoid
a mod operation. It turns out if size is not a power of 2
the original algorithm works fairly well.

On slow cpu; with 10million entries and 211 hash size

Algorithm             Time       Ratio       Max   StdDev
string10             1.271871       1.00     47397   0.01
djb2                 1.406322       1.00     47452   0.12
SuperFastHash        1.422348       1.00     48400   1.99
string_hash31        1.424079       1.00     47437   0.08
jhash_string         1.459232       1.00     47954   1.01
sdbm                 1.499209       1.00     47499   0.22
fnv32                1.539341       1.00     47728   0.75
full_name_hash       1.556792       1.00     47412   0.04
string_hash17        1.719039       1.00     47413   0.05
pjw                  1.827365       1.00     47441   0.09
elf                  2.033545       1.00     47441   0.09
fnv64                2.199533       1.00     47666   0.53
crc                  5.705784       1.00     47913   0.95
md5_string           10.308376       1.00     47946   1.00
fletcher             1.418866       1.01     53189  18.65
adler32              2.842117       1.01     53255  18.79
kr_hash              1.175678       6.43    468517 507.44
xor                  1.114692      11.02    583189 688.96
lastchar             0.795316      21.10   1000000 976.02

How important is saving the one division, versus getting better
distribution.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  5:19 ` Stephen Hemminger
@ 2009-10-27  5:24   ` David Miller
  2009-10-27  6:07   ` Eric Dumazet
  1 sibling, 0 replies; 35+ messages in thread
From: David Miller @ 2009-10-27  5:24 UTC (permalink / raw)
  To: stephen.hemminger
  Cc: eric.dumazet, akpm, torvalds, opurdila, netdev, linux-kernel,
	viro

From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Mon, 26 Oct 2009 22:19:44 -0700 (PDT)

> How important is saving the one division, versus getting better
> distribution.

80 cpu cycles or more on some processors.  Cheaper to use
jenkins with a power-of-2 sized hash.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  5:19 ` Stephen Hemminger
  2009-10-27  5:24   ` David Miller
@ 2009-10-27  6:07   ` Eric Dumazet
  2009-10-27  6:50     ` Eric Dumazet
  1 sibling, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2009-10-27  6:07 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Andrew Morton, Linus Torvalds, Octavian Purdila, netdev,
	linux-kernel, Al Viro

Stephen Hemminger a écrit :
> One of the root causes of slowness in network usage
> was my original choice of power of 2 for hash size, to avoid
> a mod operation. It turns out if size is not a power of 2
> the original algorithm works fairly well.

Interesting, but I suspect all users have power of 2 tables :(

> 
> On slow cpu; with 10million entries and 211 hash size
> 
>
> 
> How important is saving the one division, versus getting better
> distribution.


unsigned int fold1(unsigned hash)
{
	return hash % 211;
}

Compiler uses a reciprocal divide because of 211 being a constant.

And you also could try following that contains one multiply only,
and check if hash distribution properties are still OK

unsigned int fold2(unsigned hash)
{
	return ((unsigned long long)hash * 211) >> 32;
}

fold1:
        movl    4(%esp), %ecx
        movl    $-1689489505, %edx
        movl    %ecx, %eax
        mull    %edx
        shrl    $7, %edx
        imull   $211, %edx, %edx
        subl    %edx, %ecx
        movl    %ecx, %eax
        ret

fold2:
        movl    $211, %eax
        mull    4(%esp)
        movl    %edx, %eax
        ret



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  6:07   ` Eric Dumazet
@ 2009-10-27  6:50     ` Eric Dumazet
  2009-10-27  7:29       ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2009-10-27  6:50 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Andrew Morton, Linus Torvalds, Octavian Purdila, netdev,
	linux-kernel, Al Viro

Eric Dumazet a écrit :
> unsigned int fold2(unsigned hash)
> {
> 	return ((unsigned long long)hash * 211) >> 32;
> }
> 

I tried this reciprocal thing with 511 and 1023 values and got on a PIII 550 MHz, gcc-3.3.2 :

# ./hashtest 100000 511 
jhash_string         0.033123       1.01       234   1.06
fnv32                0.033911       1.02       254   1.38
# ./hashtest 1000000 511
jhash_string         0.331155       1.00      2109   1.10
fnv32                0.359346       1.00      2151   1.65
# ./hashtest 10000000 511
jhash_string         3.383340       1.00     19985   1.03
fnv32                3.849359       1.00     20198   1.53

# ./hashtest 100000 1023
jhash_string         0.033123       1.03       134   1.01
fnv32                0.034260       1.03       142   1.32
# ./hashtest 1000000 1023
jhash_string         0.332329       1.00      1075   1.06
fnv32                0.422035       1.00      1121   1.59
# ./hashtest 10000000 1023
jhash_string         3.417559       1.00     10107   1.01
fnv32                3.747563       1.00     10223   1.35


511 value on 64bit, and 1023 on 32bit arches are nice because
hashsz * sizeof(pointer) <= 4096, wasting space for one pointer only.

Conclusion : jhash and 511/1023 hashsize for netdevices,
no divides, only one multiply for the fold.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  6:50     ` Eric Dumazet
@ 2009-10-27  7:29       ` Eric Dumazet
  2009-10-27 17:07         ` Stephen Hemminger
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2009-10-27  7:29 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Andrew Morton, Linus Torvalds, Octavian Purdila, netdev,
	linux-kernel, Al Viro

Eric Dumazet a écrit :
> 
> 
> 511 value on 64bit, and 1023 on 32bit arches are nice because
> hashsz * sizeof(pointer) <= 4096, wasting space for one pointer only.
> 
> Conclusion : jhash and 511/1023 hashsize for netdevices,
> no divides, only one multiply for the fold.

Just forget about 511 & 1023, as power of two works too.

-> 512 & 1024 + jhash

Guess what, David already said this :)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  2:45     ` Eric Dumazet
  2009-10-27  3:53       ` Stephen Hemminger
@ 2009-10-27 16:38       ` Rick Jones
  1 sibling, 0 replies; 35+ messages in thread
From: Rick Jones @ 2009-10-27 16:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Stephen Hemminger, Al Viro, Andrew Morton, Linus Torvalds,
	Octavian Purdila, netdev, linux-kernel

Previously Stephen kindly sent me the source and instructions, and attached are 
results from 1.0 GHz Itanium "McKinley" processors using an older gcc, both -O2 
and -O3, -O2 first:

 >>>
 >>>
 >>> $ ./hashtest 10000000 14 | sort -n -k 3 -k 2
 >>> Algorithm             Time       Ratio       Max   StdDev
 >>> string10             0.234133       1.00       612   0.03
 >>> fnv32                0.241471       1.00       689   0.93
 >>> fnv64                0.241964       1.00       680   0.85
 >>> string_hash17        0.269656       1.00       645   0.36
 >>> jhash_string         0.295795       1.00       702   1.00
 >>> crc                  1.609449       1.00       634   0.41
 >>> md5_string           2.479467       1.00       720   0.99
 >>> SuperFastHash        0.273793       1.01       900   2.13
 >>> djb2                 0.265877       1.15       964   9.52
 >>> string_hash31        0.259110       1.21      1039  11.39
 >>> sdbm                 0.369414       2.87      3268  33.77
 >>> elf                  0.372251       3.71      2907  40.71
 >>> pjw                  0.401732       3.71      2907  40.71
 >>> full_name_hash       0.283508      13.09      8796  85.91
 >>> kr_hash              0.220033     499.17    468448 551.55
 >>> fletcher             0.267009     499.17    468448 551.55
 >>> adler32              0.635047     499.17    468448 551.55
 >>> xor                  0.220314     854.94    583189 722.12
 >>> lastchar             0.155236    1637.61   1000000 999.69
 >>
 >>
 >> here then are both, from a 1.0 GHz McKinley system, 64-bit, using an older
 >> gcc
 >>
 >> raj@oslowest:~/hashtest$ ./hashtest 10000000 14 | sort -n -k 3 -k 2
 >> Algorithm             Time       Ratio       Max   StdDev
 >> string_hash17        0.901319       1.00       645   0.36
 >> string10             0.986391       1.00       612   0.03
 >> jhash_string         1.422065       1.00       702   1.00
 >> fnv32                1.705116       1.00       689   0.93
 >> fnv64                1.900326       1.00       680   0.85
 >> crc                  3.651519       1.00       634   0.41
 >> md5_string           14.155621       1.00       720   0.99
 >> SuperFastHash        1.185206       1.01       900   2.13
 >> djb2                 0.977166       1.15       964   9.52
 >> string_hash31        0.989804       1.21      1039  11.39
 >> sdbm                 1.188299       2.87      3268  33.77
 >> pjw                  1.185963       3.71      2907  40.71
 >> elf                  1.257023       3.71      2907  40.71
 >> full_name_hash       1.231514      13.09      8796  85.91
 >> kr_hash              0.890761     499.17    468448 551.55
 >> fletcher             1.080981     499.17    468448 551.55
 >> adler32              4.141714     499.17    468448 551.55
 >> xor                  1.061445     854.94    583189 722.12
 >> lastchar             0.676697    1637.61   1000000 999.69
 >>
 >> raj@oslowest:~/hashtest$ ./hashtest 10000000 8 | sort -n -k 3 -k 2
 >> Algorithm             Time       Ratio       Max   StdDev
 >> string_hash17        0.899988       1.00     39497   1.50
 >> string10             0.985100       1.00     39064   0.01
 >> SuperFastHash        1.141748       1.00     40497   2.17
 >> jhash_string         1.376414       1.00     39669   1.04
 >> fnv32                1.656967       1.00     39895   2.25
 >> fnv64                1.855259       1.00     39215   0.35
 >> crc                  3.615341       1.00     39088   0.07
 >> md5_string           14.113307       1.00     39605   0.98
 >> djb2                 0.972180       1.15     60681  76.16
 >> string_hash31        0.982233       1.21     64950  91.12
 >> sdbm                 1.181952       2.38    129900 232.22
 >> pjw                  1.178994       2.45     99990 237.86
 >> elf                  1.250936       2.45     99990 237.86
 >> kr_hash              0.892633       7.80    468451 515.52
 >> fletcher             1.082932       7.80    468451 515.52
 >> adler32              4.142414       7.80    468451 515.52
 >> full_name_hash       1.175324      13.09    562501 687.24
 >> xor                  1.060091      13.36    583189 694.98
 >> lastchar             0.675610      25.60   1000000 980.27
 >>
 >> raj@oslowest:~/hashtest$ gcc -v
 >> Using built-in specs.
 >> Target: ia64-linux-gnu
 >> Configured with: ../src/configure -v 
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr 
--enable-shared --with-system-zlib --libexecdir=/usr/lib 
--without-included-gettext --enable-threads=posix --enable-nls 
--program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-mpfr --disable-libssp --with-system-libunwind 
--enable-checking=release ia64-linux-gnu
 >> Thread model: posix
 >> gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
 >> raj@oslowest:~/hashtest$
 >>
 >> fnv doesn't seem to do as well there relative to the others as it did in your
 >> tests.
 >
 >
 >
 > You could try -O3 since then gcc may replace the multiply with shift/add
 > or is there something about forcing 32 and 64 bit that makes ia64 suffer.


It seems to speed things up, but the relative ordering remains the same:

oslowest:/home/raj/hashtest# make
cc -O3 -Wall   -c -o hashtest.o hashtest.c
cc -O3 -Wall   -c -o md5.o md5.c
cc -lm  hashtest.o md5.o   -o hashtest
oslowest:/home/raj/hashtest# ./hashtest 10000000 14 | sort -n -k 3 -k 2
Algorithm             Time       Ratio       Max   StdDev
string_hash17        0.893813       1.00       645   0.36
string10             0.965596       1.00       612   0.03
jhash_string         1.387773       1.00       702   1.00
fnv32                1.699041       1.00       689   0.93
fnv64                1.882314       1.00       680   0.85
crc                  3.273676       1.00       634   0.41
md5_string           13.913745       1.00       720   0.99
SuperFastHash        1.135802       1.01       900   2.13
djb2                 0.951571       1.15       964   9.52
string_hash31        0.971081       1.21      1039  11.39
sdbm                 1.168148       2.87      3268  33.77
pjw                  1.159304       3.71      2907  40.71
elf                  1.237662       3.71      2907  40.71
full_name_hash       1.212588      13.09      8796  85.91
kr_hash              0.856584     499.17    468448 551.55
fletcher             1.054516     499.17    468448 551.55
adler32              4.123742     499.17    468448 551.55
xor                  1.031910     854.94    583189 722.12
lastchar             0.648597    1637.61   1000000 999.69
oslowest:/home/raj/hashtest# ./hashtest 10000000 8 | sort -n -k 3 -k 2
Algorithm             Time       Ratio       Max   StdDev
string_hash17        0.884829       1.00     39497   1.50
string10             0.962258       1.00     39064   0.01
SuperFastHash        1.088602       1.00     40497   2.17
jhash_string         1.340878       1.00     39669   1.04
fnv32                1.637096       1.00     39895   2.25
fnv64                1.842330       1.00     39215   0.35
crc                  3.230291       1.00     39088   0.07
md5_string           13.863056       1.00     39605   0.98
djb2                 0.944159       1.15     60681  76.16
string_hash31        0.961978       1.21     64950  91.12
sdbm                 1.159156       2.38    129900 232.22
pjw                  1.154286       2.45     99990 237.86
elf                  1.232842       2.45     99990 237.86
kr_hash              0.856873       7.80    468451 515.52
fletcher             1.055389       7.80    468451 515.52
adler32              4.123254       7.80    468451 515.52
full_name_hash       1.152628      13.09    562501 687.24
xor                  1.033050      13.36    583189 694.98
lastchar             0.647504      25.60   1000000 980.27


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27  7:29       ` Eric Dumazet
@ 2009-10-27 17:07         ` Stephen Hemminger
  2009-10-27 17:32           ` Linus Torvalds
       [not found]           ` <4AE72B91.7040700@gmail.com>
  0 siblings, 2 replies; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-27 17:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Stephen Hemminger, Andrew Morton, Linus Torvalds,
	Octavian Purdila, netdev, linux-kernel, Al Viro

On Tue, 27 Oct 2009 08:29:51 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Eric Dumazet a écrit :
> > 
> > 
> > 511 value on 64bit, and 1023 on 32bit arches are nice because
> > hashsz * sizeof(pointer) <= 4096, wasting space for one pointer only.
> > 
> > Conclusion : jhash and 511/1023 hashsize for netdevices,
> > no divides, only one multiply for the fold.
> 
> Just forget about 511 & 1023, as power of two works too.
> 
> -> 512 & 1024 + jhash
> 
> Guess what, David already said this :)


Rather than wasting space, or doing expensive, modulus; just folding
the higher bits back with XOR redistributes the bits better.


On fast machine (Nehalam):

100000000 Iterations
256 Slots (order 8)
Algorithm             Time       Ratio       Max   StdDev
string10             2.505290       1.00    390628   0.00
xor                  2.521329       1.00    392120   2.14
SuperFastHash        2.781745       1.00    397027   4.43
fnv32                2.847892       1.00    392139   0.98
djb2                 2.886342       1.00    390827   0.12
string_hash31        2.900980       1.00    391001   0.20
string_hash17        2.938708       1.00    391122   0.20
full_name_hash       3.080886       1.00    390860   0.10
jhash_string         3.092161       1.00    392775   1.08
fnv64                5.340740       1.00    392854   0.88
kr_hash              2.395757       7.30   4379091 1568.25

On slow machine (CULV):
100000000 Iterations
256 Slots (order 8)
Algorithm             Time       Ratio       Max   StdDev
string10             10.807174       1.00    390628   0.00
SuperFastHash        11.397303       1.00    397027   4.43
xor                  11.660968       1.00    392120   2.14
djb2                 11.674707       1.00    390827   0.12
jhash_string         11.997104       1.00    392775   1.08
fnv32                12.289086       1.00    392139   0.98
string_hash17        12.863864       1.00    391122   0.20
full_name_hash       13.249483       1.00    390860   0.10
string_hash31        13.668270       1.00    391001   0.20
fnv64                39.808964       1.00    392854   0.88
kr_hash              10.316305       7.30   4379091 1568.25

So Eric's string10 is fastest for special case of fooNNN style names.
But probably isn't best for general strings. Orignal function
is >20% slower, which is surprising probably because of overhead
of 2 shifts and multipy. jenkins and fnv are both 10% slower.

The following seems to give best results (combination of 16bit trick
and string17).


static unsigned int xor17(const unsigned char *key, unsigned int len)
{
    	uint32_t h = 0;
	unsigned int rem;

	rem = len & 1;
	len >>= 1;

	while (len--) {
		h = ((h << 4) + h) ^ get_unaligned16(key);
		key += sizeof(uint16_t);
	}

	if (rem)
		h = ((h << 4) + h) ^ *key;


    	return h;
}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27 17:07         ` Stephen Hemminger
@ 2009-10-27 17:32           ` Linus Torvalds
  2009-10-27 23:08             ` Stephen Hemminger
       [not found]           ` <4AE72B91.7040700@gmail.com>
  1 sibling, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2009-10-27 17:32 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Eric Dumazet, Stephen Hemminger, Andrew Morton, Octavian Purdila,
	netdev, linux-kernel, Al Viro

On Tue, 27 Oct 2009, Stephen Hemminger wrote:
> 
> Rather than wasting space, or doing expensive, modulus; just folding
> the higher bits back with XOR redistributes the bits better.

Please don't make up any new hash functions without having a better input 
set than the one you seem to use.

The 'fnv' function I can believe in, because the whole "multiply by big 
prime number" thing to spread out the bits is a very traditional model. 
But making up a new hash function based on essentially consecutive names 
is absolutely the wrong thing to do. You need a much better corpus of path 
component names for testing.

> The following seems to give best results (combination of 16bit trick
> and string17).

.. and these kinds of games are likely to work badly on some 
architectures. Don't use 16-bit values, and don't use 'get_unaligned()'. 
Both tend to work fine on x86, but likely suck on some other 
architectures.

Also remember that the critical hash function needs to check for '/' and 
'\0' while at it, which is one reason why it does things byte-at-a-time. 
If you try to be smart, you'd need to be smart about the end condition 
too.

The loop to optimize is _not_ based on 'name+len', it is this code:

                this.name = name;
                c = *(const unsigned char *)name;

                hash = init_name_hash();
                do {
                        name++;
                        hash = partial_name_hash(c, hash);
                        c = *(const unsigned char *)name;
                } while (c && (c != '/'));
                this.len = name - (const char *) this.name;
                this.hash = end_name_hash(hash);

(which depends on us having already removed all slashed at the head, and 
knowing that the string is not zero-sized)

So doing things multiple bytes at a time is certainly still possible, but 
you would always have to find the slashes/NUL's in there first. Doing that 
efficiently and portably is not trivial - especially since a lot of 
critical path components are short.

(Remember: there may be just a few 'bin' directory names, but if you do 
performance analysis, 'bin' as a path component is probably hashed a lot 
more than 'five_slutty_bimbos_and_a_donkey.jpg'. So the relative weighting 
of importance of the filename should probably include the frequency it 
shows up in pathname lookup)

		Linus

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
       [not found]           ` <4AE72B91.7040700@gmail.com>
@ 2009-10-27 17:35             ` Stephen Hemminger
  0 siblings, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-27 17:35 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: netdev, linux-kernel

On Tue, 27 Oct 2009 18:19:13 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Stephen Hemminger a écrit :
> 
> > So Eric's string10 is fastest for special case of fooNNN style names.
> > But probably isn't best for general strings. Orignal function
> > is >20% slower, which is surprising probably because of overhead
> > of 2 shifts and multipy. jenkins and fnv are both 10% slower.
> > 
> 
> 
> jhash() is faster when strings are longer, being able to process 12 bytes per loop.
> 

But jhash is not amenable to usage in namei (with partial_name_hash).

name_hash is rarely done on long strings, the average length of a filename
is fairly short (probably leftover Unix legacy). On my system, average
path component length in /usr is 13 characters; therefore jhash has
no big benefit here.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27 17:32           ` Linus Torvalds
@ 2009-10-27 23:08             ` Stephen Hemminger
  2009-10-27 23:41               ` Linus Torvalds
  0 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-27 23:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, Stephen Hemminger, Andrew Morton, Octavian Purdila,
	netdev, linux-kernel, Al Viro

On Tue, 27 Oct 2009 10:32:44 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 27 Oct 2009, Stephen Hemminger wrote:
> > 
> > Rather than wasting space, or doing expensive, modulus; just folding
> > the higher bits back with XOR redistributes the bits better.
> 
> Please don't make up any new hash functions without having a better input 
> set than the one you seem to use.
> 
> The 'fnv' function I can believe in, because the whole "multiply by big 
> prime number" thing to spread out the bits is a very traditional model. 
> But making up a new hash function based on essentially consecutive names 
> is absolutely the wrong thing to do. You need a much better corpus of path 
> component names for testing.
> 
> > The following seems to give best results (combination of 16bit trick
> > and string17).
> 
> .. and these kinds of games are likely to work badly on some 
> architectures. Don't use 16-bit values, and don't use 'get_unaligned()'. 
> Both tend to work fine on x86, but likely suck on some other 
> architectures.
> 
> Also remember that the critical hash function needs to check for '/' and 
> '\0' while at it, which is one reason why it does things byte-at-a-time. 
> If you try to be smart, you'd need to be smart about the end condition 
> too.
> 
> The loop to optimize is _not_ based on 'name+len', it is this code:
> 
>                 this.name = name;
>                 c = *(const unsigned char *)name;
> 
>                 hash = init_name_hash();
>                 do {
>                         name++;
>                         hash = partial_name_hash(c, hash);
>                         c = *(const unsigned char *)name;
>                 } while (c && (c != '/'));
>                 this.len = name - (const char *) this.name;
>                 this.hash = end_name_hash(hash);
> 
> (which depends on us having already removed all slashed at the head, and 
> knowing that the string is not zero-sized)
> 
> So doing things multiple bytes at a time is certainly still possible, but 
> you would always have to find the slashes/NUL's in there first. Doing that 
> efficiently and portably is not trivial - especially since a lot of 
> critical path components are short.
> 
> (Remember: there may be just a few 'bin' directory names, but if you do 
> performance analysis, 'bin' as a path component is probably hashed a lot 
> more than 'five_slutty_bimbos_and_a_donkey.jpg'. So the relative weighting 
> of importance of the filename should probably include the frequency it 
> shows up in pathname lookup)
> 
> 		Linus


Going back to basics. Run tests across different input sets.
Dropping off the slow ones like crc, md5, ...
Not using jhash because it doesn't have good character at a time
interface.

Also, the folding algorithm used matters. Since the kernel
already uses hash_long() to fold back to N bits, all the
tests were rerun with that.

Test run across names in /usr
Algorithm             Time       Ratio       Max   StdDev
kr_hash              2.481275     1.21      4363 358.98
string10             2.834562     1.15      4168 303.66
fnv32                2.887600     1.18      4317 332.38
string_hash31        3.655745     1.16      4258 314.33
string_hash17        3.816443     1.16      4177 311.28
djb2                 3.883914     1.18      4269 331.75
full_name_hash       4.067633     1.16      4282 312.29
pjw                  6.517316     1.17      4184 316.69
sdbm                 6.945385     1.17      4447 324.32
elf                  7.402180     1.17      4184 316.69


And in /home (mail directories and git)
Algorithm             Time       Ratio       Max   StdDev
kr_hash              2.765015     1.44      7175 701.99
string10             3.136947     1.19      7092 469.73
fnv32                3.162626     1.19      6986 458.48
string_hash31        3.832031     1.19      7053 463.29
string_hash17        4.136220     1.19      7023 469.30
djb2                 4.241706     1.23      7537 512.02
full_name_hash       4.437741     1.19      7000 467.19
pjw                  6.758093     1.20      6970 476.03
sdbm                 7.239758     1.22      7526 494.32
elf                  7.446356     1.20      6970 476.03


And with names like pppXXX
Algorithm             Time       Ratio       Max   StdDev
kr_hash              0.849656     9.26      5520 1121.79
fnv32                1.004682     1.01       453  23.29
string10             1.004729     1.00       395   2.08
string_hash31        1.108335     1.00       409   5.14
string_hash17        1.231257     1.00       410   8.10
djb2                 1.238314     1.01       435  29.88
full_name_hash       1.320822     1.00       422  11.07
elf                  1.994794     1.15       716 151.19
pjw                  2.063958     1.15       716 151.19
sdbm                 2.070033     1.00       408   8.11

* The new test has big table so more cache effects.
* Existing full_name_hash distributes okay if folded correctly.
* fnv32 and string10 are slightly faster

More data (on /usr) from older slower machines:

IA-64
Algorithm             Time       Ratio       Max   StdDev
kr_hash              1.676064     1.17       664  63.81
string_hash17        1.773553     1.12       616  54.40
djb2                 2.103359     1.12       598  54.71
string10             2.103959     1.13       698  56.80
string_hash31        2.108254     1.13       602  55.51
full_name_hash       3.237209     1.13       614  56.74
sdbm                 3.279243     1.12       611  54.78
pjw                  3.314135     1.13       639  56.74
elf                  3.821029     1.13       639  56.74
fnv32                5.619829     1.16       865  62.51

Slow ULV 1Ghz laptop:
Algorithm             Time       Ratio       Max   StdDev
kr_hash              5.754460     1.19      2017 194.64
string10             6.698358     1.15      1638 171.29
sdbm                 8.134431     1.15      1652 170.65
djb2                 8.231058     1.17      1659 184.44
string_hash31        8.447873     1.15      1633 172.13
fnv32                8.552569     1.18      2170 189.61
string_hash17        9.226992     1.16      1616 175.01
full_name_hash       10.555072     1.15      1703 170.45
pjw                  16.193485     1.17      1642 181.45
elf                  19.770414     1.17      1642 181.45

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27 23:08             ` Stephen Hemminger
@ 2009-10-27 23:41               ` Linus Torvalds
  2009-10-28  0:10                 ` Stephen Hemminger
  0 siblings, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2009-10-27 23:41 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Eric Dumazet, Stephen Hemminger, Andrew Morton, Octavian Purdila,
	netdev, linux-kernel, Al Viro

On Tue, 27 Oct 2009, Stephen Hemminger wrote:
> 
> Going back to basics. Run tests across different input sets.
> Dropping off the slow ones like crc, md5, ...
> Not using jhash because it doesn't have good character at a time
> interface.
> 
> Also, the folding algorithm used matters. Since the kernel
> already uses hash_long() to fold back to N bits, all the
> tests were rerun with that.

Yeah, the 'hash_long()' folding matters for anything that doesn't multiply 
big some big number to spread the bits out, because otherwise the bits 
from the last character hashed will always be in the low bits.

That explains why our current hash looked so bad with your previous code.

>From your numbers, I think we can dismiss 'kr_hash()' as having horrible 
behavior with names like pppXXX (and that isn't just a special case: it's 
also noticeably worse for your /home directory case, which means that the 
bad behavior shows up in practice too, not just in some special cases).

'elf' and 'pjw' don't have quite the same bad case, but the stddev for the 
pppXXX cases are still clearly worse than the other alternatives. They 
also seem to always be slower than what we already have. 

The 'fnv32' algorithm gets fairly good behavior, but seems bad on Itanium. 
Looks like it depends on a fast multiplication unit, and all even your 
"slow" ULV chip seems to be a Core2 one, so all your x86 targets have 
that. And our current name hash still actually seems to do better in all 
cases (maybe I missed some case) even if fnv32 is slightly faster on x86.

>From your list 'string10' seems to get consistently good results and is at 
or near the top of performance too. It seems to be the one that 
consistently beats 'full_name_hash()' both in performance and in behavior 
(string_hash17/31 come close, but aren't as clearly better performing).

But I haven't actually seen the hashes. Maybe there's something that makes 
string10 bad?

Regardless, one thing your new numbers do say is that our current hash 
really isn't that bad.

		Linus

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-27 23:41               ` Linus Torvalds
@ 2009-10-28  0:10                 ` Stephen Hemminger
  2009-10-28  0:58                   ` Linus Torvalds
  0 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-28  0:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, Stephen Hemminger, Andrew Morton, Octavian Purdila,
	netdev, linux-kernel, Al Viro

[-- Attachment #1: Type: text/plain, Size: 2345 bytes --]

On Tue, 27 Oct 2009 16:41:52 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 27 Oct 2009, Stephen Hemminger wrote:
> > 
> > Going back to basics. Run tests across different input sets.
> > Dropping off the slow ones like crc, md5, ...
> > Not using jhash because it doesn't have good character at a time
> > interface.
> > 
> > Also, the folding algorithm used matters. Since the kernel
> > already uses hash_long() to fold back to N bits, all the
> > tests were rerun with that.
> 
> Yeah, the 'hash_long()' folding matters for anything that doesn't multiply 
> big some big number to spread the bits out, because otherwise the bits 
> from the last character hashed will always be in the low bits.
> 
> That explains why our current hash looked so bad with your previous code.
> 
> From your numbers, I think we can dismiss 'kr_hash()' as having horrible 
> behavior with names like pppXXX (and that isn't just a special case: it's 
> also noticeably worse for your /home directory case, which means that the 
> bad behavior shows up in practice too, not just in some special cases).
> 
> 'elf' and 'pjw' don't have quite the same bad case, but the stddev for the 
> pppXXX cases are still clearly worse than the other alternatives. They 
> also seem to always be slower than what we already have. 
> 
> The 'fnv32' algorithm gets fairly good behavior, but seems bad on Itanium. 
> Looks like it depends on a fast multiplication unit, and all even your 
> "slow" ULV chip seems to be a Core2 one, so all your x86 targets have 
> that. And our current name hash still actually seems to do better in all 
> cases (maybe I missed some case) even if fnv32 is slightly faster on x86.
> 
> From your list 'string10' seems to get consistently good results and is at 
> or near the top of performance too. It seems to be the one that 
> consistently beats 'full_name_hash()' both in performance and in behavior 
> (string_hash17/31 come close, but aren't as clearly better performing).
> 
> But I haven't actually seen the hashes. Maybe there's something that makes 
> string10 bad?
> 
> Regardless, one thing your new numbers do say is that our current hash 
> really isn't that bad.
> 
> 		Linus

Agreed. Here is the reduced version of the program.
To run:
  find /home -printf '%f\n' 2>/dev/null | ./htest -n 100

[-- Attachment #2: htest.c --]
[-- Type: text/x-c++src, Size: 5219 bytes --]

#include <stdio.h>
#include <stdint.h>
#include <malloc.h>
#include <sys/types.h>
#include <sys/time.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <getopt.h>

#define init_name_hash()                0

/* partial hash update function. Assume roughly 4 bits per character */
static inline unsigned long
partial_name_hash(unsigned long c, unsigned long prevhash)
{
        return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/* Compute the hash for a name string. */
static unsigned int
full_name_hash(const unsigned char *name, unsigned int len)
{
        unsigned long hash = 0;
        while (len--)
                hash = partial_name_hash(*name++, hash);
        return hash;
}

static unsigned int
djb2(const unsigned char *str, unsigned int len)
{
	unsigned long hash = 5381;

        while (len--)
		hash = ((hash << 5) + hash) + *str++; /* hash * 33 + c */

        return hash;
}

static unsigned int
string_hash31(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 31 * hash + name[i];
        return hash;
}

static unsigned int
string_hash17(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 17 * hash + name[i];
        return hash;
}

static unsigned int 
string10(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
		hash = hash * 10 + name[i] - '0';

        return hash;
}

static const uint32_t FNV_32_PRIME = 16777619u;
static const uint32_t FNV1_32_INIT = 2166136261u;

static unsigned int fnv32(const unsigned char *key, unsigned int len)
{
	uint32_t hval = FNV1_32_INIT;
	unsigned int i;

	for (i = 0; i < len; i++) {
		hval ^= key[i];
#if defined(NO_FNV_GCC_OPTIMIZATION)
		hval *= FNV_32_PRIME;
#else
		hval += (hval<<1) + (hval<<4) + (hval<<7)
			+ (hval<<8) + (hval<<24);
#endif
	}

	return hval;
}

static unsigned order = 8;
static unsigned iterations = 10000;
static char **names;
static unsigned num_names;

static void read_names(void)
{
	char line[1024];
	unsigned int n = 0;
	
	/* Read input to create name set */
	while (fgets(line, sizeof line, stdin) != NULL) {
		names = realloc(names, (n + 1) * sizeof(char *));
		names[n] = malloc(strlen(line) + 1);
		strcpy(names[n], line);
		++n;
	}
	num_names = n;
}

/* 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1 */
#define GOLDEN_RATIO_PRIME_32 0x9e370001UL
/*  2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 */
#define GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001UL

/* Unrolled version of hash_long() */
static inline unsigned int fold(unsigned int val, unsigned int bits)
{
	if (sizeof(val) == 8) {
		uint64_t hash = val;
		/*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
		uint64_t n = hash;
		n <<= 18;
		hash -= n;
		n <<= 33;
		hash -= n;
		n <<= 3;
		hash += n;
		n <<= 3;
		hash -= n;
		n <<= 4;
		hash += n;
		n <<= 2;
		hash += n;

		/* High bits are more random, so use them. */
		return hash >> (64 - bits);
	} else {
		/* On some cpus multiply is faster, on others gcc will do shifts */
		uint32_t hash = val * GOLDEN_RATIO_PRIME_32;

		/* High bits are more random, so use them. */
		return hash >> (32 - bits);
	}
}

static void measure(const char *name,  
		    unsigned int (*hash)(const unsigned char *, unsigned int))
{
	unsigned int i;
	struct timeval t0, t1;
	unsigned int hashsz = 1 << order;
	unsigned int dist[hashsz];
	unsigned long long probes = 0;
	double t;
	unsigned long m = 0;
	const double avg = num_names / hashsz;
	double ideal = hashsz * (avg * (1 + avg)) / 2;
	double std = 0;

	memset(dist, 0, sizeof(unsigned int) * hashsz);

	gettimeofday(&t0, NULL);
	for (i = 0; i < num_names; i++) {
		unsigned char *name = (unsigned char *) names[i];
		size_t len = strlen(names[i]);
		unsigned int h = 0, j;

		for (j = 0; j < iterations; j++)
			h = hash(name, len);

		/* fold in extra bits */
		++dist[fold(h, order)];
	}
	gettimeofday(&t1, NULL);
	t = (double) (t1.tv_sec - t0.tv_sec);
	t += (double) (t1.tv_usec - t0.tv_usec) / 1000000.;

	for (i = 0; i < hashsz; i++) {
		unsigned int n = dist[i];
		double delta = (n - avg);

		if (n > m) m = n;	/* longest chain */
		
		std += delta * delta;	/* compute standard deviation */
		
		/* number of probes used when accessing 
		   is same as sum of arithmetic series */
		probes += ((uint64_t) n * (1 + n)) / 2;
	}


	printf("%-20s %f", name, t);
	printf(" %8.2f %9lu %6.2f\n",
	       (double) probes / ideal, m, sqrt(std / hashsz));

}
#define MEASURE(func)	measure(#func, &func)

int main(int argc, char **argv)
{
	int f;

	while ((f = getopt(argc, argv, "h:n:")) != -1)
		switch (f) {
		case 'n':
			iterations = strtoul(optarg, NULL, 0);
			break;

		case 'h':
			order = strtoul(optarg, NULL, 0);
			break;

		default:
			fprintf(stderr, 
				"Usage: %s -a -h hash -n testsize\n",
				argv[0]);
			return 1;
		}

	read_names();

	printf("Algorithm             Time       Ratio       Max   StdDev\n");

	MEASURE(full_name_hash);
	MEASURE(djb2);
	MEASURE(string10);
	MEASURE(string_hash17);
	MEASURE(string_hash31);

	MEASURE(fnv32);

        return 0;
}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-28  0:10                 ` Stephen Hemminger
@ 2009-10-28  0:58                   ` Linus Torvalds
  2009-10-28  1:56                     ` Stephen Hemminger
  0 siblings, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2009-10-28  0:58 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Eric Dumazet, Stephen Hemminger, Andrew Morton, Octavian Purdila,
	netdev, linux-kernel, Al Viro

On Tue, 27 Oct 2009, Stephen Hemminger wrote:
> 
> Agreed. Here is the reduced version of the program.
> To run:
>   find /home -printf '%f\n' 2>/dev/null | ./htest -n 100

The timings are very sensitive to random I$ layout at least on Nehalem. 
The reason seems to be that the inner loop is _so_ tight that just 
depending on exactly where the loop ends up, you can get subtle 
interactions with the loop cache.

Look here:

	[torvalds@nehalem ~]$ find /home -printf '%f\n' 2>/dev/null | ./htest -n 100
	Algorithm             Time       Ratio       Max   StdDev
	full_name_hash       1.141899     1.03      4868 263.37
	djb2                 0.980200     1.03      4835 266.05
	string10             0.909175     1.03      4850 262.67
	string10a            0.673915     1.03      4850 262.67
	string10b            0.909374     1.03      4850 262.67
	string_hash17        0.966050     1.03      4805 263.68
	string_hash31        1.008544     1.03      4807 259.37
	fnv32                0.774806     1.03      4817 259.17

what do you think the difference between 'string10', 'string10a' and 
'string10b' are?

None. None what-so-ever. The source code is identical, and gcc generates 
identical assembly language. Yet those timings are extremely stable for 
me, and 'string10b' is 25% faster than the identical string10 and 
string10a functions.

The only difference? 'string10a' starts aligned to just 16 bytes, but that 
in turn happens to mean that the tight inner loop ends up aligned on a 
128-byte boundary. And being cacheline aligned just there seems to matters 
for some subtle micro-architectural reason.

The reason I noticed this is that I wondered what small modifications to 
'string10' would do for performance, and noticed that even _without_ the 
small modifications, performance fluctuated.

Lesson? Microbenchmarks like this can be dangerous and misleading. That's 
_especially_ true if the loop ends up being just tight enough that it can 
fit in some trace cache or similar. In real life, the name hash is 
performance-critical, but at the same time almost certainly won't be run 
in a tight enough loop that you'd ever notice things like that.

		Linus

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] dcache: better name hash function
  2009-10-28  0:58                   ` Linus Torvalds
@ 2009-10-28  1:56                     ` Stephen Hemminger
  0 siblings, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2009-10-28  1:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, Stephen Hemminger, Andrew Morton, Octavian Purdila,
	netdev, linux-kernel, Al Viro

On Tue, 27 Oct 2009 17:58:53 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 27 Oct 2009, Stephen Hemminger wrote:
> > 
> > Agreed. Here is the reduced version of the program.
> > To run:
> >   find /home -printf '%f\n' 2>/dev/null | ./htest -n 100
> 
> The timings are very sensitive to random I$ layout at least on Nehalem. 
> The reason seems to be that the inner loop is _so_ tight that just 
> depending on exactly where the loop ends up, you can get subtle 
> interactions with the loop cache.
> 
> Look here:
> 
> 	[torvalds@nehalem ~]$ find /home -printf '%f\n' 2>/dev/null | ./htest -n 100
> 	Algorithm             Time       Ratio       Max   StdDev
> 	full_name_hash       1.141899     1.03      4868 263.37
> 	djb2                 0.980200     1.03      4835 266.05
> 	string10             0.909175     1.03      4850 262.67
> 	string10a            0.673915     1.03      4850 262.67
> 	string10b            0.909374     1.03      4850 262.67
> 	string_hash17        0.966050     1.03      4805 263.68
> 	string_hash31        1.008544     1.03      4807 259.37
> 	fnv32                0.774806     1.03      4817 259.17
> 
> what do you think the difference between 'string10', 'string10a' and 
> 'string10b' are?
> 
> None. None what-so-ever. The source code is identical, and gcc generates 
> identical assembly language. Yet those timings are extremely stable for 
> me, and 'string10b' is 25% faster than the identical string10 and 
> string10a functions.
> 
> The only difference? 'string10a' starts aligned to just 16 bytes, but that 
> in turn happens to mean that the tight inner loop ends up aligned on a 
> 128-byte boundary. And being cacheline aligned just there seems to matters 
> for some subtle micro-architectural reason.
> 
> The reason I noticed this is that I wondered what small modifications to 
> 'string10' would do for performance, and noticed that even _without_ the 
> small modifications, performance fluctuated.
> 
> Lesson? Microbenchmarks like this can be dangerous and misleading. That's 
> _especially_ true if the loop ends up being just tight enough that it can 
> fit in some trace cache or similar. In real life, the name hash is 
> performance-critical, but at the same time almost certainly won't be run 
> in a tight enough loop that you'd ever notice things like that.
> 
> 		Linus

Thanks. I wasn't putting huge amount of stock in the micro benchmark,
was more interested in how the distribution worked out (which is CPU
independent) rather than the time. As long as all usage of name hashing
fold properly, there isn't a lot of reason to change.



-- 

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2009-10-28  1:56 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-25 19:58 [PATCH next-next-2.6] netdev: better dev_name_hash Octavian Purdila
2009-10-25 20:17 ` Hagen Paul Pfeifer
2009-10-25 21:24 ` Eric Dumazet
2009-10-25 21:55   ` Octavian Purdila
2009-10-25 22:41     ` Hagen Paul Pfeifer
2009-10-25 22:45       ` Octavian Purdila
2009-10-26  5:28       ` Eric Dumazet
2009-10-26 13:07         ` Krishna Kumar2
2009-10-26 14:31           ` Octavian Purdila
2009-10-26 14:55             ` Eric Dumazet
2009-10-26 15:52               ` Octavian Purdila
2009-10-26 16:55                 ` Stephen Hemminger
2009-10-26 17:45                   ` Stephen Hemminger
2009-10-27  1:24               ` David Miller
2009-10-27  1:40                 ` Eric Dumazet
2009-10-26  6:30   ` Stephen Hemminger
2009-10-26  7:48     ` Eric Dumazet
2009-10-26  4:43 ` Stephen Hemminger
2009-10-26 22:36   ` [PATCH] dcache: better name hash function Stephen Hemminger <shemminger@vyatta.com>, Al Viro
2009-10-27  2:45     ` Eric Dumazet
2009-10-27  3:53       ` Stephen Hemminger
2009-10-27 16:38       ` Rick Jones
     [not found] <9986527.24561256620662709.JavaMail.root@tahiti.vyatta.com>
2009-10-27  5:19 ` Stephen Hemminger
2009-10-27  5:24   ` David Miller
2009-10-27  6:07   ` Eric Dumazet
2009-10-27  6:50     ` Eric Dumazet
2009-10-27  7:29       ` Eric Dumazet
2009-10-27 17:07         ` Stephen Hemminger
2009-10-27 17:32           ` Linus Torvalds
2009-10-27 23:08             ` Stephen Hemminger
2009-10-27 23:41               ` Linus Torvalds
2009-10-28  0:10                 ` Stephen Hemminger
2009-10-28  0:58                   ` Linus Torvalds
2009-10-28  1:56                     ` Stephen Hemminger
     [not found]           ` <4AE72B91.7040700@gmail.com>
2009-10-27 17:35             ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).