All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [Toy PATCH] Avoid spilling collided entries in object hash table to the next slots
Date: Fri, 29 Mar 2013 21:47:49 +0700	[thread overview]
Message-ID: <1364568469-2250-1-git-send-email-pclouds@gmail.com> (raw)

If a position in object hash table is taken, we currently check out
the next one. This could potentially create long object chains. We
could create linked lists instead and leave the next slot alone.

This patch relies on the fact that pointers are usually aligned more
than one byte and it uses the first bit as "object vs linked list"
indicator. Architectures that don't support this can fall back to
original implementation.

The saving is real, although not ground breaking. I'm just not sure it
is worth the ugliness that comes with this patch. Although we could
avoid the alignment problem by saving that bit in a separate bitmap.
"git rev-list --objects --all" on linux-2.6.git:

        before       after
real    0m33.407s    0m31.732s
user    0m32.926s    0m31.460s
sys     0m0.407s     0m0.202s

lookup_object() goes down from 30% to 27% in perf reports.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 object.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/object.c b/object.c
index bcfd2c6..7b013a0 100644
--- a/object.c
+++ b/object.c
@@ -8,6 +8,17 @@
 static struct object **obj_hash;
 static int nr_objs, obj_hash_size;
 
+#ifdef USE_LINKED_LIST
+struct obj_list {
+	struct object *obj;
+	struct obj_list *next;
+};
+
+#define IS_LST(x) ((intptr_t)(x) & 1)
+#define MAKE_LST(x) (struct object *)((intptr_t)(x) | 1)
+#define GET_LST(x) (struct obj_list *)((intptr_t)(x) & ~1)
+#endif
+
 unsigned int get_max_object_index(void)
 {
 	return obj_hash_size;
@@ -53,13 +64,30 @@ static unsigned int hash_obj(struct object *obj, unsigned int n)
 static void insert_obj_hash(struct object *obj, struct object **hash, unsigned int size)
 {
 	unsigned int j = hash_obj(obj, size);
+#ifdef USE_LINKED_LIST
+	struct obj_list *lst, *o;
+
+	if (!hash[j]) {
+		hash[j] = obj;
+		return;
+	}
 
+	o = xmalloc(sizeof(*o));
+	o->obj = obj;
+
+	if (IS_LST(hash[j]))
+		o->next = GET_LST(hash[j]);
+	else
+		o->next = NULL;
+	hash[j] = MAKE_LST(o);
+#else
 	while (hash[j]) {
 		j++;
 		if (j >= size)
 			j = 0;
 	}
 	hash[j] = obj;
+#endif
 }
 
 static unsigned int hashtable_index(const unsigned char *sha1)
@@ -78,6 +106,19 @@ struct object *lookup_object(const unsigned char *sha1)
 		return NULL;
 
 	i = hashtable_index(sha1);
+#ifdef USE_LINKED_LIST
+	if (IS_LST(obj_hash[i])) {
+		struct obj_list *lst;
+		for (lst = GET_LST(obj_hash[i]); lst; lst = lst->next) {
+			if (!hashcmp(lst->obj->sha1, sha1))
+				return lst->obj;
+		}
+		return NULL;
+	} else {
+		struct object *obj = obj_hash[i];
+		return obj && !hashcmp(sha1, obj->sha1) ? obj : NULL;
+	}
+#else
 	while ((obj = obj_hash[i]) != NULL) {
 		if (!hashcmp(sha1, obj->sha1))
 			break;
@@ -86,6 +127,7 @@ struct object *lookup_object(const unsigned char *sha1)
 			i = 0;
 	}
 	return obj;
+#endif
 }
 
 void grow_object_hash_to(unsigned long nr)
-- 
1.8.2.83.gc99314b

             reply	other threads:[~2013-03-29 14:48 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-29 14:47 Nguyễn Thái Ngọc Duy [this message]
2013-03-29 17:15 ` [Toy PATCH] Avoid spilling collided entries in object hash table to the next slots Junio C Hamano
2013-03-29 18:33   ` Junio C Hamano
2013-03-30  0:38   ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1364568469-2250-1-git-send-email-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.