[PATCH 0/6] thread lazy_init_name

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/6] thread lazy_init_name_hash
@ 2017-03-22 17:14 git
  2017-03-22 17:14 ` [PATCH 1/6] name-hash: specify initial size for istate.dir_hash table git
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

This patch series is a performance optimization for
lazy_init_name_hash() in name-hash.c on very large
repositories.

This change allows lazy_init_name_hash() to optionally
use multiple threads when building the the_index.dir_hash
and the_index.name_hash hashmaps.  The original code path
has been preserved and is used when the repo is small or
the system does not have sufficient CPUs.

A helper command (t/helper/test-lazy-init-name-hash) was
created to demonstrate performance differences and validate
output.  For example, use the '-p' option to compare both
code paths on a large repo.

During our testing on the Windows source tree (3.1M
files, 500K folders, 450MB index), this change reduced
the runtime of lazy_init_name_hash() from 1.4 to 0.27
seconds.

This patch series replaces my earlier
     * jh/memihash-opt (2017-02-17) 5 commits
patch series.

Jeff Hostetler (6):
  name-hash: specify initial size for istate.dir_hash table
  hashmap: allow memihash computation to be continued
  hashmap: Add disallow_rehash setting
  name-hash: perf improvement for lazy_init_name_hash
  name-hash: add test-lazy-init-name-hash
  name-hash: add perf test for lazy_init_name_hash

 Makefile                            |   1 +
 cache.h                             |   1 +
 hashmap.c                           |  29 ++-
 hashmap.h                           |  25 ++
 name-hash.c                         | 490 +++++++++++++++++++++++++++++++++++-
 t/helper/test-lazy-init-name-hash.c | 264 +++++++++++++++++++
 t/perf/p0004-lazy-init-name-hash.sh |  19 ++
 7 files changed, 820 insertions(+), 9 deletions(-)
 create mode 100644 t/helper/test-lazy-init-name-hash.c
 create mode 100644 t/perf/p0004-lazy-init-name-hash.sh

-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] name-hash: specify initial size for istate.dir_hash table
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
@ 2017-03-22 17:14 ` git
  2017-03-22 17:14 ` [PATCH 2/6] hashmap: allow memihash computation to be continued git
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Specify an initial size for the istate.dir_hash HashMap matching
the size of the istate.name_hash.

Previously hashmap_init() was given 0, causing a 64 bucket
hashmap to be created.  When working with very large
repositories, this would cause numerous rehash() calls to
realloc and rebalance the hashmap. This is especially true
when the worktree is deep, with many directories containing
a few files.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 name-hash.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/name-hash.c b/name-hash.c
index 6d9f23e..3f7722a 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -120,7 +120,8 @@ static void lazy_init_name_hash(struct index_state *istate)
 		return;
 	hashmap_init(&istate->name_hash, (hashmap_cmp_fn) cache_entry_cmp,
 			istate->cache_nr);
-	hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp, 0);
+	hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp,
+			istate->cache_nr);
 	for (nr = 0; nr < istate->cache_nr; nr++)
 		hash_index_entry(istate, istate->cache[nr]);
 	istate->name_hash_initialized = 1;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/6] hashmap: allow memihash computation to be continued
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
  2017-03-22 17:14 ` [PATCH 1/6] name-hash: specify initial size for istate.dir_hash table git
@ 2017-03-22 17:14 ` git
  2017-03-22 17:14 ` [PATCH 3/6] hashmap: Add disallow_rehash setting git
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Add variant of memihash() to allow the hash computation to
be continued.  There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory.  This can be expensive on large repositories.

With this, we can hash the parent directory first. And then
continue the computation to include the "/filename".

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 hashmap.c | 17 +++++++++++++++++
 hashmap.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/hashmap.c b/hashmap.c
index b10b642..505e63f 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -50,6 +50,23 @@ unsigned int memihash(const void *buf, size_t len)
 	return hash;
 }
 
+/*
+ * Incoporate another chunk of data into a memihash
+ * computation.
+ */ 
+unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len)
+{
+	unsigned int hash = hash_seed;
+	unsigned char *ucbuf = (unsigned char *) buf;
+	while (len--) {
+		unsigned int c = *ucbuf++;
+		if (c >= 'a' && c <= 'z')
+			c -= 'a' - 'A';
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
+
 #define HASHMAP_INITIAL_SIZE 64
 /* grow / shrink by 2^2 */
 #define HASHMAP_RESIZE_BITS 2
diff --git a/hashmap.h b/hashmap.h
index ab7958a..45eda69 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -12,6 +12,7 @@ extern unsigned int strhash(const char *buf);
 extern unsigned int strihash(const char *buf);
 extern unsigned int memhash(const void *buf, size_t len);
 extern unsigned int memihash(const void *buf, size_t len);
+extern unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len);
 
 static inline unsigned int sha1hash(const unsigned char *sha1)
 {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/6] hashmap: Add disallow_rehash setting
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
  2017-03-22 17:14 ` [PATCH 1/6] name-hash: specify initial size for istate.dir_hash table git
  2017-03-22 17:14 ` [PATCH 2/6] hashmap: allow memihash computation to be continued git
@ 2017-03-22 17:14 ` git
  2017-03-22 17:14 ` [PATCH 4/6] name-hash: perf improvement for lazy_init_name_hash git
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach hashmap to allow rehashes to be suppressed.
This is useful when hashmaps are accessed by multiple
threads.  It still requires the caller to properly
manage their locking.  This just prevents unexpected
rehashing during inserts and deletes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 hashmap.c | 12 +++++++++++-
 hashmap.h | 24 ++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/hashmap.c b/hashmap.c
index 505e63f..37dfc37 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -104,11 +104,19 @@ static inline unsigned int bucket(const struct hashmap *map,
 	return key->hash & (map->tablesize - 1);
 }
 
+int hashmap_bucket(const struct hashmap *map, unsigned int hash)
+{
+	return hash & (map->tablesize - 1);
+}
+
 static void rehash(struct hashmap *map, unsigned int newsize)
 {
 	unsigned int i, oldsize = map->tablesize;
 	struct hashmap_entry **oldtable = map->table;
 
+	if (map->disallow_rehash)
+		return;
+
 	alloc_table(map, newsize);
 	for (i = 0; i < oldsize; i++) {
 		struct hashmap_entry *e = oldtable[i];
@@ -141,7 +149,9 @@ void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function,
 		size_t initial_size)
 {
 	unsigned int size = HASHMAP_INITIAL_SIZE;
-	map->size = 0;
+
+	memset(map, 0, sizeof(*map));
+
 	map->cmpfn = equals_function ? equals_function : always_equal;
 
 	/* calculate initial table size and allocate the table */
diff --git a/hashmap.h b/hashmap.h
index 45eda69..de6022a 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -39,6 +39,7 @@ struct hashmap {
 	struct hashmap_entry **table;
 	hashmap_cmp_fn cmpfn;
 	unsigned int size, tablesize, grow_at, shrink_at;
+	unsigned disallow_rehash : 1;
 };
 
 struct hashmap_iter {
@@ -77,6 +78,29 @@ static inline void *hashmap_get_from_hash(const struct hashmap *map,
 	return hashmap_get(map, &key, keydata);
 }
 
+int hashmap_bucket(const struct hashmap *map, unsigned int hash);
+
+/*
+ * Disallow/allow rehashing of the hashmap.
+ * This is useful if the caller knows that the hashmap
+ * needs multi-threaded access.  The caller is still
+ * required to guard/lock searches and inserts in a
+ * manner appropriate to their usage.  This simply
+ * prevents the table from being unexpectedly re-mapped.
+ *
+ * If is up to the caller to ensure that the hashmap is
+ * initialized to a reasonable size to prevent poor
+ * performance.
+ *
+ * When value=1, prevent future rehashes on adds and deleted.
+ * When value=0, allow future rehahses.  This DOES NOT force
+ * a rehash now.
+ */
+static inline void hashmap_disallow_rehash(struct hashmap *map, unsigned value)
+{
+	map->disallow_rehash = value;
+}
+
 /* hashmap_iter functions */
 
 extern void hashmap_iter_init(struct hashmap *map, struct hashmap_iter *iter);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/6] name-hash: perf improvement for lazy_init_name_hash
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
                   ` (2 preceding siblings ...)
  2017-03-22 17:14 ` [PATCH 3/6] hashmap: Add disallow_rehash setting git
@ 2017-03-22 17:14 ` git
  2017-03-22 17:14 ` [PATCH 5/6] name-hash: add test-lazy-init-name-hash git
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Improve performance of lazy_init_name_hash() when
ignore_case is set.  Teach name-hash to build the
istate.name_hash and istate.dir_hash simultaneously
using a forward-diving technique on the pathname
of the index_entry, rather than adding name_hash
entries and then searching backwards in the pathname
for parent directories.

This borrows algorithm ideas from clear_ce_flags_{1,dir}.

Multiple threads are used with the new algorithm to
speed hashmap construction.

This new code path is only used when threads are present
(a compiler settings) and when the index is large enough
to warrant the pthread complexity.

The code in clear_ce_flags_dir() uses a linear search to
find the adjacent index entries with the same prefix; a
binary search is used here handle_range_dir() to further
speed things up.

The size of LAZY_THREAD_COST was determined from rough
analysis using:
    t/helper/test-lazy-init-name-hash --analyze

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 name-hash.c | 487 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 480 insertions(+), 7 deletions(-)

diff --git a/name-hash.c b/name-hash.c
index 3f7722a..a582650 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -23,15 +23,21 @@ static int dir_entry_cmp(const struct dir_entry *e1,
 			name ? name : e2->name, e1->namelen);
 }
 
-static struct dir_entry *find_dir_entry(struct index_state *istate,
-		const char *name, unsigned int namelen)
+static struct dir_entry *find_dir_entry__hash(struct index_state *istate,
+		const char *name, unsigned int namelen, unsigned int hash)
 {
 	struct dir_entry key;
-	hashmap_entry_init(&key, memihash(name, namelen));
+	hashmap_entry_init(&key, hash);
 	key.namelen = namelen;
 	return hashmap_get(&istate->dir_hash, &key, name);
 }
 
+static struct dir_entry *find_dir_entry(struct index_state *istate,
+		const char *name, unsigned int namelen)
+{
+	return find_dir_entry__hash(istate, name, namelen, memihash(name,namelen));
+}
+
 static struct dir_entry *hash_dir_entry(struct index_state *istate,
 		struct cache_entry *ce, int namelen)
 {
@@ -112,21 +118,488 @@ static int cache_entry_cmp(const struct cache_entry *ce1,
 	return remove ? !(ce1 == ce2) : 0;
 }
 
-static void lazy_init_name_hash(struct index_state *istate)
+
+#ifdef NO_PTHREADS
+
+static int lookup_lazy_params(struct index_state *istate)
 {
-	int nr;
+	return 0;
+}
+
+#else
+
+#include "thread-utils.h"
+
+/*
+ * Set a minimum number of cache_entries that we will handle per
+ * thread and use that to decide how many threads to run (upto
+ * the number on the system).
+ *
+ * For guidance setting the lower per-thread bound, see:
+ *     t/helper/test-lazy-init-name-hash --analyze
+ */
+#define LAZY_THREAD_COST (2000)
+
+/*
+ * We use n mutexes to guard n partitions of the "istate->dir_hash"
+ * hashtable.  Since "find" and "insert" operations will hash to a
+ * particular bucket and modify/search a single chain, we can say
+ * that "all chains mod n" are guarded by the same mutex -- rather
+ * than having a single mutex to guard the entire table.  (This does
+ * require that we disable "rehashing" on the hashtable.)
+ *
+ * So, a larger value here decreases the probability of a collision
+ * and the time that each thread must wait for the mutex.
+ */
+#define LAZY_MAX_MUTEX   (32)
+
+static int lazy_try_threaded = 1;
+static int lazy_nr_dir_threads = 0;
+static pthread_mutex_t *lazy_dir_mutex_array;
+
+/*
+ * An array of lazy_entry items is used by the n threads in
+ * the directory parse (first) phase to (lock-free) store the
+ * intermediate results.  These values are then referenced by
+ * the 2 threads in the second phase.
+ */
+struct lazy_entry {
+	struct dir_entry *dir;
+	unsigned int hash_dir;
+	unsigned int hash_name;
+};
+
+/*
+ * Decide if we want to use threads (if available) to load
+ * the hash tables.  We set "lazy_nr_dir_threads" to zero when
+ * it is not worth it.
+ */
+static int lookup_lazy_params(struct index_state *istate)
+{
+	int nr_cpus;
+
+	lazy_nr_dir_threads = 0;
+
+	if (!lazy_try_threaded)
+		return 0;
+
+	/*
+	 * If we are respecting case, just use the original
+	 * code to build the "istate->name_hash".  We don't
+	 * need the complexity here.
+	 */
+	if (!ignore_case)
+		return 0;
+
+	nr_cpus = online_cpus();
+	if (nr_cpus < 2)
+		return 0;
+
+	if (istate->cache_nr < 2 * LAZY_THREAD_COST)
+		return 0;
+
+	if (istate->cache_nr < nr_cpus * LAZY_THREAD_COST)
+		nr_cpus = istate->cache_nr / LAZY_THREAD_COST;
+	lazy_nr_dir_threads = nr_cpus;
+	return lazy_nr_dir_threads;
+}
+
+/*
+ * Initialize n mutexes for use when searching and inserting
+ * into "istate->dir_hash".  All "dir" threads are trying
+ * to insert partial pathnames into the hash as they iterate
+ * over their portions of the index, so lock contention is
+ * high.
+ *
+ * However, the hashmap is going to put items into bucket
+ * chains based on their hash values.  Use that to create n
+ * mutexes and lock on mutex[bucket(hash) % n].  This will
+ * decrease the collision rate by (hopefully) by a factor of n.
+ */
+static void init_dir_mutex(void)
+{
+	int j;
+
+	lazy_dir_mutex_array = xcalloc(LAZY_MAX_MUTEX, sizeof(pthread_mutex_t));
+
+	for (j = 0; j < LAZY_MAX_MUTEX; j++)
+		init_recursive_mutex(&lazy_dir_mutex_array[j]);
+}
+
+static void cleanup_dir_mutex(void)
+{
+	int j;
+
+	for (j = 0; j < LAZY_MAX_MUTEX; j++)
+		pthread_mutex_destroy(&lazy_dir_mutex_array[j]);
+
+	free(lazy_dir_mutex_array);
+}
+
+static void lock_dir_mutex(int j)
+{
+	pthread_mutex_lock(&lazy_dir_mutex_array[j]);
+}
+
+static void unlock_dir_mutex(int j)
+{
+	pthread_mutex_unlock(&lazy_dir_mutex_array[j]);
+}
+
+static inline int compute_dir_lock_nr(
+	const struct hashmap *map,
+	unsigned int hash)
+{
+	return hashmap_bucket(map, hash) % LAZY_MAX_MUTEX;
+}
+
+static struct dir_entry *hash_dir_entry_with_parent_and_prefix(
+	struct index_state *istate,
+	struct dir_entry *parent,
+	struct strbuf *prefix)
+{
+	struct dir_entry *dir;
+	unsigned int hash;
+	int lock_nr;
+
+	/*
+	 * Either we have a parent directory and path with slash(es)
+	 * or the directory is an immediate child of the root directory.
+	 */
+	assert((parent != NULL) ^ (strchr(prefix->buf, '/') == 0));
+
+	if (parent)
+		hash = memihash_cont(parent->ent.hash,
+			prefix->buf + parent->namelen,
+			prefix->len - parent->namelen);
+	else
+		hash = memihash(prefix->buf, prefix->len);
 
+	lock_nr = compute_dir_lock_nr(&istate->dir_hash, hash);
+	lock_dir_mutex(lock_nr);
+
+	dir = find_dir_entry__hash(istate, prefix->buf, prefix->len, hash);
+	if (!dir) {
+		FLEX_ALLOC_MEM(dir, name, prefix->buf, prefix->len);
+		hashmap_entry_init(dir, hash);
+		dir->namelen = prefix->len;
+		dir->parent = parent;
+		hashmap_add(&istate->dir_hash, dir);
+
+		if (parent) {
+			unlock_dir_mutex(lock_nr);
+
+			/* All I really need here is an InterlockedIncrement(&(parent->nr)) */
+			lock_nr = compute_dir_lock_nr(&istate->dir_hash, parent->ent.hash);
+			lock_dir_mutex(lock_nr);
+			parent->nr++;
+		}
+	}
+
+	unlock_dir_mutex(lock_nr);
+
+	return dir;
+}
+
+/*
+ * handle_range_1() and handle_range_dir() are derived from
+ * clear_ce_flags_1() and clear_ce_flags_dir() in unpack-trees.c
+ * and handle the iteration over the entire array of index entries.
+ * They use recursion for adjacent entries in the same parent
+ * directory.
+ */
+static int handle_range_1(
+	struct index_state *istate,
+	int k_start,
+	int k_end,
+	struct dir_entry *parent,
+	struct strbuf *prefix,
+	struct lazy_entry *lazy_entries);
+
+static int handle_range_dir(
+	struct index_state *istate,
+	int k_start,
+	int k_end,
+	struct dir_entry *parent,
+	struct strbuf *prefix,
+	struct lazy_entry *lazy_entries,
+	struct dir_entry **dir_new_out)
+{
+	int rc, k;
+	int input_prefix_len = prefix->len;
+	struct dir_entry *dir_new;
+
+	dir_new = hash_dir_entry_with_parent_and_prefix(istate, parent, prefix);
+
+	strbuf_addch(prefix, '/');
+
+	/*
+	 * Scan forward in the index array for index entries having the same
+	 * path prefix (that are also in this directory).
+	 */
+	if (strncmp(istate->cache[k_start + 1]->name, prefix->buf, prefix->len) > 0)
+		k = k_start + 1;
+	else if (strncmp(istate->cache[k_end - 1]->name, prefix->buf, prefix->len) == 0)
+		k = k_end;
+	else {
+		int begin = k_start;
+		int end = k_end;
+		while (begin < end) {
+			int mid = (begin + end) >> 1;
+			int cmp = strncmp(istate->cache[mid]->name, prefix->buf, prefix->len);
+			if (cmp == 0) /* mid has same prefix; look in second part */
+				begin = mid + 1;
+			else if (cmp > 0) /* mid is past group; look in first part */
+				end = mid;
+			else
+				die("cache entry out of order");
+		}
+		k = begin;
+	}
+
+	/*
+	 * Recurse and process what we can of this subset [k_start, k).
+	 */
+	rc = handle_range_1(istate, k_start, k, dir_new, prefix, lazy_entries);
+
+	strbuf_setlen(prefix, input_prefix_len);
+
+	*dir_new_out = dir_new;
+	return rc;
+}
+
+static int handle_range_1(
+	struct index_state *istate,
+	int k_start,
+	int k_end,
+	struct dir_entry *parent,
+	struct strbuf *prefix,
+	struct lazy_entry *lazy_entries)
+{
+	int input_prefix_len = prefix->len;
+	int k = k_start;
+
+	while (k < k_end) {
+		struct cache_entry *ce_k = istate->cache[k];
+		const char *name, *slash;
+
+		if (prefix->len && strncmp(ce_k->name, prefix->buf, prefix->len))
+			break;
+
+		name = ce_k->name + prefix->len;
+		slash = strchr(name, '/');
+
+		if (slash) {
+			int len = slash - name;
+			int processed;
+			struct dir_entry *dir_new;
+
+			strbuf_add(prefix, name, len);
+			processed = handle_range_dir(istate, k, k_end, parent, prefix, lazy_entries, &dir_new);
+			if (processed) {
+				k += processed;
+				strbuf_setlen(prefix, input_prefix_len);
+				continue;
+			}
+
+			strbuf_addch(prefix, '/');
+			processed = handle_range_1(istate, k, k_end, dir_new, prefix, lazy_entries);
+			k += processed;
+			strbuf_setlen(prefix, input_prefix_len);
+			continue;
+		}
+
+		/*
+		 * It is too expensive to take a lock to insert "ce_k"
+		 * into "istate->name_hash" and increment the ref-count
+		 * on the "parent" dir.  So we defer actually updating
+		 * permanent data structures until phase 2 (where we
+		 * can change the locking requirements) and simply
+		 * accumulate our current results into the lazy_entries
+		 * data array).
+		 *
+		 * We do not need to lock the lazy_entries array because
+		 * we have exclusive access to the cells in the range
+		 * [k_start,k_end) that this thread was given.
+		 */
+		lazy_entries[k].dir = parent;
+		if (parent) {
+			lazy_entries[k].hash_name = memihash_cont(
+				parent->ent.hash,
+				ce_k->name + parent->namelen,
+				ce_namelen(ce_k) - parent->namelen);
+			lazy_entries[k].hash_dir = parent->ent.hash;
+		} else {
+			lazy_entries[k].hash_name = memihash(ce_k->name, ce_namelen(ce_k));
+		}
+
+		k++;
+	}
+
+	return k - k_start;
+}
+
+struct lazy_dir_thread_data {
+	pthread_t pthread;
+	struct index_state *istate;
+	struct lazy_entry *lazy_entries;
+	int k_start;
+	int k_end;
+};
+
+static void *lazy_dir_thread_proc(void *_data)
+{
+	struct lazy_dir_thread_data *d = _data;
+	struct strbuf prefix = STRBUF_INIT;
+	handle_range_1(d->istate, d->k_start, d->k_end, NULL, &prefix, d->lazy_entries);
+	strbuf_release(&prefix);
+	return NULL;
+}
+
+struct lazy_name_thread_data {
+	pthread_t pthread;
+	struct index_state *istate;
+	struct lazy_entry *lazy_entries;
+};
+
+static void *lazy_name_thread_proc(void *_data)
+{
+	struct lazy_name_thread_data *d = _data;
+	int k;
+
+	for (k = 0; k < d->istate->cache_nr; k++) {
+		struct cache_entry *ce_k = d->istate->cache[k];
+		ce_k->ce_flags |= CE_HASHED;
+		hashmap_entry_init(ce_k, d->lazy_entries[k].hash_name);
+		hashmap_add(&d->istate->name_hash, ce_k);
+	}
+
+	return NULL;
+}
+
+static inline void lazy_update_dir_ref_counts(
+	struct index_state *istate,
+	struct lazy_entry *lazy_entries)
+{
+	int k;
+
+	for (k = 0; k < istate->cache_nr; k++) {
+		if (lazy_entries[k].dir)
+			lazy_entries[k].dir->nr++;
+	}
+}
+
+static void threaded_lazy_init_name_hash(
+	struct index_state *istate)
+{
+	int nr_each;
+	int k_start;
+	int t;
+	struct lazy_entry *lazy_entries;
+	struct lazy_dir_thread_data *td_dir;
+	struct lazy_name_thread_data *td_name;
+
+	k_start = 0;
+	nr_each = DIV_ROUND_UP(istate->cache_nr, lazy_nr_dir_threads);
+
+	lazy_entries = xcalloc(istate->cache_nr, sizeof(struct lazy_entry));
+	td_dir = xcalloc(lazy_nr_dir_threads, sizeof(struct lazy_dir_thread_data));
+	td_name = xcalloc(1, sizeof(struct lazy_name_thread_data));
+
+	init_dir_mutex();
+
+	/*
+	 * Phase 1:
+	 * Build "istate->dir_hash" using n "dir" threads (and a read-only index).
+	 */
+	for (t = 0; t < lazy_nr_dir_threads; t++) {
+		struct lazy_dir_thread_data *td_dir_t = td_dir + t;
+		td_dir_t->istate = istate;
+		td_dir_t->lazy_entries = lazy_entries;
+		td_dir_t->k_start = k_start;
+		k_start += nr_each;
+		if (k_start > istate->cache_nr)
+			k_start = istate->cache_nr;
+		td_dir_t->k_end = k_start;
+		if (pthread_create(&td_dir_t->pthread, NULL, lazy_dir_thread_proc, td_dir_t))
+			die("unable to create lazy_dir_thread");
+	}
+	for (t = 0; t < lazy_nr_dir_threads; t++) {
+		struct lazy_dir_thread_data *td_dir_t = td_dir + t;
+		if (pthread_join(td_dir_t->pthread, NULL))
+			die("unable to join lazy_dir_thread");
+	}
+
+	/*
+	 * Phase 2:
+	 * Iterate over all index entries and add them to the "istate->name_hash"
+	 * using a single "name" background thread.
+	 * (Testing showed it wasn't worth running more than 1 thread for this.)
+	 *
+	 * Meanwhile, finish updating the parent directory ref-counts for each
+	 * index entry using the current thread.  (This step is very fast and
+	 * doesn't need threading.)
+	 */
+	td_name->istate = istate;
+	td_name->lazy_entries = lazy_entries;
+	if (pthread_create(&td_name->pthread, NULL, lazy_name_thread_proc, td_name))
+		die("unable to create lazy_name_thread");
+
+	lazy_update_dir_ref_counts(istate, lazy_entries);
+
+	if (pthread_join(td_name->pthread, NULL))
+		die("unable to join lazy_name_thread");
+
+	cleanup_dir_mutex();
+
+	free(td_name);
+	free(td_dir);
+	free(lazy_entries);
+}
+
+#endif
+
+static void lazy_init_name_hash(struct index_state *istate)
+{
 	if (istate->name_hash_initialized)
 		return;
 	hashmap_init(&istate->name_hash, (hashmap_cmp_fn) cache_entry_cmp,
 			istate->cache_nr);
 	hashmap_init(&istate->dir_hash, (hashmap_cmp_fn) dir_entry_cmp,
 			istate->cache_nr);
-	for (nr = 0; nr < istate->cache_nr; nr++)
-		hash_index_entry(istate, istate->cache[nr]);
+
+	if (lookup_lazy_params(istate)) {
+		hashmap_disallow_rehash(&istate->dir_hash, 1);
+		threaded_lazy_init_name_hash(istate);
+		hashmap_disallow_rehash(&istate->dir_hash, 0);
+	} else {
+		int nr;
+		for (nr = 0; nr < istate->cache_nr; nr++)
+			hash_index_entry(istate, istate->cache[nr]);
+	}
+
 	istate->name_hash_initialized = 1;
 }
 
+/*
+ * A test routine for t/helper/ sources.
+ *
+ * Returns the number of threads used or 0 when
+ * the non-threaded code path was used.
+ *
+ * Requesting threading WILL NOT override guards
+ * in lookup_lazy_params().
+ */
+int test_lazy_init_name_hash(struct index_state *istate, int try_threaded)
+{
+	lazy_nr_dir_threads = 0;
+	lazy_try_threaded = try_threaded;
+
+	lazy_init_name_hash(istate);
+
+	return lazy_nr_dir_threads;
+}
+
 void add_name_hash(struct index_state *istate, struct cache_entry *ce)
 {
 	if (istate->name_hash_initialized)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] name-hash: add test-lazy-init-name-hash
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
                   ` (3 preceding siblings ...)
  2017-03-22 17:14 ` [PATCH 4/6] name-hash: perf improvement for lazy_init_name_hash git
@ 2017-03-22 17:14 ` git
  2017-03-22 17:14 ` [PATCH 6/6] name-hash: add perf test for lazy_init_name_hash git
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Add t/helper/test-lazy-init-name-hash.c test code
to demonstrate performance times for lazy_init_name_hash()
using the original single-threaded and the new multi-threaded
code paths.

Includes a --dump option to dump the created hashmaps to
stdout.  You can use this to run both code paths and
confirm that they generate the same hashmaps.

Includes a --analyze option to analyze performance of both
code paths over a range of index sizes to help you find a
lower bound for the LAZY_THREAD_COST in name-hash.c.
For example, passing "-a 4000" will set "istate.cache_nr"
to 4000 and then try the multi-threaded code -- probably
giving 2 threads with 2000 entries each.  It will then
run both the single-threaded (1x4000) and the multi-threaded
(2x2000) and compare the times.  It will then repeat the
test with 8000, 12000, and etc. so that you can see the
cross over.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile                            |   1 +
 cache.h                             |   1 +
 t/helper/test-lazy-init-name-hash.c | 264 ++++++++++++++++++++++++++++++++++++
 3 files changed, 266 insertions(+)
 create mode 100644 t/helper/test-lazy-init-name-hash.c

diff --git a/Makefile b/Makefile
index 9ec6065..5fb46c8 100644
--- a/Makefile
+++ b/Makefile
@@ -616,6 +616,7 @@ TEST_PROGRAMS_NEED_X += test-fake-ssh
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
+TEST_PROGRAMS_NEED_X += test-lazy-init-name-hash
 TEST_PROGRAMS_NEED_X += test-line-buffer
 TEST_PROGRAMS_NEED_X += test-match-trees
 TEST_PROGRAMS_NEED_X += test-mergesort
diff --git a/cache.h b/cache.h
index 80b6372..6a72f6a 100644
--- a/cache.h
+++ b/cache.h
@@ -343,6 +343,7 @@ struct index_state {
 extern struct index_state the_index;
 
 /* Name hashing */
+extern int test_lazy_init_name_hash(struct index_state *istate, int try_threaded);
 extern void add_name_hash(struct index_state *istate, struct cache_entry *ce);
 extern void remove_name_hash(struct index_state *istate, struct cache_entry *ce);
 extern void free_name_hash(struct index_state *istate);
diff --git a/t/helper/test-lazy-init-name-hash.c b/t/helper/test-lazy-init-name-hash.c
new file mode 100644
index 0000000..435837b
--- /dev/null
+++ b/t/helper/test-lazy-init-name-hash.c
@@ -0,0 +1,264 @@
+#include "cache.h"
+#include "parse-options.h"
+
+static int single = 0;
+static int multi = 0;
+static int count = 1;
+static int dump = 0;
+static int perf = 0;
+static int analyze = 0;
+static int analyze_step = 0;
+
+/*
+ * Dump the contents of the "dir" and "name" hash tables to stdout.
+ * If you sort the result, you can compare it with the other type
+ * mode and verify that both single and multi produce the same set.
+ */
+void dump_run(void)
+{
+	struct hashmap_iter iter_dir;
+	struct hashmap_iter iter_cache;
+
+	/* Stolen from name-hash.c */
+	struct dir_entry {
+		struct hashmap_entry ent;
+		struct dir_entry *parent;
+		int nr;
+		unsigned int namelen;
+		char name[FLEX_ARRAY];
+	};
+
+	struct dir_entry *dir;
+	struct cache_entry *ce;
+
+	read_cache();
+	if (single) {
+		test_lazy_init_name_hash(&the_index, 0);
+	} else {
+		int nr_threads_used = test_lazy_init_name_hash(&the_index, 1);
+		if (!nr_threads_used)
+			die("non-threaded code path used");
+	}
+
+	dir = hashmap_iter_first(&the_index.dir_hash, &iter_dir);
+	while (dir) {
+		printf("dir %08x %7d %s\n", dir->ent.hash, dir->nr, dir->name);
+		dir = hashmap_iter_next(&iter_dir);
+	}
+
+	ce = hashmap_iter_first(&the_index.name_hash, &iter_cache);
+	while (ce) {
+		printf("name %08x %s\n", ce->ent.hash, ce->name);
+		ce = hashmap_iter_next(&iter_cache);
+	}
+
+	discard_cache();
+}
+
+/*
+ * Run the single or multi threaded version "count" times and
+ * report on the time taken.
+ */
+uint64_t time_runs(int try_threaded)
+{
+	uint64_t t0, t1, t2;
+	uint64_t sum = 0;
+	uint64_t avg;
+	int nr_threads_used;
+	int i;
+
+	for (i = 0; i < count; i++) {
+		t0 = getnanotime();
+		read_cache();
+		t1 = getnanotime();
+		nr_threads_used = test_lazy_init_name_hash(&the_index, try_threaded);
+		t2 = getnanotime();
+
+		sum += (t2 - t1);
+
+		if (try_threaded && !nr_threads_used)
+			die("non-threaded code path used");
+
+		if (nr_threads_used)
+			printf("%f %f %d multi %d\n",
+				   ((double)(t1 - t0))/1000000000,
+				   ((double)(t2 - t1))/1000000000,
+				   the_index.cache_nr,
+				   nr_threads_used);
+		else
+			printf("%f %f %d single\n",
+				   ((double)(t1 - t0))/1000000000,
+				   ((double)(t2 - t1))/1000000000,
+				   the_index.cache_nr);
+		fflush(stdout);
+
+		discard_cache();
+	}
+
+	avg = sum / count;
+	if (count > 1)
+		printf("avg %f %s\n",
+			   (double)avg/1000000000,
+			   (try_threaded) ? "multi" : "single");
+
+	return avg;
+}
+
+/*
+ * Try a series of runs varying the "istate->cache_nr" and
+ * try to find a good value for the multi-threaded criteria.
+ */
+void analyze_run(void)
+{
+	uint64_t t1s, t1m, t2s, t2m;
+	int cache_nr_limit;
+	int nr_threads_used;
+	int i;
+	int nr;
+
+	read_cache();
+	cache_nr_limit = the_index.cache_nr;
+	discard_cache();
+
+	nr = analyze;
+	while (1) {
+		uint64_t sum_single = 0;
+		uint64_t sum_multi = 0;
+		uint64_t avg_single;
+		uint64_t avg_multi;
+
+		if (nr > cache_nr_limit)
+			nr = cache_nr_limit;
+
+		for (i = 0; i < count; i++) {
+			read_cache();
+			the_index.cache_nr = nr; /* cheap truncate of index */
+			t1s = getnanotime();
+			test_lazy_init_name_hash(&the_index, 0);
+			t2s = getnanotime();
+			sum_single += (t2s - t1s);
+			the_index.cache_nr = cache_nr_limit;
+			discard_cache();
+
+			read_cache();
+			the_index.cache_nr = nr; /* cheap truncate of index */
+			t1m = getnanotime();
+			nr_threads_used = test_lazy_init_name_hash(&the_index, 1);
+			t2m = getnanotime();
+			sum_multi += (t2m - t1m);
+			the_index.cache_nr = cache_nr_limit;
+			discard_cache();
+
+			if (!nr_threads_used)
+				printf("    [size %8d] [single %f]   non-threaded code path used\n",
+					   nr, ((double)(t2s - t1s))/1000000000);
+			else 
+				printf("    [size %8d] [single %f] %c [multi %f %d]\n",
+					   nr,
+					   ((double)(t2s - t1s))/1000000000,
+					   (((t2s - t1s) < (t2m - t1m)) ? '<' : '>'),
+					   ((double)(t2m - t1m))/1000000000,
+					   nr_threads_used);
+			fflush(stdout);
+		}
+		if (count > 1) {
+			avg_single = sum_single / count;
+			avg_multi = sum_multi / count;
+			if (!nr_threads_used)
+				printf("avg [size %8d] [single %f]\n",
+					   nr,
+					   (double)avg_single/1000000000);
+			else
+				printf("avg [size %8d] [single %f] %c [multi %f %d]\n",
+					   nr,
+					   (double)avg_single/1000000000,
+					   (avg_single < avg_multi ? '<' : '>'),
+					   (double)avg_multi/1000000000,
+					   nr_threads_used);
+			fflush(stdout);
+		}
+
+		if (nr >= cache_nr_limit)
+			return;
+		nr += analyze_step;
+	}
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	const char *usage[] = {
+		"test-lazy-init-name-hash -d (-s | -m)",
+		"test-lazy-init-name-hash -p [-c c]",
+		"test-lazy-init-name-hash -a a [--step s] [-c c]",
+		"test-lazy-init-name-hash (-s | -m) [-c c]",
+		"test-lazy-init-name-hash -s -m [-c c]",
+		NULL
+	};
+	struct option options[] = {
+		OPT_BOOL('s', "single", &single, "run single-threaded code"),
+		OPT_BOOL('m', "multi", &multi, "run multi-threaded code"),
+		OPT_INTEGER('c', "count", &count, "number of passes"),
+		OPT_BOOL('d', "dump", &dump, "dump hash tables"),
+		OPT_BOOL('p', "perf", &perf, "compare single vs multi"),
+		OPT_INTEGER('a', "analyze", &analyze, "analyze different multi sizes"),
+		OPT_INTEGER(0, "step", &analyze_step, "analyze step factor"),
+		OPT_END(),
+	};
+	const char *prefix;
+	uint64_t avg_single, avg_multi;
+
+	prefix = setup_git_directory();
+
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
+
+	/*
+	 * istate->dir_hash is only created when ignore_case is set.
+	 */
+	ignore_case = 1;
+
+	if (dump) {
+		if (perf || analyze > 0)
+			die("cannot combine dump, perf, or analyze");
+		if (count > 1)
+			die("count not valid with dump");
+		if (single && multi)
+			die("cannot use both single and multi with dump");
+		if (!single && !multi)
+			die("dump requires either single or multi");
+		dump_run();
+		return 0;
+	}
+
+	if (perf) {
+		if (analyze > 0)
+			die("cannot combine dump, perf, or analyze");
+		if (single || multi)
+			die("cannot use single or multi with perf");
+		avg_single = time_runs(0);
+		avg_multi = time_runs(1);
+		if (avg_multi > avg_single)
+			die("multi is slower");
+		return 0;
+	}
+
+	if (analyze) {
+		if (analyze < 500)
+			die("analyze must be at least 500");
+		if (!analyze_step)
+			analyze_step = analyze;
+		if (single || multi)
+			die("cannot use single or multi with analyze");
+		analyze_run();
+		return 0;
+	}
+
+	if (!single && !multi)
+		die("require either -s or -m or both");
+
+	if (single)
+		time_runs(0);
+	if (multi)
+		time_runs(1);
+
+	return 0;
+}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] name-hash: add perf test for lazy_init_name_hash
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
                   ` (4 preceding siblings ...)
  2017-03-22 17:14 ` [PATCH 5/6] name-hash: add test-lazy-init-name-hash git
@ 2017-03-22 17:14 ` git
  2017-03-22 18:02 ` [PATCH 0/6] thread lazy_init_name_hash Stefan Beller
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: git @ 2017-03-22 17:14 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Created t/perf/p0004-lazy-init-name-hash.sh test
to demonstrate correctness and performance gains
with the multithreaded version of lazy_init_name_hash().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/perf/p0004-lazy-init-name-hash.sh | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 t/perf/p0004-lazy-init-name-hash.sh

diff --git a/t/perf/p0004-lazy-init-name-hash.sh b/t/perf/p0004-lazy-init-name-hash.sh
new file mode 100644
index 0000000..5afa8c8
--- /dev/null
+++ b/t/perf/p0004-lazy-init-name-hash.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='Tests multi-threaded lazy_init_name_hash'
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_expect_success 'verify both methods build the same hashmaps' '
+	$GIT_BUILD_DIR/t/helper/test-lazy-init-name-hash$X --dump --single | sort >out.single &&
+	$GIT_BUILD_DIR/t/helper/test-lazy-init-name-hash$X --dump --multi  | sort >out.multi  &&
+	test_cmp out.single out.multi
+'
+
+test_expect_success 'multithreaded should be faster' '
+	$GIT_BUILD_DIR/t/helper/test-lazy-init-name-hash$X --perf >out.perf
+'
+
+test_done
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
                   ` (5 preceding siblings ...)
  2017-03-22 17:14 ` [PATCH 6/6] name-hash: add perf test for lazy_init_name_hash git
@ 2017-03-22 18:02 ` Stefan Beller
  2017-03-22 19:22   ` Jeff Hostetler
  2017-03-22 19:38 ` Junio C Hamano
  2017-03-22 20:39 ` Junio C Hamano
  8 siblings, 1 reply; 14+ messages in thread
From: Stefan Beller @ 2017-03-22 18:02 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: git@vger.kernel.org, Junio C Hamano, Jeff King, Jeff Hostetler

On Wed, Mar 22, 2017 at 10:14 AM,  <git@jeffhostetler.com> wrote:
>
> During our testing on the Windows source tree (3.1M
> files, 500K folders, 450MB index), this change reduced
> the runtime of lazy_init_name_hash() from 1.4 to 0.27
> seconds.

This sounds promising. :)
A fast skim over the code makes me like the code.

>  hashmap.c                           |  29 ++-
>  hashmap.h                           |  25 ++

Could you add some documentation to
Documentation/technical/api-hashmap.txt ?
(Bonus points for migrating the documentation inline,
c.f. discussion surrounding [1])

[1] https://public-inbox.org/git/20141212212800.GA27451@peff.net/

>  name-hash.c                         | 490 +++++++++++++++++++++++++++++++++++-

AFAICT the new threading is all implicit in name-hash and we do not expose
its functionality or tuning knobs outside the testing helper, such that we do
not need API documentation here, but only enough code comments to
understand the code for maintainability?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 18:02 ` [PATCH 0/6] thread lazy_init_name_hash Stefan Beller
@ 2017-03-22 19:22   ` Jeff Hostetler
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Hostetler @ 2017-03-22 19:22 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Junio C Hamano, Jeff King, Jeff Hostetler



On 3/22/2017 2:02 PM, Stefan Beller wrote:
> On Wed, Mar 22, 2017 at 10:14 AM,  <git@jeffhostetler.com> wrote:
>>
>> During our testing on the Windows source tree (3.1M
>> files, 500K folders, 450MB index), this change reduced
>> the runtime of lazy_init_name_hash() from 1.4 to 0.27
>> seconds.
>
> This sounds promising. :)
> A fast skim over the code makes me like the code.

I forgot to mention that this was on an 8 (logical) core machine.

>
>>  hashmap.c                           |  29 ++-
>>  hashmap.h                           |  25 ++
>
> Could you add some documentation to
> Documentation/technical/api-hashmap.txt ?

Will do. Thanks.

> (Bonus points for migrating the documentation inline,
> c.f. discussion surrounding [1])
>
> [1] https://public-inbox.org/git/20141212212800.GA27451@peff.net/

I'd like to save this for a separate effort.

>
>>  name-hash.c                         | 490 +++++++++++++++++++++++++++++++++++-
>
> AFAICT the new threading is all implicit in name-hash and we do not expose
> its functionality or tuning knobs outside the testing helper, such that we do
> not need API documentation here, but only enough code comments to
> understand the code for maintainability?

Yes, my goal was to just make it transparent to all callers.
Especially since the main lazy_init_name_hash() function is static
and not visible anyway.

Jeff


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
                   ` (6 preceding siblings ...)
  2017-03-22 18:02 ` [PATCH 0/6] thread lazy_init_name_hash Stefan Beller
@ 2017-03-22 19:38 ` Junio C Hamano
  2017-03-22 20:54   ` Junio C Hamano
  2017-03-22 20:39 ` Junio C Hamano
  8 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2017-03-22 19:38 UTC (permalink / raw)
  To: git; +Cc: git, peff, Jeff Hostetler

git@jeffhostetler.com writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> This patch series is a performance optimization for
> lazy_init_name_hash() in name-hash.c on very large
> repositories.
>
> This change allows lazy_init_name_hash() to optionally
> use multiple threads when building the the_index.dir_hash
> and the_index.name_hash hashmaps.  The original code path
> has been preserved and is used when the repo is small or
> the system does not have sufficient CPUs.
>
> A helper command (t/helper/test-lazy-init-name-hash) was
> created to demonstrate performance differences and validate
> output.  For example, use the '-p' option to compare both
> code paths on a large repo.
>
> During our testing on the Windows source tree (3.1M
> files, 500K folders, 450MB index), this change reduced
> the runtime of lazy_init_name_hash() from 1.4 to 0.27
> seconds.
>
> This patch series replaces my earlier
>      * jh/memihash-opt (2017-02-17) 5 commits
> patch series.

Ahh.  I was scratching my head trying to remember why some of these
look so familiar.  [PATCH v2 ...] would have helped.

Thank you for an update.

>
> Jeff Hostetler (6):
>   name-hash: specify initial size for istate.dir_hash table
>   hashmap: allow memihash computation to be continued
>   hashmap: Add disallow_rehash setting
>   name-hash: perf improvement for lazy_init_name_hash
>   name-hash: add test-lazy-init-name-hash
>   name-hash: add perf test for lazy_init_name_hash
>
>  Makefile                            |   1 +
>  cache.h                             |   1 +
>  hashmap.c                           |  29 ++-
>  hashmap.h                           |  25 ++
>  name-hash.c                         | 490 +++++++++++++++++++++++++++++++++++-
>  t/helper/test-lazy-init-name-hash.c | 264 +++++++++++++++++++
>  t/perf/p0004-lazy-init-name-hash.sh |  19 ++
>  7 files changed, 820 insertions(+), 9 deletions(-)
>  create mode 100644 t/helper/test-lazy-init-name-hash.c
>  create mode 100644 t/perf/p0004-lazy-init-name-hash.sh

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 19:38 ` Junio C Hamano
@ 2017-03-22 20:54   ` Junio C Hamano
  2017-03-22 21:04     ` Jeff Hostetler
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2017-03-22 20:54 UTC (permalink / raw)
  To: git; +Cc: git, peff, Jeff Hostetler

Junio C Hamano <gitster@pobox.com> writes:

>> This patch series replaces my earlier
>>      * jh/memihash-opt (2017-02-17) 5 commits
>> patch series.
>
> Ahh.  I was scratching my head trying to remember why some of these
> look so familiar.  [PATCH v2 ...] would have helped.
>
> Thank you for an update.

One notable difference I noticed since the previous round is that
this no longer adds precomputed hash to "struct cache_entry".  As
you are aiming to manage an index with a large number of entries,
this is a welcome change that makes sense.

$ make NO_PTHREADS=NoThanks name-hash.o
    CC name-hash.o
name-hash.c: In function 'lazy_init_name_hash':
name-hash.c:573:3: error: implicit declaration of function 'threaded_lazy_init_name_hash' [-Werror=implicit-function-declaration]
   threaded_lazy_init_name_hash(istate);
   ^
name-hash.c: In function 'test_lazy_init_name_hash':
name-hash.c:595:2: error: 'lazy_nr_dir_threads' undeclared (first use in this function)
  lazy_nr_dir_threads = 0;
  ^
name-hash.c:595:2: note: each undeclared identifier is reported only once for each function it appears in
name-hash.c:596:2: error: 'lazy_try_threaded' undeclared (first use in this function)
  lazy_try_threaded = try_threaded;
  ^
name-hash.c:601:1: error: control reaches end of non-void function [-Werror=return-type]
 }
 ^
cc1: all warnings being treated as errors
make: *** [name-hash.o] Error 1

still has to be addressed.  Perhaps squash pieces of these into
appropriate patches in the series?

 name-hash.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/name-hash.c b/name-hash.c
index a5826505e0..725ac22886 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -118,9 +118,16 @@ static int cache_entry_cmp(const struct cache_entry *ce1,
 	return remove ? !(ce1 == ce2) : 0;
 }
 
+static int lazy_try_threaded = 1;
+static int lazy_nr_dir_threads = 0;
 
 #ifdef NO_PTHREADS
 
+static void threaded_lazy_init_name_hash(struct index_state *istate)
+{
+	; /* nothing */
+}
+
 static int lookup_lazy_params(struct index_state *istate)
 {
 	return 0;
@@ -153,8 +160,6 @@ static int lookup_lazy_params(struct index_state *istate)
  */
 #define LAZY_MAX_MUTEX   (32)
 
-static int lazy_try_threaded = 1;
-static int lazy_nr_dir_threads = 0;
 static pthread_mutex_t *lazy_dir_mutex_array;
 
 /*

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 20:54   ` Junio C Hamano
@ 2017-03-22 21:04     ` Jeff Hostetler
  2017-03-22 21:29       ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Hostetler @ 2017-03-22 21:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, Jeff Hostetler



On 3/22/2017 4:54 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>>> This patch series replaces my earlier
>>>      * jh/memihash-opt (2017-02-17) 5 commits
>>> patch series.
>>
>> Ahh.  I was scratching my head trying to remember why some of these
>> look so familiar.  [PATCH v2 ...] would have helped.
>>
>> Thank you for an update.
>
> One notable difference I noticed since the previous round is that
> this no longer adds precomputed hash to "struct cache_entry".  As
> you are aiming to manage an index with a large number of entries,
> this is a welcome change that makes sense.

Yes, this completely isolates the changes inside the name-hash.c
code.  And it eliminates the need to update/invalidate the precomputed
hash values as entries are changed.

>
> $ make NO_PTHREADS=NoThanks name-hash.o
>     CC name-hash.o
> name-hash.c: In function 'lazy_init_name_hash':
> ...
> still has to be addressed.  Perhaps squash pieces of these into
> appropriate patches in the series?
> ...

Will do. I wasn't sure how you specified non-threaded builds.

Thanks,
Jeff



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 21:04     ` Jeff Hostetler
@ 2017-03-22 21:29       ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2017-03-22 21:29 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: git, peff, Jeff Hostetler

Jeff Hostetler <git@jeffhostetler.com> writes:

> On 3/22/2017 4:54 PM, Junio C Hamano wrote:
>>>
>>> Thank you for an update.
>>
>> One notable difference I noticed since the previous round is that
>> this no longer adds precomputed hash to "struct cache_entry".  As
>> you are aiming to manage an index with a large number of entries,
>> this is a welcome change that makes sense.
>
> Yes, this completely isolates the changes inside the name-hash.c
> code.  And it eliminates the need to update/invalidate the precomputed
> hash values as entries are changed.

Thanks.  Having a summary like that for differences since the last
round in the cover letter would help during the review, but this is
an example that lack of such a summary would be one way to make sure
that reviewer(s) are paying attention ;-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] thread lazy_init_name_hash
  2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
                   ` (7 preceding siblings ...)
  2017-03-22 19:38 ` Junio C Hamano
@ 2017-03-22 20:39 ` Junio C Hamano
  8 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2017-03-22 20:39 UTC (permalink / raw)
  To: git; +Cc: git, peff, Jeff Hostetler

Just FYI before I start reading each patch carefully...

Subject: [PATCH 2/6] hashmap: allow memihash computation to be continued
ERROR: trailing whitespace
#28: FILE: hashmap.c:56:
+ */ $

total: 1 errors, 0 warnings, 30 lines checked
------------------------------------------------
Subject: [PATCH 4/6] name-hash: perf improvement for lazy_init_name_hash
ERROR: space required after that ',' (ctx:VxV)
#57: FILE: name-hash.c:38:
+	return find_dir_entry__hash(istate, name, namelen, memihash(name,namelen));
 	                                                                ^

ERROR: do not initialise statics to 0 or NULL
#105: FILE: name-hash.c:157:
+static int lazy_nr_dir_threads = 0;

total: 2 errors, 0 warnings, 516 lines checked
------------------------------------------------
Subject: [PATCH 5/6] name-hash: add test-lazy-init-name-hash
ERROR: do not initialise statics to 0 or NULL
#64: FILE: t/helper/test-lazy-init-name-hash.c:4:
+static int single = 0;

ERROR: do not initialise statics to 0 or NULL
#65: FILE: t/helper/test-lazy-init-name-hash.c:5:
+static int multi = 0;

ERROR: do not initialise statics to 0 or NULL
#67: FILE: t/helper/test-lazy-init-name-hash.c:7:
+static int dump = 0;

ERROR: do not initialise statics to 0 or NULL
#68: FILE: t/helper/test-lazy-init-name-hash.c:8:
+static int perf = 0;

ERROR: do not initialise statics to 0 or NULL
#69: FILE: t/helper/test-lazy-init-name-hash.c:9:
+static int analyze = 0;

ERROR: do not initialise statics to 0 or NULL
#70: FILE: t/helper/test-lazy-init-name-hash.c:10:
+static int analyze_step = 0;

ERROR: trailing whitespace
#215: FILE: t/helper/test-lazy-init-name-hash.c:155:
+^I^I^Ielse $

total: 7 errors, 0 warnings, 278 lines checked
------------------------------------------------

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-03-22 21:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
2017-03-22 17:14 ` [PATCH 1/6] name-hash: specify initial size for istate.dir_hash table git
2017-03-22 17:14 ` [PATCH 2/6] hashmap: allow memihash computation to be continued git
2017-03-22 17:14 ` [PATCH 3/6] hashmap: Add disallow_rehash setting git
2017-03-22 17:14 ` [PATCH 4/6] name-hash: perf improvement for lazy_init_name_hash git
2017-03-22 17:14 ` [PATCH 5/6] name-hash: add test-lazy-init-name-hash git
2017-03-22 17:14 ` [PATCH 6/6] name-hash: add perf test for lazy_init_name_hash git
2017-03-22 18:02 ` [PATCH 0/6] thread lazy_init_name_hash Stefan Beller
2017-03-22 19:22   ` Jeff Hostetler
2017-03-22 19:38 ` Junio C Hamano
2017-03-22 20:54   ` Junio C Hamano
2017-03-22 21:04     ` Jeff Hostetler
2017-03-22 21:29       ` Junio C Hamano
2017-03-22 20:39 ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).