git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] generalizing sorted-array handling
@ 2010-11-29 22:57 Yann Dirson
  2010-11-29 22:57 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git

Changes from v3:

* this is a major rewrite, which allows the API to be used in some
  more places than the previous version.  It does not cover all
  binary-search uses, however, but it's not clear to me that trying to
  get any further would be a good use of my time (see last section
  below for details).

* based the API and implementation on the most widely used
  binary-search implementation found in the current tree, and on the
  must flexible API (the "return -lo - 1" one)

* given full control of generated function names to caller

* completely separated sorted-array typedecl from generic search/insert
  implementations, allowing several search/insert functions with
  different cmp/init callbacks

* provided several convenience wrappers around the generic
  search/insert implementations, depending on the needs of surrounding
  code

Notes on current API:

* The macro names are a bit heavy-weight.  Better ideas welcome.

* This API is very verbose, and I'm not happy with that aspect.

It could be made less so, eg. causing insert wrappers to auto-declare
the required generic insert func, and causing the latter auto-declare
the required generic search func.  That would cause duplication of the
generic search func in many cases.

The duplication problem would not be an issue if we add an automatic
call to declare_gen_sorted_insert() in declare_sorted_array_insert_*,
but we would loose the symetry with the search API.

Adding "simple" API variants that would call all the necessary stuff
would help code readability, but adding yet more entry points seems a
dubious approach.

Or is that just the "use cpp for templating" just inadequate here ?


TODO? list:

* take binsearch desc from sha1_lookup.c

* dealloc API

* read-cache.c::index_name_pos has widely-used API with 2 low-coupled
  cmp/init params: sorted-array could be generalized at the cost of
  using stdarg, but is it worth it ?

* pack-revindex.c::find_pack_revindex is a bit special and needs more
  thought

* cache-tree.c::subtree_pos and sha1_file::find_pack_entry_one too

* sha1_lookup.c stuff probably too special

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/6] Introduce sorted-array binary-search function.
  2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
@ 2010-11-29 22:57 ` Yann Dirson
  2010-11-29 22:57 ` [PATCH 2/6] Convert diffcore-rename's rename_dst to the new sorted-array API Yann Dirson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

We use a cpp-based template mechanism to declare the array and its
management data, as well as a search function.
Thanks to Jonathan Nieder for this design idea.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 Makefile       |    1 +
 sorted-array.h |  104 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 0 deletions(-)
 create mode 100644 sorted-array.h

diff --git a/Makefile b/Makefile
index 919ed2b..4b26976 100644
--- a/Makefile
+++ b/Makefile
@@ -539,6 +539,7 @@ LIB_H += run-command.h
 LIB_H += sha1-lookup.h
 LIB_H += sideband.h
 LIB_H += sigchain.h
+LIB_H += sorted-array.h
 LIB_H += strbuf.h
 LIB_H += string-list.h
 LIB_H += submodule.h
diff --git a/sorted-array.h b/sorted-array.h
new file mode 100644
index 0000000..20219e7
--- /dev/null
+++ b/sorted-array.h
@@ -0,0 +1,104 @@
+#ifndef SORTED_ARRAY_H_
+#define SORTED_ARRAY_H_
+
+#define declare_sorted_array(MAYBESTATIC,ELEMTYPE,LIST)			\
+MAYBESTATIC ELEMTYPE *LIST;						\
+MAYBESTATIC int LIST##_nr, LIST##_alloc;
+
+/* internal func: the implementation */
+#define declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE)	\
+MAYBESTATIC int FUNCNAME(						\
+	ELEMTYPE *list, int list_nr,					\
+	int(*cmp_func)(INITTYPE ref, ELEMTYPE *elem),			\
+	INITTYPE data)							\
+{									\
+	int lo, hi;							\
+									\
+	lo = 0;								\
+	hi = list_nr;							\
+	while (hi > lo) {						\
+		int mid = (hi + lo) >> 1;				\
+		int cmp = cmp_func(data, list + mid);			\
+		if (!cmp)						\
+			return mid;					\
+		if (cmp < 0)						\
+			hi = mid;					\
+		else							\
+			lo = mid + 1;					\
+	}								\
+	return -lo - 1;							\
+}
+
+#define declare_gen_sorted_insert(MAYBESTATIC,ELEMTYPE,FUNCNAME,SEARCHFUNC,INITTYPE) \
+MAYBESTATIC int FUNCNAME(						\
+	ELEMTYPE **list_p, int *list_nr_p, int *list_alloc_p,		\
+	int(*cmp_func)(INITTYPE ref, ELEMTYPE *elem),			\
+	void(*init_func)(ELEMTYPE *elem, INITTYPE init),		\
+	INITTYPE data)							\
+{									\
+	int pos = SEARCHFUNC(*list_p, *list_nr_p, cmp_func, data);	\
+	if (pos >= 0) 							\
+		return pos;						\
+	/* not found */							\
+	pos = -pos - 1;							\
+	/* insert to make it at "pos" */				\
+	if (*list_alloc_p <= *list_nr_p) {				\
+		(*list_alloc_p) = alloc_nr((*list_alloc_p));		\
+		*list_p = xrealloc(*list_p,				\
+				   (*list_alloc_p) * sizeof(**list_p)); \
+	}								\
+	(*list_nr_p)++;							\
+	if (pos < *list_nr_p)						\
+		memmove(*list_p + pos + 1, *list_p + pos,		\
+			(*list_nr_p - pos - 1) * sizeof(**list_p));	\
+	init_func(&(*list_p)[pos], data);				\
+	return -pos - 1;						\
+}
+
+/*
+ * Returns the position of the element if found pre-existing in the
+ * list, or if not found, -pos-1 where pos is where the element was
+ * inserted if insert_ok, or would have been inserted if !insert_ok.
+ */
+#define declare_sorted_array_search_check(MAYBESTATIC,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP) \
+MAYBESTATIC int FUNCNAME(INITTYPE data)					\
+{									\
+	return GENSEARCH(LIST, LIST##_nr, CMP, data);			\
+}
+
+#define declare_sorted_array_insert_check(MAYBESTATIC,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC int FUNCNAME(INITTYPE data)					\
+{									\
+	return GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc,		\
+			 CMP, INIT, data);				\
+}
+
+/*
+ * Insert, and just tell whether the searched element was pre-existing
+ * in the list or not.
+ */
+#define declare_sorted_array_insert_checkbool(MAYBESTATIC,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC int FUNCNAME(INITTYPE data)					\
+{									\
+	int idx = GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc,		\
+			    CMP, INIT, data);				\
+	if (idx < 0)							\
+		return 0;						\
+	return 1;							\
+}
+
+/*
+ * Search for element, inserting it if requested.  Returns address of
+ * the element found or newly-inserted, or NULL if not found and
+ * insertion was not requested.
+ */
+#define declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP) \
+MAYBESTATIC ELEMTYPE *FUNCNAME(INITTYPE data)				\
+{									\
+	int idx = GENSEARCH(LIST, LIST##_nr, CMP, data);		\
+	if (idx < 0)							\
+		return NULL;						\
+	return &(LIST[idx]);						\
+}
+
+#endif
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/6] Convert diffcore-rename's rename_dst to the new sorted-array API.
  2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
  2010-11-29 22:57 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
@ 2010-11-29 22:57 ` Yann Dirson
  2010-11-29 22:57 ` [PATCH 3/6] Convert diffcore-rename's rename_src " Yann Dirson
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

The sorted-array API splits search and insert into two separated
functions, which makes the caller code more clear.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 diffcore-rename.c |   66 ++++++++++++++++++++---------------------------------
 1 files changed, 25 insertions(+), 41 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index df41be5..a655017 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -5,52 +5,36 @@
 #include "diff.h"
 #include "diffcore.h"
 #include "hash.h"
+#include "sorted-array.h"
 
 /* Table of rename/copy destinations */
 
-static struct diff_rename_dst {
+struct diff_rename_dst {
 	struct diff_filespec *two;
 	struct diff_filepair *pair;
-} *rename_dst;
-static int rename_dst_nr, rename_dst_alloc;
+};
 
-static struct diff_rename_dst *locate_rename_dst(struct diff_filespec *two,
-						 int insert_ok)
+static int rename_dst_cmp(struct diff_filespec *ref_spec, struct diff_rename_dst *elem)
 {
-	int first, last;
-
-	first = 0;
-	last = rename_dst_nr;
-	while (last > first) {
-		int next = (last + first) >> 1;
-		struct diff_rename_dst *dst = &(rename_dst[next]);
-		int cmp = strcmp(two->path, dst->two->path);
-		if (!cmp)
-			return dst;
-		if (cmp < 0) {
-			last = next;
-			continue;
-		}
-		first = next+1;
-	}
-	/* not found */
-	if (!insert_ok)
-		return NULL;
-	/* insert to make it at "first" */
-	if (rename_dst_alloc <= rename_dst_nr) {
-		rename_dst_alloc = alloc_nr(rename_dst_alloc);
-		rename_dst = xrealloc(rename_dst,
-				      rename_dst_alloc * sizeof(*rename_dst));
-	}
-	rename_dst_nr++;
-	if (first < rename_dst_nr)
-		memmove(rename_dst + first + 1, rename_dst + first,
-			(rename_dst_nr - first - 1) * sizeof(*rename_dst));
-	rename_dst[first].two = alloc_filespec(two->path);
-	fill_filespec(rename_dst[first].two, two->sha1, two->mode);
-	rename_dst[first].pair = NULL;
-	return &(rename_dst[first]);
+	return strcmp(ref_spec->path, elem->two->path);
+}
+static void rename_dst_init(struct diff_rename_dst *elem, struct diff_filespec *ref_spec)
+{
+	elem->two = alloc_filespec(ref_spec->path);
+	fill_filespec(elem->two, ref_spec->sha1, ref_spec->mode);
+	elem->pair = NULL;
 }
+declare_sorted_array(static, struct diff_rename_dst, rename_dst);
+declare_gen_binsearch(static, struct diff_rename_dst, _locate_rename_dst,
+		      struct diff_filespec *);
+declare_sorted_array_search_elem(static, struct diff_rename_dst, locate_rename_dst,
+				 struct diff_filespec *, _locate_rename_dst,
+				 rename_dst, rename_dst_cmp);
+declare_gen_sorted_insert(static, struct diff_rename_dst, _register_rename_dst,
+			  _locate_rename_dst, struct diff_filespec *);
+declare_sorted_array_insert_checkbool(static, register_rename_dst, struct diff_filespec *,
+				      _register_rename_dst,
+				      rename_dst, rename_dst_cmp, rename_dst_init);
 
 /* Table of rename/copy src files */
 static struct diff_rename_src {
@@ -437,7 +421,7 @@ void diffcore_rename(struct diff_options *options)
 				 strcmp(options->single_follow, p->two->path))
 				continue; /* not interested */
 			else
-				locate_rename_dst(p->two, 1);
+				register_rename_dst(p->two);
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
 			/*
@@ -582,7 +566,7 @@ void diffcore_rename(struct diff_options *options)
 			 * not been turned into a rename/copy already.
 			 */
 			struct diff_rename_dst *dst =
-				locate_rename_dst(p->two, 0);
+				locate_rename_dst(p->two);
 			if (dst && dst->pair) {
 				diff_q(&outq, dst->pair);
 				pair_to_free = p;
@@ -613,7 +597,7 @@ void diffcore_rename(struct diff_options *options)
 			if (DIFF_PAIR_BROKEN(p)) {
 				/* broken delete */
 				struct diff_rename_dst *dst =
-					locate_rename_dst(p->one, 0);
+					locate_rename_dst(p->one);
 				if (dst && dst->pair)
 					/* counterpart is now rename/copy */
 					pair_to_free = p;
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/6] Convert diffcore-rename's rename_src to the new sorted-array API.
  2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
  2010-11-29 22:57 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
  2010-11-29 22:57 ` [PATCH 2/6] Convert diffcore-rename's rename_dst to the new sorted-array API Yann Dirson
@ 2010-11-29 22:57 ` Yann Dirson
  2010-11-29 22:57 ` [PATCH 4/6] Convert pack-objects.c " Yann Dirson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

There was no compelling reason to pass separately two members of a
single struct to the insert function.  That's a happy coincidence.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 diffcore-rename.c |   57 ++++++++++++++++++----------------------------------
 1 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index a655017..7e35a82 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -37,46 +37,29 @@ declare_sorted_array_insert_checkbool(static, register_rename_dst, struct diff_f
 				      rename_dst, rename_dst_cmp, rename_dst_init);
 
 /* Table of rename/copy src files */
-static struct diff_rename_src {
+
+struct diff_rename_src {
 	struct diff_filespec *one;
 	unsigned short score; /* to remember the break score */
-} *rename_src;
-static int rename_src_nr, rename_src_alloc;
+};
 
-static struct diff_rename_src *register_rename_src(struct diff_filespec *one,
-						   unsigned short score)
+static int rename_src_cmp(struct diff_filepair *ref_pair, struct diff_rename_src *elem)
 {
-	int first, last;
-
-	first = 0;
-	last = rename_src_nr;
-	while (last > first) {
-		int next = (last + first) >> 1;
-		struct diff_rename_src *src = &(rename_src[next]);
-		int cmp = strcmp(one->path, src->one->path);
-		if (!cmp)
-			return src;
-		if (cmp < 0) {
-			last = next;
-			continue;
-		}
-		first = next+1;
-	}
-
-	/* insert to make it at "first" */
-	if (rename_src_alloc <= rename_src_nr) {
-		rename_src_alloc = alloc_nr(rename_src_alloc);
-		rename_src = xrealloc(rename_src,
-				      rename_src_alloc * sizeof(*rename_src));
-	}
-	rename_src_nr++;
-	if (first < rename_src_nr)
-		memmove(rename_src + first + 1, rename_src + first,
-			(rename_src_nr - first - 1) * sizeof(*rename_src));
-	rename_src[first].one = one;
-	rename_src[first].score = score;
-	return &(rename_src[first]);
+	return strcmp(ref_pair->one->path, elem->one->path);
+}
+static void rename_src_init(struct diff_rename_src *elem, struct diff_filepair *ref_pair)
+{
+	elem->one = ref_pair->one;
+	elem->score = ref_pair->score;
 }
+declare_sorted_array(static, struct diff_rename_src, rename_src);
+declare_gen_binsearch(static, struct diff_rename_src, _locate_rename_src,
+		      struct diff_filepair *);
+declare_gen_sorted_insert(static, struct diff_rename_src, _register_rename_src,
+			  _locate_rename_src, struct diff_filepair *);
+declare_sorted_array_insert_checkbool(static, register_rename_src, struct diff_filepair *,
+				      _register_rename_src,
+				      rename_src, rename_src_cmp, rename_src_init);
 
 static int basename_same(struct diff_filespec *src, struct diff_filespec *dst)
 {
@@ -433,7 +416,7 @@ void diffcore_rename(struct diff_options *options)
 			 */
 			if (p->broken_pair && !p->score)
 				p->one->rename_used++;
-			register_rename_src(p->one, p->score);
+			register_rename_src(p);
 		}
 		else if (detect_rename == DIFF_DETECT_COPY) {
 			/*
@@ -441,7 +424,7 @@ void diffcore_rename(struct diff_options *options)
 			 * one, to indicate ourselves as a user.
 			 */
 			p->one->rename_used++;
-			register_rename_src(p->one, p->score);
+			register_rename_src(p);
 		}
 	}
 	if (rename_dst_nr == 0 || rename_src_nr == 0)
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/6] Convert pack-objects.c to the new sorted-array API.
  2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
                   ` (2 preceding siblings ...)
  2010-11-29 22:57 ` [PATCH 3/6] Convert diffcore-rename's rename_src " Yann Dirson
@ 2010-11-29 22:57 ` Yann Dirson
  2010-11-29 22:57 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson
  2010-11-29 22:57 ` [PATCH 6/6] [WIP] subvert sorted-array to replace binary-search in unpack-objects Yann Dirson
  5 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

In this file the "list size" variable was named in a non-standard way.
The new API forces to use a more common convention.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 builtin/pack-objects.c |   51 ++++++++++++++---------------------------------
 1 files changed, 15 insertions(+), 36 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f8eba53..c4af9f7 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -16,6 +16,7 @@
 #include "list-objects.h"
 #include "progress.h"
 #include "refs.h"
+#include "sorted-array.h"
 
 #ifndef NO_PTHREADS
 #include <pthread.h>
@@ -871,45 +872,23 @@ static void add_pbase_object(struct tree_desc *tree,
 	}
 }
 
-static unsigned *done_pbase_paths;
-static int done_pbase_paths_num;
-static int done_pbase_paths_alloc;
-static int done_pbase_path_pos(unsigned hash)
+static int unsigned_cmp(unsigned ref, unsigned *elem)
 {
-	int lo = 0;
-	int hi = done_pbase_paths_num;
-	while (lo < hi) {
-		int mi = (hi + lo) / 2;
-		if (done_pbase_paths[mi] == hash)
-			return mi;
-		if (done_pbase_paths[mi] < hash)
-			hi = mi;
-		else
-			lo = mi + 1;
-	}
-	return -lo-1;
+	if (ref == *elem)
+		return 0;
+	if (ref < *elem)
+		return -1;
+	return 1;
 }
-
-static int check_pbase_path(unsigned hash)
+static void unsigned_init(unsigned *elem, unsigned ref)
 {
-	int pos = (!done_pbase_paths) ? -1 : done_pbase_path_pos(hash);
-	if (0 <= pos)
-		return 1;
-	pos = -pos - 1;
-	if (done_pbase_paths_alloc <= done_pbase_paths_num) {
-		done_pbase_paths_alloc = alloc_nr(done_pbase_paths_alloc);
-		done_pbase_paths = xrealloc(done_pbase_paths,
-					    done_pbase_paths_alloc *
-					    sizeof(unsigned));
-	}
-	done_pbase_paths_num++;
-	if (pos < done_pbase_paths_num)
-		memmove(done_pbase_paths + pos + 1,
-			done_pbase_paths + pos,
-			(done_pbase_paths_num - pos - 1) * sizeof(unsigned));
-	done_pbase_paths[pos] = hash;
-	return 0;
+	*elem = ref;
 }
+declare_sorted_array(static, unsigned, done_pbase_paths);
+declare_gen_binsearch(static, unsigned, done_pbase_path_pos, unsigned);
+declare_gen_sorted_insert(static, unsigned, _check_pbase_path, done_pbase_path_pos, unsigned);
+declare_sorted_array_insert_checkbool(static, check_pbase_path, unsigned, _check_pbase_path,
+				      done_pbase_paths, unsigned_cmp, unsigned_init);
 
 static void add_preferred_base_object(const char *name)
 {
@@ -987,7 +966,7 @@ static void cleanup_preferred_base(void)
 
 	free(done_pbase_paths);
 	done_pbase_paths = NULL;
-	done_pbase_paths_num = done_pbase_paths_alloc = 0;
+	done_pbase_paths_nr = done_pbase_paths_alloc = 0;
 }
 
 static void check_object(struct object_entry *entry)
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/6] Use sorted-array API for commit.c's commit_graft.
  2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
                   ` (3 preceding siblings ...)
  2010-11-29 22:57 ` [PATCH 4/6] Convert pack-objects.c " Yann Dirson
@ 2010-11-29 22:57 ` Yann Dirson
  2010-11-29 22:57 ` [PATCH 6/6] [WIP] subvert sorted-array to replace binary-search in unpack-objects Yann Dirson
  5 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

Factorizing code fixes off-by-one error in the duplicated code (caused
mostly harmless anticipated growing of the array).

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 commit.c |   62 +++++++++++++++++++++++++++-----------------------------------
 1 files changed, 27 insertions(+), 35 deletions(-)

diff --git a/commit.c b/commit.c
index 0094ec1..5b9a554 100644
--- a/commit.c
+++ b/commit.c
@@ -6,6 +6,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "notes.h"
+#include "sorted-array.h"
 
 int save_commit_buffer = 1;
 
@@ -76,33 +77,37 @@ static unsigned long parse_commit_date(const char *buf, const char *tail)
 	return strtoul(dateptr, NULL, 10);
 }
 
-static struct commit_graft **commit_graft;
-static int commit_graft_alloc, commit_graft_nr;
-
-static int commit_graft_pos(const unsigned char *sha1)
+static int commit_graft_cmp(const unsigned char *ref_sha1, struct commit_graft **elem)
 {
-	int lo, hi;
-	lo = 0;
-	hi = commit_graft_nr;
-	while (lo < hi) {
-		int mi = (lo + hi) / 2;
-		struct commit_graft *graft = commit_graft[mi];
-		int cmp = hashcmp(sha1, graft->sha1);
-		if (!cmp)
-			return mi;
-		if (cmp < 0)
-			hi = mi;
-		else
-			lo = mi + 1;
-	}
-	return -lo - 1;
+	return hashcmp(ref_sha1, (*elem)->sha1);
 }
+declare_sorted_array(static, struct commit_graft *, commit_graft);
+declare_gen_binsearch(static, struct commit_graft *, _commit_graft_pos,
+		      const unsigned char *);
+declare_sorted_array_search_check(static, commit_graft_pos, const unsigned char *,
+				  _commit_graft_pos, commit_graft, commit_graft_cmp);
 
+// FIXME: do we want to/can we remove INITTYPE from gen_binsearch ?
+static int commit_graft_cmp2(struct commit_graft *ref_graft, struct commit_graft **elem)
+{
+	return commit_graft_cmp(ref_graft->sha1, elem);
+}
+declare_gen_binsearch(static, struct commit_graft *, _commit_graft_pos2,
+		      struct commit_graft *);
+static void commit_graft_init(struct commit_graft **elem, struct commit_graft *ref_graft)
+{
+	*elem = ref_graft;
+}
+declare_gen_sorted_insert(static, struct commit_graft *, _register_commit_graft0,
+			  _commit_graft_pos2, struct commit_graft *)
+declare_sorted_array_insert_check(static, register_commit_graft0, struct commit_graft *,
+				  _register_commit_graft0, commit_graft,
+				  commit_graft_cmp2, commit_graft_init);
 int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 {
-	int pos = commit_graft_pos(graft->sha1);
+	int pos = register_commit_graft0(graft);
 
-	if (0 <= pos) {
+	if (pos >= 0) {
 		if (ignore_dups)
 			free(graft);
 		else {
@@ -111,19 +116,6 @@ int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 		}
 		return 1;
 	}
-	pos = -pos - 1;
-	if (commit_graft_alloc <= ++commit_graft_nr) {
-		commit_graft_alloc = alloc_nr(commit_graft_alloc);
-		commit_graft = xrealloc(commit_graft,
-					sizeof(*commit_graft) *
-					commit_graft_alloc);
-	}
-	if (pos < commit_graft_nr)
-		memmove(commit_graft + pos + 1,
-			commit_graft + pos,
-			(commit_graft_nr - pos - 1) *
-			sizeof(*commit_graft));
-	commit_graft[pos] = graft;
 	return 0;
 }
 
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 6/6] [WIP] subvert sorted-array to replace binary-search in unpack-objects.
  2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
                   ` (4 preceding siblings ...)
  2010-11-29 22:57 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson
@ 2010-11-29 22:57 ` Yann Dirson
  5 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-11-29 22:57 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 builtin/unpack-objects.c |   41 ++++++++++++++++++++++++++---------------
 1 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index f63973c..b0c15e6 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -11,6 +11,7 @@
 #include "progress.h"
 #include "decorate.h"
 #include "fsck.h"
+#include "sorted-array.h"
 
 static int dry_run, quiet, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict] < pack-file";
@@ -157,7 +158,25 @@ struct obj_info {
 #define FLAG_OPEN (1u<<20)
 #define FLAG_WRITTEN (1u<<21)
 
-static struct obj_info *obj_list;
+/*
+ * FIXME: obj_info is a sorted array, but we read it as a whole, we
+ * don't need insertion features.  This allows us to abuse unused
+ * obj_info_nr later as a means of specifying an upper bound for
+ * binary search.  obj_info_alloc shall be eliminated by the compiler
+ * as unused.
+ */
+static int obj_info_cmp(off_t ref, struct obj_info *elem)
+{
+	if (ref == elem->offset)
+		return 0;
+	if (ref < elem->offset)
+		return -1;
+	return 1;
+}
+declare_sorted_array(static, struct obj_info, obj_list);
+declare_gen_binsearch(static, struct obj_info, _obj_list_check, off_t);
+declare_sorted_array_search_check(static, obj_list_check, off_t, _obj_list_check,
+				  obj_list, obj_info_cmp);
 static unsigned nr_objects;
 
 /*
@@ -356,7 +375,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 		unsigned base_found = 0;
 		unsigned char *pack, c;
 		off_t base_offset;
-		unsigned lo, mid, hi;
+		int pos;
 
 		pack = fill(1);
 		c = *pack;
@@ -380,19 +399,11 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 			free(delta_data);
 			return;
 		}
-		lo = 0;
-		hi = nr;
-		while (lo < hi) {
-			mid = (lo + hi)/2;
-			if (base_offset < obj_list[mid].offset) {
-				hi = mid;
-			} else if (base_offset > obj_list[mid].offset) {
-				lo = mid + 1;
-			} else {
-				hashcpy(base_sha1, obj_list[mid].sha1);
-				base_found = !is_null_sha1(base_sha1);
-				break;
-			}
+		obj_list_nr = nr; /* kludge to bound the search */
+		pos = obj_list_check(base_offset);
+		if (pos >= 0) {
+			hashcpy(base_sha1, obj_list[pos].sha1);
+			base_found = !is_null_sha1(base_sha1);
 		}
 		if (!base_found) {
 			/*
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/6] Use sorted-array API for commit.c's commit_graft.
  2010-12-05 10:34 [PATCH v5] generalizing sorted-array handling Yann Dirson
@ 2010-12-05 10:34 ` Yann Dirson
  0 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-12-05 10:34 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

Factorizing code fixes off-by-one error in the duplicated code (caused
mostly harmless anticipated growing of the array).

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 commit.c |   62 +++++++++++++++++++++++++++-----------------------------------
 1 files changed, 27 insertions(+), 35 deletions(-)

diff --git a/commit.c b/commit.c
index 2d9265d..b7aeee4 100644
--- a/commit.c
+++ b/commit.c
@@ -6,6 +6,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "notes.h"
+#include "sorted-array.h"
 
 int save_commit_buffer = 1;
 
@@ -76,33 +77,37 @@ static unsigned long parse_commit_date(const char *buf, const char *tail)
 	return strtoul(dateptr, NULL, 10);
 }
 
-static struct commit_graft **commit_graft;
-static int commit_graft_alloc, commit_graft_nr;
-
-static int commit_graft_pos(const unsigned char *sha1)
+static int commit_graft_cmp(const unsigned char *ref_sha1, struct commit_graft **elem)
 {
-	int lo, hi;
-	lo = 0;
-	hi = commit_graft_nr;
-	while (lo < hi) {
-		int mi = (lo + hi) / 2;
-		struct commit_graft *graft = commit_graft[mi];
-		int cmp = hashcmp(sha1, graft->sha1);
-		if (!cmp)
-			return mi;
-		if (cmp < 0)
-			hi = mi;
-		else
-			lo = mi + 1;
-	}
-	return -lo - 1;
+	return hashcmp(ref_sha1, (*elem)->sha1);
 }
+declare_sorted_array(static, struct commit_graft *, commit_graft);
+declare_gen_binsearch(static, struct commit_graft *, _commit_graft_pos,
+		      const unsigned char *);
+declare_sorted_array_search_check(static, commit_graft_pos, const unsigned char *,
+				  _commit_graft_pos, commit_graft, commit_graft_cmp);
 
+// FIXME: do we want to/can we remove INITTYPE from gen_binsearch ?
+static int commit_graft_cmp2(struct commit_graft *ref_graft, struct commit_graft **elem)
+{
+	return commit_graft_cmp(ref_graft->sha1, elem);
+}
+declare_gen_binsearch(static, struct commit_graft *, _commit_graft_pos2,
+		      struct commit_graft *);
+static void commit_graft_init(struct commit_graft **elem, struct commit_graft *ref_graft)
+{
+	*elem = ref_graft;
+}
+declare_gen_sorted_insert(static, struct commit_graft *, _register_commit_graft0,
+			  _commit_graft_pos2, struct commit_graft *)
+declare_sorted_array_insert_check(static, register_commit_graft0, struct commit_graft *,
+				  _register_commit_graft0, commit_graft,
+				  commit_graft_cmp2, commit_graft_init);
 int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 {
-	int pos = commit_graft_pos(graft->sha1);
+	int pos = register_commit_graft0(graft);
 
-	if (0 <= pos) {
+	if (pos >= 0) {
 		if (ignore_dups)
 			free(graft);
 		else {
@@ -111,19 +116,6 @@ int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 		}
 		return 1;
 	}
-	pos = -pos - 1;
-	if (commit_graft_alloc <= ++commit_graft_nr) {
-		commit_graft_alloc = alloc_nr(commit_graft_alloc);
-		commit_graft = xrealloc(commit_graft,
-					sizeof(*commit_graft) *
-					commit_graft_alloc);
-	}
-	if (pos < commit_graft_nr)
-		memmove(commit_graft + pos + 1,
-			commit_graft + pos,
-			(commit_graft_nr - pos - 1) *
-			sizeof(*commit_graft));
-	commit_graft[pos] = graft;
 	return 0;
 }
 
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5/6] Use sorted-array API for commit.c's commit_graft.
  2010-12-08 22:51 [PATCH v6] generalizing sorted-array handling Yann Dirson
@ 2010-12-08 22:51 ` Yann Dirson
  0 siblings, 0 replies; 9+ messages in thread
From: Yann Dirson @ 2010-12-08 22:51 UTC (permalink / raw)
  To: git; +Cc: Yann Dirson

Factorizing code fixes off-by-one error in the duplicated code (caused
mostly harmless anticipated growing of the array).

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 commit.c |   55 +++++++++++++++++++++----------------------------------
 1 files changed, 21 insertions(+), 34 deletions(-)

diff --git a/commit.c b/commit.c
index b21335e..786bb7a 100644
--- a/commit.c
+++ b/commit.c
@@ -6,6 +6,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "notes.h"
+#include "sorted-array.h"
 
 int save_commit_buffer = 1;
 
@@ -89,33 +90,32 @@ static unsigned long parse_commit_date(const char *buf, const char *tail)
 	return strtoul(dateptr, NULL, 10);
 }
 
-static struct commit_graft **commit_graft;
-static int commit_graft_alloc, commit_graft_nr;
-
-static int commit_graft_pos(const unsigned char *sha1)
+static int commit_graft_cmp(const unsigned char *ref_sha1, struct commit_graft **elem)
 {
-	int lo, hi;
-	lo = 0;
-	hi = commit_graft_nr;
-	while (lo < hi) {
-		int mi = (lo + hi) / 2;
-		struct commit_graft *graft = commit_graft[mi];
-		int cmp = hashcmp(sha1, graft->sha1);
-		if (!cmp)
-			return mi;
-		if (cmp < 0)
-			hi = mi;
-		else
-			lo = mi + 1;
-	}
-	return -lo - 1;
+	return hashcmp(ref_sha1, (*elem)->sha1);
 }
+declare_sorted_array(static, struct commit_graft *, commit_graft);
+declare_sorted_array_search_check(static, struct commit_graft *, commit_graft_pos,
+				  const unsigned char *,
+				  commit_graft, commit_graft_cmp);
 
+// FIXME: do we want to/can we remove INITTYPE from gen_binsearch ?
+static int commit_graft_cmp2(struct commit_graft *ref_graft, struct commit_graft **elem)
+{
+	return commit_graft_cmp(ref_graft->sha1, elem);
+}
+static void commit_graft_init(struct commit_graft **elem, struct commit_graft *ref_graft)
+{
+	*elem = ref_graft;
+}
+declare_sorted_array_insertonly_check(static, struct commit_graft *, register_commit_graft0,
+				      struct commit_graft *,
+				      commit_graft, commit_graft_cmp2, commit_graft_init);
 int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 {
-	int pos = commit_graft_pos(graft->sha1);
+	int pos = register_commit_graft0(graft);
 
-	if (0 <= pos) {
+	if (pos >= 0) {
 		if (ignore_dups)
 			free(graft);
 		else {
@@ -124,19 +124,6 @@ int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 		}
 		return 1;
 	}
-	pos = -pos - 1;
-	if (commit_graft_alloc <= ++commit_graft_nr) {
-		commit_graft_alloc = alloc_nr(commit_graft_alloc);
-		commit_graft = xrealloc(commit_graft,
-					sizeof(*commit_graft) *
-					commit_graft_alloc);
-	}
-	if (pos < commit_graft_nr)
-		memmove(commit_graft + pos + 1,
-			commit_graft + pos,
-			(commit_graft_nr - pos - 1) *
-			sizeof(*commit_graft));
-	commit_graft[pos] = graft;
 	return 0;
 }
 
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-12-08 22:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
2010-11-29 22:57 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
2010-11-29 22:57 ` [PATCH 2/6] Convert diffcore-rename's rename_dst to the new sorted-array API Yann Dirson
2010-11-29 22:57 ` [PATCH 3/6] Convert diffcore-rename's rename_src " Yann Dirson
2010-11-29 22:57 ` [PATCH 4/6] Convert pack-objects.c " Yann Dirson
2010-11-29 22:57 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson
2010-11-29 22:57 ` [PATCH 6/6] [WIP] subvert sorted-array to replace binary-search in unpack-objects Yann Dirson
  -- strict thread matches above, loose matches on Subject: below --
2010-12-05 10:34 [PATCH v5] generalizing sorted-array handling Yann Dirson
2010-12-05 10:34 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson
2010-12-08 22:51 [PATCH v6] generalizing sorted-array handling Yann Dirson
2010-12-08 22:51 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).