git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: mhagger@alum.mit.edu
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Jakub Narebski <jnareb@gmail.com>,
	Heiko Voigt <hvoigt@hvoigt.net>,
	Johan Herland <johan@herland.net>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH 2/4] ref_array: keep track of whether references are sorted
Date: Tue, 17 Jan 2012 06:50:32 +0100	[thread overview]
Message-ID: <1326779434-20106-3-git-send-email-mhagger@alum.mit.edu> (raw)
In-Reply-To: <1326779434-20106-1-git-send-email-mhagger@alum.mit.edu>

From: Michael Haggerty <mhagger@alum.mit.edu>

Keep track of how many entries in a ref_array are already sorted.  In
sort_ref_array(), only call qsort() if the dir contains unsorted
entries (i.e., if references have been appended to the end of the list
since the last call to sort_ref_array()).

Sort ref_arrays only when needed, namely in search_ref_array() and in
do_for_each_ref().  However, never sort the extra_refs, because
extra_refs can contain multiple entries with the same name.

This change is currently not useful, because entries are not added to
ref_arrays after they are created.  But in a moment they will be...

Implementation note: we could store a binary "sorted" value instead of
an integer, but storing the number of sorted entries leaves the way
open for a couple of possible future optimizations:

* In sort_ref_array(), sort *only* the unsorted entries, then merge
  them with the sorted entries.  This should be faster if most of the
  entries are already sorted.

* Teach search_ref_array() to do a binary search of any sorted
  entries, and if unsuccessful do a linear search of any unsorted
  entries.  This would avoid the need to sort the list every time that
  search_ref_array() is called, and (given some intelligence about how
  often to sort) could significantly improve the speed in certain
  hypothetical usage patterns.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c |   33 ++++++++++++++++++++++++++-------
 1 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/refs.c b/refs.c
index 6f436f1..268816f 100644
--- a/refs.c
+++ b/refs.c
@@ -17,6 +17,15 @@ struct ref_entry {
 
 struct ref_array {
 	int nr, alloc;
+
+	/*
+	 * Entries with index 0 <= i < sorted are sorted by name.  New
+	 * entries are appended to the list unsorted, and are sorted
+	 * only when required; thus we avoid the need to sort the list
+	 * after the addition of every reference.
+	 */
+	int sorted;
+
 	struct ref_entry **refs;
 };
 
@@ -105,12 +114,18 @@ static int is_dup_ref(const struct ref_entry *ref1, const struct ref_entry *ref2
 	}
 }
 
+/*
+ * Sort the entries in array (if they are not already sorted).
+ */
 static void sort_ref_array(struct ref_array *array)
 {
 	int i, j;
 
-	/* Nothing to sort unless there are at least two entries */
-	if (array->nr < 2)
+	/*
+	 * This check also prevents passing a zero-length array to qsort(),
+	 * which is a problem on some platforms.
+	 */
+	if (array->sorted == array->nr)
 		return;
 
 	qsort(array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);
@@ -124,7 +139,7 @@ static void sort_ref_array(struct ref_array *array)
 		}
 		array->refs[++i] = array->refs[j];
 	}
-	array->nr = i + 1;
+	array->sorted = array->nr = i + 1;
 }
 
 static struct ref_entry *search_ref_array(struct ref_array *array, const char *refname)
@@ -137,7 +152,7 @@ static struct ref_entry *search_ref_array(struct ref_array *array, const char *r
 
 	if (!array->nr)
 		return NULL;
-
+	sort_ref_array(array);
 	len = strlen(refname) + 1;
 	e = xmalloc(sizeof(struct ref_entry) + len);
 	memcpy(e->name, refname, len);
@@ -168,6 +183,10 @@ static struct ref_cache {
 
 static struct ref_entry *current_ref;
 
+/*
+ * The extra_refs is never sorted, because it is allowed to contain entries
+ * with duplicate names.
+ */
 static struct ref_array extra_refs;
 
 static void clear_ref_array(struct ref_array *array)
@@ -176,7 +195,7 @@ static void clear_ref_array(struct ref_array *array)
 	for (i = 0; i < array->nr; i++)
 		free(array->refs[i]);
 	free(array->refs);
-	array->nr = array->alloc = 0;
+	array->sorted = array->nr = array->alloc = 0;
 	array->refs = NULL;
 }
 
@@ -268,7 +287,6 @@ static void read_packed_refs(FILE *f, struct ref_array *array)
 		    !get_sha1_hex(refline + 1, sha1))
 			hashcpy(last->peeled, sha1);
 	}
-	sort_ref_array(array);
 }
 
 void add_extra_ref(const char *refname, const unsigned char *sha1, int flag)
@@ -404,7 +422,6 @@ static struct ref_array *get_loose_refs(struct ref_cache *refs)
 {
 	if (!refs->did_loose) {
 		get_ref_dir(refs, "refs", &refs->loose);
-		sort_ref_array(&refs->loose);
 		refs->did_loose = 1;
 	}
 	return &refs->loose;
@@ -720,6 +737,8 @@ static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn
 	for (i = 0; i < extra->nr; i++)
 		retval = do_one_ref(base, fn, trim, flags, cb_data, extra->refs[i]);
 
+	sort_ref_array(packed);
+	sort_ref_array(loose);
 	while (p < packed->nr && l < loose->nr) {
 		struct ref_entry *entry;
 		int cmp = strcmp(packed->refs[p]->name, loose->refs[l]->name);
-- 
1.7.8.3

  parent reply	other threads:[~2012-01-17  5:51 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-17  5:50 [PATCH 0/4] Remove a user of extra_refs in clone mhagger
2012-01-17  5:50 ` [PATCH 1/4] pack_refs(): remove redundant check mhagger
2012-01-17  5:50 ` mhagger [this message]
2012-01-17  5:50 ` [PATCH 3/4] add_packed_ref(): new function in the refs API mhagger
2012-01-17  5:50 ` [PATCH 4/4] write_remote_refs(): create packed (rather than extra) refs mhagger
2012-01-17  6:35 ` [PATCH 0/4] Remove a user of extra_refs in clone Jeff King
2012-01-17  6:51 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1326779434-20106-3-git-send-email-mhagger@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=hvoigt@hvoigt.net \
    --cc=jnareb@gmail.com \
    --cc=johan@herland.net \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).