From: mhagger@alum.mit.edu
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Drew Northup <drew.northup@maine.edu>,
Jakub Narebski <jnareb@gmail.com>,
Heiko Voigt <hvoigt@hvoigt.net>,
Johan Herland <johan@herland.net>,
Julian Phillips <julian@quantumfyre.co.uk>,
Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH 25/28] refs: read loose references lazily
Date: Fri, 28 Oct 2011 14:28:38 +0200
Message-ID: <1319804921-17545-26-git-send-email-mhagger@alum.mit.edu>
In-Reply-To: <1319804921-17545-1-git-send-email-mhagger@alum.mit.edu>
From: Michael Haggerty <mhagger@alum.mit.edu>

Instead of reading the entire hierarchy of loose references the first
time any are needed, read them on demand, one directory at a time.

Use a new ref_entry flag value REF_DIR_INCOMPLETE to mark an entry
that represents a directory of loose references that hasn't been read
yet. Whenever any entries from such a directory are needed, read all
of the loose references from that directory (but not from its
subdirectories, which get REF_DIR_INCOMPLETE stubs of their own).

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
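Illustration only, not part of the patch: REF_DIR changes from a
single bit (0x10) to a two-bit mask (0x30), so "is this a directory?"
and "has this directory been read?" are two different tests on the
same field. A minimal sketch of those tests follows. The flag values
are copied from the diff below; the helper names flag_is_dir() and
flag_needs_read() are hypothetical and do not exist in refs.c.

```c
#include <assert.h>

/* Flag values as defined by this patch. */
#define REF_KNOWS_PEELED   0x08
#define REF_DIR            0x30	/* mask: any bit set means "directory" */
#define REF_DIR_COMPLETE   0x10	/* directory, already read */
#define REF_DIR_INCOMPLETE 0x20	/* directory of loose refs, not yet read */

/* Hypothetical helper: does this flag describe a directory entry? */
static int flag_is_dir(int flag)
{
	return (flag & REF_DIR) != 0;
}

/* Hypothetical helper: must the directory be read before it is used? */
static int flag_needs_read(int flag)
{
	return (flag & REF_DIR) == REF_DIR_INCOMPLETE;
}
```

Existing checks of the form assert(direntry->flag & REF_DIR) keep
working unchanged, because either directory state leaves at least one
bit of the mask set.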
refs.c | 112 ++++++++++++++++++++++++++++++++++++++++++++++++++--------------
1 files changed, 88 insertions(+), 24 deletions(-)
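Illustration only, not part of the patch: the lazy-read protocol
described above can be modelled without touching the filesystem. The
sketch below is a toy under stated assumptions: struct toy_entry, the
toy_fs[] table and every toy_*() function are hypothetical stand-ins
(for struct ref_entry, the on-disk refs/ tree, read_loose_refs() and
search_ref_dir() respectively), and a fixed-size child array replaces
the growable one. Only the flag values and the "read on first use,
then mark complete" logic are taken from the diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define REF_DIR            0x30
#define REF_DIR_COMPLETE   0x10
#define REF_DIR_INCOMPLETE 0x20

/* Hypothetical stand-in for struct ref_entry; directory names end in '/'. */
struct toy_entry {
	int flag;
	const char *name;
	struct toy_entry *children[8];
	int nr;
};

static int toy_reads;	/* number of simulated directory scans */

/* Hypothetical stand-in for the loose reference files on disk. */
static const char *toy_fs[] = {
	"refs/heads/", "refs/tags/",
	"refs/heads/master", "refs/heads/next",
	"refs/tags/v1.0",
};

static struct toy_entry *toy_new(const char *name, int flag)
{
	struct toy_entry *e = calloc(1, sizeof(*e));
	if (!e)
		abort();
	e->name = name;
	e->flag = flag;
	return e;
}

/* Is path an immediate child (file or subdirectory) of dirname? */
static int toy_is_child(const char *dirname, const char *path)
{
	size_t dlen = strlen(dirname), plen = strlen(path);
	const char *slash;

	if (plen <= dlen || strncmp(dirname, path, dlen))
		return 0;
	slash = strchr(path + dlen, '/');
	return !slash || slash[1] == '\0';
}

/* Read one directory level on first use; later calls are no-ops. */
static void toy_read_dir(struct toy_entry *dir)
{
	size_t i;

	if ((dir->flag & REF_DIR) != REF_DIR_INCOMPLETE)
		return;
	toy_reads++;
	for (i = 0; i < sizeof(toy_fs) / sizeof(*toy_fs); i++) {
		const char *path = toy_fs[i];
		int is_dir;

		if (!toy_is_child(dir->name, path))
			continue;
		is_dir = path[strlen(path) - 1] == '/';
		/* Subdirectories become unread stubs; references get flag 0. */
		dir->children[dir->nr++] =
			toy_new(path, is_dir ? REF_DIR_INCOMPLETE : 0);
	}
	dir->flag = REF_DIR_COMPLETE;
}

/* Look up an entry by full name, reading the directory if necessary. */
static struct toy_entry *toy_search(struct toy_entry *dir, const char *name)
{
	int i;

	toy_read_dir(dir);
	for (i = 0; i < dir->nr; i++)
		if (!strcmp(dir->children[i]->name, name))
			return dir->children[i];
	return NULL;
}
```

In this model, looking up "refs/heads/master" scans "refs/" and
"refs/heads/" once each, while "refs/tags/" stays a
REF_DIR_INCOMPLETE stub until something asks for an entry inside it.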
diff --git a/refs.c b/refs.c
index f3910de..88ef9dd 100644
--- a/refs.c
+++ b/refs.c
@@ -138,6 +138,12 @@ int check_refname_format(const char *refname, int flags)
struct ref_entry;
+/*
+ * Information used (along with the information in ref_entry) to
+ * describe a single cached reference. This data structure only
+ * occurs embedded in a union in struct ref_entry, and only when
+ * (ref_entry->flag & REF_DIR) is zero.
+ */
struct ref_value {
unsigned char sha1[20];
unsigned char peeled[20];
@@ -145,6 +151,34 @@ struct ref_value {
struct ref_cache;
+/*
+ * Information used (along with the information in ref_entry) to
+ * describe a level in the hierarchy of references. This data
+ * structure only occurs embedded in a union in struct ref_entry, and
+ * only when (ref_entry->flag & REF_DIR) is nonzero. In that case,
+ * (ref_entry->flag & REF_DIR) takes one of the following values:
+ *
+ * REF_DIR_COMPLETE -- a directory of loose or packed references,
+ * already read.
+ *
+ * REF_DIR_INCOMPLETE -- a directory of loose references that
+ * hasn't been read yet (nor have any of its subdirectories).
+ *
+ * Entries within a directory are stored within a growable array of
+ * pointers to ref_entries (entries, nr, alloc). Entries 0 <= i <
+ * sorted are sorted by their component name in strcmp() order and the
+ * remaining entries are unsorted.
+ *
+ * Loose references are read lazily, one directory at a time. When a
+ * directory of loose references is read, then all of the references
+ * in that directory are stored, and REF_DIR_INCOMPLETE stubs are
+ * created for any subdirectories, but the subdirectories themselves
+ * are not read. The reading is triggered either by search_ref_dir()
+ * (called when single references are added or interrogated), by
+ * sort_ref_dir(), or by iteration over a subdirectory of references
+ * using one of the for_each_ref*() functions (which calls
+ * sort_ref_dir() for each subdirectory).
+ */
struct ref_dir {
int nr, alloc;
@@ -162,19 +196,33 @@ struct ref_dir {
/* ISSYMREF=0x01, ISPACKED=0x02, and ISBROKEN=0x04 are public interfaces */
#define REF_KNOWS_PEELED 0x08
-#define REF_DIR 0x10
+
+/* If any of these bits are set, the entry represents a directory: */
+#define REF_DIR 0x30
+
+/* A directory that has already been fully read. */
+#define REF_DIR_COMPLETE 0x10
+
+/* A directory of loose references that has not yet been fully read. */
+#define REF_DIR_INCOMPLETE 0x20
/*
* A ref_entry represents either a reference or a "subdirectory" of
- * references. Each directory in the reference namespace is
- * represented by a ref_entry with (flags & REF_DIR) set and
- * containing a subdir member that holds the entries in that
- * directory. References are represented by a ref_entry with (flags &
- * REF_DIR) unset and a value member that describes the reference's
- * value. The flag member is at the ref_entry level, but it is also
- * needed to interpret the contents of the value field (in other
- * words, a ref_value object is not very much use without the
- * enclosing ref_entry).
+ * references.
+ *
+ * Each directory in the reference namespace is represented by a
+ * ref_entry with (flag & REF_DIR) set and containing a subdir member
+ * that holds the entries in that directory that have been read so
+ * far. If (flag & REF_DIR) == REF_DIR_INCOMPLETE, then the
+ * directory and its subdirectories haven't been read yet.
+ * REF_DIR_INCOMPLETE is only used for loose references.
+ *
+ * References are represented by a ref_entry with (flag & REF_DIR) ==
+ * 0 and a value member that describes the reference's value. The
+ * flag member is at the ref_entry level, but it is also needed to
+ * interpret the contents of the value field (in other words, a
+ * ref_value object is not of much use without the enclosing
+ * ref_entry).
*
* Reference names cannot end with slash and directories' names are
* always stored with a trailing slash (except for the top-level
@@ -264,13 +312,15 @@ static void clear_ref_dir(struct ref_dir *dir)
dir->entries = NULL;
}
+static void read_loose_refs(struct ref_entry *direntry);
+
/*
* Create a struct ref_entry object for the specified dirname.
* dirname is the name of the directory with a trailing slash (e.g.,
* "refs/heads/") or "" for the top-level directory.
*/
static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
- const char *dirname)
+ const char *dirname, int flag)
{
struct ref_entry *direntry;
if (*dirname) {
@@ -283,7 +333,7 @@ static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
direntry = xcalloc(1, sizeof(struct ref_entry) + 1);
direntry->name[0] = '\0';
}
- direntry->flag = REF_DIR;
+ direntry->flag = flag;
direntry->u.subdir.ref_cache = ref_cache;
return direntry;
}
@@ -308,6 +358,7 @@ static struct ref_entry *search_ref_dir(struct ref_entry *direntry, const char *
struct ref_dir *dir;
assert(direntry->flag & REF_DIR);
+ read_loose_refs(direntry);
dir = &direntry->u.subdir;
if (refname == NULL || !dir->nr)
return NULL;
@@ -364,8 +415,14 @@ static struct ref_entry *find_containing_direntry(struct ref_entry *direntry,
direntry = NULL;
break;
}
+ /*
+ * If search_ref_dir() above didn't make the
+ * entry spring into existence, then this must
+ * not be an unread loose reference tree, so
+ * the correct flag is REF_DIR_COMPLETE.
+ */
entry = create_dir_entry(direntry->u.subdir.ref_cache,
- refname_copy);
+ refname_copy, REF_DIR_COMPLETE);
add_entry(direntry, entry);
}
slash[1] = tmp;
@@ -441,6 +498,7 @@ static void sort_ref_dir(struct ref_entry *direntry)
struct ref_entry *last = NULL;
struct ref_dir *dir;
assert(direntry->flag & REF_DIR);
+ read_loose_refs(direntry);
dir = &direntry->u.subdir;
if (dir->sorted == dir->nr)
return; /* This directory is already sorted and de-duped */
@@ -491,8 +549,8 @@ static int do_for_each_ref_in_dir(struct ref_entry *direntry, int offset,
int i;
struct ref_dir *dir;
assert(direntry->flag & REF_DIR);
- dir = &direntry->u.subdir;
sort_ref_dir(direntry);
+ dir = &direntry->u.subdir;
for (i = offset; i < dir->nr; i++) {
struct ref_entry *entry = dir->entries[i];
int retval;
@@ -519,10 +577,10 @@ static int do_for_each_ref_in_dirs(struct ref_entry *direntry1,
assert(direntry1->flag & REF_DIR);
assert(direntry2->flag & REF_DIR);
- dir1 = &direntry1->u.subdir;
- dir2 = &direntry2->u.subdir;
sort_ref_dir(direntry1);
sort_ref_dir(direntry2);
+ dir1 = &direntry1->u.subdir;
+ dir2 = &direntry2->u.subdir;
while (1) {
struct ref_entry *e1, *e2, *entry;
int cmp;
@@ -782,7 +840,7 @@ static void read_packed_refs(FILE *f, struct ref_entry *direntry)
void add_extra_ref(const char *refname, const unsigned char *sha1, int flag)
{
if (!extra_refs)
- extra_refs = create_dir_entry(NULL, "");
+ extra_refs = create_dir_entry(NULL, "", REF_DIR_COMPLETE);
add_ref(extra_refs, create_ref_entry(refname, sha1, flag));
}
@@ -800,7 +858,7 @@ static struct ref_entry *get_packed_refs(struct ref_cache *refs)
const char *packed_refs_file;
FILE *f;
- refs->packed = create_dir_entry(refs, "");
+ refs->packed = create_dir_entry(refs, "", REF_DIR_COMPLETE);
if (*refs->name)
packed_refs_file = git_path_submodule(refs->name, "packed-refs");
else
@@ -822,11 +880,14 @@ static void read_loose_refs(struct ref_entry *direntry)
DIR *d;
char *path;
char *dirname = direntry->name;
- int dirnamelen = strlen(dirname);
+ int dirnamelen;
int pathlen;
struct ref_cache *refs;
assert(direntry->flag & REF_DIR);
+ if ((direntry->flag & REF_DIR) != REF_DIR_INCOMPLETE)
+ return;
+ dirnamelen = strlen(dirname);
assert(dirnamelen && direntry->name[dirnamelen - 1] == '/');
refs = direntry->u.subdir.ref_cache;
if (*refs->name)
@@ -864,11 +925,12 @@ static void read_loose_refs(struct ref_entry *direntry)
if (stat(refdir, &st) < 0)
continue;
if (S_ISDIR(st.st_mode)) {
+ struct ref_entry *subdirentry;
refname[dirnamelen + namelen] = '/';
refname[dirnamelen + namelen + 1] = '\0';
- read_loose_refs(find_containing_direntry(
- refs->loose,
- refname, 1));
+ subdirentry = create_dir_entry(direntry->u.subdir.ref_cache,
+ refname, REF_DIR_INCOMPLETE);
+ add_entry(direntry, subdirentry);
continue;
}
if (*refs->name) {
@@ -888,13 +950,15 @@ static void read_loose_refs(struct ref_entry *direntry)
free(refname);
closedir(d);
}
+ direntry->flag = REF_DIR_COMPLETE;
}
static struct ref_entry *get_loose_refs(struct ref_cache *refs)
{
if (!refs->loose) {
- refs->loose = create_dir_entry(refs, "");
- read_loose_refs(find_containing_direntry(refs->loose, "refs/", 1));
+ refs->loose = create_dir_entry(refs, "", REF_DIR_COMPLETE);
+ add_entry(refs->loose,
+ create_dir_entry(refs, "refs/", REF_DIR_INCOMPLETE));
}
return refs->loose;
}
--
1.7.7
Thread overview: 36+ messages
2011-10-28 12:28 [PATCH 00/28] Store references hierarchically in cache mhagger
2011-10-28 12:28 ` [PATCH 01/28] refs.c: reorder definitions more logically mhagger
2011-10-28 12:28 ` [PATCH 02/28] free_ref_entry(): new function mhagger
2011-10-28 12:28 ` [PATCH 03/28] check_refname_component(): return 0 for zero-length components mhagger
2011-10-28 12:28 ` [PATCH 04/28] struct ref_entry: nest the value part in a union mhagger
2011-10-28 12:28 ` [PATCH 05/28] refs.c: rename ref_array -> ref_dir mhagger
2011-10-28 12:28 ` [PATCH 06/28] refs: store references hierarchically mhagger
2011-10-28 12:28 ` [PATCH 07/28] sort_ref_dir(): do not sort if already sorted mhagger
2011-10-28 12:28 ` [PATCH 08/28] refs: sort ref_dirs lazily mhagger
2011-10-28 12:28 ` [PATCH 09/28] do_for_each_ref(): only iterate over the subtree that was requested mhagger
2011-10-28 12:28 ` [PATCH 10/28] get_ref_dir(): keep track of the current ref_dir mhagger
2011-10-28 12:28 ` [PATCH 11/28] refs: wrap top-level ref_dirs in ref_entries mhagger
2011-10-28 12:28 ` [PATCH 12/28] get_packed_refs(): return (ref_entry *) instead of (ref_dir *) mhagger
2011-10-28 12:28 ` [PATCH 13/28] get_loose_refs(): " mhagger
2011-10-28 12:28 ` [PATCH 14/28] is_refname_available(): take " mhagger
2011-10-28 12:28 ` [PATCH 15/28] find_ref(): " mhagger
2011-10-28 12:28 ` [PATCH 16/28] read_packed_refs(): " mhagger
2011-10-28 12:28 ` [PATCH 17/28] add_ref(): " mhagger
2011-10-28 12:28 ` [PATCH 18/28] find_containing_direntry(): use " mhagger
2011-10-28 12:28 ` [PATCH 19/28] search_ref_dir(): take " mhagger
2011-10-28 12:28 ` [PATCH 20/28] add_entry(): " mhagger
2011-10-28 12:28 ` [PATCH 21/28] do_for_each_ref_in_dir*(): " mhagger
2011-10-28 12:28 ` [PATCH 22/28] sort_ref_dir(): " mhagger
2011-10-28 12:28 ` [PATCH 23/28] struct ref_dir: store a reference to the enclosing ref_cache mhagger
2011-10-28 12:28 ` [PATCH 24/28] read_loose_refs(): take a (ref_entry *) as argument mhagger
2011-10-28 12:28 ` [PATCH 25/28] refs: read loose references lazily mhagger [this message]
2011-10-28 12:28 ` [PATCH 26/28] is_refname_available(): query only possibly-conflicting references mhagger
2011-11-15 5:55 ` [PATCH] Fix "is_refname_available(): query only possibly-conflicting references" mhagger
2011-11-15 7:24 ` Junio C Hamano
2011-11-15 16:19 ` Michael Haggerty
2011-11-15 19:19 ` Junio C Hamano
2011-10-28 12:28 ` [PATCH 27/28] read_packed_refs(): keep track of the directory being worked in mhagger
2011-10-28 12:28 ` [PATCH 28/28] repack_without_ref(): call clear_packed_ref_cache() mhagger
2011-10-28 13:07 ` [PATCH 00/28] Store references hierarchically in cache Ramkumar Ramachandra
2011-10-28 18:45 ` Michael Haggerty
2011-11-16 12:51 ` [PATCH 00/28] Store references hierarchically in cache -- benchmark results Michael Haggerty