[PATCH/RFC] index-pack: produce pack index version 3

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH/RFC] index-pack: produce pack index version 3
@ 2012-08-12 12:01 Nguyễn Thái Ngọc Duy
  2012-08-12 19:11 ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-08-12 12:01 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

The sha-1->offset table in v3 is sorted by object type first, then
sha-1 (as opposed to sorted by sha-1 only in v2). There are four
fan-out tables, one for each object type. So to look for the offset of
an object, we first locate the fan-out table based on object type,
then do binary search in that region. If we don't know object type, we
may need to try all object types (which might be slower than now)

Long term we might gain slight lookup speedup if we know object type
as search region is made smaller. But for that to happen, we need to
propagate object type hint down to find_pack_entry_one() and friends.
Possible thing to do, I think.

The main reason to group objects by type is to make it possible to
create another sha1->something mapping for a particular object type,
without wasting space for storing sha-1 keys again. For example, we
can store commit caches, tree caches... at the end of the index as
extensions.

Note that this is just one of v3 changes. The format is not fixed yet.
The intent is allow extensions add the end of the index. Extensions
are not covered by the trailer sha-1 that covers the entire index v2,
so big extensions cannot impact normal operations until they are used.
Extensions have their own sha-1 trailers.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Not compilable as it requires a couple other patches that I don't
 send. Just to get opinions on the format change. Shawn's connectivity
 bitmap cache or whatever cache can then be placed on this.

 builtin/index-pack.c |  8 ++++-
 pack-write.c         | 93 +++++++++++++++++++++++++++++++++++++++-------------
 pack.h               |  1 +
 sha1_file.c          | 50 +++++++++++++++++++++-------
 4 files changed, 116 insertions(+), 36 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 68e2031..22c3cfa 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1574,7 +1574,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!prefixcmp(arg, "--index-version=")) {
 				char *c;
 				opts.version = strtoul(arg + 16, &c, 10);
-				if (opts.version > 2)
+				if (opts.version > 3)
 					die(_("bad %s"), arg);
 				if (*c == ',')
 					opts.off32_limit = strtoul(c+1, &c, 0);
@@ -1650,6 +1650,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
 	opts.nr_objects = nr_objects;
+	if (opts.version >= 3) {
+		opts.nr_commits = nr_commits;
+		opts.nr_trees = nr_trees;
+		opts.nr_blobs = nr_blobs;
+		opts.nr_tags = nr_tags;
+	}
 	curr_index = write_idx_file(index_name, idx_objects, &opts, pack_sha1);
 	free(idx_objects);
 
diff --git a/pack-write.c b/pack-write.c
index 4643796..d4152b5 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -16,6 +16,15 @@ static int sha1_compare(const void *_a, const void *_b)
 	return hashcmp(a->sha1, b->sha1);
 }
 
+static int type_sha1_compare(const void *_a, const void *_b)
+{
+	struct pack_idx_entry *a = *(struct pack_idx_entry **)_a;
+	struct pack_idx_entry *b = *(struct pack_idx_entry **)_b;
+
+	assert(OBJ_COMMIT < OBJ_TREE && OBJ_TREE < OBJ_BLOB && OBJ_BLOB < OBJ_TAG);
+	return a->type == b->type ? hashcmp(a->sha1, b->sha1) : a->type - b->type;
+}
+
 static int cmp_uint32(const void *a_, const void *b_)
 {
 	uint32_t a = *((uint32_t *)a_);
@@ -37,6 +46,32 @@ static int need_large_offset(off_t offset, const struct pack_idx_option *opts)
 			 sizeof(ofsval), cmp_uint32);
 }
 
+static void make_fanout_table(struct pack_idx_entry **sorted_by_sha,
+			      unsigned long start, unsigned long end,
+			      uint32_t *array)
+{
+	struct pack_idx_entry **list = sorted_by_sha + start;
+	struct pack_idx_entry **last = sorted_by_sha + end;
+	int i;
+
+	/*
+	 * Write the first-level table (the list is sorted,
+	 * but we use a 256-entry lookup to be able to avoid
+	 * having to do eight extra binary search iterations).
+	 */
+	for (i = 0; i < 256; i++) {
+		struct pack_idx_entry **next = list;
+		while (next < last) {
+			struct pack_idx_entry *obj = *next;
+			if (obj->sha1[0] != i)
+				break;
+			next++;
+		}
+		array[i] = htonl(next - sorted_by_sha);
+		list = next;
+	}
+}
+
 /*
  * On entry *sha1 contains the pack content SHA1 hash, on exit it is
  * the SHA1 hash of sorted object names. The objects array passed in
@@ -47,27 +82,23 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 			   unsigned char *sha1)
 {
 	struct sha1file *f;
-	struct pack_idx_entry **sorted_by_sha, **list, **last;
+	struct pack_idx_entry **sorted_by_sha = NULL, **list;
 	off_t last_obj_offset = 0;
 	uint32_t array[256];
 	int fd;
 	git_SHA_CTX ctx;
-	uint32_t index_version;
+	uint32_t index_version = opts->version;
 	unsigned long i, nr_objects = opts->nr_objects;
 
 	if (nr_objects) {
 		sorted_by_sha = objects;
-		list = sorted_by_sha;
-		last = sorted_by_sha + nr_objects;
 		for (i = 0; i < nr_objects; ++i) {
 			if (objects[i]->offset > last_obj_offset)
 				last_obj_offset = objects[i]->offset;
 		}
 		qsort(sorted_by_sha, nr_objects, sizeof(sorted_by_sha[0]),
-		      sha1_compare);
+		      opts->version >= 3 ? type_sha1_compare : sha1_compare);
 	}
-	else
-		sorted_by_sha = list = last = NULL;
 
 	if (opts->flags & WRITE_IDX_VERIFY) {
 		assert(index_name);
@@ -87,7 +118,8 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 	}
 
 	/* if last object's offset is >= 2^31 we should use index V2 */
-	index_version = need_large_offset(last_obj_offset, opts) ? 2 : opts->version;
+	if (opts->version <= 2 && need_large_offset(last_obj_offset, opts))
+		index_version = 2;
 
 	/* index versions 2 and above need a header */
 	if (index_version >= 2) {
@@ -97,23 +129,38 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 		sha1write(f, &hdr, sizeof(hdr));
 	}
 
-	/*
-	 * Write the first-level table (the list is sorted,
-	 * but we use a 256-entry lookup to be able to avoid
-	 * having to do eight extra binary search iterations).
-	 */
-	for (i = 0; i < 256; i++) {
-		struct pack_idx_entry **next = list;
-		while (next < last) {
-			struct pack_idx_entry *obj = *next;
-			if (obj->sha1[0] != i)
-				break;
-			next++;
+	if (index_version >= 3) {
+		unsigned long ub[OBJ_MAX], lb[OBJ_MAX];
+		/* we heavily rely on the actual enum constants.. */
+		assert(OBJ_COMMIT + 1 == OBJ_TREE &&
+		       OBJ_TREE   + 1 == OBJ_BLOB &&
+		       OBJ_BLOB   + 1 == OBJ_TAG);
+		lb[OBJ_COMMIT] = 0;
+		ub[OBJ_COMMIT] = opts->nr_commits;
+		lb[OBJ_TREE]   = ub[OBJ_COMMIT];
+		ub[OBJ_TREE]   = lb[OBJ_TREE] + opts->nr_trees;
+		lb[OBJ_BLOB]   = ub[OBJ_TREE];
+		ub[OBJ_BLOB]   = lb[OBJ_BLOB] + opts->nr_blobs;
+		lb[OBJ_TAG]    = ub[OBJ_BLOB];
+		ub[OBJ_TAG]    = lb[OBJ_TAG]  + opts->nr_tags;
+
+		/* make sure nr_{commits,trees,objects,tags} match */
+		for (i = OBJ_COMMIT; i <= OBJ_TAG; i++)
+			if (ub[i] > lb[i] &&
+			    (objects[lb[i]]->type != i ||
+			     objects[ub[i]-1]->type != i))
+				die(_("expected to have %lu %s, but not so"),
+				    ub[i] - lb[i], typename(i));
+
+		/* v3 has four fanout tables instead of one */
+		for (i = OBJ_COMMIT; i <= OBJ_TAG; i++) {
+			make_fanout_table(sorted_by_sha, lb[i], ub[i], array);
+			sha1write(f, array, 256 * 4);
 		}
-		array[i] = htonl(next - sorted_by_sha);
-		list = next;
+	} else {
+		make_fanout_table(sorted_by_sha, 0, opts->nr_objects, array);
+		sha1write(f, array, 256 * 4);
 	}
-	sha1write(f, array, 256 * 4);
 
 	/* compute the SHA1 hash of sorted object names. */
 	git_SHA1_Init(&ctx);
diff --git a/pack.h b/pack.h
index f94e6c4..1a26f65 100644
--- a/pack.h
+++ b/pack.h
@@ -45,6 +45,7 @@ struct pack_idx_option {
 	uint32_t off32_limit;
 
 	unsigned long nr_objects;
+	unsigned long nr_commits, nr_trees, nr_tags, nr_blobs;
 
 	/*
 	 * List of offsets that would fit within off32_limit but
diff --git a/sha1_file.c b/sha1_file.c
index af5cfbd..66cc38a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -480,7 +480,7 @@ static int check_packed_git_idx(const char *path,  struct packed_git *p)
 	void *idx_map;
 	struct pack_idx_header *hdr;
 	size_t idx_size;
-	uint32_t version, nr, i, *index;
+	uint32_t version, nr, i, *index, nr_fanout;
 	int fd = git_open_noatime(path);
 	struct stat st;
 
@@ -501,7 +501,7 @@ static int check_packed_git_idx(const char *path,  struct packed_git *p)
 	hdr = idx_map;
 	if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) {
 		version = ntohl(hdr->idx_version);
-		if (version < 2 || version > 2) {
+		if (version < 2 || version > 3) {
 			munmap(idx_map, idx_size);
 			return error("index file %s is version %"PRIu32
 				     " and is not supported by this binary"
@@ -515,7 +515,8 @@ static int check_packed_git_idx(const char *path,  struct packed_git *p)
 	index = idx_map;
 	if (version > 1)
 		index += 2;  /* skip index header */
-	for (i = 0; i < 256; i++) {
+	nr_fanout = version <= 2 ? 256 : 4 * 256;
+	for (i = 0; i < nr_fanout; i++) {
 		uint32_t n = ntohl(index[i]);
 		if (n < nr) {
 			munmap(idx_map, idx_size);
@@ -536,11 +537,11 @@ static int check_packed_git_idx(const char *path,  struct packed_git *p)
 			munmap(idx_map, idx_size);
 			return error("wrong index v1 file size in %s", path);
 		}
-	} else if (version == 2) {
+	} else if (version == 2 || version == 3) {
 		/*
 		 * Minimum size:
 		 *  - 8 bytes of header
-		 *  - 256 index entries 4 bytes each
+		 *  - 256 index entries 4 bytes each (1024 entries in v3)
 		 *  - 20-byte sha1 entry * nr
 		 *  - 4-byte crc entry * nr
 		 *  - 4-byte offset entry * nr
@@ -552,6 +553,8 @@ static int check_packed_git_idx(const char *path,  struct packed_git *p)
 		 */
 		unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20;
 		unsigned long max_size = min_size;
+		if (version == 3)
+			min_size += 3 * 4*256;
 		if (nr)
 			max_size += (nr - 1)*8;
 		if (idx_size < min_size || idx_size > max_size) {
@@ -1945,7 +1948,7 @@ const unsigned char *nth_packed_object_sha1(struct packed_git *p,
 	}
 	if (n >= p->num_objects)
 		return NULL;
-	index += 4 * 256;
+	index += p->index_version >= 3 ? 4 * 4 * 256 : 4 * 256;
 	if (p->index_version == 1) {
 		return index + 24 * n + 4;
 	} else {
@@ -1957,7 +1960,7 @@ const unsigned char *nth_packed_object_sha1(struct packed_git *p,
 off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
 {
 	const unsigned char *index = p->index_data;
-	index += 4 * 256;
+	index += p->index_version >= 3 ? 4 * 4 * 256 : 4 * 256;
 	if (p->index_version == 1) {
 		return ntohl(*((uint32_t *)(index + 24 * n)));
 	} else {
@@ -1972,8 +1975,9 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
 	}
 }
 
-off_t find_pack_entry_one(const unsigned char *sha1,
-				  struct packed_git *p)
+static off_t find_pack_entry_one_by_type(const unsigned char *sha1,
+					 struct packed_git *p,
+					 enum object_type type)
 {
 	const uint32_t *level1_ofs = p->index_data;
 	const unsigned char *index = p->index_data;
@@ -1994,9 +1998,19 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 		level1_ofs += 2;
 		index += 8;
 	}
-	index += 4 * 256;
-	hi = ntohl(level1_ofs[*sha1]);
-	lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
+	if (p->index_version > 2) {
+		level1_ofs += (type - OBJ_COMMIT) * 256;
+		index += 4 * 4 * 256;
+		hi = ntohl(level1_ofs[*sha1]);
+		if (*sha1 == 0)
+			lo = type == OBJ_COMMIT ? 0 : ntohl(level1_ofs[-1]);
+		else
+			lo = ntohl(level1_ofs[*sha1 - 1]);
+	} else {
+		index += 4 * 256;
+		hi = ntohl(level1_ofs[*sha1]);
+		lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1]));
+	}
 	if (p->index_version > 1) {
 		stride = 20;
 	} else {
@@ -2035,6 +2049,18 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 	return 0;
 }
 
+off_t find_pack_entry_one(const unsigned char *sha1,
+				  struct packed_git *p)
+{
+	off_t off;
+	if ((off = find_pack_entry_one_by_type(sha1, p, OBJ_COMMIT)) != 0 ||
+	    (off = find_pack_entry_one_by_type(sha1, p, OBJ_TREE)) != 0 ||
+	    (off = find_pack_entry_one_by_type(sha1, p, OBJ_BLOB)) != 0 ||
+	    (off = find_pack_entry_one_by_type(sha1, p, OBJ_TAG)) != 0)
+		return off;
+	return 0;
+}
+
 int is_pack_valid(struct packed_git *p)
 {
 	/* An already open pack is known to be valid. */
-- 
1.7.12.rc2.18.g61b472e

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-12 12:01 [PATCH/RFC] index-pack: produce pack index version 3 Nguyễn Thái Ngọc Duy
@ 2012-08-12 19:11 ` Junio C Hamano
  2012-08-12 19:49   ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-08-12 19:11 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> The main reason to group objects by type is to make it possible to
> create another sha1->something mapping for a particular object type,
> without wasting space for storing sha-1 keys again. For example, we
> can store commit caches, tree caches... at the end of the index as
> extensions.

Why can't you do the same with a single "sorted by SHA-1" table?

Not impressed yet.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-12 19:11 ` Junio C Hamano
@ 2012-08-12 19:49   ` Junio C Hamano
  2012-08-13  0:57     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-08-12 19:49 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> The main reason to group objects by type is to make it possible to
>> create another sha1->something mapping for a particular object type,
>> without wasting space for storing sha-1 keys again. For example, we
>> can store commit caches, tree caches... at the end of the index as
>> extensions.
>
> Why can't you do the same with a single "sorted by SHA-1" table?
>
> Not impressed yet.

The above should be "Not impressed yet, as it lacks sufficient
explanation of possible future benefits, but the idea is
interesting."

For example, the reachability bitmap would want to say something
like "Traversing from commit A, these objects in this pack are
reachable."  The bitmap for one commit A would logically consist of
N bits for a packfile that stores N objects (the resulting bitmap
needs to be compressed before going to disk, perhaps with RLE or
something).  With the single "sorted by SHA-1" table, we can use the
index in that single table to enumerate all reachable objects of any
type in one go.  With four separate tables, on the other hand, we
would need four bitmaps per commit.

Either way is _possible_, but I think the former is simpler, and the
latter makes it harder to introduce new types of objects in the
future, which I do not think we have examined possible use cases
well enough to make that decision to say "four types is enough
forever".

In either way, we would have such bitmap (or a set of four bitmaps
in your case) for more than one commit (it is not necessary or
desirable to add the reachability bitmap to all commits), and such a
"reachability extension" would need to store a sequence of "the
commit object name the bitmap (or a set of four bitmaps) is about,
and the bitmap (or set of four bitmaps)".  That object name does not
have to be 20-byte but would be a varint representation of the
offset into the "sorted by SHA-1" table.  That varint representation
would be smaller by about 3.5 bits if you have a separate "commit
only, sorted by SHA-1" table (as the number of all objects tend to
be 10x larger than the number of all commits that need them).  For
the particular case of "we want to only annotate the commits, never
other kinds of objects" use case, it would be a win.  But without
knowing what other use cases we will want to use the "object
annotation in the pack index file" mechanism for, it feels like a
premature optimization to me to have four tables to shave 3.5 bits
per object.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-12 19:49   ` Junio C Hamano
@ 2012-08-13  0:57     ` Nguyen Thai Ngoc Duy
  2012-08-13  7:40       ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-13  0:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Aug 13, 2012 at 2:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
> For example, the reachability bitmap would want to say something
> like "Traversing from commit A, these objects in this pack are
> reachable."  The bitmap for one commit A would logically consist of
> N bits for a packfile that stores N objects (the resulting bitmap
> needs to be compressed before going to disk, perhaps with RLE or
> something).  With the single "sorted by SHA-1" table, we can use the
> index in that single table to enumerate all reachable objects of any
> type in one go.  With four separate tables, on the other hand, we
> would need four bitmaps per commit.

No we still need one per commit. The n-th bit is in the order of the
object in the pack, not the index. How sha-1 is sorted does not
matter.

> Either way is _possible_, but I think the former is simpler, and the
> latter makes it harder to introduce new types of objects in the
> future, which I do not think we have examined possible use cases
> well enough to make that decision to say "four types is enough
> forever".

New types can be put in one of those four tables, depending on its
purpose. The reason I split because I care particularly about commits
and trees. If the new type serves the same purpose as tree, for
example, then it's better put in tree table...

> In either way, we would have such bitmap (or a set of four bitmaps
> in your case) for more than one commit (it is not necessary or
> desirable to add the reachability bitmap to all commits), and such a
> "reachability extension" would need to store a sequence of "the
> commit object name the bitmap (or a set of four bitmaps) is about,
> and the bitmap (or set of four bitmaps)".  That object name does not
> have to be 20-byte but would be a varint representation of the
> offset into the "sorted by SHA-1" table.

How do you reach the bitmap, given its commit sha-1?

> That varint representation
> would be smaller by about 3.5 bits if you have a separate "commit
> only, sorted by SHA-1" table (as the number of all objects tend to
> be 10x larger than the number of all commits that need them).  For
> the particular case of "we want to only annotate the commits, never
> other kinds of objects" use case, it would be a win.  But without
> knowing what other use cases we will want to use the "object
> annotation in the pack index file" mechanism for, it feels like a
> premature optimization to me to have four tables to shave 3.5 bits
> per object.

caching trees for faster traversal in general case (sort of pack v4
but it comes as a cache instead of replacing the real pack).
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-13  0:57     ` Nguyen Thai Ngoc Duy
@ 2012-08-13  7:40       ` Junio C Hamano
  2012-08-14  0:46         ` Shawn Pearce
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-08-13  7:40 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> On Mon, Aug 13, 2012 at 2:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> For example, the reachability bitmap would want to say something
>> like "Traversing from commit A, these objects in this pack are
>> reachable."  The bitmap for one commit A would logically consist of
>> N bits for a packfile that stores N objects (the resulting bitmap
>> needs to be compressed before going to disk, perhaps with RLE or
>> something).  With the single "sorted by SHA-1" table, we can use the
>> index in that single table to enumerate all reachable objects of any
>> type in one go.  With four separate tables, on the other hand, we
>> would need four bitmaps per commit.
>
> No we still need one per commit. The n-th bit is in the order of the
> object in the pack, not the index. How sha-1 is sorted does not
> matter.

Then what is the problem of using that same "N" that counts the
object using the order of the objects in the pack as a short-hand to
the object name instead, if your objective is to avoid having to
repeat the full 20-byte object name in your secondary tables?  That
does not call for any split in the list of SHA-1s, no?

>> Either way is _possible_, but I think the former is simpler, and the
>> latter makes it harder to introduce new types of objects in the
>> future, which I do not think we have examined possible use cases
>> well enough to make that decision to say "four types is enough
>> forever".
>
> New types can be put in one of those four tables, depending on its
> purpose. The reason I split because I care particularly about commits
> and trees. If the new type serves the same purpose as tree, for
> example, then it's better put in tree table...

And when there are multiple uses, one that wants to treat trees and
commits more or less the same way, and another that wants to treat
them in a distinct way, even without considering a new type, how
would your "we have four separate tables" help?  The former user
wants three "commits+trees", "tags" and "blobs" table while the
latter wants four, no?

>> In either way, we would have such bitmap (or a set of four bitmaps
>> in your case) for more than one commit (it is not necessary or
>> desirable to add the reachability bitmap to all commits), and such a
>> "reachability extension" would need to store a sequence of "the
>> commit object name the bitmap (or a set of four bitmaps) is about,
>> and the bitmap (or set of four bitmaps)".  That object name does not
>> have to be 20-byte but would be a varint representation of the
>> offset into the "sorted by SHA-1" table.
>
> How do you reach the bitmap, given its commit sha-1?

Look at the ordered list of SHA-1 in the pack idx file as an array
of 20-byte entries, and find the commit SHA-1 in that array by
Newton-Raphson/bisection.  You have an index into the array that
holds the object name.  You can use it as the local object number
in the packfile.  Then you can find that object number (that is
shorter than 20-byte object name) in your secondary table, no?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-13  7:40       ` Junio C Hamano
@ 2012-08-14  0:46         ` Shawn Pearce
  2012-08-14  1:27           ` Nguyen Thai Ngoc Duy
  2012-08-16  5:42           ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Shawn Pearce @ 2012-08-14  0:46 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git

Let me start by echoing Junio's remark... "lacks sufficient
justification". You don't give enough evidence to support even why it
is worth looking at this commit, let alone why it should be included
and cause a format change in the idx file format.

At some point you start to hand-wave about how it may benefit my
bitmap reachability index idea, but you don't seem to fully understand
that idea. And some of what you do here sounds like it actively hurts
doing the bitmap reachability project. So again this doesn't really
help.

Colby is nearly done prototyping the bitmap reachability
implementation in JGit and will release the code under the BSD license
there soon. I can't promise when yet because Colby will soon be
heading out for some (much deserved) vacation time, and doesn't want
to publish something until he is ready to stand behind the code. So it
might not show up until mid-September. But I think its worth giving
him a few weeks to finish getting the code ready, vs. rushing
something in that someone else thinks might help. We have waited more
than 6 years or whatever to improve packing. Colby's experiments are
showing massive improvements (40s enumeration cut to 100ms) with low
disk usage increase (<10%) and no pack file format changes. Its OK to
wait 4 more weeks.

On Mon, Aug 13, 2012 at 12:40 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> On Mon, Aug 13, 2012 at 2:49 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> For example, the reachability bitmap would want to say something
>>> like "Traversing from commit A, these objects in this pack are
>>> reachable."  The bitmap for one commit A would logically consist of
>>> N bits for a packfile that stores N objects (the resulting bitmap
>>> needs to be compressed before going to disk, perhaps with RLE or
>>> something).  With the single "sorted by SHA-1" table, we can use the
>>> index in that single table to enumerate all reachable objects of any
>>> type in one go.  With four separate tables, on the other hand, we
>>> would need four bitmaps per commit.
>>
>> No we still need one per commit. The n-th bit is in the order of the
>> object in the pack, not the index. How sha-1 is sorted does not
>> matter.

Yes, for RLE encoding to work well it is important that the bits are
assigned using pack offset ordering, not SHA-1 ordering. The object at
bit 0 is the object at offset 12 in the pack file, not the object that
has the lowest SHA-1. The RLE encoding is relying on the packer to
store objects closely reachable by time near each other in the pack
file. Which the packer already does because we have roughly proven
though the years that this ordering works sufficiently well enough for
nobody to propose another ordering.  :-)

> Then what is the problem of using that same "N" that counts the
> object using the order of the objects in the pack as a short-hand to
> the object name instead, if your objective is to avoid having to
> repeat the full 20-byte object name in your secondary tables?  That
> does not call for any split in the list of SHA-1s, no?

Exactly. There is no need to store a second copy of the SHA-1s
anywhere. We can discover information about object identified by bit N
by creating the reverse pack index in memory. The N-th entry in the
reverse index will tell us the Y-th position in the normal idx file,
which gives us the SHA-1. So SHA-1 can be obtained from a bitmap in
constant time, once the reverse index is built. The reverse index
build costs O(N log N) time, but its already required by the packer to
do efficient reuse of packed objects. We already pay this O(N log N)
penalty. So its currently free to us.

The bitmap approach Colby and I are collaborating on also contains 4
RLE bitmaps indicating what type each object is. Its needed so the
packer can avoid looking up type data in the base pack file, while
also avoiding parsing of trees where the mode information normally
provides the type hint. We could instead get this by splitting the
SHA-1 table into 4 tables as this patch proposes, but it doesn't help.
Type information can be obtained fast enough from these 4 RLE bitmaps,
without muddying the read code path with needing to know what table to
search, or needing to search 4 tables rather than 1 table.

>>> Either way is _possible_, but I think the former is simpler, and the
>>> latter makes it harder to introduce new types of objects in the
>>> future, which I do not think we have examined possible use cases
>>> well enough to make that decision to say "four types is enough
>>> forever".
>>
>> New types can be put in one of those four tables, depending on its
>> purpose.

No. A new type should get a new table. We should never cram a new type
into an existing type table just because it sounds like a good idea.
Its not. We are unlikely to add a new type anytime soon. If we do its
going to take more work than just worrying about where it fits into an
idx file. But it shouldn't be combined into a bucket with an existing
type.

>> The reason I split because I care particularly about commits
>> and trees. If the new type serves the same purpose as tree, for
>> example, then it's better put in tree table...

It isn't worth speculating about this. We aren't likely to add a new
type. If we do its probably not like another type.

> And when there are multiple uses, one that wants to treat trees and
> commits more or less the same way, and another that wants to treat
> them in a distinct way, even without considering a new type, how
> would your "we have four separate tables" help?  The former user
> wants three "commits+trees", "tags" and "blobs" table while the
> latter wants four, no?

And this paragraph alone indicates why splitting the SHA-1 table is a
stupid idea. Readers who have a SHA-1 in hand shouldn't have to dig
around in 4 different tables trying to figure out what it wants. Or
maybe 3 tables if it thinks its tree-ish. Its just a stupid discussion
to have. At this level of the system the object database is a database
keyed by SHA-1s. Simple. Don't muddle it.

>>> In either way, we would have such bitmap (or a set of four bitmaps
>>> in your case) for more than one commit (it is not necessary or
>>> desirable to add the reachability bitmap to all commits), and such a
>>> "reachability extension" would need to store a sequence of "the
>>> commit object name the bitmap (or a set of four bitmaps) is about,
>>> and the bitmap (or set of four bitmaps)".  That object name does not
>>> have to be 20-byte but would be a varint representation of the
>>> offset into the "sorted by SHA-1" table.

Yes.

>> How do you reach the bitmap, given its commit sha-1?
>
> Look at the ordered list of SHA-1 in the pack idx file as an array
> of 20-byte entries, and find the commit SHA-1 in that array by
> Newton-Raphson/bisection.  You have an index into the array that
> holds the object name.  You can use it as the local object number
> in the packfile.  Then you can find that object number (that is
> shorter than 20-byte object name) in your secondary table, no?

Yes, exactly. But I am trying to steer Colby into a slightly different
direction. For ~222k commits we need only about 2k reachability
bitmaps. If these are keyed by N-th SHA-1 as Junio states, we can
lookup the object in the object map and mark a flag on it saying
"reachability bitmap exists on this commit". Later if the revision
walker hits a commit with this flag set, we know we should look at
that bitmap information. Allocating 2k commits into the revision
walking map is fairly cheap, and makes it fast for the walker to later
identify where a bitmap exists. But I don't have any data yet to prove
what the fastest way to manage finding the right bitmaps is.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-14  0:46         ` Shawn Pearce
@ 2012-08-14  1:27           ` Nguyen Thai Ngoc Duy
  2012-08-16  5:42           ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-14  1:27 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, git

On Tue, Aug 14, 2012 at 7:46 AM, Shawn Pearce <spearce@spearce.org> wrote:
> Colby is nearly done prototyping the bitmap reachability
> implementation in JGit and will release the code under the BSD license
> there soon. I can't promise when yet because Colby will soon be
> heading out for some (much deserved) vacation time, and doesn't want
> to publish something until he is ready to stand behind the code. So it
> might not show up until mid-September. But I think its worth giving
> him a few weeks to finish getting the code ready, vs. rushing
> something in that someone else thinks might help. We have waited more
> than 6 years or whatever to improve packing. Colby's experiments are
> showing massive improvements (40s enumeration cut to 100ms) with low
> disk usage increase (<10%) and no pack file format changes. Its OK to
> wait 4 more weeks.

Thanks. I did not know it's being worked on (your last email did not
indicate that, I think). I just wanted to try out and see how it might
work. And sure I can wait :)
--
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-14  0:46         ` Shawn Pearce
  2012-08-14  1:27           ` Nguyen Thai Ngoc Duy
@ 2012-08-16  5:42           ` Junio C Hamano
  2012-08-16 15:54             ` Shawn Pearce
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-08-16  5:42 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nguyen Thai Ngoc Duy, git

Shawn Pearce <spearce@spearce.org> writes:

> ... But I think its worth giving
> him a few weeks to finish getting the code ready, vs. rushing
> something in that someone else thinks might help. We have waited more
> than 6 years or whatever to improve packing. Colby's experiments are
> showing massive improvements (40s enumeration cut to 100ms) with low
> disk usage increase (<10%) and no pack file format changes.

No matter what you do, you would be saving the bitmap somewhere
outside the *.pack file, yes?  Will it be in some extended section
of the new *.idx file?

With the bitmap, your object enumeration phase may go very fast, and
you would be able to figure out, in response to a fetch request "I
have these tips of refs; please update me with these refs of yours",
the set of objects you would need to pack (i.e. the ones that are
reachable from your refs that were asked for, but that are not
reachable from the refs the requestor has).  

Among these objects, there will be ones that are expressed as a
delta against what you are going to send, or as a delta against what
you know the recipient must already have (if you are using thin pack
transfer) in the packfiles you have, and you can send these deltas
as-is without recomputation.

But there will be ones that are either expressed as a base in your
packfile, or as a delta against something you are not going to send
and you know that the recipient does not have.  In order to turn
these objects into deltas, it may be necessary to have a way to
record which "delta chain" each object belongs to, and if you are
introducing the mechanism to have extended sections in the new *.idx
file, that may be a good place to do so.  When you need to express
an object that your bitmap told you to send (as opposed to rev-list
walking told you with the paths to the objects), you can find other
objects that belong to the same "delta chain" and that you know are
available to the recipient when it starts to fixing the thin pack
using that extra piece of information, in order to find the optimal
delta base to encode such an object against.

Just for fun, I applied the attached patch and repacked the history
leading to v1.7.12-rc3 with the default depth/window:

  git rev-list --objects --all |
  git pack-objects \
      --no-reuse-delta --no-reuse-object --delta-base-offset \
      [--no-namehash] pack

with and without the experimental --no-namehash option.  The result
is staggering.  With name-hash to group objects from the same path
close together in the delta window, the resulting pack is 33M.
Without the name-hash hint, the same pack is 163M!  Needless to say,
keeping the objects that should be hashed together inside a delta
window is really important.

An obvious way to record the "delta chain" is to simply keep the
name_hash of each object in the pack, which would need 2 bytes per
object in the pack, that would bloat pack_idx_entry size from 32
bytes to 34 bytes per entry.  That way, after your bitmap discovers
an object that cannot reuse existing deltas, you can throw it, other
such objects with the same name-hash, and then objects that you know
will be available to the recipient (you mark the last category of
objects as "preferred base"), into the delta_list so that they are
close together in the delta window.

And here is a patch to experiment the name-hash based grouping
heuristics.

 builtin/pack-objects.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git c/builtin/pack-objects.c w/builtin/pack-objects.c
index 782e7d0..a2dd67b 100644
--- c/builtin/pack-objects.c
+++ w/builtin/pack-objects.c
@@ -67,6 +67,7 @@ static int keep_unreachable, unpack_unreachable, include_tag;
 static unsigned long unpack_unreachable_expiration;
 static int local;
 static int incremental;
+static int use_name_hash = 1;
 static int ignore_packed_keep;
 static int allow_ofs_delta;
 static struct pack_idx_option pack_idx_opts;
@@ -863,7 +864,7 @@ static unsigned name_hash(const char *name)
 {
 	unsigned c, hash = 0;

-	if (!name)
+	if (!name || !use_name_hash)
 		return 0;

 	/*
@@ -2468,6 +2469,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			  "limit pack window by memory in addition to object limit"),
 		OPT_INTEGER(0, "depth", &depth,
 			    "maximum length of delta chain allowed in the resulting pack"),
+		OPT_BOOL(0, "namehash", &use_name_hash,
+			 "assign name hint to each path"),
 		OPT_BOOL(0, "reuse-delta", &reuse_delta,
 			 "reuse existing deltas"),
 		OPT_BOOL(0, "reuse-object", &reuse_object,

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-16  5:42           ` Junio C Hamano
@ 2012-08-16 15:54             ` Shawn Pearce
  2012-08-16 16:26               ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Shawn Pearce @ 2012-08-16 15:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git

On Wed, Aug 15, 2012 at 10:42 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
>
>> ... But I think its worth giving
>> him a few weeks to finish getting the code ready, vs. rushing
>> something in that someone else thinks might help. We have waited more
>> than 6 years or whatever to improve packing. Colby's experiments are
>> showing massive improvements (40s enumeration cut to 100ms) with low
>> disk usage increase (<10%) and no pack file format changes.
>
> No matter what you do, you would be saving the bitmap somewhere
> outside the *.pack file, yes?  Will it be in some extended section
> of the new *.idx file?

Yes, because it is derived from the pack stream its stored external
from it. We have chosen to append it as a new section below the CRC-32
table in the *.idx file. It could also go into a new file stream
(*.bdx?) but I don't think this is worthwhile. Its a bikeshed that can
be painted once the algorithm has proven its value. If it can't do
that, its irrelevant where its stored.

> With the bitmap, your object enumeration phase may go very fast, and
> you would be able to figure out, in response to a fetch request "I
> have these tips of refs; please update me with these refs of yours",
> the set of objects you would need to pack (i.e. the ones that are
> reachable from your refs that were asked for, but that are not
> reachable from the refs the requestor has).

Yes. Not only that, but we are finding that the number of objects to
send is fewer, resulting in a much smaller data stream. The bitmap
gives us near perfect knowledge of everything the client has, not just
the objects on the edge of the graph cut expressed by the ACKed have
lines. Objects that existed earlier than the cut don't have to be
reset.

So the object enumeration phase goes very fast. The compressing
objects phase is also time reduced, as fewer objects are needed to be
considered, as fewer objects are being sent. With more complete
knowledge of the have side, more deltas can be reused as-is, rather
than re-encoded on the fly. This reduces the compressing objects phase
time even further. And with fewer objects to send, we have easily seen
trivial cases of a Linux kernel fetch that is 1 week behind Linus' tip
save like 200K on an 800K transfer. Its hard to hate having all of the
phases go faster, and transmit less data to the client.

> Among these objects, there will be ones that are expressed as a
> delta against what you are going to send, or as a delta against what
> you know the recipient must already have (if you are using thin pack
> transfer) in the packfiles you have, and you can send these deltas
> as-is without recomputation.

Yes. I point that out above before even reading this far in your message. :-(

> But there will be ones that are either expressed as a base in your
> packfile, or as a delta against something you are not going to send
> and you know that the recipient does not have.  In order to turn
> these objects into deltas, it may be necessary to have a way to
> record which "delta chain" each object belongs to

Excellent observation. Instead of performing a hacky delta base search
by looking at identical path names in the have set, finding another
candidate in the same cluster of deltas that the object is currently
encoded against would yield a pretty efficient delta on the wire. It
doesn't have to be another object in the same chain, it could also be
a sibling object that shares a similar transitive delta base.

>, and if you are
> introducing the mechanism to have extended sections in the new *.idx
> file, that may be a good place to do so.

Thus far we haven't done an extended sections modification to the
*.idx format. Its just a 'E003' version that requires the bitmaps to
be present below the CRC-32 table. But I can see how framing this as a
group of optional extensions would be worthwhile.

>  When you need to express
> an object that your bitmap told you to send (as opposed to rev-list
> walking told you with the paths to the objects), you can find other
> objects that belong to the same "delta chain" and that you know are
> available to the recipient when it starts to fixing the thin pack
> using that extra piece of information, in order to find the optimal
> delta base to encode such an object against.

Yes.

> Just for fun, I applied the attached patch and repacked the history
> leading to v1.7.12-rc3 with the default depth/window:
>
>   git rev-list --objects --all |
>   git pack-objects \
>       --no-reuse-delta --no-reuse-object --delta-base-offset \
>       [--no-namehash] pack
>
> with and without the experimental --no-namehash option.  The result
> is staggering.  With name-hash to group objects from the same path
> close together in the delta window, the resulting pack is 33M.
> Without the name-hash hint, the same pack is 163M!  Needless to say,
> keeping the objects that should be hashed together inside a delta
> window is really important.

Cute experiment. Yes that name hash works as a decent predictor of
which objects should go near each other in the delta window. So does
the descending size sort.

> An obvious way to record the "delta chain" is to simply keep the
> name_hash of each object in the pack, which would need 2 bytes per
> object in the pack, that would bloat pack_idx_entry size from 32
> bytes to 34 bytes per entry.  That way, after your bitmap discovers
> an object that cannot reuse existing deltas, you can throw it, other
> such objects with the same name-hash, and then objects that you know
> will be available to the recipient (you mark the last category of
> objects as "preferred base"), into the delta_list so that they are
> close together in the delta window.

Yes, this is one thought I had. Inside of JGit I think the name hash
is 32 bits, not 16 bits. Storing the name hash into the *.idx file
means we need to codify what the name hash algorithm is for a given
*.idx file version, and compatible implementations of Git must use the
same hash function. Thus far the name hash has been an in-memory
transient concept that doesn't need to be persisted across runs of the
packer. Storing it means we have to do that.

Another alternative is to try and group objects by delta clusters. Tag
each object with some marker for the non-delta base object the delta
chain it builds on top of. Delta groups are supposed to tend to be
bushy and not long, so many many objects should share the same common
base object. Most of them have the same name hash, but even when they
don't their contents were near enough together that the prior packer
run chose to combine these together. This avoids the need to encode
the name-hash function into the *.idx file format, and instead lets us
work with something that any implementation can easily observe from
the file's contents.

Tagging each object with its base could cost 4 bytes per entry, so
32->36 bytes... a 12% increase in file size. It may be possible to
organize these as bitmaps and get really good compression from deltas
that are located near each other in the pack stream, but doing the
lookup to identify which base would be more expensive. We would
probably have to walk the object's delta chain backwards to identify
the base, then inflate the bitmap to identify the candidate siblings.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] index-pack: produce pack index version 3
  2012-08-16 15:54             ` Shawn Pearce
@ 2012-08-16 16:26               ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-08-16 16:26 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nguyen Thai Ngoc Duy, git

Shawn Pearce <spearce@spearce.org> writes:

> On Wed, Aug 15, 2012 at 10:42 PM, Junio C Hamano <gitster@pobox.com> wrote:
> ...
>> An obvious way to record the "delta chain" is to simply keep the
>> name_hash of each object in the pack, which would need 2 bytes per
>> object in the pack, that would bloat pack_idx_entry size from 32
>> bytes to 34 bytes per entry.  That way, after your bitmap discovers
>> an object that cannot reuse existing deltas, you can throw it, other
>> such objects with the same name-hash, and then objects that you know
>> will be available to the recipient (you mark the last category of
>> objects as "preferred base"), into the delta_list so that they are
>> close together in the delta window.
>
> Yes, this is one thought I had. Inside of JGit I think the name hash
> is 32 bits, not 16 bits. Storing the name hash into the *.idx file
> means we need to codify what the name hash algorithm is for a given
> *.idx file version, and compatible implementations of Git must use the
> same hash function. Thus far the name hash has been an in-memory
> transient concept that doesn't need to be persisted across runs of the
> packer. Storing it means we have to do that.

Let's not go there.  We cannot resurrect the name hash out of *.pack
stream, which means index-pack cannot recreate it after receiving
objects over the network.  We would need to instead teach index-pack
to observe the delta chains, and come up with some "delta chain
identifier" (unique name to identify what you called "delta cluster"
in your response) on its own, and give it to each object when it
writes the *.idx file out.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-08-16 16:26 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-12 12:01 [PATCH/RFC] index-pack: produce pack index version 3 Nguyễn Thái Ngọc Duy
2012-08-12 19:11 ` Junio C Hamano
2012-08-12 19:49   ` Junio C Hamano
2012-08-13  0:57     ` Nguyen Thai Ngoc Duy
2012-08-13  7:40       ` Junio C Hamano
2012-08-14  0:46         ` Shawn Pearce
2012-08-14  1:27           ` Nguyen Thai Ngoc Duy
2012-08-16  5:42           ` Junio C Hamano
2012-08-16 15:54             ` Shawn Pearce
2012-08-16 16:26               ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).