[RFC 0/16] Introduce index file format version 5

* [RFC 0/16] Introduce index file format version 5
@ 2012-08-02 11:01 Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 01/16] Modify cache_header to prepare for other index formats Thomas Gummerer
                   ` (18 more replies)
  0 siblings, 19 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg

This series introduces the index version 5 file format. It does not
yet include partial loading or partial writing, although the new
format makes both possible.

A proof of concept for partial loading already existed and gave good
results, but it was broken in all but the general case, so it is not
included yet. (For timings see: http://thread.gmane.org/gmane.comp.version-control.git/201964/focus=202019)

The first 4 patches refactor the old code, splitting it up into
different functions in preparation for index-v5.

Patches 5 and 6 fix test cases for index-v5.

Patch 7 documents the index-v5 file format, and patch 8 makes the
in-memory format aware of stat_crc.

Patches 9..11 introduce the reader for index-v5. I've split them up
to read the main index first, then the resolve-undo data and then
the cache-tree, to make them easier to review.

The same goes for patches 12..14, which introduce the writer, again
split up into writing the main index, the cache-tree data and the
resolve-undo data.

Patch 15 adds an option to update-index to force-rewrite the index,
that is, to rewrite it even if nothing has changed. This is later used
for performance testing of both the reader and the writer.

Patch 16 adds the performance test, which compares the time for
force-rewrites for index-v[23], index-v4 and index-v5.

The default index format is still set to 3; it can be changed in
read-cache.c (INDEX_FORMAT_DEFAULT).

[PATCH 01/16] Modify cache_header to prepare for other index formats
[PATCH 02/16] Modify read functions to prepare for other index formats
[PATCH 03/16] Modify match_stat_basic to prepare for other index formats
[PATCH 04/16] Modify write functions to prepare for other index formats
[PATCH 05/16] t2104: Don't fail when index version is 5
[PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code
[PATCH 07/16] Add documentation of the index-v5 file format
[PATCH 08/16] Make in-memory format aware of stat_crc
[PATCH 09/16] Read index-v5
[PATCH 10/16] Read resolve-undo data
[PATCH 11/16] Read cache-tree in index-v5
[PATCH 12/16] Write index-v5
[PATCH 13/16] Write index-v5 cache-tree data
[PATCH 14/16] Write resolve-undo data for index-v5
[PATCH 15/16] update-index.c: add a force-rewrite option
[PATCH 16/16] p0002-index.sh: add perf test for the index formats

Documentation/technical/index-file-format-v5.txt |  281 ++++++++++++++++++++++++++++++++++
builtin/update-index.c                           |    5 +-
cache-tree.c                                     |  145 ++++++++++++++++++
cache-tree.h                                     |    7 +
cache.h                                          |   96 +++++++++++-
read-cache.c                                     | 1519 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
resolve-undo.c                                   |  129 ++++++++++++++++
resolve-undo.h                                   |    3 +
t/perf/p0002-index.sh                            |   33 ++++
t/t2104-update-index-skip-worktree.sh            |   15 +-
t/t3700-add.sh                                   |    1 +
test-index-version.c                             |    2 +-
12 files changed, 2082 insertions(+), 154 deletions(-)

* [PATCH 01/16] Modify cache_header to prepare for other index formats
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 12:15   ` Nguyen Thai Ngoc Duy
  2012-08-02 11:01 ` [PATCH 02/16] Modify read functions " Thomas Gummerer
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Modify the cache_header so that other index file formats can be
added while reusing the part that is common to every format.

The signature and version have to be present in every
version of the index file format, so that a given version of git
can check whether it can read the file, while other fields (e.g.
the number of entries for index v2/3/4) can differ from one file
format to another. The common part is therefore split into its
own struct.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
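For reviewers, the layout this split implies can be sketched as below. This is only an illustration under stated assumptions, not code from the patch: the names version_header, header_v2 and parse_headers are hypothetical stand-ins, and the buffer is copied with memcpy instead of being mapped.

```c
#include <arpa/inet.h>  /* ntohl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIGNATURE 0x44495243 /* "DIRC" */

/* Common to every index version (stand-in for cache_version_header). */
struct version_header {
	uint32_t hdr_signature;
	uint32_t hdr_version;
};

/* Specific to index v2/v3/v4 (stand-in for cache_header_v2). */
struct header_v2 {
	uint32_t hdr_entries;
};

/*
 * Parse the common header, then the v2-specific header that follows it.
 * Returns 0 on success, -1 if the buffer is too short or the signature
 * does not match. parse_headers is a hypothetical helper.
 */
static int parse_headers(const unsigned char *buf, size_t len,
			 uint32_t *version, uint32_t *entries)
{
	struct version_header hdr;
	struct header_v2 hdr_v2;

	if (len < sizeof(hdr) + sizeof(hdr_v2))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));
	memcpy(&hdr_v2, buf + sizeof(hdr), sizeof(hdr_v2));
	if (ntohl(hdr.hdr_signature) != CACHE_SIGNATURE)
		return -1;
	*version = ntohl(hdr.hdr_version);
	*entries = ntohl(hdr_v2.hdr_entries);
	return 0;
}
```

A caller only needs the first struct to decide whether it understands the file; the second one is read only once the version is known to be 2, 3 or 4.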
 cache.h              |  5 ++++-
 read-cache.c         | 20 +++++++++++++-------
 test-index-version.c |  2 +-
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/cache.h b/cache.h
index 6e9a243..d4028ef 100644
--- a/cache.h
+++ b/cache.h
@@ -99,9 +99,12 @@ unsigned long git_deflate_bound(git_zstream *, unsigned long);
  */
 
 #define CACHE_SIGNATURE 0x44495243	/* "DIRC" */
-struct cache_header {
+struct cache_version_header {
 	unsigned int hdr_signature;
 	unsigned int hdr_version;
+};
+
+struct cache_header_v2 {
 	unsigned int hdr_entries;
 };
 
diff --git a/read-cache.c b/read-cache.c
index ab00d02..c44b5f7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1247,7 +1247,7 @@ struct ondisk_cache_entry_extended {
 			    ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
 			    ondisk_cache_entry_size(ce_namelen(ce)))
 
-static int verify_hdr(struct cache_header *hdr, unsigned long size)
+static int verify_hdr(struct cache_version_header *hdr, unsigned long size)
 {
 	git_SHA_CTX c;
 	unsigned char sha1[20];
@@ -1409,7 +1409,8 @@ int read_index_from(struct index_state *istate, const char *path)
 	int fd, i;
 	struct stat st;
 	unsigned long src_offset;
-	struct cache_header *hdr;
+	struct cache_version_header *hdr;
+	struct cache_header_v2 *hdr_v2;
 	void *mmap;
 	size_t mmap_size;
 	struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
@@ -1433,7 +1434,7 @@ int read_index_from(struct index_state *istate, const char *path)
 
 	errno = EINVAL;
 	mmap_size = xsize_t(st.st_size);
-	if (mmap_size < sizeof(struct cache_header) + 20)
+	if (mmap_size < sizeof(struct cache_version_header) + 20)
 		die("index file smaller than expected");
 
 	mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
@@ -1442,11 +1443,13 @@ int read_index_from(struct index_state *istate, const char *path)
 		die_errno("unable to map index file");
 
 	hdr = mmap;
+	hdr_v2 =  mmap + sizeof(*hdr);
 	if (verify_hdr(hdr, mmap_size) < 0)
 		goto unmap;
 
+	hdr_v2 = mmap + sizeof(*hdr);
 	istate->version = ntohl(hdr->hdr_version);
-	istate->cache_nr = ntohl(hdr->hdr_entries);
+	istate->cache_nr = ntohl(hdr_v2->hdr_entries);
 	istate->cache_alloc = alloc_nr(istate->cache_nr);
 	istate->cache = xcalloc(istate->cache_alloc, sizeof(struct cache_entry *));
 	istate->initialized = 1;
@@ -1456,7 +1459,7 @@ int read_index_from(struct index_state *istate, const char *path)
 	else
 		previous_name = NULL;
 
-	src_offset = sizeof(*hdr);
+	src_offset = sizeof(*hdr) + sizeof(*hdr_v2);
 	for (i = 0; i < istate->cache_nr; i++) {
 		struct ondisk_cache_entry *disk_ce;
 		struct cache_entry *ce;
@@ -1757,7 +1760,8 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
 int write_index(struct index_state *istate, int newfd)
 {
 	git_SHA_CTX c;
-	struct cache_header hdr;
+	struct cache_version_header hdr;
+	struct cache_header_v2 hdr_v2;
 	int i, err, removed, extended, hdr_version;
 	struct cache_entry **cache = istate->cache;
 	int entries = istate->cache_nr;
@@ -1787,11 +1791,13 @@ int write_index(struct index_state *istate, int newfd)
 
 	hdr.hdr_signature = htonl(CACHE_SIGNATURE);
 	hdr.hdr_version = htonl(hdr_version);
-	hdr.hdr_entries = htonl(entries - removed);
+	hdr_v2.hdr_entries = htonl(entries - removed);
 
 	git_SHA1_Init(&c);
 	if (ce_write(&c, newfd, &hdr, sizeof(hdr)) < 0)
 		return -1;
+	if (ce_write(&c, newfd, &hdr_v2, sizeof(hdr_v2)) < 0)
+		return -1;
 
 	previous_name = (hdr_version == 4) ? &previous_name_buf : NULL;
 	for (i = 0; i < entries; i++) {
diff --git a/test-index-version.c b/test-index-version.c
index bfaad9e..f21372a 100644
--- a/test-index-version.c
+++ b/test-index-version.c
@@ -2,7 +2,7 @@
 
 int main(int argc, const char **argv)
 {
-	struct cache_header hdr;
+	struct cache_version_header hdr;
 	int version;
 
 	memset(&hdr,0,sizeof(hdr));
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 02/16] Modify read functions to prepare for other index formats
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 01/16] Modify cache_header to prepare for other index formats Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 03/16] Modify match_stat_basic " Thomas Gummerer
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Split the read_index_from function in two: one function that
stays the same for every index format and does the basic
operations such as verifying the header, and one function that
is specific to each index version and does the actual reading
of the index.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
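The resulting control flow can be sketched roughly as follows. This is a simplified illustration, not the patch itself: index_sketch, verify_version, read_v2 and read_index_sketch are hypothetical names, and the SHA-1 trailer check done by verify_hdr_v2 is left out.

```c
#include <arpa/inet.h>  /* ntohl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIGNATURE 0x44495243 /* "DIRC" */

/* Hypothetical, heavily reduced stand-in for struct index_state. */
struct index_sketch {
	uint32_t version;
	uint32_t nr_entries;
};

/* Version-independent step: check signature and supported version range. */
static int verify_version(const unsigned char *map, size_t size)
{
	uint32_t sig, version;

	if (size < 8)
		return -1;
	memcpy(&sig, map, 4);
	memcpy(&version, map + 4, 4);
	if (ntohl(sig) != CACHE_SIGNATURE)
		return -1;
	version = ntohl(version);
	if (version < 2 || 4 < version)
		return -1;
	return 0;
}

/* Version-specific step for v2..v4: the entry count follows the common header. */
static int read_v2(struct index_sketch *istate,
		   const unsigned char *map, size_t size)
{
	uint32_t version, entries;

	if (size < 12)
		return -1;
	memcpy(&version, map + 4, 4);
	memcpy(&entries, map + 8, 4);
	istate->version = ntohl(version);
	istate->nr_entries = ntohl(entries);
	return 0;
}

/* Generic entry point: shared checks first, then the per-version reader. */
static int read_index_sketch(struct index_sketch *istate,
			     const unsigned char *map, size_t size)
{
	if (verify_version(map, size) < 0)
		return -1;
	return read_v2(istate, map, size);
}
```

With this shape, a later format would presumably add its own reader next to read_v2 and be selected in the generic entry point.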
 cache.h      |   1 +
 read-cache.c | 108 +++++++++++++++++++++++++++++++++++------------------------
 2 files changed, 66 insertions(+), 43 deletions(-)

diff --git a/cache.h b/cache.h
index d4028ef..3aa70d8 100644
--- a/cache.h
+++ b/cache.h
@@ -435,6 +435,7 @@ extern int init_db(const char *template_dir, unsigned int flags);
 /* Initialize and use the cache information */
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const char **pathspec);
+extern void read_index_v2(struct index_state *, void *mmap, int);
 extern int read_index_from(struct index_state *, const char *path);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
diff --git a/read-cache.c b/read-cache.c
index c44b5f7..3d83f05 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1247,10 +1247,8 @@ struct ondisk_cache_entry_extended {
 			    ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
 			    ondisk_cache_entry_size(ce_namelen(ce)))
 
-static int verify_hdr(struct cache_version_header *hdr, unsigned long size)
+static int verify_hdr_version(struct cache_version_header *hdr, unsigned long size)
 {
-	git_SHA_CTX c;
-	unsigned char sha1[20];
 	int hdr_version;
 
 	if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))
@@ -1258,6 +1256,14 @@ static int verify_hdr(struct cache_version_header *hdr, unsigned long size)
 	hdr_version = ntohl(hdr->hdr_version);
 	if (hdr_version < 2 || 4 < hdr_version)
 		return error("bad index version %d", hdr_version);
+	return 0;
+}
+
+static int verify_hdr_v2(struct cache_version_header *hdr, unsigned long size)
+{
+	git_SHA_CTX c;
+	unsigned char sha1[20];
+
 	git_SHA1_Init(&c);
 	git_SHA1_Update(&c, hdr, size - 20);
 	git_SHA1_Final(sha1, &c);
@@ -1403,50 +1409,15 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
-/* remember to discard_cache() before reading a different cache! */
-int read_index_from(struct index_state *istate, const char *path)
+void read_index_v2(struct index_state *istate, void *mmap, int mmap_size)
 {
-	int fd, i;
-	struct stat st;
+	int i;
 	unsigned long src_offset;
 	struct cache_version_header *hdr;
 	struct cache_header_v2 *hdr_v2;
-	void *mmap;
-	size_t mmap_size;
 	struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
 
-	errno = EBUSY;
-	if (istate->initialized)
-		return istate->cache_nr;
-
-	errno = ENOENT;
-	istate->timestamp.sec = 0;
-	istate->timestamp.nsec = 0;
-	fd = open(path, O_RDONLY);
-	if (fd < 0) {
-		if (errno == ENOENT)
-			return 0;
-		die_errno("index file open failed");
-	}
-
-	if (fstat(fd, &st))
-		die_errno("cannot stat the open index");
-
-	errno = EINVAL;
-	mmap_size = xsize_t(st.st_size);
-	if (mmap_size < sizeof(struct cache_version_header) + 20)
-		die("index file smaller than expected");
-
-	mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
-	close(fd);
-	if (mmap == MAP_FAILED)
-		die_errno("unable to map index file");
-
 	hdr = mmap;
-	hdr_v2 =  mmap + sizeof(*hdr);
-	if (verify_hdr(hdr, mmap_size) < 0)
-		goto unmap;
-
 	hdr_v2 = mmap + sizeof(*hdr);
 	istate->version = ntohl(hdr->hdr_version);
 	istate->cache_nr = ntohl(hdr_v2->hdr_entries);
@@ -1472,8 +1443,6 @@ int read_index_from(struct index_state *istate, const char *path)
 		src_offset += consumed;
 	}
 	strbuf_release(&previous_name_buf);
-	istate->timestamp.sec = st.st_mtime;
-	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 
 	while (src_offset <= mmap_size - 20 - 8) {
 		/* After an array of active_nr index entries,
@@ -1493,12 +1462,65 @@ int read_index_from(struct index_state *istate, const char *path)
 		src_offset += 8;
 		src_offset += extsize;
 	}
+	return;
+unmap:
+	munmap(mmap, mmap_size);
+	die("index file corrupt");
+}
+
+/* remember to discard_cache() before reading a different cache! */
+int read_index_from(struct index_state *istate, const char *path)
+{
+	int fd;
+	struct stat st;
+	struct cache_version_header *hdr;
+	void *mmap;
+	size_t mmap_size;
+
+	errno = EBUSY;
+	if (istate->initialized)
+		return istate->cache_nr;
+
+	errno = ENOENT;
+	istate->timestamp.sec = 0;
+	istate->timestamp.nsec = 0;
+	fd = open(path, O_RDONLY);
+	if (fd < 0) {
+		if (errno == ENOENT)
+			return 0;
+		die_errno("index file open failed");
+	}
+
+	if (fstat(fd, &st))
+		die_errno("cannot stat the open index");
+
+	errno = EINVAL;
+	mmap_size = xsize_t(st.st_size);
+	if (mmap_size < sizeof(struct cache_version_header) + 20)
+		die("index file smaller than expected");
+
+	mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	close(fd);
+	if (mmap == MAP_FAILED)
+		die_errno("unable to map index file");
+
+	hdr = mmap;
+	if (verify_hdr_version(hdr, mmap_size) < 0)
+		goto unmap;
+
+	if (verify_hdr_v2(hdr, mmap_size) < 0)
+		goto unmap;
+
+	read_index_v2(istate, mmap, mmap_size);
+
+	istate->timestamp.sec = st.st_mtime;
+	istate->timestamp.nsec = ST_MTIME_NSEC(st);
+
 	munmap(mmap, mmap_size);
 	return istate->cache_nr;
 
 unmap:
 	munmap(mmap, mmap_size);
-	errno = EINVAL;
 	die("index file corrupt");
 }
 
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 03/16] Modify match_stat_basic to prepare for other index formats
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 01/16] Modify cache_header to prepare for other index formats Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 02/16] Modify read functions " Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 12:20   ` Nguyen Thai Ngoc Duy
  2012-08-02 11:01 ` [PATCH 04/16] Modify write functions " Thomas Gummerer
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Split match_stat_basic into one function that handles the
general case, which is the same for all index formats, and
one function that handles the parts specific to each index
file version.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 read-cache.c | 77 +++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 3d83f05..6a0af35 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -163,38 +163,10 @@ static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st)
 	return 0;
 }
 
-static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
+static int ce_match_stat_basic_v2(struct cache_entry *ce,
+				struct stat *st,
+				int changed)
 {
-	unsigned int changed = 0;
-
-	if (ce->ce_flags & CE_REMOVE)
-		return MODE_CHANGED | DATA_CHANGED | TYPE_CHANGED;
-
-	switch (ce->ce_mode & S_IFMT) {
-	case S_IFREG:
-		changed |= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;
-		/* We consider only the owner x bit to be relevant for
-		 * "mode changes"
-		 */
-		if (trust_executable_bit &&
-		    (0100 & (ce->ce_mode ^ st->st_mode)))
-			changed |= MODE_CHANGED;
-		break;
-	case S_IFLNK:
-		if (!S_ISLNK(st->st_mode) &&
-		    (has_symlinks || !S_ISREG(st->st_mode)))
-			changed |= TYPE_CHANGED;
-		break;
-	case S_IFGITLINK:
-		/* We ignore most of the st_xxx fields for gitlinks */
-		if (!S_ISDIR(st->st_mode))
-			changed |= TYPE_CHANGED;
-		else if (ce_compare_gitlink(ce))
-			changed |= DATA_CHANGED;
-		return changed;
-	default:
-		die("internal error: ce_mode is %o", ce->ce_mode);
-	}
 	if (ce->ce_mtime.sec != (unsigned int)st->st_mtime)
 		changed |= MTIME_CHANGED;
 	if (trust_ctime && ce->ce_ctime.sec != (unsigned int)st->st_ctime)
@@ -235,6 +207,43 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	return changed;
 }
 
+static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
+{
+	unsigned int changed = 0;
+
+	if (ce->ce_flags & CE_REMOVE)
+		return MODE_CHANGED | DATA_CHANGED | TYPE_CHANGED;
+
+	switch (ce->ce_mode & S_IFMT) {
+	case S_IFREG:
+		changed |= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;
+		/* We consider only the owner x bit to be relevant for
+		 * "mode changes"
+		 */
+		if (trust_executable_bit &&
+		    (0100 & (ce->ce_mode ^ st->st_mode)))
+			changed |= MODE_CHANGED;
+		break;
+	case S_IFLNK:
+		if (!S_ISLNK(st->st_mode) &&
+		    (has_symlinks || !S_ISREG(st->st_mode)))
+			changed |= TYPE_CHANGED;
+		break;
+	case S_IFGITLINK:
+		/* We ignore most of the st_xxx fields for gitlinks */
+		if (!S_ISDIR(st->st_mode))
+			changed |= TYPE_CHANGED;
+		else if (ce_compare_gitlink(ce))
+			changed |= DATA_CHANGED;
+		return changed;
+	default:
+		die("internal error: ce_mode is %o", ce->ce_mode);
+	}
+
+	changed = ce_match_stat_basic_v2(ce, st, changed);
+	return changed;
+}
+
 static int is_racy_timestamp(const struct index_state *istate, struct cache_entry *ce)
 {
 	return (!S_ISGITLINK(ce->ce_mode) &&
@@ -1443,7 +1452,6 @@ void read_index_v2(struct index_state *istate, void *mmap, int mmap_size)
 		src_offset += consumed;
 	}
 	strbuf_release(&previous_name_buf);
-
 	while (src_offset <= mmap_size - 20 - 8) {
 		/* After an array of active_nr index entries,
 		 * there can be arbitrary number of extended
@@ -1500,7 +1508,6 @@ int read_index_from(struct index_state *istate, const char *path)
 		die("index file smaller than expected");
 
 	mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
-	close(fd);
 	if (mmap == MAP_FAILED)
 		die_errno("unable to map index file");
 
@@ -1512,7 +1519,6 @@ int read_index_from(struct index_state *istate, const char *path)
 		goto unmap;
 
 	read_index_v2(istate, mmap, mmap_size);
-
 	istate->timestamp.sec = st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 
@@ -1802,9 +1808,6 @@ int write_index(struct index_state *istate, int newfd)
 		}
 	}
 
-	if (!istate->version)
-		istate->version = INDEX_FORMAT_DEFAULT;
-
 	/* demote version 3 to version 2 when the latter suffices */
 	if (istate->version == 3 || istate->version == 2)
 		istate->version = extended ? 3 : 2;
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 04/16] Modify write functions to prepare for other index formats
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (2 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 03/16] Modify match_stat_basic " Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 12:22   ` Nguyen Thai Ngoc Duy
  2012-08-02 11:01 ` [PATCH 05/16] t2104: Don't fail when index version is 5 Thomas Gummerer
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Modify the write_index function so that other index formats,
which are written in a different way, can be added. Also mark
all functions that shall only be used with v2-v4 as v2
functions.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
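One way to picture the wrapper is the sketch below. It is an assumption-laden illustration: index_sketch, write_v2, write_index_sketch and DEFAULT_VERSION are hypothetical names, the output goes to a caller-supplied buffer instead of a file descriptor, and the switch on the version is how a later format might hook in, whereas the patch itself always calls write_index_v2.

```c
#include <arpa/inet.h>  /* htonl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIGNATURE 0x44495243 /* "DIRC" */
#define DEFAULT_VERSION 3          /* stands in for INDEX_FORMAT_DEFAULT */

/* Hypothetical, heavily reduced stand-in for struct index_state. */
struct index_sketch {
	uint32_t version;
	uint32_t nr_entries;
};

/* v2..v4 writer: common header followed by the entry count. */
static size_t write_v2(const struct index_sketch *istate, unsigned char *out)
{
	uint32_t words[3];

	words[0] = htonl(CACHE_SIGNATURE);
	words[1] = htonl(istate->version);
	words[2] = htonl(istate->nr_entries);
	memcpy(out, words, sizeof(words));
	return sizeof(words); /* number of bytes written */
}

/* Generic entry point: fill in the default version, then pick a writer. */
static size_t write_index_sketch(struct index_sketch *istate,
				 unsigned char *out)
{
	if (!istate->version)
		istate->version = DEFAULT_VERSION;

	switch (istate->version) {
	case 2:
	case 3:
	case 4:
		return write_v2(istate, out);
	default:
		return 0; /* no writer for this version in the sketch */
	}
}
```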
 read-cache.c | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 6a0af35..1c804e1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1581,7 +1581,7 @@ static int ce_write_flush(git_SHA_CTX *context, int fd)
 	return 0;
 }
 
-static int ce_write(git_SHA_CTX *context, int fd, void *data, unsigned int len)
+static int ce_write_v2(git_SHA_CTX *context, int fd, void *data, unsigned int len)
 {
 	while (len) {
 		unsigned int buffered = write_buffer_len;
@@ -1603,13 +1603,13 @@ static int ce_write(git_SHA_CTX *context, int fd, void *data, unsigned int len)
 	return 0;
 }
 
-static int write_index_ext_header(git_SHA_CTX *context, int fd,
+static int write_index_ext_header_v2(git_SHA_CTX *context, int fd,
 				  unsigned int ext, unsigned int sz)
 {
 	ext = htonl(ext);
 	sz = htonl(sz);
-	return ((ce_write(context, fd, &ext, 4) < 0) ||
-		(ce_write(context, fd, &sz, 4) < 0)) ? -1 : 0;
+	return ((ce_write_v2(context, fd, &ext, 4) < 0) ||
+		(ce_write_v2(context, fd, &sz, 4) < 0)) ? -1 : 0;
 }
 
 static int ce_flush(git_SHA_CTX *context, int fd)
@@ -1634,7 +1634,7 @@ static int ce_flush(git_SHA_CTX *context, int fd)
 	return (write_in_full(fd, write_buffer, left) != left) ? -1 : 0;
 }
 
-static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
+static void ce_smudge_racily_clean_entry_v2(struct cache_entry *ce)
 {
 	/*
 	 * The only thing we care about in this function is to smudge the
@@ -1715,7 +1715,7 @@ static char *copy_cache_entry_to_ondisk(struct ondisk_cache_entry *ondisk,
 	}
 }
 
-static int ce_write_entry(git_SHA_CTX *c, int fd, struct cache_entry *ce,
+static int ce_write_entry_v2(git_SHA_CTX *c, int fd, struct cache_entry *ce,
 			  struct strbuf *previous_name)
 {
 	int size;
@@ -1755,7 +1755,7 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct cache_entry *ce,
 			      ce->name + common, ce_namelen(ce) - common);
 	}
 
-	result = ce_write(c, fd, ondisk, size);
+	result = ce_write_v2(c, fd, ondisk, size);
 	free(ondisk);
 	return result;
 }
@@ -1785,7 +1785,7 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
 		rollback_lock_file(lockfile);
 }
 
-int write_index(struct index_state *istate, int newfd)
+int write_index_v2(struct index_state *istate, int newfd)
 {
 	git_SHA_CTX c;
 	struct cache_version_header hdr;
@@ -1819,9 +1819,9 @@ int write_index(struct index_state *istate, int newfd)
 	hdr_v2.hdr_entries = htonl(entries - removed);
 
 	git_SHA1_Init(&c);
-	if (ce_write(&c, newfd, &hdr, sizeof(hdr)) < 0)
+	if (ce_write_v2(&c, newfd, &hdr, sizeof(hdr)) < 0)
 		return -1;
-	if (ce_write(&c, newfd, &hdr_v2, sizeof(hdr_v2)) < 0)
+	if (ce_write_v2(&c, newfd, &hdr_v2, sizeof(hdr_v2)) < 0)
 		return -1;
 
 	previous_name = (hdr_version == 4) ? &previous_name_buf : NULL;
@@ -1830,8 +1830,8 @@ int write_index(struct index_state *istate, int newfd)
 		if (ce->ce_flags & CE_REMOVE)
 			continue;
 		if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
-			ce_smudge_racily_clean_entry(ce);
-		if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
+			ce_smudge_racily_clean_entry_v2(ce);
+		if (ce_write_entry_v2(&c, newfd, ce, previous_name) < 0)
 			return -1;
 	}
 	strbuf_release(&previous_name_buf);
@@ -1841,8 +1841,8 @@ int write_index(struct index_state *istate, int newfd)
 		struct strbuf sb = STRBUF_INIT;
 
 		cache_tree_write(&sb, istate->cache_tree);
-		err = write_index_ext_header(&c, newfd, CACHE_EXT_TREE, sb.len) < 0
-			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
+		err = write_index_ext_header_v2(&c, newfd, CACHE_EXT_TREE, sb.len) < 0
+			|| ce_write_v2(&c, newfd, sb.buf, sb.len) < 0;
 		strbuf_release(&sb);
 		if (err)
 			return -1;
@@ -1851,9 +1851,9 @@ int write_index(struct index_state *istate, int newfd)
 		struct strbuf sb = STRBUF_INIT;
 
 		resolve_undo_write(&sb, istate->resolve_undo);
-		err = write_index_ext_header(&c, newfd, CACHE_EXT_RESOLVE_UNDO,
+		err = write_index_ext_header_v2(&c, newfd, CACHE_EXT_RESOLVE_UNDO,
 					     sb.len) < 0
-			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
+			|| ce_write_v2(&c, newfd, sb.buf, sb.len) < 0;
 		strbuf_release(&sb);
 		if (err)
 			return -1;
@@ -1866,6 +1866,14 @@ int write_index(struct index_state *istate, int newfd)
 	return 0;
 }
 
+int write_index(struct index_state *istate, int newfd)
+{
+	if (!istate->version)
+		istate->version = INDEX_FORMAT_DEFAULT;
+
+	return write_index_v2(istate, newfd);
+}
+
 /*
  * Read the index file that is potentially unmerged into given
  * index_state, dropping any unmerged entries.  Returns true if
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 05/16] t2104: Don't fail when index version is 5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (3 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 04/16] Modify write functions " Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-03  8:22   ` Thomas Rast
  2012-08-02 11:01 ` [PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code Thomas Gummerer
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

The test t2104 currently checks whether the index version is correctly
reduced to 2 or increased to 3 when an entry needs extended flags
or no longer uses them. Since index-v5 doesn't have extended
flags (the extended flags are part of the normal flags), we simply
check whether the index version is 2/3 (whichever is correct for
that test) or 5.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 t/t2104-update-index-skip-worktree.sh | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t2104-update-index-skip-worktree.sh b/t/t2104-update-index-skip-worktree.sh
index 1d0879b..e66f23e 100755
--- a/t/t2104-update-index-skip-worktree.sh
+++ b/t/t2104-update-index-skip-worktree.sh
@@ -28,8 +28,9 @@ test_expect_success 'setup' '
 	git ls-files -t | test_cmp expect.full -
 '
 
-test_expect_success 'index is at version 2' '
-	test "$(test-index-version < .git/index)" = 2
+test_expect_success 'index is at version 2 or version 5' '
+	test "$(test-index-version < .git/index)" = 2 ||
+	test "$(test-index-version < .git/index)" = 5
 '
 
 test_expect_success 'update-index --skip-worktree' '
@@ -37,8 +38,9 @@ test_expect_success 'update-index --skip-worktree' '
 	git ls-files -t | test_cmp expect.skip -
 '
 
-test_expect_success 'index is at version 3 after having some skip-worktree entries' '
-	test "$(test-index-version < .git/index)" = 3
+test_expect_success 'index is at version 3 or version 5 after having some skip-worktree entries' '
+	test "$(test-index-version < .git/index)" = 3 ||
+	test "$(test-index-version < .git/index)" = 5
 '
 
 test_expect_success 'ls-files -t' '
@@ -50,8 +52,9 @@ test_expect_success 'update-index --no-skip-worktree' '
 	git ls-files -t | test_cmp expect.full -
 '
 
-test_expect_success 'index version is back to 2 when there is no skip-worktree entry' '
-	test "$(test-index-version < .git/index)" = 2
+test_expect_success 'index version is back to 2 or 5 when there is no skip-worktree entry' '
+	test "$(test-index-version < .git/index)" = 2 ||
+	test "$(test-index-version < .git/index)" = 5
 '
 
 test_done
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (4 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 05/16] t2104: Don't fail when index version is 5 Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 07/16] Add documentation of the index-v5 file format Thomas Gummerer
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

The new git racy code uses the mtime of cache entries to smudge
a racily clean entry, and leaves the work of checking the file system
to see whether the entry has really changed to the reader. This
interferes with this test, because the entry is racily smudged and
thus has mtime 0. We wait 1 second to avoid smudging the entry and
to get correct test results.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
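The smudging scheme described above could look roughly like this. It is a sketch under assumptions, not the series' code: entry_sketch and the three function names are hypothetical, only whole seconds are compared, and the racy condition (entry mtime at or after the index timestamp) is assumed to match the existing is_racy_timestamp check.

```c
/* Hypothetical, reduced stand-in for struct cache_entry. */
struct entry_sketch {
	unsigned int mtime_sec;
};

/*
 * Assumed racy condition: the entry was modified at or after the time
 * the index itself was last written.
 */
static int is_racy(const struct entry_sketch *ce, unsigned int index_mtime_sec)
{
	return index_mtime_sec != 0 && index_mtime_sec <= ce->mtime_sec;
}

/* Writer side: mark a racy entry by zeroing its mtime. */
static void smudge_if_racy(struct entry_sketch *ce,
			   unsigned int index_mtime_sec)
{
	if (is_racy(ce, index_mtime_sec))
		ce->mtime_sec = 0;
}

/*
 * Reader side: a zero mtime cannot be trusted, so the reader has to
 * compare the file content itself.
 */
static int needs_content_check(const struct entry_sketch *ce)
{
	return ce->mtime_sec == 0;
}
```

An entry whose mtime is strictly older than the index timestamp is not smudged and keeps its recorded mtime.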
 t/t3700-add.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index 874b3a6..4d70805 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -184,6 +184,7 @@ test_expect_success 'git add --refresh with pathspec' '
 	echo >foo && echo >bar && echo >baz &&
 	git add foo bar baz && H=$(git rev-parse :foo) && git rm -f foo &&
 	echo "100644 $H 3	foo" | git update-index --index-info &&
+	sleep 1 &&
 	test-chmtime -60 bar baz &&
 	>expect &&
 	git add --refresh bar >actual &&
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 07/16] Add documentation of the index-v5 file format
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (5 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 08/16] Make in-memory format aware of stat_crc Thomas Gummerer
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Add documentation of the index file format version 5 to
Documentation/technical.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
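As a reading aid for the 20-byte header documented in this patch, here is a minimal parsing sketch. The field names follow the document (sig, vnr, ndir, nfile, fblockoffset); the struct header_v5_sketch and the function parse_v5_header are hypothetical, and the headercrc check over the header and extension offsets is not attempted.

```c
#include <arpa/inet.h>  /* ntohl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIGNATURE 0x44495243 /* "DIRC" */

/* The five 32-bit fields of the documented 20-byte header. */
struct header_v5_sketch {
	uint32_t sig;
	uint32_t vnr;
	uint32_t ndir;
	uint32_t nfile;
	uint32_t fblockoffset;
};

/*
 * Decode the header from network byte order.
 * Returns 0 on success, -1 if the buffer is too short, the signature
 * is wrong, or the version is not 5.
 */
static int parse_v5_header(const unsigned char *buf, size_t len,
			   struct header_v5_sketch *out)
{
	uint32_t words[5];
	size_t i;

	if (len < sizeof(words))
		return -1;
	memcpy(words, buf, sizeof(words));
	for (i = 0; i < 5; i++)
		words[i] = ntohl(words[i]);
	if (words[0] != CACHE_SIGNATURE || words[1] != 5)
		return -1;
	out->sig = words[0];
	out->vnr = words[1];
	out->ndir = words[2];
	out->nfile = words[3];
	out->fblockoffset = words[4];
	return 0;
}
```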
 Documentation/technical/index-file-format-v5.txt | 281 +++++++++++++++++++++++
 1 file changed, 281 insertions(+)
 create mode 100644 Documentation/technical/index-file-format-v5.txt

diff --git a/Documentation/technical/index-file-format-v5.txt b/Documentation/technical/index-file-format-v5.txt
new file mode 100644
index 0000000..6253e34
--- /dev/null
+++ b/Documentation/technical/index-file-format-v5.txt
@@ -0,0 +1,281 @@
+GIT index format
+================
+
+== The git index file format
+
+   The git index file (.git/index) documents the status of the files
+     in the git staging area.
+
+   The staging area is used for preparing commits, merging, etc.
+
+   All binary numbers are in network byte order. Version 5 is described
+     here.
+
+   - A 20-byte header consisting of
+
+     sig (32-bits): Signature:
+       The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
+
+     vnr (32-bits): Version number:
+       The currently supported versions are 2, 3, 4 and 5.
+
+     ndir (32-bits): number of directories in the index.
+
+     nfile (32-bits): number of file entries in the index.
+
+     fblockoffset (32-bits): offset to the file block, relative to the
+       beginning of the file.
+
+   - Offset to the extensions.
+
+     nextensions (32-bits): number of extensions.
+
+     extoffset (32-bits): offset to the extension. (Possibly none, as
+       many as indicated in the 4-byte number of extensions)
+
+     headercrc (32-bits): crc checksum for the header and extension
+       offsets
+
+   - diroffsets (ndir * directory offsets): A directory offset for each
+       of the ndir directories in the index, sorted by pathname (of the
+       directory it's pointing to) (see below). The diroffsets are relative
+       to the beginning of the direntries block. [1]
+
+   - direntries (ndir * directory entries): A directory entry for each
+       of the ndir directories in the index, sorted by pathname (see
+       below). [2]
+
+   - fileoffsets (nfile * file offsets): A file offset for each of the
+       nfile files in the index (see below). The file offsets are relative
+       to the beginning of the fileentries block. [1]
+
+   - fileentries (nfile * file entries): A file entry for each of the
+       nfile files in the index (see below).
+
+   - crdata: A number of entries for conflicted data/resolved conflicts
+       (see below).
+
+   - Extensions (Currently none, see below in the future)
+
+     Extensions are identified by signature. Optional extensions can
+     be ignored if GIT does not understand them.
+
+     GIT supports an arbitrary number of extensions, but currently none
+     are implemented. [3]
+
+     extsig (32-bits): extension signature. If the first byte is 'A'..'Z'
+     the extension is optional and can be ignored.
+
+     extsize (32-bits): size of the extension, excluding the header
+       (extsig, extsize, extchecksum).
+
+     extchecksum (32-bits): crc32 checksum of the extension signature
+       and size.
+
+    - Extension data.
+
+
+== Directory offsets (diroffsets)
+
+  diroffset (32-bits): offset to the directory relative to the beginning
+    of the index file. There are ndir + 1 offsets in the diroffset table,
+    the last is pointing to the end of the last direntry. With this last
+    entry, we can replace the strlen when reading each pathname, by
+    calculating its length with the offsets.
+
+  This part is needed for making the directory entries bisectable and
+    thus allowing a binary search.
+
+== Directory entry (direntries)
+  
+  Directory entries are sorted in lexicographic order by the name 
+    of their path starting with the root.
+  
+  pathname (variable length, nul terminated): relative to top level
+    directory (without the leading slash). '/' is used as path
+    separator. A string of length 0 ('') indicates the root directory.
+    The special path components ".", and ".." (without quotes) are
+    disallowed. The path also includes a trailing slash. [9]
+
+  foffset (32-bits): offset to the lexicographically first file in 
+    the file offsets (fileoffsets), relative to the beginning of
+    the fileoffset block.
+
+  cr (32-bits): offset to conflicted/resolved data at the end of the
+    index. 0 if there is no such data. [4]
+
+  ncr (32-bits): number of conflicted/resolved data entries at the
+    end of the index if the offset is non 0. If cr is 0, ncr is
+    also 0.
+
+  nsubtrees (32-bits): number of subtrees this tree has in the index.
+
+  nfiles (32-bits): number of files in the directory, that are in
+    the index.
+
+  nentries (32-bits): number of entries in the index that are covered
+    by the tree this entry represents. (-1 if the entry is invalid).
+    This number includes all the files in this tree, recursively.
+
+  objname (160-bits): object name for the object that would result
+    from writing this span of index as a tree. This is only valid
+    if nentries is valid, meaning the cache-tree is valid.
+
+  flags (16-bits): 'flags' field split into (high to low bits) (For
+    D/F conflicts)
+    
+    stage (2-bits): stage of the directory during merge
+
+    14-bit unused
+
+  dircrc (32-bits): crc32 checksum for each directory entry.
+
+  The nentries and objname fields (24 bytes: 4-byte number of entries +
+    160-bit object name) are for the cache tree. An entry can be in an
+    invalidated state, which is represented by -1 in the nentries field.
+
+  The entries are written out in the top-down, depth-first order. The
+    first entry represents the root level of the repository, followed by
+    the first subtree - let's call it A - of the root level, followed by
+    the first subtree of A, ... There is no prefix compression for
+    directories.
+
+== File offsets (fileoffsets)
+
+  fileoffset (32-bits): offset to the file.
+
+  This part is needed for making the file entries bisectable and
+    thus allowing a binary search. There are nfile + 1 offsets in the
+    fileoffset table, the last is pointing to the end of the last
+    fileentry. With this last entry, we can replace the strlen when
+    reading each filename, by calculating its length with the offsets.
+
+== File entry (fileentries)
+  
+  File entries are sorted in ascending order on the name field, after the
+  respective offset given by the directory entries. All file names are
+  prefix compressed, meaning the file name is relative to the directory.
+
+  filename (variable length, nul terminated). The exact encoding is 
+    undefined, but the filename cannot contain a NUL byte (iow, the same
+    encoding as a UNIX pathname).
+
+  flags (16-bits): 'flags' field split into (high to low bits)
+
+    assumevalid (1-bit): assume-valid flag
+
+    intenttoadd (1-bit): intent-to-add flag, used by "git add -N".
+      Extended flag in index v3.
+
+    stage (2-bit): stage of the file during merge
+
+    skipworktree (1-bit): skip-worktree flag, used by sparse checkout.
+      Extended flag in index v3.
+
+    11-bit unused, must be zero [6]
+
+  mode (16-bits): file mode, split into (high to low bits)
+
+    objtype (4-bits): object type
+      valid values in binary are 1000 (regular file), 1010 (symbolic
+      link) and 1110 (gitlink)
+
+    3-bit unused
+
+    permission (9-bits): unix permission. Only 0755 and 0644 are valid
+      for regular files. Symbolic links and gitlinks have value 0 in 
+      this field.
+
+  mtimes (32-bits): mtime seconds, the last time a file's data changed
+    this is stat(2) data
+
+  mtimens (32-bits): mtime nanosecond fractions
+    this is stat(2) data
+
+  statcrc (32-bits): crc32 checksum over ctime seconds, ctime
+    nanoseconds, ino, file size, dev, uid, gid (All stat(2) data
+    except mtime) [7]
+
+  objhash (160-bits): SHA-1 for the represented object
+
+# This will probably be changed in future versions as discussed here: http://colabti.org/irclogger/irclogger_log/git-devel?date=2012-06-21
+  entrycrc (32-bits): crc32 checksum for the file entry. The crc code
+    includes the offset to the file.
+
+== Conflict data
+
+  A conflict is represented in the index as a set of higher stage entries.
+  These entries are stored at the end of the index. When a conflict is 
+  resolved (e.g. with "git add path"), a bit is flipped to indicate that
+  the conflict is resolved, but the entries will be kept, so that
+  conflicts can be recreated (e.g. with "git checkout -m"), in case users
+  want to redo a conflict resolution from scratch.
+
+  The first part of a conflict (usually stage 1) will be stored both in
+  the entries part of the index and in the conflict part. All other parts
+  will only be stored in the conflict part.
+
+  filename (variable length, nul terminated): filename of the entry,
+    relative to its containing directory.
+
+  nfileconflicts (32-bits): number of conflicts for the file [8]
+
+  flags (nfileconflicts * flags) (16-bits): 'flags' field split into:
+    
+    conflicted (1-bit): conflicted state (conflicted/resolved) (1 if
+      conflicted)
+
+    stage (2-bits): stage during merge.
+   
+    13-bit unused
+
+  entry_mode (nfileconflicts * entry mode) (16-bits): octal numbers, entry
+    mode of each entry in the different stages. (How many is defined by
+    the 4-byte number before)
+
+  objectnames (nfileconflicts * object names) (160-bits): object names 
+    of the different stages.
+
+  conflictcrc (32-bits): crc32 checksum over conflict data.
+
+== Design explanations
+
+[1] The directory and file offsets are included in the index format
+    to enable bisectability of the index, for binary searches. Updating
+    a single entry and partial reading will benefit from this.
+
+[2] The directories are saved in their own block, to be able to
+    quickly search for a directory in the index. They include an
+    offset to the (lexically) first file in the directory.
+
+[3] The data of the cache-tree extension and the resolve undo
+    extension is now part of the index itself, but if other extensions
+    come up in the future, there is no need to change the index, they
+    can simply be added at the end.
+
+[4] To avoid rewrites of the whole index when there are conflicts or
+    conflicts are being resolved, conflicted data will be stored at
+    the end of the index. To mark the conflict resolved, just a bit
+    has to be flipped. The data will still be there, if a user wants
+    to redo the conflict resolution.
+
+[5] Since only 4 modes are effectively allowed in git but 32-bit are
+    used to store them, having a two bit flag for the mode is enough
+    and saves 4 byte per entry.
+
+[6] The length of the file name was dropped, since each file name is
+    nul terminated anyway.
+
+[7] Since all stat data (except mtime and ctime) is just used for
+    checking if a file has changed, a checksum of the data is enough.
+    In addition to that Thomas Rast suggested ctime could be ditched
+    completely (core.trustctime=false) and thus included in the
+    checksum. This would save 24 bytes per index entry, which would
+    be about 4 MB on the Webkit index.
+    (Thanks for the suggestion to Michael Haggerty)
+
+[8] Since there can be more than one stage #1 entry, it is necessary to
+    know the number of conflict data entries there are.
+
+[9] As Michael Haggerty pointed out on the mailing list, storing the
+    trailing slash will simplify a few operations.
-- 
1.7.10.886.gdf6792c.dirty
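
As a reading aid for the format description above, here is a minimal
sketch of how the fixed 20-byte header and the file-entry flags could be
decoded. It is not code from this series: struct v5_header,
parse_v5_header(), get_be32() and the V5_* macros are hypothetical names
made up for the illustration. The bit positions are an assumption taken
from the "high to low bits" ordering in the flags description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of the fixed 20-byte v5 header. */
struct v5_header {
	uint32_t version;      /* vnr */
	uint32_t ndir;         /* number of directories */
	uint32_t nfile;        /* number of file entries */
	uint32_t fblockoffset; /* offset to the file block */
};

/* Read a 32-bit value stored in network byte order (big-endian). */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Returns 0 on success, -1 if the buffer is shorter than 20 bytes,
 * the signature is not "DIRC", or the version is not 5.
 */
static int parse_v5_header(const unsigned char *buf, size_t len,
			   struct v5_header *out)
{
	assert(buf && out);
	if (len < 20 || memcmp(buf, "DIRC", 4))
		return -1;
	out->version = get_be32(buf + 4);
	if (out->version != 5)
		return -1;
	out->ndir = get_be32(buf + 8);
	out->nfile = get_be32(buf + 12);
	out->fblockoffset = get_be32(buf + 16);
	return 0;
}

/* File-entry flag bits, high to low (assumed layout, see above). */
#define V5_ASSUME_VALID  0x8000
#define V5_INTENT_TO_ADD 0x4000
#define V5_STAGE_MASK    0x3000
#define V5_STAGE_SHIFT   12
#define V5_SKIP_WORKTREE 0x0800

/* Extract the 2-bit merge stage from a 16-bit file-entry flags field. */
static unsigned int v5_stage(uint16_t flags)
{
	return (flags & V5_STAGE_MASK) >> V5_STAGE_SHIFT;
}
```

A caller would pass the first 20 bytes of the mapped index file to
parse_v5_header() and only then go on to read the extension offsets and
headercrc that follow the fixed header.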

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 08/16] Make in-memory format aware of stat_crc
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (6 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 07/16] Add documentation of the index-v5 file format Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 11:01 ` [PATCH 09/16] Read index-v5 Thomas Gummerer
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Make the in-memory format aware of the stat_crc used by index-v5.
It is simply ignored by index versions prior to v5.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache.h      |  1 +
 read-cache.c | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/cache.h b/cache.h
index 3aa70d8..7247944 100644
--- a/cache.h
+++ b/cache.h
@@ -133,6 +133,7 @@ struct cache_entry {
 	unsigned int ce_flags;
 	unsigned int ce_namelen;
 	unsigned char sha1[20];
+	uint32_t ce_stat_crc;
 	struct cache_entry *next;
 	struct cache_entry *dir_next;
 	char name[FLEX_ARRAY]; /* more */
diff --git a/read-cache.c b/read-cache.c
index 1c804e1..dfe5ee2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -67,6 +67,31 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
 
+static uint32_t calculate_stat_crc(struct cache_entry *ce)
+{
+	unsigned int ctimens = 0;
+	uint32_t stat, stat_crc;
+
+	stat = htonl(ce->ce_ctime.sec);
+	stat_crc = crc32(0, (Bytef*)&stat, 4);
+#ifdef USE_NSEC
+	ctimens = ce->ce_ctime.nsec;
+#endif
+	stat = htonl(ctimens);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_ino);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_size);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_dev);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_uid);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_gid);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	return stat_crc;
+}
+
 /*
  * This only updates the "non-critical" parts of the directory
  * cache, ie the parts that aren't tracked by GIT, and only used
@@ -89,6 +114,8 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 
 	if (S_ISREG(st->st_mode))
 		ce_mark_uptodate(ce);
+
+	ce->ce_stat_crc = calculate_stat_crc(ce);
 }
 
 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
-- 
1.7.10.886.gdf6792c.dirty
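
To illustrate the idea behind calculate_stat_crc() without depending on
zlib or on git's structs, here is a self-contained sketch. It is an
assumption-laden illustration, not the code from the patch:
crc32_update(), crc_be32(), struct stat_fields and stat_crc() are
hypothetical names, and the bitwise CRC-32 below stands in for zlib's
crc32(), which the patch actually calls. The fields are fed in the same
order as in the patch (ctime seconds, ctime nanoseconds, ino, size, dev,
uid, gid), each as a 32-bit value in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected polynomial 0xEDB88320). Like zlib's
 * crc32(), pass 0 to start and the previous result to continue.
 */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf,
			     size_t len)
{
	size_t i;
	int bit;

	crc = ~crc;
	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++) {
			if (crc & 1)
				crc = (crc >> 1) ^ 0xEDB88320u;
			else
				crc >>= 1;
		}
	}
	return ~crc;
}

/* Hypothetical container for the stat(2) data covered by statcrc. */
struct stat_fields {
	uint32_t ctime_sec, ctime_nsec, ino, size, dev, uid, gid;
};

/* Feed one 32-bit value into the CRC in network byte order. */
static uint32_t crc_be32(uint32_t crc, uint32_t value)
{
	unsigned char be[4];

	be[0] = (unsigned char)(value >> 24);
	be[1] = (unsigned char)(value >> 16);
	be[2] = (unsigned char)(value >> 8);
	be[3] = (unsigned char)value;
	return crc32_update(crc, be, 4);
}

/* Checksum over all stat data except mtime, in the patch's order. */
static uint32_t stat_crc(const struct stat_fields *sf)
{
	uint32_t crc = 0;

	assert(sf);
	crc = crc_be32(crc, sf->ctime_sec);
	crc = crc_be32(crc, sf->ctime_nsec);
	crc = crc_be32(crc, sf->ino);
	crc = crc_be32(crc, sf->size);
	crc = crc_be32(crc, sf->dev);
	crc = crc_be32(crc, sf->uid);
	crc = crc_be32(crc, sf->gid);
	return crc;
}
```

Comparing a stored stat_crc() value against one computed from a fresh
stat(2) call tells the reader whether any of the covered fields changed,
but not which one; that is the trade-off for storing 4 bytes instead of
the individual fields.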

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/16] Read index-v5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (7 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 08/16] Make in-memory format aware of stat_crc Thomas Gummerer
@ 2012-08-02 11:01 ` Thomas Gummerer
  2012-08-02 12:45   ` Nguyen Thai Ngoc Duy
  2012-08-02 11:02 ` [PATCH 10/16] Read resolve-undo data Thomas Gummerer
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:01 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Make git read the index file version 5 without complaining.

This version of the reader reads neither the cache-tree
nor the resolve-undo data, but doesn't choke on an index that
includes such data.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache.h      |  79 +++++++++
 read-cache.c | 570 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 644 insertions(+), 5 deletions(-)

diff --git a/cache.h b/cache.h
index 7247944..91d9b45 100644
--- a/cache.h
+++ b/cache.h
@@ -108,6 +108,13 @@ struct cache_header_v2 {
 	unsigned int hdr_entries;
 };
 
+struct cache_header_v5 {
+	unsigned int hdr_ndir;
+	unsigned int hdr_nfile;
+	unsigned int hdr_fblockoffset;
+	unsigned int hdr_nextension;
+};
+
 #define INDEX_FORMAT_LB 2
 #define INDEX_FORMAT_UB 4
 
@@ -121,6 +128,15 @@ struct cache_time {
 	unsigned int nsec;
 };
 
+/*
+ * The *next pointer is used in read_entries_v5 for holding
+ * all the elements of a directory, and points to the next
+ * cache_entry in a directory.
+ *
+ * It is reset by the add_name_hash call in set_index_entry
+ * to set it to point to the next cache_entry in the
+ * correct in-memory format ordering.
+ */
 struct cache_entry {
 	struct cache_time ce_ctime;
 	struct cache_time ce_mtime;
@@ -139,12 +155,58 @@ struct cache_entry {
 	char name[FLEX_ARRAY]; /* more */
 };
 
+struct directory_entry {
+	struct directory_entry *next;
+	struct directory_entry *next_hash;
+	struct cache_entry *ce;
+	struct cache_entry *ce_last;
+	struct conflict_entry *conflict;
+	struct conflict_entry *conflict_last;
+	unsigned int conflict_size;
+	unsigned int de_foffset;
+	unsigned int de_cr;
+	unsigned int de_ncr;
+	unsigned int de_nsubtrees;
+	unsigned int de_nfiles;
+	unsigned int de_nentries;
+	unsigned char sha1[20];
+	unsigned short de_flags;
+	unsigned int de_pathlen;
+	char pathname[FLEX_ARRAY];
+};
+
+struct conflict_part {
+	struct conflict_part *next;
+	unsigned short flags;
+	unsigned short entry_mode;
+	unsigned char sha1[20];
+};
+
+struct conflict_entry {
+	struct conflict_entry *next;
+	unsigned int nfileconflicts;
+	struct conflict_part *entries;
+	unsigned int namelen;
+	unsigned int pathlen;
+	char name[FLEX_ARRAY];
+};
+
+struct ondisk_conflict_part {
+	unsigned short flags;
+	unsigned short entry_mode;
+	unsigned char sha1[20];
+};
+
 #define CE_NAMEMASK  (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_EXTENDED  (0x4000)
 #define CE_VALID     (0x8000)
 #define CE_STAGESHIFT 12
 
+#define CONFLICT_CONFLICTED (0x8000)
+#define CONFLICT_STAGESHIFT 13
+#define CONFLICT_STAGEMASK (0x6000)
+
 /*
  * Range 0xFFFF0000 in ce_flags is divided into
  * two parts: in-memory flags and on-disk ones.
@@ -178,6 +240,18 @@ struct cache_entry {
 #define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD | CE_SKIP_WORKTREE)
 
 /*
+ * Representation of the extended on-disk flags in the v5 format.
+ * They must not collide with the ordinary on-disk flags, and need to
+ * fit in 16 bits.  Note however that v5 does not save the name
+ * length.
+ */
+#define CE_INTENT_TO_ADD_V5  (0x4000)
+#define CE_SKIP_WORKTREE_V5  (0x0800)
+#if (CE_VALID|CE_STAGEMASK) & (CE_INTENT_TO_ADD_V5|CE_SKIP_WORKTREE_V5)
+#error "v5 on-disk flags collide with ordinary on-disk flags"
+#endif
+
+/*
  * Safeguard to avoid saving wrong flags:
  *  - CE_EXTENDED2 won't get saved until its semantic is known
  *  - Bits in 0x0000FFFF have been saved in ce_flags already
@@ -215,6 +289,8 @@ static inline unsigned create_ce_flags(unsigned stage)
 #define ce_skip_worktree(ce) ((ce)->ce_flags & CE_SKIP_WORKTREE)
 #define ce_mark_uptodate(ce) ((ce)->ce_flags |= CE_UPTODATE)
 
+#define conflict_stage(c) ((CONFLICT_STAGEMASK & (c)->flags) >> CONFLICT_STAGESHIFT)
+
 #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
 static inline unsigned int create_ce_mode(unsigned int mode)
 {
@@ -261,6 +337,8 @@ static inline unsigned int canon_mode(unsigned int mode)
 }
 
 #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
+#define directory_entry_size(len) (offsetof(struct directory_entry,pathname) + (len) + 1)
+#define conflict_entry_size(len) (offsetof(struct conflict_entry,name) + (len) + 1)
 
 struct index_state {
 	struct cache_entry **cache;
@@ -437,6 +515,7 @@ extern int init_db(const char *template_dir, unsigned int flags);
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const char **pathspec);
 extern void read_index_v2(struct index_state *, void *mmap, int);
+extern void read_index_v5(struct index_state *, void *mmap, int, int);
 extern int read_index_from(struct index_state *, const char *path);
 extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
diff --git a/read-cache.c b/read-cache.c
index dfe5ee2..884c2a7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -234,6 +234,55 @@ static int ce_match_stat_basic_v2(struct cache_entry *ce,
 	return changed;
 }
 
+static int match_stat_crc(struct stat *st, uint32_t expected_crc)
+{
+	uint32_t data, stat_crc = 0;
+	unsigned int ctimens = 0;
+
+	data = htonl(st->st_ctime);
+	stat_crc = crc32(0, (Bytef*)&data, 4);
+#ifdef USE_NSEC
+	ctimens = ST_CTIME_NSEC(*st);
+#endif
+	data = htonl(ctimens);
+	stat_crc = crc32(stat_crc, (Bytef*)&data, 4);
+	data = htonl(st->st_ino);
+	stat_crc = crc32(stat_crc, (Bytef*)&data, 4);
+	data = htonl(st->st_size);
+	stat_crc = crc32(stat_crc, (Bytef*)&data, 4);
+	data = htonl(st->st_dev);
+	stat_crc = crc32(stat_crc, (Bytef*)&data, 4);
+	data = htonl(st->st_uid);
+	stat_crc = crc32(stat_crc, (Bytef*)&data, 4);
+	data = htonl(st->st_gid);
+	stat_crc = crc32(stat_crc, (Bytef*)&data, 4);
+
+	return stat_crc == expected_crc;
+}
+
+static int ce_match_stat_basic_v5(struct cache_entry *ce,
+				struct stat *st,
+				int changed)
+{
+
+	if (ce->ce_mtime.sec != 0 && ce->ce_mtime.sec != (unsigned int)st->st_mtime)
+		changed |= MTIME_CHANGED;
+#ifdef USE_NSEC
+	if (ce->ce_mtime.nsec != 0 && ce->ce_mtime.nsec != ST_MTIME_NSEC(*st))
+		changed |= MTIME_CHANGED;
+#endif
+	if (!match_stat_crc(st, ce->ce_stat_crc)) {
+		changed |= OWNER_CHANGED;
+		changed |= INODE_CHANGED;
+	}
+	/* Racily smudged entry? */
+	if (!ce->ce_mtime.sec && !ce->ce_mtime.nsec) {
+		if (!changed && !is_empty_blob_sha1(ce->sha1) && ce_modified_check_fs(ce, st))
+			changed |= DATA_CHANGED;
+	}
+	return changed;
+}
+
 static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 {
 	unsigned int changed = 0;
@@ -267,7 +316,10 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 		die("internal error: ce_mode is %o", ce->ce_mode);
 	}
 
-	changed = ce_match_stat_basic_v2(ce, st, changed);
+	if (the_index.version != 5)
+		changed = ce_match_stat_basic_v2(ce, st, changed);
+	else
+		changed = ce_match_stat_basic_v5(ce, st, changed);
 	return changed;
 }
 
@@ -1275,6 +1327,25 @@ struct ondisk_cache_entry_extended {
 	char name[FLEX_ARRAY]; /* more */
 };
 
+struct ondisk_cache_entry_v5 {
+	unsigned short flags;
+	unsigned short mode;
+	struct cache_time mtime;
+	int stat_crc;
+	unsigned char sha1[20];
+};
+
+struct ondisk_directory_entry {
+	unsigned int foffset;
+	unsigned int cr;
+	unsigned int ncr;
+	unsigned int nsubtrees;
+	unsigned int nfiles;
+	unsigned int nentries;
+	unsigned char sha1[20];
+	unsigned short flags;
+};
+
 /* These are only used for v3 or lower */
 #define align_flex_name(STRUCT,len) ((offsetof(struct STRUCT,name) + (len) + 8) & ~7)
 #define ondisk_cache_entry_size(len) align_flex_name(ondisk_cache_entry,len)
@@ -1283,6 +1354,17 @@ struct ondisk_cache_entry_extended {
 			    ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
 			    ondisk_cache_entry_size(ce_namelen(ce)))
 
+static int check_crc32(int initialcrc,
+			void *data,
+			size_t len,
+			unsigned int expected_crc)
+{
+	int crc;
+
+	crc = crc32(initialcrc, (Bytef*)data, len);
+	return crc == expected_crc;
+}
+
 static int verify_hdr_version(struct cache_version_header *hdr, unsigned long size)
 {
 	int hdr_version;
@@ -1290,7 +1372,7 @@ static int verify_hdr_version(struct cache_version_header *hdr, unsigned long si
 	if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))
 		return error("bad signature");
 	hdr_version = ntohl(hdr->hdr_version);
-	if (hdr_version < 2 || 4 < hdr_version)
+	if (hdr_version < 2 || 5 < hdr_version)
 		return error("bad index version %d", hdr_version);
 	return 0;
 }
@@ -1308,6 +1390,24 @@ static int verify_hdr_v2(struct cache_version_header *hdr, unsigned long size)
 	return 0;
 }
 
+static int verify_hdr_v5(void *mmap)
+{
+	uint32_t *filecrc;
+	unsigned int header_size_v5;
+	struct cache_version_header *hdr;
+	struct cache_header_v5 *hdr_v5;
+
+	hdr = mmap;
+	hdr_v5 = mmap + sizeof(*hdr);
+	/* Size of the header + the size of the extensionoffsets */
+	header_size_v5 = sizeof(*hdr_v5) + ntohl(hdr_v5->hdr_nextension) * 4;
+	/* Initialize crc */
+	filecrc = mmap + sizeof(*hdr) + header_size_v5;
+	if (!check_crc32(0, hdr, sizeof(*hdr) + header_size_v5, ntohl(*filecrc)))
+		return error("bad index file header crc signature");
+	return 0;
+}
+
 static int read_index_extension(struct index_state *istate,
 				const char *ext, void *data, unsigned long sz)
 {
@@ -1372,12 +1472,105 @@ static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *on
 	ce->ce_size       = ntoh_l(ondisk->size);
 	ce->ce_flags      = flags & ~CE_NAMEMASK;
 	ce->ce_namelen    = len;
+	ce->ce_stat_crc   = 0;
 	hashcpy(ce->sha1, ondisk->sha1);
 	memcpy(ce->name, name, len);
 	ce->name[len] = '\0';
 	return ce;
 }
 
+static struct cache_entry *cache_entry_from_ondisk_v5(struct ondisk_cache_entry_v5 *ondisk,
+						   struct directory_entry *de,
+						   char *name,
+						   size_t len,
+						   size_t prefix_len)
+{
+	struct cache_entry *ce = xmalloc(cache_entry_size(len + de->de_pathlen));
+	int flags;
+
+	flags = ntoh_s(ondisk->flags);
+	ce->ce_ctime.sec  = 0;
+	ce->ce_mtime.sec  = ntoh_l(ondisk->mtime.sec);
+	ce->ce_ctime.nsec = 0;
+	ce->ce_mtime.nsec = ntoh_l(ondisk->mtime.nsec);
+	ce->ce_dev        = 0;
+	ce->ce_ino        = 0;
+	ce->ce_mode       = ntoh_s(ondisk->mode);
+	ce->ce_uid        = 0;
+	ce->ce_gid        = 0;
+	ce->ce_size       = 0;
+	ce->ce_flags      = flags & CE_STAGEMASK;
+	ce->ce_flags     |= flags & CE_VALID;
+	if (flags & CE_INTENT_TO_ADD_V5)
+		ce->ce_flags |= CE_INTENT_TO_ADD;
+	if (flags & CE_SKIP_WORKTREE_V5)
+		ce->ce_flags |= CE_SKIP_WORKTREE;
+	ce->ce_stat_crc   = ntoh_l(ondisk->stat_crc);
+	ce->ce_namelen    = len + de->de_pathlen;
+	hashcpy(ce->sha1, ondisk->sha1);
+	memcpy(ce->name, de->pathname, de->de_pathlen);
+	memcpy(ce->name + de->de_pathlen, name, len);
+	ce->name[len + de->de_pathlen] = '\0';
+	return ce;
+}
+
+static struct directory_entry *directory_entry_from_ondisk(struct ondisk_directory_entry *ondisk,
+						   const char *name,
+						   size_t len)
+{
+	struct directory_entry *de = xmalloc(directory_entry_size(len));
+
+
+	memcpy(de->pathname, name, len);
+	de->pathname[len] = '\0';
+	de->de_flags      = ntoh_s(ondisk->flags);
+	de->de_foffset    = ntoh_l(ondisk->foffset);
+	de->de_cr         = ntoh_l(ondisk->cr);
+	de->de_ncr        = ntoh_l(ondisk->ncr);
+	de->de_nsubtrees  = ntoh_l(ondisk->nsubtrees);
+	de->de_nfiles     = ntoh_l(ondisk->nfiles);
+	de->de_nentries   = ntoh_l(ondisk->nentries);
+	de->de_pathlen    = len;
+	hashcpy(de->sha1, ondisk->sha1);
+	return de;
+}
+
+static struct conflict_part *conflict_part_from_ondisk(struct ondisk_conflict_part *ondisk)
+{
+	struct conflict_part *cp = xmalloc(sizeof(struct conflict_part));
+
+	cp->flags      = ntoh_s(ondisk->flags);
+	cp->entry_mode = ntoh_s(ondisk->entry_mode);
+	hashcpy(cp->sha1, ondisk->sha1);
+	return cp;
+}
+
+static struct cache_entry *convert_conflict_part(struct conflict_part *cp,
+						char * name,
+						unsigned int len)
+{
+
+	struct cache_entry *ce = xmalloc(cache_entry_size(len));
+
+	ce->ce_ctime.sec  = 0;
+	ce->ce_mtime.sec  = 0;
+	ce->ce_ctime.nsec = 0;
+	ce->ce_mtime.nsec = 0;
+	ce->ce_dev        = 0;
+	ce->ce_ino        = 0;
+	ce->ce_mode       = cp->entry_mode;
+	ce->ce_uid        = 0;
+	ce->ce_gid        = 0;
+	ce->ce_size       = 0;
+	ce->ce_flags      = conflict_stage(cp) << CE_STAGESHIFT;
+	ce->ce_stat_crc   = 0;
+	ce->ce_namelen    = len;
+	hashcpy(ce->sha1, cp->sha1);
+	memcpy(ce->name, name, len);
+	ce->name[len] = '\0';
+	return ce;
+}
+
 /*
  * Adjacent cache entries tend to share the leading paths, so it makes
  * sense to only store the differences in later entries.  In the v4
@@ -1445,6 +1638,332 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
+static struct directory_entry *read_directories_v5(unsigned int *dir_offset,
+				unsigned int *dir_table_offset,
+				void *mmap,
+				int mmap_size)
+{
+	int i, ondisk_directory_size;
+	uint32_t *filecrc, *beginning, *end;
+	struct directory_entry *current = NULL;
+	struct ondisk_directory_entry *disk_de;
+	struct directory_entry *de;
+	unsigned int data_len, len;
+	char *name;
+
+	ondisk_directory_size = sizeof(disk_de->flags)
+		+ sizeof(disk_de->foffset)
+		+ sizeof(disk_de->cr)
+		+ sizeof(disk_de->ncr)
+		+ sizeof(disk_de->nsubtrees)
+		+ sizeof(disk_de->nfiles)
+		+ sizeof(disk_de->nentries)
+		+ sizeof(disk_de->sha1);
+	name = (char *)mmap + *dir_offset;
+	beginning = mmap + *dir_table_offset;
+	end = mmap + *dir_table_offset + 4;
+	len = ntoh_l(*end) - ntoh_l(*beginning) - ondisk_directory_size - 5;
+	disk_de = (struct ondisk_directory_entry *)
+			((char *)mmap + *dir_offset + len + 1);
+	de = directory_entry_from_ondisk(disk_de, name, len);
+	de->next = NULL;
+
+	/* Length of pathname + nul byte for termination + size of
+	 * members of ondisk_directory_entry. (Just using the size
+	 * of the struct doesn't work, because there may be padding
+	 * bytes for the struct)
+	 */
+	data_len = len + 1 + ondisk_directory_size;
+
+	filecrc = mmap + *dir_offset + data_len;
+	if (!check_crc32(0, mmap + *dir_offset, data_len, ntoh_l(*filecrc)))
+		goto unmap;
+
+	*dir_table_offset += 4;
+	*dir_offset += data_len + 4; /* crc code */
+
+	current = de;
+	for (i = 0; i < de->de_nsubtrees; i++) {
+		current->next = read_directories_v5(dir_offset, dir_table_offset,
+						mmap, mmap_size);
+		while (current->next)
+			current = current->next;
+	}
+
+	return de;
+unmap:
+	munmap(mmap, mmap_size);
+	die("directory crc doesn't match for '%s'", de->pathname);
+}
+
+static struct cache_entry *read_entry_v5(struct directory_entry *de,
+			unsigned long *entry_offset,
+			void **mmap,
+			unsigned long mmap_size,
+			unsigned int *foffsetblock,
+			int fd)
+{
+	int len, crc_wrong, i = 0, offset_to_offset;
+	char *name;
+	uint32_t foffsetblockcrc;
+	uint32_t *filecrc, *beginning, *end;
+	struct cache_entry *ce;
+	struct ondisk_cache_entry_v5 *disk_ce;
+
+	do {
+		name = (char *)*mmap + *entry_offset;
+		beginning = *mmap + *foffsetblock;
+		end = *mmap + *foffsetblock + 4;
+		len = ntoh_l(*end) - ntoh_l(*beginning) - sizeof(struct ondisk_cache_entry_v5) - 5;
+		disk_ce = (struct ondisk_cache_entry_v5 *)
+				((char *)*mmap + *entry_offset + len + 1);
+		ce = cache_entry_from_ondisk_v5(disk_ce, de, name, len, de->de_pathlen);
+		filecrc = *mmap + *entry_offset + len + 1 + sizeof(*disk_ce);
+		offset_to_offset = htonl(*foffsetblock);
+		foffsetblockcrc = crc32(0, (Bytef*)&offset_to_offset, 4);
+		crc_wrong = !check_crc32(foffsetblockcrc,
+			*mmap + *entry_offset, len + 1 + sizeof(*disk_ce),
+			ntoh_l(*filecrc));
+		if (crc_wrong) {
+			/* wait for 10 milliseconds */
+			usleep(10*1000);
+			munmap(*mmap, mmap_size);
+			*mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+		}
+		i++;
+		/*
+		 * Retry for 500 ms maximum, before giving up and saying the
+		 * checksum is wrong.
+		 */
+	} while (crc_wrong && i < 50);
+	if (crc_wrong)
+		goto unmap;
+	*entry_offset += len + 1 + sizeof(*disk_ce) + 4;
+	return ce;
+unmap:
+	munmap(*mmap, mmap_size);
+	die("file crc doesn't match for '%s'", ce->name);
+}
+
+static void ce_queue_push(struct cache_entry **head,
+			     struct cache_entry **tail,
+			     struct cache_entry *ce)
+{
+	if (!*head) {
+		*head = *tail = ce;
+		(*tail)->next = NULL;
+		return;
+	}
+
+	(*tail)->next = ce;
+	ce->next = NULL;
+	*tail = (*tail)->next;
+}
+
+static void conflict_entry_push(struct conflict_entry **head,
+				struct conflict_entry **tail,
+				struct conflict_entry *conflict_entry)
+{
+	if (!*head) {
+		*head = *tail = conflict_entry;
+		(*tail)->next = NULL;
+		return;
+	}
+
+	(*tail)->next = conflict_entry;
+	conflict_entry->next = NULL;
+	*tail = (*tail)->next;
+}
+
+static struct cache_entry *ce_queue_pop(struct cache_entry **head)
+{
+	struct cache_entry *ce;
+
+	ce = *head;
+	*head = (*head)->next;
+	return ce;
+}
+
+static void conflict_part_head_remove(struct conflict_part **head)
+{
+	struct conflict_part *to_free;
+
+	to_free = *head;
+	*head = (*head)->next;
+	free(to_free);
+}
+
+static void conflict_entry_head_remove(struct conflict_entry **head)
+{
+	struct conflict_entry *to_free;
+
+	to_free = *head;
+	*head = (*head)->next;
+	free(to_free);
+}
+
+struct conflict_entry *create_new_conflict(char *name, int len, int pathlen)
+{
+	struct conflict_entry *conflict_entry;
+
+	if (pathlen)
+		pathlen++;
+	conflict_entry = xmalloc(conflict_entry_size(len));
+	conflict_entry->entries = NULL;
+	conflict_entry->nfileconflicts = 0;
+	conflict_entry->namelen = len;
+	memcpy(conflict_entry->name, name, len);
+	conflict_entry->name[len] = '\0';
+	conflict_entry->pathlen = pathlen;
+	conflict_entry->next = NULL;
+
+	return conflict_entry;
+}
+
+static struct conflict_entry *create_conflict_entry_from_ce(struct cache_entry *ce,
+								int pathlen)
+{
+	return create_new_conflict(ce->name, ce_namelen(ce), pathlen);
+}
+
+static struct conflict_entry *read_conflicts_v5(struct directory_entry *de,
+						void **mmap,
+						unsigned long mmap_size,
+						int fd)
+{
+	struct conflict_entry *head, *tail;
+	unsigned int croffset, i, j = 0;
+	char *full_name;
+
+	croffset = de->de_cr;
+	tail = NULL;
+	head = NULL;
+	for (i = 0; i < de->de_ncr; i++) {
+		struct conflict_entry *conflict_new;
+		unsigned int len, *nfileconflicts;
+		char *name;
+		void *crc_start;
+		int k, offset, crc_wrong;
+		uint32_t *filecrc;
+
+		do {
+			offset = croffset;
+			crc_start = *mmap + offset;
+			name = (char *)*mmap + offset;
+			len = strlen(name);
+			offset += len + 1;
+			nfileconflicts = *mmap + offset;
+			offset += 4;
+
+			full_name = xmalloc(sizeof(char) * (len + de->de_pathlen));
+			memcpy(full_name, de->pathname, de->de_pathlen);
+			memcpy(full_name + de->de_pathlen, name, len);
+			conflict_new = create_new_conflict(full_name,
+					len + de->de_pathlen, de->de_pathlen);
+			for (k = 0; k < ntoh_l(*nfileconflicts); k++) {
+				struct ondisk_conflict_part *ondisk;
+				struct conflict_part *cp;
+
+				ondisk = *mmap + offset;
+				cp = conflict_part_from_ondisk(ondisk);
+				cp->next = NULL;
+				add_part_to_conflict_entry(de, conflict_new, cp);
+				offset += sizeof(struct ondisk_conflict_part);
+			}
+			filecrc = *mmap + offset;
+			crc_wrong = !check_crc32(0, crc_start,
+				len + 1 + 4 + conflict_new->nfileconflicts
+				* sizeof(struct ondisk_conflict_part),
+				ntoh_l(*filecrc));
+			if (crc_wrong) {
+				/* wait for 10 milliseconds */
+				usleep(10*1000);
+				munmap(*mmap, mmap_size);
+				*mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+			}
+			free(full_name);
+			j++;
+		} while (crc_wrong && j < 50);
+		if (crc_wrong)
+			goto unmap;
+		croffset = offset + 4;
+		conflict_entry_push(&head, &tail, conflict_new);
+	}
+	return head;
+unmap:
+	munmap(*mmap, mmap_size);
+	die("wrong crc for conflict in directory '%s'", de->pathname);
+}
+
+static struct directory_entry *read_entries_v5(struct index_state *istate,
+					struct directory_entry *de,
+					unsigned long *entry_offset,
+					void **mmap,
+					unsigned long mmap_size,
+					int *nr,
+					unsigned int *foffsetblock,
+					int fd)
+{
+	struct cache_entry *head = NULL, *tail = NULL;
+	struct conflict_entry *conflict_queue;
+	struct cache_entry *ce;
+	int i;
+
+	conflict_queue = read_conflicts_v5(de, mmap, mmap_size, fd);
+	for (i = 0; i < de->de_nfiles; i++) {
+		ce = read_entry_v5(de,
+				entry_offset,
+				mmap,
+				mmap_size,
+				foffsetblock,
+				fd);
+		ce_queue_push(&head, &tail, ce);
+		*foffsetblock += 4;
+
+		/* Add the conflicted entries at the end of the index file
+		 * to the in memory format
+		 */
+		if (conflict_queue &&
+		    (conflict_queue->entries->flags & CONFLICT_CONFLICTED) != 0 &&
+		    !cache_name_compare(conflict_queue->name, conflict_queue->namelen,
+					ce->name, ce_namelen(ce))) {
+			struct conflict_part *cp;
+			cp = conflict_queue->entries;
+			cp = cp->next;
+			while (cp) {
+				ce = convert_conflict_part(cp,
+						conflict_queue->name,
+						conflict_queue->namelen);
+				ce_queue_push(&head, &tail, ce);
+				conflict_part_head_remove(&cp);
+			}
+			conflict_entry_head_remove(&conflict_queue);
+		}
+	}
+
+	de = de->next;
+
+	while (head) {
+		if (de != NULL
+		    && strcmp(head->name, de->pathname) > 0) {
+			de = read_entries_v5(istate,
+					de,
+					entry_offset,
+					mmap,
+					mmap_size,
+					nr,
+					foffsetblock,
+					fd);
+		} else {
+			ce = ce_queue_pop(&head);
+			set_index_entry(istate, *nr, ce);
+			(*nr)++;
+		}
+	}
+
+	return de;
+}
+
 void read_index_v2(struct index_state *istate, void *mmap, int mmap_size)
 {
 	int i;
@@ -1503,6 +2022,39 @@ unmap:
 	die("index file corrupt");
 }
 
+void read_index_v5(struct index_state *istate, void *mmap, int mmap_size, int fd)
+{
+	unsigned long entry_offset;
+	unsigned int dir_offset, dir_table_offset;
+	struct cache_version_header *hdr;
+	struct cache_header_v5 *hdr_v5;
+	struct directory_entry *root_directory, *de;
+	int nr;
+	unsigned int foffsetblock;
+
+	hdr = mmap;
+	hdr_v5 = mmap + sizeof(*hdr);
+	istate->version = ntohl(hdr->hdr_version);
+	istate->cache_nr = ntohl(hdr_v5->hdr_nfile);
+	istate->cache_alloc = alloc_nr(istate->cache_nr);
+	istate->cache = xcalloc(istate->cache_alloc, sizeof(struct cache_entry *));
+	istate->initialized = 1;
+
+	/* Skip size of the header + crc sum + size of offsets */
+	dir_offset = sizeof(*hdr) + sizeof(*hdr_v5) + 4 + (ntohl(hdr_v5->hdr_ndir) + 1) * 4;
+	dir_table_offset = sizeof(*hdr) + sizeof(*hdr_v5) + 4;
+	root_directory = read_directories_v5(&dir_offset, &dir_table_offset, mmap, mmap_size);
+
+	entry_offset = ntohl(hdr_v5->hdr_fblockoffset);
+
+	nr = 0;
+	foffsetblock = dir_offset;
+	de = root_directory;
+	while (de)
+		de = read_entries_v5(istate, de, &entry_offset,
+				&mmap, mmap_size, &nr, &foffsetblock, fd);
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
@@ -1542,10 +2094,18 @@ int read_index_from(struct index_state *istate, const char *path)
 	if (verify_hdr_version(hdr, mmap_size) < 0)
 		goto unmap;
 
-	if (verify_hdr_v2(hdr, mmap_size) < 0)
-		goto unmap;
+	if (ntohl(hdr->hdr_version) != 5) {
+		if (verify_hdr_v2(hdr, mmap_size) < 0)
+			goto unmap;
 
-	read_index_v2(istate, mmap, mmap_size);
+		read_index_v2(istate, mmap, mmap_size);
+	} else {
+		if (verify_hdr_v5(hdr) < 0)
+			goto unmap;
+
+		read_index_v5(istate, mmap, mmap_size, fd);
+	}
+	close(fd);
 	istate->timestamp.sec = st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 10/16] Read resolve-undo data
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (8 preceding siblings ...)
  2012-08-02 11:01 ` [PATCH 09/16] Read index-v5 Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-02 11:02 ` [PATCH 11/16] Read cache-tree in index-v5 Thomas Gummerer
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Make git read the resolve-undo data from the index.

In the on-disk format of index version 5 the resolve-undo data is
stored together with the conflicts, so both are read at the same
time. The resolve-undo data is then converted to the in-memory
format.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 read-cache.c   |  1 +
 resolve-undo.c | 36 ++++++++++++++++++++++++++++++++++++
 resolve-undo.h |  2 ++
 3 files changed, 39 insertions(+)
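
The conversion described above can be sketched in isolation. This is
only an illustration: `struct part`, `struct undo_info` and
`parts_to_undo()` are hypothetical, simplified stand-ins and not the
structures used in the patch. The sketch assumes that stage n of a
resolved conflict is stored in slot n - 1 of the undo record, as the
loop in resolve_undo_convert_v5() does.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for one part of a conflict. */
struct part {
	int stage;              /* conflict stage: 1, 2 or 3 */
	unsigned int mode;
	unsigned char sha1[20];
	struct part *next;
};

/* Hypothetical stand-in for one resolve-undo record. */
struct undo_info {
	unsigned int mode[3];
	unsigned char sha1[3][20];
};

/* Fill one undo record from the linked list of conflict parts. */
static void parts_to_undo(const struct part *p, struct undo_info *ui)
{
	/* Stages that are missing from the list keep mode 0. */
	memset(ui, 0, sizeof(*ui));
	for (; p; p = p->next) {
		if (p->stage < 1 || p->stage > 3)
			continue; /* not a conflict stage, ignore it */
		ui->mode[p->stage - 1] = p->mode;
		memcpy(ui->sha1[p->stage - 1], p->sha1, 20);
	}
}
```

A stage that does not appear in the list is left with mode 0, which
is how a missing stage is told apart from a present one here.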

diff --git a/read-cache.c b/read-cache.c
index 884c2a7..cef9a4e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1910,6 +1910,7 @@ static struct directory_entry *read_entries_v5(struct index_state *istate,
 	int i;
 
 	conflict_queue = read_conflicts_v5(de, mmap, mmap_size, fd);
+	resolve_undo_convert_v5(istate, conflict_queue);
 	for (i = 0; i < de->de_nfiles; i++) {
 		ce = read_entry_v5(de,
 				entry_offset,
diff --git a/resolve-undo.c b/resolve-undo.c
index 72b4612..f96c6ba 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -170,3 +170,39 @@ void unmerge_index(struct index_state *istate, const char **pathspec)
 		i = unmerge_index_entry_at(istate, i);
 	}
 }
+
+void resolve_undo_convert_v5(struct index_state *istate,
+					struct conflict_entry *ce)
+{
+	int i;
+
+	while (ce) {
+		struct string_list_item *lost;
+		struct resolve_undo_info *ui;
+		struct conflict_part *cp;
+
+		if (ce->entries && (ce->entries->flags & CONFLICT_CONFLICTED) != 0) {
+			ce = ce->next;
+			continue;
+		}
+		if (!istate->resolve_undo) {
+			istate->resolve_undo = xcalloc(1, sizeof(struct string_list));
+			istate->resolve_undo->strdup_strings = 1;
+		}
+
+		lost = string_list_insert(istate->resolve_undo, ce->name);
+		if (!lost->util)
+			lost->util = xcalloc(1, sizeof(*ui));
+		ui = lost->util;
+
+		cp = ce->entries;
+		for (i = 0; i < 3; i++)
+			ui->mode[i] = 0;
+		while (cp) {
+			ui->mode[conflict_stage(cp) - 1] = cp->entry_mode;
+			hashcpy(ui->sha1[conflict_stage(cp) - 1], cp->sha1);
+			cp = cp->next;
+		}
+		ce = ce->next;
+	}
+}
diff --git a/resolve-undo.h b/resolve-undo.h
index 8458769..ab660a6 100644
--- a/resolve-undo.h
+++ b/resolve-undo.h
@@ -13,4 +13,6 @@ extern void resolve_undo_clear_index(struct index_state *);
 extern int unmerge_index_entry_at(struct index_state *, int);
 extern void unmerge_index(struct index_state *, const char **);
 
+extern void resolve_undo_convert_v5(struct index_state *, struct conflict_entry *);
+
 #endif
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 11/16] Read cache-tree in index-v5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (9 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 10/16] Read resolve-undo data Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-03  8:31   ` Thomas Rast
  2012-08-02 11:02 ` [PATCH 12/16] Write index-v5 Thomas Gummerer
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

The cache-tree data is saved as part of the directory data, so it
has already been read by the time the cache-tree is needed. The
cache-tree then only has to be converted from the directory data.

The cache-tree is not sorted lexically; at each level it is sorted
by pathlen. The directories therefore have to be reordered with
respect to the on-disk layout.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache-tree.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 cache-tree.h |  6 ++++
 read-cache.c |  1 +
 3 files changed, 100 insertions(+)
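
To illustrate the ordering mentioned above, here is a minimal
stand-alone sketch. It assumes that the comparison looks at the name
length first and only compares bytes when the lengths are equal,
which is what "sorted by pathlen" suggests. `struct dir_name`,
`by_len_then_bytes()` and `sort_subtrees()` are hypothetical names
used for illustration only, not part of the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a directory name with its length. */
struct dir_name {
	const char *name;
	size_t len;
};

/* Shorter names sort first; equal lengths fall back to memcmp. */
static int by_len_then_bytes(const void *a, const void *b)
{
	const struct dir_name *x = a, *y = b;

	if (x->len != y->len)
		return x->len < y->len ? -1 : 1;
	return memcmp(x->name, y->name, x->len);
}

/* Reorder the subdirectories of one level into cache-tree order. */
static void sort_subtrees(struct dir_name *dirs, size_t n)
{
	qsort(dirs, n, sizeof(*dirs), by_len_then_bytes);
}
```

With this ordering "t" sorts before "doc", although "doc" comes first
lexically, which is why the directories cannot be taken in on-disk
order.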

diff --git a/cache-tree.c b/cache-tree.c
index 28ed657..6a314aa 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -519,6 +519,99 @@ struct cache_tree *cache_tree_read(const char *buffer, unsigned long size)
 	return read_one(&buffer, &size);
 }
 
+struct cache_tree *convert_one(struct directory_queue *queue, int dirnr)
+{
+	int i, subtree_nr;
+	struct cache_tree *it;
+	struct directory_queue *down;
+
+	it = cache_tree();
+	it->entry_count = queue[dirnr].de->de_nentries;
+	subtree_nr = queue[dirnr].de->de_nsubtrees;
+	if (0 <= it->entry_count)
+		hashcpy(it->sha1, queue[dirnr].de->sha1);
+
+	/*
+	 * Just a heuristic -- we do not add directories that often but
+	 * we do not want to have to extend it immediately when we do,
+	 * hence +2.
+	 */
+	it->subtree_alloc = subtree_nr + 2;
+	it->down = xcalloc(it->subtree_alloc, sizeof(struct cache_tree_sub *));
+	down = queue[dirnr].down;
+	for (i = 0; i < subtree_nr; i++) {
+		struct cache_tree *sub;
+		struct cache_tree_sub *subtree;
+		char *buf, *name;
+
+		name = "";
+		buf = strtok(down[i].de->pathname, "/");
+		while (buf) {
+			name = buf;
+			buf = strtok(NULL, "/");
+		}
+		sub = convert_one(down, i);
+		if (!sub)
+			goto free_return;
+		subtree = cache_tree_sub(it, name);
+		subtree->cache_tree = sub;
+	}
+	if (subtree_nr != it->subtree_nr)
+		die("cache-tree: internal error");
+	return it;
+ free_return:
+	cache_tree_free(&it);
+	return NULL;
+}
+
+static int compare_cache_tree_elements(const void *a, const void *b)
+{
+	const struct directory_entry *de1, *de2;
+
+	de1 = ((const struct directory_queue *)a)->de;
+	de2 = ((const struct directory_queue *)b)->de;
+	return subtree_name_cmp(de1->pathname, de1->de_pathlen,
+				de2->pathname, de2->de_pathlen);
+}
+
+static struct directory_entry *sort_directories(struct directory_entry *de,
+						struct directory_queue *queue)
+{
+	int i, nsubtrees;
+
+	nsubtrees = de->de_nsubtrees;
+	for (i = 0; i < nsubtrees; i++) {
+		struct directory_entry *new_de;
+		de = de->next;
+		new_de = xmalloc(directory_entry_size(de->de_pathlen));
+		memcpy(new_de, de, directory_entry_size(de->de_pathlen));
+		queue[i].de = new_de;
+		if (de->de_nsubtrees) {
+			queue[i].down = xcalloc(de->de_nsubtrees,
+					sizeof(struct directory_queue));
+			de = sort_directories(de,
+					queue[i].down);
+		}
+	}
+	qsort(queue, nsubtrees, sizeof(struct directory_queue),
+			compare_cache_tree_elements);
+	return de;
+}
+
+struct cache_tree *cache_tree_convert_v5(struct directory_entry *de)
+{
+	struct directory_queue *queue;
+
+	if (!de->de_nentries)
+		return NULL;
+	queue = xcalloc(1, sizeof(struct directory_queue));
+	queue[0].de = de;
+	queue[0].down = xcalloc(de->de_nsubtrees, sizeof(struct directory_queue));
+
+	sort_directories(de, queue[0].down);
+	return convert_one(queue, 0);
+}
+
 static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *path)
 {
 	if (!it)
diff --git a/cache-tree.h b/cache-tree.h
index d8cb2e9..f4131a6 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -20,6 +20,11 @@ struct cache_tree {
 	struct cache_tree_sub **down;
 };
 
+struct directory_queue {
+	struct directory_queue *down;
+	struct directory_entry *de;
+};
+
 struct cache_tree *cache_tree(void);
 void cache_tree_free(struct cache_tree **);
 void cache_tree_invalidate_path(struct cache_tree *, const char *);
@@ -27,6 +32,7 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
 
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
+struct cache_tree *cache_tree_convert_v5(struct directory_entry *de);
 
 int cache_tree_fully_valid(struct cache_tree *);
 int cache_tree_update(struct cache_tree *, struct cache_entry **, int, int);
diff --git a/read-cache.c b/read-cache.c
index cef9a4e..fd095ec 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2054,6 +2054,7 @@ void read_index_v5(struct index_state *istate, void *mmap, int mmap_size, int fd
 	while (de)
 		de = read_entries_v5(istate, de, &entry_offset,
 				&mmap, mmap_size, &nr, &foffsetblock, fd);
+	istate->cache_tree = cache_tree_convert_v5(root_directory);
 }
 
 /* remember to discard_cache() before reading a different cache! */
-- 
1.7.10.886.gdf6792c.dirty

* [PATCH 12/16] Write index-v5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (10 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 11/16] Read cache-tree in index-v5 Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-02 11:02 ` [PATCH 13/16] Write index-v5 cache-tree data Thomas Gummerer
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Write the index version 5 file format to disk. This version does not
yet write the cache-tree data and resolve-undo data to the file.

The main work is done when extracting the directories from the
current in-memory format; the conflicts and the file data are
computed in the same pass.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache.h      |  10 +-
 read-cache.c | 602 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 609 insertions(+), 3 deletions(-)
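
Extracting the directories relies on walking from a path up to its
parent directories. A minimal stand-alone sketch of that step,
assuming plain malloc() instead of git's allocation wrappers;
`parent_dir()` is a hypothetical name, the patch itself uses
super_directory() for this:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of the directory part of path,
 * or NULL if path has no '/' (a top-level entry) or allocation
 * fails. The caller owns the result and must free() it.
 */
static char *parent_dir(const char *path)
{
	const char *slash = strrchr(path, '/');
	size_t len;
	char *dir;

	if (!slash)
		return NULL;
	len = (size_t)(slash - path);
	dir = malloc(len + 1);
	if (!dir)
		return NULL;
	memcpy(dir, path, len);
	dir[len] = '\0';
	return dir;
}
```

Calling it repeatedly on its own result walks up one level at a time
until NULL is returned for the top level.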

diff --git a/cache.h b/cache.h
index 91d9b45..fe3b446 100644
--- a/cache.h
+++ b/cache.h
@@ -116,7 +116,7 @@ struct cache_header_v5 {
 };
 
 #define INDEX_FORMAT_LB 2
-#define INDEX_FORMAT_UB 4
+#define INDEX_FORMAT_UB 5
 
 /*
  * The "cache_time" is just the low 32 bits of the
@@ -526,6 +526,7 @@ extern int verify_path(const char *path);
 extern struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen, int igncase);
 extern int index_name_stage_pos(const struct index_state *, const char *name, int namelen, int stage);
 extern int index_name_pos(const struct index_state *, const char *name, int namelen);
+extern struct directory_entry *init_directory_entry(char *pathname, int len);
 #define ADD_CACHE_OK_TO_ADD 1		/* Ok to add */
 #define ADD_CACHE_OK_TO_REPLACE 2	/* Ok to replace file/directory */
 #define ADD_CACHE_SKIP_DFCHECK 4	/* Ok to skip DF conflict checks */
@@ -1247,6 +1248,13 @@ static inline ssize_t write_str_in_full(int fd, const char *str)
 	return write_in_full(fd, str, strlen(str));
 }
 
+/* index-v5 helper functions */
+extern char *super_directory(const char *filename);
+extern void insert_directory_entry(struct directory_entry *, struct hash_table *, int *, unsigned int *, uint32_t);
+extern void add_conflict_to_directory_entry(struct directory_entry *, struct conflict_entry *);
+extern void add_part_to_conflict_entry(struct directory_entry *, struct conflict_entry *, struct conflict_part *);
+extern struct conflict_entry *create_new_conflict(char *, int, int);
+
 /* pager.c */
 extern void setup_pager(void);
 extern const char *pager_program;
diff --git a/read-cache.c b/read-cache.c
index fd095ec..0f1a218 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1283,7 +1283,7 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce, int reall
  * Index File I/O
  *****************************************************************/
 
-#define INDEX_FORMAT_DEFAULT 3
+#define INDEX_FORMAT_DEFAULT 5
 
 /*
  * dev/ino/uid/gid/size are also just tracked to the low 32 bits
@@ -2170,6 +2170,17 @@ static int ce_write_flush(git_SHA_CTX *context, int fd)
 	return 0;
 }
 
+static int ce_write_flush_v5(int fd)
+{
+	unsigned int buffered = write_buffer_len;
+	if (buffered) {
+		if (write_in_full(fd, write_buffer, buffered) != buffered)
+			return -1;
+		write_buffer_len = 0;
+	}
+	return 0;
+}
+
 static int ce_write_v2(git_SHA_CTX *context, int fd, void *data, unsigned int len)
 {
 	while (len) {
@@ -2192,6 +2203,30 @@ static int ce_write_v2(git_SHA_CTX *context, int fd, void *data, unsigned int le
 	return 0;
 }
 
+static int ce_write_v5(uint32_t *crc, int fd, void *data, unsigned int len)
+{
+	if (crc)
+		*crc = crc32(*crc, (Bytef*)data, len);
+	while (len) {
+		unsigned int buffered = write_buffer_len;
+		unsigned int partial = WRITE_BUFFER_SIZE - buffered;
+		if (partial > len)
+			partial = len;
+		memcpy(write_buffer + buffered, data, partial);
+		buffered += partial;
+		if (buffered == WRITE_BUFFER_SIZE) {
+			write_buffer_len = buffered;
+			if (ce_write_flush_v5(fd))
+				return -1;
+			buffered = 0;
+		}
+		write_buffer_len = buffered;
+		len -= partial;
+		data = (char *) data + partial;
+	}
+	return 0;
+}
+
 static int write_index_ext_header_v2(git_SHA_CTX *context, int fd,
 				  unsigned int ext, unsigned int sz)
 {
@@ -2223,6 +2258,19 @@ static int ce_flush(git_SHA_CTX *context, int fd)
 	return (write_in_full(fd, write_buffer, left) != left) ? -1 : 0;
 }
 
+static int ce_flush_v5(int fd)
+{
+	unsigned int left = write_buffer_len;
+
+	if (left)
+		write_buffer_len = 0;
+
+	if (write_in_full(fd, write_buffer, left) != left)
+		return -1;
+
+	return 0;
+}
+
 static void ce_smudge_racily_clean_entry_v2(struct cache_entry *ce)
 {
 	/*
@@ -2272,6 +2320,22 @@ static void ce_smudge_racily_clean_entry_v2(struct cache_entry *ce)
 	}
 }
 
+static void ce_smudge_racily_clean_entry_v5(struct cache_entry *ce)
+{
+	/*
+	 * This method shall only be called if the timestamp of ce
+	 * is racy (check with is_racy_timestamp). If the timestamp
+	 * is racy, the writer will just set the time to 0.
+	 *
+	 * The reader (ce_match_stat_basic_v5) will then take care
+	 * of checking if the entry is really changed or not, by
+	 * taking into account the stat_crc and if that hasn't changed
+	 * checking the sha1.
+	 */
+	ce->ce_mtime.sec = 0;
+	ce->ce_mtime.nsec = 0;
+}
+
 /* Copy miscellaneous fields but not the name */
 static char *copy_cache_entry_to_ondisk(struct ondisk_cache_entry *ondisk,
 				       struct cache_entry *ce)
@@ -2455,12 +2519,546 @@ int write_index_v2(struct index_state *istate, int newfd)
 	return 0;
 }
 
+char *super_directory(const char *filename)
+{
+	char *slash;
+
+	slash = strrchr(filename, '/');
+	if (slash)
+		return xmemdupz(filename, slash-filename);
+	return NULL;
+}
+
+struct directory_entry *init_directory_entry(char *pathname, int len)
+{
+	struct directory_entry *de = xmalloc(directory_entry_size(len));
+
+	memcpy(de->pathname, pathname, len);
+	de->pathname[len] = '\0';
+	de->de_flags      = 0;
+	de->de_foffset    = 0;
+	de->de_cr         = 0;
+	de->de_ncr        = 0;
+	de->de_nsubtrees  = 0;
+	de->de_nfiles     = 0;
+	de->de_nentries   = 0;
+	memset(de->sha1, 0, 20);
+	de->de_pathlen    = len;
+	de->next          = NULL;
+	de->next_hash     = NULL;
+	de->ce            = NULL;
+	de->ce_last       = NULL;
+	de->conflict      = NULL;
+	de->conflict_last = NULL;
+	de->conflict_size = 0;
+	return de;
+}
+
+static void ondisk_from_directory_entry(struct directory_entry *de,
+					struct ondisk_directory_entry *ondisk)
+{
+	ondisk->foffset   = htonl(de->de_foffset);
+	ondisk->cr        = htonl(de->de_cr);
+	ondisk->ncr       = htonl(de->de_ncr);
+	ondisk->nsubtrees = htonl(de->de_nsubtrees);
+	ondisk->nfiles    = htonl(de->de_nfiles);
+	ondisk->nentries  = htonl(de->de_nentries);
+	hashcpy(ondisk->sha1, de->sha1);
+	ondisk->flags     = htons(de->de_flags);
+}
+
+static struct conflict_part *conflict_part_from_inmemory(struct cache_entry *ce)
+{
+	struct conflict_part *conflict;
+	short flags;
+
+	conflict = xmalloc(sizeof(struct conflict_part));
+	flags                = CONFLICT_CONFLICTED;
+	flags               |= ce_stage(ce) << CONFLICT_STAGESHIFT;
+	conflict->flags      = flags;
+	conflict->entry_mode = ce->ce_mode;
+	conflict->next       = NULL;
+	hashcpy(conflict->sha1, ce->sha1);
+	return conflict;
+}
+
+static void conflict_to_ondisk(struct conflict_part *cp,
+				struct ondisk_conflict_part *ondisk)
+{
+	ondisk->flags      = htons(cp->flags);
+	ondisk->entry_mode = htons(cp->entry_mode);
+	hashcpy(ondisk->sha1, cp->sha1);
+}
+
+void add_conflict_to_directory_entry(struct directory_entry *de,
+					struct conflict_entry *conflict_entry)
+{
+	de->de_ncr++;
+	de->conflict_size += conflict_entry->namelen + 1 + 8 - conflict_entry->pathlen;
+	conflict_entry_push(&de->conflict, &de->conflict_last, conflict_entry);
+}
+
+void add_part_to_conflict_entry(struct directory_entry *de,
+					struct conflict_entry *entry,
+					struct conflict_part *conflict_part)
+{
+
+	struct conflict_part *conflict_search;
+
+	entry->nfileconflicts++;
+	de->conflict_size += sizeof(struct ondisk_conflict_part);
+	if (!entry->entries)
+		entry->entries = conflict_part;
+	else {
+		conflict_search = entry->entries;
+		while (conflict_search->next)
+			conflict_search = conflict_search->next;
+		conflict_search->next = conflict_part;
+	}
+}
+
+void insert_directory_entry(struct directory_entry *de,
+			struct hash_table *table,
+			int *total_dir_len,
+			unsigned int *ndir,
+			uint32_t crc)
+{
+	struct directory_entry *insert;
+
+	insert = (struct directory_entry *)insert_hash(crc, de, table);
+	if (insert) {
+		de->next_hash = insert->next_hash;
+		insert->next_hash = de;
+	}
+	(*ndir)++;
+	if (de->de_pathlen == 0)
+		(*total_dir_len)++;
+	else
+		*total_dir_len += de->de_pathlen + 2;
+}
+
+static struct directory_entry *compile_directory_data(struct index_state *istate,
+						int nfile,
+						unsigned int *ndir,
+						int *non_conflicted,
+						int *total_dir_len,
+						int *total_file_len)
+{
+	int i, dir_len = -1;
+	char *dir;
+	struct directory_entry *de, *current, *search, *found, *new, *previous_entry;
+	struct cache_entry **cache = istate->cache;
+	struct conflict_entry *conflict_entry;
+	struct hash_table table;
+	uint32_t crc;
+
+	init_hash(&table);
+	de = init_directory_entry("", 0);
+	current = de;
+	*ndir = 1;
+	*total_dir_len = 1;
+	crc = crc32(0, (Bytef*)de->pathname, de->de_pathlen);
+	insert_hash(crc, de, &table);
+	conflict_entry = NULL;
+	for (i = 0; i < nfile; i++) {
+		int new_entry;
+		if (cache[i]->ce_flags & CE_REMOVE)
+			continue;
+
+		new_entry = !ce_stage(cache[i]) || !conflict_entry
+		    || cache_name_compare(conflict_entry->name, conflict_entry->namelen,
+					cache[i]->name, ce_namelen(cache[i]));
+		if (new_entry)
+			(*non_conflicted)++;
+		if (dir_len < 0 || strncmp(cache[i]->name, dir, dir_len)
+		    || cache[i]->name[dir_len] != '/'
+		    || strchr(cache[i]->name + dir_len + 1, '/')) {
+			dir = super_directory(cache[i]->name);
+			if (!dir)
+				dir_len = 0;
+			else
+				dir_len = strlen(dir);
+			crc = crc32(0, (Bytef*)dir, dir_len);
+			found = lookup_hash(crc, &table);
+			search = found;
+			while (search && dir_len != 0 && strcmp(dir, search->pathname) != 0)
+				search = search->next_hash;
+		}
+		previous_entry = current;
+		if (!search || !found) {
+			new = init_directory_entry(dir, dir_len);
+			current->next = new;
+			current = current->next;
+			insert_directory_entry(new, &table, total_dir_len, ndir, crc);
+			search = current;
+		}
+		if (new_entry) {
+			search->de_nfiles++;
+			*total_file_len += ce_namelen(cache[i]) + 1;
+			if (search->de_pathlen)
+				*total_file_len -= search->de_pathlen + 1;
+			ce_queue_push(&(search->ce), &(search->ce_last), cache[i]);
+		}
+		if (ce_stage(cache[i]) > 0) {
+			struct conflict_part *conflict_part;
+			if (new_entry) {
+				conflict_entry = create_conflict_entry_from_ce(cache[i], search->de_pathlen);
+				add_conflict_to_directory_entry(search, conflict_entry);
+			}
+			conflict_part = conflict_part_from_inmemory(cache[i]);
+			add_part_to_conflict_entry(search, conflict_entry, conflict_part);
+		}
+		if (dir && !found) {
+			struct directory_entry *no_subtrees;
+
+			no_subtrees = current;
+			dir = super_directory(dir);
+			if (dir)
+				dir_len = strlen(dir);
+			else
+				dir_len = 0;
+			crc = crc32(0, (Bytef*)dir, dir_len);
+			found = lookup_hash(crc, &table);
+			while (!found) {
+				new = init_directory_entry(dir, dir_len);
+				new->de_nsubtrees = 1;
+				new->next = no_subtrees;
+				no_subtrees = new;
+				insert_directory_entry(new, &table, total_dir_len, ndir, crc);
+				dir = super_directory(dir);
+				if (!dir)
+					dir_len = 0;
+				else
+					dir_len = strlen(dir);
+				crc = crc32(0, (Bytef*)dir, dir_len);
+				found = lookup_hash(crc, &table);
+			}
+			search = found;
+			while (search->next_hash && strcmp(dir, search->pathname) != 0)
+				search = search->next_hash;
+			if (search)
+				found = search;
+			found->de_nsubtrees++;
+			previous_entry->next = no_subtrees;
+		}
+	}
+	return de;
+}
+
+static void ondisk_from_cache_entry(struct cache_entry *ce,
+				    struct ondisk_cache_entry_v5 *ondisk)
+{
+	unsigned int flags;
+
+	flags  = ce->ce_flags & CE_STAGEMASK;
+	flags |= ce->ce_flags & CE_VALID;
+	if (ce->ce_flags & CE_INTENT_TO_ADD)
+		flags |= CE_INTENT_TO_ADD_V5;
+	if (ce->ce_flags & CE_SKIP_WORKTREE)
+		flags |= CE_SKIP_WORKTREE_V5;
+	ondisk->flags      = htons(flags);
+	ondisk->mode       = htons(ce->ce_mode);
+	ondisk->mtime.sec  = htonl(ce->ce_mtime.sec);
+#ifdef USE_NSEC
+	ondisk->mtime.nsec = htonl(ce->ce_mtime.nsec);
+#else
+	ondisk->mtime.nsec = 0;
+#endif
+	if (!ce->ce_stat_crc)
+		ce->ce_stat_crc = calculate_stat_crc(ce);
+	ondisk->stat_crc   = htonl(ce->ce_stat_crc);
+	hashcpy(ondisk->sha1, ce->sha1);
+}
+
+static int write_directories_v5(struct directory_entry *de, int fd, int conflict_offset)
+{
+	struct directory_entry *current;
+	struct ondisk_directory_entry ondisk;
+	int current_offset, offset_write, ondisk_size, foffset;
+	uint32_t crc;
+
+	/*
+	 * Sum the field sizes by hand: the compiler pads the struct to a
+	 * multiple of 4 bytes, so sizeof(ondisk) would be too large.
+	 */
+	ondisk_size = sizeof(ondisk.flags)
+		+ sizeof(ondisk.foffset)
+		+ sizeof(ondisk.cr)
+		+ sizeof(ondisk.ncr)
+		+ sizeof(ondisk.nsubtrees)
+		+ sizeof(ondisk.nfiles)
+		+ sizeof(ondisk.nentries)
+		+ sizeof(ondisk.sha1);
+	current = de;
+	current_offset = 0;
+	foffset = 0;
+	while (current) {
+		int pathlen;
+
+		offset_write = htonl(current_offset);
+		if (ce_write_v5(NULL, fd, &offset_write, 4) < 0)
+			return -1;
+		if (current->de_pathlen == 0)
+			pathlen = 0;
+		else
+			pathlen = current->de_pathlen + 1;
+		current_offset += pathlen + 1 + ondisk_size + 4;
+		current = current->next;
+	}
+	/*
+	 * Write one more offset, which points to the end of the entries,
+	 * because we use it for calculating the dir length, instead of
+	 * using strlen.
+	 */
+	offset_write = htonl(current_offset);
+	if (ce_write_v5(NULL, fd, &offset_write, 4) < 0)
+		return -1;
+	current = de;
+	while (current) {
+		crc = 0;
+		if (current->de_pathlen == 0) {
+			if (ce_write_v5(&crc, fd, current->pathname, 1) < 0)
+				return -1;
+		} else {
+			char *path;
+			path = xmalloc(sizeof(char) * (current->de_pathlen + 2));
+			memcpy(path, current->pathname, current->de_pathlen);
+			memcpy(path + current->de_pathlen, "/\0", 2);
+			if (ce_write_v5(&crc, fd, path, current->de_pathlen + 2) < 0)
+				return -1;
+		}
+		current->de_foffset = foffset;
+		current->de_cr = conflict_offset;
+		ondisk_from_directory_entry(current, &ondisk);
+		if (ce_write_v5(&crc, fd, &ondisk, ondisk_size) < 0)
+			return -1;
+		crc = htonl(crc);
+		if (ce_write_v5(NULL, fd, &crc, 4) < 0)
+			return -1;
+		conflict_offset += current->conflict_size;
+		foffset += current->de_nfiles * 4;
+		current = current->next;
+	}
+	return 0;
+}
+
+static int write_entries_v5(struct index_state *istate,
+			    struct directory_entry *de,
+			    int entries,
+			    int fd,
+			    int offset_to_offset)
+{
+	int offset, offset_write, ondisk_size;
+	struct directory_entry *current;
+
+	offset = 0;
+	ondisk_size = sizeof(struct ondisk_cache_entry_v5);
+	current = de;
+	while (current) {
+		int pathlen;
+		struct cache_entry *ce = current->ce;
+
+		if (current->de_pathlen == 0)
+			pathlen = 0;
+		else
+			pathlen = current->de_pathlen + 1;
+		for (; ce; ce = ce->next) {
+			/* Skip removed entries; the loop header advances ce */
+			if (ce->ce_flags & CE_REMOVE)
+				continue;
+			if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
+				ce_smudge_racily_clean_entry_v5(ce);
+
+			offset_write = htonl(offset);
+			if (ce_write_v5(NULL, fd, &offset_write, 4) < 0)
+				return -1;
+			offset += ce_namelen(ce) - pathlen + 1 + ondisk_size + 4;
+		}
+		current = current->next;
+	}
+	/*
+	 * Write one more offset, which points to the end of the entries,
+	 * because we use it for calculating the file length, instead of
+	 * using strlen.
+	 */
+	offset_write = htonl(offset);
+	if (ce_write_v5(NULL, fd, &offset_write, 4) < 0)
+		return -1;
+
+	offset = offset_to_offset;
+	current = de;
+	while (current) {
+		int pathlen;
+		struct cache_entry *ce = current->ce;
+
+		if (current->de_pathlen == 0)
+			pathlen = 0;
+		else
+			pathlen = current->de_pathlen + 1;
+		for (; ce; ce = ce->next) {
+			struct ondisk_cache_entry_v5 ondisk;
+			uint32_t crc, calc_crc;
+
+			/* Skip removed entries; the loop header advances ce */
+			if (ce->ce_flags & CE_REMOVE)
+				continue;
+			calc_crc = htonl(offset);
+			crc = crc32(0, (Bytef*)&calc_crc, 4);
+			if (ce_write_v5(&crc, fd, ce->name + pathlen,
+					ce_namelen(ce) - pathlen + 1) < 0)
+				return -1;
+			ondisk_from_cache_entry(ce, &ondisk);
+			if (ce_write_v5(&crc, fd, &ondisk, ondisk_size) < 0)
+				return -1;
+			crc = htonl(crc);
+			if (ce_write_v5(NULL, fd, &crc, 4) < 0)
+				return -1;
+			offset += 4;
+		}
+		current = current->next;
+	}
+	return 0;
+}
+
+static int write_conflict_v5(struct conflict_entry *conflict, int fd)
+{
+	struct conflict_entry *current;
+	struct conflict_part *current_part;
+	uint32_t crc;
+
+	current = conflict;
+	while (current) {
+		unsigned int to_write;
+
+		crc = 0;
+		if (ce_write_v5(&crc, fd,
+		     (Bytef*)(current->name + current->pathlen),
+		     current->namelen - current->pathlen) < 0)
+			return -1;
+		if (ce_write_v5(&crc, fd, (Bytef*)"\0", 1) < 0)
+			return -1;
+		to_write = htonl(current->nfileconflicts);
+		if (ce_write_v5(&crc, fd, (Bytef*)&to_write, 4) < 0)
+			return -1;
+		current_part = current->entries;
+		while (current_part) {
+			struct ondisk_conflict_part ondisk;
+
+			conflict_to_ondisk(current_part, &ondisk);
+			if (ce_write_v5(&crc, fd, (Bytef*)&ondisk, sizeof(struct ondisk_conflict_part)) < 0)
+				return -1;
+			current_part = current_part->next;
+		}
+		to_write = htonl(crc);
+		if (ce_write_v5(NULL, fd, (Bytef*)&to_write, 4) < 0)
+			return -1;
+		current = current->next;
+	}
+	return 0;
+}
+
+static int write_conflicts_v5(struct index_state *istate,
+			      struct directory_entry *de,
+			      int fd)
+{
+	struct directory_entry *current;
+
+	current = de;
+	while (current) {
+		if (current->de_ncr != 0) {
+			if (write_conflict_v5(current->conflict, fd) < 0)
+				return -1;
+		}
+		current = current->next;
+	}
+	return 0;
+}
+
+static int write_index_v5(struct index_state *istate, int newfd)
+{
+	struct cache_version_header hdr;
+	struct cache_header_v5 hdr_v5;
+	struct cache_entry **cache = istate->cache;
+	struct directory_entry *de;
+	struct ondisk_directory_entry *ondisk;
+	int entries = istate->cache_nr;
+	int i, removed, non_conflicted, total_dir_len, ondisk_directory_size;
+	int total_file_len, conflict_offset, offset_to_offset;
+	unsigned int ndir;
+	uint32_t crc;
+
+	for (i = removed = 0; i < entries; i++) {
+		if (cache[i]->ce_flags & CE_REMOVE)
+			removed++;
+	}
+	hdr.hdr_signature = htonl(CACHE_SIGNATURE);
+	hdr.hdr_version = htonl(istate->version);
+	hdr_v5.hdr_nfile = htonl(entries - removed);
+	hdr_v5.hdr_nextension = htonl(0); /* Currently no extensions are supported */
+
+	non_conflicted = 0;
+	total_dir_len = 0;
+	total_file_len = 0;
+	de = compile_directory_data(istate, entries, &ndir, &non_conflicted,
+			&total_dir_len, &total_file_len);
+	hdr_v5.hdr_ndir = htonl(ndir);
+
+	/*
+	 * Sum the field sizes by hand: the compiler pads structs to a
+	 * multiple of 4 bytes, so sizeof() would include the padding.
+	 */
+	ondisk_directory_size = sizeof(ondisk->flags)
+		+ sizeof(ondisk->foffset)
+		+ sizeof(ondisk->cr)
+		+ sizeof(ondisk->ncr)
+		+ sizeof(ondisk->nsubtrees)
+		+ sizeof(ondisk->nfiles)
+		+ sizeof(ondisk->nentries)
+		+ sizeof(ondisk->sha1);
+	hdr_v5.hdr_fblockoffset = htonl(sizeof(hdr) + sizeof(hdr_v5) + 4
+		+ (ndir + 1) * 4
+		+ total_dir_len
+		+ ndir * (ondisk_directory_size + 4)
+		+ (non_conflicted + 1) * 4);
+
+	crc = 0;
+	if (ce_write_v5(&crc, newfd, &hdr, sizeof(hdr)) < 0)
+		return -1;
+	if (ce_write_v5(&crc, newfd, &hdr_v5, sizeof(hdr_v5)) < 0)
+		return -1;
+	crc = htonl(crc);
+	if (ce_write_v5(NULL, newfd, &crc, 4) < 0)
+		return -1;
+
+	conflict_offset = sizeof(hdr) + sizeof(hdr_v5) + 4
+		+ (ndir + 1) * 4
+		+ total_dir_len
+		+ ndir * (ondisk_directory_size + 4)
+		+ (non_conflicted + 1) * 4
+		+ total_file_len
+		+ non_conflicted * (sizeof(struct ondisk_cache_entry_v5) + 4);
+	if (write_directories_v5(de, newfd, conflict_offset) < 0)
+		return -1;
+	offset_to_offset = sizeof(hdr) + sizeof(hdr_v5) + 4
+		+ (ndir + 1) * 4
+		+ total_dir_len
+		+ ndir * (ondisk_directory_size + 4);
+	if (write_entries_v5(istate, de, entries, newfd, offset_to_offset) < 0)
+		return -1;
+	if (write_conflicts_v5(istate, de, newfd) < 0)
+		return -1;
+	return ce_flush_v5(newfd);
+}
+
 int write_index(struct index_state *istate, int newfd)
 {
 	if (!istate->version)
 		istate->version = INDEX_FORMAT_DEFAULT;
 
-	return write_index_v2(istate, newfd);
+	if (istate->version != 5)
+		return write_index_v2(istate, newfd);
+	else
+		return write_index_v5(istate, newfd);
 }
 
 /*
-- 
1.7.10.886.gdf6792c.dirty
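For reviewers, the section offsets that write_index_v5() computes can be
sketched in isolation. This is only an illustration of the arithmetic in
the patch: the struct and function names are hypothetical, and the sizes
are passed in as parameters instead of being taken from the real structs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inputs; the field names are illustrative, not from the patch. */
struct v5_layout_input {
	size_t header_size;     /* sizeof(hdr) + sizeof(hdr_v5) */
	size_t dir_entry_size;  /* unpadded size of one on-disk directory entry */
	size_t file_entry_size; /* size of one on-disk file entry */
	size_t ndir;            /* number of directories */
	size_t nfile;           /* number of non-conflicted files */
	size_t total_dir_len;   /* total bytes of stored directory names */
	size_t total_file_len;  /* total bytes of stored file names */
};

struct v5_layout {
	size_t dir_offsets;  /* start of the directory offset table */
	size_t file_offsets; /* offset_to_offset in the patch */
	size_t file_block;   /* hdr_fblockoffset in the patch */
	size_t conflicts;    /* conflict_offset in the patch */
};

static struct v5_layout v5_compute_layout(const struct v5_layout_input *in)
{
	struct v5_layout out;
	const size_t crc_size = 4;    /* every checksum is a 32-bit CRC */
	const size_t offset_size = 4; /* every table slot is a 32-bit offset */

	assert(in != NULL);
	/* The header is followed by its own CRC. */
	out.dir_offsets = in->header_size + crc_size;
	/* ndir + 1 offsets, then the names and the CRC-protected entries. */
	out.file_offsets = out.dir_offsets
		+ (in->ndir + 1) * offset_size
		+ in->total_dir_len
		+ in->ndir * (in->dir_entry_size + crc_size);
	/* nfile + 1 offsets; the extra one marks the end of the entries. */
	out.file_block = out.file_offsets + (in->nfile + 1) * offset_size;
	out.conflicts = out.file_block
		+ in->total_file_len
		+ in->nfile * (in->file_entry_size + crc_size);
	return out;
}
```

Each section starts where the previous one ends, so the three expressions
in write_index_v5() share a common prefix and could be derived from one
another in the same way.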

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 13/16] Write index-v5 cache-tree data
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (11 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 12/16] Write index-v5 Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-02 11:02 ` [PATCH 14/16] Write resolve-undo data for index-v5 Thomas Gummerer
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Write the cache-tree data for the index version 5 file format. The
in-memory cache-tree data is converted to the on-disk format by adding
it to the directory entries that were compiled from the cache entries
in the previous step.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 cache-tree.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 cache-tree.h |  1 +
 read-cache.c |  2 ++
 3 files changed, 55 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 6a314aa..bf945ef 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -612,6 +612,58 @@ struct cache_tree *cache_tree_convert_v5(struct directory_entry *de)
 	return convert_one(queue, 0);
 }
 
+
+static void convert_one_to_ondisk_v5(struct hash_table *table, struct cache_tree *it,
+				const char *path, int pathlen, uint32_t crc)
+{
+	int i;
+	struct directory_entry *found, *search;
+
+	crc = crc32(crc, (Bytef*)path, pathlen);
+	found = lookup_hash(crc, table);
+	search = found;
+	while (search && strcmp(path, search->pathname + search->de_pathlen - strlen(path)) != 0)
+		search = search->next_hash;
+	if (!search)
+		return;
+	/*
+	 * The number of subtrees is already calculated by
+	 * compile_directory_data, therefore we only need to
+	 * add the entry_count
+	 */
+	search->de_nentries = it->entry_count;
+	if (0 <= it->entry_count)
+		hashcpy(search->sha1, it->sha1);
+	if (strcmp(path, "") != 0)
+		crc = crc32(crc, (Bytef*)"/", 1);
+
+#if DEBUG
+	if (0 <= it->entry_count)
+		fprintf(stderr, "cache-tree <%.*s> (%d ent, %d subtree) %s\n",
+			pathlen, path, it->entry_count, it->subtree_nr,
+			sha1_to_hex(it->sha1));
+	else
+		fprintf(stderr, "cache-tree <%.*s> (%d subtree) invalid\n",
+			pathlen, path, it->subtree_nr);
+#endif
+
+	for (i = 0; i < it->subtree_nr; i++) {
+		struct cache_tree_sub *down = it->down[i];
+		if (i) {
+			struct cache_tree_sub *prev = it->down[i-1];
+			if (subtree_name_cmp(down->name, down->namelen,
+					     prev->name, prev->namelen) <= 0)
+				die("fatal - unsorted cache subtree");
+		}
+		convert_one_to_ondisk_v5(table, down->cache_tree, down->name, down->namelen, crc);
+	}
+}
+
+void cache_tree_to_ondisk_v5(struct hash_table *table, struct cache_tree *root)
+{
+	convert_one_to_ondisk_v5(table, root, "", 0, 0);
+}
+
 static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *path)
 {
 	if (!it)
diff --git a/cache-tree.h b/cache-tree.h
index f4131a6..a41b684 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -33,6 +33,7 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
 struct cache_tree *cache_tree_convert_v5(struct directory_entry *de);
+void cache_tree_to_ondisk_v5(struct hash_table *table, struct cache_tree *root);
 
 int cache_tree_fully_valid(struct cache_tree *);
 int cache_tree_update(struct cache_tree *, struct cache_entry **, int, int);
diff --git a/read-cache.c b/read-cache.c
index 0f1a218..6538ddf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2742,6 +2742,8 @@ static struct directory_entry *compile_directory_data(struct index_state *istate
 			previous_entry->next = no_subtrees;
 		}
 	}
+	if (istate->cache_tree)
+		cache_tree_to_ondisk_v5(&table, istate->cache_tree);
 	return de;
 }
 
-- 
1.7.10.886.gdf6792c.dirty
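The lookup key used by convert_one_to_ondisk_v5() is a CRC-32 that is
extended one path component at a time. A minimal sketch of why that
yields the same key as hashing the full directory path follows. It uses
a small bitwise CRC-32 so that it does not depend on zlib, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected polynomial 0xEDB88320). It can be chained:
 * feeding the result back in as `crc` continues the same checksum.
 */
static uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;
	int bit;

	assert(buf != NULL || len == 0);
	crc = ~crc;
	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Key of a directory: the parent's running CRC extended by its name. */
static uint32_t dir_key(uint32_t prefix_crc, const char *name, size_t namelen)
{
	return crc32_update(prefix_crc, name, namelen);
}

/* Running CRC handed to children: append "/" unless this is the root. */
static uint32_t child_prefix(uint32_t key, size_t namelen)
{
	return namelen ? crc32_update(key, "/", 1) : key;
}
```

With these helpers the key for "a/b" is dir_key(child_prefix(dir_key(0,
"a", 1), 1), "b", 1), which equals the CRC-32 of the string "a/b". That
is why the hash table can be filled from full path names and still be
queried while walking the cache-tree.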

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 14/16] Write resolve-undo data for index-v5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (12 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 13/16] Write index-v5 cache-tree data Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-02 11:02 ` [PATCH 15/16] update-index.c: add a force-rewrite option Thomas Gummerer
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Write the resolve-undo data in the on-disk format by joining the data
in the resolve-undo string-list with the existing conflicts that were
compiled earlier while searching the directories, and adding them to
the corresponding directory entries.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 read-cache.c   |  1 +
 resolve-undo.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 resolve-undo.h |  1 +
 3 files changed, 95 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 6538ddf..a4ca40a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2744,6 +2744,7 @@ static struct directory_entry *compile_directory_data(struct index_state *istate
 	}
 	if (istate->cache_tree)
 		cache_tree_to_ondisk_v5(&table, istate->cache_tree);
+	resolve_undo_to_ondisk_v5(&table, istate->resolve_undo, ndir, total_dir_len, de);
 	return de;
 }
 
diff --git a/resolve-undo.c b/resolve-undo.c
index f96c6ba..4568dcc 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -206,3 +206,96 @@ void resolve_undo_convert_v5(struct index_state *istate,
 		ce = ce->next;
 	}
 }
+
+void resolve_undo_to_ondisk_v5(struct hash_table *table,
+				struct string_list *resolve_undo,
+				unsigned int *ndir, int *total_dir_len,
+				struct directory_entry *de)
+{
+	struct string_list_item *item;
+	struct directory_entry *search;
+
+	if (!resolve_undo)
+		return;
+	for_each_string_list_item(item, resolve_undo) {
+		struct conflict_entry *conflict_entry;
+		struct resolve_undo_info *ui = item->util;
+		char *super;
+		int i, dir_len, len;
+		uint32_t crc;
+		struct directory_entry *found, *current, *new_tree;
+
+		if (!ui)
+			continue;
+
+		super = super_directory(item->string);
+		if (!super)
+			dir_len = 0;
+		else
+			dir_len = strlen(super);
+		crc = crc32(0, (Bytef*)super, dir_len);
+		found = lookup_hash(crc, table);
+		current = NULL;
+		new_tree = NULL;
+
+		while (!found) {
+			struct directory_entry *new;
+
+			new = init_directory_entry(super, dir_len);
+			if (!current)
+				current = new;
+			insert_directory_entry(new, table, total_dir_len, ndir, crc);
+			if (new_tree != NULL)
+				new->de_nsubtrees = 1;
+			new->next = new_tree;
+			new_tree = new;
+			super = super_directory(super);
+			if (!super)
+				dir_len = 0;
+			else
+				dir_len = strlen(super);
+			crc = crc32(0, (Bytef*)super, dir_len);
+			found = lookup_hash(crc, table);
+		}
+		search = found;
+		while (search->next_hash && strcmp(super, search->pathname) != 0)
+			search = search->next_hash;
+		if (search && !current)
+			current = search;
+		if (!search && !current)
+			current = new_tree;
+		if (!super && new_tree) {
+			new_tree->next = de->next;
+			de->next = new_tree;
+			de->de_nsubtrees++;
+		} else if (new_tree) {
+			struct directory_entry *temp;
+
+			search = de->next;
+			while (strcmp(super, search->pathname))
+				search = search->next;
+			temp = new_tree;
+			while (temp->next)
+				temp = temp->next;
+			search->de_nsubtrees++;
+			temp->next = search->next;
+			search->next = new_tree;
+		}
+
+		len = strlen(item->string);
+		conflict_entry = create_new_conflict(item->string, len, current->de_pathlen);
+		add_conflict_to_directory_entry(current, conflict_entry);
+		for (i = 0; i < 3; i++) {
+			if (ui->mode[i]) {
+				struct conflict_part *cp;
+
+				cp = xmalloc(sizeof(struct conflict_part));
+				cp->flags = (i + 1) << CONFLICT_STAGESHIFT;
+				cp->entry_mode = ui->mode[i];
+				cp->next = NULL;
+				hashcpy(cp->sha1, ui->sha1[i]);
+				add_part_to_conflict_entry(current, conflict_entry, cp);
+			}
+		}
+	}
+}
diff --git a/resolve-undo.h b/resolve-undo.h
index ab660a6..ff80d84 100644
--- a/resolve-undo.h
+++ b/resolve-undo.h
@@ -14,5 +14,6 @@ extern int unmerge_index_entry_at(struct index_state *, int);
 extern void unmerge_index(struct index_state *, const char **);
 
 extern void resolve_undo_convert_v5(struct index_state *, struct conflict_entry *);
+extern void resolve_undo_to_ondisk_v5(struct hash_table *, struct string_list *, unsigned int *, int *, struct directory_entry *);
 
 #endif
-- 
1.7.10.886.gdf6792c.dirty
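The last loop of resolve_undo_to_ondisk_v5() turns the three
resolve-undo slots into conflict parts, one per stage that is present.
A self-contained sketch of that step is below. The type names, the
array-based output and the shift value are assumptions made for the
example; the real CONFLICT_STAGESHIFT is defined elsewhere in the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_STAGESHIFT 13 /* hypothetical stand-in for CONFLICT_STAGESHIFT */
#define EXAMPLE_HASH_LEN 20   /* length of a SHA-1 */

/* Simplified stand-in for struct resolve_undo_info. */
struct example_undo_info {
	unsigned int mode[3];
	unsigned char sha1[3][EXAMPLE_HASH_LEN];
};

/* Simplified stand-in for struct conflict_part, without the list link. */
struct example_part {
	unsigned int flags;
	unsigned int entry_mode;
	unsigned char sha1[EXAMPLE_HASH_LEN];
};

/* Fill `out` with one part per non-empty slot; return how many were written. */
static size_t undo_to_parts(const struct example_undo_info *ui,
			    struct example_part out[3])
{
	size_t n = 0;
	int i;

	assert(ui != NULL && out != NULL);
	for (i = 0; i < 3; i++) {
		if (!ui->mode[i])
			continue; /* this stage did not exist in the conflict */
		/* Slot i holds stage i + 1, stored in the flag bits. */
		out[n].flags = (unsigned int)(i + 1) << EXAMPLE_STAGESHIFT;
		out[n].entry_mode = ui->mode[i];
		memcpy(out[n].sha1, ui->sha1[i], EXAMPLE_HASH_LEN);
		n++;
	}
	return n;
}

/* Recover the stage number (1..3) from a part's flags. */
static int part_stage(const struct example_part *p)
{
	return (int)((p->flags >> EXAMPLE_STAGESHIFT) & 3u);
}
```

A slot with mode 0 produces no part, so a conflict where only "ours"
and "theirs" existed ends up with two parts, for stages 2 and 3.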

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 15/16] update-index.c: add a force-rewrite option
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (13 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 14/16] Write resolve-undo data for index-v5 Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-02 11:02 ` [PATCH 16/16] p0002-index.sh: add perf test for the index formats Thomas Gummerer
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Add a force-rewrite option to update-index, which allows the user
to rewrite the index, even if there are no changes. This can be used
to do performance tests of both the reader and the writer.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 builtin/update-index.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index e747def..063bf54 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -24,6 +24,7 @@ static int allow_remove;
 static int allow_replace;
 static int info_only;
 static int force_remove;
+static int force_rewrite;
 static int verbose;
 static int mark_valid_only;
 static int mark_skip_worktree_only;
@@ -734,6 +735,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "unmerged", &refresh_args.flags,
 			"refresh even if index contains unmerged entries",
 			REFRESH_UNMERGED),
+		OPT_SET_INT(0, "force-rewrite", &force_rewrite,
+			"force an index rewrite even if there is no change", 1),
 		{OPTION_CALLBACK, 0, "refresh", &refresh_args, NULL,
 			"refresh stat information",
 			PARSE_OPT_NOARG | PARSE_OPT_NONEG,
@@ -892,7 +895,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
-	if (active_cache_changed) {
+	if (active_cache_changed || force_rewrite) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)
 				exit(128);
-- 
1.7.10.886.gdf6792c.dirty

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 16/16] p0002-index.sh: add perf test for the index formats
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (14 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 15/16] update-index.c: add a force-rewrite option Thomas Gummerer
@ 2012-08-02 11:02 ` Thomas Gummerer
  2012-08-02 12:50   ` Nguyen Thai Ngoc Duy
  2012-08-02 12:10 ` [RFC 0/16] Introduce index file format version 5 Nguyen Thai Ngoc Duy
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 11:02 UTC (permalink / raw)
  To: git; +Cc: trast, mhagger, gitster, pclouds, robin.rosenberg,
	Thomas Gummerer

Add a performance test for index version [23]/4/5 by using
git update-index --force-rewrite, thus testing both the reader
and the writer speed of all index formats.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 t/perf/p0002-index.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100755 t/perf/p0002-index.sh

diff --git a/t/perf/p0002-index.sh b/t/perf/p0002-index.sh
new file mode 100755
index 0000000..2996357
--- /dev/null
+++ b/t/perf/p0002-index.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description="Tests index versions [23]/4/5"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'convert to v3' '
+	git update-index --index-version=3
+'
+
+test_perf 'v[23]: update-index' '
+	git update-index --force-rewrite >/dev/null
+'
+
+test_expect_success 'convert to v4' '
+	git update-index --index-version=4
+'
+
+test_perf 'v4: update-index' '
+	git update-index --force-rewrite >/dev/null
+'
+
+test_expect_success 'convert to v5' '
+	git update-index --index-version=5
+'
+
+test_perf 'v5: update-index' '
+	git update-index --force-rewrite >/dev/null
+'
+
+test_done
-- 
1.7.10.886.gdf6792c.dirty

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (15 preceding siblings ...)
  2012-08-02 11:02 ` [PATCH 16/16] p0002-index.sh: add perf test for the index formats Thomas Gummerer
@ 2012-08-02 12:10 ` Nguyen Thai Ngoc Duy
  2012-08-02 13:47   ` Thomas Gummerer
  2012-08-03  3:16 ` Nguyen Thai Ngoc Duy
  2012-08-03  9:13 ` Thomas Rast
  18 siblings, 1 reply; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 12:10 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> Documentation/technical/index-file-format-v5.txt |  281 ++++++++++++++++++++++++++++++++++
> builtin/update-index.c                           |    5 +-
> cache-tree.c                                     |  145 ++++++++++++++++++
> cache-tree.h                                     |    7 +
> cache.h                                          |   96 +++++++++++-
> read-cache.c                                     | 1519 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
> resolve-undo.c                                   |  129 ++++++++++++++++
> resolve-undo.h                                   |    3 +
> t/perf/p0002-index.sh                            |   33 ++++
> t/t2104-update-index-skip-worktree.sh            |   15 +-
> t/t3700-add.sh                                   |    1 +
> test-index-version.c                             |    2 +-
> 12 files changed, 2082 insertions(+), 154 deletions(-)

This is not good (too many pluses in read-cache.c) if it comes from
format-patch. Does it?
-- 
Duy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 01/16] Modify cache_header to prepare for other index formats
  2012-08-02 11:01 ` [PATCH 01/16] Modify cache_header to prepare for other index formats Thomas Gummerer
@ 2012-08-02 12:15   ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 12:15 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> diff --git a/cache.h b/cache.h
> index 6e9a243..d4028ef 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -99,9 +99,12 @@ unsigned long git_deflate_bound(git_zstream *, unsigned long);
>   */
>
>  #define CACHE_SIGNATURE 0x44495243     /* "DIRC" */
> -struct cache_header {
> +struct cache_version_header {
>         unsigned int hdr_signature;
>         unsigned int hdr_version;
> +};
> +
> +struct cache_header_v2 {
>         unsigned int hdr_entries;
>  };
>

These structs should probably be moved to read-cache.c. We can
redeclare cache_version_header again in test-index-version.c (it does
not look too bad to me).
-- 
Duy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/16] Modify match_stat_basic to prepare for other index formats
  2012-08-02 11:01 ` [PATCH 03/16] Modify match_stat_basic " Thomas Gummerer
@ 2012-08-02 12:20   ` Nguyen Thai Ngoc Duy
  2012-08-02 14:16     ` Thomas Gummerer
  0 siblings, 1 reply; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 12:20 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> @@ -1443,7 +1452,6 @@ void read_index_v2(struct index_state *istate, void *mmap, int mmap_size)
>                 src_offset += consumed;
>         }
>         strbuf_release(&previous_name_buf);
> -
>         while (src_offset <= mmap_size - 20 - 8) {
>                 /* After an array of active_nr index entries,
>                  * there can be arbitrary number of extended
> @@ -1500,7 +1508,6 @@ int read_index_from(struct index_state *istate, const char *path)
>                 die("index file smaller than expected");
>
>         mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
> -       close(fd);
>         if (mmap == MAP_FAILED)
>                 die_errno("unable to map index file");
>
> @@ -1512,7 +1519,6 @@ int read_index_from(struct index_state *istate, const char *path)
>                 goto unmap;
>
>         read_index_v2(istate, mmap, mmap_size);
> -
>         istate->timestamp.sec = st.st_mtime;
>         istate->timestamp.nsec = ST_MTIME_NSEC(st);
>

you could have done this in 02/16 when you introduced this block.

> @@ -1802,9 +1808,6 @@ int write_index(struct index_state *istate, int newfd)
>                 }
>         }
>
> -       if (!istate->version)
> -               istate->version = INDEX_FORMAT_DEFAULT;
> -
>         /* demote version 3 to version 2 when the latter suffices */
>         if (istate->version == 3 || istate->version == 2)
>                 istate->version = extended ? 3 : 2;

why? it does not seem to be related to the commit message.
-- 
Duy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 04/16] Modify write functions to prepare for other index formats
  2012-08-02 11:01 ` [PATCH 04/16] Modify write functions " Thomas Gummerer
@ 2012-08-02 12:22   ` Nguyen Thai Ngoc Duy
  2012-08-02 14:11     ` Thomas Gummerer
  0 siblings, 1 reply; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 12:22 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> @@ -1785,7 +1785,7 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
>                 rollback_lock_file(lockfile);
>  }
>
> -int write_index(struct index_state *istate, int newfd)
> +int write_index_v2(struct index_state *istate, int newfd)
>  {
>         git_SHA_CTX c;
>         struct cache_version_header hdr;

make it static function too (and read_index_v2 too, I think)
-- 
Duy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/16] Read index-v5
  2012-08-02 11:01 ` [PATCH 09/16] Read index-v5 Thomas Gummerer
@ 2012-08-02 12:45   ` Nguyen Thai Ngoc Duy
  2012-08-02 14:04     ` Thomas Gummerer
  0 siblings, 1 reply; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 12:45 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

General note. I wonder if we should create a separate source file for
v5 (at least the low level handling part). Partial reading/writing
will come (hopefully soon) and read-cache.c on master is already close
to 2000 lines.

On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> +static struct cache_entry *cache_entry_from_ondisk_v5(struct ondisk_cache_entry_v5 *ondisk,
> +                                                  struct directory_entry *de,
> +                                                  char *name,
> +                                                  size_t len,
> +                                                  size_t prefix_len)
> +{
> +       struct cache_entry *ce = xmalloc(cache_entry_size(len + de->de_pathlen));
> +       int flags;
> +
> +       flags = ntoh_s(ondisk->flags);

huh? ntoh_s (and ntoh_l below)? search/replace problem?

> +       ce->ce_ctime.sec  = 0;
> +       ce->ce_mtime.sec  = ntoh_l(ondisk->mtime.sec);
-- 
Duy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 16/16] p0002-index.sh: add perf test for the index formats
  2012-08-02 11:02 ` [PATCH 16/16] p0002-index.sh: add perf test for the index formats Thomas Gummerer
@ 2012-08-02 12:50   ` Nguyen Thai Ngoc Duy
  2012-08-02 13:56     ` Thomas Gummerer
  0 siblings, 1 reply; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 12:50 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 6:02 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> Add a performance test for index version [23]/4/5 by using
> git update-index --force-rewrite, thus testing both the reader
> and the writer speed of all index formats.

On the testing side, it may be an interesting idea to force the whole
test suite to use v5 by default. There are a few test cases that
require a specific index version. We can identify and rule them out.
That'll give v5 a lot more exercises.
-- 
Duy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-02 12:10 ` [RFC 0/16] Introduce index file format version 5 Nguyen Thai Ngoc Duy
@ 2012-08-02 13:47   ` Thomas Gummerer
  2012-08-02 13:53     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 13:47 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On 08/02, Nguyen Thai Ngoc Duy wrote:
> On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > Documentation/technical/index-file-format-v5.txt |  281 ++++++++++++++++++++++++++++++++++
> > builtin/update-index.c                           |    5 +-
> > cache-tree.c                                     |  145 ++++++++++++++++++
> > cache-tree.h                                     |    7 +
> > cache.h                                          |   96 +++++++++++-
> > read-cache.c                                     | 1519 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
> > resolve-undo.c                                   |  129 ++++++++++++++++
> > resolve-undo.h                                   |    3 +
> > t/perf/p0002-index.sh                            |   33 ++++
> > t/t2104-update-index-skip-worktree.sh            |   15 +-
> > t/t3700-add.sh                                   |    1 +
> > test-index-version.c                             |    2 +-
> > 12 files changed, 2082 insertions(+), 154 deletions(-)
> 
> This is not good (too many pluses in read-cache.c) if it comes from
> format-patch. Does it?

No, this comes from git diff --stat. It's probably just my terminal
width then?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-02 13:47   ` Thomas Gummerer
@ 2012-08-02 13:53     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-02 13:53 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 8:47 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> On 08/02, Nguyen Thai Ngoc Duy wrote:
>> On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
>> > Documentation/technical/index-file-format-v5.txt |  281 ++++++++++++++++++++++++++++++++++
>> > builtin/update-index.c                           |    5 +-
>> > cache-tree.c                                     |  145 ++++++++++++++++++
>> > cache-tree.h                                     |    7 +
>> > cache.h                                          |   96 +++++++++++-
>> > read-cache.c                                     | 1519 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
>> > resolve-undo.c                                   |  129 ++++++++++++++++
>> > resolve-undo.h                                   |    3 +
>> > t/perf/p0002-index.sh                            |   33 ++++
>> > t/t2104-update-index-skip-worktree.sh            |   15 +-
>> > t/t3700-add.sh                                   |    1 +
>> > test-index-version.c                             |    2 +-
>> > 12 files changed, 2082 insertions(+), 154 deletions(-)
>>
>> This is not good (too many pluses in read-cache.c) if it comes from
>> format-patch. Does it?
>
> No this comes from git diff --stat. It's probably just my terminal
> width then?
>

Ah ok. Recent git has learned to use as much screen estate as possible,
but format-patch keeps it within 75 or so columns.
-- 
Duy
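
To illustrate the width limit discussed here, the following is a rough
sketch of how a diffstat bar can be scaled to a fixed number of
columns. It shows the general idea only and is not a copy of git's
code; the function name and the assumption that width counts only the
+/- graph columns are mine.

```c
#include <assert.h>

/*
 * Scale `count` changed lines to a bar of at most `width` characters,
 * where `max_change` is the largest per-file change in the diffstat.
 */
static int scale_bar(int count, int width, int max_change)
{
	assert(width > 0 && max_change > 0);
	assert(count >= 0 && count <= max_change);
	if (!count)
		return 0;     /* no change, no bar */
	if (max_change <= width)
		return count; /* everything fits unscaled */
	/* Keep at least one character for any file that changed. */
	return 1 + count * (width - 1) / max_change;
}
```

With a 40-column graph and read-cache.c as the largest file, its bar
would take the full 40 columns and the other files would be scaled
relative to it, instead of printing one character per changed line.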

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 16/16] p0002-index.sh: add perf test for the index formats
  2012-08-02 12:50   ` Nguyen Thai Ngoc Duy
@ 2012-08-02 13:56     ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 13:56 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On 08/02, Nguyen Thai Ngoc Duy wrote:
> On Thu, Aug 2, 2012 at 6:02 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > Add a performance test for index version [23]/4/5 by using
> > git update-index --force-rewrite, thus testing both the reader
> > and the writer speed of all index formats.
> 
> On the testing side, it may be an interesting idea to force the whole
> test suite to use v5 by default. There are a few test cases that
> require a specific index version. We can identify and rule them out.
> That'll give v5 a lot more exercises.

The test suite already works with index v5. Patches 5 and 6 fix the only
cases where index-v5 would fail. To make it use index-v5 you can just
set INDEX_FORMAT_DEFAULT to 5 in read-cache.c.

I haven't used it for performance testing yet; I'll try to run it with
both versions and post the results. I don't think the results will
differ much though, since the test suite usually uses small
repositories.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 09/16] Read index-v5
  2012-08-02 12:45   ` Nguyen Thai Ngoc Duy
@ 2012-08-02 14:04     ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 14:04 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On 08/02, Nguyen Thai Ngoc Duy wrote:
> General note. I wonder if we should create a separate source file for
> v5 (at least the low level handling part). Partial reading/writing
> will come (hopefully soon) and read-cache.c on master is already close
> to 2000 lines.

To me it would make sense, but we'd probably have to split it to
at least 3 files, one for index-v2, one for index-v5 and one for
the general functions/api.

> On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > +static struct cache_entry *cache_entry_from_ondisk_v5(struct ondisk_cache_entry_v5 *ondisk,
> > +                                                  struct directory_entry *de,
> > +                                                  char *name,
> > +                                                  size_t len,
> > +                                                  size_t prefix_len)
> > +{
> > +       struct cache_entry *ce = xmalloc(cache_entry_size(len + de->de_pathlen));
> > +       int flags;
> > +
> > +       flags = ntoh_s(ondisk->flags);
> 
> huh? ntoh_s (and ntoh_l below)? search/replace problem?

No, they are correct. Junio introduced these functions with index-v4 for
systems that need aligned access. They are defined as follows.

> #ifndef NEEDS_ALIGNED_ACCESS
> #define ntoh_s(var) ntohs(var)
> #define ntoh_l(var) ntohl(var)
> #else
> static inline uint16_t ntoh_s_force_align(void *p)
> {
> 	uint16_t x;
> 	memcpy(&x, p, sizeof(x));
> 	return ntohs(x);
> }
> static inline uint32_t ntoh_l_force_align(void *p)
> {
> 	uint32_t x;
> 	memcpy(&x, p, sizeof(x));
> 	return ntohl(x);
> }
> #define ntoh_s(var) ntoh_s_force_align(&(var))
> #define ntoh_l(var) ntoh_l_force_align(&(var))
> #endif
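
To illustrate why the memcpy() variant exists: on-disk index fields can sit
at odd offsets inside the mmap'ed buffer, and casting such an address to
uint16_t * or uint32_t * may fault on platforms that need aligned access.
Below is a minimal standalone sketch of the same technique. The names
read_be16() and read_be32() are made up for this example and are not part
of git.

```c
#include <arpa/inet.h>  /* ntohs, ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helpers (not git's API): read a network-order value from
 * an address that may not be suitably aligned for a direct load.
 */
static inline uint16_t read_be16(const void *p)
{
	uint16_t x;

	assert(p != NULL);
	memcpy(&x, p, sizeof(x));	/* byte-wise copy into an aligned local */
	return ntohs(x);		/* network (big-endian) to host order */
}

static inline uint32_t read_be32(const void *p)
{
	uint32_t x;

	assert(p != NULL);
	memcpy(&x, p, sizeof(x));
	return ntohl(x);
}
```

Calling read_be16(buf + 1) on a byte buffer is then fine regardless of how
buf itself is aligned, which is what the NEEDS_ALIGNED_ACCESS branch above
achieves for the ondisk structs.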

> > +       ce->ce_ctime.sec  = 0;
> > +       ce->ce_mtime.sec  = ntoh_l(ondisk->mtime.sec);
> -- 
> Duy


* Re: [PATCH 04/16] Modify write functions to prepare for other index formats
  2012-08-02 12:22   ` Nguyen Thai Ngoc Duy
@ 2012-08-02 14:11     ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 14:11 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, trast, mhagger, gitster, robin.rosenberg


On 08/02, Nguyen Thai Ngoc Duy wrote:
> On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > @@ -1785,7 +1785,7 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
> >                 rollback_lock_file(lockfile);
> >  }
> >
> > -int write_index(struct index_state *istate, int newfd)
> > +int write_index_v2(struct index_state *istate, int newfd)
> >  {
> >         git_SHA_CTX c;
> >         struct cache_version_header hdr;
> 
> make it static function too (and read_index_v2 too, I think)
> -- 
> Duy

Makes sense, thanks!


* Re: [PATCH 03/16] Modify match_stat_basic to prepare for other index formats
  2012-08-02 12:20   ` Nguyen Thai Ngoc Duy
@ 2012-08-02 14:16     ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-02 14:16 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On 08/02, Nguyen Thai Ngoc Duy wrote:
> On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > @@ -1443,7 +1452,6 @@ void read_index_v2(struct index_state *istate, void *mmap, int mmap_size)
> >                 src_offset += consumed;
> >         }
> >         strbuf_release(&previous_name_buf);
> > -
> >         while (src_offset <= mmap_size - 20 - 8) {
> >                 /* After an array of active_nr index entries,
> >                  * there can be arbitrary number of extended
> > @@ -1500,7 +1508,6 @@ int read_index_from(struct index_state *istate, const char *path)
> >                 die("index file smaller than expected");
> >
> >         mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
> > -       close(fd);
> >         if (mmap == MAP_FAILED)
> >                 die_errno("unable to map index file");
> >
> > @@ -1512,7 +1519,6 @@ int read_index_from(struct index_state *istate, const char *path)
> >                 goto unmap;
> >
> >         read_index_v2(istate, mmap, mmap_size);
> > -
> >         istate->timestamp.sec = st.st_mtime;
> >         istate->timestamp.nsec = ST_MTIME_NSEC(st);
> >
> 
> you could have done this in 02/16 when you introduced this block.

Thanks.

> > @@ -1802,9 +1808,6 @@ int write_index(struct index_state *istate, int newfd)
> >                 }
> >         }
> >
> > -       if (!istate->version)
> > -               istate->version = INDEX_FORMAT_DEFAULT;
> > -
> >         /* demote version 3 to version 2 when the latter suffices */
> >         if (istate->version == 3 || istate->version == 2)
> >                 istate->version = extended ? 3 : 2;
> 
> why? it does not seem to be related to the commit message.

Sorry, this is wrong; the hunk belongs in patch 4. Thanks for noticing.


* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (16 preceding siblings ...)
  2012-08-02 12:10 ` [RFC 0/16] Introduce index file format version 5 Nguyen Thai Ngoc Duy
@ 2012-08-03  3:16 ` Nguyen Thai Ngoc Duy
  2012-08-03 12:46   ` Thomas Gummerer
  2012-08-03  9:13 ` Thomas Rast
  18 siblings, 1 reply; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-08-03  3:16 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> Series of patches to introduce the index version 5 file format. This
> series does not include any fancy stuff like partial loading or partial
> writing yet, though it's possible to do that with the new format.

I applied the series on top of master. I had to manually resolve
09/16. You may want to rebase the series on master for the reroll
(less work for Junio) and remove trailing whitespace in the patches.

All tests passed (with v5 as the default (*), as I notice now), which is
wonderful. I'll have a closer look over the following days. Thank you
for working on this.

(*) While it's good to run tests with v5 by default, I'm not sure
we should make it the default in the next git release that comes with
v5. For one thing, it'll stop older gits from using shared repos. And
we're not sure whether v5 introduces significant overhead in common
use cases.
-- 
Duy


* Re: [PATCH 05/16] t2104: Don't fail when index version is 5
  2012-08-02 11:01 ` [PATCH 05/16] t2104: Don't fail when index version is 5 Thomas Gummerer
@ 2012-08-03  8:22   ` Thomas Rast
  2012-08-03 12:42     ` Thomas Gummerer
  2012-08-03 16:12     ` Junio C Hamano
  0 siblings, 2 replies; 38+ messages in thread
From: Thomas Rast @ 2012-08-03  8:22 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, pclouds, robin.rosenberg

Thomas Gummerer <t.gummerer@gmail.com> writes:

> The test t2104 currently checks if the index version is correctly
> reduced to 2/increased to 3, when an entry need extended flags,
> or doesn't use them anymore. Since index-v5 doesn't have extended
> flags (the extended flags are part of the normal flags), we simply
> add a check if the index version is 2/3 (whichever is correct for
> that test) or 5.

Next time we set a new index format as default (which might be when we
make v4 the default!), we'll have to patch this again.  Wouldn't it make
more sense to let them depend on a "default index format is v2"
prerequisite?

> -test_expect_success 'index is at version 2' '
> -	test "$(test-index-version < .git/index)" = 2
> +test_expect_success 'index is at version 2 or version 5' '
> +	test "$(test-index-version < .git/index)" = 2 ||
> +	test "$(test-index-version < .git/index)" = 5
>  '

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


* Re: [PATCH 11/16] Read cache-tree in index-v5
  2012-08-02 11:02 ` [PATCH 11/16] Read cache-tree in index-v5 Thomas Gummerer
@ 2012-08-03  8:31   ` Thomas Rast
  2012-08-03 12:41     ` Thomas Gummerer
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Rast @ 2012-08-03  8:31 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, trast, mhagger, gitster, pclouds, robin.rosenberg

Thomas Gummerer <t.gummerer@gmail.com> writes:

> Since the cache-tree data is saved as part of the directory data,
> we have already read it, when we want to read the cache-tree. The
> cache-tree then only has to be converted from the directory data.

I think the first sentence is wrong.  You have already read it at the
very beginning of reading the index format, when you parsed the
directory records, haven't you?

> The cache-tree isn't lexically sorted, but after the pathlen at
> each level, therefore the directories have to be reordered with
> respect to the ondisk layout.

I'm not a native speaker either, but I think this doesn't parse well.
Maybe

  The cache-tree data is arranged in a tree, with the children sorted by
  pathlen at each node.  So we have to rebuild this format from the
  on-disk directory list.
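
As a rough illustration of "sorted by pathlen" versus plain lexical order,
here is a small standalone sketch. The struct dir_name type and both
function names are invented for the example. The tie-break on equal length
(bytewise compare) is an assumption on my part, not something the patch
states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a directory record: a name and its length. */
struct dir_name {
	const char *name;
	size_t len;
};

/* Order by length first; assumed tie-break is a bytewise comparison. */
static int cmp_by_pathlen(const void *a_, const void *b_)
{
	const struct dir_name *a = a_;
	const struct dir_name *b = b_;

	if (a->len != b->len)
		return a->len < b->len ? -1 : 1;
	return memcmp(a->name, b->name, a->len);
}

/* Sort the children of one node in place. */
static void sort_children(struct dir_name *children, size_t nr)
{
	assert(children != NULL);
	qsort(children, nr, sizeof(*children), cmp_by_pathlen);
}
```

With the names "lib", "t", "Documentation" and "abc", a lexical sort would
put "Documentation" first, while this ordering puts "t" first and
"Documentation" last, so a lexically sorted list cannot be reused as-is.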

> +	for (i = 0; i < subtree_nr; i++) {
> +		struct cache_tree *sub;
> +		struct cache_tree_sub *subtree;
> +		char *buf, *name;
> +
> +		name = "";
> +		buf = strtok(down[i].de->pathname, "/");

man 3 strtok says

   Be cautious when using these functions.  If you do use them, note
   that:

   * These functions modify their first argument.

   * These functions cannot be used on constant strings.

   * The identity of the delimiting character is lost.

   * The strtok() function uses a static buffer while parsing, so it's
     not thread safe.  Use strtok_r() if this matters to you.

I don't think the last point will be a problem, but what about modifying
the argument?
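
To make the concern concrete: strtok() overwrites each delimiter it finds
with a NUL byte, so after the loop the pathname in the directory entry is
no longer the full path. If that ever matters, the components can be walked
without writing to the buffer. A minimal sketch using strcspn()/strspn()
follows. The function names are made up for illustration and are not
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the first path component; does not modify 'path'. */
static size_t first_component_len(const char *path)
{
	assert(path != NULL);
	return strcspn(path, "/");
}

/* Count the '/'-separated components without writing to 'path'. */
static size_t count_components(const char *path)
{
	size_t n = 0;

	assert(path != NULL);
	while (*path) {
		size_t len = strcspn(path, "/");	/* span up to next '/' */

		if (len)
			n++;
		path += len;
		path += strspn(path, "/");		/* skip the delimiter(s) */
	}
	return n;
}
```

Because both helpers take a const char *, the compiler also catches an
accidental write, which strtok()'s char * parameter cannot do.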

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-02 11:01 [RFC 0/16] Introduce index file format version 5 Thomas Gummerer
                   ` (17 preceding siblings ...)
  2012-08-03  3:16 ` Nguyen Thai Ngoc Duy
@ 2012-08-03  9:13 ` Thomas Rast
  2012-08-03 12:34   ` Thomas Gummerer
  18 siblings, 1 reply; 38+ messages in thread
From: Thomas Rast @ 2012-08-03  9:13 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: git, mhagger, gitster, pclouds, robin.rosenberg

Thomas Gummerer <t.gummerer@gmail.com> writes:

> [PATCH 01/16] Modify cache_header to prepare for other index formats
> [PATCH 02/16] Modify read functions to prepare for other index
> [PATCH 03/16] Modify match_stat_basic to prepare for other index
> [PATCH 04/16] Modify write functions to prepare for other index
> [PATCH 05/16] t2104: Don't fail when index version is 5
> [PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with
> [PATCH 07/16] Add documentation of the index-v5 file format
> [PATCH 08/16] Make in-memory format aware of stat_crc
> [PATCH 09/16] Read index-v5
> [PATCH 10/16] Read resolve-undo data
> [PATCH 11/16] Read cache-tree in index-v5
> [PATCH 12/16] Write index-v5
> [PATCH 13/16] Write index-v5 cache-tree data
> [PATCH 14/16] Write resolve-undo data for index-v5
> [PATCH 15/16] update-index.c: add a force-rewrite option
> [PATCH 16/16] p0002-index.sh: add perf test for the index formats

I haven't had time for more than a cursory look yet, but good job on the
splits.  This is a large improvement over what you had in Zurich!

One thing that you need to be more careful about is attribution of the
source code.  Credit is very important because it's the only thing
people get for their OSS work.  For some patches you received lots of
input and help by many people.  For example, the documentation patch
that casts the format in stone (or will, when it's finished), should
have "Helped-by:" for *at least* Michael, Junio, and Duy.  You should
dig in the ML archives for other people who may have contributed ideas.

Also, anything that contains nontrivial code from me needs my S-o-b; off
the top of my head that's just 16/16, which AFAICS is even completely
unchanged (!) and needs to come with a From (and my S-o-b).  (I'm not
going to be anal about any of the work we did in Zurich, let's just
classify that as "help" like above.)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-03  9:13 ` Thomas Rast
@ 2012-08-03 12:34   ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-03 12:34 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, mhagger, gitster, pclouds, robin.rosenberg

On 08/03, Thomas Rast wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > [PATCH 01/16] Modify cache_header to prepare for other index formats
> > [PATCH 02/16] Modify read functions to prepare for other index
> > [PATCH 03/16] Modify match_stat_basic to prepare for other index
> > [PATCH 04/16] Modify write functions to prepare for other index
> > [PATCH 05/16] t2104: Don't fail when index version is 5
> > [PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with
> > [PATCH 07/16] Add documentation of the index-v5 file format
> > [PATCH 08/16] Make in-memory format aware of stat_crc
> > [PATCH 09/16] Read index-v5
> > [PATCH 10/16] Read resolve-undo data
> > [PATCH 11/16] Read cache-tree in index-v5
> > [PATCH 12/16] Write index-v5
> > [PATCH 13/16] Write index-v5 cache-tree data
> > [PATCH 14/16] Write resolve-undo data for index-v5
> > [PATCH 15/16] update-index.c: add a force-rewrite option
> > [PATCH 16/16] p0002-index.sh: add perf test for the index formats
> 
> I haven't had time for more than a cursory look yet, but good job on the
> splits.  This is a large improvement over what you had in Zurich!

Thanks, it was easier when looking at the code, instead of just thinking
about the splits off the top of my head.

> One thing that you need to be more careful about is attribution of the
> source code.  Credit is very important because it's the only thing
> people get for their OSS work.  For some patches you received lots of
> input and help by many people.  For example, the documentation patch
> that casts the format in stone (or will, when it's finished), should
> have "Helped-by:" for *at least* Michael, Junio, and Duy.  You should
> dig in the ML archives for other people who may have contributed ideas.
> 
> Also, anything that contains nontrivial code from me needs my S-o-b; off
> the top of my head that's just 16/16, which AFAICS is even completely
> unchanged (!) and needs to come with a From (and my S-o-b).  (I'm not
> going to be anal about any of the work we did in Zurich, let's just
> classify that as "help" like above.)

My apologies, I forgot to add them. I'll make sure to include all
credits in the re-roll.


* Re: [PATCH 11/16] Read cache-tree in index-v5
  2012-08-03  8:31   ` Thomas Rast
@ 2012-08-03 12:41     ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-03 12:41 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, trast, mhagger, gitster, pclouds, robin.rosenberg

On 08/03, Thomas Rast wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > Since the cache-tree data is saved as part of the directory data,
> > we have already read it, when we want to read the cache-tree. The
> > cache-tree then only has to be converted from the directory data.
> 
> I think the first sentence is wrong.  You have already read it at the
> very beginning of reading the index format, when you parsed the
> directory records, haven't you?

Yes, that's what I wanted to say, I'll rephrase this.

> > The cache-tree isn't lexically sorted, but after the pathlen at
> > each level, therefore the directories have to be reordered with
> > respect to the ondisk layout.
> 
> I'm not a native speaker either, but I think this does't parse well.
> Maybe
> 
>   The cache-tree data is arranged in a tree, with the children sorted by
>   pathlen at each node.  So we have to rebuild this format from the
>   on-disk directory list.

Thanks, that sounds better.

> > +	for (i = 0; i < subtree_nr; i++) {
> > +		struct cache_tree *sub;
> > +		struct cache_tree_sub *subtree;
> > +		char *buf, *name;
> > +
> > +		name = "";
> > +		buf = strtok(down[i].de->pathname, "/");
> 
> man 3 strtok says
> 
>    Be cautious when using these functions.  If you do use them, note
>    that:
> 
>    * These functions modify their first argument.
> 
>    * These functions cannot be used on constant strings.
> 
>    * The identity of the delimiting character is lost.
> 
>    * The strtok() function uses a static buffer while parsing, so it's
>      not thread safe.  Use strtok_r() if this matters to you.
> 
> I don't think the last point will be a problem, but what about modifying
> the argument?

Hrm, the function is only called at the end of read_index_v5, after
which we don't need the directory entries anymore. If we later want to
use the directory entries for a new in-memory format, however, we
should probably change it. Maybe a comment on the cache_tree_convert_v5
function would be enough?


* Re: [PATCH 05/16] t2104: Don't fail when index version is 5
  2012-08-03  8:22   ` Thomas Rast
@ 2012-08-03 12:42     ` Thomas Gummerer
  2012-08-03 16:12     ` Junio C Hamano
  1 sibling, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-03 12:42 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, trast, mhagger, gitster, pclouds, robin.rosenberg

On 08/03, Thomas Rast wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > The test t2104 currently checks if the index version is correctly
> > reduced to 2/increased to 3, when an entry need extended flags,
> > or doesn't use them anymore. Since index-v5 doesn't have extended
> > flags (the extended flags are part of the normal flags), we simply
> > add a check if the index version is 2/3 (whichever is correct for
> > that test) or 5.
> 
> Next time we set a new index format as default (which might be when we
> make v4 the default!), we'll have to patch this again.  Wouldn't it make
> more sense to let them depend on a "default index format is v2"
> prerequisite?

Sounds good to me, since formats other than v[23] don't do anything
in this test anyway.

> > -test_expect_success 'index is at version 2' '
> > -	test "$(test-index-version < .git/index)" = 2
> > +test_expect_success 'index is at version 2 or version 5' '
> > +	test "$(test-index-version < .git/index)" = 2 ||
> > +	test "$(test-index-version < .git/index)" = 5
> >  '
> 
> -- 
> Thomas Rast
> trast@{inf,student}.ethz.ch


* Re: [RFC 0/16] Introduce index file format version 5
  2012-08-03  3:16 ` Nguyen Thai Ngoc Duy
@ 2012-08-03 12:46   ` Thomas Gummerer
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gummerer @ 2012-08-03 12:46 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, trast, mhagger, gitster, robin.rosenberg

On 08/03, Nguyen Thai Ngoc Duy wrote:
> On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > Series of patches to introduce the index version 5 file format. This
> > series does not include any fancy stuff like partial loading or partial
> > writing yet, though it's possible to do that with the new format.
> 
> I applied the series on top of master. I had to manually resolve
> 09/16. You may want to rebase the series on master for the reroll
> (less work for Junio) and remove trailing whitespaces in the patches.

Thanks, I'll do that for the reroll.

> All tests passed (with v5 by default (*), I notice it now), which is
> wonderful. I'll have a closer look on the following days. Thank you
> for working on this.
> 
> (*) while it's good to run tests with v5 by default. I'm not sure if
> we should make it by default in the next git release that comes with
> v5. For one thing it'll stop older gits from using shared repos. And
> we're not sure whether v5 introduces significant overhead in common
> use cases.

I didn't intend to make index v5 the default format; I just overlooked
the hunk that was changing it. I completely agree that v5 should not be
the default yet, to avoid any surprises, also for users of jgit, libgit2 etc.


* Re: [PATCH 05/16] t2104: Don't fail when index version is 5
  2012-08-03  8:22   ` Thomas Rast
  2012-08-03 12:42     ` Thomas Gummerer
@ 2012-08-03 16:12     ` Junio C Hamano
  1 sibling, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-08-03 16:12 UTC (permalink / raw)
  To: Thomas Rast
  Cc: Thomas Gummerer, git, trast, mhagger, pclouds, robin.rosenberg

Thomas Rast <trast@inf.ethz.ch> writes:

> Thomas Gummerer <t.gummerer@gmail.com> writes:
>
>> The test t2104 currently checks if the index version is correctly
>> reduced to 2/increased to 3, when an entry need extended flags,
>> or doesn't use them anymore. Since index-v5 doesn't have extended
>> flags (the extended flags are part of the normal flags), we simply
>> add a check if the index version is 2/3 (whichever is correct for
>> that test) or 5.
>
> Next time we set a new index format as default (which might be when we
> make v4 the default!), we'll have to patch this again.  Wouldn't it make
> more sense to let them depend on a "default index format is v2"
> prerequisite?

My preference is not to change the default index version for now,
and for a test that specifically checks features of a particular
index version, force the index version near the beginning of test
using "update-index --index-version $num".

For t2104, I think forcing the index version to 2 at the beginning
and not worry about v4 or later at all would be the right thing to
do.  That way, we will make sure older versions are still supported
with the new code that is capable of reading and writing newer ones.

>> -test_expect_success 'index is at version 2' '
>> -	test "$(test-index-version < .git/index)" = 2
>> +test_expect_success 'index is at version 2 or version 5' '
>> +	test "$(test-index-version < .git/index)" = 2 ||
>> +	test "$(test-index-version < .git/index)" = 5
>>  '

