* [PATCH 0/4] Revisiting cache-tree
@ 2009-04-20 10:58 Junio C Hamano
2009-04-20 10:58 ` [PATCH 1/4] read-tree A B: do not corrupt cache-tree Junio C Hamano
2009-04-20 13:08 ` [PATCH 0/4] Revisiting cache-tree Alex Riesen
0 siblings, 2 replies; 6+ messages in thread
From: Junio C Hamano @ 2009-04-20 10:58 UTC (permalink / raw)
To: git
This series consists of a fix meant for maint branch, and two performance
fix. The second one is a refactoring of the code to support the latter.
I was doing a rather huge import as a multi-step process by doing
something like this:
$ rm -f $GIT_DIR/index
$ git add some ;# this one is huge
$ git tag one $(git write-tree)
$ git repack -a -d
$ rm -f $GIT_DIR/index
$ git add other ;# this one is also huge
$ git tag two $(git write-tree)
$ git repack -a -d
$ git read-tree some other
$ git tag both $(git write-tree)
$ git repack -a -d
The binding of two distinct subtrees is done with the read-tree but it
wrote out an incorrect index (notice the lack of -m; with -m option the
command correctly does a different thing) and resulted in a corrupt tree
object.
Junio C Hamano (4):
read-tree A B: do not corrupt cache-tree
Move prime_cache_tree() to cache-tree.c
read-tree -m A B: prime cache-tree from the switched-to tree
checkout branch: prime cache-tree fully
builtin-checkout.c | 10 +++++++++-
builtin-read-tree.c | 48 ++++++++----------------------------------------
cache-tree.c | 34 ++++++++++++++++++++++++++++++++++
cache-tree.h | 4 ++++
4 files changed, 55 insertions(+), 41 deletions(-)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/4] read-tree A B: do not corrupt cache-tree
2009-04-20 10:58 [PATCH 0/4] Revisiting cache-tree Junio C Hamano
@ 2009-04-20 10:58 ` Junio C Hamano
2009-04-20 10:58 ` [PATCH 2/4] Move prime_cache_tree() to cache-tree.c Junio C Hamano
2009-04-20 13:08 ` [PATCH 0/4] Revisiting cache-tree Alex Riesen
1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2009-04-20 10:58 UTC (permalink / raw)
To: git
An earlier commit aab3b9a (read-tree A B C: do not create a bogus index
and do not segfault, 2009-03-12) resurrected the support for an obscure
(but useful) feature to read and overlay more than one tree into the index
without the -m (merge) option. But the fix was not enough.
Exercising this feature exposes a longstanding bug in the code that primes
the cache-tree in the index from the tree that was read. The intention
was that when we know that the index must exactly match the tree we just
read, we prime the entire cache-tree with it.
However, the logic to detect that case incorrectly triggered if you read
two trees without -m. This resulted in a corrupted cache-tree, and
write-tree would have produced an incorrect tree object out of such an
index.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* Meant for maint.
builtin-read-tree.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index 38fef34..e4e0e71 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -211,7 +211,6 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
case 3:
default:
opts.fn = threeway_merge;
- cache_tree_free(&active_cache_tree);
break;
}
@@ -221,6 +220,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
opts.head_idx = 1;
}
+ cache_tree_free(&active_cache_tree);
for (i = 0; i < nr_trees; i++) {
struct tree *tree = trees[i];
parse_tree(tree);
@@ -235,10 +235,8 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
* valid cache-tree because the index must match exactly
* what came from the tree.
*/
- if (nr_trees && !opts.prefix && (!opts.merge || (stage == 2))) {
- cache_tree_free(&active_cache_tree);
+ if (nr_trees == 1 && !opts.prefix)
prime_cache_tree();
- }
if (write_cache(newfd, active_cache, active_nr) ||
commit_locked_index(&lock_file))
--
1.6.3.rc1.18.g66996
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/4] Move prime_cache_tree() to cache-tree.c
2009-04-20 10:58 ` [PATCH 1/4] read-tree A B: do not corrupt cache-tree Junio C Hamano
@ 2009-04-20 10:58 ` Junio C Hamano
2009-04-20 10:58 ` [PATCH 3/4] read-tree -m A B: prime cache-tree from the switched-to tree Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2009-04-20 10:58 UTC (permalink / raw)
To: git
The interface to build cache-tree belongs there.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
builtin-checkout.c | 1 +
builtin-read-tree.c | 37 +------------------------------------
cache-tree.c | 34 ++++++++++++++++++++++++++++++++++
cache-tree.h | 4 ++++
4 files changed, 40 insertions(+), 36 deletions(-)
diff --git a/builtin-checkout.c b/builtin-checkout.c
index b121fe5..ffdb33a 100644
--- a/builtin-checkout.c
+++ b/builtin-checkout.c
@@ -5,6 +5,7 @@
#include "commit.h"
#include "tree.h"
#include "tree-walk.h"
+#include "cache-tree.h"
#include "unpack-trees.h"
#include "dir.h"
#include "run-command.h"
diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index e4e0e71..9cd7d07 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -29,41 +29,6 @@ static int list_tree(unsigned char *sha1)
return 0;
}
-static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
-{
- struct tree_desc desc;
- struct name_entry entry;
- int cnt;
-
- hashcpy(it->sha1, tree->object.sha1);
- init_tree_desc(&desc, tree->buffer, tree->size);
- cnt = 0;
- while (tree_entry(&desc, &entry)) {
- if (!S_ISDIR(entry.mode))
- cnt++;
- else {
- struct cache_tree_sub *sub;
- struct tree *subtree = lookup_tree(entry.sha1);
- if (!subtree->object.parsed)
- parse_tree(subtree);
- sub = cache_tree_sub(it, entry.path);
- sub->cache_tree = cache_tree();
- prime_cache_tree_rec(sub->cache_tree, subtree);
- cnt += sub->cache_tree->entry_count;
- }
- }
- it->entry_count = cnt;
-}
-
-static void prime_cache_tree(void)
-{
- if (!nr_trees)
- return;
- active_cache_tree = cache_tree();
- prime_cache_tree_rec(active_cache_tree, trees[0]);
-
-}
-
static const char read_tree_usage[] = "git read-tree (<sha> | [[-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>] [-u | -i]] [--exclude-per-directory=<gitignore>] [--index-output=<file>] <sha1> [<sha2> [<sha3>]])";
static struct lock_file lock_file;
@@ -236,7 +201,7 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
* what came from the tree.
*/
if (nr_trees == 1 && !opts.prefix)
- prime_cache_tree();
+ prime_cache_tree(&active_cache_tree, trees[0]);
if (write_cache(newfd, active_cache, active_nr) ||
commit_locked_index(&lock_file))
diff --git a/cache-tree.c b/cache-tree.c
index 3d8f218..37bf35e 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -1,5 +1,6 @@
#include "cache.h"
#include "tree.h"
+#include "tree-walk.h"
#include "cache-tree.h"
#ifndef DEBUG
@@ -591,3 +592,36 @@ int write_cache_as_tree(unsigned char *sha1, int missing_ok, const char *prefix)
return 0;
}
+
+static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
+{
+ struct tree_desc desc;
+ struct name_entry entry;
+ int cnt;
+
+ hashcpy(it->sha1, tree->object.sha1);
+ init_tree_desc(&desc, tree->buffer, tree->size);
+ cnt = 0;
+ while (tree_entry(&desc, &entry)) {
+ if (!S_ISDIR(entry.mode))
+ cnt++;
+ else {
+ struct cache_tree_sub *sub;
+ struct tree *subtree = lookup_tree(entry.sha1);
+ if (!subtree->object.parsed)
+ parse_tree(subtree);
+ sub = cache_tree_sub(it, entry.path);
+ sub->cache_tree = cache_tree();
+ prime_cache_tree_rec(sub->cache_tree, subtree);
+ cnt += sub->cache_tree->entry_count;
+ }
+ }
+ it->entry_count = cnt;
+}
+
+void prime_cache_tree(struct cache_tree **it, struct tree *tree)
+{
+ cache_tree_free(it);
+ *it = cache_tree();
+ prime_cache_tree_rec(*it, tree);
+}
diff --git a/cache-tree.h b/cache-tree.h
index cf8b790..e958835 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -1,6 +1,8 @@
#ifndef CACHE_TREE_H
#define CACHE_TREE_H
+#include "tree.h"
+
struct cache_tree;
struct cache_tree_sub {
struct cache_tree *cache_tree;
@@ -33,4 +35,6 @@ int cache_tree_update(struct cache_tree *, struct cache_entry **, int, int, int)
#define WRITE_TREE_PREFIX_ERROR (-3)
int write_cache_as_tree(unsigned char *sha1, int missing_ok, const char *prefix);
+void prime_cache_tree(struct cache_tree **, struct tree *);
+
#endif
--
1.6.3.rc1.18.g66996
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/4] read-tree -m A B: prime cache-tree from the switched-to tree
2009-04-20 10:58 ` [PATCH 2/4] Move prime_cache_tree() to cache-tree.c Junio C Hamano
@ 2009-04-20 10:58 ` Junio C Hamano
2009-04-20 10:58 ` [PATCH 4/4] checkout branch: prime cache-tree fully Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2009-04-20 10:58 UTC (permalink / raw)
To: git
When switching to a new branch with "read-tree -m A B", the resulting
index must match tree B and we can prime the cache tree with it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
builtin-read-tree.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index 9cd7d07..391d709 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -199,9 +199,14 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
* "-m ent" or "--reset ent" form), we can obtain a fully
* valid cache-tree because the index must match exactly
* what came from the tree.
+ *
+ * The same holds true if we are switching between two trees
+ * using read-tree -m A B. The index must match B after that.
*/
if (nr_trees == 1 && !opts.prefix)
prime_cache_tree(&active_cache_tree, trees[0]);
+ else if (nr_trees == 2 && opts.merge)
+ prime_cache_tree(&active_cache_tree, trees[1]);
if (write_cache(newfd, active_cache, active_nr) ||
commit_locked_index(&lock_file))
--
1.6.3.rc1.18.g66996
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/4] checkout branch: prime cache-tree fully
2009-04-20 10:58 ` [PATCH 3/4] read-tree -m A B: prime cache-tree from the switched-to tree Junio C Hamano
@ 2009-04-20 10:58 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2009-04-20 10:58 UTC (permalink / raw)
To: git
When switching to another branch, the earlier code relied on incremental
invalidation of cache-tree entries to degrade it. While it is not wrong
per-se, we know that the resulting index must fully match the branch we
are switching to unless the -m (merge) option is used.
We should simply fully re-prime the cache-tree using the new tree object
in such a case. And for safety, invalidate the cache-tree as a whole in
other cases.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
builtin-checkout.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/builtin-checkout.c b/builtin-checkout.c
index ffdb33a..efa1ebf 100644
--- a/builtin-checkout.c
+++ b/builtin-checkout.c
@@ -368,14 +368,17 @@ static int merge_working_tree(struct checkout_opts *opts,
int ret;
struct lock_file *lock_file = xcalloc(1, sizeof(struct lock_file));
int newfd = hold_locked_index(lock_file, 1);
+ int reprime_cache_tree = 0;
if (read_cache() < 0)
return error("corrupt index file");
+ cache_tree_free(&active_cache_tree);
if (opts->force) {
ret = reset_tree(new->commit->tree, opts, 1);
if (ret)
return ret;
+ reprime_cache_tree = 1;
} else {
struct tree_desc trees[2];
struct tree *tree;
@@ -411,7 +414,9 @@ static int merge_working_tree(struct checkout_opts *opts,
init_tree_desc(&trees[1], tree->buffer, tree->size);
ret = unpack_trees(2, trees, &topts);
- if (ret == -1) {
+ if (ret != -1) {
+ reprime_cache_tree = 1;
+ } else {
/*
* Unpack couldn't do a trivial merge; either
* give up or do a real merge, depending on
@@ -455,6 +460,8 @@ static int merge_working_tree(struct checkout_opts *opts,
}
}
+ if (reprime_cache_tree)
+ prime_cache_tree(&active_cache_tree, new->commit->tree);
if (write_cache(newfd, active_cache, active_nr) ||
commit_locked_index(lock_file))
die("unable to write new index file");
--
1.6.3.rc1.18.g66996
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 0/4] Revisiting cache-tree
2009-04-20 10:58 [PATCH 0/4] Revisiting cache-tree Junio C Hamano
2009-04-20 10:58 ` [PATCH 1/4] read-tree A B: do not corrupt cache-tree Junio C Hamano
@ 2009-04-20 13:08 ` Alex Riesen
1 sibling, 0 replies; 6+ messages in thread
From: Alex Riesen @ 2009-04-20 13:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
2009/4/20 Junio C Hamano <gitster@pobox.com>:
> This series consists of a fix meant for maint branch, and two performance
> fix. The second one is a refactoring of the code to support the latter.
>
> I was doing a rather huge import as a multi-step process by doing
> something like this:
>
> $ rm -f $GIT_DIR/index
> $ git add some ;# this one is huge
> $ git tag one $(git write-tree)
> $ git repack -a -d
> $ rm -f $GIT_DIR/index
> $ git add other ;# this one is also huge
> $ git tag two $(git write-tree)
> $ git repack -a -d
>
> $ git read-tree some other
"git read-tree one two"
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2009-04-20 13:16 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-20 10:58 [PATCH 0/4] Revisiting cache-tree Junio C Hamano
2009-04-20 10:58 ` [PATCH 1/4] read-tree A B: do not corrupt cache-tree Junio C Hamano
2009-04-20 10:58 ` [PATCH 2/4] Move prime_cache_tree() to cache-tree.c Junio C Hamano
2009-04-20 10:58 ` [PATCH 3/4] read-tree -m A B: prime cache-tree from the switched-to tree Junio C Hamano
2009-04-20 10:58 ` [PATCH 4/4] checkout branch: prime cache-tree fully Junio C Hamano
2009-04-20 13:08 ` [PATCH 0/4] Revisiting cache-tree Alex Riesen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).