[PATCH] Fix fetch completeness assumptions

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] Fix fetch completeness assumptions
@ 2005-09-15  1:31 Daniel Barkalow
  2005-09-15  8:09 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-09-15  1:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Don't assume that any commit we have is complete; assume that any ref
we have is complete.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

---
Only lightly tested, but it seems to work. Marks all of the commits in 
refs, and their ancestors back as far as the date of the commit it's 
processing. If the commit it's processing is marked, it doesn't recurse 
into it.

This should make it very difficult to get it to produce a situation where 
it completes without giving you complete information, and it shouldn't 
process commits that are accessible from refs.

 fetch.c |   28 ++++++++++++++++++++++++++--
 1 files changed, 26 insertions(+), 2 deletions(-)

39216139abc4f2759f7b55d11fa2f7e7c155898c
diff --git a/fetch.c b/fetch.c
--- a/fetch.c
+++ b/fetch.c
@@ -62,11 +62,21 @@ static int process_tree(struct tree *tre
 	return 0;
 }
 
+struct commit_list *complete = NULL;
+
 static int process_commit(struct commit *commit)
 {
 	if (parse_commit(commit))
 		return -1;
 
+	while (complete && complete->item->date >= commit->date) {
+		pop_most_recent_commit(&complete, 1);
+	}
+		
+
+	if (commit->object.flags & 1)
+		return 0;
+
 	memcpy(current_commit_sha1, commit->object.sha1, 20);
 
 	if (get_tree) {
@@ -78,8 +88,6 @@ static int process_commit(struct commit 
 	if (get_history) {
 		struct commit_list *parents = commit->parents;
 		for (; parents; parents = parents->next) {
-			if (has_sha1_file(parents->item->object.sha1))
-				continue;
 			if (process(parents->item->object.sha1,
 				    commit_type))
 				return -1;
@@ -126,6 +134,7 @@ static int process_object(struct object 
 static int process(unsigned char *sha1, const char *type)
 {
 	struct object *obj = lookup_object_type(sha1, type);
+
 	if (has_sha1_file(sha1)) {
 		parse_object(sha1);
 		/* We already have it, so we should scan it now. */
@@ -179,6 +188,19 @@ static int interpret_target(char *target
 	return -1;
 }
 
+static int mark_complete(const char *path, const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	while (obj->type == tag_type) {
+		obj = ((struct tag *) obj)->tagged;
+		parse_object(obj->sha1);
+	}
+	if (obj->type == commit_type) {
+		obj->flags |= 1;
+		insert_by_date((struct commit *) obj, &complete);
+	}
+	return 0;
+}
 
 int pull(char *target)
 {
@@ -191,6 +213,8 @@ int pull(char *target)
 			return -1;
 	}
 
+	for_each_ref(mark_complete);
+
 	if (interpret_target(target, sha1))
 		return error("Could not interpret %s as something to pull",
 			     target);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-15  1:31 [PATCH] Fix fetch completeness assumptions Daniel Barkalow
@ 2005-09-15  8:09 ` Junio C Hamano
  2005-09-15 16:42   ` Daniel Barkalow
  2005-09-15 11:02 ` Sergey Vlasov
  2005-09-17  8:02 ` Junio C Hamano
  2 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-09-15  8:09 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git

Daniel Barkalow <barkalow@iabervon.org> writes:

> Don't assume that any commit we have is complete; assume that any ref
> we have is complete.
>
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
>
> ---
> Only lightly tested, but it seems to work. Marks all of the commits in 
> refs, and their ancestors back as far as the date of the commit it's 
> processing. If the commit it's processing is marked, it doesn't recurse 
> into it.

First I thought you were using the date for the cutting-off
decision and was going to object to the patch because of it, but
re-reading the code made me realize that you are using date only
to better the heuristics that limits how many commits reachable
from existing refs we would tangle to see if the newly seen
commit is reachable from our refs -- the logic looks correct.

The "complete" commit_list is a clever piece of code; I like it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-15  1:31 [PATCH] Fix fetch completeness assumptions Daniel Barkalow
  2005-09-15  8:09 ` Junio C Hamano
@ 2005-09-15 11:02 ` Sergey Vlasov
  2005-09-15 16:54   ` Daniel Barkalow
  2005-09-17  8:02 ` Junio C Hamano
  2 siblings, 1 reply; 11+ messages in thread
From: Sergey Vlasov @ 2005-09-15 11:02 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 3637 bytes --]

On Wed, 14 Sep 2005 21:31:42 -0400 (EDT) Daniel Barkalow wrote:

> Don't assume that any commit we have is complete; assume that any ref
> we have is complete.
> 
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> 
> ---
> Only lightly tested, but it seems to work. Marks all of the commits in 
> refs, and their ancestors back as far as the date of the commit it's 
> processing. If the commit it's processing is marked, it doesn't recurse 
> into it.

Seems to work here too (however, previous playing with git-fetch and
old git-http-fetch left .git/refs/heads/origin.remote pointing to an
incomplete commit, and this greatly confused the new git-http-fetch).

Junio, probably you can throw away my patches adding --recover (also I
forgot to add Signed-off-by for them...) - the automatic recovery
provided by this patch is better.  Still there is some room for
improvement...

> This should make it very difficult to get it to produce a situation where 
> it completes without giving you complete information, and it shouldn't 
> process commits that are accessible from refs.
> 
>  fetch.c |   28 ++++++++++++++++++++++++++--
>  1 files changed, 26 insertions(+), 2 deletions(-)
> 
> 39216139abc4f2759f7b55d11fa2f7e7c155898c
> diff --git a/fetch.c b/fetch.c
> --- a/fetch.c
> +++ b/fetch.c
> @@ -62,11 +62,21 @@ static int process_tree(struct tree *tre
>  	return 0;
>  }
>  
> +struct commit_list *complete = NULL;

static

> +
>  static int process_commit(struct commit *commit)
>  {
>  	if (parse_commit(commit))
>  		return -1;
>  
> +	while (complete && complete->item->date >= commit->date) {
> +		pop_most_recent_commit(&complete, 1);
> +	}
> +		
> +
> +	if (commit->object.flags & 1)
> +		return 0;

Hmm, I would define some constant for this flag at the top of file:

/* Object flags */
#define COMPLETE	(1U << 0)

Otherwise it is hard to see what object flags are used in the code
(only one flag is used currently, but we may need more in the future).

> +
>  	memcpy(current_commit_sha1, commit->object.sha1, 20);
>  
>  	if (get_tree) {
> @@ -78,8 +88,6 @@ static int process_commit(struct commit 
>  	if (get_history) {
>  		struct commit_list *parents = commit->parents;
>  		for (; parents; parents = parents->next) {
> -			if (has_sha1_file(parents->item->object.sha1))
> -				continue;
>  			if (process(parents->item->object.sha1,
>  				    commit_type))
>  				return -1;
> @@ -126,6 +134,7 @@ static int process_object(struct object 
>  static int process(unsigned char *sha1, const char *type)
>  {
>  	struct object *obj = lookup_object_type(sha1, type);
> +
>  	if (has_sha1_file(sha1)) {
>  		parse_object(sha1);
>  		/* We already have it, so we should scan it now. */
> @@ -179,6 +188,19 @@ static int interpret_target(char *target
>  	return -1;
>  }
>  
> +static int mark_complete(const char *path, const unsigned char *sha1)
> +{
> +	struct object *obj = parse_object(sha1);
> +	while (obj->type == tag_type) {
> +		obj = ((struct tag *) obj)->tagged;
> +		parse_object(obj->sha1);
> +	}
> +	if (obj->type == commit_type) {
> +		obj->flags |= 1;
> +		insert_by_date((struct commit *) obj, &complete);
> +	}

Dereferencing of tags is already implemented:

	struct commit *commit = lookup_commit_reference_gently(sha1, 1);
	if (commit) {
		commit->object.flags |= 1;
		insert_by_date(commit, &complete);
	}

> +	return 0;
> +}
>  
>  int pull(char *target)
>  {
> @@ -191,6 +213,8 @@ int pull(char *target)
>  			return -1;
>  	}
>  
> +	for_each_ref(mark_complete);
> +
>  	if (interpret_target(target, sha1))
>  		return error("Could not interpret %s as something to pull",
>  			     target);

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-15  8:09 ` Junio C Hamano
@ 2005-09-15 16:42   ` Daniel Barkalow
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-09-15 16:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, 15 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > Don't assume that any commit we have is complete; assume that any ref
> > we have is complete.
> >
> > Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> >
> > ---
> > Only lightly tested, but it seems to work. Marks all of the commits in 
> > refs, and their ancestors back as far as the date of the commit it's 
> > processing. If the commit it's processing is marked, it doesn't recurse 
> > into it.
> 
> First I thought you were using the date for the cutting-off
> decision and was going to object to the patch because of it, but
> re-reading the code made me realize that you are using date only
> to better the heuristics that limits how many commits reachable
> from existing refs we would tangle to see if the newly seen
> commit is reachable from our refs -- the logic looks correct.
> 
> The "complete" commit_list is a clever piece of code; I like it.

It's actually just the core loop of rev-list's original behavior. It's 
equivalent to putting the commit we're looking for in the list and popping 
the list until it comes back out, except that it saves a bit of 
list-fiddling. It's worth knowing that you can do rev-list (minus some of 
the features for human-readability) integrated into your program with just 
a couple of lines.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-15 11:02 ` Sergey Vlasov
@ 2005-09-15 16:54   ` Daniel Barkalow
  2005-09-16  6:04     ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Barkalow @ 2005-09-15 16:54 UTC (permalink / raw)
  To: Sergey Vlasov; +Cc: Junio C Hamano, git

On Thu, 15 Sep 2005, Sergey Vlasov wrote:

> On Wed, 14 Sep 2005 21:31:42 -0400 (EDT) Daniel Barkalow wrote:
> 
> > Don't assume that any commit we have is complete; assume that any ref
> > we have is complete.
> > 
> > Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> > 
> > ---
> > Only lightly tested, but it seems to work. Marks all of the commits in 
> > refs, and their ancestors back as far as the date of the commit it's 
> > processing. If the commit it's processing is marked, it doesn't recurse 
> > into it.
> 
> Seems to work here too (however, previous playing with git-fetch and
> old git-http-fetch left .git/refs/heads/origin.remote pointing to an
> incomplete commit, and this greatly confused the new git-http-fetch).
> 
> Junio, probably you can throw away my patches adding --recover (also I
> forgot to add Signed-off-by for them...) - the automatic recovery
> provided by this patch is better.  Still there is some room for
> improvement...
> 
> > This should make it very difficult to get it to produce a situation where 
> > it completes without giving you complete information, and it shouldn't 
> > process commits that are accessible from refs.
> > 
> >  fetch.c |   28 ++++++++++++++++++++++++++--
> >  1 files changed, 26 insertions(+), 2 deletions(-)
> > 
> > 39216139abc4f2759f7b55d11fa2f7e7c155898c
> > diff --git a/fetch.c b/fetch.c
> > --- a/fetch.c
> > +++ b/fetch.c
> > @@ -62,11 +62,21 @@ static int process_tree(struct tree *tre
> >  	return 0;
> >  }
> >  
> > +struct commit_list *complete = NULL;
> 
> static

Yup.

> > +	if (commit->object.flags & 1)
> > +		return 0;
> 
> Hmm, I would define some constant for this flag at the top of file:
> 
> /* Object flags */
> #define COMPLETE	(1U << 0)
> 
> Otherwise it is hard to see what object flags are used in the code
> (only one flag is used currently, but we may need more in the future).

Yup.

> > +static int mark_complete(const char *path, const unsigned char *sha1)
> > +{
> > +	struct object *obj = parse_object(sha1);
> > +	while (obj->type == tag_type) {
> > +		obj = ((struct tag *) obj)->tagged;
> > +		parse_object(obj->sha1);
> > +	}
> > +	if (obj->type == commit_type) {
> > +		obj->flags |= 1;
> > +		insert_by_date((struct commit *) obj, &complete);
> > +	}
> 
> Dereferencing of tags is already implemented:
> 
> 	struct commit *commit = lookup_commit_reference_gently(sha1, 1);
> 	if (commit) {
> 		commit->object.flags |= 1;
> 		insert_by_date(commit, &complete);
> 	}

Yup. I think I want to untangle some of the functions in commit.c, though, 
because it wasn't sufficiently clear to me how those functions worked for 
me to actually use the right one.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-15 16:54   ` Daniel Barkalow
@ 2005-09-16  6:04     ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2005-09-16  6:04 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Sergey Vlasov, git

Daniel Barkalow <barkalow@iabervon.org> writes:

> Yup. I think I want to untangle some of the functions in commit.c, though, 
> because it wasn't sufficiently clear to me how those functions worked for 
> me to actually use the right one.

Sorry, *_gently was my fault.  The version without _gently was
there first, which issued its own complaints when it was fed a
non commit -- I presume that was OK for the purpose of whoever
wrote it first, but was not very useful when the caller wanted
to do its own "oh, if it is not a commit then I'll do this other
thing then" decision.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-15  1:31 [PATCH] Fix fetch completeness assumptions Daniel Barkalow
  2005-09-15  8:09 ` Junio C Hamano
  2005-09-15 11:02 ` Sergey Vlasov
@ 2005-09-17  8:02 ` Junio C Hamano
  2005-09-17  8:32   ` Junio C Hamano
  2 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-09-17  8:02 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, torvalds

Daniel Barkalow <barkalow@iabervon.org> writes:

> Don't assume that any commit we have is complete; assume that any ref
> we have is complete.
>
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
>
> ---
> Only lightly tested, but it seems to work. Marks all of the commits in 
> refs, and their ancestors back as far as the date of the commit it's 
> processing. If the commit it's processing is marked, it doesn't recurse 
> into it.

Have you tested this idea with cloning into an empty repository?
It seems to break my tests.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix fetch completeness assumptions
  2005-09-17  8:02 ` Junio C Hamano
@ 2005-09-17  8:32   ` Junio C Hamano
  2005-09-17  8:50     ` Help cloning over http Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-09-17  8:32 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, torvalds

Junio C Hamano <junkio@cox.net> writes:

> Daniel Barkalow <barkalow@iabervon.org> writes:
>
>> Don't assume that any commit we have is complete; assume that any ref
>> we have is complete.
>
> Have you tested this idea with cloning into an empty repository?
> It seems to break my tests.

Sorry, I spoke too fast.  This is not _broken_ per se, but is
just slow.

I tried to clone http://kernel.org/pub/scm/git.git/; bulk of the
objects are packed there but the new code is _too_ cautious,
since it did not have any refs to begin with, it ended up
walking the commits in the already downloaded packs to the root.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Help cloning over http.
  2005-09-17  8:32   ` Junio C Hamano
@ 2005-09-17  8:50     ` Junio C Hamano
  2005-09-17 17:37       ` Daniel Barkalow
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-09-17  8:50 UTC (permalink / raw)
  To: git; +Cc: Daniel Barkalow, torvalds

Even when repository is packed, the new "we trust only the refs
we already have" code wasted too much time verifying what are
already in packs, walking down to the root commits.  Since we
slurp all the necessary packs upfront and run verify pack on
them, temporarily register their edge commits as "known to be
good" before invoking git-http-fetch.

I am not proud of this workaround -- this would probably break
in a pathological case where the remote repository has packs
that depend on ancient unpacked objects, but people should not
be making such packs anyway (git repack would not create such
packs).

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 git-clone.sh |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

34414e4170ab8d1d2a9ef582256530dc3b42d3e2
diff --git a/git-clone.sh b/git-clone.sh
--- a/git-clone.sh
+++ b/git-clone.sh
@@ -50,11 +50,30 @@ Perhaps git-update-server-info needs to 
 		git-verify-pack ".git/objects/pack/$idx" || exit 1
 	done <"$clone_tmp/packs"

+	while read type idx sha1 commit
+	do
+		case "$type,$commit" in
+		T,commit) ;;
+		*) continue ;;
+		esac &&
+		echo "$sha1" >.git/refs/tags/clone-tmp-$sha1
+	done <"$clone_tmp/packs"
+
 	while read sha1 refname
 	do
 		name=`expr "$refname" : 'refs/\(.*\)'` &&
 		git-http-fetch -v -a -w "$name" "$name" "$1/" || exit 1
 	done <"$clone_tmp/refs"
+
+	while read type idx sha1 commit
+	do
+		case "$type,$commit" in
+		T,commit) ;;
+		*) continue ;;
+		esac &&
+		rm -f .git/refs/tags/clone-tmp-$sha1
+	done <"$clone_tmp/packs"
+
 	rm -fr "$clone_tmp"
 }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Help cloning over http.
  2005-09-17  8:50     ` Help cloning over http Junio C Hamano
@ 2005-09-17 17:37       ` Daniel Barkalow
  2005-09-18 17:38         ` [PATCH] Improve the safety check used in fetch.c Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Barkalow @ 2005-09-17 17:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, torvalds

On Sat, 17 Sep 2005, Junio C Hamano wrote:

> Even when repository is packed, the new "we trust only the refs
> we already have" code wasted too much time verifying what are
> already in packs, walking down to the root commits.  Since we
> slurp all the necessary packs upfront and run verify pack on
> them, temporarily register their edge commits as "known to be
> good" before invoking git-http-fetch.

This is a somewhat special situation, because we're assuming that the 
packs are a complete set. When git-http-fetch downloads a pack, it still 
needs to know if that pack uses other stuff, in which case it probably 
needs to download it. The old code actually wouldn't have worked properly 
if the first pack didn't include any blobs that were still in the latest 
version, because it wouldn't track commits through the pack and only 
generally worked with multiple packs because it would need blobs from 
every pack to produce the current tree. 

I think that this gets particularly tricky with alternates, where the 
necessary packs for ancient history may not actually be in the list of 
packs in the repository we're looking at.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] Improve the safety check used in fetch.c
  2005-09-17 17:37       ` Daniel Barkalow
@ 2005-09-18 17:38         ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2005-09-18 17:38 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, torvalds

The recent safety check to trust only the commits we have made
things impossibly slow and wastes a lot of memory.

This commit fixes it with the following improvements:

 - mark already scanned objects and avoid rescanning the same
   object again;

 - free the tree entries when we have scanned the tree entries;
   this is the same as b0d8923ec01fd91b75ab079034f89ced91500157
   which reduced memory usage by rev-list;

 - plug memory leak from the object_list dequeuing code;

 - use the process_queue not just for fetching but for scanning,
   to make things tail recursive to avoid deep recursion; the
   deep recursion was especially prominent when we cloned a big
   pack.

 - avoid has_sha1_file() call when we already know we do not have
   that object.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 fetch.c |   76 ++++++++++++++++++++++++++++++++++++---------------------------
 1 files changed, 44 insertions(+), 32 deletions(-)

85d106c267ec26f398e0aaf352d8011f661c459a
diff --git a/fetch.c b/fetch.c
--- a/fetch.c
+++ b/fetch.c
@@ -33,36 +33,33 @@ static void report_missing(const char *w
 		what, missing_hex, sha1_to_hex(current_commit_sha1));
 }
 
-static int make_sure_we_have_it(const char *what, unsigned char *sha1)
-{
-	int status = 0;
-
-	if (!has_sha1_file(sha1)) {
-		status = fetch(sha1);
-		if (status && what)
-			report_missing(what, sha1);
-	}
-	return status;
-}
-
 static int process(unsigned char *sha1, const char *type);
 
 static int process_tree(struct tree *tree)
 {
-	struct tree_entry_list *entries;
+	struct tree_entry_list *entry;
 
 	if (parse_tree(tree))
 		return -1;
 
-	for (entries = tree->entries; entries; entries = entries->next) {
-		if (process(entries->item.any->sha1,
-			    entries->directory ? tree_type : blob_type))
+	entry = tree->entries;
+	tree->entries = NULL;
+	while (entry) {
+		struct tree_entry_list *next = entry->next;
+		if (process(entry->item.any->sha1,
+			    entry->directory ? tree_type : blob_type))
 			return -1;
+		free(entry);
+		entry = next;
 	}
 	return 0;
 }
 
 #define COMPLETE	1U
+#define TO_FETCH	2U
+#define TO_SCAN		4U
+#define SCANNED		8U
+
 static struct commit_list *complete = NULL;
 
 static int process_commit(struct commit *commit)
@@ -73,13 +70,14 @@ static int process_commit(struct commit 
 	while (complete && complete->item->date >= commit->date) {
 		pop_most_recent_commit(&complete, COMPLETE);
 	}
-		
 
 	if (commit->object.flags & COMPLETE)
 		return 0;
 
 	memcpy(current_commit_sha1, commit->object.sha1, 20);
 
+	pull_say("walk %s\n", sha1_to_hex(commit->object.sha1));
+
 	if (get_tree) {
 		if (process(commit->tree->object.sha1, tree_type))
 			return -1;
@@ -89,8 +87,7 @@ static int process_commit(struct commit 
 	if (get_history) {
 		struct commit_list *parents = commit->parents;
 		for (; parents; parents = parents->next) {
-			if (process(parents->item->object.sha1,
-				    commit_type))
+			if (process(parents->item->object.sha1, commit_type))
 				return -1;
 		}
 	}
@@ -109,6 +106,10 @@ static struct object_list **process_queu
 
 static int process_object(struct object *obj)
 {
+	if (obj->flags & SCANNED)
+		return 0;
+	obj->flags |= SCANNED;
+
 	if (obj->type == commit_type) {
 		if (process_commit((struct commit *)obj))
 			return -1;
@@ -139,14 +140,19 @@ static int process(unsigned char *sha1, 
 	if (has_sha1_file(sha1)) {
 		parse_object(sha1);
 		/* We already have it, so we should scan it now. */
-		return process_object(obj);
+		if (obj->flags & (SCANNED | TO_SCAN))
+			return 0;
+		object_list_insert(obj, process_queue_end);
+		process_queue_end = &(*process_queue_end)->next;
+		obj->flags |= TO_SCAN;
+		return 0;
 	}
-	if (object_list_contains(process_queue, obj))
+	if (obj->flags & (COMPLETE | TO_FETCH))
 		return 0;
 	object_list_insert(obj, process_queue_end);
 	process_queue_end = &(*process_queue_end)->next;
+	obj->flags |= TO_FETCH;
 
-	//fprintf(stderr, "prefetch %s\n", sha1_to_hex(sha1));
 	prefetch(sha1);
 		
 	return 0;
@@ -154,21 +160,27 @@ static int process(unsigned char *sha1, 
 
 static int loop(void)
 {
+	struct object_list *elem;
+
 	while (process_queue) {
 		struct object *obj = process_queue->item;
-		/*
-		fprintf(stderr, "%d objects to pull\n", 
-			object_list_length(process_queue));
-		*/
-		process_queue = process_queue->next;
+		elem = process_queue;
+		process_queue = elem->next;
+		free(elem);
 		if (!process_queue)
 			process_queue_end = &process_queue;
 
-		//fprintf(stderr, "fetch %s\n", sha1_to_hex(obj->sha1));
-		
-		if (make_sure_we_have_it(obj->type ? obj->type : "object", 
-					 obj->sha1))
-			return -1;
+		/* If we are not scanning this object, we placed it in
+		 * the queue because we needed to fetch it first.
+		 */
+		if (! (obj->flags & TO_SCAN)) {
+			if (fetch(obj->sha1)) {
+				report_missing(obj->type
+					       ? obj->type
+					       : "object", obj->sha1);
+				return -1;
+			}
+		}
 		if (!obj->type)
 			parse_object(obj->sha1);
 		if (process_object(obj))

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-09-18 17:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-15  1:31 [PATCH] Fix fetch completeness assumptions Daniel Barkalow
2005-09-15  8:09 ` Junio C Hamano
2005-09-15 16:42   ` Daniel Barkalow
2005-09-15 11:02 ` Sergey Vlasov
2005-09-15 16:54   ` Daniel Barkalow
2005-09-16  6:04     ` Junio C Hamano
2005-09-17  8:02 ` Junio C Hamano
2005-09-17  8:32   ` Junio C Hamano
2005-09-17  8:50     ` Help cloning over http Junio C Hamano
2005-09-17 17:37       ` Daniel Barkalow
2005-09-18 17:38         ` [PATCH] Improve the safety check used in fetch.c Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).