* Hmm.. Try harder to find the common commits in git protocol?
@ 2008-04-28 16:41 Linus Torvalds
2008-04-28 20:15 ` Linus Torvalds
0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2008-04-28 16:41 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List
So we've had a huge number of merges in this last kernel merge window, and
as a result it seems that a number of the pulls I do get way too many
objects, because the git protocol doesn't try hard enough to find the
minimal set of common commits.
Example from today:
remote: Counting objects: 50084, done.
remote: Compressing objects: 100% (6489/6489), done.
remote: Total 43343 (delta 37381), reused 42777 (delta 36836)
Receiving objects: 100% (43343/43343), 8.85 MiB | 144 KiB/s, done.
Resolving deltas: 100% (37381/37381), completed with 5097 local objects.
Merge made by recursive.
kernel/hrtimer.c | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)
git rev-list --objects ORIG_HEAD.. | wc -l
7
ie it downloaded 43343 objects totalling almost 9MB of data, even though
it actually only needed seven objects. In fact, I guess three of the seven
new objects were due to the merge (commit, top-level tree, and 'kernel'
tree), so only four objects actually came from the source repository.
I haven't thought about the why's so much (because I'm busy merging), but
I thought I'd mention this in case somebody else has already looked at it.
Linus
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Hmm.. Try harder to find the common commits in git protocol?
2008-04-28 16:41 Hmm.. Try harder to find the common commits in git protocol? Linus Torvalds
@ 2008-04-28 20:15 ` Linus Torvalds
2008-04-28 21:27 ` Daniel Barkalow
2008-04-28 23:20 ` Junio C Hamano
0 siblings, 2 replies; 4+ messages in thread
From: Linus Torvalds @ 2008-04-28 20:15 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List; +Cc: Martin Koegler
Hmm. Trying to debug this, I find the behavior hard to reproduce. But I
*also* find that there seems to be something wrong in builtin-fetch-pack.
Look at commit f3ec549481827b10609a43bf504517a0e8063a12 ("fetch-pack:
check parse_commit/object results"), and tell me that the "parents"
handling isn't totally broken. In particular, get_rev() does:
struct commit_list *parents = NULL;
..
commit = rev_list->item;
if (!(commit->object.parsed))
if (!parse_commit(commit))
parents = commit->parents;
..
which means that "parents" will never even get set if the commit was
already parsed!
And whether it got parsed or not depends on how we got there etc, so this
may explain the occasionally odd behaviour I saw.
Basically, I don't think that code can be right, and with a cut-down repo,
I get "no commits in common" due to this, because the whole "get_rev()"
thing doesn't work.
This patch should fix that problem, but I wonder why it got rewritten that
way in the first place?
Linus
---
builtin-fetch-pack.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
index 65350ca..c97a427 100644
--- a/builtin-fetch-pack.c
+++ b/builtin-fetch-pack.c
@@ -117,15 +117,15 @@ static const unsigned char* get_rev(void)
while (commit == NULL) {
unsigned int mark;
- struct commit_list *parents = NULL;
+ struct commit_list *parents;
if (rev_list == NULL || non_common_revs == 0)
return NULL;
commit = rev_list->item;
- if (!(commit->object.parsed))
- if (!parse_commit(commit))
- parents = commit->parents;
+ if (!commit->object.parsed)
+ parse_commit(commit);
+ parents = commit->parents;
commit->object.flags |= POPPED;
if (!(commit->object.flags & COMMON))
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Hmm.. Try harder to find the common commits in git protocol?
2008-04-28 20:15 ` Linus Torvalds
@ 2008-04-28 21:27 ` Daniel Barkalow
2008-04-28 23:20 ` Junio C Hamano
1 sibling, 0 replies; 4+ messages in thread
From: Daniel Barkalow @ 2008-04-28 21:27 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Martin Koegler
On Mon, 28 Apr 2008, Linus Torvalds wrote:
> Hmm. Trying to debug this, I find the behavior hard to reproduce. But I
> *also* find that there seems to be something wrong in builtin-fetch-pack.
>
> Look at commit f3ec549481827b10609a43bf504517a0e8063a12 ("fetch-pack:
> check parse_commit/object results"), and tell me that the "parents"
> handling isn't totally broken. In particular, get_rev() does:
>
> struct commit_list *parents = NULL;
> ..
> commit = rev_list->item;
> if (!(commit->object.parsed))
> if (!parse_commit(commit))
> parents = commit->parents;
> ..
>
> which means that "parents" will never even get set if the commit was
> already parsed!
>
> And whether it got parsed or not depends on how we got there etc, so this
> may explain the occasionally odd behaviour I saw.
>
> Basically, I don't think that code can be right, and with a cut-down repo,
> I get "no commits in common" due to this, because the whole "get_rev()"
> thing doesn't work.
>
> This patch should fix that problem, but I wonder why it got rewritten that
> way in the first place?
Looks like f3ec549481827b10609a43bf504517a0e8063a12 was trying to do
something clever when parse_commit returns non-zero, and accidentally did
the same thing with the case where it isn't needed.
>
> Linus
>
> ---
> builtin-fetch-pack.c | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
> index 65350ca..c97a427 100644
> --- a/builtin-fetch-pack.c
> +++ b/builtin-fetch-pack.c
> @@ -117,15 +117,15 @@ static const unsigned char* get_rev(void)
>
> while (commit == NULL) {
> unsigned int mark;
> - struct commit_list *parents = NULL;
> + struct commit_list *parents;
>
> if (rev_list == NULL || non_common_revs == 0)
> return NULL;
>
> commit = rev_list->item;
> - if (!(commit->object.parsed))
> - if (!parse_commit(commit))
> - parents = commit->parents;
> + if (!commit->object.parsed)
> + parse_commit(commit);
> + parents = commit->parents;
I think it's supposed to be:
if (commit->object.parsed ||
!parse_commit(commit)))
parents = commit->parents;
Just looking at what the other changes in the same commit do.
-Daniel
*This .sig left intentionally blank*
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Hmm.. Try harder to find the common commits in git protocol?
2008-04-28 20:15 ` Linus Torvalds
2008-04-28 21:27 ` Daniel Barkalow
@ 2008-04-28 23:20 ` Junio C Hamano
1 sibling, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2008-04-28 23:20 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List, Martin Koegler
Linus Torvalds <torvalds@linux-foundation.org> writes:
> Hmm. Trying to debug this, I find the behavior hard to reproduce. But I
> *also* find that there seems to be something wrong in builtin-fetch-pack.
>
> Look at commit f3ec549481827b10609a43bf504517a0e8063a12 ("fetch-pack:
> check parse_commit/object results"), and tell me that the "parents"
> handling isn't totally broken. In particular, get_rev() does:
>
> struct commit_list *parents = NULL;
> ..
> commit = rev_list->item;
> if (!(commit->object.parsed))
> if (!parse_commit(commit))
> parents = commit->parents;
> ..
>
> which means that "parents" will never even get set if the commit was
> already parsed!
>
> And whether it got parsed or not depends on how we got there etc, so this
> may explain the occasionally odd behaviour I saw.
>
> Basically, I don't think that code can be right,...
> This patch should fix that problem, but I wonder why it got rewritten that
> way in the first place?
Sorry, my fault while reviewing the series. The intent of the update was
too focused on not touching possibly bad data (i.e. when parse_commit()
does not return zero to indicate success) and failed to make sure that it
acts identically on good data --- very bad.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-04-28 23:21 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-28 16:41 Hmm.. Try harder to find the common commits in git protocol? Linus Torvalds
2008-04-28 20:15 ` Linus Torvalds
2008-04-28 21:27 ` Daniel Barkalow
2008-04-28 23:20 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).