* Slow fetches of tags
@ 2006-05-24 13:10 Ralf Baechle
2006-05-24 16:45 ` Linus Torvalds
0 siblings, 1 reply; 23+ messages in thread
From: Ralf Baechle @ 2006-05-24 13:10 UTC (permalink / raw)
To: git
I have a fairly large git tree (with a 320MB pack file containing some
700,000 objects). A small fetch like
git fetch git://www.kernel.org/pub/scm/linux/kernel/git/stable/\
linux-2.6.16.y.git master:v2.6.16-stable
which only fetches a handful of objects (v2.6.16.17 -> v2.6.16.18) will
take on the order of 4-5 minutes. Adding the "-n" option will bring
the operation down to under a second, so it really is just the tags
that are slowing things down so much..
Ralf
^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-24 13:10 Slow fetches of tags Ralf Baechle
@ 2006-05-24 16:45 ` Linus Torvalds
2006-05-24 17:21 ` Linus Torvalds
2006-05-24 18:08 ` Ralf Baechle
0 siblings, 2 replies; 23+ messages in thread
From: Linus Torvalds @ 2006-05-24 16:45 UTC (permalink / raw)
To: Ralf Baechle; +Cc: git

On Wed, 24 May 2006, Ralf Baechle wrote:
>
> I have a fairly large git tree (with a 320MB pack file containing some
> 700,000 objects). A small fetch like
>
> git fetch git://www.kernel.org/pub/scm/linux/kernel/git/stable/\
> linux-2.6.16.y.git master:v2.6.16-stable
>
> which only fetches a handful of objects (v2.6.16.17 -> v2.6.16.18) will
> take on the order of 4-5 minutes. Adding the "-n" option will bring
> the operation down to under a second, so it really is just the tags
> that are slowing things down so much..

So this is a tree where you already _have_ most of the tags, no?

Can you add a printout to show what the "taglist" is for you in
git-fetch.sh (just before the thing that does that

	fetch_main "$taglist"

thing?). It _should_ have pruned out all the tags you already have.

Or is it just the "git-ls-remote" that takes forever? (Or, if you run
"top", is there something that is an obviously heavy operation on the
client side?)

Linus

^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-24 16:45 ` Linus Torvalds
@ 2006-05-24 17:21 ` Linus Torvalds
2006-05-24 18:08 ` Junio C Hamano
2006-05-24 19:06 ` Slow fetches of tags Junio C Hamano
2006-05-24 18:08 ` Ralf Baechle
1 sibling, 2 replies; 23+ messages in thread
From: Linus Torvalds @ 2006-05-24 17:21 UTC (permalink / raw)
To: Ralf Baechle, Junio C Hamano; +Cc: Git Mailing List

On Wed, 24 May 2006, Linus Torvalds wrote:
>
> Can you add a printout to show what the "taglist" is for you in
> git-fetch.sh (just before the thing that does that
>
> fetch_main "$taglist"
>
> thing?). It _should_ have pruned out all the tags you already have.

Actually, looking at that tag-fetching logic, we already know that we
have the objects that the tags point to (because those are the only kinds
that we should auto-follow).

I wonder if the slowness is because of all the have/want commit following,
which walks the whole tree to say "I have this", when in this case we
really should directly say "I have these" for the objects that the tags
point to.

So the problem may be that we basically send a totally unnecessary list of
all the objects we have, when the other end really only cares about the
fact that we have the objects that the tags point to. Which we know we do,
but we didn't say so, because "git-fetch" didn't really mark them that
way.

And instead of sending the commits that we know we have, and that we know
are the interesting ones and that will cut off the tag-object-walk, we
start from all the local tips, and use the regular "parse commits in date
order" thing and send "have" lines for everything we see that isn't
common. Walking a lot of unnecessary crud.

Junio? Any ideas? I didn't want to do that tag-auto-following, and while I
admit it's damn convenient, it's really quite broken, methinks.

I almost suspect that we need to have a syntax where-by the local
fetch-list ends up doing

	"$tagname:$tagname:$sha1wehave"

as the argument to fetch-pack, and then fetch-pack would be modified to
send those "$sha1wehave" objects early as "have" objects.

Ie start from something like

diff --git a/git-fetch.sh b/git-fetch.sh
index 280f62e..dce3812 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -400,7 +400,7 @@ case "$no_tags$tags" in
 			}
 			git-cat-file -t "$sha1" >/dev/null 2>&1 || continue
 			echo >&2 "Auto-following $name"
-			echo ".${name}:${name}"
+			echo ".${name}:${name}:${sha1}"
 		done)
 esac
 case "$taglist" in

and then pass the info all the way up (the above patch will obviously
result in a totally broken script, everything downstream from that point
would have to be taught about the "already have this" part too).

Ralf, which repo is this, so that others (me, if I get the time and
energy, Junio or some other hapless sucker^W^Whero if I'm lucky) can try
things out?

Linus

^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-24 17:21 ` Linus Torvalds
@ 2006-05-24 18:08 ` Junio C Hamano
2006-05-24 19:17 ` Linus Torvalds
2006-05-24 19:06 ` Slow fetches of tags Junio C Hamano
1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2006-05-24 18:08 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Ralf Baechle, Git Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> So the problem may be that we basically send a totally unnecessary list of
> all the objects we have, when the other end really only cares about the
> fact that we have the objects that the tags point to. Which we know we do,
> but we didn't say so, because "git-fetch" didn't really mark them that
> way.

I think this speculation is correct. We should be able to do
better.

> I almost suspect that we need to have a syntax where-by the local
> fetch-list ends up doing
>
> "$tagname:$tagname:$sha1wehave"
>
> as the argument to fetch-pack, and then fetch-pack would be modified to
> send those "$sha1wehave" objects early as "have" objects.

But this logic has to be a bit more involved.

A "have" object is not just has_sha1_file(), but it needs to be
reachable from one of our tips we have already verified as
complete, so either the caller of fetch-pack does the
verification and gives a verified $sha1wehave, or fetch-pack takes
$sha1weseemtohave and does its own verification and then sends it
as one of the "have" objects (the issue is the same as the one
in my previous message to Eric W. Biederman -- we trust only
refs, not just having a single object).

It might be useful to have a helper script you can give N object
names and M refs (and/or --all flag to mean "all of the refs"),
which returns the ones that are reachable from the given refs.

It would be even more useful if it were a helper function, but
given that the computation would involve walking the ancestry
chain, I suspect it would have a bad interaction with any user
of such a helper function that wants to do its own ancestry
walking, because many of them seem to assume that any object that
has already been parsed is one they parsed for their own
purpose.

^ permalink raw reply [flat|nested] 23+ messages in thread
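
The helper Junio describes (give it N object names and M refs, get back the names reachable from those refs) can be sketched on a toy history. This is only an illustration under stated assumptions: `struct toy_commit`, `mark_reachable()` and `filter_reachable()` are hypothetical names, commits are plain array indices rather than SHA-1s, and nothing here is git's own code or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMMITS 16
#define MAX_PARENTS 2

/* Hypothetical toy commit: parents are indices into the same array, -1 = none. */
struct toy_commit {
	int parent[MAX_PARENTS];
};

/* Mark every commit reachable from the given tips (iterative depth-first walk). */
static void mark_reachable(const struct toy_commit *commits, size_t nr_commits,
			   const int *tips, size_t nr_tips, bool *seen)
{
	int stack[MAX_COMMITS];	/* each commit is pushed at most once */
	size_t top = 0;

	assert(nr_commits <= MAX_COMMITS);
	for (size_t i = 0; i < nr_commits; i++)
		seen[i] = false;
	for (size_t i = 0; i < nr_tips; i++) {
		int t = tips[i];
		if (t >= 0 && (size_t)t < nr_commits && !seen[t]) {
			seen[t] = true;
			stack[top++] = t;
		}
	}
	while (top) {
		int cur = stack[--top];
		for (int k = 0; k < MAX_PARENTS; k++) {
			int p = commits[cur].parent[k];
			if (p >= 0 && (size_t)p < nr_commits && !seen[p]) {
				seen[p] = true;
				stack[top++] = p;
			}
		}
	}
}

/*
 * Copy into out[] the candidates that are reachable from the tips and
 * return how many there are. Objects that merely exist but hang off no
 * ref are dropped, which is the "we trust only refs" rule.
 */
static size_t filter_reachable(const struct toy_commit *commits, size_t nr_commits,
			       const int *tips, size_t nr_tips,
			       const int *candidates, size_t nr_candidates,
			       int *out)
{
	bool seen[MAX_COMMITS];
	size_t nr_out = 0;

	mark_reachable(commits, nr_commits, tips, nr_tips, seen);
	for (size_t i = 0; i < nr_candidates; i++) {
		int c = candidates[i];
		if (c >= 0 && (size_t)c < nr_commits && seen[c])
			out[nr_out++] = c;
	}
	return nr_out;
}
```

Because the sketch keeps its own `seen[]` array instead of setting flags on shared objects, it sidesteps the interaction Junio worries about, at the cost of one array per call.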

* Re: Slow fetches of tags
2006-05-24 18:08 ` Junio C Hamano
@ 2006-05-24 19:17 ` Linus Torvalds
2006-05-24 23:43 ` Linus Torvalds
0 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2006-05-24 19:17 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ralf Baechle, Git Mailing List

On Wed, 24 May 2006, Junio C Hamano wrote:
>
> A "have" object is not just has_sha1_file(), but it needs to be
> reachable from one of our tips we have already verified as
> complete

You're right.

And the strange part is that the commit we should give for the tag thing
_should_ actually be pretty recent, and I wonder why we end up walking the
whole damn tree history and saying "want" to basically them all.

IOW, I think there's something more fundamentally wrong with the tag
following. We _should_ have figured out much more quickly that we have it
all.

I'm starting to suspect that it's actually a tag-specific problem: we do
that reachability crud all by commit history, so the tags are a total
special case, and if we don't send the proper HAVE/WANT for those or mark
them properly with THEY_HAVE/COMMON etc, maybe the algorithm just gets
confused.

I need to go pick up my youngest, so I'll be off-line on this for a while.
Will try to think it through.

Linus

^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-24 19:17 ` Linus Torvalds
@ 2006-05-24 23:43 ` Linus Torvalds
2006-05-25 1:32 ` Junio C Hamano
2006-05-25 13:12 ` Slow fetches of tags Ralf Baechle
0 siblings, 2 replies; 23+ messages in thread
From: Linus Torvalds @ 2006-05-24 23:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ralf Baechle, Git Mailing List

On Wed, 24 May 2006, Linus Torvalds wrote:
>
> IOW, I think there's something more fundamentally wrong with the tag
> following. We _should_ have figured out much more quickly that we have it
> all.

Actually, maybe the problem is that Ralf's tree has two roots, because of
the old CVS history. It might be following the other root down for the
"have" part, since that one doesn't exist at all in the target and the
other side will never acknowledge any of it.

I'll play with it.

Linus

^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-24 23:43 ` Linus Torvalds
@ 2006-05-25 1:32 ` Junio C Hamano
2006-05-25 4:48 ` Junio C Hamano
2006-05-25 13:12 ` Slow fetches of tags Ralf Baechle
1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2006-05-25 1:32 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Wed, 24 May 2006, Linus Torvalds wrote:
>>
>> IOW, I think there's something more fundamentally wrong with the tag
>> following. We _should_ have figured out much more quickly that we have it
>> all.
>
> Actually, maybe the problem is that Ralf's tree has two roots, because of
> the old CVS history. It might be following the other root down for the
> "have" part, since that one doesn't exist at all in the target and the
> other side will never acknowledge any of it.
>
> I'll play with it.

I think I know what is going on. You are exactly right -- the
two-root ness is what is causing this.

We used to stop sending "have" immediately after we get an ACK.
This was troublesome for trees with many long branches, so we
introduced multi_ack protocol extension to let the server side
(upload-pack) say "Ok, enough on this branch -- I know this
object so do not tell me any more about objects reachable from
it, but do tell me about other development tracks if you have
one".

If you run "fetch-pack -v" after priming a repository with
Ralf's tree and Chris's tree, you will see many "have" with
occasional "got ack 2 [0-9a-f]{40}". The latter is upload-pack
acking this way.

This was done to prevent already-known-to-be-common objects
filling up the list of known common commits on the server side.
The remaining slots can be used to discover common commits on
other branches, so that we can minimize the transfer. It was an
important optimization when dealing with sets of branches that
are long.

This unfortunately breaks down quite badly in this case, since
the remaining "branch" is the other history, which Chris's tree
has never heard of, and the client keeps following it in vain
all the way down to its root.

It might be worth changing fetch-pack to note that it has sent
many "have"s after it got a "continue" ACK, and give up early,
say using a heuristic between the age of the commit that did get
an ACK and the one we are about to send out as a "have".

^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-25 1:32 ` Junio C Hamano
@ 2006-05-25 4:48 ` Junio C Hamano
2006-05-26 15:42 ` Ralf Baechle
0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2006-05-25 4:48 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> It might be worth changing fetch-pack to note that it has sent
> many "have"s after it got a "continue" ACK, and give up early,
> say using a heuristic between the age of the commit that did get
> an ACK and the one we are about to send out as a "have".

I think the right fix for this is to change upload-pack to
traverse reachability chain from the "want" heads as it gets
"have" from the downloader, and stop responding "continue" when
all "want" heads can reach some "have" commits. This would not
prevent it from going down all the way to the root commit if
what is wanted does not have anything to do with what the other
end has (e.g. if you have only my main project branches, and you
ask for html head for the first time), but it would have
prevented Ralf's tree from getting "continue" after he asked
only for v2.6.16.18 tag and said he has 2.6.16.18 commit and its
ancestors. It should not be too difficult to do this, but here
is an alternative, client-side workaround.

-- >8 --
[PATCH] fetch-pack: give up after getting too many "ack continue"

If your repository has more roots than the remote repository
you ask an object for, the remote upload-pack keeps responding
"ack continue" until it fills up its received-have buffer (currently
256 entries). Usually this is not a problem because the requester
stops traversing the ancestry chain from the commit it gets
"ack continue" for, but this mechanism does not work as a roadblock
when it traverses down the path to the root the other side does not
have.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/fetch-pack.c b/fetch-pack.c
index 8daa93d..8371348 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -18,6 +18,12 @@
 #define COMMON_REF	(1U << 2)
 #define SEEN		(1U << 3)
 #define POPPED		(1U << 4)
 
+/*
+ * After sending this many "have"s if we do not get any new ACK , we
+ * give up traversing our history.
+ */
+#define MAX_IN_VAIN 256
+
 static struct commit_list *rev_list = NULL;
 static int non_common_revs = 0, multi_ack = 0, use_thin_pack = 0;
 
@@ -134,6 +140,8 @@ static int find_common(int fd[2], unsign
 	int fetching;
 	int count = 0, flushes = 0, retval;
 	const unsigned char *sha1;
+	unsigned in_vain = 0;
+	int got_continue = 0;
 
 	for_each_ref(rev_list_insert_ref);
 
@@ -172,6 +180,7 @@ static int find_common(int fd[2], unsign
 		packet_write(fd[1], "have %s\n", sha1_to_hex(sha1));
 		if (verbose)
 			fprintf(stderr, "have %s\n", sha1_to_hex(sha1));
+		in_vain++;
 		if (!(31 & ++count)) {
 			int ack;
 
@@ -200,9 +209,16 @@ static int find_common(int fd[2], unsign
 						lookup_commit(result_sha1);
 					mark_common(commit, 0, 1);
 					retval = 0;
+					in_vain = 0;
+					got_continue = 1;
 				}
 			} while (ack);
 			flushes--;
+			if (got_continue && MAX_IN_VAIN < in_vain) {
+				if (verbose)
+					fprintf(stderr, "giving up\n");
+				break; /* give up */
+			}
 		}
 	}
 done:

^ permalink raw reply related [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-25 4:48 ` Junio C Hamano
@ 2006-05-26 15:42 ` Ralf Baechle
2006-05-27 2:20 ` [PATCH/RFC] upload-pack: stop "ack continue" when we know common commits for wanted refs Junio C Hamano
0 siblings, 1 reply; 23+ messages in thread
From: Ralf Baechle @ 2006-05-26 15:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, git

On Wed, May 24, 2006 at 09:48:34PM -0700, Junio C Hamano wrote:

> I think the right fix for this is to change upload-pack to
> traverse reachability chain from the "want" heads as it gets
> "have" from the downloader, and stop responding "continue" when
> all "want" heads can reach some "have" commits. This would not
> prevent it from going down all the way to the root commit if
> what is wanted does not have anything to do with what the other
> end has (e.g. if you have only my main project branches, and you
> ask for html head for the first time), but it would have
> prevented Ralf's tree from getting "continue" after he asked
> only for v2.6.16.18 tag and said he has 2.6.16.18 commit and its
> ancestors. It should not be too difficult to do this, but here
> is an alternative, client-side workaround.
>
> -- >8 --
> [PATCH] fetch-pack: give up after getting too many "ack continue"

So I did test your patch. In the big, slow repository it cuts down the
time for a

git fetch git://www.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git master:v2.6.16-stable

from like 6min to about 7s.

Thanks!

Ralf

^ permalink raw reply [flat|nested] 23+ messages in thread

* [PATCH/RFC] upload-pack: stop "ack continue" when we know common commits for wanted refs
2006-05-26 15:42 ` Ralf Baechle
@ 2006-05-27 2:20 ` Junio C Hamano
0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2006-05-27 2:20 UTC (permalink / raw)
To: Ralf Baechle; +Cc: git, Linus Torvalds

When the downloader's repository has more roots than the server
side has, the "have" exchange to figure out recent common
commits ends up traversing the whole history of branches that
only exist on the downloader's side. When the downloader is
asking for newer commits on the branch that exists on both ends,
this is totally unnecessary.

This adds logic to the server side to see if the wanted refs can
reach the "have" commits received so far, and stop issuing "ack
continue" once all of them can be reached from "have" commits.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Ralf Baechle <ralf@linux-mips.org> writes:

>> [PATCH] fetch-pack: give up after getting too many "ack continue"
>
> So I did test your patch. In the big, slow repository it cuts down the
> time for a
>
> git fetch git://www./.../linux-2.6.16.y.git master:v2.6.16-stable
>
> from like 6min to about 7s.
>
> Thanks!

This patch is still rough, but it passes my test of asking for
"master" from git.git repository into a repository that is a
merge between linux-2.6.git and a slightly older git.git.
Without this change, and without the client-side hack Ralf
tested, it ends up walking down the entire kernel history.

The code to walk back from wanted ref is unnecessarily ugly and
inefficient -- if we only support a handful of "want"s (say 25) at
a time, we could make the traversal go as we receive "have" by
using something similar to what show-branches does. I am
reworking that part.

 upload-pack.c | 182 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 167 insertions(+), 15 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 47560c9..e57733b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -11,12 +11,18 @@ static const char upload_pack_usage[] =
 #define THEY_HAVE (1U << 0)
 #define OUR_REF (1U << 1)
 #define WANTED (1U << 2)
+#define COMMON_KNOWN (1U << 3)
+
+#define TRACE_SEEN (1U << 4)
+#define TRACE_BASE 5
+#define MAX_TRACE 20 /* should not exceed bits_per_int - TRACE_BASE */
+
 #define MAX_HAS 256
 #define MAX_NEEDS 256
 static int nr_has = 0, nr_needs = 0, multi_ack = 0, nr_our_refs = 0;
 static int use_thin_pack = 0;
-static unsigned char has_sha1[MAX_HAS][20];
-static unsigned char needs_sha1[MAX_NEEDS][20];
+static struct object *has_sha1[MAX_HAS];
+static struct object *needs_sha1[MAX_NEEDS];
 static unsigned int timeout = 0;
 
 static void reset_timeout(void)
@@ -69,19 +75,22 @@ static void create_pack_file(void)
 		if (create_full_pack || MAX_NEEDS <= nr_needs)
 			*p++ = "--all";
 		else {
+			struct object **o = needs_sha1;
 			for (i = 0; i < nr_needs; i++) {
 				*p++ = buf;
-				memcpy(buf, sha1_to_hex(needs_sha1[i]), 41);
+				memcpy(buf, sha1_to_hex((*o++)->sha1), 41);
 				buf += 41;
 			}
 		}
-		if (!create_full_pack)
+		if (!create_full_pack) {
+			struct object **o = has_sha1;
 			for (i = 0; i < nr_has; i++) {
 				*p++ = buf;
 				*buf++ = '^';
-				memcpy(buf, sha1_to_hex(has_sha1[i]), 41);
+				memcpy(buf, sha1_to_hex((*o++)->sha1), 41);
 				buf += 41;
 			}
+		}
 		*p++ = NULL;
 		execv_git_cmd(argv);
 		die("git-upload-pack: unable to exec git-rev-list");
@@ -93,6 +102,125 @@ static void create_pack_file(void)
 	die("git-upload-pack: unable to exec git-pack-objects");
 }
 
+static int trace_want(struct object **trace, int cnt)
+{
+	/* start from these cnt objects, traverse the reachability
+	 * chain, without parsing new objects, to see if we can
+	 * reach objects they have.
+	 */
+	int i, j;
+	unsigned trace_flags = 0;
+	struct object_list *list = NULL;
+
+	for (i = 0; i < cnt; i++)
+		trace_flags |= (1U << (TRACE_BASE + i));
+
+	for (i = 0; i < obj_allocs; i++)
+		if (objs[i])
+			objs[i]->flags &= ~trace_flags;
+
+	for (i = 0; i < cnt; i++) {
+		trace[i]->flags |= 1U << (TRACE_BASE + i);
+		object_list_insert(trace[i], &list);
+	}
+
+	while (list) {
+		struct object_list *next = list->next;
+		struct object *o = list->item;
+		unsigned flags = o->flags & trace_flags;
+
+		free(list);
+		list = next;
+		if (o->flags & TRACE_SEEN)
+			continue;
+		o->flags |= TRACE_SEEN;
+		if (!strcmp(o->type, tag_type)) {
+			o = deref_tag(o, NULL, 0);
+			if (o && (o->flags & trace_flags) != flags) {
+				o->flags |= flags;
+				object_list_insert(o, &list);
+			}
+			continue;
+		}
+		if (!strcmp(o->type, commit_type)) {
+			struct commit *c = (struct commit *)o;
+			struct commit_list *l = c->parents;
+			while (l) {
+				struct commit *p = l->item;
+				l = l->next;
+				if ((p->object.flags & trace_flags) != flags) {
+					p->object.flags |= flags;
+					object_list_insert(&p->object, &list);
+				}
+			}
+		}
+	}
+
+	/* Now scan the objects they have, and see if the wanted one
+	 * reach which ones.
+	 */
+	for (j = 0; j < nr_needs; j++) {
+		for (i = 0;
+		     (i < nr_has &&
+		      !(needs_sha1[j]->flags & COMMON_KNOWN));
+		     i++) {
+			if (has_sha1[i]->flags & (1U << (TRACE_BASE + j)))
+				needs_sha1[j]->flags |= COMMON_KNOWN;
+		}
+	}
+
+	for (j = 0; j < nr_needs; j++) {
+		if (!(needs_sha1[j]->flags & COMMON_KNOWN))
+			return 1;
+	}
+	return 0;
+}
+
+static void check_want_heads(void)
+{
+	/* Do not keep saying "ack continue" if we already know
+	 * common ancestor for all the "want"ed heads. This is
+	 * particularly important if some of the "have" heads does
+	 * not share any root commit with us. Otherwise we would
+	 * keep asking for that branch, hoping we might get a better
+	 * common ancestor than we already have.
+	 */
+	int i, still_missing;
+	struct object *trace[MAX_TRACE];
+	int trace_bit;
+
+	if (!multi_ack)
+		return;
+
+	still_missing = 0;
+	trace_bit = 0;
+	for (i = 0; still_missing && i < nr_needs; i++) {
+		struct object *o = needs_sha1[i];
+		if (o->flags & COMMON_KNOWN)
+			continue;
+		if (strcmp(o->type, tag_type) &&
+		    strcmp(o->type, commit_type))
+			/* Asking for non traceable types - there
+			 * is not much we can do to optimize it here.
+			 * We will let rev-list deal with it.
+			 */
+			continue;
+		if (trace_bit < MAX_TRACE) {
+			trace[trace_bit] = o;
+			trace_bit++;
+		}
+		else {
+			still_missing = trace_want(trace, trace_bit);
+			trace_bit = 0;
+		}
+	}
+	if (trace_bit && !still_missing)
+		still_missing = trace_want(trace, trace_bit);
+
+	if (!still_missing)
+		multi_ack = 0;
+}
+
 static int got_sha1(char *hex, unsigned char *sha1)
 {
 	if (get_sha1_hex(hex, sha1))
@@ -107,15 +235,39 @@ static int got_sha1(char *hex, unsigned
 			die("oops (%s)", sha1_to_hex(sha1));
 		if (o->type == commit_type) {
 			struct commit_list *parents;
+			int we_knew_they_have = 0;
+
+			/* Because we deliberately stay behind by one
+			 * window in order to make the protocol
+			 * stream, many commits can already be in
+			 * flight when we notice that the latest one
+			 * in the series is already what we have. Do
+			 * not waste the has_sha1[] slot for extra commits
+			 * sent that way.
+			 *
+			 * This relies on fetch-pack sending the "have"
+			 * lines without skipping.
+			 */
 			if (o->flags & THEY_HAVE)
-				return 0;
-			o->flags |= THEY_HAVE;
+				we_knew_they_have = 1;
+			else
+				o->flags |= THEY_HAVE;
 			for (parents = ((struct commit*)o)->parents;
 			     parents;
 			     parents = parents->next)
 				parents->item->object.flags |= THEY_HAVE;
+			if (we_knew_they_have)
+				return 0;
 		}
-		memcpy(has_sha1[nr_has++], sha1, 20);
+		has_sha1[nr_has++] = o;
+
+		/* Check to see if we know a common ancestor for
+		 * all the "want" heads, and if so turn multi_ack
+		 * off. There is nothing more gained by further
+		 * exchange.
+		 */
+		check_want_heads();
+
 	}
 	return 1;
 }
@@ -141,7 +293,7 @@ static int get_common_commits(void)
 		len = strip(line, len);
 		if (!strncmp(line, "have ", 5)) {
 			if (got_sha1(line+5, sha1) &&
-					(multi_ack || nr_has == 1)) {
+			    (multi_ack || nr_has == 1)) {
 				if (nr_has >= MAX_HAS)
 					multi_ack = 0;
 				packet_write(1, "ACK %s%s\n",
@@ -156,7 +308,7 @@ static int get_common_commits(void)
 			if (nr_has > 0) {
 				if (multi_ack)
 					packet_write(1, "ACK %s\n",
-							sha1_to_hex(last_sha1));
+						     sha1_to_hex(last_sha1));
 				return 0;
 			}
 			packet_write(1, "NAK\n");
@@ -174,23 +326,21 @@ static int receive_needs(void)
 	needs = 0;
 	for (;;) {
 		struct object *o;
-		unsigned char dummy[20], *sha1_buf;
+		unsigned char sha1_buf[20];
 		len = packet_read_line(0, line, sizeof(line));
 		reset_timeout();
 		if (!len)
 			return needs;
 
-		sha1_buf = dummy;
 		if (needs == MAX_NEEDS) {
 			fprintf(stderr,
 				"warning: supporting only a max of %d requests. "
 				"sending everything instead.\n",
 				MAX_NEEDS);
 		}
-		else if (needs < MAX_NEEDS)
-			sha1_buf = needs_sha1[needs];
 
-		if (strncmp("want ", line, 5) || get_sha1_hex(line+5, sha1_buf))
+		if (strncmp("want ", line, 5) ||
+		    get_sha1_hex(line+5, sha1_buf))
 			die("git-upload-pack: protocol error, "
 			    "expected to get sha, not '%s'", line);
 		if (strstr(line+45, "multi_ack"))
@@ -211,6 +361,8 @@ static int receive_needs(void)
 			die("git-upload-pack: not our ref %s", line+5);
 		if (!(o->flags & WANTED)) {
 			o->flags |= WANTED;
+			if (needs < MAX_NEEDS)
+				needs_sha1[needs] = o;
 			needs++;
 		}
 	}
-- 
1.3.3.g2a0a

^ permalink raw reply related [flat|nested] 23+ messages in thread
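
The idea behind the patch above (stop answering "continue" once every "want" head can reach some commit the other side said it has) can be illustrated on a toy history. The sketch below is not the patch's algorithm: it walks from each want separately instead of using per-want flag bits, and `struct toy_commit`, `want_reaches_have()` and `all_wants_reach_a_have()` are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMMITS 16
#define MAX_PARENTS 2

/* Hypothetical server-side view of a commit; parents are array indices, -1 = none. */
struct toy_commit {
	int parent[MAX_PARENTS];
	bool they_have;		/* set when the client sent "have" for this commit */
};

/* Walk the ancestry of one wanted head until a commit the client has is found. */
static bool want_reaches_have(const struct toy_commit *commits, size_t nr, int want)
{
	bool seen[MAX_COMMITS] = { false };
	int stack[MAX_COMMITS];	/* each commit is pushed at most once */
	size_t top = 0;

	assert(nr <= MAX_COMMITS && want >= 0 && (size_t)want < nr);
	seen[want] = true;
	stack[top++] = want;
	while (top) {
		int cur = stack[--top];
		if (commits[cur].they_have)
			return true;
		for (int k = 0; k < MAX_PARENTS; k++) {
			int p = commits[cur].parent[k];
			if (p >= 0 && (size_t)p < nr && !seen[p]) {
				seen[p] = true;
				stack[top++] = p;
			}
		}
	}
	return false;
}

/* True when no wanted head is left without a known common ancestor. */
static bool all_wants_reach_a_have(const struct toy_commit *commits, size_t nr,
				   const int *wants, size_t nr_wants)
{
	for (size_t i = 0; i < nr_wants; i++)
		if (!want_reaches_have(commits, nr, wants[i]))
			return false;
	return true;
}
```

"have" lines for commits the server has never seen (the client's second root, in Ralf's case) never set `they_have` on anything, so they cannot change the answer. That is why, in this model, following that history further gains nothing.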

* Re: Slow fetches of tags
2006-05-24 23:43 ` Linus Torvalds
2006-05-25 1:32 ` Junio C Hamano
@ 2006-05-25 13:12 ` Ralf Baechle
2006-07-26 23:27 ` Junio C Hamano
1 sibling, 1 reply; 23+ messages in thread
From: Ralf Baechle @ 2006-05-25 13:12 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Wed, May 24, 2006 at 04:43:02PM -0700, Linus Torvalds wrote:

> Actually, maybe the problem is that Ralf's tree has two roots, because of
> the old CVS history. It might be following the other root down for the
> "have" part, since that one doesn't exist at all in the target and the
> other side will never acknowledge any of it.
>
> I'll play with it.

Interesting idea, so I went to play with it, too. I took a copy of the
tree and deleted all branches except the v2.6.16-stable tracking branch
which I pruned back to v2.6.16.17, then added a new branch starting at
the oldest commit, your initial import of the kernel tree:

$ git branch junk 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
$ git checkout junk
$ seq -f "%05.0f" 1 100 | while read i; do echo $i; echo $i > Makefile;\
  git commit -s -m "Blah $i" Makefile; done

So with this I get:

$ git branch
* junk
  v2.6.16-stable
$

If I now run

$ strace git-fetch-pack --thin git://www.kernel.org/pub/scm/linux/kernel/\
  git/stable/linux-2.6.16.y.git \
  refs/heads/master refs/tags/v2.6.16.18 2>&1 | grep have /tmp/xxx

I get:

write(3, "0032have ef686028603c291ba510c66"..., 50) = 50
write(3, "0032have 150384dac99eb263c4385c7"..., 50) = 50
write(3, "0032have 4df3afbfc2d8f6c22d41c63"..., 50) = 50
...
write(3, "0032have db119fba3d9495aa9cd5a63"..., 50) = 50

Where ef686028603c291ba510c66 = junk, 150384dac99eb263c4385c7 = junk~1 ...
db119fba3d9495aa9cd5a63 = junk~99 (first commit on the junk branch).
100 "have" lines up to this point, then:

write(3, "0032have d87319c3e4d908e157a462d"..., 50) = 50
write(3, "0032have 22ddf44d54d0b2326f7b233"..., 50) = 50
write(3, "0032have 90a03936acb1c3400a5833c"..., 50) = 50
write(3, "0032have bf7d8bacaaf241a0f015798"..., 50) = 50
write(3, "0032have a120571fbdfc8f543eea642"..., 50) = 50
write(3, "0032have 42a46c74c4520174b82a60a"..., 50) = 50
write(3, "0032have f66ab685594d49e570b2176"..., 50) = 50
write(3, "0032have 834f514019e01f87657a257"..., 50) = 50
write(3, "0032have 9d395d1961a0eeb9e8b1ef2"..., 50) = 50
write(3, "0032have aa48603d1ba772d0a2b28ab"..., 50) = 50
write(3, "0032have 54e5705fd460c7621a4d73c"..., 50) = 50
write(3, "0032have 37863c8a9b7b0261ec76daa"..., 50) = 50
write(3, "0032have a7603f9099869f9aeebd6c7"..., 50) = 50
write(3, "0032have 623c30d2ae22cd4b8703c77"..., 50) = 50
write(3, "0032have e2c78fb27dd13ab8c778a96"..., 50) = 50
write(3, "0032have dbb676d1214c181e6cde4ce"..., 50) = 50
write(3, "0032have 1ffe5e06461f72b9b6a2569"..., 50) = 50

These are the commits for which this test tree has the tags left:

$ ls .git/refs/tags/
v2.6.16.1   v2.6.16.12  v2.6.16.15  v2.6.16.2  v2.6.16.5  v2.6.16.8
v2.6.16.10  v2.6.16.13  v2.6.16.16  v2.6.16.3  v2.6.16.6  v2.6.16.9
v2.6.16.11  v2.6.16.14  v2.6.16.17  v2.6.16.4  v2.6.16.7
$

And finally:

write(3, "0032have 1da177e4c3f41524e886b7f"..., 50) = 50

which is your Linux-2.6.12-rc2 import.

Ralf

^ permalink raw reply [flat|nested] 23+ messages in thread

* Re: Slow fetches of tags
2006-05-25 13:12 ` Slow fetches of tags Ralf Baechle
@ 2006-07-26 23:27 ` Junio C Hamano
2006-07-28 10:42 ` Johannes Schindelin
2006-07-28 11:12 ` [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags Johannes Schindelin
0 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2006-07-26 23:27 UTC (permalink / raw)
To: Ralf Baechle; +Cc: git, Johannes Schindelin, Linus Torvalds

Ralf Baechle <ralf@linux-mips.org> writes:

> On Wed, May 24, 2006 at 04:43:02PM -0700, Linus Torvalds wrote:
>
>> Actually, maybe the problem is that Ralf's tree has two roots, because of
>> the old CVS history. It might be following the other root down for the
>> "have" part, since that one doesn't exist at all in the target and the
>> other side will never acknowledge any of it.
>>
>> I'll play with it.
>
> Interesting idea, so I went to play with it, too. I took a copy of the
> tree and deleted all branches except the v2.6.16-stable tracking branch
> which I pruned back to v2.6.16.17, then added a new branch starting at
> the oldest commit, your initial import of the kernel tree:

I've been looking at this issue again...

> $ git branch junk 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
> $ git checkout junk
> $ seq -f "%05.0f" 1 100 | while read i; do echo $i; echo $i > Makefile;\
> git commit -s -m "Blah $i" Makefile; done
>
> So with this I get:
>
> $ git branch
> * junk
> v2.6.16-stable
> $
>
> If I now run
>
> $ strace git-fetch-pack --thin git://www.kernel.org/pub/scm/linux/kernel/\
> git/stable/linux-2.6.16.y.git \
> refs/heads/master refs/tags/v2.6.16.18 2>&1 | grep have /tmp/xxx
>
> I get:

... 100 newest commits from the junk branch and then all the
tags the downloader has are sent as "have"s.

Now, sending the newest commits before sending the tags is
unavoidable, since the other end does not know where you forked
at (the purpose of the handshake is to find out where to begin
with).
But as soon as you send v2.6.16.17 (the latest tag that
you have in common with the other side, _and_ is a proper
ancestor of what you want -- v2.6.16.18 but that fact you do not
know yet), the server end should be able to say "ok, we know
enough". That is not happening.

A few hints for debugging this:

* local test is easier -- fetch-pack spawns upload-pack using
  PATH and GIT_EXEC_PATH so set them to point at the updated
  upload-pack being tested.

* Passing the standard error from "fetch-pack -v" to "name-rev
  --stdin" makes it a bit more pleasant to see what is going on.

With the attached patch, the server side tells the client to
stop immediately after it says it has the commit tagged as
v2.6.16.17 while asking for v2.6.16.18.

With your "100 commits on junk" repository, it does not make
much of a difference, though. The reasons are (1) the 100
commits on "junk" are much younger than any of the tags, so they
are sent anyway, (2) we have a 32-commit window, and keep one
window in flight to make the protocol stream, which means there
will be max 64 "have" that are in flight unacked, and a clone of
linux-2.6.16.y repository that has up to v2.6.16.17 tag has only
52 tags. So we end up sending all the tags anyway in this
particular case.

I've thought about sending tags and only _tips_ of branches
first, but I think that would have a grave performance impact on
more normal cases.
If you are dealing with a remote repository
with a bunch of tags, your "master" is ahead of the remote
repository, and you do not use tracking branch to track the
remote (pretend you are Linus and pulling from a subsystem
maintainer), then you obviously do not want to send v2.6.12-rc2
tag before you send commits from your "master" branch to get to
where your subsystem maintainer forked from you (otherwise the
remote side would say "I do not know your 'master' commit, but
now we know we have this ancient v2.6.12-rc2 in common, so let's
have a pack between that and the tip of the subsystem tree"), so
I do think sending "100 commits on junk branch" is unavoidable.

I think the attached patch is safe in general, but somebody may
want to give an extra set of eyeballs to double check the logic
is sane.

-- >8 --
upload-pack: squelch downloader more aggressively under multi-ack

When the server side sees "have" line that makes all the "want"
commits somehow reachable from one of the "have" lines so far,
stop responding "continue" to prevent the other end going down
to send too many refs.

---
diff --git a/upload-pack.c b/upload-pack.c
index 617ee46..ac42d0d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -452,8 +452,13 @@ static int get_common_commits(void)
 			default:
 				memcpy(hex, sha1_to_hex(sha1), 41);
 				if (multi_ack) {
-					const char *msg = "ACK %s continue\n";
-					packet_write(1, msg, hex);
+					const char *msg = "ACK %s%s\n";
+					const char *cont = " continue";
+					if (ok_to_give_up()) {
+						cont = "";
+						multi_ack = 0;
+					}
+					packet_write(1, msg, hex, cont);
 					memcpy(last_hex, hex, 41);
 				}
 				else if (have_obj.nr == 1)

^ permalink raw reply related [flat|nested] 23+ messages in thread
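
The window arithmetic in the message above (32 "have" lines per flush, one flush kept in flight, so up to 64 unacknowledged "have"s) can be worked through with a small model. This is a simplification under stated assumptions, not fetch-pack's code: `haves_sent_before_stop()` is a hypothetical name, and the model assumes the client reads the reply to one window only after it has sent the following window.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of the pipelined "have" exchange.
 *
 *   total   - number of "have" lines the client could send in all
 *   stop_at - 1-based position of the "have" that lets the server
 *             stop answering "continue"
 *   window  - "have" lines per flush (32 in the message above)
 *
 * Returns how many "have" lines go out before the client can learn
 * that the server has seen enough.
 */
static size_t haves_sent_before_stop(size_t total, size_t stop_at, size_t window)
{
	size_t decisive_window, sent;

	assert(window > 0 && stop_at >= 1);
	if (stop_at > total)
		return total;	/* the decisive "have" is never sent */

	/* Which flush carries the decisive "have" (1-based, rounded up). */
	decisive_window = (stop_at + window - 1) / window;

	/* One more full window is already in flight when its reply is read. */
	sent = (decisive_window + 1) * window;
	return sent < total ? sent : total;
}
```

Reading Ralf's trace as 100 junk commits, 17 tagged commits and the root (118 candidates), and assuming the decisive "have" is the 101st, the model gives `haves_sent_before_stop(118, 101, 32) == 118`: everything is sent, matching the observation that the patch changes little for this particular test repository. With a long side history, say 1000 candidates and a decisive "have" at position 33, it gives 96.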
* Re: Slow fetches of tags
2006-07-26 23:27 ` Junio C Hamano
@ 2006-07-28 10:42 ` Johannes Schindelin
2006-07-28 11:12 ` [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags Johannes Schindelin
1 sibling, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2006-07-28 10:42 UTC
To: Junio C Hamano; +Cc: Ralf Baechle, git, Linus Torvalds

Hi,

On Wed, 26 Jul 2006, Junio C Hamano wrote:

> I think the attached patch is safe in general, but somebody may
> want to give an extra set of eyeballs to double check the logic
> is sane.

The only gripe I have with it is that reachable() is relatively expensive,
and it might be misused by a nasty client, making the server go down the
whole history. I have no idea, though, how to prevent that.

Ciao,
Dscho

* [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags
2006-07-26 23:27 ` Junio C Hamano
2006-07-28 10:42 ` Johannes Schindelin
@ 2006-07-28 11:12 ` Johannes Schindelin
2006-07-28 15:43 ` Junio C Hamano
2006-07-28 16:59 ` Linus Torvalds
1 sibling, 2 replies; 23+ messages in thread
From: Johannes Schindelin @ 2006-07-28 11:12 UTC
To: Junio C Hamano; +Cc: git

Now you can say

	git --name-rev log

instead of

	git log | git name-rev --stdin | less

with the benefit that diff.color=auto still works. There is also a
shortcut "-n" for --name-rev.

The option --name-rev-by-tags (or -t) tries to name the revs by tags
instead of all refs, which is nicer when talking to other people, since
their heads may be different from yours (I feel like talking to Zaphod ;-).

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---

	On Wed, 26 Jul 2006, Junio C Hamano wrote:

	>  * Passing the standard error from "fetch-pack -v" to "name-rev
	>    --stdin" makes it a bit more pleasant to see what is going on.

	This patch makes it even easier.

 Documentation/git.txt |   12 ++++++++++--
 cache.h               |    1 +
 git.c                 |    9 +++++++--
 pager.c               |   41 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7310a2b..eae930f 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -9,7 +9,8 @@ git - the stupid content tracker
 SYNOPSIS
 --------
 'git' [--version] [--exec-path[=GIT_EXEC_PATH]] [-p|--paginate]
-    [--bare] [--git-dir=GIT_DIR] [--help] COMMAND [ARGS]
+    [-n|--name-rev] [-t|--name-rev-by-tags] [--bare]
+    [--git-dir=GIT_DIR] [--help] COMMAND [ARGS]
 
 DESCRIPTION
 -----------
@@ -45,12 +46,19 @@ OPTIONS
 -p|--paginate::
 	Pipe all output into 'less' (or if set, $PAGER).
 
+-n|--name-rev:
+	Try naming all SHA1s, and page the result (see
+	link:git-name-rev[1] for a detailed explanation).
+
+-t|--name-rev-by-tags:
+	Same as '--name-rev', but try to name the SHA1s by tags.
+
 --git-dir=<path>::
 	Set the path to the repository. This can also be controlled by
 	setting the GIT_DIR environment variable.
 
 --bare::
-	Same as --git-dir=`pwd`.
+	Same as '--git-dir=`pwd`'.
 
 FURTHER DOCUMENTATION
 ---------------------
diff --git a/cache.h b/cache.h
index 8891073..d6c5edb 100644
--- a/cache.h
+++ b/cache.h
@@ -391,6 +391,7 @@ extern int receive_keep_pack(int fd[2],
 
 /* pager.c */
 extern void setup_pager(void);
+extern void setup_name_rev_pager(int by_tags);
 extern int pager_in_use;
 
 /* base85 */
diff --git a/git.c b/git.c
index 4ea5efb..4206b43 100644
--- a/git.c
+++ b/git.c
@@ -63,9 +63,14 @@ static int handle_options(const char***
 				puts(git_exec_path());
 				exit(0);
 			}
-		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
+		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate"))
 			setup_pager();
-		} else if (!strcmp(cmd, "--git-dir")) {
+		else if (!strcmp(cmd, "-n") || !strcmp(cmd, "--name-rev"))
+			setup_name_rev_pager(0);
+		else if (!strcmp(cmd, "-t") ||
+				!strcmp(cmd, "--name-rev-by-tags"))
+			setup_name_rev_pager(1);
+		else if (!strcmp(cmd, "--git-dir")) {
 			if (*argc < 1)
 				return -1;
 			setenv("GIT_DIR", (*argv)[1], 1);
diff --git a/pager.c b/pager.c
index 280f57f..48b2467 100644
--- a/pager.c
+++ b/pager.c
@@ -53,3 +53,44 @@ void setup_pager(void)
 	die("unable to execute pager '%s'", pager);
 	exit(255);
 }
+
+void setup_name_rev_pager(int by_tags)
+{
+	pid_t pid;
+	int fd[2];
+
+	if (!isatty(1))
+		return;
+
+	pager_in_use = 1; /* means we are emitting to terminal */
+
+	if (pipe(fd) < 0)
+		return;
+	pid = fork();
+	if (pid < 0) {
+		close(fd[0]);
+		close(fd[1]);
+		return;
+	}
+
+	/* return in the child */
+	if (!pid) {
+		dup2(fd[1], 1);
+		close(fd[0]);
+		close(fd[1]);
+		return;
+	}
+
+	/* The original process turns into paging name-rev */
+	dup2(fd[0], 0);
+	close(fd[0]);
+	close(fd[1]);
+
+	setup_pager();
+	if (by_tags)
+		execl("git", "git", "name-rev", "--tags", "--stdin", NULL);
+	else
+		execl("git", "git", "name-rev", "--stdin", NULL);
+	die("unable to execute git-name-rev");
+	exit(255);
+}
-- 
1.4.2.rc2.g61d8

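For readers who have not met the pattern the pager.c hunk relies on, here is a standalone sketch of sending data through a filter process with pipe(), fork(), dup2() and exec. It is not the patch's code and makes several assumptions: `filter_through()` is a hypothetical helper, `tr a-z A-Z` stands in for `git name-rev --stdin`, and the roles are swapped so that the child becomes the filter (in the patch the original process becomes the filter and the child carries on as the git command). It also uses execlp(), which searches PATH, whereas execl() as used in the patch takes a pathname and does no PATH lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: feed `in` to "tr a-z A-Z" and collect up to
 * outsz-1 bytes of its output in `out` (NUL-terminated).
 * Returns the number of bytes collected, or -1 on error.
 * Only suitable for small inputs, since all input is written
 * before any output is read.
 */
static ssize_t filter_through(const char *in, size_t inlen,
			      char *out, size_t outsz)
{
	int to_child[2], from_child[2];
	size_t total = 0;
	ssize_t n, written;
	pid_t pid;

	assert(in && out && outsz > 0);
	if (pipe(to_child) < 0)
		return -1;
	if (pipe(from_child) < 0) {
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(to_child[0]);
		close(to_child[1]);
		close(from_child[0]);
		close(from_child[1]);
		return -1;
	}

	if (!pid) {
		/* Child: stdin and stdout become the pipe ends, then exec the filter. */
		dup2(to_child[0], 0);
		dup2(from_child[1], 1);
		close(to_child[0]);
		close(to_child[1]);
		close(from_child[0]);
		close(from_child[1]);
		execlp("tr", "tr", "a-z", "A-Z", (char *)NULL);
		_exit(127);	/* only reached if exec failed */
	}

	/* Parent: keep the write end of one pipe and the read end of the other. */
	close(to_child[0]);
	close(from_child[1]);
	written = write(to_child[1], in, inlen);
	close(to_child[1]);	/* the filter now sees EOF on its stdin */

	while (total < outsz - 1 &&
	       (n = read(from_child[0], out + total, outsz - 1 - total)) > 0)
		total += (size_t)n;
	out[total] = '\0';
	close(from_child[0]);
	waitpid(pid, NULL, 0);

	return written == (ssize_t)inlen ? (ssize_t)total : -1;
}
```

Closing the unused pipe ends in both processes matters: the filter only sees end-of-file once every copy of the write end is closed.
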
* Re: [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags
2006-07-28 11:12 ` [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags Johannes Schindelin
@ 2006-07-28 15:43 ` Junio C Hamano
2006-07-28 16:59 ` Linus Torvalds
1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2006-07-28 15:43 UTC
To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Wed, 26 Jul 2006, Junio C Hamano wrote:
>
> >  * Passing the standard error from "fetch-pack -v" to "name-rev
> >    --stdin" makes it a bit more pleasant to see what is going on.
>
> This patch makes it even easier.

Probably wouldn't for that particular one, since what I wanted
to do was "git fetch-pack -v 2>&1 | git name-rev >/var/tmp/1",
so isatty(1) check in setup_name_rev_pager() is defeated by
redirection, and the information I wanted to pass name-rev would
not have passed it anyway.

But this _might_ be useful for other more general cases.  I'm
not sure -- it feels somewhat like a hack, though.

* Re: [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags
2006-07-28 11:12 ` [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags Johannes Schindelin
2006-07-28 15:43 ` Junio C Hamano
@ 2006-07-28 16:59 ` Linus Torvalds
2006-07-28 18:53 ` Johannes Schindelin
1 sibling, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2006-07-28 16:59 UTC
To: Johannes Schindelin; +Cc: Junio C Hamano, git

On Fri, 28 Jul 2006, Johannes Schindelin wrote:
>
> Now you can say
>
> 	git --name-rev log

I think this is wrong.

It may be a straightforward translation of

> 	git log | git name-rev --stdin | less

but that doesn't make it any more "correct".

From a logical standpoint, it should be an argument to the _logging_, not
to the main git binary, so it should be

	git log --name-rev

and you should do the parsing (and the output) inside revision.c.

Also, I doubt most people want every release named. I think the common
case would be that you want those releases named that match heads (and
tags in particular) _exactly_. If you want everything named, maybe you
want to do "--name-rev-all" or something.

Hmm?

(That would also likely perform a lot better)

		Linus

* Re: [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags
2006-07-28 16:59 ` Linus Torvalds
@ 2006-07-28 18:53 ` Johannes Schindelin
2006-07-29 12:43 ` Nguyễn Thái Ngọc Duy
0 siblings, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2006-07-28 18:53 UTC
To: Linus Torvalds; +Cc: Junio C Hamano, git

Hi,

On Fri, 28 Jul 2006, Linus Torvalds wrote:

> On Fri, 28 Jul 2006, Johannes Schindelin wrote:
> >
> > Now you can say
> >
> > 	git --name-rev log
>
> I think this is wrong.

I think it is not wrong. :-)

> It may be a straightforward translation of
>
> > 	git log | git name-rev --stdin | less
>
> but that doesn't make it any more "correct".

I use it also for other git commands, so this was very much on purpose.

> Also, I doubt most people want every release named.

You are probably right. But _I_ want to know that e.g. commit
a025463bc0ec2c894a88f2dfb44cf88ba71bb712 is really tags/v1.4.0^0~27^2.
Both are immutable, but the latter is nicer to people than to computers.

> I think the common case would be that you want those releases named that
> match heads (and tags in particular) _exactly_. If you want everything
> named, maybe you want to do "--name-rev-all" or something.
>
> Hmm?
>
> (That would also likely perform a lot better)

True. But then, you probably know which head it is, because you probably
specified it yourself on the command line.

Ciao,
Dscho

* Re: [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags
2006-07-28 18:53 ` Johannes Schindelin
@ 2006-07-29 12:43 ` Nguyễn Thái Ngọc Duy
2006-07-29 12:47 ` Johannes Schindelin
0 siblings, 1 reply; 23+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2006-07-29 12:43 UTC
To: Johannes Schindelin; +Cc: Linus Torvalds, Junio C Hamano, git

On 7/29/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> You are probably right. But _I_ want to know that e.g. commit
> a025463bc0ec2c894a88f2dfb44cf88ba71bb712 is really tags/v1.4.0^0~27^2.
> Both are immutable, but the latter is nicer to people than to computers.

I think so too. I had requested a similar feature on the git survey
and was surprised to see this patch. I'd appreciate it.

* Re: [PATCH] Teach the git wrapper about --name-rev and --name-rev-by-tags
2006-07-29 12:43 ` Nguyễn Thái Ngọc Duy
@ 2006-07-29 12:47 ` Johannes Schindelin
0 siblings, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2006-07-29 12:47 UTC
To: Nguyễn Thái Ngọc Duy; +Cc: Linus Torvalds, Junio C Hamano, git

Hi,

On Sat, 29 Jul 2006, Nguyễn Thái Ngọc Duy wrote:

> On 7/29/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > You are probably right. But _I_ want to know that e.g. commit
> > a025463bc0ec2c894a88f2dfb44cf88ba71bb712 is really tags/v1.4.0^0~27^2.
> > Both are immutable, but the latter is nicer to people than to computers.
> I think so too. I had requested a similar feature on the git survey
> and was surprised to see this patch. I'd appreciate it.

Now, guess three times where the idea comes from. ;-)

Ciao,
Dscho

* Re: Slow fetches of tags
2006-05-24 17:21 ` Linus Torvalds
2006-05-24 18:08 ` Junio C Hamano
@ 2006-05-24 19:06 ` Junio C Hamano
1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2006-05-24 19:06 UTC
To: Linus Torvalds; +Cc: git, Ralf Baechle

Linus Torvalds <torvalds@osdl.org> writes:

> Junio? Any ideas? I didn't want to do that tag-auto-following, and while I
> admit it's damn convenient, it's really quite broken, methinks.

I think the current setup is broken on two counts.

If you fetch without remote tracking branch, I suspect that we
end up asking for the tip of the remote again -- because there
is no ref that says "this commit is known to be complete -- we
just fetched from them successfully".

But I think what Ralf is seeing is a bit different.  The example
given:

    git fetch git://git.kernel.org/pub/scm/linux/kernel/git/stable/\
        linux-2.6.16.y.git master:v2.6.16-stable

does use a tracking branch, and when the tag following kicks in,
v2.6.16-stable head should have been updated.  I suspect it is
just its head commit is older than tips of other branches, and
purely date based sorting done by fetch-pack.c::get_rev() ends
up walking them before it gets to the tip of the branch we just
fetched.

I wonder if we can do a dirty hack to give bias to commits
coming from refs that are newer (on the local filesystem -- that
is, mtime of .git/refs/heads/v2.6.16-stable must be a lot newer
than .git/refs/heads/master in this case because we just fetched
it)...

* Re: Slow fetches of tags
2006-05-24 16:45 ` Linus Torvalds
2006-05-24 17:21 ` Linus Torvalds
@ 2006-05-24 18:08 ` Ralf Baechle
2006-05-24 18:41 ` Junio C Hamano
1 sibling, 1 reply; 23+ messages in thread
From: Ralf Baechle @ 2006-05-24 18:08 UTC
To: Linus Torvalds; +Cc: git

On Wed, May 24, 2006 at 09:45:29AM -0700, Linus Torvalds wrote:

> So this is a tree where you already _have_ most of the tags, no?

Yes, git did end up only fetching v2.6.16.18 as the single tag.

> Can you add a printout to show what the "taglist" is for you in
> git-fetch.sh (just before the thing that does that
>
> 	fetch_main "$taglist"
>
> thing?). It _should_ have pruned out all the tags you already have.

Right, it's just "refs/tags/v2.6.16.18:refs/tags/v2.6.16.18".

> Or is it just the "git-ls-remote" that takes forever?

git-ls-remote git://www.kernel.org/pub/scm/linux/kernel/git/stable/\
linux-2.6.16.y takes about 1.5s.

> (Or, if you run
> "top", is there something that is an obviously heavy operation on the
> client side?)

git-fetch-pack was burning some 6min CPU.  Nothing else even shows up
on the "top" radar.

Another funny thing I noticed in top is that the git-fetch-pack arguments
got overwritten:

$ cat /proc/1702/cmdline | tr '\0' ' '
git-fetch-pack --thin git //www.kernel.org pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git efs/heads/master efs/tags/v2.6.16.18

Guess that doesn't matter.  Anyway, so I ran strace on this git-fetch-pack
invocation:

[...]
munmap(0xb7fe5000, 229)                 = 0
getdents(5, /* 0 entries */, 4096)      = 0
close(5)                                = 0
getdents(4, /* 0 entries */, 4096)      = 0
close(4)                                = 0
write(3, "0046want 9b549d8e1e2f16cffbb414a"..., 70) = 70
write(3, "0000", 4)                     = 4
write(3, "0032have 0bcf7932d0ea742e765a40b"..., 50) = 50
write(3, "0032have 54e938a80873e85f9c02ab4"..., 50) = 50
write(3, "0032have 2d0a9369c540519bab8018e"..., 50) = 50
write(3, "0032have bf3060065ef9f0a8274fc32"..., 50) = 50
write(3, "0032have 27602bd8de8456ac619b77c"..., 50) = 50
[... another 42,000+ similar lines chopped off ...]

9b549d8e1e2f16cffbb414a is Chris Wright's tag for v2.6.16.18.  So far,
as expected.

And this is where things are getting interesting:

$ git-name-rev 0bcf7932d0ea742e765a40b
0bcf7932d0ea742e765a40b master
$ git-name-rev 54e938a80873e85f9c02ab4
54e938a80873e85f9c02ab4 34k-2.6.16.18
$ git-name-rev 2d0a9369c540519bab8018e
2d0a9369c540519bab8018e 34k-2.6.16.18~1
$ git-name-rev bf3060065ef9f0a8274fc32
bf3060065ef9f0a8274fc32 34k-2.6.16.18~2
$ git-name-rev 27602bd8de8456ac619b77c
27602bd8de8456ac619b77c 34k-2.6.16.18~3

It's sending every object back to the start of history ...

  Ralf

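The `write(3, "0032have ...", 50)` calls in the strace show the pkt-line framing used on the wire: four hex digits giving the total length of the line, the four digits themselves included, followed by the payload. 0x32 is 50, which is 4 + strlen("have ") + 40 hex digits + a newline. The sketch below reproduces that framing; `toy_pkt_line()` is a hypothetical helper written for illustration, not git's own packet_write().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: frame `payload` as a pkt-line in `buf`.
 * Returns the total number of bytes (length prefix included),
 * or -1 if the line does not fit.
 */
static int toy_pkt_line(char *buf, size_t bufsz, const char *payload)
{
	size_t total;

	assert(buf && payload);
	total = strlen(payload) + 4;	/* the prefix counts itself */
	if (total > 0xffff || total + 1 > bufsz)
		return -1;
	snprintf(buf, bufsz, "%04zx%s", total, payload);
	return (int)total;
}
```

The bare "0000" written right after the "want" line is not framed this way; it is the flush packet that marks the end of a section.
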
* Re: Slow fetches of tags
2006-05-24 18:08 ` Ralf Baechle
@ 2006-05-24 18:41 ` Junio C Hamano
2006-05-25 13:27 ` Ralf Baechle
0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2006-05-24 18:41 UTC
To: Ralf Baechle; +Cc: git

Ralf Baechle <ralf@linux-mips.org> writes:

>> Or is it just the "git-ls-remote" that takes forever?
>
> git-ls-remote git://www.kernel.org/pub/scm/linux/kernel/git/stable/\
> linux-2.6.16.y takes about 1.5s.

Good; that is as expected.  ls-remote over git protocol just
gets the initial "have" lines from the upload-pack and exits,
and there is no handshaking.

> Another funny thing I noticed in top is that the git-fetch-pack arguments
> got overwritten:
>
> $ cat /proc/1702/cmdline | tr '\0' ' '
> git-fetch-pack --thin git //www.kernel.org pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git efs/heads/master efs/tags/v2.6.16.18
>
> Guess that doesn't matter.

This is also expected - fetch-pack (connect.c::path_match(),
actually) smudges the list of refs to remember which ones the
caller asked are going to be fulfilled and which ones are not.
Not the most beautiful part of the code ;-).

> Guess that doesn't matter.  Anyway, so I ran strace on this git-fetch-pack
> invocation:
>
> [...]
> munmap(0xb7fe5000, 229)                 = 0
> getdents(5, /* 0 entries */, 4096)      = 0
> close(5)                                = 0
> getdents(4, /* 0 entries */, 4096)      = 0
> close(4)                                = 0
> write(3, "0046want 9b549d8e1e2f16cffbb414a"..., 70) = 70
> write(3, "0000", 4)                     = 4
> write(3, "0032have 0bcf7932d0ea742e765a40b"..., 50) = 50
> write(3, "0032have 54e938a80873e85f9c02ab4"..., 50) = 50
> write(3, "0032have 2d0a9369c540519bab8018e"..., 50) = 50
> write(3, "0032have bf3060065ef9f0a8274fc32"..., 50) = 50
> write(3, "0032have 27602bd8de8456ac619b77c"..., 50) = 50
> [... another 42,000+ similar lines chopped off ...]
>
> 9b549d8e1e2f16cffbb414a is Chris Wright's tag for v2.6.16.18.  So far,
> as expected.
>
> And this is where things are getting interesting:
>
> $ git-name-rev 0bcf7932d0ea742e765a40b
> 0bcf7932d0ea742e765a40b master
> $ git-name-rev 54e938a80873e85f9c02ab4
> 54e938a80873e85f9c02ab4 34k-2.6.16.18
> $ git-name-rev 2d0a9369c540519bab8018e
> 2d0a9369c540519bab8018e 34k-2.6.16.18~1
> $ git-name-rev bf3060065ef9f0a8274fc32
> bf3060065ef9f0a8274fc32 34k-2.6.16.18~2
> $ git-name-rev 27602bd8de8456ac619b77c
> 27602bd8de8456ac619b77c 34k-2.6.16.18~3
>
> It's sending every object back to the start of history ...

Is this "master" commit 0bcf79 part of v2.6.16.18 history?  If
not, how diverged are you?  That is, what does this command tell
you?

	git rev-list b7d0617..master | wc -l

Here, b7d0617 is the name of the commit object that is pointed
by v2.6.16.18 tag.

* Re: Slow fetches of tags
2006-05-24 18:41 ` Junio C Hamano
@ 2006-05-25 13:27 ` Ralf Baechle
0 siblings, 0 replies; 23+ messages in thread
From: Ralf Baechle @ 2006-05-25 13:27 UTC
To: Junio C Hamano; +Cc: git

On Wed, May 24, 2006 at 11:41:26AM -0700, Junio C Hamano wrote:

> > $ git-name-rev 0bcf7932d0ea742e765a40b
> > 0bcf7932d0ea742e765a40b master
> > $ git-name-rev 54e938a80873e85f9c02ab4
> > 54e938a80873e85f9c02ab4 34k-2.6.16.18
> > $ git-name-rev 2d0a9369c540519bab8018e
> > 2d0a9369c540519bab8018e 34k-2.6.16.18~1
> > $ git-name-rev bf3060065ef9f0a8274fc32
> > bf3060065ef9f0a8274fc32 34k-2.6.16.18~2
> > $ git-name-rev 27602bd8de8456ac619b77c
> > 27602bd8de8456ac619b77c 34k-2.6.16.18~3
> >
> > It's sending every object back to the start of history ...
>
> Is this "master" commit 0bcf79 part of v2.6.16.18 history?  If
> not, how diverged are you?  That is, what does this command tell
> you?

No, the master branch is where the MIPS development happens and it's
tracking Linus' master branch.  The fact that I'm talking about this
in context of -stable / v2.6.16.18 is that I started looking into why
things were taking minutes when doing a small fetch from 2.6.16-stable.
It happens just as well with Linus' tree or yet others like Matthias
Urlich's -mm git tree.

> 	git rev-list b7d0617..master | wc -l
>
> Here, b7d0617 is the name of the commit object that is pointed
> by v2.6.16.18 tag.

$ git rev-list b7d0617..master | wc -l
12845
$ git rev-list master..b7d0617 | wc -l      (that is swapped arguments)
173
$

  Ralf

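`git rev-list A..B | wc -l` counts the commits reachable from B that are not reachable from A, so the two numbers above say that master holds 12845 commits missing from v2.6.16.18, and v2.6.16.18 holds 173 missing from master. The same set difference can be sketched on a toy history. Everything here is an assumption made for illustration: `toy_mark()` and `toy_count_range()` are hypothetical names, and a small table of parent indexes stands in for real commits.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 64
#define TOY_PARENTS 2

/* Mark `idx` and all of its ancestors in `seen`; a parent of -1 means none. */
static void toy_mark(int parent[][TOY_PARENTS], int idx, unsigned char *seen)
{
	int i;

	if (idx < 0 || seen[idx])
		return;
	seen[idx] = 1;
	for (i = 0; i < TOY_PARENTS; i++)
		toy_mark(parent, parent[idx][i], seen);
}

/* Count commits reachable from `b` but not from `a`, like "a..b". */
static int toy_count_range(int parent[][TOY_PARENTS], int n, int a, int b)
{
	unsigned char from_a[TOY_MAX], from_b[TOY_MAX];
	int i, count = 0;

	assert(n > 0 && n <= TOY_MAX);
	memset(from_a, 0, sizeof(from_a));
	memset(from_b, 0, sizeof(from_b));
	toy_mark(parent, a, from_a);
	toy_mark(parent, b, from_b);
	for (i = 0; i < n; i++)
		if (from_b[i] && !from_a[i])
			count++;
	return count;
}
```

Swapping the two arguments gives the other side of the divergence, just as in the two rev-list invocations above.
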