* git describe weird behaviour
@ 2010-11-10 1:00 Miklos Vajna
2010-11-10 4:14 ` Jeff King
2010-11-10 19:38 ` Junio C Hamano
0 siblings, 2 replies; 11+ messages in thread
From: Miklos Vajna @ 2010-11-10 1:00 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 617 bytes --]
Hi,
I'm not sure if this is a bug, so I'm asking.
The frugalware project (git repo:
http://frugalware.org/git/pub/frugalware/frugalware-current ) has the
two latest tags 1.3 and 1.4pre1, somehow git describe HEAD now mentions
1.3 and not 1.4pre1 in the output.
To be more exact:
$ git rev-parse --short HEAD
f25435f
$ git rev-list 1.4pre1..|wc -l
871
So I would exepct 1.4pre1-871-gf25435f. In fact, the output is:
$ git describe
1.3-3028-gf25435f
Or in case I force the usage of the latest tag:
$ git describe --candidate 1
1.4pre1-64725-gf25435f
Does anyone have an idea what's going wrong here?
Thanks.
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 1:00 git describe weird behaviour Miklos Vajna
@ 2010-11-10 4:14 ` Jeff King
2010-11-10 8:47 ` Maaartin
` (2 more replies)
2010-11-10 19:38 ` Junio C Hamano
1 sibling, 3 replies; 11+ messages in thread
From: Jeff King @ 2010-11-10 4:14 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Shawn O. Pearce, git
[cc'ing Shawn, who probably doesn't remember at all how git-describe
works these days, but it's worth a shot]
On Wed, Nov 10, 2010 at 02:00:16AM +0100, Miklos Vajna wrote:
> The frugalware project (git repo:
> http://frugalware.org/git/pub/frugalware/frugalware-current ) has the
> two latest tags 1.3 and 1.4pre1, somehow git describe HEAD now mentions
> 1.3 and not 1.4pre1 in the output.
>
> To be more exact:
>
> $ git rev-parse --short HEAD
> f25435f
>
> $ git rev-list 1.4pre1..|wc -l
> 871
>
> So I would exepct 1.4pre1-871-gf25435f. In fact, the output is:
>
> $ git describe
> 1.3-3028-gf25435f
>
> Or in case I force the usage of the latest tag:
>
> $ git describe --candidate 1
> 1.4pre1-64725-gf25435f
>
> Does anyone have an idea what's going wrong here?
The describe code is pretty inscrutable. But what it should be doing is
roughly:
1. Describe all interesting refs (annotated tags by default) by
marking the commit object's util field.
2. Walk backwards in the commit tree. As we walk backwards, keep track
of how far (in commits) we have gone. When we see a tag, mark it
with how far we've gone.
The trick is in keeping track of how far we've gone. It looks like we
keep the number of seen_commits, increment each time we traverse a
commit, and then assign that to the "depth" field. But I don't see how
that can be right. We are traversing in a breadth-first manner, so we
may look at 1000 commits down one ancestry chain of a merge before
following the first parent on another. But when we finally follow the
second parent, our seen_commits is much higher than the actual distance
we had to travel to get here. Looking at the frugalware history in gitk,
you might be triggering this; you have a history with just a few merges,
but extremely long chains of commits on each branch. Repos like linux or
git.git are a bit bushier in the shape of the graph.
But there's more going on there that I don't quite understand. There is
some kind of magic "within" flag being kept for each tag we find. It's a
little too late at night for me to wrap my head around it tonight. But I
don't think it changes the fundamental issue that our depth value is
just an estimate; its accuracy works under the assumption that all sides
of a branch are about equal.
It seems to me the "right" way to do it would be to track the depth to
get to each commit. We can do this efficiently by just keeping a depth
marker for each commit we push onto the commit_list queue. When we visit
each commit, we know its depth, and when we push on its parents, they
get its depth+1.
The patch below implements that in a very rough-and-dirty way. It does
find the 1.4 tag in your repository that you expect. However:
1. This still isn't quite accurate. We might visit a commit by two
different paths. When we reach it by a shorter path, we need some
way of saying "oops, the depth on any tags we found following this
path are going to be too long. We need to go back and shorten
them".
2. I am getting nonsensical results when trying it in git.git. It
really wants to point me to gitgui tags, which makes no sense. So
clearly there is a bug, or my idea is flawed somehow. But it's too
late to think about anymore tonight. :)
-Peff
PS This would be a much simpler algorithm to write in a depth-first way.
But that would also involve traversing the entire graph down to the
roots, which we try to avoid. Which reminds me of my "tag
--contains" depth first algorithm, and gives me some ideas on how to
make it work in a breadth-first way. So even if my idea here is
flawed, this thinking hasn't been completely fruitless. :)
-- >8 --
diff --git a/builtin/describe.c b/builtin/describe.c
index 43caff2..31cf855 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -164,39 +164,6 @@ static int compare_pt(const void *a_, const void *b_)
return 0;
}
-static unsigned long finish_depth_computation(
- struct commit_list **list,
- struct possible_tag *best)
-{
- unsigned long seen_commits = 0;
- while (*list) {
- struct commit *c = pop_commit(list);
- struct commit_list *parents = c->parents;
- seen_commits++;
- if (c->object.flags & best->flag_within) {
- struct commit_list *a = *list;
- while (a) {
- struct commit *i = a->item;
- if (!(i->object.flags & best->flag_within))
- break;
- a = a->next;
- }
- if (!a)
- break;
- } else
- best->depth++;
- while (parents) {
- struct commit *p = parents->item;
- parse_commit(p);
- if (!(p->object.flags & SEEN))
- insert_by_date(p, list);
- p->object.flags |= c->object.flags;
- parents = parents->next;
- }
- }
- return seen_commits;
-}
-
static void display_name(struct commit_name *n)
{
if (n->prio == 2 && !n->tag) {
@@ -231,7 +198,6 @@ static void describe(const char *arg, int last_one)
struct commit_name *n;
struct possible_tag all_matches[MAX_TAGS];
unsigned int match_cnt = 0, annotated_cnt = 0, cur_match;
- unsigned long seen_commits = 0;
unsigned int unannotated_cnt = 0;
if (get_sha1(arg, sha1))
@@ -262,10 +228,11 @@ static void describe(const char *arg, int last_one)
list = NULL;
cmit->object.flags = SEEN;
commit_list_insert(cmit, &list);
+ list->depth = 0;
while (list) {
+ unsigned long depth = list->depth;
struct commit *c = pop_commit(&list);
struct commit_list *parents = c->parents;
- seen_commits++;
n = c->util;
if (n) {
if (!tags && !all && n->prio < 2) {
@@ -273,7 +240,7 @@ static void describe(const char *arg, int last_one)
} else if (match_cnt < max_candidates) {
struct possible_tag *t = &all_matches[match_cnt++];
t->name = n;
- t->depth = seen_commits - 1;
+ t->depth = depth;
t->flag_within = 1u << match_cnt;
t->found_order = match_cnt;
c->object.flags |= t->flag_within;
@@ -285,11 +252,6 @@ static void describe(const char *arg, int last_one)
break;
}
}
- for (cur_match = 0; cur_match < match_cnt; cur_match++) {
- struct possible_tag *t = &all_matches[cur_match];
- if (!(c->object.flags & t->flag_within))
- t->depth++;
- }
if (annotated_cnt && !list) {
if (debug)
fprintf(stderr, "finished search at %s\n",
@@ -299,8 +261,10 @@ static void describe(const char *arg, int last_one)
while (parents) {
struct commit *p = parents->item;
parse_commit(p);
- if (!(p->object.flags & SEEN))
- insert_by_date(p, &list);
+ if (!(p->object.flags & SEEN)) {
+ struct commit_list *cl = insert_by_date(p, &list);
+ cl->depth = depth + 1;
+ }
p->object.flags |= c->object.flags;
parents = parents->next;
}
@@ -329,9 +293,7 @@ static void describe(const char *arg, int last_one)
if (gave_up_on) {
insert_by_date(gave_up_on, &list);
- seen_commits--;
}
- seen_commits += finish_depth_computation(&list, &all_matches[0]);
free_commit_list(list);
if (debug) {
@@ -341,7 +303,6 @@ static void describe(const char *arg, int last_one)
prio_names[t->name->prio],
t->depth, t->name->path);
}
- fprintf(stderr, "traversed %lu commits\n", seen_commits);
if (gave_up_on) {
fprintf(stderr,
"more than %i tags found; listed %i most recent\n"
diff --git a/commit.h b/commit.h
index a3618f8..6493606 100644
--- a/commit.h
+++ b/commit.h
@@ -8,6 +8,7 @@
struct commit_list {
struct commit *item;
+ unsigned long depth;
struct commit_list *next;
};
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 4:14 ` Jeff King
@ 2010-11-10 8:47 ` Maaartin
2010-11-10 8:52 ` Thomas Rast
2010-11-10 14:03 ` Miklos Vajna
2 siblings, 0 replies; 11+ messages in thread
From: Maaartin @ 2010-11-10 8:47 UTC (permalink / raw)
To: git
Jeff King <peff <at> peff.net> writes:
> PS This would be a much simpler algorithm to write in a depth-first way.
> But that would also involve traversing the entire graph down to the
> roots, which we try to avoid. Which reminds me of my "tag
> --contains" depth first algorithm, and gives me some ideas on how to
> make it work in a breadth-first way. So even if my idea here is
> flawed, this thinking hasn't been completely fruitless. :)
IMHO, using BFS and counting commits is flawed, since the output is hard to
understand and quite meaningless for humans. Imagine a merge of two long
branches, one with a tag 10 commits deep and the other with a tag 11 commits
deep. Do you see any reason for a human to prefer the first one? I do not.
I could imagine to always prefer the first branch, which leads to DFS. What is
your reason for avoiding traversing to the roots?
It could be better to traverse the first branch only to the point where the
other started, then traverse the other one, and then continue down the common
trunk. This sounds a bit complicated to implement and may be hard to specify in
more complicated cases. A rule like "tag T1 can be returned only if there's no
tag T2 higher (i.e. newer) in the tree", could possibly solve it.
Finally, using a chronological ordering instead of counting commits seems to be
the most straightforward to me. An implementation using a priority queue is no
less and no more complicated then BFS or DFS. I think, it could be an
interesting and easy to comprehend option. In case I'm talking non-sense, pls
bare with me, since I'm still a beginner.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 4:14 ` Jeff King
2010-11-10 8:47 ` Maaartin
@ 2010-11-10 8:52 ` Thomas Rast
2010-11-10 14:03 ` Miklos Vajna
2 siblings, 0 replies; 11+ messages in thread
From: Thomas Rast @ 2010-11-10 8:52 UTC (permalink / raw)
To: Jeff King; +Cc: Miklos Vajna, Shawn O. Pearce, git
Jeff King wrote:
> The trick is in keeping track of how far we've gone. It looks like we
> keep the number of seen_commits, increment each time we traverse a
> commit, and then assign that to the "depth" field. But I don't see how
> that can be right. We are traversing in a breadth-first manner, so we
> may look at 1000 commits down one ancestry chain of a merge before
> following the first parent on another.
I haven't really spent more than about 3 minutes on this, but it seems
to use insert_by_date() (except for the start of the search) to walk,
so it would seem to be affected by date skew in some strange way that
I have yet to investigate.
So I merged your git-skew from pu and compiled, and ran
frugalware-current(master u=)$ git skew --all
182448414
frugalware-current(master u=)$ python
[...]
>>> 182448414/86400/365.25
5.7796030116358654
Unless I'm reading your commit message a2ffa6b96 wrong, that means the
repo has a worst-case clock skew of on the order of *six years*... So
maybe it's once again an undocumented effect of clock skew on our
history walks?
--
Thomas Rast
trast@{inf,student}.ethz.ch
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 4:14 ` Jeff King
2010-11-10 8:47 ` Maaartin
2010-11-10 8:52 ` Thomas Rast
@ 2010-11-10 14:03 ` Miklos Vajna
2010-12-09 1:33 ` Miklos Vajna
2 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2010-11-10 14:03 UTC (permalink / raw)
To: Jeff King; +Cc: Shawn O. Pearce, git
[-- Attachment #1: Type: text/plain, Size: 2451 bytes --]
On Tue, Nov 09, 2010 at 11:14:28PM -0500, Jeff King <peff@peff.net> wrote:
> we had to travel to get here. Looking at the frugalware history in gitk,
> you might be triggering this; you have a history with just a few merges,
> but extremely long chains of commits on each branch. Repos like linux or
> git.git are a bit bushier in the shape of the graph.
Agreed.
Actually I thought git describe is just a C version of something like:
- run git log, find the first reachable tag (1.4pre1 in this case)
- git rev-list 1.4pre1..|wc -l (880 in this case)
- append "-g<hash>" to make it still unique
Then it turned out it's a bit more complicated, so I mailed the list. :)
> The patch below implements that in a very rough-and-dirty way. It does
> find the 1.4 tag in your repository that you expect. However:
Yes, works here as well:
$ ~/git/git/git describe
1.4pre1-210-g48b67cd
> 2. I am getting nonsensical results when trying it in git.git. It
> really wants to point me to gitgui tags, which makes no sense. So
> clearly there is a bug, or my idea is flawed somehow. But it's too
> late to think about anymore tonight. :)
>
> -Peff
>
> PS This would be a much simpler algorithm to write in a depth-first way.
> But that would also involve traversing the entire graph down to the
> roots, which we try to avoid. Which reminds me of my "tag
> --contains" depth first algorithm, and gives me some ideas on how to
> make it work in a breadth-first way. So even if my idea here is
> flawed, this thinking hasn't been completely fruitless. :)
Hm, but can't we way git describe does history walk in a --first-parent
way? It might not be efficient, but what may work is to:
- traverse the history by using the first parrents only to find the
first reachable tag (this way you will never hit git-gui tags)
- count the commits from the first tag till HEAD by traversing the other
parents as well
To sum up, I think your patch is great about it picks up the right tag
for me, though:
$ git rev-list 1.4pre1..|wc -l
880 <- this is what I would expect in the git describe output
$ git rev-list --first-parent 1.4pre1..|wc -l
569 <- this is less informative, though probably is faster to produce
(we may have this info already when searching for the first tag)
$ ~/git/git/git describe
1.4pre1-210-g48b67cd <- an even smaller number you give. :)
Thanks!
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 1:00 git describe weird behaviour Miklos Vajna
2010-11-10 4:14 ` Jeff King
@ 2010-11-10 19:38 ` Junio C Hamano
2010-11-10 20:40 ` Miklos Vajna
1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2010-11-10 19:38 UTC (permalink / raw)
To: Miklos Vajna; +Cc: git
Just a guess. Does this have to do with 80dbae0 (Chose better tag names
in git-describe after merges., 2007-01-10)?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 19:38 ` Junio C Hamano
@ 2010-11-10 20:40 ` Miklos Vajna
2010-11-10 21:24 ` Jonathan Nieder
0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2010-11-10 20:40 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]
On Wed, Nov 10, 2010 at 11:38:58AM -0800, Junio C Hamano <gitster@pobox.com> wrote:
> Just a guess. Does this have to do with 80dbae0 (Chose better tag names
> in git-describe after merges., 2007-01-10)?
Hmm, not sure. I tried reverting that commit, but that resulted in
conflicts - git checkout 80dbae0^ produces a git binary that can't read
the index version we have today. :)
But I think the approach to follow the first parent only would solve the
problem outlined in the commit message of 80dbae0 as well: I think it's
pretty rare to tag a commit in a feature branch, then merge it.
I mean the trivial proof of concenpt below finds the right tag here:
$ ~/git/git/git describe
1.4pre1-572-g42b497b
Even if it does not count all the commits since that tag:
$ git rev-list 1.4pre1..|wc -l
883
Thanks.
diff --git a/builtin/describe.c b/builtin/describe.c
index 43caff2..f32eaf4 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -303,6 +303,7 @@ static void describe(const char *arg, int last_one)
insert_by_date(p, &list);
p->object.flags |= c->object.flags;
parents = parents->next;
+ break;
}
}
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 20:40 ` Miklos Vajna
@ 2010-11-10 21:24 ` Jonathan Nieder
2010-11-10 22:07 ` Miklos Vajna
0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Nieder @ 2010-11-10 21:24 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Junio C Hamano, git
Miklos Vajna wrote:
> But I think the approach to follow the first parent only would solve the
> problem outlined in the commit message of 80dbae0 as well: I think it's
> pretty rare to tag a commit in a feature branch, then merge it.
Doesn't that happen in linux-2.6 history fairly often (subsystem trees
syncing with upstream)?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 21:24 ` Jonathan Nieder
@ 2010-11-10 22:07 ` Miklos Vajna
0 siblings, 0 replies; 11+ messages in thread
From: Miklos Vajna @ 2010-11-10 22:07 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Junio C Hamano, git
[-- Attachment #1: Type: text/plain, Size: 898 bytes --]
On Wed, Nov 10, 2010 at 03:24:10PM -0600, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Miklos Vajna wrote:
>
> > But I think the approach to follow the first parent only would solve the
> > problem outlined in the commit message of 80dbae0 as well: I think it's
> > pretty rare to tag a commit in a feature branch, then merge it.
>
> Doesn't that happen in linux-2.6 history fairly often (subsystem trees
> syncing with upstream)?
Hm, makes sense - I don't think it ever happened in
torvalds/linux-2.6.git, but probably it happens in subsystem trees where
a given maintainer never tags anything, but regularly merges from
torvalds/linux-2.6.git. So my proposal would break git describe output
when it's invoked in a subsystem tree, you are right.
(Note that Peff's patch at least fixes the tag part of the issue for me,
and that should not really break the above situation.)
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-11-10 14:03 ` Miklos Vajna
@ 2010-12-09 1:33 ` Miklos Vajna
2010-12-09 3:28 ` Jeff King
0 siblings, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2010-12-09 1:33 UTC (permalink / raw)
To: Jeff King; +Cc: Shawn O. Pearce, git
[-- Attachment #1: Type: text/plain, Size: 428 bytes --]
On Wed, Nov 10, 2010 at 03:03:34PM +0100, Miklos Vajna <vmiklos@frugalware.org> wrote:
> > The patch below implements that in a very rough-and-dirty way. It does
> > find the 1.4 tag in your repository that you expect. However:
>
> Yes, works here as well:
>
> $ ~/git/git/git describe
> 1.4pre1-210-g48b67cd
Any update on this? I still have this patch in my tree to get correct
git describe output. :)
Thanks.
[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: git describe weird behaviour
2010-12-09 1:33 ` Miklos Vajna
@ 2010-12-09 3:28 ` Jeff King
0 siblings, 0 replies; 11+ messages in thread
From: Jeff King @ 2010-12-09 3:28 UTC (permalink / raw)
To: Miklos Vajna; +Cc: Shawn O. Pearce, git
On Thu, Dec 09, 2010 at 02:33:23AM +0100, Miklos Vajna wrote:
> On Wed, Nov 10, 2010 at 03:03:34PM +0100, Miklos Vajna <vmiklos@frugalware.org> wrote:
> > > The patch below implements that in a very rough-and-dirty way. It does
> > > find the 1.4 tag in your repository that you expect. However:
> >
> > Yes, works here as well:
> >
> > $ ~/git/git/git describe
> > 1.4pre1-210-g48b67cd
>
> Any update on this? I still have this patch in my tree to get correct
> git describe output. :)
Sorry, no, I haven't had time to think about it, and I probably won't
for a few more weeks at least.
The patch I posted was giving some really weird results for git.git,
though, so it is definitely not OK for inclusion as-is.
-Peff
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-12-09 3:28 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-10 1:00 git describe weird behaviour Miklos Vajna
2010-11-10 4:14 ` Jeff King
2010-11-10 8:47 ` Maaartin
2010-11-10 8:52 ` Thomas Rast
2010-11-10 14:03 ` Miklos Vajna
2010-12-09 1:33 ` Miklos Vajna
2010-12-09 3:28 ` Jeff King
2010-11-10 19:38 ` Junio C Hamano
2010-11-10 20:40 ` Miklos Vajna
2010-11-10 21:24 ` Jonathan Nieder
2010-11-10 22:07 ` Miklos Vajna
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).