* Possible --remove-empty bug
@ 2006-03-12 14:12 Marco Costalba
2006-03-12 21:31 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Marco Costalba @ 2006-03-12 14:12 UTC (permalink / raw)
To: junkio; +Cc: git
>From today git:
$ git-rev-parse HEAD
be767c91724275c4534965c0d25c452b76057602
$ git-rev-list be767c91724275c4534965c0d25c452b76057602 -- imap-send.c
f2561fda364ad984ef1441a80c90b0ee04f1a7c4
$ git-rev-list --remove-empty be767c91724275c4534965c0d25c452b76057602
-- imap-send.c
$
>From git-rev-list documentation:
--remove-empty::
Stop when a given path disappears from the tree.
But isn't it to be intended *after* a path disapperas from the tree?
In this case I would expect to see revision
f2561fda364ad984ef1441a80c90b0ee04f1a7c4 also with --remove-empty
option.
BTW rev f2561fda364ad984ef1441a80c90b0ee04f1a7c4 is the 'Add
git-imap-send, derived from isync 1.0.1.' patch.
Thanks
Marco
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-12 14:12 Possible --remove-empty bug Marco Costalba
@ 2006-03-12 21:31 ` Junio C Hamano
2006-03-12 22:54 ` Linus Torvalds
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2006-03-12 21:31 UTC (permalink / raw)
To: Marco Costalba; +Cc: git, Linus Torvalds
"Marco Costalba" <mcostalba@gmail.com> writes:
>>>From git-rev-list documentation:
>
> --remove-empty::
> Stop when a given path disappears from the tree.
>
> But isn't it to be intended *after* a path disapperas from the tree?
To be honest, I do not know how --remove-empty is intended to
work. What revision traversal code does and what the above says
are different.
The traversal code goes like this:
* Start from given commits (both interesting and
uninteresting), look at still-to-be-procesed commit
one by one, by calling add_parents_to_list().
* add_parents_to_list() grows still-to-be-processed
list; if the current commit is uninteresting, mark its
parents also uninteresting, and if no interesting
commit remains in the still-to-be-processed list, we
are done. On the other hand, if the current commit is
interesting, place it to the list of results.
* After the above traversal is done, the consumer calls
get_revision() to retrieve commits from the list of
results one-by-one. We return only interesting ones.
And in add_parents_to_list()
* if the commit is interesting, and when we are limiting
by paths, we call try_to_simplify_commit(). This
checks if the tree associated with the current commit
is the same as one of its parents' with respect to
specified paths, and if so pretend that the current
commit has only that parent and no other. This can
make a merge commit to lose other parents that we do
not inherit the specified paths from.
* try_to_simplify_commit() looks at each parent, and:
- if we find a parent that has the same tree (wrt the
paths we are interested in), we pretend it is the
sole parent of this commit.
- if we find a parent that does not have any of the
specified paths, we pretend we do not have that
parent under --remove-empty.
- otherwise we do not munge the list of parents.
My understanding of what the code is doing from the above
reading is to lose that empty parent, and it does not have much
to do with stop traversing the ancestry chain at such commit. I
am not sure that is what was intended...
Maybe something like this is closer to what the documentation
says.
-- >8 --
diff --git a/revision.c b/revision.c
index c8d93ff..03085ff 100644
--- a/revision.c
+++ b/revision.c
@@ -315,9 +315,14 @@ static void try_to_simplify_commit(struc
return;
case TREE_NEW:
- if (revs->remove_empty_trees && same_tree_as_empty(p->tree)) {
- *pp = parent->next;
- continue;
+ if (revs->remove_empty_trees &&
+ same_tree_as_empty(p->tree)) {
+ /* We are adding all the specified paths from
+ * this parent, so the parents of it is
+ * not interesting, but the difference between
+ * this parent and us still is interesting.
+ */
+ p->object.flags |= UNINTERESTING;
}
/* fallthrough */
case TREE_DIFFERENT:
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-12 21:31 ` Junio C Hamano
@ 2006-03-12 22:54 ` Linus Torvalds
2006-03-13 1:08 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2006-03-12 22:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Marco Costalba, git
On Sun, 12 Mar 2006, Junio C Hamano wrote:
>
> To be honest, I do not know how --remove-empty is intended to
> work.
It's supposed to stop traversing the tree once a pathname disappears.
> Maybe something like this is closer to what the documentation
> says.
If it is, then the documentation is broken.
The fact that a pathname disappears does _not_ make the commit
uninteresting. It just means that we should stop traversing that parent.
"uninteresting" has a big side effect: it inherits to parents. So if you
have
a
/ \
b c
\ /
d
where the pathname disappeared in "b", you must NOT mark it uninteresting,
because that would mean that "d" is also uninteresting.
There's a huge difference between saying "I will not traverse down this
line any more" and "I mark this commit uninteresting". The first one just
stops adding commits to the commit list (but parents deeper down might
still be interesting because they are also reached through another
pathway). The second says "this commit and all of its ancestors are
deemed worthless".
The "path goes away" case is meant to just stop traversal, not mark all
parents worthless.
Linus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-12 22:54 ` Linus Torvalds
@ 2006-03-13 1:08 ` Junio C Hamano
2006-03-13 5:41 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2006-03-13 1:08 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Marco Costalba, git
Linus Torvalds <torvalds@osdl.org> writes:
> On Sun, 12 Mar 2006, Junio C Hamano wrote:
>>
>> To be honest, I do not know how --remove-empty is intended to
>> work.
>
> It's supposed to stop traversing the tree once a pathname disappears.
Then what we should simplify is the parent commit that does not
have those pathnames (i.e. remove parents from that parent
commit). In other words, currently the code removes the parent
commit that makes the tree disappear, but we would want to keep
that parent, remove the grandparents, and then mark the parent
uninteresting.
-- >8 --
[PATCH] revision traversal: --remove-empty fix (take #2).
Marco Costalba reports that --remove-empty omits the commit that
created paths we are interested in. try_to_simplify_commit()
logic was dropping a parent we introduced those paths against,
which I think is not what we meant. Instead, this makes such
parent parentless.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/revision.c b/revision.c
index c8d93ff..73fba5d 100644
--- a/revision.c
+++ b/revision.c
@@ -315,9 +315,18 @@ static void try_to_simplify_commit(struc
return;
case TREE_NEW:
- if (revs->remove_empty_trees && same_tree_as_empty(p->tree)) {
- *pp = parent->next;
- continue;
+ if (revs->remove_empty_trees &&
+ same_tree_as_empty(p->tree)) {
+ /* We are adding all the specified
+ * paths from this parent, so the
+ * history beyond this parent is not
+ * interesting. Remove its parents
+ * (they are grandparents for us).
+ * IOW, we pretend this parent is a
+ * "root" commit.
+ */
+ parse_commit(p);
+ p->parents = NULL;
}
/* fallthrough */
case TREE_DIFFERENT:
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-13 1:08 ` Junio C Hamano
@ 2006-03-13 5:41 ` Junio C Hamano
2006-03-13 19:03 ` Linus Torvalds
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2006-03-13 5:41 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Marco Costalba, git
Junio C Hamano <junkio@cox.net> writes:
> Linus Torvalds <torvalds@osdl.org> writes:
>
>> It's supposed to stop traversing the tree once a pathname disappears.
>
> Then what we should simplify is the parent commit that does not
> have those pathnames (i.e. remove parents from that parent
> commit). In other words, currently the code removes the parent
> commit that makes the tree disappear, but we would want to keep
> that parent, remove the grandparents, and then mark the parent
> uninteresting.
Sorry, the last clause in the above comment is wrong, and does
not describe what the code attached does.
It removes the grandparents from the parent, and leaves the
parent still interesting. As a result, in your example:
... if you have
a
/ \
b c
\ /
d
where the pathname disappeared in "b"...
we would get this world view:
a
/ \
b c
/
d
because when inspecting a and finding that "b" does not have any
paths we are interested in, b loses all of its parents.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-13 5:41 ` Junio C Hamano
@ 2006-03-13 19:03 ` Linus Torvalds
2006-03-17 10:57 ` Marco Costalba
0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2006-03-13 19:03 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Marco Costalba, git
On Sun, 12 Mar 2006, Junio C Hamano wrote:
>
> It removes the grandparents from the parent, and leaves the
> parent still interesting. As a result, in your example:
>
> ... if you have
>
> a
> / \
> b c
> \ /
> d
>
> where the pathname disappeared in "b"...
>
> we would get this world view:
>
> a
> / \
> b c
> /
> d
Yeah, that's correct. That way you still see all the history that is
relevant to the tree that became empty.
However, to be honest, the only reason to ever use --remove-empty is for
rename detection, and Frederik's approach of doing that through the
library interface directly is actually a much superior option. So we might
as well drop the compilcation of --remove-empty entirely, unless somebody
has already started using it.
The _real_ optimization would be to make the pathname based pruning be
done incrementally instead of having to build up the whole tree. That
would be much more important than the --remove-empty stuff from a
usability standpoint. I'm absolutely sure it's possible, and I simplified
the code earlier so that it should be simpler to do, but every time I
actually look at the code I get confused again.
Linus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-13 19:03 ` Linus Torvalds
@ 2006-03-17 10:57 ` Marco Costalba
2006-03-18 6:40 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Marco Costalba @ 2006-03-17 10:57 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
On 3/13/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
> However, to be honest, the only reason to ever use --remove-empty is for
> rename detection, and Frederik's approach of doing that through the
> library interface directly is actually a much superior option. So we might
> as well drop the compilcation of --remove-empty entirely, unless somebody
> has already started using it.
>
In case of a rather recent file --remove-empty option gives a good
speed up in history loading with git-rev-list. So qgit uses that
option.
Annotation in qgit it is little different from blame algorithm because
it works from oldest revision to latest instead of the contrary. This
is due because it calculates in one pass _all_ the annotations of all
the different revisions of a given file, starting from file history,
i.e. 'git-rev-list file_path' output.
The GUI interface of qgit let's the user quickly jump between two file
revisions. So the corresponding annotations should be already prepared
to make it snappy.
> The _real_ optimization would be to make the pathname based pruning be
> done incrementally instead of having to build up the whole tree.
Anything that speeds-up file git-rev-list history loading surely gives
a big boost to all annotation/blame stuff. I have made some profiling
on qgit (the public git repo version that is much faster, not the
qgit1.1 released one) and I found that 50% to 70% of the time is spent
on git-rev-list, of the remaining the 80% is spent on git-diff-tree -p
patch loading and only a small amount is spent on actually calculate
the annotation.
Of course these numbers can vary a little according to file/repo, but
the big picture stays the same.
Marco
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-17 10:57 ` Marco Costalba
@ 2006-03-18 6:40 ` Junio C Hamano
2006-03-18 7:36 ` Marco Costalba
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2006-03-18 6:40 UTC (permalink / raw)
To: Marco Costalba; +Cc: git, Linus Torvalds
"Marco Costalba" <mcostalba@gmail.com> writes:
> On 3/13/06, Linus Torvalds <torvalds@osdl.org> wrote:
>>
>> However, to be honest, the only reason to ever use --remove-empty is for
>> rename detection, and Frederik's approach of doing that through the
>> library interface directly is actually a much superior option. So we might
>> as well drop the compilcation of --remove-empty entirely, unless somebody
>> has already started using it.
>
> In case of a rather recent file --remove-empty option gives a good
> speed up in history loading with git-rev-list. So qgit uses that
> option.
So you _do_ use it, and I think I still have that remove-empty
stuff held back in "next" branch. Should I unleash it?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug
2006-03-18 6:40 ` Junio C Hamano
@ 2006-03-18 7:36 ` Marco Costalba
0 siblings, 0 replies; 9+ messages in thread
From: Marco Costalba @ 2006-03-18 7:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Linus Torvalds
On 3/18/06, Junio C Hamano <junkio@cox.net> wrote:
> "Marco Costalba" <mcostalba@gmail.com> writes:
>
> >
> > In case of a rather recent file --remove-empty option gives a good
> > speed up in history loading with git-rev-list. So qgit uses that
> > option.
>
> So you _do_ use it, and I think I still have that remove-empty
> stuff held back in "next" branch. Should I unleash it?
>
>
Yes please.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2006-03-18 7:36 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-12 14:12 Possible --remove-empty bug Marco Costalba
2006-03-12 21:31 ` Junio C Hamano
2006-03-12 22:54 ` Linus Torvalds
2006-03-13 1:08 ` Junio C Hamano
2006-03-13 5:41 ` Junio C Hamano
2006-03-13 19:03 ` Linus Torvalds
2006-03-17 10:57 ` Marco Costalba
2006-03-18 6:40 ` Junio C Hamano
2006-03-18 7:36 ` Marco Costalba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).