* Possible --remove-empty bug @ 2006-03-12 14:12 Marco Costalba 2006-03-12 21:31 ` Junio C Hamano 0 siblings, 1 reply; 9+ messages in thread From: Marco Costalba @ 2006-03-12 14:12 UTC (permalink / raw) To: junkio; +Cc: git >From today git: $ git-rev-parse HEAD be767c91724275c4534965c0d25c452b76057602 $ git-rev-list be767c91724275c4534965c0d25c452b76057602 -- imap-send.c f2561fda364ad984ef1441a80c90b0ee04f1a7c4 $ git-rev-list --remove-empty be767c91724275c4534965c0d25c452b76057602 -- imap-send.c $ >From git-rev-list documentation: --remove-empty:: Stop when a given path disappears from the tree. But isn't it to be intended *after* a path disapperas from the tree? In this case I would expect to see revision f2561fda364ad984ef1441a80c90b0ee04f1a7c4 also with --remove-empty option. BTW rev f2561fda364ad984ef1441a80c90b0ee04f1a7c4 is the 'Add git-imap-send, derived from isync 1.0.1.' patch. Thanks Marco ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-12 14:12 Possible --remove-empty bug Marco Costalba @ 2006-03-12 21:31 ` Junio C Hamano 2006-03-12 22:54 ` Linus Torvalds 0 siblings, 1 reply; 9+ messages in thread From: Junio C Hamano @ 2006-03-12 21:31 UTC (permalink / raw) To: Marco Costalba; +Cc: git, Linus Torvalds "Marco Costalba" <mcostalba@gmail.com> writes: >>>From git-rev-list documentation: > > --remove-empty:: > Stop when a given path disappears from the tree. > > But isn't it to be intended *after* a path disapperas from the tree? To be honest, I do not know how --remove-empty is intended to work. What revision traversal code does and what the above says are different. The traversal code goes like this: * Start from given commits (both interesting and uninteresting), look at still-to-be-procesed commit one by one, by calling add_parents_to_list(). * add_parents_to_list() grows still-to-be-processed list; if the current commit is uninteresting, mark its parents also uninteresting, and if no interesting commit remains in the still-to-be-processed list, we are done. On the other hand, if the current commit is interesting, place it to the list of results. * After the above traversal is done, the consumer calls get_revision() to retrieve commits from the list of results one-by-one. We return only interesting ones. And in add_parents_to_list() * if the commit is interesting, and when we are limiting by paths, we call try_to_simplify_commit(). This checks if the tree associated with the current commit is the same as one of its parents' with respect to specified paths, and if so pretend that the current commit has only that parent and no other. This can make a merge commit to lose other parents that we do not inherit the specified paths from. * try_to_simplify_commit() looks at each parent, and: - if we find a parent that has the same tree (wrt the paths we are interested in), we pretend it is the sole parent of this commit. - if we find a parent that does not have any of the specified paths, we pretend we do not have that parent under --remove-empty. - otherwise we do not munge the list of parents. My understanding of what the code is doing from the above reading is to lose that empty parent, and it does not have much to do with stop traversing the ancestry chain at such commit. I am not sure that is what was intended... Maybe something like this is closer to what the documentation says. -- >8 -- diff --git a/revision.c b/revision.c index c8d93ff..03085ff 100644 --- a/revision.c +++ b/revision.c @@ -315,9 +315,14 @@ static void try_to_simplify_commit(struc return; case TREE_NEW: - if (revs->remove_empty_trees && same_tree_as_empty(p->tree)) { - *pp = parent->next; - continue; + if (revs->remove_empty_trees && + same_tree_as_empty(p->tree)) { + /* We are adding all the specified paths from + * this parent, so the parents of it is + * not interesting, but the difference between + * this parent and us still is interesting. + */ + p->object.flags |= UNINTERESTING; } /* fallthrough */ case TREE_DIFFERENT: ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-12 21:31 ` Junio C Hamano @ 2006-03-12 22:54 ` Linus Torvalds 2006-03-13 1:08 ` Junio C Hamano 0 siblings, 1 reply; 9+ messages in thread From: Linus Torvalds @ 2006-03-12 22:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Marco Costalba, git On Sun, 12 Mar 2006, Junio C Hamano wrote: > > To be honest, I do not know how --remove-empty is intended to > work. It's supposed to stop traversing the tree once a pathname disappears. > Maybe something like this is closer to what the documentation > says. If it is, then the documentation is broken. The fact that a pathname disappears does _not_ make the commit uninteresting. It just means that we should stop traversing that parent. "uninteresting" has a big side effect: it inherits to parents. So if you have a / \ b c \ / d where the pathname disappeared in "b", you must NOT mark it uninteresting, because that would mean that "d" is also uninteresting. There's a huge difference between saying "I will not traverse down this line any more" and "I mark this commit uninteresting". The first one just stops adding commits to the commit list (but parents deeper down might still be interesting because they are also reached through another pathway). The second says "this commit and all of its ancestors are deemed worthless". The "path goes away" case is meant to just stop traversal, not mark all parents worthless. Linus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-12 22:54 ` Linus Torvalds @ 2006-03-13 1:08 ` Junio C Hamano 2006-03-13 5:41 ` Junio C Hamano 0 siblings, 1 reply; 9+ messages in thread From: Junio C Hamano @ 2006-03-13 1:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Marco Costalba, git Linus Torvalds <torvalds@osdl.org> writes: > On Sun, 12 Mar 2006, Junio C Hamano wrote: >> >> To be honest, I do not know how --remove-empty is intended to >> work. > > It's supposed to stop traversing the tree once a pathname disappears. Then what we should simplify is the parent commit that does not have those pathnames (i.e. remove parents from that parent commit). In other words, currently the code removes the parent commit that makes the tree disappear, but we would want to keep that parent, remove the grandparents, and then mark the parent uninteresting. -- >8 -- [PATCH] revision traversal: --remove-empty fix (take #2). Marco Costalba reports that --remove-empty omits the commit that created paths we are interested in. try_to_simplify_commit() logic was dropping a parent we introduced those paths against, which I think is not what we meant. Instead, this makes such parent parentless. Signed-off-by: Junio C Hamano <junkio@cox.net> --- diff --git a/revision.c b/revision.c index c8d93ff..73fba5d 100644 --- a/revision.c +++ b/revision.c @@ -315,9 +315,18 @@ static void try_to_simplify_commit(struc return; case TREE_NEW: - if (revs->remove_empty_trees && same_tree_as_empty(p->tree)) { - *pp = parent->next; - continue; + if (revs->remove_empty_trees && + same_tree_as_empty(p->tree)) { + /* We are adding all the specified + * paths from this parent, so the + * history beyond this parent is not + * interesting. Remove its parents + * (they are grandparents for us). + * IOW, we pretend this parent is a + * "root" commit. + */ + parse_commit(p); + p->parents = NULL; } /* fallthrough */ case TREE_DIFFERENT: ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-13 1:08 ` Junio C Hamano @ 2006-03-13 5:41 ` Junio C Hamano 2006-03-13 19:03 ` Linus Torvalds 0 siblings, 1 reply; 9+ messages in thread From: Junio C Hamano @ 2006-03-13 5:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: Marco Costalba, git Junio C Hamano <junkio@cox.net> writes: > Linus Torvalds <torvalds@osdl.org> writes: > >> It's supposed to stop traversing the tree once a pathname disappears. > > Then what we should simplify is the parent commit that does not > have those pathnames (i.e. remove parents from that parent > commit). In other words, currently the code removes the parent > commit that makes the tree disappear, but we would want to keep > that parent, remove the grandparents, and then mark the parent > uninteresting. Sorry, the last clause in the above comment is wrong, and does not describe what the code attached does. It removes the grandparents from the parent, and leaves the parent still interesting. As a result, in your example: ... if you have a / \ b c \ / d where the pathname disappeared in "b"... we would get this world view: a / \ b c / d because when inspecting a and finding that "b" does not have any paths we are interested in, b loses all of its parents. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-13 5:41 ` Junio C Hamano @ 2006-03-13 19:03 ` Linus Torvalds 2006-03-17 10:57 ` Marco Costalba 0 siblings, 1 reply; 9+ messages in thread From: Linus Torvalds @ 2006-03-13 19:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Marco Costalba, git On Sun, 12 Mar 2006, Junio C Hamano wrote: > > It removes the grandparents from the parent, and leaves the > parent still interesting. As a result, in your example: > > ... if you have > > a > / \ > b c > \ / > d > > where the pathname disappeared in "b"... > > we would get this world view: > > a > / \ > b c > / > d Yeah, that's correct. That way you still see all the history that is relevant to the tree that became empty. However, to be honest, the only reason to ever use --remove-empty is for rename detection, and Frederik's approach of doing that through the library interface directly is actually a much superior option. So we might as well drop the compilcation of --remove-empty entirely, unless somebody has already started using it. The _real_ optimization would be to make the pathname based pruning be done incrementally instead of having to build up the whole tree. That would be much more important than the --remove-empty stuff from a usability standpoint. I'm absolutely sure it's possible, and I simplified the code earlier so that it should be simpler to do, but every time I actually look at the code I get confused again. Linus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-13 19:03 ` Linus Torvalds @ 2006-03-17 10:57 ` Marco Costalba 2006-03-18 6:40 ` Junio C Hamano 0 siblings, 1 reply; 9+ messages in thread From: Marco Costalba @ 2006-03-17 10:57 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, git On 3/13/06, Linus Torvalds <torvalds@osdl.org> wrote: > > However, to be honest, the only reason to ever use --remove-empty is for > rename detection, and Frederik's approach of doing that through the > library interface directly is actually a much superior option. So we might > as well drop the compilcation of --remove-empty entirely, unless somebody > has already started using it. > In case of a rather recent file --remove-empty option gives a good speed up in history loading with git-rev-list. So qgit uses that option. Annotation in qgit it is little different from blame algorithm because it works from oldest revision to latest instead of the contrary. This is due because it calculates in one pass _all_ the annotations of all the different revisions of a given file, starting from file history, i.e. 'git-rev-list file_path' output. The GUI interface of qgit let's the user quickly jump between two file revisions. So the corresponding annotations should be already prepared to make it snappy. > The _real_ optimization would be to make the pathname based pruning be > done incrementally instead of having to build up the whole tree. Anything that speeds-up file git-rev-list history loading surely gives a big boost to all annotation/blame stuff. I have made some profiling on qgit (the public git repo version that is much faster, not the qgit1.1 released one) and I found that 50% to 70% of the time is spent on git-rev-list, of the remaining the 80% is spent on git-diff-tree -p patch loading and only a small amount is spent on actually calculate the annotation. Of course these numbers can vary a little according to file/repo, but the big picture stays the same. Marco ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-17 10:57 ` Marco Costalba @ 2006-03-18 6:40 ` Junio C Hamano 2006-03-18 7:36 ` Marco Costalba 0 siblings, 1 reply; 9+ messages in thread From: Junio C Hamano @ 2006-03-18 6:40 UTC (permalink / raw) To: Marco Costalba; +Cc: git, Linus Torvalds "Marco Costalba" <mcostalba@gmail.com> writes: > On 3/13/06, Linus Torvalds <torvalds@osdl.org> wrote: >> >> However, to be honest, the only reason to ever use --remove-empty is for >> rename detection, and Frederik's approach of doing that through the >> library interface directly is actually a much superior option. So we might >> as well drop the compilcation of --remove-empty entirely, unless somebody >> has already started using it. > > In case of a rather recent file --remove-empty option gives a good > speed up in history loading with git-rev-list. So qgit uses that > option. So you _do_ use it, and I think I still have that remove-empty stuff held back in "next" branch. Should I unleash it? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible --remove-empty bug 2006-03-18 6:40 ` Junio C Hamano @ 2006-03-18 7:36 ` Marco Costalba 0 siblings, 0 replies; 9+ messages in thread From: Marco Costalba @ 2006-03-18 7:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Linus Torvalds On 3/18/06, Junio C Hamano <junkio@cox.net> wrote: > "Marco Costalba" <mcostalba@gmail.com> writes: > > > > > In case of a rather recent file --remove-empty option gives a good > > speed up in history loading with git-rev-list. So qgit uses that > > option. > > So you _do_ use it, and I think I still have that remove-empty > stuff held back in "next" branch. Should I unleash it? > > Yes please. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2006-03-18 7:36 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-03-12 14:12 Possible --remove-empty bug Marco Costalba 2006-03-12 21:31 ` Junio C Hamano 2006-03-12 22:54 ` Linus Torvalds 2006-03-13 1:08 ` Junio C Hamano 2006-03-13 5:41 ` Junio C Hamano 2006-03-13 19:03 ` Linus Torvalds 2006-03-17 10:57 ` Marco Costalba 2006-03-18 6:40 ` Junio C Hamano 2006-03-18 7:36 ` Marco Costalba
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).