* path limiting broken @ 2006-04-16 12:26 Johannes Schindelin 2006-04-16 16:06 ` Linus Torvalds 0 siblings, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 12:26 UTC (permalink / raw) To: git Hi, I just tried to find out when certain changes percolated into log-tree.c. So I issued "git log --cc next log-tree.c". Wonders of wonders, the patches did not contain *anything* about what I tried to find, even if the file contains the key words. Because the path limiting is overeager (I finally found what I looked for in commit f4235f8b): git-name-rev $(git-rev-list --pretty=oneline \ f4235f8b2ef875b85ead74ffa199d827f9ee9d8d..next log-tree.c | \ sed "s/ .*$//") yields cb8f64b4e3f263c113b7a2f156af74b810e969ff next^2 cd2bdc5309461034e5cc58e1d3e87535ed9e093b next~10^2~2 while without path limiting, git-name-rev $(git-whatchanged.sh --pretty=oneline \ f4235f8b2ef875b85ead74ffa199d827f9ee9d8d^..next log-tree.c | \ sed -n "s/^diff-tree \([^ ]*\).*$/\1/p") I get cb8f64b4e3f263c113b7a2f156af74b810e969ff next^2 c5ccd8be43df4b916752a176512a9adaf3b94df9 next~4^2 f4235f8b2ef875b85ead74ffa199d827f9ee9d8d next~6^2 183df63940bf92ea626af64d0057165b8aad24f6 next~8^2 cd2bdc5309461034e5cc58e1d3e87535ed9e093b next~10^2~2 I am not intelligent enough to find out why there are three revisions which get culled. Ideas? Ciao, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 12:26 path limiting broken Johannes Schindelin @ 2006-04-16 16:06 ` Linus Torvalds 2006-04-16 16:45 ` Johannes Schindelin 2006-04-16 17:05 ` Johannes Schindelin 0 siblings, 2 replies; 13+ messages in thread From: Linus Torvalds @ 2006-04-16 16:06 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > I am not intelligent enough to find out why there are three revisions > which get culled. > > Ideas? Path limiting by default looks at all parents of a merge, and if any of them is identical to the child (within the path limiter), it will pick _that_ parent, and that parent only, and drop all other parents. (If there are multiple identical parents, it will pick the first one). For example, if you do HEAD | a / \ b c | /| | / d |/ | e f | / g / |/ h and the file was modified in commit "b" and nowhere else, then think about what the final history should look like.. Since it was modified in "b", it's going to be the same in "a" and "b", so the path limiting will totally drop the whole "c" subtree, and will only ever look at the tree a->b->e->g->h. After that, it will simplify the history by dropping commits that don't change the path, and the whole history will end up being just "b". Which is exactly what you want. Now, realize that if the exact _same_ change is done in "f", the same thing will happen. "f" will never be shown, because "f" simply isn't interesting. We picked up the same change in "b", and while the merge "a" had _both_ parents identical in the file, it would pick only the first one. So we'll never even _look_ at "f". Now, this is _important_. Pruning merges is really fundamental to giving a reasonable file history, and not just a performance optimization. If the same change came from two branches, we really don't care who did it: we'll hit _one_ of the changes even after pruning. But yes, it could miss the other one. Here's a patch to show what an unpruned history ends up looking like. With this patch applied, try the following: gitk -- ls-tree.c & gitk --no-prune-merges -- ls-tree.c & and compare the two. You'll see _immediately_ why I think merge pruning is not just a good idea, it's something we _have_ to do. Linus ---- diff --git a/revision.c b/revision.c index 0505f3f..25338b6 100644 --- a/revision.c +++ b/revision.c @@ -291,7 +291,7 @@ static void try_to_simplify_commit(struc parse_commit(p); switch (rev_compare_tree(revs, p->tree, commit->tree)) { case REV_TREE_SAME: - if (p->object.flags & UNINTERESTING) { + if (revs->no_merge_pruning || (p->object.flags & UNINTERESTING)) { /* Even if a merge with an uninteresting * side branch brought the entire change * we are interested in, we do not want @@ -640,6 +640,10 @@ int setup_revisions(int argc, const char revs->unpacked = 1; continue; } + if (!strcmp(arg, "--no-prune-merges")) { + revs->no_merge_pruning = 1; + continue; + } *unrecognized++ = arg; left++; continue; diff --git a/revision.h b/revision.h index 8970b57..a116305 100644 --- a/revision.h +++ b/revision.h @@ -26,6 +26,7 @@ struct rev_info { /* Traversal flags */ unsigned int dense:1, no_merges:1, + no_merge_pruning:1, remove_empty_trees:1, lifo:1, topo_order:1, ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 16:06 ` Linus Torvalds @ 2006-04-16 16:45 ` Johannes Schindelin 2006-04-16 17:07 ` Linus Torvalds 2006-04-16 17:05 ` Johannes Schindelin 1 sibling, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 16:45 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Hi, On Sun, 16 Apr 2006, Linus Torvalds wrote: > On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > > > I am not intelligent enough to find out why there are three revisions > > which get culled. > > Path limiting by default looks at all parents of a merge, and if any of > them is identical to the child (within the path limiter), it will pick > _that_ parent, and that parent only, and drop all other parents. That's okay. But you missed my point. The file log-tree.c had some opt handling. I changed it (as can be seen in my patch for combined diffstat), so the merge did not go well with next, which removed that code. So I tried to be clever and find the (non-merge) commit where this happened. With what you describe, I should have hit *exactly one* commit removing the code. But I hit *exactly none* with "git log --cc master log-tree.c". Ciao, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 16:45 ` Johannes Schindelin @ 2006-04-16 17:07 ` Linus Torvalds 2006-04-16 17:27 ` path limiting broken (NOT) Johannes Schindelin 2006-04-16 17:39 ` path limiting broken Johannes Schindelin 0 siblings, 2 replies; 13+ messages in thread From: Linus Torvalds @ 2006-04-16 17:07 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > But you missed my point. The file log-tree.c had some opt handling. I > changed it (as can be seen in my patch for combined diffstat), so the > merge did not go well with next, which removed that code. > > So I tried to be clever and find the (non-merge) commit where this > happened. With what you describe, I should have hit *exactly one* commit > removing the code. > > But I hit *exactly none* with "git log --cc master log-tree.c". Do you have the tree somewhere? In the current git tree, there really _is_ just one commit that touches log-tree.c in "master". And "git log log-tree.c" picks that out without trouble. Did you mean to do git log --cc next log-tree.c instead, perhaps? (Note the "next" branch specifier) That definitely shows three commits, including very much the one that moves the common option parsing to revision.c (do "git log --cc --full-diff next log-tree.c" if you want to see the whole diff whenever something touches log-tree.c). Maybe I'm still missing something, but it seems to do the right thing for me.. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken (NOT) 2006-04-16 17:07 ` Linus Torvalds @ 2006-04-16 17:27 ` Johannes Schindelin 2006-04-16 17:39 ` path limiting broken Johannes Schindelin 1 sibling, 0 replies; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 17:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Hi, On Sun, 16 Apr 2006, Linus Torvalds wrote: > On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > > > But you missed my point. The file log-tree.c had some opt handling. I > > changed it (as can be seen in my patch for combined diffstat), so the > > merge did not go well with next, which removed that code. > > > > So I tried to be clever and find the (non-merge) commit where this > > happened. With what you describe, I should have hit *exactly one* commit > > removing the code. > > > > But I hit *exactly none* with "git log --cc master log-tree.c". > > Do you have the tree somewhere? I have no quick way to publish it, if you mean that. > In the current git tree, there really _is_ just one commit that touches > log-tree.c in "master". And "git log log-tree.c" picks that out without > trouble. It is my master. Not Junio's. The problem seems to be that the current next does not have the code. But when I last merged, it had it. So basically, I have a problem here that I thought "git log --cc" can solve, but it doesn't. "git log -cc" is good to find out when code *that is present now* has crept in, but it is no good to find out when code that was in commit such-and-such, and is no longer in commit such-and-that, has been culled. Pity. Thanks for your help, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 17:07 ` Linus Torvalds 2006-04-16 17:27 ` path limiting broken (NOT) Johannes Schindelin @ 2006-04-16 17:39 ` Johannes Schindelin 2006-04-16 17:58 ` Linus Torvalds 1 sibling, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 17:39 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Hi, me again. I finally found the commit which removed parse_whatchanged_opt() from log-tree.c by git-rev-list 4a617..next | while read commit; do \ echo $commit; \ git diff $commit^..$commit log-tree.c | \ grep parse_whatchanged; \ done | less where obviously 4a617 is the commit I last merged with. The winner is: 43f934aa90... Merge branch 'lt/logopt' into next However, the combined diff of that commit does not show it, while the diff to the first parent does: git-show --cc 43f934aa90 | grep parse_whatchanged shows nothing (-c -p shows half of the truth), while git-diff 43f934aa90^..43f934aa90 | grep parse_whatchanged shows something, and git-diff 43f934aa90^2..43f934aa90 | grep parse_whatchanged again does not. Time to understand the combined diff logic, I guess... Ciao, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 17:39 ` path limiting broken Johannes Schindelin @ 2006-04-16 17:58 ` Linus Torvalds 2006-04-16 18:09 ` Johannes Schindelin 0 siblings, 1 reply; 13+ messages in thread From: Linus Torvalds @ 2006-04-16 17:58 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > I finally found the commit which removed parse_whatchanged_opt() from > log-tree.c by > > git-rev-list 4a617..next | while read commit; do \ > echo $commit; \ > git diff $commit^..$commit log-tree.c | \ > grep parse_whatchanged; \ > done | less Heh. You really should learn about "-m -p", which does the above, but better (it compares against _all_ parents - you would have missed the thing _again_ if the "lt/logopt" branch had been the main branch ;) > However, the combined diff of that commit does not show it, while the diff > to the first parent does: > > git-show --cc 43f934aa90 | grep parse_whatchanged Combined merges really only show conflicts where the different parents do something different from the end result. Since the whole file was taken from the lt/logopt branch, even git show -c 43f934aa90 won't show that "log-tree.c" file AT ALL, because there was no content conflict: the whole file was taken from one branch, unmodified. If you want to see all changes against all parents, you really do need "-m -p" (or just "-m", which will show the raw diffs, and which will show how the file changes from one parent, but not the other). Note that NORMALLY, you'd really never want to use "-m -p". It's a very very inconvenient format, since normally you want to see only the stuff that changed wrt the end result. So "--cc" really does ignore everything that is irrelevant for the end result, and in this case you are very much trying to find somethign that is totally irrelevant for the end result, since the function you look for had never even _existed_ in the file as far as the end result goes.. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 17:58 ` Linus Torvalds @ 2006-04-16 18:09 ` Johannes Schindelin 2006-04-16 18:27 ` Linus Torvalds 0 siblings, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 18:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Hi, On Sun, 16 Apr 2006, Linus Torvalds wrote: > On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > > > I finally found the commit which removed parse_whatchanged_opt() from > > log-tree.c by > > > > git-rev-list 4a617..next | while read commit; do \ > > echo $commit; \ > > git diff $commit^..$commit log-tree.c | \ > > grep parse_whatchanged; \ > > done | less > > Heh. You really should learn about "-m -p", which does the above, but > better (it compares against _all_ parents - you would have missed the > thing _again_ if the "lt/logopt" branch had been the main branch ;) Somehow I forgot. Probably because I don't like the format either. > [...] even > > git show -c 43f934aa90 > > won't show that "log-tree.c" file AT ALL, because there was no content > conflict: the whole file was taken from one branch, unmodified. There's another reason it does not show it. ATM, you have to add "-p" to the command line, or "-c" will not show *any* patch, let alone a combined one. > So "--cc" really does ignore everything that is irrelevant for the end > result, and in this case you are very much trying to find somethign that > is totally irrelevant for the end result, since the function you look for > had never even _existed_ in the file as far as the end result goes.. Thanks for all your help, but in this case it was not irrelevant. Because I *had* the function in my working copy. And I had changed it. So I had to find out where to move the change. Again, thanks a bunch, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 18:09 ` Johannes Schindelin @ 2006-04-16 18:27 ` Linus Torvalds 2006-04-16 23:49 ` Johannes Schindelin 0 siblings, 1 reply; 13+ messages in thread From: Linus Torvalds @ 2006-04-16 18:27 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > There's another reason it does not show it. ATM, you have to add "-p" to > the command line, or "-c" will not show *any* patch, let alone a combined > one. Oh, it will show the "raw" patch, which is often very useful. I've grown quite fond of it, because it shows what happened on a bigger level, without showing the details within a file ("--name-status" makes it more readable, but I'm too lazy to type it, so I often just look at the raw git patch). > Thanks for all your help, but in this case it was not irrelevant. Because > I *had* the function in my working copy. And I had changed it. So I had to > find out where to move the change. Right, but it was irrelevant as far as "top-of-head" was concerned (which is all that "git log" shows you - it doesn't care about your working tree). The fact that it _had_ been relevant in the state you used to be at is not something "git log" and friends know or care about. Now, I'm not disputing that we might want to make it easier to see what _had_ been relevant at some earlier time. But you'd have to specify that earlier time somehow. I assume you had tried to do a "git rebase", and the problem with that is that git rebase really doesn't help you at all when things go wrong, exactly because "rebase" - by design - screws up the history and loses that information for you. If your problem state had been as a result of a "git merge", you'd actually have had much better tools available to you, exactly because merge doesn't screw up history, so you've got both sides of the merge you can look at (HEAD and MERGE_HEAD, and "git diff --ours" and "--theirs"). That said, even "rebase" will help somewhat. You've got "ORIG_HEAD" to use, and that should help at least a _bit_. In particular, you can do gitk ORIG_HEAD.. to see all the changes you rebased across. But right now you'd have to fall back on the "-m -p" thing if you wanted to see it all.. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 18:27 ` Linus Torvalds @ 2006-04-16 23:49 ` Johannes Schindelin 2006-04-17 2:48 ` Linus Torvalds 0 siblings, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 23:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Hi, On Sun, 16 Apr 2006, Linus Torvalds wrote: > On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > > > Thanks for all your help, but in this case it was not irrelevant. Because > > I *had* the function in my working copy. And I had changed it. So I had to > > find out where to move the change. > > Right, but it was irrelevant as far as "top-of-head" was concerned (which > is all that "git log" shows you - it doesn't care about your working > tree). > > The fact that it _had_ been relevant in the state you used to be at is not > something "git log" and friends know or care about. > > Now, I'm not disputing that we might want to make it easier to see what > _had_ been relevant at some earlier time. But you'd have to specify that > earlier time somehow. Since quite some time, I wanted to have a way to git-rev-list just the revs between commit1 and commit2, i.e. all commits which are ancestors of commit2, and which have commit1 as ancestor. With this, my task would have been more than simple. > I assume you had tried to do a "git rebase", and the problem with that > is that git rebase really doesn't help you at all when things go wrong, > exactly because "rebase" - by design - screws up the history and loses > that information for you. Nope. Was a merge. > If your problem state had been as a result of a "git merge", you'd > actually have had much better tools available to you, exactly because > merge doesn't screw up history, so you've got both sides of the merge you > can look at (HEAD and MERGE_HEAD, and "git diff --ours" and "--theirs"). That outputs too much. I really wanted to find just the commit which removed the function, in order to know where the code was moved to. ORIG_HEAD would not help, because that commit is not an ancestor. > [...] In particular, you can do > > gitk ORIG_HEAD.. ... and it will say Error: expected integer but got "" while executing "clock format $d -format "%Y-%m-%d %H:%M:%S "" (procedure "formatdate" line 2) invoked from within "formatdate $date" (procedure "drawcmittext" line 28) invoked from within "drawcmittext $id $row $col $rmx" (procedure "drawcmitrow" line 41) invoked from within "drawcmitrow $row" (procedure "showstuff" line 35) invoked from within "showstuff $canshow" (procedure "layoutmore" line 15) invoked from within "layoutmore" (procedure "getcommitlines" line 91) invoked from within "getcommitlines file7" Sorry, I am too tired now to investigate, have been working already to long. Tomorrow's another day. Ciao, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 23:49 ` Johannes Schindelin @ 2006-04-17 2:48 ` Linus Torvalds 0 siblings, 0 replies; 13+ messages in thread From: Linus Torvalds @ 2006-04-17 2:48 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Mon, 17 Apr 2006, Johannes Schindelin wrote: > > Since quite some time, I wanted to have a way to git-rev-list just the > revs between commit1 and commit2, i.e. all commits which are ancestors of > commit2, and which have commit1 as ancestor. With this, my task would have > been more than simple. Yes. However, it's not trivial. In fact, what you want is not what you claim you want. To be useful in general, you have to _also_ handle the case of "commit2" not beign a strict ancestor of "commit1". So what you actually want to do is - calculate the merge-head of cmit1 and cmit2 (and if there are multiple, pick some "best" one). - pick the shortest path from the merge-head to the cmit1 (and, with a flag, also pick the path from merge-head to cmit2 - sometimes you want to see the whole path from one to the other, sometimes you might want to see just the path from the last common point). I suspect it ends up being not _that_ different from calculating the bisection point, but I haven't thought it through entirely. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 16:06 ` Linus Torvalds 2006-04-16 16:45 ` Johannes Schindelin @ 2006-04-16 17:05 ` Johannes Schindelin 2006-04-16 17:51 ` Linus Torvalds 1 sibling, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2006-04-16 17:05 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Hi, just to make it clearer what I want: git-whatchanged -p next | grep parse_whatchanged as well as git log -p next | grep parse_whatchanged do not find that any line like int parse_whatchanged_opt(int ac, [...] was removed, but they find that this line was added. However, in the working tree (which has a fresh checkout of next), there is no such line in log-tree.c. So I really would like to know where it vanished! Ciao, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: path limiting broken 2006-04-16 17:05 ` Johannes Schindelin @ 2006-04-16 17:51 ` Linus Torvalds 0 siblings, 0 replies; 13+ messages in thread From: Linus Torvalds @ 2006-04-16 17:51 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Sun, 16 Apr 2006, Johannes Schindelin wrote: > > just to make it clearer what I want: > > git-whatchanged -p next | grep parse_whatchanged > > as well as > > git log -p next | grep parse_whatchanged > > do not find that any line like > > int parse_whatchanged_opt(int ac, [...] > > was removed, but they find that this line was added. However, in the > working tree (which has a fresh checkout of next), there is no such line > in log-tree.c. So I really would like to know where it vanished! It was removed by merge 43f934aa: "Merge branch 'lt/logopt' into next". The "parse_whatchanged_opt()" function never existed in that lt/logopt branch, and merging it (manually, I assume) it doesn't exist in the result either. General hint: if you don't find it in "--cc", then that means that it's almost certainly a merge. "--cc" will only find things that _clash_, ie the end result is different from either of the branches (and in this case, it wasn't different, since parse_whatchanged_opt() simply didn't exist in the branch that was merged). Now, finding things in merges can be a bit painful, but the sure-fire safe way is the old one: use "-m -p" to show _all_ sides of a merge as a diff. That's a really inconvenient format for reading, but it literally shows all changes to all parents. Again, "git log log-tree.c" actually does do the right thing: the current state of "log-tree.c" really has _all_ of its history coming from the lt/logopt branch, which is why when you do git log -m -p log-tree.c | grep int parse_whatchanged_opt you won't get any result at all: the _current_ state of log-tree.c really has no history what-so-ever that involved parse_whatchanged_opt. That may sound strange, but it really is very true. Doing a "gitk log-tree.c" shows what the real history of the contents of that file is. And this actually comes back to a very fundamental git issue: git tracks _contents_. It doesn't care one whit if there was another branch that had some other history for that file: if that other branch didn't affect the contents of the file, then that other branch simply doesn't exist as far as that particular file history is concerned. It only exists as a "bigger" issue. But that "-m -p" thing can be useful when you do want to see the bigger issue. As might "--no-prune-merges". Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2006-04-17 2:48 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-04-16 12:26 path limiting broken Johannes Schindelin 2006-04-16 16:06 ` Linus Torvalds 2006-04-16 16:45 ` Johannes Schindelin 2006-04-16 17:07 ` Linus Torvalds 2006-04-16 17:27 ` path limiting broken (NOT) Johannes Schindelin 2006-04-16 17:39 ` path limiting broken Johannes Schindelin 2006-04-16 17:58 ` Linus Torvalds 2006-04-16 18:09 ` Johannes Schindelin 2006-04-16 18:27 ` Linus Torvalds 2006-04-16 23:49 ` Johannes Schindelin 2006-04-17 2:48 ` Linus Torvalds 2006-04-16 17:05 ` Johannes Schindelin 2006-04-16 17:51 ` Linus Torvalds
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).