git.vger.kernel.org archive mirror
* gitweb.cgi history not shown
@ 2006-06-11  5:31 Marco Costalba
  2006-06-11  6:02 ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Marco Costalba @ 2006-06-11  5:31 UTC (permalink / raw)
  To: junkio; +Cc: git

What am I doing wrong?

$ git-rev-list --all -- gitweb/gitweb.cgi
0a8f4f0020cb35095005852c0797f0b90e9ebb74
$ git-rev-list --all -- gitweb.cgi
$

Also, the gitweb installed at kernel.org shows an empty history for the
file gitweb.cgi in the git repository, while the history is shown
correctly for the same file in the gitweb project.

    Marco


* Re: gitweb.cgi history not shown
  2006-06-11  5:31 gitweb.cgi history not shown Marco Costalba
@ 2006-06-11  6:02 ` Linus Torvalds
  2006-06-11  6:32   ` Marco Costalba
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2006-06-11  6:02 UTC (permalink / raw)
  To: Marco Costalba; +Cc: junkio, git



On Sun, 11 Jun 2006, Marco Costalba wrote:
>
> What am I doing wrong?
> 
> $ git-rev-list --all -- gitweb/gitweb.cgi
> 0a8f4f0020cb35095005852c0797f0b90e9ebb74
> $ git-rev-list --all -- gitweb.cgi
> $

[ no output ]

This is getting to be a FAQ, and I think we should add the 
"--no-prune-history" flag (or whatever I called it - I even sent out a 
patch for it) so that you can avoid it.

The thing that happens in

	git-rev-list --all -- gitweb.cgi

is that since your _current_ HEAD does not have that file at all, it 
starts going back in history, and at each merge it finds it will 
_simplify_ the history, and only look at that part of history that is 
identical _with_respect_to_the_name_you_gave_!

Now, in the main git history, that name has NEVER existed, so the 
simplified history for that particular name (as seen from the current 
branch) is simply empty. It's empty all the way back to the root. No 
commits at all add that name along the main history branch.

Now, that name obviously existed in the _side_ histories, but we don't 
show those, because they obviously didn't matter (as far as that 
particular name is concerned) within the particular history starting point 
you chose. See?

Now, look what happens if, instead of starting the history search from 
all the _current_ heads, you start it from a location that actually _had_ 
that file:

	git log 1130ef362fc8d9c3422c23f5d5 -- gitweb.cgi

and suddenly there the history is - in all its glory.

So what this boils down to is really: when you limit revision history by a 
set of filenames, GIT REALLY REWRITES AND SIMPLIFIES THE HISTORY AS PER 
_THAT_ PARTICULAR SET OF FILENAMES. In particular, it will generate the 
_simplest_ history that is consistent with the state of those filenames at 
the point you asked it to start.
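
The effect can be reproduced with a toy model. This is only a sketch of the idea, not git's actual code: the commit names (`main1`, `web2`, `merge`), the `COMMITS` table and the `simplified_history` function are all made up for illustration, and trees are reduced to a plain path-to-content mapping.

```python
# Toy model of path-limited history simplification (NOT git's real code).
# Each commit maps to (parents, files); files maps path -> content.
COMMITS = {
    "main1": ([], {"git.c": "a"}),
    "main2": (["main1"], {"git.c": "b"}),
    "web1": ([], {"gitweb.cgi": "x"}),
    "web2": (["web1"], {"gitweb.cgi": "y"}),
    # The merge itself moves the file to gitweb/gitweb.cgi.
    "merge": (["main2", "web2"], {"git.c": "b", "gitweb/gitweb.cgi": "y"}),
}


def simplified_history(start, path):
    """Return the commits listed for `path`, walking back from `start`."""
    shown, seen, todo = [], set(), [start]
    while todo:
        name = todo.pop(0)
        if name in seen:
            continue
        seen.add(name)
        parents, files = COMMITS[name]
        state = files.get(path)  # None means "path does not exist here"
        if not parents:
            # Root commit: listed only if it introduces the path.
            if state is not None:
                shown.append(name)
            continue
        same = [p for p in parents if COMMITS[p][1].get(path) == state]
        if same:
            # A parent already explains the state: follow only that
            # parent and do not list this commit.
            todo.append(same[0])
        else:
            shown.append(name)
            todo.extend(parents)
    return shown


print(simplified_history("merge", "gitweb.cgi"))         # []
print(simplified_history("web2", "gitweb.cgi"))          # ['web2', 'web1']
print(simplified_history("merge", "gitweb/gitweb.cgi"))  # ['merge']
```

Starting from `merge`, the old name is absent both before and after, so the walk stays on the mainline and finds nothing; starting from `web2`, the same name has a history.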

If you want to get the non-simplified history (ie you object to the fact 
that we give the simplest history, you want _all_ the possible history for 
that particular filename, whether it was the same along one branch or 
not), you need to apply something like the appended..

(And you obviously need to add that "no_simplify_merge" flag to the 
revision data structure, and you need to add some command line flag to 
enable it. Alternatively, try to find the patch I sent out a couple of 
months ago, I'm pretty sure I called it "--no-simplify-merge" or 
"--no-prune-history" or something like that).

		Linus
---
diff --git a/revision.c b/revision.c
index 6a6952c..5640cef 100644
--- a/revision.c
+++ b/revision.c
@@ -303,7 +303,7 @@ static void try_to_simplify_commit(struc
 		parse_commit(p);
 		switch (rev_compare_tree(revs, p->tree, commit->tree)) {
 		case REV_TREE_SAME:
-			if (p->object.flags & UNINTERESTING) {
+			if (revs->no_simplify_merge || (p->object.flags & UNINTERESTING)) {
 				/* Even if a merge with an uninteresting
 				 * side branch brought the entire change
 				 * we are interested in, we do not want


* Re: gitweb.cgi history not shown
  2006-06-11  6:02 ` Linus Torvalds
@ 2006-06-11  6:32   ` Marco Costalba
  2006-06-11 16:19     ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Marco Costalba @ 2006-06-11  6:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: junkio, git

>
> Now, look what happens if, instead of starting the history search from
> all the _current_ heads, you start it from a location that actually _had_
> that file:
>
>         git log 1130ef362fc8d9c3422c23f5d5 -- gitweb.cgi
>
> and suddenly there the history is - in all its glory.
>

Why do I still get empty results if I run git-rev-list from the gitweb
merge point?

$ git-rev-list 0a8f4f0020cb35095005852c0797f0b90e9ebb74 -- gitweb.cgi
$
$ git-rev-list 0a8f4f0020cb35095005852c0797f0b90e9ebb74 -- gitweb/gitweb.cgi
0a8f4f0020cb35095005852c0797f0b90e9ebb74

Is this because the path changed: gitweb.cgi -> gitweb/gitweb.cgi?

I would like to think the problem is the path change, because in the
case of gitk, which was merged from a parallel branch with _no_ path
change, everything worked as expected.

So the question is whether the path change was "fixed up" by hand or
done as part of the gitweb branch merge process. In the latter case,
git-rev-list should probably already take this into account without
special flags _and_ without removing the history traversal
optimizations that are good and useful in the remaining 99% of cases
(for a GUI tool it is difficult to know, on a per-request basis, when
to use a flag like --no-simplify-merge).

        Marco


* Re: gitweb.cgi history not shown
  2006-06-11  6:32   ` Marco Costalba
@ 2006-06-11 16:19     ` Linus Torvalds
  2006-06-11 16:40       ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2006-06-11 16:19 UTC (permalink / raw)
  To: Marco Costalba; +Cc: junkio, git



On Sun, 11 Jun 2006, Marco Costalba wrote:
> 
> Why do I still get empty results if I run git-rev-list from the gitweb
> merge point?

Because the rename happened _inside_ the merge. 

So when you give the revision 0a8f4f00, that really means the state 
_after_ the merge. At that point, the filename doesn't actually exist.

git-rev-list then looks at the parents, one by one, and sees that the 
first parent _matches_ the state as far as your path spec is concerned (in 
this case, it matches "it was empty before, it was empty after"), so it 
will literally _always_ pick the parent that you're not interested in 
(regardless of whether it would have been merged into, or was the one that 
got merged), because that's the one with the minimal history difference.

Going the other way (the way you actually wish it went) would have 
introduced more history changes that aren't needed to explain the final 
state, so git-rev-list - by virtue of trying to generate a _minimal_ 
history - will actively avoid it.
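
The parent choice at a single merge can be sketched on its own. This is a hypothetical helper, not git's real code: `parents_to_follow` and the `ABSENT` marker are invented names, and "state" here just means the content of the path you asked about (or nothing, if the path does not exist).

```python
ABSENT = None  # the path does not exist in that commit


def parents_to_follow(commit_state, parent_states):
    """Return the indexes of the parents a simplifying walk keeps."""
    for index, state in enumerate(parent_states):
        if state == commit_state:
            # This parent already explains the state of the path,
            # so every other parent is dropped.
            return [index]
    # No parent matches: the commit changed the path, keep all parents.
    return list(range(len(parent_states)))


# The merge seen through "gitweb.cgi": absent after the merge,
# absent on the git side, present on the gitweb side.
print(parents_to_follow(ABSENT, [ABSENT, "cgi contents"]))  # [0]
# Swapping the parent order does not help: the matching side still wins.
print(parents_to_follow(ABSENT, ["cgi contents", ABSENT]))  # [1]
```

Either way, the side that actually carried the file is the one that gets dropped.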

> Is this because the path changed: gitweb.cgi -> gitweb/gitweb.cgi?

Well, in one sense yes, but in a much more fundamental sense that rename 
really has nothing to do with the real issue.

The real issue is that you asked how the state of a non-existent file came 
to be, and git-rev-list told you the simplest answer: it never existed at 
all.

And that answer is actually _true_. Along one history, that filename never 
existed.

And this really has nothing to do with renames. You can see the exact same 
thing with files that are there. Let's take an example:

	   A	<-- top of tree
	  / \
	 B   C
	 |   |
	 D   E
	  \ /
	   F
	   |
	   .	<-- old history
	   .

Let's say that you have had a file called "file" for all of history, and 
it got changed slightly differently in _all_ commits B, C, D _and_ E.

Now, despite all the different changes, let's say that the end result was 
identical in B and C - even though the diffs of those two commits were not 
necessarily the same (because they started out from different points: D 
and E respectively). 

In other words, there was a branch, but both branches ended up fixing the 
same bug the same way (and this is less unusual than you'd think, even if 
they are independent, but even more so if the branches "knew of each 
other" some other way, ie the developers talked about the problem and 
perhaps sent patches back-and-forth that both people applied).

What do you think git-rev-list will do when you give it that "file" as a 
path limiter?

What it will do is to notice that merge A has the same state (wrt that 
file) as commit B (its first parent), SO IT WILL TOTALLY IGNORE THE WHOLE 
HISTORY THAT IS REACHABLE FROM C.

So git-rev-list will first simplify the tree to just A -> B -> D -> F .., 
and then, since A and B were identical in the path (and let's say F was 
identical to its parents too), it will actually decide that as far as 
those commits were concerned, nothing changed, so the actual end result is 
just "B -> D -> ...", and none of A, C, E and F show up at all, even 
though both C and E really did change something (they just didn't 
_matter_, because all the changes could be explained by just picking B and 
D).
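
The diagram can be walked through with a toy model. Again this is only a sketch, not git's implementation: the `HISTORY` table, the content labels and the `walk` function are made up, and a single extra `root` commit stands in for the "old history" below F.

```python
# Toy model of the A..F diagram (NOT git's real code). Each commit maps
# to (parents, content of "file" at that commit).
HISTORY = {
    "A": (["B", "C"], "fixed"),   # merge, same result as both parents
    "B": (["D"], "fixed"),
    "C": (["E"], "fixed"),        # same end result as B
    "D": (["F"], "d-version"),
    "E": (["F"], "e-version"),
    "F": (["root"], "original"),  # identical to its parent
    "root": ([], "original"),     # stands in for the old history
}


def walk(start):
    """List the commits a simplifying, path-limited walk would show."""
    shown, seen, todo = [], set(), [start]
    while todo:
        name = todo.pop(0)
        if name in seen:
            continue
        seen.add(name)
        parents, content = HISTORY[name]
        if not parents:
            shown.append(name)  # the root introduces the file
            continue
        same = [p for p in parents if HISTORY[p][1] == content]
        if same:
            todo.append(same[0])  # follow only the first matching parent
        else:
            shown.append(name)
            todo.extend(parents)
    return shown


print(walk("A"))  # ['B', 'D', 'root']
```

C and E never get visited at all, even though both of them changed the file.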

See? No renames. Renames are not what is fundamental here. What is 
fundamental is the _STATE_ of the tree. Remember: that's what git tracks, 
and that is what "git log" shows you.

So when you do

	git log -- gitweb.cgi

you're really asking for: "Please explain the state of the current tree 
with regards to gitweb.cgi that doesn't exist at this point in time". And 
that's _exactly_ what "git log" will do. It will say "hey, I can explain 
it by the file not existing in one of the previous parents either: maybe 
it got removed there", and walk back as far as it possibly can to explain 
that the file doesn't exist.

And it turns out that it can walk all the way back to the root, and the 
file didn't exist there, so the end result is what? The empty set. 

			Linus


* Re: gitweb.cgi history not shown
  2006-06-11 16:19     ` Linus Torvalds
@ 2006-06-11 16:40       ` Linus Torvalds
  2006-06-11 16:54         ` Linus Torvalds
  2006-06-11 16:59         ` Jakub Narebski
  0 siblings, 2 replies; 8+ messages in thread
From: Linus Torvalds @ 2006-06-11 16:40 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Junio C Hamano, Git Mailing List



On Sun, 11 Jun 2006, Linus Torvalds wrote:
> 
> See? No renames. Renames are not what is fundamental here. What is 
> fundamental is the _STATE_ of the tree. Remember: that's what git tracks, 
> and that is what "git log" shows you.

Btw, this is also why I suggested adding a "--no-simplify-history" flag, 
because in this case, that's exactly what _you_ want. The reason git is 
doing something unexpected - and in your case inferior - is exactly that 
what you want in this case is really not "explain the STATE of this file", 
but you want "give me ALL THE HISTORY concerning this filename".

Both are very valid things to ask for, it just happens that "git log" by 
default answers the _other_ question. It does NOT answer the "what is all 
the history" question that you're asking, it answers the "how did this 
state come to be" question.

Btw, the original "git whatchanged -p" answered exactly the question you 
had, and the semantics changed when we rewrote "git whatchanged" to act 
like "git log -p". But you can still get the old semantics by hand, if you 
really want it, by doing

	git-rev-list --all | git-diff-tree -p -- <filename>

because (and this actually makes total sense when you look at it), you now 
actually say "first give me all the history" and then "show the actual 
changes in that history as it pertains to <filename>".
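
The two-step shape of that pipeline can be sketched as follows. This is a toy model under stated assumptions, not the real tools: `COMMITS`, `rev_list_all` and `changes_path` are invented names, and the per-commit step skips merges and root commits, which is how git-diff-tree is assumed to behave here when given no extra flags.

```python
# Toy model of "list everything, then filter one commit at a time"
# (NOT git's real code). Each commit maps to (parents, files).
COMMITS = {
    "main1": ([], {"git.c": "a"}),
    "main2": (["main1"], {"git.c": "b"}),
    "web0": ([], {"README": "gitweb"}),
    "web1": (["web0"], {"README": "gitweb", "gitweb.cgi": "x"}),
    "web2": (["web1"], {"README": "gitweb", "gitweb.cgi": "y"}),
    "merge": (["main2", "web2"], {"git.c": "b", "gitweb/gitweb.cgi": "y"}),
}


def rev_list_all(heads):
    """Step 1: every commit reachable from the heads, no path filter."""
    seen, todo = [], list(heads)
    while todo:
        name = todo.pop(0)
        if name not in seen:
            seen.append(name)
            todo.extend(COMMITS[name][0])
    return seen


def changes_path(name, path):
    """Step 2: does this one commit change `path` relative to its parent?

    Merges and root commits are skipped (an assumption of this sketch).
    """
    parents, files = COMMITS[name]
    if len(parents) != 1:
        return False
    return COMMITS[parents[0]][1].get(path) != files.get(path)


touching = [c for c in rev_list_all(["merge"]) if changes_path(c, "gitweb.cgi")]
print(touching)  # ['web2', 'web1']
```

Because the filter never sees the graph, nothing gets pruned before the per-commit check runs.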

See? 

I hope this explains the not-so-subtle (but still easy to overlook) 
difference between the two.

And I do agree that we should teach "git log" and friends to be able to 
answer both questions, and that's what my suggested patch (fleshed out 
properly, of course) should do.

Not that I ever _tested_ it, of course. Me? Testing? You make me laugh. Ho 
Ho Ho.

			Linus


* Re: gitweb.cgi history not shown
  2006-06-11 16:40       ` Linus Torvalds
@ 2006-06-11 16:54         ` Linus Torvalds
  2006-06-11 16:59         ` Jakub Narebski
  1 sibling, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2006-06-11 16:54 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Junio C Hamano, Git Mailing List


I just like talking to myself.

On Sun, 11 Jun 2006, Linus Torvalds wrote:
> 
> 	git-rev-list --all | git-diff-tree -p -- <filename>

That obviously wants a "--stdin" argument to git-diff-tree, and I might as 
well point out that it has a few other differences to doing this with the 
"--no-simplify-history" flag:

 - git-diff-tree obviously doesn't show merges normally, and when it does, 
   it would show only merges that change the file. In contrast, the "git 
   log" approach would show all merges that are part of the resulting 
   history (which, since you don't simplify merges, is _all_ of them).

 - the extra flag to "git log" approach allows "--parents" to work, ie the 
   stretches of commits in between merges would have their parents 
   rewritten, so that the history would be a unified whole, and you can 
   use qgit/gitk to show the result. The above pipeline obviously cannot 
   do that, since doing the filename limiter in git-diff-tree means that 
   it doesn't ever even _see_ the "history" part, it's just doing it one 
   commit at a time.

That concludes my monologue on the matter, I hope. If somebody wants to 
condense the issue of "show all history" vs "show how we got to this 
state" and add it to the Wiki FAQ thing, that would probably be good.

		Linus


* Re: gitweb.cgi history not shown
  2006-06-11 16:40       ` Linus Torvalds
  2006-06-11 16:54         ` Linus Torvalds
@ 2006-06-11 16:59         ` Jakub Narebski
  2006-06-11 17:57           ` Linus Torvalds
  1 sibling, 1 reply; 8+ messages in thread
From: Jakub Narebski @ 2006-06-11 16:59 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:


> Btw, this is also why I suggested adding a "--no-simplify-history" flag, 
> because in this case, that's exactly what _you_ want. The reason git is 
> doing something unexpected - and in your case inferior - is exactly that 
> what you want in this case is really not "explain the STATE of this file", 
> but you want "give me ALL THE HISTORY concerning this filename".
[...]
> Btw, the original "git whatchanged -p" answered exactly the question you 
> had, and the semantics changed when we rewrote "git whatchanged" to act 
> like "git log -p". 
[...]
> And I do agree that we should teach "git log" and friends to be able to 
> answer both questions, and that's what my suggested patch (fleshed out 
> properly, of course) should do.

Could we please make 'git whatchanged -p' default to the original
(before rewrite) behavior, i.e. ALL THE HISTORY?

-- 
Jakub Narebski
Warsaw, Poland


* Re: gitweb.cgi history not shown
  2006-06-11 16:59         ` Jakub Narebski
@ 2006-06-11 17:57           ` Linus Torvalds
  0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2006-06-11 17:57 UTC (permalink / raw)
  To: Jakub Narebski, Junio C Hamano; +Cc: Git Mailing List



On Sun, 11 Jun 2006, Jakub Narebski wrote:
> 
> Could we please make 'git whatchanged -p' default to the original
> (before rewrite) behavior, i.e. ALL THE HISTORY?

Ok, here's the full patch to do that.

It does:

 - add a "rev.simplify_history" flag which defaults to on
 - it turns it off for "git whatchanged" (which thus now has real
   semantics outside of "git log")
 - it adds a command line flag ("--full-history") to turn it off for 
   others (ie you can make "git log" and "gitk" etc get the semantics if 
   you want to).

Now, just as an example of _why_ you really really really want to simplify 
history by default, apply this patch, install it, and try these two 
command lines:

	gitk --full-history -- git.c
	gitk -- git.c

and compare the output. 

So with this, you can also now do

	git whatchanged -p -- gitweb.cgi
	git log -p --full-history -- gitweb.cgi

and it will show the old history of gitweb.cgi, even though it's not 
relevant to the _current_ state of the name "gitweb.cgi".
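
The difference the flag makes can be sketched with a toy walk that takes a `simplify_history` switch, loosely mirroring the condition in the patch. It is not git's real code: `COMMITS`, `history` and the commit names are hypothetical, and whether a merge itself gets listed in the non-simplified mode is a detail this sketch does not claim to match.

```python
# Toy model of a path-limited walk with a simplify_history switch
# (NOT git's real code). Each commit maps to (parents, files).
COMMITS = {
    "main1": ([], {"git.c": "a"}),
    "main2": (["main1"], {"git.c": "b"}),
    "web1": ([], {"gitweb.cgi": "x"}),
    "web2": (["web1"], {"gitweb.cgi": "y"}),
    "merge": (["main2", "web2"], {"git.c": "b", "gitweb/gitweb.cgi": "y"}),
}


def history(start, path, simplify_history=True):
    """List commits for `path`; simplify_history=False keeps all parents."""
    shown, seen, todo = [], set(), [start]
    while todo:
        name = todo.pop(0)
        if name in seen:
            continue
        seen.add(name)
        parents, files = COMMITS[name]
        state = files.get(path)
        if not parents:
            if state is not None:
                shown.append(name)  # root that introduces the path
            continue
        followed, changed = [], False
        for parent in parents:
            if COMMITS[parent][1].get(path) == state:
                if simplify_history:
                    # Identical parent found: keep only it and stop.
                    followed, changed = [parent], False
                    break
                followed.append(parent)  # keep going, as in the patch
            else:
                changed = True
                followed.append(parent)
        if changed:
            shown.append(name)
        todo.extend(followed)
    return shown


print(history("merge", "gitweb.cgi"))  # []
full = history("merge", "gitweb.cgi", simplify_history=False)
print("web1" in full and "web2" in full)  # True
```

With the switch off, the walk no longer abandons the side branch just because the first parent already explains the current state.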

NOTE NOTE NOTE! It will still actually simplify away merges that didn't 
change anything at all into either child. That creates these bogus strange 
discontinuities if you look at it with "gitk" (look at the --full-history 
gitk output for git.c, and you'll see a few strange cases).

So the whole "--parents" thing ends up somewhat bogus with --full-history 
because of this, but I'm not sure it's worth even worrying about. I don't 
think you'd ever want to really use "--full-history" with the graphical 
representation, I just give it as an example exactly to show _why_ doing 
so would be insane.

I think this is trivial enough and useful enough to be worth merging into 
the stable branch.

			Linus

---
diff --git a/builtin-log.c b/builtin-log.c
index 29a8851..4407f06 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -51,6 +51,7 @@ int cmd_whatchanged(int argc, const char
 	init_revisions(&rev);
 	rev.diff = 1;
 	rev.diffopt.recursive = 1;
+	rev.simplify_history = 0;
 	return cmd_log_wc(argc, argv, envp, &rev);
 }
 
diff --git a/revision.c b/revision.c
index 6a6952c..75c648c 100644
--- a/revision.c
+++ b/revision.c
@@ -303,7 +303,7 @@ static void try_to_simplify_commit(struc
 		parse_commit(p);
 		switch (rev_compare_tree(revs, p->tree, commit->tree)) {
 		case REV_TREE_SAME:
-			if (p->object.flags & UNINTERESTING) {
+			if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) {
 				/* Even if a merge with an uninteresting
 				 * side branch brought the entire change
 				 * we are interested in, we do not want
@@ -519,6 +519,7 @@ void init_revisions(struct rev_info *rev
 
 	revs->abbrev = DEFAULT_ABBREV;
 	revs->ignore_merges = 1;
+	revs->simplify_history = 1;
 	revs->pruning.recursive = 1;
 	revs->pruning.add_remove = file_add_remove;
 	revs->pruning.change = file_change;
@@ -756,6 +757,10 @@ int setup_revisions(int argc, const char
 				revs->full_diff = 1;
 				continue;
 			}
+			if (!strcmp(arg, "--full-history")) {
+				revs->simplify_history = 0;
+				continue;
+			}
 			opts = diff_opt_parse(&revs->diffopt, argv+i, argc-i);
 			if (opts > 0) {
 				revs->diff = 1;
diff --git a/revision.h b/revision.h
index 7d85b0f..4020e25 100644
--- a/revision.h
+++ b/revision.h
@@ -30,6 +30,7 @@ struct rev_info {
 			no_merges:1,
 			no_walk:1,
 			remove_empty_trees:1,
+			simplify_history:1,
 			lifo:1,
 			topo_order:1,
 			tag_objects:1,

