* how to show log for only one branch
@ 2006-11-06 3:41 Liu Yubao
2006-11-06 6:12 ` Junio C Hamano
2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
0 siblings, 2 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-06 3:41 UTC (permalink / raw)
To: git
I'm somewhat confused by `git log`; here is a revision graph:
a-----> b ---> c ----------------> f ---> g --- master
 \                               /
  `------> d ----------> e ---- test
I hope `git log ...` shows g, f, c, b, a.
`git log master` shows g, f, e, d, c, b, a;
`git log master ^test` shows g, f, c.
`git log --no-merges master` shows g, e, d, c, b, a.
That is to say, I want to view master, master~1, master~2, master~3, ...
until the beginning, with no commits from other branches involved.
I have heard git treats all parents equally in a merge operation, so I
am curious how git decides which parent is HEAD^1.
I feel the HEAD^1 branch is more special than the HEAD^2 branch, because HEAD^1
is usually the working branch and the target branch of the merge operation.
It's a little more convenient to see only the commits that really happened on
the current branch, especially for people who come from CVS and Subversion (yes,
I think git is more interesting than CVS and Subversion :-).
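A minimal sketch of the walk asked for above, assuming the graph is modelled as a mapping from each commit to its ordered list of parents (the `PARENTS` table and the `first_parent_log` helper are illustrative names, not git internals). It follows only the first parent of every commit, which gives the master, master~1, master~2, ... chain. Later git releases offer `git log --first-parent` for this kind of traversal.

```python
# Hypothetical model of the graph above: commit -> ordered list of parents.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["a"],
    "e": ["d"],
    "f": ["c", "e"],  # merge of test into master; c is the first parent
    "g": ["f"],
}


def first_parent_log(tip, parents=PARENTS):
    """Return the commits reached from tip by following first parents only."""
    chain = []
    while tip is not None:
        chain.append(tip)
        commit_parents = parents[tip]
        tip = commit_parents[0] if commit_parents else None
    return chain


print(first_parent_log("g"))  # ['g', 'f', 'c', 'b', 'a']
```

Starting from `e` (the tip of test) the same helper yields e, d, a, so each branch gets its own chain as long as merges keep the branch being merged into as the first parent.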
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: how to show log for only one branch
2006-11-06 3:41 how to show log for only one branch Liu Yubao
@ 2006-11-06 6:12 ` Junio C Hamano
2006-11-06 10:41 ` Liu Yubao
2006-11-06 13:00 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
1 sibling, 2 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-06 6:12 UTC (permalink / raw)
To: Liu Yubao; +Cc: git
Liu Yubao <yubao.liu@gmail.com> writes:
> I have heard git treats all parents equally in a merge operation, so I
> am curious how git decides which parent is HEAD^1.
The first parent you see when you do "git cat-file commit HEAD"
is the HEAD^1, the second one is HEAD^2, etc.
With typical Porcelains (including git-core), when you make a
true merge by pulling another branch while on one branch, the
tip of the branch you were on when you initiated the merge
becomes the HEAD^1 of the resulting merge commit.
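To make the parent ordering concrete, here is a small sketch that reads the `parent` header lines of a raw commit object in the order they appear, which is the order that HEAD^1, HEAD^2, ... refer to. The object text, hashes and identities below are made up for illustration; only the header layout (tree, parent lines, author, committer, blank line, message) reflects what "git cat-file commit" prints.

```python
# Made-up commit object text in the layout printed by "git cat-file commit".
RAW_COMMIT = """\
tree 1111111111111111111111111111111111111111
parent 2222222222222222222222222222222222222222
parent 3333333333333333333333333333333333333333
author A U Thor <author@example.com> 1162793520 +0000
committer C O Mitter <committer@example.com> 1162793520 +0000

Merge branch 'test'
"""


def parents_in_order(raw):
    """Collect the parent ids from the header, keeping their order."""
    parents = []
    for line in raw.splitlines():
        if line == "":
            break  # a blank line ends the header; the message follows
        key, _, value = line.partition(" ")
        if key == "parent":
            parents.append(value)
    return parents


def nth_parent(raw, n):
    """Return the id that <commit>^n would name (n counts from 1)."""
    return parents_in_order(raw)[n - 1]


print(nth_parent(RAW_COMMIT, 1))  # the first "parent" line
print(nth_parent(RAW_COMMIT, 2))  # the second "parent" line
```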
However, that does not mean HEAD^1 is any special in the global
history. It is only locally special when viewed by you who did
the merge, and only immediately after you made the merge. After
a while, even you yourself would feel less special about HEAD^1.
Imagine the following scenario.
. You fork off from Linus's tip, and you do a great work on the
kernel for a while.
      o---o---o---o Liu
     /
 ---o Linus
. Linus's tip progresses, and there are semantically some
overlapping changes; you merge from Linus to make sure your
great work still works with the updated upstream. This merge
commit (marked '*' in the picture below) has _your_ last
change as HEAD^1 and Linus's tip as HEAD^2.
      o---o---o---o---* Liu
     /               /
 ---o---o---o---o---o Linus
. It still works great and you let Linus know about your great
work. He likes it and pulls from you.
At this point, the revision history would still look like this:
      o---o---o---o---* Liu = Linus
     /               /
 ---o---o---o---o---o
That is, the DAG did not change since you pulled from Linus.
The only thing that changed was that Linus's tip now points at
the merge commit _you_ made.
Then Linus keeps working, building commits on top of that merge.
                     Liu
      o---o---o---o---*---o---o---o---o Linus
     /               /
 ---o---o---o---o---o
Now, we can say two things about this history.
If you view the development community "centered around Linus",
then when somebody looks back at the history from Linus's tip,
whatever great work you did is merely "one of the many
contributions from many people". The "mainline" from this point
of view is still "what Linus saw at each point as the tip of his
development track", and among the commits you made (the ones
between the fork point and '*' in the above picture), the last
one, the merge you made, was the only one that was once the tip
of Linus; everything else was "random work that happened in a
side branch". But HEAD^1 is not special if you wanted to have
this view.
In massively parallel and distributed development, whose track
of development is "mainline" is not absolute, and it all depends
on what you are interested in when you do the archaeology.
Let's say that your work on the side branch was in one specific
area (say, a device driver work for product X), and nobody
else's work in that area appeared on Linus's development track
since you forked until your work was merged.
To somebody who is digging from Linus's tip in order to find out
how that driver evolved, your side branch is much more important
than what happened on Linus's branch (which everybody would
loosely call _the_ "mainline"). On the other hand, when somebody
is interested in some other area that was worked on in Linus's
development track while your work was done in the side branch,
following your development track is not interesting; and the
person who is interested in this "other area" could be you. In
that case, you would want to follow Linus's development track.
What's mainline is _not_ important, and which parent is first is
even less so. Which branch matters more depends solely on what
you are looking for. Putting too much weight on the
difference between HEAD^1 vs HEAD^2 statically does not make any
sense.
Reflecting this view of history, git log and other history
traversal commands treat merge parents more or less equally, and
_how_ you ask your question affects what branches are primarily
followed. For example, if somebody is interested in your device
driver work, this command:
git log -- drivers/liu-s-device/
would follow your side branch. On the other hand,
git log -- fs/
would follow Linus's development track while you were forked, if
you did not do any fs/ work while on that side branch and
Linus's development track had work in that area, _despite_ the
fact that the merge you gave Linus has your development track as
its first parent.
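The behaviour described above can be sketched roughly as follows. This is a toy model, not git's actual implementation: each commit records its parents and a per-path content label, and when a commit's content for the path matches one of its parents, the walk follows only that parent. Commit names and content labels are hypothetical.

```python
# Toy history: Liu's side branch touches drivers, Linus's track touches fs.
COMMITS = {
    "base": {"parents": [], "files": {"drivers": "v0", "fs": "v0"}},
    "liu1": {"parents": ["base"], "files": {"drivers": "v1", "fs": "v0"}},
    "liu2": {"parents": ["liu1"], "files": {"drivers": "v2", "fs": "v0"}},
    "lin1": {"parents": ["base"], "files": {"drivers": "v0", "fs": "v1"}},
    "lin2": {"parents": ["lin1"], "files": {"drivers": "v0", "fs": "v2"}},
    # The merge Liu made: Liu's work is the first parent, Linus's tip the second.
    "merge": {"parents": ["liu2", "lin2"],
              "files": {"drivers": "v2", "fs": "v2"}},
}


def path_log(tip, path, commits=COMMITS):
    """List the commits that changed `path`, following unchanged parents only."""
    shown, todo, seen = [], [tip], set()
    while todo:
        name = todo.pop()
        if name in seen:
            continue
        seen.add(name)
        parents = commits[name]["parents"]
        mine = commits[name]["files"].get(path)
        unchanged = [p for p in parents
                     if commits[p]["files"].get(path) == mine]
        if unchanged:
            # Nothing happened to the path here: follow just that parent.
            todo.append(unchanged[0])
        else:
            # Differs from every parent, or is a root commit holding the path.
            if parents or mine is not None:
                shown.append(name)
            todo.extend(reversed(parents))
    return shown


print(path_log("merge", "drivers"))  # ['liu2', 'liu1', 'base']
print(path_log("merge", "fs"))       # ['lin2', 'lin1', 'base']
```

For the fs path the walk leaves the merge through its second parent, even though Liu's track is the first parent, which is the point being made.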
* Re: how to show log for only one branch
2006-11-06 6:12 ` Junio C Hamano
@ 2006-11-06 10:41 ` Liu Yubao
2006-11-06 18:16 ` Junio C Hamano
2006-11-06 13:00 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
1 sibling, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-06 10:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
> Liu Yubao <yubao.liu@gmail.com> writes:
>
[Snipped a lot of great, detailed description.] Thank you very much. I have
a question about the way git treats fast-forwarding, but that will
be another topic.
> What's mainline is _not_ important, and which parent is first is
> even less so. It solely depends on what you are looking for
> which branch matters more. Putting too much weight on the
> difference between HEAD^1 vs HEAD^2 statically does not make any
> sense.
>
> Reflecting this view of history, git log and other history
> traversal commands treat merge parents more or less equally, and
> _how_ you ask your question affects what branches are primarily
> followed. For example, if somebody is interested in your device
> driver work, this command:
>
> git log -- drivers/liu-s-device/
>
> would follow your side branch. On the other hand,
>
> git log -- fs/
>
> would follow Linus's development track while you were forked, if
> you did not do any fs/ work while on that side branch and
> Linus's development track had works in that area, _despite_ the
> merge you gave Linus has your development track as its first
> parent.
>
This is perfect, and enough for two branches that work on different
files, but if two branches modify the same files, "git log" can't separate
the commits clearly. For example, if I want to know what happened in your
git's "next" branch, I hope to get a log like this:
Merge branch 'jc/pickaxe' into next
Merge branch 'master' into next
Merge branch 'js/modfix' into next
...
some good work
...
Merge branch ....
I just want to *outline* what happened in the "next" branch; if I am interested
in what has been merged from 'jc/pickaxe', I can follow the merge point again
or use something like "git log --follow-all-parents".
Instead, "git log" interlaces logs from many branches, and I find it a little
confusing: why does "git log" of the current branch contain so many logs from
other branches? (This is not a real question; I know the reason.)
I do understand that HEAD^1 is not always the commit that my work
was based on before a merge (thanks again for your detailed description :-), so
it doesn't make sense to show HEAD~1, HEAD~2, HEAD~3 and so on; that is
to say, 'git log' will never meet my requirement.
Maybe reflog is what I need: I want to know which commits "next" has pointed
to. But reflog is only for local use; it's not downloaded by 'git clone'.
* If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 6:12 ` Junio C Hamano
2006-11-06 10:41 ` Liu Yubao
@ 2006-11-06 13:00 ` Liu Yubao
2006-11-06 13:39 ` If merging that is really fast forwarding creates new commit Rocco Rutte
` (2 more replies)
1 sibling, 3 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-06 13:00 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Thanks to Junio for his patient explanation about branches in git. I find
there is a subtle difference between git and a regular VCS that can easily be
overlooked by newbies.
I realize that git is a *content tracker*: it only creates a commit object
when the corresponding tree is really modified; git records content merging
but not the usual merging operation, and that's why git is called a content
tracker. This explains why a merge that is really a fast-forward doesn't create
any new commit.
This feature is different from many regular VCSes like CVS and Subversion, and
confuses newbies who come from them: "mainline" doesn't mean much, and
'git log' shows many logs from other branches. In git, a branch is almost a
tag; you can't get the *track* of a branch (it's a pity reflog is only for
local use). I am used to the one-trunk-and-several-side-branches way, where
every branch is clearly isolated, so git confused me a lot at the beginning.
Then, what bad *logical* problem will happen if a merge that is really a
fast-forward creates a new commit?
If we set aside all compatibility, efficiency, memory and disk consumption
problems,
(1) we can get the track of a branch without reflog, because HEAD^1 is
always the tip of the target branch (usually the working branch) before merging.
(2) with the track, the branch mechanism in git is possibly easier to understand,
especially for newbies from CVS or Subversion. I really like git's light
weight, simple but powerful design and great efficiency, but I am really
surprised that 'git log' shows logs from other branches and that a side branch
can suddenly become part of the main line.
A revision graph that represents a fast-forward style merge like this:
          (fast forwarding)
 ---- a ............ * ------> master
       \            /
        b----------c -----> test (three commits with three trees)
can be changed to:
 ---- a (tree_1) ----------- d (tree_3) ------> master
       \                    /
        b (tree_2) ------- c (tree_3) ----> test
(four commits with three trees, it's normal as more than one way can reach
Rome :-)
I don't think I am smarter than anyone on this mailing list; in fact,
I was very confused by git's branches at the beginning. There must
be many problems I haven't realized. I am very curious about them, any
* Re: If merging that is really fast forwarding creates new commit
2006-11-06 13:00 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
@ 2006-11-06 13:39 ` Rocco Rutte
2006-11-07 3:42 ` Liu Yubao
2006-11-06 13:43 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
2006-11-06 15:48 ` Linus Torvalds
2 siblings, 1 reply; 35+ messages in thread
From: Rocco Rutte @ 2006-11-06 13:39 UTC (permalink / raw)
To: git
Hi,
* Liu Yubao [06-11-06 21:00:07 +0800] wrote:
>Then, what bad *logical* problem will happen if a merging that is really a fast forwarding creates a new commit?
I don't know what you mean by "logical", nor whether I understand you
correctly, but if you fast-forward merge one branch into another, both
branches now have exactly the same hash. If you create a commit object for a
fast-forward merge, the two tip hashes are no longer identical... which is bad.
The identical hash is important so that you really know they're identical,
and for future reference such as ancestry.
bye, Rocco
--
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 13:00 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
2006-11-06 13:39 ` If merging that is really fast forwarding creates new commit Rocco Rutte
@ 2006-11-06 13:43 ` Andreas Ericsson
2006-11-07 3:26 ` Liu Yubao
2006-11-06 15:48 ` Linus Torvalds
2 siblings, 1 reply; 35+ messages in thread
From: Andreas Ericsson @ 2006-11-06 13:43 UTC (permalink / raw)
To: Liu Yubao; +Cc: Junio C Hamano, git
Liu Yubao wrote:
> Thanks to Junio for his patient explanation about branches in git, I
> find there is a subtle difference between GIT and regular VCS that can
> be easily
> neglected by newbies.
>
> I realize that git is a *content tracker*, it only creates commit object
> when the corresponding tree is really modified, git records content merging
> but not usual merging operation, that's why git is called a content
> tracker.
> This explains why a merging that is really a fast forwarding doesn't create
> any new commit.
>
> This feature is different from many regular VCS like CVS and Subversion and
> confuses newbies that come from them: mainline doesn't make sense too much,
> 'git log' shows many logs from other branches. In git, a branch is almost a
> tag, you can't get the *track* of a branch(It's a pity reflog is only for
> local purpose). I am used to one-trunk-and-more-side-branches way, every
> branches are isolated clearly, git makes me very confused at the beginning.
>
>
> Then, what bad *logical* problem will happen if a merging that is really
> a fast forwarding creates a new commit?
>
If "fake" commits (i.e., commits that don't change any content) are
introduced for each merge, it will change the ancestry graph and the
resulting tree(s) won't be mergeable with the tree it merged with,
because each such "back-merge" would result in
* the "fake" commit becoming part of history
* a new "fake" commit being introduced
Consider what happens when Alice pulls in Bob's changes. The merge-base
of Bob's tip is where Alice HEAD points to, so it results in a
fast-forward, like below.
a---b---c---d <--- Alice
             \
              e---f---g <--- Bob
If we had created a fake commit instead, Alice would get a graph
that looks like so:
a---b---c---d-----------h <--- Alice
             \         /
              e---f---g <--- Bob
Now, we would have two trees that are identical, because the merge can't
cause conflicts, but Alice and Bob will have reached it in two different
ways. When Bob decides he wants to go get the changes Alice has done,
his tree will look something like this:
a---b---c---d-----------h <--- Alice
             \         / \
              e---f---g---i <--- Bob
He finds it odd that he's got two commits that, when checked out, lead
to the exact same tree, so he asks Alice to get his tree and see what's
going on. Alice will then end up with this:
a---b---c---d-----------h---j <--- Alice
             \         / \ /
              e---f---g---i <--- Bob
Now there are four commits that all point to identical trees, but the
ancestry graphs differ between all developers. In the case above,
there are only two people working on the same project. Imagine the number
of empty commits you'd get in a larger project, like the Linux kernel.
Fast-forward is a Good Thing and the only sensible thing to do in a
system designed to be fully distributed (i.e., where there isn't
necessarily any middle point with which everybody syncs), while scaling
beyond ten developers that merge frequently between each other.
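The ping-pong effect described above can be simulated with a toy history. The sketch below is an illustration under stated assumptions, not a model of git's merge machinery: commits are just ids with parent lists, `pull` either fast-forwards or always records a merge commit, and Alice and Bob pull from each other a few times without doing any real work. `History`, `pull` and `ping_pong` are hypothetical names.

```python
import itertools


class History:
    """A shared pool of commits: id -> list of parent ids."""

    def __init__(self):
        self.parents = {}
        self._ids = itertools.count()

    def commit(self, *parents):
        cid = f"c{next(self._ids)}"
        self.parents[cid] = list(parents)
        return cid

    def is_ancestor(self, ancestor, descendant):
        todo, seen = [descendant], set()
        while todo:
            c = todo.pop()
            if c == ancestor:
                return True
            if c not in seen:
                seen.add(c)
                todo.extend(self.parents[c])
        return False


def pull(history, mine, theirs, fast_forward):
    """Return my new tip after pulling theirs."""
    if history.is_ancestor(theirs, mine):
        return mine  # already up to date
    if fast_forward and history.is_ancestor(mine, theirs):
        return theirs  # just move the tip, no new commit
    return history.commit(mine, theirs)  # record a merge commit


def ping_pong(fast_forward, rounds=3):
    history = History()
    alice = history.commit()    # shared starting point
    bob = history.commit(alice)  # Bob has one real commit on top
    for _ in range(rounds):
        alice = pull(history, alice, bob, fast_forward)
        bob = pull(history, bob, alice, fast_forward)
    return alice == bob, len(history.parents)


print(ping_pong(fast_forward=True))   # (True, 2)
print(ping_pong(fast_forward=False))  # (False, 8)
```

With fast-forwarding the two tips agree after the first pull and no commit is added; with a forced merge commit every pull adds one more commit and the tips never meet.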
> If we throw away all compatibility, efficiency, memory and disk consumption
> problems,
> (1) we can get the track of a branch without reflog because HEAD^1 is
> always the tip of target branch(or working branch usually) before merging.
>
> (2) with the track, branch mechanism in git is possibly easier to
> understand,
> especially for newbies from CVS or Subversion, I really like git's light
> weight, simple but powerful design and great efficiency, but I am really
> surprised that 'git log' shows logs from other branches and a side
> branch can become part of main line suddenly.
>
> A revision graph represents fast forwarding style merging like this:
>
>           (fast forwarding)
>  ---- a ............ * ------> master
>        \            /
>         b----------c -----> test (three commits with three trees)
>
> can be changed to:
>
>  ---- a (tree_1) ----------- d (tree_3) ------> master
>        \                    /
>         b (tree_2) ------- c (tree_3) ----> test
> (four commits with three trees, it's normal as more than one way can
> reach Rome :-)
>
That's where our views differ. In my eyes, "d" and "c" are exactly
identical, and I'd be very surprised if the scm tried to tell me that
they aren't, by not giving them the same revid.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
* Re: how to show log for only one branch
2006-11-06 3:41 how to show log for only one branch Liu Yubao
2006-11-06 6:12 ` Junio C Hamano
@ 2006-11-06 15:25 ` Jakub Narebski
2006-11-07 3:47 ` Liu Yubao
1 sibling, 1 reply; 35+ messages in thread
From: Jakub Narebski @ 2006-11-06 15:25 UTC (permalink / raw)
To: git
Perhaps what you want is git log --committer=<owner of repo>?
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 13:00 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
2006-11-06 13:39 ` If merging that is really fast forwarding creates new commit Rocco Rutte
2006-11-06 13:43 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
@ 2006-11-06 15:48 ` Linus Torvalds
2006-11-06 16:03 ` Martin Langhoff
` (2 more replies)
2 siblings, 3 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-06 15:48 UTC (permalink / raw)
To: Liu Yubao; +Cc: Junio C Hamano, git
On Mon, 6 Nov 2006, Liu Yubao wrote:
>
> Then, what bad *logical* problem will happen if a merging that is really a
> fast forwarding creates a new commit?
You MUST NOT do that.
If a fast-forward were to do a "merge commit", you'd never get into the
situation where two people merging each other would really ever get a
stable result. They'd just keep doing merge commits on top of each other.
Git tracks history, not "your view of history". Trying to track "your
view" is fundamentally wrong, because "your view" automatically means that
the project history would not be distributed any more - it would be
centralized around what _you_ think happened. That is not a sensible thing
to have in a distributed system.
For example, the way to break the "infinite merges" problem above is to
say that _you_ would be special, and you would do a "fast-forward commit",
and the other side would always just fast-forward without a commit. But
that is very fundamentally against the whole point of being distributed.
Now you're special.
In fact, even for "you", it would be horrible - because you personally
might have 5 different repositories on five different machines. You'd have
to select _which_ machine you want to track. That's simply insane. It's a
totally broken model. (You can even get the same situation with just _one_
repository, by just having five different branches - you have to decide
which one is the "main" branch).
Besides, doing an empty commit like that ("I fast forwarded") literally
doesn't add any true history information. It literally views history not
as history of the _project_, but as the history of just one of the
repositories. And that's wrong.
So just get used to it. You MUST NOT do what you want to do. It's stupid.
If you want to track the history of one particular local branch, use the
"reflog" thing. It allows you to see what one of your local branches
contained at any particular time.
See
	[core]
		logAllRefUpdates = true
documentation in "man git-update-ref" (and maybe somebody can write more
about it?)
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 15:48 ` Linus Torvalds
@ 2006-11-06 16:03 ` Martin Langhoff
2006-11-06 17:48 ` Linus Torvalds
2006-11-07 7:27 ` Liu Yubao
2 siblings, 0 replies; 35+ messages in thread
From: Martin Langhoff @ 2006-11-06 16:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Liu Yubao, Junio C Hamano, git
On 11/6/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Mon, 6 Nov 2006, Liu Yubao wrote:
> > Then, what bad *logical* problem will happen if a merging that is really a
> > fast forwarding creates a new commit?
> You MUST NOT do that.
>
> If a fast-forward were to do a "merge commit", you'd never get into the
> situation where two people merging each other would really ever get a
> stable result. They'd just keep doing merge commits on top of each other.
Indeed. I used Arch for quite a while and if you were merging between
2 or more repos it would never reach a stable point even if the code
didn't change at all.
If a group of 3 developers (with one repo per developer) was
developing at a slow pace (say, a daily commit each, plus a couple of
pull/updates per day) the garbage-commit to content-commit ratio was
awful. If on a given day no one had made a single commit, we'd still
have a whole set of useless updates merged and committed.
> Besides, doing an empty commit like that ("I fast forwarded") literally
> doesn't add any true history information.
And as the number of developers and repos grows in a distributed
scenario, fast-forwards increasingly outnumber real commits. The
usefulness of your logs sinks to the sewers.
cheers,
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 15:48 ` Linus Torvalds
2006-11-06 16:03 ` Martin Langhoff
@ 2006-11-06 17:48 ` Linus Torvalds
2006-11-07 7:59 ` Liu Yubao
2006-11-07 11:46 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Eran Tromer
2006-11-07 7:27 ` Liu Yubao
2 siblings, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-06 17:48 UTC (permalink / raw)
To: Liu Yubao; +Cc: Junio C Hamano, git
On Mon, 6 Nov 2006, Linus Torvalds wrote:
>
> Besides, doing an empty commit like that ("I fast forwarded") literally
> doesn't add any true history information. It literally views history not
> as history of the _project_, but as the history of just one of the
> repositories. And that's wrong.
>
> So just get used to it. You MUST NOT do what you want to do. It's stupid.
Btw, absolutely the _only_ reason people seem to want to do this is
because they want to "pee in the snow" and put their mark on things. They
seem to want to show "_I_ did this", even if the "doing" was a total
no-op and they didn't actually generate any real value.
That's absolutely the last thing you want to encourage, especially when
the end result is a history that is totally unreadable and contains more
"junk" than actual real work.
I'll be the first to say that "merging code" is often as important as
actually writing the code in the first place, and that it is important to
show who actually did real work to make a patch appear in a project.
In the kernel, for example, we have "sign-off" lines to show what route a
patch took before it was accepted, and it's very instructive to see (for
example) how many patches give credit to somebody like Andrew Morton for
passing it on versus actually writing the code himself (he has a lot of
authorship credit too, but it's absolutely _dwarfed_ by his importance as
a maintainer - and if you were to ask any random kernel developer why
Andrew is so important, I can pretty much guarantee that his importance is
very much about those "sign-offs", and not about the patches he authors).
But at the same time, when it comes to merging, because it actually
clutters up history a lot, we actively try to _avoid_ it. Many subsystem
maintainers purposefully re-generate a linear history, rebased on top of
my current kernel, exactly because it makes the history less "branchy",
and because that makes things easier to see.
So we have actually done work to _encourage_ fast-forwarding over "merge
with a commit", because the fast-forwarding ends up generating a much more
readable and understandable history. Generating a _fake_ "merge commit"
would be absolutely and utterly horrible. It gives fake credit for work
that wasn't real work, and it makes history uglier and harder to read.
So it's a real NEGATIVE thing to have, and you should run away from it as
fast as humanly possible.
Now, the kernel actually ends up being fairly branchy anyway, but that's
simply because we actually have a lot of real parallel development (I bet
more than almost any other project out there - we simply have more commits
done by more people than most projects). I tend to do multiple merges a
day, so even though people linearize their history individually, you end
up seeing a fair amount of merges. But we'd have a lot _more_ of them if
people didn't try to keep history clean.
Btw, in the absence of a merge, you can still tell who committed
something, exactly because git keeps track of "committer" information in
addition to "authorship" information. I don't understand why other
distributed environments don't seem to do this - because separating out
who committed something (and when) from who authored it (and when) is
actually really really important.
And that's not just because we use patches and SCMs other than git
to track things (so authorship and committing really are totally separate
issues), but because even if the author and committer is the same person,
it's very instructive to realize that it might have been moved around in
history, so it might actually have been cherry-picked later, and the
committer date differs from the author date even if the actual author and
committer are the same person (but you might also have had somebody _else_
re-linearize or otherwise cherry-pick the history: again, it's important
to show the committer _separately_ both as a person and as a date).
And because there is a committer field, if you actually want to linearize
or log things by who _committed_ stuff, you can. Just do
git log --committer=torvalds
on the kernel, and you can see the log as it pertains to what _I_
committed, for example. You can even show it graphically, although it
won't be a connected graph any more, so it will tend to be very ugly
(but you'll see the "linear stretches" when somebody did some work). Just
do "gitk --committer=myname" to see it in your own project.
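As a rough illustration of why a separate committer field is enough to answer "who put this into the history", the sketch below filters a list of commit records by committer while leaving authorship untouched. The `Commit` record, the sample data and `log_by_committer` are made up for this example; the only assumption carried over from git is that the committer pattern is applied as a regular expression to the committer field.

```python
import re
from dataclasses import dataclass


@dataclass
class Commit:
    subject: str
    author: str
    committer: str


# Hypothetical history: the author and the committer are often different people.
HISTORY = [
    Commit("fix driver probe", "liu@example.com", "torvalds@example.com"),
    Commit("fs: tidy up locking", "akpm@example.com", "akpm@example.com"),
    Commit("docs: typo", "liu@example.com", "liu@example.com"),
    Commit("mm: fix leak", "akpm@example.com", "torvalds@example.com"),
]


def log_by_committer(commits, pattern):
    """Return the subjects of commits whose committer matches the pattern."""
    rx = re.compile(pattern)
    return [c.subject for c in commits if rx.search(c.committer)]


print(log_by_committer(HISTORY, "torvalds"))
# ['fix driver probe', 'mm: fix leak']
```

The selected commits are generally not adjacent in the history, which is why the graphical view of such a selection is disconnected.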
* Re: how to show log for only one branch
2006-11-06 10:41 ` Liu Yubao
@ 2006-11-06 18:16 ` Junio C Hamano
2006-11-07 2:21 ` Liu Yubao
0 siblings, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2006-11-06 18:16 UTC (permalink / raw)
To: Liu Yubao; +Cc: git
Liu Yubao <yubao.liu@gmail.com> writes:
> ... For example, I want to know what happened in your
> git's "next" branch, I hope to get logs like this:
> Merge branch 'jc/pickaxe' into next
> Merge branch 'master' into next
> Merge branch 'js/modfix' into next
> ...
> some good work
> ...
> Merge branch ....
>
> I just want to *outline* what happened in "next" branch, if I am interested
> in what have been merged from 'jc/pickaxe' I can follow the merge point again
> or use something like "git log --follow-all-parents".
My "next" is a bad example of this, because it is an integration
branch and never gets its own development. It is also a bad
example because I can answer that question with this command
line:
git log --grep='^Merge .* into next$' next
and while it is a perfectly valid answer, I know it would leave
you feeling somewhat cheated.
* Re: how to show log for only one branch
2006-11-06 18:16 ` Junio C Hamano
@ 2006-11-07 2:21 ` Liu Yubao
2006-11-07 8:21 ` Jakub Narebski
0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 2:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
> Liu Yubao <yubao.liu@gmail.com> writes:
>
>> ... For example, I want to know what happened in your
>> git's "next" branch, I hope to get logs like this:
>> Merge branch 'jc/pickaxe' into next
>> Merge branch 'master' into next
>> Merge branch 'js/modfix' into next
>> ...
>> some good work
>> ...
>> Merge branch ....
>>
>> I just want to *outline* what happened in "next" branch, if I am interested
>> in what have been merged from 'jc/pickaxe' I can follow the merge point again
>> or use something like "git log --follow-all-parents".
>
> My "next" is a bad example of this, because it is an integration
> branch and never gets its own development. It is also a bad
> example because I can answer that question with this command
> line:
>
> git log --grep='^Merge .* into next$' next
>
> and while it is a perfectly valid answer, I know it would leave
> you feeling somewhat cheated.
>
Smart trick, but if the log messages aren't consistent enough it's hard to
grep them out.
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 13:43 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
@ 2006-11-07 3:26 ` Liu Yubao
2006-11-07 9:30 ` Andy Whitcroft
0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 3:26 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Junio C Hamano, git
Andreas Ericsson wrote:
> Liu Yubao wrote:
>
> If "fake" commits (i.e., commits that doesn't change any content) are
> introduced for each merge, it will change the ancestry graph and the
> resulting tree(s) won't be mergable with the tree it merged with,
> because each such "back-merge" would result in
> * the "fake" commit becoming part of history
> * a new "fake" commit being introduced
>
> Consider what happens when Alice pulls in Bob's changes. The merge-base
> of Bob's tip is where Alice HEAD points to, so it results in a
> fast-forward, like below.
>
> a---b---c---d <--- Alice
>              \
>               e---f---g <--- Bob
>
>
> If, we would have created a fake commit instead, Alice would get a graph
> that looks like so:
>
> a---b---c---d-----------h <--- Alice
>              \         /
>               e---f---g <--- Bob
>
>
> Now, we would have two trees that are identical, because the merge can't
> cause conflicts, but Alice and Bob will have reached it in two different
> ways. When Bob decides he wants to go get the changes Alice has done,
> his tree will look something like this:
>
> a---b---c---d-----------h <--- Alice
>              \         / \
>               e---f---g---i <--- Bob
>
>
> He finds it odd that he's got two commits that, when checked out, lead
> to the exact same tree, so he asks Alice to get his tree and see what's
> going on. Alice will then end up with this:
>
> a---b---c---d-----------h---j <--- Alice
>              \         / \ /
>               e---f---g---i <--- Bob
>
>
> Now there are four commits that all point to identical trees, but the
> ancestry graphs differ between all developers. In the case above,
> there are only two people working on the same project. Imagine the amount
> of empty commits you'd get in a larger project, like the Linux kernel.
>
Oh, you remind me of that, but I have a naive solution for this problem:
print a hint and don't merge commits that contain a fake commit; then I know
I have reached a stable merge point and have the same tree as the others.
We create a fake commit for a fast-forward style merge. This fake commit
records the track of a branch, so we can always follow HEAD^1 to travel
through the history of a branch. In fact, git pays more attention to the
history of *data modification* than to the history of *operations*; that is
exactly the subtle difference between a content tracker and a VCS, whose
branches carry more information (useful information, I think).
Even if no fake commit is created, as git does now, there can be multiple
commits with an identical tree object, and git can't prevent you from merging
two commits with identical tree objects; it just creates an ancestry relation
to remember the merge point.
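The "fake commit" idea can be tried out in a scratch repository. This is only a sketch, and it assumes a later Git version that has `git merge --no-ff` (an option that did not exist when this thread was written); the file name and commit labels a, b, c, d are placeholders matching the tree_1/tree_2/tree_3 picture in this message.

```shell
#!/bin/sh
# Sketch: force a merge commit where a fast-forward was possible, then
# compare commit IDs and tree IDs of the two branch tips.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name used below
# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 1 > file;   git add file; git commit -q -m a     # tree_1
git checkout -q -b test
echo 22 > file;  git commit -q -am b                  # tree_2
echo 333 > file; git commit -q -am c                  # tree_3
git checkout -q master
git merge -q --no-ff -m d test    # d: merge commit instead of fast-forward

commit_c=$(git rev-parse test)
commit_d=$(git rev-parse master)
tree_c=$(git rev-parse 'test^{tree}')
tree_d=$(git rev-parse 'master^{tree}')
first_parent_of_d=$(git log -1 --format=%s 'master^1')

echo "commit c: $commit_c  tree: $tree_c"
echo "commit d: $commit_d  tree: $tree_d"
echo "first parent of d: $first_parent_of_d"
```

The two tips end up as different commits that share one tree, and d's first parent is the old master tip, which is the property this message wants from HEAD^1.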
As git(7) says:
The "commit" object is an object that introduces the notion
of history into the picture. In contrast to the other objects,
it doesn't just describe the physical state of a tree, it
describes how we got there, and why.
So it's clearer to describe a revision graph with nodes for tree
objects and edges for commit objects (multiple edges for a merge
commit object; I know this will break your habit :-).
> Fast-forward is a Good Thing and the only sensible thing to do in a
> system designed to be fully distributed (i.e., where there isn't
> necessarily any middle point with which everybody syncs), while scaling
> beyond ten developers that merge frequently between each other.
>
>> If we throw away all compatibility, efficiency, memory and disk
>> consumption
>> problems,
>> (1) we can get the track of a branch without reflog because HEAD^1 is
>> always the tip of target branch(or working branch usually) before
>> merging.
>>
>> (2) with the track, branch mechanism in git is possibly easier to
>> understand,
>> especially for newbies from CVS or Subversion, I really like git's
>> light weight, simple but powerful design and great efficiency, but I
>> am really
>> surprised that 'git log' shows logs from other branches and a side
>> branch can become part of main line suddenly.
>>
>> A revision graph represents fast forwarding style merging like this:
>>
>>                  (fast forwarding)
>>  ---- a ............ * ------> master
>>        \            /
>>         b----------c -----> test (three commits with three trees)
>>
>> can be changed to:
>>
>>  ---- a (tree_1) ----------- d (tree_3) ------> master
>>        \                    /
>>         b (tree_2) ------- c (tree_3) ----> test
>> (four commits with three trees, it's normal as more than one way can
>> reach Rome :-)
>>
>
> That's where our views differ. In my eyes, "d" and "c" are exactly
> identical, and I'd be very surprised if the scm tried to tell me that
> they aren't, by not giving them the same revid.
It doesn't matter; they have the same tree, and in git it's also normal
for multiple commits to have the same tree. If you use nodes for tree
states, that graph becomes simple to understand:
a d
-----tree_1 -------------- tree_3 ----> master
\ / \
\ b d/c `-----> test
\ /
`--- tree_2 ---'
This is the familiar way we used in CVS. I believe more than one person
has been confused by fast-forward style merges and 'git log' in git.
* Re: If merging that is really fast forwarding creates new commit
2006-11-06 13:39 ` If merging that is really fast forwarding creates new commit Rocco Rutte
@ 2006-11-07 3:42 ` Liu Yubao
0 siblings, 0 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 3:42 UTC (permalink / raw)
To: git
Rocco Rutte wrote:
> Hi,
>
> * Liu Yubao [06-11-06 21:00:07 +0800] wrote:
>
>> Then, what bad *logical* problem will happen if a merging that is
>> really a fast forwarding creates a new commit?
>
> I don't know what you expect by "logical" nor if I get you right, but if
> fast-forward merge a branch to another one, both branches now have
> exactly the same hash. If you create a commit object for a fast-forward
> merge, both tip hashes not identical anymore... which is bad.
Not so bad; you can still tell that they point to the same tree object.
A fast-forward style merge blows away the *track* of your branch,
and this track is useful; that is why the reflog exists.
>
> The identical hash important so that you really know they're identical
> and for future reference like ancestry.
I guess you have confused identical commits with identical trees. Trees
are what we really need.
A fake commit doesn't mess up the ancestry relation; see my previous
mail in reply to Andreas Ericsson in this thread.
>
> bye, Rocco
* Re: how to show log for only one branch
2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
@ 2006-11-07 3:47 ` Liu Yubao
2006-11-07 8:08 ` Jakub Narebski
0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 3:47 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski wrote:
> Perhaps what you want is git log --committer=<owner of repo>?
>
Thanks, but it can't meet my requirement: if I create two branches
and merge them, I can't easily tell the track of those two branches.
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 15:48 ` Linus Torvalds
2006-11-06 16:03 ` Martin Langhoff
2006-11-06 17:48 ` Linus Torvalds
@ 2006-11-07 7:27 ` Liu Yubao
2006-11-07 9:46 ` Andy Whitcroft
2006-11-07 16:05 ` Linus Torvalds
2 siblings, 2 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 7:27 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
Linus Torvalds wrote:
>
> On Mon, 6 Nov 2006, Liu Yubao wrote:
>> Then, what bad *logical* problem will happen if a merging that is really a
>> fast forwarding creates a new commit?
>
> You MUST NOT do that.
>
> If a fast-forward were to do a "merge commit", you'd never get into the
> situation where two people merging each other would really ever get a
> stable result. They'd just keep doing merge commits on top of each other.
They can stop when merging a fake commit with a real commit that points to
the same tree object; there they reach a stable result: the same tree content.
>
> Git tracks history, not "your view of history". Trying to track "your
> view" is fundamentally wrong, because "your view" automatically means that
> the project history would not be distributed any more - it would be
> centralized around what _you_ think happened. That is not a sensible thing
> to have in a distributed system.
It's not my view, it's a branch-scope view: I can see how a branch evolves
relatively independently. In git, the branch-scope view is more or less
neglected. After a fast-forward merge, I can't tell where a branch came
from -- I mean the track of a branch.
If Junio published his reflog, I don't see what conflict would arise between
his local view (now public, so calling it a branch-scope view seems more
sensible) and git's global view.
If this doesn't lead to problems, it also seems OK to use a fake commit for
a fast-forward style merge, so we can follow HEAD^1 to travel through a
branch without the reflog.
I hope I have expressed my thoughts clearly.
>
> For example, the way to break the "infinite merges" problem above is to
> say that _you_ would be special, and you would do a "fast-forward commit",
> and the other side would always just fast-forward without a commit. But
> that is very fundamentally against the whole point of being distributed.
> Now you're special.
No one is special, as everybody can create fake commits. A branch (almost
a tag) will never be overwritten to point to a commit object in
another branch; branches are relatively independent. That is to say,
'git log' will reflect what has really happened in the current branch (a
branch in the CVS sense, not just a tag that always points to a tip commit).
>
> In fact, even for "you", it would be horrible - because you personally
> might have 5 different repositories on five different machines. You'd have
> to select _which_ machine you want to track. That's simply insane. It's a
> totally broken model. (You can even get the same situation with just _one_
> repository, by just having five different branches - you have to decide
> which one is the "main" branch).
What does an upstream branch mean, then? I have to know that I should track
Junio's public repository.
When does one say two branches have reached a common point? When they have
the same commit (which must point to the same tree), or when they have the
same tree (maybe a fake commit and a real commit)? I think git takes the
first way.
Fast-forward style merges tend to *automatically* centralize many
branches; in CVS people merge two branches and drop the side branch to
centralize them. Both have central semantics.
(I don't want to start a flame war between CVS/SVN and GIT; I really think
git is better than them :-)
>
> Besides, doing an empty commit like that ("I fast forwarded") literally
> doesn't add any true history information. It literally views history not
> as history of the _project_, but as the history of just one of the
> repositories. And that's wrong.
Something like 'git log --follow-all-parent' could show the history of the
project as 'git log' does now.
>
> So just get used to it. You MUST NOT do what you want to do. It's stupid.
Yes, I have understood the git way and am getting used to it, I like
its simple but powerful design and great efficiency, thank all for your
good work!
>
> If you want to track the history of one particular local branch, use the
> "reflog" thing. It allows you to see what one of your local branches
> contained at any particular time.
>
> See
>
> [core]
> logAllRefUpdates = true
>
Thanks, it's a pity I can't pull Junio's reflog :-(
> documentation in "man git-update-refs" (and maybe somebody can write more
> about it?)
>
> Linus
>
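For reference, a small sketch of the reflog suggestion, run in a throwaway repository. It assumes a Git version that provides `git reflog show` and `git merge --ff-only`; the branch names and commit labels are placeholders. The reflog lives only in the local repository and is not transferred by fetch or pull, which is the limitation lamented above.

```shell
#!/bin/sh
# Sketch: a fast-forward leaves no commit behind, but the local reflog
# still records where the branch pointed before the merge.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name used below
# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The setting quoted above; later Git versions turn it on by default
# in repositories that have a working tree.
git config core.logAllRefUpdates true

git commit -q --allow-empty -m a
git checkout -q -b test
git commit -q --allow-empty -m b
git checkout -q master
git merge -q --ff-only test                  # fast-forward, no new commit

git reflog show master                       # one line per update of master
entries=$(git reflog show master | wc -l)
previous_tip=$(git log -1 --format=%s 'master@{1}')
echo "master was updated $entries times; previous tip was: $previous_tip"
```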
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 17:48 ` Linus Torvalds
@ 2006-11-07 7:59 ` Liu Yubao
2006-11-07 17:23 ` Linus Torvalds
2006-11-07 18:23 ` If merging that is really fast forwarding creates new commit Junio C Hamano
2006-11-07 11:46 ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Eran Tromer
1 sibling, 2 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 7:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
Linus Torvalds wrote:
>
> On Mon, 6 Nov 2006, Linus Torvalds wrote:
>> Besides, doing an empty commit like that ("I fast forwarded") literally
>> doesn't add any true history information. It literally views history not
>> as history of the _project_, but as the history of just one of the
>> repositories. And that's wrong.
>>
>> So just get used to it. You MUST NOT do what you want to do. It's stupid.
>
> Btw, absolutely the _only_ reason people seem to want to do this is
> because they want to "pee in the snow" and put their mark on things. They
> seem to want to show "_I_ did this", even if the "doing" was a total
> no-op and they didn't actually generate any real value.
We can leave out fake commits when calculating credit, and we can grep the
logs by author name to see what he or she has done.
A fake commit is only for digging into branch-scope history: I can *outline*
what has been merged into a branch without caring how that good work was
actually done.
>
> That's absolutely the last thing you want to encourage, especially when
> the end result is a history that is totally unreadable and contains more
> "junk" than actual real work.
>
> I'll be the first to say that "merging code" is often as important as
> actually writing the code in the first place, and that it is important to
> show who actually did real work to make a patch appear in a project.
>
> In the kernel, for example, we have "sign-off" lines to show what route a
> patch took before it was accepted, and it's very instructive to see (for
> example) how many patches give credit to somebody like Andrew Morton for
> passing it on versus actually writing the code himself (he has a lot of
> authorship credit too, but it's absolutely _dwarfed_ by his importance as
> a maintainer - and if you were to ask any random kernel developer why
> Andrew is so important, I can pretty much guarantee that his importance is
> very much about those "sign-offs", and not about the patches he authors).
>
> But at the same time, when it comes to merging, because it actually
> clutters up history a lot, we actively try to _avoid_ it. Many subsystem
> maintainers purposefully re-generate a linear history, rebased on top of
> my current kernel, exactly because it makes the history less "branchy",
> and because that makes things easier to see.
>
> So we have actually done work to _encourage_ fast-forwarding over "merge
> with a commit", because the fast-forwarding ends up generating a much more
> readable and understandable history. Generating a _fake_ "merge commit"
> would be absolutely and utterly horrible. It gives fake credit for work
> that wasn't real work, and it makes history uglier and harder to read.
>
> So it's a real NEGATIVE thing to have, and you should run away from it as
> fast as humanly possible.
>
> Now, the kernel actually ends up being fairly branchy anyway, but that's
> simply because we actually have a lot of real parallel development (I bet
> more than almost any other project out there - we simply have more commits
> done by more people than most projects). I tend to do multiple merges a
> day, so even though people linearize their history individually, you end
> up seeing a fair amount of merges. But we'd have a lot _more_ of them if
> people didn't try to keep history clean.
That's exactly the central semantics I mentioned: git tends toward, and
recommends, trunk-mode development *at a high level*. It's not a bad thing.
>
> Btw, in the absence of a merge, you can still tell who committed
> something, exactly because git keeps track of "committer" information in
> addition to "authorship" information. I don't understand why other
> distributed environments don't seem to do this - because separating out
> who committed something (and when) from who authored it (and when) is
> actually really really important.
Yes, agree.
>
> And that's not just because we use patches and other SCM's than just git
> to track things (so authorship and committing really are totally separate
> issues), but because even if the author and committer is the same person,
> it's very instructive to realize that it might have been moved around in
> history, so it might actually have been cherry-picked later, and the
> committer date differs from the author date even if the actual author and
> committer are the same person (but you might also have had somebody _else_
> re-linearize or otherwise cherry-pick the history: again, it's important
> to show the committer _separately_ both as a person and as a date).
>
> And because there is a committer field, if you actually want to linearize
> or log things by who _committed_ stuff, you can. Just do
>
> git log --committer=torvalds
>
> on the kernel, and you can see the log as it pertains for what _I_
> committed, for example. You can even show it graphically, although it
> won't be a connected graph any more, so it will tend to be very ugly
> (but you'll see the "linear stretches" when somebody did some work). Just
> do "gitk --committer=myname" to see in your own project.
>
> Linus
I want to separate out a branch, not to separate commits by author. For
example, many authors can contribute to git's master branch, and I want to
know what happened in the master branch, like this:
good work from A;
good work from C;
merge from next; -----> I don't care how this feature is realized.
good work from A;
....
As Junio points out, HEAD^1 is not always the tip of the working branch,
so "git log" can never satisfy me. There is the reflog, but it's not public.
BTW: I have great respect for anyone who contributes to Linux and GIT,
especially you :-)
* Re: how to show log for only one branch
2006-11-07 3:47 ` Liu Yubao
@ 2006-11-07 8:08 ` Jakub Narebski
0 siblings, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 8:08 UTC (permalink / raw)
To: Liu Yubao, Junio C Hamano; +Cc: git
Liu Yubao wrote:
> Jakub Narebski wrote:
>>
>> Perhaps what you want is git log --committer=<owner of repo>?
>>
> Thanks, it can't meet my requirement, if I create two branches
> and merge them, I can't easily tell the track of those two branches.
Use a graphical history viewer then: git-show-branch, gitk (Tcl/Tk),
qgit (Qt), the less used GitView (GTK+), tig (ncurses), or the least used
git-browser (JavaScript).
BTW, that is what the subject line (the first line of the commit message)
is for. Note the "gitweb:", "Documentation:", "autoconf:", "Improve build:"
prefixes in the git log.
By the way, what is the status of the proposed "note" header extension
to the commit object? One could store the name of the branch we were/are on,
even though this is absolutely discouraged...
--
Jakub Narebski
* Re: how to show log for only one branch
2006-11-07 2:21 ` Liu Yubao
@ 2006-11-07 8:21 ` Jakub Narebski
0 siblings, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 8:21 UTC (permalink / raw)
To: git
Liu Yubao wrote:
> Junio C Hamano wrote:
>> [...] It is also a bad
>> example because I can answer that question with this command
>> line:
>>
>> git log --grep='^Merge .* into next$' next
>>
>> and while it is a perfectly valid answer, I know it would leave
>> you feeling somewhat cheated.
>>
> smart trick, but if the logs aren't consistent enough it's hard to
> grep them out.
Well, commit messages for merges are generated automatically. And if you set
merge.summary=true in the repo config (or your user config), you get a
shortlog in the merge commit message by default...
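A sketch of that setting in a scratch repository follows. It uses `merge.log`, the name later Git versions give to the same option (`merge.summary` is the older spelling used in this message), and it assumes `git merge --no-ff` and `--no-edit` are available; the branch name `topic` and the commit subjects are made up for illustration.

```shell
#!/bin/sh
# Sketch: with merge.log enabled, the auto-generated merge message lists
# the one-line subjects of the commits being merged.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name used below
# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git config merge.log true                    # older name: merge.summary

echo 1 > file;  git add file; git commit -q -m "initial"
git checkout -q -b topic
echo 22 > file; git commit -q -am "topic: first change"
git checkout -q master
git merge -q --no-ff --no-edit topic         # keep the generated message

msg=$(git log -1 --format=%B master)
printf '%s\n' "$msg"
```

Because the message is generated, grepping for it (as in the `--grep` trick quoted above) is more reliable than it would be with hand-written merge messages.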
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 3:26 ` Liu Yubao
@ 2006-11-07 9:30 ` Andy Whitcroft
2006-11-07 12:05 ` Liu Yubao
0 siblings, 1 reply; 35+ messages in thread
From: Andy Whitcroft @ 2006-11-07 9:30 UTC (permalink / raw)
To: Liu Yubao; +Cc: Andreas Ericsson, Junio C Hamano, git
Liu Yubao wrote:
> Andreas Ericsson wrote:
>> Liu Yubao wrote:
>>
>> If "fake" commits (i.e., commits that don't change any content) are
>> introduced for each merge, it will change the ancestry graph and the
>> resulting tree(s) won't be mergable with the tree it merged with,
>> because each such "back-merge" would result in
>> * the "fake" commit becoming part of history
>> * a new "fake" commit being introduced
>>
>> Consider what happens when Alice pulls in Bob's changes. The
>> merge-base of Bob's tip is where Alice HEAD points to, so it results
>> in a fast-forward, like below.
>>
>> a---b---c---d <--- Alice
>>              \
>>               e---f---g <--- Bob
>>
>>
>> If, we would have created a fake commit instead, Alice would get a
>> graph that looks like so:
>>
>> a---b---c---d-----------h <--- Alice
>>              \         /
>>               e---f---g <--- Bob
>>
>>
>> Now, we would have two trees that are identical, because the merge
>> can't cause conflicts, but Alice and Bob will have reached it in two
>> different ways. When Bob decides he wants to go get the changes Alice
>> has done, his tree will look something like this:
>>
>> a---b---c---d-----------h <--- Alice
>>              \         / \
>>               e---f---g---i <--- Bob
>>
>>
>> He finds it odd that he's got two commits that, when checked out, lead
>> to the exact same tree, so he asks Alice to get his tree and see
>> what's going on. Alice will then end up with this:
>>
>> a---b---c---d-----------h---j <--- Alice
>>              \         / \ /
>>               e---f---g---i <--- Bob
>>
>>
>> Now there's four commits that all point to identical trees, but the
>> ancestry graphs differ between all developers. In the case above,
>> there's only two people working at the same project. Imagine the
>> amount of empty commits you'd get in a larger project, like the Linux
>> kernel.
>>
> Oh, you remind me, but I have a naive solution for this problem: print
> a hint and don't merge commits that contain fake commit, then I know I have
> reached a stable merge point and have same tree with others.
But in that situation you and Alice now have different actual history
DAGs in your repositories.
Alice sees:
a---b---c---d-----------h
             \         /
              e---f---g
Bob sees:
a---b---c---d-----------h
             \         / \
              e---f---g---i
If Bob now adds a new commit 'j' and Alice pulls it back, we either have
to accept 'i' at Alice's end or forever lose the identity of the commit
DAG. At that point our primary benefit, that the same SHA1 means the same
parents and the same commit for everyone, is gone. We can no longer say
"this commit is broken" and have everyone know which commit that is.
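The shared-identity property is easy to check in a scratch repository. The sketch below assumes a Git version with `git merge --ff-only`; the branch names `alice` and `bob` stand in for the two repositories in the diagrams, and the commit labels are placeholders.

```shell
#!/bin/sh
# Sketch: after a fast-forward, both sides name the very same commit,
# so a commit ID means the same thing to everyone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/alice       # start on a branch called alice
# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 1 > file;  git add file; git commit -q -m d
git checkout -q -b bob
echo 22 > file; git commit -q -am g
git checkout -q alice
git merge -q --ff-only bob       # refuses to do anything but fast-forward

alice_tip=$(git rev-parse alice)
bob_tip=$(git rev-parse bob)
echo "alice: $alice_tip"
echo "bob:   $bob_tip"
```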
>
> We create a fake commit for fast forwarding style merge, this fake commit
> is used to record the track of a branch, so we can always follow HEAD^1
> to travel through the history of a branch. In fact, git pays more attention
> to the history of *data modification* than history of *operation*, that is
> right the subtle difference between content tracker and VCS, latter's
> branch has more information(useful information, I think).
Any VCS is concerned with data modification and how it is tracked. There
are two ways you can record history: a series of snapshots (git) or a
series of operations (e.g. cvs and svn). Each has its trade-offs:
operations like diff cost O(number of files) on snapshots, while on deltas
they cost O(number of files * number of deltas).
The difference here is all about the interpretation of the word
'branch'. In CVS and others there is the hard concept of a mainline --
here is the master copy; when something is added here, it is "the one".
Branches are temporary places which contain 'different' history, such as
a patch branch. If you want something on both branches, you commit the
change twice, once to each. In git they are more like separate future
histories. When they are merged back together, the new single history
contains the changes in both; neither is more important than the other,
and both represent forward progress. People tend to draw it as below,
giving a false importance to the 'line' from d->h:
a---b---c---d-----------h
             \         /
              e---f---g
We probably should draw it as below: h's history contains all history
from both the 'up' and 'down' histories. Which is more important? Neither.
h is made up of a,b,c,d from Alice and e,f,g from Bob, merged by Alice.
              ---------
             /         \
a---b---c---d           h
             \         /
              e---f---g
>
> Even if no fake commit is created as git does now, there can be multiple
> commits with identical tree object, and git can't prevent you from merging
> two commits with identical tree object, it just creates an ancestry
> relation
> to remember the merge point.
>
> As git(7) says:
> The "commit" object is an object that introduces the notion
> of history into the picture. In contrast to the other objects,
> it doesn't just describe the physical state of a tree, it
> describes how we got there, and why.
>
> So it's clearer to describe a revision graph with nodes for tree
> objects and edges for commit objects(multiple edges for a merge
> commit object, I know this will break your habit:-).
How would such a graph look any different?
>> Fast-forward is a Good Thing and the only sensible thing to do in a
>> system designed to be fully distributed (i.e., where there isn't
>> necessarily any middle point with which everybody syncs), while
>> scaling beyond ten developers that merge frequently between each other.
>>
>>> If we throw away all compatibility, efficiency, memory and disk
>>> consumption
>>> problems,
>>> (1) we can get the track of a branch without reflog because HEAD^1 is
>>> always the tip of target branch(or working branch usually) before
>>> merging.
>>>
>>> (2) with the track, branch mechanism in git is possibly easier to
>>> understand,
>>> especially for newbies from CVS or Subversion, I really like git's
>>> light weight, simple but powerful design and great efficiency, but I
>>> am really
>>> surprised that 'git log' shows logs from other branches and a side
>>> branch can become part of main line suddenly.
>>>
>>> A revision graph represents fast forwarding style merging like this:
>>>
>>>                  (fast forwarding)
>>>  ---- a ............ * ------> master
>>>        \            /
>>>         b----------c -----> test (three commits with three trees)
>>>
>>> can be changed to:
>>>
>>>  ---- a (tree_1) ----------- d (tree_3) ------> master
>>>        \                    /
>>>         b (tree_2) ------- c (tree_3) ----> test
>>> (four commits with three trees, it's normal as more than one way can
>>> reach Rome :-)
>>>
>>
>> That's where our views differ. In my eyes, "d" and "c" are exactly
>> identical, and I'd be very surprised if the scm tried to tell me that
>> they aren't, by not giving them the same revid.
These two aren't identical. You have two different routes to Rome, so you
have two different lines on your map. To just say 'they' are the same
and throw one away is to throw away just the history you care about.
> It doesn't matter, they have same tree, and it's normal too in git
> multiple commits have same tree, if you use nodes for tree state,
> that graph will be simple to understand:
>
> a d
> -----tree_1 -------------- tree_3 ----> master
> \ / \
> \ b d/c `-----> test
> \ /
> `--- tree_2 ---'
>
> This is the familiar way we used in CVS, I believe there are more
> than one people confused by fast forwarding style merge and 'git log'
> in git.
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 7:27 ` Liu Yubao
@ 2006-11-07 9:46 ` Andy Whitcroft
2006-11-07 12:08 ` Liu Yubao
2006-11-07 16:05 ` Linus Torvalds
1 sibling, 1 reply; 35+ messages in thread
From: Andy Whitcroft @ 2006-11-07 9:46 UTC (permalink / raw)
To: Liu Yubao; +Cc: Linus Torvalds, Junio C Hamano, git
Liu Yubao wrote:
> Linus Torvalds wrote:
>>
>> On Mon, 6 Nov 2006, Liu Yubao wrote:
>>> Then, what bad *logical* problem will happen if a merging that is
>>> really a
>>> fast forwarding creates a new commit?
>>
>> You MUST NOT do that.
>>
>> If a fast-forward were to do a "merge commit", you'd never get into
>> the situation where two people merging each other would really ever
>> get a stable result. They'd just keep doing merge commits on top of
>> each other.
> They can stop merging a fake commit with a real commit that point to same
> tree object, here they reach a stable result: we have same tree content.
>>
>> Git tracks history, not "your view of history". Trying to track "your
>> view" is fundamentally wrong, because "your view" automatically means
>> that the project history would not be distributed any more - it would
>> be centralized around what _you_ think happened. That is not a
>> sensible thing to have in a distributed system.
> It's not my view, it's branch scope view, I can see how a branch evolves
> relatively independently. In git, branch scope view is more or less
> neglected.
> After fast forwarding merge, I can' tell where a branch come from -- I mean
> the track of a branch.
>
> If Junio publishes his reflog, I don't see what conflict will happen
> between
> his local view (but now public, and naming it branch scope view seems more
> sensible) and git's global view.
>
> If this won't lead to problems, it seems also ok to use fake commit for
> fast forwarding style merge, so we can follow HEAD^1 to travel through a
> branch without reflog.
>
> I hope I have expressed my thought clearly.
>>
>> For example, the way to break the "infinite merges" problem above is
>> to say that _you_ would be special, and you would do a "fast-forward
>> commit", and the other side would always just fast-forward without a
>> commit. But that is very fundamentally against the whole point of
>> being distributed. Now you're special.
> No one is special as everybody can create fake commit, any branch (almost
> a tag) will never be overwritten to point to a commit object in
> another branch, branches are relatively independent, that's to say
> 'git log' will reflect what has happened really in current branch (a CVS
> semantical branch, not only a tag that always points to a tip commit).
>>
>> In fact, even for "you", it would be horrible - because you personally
>> might have 5 different repositories on five different machines. You'd
>> have to select _which_ machine you want to track. That's simply
>> insane. It's a totally broken model. (You can even get the same
>> situation with just _one_ repository, by just having five different
>> branches - you have to decide which one is the "main" branch).
> What's the mean of upstream branch then? I have to know I should track
> Junio's public repository.
>
> When does one say two branches reach a common point? have same commit(must
> point to same tree) or have same tree(maybe a fake commit and a real
> commit)?
> I think git takes the first way.
>
> Fast forwarding style merge tends to *automatically* centralize many
> branches, in CVS people merge two branches and drop side branch to
> centralize them, they all have central semantics.
> (I don't want to get flame war between CVS/SVN and GIT, I think
> git is better than them really:-)
>>
>> Besides, doing an empty commit like that ("I fast forwarded")
>> literally doesn't add any true history information. It literally views
>> history not as history of the _project_, but as the history of just
>> one of the repositories. And that's wrong.
> Something like 'git log --follow-all-parent' can show history of the
> project
> as 'git log' does now.
>>
>> So just get used to it. You MUST NOT do what you want to do. It's stupid.
> Yes, I have understood the git way and am getting used to it, I like
> its simple but powerful design and great efficiency, thank all for your
> good work!
>>
>> If you want to track the history of one particular local branch, use
>> the "reflog" thing. It allows you to see what one of your local
>> branches contained at any particular time.
>>
>> See
>>
>> [core]
>> logAllRefUpdates = true
>>
> Thanks, it's a pity I can't pull Junio's reflog :-(
One thing to remember: when you merge, the destination into which you
merge will be HEAD^1, so by just following that you can get Junio's view
of his branch as he made it.
This doesn't terminate properly, hurts the performance of your
machine, and generally should be erased rather than run; but you get the
idea:
let n=0
while git-show --pretty=one -s "next~$n"
do
let "n=$n+1"
done | less
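Later Git versions can do the same first-parent walk in a single command, `git log --first-parent` (an option that did not exist when this was written). As a sketch, the script below rebuilds the graph from the first message of the thread in a scratch repository, using empty commits labelled a through g, and lists only the first-parent chain of master.

```shell
#!/bin/sh
# Sketch: follow only HEAD^1 at every merge, using the graph from the
# original question (master: a b c f g, test: a d e, f merges test).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # pin the branch name used below
# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mk() { git commit -q --allow-empty -m "$1"; }   # empty commit labelled $1

mk a
git branch test                  # test forks from a
mk b; mk c                       # master: a-b-c
git checkout -q test
mk d; mk e                       # test:   a-d-e
git checkout -q master
git merge -q -m f test           # f: a true merge, first parent is c
mk g

first_parent=$(git log --first-parent --format=%s master | tr '\n' ' ')
total=$(git rev-list --count master)
echo "first-parent chain: $first_parent"
echo "commits reachable from master: $total"
```

The chain printed is g, f, c, b, a, which is the listing asked for at the start of the thread, while d and e remain reachable from master and still show up in a plain `git log`.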
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-06 17:48 ` Linus Torvalds
2006-11-07 7:59 ` Liu Yubao
@ 2006-11-07 11:46 ` Eran Tromer
1 sibling, 0 replies; 35+ messages in thread
From: Eran Tromer @ 2006-11-07 11:46 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
Hi Linus,
On 2006-11-06 19:48, Linus Torvalds wrote:
>
> On Mon, 6 Nov 2006, Linus Torvalds wrote:
>> Besides, doing an empty commit like that ("I fast forwarded") literally
>> doesn't add any true history information. It literally views history not
>> as history of the _project_, but as the history of just one of the
>> repositories. And that's wrong.
>
> Btw, absolutely the _only_ reason people seem to want to do this is
> because they want to "pee in the snow" and put their mark on things. They
> seem to want to show "_I_ did this", even if the "doing" was a total
> no-op and they didn't actually generate any real value.
In a project that uses topic branches extensively, the merge-induced
commits give a useful cue about the logical grouping of patches. They
let you easily glean the coarse-grained history and independent lines of
work ("pickaxe made it to next", "Linus got the libata updates") without
getting bogged down by individual commits, just by looking at the gitk
graph. Fast-forwards lose this information, and the more you encourage
them, the less grokkable history becomes.
Empty commits may be the wrong tool to address this (for all the reasons
you gave), but there's certainly useful process information that's
currently being lost.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 9:30 ` Andy Whitcroft
@ 2006-11-07 12:05 ` Liu Yubao
2006-11-07 12:17 ` Jakub Narebski
0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 12:05 UTC (permalink / raw)
To: Andy Whitcroft; +Cc: Andreas Ericsson, Junio C Hamano, git
Andy Whitcroft wrote:
> Liu Yubao wrote:
> But in that situation you and Alice now have different actual history
> DAG's in your repositories.
>
> Alice sees:
>     a---b---c---d-----------h
>                  \         /
>                   e---f---g
>
> Bob sees:
>     a---b---c---d-----------h
>                  \         / \
>                   e---f---g---i
>
>
> If bob now adds a new commit 'j' and alice pulls it back we either have
> to then accept 'i' at alice's end or forever lose the identicality of
> the commit DAG. At which point our primary benefit of the SHA1 ==
> parent == same commit for everyone is gone. We can no longer say "this
> commit is broken" and everyone know which commit that is.
>
Alice and Bob each have their own branch scope view. They have two
different branches, and their DAGs in *branch scope view* can differ
because they trace the history from different points.
In branch scope view, you see only one HEAD, which merges changes from
other branches. Each branch has its own commit DAG.
In global scope view, you see many HEADs that fork and merge frequently.
There is only one big commit DAG, but you can never see the whole of it,
because branches can be distributed all over the world.
A fake commit doesn't break the DAG in global scope view: it has parents
like a normal commit, although the fake commit and its parent point to the
same tree. In fact, git already has such commits:
    a (tree_1) ------- b (tree_2) ---- d (tree_2) ---> master
     \                                   /
      `--------------- c (tree_2) ------' -----> test
If you don't pull from others, your global DAG can differ from theirs,
which is obviously normal. It doesn't matter that the DAGs differ in branch
scope; of course they are different.
The problem is that git can't give you a branch *track* from the global
scope view: you can't tell which commits a branch has *referred to*. Note
that following HEAD^1 isn't right, as Junio pointed out
(http://marc.theaimsgroup.com/?l=git&m=116279354214757&w=2).
A branch track is useful: people have requested the reflog feature
(implemented, but only for local use) and a "note" extension in the commit
object.
If you have a commit A that I haven't pulled, I can't know what you
refer to when you say "Commit A introduced a bug". I must know where
to get this commit. After I pull it from the other branch, we can say "this
commit is broken" and everyone knows which commit that is.
>> We create a fake commit for fast forwarding style merge, this fake commit
>> is used to record the track of a branch, so we can always follow HEAD^1
>> to travel through the history of a branch. In fact, git pays more attention
>> to the history of *data modification* than history of *operation*, that is
>> right the subtle difference between content tracker and VCS, latter's
>> branch has more information(useful information, I think).
>
> Any VCS is concerned with data modification and how its tracked. There
> are two ways you can record history. A series of snapshots (git) or a
> series of operations (eg cvs and svn). Each has its trade offs,
> operations like diff on snapshots is O(number of files), on diffs they
> are O(number of files * number of deltas).
>
> The difference here is all about the interpretation of the word
> 'branch'. In CVS and others there is the hard concept of a mainline --
> here is the master copy when something is added here it is "the one",
> branches are temporary places which contain 'different' history such as
> a patch branch. You want something on both branches you commit the
> change twice once to each. In git they are more separate future
> histories. When they are merged back together the new single history
> contains the changes in both, neither is more important than the other
> both represent forward progress. People tend to draw as below giving a
> false importance to the 'line' from d->h:
>
>     a---b---c---d-----------h
>                  \         /
>                   e---f---g
>
> We probably should draw the below, h's history contains all history
> from both 'up' and 'down' histories. Which is more important? Neither.
> h is made up of a,b,c,d from alice and e,f,g from bob merged by alice.
>
>                   ---------
>                  /         \
>     a---b---c---d           h
>                  \         /
>                   e---f---g
>
>
If fake commits are introduced, a possible revision graph looks like this:
    a - * -- c ------- * ---> branchA
     \ /      \       /
      b ------ * ---- d ---> branchB    ('*' stands for fake commit)
It's indeed not as pretty as the linear revision graph that git's
fast-forward merges create, but it records the tracks of both branches,
which can be followed via HEAD^1.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 9:46 ` Andy Whitcroft
@ 2006-11-07 12:08 ` Liu Yubao
2006-11-07 13:15 ` Andy Whitcroft
0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 12:08 UTC (permalink / raw)
To: Andy Whitcroft; +Cc: Linus Torvalds, Junio C Hamano, git
Andy Whitcroft wrote:
>
> One thing to remember, when you merge the destination into which you
> merge will be HEAD^1, so by just following that you can get junio's view
> of his branch as he made it.
>
> This doesn't terminate properly, sucks the performance of your
> machine and generally should be erased rather than run; but you get the
> idea:
>
> let n=0
> while git-show --pretty=one -s "next~$n"
> do
> let "n=$n+1"
> done | less
>
> -apw
>
This is not the right way to view a branch track in git; see Junio's explanation
about this at http://marc.theaimsgroup.com/?l=git&m=116279354214757&w=2
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 12:05 ` Liu Yubao
@ 2006-11-07 12:17 ` Jakub Narebski
0 siblings, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 12:17 UTC (permalink / raw)
To: git
Liu Yubao wrote:
[...]
I think everything stems from the fact that git repositories which pull/push
with each other _share_ [parts of] the DAG. Learn to live with it, or choose
a different SCM.
You want a branch to be a path through the DAG, not only a lineage sub-DAG...
but recording this information is, I think, costly.
Note also that the pointers to DAG branches can be named differently in
different repositories (e.g. 'master' in one repository might be 'origin'
in the other, and 'remotes/origin/master' in yet another).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 12:08 ` Liu Yubao
@ 2006-11-07 13:15 ` Andy Whitcroft
0 siblings, 0 replies; 35+ messages in thread
From: Andy Whitcroft @ 2006-11-07 13:15 UTC (permalink / raw)
To: Liu Yubao; +Cc: Linus Torvalds, Junio C Hamano, git
Liu Yubao wrote:
> Andy Whitcroft wrote:
>>
>> One thing to remember, when you merge the destination into which you
>> merge will be HEAD^1, so by just following that you can get junio's view
>> of his branch as he made it.
>>
>> This doesn't terminate properly, sucks the performance of your
>> machine and generally should be erased rather than run; but you get the
>> idea:
>>
>> let n=0
>> while git-show --pretty=one -s "next~$n"
>> do
>> let "n=$n+1"
>> done | less
>>
>> -apw
>>
> This is not a right way to view a branch track in git, see Junio's
> explanation
> about this from http://marc.theaimsgroup.com/?l=git&m=116279354214757&w=2
Well, in fact that message tells us more about why a branch-centric view is
likely not useful. Most of the time this output is still the view from
the branch integrator's point of view. If that is something
you care about, I am not sure it is something I care about.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 7:27 ` Liu Yubao
2006-11-07 9:46 ` Andy Whitcroft
@ 2006-11-07 16:05 ` Linus Torvalds
2006-11-07 16:39 ` Jakub Narebski
2006-11-07 21:37 ` If merging that is really fast forwarding creates new commit Junio C Hamano
1 sibling, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-07 16:05 UTC (permalink / raw)
To: Liu Yubao; +Cc: Junio C Hamano, git
On Tue, 7 Nov 2006, Liu Yubao wrote:
>
> > If a fast-forward were to do a "merge commit", you'd never get into the
> > situation where two people merging each other would really ever get a stable
> > result. They'd just keep doing merge commits on top of each other.
>
> They can stop merging a fake commit with a real commit that point to same
> tree object, here they reach a stable result: we have same tree content.
That's flawed for two reasons:
 - identical trees is meaningless. You can have identical trees that had
   different histories and just happened to end up in the same state, and
   you'd still generate a merge commit (because what merges do is show the
   history of the data, and the _history_ merges).
   So you're really just introducing a special case, and not even one that
   makes any sense. Either history matters, or it doesn't.
 - a distributed system fundamentally means that nobody is "special". And
   a merge is a _joining_ of two threads. Neither of which is special.
   Let's say that we have
        A: a -> b -> c -> d
        B: a -> b -> c
   and B pulls. You think that it should result in
        B: a -> b -> c ---> e
                      \   /
                       > d
   and I say that that is crazy, because in a distributed system, A and B
   are _equivalent_ and have the same branches, and tell me what would
   have happened if _A_ had pulled from _B_ instead?
   That's right: if A had pulled from B, then obviously nothing at all
   would happen, because A already had everything B had.
   So the only _logical_ thing to happen is that the end result doesn't
   depend on who merged. And that means that if B merged from A, then the
   end result _has_ to be the same as if A merged from B, namely
        B: a -> b -> c -> d
   and nothing else. Anything else is insane. It's not a distributed
   system any more.
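The symmetry argument can be checked in a scratch setup. This is a sketch under assumed names (repositories `A` and `B`, file `f.txt`); `--no-ff` is the option current Git offers to force a merge commit where a fast-forward would do, which is roughly the behaviour being argued against here.

```shell
#!/bin/sh
# Sketch: B pulls from A when B has nothing new, first as a plain
# fast-forward, then (for contrast) with a forced merge commit.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

git init -q A
cd A
git symbolic-ref HEAD refs/heads/master
git config user.name "A"; git config user.email "a@example.invalid"
for msg in a b c; do
    echo "$msg" >> f.txt
    git add f.txt
    git commit -q -m "$msg"
done
cd ..

git clone -q A B                 # B: a -> b -> c
cd A
echo d >> f.txt
git commit -q -a -m d            # A: a -> b -> c -> d
cd ../B
git config user.name "B"; git config user.email "b@example.invalid"

git pull -q --ff-only origin master     # B pulls: plain fast-forward
tip_a=$(git -C ../A rev-parse master)
tip_b=$(git rev-parse master)
echo "A tip: $tip_a"
echo "B tip: $tip_b"             # same commit id on both sides

# For contrast: rewind B and force a merge commit 'e' on top of c and d.
git reset -q --hard HEAD~1
git merge -q --no-ff --no-edit "$tip_a"
tip_noff=$(git rev-parse master)
second_parent=$(git rev-parse 'master^2')
echo "B tip with --no-ff: $tip_noff"    # a commit that A does not have
```

After the fast-forward both repositories name the same commit; after the forced merge, B's tip is a new commit that exists only in B, so a later pull in the other direction would move A as well.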
> > Git tracks history, not "your view of history". Trying to track "your view"
> > is fundamentally wrong, because "your view" automatically means that the
> > project history would not be distributed any more - it would be centralized
> > around what _you_ think happened. That is not a sensible thing to have in a
> > distributed system.
>
> It's not my view, it's branch scope view, I can see how a branch evolves
> relatively independently.
No you CAN NOT. You think that "A" is special. But because you think that
A is special, you ignore that B had the exact same branch, so your "branch
scope view" is inherently flawed - it's not "branch scope" at all, it's
literally a "one person is special" view.
> In git, branch scope view is more or less neglected. After fast
> forwarding merge, I can't tell where a branch comes from -- I mean the
> track of a branch.
Sure you can. In your reflog. It's only _you_ who care about _your_
history. Nobody else cares one whit about what your tree looks like.
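A minimal sketch of what the reflog records, using a scratch repository with made-up branch and file names. `core.logAllRefUpdates` is the setting quoted earlier in the thread; current Git turns it on by default in repositories with a working tree, so setting it here is only for explicitness.

```shell
#!/bin/sh
# Sketch: a fast-forward leaves no trace in the commit graph, but the
# local reflog of the branch remembers where it pointed before.
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"
git config core.logAllRefUpdates true

echo one > f.txt; git add f.txt; git commit -q -m "first"
first=$(git rev-parse HEAD)
git checkout -q -b topic
echo two >> f.txt; git commit -q -a -m "second"
git checkout -q master
git merge -q --ff-only topic       # fast-forward: no new commit is created

git reflog show master             # one line per update of 'master'
previous=$(git rev-parse 'master@{1}')
entries=$(git reflog show master | wc -l | tr -d ' ')
```

The reflog lives only in this repository; it is not transferred by fetch, pull or push.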
> If Junio publishes his reflog, I don't see what conflict will happen between
> his local view (but now public, and naming it branch scope view seems more
> sensible) and git's global view.
Why would anybody ever care about Junio's reflog?
Also, you're ignoring the issue that both I and Martin mentioned: you're
making history harder to read, and adding crud that doesn't actually _do_
anything. Your approach is nonsensical from a distributed system
standpoint, but it's also _worse_ than just fast-forwarding. If git did
what you suggested, we'd have a lot of extra merge commits that simply
don't _help_ anything, and only make things worse.
> What's the meaning of upstream branch then? I have to know I should track
> Junio's public repository.
"Upstream" really should have absolutely zero meaning. That's the whole
point of distributed. You can merge things sideways, down, up, and the end
result doesn't matter. "upstream" can merge from you, and you can merge
from him. That's the _technology_.
The only thing that matters is "trust". But trust is not something you get
from technology, and trust is something you have to earn. And trust does
NOT come from digital signatures like some people believe: digital
signatures are a way of _verifying_ the trust you have, but they are very
much secondary (or tertiary) to the real issues.
And _trust_ is why you'd pull from Junio. Git makes it somewhat easier by
giving you default shorthands for the original place you cloned from when
you clone a new repository, because often you'd obviously keep trusting
the same source, but an important thing here is to realize that it really
is "often". Not always. And it's not about technology.
> When does one say two branches reach a common point? have same commit(must
> point to same tree) or have same tree(maybe a fake commit and a real commit)?
> I think git takes the first way.
Very much so. To git, the only (and I really mean _only_) thing that
matters from a commit history view is the commit relationships. NOTHING
else. What the trees are doesn't matter at all. Where the commits came
from doesn't matter. Who made them doesn't matter either - those are just
"documentation".
So the _only_ thing that matters for a commit is what its place in history
was. We never even look at the trees at all to decide what to do about
merging. The only time the trees start to matter is when we've figured out
what the merge relationship is, and then obviously the trees matter, but
even then they only matter as far as the resulting _tree_ is concerned.
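To make the distinction concrete, here is a small sketch (scratch repository, made-up branch and file names) in which two branches reach the same tree through different commits. The tree ids match, the commit ids do not, and a merge still records a merge commit with two parents.

```shell
#!/bin/sh
# Sketch: identical trees, different commits, and the merge that joins them.
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > f.txt; git add f.txt; git commit -q -m "base"
git checkout -q -b test
echo change >> f.txt; git commit -q -a -m "change made on test"
git checkout -q master
echo change >> f.txt; git commit -q -a -m "same change made on master"

tree_master=$(git rev-parse 'master^{tree}')
tree_test=$(git rev-parse 'test^{tree}')
commit_master=$(git rev-parse master)
commit_test=$(git rev-parse test)
echo "trees:   $tree_master $tree_test"       # identical
echo "commits: $commit_master $commit_test"   # different

# Neither commit is an ancestor of the other, so this cannot fast-forward.
git merge -q --no-edit test
merge_tree=$(git rev-parse 'HEAD^{tree}')
second_parent=$(git rev-parse 'HEAD^2')
```

The merge result has the same tree as both of its parents; what it adds is purely the record that the two histories were joined.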
> Fast forwarding style merge tends to *automatically* centralize many
> branches
Yes. Except I wouldn't say "centralize", I would very much say "join".
That's the point of a merge. Two commit histories "join" and become one.
But the reason I don't agree with your choice of wording ("centralize")
is fundamental:
- it only happens on one side. The side that does the merge is not
necessarily the "central" one at all.
- there isn't necessarily even such a thing as a "central" branch in git
(and there _shouldn't_ be).
In fact, the thing I absolutely _detest_ about CVS is how it makes it
almost impossible to have multiple "equally worthy" branches. Look at the
git repository itself that Junio maintains, and please tell me which is
the "trunk" branch?
Git doesn't even have that concept. There is the concept of a _default_
branch ("master"), and yes, the git repository has it. But at the same
time, it really is just a default. There are three "main" branches that
Junio maintains, and they only really differ in the degree of development.
And "master" isn't even the most stable one - it's just the default one,
because it's smack dab in the middle: recent enough to be interesting, but
still stable enough to be worth tracking for just about anybody.
But really, "maint" is the stable branch, and in many ways you could say
that "maint" is the trunk branch, since that's what Junio still cuts
releases from. And "next" is the development branch, that gets interesting
features before they hit the "master" branch (and "pu" is so far out that
it's a whole different issue, since it jumps around and doesn't even
become a real history at all).
See? All of these are _equal_. There is no trunk. There is no "central"
branch, and if you were to have to decide which one is the most central
one, it's not even the default one, that would probably be "maint", since
that's the one that keeps getting merged into the other branches.
So doing a merge doesn't really "centralize" anything. It just joins the
two development threads together in that particular line. If "master"
merges the work in "maint", master doesn't really get any more
centralized, it just gets the work that "maint" did since last time. And
if there was no other work done at all, then the two branches end up 100%
identical - there was no "merge" of the work.
They still have their own identities, though. It's still two branches.
It's still "maint" and "master". They just have the exact same state, and
that is as it should be, since they've had the exact same development
history.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 16:05 ` Linus Torvalds
@ 2006-11-07 16:39 ` Jakub Narebski
2006-11-07 21:37 ` If merging that is really fast forwarding creates new commit Junio C Hamano
1 sibling, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 16:39 UTC (permalink / raw)
To: git
Linus Torvalds wrote:
> So doing a merge doesn't really "centralize" anything. It just joins the
> two development threads together in that particular line. If "master"
> merges the work in "maint", master doesn't really get any more
> centralized, it just gets the work that "maint" did since last time. And
> if there was no other work done at all, then the two branches end up 100%
> identical - there was no "merge" of the work.
By the way, merges happen in _two_ directions. 'Master' merges from 'next'
when 'next' is in sufficiently stable state; 'next' merges from 'master' to
get changes which were considered stable enough to be put into
'master' (and 'master' merges in from 'maint', too).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
2006-11-07 7:59 ` Liu Yubao
@ 2006-11-07 17:23 ` Linus Torvalds
2006-11-07 18:23 ` If merging that is really fast forwarding creates new commit Junio C Hamano
1 sibling, 0 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-07 17:23 UTC (permalink / raw)
To: Liu Yubao; +Cc: Junio C Hamano, git
On Tue, 7 Nov 2006, Liu Yubao wrote:
>
> Fake commit is only for digging branch scope history, I can *outline* what has
> been merged to a branch and don't care about how these good work are done on
> earth.
The thing is, I think you see a good thing ("outlining"), and miss all the
downsides ("extra noise", "incorrect outlining").
Yes, I can see it being useful for reading logs in a perfect world.
However, in real life, more than half of my fast-forwards are just me
tracking another branch. An "outline" would be _wrong_. I _want_ to
fast-forward, because I'm moving the trees from one machine to another,
and the reason it's a fast-forward is exactly the fact that absolutely
zero work had been done on the machine I'm pulling from - I'm pulling just
to keep up-to-date.
So now, just to keep things sane, your scheme would require that people
AHEAD OF TIME tell the system whether they want to fast-forward or whether
they want to create a magic merge commit as a "outlining" marker.
See? Fast-forwarding is absolutely the right thing to do in 99% of all
cases. For me, it's perhaps only half, because I do several true merges
every day, but that's really quite unusual - I'm the top-level maintainer.
Nobody else should EVER do it.
And the thing is, I refuse to work with a system that makes one person
special. I _know_ I'm special, I'm the smartest, most beautiful, and just
simply the best person on the planet. I don't need a tool that tells me
so.
So deep down, what you're really suggesting that there be a special mode
that is ONLY ever used for the top-level maintainer, so that he can create
an "outline" in the history.
Put that way, it almost makes sense, until you realize that 99.9% of all
people aren't top-level maintainers, and you don't want them creating crap
like that. And that "outlining" is likely most easily done with
    ( git log lastversion.. | git shortlog ;
      git diff --stat --summary lastversion.. ) | less -S
instead.
But more importantly, I don't personally like the "top-level maintainer"
model. Yes, it's how people do end up working a lot, but quite frankly,
I'd rather not have the tool support it, especially if there is ever a
schism in a development process. I want to support _forking_, which very
much implies having somebody pulling the "wrong way".
Time for some purely philosophical arguments on why it's wrong to have
"special people" encoded in the tools:
I think that "forking" is what keeps people honest. The _biggest_ downside
with CVS is actually that a central repository gets so much _political_
clout, that it's effectively impossible to fork the project: the
maintainers of a central repo have huge powers over everybody else, and
it's practically impossible for anybody else to say "you're wrong, and
I'll show how wrong you are by competing fairly and being better".
For example, gcc (and other tools) have gone through this phase. You've
had splinter groups (eg pgcc) that did a hell of a lot better work than
the main group, and the tools really made it really hard for them to make
progress. I think the most important part of a distributed SCM is not even
to support the "main trunk", but to support the notion that anybody can
just take the thing and compete fairly.
With the kernel as an example, any group could literally just start their
own kernel git tree, and git should make it as easy as humanly possible
for them to track my tree WHILE _THEY_ STILL REMAIN IN CHARGE of their own
tree. That doesn't mean that forking is easy - over the years people have
simply grown so _used_ to me that they mostly trust me and they are comfy
working with me, because even if I've got my quirks (or "major personality
disorders" as some people might say), people mostly know how to work with
them.
But the point is, there should be no _tool_ issues. As far as git is
concerned, every single developer can feel like he is the top-level
maintainer - it doesn't have to be a hierarchy, it really can be a
"network of equal developers". I want the _tool_ to have that world-view,
even if most projects in the end tend to organize more hierarchically than
that. Because the "everybody is equal" worldview actually matters in the
only case that _really_ matters: when problems happen.
For example: I use git to maintain a few other projects I've started too.
I use git to maintain git itself, but I'm no longer the maintainer, simply
because I think it's a lot better to step down than stand in the way of
somebody better, and because I think it's hard to be the "lead person" on
multiple projects.
The same thing is happening to "sparse", which was dormant for a while (it
worked, and I fixed problems as people reported them, but it did
everything I had set out to do, so my motivation to develop it further had
just gone down a lot). What happened? Somebody else came along, showed
interest, started sending me patches, and I just suggested he start his
own tree and start maintaining it.
Now, both of those transitions were very peaceful, but it should work that
way even if the maintainer were to fight tooth and nail to hold on to his
"top dog" status. And that's where it's important that the tool not
separate out "top maintainers" from "other people".
> I want to separate a branch, not to separate commits by some author, for
> example, many authors can contribute to git's master branch, I want to
> know what happened in the master branch like this:
> good work from A;
> good work from C;
> merge from next; -----> I don't care how this feature is realized.
> good work from A;
Really, "git log | git shortlog" will come quite close. I use it all the
time for the kernel, and it's powerful.
Try it with the kernel archive, just for fun. Do
    git log v2.6.19-rc4.. | git shortlog | less -S
with the current kernel, and see how easy it is to get a kind of feel for
what is going on. We do it by two means:
 - sorting by author.
   This sounds silly, but it's actually very powerful. It's not so much
   that it credits people better (it does) or that it makes the logs
   shorter by mentioning the person just once (it does that too), it's
   really nice because people tend to automatically do certain things. One
   person does "random cleanups". Another one works on "networking". A
   third one maintains one particular architecture, and so on..
 - encourage people to have a "topic: explanation" kind of top line of the
   commit (and encourage people to have that "summary line" in the first
   place: not every SCM does that, and everybody else is strictly much
   worse than git)
In fact, when I do this, I usually _remove_ the merges, because they end
up being just noise. Really: go and look at the current kernel repo, and
do the above one-liner, and realize that I have a hunking big set of
commits credited to me right now (it says 30 commits), and in fact I think
I'm the #1 author right now on that list.
But when I send out the description, I actually use the "--no-merges" flag
to "git log", because those merge messages are _useless_. They really
don't do anything at all for me, or for anybody else. Re-run the above
one-liner that way, and suddenly I drop to just 5 commits (and quite
often, I'm much less - sometimes the _only_ commit I have for an -rc
release is the commit that changes the version number). But it's actually
more readable.
So I can kind of see what you want, but I'm 100% convinced that the
information you _really_ want is better done totally differently.
So if you want to get the "big picture" thing, git does actually support
you in several ways. That "git shortlog" is very useful, but so is the
"drill down by subsystem". For example, you could do
    git log --no-merges v2.6.19-rc4.. arch/ | git shortlog | less -S
and you'd get the "summary view" of what happened in architecture-
specific code. It's not the same thing as the "merge log", but it's
actually very useful.
(You can do the same with git. Something like
    git log --no-merges v1.4.3.4.. | git shortlog | less -S
shows quite clearly that a lot of new stuff is gitweb-related, for
example.)
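The effect of `--no-merges` and of a path limit can be tried on a toy history. This is only a sketch: the tag `v0`, the author names and the `arch/` files are invented for the example and stand in for a real release tag and subsystem directory.

```shell
#!/bin/sh
# Sketch: per-author summaries since a tag, with and without merges,
# and restricted to one directory.
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Maintainer"
git config user.email "maintainer@example.invalid"

echo base > README; git add README; git commit -q -m "docs: add README"
git tag v0

# A contributor's topic branch, recorded under a different author.
git checkout -q -b topic
mkdir arch
echo x > arch/x.c; git add arch/x.c
git commit -q --author="Contributor <contrib@example.invalid>" -m "arch: add x"
echo y > arch/y.c; git add arch/y.c
git commit -q --author="Contributor <contrib@example.invalid>" -m "arch: add y"

git checkout -q master
echo more >> README; git commit -q -a -m "docs: expand README"
git merge -q --no-edit topic       # a true merge, credited to Maintainer

git log v0.. | git shortlog        # summary grouped by author
with_merges=$(git rev-list --count v0..)
without_merges=$(git rev-list --count --no-merges v0..)
arch_authors=$(git log --no-merges --pretty=format:%an v0.. -- arch/ | sort -u)
```

In this toy history the merge commit is the only thing that separates the two counts, and the path-limited log credits only the author who actually touched `arch/`.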
Could we do better "reporting" tools? I'm absolutely sure we could. It
might be interesting to be able to ignore not just commits, but "trivial
patches" too. For example, if you're looking for what changed on a high
level, you're not likely to care about patches that change just a few
lines. You might want to see only the commits that change an appreciable
fraction of code, and so it might be very interesting to have a "git
shortlog" that would take patch size into account, for example.
So I'm not saying that git is perfect. I'm just saying that there are
better ways (with much fewer downsides) to get what you want, than the way
you _think_ you want.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: If merging that is really fast forwarding creates new commit
2006-11-07 7:59 ` Liu Yubao
2006-11-07 17:23 ` Linus Torvalds
@ 2006-11-07 18:23 ` Junio C Hamano
1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 18:23 UTC (permalink / raw)
To: Liu Yubao; +Cc: git
Liu Yubao <yubao.liu@gmail.com> writes:
> I want to separate a branch, not to separate commits by some author,
> for example, many authors can contribute to git's master branch, I
> want to
> know what happened in the master branch like this:
> good work from A;
> good work from C;
> merge from next; -----> I don't care how this feature is realized.
> good work from A;
> ....
So you want to see list of commits that happened to be at the
tip of my 'master' branch. I would not say that view does not
exist, but it is probably not very useful. And the uselessness
of it depends majorly on the reason why you say "I don't care
how this feature is realized" in the above picture. Care to
elaborate why not?
side note: I do not merge next to master so "from next" above in
reality would be "from a topic branch" or "from maint", but it
is otherwise a good example.
What appeared in 'master' recently are three kinds of changes:
- Many fixes that still apply to 1.4.3 codebase were sent from
the list (thanks, everybody!), which were applied to 'maint',
and merged into 'master'.
- Some other obviously correct fixes and changes that address
issues on features added after the 1.4.3 release (hence
missing from 1.4.3 codebase and 'maint' but in 'master') were
applied directly on 'master'.
- Yet some other fixes and changes that concern post-1.4.3
codebase (i.e. 'master only' changes) were forked off of the
tip of 'master' when the patches were received, cooked in
their own topic branches (which were merged in 'next'), and
then merged into 'master'.
So, we have two kinds of obviously correct changes to 'master'
that come both from merges and direct applications. Things that
happen to address older issues come as merges because they
equally apply to 'maint' and merged into 'master', things that
address newer issues are applied directly. Put it another way,
things that come as merges to 'master' are also of two kinds.
Obviously correct one that came through 'maint', and the ones
that might have looked slightly wrong in the initial version and
later perfected while in its own topic branch and then merged
into 'master'.
The decision between cooking in a topic branch and immediately
applying to 'master' is not based on the size but more on
perceived usefulness of the change (something that is correct in
the sense that it does not break the system may not deserve to
be merged if it does not do useful things) and quality of the
design and implementation. The size of the series obviously
affects my perception, but that is secondary.
Even when a patch is something that I should be able to judge as
obviously correct when I am relaxed and sane, I might lack time
and concentration to follow it fully, and instead decide to drop
it into its own topic branch and later merge it into 'master'
without need for much cooking. That kind of patch _could_ have
(and should have) been applied directly to 'master' but comes as
a merge.
Sometimes I apply a patch to 'master' and then later realize
that change is needed and applicable to 'maint' as well. That
is cherry-picked to 'maint', resulting in two independent
commits. They _could_ have (and should have) come through a
merge from 'maint' to 'master'.
So the change a patch introduces itself may not even have
relevance to the difference between direct application and merge
at all. In other words, the avenue a particular patch took,
difference between direct application and merge, should not
concern you. I hope this would illustrate why a view that tries
to summarize what merges brought in and to give full description
of what were applied directly does not make much sense.
By the way, there are two reasons why you cannot have my
ref-logs. First of all, I do not have one on 'master' nor
'next' myself. More importantly, I rewind and rebuild these
branches before pushing out (of course I have a safety valve
to prevent me from rewinding beyond what I have already pushed
out), and the ref-log entries for those tips that were rewound
are not useful to you, and are something I would rather not have
people even know about (think of it as giving me some
privacy).
If you really care about the branch tip history of my
repository, you can set up a ref-log yourself on your remote
tracking branch.
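A minimal sketch of what that could look like, assuming a Git
version that has the core.logAllRefUpdates setting and the
"git reflog" command. The two temporary repositories and all
names in them are hypothetical stand-ins for an upstream and
your clone of it.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
up="$work/upstream"
mine="$work/mine"

# Hypothetical upstream repository standing in for the maintainer's.
git init -q "$up"
cd "$up"
git symbolic-ref HEAD refs/heads/master
git config user.name "Upstream"
git config user.email "upstream@example.com"
git config commit.gpgsign false
echo one > file.txt
git add file.txt
git commit -q -m "first"

# Your clone; make sure updates to refs are logged.
git clone -q "$up" "$mine"
cd "$mine"
git config core.logAllRefUpdates true

# Upstream publishes and you fetch, twice in a row.
cd "$up"
echo two >> file.txt
git commit -q -a -m "second"
first_seen=$(git rev-parse HEAD)
cd "$mine"
git fetch -q origin

cd "$up"
echo three >> file.txt
git commit -q -a -m "third"
second_seen=$(git rev-parse HEAD)
cd "$mine"
git fetch -q origin

# History of the remote-tracking branch tip as seen by your fetches.
git reflog show origin/master
```

Each fetch that moves the tip adds one entry, so
'origin/master@{1}' names the tip as it was before the most
recent fetch.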
Strictly speaking, that is the history of fetches by you, not
the history of merges and commits by me, but that is what
matters more to you. If I pushed my changes out twice a day but
you were away for two days, you would have seen the state of my
repository from four rounds back before you left, and when you
fetch from me today you would get the latest; the three states
in between are not something you can know. But it does not
matter -- your repository never had those three states, so not
knowing exactly which commits they were would not hurt you when
bisecting. "It worked before I pulled yesterday morning but now
it is broken when I pulled this afternoon" would help your
bisect get started, but multiple state changes between the times
you fetched cannot matter.
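A minimal sketch of starting a bisect from your own fetch
history, assuming the ref-log on the remote tracking branch is
enabled as above. The repositories, file names, and the "status.txt
contains ok" test are all invented for illustration; the upstream
publishes three commits between your two fetches, and only the two
fetched states are needed to find the breaking one.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
up="$work/upstream"
mine="$work/mine"

# Hypothetical upstream repository.
git init -q "$up"
cd "$up"
git symbolic-ref HEAD refs/heads/master
git config user.name "Upstream"
git config user.email "upstream@example.com"
git config commit.gpgsign false
echo ok > status.txt
git add status.txt
git commit -q -m "initial"

# Your clone, with ref updates logged.
git clone -q "$up" "$mine"
cd "$mine"
git config core.logAllRefUpdates true

# "Yesterday morning": the state you fetched still works.
cd "$up"
echo 1 >> notes.txt
git add notes.txt
git commit -q -m "harmless change 1"
cd "$mine"
git fetch -q origin

# Upstream moves several times before your next fetch.
cd "$up"
echo 2 >> notes.txt
git commit -q -a -m "harmless change 2"
echo fail > status.txt
git commit -q -a -m "break status"
culprit=$(git rev-parse HEAD)
echo 3 >> notes.txt
git commit -q -a -m "harmless change 3"

# "This afternoon": the state you fetched is broken.
cd "$mine"
git fetch -q origin

# Bad is the current tip, good is the tip before the last fetch.
git bisect start 'origin/master@{0}' 'origin/master@{1}' >/dev/null
git bisect run sh -c 'grep -qx ok status.txt' >/dev/null
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $found"
```

The intermediate tips the upstream went through are never
consulted; the bisect walks the commits between the two fetched
states directly.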
* Re: If merging that is really fast forwarding creates new commit
2006-11-07 16:05 ` Linus Torvalds
2006-11-07 16:39 ` Jakub Narebski
@ 2006-11-07 21:37 ` Junio C Hamano
2006-11-07 22:02 ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
1 sibling, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 21:37 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git, Liu Yubao
Linus Torvalds <torvalds@osdl.org> writes:
> Git doesn't even have that concept. There is the concept of a _default_
> branch ("master"), and yes, the git repository has it. But at the same
> time, it really is just a default. There are three "main" branches that
> Junio maintains, and they only really differ in the degree of development.
> And "master" isn't even the most stable one - it's just the default one,
> because it's smack dab in the middle: recent enough to be interesting, but
> still stable enough to be worth tracking for just about anybody.
>
> But really, "maint" is the stable branch, and in many ways you could say
> that "maint" is the trunk branch, since that's what Junio still cuts
> releases from.
The branch 'maint' is meant to be the moral equivalent of the
efforts of your -stable team, so it shouldn't be "the trunk",
but you caught me.
We haven't seen a new release from 'master' for about a month.
I think the dust has settled already after two big topics
(packed-refs, delta-offset-base) were merged into 'master' since
v1.4.3, and it is now time to decide which topics that have been
cooking in 'next' are the ones I want in v1.4.4. Perhaps by the
end of the week, I'll cut a v1.4.4-rc1 to start the pre-release
stabilization process. No new features nor enhancements on
'master' after that until v1.4.4 final.
* Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit]
2006-11-07 21:37 ` If merging that is really fast forwarding creates new commit Junio C Hamano
@ 2006-11-07 22:02 ` Jakub Narebski
2006-11-07 23:06 ` Linus Torvalds
2006-11-07 23:19 ` Junio C Hamano
0 siblings, 2 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 22:02 UTC (permalink / raw)
To: git
Junio C Hamano wrote:
> We haven't seen a new release from 'master' for about a month.
> I think the dust has settled already after two big topics
> (packed-refs, delta-offset-base) were merged into 'master' since
> v1.4.3, and it is now time to decide which topics that have been
> cooking in 'next' are the ones I want in v1.4.4. Perhaps by the
> end of the week, I'll cut a v1.4.4-rc1 to start the pre-release
> stabilization process. No new features nor enhancements on
> 'master' after that until v1.4.4 final.
Do I understand correctly that the work on not exploding downloaded
pack on fetch, but making it non-thin, and related work on archival
packs (not to be considered for repacking) is not considered ready
(and tested)?
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit]
2006-11-07 22:02 ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
@ 2006-11-07 23:06 ` Linus Torvalds
2006-11-07 23:36 ` Planned new release of git Junio C Hamano
2006-11-07 23:19 ` Junio C Hamano
1 sibling, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2006-11-07 23:06 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Git Mailing List, Junio C Hamano
On Tue, 7 Nov 2006, Jakub Narebski wrote:
>
> Do I understand correctly that the work on not exploding downloaded
> pack on fetch, but making it non-thin, and related work on archival
> packs (not to be considered for repacking) is not considered ready
> (and tested)?
I'd like to see a new version with both the packed refs and the
non-exploding download on by default. Maybe time for a git-1.5.0 release
from master?
* Re: Planned new release of git
2006-11-07 22:02 ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
2006-11-07 23:06 ` Linus Torvalds
@ 2006-11-07 23:19 ` Junio C Hamano
1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 23:19 UTC (permalink / raw)
To: Jakub Narebski; +Cc: jnareb
Jakub Narebski <jnareb@gmail.com> writes:
> Junio C Hamano wrote:
>
>> We haven't seen a new release from 'master' for about a month.
>> I think the dust has settled already after two big topics
>> (packed-refs, delta-offset-base) were merged into 'master' since
>> v1.4.3, and it is now time to decide which topics that have been
>> cooking in 'next' are the ones I want in v1.4.4. Perhaps by the
>> end of the week, I'll cut a v1.4.4-rc1 to start the pre-release
>> stabilization process. No new features nor enhancements on
>> 'master' after that until v1.4.4 final.
>
> Do I understand correctly that the work on not exploding downloaded
> pack on fetch, but making it non-thin, and related work on archival
> packs (not to be considered for repacking) is not considered ready
> (and tested)?
Perhaps I phrased it badly, but I doubt it.
In the above I am only saying that it probably is time for me to
decide which ones to further merge into 'master', without saying
which ones I think are ready right now. That is because I
haven't started thinking about it.
* Re: Planned new release of git
2006-11-07 23:06 ` Linus Torvalds
@ 2006-11-07 23:36 ` Junio C Hamano
0 siblings, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 23:36 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
Linus Torvalds <torvalds@osdl.org> writes:
> On Tue, 7 Nov 2006, Jakub Narebski wrote:
>>
>> Do I understand correctly that the work on not exploding downloaded
>> pack on fetch, but making it non-thin, and related work on archival
>> packs (not to be considered for repacking) is not considered ready
>> (and tested)?
>
> I'd like to see a new version with both the packed refs and the
> non-exploding download on by default. Maybe time for a git-1.5.0 release
> from master?
Don't worry, packed refs is already part of 'master' so whatever
the next feature release is called it will be part of it ;-).