git.vger.kernel.org archive mirror
* how to show log for only one branch
@ 2006-11-06  3:41 Liu Yubao
  2006-11-06  6:12 ` Junio C Hamano
  2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
  0 siblings, 2 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-06  3:41 UTC (permalink / raw)
  To: git

I'm somewhat confused by `git log`; here is a revision graph:

a-----> b ---> c ----------------> f ---> g --- master
         \                        /
          `------> d ----------> e ---- test

I hope `git log ...` shows g, f, c, b, a.

`git log master` shows g, f, e, d, c, b, a;
`git log master ^test` shows g, f, c.
`git log --no-merges master` shows g, e, d, c, b, a.

That is to say, I want to view master, master~1, master~2, master~3, ...
back to the beginning, with no commits from other branches involved.

I have heard git treats all parents equally in a merge operation, so I
am curious how git decides which parent is HEAD^1.

I feel the HEAD^1 branch is more special than the HEAD^2 branch, because HEAD^1
is usually the working branch and the target branch of the merge operation.
It's a little more convenient to see only commits that really happened in the
current branch, especially for people who come from CVS and Subversion (yes,
I think git is more interesting than CVS and Subversion :-).
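
A minimal sketch, assuming a scratch repository, that rebuilds the graph above and reruns the three commands from the question. The file names, the identity and the fixed timestamps are assumptions made only to keep the output repeatable. The last command uses `--first-parent`, which assumes a git release newer than this thread; it follows only the first parent of each merge. Whether that is the history one really wants is what the replies discuss.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # start on "master" whatever the local default is
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

n=0
tick() {   # give every new commit a distinct, increasing timestamp
  n=$((n + 1))
  export GIT_AUTHOR_DATE="$((1162771200 + n)) +0000"
  export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
}
commit() { tick; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a; commit b
git branch test                 # test forks off at b
commit c
git checkout -q test
commit d; commit e
git checkout -q master
tick; git merge -q -m f test    # f is a true merge: parents are c (first) and e (second)
commit g

subjects() { git log --format=%s "$@" | paste -sd' ' -; }
all=$(subjects master)                  # g f e d c b a
not_test=$(subjects master ^test)       # g f c
no_merges=$(subjects --no-merges master) # g e d c b a
first=$(subjects --first-parent master) # g f c b a
printf '%s\n' "$all" "$not_test" "$no_merges" "$first"
```

The first-parent walk gives g, f, c, b, a only because the merge f was made while sitting on master; had it been made on test, the same option would follow e and d instead.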


* Re: how to show log for only one branch
  2006-11-06  3:41 how to show log for only one branch Liu Yubao
@ 2006-11-06  6:12 ` Junio C Hamano
  2006-11-06 10:41   ` Liu Yubao
  2006-11-06 13:00   ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
  2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
  1 sibling, 2 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-06  6:12 UTC (permalink / raw)
  To: Liu Yubao; +Cc: git

Liu Yubao <yubao.liu@gmail.com> writes:

> I have heard git treats all parents equally in a merge operation, so I
> am curious how git decides which parent is HEAD^1.

The first parent you see when you do "git cat-file commit HEAD"
is the HEAD^1, the second one is HEAD^2, etc.

With typical Porcelains (including git-core), when you make a
true merge by pulling another branch while on one branch, the
tip of the branch you were on when you initiated the merge
becomes the HEAD^1 of the resulting merge commit.

However, that does not mean HEAD^1 is at all special in the global
history.  It is only locally special when viewed by you who did
the merge, and only immediately after you made the merge.  After
a while, even you yourself would feel less special about HEAD^1.

Imagine the following scenario.

 . You fork off from Linus's tip, and you do great work on the
   kernel for a while.

         o---o---o---o Liu
        /
    ---o Linus

 . Linus's tip progresses, and there are semantically some
   overlapping changes; you merge from Linus to make sure your
   great work still works with the updated upstream.  This merge
   commit (marked '*' in the picture below) has _your_ last
   change as HEAD^1 and Linus's tip as HEAD^2.

         o---o---o---o---* Liu
        /               /
    ---o---o---o---o---o Linus

 . It still works great and you let Linus know about your great
   work.  He likes it and pulls from you.

At this point, the revision history would still look like this:

         o---o---o---o---* Liu = Linus
        /               /
    ---o---o---o---o---o

That is, the DAG did not change since you pulled from Linus.
The only thing that changed was that Linus's tip now points at
the merge commit _you_ made.

Then Linus keeps working, building commits on top of that merge.

                         Liu
         o---o---o---o---*---o---o---o---o Linus
        /               /
    ---o---o---o---o---o

Now, we can say two things about this history.

If you view the development community "centered around Linus",
then when somebody looks back at the history from Linus's tip,
whatever great work you did, that is merely "one of the many
contributions from many people".  The "mainline" from this point
of view is still "what Linus saw at each point as the tip of his
development track", and among the commits you made (the ones
between the fork point and '*' in the above picture), the last
one, the merge you made was the only one that was once the tip
of Linus; everything else was "random work that happened in a
side branch".  But HEAD^1 is not special if you wanted to have
this view.

In massively parallel and distributed development, whose track
of development is "mainline" is not absolute, and it all depends
on what you are interested in when you do the archaeology.
Let's say that your work on the side branch was in one specific
area (say, a device driver work for product X), and nobody
else's work in that area appeared on Linus's development track
since you forked until your work was merged.

To somebody who is digging from Linus's tip in order to find out
how that driver evolved, your side branch is much more important
than what happened on Linus's branch (which everybody would
loosely call _the_ "mainline").  On the other hand, when somebody
is interested in some other area that was worked on in Linus's
development track while your work was done in the side branch,
following your development track is not interesting; and the
person who is interested in this "other area" could be you.  In
that case, you would want to follow Linus's development track.

What's mainline is _not_ important, and which parent is first is
even less so.  It solely depends on what you are looking for
which branch matters more.  Putting too much weight on the
difference between HEAD^1 vs HEAD^2 statically does not make any
sense.

Reflecting this view of history, git log and other history
traversal commands treat merge parents more or less equally, and
_how_ you ask your question affects what branches are primarily
followed.  For example, if somebody is interested in your device
driver work, this command:

	git log -- drivers/liu-s-device/

would follow your side branch.  On the other hand,

	git log -- fs/

would follow Linus's development track while you were forked, if
you did not do any fs/ work while on that side branch and
Linus's development track had work in that area, _despite_ the
fact that the merge you gave Linus has your development track as
its first parent.
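
A rough sketch of the scenario described above, assuming a scratch repository with two branches named `linus` and `liu`. The file names `x.c` and `y.c`, the commit subjects and the identity are hypothetical. It shows the parent order recorded in the merge commit and how a path-limited `git log` follows whichever parent the merge agrees with for that path.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/linus
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

n=0
tick() {   # distinct, increasing timestamps keep the log order repeatable
  n=$((n + 1))
  export GIT_AUTHOR_DATE="$((1162771200 + n)) +0000"
  export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
}
change() {   # change <path> <subject>: append a line to <path> and commit it
  tick
  mkdir -p "$(dirname "$1")"
  echo "$2" >> "$1"
  git add "$1"
  git commit -q -m "$2"
}

change README base
git checkout -q -b liu                      # Liu forks off from Linus's tip
change drivers/liu-s-device/x.c driver-add
change drivers/liu-s-device/x.c driver-fix
liu_tip=$(git rev-parse HEAD)
git checkout -q linus                       # meanwhile Linus's tip progresses
change fs/y.c fs-one
change fs/y.c fs-two
linus_tip=$(git rev-parse HEAD)
git checkout -q liu
tick; git merge -q -m merge linus           # Liu merges from Linus: the '*' commit

# Parent order as recorded in the commit object: HEAD^1 first, then HEAD^2.
git cat-file commit HEAD | grep '^parent'
first_parent=$(git cat-file commit HEAD | sed -n 's/^parent //p' | sed -n 1p)
second_parent=$(git cat-file commit HEAD | sed -n 's/^parent //p' | sed -n 2p)

# Path-limited logs follow the parent that the merge matches for that path.
driver_log=$(git log --format=%s -- drivers/liu-s-device/ | paste -sd' ' -)
fs_log=$(git log --format=%s -- fs/ | paste -sd' ' -)
echo "drivers: $driver_log"
echo "fs:      $fs_log"
```

Here the fs/ log walks Linus's commits even though they sit behind the second parent of the merge.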


* Re: how to show log for only one branch
  2006-11-06  6:12 ` Junio C Hamano
@ 2006-11-06 10:41   ` Liu Yubao
  2006-11-06 18:16     ` Junio C Hamano
  2006-11-06 13:00   ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
  1 sibling, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-06 10:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Liu Yubao <yubao.liu@gmail.com> writes:
> 

[Snipped a lot of great, detailed description.] Thank you very much. I have
a question about the way git treats fast-forwarding, but that will
be another topic.

> What's mainline is _not_ important, and which parent is first is
> even less so.  It solely depends on what you are looking for
> which branch matters more.  Putting too much weight on the
> difference between HEAD^1 vs HEAD^2 statically does not make any
> sense.
> 
> Reflecting this view of history, git log and other history
> traversal commands treat merge parents more or less equally, and
> _how_ you ask your question affects what branches are primarily
> followed.  For example, if somebody is interested in your device
> driver work, this command:
> 
> 	git log -- drivers/liu-s-device/
> 
> would follow your side branch.  On the other hand,
> 
> 	git log -- fs/
> 
> would follow Linus's development track while you were forked, if
> you did not do any fs/ work while on that side branch and
> Linus's development track had work in that area, _despite_ the
> fact that the merge you gave Linus has your development track as
> its first parent.
> 

This is perfect and enough for two branches that work on different
files, but if two branches modify the same files, "git log" can't separate
commits clearly. For example, I want to know what happened in your
git's "next" branch, I hope to get logs like this:
     Merge branch 'jc/pickaxe' into next
     Merge branch 'master' into next
     Merge branch 'js/modfix' into next
     ...
     some good work
     ...
     Merge branch ....

I just want to *outline* what happened in the "next" branch; if I am interested
in what has been merged from 'jc/pickaxe' I can follow the merge point again
or use something like "git log --follow-all-parents".

Instead, "git log" interleaves logs from many branches, which I find a little
confusing: why does "git log" of the current branch contain so many logs from
other branches? (This is not a real question; I know the reason.)

I do understand that HEAD^1 is not always the commit that my work was
based on before a merge (thanks again for your detailed description :-),
so it doesn't make sense to show HEAD~1, HEAD~2, HEAD~3 and so on; that is
to say, 'git log' will never meet my requirement.

Maybe reflog is what I need: I want to know which commits "next" has pointed
to. But reflog is only for local use; it's not downloaded by 'git clone'.


* If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06  6:12 ` Junio C Hamano
  2006-11-06 10:41   ` Liu Yubao
@ 2006-11-06 13:00   ` Liu Yubao
  2006-11-06 13:39     ` If merging that is really fast forwarding creates new commit Rocco Rutte
                       ` (2 more replies)
  1 sibling, 3 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-06 13:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Thanks to Junio for his patient explanation about branches in git, I find 
there is a subtle difference between GIT and regular VCS that can be easily
neglected by newbies.

I realize that git is a *content tracker*, it only creates commit object
when the corresponding tree is really modified, git records content merging
but not usual merging operation, that's why git is called a content tracker.
This explains why a merging that is really a fast forwarding doesn't create
any new commit.

This feature is different from many regular VCSes like CVS and Subversion and
confuses newbies who come from them: "mainline" doesn't mean very much, and
'git log' shows many logs from other branches. In git, a branch is almost a
tag; you can't get the *track* of a branch (it's a pity reflog is only for
local use). I am used to the one-trunk-and-several-side-branches way, where
all branches are clearly isolated, so git confused me a lot at the beginning.


Then, what bad *logical* problem will happen if a merging that is really a 
fast forwarding creates a new commit?

If we throw away all compatibility, efficiency, memory and disk consumption
problems,
(1) we can get the track of a branch without reflog because HEAD^1 is
always the tip of target branch(or working branch usually) before merging.

(2) with the track, branch mechanism in git is possibly easier to understand,
especially for newbies from CVS or Subversion, I really like git's light 
weight, simple but powerful design and great efficiency, but I am really
surprised that 'git log' shows logs from other branches and a side branch can 
become part of main line suddenly.

A revision graph represents fast forwarding style merging like this:

             (fast forwarding)
  ---- a ............ * ------> master
        \            /
         b----------c -----> test         (three commits with three trees)

can be changed to:

  ---- a (tree_1) ----------- d (tree_3) ------> master
        \                    /
         b (tree_2) ------- c (tree_3) ----> test
(four commits with three trees, it's normal as more than one way can reach 
Rome :-)


I don't think I am smarter than any people in this mailing list, in fact
I am confused very much by GIT's branches at the beginning. There must
be many problems I haven't realized, I am very curious about them, any


* Re: If merging that is really fast forwarding creates new commit
  2006-11-06 13:00   ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
@ 2006-11-06 13:39     ` Rocco Rutte
  2006-11-07  3:42       ` Liu Yubao
  2006-11-06 13:43     ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
  2006-11-06 15:48     ` Linus Torvalds
  2 siblings, 1 reply; 35+ messages in thread
From: Rocco Rutte @ 2006-11-06 13:39 UTC (permalink / raw)
  To: git

Hi,

* Liu Yubao [06-11-06 21:00:07 +0800] wrote:

>Then, what bad *logical* problem will happen if a merging that is really a fast forwarding creates a new commit?

I don't know what you mean by "logical", nor whether I understand you
correctly, but if you fast-forward merge one branch into another, both
branches now have exactly the same hash. If you create a commit object for
a fast-forward merge, the two tip hashes are not identical anymore... which
is bad.

The identical hash is important so that you really know they're identical,
and for future reference such as ancestry.

   bye, Rocco
-- 


* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 13:00   ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
  2006-11-06 13:39     ` If merging that is really fast forwarding creates new commit Rocco Rutte
@ 2006-11-06 13:43     ` Andreas Ericsson
  2006-11-07  3:26       ` Liu Yubao
  2006-11-06 15:48     ` Linus Torvalds
  2 siblings, 1 reply; 35+ messages in thread
From: Andreas Ericsson @ 2006-11-06 13:43 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Junio C Hamano, git

Liu Yubao wrote:
> Thanks to Junio for his patient explanation about branches in git, I 
> find there is a subtle difference between GIT and regular VCS that can 
> be easily
> neglected by newbies.
> 
> I realize that git is a *content tracker*, it only creates commit object
> when the corresponding tree is really modified, git records content merging
> but not usual merging operation, that's why git is called a content 
> tracker.
> This explains why a merging that is really a fast forwarding doesn't create
> any new commit.
> 
> This feature is different from many regular VCSes like CVS and Subversion and
> confuses newbies who come from them: "mainline" doesn't mean very much, and
> 'git log' shows many logs from other branches. In git, a branch is almost a
> tag; you can't get the *track* of a branch (it's a pity reflog is only for
> local use). I am used to the one-trunk-and-several-side-branches way, where
> all branches are clearly isolated, so git confused me a lot at the beginning.
> 
> 
> Then, what bad *logical* problem will happen if a merging that is really 
> a fast forwarding creates a new commit?
> 

If "fake" commits (i.e., commits that don't change any content) are
introduced for each merge, it will change the ancestry graph and the
resulting tree(s) won't be mergeable with the tree it merged with,
because each such "back-merge" would result in
* the "fake" commit becoming part of history
* a new "fake" commit being introduced

Consider what happens when Alice pulls in Bob's changes. The merge-base
of Bob's tip is where Alice's HEAD points, so it results in a
fast-forward, like below.

a---b---c---d               <--- Alice
             \
              e---f---g     <--- Bob


If we had created a fake commit instead, Alice would get a graph
that looks like this:

a---b---c---d-----------h   <--- Alice
             \         /
              e---f---g     <--- Bob


Now, we would have two trees that are identical, because the merge can't 
cause conflicts, but Alice and Bob will have reached it in two different 
ways. When Bob decides he wants to go get the changes Alice has done, 
his tree will look something like this:

a---b---c---d-----------h          <--- Alice
             \         / \
              e---f---g---i        <--- Bob


He finds it odd that he's got two commits that, when checked out, lead 
to the exact same tree, so he asks Alice to get his tree and see what's 
going on. Alice will then end up with this:

a---b---c---d-----------h---j      <--- Alice
             \         / \ /
              e---f---g---i        <--- Bob


Now there are four commits that all point to identical trees, but the
ancestry graphs differ between all developers. In the case above,
there are only two people working on the same project. Imagine the amount
of empty commits you'd get in a larger project, like the Linux kernel.

Fast-forward is a Good Thing and the only sensible thing to do in a 
system designed to be fully distributed (i.e., where there isn't 
necessarily any middle point with which everybody syncs), while scaling 
beyond ten developers that merge frequently between each other.

> If we throw away all compatibility, efficiency, memory and disk consumption
> problems,
> (1) we can get the track of a branch without reflog because HEAD^1 is
> always the tip of target branch(or working branch usually) before merging.
> 
> (2) with the track, branch mechanism in git is possibly easier to 
> understand,
> especially for newbies from CVS or Subversion, I really like git's light 
> weight, simple but powerful design and great efficiency, but I am really
> surprised that 'git log' shows logs from other branches and a side 
> branch can become part of main line suddenly.
> 
> A revision graph represents fast forwarding style merging like this:
> 
>             (fast forwarding)
>  ---- a ............ * ------> master
>        \            /
>         b----------c -----> test         (three commits with three trees)
> 
> can be changed to:
> 
>  ---- a (tree_1) ----------- d (tree_3) ------> master
>        \                    /
>         b (tree_2) ------- c (tree_3) ----> test
> (four commits with three trees, it's normal as more than one way can 
> reach Rome :-)
> 

That's where our views differ. In my eyes, "d" and "c" are exactly 
identical, and I'd be very surprised if the scm tried to tell me that 
they aren't, by not giving them the same revid.
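
The Alice and Bob walk-through above can be replayed as a rough sketch, assuming a scratch repository and a git new enough to have `--ff-only` and `--no-ff` (both postdate this thread). The branch name `alice-ff`, the single file and the identity are hypothetical. `--no-ff` stands in for the proposed "always create a commit" behaviour.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/alice
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

n=0
tick() {   # distinct, increasing timestamps
  n=$((n + 1))
  export GIT_AUTHOR_DATE="$((1162771200 + n)) +0000"
  export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
}
commit() { tick; echo "$1" >> file; git add file; git commit -q -m "$1"; }

for s in a b c d; do commit "$s"; done
git branch alice-ff              # second copy of Alice's tip, used for the real behaviour
git checkout -q -b bob
for s in e f g; do commit "$s"; done

# What git does: a fast-forward only moves the branch, so both tips are one commit.
git checkout -q alice-ff
git merge -q --ff-only bob
ff_tip=$(git rev-parse alice-ff)
bob_tip=$(git rev-parse bob)

# The proposal: every merge records a commit, even when nothing changes.
git checkout -q alice; tick; git merge -q --no-ff -m h bob
git checkout -q bob;   tick; git merge -q --no-ff -m i alice
git checkout -q alice; tick; git merge -q --no-ff -m j bob

alice_tip=$(git rev-parse alice)
bob_tip_after=$(git rev-parse bob)
alice_tree=$(git rev-parse 'alice^{tree}')
bob_tree=$(git rev-parse 'bob^{tree}')
# Count the commits in Alice's history that carry exactly Bob's tree: g, h, i and j.
same_tree=$(git log --format=%T alice | grep -c "$bob_tree")
echo "commits sharing one tree: $same_tree"
```

After three rounds the two tips still differ although their trees are identical, and each further round would add one more such commit.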

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se


* Re: how to show log for only one branch
  2006-11-06  3:41 how to show log for only one branch Liu Yubao
  2006-11-06  6:12 ` Junio C Hamano
@ 2006-11-06 15:25 ` Jakub Narebski
  2006-11-07  3:47   ` Liu Yubao
  1 sibling, 1 reply; 35+ messages in thread
From: Jakub Narebski @ 2006-11-06 15:25 UTC (permalink / raw)
  To: git

Perhaps what you want is git log --committer=<owner of repo>?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 13:00   ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
  2006-11-06 13:39     ` If merging that is really fast forwarding creates new commit Rocco Rutte
  2006-11-06 13:43     ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
@ 2006-11-06 15:48     ` Linus Torvalds
  2006-11-06 16:03       ` Martin Langhoff
                         ` (2 more replies)
  2 siblings, 3 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-06 15:48 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Junio C Hamano, git



On Mon, 6 Nov 2006, Liu Yubao wrote:
> 
> Then, what bad *logical* problem will happen if a merging that is really a
> fast forwarding creates a new commit?

You MUST NOT do that.

If a fast-forward were to do a "merge commit", you'd never get into the 
situation where two people merging each other would really ever get a 
stable result. They'd just keep doing merge commits on top of each other.

Git tracks history, not "your view of history". Trying to track "your 
view" is fundamentally wrong, because "your view" automatically means that
the project history would not be distributed any more - it would be 
centralized around what _you_ think happened. That is not a sensible thing 
to have in a distributed system.

For example, the way to break the "infinite merges" problem above is to 
say that _you_ would be special, and you would do a "fast-forward commit", 
and the other side would always just fast-forward without a commit. But 
that is very fundamentally against the whole point of being distributed. 
Now you're special.

In fact, even for "you", it would be horrible - because you personally 
might have 5 different repositories on five different machines. You'd have 
to select _which_ machine you want to track. That's simply insane. It's a 
totally broken model. (You can even get the same situation with just _one_ 
repository, by just having five different branches - you have to decide 
which one is the "main" branch).

Besides, doing an empty commit like that ("I fast forwarded") literally 
doesn't add any true history information. It literally views history not 
as history of the _project_, but as the history of just one of the 
repositories. And that's wrong.

So just get used to it. You MUST NOT do what you want to do. It's stupid.

If you want to track the history of one particular local branch, use the 
"reflog" thing. It allows you to see what one of your local branches 
contained at any particular time.

See

	[core]
		logAllRefUpdates = true

documentation in "man git-update-ref" (and maybe somebody can write more
about it?)
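
A small sketch of the reflog suggestion, assuming a scratch repository in which `core.logAllRefUpdates` is switched on explicitly as in the snippet above. Branch names, file names and the identity are hypothetical. It shows that a fast-forward, which creates no commit, still leaves a record of where the local branch pointed before.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config core.logAllRefUpdates true      # record every update of a branch tip
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a
a_tip=$(git rev-parse HEAD)
git checkout -q -b test
commit b; commit c
git checkout -q master
git merge -q --ff-only test                # master moves from a to c, no new commit

git reflog show master                     # one line per position master has held
entries=$(git reflog show master | grep -c '')
latest=$(git log -g -1 --format=%gs master)   # reflog message of the newest entry
previous=$(git rev-parse 'master@{1}')        # where master pointed before the merge
```

The reflog lives only in this repository; as noted earlier in the thread, a clone does not receive it.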



* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 15:48     ` Linus Torvalds
@ 2006-11-06 16:03       ` Martin Langhoff
  2006-11-06 17:48       ` Linus Torvalds
  2006-11-07  7:27       ` Liu Yubao
  2 siblings, 0 replies; 35+ messages in thread
From: Martin Langhoff @ 2006-11-06 16:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Liu Yubao, Junio C Hamano, git

On 11/6/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Mon, 6 Nov 2006, Liu Yubao wrote:
> > Then, what bad *logical* problem will happen if a merging that is really a
> > fast forwarding creates a new commit?
> You MUST NOT do that.
>
> If a fast-forward were to do a "merge commit", you'd never get into the
> situation where two people merging each other would really ever get a
> stable result. They'd just keep doing merge commits on top of each other.

Indeed. I used Arch for quite a while and if you were merging between
2 or more repos it would never reach a stable point even if the code
didn't change at all.

If a group of 3 developers (with one repo per developer) was
developing at a slow pace (say, a daily commit each, plus a couple of
pull/updates per day) the garbage-commit to content-commit ratio was
awful. If on a given day no one had made a single commit, we'd still
have a whole set of useless updates merged and committed.

> Besides, doing an empty commit like that ("I fast forwarded") literally
> doesn't add any true history information.

And as the number of developers and repos grows in a distributed
scenario, fast-forwards increasingly outnumber real commits. The
usefulness of your logs sinks to the sewers.

cheers,




* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 15:48     ` Linus Torvalds
  2006-11-06 16:03       ` Martin Langhoff
@ 2006-11-06 17:48       ` Linus Torvalds
  2006-11-07  7:59         ` Liu Yubao
  2006-11-07 11:46         ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Eran Tromer
  2006-11-07  7:27       ` Liu Yubao
  2 siblings, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-06 17:48 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Junio C Hamano, git



On Mon, 6 Nov 2006, Linus Torvalds wrote:
> 
> Besides, doing an empty commit like that ("I fast forwarded") literally 
> doesn't add any true history information. It literally views history not 
> as history of the _project_, but as the history of just one of the 
> repositories. And that's wrong.
> 
> So just get used to it. You MUST NOT do what you want to do. It's stupid.

Btw, absolutely the _only_ reason people seem to want to do this is 
because they want to "pee in the snow" and put their mark on things. They 
seem to want to show "_I_ did this", even if the "doing" was a total 
no-op and they didn't actually generate any real value.

That's absolutely the last thing you want to encourage, especially when 
the end result is a history that is totally unreadable and contains more 
"junk" than actual real work. 

I'll be the first to say that "merging code" is often as important as 
actually writing the code in the first place, and that it is important to 
show who actually did real work to make a patch appear in a project.

In the kernel, for example, we have "sign-off" lines to show what route a 
patch took before it was accepted, and it's very instructive to see (for 
example) how many patches give credit to somebody like Andrew Morton for
passing it on versus actually writing the code himself (he has a lot of 
authorship credit too, but it's absolutely _dwarfed_ by his importance as 
a maintainer - and if you were to ask any random kernel developer why 
Andrew is so important, I can pretty much guarantee that his importance is 
very much about those "sign-offs", and not about the patches he authors).

But at the same time, when it comes to merging, because it actually 
clutters up history a lot, we actively try to _avoid_ it. Many subsystem 
maintainers purposefully re-generate a linear history, rebased on top of 
my current kernel, exactly because it makes the history less "branchy", 
and because that makes things easier to see.

So we have actually done work to _encourage_ fast-forwarding over "merge 
with a commit", because the fast-forwarding ends up generating a much more 
readable and understandable history. Generating a _fake_ "merge commit" 
would be absolutely and utterly horrible. It gives fake credit for work 
that wasn't real work, and it makes history uglier and harder to read. 

So it's a real NEGATIVE thing to have, and you should run away from it as 
fast as humanly possible.

Now, the kernel actually ends up being fairly branchy anyway, but that's 
simply because we actually have a lot of real parallel development (I bet 
more than almost any other project out there - we simply have more commits 
done by more people than most projects). I tend to do multiple merges a 
day, so even though people linearize their history individually, you end 
up seeing a fair amount of merges. But we'd have a lot _more_ of them if 
people didn't try to keep history clean.

Btw, in the absence of a merge, you can still tell who committed
something, exactly because git keeps track of "committer" information in 
addition to "authorship" information. I don't understand why other 
distributed environments don't seem to do this - because separating out 
who committed something (and when) from who authored it (and when) is 
actually really really important.

And that's not just because we use patches and other SCM's than just git 
to track things (so authorship and committing really are totally separate 
issues), but because even if the author and committer is the same person, 
it's very instructive to realize that it might have been moved around in 
history, so it might actually have been cherry-picked later, and the 
committer date differs from the author date even if the actual author and 
committer are the same person (but you might also have had somebody _else_ 
re-linearize or otherwise cherry-pick the history: again, it's important 
to show the committer _separately_ both as a person and as a date).

And because there is a committer field, if you actually want to linearize 
or log things by who _committed_ stuff, you can. Just do

	git log --committer=torvalds

on the kernel, and you can see the log as it pertains to what _I_
committed, for example. You can even show it graphically, although it 
won't be a connected graph any more, so it will tend to be very ugly 
(but you'll see the "linear stretches" when somebody did some work). Just 
do "gitk --committer=myname" to see it in your own project.
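
A minimal sketch of the author/committer split, assuming a scratch repository and two made-up identities, `alice` and `bob`. The commit subjects are hypothetical. It shows that `--committer` and `--author` select different sets of commits when one person applies another's patch.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_EMAIL=example@example.invalid

n=0
commit_as() {   # commit_as <author> <committer> <subject>
  n=$((n + 1))
  export GIT_AUTHOR_DATE="$((1162771200 + n)) +0000"
  export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
  GIT_AUTHOR_NAME="$1" GIT_COMMITTER_NAME="$2" \
    git commit -q --allow-empty -m "$3"
}

commit_as alice alice alice-own-work
commit_as bob   alice bob-patch-applied-by-alice   # authored by bob, committed by alice
commit_as bob   bob   bob-own-work

git log --format='%s: author=%an committer=%cn'
by_committer_alice=$(git log --committer=alice --format=%s | paste -sd' ' -)
by_author_bob=$(git log --author=bob --format=%s | paste -sd' ' -)
```

The middle commit appears in both selections, which is the point: who wrote a change and who put it into the history are recorded separately.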



* Re: how to show log for only one branch
  2006-11-06 10:41   ` Liu Yubao
@ 2006-11-06 18:16     ` Junio C Hamano
  2006-11-07  2:21       ` Liu Yubao
  0 siblings, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2006-11-06 18:16 UTC (permalink / raw)
  To: Liu Yubao; +Cc: git

Liu Yubao <yubao.liu@gmail.com> writes:

> ... For example, I want to know what happened in your
> git's "next" branch, I hope to get logs like this:
>     Merge branch 'jc/pickaxe' into next
>     Merge branch 'master' into next
>     Merge branch 'js/modfix' into next
>     ...
>     some good work
>     ...
>     Merge branch ....
>
> I just want to *outline* what happened in the "next" branch; if I am interested
> in what has been merged from 'jc/pickaxe' I can follow the merge point again
> or use something like "git log --follow-all-parents".

My "next" is a bad example of this, because it is an integration
branch and never gets its own development.  It is also a bad
example because I can answer that question with this command
line:

	git log --grep='^Merge .* into next$' next

and while it is a perfectly valid answer, I know it would leave
you feeling somewhat cheated.
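
A rough sketch of that `--grep` outline, assuming a scratch repository with an integration branch `next` and two topic branches named after the ones in the quoted example. The commit subjects, file names and identity are hypothetical. The second query uses `--first-parent`, which assumes a git newer than this thread, and does not depend on how the merge messages are worded.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

n=0
tick() {   # distinct, increasing timestamps
  n=$((n + 1))
  export GIT_AUTHOR_DATE="$((1162771200 + n)) +0000"
  export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
}
commit() { tick; echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b next
for topic in jc/pickaxe js/modfix; do
  git checkout -q -b "$topic" master       # each topic forks from master
  commit "work-on-${topic#*/}"
  git checkout -q next
  tick; git merge -q --no-ff -m "Merge branch '$topic' into next" "$topic"
done

# Outline by message pattern, as in the command above.
git log --grep='^Merge .* into next$' --format=%s next
grep_count=$(git log --grep='^Merge .* into next$' --format=%s next | grep -c '')

# Outline by graph shape: walk only the first parent of each merge.
first_parent_count=$(git log --first-parent --format=%s next | grep -c '')
topic_commits=$(git log --first-parent --format=%s next | grep -c '^work-on' || true)
```

The first-parent walk lists the two merges plus the base commit and none of the topic commits.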




* Re: how to show log for only one branch
  2006-11-06 18:16     ` Junio C Hamano
@ 2006-11-07  2:21       ` Liu Yubao
  2006-11-07  8:21         ` Jakub Narebski
  0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07  2:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Liu Yubao <yubao.liu@gmail.com> writes:
> 
>> ... For example, I want to know what happened in your
>> git's "next" branch, I hope to get logs like this:
>>     Merge branch 'jc/pickaxe' into next
>>     Merge branch 'master' into next
>>     Merge branch 'js/modfix' into next
>>     ...
>>     some good work
>>     ...
>>     Merge branch ....
>>
>> I just want to *outline* what happened in the "next" branch; if I am interested
>> in what has been merged from 'jc/pickaxe' I can follow the merge point again
>> or use something like "git log --follow-all-parents".
> 
> My "next" is a bad example of this, because it is an integration
> branch and never gets its own development.  It is also a bad
> example because I can answer that question with this command
> line:
> 
> 	git log --grep='^Merge .* into next$' next
> 
> and while it is a perfectly valid answer, I know it would leave
> you feeling somewhat cheated.
> 
Smart trick, but if the log messages aren't consistent enough it's hard to
grep them out.



* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 13:43     ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
@ 2006-11-07  3:26       ` Liu Yubao
  2006-11-07  9:30         ` Andy Whitcroft
  0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07  3:26 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Junio C Hamano, git

Andreas Ericsson wrote:
> Liu Yubao wrote:
> 
> If "fake" commits (i.e., commits that don't change any content) are 
> introduced for each merge, it will change the ancestry graph and the 
> resulting tree(s) won't be mergable with the tree it merged with, 
> because each such "back-merge" would result in
> * the "fake" commit becoming part of history
> * a new "fake" commit being introduced
> 
> Consider what happens when Alice pulls in Bob's changes. The merge-base 
> of Bob's tip is where Alice HEAD points to, so it results in a 
> fast-forward, like below.
> 
> a---b---c---d               <--- Alice
>              \
>               e---f---g     <--- Bob
> 
> 
> If, we would have created a fake commit instead, Alice would get a graph 
> that looks like so:
> 
> a---b---c---d-----------h   <--- Alice
>              \         /
>               e---f---g     <--- Bob
> 
> 
> Now, we would have two trees that are identical, because the merge can't 
> cause conflicts, but Alice and Bob will have reached it in two different 
> ways. When Bob decides he wants to go get the changes Alice has done, 
> his tree will look something like this:
> 
> a---b---c---d-----------h          <--- Alice
>              \         / \
>               e---f---g---i        <--- Bob
> 
> 
> He finds it odd that he's got two commits that, when checked out, lead 
> to the exact same tree, so he asks Alice to get his tree and see what's 
> going on. Alice will then end up with this:
> 
> a---b---c---d-----------h---j      <--- Alice
>              \         / \ /
>               e---f---g---i        <--- Bob
> 
> 
> Now there's four commits that all point to identical trees, but the 
> ancestry graphs differ between all developers. In the case above, 
> there's only two people working at the same project. Imagine the amount 
> of empty commits you'd get in a larger project, like the Linux kernel.
> 
Oh, thanks for the reminder. I have a naive solution for this problem: print
a hint and don't merge when the incoming commit is just a fake commit; then
I know I have reached a stable merge point and have the same tree as the others.

We create a fake commit for a fast-forward style merge. This fake commit
records the track of a branch, so we can always follow HEAD^1 to travel
through the history of a branch. In fact, git pays more attention to the
history of *data modification* than to the history of *operations*; that is
exactly the subtle difference between a content tracker and a VCS. The
latter's branch carries more information (useful information, I think).
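
As a rough illustration of what following HEAD^1 would give, here is a minimal Python sketch with a made-up history: master was at 'a', and test added 'b' and 'c'. The dictionaries and the function name first_parent_walk are assumptions for this example only, not anything git provides.

```python
# Each commit maps to its list of parents; the first entry plays the role of HEAD^1.
fast_forward = {
    "a": [],
    "b": ["a"],
    "c": ["b"],  # after a fast-forward, master simply points at 'c'
}

with_fake_commit = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["a", "c"],  # fake merge commit: first parent is the old master tip
}


def first_parent_walk(parents, tip):
    """Follow tip, tip^1, tip^1^1, ... until a root commit is reached."""
    track = []
    current = tip
    while current is not None:
        track.append(current)
        ps = parents[current]
        current = ps[0] if ps else None
    return track


print(first_parent_walk(fast_forward, "c"))      # ['c', 'b', 'a']
print(first_parent_walk(with_fake_commit, "d"))  # ['d', 'a']
```

With the fast-forward, the walk from master runs through test's commits; with the fake commit, it stays on what master itself went through.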

Even if no fake commit is created, as git does now, there can be multiple
commits with an identical tree object, and git can't prevent you from merging
two commits with identical tree objects; it just creates an ancestry relation
to remember the merge point.

As git(7) says:
         The "commit" object is an object that introduces the notion
         of history into the picture. In contrast to the other objects,
         it doesn't just describe the physical state of a tree, it
         describes how we got there, and why.

So it's clearer to describe a revision graph with nodes for tree
objects and edges for commit objects (multiple edges for a merge
commit object; I know this will break your habit :-).

> Fast-forward is a Good Thing and the only sensible thing to do in a 
> system designed to be fully distributed (i.e., where there isn't 
> necessarily any middle point with which everybody syncs), while scaling 
> beyond ten developers that merge frequently between each other.
> 
>> If we throw away all compatibility, efficiency, memory and disk 
>> consumption
>> problems,
>> (1) we can get the track of a branch without reflog because HEAD^1 is
>> always the tip of target branch(or working branch usually) before 
>> merging.
>>
>> (2) with the track, branch mechanism in git is possibly easier to 
>> understand,
>> especially for newbies from CVS or Subversion, I really like git's 
>> light weight, simple but powerful design and great efficiency, but I 
>> am really
>> surprised that 'git log' shows logs from other branches and a side 
>> branch can become part of main line suddenly.
>>
>> A revision graph represents fast forwarding style merging like this:
>>
>>             (fast forwarding)
>>  ---- a ............ * ------> master
>>        \            /
>>         b----------c -----> test         (three commits with three trees)
>>
>> can be changed to:
>>
>>  ---- a (tree_1) ----------- d (tree_3) ------> master
>>        \                    /
>>         b (tree_2) ------- c (tree_3) ----> test
>> (four commits with three trees, it's normal as more than one way can 
>> reach Rome :-)
>>
> 
> That's where our views differ. In my eyes, "d" and "c" are exactly 
> identical, and I'd be very surprised if the scm tried to tell me that 
> they aren't, by not giving them the same revid.
It doesn't matter: they have the same tree, and in git it's also normal
for multiple commits to have the same tree. If you use nodes for tree
states, the graph becomes simple to understand:

           a              d
         -----tree_1 -------------- tree_3 ----> master
                  \                    / \
                   \ b               d/c  `-----> test
                    \                /
                     `--- tree_2 ---'

This is the familiar way we used in CVS. I believe more than one person
is confused by fast-forward style merges and 'git log' in git.


* Re: If merging that is really fast forwarding creates new commit
  2006-11-06 13:39     ` If merging that is really fast forwarding creates new commit Rocco Rutte
@ 2006-11-07  3:42       ` Liu Yubao
  0 siblings, 0 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-07  3:42 UTC (permalink / raw)
  To: git

Rocco Rutte wrote:
> Hi,
> 
> * Liu Yubao [06-11-06 21:00:07 +0800] wrote:
> 
>> Then, what bad *logical* problem will happen if a merging that is 
>> really a fast forwarding creates a new commit?
> 
> I don't know what you expect by "logical" nor if I get you right, but if 
> fast-forward merge a branch to another one, both branches now have 
> exactly the same hash. If you create a commit object for a fast-forward 
> merge, both tip hashes not identical anymore... which is bad.
Not so bad: you can still tell that they point to the same tree object.

A fast-forward style merge blows away the *track* of your branch,
and this track is useful; that is why reflog exists.
> 
> The identical hash important so that you really know they're identical 
> and for future reference like ancestry.
I guess you have mixed up identical commits with identical trees. Trees
are what we really need.

A fake commit doesn't mess up the ancestry relation; see my previous
reply to Andreas Ericsson in this thread.
> 
>   bye, Rocco


* Re: how to show log for only one branch
  2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
@ 2006-11-07  3:47   ` Liu Yubao
  2006-11-07  8:08     ` Jakub Narebski
  0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07  3:47 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski wrote:
> Perhaps what you want is git log --committer=<owner of repo>?
> 
Thanks, but it can't meet my requirement: if I create two branches
and merge them, I can't easily tell the track of each of those branches.


* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 15:48     ` Linus Torvalds
  2006-11-06 16:03       ` Martin Langhoff
  2006-11-06 17:48       ` Linus Torvalds
@ 2006-11-07  7:27       ` Liu Yubao
  2006-11-07  9:46         ` Andy Whitcroft
  2006-11-07 16:05         ` Linus Torvalds
  2 siblings, 2 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-07  7:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
> 
> On Mon, 6 Nov 2006, Liu Yubao wrote:
>> Then, what bad *logical* problem will happen if a merging that is really a
>> fast forwarding creates a new commit?
> 
> You MUST NOT do that.
> 
> If a fast-forward were to do a "merge commit", you'd never get into the 
> situation where two people merging each other would really ever get a 
> stable result. They'd just keep doing merge commits on top of each other.
They can stop merging when a fake commit and a real commit point to the same
tree object; at that point they have reached a stable result: the same tree content.
> 
> Git tracks history, not "your view of history". Trying to track "your 
> view" is fundamentally wrong, because "your view" automatically means that 
> the project history would not be distributed any more - it would be 
> centralized around what _you_ think happened. That is not a sensible thing 
> to have in a distributed system.
It's not my view, it's a branch scope view: I can see how a branch evolves
relatively independently. In git, the branch scope view is more or less neglected.
After a fast-forward merge, I can't tell where a branch came from; I mean
the track of a branch.

If Junio publishes his reflog, I don't see what conflict would arise between
his local view (now public, so naming it a branch scope view seems more
sensible) and git's global view.

If that doesn't lead to problems, it also seems OK to use a fake commit for
a fast-forward style merge, so we can follow HEAD^1 to travel through a
branch without reflog.

I hope I have expressed my thoughts clearly.
> 
> For example, the way to break the "infinite merges" problem above is to 
> say that _you_ would be special, and you would do a "fast-forward commit", 
> and the other side would always just fast-forward without a commit. But 
> that is very fundamentally against the whole point of being distributed. 
> Now you're special.
No one is special, as everybody can create fake commits. A branch (which is
almost a tag) would never be overwritten to point to a commit object in
another branch, so branches stay relatively independent. That's to say,
'git log' would reflect what has really happened in the current branch (a
branch in the CVS sense, not only a tag that always points to a tip commit).
> 
> In fact, even for "you", it would be horrible - because you personally 
> might have 5 different repositories on five different machines. You'd have 
> to select _which_ machine you want to track. That's simply insane. It's a 
> totally broken model. (You can even get the same situation with just _one_ 
> repository, by just having five different branches - you have to decide 
> which one is the "main" branch).
What is the meaning of an upstream branch then? I have to know that I
should track Junio's public repository.

When does one say two branches have reached a common point? When they have
the same commit (which must point to the same tree), or when they have the
same tree (maybe a fake commit and a real commit)? I think git takes the first way.

A fast-forward style merge tends to *automatically* centralize many
branches; in CVS people merge two branches and drop the side branch to
centralize them. Both have central semantics.
(I don't want to start a flame war between CVS/SVN and GIT; I really
think git is better than them :-)
> 
> Besides, doing an empty commit like that ("I fast forwarded") literally 
> doesn't add any true history information. It literally views history not 
> as history of the _project_, but as the history of just one of the 
> repositories. And that's wrong.
Something like 'git log --follow-all-parent' could show the history of the
project as 'git log' does now.
> 
> So just get used to it. You MUST NOT do what you want to do. It's stupid.
Yes, I have understood the git way and am getting used to it. I like
its simple but powerful design and great efficiency. Thank you all for
your good work!
> 
> If you want to track the history of one particular local branch, use the 
> "reflog" thing. It allows you to see what one of your local branches 
> contained at any particular time.
> 
> See
> 
> 	[core]
> 		logAllRefUpdates = true
> 
Thanks, it's a pity I can't pull Junio's reflog :-(
> documentation in "man git-update-refs" (and maybe somebody can write more 
> about it?)
> 
> 		Linus
> 


* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 17:48       ` Linus Torvalds
@ 2006-11-07  7:59         ` Liu Yubao
  2006-11-07 17:23           ` Linus Torvalds
  2006-11-07 18:23           ` If merging that is really fast forwarding creates new commit Junio C Hamano
  2006-11-07 11:46         ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Eran Tromer
  1 sibling, 2 replies; 35+ messages in thread
From: Liu Yubao @ 2006-11-07  7:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
> 
> On Mon, 6 Nov 2006, Linus Torvalds wrote:
>> Besides, doing an empty commit like that ("I fast forwarded") literally 
>> doesn't add any true history information. It literally views history not 
>> as history of the _project_, but as the history of just one of the 
>> repositories. And that's wrong.
>>
>> So just get used to it. You MUST NOT do what you want to do. It's stupid.
> 
> Btw, absolutely the _only_ reason people seem to want to do this is 
> because they want to "pee in the snow" and put their mark on things. They 
> seem to want to show "_I_ did this", even if the "doing" was a total 
> no-op and they didn't actually generate any real value.

We can leave out fake commits when calculating credit, and we can grep the
logs by author name to see what he/she has done.

A fake commit is only for digging into branch scope history: I can *outline*
what has been merged into a branch without caring how that good work was
actually done.

> 
> That's absolutely the last thing you want to encourage, especially when 
> the end result is a history that is totally unreadable and contains more 
> "junk" than actual real work. 
> 
> I'll be the first to say that "merging code" is often as important as 
> actually writing the code in the first place, and that it is important to 
> show who actually did real work to make a patch appear in a project.
> 
> In the kernel, for example, we have "sign-off" lines to show what route a 
> patch took before it was accepted, and it's very instructive to see (for 
> example) how many patches give credit to somebody like Andrew Morton for 
> passing it on versus actually writing the code himself (he has a lot of 
> authorship credit too, but it's absolutely _dwarfed_ by his importance as 
> a maintainer - and if you were to ask any random kernel developer why 
> Andrew is so important, I can pretty much guarantee that his importance is 
> very much about those "sign-offs", and not about the patches he authors).
> 
> But at the same time, when it comes to merging, because it actually 
> clutters up history a lot, we actively try to _avoid_ it. Many subsystem 
> maintainers purposefully re-generate a linear history, rebased on top of 
> my current kernel, exactly because it makes the history less "branchy", 
> and because that makes things easier to see.
> 
> So we have actually done work to _encourage_ fast-forwarding over "merge 
> with a commit", because the fast-forwarding ends up generating a much more 
> readable and understandable history. Generating a _fake_ "merge commit" 
> would be absolutely and utterly horrible. It gives fake credit for work 
> that wasn't real work, and it makes history uglier and harder to read. 
> 
> So it's a real NEGATIVE thing to have, and you should run away from it as 
> fast as humanly possible.
> 
> Now, the kernel actually ends up being fairly branchy anyway, but that's 
> simply because we actually have a lot of real parallel development (I bet 
> more than almost any other project out there - we simply have more commits 
> done by more people than most projects). I tend to do multiple merges a 
> day, so even though people linearize their history individually, you end 
> up seeing a fair amount of merges. But we'd have a lot _more_ of them if 
> people didn't try to keep history clean.

That's exactly the central semantics I mentioned: git tends towards, and
recommends, trunk-mode development *at a high level*. It's not a bad thing.

> 
> Btw, in the absence of a merge, you can still tell who committed 
> something, exactly because git keeps track of "committer" information in 
> addition to "authorship" information. I don't understand why other 
> distributed environments don't seem to do this - because separating out 
> who committed something (and when) from who authored it (and when) is 
> actually really really important.

Yes, agree.

> 
> And that's not just because we use patches and other SCM's than just git 
> to track things (so authorship and committing really are totally separate 
> issues), but because even if the author and committer is the same person, 
> it's very instructive to realize that it might have been moved around in 
> history, so it might actually have been cherry-picked later, and the 
> committer date differs from the author date even if the actual author and 
> committer are the same person (but you might also have had somebody _else_ 
> re-linearize or otherwise cherry-pick the history: again, it's important 
> to show the committer _separately_ both as a person and as a date).
> 
> And because there is a committer field, if you actually want to linearize 
> or log things by who _committed_ stuff, you can. Just do
> 
> 	git log --committer=torvalds
> 
> on the kernel, and you can see the log as it pertains for what _I_
> committed, for example. You can even show it graphically, although it
> won't be a connected graph any more, so it will tend to be very ugly
> (but you'll see the "linear stretches" when somebody did some work). Just
> do "gitk --committer=myname" to see in your own project.
>
> 		Linus

I want to separate out a branch, not to separate commits by author. For
example, many authors can contribute to git's master branch, and I want to
know what happened in the master branch, like this:
      good work from A;
      good work from C;
      merge from next;   -----> I don't care how this feature is realized.
      good work from A;
      ....

As Junio points out, HEAD^1 is not always the tip of the working branch,
so "git log" can never satisfy me. There is reflog, but it's not public.

BTW: I have great respect for anyone who contributes to Linux and GIT,
especially you :-)



* Re: how to show log for only one branch
  2006-11-07  3:47   ` Liu Yubao
@ 2006-11-07  8:08     ` Jakub Narebski
  0 siblings, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07  8:08 UTC (permalink / raw)
  To: Liu Yubao, Junio C Hamano; +Cc: git

Liu Yubao wrote:
> Jakub Narebski wrote:
>>
>> Perhaps what you want is git log --committer=<owner of repo>?
>> 
> Thanks, it can't meet my requirement, if I create two branches
> and merge them, I can't easily tell the track of those two branches.

Use a graphical history viewer then: git-show-branch, gitk (Tcl/Tk),
qgit (Qt), the less used GitView (GTK+), tig (ncurses), or the least used
git-browser (JavaScript).

BTW, that is what the subject line (first line of the commit message) is for.
Note the "gitweb:", "Documentation:", "autoconf:", "Improve build:" in
the git log.


By the way, what is the status of the proposed "note" header extension
to the commit object? One could store the name of the branch we were/are on,
even though this is absolutely discouraged...
-- 
Jakub Narebski


* Re: how to show log for only one branch
  2006-11-07  2:21       ` Liu Yubao
@ 2006-11-07  8:21         ` Jakub Narebski
  0 siblings, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07  8:21 UTC (permalink / raw)
  To: git

Liu Yubao wrote:

> Junio C Hamano wrote:
>> [...]  It is also a bad
>> example because I can answer that question with this command
>> line:
>> 
>>      git log --grep='^Merge .* into next$' next
>> 
>> and while it is a perfectly valid answer, I know it would leave
>> you feeling somewhat cheated.
>> 
> smart trick, but if the logs aren't consistent enough it's hard to
> grep them out.

Well, commit messages for merges are generated automatically. And if you set
merge.summary=true in the repo config (or your user config), then you get a
shortlog in the merge commit message by default...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07  3:26       ` Liu Yubao
@ 2006-11-07  9:30         ` Andy Whitcroft
  2006-11-07 12:05           ` Liu Yubao
  0 siblings, 1 reply; 35+ messages in thread
From: Andy Whitcroft @ 2006-11-07  9:30 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Andreas Ericsson, Junio C Hamano, git

Liu Yubao wrote:
> Andreas Ericsson wrote:
>> Liu Yubao wrote:
>>
>> If "fake" commits (i.e., commits that don't change any content) are
>> introduced for each merge, it will change the ancestry graph and the
>> resulting tree(s) won't be mergable with the tree it merged with,
>> because each such "back-merge" would result in
>> * the "fake" commit becoming part of history
>> * a new "fake" commit being introduced
>>
>> Consider what happens when Alice pulls in Bob's changes. The
>> merge-base of Bob's tip is where Alice HEAD points to, so it results
>> in a fast-forward, like below.
>>
>> a---b---c---d               <--- Alice
>>              \
>>               e---f---g     <--- Bob
>>
>>
>> If, we would have created a fake commit instead, Alice would get a
>> graph that looks like so:
>>
>> a---b---c---d-----------h   <--- Alice
>>              \         /
>>               e---f---g     <--- Bob
>>
>>
>> Now, we would have two trees that are identical, because the merge
>> can't cause conflicts, but Alice and Bob will have reached it in two
>> different ways. When Bob decides he wants to go get the changes Alice
>> has done, his tree will look something like this:
>>
>> a---b---c---d-----------h          <--- Alice
>>              \         / \
>>               e---f---g---i        <--- Bob
>>
>>
>> He finds it odd that he's got two commits that, when checked out, lead
>> to the exact same tree, so he asks Alice to get his tree and see
>> what's going on. Alice will then end up with this:
>>
>> a---b---c---d-----------h---j      <--- Alice
>>              \         / \ /
>>               e---f---g---i        <--- Bob
>>
>>
>> Now there's four commits that all point to identical trees, but the
>> ancestry graphs differ between all developers. In the case above,
>> there's only two people working at the same project. Imagine the
>> amount of empty commits you'd get in a larger project, like the Linux
>> kernel.
>>
> Oh, you remind me, but I have a naive solution for this problem: print
> a hint and don't merge commits that contain fake commit, then I know I have
> reached a stable merge point and have same tree with others.

But in that situation you and Alice now have different actual history
DAGs in your repositories.

Alice sees:
a---b---c---d-----------h
             \         /
              e---f---g

Bob sees:
a---b---c---d-----------h
             \         / \
              e---f---g---i


If Bob now adds a new commit 'j' and Alice pulls it back, we either have
to then accept 'i' at Alice's end or forever lose the identity of
the commit DAG.  At that point our primary benefit, SHA1 ==
parent == same commit for everyone, is gone.  We can no longer say "this
commit is broken" and have everyone know which commit that is.
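
A toy Python sketch of why the ids diverge, assuming a commit id is a hash over the tree, the parent ids and the message. The function commit_id and the tree labels are made up for illustration; this is not git's real object format.

```python
import hashlib


def commit_id(tree, parents, message):
    """Toy commit id: SHA-1 over tree, parent ids and message."""
    payload = "tree %s\nparents %s\n\n%s" % (tree, " ".join(parents), message)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# 'd' stands in for the shared history; 'g' is the tip of Bob's work.
d = commit_id("tree_d", [], "d")
g = commit_id("tree_g", [d], "g")

# Alice records a fake commit h instead of fast-forwarding to g.
h = commit_id("tree_g", [d, g], "fake merge by Alice")
# Bob pulls from Alice and records his own fake commit i.
i = commit_id("tree_g", [g, h], "fake merge by Bob")

# Same tree everywhere, yet three distinct commit ids.
print(len({g, h, i}))  # 3
```

Every extra fake commit changes the parent list, so it gets a fresh id even though the tree never changes.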

> 
> We create a fake commit for fast forwarding style merge, this fake commit
> is used to record the track of a branch, so we can always follow HEAD^1
> to travel through the history of a branch. In fact, git pays more attention
> to the history of *data modification* than history of *operation*, that is
> right the subtle difference between content tracker and VCS, latter's
> branch has more information(useful information, I think).

Any VCS is concerned with data modification and how it's tracked.  There
are two ways you can record history: a series of snapshots (git) or a
series of operations (e.g. cvs and svn).  Each has its trade-offs: an
operation like diff is O(number of files) on snapshots, while on deltas it
is O(number of files * number of deltas).

The difference here is all about the interpretation of the word
'branch'.  In CVS and others there is the hard concept of a mainline:
here is the master copy, and when something is added here it is "the one".
Branches are temporary places which contain 'different' history, such as
a patch branch.  If you want something on both branches, you commit the
change twice, once to each.  In git, branches are more like separate future
histories.  When they are merged back together, the new single history
contains the changes in both; neither is more important than the other,
and both represent forward progress.  People tend to draw it as below,
giving a false importance to the 'line' from d->h:

a---b---c---d-----------h
             \         /
              e---f---g

We probably should draw the below: h's history contains all history
from both the 'up' and 'down' histories.  Which is more important?  Neither.
h is made up of a,b,c,d from Alice and e,f,g from Bob, merged by Alice.

              ---------
             /         \
a---b---c---d           h
             \         /
              e---f---g


> 
> Even if no fake commit is created as git does now, there can be multiple
> commits with identical tree object, and git can't prevent you from merging
> two commits with identical tree object, it just creates an ancestry
> relation
> to remember the merge point.
> 
> As git(7) says:
>         The "commit" object is an object that introduces the notion
>         of history into the picture. In contrast to the other objects,
>         it doesn't just describe the physical state of a tree, it
>         describes how we got there, and why.
> 
> So it's clearer to describe a revision graph with nodes for tree
> objects and edges for commit objects(multiple edges for a merge
> commit object, I know this will break your habit:-).

How would such a graph look any different?

>> Fast-forward is a Good Thing and the only sensible thing to do in a
>> system designed to be fully distributed (i.e., where there isn't
>> necessarily any middle point with which everybody syncs), while
>> scaling beyond ten developers that merge frequently between each other.
>>
>>> If we throw away all compatibility, efficiency, memory and disk
>>> consumption
>>> problems,
>>> (1) we can get the track of a branch without reflog because HEAD^1 is
>>> always the tip of target branch(or working branch usually) before
>>> merging.
>>>
>>> (2) with the track, branch mechanism in git is possibly easier to
>>> understand,
>>> especially for newbies from CVS or Subversion, I really like git's
>>> light weight, simple but powerful design and great efficiency, but I
>>> am really
>>> surprised that 'git log' shows logs from other branches and a side
>>> branch can become part of main line suddenly.
>>>
>>> A revision graph represents fast forwarding style merging like this:
>>>
>>>             (fast forwarding)
>>>  ---- a ............ * ------> master
>>>        \            /
>>>         b----------c -----> test         (three commits with three
>>> trees)
>>>
>>> can be changed to:
>>>
>>>  ---- a (tree_1) ----------- d (tree_3) ------> master
>>>        \                    /
>>>         b (tree_2) ------- c (tree_3) ----> test
>>> (four commits with three trees, it's normal as more than one way can
>>> reach Rome :-)
>>>
>>
>> That's where our views differ. In my eyes, "d" and "c" are exactly
>> identical, and I'd be very surprised if the scm tried to tell me that
>> they aren't, by not giving them the same revid.

These two aren't identical.  You have two different routes to Rome, you
have two different lines on your map.  To just say 'they' are the same
and throw one away is to throw away just that history you care about.

> It doesn't matter, they have same tree, and it's normal too in git
> multiple commits have same tree, if you use nodes for tree state,
> that graph will be simple to understand:
> 
>           a              d
>         -----tree_1 -------------- tree_3 ----> master
>                  \                    / \
>                   \ b               d/c  `-----> test
>                    \                /
>                     `--- tree_2 ---'
> 
> This is the familiar way we used in CVS, I believe there are more
> than one people confused by fast forwarding style merge and 'git log'
> in git.



* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07  7:27       ` Liu Yubao
@ 2006-11-07  9:46         ` Andy Whitcroft
  2006-11-07 12:08           ` Liu Yubao
  2006-11-07 16:05         ` Linus Torvalds
  1 sibling, 1 reply; 35+ messages in thread
From: Andy Whitcroft @ 2006-11-07  9:46 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Linus Torvalds, Junio C Hamano, git

Liu Yubao wrote:
> Linus Torvalds wrote:
>>
>> On Mon, 6 Nov 2006, Liu Yubao wrote:
>>> Then, what bad *logical* problem will happen if a merging that is
>>> really a
>>> fast forwarding creates a new commit?
>>
>> You MUST NOT do that.
>>
>> If a fast-forward were to do a "merge commit", you'd never get into
>> the situation where two people merging each other would really ever
>> get a stable result. They'd just keep doing merge commits on top of
>> each other.
> They can stop merging a fake commit with a real commit that point to same
> tree object, here they reach a stable result: we have same tree content.
>>
>> Git tracks history, not "your view of history". Trying to track "your
>> view" is fundamentally wrong, because "your view" automatically means
>> that the project history would not be distributed any more - it would
>> be centralized around what _you_ think happened. That is not a
>> sensible thing to have in a distributed system.
> It's not my view, it's branch scope view, I can see how a branch evolves
> relatively independently. In git, branch scope view is more or less
> neglected.
> After fast forwarding merge, I can' tell where a branch come from -- I mean
> the track of a branch.
> 
> If Junio publishes his reflog, I don't see what conflict will happen
> between
> his local view (but now public, and naming it branch scope view seems more
> sensible) and git's global view.
> 
> If this won't lead to problems, it seems also ok to use fake commit for
> fast forwarding style merge, so we can follow HEAD^1 to travel through a
> branch without reflog.
> 
> I hope I have expressed my thought clearly.
>>
>> For example, the way to break the "infinite merges" problem above is
>> to say that _you_ would be special, and you would do a "fast-forward
>> commit", and the other side would always just fast-forward without a
>> commit. But that is very fundamentally against the whole point of
>> being distributed. Now you're special.
> No one is special as everybody can create fake commit, any branch (almost
> a tag) will never be overwritten to point to a commit object in
> another branch, branches are relatively independent, that's to say
> 'git log' will reflect what has happened really in current branch (a CVS
> semantical branch, not only a tag that always points to a tip commit).
>>
>> In fact, even for "you", it would be horrible - because you personally
>> might have 5 different repositories on five different machines. You'd
>> have to select _which_ machine you want to track. That's simply
>> insane. It's a totally broken model. (You can even get the same
>> situation with just _one_ repository, by just having five different
>> branches - you have to decide which one is the "main" branch).
> What's the meaning of an upstream branch then? I have to know I should track
> Junio's public repository.
> 
> When does one say two branches reach a common point? have same commit(must
> point to same tree) or have same tree(maybe a fake commit and a real
> commit)?
> I think git takes the first way.
> 
> Fast forwarding style merge tends to *automatically* centralize many
> branches,  in CVS people merge two branches and drop side branch to
> centralize them, they all have central semantics.
> (I don't want to get flame war between CVS/SVN and GIT, I think
> git is better than them really:-)
>>
>> Besides, doing an empty commit like that ("I fast forwarded")
>> literally doesn't add any true history information. It literally views
>> history not as history of the _project_, but as the history of just
>> one of the repositories. And that's wrong.
> Something like 'git log --follow-all-parent' can show history of the
> project
> as 'git log' does now.
>>
>> So just get used to it. You MUST NOT do what you want to do. It's stupid.
> Yes, I have understood the git way and am getting used to it, I like
> its simple but powerful design and great efficiency, thank all for your
> good work!
>>
>> If you want to track the history of one particular local branch, use
>> the "reflog" thing. It allows you to see what one of your local
>> branches contained at any particular time.
>>
>> See
>>
>>     [core]
>>         logAllRefUpdates = true
>>
> Thanks, it's a pity I can't pull Junio's reflog :-(

One thing to remember, when you merge the destination into which you
merge will be HEAD^1, so by just following that you can get junio's view
of his branch as he made it.

This doesn't terminate properly, sucks the performance of your
machine and generally should be erased rather than run; but you get the
idea:

n=0
while git show --pretty=oneline -s "next~$n"
do
        n=$((n+1))
done | less
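
The loop above walks next~0, next~1, ... with one git process per commit.
A hedged sketch of the same first-parent walk in a single command, assuming
a git that has the "--first-parent" option for "git log" (that option is not
mentioned anywhere in this thread). The throwaway repository, the commit
helper and the commit names a..g are made up to mirror the graph from the
first message:

```shell
set -e
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: one new file per commit, so merges never conflict.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a; commit b
git branch test                 # test forks off at b
commit c
git checkout -q test
commit d; commit e
git checkout -q master
git merge -q -m f test          # f is a true merge: parents are c and e
commit g

# Follow only the first parent of every commit, starting at master.
first_parent_log=$(git log --first-parent --format=%s master | paste -sd ' ' -)
echo "$first_parent_log"
```

Keep in mind the caveat raised elsewhere in the thread: after a fast-forward
the first-parent chain is no longer "what this branch pointed at", so this is
only the integrator's view when every merge was a true merge.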


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-06 17:48       ` Linus Torvalds
  2006-11-07  7:59         ` Liu Yubao
@ 2006-11-07 11:46         ` Eran Tromer
  1 sibling, 0 replies; 35+ messages in thread
From: Eran Tromer @ 2006-11-07 11:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Hi Linus,

On 2006-11-06 19:48, Linus Torvalds wrote:
> 
> On Mon, 6 Nov 2006, Linus Torvalds wrote:
>> Besides, doing an empty commit like that ("I fast forwarded") literally 
>> doesn't add any true history information. It literally views history not 
>> as history of the _project_, but as the history of just one of the 
>> repositories. And that's wrong.
> 
> Btw, absolutely the _only_ reason people seem to want to do this is 
> because they want to "pee in the snow" and put their mark on things. They 
> seem to want to show "_I_ did this", even if the "doing" was a total 
> no-op and they didn't actually generate any real value.

In a project that uses topic branches extensively, the merge-induced
commits give a useful cue about the logical grouping of patches. They
let you easily glean the coarse-grained history and independent lines of
work ("pickaxe made it to next", "Linus got the libata updates") without
getting bogged down by individual commits, just by looking at the gitk
graph. Fast-forwards lose this information, and the more you encourage
them, the less grokkable history becomes.

Empty commits may be the wrong tool to address this (for all the reasons
you gave), but there's certainly useful process information that's
currently being lost.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07  9:30         ` Andy Whitcroft
@ 2006-11-07 12:05           ` Liu Yubao
  2006-11-07 12:17             ` Jakub Narebski
  0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 12:05 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Andreas Ericsson, Junio C Hamano, git

Andy Whitcroft wrote:
> Liu Yubao wrote: 
> But in that situation you and Alice now have different actual history
> DAG's in your repositories.
> 
> Alice sees:
> a---b---c---d-----------h
>              \         /
>               e---f---g
> 
> Bob sees:
> a---b---c---d-----------h
>              \         / \
>               e---f---g---i
> 
> 
> If bob now adds a new commit 'j' and alice pulls it back we either have
> to then accept 'i' at alice's end or forever lose the identicality of
> the commit DAG.  At which point our primary benefit of the SHA1 ==
> parent == same commit for everyone is gone.  We can no longer say "this
> commit is broken" and everyone know which commit that is.
> 
Alice and bob have their own branch scope view respectively, they have two
different branches, their DAGs in *branch scope view* can
be different because they trace the history from different points.

In branch scope view, you see only one HEAD, it merges changes from
other branches. Each branch has its own commit DAG.

In global scope view, you see many HEADs, they fork and merge frequently,
here is only one big commit DAG, but you can never see the whole as branches
can be distributed over the world.

A fake commit doesn't break the DAG in the global scope view: it has parents
like a normal commit, although the trees pointed to by the fake commit and its
parent are the same. In fact, git already has such commits:

   a (tree_1) -------  b (tree_2)  ---- d (tree_2) ---> master
    \                                    /
     `---------------  c (tree_2) ------' -----> test

If you don't pull from others, you can get a different global DAG, which is
obviously normal. It doesn't matter that you get a different DAG in branch
scope; of course they are different.

The problem is you can't get branch *track* from global scope view in git, you
can't tell which commits a branch has *referred to*. Note following HEAD^1 
isn't right as Junio pointed out 
(http://marc.theaimsgroup.com/?l=git&m=116279354214757&w=2).

Branch track is useful as people have requested reflog feature (realized, but
only for local purpose) and "note" extension in commit object.

If you have a commit A that I haven't pulled, I can't know what you
refer to when you say "Commit A introduced a bug". I must know where
to get this commit. After I pull it from other branch, We can say "this
commit is broken" and everyone know which commit that is.

>> We create a fake commit for fast forwarding style merge, this fake commit
>> is used to record the track of a branch, so we can always follow HEAD^1
>> to travel through the history of a branch. In fact, git pays more attention
>> to the history of *data modification* than history of *operation*, that is
>> right the subtle difference between content tracker and VCS, latter's
>> branch has more information(useful information, I think).
> 
> Any VCS is concerned with data modification and how its tracked.  There
> are two ways you can record history.  A series of snapshots (git) or a
> series of operations (eg cvs and svn).  Each has its trade offs,
> operations like diff on snapshots is O(number of files), on diffs they
> are O(number of files * number of deltas).
> 
> The difference here is all about the interpretation of the word
> 'branch'.  In CVS and others there is the hard concept of a mainline --
> here is the master copy when something is added here it is "the one",
> branches are temporary places which contain 'different' history such as
> a patch branch.  You want something on both branches you commit the
> change twice once to each.  In git they are more separate future
> histories.  When they are merged back together the new single history
> contains the changes in both, neither is more important than the other
> both represent forward progress.  People tend to draw as below giving a
> false importance to the 'line' from d->h:
> 
> a---b---c---d-----------h
>              \         /
>               e---f---g
> 
> We probably should draw the below, h's history contains all history
> from both 'up' and 'down' histories.  Which is more important?  Neither.
>  h is made up of a,b,c,d from alice and e,f,g from bob merged by alice.
> 
>               ---------
>              /         \
> a---b---c---d           h
>              \         /
>               e---f---g
> 
> 
If fake commit is introduced, a possible revision graph is like this:

   a - * -- c  ------- * ---> branchA
    \ /      \         /
     b ------ * ---- d ---> branchB      ('*' stands for fake commit)

It's indeed not pretty as a linear revision graph that git's fast forwarding
style merge creates, but it can record the tracks of two branches by following
HEAD^1.
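
For what it's worth, the "fake commit" described above looks very much like
what "git merge --no-ff" produces in current git (an option not discussed in
this thread): a merge commit is recorded even though a fast-forward was
possible. A minimal sketch, assuming such a git; the repository, the commit
helper and the branch names branchA/branchB are only illustrative:

```shell
set -e
# Throwaway repository; names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branchA
git config user.name "Example Author"
git config user.email "author@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a
git checkout -q -b branchB
commit b
git checkout -q branchA

# branchA has no work of its own, so a plain merge would just fast-forward.
# --no-ff forces a merge commit whose tree equals branchB's tree.
git merge -q --no-ff -m "merge branchB" branchB

words=$(git rev-list --parents -n 1 branchA | wc -w)   # commit id + parent ids
track=$(git log --first-parent --format=%s branchA | paste -sd ' ' -)
echo "$track"
```

The first-parent chain of branchA then skips b and shows only the merge,
which is the "track" being asked for; the objections about extra noise and
one side becoming special apply to this commit just as they do to the
proposal.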

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07  9:46         ` Andy Whitcroft
@ 2006-11-07 12:08           ` Liu Yubao
  2006-11-07 13:15             ` Andy Whitcroft
  0 siblings, 1 reply; 35+ messages in thread
From: Liu Yubao @ 2006-11-07 12:08 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Linus Torvalds, Junio C Hamano, git

Andy Whitcroft wrote:
> 
> One thing to remember, when you merge the destination into which you
> merge will be HEAD^1, so by just following that you can get junio's view
> of his branch as he made it.
> 
> This doesn't terminate properly, sucks the performance of your
> machine and generally should be erased rather than run; but you get the
> idea:
> 
> n=0
> while git show --pretty=oneline -s "next~$n"
> do
>         n=$((n+1))
> done | less
> 
> -apw
> 
This is not the right way to view a branch track in git, see Junio's explanation
about this from http://marc.theaimsgroup.com/?l=git&m=116279354214757&w=2

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07 12:05           ` Liu Yubao
@ 2006-11-07 12:17             ` Jakub Narebski
  0 siblings, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 12:17 UTC (permalink / raw)
  To: git

Liu Yubao wrote:
[...]
I think everything stems from the fact that git repositories which pull/push
with each other _share_ [parts of] the DAG. Learn to live with it, or choose
a different SCM.

You want a branch to be a path through the DAG, not only a lineage sub-DAG...
but recording this information is, I think, costly.

Note also that the pointers to DAG branches can be named differently in
different repositories (e.g. 'master' in one repository might be 'origin'
in the other, and 'remotes/origin/master' in yet another).
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07 12:08           ` Liu Yubao
@ 2006-11-07 13:15             ` Andy Whitcroft
  0 siblings, 0 replies; 35+ messages in thread
From: Andy Whitcroft @ 2006-11-07 13:15 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Linus Torvalds, Junio C Hamano, git

Liu Yubao wrote:
> Andy Whitcroft wrote:
>>
>> One thing to remember, when you merge the destination into which you
>> merge will be HEAD^1, so by just following that you can get junio's view
>> of his branch as he made it.
>>
>> This doesn't terminate properly, sucks the performance of your
>> machine and generally should be erased rather than run; but you get the
>> idea:
>>
>> n=0
>> while git show --pretty=oneline -s "next~$n"
>> do
>>         n=$((n+1))
>> done | less
>>
>> -apw
>>
> This is not a right way to view a branch track in git, see Junio's
> explanation
> about this from http://marc.theaimsgroup.com/?l=git&m=116279354214757&w=2

Well, in fact that message tells us more about why a branch-centric view is
likely not useful.  This output is still, the majority of the time, the
view from the branch integrator's point of view.  If that is something
you care about, I am not sure it is something I care about.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07  7:27       ` Liu Yubao
  2006-11-07  9:46         ` Andy Whitcroft
@ 2006-11-07 16:05         ` Linus Torvalds
  2006-11-07 16:39           ` Jakub Narebski
  2006-11-07 21:37           ` If merging that is really fast forwarding creates new commit Junio C Hamano
  1 sibling, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-07 16:05 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Junio C Hamano, git



On Tue, 7 Nov 2006, Liu Yubao wrote:
>
> > If a fast-forward were to do a "merge commit", you'd never get into the
> > situation where two people merging each other would really ever get a stable
> > result. They'd just keep doing merge commits on top of each other.
>
> They can stop merging a fake commit with a real commit that point to same
> tree object, here they reach a stable result: we have same tree content.

That's flawed for two reasons:

 - identical trees is meaningless. You can have identical trees that had 
   different histories and just happened to end up in the same state, and 
   you'd still generate a merge commit (because what merges do is show the 
   history of the data, and the _history_ merges). 

   So you're really just introducing a special case, and not even one that 
   makes any sense. Either history matters, or it doesn't.

 - a distributed system fundamentally means that nobody is "special". And
   a merge is a _joining_ of two threads. Neither of which is special. 

   Let's say that we have

	A:	a -> b -> c -> d

	B:	a -> b -> c

   and B pulls. You think that it should result in

	B:	a -> b -> c  --->  e
		            \    /
		             > d

   and I say that that is crazy, because in a distributed system, A and B 
   are _equivalent_ and have the same branches, and tell me what would 
   have happened if _A_ had pulled from _B_ instead?

   That's right: if A had pulled from B, then obviously nothing at all 
   would happen, because A already had everything B had.

   So the only _logical_ thing to happen is that the end result doesn't 
   depend on who merged. And that means that if B merged from A, then the 
   end result _has_ to be the same as if A merged from B, namely

	B:	a -> b -> c -> d

   and nothing else. Anything else is insane. It's not a distributed 
   system any more.
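
The A/B example above can be tried out directly. A minimal sketch, under the
assumption that two branches in one throwaway repository stand in for the two
repositories A and B; the commit helper and all names are illustrative:

```shell
set -e
# Throwaway repository; "master" plays the role of A, branch "B" of B.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a; commit b; commit c
git branch B                    # B: a -> b -> c
commit d                        # A: a -> b -> c -> d

# B merges from A: B has nothing A lacks, so this is a pure fast-forward.
git checkout -q B
git merge -q --ff-only master
tip_A=$(git rev-parse master)
tip_B=$(git rev-parse B)
merges=$(git rev-list --merges B | wc -l)

# A merges from B: A already has everything, so nothing happens at all.
git checkout -q master
git merge -q B
echo "A=$tip_A B=$tip_B merge commits on B: $merges"
```

Either direction ends with both names pointing at d and no merge commit in
the history.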



> > Git tracks history, not "your view of history". Trying to track "your view"
> > is fundamentally wrong, because "your view" automatically means that the
> > project history would not be distributed any more - it would be centralized
> > around what _you_ think happened. That is not a sensible thing to have in a
> > distributed system.
>
> It's not my view, it's branch scope view, I can see how a branch evolves
> relatively independently.

No you CAN NOT. You think that "A" is special. But because you think that 
A is special, you ignore that B had the exact same branch, so your "branch 
scope view" is inherently flawed - it's not "branch scope" at all, it's 
literally a "one person is special" view.

> In git, branch scope view is more or less neglected. After fast 
> forwarding merge, I can't tell where a branch comes from -- I mean the
> track of a branch.

Sure you can. In your reflog. It's only _you_ who care about _your_ 
history. Nobody else cares one whit about what your tree looks like.
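
A small sketch of what the reflog records across a fast-forward, assuming the
core.logAllRefUpdates setting quoted earlier in the thread and a git that
provides "git reflog show"; the repository, the commit helper and the branch
name "topic" are hypothetical:

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git config core.logAllRefUpdates true

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a
git checkout -q -b topic
commit b; commit c
git checkout -q master
git merge -q --ff-only topic    # master jumps straight from a to c

# The local, per-repository record of where master has pointed.
git reflog show master
entries=$(git reflog show master | wc -l)
previous=$(git log -1 --format=%s 'master@{1}')
echo "master previously pointed at: $previous"
```

The commit graph itself says nothing about master having sat at a; only this
repository's reflog does, which is why it stays local.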

> If Junio publishes his reflog, I don't see what conflict will happen between
> his local view (but now public, and naming it branch scope view seems more
> sensible) and git's global view.

Why would anybody ever care about Junio's reflog?

Also, you're ignoring the issue that both I and Martin mentioned: you're 
making history harder to read, and adding crud that doesn't actually _do_ 
anything. Your approach is nonsensical from a distributed system 
standpoint, but it's also _worse_ than just fast-forwarding. If git did 
what you suggested, we'd have a lot of extra merge commits that simply 
don't _help_ anything, and only make things worse.

> What's the meaning of an upstream branch then? I have to know I should track
> Junio's public repository.

"Upstream" really should have absolutely zero meaning. That's the whole 
point of distributed. You can merge things sideways, down, up, and the end 
result doesn't matter. "upstream" can merge from you, and you can merge 
from him. That's the _technology_.

The only thing that matters is "trust". But trust is not something you get 
from technology, and trust is something you have to earn. And trust does 
NOT come from digital signatures like some people believe: digital 
signatures are a way of _verifying_ the trust you have, but they are very 
much secondary (or tertiary) to the real issues.

And _trust_ is why you'd pull from Junio. Git makes it somewhat easier by 
giving you default shorthands for the original place you cloned from when 
you clone a new repository, because often you'd obviously keep trusting 
the same source, but an important thing here is to realize that it really 
is "often". Not always. And it's not about technology.

> When does one say two branches reach a common point? have same commit(must
> point to same tree) or have same tree(maybe a fake commit and a real commit)?
> I think git takes the first way.

Very much so. To git, the only (and I really mean _only_) thing that 
matters from a commit history view is the commit relationships. NOTHING 
else. What the trees are doesn't matter at all. Where the commits came 
from doesn't matter. Who made them doesn't matter either - those are just 
"documentation".

So the _only_ thing that matters for a commit is what its place in history 
was. We never even look at the trees at all to decide what to do about 
merging. The only time the trees start to matter is when we've figured out 
what the merge relationship is, and then obviously the trees matter, but 
even then they only matter as far as the resulting _tree_ is concerned. 
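
To make the "trees don't decide anything" point concrete, here is a hedged
sketch in which two branches reach byte-identical trees through different
histories. The repository, the commit helper and the branch names left/right
are made up for illustration:

```shell
set -e
# Throwaway repository; names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b left
commit x; commit y              # left adds x, then y
git checkout -q -b right master
commit y; commit x              # right adds y, then x

left_tree=$(git rev-parse 'left^{tree}')
right_tree=$(git rev-parse 'right^{tree}')
left_tip=$(git rev-parse left)
right_tip=$(git rev-parse right)
base=$(git merge-base left right)

# Same tree on both sides, yet the histories differ, so merging them
# records a real merge commit with two parents.
git checkout -q left
git merge -q -m "join left and right" right
words=$(git rev-list --parents -n 1 left | wc -w)   # commit id + parent ids
echo "trees equal: $left_tree = $right_tree"
```

The merge base is found purely from the commit relationships; the trees only
come into play once git has to produce the resulting tree.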

> Fast forwarding style merge tends to *automatically* centralize many
> branches

Yes. Except I wouldn't say "centralize", I would very much say "join". 
That's the point of a merge. Two commit histories "join" and become one.

But the reason I don't agree with your choice of wording ("centralize")
is fundamental:

 - it only happens on one side. The side that does the merge is not 
   necessarily the "central" one at all.

 - there isn't necessarily even such a thing as a "central" branch in git 
   (and there _shouldn't_ be).

In fact, the thing I absolutely _detest_ about CVS is how it makes it 
almost impossible to have multiple "equally worthy" branches. Look at the 
git repository itself that Junio maintains, and please tell me which is 
the "trunk" branch?

Git doesn't even have that concept. There is the concept of a _default_ 
branch ("master"), and yes, the git repository has it. But at the same 
time, it really is just a default. There are three "main" branches that 
Junio maintains, and they only really differ in the degree of development. 
And "master" isn't even the most stable one - it's just the default one, 
because it's smack dab in the middle: recent enough to be interesting, but 
still stable enough to be worth tracking for just about anybody.

But really, "maint" is the stable branch, and in many ways you could say 
that "maint" is the trunk branch, since that's what Junio still cuts 
releases from. And "next" is the development branch, that gets interesting 
features before they hit the "master" branch (and "pu" is so far out that 
it's a whole different issue, since it jumps around and doesn't even 
become a real history at all).

See? All of these are _equal_. There is no trunk. There is no "central" 
branch, and if you were to have to decide which one is the most central 
one, it's not even the default one, that would probably be "maint", since 
that's the one that keeps getting merged into the other branches.

So doing a merge doesn't really "centralize" anything. It just joins the 
two development threads together in that particular line. If "master" 
merges the work in "maint", master doesn't really get any more 
centralized, it just gets the work that "maint" did since last time. And 
if there was no other work done at all, then the two branches end up 100% 
identical - there was no "merge" of the work.

They still have their own identities, though. It's still two branches. 
It's still "maint" and "master". They just have the exact same state, and 
that is as it should be, since they've had the exact same development 
history.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07 16:05         ` Linus Torvalds
@ 2006-11-07 16:39           ` Jakub Narebski
  2006-11-07 21:37           ` If merging that is really fast forwarding creates new commit Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 16:39 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> So doing a merge doesn't really "centralize" anything. It just joins the 
> two development threads together in that particular line. If "master" 
> merges the work in "maint", master doesn't really get any more 
> centralized, it just gets the work that "maint" did since last time. And 
> if there was no other work done at all, then the two branches end up 100% 
> identical - there was no "merge" of the work.

By the way, merges happen in _two_ directions. 'Master' merges from 'next'
when 'next' is in sufficiently stable state; 'next' merges from 'master' to
get changes which were considered stable enough to be put into
'master' (and 'master' merges in from 'maint', too).

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch]
  2006-11-07  7:59         ` Liu Yubao
@ 2006-11-07 17:23           ` Linus Torvalds
  2006-11-07 18:23           ` If merging that is really fast forwarding creates new commit Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Linus Torvalds @ 2006-11-07 17:23 UTC (permalink / raw)
  To: Liu Yubao; +Cc: Junio C Hamano, git



On Tue, 7 Nov 2006, Liu Yubao wrote:
> 
> Fake commit is only for digging branch scope history, I can *outline* what has
> been merged to a branch and don't care about how these good work are done on
> earth.

The thing is, I think you see a good thing ("outlining"), and miss all the 
downsides ("extra noise", "incorrect outlining").

Yes, I can see it being useful for reading logs in a perfect world.

However, in real life, more than half of my fast-forwards are just me 
tracking another branch. An "outline" would be _wrong_. I _want_ to 
fast-forward, because I'm moving the trees from one machine to another, 
and the reason it's a fast-forward is exactly the fact that absolutely 
zero work had been done on the machine I'm pulling from - I'm pulling just 
to keep up-to-date.

So now, just to keep things sane, your scheme would require that people 
AHEAD OF TIME tell the system whether they want to fast-forward or whether 
they want to create a magic merge commit as an "outlining" marker.

See? Fast-forwarding is absolutely the right thing to do in 99% of all 
cases. For me, it's perhaps only half, because I do several true merges 
every day, but that's really quite unusual - I'm the top-level maintainer. 
Nobody else should EVER do it.

And the thing is, I refuse to work with a system that makes one person 
special. I _know_ I'm special, I'm the smartest, most beautiful, and just 
simply the best person on the planet. I don't need a tool that tells me 
so.

So deep down, what you're really suggesting that there be a special mode 
that is ONLY ever used for the top-level maintainer, so that he can create 
an "outline" in the history.

Put that way, it almost makes sense, until you realize that 99.9% of all 
people aren't top-level maintainers, and you don't want them creating crap 
like that. And that "outlining" is likely most easily done with

	( git log lastversion.. | git shortlog ;
	  git diff --stat --summary lastversion.. ) | less -S

instead.

But more importantly, I don't personally like the "top-level maintainer" 
model. Yes, it's how people do end up working a lot, but quite frankly, 
I'd rather not have the tool support it, especially if there is ever a 
schism in a development process. I want to support _forking_, which very 
much implies having somebody pulling the "wrong way".

Time for some purely philosophical arguments on why it's wrong to have 
"special people" encoded in the tools:

I think that "forking" is what keeps people honest. The _biggest_ downside 
with CVS is actually that a central repository gets so much _political_ 
clout, that it's effectively impossible to fork the project: the 
maintainers of a central repo have huge powers over everybody else, and 
it's practically impossible for anybody else to say "you're wrong, and 
I'll show how wrong you are by competing fairly and being better".

For example, gcc (and other tools) have gone through this phase. You've 
had splinter groups (eg pgcc) that did a hell of a lot better work than 
the main group, and the tools really made it really hard for them to make 
progress. I think the most important part of a distributed SCM is not even 
to support the "main trunk", but to support the notion that anybody can 
just take the thing and compete fairly.

With the kernel as an example, any group could literally just start their 
own kernel git tree, and git should make it as easy as humanly possible 
for them to track my tree WHILE _THEY_ STILL REMAIN IN CHARGE of their own 
tree. That doesn't mean that forking is easy - over the years people have 
simply grown so _used_ to me that they mostly trust me and they are comfy 
working with me, because even if I've got my quirks (or "major personality 
disorders" as some people might say), people mostly know how to work with 
them.

But the point is, there should be no _tool_ issues. As far as git is 
concerned, every single developer can feel like he is the top-level 
maintainer - it doesn't have to be a hierarchy, it really can be a 
"network of equal developers". I want the _tool_ to have that world-view, 
even if most projects in the end tend to organize more hierarchically than
that. Because the "everybody is equal" worldview actually matters in the 
only case that _really_ matters: when problems happen.

For example: I use git to maintain a few other projects I've started too. 
I use git to maintain git itself, but I'm no longer the maintainer, simply 
because I think it's a lot better to step down than stand in the way of 
somebody better, and because I think it's hard to be the "lead person" on 
multiple projects. 

The same thing is happening to "sparse", which was dormant for a while (it 
worked, and I fixed problems as people reported them, but it did 
everything I had set out to do, so my motivation to develop it further had 
just gone down a lot). What happened? Somebody else came along, showed 
interest, started sending me patches, and I just suggested he start his 
own tree and start maintaining it.

Now, both of those transitions were very peaceful, but it should work that 
way even if the maintainer were to fight tooth and nail to hold on to his 
"top dog" status. And that's where it's important that the tool not 
separate out "top maintainers" from "other people".

> I want to separate a branch, not to separate commits by some author, for
> example, many authors can contribute to git's master branch, I want to
> know what happened in the master branch like this:
>      good work from A;
>      good work from C;
>      merge from next;   -----> I don't care how this feature is realized.
>      good work from A;

Really, "git log | git shortlog" will come quite close. I use it all the 
time for the kernel, and it's powerful.

Try it with the kernel archive, just for fun. Do

	git log v2.6.19-rc4.. | git shortlog | less -S

with the current kernel, and see how easy it is to get a kind of feel for 
what is going on. We do it by two means:

 - sorting by author. 

   This sounds silly, but it's actually very powerful. It's not so much 
   that it credits people better (it does) or that it makes the logs 
   shorter by mentioning the person just once (it does that too), it's 
   really nice because people tend to automatically do certain things. One 
   person does "random cleanups". Another one works on "networking". A 
   third one maintains one particular architecture, and so on..

 - encourage people to have a "topic: explanation" kind of top line of the 
   commit (and encourage people to have that "summary line" in the first 
   place: not every SCM does that, and everybody else is strictly much 
   worse than git)

In fact, when I do this, I usually _remove_ the merges, because they end 
up being just noise. Really: go and look at the current kernel repo, and 
do the above one-liner, and realize that I have a hunking big set of 
commits credited to me right now (it says 30 commits), and in fact I think 
I'm the #1 author right now on that list.

But when I send out the description, I actually use the "--no-merges" flag 
to "git log", because those merge messages are _useless_. They really 
don't do anything at all for me, or for anybody else. Re-run the above 
one-liner that way, and suddenly I drop to just 5 commits (and quite 
often, I'm much less - sometimes the _only_ commit I have for an -rc 
release is the commit that changes the version number). But it's actually 
more readable.
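
The effect of "--no-merges" is easy to see on a toy history. A minimal
sketch, assuming a git with "git rev-list --count" (used here only for
counting); the repository, the commit helper and the branch name "topic" are
illustrative:

```shell
set -e
# Throwaway repository; names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a
git checkout -q -b topic
commit b; commit c
git checkout -q master
commit d
git merge -q -m "Merge branch 'topic'" topic    # a true merge

# Summary including the merge commit, then without it.
git log master | git shortlog
git log --no-merges master | git shortlog

with_merges=$(git rev-list --count master)
without_merges=$(git rev-list --count --no-merges master)
echo "$with_merges commits, $without_merges without merges"
```

Only the merge entry disappears from the summary; the commits that came in
through the merge are still listed under their author.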

So I can kind of see what you want, but I'm 100% convinced that the 
information you _really_ want is better done totally differently.

So if you want to get the "big picture" thing, git does actually support 
you in several ways. That "git shortlog" is very useful, but so is the 
"drill down by subsystem". For example, you could do

	git log --no-merges v2.6.19-rc4.. arch/ | git shortlog | less -S

and you'd get the "summary view" of what happened in architecture- 
specific code. It's not the same thing as the "merge log", but it's 
actually very useful.

(You can do the same with git. Something like

	git log --no-merges v1.4.3.4.. | git shortlog | less -S

shows quite clearly that a lot of new stuff is gitweb-related, for
example.)

Could we do better "reporting" tools? I'm absolutely sure we could. It 
might be interesting to be able to ignore not just commits, but "trivial 
patches" too. For example, if you're looking for what changed on a high 
level, you're not likely to care about patches that change just a few 
lines. You might want to see only the commits that change an appreciable 
fraction of code, and so it might be very interesting to have a "git 
shortlog" that would take patch size into account, for example.
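
A rough sketch of what such a size-aware report could look like, built only
from "git log --numstat" and awk; this is not an existing git feature. The
threshold variable min_lines, the '@' marker in the format string and the
throwaway repository are all assumptions made for the illustration:

```shell
set -e
# Throwaway repository; names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

echo "one line" > small.txt
git add small.txt
git commit -q -m "tiny fix"

printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > big.txt
git add big.txt
git commit -q -m "big feature"

# Keep only commits that add or delete at least min_lines lines in total.
# Subject lines are marked with '@'; numstat lines are "added deleted path".
min_lines=5
big_commits=$(git log --no-merges --numstat --format='@%s' |
  awk -v min="$min_lines" '
    /^@/    { if (seen && total >= min) print total, subject
              subject = substr($0, 2); total = 0; seen = 1; next }
    NF >= 3 { total += $1 + $2 }
    END     { if (seen && total >= min) print total, subject }')
echo "$big_commits"
```

Counting changed lines is a crude measure of "appreciable", and binary files
(reported as "-" by numstat) count as zero here.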

So I'm not saying that git is perfect. I'm just saying that there are 
better ways (with much fewer downsides) to get what you want, than the way 
you _think_ you want.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit
  2006-11-07  7:59         ` Liu Yubao
  2006-11-07 17:23           ` Linus Torvalds
@ 2006-11-07 18:23           ` Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 18:23 UTC (permalink / raw)
  To: Liu Yubao; +Cc: git

Liu Yubao <yubao.liu@gmail.com> writes:

> I want to separate a branch, not to separate commits by some author,
> for example, many authors can contribute to git's master branch, I
> want to know what happened in the master branch like this:
>      good work from A;
>      good work from C;
>      merge from next;   -----> I don't care how this feature is realized.
>      good work from A;
>      ....

So you want to see the list of commits that happened to be at the
tip of my 'master' branch.  I would not say that view does not
exist, but it is probably not very useful.  And how useless it
is depends mostly on the reason why you say "I don't care
how this feature is realized" in the above picture.  Care to
elaborate why not?

side note: I do not merge next to master so "from next" above in
reality would be "from a topic branch" or "from maint", but it
is otherwise a good example.
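
For readers using a current git, the view asked about above (master, master~1, master~2, ...) can be approximated as sketched below. This assumes the --first-parent option of "git log", which exists in current git but was not available when this thread was written; the helper name mainline_log is made up for the example.

```shell
# Hypothetical helper: list the commits a branch tip passed through by
# following only the first parent of every merge commit.
# $1: branch name (defaults to master).
mainline_log() {
    git log --first-parent --format='%h %s' "${1:-master}"
}

# Usage, inside a repository:
#   mainline_log master
```

On the graph from the first message this would list g, f, c, b, a, provided f is a real merge commit. After a fast-forward there is no merge commit, so the first-parent chain simply runs through the other branch's commits, which is the limitation discussed in this thread.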

What appeared in 'master' recently are three kinds of changes:

 - Many fixes that still apply to the 1.4.3 codebase were sent from
   the list (thanks, everybody!), which were applied to 'maint',
   and merged into 'master'.

 - Some other obviously correct fixes and changes that address
   issues on features added after the 1.4.3 release (hence
   missing from the 1.4.3 codebase and 'maint' but in 'master') were
   applied directly on 'master'.

 - Yet some other fixes and changes that concern post-1.4.3
   codebase (i.e. 'master only' changes) were forked off of the
   tip of 'master' when the patches were received, cooked in
   their own topic branches (which were merged in 'next'), and
   then merged into 'master'.

So, we have two kinds of obviously correct changes to 'master':
those that come from merges and those that are applied directly.
Things that happen to address older issues come as merges because
they equally apply to 'maint' and are merged into 'master'; things
that address newer issues are applied directly.  Put another way,
things that come as merges to 'master' are also of two kinds:
obviously correct ones that came through 'maint', and the ones
that might have looked slightly wrong in the initial version and
were later perfected in their own topic branches and then merged
into 'master'.

The decision between cooking in a topic branch and immediately
applying to 'master' is not based on the size but more on
perceived usefulness of the change (something that is correct in
the sense that it does not break the system may not deserve to
be merged if it does not do useful things) and quality of the
design and implementation.  The size of the series obviously
affects my perception but that is secondary.

Even when a patch is something that I should be able to judge as
obviously correct when I am relaxed and sane, I might lack time
and concentration to follow it fully, and instead decide to drop
it into its own topic branch and later merge it into 'master'
without need for much cooking.  That kind of patch _could_ have
(and should have) been applied directly to 'master' but comes as
a merge.

Sometimes I apply a patch to 'master' and then later realize
that the change is needed and applicable to 'maint' as well.  That
is cherry-picked to 'maint', resulting in two independent
commits.  They _could_ have (and should have) come through a
merge from 'maint' to 'master'.

So the change a patch introduces may have no relevance at all
to whether it arrived by direct application or by merge.
In other words, the avenue a particular patch took,
direct application or merge, should not
concern you.  I hope this illustrates why a view that tries
to summarize what merges brought in and to give a full description
of what was applied directly does not make much sense.

By the way, there are two reasons why you cannot have my
ref-logs.  First of all, I do not have one on 'master' nor
'next' myself.  More importantly, I rewind and rebuild these
branches before pushing out (of course I have some safety valve
to prevent me from rewinding beyond what I have already pushed
out), and the ref-log entries for those tips that were rewound
are not useful to you, and something I would rather not have
people even know about (think of it as giving me some
privacy).

If you really care about the branch tip history of my
repository, you can set up ref-log yourself on your remote
tracking branch.

Strictly speaking, that is the history of fetches by you, not
the history of merges and commits by me, but that is what
matters more to you.  If I pushed my changes out twice a day but
you were away for two days, you would have seen the state of my
repository four rounds back before you left and when you fetched
from me today you would have the latest; three states in between
were not something you can know.  But it does not matter -- your
repository did not have those three states, so not knowing
exactly which commit they were would not hurt you when
bisecting.  "It worked before I pulled yesterday morning but now
it is broken when I pulled this afternoon" would help your
bisect get started, but multiple state changes between the times
you fetched cannot matter.
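
A minimal sketch of the above with a current git, where the reflog of a remote-tracking branch records one entry per fetch that moved it. The helper names fetch_history and bisect_last_fetch are made up for this example; reflogs are normally on by default in a repository with a working tree, and can be enabled explicitly with "git config core.logAllRefUpdates true".

```shell
# Hypothetical helper: show the states a remote-tracking branch went
# through in *this* repository, newest first (one line per fetch).
# $1: remote name (default origin), $2: branch name (default master).
fetch_history() {
    git log -g --format='%gd %h %gs' "refs/remotes/${1:-origin}/${2:-master}"
}

# Hypothetical helper: start a bisect between the tip as of the latest
# fetch (bad) and the tip as it was just before that fetch (good).
bisect_last_fetch() {
    ref="refs/remotes/${1:-origin}/${2:-master}"
    git bisect start "${ref}" "${ref}@{1}"
}

# Usage, inside a clone:
#   fetch_history origin master
#   bisect_last_fetch origin master
```

The entries describe your own fetches only, so any states the upstream branch went through between two of your fetches do not appear.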

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: If merging that is really fast forwarding creates new commit
  2006-11-07 16:05         ` Linus Torvalds
  2006-11-07 16:39           ` Jakub Narebski
@ 2006-11-07 21:37           ` Junio C Hamano
  2006-11-07 22:02             ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
  1 sibling, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 21:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Liu Yubao

Linus Torvalds <torvalds@osdl.org> writes:

> Git doesn't even have that concept. There is the concept of a _default_ 
> branch ("master"), and yes, the git repository has it. But at the same 
> time, it really is just a default. There are three "main" branches that 
> Junio maintains, and they only really differ in the degree of development. 
> And "master" isn't even the most stable one - it's just the default one, 
> because it's smack dab in the middle: recent enough to be interesting, but 
> still stable enough to be worth tracking for just about anybody.
>
> But really, "maint" is the stable branch, and in many ways you could say 
> that "maint" is the trunk branch, since that's what Junio still cuts 
> releases from.

The branch 'maint' is meant to be the moral equivalent of the
efforts of your -stable team, so it shouldn't be "the trunk",
but you caught me.

We haven't seen a new release from 'master' for about a month.
I think the dust has settled already after two big topics
(packed-refs, delta-offset-base) were merged into 'master' since
v1.4.3, and it is now time to decide which topics that have been
cooking in 'next' are the ones I want in v1.4.4.  Perhaps by the
end of the week, I'll cut a v1.4.4-rc1 to start the pre-release
stabilization process.  No new features nor enhancements on
'master' after that until v1.4.4 final.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit]
  2006-11-07 21:37           ` If merging that is really fast forwarding creates new commit Junio C Hamano
@ 2006-11-07 22:02             ` Jakub Narebski
  2006-11-07 23:06               ` Linus Torvalds
  2006-11-07 23:19               ` Junio C Hamano
  0 siblings, 2 replies; 35+ messages in thread
From: Jakub Narebski @ 2006-11-07 22:02 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> We haven't seen a new release from 'master' for about a month.
> I think the dust has settled already after two big topics
> (packed-refs, delta-offset-base) were merged into 'master' since
> v1.4.3, and it is now time to decide which topics that have been
> cooking in 'next' are the ones I want in v1.4.4.  Perhaps by the
> end of the week, I'll cut a v1.4.4-rc1 to start the pre-release
> stabilization process.  No new features nor enhancements on
> 'master' after that until v1.4.4 final.
 
Do I understand correctly that the work on not exploding downloaded
pack on fetch, but making it non-thin, and related work on archival
packs (not to be considered for repacking) is not considered ready
(and tested)?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit]
  2006-11-07 22:02             ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
@ 2006-11-07 23:06               ` Linus Torvalds
  2006-11-07 23:36                 ` Planned new release of git Junio C Hamano
  2006-11-07 23:19               ` Junio C Hamano
  1 sibling, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2006-11-07 23:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Git Mailing List, Junio C Hamano



On Tue, 7 Nov 2006, Jakub Narebski wrote:
>  
> Do I understand correctly that the work on not exploding downloaded
> pack on fetch, but making it non-thin, and related work on archival
> packs (not to be considered for repacking) is not considered ready
> (and tested)?

I'd like to see a new version with both the packed refs and the 
non-exploding download on by default. Maybe time for a git-1.5.0 release 
from master?


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Planned new release of git
  2006-11-07 22:02             ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
  2006-11-07 23:06               ` Linus Torvalds
@ 2006-11-07 23:19               ` Junio C Hamano
  1 sibling, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 23:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: jnareb

Jakub Narebski <jnareb@gmail.com> writes:

> Junio C Hamano wrote:
>
>> We haven't seen a new release from 'master' for about a month.
>> I think the dust has settled already after two big topics
>> (packed-refs, delta-offset-base) were merged into 'master' since
>> v1.4.3, and it is now time to decide which topics that have been
>> cooking in 'next' are the ones I want in v1.4.4.  Perhaps by the
>> end of the week, I'll cut a v1.4.4-rc1 to start the pre-release
>> stabilization process.  No new features nor enhancements on
>> 'master' after that until v1.4.4 final.
>  
> Do I understand correctly that the work on not exploding downloaded
> pack on fetch, but making it non-thin, and related work on archival
> packs (not to be considered for repacking) is not considered ready
> (and tested)?

Perhaps I phrased it badly, but I doubt it.

In the above I am only saying that it probably is time for me to
decide which ones to further merge into 'master', without saying
which ones I think are ready right now.  That is because I
haven't started thinking about it.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Planned new release of git
  2006-11-07 23:06               ` Linus Torvalds
@ 2006-11-07 23:36                 ` Junio C Hamano
  0 siblings, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2006-11-07 23:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 7 Nov 2006, Jakub Narebski wrote:
>>  
>> Do I understand correctly that the work on not exploding downloaded
>> pack on fetch, but making it non-thin, and related work on archival
>> packs (not to be considered for repacking) is not considered ready
>> (and tested)?
>
> I'd like to see a new version with both the packed refs and the 
> non-exploding download on by default. Maybe time for a git-1.5.0 release 
> from master?

Don't worry, packed refs is already part of 'master' so whatever
the next feature release is called it will be part of it ;-).

^ permalink raw reply	[flat|nested] 35+ messages in thread

Thread overview: 35+ messages
2006-11-06  3:41 how to show log for only one branch Liu Yubao
2006-11-06  6:12 ` Junio C Hamano
2006-11-06 10:41   ` Liu Yubao
2006-11-06 18:16     ` Junio C Hamano
2006-11-07  2:21       ` Liu Yubao
2006-11-07  8:21         ` Jakub Narebski
2006-11-06 13:00   ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Liu Yubao
2006-11-06 13:39     ` If merging that is really fast forwarding creates new commit Rocco Rutte
2006-11-07  3:42       ` Liu Yubao
2006-11-06 13:43     ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Andreas Ericsson
2006-11-07  3:26       ` Liu Yubao
2006-11-07  9:30         ` Andy Whitcroft
2006-11-07 12:05           ` Liu Yubao
2006-11-07 12:17             ` Jakub Narebski
2006-11-06 15:48     ` Linus Torvalds
2006-11-06 16:03       ` Martin Langhoff
2006-11-06 17:48       ` Linus Torvalds
2006-11-07  7:59         ` Liu Yubao
2006-11-07 17:23           ` Linus Torvalds
2006-11-07 18:23           ` If merging that is really fast forwarding creates new commit Junio C Hamano
2006-11-07 11:46         ` If merging that is really fast forwarding creates new commit [Was: Re: how to show log for only one branch] Eran Tromer
2006-11-07  7:27       ` Liu Yubao
2006-11-07  9:46         ` Andy Whitcroft
2006-11-07 12:08           ` Liu Yubao
2006-11-07 13:15             ` Andy Whitcroft
2006-11-07 16:05         ` Linus Torvalds
2006-11-07 16:39           ` Jakub Narebski
2006-11-07 21:37           ` If merging that is really fast forwarding creates new commit Junio C Hamano
2006-11-07 22:02             ` Planned new release of git [was: Re: If merging that is really fast forwarding creates new commit] Jakub Narebski
2006-11-07 23:06               ` Linus Torvalds
2006-11-07 23:36                 ` Planned new release of git Junio C Hamano
2006-11-07 23:19               ` Junio C Hamano
2006-11-06 15:25 ` how to show log for only one branch Jakub Narebski
2006-11-07  3:47   ` Liu Yubao
2006-11-07  8:08     ` Jakub Narebski
