is rebase the same as merging every commit?

git.vger.kernel.org archive mirror

* is rebase the same as merging every commit?
@ 2008-06-26 23:04 David Jeske
  2008-06-27  0:51 ` Junio C Hamano
  2008-06-27 10:33 ` しらいしななこ
  0 siblings, 2 replies; 16+ messages in thread
From: David Jeske @ 2008-06-26 23:04 UTC (permalink / raw)
  To: git

Rebasing is described in the docs I've read as turning this: (sorry for the
dots)

..........A---B---C topic
........./
....D---E---F---G master

Into this:

...................A'--B'--C' topic
................../
.....D---E---F---G master

If I understand it right (and that's a BIG if), it's the same as doing a merge
of C into G where every individual commit in the C-line is individually
committed into the new C' line.

...........-------------A---B---C
........../            /   /   /
........./        /---A'--B'--C'  topic
......../        /
....D---E---F---G - master

(1) Is the above model a valid explanation?

(2) From the documentation diagrams, it looks like the rebased A' has only (G)
as a parent, not (A,G). If this is the case, why?  (i.e. not connecting those
nodes throws away useful information)

(3) If it only has (G) as a parent, does the rebase explicitly remove the
source A,B,C nodes from the repository? (the diagrams make it look like it
does) ..or do they just get cleaned up during GC?

* Re: is rebase the same as merging every commit?
  2008-06-26 23:04 is rebase the same as merging every commit? David Jeske
@ 2008-06-27  0:51 ` Junio C Hamano
  2008-06-27  1:08   ` Junio C Hamano
  2008-06-27  6:31   ` Pascal Obry
  2008-06-27 10:33 ` しらいしななこ
  1 sibling, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2008-06-27  0:51 UTC (permalink / raw)
  To: David Jeske; +Cc: git

"David Jeske" <jeske@willowmail.com> writes:

> Rebasing is described in the docs I've read as turning this: (sorry for the
> dots)
>
> ..........A---B---C topic
> ........./
> ....D---E---F---G master
>
> Into this:
>
> ...................A'--B'--C' topic
> ................../
> .....D---E---F---G master
>
> If I understand it right (and that's a BIG if), it's the same as doing a merge
> of C into G where every individual commit in the C-line is individually
> committed into the new C' line.
>
> ...........-------------A---B---C
> ........../            /   /   /
> ........./        /---A'--B'--C'  topic
> ......../        /
> ....D---E---F---G - master
>
>
> (1) Is the above model a valid explanation?

I would presume that the resulting tree A' in the second picture and in
the first picture would be the same, and likewise for B' and C'.  But that
is only true when the commits between A and C do not duplicate any of the
development that happened between E and G.

Thinking about it like that is an interesting mental exercise, but it is
not very useful otherwise.

> (2) From the documentation diagrams, it looks like the rebased A' has
> only (G) as a parent, not (A,G). If this is the case, why?  (i.e. not
> connecting those nodes throws away useful information)

You would rebase ONLY WHEN the project as a whole (either "other people
in the project", or "yourself down the road one year from now") is
interested mostly in the progress of 'master' D-E-F-G, and nobody cares
whether you developed your A (or B or C) on top of E or G.  So the answer
is a definite "no" --- the line you drew between A and A' is useless
information.  Nobody cares that you did it first on top of E but then
redid the patches based on G (because things changed between E and G).

If there were no "rebase", your changes would be integrated into the
'master' branch like this:

          A---B---C
         /         \
    D---E---F---G---M

Rebasing is a way to _help you_ pretend that you did _not_ start working
on an ancient code base that was at E.  You redo your series on top of the
latest and greatest G, the commit that everybody else agrees is the
current state of affairs when they see your changes for the first time, to
produce a history like this:

    D---E---F---G---A'--B'--C'

Doing so tends to make the history easier to understand, and more
importantly, it reduces mistakes during the integration _and_ distributes
the burden of integration away from a central point.

If E..G and A..C happen to have conflicting changes, rebasing puts the
burden of rewriting the changes A..C into A'..C', based on the modified
base code G, on _you_ (the person who is rebasing).  Some people do not
like this, as they feel that is an added, unwanted burden.  On the other
hand, if your upstream maintainer is integrating like the above picture to
create a merge 'M', it is more likely that he would make mistakes during
the conflict resolution than that you would make an incorrect adjustment
during your rebasing to recreate the series A'..C'.  You read what G gives
you as the foundation to build your changes on, determine what got changed
since E, on which you originally based your changes, and adjust your
changes to better integrate on top of G.  After all, A..C is _your code_
and you understand what it assumes better than anybody else.

If the fact that parallel developments have happened is important, instead
of the second picture like you drew, you will just do the real merge
naturally to create a merge "M" like the picture I drew above.

Your "A' is a merge between E and A, B' is a merge between A' and B" is not
something anybody is interested in if you are going to rebase.  It is not
interesting because it is not how things happened in real life at all,
and it is not interesting because it neither simplifies the history for
later analysis nor reduces mistakes during the conflict resolution.


* Re: is rebase the same as merging every commit?
  2008-06-27  0:51 ` Junio C Hamano
@ 2008-06-27  1:08   ` Junio C Hamano
       [not found]     ` <willow-jeske-01l79c1jFEDjCWw6-01l7@0yvFEDjCjEl>
  2008-06-27  6:31   ` Pascal Obry
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2008-06-27  1:08 UTC (permalink / raw)
  To: David Jeske; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> "David Jeske" <jeske@willowmail.com> writes:
> ...
>> (2) From the documentation diagrams, it looks like the rebased A' has
>> only (G) as a parent, not (A,G). If this is the case, why?  (i.e. not
>> connecting those nodes throws away useful information)
>
> You would rebase ONLY WHEN the project as a whole (either "other people
> in the project", or "yourself down the road one year from now") is
> interested mostly in the progress of 'master' D-E-F-G, and nobody cares
> whether you developed your A (or B or C) on top of E or G.  So the answer
> is a definite "no" --- the line you drew between A and A' is useless
> information.  Nobody cares that you did it first on top of E but then
> redid the patches based on G (because things changed between E and G).

The last sentence came out in a somewhat inappropriate way.

	In a situation where "rebase" (which is a way to help you pretend you
	did not start building on a stale codebase) is appropriate, nobody
	wants to know you did it first on E

is what I meant.  More importantly, _you_ do not want anybody to know.
That is the whole reason you would rebase.

With that clarification in mind, the explanation would flow more smoothly
to this part...

> If the fact that parallel developments have happened is important, instead
> of the second picture like you drew, you will just do the real merge
> naturally to create a merge "M" like the picture I drew above.

So you have a choice between merging and rebasing.  And your extra parents
go against the reason you chose rebasing in the first place.  That is
why we do not record the original parents anywhere.


* Re: is rebase the same as merging every commit?
       [not found]     ` <willow-jeske-01l79c1jFEDjCWw6-01l7@0yvFEDjCjEl>
  2008-06-27  6:24       ` David Jeske
@ 2008-06-27  6:24       ` David Jeske
  1 sibling, 0 replies; 16+ messages in thread
From: David Jeske @ 2008-06-27  6:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Thanks for the explanation.

However, when considering an SCM perspective, I don't understand why I have to
make a tradeoff between personal reproducibility (which I get from the original
changes), and upstream readability (which the community gets from my rebase).

I could get both of these if the rebase kept both the old and new.

Is there some reason that losing personal reproducibility, and personal/local
tracking back to those changes of A-B-C, is necessary as part of the process?

Further, the rebase machinery seems like it would be great for operations that
are even more 'dangerous', where I would really really want the history of the
transitions in case I realized a problem later.

Consider this set of commits on a personal branch

0 - feature a
1 - feature b
2 - bugfix a
3 - feature c / d
4 - bugfix b
5 - bugfix a2

From all I've read about rebase, bisect, and the big tree management, it seems
like the three steps are reorder, combine, rebase.  (In a more complicated
situation, I'd want to split a commit into pieces)

(1'')
0' - feature A
1' - bugfix a
2' - bugfix a2
(2'')
3' - feature b
4' - bugfix b
(3'')
5' - feature c (split)
(4'')
6' - feature d (split)

Frankly, I'm super impressed, because I can imagine how I might do this in git.
I'm guessing some of you are already doing this. But how do you do it? Can you
rebase a patch back into its own history? (such as bugfix a from 2, to 1')

I want to mess around and try this stuff out, but I'm scared of doing bad
things to the tree and them being unrecoverable because rebase tosses the old
stuff. I don't understand why I have to lose my original work and/or the
connection to my original work, in order to reorder/combine/split for public
consumption. What is the argument for that? (other than the fact that the
current DAG link propagation model would force others to get these changes if
they remained connected together. Something easily remedied by out-of-band
metadata, or different link types)

* Re: is rebase the same as merging every commit?
  2008-06-26 23:04 David Jeske
@ 2008-06-27  6:30 ` Matthieu Moy
       [not found]   ` <willow-jeske-01l78ZaJFEDjCYTA-01l7GOyLFEDjCV8E>
  2008-06-27  8:34   ` Petr Baudis
  0 siblings, 2 replies; 16+ messages in thread
From: Matthieu Moy @ 2008-06-27  6:30 UTC (permalink / raw)
  To: David Jeske; +Cc: git

"David Jeske" <jeske@willowmail.com> writes:

> Rebasing is described in the docs I've read as turning this: (sorry for the
> dots)
>
> ..........A---B---C topic
> ........./
> ....D---E---F---G master
>
> Into this:
>
> ...................A'--B'--C' topic
> ................../
> .....D---E---F---G master
>
> If I understand it right (and that's a BIG if), it's the same as doing a merge
> of C into G where every individual commit in the C-line is individually
> committed into the new C' line.
>
> ...........-------------A---B---C
> ........../            /   /   /
> ........./        /---A'--B'--C'  topic
> ......../        /
> ....D---E---F---G - master

I'd draw that the other way:

  ...........---------A---B---C
  ........../          \   \   \
  ........./        /---A'--B'--C'  topic
  ......../        /
  ....D---E---F---G - master

> (1) Is the above model a valid explanation?

Sounds correct to me.

> (2) From the documentation diagrams, it looks like the rebased A' has only (G)
> as a parent, not (A,G). If this is the case, why?

Well, one could imagine a "rebase keeping ancestry" command, which
would keep A and G (indeed, you can do that by hand with multiple
calls to "merge"). The advantage being that further merges involving
both A and A' have a better chance to succeed.

But the philosophy of "rebase" is different: the idea is that you usually
rebase your private branches before submission, and the guys you
submit to are interested in your changes (i.e. the patch series
diff(G,A'), diff(A',B'), ...), not the way you got this patch series.

So, discarding this ancestry information is a bit like discarding your
*~ files (or whatever backup files your editor might create) after
some time: it has been valuable information, but at some point, it
becomes noise you don't want to hear.

> (i.e. not connecting those nodes throws away useful information)

For the use-cases where this information is useful, "rebase" is not
for you. Indeed, in these cases, a plain "merge" is usually what you
want.

> (3) If it only has (G) as a parent, does the rebase explicitly remove the
> source A,B,C nodes from the repository?

Most commands, and this includes rebase, are "add-only". The objects
will remain unreferenced and will be pruned by the next git gc
--prune. Unreferenced objects do no harm, they just eat your disk
space.

Well, that was a first approximation. Indeed, the reflog still
references C, see "git reflog". For example, after the rebase, if you
realize that you actually didn't want this rebase, you can still
"git reset --hard HEAD@{1}" or something like that.
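
A toy Python sketch of the reachability argument above. It is not how git
stores anything: the `parents` table, the `reflog` list and the `move_topic`
helper are made up for illustration, and the whole rebase is collapsed into a
single reflog entry (a real rebase may record several, so the exact HEAD@{n}
index can differ).

```python
# Toy model only: commit names map to parent names; this is not git's storage.
parents = {
    "D": [], "E": ["D"], "F": ["E"], "G": ["F"],   # master
    "A": ["E"], "B": ["A"], "C": ["B"],             # topic before the rebase
    "A'": ["G"], "B'": ["A'"], "C'": ["B'"],        # topic after the rebase
}


def reachable(tips):
    """Return every commit reachable from the given tips by following parents."""
    seen = set()
    stack = list(tips)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(parents[name])
    return seen


refs = {"master": "G", "topic": "C"}
reflog = ["C"]  # newest first: reflog[0] plays the role of HEAD@{0}


def move_topic(new_tip):
    """Point 'topic' at new_tip and record the move in the toy reflog."""
    refs["topic"] = new_tip
    reflog.insert(0, new_tip)


move_topic("C'")  # the rebase, collapsed into a single reflog entry

by_refs = reachable(refs.values())
by_refs_and_reflog = reachable(list(refs.values()) + reflog)
print(sorted(set(parents) - by_refs))             # not reachable from any branch
print(sorted(set(parents) - by_refs_and_reflog))  # not reachable from reflog either

move_topic(reflog[1])  # plays the role of "git reset --hard HEAD@{1}" here
print(refs["topic"])
```

In this model A, B and C drop out of what the branches reach, but nothing is
lost as long as the reflog entry for C is still around.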

-- 
Matthieu


* Re: is rebase the same as merging every commit?
  2008-06-27  0:51 ` Junio C Hamano
  2008-06-27  1:08   ` Junio C Hamano
@ 2008-06-27  6:31   ` Pascal Obry
  1 sibling, 0 replies; 16+ messages in thread
From: Pascal Obry @ 2008-06-27  6:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Jeske, git

Junio C Hamano wrote:
> You would rebase ONLY WHEN the project as a whole (either "other people
> in the project", or "yourself down the road one year from now") is
> interested mostly in the progress of 'master' D-E-F-G, and nobody cares
> whether you developed your A (or B or C) on top of E or G.  So the answer

Or if you are using git-svn as nobody will ever see your local branches. 
So rebasing is just the right way to go when tracking a Subversion tree 
I would say.

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|              http://www.obry.net
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595


* Re: is rebase the same as merging every commit?
       [not found]   ` <willow-jeske-01l78ZaJFEDjCYTA-01l7GOyLFEDjCV8E>
@ 2008-06-27  6:46     ` David Jeske
  2008-06-27  6:46     ` David Jeske
  1 sibling, 0 replies; 16+ messages in thread
From: David Jeske @ 2008-06-27  6:46 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git

-- Matthieu Moy wrote:

> > (3) If it only has (G) as a parent, does the rebase explicitly remove the
> > source A,B,C nodes from the repository?
>
> Most commands, and this includes rebase, are
> "add-only". The objects will remain unreferenced and
> will be pruned by the next git gc --prune. Unreferenced
> objects do no harm, they just eat your disk space.

I see. So it would be reasonable for the documentation to be altered slightly
to show that the original nodes are still there, and that the primary
difference between merging those changes one-by-one and rebasing is that rebase
does not connect the new to the old. If you want to keep the old, you can toss
a branch name on it, and if not, it still lives until the gc timeout.

The current docs showing those nodes missing tell me that they disappear,
which is both scary, and apparently inaccurate.

* Re: is rebase the same as merging every commit?
  2008-06-27  6:24       ` David Jeske
@ 2008-06-27  7:34         ` Matthieu Moy
       [not found]           ` <willow-jeske-01l79c1jFEDjCWw6-01l7HsC6FEDjCV3k>
  0 siblings, 1 reply; 16+ messages in thread
From: Matthieu Moy @ 2008-06-27  7:34 UTC (permalink / raw)
  To: David Jeske; +Cc: Junio C Hamano, git

"David Jeske" <jeske@willowmail.com> writes:

> However, when considering an SCM perspective, I don't understand why I have to
> make a tradeoff between personal reproducibility (which I get from the original
> changes), and upstream readability (which the community gets from my rebase).

Well, look at the [PATCH] messages on this list, and how they evolve.
Patch series give a clean way to go from a point to another. That's
what you want to see in upstream history.

Then, patch series usually get reviewed, and the patches themselves
are modified. There's a kind of meta-history: the changes you make to
your own changes.

Suppose I send a patch containing

+	int * x = malloc(sizeof(char));

and someone notices how wrong it is. I send another patch with

+	int * x = malloc(sizeof(int));

The first version was basically a mistake, and if it hasn't been
released, no one wants to bother with it longer than the time to resend
the patch. No one wants to be hit by the bug while using bisect later
on the upstream repository. And no one wants to see both patches when
reviewing or "git blame"-ing.

Things you rebase in Git are just like things for which you don't make
intermediate commits in SVN.

> From all I've read about rebase, bisect, and the big tree management, it seems
> like the three steps are reorder, combine, rebase.  (In a more complicated
> situation, I'd want to split a commit into pieces)
>
> (1'')
> 0' - feature A
> 1' - bugfix a
> 2' - bugfix a2
> (2'')
> 3' - feature b
> 4' - bugfix b
> (3'')
> 5' - feature c (split)
> (4'')
> 6' - feature d (split)
>
> Frankly, I'm super impressed, because I can imagine how I might do
> this in git.

git rebase -i will help you to do that painlessly.
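
A toy Python sketch of the reorder-and-combine part of that plan. The file
names and patch contents are hypothetical, patches are whole-file overwrites
with no conflict handling, and `run_todo` only imitates the spirit of a
"pick"/"squash" todo list; splitting "feature c / d" is left out.

```python
# Hypothetical patches: each maps a file name to the content it writes.
series = [
    ("feature a",     {"a.c": "a1"}),
    ("feature b",     {"b.c": "b1"}),
    ("bugfix a",      {"a.c": "a2"}),
    ("feature c / d", {"c.c": "c1", "d.c": "d1"}),
    ("bugfix b",      {"b.c": "b2"}),
    ("bugfix a2",     {"a.c": "a3"}),
]
patches = dict(series)

# A todo list in the spirit of the one "git rebase -i" opens in an editor.
todo = [
    ("pick",   "feature a"),
    ("squash", "bugfix a"),
    ("squash", "bugfix a2"),
    ("pick",   "feature b"),
    ("squash", "bugfix b"),
    ("pick",   "feature c / d"),
]


def run_todo(todo, patches):
    """Return the rewritten series: 'squash' folds a patch into the previous one."""
    result = []
    for action, name in todo:
        if action == "pick":
            result.append((name, dict(patches[name])))
        elif action == "squash":
            prev_name, prev_patch = result[-1]
            prev_patch.update(patches[name])  # the later change wins per file
            result[-1] = (prev_name + " + " + name, prev_patch)
        else:
            raise ValueError("unknown action: " + action)
    return result


def final_tree(series):
    """Apply the patches in order to an empty tree."""
    tree = {}
    for _, patch in series:
        tree.update(patch)
    return tree


rewritten = run_todo(todo, patches)
for name, _ in rewritten:
    print(name)
print(final_tree(series) == final_tree(rewritten))
```

The six original commits become three, and because patches to the same file
keep their relative order, the final tree is unchanged in this model.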

> I want to mess around and try this stuff out, but I'm scared of doing bad
> things to the tree and them being unrecoverable

They won't. The reflog is still there. Try it, and cancel it if you
don't like it.

The huge difference between the reflog and the history is that the
reflog is local, it's your own mess, other people won't get disturbed
by how messy it can be.

> (other than the fact that the current dag
> link propagation model would force others to get these changes if
> they remained connected together. Something easily remedied by out
> of band metadata, or different link types)

No. One fundamental principle of Git is that objects are immutable. If
your objects have a link, of whatever kind, then the same object moved
to another repository has the same link.

But what's wrong with the reflog?

-- 
Matthieu


* Re: is rebase the same as merging every commit?
  2008-06-27  6:30 ` Matthieu Moy
       [not found]   ` <willow-jeske-01l78ZaJFEDjCYTA-01l7GOyLFEDjCV8E>
@ 2008-06-27  8:34   ` Petr Baudis
  1 sibling, 0 replies; 16+ messages in thread
From: Petr Baudis @ 2008-06-27  8:34 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: David Jeske, git

On Fri, Jun 27, 2008 at 08:30:56AM +0200, Matthieu Moy wrote:
> "David Jeske" <jeske@willowmail.com> writes:
> 
> > Rebasing is described in the docs I've read as turning this: (sorry for the
> > dots)
> >
> > ..........A---B---C topic
> > ........./
> > ....D---E---F---G master
> >
> > Into this:
> >
> > ...................A'--B'--C' topic
> > ................../
> > .....D---E---F---G master
> >
> > If I understand it right (and that's a BIG if), it's the same as doing a merge
> > of C into G where every individual commit in the C-line is individually
> > committed into the new C' line.
> >
> > ...........-------------A---B---C
> > ........../            /   /   /
> > ........./        /---A'--B'--C'  topic
> > ......../        /
> > ....D---E---F---G - master
> 
> I'd draw that the other way:
> 
>   ...........---------A---B---C
>   ........../          \   \   \
>   ........./        /---A'--B'--C'  topic
>   ......../        /
>   ....D---E---F---G - master
> 
> > (1) Is the above model a valid explanation?
> 
> Sounds correct to me.

I don't think you can call it correct since it assumes !(2) while (2)
holds. Drawing the diagram this way is misleading; merging commits
one-by-one implies preserving the merge information in the history
graph; nothing like that is done by rebase.

Rebase is more like _cherry-picking_ all the patches on your branch on
top of the upstream branch. You just essentially take each patch (commit
message + diff to parent) growing on top of upstream's E and recommit it
on top of G.
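
A toy Python sketch of this cherry-pick view, under loud assumptions: a commit
is just a name, a snapshot (file name -> content) and parents; diffs are
whole-file; there is no conflict handling. The `Commit` class and the `diff`,
`apply`, `child` and `rebase` helpers are invented here and are not git's
implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Tree = Dict[str, str]              # toy snapshot: file name -> content
Patch = Dict[str, Optional[str]]   # file name -> new content, None = deleted


@dataclass
class Commit:
    name: str
    tree: Tree
    parents: Tuple["Commit", ...] = ()


def diff(old: Tree, new: Tree) -> Patch:
    """Whole-file diff: list every file whose content differs."""
    return {p: new.get(p) for p in set(old) | set(new) if old.get(p) != new.get(p)}


def apply(tree: Tree, patch: Patch) -> Tree:
    """Apply a whole-file patch; no conflict detection in this toy."""
    result = dict(tree)
    for path, content in patch.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


def child(name: str, parent: Commit, patch: Patch) -> Commit:
    """Create a commit on top of `parent` that applies `patch`."""
    return Commit(name, apply(parent.tree, patch), (parent,))


def rebase(commits, onto: Commit) -> Commit:
    """Recommit each commit's diff-to-parent on top of `onto`, oldest first."""
    tip = onto
    for c in commits:
        patch = diff(c.parents[0].tree, c.tree)
        tip = Commit(c.name + "'", apply(tip.tree, patch), (tip,))
    return tip


d = Commit("D", {"README": "v1"})
e = child("E", d, {"main.c": "e"})
f = child("F", e, {"main.c": "f"})
g = child("G", f, {"util.c": "g"})
a = child("A", e, {"topic.c": "a"})
b = child("B", a, {"topic.c": "b"})
c = child("C", b, {"doc.txt": "c"})

c_new = rebase([a, b, c], onto=g)
a_new = c_new.parents[0].parents[0]
print(a_new.name, [p.name for p in a_new.parents])
print(sorted(c_new.tree.items()))
```

In this model A' ends up with G as its only parent, and the original A, B and C
objects are left untouched.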

> > (2) From the documentation diagrams, it looks like the rebased A' has only (G)
> > as a parent, not (A,G). If this is the case, why?
..snip..
> > (i.e. not connecting those nodes throws away useful information)
> 
> For the use-cases where this information is useful, "rebase" is not
> for you. Indeed, in these cases, a plain "merge" is usually what you
> want.

Indeed, no one forces you into the rebase workflow for your own projects.
I personally never ever rebase (I do use StGIT though, but it records
per-patch history and makes sure I'm always in some consistent state).

-- 
				Petr "Pasky" Baudis
The last good thing written in C++ was the Pachelbel Canon. -- J. Olson


* Re: is rebase the same as merging every commit?
  2008-06-26 23:04 is rebase the same as merging every commit? David Jeske
  2008-06-27  0:51 ` Junio C Hamano
@ 2008-06-27 10:33 ` しらいしななこ
  2008-06-27 21:51   ` Junio C Hamano
  1 sibling, 1 reply; 16+ messages in thread
From: しらいしななこ @ 2008-06-27 10:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Jeske, git

Quoting Junio C Hamano <gitster@pobox.com>:

> "David Jeske" <jeske@willowmail.com> writes:
>
>> If I understand it right (and that's a BIG if), it's the same as doing a merge
>> of C into G where every individual commit in the C-line is individually
>> committed into the new C' line.
>>
>> ...........-------------A---B---C
>> ........../            /   /   /
>> ........./        /---A'--B'--C'  topic
>> ......../        /
>> ....D---E---F---G - master
>>
>>
>> (1) Is the above model a valid explanation?
>
> I would presume that the resulting tree A' in the second picture and in
> the first picture would be the same, and likewise for B' and C'.  But that
> is only true when the commits between A and C do not duplicate any of the
> development that happened between E and G.

Sorry, but I think you are wrong, Junio.

Rebase can be used to backport changes, not just to port your changes forward, using the --onto option:

..........maint
............1-------A'--B'--C'   
.........../       .   .   . <-- ???
........../.......A---B---C
........./......./
......../......./
.......0--...--D---E---F---G - master

Here, A, B and C, which are based on D (that is way ahead of the top of the maintenance branch 1), are rebased onto the maintenance branch.

But in this case, A' is *not* a merge between 1 and A.  For A' to be a merge between 1 and A, it *must* contain all the development that happened up to 1 and all the development that happened up to A since these two branches were forked (that is 0 in the above picture).

Instead, the difference to go from 1 to A' is similar to the difference to go from D to A. It does not and must not include anything that happened between 0 and D.  That is not a merge.
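
A toy Python sketch of that distinction, assuming whole-file snapshots and no
conflicts. The file names and contents are hypothetical, and the "merge" here
is simply "apply everything A gained since the fork point 0 on top of 1".

```python
def diff(old, new):
    """Whole-file diff between two toy snapshots (file name -> content)."""
    return {p: new.get(p) for p in set(old) | set(new) if old.get(p) != new.get(p)}


def apply(tree, patch):
    """Apply a whole-file patch; None means the file is deleted."""
    result = dict(tree)
    for path, content in patch.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# Hypothetical snapshots for the commits named in the picture above.
zero = {"core.c": "v0"}                              # fork point 0
one = apply(zero, {"core.c": "v0 + maint fix"})      # maint tip 1
d = apply(zero, {"new_api.c": "master-only work"})   # master at D
a = apply(d, {"bug.c": "fixed"})                     # topic commit A, based on D

a_rebased = apply(one, diff(d, a))     # A': only the D..A change, replayed on 1
a_merged = apply(one, diff(zero, a))   # a real merge of 1 and A in this toy

print(diff(one, a_rebased) == diff(d, a))
print("new_api.c" in a_rebased, "new_api.c" in a_merged)
```

The rebased A' carries only the change between D and A, while the merge result
also drags in the master-only work between 0 and D.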

I agree that your explanation why A is not recorded as a parent of A' is right for the philosophical reason (the purpose of rebasing to create A' is so that you do not have to record them).  But from the point-of-view of correctness of commit history, I think A must not be recorded as a parent of A', either.

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/


* Re: is rebase the same as merging every commit?
       [not found]           ` <willow-jeske-01l79c1jFEDjCWw6-01l7HsC6FEDjCV3k>
  2008-06-27 15:39             ` David Jeske
@ 2008-06-27 15:39             ` David Jeske
  1 sibling, 0 replies; 16+ messages in thread
From: David Jeske @ 2008-06-27 15:39 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Junio C Hamano, git

This example you provided Matthieu is exactly my confusion with rebase..

If I want to bring a 'broken feature-a' branch into my topic branch to build on
it, one commit of which is this:

> +	int * x = malloc(sizeof(char));

if I merge, my tree looks like:

:         /<--G<--H<--Qj jeske/topic1
:        /           /
:       /<--P<------Q    feature-a
:      /
: -----A<---B<---C master

or if I rebase, it looks like:

:                  /<--G'<--H' jeske/topic1
:                 /
:       /<--P<---Q    feature-a
:      /
: -----A<---B<---C master

-----

..and then through 'fixing' the patch, it ends up rebased and accepted onto the
mainline origin/master, as a single patch, which (among other things) changed
this line to:

> +	int * x = malloc(sizeof(int));

...if I merged above, it will look like,

:         /<--G<--H<--Qj jeske/topic1
:        /           /
:       /<--P<------Q    feature-a
:      /
: -----A<---B<---C<---Q' master

...if I rebased above, it will look like:

:                  /<--G'<--H' jeske/topic1
:                 /
:       /<--P<---Q    feature-a
:      /
: -----A<---B<---C<---Q' master


However, in both cases, because Q' is not connected to Q, I don't see how git
will do anything sane to help me accept Q' correctly.

If I rebased my merge-q-branch against the master, I would expect to get this
(which will cause a conflict I have to resolve):

:           /<--G<--H<--Qj jeske/topic1
:          /
: <--C<---Q' master

If I rebased my rebase-q-branch against master, I would expect to get this
(which will cause a conflict I have to resolve):

:                     /<--G'<--H' jeske/topic1
:                    /
:          /<--P<---Q    feature-a
:         /
: --C<---Q' master

However, if that Q' rebase contained a link back to (P,Q), it would know that
the Q' rebase was replacing (P,Q), and would know to back them out of my tree
when I rebased back onto the head, producing this in BOTH cases above (whether
I rebased or merged from the feature-a branch):


:          /<--G'<--H' jeske/topic1
:         /
: --C<---Q' master

This operation above of "working while pulling uncompleted patches into my tree"
seems like a fairly common thing for developers. I've never provided any
patches to linux-kernel, but when I did try hacking on it years ago, I was
doing exactly this. (pulling unaccepted patches into my kernel, then building
on those patches). When I read about the DAG and its universal naming, I always
assumed that the above workflow was what it was DESIGNED to make automatic. I'm
confused, how does this work in git?



-- Matthieu Moy wrote:
> Well, look at the [PATCH] messages on this list, and how they evolve.
> Patch series give a clean way to go from a point to another. That's
> what you want to see in upstream history.
>
> Then, patch series usually get reviewed, and the patches themselves
> are modified. There's a kind of meta-history: the changes you make to
> your own changes.
>
> Suppose I send a patch containing
>
> +	int * x = malloc(sizeof(char));
>
> and someone notices how wrong it is. I send another patch with
>
> +	int * x = malloc(sizeof(int));
>
> The first version was basically a mistake, and if it hasn't been
> released, no one wants to bother with it longer than the time to resend
> the patch. No one wants to be hit by the bug while using bisect later
> on the upstream repository. And no one wants to see both patches when
> reviewing or "git blame"-ing.
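
As a side note on why the first version of that line is wrong: it allocates
room for one char but stores the result in a pointer to int. A quick way to see
the size mismatch, sketched here with Python's ctypes module (the variable
names are only illustrative, and the exact size of int depends on the
platform):

```python
import ctypes

# Size in bytes that the malloc() call in each of the two patches requests.
requested_by_first_patch = ctypes.sizeof(ctypes.c_char)   # sizeof(char)
requested_by_second_patch = ctypes.sizeof(ctypes.c_int)   # sizeof(int)

# Writing an int through `x` needs sizeof(int) bytes, so the first patch
# under-allocates whenever int is wider than char.
shortfall = requested_by_second_patch - requested_by_first_patch

print(requested_by_first_patch, requested_by_second_patch, shortfall)
```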


* Re: is rebase the same as merging every commit?
  2008-06-27 10:33 ` しらいしななこ
@ 2008-06-27 21:51   ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2008-06-27 21:51 UTC (permalink / raw)
  To: しらいしななこ; +Cc: David Jeske, git

しらいしななこ  <nanako3@lavabit.com> writes:

> Quoting Junio C Hamano <gitster@pobox.com>:
>
>> "David Jeske" <jeske@willowmail.com> writes:
>>
>>> If I understand it right (and that's a BIG if), it's the same as doing a merge
>>> of C into G where every individual commit in the C-line is individually
>>> committed into the new C' line.
>>>
>>> ...........-------------A---B---C
>>> ........../            /   /   /
>>> ........./        /---A'--B'--C'  topic
>>> ......../        /
>>> ....D---E---F---G - master
>>>
>>>
>>> (1) Is the above model a valid explanation?
>>
>> I would presume that the resulting trees of A' in the second picture and in
>> the first picture would be the same, and likewise for B' and C'.  But that
>> is only true when the commits between A and C do not duplicate any of the
>> development that happened between E and G.
>
> Sorry, but I think you are wrong, Junio.
> ...
> I agree that your explanation of why A is not recorded as a parent of A' is
> right on philosophical grounds (the purpose of rebasing to create A'
> is so that you do not have to record them).  But from the point of view
> of the correctness of the commit history, I think A must not be recorded
> as a parent of A' either.

All correct.  Sorry about the confusion.
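
To make the parent question concrete, here is a small sketch (a toy model, not
git code) of what recording A as a second parent of A' would change. It assumes
history is nothing more than a mapping from each commit to its parents;
`ancestors`, `rebased` and `linked` are hypothetical names. Everything
reachable through parent links counts as part of a branch's history, so the
extra links would pull the original A, B and C back into topic.

```python
def ancestors(tip, parents):
    """Return every commit reachable from `tip` by following parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


# Parent links after a plain rebase: A' has G as its only parent.
rebased = {
    "E": ["D"], "F": ["E"], "G": ["F"],          # master
    "A": ["E"], "B": ["A"], "C": ["B"],          # original topic commits
    "A'": ["G"], "B'": ["A'"], "C'": ["B'"],     # rebased topic
}

# Hypothetical variant from the question: each rebased commit also
# records the commit it was derived from as a second parent.
linked = {**rebased, "A'": ["G", "A"], "B'": ["A'", "B"], "C'": ["B'", "C"]}

plain_history = ancestors("C'", rebased)
linked_history = ancestors("C'", linked)

print(sorted(plain_history))
print(sorted(linked_history - plain_history))
```

With G as the only parent of A', the original commits are simply not part of
topic's history, which matches the diagrams in the documentation.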


Thread overview: 16+ messages
2008-06-26 23:04 is rebase the same as merging every commit? David Jeske
2008-06-27  0:51 ` Junio C Hamano
2008-06-27  1:08   ` Junio C Hamano
     [not found]     ` <willow-jeske-01l79c1jFEDjCWw6-01l7@0yvFEDjCjEl>
2008-06-27  6:24       ` David Jeske
2008-06-27  7:34         ` Matthieu Moy
     [not found]           ` <willow-jeske-01l79c1jFEDjCWw6-01l7HsC6FEDjCV3k>
2008-06-27 15:39             ` David Jeske
2008-06-27 15:39             ` David Jeske
2008-06-27  6:24       ` David Jeske
2008-06-27  6:31   ` Pascal Obry
2008-06-27 10:33 ` しらいしななこ
2008-06-27 21:51   ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2008-06-26 23:04 David Jeske
2008-06-27  6:30 ` Matthieu Moy
     [not found]   ` <willow-jeske-01l78ZaJFEDjCYTA-01l7GOyLFEDjCV8E>
2008-06-27  6:46     ` David Jeske
2008-06-27  6:46     ` David Jeske
2008-06-27  8:34   ` Petr Baudis
