* rebase-with-history -- a technique for rebasing without trashing your repo history
From: Michael Haggerty @ 2009-08-13 12:46 UTC
To: Bazaar, Git Mailing List, mercurial mailing list
Sorry to cross-post, but I think this might be interesting to all three
projects...
I've been thinking a lot about the problems of tracking upstream changes
while developing a feature branch. As I think everybody knows, both
rebasing and merging have serious disadvantages for this use case.
Rebasing discards history and makes it difficult to share
work-in-progress with others, whereas merging makes it difficult to
prepare a clean patch series that is suitable for submission upstream.
I've written some articles describing another possibility, which
combines the advantages of both methods. The key idea is to retain
rebase history correctly, on a patch-by-patch level. The resulting DAG
retains enough history to prevent problems with merge conflicts
downstream, while also allowing the patch series to be kept tidy.
(Please note that this technique only works for the typical "tracking
upstream" type of rebase; it doesn't help with rebases whose goals are
changing the order of commits, moving only part of a branch, rewriting
commits, etc.)
For more information, please see the full articles:
* A truce in the merge vs. rebase war? [1]
* Upstream rebase Just Works™ if history is retained [2]
* Rebase with history -- implementation ideas [3]
I'd appreciate feedback!
Michael
[1]
http://softwareswirl.blogspot.com/2009/04/truce-in-merge-vs-rebase-war.html
[2]
http://softwareswirl.blogspot.com/2009/08/upstream-rebase-just-works-if-history.html
[3]
http://softwareswirl.blogspot.com/2009/08/rebase-with-history-implementation.html

* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
From: Björn Steinbrink @ 2009-08-13 16:12 UTC
To: Michael Haggerty; +Cc: Bazaar, Git Mailing List, mercurial mailing list

On 2009.08.13 14:46:07 +0200, Michael Haggerty wrote:
> Sorry to cross-post, but I think this might be interesting to all three
> projects...
>
> I've been thinking a lot about the problems of tracking upstream changes
> while developing a feature branch. As I think everybody knows, both
> rebasing and merging have serious disadvantages for this use case.
> Rebasing discards history and makes it difficult to share
> work-in-progress with others, whereas merging makes it difficult to
> prepare a clean patch series that is suitable for submission upstream.
>
> I've written some articles describing another possibility, which
> combines the advantages of both methods. The key idea is to retain
> rebase history correctly, on a patch-by-patch level. The resulting DAG
> retains enough history to prevent problems with merge conflicts
> downstream, while also allowing the patch series to be kept tidy.
>
> (Please note that this technique only works for the typical "tracking
> upstream" type of rebase; it doesn't help with rebases whose goals are
> changing the order of commits, moving only part of a branch, rewriting
> commits, etc.)

Hm, so that pretty much doesn't work at all for creating a clean patch
series, which usually involves rewriting commits, squashing bug fixes
into the original commits that introduced the bug etc.

And even for just continuously forward porting a series of commits, a
common case might be that upstream applied some patches, but not all.
Can you deal with that?

Example:

A---B---C (upstream)
         \
          H---I---J---K (yours)

Upstream takes some changes:

A---B---C---I'--K'--D (upstream)
         \
          H---I---J---K (yours)

rebase leads to:

A---B---C---I'--K'--D (upstream)
                     \
                      H'--J' (yours)

What would your approach generate in that case?

> For more information, please see the full articles:
>
> * Upstream rebase Just Works™ if history is retained [2]

In this one you have two DAGs:
(I fixed the second one to also have the merge commit in "subsystem"
instead of "topic", so they only differ WRT to the rebased stuff)

A)
m---N---m---m---m---m---m---M (master)
     \                       \
      o---o---O---o---o       o'--o'--o'--o'--o'--S (subsystem)
                       \                         /
                        *---*---*-..........-*--T (topic)

B)
m---N---m---m---m---m---m---M (master)
     \                       \
      \                       o'--o'--o'--o'--o'----------S (subsystem)
       \                     /   /   /   /   /           /
        --------------------o---o---O---o---o---*---*---T (topic)

And you say that the former creates problems when you want to merge
again. How so?

Merging "master" to "subsystem" is a no-op in both cases.
Merging "subsystem" to "master" is a fast-forward in both cases.
Merging "subsystem" to "topic" is a fast-forward in both cases.
Merging "topic" to "subsystem" is a no-op in both cases.
Merging "topic" and "master" (in either direction) has merge base N in
both cases.

Let's assume that there's another dev, having his own history based on
the old "O" commit. So:

A)
m---N---m---m---m---m---m---M (master)
     \                       \
      o---o---O---o---o       o'--o'--o'--o'--o'--S (subsystem)
               \       \                         /
                \       *---*---*-..........-*--T (topic)
                 \
                  X---Y---Z (outsider)

B)
m---N---m---m---m---m---m---M (master)
     \                       \
      \                       o'--o'--o'--o'--o'----------S (subsystem)
       \                     /   /   /   /   /           /
        --------------------o---o---O---o---o---*---*---T (topic)
                                     \
                                      X---Y---Z (outsider)

Merging "master" and "outsider" has merge base N in both cases.
Merging "subsystem" and "outsider" has merge base O in both cases.
Merging "topic" and "outsider" has merge base O in both cases.

The only thing that really makes a difference is when you have another
dev having based his history upon one of the o' commits. If that history
is merged with "topic", then you get merge base N in A) and one of the
o's in B). But, in that case, merging "topic" is basically just a
complicated way of merging "subsystem". Both contain the full series of
"o" commits, and all the "topic" commits (due to the merge). So you
could just trivially merge "subsystem" instead, which leads to the o'
commit as the merge base in both cases.

And, that merge of "topic" to "subsystem" was wrong to begin with. If
you rewrite history, that has to trickle down. So A) should really have
been:

m---m---m---m-...--m---m (master)
                        \
                         o'--o'-...-o'--*'--*'--*' (topic) (subsystem)

(topic was rebased, and subsystem fast-forwarded)

So AFAICT, what your system achieves WRT ease of rebasing is that it
obsoletes the need to use "--onto" with git's rebase. Instead of "git
rebase --onto subsystem old_subsystem", you can just say "git rebase
subsystem", at the cost of a very complicated DAG.

Björn
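
The merge-base claims in the message above can be checked on a toy model. The following is a minimal sketch, not git's own algorithm: it stores each commit's parents in a dict, shortens master to four commits, and uses made-up commit names (m0, m2, o1, t1, D and so on) that do not appear in the thread. D stands for the hypothetical developer who based work on a rebased o' commit.

```python
def ancestors(parents, tip):
    """Return tip plus every commit reachable from it through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


def chain(parents, names, base):
    """Append a linear run of commits on top of base."""
    for name in names:
        parents[name] = [base]
        base = name


def build(with_history):
    """Toy version of DAG A (with_history=False) or DAG B (with_history=True)."""
    p = {"m0": []}
    chain(p, ["N", "m2", "M"], "m0")                   # master (shortened)
    chain(p, ["o1", "o2", "O", "o4", "o5"], "N")       # old subsystem
    chain(p, ["t1", "t2", "T"], "o5")                  # topic
    chain(p, ["o1'", "o2'", "O'", "o4'", "o5'"], "M")  # rebased subsystem
    if with_history:
        for old in ["o1", "o2", "O", "o4", "o5"]:
            p[old + "'"].append(old)                   # o' also records o as a parent
    p["S"] = ["o5'", "T"]                              # merge of topic into subsystem
    chain(p, ["Z"], "O")                               # outsider, based on old O
    chain(p, ["D"], "O'")                              # hypothetical dev based on rebased O'
    return p


for label, dag in (("A", build(False)), ("B", build(True))):
    print(label,
          "subsystem/outsider:", merge_bases(dag, "S", "Z"),
          "topic/o'-based dev:", merge_bases(dag, "T", "D"))
```

In this model the outsider case gives O in both DAGs, while the o'-based developer gets N in A) and an old "o" commit in B), which matches the reasoning in the message.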
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
From: Michael Haggerty @ 2009-08-13 22:39 UTC
To: Björn Steinbrink; +Cc: Git Mailing List

Björn Steinbrink wrote:
> On 2009.08.13 14:46:07 +0200, Michael Haggerty wrote:
>> (Please note that this technique only works for the typical "tracking
>> upstream" type of rebase; it doesn't help with rebases whose goals are
>> changing the order of commits, moving only part of a branch, rewriting
>> commits, etc.)
>
> Hm, so that pretty much doesn't work at all for creating a clean patch
> series, which usually involves rewriting commits, squashing bug fixes
> into the original commits that introduced the bug etc.

Now that you mention it, there are some other uses of rebase whose
history could be recorded correctly, or at least better, in the DAG. I
am not ready to advocate any of these changes, but I think they are
worth discussing.

A squash of two adjacent commits currently transforms this:

A---B1---B2---C

to this

A---B12---C'

(C' has the same contents and commit message as C but a different
history and therefore a different SHA1.) But B12 includes both B1 and
B2, so the correct history is this:

A---B1---B2----C
 \         \    \
  ---------B12---C'

The fact that B12 has B1 and B2 as ancestors tells git that it
incorporates both of their changes, which correctly describes reality.

Splitting a commit (not really an elementary rebase operation, but
achievable with "edit") transforms this:

A---B12---C

to this:

A---B1---B2---C'

It is not possible to represent this correctly in the DAG, because there
is no way to express that "B1" includes part of "B12" as a parent. But
the following would be more accurate than discarding all history:

A---B12---C
 \     \   \
  B1---B2---C'

It would of course be difficult for the user-interface layer to be
confident that the changes in B1 and B2 are really equivalent to B12
unless the contents of B12 and B2 are identical.

Inserting a new commit into the history (for example as part of
reordering later commits) transforms this:

A---C

to this:

A---B---C'

In this case C' includes everything that is in C, so the correct history is

A---C
 \   \
  B---C'

However, deleting a commit, which transforms this:

A---B---C

to this:

A---C'

cannot be represented in the DAG, because there is no way to express
that C' includes the changes from C without also implying that it
includes the changes from B.

Rewriting a single commit, under the assumption that the rewritten
commit is the logical equivalent of the original, transforms this:

A---B1---C

to this:

A---B2---C'

where B2 is a hand-rewritten version of B1, and C' is the version of C
produced by the rebase. In this case, the history could be recorded as:

A---B1---C
 \    \   \
  ----B2---C'

but again, it is impossible for the user-interface layer to ascertain
that B2 is equivalent to B1 without help from the user.

All of this extra history would currently create far more clutter than
it is worth, but if there were a way to suppress the display of rebased
commits (as discussed in the third article I quoted), then the extra
information would be there to help git without overwhelming users.

> And even for just continuously forward porting a series of commits, a
> common case might be that upstream applied some patches, but not all.
> Can you deal with that?
>
> Example:
>
> A---B---C (upstream)
>          \
>           H---I---J---K (yours)
>
> Upstream takes some changes:
>
> A---B---C---I'--K'--D (upstream)
>          \
>           H---I---J---K (yours)
>
> rebase leads to:
>
> A---B---C---I'--K'--D (upstream)
>                      \
>                       H'--J' (yours)
>
> What would your approach generate in that case?

There *is no way* to represent this history in a DAG, and therefore the
history of this operation will necessarily be lost. (Well, of course it
could be recorded in metadata supplemental to the DAG, but since the
history would not affect future merges it would be pointless.) The
problem is that there is no way to claim that I' is derived from I
without also implying that I' includes the change in H (which it
doesn't). I discuss this sort of thing in another article [1].

>> For more information, please see the full articles: [...]
>
> In this one you have two DAGs:
> (I fixed the second one to also have the merge commit in "subsystem"
> instead of "topic", so they only differ WRT to the rebased stuff)
>
> A)
> m---N---m---m---m---m---m---M (master)
>      \                       \
>       o---o---O---o---o       o'--o'--o'--o'--o'--S (subsystem)
>                        \                         /
>                         *---*---*-..........-*--T (topic)
>
>
> B)
> m---N---m---m---m---m---m---M (master)
>      \                       \
>       \                       o'--o'--o'--o'--o'----------S (subsystem)
>        \                     /   /   /   /   /           /
>         --------------------o---o---O---o---o---*---*---T (topic)
>
>
> And you say that the former creates problems when you want to merge
> again. How so?

As you very clearly showed (thanks!), the merge problems that I claimed
only occur in some obscure edge cases.

What I *should* have emphasized is that the merge S itself is much more
prone to conflicts in case A) (with merge base N) than in case B) (with
the last "o" as merge base). That is the first advantage of
rebase-with-history.

And please note that I really advocate C), not B):

C)
m---N---m---m---m---m---m---M master
     \                       \
      \                       o'--o'--o'--o'--o' subsystem
       \                     /   /   /   /   / \
        --------------------o---o---O---o---o   \
                                             \   \
                                              \   *'--*'--*' topic
                                               \ /   /   /
                                                *---*---*

where the topic branch is not merged into the subsystem branch but
rather rebased-with-history. C) has the significant advantage over A)
or B) that the topic branch can be converted to a series of patches (the
*' patches) that apply cleanly to the rebased subsystem branch and can
therefore be submitted upstream. In the case of A) or B), the only
available patch that applies cleanly to the rebased subsystem branch is
S, which is a single commit that squashes together the entire topic
branch and is therefore difficult to review.

So rebasing in a public repository makes it difficult for downstream
developers to apply their work to the rebased branch (because they have
to repeat the conflict resolution that was done in the upstream rebase),
and merging in a topic branch makes it more difficult to create an
easily-reviewable patch series. rebase-with-history has neither of
these problems.

Michael

[1] "Git, Mercurial, and Bazaar—simplicity through inflexibility",
http://softwareswirl.blogspot.com/2009/08/git-mercurial-and-bazaarsimplicity.html
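
The rule used throughout the message above is that a parent link asserts "this commit includes everything its parent includes". A minimal sketch of that rule follows, under a simplifying assumption that is not from the thread: each commit is modelled as the set of logical changes it contains, with invented change labels such as "b1" or "h". Real commits are snapshots, so this only illustrates which extra parent links would be truthful.

```python
def may_record_parent(contents, child, parent):
    """A parent link is truthful only if the child contains every change of the parent."""
    return contents[parent] <= contents[child]


# Squash: A---B1---B2---C becomes A---B12---C'
SQUASH = {
    "B2": {"a", "b1", "b2"}, "C": {"a", "b1", "b2", "c"},
    "B12": {"a", "b1", "b2"}, "C'": {"a", "b1", "b2", "c"},
}

# Split: A---B12---C becomes A---B1---B2---C'
SPLIT = {"B12": {"a", "b1", "b2"}, "B1": {"a", "b1"}, "B2": {"a", "b1", "b2"}}

# Delete: A---B---C becomes A---C'
DELETE = {"C": {"a", "b", "c"}, "C'": {"a", "c"}}

# Upstream applies I without H, assuming "yours" sits on top of A, B and C.
PICK = {"I": {"a", "b", "c", "h", "i"}, "I'": {"a", "b", "c", "i"}}

print("squash, B12 <- B2:", may_record_parent(SQUASH, "B12", "B2"))
print("squash, C' <- C:  ", may_record_parent(SQUASH, "C'", "C"))
print("split,  B1 <- B12:", may_record_parent(SPLIT, "B1", "B12"))
print("split,  B2 <- B12:", may_record_parent(SPLIT, "B2", "B12"))
print("delete, C' <- C:  ", may_record_parent(DELETE, "C'", "C"))
print("pick,   I' <- I:  ", may_record_parent(PICK, "I'", "I"))
```

The links that come out as not recordable (B1 from B12, C' from C after a delete, I' from I) are the same ones the message describes as impossible to express in the DAG.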
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
From: Björn Steinbrink @ 2009-08-13 23:30 UTC
To: Michael Haggerty; +Cc: Git Mailing List

On 2009.08.14 00:39:48 +0200, Michael Haggerty wrote:
> Björn Steinbrink wrote:
> > On 2009.08.13 14:46:07 +0200, Michael Haggerty wrote:
> > And even for just continuously forward porting a series of commits, a
> > common case might be that upstream applied some patches, but not all.
> > Can you deal with that?
> >
> > Example:
> >
> > A---B---C (upstream)
> >          \
> >           H---I---J---K (yours)
> >
> > Upstream takes some changes:
> >
> > A---B---C---I'--K'--D (upstream)
> >          \
> >           H---I---J---K (yours)
> >
> > rebase leads to:
> >
> > A---B---C---I'--K'--D (upstream)
> >                      \
> >                       H'--J' (yours)
> >
> > What would your approach generate in that case?
>
> There *is no way* to represent this history in a DAG, and therefore the
> history of this operation will necessarily be lost. (Well, of course it
> could be recorded in metadata supplemental to the DAG, but since the
> history would not affect future merges it would be pointless.) The
> problem is that there is no way to claim that I' is derived from I
> without also implying that I' includes the change in H (which it
> doesn't). I discuss this sort of thing in another article [1].

Well, I' isn't even interesting here. I' is _upstream's_ commit, likely
created from an email. Upstream didn't have "I" when I' was created to
begin with. The interesting commits are H' and J', which were actually
rebased. And the situation is even "worse" than for I' or K'. H' does
include H, I, and K, but not J, and J' contains H, I, J and K. So
possibly you'd have:

A---B---C---I'--K'--D (upstream)
         \           \
          \           --------H'--J'
           \                 /   /
            H---I---J---K----

Which only ignores that K is "contained" in H'. But you only get to know
that I is in H' (and that K is in J') _after_ H' (or J') have been
created. So you'd need a preprocessing run. The non-preprocessing result
would be something like:

A---B---C---I'--K'--D (upstream)
         \           \
          \  -------------H'--J'
           \/               /
            H---I---J--------
                     \
                      K

But that's obviously total crap.

> >> For more information, please see the full articles: [...]
> >
> > In this one you have two DAGs:
> > (I fixed the second one to also have the merge commit in "subsystem"
> > instead of "topic", so they only differ WRT to the rebased stuff)
> >
> > A)
> > m---N---m---m---m---m---m---M (master)
> >      \                       \
> >       o---o---O---o---o       o'--o'--o'--o'--o'--S (subsystem)
> >                        \                         /
> >                         *---*---*-..........-*--T (topic)
> >
> >
> > B)
> > m---N---m---m---m---m---m---M (master)
> >      \                       \
> >       \                       o'--o'--o'--o'--o'----------S (subsystem)
> >        \                     /   /   /   /   /           /
> >         --------------------o---o---O---o---o---*---*---T (topic)
> >
> >
> > And you say that the former creates problems when you want to merge
> > again. How so?
>
> As you very clearly showed (thanks!), the merge problems that I claimed
> only occur in some obscure edge cases.
>
> What I *should* have emphasized is that the merge S itself is much more
> prone to conflicts in case A) (with merge base N) than in case B) (with
> the last "o" as merge base). That is the first advantage of
> rebase-with-history.

Yeah, but as I said, actually, topic should have been rebased, not
merged, and then that ends up the same, as you'd do:

git rebase --onto subsystem $last_o topic

Taking the last o commit as upstream.

> And please note that I really advocate C), not B):
>
> C)
> m---N---m---m---m---m---m---M master
>      \                       \
>       \                       o'--o'--o'--o'--o' subsystem
>        \                     /   /   /   /   / \
>         --------------------o---o---O---o---o   \
>                                              \   \
>                                               \   *'--*'--*' topic
>                                                \ /   /   /
>                                                 *---*---*
>

Fine, but this should then be compared to the result from the above
rebase command, which is:

D)
m--...--m (master)
         \
          o'-...-o' (subsystem)
                   \
                    *'-...-*' (topic)

> where the topic branch is not merged into the subsystem branch but
> rather rebased-with-history. C) has the significant advantage over A)
> or B) that the topic branch can be converted to a series of patches (the
> *' patches) that apply cleanly to the rebased subsystem branch and can
> therefore be submitted upstream. In the case of A) or B), the only
> available patch that applies cleanly to the rebased subsystem branch is
> S, which is a single commit that squashes together the entire topic
> branch and is therefore difficult to review.

Same for D)

> So rebasing in a public repository makes it difficult for downstream
> developers to apply their work to the rebased branch (because they have
> to repeat the conflict resolution that was done in the upstream rebase),

There's no difference between C) and D) there, except for the fact that
D) requires you to use --onto, because that needs to differ from
<upstream>.

Let's take the pre-topic-rebase history:

m---m---m (master)
 \       \
  \       o'--o'--O' (subsystem)
   \
    o---o---O---*---*---* (topic)

Doing a plain "git rebase subsystem topic" would of course also try to
rebase the "o" commits, so that's problematic. Instead, you do:

git rebase --onto subsystem O topic

That turns O..topic (the * commits) into patches, and applies them on
top of O'. So the "o" commits aren't to be rebased.

And that's exactly what your rebase-with-history would do as well. Just
that O is naturally a common ancestor of subsystem and topic, and so
just using "git rebase-w-h subsystem topic" would be enough. Conflicts
etc. should be 100% the same.

If you know that your upstream is going to rebase/rewrite history, you
can tag (or otherwise mark) the current branching point of your branch,
so you can easily specify it for the --onto rebase. IOW: This is
primarily a social problem (tell your downstream that you rebase this or
that branch), but having built-in support to store the branching point
for rebasing _might_ be worth a thought.

> and merging in a topic branch makes it more difficult to create an
> easily-reviewable patch series. rebase-with-history has neither of
> these problems.

Sure, merging is a no-go if you submit patches by email (or other,
similar means). But you compared that to an "enhanced" rebase approach,
instead of comparing your rebase approach to the currently available
one.

So, as I see it, your approach does:
 * Save the need to use --onto, allowing to just specify <upstream>, as
   if <upstream> was not rewritten.
 * Allows to keep older versions of commits more easily accessible for
   inspection, e.g. creating interdiffs.

The latter is (to me) of limited use, and the former could be done by
tracking the branching point, not sure how well it would work out, but
maybe worth investigating.

Björn
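
The point about which commits each rebase variant replays can be illustrated with plain reachability. This is a sketch under stated assumptions, not a description of git internals: the set of commits to replay is taken to be "reachable from the branch but not from <upstream>", the commit names (m1, o1, t1 and so on) are invented for the pre-topic-rebase history in the message above, and the history-retaining variant is modelled by giving each o' its old counterpart as a second parent.

```python
def ancestors(parents, tip):
    """Return tip plus every commit reachable from it through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def commits_to_replay(parents, upstream, branch):
    """Commits in upstream..branch, i.e. what a rebase would turn into patches."""
    return ancestors(parents, branch) - ancestors(parents, upstream)


def chain(parents, names, base):
    """Append a linear run of commits on top of base."""
    for name in names:
        parents[name] = [base]
        base = name


def build(with_history):
    p = {"m1": []}
    chain(p, ["m2", "m3"], "m1")              # master
    chain(p, ["o1", "o2", "O"], "m1")         # old subsystem
    chain(p, ["t1", "t2", "t3"], "O")         # topic (the * commits)
    chain(p, ["o1'", "o2'", "O'"], "m3")      # rebased subsystem
    if with_history:
        for old in ["o1", "o2", "O"]:
            p[old + "'"].append(old)          # rebase-with-history parent link
    return p


plain, hist = build(False), build(True)
print("plain rebase, upstream = subsystem:", sorted(commits_to_replay(plain, "O'", "t3")))
print("--onto, upstream = old O:          ", sorted(commits_to_replay(plain, "O", "t3")))
print("with history, upstream = subsystem:", sorted(commits_to_replay(hist, "O'", "t3")))
```

In the plain DAG, naming the rebased subsystem as upstream drags the old "o" commits into the range, while naming the old O (the --onto form) or retaining history leaves only the topic commits.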
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
From: Michael Haggerty @ 2009-08-14 21:21 UTC
To: Björn Steinbrink; +Cc: Git Mailing List

Björn Steinbrink wrote:
> On 2009.08.14 00:39:48 +0200, Michael Haggerty wrote:
>> Björn Steinbrink wrote:
>>> On 2009.08.13 14:46:07 +0200, Michael Haggerty wrote:
>>> And even for just continuously forward porting a series of commits, a
>>> common case might be that upstream applied some patches, but not all.
>>> Can you deal with that?

[A discussion of various unsatisfactory approaches omitted...]

> But that's obviously total crap.

So I think we agree that it is not possible to retain history for a case
like this (which is essentially a general cherry-pick).

> [...]
> Doing a plain "git rebase subsystem topic" would of course also try to
> rebase the "o" commits, so that's problematic. Instead, you do:
>
> git rebase --onto subsystem O topic
>
> That turns O..topic (the * commits) into patches, and applies them on
> top of O'. So the "o" commits aren't to be rebased.
>
> And that's exactly what your rebase-with-history would do as well. Just
> that O is naturally a common ancestor of subsystem and topic, and so
> just using "git rebase-w-h subsystem topic" would be enough. Conflicts
> etc. should be 100% the same.
>
> If you know that your upstream is going to rebase/rewrite history, you
> can tag (or otherwise mark) the current branching point of your branch,
> so you can easily specify it for the --onto rebase. IOW: This is
> primarily a social problem (tell your downstream that you rebase this or
> that branch), but having built-in support to store the branching point
> for rebasing _might_ be worth a thought.

Recording branch points manually, coordinating merges via email -- OMG
you are giving me flashbacks of CVS ;-)

*Of course* you can get around all of these problems if you put the
burden of bookkeeping on the user. The whole point of
rebase-with-history is to have the VCS handle it automatically!

>> and merging in a topic branch makes it more difficult to create an
>> easily-reviewable patch series. rebase-with-history has neither of
>> these problems.
>
> Sure, merging is a no-go if you submit patches by email (or other,
> similar means). But you compared that to an "enhanced" rebase approach,
> instead of comparing your rebase approach to the currently available
> one.

In [1] I compared rebase-with-history with both of the
currently-available options (rebase and merge). Rebase and merge can
each deal with some of the issues that come up, but each one falls flat
on others. I believe that rebase-with-history has the advantages of both.

The example in [2] was taken straight from the git-rebase man page [3];
I did not want to claim that current practice would use merging in this
situation, but rather just to show that rebase-with-history removes the
pain from this well-known example.

I think we are mostly in agreement. Rebase-with-history is obviously
not an earth-shattering revolution in DVCS technology, but my hope is
that it could unobtrusively assist with a few minor pain points.

Michael

[1]
http://softwareswirl.blogspot.com/2009/04/truce-in-merge-vs-rebase-war.html
[2]
http://softwareswirl.blogspot.com/2009/08/upstream-rebase-just-works-if-history.html
[3] http://www.kernel.org/pub/software/scm/git/docs/git-rebase.html
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
From: Nanako Shiraishi @ 2009-08-14 21:40 UTC
To: Michael Haggerty; +Cc: Björn Steinbrink, Git Mailing List

Quoting Michael Haggerty <mhagger@alum.mit.edu>

> In [1] I compared rebase-with-history with both of the
> currently-available options (rebase and merge). Rebase and merge can
> each deal with some of the issues that come up, but each one falls flat
> on others. I believe that rebase-with-history has the advantages of both.
> ....  Rebase-with-history is obviously
> not an earth-shattering revolution in DVCS technology, but my hope is
> that it could unobtrusively assist with a few minor pain points.

The saddest part is that your [1] works only in a case a user can easily
handle manually, and doesn't help cases more complex than the most
trivial ones, such as reordering and squashing commits, where the user
may benefit if an automated support from VCS were available.

--
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
From: Björn Steinbrink @ 2009-08-15 3:36 UTC
To: Michael Haggerty; +Cc: Git Mailing List

On 2009.08.14 23:21:01 +0200, Michael Haggerty wrote:
> Björn Steinbrink wrote:
> > On 2009.08.14 00:39:48 +0200, Michael Haggerty wrote:
> >> Björn Steinbrink wrote:
> >>> On 2009.08.13 14:46:07 +0200, Michael Haggerty wrote:
> > [...]
> > Doing a plain "git rebase subsystem topic" would of course also try to
> > rebase the "o" commits, so that's problematic. Instead, you do:
> >
> > git rebase --onto subsystem O topic
> >
> > That turns O..topic (the * commits) into patches, and applies them on
> > top of O'. So the "o" commits aren't to be rebased.
> >
> > And that's exactly what your rebase-with-history would do as well. Just
> > that O is naturally a common ancestor of subsystem and topic, and so
> > just using "git rebase-w-h subsystem topic" would be enough. Conflicts
> > etc. should be 100% the same.
> >
> > If you know that your upstream is going to rebase/rewrite history, you
> > can tag (or otherwise mark) the current branching point of your branch,
> > so you can easily specify it for the --onto rebase. IOW: This is
> > primarily a social problem (tell your downstream that you rebase this or
> > that branch), but having built-in support to store the branching point
> > for rebasing _might_ be worth a thought.
>
> Recording branch points manually, coordinating merges via email -- OMG
> you are giving me flashbacks of CVS ;-)

Not merging, but rewriting history. One of the primary purposes of
rebasing is to forget the old history, the new version overrides it. And
telling someone to forget something is a social problem. You can help
the user to forget the history by tracking the branching points and I
said that git could maybe learn to do that, so the user doesn't have to
do so.

Quick idea:

On branch creation, create refs/bases/<branchname> (let's call that
<base>) referencing the commit the branch initially references.

On rebase, check if <branchname>..<onto> is not empty. If so, update
refs/bases/<branchname> to reference <base>.

On reset, check if the commit the branch head is being reset to is
reachable through the commit the branch head currently references. If
not, update <base> to reference the commit we're resetting to.

Find some sane syntax for rebase that implicitly uses <base> as the
<upstream> argument, e.g. just "git rebase --onto <whatever>" could work
as "git rebase --onto <whatever> <base>".

Most likely, I missed a bunch of corner cases though...

> *Of course* you can get around all of these problems if you put the
> burden of bookkeeping on the user. The whole point of
> rebase-with-history is to have the VCS handle it automatically!

What your approach does, is simply moving the "just forget the history"
part. Instead of forgetting it at rebase time, you have to forget it
when you want to submit patches. It's obviously a bit easier though, as
you can just say "--first-parent <upstream>", assuming that you teach
format-patch to use a special first-parent diff mode for the merge
commits (see below).

> >> and merging in a topic branch makes it more difficult to create an
> >> easily-reviewable patch series. rebase-with-history has neither of
> >> these problems.
> >
> > Sure, merging is a no-go if you submit patches by email (or other,
> > similar means). But you compared that to an "enhanced" rebase approach,
> > instead of comparing your rebase approach to the currently available
> > one.
>
> In [1] I compared rebase-with-history with both of the
> currently-available options (rebase and merge). Rebase and merge can
> each deal with some of the issues that come up, but each one falls flat
> on others. I believe that rebase-with-history has the advantages of both.

And some disadvantages.

1) Cluttered history, which needs to be rewritten again when the emailed
patches are just for review, but the maintainer will actually merge from
you later. Taking the old master, subsystem, topic example, you get (for
example):

          o2--o2 (subsystem)
         /     \
m---m---m---m---m (master)
 \   \
  \   o'--o'
   \ /   / \
    o---o   *'--*' (topic)
         \ /   /
          *---*

Now the user that maintains "topic" is back at the hard case. He now
needs to rebase onto master, using the last o' as <upstream>. The DAG
doesn't help here, the base-tracking would handle that.

2) Merge commits, which are usually displayed in a special format. So
for "git show" or "git log -p" to give useful output for those special
merges, you'd have to introduce a new "diff only against first-parent"
mode, and mark those merges in a special way, so that diff mode is used
for them, but not for real merges. And users of old git versions would
have to deal with the basic -m merge diff mode, ignoring the useless
diff for the second parent and the fact that the real merges also get
shown in that format. The base tracking doesn't have this problem
either.

> The example in [2] was taken straight from the git-rebase man page [3];
> I did not want to claim that current practice would use merging in this
> situation, but rather just to show that rebase-with-history removes the
> pain from this well-known example.

Well, the man page says: Don't merge, rebase needs to trickle down, but
you'll likely need to use "git rebase --onto subsystem subsystem@{1}".
So the rebase-with-history really just saves that "use --onto and the
right <upstream>" from the hard case. The plain base-tracking does the
same.

Another way to reach the same goal would be just to explicitly override
the old history.

m---m---m (master)
 \
  o---o (subsystem)
       \
        *---* (topic)

(Hypothetical):
git rebase --override master subsystem

Leads to:

m---m---m---- (master)
 \       \
  o---o---O---o'--o' (subsystem)
       \
        *---* (topic)

Where O is an --ours merge, that just marks the old o commits as merged,
but has the same tree as the last m commit.

Now topic can be rebased using:
git rebase --override subsystem topic

m---m---m---- (master)
 \       \
  o---o---O---o'--o' (subsystem)
       \           \
        *---*-------X---*'--*' (topic)

Again, X being an ours merge. As the O and X commits have the last o and
* commits as their second parents, this even doesn't break things like
"git show" and "git log -p", as the interesting commits aren't merge
commits. So "git log -p --first-parent subsystem..topic" would do the
right thing (optionally with --no-merges to avoid the merge commit, but
seeing that doesn't hurt that much I guess).

This also trivially supports the reorder, squash, edit whatever stuff,
as it doesn't rely on 1:1 commit counterparts to exist. But it also
falls flat on its face as soon as subsystem gets "really" rewritten, so
that the old history is no longer reachable from the new history.

Björn
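
The "--override" idea above is explicitly hypothetical in the message, so the following is only a toy model of the resulting DAG, with invented commit names (m1, o1, t1 and so on) around the O and X merges named in the message. It assumes a first-parent walk that stops at anything reachable from the upstream branch, which is how the sketch interprets "git log --first-parent subsystem..topic"; that interpretation is an assumption of the sketch.

```python
def ancestors(parents, tip):
    """Return tip plus every commit reachable from it through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def first_parent_range(parents, upstream, tip):
    """Follow first parents from tip until hitting a commit reachable from upstream."""
    excluded = ancestors(parents, upstream)
    found, commit = [], tip
    while commit is not None and commit not in excluded:
        found.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return found


# DAG after both hypothetical "--override" rebases.
parents = {
    "m1": [], "m2": ["m1"], "m3": ["m2"],    # master
    "o1": ["m1"], "o2": ["o1"],              # old subsystem
    "t1": ["o2"], "t2": ["t1"],              # old topic (the * commits)
    "O": ["m3", "o2"],                       # ours merge: marks old o commits as merged
    "o1'": ["O"], "o2'": ["o1'"],            # rewritten subsystem
    "X": ["o2'", "t2"],                      # ours merge: marks old * commits as merged
    "t1'": ["X"], "t2'": ["t1'"],            # rewritten topic
}

# Before topic is rebased, subsystem..topic already excludes the old o commits.
print(sorted(ancestors(parents, "t2") - ancestors(parents, "o2'")))

# After the rebase, a first-parent walk sees only X and the rewritten commits.
walk = first_parent_range(parents, "o2'", "t2'")
print(walk)
print([c for c in walk if len(parents[c]) < 2])   # same walk with merges dropped
```

Because O keeps the old "o" commits reachable from subsystem, the range contains only topic's own commits in this model, without naming the old branch point.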
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
2009-08-13 22:39 ` Michael Haggerty
2009-08-13 23:30 ` Björn Steinbrink
@ 2009-08-14 3:17 ` Sitaram Chamarty
1 sibling, 0 replies; 10+ messages in thread
From: Sitaram Chamarty @ 2009-08-14 3:17 UTC (permalink / raw)
To: Michael Haggerty; +Cc: Björn Steinbrink, Git Mailing List

Hi,

I'm one of those wannabe experts who thinks he knows enough about git to
teach people in his workplace but obviously pales in this group, but with
that caveat, let me say:

2009/8/14 Michael Haggerty <mhagger@alum.mit.edu>:

> Now that you mention it, there are some other uses of rebase whose
> history could be recorded correctly, or at least better, in the DAG. I
> am not ready to advocate any of these changes, but I think they are

I see you've made your own caveat :-)

> worth discussing.

[snip]

> A---B1---B2----C
>  \        \     \
>   ---------B12---C'

> A---B12---C
>  \    \    \
>   B1---B2---C'

[snip]

> A---C
>  \   \
>   B---C'

[etc etc... many such snipped]

To me, the ability to *forget* the mistakes I made (for whatever
definition of "mistake" you wish) as long as it's private to my repo, is
one of the main attractions of git.

I'm one of those guys who saves early, and saves often, when editing
files. This translates to commit early, commit often, in the git world. I
see no earthly reason why I would ever *want* those commits preserved, so
I hope that, if this sort of thing ever gets into the code, it is
definitely *not* the default :-)

It is not sufficient for me that the GUI knows how to suppress their
display, it is necessary that they *disappear completely*.

And that reminds me. You often hear people on #git ask how to get rid of
some files (maybe containing passwords etc) that inadvertently got into
the repo, and the answer, a lot of the time, is filter-branch, because
the "bad" commit is pretty old.
I suspect that for every person who asks that question on the list
because he already pushed, there are 4 who discovered such an error much
earlier (when the file went into only a couple of commits at the top
maybe), did a rebase -i with "edit" or whatever, and got rid of the
evidence, err I mean password :-)

If this sort of thing were to be the default, they'd have to use a
filter-branch even for such simple cases.

Finally, speaking as someone who teaches git, this adds enormous
complexity to the basic concepts. Complexity is good when the benefits
are obvious, but to me they are not obvious [see *my* caveat at the top
before you react to this statement]
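
For illustration, the "rebase -i with edit" cleanup mentioned above might look like the following minimal sketch. The repository contents, the file names (app.txt, config.txt, secret.txt) and the sed one-liner used to drive the todo list non-interactively are assumptions for this example; in practice one would simply change "pick" to "edit" in the editor.

```shell
#!/bin/sh
# Sketch: drop an accidentally committed file from the last two
# unpushed commits using "git rebase -i" with an "edit" stop.
set -e
GIT_EDITOR=true; export GIT_EDITOR

repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Add app"

echo "host=localhost" > config.txt
echo "hunter2" > secret.txt
git add config.txt secret.txt
git commit -q -m "Add config (accidentally includes secret.txt)"

echo "v2" >> app.txt
git commit -q -am "Extend app"

# Turn the first todo entry (the commit that added secret.txt) into
# "edit"; the sed call stands in for doing this by hand in an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD~2

# The rebase is now stopped at the bad commit: remove the file and amend.
git rm -q secret.txt
git commit -q --amend --no-edit
git rebase --continue

# The rewritten history no longer mentions secret.txt.
git log --oneline --stat
```

The old commits stay reachable through the reflog until it expires and the objects are pruned, so this only cleans the branch history itself.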
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
2009-08-13 12:46 rebase-with-history -- a technique for rebasing without trashing your repo history Michael Haggerty
2009-08-13 16:12 ` Björn Steinbrink
@ 2009-08-13 17:39 ` Bryan O'Sullivan
2009-08-13 20:31 ` Abderrahim Kitouni
2 siblings, 0 replies; 10+ messages in thread
From: Bryan O'Sullivan @ 2009-08-13 17:39 UTC (permalink / raw)
To: Michael Haggerty; +Cc: Bazaar, mercurial mailing list, Git Mailing List

On Thu, Aug 13, 2009 at 5:46 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:

> Sorry to cross-post, but I think this might be interesting to all three
> projects...

Please do not cross post, no matter how interesting you think the topic
might be.
* Re: rebase-with-history -- a technique for rebasing without trashing your repo history
2009-08-13 12:46 rebase-with-history -- a technique for rebasing without trashing your repo history Michael Haggerty
2009-08-13 16:12 ` Björn Steinbrink
2009-08-13 17:39 ` Bryan O'Sullivan
@ 2009-08-13 20:31 ` Abderrahim Kitouni
2 siblings, 0 replies; 10+ messages in thread
From: Abderrahim Kitouni @ 2009-08-13 20:31 UTC (permalink / raw)
To: Michael Haggerty; +Cc: Bazaar, mercurial mailing list, Git Mailing List

2009/8/13 Michael Haggerty <mhagger@alum.mit.edu>:

> I've been thinking a lot about the problems of tracking upstream changes
> while developing a feature branch.

Isn't this the purpose of pbranch [1] (and I believe bzr's loom [2] and
git's topgit [3])?

Peace,
Abderrahim

[1] http://arrenbrecht.ch/mercurial/pbranch/
[2] https://launchpad.net/bzr-loom
[3] http://repo.or.cz/w/topgit.git
end of thread, other threads:[~2009-08-15 3:36 UTC | newest]

Thread overview: 10+ messages
2009-08-13 12:46 rebase-with-history -- a technique for rebasing without trashing your repo history Michael Haggerty
2009-08-13 16:12 ` Björn Steinbrink
2009-08-13 22:39 ` Michael Haggerty
2009-08-13 23:30 ` Björn Steinbrink
2009-08-14 21:21 ` Michael Haggerty
2009-08-14 21:40 ` Nanako Shiraishi
2009-08-15 3:36 ` Björn Steinbrink
2009-08-14 3:17 ` Sitaram Chamarty
2009-08-13 17:39 ` Bryan O'Sullivan
2009-08-13 20:31 ` Abderrahim Kitouni