* [PATCH] rebase: be cleverer with rebased upstream branches @ 2011-02-14 13:51 Martin von Zweigbergk 2011-02-15 11:28 ` Santi Béjar 2011-02-15 20:30 ` Junio C Hamano 0 siblings, 2 replies; 19+ messages in thread From: Martin von Zweigbergk @ 2011-02-14 13:51 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, santi, Martin von Zweigbergk Since c85c792 (pull --rebase: be cleverer with rebased upstream branches, 2008-01-26), 'git pull --rebase' has used the reflog to try to rebase from the old upstream onto the new upstream. However, if, instead of 'git pull --rebase', the user were to do 'git fetch' followed by 'git rebase', the reflog would not be walked. This patch teaches "git rebase" the same reflog-walking tricks that 'git pull --rebase' already knows. This may be useful for rebasing one branch against another local branch that has been rebased. Currently, you would have to do that using 'git rebase --onto' or by configuring it on the branch. It might seem like most of the related code in git-pull.sh can be removed once git-rebase.sh supports reflog walking. Unfortunately, not much of it can be removed, though. The reason is that git-pull.sh simulates one step of "reflog walking" by keeping track of the position of the remote-tracking branch before and after the fetch operation. This does not rely on reflogs. There are at least two cases where the reflog is not used: a) when it is disabled, b) when the remote branch was specified on the command line (as in 'git pull --rebase origin master'). In both of these cases, git-pull.sh remembers the position of the reference before the fetch and uses that as a kind of '$upstream@{1}'. Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> --- This applies on top of mz/rebase. I have been using this in combination with the patch for defaulting rebase to @{upstream} for a few months now. I find it very convenient e.g. when Junio has published some changes and I want to rebase my branches. Then I just go to each of my branches and run 'git rebase' without any arguments, and they get rebased correctly, whether they are based on origin/master, origin/pu (they shouldn't be, but let's say they are anyway), or on a local branch that is in turn based on e.g. origin/master. HOWEVER, this causes a very noticable delay in some cases. With this patch, 'git rebase' walks the reflog of the upstream ref until it finds a commit that the branch-to-rebase contains. If the upstream ref has moved a lot since the branch was last rebased, there may be quite a few commits to test before the old upstream commit is found. The same thing can already occur with 'git pull --rebase' for exactly the same reasons. For example, assume that your upstream remote branch changes quite frequently and that you often fetch from the remote so that your origin/master gets a long reflog. If you then checkout some branch you had not been working on for a while, and run 'git pull', you get into the same situation. The delay is probably less likely to be noticed in the case of 'git pull --rebase', however, since most users will probably assume it is a problem with the network or the server. Of course, 'git pull --rebase' can also be used with a local branch configured as upstream. In this case, the behavior today is just like what this patch introduces for 'git rebase'. What do you think? I think it's a useful feature, but how do we handle the delay problem? Maybe simply by making it configurable? Should such a configuration variable apply to 'git pull --rebase' as well? It would seem inconsistent otherwise, but maybe that's ok since 'git pull --rebase' is usually used with remote-tracking branches, which probably change less frequently. Btw, is this a correct assumption? It is definitely true for my own work on git, but I actually think it's the other way around for my work at $dayjob. Am I missing some part to the puzzle that explains why I had not noticed the delay until I started using this patch? Documentation/git-rebase.txt | 7 ++++++- git-rebase.sh | 13 +++++++++++++ t/t3400-rebase.sh | 26 ++++++++++++++++++++++++++ 3 files changed, 45 insertions(+), 1 deletions(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index 095a67f..d4dbe28 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -24,7 +24,12 @@ it remains on the current branch. All changes made by commits in the current branch but that are not in <upstream> are saved to a temporary area. This is the same set of commits that would be shown by `git log <upstream>..HEAD` (or -`git log HEAD`, if --root is specified). +`git log HEAD`, if --root is specified). If, however, there is a ref +for the upstream branch, and this branch was rebased since the +current branch was last rebased, the rebase uses that information to +avoid rebasing changes done on the upstream branch. If you do not want +'git rebase' to use this intelligence, refer to the upstream without +using a reference (e.g. 'master~0'). The current branch is reset to <upstream>, or <newbase> if the --onto option was supplied. This has the exact same effect as diff --git a/git-rebase.sh b/git-rebase.sh index 5abfeac..1bc0c29 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -466,6 +466,19 @@ esac require_clean_work_tree "rebase" "Please commit or stash them." +test -n "$upstream_name" && for reflog in \ + $(git rev-list -g $upstream_name 2>/dev/null) +do + if test $reflog = $(git merge-base $reflog $orig_head) + then + if test $reflog != $(git merge-base $onto $reflog) + then + upstream=$reflog + fi + break + fi +done + # Now we are rebasing commits $upstream..$orig_head (or with --root, # everything leading up to $orig_head) on top of $onto diff --git a/t/t3400-rebase.sh b/t/t3400-rebase.sh index 349eebd..b64df31 100755 --- a/t/t3400-rebase.sh +++ b/t/t3400-rebase.sh @@ -209,4 +209,30 @@ test_expect_success 'rebase -m can copy notes' ' test "a note" = "$(git notes show HEAD)" ' +test_expect_success 'rebase against rebased upstream uses reflog' ' + git checkout my-topic-branch && + echo "Conflicting modification" > B && + git add B && + git commit -m "Modify B" && + git reset --hard nonlinear && + git checkout -b old-topic my-topic-branch@{1} && + echo def > D && + git add D && + git commit -m "Add D" && + git rebase my-topic-branch && + test $(git rev-parse HEAD^) = $(git rev-parse my-topic-branch) +' + +test_expect_success 'rebase against forwarded upstream does not reapply patches' ' + git checkout my-topic-branch && + echo abc > B && + git add B && + git commit -m "Conficting B" && + git reset HEAD~2 && + git reset HEAD@{1} && + git checkout old-topic && + git rebase my-topic-branch && + test $(git rev-parse HEAD^) = $(git rev-parse my-topic-branch) +' + test_done -- 1.7.4.rc2.33.g8a14f ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-14 13:51 [PATCH] rebase: be cleverer with rebased upstream branches Martin von Zweigbergk @ 2011-02-15 11:28 ` Santi Béjar 2011-02-16 1:37 ` Martin von Zweigbergk 2011-02-15 20:30 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Santi Béjar @ 2011-02-15 11:28 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: git, Junio C Hamano On Mon, Feb 14, 2011 at 1:51 PM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> wrote: > Since c85c792 (pull --rebase: be cleverer with rebased upstream > branches, 2008-01-26), 'git pull --rebase' has used the reflog to try > to rebase from the old upstream onto the new upstream. > > However, if, instead of 'git pull --rebase', the user were to do 'git > fetch' followed by 'git rebase', the reflog would not be walked. This > patch teaches "git rebase" the same reflog-walking tricks that 'git > pull --rebase' already knows. > > This may be useful for rebasing one branch against another local > branch that has been rebased. Currently, you would have to do that > using 'git rebase --onto' or by configuring it on the branch. It make sense. > > It might seem like most of the related code in git-pull.sh can be > removed once git-rebase.sh supports reflog walking. Unfortunately, not > much of it can be removed, though. The reason is that git-pull.sh > simulates one step of "reflog walking" by keeping track of the > position of the remote-tracking branch before and after the fetch > operation. This does not rely on reflogs. There are at least two cases > where the reflog is not used: a) when it is disabled, b) when the > remote branch was specified on the command line (as in 'git pull > --rebase origin master'). In both of these cases, git-pull.sh > remembers the position of the reference before the fetch and uses that > as a kind of '$upstream@{1}'. I don't agree with point b). In line 190: remoteref="$(get_remote_merge_branch "$@" 2>/dev/null)" && It returns the local tracking branch for repo=origin and branch=master and uses its reflog. The end result is the same, there is one case where you need the old value of the tracking branch, so it should be done in git-pull. But I wonder if it is possible to write a function shared by git-pull.sh and git-rebase.sh that computes the branch forking points, the number of arguments could detect if it has the old-remote-hash or not. > > Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> > --- [...] > > HOWEVER, this causes a very noticable delay in some cases. With this > patch, 'git rebase' walks the reflog of the upstream ref until it > finds a commit that the branch-to-rebase contains. If the upstream ref > has moved a lot since the branch was last rebased, there may be quite > a few commits to test before the old upstream commit is found. > > The same thing can already occur with 'git pull --rebase' for exactly > the same reasons. For example, assume that your upstream remote branch > changes quite frequently and that you often fetch from the remote so > that your origin/master gets a long reflog. If you then checkout some > branch you had not been working on for a while, and run 'git pull', > you get into the same situation. The delay is probably less likely to > be noticed in the case of 'git pull --rebase', however, since most > users will probably assume it is a problem with the network or the > server. > > Of course, 'git pull --rebase' can also be used with a local branch > configured as upstream. In this case, the behavior today is just like > what this patch introduces for 'git rebase'. > > What do you think? I think it's a useful feature, but how do we handle > the delay problem? Maybe simply by making it configurable? > > Should such a configuration variable apply to 'git pull --rebase' as > well? It would seem inconsistent otherwise, but maybe that's ok since > 'git pull --rebase' is usually used with remote-tracking branches, > which probably change less frequently. Btw, is this a correct > assumption? It is definitely true for my own work on git, but I > actually think it's the other way around for my work at $dayjob. Am I > missing some part to the puzzle that explains why I had not noticed > the delay until I started using this patch? I agree with you that it may add a long delay in some cases. I normally rebase branches based on an upstream branch and this may explain why I haven't seen the delay (or maybe I thought it was a network delay). I think the delay could be much shorter if the computation was not in shell, but in C. Or maybe change the algorithm. So I don't think a configuration item is the answer here. Other than that I think the change make sense it include docs and tests and works, thanks. Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-15 11:28 ` Santi Béjar @ 2011-02-16 1:37 ` Martin von Zweigbergk 2011-02-16 9:26 ` Santi Béjar 0 siblings, 1 reply; 19+ messages in thread From: Martin von Zweigbergk @ 2011-02-16 1:37 UTC (permalink / raw) To: Santi Béjar; +Cc: Martin von Zweigbergk, git, Junio C Hamano On Tue, 15 Feb 2011, Santi B?jar wrote: > On Mon, Feb 14, 2011 at 1:51 PM, Martin von Zweigbergk > <martin.von.zweigbergk@gmail.com> wrote: > > It might seem like most of the related code in git-pull.sh can be > > removed once git-rebase.sh supports reflog walking. Unfortunately, not > > much of it can be removed, though. The reason is that git-pull.sh > > simulates one step of "reflog walking" by keeping track of the > > position of the remote-tracking branch before and after the fetch > > operation. This does not rely on reflogs. There are at least two cases > > where the reflog is not used: a) when it is disabled, b) when the > > remote branch was specified on the command line (as in 'git pull > > --rebase origin master'). In both of these cases, git-pull.sh > > remembers the position of the reference before the fetch and uses that > > as a kind of '$upstream@{1}'. > > I don't agree with point b). In line 190: > > remoteref="$(get_remote_merge_branch "$@" 2>/dev/null)" && > > It returns the local tracking branch for repo=origin and branch=master > and uses its reflog. Yes, but the local tracking branch is not updated when the two-argument version of 'git pull' is used [1]. > The end result is the same, there is one case where you need the old > value of the tracking branch, so it should be done in git-pull. True, case a) is still there. I was just trying to explain why I didn't just move the code from git-pull.sh to git-rebase.sh, but maybe it confused more than it clarified... > But I wonder if it is possible to write a function shared by > git-pull.sh and git-rebase.sh that computes the branch forking points, > the number of arguments could detect if it has the old-remote-hash or > not. Makes sense. I will have a look at it. > I think the delay could be much shorter if the computation was not in > shell, but in C. Or maybe change the algorithm. So I don't think a > configuration item is the answer here. It would definitely be nice if we can make it fast enough so we don't have to make it configurable. [1] http://thread.gmane.org/gmane.comp.version-control.git/165758 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 1:37 ` Martin von Zweigbergk @ 2011-02-16 9:26 ` Santi Béjar 0 siblings, 0 replies; 19+ messages in thread From: Santi Béjar @ 2011-02-16 9:26 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: git, Junio C Hamano On Wed, Feb 16, 2011 at 1:37 AM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> wrote: > On Tue, 15 Feb 2011, Santi B?jar wrote: > >> On Mon, Feb 14, 2011 at 1:51 PM, Martin von Zweigbergk >> <martin.von.zweigbergk@gmail.com> wrote: >> > It might seem like most of the related code in git-pull.sh can be >> > removed once git-rebase.sh supports reflog walking. Unfortunately, not >> > much of it can be removed, though. The reason is that git-pull.sh >> > simulates one step of "reflog walking" by keeping track of the >> > position of the remote-tracking branch before and after the fetch >> > operation. This does not rely on reflogs. There are at least two cases >> > where the reflog is not used: a) when it is disabled, b) when the >> > remote branch was specified on the command line (as in 'git pull >> > --rebase origin master'). In both of these cases, git-pull.sh >> > remembers the position of the reference before the fetch and uses that >> > as a kind of '$upstream@{1}'. >> >> I don't agree with point b). In line 190: >> >> remoteref="$(get_remote_merge_branch "$@" 2>/dev/null)" && >> >> It returns the local tracking branch for repo=origin and branch=master >> and uses its reflog. > > Yes, but the local tracking branch is not updated when the > two-argument version of 'git pull' is used [1]. Yes, but the reflog is used nevertheless and it can use the local tracking branch as the old-remote-hash. > >> The end result is the same, there is one case where you need the old >> value of the tracking branch, so it should be done in git-pull. > > True, case a) is still there. I was just trying to explain why I > didn't just move the code from git-pull.sh to git-rebase.sh, but maybe > it confused more than it clarified... Yes, with only case a) is sufficient (and complete). HTH, Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-14 13:51 [PATCH] rebase: be cleverer with rebased upstream branches Martin von Zweigbergk 2011-02-15 11:28 ` Santi Béjar @ 2011-02-15 20:30 ` Junio C Hamano 2011-02-16 3:03 ` Martin von Zweigbergk 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2011-02-15 20:30 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: git, santi Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: > diff --git a/git-rebase.sh b/git-rebase.sh > index 5abfeac..1bc0c29 100755 > --- a/git-rebase.sh > +++ b/git-rebase.sh > @@ -466,6 +466,19 @@ esac > > require_clean_work_tree "rebase" "Please commit or stash them." > > +test -n "$upstream_name" && for reflog in \ > + $(git rev-list -g $upstream_name 2>/dev/null) Ugly. test -n "$upstream_name" && for reflog in $(git rev-list ...) do ... done Don't you need to make sure $upstream_name is a branch (or a ref in general that can have a reflog), or does it not matter because the "rev-list -g" will die without producing anything and you are discarding the error message? Now, a handful of random questions, none of them rhetorical, as I don't know the answers to any of them. Would it help if the code is made just as clever as the patch attempts to be, when the user says git rebase origin/next~4 IOW, use the reflog of origin/next even in such a case? > +do > + if test $reflog = $(git merge-base $reflog $orig_head) > + then > + if test $reflog != $(git merge-base $onto $reflog) > + then > + upstream=$reflog > + fi > + break > + fi Do we always traverse down to the beginning of the reflog in the worst case? Would bisection help to avoid the cost? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-15 20:30 ` Junio C Hamano @ 2011-02-16 3:03 ` Martin von Zweigbergk 2011-02-16 12:10 ` Santi Béjar 0 siblings, 1 reply; 19+ messages in thread From: Martin von Zweigbergk @ 2011-02-16 3:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin von Zweigbergk, git, santi, Thomas Rast On Tue, 15 Feb 2011, Junio C Hamano wrote: > Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: > > > diff --git a/git-rebase.sh b/git-rebase.sh > > index 5abfeac..1bc0c29 100755 > > --- a/git-rebase.sh > > +++ b/git-rebase.sh > > @@ -466,6 +466,19 @@ esac > > > > require_clean_work_tree "rebase" "Please commit or stash them." > > > > +test -n "$upstream_name" && for reflog in \ > > + $(git rev-list -g $upstream_name 2>/dev/null) > > Ugly. Very. Fixed. Thanks. > test -n "$upstream_name" && > for reflog in $(git rev-list ...) > do > ... > done > > Don't you need to make sure $upstream_name is a branch (or a ref in > general that can have a reflog), or does it not matter because the > "rev-list -g" will die without producing anything and you are discarding > the error message? Exactly as you suspect. Is it too ugly? > Now, a handful of random questions, none of them rhetorical, as I don't > know the answers to any of them. > > Would it help if the code is made just as clever as the patch attempts to > be, when the user says > > git rebase origin/next~4 > > IOW, use the reflog of origin/next even in such a case? Not sure. I think it seems too rare to worry about. In those cases, one could still use the good old '--onto' option manually. Also, if we don't handle the ref~4 case, the "cleverness" can be disabled by using ref~0. > > +do > > + if test $reflog = $(git merge-base $reflog $orig_head) > > + then > > + if test $reflog != $(git merge-base $onto $reflog) > > + then > > + upstream=$reflog > > + fi > > + break > > + fi > > Do we always traverse down to the beginning of the reflog in the worst > case? Yes. > Would bisection help to avoid the cost? I don't think the straight-forward use of bisection would work. If the history looks something like below, where 'b' is the branch to rebase and 'u' is the upstream, we have to go through each entry in the reflog to find u@{3}. .-u@{0} / .---u@{1} / x---y-----u@{2} \ .---u@{3}---b \ .-u@{4} I have an idea inspired by bisection, Thomas's exponential stride, and what someone (you?) mentioned the other day about virtual merge commits. I haven't tried it out, but let me know what you think. I'll try to explain it using an example only: Exponential stride phase: 1. candidates={ u@{0} } merge-base b $candidates -> y, _not_ in $candidates 2. candidates={ u@{1} u@{2} } merge-base b $candidates -> y, _not_ in $candidates 3. candidates={ u@{3} u@{4} u@{5} u@{6} } merge-base b $candidates -> u@{3}, in $candidates Bisection phase: 1. candidates={ u@{3} u@{4} } merge-base b $candidates -> u@{3}, in $candidates 2. candidates={ u@{3} } merge-base b $candidates -> u@{3}, in $candidates, done It works for the few cases I have thought of, but it may break in other other cases. I just read about the virtual merge commits, so I'm not sure I understand correctly how that works eiter. Would it even perform better than searching linearly? I tried stepping through it manually a few times and it seems faster. Maybe something based on timestamps would be better? /Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 3:03 ` Martin von Zweigbergk @ 2011-02-16 12:10 ` Santi Béjar 2011-02-16 13:22 ` Santi Béjar 2011-02-16 16:45 ` Martin von Zweigbergk 0 siblings, 2 replies; 19+ messages in thread From: Santi Béjar @ 2011-02-16 12:10 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Junio C Hamano, git, Thomas Rast On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> wrote: > On Tue, 15 Feb 2011, Junio C Hamano wrote: > >> Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: >> >> > diff --git a/git-rebase.sh b/git-rebase.sh >> > index 5abfeac..1bc0c29 100755 >> > --- a/git-rebase.sh >> > +++ b/git-rebase.sh >> test -n "$upstream_name" && >> for reflog in $(git rev-list ...) >> do >> ... >> done >> >> Don't you need to make sure $upstream_name is a branch (or a ref in >> general that can have a reflog), or does it not matter because the >> "rev-list -g" will die without producing anything and you are discarding >> the error message? > > Exactly as you suspect. Is it too ugly? I also prefer Junio's version. > >> Now, a handful of random questions, none of them rhetorical, as I don't >> know the answers to any of them. >> >> Would it help if the code is made just as clever as the patch attempts to >> be, when the user says >> >> git rebase origin/next~4 >> >> IOW, use the reflog of origin/next even in such a case? > > Not sure. I think it seems too rare to worry about. In those cases, > one could still use the good old '--onto' option manually. Also, if we > don't handle the ref~4 case, the "cleverness" can be disabled by using > ref~0. With ref~4 you are specifying a commit, so I would expect to rebase to use it as such, not also as a branch ref. > >> > +do >> > + if test $reflog = $(git merge-base $reflog $orig_head) >> > + then >> > + if test $reflog != $(git merge-base $onto $reflog) >> > + then >> > + upstream=$reflog >> > + fi >> > + break >> > + fi >> >> Do we always traverse down to the beginning of the reflog in the worst >> case? > > Yes. > >> Would bisection help to avoid the cost? > > I don't think the straight-forward use of bisection would work. If the > history looks something like below, where 'b' is the branch to rebase > and 'u' is the upstream, we have to go through each entry in the > reflog to find u@{3}. > > > .-u@{0} > / > .---u@{1} > / > x---y-----u@{2} > \ > .---u@{3}---b > \ > .-u@{4} > > > I have an idea inspired by bisection, Thomas's exponential stride, and > what someone (you?) mentioned the other day about virtual merge > commits. I haven't tried it out, but let me know what you think. I'll > try to explain it using an example only: > > Exponential stride phase: > 1. candidates={ u@{0} } > merge-base b $candidates -> y, _not_ in $candidates > 2. candidates={ u@{1} u@{2} } > merge-base b $candidates -> y, _not_ in $candidates > 3. candidates={ u@{3} u@{4} u@{5} u@{6} } > merge-base b $candidates -> u@{3}, in $candidates Doesn't it indicate that u@{3} is the commit we are looking for? I haven't found a counterexample... If this is true the following patch can implement it for git-pull.sh and git-rebase.sh (sorry if it is space damaged): diff --git i/git-pull.sh w/git-pull.sh index 2cdea26..09ef0a9 100755 --- i/git-pull.sh +++ w/git-pull.sh @@ -189,14 +189,7 @@ test true = "$rebase" && { . git-parse-remote && remoteref="$(get_remote_merge_branch "$@" 2>/dev/null)" && oldremoteref="$(git rev-parse -q --verify "$remoteref")" && - for reflog in $(git rev-list -g $remoteref 2>/dev/null) - do - if test "$reflog" = "$(git merge-base $reflog $curr_branch)" - then - oldremoteref="$reflog" - break - fi - done + oldremoteref=$(git merge-base $curr_branch $oldremoteref $(git rev-list -g $remoteref 2>/dev/null)) } orig_head=$(git rev-parse -q --verify HEAD) git fetch $verbosity $progress $dry_run $recurse_submodules --update-head-ok "$@" || exit 1 diff --git i/git-rebase.sh w/git-rebase.sh index 0d245fe..4b3e131 100755 --- i/git-rebase.sh +++ w/git-rebase.sh @@ -448,18 +448,8 @@ esac require_clean_work_tree "rebase" "Please commit or stash them." -test -n "$upstream_name" && for reflog in \ - $(git rev-list -g $upstream_name 2>/dev/null) -do - if test $reflog = $(git merge-base $reflog $orig_head) - then - if test $reflog != $(git merge-base $onto $reflog) - then - upstream=$reflog - fi - break - fi -done +test -n "$upstream_name" && +upstream=$(git merge-base $orig_head $(git rev-list -g $upstream_name 2>/dev/null)) # Now we are rebasing commits $upstream..$orig_head (or with --root, # everything leading up to $orig_head) on top of $onto diff --git i/t/t3408-rebase-multi-line.sh w/t/t3408-rebase-multi-line.sh index 6b84e60..bee4494 100755 --- i/t/t3408-rebase-multi-line.sh +++ w/t/t3408-rebase-multi-line.sh @@ -10,7 +10,12 @@ test_expect_success setup ' git add file && test_tick && git commit -m initial && + >elif && + git add elif && + test_tick && + git commit -m second && + git checkout -b side HEAD^ echo hello >file && test_tick && git commit -a -m "A sample commit log message that has a long @@ -18,13 +23,7 @@ summary that spills over multiple lines. But otherwise with a sane description." && - git branch side && - - git reset --hard HEAD^ && - >elif && - git add elif && - test_tick && - git commit -m second + git checkout master ' It passes the "git pull --rebase" test and the basic "git rebase branch" tests, but it fails basically with two type of tests: 1) those involving "git rebase -i" (I'll try to debug those but I find it difficult to debug all those FAKE_LINES), and those with "bad" reflogs as shown in the above patch to t3408 (see the next paragraph). Trying to find the counterexample (and debugging the failing test above) I've found one corner we don't handle (neither in git-pull.sh nor in git-rebase.sh). It is the case when the upstream branch is "fast-backwards" into an older commit without extra commits on top. Something like this: x---y----u@{1}---u@{2}---b \ .---u@{0} In this case the algorithm picks u@{1} instead of u@{2} (the alternative algorithm has the same problem when u@{n} and u@{n+1} are in different exponential phases. Or the simple case in: u@{2}---u@{0} \ .---u@{1}=b I think this is a very rare corner case as the upstream branch has to be "fast-backward", and you have to fetch this state. So far nobody has found it, at least. > Bisection phase: > 1. candidates={ u@{3} u@{4} } > merge-base b $candidates -> u@{3}, in $candidates > 2. candidates={ u@{3} } > merge-base b $candidates -> u@{3}, in $candidates, done > > > It works for the few cases I have thought of, but it may break in > other other cases. I just read about the virtual merge commits, so I'm > not sure I understand correctly how that works eiter. Me too. HTH, Santi ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 12:10 ` Santi Béjar @ 2011-02-16 13:22 ` Santi Béjar 2011-02-16 19:07 ` Junio C Hamano 2011-02-16 16:45 ` Martin von Zweigbergk 1 sibling, 1 reply; 19+ messages in thread From: Santi Béjar @ 2011-02-16 13:22 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Junio C Hamano, git, Thomas Rast [-- Attachment #1: Type: text/plain, Size: 3096 bytes --] On Wed, Feb 16, 2011 at 1:10 PM, Santi Béjar <santi@agolina.net> wrote: > On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk > <martin.von.zweigbergk@gmail.com> wrote: >> On Tue, 15 Feb 2011, Junio C Hamano wrote: >> >>> Would bisection help to avoid the cost? >> >> I don't think the straight-forward use of bisection would work. If the >> history looks something like below, where 'b' is the branch to rebase >> and 'u' is the upstream, we have to go through each entry in the >> reflog to find u@{3}. >> >> >> .-u@{0} >> / >> .---u@{1} >> / >> x---y-----u@{2} >> \ >> .---u@{3}---b >> \ >> .-u@{4} >> >> >> I have an idea inspired by bisection, Thomas's exponential stride, and >> what someone (you?) mentioned the other day about virtual merge >> commits. I haven't tried it out, but let me know what you think. I'll >> try to explain it using an example only: >> >> Exponential stride phase: >> 1. candidates={ u@{0} } >> merge-base b $candidates -> y, _not_ in $candidates >> 2. candidates={ u@{1} u@{2} } >> merge-base b $candidates -> y, _not_ in $candidates >> 3. candidates={ u@{3} u@{4} u@{5} u@{6} } >> merge-base b $candidates -> u@{3}, in $candidates > > Doesn't it indicate that u@{3} is the commit we are looking for? I > haven't found a counterexample... > > If this is true the following patch can implement it for git-pull.sh and > git-rebase.sh (sorry if it is space damaged): > > diff --git i/git-pull.sh w/git-pull.sh > index 2cdea26..09ef0a9 100755 > --- i/git-pull.sh > +++ w/git-pull.sh > @@ -189,14 +189,7 @@ test true = "$rebase" && { > . git-parse-remote && > remoteref="$(get_remote_merge_branch "$@" 2>/dev/null)" && > oldremoteref="$(git rev-parse -q --verify "$remoteref")" && > - for reflog in $(git rev-list -g $remoteref 2>/dev/null) > - do > - if test "$reflog" = "$(git merge-base $reflog $curr_branch)" > - then > - oldremoteref="$reflog" > - break > - fi > - done > + oldremoteref=$(git merge-base $curr_branch $oldremoteref $(git > rev-list -g $remoteref 2>/dev/null)) One thing I forgot to say is that it seems to perform quite well: Hot cache: $ git rev-list -g origin/next | wc -l 36 $ git rev-list -g origin/master | wc -l 52 $ time git merge-base $(git rev-list -g origin/next) 3b781df0a25d5ba23bd2603b0e3e9bb4731369df real 0m0.155s user 0m0.064s sys 0m0.044s $ time git merge-base $(git rev-list -g origin/master) 7811d9600f02e70c9f835719c71156c967a684f7 real 0m0.161s user 0m0.064s sys 0m0.040s $ time git merge-base $(git rev-list -g origin/master origin/master) 7811d9600f02e70c9f835719c71156c967a684f7 real 0m0.175s user 0m0.076s sys 0m0.036s Cold-cache around 1 second. And it's not linear with the number of reflog entries, but with the number of independent branches, I suppose. HTH, Santi P.D.: Attached is the patch, in case someone wants to try it. [-- Attachment #2: patch --] [-- Type: application/octet-stream, Size: 2185 bytes --] diff --git i/git-pull.sh w/git-pull.sh index 2cdea26..09ef0a9 100755 --- i/git-pull.sh +++ w/git-pull.sh @@ -189,14 +189,7 @@ test true = "$rebase" && { . git-parse-remote && remoteref="$(get_remote_merge_branch "$@" 2>/dev/null)" && oldremoteref="$(git rev-parse -q --verify "$remoteref")" && - for reflog in $(git rev-list -g $remoteref 2>/dev/null) - do - if test "$reflog" = "$(git merge-base $reflog $curr_branch)" - then - oldremoteref="$reflog" - break - fi - done + oldremoteref=$(git merge-base $curr_branch $oldremoteref $(git rev-list -g $remoteref 2>/dev/null)) } orig_head=$(git rev-parse -q --verify HEAD) git fetch $verbosity $progress $dry_run $recurse_submodules --update-head-ok "$@" || exit 1 diff --git i/git-rebase.sh w/git-rebase.sh index 0d245fe..4b3e131 100755 --- i/git-rebase.sh +++ w/git-rebase.sh @@ -448,18 +448,8 @@ esac require_clean_work_tree "rebase" "Please commit or stash them." -test -n "$upstream_name" && for reflog in \ - $(git rev-list -g $upstream_name 2>/dev/null) -do - if test $reflog = $(git merge-base $reflog $orig_head) - then - if test $reflog != $(git merge-base $onto $reflog) - then - upstream=$reflog - fi - break - fi -done +test -n "$upstream_name" && +upstream=$(git merge-base $orig_head $(git rev-list -g $upstream_name 2>/dev/null)) # Now we are rebasing commits $upstream..$orig_head (or with --root, # everything leading up to $orig_head) on top of $onto diff --git i/t/t3408-rebase-multi-line.sh w/t/t3408-rebase-multi-line.sh index 6b84e60..bee4494 100755 --- i/t/t3408-rebase-multi-line.sh +++ w/t/t3408-rebase-multi-line.sh @@ -10,7 +10,12 @@ test_expect_success setup ' git add file && test_tick && git commit -m initial && + >elif && + git add elif && + test_tick && + git commit -m second && + git checkout -b side HEAD^ echo hello >file && test_tick && git commit -a -m "A sample commit log message that has a long @@ -18,13 +23,7 @@ summary that spills over multiple lines. But otherwise with a sane description." && - git branch side && - - git reset --hard HEAD^ && - >elif && - git add elif && - test_tick && - git commit -m second + git checkout master ' ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 13:22 ` Santi Béjar @ 2011-02-16 19:07 ` Junio C Hamano 2011-02-16 21:16 ` Santi Béjar 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2011-02-16 19:07 UTC (permalink / raw) To: Santi Béjar; +Cc: Martin von Zweigbergk, git, Thomas Rast Santi Béjar <santi@agolina.net> writes: >> + oldremoteref=$(git merge-base $curr_branch $oldremoteref $(git >> rev-list -g $remoteref 2>/dev/null)) Yuck; the entire set of commits that appear in reflog can be quite long. What will happen when this exceeds the shell command line limit or when you get E2BIG from execve(2)? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 19:07 ` Junio C Hamano @ 2011-02-16 21:16 ` Santi Béjar 0 siblings, 0 replies; 19+ messages in thread From: Santi Béjar @ 2011-02-16 21:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin von Zweigbergk, git, Thomas Rast On Wed, Feb 16, 2011 at 8:07 PM, Junio C Hamano <gitster@pobox.com> wrote: > Santi Béjar <santi@agolina.net> writes: > >>> + oldremoteref=$(git merge-base $curr_branch $oldremoteref $(git >>> rev-list -g $remoteref 2>/dev/null)) > > Yuck; the entire set of commits that appear in reflog can be quite long. > What will happen when this exceeds the shell command line limit or when > you get E2BIG from execve(2)? > Sure, but it was just a proof of concept to test the algorithm. Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 12:10 ` Santi Béjar 2011-02-16 13:22 ` Santi Béjar @ 2011-02-16 16:45 ` Martin von Zweigbergk 2011-02-17 10:24 ` Santi Béjar 1 sibling, 1 reply; 19+ messages in thread From: Martin von Zweigbergk @ 2011-02-16 16:45 UTC (permalink / raw) To: Santi Béjar; +Cc: Martin von Zweigbergk, Junio C Hamano, git, Thomas Rast On Wed, 16 Feb 2011, Santi B?jar wrote: > On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk > <martin.von.zweigbergk@gmail.com> wrote: > > On Tue, 15 Feb 2011, Junio C Hamano wrote: > > > >> Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: > >> > >> > diff --git a/git-rebase.sh b/git-rebase.sh > >> > index 5abfeac..1bc0c29 100755 > >> > --- a/git-rebase.sh > >> > +++ b/git-rebase.sh > >> test -n "$upstream_name" && > >> for reflog in $(git rev-list ...) > >> do > >> ... > >> done > >> > >> Don't you need to make sure $upstream_name is a branch (or a ref in > >> general that can have a reflog), or does it not matter because the > >> "rev-list -g" will die without producing anything and you are discarding > >> the error message? > > > > Exactly as you suspect. Is it too ugly? > > I also prefer Junio's version. I fixed the test + for loop, if that's what you mean by "Junio's version". Or did you mean "make sure $upstream_name is a branch"? I could do that as well if you like. I have no preference. > > .-u@{0} > > / > > .---u@{1} > > / > > x---y-----u@{2} > > \ > > .---u@{3}---b > > \ > > .-u@{4} > > > > > > I have an idea inspired by bisection, Thomas's exponential stride, and > > what someone (you?) mentioned the other day about virtual merge > > commits. I haven't tried it out, but let me know what you think. I'll > > try to explain it using an example only: > > > > Exponential stride phase: > > 1. candidates={ u@{0} } > > merge-base b $candidates -> y, _not_ in $candidates > > 2. candidates={ u@{1} u@{2} } > > merge-base b $candidates -> y, _not_ in $candidates > > 3. candidates={ u@{3} u@{4} u@{5} u@{6} } > > merge-base b $candidates -> u@{3}, in $candidates > > Doesn't it indicate that u@{3} is the commit we are looking for? I > haven't found a counterexample... Yes, of course. Stupid me ;-). Forget about the other half. (I think that's what I did manually to match the sha1 back to the ref name, but that is of course complete non-sense to do in the script.) > If this is true the following patch can implement it for git-pull.sh and > git-rebase.sh (sorry if it is space damaged): Thanks! Will have a closer look at it later today. If I understand correctly, you simply call merge-base with the _entire_ reflog. I would have thought that would be slow, but it's great if that is fast enough. The resulting code looks very nice and short. Thanks again. /Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-16 16:45 ` Martin von Zweigbergk @ 2011-02-17 10:24 ` Santi Béjar 2011-03-12 21:15 ` Martin von Zweigbergk 0 siblings, 1 reply; 19+ messages in thread From: Santi Béjar @ 2011-02-17 10:24 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Junio C Hamano, git, Thomas Rast On Wed, Feb 16, 2011 at 5:45 PM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> wrote: > On Wed, 16 Feb 2011, Santi B?jar wrote: > >> On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk >> <martin.von.zweigbergk@gmail.com> wrote: >> > On Tue, 15 Feb 2011, Junio C Hamano wrote: >> > >> >> Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: >> >> >> >> > diff --git a/git-rebase.sh b/git-rebase.sh >> >> > index 5abfeac..1bc0c29 100755 >> >> > --- a/git-rebase.sh >> >> > +++ b/git-rebase.sh >> >> test -n "$upstream_name" && >> >> for reflog in $(git rev-list ...) >> >> do >> >> ... >> >> done >> >> >> >> Don't you need to make sure $upstream_name is a branch (or a ref in >> >> general that can have a reflog), or does it not matter because the >> >> "rev-list -g" will die without producing anything and you are discarding >> >> the error message? >> > >> > Exactly as you suspect. Is it too ugly? >> >> I also prefer Junio's version. > > I fixed the test + for loop, if that's what you mean by "Junio's > version". Or did you mean "make sure $upstream_name is a branch"? I > could do that as well if you like. I have no preference. I meant the test + loop, but I would also "make sure $upstream_name is a branch", as done in git-pull.sh with: git rev-parse -q --verify "$remoteref" > >> > .-u@{0} >> > / >> > .---u@{1} >> > / >> > x---y-----u@{2} >> > \ >> > .---u@{3}---b >> > \ >> > .-u@{4} >> > >> > >> > I have an idea inspired by bisection, Thomas's exponential stride, and >> > what someone (you?) mentioned the other day about virtual merge >> > commits. I haven't tried it out, but let me know what you think. I'll >> > try to explain it using an example only: >> > >> > Exponential stride phase: >> > 1. candidates={ u@{0} } >> > merge-base b $candidates -> y, _not_ in $candidates >> > 2. candidates={ u@{1} u@{2} } >> > merge-base b $candidates -> y, _not_ in $candidates >> > 3. candidates={ u@{3} u@{4} u@{5} u@{6} } >> > merge-base b $candidates -> u@{3}, in $candidates >> >> Doesn't it indicate that u@{3} is the commit we are looking for? I >> haven't found a counterexample... > > Yes, of course. Stupid me ;-). Forget about the other half. (I think > that's what I did manually to match the sha1 back to the ref name, but > that is of course complete non-sense to do in the script.) > >> If this is true the following patch can implement it for git-pull.sh and >> git-rebase.sh (sorry if it is space damaged): > > Thanks! Will have a closer look at it later today. If I understand > correctly, you simply call merge-base with the _entire_ reflog. I Yes, that is the idea (plus the old remote hash in case of git-pull) > would have thought that would be slow, but it's great if that is fast > enough. Yes, I think it is fast enough in the normal case. Even feeding the entire git.git's master, ~25000 revisions, it takes around 2-4 seconds only: $ git rev-list origin/master | wc -l 24380 $ time git merge-base $(git rev-list origin/master) 9971d6d52c5afeb8ba60ae6ddcffb34af23eeadd real 0m4.014s user 0m1.520s sys 0m2.284s (2.5GHz CPU) But, as Junio showed, it has problems when the reflog lenght is too large. Maybe git-merge-base can learn the --stdin flag, or we could process the reflog in batches of 1000 (?) entries, ... but the nice property of using the entire reflog is that the output is what you are looking for, if you take the first 1000 entries you have to check if the output is one of these entries. > The resulting code looks very nice and short. Thanks again. No, thanks to you for the nice idea! HTH, Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-02-17 10:24 ` Santi Béjar @ 2011-03-12 21:15 ` Martin von Zweigbergk 2011-03-12 23:51 ` Santi Béjar 2011-03-13 23:09 ` Santi Béjar 0 siblings, 2 replies; 19+ messages in thread From: Martin von Zweigbergk @ 2011-03-12 21:15 UTC (permalink / raw) To: Santi Béjar; +Cc: Martin von Zweigbergk, Junio C Hamano, git, Thomas Rast On Thu, 17 Feb 2011, Santi B?jar wrote: > On Wed, Feb 16, 2011 at 5:45 PM, Martin von Zweigbergk > <martin.von.zweigbergk@gmail.com> wrote: > > On Wed, 16 Feb 2011, Santi B?jar wrote: > > > >> On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk > >> <martin.von.zweigbergk@gmail.com> wrote: > >> > > >> > .-u@{0} > >> > / > >> > .---u@{1} > >> > / > >> > x---y-----u@{2} > >> > \ > >> > .---u@{3}---b > >> > \ > >> > .-u@{4} > >> > > >> > > >> > I have an idea inspired by bisection, Thomas's exponential stride, and > >> > what someone (you?) mentioned the other day about virtual merge > >> > commits. I haven't tried it out, but let me know what you think. I'll > >> > try to explain it using an example only: > >> > > >> > Exponential stride phase: > >> > 1. candidates={ u@{0} } > >> > merge-base b $candidates -> y, _not_ in $candidates > >> > 2. candidates={ u@{1} u@{2} } > >> > merge-base b $candidates -> y, _not_ in $candidates > >> > 3. candidates={ u@{3} u@{4} u@{5} u@{6} } > >> > merge-base b $candidates -> u@{3}, in $candidates > >> > >> Doesn't it indicate that u@{3} is the commit we are looking for? I > >> haven't found a counterexample... > > > > Yes, of course. Stupid me ;-). Forget about the other half. (I think > > that's what I did manually to match the sha1 back to the ref name, but > > that is of course complete non-sense to do in the script.) > > > >> If this is true the following patch can implement it for git-pull.sh and > >> git-rebase.sh (sorry if it is space damaged): > > > > Thanks! Will have a closer look at it later today. If I understand > > correctly, you simply call merge-base with the _entire_ reflog. I > > Yes, that is the idea (plus the old remote hash in case of git-pull) > > > would have thought that would be slow, but it's great if that is fast > > enough. > > Yes, I think it is fast enough in the normal case. Even feeding the > entire git.git's master, ~25000 revisions, it takes around 2-4 seconds > only: > > $ git rev-list origin/master | wc -l > 24380 > $ time git merge-base $(git rev-list origin/master) > 9971d6d52c5afeb8ba60ae6ddcffb34af23eeadd > > real 0m4.014s > user 0m1.520s > sys 0m2.284s > > (2.5GHz CPU) I finally got around to doing some tests on this myself. I used git.git as of mid Feb, which at that time had 10010 commits in master, following only the first parent. I took the first 563 commits from the todo branch and transplanted onto master~10000 (there were some conflicts after about 563 commits and I figured that would be enough anyway). I then rebased the resulting branch (let's call it 'u') against master~9990, then against master~9980 and so on to get a reflog with 1001 entries for u. I then created another branch 'b' based on u@{10}, u@{100} and @{1000}, for different runs of the tests. I created one additional commit on b in each case. I then rebased b with master, using the following algorithms to find the base to rebase from: manual: simply calling 'git rebase --onto u b~1' linear: same algorithm as in 'git pull', which linearly walks the reflog until a commit that b contains is found merge-base: the base will be calculated as 'git merge-base b $(git ref-list -g u)' exponential: like merge-base, but start with only u@{0}, then {u@{1},u@{2}} and so on until a commit that b contains is found These are the results: u@{10} u@{100} u@{1000} manual 0m0.535s 0m1.164s 0m1.415s linear 0m1.245s 0m37.367s 5m10.068s merge-base 0m14.490s 0m15.409s 0m15.508s exponential 0m1.056s 0m6.175s 0m27.221s (1.8 GHz Athlon 64). This clearly shows that the linear algorithm from git pull is not good enough when rebasing older branches (i.e. branches whose upstream has many reflog entries created after the branch itself was created). The time it takes the "merge-base" algorithm is quite independent on how old the branch is, but with this quite long and branchy reflog (but not too dissimilar from git.git's pu?), it takes quite a while to calculate it. I think this is also too slow to be acceptable as a default. I would personnally be happy if the "exponential" algorithm was used by git rebase default. I suppose not everyone would agree that the convenience outweighs the performance cost, though. OTOH, a slower algorithm has been used in git pull for a long time and it seems like not many people have really been bothered by that. Also see the following paragraphs. I also ran the same tests with an upstream branch that was never force-updated. For these test cases, I created a reflog such that u@{$i} = master~$((10 * $i)). Since the upstream branch was know never to have been force-updated in this case, the "manual" test case was simply 'git rebase u'. These are the results: u@{10} u@{100} u@{1000} manual 0m0.885s 0m6.126s 0m52.248s linear 0m1.349s 0m39.688s 5m28.753s merge-base 0m1.160s 0m1.699s 0m1.901s exponential 0m0.769s 0m4.342s 0m7.360s Not surprisingly, the linear algorithm is slow in these cases as well. What's more interesting here is that the last two algorithms are actually faster than the plain 'git rebase u'. This is caused by --ignore-if-in-upstream flag to format-patch. Since the other three algorithms try to figure out what the base was and pass the range from the guessed base to the branch (e.g. u@{100}..b) to format-patch, the --ignore-if-in-upstream to that command effectively becomes a no-op. Although this makes rebase faster in the case of a non-force-updated upstream, it may also be a problem in some cases. This was something that I had not thought about until I started timing the calls. One reason I can think of when the --ignore-if-in-upstream is useful is when the upstream branch has been rebased, but this is exactly the case when guessing the old base is useful and solves the problem in a better way anyway. However, if a commit on the upstream branch was cherry-picked from some commit on the current branch above its base (i.e. in u@{x}..b), then that would not be detected by --ignore-if-in-upstream and could result in unnecessary merge conflicts. I don't know how common this case is. The above also applies to 'git pull', of course, but the ways of getting identical patches in upstream are probably different (more likely by 'git am' than 'git cherry-pick' perhaps). I think this is a useful feature. I'm just not sure how to balance the performance vs convenience. Worst case, this could probably become a command line option and configuration. I guess 'git pull' should use the same algorithm. If we decide to use configation, maybe git-pull's default would need to be different to be backward compatible. Any thoughts? > But, as Junio showed, it has problems when the reflog lenght is too > large. Maybe git-merge-base can learn the --stdin flag, or we could > process the reflog in batches of 1000 (?) entries, ... but the nice > property of using the entire reflog is that the output is what you are > looking for, if you take the first 1000 entries you have to check if > the output is one of these entries. Since I think the exponential algorithm seems the best choice, we could probably just limit it to a certain number of entries, but maybe it's better to implement a --stdin flag to merge-base. It could be useful for others too. /Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-03-12 21:15 ` Martin von Zweigbergk @ 2011-03-12 23:51 ` Santi Béjar 2011-03-13 1:32 ` Martin von Zweigbergk 2011-03-13 23:09 ` Santi Béjar 1 sibling, 1 reply; 19+ messages in thread From: Santi Béjar @ 2011-03-12 23:51 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Junio C Hamano, git, Thomas Rast Thanks for pushing this further. I'll read it all carefully later, but let me just comment one thing. On Sat, Mar 12, 2011 at 10:15 PM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> wrote: > On Thu, 17 Feb 2011, Santi B?jar wrote: > >> On Wed, Feb 16, 2011 at 5:45 PM, Martin von Zweigbergk >> <martin.von.zweigbergk@gmail.com> wrote: >> > On Wed, 16 Feb 2011, Santi B?jar wrote: >> > >> >> On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk >> >> <martin.von.zweigbergk@gmail.com> wrote: >> >> > >> >> > .-u@{0} >> >> > / >> >> > .---u@{1} >> >> > / >> >> > x---y-----u@{2} >> >> > \ >> >> > .---u@{3}---b >> >> > \ >> >> > .-u@{4} >> >> > >> >> > >> >> > I have an idea inspired by bisection, Thomas's exponential stride, and >> >> > what someone (you?) mentioned the other day about virtual merge >> >> > commits. I haven't tried it out, but let me know what you think. I'll >> >> > try to explain it using an example only: >> >> > >> >> > Exponential stride phase: >> >> > 1. candidates={ u@{0} } >> >> > merge-base b $candidates -> y, _not_ in $candidates >> >> > 2. candidates={ u@{1} u@{2} } >> >> > merge-base b $candidates -> y, _not_ in $candidates >> >> > 3. candidates={ u@{3} u@{4} u@{5} u@{6} } >> >> > merge-base b $candidates -> u@{3}, in $candidates >> >> >> >> Doesn't it indicate that u@{3} is the commit we are looking for? I >> >> haven't found a counterexample... >> > >> > Yes, of course. Stupid me ;-). Forget about the other half. (I think >> > that's what I did manually to match the sha1 back to the ref name, but >> > that is of course complete non-sense to do in the script.) >> > >> >> If this is true the following patch can implement it for git-pull.sh and >> >> git-rebase.sh (sorry if it is space damaged): >> > >> > Thanks! Will have a closer look at it later today. If I understand >> > correctly, you simply call merge-base with the _entire_ reflog. I >> >> Yes, that is the idea (plus the old remote hash in case of git-pull) >> >> > would have thought that would be slow, but it's great if that is fast >> > enough. >> >> Yes, I think it is fast enough in the normal case. Even feeding the >> entire git.git's master, ~25000 revisions, it takes around 2-4 seconds >> only: >> >> $ git rev-list origin/master | wc -l >> 24380 >> $ time git merge-base $(git rev-list origin/master) >> 9971d6d52c5afeb8ba60ae6ddcffb34af23eeadd >> >> real 0m4.014s >> user 0m1.520s >> sys 0m2.284s >> >> (2.5GHz CPU) > > > I finally got around to doing some tests on this myself. I used > git.git as of mid Feb, which at that time had 10010 commits in master, > following only the first parent. I took the first 563 commits from the > todo branch and transplanted onto master~10000 (there were some > conflicts after about 563 commits and I figured that would be enough > anyway). I then rebased the resulting branch (let's call it 'u') > against master~9990, then against master~9980 and so on to get a > reflog with 1001 entries for u. I then created another branch 'b' > based on u@{10}, u@{100} and @{1000}, for different runs of the > tests. I created one additional commit on b in each case. I then > rebased b with master, using the following algorithms to find the base > to rebase from: > > manual: simply calling 'git rebase --onto u b~1' > > linear: same algorithm as in 'git pull', which linearly walks the > reflog until a commit that b contains is found > > merge-base: the base will be calculated as 'git merge-base b $(git > ref-list -g u)' > > exponential: like merge-base, but start with only u@{0}, then > {u@{1},u@{2}} and so on until a commit that b contains is found > First, care to share the scripts/patches for the timings? Thanks. Could you test also variants of the exponential strategy? exponential(n,m): like merge-base, but start with n candidates {u@{0}, ..., u@{n-1}}, then n*m candidates and so on until a commit that b contains is found. Your exponential would be exponential(1,2). Timings for something like exponential(10,2) or exponential(10,10), maybe others. Thanks, Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-03-12 23:51 ` Santi Béjar @ 2011-03-13 1:32 ` Martin von Zweigbergk 2011-03-13 3:14 ` Martin von Zweigbergk 0 siblings, 1 reply; 19+ messages in thread From: Martin von Zweigbergk @ 2011-03-13 1:32 UTC (permalink / raw) To: Santi Béjar; +Cc: Martin von Zweigbergk, Junio C Hamano, git, Thomas Rast On Sun, 13 Mar 2011, Santi B?jar wrote: > First, care to share the scripts/patches for the timings? Thanks. Sure, see end of mail for the changes to git-rebase.sh. It applies on top of the patch that started this thread. There are some minor differences from what I used when I ran the tests, but nothing that should impact the timings. To run the tests, I just did modified versions of git reset --hard u@{100} touch foo && git add foo && git ci -m foo time git rebase -n --guess-base=merge u I don't have a script that creates the initial setup, but that should be easy enough to do. I should warn you that it took 16 hours to run it on my 6 year old computer :-). > Could you test also variants of the exponential strategy? I guess I could :-). Will see if I get time for that later today. > exponential(n,m): like merge-base, but start with n candidates {u@{0}, > ..., u@{n-1}}, then n*m candidates and so on until a commit that b > contains is found. > > Your exponential would be exponential(1,2). > > Timings for something like exponential(10,2) or exponential(10,10), > maybe others. > > Thanks, > Santi > -- 8< -- diff --git a/git-rebase.sh b/git-rebase.sh index b50c91e..8a4efab 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -56,6 +56,7 @@ ignore-date! passed to 'git am' whitespace=! passed to 'git apply' ignore-whitespace! passed to 'git apply' C=! passed to 'git apply' +guess-base=! to guess the base Actions: continue! continue rebasing process abort! abort rebasing process and restore original branch @@ -87,6 +88,7 @@ git_am_opt= rebase_root= force_rebase= allow_rerere_autoupdate= +guess_base= # Non-empty if a rebase was in progress when 'git rebase' was invoked in_progress= # One of {am, merge, interactive} @@ -228,6 +230,10 @@ do --no-autosquash) autosquash= ;; + --guess-base) + shift + guess_base=$1 + ;; -M|-m) do_merge=t ;; @@ -459,19 +465,51 @@ esac require_clean_work_tree "rebase" "Please commit or stash them." -upstream_ref=$(git rev-parse -q --verify --symbolic-full-name \ - "$upstream_name") && test -n "$upstream_ref" && -for reflog in $(git rev-list -g $upstream_name 2>/dev/null) -do - if test $reflog = $(git merge-base $reflog $orig_head) - then - if test $reflog != $(git merge-base $onto $reflog) - then - upstream=$reflog - fi - break - fi -done +if test -n "$guess_base" && test -n "$(git rev-parse -q --verify \ + --symbolic-full-name "$upstream_name")" +then + case $guess_base in + linear) + for reflog in $(git rev-list -g $upstream_name 2>/dev/null) + do + if test $reflog = $(git merge-base $reflog $orig_head) + then + if test $reflog != $(git merge-base $onto $reflog) + then + upstream=$reflog + fi + break + fi + done + ;; + merge) + upstream=$(git merge-base $orig_head $(git rev-list -g $upstream_name 2>/dev/null)) + ;; + exponential) + reflogs=$(git rev-list -g $upstream_name 2>/dev/null) + limit=$(echo "$reflogs" | wc -l) + lo=0 + hi=1 + while true + do + echo $lo - $hi + candidates=$(echo "$reflogs" | head -$hi | tail -$(($hi - $lo))) + reflog=$(git merge-base $orig_head $candidates) + if test -n "$(echo $candidates | grep $reflog)" + then + upstream=$reflog + break + fi + if test $hi -ge $limit + then + break + fi + lo=$hi + hi=$((2 * $lo)) + done + ;; + esac +fi # Now we are rebasing commits $upstream..$orig_head (or with --root, # everything leading up to $orig_head) on top of $onto ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-03-13 1:32 ` Martin von Zweigbergk @ 2011-03-13 3:14 ` Martin von Zweigbergk 2011-03-13 22:57 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Martin von Zweigbergk @ 2011-03-13 3:14 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Santi Béjar, Junio C Hamano, git, Thomas Rast On Sat, 12 Mar 2011, Martin von Zweigbergk wrote: > On Sun, 13 Mar 2011, Santi B?jar wrote: > > > Could you test also variants of the exponential strategy? > > I guess I could :-). Will see if I get time for that later today. So here are the updated figures for the force-updated history (pu-like): u@{10} u@{100} u@{1000} manual 0m0.535s 0m1.164s 0m1.415s linear 0m1.245s 0m37.367s 5m10.068s merge-base 0m14.490s 0m15.409s 0m15.508s exp(1,2) 0m1.056s 0m6.175s 0m27.221s exp(10,10) 0m1.950s 0m20.031s 0m18.215s exp(7,7) 0m1.310s 0m6.851s 0m16.757s and for the non-force-updated history (master-like): u@{10} u@{100} u@{1000} manual 0m0.885s 0m6.126s 0m52.248s linear 0m1.349s 0m39.688s 5m28.753s merge-base 0m1.160s 0m1.699s 0m1.901s exp(1,2) 0m0.769s 0m4.342s 0m7.360s exp(10,10) 0m0.700s 0m2.535s 0m3.110s exp(7,7) 0m0.653s 0m2.332s 0m3.506s exp(10,10) is worst possible for the test cases I picked, since the wanted reflog entry is always the first one in an interval, so almost 10 times as many entries as necessary are considered. I therefore also tried with exp(7,7) to get more fair figures. /Martin ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-03-13 3:14 ` Martin von Zweigbergk @ 2011-03-13 22:57 ` Junio C Hamano 2011-03-13 23:42 ` Santi Béjar 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2011-03-13 22:57 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Santi Béjar, git, Thomas Rast Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: > So here are the updated figures for the force-updated history > (pu-like): > > u@{10} u@{100} u@{1000} > manual 0m0.535s 0m1.164s 0m1.415s > linear 0m1.245s 0m37.367s 5m10.068s > merge-base 0m14.490s 0m15.409s 0m15.508s > exp(1,2) 0m1.056s 0m6.175s 0m27.221s > exp(10,10) 0m1.950s 0m20.031s 0m18.215s > exp(7,7) 0m1.310s 0m6.851s 0m16.757s > > and for the non-force-updated history (master-like): > > u@{10} u@{100} u@{1000} > manual 0m0.885s 0m6.126s 0m52.248s > linear 0m1.349s 0m39.688s 5m28.753s > merge-base 0m1.160s 0m1.699s 0m1.901s > exp(1,2) 0m0.769s 0m4.342s 0m7.360s > exp(10,10) 0m0.700s 0m2.535s 0m3.110s > exp(7,7) 0m0.653s 0m2.332s 0m3.506s I have a suspicion that merge-base shouldn't be taken as "one of the candidates among others". It is merely a technique to check multiple heads simultaneously and cheaply, instead of checking things one by one, and should be applicable regardless of these "multiple heads" come from. It may come from a linear walk of reflog, or it may come from a leaky exponential or fibonacci walk. Instead of "linear" that checks "Is the tip good? Is the tip@{1} good? Is the tip@{2} good? How about tip@{3}?" repeatedly, we can check "is the tip through tip@{N} good?" with a single invocation using the merge-base technique. Similarly if your exp(i,j) checks "Is the tip good? tip@{1}? tip@{2}? tip@{4}? tip@{8}? ..." iteratively, you can grab these commit object names out of reflog and still use the merge-base optimization, effectively making "one at a time" check into "N at a time" check (perhaps checking a dozen or two at a time). If you generate list of reglog entries way beyond what matters to the end result, and feed all of them to the machinery, it is not surprising that it would spend more time than linear that can stop early upon the first success. We should optimize for common case by picking the default that performs well for history that is never rewound. If the the default algorithm on pathological histories performs badly, it is Ok to have an alternative heuristics that is only triggered on such a history, but if we are going to do so, we need to make sure that we can cheaply detect if we need to use the alternative heuristics in the first place. Is it easy to tell in an earlier phase if the history is "force-updated" or not? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-03-13 22:57 ` Junio C Hamano @ 2011-03-13 23:42 ` Santi Béjar 0 siblings, 0 replies; 19+ messages in thread From: Santi Béjar @ 2011-03-13 23:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Martin von Zweigbergk, git, Thomas Rast On Sun, Mar 13, 2011 at 11:57 PM, Junio C Hamano <gitster@pobox.com> wrote: > Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> writes: > >> So here are the updated figures for the force-updated history >> (pu-like): >> >> u@{10} u@{100} u@{1000} >> manual 0m0.535s 0m1.164s 0m1.415s >> linear 0m1.245s 0m37.367s 5m10.068s >> merge-base 0m14.490s 0m15.409s 0m15.508s >> exp(1,2) 0m1.056s 0m6.175s 0m27.221s >> exp(10,10) 0m1.950s 0m20.031s 0m18.215s >> exp(7,7) 0m1.310s 0m6.851s 0m16.757s >> >> and for the non-force-updated history (master-like): >> >> u@{10} u@{100} u@{1000} >> manual 0m0.885s 0m6.126s 0m52.248s >> linear 0m1.349s 0m39.688s 5m28.753s >> merge-base 0m1.160s 0m1.699s 0m1.901s >> exp(1,2) 0m0.769s 0m4.342s 0m7.360s >> exp(10,10) 0m0.700s 0m2.535s 0m3.110s >> exp(7,7) 0m0.653s 0m2.332s 0m3.506s > > I have a suspicion that merge-base shouldn't be taken as "one of the > candidates among others". > > It is merely a technique to check multiple heads simultaneously and > cheaply, instead of checking things one by one, and should be applicable > regardless of these "multiple heads" come from. It may come from a linear > walk of reflog, or it may come from a leaky exponential or fibonacci walk. Here the merge-base strategy mean take all the reflog at once. And in this case the reflog is 1000 entries long, it is only relevant to the u@{1000} case. For me it is more like: take the solution and check that indeed it is the solution. In this sense it is the "minimum" time required to find the solution. > > Instead of "linear" that checks "Is the tip good? Is the tip@{1} good? > Is the tip@{2} good? How about tip@{3}?" repeatedly, we can check "is the > tip through tip@{N} good?" with a single invocation using the merge-base > technique. exp(n,m) is similar to this. But, yes, we could have a merge-base(N) strategy which checks using the merge-base technique the first N reflog entries, then the next N entries, and so on. But I think it would scale worst then the exponential strategy. > > Similarly if your exp(i,j) checks "Is the tip good? tip@{1}? tip@{2}? > tip@{4}? tip@{8}? ..." iteratively, you can grab these commit object > names out of reflog and still use the merge-base optimization, effectively > making "one at a time" check into "N at a time" check (perhaps checking a > dozen or two at a time). Indeed it is "Is the tip good? tip@{1},tip@{2}? tip@{4},tip{5},...,tip@{8}? ... and the merge-base techinque is used to answer each question. > > If you generate list of reglog entries way beyond what matters to the end > result, and feed all of them to the machinery, it is not surprising that > it would spend more time than linear that can stop early upon the first > success. But if the answer is the last entry in the reflog (as in the u@{1000} case) it is a lower limit, see above. > > We should optimize for common case by picking the default that performs > well for history that is never rewound. If the the default algorithm on > pathological histories performs badly, it is Ok to have an alternative > heuristics that is only triggered on such a history, but if we are going > to do so, we need to make sure that we can cheaply detect if we need to > use the alternative heuristics in the first place. Maybe I'm wrong, but from the number I see that the exp strategy this "optimize for the common case and works reasonably work for pathological histories". > > Is it easy to tell in an earlier phase if the history is "force-updated" > or not? > I'm afraid this is a similar question. HTH, Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] rebase: be cleverer with rebased upstream branches 2011-03-12 21:15 ` Martin von Zweigbergk 2011-03-12 23:51 ` Santi Béjar @ 2011-03-13 23:09 ` Santi Béjar 1 sibling, 0 replies; 19+ messages in thread From: Santi Béjar @ 2011-03-13 23:09 UTC (permalink / raw) To: Martin von Zweigbergk; +Cc: Junio C Hamano, git, Thomas Rast [Also quoted text from On Sun, Mar 13, 2011 at 4:14 AM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>] On Sat, Mar 12, 2011 at 10:15 PM, Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> wrote: > On Thu, 17 Feb 2011, Santi B?jar wrote: > >> On Wed, Feb 16, 2011 at 5:45 PM, Martin von Zweigbergk >> <martin.von.zweigbergk@gmail.com> wrote: >> > On Wed, 16 Feb 2011, Santi B?jar wrote: >> > >> >> On Wed, Feb 16, 2011 at 3:03 AM, Martin von Zweigbergk >> >> <martin.von.zweigbergk@gmail.com> wrote: >> >> > >> >> > .-u@{0} >> >> > / >> >> > .---u@{1} >> >> > / >> >> > x---y-----u@{2} >> >> > \ >> >> > .---u@{3}---b >> >> > \ >> >> > .-u@{4} >> >> > >> >> > >> >> > I have an idea inspired by bisection, Thomas's exponential stride, and >> >> > what someone (you?) mentioned the other day about virtual merge >> >> > commits. I haven't tried it out, but let me know what you think. I'll >> >> > try to explain it using an example only: >> >> > >> >> > Exponential stride phase: >> >> > 1. candidates={ u@{0} } >> >> > merge-base b $candidates -> y, _not_ in $candidates >> >> > 2. candidates={ u@{1} u@{2} } >> >> > merge-base b $candidates -> y, _not_ in $candidates >> >> > 3. candidates={ u@{3} u@{4} u@{5} u@{6} } >> >> > merge-base b $candidates -> u@{3}, in $candidates >> >> >> >> Doesn't it indicate that u@{3} is the commit we are looking for? I >> >> haven't found a counterexample... >> > >> > Yes, of course. Stupid me ;-). Forget about the other half. (I think >> > that's what I did manually to match the sha1 back to the ref name, but >> > that is of course complete non-sense to do in the script.) >> > >> >> If this is true the following patch can implement it for git-pull.sh and >> >> git-rebase.sh (sorry if it is space damaged): >> > >> > Thanks! Will have a closer look at it later today. If I understand >> > correctly, you simply call merge-base with the _entire_ reflog. I >> >> Yes, that is the idea (plus the old remote hash in case of git-pull) >> >> > would have thought that would be slow, but it's great if that is fast >> > enough. >> >> Yes, I think it is fast enough in the normal case. Even feeding the >> entire git.git's master, ~25000 revisions, it takes around 2-4 seconds >> only: >> >> $ git rev-list origin/master | wc -l >> 24380 >> $ time git merge-base $(git rev-list origin/master) >> 9971d6d52c5afeb8ba60ae6ddcffb34af23eeadd >> >> real 0m4.014s >> user 0m1.520s >> sys 0m2.284s >> >> (2.5GHz CPU) > > > I finally got around to doing some tests on this myself. I used > git.git as of mid Feb, which at that time had 10010 commits in master, > following only the first parent. I took the first 563 commits from the > todo branch and transplanted onto master~10000 (there were some > conflicts after about 563 commits and I figured that would be enough > anyway). I then rebased the resulting branch (let's call it 'u') > against master~9990, then against master~9980 and so on to get a > reflog with 1001 entries for u. I then created another branch 'b' > based on u@{10}, u@{100} and @{1000}, for different runs of the > tests. I created one additional commit on b in each case. I then > rebased b with master, using the following algorithms to find the base > to rebase from: > > manual: simply calling 'git rebase --onto u b~1' > > linear: same algorithm as in 'git pull', which linearly walks the > reflog until a commit that b contains is found > > merge-base: the base will be calculated as 'git merge-base b $(git > ref-list -g u)' > > exponential: like merge-base, but start with only u@{0}, then > {u@{1},u@{2}} and so on until a commit that b contains is found exp(n,m): like merge-base, but start with n candidates {u@{0}, ..., u@{n-1}}, then n*m candidates and so on until a commit that b contains is found. > > These are the results: These are best timing out of three runs, mean, only the first one? Hot-cache for all tests? > > u@{10} u@{100} u@{1000} > manual 0m0.535s 0m1.164s 0m1.415s > linear 0m1.245s 0m37.367s 5m10.068s > merge-base 0m14.490s 0m15.409s 0m15.508s > exp(1,2) 0m1.056s 0m6.175s 0m27.221s > exp(10,10) 0m1.950s 0m20.031s 0m18.215s > exp(7,7) 0m1.310s 0m6.851s 0m16.757s > > (1.8 GHz Athlon 64). > > This clearly shows that the linear algorithm from git pull is not good > enough when rebasing older branches (i.e. branches whose upstream has > many reflog entries created after the branch itself was created). > > The time it takes the "merge-base" algorithm is quite independent on > how old the branch is, but with this quite long and branchy reflog > (but not too dissimilar from git.git's pu?), it takes quite a while to > calculate it. I think this is also too slow to be acceptable as a > default. > > I would personnally be happy if the "exponential" algorithm was used > by git rebase default. I suppose not everyone would agree that the > convenience outweighs the performance cost, though. I don't agree with this. For the normal case there is no performance cost (manual 0.5s, exp(7,7) 1.3s). There is performance cost (manual 1.4s, exp(7,7) 16.7s) when you need it, when your upstream has been rebased a long ago (in reflog entries). > OTOH, a slower > algorithm has been used in git pull for a long time and it seems like > not many people have really been bothered by that. Also see the > following paragraphs. > > I also ran the same tests with an upstream branch that was never > force-updated. For these test cases, I created a reflog such that > u@{$i} = master~$((10 * $i)). Since the upstream branch was know never > to have been force-updated in this case, the "manual" test case was > simply 'git rebase u'. These are the results: > > u@{10} u@{100} u@{1000} > manual 0m0.885s 0m6.126s 0m52.248s > linear 0m1.349s 0m39.688s 5m28.753s > merge-base 0m1.160s 0m1.699s 0m1.901s > exp(1,2) 0m0.769s 0m4.342s 0m7.360s > exp(10,10) 0m0.700s 0m2.535s 0m3.110s > exp(7,7) 0m0.653s 0m2.332s 0m3.506s > > Not surprisingly, the linear algorithm is slow in these cases as well. > > What's more interesting here is that the last two algorithms are > actually faster than the plain 'git rebase u'. This is caused by > --ignore-if-in-upstream flag to format-patch. Since the other three > algorithms try to figure out what the base was and pass the range from > the guessed base to the branch (e.g. u@{100}..b) to format-patch, the > --ignore-if-in-upstream to that command effectively becomes a no-op. > > Although this makes rebase faster in the case of a non-force-updated > upstream, it may also be a problem in some cases. This was something > that I had not thought about until I started timing the calls. One > reason I can think of when the --ignore-if-in-upstream is useful is > when the upstream branch has been rebased, but this is exactly the > case when guessing the old base is useful and solves the problem in a > better way anyway. However, if a commit on the upstream branch was > cherry-picked from some commit on the current branch above its base > (i.e. in u@{x}..b), then that would not be detected by > --ignore-if-in-upstream and could result in unnecessary merge > conflicts. I don't know how common this case is. > > The above also applies to 'git pull', of course, but the ways of > getting identical patches in upstream are probably different (more > likely by 'git am' than 'git cherry-pick' perhaps). > > I think this is a useful feature. I'm just not sure how to balance the > performance vs convenience. Worst case, this could probably become a > command line option and configuration. I guess 'git pull' should use > the same algorithm. If we decide to use configation, maybe git-pull's > default would need to be different to be backward compatible. > > Any thoughts? I think it is worth, as it looks like it only affects those who need the feature. The exp(7,7) or similar seems a good candidate. HTH, Santi ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2011-03-13 23:42 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-14 13:51 [PATCH] rebase: be cleverer with rebased upstream branches Martin von Zweigbergk 2011-02-15 11:28 ` Santi Béjar 2011-02-16 1:37 ` Martin von Zweigbergk 2011-02-16 9:26 ` Santi Béjar 2011-02-15 20:30 ` Junio C Hamano 2011-02-16 3:03 ` Martin von Zweigbergk 2011-02-16 12:10 ` Santi Béjar 2011-02-16 13:22 ` Santi Béjar 2011-02-16 19:07 ` Junio C Hamano 2011-02-16 21:16 ` Santi Béjar 2011-02-16 16:45 ` Martin von Zweigbergk 2011-02-17 10:24 ` Santi Béjar 2011-03-12 21:15 ` Martin von Zweigbergk 2011-03-12 23:51 ` Santi Béjar 2011-03-13 1:32 ` Martin von Zweigbergk 2011-03-13 3:14 ` Martin von Zweigbergk 2011-03-13 22:57 ` Junio C Hamano 2011-03-13 23:42 ` Santi Béjar 2011-03-13 23:09 ` Santi Béjar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).