* [PATCH] user-manual: add advanced topic "bisecting merges" @ 2007-11-04 9:16 Steffen Prohaska 2007-11-04 11:23 ` Ralf Wildenhues 2007-11-04 13:50 ` [PATCH] " Benoit SIGOURE 0 siblings, 2 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-04 9:16 UTC (permalink / raw) To: git; +Cc: Steffen Prohaska This commits adds a discussion of the challenge of bisecting merge commits to the user manual. The text is slightly adapted from a mail by Junio C Hamano <gitster@pobox.com> to the mailing list <http://marc.info/?l=git&m=119403257315527&w=2>. The discussion is added to "Exploring git history" in a sub-section titled "Advanced topics". The discussion requires detailed knowledge about git. It is assumed that the reader will skip advanced topics on first reading. At least the text suggest to do so. Signed-off-by: Steffen Prohaska <prohaska@zib.de> --- Documentation/user-manual.txt | 89 +++++++++++++++++++++++++++++++++++++++++ 1 files changed, 89 insertions(+), 0 deletions(-) Junio's discussion was enlightening for me. I think it's a good idea to add such discussions to the user manual. It was not obvious to me where to place the discussion in the current structure of the manual. So I added "Advanced topics". Is a sub-section "Advanced topics" a good idea? Any better suggestions? Steffen diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt index d99adc6..480e7c1 100644 --- a/Documentation/user-manual.txt +++ b/Documentation/user-manual.txt @@ -934,6 +934,95 @@ Figuring out why this works is left as an exercise to the (advanced) student. The gitlink:git-log[1], gitlink:git-diff-tree[1], and gitlink:git-hash-object[1] man pages may prove helpful. +[[history-advanced-topics]] +Advanced topics +--------------- +This section covers advanced topics that typically require more +knowledge about git than the manual presented to this point. + +You may want to skip the section at first reading, and come back +later when you have a better understanding of git. + +[[bisect-merges]] +Why bisecting merge commits can be harder than bisecting linear history +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The following text is based upon an email by Junio C. Hamano to +the git mailing list +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http://marc.info/?l=git&m=119403257315527&w=2]). +It was slightly adapted for this manual. + +Bisecting merges can be challenging due to the complexity of +changes introduced at a merge. Bisecting through merges is not a +technical problem. The real problem is what to do when the +culprit turns out to be a merge commit. How to spot what really +is wrong, and figure out how to fix. The problem is not for the +tool but for the human, and it is real. + +Imagine this history. + +................................................ + ---Z---o---X---...---o---A---C---D + \ / + o---o---Y---...---o---B +................................................ + +Suppose that on the upper development line, the meaning of one +of the functions existed at Z was changed at commit X. The +commits from Z leading to A change both the function's +implementation and all calling sites that existed at Z, as well +as new calling sites they add, to be consistent. There is no +bug at A. + +Suppose in the meantime the lower development line somebody +added a new calling site for that function at commit Y. The +commits from Z leading to B all assume the old semantics of that +function and the callers and the callee are consistent with each +other. There is no bug at B, either. + +You merge to create C. There is no textual conflict with this +three way merge, and the result merges cleanly. You bisect +this, because you found D is bad and you know Z was good. Your +bisect will find that C (merge) is broken. Understandably so, +as at C, the new calling site of the function added by the lower +branch is not converted to the new semantics, while all the +other calling sites that already existed at Z would have been +converted by the merge. The new calling site has semantic +adjustment needed, but you do not know that yet. You need to +find out that is the cause of the breakage by looking at the +merge commit C and the history leading to it. + +How would you do that? + +Both "git diff A C" and "git diff B C" would be an enormous patch. +Each of them essentially shows the whole change on each branch +since they diverged. The developers may have well behaved to +create good commits that follow the "commit small, commit often, +commit well contained units" mantra, and each individual commit +leading from Z to A and from Z to B may be easy to review and +understand, but looking at these small and easily reviewable +steps alone would not let you spot the breakage. You need to +have a global picture of what the upper branch did (and +among many, one of them is to change the semantics of that +particular function) and look first at the huge "diff A C" +(which shows the change the lower branch introduces), and see if +that huge change is consistent with what have been done between +Z and A. + +If you linearlize the history by rebasing the lower branch on +top of upper, instead of merging, the bug becomes much easier to +find and understand. Your history would instead be: + +................................................................ + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* +................................................................ + +and there is a single commit Y* between A and B* that introduced +the new calling site that still uses the old semantics of the +function, even though that was already modified at X. "git show +Y*" will be a much smaller patch than "git diff A C" and it is +much easier to deal with. + + [[Developing-with-git]] Developing with git =================== -- 1.5.3.4.464.ge1bc2 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH] user-manual: add advanced topic "bisecting merges" 2007-11-04 9:16 [PATCH] user-manual: add advanced topic "bisecting merges" Steffen Prohaska @ 2007-11-04 11:23 ` Ralf Wildenhues 2007-11-07 21:50 ` [PATCH v2] " Steffen Prohaska 2007-11-04 13:50 ` [PATCH] " Benoit SIGOURE 1 sibling, 1 reply; 30+ messages in thread From: Ralf Wildenhues @ 2007-11-04 11:23 UTC (permalink / raw) To: Steffen Prohaska; +Cc: git Hello Steffen, A couple of language nits: * Steffen Prohaska wrote on Sun, Nov 04, 2007 at 10:16:13AM CET: > +Suppose that on the upper development line, the meaning of one > +of the functions existed at Z was changed at commit X. The s/functions/& that/ > +commits from Z leading to A change both the function's > +implementation and all calling sites that existed at Z, as well > +as new calling sites they add, to be consistent. There is no > +bug at A. [...] > +You merge to create C. There is no textual conflict with this > +three way merge, and the result merges cleanly. You bisect > +this, because you found D is bad and you know Z was good. Your > +bisect will find that C (merge) is broken. Understandably so, > +as at C, the new calling site of the function added by the lower > +branch is not converted to the new semantics, while all the > +other calling sites that already existed at Z would have been > +converted by the merge. The new calling site has semantic > +adjustment needed, but you do not know that yet. You need to > +find out that is the cause of the breakage by looking at the s/that/that that/ > +merge commit C and the history leading to it. [...] > +If you linearlize the history by rebasing the lower branch on > +top of upper, instead of merging, the bug becomes much easier to s/upper/the &/ > +find and understand. Your history would instead be: [...] Cheers, Ralf ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-04 11:23 ` Ralf Wildenhues @ 2007-11-07 21:50 ` Steffen Prohaska 2007-11-07 22:16 ` Benoit Sigoure 2007-11-08 7:19 ` Johannes Sixt 0 siblings, 2 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-07 21:50 UTC (permalink / raw) To: gitster; +Cc: Ralf.Wildenhues, tsuna, git, Steffen Prohaska This commits adds a discussion of the challenge of bisecting merge commits to the user manual. The text is slightly adapted from a mail by Junio C Hamano <gitster@pobox.com> to the mailing list <http://marc.info/?l=git&m=119403257315527&w=2>. The discussion is added to "Exploring git history" in a sub-section titled "Advanced topics". The discussion requires detailed knowledge about git. It is assumed that the reader will skip advanced topics on first reading. At least the text suggest to do so. Signed-off-by: Steffen Prohaska <prohaska@zib.de> --- Documentation/user-manual.txt | 89 +++++++++++++++++++++++++++++++++++++++++ 1 files changed, 89 insertions(+), 0 deletions(-) Some minor errors were fixed. Thanks to <Ralf.Wildenhues@gmx.de>. I kept the naming of the commits. Benoit Sigoure suggested to choose a different naming which he claims would be easier to remember. I'm not convinced. The current naming starts with X, Y, Z on the left and names the remaining commits on the right with A, B, C, D. This is simple and give sufficient orientation. Steffen diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt index d99adc6..d0e738e 100644 --- a/Documentation/user-manual.txt +++ b/Documentation/user-manual.txt @@ -934,6 +934,95 @@ Figuring out why this works is left as an exercise to the (advanced) student. The gitlink:git-log[1], gitlink:git-diff-tree[1], and gitlink:git-hash-object[1] man pages may prove helpful. +[[history-advanced-topics]] +Advanced topics +--------------- +This section covers advanced topics that typically require more +knowledge about git than the manual presented to this point. + +You may want to skip the section at first reading, and come back +later when you have a better understanding of git. + +[[bisect-merges]] +Why bisecting merge commits can be harder than bisecting linear history +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The following text is based upon an email by Junio C. Hamano to +the git mailing list +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http://marc.info/?l=git&m=119403257315527&w=2]). +It was slightly adapted for this manual. + +Bisecting merges can be challenging due to the complexity of +changes introduced at a merge. Bisecting through merges is not a +technical problem. The real problem is what to do when the +culprit turns out to be a merge commit. How to spot what really +is wrong, and figure out how to fix. The problem is not for the +tool but for the human, and it is real. + +Imagine this history. + +................................................ + ---Z---o---X---...---o---A---C---D + \ / + o---o---Y---...---o---B +................................................ + +Suppose that on the upper development line, the meaning of one +of the functions that existed at Z was changed at commit X. The +commits from Z leading to A change both the function's +implementation and all calling sites that existed at Z, as well +as new calling sites they add, to be consistent. There is no +bug at A. + +Suppose in the meantime the lower development line somebody +added a new calling site for that function at commit Y. The +commits from Z leading to B all assume the old semantics of that +function and the callers and the callee are consistent with each +other. There is no bug at B, either. + +You merge to create C. There is no textual conflict with this +three way merge, and the result merges cleanly. You bisect +this, because you found D is bad and you know Z was good. Your +bisect will find that C (merge) is broken. Understandably so, +as at C, the new calling site of the function added by the lower +branch is not converted to the new semantics, while all the +other calling sites that already existed at Z would have been +converted by the merge. The new calling site has semantic +adjustment needed, but you do not know that yet. You need to +find out that that is the cause of the breakage by looking at the +merge commit C and the history leading to it. + +How would you do that? + +Both "git diff A C" and "git diff B C" would be an enormous patch. +Each of them essentially shows the whole change on each branch +since they diverged. The developers may have well behaved to +create good commits that follow the "commit small, commit often, +commit well contained units" mantra, and each individual commit +leading from Z to A and from Z to B may be easy to review and +understand, but looking at these small and easily reviewable +steps alone would not let you spot the breakage. You need to +have a global picture of what the upper branch did (and +among many, one of them is to change the semantics of that +particular function) and look first at the huge "diff A C" +(which shows the change the lower branch introduces), and see if +that huge change is consistent with what have been done between +Z and A. + +If you linearize the history by rebasing the lower branch on +top of the upper, instead of merging, the bug becomes much easier to +find and understand. Your history would instead be: + +................................................................ + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* +................................................................ + +and there is a single commit Y* between A and B* that introduced +the new calling site that still uses the old semantics of the +function, even though that was already modified at X. "git show +Y*" will be a much smaller patch than "git diff A C" and it is +much easier to deal with. + + [[Developing-with-git]] Developing with git =================== -- 1.5.3.5.578.g886d ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-07 21:50 ` [PATCH v2] " Steffen Prohaska @ 2007-11-07 22:16 ` Benoit Sigoure 2007-11-07 22:26 ` J. Bruce Fields 2007-11-08 6:40 ` Steffen Prohaska 2007-11-08 7:19 ` Johannes Sixt 1 sibling, 2 replies; 30+ messages in thread From: Benoit Sigoure @ 2007-11-07 22:16 UTC (permalink / raw) To: Steffen Prohaska; +Cc: gitster, Ralf.Wildenhues, git [-- Attachment #1: Type: text/plain, Size: 6860 bytes --] Hi Steffen, On Nov 7, 2007, at 10:50 PM, Steffen Prohaska wrote: > This commits adds a discussion of the challenge of bisecting > merge commits to the user manual. The text is slightly > adapted from a mail by Junio C Hamano <gitster@pobox.com> > to the mailing list > <http://marc.info/?l=git&m=119403257315527&w=2>. > > The discussion is added to "Exploring git history" in a > sub-section titled "Advanced topics". The discussion requires > detailed knowledge about git. It is assumed that the reader will > skip advanced topics on first reading. At least the text suggest > to do so. > > Signed-off-by: Steffen Prohaska <prohaska@zib.de> > --- > Documentation/user-manual.txt | 89 ++++++++++++++++++++++++++++++ > +++++++++++ > 1 files changed, 89 insertions(+), 0 deletions(-) > > Some minor errors were fixed. Thanks to <Ralf.Wildenhues@gmx.de>. > > I kept the naming of the commits. Benoit Sigoure suggested to choose > a different naming which he claims would be easier to remember. > I'm not convinced. The current naming starts with X, Y, Z on the left > and names the remaining commits on the right with A, B, C, D. This > is simple and give sufficient orientation. I still disagree but... fair enough ;) You end up comparing [ABCD] with [XYZ] which (to me) is hard to follow because it's like you were comparing two different kind of entities. I tend to think more in term of branch (e.g. what's happened to the upper branch and what's happened to the lower branch, rather than think in terms of "before a point in time" and "after that point in time"). Because of that, I constantly need to look back at the scheme to find out what is `A', what is `Z' etc. Some more comments below. Sorry for not spotting these earlier. > > diff --git a/Documentation/user-manual.txt b/Documentation/user- > manual.txt > index d99adc6..d0e738e 100644 > --- a/Documentation/user-manual.txt > +++ b/Documentation/user-manual.txt > @@ -934,6 +934,95 @@ Figuring out why this works is left as an > exercise to the (advanced) > student. The gitlink:git-log[1], gitlink:git-diff-tree[1], and > gitlink:git-hash-object[1] man pages may prove helpful. > > +[[history-advanced-topics]] > +Advanced topics > +--------------- > +This section covers advanced topics that typically require more > +knowledge about git than the manual presented to this point. > + > +You may want to skip the section at first reading, and come back I think the correct wording here is "on first reading". If a native English speaker could confirm this... > +later when you have a better understanding of git. > + > +[[bisect-merges]] > +Why bisecting merge commits can be harder than bisecting linear > history > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > ~~ > +The following text is based upon an email by Junio C. Hamano to > +the git mailing list > +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http:// > marc.info/?l=git&m=119403257315527&w=2]). > +It was slightly adapted for this manual. > + > +Bisecting merges can be challenging due to the complexity of > +changes introduced at a merge. Bisecting through merges is not a s/at a merge/& point/ ? > +technical problem. The real problem is what to do when the > +culprit turns out to be a merge commit. How to spot what really > +is wrong, and figure out how to fix. The problem is not for the > +tool but for the human, and it is real. > + > +Imagine this history. > + > +................................................ > + ---Z---o---X---...---o---A---C---D > + \ / > + o---o---Y---...---o---B > +................................................ > + > +Suppose that on the upper development line, the meaning of one > +of the functions that existed at Z was changed at commit X. The > +commits from Z leading to A change both the function's > +implementation and all calling sites that existed at Z, as well > +as new calling sites they add, to be consistent. There is no > +bug at A. > + > +Suppose in the meantime the lower development line somebody s/Suppose/& that/ > +added a new calling site for that function at commit Y. The > +commits from Z leading to B all assume the old semantics of that > +function and the callers and the callee are consistent with each > +other. There is no bug at B, either. > + > +You merge to create C. There is no textual conflict with this > +three way merge, and the result merges cleanly. You bisect > +this, because you found D is bad and you know Z was good. Your > +bisect will find that C (merge) is broken. Understandably so, > +as at C, the new calling site of the function added by the lower > +branch is not converted to the new semantics, while all the > +other calling sites that already existed at Z would have been > +converted by the merge. The new calling site has semantic > +adjustment needed, but you do not know that yet. You need to s/adjustment/&s/ > +find out that that is the cause of the breakage by looking at the > +merge commit C and the history leading to it. > + > +How would you do that? > + > +Both "git diff A C" and "git diff B C" would be an enormous patch. > +Each of them essentially shows the whole change on each branch > +since they diverged. The developers may have well behaved to > +create good commits that follow the "commit small, commit often, > +commit well contained units" mantra, and each individual commit > +leading from Z to A and from Z to B may be easy to review and > +understand, but looking at these small and easily reviewable > +steps alone would not let you spot the breakage. You need to > +have a global picture of what the upper branch did (and > +among many, one of them is to change the semantics of that > +particular function) and look first at the huge "diff A C" > +(which shows the change the lower branch introduces), and see if > +that huge change is consistent with what have been done between > +Z and A. > + > +If you linearize the history by rebasing the lower branch on > +top of the upper, instead of merging, the bug becomes much easier to > +find and understand. Your history would instead be: > + > +................................................................ > + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* > +................................................................ > + > +and there is a single commit Y* between A and B* that introduced > +the new calling site that still uses the old semantics of the > +function, even though that was already modified at X. "git show > +Y*" will be a much smaller patch than "git diff A C" and it is > +much easier to deal with. > + > + > [[Developing-with-git]] > Developing with git > =================== Thank you! -- Benoit Sigoure aka Tsuna EPITA Research and Development Laboratory [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-07 22:16 ` Benoit Sigoure @ 2007-11-07 22:26 ` J. Bruce Fields 2007-11-08 6:40 ` Steffen Prohaska 1 sibling, 0 replies; 30+ messages in thread From: J. Bruce Fields @ 2007-11-07 22:26 UTC (permalink / raw) To: Benoit Sigoure; +Cc: Steffen Prohaska, gitster, Ralf.Wildenhues, git On Wed, Nov 07, 2007 at 11:16:06PM +0100, Benoit Sigoure wrote: > Hi Steffen, > > On Nov 7, 2007, at 10:50 PM, Steffen Prohaska wrote: > >> This commits adds a discussion of the challenge of bisecting >> merge commits to the user manual. The text is slightly >> adapted from a mail by Junio C Hamano <gitster@pobox.com> >> to the mailing list >> <http://marc.info/?l=git&m=119403257315527&w=2>. >> >> The discussion is added to "Exploring git history" in a >> sub-section titled "Advanced topics". The discussion requires >> detailed knowledge about git. It is assumed that the reader will >> skip advanced topics on first reading. At least the text suggest >> to do so. >> >> Signed-off-by: Steffen Prohaska <prohaska@zib.de> >> --- >> Documentation/user-manual.txt | 89 >> +++++++++++++++++++++++++++++++++++++++++ >> 1 files changed, 89 insertions(+), 0 deletions(-) >> >> Some minor errors were fixed. Thanks to <Ralf.Wildenhues@gmx.de>. >> >> I kept the naming of the commits. Benoit Sigoure suggested to choose >> a different naming which he claims would be easier to remember. >> I'm not convinced. The current naming starts with X, Y, Z on the left >> and names the remaining commits on the right with A, B, C, D. This >> is simple and give sufficient orientation. > > I still disagree but... fair enough ;) > You end up comparing [ABCD] with [XYZ] which (to me) is hard to follow > because it's like you were comparing two different kind of entities. I > tend to think more in term of branch (e.g. what's happened to the upper > branch and what's happened to the lower branch, rather than think in terms > of "before a point in time" and "after that point in time"). Because of > that, I constantly need to look back at the scheme to find out what is `A', > what is `Z' etc. > > Some more comments below. Sorry for not spotting these earlier. > >> >> diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt >> index d99adc6..d0e738e 100644 >> --- a/Documentation/user-manual.txt >> +++ b/Documentation/user-manual.txt >> @@ -934,6 +934,95 @@ Figuring out why this works is left as an exercise to >> the (advanced) >> student. The gitlink:git-log[1], gitlink:git-diff-tree[1], and >> gitlink:git-hash-object[1] man pages may prove helpful. >> >> +[[history-advanced-topics]] >> +Advanced topics >> +--------------- >> +This section covers advanced topics that typically require more >> +knowledge about git than the manual presented to this point. >> + >> +You may want to skip the section at first reading, and come back > > I think the correct wording here is "on first reading". If a native > English speaker could confirm this... Confirmed.... --b. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-07 22:16 ` Benoit Sigoure 2007-11-07 22:26 ` J. Bruce Fields @ 2007-11-08 6:40 ` Steffen Prohaska 1 sibling, 0 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-08 6:40 UTC (permalink / raw) To: Benoit Sigoure; +Cc: Junio C Hamano, Ralf.Wildenhues, Git Mailing List On Nov 7, 2007, at 11:16 PM, Benoit Sigoure wrote: > Hi Steffen, > > On Nov 7, 2007, at 10:50 PM, Steffen Prohaska wrote: > > Some more comments below. Sorry for not spotting these earlier. I'll took all your suggestions except for ... [...] >> +later when you have a better understanding of git. >> + >> +[[bisect-merges]] >> +Why bisecting merge commits can be harder than bisecting linear >> history >> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> ~~~ >> +The following text is based upon an email by Junio C. Hamano to >> +the git mailing list >> +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http:// >> marc.info/?l=git&m=119403257315527&w=2]). >> +It was slightly adapted for this manual. >> + >> +Bisecting merges can be challenging due to the complexity of >> +changes introduced at a merge. Bisecting through merges is not a > > s/at a merge/& point/ ? I'll replace the first sentence with Using gitlink:git-bisect[1] on a history with merges can be challenging. The details are explained in the remainder of the paragraph. [...] >> +added a new calling site for that function at commit Y. The >> +commits from Z leading to B all assume the old semantics of that >> +function and the callers and the callee are consistent with each >> +other. There is no bug at B, either. >> + >> +You merge to create C. There is no textual conflict with this >> +three way merge, and the result merges cleanly. You bisect >> +this, because you found D is bad and you know Z was good. Your >> +bisect will find that C (merge) is broken. Understandably so, >> +as at C, the new calling site of the function added by the lower >> +branch is not converted to the new semantics, while all the >> +other calling sites that already existed at Z would have been >> +converted by the merge. The new calling site has semantic >> +adjustment needed, but you do not know that yet. You need to > > s/adjustment/&s/ I'm not sure if plural is needed. Steffen -- Steffen Prohaska <prohaska@zib.de> <http://www.zib.de/prohaska/> Zuse Institute Berlin, Takustraße 7, D-14195 Berlin-Dahlem, Germany +49 (30) 841 85-337, fax -107 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-07 21:50 ` [PATCH v2] " Steffen Prohaska 2007-11-07 22:16 ` Benoit Sigoure @ 2007-11-08 7:19 ` Johannes Sixt 2007-11-08 8:59 ` Steffen Prohaska 1 sibling, 1 reply; 30+ messages in thread From: Johannes Sixt @ 2007-11-08 7:19 UTC (permalink / raw) To: Steffen Prohaska; +Cc: gitster, Ralf.Wildenhues, tsuna, git Steffen Prohaska schrieb: > +If you linearize the history by rebasing the lower branch on > +top of the upper, instead of merging, the bug becomes much easier to > +find and understand. Your history would instead be: At this point I'm missing the words The solution is ... I.e.: The solution is to linearize the history by rebasing the lower branch on top of the upper, instead of merging. Now the bug becomes much easier to find and understand. Your history would instead be: -- Hannes ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 7:19 ` Johannes Sixt @ 2007-11-08 8:59 ` Steffen Prohaska 2007-11-08 9:11 ` Johannes Sixt 0 siblings, 1 reply; 30+ messages in thread From: Steffen Prohaska @ 2007-11-08 8:59 UTC (permalink / raw) To: Johannes Sixt; +Cc: gitster, Ralf.Wildenhues, tsuna, git On Nov 8, 2007, at 8:19 AM, Johannes Sixt wrote: > Steffen Prohaska schrieb: >> +If you linearize the history by rebasing the lower branch on >> +top of the upper, instead of merging, the bug becomes much easier to >> +find and understand. Your history would instead be: > > At this point I'm missing the words > > The solution is ... > > I.e.: > > The solution is to linearize the history by rebasing the lower > branch on > top of the upper, instead of merging. Now the bug becomes much > easier to > find and understand. Your history would instead be: Hmm. It might be a solution if you did not publish history. How about leaving the text as is and adding an introductory paragraph at the beginning of the section? I.e: This section discusses how gitlink:git-bisect[1] plays with differently shaped histories. If you did not yet publish a branch you can use either gitlink:git-merge[1] or gitlink:git-rebase[1] to integrate changes from a second branch. The two approaches create differently shaped histories. So it might be interesting to know about the implications on gitlink:git-bisect[1]. Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 8:59 ` Steffen Prohaska @ 2007-11-08 9:11 ` Johannes Sixt 2007-11-08 9:33 ` Andreas Ericsson 0 siblings, 1 reply; 30+ messages in thread From: Johannes Sixt @ 2007-11-08 9:11 UTC (permalink / raw) To: Steffen Prohaska; +Cc: gitster, Ralf.Wildenhues, tsuna, git Steffen Prohaska schrieb: > > On Nov 8, 2007, at 8:19 AM, Johannes Sixt wrote: > >> Steffen Prohaska schrieb: >>> +If you linearize the history by rebasing the lower branch on >>> +top of the upper, instead of merging, the bug becomes much easier to >>> +find and understand. Your history would instead be: >> >> At this point I'm missing the words >> >> The solution is ... >> >> I.e.: >> >> The solution is to linearize the history by rebasing the lower branch on >> top of the upper, instead of merging. Now the bug becomes much easier to >> find and understand. Your history would instead be: > > Hmm. It might be a solution if you did not publish history. This is about finding the commit that introduced a bug. Once you found it, better: you know how to fix the bug, you are expected to throw away the rebased branch, not to publish it! Maybe a note along these lines could be appended: Now that you know what caused the error (and how to fix it), throw away the rebased branch, and commit a fix on top of D. -- Hannes ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 9:11 ` Johannes Sixt @ 2007-11-08 9:33 ` Andreas Ericsson 2007-11-08 9:53 ` Johannes Sixt 0 siblings, 1 reply; 30+ messages in thread From: Andreas Ericsson @ 2007-11-08 9:33 UTC (permalink / raw) To: Johannes Sixt; +Cc: Steffen Prohaska, gitster, Ralf.Wildenhues, tsuna, git Johannes Sixt wrote: > Steffen Prohaska schrieb: >> >> On Nov 8, 2007, at 8:19 AM, Johannes Sixt wrote: >> >>> Steffen Prohaska schrieb: >>>> +If you linearize the history by rebasing the lower branch on >>>> +top of the upper, instead of merging, the bug becomes much easier to >>>> +find and understand. Your history would instead be: >>> >>> At this point I'm missing the words >>> >>> The solution is ... >>> >>> I.e.: >>> >>> The solution is to linearize the history by rebasing the lower branch on >>> top of the upper, instead of merging. Now the bug becomes much easier to >>> find and understand. Your history would instead be: >> >> Hmm. It might be a solution if you did not publish history. > > This is about finding the commit that introduced a bug. Once you found > it, better: you know how to fix the bug, you are expected to throw away > the rebased branch, not to publish it! Maybe a note along these lines > could be appended: > > Now that you know what caused the error (and how to fix it), throw away > the rebased branch, and commit a fix on top of D. > Well, if rebasing becomes the standard for normal development, it's hardly right to throw it away, is it? I like Steffen's suggestion better. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 9:33 ` Andreas Ericsson @ 2007-11-08 9:53 ` Johannes Sixt 2007-11-08 12:54 ` Steffen Prohaska 0 siblings, 1 reply; 30+ messages in thread From: Johannes Sixt @ 2007-11-08 9:53 UTC (permalink / raw) To: Andreas Ericsson, Steffen Prohaska; +Cc: gitster, Ralf.Wildenhues, tsuna, git Andreas Ericsson schrieb: > Johannes Sixt wrote: >> Steffen Prohaska schrieb: >>> >>> On Nov 8, 2007, at 8:19 AM, Johannes Sixt wrote: >>> >>>> Steffen Prohaska schrieb: >>>>> +If you linearize the history by rebasing the lower branch on >>>>> +top of the upper, instead of merging, the bug becomes much easier to >>>>> +find and understand. Your history would instead be: >>>> >>>> At this point I'm missing the words >>>> >>>> The solution is ... >>>> >>>> I.e.: >>>> >>>> The solution is to linearize the history by rebasing the lower >>>> branch on >>>> top of the upper, instead of merging. Now the bug becomes much >>>> easier to >>>> find and understand. Your history would instead be: >>> >>> Hmm. It might be a solution if you did not publish history. >> >> This is about finding the commit that introduced a bug. Once you found >> it, better: you know how to fix the bug, you are expected to throw >> away the rebased branch, not to publish it! Maybe a note along these >> lines could be appended: >> >> Now that you know what caused the error (and how to fix it), throw >> away the rebased branch, and commit a fix on top of D. >> > > Well, if rebasing becomes the standard for normal development, it's hardly > right to throw it away, is it? I like Steffen's suggestion better. There is a big misunderstanding. The text that the patch amends is about bisecting history that reveals that a merge commit breaks, which is not helpful, and then how to find where and what and why the breakage really was introduce. And the answer to "how to find" is to rebase and bisect in the rebased history. My initial complaint was that in the flow of reading the instructions the pointer to "the solution" was missing. Rather, at the point where the reader is supposed to think "ah, yes, that's how to do it", there is the conditional statement "If you linearize history". My suggestion is to put a big emphasis on the solution by using the words "The solution is". Now, the user can *always* rebase one of the branches on top of the other, even if both histories are already published. *But* if both were indeed published, then the rebased history must be thrown away, and the only thing you learnt from it was where and what and why the breakage really was introduced. Of course we could include a few "ifs" and "unlesses" (about published histories), before suggesting to throw away rebased history. But once the task is accomplished (find the bogus commit), throwing away the rebased history (and continuing at commit D) is always correct, but keeping it (and continuing at D*) is not. -- Hannes ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 9:53 ` Johannes Sixt @ 2007-11-08 12:54 ` Steffen Prohaska 2007-11-08 13:22 ` Johannes Sixt ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-08 12:54 UTC (permalink / raw) To: Johannes Sixt; +Cc: Andreas Ericsson, gitster, Ralf.Wildenhues, tsuna, git On Nov 8, 2007, at 10:53 AM, Johannes Sixt wrote: > Andreas Ericsson schrieb: >> Johannes Sixt wrote: >>> Steffen Prohaska schrieb: >>>> >>>> On Nov 8, 2007, at 8:19 AM, Johannes Sixt wrote: >>>> >>>>> Steffen Prohaska schrieb: >>>>>> +If you linearize the history by rebasing the lower branch on >>>>>> +top of the upper, instead of merging, the bug becomes much >>>>>> easier to >>>>>> +find and understand. Your history would instead be: >>>>> >>>>> At this point I'm missing the words >>>>> >>>>> The solution is ... >>>>> >>>>> I.e.: >>>>> >>>>> The solution is to linearize the history by rebasing the lower >>>>> branch on >>>>> top of the upper, instead of merging. Now the bug becomes much >>>>> easier to >>>>> find and understand. Your history would instead be: >>>> >>>> Hmm. It might be a solution if you did not publish history. >>> >>> This is about finding the commit that introduced a bug. Once you >>> found it, better: you know how to fix the bug, you are expected >>> to throw away the rebased branch, not to publish it! Maybe a note >>> along these lines could be appended: >>> >>> Now that you know what caused the error (and how to fix it), >>> throw away the rebased branch, and commit a fix on top of D. >>> >> Well, if rebasing becomes the standard for normal development, >> it's hardly >> right to throw it away, is it? I like Steffen's suggestion better. > > There is a big misunderstanding. The text that the patch amends is > about bisecting history that reveals that a merge commit breaks, > which is not helpful, and then how to find where and what and why > the breakage really was introduce. > > And the answer to "how to find" is to rebase and bisect in the > rebased history. Do you use rebase like this in real life? I thought of the text as background information that might be helpful for users who want do decide wether to merge or to rebase. The problem described may be valuable information supporting a decision about a recommended workflow for a group of users. My personal conclusion was: I'll accept the danger of complex merges that might be hard to bisect. I now understand this risk, but I nonetheless prefer the simplicity of a merge based workflow. This avoids the danger that published history gets rewritten. But now I'm wondering if your suggestions of rebasing only for locating the evil commit is feasible in reality. You may need to solve a lot of merge conflicts if you rebase a larger part of the history. If you do not have them in your rerere cache this might be time consuming. ... > My initial complaint was that in the flow of reading the > instructions the pointer to "the solution" was missing. Rather, at > the point where the reader is supposed to think "ah, yes, that's > how to do it", there is the conditional statement "If you linearize > history". My suggestion is to put a big emphasis on the solution by > using the words "The solution is". > > Now, the user can *always* rebase one of the branches on top of the > other, even if both histories are already published. *But* if both > were indeed published, then the rebased history must be thrown > away, and the only thing you learnt from it was where and what and > why the breakage really was introduced. > > Of course we could include a few "ifs" and "unlesses" (about > published histories), before suggesting to throw away rebased > history. But once the task is accomplished (find the bogus commit), > throwing away the rebased history (and continuing at commit D) is > always correct, but keeping it (and continuing at D*) is not. ... So, again, the question for me is if someone does use rebase in reality in the way that you suggests. Do you? Does someone else? Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 12:54 ` Steffen Prohaska @ 2007-11-08 13:22 ` Johannes Sixt 2007-11-08 14:55 ` Steffen Prohaska 2007-11-08 13:38 ` [PATCH v2] user-manual: add advanced topic "bisecting merges" Andreas Ericsson 2007-11-08 14:51 ` Benoit Sigoure 2 siblings, 1 reply; 30+ messages in thread From: Johannes Sixt @ 2007-11-08 13:22 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Andreas Ericsson, gitster, Ralf.Wildenhues, tsuna, git Steffen Prohaska schrieb: > > On Nov 8, 2007, at 10:53 AM, Johannes Sixt wrote: >> The text that the patch amends is >> about bisecting history that reveals that a merge commit breaks, which >> is not helpful, and then how to find where and what and why the >> breakage really was introduce. >> >> And the answer to "how to find" is to rebase and bisect in the rebased >> history. > > Do you use rebase like this in real life? Why is this relevant? You've written a superb addendum to the user manual, but IT TALKS ABOUT BISECTION, and is not a guideline when to use merges and when to rebase. It better not be meant as such. Consider an integrator who has just merged two histories, both of which are available publically. Pushing out a rebased history IS NOT AN OPTION. If the poor fellow for the heck of it has no choice but to find the bogus commit, then your instructions are worth a thousand bucks - even if the rebased history is otherwise useless -, but any guidelines how to construct histories are IRRELEVANT for his case. > But now I'm wondering if your suggestions of rebasing only for > locating the evil commit is feasible in reality. You may need > to solve a lot of merge conflicts if you rebase a larger part > of the history. If you do not have them in your rerere cache > this might be time consuming. ... During the rebase you will see the same conflicts that you also had during the merge, even simpler ones (because they are - hopefully - broken down into smaller pieces). If your merge was clean (as was suggested in the patch), then you won't see a lot of conflicts during the rebase, either. -- Hannes ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 13:22 ` Johannes Sixt @ 2007-11-08 14:55 ` Steffen Prohaska 2007-11-10 9:48 ` [PATCH v3] " Steffen Prohaska 0 siblings, 1 reply; 30+ messages in thread From: Steffen Prohaska @ 2007-11-08 14:55 UTC (permalink / raw) To: Johannes Sixt; +Cc: Andreas Ericsson, gitster, Ralf.Wildenhues, tsuna, git On Nov 8, 2007, at 2:22 PM, Johannes Sixt wrote: > Steffen Prohaska schrieb: >> On Nov 8, 2007, at 10:53 AM, Johannes Sixt wrote: >>> The text that the patch amends is about bisecting history that >>> reveals that a merge commit breaks, which is not helpful, and >>> then how to find where and what and why the breakage really was >>> introduce. >>> >>> And the answer to "how to find" is to rebase and bisect in the >>> rebased history. >> Do you use rebase like this in real life? > > Why is this relevant? Well, I don't want to give recommendations that are not tested in real life. Maybe the solution turns out to be less practical than it should be theoretically. > You've written a superb addendum to the user manual, but IT TALKS > ABOUT BISECTION, and is not a guideline when to use merges and when > to rebase. BTW, I only took what Junio wrote during a recent discussion on the list, polished it a bit and sent it as a patch. I'm only the editor, not the original author. That doesn't mean I wouldn't care about the text. I'm willing to improve the text. But all the cheers really should go to Junio. > It better not be meant as such. Consider an integrator who has just > merged two histories, both of which are available publically. > Pushing out a rebased history IS NOT AN OPTION. If the poor fellow > for the heck of it has no choice but to find the bogus commit, then > your instructions are worth a thousand bucks - even if the rebased > history is otherwise useless -, but any guidelines how to construct > histories are IRRELEVANT for his case. Ok, I see your point. I'll mention these points. >> But now I'm wondering if your suggestions of rebasing only for >> locating the evil commit is feasible in reality. You may need >> to solve a lot of merge conflicts if you rebase a larger part >> of the history. If you do not have them in your rerere cache >> this might be time consuming. ... > > During the rebase you will see the same conflicts that you also had > during the merge, even simpler ones (because they are - hopefully - > broken down into smaller pieces). If your merge was clean (as was > suggested in the patch), then you won't see a lot of conflicts > during the rebase, either. Yeah. I'll try to mention this at an appropriate place. Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v3] user-manual: add advanced topic "bisecting merges" 2007-11-08 14:55 ` Steffen Prohaska @ 2007-11-10 9:48 ` Steffen Prohaska 2007-11-10 10:36 ` Junio C Hamano 0 siblings, 1 reply; 30+ messages in thread From: Steffen Prohaska @ 2007-11-10 9:48 UTC (permalink / raw) To: git Cc: Junio C Hamano, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, Steffen Prohaska This commits adds a discussion of the challenge of bisecting merge commits to the user manual. The original author is Junio C Hamano <gitster@pobox.com>, who posted the text to the mailing list: <http://marc.info/?l=git&m=119403257315527&w=2>. The text from the email is slightly adapted for the manual. The discussion is added to "Exploring git history" in a sub-section titled "Advanced topics". The discussion requires detailed knowledge about git. It is assumed that the reader will skip advanced topics on first reading. At least the text suggest to do so. The text includes suggestions and fixed by Ralf Wildenhues <Ralf.Wildenhues@gmx.de>, Benoit Sigoure <tsuna@lrde.epita.fr>, Johannes Sixt <j.sixt@viscovery.net>. Signed-off-by: Steffen Prohaska <prohaska@zib.de> --- Documentation/user-manual.txt | 104 +++++++++++++++++++++++++++++++++++++++++ 1 files changed, 104 insertions(+), 0 deletions(-) Next try. The text now contains an introductory paragraph, proposes rebase as a 'solution', and recommends to throw away the rebased branch. Steffen diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt index d99adc6..2f4c314 100644 --- a/Documentation/user-manual.txt +++ b/Documentation/user-manual.txt @@ -934,6 +934,110 @@ Figuring out why this works is left as an exercise to the (advanced) student. The gitlink:git-log[1], gitlink:git-diff-tree[1], and gitlink:git-hash-object[1] man pages may prove helpful. +[[history-advanced-topics]] +Advanced topics +--------------- +This section covers advanced topics that typically require more +knowledge about git than the manual presented to this point. + +You may want to skip the section on first reading, and come back +later when you have a better understanding of git. + +[[bisect-merges]] +Why bisecting merge commits can be harder than bisecting linear history +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This section discusses how gitlink:git-bisect[1] plays +with differently shaped histories. If you did not yet +publish a branch you can use either gitlink:git-merge[1] or +gitlink:git-rebase[1] to integrate changes from a second +branch. The two approaches create differently shaped +histories. This section discusses the implications on +gitlink:git-bisect[1]. If the history is already published +temporarily rebasing can still be helpful for bisecting. + +The following text is based upon an email by Junio C. Hamano to +the git mailing list +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http://marc.info/?l=git&m=119403257315527&w=2]). +It was slightly adapted for this manual. + +Using gitlink:git-bisect[1] on a history with merges can be challenging. +Bisecting through merges is not a +technical problem. The real problem is what to do when the +culprit turns out to be a merge commit. How to spot what really +is wrong, and figure out how to fix. The problem is not for the +tool but for the human, and it is real. + +Imagine this history. + +................................................ + ---Z---o---X---...---o---A---C---D + \ / + o---o---Y---...---o---B +................................................ + +Suppose that on the upper development line, the meaning of one +of the functions that existed at Z was changed at commit X. The +commits from Z leading to A change both the function's +implementation and all calling sites that existed at Z, as well +as new calling sites they add, to be consistent. There is no +bug at A. + +Suppose that in the meantime the lower development line somebody +added a new calling site for that function at commit Y. The +commits from Z leading to B all assume the old semantics of that +function and the callers and the callee are consistent with each +other. There is no bug at B, either. + +You merge to create C. There is no textual conflict with this +three way merge, and the result merges cleanly. You bisect +this, because you found D is bad and you know Z was good. Your +bisect will find that C (merge) is broken. Understandably so, +as at C, the new calling site of the function added by the lower +branch is not converted to the new semantics, while all the +other calling sites that already existed at Z would have been +converted by the merge. The new calling site has semantic +adjustment needed, but you do not know that yet. You need to +find out that that is the cause of the breakage by looking at the +merge commit C and the history leading to it. + +How would you do that? + +Both "git diff A C" and "git diff B C" would be an enormous patch. +Each of them essentially shows the whole change on each branch +since they diverged. The developers may have well behaved to +create good commits that follow the "commit small, commit often, +commit well contained units" mantra, and each individual commit +leading from Z to A and from Z to B may be easy to review and +understand, but looking at these small and easily reviewable +steps alone would not let you spot the breakage. You need to +have a global picture of what the upper branch did (and +among many, one of them is to change the semantics of that +particular function) and look first at the huge "diff A C" +(which shows the change the lower branch introduces), and see if +that huge change is consistent with what have been done between +Z and A. + +A solution is to linearize the history by rebasing the lower +branch on top of the upper, instead of merging. There were no +textual conflicts in the original three way merge. So there +should not be conflicts during rebase either. Now the bug becomes +much easier to find and understand. Your history would instead +be: + +................................................................ + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* +................................................................ + +and there is a single commit Y* between A and B* that introduced +the new calling site that still uses the old semantics of the +function, even though that was already modified at X. "git show +Y*" will be a much smaller patch than "git diff A C" and it is +much easier to deal with. + +Now that you know what caused the error (and how to fix it), +throw away the rebased branch, and commit a fix on top of D. + + [[Developing-with-git]] Developing with git =================== -- 1.5.3.5.578.g886d ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v3] user-manual: add advanced topic "bisecting merges" 2007-11-10 9:48 ` [PATCH v3] " Steffen Prohaska @ 2007-11-10 10:36 ` Junio C Hamano 2007-11-10 11:16 ` Steffen Prohaska 0 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2007-11-10 10:36 UTC (permalink / raw) To: Steffen Prohaska; +Cc: git, Benoit Sigoure, Andreas Ericsson, Johannes Sixt Steffen Prohaska <prohaska@zib.de> writes: > ... > +A solution is to linearize the history by rebasing the lower > +branch on top of the upper, instead of merging. There were no Hmm. When I wrote it, I did not mean this as a "solution", but as an illustration of how a merge heavy history and a linear history have impact on bisectability. So it is more like... On the other hand, if you did not merge at C but rebased the history between Z to B on top of A, you would have get this linear history [illustration here]. Bisecting between Z and D* would hit a single culprit commit Y* instead. This tends to be easier to understand why it is broken. For this reason, many experienced git users, even when they are working on an otherwise merge-heavy project, keep the histories linear by rebasing their work on top of public upstreams before publishing (when able). An extreme example: merges from a few top-level lieutenants to Linus in the kernel, e.g. David Miller, are known to _almost always_ fast-forward for Linus. IOW, the description is to mildly encourage private rebasing to keep the job of later bisecting (for potentially others) easier. I realize I originally wrote as if C (merge) was made by the same person as the person who ends up bisecting, but that is not necessarily the case. Keeping the history without needless merges tend to make _other_ people's lives simpler. And after encouraging the private rebasing, I would continue like... But if you already made a merge C instead of rebasing, all is not lost. In the illustrated case, you can easily rebase one parent branch on top of the other after the fact, just to understand the history and to make the history more easily bisectable. Even though the published history should not be rewound without consent with others in the project, nobody gets hurt if you rebased to create alternate history privately. After understanding the breakage and coming up with a fix on top of D*, you can discard that rebased history, and apply the same fix on top of D, as D* and D should have the identical trees. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v3] user-manual: add advanced topic "bisecting merges" 2007-11-10 10:36 ` Junio C Hamano @ 2007-11-10 11:16 ` Steffen Prohaska 2007-11-10 13:49 ` [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." Steffen Prohaska 0 siblings, 1 reply; 30+ messages in thread From: Steffen Prohaska @ 2007-11-10 11:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Benoit Sigoure, Andreas Ericsson, Johannes Sixt On Nov 10, 2007, at 11:36 AM, Junio C Hamano wrote: > Steffen Prohaska <prohaska@zib.de> writes: > >> ... >> +A solution is to linearize the history by rebasing the lower >> +branch on top of the upper, instead of merging. There were no > > Hmm. When I wrote it, I did not mean this as a "solution", but > as an illustration of how a merge heavy history and a linear > history have impact on bisectability. I agree. This is what I understood. But later Johannes brought up the idea of naming it a "solution"; and some others liked this. We should probably move the discussion to Chapter 5 "Rewriting history and maintaining patch series". Discussing pros and cons of merge versus rebase makes more sense there. The discussion would go right after "Problems with rewriting history". So, users should be warned not to rebase without thinking about the consequences. Another advantage is that users should have learnt more about git when they reach Chapter 5. Hence, it should be easier to follow the discussion. > So it is more like... > > On the other hand, if you did not merge at C but rebased the > history between Z to B on top of A, you would have get this > linear history [illustration here]. Bisecting between Z and > D* would hit a single culprit commit Y* instead. This tends > to be easier to understand why it is broken. I'll take this... > For this reason, many experienced git users, even when they are > working on an otherwise merge-heavy project, keep the histories > linear by rebasing their work on top of public upstreams before > publishing (when able). An extreme example: merges from a few > top-level lieutenants to Linus in the kernel, e.g. David Miller, > are known to _almost always_ fast-forward for Linus. > > IOW, the description is to mildly encourage private rebasing to > keep the job of later bisecting (for potentially others) easier. > I realize I originally wrote as if C (merge) was made by the > same person as the person who ends up bisecting, but that is > not necessarily the case. Keeping the history without needless > merges tend to make _other_ people's lives simpler. > > And after encouraging the private rebasing, I would continue > like... > > But if you already made a merge C instead of rebasing, all > is not lost. In the illustrated case, you can easily rebase > one parent branch on top of the other after the fact, just > to understand the history and to make the history more > easily bisectable. s/more easily bisectable/easier to bisect/ ? > Even though the published history should > not be rewound without consent with others in the project, > nobody gets hurt if you rebased to create alternate history > privately. After understanding the breakage and coming up > with a fix on top of D*, you can discard that rebased > history, and apply the same fix on top of D, as D* and D > should have the identical trees. ... and will prepare PATCH v4. Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-10 11:16 ` Steffen Prohaska @ 2007-11-10 13:49 ` Steffen Prohaska 2007-11-10 19:10 ` Linus Torvalds 2007-11-18 3:59 ` J. Bruce Fields 0 siblings, 2 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-10 13:49 UTC (permalink / raw) To: Junio C Hamano Cc: Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git, Steffen Prohaska This commit adds a discussion of the challenge of bisecting merge commits to the user manual. The original author is Junio C Hamano <gitster@pobox.com>, who posted the text to the mailing list <http://marc.info/?l=git&m=119403257315527&w=2>. His email was adapted for the manual. The discussion is added to "Rewriting history and maintainig patch series". The text added requires good understanding of merging and rebasing. Therefore it should not be placed too early in the manual. Right after the section on "Problems with rewriting history", the discussion of bisect gives another reason for linearizing as much of the history as possible. The text includes suggestions and fixes by Ralf Wildenhues <Ralf.Wildenhues@gmx.de> and Benoit Sigoure <tsuna@lrde.epita.fr>. Signed-off-by: Steffen Prohaska <prohaska@zib.de> --- Documentation/user-manual.txt | 98 +++++++++++++++++++++++++++++++++++++++++ 1 files changed, 98 insertions(+), 0 deletions(-) Since PATCH v3: I moved the section to chapter 5. It's not longer in a separate section titled "Advance Topics". I split the paragraph on merge C and reordered some sentences. I took part the paragraph on "experienced users" from Junio's last email. Junio, you can take ownership of the commit when applying -- if you like. Most of the text was written by you. Just mention me as the editor in the commit message. Steffen diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt index d99adc6..8ddc19f 100644 --- a/Documentation/user-manual.txt +++ b/Documentation/user-manual.txt @@ -2554,6 +2554,104 @@ branches into their own work. For true distributed development that supports proper merging, published branches should never be rewritten. +[[bisect-merges]] +Why bisecting merge commits can be harder than bisecting linear history +----------------------------------------------------------------------- + +This section discusses how gitlink:git-bisect[1] plays +with differently shaped histories. The text is based upon +an email by Junio C. Hamano to the git mailing list +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http://marc.info/?l=git&m=119403257315527&w=2]). +It was adapted for the user manual. + +Using gitlink:git-bisect[1] on a history with merges can be +challenging. Bisecting through merges is not a technical +problem. The real problem is what to do when the culprit turns +out to be a merge commit. How to spot what really is wrong, and +figure out how to fix it. The problem is not for the tool but +for the human, and it is real. + +Imagine this history: + +................................................ + ---Z---o---X---...---o---A---C---D + \ / + o---o---Y---...---o---B +................................................ + +Suppose that on the upper development line, the meaning of one +of the functions that existed at Z was changed at commit X. The +commits from Z leading to A change both the function's +implementation and all calling sites that existed at Z, as well +as new calling sites they add, to be consistent. There is no +bug at A. + +Suppose that in the meantime the lower development line somebody +added a new calling site for that function at commit Y. The +commits from Z leading to B all assume the old semantics of that +function and the callers and the callee are consistent with each +other. There is no bug at B, either. + +Suppose further that the two development lines were merged at C +and there was no textual conflict with this three way merge. +The result merged cleanly. + +Now, during bisect you find that the merge C is broken. You +started to bisect, because you found D is bad and you know Z was +good. The breakage is understandable, as at C, the new calling +site of the function added by the lower branch is not converted +to the new semantics, while all the other calling sites that +already existed at Z would have been converted by the merge. The +new calling site has semantic adjustment needed, but you do not +know that yet. + +You need to find out what is the cause of the breakage by looking +at the merge commit C and the history leading to it. How would +you do that? + +Both "git diff A C" and "git diff B C" would be an enormous patch. +Each of them essentially shows the whole change on each branch +since they diverged. The developers may have well behaved to +create good commits that follow the "commit small, commit often, +commit well contained units" mantra, and each individual commit +leading from Z to A and from Z to B may be easy to review and +understand, but looking at these small and easily reviewable +steps alone would not let you spot the breakage. You need to +have a global picture of what the upper branch did (and +among many, one of them is to change the semantics of that +particular function) and look first at the huge "diff A C" +(which shows the change the lower branch introduces), and see if +that huge change is consistent with what have been done between +Z and A. + +On the other hand, if you did not merge at C but rebased the +history between Z to B on top of A, you would have get this +linear history: + +................................................................ + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* +................................................................ + +Bisecting between Z and D* would hit a single culprit commit Y* +instead. This tends to be easier to understand why it is broken. + +For this reason, many experienced git users, even when they are +working on an otherwise merge-heavy project, keep the histories +linear by rebasing their work on top of public upstreams before +publishing. + +But if you already made a merge C instead of rebasing, all +is not lost. In the illustrated case, you can easily rebase +one parent branch on top of the other after the fact, just +to understand the history and to make the history more +easily to bisect. Even though the published history should +not be rewound without consent with others in the project, +nobody gets hurt if you rebased to create alternate history +privately. After understanding the breakage and coming up +with a fix on top of D*, you can discard that rebased +history, and apply the same fix on top of D, as D* and D +should have the identical trees. + [[advanced-branch-management]] Advanced branch management ========================== -- 1.5.3.5.578.g886d ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-10 13:49 ` [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." Steffen Prohaska @ 2007-11-10 19:10 ` Linus Torvalds 2007-11-10 20:25 ` Junio C Hamano 2007-11-18 3:59 ` J. Bruce Fields 1 sibling, 1 reply; 30+ messages in thread From: Linus Torvalds @ 2007-11-10 19:10 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git On Sat, 10 Nov 2007, Steffen Prohaska wrote: > + > +But if you already made a merge C instead of rebasing, all > +is not lost. In the illustrated case, you can easily rebase > +one parent branch on top of the other after the fact, just > +to understand the history and to make the history more > +easily to bisect. I simply don't think this is true. You can *not* easily rebase after the fact. Both the parents may have lots of merges independently of each other, and rebase won't be able to do *anything* with such a case. In fact, the only reason you think you can easily rebase after-the-fact is that you made the example history be trivial. In real life, if the example history is that trivial (just a single merge of two otherwise linear histories), you'd probably generally not have this problem in the first place. The facts are: - git bisect can handle merges quite well, and points to the right commit (which is *usually* not a merge). - but merges by definition introduce changes from two independent lines of development, and while both lines may work well on their own, it is possible that (a) the merge itself was done incorrectly or (b) the two (or more) features that were introduced simply clash. - "git rebase" won't do a thing for this after the fact, except in trivial cases, and even then it will generate a new history that simply isn't compatible with the original history, so while it could in theory help bisect things further in some limited and simple cases, in general it's simply not that useful an approach. ..and I think it's simply wrong to even *try* to imply that "git rebase" can somehow change any of this. Linus ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-10 19:10 ` Linus Torvalds @ 2007-11-10 20:25 ` Junio C Hamano 2007-11-10 22:41 ` Steffen Prohaska 0 siblings, 1 reply; 30+ messages in thread From: Junio C Hamano @ 2007-11-10 20:25 UTC (permalink / raw) To: Linus Torvalds Cc: Steffen Prohaska, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sat, 10 Nov 2007, Steffen Prohaska wrote: >> + >> +But if you already made a merge C instead of rebasing, all >> +is not lost. In the illustrated case, you can easily rebase >> +one parent branch on top of the other after the fact, just >> +to understand the history and to make the history more >> +easily to bisect. > > I simply don't think this is true. > > You can *not* easily rebase after the fact. Both the parents may have lots > of merges independently of each other, and rebase won't be able to do > *anything* with such a case. In fact, the only reason you think you can > easily rebase after-the-fact is that you made the example history be > trivial. In real life, if the example history is that trivial (just a > single merge of two otherwise linear histories), you'd probably generally > not have this problem in the first place. > > The facts are: > > - git bisect can handle merges quite well, and points to the right > commit (which is *usually* not a merge). > > - but merges by definition introduce changes from two independent lines > of development, and while both lines may work well on their own, it is > possible that (a) the merge itself was done incorrectly or (b) the two > (or more) features that were introduced simply clash. > > - "git rebase" won't do a thing for this after the fact, except in > trivial cases, and even then it will generate a new history that simply > isn't compatible with the original history, so while it could in theory > help bisect things further in some limited and simple cases, in general > it's simply not that useful an approach. > > ..and I think it's simply wrong to even *try* to imply that "git rebase" > can somehow change any of this. Very true. It is a suggestion useful _only_ when you can easily rebase. Like the one illustrated in the description, which is artificial and of very limited scope. If you cannot rebase easily, then (by definition) rebase would not help you. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-10 20:25 ` Junio C Hamano @ 2007-11-10 22:41 ` Steffen Prohaska 0 siblings, 0 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-10 22:41 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git On Nov 10, 2007, at 9:25 PM, Junio C Hamano wrote: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> On Sat, 10 Nov 2007, Steffen Prohaska wrote: >>> + >>> +But if you already made a merge C instead of rebasing, all >>> +is not lost. In the illustrated case, you can easily rebase >>> +one parent branch on top of the other after the fact, just >>> +to understand the history and to make the history more >>> +easily to bisect. >> >> I simply don't think this is true. >> >> You can *not* easily rebase after the fact. Both the parents may >> have lots >> of merges independently of each other, and rebase won't be able to do >> *anything* with such a case. In fact, the only reason you think >> you can >> easily rebase after-the-fact is that you made the example history be >> trivial. In real life, if the example history is that trivial (just a >> single merge of two otherwise linear histories), you'd probably >> generally >> not have this problem in the first place. >> >> The facts are: >> >> - git bisect can handle merges quite well, and points to the right >> commit (which is *usually* not a merge). >> >> - but merges by definition introduce changes from two independent >> lines >> of development, and while both lines may work well on their >> own, it is >> possible that (a) the merge itself was done incorrectly or (b) >> the two >> (or more) features that were introduced simply clash. >> >> - "git rebase" won't do a thing for this after the fact, except in >> trivial cases, and even then it will generate a new history >> that simply >> isn't compatible with the original history, so while it could >> in theory >> help bisect things further in some limited and simple cases, in >> general >> it's simply not that useful an approach. >> >> ..and I think it's simply wrong to even *try* to imply that "git >> rebase" >> can somehow change any of this. > > Very true. It is a suggestion useful _only_ when you can easily > rebase. Like the one illustrated in the description, which is > artificial and of very limited scope. If you cannot rebase > easily, then (by definition) rebase would not help you. I propose to remove this suggestion if no one comes up with a real world example where rebasing after the fact turned out to be useful. If nobody has such an example it doesn't make sense to tell readers of the manual that rebase could be useful in such a situation. The patch would be identical except for the last paragraph removed. Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-10 13:49 ` [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." Steffen Prohaska 2007-11-10 19:10 ` Linus Torvalds @ 2007-11-18 3:59 ` J. Bruce Fields 2007-11-18 9:47 ` Steffen Prohaska 1 sibling, 1 reply; 30+ messages in thread From: J. Bruce Fields @ 2007-11-18 3:59 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git On Sat, Nov 10, 2007 at 02:49:54PM +0100, Steffen Prohaska wrote: > +[[bisect-merges]] > +Why bisecting merge commits can be harder than bisecting linear history > +----------------------------------------------------------------------- > + > +This section discusses how gitlink:git-bisect[1] plays > +with differently shaped histories. The text is based upon > +an email by Junio C. Hamano to the git mailing list > +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http://marc.info/?l=git&m=119403257315527&w=2]). > +It was adapted for the user manual. This is not the only text that's been taken from someplace else, but if we attributed them all in the text it would get a little cumbersome.... I think the place for that kind of thing is in the commit message, but if we really think we need to include it in the main text, we could add a separate "'acknowledgements" section. > + > +Using gitlink:git-bisect[1] on a history with merges can be > +challenging. Bisecting through merges is not a technical > +problem. The real problem is what to do when the culprit turns > +out to be a merge commit. How to spot what really is wrong, and > +figure out how to fix it. The problem is not for the tool but > +for the human, and it is real. I think we can pare that down a little. > + > +Imagine this history: > + > +................................................ > + ---Z---o---X---...---o---A---C---D > + \ / > + o---o---Y---...---o---B > +................................................ > + > +Suppose that on the upper development line, the meaning of one > +of the functions that existed at Z was changed at commit X. The > +commits from Z leading to A change both the function's > +implementation and all calling sites that existed at Z, as well > +as new calling sites they add, to be consistent. There is no > +bug at A. > + > +Suppose that in the meantime the lower development line somebody > +added a new calling site for that function at commit Y. The > +commits from Z leading to B all assume the old semantics of that > +function and the callers and the callee are consistent with each > +other. There is no bug at B, either. > + > +Suppose further that the two development lines were merged at C > +and there was no textual conflict with this three way merge. > +The result merged cleanly. > + > +Now, during bisect you find that the merge C is broken. You > +started to bisect, because you found D is bad and you know Z was > +good. The breakage is understandable, as at C, the new calling > +site of the function added by the lower branch is not converted > +to the new semantics, while all the other calling sites that > +already existed at Z would have been converted by the merge. The > +new calling site has semantic adjustment needed, but you do not > +know that yet. > + > +You need to find out what is the cause of the breakage by looking > +at the merge commit C and the history leading to it. How would > +you do that? > + > +Both "git diff A C" and "git diff B C" would be an enormous patch. > +Each of them essentially shows the whole change on each branch > +since they diverged. The developers may have well behaved to > +create good commits that follow the "commit small, commit often, > +commit well contained units" mantra, and each individual commit > +leading from Z to A and from Z to B may be easy to review and > +understand, but looking at these small and easily reviewable > +steps alone would not let you spot the breakage. You need to > +have a global picture of what the upper branch did (and > +among many, one of them is to change the semantics of that > +particular function) and look first at the huge "diff A C" > +(which shows the change the lower branch introduces), and see if > +that huge change is consistent with what have been done between > +Z and A. > + > +On the other hand, if you did not merge at C but rebased the > +history between Z to B on top of A, you would have get this > +linear history: > + > +................................................................ > + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* > +................................................................ > + > +Bisecting between Z and D* would hit a single culprit commit Y* > +instead. This tends to be easier to understand why it is broken. > + > +For this reason, many experienced git users, even when they are > +working on an otherwise merge-heavy project, keep the histories > +linear by rebasing their work on top of public upstreams before > +publishing. I'd say "partly for this reason", as I don't think this is the only reason people do that. I've done the above revisions and a few others and pushed them to git://linux-nfs.org/~bfields/git.git maint I'll take another look in the morning. --b. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-18 3:59 ` J. Bruce Fields @ 2007-11-18 9:47 ` Steffen Prohaska 2007-11-18 23:18 ` J. Bruce Fields 0 siblings, 1 reply; 30+ messages in thread From: Steffen Prohaska @ 2007-11-18 9:47 UTC (permalink / raw) To: J. Bruce Fields Cc: Junio C Hamano, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git On Nov 18, 2007, at 4:59 AM, J. Bruce Fields wrote: > On Sat, Nov 10, 2007 at 02:49:54PM +0100, Steffen Prohaska wrote: >> +[[bisect-merges]] >> +Why bisecting merge commits can be harder than bisecting linear >> history >> +-------------------------------------------------------------------- >> --- >> + >> +This section discusses how gitlink:git-bisect[1] plays >> +with differently shaped histories. The text is based upon >> +an email by Junio C. Hamano to the git mailing list >> +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http:// >> marc.info/?l=git&m=119403257315527&w=2]). >> +It was adapted for the user manual. > > This is not the only text that's been taken from someplace else, > but if > we attributed them all in the text it would get a little > cumbersome.... > I think the place for that kind of thing is in the commit message, but > if we really think we need to include it in the main text, we could > add > a separate "'acknowledgements" section. ack for this change. And a general comment: I had two purposes in\x05 mind when I added this paragraph 1) Attribution; 2) Giving a reference to the original discussion. If someone disagrees, or has further questions, the original discussion on the mailing list could be useful. I believe the commit message is sufficient for attribution. I don't think we need more. Maybe if the manual goes to print we need to reconsider. But who know if this ever will happen. However, adding a References section with links to other resources could be a useful thing. Such resources could give additional information, without sacrificing the conciseness of the manual. An example are links to the original discussion on the list. Often they contain more details, which may sometimes be useful. Does asciidoc provide a mechanism for this? Something like \cite{} in LaTeX. >> + >> +Using gitlink:git-bisect[1] on a history with merges can be >> +challenging. Bisecting through merges is not a technical >> +problem. The real problem is what to do when the culprit turns >> +out to be a merge commit. How to spot what really is wrong, and >> +figure out how to fix it. The problem is not for the tool but >> +for the human, and it is real. > > I think we can pare that down a little. From your 7df716bec6bf6e3dafe4c36a6313a4346de2585a +The gitlink:git-bisect[1] commands deals fine with history that includes +merge commits. However, when the final commit that ends on is a merge +commit, the user may need to work harder than usual to figure out +exactly what the problem was. If you don't take the text below, first line: s/commands/command/ I'd reverse the order of the sentences. The section is about the difficulty, not about praising git-bisect. How about When the final commit that a bisect ends on is a merge commit, the user may need to work harder than usual to figure out exactly what the problem was. This is not a technical problem. In principle, the gitlink:git-bisect[1] command deals fine with history that includes merge commits. It's your call. I'm also fine with your version. >> + >> +Imagine this history: >> + >> +................................................ >> + ---Z---o---X---...---o---A---C---D >> + \ / >> + o---o---Y---...---o---B >> +................................................ >> + >> +Suppose that on the upper development line, the meaning of one >> +of the functions that existed at Z was changed at commit X. The >> +commits from Z leading to A change both the function's >> +implementation and all calling sites that existed at Z, as well >> +as new calling sites they add, to be consistent. There is no >> +bug at A. >> + >> +Suppose that in the meantime the lower development line somebody >> +added a new calling site for that function at commit Y. The >> +commits from Z leading to B all assume the old semantics of that >> +function and the callers and the callee are consistent with each >> +other. There is no bug at B, either. >> + >> +Suppose further that the two development lines were merged at C >> +and there was no textual conflict with this three way merge. >> +The result merged cleanly. >> + >> +Now, during bisect you find that the merge C is broken. You >> +started to bisect, because you found D is bad and you know Z was >> +good. The breakage is understandable, as at C, the new calling >> +site of the function added by the lower branch is not converted >> +to the new semantics, while all the other calling sites that >> +already existed at Z would have been converted by the merge. The >> +new calling site has semantic adjustment needed, but you do not >> +know that yet. >> + From your 7df716bec6bf6e3dafe4c36a6313a4346de2585a +Nevertheless, the code at C is broken, because the callers added +on the lower line of development have not been converted to the new +semantics introduced on the upper line of development. So if all +you know is that D was bad, Z was good, and that a s/a//? I'm not sure; you are the native speaker ;) +gitlink:git-bisect[1] identified C as the culprit, how will you +figure out that the problem was due to this change in semantics? >> +You need to find out what is the cause of the breakage by looking >> +at the merge commit C and the history leading to it. How would >> +you do that? >> + >> +Both "git diff A C" and "git diff B C" would be an enormous patch. >> +Each of them essentially shows the whole change on each branch >> +since they diverged. The developers may have well behaved to >> +create good commits that follow the "commit small, commit often, >> +commit well contained units" mantra, and each individual commit >> +leading from Z to A and from Z to B may be easy to review and >> +understand, but looking at these small and easily reviewable >> +steps alone would not let you spot the breakage. You need to >> +have a global picture of what the upper branch did (and >> +among many, one of them is to change the semantics of that >> +particular function) and look first at the huge "diff A C" >> +(which shows the change the lower branch introduces), and see if >> +that huge change is consistent with what have been done between >> +Z and A. >> + From your 7df716bec6bf6e3dafe4c36a6313a4346de2585a +When the result of a git-bisect is a non-merge commit, you should +normally be able to discover the problem be examining just that commit. s/be examining/by examining/ +Developers can make this easy by breaking their changes into small +self-contained commits. That won't help in the case above, however, +because the problem isn't obvious from examination of any single +commit; instead, a global view of the development is required. To +make matters worse, the change in semantics in the problematic +function may be just one small part of the changes in the upper +line of development. + >> +On the other hand, if you did not merge at C but rebased the >> +history between Z to B on top of A, you would have get this >> +linear history: >> + >> +................................................................ >> + ---Z---o---X--...---o---A---o---o---Y*--...---o---B*--D* >> +................................................................ >> + >> +Bisecting between Z and D* would hit a single culprit commit Y* >> +instead. This tends to be easier to understand why it is broken. >> + >> +For this reason, many experienced git users, even when they are >> +working on an otherwise merge-heavy project, keep the histories >> +linear by rebasing their work on top of public upstreams before >> +publishing. > > I'd say "partly for this reason", as I don't think this is the only > reason people do that. > > I've done the above revisions and a few others and pushed them to > > git://linux-nfs.org/~bfields/git.git maint > > I'll take another look in the morning. Besides the minor fixes above, ack from me. We already spend a lot of time on this section. It improved compared to the first version and I think it's now ready for the manual. Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." 2007-11-18 9:47 ` Steffen Prohaska @ 2007-11-18 23:18 ` J. Bruce Fields 0 siblings, 0 replies; 30+ messages in thread From: J. Bruce Fields @ 2007-11-18 23:18 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, Benoit Sigoure, Andreas Ericsson, Johannes Sixt, git On Sun, Nov 18, 2007 at 10:47:24AM +0100, Steffen Prohaska wrote: > > On Nov 18, 2007, at 4:59 AM, J. Bruce Fields wrote: > >> On Sat, Nov 10, 2007 at 02:49:54PM +0100, Steffen Prohaska wrote: >>> +[[bisect-merges]] >>> +Why bisecting merge commits can be harder than bisecting linear history >>> +----------------------------------------------------------------------- >>> + >>> +This section discusses how gitlink:git-bisect[1] plays >>> +with differently shaped histories. The text is based upon >>> +an email by Junio C. Hamano to the git mailing list >>> +(link:http://marc.info/?l=git&m=119403257315527&w=2[link:http://marc.info/?l=git&m=119403257315527&w=2]). >>> +It was adapted for the user manual. >> >> This is not the only text that's been taken from someplace else, but if >> we attributed them all in the text it would get a little cumbersome.... >> I think the place for that kind of thing is in the commit message, but >> if we really think we need to include it in the main text, we could add >> a separate "'acknowledgements" section. > > ack for this change. > > And a general comment: I had two purposes in\x05 mind when I > added this paragraph > 1) Attribution; > 2) Giving a reference to the original discussion. If someone disagrees, > or has further questions, the original discussion on the mailing > list could be useful. > > I believe the commit message is sufficient for attribution. > I don't think we need more. Maybe if the manual goes to print > we need to reconsider. But who know if this ever will happen. > > However, adding a References section with links to other > resources could be a useful thing. Such resources could give > additional information, without sacrificing the conciseness of > the manual. An example are links to the original discussion on > the list. Often they contain more details, which may sometimes > be useful. > > Does asciidoc provide a mechanism for this? Something like > \cite{} in LaTeX. We could add a "for more details, see link:..." at the end, though I don't think it's necessary in this case. >>> +Using gitlink:git-bisect[1] on a history with merges can be >>> +challenging. Bisecting through merges is not a technical >>> +problem. The real problem is what to do when the culprit turns >>> +out to be a merge commit. How to spot what really is wrong, and >>> +figure out how to fix it. The problem is not for the tool but >>> +for the human, and it is real. >> >> I think we can pare that down a little. > > From your 7df716bec6bf6e3dafe4c36a6313a4346de2585a > > +The gitlink:git-bisect[1] commands deals fine with history that includes > +merge commits. However, when the final commit that ends on is a merge > +commit, the user may need to work harder than usual to figure out > +exactly what the problem was. > > If you don't take the text below, first line: s/commands/command/ Whoops, thanks! > > I'd reverse the order of the sentences. The section is about > the difficulty, not about praising git-bisect. How about > > When the final commit that a bisect ends on is a merge commit, > the user may need to work harder than usual to figure out > exactly what the problem was. This is not a technical > problem. In principle, the gitlink:git-bisect[1] command > deals fine with history that includes merge commits. > > It's your call. I'm also fine with your version. OK; I see your point but I think I'll stick with my version (with another minor fix or two)--the order seems more natural: first dismiss the obvious objection to the title, then introduce the problem that leads to the following discussion. >>> + >>> +Imagine this history: >>> + >>> +................................................ >>> + ---Z---o---X---...---o---A---C---D >>> + \ / >>> + o---o---Y---...---o---B >>> +................................................ >>> + >>> +Suppose that on the upper development line, the meaning of one >>> +of the functions that existed at Z was changed at commit X. The >>> +commits from Z leading to A change both the function's >>> +implementation and all calling sites that existed at Z, as well >>> +as new calling sites they add, to be consistent. There is no >>> +bug at A. >>> + >>> +Suppose that in the meantime the lower development line somebody >>> +added a new calling site for that function at commit Y. The >>> +commits from Z leading to B all assume the old semantics of that >>> +function and the callers and the callee are consistent with each >>> +other. There is no bug at B, either. >>> + >>> +Suppose further that the two development lines were merged at C >>> +and there was no textual conflict with this three way merge. >>> +The result merged cleanly. >>> + >>> +Now, during bisect you find that the merge C is broken. You >>> +started to bisect, because you found D is bad and you know Z was >>> +good. The breakage is understandable, as at C, the new calling >>> +site of the function added by the lower branch is not converted >>> +to the new semantics, while all the other calling sites that >>> +already existed at Z would have been converted by the merge. The >>> +new calling site has semantic adjustment needed, but you do not >>> +know that yet. >>> + > > From your 7df716bec6bf6e3dafe4c36a6313a4346de2585a > > +Nevertheless, the code at C is broken, because the callers added > +on the lower line of development have not been converted to the new > +semantics introduced on the upper line of development. So if all > +you know is that D was bad, Z was good, and that a > > s/a//? I'm not sure; you are the native speaker ;) I agree, it's better without the "a", thanks. > > +gitlink:git-bisect[1] identified C as the culprit, how will you > +figure out that the problem was due to this change in semantics? > > > >>> +You need to find out what is the cause of the breakage by looking >>> +at the merge commit C and the history leading to it. How would >>> +you do that? >>> + >>> +Both "git diff A C" and "git diff B C" would be an enormous patch. >>> +Each of them essentially shows the whole change on each branch >>> +since they diverged. The developers may have well behaved to >>> +create good commits that follow the "commit small, commit often, >>> +commit well contained units" mantra, and each individual commit >>> +leading from Z to A and from Z to B may be easy to review and >>> +understand, but looking at these small and easily reviewable >>> +steps alone would not let you spot the breakage. You need to >>> +have a global picture of what the upper branch did (and >>> +among many, one of them is to change the semantics of that >>> +particular function) and look first at the huge "diff A C" >>> +(which shows the change the lower branch introduces), and see if >>> +that huge change is consistent with what have been done between >>> +Z and A. >>> + > > From your 7df716bec6bf6e3dafe4c36a6313a4346de2585a > > +When the result of a git-bisect is a non-merge commit, you should > +normally be able to discover the problem be examining just that commit. > > s/be examining/by examining/ Oops, thanks. >> I'd say "partly for this reason", as I don't think this is the only >> reason people do that. >> >> I've done the above revisions and a few others and pushed them to >> >> git://linux-nfs.org/~bfields/git.git maint >> >> I'll take another look in the morning. > > Besides the minor fixes above, ack from me. We already spend > a lot of time on this section. It improved compared to the > first version and I think it's now ready for the manual. OK, thanks for your patience! This plus an unrelated trivial fix are available for Junio to pull from git://linux-nfs.org/~bfields/git.git maint (Full history also available from the maint-history branch). --b. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 12:54 ` Steffen Prohaska 2007-11-08 13:22 ` Johannes Sixt @ 2007-11-08 13:38 ` Andreas Ericsson 2007-11-08 14:51 ` Benoit Sigoure 2 siblings, 0 replies; 30+ messages in thread From: Andreas Ericsson @ 2007-11-08 13:38 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Johannes Sixt, gitster, Ralf.Wildenhues, tsuna, git Steffen Prohaska wrote: > > On Nov 8, 2007, at 10:53 AM, Johannes Sixt wrote: > >> Andreas Ericsson schrieb: >>> Johannes Sixt wrote: >>>> Steffen Prohaska schrieb: >>>>> >>>>> On Nov 8, 2007, at 8:19 AM, Johannes Sixt wrote: >>>>> >>>>>> Steffen Prohaska schrieb: >>>>>>> +If you linearize the history by rebasing the lower branch on >>>>>>> +top of the upper, instead of merging, the bug becomes much >>>>>>> easier to >>>>>>> +find and understand. Your history would instead be: >>>>>> >>>>>> At this point I'm missing the words >>>>>> >>>>>> The solution is ... >>>>>> >>>>>> I.e.: >>>>>> >>>>>> The solution is to linearize the history by rebasing the lower >>>>>> branch on >>>>>> top of the upper, instead of merging. Now the bug becomes much >>>>>> easier to >>>>>> find and understand. Your history would instead be: >>>>> >>>>> Hmm. It might be a solution if you did not publish history. >>>> >>>> This is about finding the commit that introduced a bug. Once you >>>> found it, better: you know how to fix the bug, you are expected to >>>> throw away the rebased branch, not to publish it! Maybe a note along >>>> these lines could be appended: >>>> >>>> Now that you know what caused the error (and how to fix it), throw >>>> away the rebased branch, and commit a fix on top of D. >>>> >>> Well, if rebasing becomes the standard for normal development, it's >>> hardly >>> right to throw it away, is it? I like Steffen's suggestion better. >> >> There is a big misunderstanding. The text that the patch amends is >> about bisecting history that reveals that a merge commit breaks, which >> is not helpful, and then how to find where and what and why the >> breakage really was introduce. >> >> And the answer to "how to find" is to rebase and bisect in the rebased >> history. > > Do you use rebase like this in real life? > > I thought of the text as background information that might > be helpful for users who want do decide wether to merge or > to rebase. The problem described may be valuable information > supporting a decision about a recommended workflow for a group > of users. > > My personal conclusion was: I'll accept the danger of complex > merges that might be hard to bisect. I now understand this > risk, but I nonetheless prefer the simplicity of a merge > based workflow. This avoids the danger that published history > gets rewritten. > > But now I'm wondering if your suggestions of rebasing only for > locating the evil commit is feasible in reality. You may need > to solve a lot of merge conflicts if you rebase a larger part > of the history. If you do not have them in your rerere cache > this might be time consuming. ... > It is no great chore to put one merge-parent on top of another and then re-run bisect on the result. git-bisect could even be taught to do that by itself. > >> My initial complaint was that in the flow of reading the instructions >> the pointer to "the solution" was missing. Rather, at the point where >> the reader is supposed to think "ah, yes, that's how to do it", there >> is the conditional statement "If you linearize history". My suggestion >> is to put a big emphasis on the solution by using the words "The >> solution is". >> >> Now, the user can *always* rebase one of the branches on top of the >> other, even if both histories are already published. *But* if both >> were indeed published, then the rebased history must be thrown away, >> and the only thing you learnt from it was where and what and why the >> breakage really was introduced. >> >> Of course we could include a few "ifs" and "unlesses" (about published >> histories), before suggesting to throw away rebased history. But once >> the task is accomplished (find the bogus commit), throwing away the >> rebased history (and continuing at commit D) is always correct, but >> keeping it (and continuing at D*) is not. > > ... So, again, the question for me is if someone does use > rebase in reality in the way that you suggests. Do you? I don't, but if I'd thought a bit further I would have on at least one occasion in the past. Instead I spent two days manually auditing every commit of several branches. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 12:54 ` Steffen Prohaska 2007-11-08 13:22 ` Johannes Sixt 2007-11-08 13:38 ` [PATCH v2] user-manual: add advanced topic "bisecting merges" Andreas Ericsson @ 2007-11-08 14:51 ` Benoit Sigoure 2007-11-08 15:07 ` Steffen Prohaska ` (2 more replies) 2 siblings, 3 replies; 30+ messages in thread From: Benoit Sigoure @ 2007-11-08 14:51 UTC (permalink / raw) To: Steffen Prohaska Cc: Johannes Sixt, Andreas Ericsson, Ralf Wildenhues, Git Mailing List [-- Attachment #1: Type: text/plain, Size: 1490 bytes --] On Nov 8, 2007, at 1:54 PM, Steffen Prohaska wrote: > Do you use rebase like this in real life? > > I thought of the text as background information that might > be helpful for users who want do decide wether to merge or > to rebase. The problem described may be valuable information > supporting a decision about a recommended workflow for a group > of users. You're missing the point. Johannes suggested that you rebase *only* for bisecting purpose. Once you find the culprit commit, throw away your rebased stuff. I've never thought about doing this myself, but it's a very clever way of tackling this problem. It's slightly less convenient if you need to bisect a large portion of the history (that involves many branches and merges) because in this case we'd like to have a magic git-linearize-history <start-treeish> <end-treeish>. Unless this is already easily doable with git-rebase? So to summarize (untested): git merge topic make check <omg something's broken> git reset --hard HEAD~1 # undo the merge git checkout -b wtf git rebase topic master git bisect ... <OK I found the culprit> git checkout master git branch -D wtf git merge --no-commit topic <fix the problem> git commit # merge done, semantic glitch fixed Correct me if I'm wrong. This has *nothing* to do with the fact that you use merge or rebase to do whatever you were doing. -- Benoit Sigoure aka Tsuna EPITA Research and Development Laboratory [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 14:51 ` Benoit Sigoure @ 2007-11-08 15:07 ` Steffen Prohaska 2007-11-08 15:23 ` Johannes Schindelin 2007-11-08 18:38 ` Brian Gernhardt 2 siblings, 0 replies; 30+ messages in thread From: Steffen Prohaska @ 2007-11-08 15:07 UTC (permalink / raw) To: Benoit Sigoure Cc: Johannes Sixt, Andreas Ericsson, Ralf Wildenhues, Git Mailing List On Nov 8, 2007, at 3:51 PM, Benoit Sigoure wrote: > On Nov 8, 2007, at 1:54 PM, Steffen Prohaska wrote: > >> Do you use rebase like this in real life? >> >> I thought of the text as background information that might >> be helpful for users who want do decide wether to merge or >> to rebase. The problem described may be valuable information >> supporting a decision about a recommended workflow for a group >> of users. > > You're missing the point. Johannes suggested that you rebase > *only* for bisecting purpose. Once you find the culprit commit, > throw away your rebased stuff. I got this point. I also noted that it might be time consuming if you need to resolve conflicts. The original discussion in which Junio explained the problem with bisecting merges was about workflows. The question then was if users should _always_ rebase if possible to make bisecting easier. It was really a workflow question. Well, the context of the original discussion is no longer present in the patch I sent. Therefore, Johannes' comments absolutely make sense. Actually I find them really inspiring as I have not thought before about rebasing temporarily, just for finding a commit. Now I learnt that this could be a useful tool. > I've never thought about doing this myself, but it's a very clever > way of tackling this problem. Apparently, you haven't thought about this solution either ;) Steffen ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 14:51 ` Benoit Sigoure 2007-11-08 15:07 ` Steffen Prohaska @ 2007-11-08 15:23 ` Johannes Schindelin 2007-11-08 18:38 ` Brian Gernhardt 2 siblings, 0 replies; 30+ messages in thread From: Johannes Schindelin @ 2007-11-08 15:23 UTC (permalink / raw) To: Benoit Sigoure Cc: Steffen Prohaska, Johannes Sixt, Andreas Ericsson, Ralf Wildenhues, Git Mailing List Hi, On Thu, 8 Nov 2007, Benoit Sigoure wrote: > On Nov 8, 2007, at 1:54 PM, Steffen Prohaska wrote: > > > Do you use rebase like this in real life? > > > > I thought of the text as background information that might be helpful > > for users who want do decide wether to merge or to rebase. The problem > > described may be valuable information supporting a decision about a > > recommended workflow for a group of users. > > You're missing the point. Johannes suggested that you rebase *only* for > bisecting purpose. Once you find the culprit commit, throw away your > rebased stuff. Just to clear things up: it was the other Johannes who suggested it. But I strongly advise not to rebase before bisecting, since you could very well end up changing the behaviour of the program by rebasing it. Even to a point where you cannot bisect it any longer, for example when the merge of two branches contains an important fix without which the combined branches (or even parts of them) will not even compile. Last time I checked, however, bisect worked like a charm even on a history with complicated ancestry. Ciao, Dscho ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] user-manual: add advanced topic "bisecting merges" 2007-11-08 14:51 ` Benoit Sigoure 2007-11-08 15:07 ` Steffen Prohaska 2007-11-08 15:23 ` Johannes Schindelin @ 2007-11-08 18:38 ` Brian Gernhardt 2 siblings, 0 replies; 30+ messages in thread From: Brian Gernhardt @ 2007-11-08 18:38 UTC (permalink / raw) To: Benoit Sigoure Cc: Steffen Prohaska, Johannes Sixt, Andreas Ericsson, Ralf Wildenhues, Git Mailing List On Nov 8, 2007, at 9:51 AM, Benoit Sigoure wrote: > You're missing the point. Johannes suggested that you rebase *only* > for bisecting purpose. Once you find the culprit commit, throw away > your rebased stuff. I've never thought about doing this myself, but > it's a very clever way of tackling this problem. It's slightly less > convenient if you need to bisect a large portion of the history > (that involves many branches and merges) because in this case we'd > like to have a magic git-linearize-history <start-treeish> <end- > treeish>. Unless this is already easily doable with git-rebase? I don't think you have to linearize it before bisecting. If you bisect and discover that it was due to a merge, then you can rebase *that merge* to discover what part of the merge caused the issue. ~~ Brian ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] user-manual: add advanced topic "bisecting merges" 2007-11-04 9:16 [PATCH] user-manual: add advanced topic "bisecting merges" Steffen Prohaska 2007-11-04 11:23 ` Ralf Wildenhues @ 2007-11-04 13:50 ` Benoit SIGOURE 1 sibling, 0 replies; 30+ messages in thread From: Benoit SIGOURE @ 2007-11-04 13:50 UTC (permalink / raw) To: Steffen Prohaska; +Cc: git [-- Attachment #1: Type: text/plain, Size: 1257 bytes --] On Nov 4, 2007, at 10:16 AM, Steffen Prohaska wrote: > diff --git a/Documentation/user-manual.txt b/Documentation/user- > manual.txt > index d99adc6..480e7c1 100644 > --- a/Documentation/user-manual.txt > +++ b/Documentation/user-manual.txt > @@ -934,6 +934,95 @@ Figuring out why this works is left as an > exercise to the (advanced) [...] > + > +Imagine this history. > + > +................................................ > + ---Z---o---X---...---o---A---C---D > + \ / > + o---o---Y---...---o---B > +................................................ > + I don't know how you chose these letters, but I don't find them particularly intuitive to remember. ................................................ ---B---o---X---...---o---Y---M---H \ / o---o---T---...---o---Z ................................................ would be better (IMO): B stands for "Branch point" M stands for "Merge" H stands for "HEAD" T stands for "Topic" Otherwise thanks for documenting this, the patch looks fine to me. -- Benoit Sigoure aka Tsuna EPITA Research and Development Laboratory [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2007-11-18 23:18 UTC | newest] Thread overview: 30+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-11-04 9:16 [PATCH] user-manual: add advanced topic "bisecting merges" Steffen Prohaska 2007-11-04 11:23 ` Ralf Wildenhues 2007-11-07 21:50 ` [PATCH v2] " Steffen Prohaska 2007-11-07 22:16 ` Benoit Sigoure 2007-11-07 22:26 ` J. Bruce Fields 2007-11-08 6:40 ` Steffen Prohaska 2007-11-08 7:19 ` Johannes Sixt 2007-11-08 8:59 ` Steffen Prohaska 2007-11-08 9:11 ` Johannes Sixt 2007-11-08 9:33 ` Andreas Ericsson 2007-11-08 9:53 ` Johannes Sixt 2007-11-08 12:54 ` Steffen Prohaska 2007-11-08 13:22 ` Johannes Sixt 2007-11-08 14:55 ` Steffen Prohaska 2007-11-10 9:48 ` [PATCH v3] " Steffen Prohaska 2007-11-10 10:36 ` Junio C Hamano 2007-11-10 11:16 ` Steffen Prohaska 2007-11-10 13:49 ` [PATCH v4] user-manual: Add section "Why bisecting merge commits can be harder ..." Steffen Prohaska 2007-11-10 19:10 ` Linus Torvalds 2007-11-10 20:25 ` Junio C Hamano 2007-11-10 22:41 ` Steffen Prohaska 2007-11-18 3:59 ` J. Bruce Fields 2007-11-18 9:47 ` Steffen Prohaska 2007-11-18 23:18 ` J. Bruce Fields 2007-11-08 13:38 ` [PATCH v2] user-manual: add advanced topic "bisecting merges" Andreas Ericsson 2007-11-08 14:51 ` Benoit Sigoure 2007-11-08 15:07 ` Steffen Prohaska 2007-11-08 15:23 ` Johannes Schindelin 2007-11-08 18:38 ` Brian Gernhardt 2007-11-04 13:50 ` [PATCH] " Benoit SIGOURE
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).