* Rebase, please help @ 2007-04-11 1:52 Alexander Litvinov 2007-04-11 7:38 ` Junio C Hamano 2007-04-11 7:48 ` Alex Riesen 0 siblings, 2 replies; 8+ messages in thread From: Alexander Litvinov @ 2007-04-11 1:52 UTC (permalink / raw) To: git Hello list. I have found that rebase have (new) option : --merge Looking at the code show me that regular rebase is a simply format-patch and am but --merge (or -s) use some merge stratyegy to merge changes between two commits into current head. What is --merge for ? Will the result be the same ? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-11 1:52 Rebase, please help Alexander Litvinov @ 2007-04-11 7:38 ` Junio C Hamano 2007-04-11 10:10 ` Andy Parkins 2007-04-11 7:48 ` Alex Riesen 1 sibling, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2007-04-11 7:38 UTC (permalink / raw) To: Alexander Litvinov; +Cc: git Alexander Litvinov <litvinov2004@gmail.com> writes: > I have found that rebase have (new) option : --merge > Looking at the code show me that regular rebase is a simply format-patch and > am but --merge (or -s) use some merge stratyegy to merge changes between two > commits into current head. > > What is --merge for ? Will the result be the same ? Regular "rebase" uses "format-patch" piped to "am -3", so if you do not have renames the file-level patch conflict can be resolved using the 3-way merge logic. However, because we do not give -M to format-patch, it does not deal with case where you have renames in the series of commits you are rebasing, nor where you have renames between the current base commit and the commit you are rebasing onto (the latter won't be solved with giving -M to format-patch anyway, so we do not even try). In cases involving such renames, giving --merge option would probably be nicer to work with. It invokes merge-recursive logic to deal with the renames. I find that the regular rebase without --merge is faster (at least it feels to me that it is, and I kind of understand why; patch application to write out a tree is optimized to take advantage of cache-tree extension, as opposed to merging three trees which clobbers it), when there is no patch conflict. Since most rebases do not involve patch conflict for me and seldom involve rebases, I almost never use --merge myself, but this would depend highly on personal taste and project. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-11 7:38 ` Junio C Hamano @ 2007-04-11 10:10 ` Andy Parkins 2007-04-12 20:37 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Andy Parkins @ 2007-04-11 10:10 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Alexander Litvinov On Wednesday 2007 April 11 08:38, Junio C Hamano wrote: > I find that the regular rebase without --merge is faster (at > least it feels to me that it is, and I kind of understand why; This is interesting and brings to mind a difficult I've had. I had problems with rebase when rebasing chains with a file that was self-similar. Indulge me for a while with this example (forgive the C++, but that's where I had this problem): class A : public C { // ... int someVirtualOverride(n) { return ArrayA[n]; } // ... } class B : public C { // ... int someVirtualOverride(n) { return ArrayB[n]; } // ... } One patch changed "ArrayX[n]" to "Array.at(n)" and another inserted more similar classes around these two. When I was rebasing, some strange things happened (without any conflict warnings): class D : public C { int someVirtualOverride(n) { return ArrayA.at(n); } } class A : public C { int someVirtualOverride(n) { return ArrayB.at(n); } } class B : public C { int someVirtualOverride(n) { return ArrayB[n]; } } Notice that the arrays don't match up with the classes. By some crazy coincidence and the strong similarity between localities within the file, the patch successfully applied in the wrong place. The fix was easy enough to do manually, but it needed a bit of untangling as this was in a longish chain of revisions that I was rebasing. I didn't mind much, and hence didn't report it as a bug as I guessed it was to do with git-rebase using git-am. The annoying part was actually that there was no conflict warning and hence the rest of the chain applied, making it all the more difficult to untangle. My question then is this: given that I don't care about speed of rebase, is it safe to permanently use --merge with rebase, and would that have caught the error in the above case? Andy -- Dr Andy Parkins, M Eng (hons), MIET andyparkins@gmail.com ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-11 10:10 ` Andy Parkins @ 2007-04-12 20:37 ` Junio C Hamano 2007-04-12 21:22 ` Linus Torvalds 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2007-04-12 20:37 UTC (permalink / raw) To: Andy Parkins; +Cc: git, Alexander Litvinov, Linus Torvalds Andy Parkins <andyparkins@gmail.com> writes: > This is interesting and brings to mind a difficult I've had. > I had problems with rebase when rebasing chains with a file > that was self-similar. Indulge me for a while with this > example (forgive the C++, but that's where I had this > problem): > > class A : public C > { > // ... > > int someVirtualOverride(n) { return ArrayA[n]; } > > // ... > } > > class B : public C > { > // ... > > int someVirtualOverride(n) { return ArrayB[n]; } > > // ... > } > > One patch changed "ArrayX[n]" to "Array.at(n)" and another inserted more > similar classes around these two. > > When I was rebasing, some strange things happened (without any conflict > warnings): > > class D : public C > { > int someVirtualOverride(n) { return ArrayA.at(n); } > } > > class A : public C > { > int someVirtualOverride(n) { return ArrayB.at(n); } > } > > class B : public C > { > int someVirtualOverride(n) { return ArrayB[n]; } > } > > Notice that the arrays don't match up with the classes. By > some crazy coincidence and the strong similarity between > localities within the file, the patch successfully applied in > the wrong place. A patch that can ambiguously apply to multiple places is indeed a problem, and in such situations merge based rebase is probably safer as it can take advantage of the whole file as the context. But it brings up another interesting point. The ambiguous patch is a problem even more so outside the context of rebasing, for another reason. When rebasing, you are dealing with your own changes and you know what and how you want each of them to change the tree state, as opposed to applying somebody else's patch outside the context of rebasing. When we only have the patch text (i.e. applymbox), there is no "merge to use the whole file as the context" fallback. I wonder if this is a common enough problem that we would want to make it safer somehow... [jc: Since I happen to know somebody who applies more patches in one week than anybody else would ever apply in the lifetime ;-), I am CC'ing that person] I can see two possible improvements. - On the diff generation side, we could notice that the hunk we are going to output can be applied to more than one location, and automatically widen the context for it. This is only a half-solution, as many patches do not even come from git. - Inside git-apply, apply_one_fragment(), ask find_offset() if the hunk can match more than one location, and exit with an error status (still writing out the patch results if it otherwise applies cleanly) so that the user can manually inspect and confirm. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-12 20:37 ` Junio C Hamano @ 2007-04-12 21:22 ` Linus Torvalds 0 siblings, 0 replies; 8+ messages in thread From: Linus Torvalds @ 2007-04-12 21:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: Andy Parkins, git, Alexander Litvinov On Thu, 12 Apr 2007, Junio C Hamano wrote: > Andy Parkins <andyparkins@gmail.com> writes: > > > > When I was rebasing, some strange things happened (without any conflict > > warnings): > > A patch that can ambiguously apply to multiple places is indeed > a problem, and in such situations merge based rebase is probably > safer as it can take advantage of the whole file as the context. Indeed. It's one of the reasons I think the default "patch" behaviour of allowing some fuzz in the patch is totally broken, and why "git apply" defaults to a much stricter "no context differences allowed" behaviour. That still doesn't entirely get *rid* of the problem, but it at least makes it slightly less common. You can still get a patch that applies cleanly if you have certain multi-line patterns that are very common, and that end up causing patch to find the wrong place to apply a patch, and still make it look right enough. This is just one reason why patch-based systems are totally and utterly broken, and why I detested the original cogito patch-based merging. The three-way merge behaviour ("git rebase --merge") is a lot safer, but it's obviously more expensive too. Also, the reason a lot of people like patch-based setups is not only is it efficient, but it turns out too many people prefer "clean merges" even *despite* the dangers. See the recent patches that were floating around on "stgit" that actually make applying patches default to the same *broken*defaults* as standard "patch" does (see the emails with subject line "Pass -C1 to git-apply in StGIT ..." for more on that). So a lot of people prefer the less strict patch application rules, because *most* of the time it actually does the right thing. Never mind that it makes a fundamental problem with patches even worse. Me, I'd prefer to have the patch fail early. But even failing early does not _guarantee_ that it applies in the right place. > But it brings up another interesting point. The ambiguous patch > is a problem even more so outside the context of rebasing, for > another reason. When rebasing, you are dealing with your own > changes and you know what and how you want each of them to > change the tree state, as opposed to applying somebody else's > patch outside the context of rebasing. > > When we only have the patch text (i.e. applymbox), there is no > "merge to use the whole file as the context" fallback. I wonder > if this is a common enough problem that we would want to make it > safer somehow... We could make "git apply" also refuse to move more than a few lines by default (ie not only does the context have to be exact, it has to actually show up where the patch claims it should be!) git-apply still allows arbitrary line offset differences (see "find_offset()"). > I can see two possible improvements. > > - On the diff generation side, we could notice that the hunk > we are going to output can be applied to more than one > location, and automatically widen the context for it. > > This is only a half-solution, as many patches do not even > come from git. It's not even a solution _within_ git. Since a patch will always apply at the right place if no changes have been done (because we always start looking at the line number where the patch fragment _claims_ it should go), the problem only occurs when independent changes have been done to the target. And those independent changes may obviously be the ones that *introduce* the new location that the patch can (incorrectly) apply to. > - Inside git-apply, apply_one_fragment(), ask find_offset() if > the hunk can match more than one location, and exit with an > error status (still writing out the patch results if it > otherwise applies cleanly) so that the user can manually > inspect and confirm. Yes, that might be a good idea, but is going to be pretty expensive. "find_offset()" wasn't exactly written to be a model of efficiency. See the comment about me not being one of the "smart and beautiful" people. So you'd want to have some smarter method of finding potential places to apply the fragment if you want to do those kinds of things. Like creating hashes of the line contents in order to not have to compare the whole fragment.. And EVEN THEN it wouldn't actually solve the problem. The most common case is simply: - somebody *already* fixed the same bug by a very similar patch (or an identical one), and thus the patch obviously won't apply, since the place it *should* apply to got changed. - so git-apply will look for another place to apply it, and it's quite possible that there is just one such place - even though it's the exact wrong place! So it really does boil down to: patches will sometimes falsely apply at the wrong place. That's just in the fundamental nature of patches. The same way a three-way merge can sometimes generate total crap, even when it merged totally cleanly. It's unusual enough that most of the time it doesn't happen (and I think it happens less with three-way merges than with patches), and hopefully most of the time the error will be obvious enough that it gets noticed quickly (ie the badly patched sources simply won't build any more!). But there is no practical way to guarantee it cannot happen. The only way to guarantee that patches apply correctly is to make sure the source file matches *exactly*. But that kind of defeats the whole point of sending patches around in the first place. Linus ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-11 1:52 Rebase, please help Alexander Litvinov 2007-04-11 7:38 ` Junio C Hamano @ 2007-04-11 7:48 ` Alex Riesen 2007-04-11 9:46 ` Junio C Hamano 1 sibling, 1 reply; 8+ messages in thread From: Alex Riesen @ 2007-04-11 7:48 UTC (permalink / raw) To: Alexander Litvinov; +Cc: git On 4/11/07, Alexander Litvinov <litvinov2004@gmail.com> wrote: > > What is --merge for ? Will the result be the same ? Maybe, maybe not. It uses merge strategies instead of git-am and has advantages over blindly applying the patches (it can know how a change got in, and it uses resolved conflict cache). ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-11 7:48 ` Alex Riesen @ 2007-04-11 9:46 ` Junio C Hamano 2007-04-11 11:32 ` Alex Riesen 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2007-04-11 9:46 UTC (permalink / raw) To: Alex Riesen; +Cc: Alexander Litvinov, git "Alex Riesen" <raa.lkml@gmail.com> writes: > On 4/11/07, Alexander Litvinov <litvinov2004@gmail.com> wrote: >> >> What is --merge for ? Will the result be the same ? > > Maybe, maybe not. It uses merge strategies instead of git-am > and has advantages over blindly applying the patches (it can > know how a change got in, and it uses resolved conflict cache). I think "blindly applying the patches" is a gross overstatement, as merge-recursive will get confused the same way as git-apply, when a difference that comes from the two commits can be applied to two places. When the forward-ported change gets conflict, 3-way merge logic in "git-am -3" kicks in and does a fall back to merge-recursive on a reconstructed tree that has only the paths relevant to the case. With "format-patch piped to am-3", we could give the -M option to "format-patch" to deal with renames that happen in the series you are rebasing, but renames between the bases (the original base commit for the series and the new "onto" commit) is not something it can handle sensibly. That is the true advantage --merge has over "format-patch piped to am-3", as it always drives merge-recursive and it can notice renames between the two bases. But always driving merge-recursive is also its weakness. When the series being rebased is simple and long, especially on a big tree, applying many patches without conflicts tends to outperform running the same number of merges, as the patch application is tuned to take advantage of cache-tree while read-tree based merge essentially trashes cache-tree, and has to pay the full cost of write-tree for every commit it makes. Also there is that small D/F conflict problem merge-recursive has that I told you about, which does not exist in git-apply ;-) Did you have a chance to take a look at it yet? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Rebase, please help 2007-04-11 9:46 ` Junio C Hamano @ 2007-04-11 11:32 ` Alex Riesen 0 siblings, 0 replies; 8+ messages in thread From: Alex Riesen @ 2007-04-11 11:32 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git On 4/11/07, Junio C Hamano <junkio@cox.net> wrote: > Also there is that small D/F conflict problem merge-recursive > has that I told you about, which does not exist in git-apply ;-) > Did you have a chance to take a look at it yet? not yet. I was greatly distracted by subprojects ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-04-12 21:23 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-04-11 1:52 Rebase, please help Alexander Litvinov 2007-04-11 7:38 ` Junio C Hamano 2007-04-11 10:10 ` Andy Parkins 2007-04-12 20:37 ` Junio C Hamano 2007-04-12 21:22 ` Linus Torvalds 2007-04-11 7:48 ` Alex Riesen 2007-04-11 9:46 ` Junio C Hamano 2007-04-11 11:32 ` Alex Riesen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).