* inexplicable failure to merge recursively across cherry-picks
@ 2007-10-10 1:55 martin f krafft
2007-10-10 2:54 ` Linus Torvalds
0 siblings, 1 reply; 11+ messages in thread
From: martin f krafft @ 2007-10-10 1:55 UTC (permalink / raw)
To: git discussion list
Hi folks,
I hope this is not a daily series of mine, being confused about Git
merging, but I've run my head against a wall again and before
I crush my skull, I'd prefer to reach out to you to help me regain
an understanding.
This is about the new mdadm for Debian packaging effort. You can
clone from git://git.debian.org/git/pkg-mdadm/mdadm-new.git and see
the repo at
http://git.debian.org/?p=pkg-mdadm/mdadm-new.git;a=summary. Do not
track as this repo is subject to change.
My master branch was last merged with upstream's mdadm-2.6.2 tag
(commit 263a535). Since then, I've committed a couple of changes to
master including three cherry-picks from upstream since mdadm-2.6.2.
I tagged upstream at the point which I want to merge into master:
mdadm-2.6.3+200709292116+4450e59. When I merge that tag into master,
I get a merge conflict on Monitor.c:
<<<<<<< HEAD:Monitor.c
if (mse->devnum != MAXINT &&
=======
if (mse->devnum != INT_MAX &&
>>>>>>> upstream:Monitor.c
There are five commits between mdadm-2.6.2 and
mdadm-2.6.3+200709292116+4450e59 that affect Monitor.c:
01d9299
e4dc510
* 66f8bbb
98127a6
4450e59
The third commit, the one with the asterisk is the one that
I cherry-picked (as 845eef9); the other two cherries I picked do not
touch Monitor.c.
The fifth/last commit (4450e59) is the one responsible for the
change which seems to cause the conflict. It is the *only* commit
since the common ancestor of *both* branches that touches the
conflicting lines.
The fourth commit (98127a6) inserts a single line at the top of the
file, so that's nothing that would cause a conflict.
To be honest, I can't explain it. But I didn't give up.
I branched master2 off 845eef9b~1, cherry-picked the first two
commits that touch Monitor.c, cherry-picked all the commits
845eef9b..master into master2 and merge upstream...
... to get exactly the same conflict in exactly the same line in
exactly the same file.
What is going on? Am I seriously overestimating Git's merging
capacities, or do I have a bug in my brain?
--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
"the only difference between shakespeare and you
was the size of his idiom list -- not the size of his vocabulary."
-- alan perlis
spamtraps: madduck.bogus@madduck.net
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 1:55 inexplicable failure to merge recursively across cherry-picks martin f krafft
@ 2007-10-10 2:54 ` Linus Torvalds
2007-10-10 10:25 ` martin f krafft
0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2007-10-10 2:54 UTC (permalink / raw)
To: martin f krafft; +Cc: git discussion list

On Wed, 10 Oct 2007, martin f krafft wrote:
>
> There are five commits between mdadm-2.6.2 and
> mdadm-2.6.3+200709292116+4450e59 that affect Monitor.c:
>
>   01d9299
>   e4dc510
> * 66f8bbb
>   98127a6
>   4450e59
>
> The third commit, the one with the asterisk is the one that
> I cherry-picked (as 845eef9); the other two cherries I picked do not
> touch Monitor.c.

Side note - run

    gitk --merge

when you have a merge conflict, and it will basically show you the thing
graphically (ie history as it is relevant to the merge, and only to the
files that get conflicts).

But basically, both sides have modified the code *around* that line, and
they have modified it differently. Do this in your partial merge tree on
'master':

    git diff ...mdadm-2.6.3+200709292116+4450e59 Monitor.c
    git diff mdadm-2.6.3+200709292116+4450e59... Monitor.c

which will show you the diff from the common base ancestor. And in
particular, it will show how one branch did this:

    @@ -399,9 +401,8 @@ int Monitor(mddev_dev_t devlist,
                 struct mdstat_ent *mse;
                 for (mse=mdstat; mse; mse=mse->next)
                     if (mse->devnum != MAXINT &&
    -                    (strcmp(mse->level, "raid1")==0 ||
    -                     strcmp(mse->level, "raid5")==0 ||
    -                     strcmp(mse->level, "multipath")==0)
    +                    (strcmp(mse->level, "raid0")!=0 &&
    +                     strcmp(mse->level, "linear")!=0)
                         ) {
                         struct state *st = malloc(sizeof *st);
                         mdu_array_info_t array;

and the other one did

    @@ -398,10 +402,9 @@ int Monitor(mddev_dev_t devlist,
             if (scan) {
                 struct mdstat_ent *mse;
                 for (mse=mdstat; mse; mse=mse->next)
    -                if (mse->devnum != MAXINT &&
    -                    (strcmp(mse->level, "raid1")==0 ||
    -                     strcmp(mse->level, "raid5")==0 ||
    -                     strcmp(mse->level, "multipath")==0)
    +                if (mse->devnum != INT_MAX &&
    +                    (strcmp(mse->level, "raid0")!=0 &&
    +                     strcmp(mse->level, "linear")!=0)
                         ) {
                         struct state *st = malloc(sizeof *st);
                         mdu_array_info_t array;

And now maybe git's behaviour makes more sense. See?

You basically had two different branches that made *almost* the same
changes to the same area, but not quite. So how is git to know which one
was the *right* one to pick? The one that changed the

    "if (mse->devnum != MAXINT &&"

line, or the one that left it alone?

> I branched master2 off 845eef9b~1, cherry-picked the first two
> commits that touch Monitor.c, cherry-picked all the commits
> 845eef9b..master into master2 and merge upstream...

Cherry-picking is immaterial. It doesn't matter how the changes come into
the tree. It doesn't matter what the history is. The only thing git cares
about is the content, and the end result.

Git knows that the two branches got to two different end results. They
were identical except for that one line, and it asks you to say which
branch was "right" wrt that one line.

In other words, git never looks at individual commits when trying to
merge. It doesn't try to figure out what the "meaning" of the changes are,
it purely looks at the content.

And btw, it *has* to work that way, because if you don't work that way,
then you get different results depending on which path the development
took (eg you might get different results if something was considered a
"revert", for example, or if something was split up into two patches on
one side but not the other etc etc).

But in this case it's pretty obvious: the commit from one side (the one
that changes MAXINT->INT_MAX: "Monitor.c s/MAXINT/INT_MAX/g") merged fine
into the result *except* for that one section that had touched the same
general area for other reasons. And that one area was seen as a conflict
because of those other reasons being in the same hunk.

And yes, in this case the "other reasons" happened to be cherry-picked and
thus "the same" on both sides, but that doesn't mean that they should have
been considered in any way special: the result of cherry-picking is
context-dependent and is a part of the history of the target, and does not
equate any kind of "identity" with the source.

Cherry-picking is 100% equivalent to re-doing the commit entirely, and for
all git knows, there was a reason why it wasn't done together with that
s/MAXINT/INT_MAX/g change.

(Not that git even thinks in those terms - git literally just says: "I
cannot resolve the _content_ independently and without understanding the
history and thinking behind the differences, so you'd better help me")

Linus

^ permalink raw reply [flat|nested] 11+ messages in thread
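The content-only rule described in the message above can be sketched in
Python. This is a simplified illustration, not git's actual merge code:
merge_hunk is a hypothetical helper, and it treats the whole changed region
as one unit, whereas git compares line by line and reports only the part
that still differs. The three inputs are the same region as found in the
common ancestor and in each branch, taken from the two diffs quoted above.

```python
def merge_hunk(base, ours, theirs):
    """Three-way merge of one region, looking only at content.

    Returns (merged_lines, conflict). No commit history is consulted.
    """
    if ours == theirs:      # both sides ended up with the same text
        return ours, False
    if ours == base:        # only their side changed the region
        return theirs, False
    if theirs == base:      # only our side changed the region
        return ours, False
    return None, True       # both changed it, with different results


# Region in the common ancestor (mdadm-2.6.2).
base = [
    'if (mse->devnum != MAXINT &&',
    '    (strcmp(mse->level, "raid1")==0 ||',
    '     strcmp(mse->level, "raid5")==0 ||',
    '     strcmp(mse->level, "multipath")==0)',
]
# master: has the cherry-picked raid0/linear change only.
ours = [
    'if (mse->devnum != MAXINT &&',
    '    (strcmp(mse->level, "raid0")!=0 &&',
    '     strcmp(mse->level, "linear")!=0)',
]
# upstream: has the same change plus s/MAXINT/INT_MAX/.
theirs = [
    'if (mse->devnum != INT_MAX &&',
    '    (strcmp(mse->level, "raid0")!=0 &&',
    '     strcmp(mse->level, "linear")!=0)',
]

merged, conflict = merge_hunk(base, ours, theirs)
print("conflict:", conflict)
```

Nothing in the inputs records that two of the three changed lines arrived
by cherry-pick; only the three versions of the text are compared, and both
branches differ from the base in different ways.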
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 2:54 ` Linus Torvalds
@ 2007-10-10 10:25 ` martin f krafft
2007-10-10 10:33 ` David Kastrup
2007-10-10 15:25 ` Linus Torvalds
0 siblings, 2 replies; 11+ messages in thread
From: martin f krafft @ 2007-10-10 10:25 UTC (permalink / raw)
To: git discussion list; +Cc: Linus Torvalds

also sprach Linus Torvalds <torvalds@linux-foundation.org> [2007.10.10.0354 +0100]:
> Cherry-picking is immaterial. It doesn't matter how the changes
> come into the tree. It doesn't matter what the history is. The
> only thing git cares about is the content, and the end result.

This is the part I over-estimated. I thought that Git would figure
out that commits 1-3 had been merged into the target and thus apply,
in sequence, only the commits from the source which had not been
merged.

Many thanks (again), Linus! Looking forward to your next content
manager; you know, the one with artificial intelligence built in!
You could call it "wit" :)

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

this is a manually generated email. it contains typos and is valid
even without capital letters.

spamtraps: madduck.bogus@madduck.net

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 10:25 ` martin f krafft
@ 2007-10-10 10:33 ` David Kastrup
2007-10-10 15:25 ` Linus Torvalds
1 sibling, 0 replies; 11+ messages in thread
From: David Kastrup @ 2007-10-10 10:33 UTC (permalink / raw)
To: martin f krafft; +Cc: git discussion list, Linus Torvalds

martin f krafft <madduck@madduck.net> writes:

> also sprach Linus Torvalds <torvalds@linux-foundation.org> [2007.10.10.0354 +0100]:
>> Cherry-picking is immaterial. It doesn't matter how the changes
>> come into the tree. It doesn't matter what the history is. The
>> only thing git cares about is the content, and the end result.
>
> This is the part I over-estimated. I thought that Git would figure
> out that commits 1-3 had been merged into the target and thus apply,
> in sequence, only the commits from the source which had not been
> merged.
>
> Many thanks (again), Linus! Looking forward to your next content
> manager; you know, the one with artificial intelligence built in!
> You could call it "wit" :)

Well, there is also an obvious name choice when the distinguishing
innovation is a well-rounded feature set, but it would cause a name
collision for the equivalent of "tig".

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 10:25 ` martin f krafft
2007-10-10 10:33 ` David Kastrup
@ 2007-10-10 15:25 ` Linus Torvalds
2007-10-10 15:48 ` David Brown
2007-10-11 21:51 ` Sam Vilain
1 sibling, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2007-10-10 15:25 UTC (permalink / raw)
To: martin f krafft; +Cc: git discussion list

On Wed, 10 Oct 2007, martin f krafft wrote:

> also sprach Linus Torvalds <torvalds@linux-foundation.org> [2007.10.10.0354 +0100]:
> > Cherry-picking is immaterial. It doesn't matter how the changes
> > come into the tree. It doesn't matter what the history is. The
> > only thing git cares about is the content, and the end result.
>
> This is the part I over-estimated. I thought that Git would figure
> out that commits 1-3 had been merged into the target and thus apply,
> in sequence, only the commits from the source which had not been
> merged.

Yes, *some* SCM's have tried to do that. In particular, the ones that are
"patch-based" tend to think that patches are "identical" regardless of
where they are, and while re-ordering of them is a special event, it's not
something that changes the fundamental 'ID' of the patch.

For example, I think the darcs "patch algebra" works that way.

It's a really horrible model. Not only doesn't it scale, but it leads to
various very strange linkages between patches, and it fails the most
important part: it means that merges get different results just because
people are doing the same changes two different ways.

> Many thanks (again), Linus! Looking forward to your next content
> manager; you know, the one with artificial intelligence built in!
> You could call it "wit" :)

Well, the git model is really largely the reverse: the system is supposed
to be as *stupid* as humanly possible, but:

 - make it predictable exactly because it's stupid and doesn't do anything
   even half-ways smart.

   This is part of the "it doesn't matter *how* you got to a particular
   state, git will always do the same thing regardless of whether you
   moved an existing patch around or whether you re-did the changes as
   (possibly more than one) new and unrelated commits".

 - conflicts aren't bad - they're *good*. Trying to aggressively resolve
   them automatically when two branches have done slightly different
   things in the same area is stupid and just results in more problems.

   Instead, git tries to do what I don't think *anybody* else has done:
   make the conflicts easy to resolve, by allowing you to work with them
   in your normal working tree, and still giving you a lot of tools to
   help you see what's going on.

So git doesn't try to avoid conflicts per se: the merge strategies are
fundamentally pretty simple (rename detection and the whole "recursive
merge" thing may not be simple code, but the concepts are pretty
straightforward), and they handle all the really *obvious* cases, but at
the same time, I feel strongly that anything even half-way subtle should
not be left to the SCM - the SCM should show it and make it really easy
for the user to then fix it up.

Side note: even with a totally obvious three-way merge, with absolutely
zero conflicts even remotely close to each other, you can have the merge
algorithm generate a good merge that doesn't actually *work*.

For example, it's happened a few times that one branch renames a structure
member name (and changes all the uses) and another branch adds new code
that uses the old member name. The end result: the code will *merge* fine,
and there are zero conflicts in the content, because all the changes were
totally disjoint, but the end result doesn't actually work or even
compile!

So no merge strategy is ever perfect. The git approach is to be simple and
predictable, and also to make it easy to fix up (ie even if you get the
above kind of automatic merge problem, if you catch it in compiling, you
can fix it up, and do a "git commit --amend" to fix up the merge itself
before you push it out).

Linus

^ permalink raw reply [flat|nested] 11+ messages in thread
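The side note in the message above, a merge with zero conflicts whose
result does not work, can be reproduced with a small Python sketch. The
setup is an assumption made for illustration: a "file" is modelled as a
dict of named regions, merge_regions is a hypothetical helper applying the
plain three-way rule to each region, and the Stat, bump and report names
are invented. One branch renames a member everywhere; the other adds new
code that still uses the old name.

```python
def merge_regions(base, ours, theirs):
    """Three-way merge a file modelled as {region_name: text}."""
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:
            pick = o            # same result on both sides
        elif o == b:
            pick = t            # only their side touched it
        elif t == b:
            pick = o            # only our side touched it
        else:
            conflicts.append(name)
            continue
        if pick is not None:    # None means the region does not exist
            merged[name] = pick
    return merged, conflicts


base = {
    "a_struct": "class Stat:\n    def __init__(self):\n        self.count = 0\n",
    "b_bump": "def bump(s):\n    s.count += 1\n    return s\n",
}
# Branch 1 renames the member 'count' to 'total' and updates every use.
ours = {key: text.replace("count", "total") for key, text in base.items()}
# Branch 2 adds new code that uses the old member name.
theirs = dict(base, c_report="def report(s):\n    return s.count\n")

merged, conflicts = merge_regions(base, ours, theirs)
source = "\n".join(merged[name] for name in sorted(merged))

# The merge is clean, but running the merged program fails.
namespace = {}
exec(source, namespace)
try:
    namespace["report"](namespace["bump"](namespace["Stat"]()))
    broken = False
except AttributeError:
    broken = True

print("conflicts:", conflicts)
print("merged code is broken:", broken)
```

The changes are disjoint at the content level, so no merge rule that looks
only at text can flag the problem; it shows up when the result is built or
run, which is the point at which it can be fixed and folded into the merge
commit.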
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 15:25 ` Linus Torvalds
@ 2007-10-10 15:48 ` David Brown
2007-10-10 19:07 ` Miklos Vajna
2007-10-11 0:08 ` Miklos Vajna
2007-10-11 21:51 ` Sam Vilain
1 sibling, 2 replies; 11+ messages in thread
From: David Brown @ 2007-10-10 15:48 UTC (permalink / raw)
To: Linus Torvalds; +Cc: martin f krafft, git discussion list

On Wed, Oct 10, 2007 at 08:25:15AM -0700, Linus Torvalds wrote:

>Yes, *some* SCM's have tried to do that. In particular, the ones that are
>"patch-based" tend to think that patches are "identical" regardless of
>where they are, and while re-ordering of them is a special event, it's not
>something that changes the fundamental 'ID' of the patch.
>
>For example, I think the darcs "patch algebra" works that way.
>
>It's a really horrible model. Not only doesn't it scale, but it leads to
>various very strange linkages between patches, and it fails the most
>important part: it means that merges get different results just because
>people are doing the same changes two different ways.

Actually, specifically darcs, different merges _always_ result in the same
data. It's a fundamental part of its patch algebra. No matter what order
you apply a given set of patches, even with conflicts and reordering, you
always get the same result, or no result. Conflicts are "resolved" by
inserting conflict markers in the file, ordered by the patch ID. It
doesn't matter which order you apply them in, you get the same markers.
Then there will be a merge patch which fixes the markers that someone could
apply, no matter what order they applied the previous patches.

Darcs breaks down in a few places, though.

  - The no result. Sometimes, it just can't figure out how to reorder
    patches. Even worse, occasionally, the implementation will fail to
    terminate trying to figure this out. There isn't much to do at this
    point, except manually apply the patch, hence generating a new patch
    ID.

  - It doesn't scale well.

The strange linkages between patches could be thought of as a feature,
since it is basically constraining the order that the patches can be
applied in.

There is a darcs-git project that tries to do the darcs things on top of
git.

Dave

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 15:48 ` David Brown
@ 2007-10-10 19:07 ` Miklos Vajna
2007-10-10 19:35 ` Linus Torvalds
2007-10-11 0:08 ` Miklos Vajna
1 sibling, 1 reply; 11+ messages in thread
From: Miklos Vajna @ 2007-10-10 19:07 UTC (permalink / raw)
To: David Brown; +Cc: Linus Torvalds, martin f krafft, git discussion list

On Wed, Oct 10, 2007 at 08:48:31AM -0700, David Brown <git@davidb.org> wrote:
> On Wed, Oct 10, 2007 at 08:25:15AM -0700, Linus Torvalds wrote:
>
> >Yes, *some* SCM's have tried to do that. In particular, the ones that
> >are "patch-based" tend to think that patches are "identical" regardless
> >of where they are, and while re-ordering of them is a special event,
> >it's not something that changes the fundamental 'ID' of the patch.
> >
> >For example, I think the darcs "patch algebra" works that way.
> >
> >It's a really horrible model. Not only doesn't it scale, but it leads
> >to various very strange linkages between patches, and it fails the most
> >important part: it means that merges get different results just because
> >people are doing the same changes two different ways.
>
> Actually, specifically darcs, different merges _always_ result in the same
> data. It's a fundamental part of its patch algebra. No matter what order
> you apply a given set of patches, even with conflicts and reordering, you
> always get the same result, or no result. Conflicts are "resolved" by
> inserting conflict markers in the file, ordered by the patch ID. It
> doesn't matter which order you apply them in, you get the same markers.
> Then there will be a merge patch which fixes the markers that someone could
> apply, no matter what order they applied the previous patches.
>
> Darcs breaks down in a few places, though.
>
>   - The no result. Sometimes, it just can't figure out how to reorder
>     patches. Even worse, occasionally, the implementation will fail to
>     terminate trying to figure this out. There isn't much to do at this
>     point, except manually apply the patch, hence generating a new patch
>     ID.
>
>   - It doesn't scale well.
>
> The strange linkages between patches could be thought of as a feature,
> since it is basically constraining the order that the patches can be
> applied in.
>
> There is a darcs-git project that tries to do the darcs things on top of
> git.
>
> Dave
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

thanks,
- VMiklos

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 19:07 ` Miklos Vajna
@ 2007-10-10 19:35 ` Linus Torvalds
0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2007-10-10 19:35 UTC (permalink / raw)
To: Miklos Vajna; +Cc: David Brown, martin f krafft, git discussion list

On Wed, 10 Oct 2007, Miklos Vajna wrote:
>
> Actually, specifically darcs, different merges _always_ result in the same
> data.

No they don't.

You don't understand the problem.

Yes, different merges WITH THE SAME PATCHES always result in the same
data. But that's not a realistic - or even very interesting - scenario.

What's much more common is that the same problem gets solved slightly
differently in two different branches. For example, maybe somebody does it
as two different patches - where the second one fixes a bug in the first
fix. And another person does the same fix, but without the bug in the
first place.

See?

A patch-based system gets confused by those kinds of issues (or they turn
into various special cases). And that is fundamentally why you MUST NOT
take history into account (where "history" is some series of individual
patches).

Yes, history is interesting for historical reasons, and to explain what
the context was, but in many ways, history is exactly the *wrong* thing to
use when it comes to merging. You should look at the end result, since
people can - and do - come to the same result through different ways.

Linus

^ permalink raw reply [flat|nested] 11+ messages in thread
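The two-patches-versus-one case from the message above can be sketched in
Python. Everything here is an illustrative assumption: replay and
merge_hunk are hypothetical helpers, a "patch" is reduced to an (old, new)
text substitution, and the timeout setting is invented. One branch reaches
the fix in two steps, the second correcting a bug in the first; the other
branch makes the correct fix directly.

```python
def merge_hunk(base, ours, theirs):
    """Three-way merge of one region by content; returns (text, conflict)."""
    if ours == theirs:
        return ours, False
    if ours == base:
        return theirs, False
    if theirs == base:
        return ours, False
    return None, True


def replay(content, patches):
    """Apply (old, new) substitutions in order, like a small patch series."""
    for old, new in patches:
        if old not in content:
            raise ValueError(f"patch does not apply: {old!r}")
        content = content.replace(old, new)
    return content


base = "timeout = 10\n"

# Branch A: a first fix with a bug, then a second patch that corrects it.
history_a = [("timeout = 10", "timeout = 300"),
             ("timeout = 300", "timeout = 30")]
# Branch B: the same fix done right in a single patch.
history_b = [("timeout = 10", "timeout = 30")]

ours = replay(base, history_a)
theirs = replay(base, history_b)

merged, conflict = merge_hunk(base, ours, theirs)
print("same end result:", ours == theirs)
print("conflict:", conflict)
```

The two histories share no patch, yet the end states are identical, so a
merge that compares only the end states has nothing to resolve.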
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 15:48 ` David Brown
2007-10-10 19:07 ` Miklos Vajna
@ 2007-10-11 0:08 ` Miklos Vajna
1 sibling, 0 replies; 11+ messages in thread
From: Miklos Vajna @ 2007-10-11 0:08 UTC (permalink / raw)
To: David Brown; +Cc: Linus Torvalds, martin f krafft, git discussion list

[ ehh, sorry for my previous mail, i must be doing something wrong.. ]

On Wed, Oct 10, 2007 at 08:48:31AM -0700, David Brown <git@davidb.org> wrote:
> There is a darcs-git project that tries to do the darcs things on top of
> git.

actually it's broken, according to its author:

http://www.mail-archive.com/darcs-users@darcs.net/msg03161.html

though i loved darcs, but yes - it scales horribly

- VMiklos

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-10 15:25 ` Linus Torvalds
2007-10-10 15:48 ` David Brown
@ 2007-10-11 21:51 ` Sam Vilain
2007-10-11 22:33 ` Linus Torvalds
1 sibling, 1 reply; 11+ messages in thread
From: Sam Vilain @ 2007-10-11 21:51 UTC (permalink / raw)
To: Linus Torvalds; +Cc: martin f krafft, git discussion list

Linus Torvalds wrote:
> So git doesn't try to avoid conflicts per se: the merge strategies are
> fundamentally pretty simple (rename detection and the whole "recursive
> merge" thing may not be simple code, but the concepts are pretty
> straightforward), and they handle all the really *obvious* cases, but at
> the same time, I feel strongly that anything even half-way subtle should
> not be left to the SCM - the SCM should show it and make it really easy
> for the user to then fix it up.

This is true. However I think there are some obvious places for
improvement that does look at the file history, when the regular
algorithm fails;

1. do a --cherry-pick rev-list on just the file being merged and see if
all the changes on one side disappear, in which case just take the result.

2. see if the files were identical at some point, in which case use a
new merge base for that file based on the changes since that revision.

I actually thought #2 was already the way recursive worked!

Sam.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: inexplicable failure to merge recursively across cherry-picks
2007-10-11 21:51 ` Sam Vilain
@ 2007-10-11 22:33 ` Linus Torvalds
0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2007-10-11 22:33 UTC (permalink / raw)
To: Sam Vilain; +Cc: martin f krafft, git discussion list

On Fri, 12 Oct 2007, Sam Vilain wrote:
>
> 1. do a --cherry-pick rev-list on just the file being merged and see if
> all the changes on one side disappear, in which case just take the result.
>
> 2. see if the files were identical at some point, in which case use a
> new merge base for that file based on the changes since that revision.
>
> I actually thought #2 was already the way recursive worked!

Actually, I think both of these are fundamentally wrong.

The reason is that you talk about "the file". Anything that is based on
per-file heuristics is going to mean that you use a history that is not
necessarily compatible with the _other_ files in the project.

I agree 100% that per-file history information is going to find more
things to merge automatically. But the point I was trying to make was that
"automatic merges" aren't always *good*.

I realize that pretty much every single theoretical merge algorithm out
there tries to make merges happen automatically as much as possible, but
they all end up having strange issues.

For example, take a patch that cherry-picked into mainline from a
development branch, but that partly depended on some support-feature that
wasn't in mainline yet. So then there is another patch that removes part
of that patch from mainline. So mainline is fine.

Now, three months later, the development branch is stable, and is fully
merged. What happens?

Git will largely get this right. Git will look at the last *global* common
base, and will just look at the contents, and do a reasonable job. Yes,
there will probably be conflicts (because both the development branch and
the mainline ended up touching the same parts of the files thanks to the
cherry-pick, and yet mainline has some added hacks on top to disable it),
but on the whole that's exactly what you want!

(Alternatively, maybe the "remove the part that wasn't supported yet"
ended up meaning that that particular part of the patch was excised
entirely from mainline, and there was no conflict at all, and git just
merged the new stuff from the development branch cleanly! So I'm not
saying that it *has* to conflict, I'm just saying that it might have).

In other words: git always "does the right thing". Assuming both branches
are stable and working, git does a very reasonable thing. It's obviously
not always the thing people may *want*, but it's guaranteed to be a
reasonable and simple guess, and there's no way it's "too clever for its
own good, and just screwed the pooch entirely".

In contrast, your suggested merge strategy would be HORRIBLY BROKEN! Why?
Because it doesn't look at the *common* history to the project, it looks
at some per-file state that is totally bogus and has no relevance.

Think it through: what happens if there were files with the same content
(because of the cherry-pick), and then the file history for one of the
branches was later changed to disable something because the support for it
wasn't in the "whole history"?

Right: the final merge will contain that change! Because there was a time
where the file was identical (the cherry-pick), so you're taking all the
later changes to that file (the undo)!

Notice? Totally the wrong thing to do! So this is a classic case of trying
to make "easier" merges, but where the whole approach is totally broken!
You simply MUST NOT add logic like that. It's a lot better to give a
conflict, than to try to be "clever", and silently do the wrong thing.

Yes, you can be really stupid, and silently do the wrong thing too, but if
you're stupid, at least the "silent wrong thing" is never really subtle,
it's pretty much guaranteed to be easily explainable. And the good news is
that you didn't have a complicated and fragile algorithm just to get the
wrong answer.

(Put another way: if you are always going to have situations where you get
the wrong answer, you'd better take the simple and stupid algorithm,
because people are more likely to then be able to _predict_ that wrong
answer and are thus more ready to handle it!)

So being clever really is the wrong thing to do. And using history that
isn't global and true history (ie just looking at one file, and deciding
that matching that one file "means" something) is fundamentally broken.

In fact, in general, individual pieces of history are totally worthless.
The fact that some individual change was done in one branch doesn't really
tell you *anything*. The reason that change was done may be implied by all
the previous changes, or conversely, later changes may have undone the
change, so any merge algorithm that starts to look at individual commits
is likely to be pure and utter crap - exactly because it's starting to
make decisions based on local information that may not be valid in the big
picture.

(Where the "big picture" may be either about "space" - other files - or
about "time" - other commits, that simply mean that the individual changes
of one commit are meaningless on their own).

Btw, one thing to note is how well the simple and stupid git merge
strategy works. It turns out that doing things with the "big picture"
model actually does work really well. People think that they need
"finegrained history" to make good merges, but I think most people who
have actually done a fair number of merges with git have noticed that it's
actually pretty dang painless.

But to be honest, there are cases where git isn't being very helpful. In
particular, I think there *are* things that git could probably be more
helpful with, but looking at local history is not one of them, I think.

So here are some suggestions on things I think we could improve on:

 - I think it would be wonderful to have a helper tool for handling
   conflicts. In particular, while I don't think per-file history is good
   for resolving conflicts *automatically*, I actually do think that
   per-file history can be a good way to *manually* resolve conflicts.

   In other words, if you have a conflict, I think it would be wonderful
   to have some git-gui-like thing that can show the history (with
   patches) for that file, and basically combine a three-way graphical
   merge *with* some per-commit information where you can say "choose the
   thing that that commit did for this conflict".

   So I think git already has tools to help resolve conflicts, and I
   personally love doing "gitk --merge" or "git log -p --merge" when a
   conflict does happen, but I think some smart GUI person could do
   something even much better!

   And notice how I think that it's *really* wrong to use per-file history
   automatically, but that I think it's not wrong at all to use it when
   there's a human that says "ok, obviously pick that case". Things that
   are horrible when they cause subtle and automatic resolves can be very
   good when they cause subtle resolves that a human looked at!

 - I suspect we have issues with common whitespace changes, where we again
   could probably help people resolve whitespace changes etc better.
   Again, I don't think those are necessarily things you want to do
   automatically, but I know from personal experience that handling things
   like one side having done a re-indent can be *really* annoying, just
   because you end up doing tons of mindless stuff when you fix up all the
   totally idiotic and usually trivial conflicts.

.. and I'm sure there are other things we could do better too, but the
above two are things that while they haven't happened for me for the
kernel (probably because we have learnt how to not cause them over the
years), I've seen them in other places.

And yes, the above two suggestions fall solidly in the "conflicts aren't
bad per se, but you want to make the tool really help you resolve them!"
camp.

Linus

^ permalink raw reply [flat|nested] 11+ messages in thread
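The objection to a per-file merge base in the message above can be traced
with a small Python sketch. All names and file contents are invented for
illustration, and merge_hunk is a hypothetical stand-in for a three-way
merge of one file. The scenario follows the message: a change is
cherry-picked into mainline, mainline then removes the part that needs a
support feature it does not have yet, and later the development branch is
merged in full.

```python
def merge_hunk(base, ours, theirs):
    """Three-way merge of one file by content; returns (lines, conflict)."""
    if ours == theirs:
        return ours, False
    if ours == base:
        return theirs, False
    if theirs == base:
        return ours, False
    return None, True


# Last *global* common ancestor of the two branches.
global_base = ["old_code()"]

# Development branch: the full change, which needs the support feature.
dev = ["new_feature()", "use_support_feature()"]

# Mainline right after the cherry-pick: identical to dev for this file.
after_cherry_pick = list(dev)

# Mainline later removes the part that depends on the missing support.
mainline = ["new_feature()"]

# Merge using the global base: both sides changed the file differently,
# so a human is asked to decide.
result_global, conflict_global = merge_hunk(global_base, mainline, dev)

# Merge using a per-file base taken from the moment the file was
# identical on both sides (proposal #2 in the quoted message).
result_per_file, conflict_per_file = merge_hunk(after_cherry_pick, mainline, dev)

print("global base   ->", "CONFLICT" if conflict_global else result_global)
print("per-file base ->", "CONFLICT" if conflict_per_file else result_per_file)
```

With the per-file base, the development side looks unchanged and the
mainline removal wins silently, so the merged file lacks the
use_support_feature() line even though the merge brings in the support it
was waiting for. With the global base the same inputs produce a conflict,
which leaves the decision to a person.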