* git merge commits are non-deterministic? what changed?
From: Ulrich Spörlein @ 2012-11-09 13:31 UTC
To: git

Hi all,

I'm running a couple of conversions from SVN to git, using a slightly
hacked version of svn2git (because it can cope with multiple branches
and is several orders of magnitude faster than git-svn).

Anyway, when doing some verification runs, using the same version of
svn2git, but different versions of git, I get different commit hashes,
and I tracked it down to the ordering of the parents inside a merge
commit.

version 1.7.9.2
% git show --format=raw e209a83|head
commit e209a83c1e0a387c88a44f3a8f2be2670ed85eae
tree de2d7c6726a45428d4a310da2acd8839daf9f85f
parent 5fba0401c23a594e4ad5e807bf14a5439645a358
parent 25062ba061871945759b3baa833fe64969383e40
parent 89bebeef185ed08424fc548f8569081c6add2439
parent c7d5f60d3a7e2e3c4da23b157c62504667344438
parent e7bc108f0d6a394050818a4af64a59094d3c793e
parent 48231afadc40013e6bfda56b04a11ee3a602598f
author rgrimes <rgrimes@FreeBSD.org> 739897097 +0000
committer rgrimes <rgrimes@FreeBSD.org> 739897097 +0000

vs

git version 1.8.0
% git show --format=raw 42f0fad|head
commit 42f0fadccab6eefc7ffdc1012345b42ad45e36c2
tree de2d7c6726a45428d4a310da2acd8839daf9f85f
parent 5fba0401c23a594e4ad5e807bf14a5439645a358
parent 25062ba061871945759b3baa833fe64969383e40
parent 89bebeef185ed08424fc548f8569081c6add2439
parent 48231afadc40013e6bfda56b04a11ee3a602598f
parent c7d5f60d3a7e2e3c4da23b157c62504667344438
parent e7bc108f0d6a394050818a4af64a59094d3c793e
author rgrimes <rgrimes@FreeBSD.org> 739897097 +0000
committer rgrimes <rgrimes@FreeBSD.org> 739897097 +0000

I haven't verified to see if that ordering is stable within a git
version, but the fact that it changed across versions clearly means that
I cannot depend on this currently (I have never seen this problem in two
years, so I blame git 1.8.0 ...)

Two questions:
1. Can we impose a stable ordering of the commits being recorded in a
   merge commit? Listing parents in chronological order or something like
   that.
2. Why the hell is the commit hash dependent on the ordering of the
   parent commits? IMHO it should sort the set of parents before
   calculating the hash ...

Help?
Uli

* Re: git merge commits are non-deterministic? what changed?
From: Andreas Schwab @ 2012-11-09 15:04 UTC
To: Ulrich Spörlein; +Cc: git

Ulrich Spörlein <uqs@spoerlein.net> writes:

> Two questions:
> 1. Can we impose a stable ordering of the commits being recorded in a
> merge commit? Listing parents in chronological order or something like
> that.

The order is determined by the order the refs are given to git merge (or
git commit-tree when using the plumbing).

> 2. Why the hell is the commit hash dependent on the ordering of the
> parent commits? IMHO it should sort the set of parents before
> calculating the hash ...

What would be the sort key?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
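
To make the dependence on parent order concrete, here is a minimal Python sketch of how a commit's SHA-1 name is derived from its serialized text. The tree, parent ids and author line are taken from the two `git show` outputs in the first message; the commit message is not shown in the thread, so `MESSAGE` is a placeholder and the resulting ids will not match e209a83 or 42f0fad. The helper names are made up for this illustration.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 object name: hash of '<kind> <size>' + NUL byte + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list[str], ident: str, message: str) -> bytes:
    """Serialize a commit; parent lines are written in the order given."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return "\n".join(lines).encode()


TREE = "de2d7c6726a45428d4a310da2acd8839daf9f85f"
IDENT = "rgrimes <rgrimes@FreeBSD.org> 739897097 +0000"
MESSAGE = "placeholder message\n"  # assumption: the real message is unknown

# Parent order as recorded by the run with git 1.7.9.2.
PARENTS_1_7_9 = [
    "5fba0401c23a594e4ad5e807bf14a5439645a358",
    "25062ba061871945759b3baa833fe64969383e40",
    "89bebeef185ed08424fc548f8569081c6add2439",
    "c7d5f60d3a7e2e3c4da23b157c62504667344438",
    "e7bc108f0d6a394050818a4af64a59094d3c793e",
    "48231afadc40013e6bfda56b04a11ee3a602598f",
]
# Same six parents, in the order recorded by the run with git 1.8.0.
PARENTS_1_8_0 = PARENTS_1_7_9[:3] + [PARENTS_1_7_9[5]] + PARENTS_1_7_9[3:5]

id_a = git_object_id("commit", commit_body(TREE, PARENTS_1_7_9, IDENT, MESSAGE))
id_b = git_object_id("commit", commit_body(TREE, PARENTS_1_8_0, IDENT, MESSAGE))
print(id_a)
print(id_b)
print("same id:", id_a == id_b)
```

Because the parent lines are part of the hashed bytes, any reordering yields a different object name even though tree, author and timestamps are identical.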

* Re: git merge commits are non-deterministic? what changed?
From: Ulrich Spörlein @ 2012-11-09 15:42 UTC
To: Andreas Schwab; +Cc: git

On Fri, 2012-11-09 at 16:04:31 +0100, Andreas Schwab wrote:
> Ulrich Spörlein <uqs@spoerlein.net> writes:
>
> > Two questions:
> > 1. Can we impose a stable ordering of the commits being recorded in a
> > merge commit? Listing parents in chronological order or something like
> > that.
>
> The order is determined by the order the refs are given to git merge (or
> git commit-tree when using the plumbing).
>
> > 2. Why the hell is the commit hash dependent on the ordering of the
> > parent commits? IMHO it should sort the set of parents before
> > calculating the hash ...
>
> What would be the sort key?

Trivially, the hash of the parents itself. So you'd always get

...
parent 0000
parent 1111
parent aaaa
parent ffff

hth
Uli

* Re: git merge commits are non-deterministic? what changed?
From: Matthieu Moy @ 2012-11-09 15:52 UTC
To: Ulrich Spörlein; +Cc: Andreas Schwab, git

Ulrich Spörlein <uqs@spoerlein.net> writes:

>> > 2. Why the hell is the commit hash dependent on the ordering of the
>> > parent commits? IMHO it should sort the set of parents before
>> > calculating the hash ...
>>
>> What would be the sort key?
>
> Trivially, the hash of the parents itself. So you'd always get
>
> ...
> parent 0000
> parent 1111
> parent aaaa
> parent ffff

That would change the behavior of --first-parent. Or you'd need to
compute the sha1 of the sorted list, but keep the unsorted one in the
commit. Possible, but weird ;-).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
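
One possible reading of the "possible, but weird" alternative, sketched in Python: hash a sorted copy of the parent list while leaving the stored list in its original order. This is not how git works; the function name and the abbreviated ids are hypothetical and exist only to show the consequence.

```python
import hashlib


def order_insensitive_id(parents: list[str]) -> str:
    """Hash a sorted copy of the parent ids; the caller's list is untouched."""
    canonical = "\n".join(sorted(parents)).encode()
    return hashlib.sha1(canonical).hexdigest()


# Two records with the same parents, stored in opposite order.
recorded_a = ["bbbb", "aaaa"]
recorded_b = ["aaaa", "bbbb"]

same_id = order_insensitive_id(recorded_a) == order_insensitive_id(recorded_b)
print("ids collide:", same_id)
print("first parents:", recorded_a[0], recorded_b[0])
```

Under such a scheme two records that disagree about their first parent would share one name, so the name alone could no longer say which of the two was meant.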

* Re: git merge commits are non-deterministic? what changed?
From: Jeff King @ 2012-11-09 16:16 UTC
To: Matthieu Moy; +Cc: Ulrich Spörlein, Andreas Schwab, git

On Fri, Nov 09, 2012 at 04:52:48PM +0100, Matthieu Moy wrote:

> Ulrich Spörlein <uqs@spoerlein.net> writes:
>
> >> > 2. Why the hell is the commit hash dependent on the ordering of the
> >> > parent commits? IMHO it should sort the set of parents before
> >> > calculating the hash ...
> >>
> >> What would be the sort key?
> >
> > Trivially, the hash of the parents itself. So you'd always get
> >
> > ...
> > parent 0000
> > parent 1111
> > parent aaaa
> > parent ffff
>
> That would change the behavior of --first-parent. Or you'd need to
> compute the sha1 of the sorted list, but keep the unsorted one in the
> commit. Possible, but weird ;-).

Right. The reason that merge parents are stored in the order given on
the command line is not random or because it was not considered. It
encodes a valuable piece of information: did the user merge "foo" into
"bar", or did they merge "bar" into "foo"?

So I think this discussion is going in the wrong direction; git should
never sort the parents, because the order is meaningful. The original
complaint was that a run of svn2git produced different results on two
different git versions. The important question to me is: did svn2git
feed the parents to git in the same order?

If it did, and git produced different results, then that is a serious
bug.

If it did not, then the issue needs to be resolved in svn2git (which
_may_ want to sort the parents that it feeds to git, but it would depend
on whether the order it is currently presenting is meaningful).

-Peff
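
The point about --first-parent can be illustrated with a toy history in Python. This is only a sketch of the idea of a first-parent walk, not git's implementation; the commit names (`m1`, `f1`, ...) and the dictionary layout are invented for the example.

```python
def first_parent_chain(parents_of: dict[str, list[str]], tip: str) -> list[str]:
    """Follow only parents[0] from tip back to a root commit."""
    chain = [tip]
    while parents_of[chain[-1]]:
        chain.append(parents_of[chain[-1]][0])
    return chain


# Hypothetical history: topic "foo" (f1, f2) was merged into "master" (m1, m2).
history = {
    "root": [],
    "m1": ["root"], "m2": ["m1"],
    "f1": ["root"], "f2": ["f1"],
    "merge": ["m2", "f2"],  # first parent: the branch that was merged into
}

as_recorded = first_parent_chain(history, "merge")

# The same history, but with the merge's parents sorted by name.
resorted = dict(history, merge=sorted(history["merge"]))
after_sorting = first_parent_chain(resorted, "merge")

print(as_recorded)
print(after_sorting)
```

With the recorded order the walk stays on the mainline commits; after sorting it follows the topic branch instead, which is the information loss described above.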

* Re: git merge commits are non-deterministic? what changed?
From: Ulrich Spörlein @ 2012-11-09 18:27 UTC
To: Jeff King; +Cc: Matthieu Moy, Andreas Schwab, git

On Fri, 2012-11-09 at 11:16:47 -0500, Jeff King wrote:
> On Fri, Nov 09, 2012 at 04:52:48PM +0100, Matthieu Moy wrote:
>
> > Ulrich Spörlein <uqs@spoerlein.net> writes:
> >
> > >> > 2. Why the hell is the commit hash dependent on the ordering of the
> > >> > parent commits? IMHO it should sort the set of parents before
> > >> > calculating the hash ...
> > >>
> > >> What would be the sort key?
> > >
> > > Trivially, the hash of the parents itself. So you'd always get
> > >
> > > ...
> > > parent 0000
> > > parent 1111
> > > parent aaaa
> > > parent ffff
> >
> > That would change the behavior of --first-parent. Or you'd need to
> > compute the sha1 of the sorted list, but keep the unsorted one in the
> > commit. Possible, but weird ;-).
>
> Right. The reason that merge parents are stored in the order given on
> the command line is not random or because it was not considered. It
> encodes a valuable piece of information: did the user merge "foo" into
> "bar", or did they merge "bar" into "foo"?
>
> So I think this discussion is going in the wrong direction; git should
> never sort the parents, because the order is meaningful. The original
> complaint was that a run of svn2git produced different results on two
> different git versions. The important question to me is: did svn2git
> feed the parents to git in the same order?
>
> If it did, and git produced different results, then that is a serious
> bug.
>
> If it did not, then the issue needs to be resolved in svn2git (which
> _may_ want to sort the parents that it feeds to git, but it would depend
> on whether the order it is currently presenting is meaningful).

Yeah, thanks, looks like I have some more work to do. I don't quite get
how it could come up with a different order, seeing that it is using svn
as the base.

Will run some more experiments, thanks for the info so far.

Cheers,
Uli

* Re: git merge commits are non-deterministic? what changed?
From: Michael J Gruber @ 2012-11-12 11:27 UTC
To: Ulrich Spörlein; +Cc: Jeff King, Matthieu Moy, Andreas Schwab, git

Ulrich Spörlein venit, vidit, dixit 09.11.2012 19:27:
> On Fri, 2012-11-09 at 11:16:47 -0500, Jeff King wrote:
>> On Fri, Nov 09, 2012 at 04:52:48PM +0100, Matthieu Moy wrote:
>>
>>> Ulrich Spörlein <uqs@spoerlein.net> writes:
>>>
>>>>>> 2. Why the hell is the commit hash dependent on the ordering of the
>>>>>> parent commits? IMHO it should sort the set of parents before
>>>>>> calculating the hash ...
>>>>>
>>>>> What would be the sort key?
>>>>
>>>> Trivially, the hash of the parents itself. So you'd always get
>>>>
>>>> ...
>>>> parent 0000
>>>> parent 1111
>>>> parent aaaa
>>>> parent ffff
>>>
>>> That would change the behavior of --first-parent. Or you'd need to
>>> compute the sha1 of the sorted list, but keep the unsorted one in the
>>> commit. Possible, but weird ;-).
>>
>> Right. The reason that merge parents are stored in the order given on
>> the command line is not random or because it was not considered. It
>> encodes a valuable piece of information: did the user merge "foo" into
>> "bar", or did they merge "bar" into "foo"?
>>
>> So I think this discussion is going in the wrong direction; git should
>> never sort the parents, because the order is meaningful. The original
>> complaint was that a run of svn2git produced different results on two
>> different git versions. The important question to me is: did svn2git
>> feed the parents to git in the same order?
>>
>> If it did, and git produced different results, then that is a serious
>> bug.
>>
>> If it did not, then the issue needs to be resolved in svn2git (which
>> _may_ want to sort the parents that it feeds to git, but it would depend
>> on whether the order it is currently presenting is meaningful).
>
> Yeah, thanks, looks like I have some more work to do. I don't quite get
> how it could come up with a different order, seeing that it is using svn
> as the base.
>
> Will run some more experiments, thanks for the info so far.

There was a change in the order in which "git cherry-pick A B C" applies
the commits. It's the only ordering-affecting change in 1.8.0 that I can
think of right now.

Michael

* Re: git merge commits are non-deterministic? what changed?
From: Ulrich Spörlein @ 2012-11-20 16:22 UTC
To: Michael J Gruber; +Cc: Jeff King, Matthieu Moy, Andreas Schwab, git

On Mon, 2012-11-12 at 12:27:31 +0100, Michael J Gruber wrote:
> Ulrich Spörlein venit, vidit, dixit 09.11.2012 19:27:
> > On Fri, 2012-11-09 at 11:16:47 -0500, Jeff King wrote:
> >> On Fri, Nov 09, 2012 at 04:52:48PM +0100, Matthieu Moy wrote:
> >>
> >>> Ulrich Spörlein <uqs@spoerlein.net> writes:
> >>>
> >>>>>> 2. Why the hell is the commit hash dependent on the ordering of the
> >>>>>> parent commits? IMHO it should sort the set of parents before
> >>>>>> calculating the hash ...
> >>>>>
> >>>>> What would be the sort key?
> >>>>
> >>>> Trivially, the hash of the parents itself. So you'd always get
> >>>>
> >>>> ...
> >>>> parent 0000
> >>>> parent 1111
> >>>> parent aaaa
> >>>> parent ffff
> >>>
> >>> That would change the behavior of --first-parent. Or you'd need to
> >>> compute the sha1 of the sorted list, but keep the unsorted one in the
> >>> commit. Possible, but weird ;-).
> >>
> >> Right. The reason that merge parents are stored in the order given on
> >> the command line is not random or because it was not considered. It
> >> encodes a valuable piece of information: did the user merge "foo" into
> >> "bar", or did they merge "bar" into "foo"?
> >>
> >> So I think this discussion is going in the wrong direction; git should
> >> never sort the parents, because the order is meaningful. The original
> >> complaint was that a run of svn2git produced different results on two
> >> different git versions. The important question to me is: did svn2git
> >> feed the parents to git in the same order?
> >>
> >> If it did, and git produced different results, then that is a serious
> >> bug.
> >>
> >> If it did not, then the issue needs to be resolved in svn2git (which
> >> _may_ want to sort the parents that it feeds to git, but it would depend
> >> on whether the order it is currently presenting is meaningful).
> >
> > Yeah, thanks, looks like I have some more work to do. I don't quite get
> > how it could come up with a different order, seeing that it is using svn
> > as the base.
> >
> > Will run some more experiments, thanks for the info so far.
>
> There was a change in the order in which "git cherry-pick A B C" applies
> the commits. It's the only ordering-affecting change in 1.8.0 that I can
> think of right now.

Just to wrap this up, it was of course a "feature" of the converter
that resulted in this unrepeatable behavior. The SVN API makes use of
apr_hashes, which were traversed in arbitrary order, hence SVN commits
spanning multiple git-branches would be handled in a non-deterministic
order, leading to randomly ordered parent objects for later git merge
commits.

It is still debatable whether a merge commit should have a
list-of-parents or a set-of-parents. Changing it to a set-of-parents
(with a well-defined hash function) would have made this problem go
away.

But this will never be changed, it would break the fundamental git
storage model as it is in place now.

Cheers,
Uli
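
The failure mode described in the wrap-up, and one way a converter could avoid it, can be sketched in Python. The thread does not say how svn2git was actually changed, so this is an assumption-laden illustration: `merge_parents`, the branch names and the abbreviated ids are all hypothetical, and a Python `set` of strings stands in for the arbitrarily ordered hash table.

```python
def merge_parents(branch_tips: dict[str, str], touched: set[str]) -> list[str]:
    """Return parent ids for the touched branches in a reproducible order."""
    # Iterating `touched` directly may yield a different order from one
    # run to the next; sorting by branch name pins the order down.
    return [branch_tips[name] for name in sorted(touched)]


# Hypothetical branch names and abbreviated commit ids.
tips = {"vendor/a": "1111", "vendor/b": "2222", "vendor/c": "3333"}

parents = merge_parents(tips, {"vendor/c", "vendor/a", "vendor/b"})
print(parents)
```

Sorting by branch name is only one possible stable key. As noted earlier in the thread, it is appropriate only if the order the converter would otherwise present carries no meaning; if the first parent is supposed to identify the branch being merged into, that parent should be placed first explicitly and only the remainder sorted.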

* Re: git merge commits are non-deterministic? what changed?
From: Junio C Hamano @ 2012-11-20 20:39 UTC
To: Ulrich Spörlein
Cc: Michael J Gruber, Jeff King, Matthieu Moy, Andreas Schwab, git

Ulrich Spörlein <uqs@spoerlein.net> writes:

> But this will never be changed, it would break the fundamental git
> storage model as it is in place now.

It doesn't just break "storage model", but more importantly, it
breaks the semantics.

Imagine that things started breaking after merging your topic branch
'foo' to the integration branch 'master', and how people would
perceive the situation. Everybody would say your topic 'foo' broke
the build. Nobody except you would say, even if the tip of your
topic 'foo' alone works perfectly, merging the 'master' to your
topic 'foo' broke that topic. The topic should have been adjusted
to the updated baseline, that is the 'master' branch before this
merge since your topic 'foo' forked off of it, before or during the
merge.

To express what was merged into what, the order of parents in the
commit is fundamentally a part of what a commit is.