* Re: Potential bug with octopus merges and symlinks
2021-12-02 3:04 Potential bug with octopus merges and symlinks Michael McClimon
@ 2021-12-04 17:34 ` Elijah Newren
2021-12-06 16:53 ` Martin von Zweigbergk
2021-12-06 18:13 ` Junio C Hamano
2 siblings, 0 replies; 4+ messages in thread
From: Elijah Newren @ 2021-12-04 17:34 UTC (permalink / raw)
To: Michael McClimon; +Cc: Git Mailing List
On Sat, Dec 4, 2021 at 5:47 AM Michael McClimon <michael@mcclimon.org> wrote:
>
> I am having a problem with octopus merges when symbolic links are involved.
> I'm not completely convinced this is a bug in git, but I'm also not convinced
> that it's _not_. (If it's not, my apologies; I often find it very hard to
> track down the answers to complicated git questions on the internet.)
>
> There is a minimal reproducer available at
> https://github.com/mmcclimon/git-merge-problem-demo. Fetch all the branches
> there. The main branch contains a directory (dir1) with a single file
> (file.txt), plus a symlink (dir2), which links to dir1. branch1 replaces this
> symlink with a copy of the files that were linked to. (This was accomplished
> with: rm dir2; cp -r dir1 dir2.) branch2 and branch3 do not touch this
> directory at all.
>
> Merging these three branches fails:
>
> $ git merge branch1 branch2 branch3
> Fast-forwarding to: branch1
> Trying simple merge with branch2
> Simple merge did not work, trying automatic merge.
> Trying simple merge with branch3
> error: Entry 'dir2/file.txt' not uptodate. Cannot merge.
> Merge with strategy octopus failed.
>
> The order here matters! Here is every permutation (1 here is the symlink
> change) to git merge; only the first two fail, all the others work.
>
> 1 2 3 FAIL
> 1 3 2 FAIL
> 2 1 3 PASS
> 2 3 1 PASS
> 3 1 2 PASS
> 3 2 1 PASS
> 1 2 PASS
> 2 1 PASS
> 2 3 PASS
> 3 2 PASS
> 1 3 PASS
> 3 1 PASS
>
> I tracked this down as best I could (I am not a C programmer), by adding
> `set -x` to my copy of git-octopus-merge.sh. I determined that the line is
> failing is toward the very bottom of the main loop there:
> git read-tree -u -m --aggressive $common $MRT $SHA1
>
> I had a read of the index-format.txt and then the git read-tree docs, and I
> confess that I don't _totally_ follow the latter. But, I wonder if maybe
> --aggressive is perhaps not aggressive enough if the parts of the index that
> store symbolic link data aren't taken into account, when it's changed between
> two versions and the blobs involved aren't actually different.
>
> I recognize that replacing a symlink with real files under the same name is a
> weird thing to do, but this is a pared-down example of a thing that happens
> occasionally at $work, where we regularly octopus-merge 15+ heads into
> development branches. I would also be happy to know, if this isn't a bug, if
> there's some other workaround! The only way I know to fix this case is to
> merge the symlink change, then rebase all in-flight branches against it. That
> is often extremely time-consuming and tedious, though, and requires a human to
> be involved (all the other merging is done by CI.)
Yeah, I'm not too surprised that there are cases like this that trip up
read-tree and octopus. There might be a fix in unpack-trees.c, but
it'd take some digging to verify if that's true or it's just a design
limitation.
However, I can give you a workaround, based on my nebulous ideas for
making an octopus-based-on-ort strategy (oort?). Basically, just do a
sequence of individual merges with ort (which since v2.34 is the default
non-octopus merge strategy), then take the resulting tree and recraft it
into an octopus merge. So, something like this:
git merge --no-ff --no-edit branch1
git merge --no-ff --no-edit branch2
git merge --no-ff --no-edit branch3
NEW_TREE=$(git write-tree)
NEW_COMMIT=$(git commit-tree $NEW_TREE \
-p HEAD~3 -p HEAD~2^2 -p HEAD~1^2 -p HEAD^2 \
-m "Merge branches 'branch1', 'branch2', and 'branch3'")
git reset $NEW_COMMIT
Which at least works with the git-merge-problem-demo repository example
you provided.
Note that the reason for the flags above are:
--no-ff: ensure merge commits are created, so our -p flags to
commit-tree are correct
--no-edit: we'll later replace all the commit messages anyway
Also, if you're using an older version of git, the recursive strategy
would be fine too for the steps above; the main advantage of ort would
be that the intermediate steps could be done in memory if this were
implemented as a real builtin merge strategy rather than just a sequence
of shell commands. The in-memory advantage might be good enough reason
to take this nebulous idea of mine and create the 'oort' strategy at
some point in the future, but for now the sequence of shell commands
should give you what you need.
Hope that helps,
Elijah
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Potential bug with octopus merges and symlinks
2021-12-02 3:04 Potential bug with octopus merges and symlinks Michael McClimon
2021-12-04 17:34 ` Elijah Newren
2021-12-06 16:53 ` Martin von Zweigbergk
@ 2021-12-06 18:13 ` Junio C Hamano
2 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2021-12-06 18:13 UTC (permalink / raw)
To: Michael McClimon; +Cc: git
Michael McClimon <michael@mcclimon.org> writes:
> The order here matters! Here is every permutation (1 here is the symlink
> change) to git merge; only the first two fail, all the others work.
I am not an active Octopus contributor, but was the original
inventor. And it is not surprising that the order matters, because
I designed it that way ;-)
In the original design, octopus was to be used only to create the
simplest forms of merges without any conflicts, because octopus
merges make it almost impossible to examine the result of the merge
with comparison between one of the parent and the result (you have
to remember that there was no "diff --cc" invented yet, let alone
"diff --remerge-diff", back then to help us). Aborting the merge at
the first conflict when merging the remote heads one by one is a way
to make sure that the result is a simple "combine unrelated work
into one" merge. This does make the order matter somewhat. When
more than one remote heads touch the same path, the order in which
they are merged into base does make a difference.
To be more strict for the goal of limiting octopus to the simplest
forms of merges without any conflicts, we should have insisted that
the set of paths these remote heads want to modify should not
overlap and that would have made the order totally not to matter,
but "abort at the first conflict" was easier to implement and
practical.
And later, before the command got part of the git suite, the rule
was further loosened to allow a conflicting merge when merging the
last remote head and let the user hand-resolve. This makes the
order matter even more.
^ permalink raw reply [flat|nested] 4+ messages in thread