* [PATCH 1/3] t6429: update comment to mention correct tool
2025-11-03 18:01 [PATCH 0/3] Fix another crazy rename assertion Elijah Newren via GitGitGadget
@ 2025-11-03 18:01 ` Elijah Newren via GitGitGadget
2025-11-07 14:36 ` Kristoffer Haugsbakk
2025-11-03 18:01 ` [PATCH 2/3] merge-ort: remove debugging crud Elijah Newren via GitGitGadget
2025-11-03 18:01 ` [PATCH 3/3] merge-ort: fix failing merges in special corner case Elijah Newren via GitGitGadget
2 siblings, 1 reply; 9+ messages in thread
From: Elijah Newren via GitGitGadget @ 2025-11-03 18:01 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Elijah Newren
From: Elijah Newren <newren@gmail.com>
A comment at the top of t6429 mentions why the test doesn't exercise git
rebase or git cherry-pick. However, it claims that it uses `test-tool
fast-rebase`. That was true when the comment was written, but commit
f920b0289ba3 (replay: introduce new builtin, 2023-11-24) changed it to
use git replay without updating this comment.
We could potentially just strike this second comment, since git replay
is a bonified built-in, but perhaps the explanation about why it focuses
on git replay is still useful. Update the comment to make it accurate
again.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
t/t6429-merge-sequence-rename-caching.sh | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/t/t6429-merge-sequence-rename-caching.sh b/t/t6429-merge-sequence-rename-caching.sh
index 0f39ed0d08..dcb734b10b 100755
--- a/t/t6429-merge-sequence-rename-caching.sh
+++ b/t/t6429-merge-sequence-rename-caching.sh
@@ -11,14 +11,13 @@ test_description="remember regular & dir renames in sequence of merges"
# sure that we are triggering rename caching rather than rename
# bypassing.
#
-# NOTE 2: this testfile uses 'test-tool fast-rebase' instead of either
-# cherry-pick or rebase. sequencer.c is only superficially
-# integrated with merge-ort; it calls merge_switch_to_result()
-# after EACH merge, which updates the index and working copy AND
-# throws away the cached results (because merge_switch_to_result()
-# is only supposed to be called at the end of the sequence).
-# Integrating them more deeply is a big task, so for now the tests
-# use 'test-tool fast-rebase'.
+# NOTE 2: this testfile uses replay instead of either cherry-pick or rebase.
+# sequencer.c is only superficially integrated with merge-ort; it
+# calls merge_switch_to_result() after EACH merge, which updates the
+# index and working copy AND throws away the cached results (because
+# merge_switch_to_result() is only supposed to be called at the end
+# of the sequence). Integrating them more deeply is a big task, so
+# for now the tests use 'git replay'.
#
--
gitgitgadget
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] merge-ort: fix failing merges in special corner case
2025-11-03 18:01 [PATCH 0/3] Fix another crazy rename assertion Elijah Newren via GitGitGadget
2025-11-03 18:01 ` [PATCH 1/3] t6429: update comment to mention correct tool Elijah Newren via GitGitGadget
2025-11-03 18:01 ` [PATCH 2/3] merge-ort: remove debugging crud Elijah Newren via GitGitGadget
@ 2025-11-03 18:01 ` Elijah Newren via GitGitGadget
2 siblings, 0 replies; 9+ messages in thread
From: Elijah Newren via GitGitGadget @ 2025-11-03 18:01 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Elijah Newren
From: Elijah Newren <newren@gmail.com>
At GitHub, we had a repository that was triggering
git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
during git replay.
This sounds similar to the somewhat recent f6ecb603ff8a (merge-ort: fix
directory rename on top of source of other rename/delete, 2025-08-06),
but the cause is different. Unlike that case, there are no
rename-to-self situations arising in this case, and new to this case it
can only be triggered during a replay operation on the 2nd or later
commit being replayed, never on the first merge in the sequence.
To trigger, the repository needs:
* an upstream which:
* renames a file to a different directory, e.g.
old/file -> new/file
* leaves other files remaining in the original directory (so that
e.g. "old/" still exists upstream even though file has been
removed from it and placed elsewhere)
* a topic branch being rebased where:
* a commit in the sequence:
* modifies old/file
* a subsequent commit in the sequence being replayed:
* does NOT touch *anything* under new/
* does NOT touch old/file
* DOES modify other paths under old/
* does NOT have any relevant renames that we need to detect
_anywhere_ elsewhere in the tree (meaning this interacts
interestingly with both directory renames and cached renames)
In such a case, the assertion will trigger. The fix turns out to be
surprisingly simple. I have a very vague recollection that I actually
considered whether to add such an if-check years ago when I added the
very similar one for oldinfo in 1b6b902d95a5 (merge-ort:
process_renames() now needs more defensiveness, 2021-01-19), but I think
I couldn't figure out a possible way to trigger it and was worried at
the time that if I didn't know how to trigger it then I wasn't so sure
that simply skipping it was correct. Waiting did give me a chance to
put more thorough tests and checks into place for the rename-to-self
cases a few months back, which I might not have found as easily
otherwise. Anyway, put the check in place now and add a test that
demonstrates the fix.
Note that this bug, as demonstrated by the conditions listed above,
runs at the intersection of relevant renames, trivial directory
resolutions, and cached renames. All three of those optimizations are
ones that unfortunately make the code (and testcases!) a bit more
complex, and threading all three makes it a bit more so. However, the
testcase isn't crazy enough that I'd expect no one to ever hit it in
practice, and was confused why we didn't see it before. After some
digging, I discovered that merge.directoryRenames=false is a workaround
to this bug, and GitHub used that setting until recently (it was a
"temporary" match-what-libgit2-does piece of code that lasted years
longer than intended). Since the conditions I gave above for triggering
this bug rule out the possibility of there being directory renames, one
might assume that it shouldn't matter whether you try to detect such
renames if there aren't any. However, due to commit a16e8efe5c2b
(merge-ort: fix merge.directoryRenames=false, 2025-03-13), the heavy
hammer used there means that merge.directoryRenames=false ALSO turns off
rename caching, which is critical to triggering the bug. This becomes
a bit more than an aside since...
Re-reading that old commit, a16e8efe5c2b (merge-ort: fix
merge.directoryRenames=false, 2025-03-13), it appears that the solution
to this latest bug might have been at least a partial alternative
solution to that old commit. And it may have been an improved
alternative (or at least help implement one), since it may be able to
avoid the heavy-handed disabling of rename cache. That might be an
interesting future thing to investigate, but is not critical for the
current fix. However, since I spent time digging it all up, at least
leave a small comment tweak breadcrumb to help some future reader
(myself or others) who wants to dig further to connect the dots a little
quicker.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
merge-ort.c | 29 ++++++++-
t/t6429-merge-sequence-rename-caching.sh | 78 ++++++++++++++++++++++++
2 files changed, 106 insertions(+), 1 deletion(-)
diff --git a/merge-ort.c b/merge-ort.c
index 23b55c5b92..a1f3333e44 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2912,6 +2912,32 @@ static int process_renames(struct merge_options *opt,
if (!oldinfo || oldinfo->merged.clean)
continue;
+ /*
+ * Rename caching from a previous commit might give us an
+ * irrelevant rename for the current commit.
+ *
+ * Imagine:
+ * foo/A -> bar/A
+ * was a cached rename for the upstream side from the
+ * previous commit (without the directories being renamed),
+ * but the next commit being replayed
+ * * does NOT add or delete files
+ * * does NOT have directory renames
+ * * does NOT modify any files under bar/
+ * * does NOT modify foo/A
+ * * DOES modify other files under foo/ (otherwise the
+ * !oldinfo check above would have already exited for
+ * us)
+ * In such a case, our trivial directory resolution will
+ * have already merged bar/, and our attempt to process
+ * the cached
+ * foo/A -> bar/A
+ * would be counterproductive, and lack the necessary
+ * information anyway. Skip such renames.
+ */
+ if (!newinfo)
+ continue;
+
/*
* diff_filepairs have copies of pathnames, thus we have to
* use standard 'strcmp()' (negated) instead of '=='.
@@ -5118,7 +5144,8 @@ static void merge_check_renames_reusable(struct merge_options *opt,
* optimization" comment near that case).
*
* This could be revisited in the future; see the commit message
- * where this comment was added for some possible pointers.
+ * where this comment was added for some possible pointers, or the
+ * later commit where this comment was added.
*/
if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE) {
renames->cached_pairs_valid_side = 0; /* neither side valid */
diff --git a/t/t6429-merge-sequence-rename-caching.sh b/t/t6429-merge-sequence-rename-caching.sh
index dcb734b10b..15dd2d94b7 100755
--- a/t/t6429-merge-sequence-rename-caching.sh
+++ b/t/t6429-merge-sequence-rename-caching.sh
@@ -768,4 +768,82 @@ test_expect_success 'avoid assuming we detected renames' '
)
'
+#
+# In the following testcase:
+# Base: olddir/{valuesX_1, valuesY_1, valuesZ_1}
+# other/content
+# Upstream: rename olddir/valuesX_1 -> newdir/valuesX_2
+# Topic_1: modify olddir/valuesX_1 -> olddir/valuesX_3
+# Topic_2: modify olddir/valuesY,
+# modify other/content
+# Expected Pick1: olddir/{valuesY, valuesZ}, newdir/valuesX, other/content
+# Expected Pick2: olddir/{valuesY, valuesZ}, newdir/valuesX, other/content
+#
+# This testcase presents no problems for git traditionally, but the fact that
+# olddir/valuesX -> newdir/valuesX
+# gets cached after the first pick presents a problem for the second commit to
+# be replayed, because it appears to be an irrelevant rename, so the trivial
+# directory resolution will resolve newdir/ without recursing into it, giving
+# us no way to apply the cached rename to anything.
+#
+test_expect_success 'rename a file, use it on first pick, but irrelevant on second' '
+ git init rename_a_file_use_it_once_irrelevant_on_second &&
+ (
+ cd rename_a_file_use_it_once_irrelevant_on_second &&
+
+ mkdir olddir/ other/ &&
+ test_seq 3 8 >olddir/valuesX &&
+ test_seq 3 8 >olddir/valuesY &&
+ test_seq 3 8 >olddir/valuesZ &&
+ printf "%s\n" A B C D E F G >other/content &&
+ git add olddir other &&
+ git commit -m orig &&
+
+ git branch upstream &&
+ git branch topic &&
+
+ git switch upstream &&
+ test_seq 1 8 >olddir/valuesX &&
+ git add olddir &&
+ mkdir newdir &&
+ git mv olddir/valuesX newdir &&
+ git commit -m "Renamed (and modified) olddir/valuesX into newdir/" &&
+
+ git switch topic &&
+
+ test_seq 3 10 >olddir/valuesX &&
+ git add olddir &&
+ git commit -m A &&
+
+ test_seq 1 8 >olddir/valuesY &&
+ printf "%s\n" A B C D E F G H I >other/content &&
+ git add olddir/valuesY other &&
+ git commit -m B &&
+
+ #
+ # Actual testing; mostly we want to verify that we do not hit
+ # git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
+ #
+
+ git switch upstream &&
+ git config merge.directoryRenames true &&
+
+ git replay --onto HEAD upstream~1..topic >out &&
+
+ #
+ # ...but we may as well check that the replay gave us a reasonable result
+ #
+
+ git update-ref --stdin <out &&
+ git checkout topic &&
+
+ git ls-files >tracked &&
+ test_line_count = 4 tracked &&
+ test_path_is_file newdir/valuesX &&
+ test_path_is_file olddir/valuesY &&
+ test_path_is_file olddir/valuesZ &&
+ test_path_is_file other/content
+ )
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 9+ messages in thread