From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Elijah Newren <newren@gmail.com>,
Elijah Newren <newren@gmail.com>
Subject: [PATCH v2] merge-ort: avoid assuming all renames detected
Date: Mon, 17 Jan 2022 18:25:55 +0000 [thread overview]
Message-ID: <pull.1194.v2.git.git.1642443955836.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1194.git.git.1642212566346.gitgitgadget@gmail.com>
From: Elijah Newren <newren@gmail.com>
In commit 8b09a900a1 ("merge-ort: restart merge with cached renames to
reduce process entry cost", 2021-07-16), we noted that in the merge-ort
steps of
collect_merge_info()
detect_and_process_renames()
process_entries()
that process_entries() was expensive, and we could often make it cheaper
by changing this to
collect_merge_info()
detect_and_process_renames()
<cache all the renames, and restart>
collect_merge_info()
detect_and_process_renames()
process_entries()
because the second collect_merge_info() would be cheaper (we could avoid
traversing into some directories), the second
detect_and_process_renames() would be free since we had already detected
all renames, and then process_entries() has far fewer entries to handle.
However, this was built on the assumption that the first
detect_and_process_renames() actually detected all potential renames.
If someone has merge.renameLimit set to some small value, that
assumption is violated which manifests later with the following message:
$ git -c merge.renameLimit=1 rebase upstream
...
git: merge-ort.c:546: clear_or_reinit_internal_opts: Assertion
`renames->cached_pairs_valid_side == 0' failed.
Turn off this cache-renames-and-restart whenever we cannot detect all
renames, and add a testcase that would have caught this problem.
Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
merge-ort: avoid assuming all renames detected
Fixes https://lore.kernel.org/git/YeHTIfEutLYM4TIU@nand.local/
Changes since v1:
* Fixed a small style issue
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1194%2Fnewren%2Favoid-assertion-assuming-renames-found-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1194/newren/avoid-assertion-assuming-renames-found-v2
Pull-Request: https://github.com/git/git/pull/1194
Range-diff vs v1:
1: f1e9901ae67 ! 1: 239d3ba08c1 merge-ort: avoid assuming all renames detected
@@ merge-ort.c: static int detect_and_process_renames(struct merge_options *opt,
trace2_region_enter("merge", "regular renames", opt->repo);
detection_run |= detect_regular_renames(opt, MERGE_SIDE1);
detection_run |= detect_regular_renames(opt, MERGE_SIDE2);
-+ if (renames->needed_limit != 0) {
++ if (renames->needed_limit) {
+ renames->cached_pairs_valid_side = 0;
+ renames->redo_after_renames = 0;
+ }
merge-ort.c | 4 ++
t/t6429-merge-sequence-rename-caching.sh | 67 ++++++++++++++++++++++++
2 files changed, 71 insertions(+)
diff --git a/merge-ort.c b/merge-ort.c
index c3197970219..b0ff9a72879 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3060,6 +3060,10 @@ static int detect_and_process_renames(struct merge_options *opt,
trace2_region_enter("merge", "regular renames", opt->repo);
detection_run |= detect_regular_renames(opt, MERGE_SIDE1);
detection_run |= detect_regular_renames(opt, MERGE_SIDE2);
+ if (renames->needed_limit) {
+ renames->cached_pairs_valid_side = 0;
+ renames->redo_after_renames = 0;
+ }
if (renames->redo_after_renames && detection_run) {
int i, side;
struct diff_filepair *p;
diff --git a/t/t6429-merge-sequence-rename-caching.sh b/t/t6429-merge-sequence-rename-caching.sh
index 035edc40b1e..f2bc8a7d2a2 100755
--- a/t/t6429-merge-sequence-rename-caching.sh
+++ b/t/t6429-merge-sequence-rename-caching.sh
@@ -697,4 +697,71 @@ test_expect_success 'caching renames only on upstream side, part 2' '
)
'
+#
+# The following testcase just creates two simple renames (slightly modified
+# on both sides but without conflicting changes), and a directory full of
+# files that are otherwise uninteresting. The setup is as follows:
+#
+# base: unrelated/<BUNCH OF FILES>
+# numbers
+# values
+# upstream: modify: numbers
+# modify: values
+# topic: add: unrelated/foo
+# modify: numbers
+# modify: values
+# rename: numbers -> sequence
+# rename: values -> progression
+#
+# This is a trivial rename case, but we're curious what happens with a very
+# low renameLimit interacting with the restart optimization trying to notice
+# that unrelated/ looks like a trivial merge candidate.
+#
+test_expect_success 'avoid assuming we detected renames' '
+ git init redo-weirdness &&
+ (
+ cd redo-weirdness &&
+
+ mkdir unrelated &&
+ for i in $(test_seq 1 10)
+ do
+ >unrelated/$i
+ done &&
+ test_seq 2 10 >numbers &&
+ test_seq 12 20 >values &&
+ git add numbers values unrelated/ &&
+ git commit -m orig &&
+
+ git branch upstream &&
+ git branch topic &&
+
+ git switch upstream &&
+ test_seq 1 10 >numbers &&
+ test_seq 11 20 >values &&
+ git add numbers &&
+ git commit -m "Some tweaks" &&
+
+ git switch topic &&
+
+ >unrelated/foo &&
+ test_seq 2 12 >numbers &&
+ test_seq 12 22 >values &&
+ git add numbers values unrelated/ &&
+ git mv numbers sequence &&
+ git mv values progression &&
+ git commit -m A &&
+
+ #
+ # Actual testing
+ #
+
+ git switch --detach topic^0 &&
+
+ test_must_fail git -c merge.renameLimit=1 rebase upstream &&
+
+ git ls-files -u >actual &&
+ ! test_file_is_empty actual
+ )
+'
+
test_done
base-commit: 1ffcbaa1a5f10c9f706314d77f88de20a4a498c2
--
gitgitgadget
next prev parent reply other threads:[~2022-01-17 18:26 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-15 2:09 [PATCH] merge-ort: avoid assuming all renames detected Elijah Newren via GitGitGadget
2022-01-16 21:46 ` Junio C Hamano
2022-01-16 23:19 ` Junio C Hamano
2022-01-17 18:25 ` Elijah Newren via GitGitGadget [this message]
2022-01-17 19:33 ` [PATCH v2] " Junio C Hamano
2022-01-17 21:21 ` Elijah Newren
2022-01-17 22:07 ` Taylor Blau
2022-01-17 22:23 ` Junio C Hamano
2022-01-17 22:16 ` Junio C Hamano
2022-06-30 9:53 ` SZEDER Gábor
2022-07-01 2:30 ` Elijah Newren
2022-07-01 5:21 ` Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.1194.v2.git.git.1642443955836.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.