* How to turn off rename detection for cherry-pick?
@ 2024-08-26 11:09 Pavel Rappo
2024-08-29 8:47 ` Pavel Rappo
0 siblings, 1 reply; 10+ messages in thread
From: Pavel Rappo @ 2024-08-26 11:09 UTC (permalink / raw)
To: Git mailing list
As far as I understand, the "ort" strategy does not allow turning off
rename detection. The best you can do is set the similarity index
threshold to 100%. However, it does not address the case of candidate
files being exactly the same.
So, it seems to me that the only way to do it would be to downgrade to
the "recursive" strategy and set the "no-renames" option:
git cherry-pick --strategy=recursive --strategy-option=no-renames
<commit>...
Is my understanding correct? Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: How to turn off rename detection for cherry-pick? 2024-08-26 11:09 How to turn off rename detection for cherry-pick? Pavel Rappo @ 2024-08-29 8:47 ` Pavel Rappo 2024-08-29 21:43 ` Jeff King 0 siblings, 1 reply; 10+ messages in thread From: Pavel Rappo @ 2024-08-29 8:47 UTC (permalink / raw) To: Git mailing list The reason I ask this is that we've run into a (probably practically rare) case where cherry-pick changes a wrong file. We want to be able to detect such cases. Distilled from the actual repo, here is the case. Create a repo first: mkdir cherry-pick-carefully cd cherry-pick-carefully git init -b main . cat << EOF > x.txt 1 2 3 4 5 EOF git add x.txt git commit -m "Add x.txt" git checkout -b feature git rm x.txt git add x.txt git commit -m "Delete x.txt" cat << EOF > y.txt 1 2 3 4 EOF git add y.txt git commit -m "Add y.txt, similar but not equal" cat << EOF > y.txt 1 2 4 EOF git add y.txt git commit -m "Slightly change y.txt" git checkout main Now try to cherry-pick feature's head commit onto main: cd cherry-pick-carefully git cherry-pick feature With a default git configuration, this should (surprisingly to many) change x.txt instead of y.txt, which is not what the user would expect. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-29 8:47 ` Pavel Rappo @ 2024-08-29 21:43 ` Jeff King 2024-08-29 23:12 ` Pavel Rappo 0 siblings, 1 reply; 10+ messages in thread From: Jeff King @ 2024-08-29 21:43 UTC (permalink / raw) To: Pavel Rappo; +Cc: Elijah Newren, Git mailing list On Thu, Aug 29, 2024 at 09:47:52AM +0100, Pavel Rappo wrote: > The reason I ask this is that we've run into a (probably practically > rare) case where cherry-pick changes a wrong file. We want to be able > to detect such cases. You can pass merge strategy options on the command line. The old "recursive" strategy has a "no-renames" option, so: git cherry-pick --strategy=recursive -Xno-renames feature generates a modify/delete conflict for your example. Curiously, the modern default, "ort", does not seem to respect that option. You can bump up the limit to require exact renames, though, which does prevent the mismerge in your case. Like: git cherry-pick -Xfind-renames=100% feature There are also other strategies that do not do rename detection, but I think you are better off using one of the more commonly-used strategies and just disabling renames. IMHO it's a bug that ort doesn't respect -Xno-renames. -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-29 21:43 ` Jeff King @ 2024-08-29 23:12 ` Pavel Rappo 2024-08-30 0:31 ` Jeff King 0 siblings, 1 reply; 10+ messages in thread From: Pavel Rappo @ 2024-08-29 23:12 UTC (permalink / raw) To: Jeff King; +Cc: Elijah Newren, Git mailing list You seem to have confirmed my understanding that I described in my initial email (you replied to my second email in this thread). On Thu, Aug 29, 2024 at 10:43 PM Jeff King <peff@peff.net> wrote: > > On Thu, Aug 29, 2024 at 09:47:52AM +0100, Pavel Rappo wrote: > > > The reason I ask this is that we've run into a (probably practically > > rare) case where cherry-pick changes a wrong file. We want to be able > > to detect such cases. > > You can pass merge strategy options on the command line. The old > "recursive" strategy has a "no-renames" option, so: > > git cherry-pick --strategy=recursive -Xno-renames feature > > generates a modify/delete conflict for your example. Curiously, the > modern default, "ort", does not seem to respect that option. You can > bump up the limit to require exact renames, though, which does prevent > the mismerge in your case. Like: > > git cherry-pick -Xfind-renames=100% feature > > There are also other strategies that do not do rename detection, but I > think you are better off using one of the more commonly-used strategies > and just disabling renames. IMHO it's a bug that ort doesn't respect > -Xno-renames. > > -Peff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-29 23:12 ` Pavel Rappo @ 2024-08-30 0:31 ` Jeff King 2024-08-30 10:35 ` Pavel Rappo ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Jeff King @ 2024-08-30 0:31 UTC (permalink / raw) To: Pavel Rappo; +Cc: Elijah Newren, Git mailing list On Fri, Aug 30, 2024 at 12:12:07AM +0100, Pavel Rappo wrote: > You seem to have confirmed my understanding that I described in my > initial email (you replied to my second email in this thread). Heh, I did not even see the first message in the thread. But since we independently arrived at the same conclusions, I guess we can consider everything there accurate. :) I do think it's a bug that ort doesn't respect -Xno-renames. The fix is probably something like this: diff --git a/merge-ort.c b/merge-ort.c index 3752c7e595..94b3ce734c 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -3338,7 +3338,7 @@ static int detect_regular_renames(struct merge_options *opt, repo_diff_setup(opt->repo, &diff_opts); diff_opts.flags.recursive = 1; diff_opts.flags.rename_empty = 0; - diff_opts.detect_rename = DIFF_DETECT_RENAME; + diff_opts.detect_rename = opts->detect_rename; diff_opts.rename_limit = opt->rename_limit; if (opt->rename_limit <= 0) diff_opts.rename_limit = 7000; though I'm not sure how DIFF_DETECT_COPIES should be handled ("recursive" silently downgrades it to DIFF_DETECT_RENAME). I've added ort's author to the thread, so hopefully he should have a more clueful response. -Peff ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-30 0:31 ` Jeff King @ 2024-08-30 10:35 ` Pavel Rappo 2024-08-30 14:33 ` Elijah Newren 2024-08-30 16:14 ` Junio C Hamano 2 siblings, 0 replies; 10+ messages in thread From: Pavel Rappo @ 2024-08-30 10:35 UTC (permalink / raw) To: Jeff King; +Cc: Elijah Newren, Git mailing list On Fri, Aug 30, 2024 at 1:31 AM Jeff King <peff@peff.net> wrote: > > <snip> > > I've added ort's author to the thread, so hopefully he should have a > more clueful response. > Thanks! Some time ago I created a gist which might better summarise the issue: https://gist.github.com/pavelrappo/3b1cd89b6015ab0eade1b11876d563ff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-30 0:31 ` Jeff King 2024-08-30 10:35 ` Pavel Rappo @ 2024-08-30 14:33 ` Elijah Newren 2024-08-30 16:49 ` Junio C Hamano 2024-08-30 16:14 ` Junio C Hamano 2 siblings, 1 reply; 10+ messages in thread From: Elijah Newren @ 2024-08-30 14:33 UTC (permalink / raw) To: Jeff King; +Cc: Pavel Rappo, Git mailing list On Thu, Aug 29, 2024 at 5:31 PM Jeff King <peff@peff.net> wrote: > > On Fri, Aug 30, 2024 at 12:12:07AM +0100, Pavel Rappo wrote: > > > You seem to have confirmed my understanding that I described in my > > initial email (you replied to my second email in this thread). > > Heh, I did not even see the first message in the thread. But since we > independently arrived at the same conclusions, I guess we can consider > everything there accurate. :) > > I do think it's a bug that ort doesn't respect -Xno-renames. I'm kind of splitting hairs, but doesn't "bug" imply it was overlooked? Documentation/merge-strategies.txt makes it clear it wasn't. ;-) However, I agree that it makes sense to start supporting. I didn't at first because (a) I could find no evidence at the time that anyone actually ever used the option in conjunction with merges for behavioral reasons (this thread from Pavel is the first counterexample I've seen), and (b) there was lots of evidence that the related config option was in widespread use as a workaround to the unnecessary performance problems of rename handling with the recursive merge algorithm. In particular, I wanted to hear about any performance issues with renames in merges, and it was far easier to make the new (then-experimental) algorithm just ignore these options than attempt to do widespread convincing of folks to switch the relevant config back on. I think that a few years has given us more than ample time to hear about potential remaining performance issues with renames in merges, and this thread is a good example of why to support this option. > The fix is probably something like this: The fix probably starts with something like this... > diff --git a/merge-ort.c b/merge-ort.c > index 3752c7e595..94b3ce734c 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -3338,7 +3338,7 @@ static int detect_regular_renames(struct merge_options *opt, > repo_diff_setup(opt->repo, &diff_opts); > diff_opts.flags.recursive = 1; > diff_opts.flags.rename_empty = 0; > - diff_opts.detect_rename = DIFF_DETECT_RENAME; > + diff_opts.detect_rename = opts->detect_rename; > diff_opts.rename_limit = opt->rename_limit; > if (opt->rename_limit <= 0) > diff_opts.rename_limit = 7000; > > though I'm not sure how DIFF_DETECT_COPIES should be handled > ("recursive" silently downgrades it to DIFF_DETECT_RENAME). Not downgrading DIFF_DETECT_COPIES for merging would break both "recursive" and "ort" in all kinds of interesting ways. I once spent a little time thinking how copy detection might affect things (long before working on ort), and noted that under such a scenario, "recursive" would have multiple logic bugs (mostly in the form of missing additional needed logic rather than existing logic being incorrect, but in places that aren't necessarily obvious at first). Someone would have to very carefully audit the whole file containing either recursive or ort's algorithms if they wanted to make either somehow support copy detection. "ort" would be even more problematic than "recursive" for such a case -- I took full advantage of the differences between rename detection and copy detection while optimizing ort and I think it was intrinsic to every one of the major optimizations I did. So, if you really wanted to support copy detection in ort, the very first step would be adding code to turn off every single one of its major optimizations (including tree-level merging, which didn't sound like a rename or copy detection like thing, but hinged on how rename detection works to make sure it didn't miss important halfs of renames). But, the bigger issue with copies is how exactly could a merge algorithm use them in any way that would make any logical sense? What are you going to do, take the modifications that one side of history made to some source file, and apply those modifications to all the copies of the file that the other side of history made? That sounds crazy and counter-intuitive to me. (...and incidentally, like a factory for creating all kinds of crazy corner case issues; we could probably make things even messier than the mod6 testcases in the testsuite.) > I've added ort's author to the thread, so hopefully he should have a > more clueful response. Yeah, not so straightforward. Even with the downgrading of copy detection to rename detection in this area (like "recursive" does), this isn't sufficient. IIRC (I looked at this about 2 years ago or so when some others asked off-list), at least one of the optimizations managed to bake in an assumption about having gone through the rename detection codepaths. As best I remember, when you attempt to turn rename detection off, it triggers an assertion failure in a non-obvious place far removed from the original issue, and I only ever got a slightly hacky workaround or two for the issue. I didn't have good motivation or rationale to pursue very far at the time, though, so after some hours of looking at it, I just decided to move on to something else. Anyway, I'll add support for no-renames to my list of things needed before we can delete merge-recursive.[ch] and make requests for "recursive" just map to "ort". ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-30 14:33 ` Elijah Newren @ 2024-08-30 16:49 ` Junio C Hamano 0 siblings, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2024-08-30 16:49 UTC (permalink / raw) To: Elijah Newren; +Cc: Jeff King, Pavel Rappo, Git mailing list Elijah Newren <newren@gmail.com> writes: > Anyway, I'll add support for no-renames to my list of things needed > before we can delete merge-recursive.[ch] and make requests for > "recursive" just map to "ort". Sounds sensible. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-30 0:31 ` Jeff King 2024-08-30 10:35 ` Pavel Rappo 2024-08-30 14:33 ` Elijah Newren @ 2024-08-30 16:14 ` Junio C Hamano 2024-08-30 17:17 ` Junio C Hamano 2 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2024-08-30 16:14 UTC (permalink / raw) To: Jeff King; +Cc: Pavel Rappo, Elijah Newren, Git mailing list Jeff King <peff@peff.net> writes: > though I'm not sure how DIFF_DETECT_COPIES should be handled > ("recursive" silently downgrades it to DIFF_DETECT_RENAME). I somehow suspect that we would want to go there to avoid overloading ourselves (and end-users) with conceptual complexity. If we detect that the file the side branch modified was copied to create multiple files on our side (with or without modification on our end), where would we want to reapply their changes to? To the original file only? To all copies? The same story goes for the opposite direction. How should possibly different changes they made to their copies be merged back to the sole original file we have (which we may or may not have modified)? What if the both sides made copies of the same file, and now we need to shuffle NxM combination of changes? I would expect it would be a whole can of worms, which we _may_ be able to define concrete rules how we would handle each of these cases but we may have a hard time to explain in simple terms to end-users. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: How to turn off rename detection for cherry-pick? 2024-08-30 16:14 ` Junio C Hamano @ 2024-08-30 17:17 ` Junio C Hamano 0 siblings, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2024-08-30 17:17 UTC (permalink / raw) To: Jeff King; +Cc: Pavel Rappo, Elijah Newren, Git mailing list Junio C Hamano <gitster@pobox.com> writes: > Jeff King <peff@peff.net> writes: > >> though I'm not sure how DIFF_DETECT_COPIES should be handled >> ("recursive" silently downgrades it to DIFF_DETECT_RENAME). > > I somehow suspect that we would want to go there to avoid > overloading ourselves (and end-users) with conceptual complexity. Of course, we would *NOT* want to go there X-<. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-08-30 17:17 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-26 11:09 How to turn off rename detection for cherry-pick? Pavel Rappo 2024-08-29 8:47 ` Pavel Rappo 2024-08-29 21:43 ` Jeff King 2024-08-29 23:12 ` Pavel Rappo 2024-08-30 0:31 ` Jeff King 2024-08-30 10:35 ` Pavel Rappo 2024-08-30 14:33 ` Elijah Newren 2024-08-30 16:49 ` Junio C Hamano 2024-08-30 16:14 ` Junio C Hamano 2024-08-30 17:17 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).