From: Sam James <sam@gentoo.org>
To: Elijah Newren <newren@gmail.com>
Cc: Sam James <sam@gentoo.org>,
git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2] diff: implement config.diff.renames=copies-harder
Date: Fri, 16 Feb 2024 22:16:04 +0000 [thread overview]
Message-ID: <87wmr4aw9q.fsf@gentoo.org> (raw)
In-Reply-To: <CABPp-BFZqa2wyNUs0OfUOokGORjAguCK7-1hqK6U+SUxm8E5Lw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 10957 bytes --]
Elijah Newren <newren@gmail.com> writes:
> Hi,
>
> It helps when providing a new iteration of a patch to do two things:
> (1) provide a range-diff against the previous version to make it
> easier for reviewers to see what has changed
> (2) either reply to the previous submission or link to it in your
> header so reviewers can find previous comments
Thanks, I'll do that for upcoming v3.
>
> In this case, v1 was over at
> https://lore.kernel.org/git/pull.1606.git.1699010701704.gitgitgadget@gmail.com/.
>
> On Tue, Dec 26, 2023 at 12:21 PM Sam James <sam@gentoo.org> wrote:
>>
>> This patch adds a config value for 'diff.renames' called 'copies-harder'
>> which make it so '-C -C' is in effect always passed for 'git log -p',
>> 'git diff', etc.
>>
>> This allows specifying that 'git log -p', 'git diff', etc should always act
>> as if '-C --find-copies-harder' was passed.
>>
>> It has proven this especially useful for certain types of repository (like
>> Gentoo's ebuild repositories) because files are often copies of a previous
>> version:
>>
>> Suppose a directory 'sys-devel/gcc' contains recipes for building
>> GCC, with one file for each supported upstream branch:
>> gcc-13.x.build.recipe
>> gcc-12.x.build.recipe
>> gcc-11.x.build.recipe
>> gcc-10.x.build.recipe
>>
>> gcc-13.x.build.recipe was started as a copy of gcc-12.x.build.recipe
>> (which was started as a copy of gcc-11.x.build.recipe, etc.). Previous versions
>> are kept around to support parallel installation of multiple versions.
>>
>> Being able to easily observe the diff relative to other recipes within the
>> directory has been a quality of life improvement for such repo layouts.
>>
>> Cc: Elijah Newren <newren@gmail.com>
>> Cc: Junio C Hamano <gitster@pobox.com>
>> Signed-off-by: Sam James <sam@gentoo.org>
>
> While the "Cc:" lines don't hurt, note that everything above this
> point is included in the commit message, so we'd typically rather
> those two lines be below the triple-dashed line.
ACK.
>
>> ---
>> v2: Improve the commit message using Elijah Newren's example (it is indeed
>> precisely what I was thinking of, but phrased better), and improve the documentation
>> to explain better what the new config option actually does & refer to git-diff's
>> --find-copies-harder.
>>
>> Documentation/config/diff.txt | 8 +++++---
>> Documentation/config/status.txt | 3 ++-
>> diff.c | 12 +++++++++---
>> diff.h | 1 +
>> diffcore-rename.c | 4 ++--
>> merge-ort.c | 2 +-
>> merge-recursive.c | 2 +-
>> 7 files changed, 21 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
>> index bd5ae0c337..cdd8a74ec0 100644
>> --- a/Documentation/config/diff.txt
>> +++ b/Documentation/config/diff.txt
>> @@ -131,9 +131,11 @@ diff.renames::
>> Whether and how Git detects renames. If set to "false",
>> rename detection is disabled. If set to "true", basic rename
>> detection is enabled. If set to "copies" or "copy", Git will
>> - detect copies, as well. Defaults to true. Note that this
>> - affects only 'git diff' Porcelain like linkgit:git-diff[1] and
>> - linkgit:git-log[1], and not lower level commands such as
>> + detect copies, as well. If set to "copies-harder", Git will spend extra
>> + cycles to find more copies even in unmodified paths, see
>> + '--find-copies-harder' in linkgit:git-diff[1]. Defaults to true.
>> + Note that this affects only 'git diff' Porcelain like linkgit:git-diff[1]
>> + and linkgit:git-log[1], and not lower level commands such as
>> linkgit:git-diff-files[1].
>>
>> diff.suppressBlankEmpty::
>> diff --git a/Documentation/config/status.txt b/Documentation/config/status.txt
>> index 2ff8237f8f..7ca7a4becd 100644
>> --- a/Documentation/config/status.txt
>> +++ b/Documentation/config/status.txt
>> @@ -33,7 +33,8 @@ status.renames::
>> Whether and how Git detects renames in linkgit:git-status[1] and
>> linkgit:git-commit[1] . If set to "false", rename detection is
>> disabled. If set to "true", basic rename detection is enabled.
>> - If set to "copies" or "copy", Git will detect copies, as well.
>> + If set to "copies" or "copy", Git will detect copies, as well. If
>> + set to "copies-harder", Git will try harder to detect copies.
>
> It'd be nice to have the improved text from diff.renames here ("If set
> to "copies-harder", Git will spend extra cycles to find more copies
> even in unmodified paths.").
>
>> Defaults to the value of diff.renames.
>>
>> status.showStash::
>> diff --git a/diff.c b/diff.c
>> index a2def45644..585acf9a99 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -206,8 +206,11 @@ int git_config_rename(const char *var, const char *value)
>> {
>> if (!value)
>> return DIFF_DETECT_RENAME;
>> + if (!strcasecmp(value, "copies-harder"))
>> + return DIFF_DETECT_COPY_HARDER;
>> if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
>> - return DIFF_DETECT_COPY;
>> + return DIFF_DETECT_COPY;
>> +
>
> I pointed out in response to v1 that these last couple lines, while a
> nice cleanup, should be in a separate patch.
>
>> return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
>> }
>>
>> @@ -4849,8 +4852,11 @@ void diff_setup_done(struct diff_options *options)
>> else
>> options->flags.diff_from_contents = 0;
>>
>> - if (options->flags.find_copies_harder)
>> + /* Just fold this in as it makes the patch-to-git smaller */
>> + if (options->flags.find_copies_harder || options->detect_rename == DIFF_DETECT_COPY_HARDER) {
>
> Again, I already responded to v1 that four of your lines were too long
> and needed to be split. All four remain unsplit in your resubmission.
> For future reference, when you resubmit a patch, please either (a)
> first respond to the review explaining why you don't agree with the
> suggested changes, or (b) include the fixes reviewers point out, or
> (c) state in your cover letter or explanation after the '---' why you
> chose to not include the suggested changes.
I apologise for the rookie errors, I don't really have an excuse either.
>
>> + options->flags.find_copies_harder = 1;
>> options->detect_rename = DIFF_DETECT_COPY;
>> + }
>>
>> if (!options->flags.relative_name)
>> options->prefix = NULL;
>> @@ -5281,7 +5287,7 @@ static int diff_opt_find_copies(const struct option *opt,
>> if (*arg != 0)
>> return error(_("invalid argument to %s"), opt->long_name);
>>
>> - if (options->detect_rename == DIFF_DETECT_COPY)
>> + if (options->detect_rename == DIFF_DETECT_COPY || options->detect_rename == DIFF_DETECT_COPY_HARDER)
>> options->flags.find_copies_harder = 1;
>> else
>> options->detect_rename = DIFF_DETECT_COPY;
>> diff --git a/diff.h b/diff.h
>> index 66bd8aeb29..b29e5b777f 100644
>> --- a/diff.h
>> +++ b/diff.h
>> @@ -555,6 +555,7 @@ int git_config_rename(const char *var, const char *value);
>>
>> #define DIFF_DETECT_RENAME 1
>> #define DIFF_DETECT_COPY 2
>> +#define DIFF_DETECT_COPY_HARDER 3
>>
>> #define DIFF_PICKAXE_ALL 1
>> #define DIFF_PICKAXE_REGEX 2
>> diff --git a/diffcore-rename.c b/diffcore-rename.c
>> index 5a6e2bcac7..856291d66f 100644
>> --- a/diffcore-rename.c
>> +++ b/diffcore-rename.c
>> @@ -299,7 +299,7 @@ static int find_identical_files(struct hashmap *srcs,
>> }
>> /* Give higher scores to sources that haven't been used already */
>> score = !source->rename_used;
>> - if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY)
>> + if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY && options->detect_rename != DIFF_DETECT_COPY_HARDER)
>> continue;
>> score += basename_same(source, target);
>> if (score > best_score) {
>> @@ -1405,7 +1405,7 @@ void diffcore_rename_extended(struct diff_options *options,
>> trace2_region_enter("diff", "setup", options->repo);
>> info.setup = 0;
>> assert(!dir_rename_count || strmap_empty(dir_rename_count));
>> - want_copies = (detect_rename == DIFF_DETECT_COPY);
>> + want_copies = (detect_rename == DIFF_DETECT_COPY || detect_rename == DIFF_DETECT_COPY_HARDER);
>> if (dirs_removed && (break_idx || want_copies))
>> BUG("dirs_removed incompatible with break/copy detection");
>> if (break_idx && relevant_sources)
>> diff --git a/merge-ort.c b/merge-ort.c
>> index 6491070d96..7749835465 100644
>> --- a/merge-ort.c
>> +++ b/merge-ort.c
>> @@ -4782,7 +4782,7 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
>> * sanity check them anyway.
>> */
>> assert(opt->detect_renames >= -1 &&
>> - opt->detect_renames <= DIFF_DETECT_COPY);
>> + opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>> assert(opt->verbosity >= 0 && opt->verbosity <= 5);
>> assert(opt->buffer_output <= 2);
>> assert(opt->obuf.len == 0);
>> diff --git a/merge-recursive.c b/merge-recursive.c
>> index e3beb0801b..d52dd53660 100644
>> --- a/merge-recursive.c
>> +++ b/merge-recursive.c
>> @@ -3708,7 +3708,7 @@ static int merge_start(struct merge_options *opt, struct tree *head)
>> assert(opt->branch1 && opt->branch2);
>>
>> assert(opt->detect_renames >= -1 &&
>> - opt->detect_renames <= DIFF_DETECT_COPY);
>> + opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>> assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
>> opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
>> assert(opt->rename_limit >= -1);
>> --
>> 2.43.0
>
> The patch looks close, but it does continue to violate style
> guidelines and make unrelated changes, like v1. And the wording in
> one of the documentation blurbs could be improved a little. Looking
> forward to a v3 with those fixed.
I've not finished checking yet, but I think
5825268db1058516d05be03d6a8d8d55eea5a943 fixed a bug which I'd been
relying on for something, where I may need to add another option for git
log to suppress copies (for scripts).
Should I send that in a series when doing v3 if it ends up being
required?
thanks,
sam
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 377 bytes --]
next prev parent reply other threads:[~2024-02-16 22:18 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-26 20:20 [PATCH v2] diff: implement config.diff.renames=copies-harder Sam James
2023-12-27 2:24 ` Elijah Newren
2024-02-16 22:16 ` Sam James [this message]
2024-02-17 0:54 ` Sam James
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87wmr4aw9q.fsf@gentoo.org \
--to=sam@gentoo.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox