* [PATCH] blame: make diff algorithm configurable
@ 2025-10-20 14:56 Antonin Delpeuch via GitGitGadget
2025-10-20 16:05 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-10-20 14:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' can be configured using the
`--diff-algorithm` option or the `diff.algorithm` config variable.
Myers diff remains the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
blame: make diff algorithm configurable
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation, it is worth making sure that the diff algorithm is
configurable where useful. It can have significant impact on the output
of the git-blame command, so I propose to make it configurable there
too. I have followed the convention of other commands (such as git-diff)
to introduce a --diff-algorithm option.
I understand that this command is a user-facing (porcelain) one, so I
think making it honor the diff.algorithm UI config variable is also
appropriate. The git-blame command has a machine-readable format that
can be enabled with --porcelain (which should be called --plumbing if
you ask me) so I wonder if the diff.algorithm variable should still be
honored in this case, as there could be the desire to keep it
independent from UI config variables (similarly to git-merge-file, a
plumbing command which doesn't honor diff.algorithm).
If the general idea of this patch is judged worthwhile, I would be happy
to add tests to demonstrate the impact of the diff algorithm on blame
output.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v1
Pull-Request: https://github.com/git/git/pull/2075
Documentation/git-blame.adoc | 21 ++++++++++++++++++++
builtin/blame.c | 38 +++++++++++++++++++++++++++++++++++-
2 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..4beb2df551 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,27 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..177b606e81 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,26 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ // ignore any previous --minimal setting, following git-diff's behavior
+ *opt &= ~XDF_NEED_MINIMAL;
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -908,13 +941,16 @@ int cmd_blame(int argc,
OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME),
OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER),
OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN),
- OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN),
+ OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN),
OPT_BIT('c', NULL, &output_option, N_("use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT),
OPT_BIT('t', NULL, &output_option, N_("show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP),
OPT_BIT('l', NULL, &output_option, N_("show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME),
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable
2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
@ 2025-10-20 16:05 ` Junio C Hamano
2025-10-22 9:37 ` Antonin Delpeuch
2025-10-23 16:03 ` Phillip Wood
2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2025-10-20 16:05 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget; +Cc: git, Elijah Newren, Antonin Delpeuch
"Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The diff algorithm used in 'git-blame(1)' can be configured using the
> `--diff-algorithm` option or the `diff.algorithm` config variable.
> Myers diff remains the default.
The usual way to compose a log message of this project is to
- Give an observation on how the current system works in the
present tense (so no need to say "Currently X is Y", or
"Previously X was Y" to describe the state before your change;
just "X is Y" is enough), and discuss what you perceive as a
problem in it.
- Propose a solution (optional---often, problem description
trivially leads to an obvious solution in reader's minds).
- Give commands to somebody editing the codebase to "make it so",
instead of saying "This commit does X".
in this order. This hasn't changed since your first commit to this
project a few years ago.
And when read with that expectation, I was surprised that "blame"
already paid attention to the command line option and configuration
variable, as that paragraph was supposed to explain what happens
without the patch being proposed. It was a pleasant surprise that
turned out to be untrue X-<.
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> ---
> blame: make diff algorithm configurable
>
> There has been long-standing interest in changing the default diff
> algorithm to "histogram", and Git 3.0 was floated as a possible occasion
> for that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
>
> As a preparation, it is worth making sure that the diff algorithm is
> configurable where useful. It can have significant impact on the output
> of the git-blame command, so I propose to make it configurable there
> too. I have followed the convention of other commands (such as git-diff)
> to introduce a --diff-algorithm option.
All of the above are good materials to be in the proposed log
message, not under the three-dash line, to explain the motivation
behind the change.
> I understand that this command is a user-facing (porcelain) one, so I
> think making it honor the diff.algorithm UI config variable is also
> appropriate. The git-blame command has a machine-readable format that
> can be enabled with --porcelain (which should be called --plumbing if
> you ask me) so I wonder if the diff.algorithm variable should still be
> honored in this case, as there could be the desire to keep it
> independent from UI config variables (similarly to git-merge-file, a
> plumbing command which doesn't honor diff.algorithm).
Good consideration and something we should make sure we do the right
thing for our users. Personally, I would not be concerned---the
only folks that possibly affected are those who save old blame
output and wants a fresh "git blame" run they make today would
produce bit-for-bit identical output, but if they use newer versions
of Git with improved xdiff implementation, they cannot expect that
with or without the configuration knob _anyway_. This is just my
personal opinion. Others may differ.
> If the general idea of this patch is judged worthwhile, I would be happy
> to add tests to demonstrate the impact of the diff algorithm on blame
> output.
Do not ever say this here.
I've seen from time to time people ask "I am thinking of doing this;
will a patch be accepted? If so, I'll work on it." before showing
any work, and my response always has been:
(1) We don't know how useful and interesting your contribution would
be for our audience, until we see it; and
(2) If you truly believe in your work (find it useful, find writing
it fun, etc.), that would be incentive enough for you to work
on it, whether or not the result will land in my tree. You
should instead aim for something so brilliant that we would
come to you begging for your permission to include it in our
project.
> diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
> index e438d28625..4beb2df551 100644
> --- a/Documentation/git-blame.adoc
> +++ b/Documentation/git-blame.adoc
> @@ -85,6 +85,27 @@ include::blame-options.adoc[]
> Ignore whitespace when comparing the parent's version and
> the child's to find where the lines came from.
>
> +`--diff-algorithm=(patience|minimal|histogram|myers)`::
> + Choose a diff algorithm. The variants are as follows:
> ++
> +--
> + `default`;;
> + `myers`;;
> + The basic greedy diff algorithm. Currently, this is the default.
> + `minimal`;;
> + Spend extra time to make sure the smallest possible diff is
> + produced.
> + `patience`;;
> + Use "patience diff" algorithm when generating patches.
> + `histogram`;;
> + This algorithm extends the patience algorithm to "support
> + low-occurrence common elements".
> +--
> ++
> +For instance, if you configured the `diff.algorithm` variable to a
> +non-default value and want to use the default one, then you
> +have to use `--diff-algorithm=default` option.
Is this copied from somewhere else, or did you come up with the
above text yourself? If the former, perhaps it is a good idea to
reduce the duplicattion. Use of "include::line-range-format.adoc[]"
in Documentation/blame-options.adoc (which in turn is included by
Documentation/git-blame.adoc) may serve as a good model to include
the same text in multiple places (the "line-range" syntax thing is
included directly or indirectly and its text appears in a handful of
places as the result). Copy the original text out into a new file
to be included (say, "diff-algorithm-option.adoc"), replace the
original text with "include::diff-algorithm-option.adoc[]", and then
add another "include::diff-algorithm-option.adoc[]" here in
git-blame documentation instead of duplicating the text like the
above hunk does.
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 2703820258..177b606e81 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
> }
> }
>
> + if (!strcmp(var, "diff.algorithm")) {
> + long diff_algorithm;
> + if (!value)
> + return config_error_nonbool(var);
> + diff_algorithm = parse_algorithm_value(value);
> + if (diff_algorithm < 0)
> + return error(_("unknown value for config '%s': %s"),
> + var, value);
OK, this message is copied from git_diff_ui_config(), which is where
"git log" and 4 commands in the "git diff" family gets their error
message when "git -c diff.algorithm=bogus <cmd>" is run. It is a
bit suboptimal, but users would know how to read the documentation
(even though in practice they never do), so let's say this is OK, at
least for now.
For future reference (note: this is a #leftoverbits comment that is
left here for the benefit of those who scan the list archive for
ideas on what to do when they are absolutely bored without anything
interesting to do, not meant as a suggestion to do anything of this
sort before this patch lands), in addition to "git log" and 4
commands in the "git diff" family,
- merge-ort.c has the same message.
- builtin/merge-file.c gives a bit nicer message but that is a bit
of maintenance burden.
- curiously "git log" and four commands in the "git diff" family
give a much nicer message when a --diff-algorithm=bogus is given
from the command line, but not in the configuration file.
we may want to consolidate the error message into one place (a
constant or "extern const char *diff_algorithm_error_message",
or something else that is i18n friendly) and use it from all these
places I just identified.
> + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
> + xdl_opts |= diff_algorithm;
> + return 0;
> + }
> if (git_diff_heuristic_config(var, value, cb) < 0)
> return -1;
This one relies on git_diff_heuristic_config() to give message when
it returns negative, so we do not have to do anything. OK.
Contination of the above #leftoverbits may be to see if
parse_algorithm_value() is a good place to consolidate the error
message, after auditing all its callers (if such a change turns
out to be a good idea, they need to lose their own messages).
> @@ -824,6 +837,26 @@ static int blame_move_callback(const struct option *option, const char *arg, int
> return 0;
> }
>
> +static int blame_diff_algorithm_callback(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> + long value = parse_algorithm_value(arg);
> +
> + BUG_ON_OPT_NEG(unset);
> +
> + if (value < 0)
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
You inherited the same "config error gets a message that requires
users to consult the manual, option error gets something a bit more
useful but is a maintenance burden" trait from "git diff" and family
here. Let's say this is OK, too, at least for now.
> + // ignore any previous --minimal setting, following git-diff's behavior
We do not do // comments around here, outside borrowed code.
> + *opt &= ~XDF_NEED_MINIMAL;
> + *opt &= ~XDF_DIFF_ALGORITHM_MASK;
> + *opt |= value;
> +
> + return 0;
> +}
> +
> static int is_a_rev(const char *name)
> {
> struct object_id oid;
> @@ -908,13 +941,16 @@ int cmd_blame(int argc,
> OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME),
> OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER),
> OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN),
> - OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN),
> + OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN),
WHY?
> OPT_BIT('c', NULL, &output_option, N_("use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT),
> OPT_BIT('t', NULL, &output_option, N_("show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP),
> OPT_BIT('l', NULL, &output_option, N_("show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME),
> OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
> OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
> OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
> + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
> + N_("choose a diff algorithm"),
> + PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OK.
> OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
> OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
> OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
>
> base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable
2025-10-20 16:05 ` Junio C Hamano
@ 2025-10-22 9:37 ` Antonin Delpeuch
2025-10-22 20:39 ` Junio C Hamano
0 siblings, 1 reply; 32+ messages in thread
From: Antonin Delpeuch @ 2025-10-22 9:37 UTC (permalink / raw)
To: Junio C Hamano, Antonin Delpeuch via GitGitGadget; +Cc: git, Elijah Newren
On 20/10/2025 18:05, Junio C Hamano wrote:
>> If the general idea of this patch is judged worthwhile, I would be happy
>> to add tests to demonstrate the impact of the diff algorithm on blame
>> output.
> Do not ever say this here.
>
> I've seen from time to time people ask "I am thinking of doing this;
> will a patch be accepted? If so, I'll work on it." before showing
> any work, and my response always has been:
>
> (1) We don't know how useful and interesting your contribution would
> be for our audience, until we see it; and
>
> (2) If you truly believe in your work (find it useful, find writing
> it fun, etc.), that would be incentive enough for you to work
> on it, whether or not the result will land in my tree. You
> should instead aim for something so brilliant that we would
> come to you begging for your permission to include it in our
> project.
I am surprised by your reaction here, both by its substance and form.
My understanding is that gathering feedback on a proposal before
carrying out the implementation work in its entirety is widely accepted
as a good practice for contributions to open source projects. For
instance, the following guide encourages to do so (in GitHub terms, by
proposing an improvement as an issue first, and by opening a draft pull
request if necessary):
https://opensource.guide/how-to-contribute/
While this is phrased in the context of GitHub, I think the general
principle behind it is healthy. In projects I maintain, I feel bad for
contributors who submit contributions that clearly required a
significant effort, but that I can't accept for certain reasons. I wish
they had got in touch ahead of investing all this work, because I care
about their time.
In fact, I already did so for an earlier contribution to this very
project, and on that occasion you did not seem to take offense at the
fact that my proposal was done without an accompanying patch:
https://lore.kernel.org/git/8bb5e41e-4db9-4527-8492-3aca6a0f40bf@delpeuch.eu/
Has your position changed since? Or did I benefit from more of your
kindness back then as a new contributor?
Your argument about my work being worthy on its own even if it's not
integrated to your tree is an interesting one, but let me expand on my
motivation for this patch. This change is not something I personally
need, nor something that is particularly fun to write. I am working on
this with the hope that it will eventually make it possible to switch
the default to the histogram algorithm, for the benefit of many git
users. I see no point for this patch if it is not integrated in your
tree. It is a gift to you and to the git community: if the gift is to be
declined, I'd rather not spend time crafting it.
Concerning the form, I feel obliged to let you know that from my
cultural standpoint, your reply reads rather aggressive. Specifically,
the sentence "Do not ever say this here." reads menacing to me, and the
use of all caps later on in your reply reads aggressive to me. I'm doing
my best to assume that it is not be the attitude you wanted to convey
and hope that you can receive this feedback gratefully.
>> + *opt &= ~XDF_NEED_MINIMAL;
>> + *opt &= ~XDF_DIFF_ALGORITHM_MASK;
>> + *opt |= value;
>> +
>> + return 0;
>> +}
>> +
>> static int is_a_rev(const char *name)
>> {
>> struct object_id oid;
>> @@ -908,13 +941,16 @@ int cmd_blame(int argc,
>> OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME),
>> OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER),
>> OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN),
>> - OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN),
>> + OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN),
> WHY?
In an attempt to conform to the coding style of this project, I ran
`make style`, which generated this change. Given that it is on a line I
hadn't touched, I pondered on whether to include it in my patch or not.
In the past, I had submitted a patch which fixed a formatting issue in
the documentation in passing, together with content changes further
down, and you had been supportive of this:
https://lore.kernel.org/git/xmqq1qaeqtw7.fsf@gitster.g/
So I decided to include it this time as well. I am happy to remove it if
you prefer not to have it.
I thank you for the rest of your comments and will take them into
account for a new version of this patch, pending the discussion above.
Best wishes,
Antonin
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable
2025-10-22 9:37 ` Antonin Delpeuch
@ 2025-10-22 20:39 ` Junio C Hamano
0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-10-22 20:39 UTC (permalink / raw)
To: Antonin Delpeuch; +Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren
Antonin Delpeuch <antonin@delpeuch.eu> writes:
> On 20/10/2025 18:05, Junio C Hamano wrote:
>
>>> If the general idea of this patch is judged worthwhile, I would be happy
>>> to add tests to demonstrate the impact of the diff algorithm on blame
>>> output.
>> Do not ever say this here.
>>
>> I've seen from time to time people ask "I am thinking of doing this;
>> will a patch be accepted? If so, I'll work on it." before showing
>> any work, and my response always has been:
>>
>> (1) We don't know how useful and interesting your contribution would
>> be for our audience, until we see it; and
>>
>> (2) If you truly believe in your work (find it useful, find writing
>> it fun, etc.), that would be incentive enough for you to work
>> on it, whether or not the result will land in my tree. You
>> should instead aim for something so brilliant that we would
>> come to you begging for your permission to include it in our
>> project.
>
> I am surprised by your reaction here, both by its substance and form.
Yeah, after sending it out, I realized that the canned response
above was not fitting to this exact instance. I overreacted
primarily because what I saw everything before that part was
indication of a great new contributor, which made my dissapointment
to see the dreaded "I will do this if this is accepted" even worse.
Your "this one lacks tests" is a bit different from what we
sometimes see on this list that I react with the above canned
response, which is "I want to do this great thing. If you promise
you will accept this change, I'll work on it" without showing any
detailed design or code. It is more like "I know we need test but I
have shown the main part of the change. Am I going in the right
direction?"
You certainly didn't deserve the above response. Sorry about that.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable
2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-10-20 16:05 ` Junio C Hamano
@ 2025-10-23 16:03 ` Phillip Wood
2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2 siblings, 0 replies; 32+ messages in thread
From: Phillip Wood @ 2025-10-23 16:03 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch
Hi Antonin
On 20/10/2025 15:56, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The diff algorithm used in 'git-blame(1)' can be configured using the
> `--diff-algorithm` option or the `diff.algorithm` config variable.
> Myers diff remains the default.
I think this sounds like a reasonable thing to do, although it is
technically a breaking change.
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 2703820258..177b606e81 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
> }
> }
>
> + if (!strcmp(var, "diff.algorithm")) {
> + long diff_algorithm;
> + if (!value)
> + return config_error_nonbool(var);
> + diff_algorithm = parse_algorithm_value(value);
> + if (diff_algorithm < 0)
> + return error(_("unknown value for config '%s': %s"),
> + var, value);
> + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
I think this should be
xdl_opts &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL);
as you have below for the option parsing.
> + xdl_opts |= diff_algorithm;
> + return 0;
> + }
> +
> if (git_diff_heuristic_config(var, value, cb) < 0)
> return -1;
> if (userdiff_config(var, value) < 0)
> @@ -824,6 +837,26 @@ static int blame_move_callback(const struct option *option, const char *arg, int
> return 0;
> }
>
> +static int blame_diff_algorithm_callback(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> + long value = parse_algorithm_value(arg);
> +
> + BUG_ON_OPT_NEG(unset);
> +
> + if (value < 0)
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
> +
> + // ignore any previous --minimal setting, following git-diff's behavior
Style - oneline comments should look like
/* comment */
> + *opt &= ~XDF_NEED_MINIMAL;
> + *opt &= ~XDF_DIFF_ALGORITHM_MASK;
> + *opt |= value;
"git blame" also has a "--minimal" option which now needs to clear the
diff algorithm when it sets the minimal flag.
Thanks
Phillip
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v2] blame: make diff algorithm configurable
2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-10-20 16:05 ` Junio C Hamano
2025-10-23 16:03 ` Phillip Wood
@ 2025-10-28 13:37 ` Antonin Delpeuch via GitGitGadget
2025-10-28 15:22 ` Junio C Hamano
2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget
2 siblings, 2 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-10-28 13:37 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:
https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.
Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
blame: make diff algorithm configurable
Changes since v1:
* add tests
* ignore --diff-algorithm when it is provided before --minimal
* improve patch description
* remove duplication of documentation sections
* style improvements
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v2
Pull-Request: https://github.com/git/git/pull/2075
Range-diff vs v1:
1: 32b59d0204 ! 1: 107f51620b blame: make diff algorithm configurable
@@ Metadata
## Commit message ##
blame: make diff algorithm configurable
- The diff algorithm used in 'git-blame(1)' can be configured using the
- `--diff-algorithm` option or the `diff.algorithm` config variable.
- Myers diff remains the default.
+ The diff algorithm used in 'git-blame(1)' is set to 'myers',
+ without the possibility to change it aside from the `--minimal` option.
+
+ There has been long-standing interest in changing the default diff
+ algorithm to "histogram", and Git 3.0 was floated as a possible occasion
+ for taking some steps towards that:
+
+ https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
+
+ As a preparation for this move, it is worth making sure that the diff
+ algorithm is configurable where useful.
+
+ Make it configurable in the `git-blame(1)` command by introducing the
+ `--diff-algorithm` option and make honor the `diff.algorithm` config
+ variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
- ## Documentation/git-blame.adoc ##
-@@ Documentation/git-blame.adoc: include::blame-options.adoc[]
- Ignore whitespace when comparing the parent's version and
- the child's to find where the lines came from.
-
+ ## Documentation/diff-algorithm-option.adoc (new) ##
+@@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
@@ Documentation/git-blame.adoc: include::blame-options.adoc[]
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
+
+ ## Documentation/diff-options.adoc ##
+@@ Documentation/diff-options.adoc: and starts with _<text>_, this algorithm attempts to prevent it from
+ appearing as a deletion or addition in the output. It uses the "patience
+ diff" algorithm internally.
+
+-`--diff-algorithm=(patience|minimal|histogram|myers)`::
+- Choose a diff algorithm. The variants are as follows:
+-+
+---
+- `default`;;
+- `myers`;;
+- The basic greedy diff algorithm. Currently, this is the default.
+- `minimal`;;
+- Spend extra time to make sure the smallest possible diff is
+- produced.
+- `patience`;;
+- Use "patience diff" algorithm when generating patches.
+- `histogram`;;
+- This algorithm extends the patience algorithm to "support
+- low-occurrence common elements".
+---
+-+
+-For instance, if you configured the `diff.algorithm` variable to a
+-non-default value and want to use the default one, then you
+-have to use `--diff-algorithm=default` option.
++include::diff-algorithm-option.adoc[]
+
+ `--stat[=<width>[,<name-width>[,<count>]]]`::
+ Generate a diffstat. By default, as much space as necessary
+
+ ## Documentation/git-blame.adoc ##
+@@ Documentation/git-blame.adoc: include::blame-options.adoc[]
+ Ignore whitespace when comparing the parent's version and
+ the child's to find where the lines came from.
+
++include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
@@ builtin/blame.c: static int blame_move_callback(const struct option *option, con
return 0;
}
++static int blame_diff_algorithm_minimal(const struct option *option,
++ const char *arg, int unset)
++{
++ int *opt = option->value;
++
++ BUG_ON_OPT_NEG(unset);
++ BUG_ON_OPT_ARG(arg);
++
++ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
++ *opt |= XDF_NEED_MINIMAL;
++
++ return 0;
++}
++
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
@@ builtin/blame.c: static int blame_move_callback(const struct option *option, con
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
-+ // ignore any previous --minimal setting, following git-diff's behavior
-+ *opt &= ~XDF_NEED_MINIMAL;
-+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
++ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
+ *opt |= value;
+
+ return 0;
@@ builtin/blame.c: static int blame_move_callback(const struct option *option, con
{
struct object_id oid;
@@ builtin/blame.c: int cmd_blame(int argc,
- OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME),
- OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER),
- OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN),
-- OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN),
-+ OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN),
- OPT_BIT('c', NULL, &output_option, N_("use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT),
- OPT_BIT('t', NULL, &output_option, N_("show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP),
- OPT_BIT('l', NULL, &output_option, N_("show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME),
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
@@ builtin/blame.c: int cmd_blame(int argc,
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
+ OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
++ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
++ N_("spend extra cycles to find better match"),
++ PARSE_OPT_NONEG | PARSE_OPT_NOARG,
++ blame_diff_algorithm_minimal),
+ OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
+ OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
+
+ ## t/meson.build ##
+@@ t/meson.build: integration_tests = [
+ 't8012-blame-colors.sh',
+ 't8013-blame-ignore-revs.sh',
+ 't8014-blame-ignore-fuzzy.sh',
++ 't8015-blame-diff-algorithm.sh',
+ 't8020-last-modified.sh',
+ 't9001-send-email.sh',
+ 't9002-column.sh',
+
+ ## t/t8015-blame-diff-algorithm.sh (new) ##
+@@
++#!/bin/sh
++
++test_description='git blame with specific diff algorithm'
++
++. ./test-lib.sh
++
++test_expect_success setup '
++ cat >file.c <<-\EOF &&
++ int f(int x, int y)
++ {
++ if (x == 0)
++ {
++ return y;
++ }
++ return x;
++ }
++
++ int g(size_t u)
++ {
++ while (u < 30)
++ {
++ u++;
++ }
++ return u;
++ }
++ EOF
++ test_write_lines x x x x >file.txt &&
++ git add file.c file.txt &&
++ GIT_AUTHOR_NAME=Initial git commit -m Initial &&
++
++ cat >file.c <<-\EOF &&
++ int g(size_t u)
++ {
++ while (u < 30)
++ {
++ u++;
++ }
++ return u;
++ }
++
++ int h(int x, int y, int z)
++ {
++ if (z == 0)
++ {
++ return x;
++ }
++ return y;
++ }
++ EOF
++ test_write_lines x x x A B C D x E F G >file.txt &&
++ git add file.c file.txt &&
++ GIT_AUTHOR_NAME=Second git commit -m Second
++'
++
++test_expect_success 'blame uses Myers diff algorithm by default for now' '
++ cat >expected <<-\EOF &&
++ Second
++ Initial
++ Second
++ Initial
++ Second
++ Initial
++ Second
++ Initial
++ Initial
++ Second
++ Initial
++ Second
++ Initial
++ Second
++ Initial
++ Second
++ Initial
++ EOF
++
++ # git blame file.c | grep --only-matching -e Initial -e Second > actual &&
++ # test_cmp expected actual
++ echo goo
++'
++
++test_expect_success 'blame honors --diff-algorithm option' '
++ cat >expected <<-\EOF &&
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ EOF
++
++ git blame file.c --diff-algorithm=histogram | \
++ grep --only-matching -e Initial -e Second > actual &&
++ test_cmp expected actual
++'
++
++test_expect_success 'blame honors diff.algorithm config variable' '
++ cat >expected <<-\EOF &&
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Initial
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ Second
++ EOF
++
++ git config diff.algorithm histogram &&
++ git blame file.c | \
++ grep --only-matching -e Initial -e Second > actual &&
++ test_cmp expected actual
++'
++
++test_expect_success 'blame honors --minimal option' '
++ cat >expected <<-\EOF &&
++ Initial
++ Initial
++ Initial
++ Second
++ Second
++ Second
++ Second
++ Initial
++ Second
++ Second
++ Second
++ EOF
++
++ git blame file.txt --minimal | \
++ grep --only-matching -e Initial -e Second > actual &&
++ test_cmp expected actual
++'
++
++test_done
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +---
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 ++++++++
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 154 +++++++++++++++++++++++
6 files changed, 230 insertions(+), 20 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 0000000000..8e3a0b63d7
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f..9cdad6f72a 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
- Choose a diff algorithm. The variants are as follows:
-+
---
- `default`;;
- `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
- `minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
- `patience`;;
- Use "patience diff" algorithm when generating patches.
- `histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
`--stat[=<width>[,<name-width>[,<count>]]]`::
Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..adcbb6f5dc 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..eb0ab71dba 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_minimal(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+
+ BUG_ON_OPT_NEG(unset);
+ BUG_ON_OPT_ARG(arg);
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= XDF_NEED_MINIMAL;
+
+ return 0;
+}
+
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -915,10 +960,17 @@ int cmd_blame(int argc,
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find better match"),
+ PARSE_OPT_NONEG | PARSE_OPT_NOARG,
+ blame_diff_algorithm_minimal),
OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
diff --git a/t/meson.build b/t/meson.build
index 401b24e50e..9f2fe7af8b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -955,6 +955,7 @@ integration_tests = [
't8012-blame-colors.sh',
't8013-blame-ignore-revs.sh',
't8014-blame-ignore-fuzzy.sh',
+ 't8015-blame-diff-algorithm.sh',
't8020-last-modified.sh',
't9001-send-email.sh',
't9002-column.sh',
diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
new file mode 100755
index 0000000000..43996df177
--- /dev/null
+++ b/t/t8015-blame-diff-algorithm.sh
@@ -0,0 +1,154 @@
+#!/bin/sh
+
+test_description='git blame with specific diff algorithm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ cat >file.c <<-\EOF &&
+ int f(int x, int y)
+ {
+ if (x == 0)
+ {
+ return y;
+ }
+ return x;
+ }
+
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+ EOF
+ test_write_lines x x x x >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Initial git commit -m Initial &&
+
+ cat >file.c <<-\EOF &&
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+
+ int h(int x, int y, int z)
+ {
+ if (z == 0)
+ {
+ return x;
+ }
+ return y;
+ }
+ EOF
+ test_write_lines x x x A B C D x E F G >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Second git commit -m Second
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default for now' '
+ cat >expected <<-\EOF &&
+ Second
+ Initial
+ Second
+ Initial
+ Second
+ Initial
+ Second
+ Initial
+ Initial
+ Second
+ Initial
+ Second
+ Initial
+ Second
+ Initial
+ Second
+ Initial
+ EOF
+
+ # git blame file.c | grep --only-matching -e Initial -e Second > actual &&
+ # test_cmp expected actual
+ echo goo
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+ cat >expected <<-\EOF &&
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ EOF
+
+ git blame file.c --diff-algorithm=histogram | \
+ grep --only-matching -e Initial -e Second > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+ cat >expected <<-\EOF &&
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Initial
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ Second
+ EOF
+
+ git config diff.algorithm histogram &&
+ git blame file.c | \
+ grep --only-matching -e Initial -e Second > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
+ Initial
+ Initial
+ Initial
+ Second
+ Second
+ Second
+ Second
+ Initial
+ Second
+ Second
+ Second
+ EOF
+
+ git blame file.txt --minimal | \
+ grep --only-matching -e Initial -e Second > actual &&
+ test_cmp expected actual
+'
+
+test_done
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v2] blame: make diff algorithm configurable
2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
@ 2025-10-28 15:22 ` Junio C Hamano
2025-10-28 16:00 ` Antonin Delpeuch
2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget
1 sibling, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2025-10-28 15:22 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch
"Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The diff algorithm used in 'git-blame(1)' is set to 'myers',
> without the possibility to change it aside from the `--minimal` option.
Hmph. It is very unfortunate that we had --minimal already. We
should have done --diff-algorithm=<which> instead, but that is way
too late.
> There has been long-standing interest in changing the default diff
> algorithm to "histogram", and Git 3.0 was floated as a possible occasion
> for taking some steps towards that:
>
> https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
Micronit. I think the reference to 3.0 only about potentially
breaking backward compatibility by making the family of diff-*
plumbing commands ignore diff.algorithm configuration, and other
usability changes like this one are fair game without having to wait
for 3.0 boundary (the plumbing commands do ignore the configuration
already, so there is nothing we have to wait 3.0 before doing).
> Changes since v1:
>
> * add tests
> * ignore --diff-algorithm when it is provided before --minimal
Sensible.
I presume the reverse is true, i.e. giving "--minimal" and then
"--diff-algorithm=histogram" in this order would make "histogram"
survive, in other words, the usual "last one wins" rule is applied?
> +static int blame_diff_algorithm_minimal(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> +
> + BUG_ON_OPT_NEG(unset);
> + BUG_ON_OPT_ARG(arg);
> +
> + *opt &= ~XDF_DIFF_ALGORITHM_MASK;
> + *opt |= XDF_NEED_MINIMAL;
> +
> + return 0;
> +}
This and diff.c:diff_opt_diff_algorithm_no_arg(), which I think is
the original from which this was copied from, look somewhat
different, but this can afford to be simpler, as it does not have to
parse "--histogram", "--patience", etc., as independent command line
options. OK.
> +static int blame_diff_algorithm_callback(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> + long value = parse_algorithm_value(arg);
> +
> + BUG_ON_OPT_NEG(unset);
> +
> + if (value < 0)
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
> +
> + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
> + *opt |= value;
> +
> + return 0;
> +}
Quite straight-forward and sensible.
> static int is_a_rev(const char *name)
> {
> struct object_id oid;
> @@ -915,10 +960,17 @@ int cmd_blame(int argc,
> OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
> OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
> OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
> + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
> + N_("choose a diff algorithm"),
> + PARSE_OPT_NONEG, blame_diff_algorithm_callback),
> OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
> OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
> OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
> OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
> + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
> + N_("spend extra cycles to find better match"),
> + PARSE_OPT_NONEG | PARSE_OPT_NOARG,
> + blame_diff_algorithm_minimal),
> OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
This OPT_BIT() can stay here? I thought parse_options_check() was
capable of detecting duplicated long-form commands as programming
error, but apparently it does not. (#leftoverbits) We should look
into teaching parse_options_check() to check duplicated option
names.
> OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
> OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
> ...
> +test_expect_success 'blame honors --minimal option' '
> + cat >expected <<-\EOF &&
> + Initial
> + Initial
> + Initial
> + Second
> + Second
> + Second
> + Second
> + Initial
> + Second
> + Second
> + Second
> + EOF
> +
> + git blame file.txt --minimal | \
> + grep --only-matching -e Initial -e Second > actual &&
> + test_cmp expected actual
> +'
Do we need to test combination of configuration variables and
command line options (to verify that options trump configuration),
or two command line options (to verify that the last one wins)?
When xdiff/ part of the system gets improved, the above expected
patterns may have to change, these tests may fail. Whoever updates
the diff algorithm to cause such a failure has to tell between a
genuine _bug_ in their update to diff implementation and the test
expecting a suboptimal result based on the behaviour of the diff
algorithm before their improvement. And for that, they need to
debug these tests. But I suspect that these tests will probably be
very difficult to debug, as it is almost impossible to see which
line in the original each of these lines correspond to.
I guess that's inevitable, and we'll cross that bridge when it
becomes necessary.
Thanks, will queue, but I do find the leftover --minimal bit
disturbing.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2] blame: make diff algorithm configurable
2025-10-28 15:22 ` Junio C Hamano
@ 2025-10-28 16:00 ` Antonin Delpeuch
0 siblings, 0 replies; 32+ messages in thread
From: Antonin Delpeuch @ 2025-10-28 16:00 UTC (permalink / raw)
To: Junio C Hamano, Antonin Delpeuch via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood
Hi Junio,
On 28/10/2025 16:22, Junio C Hamano wrote:
>> Changes since v1:
>>
>> * add tests
>> * ignore --diff-algorithm when it is provided before --minimal
> Sensible.
>
> I presume the reverse is true, i.e. giving "--minimal" and then
> "--diff-algorithm=histogram" in this order would make "histogram"
> survive, in other words, the usual "last one wins" rule is applied?
Yes, that was already the case in the version 1 of this patch. Philipp
noticed that it was only true in one direction, so this fixes it.
>> static int is_a_rev(const char *name)
>> {
>> struct object_id oid;
>> @@ -915,10 +960,17 @@ int cmd_blame(int argc,
>> OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
>> OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
>> OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
>> + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
>> + N_("choose a diff algorithm"),
>> + PARSE_OPT_NONEG, blame_diff_algorithm_callback),
>> OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
>> OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
>> OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
>> OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
>> + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
>> + N_("spend extra cycles to find better match"),
>> + PARSE_OPT_NONEG | PARSE_OPT_NOARG,
>> + blame_diff_algorithm_minimal),
>> OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
> This OPT_BIT() can stay here? I thought parse_options_check() was
> capable of detecting duplicated long-form commands as programming
> error, but apparently it does not. (#leftoverbits) We should look
> into teaching parse_options_check() to check duplicated option
> names.
Oops, that's an oversight on my part indeed. I'll fix it.
>
>> OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
>> OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
>> ...
>> +test_expect_success 'blame honors --minimal option' '
>> + cat >expected <<-\EOF &&
>> + Initial
>> + Initial
>> + Initial
>> + Second
>> + Second
>> + Second
>> + Second
>> + Initial
>> + Second
>> + Second
>> + Second
>> + EOF
>> +
>> + git blame file.txt --minimal | \
>> + grep --only-matching -e Initial -e Second > actual &&
>> + test_cmp expected actual
>> +'
> Do we need to test combination of configuration variables and
> command line options (to verify that options trump configuration),
> or two command line options (to verify that the last one wins)?
It's easy enough to add such tests, I can add a few more.
>
> When xdiff/ part of the system gets improved, the above expected
> patterns may have to change, these tests may fail. Whoever updates
> the diff algorithm to cause such a failure has to tell between a
> genuine _bug_ in their update to diff implementation and the test
> expecting a suboptimal result based on the behaviour of the diff
> algorithm before their improvement. And for that, they need to
> debug these tests. But I suspect that these tests will probably be
> very difficult to debug, as it is almost impossible to see which
> line in the original each of these lines correspond to.
I'm happy to make the tests a bit clearer by including the lines in the
expected output.
I'll submit a new version with those changes.
Best,
Antonin
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v3] blame: make diff algorithm configurable
2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2025-10-28 15:22 ` Junio C Hamano
@ 2025-10-28 21:14 ` Antonin Delpeuch via GitGitGadget
2025-10-29 10:16 ` Phillip Wood
2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
1 sibling, 2 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-10-28 21:14 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:
https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.
Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
blame: make diff algorithm configurable
Changes since v1:
* add tests
* ignore --diff-algorithm when it is provided before --minimal
* improve patch description
* remove duplication of documentation sections
* style improvements
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v3
Pull-Request: https://github.com/git/git/pull/2075
Range-diff vs v2:
1: 107f51620b ! 1: b8bdb03516 blame: make diff algorithm configurable
@@ builtin/blame.c: int cmd_blame(int argc,
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
+- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find better match"),
+ PARSE_OPT_NONEG | PARSE_OPT_NOARG,
+ blame_diff_algorithm_minimal),
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
+ OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
## t/meson.build ##
@@ t/meson.build: integration_tests = [
@@ t/t8015-blame-diff-algorithm.sh (new)
+ cat >file.c <<-\EOF &&
+ int f(int x, int y)
+ {
-+ if (x == 0)
-+ {
-+ return y;
-+ }
-+ return x;
++ if (x == 0)
++ {
++ return y;
++ }
++ return x;
+ }
+
+ int g(size_t u)
+ {
-+ while (u < 30)
-+ {
-+ u++;
-+ }
-+ return u;
++ while (u < 30)
++ {
++ u++;
++ }
++ return u;
+ }
+ EOF
+ test_write_lines x x x x >file.txt &&
+ git add file.c file.txt &&
-+ GIT_AUTHOR_NAME=Initial git commit -m Initial &&
++ GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 &&
+
+ cat >file.c <<-\EOF &&
+ int g(size_t u)
+ {
-+ while (u < 30)
-+ {
-+ u++;
-+ }
-+ return u;
++ while (u < 30)
++ {
++ u++;
++ }
++ return u;
+ }
+
+ int h(int x, int y, int z)
+ {
-+ if (z == 0)
-+ {
-+ return x;
-+ }
-+ return y;
++ if (z == 0)
++ {
++ return x;
++ }
++ return y;
+ }
+ EOF
+ test_write_lines x x x A B C D x E F G >file.txt &&
+ git add file.c file.txt &&
-+ GIT_AUTHOR_NAME=Second git commit -m Second
++ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default for now' '
+ cat >expected <<-\EOF &&
-+ Second
-+ Initial
-+ Second
-+ Initial
-+ Second
-+ Initial
-+ Second
-+ Initial
-+ Initial
-+ Second
-+ Initial
-+ Second
-+ Initial
-+ Second
-+ Initial
-+ Second
-+ Initial
++ Commit_2 int g(size_t u)
++ Commit_1 {
++ Commit_2 while (u < 30)
++ Commit_1 {
++ Commit_2 u++;
++ Commit_1 }
++ Commit_2 return u;
++ Commit_1 }
++ Commit_1
++ Commit_2 int h(int x, int y, int z)
++ Commit_1 {
++ Commit_2 if (z == 0)
++ Commit_1 {
++ Commit_2 return x;
++ Commit_1 }
++ Commit_2 return y;
++ Commit_1 }
+ EOF
+
-+ # git blame file.c | grep --only-matching -e Initial -e Second > actual &&
-+ # test_cmp expected actual
-+ echo goo
++
++ git blame file.c | \
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
++ sed -e "s/ *$//g" > actual &&
++ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+ cat >expected <<-\EOF &&
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
++ Commit_1 int g(size_t u)
++ Commit_1 {
++ Commit_1 while (u < 30)
++ Commit_1 {
++ Commit_1 u++;
++ Commit_1 }
++ Commit_1 return u;
++ Commit_1 }
++ Commit_2
++ Commit_2 int h(int x, int y, int z)
++ Commit_2 {
++ Commit_2 if (z == 0)
++ Commit_2 {
++ Commit_2 return x;
++ Commit_2 }
++ Commit_2 return y;
++ Commit_2 }
+ EOF
+
-+ git blame file.c --diff-algorithm=histogram | \
-+ grep --only-matching -e Initial -e Second > actual &&
++ git blame file.c --diff-algorithm histogram | \
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
++ sed -e "s/ *$//g" > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+ cat >expected <<-\EOF &&
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Initial
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
-+ Second
++ Commit_1 int g(size_t u)
++ Commit_1 {
++ Commit_1 while (u < 30)
++ Commit_1 {
++ Commit_1 u++;
++ Commit_1 }
++ Commit_1 return u;
++ Commit_1 }
++ Commit_2
++ Commit_2 int h(int x, int y, int z)
++ Commit_2 {
++ Commit_2 if (z == 0)
++ Commit_2 {
++ Commit_2 return x;
++ Commit_2 }
++ Commit_2 return y;
++ Commit_2 }
+ EOF
+
+ git config diff.algorithm histogram &&
+ git blame file.c | \
-+ grep --only-matching -e Initial -e Second > actual &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
++ sed -e "s/ *$//g" > actual &&
+ test_cmp expected actual
+'
+
++test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
++ cat >expected <<-\EOF &&
++ Commit_1 int g(size_t u)
++ Commit_1 {
++ Commit_1 while (u < 30)
++ Commit_1 {
++ Commit_1 u++;
++ Commit_1 }
++ Commit_1 return u;
++ Commit_1 }
++ Commit_2
++ Commit_2 int h(int x, int y, int z)
++ Commit_2 {
++ Commit_2 if (z == 0)
++ Commit_2 {
++ Commit_2 return x;
++ Commit_2 }
++ Commit_2 return y;
++ Commit_2 }
++ EOF
++
++ git config diff.algorithm myers &&
++ git blame file.c --diff-algorithm histogram | \
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
++ sed -e "s/ *$//g" > actual &&
++ test_cmp expected actual
++'
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
-+ Initial
-+ Initial
-+ Initial
-+ Second
-+ Second
-+ Second
-+ Second
-+ Initial
-+ Second
-+ Second
-+ Second
++ Commit_1 x
++ Commit_1 x
++ Commit_1 x
++ Commit_2 A
++ Commit_2 B
++ Commit_2 C
++ Commit_2 D
++ Commit_1 x
++ Commit_2 E
++ Commit_2 F
++ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal | \
-+ grep --only-matching -e Initial -e Second > actual &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual &&
+ test_cmp expected actual
+'
+
++test_expect_success 'blame respects the order of diff options' '
++ cat >expected <<-\EOF &&
++ Commit_1 x
++ Commit_1 x
++ Commit_1 x
++ Commit_2 A
++ Commit_2 B
++ Commit_2 C
++ Commit_2 D
++ Commit_2 x
++ Commit_2 E
++ Commit_2 F
++ Commit_2 G
++ EOF
++
++ git blame file.txt --minimal --diff-algorithm myers | \
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual &&
++ test_cmp expected actual
++'
++
++
+test_done
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 53 +++++-
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 206 +++++++++++++++++++++++
6 files changed, 282 insertions(+), 21 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 0000000000..8e3a0b63d7
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f..9cdad6f72a 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
- Choose a diff algorithm. The variants are as follows:
-+
---
- `default`;;
- `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
- `minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
- `patience`;;
- Use "patience diff" algorithm when generating patches.
- `histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
`--stat[=<width>[,<name-width>[,<count>]]]`::
Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..adcbb6f5dc 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..da4dbdf50a 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_minimal(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+
+ BUG_ON_OPT_NEG(unset);
+ BUG_ON_OPT_ARG(arg);
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= XDF_NEED_MINIMAL;
+
+ return 0;
+}
+
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -915,11 +960,17 @@ int cmd_blame(int argc,
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find better match"),
+ PARSE_OPT_NONEG | PARSE_OPT_NOARG,
+ blame_diff_algorithm_minimal),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
diff --git a/t/meson.build b/t/meson.build
index 401b24e50e..9f2fe7af8b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -955,6 +955,7 @@ integration_tests = [
't8012-blame-colors.sh',
't8013-blame-ignore-revs.sh',
't8014-blame-ignore-fuzzy.sh',
+ 't8015-blame-diff-algorithm.sh',
't8020-last-modified.sh',
't9001-send-email.sh',
't9002-column.sh',
diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
new file mode 100755
index 0000000000..efc4b47ce1
--- /dev/null
+++ b/t/t8015-blame-diff-algorithm.sh
@@ -0,0 +1,206 @@
+#!/bin/sh
+
+test_description='git blame with specific diff algorithm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ cat >file.c <<-\EOF &&
+ int f(int x, int y)
+ {
+ if (x == 0)
+ {
+ return y;
+ }
+ return x;
+ }
+
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+ EOF
+ test_write_lines x x x x >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 &&
+
+ cat >file.c <<-\EOF &&
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+
+ int h(int x, int y, int z)
+ {
+ if (z == 0)
+ {
+ return x;
+ }
+ return y;
+ }
+ EOF
+ test_write_lines x x x A B C D x E F G >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default for now' '
+ cat >expected <<-\EOF &&
+ Commit_2 int g(size_t u)
+ Commit_1 {
+ Commit_2 while (u < 30)
+ Commit_1 {
+ Commit_2 u++;
+ Commit_1 }
+ Commit_2 return u;
+ Commit_1 }
+ Commit_1
+ Commit_2 int h(int x, int y, int z)
+ Commit_1 {
+ Commit_2 if (z == 0)
+ Commit_1 {
+ Commit_2 return x;
+ Commit_1 }
+ Commit_2 return y;
+ Commit_1 }
+ EOF
+
+
+ git blame file.c | \
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
+ sed -e "s/ *$//g" > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git blame file.c --diff-algorithm histogram | \
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
+ sed -e "s/ *$//g" > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git config diff.algorithm histogram &&
+ git blame file.c | \
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
+ sed -e "s/ *$//g" > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git config diff.algorithm myers &&
+ git blame file.c --diff-algorithm histogram | \
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
+ sed -e "s/ *$//g" > actual &&
+ test_cmp expected actual
+'
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_1 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal | \
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame respects the order of diff options' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_2 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal --diff-algorithm myers | \
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual &&
+ test_cmp expected actual
+'
+
+
+test_done
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable
2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget
@ 2025-10-29 10:16 ` Phillip Wood
2025-10-29 18:46 ` Junio C Hamano
2025-10-30 9:22 ` Antonin Delpeuch
2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
1 sibling, 2 replies; 32+ messages in thread
From: Phillip Wood @ 2025-10-29 10:16 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch
Hi Antonin
On 28/10/2025 21:14, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The diff algorithm used in 'git-blame(1)' is set to 'myers',
> without the possibility to change it aside from the `--minimal` option.
>
> There has been long-standing interest in changing the default diff
> algorithm to "histogram", and Git 3.0 was floated as a possible occasion
> for taking some steps towards that:
>
> https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
>
> As a preparation for this move, it is worth making sure that the diff
> algorithm is configurable where useful.
>
> Make it configurable in the `git-blame(1)` command by introducing the
> `--diff-algorithm` option and make honor the `diff.algorithm` config
> variable. Keep Myers diff as the default.
>
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> ---
Apart from a problem with clearing XDF_NEED_MINIMAL (which is really the
fault of a terrible api) this is looking good.
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
> }
> }
>
> + if (!strcmp(var, "diff.algorithm")) {
> + long diff_algorithm;
> + if (!value)
> + return config_error_nonbool(var);
> + diff_algorithm = parse_algorithm_value(value);
> + if (diff_algorithm < 0)
> + return error(_("unknown value for config '%s': %s"),
> + var, value);
> + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
Unfortunately XDF_DIFF_ALGORITHM_MASK does not include XDF_NEED_MINIMAL
so if the user has a config file that looks like
[diff]
algorithm = minimal
algorithm = myers
We'll parse it as "minimal" rather than "myers"
As we need to reset the diff algorithm in a number of places I think it
would be best to define a macro
#define CLEAR_DIFF_ALGORITHM(flags) \
flags &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL)
and use that where we want to reset the algorithm.
> + xdl_opts |= diff_algorithm;
> + return 0;
> + }
> +
> if (git_diff_heuristic_config(var, value, cb) < 0)
> return -1;
> if (userdiff_config(var, value) < 0)
> @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
> return 0;
> }
>
> +static int blame_diff_algorithm_minimal(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> +
> + BUG_ON_OPT_NEG(unset);
This is a change in behavior as we currently accept "--no-minimal" which
clears XDF_NEED_MINIMAL
> + BUG_ON_OPT_ARG(arg);
> +
> + *opt &= ~XDF_DIFF_ALGORITHM_MASK;
This is correct becase we're about to set XDF_NEED_MINIMAL so it does
not matter that we leave it set here, but would still be clearer if it
used the new macro I suggested above.
> + *opt |= XDF_NEED_MINIMAL;> + return 0;
> +}
> +
> +static int blame_diff_algorithm_callback(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> + long value = parse_algorithm_value(arg);
> +
> + BUG_ON_OPT_NEG(unset);
> +
> + if (value < 0)
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
> +
> + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
This is correct
> + *opt |= value;
> +
> + return 0;
> +}
> +
> - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
> + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
> + N_("spend extra cycles to find better match"),
This is just copying the existing text so it is not a new problem but I
think it would be better if we said "find a better" rather than "find
better". We should prehaps think about hiding this option now that we
support --diff-algorithm.
> + PARSE_OPT_NONEG | PARSE_OPT_NOARG,
As I said above using PARSE_OPT_NONEG here is a regression
> + blame_diff_algorithm_minimal),
> diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
> new file mode 100755
> index 0000000000..efc4b47ce1
> --- /dev/null
> +++ b/t/t8015-blame-diff-algorithm.sh
> [...]
> +test_expect_success 'blame uses Myers diff algorithm by default for now' '
I'm not sure we need to say "for now" here.
> + cat >expected <<-\EOF &&
> + Commit_2 int g(size_t u)
> + Commit_1 {
> + Commit_2 while (u < 30)
> + Commit_1 {
> + Commit_2 u++;
> + Commit_1 }
> + Commit_2 return u;
> + Commit_1 }
> + Commit_1
> + Commit_2 int h(int x, int y, int z)
> + Commit_1 {
> + Commit_2 if (z == 0)
> + Commit_1 {
> + Commit_2 return x;
> + Commit_1 }
> + Commit_2 return y;
> + Commit_1 }
> + EOF
> +
> +
There's an extra blank line here
> + git blame file.c | \
We don't pipe the output git commands as it hides unexpected failures.
Instead you should redirect the output of git to a file and then process
that file with sed.
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
> + sed -e "s/ *$//g" > actual &&
This can be a single process by passing -e twice. It does not really
matter but neither pattern needs a trailing "g" as they only match once
within the line.
> +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
> + cat >expected <<-\EOF &&
> + Commit_1 int g(size_t u)
> + Commit_1 {
> + Commit_1 while (u < 30)
> + Commit_1 {
> + Commit_1 u++;
> + Commit_1 }
> + Commit_1 return u;
> + Commit_1 }
> + Commit_2
> + Commit_2 int h(int x, int y, int z)
> + Commit_2 {
> + Commit_2 if (z == 0)
> + Commit_2 {
> + Commit_2 return x;
> + Commit_2 }
> + Commit_2 return y;
> + Commit_2 }
> + EOF
> +
> + git config diff.algorithm myers &&
You can use test_config() here which will clear the config setting at
the end of the test. Alternatively you can save a couple of processes by
using "git -c diff.algorithm=myers blame ...". This is setting the
config to the default value, I wonder if it would be better to do
git -c diff.algorithm=histogram blame --diff-algorithm=myers
instead.
The coverage looks good
Thanks
Phillip
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable
2025-10-29 10:16 ` Phillip Wood
@ 2025-10-29 18:46 ` Junio C Hamano
2025-10-30 9:22 ` Antonin Delpeuch
1 sibling, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-10-29 18:46 UTC (permalink / raw)
To: Phillip Wood
Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren,
Antonin Delpeuch
Phillip Wood <phillip.wood123@gmail.com> writes:
>> +static int blame_diff_algorithm_minimal(const struct option *option,
>> + const char *arg, int unset)
>> +{
>> + int *opt = option->value;
>> +
>> + BUG_ON_OPT_NEG(unset);
>
> This is a change in behavior as we currently accept "--no-minimal" which
> clears XDF_NEED_MINIMAL
Ah, I missed this; thanks for a careful reading.
> As I said above using PARSE_OPT_NONEG here is a regression
>
>> + blame_diff_algorithm_minimal),
>> diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
>> new file mode 100755
>> index 0000000000..efc4b47ce1
>> --- /dev/null
>> +++ b/t/t8015-blame-diff-algorithm.sh
>> [...]
>> +test_expect_success 'blame uses Myers diff algorithm by default for now' '
>
> I'm not sure we need to say "for now" here.
We shouldn't.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable
2025-10-29 10:16 ` Phillip Wood
2025-10-29 18:46 ` Junio C Hamano
@ 2025-10-30 9:22 ` Antonin Delpeuch
2025-10-30 10:47 ` Phillip Wood
1 sibling, 1 reply; 32+ messages in thread
From: Antonin Delpeuch @ 2025-10-30 9:22 UTC (permalink / raw)
To: phillip.wood, Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren
Hi Phillip,
On 29/10/2025 11:16, Phillip Wood wrote:
> Unfortunately XDF_DIFF_ALGORITHM_MASK does not include
> XDF_NEED_MINIMAL so if the user has a config file that looks like
>
> [diff]
> algorithm = minimal
> algorithm = myers
>
> We'll parse it as "minimal" rather than "myers"
>
> As we need to reset the diff algorithm in a number of places I think
> it would be best to define a macro
>
> #define CLEAR_DIFF_ALGORITHM(flags) \
> flags &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL)
Ouch, good catch! This problem is affecting other places as well.
I'm wondering if we couldn't even add XDF_NEED_MINIMAL to
XDF_DIFF_ALGORITHM_MASK. I've reviewed all the places where
XDF_DIFF_ALGORITHM_MASK is used, and it seems that in all cases it would
either preserve the existing behaviour (potentially allowing us to
remove an accompanying "DIFF_XDL_CLR(opts, NEED_MINIMAL);" macro which
becomes redundant), or in some other cases it would fix a similar issue
(for instance, in merge-file.c).
Is your suggestion to introduce a new macro motivated by stability
concerns? I'm aware that xdiff is used in other code bases as a library,
so I guess changing XDF_DIFF_ALGORITHM_MASK can indeed be seen as a
breaking change.
Antonin
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable
2025-10-30 9:22 ` Antonin Delpeuch
@ 2025-10-30 10:47 ` Phillip Wood
0 siblings, 0 replies; 32+ messages in thread
From: Phillip Wood @ 2025-10-30 10:47 UTC (permalink / raw)
To: Antonin Delpeuch, phillip.wood, Antonin Delpeuch via GitGitGadget,
git
Cc: Elijah Newren
Hi Antonin
On 30/10/2025 09:22, Antonin Delpeuch wrote:
> On 29/10/2025 11:16, Phillip Wood wrote:
>> Unfortunately XDF_DIFF_ALGORITHM_MASK does not include
>> XDF_NEED_MINIMAL so if the user has a config file that looks like
>>
>> [diff]
>> algorithm = minimal
>> algorithm = myers
>>
>> We'll parse it as "minimal" rather than "myers"
>>
>> As we need to reset the diff algorithm in a number of places I think
>> it would be best to define a macro
>>
>> #define CLEAR_DIFF_ALGORITHM(flags) \
>> flags &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL)
>
> Ouch, good catch! This problem is affecting other places as well.
>
> I'm wondering if we couldn't even add XDF_NEED_MINIMAL to
> XDF_DIFF_ALGORITHM_MASK. I've reviewed all the places where
> XDF_DIFF_ALGORITHM_MASK is used, and it seems that in all cases it would
> either preserve the existing behaviour (potentially allowing us to
> remove an accompanying "DIFF_XDL_CLR(opts, NEED_MINIMAL);" macro which
> becomes redundant), or in some other cases it would fix a similar issue
> (for instance, in merge-file.c).
Thanks for taking the time to look at the other uses of
XDF_DIFF_ALGORITHM_MASK. I think adding XDF_NEED_MINIMAL to
XDF_DIFF_ALGORITHM_MASK is a good idea as I think we always treat
"minimal" as another diff algorithm, rather than a variant of "myers"
which is how it is implemented in xdiff.
> Is your suggestion to introduce a new macro motivated by stability
> concerns? I'm aware that xdiff is used in other code bases as a library,
> so I guess changing XDF_DIFF_ALGORITHM_MASK can indeed be seen as a
> breaking change.
While it is a breaking change, tweaking XDF_DIFF_ALGORITHM_MASK seems
like a better idea to me. I wondered about it yesterday but worried it
would end up being too much work to review all the existing uses.
Thanks
Phillip
> Antonin
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4 0/2] blame: make diff algorithm configurable
2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget
2025-10-29 10:16 ` Phillip Wood
@ 2025-11-01 21:57 ` Antonin Delpeuch via GitGitGadget
2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
` (2 more replies)
1 sibling, 3 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch
Changes since v3:
* fix resetting of diff algorithm by adding XDF_NEED_MINIMAL to
XDF_DIFF_ALGORITHM_MASK
* restore --no-minimal support
* fix typo in description of --minimal option
* remove the 'for now' in test description
* remove piping in tests
* pass configuration variables to the blame command directly instead of
calling 'git config'
Antonin Delpeuch (2):
xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
blame: make diff algorithm configurable
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 +++++-
diff.c | 2 -
merge-ort.c | 2 -
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
xdiff/xdiff.h | 2 +-
9 files changed, 279 insertions(+), 26 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v4
Pull-Request: https://github.com/git/git/pull/2075
Range-diff vs v3:
-: ---------- > 1: e81a5d2bd2 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
1: b8bdb03516 ! 2: 920a6f3acb blame: make diff algorithm configurable
@@ builtin/blame.c: static int blame_move_callback(const struct option *option, con
+{
+ int *opt = option->value;
+
-+ BUG_ON_OPT_NEG(unset);
+ BUG_ON_OPT_ARG(arg);
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
-+ *opt |= XDF_NEED_MINIMAL;
++ if (!unset)
++ *opt |= XDF_NEED_MINIMAL;
+
+ return 0;
+}
@@ builtin/blame.c: int cmd_blame(int argc,
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
-+ N_("spend extra cycles to find better match"),
-+ PARSE_OPT_NONEG | PARSE_OPT_NOARG,
-+ blame_diff_algorithm_minimal),
++ N_("spend extra cycles to find a better match"),
++ PARSE_OPT_NOARG, blame_diff_algorithm_minimal),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
@@ t/t8015-blame-diff-algorithm.sh (new)
+ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
-+test_expect_success 'blame uses Myers diff algorithm by default for now' '
++test_expect_success 'blame uses Myers diff algorithm by default' '
+ cat >expected <<-\EOF &&
+ Commit_2 int g(size_t u)
+ Commit_1 {
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_1 }
+ EOF
+
-+
-+ git blame file.c | \
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
-+ sed -e "s/ *$//g" > actual &&
++ git blame file.c > output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
++ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git blame file.c --diff-algorithm histogram | \
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
-+ sed -e "s/ *$//g" > actual &&
++ git blame file.c --diff-algorithm histogram > output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
++ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git config diff.algorithm histogram &&
-+ git blame file.c | \
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
-+ sed -e "s/ *$//g" > actual &&
++ git -c diff.algorithm=histogram blame file.c > output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
++ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git config diff.algorithm myers &&
-+ git blame file.c --diff-algorithm histogram | \
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
-+ sed -e "s/ *$//g" > actual &&
++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
++ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
++
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 G
+ EOF
+
-+ git blame file.txt --minimal | \
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual &&
++ git blame file.txt --minimal > output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 G
+ EOF
+
-+ git blame file.txt --minimal --diff-algorithm myers | \
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual &&
++ git blame file.txt --minimal --diff-algorithm myers > output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
+ test_cmp expected actual
+'
+
-+
+test_done
--
gitgitgadget
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
@ 2025-11-01 21:57 ` Antonin Delpeuch via GitGitGadget
2025-11-03 14:32 ` Phillip Wood
2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
2 siblings, 1 reply; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
reseting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.
Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
diff.c | 2 --
merge-ort.c | 2 --
xdiff/xdiff.h | 2 +-
3 files changed, 1 insertion(+), 5 deletions(-)
diff --git a/diff.c b/diff.c
index 87fa16b730..6ce3591c5b 100644
--- a/diff.c
+++ b/diff.c
@@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts,
if (value < 0)
return -1;
- /* clear out previous settings */
- DIFF_XDL_CLR(opts, NEED_MINIMAL);
opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opts->xdl_opts |= value;
diff --git a/merge-ort.c b/merge-ort.c
index 29858074f9..9b2b0fce7e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5495,8 +5495,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s)
long value = parse_algorithm_value(arg);
if (value < 0)
return -1;
- /* clear out previous settings */
- DIFF_XDL_CLR(opt, NEED_MINIMAL);
opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opt->xdl_opts |= value;
}
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe..dc370712e9 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -43,7 +43,7 @@ extern "C" {
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)
-#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
+#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)
#define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK)
#define XDF_INDENT_HEURISTIC (1 << 23)
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v4 2/2] blame: make diff algorithm configurable
2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
@ 2025-11-01 21:57 ` Antonin Delpeuch via GitGitGadget
2025-11-03 14:32 ` Phillip Wood
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
2 siblings, 1 reply; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:
https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.
Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 +++++-
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
6 files changed, 278 insertions(+), 21 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 0000000000..8e3a0b63d7
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f..9cdad6f72a 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
- Choose a diff algorithm. The variants are as follows:
-+
---
- `default`;;
- `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
- `minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
- `patience`;;
- Use "patience diff" algorithm when generating patches.
- `histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
`--stat[=<width>[,<name-width>[,<count>]]]`::
Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..adcbb6f5dc 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..888ce708a6 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_minimal(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+
+ BUG_ON_OPT_ARG(arg);
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ if (!unset)
+ *opt |= XDF_NEED_MINIMAL;
+
+ return 0;
+}
+
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -915,11 +960,16 @@ int cmd_blame(int argc,
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find a better match"),
+ PARSE_OPT_NOARG, blame_diff_algorithm_minimal),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
diff --git a/t/meson.build b/t/meson.build
index 401b24e50e..9f2fe7af8b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -955,6 +955,7 @@ integration_tests = [
't8012-blame-colors.sh',
't8013-blame-ignore-revs.sh',
't8014-blame-ignore-fuzzy.sh',
+ 't8015-blame-diff-algorithm.sh',
't8020-last-modified.sh',
't9001-send-email.sh',
't9002-column.sh',
diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
new file mode 100755
index 0000000000..5318e18cb3
--- /dev/null
+++ b/t/t8015-blame-diff-algorithm.sh
@@ -0,0 +1,203 @@
+#!/bin/sh
+
+test_description='git blame with specific diff algorithm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ cat >file.c <<-\EOF &&
+ int f(int x, int y)
+ {
+ if (x == 0)
+ {
+ return y;
+ }
+ return x;
+ }
+
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+ EOF
+ test_write_lines x x x x >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 &&
+
+ cat >file.c <<-\EOF &&
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+
+ int h(int x, int y, int z)
+ {
+ if (z == 0)
+ {
+ return x;
+ }
+ return y;
+ }
+ EOF
+ test_write_lines x x x A B C D x E F G >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default' '
+ cat >expected <<-\EOF &&
+ Commit_2 int g(size_t u)
+ Commit_1 {
+ Commit_2 while (u < 30)
+ Commit_1 {
+ Commit_2 u++;
+ Commit_1 }
+ Commit_2 return u;
+ Commit_1 }
+ Commit_1
+ Commit_2 int h(int x, int y, int z)
+ Commit_1 {
+ Commit_2 if (z == 0)
+ Commit_1 {
+ Commit_2 return x;
+ Commit_1 }
+ Commit_2 return y;
+ Commit_1 }
+ EOF
+
+ git blame file.c > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git blame file.c --diff-algorithm histogram > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git -c diff.algorithm=histogram blame file.c > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_1 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame respects the order of diff options' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_2 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal --diff-algorithm myers > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
+ test_cmp expected actual
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
@ 2025-11-03 14:32 ` Phillip Wood
0 siblings, 0 replies; 32+ messages in thread
From: Phillip Wood @ 2025-11-03 14:32 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch
Hi Antonin
On 01/11/2025 21:57, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
> and histogram diffs, not for the minimal one. This means that when
> reseting the diff algorithm to the default one, one needs to separately
> clear the bit for the minimal diff. There are places in the code that fail
> to do that: merge-ort.c and builtin/merge-file.c.
>
> Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
> clearing of this bit in the places where it hasn't been forgotten.
Nicely explained. This is a useful improvement that should prevent
errors in the future. After this patch there are no users of
DIFF_XDL_CLR() so we should probably remove that macro. I'm not sure it
makes sense to remove the comments that have been deleted below as we're
still clearing the old setting. Apart from that this all looks good.
Thanks
Phillip
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> ---
> diff.c | 2 --
> merge-ort.c | 2 --
> xdiff/xdiff.h | 2 +-
> 3 files changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 87fa16b730..6ce3591c5b 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts,
> if (value < 0)
> return -1;
>
> - /* clear out previous settings */
> - DIFF_XDL_CLR(opts, NEED_MINIMAL);
> opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
> opts->xdl_opts |= value;
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 29858074f9..9b2b0fce7e 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -5495,8 +5495,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s)
> long value = parse_algorithm_value(arg);
> if (value < 0)
> return -1;
> - /* clear out previous settings */
> - DIFF_XDL_CLR(opt, NEED_MINIMAL);
> opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
> opt->xdl_opts |= value;
> }
> diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
> index 2cecde5afe..dc370712e9 100644
> --- a/xdiff/xdiff.h
> +++ b/xdiff/xdiff.h
> @@ -43,7 +43,7 @@ extern "C" {
>
> #define XDF_PATIENCE_DIFF (1 << 14)
> #define XDF_HISTOGRAM_DIFF (1 << 15)
> -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
> +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)
> #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK)
>
> #define XDF_INDENT_HEURISTIC (1 << 23)
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4 2/2] blame: make diff algorithm configurable
2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
@ 2025-11-03 14:32 ` Phillip Wood
2025-11-03 16:15 ` Junio C Hamano
0 siblings, 1 reply; 32+ messages in thread
From: Phillip Wood @ 2025-11-03 14:32 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch
Hi Antonin
Thanks for re-rolling, this is looking pretty sound, I've left a couple
of fairly minor comments below.
On 01/11/2025 21:57, Antonin Delpeuch via GitGitGadget wrote:
> +static int blame_diff_algorithm_minimal(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> +
> + BUG_ON_OPT_ARG(arg);
> +
> + *opt &= ~XDF_DIFF_ALGORITHM_MASK;
> + if (!unset)
> + *opt |= XDF_NEED_MINIMAL;
One thing I'd not thought about before was the interaction between
"--no-minimal" and "--diff-algorithm" The code above makes
"--no-minimal" behave like "diff-algorithm=myers" which is consistent
with the current behavior where the only options for the diff algorithm
are "minimal" or "myers". An alternative would be for "--no-minimal" to
just clear XDF_NEED_MINIMAL and behave like a no-op if it is given after
"--diff-algorithm=patience" or "--diff-algorithm=histogram". I don't
really have a strong preference either way.
> +static int blame_diff_algorithm_callback(const struct option *option,
> + const char *arg, int unset)
> +{
> + int *opt = option->value;
> + long value = parse_algorithm_value(arg);
> +
> + BUG_ON_OPT_NEG(unset);
> +
> + if (value < 0)
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
> +
> + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
We can just use XDF_DIFF_ALGORITHM_MASK now that we've added
XDF_NEED_MINMAL to it in the last commit.
> @@ -915,11 +960,16 @@ int cmd_blame(int argc,
> OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
> OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
> OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
> + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
> + N_("choose a diff algorithm"),
> + PARSE_OPT_NONEG, blame_diff_algorithm_callback),
> OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
> OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
> OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
> OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
> - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
> + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
> + N_("spend extra cycles to find a better match"),
> + PARSE_OPT_NOARG, blame_diff_algorithm_minimal),
Given the potential for confusing interactions between "--no-minimal"
and "--diff-algorithm" I think it would be worth adding OPT_HIDDEN here.
> diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
> new file mode 100755
> index 0000000000..5318e18cb3
> --- /dev/null
> +++ b/t/t8015-blame-diff-algorithm.sh
> @@ -0,0 +1,203 @@
> + [...]
> + git blame file.c > output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> + sed -e "s/ *$//g" without_varying_parts > actual &&
This would be more efficient if it was written as
sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
-e "s/ *$//g" output >actual
Our test suite is really slow on windows so it is worth trying to avoid
creating unnecessary processes.
Thanks
Phillip
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'blame honors --diff-algorithm option' '
> + cat >expected <<-\EOF &&
> + Commit_1 int g(size_t u)
> + Commit_1 {
> + Commit_1 while (u < 30)
> + Commit_1 {
> + Commit_1 u++;
> + Commit_1 }
> + Commit_1 return u;
> + Commit_1 }
> + Commit_2
> + Commit_2 int h(int x, int y, int z)
> + Commit_2 {
> + Commit_2 if (z == 0)
> + Commit_2 {
> + Commit_2 return x;
> + Commit_2 }
> + Commit_2 return y;
> + Commit_2 }
> + EOF
> +
> + git blame file.c --diff-algorithm histogram > output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> + sed -e "s/ *$//g" without_varying_parts > actual &&
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'blame honors diff.algorithm config variable' '
> + cat >expected <<-\EOF &&
> + Commit_1 int g(size_t u)
> + Commit_1 {
> + Commit_1 while (u < 30)
> + Commit_1 {
> + Commit_1 u++;
> + Commit_1 }
> + Commit_1 return u;
> + Commit_1 }
> + Commit_2
> + Commit_2 int h(int x, int y, int z)
> + Commit_2 {
> + Commit_2 if (z == 0)
> + Commit_2 {
> + Commit_2 return x;
> + Commit_2 }
> + Commit_2 return y;
> + Commit_2 }
> + EOF
> +
> + git -c diff.algorithm=histogram blame file.c > output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> + sed -e "s/ *$//g" without_varying_parts > actual &&
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
> + cat >expected <<-\EOF &&
> + Commit_1 int g(size_t u)
> + Commit_1 {
> + Commit_1 while (u < 30)
> + Commit_1 {
> + Commit_1 u++;
> + Commit_1 }
> + Commit_1 return u;
> + Commit_1 }
> + Commit_2
> + Commit_2 int h(int x, int y, int z)
> + Commit_2 {
> + Commit_2 if (z == 0)
> + Commit_2 {
> + Commit_2 return x;
> + Commit_2 }
> + Commit_2 return y;
> + Commit_2 }
> + EOF
> +
> + git -c diff.algorithm=myers blame file.c --diff-algorithm histogram &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> + sed -e "s/ *$//g" without_varying_parts > actual &&
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'blame honors --minimal option' '
> + cat >expected <<-\EOF &&
> + Commit_1 x
> + Commit_1 x
> + Commit_1 x
> + Commit_2 A
> + Commit_2 B
> + Commit_2 C
> + Commit_2 D
> + Commit_1 x
> + Commit_2 E
> + Commit_2 F
> + Commit_2 G
> + EOF
> +
> + git blame file.txt --minimal > output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
> + test_cmp expected actual
> +'
> +
> +test_expect_success 'blame respects the order of diff options' '
> + cat >expected <<-\EOF &&
> + Commit_1 x
> + Commit_1 x
> + Commit_1 x
> + Commit_2 A
> + Commit_2 B
> + Commit_2 C
> + Commit_2 D
> + Commit_2 x
> + Commit_2 E
> + Commit_2 F
> + Commit_2 G
> + EOF
> +
> + git blame file.txt --minimal --diff-algorithm myers > output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
> + test_cmp expected actual
> +'
> +
> +test_done
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4 2/2] blame: make diff algorithm configurable
2025-11-03 14:32 ` Phillip Wood
@ 2025-11-03 16:15 ` Junio C Hamano
2025-11-06 20:29 ` Junio C Hamano
0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2025-11-03 16:15 UTC (permalink / raw)
To: Phillip Wood
Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren,
Antonin Delpeuch
Phillip Wood <phillip.wood123@gmail.com> writes:
> One thing I'd not thought about before was the interaction between
> "--no-minimal" and "--diff-algorithm" The code above makes
> "--no-minimal" behave like "diff-algorithm=myers" which is consistent
> with the current behavior where the only options for the diff algorithm
> are "minimal" or "myers". An alternative would be for "--no-minimal" to
> just clear XDF_NEED_MINIMAL and behave like a no-op if it is given after
> "--diff-algorithm=patience" or "--diff-algorithm=histogram". I don't
> really have a strong preference either way.
Good observation.
In the longer term, I think we would be better off if we treated
"minimal" just like "histogram" and "patience", in that
(1) If the command takes --diff-algorithm=<name>, giving it as the
<name> would override the previous setting.
(2) If the command takes --<name> (i.e. "git diff --histogram"),
giving "--no-<name>" results in an error.
(3) If the command takes --<name>, it should take all the variants
as <name>, not just selected few, or it shouldn't take any.
IOW, we should deprecate "blame --no-minimal" as a past mistake, and
in the longer term deprecate "blame --minimal" and tell users to use
"--diff-algorithm=minimal" instead.
If Antonin's series wants to teach --histogram and --patience to
"git blame", then we can and should keep "blame --minimal" (i.e.,
(3) above), but in that case, "blame --no-minimal" should still go
(i.e., (2) above). Under the new world order where there are more
than the "minimal/no-minimal?" binary choice, where you can specify
other algorithms from the usual repertoire, the option "no-minimal"
no longer makes sense.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4 2/2] blame: make diff algorithm configurable
2025-11-03 16:15 ` Junio C Hamano
@ 2025-11-06 20:29 ` Junio C Hamano
0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-11-06 20:29 UTC (permalink / raw)
To: Phillip Wood
Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren,
Antonin Delpeuch
Junio C Hamano <gitster@pobox.com> writes:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> One thing I'd not thought about before was the interaction between
>> "--no-minimal" and "--diff-algorithm" The code above makes
>> "--no-minimal" behave like "diff-algorithm=myers" which is consistent
>> with the current behavior where the only options for the diff algorithm
>> are "minimal" or "myers". An alternative would be for "--no-minimal" to
>> just clear XDF_NEED_MINIMAL and behave like a no-op if it is given after
>> "--diff-algorithm=patience" or "--diff-algorithm=histogram". I don't
>> really have a strong preference either way.
>
> Good observation.
>
> In the longer term, I think we would be better off if we treated
> "minimal" just like "histogram" and "patience", in that
> ...
> other algorithms from the usual repertoire, the option "no-minimal"
> no longer makes sense.
Just to avoid confusion, the idea above to deprecate and kill
"--no-minimal" is totally outside of this topic to teach "git blame"
the "--diff-algorithm=<algo>" command line option.
I think in the shorter term I think it is OK for the implementation
to do whatever it happens to do when given "--no-minimal", and as
you observed, it is fine to behave like "--diff-algorithm=myers".
I also like your idea to hide --minimal with the OPT_HIDDEN bit.
Thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5 0/2] blame: make diff algorithm configurable
2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
@ 2025-11-06 22:41 ` Antonin Delpeuch via GitGitGadget
2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
` (3 more replies)
2 siblings, 4 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch
Changes since v4:
* hide --minimal option
* simplify tests to minimize spun processes
* remove redundant XDF_NEED_MINIMAL in bit mask
Antonin Delpeuch (2):
xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
blame: make diff algorithm configurable
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 +++++-
diff.c | 2 -
merge-ort.c | 2 -
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
xdiff/xdiff.h | 2 +-
9 files changed, 279 insertions(+), 26 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v5
Pull-Request: https://github.com/git/git/pull/2075
Range-diff vs v4:
1: e81a5d2bd2 = 1: e81a5d2bd2 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2: 920a6f3acb ! 2: 60015bbada blame: make diff algorithm configurable
@@ builtin/blame.c: static int blame_move_callback(const struct option *option, con
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
-+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
++ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= value;
+
+ return 0;
@@ builtin/blame.c: int cmd_blame(int argc,
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find a better match"),
-+ PARSE_OPT_NOARG, blame_diff_algorithm_minimal),
++ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
@@ t/t8015-blame-diff-algorithm.sh (new)
+ EOF
+
+ git -c diff.algorithm=histogram blame file.c > output &&
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
-+ sed -e "s/ *$//g" without_varying_parts > actual &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
++ -e "s/ *$//g" output > actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram &&
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
-+ sed -e "s/ *$//g" without_varying_parts > actual &&
++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
++ -e "s/ *$//g" output > actual &&
+ test_cmp expected actual
+'
+
--
gitgitgadget
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
@ 2025-11-06 22:41 ` Antonin Delpeuch via GitGitGadget
2025-11-07 15:52 ` Junio C Hamano
2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
` (2 subsequent siblings)
3 siblings, 1 reply; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
reseting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.
Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
diff.c | 2 --
merge-ort.c | 2 --
xdiff/xdiff.h | 2 +-
3 files changed, 1 insertion(+), 5 deletions(-)
diff --git a/diff.c b/diff.c
index 87fa16b730..6ce3591c5b 100644
--- a/diff.c
+++ b/diff.c
@@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts,
if (value < 0)
return -1;
- /* clear out previous settings */
- DIFF_XDL_CLR(opts, NEED_MINIMAL);
opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opts->xdl_opts |= value;
diff --git a/merge-ort.c b/merge-ort.c
index 29858074f9..9b2b0fce7e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5495,8 +5495,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s)
long value = parse_algorithm_value(arg);
if (value < 0)
return -1;
- /* clear out previous settings */
- DIFF_XDL_CLR(opt, NEED_MINIMAL);
opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opt->xdl_opts |= value;
}
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe..dc370712e9 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -43,7 +43,7 @@ extern "C" {
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)
-#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
+#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)
#define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK)
#define XDF_INDENT_HEURISTIC (1 << 23)
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v5 2/2] blame: make diff algorithm configurable
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
@ 2025-11-06 22:41 ` Antonin Delpeuch via GitGitGadget
2025-11-07 15:57 ` Junio C Hamano
2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
3 siblings, 1 reply; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:
https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.
Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 +++++-
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
6 files changed, 278 insertions(+), 21 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 0000000000..8e3a0b63d7
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f..9cdad6f72a 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
- Choose a diff algorithm. The variants are as follows:
-+
---
- `default`;;
- `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
- `minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
- `patience`;;
- Use "patience diff" algorithm when generating patches.
- `histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
`--stat[=<width>[,<name-width>[,<count>]]]`::
Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..adcbb6f5dc 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..27b513d27f 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_minimal(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+
+ BUG_ON_OPT_ARG(arg);
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ if (!unset)
+ *opt |= XDF_NEED_MINIMAL;
+
+ return 0;
+}
+
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -915,11 +960,16 @@ int cmd_blame(int argc,
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find a better match"),
+ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
diff --git a/t/meson.build b/t/meson.build
index 401b24e50e..9f2fe7af8b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -955,6 +955,7 @@ integration_tests = [
't8012-blame-colors.sh',
't8013-blame-ignore-revs.sh',
't8014-blame-ignore-fuzzy.sh',
+ 't8015-blame-diff-algorithm.sh',
't8020-last-modified.sh',
't9001-send-email.sh',
't9002-column.sh',
diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
new file mode 100755
index 0000000000..55e1d540dc
--- /dev/null
+++ b/t/t8015-blame-diff-algorithm.sh
@@ -0,0 +1,203 @@
+#!/bin/sh
+
+test_description='git blame with specific diff algorithm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ cat >file.c <<-\EOF &&
+ int f(int x, int y)
+ {
+ if (x == 0)
+ {
+ return y;
+ }
+ return x;
+ }
+
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+ EOF
+ test_write_lines x x x x >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 &&
+
+ cat >file.c <<-\EOF &&
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+
+ int h(int x, int y, int z)
+ {
+ if (z == 0)
+ {
+ return x;
+ }
+ return y;
+ }
+ EOF
+ test_write_lines x x x A B C D x E F G >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default' '
+ cat >expected <<-\EOF &&
+ Commit_2 int g(size_t u)
+ Commit_1 {
+ Commit_2 while (u < 30)
+ Commit_1 {
+ Commit_2 u++;
+ Commit_1 }
+ Commit_2 return u;
+ Commit_1 }
+ Commit_1
+ Commit_2 int h(int x, int y, int z)
+ Commit_1 {
+ Commit_2 if (z == 0)
+ Commit_1 {
+ Commit_2 return x;
+ Commit_1 }
+ Commit_2 return y;
+ Commit_1 }
+ EOF
+
+ git blame file.c > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git blame file.c --diff-algorithm histogram > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git -c diff.algorithm=histogram blame file.c > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
+ -e "s/ *$//g" output > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
+ -e "s/ *$//g" output > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_1 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame respects the order of diff options' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_2 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal --diff-algorithm myers > output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
+ test_cmp expected actual
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5 0/2] blame: make diff algorithm configurable
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
@ 2025-11-07 15:49 ` Phillip Wood
2025-11-17 1:12 ` Junio C Hamano
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
3 siblings, 1 reply; 32+ messages in thread
From: Phillip Wood @ 2025-11-07 15:49 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch
Hi Antonin
On 06/11/2025 22:41, Antonin Delpeuch via GitGitGadget wrote:
> Changes since v4:
>
> * hide --minimal option
> * simplify tests to minimize spun processes
> * remove redundant XDF_NEED_MINIMAL in bit mask
Excellent, the range-diff below looks good. Thanks for working on this
Phillip
> Antonin Delpeuch (2):
> xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
> blame: make diff algorithm configurable
>
> Documentation/diff-algorithm-option.adoc | 20 +++
> Documentation/diff-options.adoc | 21 +--
> Documentation/git-blame.adoc | 2 +
> builtin/blame.c | 52 +++++-
> diff.c | 2 -
> merge-ort.c | 2 -
> t/meson.build | 1 +
> t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
> xdiff/xdiff.h | 2 +-
> 9 files changed, 279 insertions(+), 26 deletions(-)
> create mode 100644 Documentation/diff-algorithm-option.adoc
> create mode 100755 t/t8015-blame-diff-algorithm.sh
>
>
> base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v5
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v5
> Pull-Request: https://github.com/git/git/pull/2075
>
> Range-diff vs v4:
>
> 1: e81a5d2bd2 = 1: e81a5d2bd2 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
> 2: 920a6f3acb ! 2: 60015bbada blame: make diff algorithm configurable
> @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
> +
> -+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
> ++ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
> + *opt |= value;
> +
> + return 0;
> @@ builtin/blame.c: int cmd_blame(int argc,
> - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
> + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
> + N_("spend extra cycles to find a better match"),
> -+ PARSE_OPT_NOARG, blame_diff_algorithm_minimal),
> ++ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal),
> OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
> OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
> OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + EOF
> +
> + git -c diff.algorithm=histogram blame file.c > output &&
> -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> -+ sed -e "s/ *$//g" without_varying_parts > actual &&
> ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
> ++ -e "s/ *$//g" output > actual &&
> + test_cmp expected actual
> +'
> +
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_2 }
> + EOF
> +
> -+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram &&
> -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> -+ sed -e "s/ *$//g" without_varying_parts > actual &&
> ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output &&
> ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
> ++ -e "s/ *$//g" output > actual &&
> + test_cmp expected actual
> +'
> +
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
@ 2025-11-07 15:52 ` Junio C Hamano
0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-11-07 15:52 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch
"Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
> and histogram diffs, not for the minimal one. This means that when
> reseting the diff algorithm to the default one, one needs to separately
> clear the bit for the minimal diff. There are places in the code that fail
> to do that: merge-ort.c and builtin/merge-file.c.
>
> Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
> clearing of this bit in the places where it hasn't been forgotten.
Makes sense. In other words, lack of any algorithm-mask bit means
the code uses myers.
> diff --git a/diff.c b/diff.c
> index 87fa16b730..6ce3591c5b 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts,
> if (value < 0)
> return -1;
>
> - /* clear out previous settings */
> - DIFF_XDL_CLR(opts, NEED_MINIMAL);
> opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
The comment still accurately describes what the surviving line does,
though. It is borderline if it needs commenting, but the topic of
this patch not being "remove overly obvious comments", I'd probably
vote for retaining the comment.
> diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
> index 2cecde5afe..dc370712e9 100644
> --- a/xdiff/xdiff.h
> +++ b/xdiff/xdiff.h
> @@ -43,7 +43,7 @@ extern "C" {
>
> #define XDF_PATIENCE_DIFF (1 << 14)
> #define XDF_HISTOGRAM_DIFF (1 << 15)
> -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
> +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)
> #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK)
Given the definition of XDF_DIFF_ALG(), I wondered how it is used.
$ git grep -n -e 'XDF_DIFF_ALG(' \*.c
xdiff/xdiffi.c:324: if (XDF_DIFF_ALG(xpp->flags) == XDF_PATIENCE_DIFF) {
xdiff/xdiffi.c:329: if (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF) {
xdiff/xprepare.c:170: if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
xdiff/xprepare.c:171: (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) {
xdiff/xprepare.c:396: sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF
xdiff/xprepare.c:417: if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) &&
xdiff/xprepare.c:418: (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) &&
They say "if the specified algorithm is (or is not) patience (or
histogram), do this". Now the original code, because the mask did
not include the need-minimal bit, would have chosen patience code
path even if xpp->flags had XDF_PATIENCE_DIFF and XDF_NEED_MINIMAL
bits set at the same time. The code would no longer do so.
What is keeping us safe and not making this change a bug is that
among XDF_DIFF_ALGORITHM_MASK bits, we intend to set at most one of
them at a time. I wonder if we want the command line option and
configuration parser to have an explicit check (and BUG("")) to
ensure this constraint.
Also, in the longer term as #leftoverbits clean-up, perhaps these
bits can be removed from xpp->flags and diff_options->xdl_opts, xpp
structure can gain a separate member that is an enum of the
algorithm names instead, and XDF_DIFF_ALGORITHM_MASK can be dropped?
Then set_diff_algorithm() we saw earlier can become
if (value < 0)
return -1;
opts->xdl_algo = value;
return 0;
And since there is no "clean out prvious settings" required (now we
can simply overwrite), we can truly lose that old comment once we do
so.
As a part of this topic, I think that a new code to sanity check
that there are at most one bit in XDF_DIFF_ALG(xpp->flags) may be a
good safety measure to have. Moving the algorithm bits out of the
flags is a larger change, and it may be better left outside the
topic, but I do not personaly mind seeing such a clean-up included
as a preparatory change for this series, either.
Thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 2/2] blame: make diff algorithm configurable
2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
@ 2025-11-07 15:57 ` Junio C Hamano
0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-11-07 15:57 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch
"Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The diff algorithm used in 'git-blame(1)' is set to 'myers',
> without the possibility to change it aside from the `--minimal` option.
>
> There has been long-standing interest in changing the default diff
> algorithm to "histogram", and Git 3.0 was floated as a possible occasion
> for taking some steps towards that:
>
> https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
>
> As a preparation for this move, it is worth making sure that the diff
> algorithm is configurable where useful.
>
> Make it configurable in the `git-blame(1)` command by introducing the
> `--diff-algorithm` option and make honor the `diff.algorithm` config
> variable. Keep Myers diff as the default.
>
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> ---
This step does not have anything surprising in it, knowing what the
previous iteration contained. Looking good.
Other than that many redirections into a file are written with a
space between redirection operator and its target, i.e.
command > output &&
that should be, according to the coding guidelines, written like
command >output &&
that is.
> +test_expect_success 'blame respects the order of diff options' '
> + cat >expected <<-\EOF &&
> +...
> + EOF
> +
> + git blame file.txt --minimal --diff-algorithm myers > output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
> + test_cmp expected actual
> +'
Thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 0/2] blame: make diff algorithm configurable
2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood
@ 2025-11-17 1:12 ` Junio C Hamano
0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-11-17 1:12 UTC (permalink / raw)
To: Phillip Wood, Antonin Delpeuch
Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren
Phillip Wood <phillip.wood123@gmail.com> writes:
> Excellent, the range-diff below looks good. Thanks for working on this
>
> Phillip
>
>> Antonin Delpeuch (2):
>> xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
>> blame: make diff algorithm configurable
>>
>> Documentation/diff-algorithm-option.adoc | 20 +++
>> Documentation/diff-options.adoc | 21 +--
>> Documentation/git-blame.adoc | 2 +
>> builtin/blame.c | 52 +++++-
>> diff.c | 2 -
>> merge-ort.c | 2 -
>> t/meson.build | 1 +
>> t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
>> xdiff/xdiff.h | 2 +-
>> 9 files changed, 279 insertions(+), 26 deletions(-)
>> create mode 100644 Documentation/diff-algorithm-option.adoc
>> create mode 100755 t/t8015-blame-diff-algorithm.sh
OK, we haven't seen any activities since we saw this comment. Are
we ready to mark the topic for 'next', or are you waiting for the
end of feature freeze to make a (hopefully small and fanal) reroll?
Thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v6 0/2] blame: make diff algorithm configurable
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
` (2 preceding siblings ...)
2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood
@ 2025-11-17 8:04 ` Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
` (3 more replies)
3 siblings, 4 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch
Changes since v5:
* add back /* clear out previous settings */ comments
* remove whitespace in bash output redirection
Antonin Delpeuch (2):
xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
blame: make diff algorithm configurable
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 +++++-
diff.c | 1 -
merge-ort.c | 1 -
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
xdiff/xdiff.h | 2 +-
9 files changed, 279 insertions(+), 24 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v6
Pull-Request: https://github.com/git/git/pull/2075
Range-diff vs v5:
1: e81a5d2bd2 ! 1: 4846715436 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
@@ Commit message
## diff.c ##
@@ diff.c: static int set_diff_algorithm(struct diff_options *opts,
- if (value < 0)
return -1;
-- /* clear out previous settings */
+ /* clear out previous settings */
- DIFF_XDL_CLR(opts, NEED_MINIMAL);
opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opts->xdl_opts |= value;
@@ diff.c: static int set_diff_algorithm(struct diff_options *opts,
## merge-ort.c ##
@@ merge-ort.c: int parse_merge_opt(struct merge_options *opt, const char *s)
- long value = parse_algorithm_value(arg);
if (value < 0)
return -1;
-- /* clear out previous settings */
+ /* clear out previous settings */
- DIFF_XDL_CLR(opt, NEED_MINIMAL);
opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opt->xdl_opts |= value;
2: 60015bbada ! 2: c477b87cc6 blame: make diff algorithm configurable
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_1 }
+ EOF
+
-+ git blame file.c > output &&
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
-+ sed -e "s/ *$//g" without_varying_parts > actual &&
++ git blame file.c >output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
++ sed -e "s/ *$//g" without_varying_parts >actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git blame file.c --diff-algorithm histogram > output &&
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
-+ sed -e "s/ *$//g" without_varying_parts > actual &&
++ git blame file.c --diff-algorithm histogram >output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
++ sed -e "s/ *$//g" without_varying_parts >actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git -c diff.algorithm=histogram blame file.c > output &&
++ git -c diff.algorithm=histogram blame file.c >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
-+ -e "s/ *$//g" output > actual &&
++ -e "s/ *$//g" output >actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 }
+ EOF
+
-+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output &&
++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
-+ -e "s/ *$//g" output > actual &&
++ -e "s/ *$//g" output >actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 G
+ EOF
+
-+ git blame file.txt --minimal > output &&
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
++ git blame file.txt --minimal >output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
+ test_cmp expected actual
+'
+
@@ t/t8015-blame-diff-algorithm.sh (new)
+ Commit_2 G
+ EOF
+
-+ git blame file.txt --minimal --diff-algorithm myers > output &&
-+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
++ git blame file.txt --minimal --diff-algorithm myers >output &&
++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
+ test_cmp expected actual
+'
+
--
gitgitgadget
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
@ 2025-11-17 8:04 ` Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
` (2 subsequent siblings)
3 siblings, 0 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
reseting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.
Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
diff.c | 1 -
merge-ort.c | 1 -
xdiff/xdiff.h | 2 +-
3 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/diff.c b/diff.c
index 87fa16b730..cdcd11f1f7 100644
--- a/diff.c
+++ b/diff.c
@@ -3527,7 +3527,6 @@ static int set_diff_algorithm(struct diff_options *opts,
return -1;
/* clear out previous settings */
- DIFF_XDL_CLR(opts, NEED_MINIMAL);
opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opts->xdl_opts |= value;
diff --git a/merge-ort.c b/merge-ort.c
index 29858074f9..23e2b64c79 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5496,7 +5496,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s)
if (value < 0)
return -1;
/* clear out previous settings */
- DIFF_XDL_CLR(opt, NEED_MINIMAL);
opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
opt->xdl_opts |= value;
}
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2cecde5afe..dc370712e9 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -43,7 +43,7 @@ extern "C" {
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)
-#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)
+#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL)
#define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK)
#define XDF_INDENT_HEURISTIC (1 << 23)
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v6 2/2] blame: make diff algorithm configurable
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
@ 2025-11-17 8:04 ` Antonin Delpeuch via GitGitGadget
2025-11-17 14:13 ` [PATCH v6 0/2] " Phillip Wood
2025-11-17 18:24 ` Junio C Hamano
3 siblings, 0 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:
https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.
Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
Documentation/diff-algorithm-option.adoc | 20 +++
Documentation/diff-options.adoc | 21 +--
Documentation/git-blame.adoc | 2 +
builtin/blame.c | 52 +++++-
t/meson.build | 1 +
t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
6 files changed, 278 insertions(+), 21 deletions(-)
create mode 100644 Documentation/diff-algorithm-option.adoc
create mode 100755 t/t8015-blame-diff-algorithm.sh
diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc
new file mode 100644
index 0000000000..8e3a0b63d7
--- /dev/null
+++ b/Documentation/diff-algorithm-option.adoc
@@ -0,0 +1,20 @@
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc
index ae31520f7f..9cdad6f72a 100644
--- a/Documentation/diff-options.adoc
+++ b/Documentation/diff-options.adoc
@@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.
-`--diff-algorithm=(patience|minimal|histogram|myers)`::
- Choose a diff algorithm. The variants are as follows:
-+
---
- `default`;;
- `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
- `minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
- `patience`;;
- Use "patience diff" algorithm when generating patches.
- `histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
---
-+
-For instance, if you configured the `diff.algorithm` variable to a
-non-default value and want to use the default one, then you
-have to use `--diff-algorithm=default` option.
+include::diff-algorithm-option.adoc[]
`--stat[=<width>[,<name-width>[,<count>]]]`::
Generate a diffstat. By default, as much space as necessary
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..adcbb6f5dc 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,8 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+include::diff-algorithm-option.adoc[]
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..27b513d27f 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_minimal(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+
+ BUG_ON_OPT_ARG(arg);
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ if (!unset)
+ *opt |= XDF_NEED_MINIMAL;
+
+ return 0;
+}
+
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -915,11 +960,16 @@ int cmd_blame(int argc,
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
+ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
+ N_("spend extra cycles to find a better match"),
+ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal),
OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback),
diff --git a/t/meson.build b/t/meson.build
index 401b24e50e..9f2fe7af8b 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -955,6 +955,7 @@ integration_tests = [
't8012-blame-colors.sh',
't8013-blame-ignore-revs.sh',
't8014-blame-ignore-fuzzy.sh',
+ 't8015-blame-diff-algorithm.sh',
't8020-last-modified.sh',
't9001-send-email.sh',
't9002-column.sh',
diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
new file mode 100755
index 0000000000..cd709536c6
--- /dev/null
+++ b/t/t8015-blame-diff-algorithm.sh
@@ -0,0 +1,203 @@
+#!/bin/sh
+
+test_description='git blame with specific diff algorithm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+ cat >file.c <<-\EOF &&
+ int f(int x, int y)
+ {
+ if (x == 0)
+ {
+ return y;
+ }
+ return x;
+ }
+
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+ EOF
+ test_write_lines x x x x >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 &&
+
+ cat >file.c <<-\EOF &&
+ int g(size_t u)
+ {
+ while (u < 30)
+ {
+ u++;
+ }
+ return u;
+ }
+
+ int h(int x, int y, int z)
+ {
+ if (z == 0)
+ {
+ return x;
+ }
+ return y;
+ }
+ EOF
+ test_write_lines x x x A B C D x E F G >file.txt &&
+ git add file.c file.txt &&
+ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2
+'
+
+test_expect_success 'blame uses Myers diff algorithm by default' '
+ cat >expected <<-\EOF &&
+ Commit_2 int g(size_t u)
+ Commit_1 {
+ Commit_2 while (u < 30)
+ Commit_1 {
+ Commit_2 u++;
+ Commit_1 }
+ Commit_2 return u;
+ Commit_1 }
+ Commit_1
+ Commit_2 int h(int x, int y, int z)
+ Commit_1 {
+ Commit_2 if (z == 0)
+ Commit_1 {
+ Commit_2 return x;
+ Commit_1 }
+ Commit_2 return y;
+ Commit_1 }
+ EOF
+
+ git blame file.c >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --diff-algorithm option' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git blame file.c --diff-algorithm histogram >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
+ sed -e "s/ *$//g" without_varying_parts >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors diff.algorithm config variable' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git -c diff.algorithm=histogram blame file.c >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
+ -e "s/ *$//g" output >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
+ cat >expected <<-\EOF &&
+ Commit_1 int g(size_t u)
+ Commit_1 {
+ Commit_1 while (u < 30)
+ Commit_1 {
+ Commit_1 u++;
+ Commit_1 }
+ Commit_1 return u;
+ Commit_1 }
+ Commit_2
+ Commit_2 int h(int x, int y, int z)
+ Commit_2 {
+ Commit_2 if (z == 0)
+ Commit_2 {
+ Commit_2 return x;
+ Commit_2 }
+ Commit_2 return y;
+ Commit_2 }
+ EOF
+
+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
+ -e "s/ *$//g" output >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame honors --minimal option' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_1 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'blame respects the order of diff options' '
+ cat >expected <<-\EOF &&
+ Commit_1 x
+ Commit_1 x
+ Commit_1 x
+ Commit_2 A
+ Commit_2 B
+ Commit_2 C
+ Commit_2 D
+ Commit_2 x
+ Commit_2 E
+ Commit_2 F
+ Commit_2 G
+ EOF
+
+ git blame file.txt --minimal --diff-algorithm myers >output &&
+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
+ test_cmp expected actual
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v6 0/2] blame: make diff algorithm configurable
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
@ 2025-11-17 14:13 ` Phillip Wood
2025-11-17 18:24 ` Junio C Hamano
3 siblings, 0 replies; 32+ messages in thread
From: Phillip Wood @ 2025-11-17 14:13 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch
Hi Antonin
On 17/11/2025 08:04, Antonin Delpeuch via GitGitGadget wrote:
> Changes since v5:
>
> * add back /* clear out previous settings */ comments
> * remove whitespace in bash output redirection
Thanks for re-rolling, the range-diff below looks good. Being able to
configure the diff algorithm for "git blame" is a nice addition, thanks
for working on it.
Phillip
> Antonin Delpeuch (2):
> xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
> blame: make diff algorithm configurable
>
> Documentation/diff-algorithm-option.adoc | 20 +++
> Documentation/diff-options.adoc | 21 +--
> Documentation/git-blame.adoc | 2 +
> builtin/blame.c | 52 +++++-
> diff.c | 1 -
> merge-ort.c | 1 -
> t/meson.build | 1 +
> t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++
> xdiff/xdiff.h | 2 +-
> 9 files changed, 279 insertions(+), 24 deletions(-)
> create mode 100644 Documentation/diff-algorithm-option.adoc
> create mode 100755 t/t8015-blame-diff-algorithm.sh
>
>
> base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v6
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v6
> Pull-Request: https://github.com/git/git/pull/2075
>
> Range-diff vs v5:
>
> 1: e81a5d2bd2 ! 1: 4846715436 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
> @@ Commit message
>
> ## diff.c ##
> @@ diff.c: static int set_diff_algorithm(struct diff_options *opts,
> - if (value < 0)
> return -1;
>
> -- /* clear out previous settings */
> + /* clear out previous settings */
> - DIFF_XDL_CLR(opts, NEED_MINIMAL);
> opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
> opts->xdl_opts |= value;
> @@ diff.c: static int set_diff_algorithm(struct diff_options *opts,
>
> ## merge-ort.c ##
> @@ merge-ort.c: int parse_merge_opt(struct merge_options *opt, const char *s)
> - long value = parse_algorithm_value(arg);
> if (value < 0)
> return -1;
> -- /* clear out previous settings */
> + /* clear out previous settings */
> - DIFF_XDL_CLR(opt, NEED_MINIMAL);
> opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
> opt->xdl_opts |= value;
> 2: 60015bbada ! 2: c477b87cc6 blame: make diff algorithm configurable
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_1 }
> + EOF
> +
> -+ git blame file.c > output &&
> -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> -+ sed -e "s/ *$//g" without_varying_parts > actual &&
> ++ git blame file.c >output &&
> ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
> ++ sed -e "s/ *$//g" without_varying_parts >actual &&
> + test_cmp expected actual
> +'
> +
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_2 }
> + EOF
> +
> -+ git blame file.c --diff-algorithm histogram > output &&
> -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts &&
> -+ sed -e "s/ *$//g" without_varying_parts > actual &&
> ++ git blame file.c --diff-algorithm histogram >output &&
> ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts &&
> ++ sed -e "s/ *$//g" without_varying_parts >actual &&
> + test_cmp expected actual
> +'
> +
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_2 }
> + EOF
> +
> -+ git -c diff.algorithm=histogram blame file.c > output &&
> ++ git -c diff.algorithm=histogram blame file.c >output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
> -+ -e "s/ *$//g" output > actual &&
> ++ -e "s/ *$//g" output >actual &&
> + test_cmp expected actual
> +'
> +
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_2 }
> + EOF
> +
> -+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output &&
> ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output &&
> + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \
> -+ -e "s/ *$//g" output > actual &&
> ++ -e "s/ *$//g" output >actual &&
> + test_cmp expected actual
> +'
> +
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_2 G
> + EOF
> +
> -+ git blame file.txt --minimal > output &&
> -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
> ++ git blame file.txt --minimal >output &&
> ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
> + test_cmp expected actual
> +'
> +
> @@ t/t8015-blame-diff-algorithm.sh (new)
> + Commit_2 G
> + EOF
> +
> -+ git blame file.txt --minimal --diff-algorithm myers > output &&
> -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual &&
> ++ git blame file.txt --minimal --diff-algorithm myers >output &&
> ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual &&
> + test_cmp expected actual
> +'
> +
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v6 0/2] blame: make diff algorithm configurable
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
` (2 preceding siblings ...)
2025-11-17 14:13 ` [PATCH v6 0/2] " Phillip Wood
@ 2025-11-17 18:24 ` Junio C Hamano
3 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2025-11-17 18:24 UTC (permalink / raw)
To: Antonin Delpeuch via GitGitGadget
Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch
"Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Changes since v5:
>
> * add back /* clear out previous settings */ comments
> * remove whitespace in bash output redirection
>
> Antonin Delpeuch (2):
> xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
> blame: make diff algorithm configurable
Will queue. Thanks!
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2025-11-17 18:24 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-10-20 16:05 ` Junio C Hamano
2025-10-22 9:37 ` Antonin Delpeuch
2025-10-22 20:39 ` Junio C Hamano
2025-10-23 16:03 ` Phillip Wood
2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2025-10-28 15:22 ` Junio C Hamano
2025-10-28 16:00 ` Antonin Delpeuch
2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget
2025-10-29 10:16 ` Phillip Wood
2025-10-29 18:46 ` Junio C Hamano
2025-10-30 9:22 ` Antonin Delpeuch
2025-10-30 10:47 ` Phillip Wood
2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-03 14:32 ` Phillip Wood
2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-03 14:32 ` Phillip Wood
2025-11-03 16:15 ` Junio C Hamano
2025-11-06 20:29 ` Junio C Hamano
2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-07 15:52 ` Junio C Hamano
2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-07 15:57 ` Junio C Hamano
2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood
2025-11-17 1:12 ` Junio C Hamano
2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-17 8:04 ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-17 14:13 ` [PATCH v6 0/2] " Phillip Wood
2025-11-17 18:24 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).