* [PATCH] blame: make diff algorithm configurable
@ 2025-10-20 14:56 Antonin Delpeuch via GitGitGadget
2025-10-20 16:05 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Antonin Delpeuch via GitGitGadget @ 2025-10-20 14:56 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Antonin Delpeuch, Antonin Delpeuch
From: Antonin Delpeuch <antonin@delpeuch.eu>
The diff algorithm used in 'git-blame(1)' can be configured using the
`--diff-algorithm` option or the `diff.algorithm` config variable.
Myers diff remains the default.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
---
blame: make diff algorithm configurable
There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
As a preparation, it is worth making sure that the diff algorithm is
configurable where useful. It can have significant impact on the output
of the git-blame command, so I propose to make it configurable there
too. I have followed the convention of other commands (such as git-diff)
to introduce a --diff-algorithm option.
I understand that this command is a user-facing (porcelain) one, so I
think making it honor the diff.algorithm UI config variable is also
appropriate. The git-blame command has a machine-readable format that
can be enabled with --porcelain (which should be called --plumbing if
you ask me) so I wonder if the diff.algorithm variable should still be
honored in this case, as there could be the desire to keep it
independent from UI config variables (similarly to git-merge-file, a
plumbing command which doesn't honor diff.algorithm).
If the general idea of this patch is judged worthwhile, I would be happy
to add tests to demonstrate the impact of the diff algorithm on blame
output.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v1
Pull-Request: https://github.com/git/git/pull/2075
Documentation/git-blame.adoc | 21 ++++++++++++++++++++
builtin/blame.c | 38 +++++++++++++++++++++++++++++++++++-
2 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc
index e438d28625..4beb2df551 100644
--- a/Documentation/git-blame.adoc
+++ b/Documentation/git-blame.adoc
@@ -85,6 +85,27 @@ include::blame-options.adoc[]
Ignore whitespace when comparing the parent's version and
the child's to find where the lines came from.
+`--diff-algorithm=(patience|minimal|histogram|myers)`::
+ Choose a diff algorithm. The variants are as follows:
++
+--
+ `default`;;
+ `myers`;;
+ The basic greedy diff algorithm. Currently, this is the default.
+ `minimal`;;
+ Spend extra time to make sure the smallest possible diff is
+ produced.
+ `patience`;;
+ Use "patience diff" algorithm when generating patches.
+ `histogram`;;
+ This algorithm extends the patience algorithm to "support
+ low-occurrence common elements".
+--
++
+For instance, if you configured the `diff.algorithm` variable to a
+non-default value and want to use the default one, then you
+have to use `--diff-algorithm=default` option.
+
--abbrev=<n>::
Instead of using the default 7+1 hexadecimal digits as the
abbreviated object name, use <m>+1 digits, where <m> is at
diff --git a/builtin/blame.c b/builtin/blame.c
index 2703820258..177b606e81 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
}
}
+ if (!strcmp(var, "diff.algorithm")) {
+ long diff_algorithm;
+ if (!value)
+ return config_error_nonbool(var);
+ diff_algorithm = parse_algorithm_value(value);
+ if (diff_algorithm < 0)
+ return error(_("unknown value for config '%s': %s"),
+ var, value);
+ xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
+ xdl_opts |= diff_algorithm;
+ return 0;
+ }
+
if (git_diff_heuristic_config(var, value, cb) < 0)
return -1;
if (userdiff_config(var, value) < 0)
@@ -824,6 +837,26 @@ static int blame_move_callback(const struct option *option, const char *arg, int
return 0;
}
+static int blame_diff_algorithm_callback(const struct option *option,
+ const char *arg, int unset)
+{
+ int *opt = option->value;
+ long value = parse_algorithm_value(arg);
+
+ BUG_ON_OPT_NEG(unset);
+
+ if (value < 0)
+ return error(_("option diff-algorithm accepts \"myers\", "
+ "\"minimal\", \"patience\" and \"histogram\""));
+
+ // ignore any previous --minimal setting, following git-diff's behavior
+ *opt &= ~XDF_NEED_MINIMAL;
+ *opt &= ~XDF_DIFF_ALGORITHM_MASK;
+ *opt |= value;
+
+ return 0;
+}
+
static int is_a_rev(const char *name)
{
struct object_id oid;
@@ -908,13 +941,16 @@ int cmd_blame(int argc,
OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME),
OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER),
OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN),
- OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN),
+ OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN),
OPT_BIT('c', NULL, &output_option, N_("use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT),
OPT_BIT('t', NULL, &output_option, N_("show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP),
OPT_BIT('l', NULL, &output_option, N_("show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME),
OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
+ OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
+ N_("choose a diff algorithm"),
+ PARSE_OPT_NONEG, blame_diff_algorithm_callback),
OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1
--
gitgitgadget
^ permalink raw reply related [flat|nested] 32+ messages in thread* Re: [PATCH] blame: make diff algorithm configurable 2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget @ 2025-10-20 16:05 ` Junio C Hamano 2025-10-22 9:37 ` Antonin Delpeuch 2025-10-23 16:03 ` Phillip Wood 2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget 2 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2025-10-20 16:05 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget; +Cc: git, Elijah Newren, Antonin Delpeuch "Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The diff algorithm used in 'git-blame(1)' can be configured using the > `--diff-algorithm` option or the `diff.algorithm` config variable. > Myers diff remains the default. The usual way to compose a log message of this project is to - Give an observation on how the current system works in the present tense (so no need to say "Currently X is Y", or "Previously X was Y" to describe the state before your change; just "X is Y" is enough), and discuss what you perceive as a problem in it. - Propose a solution (optional---often, problem description trivially leads to an obvious solution in reader's minds). - Give commands to somebody editing the codebase to "make it so", instead of saying "This commit does X". in this order. This hasn't changed since your first commit to this project a few years ago. And when read with that expectation, I was surprised that "blame" already paid attention to the command line option and configuration variable, as that paragraph was supposed to explain what happens without the patch being proposed. It was a pleasant surprise that turned out to be untrue X-<. > Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> > --- > blame: make diff algorithm configurable > > There has been long-standing interest in changing the default diff > algorithm to "histogram", and Git 3.0 was floated as a possible occasion > for that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ > > As a preparation, it is worth making sure that the diff algorithm is > configurable where useful. It can have significant impact on the output > of the git-blame command, so I propose to make it configurable there > too. I have followed the convention of other commands (such as git-diff) > to introduce a --diff-algorithm option. All of the above are good materials to be in the proposed log message, not under the three-dash line, to explain the motivation behind the change. > I understand that this command is a user-facing (porcelain) one, so I > think making it honor the diff.algorithm UI config variable is also > appropriate. The git-blame command has a machine-readable format that > can be enabled with --porcelain (which should be called --plumbing if > you ask me) so I wonder if the diff.algorithm variable should still be > honored in this case, as there could be the desire to keep it > independent from UI config variables (similarly to git-merge-file, a > plumbing command which doesn't honor diff.algorithm). Good consideration and something we should make sure we do the right thing for our users. Personally, I would not be concerned---the only folks that possibly affected are those who save old blame output and wants a fresh "git blame" run they make today would produce bit-for-bit identical output, but if they use newer versions of Git with improved xdiff implementation, they cannot expect that with or without the configuration knob _anyway_. This is just my personal opinion. Others may differ. > If the general idea of this patch is judged worthwhile, I would be happy > to add tests to demonstrate the impact of the diff algorithm on blame > output. Do not ever say this here. I've seen from time to time people ask "I am thinking of doing this; will a patch be accepted? If so, I'll work on it." before showing any work, and my response always has been: (1) We don't know how useful and interesting your contribution would be for our audience, until we see it; and (2) If you truly believe in your work (find it useful, find writing it fun, etc.), that would be incentive enough for you to work on it, whether or not the result will land in my tree. You should instead aim for something so brilliant that we would come to you begging for your permission to include it in our project. > diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc > index e438d28625..4beb2df551 100644 > --- a/Documentation/git-blame.adoc > +++ b/Documentation/git-blame.adoc > @@ -85,6 +85,27 @@ include::blame-options.adoc[] > Ignore whitespace when comparing the parent's version and > the child's to find where the lines came from. > > +`--diff-algorithm=(patience|minimal|histogram|myers)`:: > + Choose a diff algorithm. The variants are as follows: > ++ > +-- > + `default`;; > + `myers`;; > + The basic greedy diff algorithm. Currently, this is the default. > + `minimal`;; > + Spend extra time to make sure the smallest possible diff is > + produced. > + `patience`;; > + Use "patience diff" algorithm when generating patches. > + `histogram`;; > + This algorithm extends the patience algorithm to "support > + low-occurrence common elements". > +-- > ++ > +For instance, if you configured the `diff.algorithm` variable to a > +non-default value and want to use the default one, then you > +have to use `--diff-algorithm=default` option. Is this copied from somewhere else, or did you come up with the above text yourself? If the former, perhaps it is a good idea to reduce the duplicattion. Use of "include::line-range-format.adoc[]" in Documentation/blame-options.adoc (which in turn is included by Documentation/git-blame.adoc) may serve as a good model to include the same text in multiple places (the "line-range" syntax thing is included directly or indirectly and its text appears in a handful of places as the result). Copy the original text out into a new file to be included (say, "diff-algorithm-option.adoc"), replace the original text with "include::diff-algorithm-option.adoc[]", and then add another "include::diff-algorithm-option.adoc[]" here in git-blame documentation instead of duplicating the text like the above hunk does. > diff --git a/builtin/blame.c b/builtin/blame.c > index 2703820258..177b606e81 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, > } > } > > + if (!strcmp(var, "diff.algorithm")) { > + long diff_algorithm; > + if (!value) > + return config_error_nonbool(var); > + diff_algorithm = parse_algorithm_value(value); > + if (diff_algorithm < 0) > + return error(_("unknown value for config '%s': %s"), > + var, value); OK, this message is copied from git_diff_ui_config(), which is where "git log" and 4 commands in the "git diff" family gets their error message when "git -c diff.algorithm=bogus <cmd>" is run. It is a bit suboptimal, but users would know how to read the documentation (even though in practice they never do), so let's say this is OK, at least for now. For future reference (note: this is a #leftoverbits comment that is left here for the benefit of those who scan the list archive for ideas on what to do when they are absolutely bored without anything interesting to do, not meant as a suggestion to do anything of this sort before this patch lands), in addition to "git log" and 4 commands in the "git diff" family, - merge-ort.c has the same message. - builtin/merge-file.c gives a bit nicer message but that is a bit of maintenance burden. - curiously "git log" and four commands in the "git diff" family give a much nicer message when a --diff-algorithm=bogus is given from the command line, but not in the configuration file. we may want to consolidate the error message into one place (a constant or "extern const char *diff_algorithm_error_message", or something else that is i18n friendly) and use it from all these places I just identified. > + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; > + xdl_opts |= diff_algorithm; > + return 0; > + } > if (git_diff_heuristic_config(var, value, cb) < 0) > return -1; This one relies on git_diff_heuristic_config() to give message when it returns negative, so we do not have to do anything. OK. Contination of the above #leftoverbits may be to see if parse_algorithm_value() is a good place to consolidate the error message, after auditing all its callers (if such a change turns out to be a good idea, they need to lose their own messages). > @@ -824,6 +837,26 @@ static int blame_move_callback(const struct option *option, const char *arg, int > return 0; > } > > +static int blame_diff_algorithm_callback(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + long value = parse_algorithm_value(arg); > + > + BUG_ON_OPT_NEG(unset); > + > + if (value < 0) > + return error(_("option diff-algorithm accepts \"myers\", " > + "\"minimal\", \"patience\" and \"histogram\"")); You inherited the same "config error gets a message that requires users to consult the manual, option error gets something a bit more useful but is a maintenance burden" trait from "git diff" and family here. Let's say this is OK, too, at least for now. > + // ignore any previous --minimal setting, following git-diff's behavior We do not do // comments around here, outside borrowed code. > + *opt &= ~XDF_NEED_MINIMAL; > + *opt &= ~XDF_DIFF_ALGORITHM_MASK; > + *opt |= value; > + > + return 0; > +} > + > static int is_a_rev(const char *name) > { > struct object_id oid; > @@ -908,13 +941,16 @@ int cmd_blame(int argc, > OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME), > OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER), > OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN), > - OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN), > + OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN), WHY? > OPT_BIT('c', NULL, &output_option, N_("use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT), > OPT_BIT('t', NULL, &output_option, N_("show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP), > OPT_BIT('l', NULL, &output_option, N_("show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME), > OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), > OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), > OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), > + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), > + N_("choose a diff algorithm"), > + PARSE_OPT_NONEG, blame_diff_algorithm_callback), OK. > OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), > OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), > OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), > > base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable 2025-10-20 16:05 ` Junio C Hamano @ 2025-10-22 9:37 ` Antonin Delpeuch 2025-10-22 20:39 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Antonin Delpeuch @ 2025-10-22 9:37 UTC (permalink / raw) To: Junio C Hamano, Antonin Delpeuch via GitGitGadget; +Cc: git, Elijah Newren On 20/10/2025 18:05, Junio C Hamano wrote: >> If the general idea of this patch is judged worthwhile, I would be happy >> to add tests to demonstrate the impact of the diff algorithm on blame >> output. > Do not ever say this here. > > I've seen from time to time people ask "I am thinking of doing this; > will a patch be accepted? If so, I'll work on it." before showing > any work, and my response always has been: > > (1) We don't know how useful and interesting your contribution would > be for our audience, until we see it; and > > (2) If you truly believe in your work (find it useful, find writing > it fun, etc.), that would be incentive enough for you to work > on it, whether or not the result will land in my tree. You > should instead aim for something so brilliant that we would > come to you begging for your permission to include it in our > project. I am surprised by your reaction here, both by its substance and form. My understanding is that gathering feedback on a proposal before carrying out the implementation work in its entirety is widely accepted as a good practice for contributions to open source projects. For instance, the following guide encourages to do so (in GitHub terms, by proposing an improvement as an issue first, and by opening a draft pull request if necessary): https://opensource.guide/how-to-contribute/ While this is phrased in the context of GitHub, I think the general principle behind it is healthy. In projects I maintain, I feel bad for contributors who submit contributions that clearly required a significant effort, but that I can't accept for certain reasons. I wish they had got in touch ahead of investing all this work, because I care about their time. In fact, I already did so for an earlier contribution to this very project, and on that occasion you did not seem to take offense at the fact that my proposal was done without an accompanying patch: https://lore.kernel.org/git/8bb5e41e-4db9-4527-8492-3aca6a0f40bf@delpeuch.eu/ Has your position changed since? Or did I benefit from more of your kindness back then as a new contributor? Your argument about my work being worthy on its own even if it's not integrated to your tree is an interesting one, but let me expand on my motivation for this patch. This change is not something I personally need, nor something that is particularly fun to write. I am working on this with the hope that it will eventually make it possible to switch the default to the histogram algorithm, for the benefit of many git users. I see no point for this patch if it is not integrated in your tree. It is a gift to you and to the git community: if the gift is to be declined, I'd rather not spend time crafting it. Concerning the form, I feel obliged to let you know that from my cultural standpoint, your reply reads rather aggressive. Specifically, the sentence "Do not ever say this here." reads menacing to me, and the use of all caps later on in your reply reads aggressive to me. I'm doing my best to assume that it is not be the attitude you wanted to convey and hope that you can receive this feedback gratefully. >> + *opt &= ~XDF_NEED_MINIMAL; >> + *opt &= ~XDF_DIFF_ALGORITHM_MASK; >> + *opt |= value; >> + >> + return 0; >> +} >> + >> static int is_a_rev(const char *name) >> { >> struct object_id oid; >> @@ -908,13 +941,16 @@ int cmd_blame(int argc, >> OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME), >> OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER), >> OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN), >> - OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN), >> + OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN), > WHY? In an attempt to conform to the coding style of this project, I ran `make style`, which generated this change. Given that it is on a line I hadn't touched, I pondered on whether to include it in my patch or not. In the past, I had submitted a patch which fixed a formatting issue in the documentation in passing, together with content changes further down, and you had been supportive of this: https://lore.kernel.org/git/xmqq1qaeqtw7.fsf@gitster.g/ So I decided to include it this time as well. I am happy to remove it if you prefer not to have it. I thank you for the rest of your comments and will take them into account for a new version of this patch, pending the discussion above. Best wishes, Antonin ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable 2025-10-22 9:37 ` Antonin Delpeuch @ 2025-10-22 20:39 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-10-22 20:39 UTC (permalink / raw) To: Antonin Delpeuch; +Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren Antonin Delpeuch <antonin@delpeuch.eu> writes: > On 20/10/2025 18:05, Junio C Hamano wrote: > >>> If the general idea of this patch is judged worthwhile, I would be happy >>> to add tests to demonstrate the impact of the diff algorithm on blame >>> output. >> Do not ever say this here. >> >> I've seen from time to time people ask "I am thinking of doing this; >> will a patch be accepted? If so, I'll work on it." before showing >> any work, and my response always has been: >> >> (1) We don't know how useful and interesting your contribution would >> be for our audience, until we see it; and >> >> (2) If you truly believe in your work (find it useful, find writing >> it fun, etc.), that would be incentive enough for you to work >> on it, whether or not the result will land in my tree. You >> should instead aim for something so brilliant that we would >> come to you begging for your permission to include it in our >> project. > > I am surprised by your reaction here, both by its substance and form. Yeah, after sending it out, I realized that the canned response above was not fitting to this exact instance. I overreacted primarily because what I saw everything before that part was indication of a great new contributor, which made my dissapointment to see the dreaded "I will do this if this is accepted" even worse. Your "this one lacks tests" is a bit different from what we sometimes see on this list that I react with the above canned response, which is "I want to do this great thing. If you promise you will accept this change, I'll work on it" without showing any detailed design or code. It is more like "I know we need test but I have shown the main part of the change. Am I going in the right direction?" You certainly didn't deserve the above response. Sorry about that. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] blame: make diff algorithm configurable 2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-10-20 16:05 ` Junio C Hamano @ 2025-10-23 16:03 ` Phillip Wood 2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget 2 siblings, 0 replies; 32+ messages in thread From: Phillip Wood @ 2025-10-23 16:03 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch Hi Antonin On 20/10/2025 15:56, Antonin Delpeuch via GitGitGadget wrote: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The diff algorithm used in 'git-blame(1)' can be configured using the > `--diff-algorithm` option or the `diff.algorithm` config variable. > Myers diff remains the default. I think this sounds like a reasonable thing to do, although it is technically a breaking change. > diff --git a/builtin/blame.c b/builtin/blame.c > index 2703820258..177b606e81 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, > } > } > > + if (!strcmp(var, "diff.algorithm")) { > + long diff_algorithm; > + if (!value) > + return config_error_nonbool(var); > + diff_algorithm = parse_algorithm_value(value); > + if (diff_algorithm < 0) > + return error(_("unknown value for config '%s': %s"), > + var, value); > + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; I think this should be xdl_opts &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL); as you have below for the option parsing. > + xdl_opts |= diff_algorithm; > + return 0; > + } > + > if (git_diff_heuristic_config(var, value, cb) < 0) > return -1; > if (userdiff_config(var, value) < 0) > @@ -824,6 +837,26 @@ static int blame_move_callback(const struct option *option, const char *arg, int > return 0; > } > > +static int blame_diff_algorithm_callback(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + long value = parse_algorithm_value(arg); > + > + BUG_ON_OPT_NEG(unset); > + > + if (value < 0) > + return error(_("option diff-algorithm accepts \"myers\", " > + "\"minimal\", \"patience\" and \"histogram\"")); > + > + // ignore any previous --minimal setting, following git-diff's behavior Style - oneline comments should look like /* comment */ > + *opt &= ~XDF_NEED_MINIMAL; > + *opt &= ~XDF_DIFF_ALGORITHM_MASK; > + *opt |= value; "git blame" also has a "--minimal" option which now needs to clear the diff algorithm when it sets the minimal flag. Thanks Phillip ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v2] blame: make diff algorithm configurable 2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-10-20 16:05 ` Junio C Hamano 2025-10-23 16:03 ` Phillip Wood @ 2025-10-28 13:37 ` Antonin Delpeuch via GitGitGadget 2025-10-28 15:22 ` Junio C Hamano 2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget 2 siblings, 2 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-10-28 13:37 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- blame: make diff algorithm configurable Changes since v1: * add tests * ignore --diff-algorithm when it is provided before --minimal * improve patch description * remove duplication of documentation sections * style improvements Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v2 Pull-Request: https://github.com/git/git/pull/2075 Range-diff vs v1: 1: 32b59d0204 ! 1: 107f51620b blame: make diff algorithm configurable @@ Metadata ## Commit message ## blame: make diff algorithm configurable - The diff algorithm used in 'git-blame(1)' can be configured using the - `--diff-algorithm` option or the `diff.algorithm` config variable. - Myers diff remains the default. + The diff algorithm used in 'git-blame(1)' is set to 'myers', + without the possibility to change it aside from the `--minimal` option. + + There has been long-standing interest in changing the default diff + algorithm to "histogram", and Git 3.0 was floated as a possible occasion + for taking some steps towards that: + + https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ + + As a preparation for this move, it is worth making sure that the diff + algorithm is configurable where useful. + + Make it configurable in the `git-blame(1)` command by introducing the + `--diff-algorithm` option and make honor the `diff.algorithm` config + variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> - ## Documentation/git-blame.adoc ## -@@ Documentation/git-blame.adoc: include::blame-options.adoc[] - Ignore whitespace when comparing the parent's version and - the child's to find where the lines came from. - + ## Documentation/diff-algorithm-option.adoc (new) ## +@@ +`--diff-algorithm=(patience|minimal|histogram|myers)`:: + Choose a diff algorithm. The variants are as follows: ++ @@ Documentation/git-blame.adoc: include::blame-options.adoc[] +For instance, if you configured the `diff.algorithm` variable to a +non-default value and want to use the default one, then you +have to use `--diff-algorithm=default` option. + + ## Documentation/diff-options.adoc ## +@@ Documentation/diff-options.adoc: and starts with _<text>_, this algorithm attempts to prevent it from + appearing as a deletion or addition in the output. It uses the "patience + diff" algorithm internally. + +-`--diff-algorithm=(patience|minimal|histogram|myers)`:: +- Choose a diff algorithm. The variants are as follows: +-+ +--- +- `default`;; +- `myers`;; +- The basic greedy diff algorithm. Currently, this is the default. +- `minimal`;; +- Spend extra time to make sure the smallest possible diff is +- produced. +- `patience`;; +- Use "patience diff" algorithm when generating patches. +- `histogram`;; +- This algorithm extends the patience algorithm to "support +- low-occurrence common elements". +--- +-+ +-For instance, if you configured the `diff.algorithm` variable to a +-non-default value and want to use the default one, then you +-have to use `--diff-algorithm=default` option. ++include::diff-algorithm-option.adoc[] + + `--stat[=<width>[,<name-width>[,<count>]]]`:: + Generate a diffstat. By default, as much space as necessary + + ## Documentation/git-blame.adoc ## +@@ Documentation/git-blame.adoc: include::blame-options.adoc[] + Ignore whitespace when comparing the parent's version and + the child's to find where the lines came from. + ++include::diff-algorithm-option.adoc[] + --abbrev=<n>:: Instead of using the default 7+1 hexadecimal digits as the @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con return 0; } ++static int blame_diff_algorithm_minimal(const struct option *option, ++ const char *arg, int unset) ++{ ++ int *opt = option->value; ++ ++ BUG_ON_OPT_NEG(unset); ++ BUG_ON_OPT_ARG(arg); ++ ++ *opt &= ~XDF_DIFF_ALGORITHM_MASK; ++ *opt |= XDF_NEED_MINIMAL; ++ ++ return 0; ++} ++ +static int blame_diff_algorithm_callback(const struct option *option, + const char *arg, int unset) +{ @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + -+ // ignore any previous --minimal setting, following git-diff's behavior -+ *opt &= ~XDF_NEED_MINIMAL; -+ *opt &= ~XDF_DIFF_ALGORITHM_MASK; ++ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); + *opt |= value; + + return 0; @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con { struct object_id oid; @@ builtin/blame.c: int cmd_blame(int argc, - OPT_BIT('f', "show-name", &output_option, N_("show original filename (Default: auto)"), OUTPUT_SHOW_NAME), - OPT_BIT('n', "show-number", &output_option, N_("show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER), - OPT_BIT('p', "porcelain", &output_option, N_("show in a format designed for machine consumption"), OUTPUT_PORCELAIN), -- OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN|OUTPUT_LINE_PORCELAIN), -+ OPT_BIT(0, "line-porcelain", &output_option, N_("show porcelain format with per-line commit information"), OUTPUT_PORCELAIN | OUTPUT_LINE_PORCELAIN), - OPT_BIT('c', NULL, &output_option, N_("use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT), - OPT_BIT('t', NULL, &output_option, N_("show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP), - OPT_BIT('l', NULL, &output_option, N_("show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME), OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), @@ builtin/blame.c: int cmd_blame(int argc, OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), + OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), ++ OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, ++ N_("spend extra cycles to find better match"), ++ PARSE_OPT_NONEG | PARSE_OPT_NOARG, ++ blame_diff_algorithm_minimal), + OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), + OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), + + ## t/meson.build ## +@@ t/meson.build: integration_tests = [ + 't8012-blame-colors.sh', + 't8013-blame-ignore-revs.sh', + 't8014-blame-ignore-fuzzy.sh', ++ 't8015-blame-diff-algorithm.sh', + 't8020-last-modified.sh', + 't9001-send-email.sh', + 't9002-column.sh', + + ## t/t8015-blame-diff-algorithm.sh (new) ## +@@ ++#!/bin/sh ++ ++test_description='git blame with specific diff algorithm' ++ ++. ./test-lib.sh ++ ++test_expect_success setup ' ++ cat >file.c <<-\EOF && ++ int f(int x, int y) ++ { ++ if (x == 0) ++ { ++ return y; ++ } ++ return x; ++ } ++ ++ int g(size_t u) ++ { ++ while (u < 30) ++ { ++ u++; ++ } ++ return u; ++ } ++ EOF ++ test_write_lines x x x x >file.txt && ++ git add file.c file.txt && ++ GIT_AUTHOR_NAME=Initial git commit -m Initial && ++ ++ cat >file.c <<-\EOF && ++ int g(size_t u) ++ { ++ while (u < 30) ++ { ++ u++; ++ } ++ return u; ++ } ++ ++ int h(int x, int y, int z) ++ { ++ if (z == 0) ++ { ++ return x; ++ } ++ return y; ++ } ++ EOF ++ test_write_lines x x x A B C D x E F G >file.txt && ++ git add file.c file.txt && ++ GIT_AUTHOR_NAME=Second git commit -m Second ++' ++ ++test_expect_success 'blame uses Myers diff algorithm by default for now' ' ++ cat >expected <<-\EOF && ++ Second ++ Initial ++ Second ++ Initial ++ Second ++ Initial ++ Second ++ Initial ++ Initial ++ Second ++ Initial ++ Second ++ Initial ++ Second ++ Initial ++ Second ++ Initial ++ EOF ++ ++ # git blame file.c | grep --only-matching -e Initial -e Second > actual && ++ # test_cmp expected actual ++ echo goo ++' ++ ++test_expect_success 'blame honors --diff-algorithm option' ' ++ cat >expected <<-\EOF && ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ EOF ++ ++ git blame file.c --diff-algorithm=histogram | \ ++ grep --only-matching -e Initial -e Second > actual && ++ test_cmp expected actual ++' ++ ++test_expect_success 'blame honors diff.algorithm config variable' ' ++ cat >expected <<-\EOF && ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Initial ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ Second ++ EOF ++ ++ git config diff.algorithm histogram && ++ git blame file.c | \ ++ grep --only-matching -e Initial -e Second > actual && ++ test_cmp expected actual ++' ++ ++test_expect_success 'blame honors --minimal option' ' ++ cat >expected <<-\EOF && ++ Initial ++ Initial ++ Initial ++ Second ++ Second ++ Second ++ Second ++ Initial ++ Second ++ Second ++ Second ++ EOF ++ ++ git blame file.txt --minimal | \ ++ grep --only-matching -e Initial -e Second > actual && ++ test_cmp expected actual ++' ++ ++test_done Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +--- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 ++++++++ t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 154 +++++++++++++++++++++++ 6 files changed, 230 insertions(+), 20 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc new file mode 100644 index 0000000000..8e3a0b63d7 --- /dev/null +++ b/Documentation/diff-algorithm-option.adoc @@ -0,0 +1,20 @@ +`--diff-algorithm=(patience|minimal|histogram|myers)`:: + Choose a diff algorithm. The variants are as follows: ++ +-- + `default`;; + `myers`;; + The basic greedy diff algorithm. Currently, this is the default. + `minimal`;; + Spend extra time to make sure the smallest possible diff is + produced. + `patience`;; + Use "patience diff" algorithm when generating patches. + `histogram`;; + This algorithm extends the patience algorithm to "support + low-occurrence common elements". +-- ++ +For instance, if you configured the `diff.algorithm` variable to a +non-default value and want to use the default one, then you +have to use `--diff-algorithm=default` option. diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc index ae31520f7f..9cdad6f72a 100644 --- a/Documentation/diff-options.adoc +++ b/Documentation/diff-options.adoc @@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally. -`--diff-algorithm=(patience|minimal|histogram|myers)`:: - Choose a diff algorithm. The variants are as follows: -+ --- - `default`;; - `myers`;; - The basic greedy diff algorithm. Currently, this is the default. - `minimal`;; - Spend extra time to make sure the smallest possible diff is - produced. - `patience`;; - Use "patience diff" algorithm when generating patches. - `histogram`;; - This algorithm extends the patience algorithm to "support - low-occurrence common elements". --- -+ -For instance, if you configured the `diff.algorithm` variable to a -non-default value and want to use the default one, then you -have to use `--diff-algorithm=default` option. +include::diff-algorithm-option.adoc[] `--stat[=<width>[,<name-width>[,<count>]]]`:: Generate a diffstat. By default, as much space as necessary diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc index e438d28625..adcbb6f5dc 100644 --- a/Documentation/git-blame.adoc +++ b/Documentation/git-blame.adoc @@ -85,6 +85,8 @@ include::blame-options.adoc[] Ignore whitespace when comparing the parent's version and the child's to find where the lines came from. +include::diff-algorithm-option.adoc[] + --abbrev=<n>:: Instead of using the default 7+1 hexadecimal digits as the abbreviated object name, use <m>+1 digits, where <m> is at diff --git a/builtin/blame.c b/builtin/blame.c index 2703820258..eb0ab71dba 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, } } + if (!strcmp(var, "diff.algorithm")) { + long diff_algorithm; + if (!value) + return config_error_nonbool(var); + diff_algorithm = parse_algorithm_value(value); + if (diff_algorithm < 0) + return error(_("unknown value for config '%s': %s"), + var, value); + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; + xdl_opts |= diff_algorithm; + return 0; + } + if (git_diff_heuristic_config(var, value, cb) < 0) return -1; if (userdiff_config(var, value) < 0) @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int return 0; } +static int blame_diff_algorithm_minimal(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + + BUG_ON_OPT_NEG(unset); + BUG_ON_OPT_ARG(arg); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + *opt |= XDF_NEED_MINIMAL; + + return 0; +} + +static int blame_diff_algorithm_callback(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + long value = parse_algorithm_value(arg); + + BUG_ON_OPT_NEG(unset); + + if (value < 0) + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); + *opt |= value; + + return 0; +} + static int is_a_rev(const char *name) { struct object_id oid; @@ -915,10 +960,17 @@ int cmd_blame(int argc, OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), + N_("choose a diff algorithm"), + PARSE_OPT_NONEG, blame_diff_algorithm_callback), OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find better match"), + PARSE_OPT_NONEG | PARSE_OPT_NOARG, + blame_diff_algorithm_minimal), OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), diff --git a/t/meson.build b/t/meson.build index 401b24e50e..9f2fe7af8b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -955,6 +955,7 @@ integration_tests = [ 't8012-blame-colors.sh', 't8013-blame-ignore-revs.sh', 't8014-blame-ignore-fuzzy.sh', + 't8015-blame-diff-algorithm.sh', 't8020-last-modified.sh', 't9001-send-email.sh', 't9002-column.sh', diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh new file mode 100755 index 0000000000..43996df177 --- /dev/null +++ b/t/t8015-blame-diff-algorithm.sh @@ -0,0 +1,154 @@ +#!/bin/sh + +test_description='git blame with specific diff algorithm' + +. ./test-lib.sh + +test_expect_success setup ' + cat >file.c <<-\EOF && + int f(int x, int y) + { + if (x == 0) + { + return y; + } + return x; + } + + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + EOF + test_write_lines x x x x >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Initial git commit -m Initial && + + cat >file.c <<-\EOF && + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + + int h(int x, int y, int z) + { + if (z == 0) + { + return x; + } + return y; + } + EOF + test_write_lines x x x A B C D x E F G >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Second git commit -m Second +' + +test_expect_success 'blame uses Myers diff algorithm by default for now' ' + cat >expected <<-\EOF && + Second + Initial + Second + Initial + Second + Initial + Second + Initial + Initial + Second + Initial + Second + Initial + Second + Initial + Second + Initial + EOF + + # git blame file.c | grep --only-matching -e Initial -e Second > actual && + # test_cmp expected actual + echo goo +' + +test_expect_success 'blame honors --diff-algorithm option' ' + cat >expected <<-\EOF && + Initial + Initial + Initial + Initial + Initial + Initial + Initial + Initial + Second + Second + Second + Second + Second + Second + Second + Second + Second + EOF + + git blame file.c --diff-algorithm=histogram | \ + grep --only-matching -e Initial -e Second > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors diff.algorithm config variable' ' + cat >expected <<-\EOF && + Initial + Initial + Initial + Initial + Initial + Initial + Initial + Initial + Second + Second + Second + Second + Second + Second + Second + Second + Second + EOF + + git config diff.algorithm histogram && + git blame file.c | \ + grep --only-matching -e Initial -e Second > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && + Initial + Initial + Initial + Second + Second + Second + Second + Initial + Second + Second + Second + EOF + + git blame file.txt --minimal | \ + grep --only-matching -e Initial -e Second > actual && + test_cmp expected actual +' + +test_done base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v2] blame: make diff algorithm configurable 2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget @ 2025-10-28 15:22 ` Junio C Hamano 2025-10-28 16:00 ` Antonin Delpeuch 2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget 1 sibling, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2025-10-28 15:22 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch "Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The diff algorithm used in 'git-blame(1)' is set to 'myers', > without the possibility to change it aside from the `--minimal` option. Hmph. It is very unfortunate that we had --minimal already. We should have done --diff-algorithm=<which> instead, but that is way too late. > There has been long-standing interest in changing the default diff > algorithm to "histogram", and Git 3.0 was floated as a possible occasion > for taking some steps towards that: > > https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ Micronit. I think the reference to 3.0 only about potentially breaking backward compatibility by making the family of diff-* plumbing commands ignore diff.algorithm configuration, and other usability changes like this one are fair game without having to wait for 3.0 boundary (the plumbing commands do ignore the configuration already, so there is nothing we have to wait 3.0 before doing). > Changes since v1: > > * add tests > * ignore --diff-algorithm when it is provided before --minimal Sensible. I presume the reverse is true, i.e. giving "--minimal" and then "--diff-algorithm=histogram" in this order would make "histogram" survive, in other words, the usual "last one wins" rule is applied? > +static int blame_diff_algorithm_minimal(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + > + BUG_ON_OPT_NEG(unset); > + BUG_ON_OPT_ARG(arg); > + > + *opt &= ~XDF_DIFF_ALGORITHM_MASK; > + *opt |= XDF_NEED_MINIMAL; > + > + return 0; > +} This and diff.c:diff_opt_diff_algorithm_no_arg(), which I think is the original from which this was copied from, look somewhat different, but this can afford to be simpler, as it does not have to parse "--histogram", "--patience", etc., as independent command line options. OK. > +static int blame_diff_algorithm_callback(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + long value = parse_algorithm_value(arg); > + > + BUG_ON_OPT_NEG(unset); > + > + if (value < 0) > + return error(_("option diff-algorithm accepts \"myers\", " > + "\"minimal\", \"patience\" and \"histogram\"")); > + > + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); > + *opt |= value; > + > + return 0; > +} Quite straight-forward and sensible. > static int is_a_rev(const char *name) > { > struct object_id oid; > @@ -915,10 +960,17 @@ int cmd_blame(int argc, > OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), > OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), > OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), > + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), > + N_("choose a diff algorithm"), > + PARSE_OPT_NONEG, blame_diff_algorithm_callback), > OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), > OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), > OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), > OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), > + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, > + N_("spend extra cycles to find better match"), > + PARSE_OPT_NONEG | PARSE_OPT_NOARG, > + blame_diff_algorithm_minimal), > OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), This OPT_BIT() can stay here? I thought parse_options_check() was capable of detecting duplicated long-form commands as programming error, but apparently it does not. (#leftoverbits) We should look into teaching parse_options_check() to check duplicated option names. > OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), > OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), > ... > +test_expect_success 'blame honors --minimal option' ' > + cat >expected <<-\EOF && > + Initial > + Initial > + Initial > + Second > + Second > + Second > + Second > + Initial > + Second > + Second > + Second > + EOF > + > + git blame file.txt --minimal | \ > + grep --only-matching -e Initial -e Second > actual && > + test_cmp expected actual > +' Do we need to test combination of configuration variables and command line options (to verify that options trump configuration), or two command line options (to verify that the last one wins)? When xdiff/ part of the system gets improved, the above expected patterns may have to change, these tests may fail. Whoever updates the diff algorithm to cause such a failure has to tell between a genuine _bug_ in their update to diff implementation and the test expecting a suboptimal result based on the behaviour of the diff algorithm before their improvement. And for that, they need to debug these tests. But I suspect that these tests will probably be very difficult to debug, as it is almost impossible to see which line in the original each of these lines correspond to. I guess that's inevitable, and we'll cross that bridge when it becomes necessary. Thanks, will queue, but I do find the leftover --minimal bit disturbing. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v2] blame: make diff algorithm configurable 2025-10-28 15:22 ` Junio C Hamano @ 2025-10-28 16:00 ` Antonin Delpeuch 0 siblings, 0 replies; 32+ messages in thread From: Antonin Delpeuch @ 2025-10-28 16:00 UTC (permalink / raw) To: Junio C Hamano, Antonin Delpeuch via GitGitGadget Cc: git, Elijah Newren, Phillip Wood Hi Junio, On 28/10/2025 16:22, Junio C Hamano wrote: >> Changes since v1: >> >> * add tests >> * ignore --diff-algorithm when it is provided before --minimal > Sensible. > > I presume the reverse is true, i.e. giving "--minimal" and then > "--diff-algorithm=histogram" in this order would make "histogram" > survive, in other words, the usual "last one wins" rule is applied? Yes, that was already the case in the version 1 of this patch. Philipp noticed that it was only true in one direction, so this fixes it. >> static int is_a_rev(const char *name) >> { >> struct object_id oid; >> @@ -915,10 +960,17 @@ int cmd_blame(int argc, >> OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), >> OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), >> OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), >> + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), >> + N_("choose a diff algorithm"), >> + PARSE_OPT_NONEG, blame_diff_algorithm_callback), >> OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), >> OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), >> OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), >> OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), >> + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, >> + N_("spend extra cycles to find better match"), >> + PARSE_OPT_NONEG | PARSE_OPT_NOARG, >> + blame_diff_algorithm_minimal), >> OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), > This OPT_BIT() can stay here? I thought parse_options_check() was > capable of detecting duplicated long-form commands as programming > error, but apparently it does not. (#leftoverbits) We should look > into teaching parse_options_check() to check duplicated option > names. Oops, that's an oversight on my part indeed. I'll fix it. > >> OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), >> OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), >> ... >> +test_expect_success 'blame honors --minimal option' ' >> + cat >expected <<-\EOF && >> + Initial >> + Initial >> + Initial >> + Second >> + Second >> + Second >> + Second >> + Initial >> + Second >> + Second >> + Second >> + EOF >> + >> + git blame file.txt --minimal | \ >> + grep --only-matching -e Initial -e Second > actual && >> + test_cmp expected actual >> +' > Do we need to test combination of configuration variables and > command line options (to verify that options trump configuration), > or two command line options (to verify that the last one wins)? It's easy enough to add such tests, I can add a few more. > > When xdiff/ part of the system gets improved, the above expected > patterns may have to change, these tests may fail. Whoever updates > the diff algorithm to cause such a failure has to tell between a > genuine _bug_ in their update to diff implementation and the test > expecting a suboptimal result based on the behaviour of the diff > algorithm before their improvement. And for that, they need to > debug these tests. But I suspect that these tests will probably be > very difficult to debug, as it is almost impossible to see which > line in the original each of these lines correspond to. I'm happy to make the tests a bit clearer by including the lines in the expected output. I'll submit a new version with those changes. Best, Antonin ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v3] blame: make diff algorithm configurable 2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget 2025-10-28 15:22 ` Junio C Hamano @ 2025-10-28 21:14 ` Antonin Delpeuch via GitGitGadget 2025-10-29 10:16 ` Phillip Wood 2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget 1 sibling, 2 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-10-28 21:14 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- blame: make diff algorithm configurable Changes since v1: * add tests * ignore --diff-algorithm when it is provided before --minimal * improve patch description * remove duplication of documentation sections * style improvements Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v3 Pull-Request: https://github.com/git/git/pull/2075 Range-diff vs v2: 1: 107f51620b ! 1: b8bdb03516 blame: make diff algorithm configurable @@ builtin/blame.c: int cmd_blame(int argc, OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), +- OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find better match"), + PARSE_OPT_NONEG | PARSE_OPT_NOARG, + blame_diff_algorithm_minimal), - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), + OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), ## t/meson.build ## @@ t/meson.build: integration_tests = [ @@ t/t8015-blame-diff-algorithm.sh (new) + cat >file.c <<-\EOF && + int f(int x, int y) + { -+ if (x == 0) -+ { -+ return y; -+ } -+ return x; ++ if (x == 0) ++ { ++ return y; ++ } ++ return x; + } + + int g(size_t u) + { -+ while (u < 30) -+ { -+ u++; -+ } -+ return u; ++ while (u < 30) ++ { ++ u++; ++ } ++ return u; + } + EOF + test_write_lines x x x x >file.txt && + git add file.c file.txt && -+ GIT_AUTHOR_NAME=Initial git commit -m Initial && ++ GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 && + + cat >file.c <<-\EOF && + int g(size_t u) + { -+ while (u < 30) -+ { -+ u++; -+ } -+ return u; ++ while (u < 30) ++ { ++ u++; ++ } ++ return u; + } + + int h(int x, int y, int z) + { -+ if (z == 0) -+ { -+ return x; -+ } -+ return y; ++ if (z == 0) ++ { ++ return x; ++ } ++ return y; + } + EOF + test_write_lines x x x A B C D x E F G >file.txt && + git add file.c file.txt && -+ GIT_AUTHOR_NAME=Second git commit -m Second ++ GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2 +' + +test_expect_success 'blame uses Myers diff algorithm by default for now' ' + cat >expected <<-\EOF && -+ Second -+ Initial -+ Second -+ Initial -+ Second -+ Initial -+ Second -+ Initial -+ Initial -+ Second -+ Initial -+ Second -+ Initial -+ Second -+ Initial -+ Second -+ Initial ++ Commit_2 int g(size_t u) ++ Commit_1 { ++ Commit_2 while (u < 30) ++ Commit_1 { ++ Commit_2 u++; ++ Commit_1 } ++ Commit_2 return u; ++ Commit_1 } ++ Commit_1 ++ Commit_2 int h(int x, int y, int z) ++ Commit_1 { ++ Commit_2 if (z == 0) ++ Commit_1 { ++ Commit_2 return x; ++ Commit_1 } ++ Commit_2 return y; ++ Commit_1 } + EOF + -+ # git blame file.c | grep --only-matching -e Initial -e Second > actual && -+ # test_cmp expected actual -+ echo goo ++ ++ git blame file.c | \ ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ ++ sed -e "s/ *$//g" > actual && ++ test_cmp expected actual +' + +test_expect_success 'blame honors --diff-algorithm option' ' + cat >expected <<-\EOF && -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second ++ Commit_1 int g(size_t u) ++ Commit_1 { ++ Commit_1 while (u < 30) ++ Commit_1 { ++ Commit_1 u++; ++ Commit_1 } ++ Commit_1 return u; ++ Commit_1 } ++ Commit_2 ++ Commit_2 int h(int x, int y, int z) ++ Commit_2 { ++ Commit_2 if (z == 0) ++ Commit_2 { ++ Commit_2 return x; ++ Commit_2 } ++ Commit_2 return y; ++ Commit_2 } + EOF + -+ git blame file.c --diff-algorithm=histogram | \ -+ grep --only-matching -e Initial -e Second > actual && ++ git blame file.c --diff-algorithm histogram | \ ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ ++ sed -e "s/ *$//g" > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors diff.algorithm config variable' ' + cat >expected <<-\EOF && -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Initial -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second -+ Second ++ Commit_1 int g(size_t u) ++ Commit_1 { ++ Commit_1 while (u < 30) ++ Commit_1 { ++ Commit_1 u++; ++ Commit_1 } ++ Commit_1 return u; ++ Commit_1 } ++ Commit_2 ++ Commit_2 int h(int x, int y, int z) ++ Commit_2 { ++ Commit_2 if (z == 0) ++ Commit_2 { ++ Commit_2 return x; ++ Commit_2 } ++ Commit_2 return y; ++ Commit_2 } + EOF + + git config diff.algorithm histogram && + git blame file.c | \ -+ grep --only-matching -e Initial -e Second > actual && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ ++ sed -e "s/ *$//g" > actual && + test_cmp expected actual +' + ++test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' ++ cat >expected <<-\EOF && ++ Commit_1 int g(size_t u) ++ Commit_1 { ++ Commit_1 while (u < 30) ++ Commit_1 { ++ Commit_1 u++; ++ Commit_1 } ++ Commit_1 return u; ++ Commit_1 } ++ Commit_2 ++ Commit_2 int h(int x, int y, int z) ++ Commit_2 { ++ Commit_2 if (z == 0) ++ Commit_2 { ++ Commit_2 return x; ++ Commit_2 } ++ Commit_2 return y; ++ Commit_2 } ++ EOF ++ ++ git config diff.algorithm myers && ++ git blame file.c --diff-algorithm histogram | \ ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ ++ sed -e "s/ *$//g" > actual && ++ test_cmp expected actual ++' +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && -+ Initial -+ Initial -+ Initial -+ Second -+ Second -+ Second -+ Second -+ Initial -+ Second -+ Second -+ Second ++ Commit_1 x ++ Commit_1 x ++ Commit_1 x ++ Commit_2 A ++ Commit_2 B ++ Commit_2 C ++ Commit_2 D ++ Commit_1 x ++ Commit_2 E ++ Commit_2 F ++ Commit_2 G + EOF + + git blame file.txt --minimal | \ -+ grep --only-matching -e Initial -e Second > actual && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual && + test_cmp expected actual +' + ++test_expect_success 'blame respects the order of diff options' ' ++ cat >expected <<-\EOF && ++ Commit_1 x ++ Commit_1 x ++ Commit_1 x ++ Commit_2 A ++ Commit_2 B ++ Commit_2 C ++ Commit_2 D ++ Commit_2 x ++ Commit_2 E ++ Commit_2 F ++ Commit_2 G ++ EOF ++ ++ git blame file.txt --minimal --diff-algorithm myers | \ ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual && ++ test_cmp expected actual ++' ++ ++ +test_done Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 53 +++++- t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 206 +++++++++++++++++++++++ 6 files changed, 282 insertions(+), 21 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc new file mode 100644 index 0000000000..8e3a0b63d7 --- /dev/null +++ b/Documentation/diff-algorithm-option.adoc @@ -0,0 +1,20 @@ +`--diff-algorithm=(patience|minimal|histogram|myers)`:: + Choose a diff algorithm. The variants are as follows: ++ +-- + `default`;; + `myers`;; + The basic greedy diff algorithm. Currently, this is the default. + `minimal`;; + Spend extra time to make sure the smallest possible diff is + produced. + `patience`;; + Use "patience diff" algorithm when generating patches. + `histogram`;; + This algorithm extends the patience algorithm to "support + low-occurrence common elements". +-- ++ +For instance, if you configured the `diff.algorithm` variable to a +non-default value and want to use the default one, then you +have to use `--diff-algorithm=default` option. diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc index ae31520f7f..9cdad6f72a 100644 --- a/Documentation/diff-options.adoc +++ b/Documentation/diff-options.adoc @@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally. -`--diff-algorithm=(patience|minimal|histogram|myers)`:: - Choose a diff algorithm. The variants are as follows: -+ --- - `default`;; - `myers`;; - The basic greedy diff algorithm. Currently, this is the default. - `minimal`;; - Spend extra time to make sure the smallest possible diff is - produced. - `patience`;; - Use "patience diff" algorithm when generating patches. - `histogram`;; - This algorithm extends the patience algorithm to "support - low-occurrence common elements". --- -+ -For instance, if you configured the `diff.algorithm` variable to a -non-default value and want to use the default one, then you -have to use `--diff-algorithm=default` option. +include::diff-algorithm-option.adoc[] `--stat[=<width>[,<name-width>[,<count>]]]`:: Generate a diffstat. By default, as much space as necessary diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc index e438d28625..adcbb6f5dc 100644 --- a/Documentation/git-blame.adoc +++ b/Documentation/git-blame.adoc @@ -85,6 +85,8 @@ include::blame-options.adoc[] Ignore whitespace when comparing the parent's version and the child's to find where the lines came from. +include::diff-algorithm-option.adoc[] + --abbrev=<n>:: Instead of using the default 7+1 hexadecimal digits as the abbreviated object name, use <m>+1 digits, where <m> is at diff --git a/builtin/blame.c b/builtin/blame.c index 2703820258..da4dbdf50a 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, } } + if (!strcmp(var, "diff.algorithm")) { + long diff_algorithm; + if (!value) + return config_error_nonbool(var); + diff_algorithm = parse_algorithm_value(value); + if (diff_algorithm < 0) + return error(_("unknown value for config '%s': %s"), + var, value); + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; + xdl_opts |= diff_algorithm; + return 0; + } + if (git_diff_heuristic_config(var, value, cb) < 0) return -1; if (userdiff_config(var, value) < 0) @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int return 0; } +static int blame_diff_algorithm_minimal(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + + BUG_ON_OPT_NEG(unset); + BUG_ON_OPT_ARG(arg); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + *opt |= XDF_NEED_MINIMAL; + + return 0; +} + +static int blame_diff_algorithm_callback(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + long value = parse_algorithm_value(arg); + + BUG_ON_OPT_NEG(unset); + + if (value < 0) + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); + *opt |= value; + + return 0; +} + static int is_a_rev(const char *name) { struct object_id oid; @@ -915,11 +960,17 @@ int cmd_blame(int argc, OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), + N_("choose a diff algorithm"), + PARSE_OPT_NONEG, blame_diff_algorithm_callback), OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find better match"), + PARSE_OPT_NONEG | PARSE_OPT_NOARG, + blame_diff_algorithm_minimal), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), diff --git a/t/meson.build b/t/meson.build index 401b24e50e..9f2fe7af8b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -955,6 +955,7 @@ integration_tests = [ 't8012-blame-colors.sh', 't8013-blame-ignore-revs.sh', 't8014-blame-ignore-fuzzy.sh', + 't8015-blame-diff-algorithm.sh', 't8020-last-modified.sh', 't9001-send-email.sh', 't9002-column.sh', diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh new file mode 100755 index 0000000000..efc4b47ce1 --- /dev/null +++ b/t/t8015-blame-diff-algorithm.sh @@ -0,0 +1,206 @@ +#!/bin/sh + +test_description='git blame with specific diff algorithm' + +. ./test-lib.sh + +test_expect_success setup ' + cat >file.c <<-\EOF && + int f(int x, int y) + { + if (x == 0) + { + return y; + } + return x; + } + + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + EOF + test_write_lines x x x x >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 && + + cat >file.c <<-\EOF && + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + + int h(int x, int y, int z) + { + if (z == 0) + { + return x; + } + return y; + } + EOF + test_write_lines x x x A B C D x E F G >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2 +' + +test_expect_success 'blame uses Myers diff algorithm by default for now' ' + cat >expected <<-\EOF && + Commit_2 int g(size_t u) + Commit_1 { + Commit_2 while (u < 30) + Commit_1 { + Commit_2 u++; + Commit_1 } + Commit_2 return u; + Commit_1 } + Commit_1 + Commit_2 int h(int x, int y, int z) + Commit_1 { + Commit_2 if (z == 0) + Commit_1 { + Commit_2 return x; + Commit_1 } + Commit_2 return y; + Commit_1 } + EOF + + + git blame file.c | \ + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ + sed -e "s/ *$//g" > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --diff-algorithm option' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git blame file.c --diff-algorithm histogram | \ + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ + sed -e "s/ *$//g" > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors diff.algorithm config variable' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git config diff.algorithm histogram && + git blame file.c | \ + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ + sed -e "s/ *$//g" > actual && + test_cmp expected actual +' + +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git config diff.algorithm myers && + git blame file.c --diff-algorithm histogram | \ + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ + sed -e "s/ *$//g" > actual && + test_cmp expected actual +' +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_1 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal | \ + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual && + test_cmp expected actual +' + +test_expect_success 'blame respects the order of diff options' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_2 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal --diff-algorithm myers | \ + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual && + test_cmp expected actual +' + + +test_done base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable 2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget @ 2025-10-29 10:16 ` Phillip Wood 2025-10-29 18:46 ` Junio C Hamano 2025-10-30 9:22 ` Antonin Delpeuch 2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget 1 sibling, 2 replies; 32+ messages in thread From: Phillip Wood @ 2025-10-29 10:16 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch Hi Antonin On 28/10/2025 21:14, Antonin Delpeuch via GitGitGadget wrote: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The diff algorithm used in 'git-blame(1)' is set to 'myers', > without the possibility to change it aside from the `--minimal` option. > > There has been long-standing interest in changing the default diff > algorithm to "histogram", and Git 3.0 was floated as a possible occasion > for taking some steps towards that: > > https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ > > As a preparation for this move, it is worth making sure that the diff > algorithm is configurable where useful. > > Make it configurable in the `git-blame(1)` command by introducing the > `--diff-algorithm` option and make honor the `diff.algorithm` config > variable. Keep Myers diff as the default. > > Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> > --- Apart from a problem with clearing XDF_NEED_MINIMAL (which is really the fault of a terrible api) this is looking good. > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, > } > } > > + if (!strcmp(var, "diff.algorithm")) { > + long diff_algorithm; > + if (!value) > + return config_error_nonbool(var); > + diff_algorithm = parse_algorithm_value(value); > + if (diff_algorithm < 0) > + return error(_("unknown value for config '%s': %s"), > + var, value); > + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; Unfortunately XDF_DIFF_ALGORITHM_MASK does not include XDF_NEED_MINIMAL so if the user has a config file that looks like [diff] algorithm = minimal algorithm = myers We'll parse it as "minimal" rather than "myers" As we need to reset the diff algorithm in a number of places I think it would be best to define a macro #define CLEAR_DIFF_ALGORITHM(flags) \ flags &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL) and use that where we want to reset the algorithm. > + xdl_opts |= diff_algorithm; > + return 0; > + } > + > if (git_diff_heuristic_config(var, value, cb) < 0) > return -1; > if (userdiff_config(var, value) < 0) > @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int > return 0; > } > > +static int blame_diff_algorithm_minimal(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + > + BUG_ON_OPT_NEG(unset); This is a change in behavior as we currently accept "--no-minimal" which clears XDF_NEED_MINIMAL > + BUG_ON_OPT_ARG(arg); > + > + *opt &= ~XDF_DIFF_ALGORITHM_MASK; This is correct becase we're about to set XDF_NEED_MINIMAL so it does not matter that we leave it set here, but would still be clearer if it used the new macro I suggested above. > + *opt |= XDF_NEED_MINIMAL;> + return 0; > +} > + > +static int blame_diff_algorithm_callback(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + long value = parse_algorithm_value(arg); > + > + BUG_ON_OPT_NEG(unset); > + > + if (value < 0) > + return error(_("option diff-algorithm accepts \"myers\", " > + "\"minimal\", \"patience\" and \"histogram\"")); > + > + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); This is correct > + *opt |= value; > + > + return 0; > +} > + > - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), > + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, > + N_("spend extra cycles to find better match"), This is just copying the existing text so it is not a new problem but I think it would be better if we said "find a better" rather than "find better". We should prehaps think about hiding this option now that we support --diff-algorithm. > + PARSE_OPT_NONEG | PARSE_OPT_NOARG, As I said above using PARSE_OPT_NONEG here is a regression > + blame_diff_algorithm_minimal), > diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh > new file mode 100755 > index 0000000000..efc4b47ce1 > --- /dev/null > +++ b/t/t8015-blame-diff-algorithm.sh > [...] > +test_expect_success 'blame uses Myers diff algorithm by default for now' ' I'm not sure we need to say "for now" here. > + cat >expected <<-\EOF && > + Commit_2 int g(size_t u) > + Commit_1 { > + Commit_2 while (u < 30) > + Commit_1 { > + Commit_2 u++; > + Commit_1 } > + Commit_2 return u; > + Commit_1 } > + Commit_1 > + Commit_2 int h(int x, int y, int z) > + Commit_1 { > + Commit_2 if (z == 0) > + Commit_1 { > + Commit_2 return x; > + Commit_1 } > + Commit_2 return y; > + Commit_1 } > + EOF > + > + There's an extra blank line here > + git blame file.c | \ We don't pipe the output git commands as it hides unexpected failures. Instead you should redirect the output of git to a file and then process that file with sed. > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ > + sed -e "s/ *$//g" > actual && This can be a single process by passing -e twice. It does not really matter but neither pattern needs a trailing "g" as they only match once within the line. > +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' > + cat >expected <<-\EOF && > + Commit_1 int g(size_t u) > + Commit_1 { > + Commit_1 while (u < 30) > + Commit_1 { > + Commit_1 u++; > + Commit_1 } > + Commit_1 return u; > + Commit_1 } > + Commit_2 > + Commit_2 int h(int x, int y, int z) > + Commit_2 { > + Commit_2 if (z == 0) > + Commit_2 { > + Commit_2 return x; > + Commit_2 } > + Commit_2 return y; > + Commit_2 } > + EOF > + > + git config diff.algorithm myers && You can use test_config() here which will clear the config setting at the end of the test. Alternatively you can save a couple of processes by using "git -c diff.algorithm=myers blame ...". This is setting the config to the default value, I wonder if it would be better to do git -c diff.algorithm=histogram blame --diff-algorithm=myers instead. The coverage looks good Thanks Phillip ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable 2025-10-29 10:16 ` Phillip Wood @ 2025-10-29 18:46 ` Junio C Hamano 2025-10-30 9:22 ` Antonin Delpeuch 1 sibling, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-10-29 18:46 UTC (permalink / raw) To: Phillip Wood Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren, Antonin Delpeuch Phillip Wood <phillip.wood123@gmail.com> writes: >> +static int blame_diff_algorithm_minimal(const struct option *option, >> + const char *arg, int unset) >> +{ >> + int *opt = option->value; >> + >> + BUG_ON_OPT_NEG(unset); > > This is a change in behavior as we currently accept "--no-minimal" which > clears XDF_NEED_MINIMAL Ah, I missed this; thanks for a careful reading. > As I said above using PARSE_OPT_NONEG here is a regression > >> + blame_diff_algorithm_minimal), >> diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh >> new file mode 100755 >> index 0000000000..efc4b47ce1 >> --- /dev/null >> +++ b/t/t8015-blame-diff-algorithm.sh >> [...] >> +test_expect_success 'blame uses Myers diff algorithm by default for now' ' > > I'm not sure we need to say "for now" here. We shouldn't. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable 2025-10-29 10:16 ` Phillip Wood 2025-10-29 18:46 ` Junio C Hamano @ 2025-10-30 9:22 ` Antonin Delpeuch 2025-10-30 10:47 ` Phillip Wood 1 sibling, 1 reply; 32+ messages in thread From: Antonin Delpeuch @ 2025-10-30 9:22 UTC (permalink / raw) To: phillip.wood, Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren Hi Phillip, On 29/10/2025 11:16, Phillip Wood wrote: > Unfortunately XDF_DIFF_ALGORITHM_MASK does not include > XDF_NEED_MINIMAL so if the user has a config file that looks like > > [diff] > algorithm = minimal > algorithm = myers > > We'll parse it as "minimal" rather than "myers" > > As we need to reset the diff algorithm in a number of places I think > it would be best to define a macro > > #define CLEAR_DIFF_ALGORITHM(flags) \ > flags &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL) Ouch, good catch! This problem is affecting other places as well. I'm wondering if we couldn't even add XDF_NEED_MINIMAL to XDF_DIFF_ALGORITHM_MASK. I've reviewed all the places where XDF_DIFF_ALGORITHM_MASK is used, and it seems that in all cases it would either preserve the existing behaviour (potentially allowing us to remove an accompanying "DIFF_XDL_CLR(opts, NEED_MINIMAL);" macro which becomes redundant), or in some other cases it would fix a similar issue (for instance, in merge-file.c). Is your suggestion to introduce a new macro motivated by stability concerns? I'm aware that xdiff is used in other code bases as a library, so I guess changing XDF_DIFF_ALGORITHM_MASK can indeed be seen as a breaking change. Antonin ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] blame: make diff algorithm configurable 2025-10-30 9:22 ` Antonin Delpeuch @ 2025-10-30 10:47 ` Phillip Wood 0 siblings, 0 replies; 32+ messages in thread From: Phillip Wood @ 2025-10-30 10:47 UTC (permalink / raw) To: Antonin Delpeuch, phillip.wood, Antonin Delpeuch via GitGitGadget, git Cc: Elijah Newren Hi Antonin On 30/10/2025 09:22, Antonin Delpeuch wrote: > On 29/10/2025 11:16, Phillip Wood wrote: >> Unfortunately XDF_DIFF_ALGORITHM_MASK does not include >> XDF_NEED_MINIMAL so if the user has a config file that looks like >> >> [diff] >> algorithm = minimal >> algorithm = myers >> >> We'll parse it as "minimal" rather than "myers" >> >> As we need to reset the diff algorithm in a number of places I think >> it would be best to define a macro >> >> #define CLEAR_DIFF_ALGORITHM(flags) \ >> flags &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL) > > Ouch, good catch! This problem is affecting other places as well. > > I'm wondering if we couldn't even add XDF_NEED_MINIMAL to > XDF_DIFF_ALGORITHM_MASK. I've reviewed all the places where > XDF_DIFF_ALGORITHM_MASK is used, and it seems that in all cases it would > either preserve the existing behaviour (potentially allowing us to > remove an accompanying "DIFF_XDL_CLR(opts, NEED_MINIMAL);" macro which > becomes redundant), or in some other cases it would fix a similar issue > (for instance, in merge-file.c). Thanks for taking the time to look at the other uses of XDF_DIFF_ALGORITHM_MASK. I think adding XDF_NEED_MINIMAL to XDF_DIFF_ALGORITHM_MASK is a good idea as I think we always treat "minimal" as another diff algorithm, rather than a variant of "myers" which is how it is implemented in xdiff. > Is your suggestion to introduce a new macro motivated by stability > concerns? I'm aware that xdiff is used in other code bases as a library, > so I guess changing XDF_DIFF_ALGORITHM_MASK can indeed be seen as a > breaking change. While it is a breaking change, tweaking XDF_DIFF_ALGORITHM_MASK seems like a better idea to me. I wondered about it yesterday but worried it would end up being too much work to review all the existing uses. Thanks Phillip > Antonin > ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4 0/2] blame: make diff algorithm configurable 2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget 2025-10-29 10:16 ` Phillip Wood @ 2025-11-01 21:57 ` Antonin Delpeuch via GitGitGadget 2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget ` (2 more replies) 1 sibling, 3 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch Changes since v3: * fix resetting of diff algorithm by adding XDF_NEED_MINIMAL to XDF_DIFF_ALGORITHM_MASK * restore --no-minimal support * fix typo in description of --minimal option * remove the 'for now' in test description * remove piping in tests * pass configuration variables to the blame command directly instead of calling 'git config' Antonin Delpeuch (2): xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK blame: make diff algorithm configurable Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 +++++- diff.c | 2 - merge-ort.c | 2 - t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ xdiff/xdiff.h | 2 +- 9 files changed, 279 insertions(+), 26 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v4 Pull-Request: https://github.com/git/git/pull/2075 Range-diff vs v3: -: ---------- > 1: e81a5d2bd2 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 1: b8bdb03516 ! 2: 920a6f3acb blame: make diff algorithm configurable @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con +{ + int *opt = option->value; + -+ BUG_ON_OPT_NEG(unset); + BUG_ON_OPT_ARG(arg); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; -+ *opt |= XDF_NEED_MINIMAL; ++ if (!unset) ++ *opt |= XDF_NEED_MINIMAL; + + return 0; +} @@ builtin/blame.c: int cmd_blame(int argc, OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, -+ N_("spend extra cycles to find better match"), -+ PARSE_OPT_NONEG | PARSE_OPT_NOARG, -+ blame_diff_algorithm_minimal), ++ N_("spend extra cycles to find a better match"), ++ PARSE_OPT_NOARG, blame_diff_algorithm_minimal), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), @@ t/t8015-blame-diff-algorithm.sh (new) + GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2 +' + -+test_expect_success 'blame uses Myers diff algorithm by default for now' ' ++test_expect_success 'blame uses Myers diff algorithm by default' ' + cat >expected <<-\EOF && + Commit_2 int g(size_t u) + Commit_1 { @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_1 } + EOF + -+ -+ git blame file.c | \ -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ -+ sed -e "s/ *$//g" > actual && ++ git blame file.c > output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && ++ sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git blame file.c --diff-algorithm histogram | \ -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ -+ sed -e "s/ *$//g" > actual && ++ git blame file.c --diff-algorithm histogram > output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && ++ sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git config diff.algorithm histogram && -+ git blame file.c | \ -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ -+ sed -e "s/ *$//g" > actual && ++ git -c diff.algorithm=histogram blame file.c > output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && ++ sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git config diff.algorithm myers && -+ git blame file.c --diff-algorithm histogram | \ -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \ -+ sed -e "s/ *$//g" > actual && ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && ++ sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' ++ +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && + Commit_1 x @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 G + EOF + -+ git blame file.txt --minimal | \ -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual && ++ git blame file.txt --minimal > output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 G + EOF + -+ git blame file.txt --minimal --diff-algorithm myers | \ -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" > actual && ++ git blame file.txt --minimal --diff-algorithm myers > output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && + test_cmp expected actual +' + -+ +test_done -- gitgitgadget ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 ` Antonin Delpeuch via GitGitGadget 2025-11-03 14:32 ` Phillip Wood 2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget 2 siblings, 1 reply; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience and histogram diffs, not for the minimal one. This means that when reseting the diff algorithm to the default one, one needs to separately clear the bit for the minimal diff. There are places in the code that fail to do that: merge-ort.c and builtin/merge-file.c. Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate clearing of this bit in the places where it hasn't been forgotten. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- diff.c | 2 -- merge-ort.c | 2 -- xdiff/xdiff.h | 2 +- 3 files changed, 1 insertion(+), 5 deletions(-) diff --git a/diff.c b/diff.c index 87fa16b730..6ce3591c5b 100644 --- a/diff.c +++ b/diff.c @@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts, if (value < 0) return -1; - /* clear out previous settings */ - DIFF_XDL_CLR(opts, NEED_MINIMAL); opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opts->xdl_opts |= value; diff --git a/merge-ort.c b/merge-ort.c index 29858074f9..9b2b0fce7e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -5495,8 +5495,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s) long value = parse_algorithm_value(arg); if (value < 0) return -1; - /* clear out previous settings */ - DIFF_XDL_CLR(opt, NEED_MINIMAL); opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opt->xdl_opts |= value; } diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h index 2cecde5afe..dc370712e9 100644 --- a/xdiff/xdiff.h +++ b/xdiff/xdiff.h @@ -43,7 +43,7 @@ extern "C" { #define XDF_PATIENCE_DIFF (1 << 14) #define XDF_HISTOGRAM_DIFF (1 << 15) -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF) +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL) #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK) #define XDF_INDENT_HEURISTIC (1 << 23) -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget @ 2025-11-03 14:32 ` Phillip Wood 0 siblings, 0 replies; 32+ messages in thread From: Phillip Wood @ 2025-11-03 14:32 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch Hi Antonin On 01/11/2025 21:57, Antonin Delpeuch via GitGitGadget wrote: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience > and histogram diffs, not for the minimal one. This means that when > reseting the diff algorithm to the default one, one needs to separately > clear the bit for the minimal diff. There are places in the code that fail > to do that: merge-ort.c and builtin/merge-file.c. > > Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate > clearing of this bit in the places where it hasn't been forgotten. Nicely explained. This is a useful improvement that should prevent errors in the future. After this patch there are no users of DIFF_XDL_CLR() so we should probably remove that macro. I'm not sure it makes sense to remove the comments that have been deleted below as we're still clearing the old setting. Apart from that this all looks good. Thanks Phillip > Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> > --- > diff.c | 2 -- > merge-ort.c | 2 -- > xdiff/xdiff.h | 2 +- > 3 files changed, 1 insertion(+), 5 deletions(-) > > diff --git a/diff.c b/diff.c > index 87fa16b730..6ce3591c5b 100644 > --- a/diff.c > +++ b/diff.c > @@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts, > if (value < 0) > return -1; > > - /* clear out previous settings */ > - DIFF_XDL_CLR(opts, NEED_MINIMAL); > opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; > opts->xdl_opts |= value; > > diff --git a/merge-ort.c b/merge-ort.c > index 29858074f9..9b2b0fce7e 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -5495,8 +5495,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s) > long value = parse_algorithm_value(arg); > if (value < 0) > return -1; > - /* clear out previous settings */ > - DIFF_XDL_CLR(opt, NEED_MINIMAL); > opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; > opt->xdl_opts |= value; > } > diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h > index 2cecde5afe..dc370712e9 100644 > --- a/xdiff/xdiff.h > +++ b/xdiff/xdiff.h > @@ -43,7 +43,7 @@ extern "C" { > > #define XDF_PATIENCE_DIFF (1 << 14) > #define XDF_HISTOGRAM_DIFF (1 << 15) > -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF) > +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL) > #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK) > > #define XDF_INDENT_HEURISTIC (1 << 23) ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4 2/2] blame: make diff algorithm configurable 2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget 2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 ` Antonin Delpeuch via GitGitGadget 2025-11-03 14:32 ` Phillip Wood 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget 2 siblings, 1 reply; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-01 21:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 +++++- t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ 6 files changed, 278 insertions(+), 21 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc new file mode 100644 index 0000000000..8e3a0b63d7 --- /dev/null +++ b/Documentation/diff-algorithm-option.adoc @@ -0,0 +1,20 @@ +`--diff-algorithm=(patience|minimal|histogram|myers)`:: + Choose a diff algorithm. The variants are as follows: ++ +-- + `default`;; + `myers`;; + The basic greedy diff algorithm. Currently, this is the default. + `minimal`;; + Spend extra time to make sure the smallest possible diff is + produced. + `patience`;; + Use "patience diff" algorithm when generating patches. + `histogram`;; + This algorithm extends the patience algorithm to "support + low-occurrence common elements". +-- ++ +For instance, if you configured the `diff.algorithm` variable to a +non-default value and want to use the default one, then you +have to use `--diff-algorithm=default` option. diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc index ae31520f7f..9cdad6f72a 100644 --- a/Documentation/diff-options.adoc +++ b/Documentation/diff-options.adoc @@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally. -`--diff-algorithm=(patience|minimal|histogram|myers)`:: - Choose a diff algorithm. The variants are as follows: -+ --- - `default`;; - `myers`;; - The basic greedy diff algorithm. Currently, this is the default. - `minimal`;; - Spend extra time to make sure the smallest possible diff is - produced. - `patience`;; - Use "patience diff" algorithm when generating patches. - `histogram`;; - This algorithm extends the patience algorithm to "support - low-occurrence common elements". --- -+ -For instance, if you configured the `diff.algorithm` variable to a -non-default value and want to use the default one, then you -have to use `--diff-algorithm=default` option. +include::diff-algorithm-option.adoc[] `--stat[=<width>[,<name-width>[,<count>]]]`:: Generate a diffstat. By default, as much space as necessary diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc index e438d28625..adcbb6f5dc 100644 --- a/Documentation/git-blame.adoc +++ b/Documentation/git-blame.adoc @@ -85,6 +85,8 @@ include::blame-options.adoc[] Ignore whitespace when comparing the parent's version and the child's to find where the lines came from. +include::diff-algorithm-option.adoc[] + --abbrev=<n>:: Instead of using the default 7+1 hexadecimal digits as the abbreviated object name, use <m>+1 digits, where <m> is at diff --git a/builtin/blame.c b/builtin/blame.c index 2703820258..888ce708a6 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, } } + if (!strcmp(var, "diff.algorithm")) { + long diff_algorithm; + if (!value) + return config_error_nonbool(var); + diff_algorithm = parse_algorithm_value(value); + if (diff_algorithm < 0) + return error(_("unknown value for config '%s': %s"), + var, value); + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; + xdl_opts |= diff_algorithm; + return 0; + } + if (git_diff_heuristic_config(var, value, cb) < 0) return -1; if (userdiff_config(var, value) < 0) @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int return 0; } +static int blame_diff_algorithm_minimal(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + + BUG_ON_OPT_ARG(arg); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + if (!unset) + *opt |= XDF_NEED_MINIMAL; + + return 0; +} + +static int blame_diff_algorithm_callback(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + long value = parse_algorithm_value(arg); + + BUG_ON_OPT_NEG(unset); + + if (value < 0) + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); + *opt |= value; + + return 0; +} + static int is_a_rev(const char *name) { struct object_id oid; @@ -915,11 +960,16 @@ int cmd_blame(int argc, OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), + N_("choose a diff algorithm"), + PARSE_OPT_NONEG, blame_diff_algorithm_callback), OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find a better match"), + PARSE_OPT_NOARG, blame_diff_algorithm_minimal), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), diff --git a/t/meson.build b/t/meson.build index 401b24e50e..9f2fe7af8b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -955,6 +955,7 @@ integration_tests = [ 't8012-blame-colors.sh', 't8013-blame-ignore-revs.sh', 't8014-blame-ignore-fuzzy.sh', + 't8015-blame-diff-algorithm.sh', 't8020-last-modified.sh', 't9001-send-email.sh', 't9002-column.sh', diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh new file mode 100755 index 0000000000..5318e18cb3 --- /dev/null +++ b/t/t8015-blame-diff-algorithm.sh @@ -0,0 +1,203 @@ +#!/bin/sh + +test_description='git blame with specific diff algorithm' + +. ./test-lib.sh + +test_expect_success setup ' + cat >file.c <<-\EOF && + int f(int x, int y) + { + if (x == 0) + { + return y; + } + return x; + } + + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + EOF + test_write_lines x x x x >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 && + + cat >file.c <<-\EOF && + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + + int h(int x, int y, int z) + { + if (z == 0) + { + return x; + } + return y; + } + EOF + test_write_lines x x x A B C D x E F G >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2 +' + +test_expect_success 'blame uses Myers diff algorithm by default' ' + cat >expected <<-\EOF && + Commit_2 int g(size_t u) + Commit_1 { + Commit_2 while (u < 30) + Commit_1 { + Commit_2 u++; + Commit_1 } + Commit_2 return u; + Commit_1 } + Commit_1 + Commit_2 int h(int x, int y, int z) + Commit_1 { + Commit_2 if (z == 0) + Commit_1 { + Commit_2 return x; + Commit_1 } + Commit_2 return y; + Commit_1 } + EOF + + git blame file.c > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && + sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --diff-algorithm option' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git blame file.c --diff-algorithm histogram > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && + sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors diff.algorithm config variable' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git -c diff.algorithm=histogram blame file.c > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && + sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git -c diff.algorithm=myers blame file.c --diff-algorithm histogram && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && + sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_1 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && + test_cmp expected actual +' + +test_expect_success 'blame respects the order of diff options' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_2 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal --diff-algorithm myers > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && + test_cmp expected actual +' + +test_done -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v4 2/2] blame: make diff algorithm configurable 2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget @ 2025-11-03 14:32 ` Phillip Wood 2025-11-03 16:15 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Phillip Wood @ 2025-11-03 14:32 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch Hi Antonin Thanks for re-rolling, this is looking pretty sound, I've left a couple of fairly minor comments below. On 01/11/2025 21:57, Antonin Delpeuch via GitGitGadget wrote: > +static int blame_diff_algorithm_minimal(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + > + BUG_ON_OPT_ARG(arg); > + > + *opt &= ~XDF_DIFF_ALGORITHM_MASK; > + if (!unset) > + *opt |= XDF_NEED_MINIMAL; One thing I'd not thought about before was the interaction between "--no-minimal" and "--diff-algorithm" The code above makes "--no-minimal" behave like "diff-algorithm=myers" which is consistent with the current behavior where the only options for the diff algorithm are "minimal" or "myers". An alternative would be for "--no-minimal" to just clear XDF_NEED_MINIMAL and behave like a no-op if it is given after "--diff-algorithm=patience" or "--diff-algorithm=histogram". I don't really have a strong preference either way. > +static int blame_diff_algorithm_callback(const struct option *option, > + const char *arg, int unset) > +{ > + int *opt = option->value; > + long value = parse_algorithm_value(arg); > + > + BUG_ON_OPT_NEG(unset); > + > + if (value < 0) > + return error(_("option diff-algorithm accepts \"myers\", " > + "\"minimal\", \"patience\" and \"histogram\"")); > + > + *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); We can just use XDF_DIFF_ALGORITHM_MASK now that we've added XDF_NEED_MINMAL to it in the last commit. > @@ -915,11 +960,16 @@ int cmd_blame(int argc, > OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), > OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), > OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), > + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), > + N_("choose a diff algorithm"), > + PARSE_OPT_NONEG, blame_diff_algorithm_callback), > OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), > OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), > OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), > OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), > - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), > + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, > + N_("spend extra cycles to find a better match"), > + PARSE_OPT_NOARG, blame_diff_algorithm_minimal), Given the potential for confusing interactions between "--no-minimal" and "--diff-algorithm" I think it would be worth adding OPT_HIDDEN here. > diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh > new file mode 100755 > index 0000000000..5318e18cb3 > --- /dev/null > +++ b/t/t8015-blame-diff-algorithm.sh > @@ -0,0 +1,203 @@ > + [...] > + git blame file.c > output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > + sed -e "s/ *$//g" without_varying_parts > actual && This would be more efficient if it was written as sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ -e "s/ *$//g" output >actual Our test suite is really slow on windows so it is worth trying to avoid creating unnecessary processes. Thanks Phillip > + test_cmp expected actual > +' > + > +test_expect_success 'blame honors --diff-algorithm option' ' > + cat >expected <<-\EOF && > + Commit_1 int g(size_t u) > + Commit_1 { > + Commit_1 while (u < 30) > + Commit_1 { > + Commit_1 u++; > + Commit_1 } > + Commit_1 return u; > + Commit_1 } > + Commit_2 > + Commit_2 int h(int x, int y, int z) > + Commit_2 { > + Commit_2 if (z == 0) > + Commit_2 { > + Commit_2 return x; > + Commit_2 } > + Commit_2 return y; > + Commit_2 } > + EOF > + > + git blame file.c --diff-algorithm histogram > output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > + sed -e "s/ *$//g" without_varying_parts > actual && > + test_cmp expected actual > +' > + > +test_expect_success 'blame honors diff.algorithm config variable' ' > + cat >expected <<-\EOF && > + Commit_1 int g(size_t u) > + Commit_1 { > + Commit_1 while (u < 30) > + Commit_1 { > + Commit_1 u++; > + Commit_1 } > + Commit_1 return u; > + Commit_1 } > + Commit_2 > + Commit_2 int h(int x, int y, int z) > + Commit_2 { > + Commit_2 if (z == 0) > + Commit_2 { > + Commit_2 return x; > + Commit_2 } > + Commit_2 return y; > + Commit_2 } > + EOF > + > + git -c diff.algorithm=histogram blame file.c > output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > + sed -e "s/ *$//g" without_varying_parts > actual && > + test_cmp expected actual > +' > + > +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' > + cat >expected <<-\EOF && > + Commit_1 int g(size_t u) > + Commit_1 { > + Commit_1 while (u < 30) > + Commit_1 { > + Commit_1 u++; > + Commit_1 } > + Commit_1 return u; > + Commit_1 } > + Commit_2 > + Commit_2 int h(int x, int y, int z) > + Commit_2 { > + Commit_2 if (z == 0) > + Commit_2 { > + Commit_2 return x; > + Commit_2 } > + Commit_2 return y; > + Commit_2 } > + EOF > + > + git -c diff.algorithm=myers blame file.c --diff-algorithm histogram && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > + sed -e "s/ *$//g" without_varying_parts > actual && > + test_cmp expected actual > +' > + > +test_expect_success 'blame honors --minimal option' ' > + cat >expected <<-\EOF && > + Commit_1 x > + Commit_1 x > + Commit_1 x > + Commit_2 A > + Commit_2 B > + Commit_2 C > + Commit_2 D > + Commit_1 x > + Commit_2 E > + Commit_2 F > + Commit_2 G > + EOF > + > + git blame file.txt --minimal > output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && > + test_cmp expected actual > +' > + > +test_expect_success 'blame respects the order of diff options' ' > + cat >expected <<-\EOF && > + Commit_1 x > + Commit_1 x > + Commit_1 x > + Commit_2 A > + Commit_2 B > + Commit_2 C > + Commit_2 D > + Commit_2 x > + Commit_2 E > + Commit_2 F > + Commit_2 G > + EOF > + > + git blame file.txt --minimal --diff-algorithm myers > output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && > + test_cmp expected actual > +' > + > +test_done ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4 2/2] blame: make diff algorithm configurable 2025-11-03 14:32 ` Phillip Wood @ 2025-11-03 16:15 ` Junio C Hamano 2025-11-06 20:29 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2025-11-03 16:15 UTC (permalink / raw) To: Phillip Wood Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren, Antonin Delpeuch Phillip Wood <phillip.wood123@gmail.com> writes: > One thing I'd not thought about before was the interaction between > "--no-minimal" and "--diff-algorithm" The code above makes > "--no-minimal" behave like "diff-algorithm=myers" which is consistent > with the current behavior where the only options for the diff algorithm > are "minimal" or "myers". An alternative would be for "--no-minimal" to > just clear XDF_NEED_MINIMAL and behave like a no-op if it is given after > "--diff-algorithm=patience" or "--diff-algorithm=histogram". I don't > really have a strong preference either way. Good observation. In the longer term, I think we would be better off if we treated "minimal" just like "histogram" and "patience", in that (1) If the command takes --diff-algorithm=<name>, giving it as the <name> would override the previous setting. (2) If the command takes --<name> (i.e. "git diff --histogram"), giving "--no-<name>" results in an error. (3) If the command takes --<name>, it should take all the variants as <name>, not just selected few, or it shouldn't take any. IOW, we should deprecate "blame --no-minimal" as a past mistake, and in the longer term deprecate "blame --minimal" and tell users to use "--diff-algorithm=minimal" instead. If Antonin's series wants to teach --histogram and --patience to "git blame", then we can and should keep "blame --minimal" (i.e., (3) above), but in that case, "blame --no-minimal" should still go (i.e., (2) above). Under the new world order where there are more than the "minimal/no-minimal?" binary choice, where you can specify other algorithms from the usual repertoire, the option "no-minimal" no longer makes sense. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4 2/2] blame: make diff algorithm configurable 2025-11-03 16:15 ` Junio C Hamano @ 2025-11-06 20:29 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-11-06 20:29 UTC (permalink / raw) To: Phillip Wood Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren, Antonin Delpeuch Junio C Hamano <gitster@pobox.com> writes: > Phillip Wood <phillip.wood123@gmail.com> writes: > >> One thing I'd not thought about before was the interaction between >> "--no-minimal" and "--diff-algorithm" The code above makes >> "--no-minimal" behave like "diff-algorithm=myers" which is consistent >> with the current behavior where the only options for the diff algorithm >> are "minimal" or "myers". An alternative would be for "--no-minimal" to >> just clear XDF_NEED_MINIMAL and behave like a no-op if it is given after >> "--diff-algorithm=patience" or "--diff-algorithm=histogram". I don't >> really have a strong preference either way. > > Good observation. > > In the longer term, I think we would be better off if we treated > "minimal" just like "histogram" and "patience", in that > ... > other algorithms from the usual repertoire, the option "no-minimal" > no longer makes sense. Just to avoid confusion, the idea above to deprecate and kill "--no-minimal" is totally outside of this topic to teach "git blame" the "--diff-algorithm=<algo>" command line option. I think in the shorter term I think it is OK for the implementation to do whatever it happens to do when given "--no-minimal", and as you observed, it is fine to behave like "--diff-algorithm=myers". I also like your idea to hide --minimal with the OPT_HIDDEN bit. Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5 0/2] blame: make diff algorithm configurable 2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget 2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget 2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 ` Antonin Delpeuch via GitGitGadget 2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget ` (3 more replies) 2 siblings, 4 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch Changes since v4: * hide --minimal option * simplify tests to minimize spun processes * remove redundant XDF_NEED_MINIMAL in bit mask Antonin Delpeuch (2): xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK blame: make diff algorithm configurable Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 +++++- diff.c | 2 - merge-ort.c | 2 - t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ xdiff/xdiff.h | 2 +- 9 files changed, 279 insertions(+), 26 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v5 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v5 Pull-Request: https://github.com/git/git/pull/2075 Range-diff vs v4: 1: e81a5d2bd2 = 1: e81a5d2bd2 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2: 920a6f3acb ! 2: 60015bbada blame: make diff algorithm configurable @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + -+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); ++ *opt &= ~XDF_DIFF_ALGORITHM_MASK; + *opt |= value; + + return 0; @@ builtin/blame.c: int cmd_blame(int argc, - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find a better match"), -+ PARSE_OPT_NOARG, blame_diff_algorithm_minimal), ++ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), @@ t/t8015-blame-diff-algorithm.sh (new) + EOF + + git -c diff.algorithm=histogram blame file.c > output && -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && -+ sed -e "s/ *$//g" without_varying_parts > actual && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ ++ -e "s/ *$//g" output > actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram && -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && -+ sed -e "s/ *$//g" without_varying_parts > actual && ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ ++ -e "s/ *$//g" output > actual && + test_cmp expected actual +' + -- gitgitgadget ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 ` Antonin Delpeuch via GitGitGadget 2025-11-07 15:52 ` Junio C Hamano 2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget ` (2 subsequent siblings) 3 siblings, 1 reply; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience and histogram diffs, not for the minimal one. This means that when reseting the diff algorithm to the default one, one needs to separately clear the bit for the minimal diff. There are places in the code that fail to do that: merge-ort.c and builtin/merge-file.c. Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate clearing of this bit in the places where it hasn't been forgotten. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- diff.c | 2 -- merge-ort.c | 2 -- xdiff/xdiff.h | 2 +- 3 files changed, 1 insertion(+), 5 deletions(-) diff --git a/diff.c b/diff.c index 87fa16b730..6ce3591c5b 100644 --- a/diff.c +++ b/diff.c @@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts, if (value < 0) return -1; - /* clear out previous settings */ - DIFF_XDL_CLR(opts, NEED_MINIMAL); opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opts->xdl_opts |= value; diff --git a/merge-ort.c b/merge-ort.c index 29858074f9..9b2b0fce7e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -5495,8 +5495,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s) long value = parse_algorithm_value(arg); if (value < 0) return -1; - /* clear out previous settings */ - DIFF_XDL_CLR(opt, NEED_MINIMAL); opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opt->xdl_opts |= value; } diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h index 2cecde5afe..dc370712e9 100644 --- a/xdiff/xdiff.h +++ b/xdiff/xdiff.h @@ -43,7 +43,7 @@ extern "C" { #define XDF_PATIENCE_DIFF (1 << 14) #define XDF_HISTOGRAM_DIFF (1 << 15) -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF) +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL) #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK) #define XDF_INDENT_HEURISTIC (1 << 23) -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget @ 2025-11-07 15:52 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-11-07 15:52 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch "Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience > and histogram diffs, not for the minimal one. This means that when > reseting the diff algorithm to the default one, one needs to separately > clear the bit for the minimal diff. There are places in the code that fail > to do that: merge-ort.c and builtin/merge-file.c. > > Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate > clearing of this bit in the places where it hasn't been forgotten. Makes sense. In other words, lack of any algorithm-mask bit means the code uses myers. > diff --git a/diff.c b/diff.c > index 87fa16b730..6ce3591c5b 100644 > --- a/diff.c > +++ b/diff.c > @@ -3526,8 +3526,6 @@ static int set_diff_algorithm(struct diff_options *opts, > if (value < 0) > return -1; > > - /* clear out previous settings */ > - DIFF_XDL_CLR(opts, NEED_MINIMAL); > opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; The comment still accurately describes what the surviving line does, though. It is borderline if it needs commenting, but the topic of this patch not being "remove overly obvious comments", I'd probably vote for retaining the comment. > diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h > index 2cecde5afe..dc370712e9 100644 > --- a/xdiff/xdiff.h > +++ b/xdiff/xdiff.h > @@ -43,7 +43,7 @@ extern "C" { > > #define XDF_PATIENCE_DIFF (1 << 14) > #define XDF_HISTOGRAM_DIFF (1 << 15) > -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF) > +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL) > #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK) Given the definition of XDF_DIFF_ALG(), I wondered how it is used. $ git grep -n -e 'XDF_DIFF_ALG(' \*.c xdiff/xdiffi.c:324: if (XDF_DIFF_ALG(xpp->flags) == XDF_PATIENCE_DIFF) { xdiff/xdiffi.c:329: if (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF) { xdiff/xprepare.c:170: if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) && xdiff/xprepare.c:171: (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) { xdiff/xprepare.c:396: sample = (XDF_DIFF_ALG(xpp->flags) == XDF_HISTOGRAM_DIFF xdiff/xprepare.c:417: if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) && xdiff/xprepare.c:418: (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF) && They say "if the specified algorithm is (or is not) patience (or histogram), do this". Now the original code, because the mask did not include the need-minimal bit, would have chosen patience code path even if xpp->flags had XDF_PATIENCE_DIFF and XDF_NEED_MINIMAL bits set at the same time. The code would no longer do so. What is keeping us safe and not making this change a bug is that among XDF_DIFF_ALGORITHM_MASK bits, we intend to set at most one of them at a time. I wonder if we want the command line option and configuration parser to have an explicit check (and BUG("")) to ensure this constraint. Also, in the longer term as #leftoverbits clean-up, perhaps these bits can be removed from xpp->flags and diff_options->xdl_opts, xpp structure can gain a separate member that is an enum of the algorithm names instead, and XDF_DIFF_ALGORITHM_MASK can be dropped? Then set_diff_algorithm() we saw earlier can become if (value < 0) return -1; opts->xdl_algo = value; return 0; And since there is no "clean out prvious settings" required (now we can simply overwrite), we can truly lose that old comment once we do so. As a part of this topic, I think that a new code to sanity check that there are at most one bit in XDF_DIFF_ALG(xpp->flags) may be a good safety measure to have. Moving the algorithm bits out of the flags is a larger change, and it may be better left outside the topic, but I do not personaly mind seeing such a clean-up included as a preparatory change for this series, either. Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5 2/2] blame: make diff algorithm configurable 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget 2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 ` Antonin Delpeuch via GitGitGadget 2025-11-07 15:57 ` Junio C Hamano 2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget 3 siblings, 1 reply; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-06 22:41 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 +++++- t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ 6 files changed, 278 insertions(+), 21 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc new file mode 100644 index 0000000000..8e3a0b63d7 --- /dev/null +++ b/Documentation/diff-algorithm-option.adoc @@ -0,0 +1,20 @@ +`--diff-algorithm=(patience|minimal|histogram|myers)`:: + Choose a diff algorithm. The variants are as follows: ++ +-- + `default`;; + `myers`;; + The basic greedy diff algorithm. Currently, this is the default. + `minimal`;; + Spend extra time to make sure the smallest possible diff is + produced. + `patience`;; + Use "patience diff" algorithm when generating patches. + `histogram`;; + This algorithm extends the patience algorithm to "support + low-occurrence common elements". +-- ++ +For instance, if you configured the `diff.algorithm` variable to a +non-default value and want to use the default one, then you +have to use `--diff-algorithm=default` option. diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc index ae31520f7f..9cdad6f72a 100644 --- a/Documentation/diff-options.adoc +++ b/Documentation/diff-options.adoc @@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally. -`--diff-algorithm=(patience|minimal|histogram|myers)`:: - Choose a diff algorithm. The variants are as follows: -+ --- - `default`;; - `myers`;; - The basic greedy diff algorithm. Currently, this is the default. - `minimal`;; - Spend extra time to make sure the smallest possible diff is - produced. - `patience`;; - Use "patience diff" algorithm when generating patches. - `histogram`;; - This algorithm extends the patience algorithm to "support - low-occurrence common elements". --- -+ -For instance, if you configured the `diff.algorithm` variable to a -non-default value and want to use the default one, then you -have to use `--diff-algorithm=default` option. +include::diff-algorithm-option.adoc[] `--stat[=<width>[,<name-width>[,<count>]]]`:: Generate a diffstat. By default, as much space as necessary diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc index e438d28625..adcbb6f5dc 100644 --- a/Documentation/git-blame.adoc +++ b/Documentation/git-blame.adoc @@ -85,6 +85,8 @@ include::blame-options.adoc[] Ignore whitespace when comparing the parent's version and the child's to find where the lines came from. +include::diff-algorithm-option.adoc[] + --abbrev=<n>:: Instead of using the default 7+1 hexadecimal digits as the abbreviated object name, use <m>+1 digits, where <m> is at diff --git a/builtin/blame.c b/builtin/blame.c index 2703820258..27b513d27f 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, } } + if (!strcmp(var, "diff.algorithm")) { + long diff_algorithm; + if (!value) + return config_error_nonbool(var); + diff_algorithm = parse_algorithm_value(value); + if (diff_algorithm < 0) + return error(_("unknown value for config '%s': %s"), + var, value); + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; + xdl_opts |= diff_algorithm; + return 0; + } + if (git_diff_heuristic_config(var, value, cb) < 0) return -1; if (userdiff_config(var, value) < 0) @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int return 0; } +static int blame_diff_algorithm_minimal(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + + BUG_ON_OPT_ARG(arg); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + if (!unset) + *opt |= XDF_NEED_MINIMAL; + + return 0; +} + +static int blame_diff_algorithm_callback(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + long value = parse_algorithm_value(arg); + + BUG_ON_OPT_NEG(unset); + + if (value < 0) + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + *opt |= value; + + return 0; +} + static int is_a_rev(const char *name) { struct object_id oid; @@ -915,11 +960,16 @@ int cmd_blame(int argc, OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), + N_("choose a diff algorithm"), + PARSE_OPT_NONEG, blame_diff_algorithm_callback), OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find a better match"), + PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), diff --git a/t/meson.build b/t/meson.build index 401b24e50e..9f2fe7af8b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -955,6 +955,7 @@ integration_tests = [ 't8012-blame-colors.sh', 't8013-blame-ignore-revs.sh', 't8014-blame-ignore-fuzzy.sh', + 't8015-blame-diff-algorithm.sh', 't8020-last-modified.sh', 't9001-send-email.sh', 't9002-column.sh', diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh new file mode 100755 index 0000000000..55e1d540dc --- /dev/null +++ b/t/t8015-blame-diff-algorithm.sh @@ -0,0 +1,203 @@ +#!/bin/sh + +test_description='git blame with specific diff algorithm' + +. ./test-lib.sh + +test_expect_success setup ' + cat >file.c <<-\EOF && + int f(int x, int y) + { + if (x == 0) + { + return y; + } + return x; + } + + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + EOF + test_write_lines x x x x >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 && + + cat >file.c <<-\EOF && + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + + int h(int x, int y, int z) + { + if (z == 0) + { + return x; + } + return y; + } + EOF + test_write_lines x x x A B C D x E F G >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2 +' + +test_expect_success 'blame uses Myers diff algorithm by default' ' + cat >expected <<-\EOF && + Commit_2 int g(size_t u) + Commit_1 { + Commit_2 while (u < 30) + Commit_1 { + Commit_2 u++; + Commit_1 } + Commit_2 return u; + Commit_1 } + Commit_1 + Commit_2 int h(int x, int y, int z) + Commit_1 { + Commit_2 if (z == 0) + Commit_1 { + Commit_2 return x; + Commit_1 } + Commit_2 return y; + Commit_1 } + EOF + + git blame file.c > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && + sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --diff-algorithm option' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git blame file.c --diff-algorithm histogram > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && + sed -e "s/ *$//g" without_varying_parts > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors diff.algorithm config variable' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git -c diff.algorithm=histogram blame file.c > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ + -e "s/ *$//g" output > actual && + test_cmp expected actual +' + +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ + -e "s/ *$//g" output > actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_1 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && + test_cmp expected actual +' + +test_expect_success 'blame respects the order of diff options' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_2 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal --diff-algorithm myers > output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && + test_cmp expected actual +' + +test_done -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5 2/2] blame: make diff algorithm configurable 2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget @ 2025-11-07 15:57 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-11-07 15:57 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch "Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Antonin Delpeuch <antonin@delpeuch.eu> > > The diff algorithm used in 'git-blame(1)' is set to 'myers', > without the possibility to change it aside from the `--minimal` option. > > There has been long-standing interest in changing the default diff > algorithm to "histogram", and Git 3.0 was floated as a possible occasion > for taking some steps towards that: > > https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ > > As a preparation for this move, it is worth making sure that the diff > algorithm is configurable where useful. > > Make it configurable in the `git-blame(1)` command by introducing the > `--diff-algorithm` option and make honor the `diff.algorithm` config > variable. Keep Myers diff as the default. > > Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> > --- This step does not have anything surprising in it, knowing what the previous iteration contained. Looking good. Other than that many redirections into a file are written with a space between redirection operator and its target, i.e. command > output && that should be, according to the coding guidelines, written like command >output && that is. > +test_expect_success 'blame respects the order of diff options' ' > + cat >expected <<-\EOF && > +... > + EOF > + > + git blame file.txt --minimal --diff-algorithm myers > output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && > + test_cmp expected actual > +' Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 0/2] blame: make diff algorithm configurable 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget 2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget 2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget @ 2025-11-07 15:49 ` Phillip Wood 2025-11-17 1:12 ` Junio C Hamano 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget 3 siblings, 1 reply; 32+ messages in thread From: Phillip Wood @ 2025-11-07 15:49 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch Hi Antonin On 06/11/2025 22:41, Antonin Delpeuch via GitGitGadget wrote: > Changes since v4: > > * hide --minimal option > * simplify tests to minimize spun processes > * remove redundant XDF_NEED_MINIMAL in bit mask Excellent, the range-diff below looks good. Thanks for working on this Phillip > Antonin Delpeuch (2): > xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK > blame: make diff algorithm configurable > > Documentation/diff-algorithm-option.adoc | 20 +++ > Documentation/diff-options.adoc | 21 +-- > Documentation/git-blame.adoc | 2 + > builtin/blame.c | 52 +++++- > diff.c | 2 - > merge-ort.c | 2 - > t/meson.build | 1 + > t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ > xdiff/xdiff.h | 2 +- > 9 files changed, 279 insertions(+), 26 deletions(-) > create mode 100644 Documentation/diff-algorithm-option.adoc > create mode 100755 t/t8015-blame-diff-algorithm.sh > > > base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v5 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v5 > Pull-Request: https://github.com/git/git/pull/2075 > > Range-diff vs v4: > > 1: e81a5d2bd2 = 1: e81a5d2bd2 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK > 2: 920a6f3acb ! 2: 60015bbada blame: make diff algorithm configurable > @@ builtin/blame.c: static int blame_move_callback(const struct option *option, con > + return error(_("option diff-algorithm accepts \"myers\", " > + "\"minimal\", \"patience\" and \"histogram\"")); > + > -+ *opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK); > ++ *opt &= ~XDF_DIFF_ALGORITHM_MASK; > + *opt |= value; > + > + return 0; > @@ builtin/blame.c: int cmd_blame(int argc, > - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), > + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, > + N_("spend extra cycles to find a better match"), > -+ PARSE_OPT_NOARG, blame_diff_algorithm_minimal), > ++ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal), > OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), > OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), > OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), > @@ t/t8015-blame-diff-algorithm.sh (new) > + EOF > + > + git -c diff.algorithm=histogram blame file.c > output && > -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > -+ sed -e "s/ *$//g" without_varying_parts > actual && > ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ > ++ -e "s/ *$//g" output > actual && > + test_cmp expected actual > +' > + > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_2 } > + EOF > + > -+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram && > -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > -+ sed -e "s/ *$//g" without_varying_parts > actual && > ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output && > ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ > ++ -e "s/ *$//g" output > actual && > + test_cmp expected actual > +' > + > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5 0/2] blame: make diff algorithm configurable 2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood @ 2025-11-17 1:12 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-11-17 1:12 UTC (permalink / raw) To: Phillip Wood, Antonin Delpeuch Cc: Antonin Delpeuch via GitGitGadget, git, Elijah Newren Phillip Wood <phillip.wood123@gmail.com> writes: > Excellent, the range-diff below looks good. Thanks for working on this > > Phillip > >> Antonin Delpeuch (2): >> xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK >> blame: make diff algorithm configurable >> >> Documentation/diff-algorithm-option.adoc | 20 +++ >> Documentation/diff-options.adoc | 21 +-- >> Documentation/git-blame.adoc | 2 + >> builtin/blame.c | 52 +++++- >> diff.c | 2 - >> merge-ort.c | 2 - >> t/meson.build | 1 + >> t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ >> xdiff/xdiff.h | 2 +- >> 9 files changed, 279 insertions(+), 26 deletions(-) >> create mode 100644 Documentation/diff-algorithm-option.adoc >> create mode 100755 t/t8015-blame-diff-algorithm.sh OK, we haven't seen any activities since we saw this comment. Are we ready to mark the topic for 'next', or are you waiting for the end of feature freeze to make a (hopefully small and fanal) reroll? Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v6 0/2] blame: make diff algorithm configurable 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget ` (2 preceding siblings ...) 2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood @ 2025-11-17 8:04 ` Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget ` (3 more replies) 3 siblings, 4 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch Changes since v5: * add back /* clear out previous settings */ comments * remove whitespace in bash output redirection Antonin Delpeuch (2): xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK blame: make diff algorithm configurable Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 +++++- diff.c | 1 - merge-ort.c | 1 - t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ xdiff/xdiff.h | 2 +- 9 files changed, 279 insertions(+), 24 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v6 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v6 Pull-Request: https://github.com/git/git/pull/2075 Range-diff vs v5: 1: e81a5d2bd2 ! 1: 4846715436 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK @@ Commit message ## diff.c ## @@ diff.c: static int set_diff_algorithm(struct diff_options *opts, - if (value < 0) return -1; -- /* clear out previous settings */ + /* clear out previous settings */ - DIFF_XDL_CLR(opts, NEED_MINIMAL); opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opts->xdl_opts |= value; @@ diff.c: static int set_diff_algorithm(struct diff_options *opts, ## merge-ort.c ## @@ merge-ort.c: int parse_merge_opt(struct merge_options *opt, const char *s) - long value = parse_algorithm_value(arg); if (value < 0) return -1; -- /* clear out previous settings */ + /* clear out previous settings */ - DIFF_XDL_CLR(opt, NEED_MINIMAL); opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opt->xdl_opts |= value; 2: 60015bbada ! 2: c477b87cc6 blame: make diff algorithm configurable @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_1 } + EOF + -+ git blame file.c > output && -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && -+ sed -e "s/ *$//g" without_varying_parts > actual && ++ git blame file.c >output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts && ++ sed -e "s/ *$//g" without_varying_parts >actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git blame file.c --diff-algorithm histogram > output && -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && -+ sed -e "s/ *$//g" without_varying_parts > actual && ++ git blame file.c --diff-algorithm histogram >output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts && ++ sed -e "s/ *$//g" without_varying_parts >actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git -c diff.algorithm=histogram blame file.c > output && ++ git -c diff.algorithm=histogram blame file.c >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ -+ -e "s/ *$//g" output > actual && ++ -e "s/ *$//g" output >actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 } + EOF + -+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output && ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ -+ -e "s/ *$//g" output > actual && ++ -e "s/ *$//g" output >actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 G + EOF + -+ git blame file.txt --minimal > output && -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && ++ git blame file.txt --minimal >output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual && + test_cmp expected actual +' + @@ t/t8015-blame-diff-algorithm.sh (new) + Commit_2 G + EOF + -+ git blame file.txt --minimal --diff-algorithm myers > output && -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && ++ git blame file.txt --minimal --diff-algorithm myers >output && ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual && + test_cmp expected actual +' + -- gitgitgadget ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 ` Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget ` (2 subsequent siblings) 3 siblings, 0 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience and histogram diffs, not for the minimal one. This means that when reseting the diff algorithm to the default one, one needs to separately clear the bit for the minimal diff. There are places in the code that fail to do that: merge-ort.c and builtin/merge-file.c. Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate clearing of this bit in the places where it hasn't been forgotten. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- diff.c | 1 - merge-ort.c | 1 - xdiff/xdiff.h | 2 +- 3 files changed, 1 insertion(+), 3 deletions(-) diff --git a/diff.c b/diff.c index 87fa16b730..cdcd11f1f7 100644 --- a/diff.c +++ b/diff.c @@ -3527,7 +3527,6 @@ static int set_diff_algorithm(struct diff_options *opts, return -1; /* clear out previous settings */ - DIFF_XDL_CLR(opts, NEED_MINIMAL); opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opts->xdl_opts |= value; diff --git a/merge-ort.c b/merge-ort.c index 29858074f9..23e2b64c79 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -5496,7 +5496,6 @@ int parse_merge_opt(struct merge_options *opt, const char *s) if (value < 0) return -1; /* clear out previous settings */ - DIFF_XDL_CLR(opt, NEED_MINIMAL); opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; opt->xdl_opts |= value; } diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h index 2cecde5afe..dc370712e9 100644 --- a/xdiff/xdiff.h +++ b/xdiff/xdiff.h @@ -43,7 +43,7 @@ extern "C" { #define XDF_PATIENCE_DIFF (1 << 14) #define XDF_HISTOGRAM_DIFF (1 << 15) -#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF) +#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF | XDF_NEED_MINIMAL) #define XDF_DIFF_ALG(x) ((x) & XDF_DIFF_ALGORITHM_MASK) #define XDF_INDENT_HEURISTIC (1 << 23) -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v6 2/2] blame: make diff algorithm configurable 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 ` Antonin Delpeuch via GitGitGadget 2025-11-17 14:13 ` [PATCH v6 0/2] " Phillip Wood 2025-11-17 18:24 ` Junio C Hamano 3 siblings, 0 replies; 32+ messages in thread From: Antonin Delpeuch via GitGitGadget @ 2025-11-17 8:04 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Phillip Wood, Antonin Delpeuch, Antonin Delpeuch From: Antonin Delpeuch <antonin@delpeuch.eu> The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> --- Documentation/diff-algorithm-option.adoc | 20 +++ Documentation/diff-options.adoc | 21 +-- Documentation/git-blame.adoc | 2 + builtin/blame.c | 52 +++++- t/meson.build | 1 + t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ 6 files changed, 278 insertions(+), 21 deletions(-) create mode 100644 Documentation/diff-algorithm-option.adoc create mode 100755 t/t8015-blame-diff-algorithm.sh diff --git a/Documentation/diff-algorithm-option.adoc b/Documentation/diff-algorithm-option.adoc new file mode 100644 index 0000000000..8e3a0b63d7 --- /dev/null +++ b/Documentation/diff-algorithm-option.adoc @@ -0,0 +1,20 @@ +`--diff-algorithm=(patience|minimal|histogram|myers)`:: + Choose a diff algorithm. The variants are as follows: ++ +-- + `default`;; + `myers`;; + The basic greedy diff algorithm. Currently, this is the default. + `minimal`;; + Spend extra time to make sure the smallest possible diff is + produced. + `patience`;; + Use "patience diff" algorithm when generating patches. + `histogram`;; + This algorithm extends the patience algorithm to "support + low-occurrence common elements". +-- ++ +For instance, if you configured the `diff.algorithm` variable to a +non-default value and want to use the default one, then you +have to use `--diff-algorithm=default` option. diff --git a/Documentation/diff-options.adoc b/Documentation/diff-options.adoc index ae31520f7f..9cdad6f72a 100644 --- a/Documentation/diff-options.adoc +++ b/Documentation/diff-options.adoc @@ -197,26 +197,7 @@ and starts with _<text>_, this algorithm attempts to prevent it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally. -`--diff-algorithm=(patience|minimal|histogram|myers)`:: - Choose a diff algorithm. The variants are as follows: -+ --- - `default`;; - `myers`;; - The basic greedy diff algorithm. Currently, this is the default. - `minimal`;; - Spend extra time to make sure the smallest possible diff is - produced. - `patience`;; - Use "patience diff" algorithm when generating patches. - `histogram`;; - This algorithm extends the patience algorithm to "support - low-occurrence common elements". --- -+ -For instance, if you configured the `diff.algorithm` variable to a -non-default value and want to use the default one, then you -have to use `--diff-algorithm=default` option. +include::diff-algorithm-option.adoc[] `--stat[=<width>[,<name-width>[,<count>]]]`:: Generate a diffstat. By default, as much space as necessary diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc index e438d28625..adcbb6f5dc 100644 --- a/Documentation/git-blame.adoc +++ b/Documentation/git-blame.adoc @@ -85,6 +85,8 @@ include::blame-options.adoc[] Ignore whitespace when comparing the parent's version and the child's to find where the lines came from. +include::diff-algorithm-option.adoc[] + --abbrev=<n>:: Instead of using the default 7+1 hexadecimal digits as the abbreviated object name, use <m>+1 digits, where <m> is at diff --git a/builtin/blame.c b/builtin/blame.c index 2703820258..27b513d27f 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value, } } + if (!strcmp(var, "diff.algorithm")) { + long diff_algorithm; + if (!value) + return config_error_nonbool(var); + diff_algorithm = parse_algorithm_value(value); + if (diff_algorithm < 0) + return error(_("unknown value for config '%s': %s"), + var, value); + xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; + xdl_opts |= diff_algorithm; + return 0; + } + if (git_diff_heuristic_config(var, value, cb) < 0) return -1; if (userdiff_config(var, value) < 0) @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int return 0; } +static int blame_diff_algorithm_minimal(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + + BUG_ON_OPT_ARG(arg); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + if (!unset) + *opt |= XDF_NEED_MINIMAL; + + return 0; +} + +static int blame_diff_algorithm_callback(const struct option *option, + const char *arg, int unset) +{ + int *opt = option->value; + long value = parse_algorithm_value(arg); + + BUG_ON_OPT_NEG(unset); + + if (value < 0) + return error(_("option diff-algorithm accepts \"myers\", " + "\"minimal\", \"patience\" and \"histogram\"")); + + *opt &= ~XDF_DIFF_ALGORITHM_MASK; + *opt |= value; + + return 0; +} + static int is_a_rev(const char *name) { struct object_id oid; @@ -915,11 +960,16 @@ int cmd_blame(int argc, OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR), OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL), OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE), + OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"), + N_("choose a diff algorithm"), + PARSE_OPT_NONEG, blame_diff_algorithm_callback), OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")), OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")), OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE), OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR), - OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL), + OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL, + N_("spend extra cycles to find a better match"), + PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, blame_diff_algorithm_minimal), OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")), OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")), OPT_CALLBACK_F('C', NULL, &opt, N_("score"), N_("find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback), diff --git a/t/meson.build b/t/meson.build index 401b24e50e..9f2fe7af8b 100644 --- a/t/meson.build +++ b/t/meson.build @@ -955,6 +955,7 @@ integration_tests = [ 't8012-blame-colors.sh', 't8013-blame-ignore-revs.sh', 't8014-blame-ignore-fuzzy.sh', + 't8015-blame-diff-algorithm.sh', 't8020-last-modified.sh', 't9001-send-email.sh', 't9002-column.sh', diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh new file mode 100755 index 0000000000..cd709536c6 --- /dev/null +++ b/t/t8015-blame-diff-algorithm.sh @@ -0,0 +1,203 @@ +#!/bin/sh + +test_description='git blame with specific diff algorithm' + +. ./test-lib.sh + +test_expect_success setup ' + cat >file.c <<-\EOF && + int f(int x, int y) + { + if (x == 0) + { + return y; + } + return x; + } + + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + EOF + test_write_lines x x x x >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_1 git commit -m Commit_1 && + + cat >file.c <<-\EOF && + int g(size_t u) + { + while (u < 30) + { + u++; + } + return u; + } + + int h(int x, int y, int z) + { + if (z == 0) + { + return x; + } + return y; + } + EOF + test_write_lines x x x A B C D x E F G >file.txt && + git add file.c file.txt && + GIT_AUTHOR_NAME=Commit_2 git commit -m Commit_2 +' + +test_expect_success 'blame uses Myers diff algorithm by default' ' + cat >expected <<-\EOF && + Commit_2 int g(size_t u) + Commit_1 { + Commit_2 while (u < 30) + Commit_1 { + Commit_2 u++; + Commit_1 } + Commit_2 return u; + Commit_1 } + Commit_1 + Commit_2 int h(int x, int y, int z) + Commit_1 { + Commit_2 if (z == 0) + Commit_1 { + Commit_2 return x; + Commit_1 } + Commit_2 return y; + Commit_1 } + EOF + + git blame file.c >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts && + sed -e "s/ *$//g" without_varying_parts >actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --diff-algorithm option' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git blame file.c --diff-algorithm histogram >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts && + sed -e "s/ *$//g" without_varying_parts >actual && + test_cmp expected actual +' + +test_expect_success 'blame honors diff.algorithm config variable' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git -c diff.algorithm=histogram blame file.c >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ + -e "s/ *$//g" output >actual && + test_cmp expected actual +' + +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' ' + cat >expected <<-\EOF && + Commit_1 int g(size_t u) + Commit_1 { + Commit_1 while (u < 30) + Commit_1 { + Commit_1 u++; + Commit_1 } + Commit_1 return u; + Commit_1 } + Commit_2 + Commit_2 int h(int x, int y, int z) + Commit_2 { + Commit_2 if (z == 0) + Commit_2 { + Commit_2 return x; + Commit_2 } + Commit_2 return y; + Commit_2 } + EOF + + git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ + -e "s/ *$//g" output >actual && + test_cmp expected actual +' + +test_expect_success 'blame honors --minimal option' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_1 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual && + test_cmp expected actual +' + +test_expect_success 'blame respects the order of diff options' ' + cat >expected <<-\EOF && + Commit_1 x + Commit_1 x + Commit_1 x + Commit_2 A + Commit_2 B + Commit_2 C + Commit_2 D + Commit_2 x + Commit_2 E + Commit_2 F + Commit_2 G + EOF + + git blame file.txt --minimal --diff-algorithm myers >output && + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual && + test_cmp expected actual +' + +test_done -- gitgitgadget ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v6 0/2] blame: make diff algorithm configurable 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget @ 2025-11-17 14:13 ` Phillip Wood 2025-11-17 18:24 ` Junio C Hamano 3 siblings, 0 replies; 32+ messages in thread From: Phillip Wood @ 2025-11-17 14:13 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget, git; +Cc: Elijah Newren, Antonin Delpeuch Hi Antonin On 17/11/2025 08:04, Antonin Delpeuch via GitGitGadget wrote: > Changes since v5: > > * add back /* clear out previous settings */ comments > * remove whitespace in bash output redirection Thanks for re-rolling, the range-diff below looks good. Being able to configure the diff algorithm for "git blame" is a nice addition, thanks for working on it. Phillip > Antonin Delpeuch (2): > xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK > blame: make diff algorithm configurable > > Documentation/diff-algorithm-option.adoc | 20 +++ > Documentation/diff-options.adoc | 21 +-- > Documentation/git-blame.adoc | 2 + > builtin/blame.c | 52 +++++- > diff.c | 1 - > merge-ort.c | 1 - > t/meson.build | 1 + > t/t8015-blame-diff-algorithm.sh | 203 +++++++++++++++++++++++ > xdiff/xdiff.h | 2 +- > 9 files changed, 279 insertions(+), 24 deletions(-) > create mode 100644 Documentation/diff-algorithm-option.adoc > create mode 100755 t/t8015-blame-diff-algorithm.sh > > > base-commit: 4253630c6f07a4bdcc9aa62a50e26a4d466219d1 > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2075%2Fwetneb%2Fblame_respects_diff_algorithm-v6 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2075/wetneb/blame_respects_diff_algorithm-v6 > Pull-Request: https://github.com/git/git/pull/2075 > > Range-diff vs v5: > > 1: e81a5d2bd2 ! 1: 4846715436 xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK > @@ Commit message > > ## diff.c ## > @@ diff.c: static int set_diff_algorithm(struct diff_options *opts, > - if (value < 0) > return -1; > > -- /* clear out previous settings */ > + /* clear out previous settings */ > - DIFF_XDL_CLR(opts, NEED_MINIMAL); > opts->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; > opts->xdl_opts |= value; > @@ diff.c: static int set_diff_algorithm(struct diff_options *opts, > > ## merge-ort.c ## > @@ merge-ort.c: int parse_merge_opt(struct merge_options *opt, const char *s) > - long value = parse_algorithm_value(arg); > if (value < 0) > return -1; > -- /* clear out previous settings */ > + /* clear out previous settings */ > - DIFF_XDL_CLR(opt, NEED_MINIMAL); > opt->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK; > opt->xdl_opts |= value; > 2: 60015bbada ! 2: c477b87cc6 blame: make diff algorithm configurable > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_1 } > + EOF > + > -+ git blame file.c > output && > -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > -+ sed -e "s/ *$//g" without_varying_parts > actual && > ++ git blame file.c >output && > ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts && > ++ sed -e "s/ *$//g" without_varying_parts >actual && > + test_cmp expected actual > +' > + > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_2 } > + EOF > + > -+ git blame file.c --diff-algorithm histogram > output && > -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > without_varying_parts && > -+ sed -e "s/ *$//g" without_varying_parts > actual && > ++ git blame file.c --diff-algorithm histogram >output && > ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >without_varying_parts && > ++ sed -e "s/ *$//g" without_varying_parts >actual && > + test_cmp expected actual > +' > + > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_2 } > + EOF > + > -+ git -c diff.algorithm=histogram blame file.c > output && > ++ git -c diff.algorithm=histogram blame file.c >output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ > -+ -e "s/ *$//g" output > actual && > ++ -e "s/ *$//g" output >actual && > + test_cmp expected actual > +' > + > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_2 } > + EOF > + > -+ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram > output && > ++ git -c diff.algorithm=myers blame file.c --diff-algorithm histogram >output && > + sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" \ > -+ -e "s/ *$//g" output > actual && > ++ -e "s/ *$//g" output >actual && > + test_cmp expected actual > +' > + > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_2 G > + EOF > + > -+ git blame file.txt --minimal > output && > -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && > ++ git blame file.txt --minimal >output && > ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual && > + test_cmp expected actual > +' > + > @@ t/t8015-blame-diff-algorithm.sh (new) > + Commit_2 G > + EOF > + > -+ git blame file.txt --minimal --diff-algorithm myers > output && > -+ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output > actual && > ++ git blame file.txt --minimal --diff-algorithm myers >output && > ++ sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" output >actual && > + test_cmp expected actual > +' > + > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v6 0/2] blame: make diff algorithm configurable 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget ` (2 preceding siblings ...) 2025-11-17 14:13 ` [PATCH v6 0/2] " Phillip Wood @ 2025-11-17 18:24 ` Junio C Hamano 3 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2025-11-17 18:24 UTC (permalink / raw) To: Antonin Delpeuch via GitGitGadget Cc: git, Elijah Newren, Phillip Wood, Antonin Delpeuch "Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes: > Changes since v5: > > * add back /* clear out previous settings */ comments > * remove whitespace in bash output redirection > > Antonin Delpeuch (2): > xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK > blame: make diff algorithm configurable Will queue. Thanks! ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2025-11-17 18:24 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-10-20 16:05 ` Junio C Hamano 2025-10-22 9:37 ` Antonin Delpeuch 2025-10-22 20:39 ` Junio C Hamano 2025-10-23 16:03 ` Phillip Wood 2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget 2025-10-28 15:22 ` Junio C Hamano 2025-10-28 16:00 ` Antonin Delpeuch 2025-10-28 21:14 ` [PATCH v3] " Antonin Delpeuch via GitGitGadget 2025-10-29 10:16 ` Phillip Wood 2025-10-29 18:46 ` Junio C Hamano 2025-10-30 9:22 ` Antonin Delpeuch 2025-10-30 10:47 ` Phillip Wood 2025-11-01 21:57 ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget 2025-11-01 21:57 ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget 2025-11-03 14:32 ` Phillip Wood 2025-11-01 21:57 ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-11-03 14:32 ` Phillip Wood 2025-11-03 16:15 ` Junio C Hamano 2025-11-06 20:29 ` Junio C Hamano 2025-11-06 22:41 ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget 2025-11-06 22:41 ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget 2025-11-07 15:52 ` Junio C Hamano 2025-11-06 22:41 ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-11-07 15:57 ` Junio C Hamano 2025-11-07 15:49 ` [PATCH v5 0/2] " Phillip Wood 2025-11-17 1:12 ` Junio C Hamano 2025-11-17 8:04 ` [PATCH v6 " Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget 2025-11-17 8:04 ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget 2025-11-17 14:13 ` [PATCH v6 0/2] " Phillip Wood 2025-11-17 18:24 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).