From: Phillip Wood <phillip.wood123@gmail.com>
To: Antonin Delpeuch via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>, Antonin Delpeuch <antonin@delpeuch.eu>
Subject: Re: [PATCH v3] blame: make diff algorithm configurable
Date: Wed, 29 Oct 2025 10:16:27 +0000
Message-ID: <fde3dae1-bb11-45e8-9211-50ae003ca497@gmail.com>
In-Reply-To: <pull.2075.v3.git.git.1761686060477.gitgitgadget@gmail.com>

Hi Antonin

On 28/10/2025 21:14, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
> 
> The diff algorithm used in 'git-blame(1)' is set to 'myers',
> without the possibility to change it aside from the `--minimal` option.
> 
> There has been long-standing interest in changing the default diff
> algorithm to "histogram", and Git 3.0 was floated as a possible occasion
> for taking some steps towards that:
> 
> https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/
> 
> As a preparation for this move, it is worth making sure that the diff
> algorithm is configurable where useful.
> 
> Make it configurable in the `git-blame(1)` command by introducing the
> `--diff-algorithm` option and make honor the `diff.algorithm` config
> variable. Keep Myers diff as the default.
> 
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> ---

Apart from a problem with clearing XDF_NEED_MINIMAL (which is really the
fault of a terrible API), this is looking good.

> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -779,6 +779,19 @@ static int git_blame_config(const char *var, const char *value,
>   		}
>   	}
>   
> +	if (!strcmp(var, "diff.algorithm")) {
> +		long diff_algorithm;
> +		if (!value)
> +			return config_error_nonbool(var);
> +		diff_algorithm = parse_algorithm_value(value);
> +		if (diff_algorithm < 0)
> +			return error(_("unknown value for config '%s': %s"),
> +				     var, value);
> +		xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;

Unfortunately XDF_DIFF_ALGORITHM_MASK does not include XDF_NEED_MINIMAL 
so if the user has a config file that looks like
	
	[diff]
		algorithm = minimal
		algorithm = myers

we'll parse it as "minimal" rather than "myers", because the
XDF_NEED_MINIMAL bit set by the first entry is never cleared.
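
To make the problem concrete, here is a small standalone sketch. The flag
values are stand-ins chosen for illustration (the real definitions live in
xdiff/xdiff.h), and parse_value(), set_mask_only() and
set_clearing_minimal() are hypothetical helpers that do not exist in git;
parse_value() only imitates what I'd expect parse_algorithm_value() to
return.

```c
#include <string.h>

/* Stand-in flag values for illustration; the real ones are in xdiff/xdiff.h. */
#define XDF_NEED_MINIMAL (1 << 0)
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)
#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)

/* Hypothetical imitation of parse_algorithm_value(). */
static long parse_value(const char *value)
{
	if (!strcmp(value, "myers"))
		return 0;
	if (!strcmp(value, "minimal"))
		return XDF_NEED_MINIMAL;
	if (!strcmp(value, "patience"))
		return XDF_PATIENCE_DIFF;
	if (!strcmp(value, "histogram"))
		return XDF_HISTOGRAM_DIFF;
	return -1;
}

/* Mirrors the patch: only the bits in XDF_DIFF_ALGORITHM_MASK are cleared. */
static int set_mask_only(int opts, const char *value)
{
	long algo = parse_value(value);

	if (algo < 0)
		return opts; /* unknown value: leave the flags alone */
	opts &= ~XDF_DIFF_ALGORITHM_MASK;
	return opts | (int)algo;
}

/* Clears XDF_NEED_MINIMAL as well before setting the new algorithm. */
static int set_clearing_minimal(int opts, const char *value)
{
	long algo = parse_value(value);

	if (algo < 0)
		return opts; /* unknown value: leave the flags alone */
	opts &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL);
	return opts | (int)algo;
}
```

Feeding "minimal" and then "myers" through set_mask_only() leaves
XDF_NEED_MINIMAL set, whereas the same sequence through
set_clearing_minimal() ends up with no algorithm bits set.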

As we need to reset the diff algorithm in a number of places, I think it
would be best to define a macro

	#define CLEAR_DIFF_ALGORITHM(flags) \
		((flags) &= ~(XDF_DIFF_ALGORITHM_MASK | XDF_NEED_MINIMAL))

and use that wherever we want to reset the algorithm.

> +		xdl_opts |= diff_algorithm;
> +		return 0;
> +	}
> +
>   	if (git_diff_heuristic_config(var, value, cb) < 0)
>   		return -1;
>   	if (userdiff_config(var, value) < 0)
> @@ -824,6 +837,38 @@ static int blame_move_callback(const struct option *option, const char *arg, int
>   	return 0;
>   }
>   
> +static int blame_diff_algorithm_minimal(const struct option *option,
> +					const char *arg, int unset)
> +{
> +	int *opt = option->value;
> +
> +	BUG_ON_OPT_NEG(unset);

This is a change in behavior, as we currently accept "--no-minimal", which
clears XDF_NEED_MINIMAL.

> +	BUG_ON_OPT_ARG(arg);
> +
> +	*opt &= ~XDF_DIFF_ALGORITHM_MASK;

This is correct because we're about to set XDF_NEED_MINIMAL, so it does
not matter that we leave it set here, but it would still be clearer if it
used the new macro I suggested above.

> +	*opt |= XDF_NEED_MINIMAL;
> +	return 0;
> +}
> +
> +static int blame_diff_algorithm_callback(const struct option *option,
> +					 const char *arg, int unset)
> +{
> +	int *opt = option->value;
> +	long value = parse_algorithm_value(arg);
> +
> +	BUG_ON_OPT_NEG(unset);
> +
> +	if (value < 0)
> +		return error(_("option diff-algorithm accepts \"myers\", "
> +			       "\"minimal\", \"patience\" and \"histogram\""));
> +
> +	*opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);

This is correct.

> +	*opt |= value;
> +
> +	return 0;
> +}
> +

> -		OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),
> +		OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
> +			       N_("spend extra cycles to find better match"),

This is just copying the existing text, so it is not a new problem, but I
think it would be better if we said "find a better" rather than "find
better". We should perhaps think about hiding this option now that we
support --diff-algorithm.

> +			       PARSE_OPT_NONEG | PARSE_OPT_NOARG,

As I said above, using PARSE_OPT_NONEG here is a regression.
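
A minimal sketch of one way to keep accepting "--no-minimal", assuming the
callback is registered with PARSE_OPT_NOARG only. The flag values and the
cut-down struct option are stand-ins so that the snippet compiles on its
own, and minimal_callback_sketch() is a hypothetical name. Leaving a
previously selected patience or histogram algorithm in place on
"--no-minimal", as OPT_BIT() did, is also an assumption rather than a
requirement.

```c
#include <stddef.h>

/* Stand-in flag values for illustration; the real ones are in xdiff/xdiff.h. */
#define XDF_NEED_MINIMAL (1 << 0)
#define XDF_PATIENCE_DIFF (1 << 14)
#define XDF_HISTOGRAM_DIFF (1 << 15)
#define XDF_DIFF_ALGORITHM_MASK (XDF_PATIENCE_DIFF | XDF_HISTOGRAM_DIFF)

/* Cut-down stand-in for git's struct option; only the field used here. */
struct option {
	void *value;
};

/* Hypothetical variant of the "minimal" callback that honors negation. */
static int minimal_callback_sketch(const struct option *option,
				   const char *arg, int unset)
{
	int *opt = option->value;

	(void)arg; /* the option takes no argument */

	if (unset) {
		/* "--no-minimal": only drop the bit, as OPT_BIT() used to */
		*opt &= ~XDF_NEED_MINIMAL;
		return 0;
	}

	/* "--minimal": switch away from any other algorithm */
	*opt &= ~XDF_DIFF_ALGORITHM_MASK;
	*opt |= XDF_NEED_MINIMAL;
	return 0;
}
```

With this shape, "--diff-algorithm=histogram --no-minimal" keeps histogram
selected, while a later "--minimal" replaces it.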

> +			       blame_diff_algorithm_minimal),
> diff --git a/t/t8015-blame-diff-algorithm.sh b/t/t8015-blame-diff-algorithm.sh
> new file mode 100755
> index 0000000000..efc4b47ce1
> --- /dev/null
> +++ b/t/t8015-blame-diff-algorithm.sh
> [...]
> +test_expect_success 'blame uses Myers diff algorithm by default for now' '

I'm not sure we need to say "for now" here.

> +	cat >expected <<-\EOF &&
> +	Commit_2 int g(size_t u)
> +	Commit_1 {
> +	Commit_2   while (u < 30)
> +	Commit_1   {
> +	Commit_2     u++;
> +	Commit_1   }
> +	Commit_2   return u;
> +	Commit_1 }
> +	Commit_1
> +	Commit_2 int h(int x, int y, int z)
> +	Commit_1 {
> +	Commit_2   if (z == 0)
> +	Commit_1   {
> +	Commit_2     return x;
> +	Commit_1   }
> +	Commit_2   return y;
> +	Commit_1 }
> +	EOF
> +
> +

There's an extra blank line here.

> +	git blame file.c | \

We don't pipe the output of git commands, as that hides unexpected
failures. Instead, you should redirect the output of git to a file and
then process that file with sed.

> +		sed -e "s/^[^ ]* (\([^ ]*\) [^)]*)/\1/g" | \
> +		sed -e "s/ *$//g" > actual &&

This can be a single process by passing -e twice. It does not really 
matter but neither pattern needs a trailing "g" as they only match once 
within the line.

> +test_expect_success 'blame gives priority to --diff-algorithm over diff.algorithm' '
> +	cat >expected <<-\EOF &&
> +	Commit_1 int g(size_t u)
> +	Commit_1 {
> +	Commit_1   while (u < 30)
> +	Commit_1   {
> +	Commit_1     u++;
> +	Commit_1   }
> +	Commit_1   return u;
> +	Commit_1 }
> +	Commit_2
> +	Commit_2 int h(int x, int y, int z)
> +	Commit_2 {
> +	Commit_2   if (z == 0)
> +	Commit_2   {
> +	Commit_2     return x;
> +	Commit_2   }
> +	Commit_2   return y;
> +	Commit_2 }
> +	EOF
> +
> +	git config diff.algorithm myers &&

You can use test_config() here, which will clear the config setting at
the end of the test. Alternatively, you can save a couple of processes by
using "git -c diff.algorithm=myers blame ...". This sets the config to
the default value, though, so I wonder if it would be better to do

	git -c diff.algorithm=histogram blame --diff-algorithm=myers

instead.

The coverage looks good.

Thanks

Phillip


