git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Elijah Newren <newren@gmail.com>,
	 Phillip Wood <phillip.wood123@gmail.com>,
	 Antonin Delpeuch <antonin@delpeuch.eu>
Subject: Re: [PATCH v2] blame: make diff algorithm configurable
Date: Tue, 28 Oct 2025 08:22:44 -0700	[thread overview]
Message-ID: <xmqqjz0fdpa3.fsf@gitster.g> (raw)
In-Reply-To: <pull.2075.v2.git.git.1761658643278.gitgitgadget@gmail.com> (Antonin Delpeuch via GitGitGadget's message of "Tue, 28 Oct 2025 13:37:23 +0000")

"Antonin Delpeuch via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> The diff algorithm used in 'git-blame(1)' is set to 'myers',
> without the possibility to change it aside from the `--minimal` option.

Hmph.  It is very unfortunate that we had --minimal already.  We
should have done --diff-algorithm=<which> instead, but that is way
too late.

> There has been long-standing interest in changing the default diff
> algorithm to "histogram", and Git 3.0 was floated as a possible occasion
> for taking some steps towards that:
>
> https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/

Micronit.  I think the reference to 3.0 only about potentially
breaking backward compatibility by making the family of diff-*
plumbing commands ignore diff.algorithm configuration, and other
usability changes like this one are fair game without having to wait
for 3.0 boundary (the plumbing commands do ignore the configuration
already, so there is nothing we have to wait 3.0 before doing).

>     Changes since v1:
>     
>      * add tests
>      * ignore --diff-algorithm when it is provided before --minimal

Sensible.

I presume the reverse is true, i.e. giving "--minimal" and then
"--diff-algorithm=histogram" in this order would make "histogram"
survive, in other words, the usual "last one wins" rule is applied?

> +static int blame_diff_algorithm_minimal(const struct option *option,
> +					const char *arg, int unset)
> +{
> +	int *opt = option->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +	BUG_ON_OPT_ARG(arg);
> +
> +	*opt &= ~XDF_DIFF_ALGORITHM_MASK;
> +	*opt |= XDF_NEED_MINIMAL;
> +
> +	return 0;
> +}

This and diff.c:diff_opt_diff_algorithm_no_arg(), which I think is
the original from which this was copied from, look somewhat
different, but this can afford to be simpler, as it does not have to
parse "--histogram", "--patience", etc., as independent command line
options.  OK.

> +static int blame_diff_algorithm_callback(const struct option *option,
> +					 const char *arg, int unset)
> +{
> +	int *opt = option->value;
> +	long value = parse_algorithm_value(arg);
> +
> +	BUG_ON_OPT_NEG(unset);
> +
> +	if (value < 0)
> +		return error(_("option diff-algorithm accepts \"myers\", "
> +			       "\"minimal\", \"patience\" and \"histogram\""));
> +
> +	*opt &= ~(XDF_NEED_MINIMAL | XDF_DIFF_ALGORITHM_MASK);
> +	*opt |= value;
> +
> +	return 0;
> +}

Quite straight-forward and sensible.

>  static int is_a_rev(const char *name)
>  {
>  	struct object_id oid;
> @@ -915,10 +960,17 @@ int cmd_blame(int argc,
>  		OPT_BIT('s', NULL, &output_option, N_("suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
>  		OPT_BIT('e', "show-email", &output_option, N_("show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
>  		OPT_BIT('w', NULL, &xdl_opts, N_("ignore whitespace differences"), XDF_IGNORE_WHITESPACE),
> +		OPT_CALLBACK_F(0, "diff-algorithm", &xdl_opts, N_("<algorithm>"),
> +			       N_("choose a diff algorithm"),
> +			       PARSE_OPT_NONEG, blame_diff_algorithm_callback),
>  		OPT_STRING_LIST(0, "ignore-rev", &ignore_rev_list, N_("rev"), N_("ignore <rev> when blaming")),
>  		OPT_STRING_LIST(0, "ignore-revs-file", &ignore_revs_file_list, N_("file"), N_("ignore revisions from <file>")),
>  		OPT_BIT(0, "color-lines", &output_option, N_("color redundant metadata from previous line differently"), OUTPUT_COLOR_LINE),
>  		OPT_BIT(0, "color-by-age", &output_option, N_("color lines by age"), OUTPUT_SHOW_AGE_WITH_COLOR),
> +		OPT_CALLBACK_F(0, "minimal", &xdl_opts, NULL,
> +			       N_("spend extra cycles to find better match"),
> +			       PARSE_OPT_NONEG | PARSE_OPT_NOARG,
> +			       blame_diff_algorithm_minimal),
>  		OPT_BIT(0, "minimal", &xdl_opts, N_("spend extra cycles to find better match"), XDF_NEED_MINIMAL),

This OPT_BIT() can stay here?  I thought parse_options_check() was
capable of detecting duplicated long-form commands as programming
error, but apparently it does not.  (#leftoverbits) We should look
into teaching parse_options_check() to check duplicated option
names.

>  		OPT_STRING('S', NULL, &revs_file, N_("file"), N_("use revisions from <file> instead of calling git-rev-list")),
>  		OPT_STRING(0, "contents", &contents_from, N_("file"), N_("use <file>'s contents as the final image")),
> ...
> +test_expect_success 'blame honors --minimal option' '
> +	cat >expected <<-\EOF &&
> +	Initial
> +	Initial
> +	Initial
> +	Second
> +	Second
> +	Second
> +	Second
> +	Initial
> +	Second
> +	Second
> +	Second
> +	EOF
> +
> +	git blame file.txt --minimal | \
> +		grep --only-matching -e Initial -e Second > actual &&
> +	test_cmp expected actual
> +'

Do we need to test combination of configuration variables and
command line options (to verify that options trump configuration),
or two command line options (to verify that the last one wins)?

When xdiff/ part of the system gets improved, the above expected
patterns may have to change, these tests may fail.  Whoever updates
the diff algorithm to cause such a failure has to tell between a
genuine _bug_ in their update to diff implementation and the test
expecting a suboptimal result based on the behaviour of the diff
algorithm before their improvement.  And for that, they need to
debug these tests.  But I suspect that these tests will probably be
very difficult to debug, as it is almost impossible to see which
line in the original each of these lines correspond to.

I guess that's inevitable, and we'll cross that bridge when it
becomes necessary.

Thanks, will queue, but I do find the leftover --minimal bit
disturbing.



  reply	other threads:[~2025-10-28 15:22 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-20 14:56 [PATCH] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-10-20 16:05 ` Junio C Hamano
2025-10-22  9:37   ` Antonin Delpeuch
2025-10-22 20:39     ` Junio C Hamano
2025-10-23 16:03 ` Phillip Wood
2025-10-28 13:37 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2025-10-28 15:22   ` Junio C Hamano [this message]
2025-10-28 16:00     ` Antonin Delpeuch
2025-10-28 21:14   ` [PATCH v3] " Antonin Delpeuch via GitGitGadget
2025-10-29 10:16     ` Phillip Wood
2025-10-29 18:46       ` Junio C Hamano
2025-10-30  9:22       ` Antonin Delpeuch
2025-10-30 10:47         ` Phillip Wood
2025-11-01 21:57     ` [PATCH v4 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-01 21:57       ` [PATCH v4 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-03 14:32         ` Phillip Wood
2025-11-01 21:57       ` [PATCH v4 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-03 14:32         ` Phillip Wood
2025-11-03 16:15           ` Junio C Hamano
2025-11-06 20:29             ` Junio C Hamano
2025-11-06 22:41       ` [PATCH v5 0/2] " Antonin Delpeuch via GitGitGadget
2025-11-06 22:41         ` [PATCH v5 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-07 15:52           ` Junio C Hamano
2025-11-06 22:41         ` [PATCH v5 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-07 15:57           ` Junio C Hamano
2025-11-07 15:49         ` [PATCH v5 0/2] " Phillip Wood
2025-11-17  1:12           ` Junio C Hamano
2025-11-17  8:04         ` [PATCH v6 " Antonin Delpeuch via GitGitGadget
2025-11-17  8:04           ` [PATCH v6 1/2] xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK Antonin Delpeuch via GitGitGadget
2025-11-17  8:04           ` [PATCH v6 2/2] blame: make diff algorithm configurable Antonin Delpeuch via GitGitGadget
2025-11-17 14:13           ` [PATCH v6 0/2] " Phillip Wood
2025-11-17 18:24           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqjz0fdpa3.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=antonin@delpeuch.eu \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).