git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Phillip Wood <phillip.wood123@gmail.com>
To: Antonin Delpeuch via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Antonin Delpeuch <antonin@delpeuch.eu>
Subject: Re: [PATCH v2] merge-file: add --diff-algorithm option
Date: Tue, 21 Nov 2023 14:58:50 +0000	[thread overview]
Message-ID: <4e06a674-e7b6-4be4-8f69-a8e83cce1bbb@gmail.com> (raw)
In-Reply-To: <pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com>

Hi Antonin

On 20/11/2023 19:18, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
> 
> This makes it possible to use other diff algorithms than the 'myers'
> default algorithm, when using the 'git merge-file' command. This helps
> avoid spurious conflicts by selecting a more recent algorithm such as
> 'histogram', for instance when using 'git merge-file' as part of a custom
> merge driver.
> 
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>

This version looks good to me. Thanks for adding the tests and well done 
for finding a test case that shows the benefits of changing the diff 
algorithm so clearly.

For future reference note that the custom on this list is not to add 
"Reviewed-by:" unless the reviewer explicitly suggests it. In this case 
I'm happy for it to be left as is.

Best Wishes

Phillip

> ---
>      merge-file: add --diff-algorithm option
>      
>      Changes since v1:
>      
>       * improve commit message to mention the use case of custom merge
>         drivers
>       * improve documentation to show available options and recommend
>         switching to "histogram"
>       * add tests
>      
>      I have left out:
>      
>       * switching the default to "histogram", because it should only be done
>         in a subsequent release
>       * adding a configuration variable to control this option, because I was
>         not sure how to call it. Perhaps "merge-file.diffAlgorithm"?
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1606%2Fwetneb%2Fmerge_file_configurable_diff_algorithm-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v2
> Pull-Request: https://github.com/git/git/pull/1606
> 
> Range-diff vs v1:
> 
>   1:  4aa453e30be ! 1:  842b5abf33c merge-file: add --diff-algorithm option
>       @@ Commit message
>            merge-file: add --diff-algorithm option
>        
>            This makes it possible to use other diff algorithms than the 'myers'
>       -    default algorithm, when using the 'git merge-file' command.
>       +    default algorithm, when using the 'git merge-file' command. This helps
>       +    avoid spurious conflicts by selecting a more recent algorithm such as
>       +    'histogram', for instance when using 'git merge-file' as part of a custom
>       +    merge driver.
>        
>            Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
>       +    Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>        
>         ## Documentation/git-merge-file.txt ##
>        @@ Documentation/git-merge-file.txt: object store and the object ID of its blob is written to standard output.
>         	Instead of leaving conflicts in the file, resolve conflicts
>         	favouring our (or their or both) side of the lines.
>         
>       -+--diff-algorithm <algorithm>::
>       -+	Use a different diff algorithm while merging, which can help
>       ++--diff-algorithm={patience|minimal|histogram|myers}::
>       ++	Use a different diff algorithm while merging. The current default is "myers",
>       ++	but selecting more recent algorithm such as "histogram" can help
>        +	avoid mismerges that occur due to unimportant matching lines
>       -+	(such as braces from distinct functions).  See also
>       ++	(such as braces from distinct functions). See also
>        +	linkgit:git-diff[1] `--diff-algorithm`.
>         
>         EXAMPLES
>       @@ builtin/merge-file.c: int cmd_merge_file(int argc, const char **argv, const char
>         		OPT_INTEGER(0, "marker-size", &xmp.marker_size,
>         			    N_("for conflicts, use this marker size")),
>         		OPT__QUIET(&quiet, N_("do not warn about conflicts")),
>       +
>       + ## t/t6403-merge-file.sh ##
>       +@@ t/t6403-merge-file.sh: test_expect_success 'setup' '
>       + 	deduxit me super semitas jusitiae,
>       + 	EOF
>       +
>       +-	printf "propter nomen suum." >>new4.txt
>       ++	printf "propter nomen suum." >>new4.txt &&
>       ++
>       ++	cat >base.c <<-\EOF &&
>       ++	int f(int x, int y)
>       ++	{
>       ++		if (x == 0)
>       ++		{
>       ++			return y;
>       ++		}
>       ++		return x;
>       ++	}
>       ++
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u < 30)
>       ++		{
>       ++			u++;
>       ++		}
>       ++		return u;
>       ++	}
>       ++	EOF
>       ++
>       ++	cat >ours.c <<-\EOF &&
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u < 30)
>       ++		{
>       ++			u++;
>       ++		}
>       ++		return u;
>       ++	}
>       ++
>       ++	int h(int x, int y, int z)
>       ++	{
>       ++		if (z == 0)
>       ++		{
>       ++			return x;
>       ++		}
>       ++		return y;
>       ++	}
>       ++	EOF
>       ++
>       ++	cat >theirs.c <<-\EOF
>       ++	int f(int x, int y)
>       ++	{
>       ++		if (x == 0)
>       ++		{
>       ++			return y;
>       ++		}
>       ++		return x;
>       ++	}
>       ++
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u > 34)
>       ++		{
>       ++			u--;
>       ++		}
>       ++		return u;
>       ++	}
>       ++	EOF
>       + '
>       +
>       + test_expect_success 'merge with no changes' '
>       +@@ t/t6403-merge-file.sh: test_expect_success '--object-id fails without repository' '
>       + 	grep "not a git repository" err
>       + '
>       +
>       ++test_expect_success 'merging C files with "myers" diff algorithm creates some spurious conflicts' '
>       ++	cat >expect.c <<-\EOF &&
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u < 30)
>       ++		{
>       ++			u++;
>       ++		}
>       ++		return u;
>       ++	}
>       ++
>       ++	int h(int x, int y, int z)
>       ++	{
>       ++	<<<<<<< ours.c
>       ++		if (z == 0)
>       ++	||||||| base.c
>       ++		while (u < 30)
>       ++	=======
>       ++		while (u > 34)
>       ++	>>>>>>> theirs.c
>       ++		{
>       ++	<<<<<<< ours.c
>       ++			return x;
>       ++	||||||| base.c
>       ++			u++;
>       ++	=======
>       ++			u--;
>       ++	>>>>>>> theirs.c
>       ++		}
>       ++		return y;
>       ++	}
>       ++	EOF
>       ++
>       ++	test_must_fail git merge-file -p --diff3 --diff-algorithm myers ours.c base.c theirs.c >myers_output.c &&
>       ++	test_cmp expect.c myers_output.c
>       ++'
>       ++
>       ++test_expect_success 'merging C files with "histogram" diff algorithm avoids some spurious conflicts' '
>       ++	cat >expect.c <<-\EOF &&
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u > 34)
>       ++		{
>       ++			u--;
>       ++		}
>       ++		return u;
>       ++	}
>       ++
>       ++	int h(int x, int y, int z)
>       ++	{
>       ++		if (z == 0)
>       ++		{
>       ++			return x;
>       ++		}
>       ++		return y;
>       ++	}
>       ++	EOF
>       ++
>       ++	git merge-file -p --diff3 --diff-algorithm histogram ours.c base.c theirs.c >histogram_output.c &&
>       ++	test_cmp expect.c histogram_output.c
>       ++'
>       ++
>       + test_done
> 
> 
>   Documentation/git-merge-file.txt |   6 ++
>   builtin/merge-file.c             |  28 +++++++
>   t/t6403-merge-file.sh            | 124 ++++++++++++++++++++++++++++++-
>   3 files changed, 157 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/git-merge-file.txt b/Documentation/git-merge-file.txt
> index 6a081eacb72..71915a00fa4 100644
> --- a/Documentation/git-merge-file.txt
> +++ b/Documentation/git-merge-file.txt
> @@ -92,6 +92,12 @@ object store and the object ID of its blob is written to standard output.
>   	Instead of leaving conflicts in the file, resolve conflicts
>   	favouring our (or their or both) side of the lines.
>   
> +--diff-algorithm={patience|minimal|histogram|myers}::
> +	Use a different diff algorithm while merging. The current default is "myers",
> +	but selecting more recent algorithm such as "histogram" can help
> +	avoid mismerges that occur due to unimportant matching lines
> +	(such as braces from distinct functions). See also
> +	linkgit:git-diff[1] `--diff-algorithm`.
>   
>   EXAMPLES
>   --------
> diff --git a/builtin/merge-file.c b/builtin/merge-file.c
> index 832c93d8d54..1f987334a31 100644
> --- a/builtin/merge-file.c
> +++ b/builtin/merge-file.c
> @@ -1,5 +1,6 @@
>   #include "builtin.h"
>   #include "abspath.h"
> +#include "diff.h"
>   #include "hex.h"
>   #include "object-name.h"
>   #include "object-store.h"
> @@ -28,6 +29,30 @@ static int label_cb(const struct option *opt, const char *arg, int unset)
>   	return 0;
>   }
>   
> +static int set_diff_algorithm(xpparam_t *xpp,
> +			      const char *alg)
> +{
> +	long diff_algorithm = parse_algorithm_value(alg);
> +	if (diff_algorithm < 0)
> +		return -1;
> +	xpp->flags = (xpp->flags & ~XDF_DIFF_ALGORITHM_MASK) | diff_algorithm;
> +	return 0;
> +}
> +
> +static int diff_algorithm_cb(const struct option *opt,
> +				const char *arg, int unset)
> +{
> +	xpparam_t *xpp = opt->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +
> +	if (set_diff_algorithm(xpp, arg))
> +		return error(_("option diff-algorithm accepts \"myers\", "
> +			       "\"minimal\", \"patience\" and \"histogram\""));
> +
> +	return 0;
> +}
> +
>   int cmd_merge_file(int argc, const char **argv, const char *prefix)
>   {
>   	const char *names[3] = { 0 };
> @@ -48,6 +73,9 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
>   			    XDL_MERGE_FAVOR_THEIRS),
>   		OPT_SET_INT(0, "union", &xmp.favor, N_("for conflicts, use a union version"),
>   			    XDL_MERGE_FAVOR_UNION),
> +		OPT_CALLBACK_F(0, "diff-algorithm", &xmp.xpp, N_("<algorithm>"),
> +			     N_("choose a diff algorithm"),
> +			     PARSE_OPT_NONEG, diff_algorithm_cb),
>   		OPT_INTEGER(0, "marker-size", &xmp.marker_size,
>   			    N_("for conflicts, use this marker size")),
>   		OPT__QUIET(&quiet, N_("do not warn about conflicts")),
> diff --git a/t/t6403-merge-file.sh b/t/t6403-merge-file.sh
> index 2c92209ecab..fb872c5a113 100755
> --- a/t/t6403-merge-file.sh
> +++ b/t/t6403-merge-file.sh
> @@ -56,7 +56,67 @@ test_expect_success 'setup' '
>   	deduxit me super semitas jusitiae,
>   	EOF
>   
> -	printf "propter nomen suum." >>new4.txt
> +	printf "propter nomen suum." >>new4.txt &&
> +
> +	cat >base.c <<-\EOF &&
> +	int f(int x, int y)
> +	{
> +		if (x == 0)
> +		{
> +			return y;
> +		}
> +		return x;
> +	}
> +
> +	int g(size_t u)
> +	{
> +		while (u < 30)
> +		{
> +			u++;
> +		}
> +		return u;
> +	}
> +	EOF
> +
> +	cat >ours.c <<-\EOF &&
> +	int g(size_t u)
> +	{
> +		while (u < 30)
> +		{
> +			u++;
> +		}
> +		return u;
> +	}
> +
> +	int h(int x, int y, int z)
> +	{
> +		if (z == 0)
> +		{
> +			return x;
> +		}
> +		return y;
> +	}
> +	EOF
> +
> +	cat >theirs.c <<-\EOF
> +	int f(int x, int y)
> +	{
> +		if (x == 0)
> +		{
> +			return y;
> +		}
> +		return x;
> +	}
> +
> +	int g(size_t u)
> +	{
> +		while (u > 34)
> +		{
> +			u--;
> +		}
> +		return u;
> +	}
> +	EOF
>   '
>   
>   test_expect_success 'merge with no changes' '
> @@ -447,4 +507,66 @@ test_expect_success '--object-id fails without repository' '
>   	grep "not a git repository" err
>   '
>   
> +test_expect_success 'merging C files with "myers" diff algorithm creates some spurious conflicts' '
> +	cat >expect.c <<-\EOF &&
> +	int g(size_t u)
> +	{
> +		while (u < 30)
> +		{
> +			u++;
> +		}
> +		return u;
> +	}
> +
> +	int h(int x, int y, int z)
> +	{
> +	<<<<<<< ours.c
> +		if (z == 0)
> +	||||||| base.c
> +		while (u < 30)
> +	=======
> +		while (u > 34)
> +	>>>>>>> theirs.c
> +		{
> +	<<<<<<< ours.c
> +			return x;
> +	||||||| base.c
> +			u++;
> +	=======
> +			u--;
> +	>>>>>>> theirs.c
> +		}
> +		return y;
> +	}
> +	EOF
> +
> +	test_must_fail git merge-file -p --diff3 --diff-algorithm myers ours.c base.c theirs.c >myers_output.c &&
> +	test_cmp expect.c myers_output.c
> +'
> +
> +test_expect_success 'merging C files with "histogram" diff algorithm avoids some spurious conflicts' '
> +	cat >expect.c <<-\EOF &&
> +	int g(size_t u)
> +	{
> +		while (u > 34)
> +		{
> +			u--;
> +		}
> +		return u;
> +	}
> +
> +	int h(int x, int y, int z)
> +	{
> +		if (z == 0)
> +		{
> +			return x;
> +		}
> +		return y;
> +	}
> +	EOF
> +
> +	git merge-file -p --diff3 --diff-algorithm histogram ours.c base.c theirs.c >histogram_output.c &&
> +	test_cmp expect.c histogram_output.c
> +'
> +
>   test_done
> 
> base-commit: 98009afd24e2304bf923a64750340423473809ff

      reply	other threads:[~2023-11-21 14:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-08 21:54 [PATCH] merge-file: add --diff-algorithm option Antonin Delpeuch via GitGitGadget
2023-11-17 21:42 ` Antonin Delpeuch
2023-11-19 16:43   ` Phillip Wood
2023-11-19 19:29     ` Antonin Delpeuch
2023-11-19 23:30     ` Junio C Hamano
2023-11-19 16:42 ` Phillip Wood
2023-11-20 19:18 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2023-11-21 14:58   ` Phillip Wood [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4e06a674-e7b6-4be4-8f69-a8e83cce1bbb@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=antonin@delpeuch.eu \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).