From: Phillip Wood <phillip.wood123@gmail.com>
To: Antonin Delpeuch via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: Antonin Delpeuch <antonin@delpeuch.eu>
Subject: Re: [PATCH v2] merge-file: add --diff-algorithm option
Date: Tue, 21 Nov 2023 14:58:50 +0000 [thread overview]
Message-ID: <4e06a674-e7b6-4be4-8f69-a8e83cce1bbb@gmail.com> (raw)
In-Reply-To: <pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com>
Hi Antonin
On 20/11/2023 19:18, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> This makes it possible to use other diff algorithms than the 'myers'
> default algorithm, when using the 'git merge-file' command. This helps
> avoid spurious conflicts by selecting a more recent algorithm such as
> 'histogram', for instance when using 'git merge-file' as part of a custom
> merge driver.
>
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>
This version looks good to me. Thanks for adding the tests and well done
for finding a test case that shows the benefits of changing the diff
algorithm so clearly.
For future reference note that the custom on this list is not to add
"Reviewed-by:" unless the reviewer explicitly suggests it. In this case
I'm happy for it to be left as is.
Best Wishes
Phillip
> ---
> merge-file: add --diff-algorithm option
>
> Changes since v1:
>
> * improve commit message to mention the use case of custom merge
> drivers
> * improve documentation to show available options and recommend
> switching to "histogram"
> * add tests
>
> I have left out:
>
> * switching the default to "histogram", because it should only be done
> in a subsequent release
> * adding a configuration variable to control this option, because I was
> not sure how to call it. Perhaps "merge-file.diffAlgorithm"?
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1606%2Fwetneb%2Fmerge_file_configurable_diff_algorithm-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v2
> Pull-Request: https://github.com/git/git/pull/1606
>
> Range-diff vs v1:
>
> 1: 4aa453e30be ! 1: 842b5abf33c merge-file: add --diff-algorithm option
> @@ Commit message
> merge-file: add --diff-algorithm option
>
> This makes it possible to use other diff algorithms than the 'myers'
> - default algorithm, when using the 'git merge-file' command.
> + default algorithm, when using the 'git merge-file' command. This helps
> + avoid spurious conflicts by selecting a more recent algorithm such as
> + 'histogram', for instance when using 'git merge-file' as part of a custom
> + merge driver.
>
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> + Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> ## Documentation/git-merge-file.txt ##
> @@ Documentation/git-merge-file.txt: object store and the object ID of its blob is written to standard output.
> Instead of leaving conflicts in the file, resolve conflicts
> favouring our (or their or both) side of the lines.
>
> -+--diff-algorithm <algorithm>::
> -+ Use a different diff algorithm while merging, which can help
> ++--diff-algorithm={patience|minimal|histogram|myers}::
> ++ Use a different diff algorithm while merging. The current default is "myers",
> ++ but selecting more recent algorithm such as "histogram" can help
> + avoid mismerges that occur due to unimportant matching lines
> -+ (such as braces from distinct functions). See also
> ++ (such as braces from distinct functions). See also
> + linkgit:git-diff[1] `--diff-algorithm`.
>
> EXAMPLES
> @@ builtin/merge-file.c: int cmd_merge_file(int argc, const char **argv, const char
> OPT_INTEGER(0, "marker-size", &xmp.marker_size,
> N_("for conflicts, use this marker size")),
> OPT__QUIET(&quiet, N_("do not warn about conflicts")),
> +
> + ## t/t6403-merge-file.sh ##
> +@@ t/t6403-merge-file.sh: test_expect_success 'setup' '
> + deduxit me super semitas jusitiae,
> + EOF
> +
> +- printf "propter nomen suum." >>new4.txt
> ++ printf "propter nomen suum." >>new4.txt &&
> ++
> ++ cat >base.c <<-\EOF &&
> ++ int f(int x, int y)
> ++ {
> ++ if (x == 0)
> ++ {
> ++ return y;
> ++ }
> ++ return x;
> ++ }
> ++
> ++ int g(size_t u)
> ++ {
> ++ while (u < 30)
> ++ {
> ++ u++;
> ++ }
> ++ return u;
> ++ }
> ++ EOF
> ++
> ++ cat >ours.c <<-\EOF &&
> ++ int g(size_t u)
> ++ {
> ++ while (u < 30)
> ++ {
> ++ u++;
> ++ }
> ++ return u;
> ++ }
> ++
> ++ int h(int x, int y, int z)
> ++ {
> ++ if (z == 0)
> ++ {
> ++ return x;
> ++ }
> ++ return y;
> ++ }
> ++ EOF
> ++
> ++ cat >theirs.c <<-\EOF
> ++ int f(int x, int y)
> ++ {
> ++ if (x == 0)
> ++ {
> ++ return y;
> ++ }
> ++ return x;
> ++ }
> ++
> ++ int g(size_t u)
> ++ {
> ++ while (u > 34)
> ++ {
> ++ u--;
> ++ }
> ++ return u;
> ++ }
> ++ EOF
> + '
> +
> + test_expect_success 'merge with no changes' '
> +@@ t/t6403-merge-file.sh: test_expect_success '--object-id fails without repository' '
> + grep "not a git repository" err
> + '
> +
> ++test_expect_success 'merging C files with "myers" diff algorithm creates some spurious conflicts' '
> ++ cat >expect.c <<-\EOF &&
> ++ int g(size_t u)
> ++ {
> ++ while (u < 30)
> ++ {
> ++ u++;
> ++ }
> ++ return u;
> ++ }
> ++
> ++ int h(int x, int y, int z)
> ++ {
> ++ <<<<<<< ours.c
> ++ if (z == 0)
> ++ ||||||| base.c
> ++ while (u < 30)
> ++ =======
> ++ while (u > 34)
> ++ >>>>>>> theirs.c
> ++ {
> ++ <<<<<<< ours.c
> ++ return x;
> ++ ||||||| base.c
> ++ u++;
> ++ =======
> ++ u--;
> ++ >>>>>>> theirs.c
> ++ }
> ++ return y;
> ++ }
> ++ EOF
> ++
> ++ test_must_fail git merge-file -p --diff3 --diff-algorithm myers ours.c base.c theirs.c >myers_output.c &&
> ++ test_cmp expect.c myers_output.c
> ++'
> ++
> ++test_expect_success 'merging C files with "histogram" diff algorithm avoids some spurious conflicts' '
> ++ cat >expect.c <<-\EOF &&
> ++ int g(size_t u)
> ++ {
> ++ while (u > 34)
> ++ {
> ++ u--;
> ++ }
> ++ return u;
> ++ }
> ++
> ++ int h(int x, int y, int z)
> ++ {
> ++ if (z == 0)
> ++ {
> ++ return x;
> ++ }
> ++ return y;
> ++ }
> ++ EOF
> ++
> ++ git merge-file -p --diff3 --diff-algorithm histogram ours.c base.c theirs.c >histogram_output.c &&
> ++ test_cmp expect.c histogram_output.c
> ++'
> ++
> + test_done
>
>
> Documentation/git-merge-file.txt | 6 ++
> builtin/merge-file.c | 28 +++++++
> t/t6403-merge-file.sh | 124 ++++++++++++++++++++++++++++++-
> 3 files changed, 157 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-merge-file.txt b/Documentation/git-merge-file.txt
> index 6a081eacb72..71915a00fa4 100644
> --- a/Documentation/git-merge-file.txt
> +++ b/Documentation/git-merge-file.txt
> @@ -92,6 +92,12 @@ object store and the object ID of its blob is written to standard output.
> Instead of leaving conflicts in the file, resolve conflicts
> favouring our (or their or both) side of the lines.
>
> +--diff-algorithm={patience|minimal|histogram|myers}::
> + Use a different diff algorithm while merging. The current default is "myers",
> + but selecting more recent algorithm such as "histogram" can help
> + avoid mismerges that occur due to unimportant matching lines
> + (such as braces from distinct functions). See also
> + linkgit:git-diff[1] `--diff-algorithm`.
>
> EXAMPLES
> --------
> diff --git a/builtin/merge-file.c b/builtin/merge-file.c
> index 832c93d8d54..1f987334a31 100644
> --- a/builtin/merge-file.c
> +++ b/builtin/merge-file.c
> @@ -1,5 +1,6 @@
> #include "builtin.h"
> #include "abspath.h"
> +#include "diff.h"
> #include "hex.h"
> #include "object-name.h"
> #include "object-store.h"
> @@ -28,6 +29,30 @@ static int label_cb(const struct option *opt, const char *arg, int unset)
> return 0;
> }
>
> +static int set_diff_algorithm(xpparam_t *xpp,
> + const char *alg)
> +{
> + long diff_algorithm = parse_algorithm_value(alg);
> + if (diff_algorithm < 0)
> + return -1;
> + xpp->flags = (xpp->flags & ~XDF_DIFF_ALGORITHM_MASK) | diff_algorithm;
> + return 0;
> +}
> +
> +static int diff_algorithm_cb(const struct option *opt,
> + const char *arg, int unset)
> +{
> + xpparam_t *xpp = opt->value;
> +
> + BUG_ON_OPT_NEG(unset);
> +
> + if (set_diff_algorithm(xpp, arg))
> + return error(_("option diff-algorithm accepts \"myers\", "
> + "\"minimal\", \"patience\" and \"histogram\""));
> +
> + return 0;
> +}
> +
> int cmd_merge_file(int argc, const char **argv, const char *prefix)
> {
> const char *names[3] = { 0 };
> @@ -48,6 +73,9 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
> XDL_MERGE_FAVOR_THEIRS),
> OPT_SET_INT(0, "union", &xmp.favor, N_("for conflicts, use a union version"),
> XDL_MERGE_FAVOR_UNION),
> + OPT_CALLBACK_F(0, "diff-algorithm", &xmp.xpp, N_("<algorithm>"),
> + N_("choose a diff algorithm"),
> + PARSE_OPT_NONEG, diff_algorithm_cb),
> OPT_INTEGER(0, "marker-size", &xmp.marker_size,
> N_("for conflicts, use this marker size")),
> OPT__QUIET(&quiet, N_("do not warn about conflicts")),
> diff --git a/t/t6403-merge-file.sh b/t/t6403-merge-file.sh
> index 2c92209ecab..fb872c5a113 100755
> --- a/t/t6403-merge-file.sh
> +++ b/t/t6403-merge-file.sh
> @@ -56,7 +56,67 @@ test_expect_success 'setup' '
> deduxit me super semitas jusitiae,
> EOF
>
> - printf "propter nomen suum." >>new4.txt
> + printf "propter nomen suum." >>new4.txt &&
> +
> + cat >base.c <<-\EOF &&
> + int f(int x, int y)
> + {
> + if (x == 0)
> + {
> + return y;
> + }
> + return x;
> + }
> +
> + int g(size_t u)
> + {
> + while (u < 30)
> + {
> + u++;
> + }
> + return u;
> + }
> + EOF
> +
> + cat >ours.c <<-\EOF &&
> + int g(size_t u)
> + {
> + while (u < 30)
> + {
> + u++;
> + }
> + return u;
> + }
> +
> + int h(int x, int y, int z)
> + {
> + if (z == 0)
> + {
> + return x;
> + }
> + return y;
> + }
> + EOF
> +
> + cat >theirs.c <<-\EOF
> + int f(int x, int y)
> + {
> + if (x == 0)
> + {
> + return y;
> + }
> + return x;
> + }
> +
> + int g(size_t u)
> + {
> + while (u > 34)
> + {
> + u--;
> + }
> + return u;
> + }
> + EOF
> '
>
> test_expect_success 'merge with no changes' '
> @@ -447,4 +507,66 @@ test_expect_success '--object-id fails without repository' '
> grep "not a git repository" err
> '
>
> +test_expect_success 'merging C files with "myers" diff algorithm creates some spurious conflicts' '
> + cat >expect.c <<-\EOF &&
> + int g(size_t u)
> + {
> + while (u < 30)
> + {
> + u++;
> + }
> + return u;
> + }
> +
> + int h(int x, int y, int z)
> + {
> + <<<<<<< ours.c
> + if (z == 0)
> + ||||||| base.c
> + while (u < 30)
> + =======
> + while (u > 34)
> + >>>>>>> theirs.c
> + {
> + <<<<<<< ours.c
> + return x;
> + ||||||| base.c
> + u++;
> + =======
> + u--;
> + >>>>>>> theirs.c
> + }
> + return y;
> + }
> + EOF
> +
> + test_must_fail git merge-file -p --diff3 --diff-algorithm myers ours.c base.c theirs.c >myers_output.c &&
> + test_cmp expect.c myers_output.c
> +'
> +
> +test_expect_success 'merging C files with "histogram" diff algorithm avoids some spurious conflicts' '
> + cat >expect.c <<-\EOF &&
> + int g(size_t u)
> + {
> + while (u > 34)
> + {
> + u--;
> + }
> + return u;
> + }
> +
> + int h(int x, int y, int z)
> + {
> + if (z == 0)
> + {
> + return x;
> + }
> + return y;
> + }
> + EOF
> +
> + git merge-file -p --diff3 --diff-algorithm histogram ours.c base.c theirs.c >histogram_output.c &&
> + test_cmp expect.c histogram_output.c
> +'
> +
> test_done
>
> base-commit: 98009afd24e2304bf923a64750340423473809ff
prev parent reply other threads:[~2023-11-21 14:58 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-08 21:54 [PATCH] merge-file: add --diff-algorithm option Antonin Delpeuch via GitGitGadget
2023-11-17 21:42 ` Antonin Delpeuch
2023-11-19 16:43 ` Phillip Wood
2023-11-19 19:29 ` Antonin Delpeuch
2023-11-19 23:30 ` Junio C Hamano
2023-11-19 16:42 ` Phillip Wood
2023-11-20 19:18 ` [PATCH v2] " Antonin Delpeuch via GitGitGadget
2023-11-21 14:58 ` Phillip Wood [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4e06a674-e7b6-4be4-8f69-a8e83cce1bbb@gmail.com \
--to=phillip.wood123@gmail.com \
--cc=antonin@delpeuch.eu \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=phillip.wood@dunelm.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).