git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] range-diff: optionally include merge commits' diffs in the analysis
@ 2024-11-07 17:20 Johannes Schindelin via GitGitGadget
  2024-11-08  3:46 ` Junio C Hamano
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-11-07 17:20 UTC (permalink / raw)
  To: git; +Cc: Philippe Blain, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `git log` command already offers support for including diffs for
merges, via the `--diff-merges=<format>` option.

Let's add corresponding support for `git range-diff`, too. This makes it
more convenient to spot differences between iterations of non-linear
contributions, where so-called "evil merges" are sometimes necessary and
need to be reviewed, too.

In my code reviews, I found the `--diff-merges=first-parent` option
particularly useful.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    Support diff merges option in range diff
    
    The git range-diff command does the same with merge commits as git
    rebase: It ignores them.
    
    However, when comparing branch thickets it can be quite illuminating to
    watch out for inadvertent changes in merge commits, in particular when
    some "evil" merges have been replayed, i.e. merges that needed to
    introduce changes outside of the merge conflicts (e.g. when one branch
    changed a function's signature and another branch introduced a caller of
    said function), in case the replayed merge is no longer "evil" and
    therefore potentially incorrect.
    
    Let's introduce support for the --diff-merges option that is passed
    through to those git log commands.
    
    I had a need for this earlier this year and got it working, leaving the
    GitGitGadget PR in a draft mode. Phil Blain found it and kindly
    nerd-sniped me into readying it for submitting, so say thanks to Phil!

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1734

 Documentation/git-range-diff.txt | 10 +++++++++-
 builtin/range-diff.c             | 11 +++++++++++
 range-diff.c                     | 15 +++++++++++----
 range-diff.h                     |  1 +
 t/t3206-range-diff.sh            | 16 ++++++++++++++++
 5 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
index fbdbe0befeb..a964e856c3c 100644
--- a/Documentation/git-range-diff.txt
+++ b/Documentation/git-range-diff.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
 	[--no-dual-color] [--creation-factor=<factor>]
-	[--left-only | --right-only]
+	[--left-only | --right-only] [--diff-merges=<format>]
 	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
 	[[--] <path>...]
 
@@ -81,6 +81,14 @@ to revert to color all lines according to the outer diff markers
 	Suppress commits that are missing from the second specified range
 	(or the "right range" when using the `<rev1>...<rev2>` format).
 
+--diff-merges=<format>::
+	Instead of ignoring merge commits, generate diffs for them using the
+	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
+	and include them in the comparison.
++
+Note: Some of the formats supported by linkgit:git-log[1] make less sense in
+the context of the `range-diff` command than other formats, so choose wisely!
+
 --[no-]notes[=<ref>]::
 	This flag is passed to the `git log` program
 	(see linkgit:git-log[1]) that generates the patches.
diff --git a/builtin/range-diff.c b/builtin/range-diff.c
index 1b33ab66a7b..e41719e0f0d 100644
--- a/builtin/range-diff.c
+++ b/builtin/range-diff.c
@@ -21,6 +21,7 @@ int cmd_range_diff(int argc,
 {
 	struct diff_options diffopt = { NULL };
 	struct strvec other_arg = STRVEC_INIT;
+	struct strvec diff_merges_arg = STRVEC_INIT;
 	struct range_diff_options range_diff_opts = {
 		.creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
 		.diffopt = &diffopt,
@@ -36,6 +37,9 @@ int cmd_range_diff(int argc,
 		OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
 				  N_("notes"), N_("passed to 'git log'"),
 				  PARSE_OPT_OPTARG),
+		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
+				  N_("style"), N_("passed to 'git log'"),
+				  PARSE_OPT_OPTARG),
 		OPT_BOOL(0, "left-only", &left_only,
 			 N_("only emit output related to the first range")),
 		OPT_BOOL(0, "right-only", &right_only,
@@ -62,6 +66,12 @@ int cmd_range_diff(int argc,
 	if (!simple_color)
 		diffopt.use_color = 1;
 
+	/* If `--diff-merges` was specified, imply `--merges` */
+	if (diff_merges_arg.nr) {
+		range_diff_opts.include_merges = 1;
+		strvec_pushv(&other_arg, diff_merges_arg.v);
+	}
+
 	for (i = 0; i < argc; i++)
 		if (!strcmp(argv[i], "--")) {
 			dash_dash = i;
@@ -155,6 +165,7 @@ int cmd_range_diff(int argc,
 	res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
 
 	strvec_clear(&other_arg);
+	strvec_clear(&diff_merges_arg);
 	strbuf_release(&range1);
 	strbuf_release(&range2);
 
diff --git a/range-diff.c b/range-diff.c
index bbb0952264b..9e59733059b 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -38,7 +38,8 @@ struct patch_util {
  * as struct object_id (will need to be free()d).
  */
 static int read_patches(const char *range, struct string_list *list,
-			const struct strvec *other_arg)
+			const struct strvec *other_arg,
+			unsigned int include_merges)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
@@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list,
 	size_t size;
 	int ret = -1;
 
-	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
+	strvec_pushl(&cp.args, "log", "--no-color", "-p",
 		     "--reverse", "--date-order", "--decorate=no",
 		     "--no-prefix", "--submodule=short",
 		     /*
@@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list,
 		     "--pretty=medium",
 		     "--show-notes-by-default",
 		     NULL);
+	if (!include_merges)
+		strvec_push(&cp.args, "--no-merges");
 	strvec_push(&cp.args, range);
 	if (other_arg)
 		strvec_pushv(&cp.args, other_arg->v);
@@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list,
 		}
 
 		if (skip_prefix(line, "commit ", &p)) {
+			char *q;
 			if (util) {
 				string_list_append(list, buf.buf)->util = util;
 				strbuf_reset(&buf);
 			}
 			CALLOC_ARRAY(util, 1);
+			if (include_merges && (q = strstr(p, " (from ")))
+				*q = '\0';
 			if (repo_get_oid(the_repository, p, &util->oid)) {
 				error(_("could not parse commit '%s'"), p);
 				FREE_AND_NULL(util);
@@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2,
 
 	struct string_list branch1 = STRING_LIST_INIT_DUP;
 	struct string_list branch2 = STRING_LIST_INIT_DUP;
+	unsigned int include_merges = range_diff_opts->include_merges;
 
 	if (range_diff_opts->left_only && range_diff_opts->right_only)
 		res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
 
-	if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
+	if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
 		res = error(_("could not parse log for '%s'"), range1);
-	if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
+	if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
 		res = error(_("could not parse log for '%s'"), range2);
 
 	if (!res) {
diff --git a/range-diff.h b/range-diff.h
index 2f69f6a434d..cd85000b5a0 100644
--- a/range-diff.h
+++ b/range-diff.h
@@ -16,6 +16,7 @@ struct range_diff_options {
 	int creation_factor;
 	unsigned dual_color:1;
 	unsigned left_only:1, right_only:1;
+	unsigned include_merges:1;
 	const struct diff_options *diffopt; /* may be NULL */
 	const struct strvec *other_arg; /* may be NULL */
 };
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 86010931ab6..c18a3fdab83 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
 	test_cmp expect actual
 '
 
+test_expect_success '--diff-merges' '
+	renamed_oid=$(git rev-parse --short renamed-file) &&
+	tree=$(git merge-tree unmodified renamed-file) &&
+	clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
+	clean_oid=$(git rev-parse --short $clean) &&
+	conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
+	conflict_oid=$(git rev-parse --short $conflict) &&
+
+	git range-diff --diff-merges=1 $clean...$conflict >actual &&
+	cat >expect <<-EOF &&
+	1:  $renamed_oid < -:  ------- s/12/B/
+	2:  $clean_oid = 1:  $conflict_oid merge
+	EOF
+	test_cmp expect actual
+'
+
 test_done

base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-07 17:20 [PATCH] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
@ 2024-11-08  3:46 ` Junio C Hamano
  2024-11-08  6:53 ` Johannes Sixt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2024-11-08  3:46 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Philippe Blain, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.

Nice addition.

As the "diff of the diff" outer diff part would be two-"blob" diff,
we won't have to worry about this option causing confusions to
users, unlike things like "-U8", to which layer of diffs the option
would apply.

Will queue.  Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-07 17:20 [PATCH] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
  2024-11-08  3:46 ` Junio C Hamano
@ 2024-11-08  6:53 ` Johannes Sixt
  2024-11-08 10:53   ` Johannes Schindelin
                     ` (2 more replies)
  2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2024-11-08 15:33 ` [PATCH] range-diff: optionally include merge commits' diffs in the analysis Elijah Newren
  3 siblings, 3 replies; 21+ messages in thread
From: Johannes Sixt @ 2024-11-08  6:53 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Philippe Blain, Johannes Schindelin via GitGitGadget

Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> +--diff-merges=<format>::
> +	Instead of ignoring merge commits, generate diffs for them using the
> +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> +	and include them in the comparison.
> ++
> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> +the context of the `range-diff` command than other formats, so choose wisely!

Can we please be a bit more specific which options are usable or which
are not usable? Perhaps you even mean to say that 'first-parent' is the
only one that makes sense? At a minimum, we should mention that it is
the one that makes most sense (if that is the case).

-- Hannes


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08  6:53 ` Johannes Sixt
@ 2024-11-08 10:53   ` Johannes Schindelin
  2024-11-09  8:49     ` Elijah Newren
  2024-11-08 11:56   ` Junio C Hamano
  2024-11-08 15:53   ` Elijah Newren
  2 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2024-11-08 10:53 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Philippe Blain, Johannes Schindelin via GitGitGadget

Hi Hannes

On Fri, 8 Nov 2024, Johannes Sixt wrote:

> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> > +--diff-merges=<format>::
> > +	Instead of ignoring merge commits, generate diffs for them using the
> > +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > +	and include them in the comparison.
> > ++
> > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> > +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable?

There are a couple of reasons:

- Most importantly: I had not even studied that question in depth when I
  wrote the original patch, and not even when I submitted the first
  iteration.

  In my reviews of branch thickets, I used `--diff-merges=1` and it served
  my needs well. Therefore I can speak to this mode, but not really to the
  other modes.

- Which options are usable or not may lack a universally-correct answer.

  One person might find `--diff-merges=separate` confusing while the next
  person may find it quite useful.

- If the documentation points out an unsuitable mode, then an obvious
  follow-up wish would be to reject this mode in the command, with an
  error message. Some users might, however, need precisely this mode, so
  I am loathe to do that.

Maybe this compromise: Change the "Note:" to recommend the `first-parent`
mode, with a bit more explanation why that is so (it is consistent with
the idea that a merge is kind of a "meta patch", comprising all the merged
commits' patches, which is equivalent to the first-parent diff).

> Perhaps you even mean to say that 'first-parent' is the only one that
> makes sense?

I would not go _so_ far as to say that.

Before submitting the patch yesterday, I played a little with a few
modes, using the commit graph as per the new t3206 test case.

What that test case range-diffs looks like this:

 - * - changeA - changeB ------------
     \                     \          \
      \                      conflict  \
       \                   /            \
         rename - changeA - changeB ----- clean

In other words, the main branch modifies a file's contents twice, the
topic branch renames it first, then makes the same modifications.

The clean merge simply merges the topic, which makes the main branch
tree-same to the topic branch.

The conflicting merge misses the tip commit of the topic branch, i.e.
changeB on the renamed file. The result is tree-same to the clean merge
because changeB on the non-renamed file is already part of the main
branch.

Out of habit, I used `--diff-merge=first-parent` in the new test case,
whose output reiterates what I just said: both merges introduce the same
first-parent diff (renaming the file, no other changes):

	1:  1e6226b < -:  ------- s/12/B/
	2:  043e738 = 1:  6ddec15 merge

What about `separate`?


	2:  043e738 = 1:  6ddec15 merge
	1:  1e6226b ! 2:  6ddec15 s/12/B/
	    @@
	      ## Metadata ##
	    -Author: Thomas Rast <trast@inf.ethz.ch>
	    +Author: A U Thor <author@example.com>

	      ## Commit message ##
	    -    s/12/B/
	    +    merge

	      ## renamed-file ##
	     @@ renamed-file: A

This might look confusing at first because there are two merge diffs on
the right-hand side, but only one on the left-hand side. But that is all
good because the diff of the clean merge between merge head and merge is
empty and therefore not shown. And the `range-diff` demonstrates that the
conflicting merge had to recapitulate the tip commit of the `renamed-file`
topic branch.

The `combined` and `dense-combined` modes produce identical output to
`first-parent` in this instance, which is expected.

Now, let's use `remerge`, which should be illuminating with regards to
"evil merges" [*1*], as it shows what was done on top of the automatic
merge:

	1:  1e6226b < -:  ------- s/12/B/
	2:  043e738 < -:  ------- merge
	-:  ------- > 1:  6ddec15 merge

Huh. That is a bit surprising at first. Let's use a higher creation factor
to ask `range-diff` to match correspondences more loosely (I used 999,
which kinda forces even non-sensical pairings):

	1:  1e6226b ! 1:  6ddec15 s/12/B/
	    @@
	      ## Metadata ##
	    -Author: Thomas Rast <trast@inf.ethz.ch>
	    +Author: A U Thor <author@example.com>

	      ## Commit message ##
	    -    s/12/B/
	    +    merge

	      ## renamed-file ##
	    + remerge CONFLICT (content): Merge conflict in renamed-file
	    + index 3f1c0da..25ddf7a 100644
	    + --- renamed-file
	    + +++ renamed-file
	     @@ renamed-file: A
	      9
	      10
	      B
	    +-<<<<<<< 2901f77 (s/12/B/):file
	    + B
	    +-=======
	     -12
	    -+B
	    +->>>>>>> 3ce7af6 (s/11/B/):renamed-file
	      13
	      14
	      15
	2:  043e738 < -:  ------- merge

(Note that if you repeat this experiment, you will see something different
due to a bug in `--remerge-diff` that I will fix separately.)

Hold on, that looks wrong! It is actually not wrong, `range-diff` just
finds the changeB a better pairing than the merge commit with an empty
diff...

So: This mode can be useful to detect where merge conflicts had to be
resolved.

In short, I think that in `range-diff`'s context all of these modes have
merit, depending on a particular use case. It's just that `first-parent`
is probably the most natural to use in the common case.

> At a minimum, we should mention that it is the one that makes most sense
> (if that is the case).

Yeah, I'll go with that and drop saying anything about other modes.

Ciao,
Johannes

Footnote *1*: We should really find a much better name for this than "evil
merge". There is nothing evil about me having to add `#include
"environment.h"` in v2.45.1^, for example. It was necessary so that the
code would build. Tedious, yes, but not evil.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08  6:53 ` Johannes Sixt
  2024-11-08 10:53   ` Johannes Schindelin
@ 2024-11-08 11:56   ` Junio C Hamano
  2024-11-08 15:53   ` Elijah Newren
  2 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2024-11-08 11:56 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Johannes Schindelin, git, Philippe Blain,
	Johannes Schindelin via GitGitGadget

Johannes Sixt <j6t@kdbg.org> writes:

> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
>> +--diff-merges=<format>::
>> +	Instead of ignoring merge commits, generate diffs for them using the
>> +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
>> +	and include them in the comparison.
>> ++
>> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
>> +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable? Perhaps you even mean to say that 'first-parent' is the
> only one that makes sense? At a minimum, we should mention that it is
> the one that makes most sense (if that is the case).

Good suggestion.

I am curious to see how well things like "--cc" and "--remerge-diff"
fare in this use case.  I suspect that in a case where the original
round forgot to make necessary semantic conflict resolutions (aka
"evil merge") that the new round has, all of them (including the
"--first-parent") would essentially show the same change, but what
is shown in such a case would be more readable with "--first-parent"
by treating a merge as if it were just a single parent commit making
a lot of changes, iow, merely one among other commits with equal
footing.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-07 17:20 [PATCH] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
  2024-11-08  3:46 ` Junio C Hamano
  2024-11-08  6:53 ` Johannes Sixt
@ 2024-11-08 13:43 ` Johannes Schindelin via GitGitGadget
  2024-11-08 17:04   ` Elijah Newren
                     ` (3 more replies)
  2024-11-08 15:33 ` [PATCH] range-diff: optionally include merge commits' diffs in the analysis Elijah Newren
  3 siblings, 4 replies; 21+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-11-08 13:43 UTC (permalink / raw)
  To: git; +Cc: Philippe Blain, Johannes Sixt, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `git log` command already offers support for including diffs for
merges, via the `--diff-merges=<format>` option.

Let's add corresponding support for `git range-diff`, too. This makes it
more convenient to spot differences between iterations of non-linear
contributions, where so-called "evil merges" are sometimes necessary and
need to be reviewed, too.

In my code reviews, I found the `--diff-merges=first-parent` option
particularly useful.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    Support diff merges option in range diff
    
    The git range-diff command does the same with merge commits as git
    rebase: It ignores them.
    
    However, when comparing branch thickets it can be quite illuminating to
    watch out for inadvertent changes in merge commits, in particular when
    some "evil" merges have been replayed, i.e. merges that needed to
    introduce changes outside of the merge conflicts (e.g. when one branch
    changed a function's signature and another branch introduced a caller of
    said function), in case the replayed merge is no longer "evil" and
    therefore potentially incorrect.
    
    Let's introduce support for the --diff-merges option that is passed
    through to those git log commands.
    
    I had a need for this earlier this year and got it working, leaving the
    GitGitGadget PR in a draft mode. Phil Blain found it and kindly
    nerd-sniped me into readying it for submitting, so say thanks to Phil!
    
    Changes since v1:
    
     * Changed the documentation to recommend first-parent mode instead of
       vaguely talking about various modes' merits.
     * Disallowed the no-arg --diff-merges option (because --diff-merges
       requires an argument).

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1734

Range-diff vs v1:

 1:  11361e07af8 ! 1:  d01a395900b range-diff: optionally include merge commits' diffs in the analysis
     @@ Documentation/git-range-diff.txt: to revert to color all lines according to the
      +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
      +	and include them in the comparison.
      ++
     -+Note: Some of the formats supported by linkgit:git-log[1] make less sense in
     -+the context of the `range-diff` command than other formats, so choose wisely!
     ++Note: In the common case, the `first-parent` mode will be the most natural one
     ++to use, as it is consistent with the idea that a merge is kind of a "meta
     ++patch", comprising all the merged commits' patches into a single one.
      +
       --[no-]notes[=<ref>]::
       	This flag is passed to the `git log` program
     @@ builtin/range-diff.c: int cmd_range_diff(int argc,
       				  N_("notes"), N_("passed to 'git log'"),
       				  PARSE_OPT_OPTARG),
      +		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
     -+				  N_("style"), N_("passed to 'git log'"),
     -+				  PARSE_OPT_OPTARG),
     ++				  N_("style"), N_("passed to 'git log'"), 0),
       		OPT_BOOL(0, "left-only", &left_only,
       			 N_("only emit output related to the first range")),
       		OPT_BOOL(0, "right-only", &right_only,


 Documentation/git-range-diff.txt | 11 ++++++++++-
 builtin/range-diff.c             | 10 ++++++++++
 range-diff.c                     | 15 +++++++++++----
 range-diff.h                     |  1 +
 t/t3206-range-diff.sh            | 16 ++++++++++++++++
 5 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
index fbdbe0befeb..17a85957877 100644
--- a/Documentation/git-range-diff.txt
+++ b/Documentation/git-range-diff.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
 	[--no-dual-color] [--creation-factor=<factor>]
-	[--left-only | --right-only]
+	[--left-only | --right-only] [--diff-merges=<format>]
 	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
 	[[--] <path>...]
 
@@ -81,6 +81,15 @@ to revert to color all lines according to the outer diff markers
 	Suppress commits that are missing from the second specified range
 	(or the "right range" when using the `<rev1>...<rev2>` format).
 
+--diff-merges=<format>::
+	Instead of ignoring merge commits, generate diffs for them using the
+	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
+	and include them in the comparison.
++
+Note: In the common case, the `first-parent` mode will be the most natural one
+to use, as it is consistent with the idea that a merge is kind of a "meta
+patch", comprising all the merged commits' patches into a single one.
+
 --[no-]notes[=<ref>]::
 	This flag is passed to the `git log` program
 	(see linkgit:git-log[1]) that generates the patches.
diff --git a/builtin/range-diff.c b/builtin/range-diff.c
index 1b33ab66a7b..901de5d133d 100644
--- a/builtin/range-diff.c
+++ b/builtin/range-diff.c
@@ -21,6 +21,7 @@ int cmd_range_diff(int argc,
 {
 	struct diff_options diffopt = { NULL };
 	struct strvec other_arg = STRVEC_INIT;
+	struct strvec diff_merges_arg = STRVEC_INIT;
 	struct range_diff_options range_diff_opts = {
 		.creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
 		.diffopt = &diffopt,
@@ -36,6 +37,8 @@ int cmd_range_diff(int argc,
 		OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
 				  N_("notes"), N_("passed to 'git log'"),
 				  PARSE_OPT_OPTARG),
+		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
+				  N_("style"), N_("passed to 'git log'"), 0),
 		OPT_BOOL(0, "left-only", &left_only,
 			 N_("only emit output related to the first range")),
 		OPT_BOOL(0, "right-only", &right_only,
@@ -62,6 +65,12 @@ int cmd_range_diff(int argc,
 	if (!simple_color)
 		diffopt.use_color = 1;
 
+	/* If `--diff-merges` was specified, imply `--merges` */
+	if (diff_merges_arg.nr) {
+		range_diff_opts.include_merges = 1;
+		strvec_pushv(&other_arg, diff_merges_arg.v);
+	}
+
 	for (i = 0; i < argc; i++)
 		if (!strcmp(argv[i], "--")) {
 			dash_dash = i;
@@ -155,6 +164,7 @@ int cmd_range_diff(int argc,
 	res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
 
 	strvec_clear(&other_arg);
+	strvec_clear(&diff_merges_arg);
 	strbuf_release(&range1);
 	strbuf_release(&range2);
 
diff --git a/range-diff.c b/range-diff.c
index bbb0952264b..9e59733059b 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -38,7 +38,8 @@ struct patch_util {
  * as struct object_id (will need to be free()d).
  */
 static int read_patches(const char *range, struct string_list *list,
-			const struct strvec *other_arg)
+			const struct strvec *other_arg,
+			unsigned int include_merges)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
@@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list,
 	size_t size;
 	int ret = -1;
 
-	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
+	strvec_pushl(&cp.args, "log", "--no-color", "-p",
 		     "--reverse", "--date-order", "--decorate=no",
 		     "--no-prefix", "--submodule=short",
 		     /*
@@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list,
 		     "--pretty=medium",
 		     "--show-notes-by-default",
 		     NULL);
+	if (!include_merges)
+		strvec_push(&cp.args, "--no-merges");
 	strvec_push(&cp.args, range);
 	if (other_arg)
 		strvec_pushv(&cp.args, other_arg->v);
@@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list,
 		}
 
 		if (skip_prefix(line, "commit ", &p)) {
+			char *q;
 			if (util) {
 				string_list_append(list, buf.buf)->util = util;
 				strbuf_reset(&buf);
 			}
 			CALLOC_ARRAY(util, 1);
+			if (include_merges && (q = strstr(p, " (from ")))
+				*q = '\0';
 			if (repo_get_oid(the_repository, p, &util->oid)) {
 				error(_("could not parse commit '%s'"), p);
 				FREE_AND_NULL(util);
@@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2,
 
 	struct string_list branch1 = STRING_LIST_INIT_DUP;
 	struct string_list branch2 = STRING_LIST_INIT_DUP;
+	unsigned int include_merges = range_diff_opts->include_merges;
 
 	if (range_diff_opts->left_only && range_diff_opts->right_only)
 		res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
 
-	if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
+	if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
 		res = error(_("could not parse log for '%s'"), range1);
-	if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
+	if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
 		res = error(_("could not parse log for '%s'"), range2);
 
 	if (!res) {
diff --git a/range-diff.h b/range-diff.h
index 2f69f6a434d..cd85000b5a0 100644
--- a/range-diff.h
+++ b/range-diff.h
@@ -16,6 +16,7 @@ struct range_diff_options {
 	int creation_factor;
 	unsigned dual_color:1;
 	unsigned left_only:1, right_only:1;
+	unsigned include_merges:1;
 	const struct diff_options *diffopt; /* may be NULL */
 	const struct strvec *other_arg; /* may be NULL */
 };
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 86010931ab6..c18a3fdab83 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
 	test_cmp expect actual
 '
 
+test_expect_success '--diff-merges' '
+	renamed_oid=$(git rev-parse --short renamed-file) &&
+	tree=$(git merge-tree unmodified renamed-file) &&
+	clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
+	clean_oid=$(git rev-parse --short $clean) &&
+	conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
+	conflict_oid=$(git rev-parse --short $conflict) &&
+
+	git range-diff --diff-merges=1 $clean...$conflict >actual &&
+	cat >expect <<-EOF &&
+	1:  $renamed_oid < -:  ------- s/12/B/
+	2:  $clean_oid = 1:  $conflict_oid merge
+	EOF
+	test_cmp expect actual
+'
+
 test_done

base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-07 17:20 [PATCH] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
@ 2024-11-08 15:33 ` Elijah Newren
  3 siblings, 0 replies; 21+ messages in thread
From: Elijah Newren @ 2024-11-08 15:33 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Philippe Blain, Johannes Schindelin

On Thu, Nov 7, 2024 at 9:20 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The `git log` command already offers support for including diffs for
> merges, via the `--diff-merges=<format>` option.
>
> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.
>
> In my code reviews, I found the `--diff-merges=first-parent` option
> particularly useful.

Curious.  Wouldn't --diff-merges=remerge-diff be more useful if you
are particularly interested in so-called "evil merges" and whether
they remain "evil" (i.e. empty remerge-diff) or gain additional bits
of "evilness" (i.e. more changes shown in the remerge-diff)?

first-parent would seem more like a workaround in such a case.  Let me
explain; first, let me refer to the result that you'd get after
merging with no human changes (i.e. a non-evil merge) as a
hypothetical "auto-merge" commit.  Now, --diff-merges=first-parent
could generally be broken down as the combination of diff from first
parent to auto-merge + diff from auto-merge to evil-merge (even if the
auto-merge wasn't actually recorded anywhere and is just a theoretical
construct).  Now, you aren't looking at a first-parent diff directly,
you are diffing two first-parent diffs.  In particular, you are
comparing:
    pre-rebase first-parent diff = diff from first parent of merge to
the auto-merge + diff from auto-merge to evil-merge
to
    post-rebase first-parent diff = diff from first parent of merge to
the auto-merge + diff from auto-merge to evil-merge

Assuming you didn't drop or insert or modify any commits as part of
the rebase, then the two "diff from first parent of merge to the
auto-merge" should match.  Since they match, taking the difference of
these two causes that part to cancel out, meaning you are left just
looking at the differences in the "evilness" of the actual merge.  But
if you did make other changes while rebasing, maybe dropping or
tweaking a commit, then suddenly you aren't just looking at
differences in the "evilness" of the actual merge anymore; it's mixed
with those other changes making it more challenging to review and easy
to miss the parts you are looking for.  If you want to look for
differences in whether the merge commit in question has changes other
than those that a simple "git merge" would make, remerge-diff seems
like a better choice.

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     Support diff merges option in range diff
>
>     The git range-diff command does the same with merge commits as git
>     rebase: It ignores them.
>
>     However, when comparing branch thickets it can be quite illuminating to
>     watch out for inadvertent changes in merge commits, in particular when
>     some "evil" merges have been replayed, i.e. merges that needed to
>     introduce changes outside of the merge conflicts (e.g. when one branch
>     changed a function's signature and another branch introduced a caller of
>     said function), in case the replayed merge is no longer "evil" and
>     therefore potentially incorrect.
>
>     Let's introduce support for the --diff-merges option that is passed
>     through to those git log commands.
>
>     I had a need for this earlier this year and got it working, leaving the
>     GitGitGadget PR in a draft mode. Phil Blain found it and kindly
>     nerd-sniped me into readying it for submitting, so say thanks to Phil!
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1734
>
>  Documentation/git-range-diff.txt | 10 +++++++++-
>  builtin/range-diff.c             | 11 +++++++++++
>  range-diff.c                     | 15 +++++++++++----
>  range-diff.h                     |  1 +
>  t/t3206-range-diff.sh            | 16 ++++++++++++++++
>  5 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index fbdbe0befeb..a964e856c3c 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  [verse]
>  'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
>         [--no-dual-color] [--creation-factor=<factor>]
> -       [--left-only | --right-only]
> +       [--left-only | --right-only] [--diff-merges=<format>]
>         ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
>         [[--] <path>...]
>
> @@ -81,6 +81,14 @@ to revert to color all lines according to the outer diff markers
>         Suppress commits that are missing from the second specified range
>         (or the "right range" when using the `<rev1>...<rev2>` format).
>
> +--diff-merges=<format>::
> +       Instead of ignoring merge commits, generate diffs for them using the
> +       corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> +       and include them in the comparison.
> ++
> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> +the context of the `range-diff` command than other formats, so choose wisely!
> +

Indeed.  :-)

>  --[no-]notes[=<ref>]::
>         This flag is passed to the `git log` program
>         (see linkgit:git-log[1]) that generates the patches.
> diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> index 1b33ab66a7b..e41719e0f0d 100644
> --- a/builtin/range-diff.c
> +++ b/builtin/range-diff.c
> @@ -21,6 +21,7 @@ int cmd_range_diff(int argc,
>  {
>         struct diff_options diffopt = { NULL };
>         struct strvec other_arg = STRVEC_INIT;
> +       struct strvec diff_merges_arg = STRVEC_INIT;
>         struct range_diff_options range_diff_opts = {
>                 .creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
>                 .diffopt = &diffopt,
> @@ -36,6 +37,9 @@ int cmd_range_diff(int argc,
>                 OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
>                                   N_("notes"), N_("passed to 'git log'"),
>                                   PARSE_OPT_OPTARG),
> +               OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> +                                 N_("style"), N_("passed to 'git log'"),
> +                                 PARSE_OPT_OPTARG),
>                 OPT_BOOL(0, "left-only", &left_only,
>                          N_("only emit output related to the first range")),
>                 OPT_BOOL(0, "right-only", &right_only,
> @@ -62,6 +66,12 @@ int cmd_range_diff(int argc,
>         if (!simple_color)
>                 diffopt.use_color = 1;
>
> +       /* If `--diff-merges` was specified, imply `--merges` */
> +       if (diff_merges_arg.nr) {
> +               range_diff_opts.include_merges = 1;
> +               strvec_pushv(&other_arg, diff_merges_arg.v);
> +       }
> +
>         for (i = 0; i < argc; i++)
>                 if (!strcmp(argv[i], "--")) {
>                         dash_dash = i;
> @@ -155,6 +165,7 @@ int cmd_range_diff(int argc,
>         res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
>
>         strvec_clear(&other_arg);
> +       strvec_clear(&diff_merges_arg);
>         strbuf_release(&range1);
>         strbuf_release(&range2);
>
> diff --git a/range-diff.c b/range-diff.c
> index bbb0952264b..9e59733059b 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -38,7 +38,8 @@ struct patch_util {
>   * as struct object_id (will need to be free()d).
>   */
>  static int read_patches(const char *range, struct string_list *list,
> -                       const struct strvec *other_arg)
> +                       const struct strvec *other_arg,
> +                       unsigned int include_merges)
>  {
>         struct child_process cp = CHILD_PROCESS_INIT;
>         struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
> @@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list,
>         size_t size;
>         int ret = -1;
>
> -       strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
> +       strvec_pushl(&cp.args, "log", "--no-color", "-p",
>                      "--reverse", "--date-order", "--decorate=no",
>                      "--no-prefix", "--submodule=short",
>                      /*
> @@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list,
>                      "--pretty=medium",
>                      "--show-notes-by-default",
>                      NULL);


>
> -- Hannes
>
>
> +       if (!include_merges)
> +               strvec_push(&cp.args, "--no-merges");
>         strvec_push(&cp.args, range);
>         if (other_arg)
>                 strvec_pushv(&cp.args, other_arg->v);
> @@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list,
>                 }
>
>                 if (skip_prefix(line, "commit ", &p)) {
> +                       char *q;
>                         if (util) {
>                                 string_list_append(list, buf.buf)->util = util;
>                                 strbuf_reset(&buf);
>                         }
>                         CALLOC_ARRAY(util, 1);
> +                       if (include_merges && (q = strstr(p, " (from ")))
> +                               *q = '\0';
>                         if (repo_get_oid(the_repository, p, &util->oid)) {
>                                 error(_("could not parse commit '%s'"), p);
>                                 FREE_AND_NULL(util);
> @@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2,
>
>         struct string_list branch1 = STRING_LIST_INIT_DUP;
>         struct string_list branch2 = STRING_LIST_INIT_DUP;
> +       unsigned int include_merges = range_diff_opts->include_merges;
>
>         if (range_diff_opts->left_only && range_diff_opts->right_only)
>                 res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
>
> -       if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
> +       if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
>                 res = error(_("could not parse log for '%s'"), range1);
> -       if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
> +       if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
>                 res = error(_("could not parse log for '%s'"), range2);
>
>         if (!res) {
> diff --git a/range-diff.h b/range-diff.h
> index 2f69f6a434d..cd85000b5a0 100644
> --- a/range-diff.h
> +++ b/range-diff.h
> @@ -16,6 +16,7 @@ struct range_diff_options {
>         int creation_factor;
>         unsigned dual_color:1;
>         unsigned left_only:1, right_only:1;
> +       unsigned include_merges:1;
>         const struct diff_options *diffopt; /* may be NULL */
>         const struct strvec *other_arg; /* may be NULL */
>  };
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 86010931ab6..c18a3fdab83 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
>         test_cmp expect actual
>  '
>
> +test_expect_success '--diff-merges' '
> +       renamed_oid=$(git rev-parse --short renamed-file) &&
> +       tree=$(git merge-tree unmodified renamed-file) &&
> +       clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
> +       clean_oid=$(git rev-parse --short $clean) &&
> +       conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
> +       conflict_oid=$(git rev-parse --short $conflict) &&
> +
> +       git range-diff --diff-merges=1 $clean...$conflict >actual &&
> +       cat >expect <<-EOF &&
> +       1:  $renamed_oid < -:  ------- s/12/B/
> +       2:  $clean_oid = 1:  $conflict_oid merge
> +       EOF
> +       test_cmp expect actual
> +'
> +
>  test_done
>
> base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2
> --
> gitgitgadget

Didn't spot any problems with the patch itself.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08  6:53 ` Johannes Sixt
  2024-11-08 10:53   ` Johannes Schindelin
  2024-11-08 11:56   ` Junio C Hamano
@ 2024-11-08 15:53   ` Elijah Newren
  2 siblings, 0 replies; 21+ messages in thread
From: Elijah Newren @ 2024-11-08 15:53 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Johannes Schindelin, git, Philippe Blain,
	Johannes Schindelin via GitGitGadget

On Thu, Nov 7, 2024 at 10:54 PM Johannes Sixt <j6t@kdbg.org> wrote:
>
> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> > +--diff-merges=<format>::
> > +     Instead of ignoring merge commits, generate diffs for them using the
> > +     corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > +     and include them in the comparison.
> > ++
> > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> > +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable? Perhaps you even mean to say that 'first-parent' is the
> only one that makes sense? At a minimum, we should mention that it is
> the one that makes most sense (if that is the case).

Here's my guesses I mostly typed up last night before getting tired
and going to bed:

separate: makes no sense; showing two diffs for each merge and then
trying to diff the two sets of two diffs seems crazy, especially since
the whole point of rebasing is usually to change the first parent
somewhere in the ancestry; that'll cascade up and cause the diffs to
the second parents to differ wildly and make this format uselessly
noisy in general.

combined: much better than separate, but probably still quite noisy and
confusing, especially since combined-diff and range-diff are both
complicated in that they have two diffs each, and combining the two
means you have a diff of a dual diff.  I suspect that might be hard for users to
process.

dense-combined: much better than combined, you at least have the diff
down to the smallest relevant pieces.  You may still deal with the
diff of a dual diff problem.

first-parent: useful, but only when keeping the series of patches largely
intact and transplanting them; any insertion, deletion, or
modification of intermediate patches will result in those
modifications being shown
not just for the patch in question but rolled up into the merge commit
and shown there as well, making the diff for the merge very noisy
(possibly to the point of uselessness).  When
keeping the series of patches intact and transplanting them, though,
likely a useful mode.

remerge-diff: generally useful, even if you insert, delete, or modify
intermediate patches while rebasing.  Since it also shows how conflict
resolutions are done, you'd see how conflicts are resolved
differently.  Actually, that might be one drawback to this method;
since conflict markers try to provide labels involving a short commit
ID and the commit message, and those short commit IDs will differ,
this mode would just end up showing how the two merges had different
parents, which you already know.  We could probably make this option
much more useful by surpressing the conflict labels, or at least the
short commit ID parts of those labels.  If we do that, and take
Johannes' nice fix he sent to the list separately, then I suspect this
mode is generally the most useful one to use with range-diff.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
@ 2024-11-08 17:04   ` Elijah Newren
  2024-11-11 19:55     ` Johannes Schindelin
  2024-11-10 20:30   ` Philippe Blain
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Elijah Newren @ 2024-11-08 17:04 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Philippe Blain, Johannes Sixt, Johannes Schindelin

On Fri, Nov 8, 2024 at 5:43 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The `git log` command already offers support for including diffs for
> merges, via the `--diff-merges=<format>` option.
>
> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.
>
> In my code reviews, I found the `--diff-merges=first-parent` option
> particularly useful.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     Support diff merges option in range diff
>
>     The git range-diff command does the same with merge commits as git
>     rebase: It ignores them.
>
>     However, when comparing branch thickets it can be quite illuminating to
>     watch out for inadvertent changes in merge commits, in particular when
>     some "evil" merges have been replayed, i.e. merges that needed to
>     introduce changes outside of the merge conflicts (e.g. when one branch
>     changed a function's signature and another branch introduced a caller of
>     said function), in case the replayed merge is no longer "evil" and
>     therefore potentially incorrect.
>
>     Let's introduce support for the --diff-merges option that is passed
>     through to those git log commands.
>
>     I had a need for this earlier this year and got it working, leaving the
>     GitGitGadget PR in a draft mode. Phil Blain found it and kindly
>     nerd-sniped me into readying it for submitting, so say thanks to Phil!
>
>     Changes since v1:
>
>      * Changed the documentation to recommend first-parent mode instead of
>        vaguely talking about various modes' merits.
>      * Disallowed the no-arg --diff-merges option (because --diff-merges
>        requires an argument).
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/1734
>
> Range-diff vs v1:
>
>  1:  11361e07af8 ! 1:  d01a395900b range-diff: optionally include merge commits' diffs in the analysis
>      @@ Documentation/git-range-diff.txt: to revert to color all lines according to the
>       + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
>       + and include them in the comparison.
>       ++
>      -+Note: Some of the formats supported by linkgit:git-log[1] make less sense in
>      -+the context of the `range-diff` command than other formats, so choose wisely!
>      ++Note: In the common case, the `first-parent` mode will be the most natural one
>      ++to use, as it is consistent with the idea that a merge is kind of a "meta
>      ++patch", comprising all the merged commits' patches into a single one.
>       +
>        --[no-]notes[=<ref>]::
>         This flag is passed to the `git log` program
>      @@ builtin/range-diff.c: int cmd_range_diff(int argc,
>                                   N_("notes"), N_("passed to 'git log'"),
>                                   PARSE_OPT_OPTARG),
>       +         OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
>      -+                           N_("style"), N_("passed to 'git log'"),
>      -+                           PARSE_OPT_OPTARG),
>      ++                           N_("style"), N_("passed to 'git log'"), 0),
>                 OPT_BOOL(0, "left-only", &left_only,
>                          N_("only emit output related to the first range")),
>                 OPT_BOOL(0, "right-only", &right_only,
>
>
>  Documentation/git-range-diff.txt | 11 ++++++++++-
>  builtin/range-diff.c             | 10 ++++++++++
>  range-diff.c                     | 15 +++++++++++----
>  range-diff.h                     |  1 +
>  t/t3206-range-diff.sh            | 16 ++++++++++++++++
>  5 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index fbdbe0befeb..17a85957877 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  [verse]
>  'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
>         [--no-dual-color] [--creation-factor=<factor>]
> -       [--left-only | --right-only]
> +       [--left-only | --right-only] [--diff-merges=<format>]
>         ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
>         [[--] <path>...]
>
> @@ -81,6 +81,15 @@ to revert to color all lines according to the outer diff markers
>         Suppress commits that are missing from the second specified range
>         (or the "right range" when using the `<rev1>...<rev2>` format).
>
> +--diff-merges=<format>::
> +       Instead of ignoring merge commits, generate diffs for them using the
> +       corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> +       and include them in the comparison.
> ++
> +Note: In the common case, the `first-parent` mode will be the most natural one
> +to use,

I think we need more wording around "common case"; I believe this
"common case" is when you are diffing against a merely transplanted
series of commits (a series created by rebasing without inserting,
removing, or minimally modifying those commits in the process) that
`first-parent` makes sense.  And it only makes sense in that case.

I think `remerge-diff` generally makes sense here, both in the cases
when `first-parent` makes sense and when `first-parent` does not.  It
could be improved by suppressing the inclusion of short commit ids
(and maybe also commit messages) in the labels of conflict markers.  I
suspect that issue might make `remerge-diff` less useful than
`first-parent` in simple common cases currently, but I think it's the
right thing to build upon for what you are trying to view.

If `remerge-diff` didn't exist, I think I'd always use
`dense-combined` over `first-parent` because of this
merely-transplanted limitation.  I suspect dense-combined would
probably be kind of ugly and hard to parse when there is an actual
evil merge, but it'd at least limit what it shows to the evil merge
content.

> as it is consistent with the idea that a merge is kind of a "meta
> +patch", comprising all the merged commits' patches into a single one.

Doesn't this wording of yours naturally lead to the idea that
`first-parent` is a bad choice if patches leading to the merge have
been inserted, removed, or more than minimally modified?  It points
out that first-parent is a diff over the whole range, and so
range-diff shows you a diff of the diffs over the whole range.  If you
want to look at the "evilness" of merge commits, i.e. the
user-generated portion of the diffs included in by merge commits, you
need either remerge-diff or dense-combined.

> +
>  --[no-]notes[=<ref>]::
>         This flag is passed to the `git log` program
>         (see linkgit:git-log[1]) that generates the patches.
> diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> index 1b33ab66a7b..901de5d133d 100644
> --- a/builtin/range-diff.c
> +++ b/builtin/range-diff.c
> @@ -21,6 +21,7 @@ int cmd_range_diff(int argc,
>  {
>         struct diff_options diffopt = { NULL };
>         struct strvec other_arg = STRVEC_INIT;
> +       struct strvec diff_merges_arg = STRVEC_INIT;
>         struct range_diff_options range_diff_opts = {
>                 .creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
>                 .diffopt = &diffopt,
> @@ -36,6 +37,8 @@ int cmd_range_diff(int argc,
>                 OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
>                                   N_("notes"), N_("passed to 'git log'"),
>                                   PARSE_OPT_OPTARG),
> +               OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> +                                 N_("style"), N_("passed to 'git log'"), 0),
>                 OPT_BOOL(0, "left-only", &left_only,
>                          N_("only emit output related to the first range")),
>                 OPT_BOOL(0, "right-only", &right_only,
> @@ -62,6 +65,12 @@ int cmd_range_diff(int argc,
>         if (!simple_color)
>                 diffopt.use_color = 1;
>
> +       /* If `--diff-merges` was specified, imply `--merges` */
> +       if (diff_merges_arg.nr) {
> +               range_diff_opts.include_merges = 1;
> +               strvec_pushv(&other_arg, diff_merges_arg.v);
> +       }
> +
>         for (i = 0; i < argc; i++)
>                 if (!strcmp(argv[i], "--")) {
>                         dash_dash = i;
> @@ -155,6 +164,7 @@ int cmd_range_diff(int argc,
>         res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
>
>         strvec_clear(&other_arg);
> +       strvec_clear(&diff_merges_arg);
>         strbuf_release(&range1);
>         strbuf_release(&range2);
>
> diff --git a/range-diff.c b/range-diff.c
> index bbb0952264b..9e59733059b 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -38,7 +38,8 @@ struct patch_util {
>   * as struct object_id (will need to be free()d).
>   */
>  static int read_patches(const char *range, struct string_list *list,
> -                       const struct strvec *other_arg)
> +                       const struct strvec *other_arg,
> +                       unsigned int include_merges)
>  {
>         struct child_process cp = CHILD_PROCESS_INIT;
>         struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
> @@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list,
>         size_t size;
>         int ret = -1;
>
> -       strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
> +       strvec_pushl(&cp.args, "log", "--no-color", "-p",
>                      "--reverse", "--date-order", "--decorate=no",
>                      "--no-prefix", "--submodule=short",
>                      /*
> @@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list,
>                      "--pretty=medium",
>                      "--show-notes-by-default",
>                      NULL);
> +       if (!include_merges)
> +               strvec_push(&cp.args, "--no-merges");
>         strvec_push(&cp.args, range);
>         if (other_arg)
>                 strvec_pushv(&cp.args, other_arg->v);
> @@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list,
>                 }
>
>                 if (skip_prefix(line, "commit ", &p)) {
> +                       char *q;
>                         if (util) {
>                                 string_list_append(list, buf.buf)->util = util;
>                                 strbuf_reset(&buf);
>                         }
>                         CALLOC_ARRAY(util, 1);
> +                       if (include_merges && (q = strstr(p, " (from ")))
> +                               *q = '\0';
>                         if (repo_get_oid(the_repository, p, &util->oid)) {
>                                 error(_("could not parse commit '%s'"), p);
>                                 FREE_AND_NULL(util);
> @@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2,
>
>         struct string_list branch1 = STRING_LIST_INIT_DUP;
>         struct string_list branch2 = STRING_LIST_INIT_DUP;
> +       unsigned int include_merges = range_diff_opts->include_merges;
>
>         if (range_diff_opts->left_only && range_diff_opts->right_only)
>                 res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
>
> -       if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
> +       if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
>                 res = error(_("could not parse log for '%s'"), range1);
> -       if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
> +       if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
>                 res = error(_("could not parse log for '%s'"), range2);
>
>         if (!res) {
> diff --git a/range-diff.h b/range-diff.h
> index 2f69f6a434d..cd85000b5a0 100644
> --- a/range-diff.h
> +++ b/range-diff.h
> @@ -16,6 +16,7 @@ struct range_diff_options {
>         int creation_factor;
>         unsigned dual_color:1;
>         unsigned left_only:1, right_only:1;
> +       unsigned include_merges:1;
>         const struct diff_options *diffopt; /* may be NULL */
>         const struct strvec *other_arg; /* may be NULL */
>  };
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 86010931ab6..c18a3fdab83 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
>         test_cmp expect actual
>  '
>
> +test_expect_success '--diff-merges' '
> +       renamed_oid=$(git rev-parse --short renamed-file) &&
> +       tree=$(git merge-tree unmodified renamed-file) &&
> +       clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
> +       clean_oid=$(git rev-parse --short $clean) &&
> +       conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
> +       conflict_oid=$(git rev-parse --short $conflict) &&
> +
> +       git range-diff --diff-merges=1 $clean...$conflict >actual &&
> +       cat >expect <<-EOF &&
> +       1:  $renamed_oid < -:  ------- s/12/B/
> +       2:  $clean_oid = 1:  $conflict_oid merge
> +       EOF
> +       test_cmp expect actual
> +'
> +
>  test_done
>
> base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08 10:53   ` Johannes Schindelin
@ 2024-11-09  8:49     ` Elijah Newren
  2024-11-11  0:20       ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Elijah Newren @ 2024-11-09  8:49 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Sixt, git, Philippe Blain,
	Johannes Schindelin via GitGitGadget

One more thing...

On Fri, Nov 8, 2024 at 3:04 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
[...]
> Footnote *1*: We should really find a much better name for this than "evil
> merge". There is nothing evil about me having to add `#include
> "environment.h"` in v2.45.1^, for example. It was necessary so that the
> code would build. Tedious, yes, but not evil.

Indeed, I've felt it's problematic for a while too and had to take
time on at least one occasion to mention to someone else that I don't
actually mean "evil" it's just that "evil merge" as a compound term is
the convenient nomenclature we've been using for all of these,
regardless of whether the user modified the merge to resolve syntactic
conflicts, or to resolve semantic conflicts, or to sneak in "evil"
changes.  That's particularly odd since the first category is the most
common, and the third (snuck in unrelated changes or "evil changes")
are the most rare.  Maybe we should just call these "user-modified
merges" rather than "evil merges"?  Any better suggestions?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2024-11-08 17:04   ` Elijah Newren
@ 2024-11-10 20:30   ` Philippe Blain
  2024-11-11 20:07     ` Johannes Schindelin
  2024-11-11  0:37   ` Junio C Hamano
  2024-12-16 14:11   ` [PATCH v3 0/2] Support diff merges option in range diff Johannes Schindelin via GitGitGadget
  3 siblings, 1 reply; 21+ messages in thread
From: Philippe Blain @ 2024-11-10 20:30 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget, git
  Cc: Johannes Sixt, Johannes Schindelin, Elijah Newren

Hi Dscho,

Le 2024-11-08 à 08:43, Johannes Schindelin via GitGitGadget a écrit :
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> The `git log` command already offers support for including diffs for
> merges, via the `--diff-merges=<format>` option.
> 
> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.

Maybe "between commit ranges that include merge commits" would be more
workflow-agnostic ? Just thinking out loud here, I think what you wrote 
is also OK.

> In my code reviews, I found the `--diff-merges=first-parent` option
> particularly useful.
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     Support diff merges option in range diff
>     
>     The git range-diff command does the same with merge commits as git
>     rebase: It ignores them.
>     
>     However, when comparing branch thickets it can be quite illuminating to
>     watch out for inadvertent changes in merge commits, in particular when
>     some "evil" merges have been replayed, i.e. merges that needed to
>     introduce changes outside of the merge conflicts (e.g. when one branch
>     changed a function's signature and another branch introduced a caller of
>     said function), in case the replayed merge is no longer "evil" and
>     therefore potentially incorrect.
>     
>     Let's introduce support for the --diff-merges option that is passed
>     through to those git log commands.
>     
>     I had a need for this earlier this year and got it working, leaving the
>     GitGitGadget PR in a draft mode. Phil Blain found it and kindly
>     nerd-sniped me into readying it for submitting, so say thanks to Phil!
>     
>     Changes since v1:
>     
>      * Changed the documentation to recommend first-parent mode instead of
>        vaguely talking about various modes' merits.
>      * Disallowed the no-arg --diff-merges option (because --diff-merges
>        requires an argument).
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/1734
> 
> Range-diff vs v1:
> 
>  1:  11361e07af8 ! 1:  d01a395900b range-diff: optionally include merge commits' diffs in the analysis
>      @@ Documentation/git-range-diff.txt: to revert to color all lines according to the
>       +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
>       +	and include them in the comparison.
>       ++
>      -+Note: Some of the formats supported by linkgit:git-log[1] make less sense in
>      -+the context of the `range-diff` command than other formats, so choose wisely!
>      ++Note: In the common case, the `first-parent` mode will be the most natural one
>      ++to use, as it is consistent with the idea that a merge is kind of a "meta
>      ++patch", comprising all the merged commits' patches into a single one.
>       +
>        --[no-]notes[=<ref>]::
>        	This flag is passed to the `git log` program
>      @@ builtin/range-diff.c: int cmd_range_diff(int argc,
>        				  N_("notes"), N_("passed to 'git log'"),
>        				  PARSE_OPT_OPTARG),
>       +		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
>      -+				  N_("style"), N_("passed to 'git log'"),
>      -+				  PARSE_OPT_OPTARG),
>      ++				  N_("style"), N_("passed to 'git log'"), 0),
>        		OPT_BOOL(0, "left-only", &left_only,
>        			 N_("only emit output related to the first range")),
>        		OPT_BOOL(0, "right-only", &right_only,
> 
> 
>  Documentation/git-range-diff.txt | 11 ++++++++++-
>  builtin/range-diff.c             | 10 ++++++++++
>  range-diff.c                     | 15 +++++++++++----
>  range-diff.h                     |  1 +
>  t/t3206-range-diff.sh            | 16 ++++++++++++++++
>  5 files changed, 48 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index fbdbe0befeb..17a85957877 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  [verse]
>  'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
>  	[--no-dual-color] [--creation-factor=<factor>]
> -	[--left-only | --right-only]
> +	[--left-only | --right-only] [--diff-merges=<format>]
>  	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
>  	[[--] <path>...]
>  
> @@ -81,6 +81,15 @@ to revert to color all lines according to the outer diff markers
>  	Suppress commits that are missing from the second specified range
>  	(or the "right range" when using the `<rev1>...<rev2>` format).
>  
> +--diff-merges=<format>::
> +	Instead of ignoring merge commits, generate diffs for them using the
> +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> +	and include them in the comparison.
> ++
> +Note: In the common case, the `first-parent` mode will be the most natural one
> +to use, as it is consistent with the idea that a merge is kind of a "meta
> +patch", comprising all the merged commits' patches into a single one.

I think I agree with Elijah that we probably should also highlight at least
'remerge'.

Also, is it worth making this a proper Asciidoc "[NOTE]" ? (I'm not sure, there are 
a lot of "Notes:" in our doc that are not Asciidoc "[NOTE]"s.

> diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> index 1b33ab66a7b..901de5d133d 100644
> --- a/builtin/range-diff.c
> +++ b/builtin/range-diff.c

The changes look good to me. Maybe it would be nice to add a corresponding
'range-diff.diffMerges' config option to allow users to configure the 
behaviour more permanently ?

> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 86010931ab6..c18a3fdab83 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
>  	test_cmp expect actual
>  '
>  
> +test_expect_success '--diff-merges' '
> +	renamed_oid=$(git rev-parse --short renamed-file) &&
> +	tree=$(git merge-tree unmodified renamed-file) &&
> +	clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
> +	clean_oid=$(git rev-parse --short $clean) &&
> +	conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
> +	conflict_oid=$(git rev-parse --short $conflict) &&
> +
> +	git range-diff --diff-merges=1 $clean...$conflict >actual &&
> +	cat >expect <<-EOF &&
> +	1:  $renamed_oid < -:  ------- s/12/B/
> +	2:  $clean_oid = 1:  $conflict_oid merge
> +	EOF
> +	test_cmp expect actual
> +'

The test looks good. 

Thanks for submitting that patch!

Philippe.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-09  8:49     ` Elijah Newren
@ 2024-11-11  0:20       ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2024-11-11  0:20 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin, Johannes Sixt, git, Philippe Blain,
	Johannes Schindelin via GitGitGadget

Elijah Newren <newren@gmail.com> writes:

> That's particularly odd since the first category is the most
> common, and the third (snuck in unrelated changes or "evil changes")
> are the most rare.  Maybe we should just call these "user-modified
> merges" rather than "evil merges"?  Any better suggestions?

It is "evil" in the sense that it makes it really tough for those
who dig history to find out what happened later to figure out where
the change came from.  It could be an attempt to hide a malicious
and unrelated change, it could be a resolution for a non-textual
conflict, or it could be a "while at it I am fixing an unrelated
typo I spotted nearby while resolving this textual conflict".

The change does not have to have an evil intent.  It is just
something different from and more than what the textual and
mechanical merge would produce.

"more than" may give us a hint for what we want to convey in the
name, if we were to pick a new name.  If a new topic changed a
function signature to add one extra parameter, and updated all the
callers of the function that existed back when the topic forked, an
"evil" merge needs to address the semantic conflicts to do a very
similar update to a new call to the function added on the mainline
since the topic forked, when the topic is merged back to the
mainline.  Taken as a whole, the first-parent diff of such a merge
will show that a function gained a parameter, and all of its callers
have been adjusted for that change.  Most of such changes would have
come from the side branch, but on top of that, the merge needed to
contribute more of the similar changes to callers.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2024-11-08 17:04   ` Elijah Newren
  2024-11-10 20:30   ` Philippe Blain
@ 2024-11-11  0:37   ` Junio C Hamano
  2024-11-11 16:51     ` Elijah Newren
  2024-12-16 14:11   ` [PATCH v3 0/2] Support diff merges option in range diff Johannes Schindelin via GitGitGadget
  3 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2024-11-11  0:37 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Philippe Blain, Johannes Sixt, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

>      * Disallowed the no-arg --diff-merges option (because --diff-merges
>        requires an argument).

Yup, I was happy to see it in the range-diff output, because I was
wondering where the optarg came from in the initial submission.

> +--diff-merges=<format>::
> +	Instead of ignoring merge commits, generate diffs for them using the
> +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> +	and include them in the comparison.
> ++
> +Note: In the common case, the `first-parent` mode will be the most natural one
> +to use, as it is consistent with the idea that a merge is kind of a "meta
> +patch", comprising all the merged commits' patches into a single one.

I'll let you and Elijah figure out the best phrasing around here,
including whether we recommend only first-parent or give other
options.

For me personally, I find that use of `first-parent` here is
consistent with the mindset of those users who prefer to read "git
log --first-parent -p", and to a smaller degree, those who use
squash merges.  To them, a merge is merely bringing in lots of
changes into the mainline as a single unit, and other than that
these lots of changes are broken out for finer grained inspection
if/when they wanted to, they prefer to treat the changes brought in
by merges just like a single-parent commit.  And from that point of
view, (1) reordering the changes inside the side branch is
immaterial for the purpose of talking about the result, the net
effect the merged topic makes to the mainline, and (2) dropping or
adding new changes to the side branch is treated just like a newer
iteration of a single parent commit having fewer or more changes.
So it is very natural that first-parent helps to make the world view
very simplistic, and more readable range-diff is all about how you
can present the two iterations in a simple and flattened form.

There _may_ need a tweak of the matching algorithm to allow the
"same" merge on both sides to match, even if they diverge a lot,
though.  A range-diff that pairs a merge in the previous iteration
with a single patch in the updated iteration may be hard to read.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-11  0:37   ` Junio C Hamano
@ 2024-11-11 16:51     ` Elijah Newren
  2024-11-12  0:29       ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Elijah Newren @ 2024-11-11 16:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Philippe Blain,
	Johannes Sixt, Johannes Schindelin

On Sun, Nov 10, 2024 at 4:37 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
[...]
> > +--diff-merges=<format>::
> > +     Instead of ignoring merge commits, generate diffs for them using the
> > +     corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > +     and include them in the comparison.
> > ++
> > +Note: In the common case, the `first-parent` mode will be the most natural one
> > +to use, as it is consistent with the idea that a merge is kind of a "meta
> > +patch", comprising all the merged commits' patches into a single one.
>
> I'll let you and Elijah figure out the best phrasing around here,
> including whether we recommend only first-parent or give other
> options.

I'd probably suggest:

Note: When trying to see how users adjusted the contents of a merge
commit, `remerge-diff` is a natural choice.  When viewing merges as a
"meta patch", comprising a squash of all the merged commits' patches
into a single one (similar to `git log --first-parent -p`),
`first-parent` is a natural choice.

> For me personally, I find that use of `first-parent` here is
> consistent with the mindset of those users who prefer to read "git
> log --first-parent -p", and to a smaller degree, those who use
> squash merges.  To them, a merge is merely bringing in lots of
> changes into the mainline as a single unit, and other than that
> these lots of changes are broken out for finer grained inspection
> if/when they wanted to, they prefer to treat the changes brought in
> by merges just like a single-parent commit.  And from that point of
> view, (1) reordering the changes inside the side branch is
> immaterial for the purpose of talking about the result, the net
> effect the merged topic makes to the mainline, and (2) dropping or
> adding new changes to the side branch is treated just like a newer
> iteration of a single parent commit having fewer or more changes.
> So it is very natural that first-parent helps to make the world view
> very simplistic, and more readable range-diff is all about how you
> can present the two iterations in a simple and flattened form.
>
> There _may_ need a tweak of the matching algorithm to allow the
> "same" merge on both sides to match, even if they diverge a lot,
> though.  A range-diff that pairs a merge in the previous iteration
> with a single patch in the updated iteration may be hard to read.

Sounds like you are arguing that there is an angle/usecase from which
`first-parent` makes sense, namely the viewpoint that "a merge is
merely bringing in lots of changes into the mainline as a single unit"
as you put it.  What was surprising to me, though, is that it's a
completely different usecase than the one that was brought up in the
commit message for this feature, namely "so-called 'evil merges' are
sometimes necessary and need to be reviewed too".

If you want to review "evilness" or the differences between the
changes that `git merge` would make and what got recorded in the
commit, then `first-parent` is not that great an approximation of what
you are looking for, BUT it happens to work beautifully in common
cases where you are merely transplanting commits due to the fact that
the huge amounts of extras being put into the diff happen to be equal
on the two sides, so they cancel out and leave you with the "evilness"
you are interested in.  But if you stray from that common case (by
e.g. inserting, removing, or modifying commits while rebasing) and are
still interested in "evilness", your approximation quickly drops from
accurately pulling out just the bits you are interested in towards
being completely swamped by the noise.

My concern with the commit message and patch as they stand, is that I
feel the `first-parent` mode is being recommended as a solution for a
certain usecase where it is really only an approximation that luckily
is highly accurate in special but common scenarios while being very
inaccurate in others.  I think that will lead to users getting
negatively surprised when they move outside those common cases.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-08 17:04   ` Elijah Newren
@ 2024-11-11 19:55     ` Johannes Schindelin
  0 siblings, 0 replies; 21+ messages in thread
From: Johannes Schindelin @ 2024-11-11 19:55 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, git, Philippe Blain,
	Johannes Sixt

[-- Attachment #1: Type: text/plain, Size: 2761 bytes --]

Hi Elijah,

On Fri, 8 Nov 2024, Elijah Newren wrote:

> On Fri, Nov 8, 2024 at 5:43 AM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> > index fbdbe0befeb..17a85957877 100644
> > --- a/Documentation/git-range-diff.txt
> > +++ b/Documentation/git-range-diff.txt
> > @@ -81,6 +81,15 @@ to revert to color all lines according to the outer diff markers
> >         Suppress commits that are missing from the second specified range
> >         (or the "right range" when using the `<rev1>...<rev2>` format).
> >
> > +--diff-merges=<format>::
> > +       Instead of ignoring merge commits, generate diffs for them using the
> > +       corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > +       and include them in the comparison.
> > ++
> > +Note: In the common case, the `first-parent` mode will be the most natural one
> > +to use,
>
> I think we need more wording around "common case"; I believe this
> "common case" is when you are diffing against a merely transplanted
> series of commits (a series created by rebasing without inserting,
> removing, or minimally modifying those commits in the process) that
> `first-parent` makes sense.  And it only makes sense in that case.

I tried to reproduce some of the common cases (which were more along the
lines of stacked PRs than transplanted series of commits), and you're
right: with the remerge bug fix, the range-diffs are much more easily
interpreted.

For example, I had to squash in a few fixes in
https://github.com/git/git-scm.com/pull/1915, and I try to combine all of
my currently-open git-scm.com PRs into the `gh-pages` branch in my fork.

With `--diff-merges=1`, all of those single-commit merges that update the
various translations of the book had diffs identical to the diffs of the
commits they merged. Which confused range-diff quite a bit, often picking
a non-merge as matching a merge (which is of course suboptimal).

With `--diff-merges=remerge`, these trivial merges had no diffs at all,
which made the review much easier.

> I think `remerge-diff` generally makes sense here, both in the cases
> when `first-parent` makes sense and when `first-parent` does not.  It
> could be improved by suppressing the inclusion of short commit ids
> (and maybe also commit messages) in the labels of conflict markers.  I
> suspect that issue might make `remerge-diff` less useful than
> `first-parent` in simple common cases currently, but I think it's the
> right thing to build upon for what you are trying to view.

Interesting idea about the suppressed commit IDs. I'll play with that.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-10 20:30   ` Philippe Blain
@ 2024-11-11 20:07     ` Johannes Schindelin
  2024-11-26  7:58       ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2024-11-11 20:07 UTC (permalink / raw)
  To: Philippe Blain
  Cc: Johannes Schindelin via GitGitGadget, git, Johannes Sixt,
	Elijah Newren

[-- Attachment #1: Type: text/plain, Size: 2787 bytes --]

Hi Philippe,

On Sun, 10 Nov 2024, Philippe Blain wrote:

> Le 2024-11-08 à 08:43, Johannes Schindelin via GitGitGadget a écrit :
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > The `git log` command already offers support for including diffs for
> > merges, via the `--diff-merges=<format>` option.
> >
> > Let's add corresponding support for `git range-diff`, too. This makes it
> > more convenient to spot differences between iterations of non-linear
> > contributions, where so-called "evil merges" are sometimes necessary and
> > need to be reviewed, too.
>
> Maybe "between commit ranges that include merge commits" would be more
> workflow-agnostic ?

Good idea, this is much clearer than what I wrote, too.

> > diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> > index fbdbe0befeb..17a85957877 100644
> > --- a/Documentation/git-range-diff.txt
> > +++ b/Documentation/git-range-diff.txt
> > @@ -81,6 +81,15 @@ to revert to color all lines according to the outer diff markers
> >  	Suppress commits that are missing from the second specified range
> >  	(or the "right range" when using the `<rev1>...<rev2>` format).
> >
> > +--diff-merges=<format>::
> > +	Instead of ignoring merge commits, generate diffs for them using the
> > +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > +	and include them in the comparison.
> > ++
> > +Note: In the common case, the `first-parent` mode will be the most natural one
> > +to use, as it is consistent with the idea that a merge is kind of a "meta
> > +patch", comprising all the merged commits' patches into a single one.
>
> I think I agree with Elijah that we probably should also highlight at least
> 'remerge'.
>
> Also, is it worth making this a proper Asciidoc "[NOTE]" ? (I'm not sure, there are
> a lot of "Notes:" in our doc that are not Asciidoc "[NOTE]"s.

Right, I did not want to deviate too much from the surrounding style.

Besides, I am not _so_ versed in AsciiDoc, I consider Markdown to be much
more common, so I am much more familiar with that. I had no idea that
there was a proper AsciiDoc [NOTE].

> > diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> > index 1b33ab66a7b..901de5d133d 100644
> > --- a/builtin/range-diff.c
> > +++ b/builtin/range-diff.c
>
> The changes look good to me. Maybe it would be nice to add a corresponding
> 'range-diff.diffMerges' config option to allow users to configure the
> behaviour more permanently ?

Seeing as there are no existing `rangeDiff.*` options, I am loathe to
introduce the first one lest I am asked why I don't balloon this patch
series into introducing config settings for the other options, too.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-11 16:51     ` Elijah Newren
@ 2024-11-12  0:29       ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2024-11-12  0:29 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, git, Philippe Blain,
	Johannes Sixt, Johannes Schindelin

Elijah Newren <newren@gmail.com> writes:

>> There _may_ need a tweak of the matching algorithm to allow the
>> "same" merge on both sides to match, even if they diverge a lot,
>> though.  A range-diff that pairs a merge in the previous iteration
>> with a single patch in the updated iteration may be hard to read.
>
> Sounds like you are arguing that there is an angle/usecase from which
> `first-parent` makes sense, namely the viewpoint that "a merge is
> merely bringing in lots of changes into the mainline as a single unit"
> as you put it.  What was surprising to me, though, is that it's a
> completely different usecase than the one that was brought up in the
> commit message for this feature, namely "so-called 'evil merges' are
> sometimes necessary and need to be reviewed too".

What I had in mind when I wrote the example you are responding to is
based on what sometimes happens while I make repeated merges (and as
you may know, I make lots of them).  In the first attempt I miss the
fact that I need semantic adjustments and then in the second attempt
I know what additional changes are necessary in the same merge (i.e.
merging exactly the same iteration of the same topic).  If you run
the first-parent range-diff between one iteration of 'seen' and
another, the "additional changes" I would make in the second attempt
would be the only thing that will appear in the output, showing the
"evil merge".

There can also be updates in the topic itself when I rebuild 'seen',
in addition to merge-fixes to adjust for semantic conflicts.  Such a
change would also appear in the first-parent view.  If you used
other views, like dense-combined or remerge-diff, these updates in
the topic itself may be hidden, as these other views are
specifically designed to highlight conflict resolutions and evil
merges by discarding mechanically resolvable changes.  So it all
depends on what you are looking for and what you are deliberately
excluding from the lower level of diff generation when you prepare
your range-diff, which is a "diff of diff".  Giving an impression
that first-parent is the most useful may risk misleading readers in
that sense.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2] range-diff: optionally include merge commits' diffs in the analysis
  2024-11-11 20:07     ` Johannes Schindelin
@ 2024-11-26  7:58       ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2024-11-26  7:58 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Philippe Blain, Johannes Schindelin via GitGitGadget, git,
	Johannes Sixt, Elijah Newren

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Philippe,
>
> On Sun, 10 Nov 2024, Philippe Blain wrote:
>
>> Le 2024-11-08 à 08:43, Johannes Schindelin via GitGitGadget a écrit :
>> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> >
>> > The `git log` command already offers support for including diffs for
>> > merges, via the `--diff-merges=<format>` option.
>> >
>> > Let's add corresponding support for `git range-diff`, too. This makes it
>> > more convenient to spot differences between iterations of non-linear
>> > contributions, where so-called "evil merges" are sometimes necessary and
>> > need to be reviewed, too.
>>
>> Maybe "between commit ranges that include merge commits" would be more
>> workflow-agnostic ?
>
> Good idea, this is much clearer than what I wrote, too.

Sounds good.

>> > diff --git a/builtin/range-diff.c b/builtin/range-diff.c
>> > index 1b33ab66a7b..901de5d133d 100644
>> > --- a/builtin/range-diff.c
>> > +++ b/builtin/range-diff.c
>>
>> The changes look good to me. Maybe it would be nice to add a corresponding
>> 'range-diff.diffMerges' config option to allow users to configure the
>> behaviour more permanently ?
>
> Seeing as there are no existing `rangeDiff.*` options, I am loathe to
> introduce the first one lest I am asked why I don't balloon this patch
> series into introducing config settings for the other options, too.

Yeah, I think it can be left for a follow-on exercise, done even by
other people who are interested.

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 0/2] Support diff merges option in range diff
  2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
                     ` (2 preceding siblings ...)
  2024-11-11  0:37   ` Junio C Hamano
@ 2024-12-16 14:11   ` Johannes Schindelin via GitGitGadget
  2024-12-16 14:11     ` [PATCH v3 1/2] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
  2024-12-16 14:11     ` [PATCH v3 2/2] range-diff: introduce the convenience option `--remerge-diff` Johannes Schindelin via GitGitGadget
  3 siblings, 2 replies; 21+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-12-16 14:11 UTC (permalink / raw)
  To: git; +Cc: Philippe Blain, Johannes Sixt, Elijah Newren, Johannes Schindelin

The git range-diff command does the same with merge commits as git rebase:
It ignores them.

However, when comparing branch thickets it can be quite illuminating to
watch out for inadvertent changes in merge commits, in particular when some
"evil" merges have been replayed, i.e. merges that needed to introduce
changes outside of the merge conflicts (e.g. when one branch changed a
function's signature and another branch introduced a caller of said
function), in case the replayed merge is no longer "evil" and therefore
potentially incorrect.

Let's introduce support for the --diff-merges option that is passed through
to those git log commands.

I had a need for this earlier this year and got it working, leaving the
GitGitGadget PR in a draft mode. Phil Blain found it and kindly nerd-sniped
me into readying it for submitting, so say thanks to Phil!

Changes since v2:

 * Rebased onto js/log-remerge-keep-ancestry, to fix --diff-merges=remerge
 * Now the documentation suggests remerge instead of first-parent mode
 * Added a follow-up commit to introduce the convenience option git
   range-diff --remerge-diff

Changes since v1:

 * Changed the documentation to recommend first-parent mode instead of
   vaguely talking about various modes' merits.
 * Disallowed the no-arg --diff-merges option (because --diff-merges
   requires an argument).

Johannes Schindelin (2):
  range-diff: optionally include merge commits' diffs in the analysis
  range-diff: introduce the convenience option `--remerge-diff`

 Documentation/git-range-diff.txt | 17 ++++++++++++++++-
 builtin/range-diff.c             | 12 ++++++++++++
 range-diff.c                     | 17 ++++++++++++-----
 range-diff.h                     |  1 +
 t/t3206-range-diff.sh            | 16 ++++++++++++++++
 5 files changed, 57 insertions(+), 6 deletions(-)


base-commit: f94bfa151623d7e675db9465ae7ff0b33e4825f3
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1734

Range-diff vs v2:

 1:  d01a395900b ! 1:  ef3c243da1b range-diff: optionally include merge commits' diffs in the analysis
     @@ Commit message
          merges, via the `--diff-merges=<format>` option.
      
          Let's add corresponding support for `git range-diff`, too. This makes it
     -    more convenient to spot differences between iterations of non-linear
     -    contributions, where so-called "evil merges" are sometimes necessary and
     -    need to be reviewed, too.
     +    more convenient to spot differences between commit ranges that contain
     +    merges.
      
     -    In my code reviews, I found the `--diff-merges=first-parent` option
     +    This is especially true in scenarios with non-trivial merges, i.e.
     +    merges introducing changes other than, or in addition to, what merge ORT
     +    would have produced. Merging a topic branch that changes a function
     +    signature into a branch that added a caller of that function, for
     +    example, would require the merge commit itself to adjust that caller to
     +    the modified signature.
     +
     +    In my code reviews, I found the `--diff-merges=remerge` option
          particularly useful.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     @@ Documentation/git-range-diff.txt: to revert to color all lines according to the
      +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
      +	and include them in the comparison.
      ++
     -+Note: In the common case, the `first-parent` mode will be the most natural one
     -+to use, as it is consistent with the idea that a merge is kind of a "meta
     -+patch", comprising all the merged commits' patches into a single one.
     ++Note: In the common case, the `remerge` mode will be the most natural one
     ++to use, as it shows only the diff on top of what Git's merge machinery would
     ++have produced. In other words, if a merge commit is the result of a
     ++non-conflicting `git merge`, the `remerge` mode will represent it with an empty
     ++diff.
      +
       --[no-]notes[=<ref>]::
       	This flag is passed to the `git log` program
       	(see linkgit:git-log[1]) that generates the patches.
      
       ## builtin/range-diff.c ##
     -@@ builtin/range-diff.c: int cmd_range_diff(int argc,
     +@@ builtin/range-diff.c: int cmd_range_diff(int argc, const char **argv, const char *prefix)
       {
       	struct diff_options diffopt = { NULL };
       	struct strvec other_arg = STRVEC_INIT;
     @@ builtin/range-diff.c: int cmd_range_diff(int argc,
       	struct range_diff_options range_diff_opts = {
       		.creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
       		.diffopt = &diffopt,
     -@@ builtin/range-diff.c: int cmd_range_diff(int argc,
     +@@ builtin/range-diff.c: int cmd_range_diff(int argc, const char **argv, const char *prefix)
       		OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
       				  N_("notes"), N_("passed to 'git log'"),
       				  PARSE_OPT_OPTARG),
     @@ builtin/range-diff.c: int cmd_range_diff(int argc,
       		OPT_BOOL(0, "left-only", &left_only,
       			 N_("only emit output related to the first range")),
       		OPT_BOOL(0, "right-only", &right_only,
     -@@ builtin/range-diff.c: int cmd_range_diff(int argc,
     +@@ builtin/range-diff.c: int cmd_range_diff(int argc, const char **argv, const char *prefix)
       	if (!simple_color)
       		diffopt.use_color = 1;
       
     @@ builtin/range-diff.c: int cmd_range_diff(int argc,
       	for (i = 0; i < argc; i++)
       		if (!strcmp(argv[i], "--")) {
       			dash_dash = i;
     -@@ builtin/range-diff.c: int cmd_range_diff(int argc,
     +@@ builtin/range-diff.c: int cmd_range_diff(int argc, const char **argv, const char *prefix)
       	res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
       
       	strvec_clear(&other_arg);
     @@ range-diff.c: static int read_patches(const char *range, struct string_list *lis
       		     /*
      @@ range-diff.c: static int read_patches(const char *range, struct string_list *list,
       		     "--pretty=medium",
     - 		     "--show-notes-by-default",
     + 		     "--notes",
       		     NULL);
      +	if (!include_merges)
      +		strvec_push(&cp.args, "--no-merges");
     @@ range-diff.c: static int read_patches(const char *range, struct string_list *lis
       				strbuf_reset(&buf);
       			}
       			CALLOC_ARRAY(util, 1);
     +-			if (get_oid(p, &util->oid)) {
      +			if (include_merges && (q = strstr(p, " (from ")))
      +				*q = '\0';
     - 			if (repo_get_oid(the_repository, p, &util->oid)) {
     ++			if (repo_get_oid(the_repository, p, &util->oid)) {
       				error(_("could not parse commit '%s'"), p);
       				FREE_AND_NULL(util);
     + 				string_list_clear(list, 1);
      @@ range-diff.c: int show_range_diff(const char *range1, const char *range2,
       
       	struct string_list branch1 = STRING_LIST_INIT_DUP;
 -:  ----------- > 2:  f99416acd5a range-diff: introduce the convenience option `--remerge-diff`

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 1/2] range-diff: optionally include merge commits' diffs in the analysis
  2024-12-16 14:11   ` [PATCH v3 0/2] Support diff merges option in range diff Johannes Schindelin via GitGitGadget
@ 2024-12-16 14:11     ` Johannes Schindelin via GitGitGadget
  2024-12-16 14:11     ` [PATCH v3 2/2] range-diff: introduce the convenience option `--remerge-diff` Johannes Schindelin via GitGitGadget
  1 sibling, 0 replies; 21+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-12-16 14:11 UTC (permalink / raw)
  To: git
  Cc: Philippe Blain, Johannes Sixt, Elijah Newren, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `git log` command already offers support for including diffs for
merges, via the `--diff-merges=<format>` option.

Let's add corresponding support for `git range-diff`, too. This makes it
more convenient to spot differences between commit ranges that contain
merges.

This is especially true in scenarios with non-trivial merges, i.e.
merges introducing changes other than, or in addition to, what merge ORT
would have produced. Merging a topic branch that changes a function
signature into a branch that added a caller of that function, for
example, would require the merge commit itself to adjust that caller to
the modified signature.

In my code reviews, I found the `--diff-merges=remerge` option
particularly useful.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-range-diff.txt | 13 ++++++++++++-
 builtin/range-diff.c             | 10 ++++++++++
 range-diff.c                     | 17 ++++++++++++-----
 range-diff.h                     |  1 +
 t/t3206-range-diff.sh            | 16 ++++++++++++++++
 5 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
index 0b393715d70..150b0acbd86 100644
--- a/Documentation/git-range-diff.txt
+++ b/Documentation/git-range-diff.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
 	[--no-dual-color] [--creation-factor=<factor>]
-	[--left-only | --right-only]
+	[--left-only | --right-only] [--diff-merges=<format>]
 	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
 	[[--] <path>...]
 
@@ -81,6 +81,17 @@ to revert to color all lines according to the outer diff markers
 	Suppress commits that are missing from the second specified range
 	(or the "right range" when using the `<rev1>...<rev2>` format).
 
+--diff-merges=<format>::
+	Instead of ignoring merge commits, generate diffs for them using the
+	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
+	and include them in the comparison.
++
+Note: In the common case, the `remerge` mode will be the most natural one
+to use, as it shows only the diff on top of what Git's merge machinery would
+have produced. In other words, if a merge commit is the result of a
+non-conflicting `git merge`, the `remerge` mode will represent it with an empty
+diff.
+
 --[no-]notes[=<ref>]::
 	This flag is passed to the `git log` program
 	(see linkgit:git-log[1]) that generates the patches.
diff --git a/builtin/range-diff.c b/builtin/range-diff.c
index aecfae12d3a..9d6236e5116 100644
--- a/builtin/range-diff.c
+++ b/builtin/range-diff.c
@@ -16,6 +16,7 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix)
 {
 	struct diff_options diffopt = { NULL };
 	struct strvec other_arg = STRVEC_INIT;
+	struct strvec diff_merges_arg = STRVEC_INIT;
 	struct range_diff_options range_diff_opts = {
 		.creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
 		.diffopt = &diffopt,
@@ -31,6 +32,8 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix)
 		OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
 				  N_("notes"), N_("passed to 'git log'"),
 				  PARSE_OPT_OPTARG),
+		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
+				  N_("style"), N_("passed to 'git log'"), 0),
 		OPT_BOOL(0, "left-only", &left_only,
 			 N_("only emit output related to the first range")),
 		OPT_BOOL(0, "right-only", &right_only,
@@ -57,6 +60,12 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix)
 	if (!simple_color)
 		diffopt.use_color = 1;
 
+	/* If `--diff-merges` was specified, imply `--merges` */
+	if (diff_merges_arg.nr) {
+		range_diff_opts.include_merges = 1;
+		strvec_pushv(&other_arg, diff_merges_arg.v);
+	}
+
 	for (i = 0; i < argc; i++)
 		if (!strcmp(argv[i], "--")) {
 			dash_dash = i;
@@ -150,6 +159,7 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix)
 	res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
 
 	strvec_clear(&other_arg);
+	strvec_clear(&diff_merges_arg);
 	strbuf_release(&range1);
 	strbuf_release(&range2);
 
diff --git a/range-diff.c b/range-diff.c
index 4bd65ab7496..3bd7bff4735 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -31,7 +31,8 @@ struct patch_util {
  * as struct object_id (will need to be free()d).
  */
 static int read_patches(const char *range, struct string_list *list,
-			const struct strvec *other_arg)
+			const struct strvec *other_arg,
+			unsigned int include_merges)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
@@ -42,7 +43,7 @@ static int read_patches(const char *range, struct string_list *list,
 	size_t size;
 	int ret = -1;
 
-	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
+	strvec_pushl(&cp.args, "log", "--no-color", "-p",
 		     "--reverse", "--date-order", "--decorate=no",
 		     "--no-prefix", "--submodule=short",
 		     /*
@@ -57,6 +58,8 @@ static int read_patches(const char *range, struct string_list *list,
 		     "--pretty=medium",
 		     "--notes",
 		     NULL);
+	if (!include_merges)
+		strvec_push(&cp.args, "--no-merges");
 	strvec_push(&cp.args, range);
 	if (other_arg)
 		strvec_pushv(&cp.args, other_arg->v);
@@ -89,12 +92,15 @@ static int read_patches(const char *range, struct string_list *list,
 		}
 
 		if (skip_prefix(line, "commit ", &p)) {
+			char *q;
 			if (util) {
 				string_list_append(list, buf.buf)->util = util;
 				strbuf_reset(&buf);
 			}
 			CALLOC_ARRAY(util, 1);
-			if (get_oid(p, &util->oid)) {
+			if (include_merges && (q = strstr(p, " (from ")))
+				*q = '\0';
+			if (repo_get_oid(the_repository, p, &util->oid)) {
 				error(_("could not parse commit '%s'"), p);
 				FREE_AND_NULL(util);
 				string_list_clear(list, 1);
@@ -559,13 +565,14 @@ int show_range_diff(const char *range1, const char *range2,
 
 	struct string_list branch1 = STRING_LIST_INIT_DUP;
 	struct string_list branch2 = STRING_LIST_INIT_DUP;
+	unsigned int include_merges = range_diff_opts->include_merges;
 
 	if (range_diff_opts->left_only && range_diff_opts->right_only)
 		res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
 
-	if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
+	if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
 		res = error(_("could not parse log for '%s'"), range1);
-	if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
+	if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
 		res = error(_("could not parse log for '%s'"), range2);
 
 	if (!res) {
diff --git a/range-diff.h b/range-diff.h
index 04ffe217be6..9753922fee9 100644
--- a/range-diff.h
+++ b/range-diff.h
@@ -10,6 +10,7 @@ struct range_diff_options {
 	int creation_factor;
 	unsigned dual_color:1;
 	unsigned left_only:1, right_only:1;
+	unsigned include_merges:1;
 	const struct diff_options *diffopt; /* may be NULL */
 	const struct strvec *other_arg; /* may be NULL */
 };
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index b5f4d6a6530..d3c4d2dbb49 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -866,4 +866,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
 	test_cmp expect actual
 '
 
+test_expect_success '--diff-merges' '
+	renamed_oid=$(git rev-parse --short renamed-file) &&
+	tree=$(git merge-tree unmodified renamed-file) &&
+	clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
+	clean_oid=$(git rev-parse --short $clean) &&
+	conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
+	conflict_oid=$(git rev-parse --short $conflict) &&
+
+	git range-diff --diff-merges=1 $clean...$conflict >actual &&
+	cat >expect <<-EOF &&
+	1:  $renamed_oid < -:  ------- s/12/B/
+	2:  $clean_oid = 1:  $conflict_oid merge
+	EOF
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 2/2] range-diff: introduce the convenience option `--remerge-diff`
  2024-12-16 14:11   ` [PATCH v3 0/2] Support diff merges option in range diff Johannes Schindelin via GitGitGadget
  2024-12-16 14:11     ` [PATCH v3 1/2] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
@ 2024-12-16 14:11     ` Johannes Schindelin via GitGitGadget
  1 sibling, 0 replies; 21+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-12-16 14:11 UTC (permalink / raw)
  To: git
  Cc: Philippe Blain, Johannes Sixt, Elijah Newren, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Just like `git log`, now also `git range-diff` has that option as a
shortcut for the common operation that would otherwise require the quite
unwieldy (if theoretically "more correct") `--diff-mode=remerge` option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-range-diff.txt | 4 ++++
 builtin/range-diff.c             | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
index 150b0acbd86..e9e97d94480 100644
--- a/Documentation/git-range-diff.txt
+++ b/Documentation/git-range-diff.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
 	[--no-dual-color] [--creation-factor=<factor>]
 	[--left-only | --right-only] [--diff-merges=<format>]
+	[--remerge-diff]
 	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
 	[[--] <path>...]
 
@@ -92,6 +93,9 @@ have produced. In other words, if a merge commit is the result of a
 non-conflicting `git merge`, the `remerge` mode will represent it with an empty
 diff.
 
+--remerge-diff::
+	Convenience option, equivalent to `--diff-merges=remerge`.
+
 --[no-]notes[=<ref>]::
 	This flag is passed to the `git log` program
 	(see linkgit:git-log[1]) that generates the patches.
diff --git a/builtin/range-diff.c b/builtin/range-diff.c
index 9d6236e5116..3ab871ab925 100644
--- a/builtin/range-diff.c
+++ b/builtin/range-diff.c
@@ -34,6 +34,8 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix)
 				  PARSE_OPT_OPTARG),
 		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
 				  N_("style"), N_("passed to 'git log'"), 0),
+		OPT_PASSTHRU_ARGV(0, "remerge-diff", &diff_merges_arg, NULL,
+				  N_("passed to 'git log'"), PARSE_OPT_NOARG),
 		OPT_BOOL(0, "left-only", &left_only,
 			 N_("only emit output related to the first range")),
 		OPT_BOOL(0, "right-only", &right_only,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-12-16 14:11 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-07 17:20 [PATCH] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
2024-11-08  3:46 ` Junio C Hamano
2024-11-08  6:53 ` Johannes Sixt
2024-11-08 10:53   ` Johannes Schindelin
2024-11-09  8:49     ` Elijah Newren
2024-11-11  0:20       ` Junio C Hamano
2024-11-08 11:56   ` Junio C Hamano
2024-11-08 15:53   ` Elijah Newren
2024-11-08 13:43 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2024-11-08 17:04   ` Elijah Newren
2024-11-11 19:55     ` Johannes Schindelin
2024-11-10 20:30   ` Philippe Blain
2024-11-11 20:07     ` Johannes Schindelin
2024-11-26  7:58       ` Junio C Hamano
2024-11-11  0:37   ` Junio C Hamano
2024-11-11 16:51     ` Elijah Newren
2024-11-12  0:29       ` Junio C Hamano
2024-12-16 14:11   ` [PATCH v3 0/2] Support diff merges option in range diff Johannes Schindelin via GitGitGadget
2024-12-16 14:11     ` [PATCH v3 1/2] range-diff: optionally include merge commits' diffs in the analysis Johannes Schindelin via GitGitGadget
2024-12-16 14:11     ` [PATCH v3 2/2] range-diff: introduce the convenience option `--remerge-diff` Johannes Schindelin via GitGitGadget
2024-11-08 15:33 ` [PATCH] range-diff: optionally include merge commits' diffs in the analysis Elijah Newren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).