git.vger.kernel.org archive mirror
* Different behaviour for --find-renames between git diff and git merge?
@ 2025-12-12 18:04 Luca Balsanelli
  2025-12-13  1:57 ` Elijah Newren
  0 siblings, 1 reply; 6+ messages in thread
From: Luca Balsanelli @ 2025-12-12 18:04 UTC (permalink / raw)
  To: git

Hi,

   I'm scratching my head to understand why on the following case `git 
diff` and `git merge` give a different interpretation about a rename.

    git switch master
    touch aaa
    git add aaa
    git commit -m 'aaa'

    git switch -c branch
    echo -en 'A\nB\nC\n' > aaa
    git add aaa
    git commit -m 'A\nB\nC\n > aaa'

    git switch master
    echo -en 'A\nB\n' > aaa
    mkdir dir
    mv aaa dir/
    git add aaa dir/
    git commit -m 'A\nB\n > aaa -> dir/'

The `merge.renames` config variable is true. Changing `git diff
--find-renames=50%` (the default) or `git merge -s ort -X
find-renames=50%` to something lower does not change the following.

`git diff` prints

    diff --git a/aaa b/dir/aaa
    similarity index 71%
    rename from aaa
    rename to dir/aaa
    index bbd2b90..986ad36 100644
    --- a/aaa
    +++ b/dir/aaa
    @@ -1,4 +1,3 @@
      A
      B
    -C

That is, the similarity index is 71% and the rename is detected.

`git merge branch`, instead, gives

    CONFLICT (modify/delete): aaa deleted in HEAD and modified in
    branch.  Version branch of aaa left in tree.
    Automatic merge failed; fix conflicts and then commit the result

Why is that? I always assumed that rename detection was the same
for `git diff` and `git merge`. Reading the documentation, I cannot find
any hint as to why `git diff` and `git merge` behave differently.

Thanks,

Luca Balsanelli



* Re: Different behaviour for --find-renames between git diff and git merge?
  2025-12-12 18:04 Different behaviour for --find-renames between git diff and git merge? Luca Balsanelli
@ 2025-12-13  1:57 ` Elijah Newren
  2025-12-15 14:02   ` Luca Balsanelli
  0 siblings, 1 reply; 6+ messages in thread
From: Elijah Newren @ 2025-12-13  1:57 UTC (permalink / raw)
  To: Luca Balsanelli; +Cc: git

On Fri, Dec 12, 2025 at 10:06 AM Luca Balsanelli
<lucabalsanelli@gmail.com> wrote:
>
> Hi,
>
>    I'm scratching my head to understand why on the following case `git
> diff` and `git merge` give a different interpretation about a rename.

I don't see any difference...

>     git switch master
>     touch aaa
>     git add aaa
>     git commit -m 'aaa'
>
>     git switch -c branch
>     echo -en 'A\nB\nC\n' > aaa
>     git add aaa
>     git commit -m 'A\nB\nC\n > aaa'
>
>     git switch master
>     echo -en 'A\nB\n' > aaa
>     mkdir dir
>     mv aaa dir/
>     git add aaa dir/
>     git commit -m 'A\nB\n > aaa -> dir/'
>
> The `merge.renames` config variable is true. Changing `git diff
> --find-renames=50%` (the default) or `git merge -s ort -X
> find-renames=50%` to something lower does not change the following.
>
> `git diff` prints

Actually, it doesn't; more on that below...

>
>     diff --git a/aaa b/dir/aaa
>     similarity index 71%

Did you not follow your own recipe?  Maybe you inserted an extra space
or left off the 'n' in 'echo -en' when you ran this?  The number
should have been 66%.

>     rename from aaa
>     rename to dir/aaa
>     index bbd2b90..986ad36 100644
>     --- a/aaa
>     +++ b/dir/aaa
>     @@ -1,4 +1,3 @@
>       A
>       B
>     -C
>
>      that is the similarity index is 71% and it detects the rename.

At this point, if you actually run `git diff` you see the following:

$ git diff
$

i.e. nothing.  I suspect you gave `git diff` additional arguments but
didn't tell us.  Let's look at a few options:

$ git diff master~1 master
diff --git a/aaa b/aaa
deleted file mode 100644
index e69de29..0000000
diff --git a/dir/aaa b/dir/aaa
new file mode 100644
index 0000000..35d242b
--- /dev/null
+++ b/dir/aaa
@@ -0,0 +1,2 @@
+A
+B
$

So, on master, aaa was deleted, and dir/aaa was added.

$ git diff master~1 branch
diff --git a/aaa b/aaa
index e69de29..b1e6722 100644
--- a/aaa
+++ b/aaa
@@ -0,0 +1,3 @@
+A
+B
+C
$

On branch, aaa was modified.

$ git diff branch master
diff --git a/aaa b/dir/aaa
similarity index 66%
rename from aaa
rename to dir/aaa
index b1e6722..35d242b 100644
--- a/aaa
+++ b/dir/aaa
@@ -1,3 +1,2 @@
 A
 B
-C
$

So, only if you diff the endpoints of the two branches do you see a
rename; if you look from the merge base to either branch, there isn't
one.
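
The three comparisons above can be scripted end to end. This is a
minimal sketch, assuming git 2.23 or newer (for `git switch`); the
temporary directory, the placeholder identity and the variable names are
illustrative only, and `-M` is passed explicitly so the result does not
depend on a local `diff.renames` setting.

```shell
#!/bin/sh
# Rebuild the recipe from the first message in a throwaway repository,
# then compare merge-base -> each tip against tip -> tip.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master      # use the branch name from the thread
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

touch aaa
git add aaa
git commit -q -m 'aaa'

git switch -q -c branch
printf 'A\nB\nC\n' > aaa
git add aaa
git commit -q -m 'fill aaa on branch'

git switch -q master
printf 'A\nB\n' > aaa
mkdir dir
mv aaa dir/
git add -A
git commit -q -m 'fill aaa and move it to dir/'

base=$(git merge-base master branch)

# What the merge looks at: one diff per side, each starting at the base.
base_to_master=$(git diff -M --name-status "$base" master)
base_to_branch=$(git diff -M --name-status "$base" branch)
# What was run in the original report: tip against tip.
tip_to_tip=$(git diff -M --name-status branch master)

printf 'base -> master:\n%s\n\n' "$base_to_master"
printf 'base -> branch:\n%s\n\n' "$base_to_branch"
printf 'branch -> master:\n%s\n' "$tip_to_tip"
```

Only the last comparison should report a rename (an `R` status line);
the first two should show a delete plus an add on master and a plain
modification on branch, which is the modify/delete pair that the merge
reports.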

> `git merge branch`, instead, gives
>
>     CONFLICT (modify/delete): aaa deleted in HEAD and modified in
>     branch.  Version branch of aaa left in tree.
>     Automatic merge failed; fix conflicts and then commit the result

Yes, this exactly matches what diff showed above -- on HEAD (master),
'aaa' was deleted, and on branch, 'aaa' was modified.

> Why it is that? I always supposed that the rename detection was the same
> for `git diff`, `git merge`. Reading the documentation I do not find any
> hint why `git diff` and `git merge` are behaving differently.

Hope that helps...


* Re: Different behaviour for --find-renames between git diff and git merge?
  2025-12-13  1:57 ` Elijah Newren
@ 2025-12-15 14:02   ` Luca Balsanelli
  2025-12-16  0:57     ` Elijah Newren
  0 siblings, 1 reply; 6+ messages in thread
From: Luca Balsanelli @ 2025-12-15 14:02 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On 13/12/25 02:57, Elijah Newren wrote:
> On Fri, Dec 12, 2025 at 10:06 AM Luca Balsanelli
> <lucabalsanelli@gmail.com> wrote:
>> Hi,
>>
>> I'm scratching my head to understand why on the following case `git
>> diff` and `git merge` give a different interpretation about a rename.
> I don't see any difference...

Sorry, my email was not clear. Something still does not convince me,
though. I will restate my question at the end, after replying inline to
all your (correct) points. I consider myself reasonably well versed in
git, but I am probably still missing something about how the merge
procedure works.

>> git switch master
>> touch aaa
>> git add aaa
>> git commit -m 'aaa'
>>
>> git switch -c branch
>> echo -en 'A\nB\nC\n' > aaa
>> git add aaa
>> git commit -m 'A\nB\nC\n > aaa'
>>
>> git switch master
>> echo -en 'A\nB\n' > aaa
>> mkdir dir
>> mv aaa dir/
>> git add aaa dir/
>> git commit -m 'A\nB\n > aaa -> dir/'
>>
>> The `merge.renames` config variable is true. Changing `git diff
>> --find-renames=50%` (the default) or `git merge -s ort -X
>> find-renames=50%` to something lower does not change the following.
>>
>> `git diff` prints
> Actually, it doesn't; more on that below...

I forgot to specify that I intended to diff the two heads, that is
`master` and `branch`. So it was

    git switch master
    git diff branch

>> diff --git a/aaa b/dir/aaa
>> similarity index 71%
> Did you not follow your own recipe? Maybe you inserted an extra space
> or left off the 'n' in 'echo -en' when you ran this? The number
> should have been 66%.

I don't know what I did, but there were additional newlines. So, yes, the
similarity index is 66% (which is still above the default 50% threshold
for rename detection in both `git diff` and `git merge`).
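
Both percentages can be reproduced with simple arithmetic. The sketch
below assumes the similarity index is roughly the number of bytes the
two versions share divided by the size of the larger version; that is an
assumption about the heuristic, not something stated in this thread, and
the `similarity` helper is made up for illustration. The "stray newline"
case is only one possible explanation of the 71% figure.

```shell
#!/bin/sh
# Hypothetical helper: integer percentage of shared bytes over the larger size.
# Assumption: this approximates the similarity index for small text files.
similarity() {
    shared=$1   # bytes common to both versions
    larger=$2   # size in bytes of the bigger version
    echo $(( shared * 100 / larger ))
}

# Recipe as written: 'A\nB\nC\n' (6 bytes) vs 'A\nB\n' (4 bytes), 4 bytes shared.
clean=$(similarity 4 6)

# One extra trailing newline on each side: 7 bytes vs 5 bytes, 5 bytes shared.
stray=$(similarity 5 7)

echo "recipe as written:   ${clean}%"
echo "with stray newlines: ${stray}%"
```

Under that assumption the recipe as written gives 66% and the variant
with an extra newline gives 71%, both above the 50% default.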

>> rename from aaa
>> rename to dir/aaa
>> index bbd2b90..986ad36 100644
>> --- a/aaa
>> +++ b/dir/aaa
>> @@ -1,4 +1,3 @@
>> A
>> B
>> -C
>>
>> that is the similarity index is 71% and it detects the rename.
> At this point, if you actually run `git diff` you see the following:
>
> $ git diff
> $
>
> i.e. nothing. I suspect you gave `git diff` additional arguments but
> didn't tell us. Let's look at a few options:
>
> $ git diff master~1 master
> diff --git a/aaa b/aaa
> deleted file mode 100644
> index e69de29..0000000
> diff --git a/dir/aaa b/dir/aaa
> new file mode 100644
> index 0000000..35d242b
> --- /dev/null
> +++ b/dir/aaa
> @@ -0,0 +1,2 @@
> +A
> +B
> $
>
> So, on master, aaa was deleted, and dir/aaa was added.
>
> $ git diff master~1 branch
> diff --git a/aaa b/aaa
> index e69de29..b1e6722 100644
> --- a/aaa
> +++ b/aaa
> @@ -0,0 +1,3 @@
> +A
> +B
> +C
> $
>
> On branch, aaa was modified.
>
> $ git diff branch master
> diff --git a/aaa b/dir/aaa
> similarity index 66%
> rename from aaa
> rename to dir/aaa
> index b1e6722..35d242b 100644
> --- a/aaa
> +++ b/dir/aaa
> @@ -1,3 +1,2 @@
> A
> B
> -C
> $
>
> So, only if you diff the endpoints of the two branches do you see a
> rename; if you look from the merge base to either branch, there isn't
> one.

Yes, the above is all true. As I said above, I forgot to specify the 
argument: `git switch master; git diff branch`.

>> `git merge branch`, instead, gives
>>
>> CONFLICT (modify/delete): aaa deleted in HEAD and modified in
>> branch. Version branch of aaa left in tree.
>> Automatic merge failed; fix conflicts and then commit the result
> Yes, this exactly matches what diff showed above -- on HEAD (master),
> 'aaa' was deleted, and on branch, 'aaa' was modified.
>
>> Why it is that? I always supposed that the rename detection was the same
>> for `git diff`, `git merge`. Reading the documentation I do not find any
>> hint why `git diff` and `git merge` are behaving differently.
> Hope that helps...

I would expect `git merge branch` to detect the rename and resolve the
conflict automatically. The 'ort' strategy (the default one) "can detect
and handle merges involving renames", and the default similarity
threshold is the same for `git diff` and `git merge`. I understand that
the merge procedure involves finding a merge base, but the rename should
still be detected between the two heads.

I was reading commit `90d43b07687fdc51d1f2fc14948df538dc45584b` of the 
git source code (which I found using `git log --grep '--rename-empty'`). 
It says (among other things)

    This patch lets callers specify whether or not they interested in
    using empty files as rename sources and destinations. The default is
    "yes", keeping the original behavior. It works by detecting the
    empty-blob sha1 for rename sources and destinations.

It is related, but I don't think it is relevant to this specific case.
Even though `git diff master~1 master` doesn't detect the rename
(either the content changed too much compared to the empty file, or the
fact that one side was empty matters, although the commit says empty
files are included as rename sources and destinations by default), the
rename should be detected between the two heads, even when merging. I
tried to read 'git/diffcore-rename.c', but I'm not very good at C and it
would take me a great effort to fully understand it.

So, why is `git merge branch` not detecting the rename and not resolving
the conflict automatically? Does it use a different diff machinery
compared to `git diff`?



* Re: Different behaviour for --find-renames between git diff and git merge?
  2025-12-15 14:02   ` Luca Balsanelli
@ 2025-12-16  0:57     ` Elijah Newren
  2025-12-16 13:15       ` Luca Balsanelli
  0 siblings, 1 reply; 6+ messages in thread
From: Elijah Newren @ 2025-12-16  0:57 UTC (permalink / raw)
  To: Luca Balsanelli; +Cc: git

On Mon, Dec 15, 2025 at 6:02 AM Luca Balsanelli
<lucabalsanelli@gmail.com> wrote:
>
> On 13/12/25 02:57, Elijah Newren wrote:
> > On Fri, Dec 12, 2025 at 10:06 AM Luca Balsanelli
> > <lucabalsanelli@gmail.com> wrote:
[...]
> I would expect that `git merge branch` would detect a rename and the
> conflict resolved automatically. The 'ort' strategy (the default one),
> "can detect and handle merges involving renames." and the default
> similarity threshold is the same for `git diff` and `git merge`. I
> understand that the merge procedure involves finding a merge base, but
> still the rename should be detected between the two heads.

No, it should only detect renames between the merge-base and the
heads.  The merge machinery should not diff the two heads directly;
that goes against how 3-way diff works.

[...]
> Even though the `git diff master~1 master` doesn't detect the rename
> (the content changed too much compared to the empty file or one was
> empty (although it says it defaults to include empty files as rename
> source or destination)), the rename should be detected between the two
> heads, even when merging. I tried to read at 'git/diffcore-rename.c' but
> I'm not very good at C and it would require me a great effort to fully
> understand it.
>
> So, why `git merge branch` is not detecting the rename and not resolving
> the conflict automatically? Does it use a different diff machinery
> compared to `git diff`?

Merging never diffs the endpoints, and shouldn't either.  It basically
does two diffs, each from the merge-base to the end-point in question.

If you only diffed the endpoints, and one side renamed file A->B, how
do you differentiate between A->B and B->A?  In other words, you may
know there was a rename, but you can't tell what it was renamed from
and which filename should be the final one.  You can only tell if you
look at the merge-base and determine that the file started out named
as A, and thus that B should be the final name.

If you only diffed the endpoints, and one side renamed file A->B,
while the other side renamed A->C, you'd be misled into thinking this
was a normal rename (you'd only see e.g. B->C) and be unaware of the
conflict, which is problematic.

If you only diffed the endpoints, and one side renamed file A->B,
while the other side renamed C->B, by diffing the endpoints you can't
even tell there's a rename; you simply have a file named B that was
totally rewritten.  But it gets subtly worse in special cases that
might really confuse end users: if they modified A or C on the sides
of history that didn't rename those files, those changes would not be
propagated and combined with the ultimate B, and they'd be left to
pick up the pieces and try to combine things.

Further, it's just semantically wrong to diff the endpoints because of
the underlying concept of a 3-way merge: If you were merging D & E and
simply diffed D & E to do so, you won't know whether differing lines
were added or removed by recent commits.  For example, you might
notice an "import" or "include" statement that one side has that the
other doesn't.  But did one side add that import statement?  Or did
the other side remove it?  You can't tell by looking at the endpoints;
you have to compare the endpoints to the merge-base to find out which
things were added or removed.  So, fundamentally, a 3-way merge thinks
in terms of diffing the merge-base to the endpoints, not diffing the
endpoints.
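
The import example can be tried directly with `git merge-file`, which
does a 3-way merge of plain files. This is only a sketch: the file
names (`D`, `E`, `base_added`, `base_removed`) and their contents are
made up for illustration. The two endpoints are identical in both runs;
only the base changes, and so does the result.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# The two endpoints differ by exactly one import line.
printf 'import os\nmain()\n' > D
printf 'main()\n' > E

# Case 1: the base lacks the line, so D added it and the merge keeps it.
printf 'main()\n' > base_added
added=$(git merge-file -p D base_added E)

# Case 2: the base has the line, so E removed it and the merge drops it.
printf 'import os\nmain()\n' > base_removed
removed=$(git merge-file -p D base_removed E)

printf 'base without import:\n%s\n\n' "$added"
printf 'base with import:\n%s\n' "$removed"
```

`git merge-file -p <current> <base> <other>` prints the merged result to
standard output without touching the input files, which makes it handy
for small experiments like this one.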


So, in summary, no, merge does not use a different diff machinery.
You are just diffing the wrong commits to see what it sees.  Combine
that with the fact that you have a funny special case where both sides
drastically change the file in a way where the new versions happen to
be similar to each other while not similar to the original, and you get
the behavior you are seeing.


* Re: Different behaviour for --find-renames between git diff and git merge?
  2025-12-16  0:57     ` Elijah Newren
@ 2025-12-16 13:15       ` Luca Balsanelli
  2025-12-16 19:44         ` Elijah Newren
  0 siblings, 1 reply; 6+ messages in thread
From: Luca Balsanelli @ 2025-12-16 13:15 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On 16/12/25 01:57, Elijah Newren wrote:
>> Even though the `git diff master~1 master` doesn't detect the rename
>> (the content changed too much compared to the empty file or one was
>> empty (although it says it defaults to include empty files as rename
>> source or destination)), the rename should be detected between the two
>> heads, even when merging. I tried to read at 'git/diffcore-rename.c' but
>> I'm not very good at C and it would require me a great effort to fully
>> understand it.
>>
>> So, why `git merge branch` is not detecting the rename and not resolving
>> the conflict automatically? Does it use a different diff machinery
>> compared to `git diff`?
> Merging never diffs the endpoints, and shouldn't either.  It basically
> does two diffs, each from the merge-base to the end-point in question.
>
> If you only diffed the endpoints, and one side renamed file A->B, how
> do you differentiate between A->B and B->A?  In other words, you may
> know there was a rename, but you can't tell what it was renamed from
> and which filename should be the final one.  You can only tell if you
> look at the merge-base and determine that the file started out named
> as A, and thus that B should be the final name.
>
> If you only diffed the endpoints, and one side renamed file A->B,
> while the other side renamed A->C, you'd be misled into thinking this
> was a normal rename (you'd only see e.g. B->C) and be unaware of the
> conflict, which is problematic.
>
> If you only diffed the endpoints, and one side renamed file A->B,
> while the other side renamed C->B, by diffing the endpoints you can't
> even tell there's a rename; you simply have a file named B that was
> totally rewritten.  But it gets subtly worse in special cases that
> might really confuse end users: if they modified A or C on the sides
> of history that didn't rename those files, those changes would not be
> propagated and combined with the ultimate B, and they'd be left to
> pick up the pieces and try to combine things.
>
> Further, it's just semantically wrong to diff the endpoints because of
> the underlying concept of a 3-way merge: If you were merging D & E and
> simply diffed D & E to do so, you won't know whether differing lines
> were added or removed by recent commits.  For example, you might
> notice an "import" or "include" statement that one side has that the
> other doesn't.  But did one side add that import statement?  Or did
> the other side remove it?  You can't tell by looking at the endpoints;
> you have to compare the endpoints to the merge-base to find out which
> things were added or removed.  So, fundamentally, a 3-way merge thinks
> in terms of diffing the merge-base to the endpoints, not diffing the
> endpoints.
>
>
> So, in summary, no, merge does not use a different diff machinery.
> You are just diffing the wrong commits to see what it sees.  Combine
> that with the fact that you have a funny special case where both sides
> drastically change the file in a way where the new versions happen to
> be similar to each other while not similar to the original, causes the
> behavior you are seeing.

Thank you. I understand.

Moreover, digging into the rename topic made me lose sight of something
about merging itself. Even if the rename had been detected somehow, or
even if I had not renamed the file on one side at all, `git merge branch`
would still be unable to resolve the conflict automatically, since both
sides modified the file in different, if similar, ways. Similarity alone
is not enough. This confused me.

In the following example, I start from an empty file and I modify it on 
one side of the history and move (rename) it on the other side. The 
rename between `branch` and the merge base is detected. So, can you tell 
me why in the following case the rename is not detected during the merge?

    git switch -c master root

    touch aaa
    git add aaa
    git commit -m 'aaa'

    git switch -c branch
    echo -ne 'A\nB\nC\n' > aaa
    git add aaa
    git commit -m 'A\nB\nC\n > aaa'

    git switch master
    mkdir dir
    mv aaa dir/
    git add aaa dir/
    git commit -m 'aaa -> dir/'

    git merge --no-edit branch

Sorry if I'm being pedantic, and thank you in advance.



* Re: Different behaviour for --find-renames between git diff and git merge?
  2025-12-16 13:15       ` Luca Balsanelli
@ 2025-12-16 19:44         ` Elijah Newren
  0 siblings, 0 replies; 6+ messages in thread
From: Elijah Newren @ 2025-12-16 19:44 UTC (permalink / raw)
  To: Luca Balsanelli; +Cc: git

Hi,

On Tue, Dec 16, 2025 at 5:15 AM Luca Balsanelli
<lucabalsanelli@gmail.com> wrote:
>
[...]
> In the following example, I start from an empty file and I modify it on
> one side of the history and move (rename) it on the other side. The
> rename between `branch` and the merge base is detected. So, can you tell
> me why in the following case the rename is not detected during the merge?
>
>     git switch -c master root
>
>     touch aaa
>     git add aaa
>     git commit -m 'aaa'
>
>     git switch -c branch
>     echo -ne 'A\nB\nC\n' > aaa
>     git add aaa
>     git commit -m 'A\nB\nC\n > aaa'
>
>     git switch master
>     mkdir dir
>     mv aaa dir/
>     git add aaa dir/
>     git commit -m 'aaa -> dir/'
>
>     git merge --no-edit branch

This is an interesting case where the --[no-]rename-empty option applies
(the same option you found a related commit for in a previous email in
this thread):

$ git diff master~1 master
diff --git a/aaa b/dir/aaa
similarity index 100%
rename from aaa
rename to dir/aaa

$ git diff --no-rename-empty master~1 master
diff --git a/aaa b/aaa
deleted file mode 100644
index e69de29..0000000
diff --git a/dir/aaa b/dir/aaa
new file mode 100644
index 0000000..e69de29

The merge machinery runs with the equivalent of --no-rename-empty:

$ git -C ~/floss/git grep rename_empty merge-ort.c
merge-ort.c:    diff_opts.flags.rename_empty = 0;

This comes from commit 4f7cb99ada26 (merge-recursive: don't detect
renames of empty files, 2012-03-22), and the commit message there
explains the rationale.  (The name of the option and how it is set has
changed since 2012, due to commit 0d1e0e7801bb (diff: make struct
diff_flags members lowercase, 2017-10-31)).  merge-ort copied that
behavior from merge-recursive.

So, although the merge machinery calls the same diff machinery that
`git diff` uses, it does pass slightly different defaults.  (There's a
couple others too; I believe the differences include rename_empty,
rename_limit, histogram vs myers, basename-guided similarity, and the
possibility of cached renames in a sequence of commits being
reapplied.  Users are unlikely to see any of these typically, though
you certainly did here.)
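
For anyone who wants to reproduce the effect of that flag, here is a
minimal sketch of the second recipe, assuming git 2.23 or newer. The
temporary directory, placeholder identity and variable names are
illustrative; `-M` is passed explicitly so the result does not depend on
a local `diff.renames` setting.

```shell
#!/bin/sh
# Second recipe: the empty file is moved untouched on master and filled
# in on branch.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master      # use the branch name from the thread
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

touch aaa
git add aaa
git commit -q -m 'aaa'

git switch -q -c branch
printf 'A\nB\nC\n' > aaa
git add aaa
git commit -q -m 'fill aaa on branch'

git switch -q master
mkdir dir
mv aaa dir/
git add -A
git commit -q -m 'aaa -> dir/'

# The same diff, with empty files allowed or disallowed as rename candidates.
with_empty=$(git diff -M --rename-empty --name-status master~1 master)
without_empty=$(git diff -M --no-rename-empty --name-status master~1 master)

printf 'with --rename-empty:\n%s\n\n' "$with_empty"
printf 'with --no-rename-empty:\n%s\n\n' "$without_empty"

# The merge behaves like the second diff; per this thread it is expected
# to stop with a conflict rather than follow the rename.
git merge --no-edit branch || echo "merge stopped with conflicts"
```

The second diff is the one to compare against the merge result, since
it sees a delete plus an add where plain `git diff` sees a rename.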

