git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* git mv messed up file mapping if folders contain identical files
@ 2016-02-24 23:38 Bill Okara
  2016-02-24 23:39 ` Junio C Hamano
  2016-02-25 11:49 ` Kevin Daudt
  0 siblings, 2 replies; 9+ messages in thread
From: Bill Okara @ 2016-02-24 23:38 UTC (permalink / raw)
  To: git

Hi,

I noticed the following 'git mv' issue with:
git version 2.6.4


If there are identical files in different subfolders, 'git mv' the
root folder (and/or each file individually) will mess up the file path
mapping. that is, if having identical 'content.txt' file under
gitmvtest
    |--demo/content.txt
    |--dev/content.txt
    |--prod/content.txt

after doing the "git mv gitmvtest/resources
gitmvtest/src/main/resources", the 'git status' will show:

renamed:    gitmvtest/resources/demo/content.txt ->
gitmvtest/src/main/resources/demo/content.txt
renamed:    gitmvtest/resources/prod/content.txt ->
gitmvtest/src/main/resources/dev/content.txt            <== NOTE:
wrongly mapped the prod/content.txt to dev/content.txt
renamed:    gitmvtest/resources/dev/content.txt ->
gitmvtest/src/main/resources/prod/content.txt            <== NOTE:
wrongly mapped the dev/content.txt to prod/content.txt

I tried running 'git mv' on each file individually, got the same problem:
> git mv gitmvtest/resources/demo/content.txt gitmvtest/src/main/resources/demo/content.txt
> git mv gitmvtest/resources/dev/content.txt gitmvtest/src/main/resources/dev/content.txt
> git mv gitmvtest/resources/prod/content.txt gitmvtest/src/main/resources/prod/content.txt

> git status
renamed:    gitmvtest/resources/demo/content.txt ->
gitmvtest/src/main/resources/demo/content.txt
renamed:    gitmvtest/resources/prod/content.txt ->
gitmvtest/src/main/resources/dev/content.txt          <== WRONG
renamed:    gitmvtest/resources/dev/content.txt ->
gitmvtest/src/main/resources/prod/content.txt          <== WRONG


NOTE:
=======
if modified the content.txt in the 3 folders to contain different
data, then repeating the above 'git mv' will produce correct result,

renamed:    gitmvtest/resources/demo/content.txt ->
gitmvtest/src/main/resources/demo/content.txt       <== CORRECT
renamed:    gitmvtest/resources/dev/content.txt ->
gitmvtest/src/main/resources/dev/content.txt             <== CORRECT
renamed:    gitmvtest/resources/prod/content.txt ->
gitmvtest/src/main/resources/prod/content.txt          <== CORRECT



just want to see if this is a bug, user error (on my end), or??


Thanks,
Bill

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-24 23:38 git mv messed up file mapping if folders contain identical files Bill Okara
@ 2016-02-24 23:39 ` Junio C Hamano
  2016-02-24 23:51   ` Bill Okara
  2016-02-25 11:49 ` Kevin Daudt
  1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2016-02-24 23:39 UTC (permalink / raw)
  To: Bill Okara; +Cc: git

Bill Okara <billokara@gmail.com> writes:

> just want to see if this is a bug, user error (on my end), or??

Not a bug, not a user error, just "it does not matter", I think.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-24 23:39 ` Junio C Hamano
@ 2016-02-24 23:51   ` Bill Okara
  2016-02-25  0:03     ` Bill Okara
  0 siblings, 1 reply; 9+ messages in thread
From: Bill Okara @ 2016-02-24 23:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

it actually does matter in the following scenario:

1) master branch has identical content.txt files in the folder structure
2) do the git mv in a new branch
3) master branch updated the context.txt to contain new data (more
relevant to the containing folder)
4) new branch need to merge the updates from master

after the merge, demo/content.txt in the new path would contain
updates from dev/content.txt from master...

On Wed, Feb 24, 2016 at 4:39 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Bill Okara <billokara@gmail.com> writes:
>
>> just want to see if this is a bug, user error (on my end), or??
>
> Not a bug, not a user error, just "it does not matter", I think.
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-24 23:51   ` Bill Okara
@ 2016-02-25  0:03     ` Bill Okara
  0 siblings, 0 replies; 9+ messages in thread
From: Bill Okara @ 2016-02-25  0:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

following are the steps to help illustrate the issue. its actually
quite common to have projects include identical placeholder files
before update/refactoring...

// Create in master branch

> mkdir gittest
> cd gittest
> init
> mkdir -p gitmvtest/resources
> mkdir -p gitmvtest/resources/demo
> mkdir -p gitmvtest/resources/dev
> mkdir -p gitmvtest/resources/prod
> cd gitmvtest/resources/demo/
> echo 'This is the line that contains the common content’ >> content.txt
> cd ..
> cp demo/content.txt dev/.
> cp demo/content.txt prod/.
> cd ../..
> git add .
> git commit -m "master content.txt first commit"


# change to a new branch

> git checkout -b newTestBranch
> mkdir -p gitmvtest/src/main
> git mv gitmvtest/resources gitmvtest/src/main/.
> git commit -m "newTestBranch: mv resources to src/main/resources"

[newTestBranch 5faed26] newTestBranch: mv resources to src/main/resources
 3 files changed, 0 insertions(+), 0 deletions(-)

 rename gitmvtest/{resources/prod => src/main/resources/demo}/content.txt (100%)
 rename gitmvtest/{ => src/main}/resources/dev/content.txt (100%)
 rename gitmvtest/{resources/demo => src/main/resources/prod}/content.txt (100%)


// Change back to master and update the 3 files

> git checkout master
> echo 'this is DEMO content' >> gitmvtest/resources/demo/content.txt
> echo 'this is DEV content' >> gitmvtest/resources/dev/content.txt
> echo 'this is PROD content' >> gitmvtest/resources/prod/content.txt
> git add .
> git commit -m "master: update content.txt to include parent folder name"

[master 18d85f9] master: update content.txt to include parent folder name
 3 files changed, 3 insertions(+)


// Change back to the new branch and merge from master

> git checkout newTestBranch
> git merge master
Auto-merging gitmvtest/src/main/resources/prod/content.txt
Auto-merging gitmvtest/src/main/resources/dev/content.txt
Auto-merging gitmvtest/src/main/resources/demo/content.txt

Merge made by the 'recursive' strategy.
 gitmvtest/src/main/resources/demo/content.txt | 1 +
 gitmvtest/src/main/resources/dev/content.txt  | 1 +
 gitmvtest/src/main/resources/prod/content.txt | 1 +
 3 files changed, 3 insertions(+)


// SEE the ERROR:
//demo/content.txt wrongly merged with data from prod/content.txt

> cat gitmvtest/src/main/resources/demo/content.txt

This is the line that contains the common content
this is PROD content

On Wed, Feb 24, 2016 at 4:51 PM, Bill Okara <billokara@gmail.com> wrote:
> it actually does matter in the following scenario:
>
> 1) master branch has identical content.txt files in the folder structure
> 2) do the git mv in a new branch
> 3) master branch updated the context.txt to contain new data (more
> relevant to the containing folder)
> 4) new branch need to merge the updates from master
>
> after the merge, demo/content.txt in the new path would contain
> updates from dev/content.txt from master...
>
> On Wed, Feb 24, 2016 at 4:39 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Bill Okara <billokara@gmail.com> writes:
>>
>>> just want to see if this is a bug, user error (on my end), or??
>>
>> Not a bug, not a user error, just "it does not matter", I think.
>>
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-24 23:38 git mv messed up file mapping if folders contain identical files Bill Okara
  2016-02-24 23:39 ` Junio C Hamano
@ 2016-02-25 11:49 ` Kevin Daudt
  2016-02-25 13:56   ` Stefan Beller
  1 sibling, 1 reply; 9+ messages in thread
From: Kevin Daudt @ 2016-02-25 11:49 UTC (permalink / raw)
  To: Bill Okara; +Cc: git, Junio C Hamano, Stefan Beller

On Wed, Feb 24, 2016 at 04:38:11PM -0700, Bill Okara wrote:
> Hi,
> 
> I noticed the following 'git mv' issue with:
> git version 2.6.4
> 
> 
> If there are identical files in different subfolders, 'git mv' the
> root folder (and/or each file individually) will mess up the file path
> mapping. that is, if having identical 'content.txt' file under
> gitmvtest
>     |--demo/content.txt
>     |--dev/content.txt
>     |--prod/content.txt
> 
> after doing the "git mv gitmvtest/resources
> gitmvtest/src/main/resources", the 'git status' will show:
> 
> renamed:    gitmvtest/resources/demo/content.txt ->
> gitmvtest/src/main/resources/demo/content.txt
> renamed:    gitmvtest/resources/prod/content.txt ->
> gitmvtest/src/main/resources/dev/content.txt            <== NOTE:
> wrongly mapped the prod/content.txt to dev/content.txt
> renamed:    gitmvtest/resources/dev/content.txt ->
> gitmvtest/src/main/resources/prod/content.txt            <== NOTE:
> wrongly mapped the dev/content.txt to prod/content.txt
> 
> I tried running 'git mv' on each file individually, got the same problem:
> > git mv gitmvtest/resources/demo/content.txt gitmvtest/src/main/resources/demo/content.txt
> > git mv gitmvtest/resources/dev/content.txt gitmvtest/src/main/resources/dev/content.txt
> > git mv gitmvtest/resources/prod/content.txt gitmvtest/src/main/resources/prod/content.txt
> 
> > git status
> renamed:    gitmvtest/resources/demo/content.txt ->
> gitmvtest/src/main/resources/demo/content.txt
> renamed:    gitmvtest/resources/prod/content.txt ->
> gitmvtest/src/main/resources/dev/content.txt          <== WRONG
> renamed:    gitmvtest/resources/dev/content.txt ->
> gitmvtest/src/main/resources/prod/content.txt          <== WRONG
> 
> 
> NOTE:
> =======
> if modified the content.txt in the 3 folders to contain different
> data, then repeating the above 'git mv' will produce correct result,
> 
> renamed:    gitmvtest/resources/demo/content.txt ->
> gitmvtest/src/main/resources/demo/content.txt       <== CORRECT
> renamed:    gitmvtest/resources/dev/content.txt ->
> gitmvtest/src/main/resources/dev/content.txt             <== CORRECT
> renamed:    gitmvtest/resources/prod/content.txt ->
> gitmvtest/src/main/resources/prod/content.txt          <== CORRECT
> 
> 
> 
> just want to see if this is a bug, user error (on my end), or??
> 

This looks like the same issue as submodule--helper list has:
http://article.gmane.org/gmane.comp.version-control.git/287227

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-25 11:49 ` Kevin Daudt
@ 2016-02-25 13:56   ` Stefan Beller
  2016-02-25 16:25     ` Bill Okara
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-02-25 13:56 UTC (permalink / raw)
  To: Kevin Daudt; +Cc: Bill Okara, git@vger.kernel.org, Junio C Hamano

On Thu, Feb 25, 2016 at 3:49 AM, Kevin Daudt <me@ikke.info> wrote:
> On Wed, Feb 24, 2016 at 04:38:11PM -0700, Bill Okara wrote:
>> Hi,
>>
>> I noticed the following 'git mv' issue with:
>> git version 2.6.4
>>
>>
>> If there are identical files in different subfolders, 'git mv' the
>> root folder (and/or each file individually) will mess up the file path
>> mapping. that is, if having identical 'content.txt' file under
>> gitmvtest
>>     |--demo/content.txt
>>     |--dev/content.txt
>>     |--prod/content.txt
>>
>> after doing the "git mv gitmvtest/resources
>> gitmvtest/src/main/resources", the 'git status' will show:
>>
>> renamed:    gitmvtest/resources/demo/content.txt ->
>> gitmvtest/src/main/resources/demo/content.txt
>> renamed:    gitmvtest/resources/prod/content.txt ->
>> gitmvtest/src/main/resources/dev/content.txt            <== NOTE:
>> wrongly mapped the prod/content.txt to dev/content.txt
>> renamed:    gitmvtest/resources/dev/content.txt ->
>> gitmvtest/src/main/resources/prod/content.txt            <== NOTE:
>> wrongly mapped the dev/content.txt to prod/content.txt
>>
>> I tried running 'git mv' on each file individually, got the same problem:
>> > git mv gitmvtest/resources/demo/content.txt gitmvtest/src/main/resources/demo/content.txt
>> > git mv gitmvtest/resources/dev/content.txt gitmvtest/src/main/resources/dev/content.txt
>> > git mv gitmvtest/resources/prod/content.txt gitmvtest/src/main/resources/prod/content.txt
>>
>> > git status
>> renamed:    gitmvtest/resources/demo/content.txt ->
>> gitmvtest/src/main/resources/demo/content.txt
>> renamed:    gitmvtest/resources/prod/content.txt ->
>> gitmvtest/src/main/resources/dev/content.txt          <== WRONG
>> renamed:    gitmvtest/resources/dev/content.txt ->
>> gitmvtest/src/main/resources/prod/content.txt          <== WRONG
>>
>>
>> NOTE:
>> =======
>> if modified the content.txt in the 3 folders to contain different
>> data, then repeating the above 'git mv' will produce correct result,
>>
>> renamed:    gitmvtest/resources/demo/content.txt ->
>> gitmvtest/src/main/resources/demo/content.txt       <== CORRECT
>> renamed:    gitmvtest/resources/dev/content.txt ->
>> gitmvtest/src/main/resources/dev/content.txt             <== CORRECT
>> renamed:    gitmvtest/resources/prod/content.txt ->
>> gitmvtest/src/main/resources/prod/content.txt          <== CORRECT
>>
>>
>>
>> just want to see if this is a bug, user error (on my end), or??
>>
>
> This looks like the same issue as submodule--helper list has:
> http://article.gmane.org/gmane.comp.version-control.git/287227

The submodule--helper is not called from within git-mv, so it may be
a similar but not the same issue. ;)

Looking through the code, the pathspec is not treated according to the newest
style convention, I think it is one of the last places where the
pathspec internals
are poked with, instead of using parse_parsespec && match_parsespec.
(That said it is very old hence often tested code in the wild. old
code != bad code)

Stefan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-25 13:56   ` Stefan Beller
@ 2016-02-25 16:25     ` Bill Okara
  2016-02-26 11:50       ` SZEDER Gábor
  0 siblings, 1 reply; 9+ messages in thread
From: Bill Okara @ 2016-02-25 16:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Kevin Daudt, git@vger.kernel.org, Junio C Hamano

resent, forgot to reply to all...

I guess a bigger concern of this issue is the mess up of history. That
is, even if not doing an merge/update, just doing the 'git mv' will
messed up the file history, as shown in following:

// Add a new resources/qa/content.txt files with a new commit message:

> mkdir gitmvtest/resources/qa
> cp gitmvtest/resources/demo/content.txt gitmvtest/resources/qa/.
> git add .
>git commit -m "Add a new QA context.txt"
[master caba387] Add a new QA context.txt
 1 file changed, 2 insertions(+)

// Do the git mv

> git checkout -b branch5
> git mv gitmvtest/resources gitmvtest/src/main/.
> git commit -m "Move resources to src/main/resources"
[branch5 dd44309] Move resources to src/main/resources
 4 files changed, 0 insertions(+), 0 deletions(-)
 rename gitmvtest/{resources/qa =>
src/main/resources/demo}/content.txt (100%)    <-- WRONG
 rename gitmvtest/{resources/prod =>
src/main/resources/dev}/content.txt (100%)    <-- WRONG
 rename gitmvtest/{resources/dev =>
src/main/resources/prod}/content.txt (100%)    <-- WRONG
 rename gitmvtest/{resources/demo =>
src/main/resources/qa}/content.txt (100%)   <-- WRONG

// WRONG HISTORY
> git log --follow --oneline  gitmvtest/src/main/resources/demo/content.txt   <== demo/content.txt points to the new QA history
dd44309 Move resources to src/main/resources
caba387 Add a new QA context.txt                <== WRONG HISTORY



thanks,
Bill

On Thu, Feb 25, 2016 at 6:56 AM, Stefan Beller <sbeller@google.com> wrote:
> On Thu, Feb 25, 2016 at 3:49 AM, Kevin Daudt <me@ikke.info> wrote:
>> On Wed, Feb 24, 2016 at 04:38:11PM -0700, Bill Okara wrote:
>>> Hi,
>>>
>>> I noticed the following 'git mv' issue with:
>>> git version 2.6.4
>>>
>>>
>>> If there are identical files in different subfolders, 'git mv' the
>>> root folder (and/or each file individually) will mess up the file path
>>> mapping. that is, if having identical 'content.txt' file under
>>> gitmvtest
>>>     |--demo/content.txt
>>>     |--dev/content.txt
>>>     |--prod/content.txt
>>>
>>> after doing the "git mv gitmvtest/resources
>>> gitmvtest/src/main/resources", the 'git status' will show:
>>>
>>> renamed:    gitmvtest/resources/demo/content.txt ->
>>> gitmvtest/src/main/resources/demo/content.txt
>>> renamed:    gitmvtest/resources/prod/content.txt ->
>>> gitmvtest/src/main/resources/dev/content.txt            <== NOTE:
>>> wrongly mapped the prod/content.txt to dev/content.txt
>>> renamed:    gitmvtest/resources/dev/content.txt ->
>>> gitmvtest/src/main/resources/prod/content.txt            <== NOTE:
>>> wrongly mapped the dev/content.txt to prod/content.txt
>>>
>>> I tried running 'git mv' on each file individually, got the same problem:
>>> > git mv gitmvtest/resources/demo/content.txt gitmvtest/src/main/resources/demo/content.txt
>>> > git mv gitmvtest/resources/dev/content.txt gitmvtest/src/main/resources/dev/content.txt
>>> > git mv gitmvtest/resources/prod/content.txt gitmvtest/src/main/resources/prod/content.txt
>>>
>>> > git status
>>> renamed:    gitmvtest/resources/demo/content.txt ->
>>> gitmvtest/src/main/resources/demo/content.txt
>>> renamed:    gitmvtest/resources/prod/content.txt ->
>>> gitmvtest/src/main/resources/dev/content.txt          <== WRONG
>>> renamed:    gitmvtest/resources/dev/content.txt ->
>>> gitmvtest/src/main/resources/prod/content.txt          <== WRONG
>>>
>>>
>>> NOTE:
>>> =======
>>> if modified the content.txt in the 3 folders to contain different
>>> data, then repeating the above 'git mv' will produce correct result,
>>>
>>> renamed:    gitmvtest/resources/demo/content.txt ->
>>> gitmvtest/src/main/resources/demo/content.txt       <== CORRECT
>>> renamed:    gitmvtest/resources/dev/content.txt ->
>>> gitmvtest/src/main/resources/dev/content.txt             <== CORRECT
>>> renamed:    gitmvtest/resources/prod/content.txt ->
>>> gitmvtest/src/main/resources/prod/content.txt          <== CORRECT
>>>
>>>
>>>
>>> just want to see if this is a bug, user error (on my end), or??
>>>
>>
>> This looks like the same issue as submodule--helper list has:
>> http://article.gmane.org/gmane.comp.version-control.git/287227
>
> The submodule--helper is not called from within git-mv, so it may be
> a similar but not the same issue. ;)
>
> Looking through the code, the pathspec is not treated according to the newest
> style convention, I think it is one of the last places where the
> pathspec internals
> are poked with, instead of using parse_parsespec && match_parsespec.
> (That said it is very old hence often tested code in the wild. old
> code != bad code)
>
> Stefan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-25 16:25     ` Bill Okara
@ 2016-02-26 11:50       ` SZEDER Gábor
  2016-02-26 15:48         ` Bill Okara
  0 siblings, 1 reply; 9+ messages in thread
From: SZEDER Gábor @ 2016-02-26 11:50 UTC (permalink / raw)
  To: Bill Okara
  Cc: SZEDER Gábor, Karsten Blees, Stefan Beller, Kevin Daudt, git,
	Junio C Hamano

Hi,

Please don't top-post on this list.

> I guess a bigger concern of this issue is the mess up of history. That
> is, even if not doing an merge/update, just doing the 'git mv' will
> messed up the file history, as shown in following:
> 
> 
> // Add a new resources/qa/content.txt files with a new commit message:
> 
> > mkdir gitmvtest/resources/qa
> > cp gitmvtest/resources/demo/content.txt gitmvtest/resources/qa/.
> > git add .
> >git commit -m "Add a new QA context.txt"
> [master caba387] Add a new QA context.txt
>  1 file changed, 2 insertions(+)
> 
> // Do the git mv
> 
> > git checkout -b branch5
> > git mv gitmvtest/resources gitmvtest/src/main/.
> > git commit -m "Move resources to src/main/resources"
> [branch5 dd44309] Move resources to src/main/resources
>  4 files changed, 0 insertions(+), 0 deletions(-)
>  rename gitmvtest/{resources/qa =>
> src/main/resources/demo}/content.txt (100%)    <-- WRONG
>  rename gitmvtest/{resources/prod =>
> src/main/resources/dev}/content.txt (100%)    <-- WRONG
>  rename gitmvtest/{resources/dev =>
> src/main/resources/prod}/content.txt (100%)    <-- WRONG
>  rename gitmvtest/{resources/demo =>
> src/main/resources/qa}/content.txt (100%)   <-- WRONG
> 
> // WRONG HISTORY
> > git log --follow --oneline  gitmvtest/src/main/resources/demo/content.txt   <== demo/content.txt points to the new QA history
> dd44309 Move resources to src/main/resources
> caba387 Add a new QA context.txt                <== WRONG HISTORY

Git doesn't track copies and renames.

Git only tracks content and infers copies and renames from content
changes.  For example, if a commit removes path 'A' and adds path 'B'
then Git checks whether they both have identical (or very similar)
content, and reports this change as a rename if they do.  This is not
recorded anywhere in the repository, but 'git log --follow <path>'
performs this check upon seeing that the path in question doesn't
exist in the previous commit.

Anyway, diffcore used to handle your case better, and the patch below
restores the original behavior.

 ---- >8 ----

Subject: [PATCH] diffcore: fix iteration order of identical files during rename detection

If the two paths 'dir/A/file' and 'dir/B/file' have identical content
and the parent directory is renamed, e.g. 'git mv dir other-dir', then
diffcore reports the following exact renames:

    renamed:    dir/B/file -> other-dir/A/file
    renamed:    dir/A/file -> other-dir/B/file

While technically not wrong, this is confusing not only for the user,
but also for git commands that make decisions based on rename
information, e.g. 'git log --follow'.

This behavior is a side effect of commit v2.0.0-rc4~8^2~14
(diffcore-rename.c: simplify finding exact renames, 2013-11-14): the
hashmap storing sources returns entries from the same bucket, i.e.
sources matching the current destination, in LIFO order.  Thus the
iteration first examines 'other-dir/A/file' and 'dir/B/file' and, upon
finding identical content and basename, reports an exact rename.

Restore the original behavior by reversing the order of filling the
hashmap with source entries.

Reported-by: Bill Okara <billokara@gmail.com>
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---
 diffcore-rename.c      |  6 ++++--
 t/t4001-diff-rename.sh | 11 +++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index af1fe08861e6..69fcf77be02d 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -340,9 +340,11 @@ static int find_exact_renames(struct diff_options *options)
 	int i, renames = 0;
 	struct hashmap file_table;
 
-	/* Add all sources to the hash table */
+	/* Add all sources to the hash table in reverse order, because
+	 * later on they will be retrieved in LIFO order.
+	 */
 	hashmap_init(&file_table, NULL, rename_src_nr);
-	for (i = 0; i < rename_src_nr; i++)
+	for (i = rename_src_nr-1; i >= 0; i--)
 		insert_file_table(&file_table, i, rename_src[i].p->one);
 
 	/* Walk the destinations and find best source match */
diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
index 2f327b749588..ed90c6c6f984 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -77,6 +77,17 @@ test_expect_success 'favour same basenames even with minor differences' '
 	git show HEAD:path1 | sed "s/15/16/" > subdir/path1 &&
 	git status | test_i18ngrep "renamed: .*path1 -> subdir/path1"'
 
+test_expect_success 'two files with same basename and same content' '
+	git reset --hard &&
+	mkdir -p dir/A dir/B &&
+	cp path1 dir/A/file &&
+	cp path1 dir/B/file &&
+	git add dir &&
+	git commit -m 2 &&
+	git mv dir other-dir &&
+	git status | test_i18ngrep "renamed: .*dir/A/file -> other-dir/A/file"
+'
+
 test_expect_success 'setup for many rename source candidates' '
 	git reset --hard &&
 	for i in 0 1 2 3 4 5 6 7 8 9;
-- 
2.7.2.410.g92cb358

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: git mv messed up file mapping if folders contain identical files
  2016-02-26 11:50       ` SZEDER Gábor
@ 2016-02-26 15:48         ` Bill Okara
  0 siblings, 0 replies; 9+ messages in thread
From: Bill Okara @ 2016-02-26 15:48 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Karsten Blees, Stefan Beller, Kevin Daudt, git, Junio C Hamano

Hi,

On Fri, Feb 26, 2016 at 4:50 AM, SZEDER Gábor <szeder@ira.uka.de> wrote:
>
> Please don't top-post on this list.
>
Sorry about that. Just learned what that means and why it is bad...

> Subject: [PATCH] diffcore: fix iteration order of identical files during rename detection
>
> If the two paths 'dir/A/file' and 'dir/B/file' have identical content
> and the parent directory is renamed, e.g. 'git mv dir other-dir', then
> diffcore reports the following exact renames:
>
>     renamed:    dir/B/file -> other-dir/A/file
>     renamed:    dir/A/file -> other-dir/B/file
>
> While technically not wrong, this is confusing not only for the user,
> but also for git commands that make decisions based on rename
> information, e.g. 'git log --follow'.
>
> This behavior is a side effect of commit v2.0.0-rc4~8^2~14
> (diffcore-rename.c: simplify finding exact renames, 2013-11-14): the
> hashmap storing sources returns entries from the same bucket, i.e.
> sources matching the current destination, in LIFO order.  Thus the
> iteration first examines 'other-dir/A/file' and 'dir/B/file' and, upon
> finding identical content and basename, reports an exact rename.
>
> Restore the original behavior by reversing the order of filling the
> hashmap with source entries.
>
> Reported-by: Bill Okara <billokara@gmail.com>
> Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
> ---
>  diffcore-rename.c      |  6 ++++--
>  t/t4001-diff-rename.sh | 11 +++++++++++
>  2 files changed, 15 insertions(+), 2 deletions(-)

> diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
> index 2f327b749588..ed90c6c6f984 100755
> --- a/t/t4001-diff-rename.sh
> +++ b/t/t4001-diff-rename.sh
> @@ -77,6 +77,17 @@ test_expect_success 'favour same basenames even with minor differences' '
>         git show HEAD:path1 | sed "s/15/16/" > subdir/path1 &&
>         git status | test_i18ngrep "renamed: .*path1 -> subdir/path1"'
>
> +test_expect_success 'two files with same basename and same content' '
> +       git reset --hard &&
> +       mkdir -p dir/A dir/B &&
> +       cp path1 dir/A/file &&
> +       cp path1 dir/B/file &&
> +       git add dir &&
> +       git commit -m 2 &&
> +       git mv dir other-dir &&
> +       git status | test_i18ngrep "renamed: .*dir/A/file -> other-dir/A/file"
> +'
> +
>  test_expect_success 'setup for many rename source candidates' '
>         git reset --hard &&
>         for i in 0 1 2 3 4 5 6 7 8 9;
> --
> 2.7.2.410.g92cb358
>

Thank you very much for the fix and the detailed explanation!

-Bill

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-02-26 15:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-24 23:38 git mv messed up file mapping if folders contain identical files Bill Okara
2016-02-24 23:39 ` Junio C Hamano
2016-02-24 23:51   ` Bill Okara
2016-02-25  0:03     ` Bill Okara
2016-02-25 11:49 ` Kevin Daudt
2016-02-25 13:56   ` Stefan Beller
2016-02-25 16:25     ` Bill Okara
2016-02-26 11:50       ` SZEDER Gábor
2016-02-26 15:48         ` Bill Okara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).