* detecting rename->commit->modify->commit
@ 2008-05-01 14:10 Ittay Dror
2008-05-01 14:45 ` Jeff King
2008-05-01 14:54 ` Ittay Dror
0 siblings, 2 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 14:10 UTC (permalink / raw)
To: git
Hi,
Say I have a file A, I rename to 'B', commit, then change file B and
commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, it
will show 'B' as new (all of it with '+' prefix in the output). Am I right?
Thank you,
Ittay
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 14:10 detecting rename->commit->modify->commit Ittay Dror
@ 2008-05-01 14:45 ` Jeff King
2008-05-01 15:08 ` Ittay Dror
2008-05-01 14:54 ` Ittay Dror
1 sibling, 1 reply; 49+ messages in thread
From: Jeff King @ 2008-05-01 14:45 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On Thu, May 01, 2008 at 05:10:24PM +0300, Ittay Dror wrote:
> Say I have a file A, I rename to 'B', commit, then change file B and
> commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, it
> will show 'B' as new (all of it with '+' prefix in the output). Am I
> right?
Yes, it should find it, assuming the changes to B leave it recognizable.
Try:
mkdir repo && cd repo && git init
cp /usr/share/dict/words A
git add . && git commit -m added
mv A B && git add B && git commit -a -m rename
echo change >>B && git commit -a -m change
git diff -M HEAD^^.. | head -n 7
You should see something like:
diff --git a/A b/B
similarity index 99%
rename from A
rename to B
index 8e50f11..6525618 100644
--- a/A
+++ b/B
However, note the similarity index. If you change B so much that it
doesn't look close to the original A, then the rename is not detected
(and intentionally so -- the argument is that it is no longer a rename
in that context, but a rewritten file).
-Peff
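
A minimal sketch of how the similarity threshold changes the result, assuming a throwaway repository and a generated file (`seq 1 1000`) standing in for /usr/share/dict/words. The `demo` identity and the variable names are placeholders, not anything from the thread.

```shell
# Sketch: the same pair of commits, diffed with two rename thresholds.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

seq 1 1000 > A                          # stand-in for a large word list
git add A && git commit -q -m added
git mv A B && git commit -q -m rename
echo change >> B && git commit -q -a -m change

# Plain -M uses the default threshold; one appended line leaves B
# recognizable, so the pair is reported as a rename (status "R...").
default_out=$(git diff -M --name-status HEAD~2 HEAD)
echo "$default_out"

# -M100% accepts only exact renames, so the same pair is reported
# as a deletion of A plus an addition of B.
exact_out=$(git diff -M100% --name-status HEAD~2 HEAD)
echo "$exact_out"
```

The threshold after `-M` can be set anywhere between these two extremes.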
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 14:10 detecting rename->commit->modify->commit Ittay Dror
2008-05-01 14:45 ` Jeff King
@ 2008-05-01 14:54 ` Ittay Dror
2008-05-01 15:09 ` Jeff King
` (2 more replies)
1 sibling, 3 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 14:54 UTC (permalink / raw)
To: git
Also, would anyone like to comment on:
http://www.markshuttleworth.com/archives/123 (Renaming is the killer app
of distributed version control
<http://www.markshuttleworth.com/archives/123>)?
Thank you,
Ittay
Ittay Dror wrote:
> Hi,
>
> Say I have a file A, I rename to 'B', commit, then change file B and
> commit. Does 'git diff -M HEAD^^..' detect that? From what I see now,
> it will show 'B' as new (all of it with '+' prefix in the output). Am
> I right?
>
> Thank you,
> Ittay
>
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 14:45 ` Jeff King
@ 2008-05-01 15:08 ` Ittay Dror
2008-05-01 15:20 ` Jeff King
2008-05-01 15:24 ` Ittay Dror
0 siblings, 2 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 15:08 UTC (permalink / raw)
To: Jeff King; +Cc: git
But it doesn't work across directories :-(.
Try:
>mkdir foo
>echo "hello" > foo/A
>git add foo/A
>git commit -m 'foo/A'
>mkdir bar
>git mv foo/A bar
>git commit -m 'bar/A'
>echo "world" >> bar/A
>git add bar/A
>git commit -m 'bar/A world'
>git diff HEAD^^..HEAD^ | cat
diff --git a/foo/A b/bar/A
similarity index 100%
rename from foo/A
rename to bar/A
> git diff HEAD^^.. | cat
diff --git a/bar/A b/bar/A
new file mode 100644
index 0000000..94954ab
--- /dev/null
+++ b/bar/A
@@ -0,0 +1,2 @@
+hello
+world
diff --git a/foo/A b/foo/A
deleted file mode 100644
index ce01362..0000000
--- a/foo/A
+++ /dev/null
@@ -1 +0,0 @@
-hello
Jeff King wrote:
> On Thu, May 01, 2008 at 05:10:24PM +0300, Ittay Dror wrote:
>
>
>> Say I have a file A, I rename to 'B', commit, then change file B and
>> commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, it
>> will show 'B' as new (all of it with '+' prefix in the output). Am I
>> right?
>>
>
> Yes, it should find it, assuming the changes to B leave it recognizable.
> Try:
>
> mkdir repo && cd repo && git init
> cp /usr/share/dict/words A
> git add . && git commit -m added
> mv A B && git add B && git commit -a -m rename
> echo change >>B && git commit -a -m change
> git diff -M HEAD^^.. | head -n 7
>
> You should see something like:
>
> diff --git a/A b/B
> similarity index 99%
> rename from A
> rename to B
> index 8e50f11..6525618 100644
> --- a/A
> +++ b/B
>
> However, note the similarity index. If you change B so much that it
> doesn't look close to the original A, then the rename is not detected
> (and intentionally so -- the argument is that it is no longer a rename
> in that context, but a rewritten file).
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply related [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 14:54 ` Ittay Dror
@ 2008-05-01 15:09 ` Jeff King
2008-05-01 15:20 ` Ittay Dror
2008-05-01 15:30 ` David Tweed
2008-05-01 15:27 ` Avery Pennarun
2008-05-01 16:39 ` Sitaram Chamarty
2 siblings, 2 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 15:09 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On Thu, May 01, 2008 at 05:54:06PM +0300, Ittay Dror wrote:
> Also, would anyone like to comment on:
> http://www.markshuttleworth.com/archives/123 (Renaming is the killer app
> of distributed version control
> <http://www.markshuttleworth.com/archives/123>)?
My two cents:
1. I think he is overly obsessed with renaming. He seems concerned that
somebody will show up, make a big renaming patch, and then break your
system. Guess what? They can also show up, make a big code change patch,
and then break your system. In either case you have to review the
changes before accepting them, and it is up to the version control
system to show you the changes in a way you can understand.
2. I see the same old "git developers decided renaming wasn't
important" argument. I think this is bogus. I think renaming _is_
important, but I actually prefer git's approach of deducing renames,
because it reflects a fundamental property of git: we track states, not
changes, and git doesn't care how you arrive at each state. So I am free
to use a combination of git commands, editors, patch application tools,
or anything else to get my tree to the right place.
3. He doesn't like that git doesn't track _directory_ renames. This is
not a fundamental problem with git's approach (which could deduce
directory renames after the fact), but rather comes from the fact that
directory renames are controversial. That is, even if you know (through
deduction or because an explicit rename was recorded) that "subdir1"
moved to "subdir2", that doesn't necessarily mean that new files added
into "subdir1" should make that move, as well.
-Peff
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:09 ` Jeff King
@ 2008-05-01 15:20 ` Ittay Dror
2008-05-01 15:30 ` David Tweed
1 sibling, 0 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 15:20 UTC (permalink / raw)
To: Jeff King; +Cc: git
Jeff King wrote:
> My two cents:
>
> 1. I think he is overly obsessed with renaming. He seems concerned that
> somebody will show up, make a big renaming patch, and then break your
> system. Guess what? They can also show up, make a big code change patch,
> and then break your system. In either case you have to review the
> changes before accepting them, and it is up to the version control
> system to show you the changes in a way you can understand
I think he was more concerned that merges will break after such a change.
Ittay
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:08 ` Ittay Dror
@ 2008-05-01 15:20 ` Jeff King
2008-05-01 15:30 ` Ittay Dror
2008-05-01 20:39 ` Teemu Likonen
2008-05-01 15:24 ` Ittay Dror
1 sibling, 2 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 15:20 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On Thu, May 01, 2008 at 06:08:33PM +0300, Ittay Dror wrote:
> But it doesn't work across directories :-(.
Yes, it does.
> Try:
> >mkdir foo
> >echo "hello" > foo/A
> >git add foo/A
> >git commit -m 'foo/A'
> >mkdir bar
> >git mv foo/A bar
> >git commit -m 'bar/A'
> >echo "world" >> bar/A
> >git add bar/A
> >git commit -m 'bar/A world'
> >git diff HEAD^^..HEAD^ | cat
> diff --git a/foo/A b/bar/A
> similarity index 100%
> rename from foo/A
> rename to bar/A
See, it just worked across directories.
> > git diff HEAD^^.. | cat
> diff --git a/bar/A b/bar/A
> new file mode 100644
> index 0000000..94954ab
> --- /dev/null
> +++ b/bar/A
> @@ -0,0 +1,2 @@
> +hello
> +world
> diff --git a/foo/A b/foo/A
> deleted file mode 100644
> index ce01362..0000000
> --- a/foo/A
> +++ /dev/null
> @@ -1 +0,0 @@
> -hello
Of course it doesn't work here. You have two files, one containing
"hello\n" and one containing "hello\nworld\n". Their similarity is 50%,
which is not enough to consider it a rename. And I would argue that's
reasonable, since the files have only one line in common. The problem is
that you are using a toy example (which is why my example used
/usr/share/dict/words, which has enough content to definitively call it
a rename).
...
Hmm, looking at the code, though, 50% is supposed to be the default
minimum. So there might actually be a bug.
-Peff
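
For a toy file like this one, the threshold can be lowered explicitly. The sketch below rebuilds the foo/A and bar/A history in a scratch repository and asks for renames at 20%; the `demo` identity and the variable name are placeholders, and the 20% figure is an arbitrary choice for illustration.

```shell
# Sketch: the two-line toy example, diffed with a lowered threshold.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir foo && echo hello > foo/A
git add foo/A && git commit -q -m 'foo/A'
mkdir bar && git mv foo/A bar && git commit -q -m 'bar/A'
echo world >> bar/A && git commit -q -a -m 'bar/A world'

# With -M20%, a file that keeps half of its content still counts
# as a rename of foo/A to bar/A.
low_threshold=$(git diff -M20% --name-status HEAD~2 HEAD)
echo "$low_threshold"
```

Whether plain `-M` also reports the rename here depends on how the git version in use treats a score of exactly 50%, which is the question raised above.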
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:08 ` Ittay Dror
2008-05-01 15:20 ` Jeff King
@ 2008-05-01 15:24 ` Ittay Dror
2008-05-01 15:28 ` Jeff King
1 sibling, 1 reply; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 15:24 UTC (permalink / raw)
To: Jeff King; +Cc: git
Btw, this happened to me in a real use case. I wanted to restructure a
source tree. So I put it under git and started to happily move things
around, always committing after a move. I thought that git will
correctly identify these moves and show me the differences I made after
(in a separate commit). But it doesn't, and now that I want to prepare a
summary of the changes I've made, I'm stuck with a huge diff that is
hard to make sense of.
Ittay
Ittay Dror wrote:
> But it doesn't work across directories :-(.
>
> Try:
> >mkdir foo
> >echo "hello" > foo/A
> >git add foo/A
> >git commit -m 'foo/A'
> >mkdir bar
> >git mv foo/A bar
> >git commit -m 'bar/A'
> >echo "world" >> bar/A
> >git add bar/A
> >git commit -m 'bar/A world'
> >git diff HEAD^^..HEAD^ | cat
> diff --git a/foo/A b/bar/A
> similarity index 100%
> rename from foo/A
> rename to bar/A
> > git diff HEAD^^.. | cat
> diff --git a/bar/A b/bar/A
> new file mode 100644
> index 0000000..94954ab
> --- /dev/null
> +++ b/bar/A
> @@ -0,0 +1,2 @@
> +hello
> +world
> diff --git a/foo/A b/foo/A
> deleted file mode 100644
> index ce01362..0000000
> --- a/foo/A
> +++ /dev/null
> @@ -1 +0,0 @@
> -hello
>
>
>
>
>
> Jeff King wrote:
>> On Thu, May 01, 2008 at 05:10:24PM +0300, Ittay Dror wrote:
>>
>>
>>> Say I have a file A, I rename to 'B', commit, then change file B
>>> and commit. Does 'git diff -M HEAD^^..' detect that? From what I
>>> see now, it will show 'B' as new (all of it with '+' prefix in the
>>> output). Am I right?
>>>
>>
>> Yes, it should find it, assuming the changes to B leave it recognizable.
>> Try:
>>
>> mkdir repo && cd repo && git init
>> cp /usr/share/dict/words A
>> git add . && git commit -m added
>> mv A B && git add B && git commit -a -m rename
>> echo change >>B && git commit -a -m change
>> git diff -M HEAD^^.. | head -n 7
>>
>> You should see something like:
>>
>> diff --git a/A b/B
>> similarity index 99%
>> rename from A
>> rename to B
>> index 8e50f11..6525618 100644
>> --- a/A
>> +++ b/B
>>
>> However, note the similarity index. If you change B so much that it
>> doesn't look close to the original A, then the rename is not detected
>> (and intentionally so -- the argument is that it is no longer a rename
>> in that context, but a rewritten file).
>>
>> -Peff
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 14:54 ` Ittay Dror
2008-05-01 15:09 ` Jeff King
@ 2008-05-01 15:27 ` Avery Pennarun
2008-05-01 15:34 ` Jeff King
2008-05-01 16:39 ` Sitaram Chamarty
2 siblings, 1 reply; 49+ messages in thread
From: Avery Pennarun @ 2008-05-01 15:27 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On 5/1/08, Ittay Dror <ittayd@tikalk.com> wrote:
> Also, would anyone like to comment on:
> http://www.markshuttleworth.com/archives/123 (Renaming is
> the killer app of distributed version control
> <http://www.markshuttleworth.com/archives/123>)?
One of the comments linked to this:
http://automatthias.wordpress.com/2007/06/07/directory-renaming-in-scm/
Which points out that git doesn't really handle directory renames at
all. If someone creates file A/X then renames A to B, then merges
with someone who both added the file A/Y and modified A/X, git will
produce a tree containing (modified) B/X and (new) A/Y.
Technically this is "correct" in that no data is lost and there are no
conflicts, but it is obviously not what was "intended", which was that
the new file Y should have ended up in folder B.
Before you say this is not a realistic use case, I've personally had
this exact problem:
- I had a project with all of my work in a folder "src"
- I decided that the 'src' folder was redundant, so I moved it all to
the root folder
- Someone else was working on an old maintenance branch which still had 'src'
- When I merged from that person, some new files were created under
'src', and of course didn't work.
Since the maintenance branch was long-lived, this problem happened
repeatedly. That said, it's also pretty easy to work around, so it's
not the end of the world.
Have fun,
Avery
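
The scenario above can be reproduced roughly as follows. This is a sketch with several assumptions: the branch names `trunk` and `maint` are made up, and the `merge.directoryRenames=false` setting comes from later git releases and is not mentioned in this thread. It is used here only to pin down the behaviour described; without it, newer git versions may place or flag the new file differently.

```shell
# Sketch: one side renames directory A to B, the other side modifies
# A/X and adds A/Y, then the two are merged.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir A && seq 1 50 > A/X
git add A && git commit -q -m 'add A/X'
git branch -m trunk                     # hypothetical branch names
git branch maint

git mv A B && git commit -q -m 'rename A to B'

git checkout -q maint
echo 51 >> A/X && echo new > A/Y
git add A && git commit -q -m 'modify A/X, add A/Y'

git checkout -q trunk
# Assumption: turn off directory-rename handling so the merge behaves
# as described in this thread.
git -c merge.directoryRenames=false merge -q -m merge maint

merged_files=$(git ls-files)
echo "$merged_files"                    # the change to X follows it into B/; Y stays in A/
```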
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:24 ` Ittay Dror
@ 2008-05-01 15:28 ` Jeff King
0 siblings, 0 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 15:28 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On Thu, May 01, 2008 at 06:24:30PM +0300, Ittay Dror wrote:
> Btw, this happened to me in a real use case. I wanted to restructure a
> source tree. So I put it under git and started to happily move things
> around, always committing after a move. I thought that git will correctly
> identify these moves and show me the differences I made after (in a
> separate commit). But it doesn't, and now that I want to prepare a
> summary of the changes I've made, I'm stuck with a huge diff that is hard
> to make sense of.
If you have a specific case where you think renames should have been
detected but they weren't, by all means, please share it. It's possible
that there is a bug in the rename detection, or that the limits are not
set correctly, and we could improve it.
-Peff
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:09 ` Jeff King
2008-05-01 15:20 ` Ittay Dror
@ 2008-05-01 15:30 ` David Tweed
1 sibling, 0 replies; 49+ messages in thread
From: David Tweed @ 2008-05-01 15:30 UTC (permalink / raw)
To: Jeff King; +Cc: Ittay Dror, git
On Thu, May 1, 2008 at 4:09 PM, Jeff King <peff@peff.net> wrote:
> On Thu, May 01, 2008 at 05:54:06PM +0300, Ittay Dror wrote:
>
> > Also, would anyone like to comment on:
> > http://www.markshuttleworth.com/archives/123 (Renaming is the killer app
> > of distributed version control
> > <http://www.markshuttleworth.com/archives/123>)?
I'll just make the obvious point that he's talking about a problem and
an underlying cause:
The problem is not being able to successfully merge branches as time
goes by when one branch has had some renaming. He's decided the root
cause is not having an explicit representation of renames which would
enable the merges to succeed. So there are two questions:
1. Does development often happen where files get renamed and then
modified significantly in a distributed fashion but it is still
sensible to automatically merge the results?
2. Do you need explicit rename tracking to do an automatic merge in those cases?
I suspect that for 2 you don't in theory but considering all the
non-obvious possibilities would slow down the normal case of a
standard merge.
--
cheers, dave tweed__________________________
david.tweed@gmail.com
Rm 124, School of Systems Engineering, University of Reading.
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:20 ` Jeff King
@ 2008-05-01 15:30 ` Ittay Dror
2008-05-01 15:38 ` Jeff King
2008-05-01 15:47 ` Jakub Narebski
2008-05-01 20:39 ` Teemu Likonen
1 sibling, 2 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 15:30 UTC (permalink / raw)
To: Jeff King; +Cc: git
Jeff King wrote:
> Of course it doesn't work here. You have two files, one containing
> "hello\n" and one containing "hello\nworld\n". Their similarity is 50%,
> which is not enough to consider it a rename. And I would argue that's
> reasonable, since the files have only one line in common. The problem is
> that you are using a toy example (which is why my example used
> /usr/share/dict/words, which has enough content to definitively call it
> a rename).
>
>
Well, I would have expected git to notice that the file was renamed in
one commit and keep tracking changes afterwards.
Also, as I wrote in another post, this happened to me with real files of
a real source tree, and with very small changes (and sometimes not at
all) to these files.
Ittay
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:27 ` Avery Pennarun
@ 2008-05-01 15:34 ` Jeff King
2008-05-01 15:50 ` Avery Pennarun
2008-05-01 19:12 ` Steven Grimm
0 siblings, 2 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 15:34 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Ittay Dror, git
On Thu, May 01, 2008 at 11:27:34AM -0400, Avery Pennarun wrote:
> Before you say this is not a realistic use case, I've personally had
> this exact problem:
>
> - I had a project with all of my work in a folder "src"
> - I decided that the 'src' folder was redundant, so I moved it all to
> the root folder
> - Someone else was working on an old maintenance branch which still had 'src'
> - When I merged from that person, some new files were created under
> 'src', and of course didn't work.
Sure. But we've also had the exact case of:
- there are some files in subdir/, but that is not a good name, and
there is something else that you are going to add that would be
better named as subdir/.
- you rename subdir/ to bettername/
- you create subdir/newfile
but you _don't_ want newfile to go into bettername/. It's _replacing_
what went into bettername/.
So I don't think you can always track the intent automatically.
Though if you could specify the intent to the SCM, you could
differentiate at the time of move between these two cases, and the merge
could do the right thing later. Or alternatively, you could specify at
time of merge which to do. It's just that nobody has implemented it.
-Peff
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:30 ` Ittay Dror
@ 2008-05-01 15:38 ` Jeff King
2008-05-01 15:47 ` Jakub Narebski
1 sibling, 0 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 15:38 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On Thu, May 01, 2008 at 06:30:46PM +0300, Ittay Dror wrote:
> Well, I would have expected git to notice that the file was renamed in
> one commit and keep tracking changes afterwards.
That's not how git works, and that's not what you asked it to do. You
gave it two states and asked it to diff between them. It never even
looked at the intermediate steps (and that's generally why git is so
fast). If you want to follow the history and look at every commit, then
that is something that _can_ be done, and does get done with things like
"git log --follow". But there is no diff mode currently implemented that
will crawl the history looking for interesting things.
-Peff
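
A small sketch of the difference between following history and diffing two endpoints, using the foo/A and bar/A example in a scratch repository. The `demo` identity and the variable names are placeholders.

```shell
# Sketch: walking history with --follow versus limiting by path alone.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir foo && echo hello > foo/A
git add foo/A && git commit -q -m 'foo/A'
mkdir bar && git mv foo/A bar && git commit -q -m 'bar/A'
echo world >> bar/A && git commit -q -a -m 'bar/A world'

# Path-limited log stops at the commit that introduced bar/A.
plain=$(git log --format=%s -- bar/A)
# --follow inspects each commit in turn, sees the exact rename in the
# middle commit, and continues into the history of foo/A.
followed=$(git log --follow --format=%s -- bar/A)

echo "without --follow:"; echo "$plain"
echo "with --follow:";    echo "$followed"
```

Because each step is checked one commit at a time, the middle commit is an exact rename and is picked up regardless of how small the file is.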
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:30 ` Ittay Dror
2008-05-01 15:38 ` Jeff King
@ 2008-05-01 15:47 ` Jakub Narebski
1 sibling, 0 replies; 49+ messages in thread
From: Jakub Narebski @ 2008-05-01 15:47 UTC (permalink / raw)
To: Ittay Dror; +Cc: Jeff King, git
Ittay Dror <ittayd@tikalk.com> writes:
> Jeff King wrote:
> >
> > Of course it doesn't work here. You have two files, one containing
> > "hello\n" and one containing "hello\nworld\n". Their similarity is 50%,
> > which is not enough to consider it a rename. And I would argue that's
> > reasonable, since the files have only one line in common. The problem is
> > that you are using a toy example (which is why my example used
> > /usr/share/dict/words, which has enough content to definitively call it
> > a rename).
> >
> >
> Well, I would have expected git to notice that the file was renamed in
> one commit and keep tracking changes afterwards.
>
> Also, as I wrote in another post, this happened to me with real files
> of a real source tree, and with very small changes (and sometimes not
> at all) to these files.
The idea of rename detection is to help with merges. If the files are
different enough that content based (similarity based) rename
detection doesn't detect rename, they are usually too different to
merge automatically anyway.
--
Jakub Narebski
Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:34 ` Jeff King
@ 2008-05-01 15:50 ` Avery Pennarun
2008-05-01 16:48 ` Jeff King
2008-05-01 19:12 ` Steven Grimm
1 sibling, 1 reply; 49+ messages in thread
From: Avery Pennarun @ 2008-05-01 15:50 UTC (permalink / raw)
To: Jeff King; +Cc: Ittay Dror, git
On 5/1/08, Jeff King <peff@peff.net> wrote:
> On Thu, May 01, 2008 at 11:27:34AM -0400, Avery Pennarun wrote:
>
> > Before you say this is not a realistic use case, I've personally had
> > this exact problem:
> >
> > - I had a project with all of my work in a folder "src"
> > - I decided that the 'src' folder was redundant, so I moved it all to
> > the root folder
> > - Someone else was working on an old maintenance branch which still had 'src'
> > - When I merged from that person, some new files were created under
> > 'src', and of course didn't work.
>
>
> Sure. But we've also had the exact case of:
>
> - there are some files in subdir/ [1], but that is not a good name, and
> there is something else that you are going to add that would be
> better named as subdir/.
> - you rename subdir/ to bettername/ [2]
> - you create subdir/newfile [3]
>
> but you _don't_ want newfile to go into bettername/. It's _replacing_
> what went into bettername/.
I would argue that this is a sort of "directory splitting" operation.
That is, all anyone ever did was add some files to a subdir/ that
already existed [1], *or* move all the files from subdir/ to a
previously-empty bettername/ [2], *or* create a new subdir/ and add
files to it [3]. In each case, no merge operation was necessary and it
is completely obvious by comparing "before and after" trees which case
it was.
I guess my argument here is just that it should be *possible* to
deduce and implement both cases at merge time just fine using git's
existing storage model. It just hasn't been implemented yet. (And
incidentally, I think that's totally awesome and I'd never want to go
back to an explicit rename tracking model.)
I should shut up now because the actual merge machinery scares me and
I'm not willing to volunteer to write a patch for this one :)
Have fun,
Avery
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 14:54 ` Ittay Dror
2008-05-01 15:09 ` Jeff King
2008-05-01 15:27 ` Avery Pennarun
@ 2008-05-01 16:39 ` Sitaram Chamarty
2008-05-01 18:58 ` Ittay Dror
2 siblings, 1 reply; 49+ messages in thread
From: Sitaram Chamarty @ 2008-05-01 16:39 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On Thu, May 1, 2008 at 8:24 PM, Ittay Dror <ittayd@tikalk.com> wrote:
> Also, would anyone like to comment on:
> http://www.markshuttleworth.com/archives/123 (Renaming is the killer app of
> distributed version control <http://www.markshuttleworth.com/archives/123>)?
someone already did, albeit in just discussion form rather than
examples, in a comment on that same page:
http://www.markshuttleworth.com/archives/123#comment-118655
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:50 ` Avery Pennarun
@ 2008-05-01 16:48 ` Jeff King
2008-05-01 19:45 ` Avery Pennarun
0 siblings, 1 reply; 49+ messages in thread
From: Jeff King @ 2008-05-01 16:48 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Ittay Dror, git
On Thu, May 01, 2008 at 11:50:31AM -0400, Avery Pennarun wrote:
> I would argue that this is a sort of "directory splitting" operation.
> That is, all anyone ever did was add some files to a subdir/ that
> already existed [1], *or* move all the files from subdir/ to a
> previously-empty bettername/ [2], *or* create a new subdir/ and add
> files to it [3]. In each case, no merge operation was necessary and it
> is completely obvious by comparing "before and after" trees which case
> it was.
I don't see it. I think the steps are exactly the same as in your
example. Consider:
1. You have some files in src/
2. All of the files from src/ get moved away
3. You merge in somebody else's work which adds a file in src/, but
their work is based on a commit which predates 2.
The question is: if they had seen 2., would they have put the file into
src/, or into the new location? I think the answer depends on the
semantics of the file. If it is semantically an addition to the source
code that got moved, then yes. If it is a _replacement_ for the
source code that got moved, then no.
> I guess my argument here is just that it should be *possible* to
> deduce and implement both cases at merge time just fine using git's
> existing storage model. It just hasn't been implemented yet. (And
> incidentally, I think that's totally awesome and I'd never want to go
> back to an explicit rename tracking model.)
I think you lack information to decide automatically between the two
cases listed above. But I think in most cases it would be sufficient for
the tool to say "this directory seems to have moved, but this new file
was added in it" and let the user decide which makes sense.
> I should shut up now because the actual merge machinery scares me and
> I'm not willing to volunteer to write a patch for this one :)
It would probably start not with merge machinery, but with diff
machinery to detect "directory has moved". But that is also scary. :)
You could also do this totally _outside_ of git, similar to
git-mergetool. Wait until you get a conflict, and then run a script
which looks at the two endpoints and the merge base and says "Oh, maybe
this is a good way of resolving."
-Peff
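
As a rough illustration of the "outside of git" idea, a helper could compare the merge base with one endpoint and report renames whose directory changed. The function name `list_dir_moves` is hypothetical, and the sketch assumes path names without tabs or other characters that git would quote.

```shell
# list_dir_moves BASE TIP
# Print "olddir -> newdir" once for every directory that had at least
# one file renamed out of it into a different directory between BASE
# and TIP.
list_dir_moves() {
    git diff -M --name-status --diff-filter=R "$1" "$2" |
    while IFS="$(printf '\t')" read -r _status old new; do
        old_dir=$(dirname "$old")
        new_dir=$(dirname "$new")
        if [ "$old_dir" != "$new_dir" ]; then
            echo "$old_dir -> $new_dir"
        fi
    done | sort -u
}
```

It might be called as `list_dir_moves "$(git merge-base mine theirs)" mine`, where `mine` and `theirs` are placeholder branch names. A new file added by the other side under a reported old directory would then be a candidate for asking the user where it belongs.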
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 16:39 ` Sitaram Chamarty
@ 2008-05-01 18:58 ` Ittay Dror
0 siblings, 0 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-01 18:58 UTC (permalink / raw)
To: Sitaram Chamarty; +Cc: git
Sitaram Chamarty wrote:
> http://www.markshuttleworth.com/archives/123#comment-118655
>
Here is the comment from the thread, my comment on it is below:
> This is a very strong point for renaming, but it is not necessarily
> a universal one.
>
> Here is one example of the issue: one developer renaming a directory
> in his branch, and another adding a file to the original directory in
> his branch. What happens at the merge?
>
> - Bazaar renames the directory and puts the new file in the _renamed_
>   directory.
> - Git renames the directory with its files, but keeps the old
>   directory too and adds the new file there.
>
> Bazaar’s behavior certainly is better for C. However it is not
> universally better.
>
> For example in Java you cannot rename a file without changing its
> contents. So, moving a file to a directory different from where its
> author put it will almost certainly break the build.
>
> The bottom line is, both behaviors can seem valid or broken,
> depending on the case. Neither is perfect. At the very abstract level
> file renames are _not_ a first-class operation. This is especially
> apparent in a language like Java.
>
> Content movement is the first class operation. Things like moving
> functions, etc. The question is how one can handle that and whether the
> current strategy has a path for improvement. It could be argued that
> once you commit yourself to explicitly tracking file renames, you are
> giving up a slew of opportunities for handling the more general cases.
>
> One thing is for certain, a 100% ideal solution is impossible. It
> would have to be aware of the target programming language _and_ the
> build environment.
And my comment is that in this example, about Java, I think that
manually fixing the package name in the file (after noticing the build
is broken) is easy. On the other hand, if the other developer changed
one of the renamed files, then manually merging the change in the file in
the old location to the file in the new location is not so easy: you
first need to discover that this happened, then merge the two files (and
you still need to fix the package name).
ittay
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit
2008-05-01 15:34 ` Jeff King
2008-05-01 15:50 ` Avery Pennarun
@ 2008-05-01 19:12 ` Steven Grimm
2008-05-01 23:14 ` Jeff King
1 sibling, 1 reply; 49+ messages in thread
From: Steven Grimm @ 2008-05-01 19:12 UTC (permalink / raw)
To: Jeff King; +Cc: Avery Pennarun, Ittay Dror, git
On May 1, 2008, at 8:34 AM, Jeff King wrote:
> So I don't think you can always track the intent automatically.
That is absolutely true. You have to pick one case or the other as the
default unless there's some way to tell the system your intent either
at merge time or at move time.
However, that leaves the question of which default will be wrong the
least often.
In my personal experience, I think a directory rename has almost
always meant that I would want new files to appear in the new
directory rather than to recreate the old directory. I can't think of
a single time when I've wanted git's current behavior (though maybe
it's happened on occasion) but the current behavior has tripped me up
more than once and forced me to do extra work shuffling things around
by hand post-merge. I acknowledge that there exist cases where the
current behavior is correct -- but in my experience they're the
minority.
Of course, the discussion is moot anyway until someone writes code to
detect the situation; my impression is the current behavior is the way
it is simply because it's what naturally happens in the absence of
merge-time detection of a directory getting renamed.
-Steve
* Re: detecting rename->commit->modify->commit
2008-05-01 16:48 ` Jeff King
@ 2008-05-01 19:45 ` Avery Pennarun
2008-05-01 22:42 ` Jeff King
0 siblings, 1 reply; 49+ messages in thread
From: Avery Pennarun @ 2008-05-01 19:45 UTC (permalink / raw)
To: Jeff King; +Cc: Ittay Dror, git
On Thu, May 1, 2008 at 12:48 PM, Jeff King <peff@peff.net> wrote:
> I don't see it. I think the steps are exactly the same as in your
> example. Consider:
>
> 1. You have some files in src/
> 2. All of the files from src/ get moved away
> 3. You merge in somebody else's work which adds a file in src/, but
> their work is based on a commit which predates 2.
>
> The question is: if they had seen 2., would they have put the file into
> src/, or into the new location? I think the answer depends on the
> semantics of the file. If it is semantically an addition to the source
> code that got moved, then yes. If it is a _replacement_ for the
> source code that got moved, then no.
I promised I would shut up, and I apparently didn't. Sorry :)
I think this case isn't so hard. Basically, a merge involves three
commits; the merge-base, my branch, and your branch.
In your example above, we compare the merge-base to the new version;
in that case, the new file is in an *existing* directory which
definitely corresponds to src/ in #1, because the new version has
never even heard about src/ being deleted. Thus, the file must be
intended to be part of the original src/, wherever it may now be.
In contrast, if the merge-base already had src/ being renamed, and
someone put something into src/, we'd know that they're putting it
into a fundamentally different directory than the moved src/.
Exactly how you track the "identity" of a directory without breaking
things down by individual commit sounds a little complicated, but it
feels to me like it should be possible.
I suspect this is a generalization of the earlier discussion (a few
months ago) that I read in the archive about git's handling of empty
directories. Right now git does weird things with directory
creation/deletion because directories are not first-class citizens.
Anyway, as with the empty directory stuff, if I occasionally have to
mkdir/rmdir a couple things and rename a few files after doing a
merge, I'm not going to cry too much. It sure beats explicitly
tracking renames and then having an oops-I-forgot-to-explicitly-track
rename throw a monkey wrench into my merges, which svn has saddled me
with lots of times.
Have fun,
Avery
* Re: detecting rename->commit->modify->commit
2008-05-01 15:20 ` Jeff King
2008-05-01 15:30 ` Ittay Dror
@ 2008-05-01 20:39 ` Teemu Likonen
2008-05-01 23:09 ` Jeff King
2008-05-02 2:06 ` Sitaram Chamarty
1 sibling, 2 replies; 49+ messages in thread
From: Teemu Likonen @ 2008-05-01 20:39 UTC (permalink / raw)
To: Jeff King; +Cc: Ittay Dror, git
Jeff King wrote (2008-05-01 11:20 -0400):
> Hmm, looking at the code, though, 50% is supposed to be the default
> minimum. So there might actually be a bug.
I did some testing... A file, containing 10 lines (about 200 bytes),
renamed and then modified (similarity index being a bit over 50%). Git
detected the rename just fine with "git diff -M" over the rename and
change. When I edited the file even more (similarity only 40%) "git diff
-M" didn't detect the rename but "git diff -M4" did. To me it looks like
this works nicely, better than I expected, actually.
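A minimal sketch of this kind of test follows. The temporary repository, the file names A and B, and the helper rename_status are made up for illustration, and the file is larger than the 10-line one described above, so the similarity it reports will differ.

```shell
#!/bin/sh
# Build a throwaway repository: add A, rename it to B, then modify B.
demo_repo=$(mktemp -d)
(
    cd "$demo_repo"
    git init -q
    git config user.name "Example"
    git config user.email "example@example.com"
    awk 'BEGIN { for (i = 1; i <= 100; i++) print i }' > A
    git add A && git commit -q -m added
    git mv A B && git commit -q -m rename
    echo change >> B && git commit -q -a -m change
)

# rename_status OPTION (hypothetical helper): show the name-status diff
# across the rename and the change, using the given -M option.
rename_status() {
    git -C "$demo_repo" diff "$1" --name-status HEAD~2 HEAD
}

rename_status -M     # default similarity threshold
rename_status -M4    # threshold lowered to 40%
```

A line starting with R in the output means the pair was reported as a rename; separate A and D lines mean it was not.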
Smaller files than that do not seem to work with "git diff -M" over the
rename and changes. They can be followed with "git log --follow -p"
which works even with the two-line "hello\nworld". And of course there
is always
    git diff commit1:path1/file1 commit2:path2/file2
I'd conclude that for logs and diffs renames are detected very nicely
and there's no problem at all to get wanted information from the repo.
I wonder how this rename detection/tracking has become such a big thing,
a debate even. But maybe merges are different.
* Re: detecting rename->commit->modify->commit
2008-05-01 19:45 ` Avery Pennarun
@ 2008-05-01 22:42 ` Jeff King
0 siblings, 0 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 22:42 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Ittay Dror, git
On Thu, May 01, 2008 at 03:45:07PM -0400, Avery Pennarun wrote:
> In your example above, we compare the merge-base to the new version;
> in that case, the new file is in an *existing* directory which
> definitely corresponds to src/ in #1, because the new version has
> never even heard about src/ being deleted. Thus, the file must be
> intended to be part of the original src/, wherever it may now be.
I disagree with the final statement of the quoted paragraph above.
Just because you didn't build on the commit that moved src/* doesn't
mean the thing you put in src/ was intended to be moved along with src/.
For example:
  - it might have been new work unrelated to the existing work in src/
    that got moved
  - it might have been a replacement for the work in src/ that was
    started before the movement. E.g., developer1 begins the replacement
    work. developer2 moves the old work out of the way. When the
    branches are merged, you don't want developer1's work moved.
And yes, I think those are probably less common than "it should be moved
along with src/*". My point isn't that this isn't a valuable construct,
but that we should stop short of mind-reading, and focus on making it
_easy_ to see what happened and to concisely specify the choice and
proceed.
-Peff
* Re: detecting rename->commit->modify->commit
2008-05-01 20:39 ` Teemu Likonen
@ 2008-05-01 23:09 ` Jeff King
2008-05-02 2:06 ` Sitaram Chamarty
1 sibling, 0 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 23:09 UTC (permalink / raw)
To: Teemu Likonen; +Cc: Junio C Hamano, Ittay Dror, git
[cc'd Junio for comments on this rename optimization]
On Thu, May 01, 2008 at 11:39:40PM +0300, Teemu Likonen wrote:
> > Hmm, looking at the code, though, 50% is supposed to be the default
> > minimum. So there might actually be a bug.
>
> I did some testing... A file, containing 10 lines (about 200 bytes),
> renamed and then modified (similarity index being a bit over 50%). Git
Ah, OK. The problem comes because the toy example is so tiny. It hits
this code chunk:
    if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
        return 0;
where base_size is the size of the smaller file in bytes, and delta_size
is the difference between the size of the two files. This is an
optimization so that we don't even have to look at the contents.
But it is basing the percentage off of the smaller file, so even though
file B ("hello\nworld\n") is 50% made up of file A ("hello\n"), we
actually end up saying "there must be at least as much content added to
make B as there is in A already". IOW, the "percentage similarity" is
based off of the smaller file for this optimization.
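To make the arithmetic concrete, here is a rough sketch of that size-only check in shell integer arithmetic. The function name size_precheck is made up, and MAX_SCORE=60000 with a 50% minimum of 30000 is an assumption about the internal scale; only the ratio between the two matters for the outcome.

```shell
#!/bin/sh
# Assumed scale: a score of MAX_SCORE means "100% similar".
MAX_SCORE=60000

# size_precheck SIZE_A SIZE_B MIN_SCORE (hypothetical helper)
# Prints "skip" if the pair is rejected on file sizes alone,
# "compare" if the contents would still be looked at.
size_precheck() {
    if [ "$1" -lt "$2" ]; then
        base=$1
        delta=$(( $2 - $1 ))
    else
        base=$2
        delta=$(( $1 - $2 ))
    fi
    if [ $(( base * (MAX_SCORE - $3) )) -lt $(( delta * MAX_SCORE )) ]; then
        echo skip
    else
        echo compare
    fi
}

# "hello\n" (6 bytes) against "hello\nworld\n" (12 bytes), 50% minimum:
size_precheck 6 12 30000      # prints "skip"
# A 200-byte file that grew to 280 bytes, 50% minimum:
size_precheck 200 280 30000   # prints "compare"
# The same file grown to 320 bytes:
size_precheck 200 320 30000   # prints "skip"
```

Put differently, at a 50% minimum the check only lets a pair through when the size difference is at most half the size of the smaller file.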
Obviously this is a toy case, but I wonder if there are other larger
cases where you end up with a file which has substantial copied content,
but also _grows_ a lot (not just changes). For example, consider the
file:
1
2
3
4
5
6
7
8
9
that is, nine lines each with a number. Now rename it, and start adding
more numbers. We detect the addition of 10, 11, 12. But adding 13 means
we no longer match. So even with only 4 lines added, we fail to match.
But again, this is a bit of a toy case. It relies on the line length
being a significant factor compared to number of lines.
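The numbers in this example can be checked with a short sketch. It assumes the file listed above holds the lines 1 through 9 (18 bytes), that each appended two-digit line adds 3 bytes, and the same assumed scale as the quoted check (MAX_SCORE of 60000, 50% minimum of 30000). The function name is made up.

```shell
#!/bin/sh
MAX_SCORE=60000   # assumed internal scale
MIN_SCORE=30000   # assumed 50% default minimum
base=18           # bytes in "1\n" through "9\n"

# last_line_passing (hypothetical helper): prints the last appended number
# for which the size-only check still lets the pair through.
last_line_passing() {
    delta=0
    n=9
    while :; do
        next_delta=$(( delta + 3 ))
        if [ $(( base * (MAX_SCORE - MIN_SCORE) )) -lt $(( next_delta * MAX_SCORE )) ]; then
            break
        fi
        delta=$next_delta
        n=$(( n + 1 ))
    done
    echo "$n"
}

last_line_passing   # prints 12
```

With 9 added bytes the two sides of the comparison are equal, so the pair still passes; the 12 bytes reached by adding 13 tip it over.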
-Peff
* Re: detecting rename->commit->modify->commit
2008-05-01 19:12 ` Steven Grimm
@ 2008-05-01 23:14 ` Jeff King
2008-05-03 17:56 ` merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) Ittay Dror
2008-05-08 18:17 ` detecting rename->commit->modify->commit Jeff King
0 siblings, 2 replies; 49+ messages in thread
From: Jeff King @ 2008-05-01 23:14 UTC (permalink / raw)
To: Steven Grimm; +Cc: Avery Pennarun, Ittay Dror, git
On Thu, May 01, 2008 at 12:12:33PM -0700, Steven Grimm wrote:
> However, that leaves the question of which default will be wrong the
> least often.
>
> In my personal experience, I think a directory rename has almost always
> meant that I would want new files to appear in the new directory rather
I do agree that the rename is probably more often desired.
> Of course, the discussion is moot anyway until someone writes code to
> detect the situation; my impression is the current behavior is the way it
> is simply because it's what naturally happens in the absence of
> merge-time detection of a directory getting renamed.
Yes, I think that is largely a correct impression (although I think
Linus has spoken out against directory renaming in the past, so there is
at least a little bit of conscious effort). I suspect the right sequence
of steps to implement this would be:
1. write a proof-of-concept that shows directory renaming after the
fact (e.g., take a conflicted merge, scan the diff for directory
renames, and then fix up the files). That way it is available, but
doesn't impact git at all.
2. If people think it is useful, build it into the diff and merge
machinery so that it can happen automagically, but make it
optional. Thus git fully supports it, but the policy decision is
left up to the user.
3. Make it the default if it is the common choice.
So we just need somebody to volunteer to work on 1. ;)
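As a very rough starting point for step 1, the sketch below scans name-status output for file renames and counts how many of them share the same (old directory, new directory) pair. The helper name dir_renames is made up, and this is only a heuristic over diff output, not a merge fix-up tool.

```shell
#!/bin/sh
# dir_renames (hypothetical helper): reads "git diff -M --name-status"
# output on stdin and prints "COUNT<TAB>OLDDIR<TAB>NEWDIR" for every pair
# of directories that file renames moved between, most frequent first.
dir_renames() {
    awk -F '\t' '
        $1 ~ /^R/ {
            old = $2
            new = $3
            # Strip the final path component to get the directory.
            if (!sub("/[^/]*$", "", old)) old = "."
            if (!sub("/[^/]*$", "", new)) new = "."
            if (old != new) count[old "\t" new]++
        }
        END {
            for (pair in count) printf "%d\t%s\n", count[pair], pair
        }
    ' | sort -rn
}

# Canned input standing in for real diff output:
printf 'R100\tsrc/a.c\tlib/a.c\nR095\tsrc/b.c\tlib/b.c\nM\tREADME\nA\tsrc/new.c\n' |
    dir_renames
```

Against a real repository the input would come from something like "git diff -M --name-status <base> <branch>". A file such as src/new.c, added on the other side under a directory that shows up as a rename source, is the candidate a fix-up script would then offer to move.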
-Peff
* Re: detecting rename->commit->modify->commit
2008-05-01 20:39 ` Teemu Likonen
2008-05-01 23:09 ` Jeff King
@ 2008-05-02 2:06 ` Sitaram Chamarty
2008-05-02 2:38 ` Junio C Hamano
1 sibling, 1 reply; 49+ messages in thread
From: Sitaram Chamarty @ 2008-05-02 2:06 UTC (permalink / raw)
To: Teemu Likonen; +Cc: Jeff King, Ittay Dror, git
On Fri, May 2, 2008 at 2:09 AM, Teemu Likonen <tlikonen@iki.fi> wrote:
> -M" didn't detect the rename but "git diff -M4" did. To me it looks like
> this works nicely, better than I expected, actually.
err... I didn't realise -M had an option, and I just double checked
the man pages for diff, diff-files, diff-index, and diff-tree. What
does the 4 mean?
Sitaram
* Re: detecting rename->commit->modify->commit
2008-05-02 2:06 ` Sitaram Chamarty
@ 2008-05-02 2:38 ` Junio C Hamano
2008-05-02 16:59 ` Sitaram Chamarty
0 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2008-05-02 2:38 UTC (permalink / raw)
To: Sitaram Chamarty; +Cc: Teemu Likonen, Jeff King, Ittay Dror, git
"Sitaram Chamarty" <sitaramc@gmail.com> writes:
> On Fri, May 2, 2008 at 2:09 AM, Teemu Likonen <tlikonen@iki.fi> wrote:
>
>> -M" didn't detect the rename but "git diff -M4" did. To me it looks like
>> this works nicely, better than I expected, actually.
>
> err... I didn't realise -M had an option, and I just double checked
> the man pages for diff, diff-files, diff-index, and diff-tree. What
> does the 4 mean?
The options -M<num>, -C<num>, and -B<num>/<num> mean "raise or lower the
similarity threshold to <num> / 10^N", where N is the number of digits in
<num>. IOW, you will always be expressing a number between 0 and 1.
You should also be able to say -M40% but that is an ancient part of the
code base so I might be misremembering things.
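A small sketch of that rule, for illustration only: the function threshold_percent is made up and is not git's parser. It applies <num> / 10^N and prints the result as a whole percentage, truncating any fraction.

```shell
#!/bin/sh
# threshold_percent NUM (hypothetical helper)
# Reads NUM as NUM / 10^N, where N is the number of digits in NUM, and
# prints the result as a whole percentage (fractions are truncated).
threshold_percent() {
    num=$1
    digits=${#num}
    denom=1
    i=0
    while [ "$i" -lt "$digits" ]; do
        denom=$(( denom * 10 ))
        i=$(( i + 1 ))
    done
    # Strip leading zeros so the shell does not read the value as octal.
    zeros=${num%%[!0]*}
    value=${num#"$zeros"}
    echo "$(( ${value:-0} * 100 / denom ))"
}

threshold_percent 4     # prints 40
threshold_percent 40    # prints 40
threshold_percent 75    # prints 75
threshold_percent 05    # prints 5
```

So under this reading "-M4" and "-M40" ask for the same 40% threshold, while a leading zero lowers it.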
* Re: detecting rename->commit->modify->commit
2008-05-02 2:38 ` Junio C Hamano
@ 2008-05-02 16:59 ` Sitaram Chamarty
0 siblings, 0 replies; 49+ messages in thread
From: Sitaram Chamarty @ 2008-05-02 16:59 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Teemu Likonen, Jeff King, Ittay Dror, git
On Fri, May 2, 2008 at 8:08 AM, Junio C Hamano <gitster@pobox.com> wrote:
> The options -M<num>, -C<num>, and -B<num>/<num> mean "raise or lower the
> similarity threshold to <num> / 10^N", where N is the number of digits in
> <num>. IOW, you will always be expressing a number between 0 and 1.
Thanks. The only mention of this I find (now) is in a file called
diffcore.txt, which appears to exist only in the HTML documentation,
but not in the "man" pages anywhere, as of 1.5.5.
[ I pulled a few hairs out trying to find it in the man pages :-) ]
I'd submit a patch, but a guy who takes the easy way out even to get
the documentation (essentially doing a checkout of the "man" branch)
would certainly not be able to test it :-(
* merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit)
2008-05-01 23:14 ` Jeff King
@ 2008-05-03 17:56 ` Ittay Dror
2008-05-03 18:11 ` Avery Pennarun
2008-05-08 18:17 ` detecting rename->commit->modify->commit Jeff King
1 sibling, 1 reply; 49+ messages in thread
From: Ittay Dror @ 2008-05-03 17:56 UTC (permalink / raw)
To: git
Can someone comment whether supporting merges after renames will be on
the Git roadmap?
As a Java developer, I can say that refactoring of class names and
packages happens quite often. Having to remember I've made this change
throughout the lifetime of a branch (or master, until pushed to a
central repository), and needing to manually merge changes to files /
packages (directories) I've refactored is something that I want my VCS
to do.
Thank you,
Ittay
Jeff King wrote:
> On Thu, May 01, 2008 at 12:12:33PM -0700, Steven Grimm wrote:
>
>
>> However, that leaves the question of which default will be wrong the
>> least often.
>>
>> In my personal experience, I think a directory rename has almost always
>> meant that I would want new files to appear in the new directory rather
>>
>
> I do agree that the rename is probably more often desired.
>
>
>> Of course, the discussion is moot anyway until someone writes code to
>> detect the situation; my impression is the current behavior is the way it
>> is simply because it's what naturally happens in the absence of
>> merge-time detection of a directory getting renamed.
>>
>
> Yes, I think that is largely a correct impression (although I think
> Linus has spoken out against directory renaming in the past, so there is
> at least a little bit of conscious effort). I suspect the right sequence
> of steps to implement this would be:
>
> 1. write a proof-of-concept that shows directory renaming after the
> fact (e.g., take a conflicted merge, scan the diff for directory
> renames, and then fix up the files). That way it is available, but
> doesn't impact git at all.
>
> 2. If people think it is useful, build it into the diff and merge
> machinery so that it can happen automagically, but make it
> optional. Thus git fully supports it, but the policy decision is
> left up to the user.
>
> 3. Make it the default if it is the common choice.
>
> So we just need somebody to volunteer to work on 1. ;)
>
> -Peff
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
* Re: merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit)
2008-05-03 17:56 ` merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) Ittay Dror
@ 2008-05-03 18:11 ` Avery Pennarun
2008-05-04 6:08 ` merge renamed files/directories? Ittay Dror
0 siblings, 1 reply; 49+ messages in thread
From: Avery Pennarun @ 2008-05-03 18:11 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On 5/3/08, Ittay Dror <ittayd@tikalk.com> wrote:
> Can someone comment whether supporting merges after renames will be on the
> Git roadmap?
>
> As a Java developer, I can say that refactoring of class names and packages
> happens quite often. Having to remember I've made this change throughout the
> lifetime of a branch (or master, until pushed to a central repository), and
> needing to manually merge changes to files / packages (directories) I've
> refactored is something that I want my VCS to do.
Git already works fine for renames. The only situation where
something funny happens is if you rename a whole directory and someone
else creates a file in the old directory. (In that case, the new file
ends up in the old place instead of the new place.) However, even in
that case, there is still no conflict and no manual merging necessary.
In fact, as someone else pointed out, renaming a java file requires
you to modify the file anyhow, so having git auto-move the file to
another directory *still* wouldn't make it work any better.
Have fun,
Avery
* Re: merge renamed files/directories?
2008-05-03 18:11 ` Avery Pennarun
@ 2008-05-04 6:08 ` Ittay Dror
2008-05-04 9:34 ` Jakub Narebski
2008-05-05 16:40 ` Avery Pennarun
0 siblings, 2 replies; 49+ messages in thread
From: Ittay Dror @ 2008-05-04 6:08 UTC (permalink / raw)
To: Avery Pennarun; +Cc: git
Avery Pennarun wrote:
> Git already works fine for renames. The only situation where
> something funny happens is if you rename a whole directory and someone
> else creates a file in the old directory. (In that case, the new file
> ends up in the old place instead of the new place.) However, even in
> that case, there is still no conflict and no manual merging necessary.
>
>
Sorry, but this is not the situation as I have experienced it with a
local repository I have. I renamed a directory (without changing any
files in it). 'git diff <commit>^ <commit>' shows the rename fine, but
'git log -p -M -C <initial commit>..' does not (that is, the history for
files in that directory is shown from the rename commit only). Obviously
git-diff is not any better.
> In fact, as someone else pointed out, renaming a java file requires
> you to modify the file anyhow, so having git auto-move the file to
> another directory *still* wouldn't make it work any better.
>
>
Sure it will, because otherwise I need to move it and still need to fix
it. And there are many other file formats and languages where such a
move will not require any change (I think it is funny that Java is a
justification for not doing something for a tool primarily used by C
people). Also, what happens if I change the file in the new location and
someone else changes it in the old location? Will I need to do a manual
merge?
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
* Re: merge renamed files/directories?
2008-05-04 6:08 ` merge renamed files/directories? Ittay Dror
@ 2008-05-04 9:34 ` Jakub Narebski
2008-05-05 16:40 ` Avery Pennarun
1 sibling, 0 replies; 49+ messages in thread
From: Jakub Narebski @ 2008-05-04 9:34 UTC (permalink / raw)
To: Ittay Dror; +Cc: Avery Pennarun, git
Ittay Dror <ittayd@tikalk.com> writes:
> Avery Pennarun wrote:
> > Git already works fine for renames. The only situation where
> > something funny happens is if you rename a whole directory and someone
> > else creates a file in the old directory. (In that case, the new file
> > ends up in the old place instead of the new place.) However, even in
> > that case, there is still no conflict and no manual merging necessary.
>
> Sorry, but this is not the situation as I have experienced it with a
> local repository I have. I renamed a directory (without changing any
> files in it). 'git diff <commit>^ <commit>' shows the rename fine, but
> 'git log -p -M -C <initial commit>..' does not (that is, the history
> for files in that directory is shown from the rename commit
> only). Obviously git-diff is not any better.
This is one thing where git differs from other SCMs. In "git log --
<path>" (which is what I assume you have used) the <path> argument is
a path limiter. It allows you to specify more than one directory or file.
Unfortunately "git log --follow <file>" currently works only for single
files, and doesn't yet work for directories; this is caused, among
other things, by the lack of directory rename detection in git.
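The difference between a plain path limiter and --follow for a single file can be seen with a small sketch. The temporary repository and the file names A and B are made up for illustration.

```shell
#!/bin/sh
# Throwaway repository: add A, rename it to B unchanged, then modify B.
follow_repo=$(mktemp -d)
(
    cd "$follow_repo"
    git init -q
    git config user.name "Example"
    git config user.email "example@example.com"
    printf 'hello\nworld\n' > A
    git add A && git commit -q -m added
    git mv A B && git commit -q -m rename
    echo more >> B && git commit -q -a -m change
)

# Path-limited log: only the commits that touch the path B.
git -C "$follow_repo" log --oneline -- B

# With --follow, the log continues across the rename back to A.
git -C "$follow_repo" log --oneline --follow -- B
```

The first log stops at the rename commit because the path B did not exist before it; the second also lists the commit that originally added the content as A.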
> [...] Also, what happens if I change the file in the new location
> and someone else changes it in the old location? Will I need to do a
> manual merge?
No, rename detection should make automatic merge possible.
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: merge renamed files/directories?
2008-05-04 6:08 ` merge renamed files/directories? Ittay Dror
2008-05-04 9:34 ` Jakub Narebski
@ 2008-05-05 16:40 ` Avery Pennarun
2008-05-05 21:49 ` Robin Rosenberg
1 sibling, 1 reply; 49+ messages in thread
From: Avery Pennarun @ 2008-05-05 16:40 UTC (permalink / raw)
To: Ittay Dror; +Cc: git
On 5/4/08, Ittay Dror <ittayd@tikalk.com> wrote:
> Avery Pennarun wrote:
> > In fact, as someone else pointed out, renaming a java file requires
> > you to modify the file anyhow, so having git auto-move the file to
> > another directory *still* wouldn't make it work any better.
>
> Sure it will, because otherwise I need to move it and still need to fix it.
> And there are many other file formats and languages where such a move will
> not require any change (I think it is funny that Java is a justification for
> not doing something for a tool primarily used by C people).
I mentioned Java because you mentioned you were working in java.
The particular problem with Java doesn't happen to C people. Imagine,
for example, that I add a new file, lib/foo.c, to lib/lib.a (thus they
have to modify lib/Makefile), while someone else renames "lib" to
"bettername".
When I merge, if git would create bettername/foo.c (it currently
won't) and properly automerge bettername/Makefile (it will), then the
program would still compile correctly. However this doesn't work in
Java: lib/foo.java would include the word "lib" in its contents (in
the namespace declaration) and so there's no way automatic merging
would have resulted in a version that compiles correctly.
So what I said isn't to *justify* git's behaviour, merely to point out
that in java's case, there seems to be no way to get fully automatic
merging that would work. In C, this case would have worked, if only
git supported directory renames.
In neither case is it very much work to fix by hand, though :)
Have fun,
Avery
* Re: merge renamed files/directories?
2008-05-05 16:40 ` Avery Pennarun
@ 2008-05-05 21:49 ` Robin Rosenberg
2008-05-05 22:20 ` Linus Torvalds
0 siblings, 1 reply; 49+ messages in thread
From: Robin Rosenberg @ 2008-05-05 21:49 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Ittay Dror, git
On Monday, 5 May 2008 at 18:40:24, Avery Pennarun wrote:
> On 5/4/08, Ittay Dror <ittayd@tikalk.com> wrote:
> > Avery Pennarun wrote:
> > > In fact, as someone else pointed out, renaming a java file requires
> > > you to modify the file anyhow, so having git auto-move the file to
> > > another directory *still* wouldn't make it work any better.
> >
> > Sure it will, because otherwise I need to move it and still need to fix
> > it. And there are many other file formats and languages where such a move
> > will not require any change (I think it is funny that Java is a
> > justification for not doing something for a tool primarily used by C
> > people).
>
> I mentioned Java because you mentioned you were working in java.
>
> The particular problem with Java doesn't happen to C people. Imagine,
> for example, that I add a new file, lib/foo.c, to lib/lib.a (thus they
> have to modify lib/Makefile), while someone else renames "lib" to
> "bettername".
>
> When I merge, if git would create bettername/foo.c (it currently
> won't) and properly automerge bettername/Makefile (it will), then the
> program would still compile correctly. However this doesn't work in
> Java: lib/foo.java would include the word "lib" in its contents (in
> the namespace declaration) and so there's no way automatic merging
> would have resulted in a version that compiles correctly.
You will always find corner cases. Line-by-line merge happens to
work, not because it is the theoretically correct way, but because we
have discovered that it nearly always works so our need for more
specialized merging is not huge. We have also adapted our development
practices to the way line-by-line merging works, i.e. we avoid binary
files and funny text file formats.
> So what I said isn't to *justify* git's behaviour, merely to point out
> that in java's case, there seems to be no way to get fully automatic
> merging that would work. In C, this case would have worked, if only
> git supported directory renames.
Sure, you would need a merge that understands this is Java and does the
correct thing. Even your case for C (with hypothetical directory rename
detection) would fail if the renamed directory was used in an #include
statement (like #include <lib/foo.h>). Say someone thinks xxdiff should move
to lib/xxdiff, while someone else adds a new reference to <xxdiff/xxdiff.h>.
To resolve all cases you must have tools that understand what they are doing.
Directory rename detection only solves a few cases, but it may be easy enough
to implement to warrant the effort to get the tick in the box.
>
> In neither case is it very much work to fix by hand, though :)
I agree on that.
-- robin
* Re: merge renamed files/directories?
2008-05-05 21:49 ` Robin Rosenberg
@ 2008-05-05 22:20 ` Linus Torvalds
2008-05-05 23:07 ` Steven Grimm
2008-05-06 1:38 ` Avery Pennarun
0 siblings, 2 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-05 22:20 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: Avery Pennarun, Ittay Dror, git
On Mon, 5 May 2008, Robin Rosenberg wrote:
>
> You will always find corner cases.
.. and btw, this is why merging should always
- be predictable (which implies "simple": overly clever merging, and
especially merging that takes complex history into account is *bad*,
because it's still going to do the wrong thing, but now it's going to
do so much less predictably)
- be amenable to manual fixes even when it succeeds (ie even if an
automatic merge completes without errors, a subsequent build may find
problems, and a "git commit --amend" may well be the right thing to
do!)
- aim for (preferably easily-handled) conflicts when the unusual cases
happen.
Conflicts for *common* things are bad, because they just cause more
work, and people get too complacent about fixing them. But similarly,
thinking that the unusual cases should be handled automatically is also
wrong - because the unusual cases are likely the ones that need some
manual resolution anyway.
Git will never do merges "perfectly", if only because it's fundamentally
impossible to do that. But one thing git *does* do is to make it pretty
damn easy to handle it.
I really don't understand why people expect a directory rename to be
handled automatically, when it is (a) not that common and (b) not obvious
what the solution is, but MOST OF ALL (c) so damn _easy_ to handle it
manually after-the-fact when you notice that something doesn't compile!
Really. If you have a file that was created in the wrong subdirectory (and
please admit that this is not common - it requires not just a directory
rename, but also a file create in another branch at the same time), what's
so hard with just doing
    make
    .. oh, oops, that was pretty obvious, the expected source file
       didn't exist ..
    git mv olddir/file newdir/file
    git commit --amend
and "Tadaa! All done". Your merge that was *fundamentally impossible* to
do automatically, was trivially done manually, with no actual big
head-scratching involved.
Linus
* Re: merge renamed files/directories?
2008-05-05 22:20 ` Linus Torvalds
@ 2008-05-05 23:07 ` Steven Grimm
2008-05-06 0:29 ` Linus Torvalds
2008-05-06 1:38 ` Avery Pennarun
1 sibling, 1 reply; 49+ messages in thread
From: Steven Grimm @ 2008-05-05 23:07 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Robin Rosenberg, Avery Pennarun, Ittay Dror, git
On May 5, 2008, at 3:20 PM, Linus Torvalds wrote:
> I really don't understand why people expect a directory rename to be
> handled automatically, when it is (a) not that common and (b) not obvious
> what the solution is, but MOST OF ALL (c) so damn _easy_ to handle it
> manually after-the-fact when you notice that something doesn't compile!
Assuming all you track with git is source code that has dependencies
such that a compile command fails cleanly when things end up in the
wrong directory, sure.
If you're using git to, say, track a tree of documentation files or
images that are referred to using relative URLs in HTML pages,
detecting the breakage is less trivial unless you have a really solid
automated QA process that can check for dangling references.
Are directory renames as common as file renames? Certainly not. But
they happen often enough that it's annoying to have to manually clean
up after them. Note that I did not say it is difficult or impossible
to manually clean up after them. I think the number of people who've
mentioned this on the list should stand as some kind of refutation of
the idea that directory renames are so vanishingly rare as to not be
worth mentioning. I've run into the problem a few times myself.
> and "Tadaa! All done". Your merge that was *fundamentally impossible* to
> do automatically, was trivially done manually, with no actual big
> head-scratching involved.
$ mkdir parent
$ cd parent
$ hg init
$ mkdir subdir1
$ echo "I am the walrus" > subdir1/file1
$ hg add subdir1/file1
$ hg commit -m 'initial commit'
$ cd ..
$ hg clone parent child
$ cd child
$ hg mv subdir1 subdir2
$ hg commit -m 'rename subdir1 to subdir2'
$ cd ../parent
$ echo 'I love prunes' > subdir1/file2
$ hg add subdir1/file2
$ hg commit -m 'new file in subdir'
$ cd ../child
$ hg pull
$ hg merge
$ ls subdir2
file1 file2
Doesn't seem *fundamentally* impossible to produce the results that
are most likely to be what people want. (Which doesn't equal
"guaranteed to be 100% correct 100% of the time or your money back" --
as you say, merging is an inexact science.)
-Steve
* Re: merge renamed files/directories?
2008-05-05 23:07 ` Steven Grimm
@ 2008-05-06 0:29 ` Linus Torvalds
2008-05-06 0:40 ` Linus Torvalds
2008-05-06 15:47 ` Theodore Tso
0 siblings, 2 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-06 0:29 UTC (permalink / raw)
To: Steven Grimm; +Cc: Robin Rosenberg, Avery Pennarun, Ittay Dror, git
On Mon, 5 May 2008, Steven Grimm wrote:
>
> Doesn't seem *fundamentally* impossible to produce the results that are most
> likely to be what people want.
You didn't understand what was fundamentally impossible.
And btw, this has nothing to do with directory renames either. There are
tons of these kinds of merge issues that bad SCM developers have been
masturbating over for YEARS. There's a whole science of making idiotic new
merging models, one fancier than the other. The fact is, you cannot do a
perfect job, the best thing you can do is pick a simple model, and try to
make it repeatable and easy to fix up.
Maybe somebody bothers to implement some directory rename heuristic some
day. Quite frankly, I personally cannot care less. It really is mental
masturbation, and has absolutely no relevance for any real-world problem.
Linus
* Re: merge renamed files/directories?
2008-05-06 0:29 ` Linus Torvalds
@ 2008-05-06 0:40 ` Linus Torvalds
2008-05-06 15:47 ` Theodore Tso
1 sibling, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-06 0:40 UTC (permalink / raw)
To: Steven Grimm; +Cc: Robin Rosenberg, Avery Pennarun, Ittay Dror, git
On Mon, 5 May 2008, Linus Torvalds wrote:
>
> There are tons of these kinds of merge issues that bad SCM developers
> have been masturbating over for YEARS.
.. and if I sound rather less than enthused about these kinds of issues,
it's because of having seen years and years of people talking about merge
strategies, and then at the same time using SVN which doesn't even record
the parenthood of the resulting merges, or thinking that code always
moves with whole files.
In other words, the details don't even matter. What matters is not being a
total piece of sh*t in the big picture.
Linus
* Re: merge renamed files/directories?
2008-05-05 22:20 ` Linus Torvalds
2008-05-05 23:07 ` Steven Grimm
@ 2008-05-06 1:38 ` Avery Pennarun
2008-05-06 1:46 ` Shawn O. Pearce
2008-05-06 2:19 ` Linus Torvalds
1 sibling, 2 replies; 49+ messages in thread
From: Avery Pennarun @ 2008-05-06 1:38 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Robin Rosenberg, Ittay Dror, git
On 5/5/08, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> I really don't understand why people expect a directory rename to be
> handled automatically, when it is (a) not that common and (b) not obvious
> what the solution is, but MOST OF ALL (c) so damn _easy_ to handle it
> manually after-the-fact when you notice that something doesn't compile!
In general I agree with your point here, but I still find it surprising
how hard the directory-rename problem is made out to be. As far as I
can see, the right implementation exactly parallels the single-file
rename implementation.
I think the same problem that prevents git from knowing the difference
between empty and nonexistent directories (eg.
http://kerneltrap.org/mailarchive/git/2007/7/18/251976) is the one
that prevents it from handling directory renames: git doesn't
acknowledge that it's *already* treating directories as first-class
objects.
What if you thought of a directory as simply a list of filenames?
(This is more or less what unix does anyway.) Then an *empty*
directory is a tree of zero length; a nonexistent (or not tracked)
directory is simply not listed in the parent; a directory with
untracked files is like a file with patches not yet added to the
index(*); and trying to merge a file into a nonexistent directory
(when the original patch *didn't* create the directory fresh) would
trigger similar logic to the existing rename handling. That is, put
the new file with the content that used to be next to it, by looking
for a tree with contents (names, not so much sha1's) similar to the
one it was expected to be in.
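A rough sketch of the name-based matching idea above, assuming a hypothetical helper called dir_name_similarity (it is not part of git). It takes two newline-separated listings of file names and reports what percentage of the names in the first listing also appear in the second. It only illustrates the "names, not sha1's" comparison and does not handle names containing whitespace.

```shell
# Hypothetical helper, not a git command: percentage of the names in
# listing $1 that also occur in listing $2 (one file name per line).
# Assumption: names contain no whitespace, since the loop splits on it.
dir_name_similarity() {
    old_names=$1
    new_names=$2
    total=0
    shared=0
    for name in $old_names; do
        total=$((total + 1))
        # -x: match the whole line, -F: fixed string, -q: no output
        if printf '%s\n' "$new_names" | grep -qxF -- "$name"; then
            shared=$((shared + 1))
        fi
    done
    # An empty listing has nothing to compare; report 0 instead of dividing by zero.
    [ "$total" -gt 0 ] || { echo 0; return; }
    echo $((shared * 100 / total))
}

old_dir='file1
file2
file3'
candidate='file1
file2
file3
file4'
dir_name_similarity "$old_dir" "$candidate"
```

In a real repository the two listings could come from something like `git ls-tree --name-only <commit>:<dir>`; the directory with the highest score would be the candidate destination for a file whose expected directory no longer exists.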
> It really is mental
> masturbation, and has absolutely no relevance for any real-world problem.
I personally don't get very interested in non-real-world problems.
Here's the actual case I tried to use a few months ago, but couldn't,
because git doesn't track directory renames. (Note that I was quite
happily able to do this in svn, as much as you can do anything happily
in svn.)
I have a branch called 'mylib' with my library project in its root
directory. What I wanted was to maintain my library in the 'mylib'
branch, then merge my library into the "libs/mylib" directory of my
application, which is in the 'myapp' branch. (Of course, in real
life, there's more than one app using mylib in more than one
repository, and I'm actually doing 'git pull' of the mylib branch from
elsewhere.)
This actually works like magic in git - except when you create a file
in the 'mylib' branch, in which case it gets merged to the wrong path
every single time. It seems to me like it should be very easy to put
it in the right place instead, making one more interesting use case
possible.
I realize git-submodule is the way you're supposed to do something
like this, but git-submodule doesn't really do what I want (yet) for
reasons discussed in other threads.
Have fun,
Avery
(*) Applying the same metaphor in reverse, operations that are valid
on directories are also valid for file contents. I can think of
immediate uses for a .gitignore-style list that talks about file
*contents*. Imagine if I could make a local patch to my Makefile,
mark that one patch as "ignored", and never accidentally check it in.
* Re: merge renamed files/directories?
2008-05-06 1:38 ` Avery Pennarun
@ 2008-05-06 1:46 ` Shawn O. Pearce
2008-05-06 1:58 ` Avery Pennarun
2008-05-06 2:19 ` Linus Torvalds
1 sibling, 1 reply; 49+ messages in thread
From: Shawn O. Pearce @ 2008-05-06 1:46 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Linus Torvalds, Robin Rosenberg, Ittay Dror, git
Avery Pennarun <apenwarr@gmail.com> wrote:
>
> I have a branch called 'mylib' with my library project in its root
> directory. What I wanted was to maintain my library in the 'mylib'
> branch, then merge my library into the "libs/mylib" directory of my
> application, which is in the 'myapp' branch. [...]
>
> This actually works like magic in git - except when you create a file
> in the 'mylib' branch, in which case it gets merged to the wrong path
> every single time. It seems to me like it should be very easy to put
> it in the right place instead, making one more interesting use case
> possible.
>
> I realize git-submodule is the way you're supposed to do something
> like this, but git-submodule doesn't really do what I want (yet) for
> reasons discussed in other threads.
`git pull -s subtree mylib` ?
This is how git-gui and gitk are merged into git.git, and it avoids
this case by looking for a subdirectory rename, more specifically
a rename of "/" to "mylib/".
It also can go the other way, that is rename "mylib/" to "/", but
this path is never used as far as I know as git-gui and gitk don't
ever merge in the git.git history.
--
Shawn.
* Re: merge renamed files/directories?
2008-05-06 1:46 ` Shawn O. Pearce
@ 2008-05-06 1:58 ` Avery Pennarun
2008-05-06 2:12 ` Shawn O. Pearce
0 siblings, 1 reply; 49+ messages in thread
From: Avery Pennarun @ 2008-05-06 1:58 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Linus Torvalds, Robin Rosenberg, Ittay Dror, git
On 5/5/08, Shawn O. Pearce <spearce@spearce.org> wrote:
> Avery Pennarun <apenwarr@gmail.com> wrote:
> >
> > I have a branch called 'mylib' with my library project in its root
> > directory. What I wanted was to maintain my library in the 'mylib'
> > branch, then merge my library into the "libs/mylib" directory of my
> > application, which is in the 'myapp' branch. [...]
> >
> > This actually works like magic in git - except when you create a file
> > in the 'mylib' branch, in which case it gets merged to the wrong path
> > every single time. It seems to me like it should be very easy to put
> > it in the right place instead, making one more interesting use case
> > possible.
> >
> > I realize git-submodule is the way you're supposed to do something
> > like this, but git-submodule doesn't really do what I want (yet) for
> > reasons discussed in other threads.
>
> `git pull -s subtree mylib` ?
First, I thought: wow! How can that possibly work? These guys are geniuses!
Then I found out that git-merge-subtree is a git builtin, and git.c says this:
{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
And then my head exploded. :)
Still scraping the pieces of my brain back off the floor... but does
this mean the subtree merge strategy would fail exactly like
merge-recursive when new files are created?
Have fun,
Avery
* Re: merge renamed files/directories?
2008-05-06 1:58 ` Avery Pennarun
@ 2008-05-06 2:12 ` Shawn O. Pearce
0 siblings, 0 replies; 49+ messages in thread
From: Shawn O. Pearce @ 2008-05-06 2:12 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Linus Torvalds, Robin Rosenberg, Ittay Dror, git
Avery Pennarun <apenwarr@gmail.com> wrote:
> On 5/5/08, Shawn O. Pearce <spearce@spearce.org> wrote:
> >
> > `git pull -s subtree mylib` ?
>
> First, I thought: wow! How can that possibly work? These guys are geniuses!
>
> Then I found out that git-merge-subtree is a git builtin, and git.c says this:
>
> { "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
> { "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
>
> And then my head exploded. :)
>
> Still scraping the pieces of my brain back off the floor... but does
> this mean the subtree merge strategy would fail exactly like
> merge-recursive when new files are created?
Nope. If you go look at cmd_merge_recursive you will see it has
different behavior based upon the name it was invoked as, even
though it is the same C function and has the same implementation.
If it is started with the name "merge-subtree" it tries to find
a matching subtree prefix to insert in front of all names, or
to remove from all names, such that a merge will correctly fully
include a set of files in a subdirectory, or fully pull out a set
of files from a subdirectory.
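To make the prefix idea concrete, here is a small sketch of just the path-rewriting half, using two hypothetical helpers (add_prefix and strip_prefix are illustration names, not git commands). It does not show how the matching prefix is found, only what inserting or removing it does to a list of paths.

```shell
# Hypothetical illustration only; the real logic lives inside git's merge code.

# Read paths on stdin and insert prefix $1 in front of each one
# (the library -> application direction).
add_prefix() {
    awk -v p="$1" '{ print p $0 }'
}

# Read paths on stdin and remove prefix $1 from each one
# (the application -> library direction). Paths outside the prefix are dropped.
strip_prefix() {
    awk -v p="$1" 'index($0, p) == 1 { print substr($0, length(p) + 1) }'
}

printf '%s\n' Makefile src/lib.c | add_prefix libs/mylib/
printf '%s\n' libs/mylib/Makefile app/main.c | strip_prefix libs/mylib/
```

The first call maps the library's root onto libs/mylib/; the second pulls the library's files back out and ignores app/main.c, which never belonged to the library.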
Junio is the genius that implemented this. Works quite well for
this library->application merge case that I think you were trying
to describe.
--
Shawn.
* Re: merge renamed files/directories?
2008-05-06 1:38 ` Avery Pennarun
2008-05-06 1:46 ` Shawn O. Pearce
@ 2008-05-06 2:19 ` Linus Torvalds
1 sibling, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-06 2:19 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Robin Rosenberg, Ittay Dror, git
On Mon, 5 May 2008, Avery Pennarun wrote:
>
> In general I agree with your point here, but I still find it surprising
> how hard the directory-rename problem is made out to be.
I do agree that it's probably not that hard.
But I disagree with people who whine about pointless stuff, and don't send
patches.
Linus
* Re: merge renamed files/directories?
2008-05-06 0:29 ` Linus Torvalds
2008-05-06 0:40 ` Linus Torvalds
@ 2008-05-06 15:47 ` Theodore Tso
2008-05-06 16:10 ` Linus Torvalds
1 sibling, 1 reply; 49+ messages in thread
From: Theodore Tso @ 2008-05-06 15:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: Steven Grimm, Robin Rosenberg, Avery Pennarun, Ittay Dror, git
On Mon, May 05, 2008 at 05:29:12PM -0700, Linus Torvalds wrote:
>
> Maybe somebody bothers to implement some directory rename heuristic some
> day. Quite frankly, I personally cannot care less. It really is mental
> masturbation, and has absolutely no relevance for any real-world problem.
>
Actually, the directory rename heuristic *does* have relevance in at
least some real-world cases. For example, MySQL has plugin
directories, and occasionally the plugins get renamed, for whatever
reason. If a plugin gets renamed, so does its directory, and if the
rename operation happens in an experimental (or devel) branch, but
then for whatever reason, a new file is created in the devel (or
maint) branch, without the directory rename heuristic, when the
changeset is pulled into the experimental (or devel) branch, the file
will be created in the wrong directory.
So it may be rare, but this kind of thing does happen in the real
world.
- Ted
* Re: merge renamed files/directories?
2008-05-06 15:47 ` Theodore Tso
@ 2008-05-06 16:10 ` Linus Torvalds
2008-05-06 16:15 ` Linus Torvalds
2008-05-06 16:32 ` Ittay Dror
0 siblings, 2 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-06 16:10 UTC (permalink / raw)
To: Theodore Tso
Cc: Steven Grimm, Robin Rosenberg, Avery Pennarun, Ittay Dror, git
On Tue, 6 May 2008, Theodore Tso wrote:
>
> Actually, the directory rename heuristic *does* have relevance in at
> least some real-world cases. For example, MySQL has plugin
> directories, and occasionally the plugins get renamed, for whatever
> reason.
I'm not saying that directory renames don't happen.
I don't even say that merges across directory renames don't happen.
I *am* saying that it's not a problem.
It's like data conflicts. Do they happen? Sure as hell. I can pretty much
guarantee that any sane project will have more data conflicts than they
will have rename conflicts (whether single-file or directory), and it's
not only a problem, it's something that is absolutely *required* from a
source control management system!
So are data conflicts a problem?
I claim that they aren't. They are a *positive* resource that you need to
handle. Some of the "handling" is obviously going to be to try to avoid
them, and if you get too much of them, the real "problem" is that you
merge too seldom, or more commonly that you have a piece of code that is
simply not done well enough, so many different people have to muck around
in that area.
But fundamentally, you should always have data conflicts, and they aren't
a problem in themselves. They are a problem only
- If they are hard to understand and see, and *unexpected*. The SCM
should explain what is going on, and explain why a conflict happens
(and that may perhaps mean after the fact! I love "gitk --merge"
exactly because it tends to be very good at explaining what was going
on!).
- If they are hard to fix.
For example, one of the main problems I had with BK merging was the
fact that while the mergetool was wonderful, you effectively *had* to
merge using it, and you couldn't sanely do an "incremental" merge
where you first did a first merge job, then checked that it at
least compiles, then tested it, and finally looked at the diffs from
both parents and looked at whether those all made sense, and you could
"refine" or fix the merge along the different phases.
Of course, you hope that all merges are pretty obvious, and you can do
it right in one go, but no, they're not. They'll never be. They'll
never be fully automatic, but even when they aren't automatic, they'll
not even be trivial to do manually. But that's OK, as long as the
tool at least doesn't fight you, and lets you do whatever you want to
do as part of fixing things up.
Now, take a look back at directory renames.
Do they happen?
Yes.
Do they potentially mis-merge?
Yes.
But are they common and/or hard to fix and handle?
No.
And that's why I don't think people should call them "problems". The only
_real_ issue here, I think, is that git just does things differently from
other SCM's. Git does a _lot_ of things differently. You get used to it.
Linus
* Re: merge renamed files/directories?
2008-05-06 16:10 ` Linus Torvalds
@ 2008-05-06 16:15 ` Linus Torvalds
2008-05-06 16:32 ` Ittay Dror
1 sibling, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-06 16:15 UTC (permalink / raw)
To: Theodore Tso
Cc: Steven Grimm, Robin Rosenberg, Avery Pennarun, Ittay Dror, git
On Tue, 6 May 2008, Linus Torvalds wrote:
>
> I can pretty much
> guarantee that any sane project will have more data conflicts than they
> will have rename conflicts (whether single-file or directory), and it's
> not only a problem, it's something that is absolutely *required* from a
^^^-- not
> source control management system!
Oops. That didn't read well.
Linus
* Re: merge renamed files/directories?
2008-05-06 16:10 ` Linus Torvalds
2008-05-06 16:15 ` Linus Torvalds
@ 2008-05-06 16:32 ` Ittay Dror
2008-05-06 16:39 ` Linus Torvalds
1 sibling, 1 reply; 49+ messages in thread
From: Ittay Dror @ 2008-05-06 16:32 UTC (permalink / raw)
To: Linus Torvalds
Cc: Theodore Tso, Steven Grimm, Robin Rosenberg, Avery Pennarun, git
Linus Torvalds wrote:
>
> - If they are hard to understand and see, and *unexpected*. The SCM
> should explain what is going on, and explain why a conflict happens
> (and that may perhaps mean after the fact! I love "gitk --merge"
> exactly because it tends to be very good at explaining what was going
> on!).
>
>
So does git tell me what is going on with directory renames? Or should I
just discover them when I try to compile (assuming that when the old
directory name appears it will even get compiled, and that the file in
it is something that gets compiled)
And no, it's not a common problem, but I don't like the fact that a
merge conflict happens and the SCM doesn't tell me about it.
--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>
* Re: merge renamed files/directories?
2008-05-06 16:32 ` Ittay Dror
@ 2008-05-06 16:39 ` Linus Torvalds
0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2008-05-06 16:39 UTC (permalink / raw)
To: Ittay Dror
Cc: Theodore Tso, Steven Grimm, Robin Rosenberg, Avery Pennarun, git
On Tue, 6 May 2008, Ittay Dror wrote:
>
> And no, it's not a common problem, but I don't like the fact that a merge
> conflict happens and the SCM doesn't tell me about it.
I do agree that the most irritating feature of it is the silent clean
merge. When it's not obvious what the right thing to do is, generally a
merge strategy should try to warn, or even generate a conflict.
That said, anybody who thinks that "merge was automatic and successful"
means that the merge was _correct_ is sadly mistaken. So you really
shouldn't depend on it, and yeah, I strongly suggest building and testing
after a merge (and before you push the result out), so that you can fix
any issues.
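As an illustration of what such a warning could look like, here is a sketch of a post-merge check. The helper name warn_stale_paths is hypothetical and nothing like it is built into git; it assumes you already know that the other branch renamed one directory to another, and it simply flags paths in the merge result that still sit under the old name.

```shell
# Hypothetical check, not a git feature.
#   $1    = old directory name (renamed away on the other branch)
#   $2    = new directory name
#   stdin = paths in the merge result, one per line
warn_stale_paths() {
    awk -v old="$1/" -v new="$2/" '
        # A path is stale if it still starts with the old directory prefix.
        index($0, old) == 1 {
            printf "warning: %s is under a renamed directory; expected %s%s\n",
                   $0, new, substr($0, length(old) + 1)
        }
    '
}

printf '%s\n' new/file1 new/file2 new/file3 subdir/file4 | warn_stale_paths subdir new
```

After a `git merge --no-commit`, the path list could be fed from `git ls-files`; any output means the "clean" merge deserves a second look before you build, test and push.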
Linus
* Re: detecting rename->commit->modify->commit
2008-05-01 23:14 ` Jeff King
2008-05-03 17:56 ` merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) Ittay Dror
@ 2008-05-08 18:17 ` Jeff King
1 sibling, 0 replies; 49+ messages in thread
From: Jeff King @ 2008-05-08 18:17 UTC (permalink / raw)
To: Steven Grimm; +Cc: Avery Pennarun, Ittay Dror, git
On Thu, May 01, 2008 at 07:14:27PM -0400, Jeff King wrote:
> 1. write a proof-of-concept that shows directory renaming after the
> fact (e.g., take a conflicted merge, scan the diff for directory
> renames, and then fix up the files). That way it is available, but
> doesn't impact git at all.
Here's a toy script that finds directory renames. I'm sure there are a
ton of corner cases it doesn't handle (like directory renames inside of
directory renames). My test case was the very trivial:
mkdir repo && cd repo && git init
mkdir subdir
for i in 1 2 3; do
  echo content $i >subdir/file$i
done
git add subdir
git commit -m initial
git mv subdir new
git commit -m move
git checkout -b other HEAD^
echo content 4 >subdir/file4
git add subdir
git commit -m new
git merge --no-commit master
perl ../find-dir-rename.pl
git commit
At which point you should see the merged commit with new/file4.
Script is below.
-- >8 --
#!/usr/bin/perl
#
# Find renamed directories, and move any files in the "old"
# directory into the "new".
#
# usage:
# git merge --no-commit <whatever>
# find-dir-rename
# git commit
use strict;

foreach my $r (renamed_dirs()) {
  move_dir_contents($r->{from}, $r->{to});
}
exit 0;

sub renamed_dirs {
  my $base = `git merge-base HEAD MERGE_HEAD`;
  chomp $base;
  return grep {
    $_->{score} == 1
  } (renamed_dirs_between($base, 'HEAD'),
     renamed_dirs_between($base, 'MERGE_HEAD'));
}

sub renamed_dirs_between {
  my ($base, $commit) = @_;
  my %sources;
  foreach my $pair (renamed_files($base, $commit)) {
    my $d1 = dir_of($pair->[0]);
    my $d2 = dir_of($pair->[1]);
    next unless defined($d1) && defined($d2);
    $sources{$d1}->{total}++;
    $sources{$d1}->{dests}->{$d2}++;
  }
  return map {
    my $from = $_;
    map {
      {
        from => $from,
        to => $_,
        score => $sources{$from}->{dests}->{$_} / $sources{$from}->{total},
      }
    } keys(%{$sources{$from}->{dests}});
  } removed_directories($base, $commit);
}

sub dir_of {
  local $_ = shift;
  s{/[^/]+$}{} or return undef;
  return $_;
}

sub renamed_files {
  my ($from, $to) = @_;
  open(my $fh, '-|', qw(git diff-tree -r -M), $from, $to)
    or die "unable to open diff-tree: $!";
  return map {
    chomp;
    m/ R\d+\t([^\t]+)\t(.*)/ ? [$1 => $2] : ()
  } <$fh>;
}

sub removed_directories {
  my ($base, $commit) = @_;
  my %new_dirs = map { $_ => 1 } directories($commit);
  return grep { !exists $new_dirs{$_} } directories($base);
}

sub directories {
  my $commit = shift;
  return uniq(
    map {
      s{/[^/]+$}{} ? $_ : ()
    } files($commit)
  );
}

sub files {
  my $commit = shift;
  open(my $fh, '-|', qw(git ls-tree -r), $commit)
    or die "unable to open ls-tree: $!";
  return map {
    chomp;
    s/^[^\t]*\t//;
    $_
  } <$fh>;
}

sub uniq {
  my %seen;
  return grep { !$seen{$_}++ } @_;
}

sub move_dir_contents {
  my ($from, $to) = @_;
  my @files = glob("$from/*");
  return unless @files;
  system(qw(git mv), @files, "$to/")
    and die "unable to move $from/* to $to";
  rmdir($from); # ignore error since there may be untracked files
}
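For readers who want to see the scoring step in isolation, the following is a simplified sketch in shell and awk, not a port of the script above. The function name score_dir_renames is made up for illustration. It reads raw `git diff-tree -r -M` lines on stdin and, for each source directory, prints the percentage of its renamed files that ended up in each destination directory. Unlike the Perl script, it does not check that the source directory was actually removed, and it prints every score rather than keeping only perfect ones.

```shell
# Sketch only: score directory renames from raw diff-tree output on stdin.
# Prints "<from-dir> <to-dir> <percent>" for every observed directory pair.
score_dir_renames() {
    awk -F '\t' '
        # Raw rename lines look like:
        #   :100644 100644 <sha> <sha> R100<TAB>old/path<TAB>new/path
        $1 ~ / R[0-9]+$/ {
            from = $2; to = $3
            # Reduce each path to its directory; skip top-level files.
            if (!sub(/\/[^\/]+$/, "", from) || !sub(/\/[^\/]+$/, "", to)) next
            total[from]++
            count[from "\t" to]++
        }
        END {
            for (key in count) {
                split(key, parts, "\t")
                printf "%s %s %d\n", parts[1], parts[2],
                       count[key] * 100 / total[parts[1]]
            }
        }
    ' | sort
}

# Sample input with placeholder object names (aaaa, bbbb, ... are not real shas).
printf ':100644 100644 aaaa bbbb R100\tsubdir/file1\tnew/file1\n:100644 100644 cccc dddd R100\tsubdir/file2\tnew/file2\n:100644 100644 eeee ffff R100\tsubdir/file3\tnew/file3\n' | score_dir_renames
```

A score of 100 corresponds to the `score == 1` test in the Perl version: every renamed file from that directory went to the same place, so moving stragglers there is a reasonable guess.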