* detecting rename->commit->modify->commit @ 2008-05-01 14:10 Ittay Dror 2008-05-01 14:45 ` Jeff King 2008-05-01 14:54 ` Ittay Dror 0 siblings, 2 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-01 14:10 UTC (permalink / raw) To: git Hi, Say I have a file A, I rename to 'B', commit, then change file B and commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, it will show 'B' as new (all of it with '+' prefix in the output). Am I right? Thank you, Ittay -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 14:10 detecting rename->commit->modify->commit Ittay Dror @ 2008-05-01 14:45 ` Jeff King 2008-05-01 15:08 ` Ittay Dror 2008-05-01 14:54 ` Ittay Dror 1 sibling, 1 reply; 49+ messages in thread From: Jeff King @ 2008-05-01 14:45 UTC (permalink / raw) To: Ittay Dror; +Cc: git On Thu, May 01, 2008 at 05:10:24PM +0300, Ittay Dror wrote: > Say I have a file A, I rename to 'B', commit, then change file B and > commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, it > will show 'B' as new (all of it with '+' prefix in the output). Am I > right? Yes, it should find it, assuming the changes to B leave it recognizable. Try: mkdir repo && cd repo && git init cp /usr/share/dict/words A git add . && git commit -m added mv A B && git add B && git commit -a -m rename echo change >>B && git commit -a -m change git diff -M HEAD^^.. | head -n 7 You should see something like: diff --git a/A b/B similarity index 99% rename from A rename to B index 8e50f11..6525618 100644 --- a/A +++ b/B However, note the similarity index. If you change B so much that it doesn't look close to the original A, then the rename is not detected (and intentionally so -- the argument is that it is no longer a rename in that context, but a rewritten file). -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 14:45 ` Jeff King @ 2008-05-01 15:08 ` Ittay Dror 2008-05-01 15:20 ` Jeff King 2008-05-01 15:24 ` Ittay Dror 0 siblings, 2 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-01 15:08 UTC (permalink / raw) To: Jeff King; +Cc: git But it doesn't work across directories :-(. Try: >mkdir foo >echo "hello" > foo/A >git add foo/A >git commit -m 'foo/A' >mkdir bar >git mv foo/A bar >git commit -m 'bar/A' >echo "world" >> bar/A >git add bar/A >git commit -m 'bar/A world' >git diff HEAD^^..HEAD^ | cat diff --git a/foo/A b/bar/A similarity index 100% rename from foo/A rename to bar/A > git diff HEAD^^.. | cat diff --git a/bar/A b/bar/A new file mode 100644 index 0000000..94954ab --- /dev/null +++ b/bar/A @@ -0,0 +1,2 @@ +hello +world diff --git a/foo/A b/foo/A deleted file mode 100644 index ce01362..0000000 --- a/foo/A +++ /dev/null @@ -1 +0,0 @@ -hello Jeff King wrote: > On Thu, May 01, 2008 at 05:10:24PM +0300, Ittay Dror wrote: > > >> Say I have a file A, I rename to 'B', commit, then change file B and >> commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, it >> will show 'B' as new (all of it with '+' prefix in the output). Am I >> right? >> > > Yes, it should find it, assuming the changes to B leave it recognizable. > Try: > > mkdir repo && cd repo && git init > cp /usr/share/dict/words A > git add . && git commit -m added > mv A B && git add B && git commit -a -m rename > echo change >>B && git commit -a -m change > git diff -M HEAD^^.. | head -n 7 > > You should see something like: > > diff --git a/A b/B > similarity index 99% > rename from A > rename to B > index 8e50f11..6525618 100644 > --- a/A > +++ b/B > > However, note the similarity index. If you change B so much that it > doesn't look close to the original A, then the rename is not detected > (and intentionally so -- the argument is that it is no longer a rename > in that context, but a rewritten file). 
> > -Peff > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply related [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:08 ` Ittay Dror @ 2008-05-01 15:20 ` Jeff King 2008-05-01 15:30 ` Ittay Dror 2008-05-01 20:39 ` Teemu Likonen 2008-05-01 15:24 ` Ittay Dror 1 sibling, 2 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 15:20 UTC (permalink / raw) To: Ittay Dror; +Cc: git On Thu, May 01, 2008 at 06:08:33PM +0300, Ittay Dror wrote: > But it doesn't work across directories :-(. Yes, it does. > Try: > >mkdir foo > >echo "hello" > foo/A > >git add foo/A > >git commit -m 'foo/A' > >mkdir bar > >git mv foo/A bar > >git commit -m 'bar/A' > >echo "world" >> bar/A > >git add bar/A > >git commit -m 'bar/A world' > >git diff HEAD^^..HEAD^ | cat > diff --git a/foo/A b/bar/A > similarity index 100% > rename from foo/A > rename to bar/A See, it just worked across directories. > > git diff HEAD^^.. | cat > diff --git a/bar/A b/bar/A > new file mode 100644 > index 0000000..94954ab > --- /dev/null > +++ b/bar/A > @@ -0,0 +1,2 @@ > +hello > +world > diff --git a/foo/A b/foo/A > deleted file mode 100644 > index ce01362..0000000 > --- a/foo/A > +++ /dev/null > @@ -1 +0,0 @@ > -hello Of course it doesn't work here. You have two files, one containing "hello\n" and one containing "hello\nworld\n". Their similarity is 50%, which is not enough to consider it a rename. And I would argue that's reasonable, since the files have only one line in common. The problem is that you are using a toy example (which is why my example used /usr/share/dict/words, which has enough content to definitively call it a rename). ... Hmm, looking at the code, though, 50% is supposed to be the default minimum. So there might actually be a bug. -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:20 ` Jeff King @ 2008-05-01 15:30 ` Ittay Dror 2008-05-01 15:38 ` Jeff King 2008-05-01 15:47 ` Jakub Narebski 2008-05-01 20:39 ` Teemu Likonen 1 sibling, 2 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-01 15:30 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King wrote: > Of course it doesn't work here. You have two files, one containing > "hello\n" and one containing "hello\nworld\n". Their similarity is 50%, > which is not enough to consider it a rename. And I would argue that's > reasonable, since the files have only one line in common. The problem is > that you are using a toy example (which is why my example used > /usr/share/dict/words, which has enough content to definitively call it > a rename). > > Well, I would have expected git to notice that the file was renamed in one commit and keep tracking changes afterwards. Also, as I wrote in another post, this happened to me with real files of a real source tree, and with very small changes (and sometimes not at all) to these files. Ittay -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:30 ` Ittay Dror 2008-05-01 15:38 ` Jeff King 2008-05-01 15:47 ` Jakub Narebski 1 sibling, 0 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 15:38 UTC (permalink / raw) To: Ittay Dror; +Cc: git On Thu, May 01, 2008 at 06:30:46PM +0300, Ittay Dror wrote: > Well, I would have expected git to notice that the file was renamed in > one commit and keep tracking changes afterwards. That's not how git works, and that's not what you asked it to do. You gave it two states and asked it to diff between them. It never even looked at the intermediate steps (and that's generally why git is so fast). If you want to follow the history and look at every commit, then that is something that _can_ be done, and does get done with things like "git log --follow". But there is no diff mode currently implemented that will crawl the history looking for interesting things. -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:30 ` Ittay Dror 2008-05-01 15:38 ` Jeff King @ 2008-05-01 15:47 ` Jakub Narebski 1 sibling, 0 replies; 49+ messages in thread From: Jakub Narebski @ 2008-05-01 15:47 UTC (permalink / raw) To: Ittay Dror; +Cc: Jeff King, git Ittay Dror <ittayd@tikalk.com> writes: > Jeff King wrote: > > > > Of course it doesn't work here. You have two files, one containing > > "hello\n" and one containing "hello\nworld\n". Their similarity is 50%, > > which is not enough to consider it a rename. And I would argue that's > > reasonable, since the files have only one line in common. The problem is > > that you are using a toy example (which is why my example used > > /usr/share/dict/words, which has enough content to definitively call it > > a rename). > > > > > Well, I would have expected git to notice that the file was renamed in > one commit and keep tracking changes afterwards. > > Also, as I wrote in another post, this happened to me with real files > of a real source tree, and with very small changes (and sometimes not > at all) to these files. The idea of rename detection is to help with merges. If the files are different enough that content based (similarity based) rename detection doesn't detect rename, they are usually too different to merge automatically anyway. -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:20 ` Jeff King 2008-05-01 15:30 ` Ittay Dror @ 2008-05-01 20:39 ` Teemu Likonen 2008-05-01 23:09 ` Jeff King 2008-05-02 2:06 ` Sitaram Chamarty 1 sibling, 2 replies; 49+ messages in thread From: Teemu Likonen @ 2008-05-01 20:39 UTC (permalink / raw) To: Jeff King; +Cc: Ittay Dror, git Jeff King wrote (2008-05-01 11:20 -0400): > Hmm, looking at the code, though, 50% is supposed to be the default > minimum. So there might actually be a bug. I did some testing... A file, containing 10 lines (about 200 bytes), renamed and then modified (similarity index being a bit over 50%). Git detected the rename just fine with "git diff -M" over the rename and change. When I edited the file even more (similarity only 40%) "git diff -M" didn't detect the rename but "git diff -M4" did. To me it looks like this works nicely, better than I expected, actually. Smaller files than that do not seem to work with "git diff -M" over the rename and changes. They can be followed with "git log --follow -p" which works even with the two-line "hello\nworld". And of course there is always git diff commit1:path1/file1 commit2:path2/file2 I'd conclude that for logs and diffs renames are detected very nicely and there's no problem at all to get wanted information from the repo. I wonder how this rename detection/tracking has become such a big thing, a debate even. But maybe merges are different. ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 20:39 ` Teemu Likonen @ 2008-05-01 23:09 ` Jeff King 2008-05-02 2:06 ` Sitaram Chamarty 1 sibling, 0 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 23:09 UTC (permalink / raw) To: Teemu Likonen; +Cc: Junio C Hamano, Ittay Dror, git [cc'd Junio for comments on this rename optimization] On Thu, May 01, 2008 at 11:39:40PM +0300, Teemu Likonen wrote: > > Hmm, looking at the code, though, 50% is supposed to be the default > > minimum. So there might actually be a bug. > > I did some testing... A file, containing 10 lines (about 200 bytes), > renamed and then modified (similarity index being a bit over 50%). Git Ah, OK. The problem comes because the toy example is so tiny. It hits this code chunk: if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE) return 0; where base_size is the size of the smaller file in bytes, and delta_size is the difference between the size of the two files. This is an optimization so that we don't even have to look at the contents. But it is basing the percentage off of the smaller file, so even though file B ("hello\nworld\n") is 50% made up of file A ("hello\n"), we actually end up saying "there must be at least as much content added to make B as there is in A already". IOW, the "percentage similarity" is based off of the smaller file for this optimization. Obviously this is a toy case, but I wonder if there are other larger cases where you end up with a file which has substantial copied content, but also _grows_ a lot (not just changes). For example, consider the file: 1 2 3 4 5 6 7 8 9 that is, nine lines each with a number. Now rename it, and start adding more numbers. We detect the addition of 10, 11, 12. But adding 13 means we no longer match. So even with only 4 lines added, we fail to match. But again, this is a bit of a toy case. It relies on the line length being a significant factor compared to number of lines. 
-Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
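[Editorial sketch] The size check quoted above can be tried on the numbered-file example. This is a minimal Python sketch, not git's code: the values MAX_SCORE = 60000 and a default minimum of 30000 (that is, 50%) are assumptions about git's internal scale, and only their ratio matters for the result. The function names are hypothetical.

```python
# Assumed constants: only the ratio minimum/MAX_SCORE (50%) matters here.
MAX_SCORE = 60000
DEFAULT_MINIMUM_SCORE = 30000


def passes_size_check(src_size, dst_size, minimum_score=DEFAULT_MINIMUM_SCORE):
    """Return False when the pair is rejected on file sizes alone.

    Mirrors the quoted condition: base_size is the smaller of the two
    sizes in bytes, delta_size is the difference between them.
    """
    base_size = min(src_size, dst_size)
    delta_size = abs(src_size - dst_size)
    rejected = base_size * (MAX_SCORE - minimum_score) < delta_size * MAX_SCORE
    return not rejected


def numbered_file_size(last):
    """Size in bytes of a file holding the lines 1..last, one number per line."""
    return sum(len(f"{n}\n") for n in range(1, last + 1))


if __name__ == "__main__":
    original = numbered_file_size(9)
    for last in range(10, 14):
        grown = numbered_file_size(last)
        print(last, original, grown, passes_size_check(original, grown))
```

With a 50% threshold the check reduces to "the size difference may not exceed half of the smaller file", which is why a file that only grows can fall out of rename detection before its contents are ever compared.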
* Re: detecting rename->commit->modify->commit 2008-05-01 20:39 ` Teemu Likonen 2008-05-01 23:09 ` Jeff King @ 2008-05-02 2:06 ` Sitaram Chamarty 2008-05-02 2:38 ` Junio C Hamano 1 sibling, 1 reply; 49+ messages in thread From: Sitaram Chamarty @ 2008-05-02 2:06 UTC (permalink / raw) To: Teemu Likonen; +Cc: Jeff King, Ittay Dror, git On Fri, May 2, 2008 at 2:09 AM, Teemu Likonen <tlikonen@iki.fi> wrote: > -M" didn't detect the rename but "git diff -M4" did. To me it looks like > this works nicely, better than I expected, actually. err... I didn't realise -M had an option, and I just double checked the man pages for diff, diff-files, diff-index, and diff-tree. What does the 4 mean? Sitaram ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-02 2:06 ` Sitaram Chamarty @ 2008-05-02 2:38 ` Junio C Hamano 2008-05-02 16:59 ` Sitaram Chamarty 0 siblings, 1 reply; 49+ messages in thread From: Junio C Hamano @ 2008-05-02 2:38 UTC (permalink / raw) To: Sitaram Chamarty; +Cc: Teemu Likonen, Jeff King, Ittay Dror, git "Sitaram Chamarty" <sitaramc@gmail.com> writes: > On Fri, May 2, 2008 at 2:09 AM, Teemu Likonen <tlikonen@iki.fi> wrote: > >> -M" didn't detect the rename but "git diff -M4" did. To me it looks like >> this works nicely, better than I expected, actually. > > err... I didn't realise -M had an option, and I just double checked > the man pages for diff, diff-files, diff-index, and diff-tree. What > does the 4 mean? The option to -M<num>, -C<num>, -B<num>/<num> are "raise or lower the similarity threshold to <num> / 10^N" where N is the number of digits in <num>. IOW, you will always be expressing number between 0 and 1. You should also be able to say -M40% but that is an ancient part of the code base so I might be misremembering things. ^ permalink raw reply [flat|nested] 49+ messages in thread
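[Editorial sketch] The "<num> / 10^N" rule described above can be written out in a few lines. This is a Python illustration of the arithmetic only, not git's option parser; the function name is hypothetical, and the "-M40%" spelling is left out because the message itself is unsure about it.

```python
from fractions import Fraction


def similarity_threshold(num: str) -> Fraction:
    """Interpret the digits after -M/-C/-B as num / 10**N, N = number of digits."""
    if not (num.isascii() and num.isdigit()):
        raise ValueError(f"expected one or more digits, got {num!r}")
    return Fraction(int(num), 10 ** len(num))


if __name__ == "__main__":
    for arg in ("4", "40", "75", "05"):
        print(f"-M{arg} ->", float(similarity_threshold(arg)))
```

Under this rule "-M4" and "-M40" name the same threshold, and a leading zero matters: "-M05" is 5%, not 50%.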
* Re: detecting rename->commit->modify->commit 2008-05-02 2:38 ` Junio C Hamano @ 2008-05-02 16:59 ` Sitaram Chamarty 0 siblings, 0 replies; 49+ messages in thread From: Sitaram Chamarty @ 2008-05-02 16:59 UTC (permalink / raw) To: Junio C Hamano; +Cc: Teemu Likonen, Jeff King, Ittay Dror, git On Fri, May 2, 2008 at 8:08 AM, Junio C Hamano <gitster@pobox.com> wrote: > The option to -M<num>, -C<num>, -B<num>/<num> are "raise or lower the > similarity threshold to <num> / 10^N" where N is the number of digits in > <num>. IOW, you will always be expressing number between 0 and 1. Thanks. The only mention of this I find (now) is in a file called diffcore.txt, which appears to exist only in the HTML documentation, but not in the "man" pages anywhere, as of 1.5.5. [ I pulled a few hairs out trying to find it in the man pages :-) ] I'd submit a patch, but a guy who takes the easy way out even to get the documentation (essentially doing a checkout of the "man" branch) would certainly not be able to test it :-( ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:08 ` Ittay Dror 2008-05-01 15:20 ` Jeff King @ 2008-05-01 15:24 ` Ittay Dror 2008-05-01 15:28 ` Jeff King 1 sibling, 1 reply; 49+ messages in thread From: Ittay Dror @ 2008-05-01 15:24 UTC (permalink / raw) To: Jeff King; +Cc: git Btw, this happened to me in a real use case. I wanted to restructure a source tree. So I put it under git and started to happily move things around, always committing after a move. I thought that git will correctly identify these moves and show me the differences I made after (in a separate commit). But it doesn't, and now that I want to prepare a summary of the changes I've made, I'm stuck with a huge diff that is hard to make sense of. Ittay Ittay Dror wrote: > But it doesn't work across directories :-(. > > Try: > >mkdir foo > >echo "hello" > foo/A > >git add foo/A > >git commit -m 'foo/A' > >mkdir bar > >git mv foo/A bar > >git commit -m 'bar/A' > >echo "world" >> bar/A > >git add bar/A > >git commit -m 'bar/A world' > >git diff HEAD^^..HEAD^ | cat > diff --git a/foo/A b/bar/A > similarity index 100% > rename from foo/A > rename to bar/A > > git diff HEAD^^.. | cat > diff --git a/bar/A b/bar/A > new file mode 100644 > index 0000000..94954ab > --- /dev/null > +++ b/bar/A > @@ -0,0 +1,2 @@ > +hello > +world > diff --git a/foo/A b/foo/A > deleted file mode 100644 > index ce01362..0000000 > --- a/foo/A > +++ /dev/null > @@ -1 +0,0 @@ > -hello > > > > > > Jeff King wrote: >> On Thu, May 01, 2008 at 05:10:24PM +0300, Ittay Dror wrote: >> >> >>> Say I have a file A, I rename to 'B', commit, then change file B >>> and commit. Does 'git diff -M HEAD^^..' detect that? From what I >>> see now, it will show 'B' as new (all of it with '+' prefix in the >>> output). Am I right? >>> >> >> Yes, it should find it, assuming the changes to B leave it recognizable. >> Try: >> >> mkdir repo && cd repo && git init >> cp /usr/share/dict/words A >> git add . 
&& git commit -m added >> mv A B && git add B && git commit -a -m rename >> echo change >>B && git commit -a -m change >> git diff -M HEAD^^.. | head -n 7 >> >> You should see something like: >> >> diff --git a/A b/B >> similarity index 99% >> rename from A >> rename to B >> index 8e50f11..6525618 100644 >> --- a/A >> +++ b/B >> >> However, note the similarity index. If you change B so much that it >> doesn't look close to the original A, then the rename is not detected >> (and intentionally so -- the argument is that it is no longer a rename >> in that context, but a rewritten file). >> >> -Peff >> -- >> To unsubscribe from this list: send the line "unsubscribe git" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:24 ` Ittay Dror @ 2008-05-01 15:28 ` Jeff King 0 siblings, 0 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 15:28 UTC (permalink / raw) To: Ittay Dror; +Cc: git On Thu, May 01, 2008 at 06:24:30PM +0300, Ittay Dror wrote: > Btw, this happened to me in a real use case. I wanted to restructure a > source tree. So I put it under git and started to happily move things > around, always committing after a move. I thought that git will correctly > identify these moves and show me the differences I made after (in a > separate commit). But it doesn't, and now that I want to prepare a > summary of the changes I've made, I'm stuck with a huge diff that is hard > to make sense of. If you have a specific case where you think renames should have been detected but they weren't, by all means, please share it. It's possible that there is a bug in the rename detection, or that the limits are not set correctly, and we could improve it. -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 14:10 detecting rename->commit->modify->commit Ittay Dror 2008-05-01 14:45 ` Jeff King @ 2008-05-01 14:54 ` Ittay Dror 2008-05-01 15:09 ` Jeff King ` (2 more replies) 1 sibling, 3 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-01 14:54 UTC (permalink / raw) To: git Also, would anyone like to comment on: http://www.markshuttleworth.com/archives/123 (Renaming is the killer app of distributed version control <http://www.markshuttleworth.com/archives/123>)? Thank you, Ittay Ittay Dror wrote: > Hi, > > Say I have a file A, I rename to 'B', commit, then change file B and > commit. Does 'git diff -M HEAD^^..' detect that? From what I see now, > it will show 'B' as new (all of it with '+' prefix in the output). Am > I right? > > Thank you, > Ittay > -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 14:54 ` Ittay Dror @ 2008-05-01 15:09 ` Jeff King 2008-05-01 15:20 ` Ittay Dror 2008-05-01 15:30 ` David Tweed 2008-05-01 15:27 ` Avery Pennarun 2008-05-01 16:39 ` Sitaram Chamarty 2 siblings, 2 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 15:09 UTC (permalink / raw) To: Ittay Dror; +Cc: git On Thu, May 01, 2008 at 05:54:06PM +0300, Ittay Dror wrote: > Also, would anyone like to comment on: > http://www.markshuttleworth.com/archives/123 (Renaming is the killer app > of distributed version control > <http://www.markshuttleworth.com/archives/123>)? My two cents: 1. I think he is overly obsessed with renaming. He seems concerned that somebody will show up, make a big renaming patch, and then break your system. Guess what? They can also show up, make a big code change patch, and then break your system. In either case you have to review the changes before accepting them, and it is up to the version control system to show you the changes in a way you can understand. 2. I see the same old "git developers decided renaming wasn't important" argument. I think this is bogus. I think renaming _is_ important, but I actually prefer git's approach of deducing renames, because it reflects a fundamental property of git: we track states, not changes, and git doesn't care how you arrive at each state. So I am free to use a combination of git commands, editors, patch application tools, or anything else to get my tree to the right place. 3. He doesn't like that git doesn't track _directory_ renames. This is not a fundamental problem with git's approach (which could deduce directory renames after the fact), but rather comes from the fact that directory renames are controversial. That is, even if you know (through deduction or because an explicit rename was recorded) that "subdir1" moved to "subdir2", that doesn't necessarily mean that new files added into "subdir1" should make that move, as well. 
-Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:09 ` Jeff King @ 2008-05-01 15:20 ` Ittay Dror 2008-05-01 15:30 ` David Tweed 1 sibling, 0 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-01 15:20 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King wrote: > My two cents: > > 1. I think he is overly obsessed with renaming. He seems concerned that > somebody will show up, make a big renaming patch, and then break your > system. Guess what? They can also show up, make a big code change patch, > and then break your system. In either case you have to review the > changes before accepting them, and it is up to the version control > system to show you the changes in a way you can understand I think he was more concerned that merges will break after such a change. Ittay -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:09 ` Jeff King 2008-05-01 15:20 ` Ittay Dror @ 2008-05-01 15:30 ` David Tweed 1 sibling, 0 replies; 49+ messages in thread From: David Tweed @ 2008-05-01 15:30 UTC (permalink / raw) To: Jeff King; +Cc: Ittay Dror, git On Thu, May 1, 2008 at 4:09 PM, Jeff King <peff@peff.net> wrote: > On Thu, May 01, 2008 at 05:54:06PM +0300, Ittay Dror wrote: > > > Also, would anyone like to comment on: > > http://www.markshuttleworth.com/archives/123 (Renaming is the killer app > > of distributed version control > > <http://www.markshuttleworth.com/archives/123>)? I'll just make the obvious point that he's talking about a problem and an underlying cause: The problem is not being able to successfully merge branches as time goes by when one branch has had some renaming. He's decided the root cause is not having an explicit representation of renames which would enable the merges to succeed. So there are two questions: 1. Does development often happen where files get renamed and then modified significantly in a distributed fashion but it is still sensible to automatically merge the results? 2. Do you need explicit rename tracking to do an automatic merge in those cases? I suspect that for 2 you don't in theory but considering all the non-obvious possibilities would slow down the normal case of a standard merge. -- cheers, dave tweed__________________________ david.tweed@gmail.com Rm 124, School of Systems Engineering, University of Reading. "while having code so boring anyone can maintain it, use Python." -- attempted insult seen on slashdot ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 14:54 ` Ittay Dror 2008-05-01 15:09 ` Jeff King @ 2008-05-01 15:27 ` Avery Pennarun 2008-05-01 15:34 ` Jeff King 2008-05-01 16:39 ` Sitaram Chamarty 2 siblings, 1 reply; 49+ messages in thread From: Avery Pennarun @ 2008-05-01 15:27 UTC (permalink / raw) To: Ittay Dror; +Cc: git On 5/1/08, Ittay Dror <ittayd@tikalk.com> wrote: > Also, would anyone like to comment on: > http://www.markshuttleworth.com/archives/123 (Renaming is > the killer app of distributed version control > <http://www.markshuttleworth.com/archives/123>)? One of the comments linked to this: http://automatthias.wordpress.com/2007/06/07/directory-renaming-in-scm/ Which points out that git doesn't really handle directory renames at all. If someone creates file A/X then renames A to B, then merges with someone who both added the file A/Y and modified A/X, git will produce a tree containing (modified) B/X and (new) A/Y. Technically this is "correct" in that no data is lost and there are no conflicts, but it is obviously not what was "intended", which was that the new file Y should have ended up in folder B. Before you say this is not a realistic use case, I've personally had this exact problem: - I had a project with all of my work in a folder "src" - I decided that the 'src' folder was redundant, so I moved it all to the root folder - Someone else was working on an old maintenance branch which still had 'src' - When I merged from that person, some new files were created under 'src', and of course didn't work. Since the maintenance branch was long-lived, this problem happened repeatedly. That said, it's also pretty easy to work around, so it's not the end of the world. Have fun, Avery ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:27 ` Avery Pennarun @ 2008-05-01 15:34 ` Jeff King 2008-05-01 15:50 ` Avery Pennarun 2008-05-01 19:12 ` Steven Grimm 0 siblings, 2 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 15:34 UTC (permalink / raw) To: Avery Pennarun; +Cc: Ittay Dror, git On Thu, May 01, 2008 at 11:27:34AM -0400, Avery Pennarun wrote: > Before you say this is not a realistic use case, I've personally had > this exact problem: > > - I had a project with all of my work in a folder "src" > - I decided that the 'src' folder was redundant, so I moved it all to > the root folder > - Someone else was working on an old maintenance branch which still had 'src' > - When I merged from that person, some new files were created under > 'src', and of course didn't work. Sure. But we've also had the exact case of: - there are some files in subdir/, but that is not a good name, and there is something else that you are going to add that would be better named as subdir/. - you rename subdir/ to bettername/ - you create subdir/newfile but you _don't_ want newfile to go into bettername/. It's _replacing_ what went into bettername/. So I don't think you can always track the intent automatically. Though if you could specify the intent to the SCM, you could differentiate at the time of move between these two cases, and the merge could do the right thing later. Or alternatively, you could specify at time of merge which to do. It's just that nobody has implemented it. -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:34 ` Jeff King @ 2008-05-01 15:50 ` Avery Pennarun 2008-05-01 16:48 ` Jeff King 2008-05-01 19:12 ` Steven Grimm 1 sibling, 1 reply; 49+ messages in thread From: Avery Pennarun @ 2008-05-01 15:50 UTC (permalink / raw) To: Jeff King; +Cc: Ittay Dror, git On 5/1/08, Jeff King <peff@peff.net> wrote: > On Thu, May 01, 2008 at 11:27:34AM -0400, Avery Pennarun wrote: > > > Before you say this is not a realistic use case, I've personally had > > this exact problem: > > > > - I had a project with all of my work in a folder "src" > > - I decided that the 'src' folder was redundant, so I moved it all to > > the root folder > > - Someone else was working on an old maintenance branch which still had 'src' > > - When I merged from that person, some new files were created under > > 'src', and of course didn't work. > > > Sure. But we've also had the exact case of: > > - there are some files in subdir/ [1], but that is not a good name, and > there is something else that you are going to add that would be > better named as subdir/. > - you rename subdir/ to bettername/ [2] > - you create subdir/newfile [3] > > but you _don't_ want newfile to go into bettername/. It's _replacing_ > what went into bettername/. I would argue that this is a sort of "directory splitting" operation. That is, all anyone ever did was add some files to a subdir/ that already existed [1], *or* move all the files from subdir/ to a previously-empty bettername/ [2], *or* create a new subdir/ and add files to it [3]. In each case, no merge operation was necessary and it is completely obvious by comparing "before and after" trees which case it was. I guess my argument here is just that it should be *possible* to deduce and implement both cases at merge time just fine using git's existing storage model. It just hasn't been implemented yet. 
(And incidentally, I think that's totally awesome and I'd never want to go back to an explicit rename tracking model.) I should shut up now because the actual merge machinery scares me and I'm not willing to volunteer to write a patch for this one :) Have fun, Avery ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:50 ` Avery Pennarun @ 2008-05-01 16:48 ` Jeff King 2008-05-01 19:45 ` Avery Pennarun 0 siblings, 1 reply; 49+ messages in thread From: Jeff King @ 2008-05-01 16:48 UTC (permalink / raw) To: Avery Pennarun; +Cc: Ittay Dror, git On Thu, May 01, 2008 at 11:50:31AM -0400, Avery Pennarun wrote: > I would argue that this is a sort of "directory splitting" operation. > That is, all anyone ever did was add some files to a subdir/ that > already existed [1], *or* move all the files from subdir/ to a > previously-empty bettername/ [2], *or* create a new subdir/ and add > files to it [3]. In each case, no merge operation was necessary and it > is completely obvious by comparing "before and after" trees which case > it was. I don't see it. I think the steps are exactly the same as in your example. Consider: 1. You have some files in src/ 2. All of the files from src/ get moved away 3. You merge in somebody else's work which adds a file in src/, but their work is based on a commit which predates 2. The question is: if they had seen 2., would they have put the file into src/, or into the new location? I think the answer depends on the semantics of the file. If it is semantically an addition to the source code that got moved, then yes. If it is a _replacement_ for the source code that got moved, then no. > I guess my argument here is just that it should be *possible* to > deduce and implement both cases at merge time just fine using git's > existing storage model. It just hasn't been implemented yet. (And > incidentally, I think that's totally awesome and I'd never want to go > back to an explicit rename tracking model.) I think you lack information to decide automatically between the two cases listed above. But I think in most cases it would be sufficient for the tool to say "this directory seems to have moved, but this new file was added in it" and let the user decide which makes sense. 
> I should shut up now because the actual merge machinery scares me and > I'm not willing to volunteer to write a patch for this one :) It would probably start not with merge machinery, but with diff machinery to detect "directory has moved". But that is also scary. :) You could also do this totally _outside_ of git, similar to git-mergetool. Wait until you get a conflict, and then run a script which looks at the two endpoints and the merge base and says "Oh, maybe this is a good way of resolving." -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 16:48 ` Jeff King @ 2008-05-01 19:45 ` Avery Pennarun 2008-05-01 22:42 ` Jeff King 0 siblings, 1 reply; 49+ messages in thread From: Avery Pennarun @ 2008-05-01 19:45 UTC (permalink / raw) To: Jeff King; +Cc: Ittay Dror, git On Thu, May 1, 2008 at 12:48 PM, Jeff King <peff@peff.net> wrote: > I don't see it. I think the steps are exactly the same as in your > example. Consider: > > 1. You have some files in src/ > 2. All of the files from src/ get moved away > 3. You merge in somebody else's work which adds a file in src/, but > their work is based on a commit which predates 2. > > The question is: if they had seen 2., would they have put the file into > src/, or into the new location? I think the answer depends on the > semantics of the file. If it is semantically an addition to the source > code that got moved, then yes. If it is a _replacement_ for the > source code that got moved, then no. I promised I would shut up, and I apparently didn't. Sorry :) I think this case isn't so hard. Basically, a merge involves three commits: the merge-base, my branch, and your branch. In your example above, we compare the merge-base to the new version; in that case, the new file is in an *existing* directory which definitely corresponds to src/ in #1, because the new version has never even heard about src/ being deleted. Thus, the file must be intended to be part of the original src/, wherever it may now be. In contrast, if the merge-base already had src/ being renamed, and someone put something into src/, we'd know that they're putting it into a fundamentally different directory than the moved src/. Exactly how you track the "identity" of a directory without breaking things down by individual commit sounds a little complicated, but it feels to me like it should be possible. 
I suspect this is a generalization of the earlier discussion (a few months ago) that I read in the archive about git's handling of empty directories. Right now git does weird things with directory creation/deletion because directories are not first-class citizens. Anyway, as with the empty directory stuff, if I occasionally have to mkdir/rmdir a couple things and rename a few files after doing a merge, I'm not going to cry too much. It sure beats explicitly tracking renames and then having an oops-I-forgot-to-explicitly-track rename throw a monkey wrench into my merges, which svn has saddled me with lots of times. Have fun, Avery ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 19:45 ` Avery Pennarun @ 2008-05-01 22:42 ` Jeff King 0 siblings, 0 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 22:42 UTC (permalink / raw) To: Avery Pennarun; +Cc: Ittay Dror, git On Thu, May 01, 2008 at 03:45:07PM -0400, Avery Pennarun wrote: > In your example above, we compare the merge-base to the new version; > in that case, the new file is in an *existing* directory which > definitely corresponds to src/ in #1, because the new version has > never even heard about src/ being deleted. Thus, the file must be > intended to be part of the original src/, wherever it may now be. I disagree with the final statement of the quoted paragraph above. Just because you didn't build on the commit that moved src/* doesn't mean the thing you put in src/ was intended to be moved along with src/. For example: - it might have been new work unrelated to the existing work in src/ that got moved - it might have been a replacement for the work in src/ that was started before the movement. E.g., developer1 begins the replacement work. developer2 moves the old work out of the way. When the branches are merged, you don't want developer1's work moved. And yes, I think those are probably less common than "it should be moved along with src/*". My point isn't that this isn't a valuable construct, but that we should stop short of mind-reading, and focus on making it _easy_ to see what happened and to concisely specify the choice and proceed. -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 15:34 ` Jeff King 2008-05-01 15:50 ` Avery Pennarun @ 2008-05-01 19:12 ` Steven Grimm 2008-05-01 23:14 ` Jeff King 1 sibling, 1 reply; 49+ messages in thread From: Steven Grimm @ 2008-05-01 19:12 UTC (permalink / raw) To: Jeff King; +Cc: Avery Pennarun, Ittay Dror, git On May 1, 2008, at 8:34 AM, Jeff King wrote: > So I don't think you can always track the intent automatically. That is absolutely true. You have to pick one case or the other as the default unless there's some way to tell the system your intent either at merge time or at move time. However, that leaves the question of which default will be wrong the least often. In my personal experience, I think a directory rename has almost always meant that I would want new files to appear in the new directory rather than to recreate the old directory. I can't think of a single time when I've wanted git's current behavior (though maybe it's happened on occasion) but the current behavior has tripped me up more than once and forced me to do extra work shuffling things around by hand post-merge. I acknowledge that there exist cases where the current behavior is correct -- but in my experience they're the minority. Of course, the discussion is moot anyway until someone writes code to detect the situation; my impression is the current behavior is the way it is simply because it's what naturally happens in the absence of merge-time detection of a directory getting renamed. -Steve ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 19:12 ` Steven Grimm @ 2008-05-01 23:14 ` Jeff King 2008-05-03 17:56 ` merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) Ittay Dror 2008-05-08 18:17 ` detecting rename->commit->modify->commit Jeff King 0 siblings, 2 replies; 49+ messages in thread From: Jeff King @ 2008-05-01 23:14 UTC (permalink / raw) To: Steven Grimm; +Cc: Avery Pennarun, Ittay Dror, git On Thu, May 01, 2008 at 12:12:33PM -0700, Steven Grimm wrote: > However, that leaves the question of which default will be wrong the > least often. > > In my personal experience, I think a directory rename has almost always > meant that I would want new files to appear in the new directory rather I do agree that the rename is probably more often desired. > Of course, the discussion is moot anyway until someone writes code to > detect the situation; my impression is the current behavior is the way it > is simply because it's what naturally happens in the absence of > merge-time detection of a directory getting renamed. Yes, I think that is largely a correct impression (although I think Linus has spoken out against directory renaming in the past, so there is at least a little bit of conscious effort). I suspect the right sequence of steps to implement this would be: 1. write a proof-of-concept that shows directory renaming after the fact (e.g., take a conflicted merge, scan the diff for directory renames, and then fix up the files). That way it is available, but doesn't impact git at all. 2. If people think it is useful, build it into the diff and merge machinery so that it can happen automagically, but make it optional. Thus git fully supports it, but the policy decision is left up to the user. 3. Make it the default if it is the common choice. So we just need somebody to volunteer to work on 1. ;) -Peff ^ permalink raw reply [flat|nested] 49+ messages in thread
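A rough sketch of what step 1 could look like, purely as an illustration: it groups the file-level renames reported by `git diff -M --name-status` by directory and counts them. The helper name `guess_dir_renames` and the demo repository layout are made up for this example and are not part of git; paths with unusual characters (which git quotes in this output) are not handled.

```shell
#!/bin/sh
set -e

# Hypothetical helper, not a git command: print "<count><TAB><old dir> -> <new dir>"
# for every pair of directories that file-level renames moved between.
guess_dir_renames() {
    git diff -M --name-status --diff-filter=R "$1" "$2" |
    awk -F '\t' '
        # Directory part of a path ("." for top-level files).
        function dir(p) {
            if (index(p, "/") == 0) return "."
            sub("/[^/]*$", "", p)
            return p
        }
        # Rename lines look like: R<score> TAB old-path TAB new-path
        {
            from = dir($2); to = dir($3)
            if (from != to) count[from " -> " to]++
        }
        END { for (pair in count) print count[pair] "\t" pair }
    '
}

# Throwaway demo repository (all names below are assumptions for the demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir src
echo one > src/a.c
echo two > src/b.c
git add src
git commit -q -m 'add src'

git mv src lib
git commit -q -m 'rename src to lib'

# Both files moved from src/ to lib/, so the pair is counted twice.
result=$(guess_dir_renames HEAD^ HEAD)
echo "$result"
```

A script along these lines only reports candidates; deciding whether a new file in the old directory should follow the rename is still left to the user, which matches the "stop short of mind-reading" point made earlier in the thread.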
* merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) 2008-05-01 23:14 ` Jeff King @ 2008-05-03 17:56 ` Ittay Dror 2008-05-03 18:11 ` Avery Pennarun 2008-05-08 18:17 ` detecting rename->commit->modify->commit Jeff King 1 sibling, 1 reply; 49+ messages in thread From: Ittay Dror @ 2008-05-03 17:56 UTC (permalink / raw) To: git Can someone comment whether supporting merges after renames will be on the Git roadmap? As a Java developer, I can say that refactoring of class names and packages happens quite often. Having to remember I've made this change throughout the lifetime of a branch (or master, until pushed to a central repository), and needing to manually merge changes to files / packages (directories) I've refactored is something that I want my VCS to do. Thank you, Ittay Jeff King wrote: > On Thu, May 01, 2008 at 12:12:33PM -0700, Steven Grimm wrote: > > >> However, that leaves the question of which default will be wrong the >> least often. >> >> In my personal experience, I think a directory rename has almost always >> meant that I would want new files to appear in the new directory rather >> > > I do agree that the rename is probably more often desired. > > >> Of course, the discussion is moot anyway until someone writes code to >> detect the situation; my impression is the current behavior is the way it >> is simply because it's what naturally happens in the absence of >> merge-time detection of a directory getting renamed. >> > > Yes, I think that is largely a correct impression (although I think > Linus has spoken out against directory renaming in the past, so there is > at least a little bit of conscious effort). I suspect the right sequence > of steps to implement this would be: > > 1. write a proof-of-concept that shows directory renaming after the > fact (e.g., take a conflicted merge, scan the diff for directory > renames, and then fix up the files). That way it is available, but > doesn't impact git at all. > > 2. 
If people think it is useful, build it into the diff and merge > machinery so that it can happen automagically, but make it > optional. Thus git fully supports it, but the policy decision is > left up to the user. > > 3. Make it the default if it is the common choice. > > So we just need somebody to volunteer to work on 1. ;) > > -Peff -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) 2008-05-03 17:56 ` merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) Ittay Dror @ 2008-05-03 18:11 ` Avery Pennarun 2008-05-04 6:08 ` merge renamed files/directories? Ittay Dror 0 siblings, 1 reply; 49+ messages in thread From: Avery Pennarun @ 2008-05-03 18:11 UTC (permalink / raw) To: Ittay Dror; +Cc: git On 5/3/08, Ittay Dror <ittayd@tikalk.com> wrote: > Can someone comment whether supporting merges after renames will be on the > Git roadmap? > > As a Java developer, I can say that refactoring of class names and packages > happens quite often. Having to remember I've made this change throughout the > lifetime of a branch (or master, until pushed to a central repository), and > needing to manually merge changes to files / packages (directories) I've > refactored is something that I want my VCS to do. Git already works fine for renames. The only situation where something funny happens is if you rename a whole directory and someone else creates a file in the old directory. (In that case, the new file ends up in the old place instead of the new place.) However, even in that case, there is still no conflict and no manual merging necessary. In fact, as someone else pointed out, renaming a java file requires you to modify the file anyhow, so having git auto-move the file to another directory *still* wouldn't make it work any better. Have fun, Avery ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-03 18:11 ` Avery Pennarun @ 2008-05-04 6:08 ` Ittay Dror 2008-05-04 9:34 ` Jakub Narebski 2008-05-05 16:40 ` Avery Pennarun 0 siblings, 2 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-04 6:08 UTC (permalink / raw) To: Avery Pennarun; +Cc: git Avery Pennarun wrote: > Git already works fine for renames. The only situation where > something funny happens is if you rename a whole directory and someone > else creates a file in the old directory. (In that case, the new file > ends up in the old place instead of the new place.) However, even in > that case, there is still no conflict and no manual merging necessary. > > Sorry, but this is not the situation as I have experienced it with a local repository I have. I renamed a directory (without changing any files in it). 'git diff <commit>^ <commit>' shows the rename fine, but 'git log -p -M -C <initial commit>..' does not (that is, the history for files in that directory is shown from the rename commit only). Obviously git-diff is not any better. > In fact, as someone else pointed out, renaming a java file requires > you to modify the file anyhow, so having git auto-move the file to > another directory *still* wouldn't make it work any better. > > Sure it will, because otherwise I need to move it and still need to fix it. And there are many other file formats and languages where such a move will not require any change (I think it is funny that Java is a justification for not doing something for a tool primarily used by C people). Also, what happens if I change the file in the new location and someone else changes it in the old location? Will I need to do a manual merge? -- Ittay Dror <ittayd@tikalk.com> Tikal <http://www.tikalk.com> Tikal Project <http://tikal.sourceforge.net> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-04 6:08 ` merge renamed files/directories? Ittay Dror 2008-05-04 9:34 ` Jakub Narebski 2008-05-05 16:40 ` Avery Pennarun 1 sibling, 0 replies; 49+ messages in thread From: Jakub Narebski @ 2008-05-04 9:34 UTC (permalink / raw) To: Ittay Dror; +Cc: Avery Pennarun, git Ittay Dror <ittayd@tikalk.com> writes: > Avery Pennarun wrote: > > Git already works fine for renames. The only situation where > > something funny happens is if you rename a whole directory and someone > > else creates a file in the old directory. (In that case, the new file > > ends up in the old place instead of the new place.) However, even in > > that case, there is still no conflict and no manual merging necessary. > > Sorry, but this is not the situation as I have experienced it with a > local repository I have. I renamed a directory (without changing any > files in it). 'git diff <commit>^ <commit>' shows the rename fine, but > 'git log -p -M -C <initial commit>..' does not (that is, the history > for files in that directory is shown from the rename commit > only). Obviously git-diff is not any better. This is one thing where git differs from other SCMs. In "git log -- <path>" (which is what I assume you have used) the <path> argument is a path limiter. It allows specifying more than one directory or file. Unfortunately, "git log --follow <file>" currently works only for single files and doesn't yet work for directories; this is caused, among other things, by the lack of directory rename detection in git. > [...] Also, what happens if I change the file in the new location > and someone else changes it in the old location? Will I need to do a > manual merge? No, rename detection should make automatic merge possible. -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 49+ messages in thread
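To make the difference between a path limiter and `--follow` concrete, here is a small sketch in a throwaway repository. The directory and file names are assumptions for the demo; the only git features used are `git log -- <path>` and `git log --follow -- <file>`.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Commit 1: the file is created under olddir/.
mkdir olddir
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > olddir/file
git add olddir
git commit -q -m 'add file'

# Commit 2: the whole directory is renamed, contents untouched.
git mv olddir newdir
git commit -q -m 'rename olddir to newdir'

# Commit 3: the file is modified at its new location.
echo change >> newdir/file
git commit -q -a -m 'change file'

# Path limiter only: history starts at the commit where this path appeared.
plain=$(git log --oneline -- newdir/file | wc -l | tr -d ' ')

# --follow: continues through the rename, but accepts a single file only.
followed=$(git log --oneline --follow -- newdir/file | wc -l | tr -d ' ')

echo "without --follow: $plain commits, with --follow: $followed commits"
```

In this setup the path-limited log sees the rename and the later change, while `--follow` also reaches the commit that created the file under its old name.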
* Re: merge renamed files/directories? 2008-05-04 6:08 ` merge renamed files/directories? Ittay Dror 2008-05-04 9:34 ` Jakub Narebski @ 2008-05-05 16:40 ` Avery Pennarun 2008-05-05 21:49 ` Robin Rosenberg 1 sibling, 1 reply; 49+ messages in thread From: Avery Pennarun @ 2008-05-05 16:40 UTC (permalink / raw) To: Ittay Dror; +Cc: git On 5/4/08, Ittay Dror <ittayd@tikalk.com> wrote: > Avery Pennarun wrote: > > In fact, as someone else pointed out, renaming a java file requires > > you to modify the file anyhow, so having git auto-move the file to > > another directory *still* wouldn't make it work any better. > > Sure it will, because otherwise I need to move it and still need to fix it. > And there are many other file formats and languages where such a move will > not require any change (I think it is funny that Java is a justification for > not doing something for a tool primarily used by C people). I mentioned Java because you mentioned you were working in java. The particular problem with Java doesn't happen to C people. Imagine, for example, that I add a new file, lib/foo.c, to lib/lib.a (thus they have to modify lib/Makefile), while someone else renames "lib" to "bettername". When I merge, if git would create bettername/foo.c (it currently won't) and properly automerge bettername/Makefile (it will), then the program would still compile correctly. However this doesn't work in Java: lib/foo.java would include the word "lib" in its contents (in the namespace declaration) and so there's no way automatic merging would have resulted in a version that compiles correctly. So what I said isn't to *justify* git's behaviour, merely to point out that in java's case, there seems to be no way to get fully automatic merging that would work. In C, this case would have worked, if only git supported directory renames. In neither case is it very much work to fix by hand, though :) Have fun, Avery ^ permalink raw reply [flat|nested] 49+ messages in thread
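The lib/ to bettername/ scenario can be reproduced with a short script. This is only a sketch: the branch names, file contents and Makefile are invented for the demo. One assumption deserves a loud label: Git releases much newer than this thread can detect directory renames during a merge, so the script passes `merge.directoryRenames=false` to get the behaviour described in this thread (the new file stays in the old directory).

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Common ancestor: lib/ with one source file and a tiny Makefile.
mkdir lib
echo 'int a;' > lib/a.c
printf 'OBJS = a.o\nlib.a: $(OBJS)\n' > lib/Makefile
git add lib
git commit -q -m 'base'
git branch rename-side
git branch add-side

# One developer renames the directory.
git checkout -q rename-side
git mv lib bettername
git commit -q -m 'rename lib to bettername'

# Another developer adds lib/foo.c and lists it in lib/Makefile.
git checkout -q add-side
echo 'int foo;' > lib/foo.c
printf 'OBJS = a.o foo.o\nlib.a: $(OBJS)\n' > lib/Makefile
git add lib
git commit -q -m 'add foo.c'

# Merge with directory rename detection switched off (assumption: this
# config key is simply ignored by Git versions that predate it).
git checkout -q rename-side
git -c merge.directoryRenames=false merge -q -m 'merge add-side' add-side

# The Makefile edit followed the file-level rename; foo.c did not move.
grep foo.o bettername/Makefile
ls lib
```

The Makefile change lands in bettername/Makefile because file-level rename detection pairs lib/Makefile with bettername/Makefile, while lib/foo.c has no counterpart to be paired with and is simply added where it was created.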
* Re: merge renamed files/directories? 2008-05-05 16:40 ` Avery Pennarun @ 2008-05-05 21:49 ` Robin Rosenberg 2008-05-05 22:20 ` Linus Torvalds 0 siblings, 1 reply; 49+ messages in thread From: Robin Rosenberg @ 2008-05-05 21:49 UTC (permalink / raw) To: Avery Pennarun; +Cc: Ittay Dror, git On Monday, 5 May 2008 18:40:24, Avery Pennarun wrote: > On 5/4/08, Ittay Dror <ittayd@tikalk.com> wrote: > > Avery Pennarun wrote: > > > In fact, as someone else pointed out, renaming a java file requires > > > you to modify the file anyhow, so having git auto-move the file to > > > another directory *still* wouldn't make it work any better. > > > > Sure it will, because otherwise I need to move it and still need to fix > > it. And there are many other file formats and languages where such a move > > will not require any change (I think it is funny that Java is a > > justification for not doing something for a tool primarily used by C > > people). > > I mentioned Java because you mentioned you were working in java. > > The particular problem with Java doesn't happen to C people. Imagine, > for example, that I add a new file, lib/foo.c, to lib/lib.a (thus they > have to modify lib/Makefile), while someone else renames "lib" to > "bettername". > > When I merge, if git would create bettername/foo.c (it currently > won't) and properly automerge bettername/Makefile (it will), then the > program would still compile correctly. However this doesn't work in > Java: lib/foo.java would include the word "lib" in its contents (in > the namespace declaration) and so there's no way automatic merging > would have resulted in a version that compiles correctly. You will always find corner cases. Line-by-line merge happens to work, not because it is the theoretically correct way, but because we have discovered that it nearly always works so our need for more specialized merging is not huge. We have also adapted our development practices to the way line-by-line merging works, i.e. 
we avoid binary files and funny text file formats. > So what I said isn't to *justify* git's behaviour, merely to point out > that in java's case, there seems to be no way to get fully automatic > merging that would work. In C, this case would have worked, if only > git supported directory renames. Sure, a merge that understands this is java and does the correct thing. Even your case for C (with hypothetical directory rename detection) would fail if the renamed directory was used in an #include statement (like #include <lib/foo.h>). Say someone thinks xxdiff should move to lib/xxdiff, while someone else adds a new reference to <xxdiff/xxdiff.h>. To resolve all cases you must have tools that understand what they are doing. Directory rename detection only solves a few cases, but it may be easy enough to implement to warrant the effort to get the tick in the box. > > In neither case is it very much work to fix by hand, though :) I agree on that. -- robin ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-05 21:49 ` Robin Rosenberg @ 2008-05-05 22:20 ` Linus Torvalds 2008-05-05 23:07 ` Steven Grimm 2008-05-06 1:38 ` Avery Pennarun 0 siblings, 2 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-05 22:20 UTC (permalink / raw) To: Robin Rosenberg; +Cc: Avery Pennarun, Ittay Dror, git On Mon, 5 May 2008, Robin Rosenberg wrote: > > You will always find corner cases. .. and btw, this is why merging should always - be predictable (which implies "simple": overly clever merging, and especially merging that takes complex history into account is *bad*, because it's still going to do the wrong thing, but now it's going to do so much less predictably) - be amenable to manual fixes even when it succeeds (ie even if an automatic merge completes without errors, a subsequent build may find problems, and a "git commit --amend" may well be the right thing to do!) - aim for (preferably easily-handled) conflicts when the unusual cases happen. Conflicts for *common* things are bad, because they just cause more work, and people get too complacent about fixing them. But similarly, thinking that the unusual cases should be handled automatically is also wrong - because the unusual cases are likely the ones that need some manual resolution anyway. Git will never do merges "perfectly", if only because it's fundamentally impossible to do that. But one thing git *does* do is to make it pretty damn easy to handle it. I really don't understand why people expect a directory rename to be handled automatically, when it is (a) not that common and (b) not obvious what the solution is, but MOST OF ALL (c) so damn _easy_ to handle it manually after-the-fact when you notice that something doesn't compile! Really. 
If you have a file that was created in the wrong subdirectory (and please admit that this is not common - it requires not just a directory rename, but also a file create in another branch at the same time), what's so hard with just doing

    make
    .. oh, oops, that was pretty obvious, the expected source file didn't exist ..
    git mv olddir/file newdir/file
    git commit --amend

and "Tadaa! All done". Your merge that was *fundamentally impossible* to do automatically, was trivially done manually, with no actual big head-scratching involved. Linus ^ permalink raw reply [flat|nested] 49+ messages in thread
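For readers who want to try the manual fix-up end to end, the following sketch builds the situation in a scratch repository and then applies the `git mv` plus `git commit --amend` recipe. The branch names and file contents are assumptions for the demo, and `merge.directoryRenames=false` is set only so that newer Git versions reproduce the behaviour discussed in this thread.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir olddir
echo 'int main(void) { return 0; }' > olddir/main.c
git add olddir
git commit -q -m 'base'
git branch renamed
git branch added

# Branch 1: directory rename.
git checkout -q renamed
git mv olddir newdir
git commit -q -m 'rename olddir to newdir'

# Branch 2: new file created in the old directory.
git checkout -q added
echo 'int helper(void) { return 1; }' > olddir/file.c
git add olddir
git commit -q -m 'add file.c'

# Clean merge, but file.c ends up under olddir/.
git checkout -q renamed
git -c merge.directoryRenames=false merge -q -m 'merge added' added

# The manual fix from the message above: move the file, fold it into the merge.
git mv olddir/file.c newdir/file.c
git commit -q --amend --no-edit

# The amended commit is still a merge: its id plus two parent ids.
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
echo "ids on the merge commit line: $words"
```

Amending rewrites the merge commit, so this is only appropriate before the merge has been pushed anywhere.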
* Re: merge renamed files/directories? 2008-05-05 22:20 ` Linus Torvalds @ 2008-05-05 23:07 ` Steven Grimm 2008-05-06 0:29 ` Linus Torvalds 2008-05-06 1:38 ` Avery Pennarun 1 sibling, 1 reply; 49+ messages in thread From: Steven Grimm @ 2008-05-05 23:07 UTC (permalink / raw) To: Linus Torvalds; +Cc: Robin Rosenberg, Avery Pennarun, Ittay Dror, git On May 5, 2008, at 3:20 PM, Linus Torvalds wrote: > I really don't understand why people expect a directory rename to be > handled automatically, when it is (a) not that common and (b) not > obvious what the solution is, but MOST OF ALL (c) so damn _easy_ to handle it > manually after-the-fact when you notice that something doesn't > compile! Assuming all you track with git is source code that has dependencies such that a compile command fails cleanly when things end up in the wrong directory, sure. If you're using git to, say, track a tree of documentation files or images that are referred to using relative URLs in HTML pages, detecting the breakage is less trivial unless you have a really solid automated QA process that can check for dangling references. Are directory renames as common as file renames? Certainly not. But they happen often enough that it's annoying to have to manually clean up after them. Note that I did not say it is difficult or impossible to manually clean up after them. I think the number of people who've mentioned this on the list should stand as some kind of refutation of the idea that directory renames are so vanishingly rare as to not be worth mentioning. I've run into the problem a few times myself. > and "Tadaa! All done". Your merge that was *fundamentally > impossible* to > do automatically, was trivially done manually, with no actual big > head-scratching involved.

    $ mkdir parent
    $ cd parent
    $ hg init
    $ mkdir subdir1
    $ echo "I am the walrus" > subdir1/file1
    $ hg add subdir1/file1
    $ hg commit -m 'initial commit'
    $ cd ..
    $ hg clone parent child
    $ cd child
    $ hg mv subdir1 subdir2
    $ hg commit -m 'rename subdir1 to subdir2'
    $ cd ../parent
    $ echo 'I love prunes' > subdir1/file2
    $ hg add subdir1/file2
    $ hg commit -m 'new file in subdir'
    $ cd ../child
    $ hg pull
    $ hg merge
    $ ls subdir2
    file1 file2

Doesn't seem *fundamentally* impossible to produce the results that are most likely to be what people want. (Which doesn't equal "guaranteed to be 100% correct 100% of the time or your money back" -- as you say, merging is an inexact science.) -Steve ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-05 23:07 ` Steven Grimm @ 2008-05-06 0:29 ` Linus Torvalds 2008-05-06 0:40 ` Linus Torvalds 2008-05-06 15:47 ` Theodore Tso 0 siblings, 2 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-06 0:29 UTC (permalink / raw) To: Steven Grimm; +Cc: Robin Rosenberg, Avery Pennarun, Ittay Dror, git On Mon, 5 May 2008, Steven Grimm wrote: > > Doesn't seem *fundamentally* impossible to produce the results that are most > likely to be what people want. You didn't understand what was fundamentally impossible. And btw, this has nothing to do with directory renames either. There are tons of these kinds of merge issues that bad SCM developers have been masturbating over for YEARS. There's a whole science of making idiotic new merging models, one fancier than the other. The fact is, you cannot do a perfect job, the best thing you can do is pick a simple model, and try to make it repeatable and easy to fix up. Maybe somebody bothers to implement some directory rename heuristic some day. Quite frankly, I personally cannot care less. It really is mental masturbation, and has absolutely no relevance for any real-world problem. Linus ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 0:29 ` Linus Torvalds @ 2008-05-06 0:40 ` Linus Torvalds 2008-05-06 15:47 ` Theodore Tso 1 sibling, 0 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-06 0:40 UTC (permalink / raw) To: Steven Grimm; +Cc: Robin Rosenberg, Avery Pennarun, Ittay Dror, git On Mon, 5 May 2008, Linus Torvalds wrote: > > There are tons of these kinds of merge issues that bad SCM developers > have been masturbating over for YEARS. .. and if I sound rather less than enthused about these kinds of issues, it's because of having seen years and years of people talking about merge strategies, and then at the same time using SVN which doesn't even record the parenthood of the resulting merges, or thinking that code always moves with whole files. In other words, the details don't even matter. What matters is not being a total piece of sh*t in the big picture. Linus ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 0:29 ` Linus Torvalds 2008-05-06 0:40 ` Linus Torvalds @ 2008-05-06 15:47 ` Theodore Tso 2008-05-06 16:10 ` Linus Torvalds 1 sibling, 1 reply; 49+ messages in thread From: Theodore Tso @ 2008-05-06 15:47 UTC (permalink / raw) To: Linus Torvalds Cc: Steven Grimm, Robin Rosenberg, Avery Pennarun, Ittay Dror, git On Mon, May 05, 2008 at 05:29:12PM -0700, Linus Torvalds wrote: > > Maybe somebody bothers to implement some directory rename heuristic some > day. Quite frankly, I personally cannot care less. It really is mental > masturbation, and has absolutely no relevance for any real-world problem. > Actually, the directory rename heuristic *does* have relevance in at least some real-world cases. For example, MySQL has plugin directories, and occasionally the plugins get renamed, for whatever reason. If a plugin gets renamed, so does its directory, and if the rename operation happens in an experimental (or devel) branch, but then for whatever reason, a new file is created in the devel (or maint) branch, without the directory rename heuristic, when the changeset is pulled into the experimental (or devel) branch, the file will be created in the wrong directory. So it may be rare, but this kind of thing does happen in the real world. - Ted ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 15:47 ` Theodore Tso @ 2008-05-06 16:10 ` Linus Torvalds 2008-05-06 16:15 ` Linus Torvalds 2008-05-06 16:32 ` Ittay Dror 0 siblings, 2 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-06 16:10 UTC (permalink / raw) To: Theodore Tso Cc: Steven Grimm, Robin Rosenberg, Avery Pennarun, Ittay Dror, git On Tue, 6 May 2008, Theodore Tso wrote: > > Actually, the directory rename heuristic *does* have relevance in at > least some real-world cases. For example, MySQL has plugin > directories, and occasionally the plugins get renamed, for whatever > reason. I'm not saying that directory renames don't happen. I don't even say that merges across directory renames don't happen. I *am* saying that it's not a problem. It's like data conflicts. Do they happen? Sure as hell. I can pretty much guarantee that any sane project will have more data conflicts than they will have rename conflicts (whether single-file or directory), and it's not only a problem, it's something that is absolutely *required* from a source control management system! So are data conflicts a problem? I claim that they aren't. They are a *positive* resource that you need to handle. Some of the "handling" is obviously going to be to try to avoid them, and if you get too much of them, the real "problem" is that you merge too seldom, or more commonly that you have a piece of code that is simply not done well enough, so many different people have to muck around in that area. But fundamentally, you should always have data conflicts, and they aren't a problem in themselves. They are a problem only - If they are hard to understand and see, and *unexpected*. The SCM should explain what is going on, and explain why a conflict happens (and that may perhaps mean after the fact! I love "gitk --merge" exactly because it tends to be very good at explaining what was going on!). - If they are hard to fix. 
For example, one of the main problems I had with BK merging was the fact that while the mergetool was wonderful, you effectively *had* to merge using it, and you couldn't sanely do an "incremental" merge where you first did a first merge job, then checked that it at least compiles, then tested it, and finally looked at the diffs from both parents and looked at whether those all made sense, and you could "refine" or fix the merge along the different phases. Of course, you hope that all merges are pretty obvious, and you can do it right in one go, but no, they're not. They'll never be. They'll never be fully automatic, but even when they aren't automatic, they'll not even be trivial to do manually. But that's OK, as long as the tool at least doesn't fight you, and lets you do whatever you want to do as part of fixing things up. Now, take a look back at directory renames. Do they happen? Yes. Do they potentially mis-merge? Yes. But are they common and/or hard to fix and handle? No. And that's why I don't think people should call them "problems". The only _real_ issue here, I think, is that git just does things differently from other SCM's. Git does a _lot_ of things differently. You get used to it. Linus ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 16:10 ` Linus Torvalds @ 2008-05-06 16:15 ` Linus Torvalds 2008-05-06 16:32 ` Ittay Dror 1 sibling, 0 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-06 16:15 UTC (permalink / raw) To: Theodore Tso Cc: Steven Grimm, Robin Rosenberg, Avery Pennarun, Ittay Dror, git On Tue, 6 May 2008, Linus Torvalds wrote: > > I can pretty much > guarantee that any sane project will have more data conflicts than they > will have rename conflicts (whether single-file or directory), and it's > not only a problem, it's something that is absolutely *required* from a ^^^-- not > source control management system! Oops. That didn't read well. Linus ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 16:10 ` Linus Torvalds 2008-05-06 16:15 ` Linus Torvalds @ 2008-05-06 16:32 ` Ittay Dror 2008-05-06 16:39 ` Linus Torvalds 1 sibling, 1 reply; 49+ messages in thread From: Ittay Dror @ 2008-05-06 16:32 UTC (permalink / raw) To: Linus Torvalds Cc: Theodore Tso, Steven Grimm, Robin Rosenberg, Avery Pennarun, git

Linus Torvalds wrote:
>
> - If they are hard to understand and see, and *unexpected*. The SCM
> should explain what is going on, and explain why a conflict happens
> (and that may perhaps mean after the fact! I love "gitk --merge"
> exactly because it tends to be very good at explaining what was going
> on!).
>

So does git tell me what is going on with directory renames? Or should I just discover them when I try to compile (assuming that when the old directory name appears it will even get compiled, and that the file in it is something that gets compiled)?

And no, it's not a common problem, but I don't like the fact that a merge conflict happens and the SCM doesn't tell me about it.

--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 16:32 ` Ittay Dror @ 2008-05-06 16:39 ` Linus Torvalds 0 siblings, 0 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-06 16:39 UTC (permalink / raw) To: Ittay Dror Cc: Theodore Tso, Steven Grimm, Robin Rosenberg, Avery Pennarun, git

On Tue, 6 May 2008, Ittay Dror wrote:
>
> And no, it's not a common problem, but I don't like the fact that a merge
> conflict happens and the SCM doesn't tell me about it.

I do agree that the most irritating feature of it is the silent clean merge. When it's not obvious what the right thing to do is, generally a merge strategy should try to warn, or even generate a conflict.

That said, anybody who thinks that "merge was automatic and successful" means that the merge was _correct_ is sadly mistaken. So you really shouldn't depend on it, and yeah, I strongly suggest building and testing after a merge (and before you push the result out), so that you can fix any issues.

Linus

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-05 22:20 ` Linus Torvalds 2008-05-05 23:07 ` Steven Grimm @ 2008-05-06 1:38 ` Avery Pennarun 2008-05-06 1:46 ` Shawn O. Pearce 2008-05-06 2:19 ` Linus Torvalds 1 sibling, 2 replies; 49+ messages in thread From: Avery Pennarun @ 2008-05-06 1:38 UTC (permalink / raw) To: Linus Torvalds; +Cc: Robin Rosenberg, Ittay Dror, git

On 5/5/08, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> I really don't understand why people expect a directory rename to be
> handled automatically, when it is (a) not that common and (b) not obvious
> what the solution is, but MOST OF ALL (c) so damn _easy_ to handle it
> manually after-the-fact when you notice that something doesn't compile!

In general I agree with your point here, but I still find it surprising how hard the directory-rename problem is made out to be. As far as I can see, the right implementation exactly parallels the single-file rename implementation.

I think the same problem that prevents git from knowing the difference between empty and nonexistent directories (e.g. http://kerneltrap.org/mailarchive/git/2007/7/18/251976) is the one that prevents it from handling directory renames: git doesn't acknowledge that it's *already* treating directories as first-class objects.

What if you thought of a directory as simply a list of filenames? (This is more or less what unix does anyway.) Then an *empty* directory is a tree of zero length; a nonexistent (or not tracked) directory is simply not listed in the parent; a directory with untracked files is like a file with patches not yet added to the index(*); and trying to merge a file into a nonexistent directory (when the original patch *didn't* create the directory fresh) would trigger similar logic to the existing rename handling. That is, put the new file with the content that used to be next to it, by looking for a tree with contents (names, not so much sha1's) similar to the one it was expected to be in.

> It really is mental
> masturbation, and has absolutely no relevance for any real-world problem.

I personally don't get very interested in non-real-world problems. Here's the actual case I tried to use a few months ago, but couldn't, because git doesn't track directory renames. (Note that I was quite happily able to do this in svn, as much as you can do anything happily in svn.)

I have a branch called 'mylib' with my library project in its root directory. What I wanted was to maintain my library in the 'mylib' branch, then merge my library into the "libs/mylib" directory of my application, which is in the 'myapp' branch. (Of course, in real life, there's more than one app using mylib in more than one repository, and I'm actually doing 'git pull' of the mylib branch from elsewhere.)

This actually works like magic in git - except when you create a file in the 'mylib' branch, in which case it gets merged to the wrong path every single time. It seems to me like it should be very easy to put it in the right place instead, making one more interesting use case possible.

I realize git-submodule is the way you're supposed to do something like this, but git-submodule doesn't really do what I want (yet) for reasons discussed in other threads.

Have fun,

Avery

(*) Applying the same metaphor in reverse, operations that are valid on directories are also valid for file contents. I can think of immediate uses for a .gitignore-style list that talks about file *contents*. Imagine if I could make a local patch to my Makefile, mark that one patch as "ignored", and never accidentally check it in.

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 1:38 ` Avery Pennarun @ 2008-05-06 1:46 ` Shawn O. Pearce 2008-05-06 1:58 ` Avery Pennarun 2008-05-06 2:19 ` Linus Torvalds 1 sibling, 1 reply; 49+ messages in thread From: Shawn O. Pearce @ 2008-05-06 1:46 UTC (permalink / raw) To: Avery Pennarun; +Cc: Linus Torvalds, Robin Rosenberg, Ittay Dror, git

Avery Pennarun <apenwarr@gmail.com> wrote:
>
> I have a branch called 'mylib' with my library project in its root
> directory. What I wanted was to maintain my library in the 'mylib'
> branch, then merge my library into the "libs/mylib" directory of my
> application, which is in the 'myapp' branch. [...]
>
> This actually works like magic in git - except when you create a file
> in the 'mylib' branch, in which case it gets merged to the wrong path
> every single time. It seems to me like it should be very easy to put
> it in the right place instead, making one more interesting use case
> possible.
>
> I realize git-submodule is the way you're supposed to do something
> like this, but git-submodule doesn't really do what I want (yet) for
> reasons discussed in other threads.

`git pull -s subtree mylib` ?

This is how git-gui and gitk are merged into git.git, and it avoids this case by looking for a subdirectory rename, more specifically a rename of "/" to "mylib/". It also can go the other way, that is rename "mylib/" to "/", but this path is never used as far as I know as git-gui and gitk don't ever merge in the git.git history.

--
Shawn.

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 1:46 ` Shawn O. Pearce @ 2008-05-06 1:58 ` Avery Pennarun 2008-05-06 2:12 ` Shawn O. Pearce 0 siblings, 1 reply; 49+ messages in thread From: Avery Pennarun @ 2008-05-06 1:58 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Linus Torvalds, Robin Rosenberg, Ittay Dror, git

On 5/5/08, Shawn O. Pearce <spearce@spearce.org> wrote:
> Avery Pennarun <apenwarr@gmail.com> wrote:
> >
> > I have a branch called 'mylib' with my library project in its root
> > directory. What I wanted was to maintain my library in the 'mylib'
> > branch, then merge my library into the "libs/mylib" directory of my
> > application, which is in the 'myapp' branch. [...]
> >
> > This actually works like magic in git - except when you create a file
> > in the 'mylib' branch, in which case it gets merged to the wrong path
> > every single time. It seems to me like it should be very easy to put
> > it in the right place instead, making one more interesting use case
> > possible.
> >
> > I realize git-submodule is the way you're supposed to do something
> > like this, but git-submodule doesn't really do what I want (yet) for
> > reasons discussed in other threads.
>
> `git pull -s subtree mylib` ?

First, I thought: wow! How can that possibly work? These guys are geniuses!

Then I found out that git-merge-subtree is a git builtin, and git.c says this:

    { "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
    { "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },

And then my head exploded. :)

Still scraping the pieces of my brain back off the floor... but does this mean the subtree merge strategy would fail exactly like merge-recursive when new files are created?

Have fun,

Avery

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: merge renamed files/directories? 2008-05-06 1:58 ` Avery Pennarun @ 2008-05-06 2:12 ` Shawn O. Pearce 0 siblings, 0 replies; 49+ messages in thread From: Shawn O. Pearce @ 2008-05-06 2:12 UTC (permalink / raw) To: Avery Pennarun; +Cc: Linus Torvalds, Robin Rosenberg, Ittay Dror, git

Avery Pennarun <apenwarr@gmail.com> wrote:
> On 5/5/08, Shawn O. Pearce <spearce@spearce.org> wrote:
> >
> > `git pull -s subtree mylib` ?
>
> First, I thought: wow! How can that possibly work? These guys are geniuses!
>
> Then I found out that git-merge-subtree is a git builtin, and git.c says this:
>
> { "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
> { "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
>
> And then my head exploded. :)
>
> Still scraping the pieces of my brain back off the floor... but does
> this mean the subtree merge strategy would fail exactly like
> merge-recursive when new files are created?

Nope. If you go look at cmd_merge_recursive you will see it has different behavior based upon the name it was invoked as, even though it is the same C function and has the same implementation.

If it is started with the name "merge-subtree" it tries to find a matching subtree prefix to insert in front of all names, or to remove from all names, such that a merge will correctly fully include a set of files in a subdirectory, or fully pull out a set of files from a subdirectory.

Junio is the genius that implemented this. Works quite well for this library->application merge case that I think you were trying to describe.

--
Shawn.

^ permalink raw reply [flat|nested] 49+ messages in thread
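The library->application case discussed in the messages above can be tried end to end with a short script. This is a minimal sketch, not taken from the thread: the repository paths and file names are hypothetical, `FETCH_HEAD` is used so that no branch name has to be assumed, and `--allow-unrelated-histories` assumes a git release recent enough to require it for the first merge (it postdates this thread). The one-time graft follows the usual `merge -s ours` plus `read-tree --prefix` pattern; only the final merge uses the subtree strategy.

```shell
#!/bin/sh
# Sketch: keep a library at libs/mylib/ in an application repository
# and pull later library changes in with the subtree merge strategy.
# Paths and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# The library lives at the root of its own repository.
git init -q "$work/mylib"
cd "$work/mylib"
echo 'int lib(void) { return 1; }' >lib.c
echo 'library notes' >README
git add lib.c README
git commit -q -m 'mylib: initial'

# The application has its own, unrelated history.
git init -q "$work/myapp"
cd "$work/myapp"
echo 'int main(void) { return 0; }' >app.c
git add app.c
git commit -q -m 'myapp: initial'

# One-time graft: record the library history as a second parent and
# read the library tree into the libs/mylib/ subdirectory.
git fetch -q ../mylib HEAD
git merge -q -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
git read-tree --prefix=libs/mylib/ -u FETCH_HEAD
git commit -q -m 'Merge mylib as libs/mylib'

# Upstream creates a new file at the library root.
cd "$work/mylib"
echo 'int extra(void) { return 2; }' >extra.c
git add extra.c
git commit -q -m 'mylib: add extra.c'

# Later merges use the subtree strategy, which shifts the incoming
# tree under the prefix that best matches it.
cd "$work/myapp"
git fetch -q ../mylib HEAD
git merge -q -s subtree -m 'Merge mylib update' FETCH_HEAD

git ls-files
```

The listing at the end should show the newly created file under `libs/mylib/` rather than at the top level, which is the case the plain recursive merge was reported to get wrong.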
* Re: merge renamed files/directories? 2008-05-06 1:38 ` Avery Pennarun 2008-05-06 1:46 ` Shawn O. Pearce @ 2008-05-06 2:19 ` Linus Torvalds 1 sibling, 0 replies; 49+ messages in thread From: Linus Torvalds @ 2008-05-06 2:19 UTC (permalink / raw) To: Avery Pennarun; +Cc: Robin Rosenberg, Ittay Dror, git

On Mon, 5 May 2008, Avery Pennarun wrote:
>
> In general I agree with your point here, but I still find it surprising
> how hard the directory-rename problem is made out to be.

I do agree that it's probably not that hard.

But I disagree with people who whine about pointless stuff, and don't send patches.

Linus

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 23:14 ` Jeff King 2008-05-03 17:56 ` merge renamed files/directories? (was: Re: detecting rename->commit->modify->commit) Ittay Dror @ 2008-05-08 18:17 ` Jeff King 1 sibling, 0 replies; 49+ messages in thread From: Jeff King @ 2008-05-08 18:17 UTC (permalink / raw) To: Steven Grimm; +Cc: Avery Pennarun, Ittay Dror, git

On Thu, May 01, 2008 at 07:14:27PM -0400, Jeff King wrote:
> 1. write a proof-of-concept that shows directory renaming after the
> fact (e.g., take a conflicted merge, scan the diff for directory
> renames, and then fix up the files). That way it is available, but
> doesn't impact git at all.

Here's a toy script that finds directory renames. I'm sure there are a ton of corner cases it doesn't handle (like directory renames inside of directory renames). My test case was the very trivial:

    mkdir repo && cd repo && git init
    mkdir subdir
    for i in 1 2 3; do
      echo content $i >subdir/file$i
    done
    git add subdir
    git commit -m initial
    git mv subdir new
    git commit -m move
    git checkout -b other HEAD^
    echo content 4 >subdir/file4
    git add subdir
    git commit -m new
    git merge --no-commit master
    perl ../find-dir-rename.pl
    git commit

At which point you should see the merged commit with new/file4. Script is below.

-- >8 --
#!/usr/bin/perl
#
# Find renamed directories, and move any files in the "old"
# directory into the "new".
#
# usage:
#   git merge --no-commit <whatever>
#   find-dir-rename
#   git commit

use strict;

foreach my $r (renamed_dirs()) {
  move_dir_contents($r->{from}, $r->{to});
}
exit 0;

sub renamed_dirs {
  my $base = `git merge-base HEAD MERGE_HEAD`;
  chomp $base;

  return
    grep { $_->{score} == 1 }
    (renamed_dirs_between($base, 'HEAD'),
     renamed_dirs_between($base, 'MERGE_HEAD'));
}

sub renamed_dirs_between {
  my ($base, $commit) = @_;

  my %sources;
  foreach my $pair (renamed_files($base, $commit)) {
    my $d1 = dir_of($pair->[0]);
    my $d2 = dir_of($pair->[1]);
    next unless defined($d1) && defined($d2);
    $sources{$d1}->{total}++;
    $sources{$d1}->{dests}->{$d2}++;
  }

  return
    map {
      my $from = $_;
      map {
        {
          from => $from,
          to => $_,
          score => $sources{$from}->{dests}->{$_} / $sources{$from}->{total},
        }
      } keys(%{$sources{$from}->{dests}});
    } removed_directories($base, $commit);
}

sub dir_of {
  local $_ = shift;
  s{/[^/]+$}{} or return undef;
  return $_;
}

sub renamed_files {
  my ($from, $to) = @_;
  open(my $fh, '-|', qw(git diff-tree -r -M), $from, $to)
    or die "unable to open diff-tree: $!";
  return map {
    chomp;
    m/ R\d+\t([^\t]+)\t(.*)/ ? [$1 => $2] : ()
  } <$fh>;
}

sub removed_directories {
  my ($base, $commit) = @_;
  my %new_dirs = map { $_ => 1 } directories($commit);
  return grep { !exists $new_dirs{$_} } directories($base);
}

sub directories {
  my $commit = shift;
  return uniq(
    map { s{/[^/]+$}{} ? $_ : () } files($commit)
  );
}

sub files {
  my $commit = shift;
  open(my $fh, '-|', qw(git ls-tree -r), $commit)
    or die "unable to open ls-tree: $!";
  return map { chomp; s/^[^\t]*\t//; $_ } <$fh>;
}

sub uniq {
  my %seen;
  return grep { !$seen{$_}++ } @_;
}

sub move_dir_contents {
  my ($from, $to) = @_;

  my @files = glob("$from/*");
  return unless @files;

  system(qw(git mv), @files, "$to/")
    and die "unable to move $from/* to $to";
  rmdir($from); # ignore error since there may be untracked files
}

^ permalink raw reply [flat|nested] 49+ messages in thread
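To see what the script above works from, it can help to look at the per-file rename pairs that `git diff-tree -r -M` reports and tally them by directory. The following is a rough sketch only: it uses `--name-status` output rather than the raw format the Perl code parses, the `awk` tally is a stand-in for the script's `%sources` hash, and the throwaway repository mirrors the first half of the test case in the message.

```shell
#!/bin/sh
# Sketch: list file renames between two commits and count how many
# files moved from each old directory to each new directory.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

mkdir subdir
for i in 1 2 3; do
    echo "content $i" >subdir/file$i
done
git add subdir
git commit -q -m initial
git mv subdir new
git commit -q -m move

# One "R<score><TAB>old<TAB>new" line per detected rename.
git diff-tree -r -M --name-status HEAD^ HEAD

# Strip the file name from each side and count directory pairs.
# Files at the top level (no slash) are skipped, as in the script.
pairs=$(git diff-tree -r -M --name-status HEAD^ HEAD |
    awk -F'\t' '
        $1 ~ /^R/ && $2 ~ "/" && $3 ~ "/" {
            from = $2; to = $3
            sub("/[^/]+$", "", from)
            sub("/[^/]+$", "", to)
            count[from " -> " to]++
        }
        END { for (p in count) print count[p], p }')
echo "$pairs"
```

A directory counts as renamed in the Perl script only when every renamed file from it went to the same destination (score 1) and the old directory no longer exists in the newer commit; the tally here shows just the first of those two conditions.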
* Re: detecting rename->commit->modify->commit 2008-05-01 14:54 ` Ittay Dror 2008-05-01 15:09 ` Jeff King 2008-05-01 15:27 ` Avery Pennarun @ 2008-05-01 16:39 ` Sitaram Chamarty 2008-05-01 18:58 ` Ittay Dror 2 siblings, 1 reply; 49+ messages in thread From: Sitaram Chamarty @ 2008-05-01 16:39 UTC (permalink / raw) To: Ittay Dror; +Cc: git On Thu, May 1, 2008 at 8:24 PM, Ittay Dror <ittayd@tikalk.com> wrote: > Also, would anyone like to comment on: > http://www.markshuttleworth.com/archives/123 (Renaming is the killer app of > distributed version control <http://www.markshuttleworth.com/archives/123>)? someone already did, albeit in just discussion form rather than examples, in a comment on that same page: http://www.markshuttleworth.com/archives/123#comment-118655 ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: detecting rename->commit->modify->commit 2008-05-01 16:39 ` Sitaram Chamarty @ 2008-05-01 18:58 ` Ittay Dror 0 siblings, 0 replies; 49+ messages in thread From: Ittay Dror @ 2008-05-01 18:58 UTC (permalink / raw) To: Sitaram Chamarty; +Cc: git

Sitaram Chamarty wrote:
> http://www.markshuttleworth.com/archives/123#comment-118655
>

Here is the comment from the thread; my comment on it is below:

> This is a very strong point for renaming, but it is not necessarily a universal one.
> Here is one example of the issue: one developer renaming a directory in his branch, and another adding a file to the original directory in his branch. What happens at the merge?
> - Bazaar renames the directory and puts the new file in the _renamed_ directory.
> - Git renames the directory with its files, but keeps the old directory too and adds the new file there.
> Bazaar’s behavior certainly is better for C. However it is not universally better.
> For example in Java you cannot rename a file without changing its contents. So, moving a file to a directory different from where its author put it will almost certainly break the build.
> The bottom line is, both behaviors can seem valid or broken, depending on the case. Neither is perfect. At the very abstract level file renames are _not_ a first-class operation. This is especially apparent in a language like Java.
> Content movement is the first class operation. Things like moving functions, etc. The question is how one can handle that and whether the current strategy has a path for improvement. It could be argued that once you commit yourself to explicitly tracking file renames, you are giving up a slew of opportunities for handling the more general cases.
> One thing is for certain, a 100% ideal solution is impossible. It would have to be aware of the target programming language _and_ the build environment.

And my comment is that in this example, about Java, I think that manually fixing the package name in the file (after noticing the build is broken) is easy. On the other hand, if the other developer changed one of the renamed files, then manually merging the change in the file in the old location to the file in the new location is not so easy: you first need to discover that this happened, then merge the two files (and you still need to fix the package name).

ittay

--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>

^ permalink raw reply [flat|nested] 49+ messages in thread