* Should copy/rename detection consider file overwrites?
@ 2015-01-23  1:29 Mike Hommey
  2015-01-23 11:04 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Hommey @ 2015-01-23  1:29 UTC
  To: git

Hi,

While fooling around with copy/rename detection, I noticed that it
doesn't detect the case where you copy or rename a file on top of
another:

$ git init
$ (echo foo; echo bar) > foo
$ git add foo
$ git commit -m foo
$ echo 0 > bar
$ git add bar
$ git commit -m bar
$ git mv -f foo bar
$ git commit -m foobar
$ git log --oneline --reverse
7dc2765 foo
b0c837d bar
88caeba foobar
$ git blame -s -C -C bar
88caebab 1) foo
88caebab 2) bar

I can see how this is not trivially representable in e.g. git diff-tree,
but shouldn't at least blame try to tell that those lines actually come
from 7dc2765?

Mike

* Re: Should copy/rename detection consider file overwrites?
  2015-01-23  1:29 Should copy/rename detection consider file overwrites? Mike Hommey
@ 2015-01-23 11:04 ` Jeff King
  2015-01-23 22:37   ` Mike Hommey
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2015-01-23 11:04 UTC
  To: Mike Hommey; +Cc: git

On Fri, Jan 23, 2015 at 10:29:08AM +0900, Mike Hommey wrote:

> While fooling around with copy/rename detection, I noticed that it
> doesn't detect the case where you copy or rename a file on top of
> another:
> 
> $ git init
> $ (echo foo; echo bar) > foo

If I replace this with a longer input, like:

  cp /usr/share/dict/words foo

> $ git add foo
> $ git commit -m foo
> $ echo 0 > bar
> $ git add bar
> $ git commit -m bar
> $ git mv -f foo bar
> $ git commit -m foobar
> $ git log --oneline --reverse
> 7dc2765 foo
> b0c837d bar
> 88caeba foobar
> $ git blame -s -C -C bar
> 88caebab 1) foo
> 88caebab 2) bar

Then the blame shows me the initial "foo" commit. So I think it is
mainly that your toy example is too small (I think we will do
exact rename detection whatever the size is, but I expect we are getting
hung up on the break detection between "0\n" and "foo\nbar\n").

> I can see how this is not trivially representable in e.g. git diff-tree,
> but shouldn't at least blame try to tell that those lines actually come
> from 7dc2765?

diff-tree can show this, too, but you need to turn on "break detection"
which will notice that "bar" has essentially been rewritten (and then
consider its sides as candidates for rename detection). For example
(with the longer input, as above):

  $ git diff-tree --name-status -M HEAD
  c6fe146b0c73adcbc4dbc2e58eb83af9007678bc
  M       bar
  D       foo

  $ git diff-tree --name-status -M -B HEAD
  c6fe146b0c73adcbc4dbc2e58eb83af9007678bc
  R100    foo     bar

Presumably if you set the break score low enough, your original example
would behave the same way, but I couldn't get it to work (I didn't look
closely, but I imagine it is just so tiny that we hit the internal
limits on how low you can set the score).
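
For anyone who wants to reproduce the comparison above without
/usr/share/dict/words, here is a minimal sketch. The generated file
contents, the temporary directory and the committer identity are
assumptions made up for illustration; only the git commands themselves
come from this thread.

```shell
#!/bin/sh
# Sketch: overwrite "bar" by renaming a larger "foo" on top of it, then
# compare rename detection with and without break detection (-B).
set -e

cd "$(mktemp -d)"
git init -q
# Hypothetical identity so that commits work in a clean environment.
git config user.name "Example"
git config user.email "example@example.com"

# 1000 generated lines stand in for /usr/share/dict/words (an assumption).
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "line number " i }' > foo
git add foo
git commit -q -m foo
echo 0 > bar
git add bar
git commit -q -m bar
git mv -f foo bar
git commit -q -m foobar

# diff-tree prints the commit id first; tail drops that line.
without_break=$(git diff-tree --name-status -M HEAD | tail -n +2)
with_break=$(git diff-tree --name-status -M -B HEAD | tail -n +2)

printf 'without -B:\n%s\n' "$without_break"
printf 'with -B:\n%s\n' "$with_break"
```

With the longer input, the first listing should report bar as modified
and foo as deleted, and the second should report a single rename from
foo to bar, matching the output quoted above.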

-Peff

* Re: Should copy/rename detection consider file overwrites?
  2015-01-23 11:04 ` Jeff King
@ 2015-01-23 22:37   ` Mike Hommey
  0 siblings, 0 replies; 3+ messages in thread
From: Mike Hommey @ 2015-01-23 22:37 UTC
  To: Jeff King; +Cc: git

On Fri, Jan 23, 2015 at 06:04:19AM -0500, Jeff King wrote:
> On Fri, Jan 23, 2015 at 10:29:08AM +0900, Mike Hommey wrote:
> 
> > While fooling around with copy/rename detection, I noticed that it
> > doesn't detect the case where you copy or rename a file on top of
> > another:
> > 
> > $ git init
> > $ (echo foo; echo bar) > foo
> 
> If I replace this with a longer input, like:
> 
>   cp /usr/share/dict/words foo
> 
> > $ git add foo
> > $ git commit -m foo
> > $ echo 0 > bar
> > $ git add bar
> > $ git commit -m bar
> > $ git mv -f foo bar
> > $ git commit -m foobar
> > $ git log --oneline --reverse
> > 7dc2765 foo
> > b0c837d bar
> > 88caeba foobar
> > $ git blame -s -C -C bar
> > 88caebab 1) foo
> > 88caebab 2) bar
> 
> Then the blame shows me the initial "foo" commit. So I think it is
> mainly that your toy example is too small (I think we will do
> exact rename detection whatever the size is, but I expect we are getting
> hung up on the break detection between "0\n" and "foo\nbar\n").

Err, I was afraid my testcase was too small. And that all boils down to
this, from the git-blame documentation for -C:

    <num> is optional but it is the lower bound on the number of
    alphanumeric characters that Git must detect as moving/copying
    between files for it to associate those lines with the parent
    commit. And the default value is 40.
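
To see the effect of that threshold, here is a rough sketch that builds
the same three-commit history twice, once with a two-line file and once
with a 1000-line file, and lists which commits blame assigns the lines
of bar to. The build_repo and blamed_commits helpers, the generated file
contents and the committer identity are hypothetical names and values
chosen for illustration.

```shell
#!/bin/sh
# Sketch: compare "git blame -C -C" on a tiny file and on a larger one.
set -e

# Build the foo / bar / foobar history in a fresh temporary repository.
# $1 is the number of lines written to foo (a hypothetical knob).
build_repo() {
    cd "$(mktemp -d)"
    git init -q
    # Hypothetical identity so that commits work in a clean environment.
    git config user.name "Example"
    git config user.email "example@example.com"
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) print "line number " i }' > foo
    git add foo
    git commit -q -m foo
    echo 0 > bar
    git add bar
    git commit -q -m bar
    git mv -f foo bar
    git commit -q -m foobar
}

# Print the distinct commit ids that blame attributes lines of bar to.
# In --line-porcelain output, each per-line header starts with the id.
blamed_commits() {
    git blame -C -C --line-porcelain bar |
        grep -E '^[0-9a-f]{40} ' | cut -d' ' -f1 | sort -u
}

# Two short lines: 22 alphanumeric characters, below the default of 40.
build_repo 2
small_head=$(git rev-parse HEAD)
small_blame=$(blamed_commits)

# 1000 lines: far above the default threshold.
build_repo 1000
large_root=$(git rev-list --max-parents=0 HEAD)
large_blame=$(blamed_commits)

echo "small file: blamed on $small_blame (HEAD is $small_head)"
echo "large file: blamed on $large_blame (root is $large_root)"
```

In the small case the lines should stay with the "foobar" commit, as in
the original report; in the large case they should be traced back to
the initial "foo" commit.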

> > I can see how this is not trivially representable in e.g. git diff-tree,
> > but shouldn't at least blame try to tell that those lines actually come
> > from 7dc2765?
> 
> diff-tree can show this, too, but you need to turn on "break detection"
> which will notice that "bar" has essentially been rewritten (and then
> consider its sides as candidates for rename detection). For example
> (with the longer input, as above):
> 
>   $ git diff-tree --name-status -M HEAD
>   c6fe146b0c73adcbc4dbc2e58eb83af9007678bc
>   M       bar
>   D       foo
> 
>   $ git diff-tree --name-status -M -B HEAD
>   c6fe146b0c73adcbc4dbc2e58eb83af9007678bc
>   R100    foo     bar
> 
> Presumably if you set the break score low enough, your original example
> would behave the same way, but I couldn't get it to work (I didn't look
> closely, but I imagine it is just so tiny that we hit the internal
> limits on how low you can set the score).

Oh. Good to know, thanks.

Mike
