git.vger.kernel.org archive mirror
* How to verify that lines were only moved, not edited?
@ 2011-10-19 14:34 Johannes Sixt
  2011-10-19 16:33 ` Jeff King
  2011-10-19 17:07 ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: Johannes Sixt @ 2011-10-19 14:34 UTC
  To: git

I thought there was a way to use git-blame to find out whether a change
only shuffled lines, but otherwise did not modify them. I tried "git blame
-M -- the/file", but it does not work as expected, neither with a toy file
nor with a 5000+ lines file (with 55 lines moved).

git init
echo A > foo
echo B >> foo
git add foo
git commit -m initial
echo B > foo
echo A >> foo
git commit -a -m swapped

The results are:
$ git blame -M -s -- foo
^e3abca2 1) B
6189cb46 2) A

I would have expected:
^e3abca2 1) B
^e3abca2 2) A

Oh, look! This produces the expected result:
$ git blame -M1 -s -- foo

while this produces the same as with just -M:
$ git blame -M2 -s -- foo

But neither helps with my 5000+ lines file. Does it mean that the lines
were changed? But I'm sure they were just moved! Please help!

-- Hannes

* Re: How to verify that lines were only moved, not edited?
  2011-10-19 14:34 How to verify that lines were only moved, not edited? Johannes Sixt
@ 2011-10-19 16:33 ` Jeff King
  2011-10-20  6:20   ` Johannes Sixt
  2011-10-19 17:07 ` Junio C Hamano
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff King @ 2011-10-19 16:33 UTC
  To: Johannes Sixt; +Cc: git

On Wed, Oct 19, 2011 at 04:34:20PM +0200, Johannes Sixt wrote:

> I thought there was a way to use git-blame to find out whether a change
> only shuffled lines, but otherwise did not modify them. I tried "git blame
> -M -- the/file", but it does not work as expected, neither with a toy file
> nor with a 5000+ lines file (with 55 lines moved).
> 
> git init
> echo A > foo
> echo B >> foo
> git add foo
> git commit -m initial
> echo B > foo
> echo A >> foo
> git commit -a -m swapped
> 
> The results are:
> $ git blame -M -s -- foo
> ^e3abca2 1) B
> 6189cb46 2) A
> 
> I would have expected:
> ^e3abca2 1) B
> ^e3abca2 2) A
> 
> Oh, look! This produces the expected result:
> $ git blame -M1 -s -- foo

Right. Your toy lines aren't long enough to be considered "interesting"
by the default score. From git-blame(1):

  -M[<num>]
  [...]
  <num> is optional but it is the lower bound on the number of
  alphanumeric characters that git must detect as moving/copying within
  a file for it to associate those lines with the parent commit. The
  default value is 20.

Whereas with a longer sample:

  git init
  seq 1 5000 >foo
  git add foo
  git commit -m initial
  sed -i '/^2..$/d' foo
  seq 200 299 >>foo
  git commit -a -m 'move 200-299 to end'

I get the expected result from "git blame -M" (i.e., everything
attributed to the root commit).

> But neither helps with my 5000+ lines file. Does it mean that the lines
> were changed? But I'm sure they were just moved! Please help!

What does the file look like? I think blame has some heuristics about
lines which are uninteresting, and maybe you are triggering a corner
case there.

-Peff

* Re: How to verify that lines were only moved, not edited?
  2011-10-19 14:34 How to verify that lines were only moved, not edited? Johannes Sixt
  2011-10-19 16:33 ` Jeff King
@ 2011-10-19 17:07 ` Junio C Hamano
  2011-10-20  6:25   ` Johannes Sixt
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2011-10-19 17:07 UTC
  To: Johannes Sixt; +Cc: git

Johannes Sixt <j.sixt@viscovery.net> writes:

> I thought there was a way to use git-blame to find out whether a change
> only shuffled lines, but otherwise did not modify them. I tried "git blame
> -M -- the/file",...

You said "a change" and I somehow expected that such a blame would be done
with a revision range, e.g. "git blame -M HEAD^..HEAD -- the/file".

If there are other commits between the two endpoints you are comparing,
and they make changes and then revert them so that the end result cancels
out, "git diff A B -- the/file" won't see such intermediate changes, but
they may interfere with "git blame A..B -- the/file", i.e. when A is not a
direct parent of B.

> ... nor with a 5000+ lines file (with 55 lines moved).

> ... while this produces the same as with just -M:
> $ git blame -M2 -s -- foo

Yes, blame tries to omit matches that consist only of non-word characters,
so that you won't see the "all those lines with a single "}" on them that
close definitions for your 100 new functions were copied from the closing
brace of one function you originally had in the file" symptom, and
-M<level> controls it.

> But neither helps with my 5000+ lines file. Does it mean that the lines
> were changed? But I'm sure they were just moved! Please help!

When reviewing a "supposedly move-only" change, I typically just grab +/-
blocks from the patch, remove the +/- prefix and run comparison between
them.
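
A minimal sketch of that manual check, assuming the change is a single
commit touching one file. The function name moved_only and its arguments
are hypothetical, not part of git. It takes the -/+ lines from a
zero-context diff, strips the prefix, and compares the two blocks in order:

```shell
# Hypothetical helper: succeed if commit $1 only moved lines within path $2.
moved_only() {
    # -U0 drops context lines; sed keeps everything from the first hunk
    # header on, so the "---"/"+++" file header lines are skipped.
    mo_hunks=$(git diff -U0 "$1^" "$1" -- "$2" | sed -n '/^@@/,$p')
    # Removed and added lines, with the leading -/+ stripped.
    mo_removed=$(printf '%s\n' "$mo_hunks" | grep '^-' | cut -c2-)
    mo_added=$(printf '%s\n' "$mo_hunks" | grep '^+' | cut -c2-)
    # In-order comparison: equal only if the same lines left and came back.
    [ "$mo_removed" = "$mo_added" ]
}
```

Something like "moved_only HEAD the/file && echo moved-only" would then
report a pure move. The comparison is in order, so it probably fails when
several blocks trade places; sorting both sides first would relax that.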

* Re: How to verify that lines were only moved, not edited?
  2011-10-19 16:33 ` Jeff King
@ 2011-10-20  6:20   ` Johannes Sixt
  0 siblings, 0 replies; 5+ messages in thread
From: Johannes Sixt @ 2011-10-20  6:20 UTC
  To: Jeff King; +Cc: git

On 10/19/2011 18:33, Jeff King wrote:
>   git init
>   seq 1 5000 >foo
>   git add foo
>   git commit -m initial
>   sed -i '/^2..$/d' foo
>   seq 200 299 >>foo
>   git commit -a -m 'move 200-299 to end'
> 
> I get the expected result from "git blame -M" (i.e., everything
> attributed to the root commit).

I see. My example is more like this:

 for i in `seq 1 20`; do md5sum - <<< $i; done > foo
 git commit -a -m foo
 for i in `seq 1 20`; do md5sum - <<< $i; done | sort > foo
 git commit -a -m foo\ sorted

i.e., the sort order of a block of lines was changed "in place". Here,
most of the lines are attributed to the last commit. Am I expecting too
much from git-blame to detect line motions in such a case?

-- Hannes

* Re: How to verify that lines were only moved, not edited?
  2011-10-19 17:07 ` Junio C Hamano
@ 2011-10-20  6:25   ` Johannes Sixt
  0 siblings, 0 replies; 5+ messages in thread
From: Johannes Sixt @ 2011-10-20  6:25 UTC
  To: Junio C Hamano; +Cc: git

On 10/19/2011 19:07, Junio C Hamano wrote:
> When reviewing a "supposedly move-only" change, I typically just grab +/-
> blocks from the patch, remove the +/- prefix and run comparison between
> them.

Thanks. As I explained to Jeff, I don't have a block-move, but the sort
order of a block of lines was changed "in place". It seems I have to fall
back to a similarly manual method (e.g., comparing the whole-file-sorted
versions of the pre- and the post-image).

-- Hannes
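
The whole-file-sorted comparison mentioned above could be sketched roughly
as follows. The function name same_lines and its arguments are
hypothetical, and the path is assumed to be given relative to the top of
the repository, as "git show <rev>:<path>" expects:

```shell
# Hypothetical helper: succeed if path $3 holds the same lines (in any
# order, duplicates counted) in revisions $1 and $2.
same_lines() {
    # LC_ALL=C makes the sort order byte-wise, independent of the locale.
    sl_old=$(git show "$1:$3" | LC_ALL=C sort)
    sl_new=$(git show "$2:$3" | LC_ALL=C sort)
    # Equal sorted images mean no line was edited, added, or dropped.
    [ "$sl_old" = "$sl_new" ]
}
```

For example, "same_lines HEAD^ HEAD foo" should succeed for the "foo
sorted" commit above. This only shows that the set of lines is unchanged;
it says nothing about where the lines moved.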
