* Understanding behavior of git blame -M
@ 2014-08-15 13:40 Sokolov, Konstantin (ext)
2014-08-15 14:42 ` David Kastrup
2014-08-15 17:07 ` Junio C Hamano
0 siblings, 2 replies; 8+ messages in thread
From: Sokolov, Konstantin (ext) @ 2014-08-15 13:40 UTC (permalink / raw)
To: git@vger.kernel.org
Hi Folks,
I'm trying to understand the behavior of git blame -M and find that the actual results differ from what I understood from the documentation. I asked about this some time ago on Stack Overflow and on the user mailing list without getting a satisfactory answer. So here is the example:
Initial content of file.txt (2cd9f7f)
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
Move line B to the middle (d4bbd97e):
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>git blame -s -n -f -w -M20 file.txt
^2cd9f7f 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
^2cd9f7f 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
^2cd9f7f 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
d4bbd97e 4 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
^2cd9f7f 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
^2cd9f7f 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
^2cd9f7f 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
I wonder why line B is not recognized as moved. According to the documentation, I would expect git blame to report that it originates from line 2 in revision 2cd9f7f. Can anybody explain the behavior?
Thanks in advance
Konstantin
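The expectation described above can be written down as a small sketch. It does not model git's algorithm; it only maps each line of the new file back to its line number in the old file, using one-letter stand-ins for the long lines of file.txt (the helper name expected_origins is made up for this illustration, and the lookup assumes every line is unique).

```python
# Illustrative only: this models the *expected* blame result, not git's algorithm.
# One-letter stand-ins replace the long lines of file.txt.
old = ["A", "B", "C", "D", "E", "G", "F"]  # file.txt at 2cd9f7f
new = ["A", "C", "D", "B", "E", "G", "F"]  # file.txt at d4bbd97e


def expected_origins(old_lines, new_lines):
    """Map each new line to its 1-based line number in the old file.

    Assumes every line is unique, which holds for this example.
    """
    position = {text: number for number, text in enumerate(old_lines, start=1)}
    return [position.get(text) for text in new_lines]


origins = expected_origins(old, new)
for final_no, (orig_no, text) in enumerate(zip(origins, new), start=1):
    # Same column order as `git blame -n`: original line number, final line number.
    print(f"{orig_no} {final_no}) {text}")
```

Under this reading, line B in final position 4 would be attributed to original line 2, which is the result the question asks for.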
* Re: Understanding behavior of git blame -M
2014-08-15 13:40 Understanding behavior of git blame -M Sokolov, Konstantin (ext)
@ 2014-08-15 14:42 ` David Kastrup
2014-08-15 20:54 ` AW: " Sokolov, Konstantin (ext)
2014-08-16 0:06 ` Duy Nguyen
2014-08-15 17:07 ` Junio C Hamano
1 sibling, 2 replies; 8+ messages in thread
From: David Kastrup @ 2014-08-15 14:42 UTC (permalink / raw)
To: Sokolov, Konstantin (ext); +Cc: git@vger.kernel.org
"Sokolov, Konstantin (ext)" <konstantin.sokolov.ext@siemens.com> writes:
> Hi Folks,
>
> I'm trying to understand the behavior of git blame -M and find that
> the actual results differ from what I understood from the
> documentation. I've already asked longer time ago on stackoverflow and
> on the user mailing list without any satisfactory results. So here is
> the example:
>
> Initial content of file.txt (2cd9f7f)
>
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
> Move line B to the middle (d4bbd97e):
>
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
>>git blame -s -n -f -w -M20 file.txt
> ^2cd9f7f 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ^2cd9f7f 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> ^2cd9f7f 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> d4bbd97e 4 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> ^2cd9f7f 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> ^2cd9f7f 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> ^2cd9f7f 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
> I wonder, why line B is not recognized as moved. According to the
> documentation, I would expect git blame to report that it originates
> from line 2 in revision 2cd9f7f. Can anybody explain the behavior?
Someone had reasons. diff_hunks in builtin/blame.c is once called with
0 as third argument, once with 1. Change the latter call to using 0 as
well and you get your expected result:
dak@lola:/tmp/test$ /usr/local/tmp/git/git blame -s -n -f -w -M20 file.txt
^2cab496 file.txt 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
^2cab496 file.txt 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
^2cab496 file.txt 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
^2cab496 file.txt 2 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
^2cab496 file.txt 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
^2cab496 file.txt 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
^2cab496 file.txt 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
The function diff_hunks is a wrapper for the diff engine. Putting the
context length explicitly into this wrapper (rather than not passing an
argument and just setting the context length to zero anyway in the
function) clearly indicates that somebody _wanted_ it called with
different values.
There is no documentation or rationale in the file as to _why_, as far as
I remember. Maybe it can crash or end up in an infinite loop. Maybe it
could do so at one point in time but no longer does.
Maybe Git is just a puzzle from genius to genius. Good luck figuring it
out.
I have not touched this when rewriting git-blame recently, and I am not
interested in touching it. I stand to gain absolutely nothing from
working on Git.
--
David Kastrup
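Why the context length could matter can be illustrated with a simplified model. The sketch below is an assumption, not code from builtin/blame.c: it supposes that the diff engine reports two changes as a single hunk whenever they are separated by at most 2 * ctxlen unchanged lines, so that a short common run between them is swallowed. The function name common_runs_after_merging and the one-letter lines are hypothetical.

```python
from difflib import SequenceMatcher


def common_runs_after_merging(parent, moved, ctxlen):
    """Return the unchanged runs that remain once nearby changes are merged.

    Assumed model (not git's code): two changes separated by at most
    2 * ctxlen unchanged lines are reported as one hunk, which swallows
    the unchanged lines between them.
    """
    opcodes = SequenceMatcher(a=parent, b=moved, autojunk=False).get_opcodes()
    runs = []
    for index, (tag, a1, a2, _b1, _b2) in enumerate(opcodes):
        if tag != "equal":
            continue
        between_changes = 0 < index < len(opcodes) - 1
        if between_changes and (a2 - a1) <= 2 * ctxlen:
            continue  # swallowed by the merged hunk
        runs.append(parent[a1:a2])
    return runs


parent = ["A", "B", "C", "D", "E", "G", "F"]  # stand-in for the parent's file.txt
moved = ["B"]                                 # the single line blamed on the child

print(common_runs_after_merging(parent, moved, ctxlen=1))
print(common_runs_after_merging(parent, moved, ctxlen=0))
```

In this model the single line B survives as a common run only with a context length of 0, which is consistent with the patched output shown above, but the model remains a guess at the mechanism.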
* Re: Understanding behavior of git blame -M
2014-08-15 13:40 Understanding behavior of git blame -M Sokolov, Konstantin (ext)
2014-08-15 14:42 ` David Kastrup
@ 2014-08-15 17:07 ` Junio C Hamano
2014-08-15 20:57 ` AW: " Sokolov, Konstantin (ext)
2014-08-18 11:41 ` Sokolov, Konstantin (ext)
1 sibling, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2014-08-15 17:07 UTC (permalink / raw)
To: Sokolov, Konstantin (ext); +Cc: git@vger.kernel.org
"Sokolov, Konstantin (ext)" <konstantin.sokolov.ext@siemens.com>
writes:
>>git blame -s -n -f -w -M20 file.txt
> ^2cd9f7f 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ^2cd9f7f 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> ^2cd9f7f 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> d4bbd97e 4 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> ^2cd9f7f 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> ^2cd9f7f 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> ^2cd9f7f 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
> I wonder, why line B is not recognized as moved. According to the
> documentation, I would expect git blame to report that it
> originates from line 2 in revision 2cd9f7f. Can anybody explain
> the behavior?
Interesting. Would it make a difference if you move B further away
from lines A and C?
* AW: Understanding behavior of git blame -M
2014-08-15 14:42 ` David Kastrup
@ 2014-08-15 20:54 ` Sokolov, Konstantin (ext)
2014-08-16 7:02 ` David Kastrup
2014-08-16 0:06 ` Duy Nguyen
1 sibling, 1 reply; 8+ messages in thread
From: Sokolov, Konstantin (ext) @ 2014-08-15 20:54 UTC (permalink / raw)
To: David Kastrup; +Cc: git@vger.kernel.org
Hi David,
Thank you very much for the thorough answer. The keyword "hunk" prompted me to experiment a little more, and I found that -M works as expected when at least three lines are moved.
From your answer I gather that you consider the current behavior correct. In my opinion it at least diverges from the documented behavior: the documentation does not mention this "number of lines" aspect and speaks only of a "number of alphanumeric characters".
Regards
Konstantin
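The documented criterion mentioned above, a lower bound on the number of alphanumeric characters, can be sketched as follows. This is a minimal illustration of the documented rule only; the function names alnum_score and passes_threshold are hypothetical, and the sketch says nothing about the extra line-count effect observed in this thread.

```python
def alnum_score(lines):
    """Count the alphanumeric characters in the given lines."""
    return sum(ch.isalnum() for line in lines for ch in line)


def passes_threshold(lines, threshold=20):
    """Documented -M<num> rule: <num> is a lower bound on alphanumeric characters."""
    return alnum_score(lines) >= threshold


# Line B copied from the example in this thread.
moved_line = "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
print(passes_threshold([moved_line], threshold=20))
```

By the documented rule alone, line B clears the -M20 bound, which is why the reported result looks like a mismatch with the documentation.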
-----Original Message-----
From: David Kastrup [mailto:dak@gnu.org]
Sent: Friday, 15 August 2014 16:42
To: Sokolov, Konstantin (ext)
Cc: git@vger.kernel.org
Subject: Re: Understanding behavior of git blame -M
"Sokolov, Konstantin (ext)" <konstantin.sokolov.ext@siemens.com> writes:
> Hi Folks,
>
> I'm trying to understand the behavior of git blame -M and find that
> the actual results differ from what I understood from the
> documentation. I've already asked longer time ago on stackoverflow and
> on the user mailing list without any satisfactory results. So here is
> the example:
>
> Initial content of file.txt (2cd9f7f)
>
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
> Move line B to the middle (d4bbd97e):
>
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
>>git blame -s -n -f -w -M20 file.txt
> ^2cd9f7f 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ^2cd9f7f 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> ^2cd9f7f 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> d4bbd97e 4 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> ^2cd9f7f 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> ^2cd9f7f 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> ^2cd9f7f 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
> I wonder, why line B is not recognized as moved. According to the
> documentation, I would expect git blame to report that it originates
> from line 2 in revision 2cd9f7f. Can anybody explain the behavior?
Someone had reasons. diff_hunks in builtin/blame.c is once called with 0 as third argument, once with 1. Change the latter call to using 0 as well and you get your expected result:
dak@lola:/tmp/test$ /usr/local/tmp/git/git blame -s -n -f -w -M20 file.txt
^2cab496 file.txt 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
^2cab496 file.txt 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
^2cab496 file.txt 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
^2cab496 file.txt 2 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
^2cab496 file.txt 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
^2cab496 file.txt 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
^2cab496 file.txt 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
The function diff_hunks is a wrapper for the diff engine. Putting the context length explicitly into this wrapper (rather than not passing an argument and just setting the context length to zero anyway in the function) clearly indicates that somebody _wanted_ it called with different values.
There is no documentation or rationale in the file _why_ as far as I remember. Maybe it can crash or end up in an infinite loop. Maybe it could do so at one point of time but no longer does.
Maybe Git is just a puzzle from genius to genius. Good luck figuring it out.
I have not touched this when rewriting git-blame recently, and I am not interested in touching it. I stand absolutely nothing to gain from working on Git.
--
David Kastrup
* AW: Understanding behavior of git blame -M
2014-08-15 17:07 ` Junio C Hamano
@ 2014-08-15 20:57 ` Sokolov, Konstantin (ext)
2014-08-18 11:41 ` Sokolov, Konstantin (ext)
1 sibling, 0 replies; 8+ messages in thread
From: Sokolov, Konstantin (ext) @ 2014-08-15 20:57 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git@vger.kernel.org
No. The distance seems to have no influence. In the meantime I've found out (as mentioned in my other reply) that moves are detected when at least three lines are moved.
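The three-line observation fits a simple rule of thumb, sketched here under an explicit assumption: a run of moved lines is detected only when it is longer than 2 * ctxlen lines, where ctxlen is the context length passed to diff_hunks. The rule and both function names are hypothetical and not taken from git's source; the distance of the move does not enter the rule at all.

```python
def smallest_detectable_block(ctxlen):
    """Smallest run of moved lines that is detected under the assumed rule.

    Assumed rule (not taken from git's source): a common run is kept only
    when it is longer than 2 * ctxlen lines.
    """
    return 2 * ctxlen + 1


def is_detected(moved_lines, ctxlen):
    """True when a block of `moved_lines` lines would be detected."""
    return moved_lines >= smallest_detectable_block(ctxlen)


for moved_lines in (1, 2, 3):
    print(moved_lines, is_detected(moved_lines, ctxlen=1))
```

With a context length of 1 the rule gives a minimum of three lines, and with 0 it gives one line, matching both observations reported in this thread.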
-----Original Message-----
From: Junio C Hamano [mailto:gitster@pobox.com]
Sent: Friday, 15 August 2014 19:08
To: Sokolov, Konstantin (ext)
Cc: git@vger.kernel.org
Subject: Re: Understanding behavior of git blame -M
"Sokolov, Konstantin (ext)" <konstantin.sokolov.ext@siemens.com>
writes:
>>git blame -s -n -f -w -M20 file.txt
> ^2cd9f7f 1 1) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ^2cd9f7f 3 2) CCCCCCCCCCCCCCCCCCCCCCCC2222222222222222222222222
> ^2cd9f7f 4 3) DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
> d4bbd97e 4 4) BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> ^2cd9f7f 5 5) EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
> ^2cd9f7f 6 6) GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
> ^2cd9f7f 7 7) FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>
> I wonder, why line B is not recognized as moved. According to the
> documentation, I would expect git blame to report that it originates
> from line 2 in revision 2cd9f7f. Can anybody explain the behavior?
Interesting. Would it make a difference if you move B further away from lines A and C?
* Re: Understanding behavior of git blame -M
2014-08-15 14:42 ` David Kastrup
2014-08-15 20:54 ` AW: " Sokolov, Konstantin (ext)
@ 2014-08-16 0:06 ` Duy Nguyen
1 sibling, 0 replies; 8+ messages in thread
From: Duy Nguyen @ 2014-08-16 0:06 UTC (permalink / raw)
To: David Kastrup; +Cc: Sokolov, Konstantin (ext), git@vger.kernel.org
On Fri, Aug 15, 2014 at 9:42 PM, David Kastrup <dak@gnu.org> wrote:
> The function diff_hunks is a wrapper for the diff engine. Putting the
> context length explicitly into this wrapper (rather than not passing an
> argument and just setting the context length to zero anyway in the
> function) clearly indicates that somebody _wanted_ it called with
> different values.
>
> There is no documentation or rationale in the file _why_ as far as
> I remember. Maybe it can crash or end up in an infinite loop. Maybe it
> could do so at one point of time but no longer does.
Not sure if it helps, but ctxlen = 1 seems to date back to d24bba8
(git-pickaxe -M: blame line movements within a file, 2006-10-19), if
I tracked the changes correctly.
--
Duy
* Re: AW: Understanding behavior of git blame -M
2014-08-15 20:54 ` AW: " Sokolov, Konstantin (ext)
@ 2014-08-16 7:02 ` David Kastrup
0 siblings, 0 replies; 8+ messages in thread
From: David Kastrup @ 2014-08-16 7:02 UTC (permalink / raw)
To: Sokolov, Konstantin (ext); +Cc: git@vger.kernel.org
"Sokolov, Konstantin (ext)" <konstantin.sokolov.ext@siemens.com> writes:
> Hi David,
>
> thank you very much for the exhaustive answer. The keyword "hunk" made
> me try a little bit more. So I realized that -M works as expected when
> at least three lines are moved.
>
> From your answer I discern that you find the current behavior
> correct.
I don't say any such thing and don't imply it.
--
David Kastrup
* RE: Understanding behavior of git blame -M
2014-08-15 17:07 ` Junio C Hamano
2014-08-15 20:57 ` AW: " Sokolov, Konstantin (ext)
@ 2014-08-18 11:41 ` Sokolov, Konstantin (ext)
1 sibling, 0 replies; 8+ messages in thread
From: Sokolov, Konstantin (ext) @ 2014-08-18 11:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git@vger.kernel.org
It seems that not detecting single-line moves is by design and that the documentation is simply imprecise about it. Could detecting them be considered as a feature request? We use git (blame) as a low-level tool and build further functionality on top of it. Maintaining a custom version of git is a big step that we would like to avoid.
Regards
Konstantin