From: xmeng@cs.wisc.edu
To: "Junio C Hamano" <gitster@pobox.com>
Cc: "Philip Oakley" <philipoakley@iee.org>,
xmeng@cs.wisc.edu, git@vger.kernel.org
Subject: Re: A generalization of git blame
Date: Wed, 26 Sep 2012 10:36:46 -0500
Message-ID: <72fc15c78ddb6b5c9e95f6b0cd925a26.squirrel@webmail.cs.wisc.edu>
In-Reply-To: <7vsja5vh2z.fsf@alter.siamese.dyndns.org>
> "Philip Oakley" <philipoakley@iee.org> writes:
>
>>> To get ground truth of authorship for each line, I start with git-blame.
>>> But later I find this is not sufficient because the last commit may only
>>> add comments or may only change a small part of the line, so that I
>>> shouldn't attribute the line of code to the last author.
>>
>> I would suggest there is:
>> - White space adjustment
>> - Comment or documentation (assumes you can parse the 'code' to decide
>> that it isn't executable code)
>> - word changes within expressions
>> - complete replacement of line (whole statement?)
>
> You are being generous by listing easier cases ;-) I'd add a couple
> more that are more problematic if your approach does not consider
> semantics.
>
> - A function gained a new parameter, to which pretty much everybody
>   passes the same default value.
>
>     -void fn(int a, int b, int c)
>     +void fn(int a, int b, int c, int d)
>      {
>     +        if (d) {
>     +                ...
>     +                return;
>     +        }
>              ...
>      }
>
>      void frotz(void)
>      {
>              ...
>     -        fn(a, b, c);
>     +        fn(a, b, c, 0);
>              ...
>     -        fn(a, b, d);
>     +        fn(a, b, d, 1);
>              ...
>
> The same commit that changed the above call site must have
> changed the definition of function "fn" and defined what the new
> fourth parameter means. It is likely that, when the default
> value most everybody passes (perhaps "0") is given, "fn" does
> what it used to do, and a different value may trigger a new
> behaviour of "fn". It could be argued that the former call
> should not be blamed for this commit, while the latter callsite
> should.
>
> - A variable was renamed, and the meaning of a line suddenly
> changed, even though the text of that line did not change at all.
>
>      static int foo;
>      ...
>     -int xyzzy(int foo)
>     +int xyzzy(int bar)
>      {
>              ... some complex computation that
>              ... involves foo and bar, resulting in
>              ... updating of foo comes here ...
>              return foo * 2;
>      }
>
> Whom to blame for the behaviour of (i.e. returned value from) the
> function? The "return foo * 2" never changed with this patch,
> but the patch _is_ responsible for changing the behaviour.
>
> As the OP is interested in tracking the origin of the _binary_,
> this case is even more interesting, as the generated machine code
> to compute the foo * 2 would likely be very different before
> and after the patch.
>
>
Thanks to both of you for the great suggestions. Currently my approach
doesn't consider semantics, and handling it should be an interesting
thing to do.
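
The purely textual cases from Philip's list (whitespace adjustment,
comment-only change, partial edit, complete replacement) can be told apart
without any semantic analysis. Below is a minimal sketch in C of one way
this might be done. Every name in it (classify_change, the change_kind
values) and the 50% prefix/suffix threshold are hypothetical, chosen only
for illustration, and not taken from the tool under discussion or from
git. It also assumes single-line C comments and does not treat string
literals specially.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical categories, loosely following the list quoted above. */
enum change_kind {
    CHANGE_NONE,        /* lines are byte-identical */
    CHANGE_WHITESPACE,  /* lines differ only in whitespace */
    CHANGE_COMMENT,     /* lines differ only inside comments */
    CHANGE_PARTIAL,     /* most of the code text is shared */
    CHANGE_REPLACED     /* little or nothing is shared */
};

/* Compare two strings while skipping every whitespace character. */
static int equal_ignoring_space(const char *a, const char *b)
{
    for (;;) {
        while (isspace((unsigned char)*a))
            a++;
        while (isspace((unsigned char)*b))
            b++;
        if (*a != *b)
            return 0;
        if (*a == '\0')
            return 1;
        a++;
        b++;
    }
}

/* Copy src to dst without C comments (assumption: no string literals). */
static void strip_comments(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    if (dstsize == 0)
        return;
    while (*src && n + 1 < dstsize) {
        if (src[0] == '/' && src[1] == '/')
            break;              /* rest of the line is a comment */
        if (src[0] == '/' && src[1] == '*') {
            const char *end = strstr(src + 2, "*/");
            if (!end)
                break;          /* comment runs past the end of the line */
            src = end + 2;
            continue;
        }
        dst[n++] = *src++;
    }
    dst[n] = '\0';
}

/* Heuristic: common prefix plus suffix covers half of the shorter line. */
static int mostly_shared(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t shorter = la < lb ? la : lb;
    size_t pre = 0, suf = 0;

    while (pre < shorter && a[pre] == b[pre])
        pre++;
    while (suf < shorter - pre && a[la - 1 - suf] == b[lb - 1 - suf])
        suf++;
    return shorter > 0 && 2 * (pre + suf) >= shorter;
}

enum change_kind classify_change(const char *old_line, const char *new_line)
{
    char old_code[1024], new_code[1024];

    if (strcmp(old_line, new_line) == 0)
        return CHANGE_NONE;
    if (equal_ignoring_space(old_line, new_line))
        return CHANGE_WHITESPACE;
    strip_comments(old_line, old_code, sizeof(old_code));
    strip_comments(new_line, new_code, sizeof(new_code));
    if (equal_ignoring_space(old_code, new_code))
        return CHANGE_COMMENT;
    if (mostly_shared(old_code, new_code))
        return CHANGE_PARTIAL;
    return CHANGE_REPLACED;
}
```

Note that a textual classifier like this puts both "fn(a, b, c, 0)" and
"fn(a, b, d, 1)" from Junio's example into the same "partial" bucket, and
sees no change at all in "return foo * 2;" after the rename. Those are
exactly the cases that need semantic information. The whitespace case
alone is already covered by "git blame -w".
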
Another question: would it be possible to include my tool as a git
built-in in the future? I know that my tool is not yet good enough for any
release, but I would like to share my work if other people are
interested. And if it is possible, I will have a stronger motivation to
make my tool more robust and useful.
Thanks