From: "jamesmikedupont@googlemail.com" <jamesmikedupont@googlemail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: Introduction and Wikipedia and Git Blame
Date: Fri, 16 Oct 2009 16:23:20 +0200
Message-ID: <ee9cc730910160723j5d7346a4l195ac6d3825c393b@mail.gmail.com>
In-Reply-To: <alpine.DEB.1.00.0910161548550.4985@pacific.mpi-cbg.de>
Johannes,
Thanks for your input; comments below.
Regards,
mike
On Fri, Oct 16, 2009 at 4:11 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Fri, 16 Oct 2009, jamesmikedupont@googlemail.com wrote:
>
>> On Fri, Oct 16, 2009 at 1:26 PM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>> >> Here is the discussion on foundation-l :
>> >> http://www.gossamer-threads.com/lists/wiki/foundation/181163
>> >
>> > I found the link to the bazaar repository there, but do you have a Git
>> > repository, too?
>>
>> Not yet. Where should I put it? Any suggestions.
>
> github.com has a nice interface.
>
> BTW after reading some of the code, I am a bit surprised that you did not
> do it as a .php script outputting fast-import capable text...
I don't really know PHP, and I don't have a debugger or any tools for it.
I really cannot understand how people can work in such an environment.
I have done all my hacking work as Perl scripts.
These can be rewritten in C later on.
> Okay, so basically you want to analyze the text on a word-by-word basis
> rather than line-by-line.
Yes.
>
> Or maybe even better: you want to analyze the text character-by-character.
> That would also nicely circumvent having to specify just what makes a word a word
> (subject for a lot of heated discussion during the design of the
> --color-words=<regex> patch).
Yes. Someone suggested on IRC that I review the color-words code; I have the
source code now and will be looking into it.
>
> Basically, if I had to implement that, I would not try to modify
> builtin-blame.c, but write a new program linking to libgit.a, calling the
> revision walker on the file you want to calculate the blame for. (One of
> the best examples is probably in builtin-shortlog.c.)
>
> Then I would introduce a linked-list structure which will hold the blamed
> regions in this form:
>
> struct region {
>         int start;
>         struct region *next;
> };
>
> Initially, this would have a start element with the start offset 0
> pointing to the end element with start offset being set to the size of the
> blob.
>
> Most likely you will have to add members to this struct, such as the
> original offsets (as you will have to adjust the offsets to the different
> file revisions while you go back in time), and the commit it was
> attributed to.
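
The region list described above could be sketched roughly as follows. This is only an illustration, not code from Git: the extra members (orig_start, commit) and the helpers region_list_init(), region_split() and region_list_free() are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* The struct from the mail, plus assumed extra members. */
struct region {
	int start;             /* offset in the revision currently examined */
	int orig_start;        /* offset in the newest revision of the blob */
	const char *commit;    /* NULL while the region is unattributed */
	struct region *next;
};

static struct region *region_new(int start, int orig_start,
				 const char *commit, struct region *next)
{
	struct region *r = malloc(sizeof(*r));
	if (!r)
		abort();
	r->start = start;
	r->orig_start = orig_start;
	r->commit = commit;
	r->next = next;
	return r;
}

/* Start element at offset 0, pointing to an end element at blob_size. */
struct region *region_list_init(int blob_size)
{
	assert(blob_size >= 0);
	return region_new(0, 0, NULL,
			  region_new(blob_size, blob_size, NULL, NULL));
}

/*
 * Make sure a region starts exactly at `offset`, splitting the region
 * that contains it if necessary, and return that region.
 */
struct region *region_split(struct region *head, int offset)
{
	struct region *r = head;

	assert(offset >= head->start);
	while (r->next && r->next->start <= offset)
		r = r->next;
	if (r->start == offset)
		return r;
	assert(r->next != NULL); /* offset must not lie past the end element */
	r->next = region_new(offset, r->orig_start + (offset - r->start),
			     r->commit, r->next);
	return r->next;
}

void region_list_free(struct region *head)
{
	while (head) {
		struct region *next = head->next;
		free(head);
		head = next;
	}
}
```

Splitting at both ends of an added range and then setting commit on the regions in between would be one way to attribute that range to a commit.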
>
> Then I would make modified "texts" from the blob of the file in the
> current revision and its parent revision, by inserting newlines after
> every single byte (probably replacing the original newlines by other
> values, such as \x01).
>
> The reason for this touchup is that the diff machinery in Git only handles
> line-based diffs.
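
A minimal sketch of that touchup, assuming the blob does not already contain the stand-in byte \x01 (otherwise the mapping would be ambiguous). The names explode_bytes() and NEWLINE_STANDIN are made up for this example and are not part of Git.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-in for original newlines, as suggested in the mail. */
#define NEWLINE_STANDIN '\x01'

/*
 * Return a newly allocated text in which every byte of `blob` sits on
 * its own line, so that line N (1-based) of the result corresponds to
 * byte offset N - 1 of the blob.  Original newlines are replaced by
 * NEWLINE_STANDIN.  The caller frees the result.
 */
char *explode_bytes(const char *blob, size_t len, size_t *out_len)
{
	char *out;
	size_t i;

	assert(blob != NULL || len == 0);
	out = malloc(2 * len + 1);
	if (!out)
		return NULL;
	for (i = 0; i < len; i++) {
		out[2 * i] = (blob[i] == '\n') ? NEWLINE_STANDIN : blob[i];
		out[2 * i + 1] = '\n';
	}
	out[2 * len] = '\0';
	if (out_len)
		*out_len = 2 * len;
	return out;
}
```

The result is twice the size of the input, which may matter for large articles with long histories.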
>
> Then you can parse the hunk headers, adjust the offsets accordingly, and
> attribute the +++ regions to the current commit (by construction, the
> offsets are equal to the line number in the hunk header). Here it is most
> likely necessary to split the regions.
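
Parsing the hunk headers might look something like the sketch below. It assumes unified diffs generated with zero context lines, so that the new-side range covers only added lines; struct hunk, parse_hunk_header() and hunk_new_offset() are hypothetical names for this example, not Git functions.

```c
#include <assert.h>
#include <stdio.h>

struct hunk {
	int old_start, old_count;
	int new_start, new_count;
};

/* Parse "<sign>N" or "<sign>N,M" at *p; a missing count means 1. */
static int parse_range(const char **p, char sign, int *start, int *count)
{
	int used = 0;

	if (**p != sign)
		return -1;
	(*p)++;
	if (sscanf(*p, "%d%n", start, &used) != 1)
		return -1;
	*p += used;
	*count = 1;
	if (**p == ',') {
		(*p)++;
		if (sscanf(*p, "%d%n", count, &used) != 1)
			return -1;
		*p += used;
	}
	return 0;
}

/* Parse "@@ -a,b +c,d @@ ..."; return 0 on success, -1 otherwise. */
int parse_hunk_header(const char *line, struct hunk *h)
{
	const char *p = line;

	assert(line != NULL && h != NULL);
	if (p[0] != '@' || p[1] != '@' || p[2] != ' ')
		return -1;
	p += 3;
	if (parse_range(&p, '-', &h->old_start, &h->old_count))
		return -1;
	if (*p != ' ')
		return -1;
	p++;
	if (parse_range(&p, '+', &h->new_start, &h->new_count))
		return -1;
	if (p[0] != ' ' || p[1] != '@' || p[2] != '@')
		return -1;
	return 0;
}

/*
 * Byte offset where the new side of the hunk begins, given one byte
 * per line.  For an empty range the header names the preceding line,
 * so the offset is the line number itself.
 */
int hunk_new_offset(const struct hunk *h)
{
	return h->new_count ? h->new_start - 1 : h->new_start;
}
```

With this, the bytes from hunk_new_offset(&h) up to hunk_new_offset(&h) + h.new_count would be the range to attribute to the current commit.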
>
> You should also have a counter how many regions are still unattributed so
> you can stop early.
OK, this sounds like a plan. I think that will be a good outline to
start some work from.
I will let you know when I have made some progress.
Thanks,
mike