From: "jamesmikedupont@googlemail.com" <jamesmikedupont@googlemail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, git@vger.kernel.org
Subject: Re: Introduction and Wikipedia and Git Blame
Date: Sat, 17 Oct 2009 18:42:18 +0200
Message-ID: <ee9cc730910170942p7869d62ra08571948675d696@mail.gmail.com>
In-Reply-To: <ee9cc730910162350p250b8afak767b0626bede34e4@mail.gmail.com>
I have a workaround in place.
Today I tried to hack the blame code directly, but it did not work;
I need to do more research.
I did, however, get a new version of the import script running, and
word-level blame is working.
http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.html
The next step is ready:
1. I have a single script that pulls a given article and checks its
revisions into git.
It is not perfect, but it works.
http://bazaar.launchpad.net/~jamesmikedupont/+junk/wikiatransfer/revision/8
You run it like this, from inside a git repo:
perl GetRevisions.pl "Article_Name"
git blame Article_Name/Article.xml
git push origin master
The code that splits up each line is in Process File. It replaces every
space with a backslash followed by a newline, so each word lands on its
own line. That way we get word-level blame.
if ($insidetext)
{
    ## replace every space with a backslash and a newline (one word per line)
    s/ /\\\n/g;
    print OUT $_;
}
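For anyone who wants to try the same transformation outside Perl, here is a minimal Python sketch of it. The function names `split_words` and `join_words` are hypothetical and not part of GetRevisions.pl; the inverse only restores the text exactly if the original contained no backslash-newline pairs of its own.

```python
def split_words(text: str) -> str:
    """Turn every space into backslash + newline, so each word gets its own line.

    Mirrors the substitution s/ /\\\n/g from the Perl fragment above.
    """
    return text.replace(" ", "\\\n")


def join_words(split_text: str) -> str:
    """Undo split_words.

    Assumption: the original text had no backslash-newline pairs of its own,
    otherwise those would be turned into spaces as well.
    """
    return split_text.replace("\\\n", " ")


sample = "Kosovo declared independence"
as_lines = split_words(sample)
print(as_lines)                        # one word per line, each ending in a backslash
assert join_words(as_lines) == sample  # round trip restores the original text
```

Because the marker is a backslash plus a newline rather than a bare newline, the original line breaks of the article stay distinguishable from the breaks introduced by the split.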
The article is here:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_declaration_of_independence/article.xml
Here are the blame results:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_declaration_of_independence/wordblame.txt
The problem is that GitHub does not like this much processor power
being used and kills the process, so run git blame locally instead.
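A local blame can also be post-processed. The following is a minimal Python sketch, not part of the import script, that reads the output of `git blame --line-porcelain` (an existing git blame option that repeats the commit information for every line) and counts words per author. The function names are hypothetical, and the sketch assumes the file was split one word per line with a trailing backslash as described above. The final line number of each row then doubles as the word's position, in the spirit of the suggestion quoted below.

```python
import subprocess
from collections import Counter

HEX_DIGITS = set("0123456789abcdef")


def run_blame(path: str) -> str:
    """Run git blame locally on one file and return the porcelain text."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_line_porcelain(porcelain: str) -> list:
    """Return (word_position, author, word) for every blamed line.

    Each entry starts with "<sha> <orig-line> <final-line> [<count>]",
    is followed by "key value" lines such as "author NAME", and ends
    with the file content on a line that starts with a TAB.
    """
    rows = []
    author = None
    final_line = None
    for line in porcelain.split("\n"):
        if line.startswith("\t"):
            word = line[1:]
            # Assumption: drop the backslash appended by the word split.
            if word.endswith("\\"):
                word = word[:-1]
            rows.append((final_line, author, word))
        elif line.startswith("author "):
            author = line[len("author "):]
        else:
            parts = line.split(" ")
            is_header = (
                len(parts) >= 3
                and len(parts[0]) >= 40
                and set(parts[0]) <= HEX_DIGITS
                and parts[1].isdigit()
                and parts[2].isdigit()
            )
            if is_header:
                final_line = int(parts[2])
    return rows


def count_words_by_author(rows: list) -> Counter:
    """Count how many words (blamed lines) each author is credited with."""
    return Counter(author for _, author, _ in rows)
```

A call such as `count_words_by_author(parse_line_porcelain(run_blame("Article_Name/Article.xml")))`, run from inside the repository, would then give a per-author word tally.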
Now we have a tool to easily create a repository from Wikipedia, or
from any other MediaWiki with export enabled.
mike
On Sat, Oct 17, 2009 at 8:50 AM, jamesmikedupont@googlemail.com
<jamesmikedupont@googlemail.com> wrote:
> Thank you very much for your input and advice,
> I have a lot to learn about this great tool.
> I am working on learning how the existing blame tool runs now.
> Will report back when I have some code.
> mike
>
> On Sat, Oct 17, 2009 at 1:25 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> "jamesmikedupont@googlemail.com" <jamesmikedupont@googlemail.com> writes:
>>
>>> What do you think of my idea to create blames along a specific user
>>> defined byte positions ?
>>
>> Overly complicated and not enough time for _review_. If you are blaming
>> one-byte (or one-char) per line, wouldn't it be enough to consider the
>> line number in the output as byte (or char) position when reconstituting
>> the original text?
>>
>