git.vger.kernel.org archive mirror
From: "Sverre Rabbelier" <alturin@gmail.com>
To: "Johannes Schindelin" <Johannes.Schindelin@gmx.de>
Cc: "Git Mailinglist" <git@vger.kernel.org>,
	"David Symonds" <dsymonds@gmail.com>
Subject: Re: [GitStats] Bling bling or some statistics on the git.git repository
Date: Fri, 11 Jul 2008 23:52:02 +0200
Message-ID: <bd6139dc0807111452x778759d4jd6ac71338974018e@mail.gmail.com>
In-Reply-To: <alpine.DEB.1.00.0807112215050.8950@racer>

On Fri, Jul 11, 2008 at 11:22 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Fri, 11 Jul 2008, Sverre Rabbelier wrote:
>
>> I temporarily modified the code to output %04d instead of %4d so that I
>> could do the following:
>>
>>        $ stats.py author -a > full_activity_sortable.txt
>
> You might be delighted to read up on the "-n" switch to sort(1).

Heh, yes, very much so :). I probably should have known there is such an
option, but having the source at hand, the change to '%04d' was the
first thing that came to mind.
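
For illustration, a minimal sketch of the same idea done in Python rather
than with sort(1): sort on the numeric value of the leading field, so the
format string does not need to change. The helper name and the two
example_*.c lines are made up; only the refs.c numbers come from this
thread, shown here in the original '%4d' layout.

```python
def leading_count(line):
    """Return the integer before the first ':' (assumed line layout)."""
    return int(line.split(":", 1)[0])


# Only the refs.c numbers come from the thread; the other lines are made up.
lines = [
    "   9:    12+     3- = example_a.c",
    " 170:  2721+  1060- = refs.c",
    "  42:   100+    80- = example_b.c",
]

# Numeric sort on the leading field, largest first,
# roughly what "sort -rn | head" would give.
for line in sorted(lines, key=leading_count, reverse=True):
    print(line)
```

Sorting on a numeric key works for both the padded and the unpadded
layout, since int() accepts leading blanks as well as leading zeros.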

>> A few highlights from the sorted file:
>>
>> $ cat full_activity_sortable.txt | sort | tail -n 20
>
> More intuitive would have been "sort -r | head -n 20", I guess.

Since that would have put the 'number one' at the top? Yeah, I guess it
would have, nice one.

>> 0170:  2721+  1060- = refs.c
>
> I guess that 170 is the total number of commit touching that file, the "+"
> and "-" numbers the changes respectively?

Correct, I probably should have explained that. The +es are how many
lines were added and the -es are how many lines were deleted, yup.
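
To make the layout concrete, here is a small sketch that splits one such
line into its fields. The regular expression and the function name are
assumptions based only on the single sample line quoted above, not taken
from stats.py.

```python
import re

# Assumed layout: "<commits>: <added>+ <deleted>- = <path>"
LINE_RE = re.compile(r"^\s*(\d+):\s*(\d+)\+\s*(\d+)-\s*=\s*(.+)$")


def parse_activity_line(line):
    """Split one activity line into commit count, lines added/deleted, path."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognised activity line: %r" % line)
    commits, added, deleted, path = match.groups()
    return {
        "commits": int(commits),
        "added": int(added),
        "deleted": int(deleted),
        "path": path,
    }


print(parse_activity_line("0170:  2721+  1060- = refs.c"))
```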

> I think quite a lot of our changes do code moves; this should be accounted
> for differently.

Yeah, I wish 'git log -C -C -M --numstat --sacrifice-chicken
--pretty=format:%ae --' would take care of that... That is, a
git-blame like mechanism that would detect such moves on a per-commit
basis and report them would be very useful to me.

>> For some reason you people can't seem to make up your mind about a
>> file that's not even 1500 lines in size ;).
>
> Heh.  We might need to change it once or twice, in the future.

*chuckles*, I'm curious why the Makefile is such a hard file to get right :).

>> A note is in order here, this data was mined with "git log --num-stat"
>> so things like moving files and copying files are not accounted for.
>
> In my opinion it would be even more interesting to see code moves (i.e.
> not whole files).  For example, we moved some stuff from builtins into the
> library.  The real change here is not in the lines added and deleted.

Very much so, but the former I figure can be easily done with 'git log
-C -C -M', I discovered (I need to parse its output though, and also
determine what to do with moves statistics-wise. Should changes made
due to moves just be ignored?)
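
As a rough sketch of that parsing step, assuming the --numstat line
layout of "<added><TAB><deleted><TAB><path>", with "-" in both counts for
binary files and " => " in the path when a rename or copy was detected.
The function name and the sample paths are hypothetical, and the " => "
check is only a heuristic, since a file name could contain that text.

```python
def parse_numstat_line(line):
    """Parse one numstat line into counts, path, and a rename/copy flag."""
    added, deleted, path = line.rstrip("\n").split("\t", 2)
    binary = added == "-" and deleted == "-"
    return {
        "added": None if binary else int(added),
        "deleted": None if binary else int(deleted),
        "path": path,
        # Heuristic: detected renames and copies are shown as "old => new".
        "moved": " => " in path,
    }


# Hypothetical sample lines, not taken from the git.git history.
samples = [
    "12\t3\texample.c",
    "0\t0\told_dir/{old_name.c => new_name.c}",
    "-\t-\texample.png",
]
for sample in samples:
    print(parse_numstat_line(sample))
```

Whether lines flagged as moved should be skipped or counted separately
is exactly the open question above; the flag only makes the choice
possible.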

>> I thought about using git-blame to gather this info before, but it is
>> not the right tool for the job. If anyone else has any idea's on what
>> would be better please let me know and I'll happily dig into it :).
>
> I think that you need to analyze the diff directly.  One possible (quick
> 'n dirty) way would be to cut out long consecutive "+" parts of the hunks,
> replace the "-" by "+", and use "git diff --no-index" to do the hard part
> of searching for that code in the "-" part of the original diff.

That sounds interesting. I won't actually need to do that though, since
I already have a diff parser that gives me the lines added vs. the lines
deleted on a hunk-by-hunk basis. If it is a true move (e.g., code
removed in file X and added in file Y) it should be trivial to detect
that.

Something along the lines of:

    for hunk in added:
        if hunk in deleted:
            print("Over here!!")
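
A slightly fuller sketch of that loop, under stated assumptions: each
hunk is represented as a tuple of lines, and added/deleted are dicts
mapping a file path to its list of hunks. That representation, the
function name and the sample data are all hypothetical, not the actual
parser's interface. Only exact matches are found, so a move that also
reindents or edits the code would be missed.

```python
def find_moves(added, deleted):
    """Return (source, destination, line_count) for hunks deleted in one
    file and added verbatim in another."""
    # Index deleted hunks by content so each added hunk is a dict lookup.
    removed_from = {}
    for path, hunks in deleted.items():
        for hunk in hunks:
            removed_from.setdefault(hunk, []).append(path)

    moves = []
    for dst, hunks in added.items():
        for hunk in hunks:
            for src in removed_from.get(hunk, []):
                if src != dst:  # same-file matches are not cross-file moves
                    moves.append((src, dst, len(hunk)))
    return moves


# Hypothetical data: one function moved from a builtin into the library.
body = ("int helper(void)", "{", "\treturn 0;", "}")
added = {"lib.c": [body], "other.c": [("int x;",)]}
deleted = {"builtin-foo.c": [body]}
print(find_moves(added, deleted))
```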

> Just an idea,

Much appreciated! I will look into this.

-- 
Cheers,

Sverre Rabbelier
