git.vger.kernel.org archive mirror
From: Thomas Rast <trast@student.ethz.ch>
To: <git@vger.kernel.org>
Cc: "Björn Steinbrink" <B.Steinbrink@gmx.de>
Subject: blame -M vs. log -p|grep -c ^+ weirdness
Date: Tue, 11 Aug 2009 12:16:00 +0200
Message-ID: <200908111216.05131.trast@student.ethz.ch>

Hi all

I think I'm fundamentally misunderstanding something about the blame
code...  The other day I wanted to see how much our local fork of
DOMjudge has diverged from its upstream.  You can grab the entire
history at

  git://csa.inf.ethz.ch/domjudge-public.git

if you want to try the commands I ran.

As a first statistic I looked at how many lines are blamed to our
local team (Christoph, Florian and me) by running

  git ls-files | while read f; do git blame -M -- "$f"; done |
  perl -pe 's/^\^?[a-f0-9]* (?:[^(]* )?\(([^2]*?) *20.*/$1/' |
  sort | uniq -c | sort -n

This shows that over 8000 lines are attributed to the three of us:

      1 domjudge
      2 rob
    113 Stijn van Drongelen
    126 Jeroen Schot
    149 neus
    866 Peter van de Werken
   1245 Thomas Rast
   1752 Christoph Krautz
   5350 Florian Jug
  10293 Thijs Kinkhorst
  20397 Jaap Eldering
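
The perl regex above depends on the human-readable blame layout: it
assumes the author name contains no '2' and that the date starts with
'20'.  A less format-dependent tally can be sketched as follows,
assuming a Git new enough to have `git blame --line-porcelain` (an
option that postdates this message).  The helper name
count_blame_authors is made up for illustration.

```shell
# Tally blamed lines per author from `git blame --line-porcelain`
# output read on stdin.
# count_blame_authors is a hypothetical helper name, not a git command.
count_blame_authors() {
  # Every blamed line carries one "author <name>" header.  The
  # "author-mail", "author-time" and "author-tz" headers do not match
  # because of the trailing space in the pattern, and file content
  # lines start with a TAB, so they cannot match either.
  sed -n 's/^author //p' | sort | uniq -c | sort -n
}

# Usage inside a work tree (same -M option as above):
#   git ls-files | while read -r f; do
#     git blame -M --line-porcelain -- "$f"
#   done | count_blame_authors
```

This should produce the same kind of per-author table, but without
guessing where the name ends.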

However, sanity checking this against the diffs of the single commits
shows quite a different number:

  git log --no-merges -p upstream/2.2.. | grep '^+' | grep -v -c '^+++'

gives only 4943 '+' lines, and you can easily verify with

  git shortlog -sn upstream/2.2..

that indeed all commits in that range are ours.  So why does blame
think more lines are ours than we even added *in total*?
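
As a cross-check on the grep count, the added lines can also be summed
from `git log --numstat`.  This is only a sketch: sum_added_lines is a
hypothetical helper name, and the empty `--format=` used to suppress
the commit headers is assumed to be supported by the Git in use (the
awk filter skips header lines anyway).

```shell
# Sum the "added" column of `git log --numstat` output read on stdin.
# sum_added_lines is a hypothetical helper name.
sum_added_lines() {
  # numstat lines look like "<added>TAB<deleted>TAB<path>".  Binary
  # files show "-" in both columns and fail the numeric test, as do
  # commit header and message lines.
  awk -F '\t' 'NF >= 3 && $1 ~ /^[0-9]+$/ { total += $1 } END { print total + 0 }'
}

# Usage:
#   git log --no-merges --numstat --format= upstream/2.2.. | sum_added_lines
```

The result should be close to the grep figure, though not necessarily
identical: grep -v '^+++' also discards added lines whose content
happens to start with '++'.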

Björn Steinbrink suggested on IRC that I use -M5 -C5 -C5 -C5, which
indeed reduces it to

      1 domjudge
      2 rob
    115 Stijn van Drongelen
    116 Jeroen Schot
    149 neus
    390 Florian Jug
    930 Peter van de Werken
   1209 Thomas Rast
   1612 Christoph Krautz
  11750 Thijs Kinkhorst
  24020 Jaap Eldering

Note especially the huge drop in Florian's numbers.  What's going on
here?
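
To put numbers on the comparison: the three of us add up to
1245 + 1752 + 5350 = 8347 lines in the first table and
390 + 1209 + 1612 = 3211 in the second, against 4943 '+' lines in
total.  A small sketch for pulling such a team total out of the
`uniq -c` tables follows; sum_for_authors is a hypothetical helper
name, and it assumes author names contain no '|'.

```shell
# Sum the counts in `uniq -c`-style output (stdin) for the author
# names given as arguments.
# sum_for_authors is a hypothetical helper name.
sum_for_authors() {
  # Join the requested names with '|' so they fit in one awk variable.
  names=$(IFS='|'; printf '%s' "$*")
  awk -v names="$names" '
    BEGIN {
      n = split(names, a, "|")
      for (i = 1; i <= n; i++) want[a[i]] = 1
    }
    {
      count = $1
      $1 = ""            # drop the count, leaving only the name
      sub(/^ +/, "")     # strip the separator left behind
      if ($0 in want) total += count
    }
    END { print total + 0 }'
}

# Usage, appended to either blame pipeline above:
#   ... | sort | uniq -c |
#   sum_for_authors 'Thomas Rast' 'Christoph Krautz' 'Florian Jug'
```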

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
