git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: diff --stat
Date: Tue, 14 Feb 2012 15:29:34 -0500	[thread overview]
Message-ID: <20120214202934.GA23291@sigill.intra.peff.net> (raw)
In-Reply-To: <7vfwed6uws.fsf@alter.siamese.dyndns.org>

On Tue, Feb 14, 2012 at 12:07:31PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Hmm. Looking at scale_linear, the formula is:
> >
> >    return ((it - 1) * (width - 1) + max_change - 1) / (max_change - 1);
> >
> > I don't see how that can be accurate, since the magnitude of the "-1"
> > tweak will vary based on the value of "it". This code is due to
> > 3ed74e6, but I don't quite follow the logic in the commit message.
> 
> Doesn't it need +1 at the end, I wonder?  We want to map:
> 
>  - zero to zero
>  - any number to at least 1
> 
> so scaling a non-zero "it" so that maximum maps to (width-1) and then
> adding 1 would be the right way for the latter case.

Yeah, that makes more sense to me. I think you could also get by with
simply rounding the above properly to the nearest integer (so a little
bit of error that makes 23.9 into 24.0 would be OK, because we would end
up rounding 30.9 to 31.0, too). This seems like the most obvious
solution to me:

  static int divide_and_round(int a, int b)
  {
          return (2 * a + b) / (2 * b);
  }

  /*
   * We want non-zero changes to have at least 1 marker, so special-case
   * zero, then scale to width-1, and add back in 1.
   */
  static int scale_linear(int it, int width, int max_change)
  {
          if (!it)
                  return 0;
          return 1 + divide_and_round(it * (width-1), max_change);
  }

Any "must show at least 1" scheme is going to have some error in the
scaling (because we are rounding up all of the low end). I have a
feeling that the scheme from 3ed74e6 was trying to distribute that error
more evenly throughout the scale, and the scheme above is lumping all of
it at the start (i.e., the difference between what constitutes one
marker and two markers is much greater than the difference between two
and three).

But that's just a vague feeling. For some reason my brain is not doing
well with math today, so I'll forego writing a proof.

> Of course, an easy way out without worrying about the correct math is to
> scale the total and the smaller one and then declare that the scaled
> larger one is the difference between the two. That way, both of these two
> files have 109 in total so the length of the entire graph would be the
> same ;-).

It looks like we actually did that, pre-3ed74e6. I think it's a valid
strategy. It is just pushing the error around, but I don't know that
people are counting the +/- markers on the line and comparing them
exactly. A little error there is less of a big deal than error between
two lines which are supposed to have the same number of changes (you'll
note that we don't give the per-file added/deleted numbers exactly, so
they are a good place to hide error. :) ).

-Peff

  reply	other threads:[~2012-02-14 20:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-14  0:11 diff --stat Junio C Hamano
2012-02-14 19:50 ` Jeff King
2012-02-14 20:07   ` Junio C Hamano
2012-02-14 20:29     ` Jeff King [this message]
2012-02-14 21:49       ` Junio C Hamano
2012-02-14 21:53         ` Jeff King
2012-02-14 22:06           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120214202934.GA23291@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).