* [PATCH 1/2] diff --stat: use asymptotic scaling in graph
@ 2006-10-12 19:37 apodtele
2006-10-12 20:16 ` Martin Waitz
2006-10-12 21:53 ` Junio C Hamano
0 siblings, 2 replies; 16+ messages in thread
From: apodtele @ 2006-10-12 19:37 UTC (permalink / raw)
To: git
Instead of conditionally scaling the stat graph for large changes,
always scale it asymptotically: small changes shall appear without any
distortions.
Signed-off-by: Alexei Podtelezhnikov
--- diff.c 2006-10-12 14:45:13.000000000 -0400
+++ diff.c 2006-10-12 15:07:30.000000000 -0400
@@ -637,15 +637,9 @@

 const char mime_boundary_leader[] = "------------";

-static int scale_linear(int it, int width, int max_change)
+static int scale_nonlinear(int it, int width)
 {
-	/*
-	 * make sure that at least one '-' is printed if there were deletions,
-	 * and likewise for '+'.
-	 */
-	if (max_change < 2)
-		return it;
-	return ((it - 1) * (width - 1) + max_change - 1) / (max_change - 1);
+	return it * width / (it + width) + 1;
 }

 static void show_name(const char *prefix, const char *name, int len,
@@ -776,11 +770,9 @@
 		adds += add;
 		dels += del;

-		if (width <= max_change) {
-			add = scale_linear(add, width, max_change);
-			del = scale_linear(del, width, max_change);
-			total = add + del;
-		}
+		add = scale_nonlinear(add, width / 2);
+		del = scale_nonlinear(del, width / 2);
+		total = add + del;
 		show_name(prefix, name, len, reset, set);
 		printf("%5d ", added + deleted);
 		show_graph('+', add, add_c, reset);
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 19:37 [PATCH 1/2] diff --stat: use asymptotic scaling in graph apodtele
@ 2006-10-12 20:16 ` Martin Waitz
2006-10-12 21:37 ` apodtele
2006-10-12 21:53 ` Junio C Hamano
1 sibling, 1 reply; 16+ messages in thread
From: Martin Waitz @ 2006-10-12 20:16 UTC (permalink / raw)
To: apodtele; +Cc: git
hoi :)
On Thu, Oct 12, 2006 at 03:37:17PM -0400, apodtele wrote:
> Instead of conditionally scaling the stat graph for large changes,
> always scale it asymptotically: small changes shall appear without any
> distortions.
very nice idea!
> + return it * width / (it + width) + 1;
but wouldn't this formula result in at least 1, even for a 0 change?
Perhaps we'd have to special case an input of 0?
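To make the concern concrete, here is a minimal sketch of the formula as posted, next to one possible guard. The function names `scale_v1` and `scale_v1_guarded` are made up for this illustration; only the arithmetic expression comes from the patch.

```c
/* Formula from the patch as posted (the name is illustrative). */
int scale_v1(int it, int width)
{
	/* The trailing "+ 1" is added even when it == 0. */
	return it * width / (it + width) + 1;
}

/* One possible special case: an empty change stays empty. */
int scale_v1_guarded(int it, int width)
{
	return it ? scale_v1(it, width) : 0;
}
```

With any positive `width`, `scale_v1(0, width)` evaluates to 1, so a file with additions only would still get one '-' in its bar.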
--
Martin Waitz
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 20:16 ` Martin Waitz
@ 2006-10-12 21:37 ` apodtele
2006-10-12 22:20 ` A Large Angry SCM
0 siblings, 1 reply; 16+ messages in thread
From: apodtele @ 2006-10-12 21:37 UTC (permalink / raw)
To: Martin Waitz, git
On 10/12/06, Martin Waitz <tali@admingilde.org> wrote:
> On Thu, Oct 12, 2006 at 03:37:17PM -0400, apodtele wrote:
> > Instead of conditionally scaling the stat graph for large changes,
> > always scale it asymptotically: small changes shall appear without any
> > distortions.
>
> very nice idea!
>
> > + return it * width / (it + width) + 1;
>
> but wouldn't this formula result in at least 1, even for a 0 change?
> Perhaps we'd have to special case an input of 0?
Corrected patch follows.
--- diff.c 2006-10-12 14:45:13.000000000 -0400
+++ diff.c 2006-10-12 17:32:15.000000000 -0400
@@ -637,15 +637,12 @@

 const char mime_boundary_leader[] = "------------";

-static int scale_linear(int it, int width, int max_change)
+static int scale_nonlinear(int it, int width)
 {
-	/*
-	 * make sure that at least one '-' is printed if there were deletions,
-	 * and likewise for '+'.
-	 */
-	if (max_change < 2)
-		return it;
-	return ((it - 1) * (width - 1) + max_change - 1) / (max_change - 1);
+	if (it)
+		return it * width / (it + width) + 1;
+	else
+		return 0;
 }

 static void show_name(const char *prefix, const char *name, int len,
@@ -776,11 +773,9 @@
 		adds += add;
 		dels += del;

-		if (width <= max_change) {
-			add = scale_linear(add, width, max_change);
-			del = scale_linear(del, width, max_change);
-			total = add + del;
-		}
+		add = scale_nonlinear(add, width / 2);
+		del = scale_nonlinear(del, width / 2);
+		total = add + del;
 		show_name(prefix, name, len, reset, set);
 		printf("%5d ", added + deleted);
 		show_graph('+', add, add_c, reset);
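As a rough illustration of what "asymptotic" means here: for it > 0, it / (it + width) is below 1, so the integer quotient it * width / (it + width) is at most width - 1, and the result never exceeds width. That is presumably why the call sites pass width / 2 for each side. The sketch below copies the corrected function; the helper `max_scaled` is hypothetical and only scans a range of inputs to confirm the bound. It assumes width > 0 and inputs small enough that it * width does not overflow an int.

```c
/* Zero-guarded formula, as in the corrected patch. */
int scale_nonlinear(int it, int width)
{
	if (it)
		return it * width / (it + width) + 1;
	else
		return 0;
}

/*
 * Hypothetical helper: largest scaled value seen for inputs 0..limit.
 * It is expected to level off at `width`, however large `limit` gets.
 */
int max_scaled(int width, int limit)
{
	int it, best = 0;

	for (it = 0; it <= limit; it++) {
		int scaled = scale_nonlinear(it, width);
		if (scaled > best)
			best = scaled;
	}
	return best;
}
```

For width 10, inputs 1, 10, 100 and 1000 map to 1, 6, 10 and 10: small values pass through unchanged, large ones saturate.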
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 19:37 [PATCH 1/2] diff --stat: use asymptotic scaling in graph apodtele
2006-10-12 20:16 ` Martin Waitz
@ 2006-10-12 21:53 ` Junio C Hamano
2006-10-12 22:15 ` A Large Angry SCM
` (2 more replies)
1 sibling, 3 replies; 16+ messages in thread
From: Junio C Hamano @ 2006-10-12 21:53 UTC (permalink / raw)
To: Alexei Podtelezhnikov; +Cc: git
apodtele <apodtele@gmail.com> writes:
> Instead of conditionally scaling the stat graph for large changes,
> always scale it asymptotically: small changes shall appear without any
> distortions.
>
> Signed-off-by: Alexei Podtelezhnikov
Missing e-mail address on S-o-b line. If your mail From: line
does not say who you are, please add an extra From: line in the
body, like this:
From: Alexei Podtelezhnikov <apodtele@gmail.com>
Subject: [PATCH 1/2] diff --stat: ...
Instead of ...
Signed-off-by: Alexei Podtelezhnikov <apodtele@gmail.com>
I am not sure if any non-linear scaling is worth pursuing.
Suppose your change set has three files modified:
A adds 20 lines, deletes 10 lines
B adds 10 lines, deletes 20 lines
C adds 30 lines, deletes 30 lines
When drawing into a specified width that leaves 20 columns for
the graph part, what would we see? What would we see if the
graph part is 21 columns wide? 59 columns wide? 80 columns wide?
For obvious reasons, the total length of A and B exceeds half of
C, which looks quite misleading.
A | ++++++++++++--------
B | ++++++++------------
C | +++++++++++++++---------------
We could align things in the middle, like this, though:
A |    ++++++++++++--------
B |        ++++++++------------
C | +++++++++++++++---------------
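The question about different graph widths can be answered by plugging the three files into the zero-guarded formula from the corrected patch earlier in the thread. This is only a sketch: `struct file_change` and `bar_length` are names invented here, and the width / 2 split mirrors the call sites in the patch.

```c
/* Illustrative record for one file in the change set. */
struct file_change {
	const char *name;
	int add;
	int del;
};

/* Zero-guarded formula, as in the corrected patch. */
int scale_nonlinear(int it, int width)
{
	if (it)
		return it * width / (it + width) + 1;
	else
		return 0;
}

/* Bar length for one file: each side is scaled into half the graph. */
int bar_length(const struct file_change *f, int graph_width)
{
	int half = graph_width / 2;

	return scale_nonlinear(f->add, half) + scale_nonlinear(f->del, half);
}
```

Under these assumptions a 20-column graph gives bars of 13, 13 and 16 columns for A, B and C, and an 80-column graph gives 23, 23 and 36. A 59-column graph gives 12+8, 8+12 and 15+15, which matches the bars drawn above. In each case A and B come out longer than half of C.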
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 21:53 ` Junio C Hamano
@ 2006-10-12 22:15 ` A Large Angry SCM
2006-10-12 22:24 ` Junio C Hamano
2006-10-13 13:56 ` apodtele
2 siblings, 0 replies; 16+ messages in thread
From: A Large Angry SCM @ 2006-10-12 22:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Alexei Podtelezhnikov, git
Junio C Hamano wrote:
[...]
>
> We could align things in the middle, like this, though:
>
> A |    ++++++++++++--------
> B |        ++++++++------------
> C | +++++++++++++++---------------
This is more diff-like and makes comparing the right and left side
changes easier.
A |        --------@++++++++++++    |
B |    ------------@++++++++        |
C | ---------------@+++++++++++++++ |
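A centered layout like this could be produced by padding each side to a fixed half width. The following is a sketch only, not code from git: `render_centered` is a hypothetical helper that writes one bar into a caller-supplied buffer, with deletions to the left of the '@' and additions to the right.

```c
#include <stddef.h>

/*
 * Write a bar such as "   ----@++++++  " into buf so that the '@'
 * always lands at index `half`. Returns 0 on success, or -1 if a
 * count does not fit in `half` columns or the buffer is too small.
 */
int render_centered(char *buf, size_t size, int del, int add, int half)
{
	size_t pos = 0;
	int i;

	if (del < 0 || add < 0 || del > half || add > half)
		return -1;
	/* half columns per side, the '@', and the terminating NUL */
	if (size < (size_t)(2 * half + 2))
		return -1;

	for (i = 0; i < half - del; i++)
		buf[pos++] = ' ';	/* left padding */
	for (i = 0; i < del; i++)
		buf[pos++] = '-';
	buf[pos++] = '@';
	for (i = 0; i < add; i++)
		buf[pos++] = '+';
	for (i = 0; i < half - add; i++)
		buf[pos++] = ' ';	/* right padding keeps the closing '|' aligned */
	buf[pos] = '\0';
	return 0;
}
```

Calling it with del = 8, add = 12 and half = 15 yields seven spaces, eight '-', the '@', twelve '+' and three trailing spaces, which corresponds to the row for A.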
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 21:37 ` apodtele
@ 2006-10-12 22:20 ` A Large Angry SCM
2006-10-12 22:27 ` Martin Waitz
0 siblings, 1 reply; 16+ messages in thread
From: A Large Angry SCM @ 2006-10-12 22:20 UTC (permalink / raw)
To: apodtele; +Cc: Martin Waitz, git
apodtele wrote:
> On 10/12/06, Martin Waitz <tali@admingilde.org> wrote:
>> On Thu, Oct 12, 2006 at 03:37:17PM -0400, apodtele wrote:
>> > Instead of conditionally scaling the stat graph for large changes,
>> > always scale it asymptotically: small changes shall appear without any
>> > distortions.
>>
>> very nice idea!
>>
>> > + return it * width / (it + width) + 1;
>>
>> but wouldn't this formula result in at least 1, even for a 0 change?
>> Perhaps we'd have to special case an input of 0?
[...]
> + if (it)
> + return it * width / (it + width) + 1;
> + else
> + return 0;
No conditional needed:
return it * width / (it + width - 1)
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 21:53 ` Junio C Hamano
2006-10-12 22:15 ` A Large Angry SCM
@ 2006-10-12 22:24 ` Junio C Hamano
2006-10-13 13:56 ` apodtele
2 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2006-10-12 22:24 UTC (permalink / raw)
To: Alexei Podtelezhnikov; +Cc: git
Junio C Hamano <junkio@cox.net> writes:
> Missing e-mail address on S-o-b line. If your mail From: line
> does not say who you are, please add an extra From: line in the
> body, like this:
>
> From: Alexei Podtelezhnikov <apodtele@gmail.com>
> Subject: [PATCH 1/2] diff --stat: ...
>
> Instead of ...
>
> Signed-off-by: Alexei Podtelezhnikov <apodtele@gmail.com>
Eh, no.
Sorry, what I meant was:
Not like this:
From: apodtele <apodtele@gmail.com>
Subject: [PATCH 1/2] diff --stat: ...
Instead of ...
Signed-off-by: Alexei Podtelezhnikov
But like this:
From: apodtele <apodtele@gmail.com>
Subject: [PATCH 1/2] diff --stat: ...
From: Alexei Podtelezhnikov <apodtele@gmail.com>
Instead of ...
Signed-off-by: Alexei Podtelezhnikov <apodtele@gmail.com>
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 22:20 ` A Large Angry SCM
@ 2006-10-12 22:27 ` Martin Waitz
2006-10-12 22:48 ` A Large Angry SCM
0 siblings, 1 reply; 16+ messages in thread
From: Martin Waitz @ 2006-10-12 22:27 UTC (permalink / raw)
To: A Large Angry SCM; +Cc: apodtele, git
hoi :)
On Thu, Oct 12, 2006 at 03:20:09PM -0700, A Large Angry SCM wrote:
> >+ if (it)
> >+ return it * width / (it + width) + 1;
> >+ else
> >+ return 0;
>
> No conditional needed:
>
> return it * width / (it + width - 1)
But then it would start scaling much earlier
(for width 10: at 2 instead of 4).
This is not bad per se, but different...
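The difference can be checked mechanically. In this sketch, `scale_conditional` and `scale_unconditional` are illustrative names for the two expressions quoted above, and `first_scaled` is a hypothetical helper that returns the smallest input whose output is already shortened.

```c
typedef int (*scale_fn)(int it, int width);

/* Variant with the explicit zero check. */
int scale_conditional(int it, int width)
{
	return it ? it * width / (it + width) + 1 : 0;
}

/*
 * Variant without the conditional. Note that it == 0 with width == 1
 * makes the divisor zero, so callers must avoid that combination.
 */
int scale_unconditional(int it, int width)
{
	return it * width / (it + width - 1);
}

/*
 * Smallest it >= 1 that is drawn shorter than it really is, or -1 if
 * none is found. Neither variant returns more than `width`, so
 * searching up to width + 1 is enough.
 */
int first_scaled(scale_fn fn, int width)
{
	int it;

	for (it = 1; it <= width + 1; it++)
		if (fn(it, width) < it)
			return it;
	return -1;
}
```

For width 10 this returns 4 for the conditional variant and 2 for the unconditional one, in line with the numbers given above.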
--
Martin Waitz
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 22:27 ` Martin Waitz
@ 2006-10-12 22:48 ` A Large Angry SCM
2006-10-12 22:52 ` Johannes Schindelin
2006-10-13 0:39 ` Nicolas Pitre
0 siblings, 2 replies; 16+ messages in thread
From: A Large Angry SCM @ 2006-10-12 22:48 UTC (permalink / raw)
To: Martin Waitz; +Cc: apodtele, git
Martin Waitz wrote:
> On Thu, Oct 12, 2006 at 03:20:09PM -0700, A Large Angry SCM wrote:
>>> + if (it)
>>> + return it * width / (it + width) + 1;
>>> + else
>>> + return 0;
>> No conditional needed:
>>
>> return it * width / (it + width - 1)
>
> But then it would start scaling much earlier
> (for width 10: at 2 instead of 4).
> This is not bad per se, but different...
>
OK:
return (it * width + (it + width)/2) / (it + width - 1)
Now it's back at 4. ;-)
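A quick comparison suggests the rounded expression agrees with the conditional version on where scaling starts, without being identical to it everywhere. The sketch below writes the rounded expression with balanced parentheses so that it compiles; the function names and the helper `first_difference` are invented for this illustration.

```c
/* Variant with the explicit zero check. */
int scale_conditional(int it, int width)
{
	return it ? it * width / (it + width) + 1 : 0;
}

/*
 * Rounded variant without a conditional. As with the previous
 * suggestion, it == 0 with width == 1 would divide by zero.
 */
int scale_rounded(int it, int width)
{
	return (it * width + (it + width) / 2) / (it + width - 1);
}

/* Smallest input in 0..limit where the two variants disagree, or -1. */
int first_difference(int width, int limit)
{
	int it;

	for (it = 0; it <= limit; it++)
		if (scale_conditional(it, width) != scale_rounded(it, width))
			return it;
	return -1;
}
```

For width 10 both map 3 to 3 and 4 to 3, so scaling does start at 4 again, but they first diverge at 7, where the conditional form gives 5 and the rounded form gives 4.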
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 22:48 ` A Large Angry SCM
@ 2006-10-12 22:52 ` Johannes Schindelin
2006-10-12 23:12 ` apodtele
2006-10-13 0:39 ` Nicolas Pitre
1 sibling, 1 reply; 16+ messages in thread
From: Johannes Schindelin @ 2006-10-12 22:52 UTC (permalink / raw)
To: git; +Cc: Martin Waitz, apodtele
Hi,
On Thu, 12 Oct 2006, A Large Angry SCM wrote:
> Martin Waitz wrote:
> > On Thu, Oct 12, 2006 at 03:20:09PM -0700, A Large Angry SCM wrote:
> > > > + if (it)
> > > > + return it * width / (it + width) + 1;
> > > > + else
> > > > + return 0;
> > > No conditional needed:
> > >
> > > return it * width / (it + width - 1)
> >
> > But then it would start scaling much earlier
> > (for width 10: at 2 instead of 4).
> > This is not bad per se, but different...
> >
>
> OK:
> return (it * width + (it + width)/2) / (it + width - 1)
>
> Now it's back at 4. ;-)
Am I the only one finding non-linear diffstat ugly and misleading?
Ciao,
Dscho
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 22:52 ` Johannes Schindelin
@ 2006-10-12 23:12 ` apodtele
0 siblings, 0 replies; 16+ messages in thread
From: apodtele @ 2006-10-12 23:12 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Martin Waitz
On 10/12/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Am I the only one finding non-linear diffstat ugly and misleading?
Well, the scaling I propose _is_ linear for small changes. More
importantly, the existing scheme is not linear across the diffs
either. Different stats may _look_ the same but be very different in
size in the existing scheme already. My proposal is invariant across
diff stats. Junio's argument that a change of 30 doesn't look like a
half of 60 is valid, of course. Does anyone really check this with a
ruler?
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 22:48 ` A Large Angry SCM
2006-10-12 22:52 ` Johannes Schindelin
@ 2006-10-13 0:39 ` Nicolas Pitre
2006-10-13 13:25 ` apodtele
1 sibling, 1 reply; 16+ messages in thread
From: Nicolas Pitre @ 2006-10-13 0:39 UTC (permalink / raw)
To: A Large Angry SCM; +Cc: Martin Waitz, apodtele, git
On Thu, 12 Oct 2006, A Large Angry SCM wrote:
> Martin Waitz wrote:
> > On Thu, Oct 12, 2006 at 03:20:09PM -0700, A Large Angry SCM wrote:
> > > > + if (it)
> > > > + return it * width / (it + width) + 1;
> > > > + else
> > > > + return 0;
> > > No conditional needed:
> > >
> > > return it * width / (it + width - 1)
> >
> > But then it would start scaling much earlier
> > (for width 10: at 2 instead of 4).
> > This is not bad per se, but different...
> >
>
> OK:
> return (it * width + (it + width)/2) / (it + width - 1)
>
> Now it's back at 4. ;-)
Sure, but at this point the original conditional is probably more
efficient.
Nicolas
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-13 0:39 ` Nicolas Pitre
@ 2006-10-13 13:25 ` apodtele
2006-10-13 13:31 ` Andy Whitcroft
0 siblings, 1 reply; 16+ messages in thread
From: apodtele @ 2006-10-13 13:25 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: A Large Angry SCM, Martin Waitz, git
Hi!
On 10/12/06, Nicolas Pitre <nico@cam.org> wrote:
> On Thu, 12 Oct 2006, A Large Angry SCM wrote:
> > Martin Waitz wrote:
> > > On Thu, Oct 12, 2006 at 03:20:09PM -0700, A Large Angry SCM wrote:
> > > > > + if (it)
> > > > > + return it * width / (it + width) + 1;
> > > > > + else
> > > > > + return 0;
> > > > No conditional needed:
> > > >
> > > > return it * width / (it + width - 1)
> > >
> > > But then it would start scaling much earlier
> > > (for width 10: at 2 instead of 4).
> > > This is not bad per se, but different...
> > >
> >
> > OK:
> > return (it * width + (it + width)/2) / (it + width - 1)
> >
> > Now it's back at 4. ;-)
>
> Sure, but at this point the original conditional is probably more
> efficient.
>
Don't make me use
return it * width / (it + width) + !!it;
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-13 13:25 ` apodtele
@ 2006-10-13 13:31 ` Andy Whitcroft
0 siblings, 0 replies; 16+ messages in thread
From: Andy Whitcroft @ 2006-10-13 13:31 UTC (permalink / raw)
To: apodtele; +Cc: Nicolas Pitre, A Large Angry SCM, Martin Waitz, git
apodtele wrote:
> Hi!
>
> On 10/12/06, Nicolas Pitre <nico@cam.org> wrote:
>> On Thu, 12 Oct 2006, A Large Angry SCM wrote:
>> > Martin Waitz wrote:
>> > > On Thu, Oct 12, 2006 at 03:20:09PM -0700, A Large Angry SCM wrote:
>> > > > > + if (it)
>> > > > > + return it * width / (it + width) + 1;
>> > > > > + else
>> > > > > + return 0;
>> > > > No conditional needed:
>> > > >
>> > > > return it * width / (it + width - 1)
>> > >
>> > > But then it would start scaling much earlier
>> > > (for width 10: at 2 instead of 4).
>> > > This is not bad per se, but different...
>> > >
>> >
>> > OK:
>> > return (it * width + (it + width)/2) / (it + width - 1)
>> >
>> > Now it's back at 4. ;-)
>>
>> Sure, but at this point the original conditional is probably more
>> efficient.
>>
>
> Don't make me use
> return it * width / (it + width) + !!it;
return it * width / (it + width) + (it != 0)
Perhaps?
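In C, both `!!it` and `(it != 0)` evaluate to an int that is 0 or 1, so either spelling should behave like the explicit if/else. A small sketch to confirm this, with function names chosen only for illustration and assuming a positive width:

```c
/* Explicit conditional, as in the corrected patch. */
int scale_if(int it, int width)
{
	if (it)
		return it * width / (it + width) + 1;
	return 0;
}

/* Double negation turns any nonzero value into 1. */
int scale_bangbang(int it, int width)
{
	return it * width / (it + width) + !!it;
}

/* The comparison yields the same 0 or 1, arguably more readably. */
int scale_neq(int it, int width)
{
	return it * width / (it + width) + (it != 0);
}

/* Returns 1 if all three spellings agree for every input in 0..max_it. */
int variants_agree(int max_it, int width)
{
	int it;

	for (it = 0; it <= max_it; it++) {
		int expected = scale_if(it, width);
		if (scale_bangbang(it, width) != expected ||
		    scale_neq(it, width) != expected)
			return 0;
	}
	return 1;
}
```

The choice between them is then purely a matter of style.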
-apw
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-12 21:53 ` Junio C Hamano
2006-10-12 22:15 ` A Large Angry SCM
2006-10-12 22:24 ` Junio C Hamano
@ 2006-10-13 13:56 ` apodtele
2006-10-14 19:06 ` Junio C Hamano
2 siblings, 1 reply; 16+ messages in thread
From: apodtele @ 2006-10-13 13:56 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 10/12/06, Junio C Hamano <junkio@cox.net> wrote:
> apodtele <apodtele@gmail.com> writes:
> > Instead of conditionally scaling the stat graph for large changes,
> > always scale it asymptotically: small changes shall appear without any
> > distortions.
> I am not sure if any non-linear scaling is worth pursuing.
> Suppose your change set has three files modified:
>
> A adds 20 lines, deletes 10 lines
> B adds 10 lines, deletes 20 lines
> C adds 30 lines, deletes 30 lines
>
> For obvious reasons, the total length of A and B exceeds half of
> C, which looks quite misleading.
>
> A | ++++++++++++--------
> B | ++++++++------------
> C | +++++++++++++++---------------
Before my patch is completely forgotten, let me critique the current
approach. Currently everything is great and beautiful unless one
particular change adds a couple of hundred lines, say, to a man page.
With large changes in play, small changes are squashed to a single
character. Would you argue that this scenario correctly represents the
importance of man pages? Would you say that it's not misleading that
1-, 2-, and 5-liners all look the same as long as a man page is
prominently shown? Moreover, 1-, 2-, and 5- liners may look different
depending on the size of that man page. The current approach is not
invariant; it is, however, normalized as needed. "Normalized" is good,
"as needed" is bad.
With asymptotic scaling, 1-, 2-, and 5- liners are correctly
represented by a correct number of characters, regardless of the size
of that man page. 10- and 20- liners are _slightly_ distorted. I
cannot stress it more: the representation will not depend on the size
of changes in other files! You will be able to tell where truly large
changes happened too! The price for this is that you won't be able to
precisely compare the sizes of added man pages.
It is your choice...
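The invariance argument can be sketched in code. `scale_linear` below is the function removed by the patch and `scale_nonlinear` is the corrected replacement; `old_bar` is a hypothetical wrapper that mimics the removed `if (width <= max_change)` check for one side of the graph. The widths and change sizes are arbitrary examples.

```c
/* Existing scaling, as removed by the patch. */
int scale_linear(int it, int width, int max_change)
{
	if (max_change < 2)
		return it;
	return ((it - 1) * (width - 1) + max_change - 1) / (max_change - 1);
}

/* Proposed scaling, zero-guarded as in the corrected patch. */
int scale_nonlinear(int it, int width)
{
	if (it)
		return it * width / (it + width) + 1;
	else
		return 0;
}

/*
 * Hypothetical wrapper: the old code scaled only when the largest
 * change in the set did not fit, so the result depends on max_change.
 */
int old_bar(int it, int width, int max_change)
{
	if (width <= max_change)
		return scale_linear(it, width, max_change);
	return it;
}
```

With an 80-column graph, the old scheme draws a 5-line change as 5, 4 or 1 characters when the largest change in the set is 50, 100 or 800 lines, and 1-, 2- and 5-line changes all collapse to a single character in the last case. With the proposed scaling and 40 columns per side, the same inputs give 1, 2 and 5 whatever else is in the set.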
* Re: [PATCH 1/2] diff --stat: use asymptotic scaling in graph
2006-10-13 13:56 ` apodtele
@ 2006-10-14 19:06 ` Junio C Hamano
0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2006-10-14 19:06 UTC (permalink / raw)
To: apodtele; +Cc: git
apodtele <apodtele@gmail.com> writes:
> Before my patch is completely forgotten, let me critique the current
> approach. Currently everything is great and beautiful unless one
> particular change adds a couple of hundred lines, say, to a man page.
> With large changes in play, small changes are squashed to a single
> character. Would you argue that this scenario correctly represents the
> importance of man pages? Would you say that it's not misleading that
> 1-, 2-, and 5-liners all look the same as long as a man page is
> prominently shown? Moreover, 1-, 2-, and 5- liners may look different
> depending on the size of that man page. The current approach is not
> invariant; it is, however, normalized as needed. "Normalized" is good,
> "as needed" is bad.
One thing that mildly irritates me has been:
git log --stat v2.6.17..
which, as you correctly point out, shows the bad effect of
scaling per commit. "Normalized as needed" is good. What's bad
is "not normalizing across things we show".
Even with your non-linear scaling, you would need to make sure
every commit gets the same graph width; I do not think they
currently do, due to name part scaling.
People are used to seeing the traditional diffstat output, so
any improvement you make that is different from it (including
e.g. "being able to show differences between 1- and 2- liner
patch when a monster 800- liner happens to be in the same patch
set", which is a worthwhile goal) will look bizarre and/or
misleading to them and they would not like it.
With the change to align things in the middle, it might become
easier to accept, because then it is _so_ obviously different
from traditional diffstat, it is very clear to people that the
output is different but still they can easily figure out that
longer bars are for larger changes.
And this new output needs to be an option.