All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <junkio@cox.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Rogan Dawes <lists@dawes.za.net>,
	Andy Parkins <andyparkins@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH] Show binary file size change in diff --stat
Date: Wed, 04 Apr 2007 10:47:56 -0700	[thread overview]
Message-ID: <7vbqi458vn.fsf@assigned-by-dhcp.cox.net> (raw)
In-Reply-To: <Pine.LNX.4.64.0704040935350.6730@woody.linux-foundation.org> (Linus Torvalds's message of "Wed, 4 Apr 2007 09:40:13 -0700 (PDT)")

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 4 Apr 2007, Johannes Schindelin wrote:
>> 
>> The subtle difference: your approach is _expensive_ in terms of CPU time, 
>> while the byte change approach is _dirt cheap_.
>
> Well, you could do a combination (still dirt cheap):
>  - show the size before/after (and yes, new/delete should be separate from 
>    "zero size before/after")
>  - show the size of the binary patch.
>
> No "X added bytes" vs "Y bytes deleted", just "size of binary patch". It 
> could be really small, even if 10k was deleted, or the file was totally 
> re-organized by moving chunks around.
>
> It would still be a meaningful thing to know - if only because it tells 
> you how much space the delta takes.

I agree wrt the kind of information to give, except that I am
not so sure about new/delete vs zero before/after.  We do not do
that for a text file, and when people do care about the
distinction, they would use --summary.

I often have wished that we could make --stat imply --summary;
the only reason we did not do that is because the --stat option
started its life as an imitation of "diffstat".

I've seen our diff-delta change its output size once.  It was a
nice improvement to make the delta much smaller than before, but
I had to rewrite the rename similarity in diffcore not to depend
on the diff-delta algorithm change, to keep it stable across
diff-delta improvements (the alternative was to futz with the
default threshold).  I suspect we might see similar confusion if
we show the delta size, depending on people's expectations.
This is a very minor issue, but I thought I should mention it.

There is a machine readable output format for the same --stat
information called --numstat.  It currently signals the
binary-ness by showing '-' instead of line count.  We could
extend it by showing '-' + number of bytes.

So here are some more suggestions:

 (1) --stat for binary files to show preimage and postimage
     sizes like this (if we were to do delta size -- otherwise
     drop " (.*" at the end):

	penguin.jpg |  Bin 745245 -> 660689 (delta: 4434)

 (2) --numstat for binary files to show preimage and postimage
     sizes like this:

	penguin.jpg	-745245	-660689

 (3) independent from all of the above, make --stat imply
     --summary and perhaps introduce --no-summary if people do not
     want --summary given when they say --stat;

  parent reply	other threads:[~2007-04-04 17:48 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-04-04 13:14 [PATCH] Show binary file size change in diff --stat Andy Parkins
2007-04-04 13:34 ` Rogan Dawes
2007-04-04 14:40   ` Geert Bosch
2007-04-04 16:00     ` Rogan Dawes
2007-04-04 14:40   ` Andy Parkins
2007-04-04 15:51     ` Rogan Dawes
2007-04-04 16:22       ` Johannes Schindelin
2007-04-04 16:26         ` Rogan Dawes
2007-04-04 16:40         ` Linus Torvalds
2007-04-04 16:59           ` Johannes Schindelin
2007-04-04 17:12             ` Linus Torvalds
2007-04-04 17:47           ` Junio C Hamano [this message]
  -- strict thread matches above, loose matches on Subject: below --
2007-02-28 13:03 Andy Parkins
2007-02-28 14:44 ` Johannes Schindelin
2007-02-28 14:51   ` Nicolas Pitre
2007-02-28 15:15   ` Andy Parkins
2007-02-28 15:37     ` Johannes Schindelin
2007-02-28 18:42       ` Andy Parkins
2007-02-28 19:41         ` Johannes Schindelin
2007-02-28 15:26   ` Andy Parkins
2007-02-28 18:58 ` Rogan Dawes
2007-02-28 19:42   ` Johannes Schindelin
2007-02-28 21:27     ` Rogan Dawes
2007-03-01  1:09       ` Johannes Schindelin
2007-03-01  6:58         ` Rogan Dawes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vbqi458vn.fsf@assigned-by-dhcp.cox.net \
    --to=junkio@cox.net \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=andyparkins@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=lists@dawes.za.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.