git.vger.kernel.org archive mirror
From: Rogan Dawes <lists@dawes.za.net>
To: Andy Parkins <andyparkins@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] Show binary file size change in diff --stat
Date: Wed, 04 Apr 2007 17:51:24 +0200
Message-ID: <4613C97C.9050600@dawes.za.net>
In-Reply-To: <200704041540.59977.andyparkins@gmail.com>

Andy Parkins wrote:
> On Wednesday 2007 April 04 14:34, Rogan Dawes wrote:
> 
>> Well, how about my comments in <45E67978.9030805@dawes.za.net>,
>> suggesting that the edit difference (number of steps required to
>> transform one to the other) would be a better indication?
> 
> Perhaps.  There is certainly a difference between:
> 
>  somefile.bin  | 1000 -> 1000 bytes
> 
> and
> 
>  somefile.bin  | 500 bytes removed, 500 bytes added
> 
>> I think it is better because it is consistent with what we currently do
>> for text files: show the number of lines added/deleted.
> 
> The thing is, "lines" is an understandable unit for a text file, so it's 
> useful to show.  I'm not sure the same is true of "bytes" for a binary file.  
> Those bytes could represent anything; the true unit of a binary file is 
> dependent on its type.

I think bytes are the only reasonable unit for a binary file, since we 
have no idea what a meaningful divisor may be. So, defaulting to the 
smallest possible unit (other than going to the bit-level) makes perfect 
sense.
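
To make the two output styles quoted above concrete, here is a rough
sketch in Python of counting bytes removed and added. It uses
difflib.SequenceMatcher purely as a stand-in; it is not libxdiff and
not git's binary patch format, and the helper name byte_diffstat is
made up for illustration.

```python
from difflib import SequenceMatcher


def byte_diffstat(old: bytes, new: bytes) -> tuple[int, int]:
    """Return (bytes_removed, bytes_added) between two byte strings."""
    removed = added = 0
    # autojunk=False stops the heuristic from ignoring frequent byte values.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return removed, added


# Hypothetical file contents: same size before and after, half rewritten.
old = b"A" * 500 + b"B" * 500
new = b"A" * 500 + b"C" * 500
removed, added = byte_diffstat(old, new)
print(f"somefile.bin | {len(old)} -> {len(new)} bytes")
print(f"somefile.bin | {removed} bytes removed, {added} bytes added")
```

The size-only line cannot tell this change apart from no change at
all, while the removed/added counts can.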

>> For binary files, it would be consistent to show the number of bytes
>> added/deleted. I have not investigated the output format for the
>> libxdiff binary patch format, but hopefully it would not be too
>> difficult to calculate the deletions and additions.
> 
> I'm inclined to agree with Johannes, while it's certainly something 
> that /could/ be shown - is it more useful?  There is no guarantee that a 
> small change in the underlying content is represented by a small change in 
> the binary diff.
> 
> As an example: compress a file, change a byte, compress it again, perform a 
> binary diff; what is that diff telling you about the change?  (My answer is: 
> not much).

Well, at least as much as the resulting sizes tell you, if not more.
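
The compress/change/compress experiment is easy to try. The sketch
below assumes zlib as the compressor and a made-up payload; the helper
differing_bytes is hypothetical and only does a naive positional
comparison, not a real binary diff. How many compressed bytes end up
differing depends on the compressor and the input, so no particular
number is claimed here.

```python
import zlib


def differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where a and b differ, plus any length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


# Made-up payload: 200 short text lines.
original = b"".join(b"line %d of some payload\n" % i for i in range(200))

# Change exactly one byte of the uncompressed content.
changed = bytearray(original)
changed[10] ^= 0x01
modified = bytes(changed)

packed_old = zlib.compress(original)
packed_new = zlib.compress(modified)

print("uncompressed bytes differing:", differing_bytes(original, modified))
print("compressed sizes:", len(packed_old), "->", len(packed_new))
print("compressed bytes differing:", differing_bytes(packed_old, packed_new))
```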

Here is a counterexample for a text file, where the lines changed do
not actually reflect the real changes in the file: the contents of an
XML file being wrapped in an additional tag.

Semantically, all that has changed is the addition of an opening and a
closing tag. But on a line-by-line basis we still show that the entire
file has changed (because the indentation changes). So you'd have n
lines deleted and n+2 lines added (the extra two being the new opening
and closing tags).
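
A small sketch of the XML case, again in Python with difflib standing
in for the real diff machinery (the sample document and the helper
name line_diffstat are invented for illustration):

```python
from difflib import SequenceMatcher


def line_diffstat(old: list[str], new: list[str]) -> tuple[int, int]:
    """Return (lines_deleted, lines_added) between two lists of lines."""
    deleted = added = 0
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return deleted, added


before = [
    "<items>",
    "  <item>a</item>",
    "  <item>b</item>",
    "</items>",
]

# Wrap the whole document in a new tag and re-indent the old content.
after = ["<root>"] + ["  " + line for line in before] + ["</root>"]

deleted, added = line_diffstat(before, after)
print(f"{deleted} lines deleted, {added} lines added")
```

With n = 4 input lines, every line is reported as deleted and n + 2 = 6
lines as added, even though only two tags are new.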

> Andy

I still maintain that showing bytes changed is the only consistent thing 
to do, unless we have additional logic that allows us to do "per 
file-type" diff statistics. Maybe .gitattributes will allow/enable this?

Regards,

Rogan

P.S. I'm not volunteering to inflict my novice C skills on the git
community, so this is really "just my 2c".
