From: Rogan Dawes <lists@dawes.za.net>
To: Andy Parkins <andyparkins@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] Show binary file size change in diff --stat
Date: Wed, 04 Apr 2007 17:51:24 +0200
Message-ID: <4613C97C.9050600@dawes.za.net>
In-Reply-To: <200704041540.59977.andyparkins@gmail.com>

Andy Parkins wrote:
> On Wednesday 2007 April 04 14:34, Rogan Dawes wrote:
> 
>> Well, how about my comments in <45E67978.9030805@dawes.za.net>,
>> suggesting that the edit difference (number of steps required to
>> transform one to the other) would be a better indication?
> 
> Perhaps.  There is certainly a difference between:
> 
>  somefile.bin  | 1000 -> 1000 bytes
> 
> and
> 
>  somefile.bin  | 500 bytes removed, 500 bytes added
> 
>> I think it is better because it is consistent with what we currently do
>> for text files: show the number of lines added/deleted.
> 
> The thing is, "lines" is an understandable unit for a text file, so it's 
> useful to show.  I'm not sure the same is true of "bytes" for a binary file.  
> Those bytes could represent anything; the true unit of a binary file is 
> dependent on its type.

I think bytes are the only reasonable unit for a binary file, since we
have no idea what a meaningful divisor may be. So defaulting to the
smallest possible unit (short of going to the bit level) makes perfect
sense.
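
To make the two candidate summaries concrete, here is a minimal Python
sketch that produces both for a pair of byte strings. It is only an
illustration: it uses Python's difflib rather than libxdiff, the
function names and sample data are invented for this example, and
difflib's matching is a heuristic, so the counts are not guaranteed to
be a minimal edit distance.

```python
import difflib


def size_change(old: bytes, new: bytes) -> str:
    """Summary in the 'OLD -> NEW bytes' style (sizes only)."""
    return f"{len(old)} -> {len(new)} bytes"


def byte_stat(old: bytes, new: bytes) -> tuple:
    """Return (bytes removed, bytes added) as reported by difflib.

    autojunk=False turns off difflib's "popular element" heuristic,
    which would otherwise ignore frequently occurring byte values.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1  # bytes only in the old content
            added += j2 - j1    # bytes only in the new content
    return removed, added


# Made-up content: same length, but the middle 50 bytes are replaced.
old = bytes(range(200))
new = old[:100] + bytes(range(200, 250)) + old[150:]

print(f"somefile.bin | {size_change(old, new)}")
removed, added = byte_stat(old, new)
print(f"somefile.bin | {removed} bytes removed, {added} bytes added")
```

The size-only summary cannot distinguish this pair from an unchanged
file of the same length, while the removed/added counts can.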

>> For binary files, it would be consistent to show the number of bytes
>> added/deleted. I have not investigated the output format for the
>> libxdiff binary patch format, but hopefully it would not be too
>> difficult to calculate the deletions and additions.
> 
> I'm inclined to agree with Johannes, while it's certainly something 
> that /could/ be shown - is it more useful?  There is no guarantee that a 
> small change in the underlying content is represented by a small change in 
> the binary diff.
> 
> As an example: compress a file, change a byte, compress it again, perform a 
> binary diff; what is that diff telling you about the change?  (My answer is: 
> not much).

Well, at least as much as the resulting sizes tell you, if not more.
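
The compression example can be tried with a short Python sketch. The
assumptions are mine: zlib stands in for the compressor, difflib stands
in for the binary diff, and the sample data and names are made up. The
counts for the compressed pair depend on the zlib build, so none are
stated here.

```python
import difflib
import zlib


def byte_stat(old: bytes, new: bytes) -> tuple:
    """Return (bytes removed, bytes added) as reported by difflib."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# Made-up sample content: 200 numbered lines of text.
plain_old = b"".join(b"line %d\n" % i for i in range(200))

# Change a single byte in the middle of the content.
mid = len(plain_old) // 2
plain_new = plain_old[:mid] + b"#" + plain_old[mid + 1:]

# Compress both versions, then compare the compressed streams.
packed_old = zlib.compress(plain_old)
packed_new = zlib.compress(plain_new)

print("uncompressed:", byte_stat(plain_old, plain_new))
print("compressed sizes:", len(packed_old), "->", len(packed_new))
print("compressed:", byte_stat(packed_old, packed_new))
```

Comparing the last two printed lines shows how much the byte counts add
over the bare sizes for a particular pair of compressed files.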

Here is a counterexample for a text file, where lines changed do not
actually reflect the real changes in the file: the contents of an XML
file being wrapped in an additional tag.

Semantically, all that has changed is an opening and closing tag. But
on a line-by-line basis we still show that the entire file has changed
(because the indentation changes). So you'd have n lines deleted, and
n+2 lines added (for the additional opening and closing tag).
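
The n versus n+2 arithmetic can be checked with a small Python sketch.
The XML content and the function name are invented for illustration,
and difflib is used here in place of git's own line diff.

```python
import difflib


def line_stat(old_lines, new_lines):
    """Return (lines removed, lines added) as reported by difflib."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines,
                                      autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# Made-up XML file with n = 3 lines.
original = [
    "<entry>",
    "  <name>foo</name>",
    "</entry>",
]

# Wrap the contents in an additional tag and re-indent every line.
wrapped = ["<entries>"] + ["  " + line for line in original] + ["</entries>"]

removed, added = line_stat(original, wrapped)
print(f"{removed} lines removed, {added} lines added")
# prints "3 lines removed, 5 lines added"
```

Because every original line gains two leading spaces, no line compares
equal, so all n lines count as deleted and all n+2 as added.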

> Andy

I still maintain that showing bytes changed is the only consistent thing 
to do, unless we have additional logic that allows us to do "per 
file-type" diff statistics. Maybe .gitattributes will allow/enable this?
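
As a rough idea of what "per file-type" statistics could look like,
here is a Python sketch that picks the unit from the file name. The
rule table, the patterns and all names are hypothetical; this is not
how .gitattributes works, only an illustration of the dispatch idea.

```python
import difflib
from fnmatch import fnmatch


def count_changes(old_units, new_units):
    """Return (units removed, units added) as reported by difflib."""
    matcher = difflib.SequenceMatcher(None, old_units, new_units,
                                      autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


def as_lines(data: bytes):
    return data.splitlines()


def as_bytes(data: bytes):
    return data


# Hypothetical rules: (pattern, unit name, splitter); first match wins.
UNIT_RULES = [
    ("*.xml", "lines", as_lines),
    ("*.txt", "lines", as_lines),
    ("*", "bytes", as_bytes),  # fallback: smallest sensible unit
]


def stat_for(path: str, old: bytes, new: bytes) -> str:
    for pattern, unit, split in UNIT_RULES:
        if fnmatch(path, pattern):
            removed, added = count_changes(split(old), split(new))
            return f"{path} | {removed} {unit} removed, {added} {unit} added"
    raise ValueError(f"no rule matches {path!r}")


print(stat_for("notes.txt", b"a\nb\n", b"a\nc\n"))
print(stat_for("somefile.bin", b"\x00\x01\x02", b"\x00\x01\x03"))
```

Anything without a more specific rule falls back to bytes, which keeps
the output consistent across file types.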

Regards,

Rogan

P.S. I'm not volunteering to inflict my novice C skills on the git
community, so this is really "just my 2c".

