From: Rogan Dawes <lists@dawes.za.net>
To: Geert Bosch <bosch@adacore.com>
Cc: Andy Parkins <andyparkins@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] Show binary file size change in diff --stat
Date: Wed, 04 Apr 2007 18:00:07 +0200
Message-ID: <4613CB87.2090306@dawes.za.net>
In-Reply-To: <BAFDAA7B-60EC-44FD-8DAA-EB2F9C835F51@adacore.com>

Geert Bosch wrote:
> 
> On Apr 4, 2007, at 09:34, Rogan Dawes wrote:
> 
>> For binary files, it would be consistent to show the number of bytes 
>> added/deleted. I have not investigated the output format for the 
>> libxdiff binary patch format, but hopefully it would not be too 
>> difficult to calculate the deletions and additions.
> 
> For binary files it is impractical to do insert/delete type of differences.
> For text files, treating lines as indivisible entities to insert/delete
> makes some sense. For binary files, you'd have to use some arbitrary
> context-defined breakpoints and then go from there. The result would
> be some very complicated and unclear algorithm that would have no use
> in the real world.
> 
> Many binary files, such as images, waveforms or virtually any compressed
> stream, can change in a way that changes all bytes in the file, while
> the changes in the displayed image or the uncompressed stream are
> imperceptible or absent. Guessing semantic differences between binary
> blobs is hopeless and subjective, while differences in size are fact.
> 
>   -Geert

As per my mail to Andy, we *already* report non-semantic changes for 
text files. For example, wrap an XML document in an additional tag, and 
update the indentation to match.

The semantic change is minimal (perhaps 2 new lines), but the reported 
change reflects n lines deleted, and n+2 added.
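
To illustrate the arithmetic, here is a minimal sketch in Python. It uses the standard difflib module rather than git's libxdiff, so it only approximates what diff --stat would report; line_stat and the sample XML are hypothetical names made up for this example.

```python
import difflib


def line_stat(old: str, new: str) -> tuple[int, int]:
    """Return (deleted, added) line counts between two texts."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    deleted = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            deleted += i2 - i1  # lines present only in the old text
            added += j2 - j1    # lines present only in the new text
    return deleted, added


# A three-line document (n = 3) ...
OLD_XML = "<item>\n  <name>x</name>\n</item>\n"
# ... wrapped in one extra tag, with the indentation updated to match.
NEW_XML = (
    "<wrapper>\n"
    "  <item>\n"
    "    <name>x</name>\n"
    "  </item>\n"
    "</wrapper>\n"
)

if __name__ == "__main__":
    deleted, added = line_stat(OLD_XML, NEW_XML)
    print(f"{deleted} lines deleted, {added} lines added")
```

Because every original line changes its indentation, none of the old lines match a new line, so all n lines count as deleted and n+2 as added, even though the only semantic change is the new wrapper tag.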

Precisely because we *don't* do any semantic analysis (for text or 
binary files), we should simply report the number of bytes changed, just 
as we report the number of lines changed for text files. This is 
_consistent_ with what we do currently for text files.

Note that Andy's apparent preference (to know how the sizes have 
changed) can still largely be satisfied by this approach.

  somefile.bin  | 1000 -> 1000 bytes

versus

  somefile.bin  | 500 bytes removed, 500 bytes added

You can still see that the overall size of the file has not changed, but 
you also get the information about how many bytes were actually changed, 
which you don't get by showing only the sizes.
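
As a rough sketch of how the two output styles compare, the following Python treats each byte as an indivisible unit, the way lines are treated for text. This is an assumption made for illustration only: it is not git's or libxdiff's binary delta code, difflib is quadratic in the worst case and unsuitable for large blobs, and byte_stat, size_line and change_line are hypothetical helpers.

```python
import difflib


def byte_stat(old: bytes, new: bytes) -> tuple[int, int]:
    """Return (removed, added) byte counts between two blobs."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1  # bytes present only in the old blob
            added += j2 - j1    # bytes present only in the new blob
    return removed, added


def size_line(name: str, old: bytes, new: bytes) -> str:
    """Size-only style: shows the old and new file sizes."""
    return f"  {name}  | {len(old)} -> {len(new)} bytes"


def change_line(name: str, old: bytes, new: bytes) -> str:
    """Bytes-changed style: shows how many bytes were removed and added."""
    removed, added = byte_stat(old, new)
    return f"  {name}  | {removed} bytes removed, {added} bytes added"


# Two 1000-byte blobs that share their first half and differ in the second.
OLD_BLOB = b"A" * 500 + b"B" * 500
NEW_BLOB = b"A" * 500 + b"C" * 500

if __name__ == "__main__":
    print(size_line("somefile.bin", OLD_BLOB, NEW_BLOB))
    print(change_line("somefile.bin", OLD_BLOB, NEW_BLOB))
```

For these sample blobs the first helper prints the 1000 -> 1000 line and the second prints the 500 removed, 500 added line, so the second style carries the size information (removed equals added) plus the extent of the change.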

Rogan
