From: Rogan Dawes <lists@dawes.za.net>
To: Geert Bosch <bosch@adacore.com>
Cc: Andy Parkins <andyparkins@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] Show binary file size change in diff --stat
Date: Wed, 04 Apr 2007 18:00:07 +0200
Message-ID: <4613CB87.2090306@dawes.za.net>
In-Reply-To: <BAFDAA7B-60EC-44FD-8DAA-EB2F9C835F51@adacore.com>
Geert Bosch wrote:
>
> On Apr 4, 2007, at 09:34, Rogan Dawes wrote:
>
>> For binary files, it would be consistent to show the number of bytes
>> added/deleted. I have not investigated the output format for the
>> libxdiff binary patch format, but hopefully it would not be too
>> difficult to calculate the deletions and additions.
>
> For binary files it is impractical to do insert/delete type of differences.
> For text files, treating lines as indivisible entities to insert/delete
> makes some sense. For binary files, you'd have to use some arbitrary
> context-defined breakpoints and then go from there. The result would
> be some very complicated and unclear algorithm that would have no use
> in the real world.
>
> Many binary files, such as images, waveforms or virtually any compressed
> stream, can change in a way that changes all bytes in the file, while
> the changes in the displayed image or the uncompressed stream are
> imperceptible or absent. Guessing semantic differences between binary
> blobs is hopeless and subjective, while differences in size are fact.
>
> -Geert
As per my mail to Andy, we *already* do this for text files. For example,
wrap an XML document in an additional tag, and update the indentation to
match. The semantic change is minimal (perhaps 2 new lines), but the
reported change is n lines deleted and n+2 lines added.
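
To make the XML case concrete, here is a rough sketch that counts deleted and added lines the way a line-based diff would. It uses Python's difflib as a stand-in, not git's own diff machinery, and the name line_stat and the sample documents are made up for illustration.

```python
import difflib


def line_stat(old_text, new_text):
    """Return (deleted, added) line counts for a purely line-based diff."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # autojunk=False turns off difflib's popularity heuristic.
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    deleted = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            deleted += i2 - i1  # lines only present in the old text
            added += j2 - j1    # lines only present in the new text
    return deleted, added


# Hypothetical document: n = 3 lines before the change.
OLD = "<a>\n  <b/>\n</a>\n"
# Same document wrapped in <root>, with every line re-indented.
NEW = "<root>\n  <a>\n    <b/>\n  </a>\n</root>\n"

print(line_stat(OLD, NEW))  # (3, 5): n deleted, n+2 added
```

Every original line counts as deleted because its indentation changed, even though only the two wrapper lines are semantically new.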
Exactly because we *don't* do any semantic analysis (for text or binary
files), we should simply report the number of bytes changed, exactly as
we do for text files (reporting number of lines changed). This is
_consistent_ with what we do currently for text files.
Note that Andy's apparent preference (to know how the sizes have
changed) can still largely be satisfied by this approach. Compare

    somefile.bin | 1000 -> 1000 bytes

and

    somefile.bin | 500 bytes removed, 500 bytes added

You can still see that the overall size of the file has not changed, but
you get the additional information about how many bytes were actually
changed at the same time, which you don't get just showing the sizes.
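
As a minimal sketch of how such counts could be produced, the snippet below tallies bytes removed and added between two blobs. It assumes Python's difflib as a stand-in for the libxdiff binary delta, whose output format I have not looked at, so the numbers a real implementation reports may differ. The names byte_stat and stat_line and the sample blobs are hypothetical.

```python
import difflib


def byte_stat(old, new):
    """Return (removed, added) byte counts between two bytes objects."""
    # Each byte is treated as one sequence element, the way a text diff
    # treats each line. This is quadratic in the worst case, so it is
    # only suitable for small illustrative inputs.
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


def stat_line(name, old, new):
    """Format one stat line in the style proposed above."""
    removed, added = byte_stat(old, new)
    return f"{name} | {removed} bytes removed, {added} bytes added"


# Hypothetical 1000-byte file whose second half is rewritten in place.
old_blob = b"A" * 500 + b"B" * 500
new_blob = b"A" * 500 + b"C" * 500

print(stat_line("somefile.bin", old_blob, new_blob))
# somefile.bin | 500 bytes removed, 500 bytes added
```

The size change is still recoverable as added minus removed (here 0), so nothing is lost compared with showing only the two sizes.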
Rogan