From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: [PATCH 1/2] diff: fix handling of binary rewrite diffs
Date: Tue, 9 Dec 2008 03:12:28 -0500
Message-ID: <20081209081227.GA19626@coredump.intra.peff.net>

The current emit_rewrite_diff code always writes a text
patch without checking whether the content is binary. This
means that if you end up with a rewrite diff for a binary
file, you get lots of raw binary goo in your patch.

Instead, if we have binary files, then let's just skip
emit_rewrite_diff altogether. We will already have shown the
"dissimilarity index" line, so the only remaining question
is how to show the diff contents. If binary diffs are turned
off, the "Binary files a/file and b/file differ" message
should be the same in either case. If we do have binary
patches turned on, there
isn't much point in making a less-efficient binary patch
that does a total rewrite; no human is going to read it, and
since binary patches don't apply with any fuzz anyway, the
result of application should be the same.

Signed-off-by: Jeff King <peff@peff.net>
---
I couldn't think of a good reason to want a different format of binary
patch for a rewrite versus a regular diff, but others may be able to.
In that case, you have to mimic the binary codepath somewhat in
emit_rewrite_diff; I started on it, but realized I was just copying the
code.

However, we should at least do something before 1.6.1; as it is, it
spews binary garbage.

I discovered this because one of my textconv'd files had a rewrite
(which apparently happens by changing a few lines in a word document :))
and gave me a bogus diff. However, the problem is not unique to
textconv; patch 1/2 handles binary files and 2/2 handles the textconv
case.

I didn't put the tests into the existing t4022-diff-rewrite because I
wanted to keep the textconv tests together with these tests, and I think
it makes sense to test features in isolation before testing them
together. IOW:

  t4022 - rewrite works
  t4030 - textconv works
  t4031 - rewrite AND textconv work together

 diff.c                         |    4 ++-
 t/t4031-diff-rewrite-binary.sh |   42 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletions(-)
 create mode 100755 t/t4031-diff-rewrite-binary.sh

diff --git a/diff.c b/diff.c
index f644947..ea958a2 100644
--- a/diff.c
+++ b/diff.c
@@ -1376,7 +1376,9 @@ static void builtin_diff(const char *name_a,
 		 */
 		if ((one->mode ^ two->mode) & S_IFMT)
 			goto free_ab_and_return;
-		if (complete_rewrite) {
+		if (complete_rewrite &&
+		    !diff_filespec_is_binary(one) &&
+		    !diff_filespec_is_binary(two)) {
 			emit_rewrite_diff(name_a, name_b, one, two, o);
 			o->found_changes = 1;
 			goto free_ab_and_return;
diff --git a/t/t4031-diff-rewrite-binary.sh b/t/t4031-diff-rewrite-binary.sh
new file mode 100755
index 0000000..4b522f7
--- /dev/null
+++ b/t/t4031-diff-rewrite-binary.sh
@@ -0,0 +1,42 @@
+#!/bin/sh
+
+test_description='rewrite diff on binary file'
+
+. ./test-lib.sh
+
+# We must be large enough to meet the MINIMUM_BREAK_SIZE
+# requirement.
+make_file() {
+	for i in 1 2 3 4 5 6 7 8 9 10; do
+		for j in 1 2 3 4 5 6 7 9 10; do
+			for k in 1 2 3 4 5; do
+				printf "$1\n"
+			done
+		done
+	done >file
+}
+
+test_expect_success 'create binary file with changes' '
+	make_file "\\0" &&
+	git add file &&
+	make_file "\\01"
+'
+
+test_expect_success 'vanilla diff is binary' '
+	git diff >diff &&
+	grep "Binary files a/file and b/file differ" diff
+'
+
+test_expect_success 'rewrite diff is binary' '
+	git diff -B >diff &&
+	grep "dissimilarity index" diff &&
+	grep "Binary files a/file and b/file differ" diff
+'
+
+test_expect_success 'rewrite diff can show binary patch' '
+	git diff -B --binary >diff &&
+	grep "dissimilarity index" diff &&
+	grep "GIT binary patch" diff
+'
+
+test_done
-- 
1.6.1.rc2.1.g8f945.dirty
