git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] diff: fix handling of binary rewrite diffs
@ 2008-12-09  8:12 Jeff King
  2008-12-09  8:13 ` [PATCH 2/2] diff: respect textconv in " Jeff King
  2008-12-10  8:34 ` [PATCH 1/2] diff: fix handling of binary " Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff King @ 2008-12-09  8:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

The current emit_rewrite_diff code always writes a text
patch without checking whether the content is binary. This
means that if you end up with a rewrite diff for a binary
file, you get lots of raw binary goo in your patch.

Instead, if we have binary files, then let's just skip
emit_rewrite_diff altogether. We will already have shown the
"dissimilarity index" line, so it is really about the diff
contents. If binary diffs are turned off, the "Binary files
a/file and b/file differ" message should be the same in
either case. If we do have binary patches turned on, there
isn't much point in making a less-efficient binary patch
that does a total rewrite; no human is going to read it, and
since binary patches don't apply with any fuzz anyway, the
result of application should be the same.

Signed-off-by: Jeff King <peff@peff.net>
---
I couldn't think of a good reason to want a different format of binary
patch for a rewrite versus a regular diff, but others may be able to.
In that case, you have to mimic the binary codepath somewhat in
emit_rewrite_diff; I started on it, but realized I was just copying the
code.

However, we should at least do something before 1.6.1; as it is, it
spews binary garbage.

I discovered this because one of my textconv'd files had a rewrite
(which apparently happens by changing a few lines in a word document :))
and gave me a bogus diff. However, the problem is not unique to
textconv; patch 1/2 handles binary files and 2/2 handles the textconv
case.

I didn't put the tests into the existing t4022-diff-rewrite because I
wanted to keep the textconv tests together with these tests, and I think
it makes sense to test features in isolation before testing them
together. IOW:

  t4022 - rewrite works
  t4030 - textconv works
  t4031 - rewrite AND textconv work together

 diff.c                         |    4 ++-
 t/t4031-diff-rewrite-binary.sh |   42 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletions(-)
 create mode 100755 t/t4031-diff-rewrite-binary.sh

diff --git a/diff.c b/diff.c
index f644947..ea958a2 100644
--- a/diff.c
+++ b/diff.c
@@ -1376,7 +1376,9 @@ static void builtin_diff(const char *name_a,
 		 */
 		if ((one->mode ^ two->mode) & S_IFMT)
 			goto free_ab_and_return;
-		if (complete_rewrite) {
+		if (complete_rewrite &&
+		    !diff_filespec_is_binary(one) &&
+		    !diff_filespec_is_binary(two)) {
 			emit_rewrite_diff(name_a, name_b, one, two, o);
 			o->found_changes = 1;
 			goto free_ab_and_return;
diff --git a/t/t4031-diff-rewrite-binary.sh b/t/t4031-diff-rewrite-binary.sh
new file mode 100755
index 0000000..4b522f7
--- /dev/null
+++ b/t/t4031-diff-rewrite-binary.sh
@@ -0,0 +1,42 @@
+#!/bin/sh
+
+test_description='rewrite diff on binary file'
+
+. ./test-lib.sh
+
+# We must be large enough to meet the MINIMUM_BREAK_SIZE
+# requirement.
+make_file() {
+	for i in 1 2 3 4 5 6 7 8 9 10; do
+		for j in 1 2 3 4 5 6 7 9 10; do
+			for k in 1 2 3 4 5; do
+				printf "$1\n"
+			done
+		done
+	done >file
+}
+
+test_expect_success 'create binary file with changes' '
+	make_file "\\0" &&
+	git add file &&
+	make_file "\\01"
+'
+
+test_expect_success 'vanilla diff is binary' '
+	git diff >diff &&
+	grep "Binary files a/file and b/file differ" diff
+'
+
+test_expect_success 'rewrite diff is binary' '
+	git diff -B >diff &&
+	grep "dissimilarity index" diff &&
+	grep "Binary files a/file and b/file differ" diff
+'
+
+test_expect_success 'rewrite diff can show binary patch' '
+	git diff -B --binary >diff &&
+	grep "dissimilarity index" diff &&
+	grep "GIT binary patch" diff
+'
+
+test_done
-- 
1.6.1.rc2.1.g8f945.dirty

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-12-10 20:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-09  8:12 [PATCH 1/2] diff: fix handling of binary rewrite diffs Jeff King
2008-12-09  8:13 ` [PATCH 2/2] diff: respect textconv in " Jeff King
2008-12-10  8:34   ` Junio C Hamano
2008-12-10  9:02     ` Jeff King
2008-12-10 20:27       ` Junio C Hamano
2008-12-10  8:34 ` [PATCH 1/2] diff: fix handling of binary " Junio C Hamano
2008-12-10  9:04   ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).