git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Samuel Bronson <naesten@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, Jonathan Nieder <jrnieder@gmail.com>
Subject: [PATCH 2/2] cat-file: handle --batch format with missing type/size
Date: Wed, 11 Dec 2013 19:58:45 +0800	[thread overview]
Message-ID: <20131211115844.GB10594@sigill.intra.peff.net> (raw)
In-Reply-To: <20131211115458.GA10561@sigill.intra.peff.net>

Commit 98e2092 taught cat-file to stream blobs with --batch,
which requires that we look up the object type before
loading it into memory.  As a result, we now print the
object header from information in sha1_object_info, and the
actual contents from the read_sha1_file. We double-check
that the information we printed in the header matches the
content we are about to show.

Later, commit 93d2a60 allowed custom header lines for
--batch, and commit 5b08640 made type lookups optional. As a
result, specifying a header line without the type or size
means that we will not look up those items at all.

This causes our double-checking to erroneously die with an
error; we think the type or size has changed, when in fact
it was simply left at "0".

For the size, we can fix this by only doing the consistency
double-check when we have retrieved the size via
sha1_object_info. In the case that we have not retrieved the
value, that means we also did not print it, so there is
nothing for us to check that we are consistent with.

We could do the same for the type. However, besides our
consistency check, we also care about the type in deciding
whether to stream or not. We therefore make sure to always
trigger a type lookup when we are printing, so that even a
format without the type will stream as we would in the
normal case.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/cat-file.c  |  9 ++++++++-
 t/t1006-cat-file.sh | 22 ++++++++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 1434afb..4af67fd 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -211,7 +211,7 @@ static void print_object_or_die(int fd, struct expand_data *data)
 			die("object %s disappeared", sha1_to_hex(sha1));
 		if (type != data->type)
 			die("object %s changed type!?", sha1_to_hex(sha1));
-		if (size != data->size)
+		if (data->info.sizep && size != data->size)
 			die("object %s changed size!?", sha1_to_hex(sha1));
 
 		write_or_die(fd, contents, size);
@@ -276,6 +276,13 @@ static int batch_objects(struct batch_options *opt)
 	data.mark_query = 0;
 
 	/*
+	 * If we are printing out the object, then always fill in the type,
+	 * since we will want to decide whether or not to stream.
+	 */
+	if (opt->print_contents)
+		data.info.typep = &data.type;
+
+	/*
 	 * We are going to call get_sha1 on a potentially very large number of
 	 * objects. In most large cases, these will be actual object sha1s. The
 	 * cost to double-check that each one is not also a ref (just so we can
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 8a1bc5c..1687098 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -85,6 +85,28 @@ $content"
 		git cat-file --batch-check="%(objecttype) %(rest)" >actual &&
 	test_cmp expect actual
     '
+
+    test -z "$content" ||
+    test_expect_success "--batch without type ($type)" '
+	{
+		echo "$size" &&
+		maybe_remove_timestamp "$content" $no_ts
+	} >expect &&
+	echo $sha1 | git cat-file --batch="%(objectsize)" >actual.full &&
+	maybe_remove_timestamp "$(cat actual.full)" $no_ts >actual &&
+	test_cmp expect actual
+    '
+
+    test -z "$content" ||
+    test_expect_success "--batch without size ($type)" '
+	{
+		echo "$type" &&
+		maybe_remove_timestamp "$content" $no_ts
+	} >expect &&
+	echo $sha1 | git cat-file --batch="%(objecttype)" >actual.full &&
+	maybe_remove_timestamp "$(cat actual.full)" $no_ts >actual &&
+	test_cmp expect actual
+    '
 }
 
 hello_content="Hello World"
-- 
1.8.5.524.g6743da6

  parent reply	other threads:[~2013-12-11 11:58 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-11  4:37 [BUG] "echo HEAD | git cat-file --batch=''" fails catastrophically Samuel Bronson
2013-12-11 11:54 ` Jeff King
2013-12-11 11:56   ` [PATCH 1/2] cat-file: pass expand_data to print_object_or_die Jeff King
2013-12-11 20:11     ` Jonathan Nieder
2013-12-11 23:01       ` Jeff King
2013-12-12  3:03         ` Junio C Hamano
2013-12-11 11:58   ` Jeff King [this message]
2013-12-11 20:42     ` [PATCH 2/2] cat-file: handle --batch format with missing type/size Jonathan Nieder
2013-12-11 23:15       ` Jeff King
2013-12-11 23:31         ` Jonathan Nieder
2013-12-12  3:05         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131211115844.GB10594@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=naesten@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).