git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Speed up git-svn fetch
@ 2008-04-23 19:17 Adam Roben
  2008-04-23 19:17 ` [PATCH 01/11] Add tests for git cat-file Adam Roben
  2008-04-23 19:19 ` Speed up git-svn fetch Adam Roben
  0 siblings, 2 replies; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Sixt, Brian Downing, Eric Wong


I first sent this patch series 6 months ago today [1], then resent it after
some comments from Junio, Johannes, and Brian [2]. Here it is again, addressing
further comments from Junio and Eric.

The point of the series is to speed up git-svn fetch on Cygwin, where forking
is quite slow. In my informal testing, this patch seems to speed things up by
~1.4-~1.7x. We accomplish this by having a single long-lived git-cat-file
process and a single long-lived git-hash-object process for the duration of the
git-svn invocation.

This series is based on top of next.

-Adam

--
 Documentation/git-cat-file.txt    |   43 +++++++-
 Documentation/git-hash-object.txt |    5 +-
 builtin-cat-file.c                |  153 ++++++++++++++++++++++++---
 git-svn.perl                      |   42 ++++----
 hash-object.c                     |   44 ++++++++-
 perl/Git.pm                       |  208 ++++++++++++++++++++++++++++++++++++-
 t/t1006-cat-file.sh               |  181 ++++++++++++++++++++++++++++++++
 t/t1007-hash-object.sh            |  139 +++++++++++++++++++++++++
 t/t5303-hash-object.sh            |   35 ------
 9 files changed, 768 insertions(+), 82 deletions(-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 01/11] Add tests for git cat-file
  2008-04-23 19:17 Speed up git-svn fetch Adam Roben
@ 2008-04-23 19:17 ` Adam Roben
  2008-04-23 19:17   ` [PATCH 02/11] git-cat-file: Small refactor of cmd_cat_file Adam Roben
                     ` (2 more replies)
  2008-04-23 19:19 ` Speed up git-svn fetch Adam Roben
  1 sibling, 3 replies; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano, Johannes Sixt


Signed-off-by: Adam Roben <aroben@apple.com>
---
 t/t1006-cat-file.sh |  101 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 101 insertions(+), 0 deletions(-)
 create mode 100755 t/t1006-cat-file.sh

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
new file mode 100755
index 0000000..15741d9
--- /dev/null
+++ b/t/t1006-cat-file.sh
@@ -0,0 +1,101 @@
+#!/bin/sh
+
+test_description='git cat-file'
+
+. ./test-lib.sh
+
+function echo_without_newline()
+{
+    echo "$@\c"
+}
+
+function strlen()
+{
+    echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
+}
+
+function maybe_remove_timestamp()
+{
+    if test -z "$2"; then
+        echo_without_newline "$1"
+    else
+        echo_without_newline "$1" | sed -e 's/ [0-9]\{10\} [+-][0-9]\{4\}$//'
+    fi
+}
+
+function run_tests()
+{
+    type=$1
+    sha1=$2
+    size=$3
+    content=$4
+    pretty_content=$5
+    no_timestamp=$6
+
+    test_expect_success \
+        "$type exists" \
+        "git cat-file -e $sha1"
+    test_expect_success \
+        "Type of $type is correct" \
+        "test $type = \"\$(git cat-file -t $sha1)\""
+    test_expect_success \
+        "Size of $type is correct" \
+        "test $size = \"\$(git cat-file -s $sha1)\""
+    test -z "$content" || test_expect_success \
+        "Content of $type is correct" \
+        "test \"\$(maybe_remove_timestamp '$content' $no_timestamp)\" = \"\$(maybe_remove_timestamp \"\$(git cat-file $type $sha1)\" $no_timestamp)\""
+    test_expect_success \
+        "Pretty content of $type is correct" \
+        "test \"\$(maybe_remove_timestamp '$pretty_content' $no_timestamp)\" = \"\$(maybe_remove_timestamp \"\$(git cat-file -p $sha1)\" $no_timestamp)\""
+}
+
+hello_content="Hello World"
+hello_size=$(strlen "$hello_content")
+hello_sha1=$(echo_without_newline "$hello_content" | git hash-object --stdin)
+
+test_expect_success \
+    "setup" \
+    "echo_without_newline \"$hello_content\" > hello &&
+     git update-index --add hello"
+
+run_tests 'blob' $hello_sha1 $hello_size "$hello_content" "$hello_content"
+
+tree_sha1=$(git write-tree)
+tree_size=33
+tree_pretty_content="100644 blob $hello_sha1	hello"
+
+run_tests 'tree' $tree_sha1 $tree_size "" "$tree_pretty_content"
+
+commit_message="Intial commit"
+commit_sha1=$(echo_without_newline "$commit_message" | git commit-tree $tree_sha1)
+commit_size=176
+commit_content="tree $tree_sha1
+author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> 0000000000 +0000
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 0000000000 +0000
+
+$commit_message"
+
+run_tests 'commit' $commit_sha1 $commit_size "$commit_content" "$commit_content" 1
+
+tag_header_without_timestamp="object $hello_sha1
+type blob
+tag hellotag
+tagger $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+tag_description="This is a tag"
+tag_content="$tag_header_without_timestamp 0000000000 +0000
+
+$tag_description"
+tag_pretty_content="$tag_header_without_timestamp Thu Jan 1 00:00:00 1970 +0000
+
+$tag_description"
+
+tag_sha1=$(echo_without_newline "$tag_content" | git mktag)
+tag_size=$(strlen "$tag_content")
+
+run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_pretty_content" 1
+
+test_expect_success \
+    "Reach a blob from a tag pointing to it" \
+    "test '$hello_content' = \"\$(git cat-file blob $tag_sha1)\""
+
+test_done
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 02/11] git-cat-file: Small refactor of cmd_cat_file
  2008-04-23 19:17 ` [PATCH 01/11] Add tests for git cat-file Adam Roben
@ 2008-04-23 19:17   ` Adam Roben
  2008-04-23 19:17     ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Adam Roben
  2008-04-25  6:56   ` [PATCH 01/11] Add tests for git cat-file Eric Wong
  2008-04-25 18:03   ` Junio C Hamano
  2 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano

I separated the logic of parsing the arguments from the logic of fetching and
outputting the data. cat_one_file now does the latter.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 builtin-cat-file.c |   38 ++++++++++++++++++++++----------------
 1 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index f132d58..34a63d1 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -76,31 +76,16 @@ static void pprint_tag(const unsigned char *sha1, const char *buf, unsigned long
 		write_or_die(1, cp, endp - cp);
 }
 
-int cmd_cat_file(int argc, const char **argv, const char *prefix)
+static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 {
 	unsigned char sha1[20];
 	enum object_type type;
 	void *buf;
 	unsigned long size;
-	int opt;
-	const char *exp_type, *obj_name;
-
-	git_config(git_default_config);
-	if (argc != 3)
-		usage("git-cat-file [-t|-s|-e|-p|<type>] <sha1>");
-	exp_type = argv[1];
-	obj_name = argv[2];
 
 	if (get_sha1(obj_name, sha1))
 		die("Not a valid object name %s", obj_name);
 
-	opt = 0;
-	if ( exp_type[0] == '-' ) {
-		opt = exp_type[1];
-		if ( !opt || exp_type[2] )
-			opt = -1; /* Not a single character option */
-	}
-
 	buf = NULL;
 	switch (opt) {
 	case 't':
@@ -157,3 +142,24 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 	write_or_die(1, buf, size);
 	return 0;
 }
+
+int cmd_cat_file(int argc, const char **argv, const char *prefix)
+{
+	int opt;
+	const char *exp_type, *obj_name;
+
+	git_config(git_default_config);
+	if (argc != 3)
+		usage("git-cat-file [-t|-s|-e|-p|<type>] <sha1>");
+	exp_type = argv[1];
+	obj_name = argv[2];
+
+	opt = 0;
+	if ( exp_type[0] == '-' ) {
+		opt = exp_type[1];
+		if ( !opt || exp_type[2] )
+			opt = -1; /* Not a single character option */
+	}
+
+	return cat_one_file(opt, exp_type, obj_name);
+}
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 03/11] git-cat-file: Make option parsing a little more flexible
  2008-04-23 19:17   ` [PATCH 02/11] git-cat-file: Small refactor of cmd_cat_file Adam Roben
@ 2008-04-23 19:17     ` Adam Roben
  2008-04-23 19:17       ` [PATCH 04/11] git-cat-file: Add --batch-check option Adam Roben
  2008-04-25 18:04       ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Junio C Hamano
  0 siblings, 2 replies; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano

This will make it easier to add newer options later.

Signed-off-by: Adam Roben <aroben@apple.com>
---
Junio C Hamano <gitster@pobox.com> wrote:
> > diff --git a/builtin-cat-file.c b/builtin-cat-file.c
> > index 34a63d1..3a0be4a 100644
> > --- a/builtin-cat-file.c
> > +++ b/builtin-cat-file.c
> > @@ -143,23 +143,41 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> >     return 0;
> >  }
> >  
> > +static const char cat_file_usage[] = "git-cat-file [-t|-s|-e|-p|<type>] <sha1>";
> > +
> >  int cmd_cat_file(int argc, const char **argv, const char *prefix)
> >  {
> > -   int opt;
> > -   const char *exp_type, *obj_name;
> > +   int i, opt = 0;
> > +   const char *exp_type = 0, *obj_name = 0;
> 
> NULL pointer constants in git sources are spelled "NULL", not
> "0".

Fixed.

 builtin-cat-file.c |   42 ++++++++++++++++++++++++++++++------------
 1 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index 34a63d1..a76bb16 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -143,23 +143,41 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	return 0;
 }
 
+static const char cat_file_usage[] = "git-cat-file [-t|-s|-e|-p|<type>] <sha1>";
+
 int cmd_cat_file(int argc, const char **argv, const char *prefix)
 {
-	int opt;
-	const char *exp_type, *obj_name;
+	int i, opt = 0;
+	const char *exp_type = NULL, *obj_name = NULL;
 
 	git_config(git_default_config);
-	if (argc != 3)
-		usage("git-cat-file [-t|-s|-e|-p|<type>] <sha1>");
-	exp_type = argv[1];
-	obj_name = argv[2];
-
-	opt = 0;
-	if ( exp_type[0] == '-' ) {
-		opt = exp_type[1];
-		if ( !opt || exp_type[2] )
-			opt = -1; /* Not a single character option */
+
+	for (i = 1; i < argc; ++i) {
+		const char *arg = argv[i];
+
+		if (!strcmp(arg, "-t") || !strcmp(arg, "-s") || !strcmp(arg, "-e") || !strcmp(arg, "-p")) {
+			exp_type = arg;
+			opt = exp_type[1];
+			continue;
+		}
+
+		if (arg[0] == '-')
+			usage(cat_file_usage);
+
+		if (!exp_type) {
+			exp_type = arg;
+			continue;
+		}
+
+		if (obj_name)
+			usage(cat_file_usage);
+
+		obj_name = arg;
+		break;
 	}
 
+	if (!exp_type || !obj_name)
+		usage(cat_file_usage);
+
 	return cat_one_file(opt, exp_type, obj_name);
 }
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 04/11] git-cat-file: Add --batch-check option
  2008-04-23 19:17     ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Adam Roben
@ 2008-04-23 19:17       ` Adam Roben
  2008-04-23 19:17         ` [PATCH 05/11] git-cat-file: Add --batch option Adam Roben
  2008-04-25 18:04       ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano, Brian Downing

This new option allows multiple objects to be specified on stdin. For each
object specified, a line of the following form is printed:

<sha1> SP <type> SP <size> LF

If the object does not exist in the repository, a line of the following form is
printed:

<object> SP missing LF

Signed-off-by: Adam Roben <aroben@apple.com>
---
Junio C Hamano <gitster@pobox.com> wrote:
> Here is an alternative suggestion.
> 
>    Two new options, --batch and --batch-check, are introduced.
>    These options are incompatible with -[tsep] or an object type
>    given as the first parameter to git-cat-file.
> 
>    * git-cat-file --batch-check <list-of-sha1
> 
>      outputs a record of this form
> 
>           <sha1> SP <type> SP <size> LF
> 
>      for each of the input lines.
> 
>    * git-cat-file --batch <list-of-sha1
> 
>      outputs a record of this form
> 
>           <sha1> SP <type> SP <size> LF <contents> LF
> 
>      for each of the input lines.
> 
>   For a missing object, either option gives a record of form:
> 
>           <sha1> SP missing LF

This patch introduces --batch-check as described above. The next patch introduces --batch.

 Documentation/git-cat-file.txt |   31 ++++++++++++++---
 builtin-cat-file.c             |   74 ++++++++++++++++++++++++++++++++++++++-
 t/t1006-cat-file.sh            |   52 ++++++++++++++++++++++++++++
 3 files changed, 150 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index df42cb1..d5821af 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -9,12 +9,16 @@ git-cat-file - Provide content or type/size information for repository objects
 SYNOPSIS
 --------
 'git-cat-file' [-t | -s | -e | -p | <type>] <object>
+'git-cat-file' --batch-check < <list-of-objects>
 
 DESCRIPTION
 -----------
-Provides content or type of objects in the repository. The type
-is required unless '-t' or '-p' is used to find the object type,
-or '-s' is used to find the object size.
+In the first form, provides content or type of objects in the repository. The
+type is required unless '-t' or '-p' is used to find the object type, or '-s'
+is used to find the object size.
+
+In the second form, a list of object (separated by LFs) is provided on stdin,
+and the SHA1, type, and size of each object is printed on stdout.
 
 OPTIONS
 -------
@@ -46,6 +50,10 @@ OPTIONS
 	or to ask for a "blob" with <object> being a tag object that
 	points at it.
 
+--batch-check::
+	Print the SHA1, type, and size of each object provided on stdin. May not be
+	combined with any other options or arguments.
+
 OUTPUT
 ------
 If '-t' is specified, one of the <type>.
@@ -56,9 +64,22 @@ If '-e' is specified, no output.
 
 If '-p' is specified, the contents of <object> are pretty-printed.
 
-Otherwise the raw (though uncompressed) contents of the <object> will
-be returned.
+If <type> is specified, the raw (though uncompressed) contents of the <object>
+will be returned.
+
+If '--batch-check' is specified, output of the following form is printed for
+each object specified fon stdin:
+
+------------
+<sha1> SP <type> SP <size> LF
+------------
+
+Additionally, output of the following form is printed for each object specified
+on stdin that does not exist in the repository:
 
+------------
+<object> SP missing LF
+------------
 
 Author
 ------
diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index a76bb16..832cfd1 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -143,11 +143,48 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	return 0;
 }
 
-static const char cat_file_usage[] = "git-cat-file [-t|-s|-e|-p|<type>] <sha1>";
+static int batch_one_object(const char *obj_name)
+{
+	unsigned char sha1[20];
+	enum object_type type;
+	unsigned long size;
+
+	if (!obj_name)
+	   return 1;
+
+	if (get_sha1(obj_name, sha1)) {
+		printf("%s missing\n", obj_name);
+		return 0;
+	}
+
+	type = sha1_object_info(sha1, &size);
+	if (type <= 0)
+		return 1;
+
+	printf("%s %s %lu\n", sha1_to_hex(sha1), typename(type), size);
+
+	return 0;
+}
+
+static int batch_objects(void)
+{
+	struct strbuf buf;
+
+	strbuf_init(&buf, 0);
+	while (strbuf_getline(&buf, stdin, '\n') != EOF) {
+		int error = batch_one_object(buf.buf);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+static const char cat_file_usage[] = "git-cat-file [ [-t|-s|-e|-p|<type>] <sha1> | --batch-check < <list_of_sha1s> ]";
 
 int cmd_cat_file(int argc, const char **argv, const char *prefix)
 {
-	int i, opt = 0;
+	int i, opt = 0, batch_check = 0;
 	const char *exp_type = NULL, *obj_name = NULL;
 
 	git_config(git_default_config);
@@ -155,7 +192,28 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 	for (i = 1; i < argc; ++i) {
 		const char *arg = argv[i];
 
+		if (!strcmp(arg, "--batch-check")) {
+			if (opt) {
+				error("git-cat-file: Can't use --batch-check with -%c", opt);
+				usage(cat_file_usage);
+			} else if (exp_type) {
+				error("git-cat-file: Can't use --batch-check when a type (\"%s\") is specified", exp_type);
+				usage(cat_file_usage);
+			} else if (obj_name) {
+				error("git-cat-file: Can't use --batch-check when an object (\"%s\") is specified", obj_name);
+				usage(cat_file_usage);
+			}
+
+			batch_check = 1;
+			continue;
+		}
+
 		if (!strcmp(arg, "-t") || !strcmp(arg, "-s") || !strcmp(arg, "-e") || !strcmp(arg, "-p")) {
+			if (batch_check) {
+				error("git-cat-file: Can't use %s with --batch-check", arg);
+				usage(cat_file_usage);
+			}
+
 			exp_type = arg;
 			opt = exp_type[1];
 			continue;
@@ -165,6 +223,11 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			usage(cat_file_usage);
 
 		if (!exp_type) {
+			if (batch_check) {
+				error("git-cat-file: Can't specify a type (\"%s\") with --batch-check", arg);
+				usage(cat_file_usage);
+			}
+
 			exp_type = arg;
 			continue;
 		}
@@ -172,10 +235,17 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 		if (obj_name)
 			usage(cat_file_usage);
 
+		// We should have hit one of the earlier if (batch_check) cases before
+		// getting here.
+		assert(!batch_check);
+
 		obj_name = arg;
 		break;
 	}
 
+	if (batch_check)
+		return batch_objects();
+
 	if (!exp_type || !obj_name)
 		usage(cat_file_usage);
 
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 15741d9..46b0f54 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -47,6 +47,9 @@ function run_tests()
     test_expect_success \
         "Pretty content of $type is correct" \
         "test \"\$(maybe_remove_timestamp '$pretty_content' $no_timestamp)\" = \"\$(maybe_remove_timestamp \"\$(git cat-file -p $sha1)\" $no_timestamp)\""
+    test_expect_success \
+        "--batch-check output of $type is correct" \
+        "test \"$sha1 $type $size\" = \"\$(echo_without_newline $sha1 | git cat-file --batch-check)\""
 }
 
 hello_content="Hello World"
@@ -98,4 +101,53 @@ test_expect_success \
     "Reach a blob from a tag pointing to it" \
     "test '$hello_content' = \"\$(git cat-file blob $tag_sha1)\""
 
+for opt in t s e p; do
+    test_expect_success \
+        "Passing -$opt with --batch-check fails" \
+        "test_must_fail git cat-file --batch-check -$opt $hello_sha1"
+
+    test_expect_success \
+        "Passing --batch-check with -$opt fails" \
+        "test_must_fail git cat-file -$opt --batch-check $hello_sha1"
+done
+
+test_expect_success \
+    "Passing <type> with --batch-check fails" \
+    "test_must_fail git cat-file --batch-check blob $hello_sha1"
+
+test_expect_success \
+    "Passing --batch-check with <type> fails" \
+    "test_must_fail git cat-file blob --batch-check $hello_sha1"
+
+test_expect_success \
+    "Passing sha1 with --batch-check fails" \
+    "test_must_fail git cat-file --batch-check $hello_sha1"
+
+test_expect_success \
+    "--batch-check for a non-existent object" \
+    "test \"deadbeef missing\" = \"\$(echo_without_newline deadbeef | git cat-file --batch-check)\""
+
+test_expect_success \
+    "--batch-check for an emtpy line" \
+    "test \" missing\" = \"\$(printf \"\\\\n\" | git cat-file --batch-check)\""
+
+batch_check_input="$hello_sha1
+$tree_sha1
+$commit_sha1
+$tag_sha1
+deadbeef
+
+"
+
+batch_check_output="$hello_sha1 blob $hello_size
+$tree_sha1 tree $tree_size
+$commit_sha1 commit $commit_size
+$tag_sha1 tag $tag_size
+deadbeef missing
+ missing"
+
+test_expect_success \
+    "--batch-check with multiple sha1s gives correct format" \
+    "test \"$batch_check_output\" = \"\$(echo_without_newline \"$batch_check_input\" | git cat-file --batch-check)\""
+
 test_done
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 05/11] git-cat-file: Add --batch option
  2008-04-23 19:17       ` [PATCH 04/11] git-cat-file: Add --batch-check option Adam Roben
@ 2008-04-23 19:17         ` Adam Roben
  2008-04-23 19:17           ` [PATCH 06/11] Move git-hash-object tests from t5303 to t1007 Adam Roben
  0 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano, Johannes Sixt

--batch is similar to --batch-check, except that the contents of each object is
also printed. The output's form is:

<sha1> SP <type> SP <size> LF
<contents> LF

Signed-off-by: Adam Roben <aroben@apple.com>
---
 Documentation/git-cat-file.txt |   18 +++++++++--
 builtin-cat-file.c             |   63 ++++++++++++++++++++++++++++-----------
 t/t1006-cat-file.sh            |   62 ++++++++++++++++++++++++++++-----------
 3 files changed, 105 insertions(+), 38 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index d5821af..f6c394c 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -9,7 +9,7 @@ git-cat-file - Provide content or type/size information for repository objects
 SYNOPSIS
 --------
 'git-cat-file' [-t | -s | -e | -p | <type>] <object>
-'git-cat-file' --batch-check < <list-of-objects>
+'git-cat-file' [--batch | --batch-check] < <list-of-objects>
 
 DESCRIPTION
 -----------
@@ -50,6 +50,10 @@ OPTIONS
 	or to ask for a "blob" with <object> being a tag object that
 	points at it.
 
+--batch::
+	Print the SHA1, type, size, and contents of each object provided on
+	stdin. May not be combined with any other options or arguments.
+
 --batch-check::
 	Print the SHA1, type, and size of each object provided on stdin. May not be
 	combined with any other options or arguments.
@@ -67,6 +71,14 @@ If '-p' is specified, the contents of <object> are pretty-printed.
 If <type> is specified, the raw (though uncompressed) contents of the <object>
 will be returned.
 
+If '--batch' is specified, output of the following form is printed for each
+object specified on stdin:
+
+------------
+<sha1> SP <type> SP <size> LF
+<contents> LF
+------------
+
 If '--batch-check' is specified, output of the following form is printed for
 each object specified fon stdin:
 
@@ -74,8 +86,8 @@ each object specified fon stdin:
 <sha1> SP <type> SP <size> LF
 ------------
 
-Additionally, output of the following form is printed for each object specified
-on stdin that does not exist in the repository:
+For both '--batch' and '--batch-check', output of the following form is printed
+for each object specified on stdin that does not exist in the repository:
 
 ------------
 <object> SP missing LF
diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index 832cfd1..827ffcc 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -143,11 +143,12 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	return 0;
 }
 
-static int batch_one_object(const char *obj_name)
+static int batch_one_object(const char *obj_name, int print_contents)
 {
 	unsigned char sha1[20];
 	enum object_type type;
 	unsigned long size;
+	void *contents;
 
 	if (!obj_name)
 	   return 1;
@@ -157,22 +158,33 @@ static int batch_one_object(const char *obj_name)
 		return 0;
 	}
 
-	type = sha1_object_info(sha1, &size);
+	if (print_contents)
+		contents = read_sha1_file(sha1, &type, &size);
+	else
+		type = sha1_object_info(sha1, &size);
+
 	if (type <= 0)
 		return 1;
 
 	printf("%s %s %lu\n", sha1_to_hex(sha1), typename(type), size);
+	fflush(stdout);
+
+	if (print_contents) {
+		write_or_die(1, contents, size);
+		printf("\n");
+		fflush(stdout);
+	}
 
 	return 0;
 }
 
-static int batch_objects(void)
+static int batch_objects(int print_contents)
 {
 	struct strbuf buf;
 
 	strbuf_init(&buf, 0);
 	while (strbuf_getline(&buf, stdin, '\n') != EOF) {
-		int error = batch_one_object(buf.buf);
+		int error = batch_one_object(buf.buf, print_contents);
 		if (error)
 			return error;
 	}
@@ -180,37 +192,51 @@ static int batch_objects(void)
 	return 0;
 }
 
-static const char cat_file_usage[] = "git-cat-file [ [-t|-s|-e|-p|<type>] <sha1> | --batch-check < <list_of_sha1s> ]";
+static const char cat_file_usage[] = "git-cat-file [ [-t|-s|-e|-p|<type>] <sha1> | [--batch|--batch-check] < <list_of_sha1s> ]";
 
 int cmd_cat_file(int argc, const char **argv, const char *prefix)
 {
-	int i, opt = 0, batch_check = 0;
+	int i, opt = 0, batch = 0, batch_check = 0;
 	const char *exp_type = NULL, *obj_name = NULL;
 
 	git_config(git_default_config);
 
 	for (i = 1; i < argc; ++i) {
 		const char *arg = argv[i];
+		int is_batch = 0, is_batch_check = 0;
+
+		is_batch = !strcmp(arg, "--batch");
+		if (!is_batch)
+			is_batch_check = !strcmp(arg, "--batch-check");
 
-		if (!strcmp(arg, "--batch-check")) {
+		if (is_batch || is_batch_check) {
 			if (opt) {
-				error("git-cat-file: Can't use --batch-check with -%c", opt);
+				error("git-cat-file: Can't use %s with -%c", arg, opt);
 				usage(cat_file_usage);
 			} else if (exp_type) {
-				error("git-cat-file: Can't use --batch-check when a type (\"%s\") is specified", exp_type);
+				error("git-cat-file: Can't use %s when a type (\"%s\") is specified", arg, exp_type);
 				usage(cat_file_usage);
 			} else if (obj_name) {
-				error("git-cat-file: Can't use --batch-check when an object (\"%s\") is specified", obj_name);
+				error("git-cat-file: Can't use %s when an object (\"%s\") is specified", arg, obj_name);
 				usage(cat_file_usage);
 			}
 
-			batch_check = 1;
+			if (is_batch && batch_check || is_batch_check && batch) {
+				error("git-cat-file: Can't use %s with %s", arg, is_batch ? "--batch-check" : "--batch");
+				usage(cat_file_usage);
+			}
+
+			if (is_batch)
+				batch = 1;
+			else
+				batch_check = 1;
+
 			continue;
 		}
 
 		if (!strcmp(arg, "-t") || !strcmp(arg, "-s") || !strcmp(arg, "-e") || !strcmp(arg, "-p")) {
-			if (batch_check) {
-				error("git-cat-file: Can't use %s with --batch-check", arg);
+			if (batch || batch_check) {
+				error("git-cat-file: Can't use %s with %s", arg, batch ? "--batch" : "--batch-check");
 				usage(cat_file_usage);
 			}
 
@@ -223,8 +249,8 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			usage(cat_file_usage);
 
 		if (!exp_type) {
-			if (batch_check) {
-				error("git-cat-file: Can't specify a type (\"%s\") with --batch-check", arg);
+			if (batch || batch_check) {
+				error("git-cat-file: Can't specify a type (\"%s\") with %s", arg, batch ? "--batch" : "--batch-check");
 				usage(cat_file_usage);
 			}
 
@@ -235,16 +261,17 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 		if (obj_name)
 			usage(cat_file_usage);
 
-		// We should have hit one of the earlier if (batch_check) cases before
+		// We should have hit one of the earlier if (batch || batch_check) cases before
 		// getting here.
+		assert(!batch);
 		assert(!batch_check);
 
 		obj_name = arg;
 		break;
 	}
 
-	if (batch_check)
-		return batch_objects();
+	if (batch || batch_check)
+		return batch_objects(batch);
 
 	if (!exp_type || !obj_name)
 		usage(cat_file_usage);
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 46b0f54..acce8c8 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -32,6 +32,9 @@ function run_tests()
     pretty_content=$5
     no_timestamp=$6
 
+    batch_output="$sha1 $type $size
+$content"
+
     test_expect_success \
         "$type exists" \
         "git cat-file -e $sha1"
@@ -47,6 +50,9 @@ function run_tests()
     test_expect_success \
         "Pretty content of $type is correct" \
         "test \"\$(maybe_remove_timestamp '$pretty_content' $no_timestamp)\" = \"\$(maybe_remove_timestamp \"\$(git cat-file -p $sha1)\" $no_timestamp)\""
+    test -z "$content" || test_expect_success \
+        "--batch output of $type is correct" \
+        "test \"\$(maybe_remove_timestamp '$batch_output' $no_timestamp)\" = \"\$(maybe_remove_timestamp \"\$(echo $sha1 | git cat-file --batch)\" $no_timestamp)\""
     test_expect_success \
         "--batch-check output of $type is correct" \
         "test \"$sha1 $type $size\" = \"\$(echo_without_newline $sha1 | git cat-file --batch-check)\""
@@ -101,27 +107,29 @@ test_expect_success \
     "Reach a blob from a tag pointing to it" \
     "test '$hello_content' = \"\$(git cat-file blob $tag_sha1)\""
 
-for opt in t s e p; do
-    test_expect_success \
-        "Passing -$opt with --batch-check fails" \
-        "test_must_fail git cat-file --batch-check -$opt $hello_sha1"
+for batch in batch batch-check; do
+    for opt in t s e p; do
+        test_expect_success \
+            "Passing -$opt with --$batch fails" \
+            "test_must_fail git cat-file --$batch -$opt $hello_sha1"
 
-    test_expect_success \
-        "Passing --batch-check with -$opt fails" \
-        "test_must_fail git cat-file -$opt --batch-check $hello_sha1"
-done
+        test_expect_success \
+            "Passing --$batch with -$opt fails" \
+            "test_must_fail git cat-file -$opt --$batch $hello_sha1"
+    done
 
-test_expect_success \
-    "Passing <type> with --batch-check fails" \
-    "test_must_fail git cat-file --batch-check blob $hello_sha1"
+    test_expect_success \
+        "Passing <type> with --$batch fails" \
+        "test_must_fail git cat-file --$batch blob $hello_sha1"
 
-test_expect_success \
-    "Passing --batch-check with <type> fails" \
-    "test_must_fail git cat-file blob --batch-check $hello_sha1"
+    test_expect_success \
+        "Passing --$batch with <type> fails" \
+        "test_must_fail git cat-file blob --$batch $hello_sha1"
 
-test_expect_success \
-    "Passing sha1 with --batch-check fails" \
-    "test_must_fail git cat-file --batch-check $hello_sha1"
+    test_expect_success \
+        "Passing sha1 with --$batch fails" \
+        "test_must_fail git cat-file --$batch $hello_sha1"
+done
 
 test_expect_success \
     "--batch-check for a non-existent object" \
@@ -131,6 +139,26 @@ test_expect_success \
     "--batch-check for an emtpy line" \
     "test \" missing\" = \"\$(printf \"\\\\n\" | git cat-file --batch-check)\""
 
+batch_input="$hello_sha1
+$commit_sha1
+$tag_sha1
+deadbeef
+
+"
+
+batch_output="$hello_sha1 blob $hello_size
+$hello_content
+$commit_sha1 commit $commit_size
+$commit_content
+$tag_sha1 tag $tag_size
+$tag_content
+deadbeef missing
+ missing"
+
+test_expect_success \
+    "--batch with multiple sha1s gives correct format" \
+    "test \"\$(maybe_remove_timestamp \"$batch_output\" 1)\" = \"\$(maybe_remove_timestamp \"\$(echo_without_newline \"$batch_input\" | git cat-file --batch)\" 1)\""
+
 batch_check_input="$hello_sha1
 $tree_sha1
 $commit_sha1
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 06/11] Move git-hash-object tests from t5303 to t1007
  2008-04-23 19:17         ` [PATCH 05/11] git-cat-file: Add --batch option Adam Roben
@ 2008-04-23 19:17           ` Adam Roben
  2008-04-23 19:17             ` [PATCH 07/11] Add more tests for git hash-object Adam Roben
  0 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano

This is a more appropriate location according to t/README.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 t/t1007-hash-object.sh |   35 +++++++++++++++++++++++++++++++++++
 t/t5303-hash-object.sh |   35 -----------------------------------
 2 files changed, 35 insertions(+), 35 deletions(-)
 create mode 100755 t/t1007-hash-object.sh
 delete mode 100755 t/t5303-hash-object.sh

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
new file mode 100755
index 0000000..543c078
--- /dev/null
+++ b/t/t1007-hash-object.sh
@@ -0,0 +1,35 @@
+#!/bin/sh
+
+test_description=git-hash-object
+
+. ./test-lib.sh
+
+test_expect_success \
+    'git hash-object -w --stdin saves the object' \
+    'obname=$(echo foo | git hash-object -w --stdin) &&
+    obpath=$(echo $obname | sed -e "s/\(..\)/\1\//") &&
+    test -r .git/objects/"$obpath" &&
+    rm -f .git/objects/"$obpath"'
+    
+test_expect_success \
+    'git hash-object --stdin -w saves the object' \
+    'obname=$(echo foo | git hash-object --stdin -w) &&
+    obpath=$(echo $obname | sed -e "s/\(..\)/\1\//") &&
+    test -r .git/objects/"$obpath" &&
+    rm -f .git/objects/"$obpath"'    
+
+test_expect_success \
+    'git hash-object --stdin file1 <file0 first operates on file0, then file1' \
+    'echo foo > file1 &&
+    obname0=$(echo bar | git hash-object --stdin) &&
+    obname1=$(git hash-object file1) &&
+    obname0new=$(echo bar | git hash-object --stdin file1 | sed -n -e 1p) &&
+    obname1new=$(echo bar | git hash-object --stdin file1 | sed -n -e 2p) &&
+    test "$obname0" = "$obname0new" &&
+    test "$obname1" = "$obname1new"'
+
+test_expect_success \
+    'git hash-object refuses multiple --stdin arguments' \
+    '! git hash-object --stdin --stdin < file1'
+
+test_done
diff --git a/t/t5303-hash-object.sh b/t/t5303-hash-object.sh
deleted file mode 100755
index 543c078..0000000
--- a/t/t5303-hash-object.sh
+++ /dev/null
@@ -1,35 +0,0 @@
-#!/bin/sh
-
-test_description=git-hash-object
-
-. ./test-lib.sh
-
-test_expect_success \
-    'git hash-object -w --stdin saves the object' \
-    'obname=$(echo foo | git hash-object -w --stdin) &&
-    obpath=$(echo $obname | sed -e "s/\(..\)/\1\//") &&
-    test -r .git/objects/"$obpath" &&
-    rm -f .git/objects/"$obpath"'
-    
-test_expect_success \
-    'git hash-object --stdin -w saves the object' \
-    'obname=$(echo foo | git hash-object --stdin -w) &&
-    obpath=$(echo $obname | sed -e "s/\(..\)/\1\//") &&
-    test -r .git/objects/"$obpath" &&
-    rm -f .git/objects/"$obpath"'    
-
-test_expect_success \
-    'git hash-object --stdin file1 <file0 first operates on file0, then file1' \
-    'echo foo > file1 &&
-    obname0=$(echo bar | git hash-object --stdin) &&
-    obname1=$(git hash-object file1) &&
-    obname0new=$(echo bar | git hash-object --stdin file1 | sed -n -e 1p) &&
-    obname1new=$(echo bar | git hash-object --stdin file1 | sed -n -e 2p) &&
-    test "$obname0" = "$obname0new" &&
-    test "$obname1" = "$obname1new"'
-
-test_expect_success \
-    'git hash-object refuses multiple --stdin arguments' \
-    '! git hash-object --stdin --stdin < file1'
-
-test_done
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 07/11] Add more tests for git hash-object
  2008-04-23 19:17           ` [PATCH 06/11] Move git-hash-object tests from t5303 to t1007 Adam Roben
@ 2008-04-23 19:17             ` Adam Roben
  2008-04-23 19:17               ` [PATCH 08/11] git-hash-object: Add --stdin-paths option Adam Roben
  0 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano


Signed-off-by: Adam Roben <aroben@apple.com>
---
Junio C Hamano <gitster@pobox.com> wrote:
> > +test_expect_success \
> > +    'hash a file' \
> > +    "test $hello_sha1 = \$(git hash-object hello)"
> > +
> > +test_expect_success \
> > +    'hash from stdin' \
> > +    "test $hello_sha1 = \$(echo '$hello_content' | git hash-object --stdin)"
> 
> Needs to make sure no object has been written to the object
> database at this point?
> 
> > +test_expect_success \
> > +    'hash a file and write to database' \
> > +    "test $hello_sha1 = \$(git hash-object -w hello)"
> 
> ... and make sure the objectis written here?
> 
> > +test_expect_success \
> > +    'hash from stdin and write to database' \
> > +    "test $hello_sha1 = \$(echo '$hello_content' | git hash-object -w --stdin)"
> > +
> > +test_done
> 
> ... and/or here?

Fixed.

 t/t1007-hash-object.sh |  120 ++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 96 insertions(+), 24 deletions(-)

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 543c078..eb54f1f 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -4,32 +4,104 @@ test_description=git-hash-object
 
 . ./test-lib.sh
 
-test_expect_success \
-    'git hash-object -w --stdin saves the object' \
-    'obname=$(echo foo | git hash-object -w --stdin) &&
-    obpath=$(echo $obname | sed -e "s/\(..\)/\1\//") &&
-    test -r .git/objects/"$obpath" &&
-    rm -f .git/objects/"$obpath"'
-    
-test_expect_success \
-    'git hash-object --stdin -w saves the object' \
-    'obname=$(echo foo | git hash-object --stdin -w) &&
-    obpath=$(echo $obname | sed -e "s/\(..\)/\1\//") &&
-    test -r .git/objects/"$obpath" &&
-    rm -f .git/objects/"$obpath"'    
+function echo_without_newline()
+{
+    echo "$@\c"
+}
 
-test_expect_success \
-    'git hash-object --stdin file1 <file0 first operates on file0, then file1' \
-    'echo foo > file1 &&
-    obname0=$(echo bar | git hash-object --stdin) &&
-    obname1=$(git hash-object file1) &&
-    obname0new=$(echo bar | git hash-object --stdin file1 | sed -n -e 1p) &&
-    obname1new=$(echo bar | git hash-object --stdin file1 | sed -n -e 2p) &&
-    test "$obname0" = "$obname0new" &&
-    test "$obname1" = "$obname1new"'
+function test_blob_does_not_exist()
+{
+    test_expect_success \
+        "blob does not exist in database" \
+        "test_must_fail git cat-file blob $1"
+}
+
+function test_blob_exists()
+{
+    test_expect_success \
+        "blob exists in database" \
+        "git cat-file blob $1"
+}
+
+hello_content="Hello World"
+hello_sha1=5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
+
+example_content="This is an example"
+example_sha1=ddd3f836d3e3fbb7ae289aa9ae83536f76956399
+
+function setup_repo()
+{
+    echo_without_newline "$hello_content" > hello
+    echo_without_newline "$example_content" > example
+}
+
+test_repo=test
+function push_repo()
+{
+    test_create_repo $test_repo
+    cd $test_repo
+
+    setup_repo
+}
+
+function pop_repo()
+{
+    cd ..
+    rm -rf $test_repo
+}
+
+setup_repo
+
+# Argument checking
 
 test_expect_success \
-    'git hash-object refuses multiple --stdin arguments' \
-    '! git hash-object --stdin --stdin < file1'
+    "multiple '--stdin's are rejected" \
+    "test_must_fail git hash-object --stdin --stdin < example"
+
+# Behavior
+
+push_repo
+
+    test_expect_success \
+        "hash a file" \
+        "test $hello_sha1 = \$(git hash-object hello)"
+
+    test_blob_does_not_exist $hello_sha1
+
+    test_expect_success \
+        "hash from stdin" \
+        "test $example_sha1 = \$(git hash-object --stdin < example)"
+
+    test_blob_does_not_exist $example_sha1
+
+    test_expect_success \
+        "hash a file and write to database" \
+        "test $hello_sha1 = \$(git hash-object -w hello)"
+
+    test_blob_exists $hello_sha1
+
+    test_expect_success \
+        'git hash-object --stdin file1 <file0 first operates on file0, then file1' \
+        'echo foo > file1 &&
+        obname0=$(echo bar | git hash-object --stdin) &&
+        obname1=$(git hash-object file1) &&
+        obname0new=$(echo bar | git hash-object --stdin file1 | sed -n -e 1p) &&
+        obname1new=$(echo bar | git hash-object --stdin file1 | sed -n -e 2p) &&
+        test "$obname0" = "$obname0new" &&
+        test "$obname1" = "$obname1new"'
+
+pop_repo
+
+for args in "-w --stdin" "--stdin -w"; do
+    push_repo
+
+        test_expect_success \
+            "hash from stdin and write to database ($args)" \
+            "test $example_sha1 = \$(git hash-object $args < example)"
+
+        test_blob_exists $example_sha1
+
+    pop_repo
+done
 
 test_done
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 08/11] git-hash-object: Add --stdin-paths option
  2008-04-23 19:17             ` [PATCH 07/11] Add more tests for git hash-object Adam Roben
@ 2008-04-23 19:17               ` Adam Roben
  2008-04-23 19:17                 ` [PATCH 09/11] Git.pm: Add command_bidi_pipe and command_close_bidi_pipe Adam Roben
  0 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Junio C Hamano

This allows multiple paths to be specified on stdin.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 Documentation/git-hash-object.txt |    5 +++-
 hash-object.c                     |   44 ++++++++++++++++++++++++++++++++++++-
 t/t1007-hash-object.sh            |   32 ++++++++++++++++++++++++++
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt
index 33030c0..99a2143 100644
--- a/Documentation/git-hash-object.txt
+++ b/Documentation/git-hash-object.txt
@@ -8,7 +8,7 @@ git-hash-object - Compute object ID and optionally creates a blob from a file
 
 SYNOPSIS
 --------
-'git-hash-object' [-t <type>] [-w] [--stdin] [--] <file>...
+'git-hash-object' [-t <type>] [-w] [--stdin | --stdin-paths] [--] <file>...
 
 DESCRIPTION
 -----------
@@ -32,6 +32,9 @@ OPTIONS
 --stdin::
 	Read the object from standard input instead of from a file.
 
+--stdin-paths::
+	Read file names from stdin instead of from the command-line.
+
 Author
 ------
 Written by Junio C Hamano <junkio@cox.net>
diff --git a/hash-object.c b/hash-object.c
index 61e7160..1b39162 100644
--- a/hash-object.c
+++ b/hash-object.c
@@ -20,6 +20,7 @@ static void hash_object(const char *path, enum object_type type, int write_objec
 		    ? "Unable to add %s to database"
 		    : "Unable to hash %s", path);
 	printf("%s\n", sha1_to_hex(sha1));
+	maybe_flush_or_die(stdout, "hash to stdout");
 }
 
 static void hash_stdin(const char *type, int write_object)
@@ -30,8 +31,27 @@ static void hash_stdin(const char *type, int write_object)
 	printf("%s\n", sha1_to_hex(sha1));
 }
 
+static int hash_stdin_paths(const char *type, int write_objects)
+{
+	struct strbuf buf, nbuf;
+
+	strbuf_init(&buf, 0);
+	strbuf_init(&nbuf, 0);
+	while (strbuf_getline(&buf, stdin, '\n') != EOF) {
+		if (buf.buf[0] == '"') {
+			strbuf_reset(&nbuf);
+			if (unquote_c_style(&nbuf, buf.buf, NULL))
+				die("line is badly quoted");
+			strbuf_swap(&buf, &nbuf);
+		}
+		hash_object(buf.buf, type_from_string(type), write_objects);
+	}
+	strbuf_release(&buf);
+	strbuf_release(&nbuf);
+}
+
 static const char hash_object_usage[] =
-"git-hash-object [-t <type>] [-w] [--stdin] <file>...";
+"git-hash-object [ [-t <type>] [-w] [--stdin] <file>... | --stdin-paths < <list-of-paths> ]";
 
 int main(int argc, char **argv)
 {
@@ -42,6 +62,7 @@ int main(int argc, char **argv)
 	int prefix_length = -1;
 	int no_more_flags = 0;
 	int hashstdin = 0;
+	int stdin_paths = 0;
 
 	git_config(git_default_config);
 
@@ -65,7 +86,19 @@ int main(int argc, char **argv)
 			}
 			else if (!strcmp(argv[i], "--help"))
 				usage(hash_object_usage);
+			else if (!strcmp(argv[i], "--stdin-paths")) {
+				if (hashstdin) {
+					error("Can't use --stdin-paths with --stdin");
+					usage(hash_object_usage);
+				}
+				stdin_paths = 1;
+
+			}
 			else if (!strcmp(argv[i], "--stdin")) {
+				if (stdin_paths) {
+					error("Can't use %s with --stdin-paths", argv[i]);
+					usage(hash_object_usage);
+				}
 				if (hashstdin)
 					die("Multiple --stdin arguments are not supported");
 				hashstdin = 1;
@@ -76,6 +109,11 @@ int main(int argc, char **argv)
 		else {
 			const char *arg = argv[i];
 
+			if (stdin_paths) {
+				error("Can't specify files (such as \"%s\") with --stdin-paths", arg);
+				usage(hash_object_usage);
+			}
+
 			if (hashstdin) {
 				hash_stdin(type, write_object);
 				hashstdin = 0;
@@ -87,6 +125,10 @@ int main(int argc, char **argv)
 			no_more_flags = 1;
 		}
 	}
+
+	if (stdin_paths)
+		return hash_stdin_paths(type, write_object);
+
 	if (hashstdin)
 		hash_stdin(type, write_object);
 	return 0;
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index eb54f1f..dbb4129 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -58,6 +58,15 @@ test_expect_success \
     "multiple '--stdin's are rejected" \
     "test_must_fail git hash-object --stdin --stdin < example"
 
+test_expect_success \
+    "Can't use --stdin and --stdin-paths together" \
+    "test_must_fail git hash-object --stdin --stdin-paths &&
+     test_must_fail git hash-object --stdin-paths --stdin"
+
+test_expect_success \
+    "Can't pass filenames as arguments with --stdin-paths" \
+    "test_must_fail git hash-object --stdin-paths hello < example"
+
 # Behavior
 
 push_repo
@@ -104,4 +113,27 @@ for args in "-w --stdin" "--stdin -w"; do
     pop_repo
 done
 
+filenames="hello
+example"
+
+sha1s="$hello_sha1
+$example_sha1"
+
+test_expect_success \
+    "hash two files with names on stdin" \
+    "test \"$sha1s\" = \"\$(echo_without_newline \"$filenames\" | git hash-object --stdin-paths)\""
+
+for args in "-w --stdin-paths" "--stdin-paths -w"; do
+    push_repo
+
+        test_expect_success \
+            "hash two files with names on stdin and write to database ($args)" \
+            "test \"$sha1s\" = \"\$(echo_without_newline \"$filenames\" | git hash-object $args)\""
+
+        test_blob_exists $hello_sha1
+        test_blob_exists $example_sha1
+
+    pop_repo
+done
+
 test_done
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 09/11] Git.pm: Add command_bidi_pipe and command_close_bidi_pipe
  2008-04-23 19:17               ` [PATCH 08/11] git-hash-object: Add --stdin-paths option Adam Roben
@ 2008-04-23 19:17                 ` Adam Roben
  2008-04-23 19:17                   ` [PATCH 10/11] Git.pm: Add hash_and_insert_object and cat_blob Adam Roben
  0 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Eric Wong

command_bidi_pipe hands back the stdin and stdout file handles from the
executed command. command_close_bidi_pipe closes these handles and terminates
the process.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 perl/Git.pm |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 2e7f896..d766974 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -51,6 +51,7 @@ require Exporter;
 # Methods which can be called as standalone functions as well:
 @EXPORT_OK = qw(command command_oneline command_noisy
                 command_output_pipe command_input_pipe command_close_pipe
+                command_bidi_pipe command_close_bidi_pipe
                 version exec_path hash_object git_cmd_try);
 
 
@@ -92,6 +93,7 @@ increate nonwithstanding).
 use Carp qw(carp croak); # but croak is bad - throw instead
 use Error qw(:try);
 use Cwd qw(abs_path);
+use IPC::Open2 qw(open2);
 
 }
 
@@ -375,6 +377,60 @@ sub command_close_pipe {
 	_cmd_close($fh, $ctx);
 }
 
+=item command_bidi_pipe ( COMMAND [, ARGUMENTS... ] )
+
+Execute the given C<COMMAND> in the same way as command_output_pipe()
+does but return both an input pipe filehandle and an output pipe filehandle.
+
+The function will return return C<($pid, $pipe_in, $pipe_out, $ctx)>.
+See C<command_close_bidi_pipe()> for details.
+
+=cut
+
+sub command_bidi_pipe {
+	my ($pid, $in, $out);
+	$pid = open2($in, $out, 'git', @_);
+	return ($pid, $in, $out, join(' ', @_));
+}
+
+=item command_close_bidi_pipe ( PID, PIPE_IN, PIPE_OUT [, CTX] )
+
+Close the C<PIPE_IN> and C<PIPE_OUT> as returned from C<command_bidi_pipe()>,
+checking whether the command finished successfully. The optional C<CTX>
+argument is required if you want to see the command name in the error message,
+and it is the fourth value returned by C<command_bidi_pipe()>.  The call idiom
+is:
+
+	my ($pid, $in, $out, $ctx) = $r->command_bidi_pipe('cat-file --batch-check');
+	print "000000000\n" $out;
+	while (<$in>) { ... }
+	$r->command_close_bidi_pipe($pid, $in, $out, $ctx);
+
+Note that you should not rely on whatever actually is in C<CTX>;
+currently it is simply the command name but in future the context might
+have more complicated structure.
+
+=cut
+
+sub command_close_bidi_pipe {
+	my ($pid, $in, $out, $ctx) = @_;
+	foreach my $fh ($in, $out) {
+		unless (close $fh) {
+			if ($!) {
+				carp "error closing pipe: $!";
+			} elsif ($? >> 8) {
+				throw Git::Error::Command($ctx, $? >>8);
+			}
+		}
+	}
+
+	waitpid $pid, 0;
+
+	if ($? >> 8) {
+		throw Git::Error::Command($ctx, $? >>8);
+	}
+}
+
 
 =item command_noisy ( COMMAND [, ARGUMENTS... ] )
 
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 10/11] Git.pm: Add hash_and_insert_object and cat_blob
  2008-04-23 19:17                 ` [PATCH 09/11] Git.pm: Add command_bidi_pipe and command_close_bidi_pipe Adam Roben
@ 2008-04-23 19:17                   ` Adam Roben
  2008-04-23 19:17                     ` [PATCH 11/11] git-svn: Speed up fetch Adam Roben
  0 siblings, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Eric Wong

These functions are more efficient ways of executing `git hash-object -w` and
`git cat-file blob` when you are dealing with many files/objects.

Signed-off-by: Adam Roben <aroben@apple.com>
---
Eric Wong <normalperson@yhbt.net> wrote:
> > diff --git a/perl/Git.pm b/perl/Git.pm
> > index 46c5d10..f23edef 100644
> > --- a/perl/Git.pm
> > +++ b/perl/Git.pm
> > @@ -39,6 +39,9 @@ $VERSION = '0.01';
> >    my $lastrev = $repo->command_oneline( [ 'rev-list', '--all' ],
> >                                          STDERR => 0 );
> >  
> > +  my $sha1 = $repo->hash_and_insert_object('file.txt');
> > +  my $contents = $repo->cat_blob($sha1);
> 
> I missed this the first time around.  But I'd rather be able to pass a
> file handle to cat_blob for writing, instead of returning a potentially
> huge string in memory.

Fixed.

> > @@ -675,6 +677,93 @@ sub hash_object {
> >  }
> >  
> >  
> > +=item hash_and_insert_object ( FILENAME )
> > +
> > +Compute the SHA1 object id of the given C<FILENAME> and add the object to the
> > +object database.
> > +
> > +The function returns the SHA1 hash.
> > +
> > +=cut
> > +
> > +# TODO: Support for passing FILEHANDLE instead of FILENAME
> 
> Filenames are fine for this input since they (are/should be) generated
> by File::Temp and not from an untrusted repo.
> 
> We should, however assert that the caller of this function
> isn't using a stupid filename with "\n" in it.

Fixed.

> > +sub hash_and_insert_object {
> > +   my ($self, $filename) = @_;
> > +
> > +   $self->_open_hash_and_insert_object_if_needed();
> > +   my ($in, $out) = ($self->{hash_object_in}, $self->{hash_object_out});
> > +
> > +   print $out $filename, "\n";
> > +   chomp(my $hash = <$in>);
> > +   return $hash;
> > +}
> > +
> > +sub _open_hash_and_insert_object_if_needed {
> > +   my ($self) = @_;
> > +
> > +   return if defined($self->{hash_object_pid});
> > +
> > +   ($self->{hash_object_pid}, $self->{hash_object_in},
> > +    $self->{hash_object_out}, $self->{hash_object_ctx}) =
> > +           command_bidi_pipe(qw(hash-object -w --stdin-paths));
> > +}
> > +
> > +sub _close_hash_and_insert_object {
> > +   my ($self) = @_;
> > +
> > +   return unless defined($self->{hash_object_pid});
> > +
> > +   my @vars = map { 'hash_object' . $_ } qw(pid in out ctx);
> 
> It looks like you're missing a '_' in there.

Fixed.

> > +=item cat_blob ( SHA1 )
> > +
> > +Returns the contents of the blob identified by C<SHA1>.
> > +
> > +=cut
> > +
> > +sub cat_blob {
> > +   my ($self, $sha1) = @_;
> > +
> > +   $self->_open_cat_blob_if_needed();
> > +   my ($in, $out) = ($self->{cat_blob_in}, $self->{cat_blob_out});
> > +
> > +   print $out $sha1, "\n";
> > +   chomp(my $size = <$in>);
> > +
> > +   my $blob;
> > +   my $result = read($in, $blob, $size);
> > +   defined $result or carp $!;
> > +
> > +   # Skip past the trailing newline.
> > +   read($in, my $newline, 1);
> > +
> > +   return $blob;
> > +}
> 
> However, I'd very much like to be able to pass a file handle to this
> function.  This should read()/print() to a file handle passed to it in a
> loop rather than slurping all of $size at once, since the files we're
> receiving can be huge.

Fixed.

> I'd also be happier if we checked that we actually read $size bytes in
> the loop, and that $newline is actually "\n" to safeguard against bugs
> in cat-blob.

Fixed.

> > +sub _open_cat_blob_if_needed {
> > +   my ($self) = @_;
> > +
> > +   return if defined($self->{cat_blob_pid});
> > +
> > +   ($self->{cat_blob_pid}, $self->{cat_blob_in},
> > +    $self->{cat_blob_out}, $self->{cat_blob_ctx}) =
> > +           command_bidi_pipe(qw(cat-file blob --stdin));
> > +}
> > +
> > +sub _close_cat_blob {
> > +   my ($self) = @_;
> > +
> > +   return unless defined($self->{cat_blob_pid});
> > +
> > +   my @vars = map { 'cat_blob' . $_ } qw(pid in out ctx);
> 
> It looks like you're missing a '_' here, too.

Fixed.

> One more nit, I'm a bit paranoid, but I personally like to die/croak if
> the result of every print()/syswrite() to make sure the pipe we're
> writing to didn't die or if there were other error indicators.

Fixed.

 perl/Git.pm |  152 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 150 insertions(+), 2 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index d766974..6ba8ee5 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -39,6 +39,10 @@ $VERSION = '0.01';
   my $lastrev = $repo->command_oneline( [ 'rev-list', '--all' ],
                                         STDERR => 0 );
 
+  my $sha1 = $repo->hash_and_insert_object('file.txt');
+  my $tempfile = tempfile();
+  my $size = $repo->cat_blob($sha1, $tempfile);
+
 =cut
 
 
@@ -218,7 +222,6 @@ sub repository {
 	bless $self, $class;
 }
 
-
 =back
 
 =head1 METHODS
@@ -734,6 +737,147 @@ sub hash_object {
 }
 
 
+=item hash_and_insert_object ( FILENAME )
+
+Compute the SHA1 object id of the given C<FILENAME> and add the object to the
+object database.
+
+The function returns the SHA1 hash.
+
+=cut
+
+# TODO: Support for passing FILEHANDLE instead of FILENAME
+sub hash_and_insert_object {
+	my ($self, $filename) = @_;
+
+	carp "Bad filename \"$filename\"" if $filename =~ /[\r\n]/;
+
+	$self->_open_hash_and_insert_object_if_needed();
+	my ($in, $out) = ($self->{hash_object_in}, $self->{hash_object_out});
+
+	unless (print $out $filename, "\n") {
+		$self->_close_hash_and_insert_object();
+		throw Error::Simple("out pipe went bad");
+	}
+
+	chomp(my $hash = <$in>);
+	unless (defined($hash)) {
+		$self->_close_hash_and_insert_object();
+		throw Error::Simple("in pipe went bad");
+	}
+
+	return $hash;
+}
+
+sub _open_hash_and_insert_object_if_needed {
+	my ($self) = @_;
+
+	return if defined($self->{hash_object_pid});
+
+	($self->{hash_object_pid}, $self->{hash_object_in},
+	 $self->{hash_object_out}, $self->{hash_object_ctx}) =
+		command_bidi_pipe(qw(hash-object -w --stdin-paths));
+}
+
+sub _close_hash_and_insert_object {
+	my ($self) = @_;
+
+	return unless defined($self->{hash_object_pid});
+
+	my @vars = map { 'hash_object_' . $_ } qw(pid in out ctx);
+
+	command_close_bidi_pipe($self->{@vars});
+	delete $self->{@vars};
+}
+
+=item cat_blob ( SHA1, FILEHANDLE )
+
+Prints the contents of the blob identified by C<SHA1> to C<FILEHANDLE> and
+returns the number of bytes printed.
+
+=cut
+
+sub cat_blob {
+	my ($self, $sha1, $fh) = @_;
+
+	$self->_open_cat_blob_if_needed();
+	my ($in, $out) = ($self->{cat_blob_in}, $self->{cat_blob_out});
+
+	unless (print $out $sha1, "\n") {
+		$self->_close_cat_blob();
+		throw Error::Simple("out pipe went bad");
+	}
+
+	my $description = <$in>;
+	if ($description =~ / missing$/) {
+		carp "$sha1 doesn't exist in the repository";
+		return 0;
+	}
+
+	if ($description !~ /^[0-9a-fA-F]{40} \S+ (\d+)$/) {
+		carp "Unexpected result returned from git cat-file";
+		return 0;
+	}
+
+	my $size = $1;
+
+	my $blob;
+	my $bytesRead = 0;
+
+	while (1) {
+		my $bytesLeft = $size - $bytesRead;
+		last unless $bytesLeft;
+
+		my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
+		my $read = read($in, $blob, $bytesToRead, $bytesRead);
+		unless (defined($read)) {
+			$self->_close_cat_blob();
+			throw Error::Simple("in pipe went bad");
+		}
+
+		$bytesRead += $read;
+	}
+
+	# Skip past the trailing newline.
+	my $newline;
+	my $read = read($in, $newline, 1);
+	unless (defined($read)) {
+		$self->_close_cat_blob();
+		throw Error::Simple("in pipe went bad");
+	}
+	unless ($read == 1 && $newline eq "\n") {
+		$self->_close_cat_blob();
+		throw Error::Simple("didn't find newline after blob");
+	}
+
+	unless (print $fh $blob) {
+		$self->_close_cat_blob();
+		throw Error::Simple("couldn't write to passed in filehandle");
+	}
+
+	return $size;
+}
+
+sub _open_cat_blob_if_needed {
+	my ($self) = @_;
+
+	return if defined($self->{cat_blob_pid});
+
+	($self->{cat_blob_pid}, $self->{cat_blob_in},
+	 $self->{cat_blob_out}, $self->{cat_blob_ctx}) =
+		command_bidi_pipe(qw(cat-file --batch));
+}
+
+sub _close_cat_blob {
+	my ($self) = @_;
+
+	return unless defined($self->{cat_blob_pid});
+
+	my @vars = map { 'cat_blob_' . $_ } qw(pid in out ctx);
+
+	command_close_bidi_pipe($self->{@vars});
+	delete $self->{@vars};
+}
 
 =back
 
@@ -951,7 +1095,11 @@ sub _cmd_close {
 }
 
 
-sub DESTROY { }
+sub DESTROY {
+	my ($self) = @_;
+	$self->_close_hash_and_insert_object();
+	$self->_close_cat_blob();
+}
 
 
 # Pipe implementation for ActiveState Perl.
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 11/11] git-svn: Speed up fetch
  2008-04-23 19:17                   ` [PATCH 10/11] Git.pm: Add hash_and_insert_object and cat_blob Adam Roben
@ 2008-04-23 19:17                     ` Adam Roben
  0 siblings, 0 replies; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:17 UTC (permalink / raw)
  To: git; +Cc: Adam Roben, Eric Wong

We were spending a lot of time forking/execing git-cat-file and
git-hash-object. We now maintain a global Git repository object in order to use
Git.pm's more efficient hash_and_insert_object and cat_blob methods.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 git-svn.perl |   42 ++++++++++++++++++++----------------------
 1 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index b864b54..5ef9d23 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -4,7 +4,7 @@
 use warnings;
 use strict;
 use vars qw/	$AUTHOR $VERSION
-		$sha1 $sha1_short $_revision
+		$sha1 $sha1_short $_revision $_repository
 		$_q $_authors %users/;
 $AUTHOR = 'Eric Wong <normalperson@yhbt.net>';
 $VERSION = '@@GIT_VERSION@@';
@@ -220,6 +220,7 @@ unless ($cmd && $cmd =~ /(?:clone|init|multi-init)$/) {
 		}
 		$ENV{GIT_DIR} = $git_dir;
 	}
+	$_repository = Git->repository(Repository => $ENV{GIT_DIR});
 }
 
 my %opts = %{$cmd{$cmd}->[2]} if (defined $cmd);
@@ -301,6 +302,7 @@ sub do_git_init_db {
 			}
 		}
 		command_noisy(@init_db);
+		$_repository = Git->repository(Repository => ".git");
 	}
 	my $set;
 	my $pfx = "svn-remote.$Git::SVN::default_repo_id";
@@ -317,6 +319,7 @@ sub init_subdir {
 	mkpath([$repo_path]) unless -d $repo_path;
 	chdir $repo_path or die "Couldn't chdir to $repo_path: $!\n";
 	$ENV{GIT_DIR} = '.git';
+	$_repository = Git->repository(Repository => $ENV{GIT_DIR});
 }
 
 sub cmd_clone {
@@ -3010,6 +3013,7 @@ use vars qw/@ISA/;
 use strict;
 use warnings;
 use Carp qw/croak/;
+use File::Temp qw/tempfile/;
 use IO::File qw//;
 
 # file baton members: path, mode_a, mode_b, pool, fh, blob, base
@@ -3165,14 +3169,9 @@ sub apply_textdelta {
 	my $base = IO::File->new_tmpfile;
 	$base->autoflush(1);
 	if ($fb->{blob}) {
-		defined (my $pid = fork) or croak $!;
-		if (!$pid) {
-			open STDOUT, '>&', $base or croak $!;
-			print STDOUT 'link ' if ($fb->{mode_a} == 120000);
-			exec qw/git-cat-file blob/, $fb->{blob} or croak $!;
-		}
-		waitpid $pid, 0;
-		croak $? if $?;
+		print $base 'link ' if ($fb->{mode_a} == 120000);
+		my $size = $::_repository->cat_blob($fb->{blob}, $base);
+		die "Failed to read object $fb->{blob}" unless $size;
 
 		if (defined $exp) {
 			seek $base, 0, 0 or croak $!;
@@ -3213,14 +3212,18 @@ sub close_file {
 				sysseek($fh, 0, 0) or croak $!;
 			}
 		}
-		defined(my $pid = open my $out,'-|') or die "Can't fork: $!\n";
-		if (!$pid) {
-			open STDIN, '<&', $fh or croak $!;
-			exec qw/git-hash-object -w --stdin/ or croak $!;
+
+		my ($tmp_fh, $tmp_filename) = File::Temp::tempfile(UNLINK => 1);
+		my $result;
+		while ($result = sysread($fh, my $string, 1024)) {
+			syswrite($tmp_fh, $string, $result);
 		}
-		chomp($hash = do { local $/; <$out> });
-		close $out or croak $!;
+		defined $result or croak $!;
+		close $tmp_fh or croak $!;
+
 		close $fh or croak $!;
+
+		$hash = $::_repository->hash_and_insert_object($tmp_filename);
 		$hash =~ /^[a-f\d]{40}$/ or die "not a sha1: $hash\n";
 		close $fb->{base} or croak $!;
 	} else {
@@ -3546,13 +3549,8 @@ sub chg_file {
 	} elsif ($m->{mode_a} =~ /^120/ && $m->{mode_b} !~ /^120/) {
 		$self->change_file_prop($fbat,'svn:special',undef);
 	}
-	defined(my $pid = fork) or croak $!;
-	if (!$pid) {
-		open STDOUT, '>&', $fh or croak $!;
-		exec qw/git-cat-file blob/, $m->{sha1_b} or croak $!;
-	}
-	waitpid $pid, 0;
-	croak $? if $?;
+	my $size = $::_repository->cat_blob($m->{sha1_b}, $fh);
+	croak "Failed to read object $m->{sha1_b}" unless $size;
 	$fh->flush == 0 or croak $!;
 	seek $fh, 0, 0 or croak $!;
 
-- 
1.5.5.1.152.g9aeb7

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Speed up git-svn fetch
  2008-04-23 19:17 Speed up git-svn fetch Adam Roben
  2008-04-23 19:17 ` [PATCH 01/11] Add tests for git cat-file Adam Roben
@ 2008-04-23 19:19 ` Adam Roben
  2008-04-25  7:15   ` Eric Wong
  1 sibling, 1 reply; 19+ messages in thread
From: Adam Roben @ 2008-04-23 19:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Sixt, Brian Downing, Eric Wong

Adam Roben wrote:
> I first sent this patch series 6 months ago today [1], then resent it after
> some comments from Junio, Johannes, and Brian [2].

Here are those links:

[1] http://thread.gmane.org/gmane.comp.version-control.git/62098/focus=62102
[2] http://thread.gmane.org/gmane.comp.version-control.git/62295/focus=62298

-Adam

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 01/11] Add tests for git cat-file
  2008-04-23 19:17 ` [PATCH 01/11] Add tests for git cat-file Adam Roben
  2008-04-23 19:17   ` [PATCH 02/11] git-cat-file: Small refactor of cmd_cat_file Adam Roben
@ 2008-04-25  6:56   ` Eric Wong
  2008-04-25 18:06     ` Junio C Hamano
  2008-04-25 18:03   ` Junio C Hamano
  2 siblings, 1 reply; 19+ messages in thread
From: Eric Wong @ 2008-04-25  6:56 UTC (permalink / raw)
  To: Adam Roben; +Cc: git, Junio C Hamano, Johannes Sixt

Adam Roben <aroben@apple.com> wrote:
> 
> Signed-off-by: Adam Roben <aroben@apple.com>
> ---
>  t/t1006-cat-file.sh |  101 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 101 insertions(+), 0 deletions(-)
>  create mode 100755 t/t1006-cat-file.sh
> 
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> new file mode 100755
> index 0000000..15741d9
> --- /dev/null
> +++ b/t/t1006-cat-file.sh
> @@ -0,0 +1,101 @@
> +#!/bin/sh
> +
> +test_description='git cat-file'
> +
> +. ./test-lib.sh
> +
> +function echo_without_newline()

The "function " keyword is a bashism and not needed, this breaks
my test run with dash as /bin/sh (same thing in t1007).

> +{
> +    echo "$@\c"

I guess we have different bash versions/options, because this breaks for
me in bash (3.1dfsg-8 from Debian etch).  It would need -e to handle to
handle escape sequence, but that's a bashism, too.

Use printf "$@" here instead.

So yes, this test was broken in both dash and bash for me without the
above fixes.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Speed up git-svn fetch
  2008-04-23 19:19 ` Speed up git-svn fetch Adam Roben
@ 2008-04-25  7:15   ` Eric Wong
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Wong @ 2008-04-25  7:15 UTC (permalink / raw)
  To: Adam Roben; +Cc: git, Junio C Hamano, Johannes Sixt, Brian Downing

Adam Roben <aroben@apple.com> wrote:
> Adam Roben wrote:
> >I first sent this patch series 6 months ago today [1], then resent it after
> >some comments from Junio, Johannes, and Brian [2].
> 
> Here are those links:
> 
> [1] http://thread.gmane.org/gmane.comp.version-control.git/62098/focus=62102
> [2] http://thread.gmane.org/gmane.comp.version-control.git/62295/focus=62298

Thanks for following up on this, the Perl bits look good to me[1].

I commented in a separate email about the shell incompatibilities
I experienced.

I still the following gcc warnings when building it:

> hash-object.c: In function 'hash_stdin_paths':
> hash-object.c:43: warning: implicit declaration of function 'unquote_c_style'

Trivial fix:
--- a/hash-object.c
+++ b/hash-object.c
@@ -6,6 +6,7 @@
  */
 #include "cache.h"
 #include "blob.h"
+#include "quote.h"

> hash-object.c:51: warning: control reaches end of non-void function

This can probably just be a void.

> builtin-cat-file.c: In function 'cmd_cat_file':
> builtin-cat-file.c:224: warning: suggest parentheses around && within ||

Ugh, you added long (>80 char) lines to this and I'm having trouble
following it.  I believe the git (like Linux) coding style calls for 80
char lines unless there is really no other way[2].  This is also a
problem for me in some of the shell tests, too.

> builtin-cat-file.c:151: warning: 'contents' may be used uninitialized in this function

gcc isn't smart here.


[1] - disclaimer, I'm not in my best mental state at this point in the
night/morning so maybe some things have slipped :)

[2] - Looking at your email address, I notice you work for a company that
pushes widescreen monitors, but I remain firmly on the side of dead tree
publishers whom I believe got line-wrapping right centuries ago :)

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 01/11] Add tests for git cat-file
  2008-04-23 19:17 ` [PATCH 01/11] Add tests for git cat-file Adam Roben
  2008-04-23 19:17   ` [PATCH 02/11] git-cat-file: Small refactor of cmd_cat_file Adam Roben
  2008-04-25  6:56   ` [PATCH 01/11] Add tests for git cat-file Eric Wong
@ 2008-04-25 18:03   ` Junio C Hamano
  2008-05-06  6:41     ` Junio C Hamano
  2 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2008-04-25 18:03 UTC (permalink / raw)
  To: Adam Roben; +Cc: git, Johannes Sixt

Adam Roben <aroben@apple.com> writes:

> Signed-off-by: Adam Roben <aroben@apple.com>
> ---
>  t/t1006-cat-file.sh |  101 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 101 insertions(+), 0 deletions(-)
>  create mode 100755 t/t1006-cat-file.sh
>
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> new file mode 100755
> index 0000000..15741d9
> --- /dev/null
> +++ b/t/t1006-cat-file.sh
> @@ -0,0 +1,101 @@
> +#!/bin/sh
> +
> +test_description='git cat-file'
> +
> +. ./test-lib.sh
> +
> +function echo_without_newline()
> +{
> +    echo "$@\c"
> +}

"function " noiseword is unnecessary and unportable.

echo "\c" is unportable.  Traditional trick to do this is to pipe it to tr
to strip "\010".

When you see "<anything>$@<anything>" in shell scripts, you usually do not
mean it, but instead mean "<anything>$*<anything>".  It does not make
difference if you always feed a single parameter, but it is a good habit
to get into.

> +function maybe_remove_timestamp()
> +{
> +    if test -z "$2"; then
> +        echo_without_newline "$1"
> +    else
> +        echo_without_newline "$1" | sed -e 's/ [0-9]\{10\} [+-][0-9]\{4\}$//'

We tend to avoid \{num\| for portability.  Do you really need them here?

> +function run_tests()
> +{
> +    type=$1
> +    sha1=$2
> +    size=$3
> +    content=$4
> +    pretty_content=$5
> +    no_timestamp=$6
> +
> +    test_expect_success \
> +        "$type exists" \
> +        "git cat-file -e $sha1"
> +    test_expect_success \
> +        "Type of $type is correct" \
> +        "test $type = \"\$(git cat-file -t $sha1)\""

It is quite hard to read with backslashes inside dq.  Please use single
quote pairs without excess backslashes when possible.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 03/11] git-cat-file: Make option parsing a little more flexible
  2008-04-23 19:17     ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Adam Roben
  2008-04-23 19:17       ` [PATCH 04/11] git-cat-file: Add --batch-check option Adam Roben
@ 2008-04-25 18:04       ` Junio C Hamano
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2008-04-25 18:04 UTC (permalink / raw)
  To: Adam Roben; +Cc: git

Adam Roben <aroben@apple.com> writes:

> This will make it easier to add newer options later.

Hmm.  Makes one wonder "why not parse-options"?...

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 01/11] Add tests for git cat-file
  2008-04-25  6:56   ` [PATCH 01/11] Add tests for git cat-file Eric Wong
@ 2008-04-25 18:06     ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2008-04-25 18:06 UTC (permalink / raw)
  To: Eric Wong; +Cc: Adam Roben, git, Johannes Sixt

Eric Wong <normalperson@yhbt.net> writes:

> Adam Roben <aroben@apple.com> wrote:
>> 
>> Signed-off-by: Adam Roben <aroben@apple.com>
>> ---
>>  t/t1006-cat-file.sh |  101 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 101 insertions(+), 0 deletions(-)
>>  create mode 100755 t/t1006-cat-file.sh
>> 
>> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
>> new file mode 100755
>> index 0000000..15741d9
>> --- /dev/null
>> +++ b/t/t1006-cat-file.sh
>> @@ -0,0 +1,101 @@
>> +#!/bin/sh
>> +
>> +test_description='git cat-file'
>> +
>> +. ./test-lib.sh
>> +
>> +function echo_without_newline()
>
> The "function " keyword is a bashism and not needed, this breaks
> my test run with dash as /bin/sh (same thing in t1007).
>
>> +{
>> +    echo "$@\c"
>
> I guess we have different bash versions/options, because this breaks for
> me in bash (3.1dfsg-8 from Debian etch).  It would need -e to handle to
> handle escape sequence, but that's a bashism, too.
>
> Use printf "$@" here instead.

Looking at the callers, I do not think you want that.  I would suggest
something defensive like:

	printf '%s' "$*"

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 01/11] Add tests for git cat-file
  2008-04-25 18:03   ` Junio C Hamano
@ 2008-05-06  6:41     ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2008-05-06  6:41 UTC (permalink / raw)
  To: Adam Roben; +Cc: git, Johannes Sixt

Junio C Hamano <gitster@pobox.com> writes:

> Adam Roben <aroben@apple.com> writes:
>
>> Signed-off-by: Adam Roben <aroben@apple.com>
>> ---
>>  t/t1006-cat-file.sh |  101 +++++++++++++++++++++++++++++++++++++++++++++++++++

I generally do not fix other people's mess, but every once in a while I
try to be nice, so I've fixed up various portability glitches in this test
script file and queued the early part to 'pu'.

However the t1007 test that is added later in the series have the same
kind of breakage.  I'll discard the rest of the series for now, but
hopefully the interdiff between the submitted and committed t1006 would
serve as a template to fix t1007.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2008-05-06  6:42 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-23 19:17 Speed up git-svn fetch Adam Roben
2008-04-23 19:17 ` [PATCH 01/11] Add tests for git cat-file Adam Roben
2008-04-23 19:17   ` [PATCH 02/11] git-cat-file: Small refactor of cmd_cat_file Adam Roben
2008-04-23 19:17     ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Adam Roben
2008-04-23 19:17       ` [PATCH 04/11] git-cat-file: Add --batch-check option Adam Roben
2008-04-23 19:17         ` [PATCH 05/11] git-cat-file: Add --batch option Adam Roben
2008-04-23 19:17           ` [PATCH 06/11] Move git-hash-object tests from t5303 to t1007 Adam Roben
2008-04-23 19:17             ` [PATCH 07/11] Add more tests for git hash-object Adam Roben
2008-04-23 19:17               ` [PATCH 08/11] git-hash-object: Add --stdin-paths option Adam Roben
2008-04-23 19:17                 ` [PATCH 09/11] Git.pm: Add command_bidi_pipe and command_close_bidi_pipe Adam Roben
2008-04-23 19:17                   ` [PATCH 10/11] Git.pm: Add hash_and_insert_object and cat_blob Adam Roben
2008-04-23 19:17                     ` [PATCH 11/11] git-svn: Speed up fetch Adam Roben
2008-04-25 18:04       ` [PATCH 03/11] git-cat-file: Make option parsing a little more flexible Junio C Hamano
2008-04-25  6:56   ` [PATCH 01/11] Add tests for git cat-file Eric Wong
2008-04-25 18:06     ` Junio C Hamano
2008-04-25 18:03   ` Junio C Hamano
2008-05-06  6:41     ` Junio C Hamano
2008-04-23 19:19 ` Speed up git-svn fetch Adam Roben
2008-04-25  7:15   ` Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).