Git development


* [PATCH 3/9] git-cat-file: Make option parsing a little more flexible
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben
In-Reply-To: <1193307927-3592-3-git-send-email-aroben@apple.com>

This will make it easier to add newer options later.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 builtin-cat-file.c |   42 ++++++++++++++++++++++++++++++------------
 1 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index 34a63d1..3a0be4a 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -143,23 +143,41 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	return 0;
 }
 
+static const char cat_file_usage[] = "git-cat-file [-t|-s|-e|-p|<type>] <sha1>";
+
 int cmd_cat_file(int argc, const char **argv, const char *prefix)
 {
-	int opt;
-	const char *exp_type, *obj_name;
+	int i, opt = 0;
+	const char *exp_type = 0, *obj_name = 0;
 
 	git_config(git_default_config);
-	if (argc != 3)
-		usage("git-cat-file [-t|-s|-e|-p|<type>] <sha1>");
-	exp_type = argv[1];
-	obj_name = argv[2];
-
-	opt = 0;
-	if ( exp_type[0] == '-' ) {
-		opt = exp_type[1];
-		if ( !opt || exp_type[2] )
-			opt = -1; /* Not a single character option */
+
+	for (i = 1; i < argc; ++i) {
+		const char *arg = argv[i];
+
+		if (!strcmp(arg, "-t") || !strcmp(arg, "-s") || !strcmp(arg, "-e") || !strcmp(arg, "-p")) {
+			exp_type = arg;
+			opt = exp_type[1];
+			continue;
+		}
+
+		if (arg[0] == '-')
+			usage(cat_file_usage);
+
+		if (!exp_type) {
+			exp_type = arg;
+			continue;
+		}
+
+		if (obj_name)
+			usage(cat_file_usage);
+
+		obj_name = arg;
+		break;
 	}
 
+	if (!exp_type || !obj_name)
+		usage(cat_file_usage);
+
 	return cat_one_file(opt, exp_type, obj_name);
 }
-- 
1.5.3.4.1337.g8e67d-dirty


* [PATCH 4/9] git-cat-file: Add --stdin option
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben, Brian Downing
In-Reply-To: <1193307927-3592-4-git-send-email-aroben@apple.com>

This lets you specify object names on stdin instead of on the command line.
When printing object contents or pretty-printing, objects will be printed
preceded by their size:

<size>LF
<content>LF
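
To illustrate the framing, here is a minimal shell sketch of a reader for
such a stream. It assumes only the <size>LF<content>LF layout described
above; the parse_records name and the sample input are made up for
illustration and are not part of the patch.

```shell
# parse_records (hypothetical helper): read "<size>LF<content>LF" records
# from stdin and print one summary line per record.
parse_records() {
	while IFS= read -r size; do
		# Read exactly $size bytes; bs=1 avoids over-reading from a pipe.
		# Command substitution strips trailing newlines from the content.
		content=$(dd bs=1 count="$size" 2>/dev/null)
		# Consume the LF that terminates the record.
		dd bs=1 count=1 >/dev/null 2>&1
		printf 'size=%s content=%s\n' "$size" "$content"
	done
}

# Two records: "Hello World\n" (12 bytes) and "Silly example\n" (14 bytes).
printf '12\nHello World\n\n14\nSilly example\n\n' | parse_records
# prints:
# size=12 content=Hello World
# size=14 content=Silly example
```

Reading by byte count rather than by line matters here because object
contents may contain embedded newlines, or none at all.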

Signed-off-by: Adam Roben <aroben@apple.com>
---
Brian Downing wrote:
> I think a far more reasonable output format for multiple objects would
> be something like:
> 
> <count> LF
> <raw data> LF
> 
> Where <count> is the number of bytes in the <raw data> as an ASCII
> decimal integer.

Agreed.

 Documentation/git-cat-file.txt |    6 ++++-
 builtin-cat-file.c             |   43 ++++++++++++++++++++++++++++++++++-----
 t/t1005-cat-file.sh            |   35 ++++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index afa095c..588d71a 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -8,7 +8,7 @@ git-cat-file - Provide content or type/size information for repository objects
 
 SYNOPSIS
 --------
-'git-cat-file' [-t | -s | -e | -p | <type>] <object>
+'git-cat-file' [-t | -s | -e | -p | <type>] [--stdin | <object>]
 
 DESCRIPTION
 -----------
@@ -23,6 +23,10 @@ OPTIONS
 	For a more complete list of ways to spell object names, see
 	"SPECIFYING REVISIONS" section in gitlink:git-rev-parse[1].
 
+--stdin::
+	Read object names from stdin instead of specifying one on the
+	command line.
+
 -t::
 	Instead of the content, show the object type identified by
 	<object>.
diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index 3a0be4a..ee46ba4 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -76,7 +76,7 @@ static void pprint_tag(const unsigned char *sha1, const char *buf, unsigned long
 		write_or_die(1, cp, endp - cp);
 }
 
-static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
+static int cat_one_file(int opt, const char *exp_type, const char *obj_name, int print_size)
 {
 	unsigned char sha1[20];
 	enum object_type type;
@@ -139,16 +139,26 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
 	if (!buf)
 		die("git-cat-file %s: bad file", obj_name);
 
+	if (print_size) {
+		printf("%lu\n", size);
+		fflush(stdout);
+	}
 	write_or_die(1, buf, size);
+	if (print_size) {
+		printf("\n");
+		fflush(stdout);
+	}
 	return 0;
 }
 
-static const char cat_file_usage[] = "git-cat-file [-t|-s|-e|-p|<type>] <sha1>";
+static const char cat_file_usage[] = "git-cat-file [-t|-s|-e|-p|<type>] [--stdin | <sha1>]";
 
 int cmd_cat_file(int argc, const char **argv, const char *prefix)
 {
-	int i, opt = 0;
+	int i, opt = 0, print_size = 0;
+	int read_stdin = 0;
 	const char *exp_type = 0, *obj_name = 0;
+	struct strbuf buf;
 
 	git_config(git_default_config);
 
@@ -161,6 +171,11 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
+		if (!strcmp(arg, "--stdin")) {
+			read_stdin = 1;
+			continue;
+		}
+
 		if (arg[0] == '-')
 			usage(cat_file_usage);
 
@@ -169,15 +184,31 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
-		if (obj_name)
+		if (obj_name || read_stdin)
 			usage(cat_file_usage);
 
 		obj_name = arg;
 		break;
 	}
 
-	if (!exp_type || !obj_name)
+	if (!exp_type)
 		usage(cat_file_usage);
 
-	return cat_one_file(opt, exp_type, obj_name);
+	if (!read_stdin) {
+		if (!obj_name)
+			usage(cat_file_usage);
+		return cat_one_file(opt, exp_type, obj_name, 0);
+	}
+
+	print_size = !opt || opt == 'p';
+
+	strbuf_init(&buf, 0);
+	while (strbuf_getline(&buf, stdin, '\n') != EOF) {
+		int error = cat_one_file(opt, exp_type, buf.buf, print_size);
+		if (error)
+			return error;
+	}
+	strbuf_release(&buf);
+
+	return 0;
 }
diff --git a/t/t1005-cat-file.sh b/t/t1005-cat-file.sh
index 697354d..2b2d386 100755
--- a/t/t1005-cat-file.sh
+++ b/t/t1005-cat-file.sh
@@ -88,4 +88,39 @@ test_expect_success \
     "Reach a blob from a tag pointing to it" \
     "test '$hello_content' = \"\$(git cat-file blob $tag_sha1)\""
 
+sha1s="$hello_sha1
+$tree_sha1
+$commit_sha1
+$tag_sha1"
+
+sizes="$hello_size
+$tree_size
+$commit_size
+$tag_size"
+
+test_expect_success \
+    "Pass object hashes on stdin to retrieve sizes" \
+    "test '$sizes' = \"\$(echo '$sha1s' | git cat-file -s --stdin)\""
+
+example_content="Silly example"
+example_size=$(echo "$example_content" | wc -c)
+example_sha1=f24c74a2e500f5ee1332c86b94199f52b1d1d962
+
+echo "$example_content" > example
+
+git update-index --add example
+
+sha1s="$hello_sha1
+$example_sha1"
+
+contents="$hello_size
+$hello_content
+
+$example_size
+$example_content"
+
+test_expect_success \
+    "Pass object hashes on stdin to retrieve contents" \
+    "test '$contents' = \"\$(echo '$sha1s' | git cat-file blob --stdin)\""
+
 test_done
-- 
1.5.3.4.1337.g8e67d-dirty


* [PATCH 5/9] Add tests for git hash-object
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben, Johannes Sixt
In-Reply-To: <1193307927-3592-5-git-send-email-aroben@apple.com>


Signed-off-by: Adam Roben <aroben@apple.com>
---
Johannes Sixt wrote:
> Adam Roben schrieb:
> > +test_expect_success \
> > +    'hash a file' \
> > +    "test $hello_sha1 = $(git hash-object hello)"
> 
> Put tests in single-quotes; otherwise, the substitutions happen before the test begins, and not as part of the test. 

I think escaping the $(...) is enough to delay command execution.
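
A minimal sketch of that difference, outside the test suite. It assumes a
hypothetical run_later helper standing in for test_expect_success (it
changes some state, then evals its argument); none of these names come
from test-lib.sh.

```shell
state=before

# current: report the state at the moment it is called.
current() { echo "$state"; }

# run_later (hypothetical): change the state, then evaluate the script.
run_later() {
	state=after
	eval "$1"
}

# Unescaped: $(current) runs while the argument is being built.
state=before; run_later "early=$(current)"
# Escaped: the substitution text survives and runs inside eval.
state=before; run_later "late=\$(current)"
# Single-quoted: same timing as the escaped form.
state=before; run_later 'quoted=$(current)'

echo "$early $late $quoted"
# prints: before after after
```

So either escaping the dollar sign or single-quoting the whole script
defers the command until the test body is evaluated.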

 t/t1006-hash-object.sh |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)
 create mode 100755 t/t1006-hash-object.sh

diff --git a/t/t1006-hash-object.sh b/t/t1006-hash-object.sh
new file mode 100755
index 0000000..12f95f0
--- /dev/null
+++ b/t/t1006-hash-object.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+
+test_description='git hash-object'
+
+. ./test-lib.sh
+
+hello_content="Hello World"
+hello_sha1=557db03de997c86a4a028e1ebd3a1ceb225be238
+echo "$hello_content" > hello
+
+test_expect_success \
+    'hash a file' \
+    "test $hello_sha1 = \$(git hash-object hello)"
+
+test_expect_success \
+    'hash from stdin' \
+    "test $hello_sha1 = \$(echo '$hello_content' | git hash-object --stdin)"
+
+test_expect_success \
+    'hash a file and write to database' \
+    "test $hello_sha1 = \$(git hash-object -w hello)"
+
+test_expect_success \
+    'hash from stdin and write to database' \
+    "test $hello_sha1 = \$(echo '$hello_content' | git hash-object -w --stdin)"
+
+test_done
-- 
1.5.3.4.1337.g8e67d-dirty


* [PATCH 7/9] Git.pm: Add command_bidi_pipe and command_close_bidi_pipe
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben
In-Reply-To: <1193307927-3592-7-git-send-email-aroben@apple.com>

command_bidi_pipe hands back the stdin and stdout file handles from the
executed command. command_close_bidi_pipe closes these handles and terminates
the process.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 perl/Git.pm |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 3f4080c..46c5d10 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -51,6 +51,7 @@ require Exporter;
 # Methods which can be called as standalone functions as well:
 @EXPORT_OK = qw(command command_oneline command_noisy
                 command_output_pipe command_input_pipe command_close_pipe
+                command_bidi_pipe command_close_bidi_pipe
                 version exec_path hash_object git_cmd_try);
 
 
@@ -92,6 +93,7 @@ increate nonwithstanding).
 use Carp qw(carp croak); # but croak is bad - throw instead
 use Error qw(:try);
 use Cwd qw(abs_path);
+use IPC::Open2 qw(open2);
 
 }
 
@@ -375,6 +377,60 @@ sub command_close_pipe {
 	_cmd_close($fh, $ctx);
 }
 
+=item command_bidi_pipe ( COMMAND [, ARGUMENTS... ] )
+
+Execute the given C<COMMAND> in the same way as command_output_pipe()
+does but return both an input pipe filehandle and an output pipe filehandle.
+
+The function will return C<($pid, $pipe_in, $pipe_out, $ctx)>.
+See C<command_close_bidi_pipe()> for details.
+
+=cut
+
+sub command_bidi_pipe {
+	my ($pid, $in, $out);
+	$pid = open2($in, $out, 'git', @_);
+	return ($pid, $in, $out, join(' ', @_));
+}
+
+=item command_close_bidi_pipe ( PID, PIPE_IN, PIPE_OUT [, CTX] )
+
+Close the C<PIPE_IN> and C<PIPE_OUT> as returned from C<command_bidi_pipe()>,
+checking whether the command finished successfully. The optional C<CTX>
+argument is required if you want to see the command name in the error message,
+and it is the fourth value returned by C<command_bidi_pipe()>.  The call idiom
+is:
+
+	my ($pid, $in, $out, $ctx) = $r->command_bidi_pipe('cat-file --stdin');
+	print $out "000000000\n";
+	while (<$in>) { ... }
+	$r->command_close_bidi_pipe($pid, $in, $out, $ctx);
+
+Note that you should not rely on whatever actually is in C<CTX>;
+currently it is simply the command name but in future the context might
+have more complicated structure.
+
+=cut
+
+sub command_close_bidi_pipe {
+	my ($pid, $in, $out, $ctx) = @_;
+	foreach my $fh ($in, $out) {
+		if (not close $fh) {
+			if ($!) {
+				carp "error closing pipe: $!";
+			} elsif ($? >> 8) {
+				throw Git::Error::Command($ctx, $? >>8);
+			}
+		}
+	}
+
+	waitpid $pid, 0;
+
+	if ($? >> 8) {
+		throw Git::Error::Command($ctx, $? >>8);
+	}
+}
+
 
 =item command_noisy ( COMMAND [, ARGUMENTS... ] )
 
-- 
1.5.3.4.1337.g8e67d-dirty


* [PATCH 6/9] git-hash-object: Add --stdin-paths option
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben
In-Reply-To: <1193307927-3592-6-git-send-email-aroben@apple.com>

This allows multiple paths to be specified on stdin.

Signed-off-by: Adam Roben <aroben@apple.com>
---
 Documentation/git-hash-object.txt |    5 ++++-
 hash-object.c                     |   29 ++++++++++++++++++++++++++++-
 t/t1006-hash-object.sh            |   22 ++++++++++++++++++++++
 3 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt
index 616f196..50fc401 100644
--- a/Documentation/git-hash-object.txt
+++ b/Documentation/git-hash-object.txt
@@ -8,7 +8,7 @@ git-hash-object - Compute object ID and optionally creates a blob from a file
 
 SYNOPSIS
 --------
-'git-hash-object' [-t <type>] [-w] [--stdin] [--] <file>...
+'git-hash-object' [-t <type>] [-w] [--stdin | --stdin-paths] [--] <file>...
 
 DESCRIPTION
 -----------
@@ -32,6 +32,9 @@ OPTIONS
 --stdin::
 	Read the object from standard input instead of from a file.
 
+--stdin-paths::
+	Read file names from stdin instead of from the command-line.
+
 Author
 ------
 Written by Junio C Hamano <junkio@cox.net>
diff --git a/hash-object.c b/hash-object.c
index 18f5017..fd96d50 100644
--- a/hash-object.c
+++ b/hash-object.c
@@ -20,6 +20,7 @@ static void hash_object(const char *path, enum object_type type, int write_objec
 		    ? "Unable to add %s to database"
 		    : "Unable to hash %s", path);
 	printf("%s\n", sha1_to_hex(sha1));
+	maybe_flush_or_die(stdout, "hash to stdout");
 }
 
 static void hash_stdin(const char *type, int write_object)
@@ -31,7 +32,7 @@ static void hash_stdin(const char *type, int write_object)
 }
 
 static const char hash_object_usage[] =
-"git-hash-object [-t <type>] [-w] [--stdin] <file>...";
+"git-hash-object [-t <type>] [-w] [--stdin | --stdin-paths] <file>...";
 
 int main(int argc, char **argv)
 {
@@ -41,6 +42,7 @@ int main(int argc, char **argv)
 	const char *prefix = NULL;
 	int prefix_length = -1;
 	int no_more_flags = 0;
+	int found_stdin_flag = 0;
 
 	for (i = 1 ; i < argc; i++) {
 		if (!no_more_flags && argv[i][0] == '-') {
@@ -62,7 +64,32 @@ int main(int argc, char **argv)
 			}
 			else if (!strcmp(argv[i], "--help"))
 				usage(hash_object_usage);
+			else if (!strcmp(argv[i], "--stdin-paths")) {
+				struct strbuf buf, nbuf;
+
+				if (found_stdin_flag)
+					die("Can't use both --stdin and --stdin-paths");
+				found_stdin_flag = 1;
+
+				strbuf_init(&buf, 0);
+				strbuf_init(&nbuf, 0);
+				while (strbuf_getline(&buf, stdin, '\n') != EOF) {
+					if (buf.buf[0] == '"') {
+						strbuf_reset(&nbuf);
+						if (unquote_c_style(&nbuf, buf.buf, NULL))
+							die("line is badly quoted");
+						strbuf_swap(&buf, &nbuf);
+					}
+					hash_object(buf.buf, type_from_string(type), write_object);
+				}
+				strbuf_release(&buf);
+				strbuf_release(&nbuf);
+			}
 			else if (!strcmp(argv[i], "--stdin")) {
+				if (found_stdin_flag)
+					die("Can't use both --stdin and --stdin-paths");
+				found_stdin_flag = 1;
+
 				hash_stdin(type, write_object);
 			}
 			else
diff --git a/t/t1006-hash-object.sh b/t/t1006-hash-object.sh
index 12f95f0..e747004 100755
--- a/t/t1006-hash-object.sh
+++ b/t/t1006-hash-object.sh
@@ -24,4 +24,26 @@ test_expect_success \
     'hash from stdin and write to database' \
     "test $hello_sha1 = \$(echo '$hello_content' | git hash-object -w --stdin)"
 
+example_content="Silly example"
+example_sha1=f24c74a2e500f5ee1332c86b94199f52b1d1d962
+echo "$example_content" > example
+
+filenames="hello
+example"
+
+sha1s="$hello_sha1
+$example_sha1"
+
+test_expect_success \
+    'hash two files with names on stdin' \
+    "test '$sha1s' = \"\$(echo '$filenames' | git hash-object --stdin-paths)\""
+
+test_expect_success \
+    'hash two files with names on stdin and write to database' \
+    "test '$sha1s' = \"\$(echo '$filenames' | git hash-object -w --stdin-paths)\""
+
+test_expect_failure \
+    "Can't use --stdin and --stdin-paths together" \
+    "echo '$filenames' | git hash-object --stdin --stdin-paths"
+
 test_done
-- 
1.5.3.4.1337.g8e67d-dirty


* [PATCH 8/9] Git.pm: Add hash_and_insert_object and cat_blob
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben, Eric Wong
In-Reply-To: <1193307927-3592-8-git-send-email-aroben@apple.com>

These functions are more efficient ways of executing `git hash-object -w` and
`git cat-file blob` when you are dealing with many files/objects.

Signed-off-by: Adam Roben <aroben@apple.com>
---
Eric Wong wrote:
> > +package Git::Commands;
> 
> Can this be a separate file, or a part of Git.pm?  I'm sure other
> scripts can eventually use this and I've been meaning to split
> git-svn.perl into separate files so it's easier to follow.

I ended up making it part of Git.pm, because I realized that made far more
sense than splitting it into a separate file.

 perl/Git.pm |   97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 46c5d10..f23edef 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -39,6 +39,9 @@ $VERSION = '0.01';
   my $lastrev = $repo->command_oneline( [ 'rev-list', '--all' ],
                                         STDERR => 0 );
 
+  my $sha1 = $repo->hash_and_insert_object('file.txt');
+  my $contents = $repo->cat_blob($sha1);
+
 =cut
 
 
@@ -218,7 +221,6 @@ sub repository {
 	bless $self, $class;
 }
 
-
 =back
 
 =head1 METHODS
@@ -675,6 +677,93 @@ sub hash_object {
 }
 
 
+=item hash_and_insert_object ( FILENAME )
+
+Compute the SHA1 object id of the given C<FILENAME> and add the object to the
+object database.
+
+The function returns the SHA1 hash.
+
+=cut
+
+# TODO: Support for passing FILEHANDLE instead of FILENAME
+sub hash_and_insert_object {
+	my ($self, $filename) = @_;
+
+	$self->_open_hash_and_insert_object_if_needed();
+	my ($in, $out) = ($self->{hash_object_in}, $self->{hash_object_out});
+
+	print $out $filename, "\n";
+	chomp(my $hash = <$in>);
+	return $hash;
+}
+
+sub _open_hash_and_insert_object_if_needed {
+	my ($self) = @_;
+
+	return if defined($self->{hash_object_pid});
+
+	($self->{hash_object_pid}, $self->{hash_object_in},
+	 $self->{hash_object_out}, $self->{hash_object_ctx}) =
+		command_bidi_pipe(qw(hash-object -w --stdin-paths));
+}
+
+sub _close_hash_and_insert_object {
+	my ($self) = @_;
+
+	return unless defined($self->{hash_object_pid});
+
+	my @vars = map { 'hash_object_' . $_ } qw(pid in out ctx);
+
+	command_close_bidi_pipe(@$self{@vars});
+	delete @$self{@vars};
+}
+
+=item cat_blob ( SHA1 )
+
+Returns the contents of the blob identified by C<SHA1>.
+
+=cut
+
+sub cat_blob {
+	my ($self, $sha1) = @_;
+
+	$self->_open_cat_blob_if_needed();
+	my ($in, $out) = ($self->{cat_blob_in}, $self->{cat_blob_out});
+
+	print $out $sha1, "\n";
+	chomp(my $size = <$in>);
+
+	my $blob;
+	my $result = read($in, $blob, $size);
+	defined $result or carp $!;
+
+	# Skip past the trailing newline.
+	read($in, my $newline, 1);
+
+	return $blob;
+}
+
+sub _open_cat_blob_if_needed {
+	my ($self) = @_;
+
+	return if defined($self->{cat_blob_pid});
+
+	($self->{cat_blob_pid}, $self->{cat_blob_in},
+	 $self->{cat_blob_out}, $self->{cat_blob_ctx}) =
+		command_bidi_pipe(qw(cat-file blob --stdin));
+}
+
+sub _close_cat_blob {
+	my ($self) = @_;
+
+	return unless defined($self->{cat_blob_pid});
+
+	my @vars = map { 'cat_blob_' . $_ } qw(pid in out ctx);
+
+	command_close_bidi_pipe(@$self{@vars});
+	delete @$self{@vars};
+}
 
 =back
 
@@ -892,7 +981,11 @@ sub _cmd_close {
 }
 
 
-sub DESTROY { }
+sub DESTROY {
+	my ($self) = @_;
+	$self->_close_hash_and_insert_object();
+	$self->_close_cat_blob();
+}
 
 
 # Pipe implementation for ActiveState Perl.
-- 
1.5.3.4.1342.g32de


* [PATCH 9/9] git-svn: Make fetch ~1.7x faster
From: Adam Roben @ 2007-10-25 10:25 UTC
  To: git; +Cc: Junio Hamano, Adam Roben, Eric Wong
In-Reply-To: <1193307927-3592-9-git-send-email-aroben@apple.com>

We were spending a lot of time forking/execing git-cat-file and
git-hash-object. We now maintain a global Git repository object in order to use
Git.pm's more efficient hash_and_insert_object and cat_blob methods.

Signed-off-by: Adam Roben <aroben@apple.com>
---
Eric Wong wrote:
> > +sub hash_object {
> > +   my (undef, $fh) = @_;
> > +
> > +   my ($tmp_fh, $tmp_filename) = tempfile(UNLINK => 1);
> > +   while (my $line = <$fh>) {
> > +           print $tmp_fh $line;
> > +   }
> > +   close($tmp_fh);
> 
> Related to the above.  It's better to sysread()/syswrite() or
> read()/print() in a loop with a predefined buffer size rather than to
> use a readline() since you could be dealing with files with very long
> lines or binaries with no newline characters in them at all.

Fixed.

> > +   _open_hash_object_if_needed();
> > +   print $_hash_object_out $tmp_filename . "\n";
> 
> Minor, but
> 
>         print $_hash_object_out $tmp_filename, "\n";
> 
> avoids creating a new string.

Fixed.

 git-svn.perl |   40 ++++++++++++++++++----------------------
 1 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 22bb47b..fcb07f5 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -4,7 +4,7 @@
 use warnings;
 use strict;
 use vars qw/	$AUTHOR $VERSION
-		$sha1 $sha1_short $_revision
+		$sha1 $sha1_short $_revision $_repository
 		$_q $_authors %users/;
 $AUTHOR = 'Eric Wong <normalperson@yhbt.net>';
 $VERSION = '@@GIT_VERSION@@';
@@ -225,6 +225,7 @@ unless ($cmd =~ /(?:clone|init|multi-init)$/) {
 		}
 		$ENV{GIT_DIR} = $git_dir;
 	}
+	$_repository = Git->repository(Repository => $ENV{GIT_DIR});
 }
 unless ($cmd =~ /^(?:clone|init|multi-init|commit-diff)$/) {
 	Git::SVN::Migration::migration_check();
@@ -332,6 +333,7 @@ sub cmd_init {
 	                       "as a command-line argument\n";
 	init_subdir(@_);
 	do_git_init_db();
+	$_repository = Git->repository(Repository => $ENV{GIT_DIR});
 
 	Git::SVN->init($url);
 }
@@ -2541,6 +2543,7 @@ use vars qw/@ISA/;
 use strict;
 use warnings;
 use Carp qw/croak/;
+use File::Temp qw/tempfile/;
 use IO::File qw//;
 use Digest::MD5;
 
@@ -2683,14 +2686,8 @@ sub apply_textdelta {
 	my $base = IO::File->new_tmpfile;
 	$base->autoflush(1);
 	if ($fb->{blob}) {
-		defined (my $pid = fork) or croak $!;
-		if (!$pid) {
-			open STDOUT, '>&', $base or croak $!;
-			print STDOUT 'link ' if ($fb->{mode_a} == 120000);
-			exec qw/git-cat-file blob/, $fb->{blob} or croak $!;
-		}
-		waitpid $pid, 0;
-		croak $? if $?;
+		my $contents = $::_repository->cat_blob($fb->{blob});
+		print $base $contents;
 
 		if (defined $exp) {
 			seek $base, 0, 0 or croak $!;
@@ -2729,14 +2726,18 @@ sub close_file {
 			$buf eq 'link ' or die "$path has mode 120000",
 			                       "but is not a link\n";
 		}
-		defined(my $pid = open my $out,'-|') or die "Can't fork: $!\n";
-		if (!$pid) {
-			open STDIN, '<&', $fh or croak $!;
-			exec qw/git-hash-object -w --stdin/ or croak $!;
+
+		my ($tmp_fh, $tmp_filename) = File::Temp::tempfile(UNLINK => 1);
+		my $result;
+		while ($result = sysread($fh, my $string, 1024)) {
+			syswrite($tmp_fh, $string, $result);
 		}
-		chomp($hash = do { local $/; <$out> });
-		close $out or croak $!;
+		defined $result or croak $!;
+		close $tmp_fh or croak $!;
+
 		close $fh or croak $!;
+
+		$hash = $::_repository->hash_and_insert_object($tmp_filename);
 		$hash =~ /^[a-f\d]{40}$/ or die "not a sha1: $hash\n";
 		close $fb->{base} or croak $!;
 	} else {
@@ -3063,13 +3064,8 @@ sub chg_file {
 	} elsif ($m->{mode_a} =~ /^120/ && $m->{mode_b} !~ /^120/) {
 		$self->change_file_prop($fbat,'svn:special',undef);
 	}
-	defined(my $pid = fork) or croak $!;
-	if (!$pid) {
-		open STDOUT, '>&', $fh or croak $!;
-		exec qw/git-cat-file blob/, $m->{sha1_b} or croak $!;
-	}
-	waitpid $pid, 0;
-	croak $? if $?;
+	my $blob = $::_repository->cat_blob($m->{sha1_b});
+	print $fh $blob;
 	$fh->flush == 0 or croak $!;
 	seek $fh, 0, 0 or croak $!;
 
-- 
1.5.3.4.1337.g8e67d-dirty


* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Johannes Schindelin @ 2007-10-25 10:27 UTC
  To: Steffen Prohaska
  Cc: Peter Baumann, Andreas Ericsson, J. Bruce Fields, Jakub Narebski,
	Federico Mena Quintero, git
In-Reply-To: <79366145-3C91-4417-B62C-FFF9EC452076@zib.de>

Hi,

On Thu, 25 Oct 2007, Steffen Prohaska wrote:

> On Oct 25, 2007, at 1:28 AM, Johannes Schindelin wrote:
> 
> > On Thu, 25 Oct 2007, Steffen Prohaska wrote:
> > 
> > > On Oct 25, 2007, at 12:14 AM, Johannes Schindelin wrote:
> > > 
> > > > But I think I have to drive my message home again: if what you 
> > > > desire becomes reality, you take away the clear distinction 
> > > > between local and remote branches.  In fact, those branches are 
> > > > neither local (because the next pull will automatically update 
> > > > them with remote changes, but _only_ if they fast-forward) nor 
> > > > remote (because you plan to work on them locally).
> > > 
> > > Exactly, because I do not work on those branches alone. These are 
> > > _shared_ branches. I can work on such a branch with a group of 
> > > developers. I'm willing to accept this bit of chaos.
> > 
> > It is not just a chaos.  I see a serious problem here.  On _your_ 
> > computer, you do _not_ have a shared branch.  Which is visible _even_ 
> > in your modified work flow when you have unpushed changes.
> > 
> > So your desired illusion that your local branches are anything but 
> > local branches will never be perfect enough.
> 
> Ok, there is not a fundamental difference between local branches
> that automatically merge from remotes and local branches that
> are purely local and _never_ merge anything automatically. Both
> are only local branches.

Actually, not really.  For refs/remotes/* you expect them to change 
possibly at the same time.  For your local branches, I'd expect them only 
to change when I am actually working on them (and yes, that includes a 
pull into the current branch).

> > > Your rebase workflow is not possible if more than one dev wants to 
> > > work on the topic branch together.
> > 
> > Why not?  I do it all the time.  CVS users do it all the time, for 
> > that matter.
> 
> You're right. You can rebase your local changes on top of the new shared 
> remote head. And this is probably the best thing you can do to get a 
> clean history. Maybe it should be easier.

It should.  Thus my question about best practices (which is technically in 
this thread, but we are in a subthread which mutated into "I want git 
pull to behave differently").

I _want_ this to be easier.

> So, do I understand correctly, what you propose is:
> - never merge but only rebase
> - Due to lacking support for this in "git pull", never use
>  git pull when working with shared branches but instead _always_ use
>  "git fetch; git rebase origin/<branch_I'm_on>".
> 
> So you say that one of the first messages in "git for CVS users", "The 
> equivalent of cvs update is git pull origin" [1], is wrong. I don't 
> think I'm able to sell your proposed workflow with the current 
> documentation. But maybe I will try if I'm absolutely convinced that it is 
> superior.

Hehe.  You just experienced the tremendous speed at which git moves.  In 
the beginning, we really thought that "git pull" is all you'll ever want 
to have.

But in the meantime, one of the biggest Enemies of the Rebase (yours 
truly) converted to an avid fan of it, because it really helps 
development.  It also makes for clean history, which is always good.

> > > > But here is a proposal which should make you and your developers 
> > > > happy, _and_ should be even easier to explain:
> > > > 
> > > > Work with topic branches.  And when you're done, delete them.
> > > 
> > > Again, if you want to share the topic branch the situation gets more 
> > > complex.
> > 
> > Hardly so.  In my proposed solution to your problem, there is nothing 
> > which prevents you from working off of another branch than "master".
> 
> Well if you have several local branches checked out that are
> shared with others you run into the "git push" problem again ...
> (see below at git push origin master).

Do the same as I, always say "git push origin master" (of course, you 
should exchange "master" with whatever branch you want to push).  Be 
precise.

> > > > So the beginning of the day could look like this:
> > > > 
> > > > 	git fetch
> > > > 	git checkout -b todays-topic origin/master
> > > > 
> > > > 	[hack hack hack]
> > > > 	[test test test]
> > > > 	[debug debug debug]
> > > > 	[occasionally commit]
> > > > 	[occasionally git rebase -i origin/master]
> > > > 
> > > > and the end of the topic
> > > > 
> > > > 	git branch -M master
> 
> Isn't this a bit dangerous? It forcibly overwrites master no matter 
> what's on it. You don't see diffstats nor a fast forward message that 
> confirms what you're doing.

Yeah, I should have said something like "git branch -m master" 
(implicitly assuming that you have no current "master" branch).

> > > > 	git push origin master
> 
> I'd like to see "git push" here.

I think it is not asking too much for the user to be a bit more precise.  
If you really do not trust your developers to be capable of that, point 
them to git gui.

>    git branch -m <shared_branch>
>    git push origin <shared_branch>
>    git checkout do-not-work-here
>    git branch -D <shared_branch>

Actually, the last two commands would better be

	git checkout HEAD^{commit}
	git branch -d <shared_branch>

> > The problem I see here: you know git quite well.  Others don't, and 
> > will be mightily confused why pull updates local branches sometimes, 
> > and sometimes not.
> 
> But it already happens now. "git pull" sometimes merges a remote branch 
> (--track) and sometimes it reports an error that it fails to do so 
> (--no-track).

If there really is an inconsistent behaviour, then we'll have to fix that.  
We should not introduce inconsistent behaviour on top of that.

Ciao,
Dscho


* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Andreas Ericsson @ 2007-10-25 10:33 UTC
  To: Johannes Schindelin
  Cc: Steffen Prohaska, Peter Baumann, J. Bruce Fields, Jakub Narebski,
	Federico Mena Quintero, git
In-Reply-To: <Pine.LNX.4.64.0710251112390.25221@racer.site>

Johannes Schindelin wrote:
> Hi,
> 
> On Thu, 25 Oct 2007, Andreas Ericsson wrote:
> 
>> Johannes Schindelin wrote:
>>
>>> On Thu, 25 Oct 2007, Steffen Prohaska wrote:
>>>
>>>> On Oct 25, 2007, at 12:14 AM, Johannes Schindelin wrote:
>>>>
>>>>> But I think I have to drive my message home again: if what you 
>>>>> desire becomes reality, you take away the clear distinction 
>>>>> between local and remote branches.  In fact, those branches are 
>>>>> neither local (because the next pull will automatically update 
>>>>> them with remote changes, but _only_ if they fast-forward) nor 
>>>>> remote (because you plan to work on them locally).
>>>> Exactly, because I do not work on those branches alone. These are 
>>>> _shared_ branches. I can work on such a branch with a group of 
>>>> developers. I'm willing to accept this bit of chaos.
>>> It is not just a chaos.  I see a serious problem here.  On _your_ 
>>> computer, you do _not_ have a shared branch.  Which is visible _even_ 
>>> in your modified work flow when you have unpushed changes.
>> Of course it is. People might pull from it. That's the whole point of a 
>> distributed model.
> 
> By that reasoning, left is right.  Because your "left" is my "right".
> 
>>> So your desired illusion that your local branches are anything but 
>>> local branches will never be perfect enough.
>>>
>>>> Your rebase workflow is not possible if more than one dev wants to 
>>>> work on the topic branch together.
>>> Why not?  I do it all the time.  CVS users do it all the time, for 
>>> that matter.
>> For 200 branches at a time, where any of them might have changed?
> 
> I slowly start to understand why your users are confused.  _Nobody_ works 
> on 200 branches at the same time.  (No, maintainers don't count: they do 
> not work _on_ the branches, but _with_; they merge them.)
> 
> When you're done with a topic, why do you leave it around?  Cluttering up 
> your "git branch" output?
> 

We have 91 repositories at work. Roughly 60 of those are in active use.
The active repos are organized pretty much like the git repo with
'master', 'next' and 'maint'. We *do* work on all branches, but not
every day, of course. They're NOT topic branches. We implement features
on topic-branches that we DO throw away, but those branches HAVE to be
there for us to be able to handle supporting of old versions as well as
implementing new features in a sane way. Throwing them away locally would
mean having to re-create them very frequently, and since they have to
exist in the upstream repo, "git fetch" would fetch and re-create them
every single time anyway.

So please, pretty please just drop the entire "use topic branches" argument.
We do that, but still have this problem, and it *is* a problem.

>>> The problem I see here: you know git quite well.  Others don't, and 
>>> will be mightily confused why pull updates local branches sometimes, 
>>> and sometimes not.
>> Do you know this, or are you just guessing? I'm getting the exact same
>> confusion with the current behaviour. "Why the hell doesn't git update
>> all the branches I told the damn stupid tool to auto-merge when I pull?"
> 
> That's easy.  A merge can have conflicts.  Conflicts need a working 
> directory.  You cannot have multiple working directories.  (Actually, you 
> can, with git-new-workdir, which would break down _horribly_ with your 
> desired change.)
> 
> Oh?  You don't have local changes?  Then why _on earth_ do you have a 
> local branch?
> 

Because it's convenient, of course. Don't you have 'maint', 'next' and 'master'
in your clone of git.git? I'm guessing at least 99% of the people on this
list have those branches lying around in their clones, even if they only
ever use 'next' and/or 'master'.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Steffen Prohaska @ 2007-10-25 10:39 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Andreas Ericsson, Jakub Narebski, J. Bruce Fields,
	Federico Mena Quintero, git
In-Reply-To: <Pine.LNX.4.64.0710251106110.25221@racer.site>

On Oct 25, 2007, at 12:07 PM, Johannes Schindelin wrote:

> Hi,
>
> On Thu, 25 Oct 2007, Andreas Ericsson wrote:
>
>> Jakub Narebski wrote:
>>> On 10/24/07, Andreas Ericsson <ae@op5.se> wrote:
>>>
>>>> git pull. Not git push. git pull operates on one working branch  
>>>> at a
>>>> time (by default), whereas git push uploads and fast-forwards all
>>>> the common branches (by default). I want git pull to work like git
>>>> push.
>>>
>>> git push is opposite (almost) to git fetch, not to git pull.
>>
>> Not to an end user that has no idea or desire to learn about git  
>> remotes
>> or anything else.
>
> At some point you _have_ to expect your users to learn something.   
> In the
> git documentation, we never pretend that pull is anything else than  
> "fetch
> + merge".
>
> So this assumption of your end user is a lack of training, really.

I typically describe in detail every step they need to get
their work done. I expect that a few simple commands that can
be used by copy & paste should solve 90% of the cases.

Some users will learn more, some will refuse to learn
more. Users from the second group will typically consult a
more experienced user if they hit a problem. At that point
they are forced to learn.

I don't expect that all users know all details and the users
expect that their daily workflow is well supported with a
few commands.

	Steffen

^ permalink raw reply

* Re: git-svnimport
From: Johannes Schindelin @ 2007-10-25 10:56 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: git
In-Reply-To: <31e679430710250225w39a876d0w738d819245e514e@mail.gmail.com>

Hi,

On Thu, 25 Oct 2007, Felipe Balbi wrote:

> I was importing busybox svn repository to git but I got a connection 
> timeout after more than 19k commits... is there a way to continue where 
> the error happened or should I do it all over again ??

AFAICT git-svn is better suited, even to one-shot importing svn.

As it happens, I got interested in this project, too, and did an import 
some time ago.  For your pleasure, I uploaded it to

	http://repo.or.cz/w/busybox.git/

Hth,
Dscho

^ permalink raw reply

* Re: git-svnimport
From: Felipe Balbi @ 2007-10-25 11:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0710251132580.25221@racer.site>

Hi,

On 10/25/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 25 Oct 2007, Felipe Balbi wrote:
>
> > I was importing busybox svn repository to git but I got a connection
> > timeout after more than 19k commits... is there a way to continue where
> > the error happened or should I do it all over again ??
>
> AFAICT git-svn is better suited, even to one-shot importing svn.
>
> As it happens, I got interested in this project, too, and did an import
> some time ago.  For your pleasure, I uploaded it to
>
>         http://repo.or.cz/w/busybox.git/

thanks... much better... I'm cloning your tree and I'll merge with
current busybox tree... ;-)

>
> Hth,
> Dscho
>
>


-- 
Best Regards,

Felipe Balbi
felipebalbi@users.sourceforge.net

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Johannes Schindelin @ 2007-10-25 11:39 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Peter Baumann, J. Bruce Fields, Steffen Prohaska, Jakub Narebski,
	Federico Mena Quintero, git
In-Reply-To: <47206EC3.5000002@op5.se>

Hi,

On Thu, 25 Oct 2007, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> 
> > On Thu, 25 Oct 2007, Andreas Ericsson wrote:
> > 
> > > Johannes Schindelin wrote:
> > > 
> > > > On Wed, 24 Oct 2007, Andreas Ericsson wrote:
> > > > 
> > > > > Conceptually, I don't think it'll be any problem what so ever 
> > > > > telling anyone that the branches that aren't currently checked 
> > > > > out get merged automatically only if they result in a 
> > > > > fast-forward.
> > > >
> > > > It would be a matter of seconds until someone asks "why only 
> > > > fast-forwards? Would it not be _much_ better to merge _always_?  
> > > > Stupid git."
> > > > 
> > > > And all because the concept of "local" vs "remote" was blurred.
> > >
> > > It's already blurred, since we have git-pull instead of just 
> > > git-fetch.
> > 
> > Huh?  How is "I ask git pull to fetch the remote branch, and merge it 
> > into my local branch" a blurring of local vs remote branch?
> > 
> > The local branch is still the local branch where it is _my_ 
> > responsibility to update or change anything.
> 
> True. So git pull saves you exactly one command. The various 
> fetch-all-git- repos-and-update-all-fast-forward-branches in circulation 
> at the office save us ~500 commands each time they're run. Or rather, 
> they *could* do that, but you can't know until you've run it.

As I pointed out, there is no way to sensibly have 500 _local_ branches 
lying around.

It is ridiculous to assume that you have to have local branches for all 
the stable, maintenance, whatever branches.

When you have to change something, you branch, hack, develop, commit, 
push, and then _clean up_ after yourself.  No need to clutter your 
local branch space with unused branches.

> So what should I do to make what I want possible, without having 
> git-pull muddy the waters of local vs remote? There's clearly a user 
> desire for it, besides that of my eight co-workers and myself. Introduce 
> git-<cmd-156>?

If you _insist_ on your workflow, hey, git is a free program, and you can 
do what you want to do with an alias easily enough.  You can even make 
that alias part of the templates, so you can force your desires down the 
throat of every of your coworkers.

However, that does not mean that you can insist on support for your 
workflow in upstream git.

Ciao,
Dscho

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Steffen Prohaska @ 2007-10-25 12:04 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Peter Baumann, Andreas Ericsson, J. Bruce Fields, Jakub Narebski,
	Federico Mena Quintero, git
In-Reply-To: <Pine.LNX.4.64.0710251117310.25221@racer.site>


On Oct 25, 2007, at 12:27 PM, Johannes Schindelin wrote:

> Hi,
>
> On Thu, 25 Oct 2007, Steffen Prohaska wrote:
>
>> On Oct 25, 2007, at 1:28 AM, Johannes Schindelin wrote:
>>
>>> On Thu, 25 Oct 2007, Steffen Prohaska wrote:
>>>
>>>> On Oct 25, 2007, at 12:14 AM, Johannes Schindelin wrote:

[...]

>>>>> But here is a proposal which should make you and your developers
>>>>> happy, _and_ should be even easier to explain:
>>>>>
>>>>> Work with topic branches.  And when you're done, delete them.
>>>>
>>>> Again, if you want to share the topic branch the situation gets  
>>>> more
>>>> complex.
>>>
>>> Hardly so.  In my proposed solution to your problem, there is  
>>> nothing
>>> which prevents you from working off of another branch than "master".
>>
>> Well if you have several local branches checked out that are
>> shared with others you run into the "git push" problem again ...
>> (see below at git push origin master).
>
> Do the same as I, always say "git push origin master" (of course, you
> should exchange "master" with whatever branch you want to push).  Be
> precise.

Well, I'm lazy. git already knows everything. It knows that
the current branch is associated with a specific remote and it
pushes matching branches by default. And I took care to not
pollute the namespace. All my branches are named identical
in all repositories I'm dealing with. It's reasonable to want
"git push" to do the right thing.



>>>>> So the beginning of the day could look like this:
>>>>>
>>>>> 	git fetch
>>>>> 	git checkout -b todays-topic origin/master
>>>>>
>>>>> 	[hack hack hack]
>>>>> 	[test test test]
>>>>> 	[debug debug debug]
>>>>> 	[occasionally commit]
>>>>> 	[occasionally git rebase -i origin/master]
>>>>>
>>>>> and the end of the topic
>>>>>
>>>>> 	git branch -M master
>>
>> Isn't this a bit dangerous? It forces to overwrite master no matter
>> what's on it. You don't see diffstats nor a fast forward message that
>> confirms what you're doing.
>
> Yeah, I should have said something like "git branch -m master"
> (implicitely assuming that you have no current "master" branch).
>
>>>>> 	git push origin master
>>
>> I'd like to see "git push" here.
>
> I think it is not asking too much for the user to be a bit more  
> precise.
> If you really do not trust your developers to be capable of that,  
> point
> them to git gui.

Well I was more precise and got lazy over time. Now the most I do
is "git push --dry-run" and if it looks good I do "git push".
Most of the time I just say "git push".

As I pointed out earlier, "git push origin <some-branch>" can create
a new branch on the remote. "git push" never creates a new branch.
I believe "git push" is safer.


>>    git branch -m <shared_branch>
>>    git push origin <shared_branch>
>>    git checkout do-not-work-here
>>    git branch -D <shared_branch>
>
> Actually, the last two commands would better be
>
> 	git checkout HEAD^{commit}
> 	git branch -d <shared_branch>

Wow, looks weird (not to me but to someone who doesn't know git).


>>> The problem I see here: you know git quite well.  Others don't, and
>>> will be mightily confused why pull updates local branches sometimes,
>>> and sometimes not.
>>
>> But it already happens now. "git pull" sometimes merges a remote  
>> branch
>> (--track) and sometimes it reports an error that it fails to do so
>> (--no-track).
>
> If there really is an inconsistent behaviour, then we'll have to  
> fix that.
> We should not introduce inconsistent behaviour on top of that.

It's not inconsistent. It's an option of a branch. Git supports two
flavours of local branches. Some branches automatically merge and others
don't.

	Steffen

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Steffen Prohaska @ 2007-10-25 12:09 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Johannes Schindelin, Peter Baumann, J. Bruce Fields,
	Jakub Narebski, Federico Mena Quintero, git
In-Reply-To: <472070E5.4090303@op5.se>


On Oct 25, 2007, at 12:33 PM, Andreas Ericsson wrote:

>> I slowly start to understand why your users are confused.   
>> _Nobody_ works on 200 branches at the same time.  (No, maintainers  
>> don't count: they do not work _on_ the branches, but _with_; they  
>> merge them.)
>> When you're done with a topic, why do you leave it around?   
>> Cluttering up your "git branch" output?
>
> We have 91 repositories at work. Roughly 60 of those are in active  
> use.
> The active repos are organized pretty much like the git repo with
> 'master', 'next' and 'maint'. We *do* work on all branches, but not
> every day, ofcourse. They're NOT topic branches. We implement features
> on topic-branches that we DO throw away, but those branches HAVE to be
> there for us to be able to handle supporting of old versions as  
> well as
> implementing new features in a sane way. Throwing them away locally  
> would
> mean having to re-create them very frequently, and since they have to
> exist in the upstream repo, "git fetch" would fetch and re-create them
> every single time anyway.
>
> So please, pretty please just drop the entire "use topic branches"  
> argument.
> We do that, but still have this problem, and it *is* a problem.

This is an interesting situation. If we find a good solution
that is accepted by the average developer in daily work, we
can probably learn a lot.

	Steffen

^ permalink raw reply

* Git and Windows
From: Bo Yang @ 2007-10-25 12:12 UTC (permalink / raw)
  To: git

Hi,
   I am a newcomer to this list, but I have used git for two weeks of
development control. I think it is a very cool tool; the only flaw is
that I have not found a Windows version of it. Does git aim only at Linux
kernel development? Is there any plan to port it to Windows in the
future?

Thanks!
Bo

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Andreas Ericsson @ 2007-10-25 12:46 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Peter Baumann, J. Bruce Fields, Steffen Prohaska, Jakub Narebski,
	Federico Mena Quintero, git
In-Reply-To: <Pine.LNX.4.64.0710251232370.25221@racer.site>

Johannes Schindelin wrote:
> Hi,
> 
> On Thu, 25 Oct 2007, Andreas Ericsson wrote:
> 
>> Johannes Schindelin wrote:
>>
>>> On Thu, 25 Oct 2007, Andreas Ericsson wrote:
>>>
>>>> Johannes Schindelin wrote:
>>>>
>>>>> On Wed, 24 Oct 2007, Andreas Ericsson wrote:
>>>>>
>>>>>> Conceptually, I don't think it'll be any problem what so ever 
>>>>>> telling anyone that the branches that aren't currently checked 
>>>>>> out get merged automatically only if they result in a 
>>>>>> fast-forward.
>>>>> It would be a matter of seconds until someone asks "why only 
>>>>> fast-forwards? Would it not be _much_ better to merge _always_?  
>>>>> Stupid git."
>>>>>
>>>>> And all because the concept of "local" vs "remote" was blurred.
>>>> It's already blurred, since we have git-pull instead of just 
>>>> git-fetch.
>>> Huh?  How is "I ask git pull to fetch the remote branch, and merge it 
>>> into my local branch" a blurring of local vs remote branch?
>>>
>>> The local branch is still the local branch where it is _my_ 
>>> responsibility to update or change anything.
>> True. So git pull saves you exactly one command. The various 
>> fetch-all-git- repos-and-update-all-fast-forward-branches in circulation 
>> at the office save us ~500 commands each time they're run. Or rather, 
>> they *could* do that, but you can't know until you've run it.
> 
> As I pointed out, there is no way to sensibly have 500 _local_ branches 
> lying around.
> 
> It is ridiculous to assume that you have to have local branches for all 
> the stable, maintenance, whatever branches.
> 
> When you have to change something, you branch, hack, develop, commit, 
> push, and then _clean up_ after yourself.  No need to clutter your 
> local branch space with unused branches.
> 

error: The branch 'next' is not a strict subset of your current HEAD.
If you are sure you want to delete it, run 'git branch -D next'.

So you want me to tell all the developers they should use "git branch
-D maint" instead, so they can bypass the built-in security checks? No
thanks.

>> So what should I do to make what I want possible, without having 
>> git-pull muddy the waters of local vs remote? There's clearly a user 
>> desire for it, besides that of my eight co-workers and myself. Introduce 
>> git-<cmd-156>?
> 
> If you _insist_ on your workflow, hey, git is a free program, and you can 
> do what you want to do with an alias easily enough.

With a git alias? No. There aren't even any switches in git to make it do
what I want. With a shell alias? Sure, it's doable, but cumbersome. With a
shell-script I can get it done, but it's ugly, inefficient and has to parse
everything twice. It's also a time-sink, and time is something I don't have
very much of right now.

>  You can even make 
> that alias part of the templates, so you can force your desires down the 
> throat of every of your coworkers.
> 

They're the ones that requested I hack it into git, but the result would
remain the same, of course.

> However, that does not mean that you can insist on support for your 
> workflow in upstream git.
> 

I'm not. We're currently discussing the pros and the cons, and I'm spending
my free 20 minutes every night working on a patch-series to make git-pull
a built-in and then implementing the switch/config-option/whatever that
makes it do what I want it to do. Apart from Junio, that's how everyone
that wants a feature implemented has to do it, so I'd hardly call that
insisting. If Junio decides the patch does something evil, I'll have to
settle for cherry-picking it into whatever branch I want to build from.

On a side note; I'd *love* for it to have a rebase option as well. Perhaps
I'll do that next. In the mean-time, I'd settle for just updating locally
modifiable copies of tracking branches that I've already configured git to
merge with a tracking branch when it happens to be a fast-forward.
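
A minimal sketch of the behaviour described above, for illustration only. It
assumes a git recent enough to provide "git merge-base --is-ancestor" and the
%(upstream) format atom of git-for-each-ref, neither of which existed when
this thread was written. The function name ff_tracking_branches is
hypothetical; it is not a git command and not the patch series mentioned
above.

```shell
#!/bin/sh
# Sketch: fast-forward every local branch that has a configured upstream.
# The checked-out branch is skipped (it needs a working tree, so it is left
# to "git pull"), and so is anything that would not be a fast-forward.
# ff_tracking_branches is a hypothetical helper name.
ff_tracking_branches() {
	current=$(git symbolic-ref -q HEAD) || current=
	git for-each-ref --format='%(refname) %(upstream)' refs/heads |
	while read -r branch upstream
	do
		# No upstream configured for this branch: nothing to do.
		test -n "$upstream" || continue
		# Never touch the branch that is checked out.
		test "$branch" != "$current" || continue
		# The upstream ref may be configured but missing.
		git rev-parse -q --verify "$upstream^{commit}" >/dev/null || continue
		old=$(git rev-parse "$branch")
		new=$(git rev-parse "$upstream")
		# Already up to date.
		test "$old" != "$new" || continue
		if git merge-base --is-ancestor "$branch" "$upstream"
		then
			# Passing the old value makes update-ref refuse if the
			# branch moved in the meantime.
			git update-ref -m "fast-forward to $upstream" \
				"$branch" "$new" "$old" &&
			echo "fast-forwarded ${branch#refs/heads/}"
		else
			echo "skipped ${branch#refs/heads/} (not a fast-forward)"
		fi
	done
}
```

Run "git fetch" first, then call ff_tracking_branches from inside the
repository. Because only refs are moved and no merge is attempted, no
working directory is needed and conflicts cannot occur; branches that have
diverged are reported and left alone.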

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Johannes Schindelin @ 2007-10-25 12:58 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Steffen Prohaska, Peter Baumann, J. Bruce Fields, Jakub Narebski,
	Federico Mena Quintero, git
In-Reply-To: <472070E5.4090303@op5.se>

Hi,

On Thu, 25 Oct 2007, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> 
> > When you're done with a topic, why do you leave it around?  
> > Cluttering up your "git branch" output?
> 
> We have 91 repositories at work. Roughly 60 of those are in active use.
> The active repos are organized pretty much like the git repo with
> 'master', 'next' and 'maint'. We *do* work on all branches, but not
> every day, ofcourse. They're NOT topic branches.

I already explained in another mail (wasn't it even the one you replied 
to?) how this can be done more efficiently.

Ciao,
Dscho

^ permalink raw reply

* Re: Feature request: Limit git-status reports to a directory
From: Michel Marti @ 2007-10-25 13:03 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0710251050390.25221@racer.site>

[-- Attachment #1: Type: text/plain, Size: 755 bytes --]

on 10/25/2007 11:55 AM Johannes Schindelin said the following:
> I am not so sure.  In other SCMs, "git status" may be a way to do "git 
> diff --name-only" or "git ls-files", but not in git.  Here, it strictly 
> means "what would be happening if I were to commit _right_ _now_?".
I somehow agree with your argument, but then again, sometimes I'm still interested in
*only* the status for a given directory.

IMHO, answering the question "what would be happening if...?" should be  git-commit's task
(e.g. git-commit --dry-run). And git-status should behave similar to git-log and git-diff.

For those interested: I have attached a little script that mimics current git-status
(except the "dry-run" stuff) but also can take a list of directories/files.


[-- Attachment #2: git-status-new --]
[-- Type: text/plain, Size: 1957 bytes --]

#!/bin/sh

USAGE='[--staged] [--changed] [--untracked]'
SUBDIRECTORY_OK=1 . git-sh-setup
require_work_tree

print_stat_line() {
	case "$1" in
		M)	echo "#       modified:  $2";;
		A)	echo "#       new file:  $2";;
		R*)	echo "#       renamed:   $2 -> $3";;
		D)	echo "#       deleted:   $2";;
		U)	echo "#       unmerged:  $2";;
		X)	echo "#       $2";;
		*)	echo "#       [$1]:      $2";;
	esac
}

STAGED= CHANGED= UNTRACKED= HP=

while test $# != 0
do
	case "$1" in
		-s|--staged) STAGED=1; shift;;
		-c|--changed) CHANGED=1; shift;;
		-u|--untracked) UNTRACKED=1; shift;;
		--) shift; break;;
		-*) usage;;
		 *) break;;
	esac
done

if BRANCH_NAME=$(git symbolic-ref -q HEAD)
then
	BRANCH_NAME="On branch $(expr "z$BRANCH_NAME" : 'zrefs/heads/\(.*\)')"
else
	BRANCH_NAME="Not currently on any branch"
fi

[ "$#" = 0 ] && cd_to_toplevel


if [ -z "$STAGED$CHANGED$UNTRACKED" ]; then
	STAGED=1; CHANGED=1; UNTRACKED=1
fi

SP=$(echo _/$(git rev-parse --show-cdup)|tr '/' ' '|wc -w)

echo "# $BRANCH_NAME"

# Changes to be committed
[ "$STAGED" ] && git-diff --name-status --cached -M -- "$@"|while read S F R
do
	if [ -z "$HP" ]; then
		echo '# Changes to be committed:'
		echo '#   (use "git reset HEAD <file>..." to unstage)'
		echo '#'
		HP=1
	fi
	F=$(echo $F|cut -d'/' -f$SP-)
	print_stat_line "$S" "$F" "$R"

done

# Changed but not updated
[ "$CHANGED" ] && git-diff --name-status -- "$@"|while read S F
do
	if [ -z "$HP" ]; then
		echo '#'
		echo '# Changed but not updated:'
		echo '#   (use "git add <file>..." to update what will be committed)'
		echo '#'
		HP=1
	fi
	F=$(echo $F|cut -d'/' -f$SP-)
	print_stat_line "$S" "$F"
done

# Untracked files
[ "$UNTRACKED" ] && git-ls-files --exclude-per-directory=.gitignore -o --directory -- "$@"|while read F
do
	if [ -z "$HP" ]; then
		echo '#'
		echo '# Untracked files:'
		echo '#   (use "git add <file>..." to include in what will be committed)'
		echo '#'
		HP=1
	fi
	print_stat_line "X" "$F"
done


^ permalink raw reply

* Re: Feature request: Limit git-status reports to a directory
From: Wincent Colaiuta @ 2007-10-25 13:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Yin Ping, Michel Marti, git
In-Reply-To: <Pine.LNX.4.64.0710251050390.25221@racer.site>

El 25/10/2007, a las 11:55, Johannes Schindelin escribió:

> Hi,
>
> On Thu, 25 Oct 2007, Yin Ping wrote:
>
>> On 10/25/07, Michel Marti <mma@objectxp.com> wrote:
>>
>> It's also painful for me. IMHO, the behaviour of "git-status" should
>> keep consistent with "git-diff" and "git-log" which allow for the  
>> path.
>
> I am not so sure.  In other SCMs, "git status" may be a way to do "git
> diff --name-only" or "git ls-files", but not in git.  Here, it  
> strictly
> means "what would be happening if I were to commit _right_ _now_?".

Yes, but there's no reason why the user shouldn't be able to scope  
that down to a specific path, just as they currently can for git-diff  
(as you point out):

> IMHO it is not asking users too much when you say "git diff ." is  
> for the
> current directory, and "git diff" is for the whole working tree.

Sometimes if you have a dirty tree with lots of modified files and  
potentially lots of things added to the index the output of git- 
status can be quite long, and perhaps all you want to know about is  
what is the status of *this* directory or *that* file rather than  
having to visually scan through the entire git-status output.  
Accepting path info would therefore be a nice usability improvement.

Allowing git-status to accept a path would be consistent with how  
other git commands (like git-diff) already work, and with other SCMs  
too. The user is expected to know that what's in the index is what  
will be committed, and that if he/she types "git-status foo" then he/ 
she may only be seeing a subset of what's staged in the index.

But the way git-status currently behaves when supplied path info is  
puzzling to say the least. As the man page says:

> "The command takes the same set of options as git-commit; it shows  
> what would be committed if the same options are given to git-commit."

This means that if you do try passing a path to git-status (as surely  
many newcomers have done), you'll see the combined result of what is  
already staged in the index *plus* what would happen if you git-added  
the path(s) that you passed on the command line. I'd argue that this  
is counter-intuitive, and I think that most would expect that the  
paths would serve as scope *limiters* rather than indicators that  
something should be *added* to the index.

To illustrate this, an example; just say you have git-status output  
like this:

# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   foo/bar
#       modified:   baz
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       NOTES
no changes added to commit (use "git add" and/or "git commit -a")

And you type "git-status foo":

# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       modified:   foo/bar
#
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   baz
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       NOTES

I understand why it works this way, and it's explained by the man  
page, but the behaviour is the very last thing I would expect.

Cheers,
Wincent

^ permalink raw reply

* Re: git-svnimport
From: Johannes Schindelin @ 2007-10-25 13:04 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: git
In-Reply-To: <31e679430710250408g679538e7ha9e1e75507c2aac5@mail.gmail.com>

Hi,

On Thu, 25 Oct 2007, Felipe Balbi wrote:

> On 10/25/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>
> > On Thu, 25 Oct 2007, Felipe Balbi wrote:
> >
> > > I was importing busybox svn repository to git but I got a connection 
> > > timeout after more than 19k commits... is there a way to continue 
> > > where the error happened or should I do it all over again ??
> >
> > AFAICT git-svn is better suited, even to one-shot importing svn.
> >
> > As it happens, I got interested in this project, too, and did an 
> > import some time ago.  For your pleasure, I uploaded it to
> >
> >         http://repo.or.cz/w/busybox.git/
> 
> thanks... much better... I'm cloning your tree and I'll merge with 
> current busybox tree... ;-)

FYI you'll have to do something like this:

	git svn init svn://busybox.net/trunk/busybox
	git svn fetch

to merge with current busybox (although I updated before I pushed).

Ciao,
Dscho

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Theodore Tso @ 2007-10-25 13:24 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Johannes Schindelin, Steffen Prohaska, Peter Baumann,
	J. Bruce Fields, Jakub Narebski, Federico Mena Quintero, git
In-Reply-To: <472070E5.4090303@op5.se>

On Thu, Oct 25, 2007 at 12:33:09PM +0200, Andreas Ericsson wrote:
> Because it's convenient, ofcourse. Don't you have 'maint', 'next'
> and 'master' in your clone of git.git? I'm guessing at least 99% of
> the people on this list have those branches lying around in their
> clones, even if they only ever use 'next' and/or 'master'.

I find it just as easy to say: "git checkout origin/maint" or "git
checkout origin/next" when I want to examine some other branch.

If I want to make a change against maint, then I follow up "git
checkout origin/maint" with a "git checkout -b <topic-name>".  Part of
the reason though, why I *want* to keep the topic branch around is
precisely because I don't get to push to the central repository.  So I
want to keep it around so either (a) the central maintainer can pull
from me, and I delete it only after he's done the pull, or (b) so I
can use git-cherry so I can see when patches that I created and sent
via git-format-patch and git-send-email have been accepted.

You're using a different workflow, and with users who aren't interested
in learning the fine points of git.  But the main issue is that git isn't
optimized for what you want to do.  So I can suggest a couple of
different approaches.  One is to simply do things the 'hg' way.
Explicitly set up different repos for the different branches.  It's
more inefficient, but it does work.  And if the bulk of your users
are, ah, "aggressively ignorant" about git --- and many developers don't
care about learning the fine points of their tools, and a successful
software company needs to learn how to leverage the skills of such
mid-level engineers (only at a startup or if you are at Google can you
insist on only hiring the best and brightest) --- then it might be
that the 'hg' approach is easier.  Certainly that was the approach
Larry McVoy has always used with BitKeeper, and he is focused on
meeting the needs of his corporate customers.

Another would be to set up a wrapper script for "git-clone" that
creates a separate local working directory for each branch.  So for
example, it might do something like this:

#!/bin/sh
# Usage: get-repo <URL> [dir]
URL=$1
dir=$2
branches=`git-ls-remote --heads $URL | sed -e 's;.*/;;'`
if [ "$dir"x = "x" ]; then dir=`basename $URL`; fi
git clone $URL .temp-repo
mkdir $dir
cd $dir
for i in $branches; do
    mkdir $i
    cd $i
    git init
    git remote add -t $i origin $URL
    echo ref: refs/heads/$i > .git/HEAD
    git fetch ../../.temp-repo refs/remotes/origin/$i:refs/remotes/origin/$i
    # do it a second time to get the tags (bug in fetch?)
    git fetch ../../.temp-repo refs/remotes/origin/$i:refs/remotes/origin/$i
    git merge origin/$i
    git config remote.origin.push $i:$i
    cd ..
done
cd ..
rm -rf .temp-repo

For bonus points, this script could be made smarter so that each of
the branches shared a common git object database, and some error
checking would be nice, but hopefully this gets the basic idea across.

This way, the "basic git users" get a separate working directory for
each branch, where "git pull" updates that particular branch, and "git
push" updates changes to the remote branch.  

Does this do what you want?

							- Ted

P.S.  Note by the way that if you are giving everyone access to
push into a single central repository, having a "next" branch probably
doesn't make sense.  You're probably way better off just simply
having "master" (which would be your devel branch), and "maint" for
bug fixes.

^ permalink raw reply

* Re: Git and Windows
From: Johannes Schindelin @ 2007-10-25 14:19 UTC (permalink / raw)
  To: Bo Yang; +Cc: git
In-Reply-To: <47208817.60804@gmail.com>

Hi,

On Thu, 25 Oct 2007, Bo Yang wrote:

>   I am a new comer to this list but I have used git for two week 
> development control. I think it is a very cool tool, the only flaw is 
> that I have not found Windows version of it. Does git just aim at Linux 
> kernel development? Is there any plan or in the future to migrate it to 
> windows?

Funny.  The first three hits I get from Google are

	Wikipedia,
	GitWiki and
	msysgit

The first two point to the third.  And happily enough, there is a 
Download page at the third site.  Oh, and it has a description of what 
its affiliation with git is.

Hth,
Dscho


* recent change in git.git/master broke my repos
From: Randal L. Schwartz @ 2007-10-25 14:32 UTC (permalink / raw)
  To: git

I have echo "ref: refs/remotes/origin/master" >.git/refs/heads/upstream
so that my daily update script can go:

   git-fetch
   if [ repo is on master, and is not dirty ]; then
      git-merge upstream
   fi

Yesterday that worked.

Today I get a rash of:

  fatal: Couldn't find remote ref refs/remotes/origin/master

from my git-fetch.

Is git-fetch broken, or am I?  And if it's me, how do I do what I
want instead?

And when are we gonna get "fast forward only" for git-merge?
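
For illustration only, the update step described above could be sketched
as below. This assumes a remote named "origin", a branch named "master",
and a git release new enough to provide "git merge --ff-only";
update_master is a made-up name. It merges the remote-tracking branch by
name instead of going through a symref under refs/heads.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): fetch, then fast-forward master
# only when master is checked out and the work tree is clean.
# Assumes the remote is called "origin" and the branch "master".
update_master() {
    git fetch origin || return 1

    # Skip the merge unless HEAD is the master branch.
    [ "$(git symbolic-ref -q HEAD)" = "refs/heads/master" ] || return 0

    # Skip the merge if tracked files differ from HEAD (staged or not).
    git diff --quiet HEAD -- || return 0

    # Refuse anything that is not a fast-forward.
    git merge --ff-only origin/master
}
```

Calling update_master from inside the repository returns non-zero if
the fetch fails or if the merge would not be a fast-forward.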

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Karl Hasselström @ 2007-10-25 14:51 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Johannes Schindelin, Peter Baumann, J. Bruce Fields,
	Steffen Prohaska, Jakub Narebski, Federico Mena Quintero, git
In-Reply-To: <4720903E.1070103@op5.se>

On 2007-10-25 14:46:54 +0200, Andreas Ericsson wrote:

> error: The branch 'next' is not a strict subset of your current
> HEAD. If you are sure you want to delete it, run 'git branch -D
> next'.
>
> So you want me to tell all the developers they should use "git
> branch -D maint" instead, so they can bypass the built-in security
> checks? No thanks.

Maybe the solution here is to let "git branch -d" succeed if the
branch is a subset of HEAD or the branch it is tracking? That way,
deleting would succeed if upstream has all your commits.
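
As a rough sketch of the proposed rule (not how "git branch -d" is
implemented), the check could be expressed in shell as below.
safe_to_delete is a made-up name, and the sketch assumes a git release
new enough to provide "git merge-base --is-ancestor" and the
%(upstream) field of "git for-each-ref".

```shell
#!/bin/sh
# Hypothetical helper (not part of git): exit status 0 means branch $1
# could be deleted under the proposed rule, non-zero means it could not.
safe_to_delete() {
    branch=$1

    # Rule 1: the branch tip is already reachable from HEAD.
    if git merge-base --is-ancestor "refs/heads/$branch" HEAD 2>/dev/null; then
        return 0
    fi

    # Rule 2: the branch tip is reachable from the branch it tracks.
    upstream=$(git for-each-ref --format='%(upstream)' "refs/heads/$branch")
    [ -n "$upstream" ] || return 1
    git merge-base --is-ancestor "refs/heads/$branch" "$upstream" 2>/dev/null
}
```

With this, "safe_to_delete next && git branch -D next" would only force
the deletion when HEAD or the tracked branch already contains all of
the branch's commits.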

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

