Git development
* [PATCH 3/6] Git.pm: Introduce ident() and ident_person() methods
From: Petr Baudis @ 2006-07-03 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204415.28541.47920.stgit@machine.or.cz>

These methods can retrieve/parse the author/committer ident.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---
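For anyone reading along without Perl at hand, the split that ident()
performs in list context can be sketched in C roughly as below. This is
only an illustration: ident_fields, is_time_tz() and parse_ident() are
made-up names, not part of Git.pm or libgit, and the sketch assumes the
usual shape "Name <email> seconds +hhmm".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical container for the three fields; not a Git.pm or libgit type. */
struct ident_fields {
	char name[128];
	char email[128];
	char time_tz[32];
};

/* Returns 1 if s looks like "<digits> <sign><4 digits>", e.g. "1151959695 +0200". */
static int is_time_tz(const char *s)
{
	size_t i = 0, j;

	if (!isdigit((unsigned char)s[i]))
		return 0;
	while (isdigit((unsigned char)s[i]))
		i++;
	if (s[i++] != ' ')
		return 0;
	if (s[i] != '+' && s[i] != '-')
		return 0;
	i++;
	for (j = 0; j < 4; j++)
		if (!isdigit((unsigned char)s[i + j]))
			return 0;
	return s[i + 4] == '\0';
}

/*
 * Split "Name <email> time tz" into its parts, following the greedy
 * pattern /^(.*) <(.*)> (\d+ [+-]\d{4})$/ used by ident().
 * Returns 0 on success, -1 if the string does not have that shape.
 */
int parse_ident(const char *s, struct ident_fields *out)
{
	const char *gt, *lt;
	size_t name_len, email_len;

	assert(s && out);
	gt = strrchr(s, '>');
	if (!gt || gt[1] != ' ' || !is_time_tz(gt + 2))
		return -1;
	/* Find the last " <" in front of the closing bracket. */
	for (lt = gt; lt > s; lt--)
		if (lt[0] == '<' && lt[-1] == ' ')
			break;
	if (lt == s)
		return -1;

	name_len = (size_t)(lt - 1 - s);
	email_len = (size_t)(gt - lt - 1);
	if (name_len >= sizeof(out->name) ||
	    email_len >= sizeof(out->email) ||
	    strlen(gt + 2) >= sizeof(out->time_tz))
		return -1;
	memcpy(out->name, s, name_len);
	out->name[name_len] = '\0';
	memcpy(out->email, lt + 1, email_len);
	out->email[email_len] = '\0';
	strcpy(out->time_tz, gt + 2);
	return 0;
}
```

The person part that ident_person() returns is then just the name and
email fields joined back together as "name <email>".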

 git-send-email.perl |   11 ++---------
 perl/Git.pm         |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index e794e44..79e82f5 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -84,15 +84,8 @@ foreach my $entry (@bcclist) {
 
 # Now, let's fill any that aren't set in with defaults:
 
-sub gitvar_ident {
-    my ($name) = @_;
-    my $val = $repo->command('var', $name);
-    my @field = split(/\s+/, $val);
-    return join(' ', @field[0...(@field-3)]);
-}
-
-my ($author) = gitvar_ident('GIT_AUTHOR_IDENT');
-my ($committer) = gitvar_ident('GIT_COMMITTER_IDENT');
+my ($author) = $repo->ident_person('author');
+my ($committer) = $repo->ident_person('committer');
 
 my %aliases;
 my @alias_files = $repo->config('sendemail.aliasesfile');
diff --git a/perl/Git.pm b/perl/Git.pm
index 24fd7ce..9ce9fcd 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -521,6 +521,55 @@ sub config {
 }
 
 
+=item ident ( TYPE | IDENTSTR )
+
+=item ident_person ( TYPE | IDENTSTR | IDENTARRAY )
+
+This suite of functions retrieves and parses ident information, as stored
+in the commit and tag objects or produced by C<var GIT_type_IDENT> (thus
+C<TYPE> can be either I<author> or I<committer>; case is insignificant).
+
+The C<ident> method retrieves the ident information from C<git-var>
+and either returns it as a scalar string or as an array with the fields parsed.
+Alternatively, it can take a prepared ident string (e.g. from the commit
+object) and just parse it.
+
+C<ident_person> returns the person part of the ident - name and email;
+it can take the same arguments as C<ident> or the array returned by C<ident>.
+
+The synopsis is like:
+
+	my ($name, $email, $time_tz) = ident('author');
+	"$name <$email>" eq ident_person('author');
+	"$name <$email>" eq ident_person($name, $email, $time_tz);
+	$time_tz =~ /^\d+ [+-]\d{4}$/;
+
+Both methods must be called on a repository instance.
+
+=cut
+
+sub ident {
+	my ($self, $type) = @_;
+	my $identstr;
+	if (lc $type eq lc 'committer' or lc $type eq lc 'author') {
+		$identstr = $self->command_oneline('var', 'GIT_'.uc($type).'_IDENT');
+	} else {
+		$identstr = $type;
+	}
+	if (wantarray) {
+		return $identstr =~ /^(.*) <(.*)> (\d+ [+-]\d{4})$/;
+	} else {
+		return $identstr;
+	}
+}
+
+sub ident_person {
+	my ($self, @ident) = @_;
+	$#ident == 0 and @ident = $self->ident($ident[0]);
+	return "$ident[0] <$ident[1]>";
+}
+
+
 =item hash_object ( TYPE, FILENAME )
 
 =item hash_object ( TYPE, FILEHANDLE )


* [PATCH 0/6] The residual Git.pm patches
From: Petr Baudis @ 2006-07-03 20:44 UTC (permalink / raw)
  To: junkio; +Cc: git

  Hi,

  this series is not so much meant for immediate application as for
you to leave lingering on top of pu, if you wish, and above all I am
sending it for fear of it being lost in all the turmoil around Git.pm.
This is what was left in my StGIT series after weeding out what ended
up in next, neatly stacked in a single mail thread.

				Petr Baudis

-- 
And on the eighth day, God started debugging.


* [PATCH 1/6] Git.pm: Add config() method
From: Petr Baudis @ 2006-07-03 20:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204415.28541.47920.stgit@machine.or.cz>

This accessor will retrieve value(s) of the given configuration variable.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---
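To spell out the exit code convention that config() relies on, here is a
small stand-alone C sketch of the mapping introduced in repo-config.c
below. The enum and function names are invented for illustration; the
real code keeps this logic inline in get_value().

```c
/* Hypothetical names; repo-config.c computes these values inline. */
enum get_status {
	GET_OK = 0,        /* exactly one value matched */
	GET_NOT_FOUND = 1, /* no value matched */
	GET_AMBIGUOUS = 2  /* more than one value matched */
};

/* Map the number of matching values to the exit code of --get. */
int get_exit_code(int seen)
{
	return (seen == 1) ? GET_OK
	     : (seen > 1) ? GET_AMBIGUOUS
	     : GET_NOT_FOUND;
}

/* --get-all only fails when nothing matched at all. */
int get_all_exit_code(int seen)
{
	return !seen;
}
```

With this split, config() can turn exit code 1 into undef and rethrow
everything else, so "variable set several times" is no longer mistaken
for "variable not set".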

 Documentation/git-repo-config.txt |    3 ++-
 perl/Git.pm                       |   37 ++++++++++++++++++++++++++++++++++++-
 repo-config.c                     |    2 +-
 3 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-repo-config.txt b/Documentation/git-repo-config.txt
index 803c0d5..cc72fa9 100644
--- a/Documentation/git-repo-config.txt
+++ b/Documentation/git-repo-config.txt
@@ -54,7 +54,8 @@ OPTIONS
 
 --get::
 	Get the value for a given key (optionally filtered by a regex
-	matching the value).
+	matching the value). Returns error code 1 if the key was not
+	found and error code 2 if multiple key values were found.
 
 --get-all::
 	Like get, but does not fail if the number of values for the key
diff --git a/perl/Git.pm b/perl/Git.pm
index b4ee88b..24fd7ce 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -473,7 +473,6 @@ and the directory must exist.
 
 sub wc_chdir {
 	my ($self, $subdir) = @_;
-
 	$self->wc_path()
 		or throw Error::Simple("bare repository");
 
@@ -486,6 +485,42 @@ sub wc_chdir {
 }
 
 
+=item config ( VARIABLE )
+
+Retrieve the configuration C<VARIABLE> in the same manner as C<repo-config>
+does. In scalar context, the variable must not be set more than once (an
+exception is thrown otherwise); in array context, it may be set multiple
+times and all the values are returned. An unset variable yields undef.
+
+Must be called on a repository instance.
+
+This currently wraps command('repo-config') so it is not so fast.
+
+=cut
+
+sub config {
+	my ($self, $var) = @_;
+	$self->repo_path()
+		or throw Error::Simple("not a repository");
+
+	try {
+		if (wantarray) {
+			return $self->command('repo-config', '--get-all', $var);
+		} else {
+			return $self->command_oneline('repo-config', '--get', $var);
+		}
+	} catch Git::Error::Command with {
+		my $E = shift;
+		if ($E->value() == 1) {
+			# Key not found.
+			return undef;
+		} else {
+			throw $E;
+		}
+	};
+}
+
+
 =item hash_object ( TYPE, FILENAME )
 
 =item hash_object ( TYPE, FILEHANDLE )
diff --git a/repo-config.c b/repo-config.c
index 743f02b..c7ed0ac 100644
--- a/repo-config.c
+++ b/repo-config.c
@@ -118,7 +118,7 @@ static int get_value(const char* key_, c
 	if (do_all)
 		ret = !seen;
 	else
-		ret =  (seen == 1) ? 0 : 1;
+		ret = (seen == 1) ? 0 : seen > 1 ? 2 : 1;
 
 free_strings:
 	if (repo_config)


* [PATCH 6/6] Convert git-annotate to use Git.pm
From: Petr Baudis @ 2006-07-03 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204415.28541.47920.stgit@machine.or.cz>

Together with the other converted scripts, this is probably still pu
material; it appears to work fine for me, though. The speed gain from
get_object() is about 10% (I expected more...).

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 git-annotate.perl |  167 ++++++++++-------------------------------------------
 1 files changed, 31 insertions(+), 136 deletions(-)

diff --git a/git-annotate.perl b/git-annotate.perl
index a6a7a48..d924e87 100755
--- a/git-annotate.perl
+++ b/git-annotate.perl
@@ -11,6 +11,7 @@ use strict;
 use Getopt::Long;
 use POSIX qw(strftime gmtime);
 use File::Basename qw(basename dirname);
+use Git;
 
 sub usage() {
 	print STDERR "Usage: ${\basename $0} [-s] [-S revs-file] file [ revision ]
@@ -29,7 +30,7 @@ sub usage() {
 	exit(1);
 }
 
-our ($help, $longrev, $rename, $rawtime, $starting_rev, $rev_file) = (0, 0, 1);
+our ($help, $longrev, $rename, $rawtime, $starting_rev, $rev_file, $repo) = (0, 0, 1);
 
 my $rc = GetOptions(	"long|l" => \$longrev,
 			"time|t" => \$rawtime,
@@ -52,6 +53,8 @@ my @stack = (
 	},
 );
 
+$repo = Git->repository();
+
 our @filelines = ();
 
 if (defined $starting_rev) {
@@ -102,15 +105,11 @@ while (my $bound = pop @stack) {
 push @revqueue, $head;
 init_claim( defined $starting_rev ? $head : 'dirty');
 unless (defined $starting_rev) {
-	my $diff = open_pipe("git","diff","-R", "HEAD", "--",$filename)
-		or die "Failed to call git diff to check for dirty state: $!";
-
-	_git_diff_parse($diff, $head, "dirty", (
-				'author' => gitvar_name("GIT_AUTHOR_IDENT"),
-				'author_date' => sprintf("%s +0000",time()),
-				)
-			);
-	close($diff);
+	my %ident;
+	@ident{'author', 'author_email', 'author_date'} = $repo->ident('author');
+	my $diff = $repo->command_output_pipe('diff', '-R', 'HEAD', '--', $filename);
+	_git_diff_parse($diff, $head, "dirty", %ident);
+	$repo->command_close_pipe($diff);
 }
 handle_rev();
 
@@ -181,8 +180,7 @@ sub git_rev_list {
 		open($revlist, '<' . $rev_file)
 		    or die "Failed to open $rev_file : $!";
 	} else {
-		$revlist = open_pipe("git-rev-list","--parents","--remove-empty",$rev,"--",$file)
-			or die "Failed to exec git-rev-list: $!";
+		$revlist = $repo->command_output_pipe('rev-list', '--parents', '--remove-empty', $rev, '--', $file);
 	}
 
 	my @revs;
@@ -191,7 +189,7 @@ sub git_rev_list {
 		my ($rev, @parents) = split /\s+/, $line;
 		push @revs, [ $rev, @parents ];
 	}
-	close($revlist);
+	$repo->command_close_pipe($revlist);
 
 	printf("0 revs found for rev %s (%s)\n", $rev, $file) if (@revs == 0);
 	return @revs;
@@ -200,8 +198,7 @@ sub git_rev_list {
 sub find_parent_renames {
 	my ($rev, $file) = @_;
 
-	my $patch = open_pipe("git-diff-tree", "-M50", "-r","--name-status", "-z","$rev")
-		or die "Failed to exec git-diff: $!";
+	my $patch = $repo->command_output_pipe('diff-tree', '-M50', '-r', '--name-status', '-z', $rev);
 
 	local $/ = "\0";
 	my %bound;
@@ -227,7 +224,7 @@ sub find_parent_renames {
 			}
 		}
 	}
-	close($patch);
+	$repo->command_close_pipe($patch);
 
 	return \%bound;
 }
@@ -236,14 +233,9 @@ sub find_parent_renames {
 sub git_find_parent {
 	my ($rev, $filename) = @_;
 
-	my $revparent = open_pipe("git-rev-list","--remove-empty", "--parents","--max-count=1","$rev","--",$filename)
-		or die "Failed to open git-rev-list to find a single parent: $!";
-
-	my $parentline = <$revparent>;
-	chomp $parentline;
-	my ($revfound,$parent) = split m/\s+/, $parentline;
-
-	close($revparent);
+	my $parentline = $repo->command_oneline('rev-list', '--remove-empty',
+			'--parents', '--max-count=1', $rev, '--', $filename);
+	my ($revfound, $parent) = split m/\s+/, $parentline;
 
 	return $parent;
 }
@@ -254,13 +246,13 @@ # Record the commit information that res
 sub git_diff_parse {
 	my ($parent, $rev, %revinfo) = @_;
 
-	my $diff = open_pipe("git-diff-tree","-M","-p",$rev,$parent,"--",
-			$revs{$rev}{'filename'}, $revs{$parent}{'filename'})
-		or die "Failed to call git-diff for annotation: $!";
+	my $diff = $repo->command_output_pipe('diff-tree', '-M', '-p',
+			$rev, $parent, '--',
+			$revs{$rev}{'filename'}, $revs{$parent}{'filename'});
 
 	_git_diff_parse($diff, $parent, $rev, %revinfo);
 
-	close($diff);
+	$repo->command_close_pipe($diff);
 }
 
 sub _git_diff_parse {
@@ -351,36 +343,25 @@ sub git_cat_file {
 	my $blob = git_ls_tree($rev, $filename);
 	die "Failed to find a blob for $filename in rev $rev\n" if !defined $blob;
 
-	my $catfile = open_pipe("git","cat-file", "blob", $blob)
-		or die "Failed to git-cat-file blob $blob (rev $rev, file $filename): " . $!;
-
-	my @lines;
-	while(<$catfile>) {
-		chomp;
-		push @lines, $_;
-	}
-	close($catfile);
-
+	my @lines = split(/\n/, $repo->get_object('blob', $blob), -1);
+	pop @lines if @lines and $lines[-1] eq ''; # Trailing newline
 	return @lines;
 }
 
 sub git_ls_tree {
 	my ($rev, $filename) = @_;
 
-	my $lstree = open_pipe("git","ls-tree",$rev,$filename)
-		or die "Failed to call git ls-tree: $!";
-
+	my $lstree = $repo->command_output_pipe('ls-tree', $rev, $filename);
 	my ($mode, $type, $blob, $tfilename);
 	while(<$lstree>) {
 		chomp;
 		($mode, $type, $blob, $tfilename) = split(/\s+/, $_, 4);
 		last if ($tfilename eq $filename);
 	}
-	close($lstree);
+	$repo->command_close_pipe($lstree);
 
 	return $blob if ($tfilename eq $filename);
 	die "git-ls-tree failed to find blob for $filename";
-
 }
 
 
@@ -396,25 +377,17 @@ sub claim_line {
 
 sub git_commit_info {
 	my ($rev) = @_;
-	my $commit = open_pipe("git-cat-file", "commit", $rev)
-		or die "Failed to call git-cat-file: $!";
+	my $commit = $repo->get_object('commit', $rev);
 
 	my %info;
-	while(<$commit>) {
-		chomp;
-		last if (length $_ == 0);
-
-		if (m/^author (.*) <(.*)> (.*)$/) {
-			$info{'author'} = $1;
-			$info{'author_email'} = $2;
-			$info{'author_date'} = $3;
-		} elsif (m/^committer (.*) <(.*)> (.*)$/) {
-			$info{'committer'} = $1;
-			$info{'committer_email'} = $2;
-			$info{'committer_date'} = $3;
+	while ($commit =~ /(.*?)\n/g) {
+		my $line = $1;
+		if ($line =~ s/^author //) {
+			@info{'author', 'author_email', 'author_date'} = $repo->ident($line);
+		} elsif ($line =~ s/^committer //) {
+			@info{'committer', 'committer_email', 'committer_date'} = $repo->ident($line);
 		}
 	}
-	close($commit);
 
 	return %info;
 }
@@ -432,81 +405,3 @@ sub format_date {
 	my $t = $timestamp + $minutes * 60;
 	return strftime("%Y-%m-%d %H:%M:%S " . $timezone, gmtime($t));
 }
-
-# Copied from git-send-email.perl - We need a Git.pm module..
-sub gitvar {
-    my ($var) = @_;
-    my $fh;
-    my $pid = open($fh, '-|');
-    die "$!" unless defined $pid;
-    if (!$pid) {
-	exec('git-var', $var) or die "$!";
-    }
-    my ($val) = <$fh>;
-    close $fh or die "$!";
-    chomp($val);
-    return $val;
-}
-
-sub gitvar_name {
-    my ($name) = @_;
-    my $val = gitvar($name);
-    my @field = split(/\s+/, $val);
-    return join(' ', @field[0...(@field-4)]);
-}
-
-sub open_pipe {
-	if ($^O eq '##INSERT_ACTIVESTATE_STRING_HERE##') {
-		return open_pipe_activestate(@_);
-	} else {
-		return open_pipe_normal(@_);
-	}
-}
-
-sub open_pipe_activestate {
-	tie *fh, "Git::ActiveStatePipe", @_;
-	return *fh;
-}
-
-sub open_pipe_normal {
-	my (@execlist) = @_;
-
-	my $pid = open my $kid, "-|";
-	defined $pid or die "Cannot fork: $!";
-
-	unless ($pid) {
-		exec @execlist;
-		die "Cannot exec @execlist: $!";
-	}
-
-	return $kid;
-}
-
-package Git::ActiveStatePipe;
-use strict;
-
-sub TIEHANDLE {
-	my ($class, @params) = @_;
-	my $cmdline = join " ", @params;
-	my  @data = qx{$cmdline};
-	bless { i => 0, data => \@data }, $class;
-}
-
-sub READLINE {
-	my $self = shift;
-	if ($self->{i} >= scalar @{$self->{data}}) {
-		return undef;
-	}
-	return $self->{'data'}->[ $self->{i}++ ];
-}
-
-sub CLOSE {
-	my $self = shift;
-	delete $self->{data};
-	delete $self->{i};
-}
-
-sub EOF {
-	my $self = shift;
-	return ($self->{i} >= scalar @{$self->{data}});
-}


* [PATCH 5/6] Git.pm: Introduce fast get_object() method
From: Petr Baudis @ 2006-07-03 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204415.28541.47920.stgit@machine.or.cz>

Direct .xs routine. Note that it does not work 100% correctly when
you juggle multiple repository objects, but it is not that bad either.
The trouble is that we might reuse packs information for another
Git project; that is not an issue since Git depends on uniqueness
of SHA1 ids so if we have found the object somewhere else, it is
nevertheless going to be the same object. It merely makes object
existence detection through this method unreliable; it is duly noted
in the documentation.

At least that's how I see it, I hope I didn't overlook any other
potential problem. I tested it for memory leaks and it appears to be
doing ok.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---
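The input check at the top of xs_get_object() (exactly 40 characters,
all of them hex digits) can be sketched stand-alone as follows. Here
hexval() and parse_object_id() are hypothetical stand-ins written for
this note; the patch itself uses libgit's get_sha1_hex().

```c
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical stand-in for the "strlen(id) != 40 || get_sha1_hex()"
 * test: convert a 40-character hex id into 20 raw bytes.
 * Returns 0 on success, -1 on malformed input.
 */
int parse_object_id(const char *id, unsigned char sha1[20])
{
	int i;

	if (strlen(id) != 40)
		return -1;
	for (i = 0; i < 20; i++) {
		int hi = hexval(id[2 * i]);
		int lo = hexval(id[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		sha1[i] = (unsigned char)((hi << 4) | lo);
	}
	return 0;
}
```

Anything that fails this check makes get_object() return undef before
the object store is consulted at all.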

 perl/Git.pm |   18 ++++++++++++++++++
 perl/Git.xs |   24 ++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 895a939..65acaa7 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -571,6 +571,24 @@ sub ident_person {
 }
 
 
+=item get_object ( TYPE, SHA1 )
+
+Return contents of the given object in a scalar string. If the object has
+not been found, undef is returned; however, do not rely on this! Currently,
+if you use multiple repositories at once, get_object() on one repository
+_might_ return the object even though it exists only in another repository.
+(But do not rely on this behaviour either.)
+
+The method must be called on a repository instance.
+
+Implementation of this method is very fast; no external command calls
+are involved. That's why it is broken, too. ;-)
+
+=cut
+
+# Implemented in Git.xs.
+
+
 =item hash_object ( TYPE, FILENAME )
 
 =item hash_object ( TYPE, FILEHANDLE )
diff --git a/perl/Git.xs b/perl/Git.xs
index 6ed26a2..226dd4f 100644
--- a/perl/Git.xs
+++ b/perl/Git.xs
@@ -111,6 +111,30 @@ CODE:
 	free((char **) argv);
 }
 
+
+SV *
+xs_get_object(type, id)
+	char *type;
+	char *id;
+CODE:
+{
+	unsigned char sha1[20];
+	unsigned long size;
+	void *buf;
+
+	if (strlen(id) != 40 || get_sha1_hex(id, sha1) < 0)
+		XSRETURN_UNDEF;
+
+	buf = read_sha1_file(sha1, type, &size);
+	if (!buf)
+		XSRETURN_UNDEF;
+	RETVAL = newSVpvn(buf, size);
+	free(buf);
+}
+OUTPUT:
+	RETVAL
+
+
 char *
 xs_hash_object_pipe(type, fd)
 	char *type;


* [PATCH 4/6] Make it possible to set up libgit directly (instead of from the environment)
From: Petr Baudis @ 2006-07-03 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204415.28541.47920.stgit@machine.or.cz>

This introduces a setup_git() function which is essentially a (public)
backend for setup_git_env() which lets anyone specify custom sources
for the various paths instead of environment variables. Since the repositories
may get switched on the fly, this also updates code that caches paths to
invalidate them properly; I hope neither of those is a sweet spot.

It is used by Git.xs' xs__call_gate() to set up per-repository data
for libgit's consumption. No code actually takes advantage of it yet
but get_object() will in the next patches.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---
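The cache invalidation pattern applied to sha1_file.c and sha1_name.c
below boils down to remembering the directory a cached buffer was built
from and rebuilding when it changes. A minimal sketch, assuming a
made-up helper cached_pack_dir() that is not part of this series:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return "<objdir>/pack", rebuilding the cached
 * buffer only when objdir differs from the one seen on the last call.
 * The result is owned by this function and stays valid until the next
 * call with a different objdir. Returns NULL if allocation fails.
 */
const char *cached_pack_dir(const char *objdir)
{
	static char *base;        /* cached result */
	static char *last_objdir; /* private copy of the key it was built from */

	if (!last_objdir || strcmp(last_objdir, objdir)) {
		size_t len = strlen(objdir);
		char *new_base = malloc(len + sizeof("/pack"));
		char *new_key = malloc(len + 1);

		if (!new_base || !new_key) {
			free(new_base);
			free(new_key);
			return NULL;
		}
		memcpy(new_base, objdir, len);
		memcpy(new_base + len, "/pack", sizeof("/pack"));
		memcpy(new_key, objdir, len + 1);

		/* Drop the stale cache only once the new one is complete. */
		free(base);
		free(last_objdir);
		base = new_base;
		last_objdir = new_key;
	}
	return base;
}
```

Keeping a private copy of the key matters because the string handed in
by the caller may be freed or overwritten once the repository changes.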

 cache.h       |    3 +++
 commit.c      |   23 +++++++++++++++++++----
 environment.c |   45 +++++++++++++++++++++++++++++++++++++++------
 perl/Git.pm   |    8 ++++----
 perl/Git.xs   |   16 +++++++++++++++-
 sha1_file.c   |   30 ++++++++++++++++++++++++------
 sha1_name.c   |   10 ++++++++--
 7 files changed, 112 insertions(+), 23 deletions(-)

diff --git a/cache.h b/cache.h
index 8719939..962f2fc 100644
--- a/cache.h
+++ b/cache.h
@@ -116,6 +116,9 @@ extern struct cache_entry **active_cache
 extern unsigned int active_nr, active_alloc, active_cache_changed;
 extern struct cache_tree *active_cache_tree;
 
+extern void setup_git(char *new_git_dir, char *new_git_object_dir,
+                      char *new_git_index_file, char *new_git_graft_file);
+
 #define GIT_DIR_ENVIRONMENT "GIT_DIR"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
diff --git a/commit.c b/commit.c
index a608faf..91f97c1 100644
--- a/commit.c
+++ b/commit.c
@@ -163,6 +163,14 @@ int register_commit_graft(struct commit_
 	return 0;
 }
 
+void free_commit_grafts(void)
+{
+	int pos = commit_graft_nr;
+	while (--pos >= 0)
+		free(commit_graft[pos]);
+	commit_graft_nr = 0;
+}
+
 struct commit_graft *read_graft_line(char *buf, int len)
 {
 	/* The format is just "Commit Parent1 Parent2 ...\n" */
@@ -215,11 +223,18 @@ int read_graft_file(const char *graft_fi
 static void prepare_commit_graft(void)
 {
 	static int commit_graft_prepared;
-	char *graft_file;
+	static char *last_graft_file;
+	char *graft_file = get_graft_file();
+
+	if (last_graft_file) {
+		if (!strcmp(graft_file, last_graft_file))
+			return;
+		free_commit_grafts();
+	}
+	if (last_graft_file)
+		free(last_graft_file);
+	last_graft_file = strdup(graft_file);
 
-	if (commit_graft_prepared)
-		return;
-	graft_file = get_graft_file();
 	read_graft_file(graft_file);
 	commit_graft_prepared = 1;
 }
diff --git a/environment.c b/environment.c
index 3de8eb3..6b64d11 100644
--- a/environment.c
+++ b/environment.c
@@ -21,28 +21,61 @@ char git_commit_encoding[MAX_ENCODING_LE
 int shared_repository = PERM_UMASK;
 const char *apply_default_whitespace = NULL;
 
+static int dyn_git_object_dir, dyn_git_index_file, dyn_git_graft_file;
 static char *git_dir, *git_object_dir, *git_index_file, *git_refs_dir,
 	*git_graft_file;
-static void setup_git_env(void)
+
+void setup_git(char *new_git_dir, char *new_git_object_dir,
+               char *new_git_index_file, char *new_git_graft_file)
 {
-	git_dir = getenv(GIT_DIR_ENVIRONMENT);
+	git_dir = new_git_dir;
 	if (!git_dir)
 		git_dir = DEFAULT_GIT_DIR_ENVIRONMENT;
-	git_object_dir = getenv(DB_ENVIRONMENT);
+
+	if (dyn_git_object_dir)
+		free(git_object_dir);
+	git_object_dir = new_git_object_dir;
 	if (!git_object_dir) {
 		git_object_dir = xmalloc(strlen(git_dir) + 9);
 		sprintf(git_object_dir, "%s/objects", git_dir);
+		dyn_git_object_dir = 1;
+	} else {
+		dyn_git_object_dir = 0;
 	}
+
+	if (git_refs_dir)
+		free(git_refs_dir);
 	git_refs_dir = xmalloc(strlen(git_dir) + 6);
 	sprintf(git_refs_dir, "%s/refs", git_dir);
-	git_index_file = getenv(INDEX_ENVIRONMENT);
+
+	if (dyn_git_index_file)
+		free(git_index_file);
+	git_index_file = new_git_index_file;
 	if (!git_index_file) {
 		git_index_file = xmalloc(strlen(git_dir) + 7);
 		sprintf(git_index_file, "%s/index", git_dir);
+		dyn_git_index_file = 1;
+	} else {
+		dyn_git_index_file = 0;
 	}
-	git_graft_file = getenv(GRAFT_ENVIRONMENT);
-	if (!git_graft_file)
+
+	if (dyn_git_graft_file)
+		free(git_graft_file);
+	git_graft_file = new_git_graft_file;
+	if (!git_graft_file) {
 		git_graft_file = strdup(git_path("info/grafts"));
+		dyn_git_graft_file = 1;
+	} else {
+		dyn_git_graft_file = 0;
+	}
+}
+
+static void setup_git_env(void)
+{
+	setup_git(getenv(GIT_DIR_ENVIRONMENT),
+	          getenv(DB_ENVIRONMENT),
+	          getenv(INDEX_ENVIRONMENT),
+	          getenv(GRAFT_ENVIRONMENT));
 }
 
 char *get_git_dir(void)
diff --git a/perl/Git.pm b/perl/Git.pm
index 9ce9fcd..895a939 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -92,6 +92,7 @@ increate nonwithstanding).
 use Carp qw(carp croak); # but croak is bad - throw instead
 use Error qw(:try);
 use Cwd qw(abs_path);
+use Scalar::Util;
 
 require XSLoader;
 XSLoader::load('Git', $VERSION);
@@ -833,11 +834,10 @@ sub _call_gate {
 	if (defined $self) {
 		# XXX: We ignore the WorkingCopy! To properly support
 		# that will require heavy changes in libgit.
+		# For now, when we need to do it, we could temporarily
+		# chdir() there and then chdir() back after the call is done.
 
-		# XXX: And we ignore everything else as well. libgit
-		# at least needs to be extended to let us specify
-		# the $GIT_DIR instead of looking it up in environment.
-		#xs_call_gate($self->{opts}->{Repository});
+		xs__call_gate(Scalar::Util::refaddr($self), $self->repo_path());
 	}
 
 	# Having to call throw from the C code is a sure path to insanity.
diff --git a/perl/Git.xs b/perl/Git.xs
index 2bbec43..6ed26a2 100644
--- a/perl/Git.xs
+++ b/perl/Git.xs
@@ -52,7 +52,21 @@ BOOT:
 }
 
 
-# /* TODO: xs_call_gate(). See Git.pm. */
+void
+xs__call_gate(repoid, git_dir)
+	long repoid;
+	char *git_dir;
+CODE:
+{
+	static long last_repoid;
+	if (repoid != last_repoid) {
+		setup_git(git_dir,
+		          getenv(DB_ENVIRONMENT),
+		          getenv(INDEX_ENVIRONMENT),
+		          getenv(GRAFT_ENVIRONMENT));
+		last_repoid = repoid;
+	}
+}
 
 
 char *
diff --git a/sha1_file.c b/sha1_file.c
index 8179630..ab64543 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -126,16 +126,22 @@ static void fill_sha1_path(char *pathbuf
 char *sha1_file_name(const unsigned char *sha1)
 {
 	static char *name, *base;
+	static const char *last_objdir;
+	const char *sha1_file_directory = get_object_directory();
 
-	if (!base) {
-		const char *sha1_file_directory = get_object_directory();
+	if (!last_objdir || strcmp(last_objdir, sha1_file_directory)) {
 		int len = strlen(sha1_file_directory);
+		if (base)
+			free(base);
 		base = xmalloc(len + 60);
 		memcpy(base, sha1_file_directory, len);
 		memset(base+len, 0, 60);
 		base[len] = '/';
 		base[len+3] = '/';
 		name = base + len + 1;
+		if (last_objdir)
+			free((char *) last_objdir);
+		last_objdir = strdup(sha1_file_directory);
 	}
 	fill_sha1_path(name, sha1);
 	return base;
@@ -145,14 +151,20 @@ char *sha1_pack_name(const unsigned char
 {
 	static const char hex[] = "0123456789abcdef";
 	static char *name, *base, *buf;
+	static const char *last_objdir;
+	const char *sha1_file_directory = get_object_directory();
 	int i;
 
-	if (!base) {
-		const char *sha1_file_directory = get_object_directory();
+	if (!last_objdir || strcmp(last_objdir, sha1_file_directory)) {
 		int len = strlen(sha1_file_directory);
+		if (base)
+			free(base);
 		base = xmalloc(len + 60);
 		sprintf(base, "%s/pack/pack-1234567890123456789012345678901234567890.pack", sha1_file_directory);
 		name = base + len + 11;
+		if (last_objdir)
+			free((char *) last_objdir);
+		last_objdir = strdup(sha1_file_directory);
 	}
 
 	buf = name;
@@ -170,14 +182,20 @@ char *sha1_pack_index_name(const unsigne
 {
 	static const char hex[] = "0123456789abcdef";
 	static char *name, *base, *buf;
+	static const char *last_objdir;
+	const char *sha1_file_directory = get_object_directory();
 	int i;
 
-	if (!base) {
-		const char *sha1_file_directory = get_object_directory();
+	if (!last_objdir || strcmp(last_objdir, sha1_file_directory)) {
 		int len = strlen(sha1_file_directory);
+		if (base)
+			free(base);
 		base = xmalloc(len + 60);
 		sprintf(base, "%s/pack/pack-1234567890123456789012345678901234567890.idx", sha1_file_directory);
 		name = base + len + 11;
+		if (last_objdir)
+			free((char *) last_objdir);
+		last_objdir = strdup(sha1_file_directory);
 	}
 
 	buf = name;
diff --git a/sha1_name.c b/sha1_name.c
index f2cbafa..c698c1b 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -12,15 +12,21 @@ static int find_short_object_filename(in
 	char hex[40];
 	int found = 0;
 	static struct alternate_object_database *fakeent;
+	static const char *last_objdir;
+	const char *objdir = get_object_directory();
 
-	if (!fakeent) {
-		const char *objdir = get_object_directory();
+	if (!last_objdir || strcmp(last_objdir, objdir)) {
 		int objdir_len = strlen(objdir);
 		int entlen = objdir_len + 43;
+		if (fakeent)
+			free(fakeent);
 		fakeent = xmalloc(sizeof(*fakeent) + entlen);
 		memcpy(fakeent->base, objdir, objdir_len);
 		fakeent->name = fakeent->base + objdir_len + 1;
 		fakeent->name[-1] = '/';
+		if (last_objdir)
+			free((char *) last_objdir);
+		last_objdir = strdup(objdir);
 	}
 	fakeent->next = alt_odb_list;
 


* [PATCH 2/6] Convert git-send-email to use Git.pm
From: Petr Baudis @ 2006-07-03 20:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204415.28541.47920.stgit@machine.or.cz>

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 git-send-email.perl |   30 ++++++++----------------------
 1 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index c5d9e73..e794e44 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -21,6 +21,7 @@ use warnings;
 use Term::ReadLine;
 use Getopt::Long;
 use Data::Dumper;
+use Git;
 
 # most mail servers generate the Date: header, but not all...
 $ENV{LC_ALL} = 'C';
@@ -46,6 +47,8 @@ my $smtp_server;
 # Example reply to:
 #$initial_reply_to = ''; #<20050203173208.GA23964@foobar.com>';
 
+my $repo = Git->repository();
+
 my $term = new Term::ReadLine 'git-send-email';
 
 # Begin by accumulating all the variables (defined above), that we will end up
@@ -81,23 +84,9 @@ foreach my $entry (@bcclist) {
 
 # Now, let's fill any that aren't set in with defaults:
 
-sub gitvar {
-    my ($var) = @_;
-    my $fh;
-    my $pid = open($fh, '-|');
-    die "$!" unless defined $pid;
-    if (!$pid) {
-	exec('git-var', $var) or die "$!";
-    }
-    my ($val) = <$fh>;
-    close $fh or die "$!";
-    chomp($val);
-    return $val;
-}
-
 sub gitvar_ident {
     my ($name) = @_;
-    my $val = gitvar($name);
+    my $val = $repo->command('var', $name);
     my @field = split(/\s+/, $val);
     return join(' ', @field[0...(@field-3)]);
 }
@@ -106,8 +95,8 @@ my ($author) = gitvar_ident('GIT_AUTHOR_
 my ($committer) = gitvar_ident('GIT_COMMITTER_IDENT');
 
 my %aliases;
-chomp(my @alias_files = `git-repo-config --get-all sendemail.aliasesfile`);
-chomp(my $aliasfiletype = `git-repo-config sendemail.aliasfiletype`);
+my @alias_files = $repo->config('sendemail.aliasesfile');
+my $aliasfiletype = $repo->config('sendemail.aliasfiletype');
 my %parse_alias = (
 	# multiline formats can be supported in the future
 	mutt => sub { my $fh = shift; while (<$fh>) {
@@ -132,7 +121,7 @@ my %parse_alias = (
 		}}}
 );
 
-if (@alias_files && defined $parse_alias{$aliasfiletype}) {
+if (@alias_files and $aliasfiletype and defined $parse_alias{$aliasfiletype}) {
 	foreach my $file (@alias_files) {
 		open my $fh, '<', $file or die "opening $file: $!\n";
 		$parse_alias{$aliasfiletype}->($fh);
@@ -374,10 +363,7 @@ sub send_message
 	my $date = strftime('%a, %d %b %Y %H:%M:%S %z', localtime($time++));
 	my $gitversion = '@@GIT_VERSION@@';
 	if ($gitversion =~ m/..GIT_VERSION../) {
-	    $gitversion = `git --version`;
-	    chomp $gitversion;
-	    # keep only what's after the last space
-	    $gitversion =~ s/^.* //;
+	    $gitversion = Git::version();
 	}
 
 	my $header = "From: $from


* Re: [PATCH 1/3] Add read_cache_from() and discard_cache()
From: Junio C Hamano @ 2006-07-03 21:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0607021043550.29667@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Okay. After reading the comment, I am quite certain we can just set the 
> index_file_timestamp to 0.

Thanks.

> So, I still think that these two lines should be in the cleanup part of 
> get_merge_bases().

I think that is sane -- please make it so.

> BTW personally, I prefer the one-function approach, i.e. a flag which says 
> if it is okay not to clean up.

Yup. Agreed.


* Re: contrib/ status
From: Junio C Hamano @ 2006-07-03 21:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: git
In-Reply-To: <20060703080625.GB29036@hand.yhbt.net>

Eric Wong <normalperson@yhbt.net> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>> Junio C Hamano <junkio@cox.net> writes:
>> 
>> > ... the
>> > things under contrib/ are not part of git.git but are there only
>> > for convenience....
>> 
>> This reminds me of something quite different.  I am getting an
>> impression that enough people have been helped by git-svn and it
>> might be a good idea to have it outside contrib/ area.
>
> That would be great.  IMHO, it puts git in a position to supplant
> centralized SVN usage one developer at a time, making it easier
> to make a gradual transition to git.  Of course, there's also svk
> in a similar position...

OK, then let's give a few days (it's a long weekend extending
into July 4th in the US) to let others from the list chime in,
and then please help me migrate your test scripts and Makefile
pieces into the toplevel project.


* Re: [PATCH 3/3] Make clear_commit_marks() clean harder
From: Johannes Schindelin @ 2006-07-03 21:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Rene Scharfe, git
In-Reply-To: <Pine.LNX.4.64.0607030957420.12404@g5.osdl.org>

Hi,

On Mon, 3 Jul 2006, Linus Torvalds wrote:

> 	/* Have we already cleared this? */
> 	if (!(mask & object->flags))
> 		return;
> 	object->flags &= ~mask;

I thought we already did this, and did not even check. My bad.

Ciao,
Dscho


* Re: [PATCH 3/3] Make clear_commit_marks() clean harder
From: Johannes Schindelin @ 2006-07-03 21:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vzmfqqxlh.fsf@assigned-by-dhcp.cox.net>

Hi,

On Mon, 3 Jul 2006, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> > Don't care if objects have been parsed or not and don't stop when we
> >> > reach a commit that is already clean -- its parents could be dirty.
> >> 
> >> There is something quite wrong with this patch.
> >
> > I always had the feeling that it was wrong to traverse not-yet-parsed 
> > parents: How could a revision walk possibly come to a certain commit 
> > without at least one continuous history of now-parsed objects?
> >
> > Also, AFAIK the revision walk sets flags for each commit it touched, and 
> > we should not try to be smart-asses about the flags, but just unset these 
> > flags.
> 
> The main points were made by Linus already.
> 
> Traversing is not needed -- not clearing not-yet-parsed is
> obviously wrong.

Traversing is actually wrong. Clearing the marks does not mean to clear 
them on commits we did not even mark!

But clearing on commits we _have_ -- but not parsed -- is important, 
obviously.

> > BTW some very quick tests showed that the clear_commit_marks() thing that 
> > I sent to the list was much faster than traversing all objects (which was 
> > in my original version).
> 
> I have a crude workaround pushed out last night but will be
> replacing it with something less drastic.  I think the final
> version should be what you had, perhaps minus not looking at the
> parsed flag for unmarking purposes.

Isn't the right way to go about it to just clear the marks if we have a 
commit that has at least one of the marks set, but traverse further only 
if _in addition to having at least one mark set_ the commit has been 
parsed already?

That would not be a crude workaround.

BTW, in what cases could a commit that was seen by the revision walk not 
have the SEEN mark set?

Ciao,
Dscho

^ permalink raw reply

* [PATCH] Use $GITPERLLIB instead of $RUNNING_GIT_TESTS and centralize @INC munging
From: Petr Baudis @ 2006-07-03 21:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703202925.GN29115@pasky.or.cz>

This makes the Git perl scripts check $GITPERLLIB instead of
$RUNNING_GIT_TESTS, which makes more sense if you are setting up your shell
environment to use a non-installed Git instance.

It also weeds out the @INC munging from the individual scripts and makes
Makefile add it during the .perl files processing, so that we can change
just a single place when we modify this shared logic. It looks ugly in the
scripts, too. ;-)

And instead of doing arcane things with the @INC array, we just do 'use lib'
instead, which is essentially the same thing anyway.

I first wanted to do three separate patches but it turned out that it's quite
a lot neater when bundled together, so I hope it's ok.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 INSTALL                |    4 ++--
 Makefile               |    4 +++-
 git-fmt-merge-msg.perl |    5 -----
 git-mv.perl            |    5 -----
 t/test-lib.sh          |    5 ++---
 5 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/INSTALL b/INSTALL
index ed502de..4e8f883 100644
--- a/INSTALL
+++ b/INSTALL
@@ -39,8 +39,8 @@ Issues of note:
 
 	GIT_EXEC_PATH=`pwd`
 	PATH=`pwd`:$PATH
-	PERL5LIB=`pwd`/perl/blib/lib:`pwd`/perl/blib/arch/auto/Git
-	export GIT_EXEC_PATH PATH PERL5LIB
+	GITPERLLIB=`pwd`/perl/blib/lib:`pwd`/perl/blib/arch/auto/Git
+	export GIT_EXEC_PATH PATH GITPERLLIB
 
  - Git is reasonably self-sufficient, but does depend on a few external
    programs and libraries:
diff --git a/Makefile b/Makefile
index a62e8a3..bae95c2 100644
--- a/Makefile
+++ b/Makefile
@@ -552,7 +552,9 @@ common-cmds.h: Documentation/git-*.txt
 $(patsubst %.perl,%,$(SCRIPT_PERL)): % : %.perl
 	rm -f $@ $@+
 	INSTLIBDIR=`$(MAKE) -C perl -s --no-print-directory instlibdir` && \
-	sed -e '1s|#!.*perl\(.*\)|#!$(PERL_PATH_SQ)\1|' \
+	sed -e '1s|#!.*perl|#!$(PERL_PATH_SQ)|1' \
+	    -e '2i\
+	        use lib (split(/:/, $$ENV{GITPERLLIB} || '\'"$$INSTLIBDIR"\''));' \
 	    -e 's|@@INSTLIBDIR@@|'"$$INSTLIBDIR"'|g' \
 	    -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
 	    $@.perl >$@+
diff --git a/git-fmt-merge-msg.perl b/git-fmt-merge-msg.perl
index a9805dd..f86231e 100755
--- a/git-fmt-merge-msg.perl
+++ b/git-fmt-merge-msg.perl
@@ -5,11 +5,6 @@ #
 # Read .git/FETCH_HEAD and make a human readable merge message
 # by grouping branches and tags together to form a single line.
 
-BEGIN {
-	unless (exists $ENV{'RUNNING_GIT_TESTS'}) {
-		unshift @INC, '@@INSTLIBDIR@@';
-	}
-}
 use strict;
 use Git;
 use Error qw(:try);
diff --git a/git-mv.perl b/git-mv.perl
index 5134b80..322b9fd 100755
--- a/git-mv.perl
+++ b/git-mv.perl
@@ -6,11 +6,6 @@ #
 # This file is licensed under the GPL v2, or a later version
 # at the discretion of Linus Torvalds.
 
-BEGIN {
-	unless (exists $ENV{'RUNNING_GIT_TESTS'}) {
-		unshift @INC, '@@INSTLIBDIR@@';
-	}
-}
 use warnings;
 use strict;
 use Getopt::Std;
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 298c6ca..ad9796e 100755
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -206,9 +206,8 @@ PYTHON=`sed -e '1{
 	PYTHONPATH=$(pwd)/../compat
 	export PYTHONPATH
 }
-RUNNING_GIT_TESTS=YesWeAre
-PERL5LIB=$(pwd)/../perl/blib/lib:$(pwd)/../perl/blib/arch/auto/Git
-export PERL5LIB RUNNING_GIT_TESTS
+GITPERLLIB=$(pwd)/../perl/blib/lib:$(pwd)/../perl/blib/arch/auto/Git
+export GITPERLLIB
 test -d ../templates/blt || {
 	error "You haven't built things yet, have you?"
 }

^ permalink raw reply related

* Re: [PATCH] Make git-fmt-merge-msg a builtin
From: Johannes Schindelin @ 2006-07-03 21:29 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git, junkio
In-Reply-To: <20060703191635.21ba0af3.tihirvon@gmail.com>

Hi,

On Mon, 3 Jul 2006, Timo Hirvonen wrote:

> Seems that C89 requires free(NULL) to be a no-op but on some old systems 
> (SunOS) it may crash.  IMNSHO these systems were designed to crash valid 
> programs and torture developers.

At least it is not Malbolge. Or even VAX. (In that order.)

> There are probably many free(NULL) and realloc(NULL, ...) uses in the 
> git source code, and they are not worth fixing.

AFAIK realloc(NULL, ...) was fine even with K&R, whereas free(NULL) poses 
problems. Anyway, I do not _want_ to say that NULL should be free()d, 
because it just sounds wrong.
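
For the systems mentioned above where free(NULL) may crash, a minimal
sketch of a guard could look like the following. The helper name
free_if_set() is hypothetical and is not part of git; on any C89-conforming
libc the check is redundant, since free(NULL) is defined to do nothing.

```c
#include <stdlib.h>

/*
 * Hypothetical helper, not part of git: skip the free() call entirely
 * when the pointer is NULL, for libcs that mishandle free(NULL).
 */
static void free_if_set(void *ptr)
{
	if (ptr)
		free(ptr);
}
```

Callers would use free_if_set(buf) wherever buf may legitimately be NULL.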

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 4/6] Make it possible to set up libgit directly (instead of from the environment)
From: Petr Baudis @ 2006-07-03 21:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204803.28541.67315.stgit@machine.or.cz>

Dear diary, on Mon, Jul 03, 2006 at 10:48:03PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> This introduces a setup_git() function which is essentially a (public)
> backend for setup_git_env() which lets anyone specify custom sources
> for the various paths instead of environment variables. Since the repositories
> may get switched on the fly, this also updates code that caches paths to
> invalidate them properly; I hope neither of those is a sweet spot.
> 
> It is used by Git.xs' xs__call_gate() to set up per-repository data
> for libgit's consumption. No code actually takes advantage of it yet
> but get_object() will in the next patches.
> 
> Signed-off-by: Petr Baudis <pasky@suse.cz>

To further clarify, this only invalidates the path cache and grafts
list, not alternates (it assumes the environment variable stays the
same for now; that is to be fixed when we extend Git.pm further)
and not pack list - we will automagically extend the list of packs when
we meet more repositories, but we will never remove old packs from the
list. (For no special reason other than this does no harm other than
possibly finding objects that should be missing, and the patch smells
bad enough as it is now. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Snow falling on Perl. White noise covering line noise.
Hides all the bugs too. -- J. Putnam

^ permalink raw reply

* Re: Compression speed for large files
From: Jeff King @ 2006-07-03 21:45 UTC (permalink / raw)
  To: Joachim B Haga; +Cc: git
In-Reply-To: <loom.20060703T124601-969@post.gmane.org>

On Mon, Jul 03, 2006 at 11:13:34AM +0000, Joachim B Haga wrote:

> often binary. In git, committing of large files is very slow; I have
> tested with a 45MB file, which takes about 1 minute to check in (on an
> intel core-duo 2GHz).

I know this has already been somewhat solved, but I found your numbers
curiously high. I work quite a bit with git and large files and I
haven't noticed this slowdown. Can you be more specific about your load?
Are you sure it is zlib?

On my 1.8Ghz Athlon, compressing 45MB of zeros into 20K takes about 2s.
Compressing 45MB of random data into a 45MB object takes 6.3s. In either
case, the commit takes only about 0.5s (since cogito stores the object
during the cg-add).

Is there some specific file pattern which is slow to compress? 

-Peff

^ permalink raw reply

* [PATCH] Eliminate Scalar::Util for something simpler
From: Petr Baudis @ 2006-07-03 21:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20060703204803.28541.67315.stgit@machine.or.cz>

Dear diary, on Mon, Jul 03, 2006 at 10:48:03PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> +		xs__call_gate(Scalar::Util::refaddr($self), $self->repo_path());

This was silly and requires Scalar::Util.

->8-
Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 perl/Git.pm |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 65acaa7..f2467bd 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -92,13 +92,14 @@ increate nonwithstanding).
 use Carp qw(carp croak); # but croak is bad - throw instead
 use Error qw(:try);
 use Cwd qw(abs_path);
-use Scalar::Util;
 
 require XSLoader;
 XSLoader::load('Git', $VERSION);
 
 }
 
+my $instance_id = 0;
+
 
 =head1 CONSTRUCTORS
 
@@ -216,7 +217,7 @@ sub repository {
 		delete $opts{Directory};
 	}
 
-	$self = { opts => \%opts };
+	$self = { opts => \%opts, id => $instance_id++ };
 	bless $self, $class;
 }
 
@@ -855,7 +856,7 @@ sub _call_gate {
 		# For now, when we will need to do it we could temporarily
 		# chdir() there and then chdir() back after the call is done.
 
-		xs__call_gate(Scalar::Util::refaddr($self), $self->repo_path());
+		xs__call_gate($self->{id}, $self->repo_path());
 	}
 
 	# Having to call throw from the C code is a sure path to insanity.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Snow falling on Perl. White noise covering line noise.
Hides all the bugs too. -- J. Putnam

^ permalink raw reply related

* git-cvsimport gets parents wrong for branches
From: Elrond @ 2006-07-03 21:53 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 811 bytes --]


Hi,

Just by accident I noticed, that git-cvsimport got the
parents for branches wrong in one of my projects.

To assist in debugging this, I've made up a testcase script
(appended to this mail).
It will create a new cvs-repo, put 4 commits in it,
and finally run gitk to investigate it.

It should look something like this:

    4 [branch-stable-fixes] commit-on-branch
  3 | [master] [origin] commit-master-after-branch
  |/
  2   [tag-branchpoint] commit-first-edit
  1   commit-base

What it really looks like:

  4   ..
  3   ..
  2   ..
  1   ..

4's parent is 3, not (as it should be) 2.


I've tested with 1.4.0 and the current git-cvsimport from
8fced61.

I hope the testcase helps tracking the problem down.


    Elrond

p.s.: The testcase script is not nice. It just does the
      job, nothing more.

[-- Attachment #2: cvs-branches-1.sh --]
[-- Type: application/x-sh, Size: 687 bytes --]

^ permalink raw reply

* Re: Compression speed for large files
From: Joachim Berdal Haga @ 2006-07-03 22:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Joachim B Haga, git
In-Reply-To: <20060703214503.GA3897@coredump.intra.peff.net>

Jeff King wrote:
> On Mon, Jul 03, 2006 at 11:13:34AM +0000, Joachim B Haga wrote:
> 
>> often binary. In git, committing of large files is very slow; I have
>> tested with a 45MB file, which takes about 1 minute to check in (on an
>> intel core-duo 2GHz).
> 
> I know this has already been somewhat solved, but I found your numbers
> curiously high. I work quite a bit with git and large files and I
> haven't noticed this slowdown. Can you be more specific about your load?
> Are you sure it is zlib?

Quite sure: at least to the extent that it is fixed by lowering the
compression level. But the wording was inexact: it's during object
creation, which happens at initial "git add" and then later during "git
commit".

But...

> On my 1.8Ghz Athlon, compressing 45MB of zeros into 20K takes about 2s.
> Compressing 45MB of random data into a 45MB object takes 6.3s. In either
> case, the commit takes only about 0.5s (since cogito stores the object
> during the cg-add).
> 
> Is there some specific file pattern which is slow to compress? 

yes, it seems so. At least the effect is much more pronounced for my
files than for random/null data. "My" files are in this context generated
data files, binary or ascii.

Here's a test with "time gzip -[169] -c file >/dev/null". Random data
from /dev/urandom, kernel headers are concatenation of *.h in kernel
sources. All times in seconds, on my puny home computer (1GHz Via Nehemiah)

       random (23MB)  data (23MB)   headers (44MB)
-9     10.2           72.5          38.5
-6     10.2           13.5          12.9
-1      9.9            4.1           7.0

So... data dependent, yes. But it hits even for normal source code.
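
To put a number on "data dependent", the -9 and -1 rows of the table can
be compared directly. The sketch below only does that arithmetic; the
helper name slowdown() is made up for illustration, and the inputs are the
timings from the table above.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many times slower one run is than another,
 * with both timings given in seconds.
 */
static double slowdown(double slow_seconds, double fast_seconds)
{
	assert(fast_seconds > 0.0);
	return slow_seconds / fast_seconds;
}
```

With the figures above, slowdown(72.5, 4.1) is roughly 17.7 for the data
file, slowdown(38.5, 7.0) is 5.5 for the kernel headers, and
slowdown(10.2, 9.9) is about 1.03 for random data.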

(Btw; the default (-6) seems to be less data dependent than the other
values. Maybe that's on purpose.)

If you want to look at a highly-variable dataset (the one above), try
http://lupus.ig3.net/SIMULATION.dx.gz (5MB, slow server). That's just
one example; I see the same variability on binary data files as well.

-j.

^ permalink raw reply

* Re: [PATCH 2nd try] Make git-fmt-merge-msg a builtin
From: Junio C Hamano @ 2006-07-03 22:39 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0607031717540.29667@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
>
> ---
> 	This retires git-fmt-merge-msg.perl, since it passes all the
> 	tests, but does not remove the Perl version yet.

There is no point not removing the script if you update git.c
and Makefile to point at the new one.

We do not have a test specific for fmt-merge-msg, so it
obviously passes all the tests ;-).  A new one is attached.

I think we should extend boolean to accept 'yes' and 'no', as I
suggested earlier on the list, but other than that things look
good.

Thanks for the patch; no need to resubmit -- I'll munge the
points I raised above.

-- >8 --
diff --git a/t/t6200-fmt-merge-msg.sh b/t/t6200-fmt-merge-msg.sh
new file mode 100755
index 0000000..63e49f3
--- /dev/null
+++ b/t/t6200-fmt-merge-msg.sh
@@ -0,0 +1,163 @@
+#!/bin/sh
+#
+# Copyright (c) 2006, Junio C Hamano
+#
+
+test_description='fmt-merge-msg test'
+
+. ./test-lib.sh
+
+datestamp=1151939923
+setdate () {
+	GIT_COMMITTER_DATE="$datestamp +0200"
+	GIT_AUTHOR_DATE="$datestamp +0200"
+	datestamp=`expr "$datestamp" + 1`
+	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
+}
+
+test_expect_success setup '
+	echo one >one &&
+	git add one &&
+	setdate &&
+	git commit -m "Initial" &&
+
+	echo uno >one &&
+	echo dos >two &&
+	git add two &&
+	setdate &&
+	git commit -a -m "Second" &&
+
+	git checkout -b left &&
+
+	echo $datestamp >one &&
+	setdate &&
+	git commit -a -m "Common #1" &&
+
+	echo $datestamp >one &&
+	setdate &&
+	git commit -a -m "Common #2" &&
+
+	git branch right &&
+
+	echo $datestamp >two &&
+	setdate &&
+	git commit -a -m "Left #3" &&
+
+	echo $datestamp >two &&
+	setdate &&
+	git commit -a -m "Left #4" &&
+
+	echo $datestamp >two &&
+	setdate &&
+	git commit -a -m "Left #5" &&
+
+	git checkout right &&
+
+	echo $datestamp >three &&
+	git add three &&
+	setdate &&
+	git commit -a -m "Right #3" &&
+
+	echo $datestamp >three &&
+	setdate &&
+	git commit -a -m "Right #4" &&
+
+	echo $datestamp >three &&
+	setdate &&
+	git commit -a -m "Right #5" &&
+
+	git show-branch
+'
+
+cat >expected <<\EOF
+Merge branch 'left'
+EOF
+
+test_expect_success 'merge-msg test #1' '
+
+	git checkout master &&
+	git fetch . left &&
+
+	git fmt-merge-msg <.git/FETCH_HEAD >actual &&
+	diff -u actual expected
+'
+
+cat >expected <<\EOF
+Merge branch 'left' of ../trash
+EOF
+
+test_expect_success 'merge-msg test #2' '
+
+	git checkout master &&
+	git fetch ../trash left &&
+
+	git fmt-merge-msg <.git/FETCH_HEAD >actual &&
+	diff -u actual expected
+'
+
+cat >expected <<\EOF
+Merge branch 'left'
+
+* left:
+  Left #5
+  Left #4
+  Left #3
+  Common #2
+  Common #1
+EOF
+
+test_expect_success 'merge-msg test #3' '
+
+	git repo-config merge.summary true &&
+
+	git checkout master &&
+	setdate &&
+	git fetch . left &&
+
+	git fmt-merge-msg <.git/FETCH_HEAD >actual &&
+	diff -u actual expected
+'
+
+cat >expected <<\EOF
+Merge branches 'left' and 'right'
+
+* left:
+  Left #5
+  Left #4
+  Left #3
+  Common #2
+  Common #1
+
+* right:
+  Right #5
+  Right #4
+  Right #3
+  Common #2
+  Common #1
+EOF
+
+test_expect_success 'merge-msg test #4' '
+
+	git repo-config merge.summary true &&
+
+	git checkout master &&
+	setdate &&
+	git fetch . left right &&
+
+	git fmt-merge-msg <.git/FETCH_HEAD >actual &&
+	diff -u actual expected
+'
+
+test_expect_success 'merge-msg test #5' '
+
+	git repo-config merge.summary yes &&
+
+	git checkout master &&
+	setdate &&
+	git fetch . left right &&
+
+	git fmt-merge-msg <.git/FETCH_HEAD >actual &&
+	diff -u actual expected
+'
+
+test_done

^ permalink raw reply related

* Re: [PATCH 3/3] Make clear_commit_marks() clean harder
From: Linus Torvalds @ 2006-07-03 22:55 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.63.0607032309190.29667@wbgn013.biozentrum.uni-wuerzburg.de>



On Mon, 3 Jul 2006, Johannes Schindelin wrote:
> 
> Traversing is actually wrong. Clearing the marks does not mean to clear 
> them on commits we did not even mark!

If we didn't mark them, then clearing them would be a no-op, so nobody 
really cares.

> But clearing on commits we _have_ -- but not parsed -- is important, 
> obviously.

Right. The point is, some logic can choose to mark commits UNINTERESTING 
without even parsing that commit, and it would be a good thing. You only 
need to parse the commit once you decide that you need its parents (or 
its tree, of course), but you may be able to mark it uninteresting before 
that.

This is why it is _wrong_ to care about the "parsed" bit when clearing the 
flags.
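
A rough sketch of clearing along those lines, assuming simplified
stand-in structures rather than git's real definitions (the struct
layouts and the name clear_marks_sketch() are made up for illustration):
the flags are cleared whether or not the commit has been parsed, and the
walk stops at any commit that carries none of the marks.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not git's actual structures. */
struct commit;

struct commit_list {
	struct commit *item;
	struct commit_list *next;
};

struct commit {
	unsigned int flags;          /* marks such as SEEN or UNINTERESTING */
	int parsed;                  /* nonzero once the parents were read */
	struct commit_list *parents; /* stays NULL until the commit is parsed */
};

static void clear_marks_sketch(struct commit *commit, unsigned int mask)
{
	struct commit_list *p;

	assert(commit != NULL);

	/* Have we already cleared this? Then there is nothing to do. */
	if (!(mask & commit->flags))
		return;

	/* Clear without looking at the parsed bit at all. */
	commit->flags &= ~mask;

	/* An unparsed commit has no parent list loaded, so this loop is empty. */
	for (p = commit->parents; p; p = p->next)
		clear_marks_sketch(p->item, mask);
}
```

The recursion keeps the sketch short; on a very deep history an explicit
work list would avoid deep call stacks.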

		Linus

^ permalink raw reply

* git-fetch per-repository speed issues
From: Keith Packard @ 2006-07-03 18:02 UTC (permalink / raw)
  To: Git Mailing List; +Cc: keithp


Ok, so maybe X.org is using git in an unexpected (or even wrong)
fashion. Our environment has split development across dozens of separate
repositories which match ABI interfaces. With CVS, we were able to keep
this all in one giant CVS repository with separate modules, but git
doesn't have that notion (which is mostly good). As such, we could use
cvsup or rsync to update the entire collection of modules.

With git, we'd prefer to use the git protocol instead of rsync for the
usual pack-related reasons, but that is limited to a single repository
at a time. And, it's painfully slow, even when the repository is up to
date:

$ cd lib/libXrandr
$ time git-fetch origin
...

real    0m17.035s
user    0m2.584s
sys     0m0.576s

This is a repository with 24 files and perhaps 50 revisions. Given
X.org's 307 git repositories, I'll clearly need to find a faster way
than running git-fetch on every one.

One thing I noticed was that the git+ssh syntax found in remotes files
doesn't do what I thought it did -- I assumed this would use 'git' for
fetch and 'ssh' for push, when in fact it just uses ssh for everything.
This slows down the connection process by several seconds.

-- 
keith.packard@intel.com


^ permalink raw reply

* Re: Compression speed for large files
From: Linus Torvalds @ 2006-07-03 23:02 UTC (permalink / raw)
  To: Joachim Berdal Haga; +Cc: Jeff King, Joachim B Haga, git
In-Reply-To: <44A99961.8090504@fys.uio.no>



On Tue, 4 Jul 2006, Joachim Berdal Haga wrote:
> 
> Here's a test with "time gzip -[169] -c file >/dev/null". Random data
> from /dev/urandom, kernel headers are concatenation of *.h in kernel
> sources. All times in seconds, on my puny home computer (1GHz Via Nehemiah)

That "Via Nehemiah" is probably a big part of it.

I think the VIA Nehemiah just has a 64kB L2 cache, and I bet performance 
plummets if the tables end up being used past that. 

And I think a large part of the higher compressions is that they allow the 
compression window and tables to grow bigger.

		Linus

^ permalink raw reply

* Re: git-fetch per-repository speed issues
From: Linus Torvalds @ 2006-07-03 23:14 UTC (permalink / raw)
  To: Keith Packard; +Cc: Git Mailing List
In-Reply-To: <1151949764.4723.51.camel@neko.keithp.com>



On Mon, 3 Jul 2006, Keith Packard wrote:
> 
> With git, we'd prefer to use the git protocol instead of rsync for the
> usual pack-related reasons, but that is limited to a single repository
> at a time.

Well, you could use multiple branches in the same repository, even if they 
are totally unrelated. That would allow you to fetch them all in one go.

One way to do that is to just name the branches hierarchically: have one 
repo, but then call the branches something like

	libXrandr/master
	libXrandr/develop
	Xorg/master
	Xorg/develop
	..

> And, it's painfully slow, even when the repository is up to
> date:
> 
> $ cd lib/libXrandr
> $ time git-fetch origin
> ...
> 
> real    0m17.035s
> user    0m2.584s
> sys     0m0.576s

That's _seriously_ wrong. If everything is up-to-date, a fetch should be 
basically zero-cost. That's especially true with the anonymous git 
protocol, which doesn't have any connection validation overhead (for the 
ssh protocol, the cost is usually the ssh login).

But there may well be some bug there.

Look at this:

	[torvalds@g5 git]$ time git fetch git://git.kernel.org/pub/scm/git/git.git 
	
	real    0m0.431s
	user    0m0.036s
	sys     0m0.024s

and that's over my DSL line, not some studly network thing. 

Basically, a repo that is up-to-date should do a "git fetch" about as 
quickly as it does a "git ls-remote". Which in turn really shouldn't be 
doing much anything at all, apart from the connect itself:

	[torvalds@g5 git]$ time git ls-remote master.kernel.org:/pub/scm/git/git.git > /dev/null 
	
	real    0m1.758s
	user    0m0.188s
	sys     0m0.024s
	[torvalds@g5 git]$ time git ls-remote git://git.kernel.org/pub/scm/git/git.git > /dev/null 
	
	real    0m0.431s
	user    0m0.056s
	sys     0m0.016s

(note how the ssh connection is much slower - it actually ends up doing 
all the ssh back-and-forth).

Can you try from different hosts? One problem may be the remote end 
just trying to do reverse DNS lookups for xinetd or whatever?

Also, one thing to try is to just do

	strace -Ttt git-peek-remote ...

which shows where the time is going (I selected "git-peek-remote", because 
that's a simple program).

		Linus

^ permalink raw reply

* Re: git-cvsimport gets parents wrong for branches
From: Martin Langhoff @ 2006-07-03 23:15 UTC (permalink / raw)
  To: Elrond, git
In-Reply-To: <20060703215303.GA24572@memak.tu-darmstadt.de>

Elrond,

you are right, the current git-cvsimport takes a very naive approach
to determine where branches open from. It uses cvsps internally, which
only reports on the ancestor branch, so we take the latest commit from
the ancestor.

Parsecvs probably has a more sophisticated approach, have you tried it?

It is pretty hard to get that one right in any case, as there are
cases where the new branch starts from something that is not a commit
in the parent (from GIT's perspective). So representing the branching
point would mean pointing to non-existing commits as parents.

If the cvs2svn documentation is not lying, it probably has the
smartest/correctest implementation. For small-medium repos, you may be
able to run cvs2svn and then import with git-svnimport.

cheers,


martin

^ permalink raw reply

* Re: git-fetch per-repository speed issues
From: Jeff King @ 2006-07-04  0:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Keith Packard, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0607031603290.12404@g5.osdl.org>

On Mon, Jul 03, 2006 at 04:14:10PM -0700, Linus Torvalds wrote:

> Well, you could use multiple branches in the same repository, even if they 
> are totally unrelated. That would allow you to fetch them all in one go.

One annoying thing about this is that you may want to have several of
the branches checked out at a time (i.e., you want the actual directory
structure of libXrandr/, Xorg/, etc). You could pull everything down
into one repo and point small pseudo-repos at it with alternates, but I
would think that would become a mess with pushes. You can do some magic
with read-tree --prefix, but again, I'm not sure how you'd make commits
on the correct branch.  Is there an easier way to do this?

> Basically, a repo that is up-to-date should do a "git fetch" about as 
> quickly as it does a "git ls-remote". Which in turn really shouldn't be 
> doing much anything at all, apart from the connect itself:

Fetching by ssh actually makes two ssh connections (the second is to
grab tags).

-Peff

^ permalink raw reply

