From: Michele Ballabio <barra_cuda@katamail.com>
To: aroben@apple.com
Cc: git@vger.kernel.org
Subject: [PATCH 5/8] Git.pm: Add hash_and_insert_object and cat_blob
Date: Fri, 23 May 2008 16:19:40 +0200 [thread overview]
Message-ID: <1211552384-29636-6-git-send-email-barra_cuda@katamail.com> (raw)
In-Reply-To: <1211552384-29636-1-git-send-email-barra_cuda@katamail.com>
From: Adam Roben <aroben@apple.com>
These functions are more efficient ways of executing `git hash-object -w` and
`git cat-file blob` when you are dealing with many files/objects.
Signed-off-by: Adam Roben <aroben@apple.com>
---
perl/Git.pm | 152 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 150 insertions(+), 2 deletions(-)
diff --git a/perl/Git.pm b/perl/Git.pm
index d766974..6ba8ee5 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -39,6 +39,10 @@ $VERSION = '0.01';
my $lastrev = $repo->command_oneline( [ 'rev-list', '--all' ],
STDERR => 0 );
+ my $sha1 = $repo->hash_and_insert_object('file.txt');
+ my $tempfile = tempfile();
+ my $size = $repo->cat_blob($sha1, $tempfile);
+
=cut
@@ -218,7 +222,6 @@ sub repository {
bless $self, $class;
}
-
=back
=head1 METHODS
@@ -734,6 +737,147 @@ sub hash_object {
}
+=item hash_and_insert_object ( FILENAME )
+
+Compute the SHA1 object id of the given C<FILENAME> and add the object to the
+object database.
+
+The function returns the SHA1 hash.
+
+=cut
+
+# TODO: Support for passing FILEHANDLE instead of FILENAME
+sub hash_and_insert_object {
+ my ($self, $filename) = @_;
+
+ carp "Bad filename \"$filename\"" if $filename =~ /[\r\n]/;
+
+ $self->_open_hash_and_insert_object_if_needed();
+ my ($in, $out) = ($self->{hash_object_in}, $self->{hash_object_out});
+
+ unless (print $out $filename, "\n") {
+ $self->_close_hash_and_insert_object();
+ throw Error::Simple("out pipe went bad");
+ }
+
+ chomp(my $hash = <$in>);
+ unless (defined($hash)) {
+ $self->_close_hash_and_insert_object();
+ throw Error::Simple("in pipe went bad");
+ }
+
+ return $hash;
+}
+
+sub _open_hash_and_insert_object_if_needed {
+ my ($self) = @_;
+
+ return if defined($self->{hash_object_pid});
+
+ ($self->{hash_object_pid}, $self->{hash_object_in},
+ $self->{hash_object_out}, $self->{hash_object_ctx}) =
+ command_bidi_pipe(qw(hash-object -w --stdin-paths));
+}
+
+sub _close_hash_and_insert_object {
+ my ($self) = @_;
+
+ return unless defined($self->{hash_object_pid});
+
+ my @vars = map { 'hash_object_' . $_ } qw(pid in out ctx);
+
+ command_close_bidi_pipe($self->{@vars});
+ delete $self->{@vars};
+}
+
+=item cat_blob ( SHA1, FILEHANDLE )
+
+Prints the contents of the blob identified by C<SHA1> to C<FILEHANDLE> and
+returns the number of bytes printed.
+
+=cut
+
+sub cat_blob {
+ my ($self, $sha1, $fh) = @_;
+
+ $self->_open_cat_blob_if_needed();
+ my ($in, $out) = ($self->{cat_blob_in}, $self->{cat_blob_out});
+
+ unless (print $out $sha1, "\n") {
+ $self->_close_cat_blob();
+ throw Error::Simple("out pipe went bad");
+ }
+
+ my $description = <$in>;
+ if ($description =~ / missing$/) {
+ carp "$sha1 doesn't exist in the repository";
+ return 0;
+ }
+
+ if ($description !~ /^[0-9a-fA-F]{40} \S+ (\d+)$/) {
+ carp "Unexpected result returned from git cat-file";
+ return 0;
+ }
+
+ my $size = $1;
+
+ my $blob;
+ my $bytesRead = 0;
+
+ while (1) {
+ my $bytesLeft = $size - $bytesRead;
+ last unless $bytesLeft;
+
+ my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
+ my $read = read($in, $blob, $bytesToRead, $bytesRead);
+ unless (defined($read)) {
+ $self->_close_cat_blob();
+ throw Error::Simple("in pipe went bad");
+ }
+
+ $bytesRead += $read;
+ }
+
+ # Skip past the trailing newline.
+ my $newline;
+ my $read = read($in, $newline, 1);
+ unless (defined($read)) {
+ $self->_close_cat_blob();
+ throw Error::Simple("in pipe went bad");
+ }
+ unless ($read == 1 && $newline eq "\n") {
+ $self->_close_cat_blob();
+ throw Error::Simple("didn't find newline after blob");
+ }
+
+ unless (print $fh $blob) {
+ $self->_close_cat_blob();
+ throw Error::Simple("couldn't write to passed in filehandle");
+ }
+
+ return $size;
+}
+
+sub _open_cat_blob_if_needed {
+ my ($self) = @_;
+
+ return if defined($self->{cat_blob_pid});
+
+ ($self->{cat_blob_pid}, $self->{cat_blob_in},
+ $self->{cat_blob_out}, $self->{cat_blob_ctx}) =
+ command_bidi_pipe(qw(cat-file --batch));
+}
+
+sub _close_cat_blob {
+ my ($self) = @_;
+
+ return unless defined($self->{cat_blob_pid});
+
+ my @vars = map { 'cat_blob_' . $_ } qw(pid in out ctx);
+
+ command_close_bidi_pipe($self->{@vars});
+ delete $self->{@vars};
+}
=back
@@ -951,7 +1095,11 @@ sub _cmd_close {
}
-sub DESTROY { }
+sub DESTROY {
+ my ($self) = @_;
+ $self->_close_hash_and_insert_object();
+ $self->_close_cat_blob();
+}
# Pipe implementation for ActiveState Perl.
--
1.5.5.1
next prev parent reply other threads:[~2008-05-23 14:17 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-05-23 14:19 [PATCH 0/8] Speed up git-svn Michele Ballabio
2008-05-23 14:19 ` [PATCH 1/8] Move git-hash-object tests from t5303 to t1007 Michele Ballabio
2008-05-23 14:19 ` [PATCH 2/8] Add more tests for git hash-object Michele Ballabio
2008-05-23 14:19 ` [PATCH 3/8] git-hash-object: Add --stdin-paths option Michele Ballabio
2008-05-23 14:19 ` [PATCH 4/8] Git.pm: Add command_bidi_pipe and command_close_bidi_pipe Michele Ballabio
2008-05-23 14:19 ` Michele Ballabio [this message]
2008-05-23 14:19 ` [PATCH 6/8] git-svn: Speed up fetch Michele Ballabio
2008-05-23 14:19 ` [PATCH 7/8] builtin-cat-file.c: use parse_options() Michele Ballabio
2008-05-23 14:19 ` [PATCH 8/8] change quoting in test t1006-cat-file.sh Michele Ballabio
2008-05-23 14:19 ` [PATCH 0/8] Speed up git-svn Michele Ballabio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1211552384-29636-6-git-send-email-barra_cuda@katamail.com \
--to=barra_cuda@katamail.com \
--cc=aroben@apple.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).