From: Eric Wong <normalperson@yhbt.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>, Marcus Griep <marcus@griep.us>
Subject: Re: [PATCH 3/3] git-svn: Reduce temp file usage when dealing with non-links
Date: Tue, 12 Aug 2008 20:29:56 -0700 [thread overview]
Message-ID: <20080813032956.GC5904@untitled> (raw)
In-Reply-To: <1218556876-26554-1-git-send-email-marcus@griep.us>
Marcus Griep <marcus@griep.us> wrote:
> Currently, in sub 'close_file', git-svn creates a temporary file and
> copies the contents of the blob to be written into it. This is useful
> for symlinks because svn stores symlinks in the form:
>
> link $FILE_PATH
>
> Git creates a blob only out of '$FILE_PATH' and uses file mode to
> indicate that the blob should be interpreted as a symlink.
>
> As git-hash-object is invoked with --stdin-paths, a duplicate of the
> link from svn must be created that leaves off the first five bytes,
> i.e. 'link '. However, this is wholly unnecessary for normal blobs,
> though, as we already have a temp file with their contents. Copying
> the entire file gains nothing, and effectively requires a file to be
> written twice before making it into the object db.
>
> This patch corrects that issue, holding onto the substr-like
> duplication for symlinks, but skipping it altogether for normal blobs
> by reusing the existing temp file.
>
> Signed-off-by: Marcus Griep <marcus@griep.us>
Thank you Marcus!
Acked-by: Eric Wong <normalperson@yhbt.net>
> ---
> git-svn.perl | 46 ++++++++++++++++++++++------------------------
> 1 files changed, 22 insertions(+), 24 deletions(-)
>
> diff --git a/git-svn.perl b/git-svn.perl
> index 9eae5e8..95d1510 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -3268,38 +3268,36 @@ sub close_file {
> "expected: $exp\n got: $got\n";
> }
> }
> - sysseek($fh, 0, 0) or croak $!;
> if ($fb->{mode_b} == 120000) {
> - eval {
> - sysread($fh, my $buf, 5) == 5 or croak $!;
> - $buf eq 'link ' or die "$path has mode 120000",
> - " but is not a link";
> - };
> - if ($@) {
> - warn "$@\n";
> - sysseek($fh, 0, 0) or croak $!;
> - }
> - }
> -
> - my $tmp_fh = Git::temp_acquire('svn_hash');
> - my $result;
> - while ($result = sysread($fh, my $string, 1024)) {
> - my $wrote = syswrite($tmp_fh, $string, $result);
> - defined($wrote) && $wrote == $result
> - or croak("write ",
> - $tmp_fh->filename, ": $!\n");
> - }
> - defined $result or croak $!;
> + sysseek($fh, 0, 0) or croak $!;
> + sysread($fh, my $buf, 5) == 5 or croak $!;
>
> + unless ($buf eq 'link ') {
> + warn "$path has mode 120000",
> + " but is not a link\n";
> + } else {
> + my $tmp_fh = Git::temp_acquire('svn_hash');
> + my $res;
> + while ($res = sysread($fh, my $str, 1024)) {
> + my $out = syswrite($tmp_fh, $str, $res);
> + defined($out) && $out == $res
> + or croak("write ",
> + $tmp_fh->filename,
> + ": $!\n");
> + }
> + defined $result or croak $!;
>
> - Git::temp_release($fh, 1);
> + ($fh, $tmp_fh) = ($tmp_fh, $fh);
> + Git::temp_release($tmp_fh, 1);
> + }
> + }
>
> $hash = $::_repository->hash_and_insert_object(
> - $tmp_fh->filename);
> + $fh->filename);
> $hash =~ /^[a-f\d]{40}$/ or die "not a sha1: $hash\n";
>
> Git::temp_release($fb->{base}, 1);
> - Git::temp_release($tmp_fh, 1);
> + Git::temp_release($fh, 1);
> } else {
> $hash = $fb->{blob} or die "no blob information\n";
> }
> --
> 1.6.0.rc2.6.g8eda3
next prev parent reply other threads:[~2008-08-13 3:31 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-08 22:41 [PATCH] git-svn: Make it scream by minimizing temp files Marcus Griep
2008-08-08 22:59 ` Junio C Hamano
2008-08-09 1:12 ` Marcus Griep
2008-08-09 6:25 ` Eric Wong
2008-08-09 15:45 ` Marcus Griep
2008-08-10 1:46 ` Eric Wong
2008-08-10 3:53 ` Junio C Hamano
2008-08-10 7:47 ` Eric Wong
2008-08-10 8:26 ` Junio C Hamano
2008-08-10 8:09 ` Eric Wong
2008-08-11 15:53 ` [PATCH 0/3] git-svn and temporary file improvements Marcus Griep
2008-08-11 15:53 ` [PATCH 1/3] Git.pm: Add faculties to allow temp files to be cached Marcus Griep
2008-08-11 15:53 ` [PATCH 2/3] git-svn: Make it scream by minimizing temp files Marcus Griep
2008-08-11 15:53 ` [PATCH 3/3] git-svn: Reduce temp file usage when dealing with non-links Marcus Griep
2008-08-12 3:37 ` Eric Wong
2008-08-12 15:53 ` Marcus Griep
2008-08-12 16:01 ` Marcus Griep
2008-08-12 16:45 ` [PATCH v2 " Marcus Griep
2008-08-13 3:29 ` Eric Wong [this message]
2008-08-13 3:42 ` [PATCH " Marcus Griep
2008-08-13 3:52 ` Eric Wong
2008-08-12 3:14 ` [PATCH 2/3] git-svn: Make it scream by minimizing temp files Eric Wong
2008-08-12 15:50 ` Marcus Griep
2008-08-12 16:00 ` [PATCH 2/3] git-svn: Make it incrementally faster " Marcus Griep
2008-08-13 3:29 ` Eric Wong
2008-08-12 3:08 ` [PATCH 1/3] Git.pm: Add faculties to allow temp files to be cached Eric Wong
2008-08-12 15:41 ` Marcus Griep
2008-08-12 16:00 ` Marcus Griep
2008-08-13 3:28 ` Eric Wong
2008-08-13 20:05 ` Lea Wiemann
2008-08-13 20:13 ` Marcus Griep
2008-08-13 20:31 ` Marcus Griep
2008-08-13 20:38 ` Junio C Hamano
2008-08-13 22:28 ` Lea Wiemann
2008-08-13 22:30 ` [PATCH] Git.pm: require Perl 5.6.1 Lea Wiemann
2008-08-14 6:58 ` [PATCH 1/3] Git.pm: Add faculties to allow temp files to be cached Eric Wong
2008-08-15 15:10 ` [PATCH] Git.pm: Make File::Spec and File::Temp requirement lazy Marcus Griep
2008-08-15 19:31 ` Bryan Donlan
2008-08-15 19:46 ` Marcus Griep
2008-08-15 19:53 ` [PATCH v2] " Marcus Griep
2008-08-13 20:52 ` [PATCH 1/3] Git.pm: Add faculties to allow temp files to be cached Miklos Vajna
2008-08-14 6:29 ` Junio C Hamano
2008-08-14 14:35 ` Marcus Griep
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080813032956.GC5904@untitled \
--to=normalperson@yhbt.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=marcus@griep.us \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).