From: Andreas Ericsson <ae@op5.se>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Andy Parkins <andyparkins@gmail.com>, git@vger.kernel.org
Subject: Re: git-fetching from a big repository is slow
Date: Thu, 14 Dec 2006 16:06:05 +0100
Message-ID: <4581685D.1070407@op5.se>
In-Reply-To: <Pine.LNX.4.63.0612141513130.3635@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin wrote:
> Hi,
>
> On Thu, 14 Dec 2006, Andreas Ericsson wrote:
>
>> Andy Parkins wrote:
>>> Hello,
>>>
>>> I've got a big repository. I've got two computers. One has the repository
>>> up-to-date (164M after repack); one is behind (30M ish).
>>>
>>> I used git-fetch to try and update; and the sync took HOURS. I zipped the
>>> .git directory and transferred that and it took about 15 minutes to
>>> transfer.
>>>
>>> Am I doing something wrong? The git-fetch was done with a git+ssh:// URL.
>>> The zip transfer with scp (so ssh shouldn't be a factor).
>>>
>> This seems to happen if your repository consists of many large binary files,
>> especially many large binary files of several versions that do not deltify
>> well against each other. Perhaps it's worth adding gzip compression detection
>> to git? I imagine more people than me are tracking gzipped/bzip2'ed content
>> that pretty much never deltifies well against anything else.
>
> Or we add something like the heuristics we discovered in another thread,
> where rename detection (which is related to delta candidate searching) is
> not started if the sizes differ drastically.
>
It wouldn't work for this particular case though. In our distribution
repository we have ~300 bzip2-compressed tarballs with an average size
of 3 MiB. 240 of those are between 2.5 and 4 MiB, so their sizes don't
differ drastically, but they don't delta well against each other either.

One option would be to add some sort of config option to skip attempting
deltas for files with a certain suffix. That way we could just tell it to
ignore *.gz, *.tgz and *.bz2, and everything would work just as it does
today, but a lot faster.
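
To make the idea a bit more concrete, here is a rough sketch in Python.
It is not git code and not a proposed patch: the NO_DELTA_PATTERNS list
stands in for the hypothetical config option, and skip_delta() and
partition_objects() are made-up names. It only illustrates filtering
paths by suffix before any delta search is attempted.

```python
from fnmatch import fnmatchcase

# Hypothetical config value: glob patterns for files that should be
# stored whole instead of being considered as delta candidates.
NO_DELTA_PATTERNS = ["*.gz", "*.tgz", "*.bz2"]


def skip_delta(path, patterns=NO_DELTA_PATTERNS):
    """Return True if the delta search should be skipped for this path."""
    # Match against the file name only, not the leading directories.
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def partition_objects(paths, patterns=NO_DELTA_PATTERNS):
    """Split paths into (delta_candidates, stored_whole), keeping order."""
    candidates, whole = [], []
    for path in paths:
        if skip_delta(path, patterns):
            whole.append(path)
        else:
            candidates.append(path)
    return candidates, whole
```

With something like this, only the first list would be handed to the
delta search; the already-compressed tarballs would be written out as
they are.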
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se