From: Nicolas Pitre <nico@cam.org>
To: Adam Heath <doogie@brainfood.com>
Cc: git@vger.kernel.org
Subject: Re: large(25G) repository in git
Date: Mon, 23 Mar 2009 21:19:57 -0400 (EDT)
Message-ID: <alpine.LFD.2.00.0903232056520.26337@xanadu.home>
In-Reply-To: <49C7FAB3.7080301@brainfood.com>
On Mon, 23 Mar 2009, Adam Heath wrote:
> Last friday, I was doing a checkin on the production server, and found
> 1.6G of new files. git was quite able at committing that. However,
> pushing was problematic. I was pushing over ssh; so, a new ssh
> connection was open to the preview server. After doing so, git tried
> to create a new pack file. This took *ages*, and the ssh connection
> died. So did git, when it finally got done with the new pack, and
> discovered the ssh connection was gone.
Strange. You could instruct ssh to keep the connection up with the
ServerAliveInterval option (see the ssh_config man page).
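For instance, a minimal sketch of such an entry, assuming the preview server is reached as preview.example.com (a placeholder host name) and that a probe every 60 seconds is enough; SSH_CONFIG_FILE is a hypothetical variable used here only so the target file can be overridden:

```shell
# Target file; defaults to the per-user ssh client configuration.
# SSH_CONFIG_FILE is a made-up override, not something ssh itself reads.
ssh_config="${SSH_CONFIG_FILE:-$HOME/.ssh/config}"
mkdir -p "$(dirname "$ssh_config")"

# ServerAliveInterval: seconds of silence before ssh sends a keepalive probe.
# ServerAliveCountMax: unanswered probes tolerated before ssh disconnects.
# Host name and values are placeholders; adjust them to the real server.
cat >> "$ssh_config" <<'EOF'
Host preview.example.com
    ServerAliveInterval 60
    ServerAliveCountMax 10
EOF
```

Note that ssh uses the first value it finds for each option, so an earlier matching Host block in the same file would take precedence.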
> So, to work around that, I ran git gc. When done, I discovered that
> git repacked the *entire* repository. While not something I care for,
> I can understand that, and live with it. It just took *hours* to do so.
>
> Then, what really annoys me, is that when I finally did the push, it
> tried sending the single 27G pack file, when the remote already had
> 25G of the repository in several different packs(the site was an
> hg->git conversion). This part is just unacceptable.
This shouldn't happen either. When pushing, git constructs a new pack
containing only the objects that need to be transmitted. Are you sure
it was really trying to send a 27G pack?
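If you want to check this, here is a small self-contained sketch. It uses scratch repositories under a temporary directory, made-up file names, and a remote named origin, none of which come from your setup. It counts the objects a push would need to transmit after the local side has been fully repacked with git gc:

```shell
# Scratch area: a bare "remote" and a local repository that pushes to it.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local" || exit 1
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/remote.git"

# First commit, pushed so that the remote already has it.
echo one > a.txt
git add a.txt
git commit -q -m "first"
git push -q origin HEAD:refs/heads/master

# Second commit, then repack the whole local repository into one pack.
echo two > b.txt
git add b.txt
git commit -q -m "second"
git gc --quiet

# Objects reachable from HEAD but not from what the remote already has.
to_send=$(git rev-list --objects origin/master..HEAD | wc -l | tr -d ' ')
echo "objects to send: $to_send"
```

In a real repository, the rev-list command alone, with your own remote-tracking branch in place of origin/master, gives a rough idea of how much a push has to send.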
> So, here are my questions/observations:
>
> 1: Handle the case of the ssh connection dying during git push(seems
> simple).
See above.
> 2: Is there an option to tell git to *not* be so thorough when trying
> to find similar files. videos/doc/pdf/etc aren't always very
> deltafiable, so I'd be happy to just do full content compares.
Look at the gitattributes documentation. One thing the documentation
appears to be missing is the "delta" attribute: unsetting it for a file
pattern disables delta compression for the files matching that pattern.
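As a rough sketch, assuming PDF and video files are the ones that compress poorly (the patterns and the sample path are placeholders, and the scratch repository is only there to make the example runnable):

```shell
# Scratch repository so the example can be run anywhere.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# "-delta" unsets the delta attribute for every path matching the pattern.
cat >> .gitattributes <<'EOF'
*.pdf -delta
*.avi -delta
*.mpg -delta
EOF

# Ask git how it resolves the attribute for a sample path.
result=$(git check-attr delta -- videos/sample.avi)
echo "$result"
```

In the real repository only the .gitattributes lines are needed; git check-attr is just a way to confirm that a given path picks up the setting.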
> 3: delta packs seem to be poorly done. it seems that if one repo gets
> repacked completely, that the entire new pack gets sent, when the
> target has most of the objects already.
This is not supposed to happen. Please provide more details if you can.
Nicolas