From: "Shawn O. Pearce" <spearce@spearce.org>
To: Anton Tropashko <atropashko@yahoo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>, git@vger.kernel.org
Subject: Re: Errors cloning large repo
Date: Fri, 9 Mar 2007 22:07:18 -0500
Message-ID: <20070310030718.GA2927@spearce.org>
In-Reply-To: <645002.46177.qm@web52608.mail.yahoo.com>
Anton Tropashko <atropashko@yahoo.com> wrote:
> again after git repack and don't see how to work around that aside from artificially
> splitting the tree at the top or resorting to a tarball on an ftp site.
> That 64 bit indexing code you previously mentioned would force me to upgrade git on both ends?
> Anywhere I can pull it out from?
I'm shocked you were able to repack an 8.5 GiB repository.
The default git-repack script that we ship assumes you want to
combine everything into one giant packfile; this is what is also
happening during git-clone. Clearly your system is rejecting this
packfile, and even if the OS allowed us to make that file, the
index offsets would all be wrong, as they are only 32 bits wide.
The repository becomes corrupt when those overflow.
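To put rough numbers on that, here is a small sketch. It assumes the index stores each offset as an unsigned 32-bit byte count and that the shell does 64-bit arithmetic (as bash does on current systems); the variable names are only for illustration.

```shell
#!/bin/bash
# Largest byte offset an unsigned 32-bit field can hold (assumption:
# the index offset is unsigned; a signed field would halve this).
max_offset=$(( (1 << 32) - 1 ))

# 8.5 GiB in bytes, using integer math: 17 * 2^30 / 2.
repo_bytes=$(( 17 * (1 << 30) / 2 ))

echo "max 32-bit offset: $max_offset bytes"
echo "repository size:   $repo_bytes bytes"

if [ "$repo_bytes" -gt "$max_offset" ]; then
    echo "a single pack of this size cannot be indexed with 32-bit offsets"
fi
```

Compression will make the real pack smaller than the raw repository size, so this only shows the order of magnitude involved.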
Troy Telford (with the help of Eric Biederman) recently posted a
patch that attempts to push the index to 64 bits:
http://thread.gmane.org/gmane.comp.version-control.git/40680/focus=40999
You can try Troy's patch. The 64 bit index work that Nico and I
are doing is *not* ready for anyone to use. It doesn't exist as a
compilable chunk of code. ;-)
Just to warn you, I have (re)done some of Troy's changes and Junio
has applied them to the current 'master' branch. So Troy's patch
would need to be applied to something that is further back, like
around 2007-02-28 (when Troy sent the patch). But my changes alone
are not enough to get "64 bit packfiles" working.
As Linus said earlier in this thread, Nico and I are working on
pushing out the packfile limits, just not fast enough for some users'
needs apparently (sorry about that!). Troy's patch was rejected
mainly because it is a file format change that is not backwards
compatible (once you use the 64 bit index, anything accessing that
repository *must* also support that).
Nico and I are working on other file format changes that are
more extensive than just expanding the index out to 64 bits, and
likewise are also not backwards compatible. To help users manage
the upgrades, we want to do a single file format change in 2007,
not two. So we are trying to be very sure that what we give Junio
for final application really is the best we can do this year.
Otherwise we would have worked with Troy to help test his patch and
get that into shape for application to the main git.git repository.
One thing that you could do is segment the repository into multiple
packfiles yourself, and then clone using rsync or http, rather than
using the native Git protocol.
For segmenting the repository, you would do something like:
    git rev-list --objects HEAD >S
    # segment S up into several files, e.g. T1, T2, T3
    for s in T*
    do
        # pack-objects prints the name (SHA-1) of the pack it wrote
        name=$(git pack-objects tmp <"$s")
        # the .keep file tells git-repack to leave this pack alone
        touch .git/objects/pack/pack-$name.keep
        mv tmp-$name.pack .git/objects/pack/pack-$name.pack
        mv tmp-$name.idx .git/objects/pack/pack-$name.idx
    done
    git prune-packed
The trick here is to segment S up into enough T1, T2, ... files such
that when packed they each are less than 2 GiB. You can then clone
this repository by copying the .git directory using more standard
filesystem tools, which is what a clone with rsync or http is
(more or less) doing.
Yes, the above process is horribly tedious and has some trial
and error involved in terms of selecting the packfile segmenting.
We don't have anything that can automate this right now.
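That said, the segmenting step could be roughed out along these lines. This is only a sketch: split_by_size is a hypothetical helper, not part of Git, and the size lookup shown in the trailing comment relies on `git cat-file --batch-check` with a format string, which needs a Git newer than anything discussed in this thread. It budgets on uncompressed object sizes, which only loosely predict the packed size, so the limit would still need some trial and error.

```shell
#!/bin/sh
# split_by_size LIMIT  (hypothetical helper, not a Git command)
# Reads "SIZE REST-OF-LINE" records on stdin and writes REST-OF-LINE
# to T1, T2, ... in the current directory. A new file is started
# whenever adding the next record would push the running total of
# SIZE values past LIMIT bytes.
split_by_size () {
    awk -v limit="$1" '
        BEGIN { n = 1; total = 0 }
        {
            size = $1 + 0
            sub(/^[0-9]+ /, "")     # drop the size, keep the original line
            if (total > 0 && total + size > limit) {
                close("T" n); n++; total = 0
            }
            print > ("T" n)
            total += size
        }'
}

# Possible use inside the repository, after "git rev-list --objects HEAD >S"
# (the 1000000000 byte budget is an arbitrary assumption):
#   cut -d' ' -f1 S | git cat-file --batch-check='%(objectsize)' >sizes
#   paste -d' ' sizes S | split_by_size 1000000000
```

Keeping the whole line from S, rather than only the object name, preserves the path names that pack-objects can use as hints when it looks for delta candidates.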
--
Shawn.