From: "Shawn O. Pearce" <spearce@spearce.org>
To: Paolo Bonzini <bonzini@gnu.org>
Cc: Joshua Jensen <jjensen@workspacewhiz.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Leaving large binaries out of the packfile
Date: Fri, 11 Jun 2010 09:17:44 -0700
Message-ID: <20100611161744.GS14847@spearce.org>
In-Reply-To: <4C12566E.6020607@gnu.org>
Paolo Bonzini <bonzini@gnu.org> wrote:
> On 06/10/2010 08:04 PM, Shawn O. Pearce wrote:
>> Joshua Jensen<jjensen@workspacewhiz.com> wrote:
>>> Sometimes, 'git gc' runs out of memory. I have to discover which file
>>> is causing the problem, so I can add it to .gitattributes with a
>>> '-delta' flag. Mostly, though, the repacking takes forever, and I dread
>>> running the operation.
>>
>> If you have the list of big objects, you can put them into their
>> own pack file manually. Feed their SHA-1 names on stdin to git
>> pack-objects, and save the resulting pack under .git/objects/pack.
>
> Do you know any simpler way than
>
>   git log --pretty=format:%H | while read x; do
>     git ls-tree $x -- ChangeLog | awk '{print $3}'
>   done | sort -u
>
> to do this? I thought it would be nice to add --sha1-only to
> git-ls-tree, but maybe I'm missing some other trick.

Maybe

  git rev-list --objects HEAD | grep ' ChangeLog'

pack-objects wants the output of rev-list --objects as input, file
name and all. So it's just a matter of selecting the right lines
from its output.
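
As a rough sketch of that pipeline end to end: the script below builds a
throwaway repository (the temporary directory, the three ChangeLog
revisions, the big-objects.txt file name and the demo identity are all
made up for illustration), selects the ChangeLog lines from rev-list
--objects, and feeds them to pack-objects unchanged. The grep is anchored
with '$' here, which is slightly stricter than the one-liner above.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch runs on its own (hypothetical data).
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3; do
    echo "entry $i" >> ChangeLog
    git add ChangeLog
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "rev $i"
done

# Every version of ChangeLog reachable from HEAD, as "<sha1> <path>" lines.
git rev-list --objects HEAD | grep ' ChangeLog$' > big-objects.txt

# pack-objects reads those lines on stdin, writes pack-<hash>.pack and
# pack-<hash>.idx using the given base name, and prints <hash> on stdout.
pack_hash=$(git pack-objects -q .git/objects/pack/pack < big-objects.txt)

pack_count=$(wc -l < big-objects.txt | tr -d ' ')
echo "packed $pack_count objects into pack-$pack_hash.pack"
```

In a real repository you would swap the grep pattern for whatever matches
your large files; the rest should carry over as is.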
>> Assuming the pack was called pack-DEADC0FFEE.pack, create a file
>> called pack-DEADC0FFEE.keep in the same directory. This will stop
>> Git from trying to repack the contents of that pack file.
>>
>> Now run `git gc` to remove those huge objects from the pack file
>> that contains all of the other stuff.
>
> That obviously wouldn't help if these large binaries are updated often,
> however.
No, it doesn't. But you still could do this on a periodic basis.
That way you only drag around a handful of recently created large
binaries during a typical `git gc`, and not the entire project's
history of them.
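
Putting the whole periodic routine together might look roughly like
this. It is only a sketch against a scratch repository: the temporary
directory, the file names ChangeLog and main.c, and the demo identity
are invented for the example, and ChangeLog merely stands in for a
large binary.

```shell
#!/bin/sh
set -e

# Scratch repository with one "big" file and one ordinary file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3; do
    echo "entry $i" >> ChangeLog
    echo "code $i" >> main.c
    git add ChangeLog main.c
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "rev $i"
done

pack_dir=.git/objects/pack

# Step 1: put every version of the big file into its own pack.
pack_hash=$(git rev-list --objects HEAD | grep ' ChangeLog$' |
    git pack-objects -q "$pack_dir/pack")

# Step 2: a .keep file next to the pack stops Git from repacking it.
touch "$pack_dir/pack-$pack_hash.keep"

# Step 3: gc packs everything else and leaves the kept pack alone.
git gc -q

# Inspect the result: the kept pack should still hold the big blobs.
kept_objects=$(git verify-pack -v "$pack_dir/pack-$pack_hash.idx" |
    grep -c ' blob ')
pack_total=$(ls "$pack_dir"/*.pack | wc -l | tr -d ' ')
echo "kept pack holds $kept_objects blobs; $pack_total packs in total"
```

Rerunning steps 1 and 2 every so often creates another kept pack for the
large objects added since the last run, so each `git gc` only has to
deal with the recent ones.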
--
Shawn.