From: Paolo Bonzini <bonzini@gnu.org>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <gitster@pobox.com>,
David Tweed <david.tweed@gmail.com>,
Teemu Likonen <tlikonen@iki.fi>,
git@vger.kernel.org
Subject: Re: Why repository grows after "git gc"? / Purpose of *.keep files?
Date: Tue, 13 May 2008 07:08:19 +0200
Message-ID: <48292243.3050307@gnu.org>
In-Reply-To: <20080513000925.GA29038@spearce.org>
Shawn O. Pearce wrote:
> Junio C Hamano <gitster@pobox.com> wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>> I think git-clone marking a 150M linux-2.6 pack with .keep is wrong;
>>> most users working with the linux-2.6 sources have sufficient
>>> hardware to deal with the disk IO required to copy that with 100%
>>> delta reuse. But I have a repository at day-job with a 600M pack,
>>> that's starting to head into the realm where git-gc while running
>>> on battery on a laptop would prefer to have that .keep.
>> Perhaps clone can decide to keep the .keep file depending on the size of
>> the pack then?
>
> Yea, I think that's the better thing to do here. I'm not sure where
> the cut-off is, maybe its <512M delete the .keep once the refs are
> inplace and the objects are ensured to be reachable.

I think there should be separate cutoffs for file size and for number of
objects. Very tight packs probably take hours to repack to the same
efficiency, even when the file itself is small.

By the way, another scenario where I use .keep files is when I can only
distribute via http because of firewalls. I clone the original
repository and mark the pack as keep; then I push to the distribution
site, run gc, and mark the resulting pack as keep; after that, a daily
cron job runs git-gc. This way I know that users will only have to
download the third pack. I think I'll modify the cron job to mark as
keep any pack that exceeds 2 megabytes or so.
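
A minimal sketch of the modified cron step described above, assuming it runs just before git-gc and is pointed at the repository's objects/pack directory. The function name, the default threshold, and the text written into the .keep file are all hypothetical; the only git behavior relied on is that a pack-XXXX.keep file next to pack-XXXX.pack tells repacking to leave that pack alone.

```python
from pathlib import Path


def mark_large_packs(pack_dir, min_bytes=2 * 1024 * 1024):
    """Create a .keep file next to every pack larger than min_bytes.

    Returns the names of the .keep files that were newly created.
    The 2 MiB default mirrors the "2 megabytes or so" figure and is
    only an illustrative choice.
    """
    created = []
    for pack in sorted(Path(pack_dir).glob("pack-*.pack")):
        keep = pack.with_suffix(".keep")
        if keep.exists():
            continue  # already marked as keep
        if pack.stat().st_size > min_bytes:
            # The file's content is free-form; only its existence matters.
            keep.write_text("kept: pack exceeds size threshold\n")
            created.append(keep.name)
    return created
```

The cron job would then call something like mark_large_packs("objects/pack") in a bare repository (or ".git/objects/pack" in a working clone) and run git-gc afterwards, so that only the small, recent packs get repacked.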

Thinking about both use cases, the best would be to have an option
(common to at least git-clone, git-remote add, and git-gc, and also
available via config keys) like

    --keep-packs[=THRES1,THRES2,...]

where:

- exceeding any one threshold is enough to mark a pack as keep;
- thresholds take the form "\d+[kmg]?b" for file size and
  "\d+[kmg]?" for number of objects;
- if no threshold is given, the default could be
  --keep-packs=100k,512MB or whatever is in the config;
- to mark all packs, use --keep-packs=0.
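
To make the proposed threshold syntax concrete, here is a rough sketch of how such a value could be parsed and applied. It is not an existing git option or API; --keep-packs is only the proposal above. The function names are hypothetical, and several details are assumptions: suffixes are matched case-insensitively (because the example default is written "512MB"), k/m/g mean powers of 1024 for sizes and powers of 1000 for object counts, and a pack is kept when it reaches a threshold (>=) so that a threshold of 0 marks every pack.

```python
import re

# number, optional k/m/g multiplier, optional trailing "b" for bytes
_THRESHOLD = re.compile(r"(\d+)([kmg]?)(b?)", re.IGNORECASE)
_SIZE_MULT = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}    # assumed
_COUNT_MULT = {"": 1, "k": 1000, "m": 1000**2, "g": 1000**3}   # assumed


def parse_keep_packs(spec):
    """Parse 'THRES1,THRES2,...' into a list of (kind, value) pairs.

    kind is "bytes" when the item ends in "b", "objects" otherwise.
    """
    thresholds = []
    for item in spec.split(","):
        m = _THRESHOLD.fullmatch(item.strip())
        if m is None:
            raise ValueError(f"bad threshold: {item!r}")
        number = int(m.group(1))
        unit = m.group(2).lower()
        if m.group(3):
            thresholds.append(("bytes", number * _SIZE_MULT[unit]))
        else:
            thresholds.append(("objects", number * _COUNT_MULT[unit]))
    return thresholds


def should_keep(pack_bytes, pack_objects, thresholds):
    """Return True as soon as the pack reaches any one threshold."""
    actual = {"bytes": pack_bytes, "objects": pack_objects}
    return any(actual[kind] >= value for kind, value in thresholds)
```

With these assumptions, parse_keep_packs("100k,512MB") yields an object-count threshold of 100000 and a size threshold of 536870912 bytes, and should_keep(size, count, parse_keep_packs("0")) is true for any pack.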
Paolo

Thread overview: 35+ messages
2008-05-12 12:29 Why repository grows after "git gc"? / Purpose of *.keep files? Teemu Likonen
2008-05-12 15:52 ` Teemu Likonen
2008-05-12 17:13 ` Johannes Schindelin
2008-05-12 18:43 ` Teemu Likonen
2008-05-12 18:56 ` Nicolas Pitre
2008-05-12 19:09 ` Teemu Likonen
2008-05-12 19:36 ` Nicolas Pitre
2008-05-12 20:10 ` Govind Salinas
2008-05-12 21:06 ` Nicolas Pitre
2008-05-12 21:07 ` Govind Salinas
2008-05-12 20:24 ` Teemu Likonen
2008-05-12 21:03 ` Mike Hommey
2008-05-12 21:08 ` Mike Hommey
2008-05-13 0:12 ` Shawn O. Pearce
2008-05-13 5:33 ` Mike Hommey
2008-05-14 1:03 ` Nicolas Pitre
2008-05-14 6:43 ` Junio C Hamano
2008-05-14 9:10 ` Juergen Ruehle
2008-05-14 14:24 ` Nicolas Pitre
2008-05-14 17:03 ` Junio C Hamano
2008-05-14 20:06 ` Linus Torvalds
2008-05-14 20:19 ` Linus Torvalds
2008-05-14 20:29 ` Nicolas Pitre
2008-05-14 20:36 ` Linus Torvalds
2008-05-14 23:24 ` A Large Angry SCM
2008-05-12 21:07 ` Nicolas Pitre
2008-05-12 17:17 ` David Tweed
2008-05-12 23:49 ` Shawn O. Pearce
2008-05-12 23:53 ` Junio C Hamano
2008-05-13 0:09 ` Shawn O. Pearce
2008-05-13 5:08 ` Paolo Bonzini [this message]
2008-05-13 5:22 ` Shawn O. Pearce
2008-05-13 9:22 ` Teemu Likonen
2008-05-13 21:46 ` Stephen R. van den Berg
2008-05-14 5:42 ` Teemu Likonen