From: Paolo Bonzini <bonzini@gnu.org>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <gitster@pobox.com>,
David Tweed <david.tweed@gmail.com>,
Teemu Likonen <tlikonen@iki.fi>,
git@vger.kernel.org
Subject: Re: Why repository grows after "git gc"? / Purpose of *.keep files?
Date: Tue, 13 May 2008 07:08:19 +0200
Message-ID: <48292243.3050307@gnu.org>
In-Reply-To: <20080513000925.GA29038@spearce.org>

Shawn O. Pearce wrote:
> Junio C Hamano <gitster@pobox.com> wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>> I think git-clone marking a 150M linux-2.6 pack with .keep is wrong;
>>> most users working with the linux-2.6 sources have sufficient
>>> hardware to deal with the disk IO required to copy that with 100%
>>> delta reuse. But I have a repository at day-job with a 600M pack,
>>> that's starting to head into the realm where git-gc while running
>>> on battery on a laptop would prefer to have that .keep.
>> Perhaps clone can decide to keep the .keep file depending on the size of
>> the pack then?
>
> Yea, I think that's the better thing to do here. I'm not sure where
> the cut-off is, maybe it's <512M delete the .keep once the refs are
> in place and the objects are ensured to be reachable.

I think there should be separate cutoffs for file size and for number of
objects. A very tightly packed repository would probably take hours to
repack as efficiently.

By the way, another scenario where I use .keep files is when I can only
distribute via http because of firewalls. I make a clone of the
original repository and mark the pack as keep; then I push to the
distribution site, gc, and mark the resulting pack as keep too; after
that, a daily cron job runs git-gc. This way I know that the user will
only have to download the third pack, since gc leaves the two kept packs
untouched. I think I'll modify the cron job to mark as keep the packs
that exceed 2 megabytes or something like that.
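
A minimal sketch of what such a cron helper might look like, assuming it runs before git-gc and is pointed at the repository's objects/pack directory. The function name, the 2 MiB default and the text written into the .keep file are made up for illustration; the only thing taken from git itself is that a pack-*.keep file next to a pack-*.pack file tells gc to leave that pack alone.

```python
from pathlib import Path


def mark_large_packs(pack_dir, min_bytes=2 * 1024 * 1024):
    """Create a .keep file next to every pack larger than min_bytes.

    Hypothetical helper: returns the names of the packs it newly marked.
    """
    marked = []
    for pack in sorted(Path(pack_dir).glob("pack-*.pack")):
        keep = pack.with_suffix(".keep")
        if keep.exists():
            continue  # already marked as keep, nothing to do
        if pack.stat().st_size > min_bytes:
            # The content of a .keep file is free-form; only its presence matters.
            keep.write_text("kept by cron job: pack exceeds size threshold\n")
            marked.append(pack.name)
    return marked
```

The cron job would then call something like mark_large_packs("/path/to/repo.git/objects/pack") and run git-gc afterwards.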

Thinking about both use cases, the best would be to have an option
(common to at least git-clone, git-remote add and git-gc, and also
available via config keys) like

    --keep-packs[=THRES1,THRES2,...]

where:

- exceeding any one threshold is enough to mark a pack as keep
- thresholds are of the form "\d+[kmg]?b" for file size and
  "\d+[kmg]?" for number of objects
- if no threshold is given, the default could be --keep-packs=100k,512MB
  or whatever is in the config
- to mark all packs, use --keep-packs=0

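
To make the proposed threshold syntax concrete, here is a rough sketch of how it could be parsed and applied. Nothing here exists in git: --keep-packs is only the proposal above, and the function names are invented. Three points are assumptions the proposal does not settle: suffixes are matched case-insensitively (the default is written 512MB), k/m/g mean powers of 1024 for sizes and powers of 1000 for object counts, and a pack qualifies when it reaches a threshold (>=), which is what makes --keep-packs=0 mark every pack.

```python
import re

# Proposed forms: "\d+[kmg]?b" is a file size, "\d+[kmg]?" an object count.
_THRESHOLD = re.compile(r"(\d+)([kmg]?)(b?)", re.IGNORECASE)
_BINARY = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}    # assumed for sizes
_DECIMAL = {"": 1, "k": 1000, "m": 1000**2, "g": 1000**3}   # assumed for counts


def parse_thresholds(spec="100k,512MB"):
    """Turn "THRES1,THRES2,..." into a list of (kind, value) pairs."""
    thresholds = []
    for item in spec.split(","):
        match = _THRESHOLD.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"bad threshold: {item!r}")
        number, unit, size_suffix = match.groups()
        unit = unit.lower()
        if size_suffix:  # trailing "b": a file size in bytes
            thresholds.append(("bytes", int(number) * _BINARY[unit]))
        else:            # no trailing "b": a number of objects
            thresholds.append(("objects", int(number) * _DECIMAL[unit]))
    return thresholds


def should_keep(size_bytes, object_count, thresholds):
    """A pack is kept as soon as it reaches any single threshold."""
    measured = {"bytes": size_bytes, "objects": object_count}
    return any(measured[kind] >= limit for kind, limit in thresholds)
```

With the suggested default, a 600M pack holding few objects would be kept because of its size alone, while a 10M pack with 50 objects would not.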
Paolo