From: Paolo Bonzini <bonzini@gnu.org>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <gitster@pobox.com>,
	David Tweed <david.tweed@gmail.com>,
	Teemu Likonen <tlikonen@iki.fi>,
	git@vger.kernel.org
Subject: Re: Why repository grows after "git gc"? / Purpose of *.keep files?
Date: Tue, 13 May 2008 07:08:19 +0200
Message-ID: <48292243.3050307@gnu.org>
In-Reply-To: <20080513000925.GA29038@spearce.org>

Shawn O. Pearce wrote:
> Junio C Hamano <gitster@pobox.com> wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>> I think git-clone marking a 150M linux-2.6 pack with .keep is wrong;
>>> most users working with the linux-2.6 sources have sufficient
>>> hardware to deal with the disk IO required to copy that with 100%
>>> delta reuse.  But I have a repository at day-job with a 600M pack,
>>> that's starting to head into the realm where git-gc while running
>>> on battery on a laptop would prefer to have that .keep.
>> Perhaps clone can decide to keep the .keep file depending on the size of
>> the pack then?
> 
> Yea, I think that's the better thing to do here.  I'm not sure where
> the cut-off is; maybe if it's <512M, delete the .keep once the refs are
> in place and the objects are ensured to be reachable.

I think separate cutoffs should be in place for file size and number of 
objects.  A very tight pack can be small on disk and still probably take 
hours to repack as efficiently.

By the way, another scenario where I use kept packs is when I can only 
distribute via HTTP because of firewalls.  I make a clone of the 
original repository and mark the pack as keep; then I push to the 
distribution site, gc, and mark the resulting pack as keep; then a daily 
cron job runs git-gc.  This way I know that the user will only have to 
download the third pack.  I think I'll modify the cron job to mark as 
keep the packs that exceed 2 megabytes or something like that.
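
A rough sketch of that cron-job step, in Python.  The helper name 
mark_large_packs and the note written into the .keep file are made up 
for illustration; the only thing relied on is that a pack-*.keep file 
next to a pack-*.pack makes repacking leave that pack alone.

```python
from pathlib import Path


def mark_large_packs(git_dir, min_bytes=2 * 1024 * 1024):
    """Create a .keep file beside every pack larger than min_bytes.

    Returns the names of the .keep files that were newly created.
    The 2 MiB default mirrors the "2 megabytes or something" above.
    """
    marked = []
    pack_dir = Path(git_dir) / "objects" / "pack"
    for pack in sorted(pack_dir.glob("pack-*.pack")):
        keep = pack.with_suffix(".keep")  # pack-X.pack -> pack-X.keep
        if pack.stat().st_size > min_bytes and not keep.exists():
            # The content of a .keep file is free-form; this note is arbitrary.
            keep.write_text("kept by cron job\n")
            marked.append(keep.name)
    return marked
```

Run after the daily git-gc (for example as mark_large_packs("/path/to/repo.git")), 
this would leave small packs to be folded together again on the next run.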

Thinking about both use cases, the best would be to have options (common 
to git-clone, git-remote add, git-gc at least; and available via config 
keys too) like

   --keep-packs[=THRES1,THRES2,...]

where:

- reaching any one of the thresholds would be enough to mark a pack as keep
- thresholds could be in the form "\d+[kmg]?b" for file size, 
"\d+[kmg]?" for number of objects.
- if no threshold is given, the default could be --keep-packs=100k,512MB 
or whatever is in the config.
- to mark all packs, use --keep-packs=0
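
To make the proposed semantics concrete, here is a minimal Python sketch 
of how such a threshold list might be parsed and applied.  Nothing here 
is an existing git option or API: parse_thresholds and should_keep are 
hypothetical names, the suffixes are assumed to be case-insensitive 
powers of 1024, and "reaching" a threshold is assumed to mean 
greater-than-or-equal, so that a threshold of 0 matches every pack.

```python
import re

# "\d+[kmg]?b" is a file size, "\d+[kmg]?" an object count (as proposed above).
_THRESHOLD_RE = re.compile(r"(\d+)([kmg]?)(b?)", re.IGNORECASE)
# Assumption: k/m/g are powers of 1024 for both sizes and counts.
_MULTIPLIER = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_thresholds(spec):
    """Turn "100k,512MB" into [("objects", 102400), ("bytes", 536870912)]."""
    thresholds = []
    for item in spec.split(","):
        match = _THRESHOLD_RE.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"bad threshold: {item!r}")
        digits, unit, byte_suffix = match.groups()
        value = int(digits) * _MULTIPLIER[unit.lower()]
        kind = "bytes" if byte_suffix else "objects"
        thresholds.append((kind, value))
    return thresholds


def should_keep(size_bytes, object_count, thresholds):
    """A pack is kept as soon as it reaches any single threshold."""
    for kind, value in thresholds:
        actual = size_bytes if kind == "bytes" else object_count
        if actual >= value:
            return True
    return False
```

With the suggested default, should_keep(600 * 1024 ** 2, 5000, 
parse_thresholds("100k,512MB")) is true because of the size alone, and 
any pack passes parse_thresholds("0").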


Paolo
