From: "Shawn O. Pearce" <spearce@spearce.org>
To: David Tweed <david.tweed@gmail.com>
Cc: Teemu Likonen <tlikonen@iki.fi>, git@vger.kernel.org
Subject: Re: Why repository grows after "git gc"? / Purpose of *.keep files?
Date: Mon, 12 May 2008 19:49:06 -0400
Message-ID: <20080512234906.GX29038@spearce.org>
In-Reply-To: <e1dab3980805121017u4c244d25s76b39cf015f6c5c5@mail.gmail.com>
David Tweed <david.tweed@gmail.com> wrote:
> On Mon, May 12, 2008 at 4:52 PM, Teemu Likonen <tlikonen@iki.fi> wrote:
> > Teemu Likonen wrote (2008-05-12 15:29 +0300):
> > Probably a crazy idea: What if "gc --aggressive" first removed *.keep
> > files and after packing and garbage-collecting and whatever it does it
> > would add a .keep file for the newly created pack?
>
> My understanding is that the repacking with -a redoes the computation
> to repack ALL the objects in every pack and loose objects,
No. -a means repack all objects in all packs which do not have a
.keep on them. Without -a we only repack loose objects.
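As a rough illustration of that selection rule, here is a minimal
Python sketch (not git's actual implementation) that lists which
existing packs a repack would rewrite. The function name
`packs_eligible_for_repack` is hypothetical; the only thing it assumes
about the on-disk layout is that a kept pack `pack-X.pack` has a
sibling file `pack-X.keep`.

```python
from pathlib import Path


def packs_eligible_for_repack(pack_dir, repack_all):
    """Return the pack files a repack would rewrite (illustrative only).

    repack_all=True models "-a": every pack without a sibling .keep file.
    repack_all=False models the default: no existing pack is rewritten,
    only loose objects are packed.
    """
    if not repack_all:
        return []
    eligible = []
    for pack in sorted(Path(pack_dir).glob("*.pack")):
        # A pack is "kept" when pack-X.keep sits next to pack-X.pack.
        if not pack.with_suffix(".keep").exists():
            eligible.append(pack.name)
    return eligible
```

For example, `packs_eligible_for_repack(".git/objects/pack", True)`
would skip any pack that carries a .keep file.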
> whereas
> what would be preferred is to try to delta new objects (loose and
> packed) against the existing .keep pack (extending it with the new
> objects) but not trying to re-deltify objects in the .keep pack.
We cannot do that. Deltas in pack A may not reference base objects
in pack B. This is a simplification rule that prevents us from
needing to worry about damaging a pack when we repack and delete
another pack.
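To make the rule concrete, here is a small Python sketch of the
invariant. It uses a made-up in-memory model rather than the real pack
format: each pack maps an object name to the name of its delta base,
or `None` for an object stored in full. `is_self_contained` and the
sample data are hypothetical.

```python
def is_self_contained(pack):
    """True if every delta in `pack` has its base in the same pack.

    `pack` maps object name -> base object name (None for non-delta).
    """
    return all(base is None or base in pack for base in pack.values())


pack_a = {"blob1": None, "blob2": "blob1"}  # delta base is in pack_a
pack_b = {"blob3": "blob1"}                 # base lives only in pack_a

print(is_self_contained(pack_a))  # True
print(is_self_contained(pack_b))  # False: deleting pack_a would break it
```

If packs like `pack_b` were allowed, deleting `pack_a` after a repack
would leave `blob3` impossible to reconstruct.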
> This
> is because .keep files are primarily for those who are cloning onto a
> machine that isn't powerful (maybe even a laptop/palmtop) but who are
> cloning from a powerful server, so that you wouldn't necessarily want
> to apply your strategy unconditionally.
Yes, sort of. We use .keep for two reasons:
 - As a "lock file" to prevent a pack that was just created by a
   git-fetch or git-receive-pack from being deleted by a concurrent
   git-repack before the objects it contains are linked into the
   refs space and thus considered reachable;
 - As a way to avoid repacking _huge_ packs (say >1G), which would
   take a lot of disk IO just to copy with 100% delta reuse from an
   old pack to a new pack each time the user runs git-gc.
I think git-clone marking a 150M linux-2.6 pack with .keep is wrong;
most users working with the linux-2.6 sources have sufficient
hardware to deal with the disk IO required to copy that with 100%
delta reuse. But I have a repository at day-job with a 600M pack;
that's starting to head into the realm where someone running git-gc
on a laptop on battery would prefer to have that .keep.
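One way to apply that kind of size-based policy by hand is sketched
below in Python. It is only an illustration under assumptions: the
helper name `keep_large_packs` and its threshold parameter are
hypothetical, and the 1 GiB default merely echoes the ">1G" figure
above. It relies on the convention that a file `pack-X.keep` next to
`pack-X.pack` marks the pack as kept.

```python
from pathlib import Path


def keep_large_packs(pack_dir, min_bytes=1 << 30):
    """Create a .keep file beside every pack of at least min_bytes.

    Returns the names of the packs newly marked as kept.
    """
    marked = []
    for pack in sorted(Path(pack_dir).glob("*.pack")):
        keep = pack.with_suffix(".keep")
        if pack.stat().st_size >= min_bytes and not keep.exists():
            keep.touch()  # an empty .keep file is enough for this sketch
            marked.append(pack.name)
    return marked
```

Calling `keep_large_packs(".git/objects/pack", 600 * 1024 * 1024)`
would mark a pack like the 600M one mentioned above while leaving
smaller packs open to repacking.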
--
Shawn.