From: Junio C Hamano <gitster@pobox.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: David Tweed <david.tweed@gmail.com>,
Teemu Likonen <tlikonen@iki.fi>,
git@vger.kernel.org
Subject: Re: Why repository grows after "git gc"? / Purpose of *.keep files?
Date: Mon, 12 May 2008 16:53:44 -0700
Message-ID: <7vod7bw03a.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <20080512234906.GX29038@spearce.org> (Shawn O. Pearce's message of "Mon, 12 May 2008 19:49:06 -0400")
"Shawn O. Pearce" <spearce@spearce.org> writes:
> David Tweed <david.tweed@gmail.com> wrote:
>> On Mon, May 12, 2008 at 4:52 PM, Teemu Likonen <tlikonen@iki.fi> wrote:
>> > Teemu Likonen wrote (2008-05-12 15:29 +0300):
>> > Probably a crazy idea: What if "gc --aggressive" first removed *.keep
>> > files and after packing and garbage-collecting and whatever it does it
>> > would add a .keep file for the newly created pack?
>>
>> My understanding is that the repacking with -a redoes the computation
>> to repack ALL the objects in every pack and loose objects,
>
> No. -a means repack all objects in all packs which do not have a
> .keep on them. Without -a we only repack loose objects.
>
>> whereas
>> what would be preferred is to try to delta new objects (loose and
>> packed) against the existing .keep pack (extending it with the new
>> objects) but not trying to re-deltify objects in the .keep pack.
>
> We cannot do that. Deltas in pack A may not reference base objects
> in pack B. This is a simplification rule that prevents us from
> needing to worry about damaging a pack when we repack and delete
> another pack.
>
>> This
>> is because .keep files are primarily for those who are cloning onto a
>> machine that isn't powerful (maybe even a laptop/palmtop) but who are
>> cloning from a powerful server, so that you wouldn't necessarily want
>> to apply your strategy unconditionally.
>
> Yes, sort of. We use .keep for two reasons:
>
> - As a "lock file" to prevent a pack that was just created by a
> git-fetch or git-receive-pack from being deleted by a concurrent
> git-repack before the objects it contains are linked into the
> refs space and thus considered reachable;
>
> - As a way to avoid _huge_ packs (say >1G) that would take a lot
> of disk IO just to copy with 100% delta reuse from an old pack
> to a new pack each time the user runs git-gc.
>
> I think git-clone marking a 150M linux-2.6 pack with .keep is wrong;
> most users working with the linux-2.6 sources have sufficient
> hardware to deal with the disk IO required to copy that with 100%
> delta reuse. But I have a repository at day-job with a 600M pack,
> that's starting to head into the realm where git-gc while running
> on battery on a laptop would prefer to have that .keep.
Perhaps clone can decide to keep the .keep file depending on the size of
the pack then?
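
To make the suggestion concrete, here is a rough sketch of what a
size-based decision could look like right after a clone. It is only an
illustration, not how git-clone is implemented: the function names and
the 512M cutoff are assumptions (the thread only says 150M seems too
small to need .keep and 600M is getting close). It also assumes the
clone has finished, since removing a .keep that is still acting as a
lock for a running fetch would not be safe.

```python
from pathlib import Path

# Hypothetical cutoff, not a git default. The thread only suggests that
# 150M is small enough to repack freely and 600M is starting to hurt.
KEEP_THRESHOLD_BYTES = 512 * 1024 * 1024


def keep_path(pack: Path) -> Path:
    """Return the .keep marker path that sits next to a pack-*.pack file."""
    return pack.with_suffix(".keep")


def settle_keep_after_clone(pack: Path,
                            threshold: int = KEEP_THRESHOLD_BYTES) -> bool:
    """Leave a .keep on a large pack, drop it from a small one.

    Returns True if the pack ends up marked with .keep.
    """
    marker = keep_path(pack)
    if pack.stat().st_size >= threshold:
        marker.touch()                   # large pack: exclude it from repack -a
        return True
    marker.unlink(missing_ok=True)       # small pack: let repack -a rewrite it
    return False


def repackable(pack_dir: Path) -> list[Path]:
    """Packs that "repack -a" would rewrite: those without a .keep marker."""
    return [p for p in sorted(pack_dir.glob("pack-*.pack"))
            if not keep_path(p).exists()]
```

With something like this, a 150M linux-2.6 pack would lose its .keep
and be folded into the next "repack -a", while a 600M pack would stay
untouched until the user removes the marker by hand.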