From: Jeff King <peff@peff.net>
To: Ted Ts'o <tytso@mit.edu>
Cc: Thomas Rast <trast@student.ethz.ch>,
Hallvard B Furuseth <h.b.furuseth@usit.uio.no>,
git@vger.kernel.org, Nicolas Pitre <nico@fluxnic.net>
Subject: Re: Keeping unreachable objects in a separate pack instead of loose?
Date: Mon, 11 Jun 2012 18:35:46 -0400
Message-ID: <20120611223546.GA10619@sigill.intra.peff.net>
In-Reply-To: <20120611222843.GF21775@thunk.org>

On Mon, Jun 11, 2012 at 06:28:43PM -0400, Ted Ts'o wrote:
> On Mon, Jun 11, 2012 at 06:23:08PM -0400, Jeff King wrote:
> >
> > I'm more specifically worried about large objects which are no better in
> > packs than they are in loose form (e.g., video files). This strategy is
> > a regression, since we are not saving space by putting them in a pack,
> > but we are keeping them around much longer. It also makes it harder to
> > just run "git prune" to get rid of large objects (since prune will never
> > kill off a pack), or to manually delete files from the object database.
> > You have to run "git gc --prune=now" instead, so it can make a new pack
> > and throw away the old bits (or run "git repack -ad").
>
> If we're really worried about this, we could set a threshold and only
> pack small objects in the cruft packs.

I think I'd be more inclined to just ignore it. It is only prolonging
the lifetime of the files by a finite amount (and we are discussing
dropping that finite amount anyway). And as a bonus, this strategy could
potentially allow an optimization that would make large files better in
this case: if we notice that a pack has _only_ unreachable objects, we
can simply mark it as ".cruft" without actually repacking it. Coupled
with the recent-ish code to stream large blobs directly to packs, that
means a large blob which becomes unreachable would not ever be
rewritten.
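
To make the ".cruft" idea concrete, here is a rough sketch in Python. It is only a toy model, not git code: packs are modelled as plain sets of object IDs, and the function name and the "live"/"mixed"/"cruft" labels are made up for illustration. The point is that a pack with no reachable objects can be relabelled in place, while only a "mixed" pack needs an actual rewrite.

```python
from typing import Dict, Set


def classify_packs(packs: Dict[str, Set[str]], reachable: Set[str]) -> Dict[str, str]:
    """Label each pack by how many of its objects are reachable.

    Hypothetical model: `packs` maps a pack name to the object IDs it
    stores, and `reachable` is the result of a reachability walk.
    """
    labels = {}
    for name, objects in packs.items():
        live = objects & reachable
        if not live:
            # Nothing reachable: could be marked ".cruft" without rewriting.
            labels[name] = "cruft"
        elif live == objects:
            # Everything reachable: leave the pack alone.
            labels[name] = "live"
        else:
            # Reachable and unreachable objects share a pack: needs a repack.
            labels[name] = "mixed"
    return labels


if __name__ == "__main__":
    packs = {
        "pack-a": {"commit1", "tree1"},
        "pack-b": {"bigblob"},          # e.g. a streamed large blob
        "pack-c": {"commit2", "old"},
    }
    print(classify_packs(packs, reachable={"commit1", "tree1", "commit2"}))
```

In this model a large blob that was streamed into its own pack ends up as "cruft" once it becomes unreachable, so it never has to be copied again.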

> > No! That's exactly what I was worried about with the name. It is _not_
> > safe to do so. It's only safe after you have done a full repack to
> > rescue any non-cruft objects.
>
> Well, yes. I was thinking it would be safe thing to do after a "git
> gc" didn't result in enough space savings. This would require that a
> git repack always rescue objects from cruft packs even if the -a/-A
> options are not specified, but since we're doing a full reachability
> scan, that shouldn't slow down git gc much, right?

Doing "git gc" will always repack everything, IIRC. It is "git gc
--auto" which will make small incremental packs. I think we do a full
reachability analysis so we can prune there, but that is something I
think we should stop doing. It is typically orders of magnitude slower
than the incremental repack.
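
The safety condition discussed above (a cruft pack may only be deleted once its non-cruft objects have been rescued) can be sketched the same way. Again this is a toy model in Python, not how git implements anything; `rescue_set` and `safe_to_delete` are hypothetical names and packs are just sets of object IDs.

```python
from typing import Iterable, Set


def rescue_set(cruft_pack: Set[str],
               other_packs: Iterable[Set[str]],
               reachable: Set[str]) -> Set[str]:
    """Reachable objects that exist only in the cruft pack.

    These would be lost if the cruft pack were deleted right now.
    """
    stored_elsewhere = set().union(*other_packs)
    return (cruft_pack & reachable) - stored_elsewhere


def safe_to_delete(cruft_pack: Set[str],
                   other_packs: Iterable[Set[str]],
                   reachable: Set[str]) -> bool:
    """True only when no reachable object depends on the cruft pack."""
    return not rescue_set(cruft_pack, list(other_packs), reachable)


if __name__ == "__main__":
    cruft = {"x", "dead"}
    reachable = {"x", "y"}
    # Before a full repack, "x" lives only in the cruft pack.
    print(safe_to_delete(cruft, [{"y"}], reachable))
    # After a repack has copied "x" into a regular pack.
    print(safe_to_delete(cruft, [{"x", "y"}], reachable))
```

Computing `reachable` is the expensive part, which is why the cost of the reachability walk matters more here than the cost of the repack itself.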

-Peff