From: Ted Ts'o <tytso@mit.edu>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: Jeff King <peff@peff.net>, Thomas Rast <trast@student.ethz.ch>,
Hallvard B Furuseth <h.b.furuseth@usit.uio.no>,
git@vger.kernel.org
Subject: Re: Keeping unreachable objects in a separate pack instead of loose?
Date: Tue, 12 Jun 2012 14:37:02 -0400
Message-ID: <20120612183702.GD1803@thunk.org>
In-Reply-To: <alpine.LFD.2.02.1206121359260.23555@xanadu.home>

On Tue, Jun 12, 2012 at 02:25:47PM -0400, Nicolas Pitre wrote:
> > Earlier in the thread, I outlined another scheme by which you could
> > repack and avoid the duplicates. It does not require changes to git's
> > object lookup process, because it would involve manually feeding the
> > list of cruft objects to pack-objects (which will pack what you ask it,
> > regardless of whether the objects are in other packs).
>
> That might be hard to achieve good delta compression though, as the main
> key to sort those objects is their path name, and with unreferenced
> objects you might not necessarily have that information. The ability to
> reuse pack data might mitigate this though.

Compared to loose objects, even not-so-great delta compression is
manna from heaven. Remember what originally got me to start this
thread. There were 4.5 megabytes' worth of loose objects; when I
created the object ID list and fed it to git pack-objects, the
resulting pack was 244k.

OK, maybe the delta compression wasn't optimal. Compared to the 4.5
megabytes of loose objects --- I'll happily settle for that! :-)
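A rough sketch of that kind of experiment follows. It is not
necessarily the exact set of commands used here. The function names
and the "cruft" base name are made up for illustration, and the sketch
assumes the object store lives at <git-dir>/objects. It relies only on
the documented behavior that git pack-objects reads object names on
stdin and writes <base-name>-<hash>.pack and .idx files.

```shell
#!/bin/sh
# Sketch only: list the loose objects in a repository and pack them
# into a single throwaway pack so the sizes can be compared.

# Print the ID of every loose object under the given objects directory.
# Loose objects are stored as <objdir>/<2 hex digits>/<remaining digits>.
loose_object_ids() {
    for f in "$1"/[0-9a-f][0-9a-f]/*; do
        [ -f "$f" ] || continue
        name=${f#"$1"/}                 # e.g. ab/cdef...
        id=${name%%/*}${name#*/}        # e.g. abcdef...
        case $id in
            *[!0-9a-f]*) continue ;;    # skip temporary or stray files
        esac
        printf '%s\n' "$id"
    done
}

# Pack all loose objects of the current repository into a scratch
# directory (deliberately outside objects/pack) and show the result.
pack_loose_objects() {
    gitdir=$(git rev-parse --git-dir) || return 1
    outdir=$(mktemp -d) || return 1
    loose_object_ids "$gitdir/objects" |
        git pack-objects "$outdir/cruft" >/dev/null || return 1
    ls -l "$outdir"
}
```

Calling pack_loose_objects from inside a repository lists the scratch
pack and its index. The "size" line of git count-objects -v gives the
disk usage of the loose objects in KiB for comparison.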
> So the problem is really about 'git gc' creating more data on disk which
> is counter productive for a garbage collecting task. Maybe the trick is
> simply not to delete any of the old pack which content was repacked into
> a single new pack and let them age before deleting them, rather than
> exploding a bunch of loose objects. But then we're back to the same
> issue I wanted to get away from i.e. identifying real cruft packs and
> making them safely deletable.

But the old packs are huge; in my case, a full set of packs was around
16 megabytes. Right now, git gc *increased* my disk usage by 4.5
megabytes. If we don't delete the old packs, then git gc would
increase disk usage by 16 megabytes --- which is far, far worse.
Writing a 244k cruft pack is soooooo much preferable.

- Ted