From: Ted Ts'o <tytso@mit.edu>
To: Jeff King <peff@peff.net>
Cc: Thomas Rast <trast@student.ethz.ch>,
	Hallvard B Furuseth <h.b.furuseth@usit.uio.no>,
	git@vger.kernel.org, Nicolas Pitre <nico@fluxnic.net>
Subject: Re: Keeping unreachable objects in a separate pack instead of loose?
Date: Mon, 11 Jun 2012 18:14:39 -0400
Message-ID: <20120611221439.GE21775@thunk.org>
In-Reply-To: <20120611213948.GB32061@sigill.intra.peff.net>

On Mon, Jun 11, 2012 at 05:39:48PM -0400, Jeff King wrote:
> 
> Yeah. It doesn't eliminate duplicates, but that may not be worth caring
> about. I find the "cruft" marking a little hacky, because it is only
> "objects in here _may_ be cruft", but as long as that is understood, it
> is OK (and it is understood in the sequence above; "repack -Ad" is safe
> because it knows that it would have repacked any non-cruft).

Well, all the objects in the file *were* cruft at the time that it was
created.  And the reason why we are keeping them around is in case we
were wrong about their being cruft, so I guess I don't have that much
trouble with the name.  Something like "KillShelter" (as in the
opposite of No-Kill Animal Shelters) would be more descriptive, but I
think it's a bit lacking in taste....

> > It does imply that we may accumulate a new cruft-<SHA1> pack each time
> > we run git gc, but users shouldn't be running git gc all that often
> > anyway.  And even if they do run it all the time, it will still be
> > more efficient than keeping the unreachable objects as loose objects.
> 
> Yeah, it would be nice to keep it all in a single pack, but that means
> doing the I/O on rewriting the cruft packs each time. And figuring out
> some way of handling the mtime in such a way that we don't keep
> refreshing the age during each gc.

Well, I'd like to avoid doing the I/O because I want to minimize wear
on SSD drives; and given that it's unlikely that the cruft packs will
be referenced, the fact that we have a bunch of cruft packs shouldn't
be a big deal, especially if we teach git to search the cruft packs
last.
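
To illustrate "search the cruft packs last", here is a minimal sketch
in Python, not git's actual pack lookup code.  The "cruft-" file name
prefix is the naming proposed in this thread, and the function name is
made up for the example.

```python
def order_packs_for_lookup(pack_names):
    """Return pack names with cruft-* packs moved to the end.

    Assumes the "cruft-" prefix proposed in this thread; this is an
    illustration, not how git orders its packs internally.
    """
    # sorted() is stable and False sorts before True, so ordinary packs
    # keep their relative order and cruft packs follow them.
    return sorted(pack_names, key=lambda name: name.startswith("cruft-"))


packs = ["cruft-aa.pack", "pack-1.pack", "cruft-bb.pack", "pack-2.pack"]
ordered = order_packs_for_lookup(packs)
```

With that ordering, a lookup only touches a cruft pack after every
regular pack has missed.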

> Speaking of which, what is the mtime of the newly created cruft pack? Is
> it the current mtime? Then those unreachable objects will stick for
> another 2 weeks, instead of being back-dated to their pack's date. You
> could back-date to the mtime of the most recent deleted pack, but that
> would still prolong the life of objects from the older packs. It may be
> acceptable to just ignore the issue, though; they will expire
> eventually.

Well, we have that problem today when "git pack-objects
--unpack-unreachable" explodes unreferenced objects --- they are
written with the current mtime.  I assume you're worried about
pre-existing loose objects that get collected up into a new cruft
pack, since they would get the extra two weeks of life.  Given how
much more efficient it is to store the cruft objects in a pack, I think
ignoring the issue makes the most sense, since it's a one-time
extension, and the extra objects really won't do any harm.
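
The size of that extension can be worked out with a little date
arithmetic.  The sketch below is only an illustration under the
assumptions in this thread: a two-week grace period, and a new cruft
pack stamped with the current time.  The function names are
hypothetical.

```python
from datetime import datetime, timedelta

GRACE = timedelta(days=14)  # the "2 weeks" mentioned above (assumed)


def expiry(mtime):
    """Time at which something last touched at `mtime` becomes prunable."""
    return mtime + GRACE


def extra_lifetime(object_mtime, cruft_pack_mtime):
    """How much longer an object lives once it is collected into a
    cruft pack whose mtime is `cruft_pack_mtime`."""
    return expiry(cruft_pack_mtime) - expiry(object_mtime)


now = datetime(2012, 6, 11, 18, 14)
loose_mtime = now - timedelta(days=10)   # loose object written 10 days ago
extension = extra_lifetime(loose_mtime, now)
```

An object that has not yet expired is less than two weeks old, so the
extension it picks up this way is always under two weeks.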

One last thought: if a sysadmin is really hard up for space (and the
cruft objects include some really big sound or video files), one
advantage of labelling the cruft packs explicitly is that someone who
really needs the space could find the oldest cruft files and delete
them, since they would be easy to identify.
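
As a rough sketch of what that could look like, assuming the packs
really are named cruft-<SHA1>.pack as proposed here and live in a pack
directory such as .git/objects/pack, the following Python lists the
oldest ones by mtime.  The function name and the `limit` parameter are
made up for the example, and it only lists files; it deletes nothing.

```python
import os
from pathlib import Path


def oldest_cruft_packs(pack_dir, limit=5):
    """Return up to `limit` cruft-*.pack files in `pack_dir`, oldest first.

    The "cruft-" prefix is the naming proposed in this thread, not
    something this sketch can promise any git version uses.
    """
    packs = Path(pack_dir).glob("cruft-*.pack")
    # Smallest st_mtime means least recently modified, i.e. oldest.
    by_age = sorted(packs, key=lambda p: p.stat().st_mtime)
    return by_age[:limit]


if __name__ == "__main__":
    pack_dir = os.path.join(".git", "objects", "pack")
    if os.path.isdir(pack_dir):
        for pack in oldest_cruft_packs(pack_dir):
            print(pack)
```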

    	   	       	    	     	    - Ted
