git.vger.kernel.org archive mirror
From: Nicolas Pitre <nico@fluxnic.net>
To: Jeff King <peff@peff.net>
Cc: Ted Ts'o <tytso@mit.edu>, Thomas Rast <trast@student.ethz.ch>,
	Hallvard B Furuseth <h.b.furuseth@usit.uio.no>,
	git@vger.kernel.org
Subject: Re: Keeping unreachable objects in a separate pack instead of loose?
Date: Tue, 12 Jun 2012 14:25:47 -0400 (EDT)
Message-ID: <alpine.LFD.2.02.1206121359260.23555@xanadu.home>
In-Reply-To: <20120612175438.GB16522@sigill.intra.peff.net>

On Tue, 12 Jun 2012, Jeff King wrote:

> On Tue, Jun 12, 2012 at 01:49:26PM -0400, Nicolas Pitre wrote:
> 
> > > Then those objects will remain in the cruft pack. Which is why, as I
> > > said, it is not generally safe to just delete a cruft pack.
> > 
> > ... and my reply was about the needed changes to still make cruft packs 
> > always crufty even if some of its content suddenly becomes useful again.
> 
> I think we are somehow missing each other's point, then. My point is
> that you do not _need_ to make the cruft packs 100% cruft. You can
> tolerate the duplicated objects until they are pruned.

Absolutely.  Duplicated objects are fine, and I was in fact suggesting 
to actively duplicate any needed object when it is found only in a 
cruft pack.

> Earlier in the thread, I outlined another scheme by which you could
> repack and avoid the duplicates. It does not require changes to git's
> object lookup process, because it would involve manually feeding the
> list of cruft objects to pack-objects (which will pack what you ask it,
> regardless of whether the objects are in other packs).

It might be hard to achieve good delta compression that way, though, as 
the main key used to sort those objects is their path name, and with 
unreferenced objects you do not necessarily have that information.  The 
ability to reuse existing pack data might mitigate this, though.
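
To illustrate the sorting concern, here is a minimal sketch in Python.
It is a simplified model, not git's actual code: Candidate, name_hash
and delta_order are hypothetical names, and the hash is only loosely
modeled on the idea that the trailing characters of a path should weigh
most.  It shows what happens to the candidate order when the path is
unknown.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    oid: str             # hypothetical short object id
    size: int            # object size in bytes
    path: Optional[str]  # None when the object was not reached through a tree


def name_hash(path: Optional[str]) -> int:
    """Toy 32-bit hash in which the last characters of the path weigh most."""
    if path is None:
        return 0
    h = 0
    for byte in path.encode():
        if chr(byte).isspace():
            continue
        # Older characters are shifted down, the newest one lands on top.
        h = ((h >> 2) + (byte << 24)) & 0xFFFFFFFF
    return h


def delta_order(cands):
    """Order candidates so that likely delta partners end up adjacent."""
    return sorted(cands, key=lambda c: (name_hash(c.path), c.size), reverse=True)


def oids(cands):
    """Return the object ids in delta-search order."""
    return [c.oid for c in delta_order(cands)]


known = [
    Candidate("a1", 1000, "src/a.c"),
    Candidate("b1", 990, "doc/b.txt"),
    Candidate("b2", 410, "doc/b.txt"),
    Candidate("a2", 400, "src/a.c"),
]
# Same objects, but found as unreferenced cruft: no path information.
unknown = [c._replace(path=None) for c in known]
```

With paths known, the two versions of each file sort next to each
other.  With the paths missing, every object gets the same hash and the
order falls back to size alone, so versions of the same file can end up
separated by unrelated objects.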

> > > However, when you do a full repack, those objects will be copied into 
> > > the new pack (because they are referenced). Which is why I am claiming 
> > > that it is safe to remove cruft packs at that point.
> > 
> > Yes, but then there is no point marking such packs as cruft if at any 
> > moment they can become useful again.
> 
> How do you know to keep the packs around and expire them after 2 weeks
> if they are not marked in some way? Otherwise you would delete them as
> part of a "git gc", pushing the reachable objects into the new pack and
> the unreachable objects into a new cruft pack. IOW, you need some way of
> keeping the expiration date on the unreachable objects, or they will
> keep getting "refreshed" by each gc.

My feeling is that we should take a step back and consider whether this 
is actually the right problem to solve.  I don't remember why I might 
have been opposed to a reflog for deleted branches, as you say I was, 
but that is certainly a feature that could prove useful.

Then having a repository that can be used as an alternate for other 
repositories without knowing about it is also a problem that needs 
fixing and not only because of this object expiry issue.  This is not 
easy to fix though.

Then, the creation of unreferenced objects from successive 'git add' 
shouldn't create that many objects in the first place.  They currently 
never get the chance to be packed to start with.

So the problem is really about 'git gc' creating more data on disk, 
which is counterproductive for a garbage collecting task.  Maybe the 
trick is simply not to delete any of the old packs whose content was 
repacked into a single new pack, and to let them age before deleting 
them, rather than exploding them into a bunch of loose objects.  But 
then we're back to the same issue I wanted to get away from, i.e. 
identifying real cruft packs and making them safely deletable.
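
As a rough sketch of the "let them age" idea only, assuming some
hypothetical bookkeeping that records when each old pack was superseded
by a repack (SupersededPack and expired_packs are made-up names, and
nothing like this exists in git as discussed here), the expiry check
could look like this in Python.  It deliberately ignores the hard part
mentioned above: whether a pack's content has become useful again.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class SupersededPack:
    name: str                # hypothetical pack name
    superseded_at: datetime  # when a repack made this pack redundant


def expired_packs(
    packs: Iterable[SupersededPack],
    now: datetime,
    grace: timedelta = timedelta(weeks=2),
) -> List[str]:
    """Return names of packs that have aged for at least the grace period."""
    return [p.name for p in packs if now - p.superseded_at >= grace]
```

The two-week default matches the expiry period used as an example
earlier in this thread; the timestamp must survive later gc runs, or the
packs would keep getting "refreshed".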

Oh well...


Nicolas
