From: Jeff King <peff@peff.net>
To: Duy Nguyen <pclouds@gmail.com>
Cc: "John Keeping" <john@keeping.me.uk>,
	"Дилян Палаузов" <dilyan.palauzov@aegee.org>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: git pull & git gc
Date: Thu, 19 Mar 2015 00:20:50 -0400
Message-ID: <20150319042050.GA29999@peff.net>
In-Reply-To: <CACsJy8CUbe4-f4rpieAKYzHb4rpKg8JW+uXB5yA4c1HFG6r4dg@mail.gmail.com>

On Thu, Mar 19, 2015 at 11:15:19AM +0700, Duy Nguyen wrote:

> On Thu, Mar 19, 2015 at 8:27 AM, Jeff King <peff@peff.net> wrote:
> > Keeping a file that says "I ran gc at time T, and there were still N
> > objects left over" is probably the best bet. When the next "gc --auto"
> > runs, if T is recent enough, subtract N from the estimated number of
> > objects. I'm not sure of the right value for "recent enough" there,
> > though. If it is too far back, you will not gc when you could. If it is
> > too close, then you will end up running gc repeatedly, waiting for those
> > objects to leave the expiration window.
> 
> And would not be hard to implement either. git-gc is already prepared
> to deal with stale gc.pid, which would stop git-gc for a day or so
> before it deletes gc.pid and starts anyway. All we need to do is check
> at the end of git-gc, if we know for sure the next 'gc --auto' is a
> waste, then leave gc.pid behind.

That omits the "N objects left over" information. Which I think may be
useful, because otherwise the rule is basically "don't do another gc at
all for X time units". That's OK for most use, but it has its own corner
cases. E.g., imagine you are doing an SVN import that does an auto-gc
check every 1000 commits. You have some unreferenced objects in your
repository. After the first 1000 commits, we do a gc, and then say "wow,
still a lot of cruft; let's block gc for a day". Five minutes later,
after another 1000 commits, we run "gc --auto" again. It doesn't run
because of the cruft-check, even though there are a _huge_ number of new
packable objects.

If the blocker file tells us "7000 extra objects" and we see that there
are 17,000 in the repo, then we know it's still worth doing the gc
(i.e., we know that we'll probably end up ignoring the 7000 cruft
that didn't get cleaned up last time, but we also know that there are
10,000 new objects).
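
To make the arithmetic concrete, here is a rough sketch of that check.
It is not git's actual code: the GcBlocker record, the should_auto_gc
name, and the one-day "recent enough" window are all assumptions made
up for illustration. The 6700 default mirrors git's default gc.auto
threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GcBlocker:
    """Hypothetical record left behind by the previous gc run."""
    ran_at: float   # T: when the last gc finished (seconds since epoch)
    leftover: int   # N: objects that gc could not clean up


def should_auto_gc(estimated_objects: int,
                   now: float,
                   blocker: Optional[GcBlocker],
                   threshold: int = 6700,
                   recent_window: float = 24 * 60 * 60) -> bool:
    """Decide whether "gc --auto" is worth running.

    If the last gc was recent enough, discount the objects it already
    failed to remove, so that only new objects count toward the
    threshold. The window length is an assumed value, not a known one.
    """
    if blocker is not None and 0 <= now - blocker.ran_at < recent_window:
        estimated_objects -= blocker.leftover
    return estimated_objects > threshold


# The last gc ran five minutes ago and left 7000 objects behind.
blocker = GcBlocker(ran_at=0.0, leftover=7000)

# 17,000 objects now: 10,000 of them are new, so gc is still worthwhile.
print(should_auto_gc(17000, now=300.0, blocker=blocker))

# 7,500 objects now: only 500 are new, so another gc would be wasted.
print(should_auto_gc(7500, now=300.0, blocker=blocker))
```

A time-only blocker would refuse both cases; carrying N in the file is
what lets the first one through.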

-Peff
