From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
"Christian Couder" <christian.couder@gmail.com>
Subject: Re: git gc --auto yelling at users where a repo legitimately has >6700 loose objects
Date: Thu, 08 Feb 2018 17:23:47 +0100
Message-ID: <87eflvmovg.fsf@evledraar.gmail.com>
In-Reply-To: <87inc89j38.fsf@evledraar.gmail.com>

On Thu, Jan 11 2018, Ævar Arnfjörð Bjarmason jotted:
> I recently removed both the gc.auto=0 setting and my nightly aggressive
> repack script from our big monorepo across our infra, relying instead on
> git gc --auto in the background to just do its thing.
>
> I didn't want users to wait for git-gc, and I'd written this nightly
> cronjob before git-gc learned to detach to the background.
>
> But now I have git-gc on some servers yelling at users on every pull
> command:
>
> warning: There are too many unreachable loose objects; run 'git prune' to remove them.
>
> The reason is that I have all the values at git's default settings, and
> there legitimately are >~6700 loose objects that were created in the
> last 2 weeks.
>
> For those rusty on git-gc's defaults, this is what it looks like in this
> scenario:
>
> 1. User runs "git pull"
> 2. git gc --auto is called, there are >6700 loose objects
> 3. It forks into the background and tries to prune and repack; objects
>    older than gc.pruneExpire (2.weeks.ago) are pruned.
> 4. At the end of all this, we check *again* whether we have >6700
>    objects. If we do, we write "run 'git prune'" to .git/gc.log and
>    just emit that error for the next day. After that we unlink gc.log
>    and try again; see gc.logExpiry.
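
The four steps above can be sketched as a small Python model. This is only an illustration of the behaviour described in the message, not git's implementation: the `Repo` class, the per-object ages and the `gc_auto` function are made up for the sketch, it tracks only unreachable loose objects, and it returns the warning immediately instead of modelling the detached background run.

```python
from dataclasses import dataclass
from typing import List, Optional

GC_AUTO = 6700          # gc.auto default, the ">6700" in the message
PRUNE_EXPIRE_DAYS = 14  # gc.pruneExpire default (2.weeks.ago)
LOG_EXPIRY_DAYS = 1     # gc.logExpiry default (1.day.ago)
WARNING = "too many unreachable loose objects; run 'git prune'"


@dataclass
class Repo:
    # Hypothetical model: age in days of each unreachable loose object.
    loose_ages: List[float]
    # Age in days of .git/gc.log, or None if there is no gc.log.
    gc_log_age: Optional[float] = None


def gc_auto(repo: Repo) -> Optional[str]:
    """Model one 'git gc --auto' run; return the warning shown, if any."""
    if repo.gc_log_age is not None:
        if repo.gc_log_age < LOG_EXPIRY_DAYS:
            return WARNING      # gc.log is still fresh: replay it, do no work
        repo.gc_log_age = None  # gc.log has expired: unlink it and retry
    if len(repo.loose_ages) <= GC_AUTO:
        return None             # step 2: under the limit, nothing to do
    # Step 3: prune objects older than gc.pruneExpire.
    repo.loose_ages = [a for a in repo.loose_ages if a < PRUNE_EXPIRE_DAYS]
    # Step 4: check again; if still over the limit, record the warning.
    if len(repo.loose_ages) > GC_AUTO:
        repo.gc_log_age = 0.0
        return WARNING
    return None
```

With 7000 objects that are all three days old, `gc_auto` prunes nothing and returns the warning, which is the situation the message describes.
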
>
> Right now I've just worked around this by setting gc.pruneExpire to a
> lower value (4.days.ago). But there's a larger issue to be addressed
> here, and I'm not sure how.
>
> When the warning was added in [1], git-gc didn't know how to detach to
> the background yet; that came in [2], with gc.log following in [3].
>
> We could add another gc.auto-like limit, which could be set at some
> higher value than gc.auto. "Hey if I have more than 6700 loose objects,
> prune the >2wks old ones, but if at the end there's still >6700 I don't
> want to hear about it unless there's >6700*N".
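
A minimal sketch of that two-limit idea, in Python. No such setting exists in git; the `warn_factor` parameter (the N above) and both function names are hypothetical, and the sketch does not address how to defer repeated gc runs.

```python
def should_run_gc(loose_count: int, gc_auto: int = 6700) -> bool:
    """Start a gc --auto run once the loose object count exceeds gc.auto."""
    return loose_count > gc_auto


def should_warn(loose_after_prune: int, gc_auto: int = 6700,
                warn_factor: int = 4) -> bool:
    """Warn only if more than gc.auto * N loose objects survive the prune.

    warn_factor is the hypothetical N from the message, not a git setting.
    """
    return loose_after_prune > gc_auto * warn_factor
```

Under these assumptions, a repository left with 7000 loose objects after pruning would still trigger gc but would stay quiet until it passed 26800.
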
>
> I thought I'd just add that, but the details of how to pass that message
> around get nasty. With that solution we *also* don't want git gc to
> start churning in the background once we reach >6700 objects, so we need
> something like gc.logExpiry which defers the gc until the next day. We
> might need to create .git/gc-waitabit.marker, ew.
>
> More generally, these hard limits seem contrary to what the user cares
> about. E.g. I suspect that most of these loose objects come from
> branches since deleted in upstream, whose objects could have a different
> retention policy.
>
> Or we could say "I want 2 weeks of objects, but if that runs against the
> 6700 limit just keep the latest 6700/2".
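
That alternative policy could look roughly like the Python sketch below. The function name and the list-of-ages representation are assumptions made for illustration; git has no such mode.

```python
from typing import List


def prune_with_cap(ages_days: List[float], limit: int = 6700,
                   expire_days: float = 14) -> List[float]:
    """Return the ages of the loose objects kept under the capped policy."""
    # First apply the normal expiry: drop anything older than two weeks.
    kept = sorted(a for a in ages_days if a < expire_days)
    # If that still exceeds the limit, keep only the newest limit/2 objects
    # (smallest age first after sorting).
    if len(kept) > limit:
        kept = kept[: limit // 2]
    return kept
```

For example, 7000 day-old objects would be cut down to the newest 3350, so the follow-up check against 6700 passes.
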
>
> 1. a087cc9819 ("git-gc --auto: protect ourselves from accumulated
>    cruft", 2007-09-17)
> 2. 9f673f9477 ("gc: config option for running --auto in background",
>    2014-02-08)
> 3. 329e6e8794 ("gc: save log from daemonized gc --auto and print it next
>    time", 2015-09-19)

My just-sent "How to produce a loose ref+size explosion via pruning +
git-gc", <87fu6bmr0j.fsf@evledraar.gmail.com>
(https://public-inbox.org/git/87fu6bmr0j.fsf@evledraar.gmail.com/),
shows an easy way to reproduce this.

After the steps outlined there git-gc --auto will end up in a state
where it'll start telling the user off for having too many loose
objects.