From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: David Turner <novalis@novalis.org>,
Duy Nguyen <pclouds@gmail.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: "disabling bitmap writing, as some objects are not being packed"?
Date: Wed, 08 Feb 2017 16:18:25 -0800
Message-ID: <xmqqbmuctdwu.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <20170208230057.hking37uuynf4cgd@sigill.intra.peff.net> (Jeff King's message of "Wed, 8 Feb 2017 18:00:57 -0500")

Jeff King <peff@peff.net> writes:

> In my experience, auto-gc has never been a low-maintenance operation on
> the server side (and I do think it was primarily designed with clients
> in mind).

I do not think auto-gc was ever tweaked to help server usage at any
point in its history; it was invented strictly to help end-users
(mostly new ones).

> At GitHub we disable it entirely, and do our own gc based on a throttled
> job queue ...
> I wish regular Git were more turn-key in that respect. Maybe it is for
> smaller sites, but we certainly didn't find it so. And I don't know that
> it's feasible to really share the solution. It's entangled with our
> database (to store last-pushed and last-maintenance values for repos)
> and our job scheduler.

Thanks for sharing the insights from the trenches ;-)

> Yeah, I'm certainly open to improving Git's defaults. If it's not clear
> from the above, I mostly just gave up for a site the size of GitHub. :)
>
>> Idea 1: when gc --auto would issue this message, instead it could create
>> a file named gc.too-much-garbage (instead of gc.log), with this message.
>> If that file exists, and it is less than one day (?) old, then we don't
>> attempt to do a full gc; instead we just run git repack -A -d. (If it's
>> more than one day old, we just delete it and continue anyway).
>
> I kind of wonder if this should apply to _any_ error. I.e., just check
> the mtime of gc.log and forcibly remove it when it's older than a day.
> You never want to get into a state that will fail to resolve itself
> eventually. That might still happen (e.g., corrupt repo), but at the
> very least it won't be because Git is too dumb to try again.

;-)

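
To make the mtime check suggested above concrete, here is a minimal
sketch in Python (not Git's own C code). It assumes the earlier failure
is recorded in a file named gc.log directly under the repository's git
directory, and uses the one-day threshold floated in the thread. The
function name expire_stale_gc_log and its parameters are hypothetical,
chosen only for illustration.

```python
import os
import time

# Threshold floated in the thread; not a value taken from Git itself.
ONE_DAY = 24 * 60 * 60  # seconds


def expire_stale_gc_log(git_dir, max_age=ONE_DAY, now=None):
    """Remove <git_dir>/gc.log if it is older than max_age seconds.

    Returns True if a gc attempt may go ahead (no log, or a stale log
    that was just removed) and False if a recent log should still block
    the attempt.
    """
    log_path = os.path.join(git_dir, "gc.log")
    now = time.time() if now is None else now
    try:
        mtime = os.stat(log_path).st_mtime
    except FileNotFoundError:
        return True  # no earlier failure recorded
    if now - mtime > max_age:
        os.unlink(log_path)  # stale: forget the old error and retry
        return True
    return False  # recent failure: skip this round
```

The point of the design is that no error can block maintenance forever:
whatever was written to the log, the next attempt happens at most
max_age seconds later.
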
>> Idea 2: Like idea 1, but instead of repacking, just smash the existing
>> packs together into one big pack. In other words, don't consider
>> dangling objects, or recompute deltas. Twitter has a tool called "git
>> combine-pack" that does this:
>> https://github.com/dturner-tw/git/blob/dturner/journal/builtin/combine-pack.c
>
> We wrote something similar at GitHub, too, but we never ended up using
> it in production. We found that with a sane scheduler, it's not too big
> a deal to just do maintenance once in a while.

Thanks again for this. I've also been wondering about how effective
a "concatenate packs without paying reachability penalty" would be.

> I'm still not sure if it's worth making the fatal/non-fatal distinction.
> Doing so is perhaps safer, but it does mean that somebody has to decide
> which errors are important enough to block a retry totally, and which
> are not. In theory, it would be safe to always _try_ and then the gc
> process can decide when something is broken and abort. And all you've
> wasted is some processing power each day.

Yup, and somebody or something needs to monitor so that repeated
failures can be dealt with.
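
As a rough illustration of "always try, but watch for repeated
failures", the sketch below counts consecutive failed attempts and
flags the repository once a threshold is reached. Everything here is an
assumption made for the example: run_gc stands for whatever callable
actually performs the maintenance and reports success, and the
threshold of three is arbitrary.

```python
def attempt_gc(run_gc, failures, alert_after=3):
    """Run one gc attempt and update the consecutive-failure count.

    run_gc is a hypothetical callable returning True on success.
    Returns (new_failure_count, needs_attention).
    """
    if run_gc():
        return 0, False  # success resets the counter
    failures += 1
    # Never block the next retry; only report that someone should look.
    return failures, failures >= alert_after
```

A scheduler would store the returned count per repository and call
attempt_gc again the next day regardless of the outcome.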