* Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
From: Martin Langhoff @ 2013-11-21 15:21 UTC
To: Git Mailing List
Hi git list,
I am trying to diagnose a strange problem in a VM running as a 'git
over ssh server', with one repo which periodically grows very quickly.
The complete dataset packs to a single pack+index of ~650MB. Growth is
slow: these are ASCII text reports that use a template, so they are
highly compressible. Reports come from a few dozen machines that log in
every hour.
However, something is happening that explodes the efficient pack into
an ungodly mess.
Do client pushes over git+ssh ever trigger a repack on the server? If
so, these repacking processes are racing with each other and taking the
repo from 650MB to 7GB, at which point we hit ENOSPC, sometimes the OOM
killer joins the party, etc.
The pack dir looks like this, ordered by timestamp:
http://fpaste.org/55730/04636313/
cheers,
m
--
martin.langhoff@gmail.com
- ask interesting questions
- don't get distracted with shiny stuff - working code first
~ http://docs.moodle.org/en/User:Martin_Langhoff
* Re: Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
From: Martin Langhoff @ 2013-11-21 15:35 UTC
To: Git Mailing List; +Cc: Sam Coffland
On Thu, Nov 21, 2013 at 10:21 AM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> Do client pushes over git+ssh ever trigger a repack on the server?
man git-config
[snip]
receive.autogc
By default, git-receive-pack will run "git-gc --auto" after
receiving data from git-push and updating refs. You can stop it by
setting this variable to false.
Oooooops!
Ok, couple problems here:
- if it's receiving from many pushers, it races with itself; needs
some lock or back-off mechanism
- alternatively, a splay mechanism. We have a "hard" threshold...
given many "pushers" acting in parallel, they'll all hit the threshold
at the same time. There is no need for this, we could randomize the
threshold by 20%; that would radically reduce the racy-ness
- auto repack in this scenario has a reasonable likelihood of being
visited by the OOM killer -- therefore it needs to fail more
gracefully, for example with tmpfile cleanup. Perhaps placing the
tmpfiles in a tmpdir named with the pid of the child would make
this easier...
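A minimal sketch of the splay idea in the second bullet, in Python rather than git's own C. The function names are hypothetical, and the base of 6700 assumes the documented default of gc.auto (loose objects); this is an illustration of the proposal, not how git behaves.

```python
import random

def jittered_threshold(base, spread=0.20, rng=random):
    """Scale base by a random factor in [1 - spread, 1 + spread]."""
    factor = 1.0 + rng.uniform(-spread, spread)
    return max(1, round(base * factor))

def should_gc(loose_objects, threshold):
    """True once the loose object count exceeds this process's threshold."""
    return loose_objects > threshold

if __name__ == "__main__":
    # Assumption: 6700 is the default gc.auto value.
    # Each receiving process draws its own threshold, so parallel
    # pushers are less likely to all decide to repack at the same moment.
    for pusher in range(5):
        print(pusher, jittered_threshold(6700))
```

With a 20% spread each process picks a threshold between 5360 and 8040, so only the unlucky few with low draws start a repack first. It reduces the race but does not remove it, which is why a lock is still needed.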
Naturally, I'll move quickly to disable this evil-spawn-automagic
setting and set up a cronjob. But I think it is possible to have
defaults that work more reliably and with lower risk of explosion.
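The stop-gap described here, sketched as a config fragment for the server-side repository's config file. The same setting can be made by running `git config receive.autogc false` inside that repository. The cron schedule and repository path are site-specific and not given in this thread, so they are left out.

```
# In the server-side repository's config file
[receive]
	autogc = false
```

With this set, git-receive-pack no longer runs "git gc --auto" after a push, so a single scheduled `git gc` becomes the only repacking process.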
thoughts?
m
* Re: Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
From: Junio C Hamano @ 2013-11-21 19:52 UTC
To: Martin Langhoff; +Cc: Git Mailing List, Sam Coffland
Martin Langhoff <martin.langhoff@gmail.com> writes:
> On Thu, Nov 21, 2013 at 10:21 AM, Martin Langhoff
> <martin.langhoff@gmail.com> wrote:
>> Do client pushes over git+ssh ever trigger a repack on the server?
>
> man git-config
> [snip]
>
> receive.autogc
> By default, git-receive-pack will run "git-gc --auto" after
> receiving data from git-push and updating refs. You can stop it by
> setting this variable to false.
>
> Oooooops!
>
> Ok, couple problems here:
>
> - if it's receiving from many pushers, it races with itself; needs
> some lock or back-off mechanism
Surely.
I think these should help:
64a99eb4 (gc: reject if another gc is running, unless --force is given, 2013-08-08)
4c5baf02 (gc: remove gc.pid file at end of execution, 2013-10-16)
They should be in the upcoming v1.8.5.
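A rough sketch of the general idea behind "reject if another gc is running", written in Python. It is not git's implementation: the helper names are hypothetical and only the gc.pid file name is taken from the commit subjects above.

```python
import os
import tempfile

def try_lock(path):
    """Atomically create the lock file; return True if we now hold it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True

def unlock(path):
    """Remove the lock file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass

def run_exclusive(path, job):
    """Run job() only if no other holder exists; return True if it ran."""
    if not try_lock(path):
        return False
    try:
        job()
    finally:
        unlock(path)
    return True

if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "gc.pid")
    print(run_exclusive(lock, lambda: print("repacking")))
```

This sketch does not handle a holder that is killed outright, for example by the OOM killer: the finally block never runs and the lock file stays behind. A real implementation needs some way to detect and discard such a stale lock.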
* Re: Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
From: Martin Langhoff @ 2013-11-21 21:57 UTC
To: Junio C Hamano; +Cc: Git Mailing List, Sam Coffland
On Thu, Nov 21, 2013 at 2:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> - if it's receiving from many pushers, it races with itself; needs
>> some lock or back-off mechanism
>
> Surely.
>
> I think these should help:
>
> 64a99eb4 (gc: reject if another gc is running, unless --force is given, 2013-08-08)
> 4c5baf02 (gc: remove gc.pid file at end of execution, 2013-10-16)
>
> They should be in the upcoming v1.8.5.
Ah, great to hear. For the record, this hit me on git 1.7.1, current on RHEL6.
thanks!
m